Ordering guarantee inside a single bio?

오준택 na94jun at gmail.com
Mon Jan 27 23:50:56 EST 2020


Hello,

As far as I know, there is no way to guarantee ordering between block writes
inside a bio.

That is why the JBD2 module submits the journal commit block in a separate
bio from the other log block writes.

I also think your idea can be optimized further.

If you write a checksum for some data, no ordering between the checksum and
the data is needed.

After a crash, we just recalculate the checksum from the data on disk and
compare it with the written one.

Even if the checksum is written first, the recalculated checksum will
differ from the written checksum because the data has not been written yet.

So I think that if you use a checksum, no ordering guarantee is needed.
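
As a rough illustration of the recovery check described above, here is a
minimal sketch in C. The helper names crc32_calc() and chunk_is_consistent()
are hypothetical, and CRC-32 is only an assumed checksum for the example; it
is not what dm-integrity or JBD2 necessarily use.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320), used here only as an
 * example checksum. */
static uint32_t crc32_calc(const uint8_t *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Hypothetical recovery check: recompute the checksum over the data found
 * on disk and compare it with the checksum found on disk. */
static bool chunk_is_consistent(const uint8_t *data, size_t len,
                                uint32_t stored_csum)
{
    return crc32_calc(data, len) == stored_csum;
}
```

If the crash happens after the new checksum reached the disk but before the
new data did, chunk_is_consistent() returns false for the old data, so the
torn update is detected whichever of the two was written first.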

This is the first time I have sent mail to the kernelnewbies mailing list.

If I did something wrong in this mail, I am very sorry.

Thank you.

Joontaek Oh.

On Tue, Jan 28, 2020 at 3:23 AM, Lukas Straub <lukasstraub2 at web.de> wrote:

> On Mon, 27 Jan 2020 12:27:58 -0500
> "Valdis Klētnieks" <valdis.kletnieks at vt.edu> wrote:
>
> > On Sun, 26 Jan 2020 13:07:38 +0100, Lukas Straub said:
> >
> > > I am planning to write a new device-mapper target and I'm wondering if there
> > > is an ordering guarantee for the operations inside a single bio? For example, if I
> > > issue a write bio to sector 0 of length 4, is it guaranteed that sector 0 is
> > > written first and sector 3 is written last?
> >
> > I'll bite.  What are you doing where the order of writing out a single bio matters?
>
> I plan to improve the performance of dm-integrity on HDDs by removing the
> requirement for bitmap or journal (which causes head seeks even for
> sequential writes). I also want to avoid cache flushes and FUA. The problem
> with dm-integrity is that the data and checksum updates need to be atomic.
> So I came up with the following idea:
>
> The on-disk layout will look like this:
>
> |csum_next-01|data-chunk-01|csum_prev-01|csum_next-02|data-chunk-02|csum_prev-02|...
>
> Under normal conditions, csum_next-01 (a single sector) contains the
> checksums for data-chunk-01 and csum_prev-01 is a duplicate of csum_next-01.
>
> Updating data will first update csum_next (with FUA), then update the data
> (FUA) and finally update csum_prev (FUA).
> But if there is an ordering guarantee, we have a fast path: if a full chunk
> of data is written, we simply issue a single big write with csum_next, data
> and csum_prev, all without FUA (except if the incoming request asks for
> that).
> So that's why I'm asking.
>
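
To make the quoted on-disk layout concrete, here is a small sketch in C that
maps a logical data sector to the physical sectors of its chunk. The sizes
are assumptions for illustration only (512-byte sectors and a 4-byte checksum
per data sector, so one checksum sector covers 128 data sectors), and
struct chunk_location and locate() are hypothetical names, not part of
dm-integrity.

```c
#include <stdint.h>

#define SECTOR_SIZE     512u                        /* assumed sector size */
#define CSUM_SIZE       4u                          /* assumed checksum size */
#define DATA_PER_CHUNK  (SECTOR_SIZE / CSUM_SIZE)   /* 128 data sectors */
#define CHUNK_STRIDE    (DATA_PER_CHUNK + 2u)       /* csum_next + data + csum_prev */

/* Physical sector numbers belonging to one logical data sector. */
struct chunk_location {
    uint64_t csum_next;   /* checksum sector in front of the data chunk */
    uint64_t data;        /* the data sector itself */
    uint64_t csum_prev;   /* duplicate checksum sector behind the data chunk */
};

static struct chunk_location locate(uint64_t logical_sector)
{
    uint64_t chunk  = logical_sector / DATA_PER_CHUNK;
    uint64_t offset = logical_sector % DATA_PER_CHUNK;
    uint64_t base   = chunk * CHUNK_STRIDE;
    struct chunk_location loc = {
        .csum_next = base,
        .data      = base + 1 + offset,
        .csum_prev = base + 1 + DATA_PER_CHUNK,
    };

    return loc;
}
```

With these assumed sizes, a full-chunk write covers 130 consecutive physical
sectors, starting at csum_next and ending at csum_prev, which is the single
large write that the proposed fast path would issue.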