Block device driver question

neha naik nehanaik27 at gmail.com
Fri Nov 1 16:47:37 EDT 2013


Hi,
  I am executing the command
sync && echo 3 | tee /proc/sys/vm/drop_caches

for dropping all the caches.

And yes, my block device sector size is declared as 512.

Does this ensure that if I write only 512 bytes, I will not get a read?
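
The question above comes down to alignment arithmetic. Below is a minimal user-space sketch, assuming the buffered write path only has to read a block from the device when a write partially covers a block that is not already up to date in the cache. The helper name write_needs_read is hypothetical (it is not a kernel function), and the blocksize argument stands for whatever granularity the cache actually uses for the device, which may differ from the sector size the driver declares.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not a kernel API: returns true when a buffered write
 * of `len` bytes at byte offset `off` only partially covers at least one
 * block of size `blocksize`, so the old contents of that block would have to
 * be read first (assuming the block is not already cached and up to date). */
static bool write_needs_read(uint64_t off, uint64_t len, uint32_t blocksize)
{
    /* Block sizes are powers of two (512, 1024, ..., page size). */
    assert(blocksize != 0 && (blocksize & (blocksize - 1)) == 0);

    if (len == 0)
        return false;   /* nothing is written, so nothing needs reading */

    uint64_t mask = (uint64_t)blocksize - 1;
    bool head_partial = (off & mask) != 0;          /* write starts mid-block */
    bool tail_partial = ((off + len) & mask) != 0;  /* write ends mid-block */

    return head_partial || tail_partial;
}
```

Under this model, a 512-byte write at offset 0 needs no read when the block size is 512, but would need one if the cache worked in 4096-byte blocks.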

Regards,
Neha

On Fri, Nov 1, 2013 at 2:25 PM, Rajat Sharma <fs.rajat at gmail.com> wrote:
>
>
>
> On Fri, Nov 1, 2013 at 12:17 PM, neha naik <nehanaik27 at gmail.com> wrote:
>>
>> Hi Rajat,
>>  Thanks for the information. One more question:
>>    Say my block device driver doesn't support reads and the
>> application always does aligned IO in 512-byte chunks (but it is not
>> direct IO). In that case, will I get a read because the page size is
>> 4096 and yet we are writing 512? I am not getting any reads, which is
>> why I am confused. I have been doing the IO after syncing the page
>> cache, so it is not as if I get a page cache hit every time.
>
>
> sync does not evict page cache. And is your block device sector size
> declared as 512 ?
>
>>
>>   I am doing a normal dd without any special flags, just 'bs=512'.
>>
>> Regards,
>> Neha
>>
>>
>> On Fri, Nov 1, 2013 at 12:16 PM, Rajat Sharma <fs.rajat at gmail.com> wrote:
>> > Hi Neha,
>> >
>> >
>> > On Fri, Nov 1, 2013 at 10:26 AM, neha naik <nehanaik27 at gmail.com> wrote:
>> >>
>> >> Hi,
>> >>   I am writing a block device driver and i am using the
>> >> 'blk_queue_make_request' call while registering my block device
>> >> driver.
>> >>   Now as far as i understand this will bypass the linux kernel queue
>> >> for each block device driver (bypassing the elevator algorithm etc).
>> >> However, i am still not very clear about exactly how i get a request.
>> >>
>> >>  1.  Consider that I am doing a dd on the block device directly:
>> >>   Will it bypass the buffer cache (page cache) or will it use it?
>> >> For example, if I register my block device with set_blocksize() as 512
>> >> and do a dd of 512 bytes, will I get a read because the write passes
>> >> through the buffer cache and, since the minimum page size is 4096, the
>> >> page has to be read first and then passed to me?
>> >>     I am still unclear about the 'page' in the bvec. What does that
>> >> refer to? Is it a page from the page cache or a user buffer (DMA)?
>> >>
>> >
>> > If you are not using oflag=direct with dd, then the 'page' you get in
>> > the bvec belongs to the buffer cache (in 2.6 it is implemented as the
>> > page cache of block_device->bd_inode->i_mapping). You get a user buffer
>> > only with direct IO, but then you need to take care to issue aligned IO
>> > requests yourself (whether your block device accepts only aligned
>> > buffers is up to your implementation, though).
>> >
>> >>
>> >> 2. Another thing I am not clear about is the queue. When I register my
>> >> driver, the 'make_request' function gets called whenever there is an
>> >> IO. Now in my device driver, I have some more logic around writing
>> >> this IO, i.e. some time may be spent in the device driver for each IO.
>> >> In such a case, if I get two IOs on the same block one after the other
>> >> (say one is writing 'a' and the other is writing 'b'), isn't it
>> >> possible that I may end up passing 'b' followed by 'a' to the layer
>> >> below me (changing the order because thread 'a' took more time than
>> >> thread 'b')? In that case, should I be using a queue in my layer:
>> >> put the IOs in the queue whenever I get a call to 'make_request',
>> >> while another thread keeps pulling the IOs from the queue, processing
>> >> them, and passing them to the layer below?
>> >>
>> >
>> > If your application does not guarantee the ordering of writes, then you
>> > don't have to worry either. Most likely the block layer will do the
>> > merges in the page cache if it is not direct IO. As a driver developer,
>> > you don't need to worry about out-of-order writes from the application.
>> >
>> >>
>> >>
>> >> Regards,
>> >> Neha
>> >>
>> >> _______________________________________________
>> >> Kernelnewbies mailing list
>> >> Kernelnewbies at kernelnewbies.org
>> >> http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
>> >
>> >
>
>


