Stackable file systems and NFS

Thu Aug 16 03:30:37 EDT 2012

Correct me if I am reading this wrong, but in your program listing you
pass total_count when printing the buffer, while the value returned by
vfs_read() is collected in count.

debug_dump("Read buffer", buf, total_count);

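One suggestion: please fill buf with some fixed, known pattern before
calling vfs_read(). That way you can tell whether the NUL bytes were
really written by vfs_read() or were simply never overwritten.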
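
A minimal user-space sketch of that idea, assuming read(2) stands in for
vfs_read(). POISON, poisoned_read() and untouched_tail() are made-up names
for illustration, not taken from your listing:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

#define POISON 0xA5 /* hypothetical fill byte; any recognisable value works */

/* Fill buf with POISON, then read up to len bytes from fd.
 * Returns whatever read() returned. On success, buf[ret..len) still
 * holds POISON because read() never touched it. */
static ssize_t poisoned_read(int fd, unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    memset(buf, POISON, len);
    return read(fd, buf, len);
}

/* Count how many bytes at the end of buf[0..len) still hold POISON. */
static size_t untouched_tail(const unsigned char *buf, size_t len)
{
    size_t n = 0;

    while (n < len && buf[len - 1 - n] == POISON)
        n++;
    return n;
}
```

With this, only the first `ret` bytes should be dumped (the returned count,
not the requested size), and any NUL inside that range was really delivered
by the read rather than left over from an earlier use of the buffer.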

> We have also noticed that the expected increase (inc) and the size
> returned in (vfs_read()) is different.

Nothing is blocking updates to the file size between vfs_getattr() and
vfs_read(), right? No locking?
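
To see how often the size and the data drift apart, something like the
following user-space analog might help. It is only a sketch: fstat() and
pread() stand in for vfs_getattr() and vfs_read(), and struct poll_result
and poll_once() are hypothetical names.

```c
#include <assert.h>
#include <fcntl.h>      /* open(), used by callers */
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Outcome of one "tail -f" style poll (hypothetical helper type). */
struct poll_result {
    off_t   expected;     /* st_size - offset, as reported by fstat() */
    ssize_t got;          /* bytes actually returned by pread() */
    size_t  trailing_nul; /* NUL bytes at the end of the returned data */
};

/* Stat the file, read the expected increase starting at offset, and
 * record how the two numbers compare. Returns 0 on success, -1 on error. */
static int poll_once(int fd, off_t offset, char *buf, size_t len,
                     struct poll_result *res)
{
    struct stat st;
    size_t want, got, nul = 0;

    assert(buf != NULL && res != NULL);
    if (fstat(fd, &st) != 0)
        return -1;

    res->expected = st.st_size > offset ? st.st_size - offset : 0;
    want = (size_t)res->expected < len ? (size_t)res->expected : len;

    res->got = pread(fd, buf, want, offset);
    if (res->got < 0)
        return -1;

    /* Count NUL bytes at the tail of what was actually returned. */
    got = (size_t)res->got;
    while (nul < got && buf[got - 1 - nul] == '\0')
        nul++;
    res->trailing_nul = nul;
    return 0;
}
```

Logging expected, got and trailing_nul on every poll would show whether the
bad reads line up with the moments when the size has just changed.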

-Rajat

On Thu, Aug 16, 2012 at 12:01 PM, Ranjan Sinha <rnjn.sinha at gmail.com> wrote:
> Hi,
>
> On Tue, Aug 14, 2012 at 4:19 PM, Rajat Sharma <fs.rajat at gmail.com> wrote:
>> Try mounting with noac nfs mount option to disable attribute caching.
>>
>> ac / noac
>>
>> "Selects whether the client may cache file attributes. If neither
>> option is specified (or if ac is specified), the client caches file
>> attributes."
>
> I don't think this is because of attribute caching. The size does change and
> that is why we go to the read call (think of this as a simplified case of
> tail -f). The only problem is that sometimes when we read we get ASCII NUL bytes
> at the end. If we read the same block again, we get the correct data.
>
> In addition, we cannot force specific mount options in actual deployment
> scenarios.
>
>
> <edit>
>
>>> On Tue, Aug 14, 2012 at 5:10 PM, Ranjan Sinha <rnjn.sinha at gmail.com> wrote:
>>> > For now, the /etc/exports file has the following setting
>>> > *(rw,sync,no_root_squash)
>>>
>>> Hm, AFAIK that means the synchronous method is selected. So,
>>> theoretically, if there is no further data, the other end of NFS
>>> should just wait.
>>>
>>> Are you using a blocking or non-blocking read, btw? Sorry, I am not
>>> really that good at reading VFS code...
>>>
>
> This is a blocking read call. I think this is not because there is no data,
> rather somehow the updated data is not present in the VM buffers but the
> inode size has changed. As I just said, if we read the file again from the
> exact same location, we get the actual contents. Though after going through the
> code I don't understand how this is possible.
>
>>> > On client side we have not specified any options explicitly. This is
>>> > from /proc/mounts entry
>>> > >rw,vers=3,rsize=32768,wsize=32768,hard,proto=tcp,timeo=600,retrans=2,sec=sys
>>>
>>> hm, not sure, maybe in your case, read and write buffer should be
>>> reduced so any new data should be transmitted ASAP. I was inspired by
>>> bufferbloat handling, but maybe I am wrong here somewhere....
>>>
>
> --
> Regards,
> Ranjan