lg_local_lock issue
Saket Sinha
saket.sinha89 at gmail.com
Thu Sep 12 00:44:24 EDT 2013
I am facing an issue with a filesystem driver that I ported from the 2.6.18
kernel to 3.8. A lot has changed in the VFS between 2.6.18 and 3.8, which
has caused a lot of problems for this driver. One of the biggest core
changes is in path resolution.
*BASIC INFORMATION*
Path resolution is the process of finding the dentry corresponding to a
path name string by performing a path walk.
Paths are resolved by walking the namespace tree, starting with the first
component of the pathname (e.g. root or cwd), whose dentry is known, and
then finding the child of that dentry that is named by the next component
in the path string. The lookup is then repeated from the child dentry to
find its child named by the next component, and so on.
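To make the walk concrete, here is a minimal userspace sketch of it. It is
not kernel code: struct node, lookup_child() and path_walk() are names made
up for this illustration, with struct node standing in for a dentry.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct dentry: a name plus tree links. */
struct node {
    const char *name;
    struct node *child;    /* first entry inside this directory */
    struct node *sibling;  /* next entry in the same directory */
};

/* Find the child of dir whose name is the len bytes starting at name. */
static struct node *lookup_child(struct node *dir, const char *name, size_t len)
{
    for (struct node *c = dir->child; c != NULL; c = c->sibling)
        if (strlen(c->name) == len && memcmp(c->name, name, len) == 0)
            return c;
    return NULL;
}

/*
 * Resolve path one component at a time, starting from the known node
 * start. Returns the final node, or NULL if a component does not exist.
 */
struct node *path_walk(struct node *start, const char *path)
{
    struct node *cur = start;

    while (cur != NULL) {
        while (*path == '/')
            path++;                      /* skip separators */
        if (*path == '\0')
            break;                       /* no components left */
        size_t len = strcspn(path, "/"); /* length of next component */
        cur = lookup_child(cur, path, len);
        path += len;
    }
    return cur;
}
```

The sketch ignores everything that makes the real walk hard (symlinks,
mount points, "." and "..", and above all concurrent renames), which is
exactly where the locking question comes in.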
The image below illustrates this:
http://lwn.net/images/ns/kernel/dentry.png
In my driver code this is done in the function
int get_full_path_d(const struct dentry *dentry, char *real_path), at the
link below:
https://github.com/disdi/hepunion/blob/master/fs/hepunion/helpers.c#L375
*I would like to point out that on the 2.6.18 kernel I used dcache_lock to
protect this lookup, but on the 3.8 kernel I haven't found a suitable
replacement.*
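For readers who do not want to open the link: going by its signature, the
function turns a dentry into a path string. The usual way to do that is to
follow parent pointers up to the root and prepend each name. The following
is only a userspace sketch of that idea, not the driver's actual code;
struct node and full_path() are hypothetical names, and the convention
that the root is its own parent is an assumption of the sketch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct dentry; the root is its own parent. */
struct node {
    const char *name;
    struct node *parent;
};

/*
 * Write the absolute path of n into buf (size bytes).
 * Returns 0 on success, -1 if buf is too small.
 */
int full_path(const struct node *n, char *buf, size_t size)
{
    char *end = buf + size;
    char *p = end;

    if (size < 2)
        return -1;
    *--p = '\0';                  /* build the string from the end */

    /* Prepend "/name" for each level until the root is reached. */
    while (n->parent != n) {
        size_t len = strlen(n->name);

        if ((size_t)(p - buf) < len + 1)
            return -1;            /* no room for "/" plus the name */
        p -= len;
        memcpy(p, n->name, len);
        *--p = '/';
        n = n->parent;
    }
    if (p == end - 1)             /* n was the root itself */
        *--p = '/';

    memmove(buf, p, (size_t)(end - p));
    return 0;
}
```

Nothing in this sketch stops a parent pointer or a name from changing
while the loop runs. In the kernel a concurrent rename can do exactly
that, which is why the walk was done under dcache_lock on 2.6.18.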
The complete flow chart of my driver's code can be found here:
https://github.com/disdi/hepunion/issues/6
Since 2.6.38, RCU is used to make a significant part of the entire path
walk (including dcache look-up) completely "store-free" (so, no locks,
atomics, or even stores into cachelines of common dentries). This is known
as "rcu-walk" path walking. rcu-walk uses a d_seq protected snapshot. When
looking up a child of this parent snapshot, we open a d_seq critical section
on the child before closing the d_seq critical section on the parent. This
gives an interlocking ladder of snapshots to walk down.
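As I understand it, one rung of that ladder looks roughly like the
userspace sketch below, written with C11 atomics. It is an analogue, not
the kernel's implementation: snap_node, node_init(), read_begin(),
read_retry(), step_down() and set_child() are names invented for this
illustration, and the sketch assumes writers are serialized by some other
means.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a dentry guarded by a d_seq-style counter. */
struct snap_node {
    atomic_uint seq;                  /* odd while a writer is mid-update */
    _Atomic(struct snap_node *) child;
};

void node_init(struct snap_node *n, struct snap_node *child)
{
    atomic_init(&n->seq, 0u);
    atomic_init(&n->child, child);
}

/* Open a read-side section: wait for an even count and remember it. */
unsigned read_begin(struct snap_node *n)
{
    unsigned s;

    while ((s = atomic_load(&n->seq)) & 1u)
        ;                             /* writer active, try again */
    return s;
}

/* Close a read-side section: nonzero means a writer ran in between. */
int read_retry(struct snap_node *n, unsigned start)
{
    return atomic_load(&n->seq) != start;
}

/*
 * One rung of the ladder. The child's section is opened before the
 * parent's is closed, so the two snapshots interlock.
 * Returns 0 on success, -1 if the parent changed and the walk must restart.
 */
int step_down(struct snap_node *parent, unsigned parent_seq,
              struct snap_node **child, unsigned *child_seq)
{
    struct snap_node *c = atomic_load(&parent->child);

    if (c != NULL)
        *child_seq = read_begin(c);   /* open on the child first */
    if (read_retry(parent, parent_seq))
        return -1;                    /* then close on the parent */
    *child = c;
    return 0;
}

/* Writer side: bump the count around the update (single writer assumed). */
void set_child(struct snap_node *n, struct snap_node *new_child)
{
    atomic_fetch_add(&n->seq, 1u);    /* count becomes odd */
    atomic_store(&n->child, new_child);
    atomic_fetch_add(&n->seq, 1u);    /* count becomes even again */
}
```

The reader never writes to the shared nodes, which is the "store-free"
property; the price is that it must be ready to throw away its work and
restart whenever read_retry() reports a change.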
Something like the picture at the link below:
http://spatula-city.org/~im14u2c/images/man_running_up_crumbling_stairs.jpg
*PROBLEM*
This filesystem does not support RCU lookup as of now: I have not done
anything to tell the VFS whether the driver supports rcu-walk or not, and
simply ported the driver to the 3.8 kernel by swapping the changed kernel
functions and APIs for their respective replacements. There is no
compiler error, but at runtime I find myself caught in what looks like an
endless loop, and in the stack trace or dmesg I see something called
*lg_local_lock*.
The full kernel stack trace from my driver can be found here:
https://github.com/disdi/hepunion/issues/5
I searched and found the following article, which describes it thoroughly:
http://lwn.net/Articles/401738/
To sum up:
1. This lock was introduced along with the new RCU-lookup change.
2. The lglock is there to *protect the list of open files* which is
attached to each superblock structure.
This filesystem driver fails when I run "ls" in the mount point. An strace
of "ls" shows
open("/scratch", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
and the stack, after it hangs, shows lg_lock.
3. And most importantly:
"The real reason that the per-superblock open files list exists is to let
the kernel check for writable files when a filesystem is being remounted
read-only."
My filesystem is a union of two filesystems, where one is read-only and
the other is read-write. I think I am violating this kernel protection
mechanism in some way, and that is why my driver gets stuck.
I would be grateful if you could have a look at this issue.
Regards,
Saket Sinha