Zbr's days.

Wed, 21 May 2008

iput() locking in POHMELFS.

iput() is a very tricky call in the Linux VFS: besides dropping the inode when its reference counter reaches zero, it also waits until all associated pages are flushed to storage.
POHMELFS uses a single thread per network state (network connection structure), which only reads async replies from the server. So it is possible that a reply which requires iput() (for example a create command reply) will happen in parallel with object removal, so the inode will be deleted but not yet freed. When the reply is received and iput() is called, it will try to free the inode and wait until all pages associated with its mapping are synced. But page sync happens on a reply to another command (consider for example several writeback transactions), which cannot be processed, since the thread is waiting for them to complete. This problem cannot be fixed by introducing multiple threads, since each one can be in exactly the same situation simultaneously.

In turn, we should not grab an inode and free it in the receiving path. This is ok for writeback transactions, since the inode cannot be freed until its pages are synced, so just by holding pages we are able to avoid locking up. But object creation for empty files or directories does not have pages attached, so they have to be synced with a special transaction. There can still be a problem with an empty file though: some pages can be attached and it can be removed while the system waits for the creation transaction to complete. But actually we do not need to know about that. We should not grab the inode at all, since the transaction already contains all the needed info, namely the inode number, so we can look up the inode (if it still exists) and mark it as created without the need for a lock-prone grab/put.
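To make the idea concrete, here is a minimal userspace sketch of a create-reply handler that only uses the inode number from the transaction. Everything in it (the sketch_* names, the fixed-size table, the refcnt field) is a hypothetical stand-in and not POHMELFS or kernel code, and it is single-threaded, so it ignores the locking a real receive path would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names come from POHMELFS. */
#define SKETCH_TABLE_SIZE 8

struct sketch_inode {
    unsigned long ino; /* inode number carried in the transaction */
    bool in_use;       /* slot holds a live inode */
    bool created;      /* server has acknowledged the creation */
    int refcnt;        /* references held by other code paths */
};

static struct sketch_inode sketch_table[SKETCH_TABLE_SIZE];

/* Find a live inode by number. Does NOT take a reference. */
struct sketch_inode *sketch_find(unsigned long ino)
{
    for (size_t i = 0; i < SKETCH_TABLE_SIZE; i++)
        if (sketch_table[i].in_use && sketch_table[i].ino == ino)
            return &sketch_table[i];
    return NULL;
}

/* Local create: the inode exists, but the server has not acked it yet. */
bool sketch_add(unsigned long ino)
{
    assert(sketch_find(ino) == NULL); /* inode numbers are unique here */
    for (size_t i = 0; i < SKETCH_TABLE_SIZE; i++) {
        if (!sketch_table[i].in_use) {
            sketch_table[i].ino = ino;
            sketch_table[i].in_use = true;
            sketch_table[i].created = false;
            sketch_table[i].refcnt = 1;
            return true;
        }
    }
    return false; /* table full */
}

/* Local removal that may race with the pending create reply. */
void sketch_remove(unsigned long ino)
{
    struct sketch_inode *inode = sketch_find(ino);

    if (inode)
        inode->in_use = false;
}

/*
 * Receive path: handle a "create" reply. Returns true if the inode was
 * still present and got marked, false if it was already removed.
 */
bool sketch_create_reply(unsigned long ino)
{
    struct sketch_inode *inode = sketch_find(ino);

    if (!inode)
        return false;      /* already gone: nothing to do */
    inode->created = true; /* refcnt untouched, so no put that could sleep */
    return true;
}
```

The point of the design is that the reply handler never changes the reference count, so it can never be the one that drops the last reference and ends up waiting for page sync.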

This bit took me the last three days, during which POHMELFS moved to non-blocking receiving and timeout-based sending (and back again). It got a scanning 'watchdog' which resends transactions if they were not acked after some time and eventually drops them if they still do not get a reply. POHMELFS also got a couple of new operations supported, and likely something else on top of the existing set of features implemented to date (full transaction support for all operations and a data and metadata coherency protocol were added for the next release).
The new release is scheduled for the end of the week, and there is no readpage transaction support yet...
So, stay tuned!
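For the curious, the scanning 'watchdog' mentioned above can be sketched roughly like this. It is a userspace illustration only: the structure, the field names, the tick-based time and the resend limit are all assumptions of mine for the example, not the actual POHMELFS code, and the real thing would of course put the transaction back on the wire where this sketch just bumps a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical transaction record; not the real POHMELFS structure. */
struct sketch_trans {
    bool active;           /* still tracked by the watchdog */
    bool acked;            /* reply has been received */
    unsigned long sent_at; /* time of the last (re)send, arbitrary ticks */
    unsigned int resends;  /* how many times it was resent so far */
};

struct sketch_scan_result {
    unsigned int resent;
    unsigned int dropped;
};

/*
 * One watchdog pass: forget acked transactions, resend those that have
 * waited at least 'timeout' ticks, and drop those that have already been
 * resent 'max_resends' times without an ack.
 */
struct sketch_scan_result sketch_scan(struct sketch_trans *t, size_t n,
                                      unsigned long now, unsigned long timeout,
                                      unsigned int max_resends)
{
    struct sketch_scan_result r = { 0, 0 };

    assert(t != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!t[i].active)
            continue;
        if (t[i].acked) {
            t[i].active = false; /* completed normally */
            continue;
        }
        if (now - t[i].sent_at < timeout)
            continue;            /* still within its time budget */
        if (t[i].resends >= max_resends) {
            t[i].active = false; /* give up on it */
            r.dropped++;
            continue;
        }
        t[i].resends++;          /* a real system would resend here */
        t[i].sent_at = now;
        r.resent++;
    }
    return r;
}
```

Calling sketch_scan() periodically with an increasing 'now' gives the resend-then-drop behaviour; the timeout and the resend limit are tunables in this sketch.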

/devel/fs :: Link / Comments (3)

Zbr wrote at 2008-05-21 22:16:

And actually this is still racy: the inode can be dropped after ilookup(), so a subsequent idrop() in the receiving path will free the inode and thus sleep... I've fixed it by introducing a list of to-be-dropped inodes in the POHMELFS superblock, so the private ->drop_inode() will just put the inode there, and a special scanning workqueue will actually release the inodes from it.
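A rough userspace sketch of that deferred-drop list, to show the shape of the fix. The names and structures are made up for illustration, there is no locking, and setting a flag stands in for the real (possibly sleeping) inode release; the actual kernel code differs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the inode and superblock. */
struct sketch_inode {
    unsigned long ino;
    bool freed;                     /* set once the inode is released */
    struct sketch_inode *next_drop; /* link in the to-be-dropped list */
};

struct sketch_sb {
    struct sketch_inode *drop_list; /* inodes waiting to be released */
};

/* Receive path: never blocks, only queues the inode for later release. */
void sketch_drop_inode(struct sketch_sb *sb, struct sketch_inode *inode)
{
    assert(sb != NULL && inode != NULL);
    inode->next_drop = sb->drop_list;
    sb->drop_list = inode;
}

/*
 * Scanning worker: runs outside the receive path, so it is allowed to
 * block while releasing. Returns the number of inodes released.
 */
unsigned int sketch_release_dropped(struct sketch_sb *sb)
{
    unsigned int released = 0;
    struct sketch_inode *inode;

    assert(sb != NULL);
    inode = sb->drop_list;
    sb->drop_list = NULL; /* detach the whole list first */
    while (inode) {
        struct sketch_inode *next = inode->next_drop;

        inode->next_drop = NULL;
        inode->freed = true; /* stands in for the real release */
        released++;
        inode = next;
    }
    return released;
}
```

The receiving thread only ever calls sketch_drop_inode(), so it keeps reading replies while the worker does the part that may sleep.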

anon wrote at 2008-05-21 22:53:

Your fix basically is the same technique as RCU... Common "pattern" for non-locking, non-blocking data structures.

Zbr wrote at 2008-05-22 00:31:

Yeah, kind of like its call_rcu() deferring.
