Thu, 10 Apr 2008
Busy inodes after unmount.
VFS: Busy inodes after unmount of pohmel. Self-destruct in 5 seconds. Have a nice day...
After removing the private cache of inodes I found that objects which were
sent by the server but never attached to a directory entry (dentry)
are never freed.
So, essentially, this sequence does not work with the Linux VFS:
iget()/iget_locked()
...
umount
Inodes created by iget()/iget_locked() are placed into at least three
different lists:
inode_in_use - global list of all created inodes whose i_count and i_nlink are greater than 0
s_inodes - per-superblock list, which contains every inode created for this superblock
inode_hashtable - hash table indexed by inode number. If you want to work with writeback, your inodes have to be there. I have not yet investigated why.
So, essentially, all inodes you create are accessible by the VFS and will be checked
during umount via generic_shutdown_super()->invalidate_inodes(),
where the system notices that an inode in the s_inodes list still has a non-zero reference
counter (of course, otherwise it would already have been freed by the filesystem) and
therefore cannot be freed. Thus we have a leak.
The above lists can only be accessed under the global inode lock, so it is not a good idea to destroy
inodes by traversing them in, for example, the ->put_super() callback or any other filesystem callback.
So I had to add a list of all inodes to the POHMELFS superblock. Ugly.
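The private list idea can be illustrated with a small userspace model. This is only a sketch, not the POHMELFS code: the model_* names and structures are hypothetical stand-ins for the kernel's inode, superblock and reference counting, and it only shows why tracking every created inode lets the filesystem drop its own references before the busy check runs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace model only: these are NOT the kernel's struct inode/super_block. */
struct model_inode {
    unsigned long ino;
    int refcnt;                 /* stands in for i_count */
    struct model_inode *next;   /* link in the filesystem-private list */
};

struct model_sb {
    struct model_inode *private_list;  /* every inode this filesystem created */
    size_t live;                       /* inodes not yet freed */
};

/* Stands in for iget(): create an inode holding one reference and track it. */
struct model_inode *model_iget(struct model_sb *sb, unsigned long ino)
{
    struct model_inode *inode = malloc(sizeof(*inode));

    if (!inode)
        return NULL;
    inode->ino = ino;
    inode->refcnt = 1;
    inode->next = sb->private_list;
    sb->private_list = inode;
    sb->live++;
    return inode;
}

/* Stands in for the umount-time check: count inodes that are still referenced. */
size_t model_busy_inodes(const struct model_sb *sb)
{
    size_t busy = 0;

    for (const struct model_inode *i = sb->private_list; i; i = i->next)
        if (i->refcnt > 0)
            busy++;
    return busy;
}

/* Teardown: drop the creator's reference on every tracked inode and free it. */
void model_drop_all(struct model_sb *sb)
{
    while (sb->private_list) {
        struct model_inode *inode = sb->private_list;

        sb->private_list = inode->next;
        assert(inode->refcnt == 1);  /* in this model only the creator holds a reference */
        inode->refcnt--;
        free(inode);
        sb->live--;
    }
}
```

In this model, two model_iget() calls followed by an immediate check report two busy inodes, while calling model_drop_all() first leaves nothing for the check to complain about.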
/devel/fs
get_user_pages() scalability.
Just found an article at LWN
about get_user_pages(). The main problem turned out to be locking
between multiple threads...
Out of curiosity, was this
scalability problem ever fixed? (For the busy reader: this is my testing of
get_user_pages() performance with a single thread, done more than two years ago
to find bottlenecks in kevent
AIO.)
Here is a graph (performance vs. number of pages):
/devel/other
POHMELFS development status.
It has developed very rapidly over the last couple of days,
so I essentially rewrote it. I think it is ready for the next
release, which I will announce in a day or so.
Right now all the first-milestone features I planned, except cache coherency
(see below), are complete (although sometimes not in the most
optimal way).
Because of name cache usage it is now possible to create huge paths
with multiple directories via a single command. The same applies to directory
removal, although that is because of a different design issue.
It would be possible to rewrite the generic read/write helpers and pass a
set of pages into the POHMELFS network stack (which is now page
based for data), but I decided that it is not needed for the first
step.
POHMELFS now has fully async processing of all operations except link creation
(I just decided that it is a bit simpler to make those write-through;
this was done out of laziness and not because of some fundamental arch problem).
This was achieved by serious (read: from scratch) changes in the arch,
which had its own problematic places, namely error reporting. Because of this
move it becomes really simple to implement any kind of protocol, as long as it obeys
the async rules: sending a message never requires a sync reply,
and where a reply is needed, it comes as an independent incoming message,
which is processed asynchronously from the waiter via the common state machine.
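The async rules can be sketched in a few lines of userspace C. This is a minimal illustration under my own assumptions, not the POHMELFS protocol: the command codes, the transaction id field and the client_* helpers are all hypothetical. The point is only that sending never blocks, and every reply goes through one receive path that matches it to a pending transaction.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical command codes; not taken from the POHMELFS protocol. */
enum cmd { CMD_CREATE, CMD_REMOVE, CMD_ERROR };

struct message {
    unsigned int trans;   /* transaction id chosen by the sender */
    enum cmd cmd;
    int status;           /* 0 on success, negative error otherwise */
};

#define MAX_PENDING 8

struct pending {
    int used;
    unsigned int trans;
    int done;
    int status;
};

struct client {
    struct pending slots[MAX_PENDING];
    unsigned int next_trans;  /* ids start at 1; wraparound is ignored in this sketch */
};

/* Queue a request and return at once; nothing here waits for a reply.
 * Returns the transaction id, or 0 if no slot is free. */
unsigned int client_send(struct client *c, enum cmd cmd)
{
    (void)cmd;  /* a real client would serialize the command to the socket */
    for (size_t i = 0; i < MAX_PENDING; i++) {
        struct pending *p = &c->slots[i];

        if (!p->used) {
            p->used = 1;
            p->trans = ++c->next_trans;
            p->done = 0;
            p->status = 0;
            return p->trans;
        }
    }
    return 0;
}

/* Single entry point for every incoming message.
 * Returns 0 if it matched a pending transaction, -1 otherwise. */
int client_receive(struct client *c, const struct message *msg)
{
    for (size_t i = 0; i < MAX_PENDING; i++) {
        struct pending *p = &c->slots[i];

        if (p->used && !p->done && p->trans == msg->trans) {
            p->done = 1;
            p->status = msg->status;
            return 0;
        }
    }
    return -1;
}

/* Poll for completion: returns 1 and fills *status if the reply has arrived
 * (the slot is then released), 0 if it is still in flight or unknown. */
int client_complete(struct client *c, unsigned int trans, int *status)
{
    for (size_t i = 0; i < MAX_PENDING; i++) {
        struct pending *p = &c->slots[i];

        if (p->used && p->done && p->trans == trans) {
            *status = p->status;
            p->used = 0;
            return 1;
        }
    }
    return 0;
}
```

Because replies are matched by transaction id, they may arrive in any order, and an error is just another incoming message with a negative status.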
Such an arch allows a simple cache coherency algorithm: the server just sends
missed entries or commands to remove some objects, and the client's core handles that just
fine, since its receiving code does not depend on the sending code. This is not
a 100% correct way to handle collisions (collisions thus become new objects
in the filesystem tree, like the old name plus some suffix), and it is not real
cache coherency, but it is what lots of users need.
A writeback cache does not play very well with cache coherency, since every metadata
change (like object creation or removal)
has to be checked against the server state, because different clients can do the same thing with
the same object. The level of paranoia has to be thought out in advance.
The first cache-coherency step is the implementation of a trivial scheme, where
every object is synced during its writeback time and changes are broadcast by the server
to other clients. If another client has the same object being processed,
it can either be renamed to a collision name or just overwritten. Having locks
and thus real states is the next step.
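Renaming on collision could look roughly like the sketch below. It is an illustration only: the ".collision-N" suffix format and both helper functions are my assumptions for the example, not what POHMELFS actually uses.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if name is present in the directory listing, 0 otherwise. */
int name_exists(const char *const *names, size_t count, const char *name)
{
    for (size_t i = 0; i < count; i++)
        if (strcmp(names[i], name) == 0)
            return 1;
    return 0;
}

/* Write into out a name that is not present in names: name itself if it is
 * free, otherwise name plus a hypothetical ".collision-N" suffix with the
 * smallest free N >= 1. Returns 0 on success, -1 if out is too small. */
int collision_name(const char *const *names, size_t count,
                   const char *name, char *out, size_t outlen)
{
    int n = snprintf(out, outlen, "%s", name);

    if (n < 0 || (size_t)n >= outlen)
        return -1;
    for (unsigned int i = 1; name_exists(names, count, out); i++) {
        n = snprintf(out, outlen, "%s.collision-%u", name, i);
        if (n < 0 || (size_t)n >= outlen)
            return -1;
    }
    return 0;
}
```

The alternative mentioned above, overwriting the object, needs no new name at all; the rename variant trades a cluttered directory for not losing either client's data.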
Also, POHMELFS does not have authentication or strong checksums right now,
and although this is simple to implement, its priority is questionable.
There is also the possibility of implementing cryptographically strong encryption of the
communication channels.
So, lots of ideas, but the main part is ready: the async data processing design was
definitely the right choice, and it makes all the other features very simple
to complete.
A new release will be announced very soon, stay tuned!
/devel/fs