Mon, 24 Dec 2007
First CRFS (cache coherent remote file system) results.
Zach Brown posted
the first public results for his CRFS filesystem.
He compared NFS and CRFS with the remote storage on disk (likely btrfs) and in RAM (tmpfs) for two operations:
creating a large number of files and directories with small writes (untarring a kernel archive, which means a lot of metadata operations)
and reading all that data back into RAM.
In both tests CRFS is noticeably faster: the metadata test (untarring the kernel archive) is 4 times faster
with disk storage and about 6 times faster with RAM, and CRFS reading is about 1.8 times faster than NFS.
Very impressive results, although without knowledge of the CRFS internals it is quite hard
to tell where and how such a gain comes from, so I will handwave here :)
When CRFS is opened (if it is), we will check my guesses.
First, since there was a tmpfs test, the userspace server does not use anything btrfs-specific (like open
by inode). It is possible that btrfs exports some ioctls or that the kernel was patched, but for now
I will not assume so. So, first, the userspace server can work on top of any filesystem.
Second, reading is only about 2 times faster, while metadata operations are 4-6 times faster. Zach says
it is limited by disk speed, so this means metadata was heavily cached. There is a question, though: does
the server see the latest metadata change right away, or is it sent to the server only when another client accesses
the cached data (so that the caches become coherent)? Taking into account that NFS always sends metadata changes,
it looks like CRFS does not. If that is correct, then the next question is whether it needs to send metadata updates
at all until a sync or flush starts.
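To make the question concrete, here is a toy model of the two policies, counting round trips only. It is a sketch of my guess, not of how CRFS or NFS are really implemented; `CountingServer`, `WriteThroughClient`, `WriteBackClient` and `untar` are all made-up names.

```python
class CountingServer:
    """Hypothetical server stub: it only counts round trips."""

    def __init__(self):
        self.round_trips = 0
        self.entries = set()

    def apply(self, paths):
        # One call models one network round trip, however many changes it carries.
        self.round_trips += 1
        self.entries.update(paths)


class WriteThroughClient:
    """Sends every metadata change immediately (the NFS-like behaviour assumed above)."""

    def __init__(self, server):
        self.server = server

    def create(self, path):
        self.server.apply([path])

    def sync(self):
        pass  # nothing is ever pending


class WriteBackClient:
    """Keeps metadata changes in a local cache until sync (my guess about CRFS)."""

    def __init__(self, server):
        self.server = server
        self.dirty = []

    def create(self, path):
        self.dirty.append(path)  # no network traffic here

    def sync(self):
        if self.dirty:
            self.server.apply(self.dirty)
            self.dirty = []


def untar(client, n_files):
    """Stand-in for untarring an archive: many creations, then one sync."""
    for i in range(n_files):
        client.create(f"linux/file{i}")
    client.sync()
```

With 1000 creations the write-through client makes 1000 round trips and the write-back client makes one, and the server ends up with the same entries in both cases. The price is that the server does not see anything until `sync()`, which is exactly the coherence question above.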
Third, the userspace server is fast. With the logic I
described for the pohmelfs server,
I do not think mine will be able to compete, so there is something to think about.
Fourth, the network protocol in CRFS batches requests. This can be done either with a special transactional layer
between the VFS callbacks and the network, or through the way the VFS callbacks themselves work: for example, data is not sent
in the ->commit_write() callback, but only in ->writepage(), and only if there is strong demand
for it. The same applies to metadata operations: how are they batched, and how is network communication reduced,
to get a 4-6 times performance increase? The simplest approach is never to send them at creation time at all,
but only when writeback for the files starts (or when the cache-coherence algorithm requires it). So when, for example, a directory
is created, only a notification about the dirty parent dir is sent, and when a new file is created in this new dir, the content
of the directory is transferred.
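One possible reading of that scheme, as a toy model: the first change in a directory sends only a small "dirty" notification, and the directory content is transferred later, when writeback runs or when someone else asks the server for the listing. This is my own guess and not the CRFS protocol; `Server`, `Client`, `notify_dirty`, `store` and `writeback` are invented names.

```python
class Server:
    """Hypothetical server: stores directories and remembers who holds newer content."""

    def __init__(self):
        self.dirs = {"/": set()}
        self.dirty_owner = {}  # dir path -> client with changes not yet sent
        self.messages = 0      # messages received from clients

    def notify_dirty(self, path, client):
        self.messages += 1
        self.dirty_owner[path] = client

    def store(self, path, names):
        self.messages += 1
        self.dirty_owner.pop(path, None)
        self.dirs.setdefault(path, set()).update(names)

    def listdir(self, path):
        # Coherence point: recall pending content from its owner before answering.
        owner = self.dirty_owner.pop(path, None)
        if owner is not None:
            owner.writeback(path)
        return sorted(self.dirs.get(path, set()))


class Client:
    """Hypothetical client: batches creations per parent directory."""

    def __init__(self, server):
        self.server = server
        self.pending = {}  # dir path -> names created locally

    def create(self, parent, name):
        if parent not in self.pending:
            # Only the first change costs a message: "this dir is dirty".
            self.pending[parent] = set()
            self.server.notify_dirty(parent, self)
        self.pending[parent].add(name)

    def writeback(self, parent):
        names = self.pending.pop(parent, None)
        if names:
            self.server.store(parent, names)  # whole batch in one message
```

Creating 100 files in one directory costs one notification, and the content itself travels in a single message when `listdir()` on the server forces the recall.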
Anyway, pohmelfs currently has none of the features above, and it is actually read-only, but I already
see where it can be improved. For example, a directory listing (the ->readdir()
callback) goes to the server on each access (i.e. each ls /mnt forces the directory content to be resent), since
pohmelfs does not cache it.
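A rough illustration of what a client-side directory cache would save. This is a toy model and not pohmelfs code; `DirServer`, `UncachedClient`, `CachingClient` and `invalidate` are made-up names, and a real cache would also need the server to tell the client when to invalidate.

```python
class DirServer:
    """Hypothetical server: answers readdir requests and counts them."""

    def __init__(self, listing):
        self.listing = dict(listing)
        self.requests = 0

    def readdir(self, path):
        self.requests += 1
        return list(self.listing[path])


class UncachedClient:
    """Current behaviour as described above: every ls goes to the server."""

    def __init__(self, server):
        self.server = server

    def ls(self, path):
        return self.server.readdir(path)


class CachingClient:
    """Keeps the listing locally until it is explicitly invalidated."""

    def __init__(self, server):
        self.server = server
        self.cache = {}

    def ls(self, path):
        if path not in self.cache:
            self.cache[path] = self.server.readdir(path)
        return self.cache[path]

    def invalidate(self, path):
        # Would be triggered by a change notification from the server.
        self.cache.pop(path, None)
```

Ten calls to `ls("/mnt")` cost ten requests without the cache and one with it; after `invalidate("/mnt")` the next `ls` goes to the server again.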
There is a fair number of changes I want to implement to catch up with CRFS (I think so :), so stay tuned. I will
implement the basic functionality first and then run the same tests too...
Making bets? I vote for slower-than-NFS speeds, because of bad userspace support and no
caching of metadata.
But pohmelfs has been in development for only 3 days, it is quite young... So, stay tuned.