Zbr's days.
April
Sun Mon Tue Wed Thu Fri Sat
   
21
     
2008
Months
Apr
Aug Sep
Oct Nov Dec

About TODO Blog RSS Old blog Projects Gallery Notes

Mon, 21 Apr 2008

Cache coherency in POHMELFS.

Example:

Client 1			Client 2
# ls -a /mnt/
. ..
				ls -a /mnt
				. ..
				echo qwe > /mnt/asdasd
				sync
ls -a /mnt/
. .. asdasd
rm -f /mnt/asdasd
sync
ls -a /mnt/
. ..
				dmesg | tail -n1
				pohmelfs_remove_response: parent: 2, path: '//asdasd'.
				ls -a /mnt
				. .. asdasd
As you might noticed, when one client creates an object and it is written back to server (during writeback), it is broadcasted to all clients, which read the same directory before. This information is stored on server in binary tree, so it takes (M-1)*O(log(N)) time, where M is total number of clients and N is number of directories they read. This can be further optimized though.

Objects are not removed from clients, when one of them remove it (and this is synced to server via writeback), since so far I can not call sys_unlink() directly from module, and I did not yet wrote code to deal with dentry cache (that will be siple), instead you can see in dmesg, that another clients received a command and just need to drop inode and dentry.

Also inode information is not broadcasted yet (for example when file size increases or access rights are changed), so new files have always zero size. This informaion should be broadcasted during writing, and since server is heavily multithreaded, this should not hurt performance.

There is different opinion though: we do not need cache coherency at all, since the last writer will overwrite data anyway, and when we open new object, we first look it up on server, so if it was created there, it will be opened, but if it exists only in cache on some other client, we do not know about it anyway. We can broadcast above messages during object creation on clients, but this will be effectively write-through cache, since we can create object on server that time.

Anyway, I will proceed with either remove/stat messages, or with ability to copy data to userspace from different thread. The latter looks like very interesting hack.

/devel/fs :: Link / Comments (2)

anonymous wrote at 2008-04-22 20:53:

"This information is stored on server in binary tree, so it takes (M-1)*O(log(N)) time, where M is total number of clients and N is number of directories they read. This can be further optimized though."

just curious... how can be this optimized ? do you refer to the "M-1" part, dont you ? or the O(Loh(N)) ?

Zbr wrote at 2008-04-22 21:12:

It is possible to have tree of hashes and each one will have a list of registered clients, so we will not have to traverse all clients to determine which one registered with given dir. Result will be O(log(T) + K), where 'T' is number of total registered directories to watch and 'K' is number of clients for that particular dir, which at most is equal to number of clients minus one (original client, which performed operation).

Then we can replace binary tree with hash table or judy array.

Please solve this captcha to be allowed to post (need to reload in a minute): 45 + 87

Comments are closed for this story.