Zbr's days.


Tue, 23 Oct 2007

Studying existing distributed filesystems.

I have already written short notes about googlefs, Hadoop FS (HDFS) and DragonflyBSD hammer as part of the preparation for developing a new filesystem.

Now let's move into a slightly different area: IBM's GPFS (originally Tiger Shark FS) and PVFS (the second version, of course).

Here are my short notes about PVFS2, which I got from its design notes:

  • virtual filesystem - it is not a real on-disk filesystem, but a userspace wrapper on top of an ordinary filesystem, just like googlefs.
  • not POSIX compliant - what I do not get is its interface, which is heavily MPI oriented. It is possible to use the usual POSIX syscalls with a special kernel module, but it does not have file semantics - there are no files, only references, which can be deleted without regard for others who have already opened them.
  • no redundancy - this problem is handled either by having shared storage, or by so-called lazy redundancy, which basically means a new helper for user applications that allows forcing redundant writes for a given file.
  • lockless metadata updates - sounds like a really good idea, based on a strict state machine for the update process, but in practice complex races and fallbacks are possible, and they can be complex enough that going lockless is not worth it.
  • userspace IO daemons - PVFS2 uses a traditional UNIX filesystem to store data and Berkeley DB to store metadata.
  • really bad at serving several types of load, like executing binaries off the filesystem, shared mmapping of files, or storing mail in mbox format. PVFS2 was designed for different loads.
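
To make the file semantics point from the list above concrete, here is a minimal sketch of what POSIX guarantees on an ordinary local filesystem: a file that has been unlinked stays readable through descriptors that were opened before the unlink. As far as I understand the notes above, this is exactly the behaviour PVFS2 does not promise. The function name read_after_unlink is made up for illustration, and the sketch assumes a local POSIX filesystem (Linux, BSD); it says nothing about how PVFS2 itself is implemented.

```python
import os
import tempfile


def read_after_unlink(payload: bytes) -> bytes:
    """Write payload to a temp file, unlink its name, then read it back
    through the descriptor that was opened before the unlink.

    Hypothetical helper for illustration; assumes a local POSIX filesystem.
    """
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)
        os.unlink(path)                   # the directory entry is removed ...
        assert not os.path.exists(path)   # ... so the name is really gone
        os.lseek(fd, 0, os.SEEK_SET)      # rewind the still-open descriptor
        return os.read(fd, len(payload))  # data remains reachable via fd
    finally:
        os.close(fd)                      # storage is released only here


if __name__ == "__main__":
    print(read_after_unlink(b"still here"))
```

On a filesystem without this guarantee, the read after the unlink may fail or return nothing, which is why applications that rely on the "delete while open" trick for temporary files are a poor fit for it.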
This filesystem was designed for the sole purpose of handling heavy data flows created by huge scientific MPI applications. Most of it works in userspace.
But what I really like is how it was written - with a bit of fun, self-irony and an excellent description of what it is for - no empty advertising words and other pathos crap.
