Tue, 15 Jan 2008
POHMELFS development progress.
If you are curious about the strange delay in POHMELFS development, do not think
it is closed or stuck. There are a number of things I'm working on in this network
filesystem, and the delay is only due to administrative chores around my testing environment
and things like that...
Now it seems things have settled down and I have some news.
First, it supports object creation in the filesystem. So far only regular files work, but
directories and links are just a matter of additional flags, so that part is simple.
Second, it supports object removal (tested on files only though). It does not support
file writing yet, and all metadata operations described above (removal and creation)
send and receive data over the network (removal can be done in the local cache only).
I will write a more detailed explanation of the operations involved just after directory/link
creation is ready, likely tomorrow.
/devel/fs
BTRFS 0.10 has been released.
Chris Mason announced a new release of the BTRFS filesystem.
According to the changelog, this version contains pretty serious changes:
- on-disk format changes: it now supports back references from every data and metadata block.
This allows future extensions such as an online fsck
(which raises the question of why a COW FS needs it at all) and data migration between different
devices.
- online resizing (including shrinking)
- in-place conversion from ext3 to btrfs :) Although it is offline only, it is a very good
step towards easier migration for users.
The conversion program uses the copy on write nature of Btrfs to preserve the
original Ext3 FS, sharing the data blocks between Btrfs and Ext3 metadata.
Btrfs metadata is created inside the free space of the Ext3 filesystem, and it
is possible to either make the conversion permanent (reclaiming the space used
by Ext3) or roll back the conversion to the original Ext3 filesystem.
- data=ordered support. (Probably it is an option of the transaction log journal)
- mount options to disable checksumming and COW (the latter explains a lot about
fsck and journalling)
- barrier support
Judging from the changelog alone, it looks really impressive. My congratulations to the
project, although the list of unfixed bugs worries me a bit; I'm pretty sure things will be fixed.
/devel/fs
Direct IO with filesystem from the kernel and fast mapping for loop device.
Although every bit of the system is easily accessible from the kernel,
it is quite hard to do filesystem-related tasks there, since they are generally only
performed from userspace, for example reading and writing files. Actually
one can call the whole sys_open()/sys_read()/sys_write() path
from the kernel, but it is quite slow and inefficient.
Likely the most common example is the loop block device driver, which makes
a usual file look like a block device, so one can
mount it, create files there and so on.
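To make the idea concrete, here is a minimal userspace sketch of what a loop-style device fundamentally does: it turns a sector number into an offset in the backing file and does the IO there. This is not the kernel loop driver code. The names loop_rw() and loop_demo() and the 512-byte sector size are assumptions made up for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define SECTOR_SIZE 512u /* assumed sector size for this sketch */

/*
 * Transfer `count` sectors starting at `sector` between `buf` and the
 * backing file `fd`. Hypothetical helper, not a kernel API.
 * Returns 0 on success, -1 on error or short transfer.
 */
static int loop_rw(int fd, int do_write, uint64_t sector, void *buf, size_t count)
{
    size_t len = count * SECTOR_SIZE;
    off_t off = (off_t)(sector * SECTOR_SIZE); /* sector -> file offset */
    char *p = buf;

    assert(buf != NULL);
    while (len > 0) {
        ssize_t n = do_write ? pwrite(fd, p, len, off)
                             : pread(fd, p, len, off);
        if (n <= 0)
            return -1;
        p += n;
        off += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Round trip: write sector 3 of a temporary backing file, then read it back. */
static int loop_demo(void)
{
    char out[SECTOR_SIZE], in[SECTOR_SIZE];
    FILE *backing = tmpfile();
    int rc;

    if (!backing)
        return -1;
    memset(out, 0xab, sizeof(out));
    memset(in, 0, sizeof(in));
    rc = loop_rw(fileno(backing), 1, 3, out, 1);
    if (rc == 0)
        rc = loop_rw(fileno(backing), 0, 3, in, 1);
    if (rc == 0 && memcmp(in, out, sizeof(in)) != 0)
        rc = -1;
    fclose(backing);
    return rc;
}
```

Every request here goes through the regular file read/write path, which is exactly the overhead the faster block mapping approach tries to avoid.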
Over time the loop driver became more and more complex. I recall that my first
block layer driver (async block device,
which was similar to the loop device but allowed a lot of operations to be performed
asynchronously; it was used to test the acrypto
crypto system) was based on it.
The loop device is quite slow, so Jens Axboe (block layer maintainer) came into the game and
extended it to support a much faster mapping of the blocks to read/write from the kernel
than the existing one.
His first version was extended by Chris Mason (btrfs
author, among other things), who basically moved the mapping code into the filesystem,
so address space operations were extended to include new callbacks called
->map_extent() and ->extent_io_complete().
The former maps an offset inside the file to an extent. An extent is basically an on-disk area
larger than a block; so far extents are not supported by the mainline tree (at least the 2.6.24 tree),
so one can consider this callback a mapping from file offset to block number. Usually that mapping is
implemented by the filesystem-specific ->get_block() callback. The extent part of the patchset
adds a special tree of extents, which can be addressed by offset in the address space; if there
is no extent in the tree, one can be inserted. Extent creation is implemented via ->get_block().
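The lookup-then-insert logic described above can be sketched in plain userspace C. This is a toy model only: the real patchset uses a tree and kernel data structures, while this sketch uses a small linear array, and every name in it (toy_extent, toy_extent_cache, toy_get_block(), toy_map_extent()) as well as the 4096-byte block size is hypothetical and not taken from the patches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE  4096u /* assumed block size for this sketch */
#define MAX_EXTENTS 16

struct toy_extent {
    uint64_t start; /* first file offset covered, in bytes */
    uint64_t len;   /* length of the extent, in bytes */
    uint64_t block; /* disk block number backing `start` */
};

struct toy_extent_cache {
    struct toy_extent ext[MAX_EXTENTS];
    size_t nr;                /* number of cached extents */
    unsigned get_block_calls; /* how often we fell back to get_block */
};

/* Stand-in for a filesystem's get_block(): file block N -> disk block 1000+N. */
static uint64_t toy_get_block(struct toy_extent_cache *c, uint64_t file_block)
{
    c->get_block_calls++;
    return 1000 + file_block;
}

/*
 * Map a file offset to a disk block number. A cached extent is used when
 * one covers the offset; otherwise a one-block extent is created through
 * toy_get_block() and inserted. Returns 0 if the cache is full.
 */
static uint64_t toy_map_extent(struct toy_extent_cache *c, uint64_t offset)
{
    struct toy_extent *e;
    uint64_t file_block;
    size_t i;

    for (i = 0; i < c->nr; i++) {
        e = &c->ext[i];
        if (offset >= e->start && offset - e->start < e->len)
            return e->block + (offset - e->start) / BLOCK_SIZE;
    }

    if (c->nr == MAX_EXTENTS)
        return 0;

    file_block = offset / BLOCK_SIZE;
    e = &c->ext[c->nr++];
    e->start = file_block * BLOCK_SIZE;
    e->len = BLOCK_SIZE;
    e->block = toy_get_block(c, file_block);
    return e->block;
}
```

With a zero-initialized cache, mapping offset 5000 misses and calls toy_get_block() once; mapping offset 8191 afterwards hits the cached extent and does not call it again.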
The second callback, ->extent_io_complete(), is only used to notify the calling layer when
IO is completed; so far it is only used to signal that hole filling is completed. Actually I do not know
how this callback can be used by a classical filesystem, but copy-on-write ones should benefit greatly,
since they automatically get an asynchronous completion, so the higher-layer tree can be updated. Classical
filesystems already handle this situation though. Since it is only implemented for hole filling, it looks
like a little hack :)
Here is Jens' first presentation,
and here is Chris' presentation
of the extent mapping code used to implement fast mapping in the loop device.
/devel/fs