Zbr's days.

Fri, 21 Dec 2007

Anatomy of the filesystem ->readpage() callback.

This callback is used to read a page from storage into RAM. It has the following prototype:

static int pohmelfs_readpage(struct file *file, struct page *page)
Here file is the object associated with the file opened in userspace, and page is the page where the filesystem has to put the data.
On-disk filesystems usually use VFS helpers (like mpage_readpage() or block_read_full_page()), which map the page onto a set of buffer_head objects. These are then submitted to the block layer, where the next level of reading from the disk happens. The mapping is implemented via a per-filesystem get_block() callback.
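
To make the page-to-block mapping concrete, here is a minimal userspace sketch of the arithmetic such helpers perform. It is not kernel code: the sketch_* names, the 4 KiB page size and the toy contiguous mapping are all assumptions made for illustration, and the callback signature is much simpler than the real get_block().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12u               /* assumption: 4 KiB pages */
#define SKETCH_MAX_BLOCKS_PER_PAGE 8u       /* 4096 / 512 */

/* Hypothetical stand-in for a filesystem's get_block(): maps a
 * file-relative block number to an on-disk block number.
 * Returns 0 on success, non-zero on failure. */
typedef int (*sketch_get_block_t)(uint64_t file_block, uint64_t *disk_block);

/* Toy mapping: the file is contiguous and starts at disk block 1000. */
static int sketch_contiguous_get_block(uint64_t file_block, uint64_t *disk_block)
{
	*disk_block = 1000 + file_block;
	return 0;
}

/* Fill 'out' with the disk blocks backing page number 'index'.
 * 'blkbits' is log2 of the block size (the i_blkbits idea).
 * Returns the number of blocks per page, or -1 if a mapping failed. */
static int sketch_map_page(uint64_t index, unsigned int blkbits,
			   sketch_get_block_t get_block, uint64_t *out)
{
	unsigned int i, nr;
	uint64_t first;

	assert(blkbits >= 9 && blkbits <= SKETCH_PAGE_SHIFT);

	nr = 1u << (SKETCH_PAGE_SHIFT - blkbits);        /* blocks per page */
	first = index << (SKETCH_PAGE_SHIFT - blkbits);  /* first file block */

	for (i = 0; i < nr; i++)
		if (get_block(first + i, &out[i]) != 0)
			return -1;

	return (int)nr;
}
```

With 1 KiB blocks (blkbits of 10), page 2 covers file blocks 8 to 11, so sketch_map_page(2, 10, sketch_contiguous_get_block, blocks) fills blocks with 1008 to 1011 and returns 4.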

Pohmelfs does not follow this pattern, since it does not know which filesystem is on the remote side, and since there is no block device under it. Instead it just uses a request/reply protocol to get the given page from the remote host. The page structure already contains its index, i.e. its offset in pages from the beginning of the file (from the beginning of the address space, actually), and the page is locked, so simultaneous access is not possible. We only need to fetch the data and, if the copy was successful, mark the page as uptodate.
Simple.
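
The flow described above can be sketched in plain userspace C. This is an illustration only, not the pohmelfs source: struct sketch_page, struct sketch_remote and sketch_fetch() are hypothetical stand-ins, the "network" is a memory buffer, and zeroing the tail of a short page is my assumption about how a partial last page would be handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u   /* assumption: 4 KiB pages */

/* Hypothetical, heavily simplified stand-in for struct page. */
struct sketch_page {
	uint64_t index;                        /* page number within the file */
	int uptodate;                          /* set once the data is valid */
	unsigned char data[SKETCH_PAGE_SIZE];
};

/* Hypothetical "remote server": the file is just a buffer in memory. */
struct sketch_remote {
	const unsigned char *file;
	size_t size;
};

/* Stand-in for one request/reply round trip: copy up to 'len' bytes
 * starting at 'offset'. Returns bytes copied, or -1 past end of file. */
static long sketch_fetch(const struct sketch_remote *r, uint64_t offset,
			 void *buf, size_t len)
{
	if (offset > r->size)
		return -1;
	if (len > r->size - offset)
		len = (size_t)(r->size - offset);
	memcpy(buf, r->file + offset, len);
	return (long)len;
}

/* Fetch one page and mark it uptodate only if the copy succeeded. */
static int sketch_readpage(const struct sketch_remote *r, struct sketch_page *page)
{
	uint64_t offset;
	long got;

	assert(r != NULL && page != NULL);

	offset = page->index * SKETCH_PAGE_SIZE;   /* index -> byte offset */
	got = sketch_fetch(r, offset, page->data, SKETCH_PAGE_SIZE);
	if (got < 0)
		return -1;                         /* page stays not uptodate */

	/* Assumption: a short reply means end of file; zero the rest. */
	memset(page->data + got, 0, SKETCH_PAGE_SIZE - (size_t)got);
	page->uptodate = 1;
	return 0;
}
```

In the real callback the page is already locked by the caller and has to be unlocked when the read completes; the sketch leaves locking out entirely.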

Here is the result:
server $ md5sum /tmp/ltp-full-20071130.tgz
77bf4032c10c03e858512a5a90c05015  /tmp/ltp-full-20071130.tgz

client # md5sum /mnt/tmp/ltp-full-20071130.tgz
77bf4032c10c03e858512a5a90c05015  /mnt/tmp/ltp-full-20071130.tgz



Anatomy of the filesystem. ->lookup() and ->read_inode() callbacks. First pohmelfs results.

I talked about the ->readdir() callback previously; now it's time to get to the other two most significant callbacks in the VFS layer.
I call these three the most significant, since without them it is impossible to mount a filesystem and get data from it, so they have to be implemented for any FS.

Ok, let's first look at ->lookup().
It has the following prototype:

struct dentry *pohmelfs_lookup(struct inode *dir, struct dentry *dentry, struct nameidata *nd)
As the name suggests, this callback is used to look up the inode for a given directory entry.
One can check struct dentry: it contains a qstr field, which in turn holds a char array containing the name, along with its length and hashed value (used in the dentry cache).
When the inode number is found for the given directory entry, the inode has to be allocated and filled with metainformation. It should then be added to the dentry:
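err = -ENOMEM;
/* iget() calls ->read_inode() for a new inode; NULL means allocation failed */
inode = iget(dir->i_sb, cmd->ino);
if (!inode)
	goto err_out_free;

kfree(data);

/* attach the inode to the dentry and hash the dentry */
d_add(dentry, inode);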
That's all for this callback. Pohmelfs uses a simple request/reply protocol to get the inode for a given name. The userspace server is rather dumb and keeps a linked list (it will be changed to a tree) of all object names in a given directory, so it looks the parent directory up, finds the given name in that directory, and then sends the data to the client. Once the list becomes a tree this operation can potentially be fast (only two tree lookups: one to get the parent dir in the main tree and one to find the object in the dir).
In the future the pohmelfs client can cache the received information, so that subsequent accesses to the same dir would not require rather slow network operations. Right now it does not.
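
The server-side lookup described above might look roughly like the following userspace sketch. The structures and function names are hypothetical, not taken from fserver.c, and for brevity the directories themselves are also kept in a linked list here, where the text above mentions a main tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-directory entry: one object name and its inode number. */
struct sketch_entry {
	const char *name;
	uint64_t ino;
	struct sketch_entry *next;
};

/* Hypothetical directory object holding a linked list of entries. */
struct sketch_dir {
	uint64_t ino;
	struct sketch_entry *entries;
	struct sketch_dir *next;
};

/* Step 1: find the parent directory by its inode number. */
static struct sketch_dir *sketch_find_dir(struct sketch_dir *dirs, uint64_t ino)
{
	for (; dirs != NULL; dirs = dirs->next)
		if (dirs->ino == ino)
			return dirs;
	return NULL;
}

/* Step 2: find 'name' (with explicit length, like qstr) in that directory.
 * Returns the inode number, or 0 if the parent or the name is unknown. */
static uint64_t sketch_lookup(struct sketch_dir *dirs, uint64_t parent_ino,
			      const char *name, size_t len)
{
	struct sketch_dir *dir;
	struct sketch_entry *e;

	assert(name != NULL);

	dir = sketch_find_dir(dirs, parent_ino);
	if (dir == NULL)
		return 0;

	for (e = dir->entries; e != NULL; e = e->next)
		if (strlen(e->name) == len && memcmp(e->name, name, len) == 0)
			return e->ino;

	return 0;
}
```

Both steps are linear scans here; replacing each list with a tree keyed on inode number and on name is what would turn them into the two tree lookups mentioned above.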

The second callback is ->read_inode(). As the name suggests, it has to read the inode's metainformation from disk into RAM. It has the following prototype:
static void pohmelfs_read_inode(struct inode *inode)
Quite simple. The following members have to be filled in this callback:
  • i_mode - file mode (file/dir/something else, access rights)
  • i_nlink - number of links to this inode
  • i_uid/i_gid - uid/gid of the owner
  • i_blocks - number of blocks allocated for this object on disk
  • i_rdev - if the object is a device special file, this holds its device numbers
  • i_size - size of the object
  • i_version - used by some filesystems to show that the given inode is dead (or not uptodate)
  • i_blkbits - 1 shifted left by this number gives the filesystem block size
  • i_mtime/i_atime/i_ctime - modification/access/change times for the given inode
  • i_fop - file operations for the given inode, these include read/write/readdir/aio_read and so on
  • i_op - inode operations, these include lookup
  • a_ops - address space operations (reached through i_mapping), these include readpage/writepage/sync_page/prepare_write/commit_write
Pohmelfs uses a simple request/reply protocol to get this information from the remote server (except for the operation tables, which the client sets itself).
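
As a rough illustration of filling these members from a server reply, here is a userspace sketch. Everything in it is an assumption: struct sketch_inode_info is an invented reply layout (the real pohmelfs wire format is not shown here), struct sketch_inode only mimics a few fields of struct inode, and an enum stands in for the i_fop/i_op/a_ops tables.

```c
#include <assert.h>
#include <stdint.h>

/* File-type bits as Linux stores them in i_mode (octal). */
#define SKETCH_S_IFMT  0170000u
#define SKETCH_S_IFREG 0100000u
#define SKETCH_S_IFDIR 0040000u

/* Hypothetical reply payload sent by the server for one inode. */
struct sketch_inode_info {
	uint32_t mode, nlink, uid, gid, blkbits;
	uint64_t size, blocks;
	int64_t  mtime, atime, ctime;
};

/* Stand-in for choosing operation tables on the client side. */
enum sketch_ops { SKETCH_OPS_FILE, SKETCH_OPS_DIR, SKETCH_OPS_SPECIAL };

/* Hypothetical miniature of struct inode. */
struct sketch_inode {
	uint32_t i_mode, i_nlink, i_uid, i_gid, i_blkbits;
	uint64_t i_size, i_blocks;
	int64_t  i_mtime, i_atime, i_ctime;
	enum sketch_ops ops;
};

/* Copy the metainformation from the reply and pick operations locally. */
static void sketch_fill_inode(struct sketch_inode *inode,
			      const struct sketch_inode_info *info)
{
	assert(info->blkbits < 32);

	inode->i_mode    = info->mode;
	inode->i_nlink   = info->nlink;
	inode->i_uid     = info->uid;
	inode->i_gid     = info->gid;
	inode->i_blkbits = info->blkbits;
	inode->i_size    = info->size;
	inode->i_blocks  = info->blocks;
	inode->i_mtime   = info->mtime;
	inode->i_atime   = info->atime;
	inode->i_ctime   = info->ctime;

	/* Operations never travel over the wire; they depend on the type. */
	switch (inode->i_mode & SKETCH_S_IFMT) {
	case SKETCH_S_IFDIR: inode->ops = SKETCH_OPS_DIR;     break;
	case SKETCH_S_IFREG: inode->ops = SKETCH_OPS_FILE;    break;
	default:             inode->ops = SKETCH_OPS_SPECIAL; break;
	}
}

/* Block size is 1 shifted left by i_blkbits. */
static uint32_t sketch_block_size(const struct sketch_inode *inode)
{
	return (uint32_t)1 << inode->i_blkbits;
}
```

For a reply with mode 040755 and blkbits of 12, the sketch selects the directory operations and sketch_block_size() yields 4096.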

Having that, one can create a simple
$ wc -l fs/pohmelfs/*.[ch]
   120 fs/pohmelfs/config.c
   218 fs/pohmelfs/dir.c
   417 fs/pohmelfs/inode.c
    96 fs/pohmelfs/net.c
   169 fs/pohmelfs/netfs.h
  1020 total
network filesystem, which allows reading data from the remote server
$ wc -l ./fserver/*.[ch]
   267 cfg.c
   750 fserver.c
   581 list.h
   390 rbtree.c
   164 rbtree.h
  2152 total
Note that I took rbtree.[ch] and list.h straight from the kernel sources.

Here is an example on the client machine:
# ./cfg -a 192.168.4.81 -p 10250 -i 0
# mount -t pohmel /dev/hdb1 /mnt

# ls -l /mnt/
total 88
drwxr-xr-x   2 root root  4096 2007-12-21 15:01 bin
drwxr-xr-x   4 root root  3072 2007-12-21 15:01 boot
drwxr-xr-x  11 root root  3780 2007-12-21 15:01 dev
drwxr-xr-x 105 root root 12288 2007-12-21 15:01 etc
drwxr-xr-x   6 root root  4096 2007-12-21 15:01 home
drwxr-xr-x  14 root root  4096 2007-12-21 15:01 lib
drwx------   2 root root 16384 2007-12-21 15:01 lost+found
drwxr-xr-x   2 root root  4096 2007-12-21 15:01 media
drwxr-xr-x   2 root root     0 2007-12-21 15:01 misc
drwxr-xr-x   4 root root    28 2007-12-21 15:01 mnt
drwxr-xr-x   2 root root     0 2007-12-21 15:01 net
drwxr-xr-x   2 root root  4096 2007-12-21 15:01 opt
dr-xr-xr-x 197 root root     0 2007-12-21 15:01 proc
drwxr-x---   9 root root  4096 2007-12-21 15:01 root
drwxr-xr-x   2 root root  4096 2007-12-21 15:01 sbin
drwxr-xr-x   5 root root     0 2007-12-21 15:01 selinux
drwxr-xr-x   3 root root  4096 2007-12-21 15:01 srv
drwxr-xr-x   5 root root  4096 2007-12-21 15:01 storage1
drwxr-xr-x   7 root root  4096 2007-12-21 15:01 storage2
drwxr-xr-x  12 root root     0 2007-12-21 15:01 sys
drwxrwxrwt  20 root root  4096 2007-12-21 15:01 tmp
drwxr-xr-x  13 root root  4096 2007-12-21 15:01 usr
drwxr-xr-x  23 root root  4096 2007-12-21 15:01 var

# mount | grep mnt
/dev/hdb1 on /mnt type pohmel (rw)
Believe it or not, that is exactly the content of '/' on my desktop, which is used as the server.

The next step is the readpage/writepage/prepare_write/commit_write callbacks, which will allow reading and writing files.
Stay tuned.
