Mon, 31 Dec 2007
Mostly completed apartment development.
I finished painting, gluing, covering and lights, and called friends.
There are a number of tasks still to be completed, but they are mostly very
small, so they can be postponed for a while.
I feel really excited about my loft: it
looks very interesting to me and there is nothing
I would like to change or fix.
Groovy!
/devel/flat :: Link / Comments (0)
Sun, 30 Dec 2007
Meanwhile on the apartment development side.
A lot of changes. A huge step forward was made today (and last night).
By now I have completed my table (although it has only a single layer of varnish
and an unfinished rim), mostly finished the kitchen (there was not enough
wallpaper, but I painted the ceiling, glued all the wallpaper I had,
and will lay the floor covering tomorrow), and finished painting the room
(I have a blue wall now) and the hall
(no uchuu
yet). So the only really necessary thing is to remove a huge amount of dust and garbage
from the apartment and then lay the floor covering.
I think I'm ready for the New Year celebration, and the amount of work I have done
absolutely deserves a good one.
/devel/flat :: Link / Comments (0)
Sat, 29 Dec 2007
POHMELFS abbreviation.
POHMELFS stands for Parallel Optimized
Host Message Exchange Layered
File System.
And it now has a metadata cache on the client. It contains just
pohmelfs inodes, which are indexed by
three keys: the first is a tuple consisting of the
name hash, the parent inode number and the length of the string (this guarantees
that there will be no identical tuples), the second is the
inode number and the last one is the offset in the address space of the parent.
The cache update operation is independent of its usage, although both are guarded
by the same lock.
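The first of those keys can be sketched as a three-way comparison, which is what a tree or sorted index would need. This is only an illustration: the struct layout, field widths and function name below are hypothetical and are not taken from the pohmelfs sources.

```c
#include <stdint.h>

/* Hypothetical lookup key: name hash, parent inode number, name length. */
struct sketch_name_key {
    uint32_t hash;       /* hash of the entry name */
    uint64_t parent_ino; /* inode number of the parent directory */
    uint32_t len;        /* length of the name string */
};

/* Three-way comparison: negative, zero or positive, like strcmp(). */
int sketch_name_key_cmp(const struct sketch_name_key *a,
                        const struct sketch_name_key *b)
{
    if (a->hash != b->hash)
        return a->hash < b->hash ? -1 : 1;
    if (a->parent_ino != b->parent_ino)
        return a->parent_ino < b->parent_ino ? -1 : 1;
    if (a->len != b->len)
        return a->len < b->len ? -1 : 1;
    return 0;
}
```

Comparing the hash first keeps most comparisons down to a single integer test; the parent inode number and the length only matter when hashes are equal.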
/devel/fs :: Link / Comments (0)
Fri, 28 Dec 2007
Table development.
It happened that I changed my table design once more; now
it has a single right angle. Since my wooden door has
gaps between the wood planks, I decided not to remove the orgalite
(paper filled with glue) sheets from the top and bottom of the
planks, but orgalite does not soak up a mordant, so I will
get a coloured varnish and cover the table top again.
It is quite hard to work with wood, especially on
straight edges, without a plane, using only a knife and an electric jigsaw,
so I will buy a plane too. I also need a set of chisels.
Given that, the table is still at a very early development stage,
but one can check a preliminary photo (taken with a phone,
so the quality is not very good)
here.
Since the table is postponed, I will paint the walls in the hall and
try to clean my loft a bit. I would also like to set up a boiler and
start the last arch (in place of the kitchen door), but that is too
loud a process, so I will likely postpone it until tomorrow too.
It will be a busy day, and quite a short one actually: Mephody and Irin arrive,
which is the start of the NY celebration process!
/devel/flat :: Link / Comments (0)
Thu, 27 Dec 2007
Meanwhile on the apartment development side: blue wall and table.
Yes, I've made it: now I have a blue wall. The colour is called 'royal marine',
and although it does not look like the sea (I have not been near the
sea for so many years that I do not even recall when the last time was,
so maybe the sea has changed; and I have never seen an ocean at all, so I will
change that too), it looks great. But to make me feel worse, the
devil told me to buy too little paint, so I actually have only three quarters
of the wall painted. I will fix it tomorrow or in a day, when I
go to the development shop to get an LED cord for the
ceiling. So painting in the room and hall will be finished very soon.
The same roughly applies to my
table
development. Not that I made much progress, but I cleaned my old wooden entrance door
(which is the base for the table), put it on the floor and drew the table contour on it.
It looks very expressive and completely different from the pictures above: there
are no straight lines (although its base is the letter 'L'),
it will have only a single leg (I bought it today) at the
end of the longer side, and its opposite side will be attached to the walls
and will be level with the window sill. Maybe later I will replace the single leg
with one running from floor to ceiling: that side of the table is essentially round,
so it will be convenient to put some round (glass) shelves there.
I tried to saw the table using my electric jigsaw, but it is quite a loud process
and it is already about 24:00 in Moscow,
so I postponed it until tomorrow or later. It has to be completed this year,
since I need a table to put a lot of things on (for example a fir tree made of
the many empty beer bottles and jars I have here).
Since I have no chairs, I will make a couple of long benches if I have enough
material (there are two doors here: the bigger one (about 200x90 cm)
is used as the base for the table, the smaller one (200x60 cm) for the smaller part
of the letter 'L', and the rest will be used for benches).
I could get wood boards in the development shop (it has a lot of interesting
types), but I would have some trouble getting them home
without a car and do not want to wait for delivery (which would take about a week).
I also got a water hatch for my bathroom today, but I will not install it, since
I have no glue for ceramic tiles, so it is better to devote this time to other interesting
tasks.
Expect some photos of my loft closer to NY time...
/devel/flat :: Link / Comments (0)
Wed, 26 Dec 2007
New release of the distributed storage: Groundhogs strike back: no New Year for humans!
Short changelog:
- mirroring algorithm improvements
- debug cleanups
- extended mirroring initialization
- documentation update
- name is 'Groundhogs strike back: no New Year for humans' now
As usual, one can get patch or pull changes from the project
homepage.
/devel/dst :: Link / Comments (2)
CDMA (EVDO) vs GPRS.
Good people gave me a SkyLink CDMA modem (model USB CNU-550 with EVDO support)
to test the internet connection in Linux and compare it against GPRS.
Well, here are my conclusions:
- CDMA always works, while GPRS really sucks in the middle of the day. This has to be confirmed
tomorrow, but I tested the CDMA modem at about 19:00 and it worked fine, while MTS (Mobile TeleSystems
in Russia) at about 12:00 worked very badly (I connected quickly, but ssh login took an enormously long time).
- CDMA speed is usually about 10-16 kb/s, while GPRS usually cannot go higher than 1-2 kb/s.
More on this: I believe a CDMA session can reach higher speeds (the pppd session requests about
900 kbit/s). The behaviour of the initial login (I saw quite a few of them at different initial speeds
during usual work and userspace network stack testing)
shows that there is some limitation on the server or hardware side (i.e. on the SkyLink side, either because of
a special tariff scale or because of driver limitations; I was told there is no 16 kb/s limitation
with this card though). A note on testing different congestion
control algorithms and developing my own if needed: speed frequently drops below 10 kb/s
during download of a big file. SkyLink is a very interesting source of data for such development:
it looks like its RTT is quite high for the default (new CUBIC) congestion control; at least a low-traffic but
latency-sensitive workload (
mutt over ssh on remote host
without serious traffic shaping) works quite badly in this setup.
Very likely this is just empty speculation and the problem is in hardware on the server side (i.e.
it supports bulk streaming access at high speeds (16 kb/s), but fails to work
with low-latency applications, which exchange small packets in a ping-pong fashion).
- The CDMA USB CNU-550 modem works fine in Linux (modulo the issues above) with these peers/pap-secrets
files without any problems.
Anyway, CDMA SkyLink is much faster and smoother than MTS GPRS, so the decision
about which is better is quite obvious...
/other :: Link / Comments (3)
Tue, 25 Dec 2007
Continuing CRFS debates.
Zach Brown again shed
some light on his CRFS design and implementation. Let's compare the facts with
my thoughts.
The most exciting news is that
CRFS caches not only data but also metadata on the client, and flushes it to the server on writeback.
That is what allows it to achieve 4-6 times higher performance in metadata-intensive
operations.
Another piece of news is actually quite bad for the majority of potential CRFS users:
the userspace server is btrfs
specific, which can be another gain in the benchmarks (although it should
be noticeably smaller than the metadata caching part). The server does not require
any additional patches, but since it is btrfs specific, it likely worked
on top of a ramdisk (when the test was performed with RAM storage), not tmpfs.
The userspace server has exclusive access to the given block device, so it is not possible
to simultaneously mount it the usual way (it is probably possible to mount
it read-only whilst it is used by CRFS).
The client kernel module depends only on the ->write_begin()/->write_end()
patchset by Nick Piggin, which was added to mainline
recently.
Batching of network requests happens naturally in the request/reply protocol,
but a reply carries not just the answer to a single request but a set of them; since the client caches
metadata, it can check whether the data is in the cache and update it if needed.
With that knowledge, let's summarize the given bits:
- CRFS is btrfs specific, while pohmelfs is supposed to be fs-agnostic. This CRFS feature
allows faster (probably even noticeably faster) access to on-disk data. Do not think
it is a bad sign; consider it a client-server filesystem. No one claims AFS is bad because
there is only an AFS-specific kernel server. Here it is the same, but the server is in userspace.
From another point of view, not being able to work with the same btrfs volume locally can be a
show-stopper for some users.
- Metadata caching. That rocks. It has to be implemented.
- Extended request/reply protocol: i.e. do not reply with only a single piece of data (unless that was
explicitly requested), but try to combine objects. The most obvious example is the
->readdir()
callback, where each request from the client should transfer multiple objects, which will then
be cached.
I think I was correct in most if not all of my predictions about CRFS; maybe I should try forecasting the weather next time...
Given that, I have clear expectations of what pohmelfs should have and which results we should expect.
The CRFS project is a serious step forward in this area, so it is very exciting to work with its ideas
and move further.
Stay tuned!
/devel/fs :: Link / Comments (2)
Mon, 24 Dec 2007
Climbing evening.
That was hard. That was really bloody hard, but great training.
I climbed high over a number of new traces. First, for warming up,
I tried something new without a label (the new yellow
trace in the left vertical sector); it happened to be
quite a complex trace, so I completed it with a couple of falls,
since I did not know its exact holds. The next several climbs
were on the same part of the complex trace that starts on the horizontal
negative slope, but I skipped that part, since I wanted to know
how it behaves higher up; I already knew that its start is very
complex and fully corresponds to its category (the red 7a trace in the middle sector).
There were also a couple of simpler traces I did for warming up and at the very end
to completely drain my power and speed up the blood.
It was an excellent time!
/life :: Link / Comments (0)
First CRFS (cache coherent remote file system) results.
Zach Brown posted
first public results of his CRFS filesystem.
He compared NFS and CRFS with the remote storage on disk (likely btrfs) and in RAM (tmpfs) for two operations:
a large number of file/dir creations (a lot of metadata operations) with small writes (untarring a kernel archive)
and reading all that data into RAM.
In both tests CRFS is noticeably faster: the metadata operation test (untarring a kernel archive) is 4 times faster
for disk storage and about 6 times faster for RAM, and CRFS reading is about 1.8 times faster than NFS.
Very impressive results, although without knowledge of the CRFS internals it is quite hard
to tell where and how such a gain was created, so I will handwave here :)
When CRFS is opened (if it is), we will check my thoughts...
First, since there was a tmpfs test, the userspace server does not use anything btrfs specific (like open
by inode). There is a possibility that btrfs exports some ioctls or that the kernel was patched, but right now
I will not consider this a fact. So, first, the userspace server can work on top of any filesystem.
Second, reading is only about 2 times faster, while metadata operations are 4-6 times faster. Zach says
it is limited by disk speed, so this means metadata was heavily cached. There is a question, though: does the
server see the last metadata change, or is it sent to the server only when another client accesses the
cached data (so that the caches become coherent)? Taking into account that NFS always sends metadata changes,
it looks like CRFS does not. If that is correct, then there is a question of whether it needs to send metadata updates
at all until a sync or flush starts.
Third, the userspace server is fast. With the logic I
described for pohmelfs server,
I think it will not be able to compete, so there is room for thought.
Fourth, the network protocol in CRFS batches requests. This can be done either through a special transactional layer
between the VFS callbacks and the network or through the way the VFS callbacks work: for example, data is not sent
in the ->commit_write() callback, but only in ->writepage() and only if there is a strong demand
for it. The same applies to metadata operations: how are they batched and network communication reduced
to get a 4-6 times performance increase? The simplest case is never to send them at creation time at all,
but only when writeback for the files starts (or the cache-coherence algorithm requires it), so that, for example, when a directory
is created only a notification about the dirty parent dir is sent, and when a new file is created in this new dir, the content
of the directory is transferred.
Anyway, pohmelfs currently has none of the features above; it is actually read-only. But I already
see where it can be improved: for example, directory listing (the ->readdir()
callback) is invoked for each access (i.e. each ls /mnt forces the directory content to be resent), since
pohmelfs does not cache it.
There is a fair number of changes I want to implement to catch up with CRFS (I think so :), so stay tuned. I will
implement basic functionality first and then run the same tests too...
Making bets? I vote for slower-than-NFS speeds, because of bad userspace support and no
caching of the metadata.
But pohmelfs has been in development for only 3 days, it is quite young... So, stay tuned.
/devel/fs :: Link / Comments (0)
Sun, 23 Dec 2007
Continuing apartment development.
I think I have finished gluing ceramic tiles, at least for this year: first, I have no glue
left; second, only one vertical strip 2.5 tiles wide remains to be glued,
where the water hatch will be located, and since I do not have a hatch yet, I am not gluing tiles there.
Dirty work in the bathroom has been essentially completed: I will fill the 2 mm gaps
between tiles with plaster and attach the ceiling soon, and that will be the end.
The next dirty job is ceramic granite in the hall and checkroom. That will take some time,
but since I have no glue and am not sure it will be delivered this year, I can postpone this
task too. So the main issues are finishing the painting and the table.
When my head is aching I frequently come up with something really interesting and new,
so I have a new design for the table in mind; if I complete it, that will be really great.
First, the table is not movable, but attached to a wall (potentially two walls in the corner).
The end of the table that does not touch the wall is round and has a single leg (preferably a steel tube).
The other part, which touches the wall, has a turn, so the table looks like the letter 'L' with the smaller part
attached to the wall (and window); the bigger part is accessible from both sides.
Or something like that...
/devel/flat :: Link / Comments (0)
GPRS sucks!
Even the MTS one is so slow... It is enough to
read emails and check some news (you know, I have infinite patience
which almost never reaches its end), but I want a normal connection.
I know there is no Ded Moroz (Santa Claus), so there will be no fast
internet until after the New Year vacations (10 days in Russia),
when I will start kicking local ISPs again.
/other :: Link / Comments (0)
Sat, 22 Dec 2007
Meanwhile on the apartment development side.
I painted most of the room, and then decided to make one wall either ultramarine
or just marine blue. Just because I want to. So I am waiting for the paint; otherwise the
room would be completed.
I also glued some ceramic tiles in the bathroom; it is almost ready too.
Today I spent most of the time cutting tiles using a
corner-grinding machine
to make different shapes for corners, hatches, the door and so on. I became dirty as hell,
but completed everything except the corners and ceiling. Hopefully I will finish them tomorrow
(or likely not :).
/devel/flat :: Link / Comments (0)
Fri, 21 Dec 2007
Anatomy of the filesystem ->readpage() callback.
This callback is used to read a page from storage into RAM. It has the following prototype:
static int pohmelfs_readpage(struct file *file, struct page *page)
where file is the object associated with the file opened in userspace,
and page is the page where the filesystem has to put the data.
On-disk filesystems usually use VFS helpers (like mpage_readpage()
or block_read_full_page()), which map the page onto a set of buffer_head
objects, which are then submitted to the block layer, where the next level of reading from the
disk happens. This mapping is implemented via the per-filesystem get_block()
callback.
Pohmelfs does not follow this standard, since it does not know which filesystem is
on the remote side, and since there is no block device under it. So it just
uses a request/reply protocol to get the given page from the remote host. The page structure
already contains its offset from the beginning of the file (from the beginning of the
address space actually), and it is locked, so simultaneous access is not possible.
We therefore only need to fetch the data and, if the copy was successful, mark the page as uptodate.
Simple.
Here is the result:
server $ md5sum /tmp/ltp-full-20071130.tgz
77bf4032c10c03e858512a5a90c05015 /tmp/ltp-full-20071130.tgz
client # md5sum /mnt/tmp/ltp-full-20071130.tgz
77bf4032c10c03e858512a5a90c05015 /mnt/tmp/ltp-full-20071130.tgz
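To make the request side of ->readpage() more concrete, here is a small userspace sketch of how a page index could be turned into a byte range to ask the server for. The request layout, the names and the fixed 4096-byte page size are all assumptions made for illustration; the real pohmelfs wire format is not shown in this post.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4096-byte pages */

/* Hypothetical wire request asking the server for one page of a file. */
struct sketch_page_request {
    uint64_t ino;    /* inode number of the file on the server */
    uint64_t offset; /* byte offset of the page from the start of the file */
    uint32_t size;   /* number of bytes wanted, 0 if past end of file */
};

struct sketch_page_request sketch_make_page_request(uint64_t ino,
                                                    uint64_t page_index,
                                                    uint64_t file_size)
{
    struct sketch_page_request req;
    uint64_t page_size = 1ULL << SKETCH_PAGE_SHIFT;

    req.ino = ino;
    /* The page index is the page's position in the address space. */
    req.offset = page_index << SKETCH_PAGE_SHIFT;

    /* The last page of a file may be only partially filled. */
    if (req.offset >= file_size)
        req.size = 0;
    else if (file_size - req.offset < page_size)
        req.size = (uint32_t)(file_size - req.offset);
    else
        req.size = (uint32_t)page_size;
    return req;
}
```

For a 10000-byte file, page 2 starts at offset 8192 and only 1808 bytes are requested; the rest of the page would have to be zeroed before it is marked uptodate.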
/devel/fs :: Link / Comments (0)
Anatomy of the filesystem. ->lookup() and ->read_inode() callbacks. First pohmelfs results.
I talked about the ->readdir() callback
previously,
now it's time to look at the other two most significant callbacks in the VFS layer.
I call these three the most significant, since without them it is impossible to
mount and get data from a filesystem; they have to be implemented for any FS.
Ok, let's first look at ->lookup().
It has the following prototype:
struct dentry *pohmelfs_lookup(struct inode *dir, struct dentry *dentry, struct nameidata *nd)
As the name suggests, this callback is used to look up the inode for a given directory entry.
One can check struct dentry: it contains a qstr field, which in turn
has a char array containing the name, as well as its length and hashed value (used in the dentry cache).
When the inode number is found for the given directory entry, the inode has to be allocated and filled
with metainformation. It then should be added to the dentry:
err = -ENOMEM;
inode = iget(dir->i_sb, cmd->ino);
if (!inode)
    goto err_out_free;
kfree(data);
d_add(dentry, inode);
That's all for this callback. Pohmelfs uses a simple request/reply protocol to get the inode for a given name.
The userspace server is rather dumb and keeps a linked list (it will be changed to a tree) of all object names
in a given directory, so it looks the parent directory up, then finds the given name in the dir, and then sends
the data to the client. This operation can potentially be fast (only two tree lookups: one to get the parent dir
in the main tree and one to find the object in the dir).
In the future the pohmelfs client can cache the received information, so that subsequent accesses to the same dir would
not require rather slow network operations. Right now it does not.
The second callback is ->read_inode(). As the name suggests, it has to read the
inode's metainformation from disk into RAM. It has the following prototype:
static void pohmelfs_read_inode(struct inode *inode)
Quite simple. The following members have to be filled in this callback:
- i_mode - file mode (file/dir/something else, access rights)
- i_nlink - number of links to this inode
- i_uid/i_gid - uid/gid of the owner
- i_blocks - number of blocks allocated for this object on disk
- i_rdev - if the object is a device special file, this holds its device numbers
- i_size - size of the object
- i_version - used by some filesystems to show that the given inode is dead (or not uptodate)
- i_blkbits - 1 shifted left by this number gives the filesystem block size
- i_mtime/i_atime/i_ctime - modification/access/status change time of the inode
- i_fop - file operations for the inode; these include read/write/readdir/aio_read and so on
- i_op - inode operations; these include lookup
- i_mapping->a_ops - address space operations; these include readpage/writepage/sync_page/prepare_write/commit_write
Pohmelfs uses a simple request/reply protocol to get this information from the remote server (except
the various operations tables).
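On the server side most of these members have a direct counterpart in what lstat() returns, so a userspace server could collect them roughly as sketched below. The struct and function names are hypothetical, and deriving the block-size bits from st_blksize is only an approximation, since st_blksize is the preferred I/O size rather than the exact filesystem block size.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1 /* makes lstat() visible under strict -std= modes */
#endif
#include <stdint.h>
#include <sys/stat.h>

/* Hypothetical reply carrying the attributes the client puts into the inode. */
struct sketch_inode_info {
    uint32_t mode;      /* i_mode */
    uint32_t nlink;     /* i_nlink */
    uint32_t uid, gid;  /* i_uid, i_gid */
    uint64_t blocks;    /* i_blocks */
    uint64_t rdev;      /* i_rdev */
    uint64_t size;      /* i_size */
    uint32_t blkbits;   /* i_blkbits */
    int64_t mtime_sec, atime_sec, ctime_sec;
};

/* Returns 0 on success, -1 if the object cannot be examined. */
int sketch_fill_inode_info(const char *path, struct sketch_inode_info *info)
{
    struct stat st;
    unsigned long bs;

    if (lstat(path, &st) < 0)
        return -1;

    info->mode = (uint32_t)st.st_mode;
    info->nlink = (uint32_t)st.st_nlink;
    info->uid = (uint32_t)st.st_uid;
    info->gid = (uint32_t)st.st_gid;
    info->blocks = (uint64_t)st.st_blocks;
    info->rdev = (uint64_t)st.st_rdev;
    info->size = (uint64_t)st.st_size;
    info->mtime_sec = (int64_t)st.st_mtime;
    info->atime_sec = (int64_t)st.st_atime;
    info->ctime_sec = (int64_t)st.st_ctime;

    /* blkbits is log2 of the block size, so 1 << blkbits gives it back. */
    info->blkbits = 0;
    for (bs = (unsigned long)st.st_blksize; bs > 1; bs >>= 1)
        info->blkbits++;
    return 0;
}
```

The operation tables (i_fop, i_op and the address space operations) never travel over the network; they are pointers to client code and can only be set locally.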
Having that, one can create a simple
$ wc -l fs/pohmelfs/*.[ch]
120 fs/pohmelfs/config.c
218 fs/pohmelfs/dir.c
417 fs/pohmelfs/inode.c
96 fs/pohmelfs/net.c
169 fs/pohmelfs/netfs.h
1020 total
network filesystem, which allows reading data from the remote server
$ wc -l ./fserver/*.[ch]
267 cfg.c
750 fserver.c
581 list.h
390 rbtree.c
164 rbtree.h
2152 total
Note that I just took rbtree.[ch] and list.h from the kernel sources.
Here is an example on client machine:
# ./cfg -a 192.168.4.81 -p 10250 -i 0
# mount -t pohmel /dev/hdb1 /mnt
# ls -l /mnt/
total 88
drwxr-xr-x 2 root root 4096 2007-12-21 15:01 bin
drwxr-xr-x 4 root root 3072 2007-12-21 15:01 boot
drwxr-xr-x 11 root root 3780 2007-12-21 15:01 dev
drwxr-xr-x 105 root root 12288 2007-12-21 15:01 etc
drwxr-xr-x 6 root root 4096 2007-12-21 15:01 home
drwxr-xr-x 14 root root 4096 2007-12-21 15:01 lib
drwx------ 2 root root 16384 2007-12-21 15:01 lost+found
drwxr-xr-x 2 root root 4096 2007-12-21 15:01 media
drwxr-xr-x 2 root root 0 2007-12-21 15:01 misc
drwxr-xr-x 4 root root 28 2007-12-21 15:01 mnt
drwxr-xr-x 2 root root 0 2007-12-21 15:01 net
drwxr-xr-x 2 root root 4096 2007-12-21 15:01 opt
dr-xr-xr-x 197 root root 0 2007-12-21 15:01 proc
drwxr-x--- 9 root root 4096 2007-12-21 15:01 root
drwxr-xr-x 2 root root 4096 2007-12-21 15:01 sbin
drwxr-xr-x 5 root root 0 2007-12-21 15:01 selinux
drwxr-xr-x 3 root root 4096 2007-12-21 15:01 srv
drwxr-xr-x 5 root root 4096 2007-12-21 15:01 storage1
drwxr-xr-x 7 root root 4096 2007-12-21 15:01 storage2
drwxr-xr-x 12 root root 0 2007-12-21 15:01 sys
drwxrwxrwt 20 root root 4096 2007-12-21 15:01 tmp
drwxr-xr-x 13 root root 4096 2007-12-21 15:01 usr
drwxr-xr-x 23 root root 4096 2007-12-21 15:01 var
# mount | grep mnt
/dev/hdb1 on /mnt type pohmel (rw)
Believe it or not, that is exactly the content of '/' on my desktop,
which is used as the server.
The next step is the readpage/writepage/prepare_write/commit_write callbacks, which will allow
reading and writing files.
Stay tuned.
/devel/fs :: Link / Comments (0)
Thu, 20 Dec 2007
open-by-inode() vs. name lookup in network filesystems.
Network filesystem is a tricky bustard - depending on where it is implemented
(kernel or userspace) it is very different. By 'very' I mean really complex differences.
In the kernel an inode, the basic identity of an object, always exists for all objects
checked before (until special steps are taken to drop it, the inode usually
stays alive). For example, when you traverse some dir, inodes for every object
you checked continue to exist, even if you no longer use that directory.
When a file is opened, the inode is attached to the file; when the file is closed, the inode
lives on. This is a fundamental feature of the split between directory entries and inodes:
directory entries are linked into the tree, which we can see, while inodes
are shadow objects behind those entries.
In userspace things are completely different: there are no inodes, only files
identified by file descriptors. That's all. So, when the kernel performs a lookup,
it checks some name in the inode with a given number, i.e. it performs an in-kernel
reference-by-inode operation, but in userspace there is no API (except rare special cases,
which I think Zach uses in
CRFS,
and that is likely a good speedup for Btrfs)
to get a file handle by inode number. Basically userspace should either have an
open file descriptor for the parent directory, or perform a reverse lookup,
create a path and open the directory to check whether some object exists there, since
userspace can only work with file descriptors.
open-by-inode was marked by Linus Torvalds as fundamentally broken
for a number of reasons (namely because of races with directory layout changes
like move and rename), and likely that is correct, but the absence of such an API
greatly reduces the performance of userspace metadata operations.
Having the network fileserver in the kernel is of course much (MUCH) simpler and faster,
but for now its implementation will be postponed a bit.
The initial server will be quite dumb: it will always perform a lookup from the
root and always close the directory; later it will be possible to add a cache of opened directories...
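A cache of opened directories would let the server answer a lookup with one call relative to a descriptor it already holds, instead of walking the path from the root again. A minimal sketch using the POSIX fstatat() call follows; the function name is hypothetical and this is not the actual pohmelfs server code.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1 /* makes fstatat() visible under strict -std= modes */
#endif
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Look a name up relative to an already opened directory descriptor.
 * Returns 0 and stores the inode number on success, -1 on failure.
 */
int sketch_lookup_in_dir(int dirfd, const char *name, ino_t *ino)
{
    struct stat st;

    /* Do not follow a trailing symlink: the entry itself is wanted. */
    if (fstatat(dirfd, name, &st, AT_SYMLINK_NOFOLLOW) < 0)
        return -1;
    *ino = st.st_ino;
    return 0;
}
```

The descriptor keeps pointing at the same directory even if it is renamed or moved, which avoids part of the path races mentioned above, but the server still needs the descriptor in the first place: the inode number alone does not lead back to it.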
/devel/fs :: Link / Comments (8)
Wed, 19 Dec 2007
Anatomy of the filesystem. ->readdir() callback.
Here I will write simple notes about how some callbacks are
used in the Linux VFS and what a filesystem writer should implement
to be correctly understood by the VFS layer.
Let's start with essentially the first callback invoked by the VFS after the
fs has been mounted. As the name suggests, ->readdir()
is used to read directory content. Its prototype looks like this:
static int pohmelfs_readdir(struct file *filp, void *dirent, filldir_t filldir)
where filp is a file structure connected to the root inode (which you
have to initialize in the ->fill_super() callback to be able to mount the fs).
Dirent is a magic structure, which hosts all the directory content you will read,
and filldir is a function which transforms directory names
into the dirent structure.
Its prototype looks like this:
int filldir(void * __buf, const char * name, int namlen, loff_t offset,
            u64 ino, unsigned int d_type)
and it is invoked this way:
size = 1;
if (filldir(dirent, ".", size, filp->f_pos, inode->i_ino, DT_DIR) < 0)
    return -1;
filp->f_pos += size;
I think every argument is straightforward, except the last two:
the first of them is the inode number, which is the unique id of the inode; every filesystem
has to store it on disk for every inode. Obviously in Unix '.' refers to the current
dir, so its inode number should be taken from the current dir inode. For the '..' directory,
which is the parent of the given one, filldir() is executed in the following way:
size = 2;
if (filldir(dirent, "..", size, filp->f_pos, parent_ino(filp->f_path.dentry), DT_DIR) < 0)
    return -1;
filp->f_pos += size;
and for some other dir:
size = 8;
if (filldir(dirent, "test_dir", size, filp->f_pos, 14, DT_DIR) < 0)
    return -1;
filp->f_pos += size;
where '14' is the inode number of the 'test_dir' subdir.
Directory listing for this filesystem will look like this (data from live pohmelfs setup):
# ls -la /mnt/
total 9
drwxr-xr-x 1 root root 4096 1969-12-31 20:02 .
drwxr-xr-x 21 root root 1024 2007-02-08 15:04 ..
drwxr-xr-x 1 root root 4096 1969-12-31 20:02 test_dir
# mount | grep mnt
/dev/hdb1 on /mnt type pohmel (rw)
The last parameter of filldir() is the type of the directory entry. DT_DIR
is for directories, and the value corresponds to bits 12-15 of the stat.st_mode returned by the
stat() call.
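The relation between st_mode and the directory entry type is a plain mask and shift, which a short userspace sketch can show (the function name is made up for this example):

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1 /* makes the DT_* and S_IF* constants visible */
#endif
#include <dirent.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Extract the file type (bits 12-15 of st_mode) as a DT_* value. */
unsigned char sketch_mode_to_dtype(mode_t mode)
{
    return (unsigned char)((mode & S_IFMT) >> 12);
}
```

So a mode of S_IFDIR | 0755 maps to DT_DIR and S_IFREG | 0644 maps to DT_REG; the permission bits below bit 12 are simply dropped.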
Note that ->readdir() will be invoked (by ls -la at least) until
filp->f_pos stops changing, so after you have filled your directory entry and properly updated
filp->f_pos, you have to check whether the provided filp->f_pos has reached or exceeded the
size of the directory (here I mean the overall size used by every copied directory entry), and if it has,
just return 0.
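The termination rule is easier to see in a userspace simulation. Everything below is a stand-in: the sketch_ types only mimic the kernel ones, the directory content is a hard-coded table, and the inode numbers for '.' and '..' are arbitrary.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for struct file and filldir_t. */
struct sketch_file { long f_pos; };
typedef int (*sketch_filldir_t)(void *buf, const char *name, int namlen,
                                long offset, unsigned long ino,
                                unsigned int d_type);

struct sketch_entry { const char *name; unsigned long ino; };

static const struct sketch_entry sketch_dir[] = {
    { ".", 1 }, { "..", 1 }, { "test_dir", 14 },
};
#define SKETCH_NR (sizeof(sketch_dir) / sizeof(sketch_dir[0]))
#define SKETCH_DT_DIR 4 /* value of DT_DIR on Linux */

/* Emits every entry at or after f_pos, advancing f_pos by the name lengths. */
int sketch_readdir(struct sketch_file *filp, void *dirent,
                   sketch_filldir_t filldir)
{
    long pos = 0, total = 0;
    size_t i;

    for (i = 0; i < SKETCH_NR; i++)
        total += (long)strlen(sketch_dir[i].name);

    /* f_pos reached or exceeded the directory size: nothing left to do. */
    if (filp->f_pos >= total)
        return 0;

    for (i = 0; i < SKETCH_NR; i++) {
        int len = (int)strlen(sketch_dir[i].name);

        if (pos >= filp->f_pos) {
            if (filldir(dirent, sketch_dir[i].name, len, filp->f_pos,
                        sketch_dir[i].ino, SKETCH_DT_DIR) < 0)
                return -1;
            filp->f_pos = pos + len;
        }
        pos += len;
    }
    return 0;
}

/* Example callback: counts entries instead of copying them anywhere. */
int sketch_count_filldir(void *buf, const char *name, int namlen,
                         long offset, unsigned long ino, unsigned int d_type)
{
    (void)name; (void)namlen; (void)offset; (void)ino; (void)d_type;
    (*(int *)buf)++;
    return 0;
}
```

Calling sketch_readdir() twice on the same sketch_file shows the behaviour described above: the first call emits three entries and leaves f_pos at 11 (1 + 2 + 8), the second call sees that f_pos has reached the directory size and returns 0 without touching anything, which is what tells the caller to stop.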
So, how should a network filesystem
behave here? The answer is pretty simple: it should just send a request to the remote server to provide a
directory listing, copy the answer to an allocated buffer and fill the directory with the provided data. It is possible
to cache that data here, but each subsequent ->readdir() has to check with the server that the data
is still valid and has not changed.
With this work pohmelfs becomes a network filesystem,
with many interesting features I have in mind, which I will reveal when they get implemented.
This is not intended for mainline inclusion, since Zach Brown's work came first
and will likely be more stable and/or feature complete by the time this stuff becomes ready.
But nevertheless, stay tuned...
/devel/fs :: Link / Comments (0)
Tue, 18 Dec 2007
Fundamental race between block layer/IO and networking.
This title is about the impossibility of using the
network ->sendpage() method, which is used mostly
to transfer IO-mapped pages, without races, unless one either turns off offload capabilities
and copies data into a new buffer or uses its own acks in the protocol.
->sendpage() in the optimised case (the hardware supports checksum offloading
and scatter/gather) will not copy the content of the page to a new buffer, but instead
will increase the page's reference counter, so that the page cannot be freed. When
->sendpage() returns, this does not guarantee that the data was sent, received by the remote
side or whatever, since the packet can be queued (in hardware or in a qdisc) and can be retransmitted later.
There is no way to know that the data was received until an ACK (let's talk about TCP)
is received, but there is no API to learn that the ACK was received. When the ACK is received,
the appropriate packet will be found in the TCP retransmit queue and freed, which will drop the page's
reference counter.
If the user (and there is no other way actually) expects that after
->sendpage() returns the data can be processed (for example rewritten),
then there is a non-zero probability that the remote side will get this new data instead
of the old, which can lead to state machine breakage and data corruption.
One can try to use sendfile() and simultaneously write data to the
file: the remote side can get a mix of the old and new data. One can argue that using proper locking
around sendfile() and write will help, but actually it will not.
Consider the case when we send only a single page: after sendfile() has returned,
the data can still be in the queue, so a subsequent write, which no longer races with
sendfile() itself but still races with the actual data transmission, will overwrite the data and
the remote side will get the new data instead of the old.
There are two fixes for this problem: the first is not to use ->sendpage()
(or to use it with a copy of the data into a new buffer, which is essentially how
the usual send() works), the second is to use a protocol-specific acknowledgement
system, so that any subsequent operation on the given data is postponed not until
->sendpage()/sendfile() returns, but until that ACK is received.
Both greatly harm performance.
I would be really glad to find that my conclusions are incorrect.
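The race can be reproduced in miniature without any networking at all. The sketch below is a toy model, not kernel code: a "packet" either keeps a pointer to the caller's page (what a zero-copy send may do) or owns a private copy (what a plain send() does), and "transmission" happens some time after queueing. All names are hypothetical.

```c
#include <string.h>

#define SKETCH_PAGE 16

/* A queued packet: either references the caller's page or owns a copy. */
struct sketch_packet {
    const char *data;       /* what will be put on the wire */
    char copy[SKETCH_PAGE]; /* private snapshot, used by the copying path */
};

/* Zero-copy path: only remember where the page is. */
void sketch_queue_by_reference(struct sketch_packet *p, const char *page)
{
    p->data = page;
}

/* Copying path: snapshot the page at queueing time. */
void sketch_queue_by_copy(struct sketch_packet *p, const char *page)
{
    memcpy(p->copy, page, SKETCH_PAGE);
    p->data = p->copy;
}

/* Later (re)transmission: reads whatever the packet points at right now. */
void sketch_transmit(const struct sketch_packet *p, char *wire)
{
    memcpy(wire, p->data, SKETCH_PAGE);
}
```

If a page holding "old data" is queued both ways and then overwritten with "new data" before sketch_transmit() runs, the by-reference packet puts "new data" on the wire while the copied packet still sends "old data". No locking around the queueing call changes that, because the overwrite happens after the call has already returned.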
/devel/fs :: Link / Comments (8)
Climbing evening.
That was a very good, although again a bit shorter, training:
most of it was devoted to the complex trace with the start on the
horizontal negative slope, which sucked power very quickly, so that
at the end (after about 3 hours) I was not able to complete even small
parts of it (while doing it quite stably at the beginning).
The trace requires back and arms especially, so after the training I feel
tired as hell, which is great of course!
It was a very good time there today!
/life :: Link / Comments (0)
Mon, 17 Dec 2007
New release of the distributed storage: Dancing with the smoked neutrino.
Short changelog:
- new improved mirroring algorithm.
This algorithm uses sliding window approach for full resync
and write log for partial resync.
- fixed a number of typos and debug cleanups
- update inode size when the linear algorithm changes the size of the
storage at run time
- extended the set of sysfs files and documentation for them
- fixed leak in local export node setup
- name is 'Dancing with the smoked neutrino' now
Overall list of features of the DST can be found on project's
homepage.
DST is also exported as a git tree available for clone and pull from
here.
Interested reader can test DST with 2.6.23 tree too
(it should compile fine, but was not tested).
/devel/dst :: Link / Comments (4)
New distributed storage mirroring algorithm.
Resync logic - sliding window algorithm.
At startup the system checks the age (a unique cookie) of each node, and if it
does not match that of the first node, it resyncs all data from the first node in
the mirror to the others (the non-synced nodes). Each non-synced node has a
window, which slides from the start of the node to the end.
During resync all requests which enter the window are queued, thus
the window has to be sufficiently small. When the window is synced from the
other nodes, the queued requests are written and the window moves forward,
so the next window resync starts only when the previous window is fully completed.
When the window reaches the end of the node, the node is marked as synchronized.
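A minimal sketch of the sliding window in Python, under several assumptions: nodes are modelled as lists of blocks, the function name full_resync() and the `incoming` workload format are made up for the example, and the handling of writes that land outside the window (mirrored if the region is already synced, written only to the clean node otherwise) is my simplification, not something taken from the DST code.

```python
def full_resync(source, target, window, incoming):
    """Resync `target` from the clean `source`, one window at a time.

    `incoming` maps a window step to a list of (block, value) writes that
    arrive while that window is being synced (hypothetical workload).
    """
    size = len(source)
    start, step = 0, 0
    while start < size:
        end = min(start + window, size)
        queued = []
        for block, value in incoming.get(step, []):
            if start <= block < end:
                queued.append((block, value))   # inside the window: queue it
            else:
                source[block] = value           # outside: write as usual
                if block < start:
                    target[block] = value       # already synced region is mirrored
        # Copy the window from the clean node.
        target[start:end] = source[start:end]
        # The window is synced: replay the queued requests on both nodes.
        for block, value in queued:
            source[block] = value
            target[block] = value
        start, step = end, step + 1
    return True                                 # node can be marked as synchronized

source = list("abcdefgh")
target = ["?"] * 8
incoming = {0: [(1, "X"), (6, "Y")], 2: [(0, "Z"), (5, "W")]}
synced = full_resync(source, target, window=2, incoming=incoming)
print("".join(target), synced)
```

The smaller the window, the fewer requests get queued per step, which is why the window has to stay sufficiently small.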
If the age of the node matches the first one, but its log contains a different
number of write log entries compared to the first node (the first node always
counts as the clean one), then a partial resync is scheduled.
A partial resync will also be scheduled when the log entry pointed to by the resync
index of the node contains an error.
The mechanism of this resync type is the following: the system selects a synced node
(checking each node's flags), fetches the log entry pointed to by the resync
index of the given node, and resyncs that data from the other nodes to the given one.
Then it scans the rest of the write log for
other failed writes, so that the next resync block can be fetched for
them.
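The decision between the two resync types can be summarized as a small Python function. It is a sketch of the rules as stated above only: the function and parameter names are hypothetical, and the real code surely tracks more state than these five values.

```python
def choose_resync(first_age, node_age,
                  first_log_entries, node_log_entries,
                  entry_at_resync_index_failed):
    """Return which resync a node needs: 'full', 'partial' or 'none'."""
    if node_age != first_age:
        return "full"           # different cookie: resync everything
    if node_log_entries != first_log_entries:
        return "partial"        # same age, but the write logs diverged
    if entry_at_resync_index_failed:
        return "partial"        # the entry at the resync index holds an error
    return "none"

decisions = [
    choose_resync(7, 3, 10, 10, False),
    choose_resync(7, 7, 10, 8, False),
    choose_resync(7, 7, 10, 10, True),
    choose_resync(7, 7, 10, 10, False),
]
print(decisions)
```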
The mirroring log is used to store write request information.
It is allocated on disk and in memory (sync happens each time the
resync work queue fires), and eats about 1% of free RAM or disk
(whichever is less). Each write updates the log, so when a node goes offline,
its log is updated with error values, so that these entries
can be resynced when the node is back online. When the number of
failed writes becomes equal to the number of entries in the write log,
recovery becomes impossible (since old log entries were overwritten)
and a full resync is scheduled.
This does not work well when there are multiple
writes to the same locations: they are considered different
writes and thus will be resynced multiple times.
The right solution is to check the log for each write, which would work better if the log
were not an array, but a tree.
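A toy model of such a write log in Python, assuming a fixed-size ring of (sector, number of sectors, error) entries. The class and method names are invented for the example, and the deduplication at the end uses a plain set as a stand-in for the tree lookup mentioned above; it is not how DST stores its log.

```python
class WriteLog:
    """Fixed-size ring of write records for one mirror node (toy model)."""

    def __init__(self, size):
        self.entries = [None] * size
        self.head = 0
        self.failed = 0

    def record(self, sector, nr_sectors, error):
        """Log one write; `error` is True when the node could not take it."""
        self.entries[self.head] = (sector, nr_sectors, error)
        self.head = (self.head + 1) % len(self.entries)
        if error:
            self.failed += 1

    def needs_full_resync(self):
        """Once failures fill the whole ring, old entries are overwritten."""
        return self.failed >= len(self.entries)

    def failed_ranges(self):
        """Ranges a partial resync would have to copy, duplicates included."""
        return [(e[0], e[1]) for e in self.entries if e is not None and e[2]]

log = WriteLog(4)
log.record(0, 8, error=False)
log.record(8, 8, error=True)      # node went offline
log.record(8, 8, error=True)      # same location written again

ranges = log.failed_ranges()      # the same range is listed twice
unique_ranges = sorted(set(ranges))
partial_is_enough = not log.needs_full_resync()

log.record(16, 8, error=True)
log.record(24, 8, error=True)     # fourth failure: the ring is exhausted
print(ranges, unique_ranges, partial_is_enough, log.needs_full_resync())
```

The duplicate entry in `ranges` is exactly the repeated-write problem: without a per-write lookup the same block is resynced twice.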
/devel/dst :: Link / Comments (0)
Fri, 14 Dec 2007
Linux Test Project on top of DST storage.
# pwd
/mnt/ltp-full-20071130
# ./runltp -p -f fs -d `pwd`/tmp
...
# cat /mnt/ltp-full-20071130/results/results.2007-12-14.11.21.41.17106
Test Start Time: Fri Dec 14 11:21:41 2007
-----------------------------------------
Testcase Result Exit Value
-------- ------ ----------
gf01 PASS 0
gf02 PASS 0
gf03 PASS 0
gf04 PASS 0
gf05 PASS 0
gf06 PASS 0
gf07 PASS 0
-----------------------------------------------
Total Tests: 57
Total Failures: 0
Kernel Version: 2.6.22-rc5-dst
Machine Architecture: x86_64
Hostname: uganda
# mount | grep mnt
/dev/dst-storage-32 on /mnt type xfs (rw)
# cat /sys/devices/storage/n-0-ffff*/type
R: 192.168.4.81:1025
R: 192.168.4.81:1026
All 'fs' tests completed successfully, although I saw the following dump in dmesg:
[ 8398.605691] BUG: MAX_LOCK_DEPTH too low!
[ 8398.609641] turning off the locking correctness validator.
which is an XFS bug.
Since DST is quite a dumb device, these tests will not find tricky places, but they are good
at generating high load on top of the given block device.
/devel/dst :: Link / Comments (0)
New release of the userspace network stack.
Changed the data reading function: it no longer copies the TCP header into the
user's buffer, only the data. The packet socket reading path now
limits the maximum number of packets read that do not match the
created netchannel.
As usual, new release is available from project
homepage.
/devel/networking/unetstack :: Link / Comments (0)
New mirroring module in the distributed storage.
$ git-diff-index --stat HEAD drivers/block/dst/alg_mirror.c
drivers/block/dst/alg_mirror.c | 745 ++++++++++++++++++++--------------------
1 files changed, 364 insertions(+), 381 deletions(-)
It is cool and works well in my environment, but (like the previous one) it
forces a total mirror resync after the main storage node reboots or crashes (if it is
required, for example when the array was not already in sync and the main node rebooted).
I want to extend the DST mirroring algorithm not to force a full resync, but to store a log
of the writes on each node, so that when a new array starts, it checks not only
the age of the nodes (a unique id stored at the end of each node; if it does not match,
a total resync starts), but also the write log, so that if the latter does not match, only
the selected regions are synchronized.
Stay tuned...
/devel/dst :: Link / Comments (0)
Thu, 13 Dec 2007
Why pushing project into the kernel is not a main goal?..
One has to have some courage and not be afraid to throw something
out and create new things in place of the old, even if it requires a lot
of effort and causes some problems in the short term.
So I've just erased the mirroring algorithm from DST and will rewrite it mostly
from scratch, since I have a very interesting sync algorithm in mind,
which will not require a clean/dirty bitmap.
Having DST in the kernel would not allow me such flexibility...
/devel/dst :: Link / Comments (0)
Wed, 12 Dec 2007
Climbing evening.
It was again a bit late training and thus shorter than usual,
but nevertheless it was very intense: I tried the old complex
start on the horizontal negative slope and several times
managed to complete it fully. That's a very interesting and complex
trace in itself, but some time ago I tried some of its bits and completed
them. I think I can finish it without falls after several trainings,
but right now I'm working on what I think is the most complex part: the
power-sucking start.
A horizontal negative slope is usually a big problem for me because of
my power endurance; it also requires a very strong back in some movements,
so right now I'm feeling that I still have some muscles in my body and
they did not disappear after sitting in a chair most of the time.
Excellent time!
/life :: Link / Comments (0)
I was a bit pessimistic about DST design bugs.
Things are only bad when a resync of the mirror node is in progress...
I fixed both issues, but will spend additional time debugging and testing
them, since I do not like how it was done. I think I will rewrite the mirroring
resync logic.
Subrata Modak of IBM suggested using the
Linux Test Project, which I found to
have interesting benchmarks, which, while being very useful for filesystem
development, still can find some bugs in DST.
/devel/dst :: Link / Comments (0)
Shame on me or how complex are design bugs...
I have to admit that mirroring in DST is currently not well supported.
First, because of a bug I made in the early development stage. In DST there
are two objects which represent a part of the storage. The first one is a node:
this object contains information about the type of the storage and pointers to the
structure which represents the low-level device itself (like a block device or a network
connection). A network connection in turn is represented as a state structure,
which contains the socket, the state machine for transferred data and so on.
Nodes are used when a block IO request comes from the higher layer, and
states are used when data is transferred via the network. The former use
fine-grained reference counters: when a node is being operated on (a request is processed),
its reference counter is increased; if the operation becomes asynchronous
(for example the sending queue is full and thus the block cannot be sent right now),
then the block request is queued into the state's request list and the reference counter for
the node is dropped. If it reaches zero, the node is freed, which in turn
calls the exit callback for the state, which flushes the queue of requests.
Things seem simple and correct, but the devil is in the details: the async processing thread
can enter the game at any point and process the state too, which leads to bugs.
Second, DST mirroring can eat all your memory during resync, since it does not check the
amount of free RAM in the system and tries to allocate new pages until all memory is used.
This is already fixed in the private tree though.
And the last (known) problem is the mirror bitmap: it uses a single bit per sector
of the device, and although it uses vmalloc(), it still takes too much RAM.
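To get a feeling for the numbers, here is a back-of-the-envelope calculation in Python. The 512-byte sector size and the 1 TiB device are assumptions chosen for the example; the helper name bitmap_bytes() is hypothetical.

```python
SECTOR_SIZE = 512  # bytes per sector (assumed)

def bitmap_bytes(device_bytes, sector_size=SECTOR_SIZE):
    """Size of a one-bit-per-sector bitmap, in bytes."""
    sectors = device_bytes // sector_size
    return (sectors + 7) // 8          # round up to whole bytes

one_tib = 1 << 40
bitmap_mib = bitmap_bytes(one_tib) // (1 << 20)
print(bitmap_mib, "MiB of bitmap per TiB of storage")
```

Under these assumptions every terabyte of mirrored storage costs 256 MiB of bitmap, per node being tracked.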
Back to fixing.
/devel/dst :: Link / Comments (0)
Tue, 11 Dec 2007
First pohmelfs dmesg and bits of Linux VFS internals.
[ 9941.748766] pohmelfs_alloc_inode, inode: ffff81003bc83ac8.
[ 9941.755070] pohmelfs_read_inode, inode: ffff81003bc83ae0, num: 12,
inode is regular: 0, dir: 1, link: 0.
[ 9947.667710] pohmelfs_readdir: filp: ffff81003c5ad6a8, inode: ffff81003bc83ae0,
dirent: ffff81003c5aff38, filldir: ffffffff8027f274.
[ 9950.283976] pohmelfs_readdir: filp: ffff81003e82faa8, inode: ffff81003bc83ae0,
dirent: ffff81003a00ff38, filldir: ffffffff8027f274.
[10028.705354] pohmelfs_readdir: filp: ffff81003d4f1068, inode: ffff81003bc83ae0,
dirent: ffff81003e10ff38, filldir: ffffffff8027f274.
[10095.745022] pohmelfs_lookup: dir: ffff81003bc83ae0, dentry: ffff81003b5343a0,
nameidata: ffff81003e10fe88.
[10095.754922] pohmelfs_lookup: dir: ffff81003bc83ae0, dentry: ffff81003b5343a0,
nameidata: ffff81003e10fdf8.
uganda:~# mount | grep pohmel
/dev/hdb1 on /mnt type pohmel (rw)
uganda:~# ls -la /mnt/
total 0
It takes about 12kb of code just to register one's own filesystem and provide a number of VFS
callbacks, so that the filesystem can be mounted.
It is not possible to create files or directories, since the directory lookup method is not
implemented (it returns NULL); ls -l does not show any data, since the ->readdir()
callback does not fill directory entries, as there are no such objects in the filesystem
at all.
As you have understood, this is a fairly trivial implementation, which was created just as a reference point.
So far it includes stubs for the following VFS methods:
- basic address space operations:
->readpage(), which reads a page, usually implemented via the generic mpage_readpage(),
which uses the per-filesystem get_block() callback. This is called in the read path
when the file's page is not in the page cache yet.
->writepage(), which writes a page, usually via the generic block_write_full_page()
helper, which uses the per-filesystem get_block() callback. This is called by the VFS
core when there is a need to write a page from the cache to disk. This happens for example when
you call sync and friends.
->prepare_write()/->commit_write(), which are called in the write path (for example from
generic_file_buffered_write()). These functions have to reserve space on disk,
update related metadata and perform other private filesystem steps for the given page, which will be
flushed to that on-disk area in ->writepage().
- basic directory inode and file operations:
- file operations include
->read() callback, which has to return -EISDIR, and
->readdir(), which has to read directory entries for given inode into provided buffer.
Right now it is empty.
- inode operations are used to create/remove/lookup and perform other tasks on directory content.
Read-only filesystems only have to provide the
->lookup() callback, which is used to look up the
inode for a given directory entry. Others have to implement a lot more operations: create, lookup, link, unlink,
symlink, mkdir, rmdir, mknod, rename, setattr, a set of extended attribute operations and so on...
Pohmelfs currently does not perform anything at all, but already provides an empty lookup callback.
- basic file operations (file operations itself and inode operations for regular files):
- file operations for regular files are currently those provided by
generic_ro_fops,
which include:
->llseek() - generic_file_llseek() - seek inside file mapping, it just updates
files current position and performs some checks, so it does not include anything filesystem specific.
->read() - do_sync_read() - a helper used by read syscall, it will eventually call
->aio_read(), which is generic_file_aio_read() for this file operations,
it will call ->readpage() for pages, which are not yet in the page cache
->aio_read(), described above.
->mmap() - generic_file_readonly_mmap() - it will set up the mapping operations,
which include only a fault handler, which in turn will call page_cache_read(), which
ends up with ->readpage() calls. Of course mapping is a bit more complex task, but from
the filesystem point of view that is all we have to know.
->splice_read() - generic_file_splice_read() - this callback is used for splice
system calls, which ends up calling the same ->readpage() callback for the set of pages,
which are put into spliced buffer of pages.
- inode operations for regular files are not needed if it is a read-only filesystem (although it can provide
some useful callbacks like getting extended and usual attributes); for a usual filesystem at least the
->truncate() and ->getattr() callbacks are required.
/devel/fs :: Link / Comments (0)
Mon, 10 Dec 2007
I have started the layoff process.
Most of the projects have been moved to colleagues, talks with management are completed.
Just waiting for tiny bits and that's all...
/devel/other :: Link / Comments (2)
PohmelFS.
linux-2.6.fs$ mkdir fs/pohmelfs
linux-2.6.fs$ date
Mon Dec 10 19:38:53 MSK 2007
Stay tuned...
This is a working name of the filesystem, I will think about a release name later.
First I will implement a simple base, which will just register itself with the Linux
VFS code, so I will put here some specs about what the Linux VFS requires from a
filesystem. In parallel it will be used as a base for a
network filesystem
and/or a distributed/local filesystem.
/devel/fs :: Link / Comments (2)
New distributed storage release: Gamardjoba, genacvale!
Short changelog:
- wake up state when mirror detects an error to speed up reconnect
- if connecting in csum mode to no-csum server, do not enable csums
- do not clean queue until all users are removed
- allow to increase size of the storage in linear add callback
(with this change it is possible to add nodes into linear array
in real time without stopping storage. Filesystem has to be prepared
for the case when underlying device has changed its size.
Real-time addon of mirror nodes is also supported)
- allow to delete gendisk only after device was started
- dst debug config option
- Name: Gamardjoba, genacvale! ('Hi friend' in georgian)
Great thanks to Matthew Hodgson (matthew_mxtelecom.com) for debugging!
As usual, one can get new release from the project homepage.
/devel/dst :: Link / Comments (0)
Sat, 08 Dec 2007
Pancho Villa.
I spent an excellent time in this Mexican restaurant with friends.
We celebrated Perec's birthday: tequila flowed plentifully, burritos
were hot, and hours disappeared silently.
Excellent time!
/life :: Link / Comments (0)
Fri, 07 Dec 2007
Climbing evening.
It was quite a short and not very hard training: I came a bit later
than usual, and most of the time I tried a quite old but very complex
start on the horizontal negative slope. Meanwhile I talked with the instructor
and found that the start in question is missing one hold, which was
there originally, so that should explain why I fell. I will continue that
red trace next time, I even want to put a huge paper sign around another hold, located
where the old one was: 'I'm a red hold, I'm just feigning'.
/life :: Link / Comments (0)
Strong checksums in DST rock.
Great thanks to the person who suggested that I
implement them, and to Zach Brown, who showed that
Castagnoli CRC is a better choice than Adler.
I've debugged a setup where the system failed to mount an XFS filesystem on top of distributed storage,
and after strong checksums were turned on, the system detected that they were wrong, so some corruption
happened during filesystem setup.
Turning off TSO, RX and TX offload on the e1000 NICs of the machines which form the storage fixed the problem.
Strong checksums rock!
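For the curious, here is what a Castagnoli CRC check looks like in plain Python. This is a slow bitwise sketch for illustration, not the kernel's implementation: crc32c() is written from the standard CRC-32C definition (reflected polynomial 0x82F63B78), and the 512-byte block and the flipped bit are made-up test data standing in for a corrupted sector.

```python
def crc32c(data, crc=0):
    """Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78."""
    crc ^= 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF

block = bytes(range(256)) * 2        # a 512-byte "sector"
corrupted = bytearray(block)
corrupted[100] ^= 0x01               # a single bit flipped on the way

detected = crc32c(block) != crc32c(bytes(corrupted))
print(hex(crc32c(b"123456789")), detected)
```

The first printed value is the usual CRC-32C check value for the string "123456789", which is a handy way to verify any implementation.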
/devel/dst :: Link / Comments (3)
Distributed storage and long distances.
I've just completed some tests over the distributed system,
created on top of usual internet links between machines,
located in Moscow, Russia and London, UK.
A remote target was set up, then an XFS filesystem was created and mounted,
and some tests were run.
One of the machines (main storage server) is located behind at least
one NAT firewall.
/devel/dst :: Link / Comments (4)
The return of syslets.
Zach Brown announced
a new syslet patchset aimed at simplifying and stabilizing basic async operations.
Syslets are a mechanism for performing syscalls asynchronously: a new thread
is started when a syscall is about to block; the old thread blocks
and is scheduled away in favour of the new one, on behalf of which userspace continues its
execution.
Version 7 of the patchset was built on top of indirect syscalls; threadlets,
userspace function execution and async IO were removed from the patchset for simplicity,
and a number of comments and code clarifications were added.
Main goal of the syslets right now is to make fundamental things working right.
Asynchronous IO already has too long a history: it was implemented
as a state machine in KAIO and kevent AIO, and the
kernel supports AIO for direct IO operations (userspace requires libaio).
The syslet approach was shown to be in some cases much slower than libaio (which is actually
a synchronous operation for usual files), but this was attributed to unfairness of the CFS scheduler,
and (iirc) it was fixed/extended.
My main objection against this is the fact that when you have thousands of actively
running applications, the system starts sucking badly; but if it is possible to limit the maximum
number of working threads per user to some reasonable value, things will be just fine.
Syslets (and threadlets, their friendlier user) were supported by Linus and Ingo Molnar,
so very likely they will be the default way to do asynchronous IO and other operations.
Right now Zach highlighted following problems:
- ring buffer of syslet statuses limitations
- ptrace() problems
- stale data (when thread issuing a syslet calls for example
setuid(),
in which case another thread, which actually executes blocked syscall, contains wrong data)
- problems with
sys_clone() and syslets, sys_clone() is actually
a mechanism to create a new thread in syslets, so we get a recursion
None of the above problems is technically impossible to resolve, and I think it is not
that bad to introduce some simple limitations for users, so that the majority of async IO questions
are resolved with this mechanism.
/devel/other :: Link / Comments (0)
B(something)-tree vs RB-tree. On-disk allocations.
In the previous
article it was shown how btree and rbtree behave when allocations are
done in memory. In such conditions btree should suck compared to rbtree, and generally this is
true, although in some conditions its insert speed can be even slightly higher than rbtree's.
Now, let's check how they behave when all allocations are performed from disk.
The graph below shows the insert speed for both rbtree and btree in such conditions;
each node was allocated at a 1024+sizeof(node) offset from the previous one,
so that readahead and thus cached disk pages would not influence the results.
In total 1 million keys were inserted into the tree.
Search speed is roughly the same as in the in-memory tests, since most of the tree
sat in RAM after insertion.

The high jump around 220 keys is likely the place where the node size becomes big enough,
and the number of nodes small enough, that the whole tree starts to fit in the page cache.
In some cases there is no such peak and the graph slowly moves to around 40k insertions
per second, which likely happens when some background task is actively using the page
cache, flushing the test file's pages out of memory.
/devel/fs :: Link / Comments (0)
The most discouragement-resistant hacker out there.
That is what Jonathan Corbet calls me :)
/devel/other :: Link / Comments (0)
Thu, 06 Dec 2007
Multithreaded filesystem access.
Trees are generally (if not always) very bad at parallel access,
since there is no good strategy for what to lock, and a tree modification
usually requires changing more than one node and in some cases (like a b-tree
or an AVL tree) can lead to changes at every layer.
Thus it is much simpler to lock the whole tree during any change, but since
not every node of the tree is in main memory and some thus have to be fetched
from disk, this can lead to long delays per operation.
In contrast, the Linux VFS operates with pages, where each page is locked individually.
A similar change for hash tables (i.e. one lock per hash bucket) actually leads
to lower performance than locking the whole table with a single lock, because
the cache lines containing the per-bucket locks bounce between CPUs; but this, again, is only
applicable to main memory, since access to a single bucket in the hash table
is usually quite cheap even if it contains several entries.
So, I do not know a perfect locking scheme for trees allocated
on disk, and I will gain that knowledge through experiments.
The best testbed, the one closest to real life, is of course a trivial filesystem.
Initially this will be a simple and very small kernel module with a basic filesystem
in it, so that it can be trivially changed to support an on-disk filesystem and
a network filesystem.
I have wanted to put my dirty hands into it for quite a while already, so it is time to start...
Stay tuned!
/devel/fs :: Link / Comments (2)
A simple way to crash machine using XFS and DST.
Let's suppose you want to create an XFS filesystem on top of a DST array.
If you mistakenly run mkfs.xfs /dev/sda1 (let's suppose
you want to create the DST storage on top of the /dev/sda1 device)
and then start DST on top of /dev/sda1:
./dst -n storage -A alg_mirror -d /dev/sda1 -R -s0 -S0
this will overwrite the last sector of /dev/sda1,
where XFS stores its metadata. Mounting XFS after that will lead
to an almost 100% certain crash of the machine on 2.6.22 kernels because of some
bugs in XFS, which appear when XFS reads corrupted metadata from the
last sector.
To work with DST you have to operate on the /dev/dst-$storage-$num
devices (i.e. run mkfs.xfs /dev/dst-$storage-$num), and not on the
underlying ones.
/devel/dst :: Link / Comments (0)
Wed, 05 Dec 2007
BTRFS 0.9 release.
Chris Mason announced
new release of his btrfs filesystem.
It includes:
- bigger filesystem block sizes
- extended attributes (no ACL yet)
- extent alignment parameter
- inlining of the file data into btree
- number of performance and stability improvements
Chris also showed
a rough timeline for the filesystem development.
As he pointed out, btrfs is still very bad at database loads and does not support
multithreaded operations.
As you have probably guessed, the implemented inlining of file data into the btree
is virtually the scaling inodes
algorithm, although a bit simpler.
I do like btrfs, and wish this filesystem great success. But only until I start my own :)
Kidding of course.
/devel/fs :: Link / Comments (0)
Storage hotplugging in DST.
For the interested reader: yes, it is possible to add disks
into DST storage on the fly, but be sure that your filesystem supports that
(in case of linear setup), mirroring is fairly transparent.
Command to add another node into mirror setup is pretty simple:
./dst -n storage -A alg_mirror -S0 -s0 -a kano -p 1026
Just like adding usual node into the storage before it was started.
Please note, that when adding node which is smaller than current device size,
device size will be reduced and this can damage your filesystem!
The same applies to linear setup.
/devel/dst :: Link / Comments (0)
Tue, 04 Dec 2007
DST FAQ.
The most frequently asked question about DST is:
Can you give us a summary of how this differs from using device mapper with NBD or iSCSI?
Answer is quite simple:
From a high-level point of view it does not, but it operates quite differently:
it processes requests asynchronously, thus not blocking; it has a
different protocol with smaller overhead; it supports strong checksums; it has an
in-kernel export server, which supports simple security attributes (i.e.
permission to connect, to read or to write). It uses a smaller amount of memory
(zero additional allocations in the common path for linear mapping,
not including network allocations, and a smaller number of additional
allocations for the mirroring case).
DST supports failure recovery in case of a dropped connection (the core will
reconnect to the remote node when it is ready), thus it is possible to
turn remote nodes off and on without special administration steps. DST
has simple autoconfiguration at startup time (checksum support and
storage size autonegotiation). It is possible to turn one of the mirror
nodes off and use it as an offline backup, since a dst mirror node stores
its metadata at the end of the storage, so it can be mounted locally.
/devel/dst :: Link / Comments (0)
New distributed storage subsystem release.
This is a maintenance release and includes
bug fixes and simple feature extensions only.
Short changelog:
- fixed bug with XFS metadata update (it can provide slab pages to the
DST, so it is not allowed to transfer them using
->sendpage())
- fixed async error completion path
- extended netlink communication channel to report errors back to userspace
- DST name is now "The 10'th dynasty of smuggled slothes"
- number of fixes for userspace DST target
Great thanks to Matthew Hodgson (matthew_mxtelecom.com) for debugging and
fixes for userspace DST target and preliminary netlink extension patches.
As usual you can download this release from the homepage.
If you want to try distributed storage this release is a really good candidate to start with.
Enjoy!
Update: This release includes bug fixes for all bugs described
here,
including uninterruptible sync read operations.
/devel/dst :: Link / Comments (2)
The 22'th century netchannels release.
This is the 22'th release of the netchannels, a peer-to-peer protocol
agnostic communication channel between hardware and users. It uses
unified cache to store channels, allows to allocate buffers for data
from userspace mapped area or from other preallocated set of pages
(like VFS cache). All protocol processing happens in process context.
Users of the system can be for example userspace - it allows to receive
and send traffic from the wire without any kernel interference, to
implement own protocols and offload its processing to the hardware.
This idea was originally proposed and implemented by Van Jacobson.
This patchset (with the userspace network stack) is a logical continuation
of the idea, with a move to full peer-to-peer processing.
Short changelog:
- update cached route in the netchannel when it expires
Thanks to Salvatore Del Popolo (delpopolo_dit.unitn.it) for testing.
You can get the latest sources from netchannels homepage.
Userspace network stack is available from own homepage.
/devel/networking :: Link / Comments (0)
Mon, 03 Dec 2007
Climbing evening.
That was a great training: I completed not that many traces,
but they all were good. Several old ones, from really simple to quite
interesting, at the beginning, then a number of traverses and boulderings.
When Grange arrived it was already
quite late, so we did a couple of simple traces without rest in between,
and then the greatest trace started: 7a+ (although I was allowed to modify
it a bit, so I reduced its category down to 6c/6c+) over the black holds
in the left center sector. I did not finish it (it has not been on-sight
for quite a while already), since I was too tired, but I tried the key point several times,
and it flushed me down to the bottom.
I am tired as hell and that is a great feeling!
/life :: Link / Comments (0)
Sun, 02 Dec 2007
Distributed filesystem roadmap.
- Distributed storage. This step is mostly completed; although some bugs are there
and there is a number of features still to be implemented, work is being done on them
and it is on the home stretch. The feature list includes:
- sync/barrier support
- error reporting to userspace via netlink (patch was made by Matthew Hodgson (matthew_mxtelecom.com))
- some thoughts about sync operations which can get stuck in uninterruptible state if there are some
problems with remote nodes (Hi NFS), I will create a fix for this issue for DST at least.
There is a nasty bug in DST currently, which I cannot reproduce locally, so I am debugging it with Matthew
on his setup.
There is also a fair number of fixes for the userspace DST target made by him. Great thanks!
- Local filesystem with a very scalable and fast on-disk format,
the possibility to have on-line backups, snapshots, no fsck, scalable
locking (multithreaded reading and writing).
This originally had a design very similar to btrfs,
but I want to move further and have the ability to perform multithreaded and then
multi-machine access to the same files. Call me a loser or a wheel reinventor (I would not be
where I am if I cared about it), but I want to have a project where I know every single bit,
to be able to fix things quickly and break something if it is needed for a better implementation.
- Linking both the network and fs layers together; this will include
distributed byte-range locking and cache coherency for client nodes. Bits of this step
I described in a short discussion
with Zach Brown.
This does not mean the steps will be completed in the above order: I'm working in parallel in
different directions and some parts can appear earlier, so that I am able to evaluate
their problems.
Bug fixes obviously have the highest priority.
/devel/fs :: Link / Comments (0)
Meanwhile, on the apartment development side.
I reached a big milestone yesterday: I completed wallpaper gluing
in the hall, room and checkroom. Although it still requires
some fixes and bits of work with a knife, it is a huge step forward.
I also wanted to paint the walls in the room yesterday, but slacked off.
Today is supposed to be another heavy working day: I will go
to the home improvement shop (on the opposite side of Moscow) to get a neon cord,
a water system hatch and a ceiling for the bathroom. If I return not too late,
I will start setting them up, otherwise I will paint the room.
I wonder when I will have a real vacation and some rest, but I think I found an answer:
tomorrow I will start the discharge process at work and expect it to be completed
in a week or two at most, so I will have about two weeks before the New Year
celebrations. Most of them will be devoted to the apartment development though.
Well, we will definitely have some rest in eternity...
/devel/flat :: Link / Comments (0)
Fri, 30 Nov 2007
Climbing evening.
That was a very good one: I tried a number of old traces, which I either never
climbed or climbed a couple of times and dropped. This included ones that are simple
at first sight (and by category), but actually quite complex on the wall,
and ones that are really complex by category and damn bloody complex on the wall.
Eventually I even changed one trace to be a bit simpler, so that it matched its category
(7a+), with the permission of the instructors, although I think it became too simple
after just a single hold change.
That was a good time!
/life :: Link / Comments (0)
Thu, 29 Nov 2007
The 21'th netchannels release.
Netchannel is a peer-to-peer protocol agnostic communication channel between hardware and users.
It uses unified cache to store channels, allows to allocate buffers for data
from userspace mapped area or from other preallocated set of pages
(like VFS cache). All protocol processing happens in process context.
Users of the system can be for example userspace - it allows to receive
and send traffic from the wire without any kernel interference, to
implement own protocols and offload its processing to the hardware.
This idea was originally proposed and implemented by Van Jacobson.
This patchset (with the userspace network stack) is a logical continuation
of the idea with move to the full peer-to-peer processing.
One of its users is userspace network stack.
Short changelog:
- fixed queue length usage
- fixed dst release path. Both problems reported by Salvatore Del Popolo (delpopolo_dit.unitn.it)
- removed nat user
More details can be found on project homepage.
/devel/networking :: Link / Comments (0)
B(something)-tree vs. RB-tree.
These simple benchmarks test btree vs. rbtree search and insert speeds
when nodes are allocated in memory.
The btree is (likely) a b+tree, where each data node is located at the bottom layer (it is probably
a bit different, since the algorithm does not strictly follow the rules described in btree papers).
1 million entries were inserted or searched in the tree.
The graphs below show the speed of the given operation depending on the number of keys in the
btree node (when the number of keys is smaller than 20, linear search of the key is used,
otherwise binary search).
[Graph: B(something)-tree vs. RB-tree. Search speed.]
[Graph: B(something)-tree vs. RB-tree. Insert speed.]
As you can see, btree search is about 2-2.5 times slower than rbtree. This is explained by the fact
that rbtree contains data in each node, while btree only in the lowest nodes, so each btree search has
to travel over all layers (roughly the logarithm of the total number of entries, taken to the base of the
number of keys in each node), with up to as many comparisons per layer as there are keys in a node
(in the worst case), while rbtree needs to perform the same amount of searches only
in the worst case, and generally it requires about 2 times fewer traverses.
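The per-node key lookup described above (linear search for small nodes, binary search otherwise) can be sketched roughly as follows. This is only an illustration in Python, not the benchmarked code: the function name find_slot and the constant LINEAR_SEARCH_LIMIT are hypothetical, and the only thing taken from the post is the threshold of 20 keys.

```python
from bisect import bisect_left

# Threshold taken from the post: nodes with fewer keys are scanned linearly.
LINEAR_SEARCH_LIMIT = 20


def find_slot(keys, key):
    """Return the index of the first key >= `key` in a sorted node.

    `keys` is the sorted list of keys stored in one tree node
    (a hypothetical in-memory representation).
    """
    if len(keys) < LINEAR_SEARCH_LIMIT:
        # Small node: a simple scan, at most len(keys) comparisons.
        for i, k in enumerate(keys):
            if k >= key:
                return i
        return len(keys)
    # Large node: binary search, about log2(len(keys)) comparisons.
    return bisect_left(keys, key)
```

Both branches return the same slot, so the threshold only changes the number of comparisons made inside a node, not the result of the lookup.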
Btree insert can be even slightly faster than rbtree, since during insertion the tree grows from the low levels and each new
allocation reserves space for several keys, thus greatly reducing the number of needed splits or rotations.
I will run the same tests for the case when all nodes are allocated from disk storage, which should
show a clear win for the btree approach.
Stay tuned.
/devel/fs :: Link / Comments (0)
Astonishingly screwed tapeworm.
New release of the distributed storage
subsystem. This is a maintenance release and includes bug fixes only.
Short changelog:
- use node's size in sectors instead of bytes
- fixed old/new ages for the first node. Error spotted by Matthew Hodgson (matthew_mxtelecom.com)
- fixed debug printk declaration
- new name
The overall list of DST features can be found on the project's
homepage.
/devel/dst :: Link / Comments (4)
Wed, 28 Nov 2007
Slackass.
Yes, you, who sat near the computer instead of going climbing. Who likely
forgot how to hold a rope, how holds look and how to use them. You, who
no longer know what power in the muscles is and how tiredness kicks the body.
You, who strain only when standing up from the chair.
You, who promised to go climbing and miss trainings one by one, you are a slackass :)
Meanwhile I damaged a foot at the training - I climbed without insurance
on the horizontal negative slope, and when I moved on top of it (only about 4 meters up)
I fell and unfortunately managed to land my right foot in the hole between
the floor mats and damage it by kicking the floor. I think it is not cracked, but it aches quite
noticeably when I move. So likely I will become a slackass too.
It was quite an interesting training though - I even climbed high on the wall with
the strongest negative slope - I tried a new trace on-sight and fell several times during that,
although the trace is quite simple and the holds are big. I also tried several boulder problems and
the above start on the horizontal negative slope several times.
It was a good training.
/life :: Link / Comments (1)
Teams make a business, individuals make innovations.
/other :: Link / Comments (4)
Reducing entropy of (software) bugs in the universe.
Yesterday I added
a bug to the Fedora Core bugzilla, today I fixed
one bug in the kernel bugzilla related to IPv6 addrconf.
My karma is clean again.
/devel/other :: Link / Comments (0)
20 hours.
Yesterday I woke up at about 6 o'clock and went back to bed (hammock)
today at about 2 o'clock. This resulted in properly working (although
lacking tricky interesting features) search/insert operations of the given
btree. It looks a bit different from the usual btree (or b+, b# or b-whatever);
right now it supports 64-bit keys, but I plan to extend it to 128 bits.
So far I have only tested it with usual memory, and thus it was a bit limited.
I will run the rb-tree vs. btree benchmark for on-disk and in-memory allocations
with billions of keys.
Stay tuned.
/devel/fs :: Link / Comments (0)
Tue, 27 Nov 2007
Fibre Channel over Ethernet Project.
Robert Love (if I understood correctly,
it is not the Robert Love who wrote "Linux System Programming",
"Linux Kernel Development" and "Linux in a Nutshell" :)
announced
a new Intel project aimed at allowing systems with an Ethernet adapter and a Fibre Channel Forwarder to
log in to a Fibre Channel fabric (the FCF is a "gateway" that bridges the
LAN and the SAN). That fabric login was previously reserved exclusively
for Fibre Channel HBAs.
The system provides both Fibre Channel and Ethernet transport modules, as well
as a software target and initiator. Right now the code cannot be
imported into the tree (small BSD code usage, small amount of documentation,
ioctl() usage and kernel/userspace interaction), but there are
several git trees, so that interested users can set up a testbed.
Homepage: http://open-fcoe.org/
/devel/other :: Link / Comments (0)
Reproducible GTK (probably buffer overflow) bug in FC7.
Program received signal SIGSEGV, Segmentation fault.
0x00b096e3 in ?? () from /usr/lib/libgdk_pixbuf-2.0.so.0
(gdb) bt
#0 0x00b096e3 in ?? () from /usr/lib/libgdk_pixbuf-2.0.so.0
#1 0x00b026f1 in gdk_pixbuf_composite_color () from /usr/lib/libgdk_pixbuf-2.0.so.0
#2 0x08083ece in gtk_tree_path_free ()
#3 0x0808450d in gtk_tree_path_free ()
#4 0x068c4a91 in ?? () from /lib/libglib-2.0.so.0
#5 0x068c67f2 in g_main_context_dispatch () from /lib/libglib-2.0.so.0
#6 0x068c97cf in ?? () from /lib/libglib-2.0.so.0
#7 0x068c9b79 in g_main_loop_run () from /lib/libglib-2.0.so.0
#8 0x06f20f44 in gtk_main () from /usr/lib/libgtk-x11-2.0.so.0
#9 0x08097d3f in gtk_tree_path_free ()
#10 0x007bff70 in __libc_start_main () from /lib/libc.so.6
#11 0x080532c1 in gtk_tree_path_free ()
(gdb)
It was obtained during btree debugging - I generated a big graph using Graphviz
and tried to view it with gqview, which crashed badly. All updates were
installed. x86 arch. I've filed a bug
in the Fedora bugzilla, but I'm not sure it will be resolved.
Crap - I still cannot develop my interesting btree, but I'm very close to the finish.
/devel/other :: Link / Comments (0)
Mon, 26 Nov 2007
Climbing evening.
That was a great training. Although besides a couple of warming
traverses I only did the same start on the horizontal negative slope,
I tried it many, really many, REALLY many times. That sucked all
the power, so at the end I was tired as hell, but that is a great feeling.
I met climbers there who had not been there for quite a while, so I thought I
would not meet them again anymore, but no, things changed...
Bloody excellent time!
/life :: Link / Comments (0)
Sat, 24 Nov 2007
Coherent Remote File System.
Zach Brown has an extremely
interesting idea of a network filesystem implementation.
One can think about it as an NFS client or, more precisely, as a
client-server protocol which allows clients to have a cache of
data instead of relying on the server. This of course requires a cache
coherency protocol to be involved in client-server communications,
which makes things more complex.
Simply put, this works as a trivial filesystem, mounted on clients,
where each read/write/meta operation is performed on top of locally
cached data; if data is not present in the local cache, it is fetched from
the server. The client flushes its updated cache to the server under a number
of various conditions, either because of the usual writeback process or
because of the cache coherency process (i.e. when another node reads
from a file updated by the given client).
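The read/flush behaviour described above can be illustrated with a toy in-memory model. It is a minimal sketch under loud assumptions: the Server and Client classes and all their methods are hypothetical names invented here, nothing in it reflects the actual design of the project, and it deliberately leaves out invalidation of stale copies held by other clients as well as concurrent writers.

```python
class Server:
    """Hypothetical server: stores file data and remembers who holds dirty data."""

    def __init__(self):
        self.data = {}          # path -> bytes
        self.dirty_owner = {}   # path -> client with updates not yet flushed

    def mark_dirty(self, path, client):
        self.dirty_owner[path] = client

    def write_back(self, path, content):
        self.data[path] = content
        self.dirty_owner.pop(path, None)

    def read(self, path, reader):
        owner = self.dirty_owner.get(path)
        if owner is not None and owner is not reader:
            # Coherency step: another node reads a file updated by `owner`,
            # so the owner has to flush its cache first.
            owner.flush(path)
        return self.data.get(path, b"")


class Client:
    """Hypothetical client: all operations go through the local cache."""

    def __init__(self, server):
        self.server = server
        self.cache = {}
        self.dirty = set()

    def read(self, path):
        if path not in self.cache:
            # Cache miss: fetch the data from the server.
            self.cache[path] = self.server.read(path, self)
        return self.cache[path]

    def write(self, path, content):
        # The write lands in the local cache only; the server just learns
        # that this client now holds newer data.
        self.cache[path] = content
        self.dirty.add(path)
        self.server.mark_dirty(path, self)

    def flush(self, path):
        if path in self.dirty:
            self.server.write_back(path, self.cache[path])
            self.dirty.discard(path)
```

In this model a write by one client stays local until either the client calls flush() itself (the usual writeback case) or a second client reads the same path, which forces the flush through the server.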
Zach will present
it at LCA this February.
So far it is a closed Oracle project (as far as I know, the open sourcing process is on the way,
just like it was with Chris Mason's btrfs),
and I strongly want to implement exactly the same idea myself :)
This process will have a number of benefits:
- a simple open source filesystem, which can be used as a base for real filesystem development
(do not confuse it with virtual filesystems like
sysfs or debugfs)
- ability to extend it for own protocols
- the cache coherency mechanism will be used in the distributed filesystem
- possibility to test byte range locking
in real life
- implement filesystem bits first in userspace (I do not want to introduce additional unpredictable
behaviour because of FUSE)
Zach, what about a small competition? :)
Frankly speaking, I'm not an expert in cache coherency protocols or in filesystem development either
(you will not believe me, but for the last several days I have been trying to implement an interesting B-tree,
but with each day spent on that problem I comment out more and more bits in the code and it still does
not work the way I want :).
With recent trends I believe I will have pretty high-end hardware soon to perform various tests and
find common and tricky bottlenecks.
This implementation can be used by various users aiming at distributed systems
who do not want to do (or bother with) real filesystem development and who
are ready to have a server in userspace on top of existing filesystems (in the
receiving zero-copy
project I showed a huge problem with in-kernel usage of some of the Linux
filesystems, especially those which use in-kernel JBD journaling, where
it is impossible to preallocate (->prepare_write()) a number of pages for a given
file and then write into them and commit (->commit_write()) at once for maximum performance).
/devel/fs :: Link / Comments (2)
Meanwhile at apartment development side.
I've completed the big arc in the room and finished most of the smaller
arc for the checkroom (rough strong emery paper rocks);
it requires some polishing and eventual
wallpaper glueing and painting. I think I will finish this part
(as well as painting all the room's walls) next weekend. I want to complete
the room, checkroom and (hopefully) hall. The latter requires a bit more
work - if I have enough time and glue for ceramics, I will set up
the ceramic granite floor. I hope to get a ceiling and a water system hatch
for the bathroom next weekend too, so that it would be completed as well.
When it is ready, I think I will have some rest, or maybe will go the usual way -
proceed with development of the kitchen. It is not that complex and will not
require a lot of time actually, even if I (and I want to) install
a hinged ceiling there too.
/devel/flat :: Link / Comments (0)
Tue, 20 Nov 2007
Crazy company wanted!
I am tired of my paid work. I really like all the people
here, but when I'm assigned tasks which can be completed
in several hours without major thinking, without interest and without
a good understanding of what they are needed for and whether they will be used
at all, and that has been happening constantly for the last years,
I feel really frustrated.
If you read this, then very likely you know what I can do
and how I behave (frequently not very well and not very friendly), and thus understand
my intentions.
I want to work on my own
projects first.
If you believe that they correlate with your business and want to pay
me for doing that with some influence over the TODO list, then feel
free to drop me a mail.
There are probably some issues with the process, which we can discuss privately.
/devel/other :: Link / Comments (4)
Maintenance release of the distributed storage subsystem.
It contains only the following bug fix:
- Cleanup sysfs files on error path. Patch by Chris Madden (chris_reflexsecurity.com)
You can find the latest release on the project homepage.
/devel/dst :: Link / Comments (0)
Mon, 19 Nov 2007
Kind of working...
Hacking on getting a live motion JPEG (Morgan codec) dataflow from the adv202 hardware codec.
One can watch
the resulting several-second SWF 'movie' (surrounding hardware captured by a small analog camera
connected to the above-mentioned codec on an AMCC PPC 405gpr CPU board), about 1.5 Mb.
/devel/other :: Link / Comments (2)
Sat, 17 Nov 2007
Meanwhile on apartment development side: the concrete jungle king.
Yes, I made it,
I installed the lavatory pan and wash-stand in the bathroom, although it is not finished yet
(I did not complete glueing ceramic tiles on the small wall with the door and on the part
of the wall where the water/sewerage hatch is located).
This required completing the sewerage and water projects in the bathroom. When I get
my camera back, I will make some photos - the water system is not that trivial: it consists
of filters for hot and cold water, counters and headers (water collectors). It does not yet
support a boiler, but I will set it up soon.
One of my fellows lives near a
good development shop, so he will bring me a special hatch, where I will install
tiles; a neon cord for my hinged ceiling;
and a ceiling for the bathroom.
I think I will complete it this month, and it would be great if I install a shower cabin,
since washing without it is a real pain in the ass.
Tomorrow I will start polishing my main arc at the room's door area and will glue
the bits of wallpaper I removed when I made it. I also plan to start painting on the wallpapers
tomorrow (and of course my 'uchuu').
If there is enough time I will also develop the small arc in the checkroom and install
chains for the coatrack.
The nearest future plans include hall development (ceramic granite on the floor, wallpapers
and paint on the walls; fortunately I already finished painting the ceiling, this time
it does not have any special details) and a move to the kitchen setup.
I also want to make a table
as soon as possible and start developing interesting
things at home.
Apartment development really sucks lots of my power, but I do like it very much.
Although I frequently break things and then have to fix that and move forward, it is
a good way I believe.
/devel/flat :: Link / Comments (0)
Fri, 16 Nov 2007
ARM MMU domains.
Grange brought me an Xscale board, so I will start
MMU domains
feature implementation for the 2.6 kernel.
This is a new area for me, and it is quite time limited - I have to return the board
in about two weeks, so either I will complete and submit it for inclusion
into the 2.6 kernel tree, or abandon it because of lack of time.
The board requires a 24 V power supply and I do not have one right now (I do not even have
two 12 V cords), so I will postpone powering it till Monday.
/devel/arm :: Link / Comments (0)
Climbing evening.
That was a very good training, Grange
finally completed his manager's tasks (i.e. slacking at the meetings
and fscking brains of the subordinates) so the climbing training was very interesting.
Not from the beginning, since he was late, but nevertheless.
Anyway, the training started with the usual warming traverses, then I moved
to the negative slope and tried several complex starts. I did that on the
wall created for bottom rope insurance, where I had not climbed for quite
a while already. It took quite a bit of time and power, so when I started
to climb upstairs on the walls with Grange, I was not very fresh.
Nevertheless I managed to complete several old traces and tried a couple of new
ones. The simpler one I finished without serious problems; I can probably
say I finished it on-sight, although I fell a couple
of times, but just because I was already too tired, not because of the absence of technique.
Another new trace was a quite complex one with a start on the horizontal
negative slope, which I previously did only partially: either the horizontal slope
or the rest of the trace. I tried to combine them, but fell, although I believe
I will finish it next time if I start early, when I have enough power.
That was a really good time there!
/life :: Link / Comments (0)
Thu, 15 Nov 2007
Ground points of the filesystem development.
1. Data read/write rebalance in the filesystem.
When it is possible to add/remove storages from the system,
there is a clear question about their utilisation. First, when
you have your data spread over different nodes/storages, reading
will always be faster, since it can be performed in parallel.
On the other hand, this can lead to heavy data fragmentation
if done incorrectly (like in the case of data that was tightly packed in the first place,
which after spreading will require heavy write/update overhead).
So, this is a good solution for read-mostly setups, but a bad choice
for write-mostly cases.
The cleanest solution for this issue I see is to use copy-on-write semantics,
which imply that each new write will be placed at a new location. Thus when
a new storage is added to the filesystem, it will be readily utilized for new
writes, which in turn can work with delayed allocation and extents, heavily reducing
fragmentation.
Reading is a bit trickier: ideally data should be spread over the new storage,
but having large contiguous regions for the same file is a huge win because
of read-ahead logic and the way disks work, so only fragmented files have to be
moved around. Here we enter defragmentation land, which is very small and easy
in a copy-on-write design - the file should be read and written to get a new
contiguous region, or a special operation should be introduced to do essentially the
same without writing to the data (like doing that on sync or flush).
So, to summarise my ideas, the only thing needed for high-performance reads and writes
in the case of multiple (or extendible) storages is to have copy-on-write semantics
behind the IO logic with correctly implemented balancing algorithms (like proper delayed
allocation and extent usage).
This is the first base point of my filesystem design.
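To make the rebalancing argument a bit more concrete, here is a minimal sketch of copy-on-write placement over several storages. Everything in it is an assumption made for illustration: the Storage and CowFilesystem classes are hypothetical, blocks are counted rather than stored, and "pick the storage with the most free space" is just one possible balancing policy, not a description of the planned filesystem. Delayed allocation and extents are not modelled.

```python
class Storage:
    """Hypothetical storage device that only counts used blocks."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self.used = 0

    @property
    def free(self):
        return self.capacity - self.used


class CowFilesystem:
    """Toy copy-on-write block map: every write goes to a new location."""

    def __init__(self, storages):
        self.storages = list(storages)
        self.location = {}  # (file, block number) -> Storage holding it

    def add_storage(self, storage):
        self.storages.append(storage)

    def write(self, file, block):
        # Assumed policy: place the new copy on the emptiest storage.
        target = max(self.storages, key=lambda s: s.free)
        if target.free == 0:
            raise OSError("no space left on any storage")
        target.used += 1
        old = self.location.get((file, block))
        if old is not None:
            old.used -= 1  # the previous copy is released after the write
        self.location[(file, block)] = target
        return target.name
```

With this policy a freshly added, empty storage attracts the next writes automatically, including rewrites of existing blocks, so data migrates to it without a separate rebalancing pass.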
2. Locking.
Obviously, the fewer locks you have, the less time you will spend in busy
loops (zero in the perfect case).
Thus the main design principle is to allow multiple IO (simultaneous reads and writes)
and metadata (file creation/deletion and so on) operations.
While multiple readers are handled just fine in the Linux kernel
via generic_file_aio_read(), all writers are stuck
on inode->i_mutex in generic_file_aio_write(),
which effectively blocks multithreaded writing to the same file.
But inode->i_mutex
should actually only guard metadata updates, not writing itself,
so this issue has to be resolved in any filesystem aimed at high performance
applications (no filesystem in the Linux kernel currently tries to avoid grabbing
inode->i_mutex for writes).
Taking into account the number of hacks I implemented
for networking without touching a lot of core code, I'm pretty sure I will
be able to do so for my own filesystem only.
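One way to let several writers work on the same file at once is to lock byte ranges instead of the whole inode. The following is only a userspace sketch of that idea in Python, not kernel code and not the mechanism planned for the filesystem: the RangeLock class and its methods are hypothetical, and a real implementation would need something better than a linear scan over the held ranges.

```python
import threading


class RangeLock:
    """Hypothetical byte-range lock: writers block only on overlapping ranges."""

    def __init__(self):
        self._cond = threading.Condition()
        self._held = []  # list of (start, end) half-open ranges currently locked

    def _overlaps(self, start, end):
        return any(s < end and start < e for s, e in self._held)

    def try_acquire(self, start, end):
        """Lock [start, end) if nobody holds an overlapping range."""
        with self._cond:
            if self._overlaps(start, end):
                return False
            self._held.append((start, end))
            return True

    def acquire(self, start, end):
        """Block until [start, end) can be locked."""
        with self._cond:
            while self._overlaps(start, end):
                self._cond.wait()
            self._held.append((start, end))

    def release(self, start, end):
        with self._cond:
            self._held.remove((start, end))
            # Wake up waiters so they can re-check their ranges.
            self._cond.notify_all()
```

Two threads writing to disjoint regions of one file would take (0, 4096) and (4096, 8192) and proceed in parallel, while a single whole-file mutex would serialise them.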
3. Motivation.
I do strongly believe that it is impossible to make really good things
when you are forced to do them. So, my idealism tells me that when
you are paid to do the work, it will not be completed in the best way.
Do not confuse this with getting money for things you do for yourself
or on your own initiative; these are completely different approaches.
4. Fun.
It has to be fun. If a project starts sucking the power without good
feedback, it has to be completed to the next milestone and frozen. If something
is not interesting, it should be avoided.
Those were my rules for a successful filesystem project;
the last two items obviously apply to any other project.
Stay tuned :)
/devel/fs :: Link / Comments (0)
CEPH distributed storage.
It was announced on LWN and kerneltrap recently.
I already wrote
about this filesystem; after that I found
(from a discussion with Zach Brown)
that this filesystem does not have byte-range locking, and when a number
of threads write to the same file, the writes become synchronous
(i.e. no cache coherence protocols are involved). I'm also not
sure what this is about: I/O workloads should be done with the client
cache off because the writeback is too non-deterministic.
Those were my envious comments :), now the good news.
First, Sage Weil (the author) works full-time on this project
and funds it from his own web hosting company, so it is possible to attract
developers with money (he even hired someone to write a kernel client
instead of the FUSE one). Second, it has a completed design and a working
implementation (although some design decisions are questionable).
So, it is likely a good choice for you to take a look at if you are searching
for a solution which should be ready shortly.
/devel/dst :: Link / Comments (0)
Wed, 14 Nov 2007
Climbing evening.
That was a very good training, although quite short - after the usual
warming traverses I tried a number of starts of various traces
on the negative horizontal slope, where I later tried several traces.
I found that, apart from the horizontal slope, I can do them pretty
easily already, so I continue to develop power endurance
on the negative slope. I cannot say there is major progress
in that area, but I can complete some starts on the negative
horizontal slope which I previously could not, so likely there
is some gain from these exercises.
/life :: Link / Comments (0)
Perfect bugs.
A recent thread
started by Natalie Protasevich, who is the kernel bugzilla master now, shows a
number of bugs which were recently reported against different kernel subsystems.
Andrew Morton replied, marking most of the bugs (if not almost all)
as not responded to by developers at all, and added some words about decreased kernel
quality. This raised quite a heavy (actually void) discussion about how this should be
fixed (not the bugs, but 'the process') and so on (I deleted the whole thread after reading
about one quarter of all messages, if not less).
There are two interesting points about fixing bugs which I want to highlight here.
First, do not even expect someone to look at your bug just because of that. I like
to fix bugs, I really like to do it, but having 'no reply from developers' attached
does not force anyone to start doing so. And it is frustrating when work is being done,
and done pretty well, but kernel leaders instead say that there was no reply from developers.
This frustration and the complete wrongness of such an approach were shown in the thread,
and I hope were taken into account.
Another issue is bug quality. I have a number of friends who are able to read minds, but
right now they are all on vacation, so it is pretty hard to determine what the bug is
(like 'I used 3 years old kernel and it works bad/crashes/destroy my data/whatever').
Yes, reporting a bug is not that trivial and simple; if you want it to be fixed,
please help us to do so, do not just throw it in and expect things to change.
A really perfect bug has a description and a test case. While I was writing this entry
I saw how a performance regression, reported by Nick Piggin, was analyzed
by David Miller and tested by Nick, the problem was found and the bug got fixed by Herbert Xu.
Just because it was a good bug report, with tight cooperation with the reporter.
If it had not been fixed right now (Americans will go to bed soon or should
be there already), I would have wanted to fix
it myself, just because it contains a perfect description of the problem
(performance degradation measured with a special benchmark tool) and it is possible to
(easily) perform the same tests locally (the tool is the tbench benchmark run over loopback).
And I am pretty sure, I even insist, assure and can even prove, that when a bug is reported correctly
and there is a way for developers to catch the problem, it will be fixed immediately.
Of course it is not always possible (like when the bug is in a driver and only a limited
number of people have the hardware), but even then the reporter should do a bit (just a really
small bit) of work - find the maintainer of the driver (it is easy - check the MAINTAINERS
file in the source tree or search for the driver name in the mail archives) and kick them.
Provide a lot of info, maybe resend the bug report several times (yes, people frequently
forget about such things as fixing their own bugs), copy the appropriate mailing lists.
If you have enough background, start helping developers a little bit more:
like use git-bisect
to find the exact commit which caused a regression, if it is a recent bug; or add debug
prints to determine where the driver stops working; or just run different (as simple as possible)
tests to show the exact conditions under which the problem occurs, so that developers can
reproduce the problem using their own setup.
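The reason bisection is cheap for the reporter is that it is a binary search over the history between a known good and a known bad version. The idea (not git-bisect itself) can be sketched in a few lines; the function first_bad and the is_bad callback are hypothetical names, and the sketch assumes a linear history where every commit after the regression is bad.

```python
def first_bad(commits, is_bad):
    """Find the first bad commit by binary search.

    `commits` is ordered oldest to newest; commits[0] is assumed good and
    commits[-1] is assumed bad. `is_bad` runs the test case on one commit.
    Returns the offending commit and the number of tests that were run.
    """
    lo, hi = 0, len(commits) - 1  # lo is known good, hi is known bad
    tests = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tests += 1
        if is_bad(commits[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return commits[hi], tests
```

For a history of 1000 commits this needs at most 10 test runs, which is why a simple, repeatable test case matters so much: it is the is_bad() step, and it has to be run once per round.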
These are really simple things to get bugs fixed, and we really can fix 'the process' without
all those words, just by doing things.
/devel/other :: Link / Comments (0)
Tue, 13 Nov 2007
Went to the development shop.
To get some small things - screws, drills for ceramics, nuts, chains and so on;
I also got bits of colour to draw uchuu
on the wall.
I searched for a shower cabin, but the "Leroy Merlin" shop has only crappy stuff; maybe only a couple
of cabins were good, but they contain things I really do not want to have there, like a radio,
their own shower sockets, a seat and so on... I want a simple thing: two glass walls
(maybe rounded) with good mechanical parts and fixings.
Still searching...
/devel/flat :: Link / Comments (0)
Mon, 12 Nov 2007
Climbing evening.
It was a very good training - I got very tired climbing
several new traces on the vertical wall but with a
start on the horizontal negative balcony. The end
of the training was devoted to simplified climbing -
I started not at the balcony, but higher on the vertical part,
but even that sucked power very noticeably on the new traces.
Anyway, th |