Zbr's days.

Fri, 29 Feb 2008

Music created solely from Windows 98/XP system sounds.

Enjoy.



Should it be filed under 'boyan of the day'?

/other :: Link / Comments (2)


Richard Stallman in Moscow.

Here is the schedule: he will give a lecture called "Free Software in Ethics and Practice" on March 4 at MIPT, where I studied (already a long time ago?).

I will not attend, though :)

/devel/other :: Link / Comments (1)


Debugging undebuggable.

If something looks undebuggable at first glance, then take a second one, preferably from a different angle. Some problems require a third look.

A bit of history of the problem. Pohmelfs has extremely large latencies when syncing a local inode to the remote server. This involves sending a command to the server to create an object with the given name and receiving back a response with its real inode information (like the inode number and other fields cached for faster stat() and similar workloads). Pohmelfs then changes the local inode info to match the real data.
Syncing a small tree of 500 files takes about 40 (!) seconds. Well, in the Xen environment where I develop these things, local creation of 500 files in a single ext3 directory takes more than 15 seconds, but the other 25 are pure overhead.
That was a short recap of the previous episodes.

Next, the problems of fixing the problem.
First, the Xen version used on that testing machine is old enough that oprofile does not work. Second, I do not know VFS internals well enough to determine where such long delays could be caught (this is my first filesystem; the interested reader can find how I managed to step on likely every possible rake in that field, some of them even small kid rakes...). A Linux filesystem is actually not that complex a system, just a set of callbacks, so the implementation itself is nothing outstanding, but knowing under which conditions each callback can be invoked and which problems can show up here or there is kind of magic... Third, the remote userspace pohmelfs server was not actually written by me; instead its bytecode was blown out under the inspiration of some substances, so it could very well be the reason for all the problems. But given that it is trivial, like pretty much all my userspace code, even a total rewrite would not fix the issue.

So, the latency problem in pohmelfs looked really undebuggable. But you know, a cup of excellent tea (from a tea bag) with lemon can fix any problem (or high temperature and substances, or a fair amount of alcohol, everyone has fun the way he likes), so it was first decided to implement a simple network kernel module which would connect to the remote userspace server and exchange messages in a fashion similar to what pohmelfs does.
Such a module was implemented, started, and showed excellent performance (about 1 thousand messages per second sent and received back in the test network, which is several orders of magnitude faster than pohmelfs). So, back to VFS to pray for inspiration.

Inspiration came today (thanks Arnaldo, likely it is because I'm getting healthier :).
I always thought that a number of back-to-back recv() calls is not a good idea no matter where, in kernel or userspace, since each one takes the socket lock, which in turn can introduce the latencies observed. So I eliminated the back-to-back recvs in the pohmelfs code (the testing module was written better and does its sending and receiving without such 'fragments'), which resulted in... nothing, the results did not change at all. So, a wrong step, but having several sending calls in a row is not a good idea either, so I replaced them with an allocation and a copy, so that there would be only a single kernel_sendmsg() call. As you might expect, performance... changed by 30 times. Just having a single send call instead of two, for as few as 500 invocations, forced the whole network exchange to behave completely differently.
So, to debug the problem further, I extended the testing module and introduced the ability to send and receive data not as a single packet but as two fragments: 4 bytes and the rest of the packet (60 bytes). Here is the result table for 1000 messages sent and received back by the testing module:

no fragments:				1.43 seconds
send fragments (4 and 60 bytes):	40.43 seconds
recv fragments (4 and 60 bytes):	1.43 seconds
both fragmentations:			40.43 seconds
It is an almost 30 times difference from just a simple application change!
tcpdump on the receiving side shows that sending fragments one after another results in a real message being sent every time kernel_sendmsg() is invoked, which results in an ack for each such message (both 4 and 60 bytes), which completely degrades the tcp window, and the connection just can not recover from such behaviour.
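
The two send patterns can be sketched in userspace. This is only a minimal illustration in Python, not the kernel module itself: the meaning of the 4-byte header (assumed here to be a payload length in network byte order) and all function names are made up for the example. The only point is that header and body are copied into one buffer and handed to the socket in a single call, like the single kernel_sendmsg() above.

```python
import socket
import struct

HEADER = struct.Struct("!I")  # assumed 4-byte header: payload length, network order


def send_fragmented(sock, payload):
    """Two send calls per message: the slow pattern from the table above."""
    sock.sendall(HEADER.pack(len(payload)))
    sock.sendall(payload)


def send_coalesced(sock, payload):
    """Copy header and payload into one buffer and send it with a single call."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_exact(sock, size):
    """Read exactly `size` bytes, looping over short reads."""
    chunks = []
    while size > 0:
        chunk = sock.recv(size)
        if not chunk:
            raise ConnectionError("peer closed the connection")
        chunks.append(chunk)
        size -= len(chunk)
    return b"".join(chunks)


def recv_message(sock):
    """Receive one length-prefixed message, however it was sent."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)
```

Both senders produce the same byte stream, so the receiver does not change. Whether the fragmented variant is actually slower depends on the TCP stack and socket options in use; the timings in the table come from the kernel testing module, not from this sketch.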

So, all those words were written just to show that even problems that look undebuggable at first glance can be easily solved, and that programming mistakes that look harmless (again, at first glance) can lead to very interesting results...

Now back to the drawing board to think about how to improve the pohmelfs protocol even more to get the last bits out of the wire.

Btw, the interested reader can get my network testing module and userspace server from their newly created homepage.

/devel/networking :: Link / Comments (4)


Thu, 28 Feb 2008

Yellow bus eats your brain.

Or flying bus model.



And cutter.



Wanna one? Get it here.

/other :: Link / Comments (0)


Wed, 27 Feb 2008

Meanwhile on the apartment development side: wine shelves.

Still trying to make something interesting at home, since there is no access to servers to fix things and implement new ideas...

Shelves will have a classical wine-cellar design, something like this:

|\  /\  /\  /|
| \/  \/  \/ |
| /\  /\  /\ |
|/  \/  \/  \|
|\  /\  /\  /|
| \/  \/  \/ |
| /\  /\  /\ |
|/__\/__\/__\|
I can not take a photo of the workplace where the wood plates are laid out, because of the crappiest ever repair service of Nikon, so enjoy this pseudo-graphics instead :)
Originally it was supposed to host not only bottles of wine (I do not drink wine though), beer and strong alcohol, but also some books. But I made a mistake in the design, which was made worse when the plates were sawn by other people (well, I gained huge experience in not letting others do a thing I can do better, even if I pay them for it), so the cells became smaller (each can now host only about 5-6 classical (0.5l) bottles of beer; each side of the square is about 17 cm) and have a cut side. Both these issues forced me to rethink the design a little bit, so now the shelves have reduced functionality, but look probably even more interesting than in the original design.
So far I have not finished polishing the wood and have not started staining and plastering it, but only constructed the whole structure on the floor, so it will take a while to finish, but the result is expected to be worth all the efforts.
Someone told me that there is something similar in Ikea. Well, there is always something similar in Ikea no matter what you created or only thought about, but that does not matter at all: only the process of creation matters, and that is the most interesting and important part.

/devel/flat :: Link / Comments (2)


Tue, 26 Feb 2008

Traffic jam simulator and some math analysis.

Trying to make at least something while fscking sick.

I've created a simple traffic simulator, which contains a variable number of cars and lights. Each car can be programmed with a different acceleration, maximum allowed speed, and stop and deceleration distances. Each light can be programmed to switch after a different interval. There are only two lights in real life, red and green (at least in Moscow, very unfortunately): no one ever cares about yellow, and lots of drivers deliberately accelerate when they see yellow... So, only two colors.

Since I did not bother to implement a nice config for each car and light, there is only a single set of parameters, but command line parameters allow one to vary the initial number of cars and the distance between them, the number of time frames before lights change state or a new car enters the road, and the number of lights and the distance between them.



There are two known problems with lights on the road. The first is bad drivers who do not maintain a big enough buffer, so they have to wait until the car in front of them moves far enough before they can start, which takes some time out of the limited time frame of the green light. If the buffer is large enough, drivers can start simultaneously and thus move much faster.
One can simulate this behaviour with a variable initial number of cars and different distances between them. If the distance is less than the stop distance (i.e. the distance at which a driver has to stop the car; it is 4 in the current setup), then the driver has to wait until the distance becomes greater than or equal to the stop distance. If the driver stopped farther away than the stop distance (let's say 5 'meters'), then the car can start simultaneously with the head one. The latter approach allows more cars to move through the light during a fixed time frame, but psychologically it feels better to stay as close as possible to the head car, which introduces a latency, since we have to wait until the head car moves far enough before we can start. This leads to a negative exponential speed increase for each car behind, instead of the linear speed increase there would be if drivers maintained the buffer. The corresponding equations are quite simple: the difference in the distance covered in a single time frame is proportional to the acceleration of the head car, which in turn is proportional to its coordinate, so we have a simple differential equation whose solution is a negative exponential. One can read a bit more here.
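
The start latency caused by a too-small buffer can be shown with a toy model. This is a rough sketch in Python and not the code of the traffic program: the stop distance of 4 comes from the text, while the acceleration, the maximum speed, the update rule and the function name are assumptions made for the example. It only reports the time frame at which each queued car first moves and does not try to reproduce the exponential behaviour described above.

```python
def start_frames(cars, gap, stop_distance=4, accel=1, max_speed=5, max_frames=1000):
    """Return the time frame at which each queued car first starts moving.

    Car 0 is the head car at the light; car i waits `i * gap` behind it.
    A car may accelerate only while the distance to the car ahead is at
    least `stop_distance`; otherwise it stays (or becomes) stopped.
    """
    pos = [-i * gap for i in range(cars)]
    speed = [0] * cars
    started = [None] * cars
    for frame in range(max_frames):
        old = list(pos)  # every driver reacts to positions from the previous frame
        for i in range(cars):
            free = i == 0 or old[i - 1] - old[i] >= stop_distance
            speed[i] = min(speed[i] + accel, max_speed) if free else 0
            pos[i] += speed[i]
            if speed[i] > 0 and started[i] is None:
                started[i] = frame
        if all(s is not None for s in started):
            break
    return started
```

With these assumed parameters, start_frames(5, gap=5) gives [0, 0, 0, 0, 0] (everyone starts together), while start_frames(5, gap=2) gives [0, 2, 4, 6, 8]: every car behind loses two more frames of the green light.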



The second problem is the light interval. If the interval is too short, then cars can not get going and only a couple of them move forward; if it is too long, then a large backlog of cars can accumulate during the red light, and it will not be cleared during the green light because of the above problem: each car has to wait until the one ahead moves some distance away. The latter is actually worse, since the backlog can become so huge that it will never be cleared, which leads to a complete stall of the traffic flow (at the back; the front will move, but the number of cars arriving at the tail will be bigger than the number of cars leaving the traffic jam).



One can play with the program, called traffic. It requires the gtk2 devel package to be installed. The homepage contains essentially the same text and a link to the source code. It also shows a usage example.

Enjoy!

/devel/other :: Link / Comments (7)


Thu, 21 Feb 2008

CacheFS and NFS local caching.

David Howells of Red Hat recently posted the next round of his CacheFS implementation. The main idea of the project is to store data and metadata modifications locally on disk.

The cache is implemented as a write-through one. Locally, data is stored as usual files on a special partition formatted with one filesystem or another.

David also posted benchmarks of his approach. Metadata-intensive operations showed a significant slowdown with the local on-disk cache, and getting metadata from the local cache also shows a slowdown. The former can be explained by the write-through nature of the cache and slow local disk operations, which is also the reason for the slower metadata reads.
There is also no cache-coherency algorithm implemented for CacheFS. Another problem, also pointed out by Kevin Coffman, is that reading data from the cache can be slower than from a local filesystem (and than from the remote one if bandwidth is not a limiting factor, which is frequently the case).

This is the third (actually the first :) local cache implementation for a network filesystem, so the competition between CRFS, POHMELFS and CACHEFS becomes even more interesting :)
Stay tuned!

/devel/fs :: Link / Comments (0)


Wed, 20 Feb 2008

Latency problems in pohmelfs.

Trying to make at least something...

As was mentioned, the full inode resync logic is very slow. The latency is likely introduced somewhere at the protocol layer used by pohmelfs. To test this scenario and find the best possible solution, I implemented a trivial network module and userspace server, which talk to each other via a protocol very similar to the one used in lookup/create operations in pohmelfs. The server and client also maintain trees of the objects they sent/received, so that the model is as similar as possible to pohmelfs usage patterns.

It's time to test things and find out where the problem lies, but as usual there are problems. You are sick, everything is aching, but you want to beat the crap, to move a bit further, to make something interesting, so you start implementing the tiny bits, you start thinking, you finally make the things, so you become happy and proud, and that is just to find out that all the testing machines you previously had access to are turned off, and the new ones are behind a firewall, and there is no access to the network from the ass of the world. This is called 'shit happens'.

/devel/fs :: Link / Comments (0)


Tue, 19 Feb 2008

Fedora sucks. It is not even remotely designed for smaller than high-end systems.

At least the yum developers do not know that there are systems with less than 1 Gb of RAM. And it is not even about how slow yum is. Nor about the fact that to install a 30 kb application yum will download a 3.5 Mb sqlite database file.
It is about yum programming bugs:

error: Couldn't fork %post: Cannot allocate memory
  Updating  : libebml                      ################### [  68/1218] 
error: Couldn't fork %post: Cannot allocate memory
  Updating  : xorg-x11-server-utils        ################### [  69/1218] 
  Updating  : fribidi                      ################### [  70/1218] 
error: Couldn't fork %post: Cannot allocate memory
  Updating  : lame-libs                    ################### [  71/1218] 
error: Couldn't fork %post: Cannot allocate memory
error: Couldn't fork %pre: Cannot allocate memory
error:   install: %pre scriptlet failed (2), skipping tk-8.4.17-2.fc8
  Updating  : libdvdnav                    ################### [  73/1218] 
error: Couldn't fork %post: Cannot allocate memory
*** glibc detected *** /usr/bin/python: corrupted double-linked list: 0x15cdde58 ***
As you might expect, all 70+ packages above also got the 'Cannot allocate memory' error. My laptop has 256 Mb of RAM and 512 Mb of swap, and more than half was free.
When I started the same process again, after some applications were killed to free memory, yum refused to install packages because of broken dependencies...

For example for xorg-x11-server-utils I have:
xorg-x11-server-utils-7.3-2.fc8
xorg-x11-server-utils-7.2-1.fc7
xorg-x11-server-utils-7.3-1.fc8
But libebml has one FC6 version. For the record, FC6 was never installed on this laptop:
libebml-0.7.7-2.fc6
libebml-0.7.7-3.fc8
Fedora also forced an FC9 blocker bug into the needinfo state without a single patch/version to test (at least I did not receive any such mail). The bug was opened with a perfect description, with a probable buffer overflow somewhere in the image processing/rendering code, with a 100% reproducible example and an image to test with, and this happened even after another person reported the same problem on rawhide (and marked it as an fc9 blocker).
How in the hell do you expect to get some info after two months of silence from developers (one month after the bug was confirmed in rawhide)? Some people still believe in miracles...

I would like to test it right now, but I can not because of yum problems... Old packages, as you might expect, still have that bug somewhere.

World is far from being perfect :)

I will not turn it off or suspend it; I do not believe it will work after that. Instead I will wait until I am capable of getting a new DVD with some other distro. And it will not be Debian either.

/devel/other :: Link / Comments (3)


A doctor's visit.

For the first time in my life I called a doctor.
The doctor happened to be a nice-looking woman less than 30 years old; we talked nicely about my sickness and how (un)successful the cure was. Since it is the first time I have used any drugs (except aspirin) to cure the sickness (I believe it is a flu), I managed to miss the point when taking the drug should have been started, so it was not very useful. After a quick look at me she found so many possible crappy things I could have that I began to get scared. I always knew that the less one knows the better one sleeps, but after a set of questions and 'no, I do not' answers, the sky became a bit less cloudy...
She gave me a list of drugs to get and described what the heck it is (flu, which caused some complications: something like bronchitis), so now I feel a bit better, but will be ill (according to her prognosis) at least all this week or maybe a bit more.
Modulo temperature I feel not that bad, but when it starts rising after the drug (aspirin) stops working, I start feeling like shit, deja vu.

I actually tried two anti-temperature drugs: aspirin and paracetamol. The former knocks the temperature down in about 30 minutes, but frequently this results in excessive sweating, while the latter acts only after an hour or so, but without any bad effects.

So, as you might notice, there are no updates about tons of my projects because of this fucking sickness, but I expect to be able to kick its ass soon so things will be in good shape again.
Such (crappily forced) 'vacations' make the brain sleep almost all the time, so when this ends, I expect even higher rates and more interesting stuff to happen.

Stay tuned!

/life :: Link / Comments (1)


Sat, 16 Feb 2008

How to measure a temperature of the body under very limited conditions.

Let's suppose one does not have a thermometer, but there are lots of instruments and equipment around, from screwdrivers to drills, from a simple ammeter/voltmeter to a laptop. As a hint: there is also electricity, water and an automatic teakettle.
The task is to measure the temperature of one's own body and decide whether or not to take an aspirin. Or to get some fun out of the process, because sickness is quite boring.

Solution is pretty geeky, but first try to think about it yourself.


So, the solution.
It is based on the fact that when a human body, or part of it, is placed into an environment with essentially the same temperature but a much bigger thermal capacity, it does not feel it. Try taking a shower at about 36 degrees Centigrade, and you will feel neither cold nor hot. Things are different when the air outside is above 30 degrees Centigrade; that is because of the very different thermal capacities of water (huge) and air (very small).

So, back to the task. To determine your temperature you have to put a precise volume of water into the teakettle (let's say 1 liter; I could measure it because I have water meters), connect the teakettle to the electricity via an ammeter, and measure the voltage with a voltmeter.
Then you have to put your arm into the teakettle, turn it on (beware of the heating element) and note the first time. When your hand feels very comfortable (this is the main source of error), note the second time. Then remove your arm, wait until the water boils, and write down the third time.

Now it's time for school physics: the power of the teakettle, which is equal to the product of current and voltage, multiplied by the time difference, is equal to the mass of the water multiplied by its specific heat capacity and by the temperature difference accumulated during that time frame.

So, here are practical results:
current strength I = 3.7 A
voltage V = 231 V
mass of the water m = 1 kg
thermal capacity of the water c = 4200 J/(kg*degree)
time difference for complete boiling (from unknown temperature to 100 degrees Centigrade) dt0 = 420 seconds
temperature difference dT can be found from the following equation:
I*V*dt0 = m*c*dT
So, we have dT = 100 (boiling temperature) - T0 (initial temperature of the water) = I*V*dt0/(m*c), which is equal to about 85 degrees Centigrade, so the initial temperature of the water was about 15 degrees Centigrade.

The time difference between the start of the process and the comfortable temperature was about 30 seconds, so plugging this time frame into the above equation we find that the temperature changed by about 6 degrees.

Since we already found that the initial temperature was about 15 degrees Centigrade, the calculated temperature of my body is about 21 degrees Centigrade.
It's time to go back to the grave...
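
For those who want to redo the arithmetic, here is a small Python sketch of the same calculation. It uses only the numbers listed above and the equation I*V*dt = m*c*dT; the function and variable names are made up for the example, and heat losses and the heat capacity of the kettle (and of the arm) are ignored, as in the text.

```python
def temperature_rise(current_a, voltage_v, seconds, mass_kg=1.0, heat_capacity=4200.0):
    """Temperature rise of the water from I*V*dt = m*c*dT, ignoring heat losses."""
    return current_a * voltage_v * seconds / (mass_kg * heat_capacity)


I, V = 3.7, 231.0                          # measured current (A) and voltage (V)
to_boiling = temperature_rise(I, V, 420)   # whole run, up to 100 degrees
to_comfort = temperature_rise(I, V, 30)    # start until the hand felt comfortable

initial = 100.0 - to_boiling               # initial temperature of the water
body = initial + to_comfort                # temperature at the 'comfortable' moment
print(f"dT to boiling: {to_boiling:.1f}, initial: {initial:.1f}, body: {body:.1f}")
```

Without intermediate rounding this gives roughly 85.5, 14.5 and 20.6 degrees, which matches the rounded 85, 15 and 21 above.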

P.S. Yes, I'm a former loser-physicist, that's why I became a kernel hacker, this can explain a lot...

/devel/other :: Link / Comments (6)


Fri, 15 Feb 2008

Richard Stallman in Russia and related problems.

Here are a number of interesting moments in this miserable melodrama (got via Linux Today News).

Stallman wanted to visit Russia this March, and parliament member Viktor Alksnis promoted his visit and wanted to help with 'administration issues'. Then LOR (www.linux.org.ru, one of the most popular linux sites in Russia) moderator Sergey Udaltsov (who lives in Ireland) sent a letter to Richard, where he pointed out that Alksnis is not a good man; he also mentioned Alksnis' "fight against the independence of the Baltic countries" in the late 80s.
Stallman then said that he does not want Alksnis to organize his visit.

Well, my couple of points about this stupid situation.
First, Alksnis is really not a very smart person in IT, and it looks very much like he is a usual careerist, since I do not know about his work at all except the stupid idea of creating a 'national OS', probably using 'nanotechnologies' (it is a modern trend here :).
Second, Sergey Udaltsov's step is quite braindamaged: while it is ok to describe who Alksnis is, pointing to Baltic independence is even more stupid than the 'national OS' idea.

So, my simple point is that both Viktor Alksnis and Sergey Udaltsov just wanted to make some self-advertisement profit from Richard Stallman's visit and are not really doing it because of open source. Although self-advertisement is not a bad idea (you are reading this blog :), such moves are stupid.
If Stallman got problems with the visa (I would be surprised if it takes more than two weeks), then it means he does not really want to visit Russia. For example, I got my kernel summit invitation more than two months ahead of the meeting, which was enough to get a visa (it required two (!) visits to the UK visa office just to hand in and get back the documents, it took two (!) days to check the documents, and it was possible to order a courier, although the previous year the time frame was shorter). If he does not want to, why should we care?

If someone wants to visit a country, he can find a way to do that himself and not wash his own brain with stupid rhetoric.
I think it will be an excuse for Richard not to visit, since it really looks like he does not want to do it :)

/other :: Link / Comments (7)


Thu, 14 Feb 2008

Meanwhile on the apartment development side.

Well, when you are sick it is generally not a very good idea to work on something, but this state can produce very interesting ideas. So, I decided to change my table: I 'disconnected' the leg (there is only one, the other side is attached to the wall near the window), removed all the varnish layers using a plane, polished it a bit with a grinding machine and painted both sides of the table in a new dark chocolate colour. It looks pretty good, but since the hardboard used on both sides is made out of wood fibers, it is not 100% smooth; a second layer of paint is mandatory, and it is possible that even that will not be enough. I want, if not a completely smooth surface, then at least one where a hand moved over it would feel nothing. Managed to paint my hands and feet, but at least my hair was not touched (although I'm not sure).
Also installed a number of electricity sockets, so there is no need to place the kettle far from the 'bed' (i.e. part of the floor).

Overall it was a good development day, and I expect the next one to be as productive, if not more. Nevertheless I hate being sick...

/devel/flat :: Link / Comments (0)


Wed, 13 Feb 2008

POHMELFS got full inode number resync logic.

Now it updates all upper inodes in the tree when doing writeback for some inodes. Here is a result:

/mnt/tmp$ mkdir -p 1/2/3/4
/mnt/tmp$ echo qweqweqwe > 1/2/3/4/file
/mnt/tmp$ ls -liR ./
./:
3332986296 drwxr-xr-x 3 zbr users 0 2008-02-13 12:07 1

./1:
3332988600 drwxr-xr-x 3 zbr users 0 2008-02-13 12:07 2

./1/2:
3306456568 drwxr-xr-x 3 zbr users 0 2008-02-13 12:07 3

./1/2/3:
3332985144 drwxr-xr-x 2 zbr users 0 2008-02-13 12:07 4

./1/2/3/4:
3306458488 -rw-r--r-- 0 zbr users 10 2008-02-13 12:07 file
/mnt/tmp$ sync
/mnt/tmp$ 
/mnt/tmp$ ls -liR ./
./:
557065 drwxr-xr-x 3 zbr users 0 2008-02-13 12:07 1

./1:
557066 drwxr-xr-x 3 zbr users 0 2008-02-13 12:07 2

./1/2:
557069 drwxr-xr-x 3 zbr users 0 2008-02-13 12:07 3

./1/2/3:
557070 drwxr-xr-x 2 zbr users 0 2008-02-13 12:07 4

./1/2/3/4:
557071 -rw-r--r-- 0 zbr users 10 2008-02-13 12:07 file
It also works with much bigger trees (like untarring the linux kernel tree, although the ugliness of the userspace server requires raising the maximum number of open file descriptors).

There is a single problem in this case: it is damn slow. And I do not see an easy explanation for that. Well, tcpdump shows a small window, but I think that is a consequence, not the cause, and the cause is likely in the protocol pohmelfs uses: the system sends a number of short packets in round-robin fashion, which may be slow for some reason. Since I'm waiting for real hardware to test things on (since oprofile does not work on the installed Xen version), I can only handwave about the root of the problem...
And that is exactly the same problem the write-through cache in pohmelfs had at first, I think even the timings are similar, so after this problem is fixed, a new version will be released.

There is another problem which complicates the development: I got a cold (the second one this year, and only the third one in the last 3 or 4 years though), but such a condition with some temperature, when the brain is in the 'hinged' state between sick and good shape, opens up very fun feelings about the things around, which usually ends up with very interesting results.

/devel/fs :: Link / Comments (0)


Tue, 12 Feb 2008

POHMELFS got inode number resync logic.

It happens when the inode in question is under writeback. The protocol implements quite simple ping-pong message passing, so the result looks like this:

/mnt/tmp$ echo qweqweqwe > qwe
/mnt/tmp$ ls -lai ./
total 8
    557057 drwxrwxrwt  2 root root  4096 2008-02-12 19:58 .
         2 drwxr-xr-x 22 root root  4096 2008-02-12 19:58 ..
3322992632 -rw-r--r--  0 zbr  users   10 2008-02-12 20:32 qwe
/mnt/tmp$ sync
/mnt/tmp$ ls -lai ./
total 8
   557057 drwxrwxrwt  2 root root  4096 2008-02-12 19:58 .
        2 drwxr-xr-x 22 root root  4096 2008-02-12 19:58 ..
   557065 -rw-r--r--  0 zbr  users   10 2008-02-12 20:32 qwe
But overall it does not work, since writeback can happen for any inode inside the whole not-yet-synced tree, so trying to sync the inode number for some obscure object which sits in a directory the server has never seen before is quite problematic: the whole tree has to be traversed from the inode under writeback up to the one which is known to the server.
Although this is not a very complex task, there is a question of what to sync. Should the whole directory content be synced, or just a single inode? If the former, then should we force writeback for the other objects in the directory under resync?.. I think the simplest case is to force only the creation of higher-layer objects, without syncing their content (like other objects in the directory), but the directory itself should be marked as dirty, so that access from different clients forces the appropriate resynchronization.
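
The 'simplest case' above can be sketched as follows. This is a hypothetical userspace model in Python, not pohmelfs code: the Inode class, the function names and the create_on_server callback are all assumptions made for the illustration, and the root of the tree is assumed to be already known to the server.

```python
class Inode:
    """Hypothetical in-memory inode: no real number until the server assigns one."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.ino = None      # real inode number, unknown until synced
        self.dirty = False


def unsynced_chain(inode):
    """Walk up from `inode` to the first ancestor the server knows; return top-down order."""
    chain = []
    while inode is not None and inode.ino is None:
        chain.append(inode)
        inode = inode.parent
    chain.reverse()
    return chain


def resync(inode, create_on_server):
    """Create every missing ancestor on the server, then the inode itself.

    `create_on_server(parent_ino, name)` stands in for the create/reply
    exchange and must return the real inode number.
    """
    for node in unsynced_chain(inode):
        node.ino = create_on_server(node.parent.ino, node.name)
        if node is not inode:
            # Only the directory object is created; its content is not synced,
            # so it is marked dirty to force a later resynchronization.
            node.dirty = True
```

Creating objects top-down means every create request refers to a parent whose real inode number is already known.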

/devel/fs :: Link / Comments (0)


Mon, 11 Feb 2008

Climbing evening.

That was a bloody excellent training session, but since it was only the third one this year, I can hardly find a part of my body which is not aching right now. Maybe the ears though...
I started climbing high today without the usual warm-up traverses, so the knees were damaged first (not counting a couple of hits against the wall and holds). Then came a fair number of attempts to finish a new (and quite complex) route over the blue holds in the right central section, which goes over the new balcony in that area. Eventually I found the solution (although I think there is another one, which requires a bit more power but a bit less technique), but damaged my arms, rubbed all 20 fingers raw, and made my arms very tired. I also tried some interesting old routes, but since it was only my second time on them, I failed. That was ok, my legs and back were not aching at that point, but then I found a route which made the day! An excellent, complex and quite short route in the right vertical corner of the climbing area. It requires very interesting technique, almost without heavy power moves, but with good balance and stretching.
I spent more than half an hour there, damaged my shoulders and fingers, stretched everything which was not broken before, but completed the route (although only piece by piece; I will finish it cleanly next time), and finally moved to the shower (even the sauna did not work today) as a number of separate pieces of crap connected to each other essentially by virtue of the mind.
Excellent day, excellent time!

/life :: Link / Comments (0)


Initial implementation of the offline and cache coherency algorithms.

It is rather dumb and does not even have state machine handling in the usual meaning.
The existing pohmelfs implementation has only two places where the content of an inode is 'globally' modified; by 'globally' I mean changes which have to be seen by other clients if they access the given inode.
The first one is directory reading, when the inode in question gets information about the other inodes inside it; the other one is object creation. Object removal is a local operation, and there are no collisions if multiple clients delete the same object simultaneously.

When a directory is read for the first time, pohmelfs just syncs its content from the server; all subsequent reads happen from the cache, since all creations and removals happen locally. This case is simple.
When pohmelfs is about to create an object, it marks the parent inode as dirty; if the parent inode was not marked dirty previously, this ends up sending a single message to the server. The server in turn can return the content of the directory in question, if that inode was already modified by a different client. If there are objects with the same names as local ones, the local objects are 'renamed' to 'oldname-synctime', so that the user can later run diff or whatever and merge the changes. That is how offline pohmelfs clients work.
An object is always created in the local cache only, with a local inode number. So far it is never sent to the server (although the code which does that and changes the inode content exists), and even writeback does not work right now (since the server does not know about objects with local inode numbers). This part is a bit more complex: pohmelfs has to sync the inode (i.e. send the current inode info, wait until the server creates the object, then receive the real inode info and change the local cache) either in writeback (when the system forces writeback of a page or pages, the appropriate inode will be synced first) or in the cache coherency algo. For that purpose each network state locking first checks if there are messages from the server in the queue which have to be processed first. So far only receiving content from the server is supported; forcing the client to send its own content on request from the server is the basis of cache coherency, and this is not yet turned on. Here lives the major race, which can actually lead to a full resync of the idea. After we have locked our own network state and checked that there are no requests from the server, the client can start sending its own commands, but before they reach the server, the server can start a CC resync (and send messages into the same pipe as the client's commands) initiated by a different client, which will break the protocol state machine. This is the main thing to think about. Oh, and to implement the same logic on the server :)
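
The name collision handling described above (local objects renamed to 'oldname-synctime') can be sketched like this. It is a toy Python model over plain name lists, not the pohmelfs data structures; the function name and the format of the sync time suffix (here just a number) are assumptions made for the example.

```python
def merge_directory(local_names, server_names, sync_time):
    """Merge a server directory listing into the local cache.

    Local objects whose names collide with server ones are kept under
    'oldname-synctime'. Returns (merged set of names, {old: new} renames).
    """
    server = set(server_names)
    merged = set(server)
    renames = {}
    for name in local_names:
        if name in server:
            # Collision: the server copy keeps the name, the local copy is
            # renamed so that the user can diff and merge the two later.
            new_name = f"{name}-{sync_time}"
            renames[name] = new_name
            merged.add(new_name)
        else:
            merged.add(name)
    return merged, renames
```

For example, merging local names ["a", "b"] with server names ["b", "c"] at sync time 1202731200 keeps "a", "b" and "c" and adds "b-1202731200" for the local copy of "b".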

/devel/fs :: Link / Comments (0)


kernelplanet.org

Someone good placed my rss feed to kernelplanet.org, which is a kernel hackers place of shame glory :)
Well, whoever did that probably saw that I frequently write quite a lot of notes per day, that I follow no political/hacker/whatever ethic in the blog, that I make too many English errors (especially when I have no access to a dictionary) and so on. I hope it is not that bad.

So, a couple of words about what this is. This blog is fully devoted to how I spend my days: hacking, having a rest, sleeping, going to the toilet...
The blog has comments (with a not very user-friendly captcha), and their number can be found at the end of each entry. When a new comment is added, the entry is updated, so stream-based aggregators will see it as a new one. Usually there are 1-3 entries per day, sometimes more, sometimes none.

That's it. Stay tuned.

/devel/other :: Link / Comments (2)


Sun, 10 Feb 2008

Completed electricity socket setup.

Also replaced the electricity switch with a new one and installed a warm floor thermal system in the bathroom, but the latter is not fully completed yet (the wires are connected without good insulation so far, and the thermal controller is not placed into its own socket in the wall, but hangs on its wires). It works though (I do not know how well, and I will turn the electricity off when I leave home; I will check its thermal capabilities when I'm able to stop the fire...)

/devel/flat :: Link / Comments (0)


Sat, 09 Feb 2008

Meanwhile on the apartment development side.

I started electricity projects in the bathroom, which required setting up electricity sockets for the light switch and the warm floor. During installation I managed to get dirty with assembly foam, which, if you do not know, is much, much harder to clean off than any other material we have. And I managed to get it not on my hands (that's usual), but in my hair (it was the first time I worked without a hat).
This forced me to wash my head with acetone... It was not that bad actually, except maybe for the smell, so eventually I cleaned almost all dirty areas (about one quarter I believe; since I have no mirror I can not say for sure) and washed it a couple of times the usual way, but that stopped me from further development for today.
Maybe I will finish it tomorrow.

/devel/flat :: Link / Comments (0)


Thu, 07 Feb 2008

POHMELFS and CRFS in the news.

At LWN.net. And as usual I do not have an account this time...
So I will wait a week for the free article; by that time pohmelfs will contain very tasty things which do not exist in any other fs out there (or at least not in a single filesystem).

Edited to add that Simon Holm Thøgersen shared a link to the article. It is somewhat fun, although the author (Jake Edge) writes quite differently from Jonathan Corbet imho. The article does not compare pohmelfs and crfs, but shows that they are very similar. I found out that Zach Brown has been working on CRFS for about a year, while pohmelfs has existed for less than a month. Someone shared secret knowledge about the meaning of the pohmelfs abbreviation in Russian; well, maybe he/she is right, who knows...
The article does not cover features scheduled for pohmelfs, like offline working and inode resync logic.
Commenters try to compare crfs and pohmelfs with afs and pnfs. Neither of those has metadata caching mechanisms, so they are fundamentally different; pnfs in addition allows closed extensions to be implemented, which will lead to vendor lock-in.

One remark about the writer, Jake Edge: he does not use first names in his articles, only last names.

/devel/fs :: Link / Comments (2)


Filesystem freezer. Removable device.

There is a long discussion on linux-fsdevel about various filesystem freezing implementations and the features they should have.
The main goal of this project is to freeze any filesystem, so that all write requests are blocked. This makes consistent backups possible. This task belongs to the block layer though, and this patchset actually implements it by suspending the underlying block device. Although the interface (an ioctl) is a bit ugly, it will likely be accepted, since other filesystems (namely XFS) have such a feature via their own private ioctls. People say that it does not always work though.
LVM supports consistent backups natively, but having such an interesting feature without the need to work on top of device mapper would be a great deal!

This highlighted a very interesting project I have in mind (actually it will be another reinvention of the wheel though) about various removable devices. Actually it is not only about removable devices, but about any device which can suddenly disappear or get stuck (like a network filesystem, a broken cable to a local disk or a bad drive).
The old idea is to remount access to such a device as readonly, with an error returned on any attempt to access it. There is a frevoke() syscall which does that for a given file descriptor: it is marked as erroneous so that access to it returns an error, but this does not fix the problem with a network filesystem, for example. Let's suppose we have an NFS client which is stuck because the server was disconnected; there are cases when it will never resume or return an error. Or a bad block/bad drive access, which will be retried again and again forever...
Revoking a particular file descriptor is a simple task, but what if we have a web server which accesses the broken drive for each new client, or a similar scenario? While we revoke one file descriptor, the server will create another two, stuck in the middle of the operation.
The very good solution I have in mind is to break all existing access paths (the block layer has access to all bios) and either replace the underlying device with a fake one, so that all requests are completed with an error (think of it as hotplug/unplug of a storage device), or replace the filesystem (inode and file) operations so that they return an error (that is like hotplug/unplug of the filesystem). In the latter case it would even be possible to change the filesystem on the fly! First, plug in a filesystem which just queues requests instead of processing data, then unplug the real filesystem, plug in the new one and unplug the fake one.
Not sure it is very useful functionality, but it is very interesting...
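
A toy userspace sketch of the second variant (replacing the operations so that they return an error). This is not kernel code and not an existing API: the names Mount, RealOps, DeadOps and revoke() are all made up for illustration. The point is only that once the operations table is swapped, every new access fails in one place, no matter how many new descriptors the application opens.

```python
import errno


class RealOps:
    """Hypothetical operations of a working filesystem."""

    def read(self, path):
        return f"data from {path}"


class DeadOps:
    """Hypothetical replacement operations: every call fails with EIO."""

    def read(self, path):
        raise OSError(errno.EIO, "device is gone", path)


class Mount:
    """All access goes through self.ops, so swapping it affects every caller."""

    def __init__(self, ops):
        self.ops = ops

    def read(self, path):
        return self.ops.read(path)

    def revoke(self):
        # "Unplug" the filesystem: later requests complete with an error
        # instead of getting stuck on the broken device.
        self.ops = DeadOps()


mount = Mount(RealOps())
before = mount.read("/index.html")
mount.revoke()
try:
    mount.read("/index.html")
    after = None
except OSError as err:
    after = err.errno
```

Here `before` holds the data read while the mount was alive, and `after` holds the error number returned once the operations were replaced.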

/devel/fs :: Link / Comments (0)


Btrfs 0.12 has been released.

Chris Mason changed the on-disk format again, which leads to a very noticeable (30 times!) speed improvement for random write access (from 1 MB/s to 30 MB/s).
This release also contains a mount option and some tweaks for SSDs (solid-state disks), mainly write clustering that does not take into account which directory the files being written belong to. Simple ENOSPC handling was also added; it is still possible to crash the machine when there is no space left on the device, but now it is a bit harder.

Next step for btrfs is to support multiple devices for single filesystem via subvolumes.

Release notes.

/devel/fs :: Link / Comments (0)


Memory notification events.

Jake Edge posted an article at LWN.net about various memory pressure notifications which userspace applications may be interested in.
For example, an application can wait for swap in/out notifications or for an oom condition long before it is killed by the oom-killer, so that it can free some unused RAM (like firefox could free some of its recently viewed pages cache).

Notifications are transferred to userspace via /dev/mem_notify file, which is readable and pollable. Alternative way is to use SIGIO signal to the process when the device becomes readable.
Patch likely will be accepted soon.
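
A minimal sketch of how an application could wait on such a pollable file. The helper name wait_for_event is my own, and /dev/mem_notify only exists on kernels carrying that patch, so the runnable part below demonstrates the same poll() logic on an ordinary pipe instead (select.poll is available on Linux and other Unix systems, not on Windows).

```python
import os
import select


def wait_for_event(fd, timeout_ms):
    """Return True if fd becomes readable within timeout_ms milliseconds."""
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    events = poller.poll(timeout_ms)
    return any(mask & select.POLLIN for _, mask in events)


# Stand-in for the notification device: a pipe we can make readable ourselves.
read_fd, write_fd = os.pipe()
quiet = wait_for_event(read_fd, 10)      # nothing written yet
os.write(write_fd, b"x")                 # simulate a notification
notified = wait_for_event(read_fd, 10)
os.close(read_fd)
os.close(write_fd)
```

With the patch applied, the assumed usage would be to open "/dev/mem_notify" with os.open() and call wait_for_event() on that descriptor in a loop, freeing caches each time it returns True.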

This is another example of the real need for a unified event management subsystem in the Linux kernel.

/devel/kevent :: Link / Comments (0)


Got a grinding wheel machine from friends.

Will order some equipment tomorrow and continue table and shelves development.
Since my kitchen is not completed, it will be cleaned and temporarily transformed into wood workshop. I also have to complete some electricity projects (in kitchen, bathroom and hall).

/devel/flat :: Link / Comments (0)


Wed, 06 Feb 2008

Climbing evening.

That was my second training session this year, not exactly huge progress as you can see, so it was hard. There is a fair number of new routes, most of them quite simple, so I decided to run several in one go without rest in between. Probably that was not such a good decision, since after completing 7 or so of them in two starts I was very tired and could not climb the more complex routes well. So those are postponed until the next training.
I bought myself new climbing shoes, which are a bit large: usually I wear climbing shoes 3 sizes smaller, this time the difference is only two sizes, but they are my favourite shoes, so I expect very good climbing.

Anyway, I am very noticeably tired and that's great!

/life :: Link / Comments (0)


Continuing POHMELFS client side caching design (offline working capabilities).

As I wrote previously, the accepted design of the local cache not only fixes the problem cases with inode generation numbers, but also provides a very interesting feature: offline working.

Let's suppose a client went offline or just has not yet synced its cache with the server. It can work without any problem, and later, when it connects back to the server, the system will resync its data with the server's. For all files which differ between client and server, the client will keep its own version, but under a different name (like orig_name-$date_of_sync), so that the user can run diff or anything else and merge the changes properly.
The number of use cases for this excellent (imho) functionality is extremely large...
There is a problem though: the client's memory is limited, and eventually writeback will start pushing data to the server, so for such cases the client has to be able to cache not only in memory, but on disk too. That is a future extension though.
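
The renaming rule can be sketched roughly like this. It is a toy model over plain dictionaries (name -> content), not pohmelfs code; the function name resync and the exact timestamp format of the suffix are my assumptions, the text above only says the name looks like orig_name-$date_of_sync.

```python
from datetime import datetime


def resync(client_files, server_files, sync_time):
    """Merge server state into the client view, keeping both versions on conflict."""
    suffix = sync_time.strftime("%Y%m%d-%H%M%S")  # assumed suffix format
    merged = dict(server_files)  # the server version keeps the original name
    for name, data in client_files.items():
        if name not in server_files:
            merged[name] = data  # exists only on the client: keep as is
        elif server_files[name] != data:
            # Differs from the server: keep the local copy under a new name
            # so the user can diff and merge by hand later.
            merged[f"{name}-{suffix}"] = data
    return merged


client = {"notes.txt": "local edit", "same.txt": "same", "new.txt": "new"}
server = {"notes.txt": "remote edit", "same.txt": "same"}
result = resync(client, server, datetime(2008, 2, 6, 12, 0, 0))
```

After this, result["notes.txt"] is the server version and the local edit lives under "notes.txt-20080206-120000".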

An anonymous reader dropped me a note that such behaviour of locally cached files, where the inode number changes after resync with the server, will be frowned upon by some RSBAC systems.
I believe that an inode-only approach is broken because of serious problems with filesystems where a file can be changed by different clients. It is possible to remove a file and then create a new one which gets the same inode number as the one just removed, so without knowing the name of the file the system will be screwed. And how does such a system work with hardlinks, which have the same inode number as the target object, but different names?
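
The hardlink part is easy to check from userspace. A small sketch (the function name hardlink_demo is made up, and it assumes the temporary directory lives on a filesystem that supports hardlinks): two different names end up with the same inode number, so the inode number alone can not tell which name was accessed.

```python
import os
import tempfile


def hardlink_demo():
    """Create a file plus a hardlink to it and return both inode numbers and the link count."""
    with tempfile.TemporaryDirectory() as directory:
        target = os.path.join(directory, "target")
        link = os.path.join(directory, "link")
        with open(target, "w") as handle:
            handle.write("data")
        os.link(target, link)  # second name for the same inode
        target_stat = os.stat(target)
        link_stat = os.stat(link)
        return target_stat.st_ino, link_stat.st_ino, target_stat.st_nlink


target_ino, link_ino, link_count = hardlink_demo()
```

Inode number reuse after unlink depends on the filesystem, so it is deliberately not demonstrated here.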

/devel/fs :: Link / Comments (0)


Tue, 05 Feb 2008

Selecting computer language for the new project.

assert youKnowWhatYouReallyWant == true;
if (iAmWritingForPersonalUseOnly()) {
    if (iWantAReallyNewParadigm()) { // actually you'll get some irreversible brain damage.
        try {
            return "Haskell"; // dude, I really mean the DAMAGE!
        } catch(ECriticalBrainFailure e) {
            if (preferDotNetWorld()){
                return "F#"; // it's the same as Gb, ain't it?
            } else if (processorCount() >= OH_SO_MANY) {
                return "Erlang"; // start thinking in 1000 threads
            } else if (preferPunctuation() == STRONGLY){
                try {
                    return "J"; // APL needed a transliteration -- and got it
                } catch (EBrainOverload e) {
                    return "K"; // better have a bank hire you soon!
                }
            } else {
                throw new RethinkParadigmException();
                // you should have selected Haskell before
            }
        }
    } else {
        if (isDynamicTypingOk()) { // hey, everyone wanna be a cool geek today.
            if (cannotLiveWithoutCurlyBraces()) { // well, who can ?!
                return "Ruby"; // it's Python done better.
            } else if (enjoyIndentation()){
                return "Python"; // it's Ruby done right.
            } else if (schizophrenia->isOK()){
                return "Perl"; // all the expressiveness and imprecision of a human language.
            } else if (sourceCodeConceptIsObsolete()){
                return "Smalltalk"; // ever modified the value of True -- on a live system?
            } else {
                throw new LameException("PHP5"); // stick with this, los^W poor dude
            }

        } else { // static typing obviously
            if (isManagedOk()) { // let the PC do some of the job for me, they are so smart nowadays.
                                 // Sick of doing everything myself.
                if (preferJavaWorld()) { // die, MS, die!!!
                    return "Scala"; // huge, really huge. Must be inspired by Noah's Ark.
                } else if (preferDotNetWorld()) { // stuck on Windows, ha?
                    return "Nemerle"; // kazalos' by... oh, not again...
                } else {
                    throw new IsThereReallyAnythingElseException();
                }
	    
	    // computers will eliminate the humankind if they get enough control.
            } else if (unmanagedOnly()) {
                return "D"; // get a whole new language with every new release. Great fun.
            } else {
                throw new YouWantSomethingStrangeHereException();
            }
        }
    }
} else {
    return "Do Whatever Your Boss Says To And Keep Your Mouth Shut Programming Language";
}
I only know C a bit, some time ago I tried Java, and I knew what C++ was... I think I'm living outside of this new and shiny world of programming, and that's cool.

/devel/other :: Link / Comments (6)


POHMELFS inode generation and cache coherency.

I think I've just designed the way to fix the problem with overlapping inodes on different clients or server and clients.

Here is a short problem description: when a client locally creates some object, it has to assign a unique number to its inode and put it into the global hash tables. With a local cache and maximum performance (or when the client is offline) it should not connect to the server and perform the create operation at all; instead it should pick some number for the inode and work with it.
The problem is that a number of clients can use the same inode number for different inodes, and can also have what is actually the same object under different inode numbers on different clients' machines.
When the clients and the server have to sync their states, the problem arises: the server does not know about an inode with the client's number, and thus the sync can not happen.

The solution is quite simple imo, and it solves both the cache coherency problem and the inode number one.
Clients use any numbers they like, for example sequentially increasing from zero. When a new object is created, its parent is marked as dirty by the client (if it is already marked as dirty by another client, that client is forced to push its changes to the server, which then forwards them to the new client), and the client uses its own inode numbering scheme. When later there is a need for resync (like forced writeback or the above case of cache coherency synchronization), the client sends the inode content to the server with both the name and the local inode number. The server then creates an object and assigns a real unique inode number to it, which is returned to the client. The client removes the inode with the old (local) number from the hash and inserts it back with the new inode number. That's all.
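
A rough model of that renumbering step, just to make the flow concrete. The Client and Server classes and their methods are invented for this sketch and have nothing to do with the real pohmelfs structures; it also simply assumes that the server's real numbers never collide with local numbers which are still unsynced (here the server starts at 1000), something a real design would have to guarantee.

```python
class Server:
    """Hypothetical server: hands out the real, unique inode numbers."""

    def __init__(self, first_ino=1000):
        self.next_ino = first_ino
        self.objects = {}  # real inode number -> object name

    def create(self, name, local_ino):
        # local_ino only identifies the request; the server picks its own number.
        real_ino = self.next_ino
        self.next_ino += 1
        self.objects[real_ino] = name
        return real_ino


class Client:
    """Hypothetical client cache: inode hash keyed by inode number."""

    def __init__(self):
        self.next_local_ino = 0
        self.inodes = {}

    def create(self, name):
        # Purely local: pick the next sequential number, no network traffic.
        ino = self.next_local_ino
        self.next_local_ino += 1
        self.inodes[ino] = {"name": name, "synced": False}
        return ino

    def sync(self, local_ino, server):
        # Remove from the hash under the local number, ask the server for
        # the real one, then insert back under the real number.
        inode = self.inodes.pop(local_ino)
        real_ino = server.create(inode["name"], local_ino)
        inode["synced"] = True
        self.inodes[real_ino] = inode
        return real_ino


server = Server()
first, second = Client(), Client()
local_a = first.create("file")    # both clients start numbering from zero
local_b = second.create("other")
real_a = first.sync(local_a, server)
real_b = second.sync(local_b, server)
```

Both clients used local number 0 for different objects, and after the sync each object has its own server-assigned number.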

Simple. And it allows working with any filesystem on the server side, because the system uses both the object name and the object id (inode number) as identifiers at creation time.

So far I do not see any drawbacks in this approach, but practice will show if it is correct design or not. Stay tuned.

/devel/fs :: Link / Comments (0)


Mon, 04 Feb 2008

Linux.Conf.Au 2008 presentations available.

Check them out!

And after getting the CRFS presentation (slides, ogg, SPX (what is that?)), we have:

http://oss.oracle.com/projects/crfs/

$ wc -l lk/*.[ch] | tail -1
 7335 total
$ wc -l crfsd/*.[ch] | tail -1
 5971 total
But world is so cruel:
Not Found

The requested URL /projects/crfs/ was not found on this server.
Likely it is still a weekend in USA.

/devel/other :: Link / Comments (1)


Return from the country-side.

We celebrated Grange's birthday in a small wooden cottage somewhere in the countryside, where we had a Russian bath, bathing in an ice-hole, shashlik cooked out in the frost, a musical jam (electric guitar, a couple of tom-toms and a saxophone), snowballs and wrestling, and the main thing: lots of friends.
Those were bloody cool days, thanks a lot for organizing that meeting!

/life :: Link / Comments (3)


Fri, 01 Feb 2008

I officially retired.

Those were fun and sometimes very interesting years, I got lots of experience, especially with the 32-bit PPC arch and various hardware.

That was good, but it's time for a step further.

/life :: Link / Comments (0)