Zbr's days.

Wed, 23 Jul 2008

Manager's thoughts: unused extensibility and used de-facto standards.

After some bedtime reading (this time the DNS RFCs) I found that the DNS protocol is so extensible that it can cover not only its own area, but also help with a lot of neighbouring problems. It already has many interesting (though almost completely unused) RRs and types which have little to do with name resolution (like the NULL RR, which allows transmitting binary data, or the TXT RR, which carries arbitrary text). The most popular RRs are A, PTR, SOA, CNAME and MX, and that is about all out of some 20 others. The same applies to (q)type and class (I read about the Hesiod class for the first time, for example). And DNS allows introducing your own classes, types and resource records.
It is just not used, but we could create a distributed DNS system with new types. It would be really simple (and it can actually be done even without new DNS extensions).
But it is not actually needed, since people are used to having DNS just the way it is.
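
To make the extensibility point concrete, here is a minimal sketch of how a DNS query for an arbitrary type and class is laid out on the wire. The numeric values for A, NULL, TXT, IN and Hesiod are the standard ones; `PRIVATE_TYPE` is just an illustrative pick from the private-use type range, and the function names are made up for this example.

```python
import struct

# Standard RR type and class numbers; PRIVATE_TYPE is an illustrative value
# from the private-use range (65280-65534), not something from the post.
TYPE_A, TYPE_NULL, TYPE_TXT = 1, 10, 16
CLASS_IN, CLASS_HESIOD = 1, 4
PRIVATE_TYPE = 65280


def encode_name(name):
    """Encode a domain name as length-prefixed labels ending with a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) < 64:
            raise ValueError("label must be 1..63 bytes")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name, qtype, qclass=CLASS_IN, query_id=0x1234):
    """Build a DNS message with a single question and no answers."""
    flags = 0x0100  # standard query, recursion desired
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!HH", qtype, qclass)
    return header + question
```

The only thing that changes between a plain A lookup and a query for a home-grown record is the 16-bit type (and possibly class) field, e.g. `build_query("example.org", PRIVATE_TYPE)`; whether resolvers along the way will pass such a query through is a separate question.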

Another example is internet video. There is the de facto Adobe standard, and no matter what W3C puts into its new standard, everyone will continue to use the existing one. Just because it works OK. Not excellent or perfect or whatever, it just works the way we are used to.

And there are lots and lots of similar examples.

People are so inert in these matters (and, I think, in most areas, simply because it is convenient not to build something better when the existing solution just works, even if not perfectly and even if not well) that no one will ever bother to change something dramatically. It would require not only a huge amount of money, but also a change in the way people are used to thinking about the given area, which is likely an even more complex (and money-hungry) problem.

All this talk is about a simple thing I have just discovered for myself: when you create something completely new, even if it is not the best solution for the given problem, and start pushing it to a wide audience, you are able to take the whole 'market'. And conversely, when you bring something new to a market where most users are already used to one solution or another, then (even if your project is potentially very good and definitely much better than the existing solutions) there will not be any major gain, only a few completely new users.
This is probably taught to first-year MBA students, but I was quite excited and disappointed by it: the first new idea, when properly presented, can get all the users even if it is not the best solution for the given problem, and after that they will not switch to a newer one just because they are used to having it this way.

/devel/other :: Link / Comments (1)


Tue, 22 Jul 2008

POHMELFS distributed facilities design notes.

Since I'm quite busy with visa/hotel/tickets and overall preparations for the Kernel Summit, there is no development progress, but that should be sorted out very soon I think, so I will write down here some design notes I have in mind about how the POHMELFS server will be designed. It is not a finished draft, but rather a rough sketch of the direction.

POHMELFS will use the distributed hash table approach, i.e. the storage will support getting an object based on some key attached to it. In a local filesystem we already work with a hash table: a directory lookup is no more than a lookup of an inode object based on its name, i.e. a lookup of a value based on the attached key. And although the key in this case is not derived from the object itself (like a hash of the content or some other function), it still is a (turn on your imagination here) table lookup.

The cloud of POHMELFS servers will use a similar approach. Consider a single server in the system. When it joins the cloud for the first time (I omit the joining process for now and will describe it below), it is empty, so it gets some unique ID, either from the administrator or randomly, or it just waits in a queue to be filled with new data and gets its ID at that time. For now it does not matter how the node gets its ID, but this ID is propagated to some cloud of its neighbours (or to the main server, if this were BitTorrent or Napster).
There are two ideas on how to treat this ID: either as a part of the filename, or as a nameless pointer in an abstract namespace. I will show below that it actually does not matter.

Now, let's check what happens when a user wants to perform some IO on a given file.
Every file access actually goes to an inode stored on disk. In our case it can be stored somewhere we do not know yet, so we need to perform a lookup to get the address of the cluster node which contains our data. In existing schemes like BitTorrent or Lustre there is a server (or a small cloud of servers) which holds the mapping of where each object is placed in the data cloud, so a simple lookup to this server(s) returns the needed info. This approach does not scale to really large numbers of nodes and is failure-prone.
Instead I consider completely distributed metadata storage. Let's check how the system will look up a whole path in our case.

Each path starts from the root directory '/', which in turn is an ID in the global namespace (or a hash of this string, or whatever other mapping), so we first need to look up the node which is responsible for the content of this directory. Each node has routes only to a very limited set of neighbour nodes (this number varies between designs, but the point is that the node performing the lookup does not know which node holds the needed info). Gnutella just broadcast the lookup request to all of its neighbours, each of which broadcast it to its own neighbours, and so on until one of the nodes replied that it had the needed info. The amount of unneeded broadcasts killed Gnutella the day after Napster was closed.
So this approach does not scale, and instead we need to map the needed directory to a node address in a more intelligent way. There are at least two appealing design choices: the ring-based structure implemented in Chord and the multidimensional torus implemented in CAN.
Right now it does not matter which one we pick; let's assume that we have found the node which has information about the content of the needed directory. With that data we can find the next node (or this info can be cached on the 'parent' directory node), and so on until we reach the node which is responsible for storing the content of the needed object.
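
The per-directory walk described above can be sketched roughly as follows. This is a toy model under loud assumptions: it hashes full directory paths with SHA-1 onto a ring and gives every key to the first node clockwise, using a global sorted list instead of Chord's per-node routing tables. The `Ring` class and `resolve` helper are invented for illustration and are not POHMELFS code.

```python
import bisect
import hashlib


def key_id(text, bits=32):
    """Map a string to a point on the ring using SHA-1."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << bits)


class Ring:
    """Toy ring: every key is owned by the first node clockwise from it."""

    def __init__(self, node_names):
        self.points = sorted((key_id(name), name) for name in node_names)
        self.ids = [point[0] for point in self.points]

    def owner(self, key):
        idx = bisect.bisect_left(self.ids, key_id(key))
        return self.points[idx % len(self.points)][1]  # wrap around the ring


def resolve(ring, path):
    """Return (directory, owner node) pairs visited while resolving a path."""
    prefixes = ["/"]
    for part in (p for p in path.split("/") if p):
        prefixes.append(prefixes[-1].rstrip("/") + "/" + part)
    return [(prefix, ring.owner(prefix)) for prefix in prefixes]
```

For example, `resolve(Ring(["node-a", "node-b", "node-c"]), "/home/zbr")` visits '/', '/home' and '/home/zbr' in turn, asking the owner of each level for the next one, which is the step that could be cached on the parent directory node.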

When a new node joins the cloud, it connects to some known node (provided either by a public service or by the administrator), sends information about its available space there, gets an ID and just waits until some client connects to it and starts writing data.
When a node joins with some content, which was either written to it by the system before or written by local users bypassing the distributed mechanism, it has to report this to the node which holds the parent directory. This information should be stored in each directory the node exports, or it can be provided by the administrator. For example, this node exports the dir '/zbr', which is actually a subdir of '/home', so the node will look up the owner of the '/home' directory content and update its records to say that it now contains a new dir. There is a problem here: what if there is already another node which also claims to have the dir '/zbr' in '/home'? This can be handled via an extended attribute attached to each object, which tells us the last modification date, so the system can select either the most recently modified '/zbr' dir or the node whose copy of the dir has the largest number of identical replicas. The policy can be set by the administrator.
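
A minimal sketch of the two conflict-resolution policies mentioned above, assuming each competing claim is reduced to a `(node, mtime, content_id)` tuple. These fields, and the `pick_claim` name, are hypothetical stand-ins for whatever the extended attributes would really carry.

```python
from collections import Counter


def pick_claim(claims, policy="newest"):
    """Pick the winning claim for a directory.

    claims is a list of (node, mtime, content_id) tuples; the layout is an
    assumption made for this sketch, not a POHMELFS structure.
    """
    if not claims:
        raise ValueError("no claims to choose from")
    if policy == "newest":
        candidates = claims
    elif policy == "replicas":
        # Keep only the claims whose content has the most identical copies.
        counts = Counter(content_id for _, _, content_id in claims)
        best_id, _ = counts.most_common(1)[0]
        candidates = [claim for claim in claims if claim[2] == best_id]
    else:
        raise ValueError("unknown policy: " + policy)
    # Among the remaining candidates prefer the most recently modified one.
    return max(candidates, key=lambda claim: claim[1])
```

With claims from three nodes where two hold identical content, the "newest" policy may pick the odd one out, while "replicas" sticks to the majority copy; which behaviour is right is exactly the choice left to the administrator.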

The main advantage of this joining scheme is that we do not actually need to know the content of any object in the exported directory: we publish only the high-level object, which may or may not contain some inner file or dir. Thus we do not need to hash millions of files in the exported directory and publish them one by one, we do not need to store information about each inner object, there is no need to attach a full path to each object, and so on.

When we decide to split the same object between multiple nodes, we will need not only name-based lookup, but also an extension of it to the offset inside the object. This can be done by introducing a system-wide 'block size', so that each file is actually a set of blocks of the given size. Then, once we have found the node responsible for storing information about the directory where the file is located, this node can also hold information about where each part of the object is stored.
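
The offset arithmetic behind a system-wide block size is simple enough to sketch. The function below is an illustration only (the name and the tuple layout are made up): it cuts a byte range of a file into per-block pieces, each of which could then be looked up in the directory node's block-to-node map.

```python
def split_range(offset, length, block_size):
    """Cut a byte range into (block index, offset in block, byte count) pieces."""
    if offset < 0 or length < 0 or block_size <= 0:
        raise ValueError("offset/length must be >= 0 and block_size > 0")
    pieces = []
    end = offset + length
    while offset < end:
        block = offset // block_size
        within = offset % block_size
        chunk = min(block_size - within, end - offset)
        pieces.append((block, within, chunk))
        offset += chunk
    return pieces
```

For instance, with a 4096-byte block size a 5000-byte write at offset 4000 touches the tail of block 0, all of block 1 and the head of block 2, so up to three different nodes may be involved.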

Looks quite simple, but... Devil is in the details.
I obviously missed some bits in the design (I put it together in my head, under the 'impression' of a Greek spirit, while talking with asm@, who suggested looking at the Kademlia project), like redundancy management of the nodes, splitting of node content between multiple nodes and other bits, but it is one of the first drafts, so things can be changed if needed.

Stay tuned, I will be back to the development process very soon (DST first :), since the paperwork for the kernel summit travel seems to be reaching its end.

/devel/fs :: Link / Comments (0)


Mon, 21 Jul 2008

Foot, fingers, shins, knees, thighs, shoulder, back.

No, these are not the body parts I know (I looked half of them up in the dictionary), this is what is aching right now.

It's called football.

Yes, sounds a bit scary, but that was a hell of a super game today. We were much stronger, but I have to admit that it was mostly because we made the right transfer decisions and selected the right players at the beginning, so our previous team was strengthened. I managed to score a goal, make a couple of nice saves and even pull off some quite technical dribbling, which was quite surprising, since I have not played football for 5 years.
I would not say I'm getting into shape, but there is progress.

/life :: Link / Comments (0)


Sun, 20 Jul 2008

Crazy security idea.

I've just realized that I do not know of a way to make some (running) application encrypt all of its data that hits the disk (either via swap or the usual way, like an editor writing a file and all its temporary files).
I actually consider this a very useful feature for editors, browsers, instant messengers and mail clients, download applications, music players and so on. It is especially relevant for temporary files, when one expects the editor to be highly secure (or even to be working on an encrypted partition), while its temporary files are stored somewhere in /tmp, which is not encrypted.

It could be started via some wrapper, which would tell the kernel the encryption algorithm, key, IV and all other needed info and attach a crypto processing callback to the process, so that when disk activity is started by the given pid (swap, or data writing or reading), the data is encrypted/decrypted in flight.
The kernel should check all file descriptors opened by the given process and process them appropriately. There may be some problems with communication with unprotected applications, which should be thought out, but overall I like the idea...

I have put it on the todo list.

/devel/other :: Link / Comments (0)


Project presentation.

I've just realized that lots of my blog posts are valid enough presentation abstracts: at least they contain enough words describing the problem, a possible solution and topics of general interest for the given area. But I have never presented such projects in English before, although quite frankly I'm not that bad a speaker in Russian; at least I am not afraid to talk and probably like contact with an interesting audience. After all there is this blog :) and I have even given a number of similar presentations, from 15 minutes to a couple of hours including the question/answer part.
My English in this blog is rather ugly, and I rarely (if at all) fix the errors I detect when rereading the text in the browser (and I detect lots of them), the same as in mails and other posts.
So eventually we will probably have interesting talks about different areas, but expect to 'listen' to the world-wide language of gestures :)

/devel/other :: Link / Comments (0)


Sat, 19 Jul 2008

Distributed storage is dead, long live the Distributed storage!

As you may know, the DST project was an attempt to implement a redundant, failover-resistant, flexible block-level storage subsystem. Among other features it supported mapping multiple remote nodes into a single node via linear or mirroring algorithms, reconnecting to a failed node, read balancing and parallel writing to multiple nodes (in the case of mirroring) and so on.

Now it is gone. The distributed storage you knew before is no more; instead there is a completely new project being developed, whose main goal is to provide a transport layer for block requests only. Consider it as Network Block Device on huge steroids. Consider it as iSCSI on huge steroids. Consider it as ATA-over-Ethernet on even more huge steroids.
It is just an example of what all those protocols should have. And only that.
And if it does not sound very ambitious: previous DST versions already supported lots of features which never existed (and in some cases were impossible to add) in other block-level network storages.
DST moves further.

There will be no mirroring and no ability to map multiple devices into a single one; one should use Device Mapper for that instead, since its features were simply mirrored in DST (although I tried to optimize them sometimes), and the number of targets in DST was noticeably smaller.

Now DST is just a simple block device which operates on top of a network connection. With just a single exception: it's done right.

Features planned for the new Distributed Storage:

  • kernelspace client and server
  • initial autoconfiguration between client and server nodes
  • automatic reconnect to failed target
  • transaction model: resending, timeout error completion, full rollback of the failed transaction
  • wire speed performance
  • data channel encryption, strong checksumming
  • cryptographic authentication
  • ability to work on top of any network protocol
  • barrier support (if and when Device Mapper starts to support them, DST will not need to be changed)
  • flexible protocol with simple ability to extend it to needed functionality
  • trivial configuration
The project is being written from scratch, but it is actually very simple and should be quite small, so expect its first release quite soon.
It will be pushed upstream when ready.

/devel/dst :: Link / Comments (8)


Fri, 18 Jul 2008

Completed distributed storage redesign.

I also managed to play second octave F# and sometimes the whole chromatic scale down to small (minor?) octave F on my trumpet, and I believe I have started to understand the overall trumpet kung-fu, but I expect that is not what you wanted to read under the DST tag.

So, DST becomes smaller, cleaner and simpler. Notably, I decided to drop the userspace target completely for now.
The kernel part now operates on a transaction entity, which holds a reference to the node where data should be sent to or received from. There can be at most two such nodes, when a block IO request spans a node boundary. In the case of mirroring (which will be dropped for the first release) the list of nodes to mirror the data to will be maintained by the first node, so the transaction will not need to know about them.
In theory a block request can be as large as BIO_MAX_PAGES pages, which is 256 for now, so I decided to require that the minimum node size is not smaller than this bio limit; that way there will always be at most two nodes per request.
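
The "at most two nodes" claim follows from simple arithmetic, which the sketch below illustrates for a linear mapping. It is a userspace model, not DST code: sizes and offsets are counted in pages, and `nodes_for_request` is a made-up name.

```python
import bisect
from itertools import accumulate

MAX_REQUEST = 256  # pages; BIO_MAX_PAGES as quoted in the post


def nodes_for_request(node_sizes, start, length):
    """Return indexes of the linearly mapped nodes touched by a request."""
    ends = list(accumulate(node_sizes))  # exclusive end offset of every node
    if length < 1 or start < 0 or start + length > ends[-1]:
        raise ValueError("request outside of the device")
    first = bisect.bisect_right(ends, start)
    last = bisect.bisect_right(ends, start + length - 1)
    return list(range(first, last + 1))
```

If every node holds at least `MAX_REQUEST` pages, a request of at most `MAX_REQUEST` pages can cross only one boundary, so the returned list never has more than two entries. With smaller nodes, e.g. sizes `[100, 100, 612]`, a 256-page request starting at page 90 would touch three of them.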
Each node has either a block device behind it (so it will just call generic_make_request() with a different block device for the given bio), or a network state machine.
Network state will have two threads: RX and TX. The receiving one is used to get replies for read/write messages, find the appropriate transaction and complete it. In the case of the DST server it will also handle read/write requests and generate replies, but the whole processing will be exactly the same: the client node will also have a switch to process read/write requests from the network, but they should only ever be received by the server.
The sending thread is tricky. It is used as a fallback for non-blocking sockets, which are tried first at generic_make_request() time, i.e. when a higher-level user performs a read or write. If the block was not fully sent, it is queued to this thread, which will try to send the rest of the data when polling allows. The ->make_request_fn() function returns in this case and the higher layer can proceed with its own operations.
A transaction is not freed until a reply is received from the remote side or the resend retry count is exhausted.
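
The transaction lifetime rule above (keep it until a reply arrives or retries run out) can be modelled in a few lines. This is a hedged userspace sketch with invented names (`TransactionTable`, `on_reply`, `on_timeout`); the real kernel code would work with bios, memory pools and the RX/TX threads instead.

```python
class TransactionTable:
    """Track in-flight requests until a reply arrives or retries run out."""

    def __init__(self, send, max_retries=3):
        self.send = send              # callable(tid, payload), e.g. a socket write
        self.max_retries = max_retries
        self.pending = {}             # tid -> [payload, retries_left]
        self.next_tid = 0

    def submit(self, payload):
        """Send a new request and remember it until it is acknowledged."""
        tid = self.next_tid
        self.next_tid += 1
        self.pending[tid] = [payload, self.max_retries]
        self.send(tid, payload)
        return tid

    def on_reply(self, tid):
        """Complete and free the transaction; False for unknown or late replies."""
        return self.pending.pop(tid, None) is not None

    def on_timeout(self, tid):
        """Resend while retries remain, otherwise complete with an error."""
        entry = self.pending.get(tid)
        if entry is None:
            return "unknown"
        if entry[1] == 0:
            del self.pending[tid]
            return "failed"
        entry[1] -= 1
        self.send(tid, entry[0])
        return "resent"
```

The receiving side would call `on_reply()` after matching a reply to its transaction id, while a timer would call `on_timeout()`; with `max_retries=2` a request is sent three times in total before it is completed with an error.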
A transaction is always allocated (from the appropriate memory pool), and that is actually the only allocation in DST itself. When it works with block devices, it may need to clone a bio when the request crosses a node boundary (or maybe even always, I have to check; that is essentially what device mapper does, with lots of additional allocations of its own), but it should be a very rare condition.
The network stack will allocate its own data too.

That was the theory. Practice tells me that essentially 90% of the code should be rewritten from scratch, so I recloned the tree and have so far implemented the generic bits: registering the block device, creating various sysfs files and directories and other similar trivial things. I still plan to finish it this weekend (without mirroring), but things may turn out differently...

/devel/dst :: Link / Comments (0)


Have sent all documents for US visa.

I checked my passports and decided that if other countries let me in with those photos, then US customs officers should not frown too much upon the current ones.

So, waiting for the results. I am almost sure that I will get the visa and will meet interesting people at the kernel summit and the Plumbers conference, but anyway I would like to draw the line.

For instance, Zach Brown will talk about CRFS (as well as show us some chocolate and cocktail bars around; imho the only good cocktail is rum with cola (less cola) and ice), so there will be something to listen to.

/life :: Link / Comments (0)


Thu, 17 Jul 2008

"The Gun Seller" by Hugh Laurie.

I have just finished reading this excellent detective novel (at Amazon; I read the electronic version in Russian).
People call it the best English humour novel for a reason: it is indeed fun and interesting, although I suspect a lot of its witty satire was a bit lost in translation. Nevertheless I do recommend it as easy reading.

And of course if you like House M.D., you have to read this novel; you will not waste your time, for sure.

/other :: Link / Comments (0)


Morning trumpet exercises.

This morning I tortured my ears for almost two hours, and at the end managed to play a chromatic scale (glide?) from second octave D (E trumpet) down to minor octave F (E trumpet), at least that is what my Korg tuner showed. Much more frequently I was able to play a single first octave (only in the descending direction though, I have not yet tried to go up).
The Korg AW1 tuner does not show octaves, but I really do not think that it is possible to play one octave lower than my lowest sound, and I am pretty sure it is possible to go at least one octave higher than my highest tone, so I decided that I play several tones around the first octave.

Ugh, I was supposed to leave for the office earlier (before the heat and traffic jams), but instead I wrecked my brain via my ears (and the neighbours were probably not happy either, although I did not play at full volume).

/life :: Link / Comments (0)


Wed, 16 Jul 2008

New toys: Korg AW1 tuner.

I believe I can produce enough sounds out of my trumpet, so I need a tuner which will tell me how bad those sounds are.

Korg AW1

So, now I'm starting to seriously tune my sounds.
So far I think (well, hope) that I can play at least two octaves. Actually I mean not play, but produce a sound; it is still not that simple and not always very clean. But since I have 'played' (or, better to say, studied) my trumpet for only a couple of months, never played any instrument before (not counting a couple of guitar riffs at university) and do not play with a teacher, I think this tuner will be a very good addition.

/other :: Link / Comments (4)


Tue, 15 Jul 2008

Distributed storage development roadmap.

Yes, the DST project is alive and will be kicking very soon, since I decided to change its underlying architecture and switch to a transaction model, just like POHMELFS. This basically means that as long as the system has enough RAM, write operations will be extremely fast, reads can be balanced between multiple nodes (in a mirror), transactions can be resent, the failover mechanism becomes much simpler, and the system overall will be much more robust to failures.

The transaction model also means that the system requires an acknowledgement from the remote side, and there are two possibilities here: to handle the implicit ack which comes with TCP ACK packets, as I experimented with before, or to send an explicit ack from the server for each client request.
The former approach, although it has a smaller performance overhead, still suffers from the fact that pages sent via DST are always stateless, i.e. at this layer there is no knowledge about who sends the page. We can determine the inode a page belongs to, and can even get the socket when the page is about to be released after the ack has been received, but we can not know exactly which pipe it was submitted from into the given socket, so when multiple threads send the same page via multiple sendfile() calls, we do not know when and how the page will be released. We can put the pipes this page belongs to into a singly-linked list (since the page has only two pointers unused at this point, those of the LRU list head, and one of them is used to mark that this page belongs to the sendfile()/splice codepath), and traversing this list will likely not hurt usual users, but a malicious one can create a local DoS with this approach. After some experiments with the splice code today I decided to drop the implementation of this idea for now.
There is a strong argument in favour of explicit acks from the server: they allow asynchronous transaction processing (with implicit acks we can not hook into the processing path, since we do not know where exactly the skb with our pages is chained), and they do not hurt performance (which was proven by the POHMELFS benchmarks).

So, the overall plan for DST development is to switch to the transaction model and process all events asynchronously (there are actually only two of them: reading and writing of the given pages at the given locations).
This task is not that complex, so I expect some new results later this week. Stay tuned!

/devel/dst :: Link / Comments (5)


Football match has made the day!

That was an exceptionally bloody cool evening!
We had three teams of 6 players each and played on a small mini-football field for about 2.5 hours; each match lasted either 7 minutes or until 2 goals were scored into one goal. It drained my energy so thoroughly that even the exceptional tiredness right now brings a kind of masochistic pleasure.
My breathing really sucks, and actually it is not a surprise, since I have not played football for more than 5 years, but nevertheless the shoes and the ball are in good shape.

I managed to damage my knees, shoulder and toes in various 'contacts' during the game, but that's not a problem.
Our team was not really the best one, but we firmly hold second place and can actually fight for first, since all our players had long breaks from playing, while the first team's players train regularly in their own teams (including a youth football champion).

That was the super time!

/life :: Link / Comments (0)


Mon, 14 Jul 2008

ParaLLels concert.

ParaLLels

Visited ParaLLels concert this weekend...
Mixed feelings, but saw lots of old friends (musicians by a coincidence), which made the day.

/life :: Link / Comments (0)


Sun, 13 Jul 2008

Hermite interpolation.

Hermite interpolation examples

This interpolation uses the cardinal spline approach, namely Catmull-Rom splines. The next task is to test how Kochanek-Bartels splines (also called TCB splines) behave. The latter are used in all popular 3D modelling engines. Since the math behind them is very non-trivial, I will just try to use the existing formulas for Hermite tangents, which are quite simple.
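
For reference, a minimal Catmull-Rom sketch: cubic Hermite basis functions with the tangent at every interior point taken as half the difference between its neighbours. The one-sided tangents at the two end points, as well as the function names, are assumptions of this example rather than details from the post.

```python
def hermite(p0, p1, m0, m1, t):
    """Evaluate a cubic Hermite segment between p0 and p1 at t in [0, 1]."""
    t2, t3 = t * t, t * t * t
    h00 = 2 * t3 - 3 * t2 + 1
    h10 = t3 - 2 * t2 + t
    h01 = -2 * t3 + 3 * t2
    h11 = t3 - t2
    return tuple(h00 * a + h10 * ma + h01 * b + h11 * mb
                 for a, b, ma, mb in zip(p0, p1, m0, m1))


def tangents(points):
    """Catmull-Rom tangents: (next - previous) / 2 for interior points."""
    n = len(points)
    result = []
    for i in range(n):
        prev_pt = points[max(i - 1, 0)]
        next_pt = points[min(i + 1, n - 1)]
        # End points fall back to a one-sided difference (an assumption here).
        scale = 0.5 if 0 < i < n - 1 else 1.0
        result.append(tuple(scale * (b - a) for a, b in zip(prev_pt, next_pt)))
    return result


def catmull_rom(points, steps=8):
    """Sample a curve that passes through every control point (needs >= 2)."""
    ms = tangents(points)
    curve = []
    for i in range(len(points) - 1):
        for s in range(steps):
            curve.append(hermite(points[i], points[i + 1],
                                 ms[i], ms[i + 1], s / steps))
    curve.append(tuple(float(c) for c in points[-1]))
    return curve
```

Kochanek-Bartels splines keep the same `hermite()` evaluation and only change how the tangents are computed, adding tension, continuity and bias parameters, which is why reusing the Hermite form is convenient.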

Now it's time to think about how to use this knowledge and how to apply the given approach to detect and decode letters in an image...

/devel/math/bezier :: Link / Comments (0)


Sat, 12 Jul 2008

Monday evening prognosis.

It promises to be just bloody excellent!

My old Select!

I have not played football for several years, but I've found people who do like it, so our games promise to be really fun and interesting! There are already three teams (4+1 players per team). This will be my first game after about 5 years of football silence; likely there is nothing left in my legs that can help with playing football, since climbing probably does not correlate with it, and neither does my experience in other kinds of physical training, but nevertheless I'm looking forward to what promises to be an exceptionally cool game!

/life :: Link / Comments (0)


Fri, 11 Jul 2008

Spline graphical interpolation fun.

Bezier/Hermite interpolation interface

I am playing with different spline interpolation methods. So far they seem to be quite simple when written in matrix form, so I cooked up a simple GTK application to test various methods.
There is no interpolation implementation yet, since I devoted the last two days to reading lots of material about Bezier and Hermite interpolation techniques (as well as lots of papers about distributed hash tables, which I will use as the filesystem storage base for POHMELFS).

/devel/captcha :: Link / Comments (6)


Wed, 09 Jul 2008

Captcha transformation algorithms.

Couple of first ideas. Pretty trivial.

Average sliding and normalization algorithms

The next step is to squeeze the images, so that all bold lines are reduced to single-pixel ones. In theory it should not be very complex (I have an algorithm in mind), but in practice it will be; I am starting to recall why the hell I learnt LISP.
The basic idea is to transform the above BW pictures into a simple binary format, which will be read by a LISP application, since I do not know, and do not want to devote much time to learning, how to parse/process various image formats; instead that is done by a GTK application written in C. I believe LISP was called the best language for artificial intelligence development for a reason, so I will try to find out why.
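
The post does not spell out how the sliding average and normalization steps work, so the following is only one plausible reading, shown on a single row of pixel intensities: a centred moving average to smooth out noise, followed by min-max scaling to a fixed range. Window size, range and function names are all assumptions.

```python
def sliding_average(values, window=3):
    """Smooth a row of intensities with a centred moving average."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))  # shorter window at the edges
    return out


def normalize(values, top=255):
    """Stretch intensities linearly so that they span 0..top."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    return [round((v - lo) * top / (hi - lo)) for v in values]
```

Applying `normalize(sliding_average(row))` to every row (and, for a 2D filter, to every column as well) gives a smoothed picture with full contrast, which is a convenient input for later thresholding to black and white.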

Slacking - rox :)

/devel/captcha :: Link / Comments (4)


Tue, 08 Jul 2008

Anecdotes and allegories.

I'm not a major kernel contributor, but I have been invited to the kernel summit 3 times in the last 3 years.
And I will try to get to this year's one in Portland, Oregon; at least I have started some preparations and contacted the needed people. I hope I will also participate in the Plumbers conference.
As before, I will bring a bottle of vodka (the number of people who wanted to talk suddenly dropped to the ground) and will greatly appreciate your contacts and discussion topics :)
That is of course if the stars line up, but I will push them a bit.

/devel/other :: Link / Comments (0)


Mon, 07 Jul 2008

New POHMELFS release.

Irish 'Clontarf' and Scotch 'Grant's' helped to roll this release out.

This POHMELFS release features include:

  • Strong cryptography support. One can encrypt the whole data channel (except headers) and/or hash/digest it. The system will try to autoconfigure itself, and if the server does not support the requested algorithms, the mount will either fail (if a special mount option is specified) or disable usage of the corresponding algorithm.
  • Bug fixes.
Cryptography support is an essential addition to the POHMELFS core. It was implemented with performance in mind, so that processing speed would not drop noticeably even with very CPU-hungry operations (one can check the performance graphs).
POHMELFS uses a pool of crypto threads (their number can be specified via a mount option), which perform crypto processing of the data and submit it either to the network or to the VFS layer.
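
As a loose userspace analogy of the crypto thread pool (not the in-kernel implementation), the sketch below hashes data blocks on a configurable number of worker threads and returns the digests in submission order. `hash_blocks` and its `threads` parameter are invented names; only SHA-1 itself comes from the post's benchmark description.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def hash_blocks(blocks, threads=4):
    """Compute SHA-1 digests of data blocks on a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # map() keeps the results in the same order as the input blocks.
        return list(pool.map(lambda block: hashlib.sha1(block).hexdigest(), blocks))
```

The `threads` argument plays the role of the mount option mentioned above: more workers help only while there are spare CPU cores to run the hashing on.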

Now I will concentrate mostly on the userspace server features, mainly its distributed facilities. The current ability to write data to multiple servers and balance reads among them is not enough for POHMELFS, but it will be an essential building block of the fully distributed fault-tolerant parallel filesystem.

If this development requires some changes on the kernel side (namely network protocol extensions), they will be done in the upcoming releases, along with fixes for any bugs found.

As usual, you can grab sources from archive or via GIT tree.
You can also check POHMELFS homepage to get more details on its design and supported features.

P.S. I think I will take a rest from this project for several days, which will allow me to think over the main POHMELFS features and work out the rough edges. I will switch to DST and netchannels (mainly to make new releases) and then will devote some time to captcha cracking algorithms.

/devel/fs :: Link / Comments (4)


POHMELFS crypto processing performance.

If you expected a miracle, it did not happen, so I just present a picture, where I compared plain async in-kernel NFS server (no encryption, no checksumming) versus POHMELFS, which performed SHA1 hashing and AES-128-CBC encryption of the whole data channel.
The block size used in the iozone test is 8 KB, the file size is 8 GB, with 1 GB of RAM.

Encrypted + hashed POHMELFS vs plain NFS

/devel/fs :: Link / Comments (4)


Sun, 06 Jul 2008

Vodka drinks.

Vodka itself is a very interesting drink, but depending on the situation it can be either the cheapest way to get very drunk, or a chance to have a long and fun time in good company.
Frequently (and likely most of the time) vodka is used for the first purpose only, which is sad of course.

I do not know when and how vodka became popular in Russia, but I think it is always associated with my country now. Actually every nation has some kind of vodka in its own history of drinks, and likely still has it. For example the UK/Ireland has whiskey, which is effectively vodka, but aged in oak barrels. This brings a very interesting taste, which allows using it as a kind of long drink (especially with ice). After a whiskey shot one can breathe in (especially through the nose), which brings the aftertaste directly into the brain and to every part of the body. I do not know any cocktails based on whiskey.
In my opinion, Irish whiskey is much more tasty and interesting than (probably the original) Scotch, although the former has many more labels.
The USA also drinks whiskey, but most of the time its own labels, which I have not tried yet. The USA does not have its own popular drink though, or at least I do not know of one.

Europe also has lots and lots of different kinds of vodka.
The French drink cognac. I do not like it, and believe that it is only coloured tasteless vodka, likely even the best labels like Rémy Martin and Hennessy (although the latter was founded by an Irishman :), but it is only a matter of taste of course. The cognac production process is a bit more complex than that of vodka, and it is said to have a very different taste, which (for me) is very similar to clean vodka. Cognac is one of the most popular strong drinks. The culture of drinking it is forgotten, but nevertheless it is very interesting. Cognac should be drunk only at a special temperature (16 degrees Celsius) from a glass of a special shape, which concentrates its aroma. Cognac is not swallowed immediately, but 'stored' in the mouth for a while to get the whole taste.
The French also created absinthe. This is a very strong drink (up to 90 degrees), but its main feature is thujone. History tells us that thujone, supposedly a quite strong hallucinogen, was the main reason why absinthe was forbidden in Europe. History also tells us that its concentration never exceeded 10%, so it is unlikely that it had any kind of strong effect. Vincent van Gogh liked absinthe very much, and there is even a theory that he cut off his ear during absinthe intoxication, but likely it was some special absinthe, since a thujone concentration of 10% or less does not have any significant effect. Right now absinthe is allowed in most of the countries where it was forbidden about 100 years ago.
Eastern Europe drinks various kinds of vodka, which are named in the local manner.
For example the so-called Cha-Cha, which is a quite strong drink (up to 80 degrees), but usually very clean, so it can be drunk without dilution.

The New World (mostly Mexico) brings us a very interesting vodka-like drink called tequila. It is frequently called Mexican vodka, although the US also produces its own labels. There are also types which are aged in French cognac barrels.
Usually it is drunken with salt, lime (sometimes lemon) and mulatto female. Process is very interesting: you lick mulatto's hip, cover it with salt, lick it, get tequila shot and eat a lime portion. Even without mulatto it is still very tasty drink. Tequila is made out of special agave sorts, the more it has, the higher is quality.

One of the best known vodka-like drinks from the Caribbean is rum. It is also quite a strong drink, but because of its oil-like elements, it is sweeter and very tasty. Rum is likely one of the most widely used strong drinks for cocktails.

I know that Koreans also like their own kind of vodka very much. It has a smaller spirit concentration, namely 20 degrees, and is made out of rice. It is a very popular drink to be mixed with beer. Blows your roof away after just a couple of shots.

Ukrainians have a very interesting drink called 'Gorilka', which is effectively vodka with pepper. It is very tasty, but never eat the Gorilka pepper, or you risk getting a peptic poisoning.

There are several vodka mixes.
The first and likely the best known is 'Screwdriver', which is vodka mixed with juice. It is not very tasty imho. One of the strongest roof-blowing drinks is the so called 'ruff', a mixture of vodka and beer. Do not try it if you do not know what it is.
I also know one vodka long drink: vodka with Martini mixed one to one. Although it looks quite strong, it is a very tasty drink with an excellent sweet and a bit dry taste.
Using my small cellar I created (or at least tried for the first time) another long drink, which consists of vodka mixed with 'Malibu' rum. It is also possible to add juice or cold tea to it.

Weekend...

/other :: Link / Comments (9)


Multithreaded POHMELFS crypto processing.

Meanwhile, having a rest from various celebrations, I managed to complete multithreaded crypto processing for the receiving path in POHMELFS.
So far it was only tested in a debug environment (i.e. zillions of logs and overall miserable performance), but it shows that different threads pick up the work, both in the sending and receiving directions.
There is a limitation though: the same crypto threads are used both for the receiving and transmit paths, so it is possible to saturate them all with, for example, receiving, so that sending will stall. If there are not enough crypto threads, waiting for RX crypto processing can take too long, so the watchdog transmit scanner will fire and complete transactions with errors. One can work around this by specifying a big enough number of crypto threads or a long enough transaction scanning timeout, both of which are provided via mount options.

I would like to test it in a more production-like environment and perform various stress tests on it, but I'm far from my working place, so I can not do it right now. Which means the release will be postponed until tomorrow (if testing does not show regressions or bugs).

This will not be the last feature release though: for example POHMELFS does not support extended attributes and ACLs, and there is no header checksum (although there is a reserved 32-bit field). There may be some features in different areas too, but I am in no hurry to implement them, since I need something to put into future POHMELFS changelogs. I think sending the same kernel patch with different words about userspace server changes is not the way to go, so there should be some kernel changes too :)

I will draw up some design notes on how I plan to implement the POHMELFS server, and namely how the distributed facilities will be done. So far I have a quite clear picture in mind, but it needs to be worked out 'on paper' to find the rough corners.

Stay tuned!

/devel/fs :: Link / Comments (0)


Sat, 05 Jul 2008

Midnight creatiff. Casted by LHC start.

- Shit! There are no more M8 screw-nuts.
- What? Use M12, boson should pass through.
- We all will be fucked this Monday!

Building LHC

Good night. Actually, as a former physicist I can say that at least two out of four killing theories are really stupid, but nevertheless it's interesting!

/other :: Link / Comments (2)


Fri, 04 Jul 2008

In case we will die this Monday...

I've started a countdown...

Countdown has been started

Large Hadron Collider will be started in 3 days...

/other :: Link / Comments (0)


Thu, 03 Jul 2008

POHMELFS crypto support has been completed.

kernel$ git commit -a
Created commit b07e3ed: Added crypto support.
 9 files changed, 1534 insertions(+), 221 deletions(-)
 create mode 100644 fs/pohmelfs/crypto.c

fserver$ git commit -a -m "Aded crypto support."
Created commit f916b2f: Aded crypto support.
 3 files changed, 788 insertions(+), 94 deletions(-)
I implemented a pool of crypto processing threads (their number is a mount option), each of which has a pool of pages to encrypt data into. A crypto thread is not released until the server acknowledges that data was successfully written, so one should tune the number of threads and the page pool according to the desired behaviour (the number of pages in each thread is the maximum number of pages per transaction, and this limit has its own mount option too).

Testing shows that writing performance was reduced noticeably with this approach: with 4 encryption threads and 4 receiving threads in the server, performance dropped by around 30%, from 65+ MB/s down to 46+ MB/s, but I think it can be improved with a larger number of encryption threads. During the iozone write/rewrite test each of the 4 crypto threads ate about 20-30% of CPU, while the server ate about 130% (4 threads in total). In all previous iozone tests, the larger the number of userspace threads used, the worse the results were (this is somewhat expected, since iozone is a singlethreaded benchmark, so a larger number of threads leads only to performance degradation), so I will test different setups (namely a larger number of crypto threads and a smaller number of server threads).

But this behaviour is not a problem, and I expect it to be tuned. The real problem is reading performance. Right now there is only a single thread which reads from one socket: it was done intentionally, since reading data from a socket is a longer operation than searching for a page in the radix tree or any other operation performed by that thread, so there is no way to saturate its capabilities. That holds until we start encryption, which is slow: any subsequent reading of data from the socket can not be done in parallel with crypto processing, and overall reading performance drops to the floor.

This problem has to be fixed, so I plan to use the same crypto processing threads to decrypt and/or perform hash check for received data and push it up to the VFS stack.

/devel/fs :: Link / Comments (0)


Wed, 02 Jul 2008

POHMELFS crypto: feel incredibly stupid.

First, POHMELFS does need to have encryption, because I plan to use a distributed hash table approach in the server (well, consider the POHMELFS kernel client as a kind of bittorrent filesystem client), and as in any non-centralized system, content transferred via uncontrolled data channels has to be encrypted.

But... I'm incredibly stupid: I implemented encryption and decryption in place, i.e. a VFS page is encrypted prior to being written to the servers, so subsequent reading leads to... Yes, it reads encrypted content.
To fix this issue I plan to encrypt data into different pages and send them, leaving the VFS ones as is. There are two approaches I consider:

  • allocate and send pages at writeback time - we want to send 5 pages, so allocate 5 pages, encrypt data into them and broadcast them to all needed servers.
  • allocate (potentially large) pool of pages at mount time per crypto thread and encrypt data into them. This will have about zero run-time overhead for VFS, except slightly delayed because of encryption write completion.
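
The second approach can be sketched in userspace terms as below. This is only an illustration and not the POHMELFS code: `CryptoPagePool`, `send_encrypted()` and the XOR `toy_encrypt()` are hypothetical names, and XOR stands in for a real cipher. The point is that cached pages are never modified, so a later read sees plain data.

```python
PAGE_SIZE = 4096

def toy_encrypt(src, dst, key=0x5A):
    """Stand-in for a real cipher (assumption): XOR every byte of src into dst."""
    for i, b in enumerate(src):
        dst[i] = b ^ key

class CryptoPagePool:
    """Hypothetical per-thread pool of scratch pages, allocated once up front."""
    def __init__(self, npages):
        self.free = [bytearray(PAGE_SIZE) for _ in range(npages)]

    def get(self):
        # Raises IndexError when the pool is exhausted; a real implementation
        # would wait here until the server acknowledges earlier writes.
        return self.free.pop()

    def put(self, page):
        self.free.append(page)

def send_encrypted(cache_pages, pool):
    """Encrypt cache pages into scratch pages, leaving the cache untouched."""
    wire = []
    for page in cache_pages:
        scratch = pool.get()
        toy_encrypt(page, scratch)
        wire.append(scratch)
    return wire

cache = [bytearray(b"A" * PAGE_SIZE) for _ in range(5)]
pool = CryptoPagePool(8)
wire = send_encrypted(cache, pool)
```

Scratch pages would go back via `pool.put()` only after the write is acknowledged, which is why the pool size limits how much data can be in flight.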

/devel/fs :: Link / Comments (7)


Louis Maggio trumpet school: never smile.

/life :: Link / Comments (0)


Holy shit: kernel summit.

We would like to invite you to the 2008 Kernel summit, and we hope that you will be able to join us...
I'm trying to recall previous kernel summit:



That was fun, but no one wanted to play football instead of talking about whatever we talked about.

For that year I only committed a HIFN driver into the tree, and there was no kevent :)

This time in US, thinking...

/devel/other :: Link / Comments (5)


Tue, 01 Jul 2008

Why is blocking sending considered harmful?

I frequently hear that whatever server you implement, it has to be non-blocking, since in the case of parallel sending it allows one to send multiple requests to fast servers while not sending data to a slow server, since the non-blocking socket will return EAGAIN.

This is only a half-right solution: when we have to put given data on all servers, and can not free it until all servers have replied with an acknowledgement, non-blocking mode can bring more damage than gain.

Mainly because it allows all the memory to be eaten by requests which are still in the queue to be sent to the slow server, and which were already sent to the fast ones. In this case the higher-level application (consider a simple application which generates some data and writes it into a file in a distributed filesystem, which writes the file to several servers) will never block, since the transfer to fast servers completes quickly, and it will provide more and more data, which will consume all RAM.

It is possible to deadlock the system in this case, since to send some data to a remote server we always have to allocate at least some memory to put network headers into. With a non-blocking solution the system will consume all memory and kick itself into a coma.
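
A toy model of the effect, under loudly stated assumptions: the application produces one request per tick, the fast server acknowledges immediately, the slow server acknowledges one request every `slow_period` ticks, and a request is freed only after the slow acknowledgement. `simulate()` and its parameters are made up for this illustration; it only counts requests held in memory.

```python
def simulate(ticks, slow_period, max_in_flight=None):
    """Return the peak number of requests held in memory.

    max_in_flight=None models the writer that never blocks; an integer
    models a writer that stalls once that many requests are unacknowledged.
    """
    in_flight = 0
    peak = 0
    for t in range(1, ticks + 1):
        # The application submits one request per tick unless it is blocked.
        if max_in_flight is None or in_flight < max_in_flight:
            in_flight += 1
        # The slow server acknowledges one request every slow_period ticks.
        if t % slow_period == 0 and in_flight > 0:
            in_flight -= 1
        peak = max(peak, in_flight)
    return peak

unbounded_peak = simulate(1000, 10)
bounded_peak = simulate(1000, 10, max_in_flight=16)
```

In this model the never-blocking writer keeps accumulating requests for as long as it runs, while the blocking one is capped at its window, at the price of running at the speed of the slowest server.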

/devel/networking :: Link / Comments (2)


Passive OS fingerprinting.

I've updated the OSF modules to xtables, so you have to enable its support in the kernel config and get a recent iptables (I tested with 1.4.1.1, which is the latest release to date).

OSF allows you to match incoming packets against different sets of SYN-packet signatures and determine which system is on the remote end, so you can make decisions based on OS type and, to some degree, even version.

Installation instructions, an example and the source code can be found on the homepage.

I've also sent it to the netfilter-devel@ and netdev@ mailing lists, since my previous mails never appeared there, likely because of spam filters.

/devel/networking :: Link / Comments (0)


Mon, 30 Jun 2008

Filesystem development rumors.

Rumor number one. SWsoft aka Parallels actively searches for Linux kernel hackers in the leading Moscow universities, namely MSU and MIPT. I saw their posters, where among other (wanted) requirements there is distributed filesystem knowledge.

Rumor number two. Alexey Kuznetsov also works on some filesystem. If you do not know, he is the guy who wrote the major part of the Linux network stack, namely the TCP/UDP/IP and socket implementations, and although there were lots of changes in the stack since then, I think it will not be an exaggeration to call him the author. He also worked on Virtuozzo and OpenVZ (and their interesting VFS parts, which AFAICS are not in the kernel, maybe yet). The last time we 'confronted' each other was a couple of years ago, when I first implemented netchannels and tried to convince the network community (and namely Alexey Kuznetsov and David Miller) that the netchannel idea is worth further investigation and implementation. IIRC I did not succeed, although the results were very impressive.
Let's see what will happen with filesystems :)

Rumor number three. SWsoft recently started to actively search for a kernel hacker for a 'new interesting open source project'. They always searched for kernel programmers, but never told anything about the projects. Now something has changed.

Rumor number four. OpenVZ and Virtuozzo have serious problems with NFS (especially when the server dies), probably because of the very ugly NFS protocol (yes it is), so it's hard to properly virtualize it (or not?). There are no alternatives to NFS right now in major production setups, but you all know about POHMELFS, which right now can be used as a really good replacement.

Rumor number five. SWsoft has a long history of PhD defences (at least in MIPT) based on a theoretical FS called TorFS (namely Tormasov FileSystem). A year ago it was still not a very alive project in practice, but I heard that it was very impressive in theory. This rumor has existed for really many years.

So, I have a quite clear picture that SWsoft started development of a new distributed filesystem, which is aimed at first to replace NFS in virtualized environments. I can also imagine very interesting distributed parallel facilities needed for virtualized systems. And they try to attract lots of people to the project as well as really heavy artillery like Alexey Kuznetsov.

Which basically means that sooner or later my development will meet strong competition from this company, which has lots of really good professionals.
And that's very interesting and cool :)

P.S. or it may be a complete bullshit and delirium of my fevered consciousness.

And one fact about POHMELFS: today I finished client support for padded crypto processing of all requests and started to work out the server bits. I expect to finish it in a day or so, so a new release is very close.

/devel/fs :: Link / Comments (3)


Sat, 28 Jun 2008

Listened how my trumpet can sound.

It was really interesting. Although it is a very simple student model, a friend produced very good sounds. He has not practiced for many years already, but nevertheless it was not that bad.

My everyday half-hour to hour exercises usually produce a worse sound, although sometimes I do find really cool notes. Unfortunately I still do not know some magic bit about how to catch that sound, it is born and disappears on its own, but I'm sure I will find it, and I think I'm close to where it hides :)

/other :: Link / Comments (0)


Need to rethink POHMELFS crypto a bit.

1. Because of the encryption problem - data to be encrypted has to be blocksize aligned, so some information about padding has to be added into the network command as well as the crypto data size.

2. IV generation. I decided to extend the network command and put a 64 bit IV for the given packet there. Using a simple sequence number is enough to protect against a repeat message attack.

3. Encryption/hashing of data. I decided not to encrypt/hash network headers, and only do it for transmitted data. If a transaction contains several commands, data for all commands will be encrypted/hashed. In the case of a hash, a single digest/hmac will be generated and placed into the transaction header.

4. It is possible that I will add a strong header checksum, which will be generated only for the header and placed into a special field. It will be calculated assuming the checksum field is zero. This step is optional so far, but the network header has 32 reserved bits, which can be used for it.
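
Points 1, 2 and 4 can be sketched roughly like this. Everything here is an assumption made for illustration: the header layout, the 16-byte block size, zero padding, and CRC32 as a stand-in checksum (it is not a strong one) have nothing to do with the real POHMELFS wire format.

```python
import itertools
import struct
import zlib

BLOCK_SIZE = 16  # assumed cipher block size

# Hypothetical header: command, data size, padding, 32-bit checksum, 64-bit IV.
HEADER = struct.Struct("!HIHIQ")

def pad_to_block(data, block_size=BLOCK_SIZE):
    """Zero-pad data to a multiple of block_size; return (padded, padding)."""
    padding = (-len(data)) % block_size
    return data + b"\x00" * padding, padding

def build_header(cmd, size, padding, iv):
    """Pack the header; the checksum is computed with its own field zeroed."""
    zeroed = HEADER.pack(cmd, size, padding, 0, iv)
    csum = zlib.crc32(zeroed) & 0xFFFFFFFF
    return HEADER.pack(cmd, size, padding, csum, iv)

def verify_header(raw):
    """Recompute the checksum over the header with the checksum field zeroed."""
    cmd, size, padding, csum, iv = HEADER.unpack(raw)
    zeroed = HEADER.pack(cmd, size, padding, 0, iv)
    return csum == (zlib.crc32(zeroed) & 0xFFFFFFFF)

# A plain sequence number used as the per-packet IV.
sequence = itertools.count(1)

payload, padding = pad_to_block(b"x" * 21)
header = build_header(cmd=1, size=len(payload), padding=padding, iv=next(sequence))
```

The receiver strips `padding` bytes after decryption and rejects packets whose sequence number it has already seen.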

Right now hashing and encryption work, but the results are not checked on the server (although they are generated). Because of the crypto alignment ugliness I decided to rethink the approach a bit.
Evolution process in action...

/devel/fs :: Link / Comments (0)


Fri, 27 Jun 2008

0:3

That really sucked - yes, we played badly. Just like it was before. It is not really surprising.
But what was the fucking abnormal game a week ago against Holland? That was new, was cool, was bloody great, but not today. Tired or whatever... What's the difference right now, we lost.

Yes, Spain played really good, my congratulations.
But our team showed that it is possible.
That there is nothing impossible.
We can, when we want. You can, when you want.

Thanks a lot for the games!

/other :: Link / Comments (0)


Thu, 26 Jun 2008

POHMELFS server got initial crypto processing capabilities.

The POHMELFS server is able to handshake hash/cipher names and operation modes, to initialize the appropriate algorithms and to perform basic operations (like a more generic hash_update() instead of different functions with different arguments used to hash data depending on the operation mode, either simple digest or HMAC: EVP_DigestUpdate()/HMAC_Update()). I'm working on the right way of doing crypto processing without serious changes in the code, since how it is done right now is a bit hairy.
I already hate the OpenSSL API: EVP_get_cipherbyname(), EVP_MD_CTX, EVP_DigestFinal_ex(). It looks like the above functions were written by three different people who never actually talked to each other about how to make them look similar... But it is a minor issue of course.
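
The shape of such a generic wrapper, sketched in Python rather than against the OpenSSL C API (so this is not the server code, and `hash_init()`/`hash_final()` are names I made up next to `hash_update()`): the caller picks digest or HMAC mode once at init time and then uses the same calls for both.

```python
import hashlib
import hmac

def hash_init(name, key=None):
    """Return a context for a plain digest (key is None) or an HMAC."""
    if key is None:
        return hashlib.new(name)
    return hmac.new(key, digestmod=name)

def hash_update(ctx, data):
    """Feed data into the context, regardless of the operation mode."""
    ctx.update(data)

def hash_final(ctx):
    """Return the resulting digest or HMAC as bytes."""
    return ctx.digest()

def hash_chunks(name, chunks, key=None):
    """Hash a sequence of byte chunks with one code path for both modes."""
    ctx = hash_init(name, key)
    for chunk in chunks:
        hash_update(ctx, chunk)
    return hash_final(ctx)

plain = hash_chunks("sha1", [b"hello ", b"world"])
keyed = hash_chunks("sha1", [b"hello ", b"world"], key=b"secret")
```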

So, when things settle down, I will make a new release. Likely it will see the light this week.

/devel/fs :: Link / Comments (0)


Hacking your ISP for fun and profit.

My ISP blocked my account again and can not unblock it although there is money on the deposit. There are serious problems in its billing system, which requires manual intervention of the operator. Unfortunately it is a real challenge to call them, it already took more than half an hour yesterday, and without success.
So, I decided to implement an interesting idea on how to bypass its blocking.

It is based on the security 'hole' in its DNS configuration (and I think the vast majority of ISPs do the same), which allows one to request any DNS record even if the account is blocked. It will be fetched from a remote DNS server if there are no records in the ISP's cache.
Thus the attack vector becomes visible: implement an IP over DNS tunnel network device and set up local routing to use it by default. One has to control at least one remote machine which hosts DNS records for a given domain name, since it is required to parse incoming DNS requests and process them accordingly.

There are at least two known IP over DNS tunnel solutions: NSTX (howto) and OzymanDNS (howto). Both solutions require that you own some server to run the ip-over-dns tunnel server on. Unfortunately I have only a single machine with a static IP address which is not protected by lots of firewalls and allows incoming connections.

The simplest solution for this problem is to create an iptables input target rule for the server, which will parse incoming DNS requests, redirect usual queries up the network stack to the userspace server, and handle 'poisoned' queries as the tunnel.
The client can be TUN/TAP based, but can also be a tunnel network device.
I believe the more weird it looks, the more interesting it is, so likely I will think more about kernel based tunnels.

DNS queries are limited enough not to allow binary data (IIRC, the most interesting are DNS TXT records), but it can be appropriately encoded and enciphered. So, I will put it into the todo list.
I even think that it is not that bad an idea to have such modules in the kernel :)
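
One possible encoding of the client-to-server direction, as a sketch only: binary payload is base32-encoded and split into labels of a query name under a domain you control. This is not how NSTX or OzymanDNS do it; `encode_query()`, `decode_query()` and the `tunnel.example.com` domain are made up. The 63-character label and 253-character name limits come from the DNS specification.

```python
import base64

MAX_LABEL = 63   # maximum length of a single DNS label
MAX_NAME = 253   # maximum length of a textual DNS name

def encode_query(payload, domain):
    """Pack binary payload into the labels of a query name under domain."""
    if not payload:
        raise ValueError("empty payload")
    text = base64.b32encode(payload).decode("ascii").rstrip("=").lower()
    labels = [text[i:i + MAX_LABEL] for i in range(0, len(text), MAX_LABEL)]
    name = ".".join(labels + [domain])
    if len(name) > MAX_NAME:
        raise ValueError("payload too large for one DNS name")
    return name

def decode_query(name, domain):
    """Recover the payload from a name produced by encode_query()."""
    suffix = "." + domain
    if not name.endswith(suffix):
        raise ValueError("name is not under the tunnel domain")
    text = name[:-len(suffix)].replace(".", "").upper()
    text += "=" * (-len(text) % 8)  # restore the stripped base32 padding
    return base64.b32decode(text)

packet = bytes(range(100))
query = encode_query(packet, "tunnel.example.com")
```

Base32 is used because DNS names are case-insensitive, so base64 would not survive resolvers that change case. The reply direction would carry data in TXT records the same way.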

/devel/other :: Link / Comments (6)


Wed, 25 Jun 2008

POHMELFS input crypto processing engine is ready for testing.

But testing can not be done without appropriate server support, which is now the main task. POHMELFS uses a lazy crypto engine: each network state (it represents the connection between the client and one server) contains a number of fields used exclusively for semi-lockless input data processing (it locks the state when it performs the actual reading, but does not hold that lock when processing incoming messages, since it is the only path which receives data). Now it also has crypto information about how to manage reply messages (they include the read page reply, for example), so it does not queue work to be done by crypto threads, but does it itself instead. It may or may not be the bottleneck of the input path, tests will provide facts. So far I do not have plans to change it, but it can be done of course if performance sucks.

After I finish crypto processing in both the client (it has been written, but requires lots of testing with the server) and the server (I have just started to recall how to work with OpenSSL. Well, I've read how HMAC works in OpenSSL, found it to be simple enough and then started to read how to parse binary data in LISP :) But anything which is interesting for me now ends up in good results for all other projects), I will switch to something different for a while.
Some voices in the brain ask to be spread in lots of interesting directions :)

/devel/fs :: Link / Comments (0)


POHMELFS crypto performance.

I've run read/reread and write/rewrite tests as described in the previous run, now with HMAC(SHA1) of all outgoing transactions (note that reading response data is not yet encrypted and does not contain a digital signature, and the server does not support either operation). Essentially only writing should be affected by this, but I also ran reading tests for completeness.

Results show zero performance overhead of the full data SHA1 hashing, but note that quite fast machines were used (2 3 GHz Xeons (2 physical and 2 logical CPUs, HT enabled) with 1 GB of RAM). All the time only two crypto threads were actively hashing data, since there are only two pdflush threads on this machine.

Read Reread

Write Rewrite

Writing is even faster with hashing, but results drifted around, so essentially performance is the same.

/devel/fs :: Link / Comments (0)


Tue, 24 Jun 2008

VM gotcha: forbidden double kmapping.

I've just learned that it is impossible to map the same page twice: for example the first time using kmap()/kunmap() and the second time via kmap_atomic()/kunmap_atomic().
Although the mechanisms are a bit different in the two mappings, it is forbidden, and the system will panic like this:

IP: [] kmap_atomic_prot+0x1b/0xc5
*pdpt = 0000000031c79001 *pde = 0000000000000000 
Oops: 0000 [#1] SMP 

Pid: 6478, comm: pohmelfs-crypto Not tainted (2.6.25 #27)
EIP: 0060:[] EFLAGS: 00010202 CPU: 2
EIP is at kmap_atomic_prot+0x1b/0xc5
EAX: ebc7c000 EBX: 00000003 ECX: 00000000 EDX: 00000003
ESI: 00000fdc EDI: 00000163 EBP: 80000000 ESP: ebc7dee4
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Process pohmelfs-crypto (pid: 6478, ti=ebc7c000 task=f25040b0 task.ti=ebc7c000)
Stack: 00000000 00000003 00000fdc f7cf4078 00000fdc c0114144 00000163 80000000 
       c01991b1 ebc7df44 f70e3580 00000000 ebc7dfa8 ebc7df40 f70e3580 00000003 
       00000000 f7cf4000 f70e3580 f70ff8b0 f70ff880 f7096c00 c019a771 f70e3580 
Call Trace:
 [] kmap_atomic+0x11/0x14
 [] update2+0x7c/0x13f
 [] hmac_update+0x49/0x50
 [] pohmelfs_crypto_thread_func+0x304/0x3e8 [pohmelfs]
 [] hrtick_set+0x7a/0xd7
 [] autoremove_wake_function+0x0/0x2b
 [] pohmelfs_crypto_thread_func+0x0/0x3e8 [pohmelfs]
 [] kthread+0x38/0x5f
 [] kthread+0x0/0x5f
 [] kernel_thread_helper+0x7/0x10
This happened for exactly the above case, when the page was first mapped via kmap() in POHMELFS and then via kmap_atomic() in the HMAC crypto processing code.
I wonder what will happen if we ever try to send kmapped pages over an IPsec tunnel. Likely it will oops too...
This can happen for example when pages are mapped in tcp_sendpage() when calling sendfile() over an interface which does not support hardware checksumming and scatter-gather: mapped pages are pushed down the network stack where they will eventually be encrypted/hashed in IPsec, which will in turn call kmap_atomic().

So, if you find an obscure oops in kmap_atomic() and friends, first check that the calling stack did not map the page earlier.

/devel/other :: Link / Comments (0)

