Zbr's days.

Wed, 04 Jun 2008

Optimized POHMELFS transactions.

Now they eat less memory, and a single writing transaction can accumulate up to 1024 pages. This can be tuned further, especially for small requests mixed with sync. Currently a writing transaction is allocated for its maximum size, and page pointers are then written into the allocated area, so if the number of dirty pages requiring writeback is small, quite a lot of space is wasted.
That is a task for the next optimization. Nevertheless, sequential writing is currently limited only by disk throughput, or by network bandwidth in the case of multiple servers: the client's link is shared between the servers, so the effective bandwidth becomes GigE divided by the number of servers, or about 60 MB/s in my environment with two servers and a single client.
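
The two estimates above can be checked with a little arithmetic. This is only a rough sketch: the nominal GigE line rate ignores protocol overhead, and the 8-byte pointer size is an assumption (a 64-bit kernel) rather than something taken from the POHMELFS code. The function names are made up for illustration.

```python
# Back-of-envelope estimates; constants marked "assumption" are not from POHMELFS.
GIGE_BITS_PER_SEC = 1_000_000_000   # nominal Gigabit Ethernet line rate
MAX_PAGES_PER_TRANSACTION = 1024    # limit mentioned in the post
POINTER_SIZE_BYTES = 8              # assumption: 64-bit kernel pointers


def per_server_bandwidth_mb(num_servers: int) -> float:
    """Nominal share of one GigE client link per server, in MB/s (1 MB = 10**6 bytes)."""
    if num_servers < 1:
        raise ValueError("need at least one server")
    return GIGE_BITS_PER_SEC / 8 / 1_000_000 / num_servers


def wasted_pointer_bytes(dirty_pages: int) -> int:
    """Bytes of unused pointer slots when the area is sized for the maximum."""
    if not 0 <= dirty_pages <= MAX_PAGES_PER_TRANSACTION:
        raise ValueError("dirty_pages must be between 0 and the maximum")
    return (MAX_PAGES_PER_TRANSACTION - dirty_pages) * POINTER_SIZE_BYTES


if __name__ == "__main__":
    print(per_server_bandwidth_mb(2))   # 62.5, close to the observed ~60 MB/s
    print(wasted_pointer_bytes(16))     # 8064 bytes unused for a 16-page writeback
```

With two servers the nominal share is 62.5 MB/s, which is in line with the roughly 60 MB/s seen in practice once protocol overhead is taken into account.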

Also, the reading path was not changed at all (only transaction internals): there is still no readahead, and a new transaction is allocated for each page to be read. Nevertheless, see how reading improved: POHMELFS not only outperformed NFS again, but already reached the disk bandwidth limit for 16 KB requests (almost two times faster than NFS). The table shows IO throughput in KB/s.

                                                    random  random
      KB  reclen   write rewrite    read    reread    read   write
 8388608       8   74058   68392    40130    79509   43588    4818
 8388608      16   62332   66978    73714   122074   42160    8434
 8388608      32   64775   67073   109357   171139  145416   14183
 8388608      64   66962   66602   147350   217323  227962   22257
 8388608     128   67724   67133   185574   266855  321060   32681
 8388608     256   68233   67922   201591   283567  474657   40944
 8388608     512   68339   66514   213513   295995  646897   50303
 8388608    1024   67744   67384   220858   297748  676582   48796
I will create nice graphs out of these tables and will also include optimized reading tests (likely tomorrow) and results for two data servers.
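
As a starting point for graphing, the table can be loaded into a list of records. This is a minimal sketch, not the script actually used for the graphs; the column labels `random_read` and `random_write` are my own names for the two "random" columns, and the rows are copied from the table above.

```python
# Column labels are illustrative; the last two name the "random" columns.
COLUMNS = ("file_kb", "reclen_kb", "write", "rewrite", "read",
           "reread", "random_read", "random_write")

TABLE = """\
 8388608       8   74058   68392    40130    79509   43588    4818
 8388608      16   62332   66978    73714   122074   42160    8434
 8388608      32   64775   67073   109357   171139  145416   14183
 8388608      64   66962   66602   147350   217323  227962   22257
 8388608     128   67724   67133   185574   266855  321060   32681
 8388608     256   68233   67922   201591   283567  474657   40944
 8388608     512   68339   66514   213513   295995  646897   50303
 8388608    1024   67744   67384   220858   297748  676582   48796
"""


def parse_rows(text):
    """Turn whitespace-separated numeric rows into dicts keyed by COLUMNS."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip header lines, blank lines, and anything with the wrong shape.
        if len(fields) != len(COLUMNS) or not all(f.isdigit() for f in fields):
            continue
        rows.append(dict(zip(COLUMNS, map(int, fields))))
    return rows


def to_mb_per_sec(kb_per_sec):
    """Convert KB/s to MB/s, taking 1 MB = 1024 KB."""
    return kb_per_sec / 1024


rows = parse_rows(TABLE)
for row in rows:
    print(f"{row['reclen_kb']:>5} KB  "
          f"read {to_mb_per_sec(row['read']):7.1f} MB/s  "
          f"write {to_mb_per_sec(row['write']):7.1f} MB/s")
```

Each record can then be fed to whatever plotting tool is at hand, with `reclen_kb` on the x axis.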

What should also be done is testing with either bigger files or a smaller amount of RAM, and thus a smaller VFS cache. As you saw in all the tests, when lots of reads start to hit the cache, the picture becomes completely uninformative about filesystem behaviour. So I want to limit all three testing machines to 1 GB of RAM (booting with the mem=1G parameter) and run the same iozone benchmark on an 8 GB file. Results should be more realistic.
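
To see why the smaller RAM limit helps, here is a rough upper bound on how much of the test file could be served from cache. It is a sketch only: it assumes all RAM is available to the page cache, which overstates the real figure because the kernel and other processes need memory too. The function name is hypothetical.

```python
GIB = 1024 ** 3


def max_cached_fraction(file_bytes: int, ram_bytes: int) -> float:
    """Upper bound on the share of a file that fits in RAM at once."""
    if file_bytes <= 0 or ram_bytes < 0:
        raise ValueError("file size must be positive and RAM non-negative")
    return min(1.0, ram_bytes / file_bytes)


if __name__ == "__main__":
    # 8 GB file on a machine limited to 1 GB of RAM
    print(max_cached_fraction(8 * GIB, 1 * GIB))   # 0.125
```

So with mem=1G at most one eighth of the 8 GB file can be cached at any moment, and most reads have to go to the disk or the network.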

In parallel I will implement a userspace command for run-time server addition and removal, which will also be used as-is for network messages from one or another previously connected server. Together with optimized reading transactions it will be a good basis for the next POHMELFS release. I plan to schedule it for Thursday or the middle of next week, since I will be on a small vacation June 6-9.
