Zbr's days.


Sun, 31 Jul 2005

Yesterday was a cooking day.


I cooked Fergana-style Uzbek pilaw; it is a really tasty thing, although not all the ingredients were included.
Later WiJo and I drank tequila and ate my pilaw - it was a nice evening.

On the hacking side there was nothing major - DaveM found that AF_TLB needs memory barriers, plus D-cache flushing or cache colouring for virtually addressed caches. The latter already exists in the driver, but there are no barriers when the control page is updated.



Fri, 29 Jul 2005

AF_TLB performance.


I measured indirect af_tlb performance versus copying 1500 bytes - remapping a physical page took about 25-50% less time than copying 1500 bytes with memcpy().
Just after reboot, i.e. without anything in the cache, it was roughly 11-15 times faster.
The CPU is a Xeon with HT enabled:

cpu family      : 15
model           : 2
model name      : Intel(R) Xeon(TM) CPU 2.40GHz
stepping        : 7
cpu MHz         : 800.384
1.
packet_mmap_test: 1000 remaps took 1495 usec.
packet_mmap_test: 1000 copyings took 1988 usec.
2.
packet_mmap_test: 1000 remaps took 1406 usec.
packet_mmap_test: 1000 copyings took 2613 usec.
3. And just after reboot, when there is nothing in cache:
packet_mmap_test: 1000 remaps took 1387 usec.
packet_mmap_test: 1000 copyings took 20173 usec.
4. Yet another "just after reboot":
packet_mmap_test: 1000 remaps took 1295 usec.
packet_mmap_test: 1000 copyings took 14889 usec.

The copying above uses an arbitrary kernel virtual address as the source address, advanced by PAGE_SIZE before each memcpy().
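To make the percentages easier to check, here is a small sketch that only redoes the arithmetic on the four runs listed above. The function names are made up for illustration; the timings are copied straight from the packet_mmap_test output.

```python
# Timings copied from the four packet_mmap_test runs above
# (usec per 1000 operations): (label, remap, copy).
RUNS = [
    ("warm cache, run 1", 1495, 1988),
    ("warm cache, run 2", 1406, 2613),
    ("after reboot, run 3", 1387, 20173),
    ("after reboot, run 4", 1295, 14889),
]


def time_saved_percent(remap_usec, copy_usec):
    """How much less time remapping took, as a percentage of the copy time."""
    return 100.0 * (copy_usec - remap_usec) / copy_usec


def speedup(remap_usec, copy_usec):
    """How many times faster remapping was than copying."""
    return copy_usec / remap_usec


def summarize(runs=RUNS):
    """Return (label, percent saved, speedup) for every run, rounded to one digit."""
    return [
        (label, round(time_saved_percent(r, c), 1), round(speedup(r, c), 1))
        for label, r, c in runs
    ]


if __name__ == "__main__":
    for label, saved, ratio in summarize():
        print(f"{label}: remap took {saved}% less time ({ratio}x faster)")
```

For the warm-cache runs this gives roughly 25% and 46% less time, and for the two runs just after reboot roughly 14.5 and 11.5 times faster.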

Created a patch for netlink subscription.
Subscription is a netlink socket option which, if enabled, results in direct message delivery instead of multicast delivery.
I.e. a socket with this option enabled will only receive data directed exactly to the group specified at bind() time; it will not receive any multicast traffic.
This option also allows 2^32 different groups for a given socket.
This idea was originally implemented in connector.
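The difference between the two delivery modes can be shown with a toy model. This is not the patch and not kernel code: the Sock class, the subscribed flag and both deliver functions are hypothetical names used only for illustration. The one assumption taken from classic netlink is that multicast membership is a 32-bit mask, which limits it to groups 1..32.

```python
class Sock:
    """Toy netlink-like socket (hypothetical model, not a kernel structure)."""

    def __init__(self, name, groups, subscribed=False):
        self.name = name
        # Bitmask of groups for multicast sockets,
        # exact group number for subscribed sockets.
        self.groups = groups
        self.subscribed = subscribed
        self.inbox = []


def deliver_multicast(socks, group, msg):
    """Classic delivery: group is 1..32 and selects one bit of each socket's mask."""
    if not 1 <= group <= 32:
        raise ValueError("a 32-bit mask can only express groups 1..32")
    bit = 1 << (group - 1)
    for s in socks:
        if not s.subscribed and s.groups & bit:
            s.inbox.append(msg)


def deliver_direct(socks, group, msg):
    """Subscription delivery: only sockets bound to exactly this group get the message."""
    for s in socks:
        if s.subscribed and s.groups == group:
            s.inbox.append(msg)


if __name__ == "__main__":
    a = Sock("a", 0b101)                     # member of multicast groups 1 and 3
    b = Sock("b", 100000, subscribed=True)   # bound to one exact group number
    deliver_multicast([a, b], 3, "multicast to group 3")
    deliver_direct([a, b], 100000, "direct to group 100000")
    print(a.inbox, b.inbox)
```

In this model the subscribed socket never sees multicast traffic, and its group number is a plain 32-bit value rather than a bit position, which is where the 2^32 groups come from.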

This use case is the most common one in kernel event notifications, and most of the userspace -> kernelspace messages can be converted to it too.



Thu, 28 Jul 2005

Yesterday's climbing was great!


I need to find some new routes - the old ones are already finished, although the instructors frequently remove or replace a couple of holds to add some complexity to a route.

Updated the af_tlb sources in the archive and announced them on netdev@ again.



Tue, 26 Jul 2005

Netlink connector discussion.


The fundamental problem that the connector is trying to solve:

1) provide more 'groups' (to transport more different kinds of events)
2) provide an abstract API for other kernel code, so it doesn't have to know anything about skb's or networking.

It was decided to postpone merging connector until Patrick McHardy finishes his netlink work, so that number 1 above will be resolved. After that the networking people will raise this question again and decide whether connector can work properly with it or not.

AF_TLB hacking - it now remaps all the data it receives, although sometimes it is too slow. Current tests were done using a 16-page buffer, so only 16 packets can live there at a time, and it crashes on exit - my page table manipulations are not so innocent...
I've released a new version of the AF_TLB sniffer - it crashes on exit and probably has other bugs, but it is far ahead of what was announced the first time.



Mon, 25 Jul 2005

Climbing a lot today.


It was nice - but the instructors removed several holds from my favorite route, so it becomes more interesting...

AF_TLB hacking - some PTE manipulation. I can not use the exported unmapping function, so I wrote the PTE unmapping code myself. It works, but after some time the unmapped address becomes wrong; it looks like the address is taken from a different VMA tree. I will investigate it further tomorrow.

New discussion about the netlink connector on netdev@ and linux-kernel@ - I hope it is the last one. I try to explain that connector is not only a simple netlink wrapper, but also a kind of message bus, and a very convenient one to use: connector users do not care about skb allocation/handling/freeing/queueing and so on...
One only needs to register a callback and that is all. In its first releases it was called ioctl-ng, since it allows very convenient usage.
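The "register a callback and that is all" idea can be sketched as a toy message bus. This is a model of the concept only, not the connector API: the Bus class, its method names and the use of an (idx, val) pair as the identifier are assumptions made for this sketch.

```python
class Bus:
    """Toy message bus: users register a callback per id and never touch the transport."""

    def __init__(self):
        self._callbacks = {}

    def add_callback(self, idx, val, callback):
        """Register a callback for the (idx, val) identifier."""
        key = (idx, val)
        if key in self._callbacks:
            raise KeyError(f"callback for {key} is already registered")
        self._callbacks[key] = callback

    def del_callback(self, idx, val):
        """Remove a callback; removing an unknown id is silently ignored."""
        self._callbacks.pop((idx, val), None)

    def dispatch(self, idx, val, payload):
        """Stand-in for the receive path: look up the id and hand over the payload."""
        callback = self._callbacks.get((idx, val))
        if callback is None:
            return False
        callback(payload)
        return True


if __name__ == "__main__":
    bus = Bus()
    bus.add_callback(1, 2, lambda data: print("got", data))
    bus.dispatch(1, 2, b"event")
```

The user code here is only the callback; everything that would correspond to buffer allocation, queueing and freeing stays hidden behind dispatch().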



Sun, 24 Jul 2005

I've bought floor covering - my loft becomes nicer and nicer,

although it required quite a lot of my technical education to cover the whole area, with the furniture in it, using only two hands...



Sat, 23 Jul 2005

Working with AF_TLB.

It does crash as frequently as before, so I can not say it is stable. It happens on remapping after an skb was freed and its page becomes available for the next mapping.
But it still does not read the right data from the mapped area - even with 512 mapped pages for skbs. I need to investigate the Linux VM further...
Huh, after adding some debugging to mm/memory.c the AF_TLB sniffer can already dump much more data than before. I am surprised my debugging works at that layer - now development will go much faster.

I see how rarely people visit my page since I posted the acrypto future roadmap, where I put my pessimistic forecast for asynchronous crypto hardware in general and acrypto in particular.
I do think Herbert Xu will comment on it after OLS, since he is the Linux crypto maintainer. I am also waiting until people publish their proceedings and slides from Netconf 2005 - there was an asynchronous crypto processing talk by Herbert Xu.



Thu, 21 Jul 2005

Working on remapping sniffer.


Problems are still there - it is not as fast as it should be, it also crashes under load, and its data is completely unreliable.
For example, one can read MAC addresses from the page, and the next read will be from an empty page or from a SLAB redzone - i.e. 6a 6a 6a 6a or similar.
But that is not the main issue with the AF_TLB sniffer - the most unpleasant thing is that it does not remap at the speed of the flow, i.e. the kernel part looks like it is doing its job, but userspace never gets useful data from the remapped pages.

Blosxom has one unpleasant thing - it does not allow writing blog entries for past days. It takes the date info from the file's metadata, so I need to write some application to fix up old entries.



Wed, 20 Jul 2005

This is a test from the future...



Mon, 18 Jul 2005

Work...


Fixed several old bugs in my projects at work - I feel too lame for having them. Somehow those things were not designed for asynchronous processing on SMP machines; as usual, all the uninteresting things are done quite badly, so after a long time I need to fix them. Now I can state that the current releases are officially bug free!
Unfortunately, my bosses do not agree - naturally...

Added an offset check to the userspace part of the AF_TLB sniffer - it works better, but still has some quirks. I think I need to rethink some issues with queueing - the sniffer loses too many frames, and that is related not to remapping speed but to queue handling. I will probably address these issues tomorrow.

The fast internet connection is not set up yet. The firm has not even called me.
I also found out that the floor covering can be delivered to my house, so in a couple of days I will order it and begin to lay it. It will take some time, but it is definitely worth it.



Sun, 17 Jul 2005

Moved things to new loft - finally all is here.

The bag with the books weighs more than all the other things together, even with the old CRT monitor, I think. Sold my old car - finally no stress about it.
Taking some steps towards an internet connection here - GPRS sucks. I think next week I will have a fast link at home, so I can start doing interesting things at home - unsynchronized archives all over the place are quite a bad thing.

Tomorrow I will have many things to do concerning my work, but I plan to find some time to clean up the userspace side of the zero-copy sniffer, so it does not read the same skb several times, and to run several tests comparing copying of a 1500-byte MTU versus page remapping.



Fri, 15 Jul 2005

AF_TLB - zero-copy sniffer update.


I've created a new archive with the sources for the new zero-copy sniffer.
The current version has many cleanups, enhancements and bug fixes. The archive consists of the following files:
af_tlb.[ch] - kernel side sniffer implementation.
tlb_test.c - userspace "sniffer".
Makefile - builds the kernel side with the "all" target and userspace with the "test" target.

Enjoy.



Thu, 14 Jul 2005

Network sniffer.

Published the first sources to netdev@ - it basically works, but has too many bugs and quirks, so I can only call it proof-of-concept code.
It acts as a packet socket, i.e. it gets all packets using prot_hook.func(), but never copies them.

The basic idea behind zero-copy is remapping the physical pages where skb->data lives into the userspace process.

According to my tests, which can be found commented in the code (packet_mmap()), remapping one page is from 5 up to 20 times faster than copying the same amount of data (i.e. PAGE_SIZE).

Since the current VM code requires the PTE to be unmapped when remapping, but only exports unmap_mapping_range() and __flush_tlb(), I used them, although they are quite heavy monsters.
It also requires mm->mmap_sem to be held, so I placed the main remapping code into a workqueue.

skbs are queued in prot_hook.func() and then the workqueue is scheduled, where each skb is unlinked and remapped. The skb is not freed there, as it normally should be, since userspace would then never find the real data. Instead, some smart algorithm should be devised to defer skb freeing, or it could simply be deferred using a timer and a redefined skb destructor. It should also remap several skbs at once, so that rescheduling does not happen very frequently.
The first mapped page is an information page, which holds the offset of skb->data within its page, so userspace can tell where the actual data lives on the next page.
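The userspace side of this layout can be modelled with a plain byte buffer. This is only a sketch: the 4096-byte page size, the offset being stored as a native 32-bit integer at the very start of the information page, and the packet length being known to the reader are all assumptions, not details taken from af_tlb.

```python
import struct

PAGE_SIZE = 4096  # assumed page size


def build_mapping(packet, offset):
    """Build a two-page buffer: information page first, then the page holding the packet."""
    if offset < 0 or offset + len(packet) > PAGE_SIZE:
        raise ValueError("packet does not fit in one page at this offset")
    buf = bytearray(2 * PAGE_SIZE)
    # Assumed layout: the in-page offset of the data sits at byte 0 of the first page.
    struct.pack_into("=I", buf, 0, offset)
    start = PAGE_SIZE + offset
    buf[start:start + len(packet)] = packet
    return buf


def read_packet(buf, length):
    """What a reader would do: fetch the offset, then read from the next page."""
    (offset,) = struct.unpack_from("=I", buf, 0)
    start = PAGE_SIZE + offset
    return bytes(buf[start:start + length])


if __name__ == "__main__":
    frame = bytes.fromhex("ffffffffffff001122334455") + b"payload"
    mapping = build_mapping(frame, offset=66)
    print(read_packet(mapping, len(frame)) == frame)
```

The point of the information page is that the data does not start at the beginning of the mapped page, so the reader needs the offset before it can find the frame.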

Such a scheme is very suitable for applications that do not require the whole data flow, but only select some data from the flow, based on packet content.
I'm quite sure it will be slower than copying for small packets, so these two approaches must be combined to achieve the maximum sniffer performance.
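One way to picture the combination is a per-packet choice based on a break-even size. The sketch below is a rough estimate only: it assumes copy cost grows linearly with packet size and remap cost is fixed per packet, and it plugs in the first warm-cache measurement from the 29 Jul entry (1000 remaps in 1495 usec, 1000 copies of 1500 bytes in 1988 usec). The function names are hypothetical, and a cold cache would shift the threshold a lot.

```python
def break_even_bytes(remap_usec, copy_usec, copy_bytes=1500):
    """Packet size at which a fixed-cost remap equals a linear-cost copy."""
    copy_usec_per_byte = copy_usec / copy_bytes
    return remap_usec / copy_usec_per_byte


def choose_path(packet_len, threshold):
    """Copy small packets, remap large ones."""
    return "copy" if packet_len < threshold else "remap"


if __name__ == "__main__":
    threshold = break_even_bytes(1495, 1988)
    for size in (64, 512, 1500):
        print(size, choose_path(size, threshold))
```

With those numbers the estimated threshold lands a bit above 1100 bytes, so under this simple model only near-MTU packets would be worth remapping on a warm cache.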



Tue, 12 Jul 2005

New diary created.

Doing some CSS manipulations and playing with HTML tables - I think it is getting better... CSS is not as complex as I thought, although I got my file from DaveM's blog.
