Zbr's days.


Tue, 07 Nov 2006

I am ready new sick is kevent release.


No, I am sick and new kevent release is ready.
Actually not. Looking at how much I've eaten and how much I still want to eat, one can say that I'm definitely ok.

Nevertheless, new kevent release 'take23' is out.
There are several major changes in this release:

  • new ring buffer implementation in the process's memory. It works similarly to this description, but the structure has changed: I removed the user's index, so there is only the kernel one (i.e. the index where the kernel will put the next ready event). When the user calls kevent_get_events() or kevent_wait(), the requested number of kevents is copied into the ring buffer (if it was initialized using the
    int kevent_ring_init(int ctl_fd, struct ring_buffer *ring, unsigned int num);
    function) and the kernel's index is increased. Simple. Userspace should itself store its previous index (actually it can be calculated from the value returned by kevent_wait(), which is the number of events copied into the ring buffer, and the current kernel's index, but that is only correct with special locking between threads).
  • wakeup-one-thread flag. When several threads wait on the same kevent queue and have requested the same event, for example 'wake me up when a new client has connected, so I can call accept()', all of them are awakened when a new client connects, but only one of them can process the data. This is known as the 'thundering herd problem'. Events which have this flag set will not be marked as ready (and the corresponding threads will not be awakened) if at least one event has already been marked.
  • edge-triggered behaviour. It is an optimisation which allows a ready and dequeued (i.e. copied to userspace) event to be moved back into the set of interest for the given storage (socket, inode and so on). It is very useful for cases when the same event should be used many times (like reading from a pipe). It is similar to epoll()'s EPOLLET flag.

Eric Dumazet created a special benchmark to test epoll(): it creates a set of AF_INET sockets, and two threads simultaneously read and write data from/into them. I ported it to kevent (the ring_buffer.c application can be found in the archive).
Here are the results:
  epoll (no EPOLLET):                  57428 events/sec
  kevent (no ET):                      59794 events/sec
  epoll (with EPOLLET):                71000 events/sec
  kevent (with ET):                    78265 events/sec
  Maximum (busy loop reading events):  88482 events/sec

So, kevent works faster than epoll() (roughly 4% without ET and 10% with ET in the run above), and this has been confirmed by three different benchmarks: Eric's code (epoll_bench.c and the corresponding kevent_bench.c can be found in the archive), Johann Borck's web server, and my benchmarks presented on the kevent homepage.

Jeff Garzik joined the kevent supporting team today.
So there are serious people on the kevent side, including such heavyweights as David Miller (network core maintainer) and Jeff Garzik (network device drivers, SATA maintainer). Andrew Morton (I think you have heard this name) included kevent in his -mm tree several releases ago.

So, things develop from bad to probably not bad.
