Fri, 10 Oct 2008
How to get back 100 MB/s in several clicks or fixing tbench regression for fun.
It was reported recently that tbench has a long history of regressions,
starting at least from the 2.6.23 kernel. I verified that in my test
environment tbench 'lost' more than 100 MB/s, from 470 down to 355 MB/s
for 8 threads, between at least 2.6.24 and 2.6.27. The 2.6.26-2.6.27
performance regression on my machines roughly corresponds to a drop from
375 to 355 MB/s.
I spent several days (please do not think that I'm bored and have
nothing to do: there are really interesting things to work on,
but since I had already started...) on various tests and bisections
(unfortunately bisect cannot always point to the 'right' commit), and
found the following problems.
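For readers who have not run a bisection before, here is a minimal sketch of the idea in Python. The commit names, the throughput numbers and the threshold are all invented for illustration, and the sketch assumes that once a commit is 'bad' every later one is bad too. With noisy benchmarks that assumption can fail, which is one reason bisect does not always land on the 'right' commit.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Binary search for the first bad commit.

    Assumes commits[0] is good, commits[-1] is bad, and that every
    commit after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


# Hypothetical per-commit throughput in MB/s (not measurements from this post).
throughput = {"c0": 100, "c1": 101, "c2": 99, "c3": 80, "c4": 81, "c5": 79}
commits = list(throughput)

# A commit counts as 'bad' when it falls below an arbitrary threshold.
culprit = first_bad(commits, lambda c: throughput[c] < 90)
print(culprit)
```

In a real run the is_bad callback would build and boot the kernel and run tbench, which is where the hours go.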
First, related to the network, as lots of people expected: TSO/GSO over
loopback with the tbench workload eats about 5-10 MB/s, since the TSO/GSO
frame creation overhead is not paid back by the gains from optimized
super-frame processing. Since it brings a really impressive improvement for
big-packet workloads, it was (likely) decided not to add a patch for this;
instead one can disable TSO/GSO via ethtool. The patch that enabled it was
added in the 2.6.27 window, so it has its part in that regression.
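Disabling TSO/GSO via ethtool could look roughly like the sketch below. It only builds the command line and prints it; actually applying it needs root. The interface name lo and the helper names are my assumptions, while -K with tso off and gso off are the usual ethtool offload switches.

```python
import subprocess


def offload_off_cmd(iface="lo", features=("tso", "gso")):
    """Build the ethtool argv that switches the given offloads off."""
    cmd = ["ethtool", "-K", iface]
    for feature in features:
        cmd += [feature, "off"]
    return cmd


def disable_offloads(iface="lo"):
    """Run the command (needs root; raises if ethtool fails)."""
    subprocess.run(offload_off_cmd(iface), check=True)


# Print the command instead of running it, so the sketch is safe to try.
print(" ".join(offload_off_cmd()))
```

Running `ethtool -k lo` (lowercase k) afterwards shows the current offload settings, so the change can be checked before starting tbench again.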
The second part of the 26-27 window regression (as a reminder, it is about 20
MB/s) is related to the scheduler changes, which was expected by another
group of people. I tracked it down to commit
a7be37ac8e1565e00880531f4e2aff421a21c803, which, when reverted, returns
2.6.27 tbench performance to the highest (for 2.6.26-2.6.27) 365 MB/s mark.
I also tested the tree stopped at the above commit itself, i.e. not 2.6.27,
and got 373 MB/s, so likely other changes in that merge ate a couple of megs.
A curious reader can ask: where did we lose the other 100 MB/s? This small
issue was not detected (or at least not reported to netdev@ with a provocative
enough subject), and it happened to live somewhere in the 2.6.24-2.6.25 changes.
I was lucky enough to 'guess' (just after a couple of hundred compilations)
that it corresponds to commit 8f4d37ec073c17e2d4aa8851df5837d798606d6f about
high-resolution timers. I sent a patch, based on a revert of the above commit,
to the mailing lists and developers; unfortunately it is impossible to cleanly
revert it in 2.6.25, not to mention the 2.6.27 tree. That patch brings
performance of the 2.6.25 kernel tree to 455 MB/s.
There are still about 20 MB/s missing, since 2.6.24 has 475 MB/s, so the
bug likely lives between 2.6.24 and the above 8f4d37ec073 commit, but this
exercise I left to Ingo and Peter :)
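To keep the bookkeeping straight, the numbers quoted above can be tallied in a few lines of Python. The figures are the ones from this post (using the 475 MB/s value for 2.6.24); the dictionary keys and variable names are only labels I made up for the sketch.

```python
# tbench throughput in MB/s for 8 threads, as quoted in the post.
measured = {
    "2.6.24": 475,
    "2.6.25 + hrtimer revert patch": 455,
    "2.6.26": 375,
    "2.6.27": 355,
    "2.6.27 + a7be37ac reverted": 365,
}

# Overall drop from 2.6.24 to 2.6.27.
total_loss = measured["2.6.24"] - measured["2.6.27"]

# Drop inside the 2.6.26-2.6.27 window.
window_26_27 = measured["2.6.26"] - measured["2.6.27"]

# What reverting the scheduler commit gives back on 2.6.27.
sched_revert_gain = measured["2.6.27 + a7be37ac reverted"] - measured["2.6.27"]

# What is still unexplained in 2.6.25 even with the hrtimer revert patch.
still_missing_2_6_25 = measured["2.6.24"] - measured["2.6.25 + hrtimer revert patch"]

print(total_loss, window_26_27, sched_revert_gain, still_missing_2_6_25)
```

The scheduler revert recovers about half of the 26-27 window, and the 5-10 MB/s attributed to TSO/GSO over loopback is in the right range for the rest.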
Sigh, it is past 3 A.M. in Moscow. I think if I were on stronger drugs than Linux
kernel hacking, all my organs would have run away from me long ago...