Zbr's days.


Thu, 02 Aug 2007

Fixing bug in Linux network stack.

There was an interesting bug posted to netdev@ today by a user named John (the same bug had actually been posted several times before, but this time he included a simple application to trigger it). At the end of a connection, the last TCP segment could be sent with the wrong port number, as in the example below:

17:50:43.414212 IP 127.0.0.1.50000 > 127.0.0.1.10250: S 1312601602:1312601602(0) win 1500
17:50:43.452081 IP 127.0.0.1.10250 > 127.0.0.1.50000: S 864201221:864201221(0) ack 1312601603 win 32792 
17:50:43.414364 IP 127.0.0.1.50000 > 127.0.0.1.10250: . ack 1 win 1500
17:50:43.452649 IP 127.0.0.1.50000 > 127.0.0.1.10250: P 1:17(16) ack 1 win 1500
17:50:43.452666 IP 127.0.0.1.10250 > 127.0.0.1.50000: . ack 17 win 32792
17:50:43.452735 IP 127.0.0.1.50000 > 127.0.0.1.10250: R 1312601619:1312601619(0) win 1500
17:50:43.564760 IP 127.0.0.1.54076 > 127.0.0.1.50000: R 1:1(0) ack 17 win 32792
As you can see, the last RST segment carries the wrong source port number, 54076 instead of 10250. This does not actually break anything, but it is bad as it is, so I decided to spend the day helping the world instead of hacking the hash.
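To make the mismatch easy to spot mechanically, here is a small sketch that scans the trace above and flags any segment whose endpoints do not belong to the connection opened by the first segment. The regular expression and the `stray_segments` helper are my own illustration and only cover the tcpdump line format shown here.

```python
import re

# The trace from the post, copied verbatim (trailing spaces dropped).
TRACE = """\
17:50:43.414212 IP 127.0.0.1.50000 > 127.0.0.1.10250: S 1312601602:1312601602(0) win 1500
17:50:43.452081 IP 127.0.0.1.10250 > 127.0.0.1.50000: S 864201221:864201221(0) ack 1312601603 win 32792
17:50:43.414364 IP 127.0.0.1.50000 > 127.0.0.1.10250: . ack 1 win 1500
17:50:43.452649 IP 127.0.0.1.50000 > 127.0.0.1.10250: P 1:17(16) ack 1 win 1500
17:50:43.452666 IP 127.0.0.1.10250 > 127.0.0.1.50000: . ack 17 win 32792
17:50:43.452735 IP 127.0.0.1.50000 > 127.0.0.1.10250: R 1312601619:1312601619(0) win 1500
17:50:43.564760 IP 127.0.0.1.54076 > 127.0.0.1.50000: R 1:1(0) ack 17 win 32792
"""

# "IP <addr>.<port> > <addr>.<port>: <flags>"
LINE = re.compile(r"IP (\S+)\.(\d+) > (\S+)\.(\d+): (\S+)")


def stray_segments(trace):
    """Return (src, dst, flags) for segments outside the first connection."""
    conn = None
    stray = []
    for line in trace.splitlines():
        m = LINE.search(line)
        if not m:
            continue
        src = (m.group(1), int(m.group(2)))
        dst = (m.group(3), int(m.group(4)))
        if conn is None:
            # The first segment defines the connection's two endpoints.
            conn = frozenset((src, dst))
        if frozenset((src, dst)) != conn:
            stray.append((src, dst, m.group(5)))
    return stray


for src, dst, flags in stray_segments(TRACE):
    print("stray segment:", src, "->", dst, "flags", flags)
```

Only the final RST is reported, since its source port is 54076 rather than 10250.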
The number of tricks I used was more than enormous: debug printks all over the place, padding red zones to catch overflows, delayed operations, and even a total rename of the variable in question.
The bug is only 'detectable' when a socket is closed while it still has unread data, so that an RST should be sent (according to RFC 2525), which is quite a rare condition.
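The "close with unread data" condition is easy to reproduce from userspace. The sketch below is not John's test application (which is not shown here), just a minimal loopback experiment of my own: the server side waits until data has arrived, never reads it, and closes. On Linux I would expect the peer to see a connection reset rather than a clean end of stream; other stacks may behave differently, so treat the outcome as an assumption about the platform.

```python
import select
import socket


def close_with_unread_data(timeout=5.0):
    """Return "reset" if the client sees an RST, "eof" on a normal close."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))      # let the kernel pick a free port
    listener.listen(1)

    client = socket.create_connection(listener.getsockname(), timeout=timeout)
    server, _addr = listener.accept()
    try:
        client.sendall(b"x" * 16)        # 16 bytes, as in the trace
        # Wait until the data sits in the server's receive queue,
        # but deliberately never read it.
        readable, _, _ = select.select([server], [], [], timeout)
        if not readable:
            raise TimeoutError("data never reached the server socket")
        server.close()                   # unread data is pending at close time
        try:
            data = client.recv(1)
        except ConnectionResetError:
            return "reset"
        return "eof" if data == b"" else "data"
    finally:
        server.close()
        client.close()
        listener.close()


if __name__ == "__main__":
    print(close_with_unread_data())
```

Running tcpdump on the loopback interface while this executes shows the RST segment itself.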
Eventually I tracked it down to the fact that by the time the socket is being closed, it already contains the wrong port field. Work with the timers showed that in such a short-lived connection none of the three TCP timers fired, yet processing of the last RST in the connection above shows that the port number is still valid at that point.
I was stumped. How is it even possible that nothing happened, but something got broken?
At MIPT, especially in the physics laboratories, I was continuously told that there are no miracles, and I think that is true, so I audited every single use of the port field in the inet socket structure (surprisingly there are not that many, maybe several dozen) and eventually found that the inet_autobind() function can change that field. After a number of debug prints the problem was completely localized and fixed by a simple patch, which checks whether the socket is really alive and thus requires binding.
The problem described above can only happen for a semi-alive socket: one that has been partially released (namely, its port value has been freed) and thus smells bad, but can still be accessed from userspace (the socket reference itself has not been released), so that any subsequent send call could end up changing the port number by binding to a new port. My simple fix checks whether the socket is only partially alive, and if so, it does not allow sending (which is not allowed later in tcp_sendmsg() anyway) and does not perform autobinding.
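The idea behind the check can be shown with a toy model. This is plain Python, not kernel code and not the patch itself: `ToySocket`, `send_unguarded`, `send_guarded` and the `alive` flag are all names made up for illustration, and raising `BrokenPipeError` is just my choice for "sending not allowed". The only thing taken from the real code path is the behaviour described above: autobinding assigns a new port to a socket that currently has none.

```python
class ToySocket:
    """Toy stand-in for an inet socket. Not kernel code."""

    def __init__(self, port):
        self.port = port        # local port; 0 means "no port assigned"
        self.alive = True       # False once the socket is partially released

    def release_port(self):
        # Partial release: the port is given back, but userspace still
        # holds a reference and can call send on the object.
        self.port = 0
        self.alive = False

    def autobind(self, free_ports):
        # Like autobinding: only picks a new port if none is set.
        if self.port == 0:
            self.port = next(free_ports)


def send_unguarded(sock, free_ports):
    sock.autobind(free_ports)   # rebinds a half-dead socket to a new port
    return sock.port


def send_guarded(sock, free_ports):
    if not sock.alive:          # the guard: refuse to send, skip autobind
        raise BrokenPipeError("socket is being closed")
    sock.autobind(free_ports)
    return sock.port


if __name__ == "__main__":
    s = ToySocket(10250)
    s.release_port()
    print("unguarded send now uses port", send_unguarded(s, iter([54076])))
```

Without the guard, the half-released socket silently picks up a fresh port, which is how a segment can leave with a port the connection never used. With the guard, the send fails early and the port field stays untouched.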
That's all, but it sucked up about 6 hours. I do not even know whether that number is good or bad.
The bug submitter, John, reports that this bug exists even in the 2.4.0 kernel, and that it has been mentioned on the web multiple times, but that never pushed anyone to fix it.
Now it is gone. Even if my fix is not correct, I have provided enough information that a real fix should be simple.
