[En-Nut-Discussion] TCP sockets stuck in closing state

Harald Kipp harald.kipp at egnite.de
Mon May 11 20:31:33 CEST 2015


Hi Coleman,

On 07.05.2015 16:33, Coleman Brumley wrote:
> Based on my overnight testing, this has worked very well. To be honest, I
> don't know if it's resulted in incorrect TCP behavior, but I haven't noticed
> any negative side effects. But the code no longer leaks heap space, and that
> is what is important to me. I'd rather the TCP SM have to renegotiate than
> the board need to be reset because it has no remaining heap space.

Reading through this thread, I mostly see wild guesses and shots in the
dark. Some years ago there was trouble with short-lived connections at
high frequencies, and also with sending a large number of tiny segments.
We set up simple test cases to address these problems and were able to
fix them.

In the meantime many things have changed within the kernel and in the
TCP state machine, and one of the old monsters may have reappeared in a
new incarnation.

I'm currently working on a large application with all kinds of listening
and connecting TCP and UDP ports. Everything looks rock solid, running
for months without any problem or reboot. Therefore, as you can imagine,
this thread didn't really catch my attention. But its growing size and
age are starting to make it highly visible. :-) Still, not much
information has been provided that would tempt me to dive in. Could be
thread priority, could be HTTP, could be... followed by a number of
dubious hints about what else could be tried.

While working on my current app, I also experienced problems with
closing sockets. It looked to me as if the behavior of Windows has
changed since Windows 7: I rarely see FINs, but frequently RSTs instead.
As I'm using a hybrid Nut/OS version, something between 4.8 and the
trunk, I haven't been able to create proper patches, let alone test them
with the latest trunk or other versions on several platforms. In this
sense, this is not valuable information either, like several other
contributions to this thread. It just states: Yes, there is something
wrong somewhere.

The only thing I know so far is that you, Coleman, are using Nut/OS
4.8.7 on an AT91SAM7X256. 4.8 is actually a nice and well-maintained
version. I'd suggest 4.10, but the upgrade is not really required. Both
are known to work with the SAM7X.

So what to do? Simply this: Create a test case that is so simple and
easy to try that almost everyone can reproduce the problem without
spending more than half an hour. And simple means simple. Too often I
have asked for simple test cases and received too much. Every reference
to any function that is not directly related to the test must be
removed. I'm quite sure that this will get your problem fixed in a
fraction of the time this thread has already been running.
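
To illustrate the kind of simplicity meant here, the PC side of such a
test could be as small as the sketch below. It is not taken from any
existing test suite; the function name hammer() and its parameters are
made up for this example, and it assumes the board runs nothing but a
bare TCP listener on a known port. It opens and immediately closes a
number of short-lived connections and counts how many attempts fail.

```python
import socket


def hammer(host, port, count=100, timeout=2.0):
    """Open and immediately close `count` TCP connections.

    Returns the number of connection attempts that failed.
    The name and parameters are hypothetical, chosen for this sketch.
    """
    failures = 0
    for _ in range(count):
        try:
            # Connect, then close right away when the 'with' block ends.
            with socket.create_connection((host, port), timeout=timeout):
                pass
        except OSError:
            # Covers refused connections, timeouts and resets.
            failures += 1
    return failures
```

Calling, for example, hammer("192.168.1.100", 80) (address and port are
placeholders) while watching the free heap on the board should show
within minutes whether sockets get stuck, without any application code
involved.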

If you need someone to debug your full application, there are several
companies offering commercial support. If it's not a commercial
application, then put the source code on GitHub or elsewhere and let's
see whether it is useful enough that someone else needs it and is
willing to spend some time on it.

Regards,

Harald




