[En-Nut-Discussion] Thread stops executing after some time.
Henrik Maier
hmlists at focus-sw.com
Fri Apr 4 03:36:56 CEST 2008
Interesting results, in particular that all stalled sockets show so_retran_time=0 together with a pending send buffer (nbq <> 0). That indicates there is something to be sent, but no retransmission is occurring.
Maybe there is an issue with Nut/OS' re-transmission management?
Nut/OS uses a 32-bit ms timer as its time base. However, the TCP state machine uses only a 16-bit ms time base for time-out management. Every 65536 ms the lower 16 bits of NutGetMillis are zero. This rollover happens about once a minute (roughly every 65.5 s), which is quite frequent.
When a packet is sent for the first time in NutTcpOutput, the re-transmission time is set to the lower 16 bits of NutGetMillis by the following statement:
sock->so_retran_time = (u_short) NutGetMillis();
If this happens exactly at rollover time, then sock->so_retran_time is set to 0.
Subsequently that packet will never be re-transmitted because of this if-clause in NutTcpSm:
if (sock->so_tx_nbq && sock->so_retran_time) {
    ...
    NutTcpStateRetranTimeout(sock);
    ...
}
As a result, this socket will stay in SYN_SENT state forever and never time out, because the packet is never re-transmitted.
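To make the failure mode concrete, here is a minimal sketch of the two steps described above: the 16-bit truncation of the millisecond counter and the guard that skips the time-out check. The type demo_sock_t and the functions demo_stamp and demo_retran_checked are hypothetical stand-ins for illustration only. They are not the real Nut/OS TCPSOCKET or code, and the send queue is reduced to a simple flag.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two socket fields discussed above.
 * This is NOT the real Nut/OS TCPSOCKET structure. */
typedef struct {
    uint16_t so_retran_time; /* 0 doubles as "retransmission timer not running" */
    int tx_pending;          /* non-zero while unacknowledged data is queued */
} demo_sock_t;

/* Mimics the assignment in NutTcpOutput: keep only the lower 16 bits
 * of the 32-bit millisecond counter. */
static uint16_t demo_stamp(uint32_t millis)
{
    return (uint16_t) millis;
}

/* Mimics the guard in NutTcpSm: the retransmission time-out is only
 * evaluated when data is pending AND the time stamp is non-zero. */
static int demo_retran_checked(const demo_sock_t *sock)
{
    return sock->tx_pending && sock->so_retran_time;
}
```

With this sketch, a time stamp taken at any multiple of 65536 ms truncates to 0, so demo_retran_checked reports that the time-out is not evaluated even though data is pending. A stamp taken one millisecond later behaves normally.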
Erik, I suggest changing, in the Nut/OS file net\tcpout.c (around line 336), the statement:
sock->so_retran_time = (u_short) NutGetMillis();
to
sock->so_retran_time = (u_short) NutGetMillis();
if (sock->so_retran_time == 0)
    sock->so_retran_time = 1; // so_retran_time must not be 0 which is a magic value!
Then recompile Nut/OS and see if that resolves your issue.
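The same idea can be written as a small helper, which makes it easy to check on a PC before rebuilding the target. This is only a sketch: retran_stamp is a hypothetical name, not an existing Nut/OS function, and it takes the millisecond value as a parameter instead of calling NutGetMillis so that it can be tested in isolation.

```c
#include <stdint.h>

/* Hypothetical helper, not part of Nut/OS: returns the lower 16 bits of
 * the millisecond counter, but never 0, because 0 is the magic value
 * meaning "no retransmission timer running". */
static uint16_t retran_stamp(uint32_t millis)
{
    uint16_t t = (uint16_t) millis;

    /* Map the rollover value 0 to 1; the time stamp is off by 1 ms then. */
    return t ? t : 1;
}
```

The 1 ms error introduced at rollover should be negligible compared to a retransmission time-out of so_rtto = 1000 as seen in your printout.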
Regards
Henrik
http://www.proconx.com
> -----Original Message-----
> From: en-nut-discussion-bounces at egnite.de [mailto:en-nut-discussion-
> bounces at egnite.de] On Behalf Of Erik L
> Sent: Wednesday, 2 April 2008 6:21 PM
> To: en-nut-discussion at egnite.de
> Subject: Re: [En-Nut-Discussion] Thread stops executing after some time.
>
>
> Because of the long time it takes before I can see the results of the tests,
> only the ones without debugging of the pointers have failed so far.
>
> But I did get some interesting info out of that one.
> The printout I get from the one that failed is the following (I listed the
> sockets 3 times)
>
> -----------
> 222 List of sockets
>
> (DEAD SOCKET)
> SYN SENT 220F TCP 192.168.0.112:6130 192.168.0.115:9050
> last_error:0 so_retran_time:0 so_rtto:1000 so_retransmits:2,
> NutGetMillis:35943