[En-Nut-Discussion] Thread stops executing after some time.

Henrik Maier hmlists at focus-sw.com
Sun Apr 6 12:52:03 CEST 2008


Don,

This issue relates only to TCP communication, and the lock-up can only occur
if there are network errors which require retransmission of packets. In most
scenarios it will never occur, which probably explains why it has not been
detected so far.

Regards

Henrik
http://www.proconx.com

> -----Original Message-----
> From: en-nut-discussion-bounces at egnite.de [mailto:en-nut-discussion-
> bounces at egnite.de] On Behalf Of Don Ingram
> Sent: Sunday, 6 April 2008 7:24 PM
> To: Ethernut User Chat (English)
> Subject: Re: [En-Nut-Discussion] Thread stops executing after some time.
> 
> Bravo!
> 
> I have had a similar problem with a serial port task which waits for an
> incoming char using the timeout value.  I count the no. of times per minute
> that the task runs, normally a low value such as 100..120. The reports
> are sent out every few minutes via syslog over the ethernet port.
> 
> After a few hours (about 4) the serial port stops responding and the
> task rate goes to 10000 or more. The rest of the system is OK, just a dead
> serial port.
> 
> The fault is consistent across the 16 units in service but still could
> be my dodgy code ;-)
> 
> Any possibility that the serial port timeout routines may suffer from
> the same problem?
> 
> Cheers
> 
> Don
> 
> 
> 
> Harald Kipp wrote:
> > Henrik Maier wrote:
> >
> >> Erik, I suggest changing, in the Nut/OS file net\tcpout.c (around line 336),
> >> the statement:
> >>             sock->so_retran_time = (u_short) NutGetMillis();
> >> to
> >>             sock->so_retran_time = (u_short) NutGetMillis();
> >>             if (sock->so_retran_time == 0)
> >>                sock->so_retran_time = 1; // so_retran_time must not be 0,
> >>                                          // which is a magic value!
> >>
> >
> > Excellent, Henrik.
> >
> > Here is what I did:
> >
> > Added some extra code in ethin.c, which discards every 7th packet.
> >
> > Added some extra code in ipout.c, which creates a bad checksum for each
> > 13th packet.
> >
> > Masked out the lower 17 bits of the NutGetMillis result at two locations
> > in tcpsm.c (near lines 500 and 940) and one location in tcpout.c (near
> > line 340). This makes it more likely that so_retran_time becomes zero.
> >
> > Created a Nut/OS and Windows application to test transfers in both
> > directions. The Nut/OS application continuously prints the current result
> > of NutGetMillis.
> >
> > Here are the results:
> >
> > Without further modification the transfer stopped after some minutes.
> >
> > Then I changed all three locations to
> > sock->so_retran_time = (u_short) NutGetMillis() | 1;
> >
> > After 12 hours the connections are still running. I'll update 4.4 as
> > well as CVS HEAD.
> >
> > Harald
> > _______________________________________________
> > http://lists.egnite.de/mailman/listinfo/en-nut-discussion
> >
> >
> 
> _______________________________________________
> http://lists.egnite.de/mailman/listinfo/en-nut-discussion



