[En-Nut-Discussion] Thread stops executing after some time.

Don Ingram don at led.com.au
Mon Apr 7 01:26:22 CEST 2008


Henrik Maier wrote:

I haven't looked at the source for the relevant area for 12 months or so, 
but the interaction with the timer variable in the port timeout, as 
discussed in previous posts, was of interest.  Something is going troppo at 
a regular interval and is not showing in any stack overflow reports or 
memory overruns. Besides the port not responding, the only other symptom 
is that the port timeout period malfunctions. When this occurs the app 
resets the port and all is well again: no impact, just messy coding.

Having said this I am always open to a pointer somewhere in my app 
eventually going bozo or a queue overflowing and causing a write to the 
middle of a device control block or similar.

The issue has been there since day one, when the app was very small; it 
now runs at about 8 KLOC. The obvious solution is to try to get some 
spare time to strip it down and isolate the cause... after all it's just 
spare time ;-)

Cheers

Don

> Don,
>
> This issue relates only to TCP communication and the lock-up can only occur
> if there are network errors which require retransmission of packets. In most
> scenarios this issue will never occur and probably explains why it has not
> been detected so far.
>
> Regards
>
> Henrik
> http://www.proconx.com
>
>   
>> -----Original Message-----
>> From: en-nut-discussion-bounces at egnite.de [mailto:en-nut-discussion-
>> bounces at egnite.de] On Behalf Of Don Ingram
>> Sent: Sunday, 6 April 2008 7:24 PM
>> To: Ethernut User Chat (English)
>> Subject: Re: [En-Nut-Discussion] Thread stops executing after some time.
>>
>> Bravo!
>>
>> I have had a similar problem with a serial port task which waits on a
>> char in using the timeout value.  I count the no. of times per minute in
>> which the task runs, normally a low value such as 100..120. The reports
>> are sent out every few minutes via syslog over the ethernet port.
>>
>> After a few hours ( about 4 ) the serial port stops responding and the
>> task rate goes to 10000 or more. Rest of the system is OK just a dead
>> serial port.
>>
>> The fault is consistent across the 16 units in service but still could
>> be my dodgy code ;-)
>>
>> Any possibility that the serial port timeout routines may suffer from
>> the same problem?
>>
>> Cheers
>>
>> Don
>>
>>
>>
>> Harald Kipp wrote:
>>     
>>> Henrik Maier wrote:
>>>
>>>       
>>>> Erik, I suggest to change in the Nut/OS file net\tcpout.c (around line
>>>> 336) the statement:
>>>>             sock->so_retran_time = (u_short) NutGetMillis();
>>>> to
>>>>             sock->so_retran_time = (u_short) NutGetMillis();
>>>>             if (sock->so_retran_time == 0)
>>>>                sock->so_retran_time = 1; // so_retran_time must not be 0 which is a magic value!
>>>
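
To illustrate why the zero guard in Henrik's suggestion matters, here is a minimal sketch. It assumes that a stored value of 0 means "no retransmission timer running" (the "magic value" mentioned above) and that the millisecond counter is truncated to 16 bits by the cast. The function names and the uint32_t stand-in for the counter are hypothetical, not Nut/OS code.

```c
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not taken from Nut/OS. */

/* Truncate a millisecond tick to 16 bits, as the (u_short) cast does. */
static uint16_t retran_stamp_unguarded(uint32_t millis)
{
    return (uint16_t) millis;
}

/* Same truncation, but never returns the reserved value 0. */
static uint16_t retran_stamp_guarded(uint32_t millis)
{
    uint16_t stamp = (uint16_t) millis;

    if (stamp == 0)
        stamp = 1;  /* 0 is assumed to mean "timer stopped" */
    return stamp;
}

/* Assumed convention: a non-zero stamp means the timer is armed. */
static int retran_timer_running(uint16_t stamp)
{
    return stamp != 0;
}
```

With a 1 ms tick the low 16 bits are all zero once every 65536 ms, so retran_stamp_unguarded(65536) yields 0 and the timer would look stopped under the assumed convention, while retran_stamp_guarded(65536) yields 1.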
>>     
>>> Excellent, Henrik.
>>>
>>> Here is what I did:
>>>
>>> Added some extra code in ethin.c, which discards every 7th packet.
>>>
>>> Added some extra code in ipout.c, which creates a bad checksum for each
>>> 13th packet.
>>>
>>> Masked out the lower 17 bits of the NutGetMillis result at two locations
>>> in tcpsm.c (near lines 500 and 940) and one location in tcpout.c (near
>>> line 340). This way so_retran_time becomes zero more often.
>>>
>>> Created a Nut/OS and Windows application to test transfers in both
>>> directions. The Nut/OS application continuously prints the current result
>>> of NutGetMillis.
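
The packet-level fault injection Harald describes (discard every 7th received packet, corrupt the checksum of every 13th sent packet) could be sketched roughly as follows. This is only an illustration of the counting technique; the struct and function names are hypothetical and do not come from ethin.c or ipout.c.

```c
#include <stdint.h>

/* Hypothetical fault injector; names are illustrative, not Nut/OS code. */
struct fault_injector {
    uint32_t count;   /* packets seen since the last injected fault */
    uint32_t period;  /* inject a fault on every period-th packet (>= 1) */
};

/* Returns 1 on every period-th call, 0 otherwise. */
static int fault_due(struct fault_injector *fi)
{
    fi->count++;
    if (fi->count >= fi->period) {
        fi->count = 0;
        return 1;
    }
    return 0;
}

/* Receive side: returns 1 if the incoming frame should be discarded. */
static int rx_should_drop(struct fault_injector *fi)
{
    return fault_due(fi);
}

/* Transmit side: returns the checksum to send, inverted when a fault is due. */
static uint16_t tx_checksum(struct fault_injector *fi, uint16_t good)
{
    return fault_due(fi) ? (uint16_t) ~good : good;
}
```

Using different periods for the two directions (7 and 13 in the test above) keeps the injected faults from always coinciding, so retransmissions are exercised on both paths.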
>>>
>>> Here are the results:
>>>
>>> Without further modification the transfer stopped after some minutes.
>>>
>>> Then I changed all three locations to
>>> sock->so_retran_time = (u_short) NutGetMillis() | 1;
>>>
>>> After 12 hours the connections are still running. I'll update 4.4 as
>>> well as CVS HEAD.
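
A short sketch of why the one-line "| 1" form also avoids the zero value: the cast binds more tightly than the bitwise OR, so the counter is truncated first and the lowest bit is then forced on. The helper name and the uint32_t stand-in for the NutGetMillis result are assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical helper mirroring the one-line form quoted above. */
static uint16_t retran_stamp_or1(uint32_t millis)
{
    /* Truncate to 16 bits first, then set bit 0; the result is always odd. */
    return (uint16_t) ((uint16_t) millis | 1);
}
```

Compared with the explicit "if zero then 1" test, this form has no branch, at the cost of shifting every even timestamp forward by 1 ms.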
>>>
>>> Harald
>>> _______________________________________________
>>> http://lists.egnite.de/mailman/listinfo/en-nut-discussion


-- 
Cheers

Don Ingram


Leading Edge Design
 
Mob:   +61 4 1877 5670
Ph :   +61 7 4942 5670
SIP: 899 060 4942 5670

Fax:   +61 7 4942 5680

23 Daniel St
North Mackay
QLD 4740
Australia

P.O. Box 10326 
Mt Pleasant
QLD 4740
Australia



