[En-Nut-Discussion] Nutos 5.1 on Ethernut 1.3g with multiple threads: network freezes

Jonathan Woithe jwoithe at atrad.com.au
Thu Jun 11 05:07:11 CEST 2015


On Wed, Jun 10, 2015 at 10:33:07AM +0200, Uwe Bonnes wrote:
> In the next few days I will try to compile the program on STM32 to see if
> the hang happens there too.

That will be interesting.

> Otherwise:
> Do you have a debugger?

Not as such, no.

> Or do you have spare and reachable pins? In the latter case, instrument
> crucial Ethernet routines with some pin toggling and observe with a scope
> or logic analyser.

I have been instrumenting the packet path with printf()s this morning and
was able to make some progress before the presence of additional printf()s
changed the program behaviour.  This in itself is interesting as it suggests
that memory corruption might be the root cause.

From this work it appears that packets are being received by NutOS at all
times during the fault condition.  The problem is that the outside world
never sees anything in response.

Curiously enough, the ethernut *does* respond to ARP requests even when it's
apparently unresponsive to IP packets:

12:17:48.268374 IP pc.domain > 192.168.0.245: ICMP echo request, id 15889, seq 1, length 64
12:17:49.267483 IP pc.domain > 192.168.0.245: ICMP echo request, id 15889, seq 2, length 64
12:17:50.267475 IP pc.domain > 192.168.0.245: ICMP echo request, id 15889, seq 3, length 64
12:17:51.267643 IP pc.domain > 192.168.0.245: ICMP echo request, id 15889, seq 4, length 64
12:17:52.267477 IP pc.domain > 192.168.0.245: ICMP echo request, id 15889, seq 5, length 64
12:17:53.267476 IP pc.domain > 192.168.0.245: ICMP echo request, id 15889, seq 6, length 64

The pc.domain machine already had an ARP entry when "ping" was started here,
so initially it just used that.

12:17:53.271451 ARP, Request who-has 192.168.0.245 tell pc.domain, length 28
12:17:54.267470 IP pc.domain > 192.168.0.245: ICMP echo request, id 15889, seq 7, length 64
12:17:54.271447 ARP, Request who-has 192.168.0.245 tell pc.domain, length 28
12:17:55.267478 IP pc.domain > 192.168.0.245: ICMP echo request, id 15889, seq 8, length 64
12:17:55.271439 ARP, Request who-has 192.168.0.245 tell pc.domain, length 28
12:17:56.267492 IP pc.domain > 192.168.0.245: ICMP echo request, id 15889, seq 9, length 64
12:17:57.267471 ARP, Request who-has 192.168.0.245 tell pc.domain, length 28
12:17:57.268234 ARP, Reply 192.168.0.245 is-at 42:54:52:44:10:00 (oui Unknown), length 46

Although it took some time (4 ARP requests over 4 seconds), the ethernut did
eventually respond to one of these.

12:17:57.268250 IP pc.domain > 192.168.0.245: ICMP echo request, id 15889, seq 10, length 64
12:17:58.267484 IP pc.domain > 192.168.0.245: ICMP echo request, id 15889, seq 11, length 64
12:17:59.267474 IP pc.domain > 192.168.0.245: ICMP echo request, id 15889, seq 12, length 64

Subsequent ping packets still went unanswered.
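
For longer captures, counting requests and replies by eye gets tedious. A
minimal sketch of a tally helper follows, assuming tcpdump's default text
output as shown above. The names classify_line(), tally_line() and struct
tally are made up for illustration and are not part of NutOS or tcpdump.

```c
#include <string.h>

/* Packet categories seen in the capture above. */
enum pkt_kind {
    PKT_OTHER,
    PKT_ICMP_REQUEST,
    PKT_ICMP_REPLY,
    PKT_ARP_REQUEST,
    PKT_ARP_REPLY,
    PKT_KIND_COUNT
};

/* Hypothetical helper: classify one line of tcpdump text output by
 * looking for the phrases tcpdump prints for each packet type. */
static enum pkt_kind classify_line(const char *line)
{
    if (strstr(line, "ICMP echo request"))
        return PKT_ICMP_REQUEST;
    if (strstr(line, "ICMP echo reply"))
        return PKT_ICMP_REPLY;
    if (strstr(line, "ARP, Request"))
        return PKT_ARP_REQUEST;
    if (strstr(line, "ARP, Reply"))
        return PKT_ARP_REPLY;
    return PKT_OTHER;
}

/* One counter per category. */
struct tally {
    unsigned counts[PKT_KIND_COUNT];
};

/* Hypothetical helper: add one capture line to the running totals. */
static void tally_line(struct tally *t, const char *line)
{
    t->counts[classify_line(line)]++;
}
```

Feeding each captured line to tally_line() and then comparing
counts[PKT_ICMP_REQUEST] with counts[PKT_ICMP_REPLY] gives a quick measure
of how many pings went unanswered.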

Obviously print_P() calls are more disruptive to the code layout than I/O
pin manipulations, so I'll see about trying that.
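
A rough sketch of what such pin markers might look like is below. The
trace_port variable stands in for a real port register so that the code
builds on a PC; on the target it would be replaced by whichever port has
spare pins. The pin numbers and the names trace_enter() and trace_leave()
are assumptions made for illustration, not taken from the Ethernut
schematics or the NutOS sources.

```c
#include <stdint.h>

/* Stand-in for a hardware port register (assumption). On the target,
 * TRACE_PORT would be defined as the real output port instead. */
static volatile uint8_t trace_port;
#define TRACE_PORT trace_port

/* Hypothetical pin assignments: one spare pin per routine of interest. */
#define TRACE_PIN_RX 0   /* raised while in the receive routine  */
#define TRACE_PIN_TX 1   /* raised while in the transmit routine */

/* Raise the marker pin on entry to the instrumented routine. */
static inline void trace_enter(uint8_t pin)
{
    TRACE_PORT |= (uint8_t)(1u << pin);
}

/* Lower the marker pin again on exit. */
static inline void trace_leave(uint8_t pin)
{
    TRACE_PORT &= (uint8_t)~(1u << pin);
}
```

Calling trace_enter(TRACE_PIN_RX) at the top of the receive routine and
trace_leave(TRACE_PIN_RX) before each return would show on a scope how long
the routine runs and whether it is entered at all, at the cost of only a
couple of instructions per call.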

Another thing I did was annotate NutEtherInput() and NutEtherOutput() to
print something on entry.  In the fault condition I see paired calls to
these (that is, NutEtherInput() is followed a little time later by
NutEtherOutput()).  However, no packets were noted by tcpdump as coming from
the ethernut at these times (only the ICMP requests from the PC).

jonathan

