[En-Nut-Discussion] Nutos 5.1 on Ethernut 1.3g with multiple threads: network freezes

hmnews at proconx.com hmnews at proconx.com
Thu Jun 11 10:33:23 CEST 2015


Hi Jonathan,

Now that you mention EEPROM emulation, it reminds me of some issues we had 
in the past with the RTL8019AS chip.

If the chip is not initialised with the correct mode at power-up or reset, 
it can cause all sorts of strange behaviour. With one of the designs we had, 
this caused random address decoding issues where the NIC sometimes, but not 
always, decoded itself into the RAM address space, causing conflicts with 
the static memory. Obviously this led to random crashes after the network 
was initialised. Sound familiar?

So I think you are on the right track by looking at the init routines 
and/or EEPROM emulation. Good luck, been there before...

Cheers,

Henrik


On 11/06/2015 6:04 PM, Jonathan Woithe wrote:
> Hi everyone
>
> I have run many tests today without coming up with a definitive answer.  I
> did not get a chance to do pin toggle tests - that will come later if
> needed.
>
> The most interesting finding came when I instrumented only NutIpOutput(),
> NutIpInput(), NutEtherOutput() and NicPutPacket() (in nicrtl.c).  In the
> fault condition, NicPutPacket() appeared to be receiving the ICMP reply
> packet for transmission, was submitting it and returning normally.  However,
> no such packet appeared on the ethernet output.  Taken at face value this
> suggested that the NIC itself had failed to transmit the packet for some
> reason.
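
A side note on the instrumentation approach: since the symptoms here seem
to shift with whatever extra code is added, a small in-RAM trace may disturb
timing less than printing from inside the stack. The following is only a
minimal sketch in plain C under that assumption. The enum, the buffer and
the trace_put()/trace_recent() helpers are hypothetical names made up for
illustration; none of them exist in Nut/OS, and this is not the
instrumentation used in the tests described above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical trace points. The names mirror the functions mentioned
 * above, but this enum is an illustration, not part of Nut/OS. */
enum trace_id {
    TRACE_IP_INPUT = 1,
    TRACE_IP_OUTPUT,
    TRACE_ETHER_OUTPUT,
    TRACE_NIC_PUT_PACKET
};

#define TRACE_SLOTS 16u /* power of two, so the index can be masked */

struct trace_entry {
    uint8_t id;   /* which trace point fired */
    uint16_t arg; /* free-form value, e.g. a packet length or return code */
};

static struct trace_entry trace_buf[TRACE_SLOTS];
static uint8_t trace_next;  /* slot the next entry is written to */
static uint8_t trace_count; /* valid entries, saturates at TRACE_SLOTS */

/* Record one event, overwriting the oldest entry once the buffer is full. */
static void trace_put(enum trace_id id, uint16_t arg)
{
    trace_buf[trace_next].id = (uint8_t)id;
    trace_buf[trace_next].arg = arg;
    trace_next = (uint8_t)((trace_next + 1u) & (TRACE_SLOTS - 1u));
    if (trace_count < TRACE_SLOTS) {
        trace_count++;
    }
}

/* Return the n-th most recent entry (0 = newest), or NULL if no such
 * entry has been recorded or it has already been overwritten. */
static const struct trace_entry *trace_recent(uint8_t n)
{
    if (n >= trace_count) {
        return NULL;
    }
    return &trace_buf[(trace_next + TRACE_SLOTS - 1u - n) & (TRACE_SLOTS - 1u)];
}
```

A call such as trace_put(TRACE_NIC_PUT_PACKET, len) near the top of each
function of interest (len being whatever value is worth keeping) would leave
the last 16 events in RAM, to be dumped once the fault has occurred.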
>
> I had also noticed that compared to NutOS 4.4.1 the behaviour of the NIC
> LEDs under NutOS 5.1.0 had changed (rather than both being mostly on, one
> was now continuously off).  This got me thinking about the RTL8019AS
> initialisation sequence.
>
> Looking in nicrtl.c it is clear that EEPROM emulation is no longer being
> done here.  A check of the svn repo shows that it was removed in r4711
> (dated 7 Oct 2012).  Curiously, there was also a less sophisticated EEPROM
> emulation routine originally in arch/avr/os/nutinit.c which was moved to
> arch/avr/board/ethernut1.c in r3000.  This code remains in place, but as
> mentioned it does not set the NIC up as completely as the deleted code.
>
> The commit message in r4711 says:
>
>    Remove EEPROM emulation from RTL8019AS driver. When compiling with
>    avr-gccdbg, the driver crashes at NutDelay. EEPROM emulation is quite board
>    specific and we may later implement a separate driver in the Ethernut 1
>    board file. For now we live with the fact, that additional Ethernet
>    collisions may appear. Much better than a driver which crashes in debug
>    mode.
>
> Given my experience chasing the present issue down (where precise symptoms
> seemingly varied depending on what additional code was included in NutOS
> functions) I wonder whether this crashing in NutDelay() was a symptom of the
> same deeper problem that I have triggered.  It is certainly far from obvious
> why the code removed in r4711 would crash inside NutDelay() only in debug
> mode.  This kind of non-deterministic behaviour is exactly the sort of thing
> I've been seeing.
>
> Lacking anything else more concrete, I manually patched the EEPROM emulation
> code from NutOS 4.4.1 back into nicrtl.c from NutOS 5.1.0 along with the
> defines it required.  The call sequence
>
>    if (DetectNicEeprom() == 0) {
>        EmulateNicEeprom();
>    }
>
> was inserted into NicStart() just after the NicReset() clause.  When the
> resulting NutOS was used with the test program posted previously, full
> network functionality was restored (in so far as ICMP pings worked once
> again).
>
> I then recompiled and relinked our original application against this revised
> NutOS and tested it.  It too worked, and full network connectivity (ICMP and
> TCP) was observed.
>
> Next up I temporarily changed the call sequence in NicStart() to
>
>    if (DetectNicEeprom() == 0) {
>        EmulateNicEeprom();
>    }
>
> This would effectively prevent the old EEPROM emulation code from running
> while still keeping it in the program space (thus preserving program memory
> layout).  The resulting firmware failed to respond to ICMP pings.
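
How the call was actually disabled is not recoverable from the snippet as
archived. One generic way to keep a routine linked into the image while
never executing it is to guard the call with a flag the compiler has to read
at run time. The sketch below assumes that approach. The two driver
functions are replaced by hypothetical stubs so the fragment compiles on a
host, and the flag and wrapper names are invented for illustration. Whether
the memory layout really stays identical depends on the toolchain and would
need checking against the map file.

```c
#include <stdint.h>

/* Counts how often the emulation stub ran; only here so the effect of the
 * guard can be observed on a host build. */
static int emulate_calls;

/* Hypothetical stand-ins for the driver routines named above. The return
 * value 0 is assumed to mean "no EEPROM detected", as the original call
 * sequence suggests. */
static int DetectNicEeprom(void)
{
    return 0;
}

static void EmulateNicEeprom(void)
{
    emulate_calls++;
}

/* volatile forces a run-time read, so the compiler cannot prove the guarded
 * call is dead code and drop EmulateNicEeprom() from the image. */
static volatile uint8_t eeprom_emulation_enabled = 1;

/* Hypothetical wrapper for the step placed after NicReset() in NicStart(). */
static void nic_start_eeprom_step(void)
{
    if (eeprom_emulation_enabled && DetectNicEeprom() == 0) {
        EmulateNicEeprom();
    }
}
```

Setting eeprom_emulation_enabled to 0 before the NIC is started skips the
emulation without removing any code, which separates "the emulation code
ran" from "the emulation code is present in flash".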
>
> These observations suggest one of several options.
>
>   1) The old EEPROM emulation code is configuring the NIC in a certain way
>      that is required for reliable operation under all possible
>      circumstances.
>
>   2) When it runs, the EEPROM emulation code is leaving some values in RAM
>      which happen to work around some other as yet unidentified problem
>      within NutOS.
>
> As I understand it, the default EEPROM emulation in nutinit.c simply holds
> the data line high for the duration of the process.  In nicrtl.c, the only
> data field not held high is CONFIG3 (0x30).  I changed this to 0xFF and
> re-tested.  Networking worked and the LEDs were configured to both operate.
> If option 1 is the key to the problem, the critical configuration details
> may not be in the emulated EEPROM contents themselves, but rather a side
> effect of the process.
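
To make the comparison of the two emulations concrete, here is a small
host-side model. It is only a sketch under loud assumptions: the image
length, the position of CONFIG3 within it and the MSB-first bit order are
all hypothetical, and the real serial EEPROM protocol is reduced to "one
data bit per clock". It shows only that a data line held high is
indistinguishable, content-wise, from a table of 0xFF bytes, so after
changing CONFIG3 from 0x30 to 0xFF the two emulations would present the same
contents to the NIC.

```c
#include <stddef.h>
#include <stdint.h>

#define IMAGE_LEN 4u     /* hypothetical image length, illustration only */
#define CONFIG3_INDEX 1u /* hypothetical position of CONFIG3 in the image */

/* A data-line source: returns the level (0 or 1) for a given bit clock. */
typedef int (*data_line_fn)(size_t bit_clock);

/* Image presented by the table-driven emulation. */
static const uint8_t *current_image;

/* Table-driven emulation: present the image bit by bit, MSB first
 * (the bit order is an assumption of this model). */
static int line_from_image(size_t bit_clock)
{
    uint8_t byte = current_image[bit_clock / 8u];
    return (byte >> (7u - (bit_clock % 8u))) & 1;
}

/* Simplest emulation: the data line is held high on every clock. */
static int line_held_high(size_t bit_clock)
{
    (void)bit_clock;
    return 1;
}

/* Model of what the NIC would latch: shift in 8 bits per byte. */
static void latch_image(data_line_fn line, uint8_t out[IMAGE_LEN])
{
    for (size_t i = 0; i < IMAGE_LEN; i++) {
        uint8_t v = 0;
        for (size_t b = 0; b < 8u; b++) {
            v = (uint8_t)((v << 1) | (uint8_t)line(i * 8u + b));
        }
        out[i] = v;
    }
}
```

With an image of {0xFF, 0x30, 0xFF, 0xFF}, latch_image() yields 0x30 only at
CONFIG3_INDEX for the table-driven source and 0xFF everywhere for the
held-high source. If networking works in both cases, the difference is more
likely to lie in timing or side effects of the routine than in the bytes
themselves.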
>
> This is as far as I got today.  Is there anything we can conclude from these
> findings?
>
> Regards
>    jonathan
> _______________________________________________
> http://lists.egnite.de/mailman/listinfo/en-nut-discussion
>


