[En-Nut-Discussion] Nutos 5.1 on Ethernut 1.3g with multiple threads: network freezes

Jonathan Woithe jwoithe at atrad.com.au
Mon Jun 15 05:26:35 CEST 2015


Hi all

Further to my email on Thursday June 11, I have now done some more tests.

On Thu, Jun 11, 2015 at 08:44:47PM +0930, Jonathan Woithe wrote:
> > So I think you are on the right track by looking at the init routines and/or
> > EEPROM emulation. Good luck, been there before...
> 
> Well, as I said it seems that the EEPROM emulation initialisation that was
> removed in r4711 has pretty much broken things for us with our application. 
> Putting it back in immediately makes everything work correctly.  Having said
> that, it seems the actual data emulated isn't all that critical: sending all
> ones is sufficient to make things work.  Now, this is more or less what the
> ethernut1.c code effectively does, but the nicrtl.c code does a few more
> steps either side of the emulation.  My guess is that the combination of
> these actions results in a cleaner initialisation.

It turns out that nut/arch/avr/board/ethernut1.c isn't being compiled for
the ethernut-1, and as such the FakeNicEeprom() function in this module is
not getting run at all.  Therefore, the default situation in NutOS 5.1.0
(and later) is that no EEPROM emulation is being done for the RTL8019AS
despite what one might conclude from the comments in ethernut1.c. 
FakeNicEeprom() essentially sets all EEPROM data to 0xff, which is what
the older EEPROM emulation function could be trivially set to do.  The
latter worked while the former seemed to have no effect.  At least we now
know why FakeNicEeprom() had no effect - it's not being used at all.

Considering now the older EEPROM emulation functions DetectNicEeprom() and
EmulateNicEeprom(), I noted earlier that if EmulateNicEeprom() was forced to
send 0xff for every byte then the interface worked fine.  However, calling
only DetectNicEeprom() (which essentially did the same thing) was not
sufficient to get a working interface.  Looking at DetectNicEeprom(), this
function sets an EEPROM read in motion, looks for a toggle of the EESK input
and then immediately finishes up (re-enabling the external memory interface,
setting the pins used for EEPROM emulation to their default state, and so
on).  However, when these pins are restored to their normal use the
RTL8019AS is almost certainly still reading (emulated) EEPROM contents from
these pins.  This means that for the latter part of the EEPROM read the
RTL8019AS will receive indeterminate data and therefore be initialised to
some indeterminate state - one where operation is apparently inconsistent.

If I paste in the assembly language code from FakeNicEeprom() into
DetectNicEeprom() so it waits for EESK to stop toggling before continuing,
the NIC initialises correctly and normal operation is observed.

These tests seem to indicate that reliable operation of the RTL8019AS
requires EEPROM emulation whenever the circuitry has been configured to
expect it.  The content of the emulated EEPROM can be entirely 0xff bytes,
but if the emulation is not done the NIC is not set up correctly and will
not work in all situations.

Even when the NIC does respond to packets without EEPROM emulation in place
(as outlined in my earlier tests) I note that the second NIC LED is not lit,
whereas it is when the NIC is fully initialised.  This suggests to me that
even when it seems to work without EEPROM emulation, there are still parts
of the NIC which are not completely set up.  This may explain the general
flakiness of test results and the dependence on the content of the user
application.

Note that all the above relates to ethernut boards which include the EEPROM
emulation circuit.  I expect it does not apply at all to those without
EEPROM emulation.  Unfortunately I don't have a non-EEPROM-emulation board
so I can't test that directly.  I don't think activating EEPROM emulation on
these boards in software will be detrimental because the EEPROM data line is
tied high, and thus the NIC will read 0xff which, as has been determined,
appears to be fine.
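
To show why a data line tied high reads back as 0xff however many bits are
clocked, here is a minimal host-side sketch in plain C.  The names
shift_in_byte(), data_line_t and line_tied_high() are made up for
illustration, and the MSB-first 8-bit framing is an assumption; the real
serial EEPROM read uses its own framing, but with every bit high the result
is the same.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical data-line sampler: returns non-zero when the line is high. */
typedef int (*data_line_t)(void);

/* Models a data line that is tied to the supply rail. */
static int line_tied_high(void)
{
    return 1;
}

/* Clock in eight bits, MSB first (assumed framing), from the data line. */
static uint8_t shift_in_byte(data_line_t read_data)
{
    uint8_t value = 0;
    int bit;

    assert(read_data != 0);

    for (bit = 0; bit < 8; bit++) {
        value = (uint8_t)((value << 1) | (read_data() ? 1 : 0));
    }
    return value;
}
```

Calling shift_in_byte(line_tied_high) gives 0xff, which matches the fixed
data that the software emulation presents.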

To address this problem a patch to NutOS is going to be required.  Should
fixed EEPROM data be emulated (much like FakeNicEeprom() does), or should we
retain the ability to provide arbitrary emulated data (like
EmulateNicEeprom() used to do)?  Since fixed data is evidently fine, in the
interests of simplicity I think the former approach is best.  I have
included a patch at the end of this post (against NutOS 5.1.0 but applicable
to svn trunk as well I imagine) which implements this idea.  Note that
rather than call NutDelay() or NutSleep() in the critical section an
assembly language delay is used instead.  This removes any possible side
effects of such calls.
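
For anyone wanting to check the length of the assembly delay, the arithmetic
can be sketched in C as follows.  The helper delay_loop_us() and the macro
CLOCKS_PER_ITERATION are hypothetical names used only here; the figures (a
16-bit counter, 2 clocks for adiw, 2 clocks for a taken brne, a 14.7456 MHz
system clock) are the ones stated in the patch comments.

```c
#include <assert.h>
#include <stdint.h>

/* Clocks per loop iteration: adiw (2) plus a taken brne (2). */
#define CLOCKS_PER_ITERATION 4UL

/*
 * Approximate duration of the delay loop in microseconds, rounded down.
 * The two ldi instructions and the final not-taken brne are ignored
 * because they contribute only a few clocks.
 */
static uint32_t delay_loop_us(uint32_t iterations, uint32_t cpu_hz)
{
    uint64_t clocks;

    assert(cpu_hz > 0);

    clocks = (uint64_t)iterations * CLOCKS_PER_ITERATION;
    return (uint32_t)((clocks * 1000000ULL) / cpu_hz);
}
```

With a 16-bit counter the loop runs 65536 times, so
delay_loop_us(65536, 14745600) gives 17777, i.e. a little under 18 ms.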

With this patch in place both the test code posted earlier to the list and
our application code are observed to work normally.  Without it, both fail to
respond to any network traffic.

> The only reason given for the removal of this code in r4711 was that the
> NutSleep() crashed when the firmware was compiled with "avr-gccdbg".  Does
> anyone know what this "avr-gccdbg" means in terms of gcc compile options?  I
> could do up a version of gcc to match this and test that too.

I don't know what this "avr-gccdbg" was so testing this with the attached
patch is not possible at this stage.  I am inclined to suggest that until
the problem is observed again we should simply note it and proceed with a
fix.  Since NutSleep()/NutDelay() are no longer used by the EEPROM
emulation code I have submitted, if avr-gccdbg again gives trouble it can be
investigated in detail then.  In the code removed in r4711 the freezing call
was very early in DetectNicEeprom() and way before the EEPROM emulation had
commenced; it is hard to see how EEPROM emulation could be directly causing
the issue.  I am wondering whether it had anything to do with the
NutSleep()/NutDelay() call being within a critical section or that it
occurred after the emulation port bits had been configured (these being
normally part of the external memory interface).  Obviously at this stage I
can't test this theory.  Why the compiler's configuration should matter in
either case is unclear - maybe it was due to optimisation, or even a subtle
compiler bug.

Regards
  jonathan

The RTL8019AS NIC on Ethernut 1 boards does not appear to initialise
reliably without software EEPROM emulation when the board circuitry is
configured to expect EEPROM emulation.  Reinstate a minimalist EEPROM
emulation to address this.  While many NutOS applications would work fine
without this, others exhibited various network-related problems.

Signed-off-by: Jonathan Woithe <jwoithe at atrad.com.au>

--- ethernut-5.1.0-production/nut/arch/avr/dev/nicrtl.c	2012-12-28 20:57:17.000000000 +1030
+++ ethernut-5.1.0/nut/arch/avr/dev/nicrtl.c	2015-06-15 12:34:47.175277508 +0930
@@ -144,6 +144,76 @@
 
 #endif /* RTL_SIGNAL_IRQ */
 
+/* Pins used to support EEPROM emulation */
+#if (RTL_EESK_AVRPORT == AVRPORTB)
+#define RTL_EESK_PIN    PINB
+#define RTL_EESK_DDR    DDRB
+  
+#elif (RTL_EESK_AVRPORT == AVRPORTC)
+#define RTL_EE_MEMBUS
+#define RTL_EESK_PIN    PINC
+#define RTL_EESK_DDR    DDRC
+  
+#elif (RTL_EESK_AVRPORT == AVRPORTD)
+#define RTL_EESK_PIN    PIND
+#define RTL_EESK_DDR    DDRD
+  
+#elif (RTL_EESK_AVRPORT == AVRPORTE)
+#define RTL_EESK_PIN    PINE
+#define RTL_EESK_DDR    DDRE
+  
+#elif (RTL_EESK_AVRPORT == AVRPORTF)
+#define RTL_EESK_PIN    PINF
+#define RTL_EESK_DDR    DDRF
+  
+#endif /* RTL_EESK_AVRPORT */
+
+#if (RTL_EEDO_AVRPORT == AVRPORTB)
+#define RTL_EEDO_PORT   PORTB
+#define RTL_EEDO_DDR    DDRB
+
+#elif (RTL_EEDO_AVRPORT == AVRPORTC)
+#define RTL_EE_MEMBUS
+#define RTL_EEDO_PORT   PORTC
+#define RTL_EEDO_DDR    DDRC
+
+#elif (RTL_EEDO_AVRPORT == AVRPORTD)
+#define RTL_EEDO_PORT   PORTD
+#define RTL_EEDO_DDR    DDRD
+
+#elif (RTL_EEDO_AVRPORT == AVRPORTE)
+#define RTL_EEDO_PORT   PORTE
+#define RTL_EEDO_DDR    DDRE
+
+#elif (RTL_EEDO_AVRPORT == AVRPORTF)
+#define RTL_EEDO_PORT   PORTF
+#define RTL_EEDO_DDR    DDRF
+
+#endif /* RTL_EEDO_AVRPORT */
+
+#if (RTL_EEMU_AVRPORT == AVRPORTB)
+#define RTL_EEMU_PORT   PORTB
+#define RTL_EEMU_DDR    DDRB
+
+#elif (RTL_EEMU_AVRPORT == AVRPORTC)
+#define RTL_EE_MEMBUS
+#define RTL_EEMU_PORT   PORTC
+#define RTL_EEMU_DDR    DDRC
+
+#elif (RTL_EEMU_AVRPORT == AVRPORTD)
+#define RTL_EEMU_PORT   PORTD
+#define RTL_EEMU_DDR    DDRD
+
+#elif (RTL_EEMU_AVRPORT == AVRPORTE)
+#define RTL_EEMU_PORT   PORTE
+#define RTL_EEMU_DDR    DDRE
+
+#elif (RTL_EEMU_AVRPORT == AVRPORTF)
+#define RTL_EEMU_PORT   PORTF
+#define RTL_EEMU_DDR    DDRF
+
+#endif /* RTL_EEMU_AVRPORT */
+
 
 /*!
  * \brief Size of a single ring buffer page.
@@ -243,6 +313,146 @@
 #define NICINB(reg)         (*((volatile uint8_t *)RTL_BASE_ADDR + reg))
 #define NICOUTB(reg, val)   (*((volatile uint8_t *)RTL_BASE_ADDR + reg) = val)
 
+/*!  
+ * \brief Provide basic EEPROM emulation for the Ethernet controller.
+ *
+ * Tests show that if the EEPROM emulation circuitry is in place it is
+ * necessary to at least provide a string of 0xff bytes to the RTL8019AS
+ * via emulation in order for it to be initialised reliably.
+ *
+ * Return 0 if EEPROM emulation was detected or -1 if not.  The caller could
+ * use this information for its own purposes if desired.
+ *
+ */   
+static int InitNicEeprom(void)
+{
+#ifdef RTL_EESK_BIT
+    register u_int cnt = 0;
+
+    NutEnterCritical();
+
+    /*
+     * Prepare the EEPROM emulation port bits. Configure the EEDO
+     * and the EEMU lines as outputs and set both lines to high.
+     */
+    sbi(RTL_EEDO_PORT, RTL_EEDO_BIT);
+    sbi(RTL_EEDO_DDR, RTL_EEDO_BIT);
+#ifdef RTL_EEMU_BIT
+    sbi(RTL_EEMU_PORT, RTL_EEMU_BIT);
+    sbi(RTL_EEMU_DDR, RTL_EEMU_BIT);
+#endif
+
+    /* Insert a delay of approximately 20ms to allow the pins to settle (the
+     * exact delay length is not critical).  Use assembly to remove any
+     * possibility that the external SRAM is used since the eeprom
+     * emulation bits - which may be part of the data bus - have already
+     * been configured.
+     *
+     * Like Delay16Cycles(), this assumes a 14.7456 MHz system clock.  adiw
+     * consumes 2 clocks, a brne when taken consumes 2 clocks.  The loop
+     * will run for 65536 iterations (262144 clocks), giving a delay of
+     * about 17.8 ms.  That's close enough.
+     */
+    __asm__ __volatile__(
+        "        ldi  r24, 0    " "\n"   /* Clear counter. */
+        "        ldi  r25, 0    " "\n"   /* */
+        "DelayLoop:             " "\n"   /* */
+        "        adiw r24, 1    " "\n"   /* Count loops. */
+        "        brne DelayLoop " "\n\t" /* Exit loop on counter overflow. */
+        :      /* No outputs. */
+        :      /* No inputs. */
+        :"r24", "r25");
+
+    /* Force the chip to re-read the EEPROM contents. */
+    NICOUTB(NIC_CR, NIC_CR_STP | NIC_CR_RD2 | NIC_CR_PS0 | NIC_CR_PS1);
+    NICOUTB(NIC_PG3_EECR, NIC_EECR_EEM0);
+
+#ifdef RTL_EE_MEMBUS
+    /*
+     * No external memory access beyond this point.
+     */
+#ifdef __AVR_ENHANCED__
+    /* On the ATmega 128 we release bits 5-7 as normal port pins. */
+    outb(XMCRB, inb(XMCRB) | _BV(XMM0) | _BV(XMM1));
+#else
+    /* On the ATmega 103 we have to disable the external memory interface. */
+    cbi(MCUCR, SRE);
+#endif
+#endif
+
+    /* Check, if the chip toggles our EESK input. If not, we do not
+     * have EEPROM emulation hardware.
+     */
+    if (bit_is_set(RTL_EESK_PIN, RTL_EESK_BIT)) {
+        while (++cnt && bit_is_set(RTL_EESK_PIN, RTL_EESK_BIT));
+    } else {
+        while (++cnt && bit_is_clear(RTL_EESK_PIN, RTL_EESK_BIT));
+    }
+
+    /* Loop until the chip stops toggling our EESK input, ensuring it reads
+     * 0xff until it is completed.  We do it in assembly language to make
+     * sure that no external RAM is used for the timeout counter variable.
+     *
+     * On boards without EEPROM emulation the loop will time out after
+     * 0xffff iterations, which should be more than enough time for the
+     * NIC to have completed its read.
+     */
+    __asm__ __volatile__("\n"   /* */
+                         "EmuLoop:               " "\n" /* */
+                         "        ldi  r24, 0    " "\n" /* Clear counter. */
+                         "        ldi  r25, 0    " "\n" /* */
+                         "        sbis %0, %1    " "\n" /* Check if EESK set. */
+                         "        rjmp EmuClkClr " "\n" /* */
+                         "EmuClkSet:             " "\n" /* */
+                         "        adiw r24, 1    " "\n" /* Count loops with EESK set. */
+                         "        breq EmuDone   " "\n" /* Exit loop on counter overflow. */
+                         "        sbis %0, %1    " "\n" /* Test if EESK is still set. */
+                         "        rjmp EmuLoop   " "\n" /* EESK changed, do another loop. */
+                         "        rjmp EmuClkSet " "\n" /* Continue waiting. */
+                         "EmuClkClr:             " "\n" /* */
+                         "        adiw r24, 1    " "\n" /* Count loops with EESK clear. */
+                         "        breq EmuDone   " "\n" /* Exit loop on counter overflow. */
+                         "        sbic %0, %1    " "\n" /* Test if EESK is still clear. */
+                         "        rjmp EmuLoop   " "\n" /* EESK changed, do another loop. */
+                         "        rjmp EmuClkClr " "\n" /* Continue waiting. */
+                         "EmuDone:               \n\t"  /* */
+                         :      /* No outputs. */
+                         :"I"(_SFR_IO_ADDR(RTL_EESK_PIN)), /* Emulation port. */
+                          "I"(RTL_EESK_BIT)                 /* EESK bit. */
+                         :"r24", "r25");
+
+#ifdef RTL_EE_MEMBUS
+    /*
+     * Enable memory interface.
+     */
+#ifdef __AVR_ENHANCED__
+    /* On the ATmega 128 we return bits 5-7 to the memory interface. */
+    outb(XMCRB, inb(XMCRB) & ~(_BV(XMM0) | _BV(XMM1)));
+#else
+    /* On the ATmega 103 we have to re-enable the external memory interface. */
+    sbi(MCUCR, SRE);
+#endif
+#endif
+
+    /* Reset port outputs to default. */
+    cbi(RTL_EEDO_PORT, RTL_EEDO_BIT);
+    cbi(RTL_EEDO_DDR, RTL_EEDO_BIT);
+#ifdef RTL_EEMU_BIT
+    cbi(RTL_EEMU_PORT, RTL_EEMU_BIT);
+    cbi(RTL_EEMU_DDR, RTL_EEMU_BIT);
+#endif
+
+    /* Restore previous interrupt enable state. */
+    NutExitCritical();
+
+    /* Wait until controller ready. */
+    while (NICINB(NIC_CR) != (NIC_CR_STP | NIC_CR_RD2));
+
+    return cnt ? 0 : -1;
+#else
+    return -1;
+#endif
+}
 /*!
  * \brief Reset the Ethernet controller.
  *
@@ -306,6 +516,8 @@
         return -1;
     }
 
+    InitNicEeprom();
+
     /*
      * Mask all interrupts and clear any interrupt status flag to set the
      * INT pin back to low.
@@ -315,10 +527,10 @@
 
     /*
      * During reset the nic loaded its initial configuration from an
-     * external eeprom. On the ethernut board we do not have any
-     * configuration eeprom, but simply tied the eeprom data line to
-     * high level. So we have to clear some bits in the configuration
-     * register. Switch to register page 3.
+     * external eeprom.  On the ethernut board we do not have any
+     * configuration eeprom, but either tied the eeprom data line to high
+     * level or emulate 0xff eeprom data in software.  So we have to clear
+     * some bits in the configuration register.  Switch to register page 3.
      */
     NICOUTB(NIC_CR, NIC_CR_STP | NIC_CR_RD2 | NIC_CR_PS0 | NIC_CR_PS1);
 

