[En-Nut-Discussion] Nutos 5.1 on Ethernut 1.3g with multiple threads: network freezes

Jonathan Woithe jwoithe at atrad.com.au
Wed Jun 10 09:41:52 CEST 2015


Hi all

We have a NutOS application targeting the Ethernut 1.3g reference design
(implemented on our own PCBs) which was originally developed using gcc
3.4.4, binutils 2.16.1, avr-libc 1.2.5 and NutOS 4.4.1, cross compiling
under Linux.  Lua 5.0.2 was used by NutOS for configuration management. 
This has worked well for around 7 years, but the time has come to upgrade
the software components to pick up bug fixes and the like.

Consequently I have compiled a new development environment (still under
Linux) comprising gcc 5.1.0, binutils 2.25, avr-libc 1.8.1 (patched to deal
with the various crt/libdev issues under gcc 5.x) and NutOS 5.1.0 (patched
to work with avr-libc 1.8.1 changes to program memory management).  The
respective patches were lifted from the project repositories and can be
found (for the moment) at

  http://www.atrad.com.au/~jwoithe/avr/

Lua 5.3.0 is now in use for NutOS configuration duties.

When compiled under this new environment, the resulting firmware seems to
freeze its handling of the network interface: not even ICMP pings work. 
Usually all other threads appear to be ok.

At the end of this email is an example program (nettest.c) which exhibits
the behaviour.  This was compiled using

  avr-gcc -I.. -I. \
    -I~/avr/ethernut-5.1.0//nut/include \
    -mmcu=atmega128 -MD -MP -Os -Wall -ffunction-sections \
    -fno-delete-null-pointer-checks -Wstrict-prototypes \
    -Wa,-ahlms=nettest.lst -DETHERNUT1 -D__HARVARD_ARCH__ \
    -c nettest.c -o nettest.o

linked with

  avr-gcc nettest.o \
    -mmcu=atmega128 \
    -Wl,--gc-sections -Wl,--defsym=main=0,-Map=.map,--cref \
    -L~/avr/ethernut-5.1.0//nutbld/lib \
    -Wl,--start-group \
      ~/avr/ethernut-5.1.0//nutbld/lib/nutinit.o \
      -lnutpro -lnutnet -lnutfs -lnutos -lnutgorp -lnutcrt -lnutdev -lnutarch\
    -Wl,--end-group -lm \
    -Wl,-Map=nettest.map -o nettest.elf

  avr-objcopy -R .eeprom -O ihex nettest.elf nettest.hex

It is written to the flash with an Egnite SP-Duo 2 via a USB-serial
converter using

  avrdude -p m128 -P /dev/ttyUSB0 -c stk500v2 -U flash:w:nettest.hex

The NutThreadSetPriority() calls mirror what we do in our final application
but it turns out the erroneous behaviour is seen even when these are
commented out.

Correct network behaviour (that is, ICMP pings from a PC to the ethernut at
192.168.0.245 are answered) is seen in each of the following cases:

 * The stdin freopen() call is commented out along with either the mux_cx 
   or tcp_hdlr NutThreadCreate() call.

 * The mux_cx and tcp_hdlr NutThreadCreate() calls are commented out.

 * The ser_hdlr NutThreadCreate() call is commented out.

 * The i2c_wdog NutThreadCreate() call is commented out.

 * The tcp_hdlr NutThreadCreate() call is commented out.

All other commenting combinations appear to work fine.

To complicate things somewhat, in our real firmware some intermediate
behaviours are sometimes seen: I might get 6 ICMP pings through before the
ethernut's interface becomes unresponsive, for example.  For the moment
though this is peripheral.

The above results have been replicated on two different PCBs, thus seemingly
ruling out faulty electronic components.  An svn checkout from yesterday (9
June) exhibited similar behaviours.

With the program as distributed below, the initial output to the serial port 
is

  free 30038
  free 29736
  free 27320
    tcp_hdlr prio= 64,  996 bytes free, stat=0
    i2c_wdog prio= 64,  484 bytes free, stat=0
    ser_hdlr prio= 64,  486 bytes free, stat=0
      mux_cx prio= 64,  230 bytes free, stat=0
        rxi5 prio=  9,  219 bytes free, stat=0
        main prio= 64,  618 bytes free, stat=0
        idle prio=254,  358 bytes free, stat=0

followed by repeats of

  free 26816
       tcpsm prio= 32,  219 bytes free, stat=0
    tcp_hdlr prio= 64,  998 bytes free, stat=0
    i2c_wdog prio= 64,  486 bytes free, stat=0
    ser_hdlr prio= 64,  486 bytes free, stat=0
      mux_cx prio= 64,  230 bytes free, stat=0
        rxi5 prio=  9,  219 bytes free, stat=0
        main prio= 64,  618 bytes free, stat=0
        idle prio=254,  358 bytes free, stat=0

In other words, there looks to be plenty of free RAM and none of the threads
are in stack trouble.  Curiously enough, the main loop sometimes freezes in
this configuration.
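
One caveat on the "bytes free" column: in the listing it is computed from
each thread's saved stack pointer (td_sp), so it most likely reflects stack
depth at the last context switch rather than the deepest use so far.  A
common way to measure worst-case use is to paint the stack area with a known
byte before the thread runs and later count how much paint is still intact.
The following is a minimal host-side sketch of that idea, assuming a stack
that grows downward towards its base address.  The names stack_paint and
stack_unused and the 0xA5 fill byte are illustrative assumptions, not part of
the NutOS API.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Arbitrary paint byte (assumption); any value unlikely to be
 * written by normal stack traffic will do. */
#define STACK_FILL 0xA5u

/* Paint the whole stack area before the thread first runs. */
static void stack_paint(uint8_t *base, size_t size)
{
    memset(base, STACK_FILL, size);
}

/* Count paint bytes still intact, scanning up from the low end.
 * With a downward-growing stack this is the headroom that has
 * never been touched, i.e. size minus the high-water mark. */
static size_t stack_unused(const uint8_t *base, size_t size)
{
    size_t n = 0;

    while (n < size && base[n] == STACK_FILL)
        n++;
    return n;
}
```

Applied to a thread, base and size would correspond to the start and length
of that thread's stack allocation.  A result close to zero would point at a
stack that has been nearly or completely exhausted at some point, even if the
snapshot figures look healthy.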

Given the somewhat non-deterministic behaviour my initial thought was that
the board had bad flash or SRAM, but since a second board behaves similarly
this seems unlikely.  To be doubly sure I dug out a real Ethernut 1.3g board
we had here, and it showed exactly the same behaviour as the other two
tested PCBs.

Does anyone have any ideas as to what might be going wrong here, or how to
debug this further?

Regards
  jonathan

/* Program to test network freeze */

#include <stdio.h>
#include <sys/timer.h>
#include <arpa/inet.h>
#include <sys/socket.h>

#include <dev/board.h>
#include <io.h>

#include <sys/heap.h>

THREAD(serial_handler1, arg) {
//  NutThreadSetPriority(100);
  for (;;) {
    NutSleep(500);
  }
}

THREAD(mux_cx_handler1, arg) {
//  NutThreadSetPriority(90);
  for (;;) {
    NutSleep(610);
  }
}

THREAD(i2c_watchdog1, arg) {
//  NutThreadSetPriority(50);
  for (;;) {
    NutSleep(705);
  }
}

THREAD(acm_tcp_handler1, arg) {

  TCPSOCKET *sock;
//  NutThreadSetPriority(60);
  sock = NutTcpCreateSocket();
  if (sock == NULL)
    printf_P(PSTR("socket error\n"));

  for (;;) {
    NutSleep(850);
  }
}

int main(void) {

  /* Hard coded network configuration. */
  #define MY_MAC  "\x42\x54\x52\x44\x10\x00"
  #define MY_IP   "192.168.0.245"
  #define MY_MASK "255.255.255.0"

  uint8_t mac[] = MY_MAC;
  uint32_t ip_addr = inet_addr(MY_IP);
  uint32_t ip_mask = inet_addr(MY_MASK);

  // Enable debug output
  uint32_t serial_speed = 115200;
  NutRegisterDevice(&DEV_CONSOLE, 0, 0);
  freopen(DEV_CONSOLE.dev_name, "w", stdout);
  freopen(DEV_CONSOLE.dev_name, "r", stdin); 
  _ioctl(_fileno(stdout), UART_SETSPEED, &serial_speed);

  NutSleep(200);
  printf_P(PSTR("free %d\n"), NutHeapAvailable());

  if (NutRegisterDevice(&DEV_ETHER, 0x8300, 5) != 0)
    printf_P(PSTR("nic init fail\n"));

  if (NutNetIfConfig(DEV_ETHER_NAME, mac, ip_addr, ip_mask) != 0)
    printf_P(PSTR("nic config fail\n"));

  printf_P(PSTR("free %d\n"), NutHeapAvailable());


  NutThreadCreate("mux_cx", mux_cx_handler1, 0, 256);
  NutThreadCreate("ser_hdlr", serial_handler1, 0, 512);
  NutThreadCreate("i2c_wdog", i2c_watchdog1, 0, 512);
  NutThreadCreate("tcp_hdlr", acm_tcp_handler1, 0, 1024);

  for (;;) {
    NUTTHREADINFO *tdp = nutThreadList;
    printf_P(PSTR("free %d\n"), NutHeapAvailable());
    for (tdp = nutThreadList; tdp; tdp = tdp->td_next) {
      printf_P(PSTR("%10s prio=%3d, %4lu bytes free, stat=%d\n"),
        tdp->td_name, tdp->td_priority, 
        (unsigned long)((uintptr_t) tdp->td_sp - (uintptr_t) tdp->td_memory),
        tdp->td_state);
    }
    printf_P(PSTR("\n"));
    NutSleep(1000);
  }
}

