[En-Nut-Discussion] Ethernut TCP hangs

Michael Smola Michael.Smola at gmx.net
Sat Apr 2 16:26:57 CEST 2005


I analysed this carefully, but couldn't solve the problem.

Here are my configuration parameters:

============================================================================
Ethernut: Rev 1.3 Rev-D

GCC:      Reading specs from
          c:/Program Files/Avr/winavr/bin/../lib/gcc/avr/3.4.1/specs
          Configured with: ../gcc-3.4.1/configure --prefix=e:/avrdev/install
                           --build=mingw32 --host=mingw32 --target=avr
                           --enable-languages=c,c++
          Thread model: single
          gcc version 3.4.1

Test:     Automatic reload of a diagnostic HTML page via the HTTP server
          every 5 seconds (several CGI methods and HTML pages)

Nut/OS:   3.4.1.1: test passed (more than 24 hours)
Nut/OS:   3.9.5.1: test failed (max. 3-4 hours)
============================================================================

It takes between several minutes and 3-4 hours until the connection stalls
with 3.9.5.1pre.
Nut/OS 3.4.1.1 never showed a problem in weeks of uptime!

I agree that it would be boring to click a "refresh" button to test :o)
So I added an expiry field to my test pages, controllable by a URL
parameter: ........./cgi-bin/tool.cgi?5. This forces the page to be
reloaded every 5 seconds. I'd rather not post the complete URL in a
newsgroup, so if you would like to test it online, we should find a secure
way to transfer the required data.
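
A minimal sketch of how such a URL parameter could be turned into a reload
interval. This is not the code running on the board: the helper name
build_refresh_tag, the use of an HTML meta refresh element and the 1..3600
second limits are assumptions made for illustration only.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: interpret the text after '?' as a reload interval in
 * seconds and write an HTML meta refresh element into out.
 * Returns the number of characters written, or 0 if the parameter is
 * missing, malformed or out of range (the page is then served without
 * automatic reload).
 */
int build_refresh_tag(const char *query, char *out, size_t len)
{
    char *end;
    long secs;
    int n;

    if (out == NULL || len == 0)
        return 0;
    out[0] = '\0';

    if (query == NULL || *query == '\0')
        return 0;

    /* Accept plain decimal numbers only, e.g. "5". */
    secs = strtol(query, &end, 10);
    if (end == query || *end != '\0' || secs < 1 || secs > 3600)
        return 0;

    n = snprintf(out, len,
                 "<meta http-equiv=\"refresh\" content=\"%ld\">", secs);
    if (n < 0 || (size_t)n >= len) {
        out[0] = '\0';          /* output did not fit */
        return 0;
    }
    return n;
}
```

With the request .../tool.cgi?5 the handler would pass "5" as query and emit
the resulting element in the page header.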

============================================================================

I've added some code to check a valid TCP/IP connection with timeout:

    // tcpsock.c /////////////////////////////////////////////////////////////

    int NutTcpConnectTO(TCPSOCKET * sock, u_long addr, u_short port,
                        u_long timeout)
    {
        ...
        return NutTcpStateActiveOpenEventTO(sock, timeout);
    }

    // tcpsm.c ///////////////////////////////////////////////////////////////

    int NutTcpStateActiveOpenEventTO(TCPSOCKET * sock, u_long timeout)
    {
        ...
        NutEventWait(&sock->so_ac_tq, timeout);
        ...
    }

    //////////////////////////////////////////////////////////////////////////

Maybe you could adopt this in a later Nut/OS release. It is very helpful
to have connection attempts that cannot block indefinitely!
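
For comparison, the same idea on a desktop host can be sketched with POSIX
sockets. This is an analogue only, not the Nut/OS API and not the patch
outlined above; the function name connect_with_timeout is hypothetical.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Host-side analogue (POSIX, not Nut/OS): try to connect to ip:port and
 * give up after timeout_ms milliseconds.
 * Returns a connected socket descriptor, or -1 on error or timeout.
 */
int connect_with_timeout(const char *ip, unsigned short port, long timeout_ms)
{
    struct sockaddr_in sa;
    struct timeval tv;
    fd_set wfds;
    socklen_t len;
    int fd, flags, err = 0;

    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &sa.sin_addr) != 1)
        return -1;

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* Switch to non-blocking mode so connect() returns immediately. */
    flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        goto fail;

    if (connect(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
        if (errno != EINPROGRESS)
            goto fail;

        /* Wait until the socket is writable or the timeout expires. */
        FD_ZERO(&wfds);
        FD_SET(fd, &wfds);
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        if (select(fd + 1, NULL, &wfds, NULL, &tv) <= 0)
            goto fail;          /* timeout or select() error */

        /* Writable does not imply success; fetch the real result. */
        len = sizeof(err);
        if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0 || err != 0)
            goto fail;
    }

    /* Restore blocking mode for the caller. */
    if (fcntl(fd, F_SETFL, flags) < 0)
        goto fail;
    return fd;

fail:
    close(fd);
    return -1;
}
```

The caller gets either a usable descriptor or -1 within roughly timeout_ms,
which is the behaviour the timeout parameter is meant to give on the board.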

============================================================================

My implementation is mainly based on the httpserv.c sample:

int main(void)
{
  ///////////////////////////////////////////////////////////////////////////
  //
  // Initialize the uart.
  //
  ///////////////////////////////////////////////////////////////////////////

  NutRegisterDevice(&devUart0, 0, 0);

  freopen("uart0", "w", stdout);
  freopen("uart0", "r", stdin);

  u_long baud       = 115200;
  _ioctl(_fileno(stdin), UART_SETSPEED, &baud);

  NutSleep(200);
  printf_P(PSTR("Nut/OS %s HTTP Daemon.\n"), NutVersionString());

  ///////////////////////////////////////////////////////////////////////////
  //
  // Register Realtek controller at address 8300 hex and interrupt 5.
  //
  ///////////////////////////////////////////////////////////////////////////

  if(NutRegisterDevice(&devEth0, 0x8300, 5))
  {
    puts_P(PSTR("Error: Registering device failed.\n"));
    RESET;
  }

#ifdef NUTOS_3_9_5
  strncpy(confos.hostname,
		  #include "hostname.def"
          , sizeof(confos.hostname)-1);
#else //NUTOS_3_4_1
  setDhcpHostname(
#                 include "hostname.def"
  );
#endif

  ///////////////////////////////////////////////////////////////////////////

  u_char mac[32] =
  {
#include "mac.def"
  };


  if (NutDhcpIfConfig("eth0", mac, 60000))
  {
    puts_P(PSTR("DHCP config failed.\n"));
    RESET;
  }

  printf_P(PSTR("\nIP:                  %s\n"), inet_ntoa(confnet.cdn_ip_addr));
  printf_P(PSTR("MASK:                %s\n"), inet_ntoa(confnet.cdn_ip_mask));
  printf_P(PSTR("GATEWAY:             %s\n"), inet_ntoa(confnet.cdn_gateway));

  ///////////////////////////////////////////////////////////////////////////
  //
  // Register our UROM.
  //
  ///////////////////////////////////////////////////////////////////////////

  NutRegisterDevice(&devUrom, 0, 0);

  ///////////////////////////////////////////////////////////////////////////
  //
  // Init CGI
  //
  ///////////////////////////////////////////////////////////////////////////

  initCGI();

  ///////////////////////////////////////////////////////////////////////////
  //
  // Init ports
  //
  ///////////////////////////////////////////////////////////////////////////

  DDRD = (1<<SERVER_SWITCH) | (1<<DESKTOP_SWITCH);
  PORTD = 0xff;

  ///////////////////////////////////////////////////////////////////////////
  //
  // HTTP threads[1-4]
  //
  ///////////////////////////////////////////////////////////////////////////

  int i;
  char thname[] = "httpd_0";  // writable buffer; a string literal must not be modified

  for(i = 1; i <= 4; i++)
  {
    sprintf(thname+6,"%01d",i);
    NutThreadCreate(thname, Service, (void *)(u_short)i, 640);
  }

  ///////////////////////////////////////////////////////////////////////////
  //
  // 433MHz co-processor via UART1 thread
  //
  ///////////////////////////////////////////////////////////////////////////

  NutThreadCreate("uart_1", UartService, (void *)(u_short)i, 640);

  ///////////////////////////////////////////////////////////////////////////
  //
  // Time server thread
  //
  ///////////////////////////////////////////////////////////////////////////


  u_long sntp_server = inet_addr(sntp_server_ip);

  _timezone = -1L * 60L * 60L;       // CET := UTC+1
  if(0 != NutSNTPStartThread (sntp_server, 3600000))
  {
    printf_P(PSTR("\nSNTP client start failed: %s.\n"), sntp_server_ip);
    RESET;
  }
  else
  {
    printf_P(PSTR("SNTP server:         %s\n"), sntp_server_ip);
  }

  NutSleep(1000);
  setResetTime();

  ///////////////////////////////////////////////////////////////////////////
  //
  // Watchdog
  //
  ///////////////////////////////////////////////////////////////////////////

  wdt_enable( WDTO_2S);
  puts_P( PSTR("Watchdog activated."));

  ///////////////////////////////////////////////////////////////////////////
  //
  // Okay! Let's roll...
  //
  ///////////////////////////////////////////////////////////////////////////

  puts_P(PSTR("Ready."));

  NutThreadSetPriority(254);
  int n=0;
  for(;;)
  {
    NutSleep(1000);
    wdt_reset();
    putchar('.');
    n++;
    if(n>10)
    {
      n=0;
      print_threads();
    }
  }
}
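
Note that the main loop above resets the watchdog unconditionally, so a hung
HTTP thread never causes a reset. One possible way to tie the watchdog to
thread progress is sketched below in plain C. All names (NUM_WORKERS,
heartbeat_tick, heartbeat_all_alive) are hypothetical and nothing here is
part of Nut/OS; the sketch also assumes cooperative scheduling, so the
counters are accessed without locking.

```c
#include <stddef.h>

#define NUM_WORKERS 4           /* assumption: one slot per HTTP thread */

/* Each worker increments its own slot whenever it makes progress. */
static volatile unsigned int heartbeat[NUM_WORKERS];
static unsigned int last_seen[NUM_WORKERS];

/* Called by worker `id` from its service loop. */
void heartbeat_tick(size_t id)
{
    if (id < NUM_WORKERS)
        heartbeat[id]++;
}

/*
 * Called by the supervisor once per check interval.
 * Returns 1 if every worker advanced since the previous call, else 0.
 * The snapshot is updated for all workers either way.
 */
int heartbeat_all_alive(void)
{
    int alive = 1;
    size_t i;

    for (i = 0; i < NUM_WORKERS; i++) {
        unsigned int now = heartbeat[i];

        if (now == last_seen[i])
            alive = 0;          /* no progress since the last check */
        last_seen[i] = now;
    }
    return alive;
}
```

The main loop would then call wdt_reset() only when heartbeat_all_alive()
returns 1. A thread that is legitimately blocked waiting for a client would
also look stalled, so this only fits threads that wake up periodically, for
example by waiting with a timeout, and the check interval has to stay below
the watchdog period.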

///////////////////////////////////////////////////////////////////////////

-----Original Message-----
From: en-nut-discussion-bounces at egnite.de
[mailto:en-nut-discussion-bounces at egnite.de]On Behalf Of Harald Kipp
Sent: Wednesday, March 30, 2005 16:19
To: Ethernut User Chat (English)
Subject: Re: [En-Nut-Discussion] Ethernut TCP hangs


Indeed 3.9.1 contained a bug in the TCP state machine,
which had been discovered and fixed some time ago.
See entry 2005-01-02 of
http://cvs.sourceforge.net/viewcvs.py/ethernut/nut/ChangeLog?rev=1.187&view=auto

Version 3.9.5 includes this and the later arpcache fix
up to 2005-02-04. All remaining changes not yet released
should not result in TCP stack problems.

We recently ran a few long-running tests, including heavy traffic
tests, with CVS HEAD up to 2005-03-13 without problems on
Ethernut 1.3 boards. All 8 CPUs were running Nut/OS, one of
them with 14 concurrent threads. Everything seems to be quite
solid.

Is there any possibility that you mismatched the versions?

If you don't want to run cvs, you may also grab the latest
snapshot from
http://www.egnite.de/cvs

If this doesn't solve it, I'm most interested in helping to find
the cause. Maybe you can set up a simple test case. Hopefully
it can be reproduced by a simple TCP application. I'd hate
to click on a browser update button for hours. :-) It should
use concurrent connections, though. If there's a problem, it's
most likely caused by multithreaded transfers.

We also need to know the hardware (Realtek or SMSC Controller)
and the exact compiler version.

Harald

At 15:39 30.03.2005 +0200, you wrote:
>Hi,
>
>I've been using an Ethernut board as an Ethernet server for a long time.
>
>It is running well over a very long period without any problem (last
>known good OS: Nut/OS 3.4.1.1).
>The application is based on the http server sample. I've added some more
>threads to service the uart(s) and a SNTP
>client.
>========================================================================
>Problem:
>To keep my board up-to-date I tried to upgrade to Nut/OS 3.9.1 some
>weeks ago.
>
>Result: I discovered that after several minutes of heavy load on the TCP
>channel, the connection to Ethernet is lost until a reset is performed.
>
>It seems that only the TCP stack fails; the remaining threads (UART,
>SNTP, watchdog ...) keep running, so diagnostic messages sent
>over the UART are transmitted correctly. Only the http thread seems to
>hang!?
>========================================================================
>
>After that I switched back to 3.4.1.1 and everything was fine again
>(rock solid in a very high traffic session running several hours).
>
>In the latest ethernut discussion group posts I read that similar
>problems were discovered by other parties and that you
>worked on it. Finally I got the impression, that you fixed the problem.
>So I tried to use 3.9.5 pre.
>
>Result: Unfortunately it shows almost the same behavior as 3.9.1 ->
>the TCP stack freezes after several minutes of intensive traffic.
>
>Question: Is your fix for the problem you discussed with  Bernd Walter,
>Dusan Ferbas, Roman in thread: [TCP stops working after some time]
>included in the 3.9.5 pre ?
>
>thx in advance
>
>M.Smola
>Munich
