[En-Nut-Discussion] [en-nut-discussion] thread stops executing after some time.

Erik L erik.lindstein at gmail.com
Wed Apr 2 10:31:32 CEST 2008


Hi Jakub.
I used DHCP before, but I had some problems with that as well.
After 24 h the lease expired and the Ethernut tried to get a new IP from the
DHCP server, and when that happens the communication gets disturbed. I guess
one could remove the lease time, but I didn't put any effort into it because I
have other things to gain from using static IP addresses.

Whether the "problem" is there or not when using DHCP, I don't know. I might
give it a try.

Thanks for the input. 
Regards/Erik

 

jakub nowak-2 wrote:
> 
> Maybe try a dynamic IP gateway ... -> if DHCP works fine, at least
> you will know where the problem is.
> 
> 2008/3/29, Erik L <erik.lindstein at gmail.com>:
>>
>>  I don't think the problem is created when I disconnect the PC.
>>  I connected/disconnected the PC lots of times in a short period of time
>>  and the clients still always connect.
>>  Also, I have clients that haven't been connected at all after I powered
>>  them up, and they still get the same problem if I leave them for a couple
>>  of hours before I connect the PC.
>>
>>  I have no route set; the communication is just on the LAN.
>>  (Clients 192.168.0.1-10, server 192.168.0.115.) All IP addresses are static.
>>
>>  /Erik
>>
>>
>>
>>
>>  ernstst wrote:
>>  >
>>  > Hi Erik!
>>  >
>>  > Quote
>>  > But when the problem occurs the software in the client can't get to
>>  > that point where it actually exchanges the data, because the socket
>>  > shouldn't be able to connect.
>>  > " if(NutTcpConnect(socket, rip, TCPSERVERPORT) == 0) "
>>  >
>>  > If the server (rip:TCPSERVERPORT) isn't connected to the LAN, it can't
>>  > return 0, correct?
>>  > And even if it for some reason does, the socket read timeouts should
>>  > occur and it should manage to get past the data exchange routines and
>>  > then start all over again.
>>  > --------------------------------
>>  > Unquote
>>  >
>>  > I am not sure this is correct (i.e. whether the read timeout triggers
>>  > on a NutTcpConnect which does not go through).
>>  >
>>  > I need to think about it, maybe give it a try. A quick look into
>>  > TCPSOCK.C and the like didn't enlighten me ... Maybe the TCP protocol
>>  > state in which the disconnect occurred somehow influences whether the
>>  > reconnect works.
>>  > Another thought: IP routing tables. Are they still "there" when the
>>  > reconnect is attempted? Have you defined a route?
>>  >
>>  > Regards
>>  > Ernst
>>  >
>>  >
>>  >
>>  > -----Original Message-----
>>  > From: en-nut-discussion-bounces at egnite.de
>>  > [mailto:en-nut-discussion-bounces at egnite.de] On behalf of Erik
>>  > Lindstein
>>  > Sent: Tuesday, 25 March 2008 12:21
>>  > To: en-nut-discussion at egnite.de
>>  > Subject: [En-Nut-Discussion] [en-nut-discussion] thread stops
>>  > executing after some time.
>>  >
>>  > Ernst, thank you very much for taking the time to answer.
>>  >
>>  > I'll write some comments down below.
>>  >
>>  > I understand this is a sporadic problem, so it takes a lot of time to
>>  > run into the "error" situation, but anyway:
>>  >
>>  > 1) ... only 1 out of 4 clients were still connecting ...
>>  > When you experience this situation more than once, is it always the
>>  > same Ethernut which can still connect, or is that also "random"? And
>>  > what looks "random" in the first place, is it really?
>>  >
>>  > --------------------------------
>>  > Well, I don't think there is much involving software that is truly
>>  > random :-) so of course this isn't either.
>>  >
>>  > But here it can be any one of the clients that stops executing the
>>  > thread. Usually I can see that after some time (~6-7 h) one or two
>>  > have stopped connecting, and there can be one left running for up to
>>  > 24 h (perhaps longer). But in the end all of them stop trying to
>>  > connect and the thread's sleep time is set to "None".
>>  > --------------------------------
>>  >
>>  >
>>  >
>>  >
>>  > 2) .. (all of them use the same software, just different MACs and IPs) ..
>>  > Even if all of them use the same SW, are they operating under
>>  > same/similar/different conditions? (I mean ".. exchanges some XML
>>  > data, ..": where does this data come from, i.e. how is it generated and
>>  > how different can it be between the Ethernuts?)
>>  >
>>  > --------------------------------
>>  > When the server software on the PC is running and the PC is connected,
>>  > the client socket gets connected and the client sends some values
>>  > read from the A/D and some status variables, and then the PC responds
>>  > with a command that tells the client to do "something".
>>  > Usually the PC just sends a "reset watchdog" command to the client.
>>  >
>>  > But in this case everything works fine as long as the software is
>>  > running and the PC is connected.
>>  > When I then close down the server software, the client gets a command
>>  > that tells it to start resetting the WDT locally.
>>  >
>>  > But when the problem occurs the software in the client can't get to
>>  > that point where it actually exchanges the data, because the socket
>>  > shouldn't be able to connect.
>>  > " if(NutTcpConnect(socket, rip, TCPSERVERPORT) == 0) "
>>  >
>>  > If the server (rip:TCPSERVERPORT) isn't connected to the LAN, it can't
>>  > return 0, correct?
>>  > And even if it for some reason does, the socket read timeouts should
>>  > occur and it should manage to get past the data exchange routines and
>>  > then start all over again.
>>  > --------------------------------
>>  >
>>  >
>>  >
>>  > 3) How about buffer overflows due to "special" tx/rx data conditions
>>  > (length)?
>>  > --------------------------------
>>  > In this case it only happens when (at least I think so) I don't read
>>  > or send any data other than the data that the tcpsm sends out trying to
>>  > connect the socket.
>>  > My code doesn't do any rx/tx until the socket and stream are OK.
>>  >
>>  > --------------------------------
>>  >
>>  >
>>  >
>>  > 4) Try looking "into" the TCP sockets. My bank switch test program at
>>  > http://www.es-business.com/Firma/eng/edocs.htm may help. Include cli.c
>>  > and dump.c in your main program and create a thread as indicated in the
>>  > source.
>>  > It contains a Telnet-based CLI which has a "lists" command which walks
>>  > through and displays all lists known to Nut/OS (TCP sockets is one of
>>  > these). One word of caution: because Nut/OS (i.e. other threads) is
>>  > executing while this command follows the pointers in the various lists
>>  > from one entry to the next, the command may loop if a list is updated
>>  > (by or on behalf of an app thread) right when a pointer in that list is
>>  > being used by the "lists" command itself.
>>  > The dump command may help you peek around in RAM.
>>  >
>>  > --------------------------------
>>  > I'll look into that, thanks.
>>  >
>>  > --------------------------------
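
The caution in point 4 (a list walk that may loop when the list changes
underneath it) can be softened by bounding the walk. This is only a minimal
sketch in plain C: struct node and walk_bounded() are hypothetical names made
up for illustration, not the Nut/OS list types or anything from cli.c.

```c
#include <stddef.h>

/* Hypothetical node type; NOT the Nut/OS socket or thread structure. */
struct node {
    int id;
    struct node *next;
};

/*
 * Walk a singly linked list, but give up after max_hops entries.
 * Returns the number of nodes visited, or -1 if the limit was reached
 * before the end of the list (possible cycle or concurrent update).
 */
static int walk_bounded(const struct node *head, int max_hops)
{
    int visited = 0;
    const struct node *p;

    for (p = head; p != NULL; p = p->next) {
        if (visited == max_hops) {
            return -1;      /* too many hops: stop instead of looping */
        }
        visited++;
    }
    return visited;
}
```

A limit somewhat above the largest list length one expects keeps a diagnostic
command from hanging. It does not make the walk safe against a node being
freed while it is visited; that would need the list to be protected while it
is read.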
>>  >
>>  >
>>  >
>>  > 5) Is there a possibility to have Wireshark monitoring the TCP/IP link
>>  > up until the PC gets disconnected? This way, you could find out what
>>  > was exchanged immediately before the disconnect happened, and maybe
>>  > this gives more info about the internal status of the Ethernuts and the
>>  > TCP/IP connection states.
>>  >
>>  > --------------------------------
>>  > The PC only gets disconnected when I remove the LAN cable, but I could
>>  > monitor the data until that point.
>>  > But I can disconnect and connect the cable many (unlimited?) times
>>  > and there are no problems. The clients always connect again if I don't
>>  > leave the PC unconnected for a longer period of time (> ~5-6 h).
>>  >
>>  > One possibility might be to have the switch I use set up to echo all
>>  > traffic out on another port and monitor the traffic there with
>>  > Wireshark. That way I might be able to see what happens before it stops
>>  > working.
>>  > But if I have the PC connected in the "normal" way, the problem doesn't
>>  > occur.
>>  > --------------------------------
>>  >
>>  >
>>  >
>>  > 6) Do you log the state of the TCP/IP connection between the Ethernuts
>>  > and the PC within the PC? Maybe such a log (record length / contents)
>>  > could provide some more info.
>>  >
>>  > --------------------------------
>>  > Because the problem only occurs when the PC isn't connected, this
>>  > is hard to do.
>>  > I can log the traffic when everything works fine, but I'm not sure it
>>  > gives away the problem; perhaps someone with more knowledge of TCP/IP
>>  > can see something here.
>>  > --------------------------------
>>  >
>>  >
>>  >
>>  >
>>  > 7) The most important question is:
>>  > Is the problem caused by behaviour in Nut/OS or Nut/Net (IP stack,
>>  > timers, events etc.)
>>  > or
>>  > is the problem caused by some behaviour in the application threads?
>>  > Is there any chance to "strip down" the application threads to try to
>>  > minimize their possible impact on the situation?
>>  >
>>  > --------------------------------
>>  > I minimized the software to only include the thread for the client
>>  > socket and one tcpserver thread. But this still happens. I'll try to
>>  > remove some more code in the client thread and see if it changes
>>  > anything.
>>  > I did get the feeling that this software took a longer time before it
>>  > stopped trying to connect. But that's not 100% verified. Anyway, it
>>  > still stops.
>>  > --------------------------------
>>  >
>>  >
>>  > 8) I have an example here of a test app, which produces the following
>>  > threads list:
>>  >
>>  > CLI>threads
>>  > Name    Status  Prio    Stack   Memory  Timeout  INFO-addr  Bank
>>  > CMDLINE Run      64        891  OK      None        36C9     -1
>>  > XHTST   Sleep    64        357  OK      6           3203      9
>>  > XHTST   Ready    64        357  OK      None        2F3D      8
>>  > XHTST   Sleep    64        357  OK      None        2C77      7
>>  > XHTST   Ready    64        357  OK      None        29B1      6
>>  > XHTST   Sleep    64        357  OK      24          26EB      5
>>  > XHTST   Ready    64        357  OK      None        2425      4
>>  > XHTST   Sleep    64        357  OK      13          215F      3
>>  > XHTST   Ready    64        357  OK      None        1E99      2
>>  > XHTST   Sleep    64        357  OK      6           1BD3      1
>>  > tcpsm   Sleep    32        468  OK      102         1925     -1
>>  > XHTST   Sleep    64        357  OK      None        16C1      0
>>  > rxi5    Sleep     9        603  OK      699         145F     -1
>>  > main    Sleep   200        705  OK      940         1041     -1
>>  > idle    Ready   254        356  OK      None         D21     -1
>>  >
>>  > The XHTST threads are looping apps which work in memory, display info
>>  > via TCP/IP to a telnet client and sometimes sleep for a random time.
>>  > There are threads which are Sleeping and do not have a Timeout
>>  > associated with them! (Maybe when they are waiting for the telnet
>>  > output to complete?)
>>  >
>>  > --------------------------------
>>  >
>>  > I have no idea, but perhaps when
>>  > " NutTcpConnect(socket, rip, TCPSERVERPORT) " executes it sets the
>>  > thread's timeout to "None" and then waits for some event from the
>>  > tcpsm that then never occurs.
>>  >
>>  > --------------------------------
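
If a thread really can end up waiting without a timeout for an event that
never comes, another thread can at least notice it. What follows is a rough
sketch of a heartbeat check in plain C, not a fix and not Nut/OS code:
struct heartbeat, heartbeat_kick() and heartbeat_stalled() are hypothetical
names, and the millisecond value is assumed to come from something like
NutGetMillis().

```c
#include <stdint.h>

/* Hypothetical bookkeeping for one supervised thread. */
struct heartbeat {
    volatile uint8_t beats;   /* one byte, so a read is atomic on an 8-bit CPU */
    uint8_t last_seen;        /* counter value the supervisor saw last time */
    uint32_t last_change_ms;  /* time at which the counter last changed */
};

/* Called by the worker once per pass through its for(;;) loop. */
static void heartbeat_kick(struct heartbeat *hb)
{
    hb->beats++;
}

/*
 * Called periodically by a supervisor thread. Returns 1 if the worker has
 * not kicked for longer than limit_ms, otherwise 0. The supervisor must
 * poll more often than once per 256 worker passes, because the counter
 * wraps. Unsigned subtraction keeps the comparison valid when the
 * millisecond counter itself wraps.
 */
static int heartbeat_stalled(struct heartbeat *hb, uint32_t now_ms,
                             uint32_t limit_ms)
{
    uint8_t seen = hb->beats;

    if (seen != hb->last_seen) {
        hb->last_seen = seen;
        hb->last_change_ms = now_ms;
        return 0;
    }
    return (uint32_t)(now_ms - hb->last_change_ms) > limit_ms;
}
```

The worker would call heartbeat_kick() next to its NutSleep(300), and a
thread that keeps running (the wdt thread, for example) would call
heartbeat_stalled() with a limit of a few seconds. What to do on a stall
(log it, show it on the LCD, stop resetting the watchdog) is left open here.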
>>  >
>>  >
>>  > I am quite sure you have thought about some (if not all) of this
>>  > already, but maybe it "kicks" off some more thoughts.
>>  >
>>  > Good luck
>>  > Regards
>>  > Ernst
>>  >
>>  >
>>  > -----Original Message-----
>>  > From: en-nut-discussion-bounces at egnite.de
>>  > [mailto:en-nut-discussion-bounces at egnite.de] On behalf of Erik
>>  > Lindstein
>>  > Sent: Monday, 24 March 2008 17:13
>>  > To: en-nut-discussion at egnite.de
>>  > Subject: [En-Nut-Discussion] Thread stops executing after some time.
>>  >
>>  > Guys please help me out.
>>  > I'm on a wild goose chase trying to figure out what is happening with
>>  > a thread that handles communications with a PC thru a tcp/ip socket.
>>  >
>>  > The setup is:
>>  > Ethernut V2.1
>>  > Software 4.4.0
>>  >
>>  > The software is built up of a couple of threads, each handling some
>>  > functions (LCD, push buttons, user functions etc.).
>>  > Then I have one thread that communicates with server software on my
>>  > PC.
>>  > The communication is pretty simple.
>>  > It's a client socket that connects to my server PC and then exchanges
>>  > some XML data, disconnects, sleeps for 300 ms and then starts all over
>>  > again.
>>  >
>>  > This works fine for weeks without any problems if I have the server PC
>>  > up and running and connected to the same LAN my Ethernut is connected
>>  > to.
>>  >
>>  > Then one day I disconnected the server PC from the LAN and left a
>>  > couple of the Ethernut clients running over the weekend. On
>>  > Monday I connected my PC again and started up the server software, but
>>  > I noticed that only 1 out of 4 clients were still connecting (all of
>>  > them use the same software, just different MACs and IPs).
>>  >
>>  > I looked at the incoming traffic with Wireshark and could not see any
>>  > sign of life at all from the 3 clients not connecting.
>>  > I tried to ping them and they all answer pings, and all other
>>  > threads that handle the LCD and push buttons are still up and running,
>>  > so the software is not dead.
>>  > I tried deactivating/activating the network connection on my PC to see
>>  > if any of the clients woke up. No luck.
>>  >
>>  > I then added another thread to the software (I took the sample code
>>  > for the tcps in the apps dir and created a thread to run that code).
>>  > When everything is OK I see the "inetd" thread timeout counting
>>  > all the time and the thread executes as expected.
>>  >
>>  > When the inetd thread stops executing I can connect to the unit and I
>>  > get the output seen below:
>>  >
>>  > ------------------------------------------------------------------------
>>  > 220 List of threads with name,state,prio,stack,mem,timeout follows
>>  > tcpsm   Sleep   32      461     OK      27
>>  > TcpS    Run     64      2546    OK      None
>>  > inetd   Sleep   64      2381    OK      None
>>  > rxi5    Sleep   9       603     OK      1392
>>  > wdt     Sleep   40      255     OK      8
>>  > SmuTh   Sleep   64      65      OK      71
>>  > PcuTh   Sleep   64      805     OK      1
>>  > HvpsTh  Sleep   64      605     OK      24
>>  > IppsTh  Sleep   64      965     OK      4
>>  > TaTh    Sleep   64      65      OK      35
>>  > LcdTh   Sleep   64      929     OK      34
>>  > main    Sleep   64      733     OK      451
>>  > idle    Ready   254     356     OK      None
>>  >
>>  > ------------------------------------------------------------------------
>>  >
>>  > For some reason the thread (inetd) just gets a timeout set to "None"
>>  > instead of the NutSleep value.
>>  >
>>  > I only have one place in the code that sets the thread to sleep, and I
>>  > have a fixed value of 300 there (NutSleep(300)).
>>  > So there must be somewhere else in the code where the thread gets put
>>  > into some wait state, but I have no idea how to figure out where and
>>  > why this happens.
>>  >
>>  > Can it be something that happens when the socket tries to connect to an
>>  > IP/server that doesn't exist on the LAN?
>>  >
>>  > If it happened all the time it would be easier to figure out what's
>>  > wrong, but this can run for days without happening.
>>  > Also, if memory were low the tcps thread wouldn't answer the
>>  > incoming connection attempts, I guess.
>>  >
>>  > The thread code is below:
>>  >
>>  > ------------------------------------------------------------------------
>>  > THREAD(InetdThread, arg)
>>  > {
>>  >     TCPSOCKET *socket;
>>  >     FILE *stream = 0;
>>  >     u_long rip = inet_addr("192.168.0.115");
>>  >     u_long tmo = 500;
>>  >     int socket_error = 0;
>>  >     uint8_t *start = 0, *stop = 0;
>>  >     uint8_t unit[20], cmd[40], value[40];
>>  >     uint8_t data_exchange_buffer[100] = "0";
>>  >
>>  >     for(;;)
>>  >     {
>>  >         if ((socket = NutTcpCreateSocket()) != 0)
>>  >         {
>>  >             NutTcpSetSockOpt(socket, SO_RCVTIMEO, &tmo, sizeof(tmo));
>>  >             NutTcpSetSockOpt(socket, SO_SNDTIMEO, &tmo, sizeof(tmo));
>>  >             if(NutTcpConnect(socket, rip, TCPSERVERPORT) == 0)
>>  >             {
>>  >                 stream = _fdopen((int) ((uptr_t) socket), "r+b");
>>  >                 if(stream != 0)
>>  >                 {
>>  >                     fprintf_P(stream, info_P, INFO_P_ARGS); // Send some XML DATA
>>  >                     fflush(stream);
>>  >                     fgets(data_exchange_buffer, sizeof(data_exchange_buffer), stream); // Get some XML DATA
>>  >                     {
>>  >                         // Handle XML data
>>  >                     }
>>  >                     fclose(stream);
>>  >                     /*
>>  >                         info_text is an extern variable that another
>>  >                         thread prints on the LCD for debug output.
>>  >                     */
>>  >                     sprintf(info_text ,"COK\n%lu", (u_long)NutGetMillis());
>>  >                 }
>>  >             }
>>  >             else
>>  >             {
>>  >                 socket_error = NutTcpError(socket);
>>  >                 sprintf(info_text ,"CE:%d \n%lu", socket_error, (u_long)NutGetMillis());
>>  >             }
>>  >             NutTcpCloseSocket(socket);
>>  >         }
>>  >         NutSleep(300);
>>  >     }
>>  > }
>>  > ------------------------------------------------------------------------
>>  >
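
On the question of how to find out where the thread ends up waiting: one
low-tech option is to record a "stage" just before every call in the loop
that can block and let another thread display it. This is a sketch under
assumptions, not tested on an Ethernut: the enum, inetd_stage, stage_set()
and stage_name() are hypothetical names invented here; only the Nut/OS calls
mentioned in the comment come from the listing above.

```c
#include <stdint.h>

/* Hypothetical stage codes, one per potentially blocking call in the loop. */
enum inetd_stage_code {
    ST_IDLE = 0,
    ST_CREATE,
    ST_CONNECT,
    ST_WRITE,
    ST_READ,
    ST_CLOSE,
    ST_SLEEP
};

/* Written by the client thread, read by a display or CLI thread. */
static volatile uint8_t inetd_stage = ST_IDLE;

static void stage_set(enum inetd_stage_code s)
{
    inetd_stage = (uint8_t)s;
}

/* Translate a stage code to text for the LCD or a telnet dump. */
static const char *stage_name(uint8_t s)
{
    static const char *const names[] = {
        "idle", "create", "connect", "write", "read", "close", "sleep"
    };
    return (s < sizeof(names) / sizeof(names[0])) ? names[s] : "?";
}

/*
 * Intended use inside the loop from the listing above (not compiled here):
 *
 *     stage_set(ST_CONNECT);
 *     if (NutTcpConnect(socket, rip, TCPSERVERPORT) == 0) { ... }
 *     ...
 *     stage_set(ST_CLOSE);
 *     NutTcpCloseSocket(socket);
 *     stage_set(ST_SLEEP);
 *     NutSleep(300);
 */
```

When the thread next shows timeout "None", the last stage written tells
which call it never returned from, for instance "connect" versus "close".
The stage could be appended to info_text so it shows up on the LCD.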
>>  >
>>  >
>>  > --
>>  > /Erik
>>  > _______________________________________________
>>  > http://lists.egnite.de/mailman/listinfo/en-nut-discussion
>> --
>>  View this message in context:
>> http://www.nabble.com/-en-nut-discussion--thread-stops-executing-after-some-time.-tp16277335p16368466.html
>>  Sent from the MicroControllers - Ethernut mailing list archive at
>> Nabble.com.

-- 
View this message in context: http://www.nabble.com/-en-nut-discussion--thread-stops-executing-after-some-time.-tp16277335p16445361.html
Sent from the MicroControllers - Ethernut mailing list archive at Nabble.com.



