[En-Nut-Discussion] [en-nut-discussion] thread stops executing after some time.

jakub nowak jdnowak at gmail.com
Wed Apr 2 00:06:06 CEST 2008


Maybe try a dynamic IP/gateway setup: if DHCP works fine, at least
you will know where the problem is.

2008/3/29, Erik L <erik.lindstein at gmail.com>:
>
>  I don't think the problem is created when I disconnect the PC.
>  I connected/disconnected the PC lots of times in a short period of time and
>  the clients still always connect.
>  Also, I have clients that haven't been connected at all after I powered them
>  up, and they still get the same problem if I leave them for a couple of hours
>  before I connect the PC.
>
>  I have no route set; the communication is just on the LAN.
>  (Clients 192.168.0.1-10, server 192.168.0.115.) All IP addresses are static.
>
>  /Erik
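
For reference: with no route defined, a stack normally delivers a packet directly only when the peer sits in the same subnet as the local interface. Below is a minimal sketch of that check in plain C. The netmask 255.255.255.0 is an assumption (the thread does not state it), and `ipv4` and `same_subnet` are hypothetical helper names, not Nut/Net functions.

```c
#include <stdint.h>

/* Build a host-order IPv4 address from its four dotted-quad octets.
 * Hypothetical helper for illustration only. */
uint32_t ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8)  |  (uint32_t)d;
}

/* Return 1 if both addresses share the same network part, i.e. the peer
 * can be reached directly on the LAN without a gateway; 0 otherwise. */
int same_subnet(uint32_t local, uint32_t peer, uint32_t netmask)
{
    return (local & netmask) == (peer & netmask);
}
```

Under the assumed /24 mask, the clients (192.168.0.1-10) and the server (192.168.0.115) are on the same link, so a missing route should not matter. With a narrower mask such as 255.255.255.192 they would not be.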
>
>
>
>
>  ernstst wrote:
>  >
>  > Hi Erik!
>  >
>  > Quote
>  > But when the problem occurs, the software in the client can't get to the
>  > point where it actually exchanges the data, because the socket shouldn't
>  > be able to connect.
>  > " if(NutTcpConnect(socket, rip, TCPSERVERPORT) == 0) "
>  >
>  > If the server (rip:TCPSERVERPORT) isn't connected to the LAN, it can't
>  > return 0, correct?
>  > And even if it for some reason does, the socket read timeouts should occur
>  > and it should manage to get past the data exchange routines and then start
>  > all over again.
>  > --------------------------------
>  > Unquote
>  >
>  > I am not sure this is correct (i.e. whether the read timeout triggers on a
>  > NutTcpConnect that does not go through).
>  >
>  > I need to think about it, maybe give it a try. A quick look into TCPSOCK.C
>  > and the like didn't enlighten me ... Maybe the TCP protocol state in which
>  > the disconnect occurred somehow influences whether the reconnect works.
>  > Another thought: IP routing tables. Are they still "there" when the
>  > reconnect is attempted? Have you defined a route?
>  >
>  > Regards
>  > Ernst
>  >
>  >
>  >
>  > -----Original Message-----
>  > From: en-nut-discussion-bounces at egnite.de
>  > [mailto:en-nut-discussion-bounces at egnite.de] On behalf of Erik Lindstein
>  > Sent: Tuesday, 25 March 2008 12:21
>  > To: en-nut-discussion at egnite.de
>  > Subject: [En-Nut-Discussion] [en-nut-discussion] thread stops
>  > executing after some time.
>  >
>  > Ernst, thank you very much for taking time to answer.
>  >
>  > I'll write some comments down below.
>  >
>  > I understand this is a sporadic problem so it takes a lot of time to run
>  > into the "error" situation, but anyway:
>  >
>  > 1) ... only 1 out of 4 clients were still connecting ...
>  > When you experience this situation more than once, is it always the same
>  > Ethernut which can still connect or is that also "random". And what looks
>  > "random" in the first place, is it really?
>  >
>  > --------------------------------
>  > Well, I don't think there is much involving software that is truly
>  > random :-) so of course this isn't either.
>  >
>  > But here it can be any one of the clients that stops executing the
>  > thread. Usually I can see that after some time (~6-7 h) one or two
>  > have stopped connecting, and there can be one left running for up to 24 h
>  > (perhaps longer). But in the end all of them stop trying to connect
>  > and the thread's timeout is set to "None".
>  > --------------------------------
>  >
>  >
>  >
>  >
>  > 2) .. (all of them use the same software just different MACs and IPs ) ..
>  > Even if all of them use the same SW, are they operating under
>  > same/similar/different conditions? (I mean ".. exchanges some XML data,..":
>  > where does this data come from, i.e. how is it generated and how different
>  > can it be between the Ethernuts?)
>  >
>  > --------------------------------
>  > When the server software on the PC is running and the PC is connected,
>  > the client socket gets connected and the client sends some values
>  > read from the A/D plus some status variables, and then the PC responds
>  > with a command that tells the client to do "something".
>  > Usually the PC just sends a "reset watchdog" command to the client.
>  >
>  > But in this case everything works fine as long as the software is
>  > running and the PC is connected.
>  > When I then close down the server software, the client gets a command
>  > that tells it to start resetting the WDT locally.
>  >
>  > But when the problem occurs, the software in the client can't get to
>  > the point where it actually exchanges the data, because the socket
>  > shouldn't be able to connect.
>  > " if(NutTcpConnect(socket, rip, TCPSERVERPORT) == 0) "
>  >
>  > If the server (rip:TCPSERVERPORT) isn't connected to the LAN, it can't
>  > return 0, correct?
>  > And even if it for some reason does, the socket read timeouts should
>  > occur and it should manage to get past the data exchange routines and
>  > then start all over again.
>  > --------------------------------
>  >
>  >
>  >
>  > 3) How about buffer overflows due to "special" tx/rx data conditions
>  > (length)?
>  > --------------------------------
>  > In this case it only happens when (at least I think so) I don't read or
>  > send any data other than what the tcpsm sends out while trying to
>  > connect the socket.
>  > My code doesn't do any rx/tx until the socket and stream are OK.
>  >
>  > --------------------------------
>  >
>  >
>  >
>  > 4) Try looking "into" the TCP sockets. My bank switch test program at
>  > http://www.es-business.com/Firma/eng/edocs.htm may help. Include cli.c and
>  > dump.c in your main program and create a thread as indicated in the source.
>  > It contains a Telnet-based CLI which has a "lists" command that walks
>  > through and displays all lists known to Nut/OS (TCP sockets is one of
>  > these). One word of caution: because Nut/OS (i.e. other threads) keeps
>  > executing while this command follows the pointers from one list entry to
>  > the next, the command may loop if a list is updated (by or on behalf of
>  > an app thread) right when the "lists" command itself is using that
>  > pointer.
>  > The dump command may help you peek around in RAM.
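
The caution about the "lists" command looping can be illustrated with a hop budget: the walker gives up after a fixed number of entries instead of following pointers forever. This is only a sketch in plain C. `struct node` and `walk_bounded` are hypothetical names for illustration, not the actual Nut/OS list structures or the code in cli.c.

```c
#include <stddef.h>

/* Hypothetical singly linked list entry, for illustration only. */
struct node {
    struct node *next;
    int id;
};

/* Visit at most max_hops entries starting at head.
 * Returns the number of entries visited, or -1 if the hop budget ran out
 * before the end of the list was reached (a loop, or a list that changed
 * underneath the walker). visit may be NULL to just count entries. */
int walk_bounded(const struct node *head, int max_hops,
                 void (*visit)(const struct node *))
{
    int hops = 0;
    const struct node *n;

    for (n = head; n != NULL; n = n->next) {
        if (hops == max_hops)
            return -1;          /* budget exhausted: stop instead of spinning */
        if (visit != NULL)
            visit(n);
        hops++;
    }
    return hops;
}
```

A budget somewhat above the largest plausible list length keeps a diagnostic command from hanging the Telnet session, at the cost of possibly truncating a genuinely long list.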
>  >
>  > --------------------------------
>  > I'll look into that, thanks.
>  >
>  > --------------------------------
>  >
>  >
>  >
>  > 5) Is there a possibility to have wireshark monitoring the TCP/IP link up
>  > until the PC gets disconnected? This way, you could find out what was
>  > exchanged immediately before the disconnect happened and maybe this gives
>  > more info about the internal status of the Ethernuts and the TCP/IP
>  > connection states.
>  >
>  > --------------------------------
>  > The PC only gets disconnected when I remove the LAN cable, but I could
>  > monitor the data until that point.
>  > But I can disconnect and connect the cable many (unlimited?) times
>  > and there are no problems. The clients always connect again if I don't
>  > leave the PC unconnected for a longer period of time (> ~5-6 h).
>  >
>  > One possibility might be to have the switch I use set up to mirror all
>  > traffic out on another port and monitor the traffic there with
>  > Wireshark. That way I might be able to see what happens before it stops
>  > working.
>  > But if I have the PC connected in the "normal" way, the problem doesn't
>  > occur.
>  > --------------------------------
>  >
>  >
>  >
>  > 6) Do you log the state of the TCP/IP connection between the Ethernuts and
>  > the PC within the PC? Maybe such log (record length / contents) could
>  > provide some more info.
>  >
>  > --------------------------------
>  > Because the problem only occurs when the PC isn't connected, this
>  > is hard to do.
>  > I can log the traffic when everything works fine, but I'm not sure it
>  > gives away the problem. Perhaps someone with more knowledge of TCP/IP
>  > can see something here.
>  > --------------------------------
>  >
>  >
>  >
>  >
>  > 7) The most important question is:
>  > Is the problem caused by behaviour in Nut/OS or Nut/Net (IP stack, timers,
>  > events etc)
>  > Or
>  > Is the problem caused by some behaviour in the application threads?
>  > Is there any chance to "strip down" the application threads to try to
>  > minimize their possible impact on the situation?
>  >
>  > --------------------------------
>  > I minimized the software to include only the thread for the client socket
>  > and one tcpserver thread, but this still happens. I'll try to remove
>  > some more code in the client thread and see if it changes anything.
>  > I did get the feeling that this software took longer before it
>  > stopped trying to connect, but that's not 100% verified. Anyway, it
>  > still stops.
>  > --------------------------------
>  >
>  >
>  > 8) I have an example here of a test app, which produces the following
>  > threads list:
>  >
>  > CLI>threads
>  > Name    Status  Prio    Stack   Memory  Timeout  INFO-addr  Bank
>  > CMDLINE Run      64        891  OK      None        36C9     -1
>  > XHTST   Sleep    64        357  OK      6           3203      9
>  > XHTST   Ready    64        357  OK      None        2F3D      8
>  > XHTST   Sleep    64        357  OK      None        2C77      7
>  > XHTST   Ready    64        357  OK      None        29B1      6
>  > XHTST   Sleep    64        357  OK      24          26EB      5
>  > XHTST   Ready    64        357  OK      None        2425      4
>  > XHTST   Sleep    64        357  OK      13          215F      3
>  > XHTST   Ready    64        357  OK      None        1E99      2
>  > XHTST   Sleep    64        357  OK      6           1BD3      1
>  > tcpsm   Sleep    32        468  OK      102         1925     -1
>  > XHTST   Sleep    64        357  OK      None        16C1      0
>  > rxi5    Sleep     9        603  OK      699         145F     -1
>  > main    Sleep   200        705  OK      940         1041     -1
>  > idle    Ready   254        356  OK      None         D21     -1
>  >
>  > The XHTST threads are looping apps that work in memory, display info via
>  > TCP/IP to a telnet client and sometimes sleep for a random time.
>  > There are threads which are sleeping and do not have a timeout associated
>  > with them! (Maybe when they are waiting for the telnet output to
>  > complete?)
>  >
>  > --------------------------------
>  >
>  > I have no idea, but perhaps when
>  > " NutTcpConnect(socket, rip, TCPSERVERPORT) " executes, it sets the
>  > thread's timeout to "None" and then waits for some event from the tcpsm
>  > that never occurs.
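
If the thread really does end up waiting forever, another thread can at least notice it. The sketch below shows a heartbeat timestamp with a wrap-safe age check. This is an illustration in plain C, not something from the posted code: `struct heartbeat`, `heartbeat_kick` and `heartbeat_is_stale` are hypothetical names, and the millisecond timestamps are assumed to come from a free-running counter such as the NutGetMillis() value the posted thread already prints.

```c
#include <stdint.h>

/* Hypothetical heartbeat record shared between the communication thread
 * (which kicks it once per loop pass) and a supervising thread. */
struct heartbeat {
    volatile uint32_t last_ms;   /* time of the most recent kick */
};

/* Called by the communication thread at the end of every loop pass. */
void heartbeat_kick(struct heartbeat *hb, uint32_t now_ms)
{
    hb->last_ms = now_ms;
}

/* Called by the supervisor. Returns 1 if the last kick is older than
 * limit_ms. The unsigned subtraction stays correct when the millisecond
 * counter wraps around from 0xFFFFFFFF to 0. */
int heartbeat_is_stale(const struct heartbeat *hb, uint32_t now_ms,
                       uint32_t limit_ms)
{
    return (uint32_t)(now_ms - hb->last_ms) > limit_ms;
}
```

A supervisor could, for example, stop resetting the watchdog once the heartbeat is stale, so the unit restarts instead of staying silent. Whether that is acceptable depends on the application.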
>  >
>  > --------------------------------
>  >
>  >
>  > I am quite sure you have thought about some (if not all) of this already,
>  > but maybe it "kicks" off some more thoughts.
>  >
>  > Good luck
>  > Regards
>  > Ernst
>  >
>  >
>  > -----Original Message-----
>  > From: en-nut-discussion-bounces at egnite.de
>  > [mailto:en-nut-discussion-bounces at egnite.de] On behalf of Erik
>  > Lindstein
>  > Sent: Monday, 24 March 2008 17:13
>  > To: en-nut-discussion at egnite.de
>  > Subject: [En-Nut-Discussion] Thread stops executing after some time.
>  >
>  > Guys please help me out.
>  > I'm on a wild goose chase trying to figure out what is happening with
>  > a thread that handles communications with a PC thru a tcp/ip socket.
>  >
>  > The setup is:
>  > Ethernut V2.1
>  > Software 4.4.0
>  >
>  > The software is built from a couple of threads, each handling some
>  > functions (LCD, push buttons, user functions etc.).
>  > Then I have one thread that communicates with server software on my PC.
>  > The communication is pretty simple.
>  > It's a client socket that connects to my server PC, exchanges
>  > some XML data, disconnects, sleeps for 300 ms and then starts all over
>  > again.
>  >
>  > This works fine for weeks without any problems if I have the server PC
>  > up and running and connected to the same LAN my Ethernut is connected
>  > to.
>  >
>  > Then one day I disconnected the server PC from the LAN and left a
>  > couple of the Ethernut clients running over the weekend. On
>  > Monday I connected my PC again and started up the server software, but
>  > I noticed that only 1 out of 4 clients were still connecting (all of
>  > them use the same software, just different MACs and IPs).
>  >
>  > I looked at the incoming traffic with Wireshark and could not see any
>  > sign of life at all from the 3 clients not connecting.
>  > I tried to ping them and they all answer pings, and all the other
>  > threads that handle the LCD and push buttons are still up and running,
>  > so the software is not dead.
>  > I tried deactivating/activating the network connection on my PC to see
>  > if any of the clients woke up. No luck.
>  >
>  > I then added another thread to the software (I took the sample code
>  > for the tcps in the apps dir and created a thread to run that code).
>  > When everything is OK, I see the "inetd" thread timeout counting
>  > all the time and the thread executes as expected.
>  >
>  > When the inetd thread stops executing, I can connect to the unit and I
>  > get the output seen below:
>  > ----------------------------------------------------------------------------
>  > 220 List of threads with name,state,prio,stack,mem,timeout follows
>  > tcpsm   Sleep   32      461     OK      27
>  > TcpS    Run     64      2546    OK      None
>  > inetd   Sleep   64      2381    OK      None
>  > rxi5    Sleep   9       603     OK      1392
>  > wdt     Sleep   40      255     OK      8
>  > SmuTh   Sleep   64      65      OK      71
>  > PcuTh   Sleep   64      805     OK      1
>  > HvpsTh  Sleep   64      605     OK      24
>  > IppsTh  Sleep   64      965     OK      4
>  > TaTh    Sleep   64      65      OK      35
>  > LcdTh   Sleep   64      929     OK      34
>  > main    Sleep   64      733     OK      451
>  > idle    Ready   254     356     OK      None
>  > ----------------------------------------------------------------------------
>  >
>  > For some reason the thread (inetd) just gets a timeout of "None"
>  > instead of the NutSleep value.
>  >
>  > I only have one place in the code that puts the thread to sleep, and I
>  > have a fixed value of 300 there (NutSleep(300)).
>  > So there must be somewhere else in the code where the thread gets put
>  > into some wait state, but I have no idea how to figure out where and why
>  > this happens.
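
One cheap way to narrow down where the thread parks is to record a step marker before each potentially blocking call and show it from another thread, much like the posted code already does with info_text on the LCD. The following is a minimal sketch in plain C. The `comm_step` enum and the `comm_step_*` functions are hypothetical names introduced for illustration, not part of Nut/OS or of the posted application.

```c
#include <stdint.h>

/* Hypothetical checkpoints, one per stage of the communication loop. */
enum comm_step {
    STEP_IDLE, STEP_CREATE, STEP_CONNECT,
    STEP_EXCHANGE, STEP_CLOSE, STEP_SLEEP,
    STEP_COUNT
};

/* Last checkpoint reached. A single byte, so it can be read from another
 * thread without tearing on an 8-bit CPU. */
static volatile uint8_t comm_step_now = STEP_IDLE;

/* Call right before entering each stage, e.g. before the connect call. */
void comm_step_set(enum comm_step s)
{
    comm_step_now = (uint8_t)s;
}

/* Read back the last checkpoint. */
enum comm_step comm_step_get(void)
{
    return (enum comm_step)comm_step_now;
}

/* Short name for display on an LCD or in a thread listing. */
const char *comm_step_name(void)
{
    static const char *const names[STEP_COUNT] = {
        "idle", "create", "connect", "exchange", "close", "sleep"
    };
    uint8_t s = comm_step_now;

    return (s < STEP_COUNT) ? names[s] : "?";
}
```

If the marker still reads "connect" hours after the thread stopped, the connect call is the one that never returned. If it reads "exchange", the stream I/O is the place to look.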
>  >
>  > Could it be something that happens when the socket tries to connect to
>  > an IP/server that doesn't exist on the LAN?
>  >
>  > If it happened all the time it would be easier to figure out what's
>  > wrong, but this can run for days without happening.
>  > Also, if memory were low, the tcps thread wouldn't answer the
>  > incoming connection attempts, I guess.
>  >
>  > The thread code is below:
>  > ----------------------------------------------------------------------------
>  > THREAD(InetdThread, arg)
>  > {
>  >     TCPSOCKET *socket;
>  >     FILE *stream = 0;
>  >     u_long rip = inet_addr("192.168.0.115");
>  >     u_long tmo = 500;
>  >     int socket_error = 0;
>  >     uint8_t *start = 0, *stop = 0;
>  >     uint8_t unit[20], cmd[40], value[40];
>  >     uint8_t data_exchange_buffer[100] = "0";
>  >
>  >     for(;;)
>  >     {
>  >         if ((socket = NutTcpCreateSocket()) != 0)
>  >         {
>  >             NutTcpSetSockOpt(socket, SO_RCVTIMEO, &tmo, sizeof(tmo));
>  >             NutTcpSetSockOpt(socket, SO_SNDTIMEO, &tmo, sizeof(tmo));
>  >             if(NutTcpConnect(socket, rip, TCPSERVERPORT) == 0)
>  >             {
>  >                 stream = _fdopen((int) ((uptr_t) socket), "r+b");
>  >                 if(stream != 0)
>  >                 {
>  >                     // Send some XML DATA
>  >                     fprintf_P(stream, info_P, INFO_P_ARGS);
>  >                     fflush(stream);
>  >                     // Get some XML DATA
>  >                     fgets(data_exchange_buffer, sizeof(data_exchange_buffer), stream);
>  >                     {
>  >                         // Handle XML data
>  >                     }
>  >                     fclose(stream);
>  >                     /*
>  >                         info_text is an extern variable that another
>  >                         thread prints on the LCD for debug output.
>  >                     */
>  >                     sprintf(info_text ,"COK\n%lu", (u_long)NutGetMillis());
>  >                 }
>  >             }
>  >             else
>  >             {
>  >                 socket_error = NutTcpError(socket);
>  >                 sprintf(info_text ,"CE:%d \n%lu", socket_error, (u_long)NutGetMillis());
>  >             }
>  >             NutTcpCloseSocket(socket);
>  >         }
>  >         NutSleep(300);
>  >     }
>  > }
>  > ----------------------------------------------------------------------------
>  >
>  >
>  >
>  > --
>  > /Erik
>  > _______________________________________________
>  > http://lists.egnite.de/mailman/listinfo/en-nut-discussion
>  >
>  >
>  >
>  >
>  >
>
>
>


