[En-Nut-Discussion] [en-nut-discussion] thread stops executing after some time.

jakub nowak jdnowak at gmail.com
Wed Apr 2 11:16:39 CEST 2008


Maybe this will help you. When I use DHCP and the server is WinXP Pro,
sometimes (about 10% of the time) there is a connection problem, and only
rebooting the Ethernut helps. Now I mostly use a Linksys WRT54GL (with Tomato)
and the problem has disappeared. If you look at the Ethernet frames in
Wireshark, you will see that WinXP sends a lot of unused information.
I know this will not solve your problem, but maybe it will help you a little.
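
On the lease renewal mentioned in the quoted message below: by default a DHCP client starts renewing at half the lease time (T1) and falls back to rebinding at 7/8 of it (T2), per RFC 2131, section 4.4.5. So with a 24 h lease the first renewal attempt would normally come after about 12 h rather than 24 h. A minimal sketch of that arithmetic in plain C; the function names are made up for illustration, and I have not checked how closely the Nut/OS DHCP client follows these defaults:

```c
#include <stdint.h>

/* Default DHCP timers from RFC 2131, section 4.4.5.
 * A server may override them with options 58 (T1) and 59 (T2).
 * Function names are illustrative, not Nut/OS API. */

/* T1: time after which the client starts renewing (0.5 * lease). */
uint32_t dhcp_default_t1(uint32_t lease_s)
{
    return lease_s / 2u;
}

/* T2: time after which the client starts rebinding (0.875 * lease).
 * Dividing before multiplying avoids 32-bit overflow at the cost of
 * rounding down by a few seconds. */
uint32_t dhcp_default_t2(uint32_t lease_s)
{
    return (lease_s / 8u) * 7u;
}
```

For an 86400 s (24 h) lease this gives T1 = 43200 s and T2 = 75600 s.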

2008/4/2, Erik L <erik.lindstein at gmail.com>:
>
>  Hi Jakub.
>  I used DHCP before, but I had some problems with that as well.
>  After 24 h the lease expired and the Ethernut tried to get a new IP from the
>  DHCP server, and when that happens the communication gets disturbed. I guess
>  one could remove the lease time, but I didn't put any effort into it because I
>  have other things to gain from using static IP addresses.
>
>  Whether the "problem" is there or not when using DHCP, I don't know. I might
>  give it a try.
>
>  Thanks for the input.
>  Regards/Erik
>
>
>
>
>  jakub nowak-2 wrote:
>  >
>  > Maybe try dynamic IP gateway ... -> if DHCP works fine, at least
>  > you will know where the problem is.
>  >
>  > 2008/3/29, Erik L <erik.lindstein at gmail.com>:
>  >>
>  >>  I don't think the problem is created when I disconnect the PC.
>  >>  I connected/disconnected the PC lots of times in a short period of time and
>  >>  the clients still always connect.
>  >>  Also, I have clients that haven't been connected at all after I powered them
>  >>  up, and they still get the same problem if I leave them for a couple of hours
>  >>  before I connect the PC.
>  >>
>  >>  I have no route set; the communication is just on the LAN.
>  >>  (Clients 192.168.0.1-10, server 192.168.0.115.) All IP addresses are static.
>  >>
>  >>  /Erik
>  >>
>  >>
>  >>
>  >>
>  >>  ernstst wrote:
>  >>  >
>  >>  > Hi Erik!
>  >>  >
>  >>  > Quote
>  >>  > But when the problem occurs the software in the client can't get to the
>  >>  > point where it actually exchanges the data, because the socket shouldn't be
>  >>  > able to connect.
>  >>  > " if(NutTcpConnect(socket, rip, TCPSERVERPORT) == 0) "
>  >>  >
>  >>  > If the server (rip:TCPSERVERPORT) isn't connected to the LAN it can't
>  >>  > return 0, correct?
>  >>  > And even if it for some reason does, the socket read timeouts should occur
>  >>  > and it should manage to get past the data exchange routines and then start
>  >>  > all over again.
>  >>  > --------------------------------
>  >>  > Unquote
>  >>  >
>  >>  > I am not sure this is correct (i.e. whether the read timeout triggers on a
>  >>  > NutTcpConnect which does not go through).
>  >>  >
>  >>  > I need to think about it, maybe give it a try. A quick look into TCPSOCK.C
>  >>  > and the like didn't enlighten me ... Maybe the TCP protocol state in which
>  >>  > the disconnect occurred somehow influences whether the reconnect works.
>  >>  > Another thought: IP routing tables. Are they still "there" when the
>  >>  > reconnect is attempted? Have you defined a route?
>  >>  >
>  >>  > Regards
>  >>  > Ernst
>  >>  >
>  >>  >
>  >>  >
>  >>  > -----Original Message-----
>  >>  > From: en-nut-discussion-bounces at egnite.de
>  >>  > [mailto:en-nut-discussion-bounces at egnite.de] On behalf of Erik Lindstein
>  >>  > Sent: Tuesday, 25 March 2008 12:21
>  >>  > To: en-nut-discussion at egnite.de
>  >>  > Subject: [En-Nut-Discussion] [en-nut-discussion] thread stops executing
>  >>  > after some time.
>  >>  >
>  >>  > Ernst, thank you very much for taking the time to answer.
>  >>  >
>  >>  > I'll write some comments below.
>  >>  >
>  >>  > I understand this is a sporadic problem so it takes a lot of time to run
>  >>  > into the "error" situation, but anyway:
>  >>  >
>  >>  > 1) ... only 1 out of 4 clients were still connecting ...
>  >>  > When you experience this situation more than once, is it always the same
>  >>  > Ethernut which can still connect, or is that also "random"? And what looks
>  >>  > "random" in the first place, is it really?
>  >>  >
>  >>  > --------------------------------
>  >>  > Well, I don't think there is much involving software that is truly
>  >>  > random :-) so of course this isn't either.
>  >>  >
>  >>  > But here it can be any one of the clients that stops executing the
>  >>  > thread. Usually I can see that after some time (~6-7 h) one or two have
>  >>  > stopped connecting, and there can be one left running for up to 24 h
>  >>  > (perhaps longer). But in the end all of them stop trying to connect
>  >>  > and the thread's sleep time is set to "None".
>  >>  > --------------------------------
>  >>  >
>  >>  >
>  >>  >
>  >>  >
>  >>  > 2) .. (all of them use the same software, just different MACs and IPs) ..
>  >>  > Even if all of them use the same SW, are they operating under
>  >>  > same/similar/different conditions? (I mean ".. exchanges some XML data,..":
>  >>  > where does this data come from, i.e. how is it generated and how different
>  >>  > can it be between the Ethernuts?)
>  >>  >
>  >>  > --------------------------------
>  >>  > When the server software on the PC is running and the PC is connected,
>  >>  > the client socket gets connected and the client sends some values
>  >>  > read from the A/D and some status variables, and then the PC responds with
>  >>  > a command that tells the client to do "something".
>  >>  > Usually the PC just sends a "reset watchdog" command to the client.
>  >>  >
>  >>  > But in this case everything works fine as long as the software is
>  >>  > running and the PC is connected.
>  >>  > When I then close down the server software, the client gets a command
>  >>  > that tells it to start resetting the WDT locally.
>  >>  >
>  >>  > But when the problem occurs the software in the client can't get to
>  >>  > the point where it actually exchanges the data, because the socket
>  >>  > shouldn't be able to connect.
>  >>  > " if(NutTcpConnect(socket, rip, TCPSERVERPORT) == 0) "
>  >>  >
>  >>  > If the server (rip:TCPSERVERPORT) isn't connected to the LAN it can't
>  >>  > return 0, correct?
>  >>  > And even if it for some reason does, the socket read timeouts should
>  >>  > occur and it should manage to get past the data exchange routines and
>  >>  > then start all over again.
>  >>  > --------------------------------
>  >>  >
>  >>  >
>  >>  >
>  >>  > 3) How about buffer overflows due to "special" tx/rx data conditions
>  >>  > (length)?
>  >>  > --------------------------------
>  >>  > In this case it only happens when (at least I think that) I don't read or
>  >>  > send any data other than the data that the tcpsm sends out trying to
>  >>  > connect the socket.
>  >>  > My code doesn't do any rx/tx until the socket and stream are OK.
>  >>  >
>  >>  > --------------------------------
>  >>  >
>  >>  >
>  >>  >
>  >>  > 4) Try looking "into" the TCP sockets. My bank switch test program at
>  >>  > http://www.es-business.com/Firma/eng/edocs.htm may help. Include cli.c and
>  >>  > dump.c in your main program and create a thread as indicated in the source.
>  >>  > It contains a Telnet-based CLI which has a "lists" command which walks
>  >>  > through and displays all lists known to Nut/OS (TCP sockets is one of
>  >>  > these). One word of caution: because Nut/OS (i.e. other threads) is
>  >>  > executing while this command follows the pointers in the various lists from
>  >>  > one entry to the next, the command may loop if a list is updated (by or on
>  >>  > behalf of an app thread) right when that pointer in the list is being used
>  >>  > by the "lists" command itself.
>  >>  > The dump command may help you peek around in RAM.
>  >>  >
>  >>  > --------------------------------
>  >>  > I'll look into that, thanks.
>  >>  >
>  >>  > --------------------------------
>  >>  >
>  >>  >
>  >>  >
>  >>  > 5) Is there a possibility to have Wireshark monitoring the TCP/IP link up
>  >>  > until the PC gets disconnected? This way, you could find out what was
>  >>  > exchanged immediately before the disconnect happened, and maybe this gives
>  >>  > more info about the internal status of the Ethernuts and the TCP/IP
>  >>  > connection states.
>  >>  >
>  >>  > --------------------------------
>  >>  > The PC only gets disconnected when I remove the LAN cable, but I could
>  >>  > monitor the data until that point.
>  >>  > But I can disconnect and connect the cable many (unlimited?) times
>  >>  > and there are no problems. The clients always connect again if I don't
>  >>  > leave the PC unconnected for a longer period of time (> ~5-6 h).
>  >>  >
>  >>  > One possibility might be to set up the switch I use to mirror all
>  >>  > traffic to another port and monitor the traffic there with
>  >>  > Wireshark. That way I might be able to see what happens before it stops
>  >>  > working.
>  >>  > But if I have the PC connected in the "normal" way the problem doesn't
>  >>  > occur.
>  >>  > --------------------------------
>  >>  >
>  >>  >
>  >>  >
>  >>  > 6) Do you log the state of the TCP/IP connection between the Ethernuts and
>  >>  > the PC within the PC? Maybe such a log (record length / contents) could
>  >>  > provide some more info.
>  >>  >
>  >>  > --------------------------------
>  >>  > Because the problem only occurs when the PC isn't connected, this
>  >>  > is hard to do.
>  >>  > I can log the traffic when everything works fine, but I'm not sure it gives
>  >>  > away the problem. Perhaps someone with more knowledge of TCP/IP can
>  >>  > see something here.
>  >>  > --------------------------------
>  >>  >
>  >>  >
>  >>  >
>  >>  >
>  >>  > 7) The most important question is:
>  >>  > Is the problem caused by behaviour in Nut/OS or Nut/Net (IP stack, timers,
>  >>  > events etc.)
>  >>  > or
>  >>  > is the problem caused by some behaviour in the application threads?
>  >>  > Is there any chance to "strip down" the application threads to try to
>  >>  > minimize their possible impact on the situation?
>  >>  >
>  >>  > --------------------------------
>  >>  > I minimized the software to only include the thread for the client socket
>  >>  > and one tcpserver thread. But this still happens. I'll try to remove
>  >>  > some more code in the client thread and see if it changes anything.
>  >>  > I did get the feeling that this software took longer before it
>  >>  > stopped trying to connect, but that's not 100% verified. Anyway, it
>  >>  > still stops.
>  >>  > --------------------------------
>  >>  >
>  >>  >
>  >>  > 8) I have an example here of a test app, which produces the following
>  >>  > threads list:
>  >>  >
>  >>  > CLI>threads
>  >>  > Name    Status  Prio    Stack   Memory  Timeout  INFO-addr  Bank
>  >>  > CMDLINE Run      64        891  OK      None        36C9     -1
>  >>  > XHTST   Sleep    64        357  OK      6           3203      9
>  >>  > XHTST   Ready    64        357  OK      None        2F3D      8
>  >>  > XHTST   Sleep    64        357  OK      None        2C77      7
>  >>  > XHTST   Ready    64        357  OK      None        29B1      6
>  >>  > XHTST   Sleep    64        357  OK      24          26EB      5
>  >>  > XHTST   Ready    64        357  OK      None        2425      4
>  >>  > XHTST   Sleep    64        357  OK      13          215F      3
>  >>  > XHTST   Ready    64        357  OK      None        1E99      2
>  >>  > XHTST   Sleep    64        357  OK      6           1BD3      1
>  >>  > tcpsm   Sleep    32        468  OK      102         1925     -1
>  >>  > XHTST   Sleep    64        357  OK      None        16C1      0
>  >>  > rxi5    Sleep     9        603  OK      699         145F     -1
>  >>  > main    Sleep   200        705  OK      940         1041     -1
>  >>  > idle    Ready   254        356  OK      None         D21     -1
>  >>  >
>  >>  > The XHTST threads are looping apps which work in memory, display info via
>  >>  > TCP/IP to a telnet client and sometimes sleep for a random time.
>  >>  > There are threads which are sleeping and do not have a timeout associated
>  >>  > with them! (Maybe when they are waiting for the telnet output to
>  >>  > complete?)
>  >>  >
>  >>  > --------------------------------
>  >>  >
>  >>  > I have no idea, but perhaps when
>  >>  > " NutTcpConnect(socket, rip, TCPSERVERPORT) " executes it sets the
>  >>  > thread's timeout to "None" and then waits for some event from the tcpsm
>  >>  > that then never occurs.
>  >>  >
>  >>  > --------------------------------
>  >>  >
>  >>  >
>  >>  > I am quite sure you have thought about some (if not all) of this already,
>  >>  > but maybe it "kicks" off some more thoughts.
>  >>  >
>  >>  > Good luck
>  >>  > Regards
>  >>  > Ernst
>  >>  >
>  >>  >
>  >>  > -----Original Message-----
>  >>  > From: en-nut-discussion-bounces at egnite.de
>  >>  > [mailto:en-nut-discussion-bounces at egnite.de] On behalf of Erik
>  >>  > Lindstein
>  >>  > Sent: Monday, 24 March 2008 17:13
>  >>  > To: en-nut-discussion at egnite.de
>  >>  > Subject: [En-Nut-Discussion] Thread stops executing after some time.
>  >>  >
>  >>  > Guys please help me out.
>  >>  > I'm on a wild goose chase trying to figure out what is happening with
>  >>  > a thread that handles communications with a PC thru a tcp/ip socket.
>  >>  >
>  >>  > The setup is:
>  >>  > Ethernut V2.1
>  >>  > Software 4.4.0
>  >>  >
>  >>  > The software is built up from a couple of threads, each handling some
>  >>  > functions (LCD, push buttons, user functions etc.).
>  >>  > Then I have one thread that communicates with server software on my PC.
>  >>  > The communication is pretty simple.
>  >>  > It's a client socket that connects to my server PC and then exchanges
>  >>  > some XML data, disconnects, sleeps for 300 ms and then starts all over
>  >>  > again.
>  >>  >
>  >>  > This works fine for weeks without any problems if I have the server PC
>  >>  > up and running and connected to the same LAN my Ethernut is connected
>  >>  > to.
>  >>  >
>  >>  > Then one day I disconnected the server PC from the LAN and left a
>  >>  > couple of the Ethernut clients running over the weekend. On
>  >>  > Monday I connected my PC again and started up the server software, but
>  >>  > I noticed that only 1 out of 4 clients was still connecting (all of
>  >>  > them use the same software, just different MACs and IPs).
>  >>  >
>  >>  > I looked at the incoming traffic with Wireshark and could not see any
>  >>  > sign of life at all from the 3 clients not connecting.
>  >>  > I tried to ping them and they all answer pings, and all the other
>  >>  > threads that handle the LCD and push buttons are still up and running,
>  >>  > so the software is not dead.
>  >>  > I tried deactivating/activating the network connection on my PC to see
>  >>  > if any of the clients woke up. No luck.
>  >>  >
>  >>  > I then added another thread to the software (I took the sample code
>  >>  > for the tcps in the apps dir and created a thread to run that code).
>  >>  > When everything is OK I see the "inetd" thread timeout counting
>  >>  > all the time and the thread executes as expected.
>  >>  >
>  >>  > When the inetd thread stops executing I can connect to the unit and I
>  >>  > get the output seen below:
>  >>  >
>  >>  > ----------------------------------------------------------------
>  >>  > 220 List of threads with name,state,prio,stack,mem,timeout follows
>  >>  > tcpsm   Sleep   32      461     OK      27
>  >>  > TcpS    Run     64      2546    OK      None
>  >>  > inetd   Sleep   64      2381    OK      None
>  >>  > rxi5    Sleep   9       603     OK      1392
>  >>  > wdt     Sleep   40      255     OK      8
>  >>  > SmuTh   Sleep   64      65      OK      71
>  >>  > PcuTh   Sleep   64      805     OK      1
>  >>  > HvpsTh  Sleep   64      605     OK      24
>  >>  > IppsTh  Sleep   64      965     OK      4
>  >>  > TaTh    Sleep   64      65      OK      35
>  >>  > LcdTh   Sleep   64      929     OK      34
>  >>  > main    Sleep   64      733     OK      451
>  >>  > idle    Ready   254     356     OK      None
>  >>  >
>  >>  > ----------------------------------------------------------------
>  >>  >
>  >>  > For some reason the thread (inetd) just gets a timeout set to "None"
>  >>  > instead of the NutSleep value.
>  >>  >
>  >>  > I only have one place in the code that sets the thread to sleep, and I
>  >>  > have a fixed value of 300 there (NutSleep(300)).
>  >>  > So there must be somewhere else in the code where the thread gets set to
>  >>  > some wait state, but I have no idea how to figure out where and why
>  >>  > this happens.
>  >>  >
>  >>  > Can it be something that happens when the socket tries to connect to an
>  >>  > IP/server that doesn't exist on the LAN?
>  >>  >
>  >>  > If it happened all the time it would be easier to figure out what's
>  >>  > wrong, but this can run for days without happening.
>  >>  > Also, if memory were low, the tcps thread wouldn't answer the
>  >>  > incoming connection attempts, I guess.
>  >>  >
>  >>  > The thread code is below:
>  >>  >
>  >>  > ----------------------------------------------------------------
>  >>  > THREAD(InetdThread, arg)
>  >>  > {
>  >>  >        TCPSOCKET *socket;
>  >>  >        FILE *stream = 0;
>  >>  >        u_long rip = inet_addr("192.168.0.115");
>  >>  >        u_long tmo = 500;
>  >>  >        int socket_error = 0;
>  >>  >        uint8_t *start = 0, *stop = 0;
>  >>  >        uint8_t unit[20], cmd[40], value[40];
>  >>  >        uint8_t data_exchange_buffer[100] = "0";
>  >>  >
>  >>  >        for(;;)
>  >>  >        {
>  >>  >                if ((socket = NutTcpCreateSocket()) != 0)
>  >>  >                {
>  >>  >                        NutTcpSetSockOpt(socket, SO_RCVTIMEO, &tmo, sizeof(tmo));
>  >>  >                        NutTcpSetSockOpt(socket, SO_SNDTIMEO, &tmo, sizeof(tmo));
>  >>  >                        if(NutTcpConnect(socket, rip, TCPSERVERPORT) == 0)
>  >>  >                        {
>  >>  >                                stream = _fdopen((int) ((uptr_t) socket), "r+b");
>  >>  >                                if(stream != 0)
>  >>  >                                {
>  >>  >                                        fprintf_P(stream, info_P, INFO_P_ARGS); // Send some XML DATA
>  >>  >                                        fflush(stream);
>  >>  >                                        fgets(data_exchange_buffer, sizeof(data_exchange_buffer), stream); // Get some XML DATA
>  >>  >                                        {
>  >>  >                                                // Handle XML data
>  >>  >                                        }
>  >>  >                                        fclose(stream);
>  >>  >                                        /* info_text is an extern variable that another thread prints on the LCD for debug output. */
>  >>  >                                        sprintf(info_text ,"COK\n%lu", (u_long)NutGetMillis());
>  >>  >                                }
>  >>  >                        }
>  >>  >                        else
>  >>  >                        {
>  >>  >                                socket_error = NutTcpError(socket);
>  >>  >                                sprintf(info_text ,"CE:%d \n%lu", socket_error, (u_long)NutGetMillis());
>  >>  >                        }
>  >>  >                        NutTcpCloseSocket(socket);
>  >>  >                }
>  >>  >                NutSleep(300);
>  >>  >        }
>  >>  > }
>  >>  >
>  >>  > ----------------------------------------------------------------
>  >>  >
>  >>  >
>  >>  >
>  >>  > --
>  >>  > /Erik
>  >>  > _______________________________________________
>  >>  > http://lists.egnite.de/mailman/listinfo/en-nut-discussion
>  >>  >
>  >>
>  >>
>  >> --
>  >>  View this message in context:
>  >> http://www.nabble.com/-en-nut-discussion--thread-stops-executing-after-some-time.-tp16277335p16368466.html
>  >>  Sent from the MicroControllers - Ethernut mailing list archive at
>  >> Nabble.com.
>  >>
>  >>
>  >
>  >
>
>  --
>
> View this message in context: http://www.nabble.com/-en-nut-discussion--thread-stops-executing-after-some-time.-tp16277335p16445361.html
>
> Sent from the MicroControllers - Ethernut mailing list archive at Nabble.com.
>
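
Regarding Ernst's caution that the "lists" command may loop if a list changes while it is being walked: a hop limit is one simple guard. A minimal sketch in plain C with a made-up node type (node_t and list_count_bounded are illustrative names, not Nut/OS ones). It only bounds the walk; it does not make the traversal safe against concurrent updates:

```c
#include <stddef.h>

/* Hypothetical singly linked list node, for illustration only. */
typedef struct node {
    struct node *next;
} node_t;

/* Counts the entries in a list, giving up after max_hops entries.
 * Returns the entry count, or -1 if the limit was reached before the
 * end of the list (possible cycle or a list that changed underneath). */
int list_count_bounded(const node_t *head, int max_hops)
{
    int count = 0;

    while (head != NULL) {
        if (count >= max_hops) {
            return -1;  /* refuse to follow any more links */
        }
        count++;
        head = head->next;
    }
    return count;
}
```

A diagnostic command built this way would report "list too long or corrupt" instead of hanging the Telnet session.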

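One way to at least detect a thread that has silently stopped, as inetd does in the thread list quoted above, is a software heartbeat: the monitored thread stamps a record on every loop iteration and a supervisor (for example the existing wdt thread) checks the age of the stamp. A minimal sketch in plain C, not taken from Nut/OS; heartbeat_t and the function names are hypothetical, and on the Ethernut the timestamp would presumably come from NutGetMillis(), which the posted code already uses:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical heartbeat record, one per monitored thread. */
typedef struct {
    const char *name;      /* thread name, for reporting only */
    uint32_t last_beat_ms; /* timestamp of the most recent heartbeat */
    uint32_t limit_ms;     /* longest silence that is still acceptable */
} heartbeat_t;

/* Called by the monitored thread once per loop iteration. */
void heartbeat_kick(heartbeat_t *hb, uint32_t now_ms)
{
    hb->last_beat_ms = now_ms;
}

/* Returns 1 if the thread has been silent for longer than limit_ms.
 * The unsigned subtraction stays correct when the millisecond
 * counter wraps around. */
int heartbeat_stalled(const heartbeat_t *hb, uint32_t now_ms)
{
    return (uint32_t)(now_ms - hb->last_beat_ms) > hb->limit_ms;
}

/* Returns the first stalled entry in the table, or NULL if none. */
const heartbeat_t *heartbeat_find_stalled(const heartbeat_t *tab,
                                          size_t count, uint32_t now_ms)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (heartbeat_stalled(&tab[i], now_ms)) {
            return &tab[i];
        }
    }
    return NULL;
}
```

This does not explain why the thread blocks, but a supervisor that stops resetting the hardware watchdog once heartbeat_find_stalled() returns an entry would at least bring the unit back without a manual reboot.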

