[En-Nut-Discussion] [en-nut-discussion] thread stops executing after some time.
Erik L
erik.lindstein at gmail.com
Wed Apr 2 10:31:32 CEST 2008
Hi Jakub.
I used DHCP before, but I had some problems with that as well.
After 24h the lease expired and the Ethernut tried to get a new IP from the
DHCP server, and when that happens the communication gets disturbed. I guess
one could remove the lease time, but I didn't put any effort into it because I
have other things to gain from using static IP addresses.
Whether the "problem" is there or not when using DHCP, I don't know. I might
give it a try.
Thanks for the input.
Regards/Erik
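
A side note on finding out where the client thread gets stuck (the original post quoted below shows it sleeping with timeout "None"): one option is to have the thread record which step it reached last and bump a counter, and let another thread, for example the one driving the LCD, notice when the counter stops moving. What follows is only a minimal sketch in plain C, not Nut/OS code. All names (`mark`, `stall_poll`, `checkpoint_name`, the `CP_*` steps) are made up for illustration; on the target, `mark()` would be called right before each step such as NutTcpCreateSocket, NutTcpConnect, fgets and NutSleep.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical checkpoint IDs, one per step of the client loop. */
enum checkpoint {
    CP_IDLE = 0,
    CP_CREATE,
    CP_CONNECT,
    CP_SEND,
    CP_RECV,
    CP_CLOSE,
    CP_SLEEP
};

/* Written by the client thread, read by the supervising thread. */
static volatile uint8_t last_checkpoint = CP_IDLE;
static volatile uint32_t heartbeat = 0;

/* Client thread: call this right before entering each step. */
static void mark(enum checkpoint cp)
{
    last_checkpoint = (uint8_t)cp;
    heartbeat++;
}

/* Supervisor state: last counter value seen and how long it has been stale. */
struct stall_detector {
    uint32_t seen_heartbeat;
    uint32_t stale_polls;
    uint32_t limit;        /* polls without progress before reporting a stall */
};

/* Supervisor: call periodically. Returns 1 once the heartbeat has not
 * changed for 'limit' consecutive polls, otherwise 0. */
static int stall_poll(struct stall_detector *d)
{
    uint32_t now = heartbeat;

    assert(d != NULL);
    if (now != d->seen_heartbeat) {
        d->seen_heartbeat = now;   /* progress was made, start over */
        d->stale_polls = 0;
        return 0;
    }
    if (d->stale_polls < d->limit)
        d->stale_polls++;
    return d->stale_polls >= d->limit;
}

/* Short text for the LCD or a log line. */
static const char *checkpoint_name(uint8_t cp)
{
    switch (cp) {
    case CP_IDLE:    return "idle";
    case CP_CREATE:  return "create";
    case CP_CONNECT: return "connect";
    case CP_SEND:    return "send";
    case CP_RECV:    return "recv";
    case CP_CLOSE:   return "close";
    case CP_SLEEP:   return "sleep";
    default:         return "?";
    }
}
```

If the supervisor reports a stall while the last checkpoint is "connect", that would point at the connect call never returning; any other value points elsewhere in the loop. The poll interval and limit need to be chosen well above the 500 ms socket timeouts and the 300 ms sleep so that a normal cycle is not reported as a stall.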
jakub nowak-2 wrote:
>
> Maybe try dynamic IP gateway ... -> if DHCP works fine, at least
> you will know where the problem is.
>
> 2008/3/29, Erik L <erik.lindstein at gmail.com>:
>>
>> I don't think the problem is created when I disconnect the PC.
>> I connected/disconnected the PC lots of times in a short period of time
>> and the clients still always connect.
>> Also, I have clients that haven't been connected at all after I powered
>> them up, and they still get the same problem if I leave them for a couple
>> of hours before I connect the PC.
>>
>> I have no route set; the communication is just on the LAN
>> (clients 192.168.0.1-10, server 192.168.0.115). All IP addresses are static.
>>
>> /Erik
>>
>>
>>
>>
>> ernstst wrote:
>> >
>> > Hi Erik!
>> >
>> > Quote
>> > But when the problem occurs, the software in the client can't get to
>> > the point where it actually exchanges the data, because the socket
>> > shouldn't be able to connect.
>> > " if(NutTcpConnect(socket, rip, TCPSERVERPORT) == 0) "
>> >
>> > If the server (rip:TCPSERVERPORT) isn't connected to the LAN, it can't
>> > return 0, correct?
>> > And even if it does for some reason, the socket read timeouts should
>> > occur, and it should manage to get past the data exchange routines and
>> > then start all over again.
>> > --------------------------------
>> > Unquote
>> >
>> > I am not sure this is correct (i.e. whether the read timeout triggers on
>> > a NutTcpConnect which does not go through).
>> >
>> > I need to think about it, maybe give it a try. A quick look into TCPSOCK.C
>> > and the like didn't enlighten me ... Maybe the TCP protocol state in which
>> > the disconnect occurred somehow influences whether the reconnect works.
>> > Another thought: IP routing tables. Are they still "there" when the
>> > reconnect is attempted? Have you defined a route?
>> >
>> > Regards
>> > Ernst
>> >
>> >
>> >
>> > -----Original Message-----
>> > From: en-nut-discussion-bounces at egnite.de
>> > [mailto:en-nut-discussion-bounces at egnite.de] On Behalf Of Erik Lindstein
>> > Sent: Tuesday, 25 March 2008 12:21
>> > To: en-nut-discussion at egnite.de
>> > Subject: [En-Nut-Discussion] [en-nut-discussion] thread stops executing
>> > after some time.
>> >
>> > Ernst, thank you very much for taking time to answer.
>> >
>> > I'll write some comments below.
>> >
>> > I understand this is a sporadic problem so it takes a lot of time to run
>> > into the "error" situation, but anyway:
>> >
>> > 1) ... only 1 out of 4 clients were still connecting ...
>> > When you experience this situation more than once, is it always the same
>> > Ethernut which can still connect, or is that also "random"? And what looks
>> > "random" in the first place, is it really?
>> >
>> > --------------------------------
>> > Well, I don't think there is much involving software that is truly
>> > random :-) so of course this isn't either.
>> >
>> > But here it can be any one of the clients that stops executing the
>> > thread. Usually I can see that after some time (~6 - 7h) one or two
>> > have stopped connecting, and there can be one left running for up to 24h
>> > (perhaps longer). But in the end all of them stop trying to connect
>> > and the thread's timeout is set to "None".
>> > --------------------------------
>> > --------------------------------
>> >
>> >
>> >
>> >
>> > 2) .. (all of them use the same software, just different MACs and IPs) ..
>> > Even if all of them use the same SW, are they operating under
>> > same/similar/different conditions? (I mean ".. exchanges some XML
>> > data,..": where does this data come from, i.e. how is it generated, and
>> > how different can it be between the Ethernuts?)
>> >
>> > --------------------------------
>> > When the server software on the PC is running and the PC is connected,
>> > the client socket gets connected and the client sends some values
>> > read from the A/D and some status variables, and then the PC responds
>> > with a command that tells the client to do "something".
>> > Usually the PC just sends a "reset watchdog" command to the client.
>> >
>> > But in this case everything works fine as long as the software is
>> > running and the PC is connected.
>> > When I then close down the server software, the client gets a command
>> > that tells it to start resetting the WDT locally.
>> >
>> > But when the problem occurs, the software in the client can't get to
>> > the point where it actually exchanges the data, because the socket
>> > shouldn't be able to connect.
>> > " if(NutTcpConnect(socket, rip, TCPSERVERPORT) == 0) "
>> >
>> > If the server (rip:TCPSERVERPORT) isn't connected to the LAN, it can't
>> > return 0, correct?
>> > And even if it does for some reason, the socket read timeouts should
>> > occur, and it should manage to get past the data exchange routines and
>> > then start all over again.
>> > --------------------------------
>> > --------------------------------
>> >
>> >
>> >
>> > 3) How about buffer overflows due to "special" tx/rx data conditions
>> > (length)?
>> > --------------------------------
>> > In this case it only happens when (at least I think so) I don't read or
>> > send any data beyond what the tcpsm sends out while trying to
>> > connect the socket.
>> > My code doesn't do any rx/tx until the socket and stream are OK.
>> >
>> > --------------------------------
>> >
>> >
>> >
>> > 4) Try looking "into" the TCP sockets. My bank switch test program at
>> > http://www.es-business.com/Firma/eng/edocs.htm may help. Include cli.c and
>> > dump.c in your main program and create a thread as indicated in the source.
>> > It contains a Telnet-based CLI which has a "lists" command which walks
>> > through and displays all lists known to Nut/OS (TCP sockets is one of
>> > these). One word of caution: because Nut/OS (i.e. other threads) is
>> > executing while this command follows the pointers from one list entry to
>> > the next, the command may loop if a list is updated (by or on behalf of
>> > an app thread) right when that pointer is being used by the "lists"
>> > command itself.
>> > The dump command may help you peek around in RAM.
>> >
>> > --------------------------------
>> > I'll look into that, thanks.
>> >
>> > --------------------------------
>> >
>> >
>> >
>> > 5) Is there a possibility to have Wireshark monitoring the TCP/IP link up
>> > until the PC gets disconnected? This way, you could find out what was
>> > exchanged immediately before the disconnect happened, and maybe this gives
>> > more info about the internal status of the Ethernuts and the TCP/IP
>> > connection states.
>> >
>> > --------------------------------
>> > The PC only gets disconnected when I remove the LAN cable, but I could
>> > monitor the data until that point.
>> > But I can disconnect and connect the cable many (unlimited?) times
>> > and there are no problems. The clients always connect again if I don't
>> > leave the PC disconnected for a longer period of time (> ~5-6h).
>> >
>> > One possibility might be to set up the switch I use to echo all
>> > traffic out on another port and monitor the traffic there with
>> > Wireshark. That way I might be able to see what happens before it stops
>> > working.
>> > But if I have the PC connected in the "normal" way, the problem doesn't
>> > occur.
>> > --------------------------------
>> >
>> >
>> >
>> > 6) Do you log the state of the TCP/IP connection between the Ethernuts and
>> > the PC within the PC? Maybe such a log (record length / contents) could
>> > provide some more info.
>> >
>> > --------------------------------
>> > Because the problem only occurs when the PC isn't connected, this
>> > is hard to do.
>> > I can log the traffic when everything works fine, but I'm not sure it
>> > gives away the problem. Perhaps someone with more knowledge of TCP/IP can
>> > see something here.
>> > --------------------------------
>> >
>> >
>> >
>> >
>> > 7) The most important question is:
>> > Is the problem caused by behaviour in Nut/OS or Nut/Net (IP stack, timers,
>> > events etc.)
>> > or
>> > is the problem caused by some behaviour in the application threads?
>> > Is there any chance to "strip down" the application threads to try to
>> > minimize their possible impact on the situation?
>> >
>> > --------------------------------
>> > I minimized the software to only include the thread for the client socket
>> > and one tcpserver thread, but this still happens. I'll try to remove
>> > some more code in the client thread and see if it changes anything.
>> > I did get the feeling that this software took longer before it
>> > stopped trying to connect, but that's not 100% verified. Anyway, it
>> > still stops.
>> > --------------------------------
>> >
>> >
>> > 8) I have an example here of a test app, which produces the following
>> > threads list:
>> >
>> > CLI>threads
>> > Name     Status  Prio  Stack  Memory  Timeout  INFO-addr  Bank
>> > CMDLINE  Run     64    891    OK      None     36C9       -1
>> > XHTST    Sleep   64    357    OK      6        3203       9
>> > XHTST    Ready   64    357    OK      None     2F3D       8
>> > XHTST    Sleep   64    357    OK      None     2C77       7
>> > XHTST    Ready   64    357    OK      None     29B1       6
>> > XHTST    Sleep   64    357    OK      24       26EB       5
>> > XHTST    Ready   64    357    OK      None     2425       4
>> > XHTST    Sleep   64    357    OK      13       215F       3
>> > XHTST    Ready   64    357    OK      None     1E99       2
>> > XHTST    Sleep   64    357    OK      6        1BD3       1
>> > tcpsm    Sleep   32    468    OK      102      1925       -1
>> > XHTST    Sleep   64    357    OK      None     16C1       0
>> > rxi5     Sleep   9     603    OK      699      145F       -1
>> > main     Sleep   200   705    OK      940      1041       -1
>> > idle     Ready   254   356    OK      None     D21        -1
>> >
>> > The XHTST threads are looping apps that work in memory, display info via
>> > TCP/IP to a telnet client and sometimes sleep for a random time.
>> > There are threads which are sleeping and do not have a timeout associated
>> > with them! (Maybe when they are waiting for the telnet output to
>> > complete?)
>> > --------------------------------
>> >
>> > I have no idea, but perhaps when
>> > " NutTcpConnect(socket, rip, TCPSERVERPORT) " executes, it sets the
>> > thread's timeout to "None" and then waits for some event from the tcpsm
>> > that never occurs.
>> >
>> > --------------------------------
>> >
>> >
>> > I am quite sure you have thought about some (if not all) of this already,
>> > but maybe it "kicks" off some more thoughts.
>> >
>> > Good luck
>> > Regards
>> > Ernst
>> >
>> >
>> > -----Original Message-----
>> > From: en-nut-discussion-bounces at egnite.de
>> > [mailto:en-nut-discussion-bounces at egnite.de] On Behalf Of Erik Lindstein
>> > Sent: Monday, 24 March 2008 17:13
>> > To: en-nut-discussion at egnite.de
>> > Subject: [En-Nut-Discussion] Thread stops executing after some time.
>> >
>> > Guys please help me out.
>> > I'm on a wild goose chase trying to figure out what is happening with
>> > a thread that handles communications with a PC through a TCP/IP socket.
>> >
>> > The setup is:
>> > Ethernut V2.1
>> > Software 4.4.0
>> >
>> > The software is built up from a couple of threads, each handling some
>> > functions (LCD, push buttons, user functions etc.).
>> > Then I have one thread that communicates with server software on my PC.
>> > The communication is pretty simple.
>> > It's a client socket that connects to my server PC and then exchanges
>> > some XML data, disconnects, sleeps for 300ms and then starts all over
>> > again.
>> >
>> > This works fine for weeks without any problems if I have the server PC
>> > up and running and connected to the same LAN my Ethernut is connected
>> > to.
>> >
>> > Then one day I disconnected the server PC from the LAN and left a
>> > couple of the Ethernut clients running over the weekend. On
>> > Monday I connected my PC again and started up the server software, but
>> > I noticed that only 1 out of 4 clients were still connecting (all of
>> > them use the same software, just different MACs and IPs).
>> >
>> > I looked at the incoming traffic with Wireshark and could not see any
>> > sign of life at all from the 3 clients not connecting.
>> > I tried to ping them and they all answer pings, and all the other
>> > threads that handle the LCD and push buttons are still up and running,
>> > so the software is not dead.
>> > I tried deactivating/activating the network connection on my PC to see
>> > if any of the clients woke up. No luck.
>> >
>> > I then added another thread to the software (I took the sample code
>> > for the tcps in the apps dir and created a thread to run that code).
>> > When everything is OK, I see the "inetd" thread timeout counting
>> > all the time and the thread executes as expected.
>> >
>> > When the inetd thread stops executing, I can connect to the unit and I
>> > get the output seen below:
>> >
>> > ----------------------------------------------------------------------
>> > 220 List of threads with name,state,prio,stack,mem,timeout follows
>> > tcpsm   Sleep  32   461   OK  27
>> > TcpS    Run    64   2546  OK  None
>> > inetd   Sleep  64   2381  OK  None
>> > rxi5    Sleep  9    603   OK  1392
>> > wdt     Sleep  40   255   OK  8
>> > SmuTh   Sleep  64   65    OK  71
>> > PcuTh   Sleep  64   805   OK  1
>> > HvpsTh  Sleep  64   605   OK  24
>> > IppsTh  Sleep  64   965   OK  4
>> > TaTh    Sleep  64   65    OK  35
>> > LcdTh   Sleep  64   929   OK  34
>> > main    Sleep  64   733   OK  451
>> > idle    Ready  254  356   OK  None
>> > ----------------------------------------------------------------------
>> >
>> > For some reason the thread (inetd) just gets a timeout of "None"
>> > instead of the NutSleep value.
>> >
>> > I only have one place in the code that puts the thread to sleep, and I
>> > have a fixed value of 300 there (NutSleep(300)).
>> > So there must be somewhere else in the code where the thread gets put
>> > into some wait state, but I have no idea how to figure out where and why
>> > this happens.
>> >
>> > Can it be something that happens when the socket tries to connect to an
>> > IP/server that doesn't exist on the LAN?
>> >
>> > If it happened all the time it would be easier to figure out what's
>> > wrong, but this can run for days without happening.
>> > Also, if memory were low, the tcps thread wouldn't answer the
>> > incoming connection attempts, I guess.
>> >
>> > The thread code is below:
>> >
>> > ----------------------------------------------------------------------
>> > THREAD(InetdThread, arg)
>> > {
>> >     TCPSOCKET *socket;
>> >     FILE *stream = 0;
>> >     u_long rip = inet_addr("192.168.0.115");
>> >     u_long tmo = 500;
>> >     int socket_error = 0;
>> >     uint8_t *start = 0, *stop = 0;
>> >     uint8_t unit[20], cmd[40], value[40];
>> >     uint8_t data_exchange_buffer[100] = "0";
>> >
>> >     for(;;)
>> >     {
>> >         if ((socket = NutTcpCreateSocket()) != 0)
>> >         {
>> >             NutTcpSetSockOpt(socket, SO_RCVTIMEO, &tmo, sizeof(tmo));
>> >             NutTcpSetSockOpt(socket, SO_SNDTIMEO, &tmo, sizeof(tmo));
>> >             if(NutTcpConnect(socket, rip, TCPSERVERPORT) == 0)
>> >             {
>> >                 stream = _fdopen((int) ((uptr_t) socket), "r+b");
>> >                 if(stream != 0)
>> >                 {
>> >                     // Send some XML DATA
>> >                     fprintf_P(stream, info_P, INFO_P_ARGS);
>> >                     fflush(stream);
>> >                     // Get some XML DATA
>> >                     fgets(data_exchange_buffer,
>> >                           sizeof(data_exchange_buffer), stream);
>> >                     {
>> >                         // Handle XML data
>> >                     }
>> >                     fclose(stream);
>> >                     /*
>> >                     info_text is an extern variable that another thread
>> >                     prints on the LCD for debug output.
>> >                     */
>> >                     sprintf(info_text ,"COK\n%lu", (u_long)NutGetMillis());
>> >                 }
>> >             }
>> >             else
>> >             {
>> >                 socket_error = NutTcpError(socket);
>> >                 sprintf(info_text ,"CE:%d \n%lu",
>> >                         socket_error, (u_long)NutGetMillis());
>> >             }
>> >             NutTcpCloseSocket(socket);
>> >         }
>> >         NutSleep(300);
>> >     }
>> > }
>> > ----------------------------------------------------------------------
>> >
>> >
>> >
>> > --
>> > /Erik
>> > _______________________________________________
>> > http://lists.egnite.de/mailman/listinfo/en-nut-discussion
>>
>>
>> --
>> View this message in context:
>> http://www.nabble.com/-en-nut-discussion--thread-stops-executing-after-some-time.-tp16277335p16368466.html
>> Sent from the MicroControllers - Ethernut mailing list archive at
>> Nabble.com.