[En-Nut-Discussion] TCP/IP Stack crashes, first analyses

Hugo Simon hugo.simon at gmx.de
Mon Jan 16 20:15:32 CET 2006


Hello,

today I had another crash of Ethernut's TCP/IP stack. Again the application
thread kept running after the crash, but the Nut no longer responded to pings.
I did not stress the Nut; I only periodically loaded a small web page from it.
But again the Nut was connected to our big company network.

Since I added some debug output to the Nut/OS network functions, I now have
some information about what happened.
For some hours all went fine: the Nut worked and the web page loaded. Then
suddenly nothing more. The last signs of life are documented here:

SMRCV;SMRCV;EIIP;IPTCP;TCPSMRDY1;EIEND
EIIP;IPTCP;TCPSMRDY2;EIEND
SMRCV;SMRCV;EIIP;IPTCP;TCPSMRDY1;EIEND
SMRCV;SMSYNRECOVER;SMSYNRECOVER;SMSYNRECOVER;SMSYNRECOVER;SMSYNRECOVER;SMRET
R;SMRETR;SMSYNRECOVER;SMSYNRECOVER;SMSYNRECOVER;SMRETR;SMSYNRECOVER;SMSYNREC
OVER;SMRETR;SMSYNRECOVER;EIARP;EIEND
SMSYNRECOVER;SMRETR;SMSYNRECOVER;SMSYNRECOVER;SMSYNRECOVER;SMRETR;SMSYNRECOV
ER;SMRETR;EIARP;EIEND
SMSYNRECOVER;SMSYNRECOVER;SMSYNRECOVER;SMSYNRECOVER;SMRETR;SMRETR;SMSYNRECOV
ER;SMSYNRECOVER;SMSYNRECOVER;SMSYNRECOVER;SMRETR;SMRETR;SMSYNRECOVER;SMSYNRE
COVER;SMSYNRECOVER;SMSYNRECOVER;SMRETR;EIIP;IPBC;IPUDP;EIEND
SMSYNRECOVER;SMRETR;SMSYNRECOVER;SMSYNRECOVER;SMSYNRECOVER;SMRETR;SMSYNRECOV
ER;SMRETR;SMSYNRECOVER;SMSYNRECOVER;SMSYNRECOVER;SMRETR;EIIP;IPTCP;TCPSMRDY1
;EIEND
SMRCV;SMSYNRECOVER;SMRETR;SMSYNRECOVER;SMSYNRECOVER;SMSYNRECOVER;SMSYNRECOVE
R;SMSYNRECOVER;SMRETR;SMTIMEOUT;

After that, the only output came from the main application thread. Heap is
the free heap, which stays constant; ARP is the ARP table. As you can see,
there is only one entry in it, so I don't think it's the ARP code.

Read all sensors...
No. ID                       Type     Power     Low High  Temp  Status
 1  10-78-BD-DA-00-08-00-0D  DS18S20  parasite    0  300  236    0
Heap available: 22303
ARP: 53.42.91.1 = 00-00-0c-07-ac-00 2


As I interpret the output, the crash came _after_ the web page was loaded.
Normal web loads end like this:

SMRCV;SMRCV;EIIP;IPTCP;TCPSMRDY1;EIEND
EIIP;IPTCP;TCPSMRDY2;EIEND
SMRCV;SMRCV;EIIP;IPTCP;TCPSMRDY1;EIEND
SMRCV;

After the web load I see masses of SMSYNRECOVER messages, which come
from a piece of code in NutTcpSm() in tcpsm.c that is commented with
"Recover from SYN flood attacks". What is this?
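
To put a number on "masses", the tags in such a trace can be tallied. The
following is only a minimal sketch in plain C and is not part of Nut/OS. The
names tally_tags, count_of and struct tag_count are made up for illustration,
and the sketch assumes that every tag is intact, i.e. not split across a
wrapped mail line.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_TAGS    32   /* hypothetical limit on distinct tags */
#define MAX_TAG_LEN 32   /* hypothetical limit on tag length, incl. NUL */

struct tag_count {
    char name[MAX_TAG_LEN];
    unsigned count;
};

/*
 * Count how often each tag occurs in a debug trace. Tags are separated
 * by ';' or by line ends. Returns the number of distinct tags stored
 * in out; tags beyond max distinct entries are ignored.
 */
static size_t tally_tags(const char *trace, struct tag_count *out, size_t max)
{
    char tag[MAX_TAG_LEN];
    size_t len = 0, used = 0, i;
    const char *p = trace;

    assert(trace != NULL && out != NULL);

    for (;;) {
        char c = *p++;

        /* Collect characters until a separator or the end of the string. */
        if (c != ';' && c != '\n' && c != '\r' && c != '\0') {
            if (len < MAX_TAG_LEN - 1)
                tag[len++] = c;
            continue;
        }
        if (len > 0) {
            tag[len] = '\0';
            len = 0;
            /* Look for an existing entry, append a new one if needed. */
            for (i = 0; i < used; i++)
                if (strcmp(out[i].name, tag) == 0)
                    break;
            if (i == used && used < max) {
                strcpy(out[used].name, tag);
                out[used].count = 0;
                used++;
            }
            if (i < used)
                out[i].count++;
        }
        if (c == '\0')
            break;
    }
    return used;
}

/* Return the count stored for name, or 0 if the tag never occurred. */
static unsigned count_of(const struct tag_count *tc, size_t n, const char *name)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (strcmp(tc[i].name, name) == 0)
            return tc[i].count;
    return 0;
}
```

Feeding the captured trace as one string to tally_tags() and then asking
count_of() for "SMSYNRECOVER" and "SMRETR" gives the ratio of recovery passes
to retransmit timeouts, which may be easier to compare between a normal web
load and the run that ended in the crash.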

To understand the odd debug messages above, please look at my modified
NutTcpSm() below:


/*! \fn NutTcpSm(void *arg)
 * \brief TCP state machine thread.
 *
 * The TCP state machine serves two purposes: It processes incoming TCP
 * segments and handles TCP timers.
 */
THREAD(NutTcpSm, arg)
{
    NETBUF *nb;
    NETBUF *nbx;
    TCPHDR *th;
    IPHDR *ih;
    TCPSOCKET *sock;
    u_char tac = 0;

    /*
     * It won't help giving us a higher priority than the application
     * code. We depend on the speed of the reading application.
     */
    NutThreadSetPriority (32);

    for (;;) {
        if (++tac > 3 || NutEventWait(&tcp_in_rdy, 200)) {
            tac = 0;
            for (sock = tcpSocketList; sock; sock = sock->so_next) {

                /*
                 * Send late acks.
                 */
                if (sock->so_tx_flags & SO_ACK) {
                    sock->so_tx_flags |= SO_FORCE;
#ifdef NUTDEBUG
/*DEBUG*/  fputs("SMLACK;",stdout);
#endif
                    NutTcpOutput(sock, 0, 0);
                }

                /*
                 * Process retransmit timer.
                 */
                if (sock->so_tx_nbq && sock->so_retran_time) {
                    if ((u_short) NutGetMillis() - sock->so_retran_time >
                        sock->so_rtto) {
#ifdef NUTDEBUG
/*DEBUG*/  fputs("SMRETR;",stdout);
#endif
                        NutTcpStateRetranTimeout(sock);
                    }
                }

                /*
                 * Destroy sockets after timeout in TIMEWAIT state.
                 */
                if (sock->so_state == TCPS_TIME_WAIT ||
                    sock->so_state == TCPS_FIN_WAIT_2) {
                    if (sock->so_time_wait++ >= 9) {
#ifdef NUTDEBUG
/*DEBUG*/  fputs("SMTIMEOUT;",stdout);
#endif
                        NutTcpDestroySocket(sock);
                        break;
                    }
                }

                /*
                 * Recover from SYN flood attacks.
                 */
                else if (sock->so_state == TCPS_SYN_RECEIVED) {
#ifdef NUTDEBUG
/*DEBUG*/  fputs("SMSYNRECOVER;",stdout);
#endif
                    if (sock->so_time_wait++ >= 45) {
                        sock->so_state = TCPS_LISTEN;
                        sock->so_time_wait = 0;
                    }
                }
            }
        } else {
#ifdef NUTDEBUG
/*DEBUG*/  fputs("SMRCV;",stdout);
#endif
            nb = tcp_in_nbq;
            tcp_in_nbq = 0;
            tcp_in_cnt = 0;
            while (nb) {
                ih = (IPHDR *) nb->nb_nw.vp;
                th = (TCPHDR *) nb->nb_tp.vp;
                sock = NutTcpFindSocket(th->th_dport, th->th_sport,
                                        ih->ip_src);
#ifdef NUTDEBUG
                if (__tcp_trf)
                    NutDumpTcpHeader(__tcp_trs, " IN", sock, nb);
#endif
                nbx = nb->nb_next;
                if (sock) {
                    NutTcpInputOptions(sock, nb);
                    NutTcpStateProcess(sock, nb);
                }

                /*
                 * Reject the segment, if no matching socket was found.
                 */
                else {
#ifdef NUTDEBUG
/*DEBUG*/  fputs("SMNOSOCK;",stdout);
#endif
                    NutTcpReject(nb);
                }
                nb = nbx;
            }
        }
    }
}
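
Going only by the listing above, SMSYNRECOVER is printed once per timer pass
for every socket that is in TCPS_SYN_RECEIVED, and such a socket is put back
to TCPS_LISTEN once its so_time_wait counter has reached 45. The following
stand-alone sketch isolates just that branch to count the passes. It is not
Nut/OS code: struct sketch_sock, the SK_ states and both functions are
hypothetical stand-ins, and it assumes the counter starts at 0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two socket states involved. */
enum sketch_state { SK_LISTEN, SK_SYN_RECEIVED };

struct sketch_sock {
    enum sketch_state state;
    unsigned char time_wait;    /* plays the role of so_time_wait */
};

/*
 * One timer pass over a single socket, mirroring only the
 * "Recover from SYN flood attacks" branch of the listing.
 * Returns 1 if a SMSYNRECOVER tag would be printed, else 0.
 */
static int sketch_timer_pass(struct sketch_sock *s)
{
    assert(s != NULL);

    if (s->state != SK_SYN_RECEIVED)
        return 0;
    if (s->time_wait++ >= 45) {
        s->state = SK_LISTEN;
        s->time_wait = 0;
    }
    return 1;
}

/* Run timer passes until the socket has left SYN_RECEIVED. */
static unsigned sketch_passes_until_listen(struct sketch_sock *s)
{
    unsigned passes = 0;

    while (sketch_timer_pass(s))
        passes++;
    return passes;
}
```

Under these assumptions one socket produces 46 SMSYNRECOVER tags before it is
listening again. If a timer pass runs about every 200 ms, the timeout given
to NutEventWait(), that is roughly 9 seconds per socket, so long runs of
SMSYNRECOVER would be expected whenever one or more connections stay
half-open.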

I hope someone has an idea what is happening there.

Thanks
Thorsten



