[En-Nut-Discussion] Fear not, good Sir... TCP might still be saved...

Marek Pavlu pavlu at HWserver.cz
Tue Jun 20 08:08:24 CEST 2006


Hi, 

In my opinion the cause of this problem is that NutEventPostFromIrq is
called far too often. I mean that the evil lies in the event management :).

For NutEventWait with a timeout you have proven that, Michael.
For NutEventWait with an infinite timeout it is still an open question.

Maybe I have an idea how to solve it quickly ;).
One way is to use the Nut/OS semaphore for posting from the IRQ and to make
it safer (critical sections and overflow protection). Then many interrupts
result in only one call to NutEventPostFromIrq. When an interrupt occurs:

1. rxi5 is waiting => only one call to NutEventPostFromIrq for multiple interrupts
2. rxi5 is working => NutEventPostFromIrq is not called at all!

This conserves system resources and avoids the problem in the event management...
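
To make the two cases concrete, here is a minimal host-side sketch in plain C. It is not Nut/OS code: SimSem, SimPostFromIrq, SimWait and the other Sim* helpers are hypothetical stand-ins that only model the semaphore counter and count how often the event queue would have been signalled.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical model: the counter from struct _SEM plus a tally of how
 * often the event queue would have been signalled. */
typedef struct {
    short value;
    int event_posts;
} SimSem;

/* Models the counter logic of NutSemPostFromIrq: saturating increment,
 * event post only while a waiter is recorded (value was negative). */
void SimPostFromIrq(SimSem *sem)
{
    assert(sem != NULL);
    if (sem->value != SHRT_MAX)
        sem->value++;
    if (sem->value <= 0)
        sem->event_posts++;
}

/* Models the counter logic of NutSemWait; returns 1 if the caller would block. */
int SimWait(SimSem *sem)
{
    assert(sem != NULL);
    if (sem->value != SHRT_MIN)
        sem->value--;
    return sem->value < 0;
}

/* Delivers a burst of interrupts and returns the number of event posts it caused. */
int SimIrqBurst(SimSem *sem, int irqs)
{
    int before = sem->event_posts;
    for (int i = 0; i < irqs; i++)
        SimPostFromIrq(sem);
    return sem->event_posts - before;
}

/* Case 1: the receiver thread blocks first, then the burst arrives. */
int PostsWhileWaiting(int irqs)
{
    SimSem sem = { 0, 0 };
    int blocked = SimWait(&sem);    /* value goes 0 -> -1, thread would block */
    assert(blocked);
    (void)blocked;
    return SimIrqBurst(&sem, irqs);
}

/* Case 2: the receiver thread is busy; the counter is 0 after its reset. */
int PostsWhileWorking(int irqs)
{
    SimSem sem = { 0, 0 };
    return SimIrqBurst(&sem, irqs);
}
```

In this model a burst of five interrupts causes one event post when the receiver is waiting and none when it is working, which is the behaviour described in the list above.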


This is for the RTL8019 (I have only this chip) and Ethernut version 4.0.2.1.
I have tested these modifications and so far they work well :), but more
testing is required...

For other drivers that use an event timeout, NutSemTryWait is useful.
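
As a rough illustration of that idea, the following host-side sketch builds a bounded wait out of a non-blocking try-wait. Everything here is hypothetical (PollSem, PollTryWait, PollTimedWait, PollFakeIrq); it models only the counter, and the idle callback stands in for whatever a real driver would do between polls, such as a short sleep or yield.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the semaphore counter only. */
typedef struct {
    short value;
} PollSem;

/* Non-blocking take: returns 0 on success, -1 if no post is pending. */
int PollTryWait(PollSem *sem)
{
    assert(sem != NULL);
    if (sem->value <= 0)
        return -1;
    sem->value--;
    return 0;
}

/* Bounded wait built from polling. Tries up to max_polls times and calls
 * idle() after each failed attempt. Returns 0 if the semaphore was taken,
 * -1 if the poll budget ran out. */
int PollTimedWait(PollSem *sem, int max_polls, void (*idle)(PollSem *))
{
    for (int i = 0; i < max_polls; i++) {
        if (PollTryWait(sem) == 0)
            return 0;
        if (idle != NULL)
            idle(sem);
    }
    return -1;
}

/* Test helper: pretends an interrupt posts the semaphore during idle time. */
void PollFakeIrq(PollSem *sem)
{
    assert(sem != NULL);
    sem->value++;
}
```

The point of the sketch is only that the timeout is expressed as a poll budget instead of a timed event wait; whether that is acceptable for a given driver depends on its latency requirements.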

/***********************************************/ 
/***********************************************/ 
nicrtl.h: 

#include <sys/semaphore.h> 

/*! 
 * \struct _NICINFO nicrtl.h dev/nicrtl.h 
 * \brief Network interface controller information structure. 
 */ 
struct _NICINFO { 
//    HANDLE volatile ni_rx_rdy;      /*!< Receiver event queue. */ 
        SEM ni_rx_rdy_sem;              /*!< Receiver semaphore */ 

/***********************************************/ 
/***********************************************/ 
nicrtl.c: 


static void NicInterrupt(void *arg) 
{ 
.... 
      if (isr & NIC_ISR_PRX) 
        { 
                //NutEventPostFromIrq(&ni->ni_rx_rdy); 
                NutSemPostFromIrq(&ni->ni_rx_rdy_sem); 
        } 
.... 
} 

THREAD(NicRx, arg) 
{ 
... 

        NutSemInit(&ni->ni_rx_rdy_sem, 0); 

... 

    while (1) { 

... 

                //NutEventWait(&ni->ni_rx_rdy, 0); 
                NutSemWait(&ni->ni_rx_rdy_sem); 
                NutSemInit(&ni->ni_rx_rdy_sem, 0); 

... 

} 

/***********************************************/ 
/***********************************************/ 
semaphore.h: 

/*! 
 * \brief Semaphore type.
 */ 
    typedef volatile struct _SEM SEM; 

/*! 
 * \struct _SEM semaphore.h sys/semaphore.h
 * \brief Semaphore.
 * 
 */ 
struct _SEM { 
  volatile HANDLE qhp; /*!< \brief Queue to wait, if semaphore is zero. */ 
  volatile short value;          /*!< \brief semaphore value . */ 
}; 

    extern void NutSemPostFromIrq(SEM * sem); 

    extern void NutSemInit(SEM * sem, short value); 
    extern void NutSemWait(SEM * sem); 
    extern int NutSemTryWait(SEM * sem); 
    extern void NutSemPost(SEM * sem); 
    extern int NutSemDestroy(SEM * sem); 

/***********************************************/ 
/***********************************************/ 
semaphore.c: 
/* 
 * Copyright (C) 2000-2004 by ETH Zurich 
 * 
 * Redistribution and use in source and binary forms, with or without 
 * modification, are permitted provided that the following conditions 
 * are met: 
 * 
 * 1. Redistributions of source code must retain the above copyright 
 *    notice, this list of conditions and the following disclaimer. 
 * 2. Redistributions in binary form must reproduce the above copyright 
 *    notice, this list of conditions and the following disclaimer in the 
 *    documentation and/or other materials provided with the distribution. 
 * 3. Neither the name of the copyright holders nor the names of 
 *    contributors may be used to endorse or promote products derived 
 *    from this software without specific prior written permission. 
 * 
 * THIS SOFTWARE IS PROVIDED BY ETH ZURICH AND CONTRIBUTORS 
 * ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT 
 * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS 
 * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL ETH ZURICH 
 * OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, 
 * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, 
 * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS 
 * OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED 
 * AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 
 * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF 
 * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF 
 * SUCH DAMAGE. 
 * 
 * For additional information see http://www.ethernut.de/ 
 * 
 */ 

/* semaphore.c - a nut/os implementation of semaphore functions 
 * 
 * 2004.05.06 Matthias Ringwald <matthias.ringwald at inf.ethz.ch> 
 * 
 */ 
/* 
 * $Log: semaphore.c,v $ 
 * Revision 1.6  2005/08/02 17:47:04  haraldkipp 
 * Major API documentation update. 
 * 
 * Revision 1.5  2004/06/03 08:44:50  olereinhardt 
 * According to a hint from oliver I changed NutEventWait to NutEventWaitNext
 * 
 * Revision 1.4  2004/06/03 08:24:21  olereinhardt 
 * Changed semaphore behavior in NutSemTryWait too. 
 * 
 * Revision 1.3  2004/06/02 16:42:53  olereinhardt 
 * fixed bug (integer overflow) in semaphore implementation. 
 * 
 * Revision 1.2  2004/05/18 18:38:42  drsung 
 * Added $Log keyword for CVS. 
 * 
 */ 

/*! 
 * \addtogroup xgSemaphore 
 */ 
/*@{*/ 

#ifdef __cplusplus 
extern "C" { 
#endif 

#include <sys/semaphore.h> 
#include <sys/event.h> 
#include <sys/atom.h> 
#include <limits.h> 

/*! 
 * \brief Initialize an unnamed semaphore to value 
 */ 
    void NutSemInit(SEM * sem, volatile short value) { 
                NutEnterCritical(); 
                        sem->qhp = 0; 
                        sem->value = (volatile short)value; 
                NutExitCritical(); 
    } 


/*! 
 * \brief Lock a semaphore 
 * 
 *  If the semaphore value is currently zero, then the calling thread will not
 *  return from the call to sem_wait() until the semaphore becomes available
 * 
 * \Note: Should not be called from interrupt context 
 */ 
        void NutSemWait(SEM * sem) { 
                NutEnterCritical(); 
                        
                        //sem->value--; 
                        //more safe 
                        if(sem->value != SHRT_MIN) sem->value--; 
                        
                        if (sem->value < 0) 
                        { 
                                NutJumpOutCritical(); 
                                NutEventWaitNext(&sem->qhp, NUT_WAIT_INFINITE);
                                return; 
                        } 
                NutExitCritical(); 
    } 

/*! 
 * \brief Unlock a semaphore.
 * 
 * \Note: Should not be called from interrupt context 
 */ 
    void NutSemPost(SEM * sem) { 
                NutEnterCritical(); 
                        //sem->value++; 
                        //more safe 
                        if(sem->value != SHRT_MAX) sem->value++; 
                        if (sem->value <= 0) 
                        { 
                                //NutJumpOutCritical(); 
                                //NutEventPost(&sem->qhp); 
                                //return; 
                                NutEventPostFromIrq(&sem->qhp); 
                        } 
                NutExitCritical(); 
    } 

/*! 
 * \brief Unlock a semaphore from interrupt context.
 *
 * \Note: Intended to be called from interrupt context only; unlike
 *        NutSemPost() it does not enter a critical section itself
 */ 
    void NutSemPostFromIrq(SEM * sem) { 
                //sem->value++; 
                //more safe 
                if(sem->value != SHRT_MAX) sem->value++; 
                if (sem->value <= 0) 
                { 
                        //NutJumpOutCritical(); 
                        //NutEventPost(&sem->qhp); 
                        //return; 
                        NutEventPostFromIrq(&sem->qhp); 
                } 
    } 


/*! 
 * \brief Attempt to lock a semaphore without blocking 
 * 
 * Return zero, if successful, otherwise the semaphore is already locked
 * \Note: Should not be called from interrupt context 
 */ 

    int NutSemTryWait(SEM * sem) { 
                NutEnterCritical(); 
                        /* A value of zero or less means no post is pending,
                         * so NutSemWait() would block. */
                        if (sem->value <= 0)
                        { 
                                NutJumpOutCritical(); 
                                return -1; 
                        } 
                        else 
                        { 
                                NutJumpOutCritical(); 
                                NutSemWait(sem); 
                                return 0; 
                        } 
                NutExitCritical(); 
    } 

/*! 
 * \brief Free resources allocated for a semaphore 
 * 
 * Return zero, if successful, otherwise there are threads blocked on the
 * semaphore
 */ 

    int NutSemDestroy(SEM * sem) { 
                NutEnterCritical(); 
                        if (sem->qhp == SIGNALED) 
                        { 
                                NutJumpOutCritical(); 
                                return 0; 
                        } 
                        
                        if (sem->qhp == 0) 
                        { 
                                NutJumpOutCritical(); 
                                return 0; 
                        } 
                NutExitCritical(); 
        return -1; 
    } 

#ifdef __cplusplus 
} 
#endif 

/*@}*/ 
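
The SHRT_MAX / SHRT_MIN guards used in the listing above can be looked at in isolation. This is a small host-side sketch with hypothetical helper names (SatInc, SatDec, SatIncN), not part of Nut/OS; it only shows that the counter sticks at the limit instead of wrapping around, however many posts arrive.

```c
#include <assert.h>
#include <limits.h>

/* Saturating increment: stays at SHRT_MAX instead of overflowing. */
short SatInc(short v)
{
    return (v != SHRT_MAX) ? (short)(v + 1) : v;
}

/* Saturating decrement: stays at SHRT_MIN instead of underflowing. */
short SatDec(short v)
{
    return (v != SHRT_MIN) ? (short)(v - 1) : v;
}

/* Applies n saturating increments, e.g. n interrupts in a row. */
short SatIncN(short v, long n)
{
    assert(n >= 0);
    while (n-- > 0)
        v = SatInc(v);
    return v;
}
```

The price of saturation is that posts beyond the limit are silently dropped, which does not matter for the receiver thread in nicrtl.c because it resets the counter to zero after every wake-up anyway.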





Regards, 
                Marek Pavlu 

//  -----Original Message----- 
//  From: en-nut-discussion-bounces at egnite.de [mailto:en-nut-discussion- 
//  bounces at egnite.de] On Behalf Of Michael Jones 
//  Sent: Tuesday, June 20, 2006 1:08 AM 
//  To: 'Ethernut User Chat (English)' 
//  Subject: [En-Nut-Discussion] Fear not,good Sir... TCP might still be 
//  saved... 
//  
//  Hello! 
//  
//  I've spent the last few hours tracking down our TCP demon that has been 
//  lurking over us for so long... 
//  
//  I had the luck that unlike in my past experiments I managed to crash 99%
//  of the 37 nut/os driven boards within 3 minutes by flooding them with ARP
//  messages...
//  
//  ...but wait there is more! Exactly the same happens when broadcasting
//  random packets.
//  
//  I discussed this new aspect with Harald and we were both stumped but we
//  now knew that the packets never actually reached e.g. the TCP/IP Stack or
//  ARP - so what was causing the trouble?
//  
//  Doing the usual plastering the os with trace outputs I found it.
//  (Actually thanks to Harald and a comment he made in the discussion before!)
//  
//  So here is what I found: 
//  
//  If more than 2000-3000 packets (regardless of whether they are broadcasts
//  or actually addressed to the device) hit nut/os within a 2 second window
//  and are handled, the unit can crash or start to behave erratically. The
//  actual number of packets depends on the remaining heap space.
//  
//  The cause seems to be the line: 
//  
//      NutEventWait(&ni->ni_rx_rdy, 2000); 
//  
//  ...in the NicRxLanc thread within the driver. 
//  
//      If the 2000ms are replaced with 0ms that problem is gone. 
//  
//  It could be so simple... 
//  
//  ...but now I wanted to know why it makes a difference if 2000ms or 0ms is
//  specified. Well, as it seems, every NutEventWait(...) which is called with
//  a non 0ms value calls NutTimerCreate(...), which allocates a timer object
//  on the heap (using TM_ONESHOT). Now that as such is nothing upsetting until
//  you look closer - once NutTimerInsert(...) is called the timer will tick
//  away till the time is reached, call its callback (if still active) and be
//  freed. But... when the event is signaled before the timer has reached its
//  timeout, the timer (and its memory) stays allocated for the full remaining
//  duration, e.g. 2000ms. So the next event adds a new timer and a new timer
//  and a new timer... bang (if heap / 12 < events / sec).
//  
//  So I tried a few things e.g. placing NutTimerStop(...) at the end of 
//  NutEventWait(...). 
//  
//  Actually this only made things even stranger... 
//  
//  So now my question is: does anybody have an idea how we fix this? What can
//  we do so that fast sequences of signaled events using timers don't hog the
//  heap?
//  
//  Regards! 
//  
//  Michael 


