[En-Nut-Discussion] Watchdog issues

Bernd Walter enut at cicely.de
Thu Sep 15 13:07:35 CEST 2005


On Thu, Sep 15, 2005 at 09:33:59PM +1200, Ralph Mason wrote:
> Call me a joy killer, but having an interrupt issue a wdr is about the 
> worst idea you could ever ever ever ever ever ever ever ever have! Your 
> software can crash and you will still get interrupts.

Not always.
I use a timer interrupt as well, to decrement a global variable until it
reaches zero.
As long as the variable is not zero yet, the interrupt also clears the wdt.
Now it depends on the type of functionality you want to guarantee.
In my case I set the variable every time a full TCP transaction
succeeded.
The timer is just there to stretch the watchdog timeout into the range of
several minutes, so the normal clearing is less of a problem.
In case the timer itself is stuck, well then we get a quick reboot...
Interrupts shouldn't have been blocked for that long so that's fine.
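
A rough sketch of what I mean, written as plain C so it can be tried on a
host. The names (hw_watchdog_reset, timer_tick_isr, soft_wdt_rearm) and
the 300-tick value are made up for illustration; on the target
hw_watchdog_reset would be the real wdr and timer_tick_isr would be
hooked to the timer interrupt.

```c
#include <stdint.h>

/* Assumption: 300 timer ticks, e.g. 5 minutes with a 1 Hz tick. */
#define SOFT_WDT_TICKS 300u

/* Hypothetical stand-in for the hardware watchdog reset (wdr on AVR).
 * Here it only counts how often the watchdog would have been cleared. */
unsigned hw_kicks = 0;
void hw_watchdog_reset(void) { hw_kicks++; }

/* Remaining ticks before we stop clearing the hardware watchdog. */
volatile uint16_t soft_wdt = SOFT_WDT_TICKS;

/* Called from the periodic timer interrupt. */
void timer_tick_isr(void)
{
    if (soft_wdt > 0) {
        soft_wdt--;
        hw_watchdog_reset();
    }
    /* else: no more clearing, so the hardware watchdog expires and
     * the board reboots. */
}

/* Called by the application after a full TCP transaction succeeded.
 * On an 8-bit AVR a 16-bit store is not atomic, so real code would
 * disable interrupts around this assignment. */
void soft_wdt_rearm(void)
{
    soft_wdt = SOFT_WDT_TICKS;
}
```

If the application never calls soft_wdt_rearm again, the interrupt stops
clearing the watchdog after SOFT_WDT_TICKS ticks, no matter how healthy
the interrupt itself still is.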

> Here's a suggestion for watchdogs under nut os. Only the idle thread should 
> ever tickle the watch dog, your other threads can signal that with a 
> yield - so in effect yield becomes a wdr.  Secondly your idle thread 
> should track watchdogs from all the other systems in your application, 
> those systems (threads) should reset a counter that the watchdog looks 
> at.  The idle thread should count down these timers and if they reach 0 
> then stop issuing wdr's.  This means that all your threads must issue 
> yields every 1 second or so and every so many seconds (whatever you 
> set the counter to) make a call to reset the countdown (don't do both in 
> a loop).  Finally create a top priority thread (0) that only has a task 
> of resetting its counter and then sleeping, this will check that no 
> other threads are being starved.  This is about the only way I think you 
> can actually check the health of a nut app using the watchdog timer.

You can't count on specific threads to run within a single second,
that's way too short for cooperative multitasking.

> Now if only we could solve the problem of bad code jumping into the 
> bootloader and randomly writing a page of memory, thereby rendering your 
> whole device useless.

Ack, but that's the usual thing about clearing wdt.
You need some kind of reliable health information.
With my interrupt approach you can even have multiple soft timers, so every
functional block can have its own watchdog timeout period without
caring about the health of other functions.
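
Extending the single counter to several soft timers could look roughly
like this. Again only a host-side sketch: the block names, the periods and
all function names are invented for the example, and hw_watchdog_reset
stands in for the real wdr. The interrupt clears the hardware watchdog
only while every block still has ticks left.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical functional blocks, each with its own timeout in ticks. */
enum { BLOCK_TCP, BLOCK_SENSOR, BLOCK_COUNT };

const uint16_t soft_wdt_period[BLOCK_COUNT] = { 300, 10 };
volatile uint16_t soft_wdt_left[BLOCK_COUNT] = { 300, 10 };

/* Stand-in for the hardware watchdog reset; counts the clears. */
unsigned hw_kicks = 0;
void hw_watchdog_reset(void) { hw_kicks++; }

/* Each block calls this when it knows it is healthy. */
void soft_wdt_feed(size_t block)
{
    if (block < BLOCK_COUNT)
        soft_wdt_left[block] = soft_wdt_period[block];
}

/* Called from the periodic timer interrupt. */
void timer_tick_isr(void)
{
    int all_alive = 1;

    for (size_t i = 0; i < BLOCK_COUNT; i++) {
        if (soft_wdt_left[i] == 0)
            all_alive = 0;          /* this block missed its deadline */
        else
            soft_wdt_left[i]--;
    }
    if (all_alive)
        hw_watchdog_reset();
    /* else: stop clearing and let the hardware watchdog reboot us. */
}
```

A block only has to feed its own timer within its own period, so a slow
block with a timeout of minutes and a fast one with a timeout of seconds
can live side by side.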

-- 
B.Walter                   BWCT                http://www.bwct.de
bernd at bwct.de                                  info at bwct.de



