[En-Nut-Discussion] Fwd: Not fixed!!!: Re: confirmed!!! Re: NutOS 4.4.0 on ARM7: possibly bug in NutEnterCritical / NutExitCritical

Duane Ellis duane at duaneellis.com
Tue Feb 26 03:01:08 CET 2008


Nathan Moore wrote:
> The reason you use "memory" in clobber as I understand it is if you were
> to alter something through a pointer within an asm operation.
>   
Yes, and that is effectively what *can* *appear* to happen when IRQs are
re-enabled.

>>> I'm not really sure why these things still work at all, but if you
>>> made it a thread local variable it would work better:
>>> CurrentThread->CriticalDepth
>>
>>    here are two good references:
>>            http://en.wikipedia.org/wiki/Interrupt_handler
>>
>>    The commercial package, SMX calls them 'link-service-routines' or "lsr"
>>    And has some diagrams that help explain.
>>
>>            http://www.smxrtos.com/articles/lsr_art/lsr_art.htm
>>            http://www.smxrtos.com/articles/techppr/defint.pdf
>>
>>    In the SMX case, they do it via the "INVOKE" macro.
>>     
> I don't really get what those links have to do with the current topics.
> Related, but not relevant, unless I'm really misunderstanding or overlooking something.
>   
You and others have described using a "global" counter to track
criticality.
The concern was: "interrupts should not be disabled for a long time".

In this case your suggestion was the ->CriticalDepth. Your technique in
fact *does* work better *IF* it is used in conjunction with disable/restore.

The two techniques:
(1) Disable/restore IRQ is found in low-level OS code, and is not normally
used by an APP developer.
(2) Semaphores are often used by APPs, but not really by IRQ handlers.

The discussion seemed to be tossing back and forth between the two
solutions.

I did not want to see your suggestion of the "->CriticalDepth" dismissed.
It is actually very powerful - *IF* it is used in conjunction with the
IRQ solution.

By *only* disabling/restoring IRQs... you face a new problem.

The problem is IRQ Latency (response time).
A nasty problem, and selling point in various RTOSes.

==================
Example:
RS232 driver has 100 bytes to send in the ram fifo
And will be receiving 100 bytes very rapidly.

Meanwhile, *something* needs to be done in a "Critical Section"
And the *problem* *is* that *something* will take a long time.

Developers go wild with Critical Sections, causing extended "Critical
periods".
Simply put, IRQs get disabled for too long a period.

Problems:
  (1) RS232 tx acts 'bursty' or starts/stops. Delayed TX irqs.
  (2) RS232 rx over-run, because rx-irq is blocked.

Hardware  FIFOs often do not help enough.
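
To put rough numbers on that, here is a small sketch of the arithmetic. The
figures are assumptions for illustration only (115200 baud, 8N1 framing, a
16-byte receive FIFO), and max_irq_off_us() is a hypothetical helper, not
anything from NutOS. With those numbers one byte arrives about every 86 us,
so even a full 16-byte FIFO only buys roughly 1.4 ms with the rx-irq blocked.

```c
#include <assert.h>
#include <stdint.h>

/* Longest time, in microseconds, that the RX IRQ can stay blocked before a
 * receive FIFO holding fifo_depth frames is full, assuming frames arrive
 * back to back. bits_per_frame is 10 for 8N1 (start + 8 data + stop).
 * Hypothetical helper for illustration only. */
uint32_t max_irq_off_us(uint32_t baud, uint32_t bits_per_frame, uint32_t fifo_depth)
{
    assert(baud > 0);
    uint64_t bits = (uint64_t)bits_per_frame * fifo_depth;
    return (uint32_t)(bits * 1000000u / baud);   /* integer division rounds down */
}
```

For example, max_irq_off_us(115200, 10, 16) gives the budget for the assumed
16-byte FIFO, and max_irq_off_us(115200, 10, 1) the budget with no FIFO at all.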

How do you solve this problem?
==================

There is an elegant solution to that problem, it requires using your
suggestion and the disable/restore together.

A very good solution to that problem is described in those links, in
particular, the SMX links.
The wiki page also speaks of "second level handlers".

If the goal of Nut is to have a *very* *simplistic* view of critical
sections, and the problem I describe above does not matter, then stop reading.
Just use the disable/restore and be done with it.

The more advanced technique, in summary, is as follows:

(1) use disable/restore *ONLY* to protect INC/DEC the counter, or
"tight situations"
(2) In almost all other places, interrupts are enabled.
(3) IRQs can service the IRQ quietly, or
(4) IRQs can enqueue a 'delayed action function' to signal the kernel
that more help is needed.
(5) The IRQ can - if the system state is "safe" - execute the delayed
functions
(6) otherwise, the IRQ cannot.
(7) At critical area exit, if the counter will transition to 0, then
execute all delayed action functions.

In detail:

====
The example 1st level rs-232 hw irq handler does this
====

   The RX handler - just fills the software ring buffer.
   The TX handler - keeps draining the ring buffer.

   At end of HW IRQ...
   IF (Something more needs to be done)
   then
         Example: fifo is full/empty, wake up thread,
         Always enqueue a "delayed action function".

         if criticalcounter == 0
         then
                  Invoke all pending delayed action functions.
         else
                  System is Critical, do nothing
         endif
   endif
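
The first-level handler above could be sketched in C roughly as follows. This
is only an illustration under stated assumptions: every name in it
(uart_rx_irq, defer, run_deferred, critical_counter, wake_reader) is
hypothetical and none of it is NutOS API. The received byte is passed in as a
parameter instead of being read from a UART register so the sketch can run on
a host, and "something more needs to be done" is taken to mean "the ring
buffer was empty, so a reader thread may be waiting".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; none of this is NutOS API. */
typedef void (*deferred_fn)(void *arg);
struct deferred { deferred_fn fn; void *arg; };

#define QUEUE_SIZE 8u   /* power of two, so the free-running indices wrap cleanly */
#define RX_SIZE   16u

static struct deferred queue[QUEUE_SIZE];
static volatile unsigned q_head, q_tail;    /* q_head: next to run, q_tail: next free slot */
static volatile unsigned critical_counter;  /* > 0 while thread code is in a critical area */

static unsigned char rx_ring[RX_SIZE];
static volatile unsigned rx_head, rx_tail;
static unsigned wakeups;                    /* stands in for "wake up thread" */

static void wake_reader(void *arg) { (void)arg; wakeups++; }

/* Enqueue a delayed action function. Meant to be called from IRQ context. */
static int defer(deferred_fn fn, void *arg)
{
    if (q_tail - q_head == QUEUE_SIZE)
        return -1;                          /* queue full, caller decides what to do */
    queue[q_tail % QUEUE_SIZE].fn = fn;
    queue[q_tail % QUEUE_SIZE].arg = arg;
    q_tail++;
    return 0;
}

/* Run everything queued so far, oldest first. */
static void run_deferred(void)
{
    while (q_head != q_tail) {
        struct deferred d = queue[q_head % QUEUE_SIZE];
        q_head++;
        d.fn(d.arg);
    }
}

/* First-level RX handler: `byte` is what the UART data register held. */
void uart_rx_irq(unsigned char byte)
{
    int was_empty = (rx_head == rx_tail);

    if (rx_tail - rx_head < RX_SIZE) {      /* drop the byte if the ring is full */
        rx_ring[rx_tail % RX_SIZE] = byte;
        rx_tail++;
    }

    if (was_empty) {                        /* "something more needs to be done" */
        (void)defer(wake_reader, NULL);
        if (critical_counter == 0)
            run_deferred();                 /* system is safe: run them now */
        /* else: system is critical, leave them for the critical-exit path */
    }
}
```

The point of the design is that the handler itself stays short: it only moves
data and queues work, and whether the queued work runs now or later depends
solely on the counter.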

=====
The high level code - effectively does this:
====

ENTER_CRITICAL is these 3 steps:
   disable/save irq.
   inc counter.
   restore irq.

WORK is these steps.
   ... work work work ...
    lots of work work work ....
   lots of TX/RX irqs happen.
   work work work.

EXIT_CRITICAL is special.
   disable/save irqs
   if (counter == 1) AND (delayed action functions exist)
   then
        restore irqs, but leave counter = 1.
        run all delayed action functions now.
        disable/save irqs.
   endif
   dec counter
   restore irqs.
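
A minimal C sketch of ENTER_CRITICAL / EXIT_CRITICAL as outlined above, under
some loudly stated assumptions. All names are hypothetical and this is not how
NutEnterCritical / NutExitCritical are implemented. irq_save_disable() and
irq_restore() are host stand-ins that only track a flag; on an ARM7 they would
save the CPSR and set or restore its I bit. One deliberate variation: the exit
path uses a loop instead of a single "if", so that a function queued by an IRQ
while earlier ones are running is not left behind when the counter drops to 0.

```c
#include <assert.h>

/* Hypothetical names throughout; this is not the NutOS API. */
typedef void (*deferred_fn)(void);

#define QUEUE_SIZE 8u   /* power of two, so the free-running indices wrap cleanly */
static deferred_fn queue[QUEUE_SIZE];
static volatile unsigned q_head, q_tail;    /* q_head: next to run, q_tail: next free slot */
static volatile unsigned critical_counter;

/* Host stand-ins for the port layer (assumption: a real port touches the CPSR). */
static volatile int irqs_enabled = 1;
static int  irq_save_disable(void) { int was = irqs_enabled; irqs_enabled = 0; return was; }
static void irq_restore(int was)   { irqs_enabled = was; }

/* What a first-level IRQ handler would call to ask for more help. */
static void defer_from_irq(deferred_fn fn)
{
    if (q_tail - q_head < QUEUE_SIZE) {
        queue[q_tail % QUEUE_SIZE] = fn;
        q_tail++;
    }
}

void enter_critical(void)
{
    int s = irq_save_disable();             /* IRQs are off only for the increment */
    critical_counter++;
    irq_restore(s);
}

void exit_critical(void)
{
    int s = irq_save_disable();
    assert(critical_counter > 0);           /* unbalanced exit is a caller bug */

    /* Outermost exit: drain the queue while the counter is still 1, so any
     * IRQ that fires meanwhile only enqueues and does not run handlers. */
    while (critical_counter == 1 && q_head != q_tail) {
        deferred_fn fn = queue[q_head % QUEUE_SIZE];
        q_head++;
        irq_restore(s);                     /* run the function with IRQs enabled */
        fn();
        s = irq_save_disable();
    }

    critical_counter--;
    irq_restore(s);
}

/* Example delayed action function used in the usage note below. */
static unsigned wakeups;
static void wake_thread(void) { wakeups++; }
```

Usage follows the WORK pattern above: call enter_critical(), do the long
work with IRQs enabled, then call exit_critical(). If an IRQ called
defer_from_irq(wake_thread) in between, wake_thread() runs during the
outermost exit_critical() and not before.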

====

This "2nd level handler" technique is not present in NutOS.

-Duane.





