[En-Nut-Discussion] NutGetMillis is a _very expensive_ function on Ethernut3
Harald Kipp
harald.kipp at egnite.de
Wed Aug 20 18:18:18 CEST 2008
Alain M. wrote:
> duane ellis wrote:
>> Harald Kipp wrote:
>>> IMHO, using an array instead of a structure will be more flexible.
>>> This way we can define different indices for different targets.
>> I do not think the ARRAY should be public, it should be 100% hidden
>> "static to a chip/target specific C file", and the C file should
>> implement access functions to get the clock values. Data Hiding is
>> good. How that C file/function does its job internally is up to the
>> implementor.
>
> Especially because it can be done in a way that gets optimized away by GCC.
I didn't intend to make the cache variable public. I preferred to use an
array instead of a structure, because it is more flexible and can be
handled by the architecture independent part. This reduces the effort of
porting to new platforms.
No question that data hiding has advantages. However, it can be
overdone. Especially in tiny embedded systems the function call
overhead may be significant. AFAIK, global functions are typically not
optimized away. If they are, it has to be done by the linker. The
compiler cannot do this.
But calm down. ;-) There are no plans to replace the existing functions.
I'd like to introduce two new hardware independent functions:
    uint32_t NutClockGet(int idx);
    int NutClockSet(int idx, uint32_t freq);
The idx parameter is one of NUT_HWCLK_CPU, NUT_HWCLK_PERIPHERAL etc.,
which are platform dependent.
For now, the second function is specified for NutClockSet(-1, 0) only,
which means: Release cached values (freq=0) of all clocks (idx=-1).
Later some platforms may implement the ability to set specified hardware
clocks to specified frequencies.
Then we still have
    uint32_t NutGetCpuClock(void);
which is an optimized version of NutClockGet(NUT_HWCLK_CPU) and provides
backward compatibility. If NUT_CPU_FREQ is defined, it will simply
return NUT_CPU_FREQ.
Otherwise NutClockGet() and NutGetCpuClock() will use the cache
mechanism suggested by Duane. By using an array for the cache, these
functions can be moved from arch/xxx/ostimer.c to os/timer.c. Indices
and the size of this array are defined in include/arch/<target>/timer.h.
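
To illustrate the two paths through NutGetCpuClock(), here is a minimal sketch, not the final implementation. The value of NUT_CPU_FREQ, the stubbed NutArchClockGet() and the use of 0 as the "invalid" cache marker are assumptions made for this example only.

```c
#include <stdint.h>

#define NUT_HWCLK_CPU 0
#define NUT_HWCLK_MAX 0

/* Assumed configuration value; remove this line to exercise the cached path. */
#define NUT_CPU_FREQ 14745600UL

#ifndef NUT_CPU_FREQ
/* Cache of clock frequencies in Hz; 0 is assumed to mean "not yet read". */
static uint32_t clock_cache[NUT_HWCLK_MAX + 1];

/* Stub for the hardware dependent part; the returned value is made up. */
static uint32_t NutArchClockGet(int idx)
{
    (void)idx;
    return 73728000UL;
}
#endif

uint32_t NutGetCpuClock(void)
{
#ifdef NUT_CPU_FREQ
    /* Fixed frequency known at build time: no cache, no hardware access. */
    return NUT_CPU_FREQ;
#else
    /* Query the hardware only if the cached value is invalid. */
    if (clock_cache[NUT_HWCLK_CPU] == 0) {
        clock_cache[NUT_HWCLK_CPU] = NutArchClockGet(NUT_HWCLK_CPU);
    }
    return clock_cache[NUT_HWCLK_CPU];
#endif
}
```

With NUT_CPU_FREQ defined, the function body collapses to a single return of a constant, which is the case that matters for the smallest targets.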
On the hardware dependent side there is a new function
    uint32_t NutArchClockGet(int idx);
which reads the hardware in case of an invalid cache value.
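
The interplay of the three functions could look roughly like the following sketch. It assumes a target with two clocks, uses 0 as the invalid cache value (matching freq=0 for "release") and stubs NutArchClockGet() with made-up frequencies. The arch_reads counter is hypothetical and only there to make the caching visible.

```c
#include <stdint.h>

/* Assumed index layout for a two-clock target. */
#define NUT_HWCLK_CPU        0
#define NUT_HWCLK_PERIPHERAL 1
#define NUT_HWCLK_MAX        NUT_HWCLK_PERIPHERAL

/* Cache of clock frequencies in Hz; 0 marks an entry as invalid. */
static uint32_t clock_cache[NUT_HWCLK_MAX + 1];

/* Hypothetical counter of hardware reads, for demonstration only. */
static unsigned int arch_reads;

/* Stub for the hardware dependent part. A real port would evaluate
 * the PLL and divider registers here; these values are made up. */
uint32_t NutArchClockGet(int idx)
{
    arch_reads++;
    return (idx == NUT_HWCLK_CPU) ? 180000000UL : 90000000UL;
}

/* Hardware independent part: return the cached value, reading the
 * hardware only when the cache entry is invalid. */
uint32_t NutClockGet(int idx)
{
    if (idx < 0 || idx > NUT_HWCLK_MAX) {
        return 0;
    }
    if (clock_cache[idx] == 0) {
        clock_cache[idx] = NutArchClockGet(idx);
    }
    return clock_cache[idx];
}

/* For now only NutClockSet(-1, 0) is handled: release all cached values. */
int NutClockSet(int idx, uint32_t freq)
{
    int i;

    if (idx == -1 && freq == 0) {
        for (i = 0; i <= NUT_HWCLK_MAX; i++) {
            clock_cache[i] = 0;
        }
        return 0;
    }
    /* Setting specific clocks is not supported in this sketch. */
    return -1;
}
```

Code that changes PLL settings would call NutClockSet(-1, 0) afterwards, so that the next NutClockGet() reads the hardware again.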
This way we are also able to replace constructs like
    #if defined(AT91_PLL_MAINCK)
        clk = At91GetMasterClock();
    #else
        clk = NutGetCpuClock();
    #endif
by
    clk = NutClockGet(NUT_HWCLK_PERIPHERAL);
or, even more specifically,
    clk = NutClockGet(NUT_HWCLK_UART0);
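
As an example of what a driver gains from this, here is a sketch of a baud rate divisor calculation that no longer needs any target specific #if. UartBaudDivisor() is a hypothetical helper, the NutClockGet() stub and its frequencies are made up, and a UART with 16x oversampling is assumed.

```c
#include <stdint.h>

/* Assumed index layout: UART0 runs from the peripheral clock. */
#define NUT_HWCLK_CPU        0
#define NUT_HWCLK_PERIPHERAL 1
#define NUT_HWCLK_UART0      NUT_HWCLK_PERIPHERAL
#define NUT_HWCLK_MAX        NUT_HWCLK_PERIPHERAL

/* Stub of the proposed function with made-up frequencies in Hz. */
uint32_t NutClockGet(int idx)
{
    static const uint32_t demo_hz[NUT_HWCLK_MAX + 1] = {
        96000000UL, /* NUT_HWCLK_CPU */
        48000000UL  /* NUT_HWCLK_PERIPHERAL */
    };
    return (idx >= 0 && idx <= NUT_HWCLK_MAX) ? demo_hz[idx] : 0;
}

/* Hypothetical driver helper: divisor = clk / (16 * baud), rounded
 * to the nearest integer. Returns 0 for an invalid baud rate. */
uint32_t UartBaudDivisor(uint32_t baud)
{
    uint32_t clk;

    if (baud == 0) {
        return 0;
    }
    clk = NutClockGet(NUT_HWCLK_UART0);
    return (clk + 8UL * baud) / (16UL * baud);
}
```

The driver source stays the same on every target; only the index definitions in the target's timer.h decide which clock is actually queried.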
On targets with single clock support only, the indices are all the same:
    #define NUT_HWCLK_CPU 0
    #define NUT_HWCLK_PERIPHERAL NUT_HWCLK_CPU
    #define NUT_HWCLK_UART0 NUT_HWCLK_PERIPHERAL
    ...
    #define NUT_HWCLK_MAX NUT_HWCLK_PERIPHERAL
and the cache is declared as
    static uint32_t clock_cache[NUT_HWCLK_MAX + 1];
In such cases (1 clock only) the function NutClockGet() is mapped to the
optimized NutGetCpuClock() by
    #if NUT_HWCLK_MAX == 0
    #define NutClockGet(i) NutGetCpuClock()
    #endif
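
Put together for a single clock target, this could look like the following sketch. The NUT_CPU_FREQ value and the DemoUartClock() helper are assumptions for illustration only.

```c
#include <stdint.h>

/* Single clock target: all indices resolve to 0. */
#define NUT_HWCLK_CPU        0
#define NUT_HWCLK_PERIPHERAL NUT_HWCLK_CPU
#define NUT_HWCLK_UART0      NUT_HWCLK_PERIPHERAL
#define NUT_HWCLK_MAX        NUT_HWCLK_PERIPHERAL

/* Assumed fixed CPU frequency in Hz. */
#define NUT_CPU_FREQ 14745600UL

uint32_t NutGetCpuClock(void)
{
    return NUT_CPU_FREQ;
}

/* The preprocessor expands NUT_HWCLK_MAX down to 0 before evaluating
 * the condition, so the mapping is active on this target. */
#if NUT_HWCLK_MAX == 0
#define NutClockGet(i) NutGetCpuClock()
#endif

/* Hypothetical caller: the index argument is discarded by the macro,
 * leaving a plain call to NutGetCpuClock(). */
uint32_t DemoUartClock(void)
{
    return NutClockGet(NUT_HWCLK_UART0);
}
```

Because the index never reaches any code, neither the range check nor the cache array has to exist in the binary on such targets.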
This will hopefully not add any code to tiny systems, where every byte
counts. And it will provide high flexibility for upcoming chips without
creating a porting nightmare.
Harald