[En-Nut-Discussion] Nut OS reliable on AT91SAM7X Evaluation Board?

Tue Nov 17 19:33:07 CET 2009

Hi!

Marcus Jansson wrote:
> Ulrich Prinz wrote:
>> I'd like to implement PDC/DMA for the network and the SPI channels. 
> This can be done, the SPI seem to have a long list of errata associated with the PDC, though.
> 
Yes, I saw it and I don't like it :) But what should I do? I'd like to 
get optimal performance, and why carry bytes by hand if there is an 
automated way to do it in the background?

>> If supported, I'd like to implement it for a background memcpy too.
> I can not see how this can be done with the PDC, unless you're talking about a memcpy() to/from memories (possibly
> eeproms) connected to the SPI or similar. Otherwise a hand optimized LDM/STM block copy loop instead of the existing
> slow memcpy() might be in order for increasing memcpy() performance in internal RAM?
> 
Yeah, there might be room for optimization in the CPU-driven memcpy. 
But most systems do many different tasks. So why block the CPU with 
some dumb copy while other things that need calculation work have to 
wait? And it doesn't make programming much more complex.
It's another strategy behind the programming style, more the DSP way. 
So something gives you an interrupt and you set up DMA to fetch the 
data into a buffer_1. After the DMA finishes, you set up a function to 
modify the content at special places, let's say filling certain data 
into certain positions. Then you set up DMA to copy the buffer over to 
another function's buffer, where other data is filled in or frame 
information is prepared. This can be repeated many times until the 
chain reaches a point where everything is prepared and the data can be 
sent to another device like TCP or IIS or whatever. This can also be 
done via DMA.
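
A minimal C sketch of such a chain, to make the idea concrete. It does 
not touch any real PDC registers: the hypothetical dma_copy() helper 
simulates a transfer with memcpy() and calls the completion callback 
right away, where real hardware would raise an end-of-transfer 
interrupt. All names (dma_copy, stage_t, chain_start, chain_demo) are 
made up for this illustration and are not Nut/OS APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 32

/* Called where a real DMA end-of-transfer interrupt would fire. */
typedef void (*dma_done_fn)(void *ctx);

/* Hypothetical stand-in for a DMA transfer: copies synchronously,
 * then "raises the interrupt" by invoking the callback. */
static void dma_copy(void *dst, const void *src, size_t len,
                     dma_done_fn done, void *ctx)
{
    memcpy(dst, src, len);
    if (done)
        done(ctx);
}

/* One pipeline stage: a buffer plus a function that patches it. */
typedef struct stage {
    unsigned char buf[BUF_SIZE];
    size_t len;
    void (*modify)(struct stage *self);
    struct stage *next;
} stage_t;

/* Completion handler: patch this stage's buffer, then hand the
 * data to the next stage with another (simulated) DMA transfer. */
static void stage_done(void *ctx)
{
    stage_t *s = ctx;

    if (s->modify)
        s->modify(s);
    if (s->next) {
        s->next->len = s->len;
        dma_copy(s->next->buf, s->buf, s->len, stage_done, s->next);
    }
}

/* Kick off the chain by "receiving" data into the first stage. */
static void chain_start(stage_t *first, const void *data, size_t len)
{
    assert(len <= BUF_SIZE);
    first->len = len;
    dma_copy(first->buf, data, len, stage_done, first);
}

/* Example modifiers: each stage fills in one field of the frame. */
static void set_type(stage_t *s)   { s->buf[0] = 0xA5; }
static void set_length(stage_t *s) { s->buf[1] = (unsigned char)s->len; }

/* Runs a 4-byte frame through two stages and returns byte idx of
 * the final buffer. */
unsigned char chain_demo(size_t idx)
{
    static const unsigned char raw[4] = { 0, 0, 0x11, 0x22 };
    stage_t second = { .modify = set_length, .next = NULL };
    stage_t first  = { .modify = set_type,   .next = &second };

    assert(idx < BUF_SIZE);
    chain_start(&first, raw, sizeof raw);
    return second.buf[idx];
}
```

In this simulation the callbacks nest on the stack. With real DMA each 
stage_done() call would run from its own interrupt, and the CPU would 
be free between them.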

Now, while this DMA chaining doesn't make sense if you only have one 
buffer and slow incoming data, it is important for situations where you 
have several of each buffer in parallel. While receiving into buffer 1, 
copy out of buffer 0. While decoding in the first function's buffer 0, 
transfer decoded data to buffer 1 of the second function.
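
The two-buffer scheme could look roughly like this in C. This is only 
a sketch: the "DMA" is simulated with memset(), and pingpong_t, 
pingpong_rx_target(), pingpong_swap() and pingpong_demo() are 
hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLOCK 8

typedef struct {
    unsigned char buf[2][BLOCK];
    int fill;   /* index of the buffer the receiver is filling */
} pingpong_t;

/* Where the next receive transfer should write. */
static unsigned char *pingpong_rx_target(pingpong_t *pp)
{
    return pp->buf[pp->fill];
}

/* Called when the transfer into buf[fill] has completed: swaps the
 * roles and returns the buffer that is now ready for processing. */
static unsigned char *pingpong_swap(pingpong_t *pp)
{
    unsigned char *ready;

    assert(pp->fill == 0 || pp->fill == 1);
    ready = pp->buf[pp->fill];
    pp->fill ^= 1;
    return ready;
}

/* Receives `blocks` blocks and sums every byte that was processed. */
unsigned pingpong_demo(unsigned blocks)
{
    pingpong_t pp = { .fill = 0 };
    unsigned sum = 0;

    for (unsigned n = 0; n < blocks; n++) {
        /* Simulated DMA: block n arrives filled with the value n+1. */
        memset(pingpong_rx_target(&pp), (int)(n + 1), BLOCK);

        unsigned char *ready = pingpong_swap(&pp);
        /* Real hardware would start the next receive here, so that
         * it runs while the CPU works on `ready`. */
        for (size_t i = 0; i < BLOCK; i++)
            sum += ready[i];
    }
    return sum;
}
```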

The style of programming is then totally different from what it is in 
Nut/OS now, as you program functions as interrupt handlers. The software 
is controlled by the data moving through it and not the other, normal, 
way round.

It's a sort of DSP programming where you attach handlers to the buffers, 
the so-called pipes.

In my case I have to route radio packets that often only need another 
header in front of the existing data. Instead of creating another buffer 
of the size of the new header plus the packet itself, filling in the new 
header and attaching the packet with memcpy, I simply create a new 
buffer, add the header in front and DMA-copy the packet into the tail. 
In the DMA interrupt I set up another DMA transfer that pushes the new 
packet through SPI to the radio chipset.
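
A small C sketch of the header-plus-tail step, under stated 
assumptions: HDR_LEN is an arbitrary example size, and dma_copy() is a 
hypothetical stand-in that just calls memcpy(), where the real code 
would program a DMA channel and continue in its completion interrupt.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HDR_LEN 4   /* example header size, not a real radio format */

/* Hypothetical stand-in for a background DMA transfer. */
static void dma_copy(void *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);
}

/* Builds header + packet in `out`. Returns the total frame length,
 * or 0 if the frame does not fit into out_size bytes. */
size_t frame_wrap(unsigned char *out, size_t out_size,
                  const unsigned char hdr[HDR_LEN],
                  const unsigned char *pkt, size_t pkt_len)
{
    assert(out != NULL && hdr != NULL && pkt != NULL);

    if (pkt_len > out_size || out_size - pkt_len < HDR_LEN)
        return 0;

    memcpy(out, hdr, HDR_LEN);              /* CPU writes the few header bytes */
    dma_copy(out + HDR_LEN, pkt, pkt_len);  /* bulk payload goes into the tail */
    return HDR_LEN + pkt_len;
}
```

On real hardware the returned frame would be handed to a second 
transfer towards the SPI once the copy has finished.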

There is another way of handling this. I set up a table of pointers 
that collects a radio packet and its different headers from memory. 
But then I have to carefully track each pointer operation and have to 
check if another header exists and so on. It will save some time but I 
have to track a lot. The previous "DMA wakes DMA" approach is much more 
elegant, and the CPU can do lots of other things while the DMA works in 
the background.
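
For comparison, the pointer-table variant might be sketched like this 
in C. seg_t and gather_send() are hypothetical names, and "sending" 
is simulated by appending to a wire buffer instead of feeding the SPI. 
The per-segment checks hint at the bookkeeping mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One entry of the pointer table: a piece of the frame in memory. */
typedef struct {
    const unsigned char *ptr;
    size_t len;             /* 0 means "this optional header is absent" */
} seg_t;

/* Walks the table in order and emits every segment that is present.
 * Returns the number of bytes written, or 0 if `wire` is too small. */
size_t gather_send(const seg_t *segs, size_t nsegs,
                   unsigned char *wire, size_t wire_size)
{
    size_t off = 0;

    assert(segs != NULL && wire != NULL);

    for (size_t i = 0; i < nsegs; i++) {
        if (segs[i].len == 0)
            continue;                       /* skip a missing header */
        if (segs[i].len > wire_size - off)
            return 0;                       /* frame does not fit */
        memcpy(wire + off, segs[i].ptr, segs[i].len);
        off += segs[i].len;
    }
    return off;
}
```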

There is another reason for DMA. The SPI to the radio chipset can run at 
12 Mbit/s and more. You cannot put data together fast enough if you try 
to do that in polling mode... I guess that even at 4 Mbit/s I have larger 
gaps between the packets where the CPU is busy fetching bytes.
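
To put a rough number on this: the cycle budget per SPI byte is the 
CPU clock divided by the bit rate, times 8. The helper below is just 
that arithmetic in C. The 48 MHz core clock used in the note after it 
is an assumption for illustration, not a figure from this thread.

```c
#include <assert.h>
#include <stdint.h>

/* CPU cycles available per SPI byte: cpu_hz * 8 / spi_bps.
 * The product is computed in 64 bit so it cannot overflow. */
uint32_t cycles_per_byte(uint32_t cpu_hz, uint32_t spi_bps)
{
    assert(spi_bps > 0);
    return (uint32_t)(((uint64_t)cpu_hz * 8u) / spi_bps);
}
```

With an assumed 48 MHz core this gives 32 cycles per byte at 
12 Mbit/s and 96 cycles at 4 Mbit/s. Polling the status flag, moving 
the byte and running the loop all have to fit into that budget.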

> What was the reason for not having Thumb code in Nut/OS?

I never thought about that, and I wasn't in the project at the time 
this decision was made. So you have to ask Harald about that.

Best regards, Ulrich