[En-Nut-Discussion] Questions regarding UART implementation

Wed Sep 29 21:29:01 CEST 2010

Hi Ulrich,

     There are a lot of topics here :)

On Tue, Sep 28, 2010 at 1:44 PM,  <uprinz2 at netscape.net> wrote:
> Hi Thiago,
>
>
> thanks for the infos. I thought that this would happen :) Lots of nice feature switches and none implemented.
> I agree with you that sometimes it makes sense to determine the old lines and to abstract them. It's a good idea and I try to support it.
>
>
> For the blockread / blockwrite functions I found a misleading description. So there should be some totally different things:
> USART_BLOCKWRITE should control block transfer for writing.
> USART_BLOCKREAD should control block transfer for reading.
> USART_BLOCKING should control if the calling thread is blocked until transceiving is done.
> I think there is an option ASYNC too which I would call the one that controls if a thread is blocked or not on calling a usart transfer (read or write).
> But that is not working or not implemented.

From what I understand, it helps when reading from devices which send
fixed-size "packets". AFAIK it's not implemented in any of our archs.
Even on PCs it's fairly uncommon to use it.
Even people who write bad code to read from barcode scanners don't
usually use that feature. It's usually much better to find the STX or
ETX yourself, or otherwise parse the protocol properly.
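
A minimal sketch of finding STX/ETX yourself, fed one received byte at a
time. The frame_parser type, the function names and the 64-byte limit are
made up for illustration and are not part of Nut/OS. It also assumes the
payload never contains the STX or ETX byte values.

```c
#include <stddef.h>
#include <stdint.h>

#define STX 0x02  /* ASCII start-of-text */
#define ETX 0x03  /* ASCII end-of-text */

/* Hypothetical parser state, not a Nut/OS structure. */
typedef struct {
    uint8_t buf[64];   /* payload collected so far */
    size_t  len;       /* number of payload bytes in buf */
    int     in_frame;  /* non-zero after STX, until ETX */
} frame_parser;

void frame_parser_init(frame_parser *p)
{
    p->len = 0;
    p->in_frame = 0;
}

/* Feed one byte. Returns the payload length once ETX completes a frame
 * (the payload is then in p->buf), or -1 while no frame is complete. */
int frame_parser_feed(frame_parser *p, uint8_t byte)
{
    if (byte == STX) {            /* (re)start, dropping any partial frame */
        p->len = 0;
        p->in_frame = 1;
        return -1;
    }
    if (!p->in_frame)
        return -1;                /* ignore noise between frames */
    if (byte == ETX) {
        p->in_frame = 0;
        return (int)p->len;
    }
    if (p->len == sizeof(p->buf)) {
        p->in_frame = 0;          /* oversized frame: drop it and resync */
        return -1;
    }
    p->buf[p->len++] = byte;
    return -1;
}
```

The caller just loops over whatever read() delivered and acts whenever the
return value is not -1, so the driver itself never needs to know the
packet size.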

>
> For all those functions I miss something too: If you use transfers async, you will not get a reply on the read/write that is valid as the transfer is not finished at that time. So _ioctl needs another option too. Besides getting the information about the errors from the last transfer one needs to get the status of the current transfer, i.e. the number of bytes transferred and whether the transfer is ongoing or was aborted for whatever reason.
>

A while ago I wrote a serial port class for my desktop apps and spent
some time digging through the Unix and Windows APIs. On Linux, if you use
a non-blocking transfer and no data is available, read() returns -1 and
sets errno to EAGAIN (or EWOULDBLOCK), much like the sockets API.
Otherwise it returns the number of bytes read. Is that what you mean?
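
For reference, a small POSIX sketch of that behaviour. On POSIX systems a
non-blocking read() with nothing to deliver returns -1 and sets errno to
EAGAIN or EWOULDBLOCK. The helper names and the TRY_READ_* codes are my
own, only fcntl(), read() and the errno values are the real API.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

#define TRY_READ_WOULD_BLOCK (-1)  /* hypothetical result codes */
#define TRY_READ_ERROR       (-2)

/* Switch a descriptor to non-blocking mode. Returns 0 on success. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Returns the number of bytes read (0 means end of file),
 * TRY_READ_WOULD_BLOCK if nothing is available yet, or
 * TRY_READ_ERROR on any other failure. */
int try_read(int fd, void *buf, size_t len)
{
    ssize_t n = read(fd, buf, len);
    if (n >= 0)
        return (int)n;
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return TRY_READ_WOULD_BLOCK;
    return TRY_READ_ERROR;
}
```

An async USART read could report "nothing yet" in the same way, with a
separate status query for transfers that are still in flight.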

> So what we have is a usart that relies on ringbuffer even the ringbuffer struct supplies blockptr / blockcntr.
> If you use packet oriented data, you cannot handle timeouts on packets because the timeouts are based on the ringbuffer and, if the ringbuffer is too small two locks block your thread no, better, something blocks you that you cannot determine.

True. Even if we implement the packet based API, we would have to make
the serial buffer size configurable and make the call fail with some
error code if it requests a packet larger than the buffer.
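
Roughly like this, where the function name and the error codes are
hypothetical and only show the idea of rejecting the request up front
instead of blocking forever:

```c
#include <stddef.h>

/* Hypothetical result codes, not taken from Nut/OS. */
enum { PKT_OK = 0, PKT_ERR_TOO_LARGE = -1 };

/* Check a packet request against the configured ring buffer size.
 * A packet that can never fit is refused immediately. */
int usart_packet_check(size_t rbf_size, size_t pkt_len)
{
    return (pkt_len > rbf_size) ? PKT_ERR_TOO_LARGE : PKT_OK;
}
```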

>
> If you have a function that allocated memory to form a block you don't need to copy it to the ringbuffer for transfer but the actual implementation does.
> On smaller systems that need packet oriented communication ( block transfer) the ringbuffer memory could be freed completely.

This would be a nice trick. But I'm not sure how it would fit in our
current driver structure. Somehow we would have to get the buffer
pointer down to the driver level.
Then again, I wonder if there is real use for the packet oriented reads.

>
> On bigger systems with DMA/PDC support, you save a lot of CPU time for all those TXE Interrupts that do not appear.
>
>
> Unfortunately I cannot implement DMA in the actual structure as DMA should lock the calling thread until transfer is over or set a signal after finishing the transfer.
> I tried to do that by using the normal StartTx(void) function that will raise a TXE Interrupt and this first TxReadyIrq( RINGBUF *rbf) will setup the DMA process.
> Unfortunately this function is out of thread as it is an interrupt and therefore cannot set a NutEventWait that blocks the calling thread.
>

I'm confused. Why wouldn't the calling thread keep its blocked state
from read()?
Anyway, I thought about using DMA first with the EMAC driver, which
should benefit the most from it, as it transfers at least an ethernet
frame each time.
I see that u-boot, Linux and other kernels usually define an API for
DMA, with dma_alloc (similar to malloc). The question is: should we
try to do something like that and have each arch provide the
implementation, or should we confine the DMA engine within the arch
folder as a private API, so each port does it as it pleases?
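
To make the first option concrete, here is a rough sketch of a common DMA
interface with a per-arch table behind it. Everything in it (dma_ops, the
host_* stand-in that simply uses the heap, the wrapper names) is
hypothetical and not existing Nut/OS or u-boot code.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical common interface: each arch would supply its own table. */
typedef struct {
    void *(*alloc)(size_t size);   /* get a DMA-capable buffer */
    void  (*release)(void *buf);   /* give it back */
} dma_ops;

/* Stand-in "arch" for host-side testing: plain heap memory. A real port
 * would hand out memory that its DMA controller can actually reach. */
static void *host_dma_alloc(size_t size) { return malloc(size); }
static void  host_dma_release(void *buf) { free(buf); }

const dma_ops host_dma_ops = { host_dma_alloc, host_dma_release };

/* Arch-independent drivers (EMAC, USART, ...) only call these wrappers. */
void *dma_alloc(const dma_ops *ops, size_t size)
{
    return ops->alloc ? ops->alloc(size) : NULL;
}

void dma_free(const dma_ops *ops, void *buf)
{
    if (buf && ops->release)
        ops->release(buf);
}
```

With the second option the same two functions would still exist, but only
inside each arch folder and with whatever signature that port prefers.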

> So here's what I am thinking about:
> Assume that all functions for the USARTs get the USARTDCB as an argument:
> - You can implement block-control with DMA in StartTx on CPUs that support that feature.
> - You can implement block transfer by saving the ringbuffer (Data is taken from the calling functions buffer pointer)
> - You can write totally different usart drivers for totally different architectures by keeping the Nut/OS usual function calls.
>
>
> In my STM32 implementation I fear that if I can call one set of functions from all usart interrupts the code in flash will be much lower even I implement and enable all features.
> All features mean: HW/SW-Handshake, DTR/DSR support, XON/XOFF, STX//ETX, Full-/Half-Duplex, IRDA, ...
>

It makes a lot more sense to have all USART functions receive the DCB
structure or the DEVUSART structure. That's how drivers in Linux and
the Windows Driver Model work (sort of).

> The drawback of this change would be that all architectures have to be modified to pass DEVUSART *dev or at least USARTDCB *dcb to all functions.
> That would lead to one small problem, any function accessing the ringbuffer needs to derive it from the dcb.
> For a 72MHz STM32 it's not a problem to do RINGBUF *rbf = dcb->dcb_rbf; at every function start. But how is that on an AVR?

That could easily be offset by any deduplication we achieve in the
code. It should not be too much per function really.
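
As a sketch of what that looks like: one TX-ready routine shared by every
USART, deriving the ring buffer from the dcb at the start. The struct
layouts and the dcb_txreg field are simplified stand-ins I made up, not
the real RINGBUF / USARTDCB definitions. Only the dcb->dcb_rbf access
comes from your mail.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a ring buffer (hypothetical layout). */
typedef struct {
    uint8_t *data;
    size_t   size;    /* capacity in bytes */
    size_t   head;    /* next write index */
    size_t   tail;    /* next read index */
    size_t   count;   /* bytes currently queued */
} sketch_ringbuf;

/* Simplified stand-in for the per-USART control block. */
typedef struct {
    sketch_ringbuf   *dcb_rbf;    /* TX ring buffer of this USART */
    volatile uint8_t *dcb_txreg;  /* hypothetical transmit data register */
} sketch_dcb;

/* Shared by all USART interrupts. Sends one queued byte.
 * Returns 1 if a byte was written, 0 if the buffer was empty. */
int usart_tx_ready(sketch_dcb *dcb)
{
    sketch_ringbuf *rbf = dcb->dcb_rbf;   /* the one extra load per call */

    if (rbf->count == 0)
        return 0;
    *dcb->dcb_txreg = rbf->data[rbf->tail];
    rbf->tail = (rbf->tail + 1) % rbf->size;
    rbf->count--;
    return 1;
}
```

Each usartN interrupt stub then shrinks to a call with its own dcb, so
the body above exists only once in flash.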

> Ah, by the way. I am thinking about making things a bit more comfortable. So one could set "Use Interrupts" and "Use DMA" independently for every USART in a system.
> If it stays as it is, with usart1.c including usart.c, this saves some flash if the user unchecks the one or the other option.
> If there is only one usart.c called by the interrupts of usartx.c it could be an idea to include portions of the code only if at least one of the usarts has enabled that option.
> So DMA handling in the general driver is only enabled and compiled if at least one usart has the option set in nutconf.
>

Wouldn't it actually make the code bigger? Some routines would be
duplicated in the binary blob, one with DMA and another without. I'm
not sure if there is a use case where one would like to enable DMA for
one USART but not for the others.

> So now I have three options:
> 1 Modify usart.c / usart.h / uart.h to the new structure and hope that someone helps me to pull the AVR and ARM architectures up to that level.

I can help with AVR and AVR32.

> 2 Just split usart.h / uart.h into stm32_usart.h and other_usart.h while usart.h includes the one or the other depending on the architecture selected.

It can easily become a nightmare in terms of maintenance and portability.

> 3 Leave it as it is and forget about that all :)

Tempting *smile*
Actually I think Nut/OS already has the most comprehensive USART driver
of the RTOSes I know of, and for the applications we work with, that's
a huge benefit :)
But it's also quite hard to maintain the way it is... If a bug is
found in the flow control code, for instance, one has to remember to
fix it in all other archs, and it only gets worse with new platforms
being added.

>
> By the way, Option 2 is what I did for TWI cause STM32 has two interfaces and 4 interrupt providers ( two per interface) that call the same code existing only once.
> Old Tw*() functions are #defined to the stm32 specific functions. Works fine here :)
>

Yesterday I was thinking about a platform independent TWI. So we could
have platform independent drivers to access EEPROMs and Atmel QTouch
chips.

But I'm actually quite worried about the GPIO. I'm going to start
working on a board with a UC3B164 connected to sensors/relays. I would
like to see and use an interface to set pin functions and levels,
configure interrupts, etc. in a way that's standard and portable
between current and future platforms.
How are you handling this with STM32?
Btw, which STM32 are you using? I would like to take a look at the datasheets :)

Kind Regards,
    Thiago A. Correa