[En-Nut-Discussion] BUG?: Strange reboots and/or incorrect assembler calls to apparently random program addresses from ICC's xicall(). NutOS library calls using ICC optimizer fail. (sprintf, sprintf_P)
Brett Abbott
Brett.Abbott at digital-telemetry.com
Tue Mar 29 22:43:17 CEST 2005
re: Strange reboots and/or incorrect assembler calls to apparently
random program addresses from ICC's xicall().
NutOS library calls using ICC optimizer go to incorrect program
locations. (sprintf, sprintf_P)
I have tracked down a bug which causes the program to call the wrong
address in memory in certain circumstances. This behaviour comes and
goes based on code size/shape but isn't a "big" code size problem.
The resulting symptoms are "reboots" and "strange behaviour", with the
wrong functions being called and the processor even trying to
execute application data (idata_start!) or uninitialised interrupts.
This occurs inside the xicall function which is not finding the correct
function address for some functions in specific situations.
The problem can be reliably reproduced using the AVR Simulator so I am
confident that this is not a hardware problem. This leaves me wondering
if I have a compiler issue, an issue with the way NutOS libraries are
linked, or an issue with my environment.... (Hmmm, too many wonders)
The problem can be prevented if you include crt/sprintf.c,
crt/vsprintf.c, crt/sprintf_P.c and crt/vsprintf_P.c in the application
project source list and compile them as part of the application. Not so
ideal.
I detail the findings, scenarios and a bodgy workaround that seems to
solve the problem below. I have checked and triple checked the
environment, paths, object files etc. but perhaps I'm missing something.
Any advice is appreciated, or if you have seen similar things - this
would help me focus on cause. Unless I understand the cause of the
issue, my regression testing will be forever extended.
Environment
-----------------
1. NutOS, 3.9.5
2. ICC AVR 6.31A, Code Optimiser on, resulting code @ 80% full with 4K
Bootloader.
3. Ethernut 2
4. ATmega128, 32K SRAM
5. NutOS library code located in source directories. Object files
compiled into target directory structure. Libraries copied to icc/lib
as expected.
6. JTAG 1, AVR Studio 4.11 (build 401)
Thank you
Brett
Symptoms/Observations
----------------------------
* Code executes as expected until an indirect call is executed using
xicall (indirection to support compression), at which point it jumps to
the wrong address.
* The address called in error is consistent across every execution and
compilation but may change when the source changes, i.e. it is not random.
* Sometimes it calculates the correct address if you have just the right
size of code. This is usually when you introduce enough debug code to
materially alter the code; it then works until you take out the
debug code or add more code.
* This behaviour occurs on hardware and on the AVR Simulator.
* Errors occur without external hardware such as Ethernet being
accessed, i.e. on the native AVR ATmega128 alone.
* Only certain library function calls fail. Typically they are calls to
functions which then call other functions. Both of these functions are
in the same library but compiled from separate .o files; the .c files
are in another directory again. (NutOS files: calls to sprintf(), which
then fail when calling vsprintf(), and calls to sprintf_P(), which fail
when calling vsprintf_P(). Note these are NutOS sources, not ICC. I
will include source below for one of these.)
* If I add both sprintf.c and vsprintf.c to the project list for the
application, it forces the recompile of the two functions, creates local
copies of the object files and the problem goes away - possibly the
linker is losing track of where to send the indirection? Strange that
final application code size should alter this - without changing the
libraries.
* Other nutos library calls work ok. (eg. fprintf etc)
* I am uncertain if the xicall lookup table is wrong or xicall is
looking in the wrong place.
* Are there any other people using ICC compression and sprintf?
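To make the suspected failure mode concrete, here is a minimal host-side model in C of a table-driven indirect call. It is only a sketch: the names demo_func_lit and demo_xicall are hypothetical stand-ins, and this is not ICC's actual xicall routine or func_lit layout. It shows why a wrong table slot gives a wrong but perfectly repeatable call target.

```c
#include <stddef.h>

typedef int (*demo_fn)(int);

static int demo_double(int x) { return x * 2; }
static int demo_negate(int x) { return -x; }

/* Hypothetical stand-in for a literal table of function addresses:
 * one slot per function that is called indirectly. */
static const demo_fn demo_func_lit[] = { demo_double, demo_negate };

#define DEMO_FUNC_LIT_COUNT (sizeof(demo_func_lit) / sizeof(demo_func_lit[0]))

/* Hypothetical stand-in for an indirect-call helper: fetch the target
 * from the table slot and call it. Returns 0 on success, or -1 if the
 * slot is out of range or result is NULL. */
int demo_xicall(size_t slot, int arg, int *result)
{
    if (slot >= DEMO_FUNC_LIT_COUNT || result == NULL) {
        return -1;
    }
    *result = demo_func_lit[slot](arg);
    return 0;
}
```

In this model, a caller that was linked against slot 0 but ends up reading slot 1 runs demo_negate instead of demo_double on every run, with no error reported. On the real target there is no range check, so a bad slot could just as easily point into data or an unused vector.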
How to track down this "bug"
-----------------------------------
As the bug typically occurs at exactly the same place in the code
(complicated only by real world events), I placed flushed "writes" to a
UART at key points in the code until I got as close as possible to the
problem in my app, and then used the JTAG to step through at the
instruction level.
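The flushed-write technique can be sketched roughly as follows. This is a host-side illustration using standard C stdio; demo_checkpoint and DEMO_CHECKPOINT are hypothetical names, and on the target the stream would be whichever one is attached to the UART.

```c
#include <stdio.h>

static unsigned demo_checkpoint_count;

/* Write a marker and flush it immediately, so the marker has left the
 * buffer before the next (possibly fatal) call is made.
 * Returns the number of checkpoints passed so far. */
unsigned demo_checkpoint(FILE *out, const char *file, int line)
{
    fprintf(out, "CP %s:%d\n", file, line);
    fflush(out);
    return ++demo_checkpoint_count;
}

/* Convenience wrapper that records the call site automatically. */
#define DEMO_CHECKPOINT(out) demo_checkpoint((out), __FILE__, __LINE__)
```

Placing DEMO_CHECKPOINT(stdout) before and after a suspect call narrows the fault to the last marker that was printed. Bear in mind that adding such calls changes the code size, which in this case can itself make the problem appear or disappear.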
The "beauty" of this issue is that when you catch it, it is reproducible.
Why could it be a....
----------------------
1) Compiler/Linker problem
The intermittent nature (by code version) but reliably reproducible
nature of this when it occurs suggests that we may have found a scenario
where the linker is confused. NutOS has multiple levels of indirection; add
to this compression from ICC. The fact that it occurs on such widely
used functions (in the NutOS world) suggests an obscure pagey memory
mappy type thing... The latest version, 6.31A, has been in use here for
some time (i.e. not a recent change). Of course we may be using an
unsupported method...
The problem goes away (ie. lookup table in func_lit's is correct) when
the source is compiled in the current application directory at
application compile time. This could suggest a linker problem or could
just be masking it.
2) Library linking issue
Perhaps the move to having .o files in different directories to the .c
files has resulted in a more complex environment for the linker or
perhaps the order of linking causes a mismatch in mapping tables.
Having local copies of the .o file seems to solve this (the .c file
stayed in the library folders)
3) Environment Problem
Aha, the obvious answer and always the most likely. I believe I've
eliminated stale libraries and use of the wrong libraries. I've confirmed
that changes to NutOS libraries are carried through to the executing
code. It is possible that something is altering the memory layout by
including a different structure or object at library compile time than at
application compile/link time. I've checked that all -D options are the
same between library make time and application make time. I have now
reinstalled NutOS a dozen times (tried many versions), and ICC several
times, so I think I haven't screwed up anything silly (but who can be sure?).
4) Source code problem.
The obvious other answer. Perhaps I'm using sprintf or sprintf_P
incorrectly?
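On the environment theory (point 3), one cheap cross-check is to have the library report the size of a shared structure as it saw it at its own compile time, and compare that with what the application sees. The sketch below is only an illustration: demo_handle_t, DEMO_WIDE_HANDLE and both functions are hypothetical, and in a real build the two functions would live in separately compiled files.

```c
#include <stddef.h>

/* Hypothetical shared header content. A -D option present in only one
 * of the two builds would change this layout silently. */
typedef struct {
    int fd;
#ifdef DEMO_WIDE_HANDLE
    long extra;
#endif
    char flags;
} demo_handle_t;

/* Library side: the structure size as the library build saw it. */
size_t demo_lib_handle_size(void)
{
    return sizeof(demo_handle_t);
}

/* Application side: returns 1 if the application's view of the
 * structure size matches the library's view, 0 otherwise. */
int demo_layout_matches(size_t app_size)
{
    return app_size == demo_lib_handle_size();
}
```

Calling demo_layout_matches(sizeof(demo_handle_t)) early in main() would flag a mismatch in -D options before it shows up as corrupted memory. It only catches size differences, not reordered fields of equal size.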
Any help is appreciated. Let me know if you have seen similar issues.
I suspect this doesn't occur on gcc.
Many Many Thanks
Brett
// main.c portion - one of the offending commands
// variable defs
char *OutText;  /* points to a heap allocation of 200 bytes */
prog_char P_XMLDATA1_s1[] = "<tr x=\"%s\" t=\"a\" m=\"i\"";
/* prog_char is: #define prog_char const char */
volatile char XID_String[15];  /* typically a 4-character alphanumeric, 0-terminated string */
// offending code
sprintf_P(OutText, P_XMLDATA1_s1, XID_String);
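As a sanity check on the usage question, the call above can be mirrored on a host with a bounded formatter to confirm that the output fits the 200-byte buffer. This is a sketch in standard C99: demo_fmt and demo_format_xid are hypothetical names, the format string sits in ordinary memory rather than program space, and whether a bounded variant is available on the target is an assumption.

```c
#include <stdio.h>
#include <string.h>

/* Same format string as in the post, held in ordinary memory here. */
static const char demo_fmt[] = "<tr x=\"%s\" t=\"a\" m=\"i\"";

/* Format xid into out without overrunning it.
 * Returns the number of characters written (excluding the terminator),
 * or -1 on bad arguments, a formatting error, or if it would not fit. */
int demo_format_xid(char *out, size_t out_size, const char *xid)
{
    int n;

    if (out == NULL || xid == NULL || out_size == 0) {
        return -1;
    }
    n = snprintf(out, out_size, demo_fmt, xid);
    if (n < 0 || (size_t) n >= out_size) {
        return -1;
    }
    return n;
}
```

With a 4-character ID the result is 24 characters plus the terminator, far below 200 bytes, so a plain buffer overrun looks unlikely for this particular call as long as XID_String really is 0-terminated.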
// sprintf_p.c
#include "nut_io.h"
/*!
* \addtogroup xgCrtStdio
*/
/*@{*/
/*!
* \brief Write formatted data to a string.
*
* Similar to sprintf() except that the format string is located in
* program memory.
*
* \param buffer Pointer to a buffer that receives the output string.
* \param fmt Format string in program space containing conversion
* specifications.
*
* \return The number of characters written or a negative value to
* indicate an error.
*/
int sprintf_P(char *buffer, PGM_P fmt, ...)
{
    int rc;
    va_list ap;

    va_start(ap, fmt);
    /* Bugfix by Ralph Mason. */
    rc = vsprintf_P(buffer, (char *) fmt, ap);
    va_end(ap);

    return rc;
}
//vsprintf_p.c
#include "nut_io.h"
#include <string.h>
#include <sys/heap.h>
/*!
* \addtogroup xgCrtStdio
*/
/*@{*/
static int _sputb(int fd, CONST void *buffer, size_t count)
{
    char **spp = (char **) ((uptr_t) fd);

    memcpy(*spp, buffer, count);
    *spp += count;

    return count;
}
/*!
* \brief Write argument list to a string using a given format.
*
* Similar to vsprintf() except that the format string is located in
* program memory.
*
* \param buffer Pointer to a buffer that receives the output string.
* \param fmt Format string in program space containing conversion
* specifications.
* \param ap List of arguments.
*
* \return The number of characters written or a negative value to
* indicate an error.
*/
int vsprintf_P(char *buffer, PGM_P fmt, va_list ap)
{
    int rc;
    char *rp;
    size_t rl;

    rl = strlen_P(fmt) + 1;
    if ((rp = NutHeapAlloc(rl)) == 0)
        return -1;
    memcpy_P(rp, fmt, rl);
    rc = _putf(_sputb, (int) ((uptr_t) &buffer), rp, ap);
    NutHeapFree(rp);
    *buffer = 0;

    return rc;
}
(All code copyright as per original source)
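For readers unfamiliar with the pattern in vsprintf_P(), the listing hands the callback _sputb to _putf as a function pointer, together with the address of the output pointer so that each write advances it. The host-side sketch below illustrates the same idea under stated assumptions: all demo_ names are hypothetical, demo_emit is a trivial stand-in for the formatter and not _putf itself, and the context is passed as void * rather than squeezed through an int.

```c
#include <stddef.h>
#include <string.h>

typedef int (*demo_put_fn)(void *ctx, const void *data, size_t count);

/* Mirrors the idea of _sputb: ctx is the address of a write cursor,
 * which is advanced past the bytes just copied. */
static int demo_sputb(void *ctx, const void *data, size_t count)
{
    char **cursor = (char **) ctx;

    memcpy(*cursor, data, count);
    *cursor += count;
    return (int) count;
}

/* Hypothetical stand-in for a formatter: pushes each piece through the
 * callback. Returns the total bytes emitted, or -1 on callback error. */
static int demo_emit(demo_put_fn put, void *ctx,
                     const char *const *pieces, size_t n)
{
    int total = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        int rc = put(ctx, pieces[i], strlen(pieces[i]));
        if (rc < 0) {
            return -1;
        }
        total += rc;
    }
    return total;
}

/* Mirrors the shape of vsprintf_P: pass the callback plus &buffer, then
 * terminate at the advanced cursor. The caller must size the buffer. */
int demo_join(char *buffer, const char *const *pieces, size_t n)
{
    int rc = demo_emit(demo_sputb, &buffer, pieces, n);

    *buffer = '\0';
    return rc;
}
```

The call through put inside demo_emit is an indirect call. In the listing above, the equivalent is the _sputb callback handed to _putf, which is the only function-pointer use visible in that source.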
--
-----------------------------------------------------------------
Brett Abbott, Managing Director, Digital Telemetry Limited
Email: Brett.Abbott at digital-telemetry.com
PO Box 24 036 Manners Street, Wellington, New Zealand
Phone +64 (4) 5666-860 Mobile +64 (21) 656-144
------------------- Commercial in confidence --------------------