[En-Nut-Discussion] First summary - Building Nut/OS fast...very fast

Harald Kipp harald.kipp at egnite.de
Wed Jul 20 16:30:14 CEST 2011


Hi all,

This is a first summary of what I extracted from your valuable hints and pointers. Many thanks for taking the time to help me with this matter.

If you read this out of context: This text is about speeding up cross-building for a large number of targets.

The PC currently used has an i7-950 CPU with 12 GB of DDR3-1600 RAM and runs 64-bit Windows 7. Tests were done on a normal hard disk and on an SSD.


1. Measuring run time

I've added time logging to the Lua script. The updated version is here:

 http://ethernut.svn.sourceforge.net/viewvc/ethernut/branches/nut-4_10-branch/nut/tools/packaging/distcheck.lua
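
The logging is nothing fancy. Roughly, it works like this (a simplified sketch, not the exact code from the script; the step name, log file name and build directory are only examples):

 -- Simplified sketch of the time logging (not the exact code from
 -- distcheck.lua; step name, log file and build directory are examples).
 -- Runs one build step and appends its wall-clock time to a log file.
 local function timed_step(name, command)
     local started = os.time()
     local status = os.execute(command)   -- shell exit status (Lua 5.1)
     local elapsed = os.difftime(os.time(), started)
     local log = assert(io.open("stats.log", "a"))
     log:write(string.format("%s: %d s (%s)\n",
               name, elapsed, status == 0 and "ok" or "FAILED"))
     log:close()
     return status == 0
 end

 timed_step("arm-gcc libs", "make -C nutbld clean all install")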


2. Using solid state drives

In the archive

 http://www.ethernut.de/arc/buildstats-4.10.0.zip

you will find the file stats-4.10.0-hdd-s.log, which shows the time used to build Nut/OS 4.10 on a normal hard disk for all supported targets. The file stats-4.10.0-ssd-s.log shows the result when running on an OCZ Agility2 SSD. While 5975 seconds were needed with a low-performance Hitachi hard drive, it took 5185 seconds when running everything on the SSD, except for the ImageCraft compiler, which was still loaded from the hard disk. That compiler is simpler, though, and already runs much faster than GCC.

The result: either the OCZ Agility2 was a bad choice, or the disk drive is not the bottleneck at all.


3. Using faster RAM chips

Several people stated that upgrading to higher-performance SDRAM will not noticeably improve the speed, but will cost a lot more money.


4. Using a RAM disk

Ole suggested using a RAM disk. This is easy on Linux, but on Windows it requires additional proprietary software. A first test with a demo version from memory.dataram.com failed because it is limited to 4 GB, and the disk became full near the end of the script.

Anyway, it turned out that the RAM disk was even slightly slower than the SSD. Maybe this has to do with the fact that less memory is then available for caching, but that's my layman's assumption. ;-) Despite Ole's experience on Linux, no significant improvement is gained on Windows when switching from a normal hard disk to a RAM disk.


5. Concurrent processing

A third file in the archive mentioned above, stats-4.10.0-ssd-m.log, shows the results when running make with option -j. Now the total time is cut by more than half; the time for building the libraries was even reduced by about two thirds.

Due to the mangled output of concurrently running processes, the error summary no longer works, making it hard to determine the cause of a failed build. As Nathan suggested, this could be solved by re-running the failed part in a single thread. I haven't implemented this yet, but it seems to be the way to go.
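
Roughly, my idea of it looks like this (only a sketch, nothing of this is implemented yet; directory and log file names are examples):

 -- Rough sketch of Nathan's suggestion, not implemented yet.
 -- Directory and log file names are only examples.
 local function build(dir, parallel)
     local jobs = parallel and "-j" or ""
     local cmd = string.format("make -C %s %s all >%s/build.log 2>&1",
                               dir, jobs, dir)
     return os.execute(cmd) == 0           -- exit status 0 means success
 end

 if not build("nutbld", true) then
     -- The parallel output is mangled, so repeat the failing part with a
     -- single make job to get a readable error log.
     build("nutbld", false)
 end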

Another problem is that "make -j clean all install" will not work in the application directory. Possibly this can be solved by rearranging the Makefile; for now I have split it into three separate calls, as sketched below. Ulrich questioned this problem, but he was probably referring to building the libraries only, not the samples.
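
The work-around currently looks about like this (again only a sketch; the application directory name is an example, and in this sketch -j is kept for the actual build step only):

 -- Work-around sketch: three separate make calls instead of a single
 -- "make -j clean all install" in the application directory.
 local appdir = "nutapp"                   -- example directory name
 assert(os.execute("make -C " .. appdir .. " clean") == 0)
 assert(os.execute("make -C " .. appdir .. " -j all") == 0)
 assert(os.execute("make -C " .. appdir .. " install") == 0)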

While different usages of -j have been suggested, no one has explained to me what is wrong with plain -j (without specifying the number of cores). For me this gives the best results.


6. The rm problem myth

In most cases my build starts with a clean, new subdirectory. Several messages on this list blamed 'rm' on Windows as the cause of long build times. Well, it's not really a "myth": running 'make clean' in a dirty directory may indeed increase the build time of the libraries by about 20%, but not every time. This is probably caused by certain file system features, which can be disabled and probably should be disabled for solid state drives.

Related article (in German):
http://www.pc-experience.de/wbb2/thread.php?threadid=30040/
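
I'm assuming the features meant here are the NTFS last-access-time updates and the 8.3 short-name generation. If so, they can be switched off once from an administrator console (standard Windows commands, not part of the build script):

 fsutil behavior set disablelastaccess 1
 fsutil behavior set disable8dot3 1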

Timings on a normal hard disk (without any system tweaking) are provided in the file stats-4.10.0-hdd-s-dirty.log. Note that I interrupted the build after some time. It's definitely not "blowing me off the chair", as Ulrich suggested. :-)

Nathan suggested using FAT32. I previously had bad experiences with FAT file systems and large numbers of file entries, but that was "decades" ago. It should be worth a try.

Ulrich suggested switching completely to building under Cygwin. Also worth a try, but definitely _not_ _on_ _my_ _machine_ ;-). After a lot of bad experience with SoDs and DLL nightmares on Cygwin-contaminated Windows PCs, this is no option for me.


7. Buildbots

Thiago brought up this topic, asking me what exactly I want to do. Right now a single Lua script is used to completely test and build a Nut/OS installer release for Windows. That may not be the best way to do it.

Buildbot is a so-called "continuous integration" tool. Some time ago ETH Zurich set up CruiseControl to check Nut/OS builds, but that service was cancelled. I looked around a bit and found Jenkins CI quite promising. More evaluation is probably required, and that will be a thread of its own. In general, it seems to be a good idea to split "build testing" from "release building".

If we install a CI tool, fast build times will become even more important.


8. Linux or Windows

The reason for testing the build on Windows is that the Windows release is more difficult to create than the Linux version.

I didn't compare build times on Windows and Linux hosts. What I found is that, when running single make processes, only 4 out of the 8 cores on the Windows 7 machine are utilized, and only at 10% to 20%. Nathan gives a plausible explanation for this: GCC at least runs the compiler in parallel with the assembler. I can imagine that something in this area causes some latency, and due to this latency the CPU cores are still sleeping most of the time.

After splitting off the testing part (either by using a buildbot or by modifying the existing Lua script), I should probably try it on Linux first.


Did I miss anything?

I'll keep you updated,

Harald



