Rolling Your Own Embedded Linux Distribution

Erik Andersen
Senior Software Engineer, Lineo Inc.

390 South 400 West
Lindon, UT 84042

Email: [email protected]

Abstract

This presentation will cover the software, tools, libraries, and configuration files needed to construct an embedded Linux operating system. Some of the software available for constructing embedded Linux systems will be discussed, and selection criteria for which tools to use for differing embedded applications will be presented. Throughout the paper, we will construct an embedded Linux distribution to perform a simple task using only the Linux kernel, the uClibc C library, BusyBox, and a bootloader. The presenter will then boot up the newly constructed embedded Linux operating system and show that it works perfectly.

Linux distributions[1] tend to be designed for server and desktop systems. As such, they deliver a full-featured, comprehensive set of tools for just about every purpose imaginable. Most Linux distributions, such as RedHat, Debian, or SuSE, provide hundreds of separate software packages adding up to several gigabytes of software. The goal of a server or desktop Linux distribution is to provide as much value as possible to the user; therefore, the large size is quite appropriate. The traditional server and desktop Linux focus has caused a number of things about the Linux operating system to be much larger than is desirable for building embedded systems. Since embedded devices represent a fundamentally different target for Linux, it is appropriate that embedded devices should use different software than what is commonly used on desktop systems. Linux has a number of strengths which make it extremely attractive for the next generation of embedded devices, but it is important that developers use the best software tools that are available to maximize the advantage of using Linux in the embedded space. This paper will describe some of the software tools available for building very small embedded Linux systems.

[1] The term "distribution" is used by the Linux community to refer to a collection of software, including the Linux kernel, application programs, and needed library code, which makes up a complete running system. Sometimes, the term "Linux" or "GNU/Linux" is also used to refer to this collection of software.

Introduction

Before we cover the software tools needed to build an embedded Linux system, we have to answer several questions: Why are we doing this? Why use embedded Linux? What are the advantages of using Linux? Is Linux small enough to fit inside the target device? There are a large number of embedded operating systems available that can be used for a moderate fee to develop high-quality embedded systems, so it's worthwhile to explore some of the advantages Linux has to offer. The answer to these questions depends largely on who you ask. The Free Software Foundation would like everyone to use Free Software, because it is the morally correct thing to do. Open Source advocates (these are the people who get the most press time of late) will tell you that Open Source software development can harness the creative efforts of the best software developers throughout the world and produce high-quality software as a result. While both these viewpoints are important, they are not generally sufficient to convince executive managers to switch to using embedded Linux to develop products.

On the other hand, what managers generally do care about is money, and nine times out of ten, they choose Linux because it is free, as in gratis. When making the choice between using Linux or using a proprietary operating system, many fail to consider the rest of the costs involved. How much does it cost to purchase a development seat using the chosen operating system? How available are software developers that have experience using the chosen operating system? If you have an existing staff of software developers, how much will it cost to retrain them? How much are the per-unit royalty costs? Can the operating system do the job it is required to do? Is Linux small enough to fit inside my device?

Now I am going to cheat a bit, and avoid giving any detailed, exhaustive answers to these questions. This is a technical presentation, not an editorial or an advocacy piece. Hopefully, if you are reading this paper now, you have already decided that Linux is a viable option for you.

When I began working on embedded Linux, the last of the preceding questions, "Is Linux small enough to fit inside my device?" was a difficult problem for us. I work at Lineo doing embedded Linux software development (my official job title is "Senior Software Engineer/Code Poet"). In fact, I was the first engineer hired after Lineo shifted its direction from embedded DOS to focus exclusively on embedded Linux. At Lineo, we had customers who wanted to deliver embedded devices which would run Linux in extremely small amounts of flash memory. This was a real challenge for us, since at the time, we were relying on the same applications which were used in standard Linux distributions on the desktop.

As I began to analyze how we could save space, it quickly became apparent that there were three main areas we could attack to shrink the footprint of Embedix (Lineo's embedded Linux distribution): the kernel, the application programs, and the libraries. Many of the higher-ups in Linux kernel development (including Linus himself) have been working on shrinking the footprint of the kernel. Thus, over the past year and a half, I have focused on the latter two areas, shrinking the footprint of the application programs and libraries required to produce working embedded Linux systems.

The C Library

Let's take a look at a common embedded Linux system, the Linux Router Project (http://www.linuxrouter.org/). The Linux Router Project, begun by Dave Cinege, was and continues to be a very commonly used embedded Linux system. Its self-described tagline reads "A networking-centric micro-distribution of Linux" which is "small enough to fit on a single 1.44MB floppy disk, and makes building and maintaining routers, access servers, thin servers, thin clients, network appliances, and typically embedded systems next to trivial." If we download a copy of one of the Linux Router Project's "idiot images" (I grabbed one from a mirror at ftp://sunsite.unc.edu/pub/Linux/distributions/linux-router/dists/current/idiot-image_1440KB_FAT_2.9.8_Linux_2.0.gz) and open it up, we can see a few very interesting facts:

    [root@sage /tmp]# mount idiot-image_1440KB_FAT_2.9.8_Linux_2.2 /mnt -oloop
    [root@slag /tmp]# mkdir test
    [root@slag /tmp]# cd test
    [root@slag test]# tar -xzf /mnt/ROOT.LRP
    [root@slag test]# du -ch bin root sbin usr var
    460k    bin
    8.0k    root
    264k    sbin
    12k     usr/bin
    304k    usr/sbin
    36k     usr/lib/ipmasqadm
    40k     usr/lib
    360k    usr
    56k     var/lib/lrpkg
    60k     var/lib
    4.0k    var/spool/cron/crontabs
    8.0k    var/spool/cron
    12k     var/spool
    76k     var
    1.2M    total
    [root@slag test]# du -ch lib
    24k     lib/POSIXness
    1.1M    lib
    1.1M    total

So as we look at the software contained in this embedded Linux system, we quickly notice that in a software image totaling 2.2 Megabytes, the libraries take up over half the space. If we look even closer at the set of libraries, we quickly find that the largest single component in the entire system is the GNU C library, in this case occupying nearly 650k. What is more, this is a very old version of the C library; newer versions of GNU libc, such as version 2.2.2, are over 1.1 Megabytes all by themselves! There are tools available from Linux vendors and in the Open Source community which can reduce the footprint of the GNU C library considerably; however, there are system design issues that may preclude the use of these tools. Even when these tools are appropriate, there are limits to the amount of space which can be reclaimed from the GNU C library in this way. In this paper I will not discuss the techniques of library reduction.

A single file that is so large certainly looks like low-hanging fruit. In practice, replacing the GNU C library for embedded Linux systems has not been an easy job at all.

The origins of uClibc

As I despaired over the large size of the GNU C library, I decided that the best thing to do would be to find another C library for Linux that would be better suited for embedded systems. I spent quite a bit of time looking around, and after carefully evaluating the various Open Source C libraries that I knew of[2], I sadly found that none of them were suitable replacements. Of all the Open Source C libraries, the library closest to what I imagined an embedded C library should be was uClibc. However, it also had a lot of problems at the time -- not the least of which was that uClibc had no central maintainer, it had no mechanism for supporting multiple architectures, and there had already been several source tree forks. In short, uClibc was a mess of twisty versions, all different. After spending some time with the code, I decided to fix it.

With the help of D. Jeff Dionne, one of the originators of uClinux[3], I ported uClibc to run on Intel compatible x86 CPUs. I then grafted in the header files from glibc 2.1.3 to simplify software ports, and I cleaned up the resulting breakage. This effort has made porting software to run with uClibc extremely easy. There were many functions in uClibc that were either broken or missing, and had to be written from scratch and/or rewritten. When appropriate, I sometimes grafted in bits of code from the current GNU C library and libc5. Once the core of the library was reasonably solid, I began adding a platform abstraction layer to allow uClibc to run on different types of CPUs. Once I had both the ARM and x86 platforms basically running, I made a few small announcements to the Internet. At this point, several people began to make regular contributions.

In January 2001, after a great deal of effort both on my part and on the part of the other contributors to uClibc, I was able to build the first shared library version of uClibc. At about the same time, a wrapper for the gcc compiler was contributed, which greatly simplified the process of compiling applications with uClibc.

[2] The Open Source C libraries I evaluated included Al's Free C RunTime library, which is available from http://www.k9wk.com/cdoc.html, dietlibc, which is available from http://www.fefe.de/dietlibc/, the minix C library from http://www.cs.vu.nl/cgi-bin/raw/pub/minix/2.0.0/src.tar, the newlib library from http://sources.redhat.com/newlib/, and the eCos C library from ftp://sourceware.cygnus.com/pub/ecos/

[3] uClinux is a port of Linux designed to run on micro-controllers which lack Memory Management Units (MMUs), such as the Motorola DragonBall or the ARM7TDMI. The uClinux web site is found at http://www.uclinux.org/.

Enough Background--Let's make something that works!

Now that I have certainly bored people with the history lesson, we finally get to the fun part, building our own embedded Linux system. To begin, we need a skeleton filesystem in which to build our embedded Linux distribution, so we will start by creating an empty filesystem image:

    [andersen@slag /tmp]$ dd if=/dev/zero of=root_fs bs=1k count=600
    600+0 records in
    600+0 records out
    [root@slag /tmp]# ls -sh /tmp/root_fs
    604k /tmp/root_fs
    [andersen@slag /tmp]$ mkfs.minix /tmp/root_fs
    224 inodes
    600 blocks
    Firstdatazone=11 (11)
    Zonesize=1024
    Maxsize=268966912

At this point, we now have a 600k file containing a minix filesystem. In order to copy files into this filesystem we must now mount it:

    [andersen@slag /tmp]$ mount /tmp/root_fs /mnt -o loop -t minix
    mount: only root can do that
    [andersen@slag /tmp]$ su
    Password:
    [root@slag /tmp]# mount /tmp/root_fs /mnt -o loop -t minix

Something important to notice here is that we had to have elevated (root) permissions in order to mount the filesystem. Now that the filesystem is mounted we can copy any file into it that we want. It is now time to begin compiling some source code.

Compiling uClibc

Before we can compile uClibc, we must first grab a copy of the source code and unpack it so it is ready to use. For this paper, we will just grab a copy of the daily uClibc snapshot.

    [root@slag /tmp]# wget -q ftp://oss.lineo.com/uClibc-snapshot.tar.gz
    [root@slag /tmp]# tar -xzf uClibc-snapshot.tar.gz
    [root@slag /tmp]# cd uClibc
    [root@slag uClibc]#

uClibc has a configuration file, Config, that can be edited to adjust the way the library is compiled, such as to enable or disable features (e.g., whether debugging support is enabled or not), and to select a cross-compiler. Since we are going to be targeting a standard Intel compatible x86 system, no changes to the configuration file are necessary. We can now begin the compilation process.

    [root@slag uClibc]# make
    [---------compilation omitted---------]
    make[1]: Leaving directory `/tmp/uClibc/unistd'
    ranlib libc.a
    Finding missing symbols in libc.a ...
     partial linking...
    No missing symbols found.
    gcc -s -nostdlib -shared -o libuClibc.so.1 -Wl,-soname,libuClibc.so.1 tmp/*.o

Finally finished compiling...

    [root@slag uClibc]# make install
    + mv -f /lib/libuClibc.so.1 /lib/libuClibc.so.1.old
    + rm -f /lib/libuClibc.so.1.old
    + cp libuClibc.so.1 /lib
    + chmod 644 /lib/libuClibc.so.1
    + chown -R root.root /lib/libuClibc.so.1
    + rm -f /lib/libuClibc.so
    + ln -s /lib/libuClibc.so.1 /lib/libuClibc.so
    + ldconfig

At this point we now have a fully compiled uClibc library which is ready to be used. Additionally, we now also have a compiler wrapper which we can use to compile applications using the uClibc library. We will put this wrapper to use in just a few minutes. But first, some more boring historical information.

The Origins of Busybox

As I mentioned earlier, the two pieces of embedded Linux that I chose to tackle were making smaller libraries and making smaller application programs. A typical Linux system contains a variety of command-line utilities from numerous different organizations and independent programmers. Among the most prominent of these utilities are the GNU shellutils, fileutils, textutils, and similar programs that can be run within a shell. The GNU utilities are very high-quality programs which are very, very feature-rich. The large feature set comes at the cost of being quite large -- prohibitively large for embedded systems. After some investigation, I determined that it would be more efficient to replace them rather than try to strip them down, so I began looking at alternatives.

Just as with alternative C libraries, there were several choices for small shell utilities: BSD has a number of utilities which could be used. The Minix operating system, which was recently released under a free software license, also had many useful utilities. Sash, the stand-alone shell, was also a possibility. After quite a lot of research, the one that seemed to be the best fit was Busybox. It also appealed to me because I was already familiar with Busybox from its use on the Debian boot floppies, and because I was acquainted with Bruce Perens, who was the maintainer. Starting approximately in October 1999, I began enhancing Busybox and fixing the most obvious problems. Since Bruce was otherwise occupied and was no longer actively maintaining Busybox, he eventually consented to let me take full ownership of Busybox.

Since that time, Busybox has gained a large following and attracted development talent from literally the whole world. It has been used in commercial products such as an IBM wristwatch and 3Com's Kerbango Internet Radio, with more happening all the time. So many new features and applets have been added to Busybox that the biggest challenge I now face is simply keeping up with all of the new patches that get submitted!

So, How Does It Work?

Busybox is a multi-call binary that combines many common Unix utilities into a single executable. When it is run, Busybox checks if it was executed by running a symlink, and if the symlink name matches an applet compiled into Busybox, it runs that applet. If it was run as "busybox", then Busybox will read the command line and try to execute any applet passed as the first argument. For example:

    [root@slag busybox]$ ./busybox date
    Wed Feb 28 17:17:52 MST 2001
    [root@slag busybox]$ ./busybox echo "hello there"
    hello there
    [root@slag busybox]$ ln -s ./busybox uname
    [root@slag busybox]$ ./uname
    Linux
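The dispatch logic just described can be sketched in a few lines of shell. This is only an illustration of the idea, not Busybox's actual code (which is written in C); the names `multicall`, `applet_hello`, and `applet_add` are hypothetical, and the first argument stands in for the name the program was started under.

```shell
#!/bin/sh
# Sketch of multi-call dispatch. All names here (multicall, applet_*) are
# hypothetical; the real Busybox implements this idea in C.

applet_hello() { echo "hello $*"; }
applet_add()   { echo $(( $1 + $2 )); }

# multicall INVOKED_AS [ARGS...]
# INVOKED_AS plays the role of argv[0]: the file name or symlink name
# that was used to start the program.
multicall() {
    name=$(basename "$1")
    shift
    # Started under the program's own name: the first argument names the applet.
    if [ "$name" = "multicall" ]; then
        if [ $# -eq 0 ]; then
            echo "usage: multicall APPLET [ARGS...]" >&2
            return 1
        fi
        name=$1
        shift
    fi
    # Look the name up in the table of compiled-in applets.
    case "$name" in
        hello) applet_hello "$@" ;;
        add)   applet_add "$@" ;;
        *)     echo "$name: applet not found" >&2; return 127 ;;
    esac
}

multicall ./multicall hello there   # comparable to "./busybox echo ..."
multicall /bin/add 2 3              # comparable to running a symlink named "add"
```

The two calls at the end print `hello there` and `5`. The point of the design is that one executable carries the shared code once, and each extra command costs only a symlink on the target filesystem.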

Busybox is designed such that the developer compiling it for his embedded system can select exactly which applets he wants to include in the final binary. Thus, it is possible to strip out support for unneeded applets, resulting in a smaller binary with a carefully selected set of commands. The customization granularity for Busybox even goes one step further: each applet can contain multiple features that can be turned on or off. Thus, for example, if you don't want to include command-line completion in the Busybox shell (lash), or you do not need to mount NFS filesystems, you can simply turn these features off, further reducing the size of the final Busybox binary.
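To make compile-time selection a little more concrete, here is a small sketch that lists which applets are switched on in a configuration header. It assumes, purely for illustration, a header in which each applet is a `#define BB_NAME` line and a disabled applet is commented out with `//`. The sample contents and the function name `enabled_applets` are hypothetical, so check the real configuration file in your own source tree.

```shell
#!/bin/sh
# Sketch: list enabled applets in a Config.h-style header.
# ASSUMPTION: applets appear as "#define BB_NAME" lines, and disabled ones
# are commented out with "//". The sample below is made up for illustration.

enabled_applets() {
    # Print NAME for every uncommented "#define BB_NAME" line on stdin.
    sed -n 's/^#define[[:space:]]\{1,\}BB_\([A-Z0-9_]*\).*$/\1/p'
}

sample_config='//#define BB_AR
#define BB_CAT
#define BB_ECHO
//#define BB_NFSMOUNT
#define BB_UNAME'

printf '%s\n' "$sample_config" | enabled_applets
```

With this sample the sketch prints `CAT`, `ECHO`, and `UNAME`, one per line. A listing like this is a quick way to review what will end up in the binary before starting a build.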

Compiling Busybox

Let's walk through a normal compile of Busybox. First, we must grab a copy of the Busybox source code and unpack it so it is ready to use. For this paper, we will just grab a copy of the daily Busybox snapshot.

    [root@slag /tmp]# wget -q http://busybox.lineo.com/busybox.tar.gz
    [root@slag /tmp]# tar -xzf busybox.tar.gz
    [root@slag /tmp]# cd busybox

Now that we are in the Busybox source directory we can configure Busybox so that it meets the needs of our embedded Linux system. This is done by editing the file Config.h so that only the applets and features we want are enabled. Additionally, there are several configuration settings which can be set in the file named Makefile. These settings are all prefaced by descriptions which are intended to make it obvious what each setting does. At this point, we do not need to change anything so we will proceed to the compilation process.

    [root@slag busybox]# make CC=/tmp/uClibc/extra/gcc-uClibc/gcc-uClibc-i386
    [---------compilation omitted---------]
    [root@slag busybox]# ldd ./busybox
            libuClibc.so.1 => /lib/libuClibc.so.1 (0x4000c000)
            ld-linux.so.1 => /lib/ld-linux.so.1 (0x40039000)
    [root@slag busybox]# ls -sh ./busybox
    140k ./busybox*

And we're done!

Installing Busybox to the Target

If you then want to install Busybox onto your target device, this is most easily done by typing: make install. The installation script automatically creates all the required directories (such as /bin, /sbin and the like) and creates appropriate symlinks in those directories for each applet that was compiled into the Busybox binary.

To continue with the loop-mounted filesystem from our earlier example, let's install Busybox onto it:

    [root@slag busybox]# make PREFIX=/mnt install
    [---------installation text omitted---------]

There now, that wasn't too difficult. However, we are not yet done. Our loop-mounted filesystem still does not have a copy of uClibc installed. If you recall a bit earlier, we ran the 'ldd' command to check the list of shared libraries which Busybox was linked against. In order for our system to work properly, every shared library listed by 'ldd' must be included in our target. So now is as good a time as any to install the required libraries.

    [root@slag busybox]# ldd ./busybox
            libuClibc.so.1 => /lib/libuClibc.so.1 (0x4000c000)
            ld-linux.so.1 => /lib/ld-linux.so.1 (0x40039000)
    [root@slag busybox]# mkdir /mnt/lib
    [root@slag busybox]# cp /lib/libuClibc.so.1 /mnt/lib
    [root@slag busybox]# cp /lib/ld-linux.so.1 /mnt/lib
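With only two libraries, copying them by hand is easy enough. For a binary with more dependencies, the same step could be scripted along the following lines. This is a rough sketch that assumes ldd prints lines of the form "name => /path (address)", as in the run above; the function name `lib_paths` and the `TARGET` variable are hypothetical.

```shell
#!/bin/sh
# Sketch: pull the resolved library paths out of ldd-style output.
# ASSUMPTION: each dependency line looks like "name => /path/to/lib (0xADDR)".

lib_paths() {
    # Keep only lines whose second field is "=>" and whose third field
    # is an absolute path, then print that path.
    awk '$2 == "=>" && $3 ~ /^\// { print $3 }'
}

# Sample input copied from the ldd run shown above.
sample='        libuClibc.so.1 => /lib/libuClibc.so.1 (0x4000c000)
        ld-linux.so.1 => /lib/ld-linux.so.1 (0x40039000)'

printf '%s\n' "$sample" | lib_paths

# On a real build host one might then copy each library into the image,
# with TARGET (hypothetical) pointing at the mounted root filesystem:
#   ldd ./busybox | lib_paths | while read -r lib; do cp "$lib" "$TARGET/lib/"; done
```

Run against the sample, this prints `/lib/libuClibc.so.1` and `/lib/ld-linux.so.1`. Keep in mind that ldd reports the copies found on the build host, so this shortcut only makes sense when host and target share the same architecture, as they do in this paper.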

With the Busybox binary and the required shared libraries installed, we really just have a little bit of wrap-up work to do to finish off our embedded Linux root filesystem. In particular, we need to create a directory for mounting the proc filesystem, we need a few entries in /etc, and we need some device special files (device nodes) in /dev. Let's finish all that work off now.

    [root@slag busybox]# mkdir /mnt/dev /mnt/dev/pts /mnt/etc /mnt/etc/init.d /mnt/proc /mnt/tmp
    [root@slag busybox]# (cd /dev; cp -a console core full hd[abcd] kmem mem null port ram ram0 ram1 random tty tty0 tty1 tty2 tty3 tty4 urandom vcs vcs0 vcs1 vcsa vcsa0 vcsa1 zero /mnt/dev)
    [root@slag busybox]# echo "/dev/root / minix ro 0 1" > /mnt/etc/fstab
    [root@sage /tmp]# ldconfig -qr /mnt
    [root@sage /tmp]# touch /mnt/etc/profile
    [root@slag busybox]# ln -s /proc/mounts /mnt/etc/mtab

Finally, we need to tell init what we want it to do when we boot up. For now we will just keep this simple, and simply start up a shell.

[root@sage /tmp]# echo "::respawn:/bin/sh" > /mnt/etc/inittab
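The single inittab line above is easier to read once it is split into its fields. The sketch below assumes the conventional inittab layout of four colon-separated fields (id, runlevels, action, process), which is general background rather than something stated in this paper; the function name `parse_inittab_line` is hypothetical.

```shell
#!/bin/sh
# Sketch: split an inittab entry into its four colon-separated fields.
# ASSUMPTION: the conventional layout  id:runlevels:action:process

parse_inittab_line() {
    # Only the first three colons separate fields; any further colons
    # remain part of the process field.
    printf '%s\n' "$1" | {
        IFS=: read -r id runlevels action process
        echo "id=[$id] runlevels=[$runlevels] action=[$action] process=[$process]"
    }
}

parse_inittab_line "::respawn:/bin/sh"
```

For our entry this prints `id=[] runlevels=[] action=[respawn] process=[/bin/sh]`: the first two fields are left empty, and the `respawn` action asks init to start /bin/sh again whenever it exits, so the console always has a shell.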

We now have our root filesystem finished and ready to go. But we still need to do a little more work before we can boot up our newly built embedded Linux system. For simplicity, I will boot our newly built system from a floppy disk. Let's finish things up quickly now.

    [root@sage /tmp]# umount /mnt
    [root@sage /tmp]# gzip -9 /tmp/root_fs
    [root@sage /tmp]# ls -sh /tmp/root_fs.gz
    164k /tmp/root_fs.gz

So now our root filesystem has been compressed and is ready to install on the boot media. To make things simple, I will install it to a floppy disk. Let's prepare the floppy disk so we can boot from it:

    [root@sage /tmp]# mformat A:
    [root@sage /tmp]# syslinux /dev/fd0
    [root@sage /tmp]# mcopy /tmp/root_fs.gz A:

So we now have a copy of our root filesystem on a floppy disk. Next we need to install a copy of the Linux kernel. To make things simple, I'm going to install a copy of the Linux kernel that I am currently running on my system.

[root@sage /tmp]# mcopy /boot/kernel-2.2.19pre3 A:linux

Finally, we need to configure the bootloader (in case you missed it a few steps ago, we are using the syslinux bootloader for this example). I happen to have a ready-to-use syslinux configuration file already in my /tmp directory, so I will now install that to the floppy disk as well:

    [root@sage /tmp]# cat SYSLINUX.CFG
    DEFAULT linux
    APPEND initrd=root_fs.gz root=/dev/ram0 rw
    TIMEOUT 1
    PROMPT 0
    [root@sage /tmp]# mcopy SYSLINUX.CFG A:

And now, finally, we are done. Our embedded Linux system is complete and ready to boot. And you know what? It is very, very small. Take a look.

    [root@sage /tmp]# mdir A:
     Volume in drive A has no label
     Volume Serial Number is 05D3-BF29
    Directory for A:/

    LDLINUX  SYS      5860 03-01-2001   6:25
    root_fs  gz     160348 03-01-2001   6:26  root_fs.gz
    linux           510337 03-01-2001   6:26  linux
    SYSLINUX CFG        76 03-01-2001   6:39
            4 files             676 621 bytes
                                779 776 bytes free
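The totals in the listing can be checked with a little shell arithmetic. The sketch below sums the four file sizes and then rounds each file up to whole clusters to account for the free-space figure. The 512-byte cluster size and the 1,457,664-byte data area are assumptions about a standard 1.44MB FAT12 floppy; neither number appears in the listing itself, and the function names are hypothetical.

```shell
#!/bin/sh
# Sketch: check the mdir totals.
# ASSUMPTIONS (not from the listing): 512-byte clusters and a
# 1,457,664-byte data area, as on a standard 1.44MB FAT12 floppy.

CLUSTER=512
DATA_AREA=1457664

# Plain sum of the file sizes given as arguments.
sum_sizes() {
    total=0
    for size in "$@"; do
        total=$(( total + size ))
    done
    echo "$total"
}

# Sum with every file rounded up to a whole number of clusters.
sum_allocated() {
    total=0
    for size in "$@"; do
        clusters=$(( (size + CLUSTER - 1) / CLUSTER ))
        total=$(( total + clusters * CLUSTER ))
    done
    echo "$total"
}

# File sizes taken from the mdir listing.
sizes="5860 160348 510337 76"
allocated=$(sum_allocated $sizes)
echo "bytes in files:  $(sum_sizes $sizes)"
echo "bytes allocated: $allocated"
echo "bytes free:      $(( DATA_AREA - allocated ))"
```

Under those assumptions the script reports 676621 bytes in files, 677888 bytes allocated, and 779776 bytes free, which agrees with both totals printed by mdir.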

With a carefully optimized Linux kernel (which this kernel unfortunately isn't) we could expect to have even more free space. And remember, every bit of space we save is money that embedded Linux developers don't have to spend on expensive flash memory. So now comes the final test; it is now time to boot from our floppy disk. Here is what you should see.

    [-----------kernel boot messages snipped-----------]
    Freeing unused kernel memory: 64k freed
    init started: BusyBox v0.50pre (2001.03.01-13:06+0000) multi-call binary -- GPL2

    BusyBox v0.50pre (2001.03.01-13:06+0000) Built-in shell (lash)
    Enter 'help' for a list of built-in commands.

    / # du -h
    136.0k  ./bin
    1.0k    ./sbin
    1.0k    ./usr/bin
    1.0k    ./usr/sbin
    3.0k    ./usr
    196.0k  ./lib
    1.0k    ./dev/pts
    2.0k    ./dev
    1.0k    ./etc/init.d
    5.0k    ./etc
    1.0k    ./proc
    1.0k    ./tmp
    346.0k  .
    / #

And there you have it -- how to build the world's smallest embedded Linux system.

Conclusion

The two largest components of a standard Linux system are the utilities and the libraries. By replacing these with smaller equivalents, a much more compact system can be built. Using Busybox and uClibc allows you to customize your embedded distribution by stripping out unneeded applets and features, thus further reducing the final image size. These space savings translate directly into decreased cost per unit, as less flash memory will be required. Combine this with the cost savings of using Linux, rather than a more expensive proprietary OS, and the reasons for using Linux become very compelling.
