authorDavid Oberhollenzer <goliath@infraroot.at>2020-12-06 11:52:21 +0100
committerDavid Oberhollenzer <goliath@infraroot.at>2020-12-06 12:54:14 +0100
commit97e2b8719f5265f6531e74b3ee1dcd6f41f378a4 (patch)
tree6141801e64fc8b096cec091e19f9d17f132d2041
parent04af9e5f139e04b0b3747948386076608ba5d290 (diff)
Undust the guide and apply some fixes
Some typos are removed, a few overly long run-on sentences are broken up and some chaotic structures are untangled. Also, some places which I now find confusing myself years after writing it are clarified and some pointless tangents removed.

Signed-off-by: David Oberhollenzer <goliath@infraroot.at>
-rw-r--r-- crosscc.md    | 197
-rw-r--r-- elfstartup.md |  20
-rw-r--r-- kernel.md     |  51
3 files changed, 139 insertions, 129 deletions
diff --git a/crosscc.md b/crosscc.md
index 553d139..bdd63f0 100644
--- a/crosscc.md
+++ b/crosscc.md
@@ -8,10 +8,8 @@ been tried on [Fedora](https://getfedora.org/) as well as on
[OpenSUSE](https://www.opensuse.org/).
The toolchain we are building generates 32 bit ARM code intended to run on
-a Raspberry Pi 3. [Musl](https://www.musl-libc.org/) is used a C standard
-library implementation. It should be possible to use the instructions provided
-here to any other system with some minor adjustments (i.e. read the manuals
-and do some thinking, don't just go ahead brainlessly).
+a Raspberry Pi 3. [Musl](https://www.musl-libc.org/) is used as a C standard
+library implementation.
## Directory Setup
@@ -55,7 +53,7 @@ The following source packages are required for building the toolchain. The
links below point to the exact versions that I used.
* [Linux](https://github.com/raspberrypi/linux/archive/raspberrypi-kernel_1.20190925-1.tar.gz).
- Linux is a very popular OS kernel that happens to run on our target system.
+ Linux is a very popular OS kernel that we will use on our target system.
 We need it to build the C standard library for our toolchain.
* [Musl](https://www.musl-libc.org/releases/musl-1.1.24.tar.gz). A tiny
C standard library implementation.
@@ -103,7 +101,7 @@ into `src`.
For convenience, I provided a small shell script called `download.sh` that,
when run inside `$BUILDROOT`, does this and also verifies the `sha256sum`
of the packages, which will further make sure that you are using the **exact**
-same version as I am.
+same versions as I am.
Right now, you should have a directory tree that looks something like this:
@@ -131,7 +129,7 @@ tree. Luckily, GCC nowadays provides a shell script that will do that for us:
cd "$BUILDROOT"
-## Overview
+# Overview
From now on, the rest of the process itself consists of the following steps:
@@ -177,7 +175,7 @@ library for the target system and build a fully featured `libgcc` along with
it. We can simply install it *over* the existing GCC and `libgcc` in the
toolchain directory (dynamic linking to the rescue).
-### Autotools and the canonical target tuple
+## Autotools and the canonical target tuple
Most of the software we are going to build is using autotools based build
systems. There are a few things we should know when working with autotools
@@ -190,9 +188,9 @@ Unices on widely varying hardware platforms and the GNU packages were supposed
to build and run on all of them.
Nowadays autotools offers *decades* of being used in practice and is in my
-experience a lot more mature than more modern build systems and having a semi
-standard way of cross compiling stuff with standardized configuration knobs
-is very helpful.
+experience a lot more mature than more modern build systems. Also, having a
+semi standard way of cross compiling stuff with standardized configuration
+knobs is very helpful.
In contrast to many modern build systems, you don't need Autotools to run an
Autotools based build system. The final build system it generates for the
@@ -220,10 +218,6 @@ If we don't want to clobber the source tree, we can also build a package
../path/to/source/configure
make
-When the package developers create a release tarball with `make distcheck`,
-the Autotools actually test whether the package can be built out-of-tree
-from a source directory that has all write permissions disabled.
-
The `configure` script contains *a lot* of system checks and default flags that
we can use for telling the build system how to compile the code.
@@ -239,58 +233,103 @@ three options:
Those options take as an argument a dash separated tuple that describes
a system and is made up in the following way:
- <architecture>-<kernel>-<userspace>
+ <architecture>-<vendor>-<kernel>-<userspace>
-Our 32 bit ARM system, running a Linux kernel with a Musl based user space,
-is described like this:
+The vendor part is completely optional and we will only use 3 components to
+describe our toolchain. So our 32 bit ARM system, running a Linux kernel
+with a Musl based user space, is described like this:
arm-linux-musleabihf
-The user space component itself consists of two parts: `musl` indicating
-the libc we use and `eabihf` indicating the ARM ABI that GCC should target.
+The user space component itself specifies that we use `musl` and we want to
+adhere to the ARM embedded ABI specification (`eabi` for short) with hardware
+float `hf` support.
+
+If you want to determine the tuple for the system *you are running on*, you can
+use the script [config.guess](https://git.savannah.gnu.org/gitweb/?p=config.git;a=tree):
+
+ $ HOST=$(./config.guess)
+ $ echo "$HOST"
+ x86_64-pc-linux-gnu
+
+There are reasons why this script exists and why it is that long. Even
+on Linux distributions, there is no consistent way to pull a machine triple
+out of a shell one liner.
+
+Some guides out there suggest using a shell builtin **MACHTYPE**:
+
+ $ echo "$MACHTYPE"
+ x86_64-redhat-linux-gnu
+
+The above is what I got on Fedora, however on Arch Linux I got this:
+
+ $ echo "$MACHTYPE"
+ x86_64
-For the build system that I used, the tuple looks likes this:
+Some other guides suggest using `uname` and **OSTYPE**:
- x86_64-linux-gnu
+ $ HOST=$(uname -m)-$OSTYPE
+ $ echo $HOST
+ x86_64-linux-gnu
-The final component `gnu` actually tells GCC that the system is using `glibc`.
+This works on Fedora and Arch Linux, but fails on OpenSuSE:
-Both the user space component and the kernel component have exact meaning.
-Trying to insert clever free-form text for branding can mess up your build.
-If you really must, you can insert it between the kernel and the architecture,
-sort of like this:
+ $ HOST=$(uname -m)-$OSTYPE
+ $ echo $HOST
+ x86_64-linux
- x86_64-redhat-linux-gnu
+If you want to save yourself a lot of headache, refrain from using such
+ad-hockery and simply use `config.guess`. I only listed this here to warn you,
+because I have seen some guides and tutorials out there using this nonsense.
+
+As you saw here, I'm running on an x86_64 system and my user space is `gnu`,
+which tells autotools that the system is using `glibc`.
+
+You also saw that the `vendor` field is sometimes used for branding. Use that
+field if you must, because the other components have an exact meaning and are
+parsed by the build system.
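As a purely illustrative aside, the tuple format described above can be picked apart with plain shell parameter expansion (the variable names here are made up for the example):

```shell
# Illustrative only: split a target tuple into its components.
# This tuple has no vendor field, so it is <arch>-<kernel>-<userspace>.
TARGET="arm-linux-musleabihf"
ARCH="${TARGET%%-*}"          # first component: architecture
REST="${TARGET#*-}"
KERNEL="${REST%%-*}"          # second component: kernel
USERSPACE="${REST#*-}"        # remaining component: userspace/ABI
echo "$ARCH $KERNEL $USERSPACE"
```

Running this prints `arm linux musleabihf`, the three components that configure scripts parse out of the tuple.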
-### The Makefile
+### The Installation Path
-The generated Makefile also has a few tune-able knobs that we can use.
+When running `make install`, there are two ways to control where the program
+we just compiled is installed to.
-Most importantly, we will use the `DESTDIR` variable that can be used to
-set a target directory where `make install` will install the programs.
+First of all, the `configure` script has an option called `--prefix`. That can
+be used like this:
-For instance, if `make install` would install a program to `/usr/bin/foo`,
-running `make DESTDIR=/tmp/test install` will instead install the program
-to `/tmp/test/usr/bin/foo`.
+ ./configure --prefix=/usr
+ make
+ make install
-The configure script has a similar option called **--prefix**. However this
-works in a different way and actually controls path prefix. The path in
-the **--prefix** will possibly be embedded in the program during compilation,
-while the `DESTDIR` variable does not affect compilation and changes the
-location of the root directory when installing.
+In this case, `make install` will e.g. install the program to `/usr/bin` and
+install resources to `/usr/share`. The important thing here is that the prefix
+is used to generate path variables and the program "knows" what its prefix is,
+i.e. it will fetch resources from `/usr/share`.
-For instance if running the following:
+But if instead we run this:
- ./configure --prefix=/usr/local
+ ./configure --prefix=/opt/yoyodyne
make
- make DESTDIR=/tmp/test install
+ make install
+
+The same program is installed to `/opt/yoyodyne/bin` and its resources end up
+in `/opt/yoyodyne/share`. The program again knows to look in the latter path
+for its resources.
-the program `foo` will be installed to `/tmp/test/usr/local/bin/foo` but the
-program and the build system "think" it has been installed
-to `/usr/local/bin/foo`.
+The second option we have is using a Makefile variable called `DESTDIR`, which
+controls the behavior of `make install` *after* the program has been compiled:
+
+ ./configure --prefix=/usr
+ make
+ make DESTDIR=/home/goliath/workdir install
+In this example, the program is installed to `/home/goliath/workdir/usr/bin`
+and the resources to `/home/goliath/workdir/usr/share`, but the program itself
+doesn't know that and "thinks" it lives in `/usr`. If we try to run it, it
+tries to load resources from `/usr/share` and will be sad because it can't
+find its files.
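The staging behavior of `DESTDIR` can be sketched with a toy install step (no real package involved, all paths and names are made up for the demonstration):

```shell
# Toy illustration of the DESTDIR staging pattern: the "program" is
# configured for /usr, but staged under a scratch directory instead.
PREFIX=/usr
DESTDIR=/tmp/destdir-demo
mkdir -p "$DESTDIR$PREFIX/bin"
# A real 'make install' would copy the compiled program; we fake one:
printf '#!/bin/sh\necho hello\n' > "$DESTDIR$PREFIX/bin/demo"
chmod +x "$DESTDIR$PREFIX/bin/demo"
ls "$DESTDIR$PREFIX/bin"
```

The file ends up at `/tmp/destdir-demo/usr/bin/demo`, exactly the concatenation of `DESTDIR` and the prefixed install path.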
-## Getting started
+## Building our Toolchain
At first, we set a few handy shell variables that will store the configuration
of our toolchain:
@@ -308,45 +347,8 @@ We also need the triplet for the local machine that we are going to build
things on. For simplicity, I also set this manually.
The **MUSL_CPU**, **GCC_CPU** and **LINUX_ARCH** variables hold the target
-CPU architecture. The later is used for the kernel build system, the former
-for the GCC build system and the for Musl.
-
-If you want to dynamically determine the **HOST** tuple, I suggest using
-[config.guess](https://git.savannah.gnu.org/gitweb/?p=config.git;a=tree):
-
- $ HOST=$(./config.guess)
- $ echo "$HOST"
- x86_64-pc-linux-gnu
-
-There are reasons for why this script exists and why it is that long. Even
-on Linux distributions, there is no consistent way, to pull a machine triple
-out of a shell one liner.
-
-Some guides out there suggest using a shell builtin **MACHTYPE**:
-
- $ echo "$MACHTYPE"
- x86_64-redhat-linux-gnu
-
-The above is what I got on Fedora, however on Arch Linux I got this:
-
- $ echo "$MACHTYPE"
- x86_64
-
-Some other guides suggest using **OSTYPE**:
-
- $ HOST=$(uname -m)-$OSTYPE
- $ echo $HOST
- x86_64-linux-gnu
-
-This works on Fedora and Arch Linux, but fails on OpenSuSE:
-
- $ echo $OSTYPE
- linux
-
-If you want to safe yourself a lot of headache, refrain from using such
-adhockery and simply use `config.guess`. I only listed this here to warn you,
-because I have seen some guides and tutorials out there using this nonsense.
-
+CPU architecture. The variables are used for musl, gcc and linux respectively,
+because they cannot agree on consistent architecture names (except sometimes).
### Installing the kernel headers
@@ -355,7 +357,7 @@ kernel outside its source tree works a bit different compared to autotools
based stuff.
To keep things clean, we use a shell variable **srcdir** to remember where
-we kept the binutils source. A pattern that we will repeat later:
+we kept the kernel source. A pattern that we will repeat later:
export KBUILD_OUTPUT="$BUILDROOT/build/linux"
mkdir -p "$KBUILD_OUTPUT"
@@ -419,8 +421,8 @@ From the binutils build directory we run the configure script:
--disable-nls --disable-multilib
We use the **--prefix** option to actually let the toolchain know that it is
-being installed in our toolchain directory, that we are going to run it from
-there and that it should locate helper programs in there.
+being installed in our toolchain directory, so it can locate its resources and
+helper programs when we run it.
We also set the **--target** option to tell the build system what target the
assembler, linker and other tools should generate **output** for. We don't
@@ -440,11 +442,11 @@ or 中文), mainly because we don't need it and not doing something typically
saves time.
Regarding the multilib option: Some architectures support executing code for
-other, related architectures (e.g. x86 code can run x86_64). On GNU/Linux
-distributions that support that, you typically have different versions of the
-same libraries (e.g. in *lib/* and *lib32/* directories) with programs for
-different architectures being linked to the appropriate libraries. We are only
-interested in a single architecture and don't need that, so we
+other, related architectures (e.g. an x86_64 machine can run 32 bit x86 code).
+On GNU/Linux distributions that support that, you typically have different
+versions of the same libraries (e.g. in *lib/* and *lib32/* directories) with
+programs for different architectures being linked to the appropriate libraries.
+We are only interested in a single architecture and don't need that, so we
set **--disable-multilib**.
@@ -471,8 +473,8 @@ such as the assembler, linker and other tools that are prefixed with the host
triplet.
There is also a new directory called `toolchain/arm-linux-musleabihf` which
-contains a secondary system root with programs that aren't prefixed and linker
-scripts.
+contains a secondary system root with programs that aren't prefixed, and some
+linker scripts.
### First pass GCC
@@ -548,7 +550,6 @@ For the first make, you **really** want to specify a *-j NUM-PROCESSES* option
here. Even the first pass GCC we are building here will take a while to compile
on an ordinary desktop machine.
-
### C standard library
We create our build directory and change there:
@@ -648,7 +649,7 @@ All that's left now is building and installing the compiler:
This time, we are going to build and install *everything*. You *really* want to
do a parallel build here. On my AMD Ryzen based desktop PC, building with
-`make -j 16` takes about 3 minutes. On my Intel i5 laptop takes circa 15
+`make -j 16` takes about 3 minutes. On my Intel i5 laptop it takes circa 15
minutes. If you are using a laptop, you might want to open a window (assuming
it is cold outside, i.e. won't help if you are in Taiwan).
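A common way to pick the `-j` value is to derive it from the number of available CPUs; this little sketch assumes GNU coreutils is installed:

```shell
# Derive a parallel job count from the number of available CPUs.
# nproc ships with GNU coreutils; substitute as needed on other systems.
JOBS="$(nproc)"
echo "building with $JOBS jobs"
# make -j "$JOBS"    # run this inside the build directory
```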
diff --git a/elfstartup.md b/elfstartup.md
index 37ee0db..f88bce4 100644
--- a/elfstartup.md
+++ b/elfstartup.md
@@ -3,9 +3,9 @@
This section provides a high level overview of the startup process of a
dynamically linked program on Linux.
-When using the `exec` system call to run a program, the kernel maps it into
-memory and tries to determine what kind of executable it is by looking at
-the magic number. Based on the type of executable, some data structures are
+When using the `exec` system call to run a program, the kernel looks at the
+first few bytes of the target file and tries to determine what kind of
+executable it is. Based on the type of executable, some data structures are
parsed and the program is run. For a statically linked ELF program, this means
fiddling the entry point address out of the header and jumping to it (with
a kernel to user space transition of course).
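You can look at this magic number yourself with `od` from coreutils; for an ELF binary such as `/bin/sh` on a Linux system, the first four bytes are `0x7f` followed by the letters "ELF":

```shell
# Peek at the magic number the kernel inspects when exec-ing a file.
# For an ELF binary, the first four bytes are 7f 45 4c 46 ("\x7fELF").
od -An -tx1 -N4 /bin/sh
```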
@@ -16,16 +16,16 @@ run. This mechanism is also used for implementing dynamically linked programs.
Similar to how scripts have an interpreter field (`#!/bin/sh`
or `#!/usr/bin/perl`), ELF files can also have an interpreter section. For
dynamically linked ELF executables, the compiler sets the interpreter field
-to the loader (`ld-linux.so` or similar).
+to the run time linker (`ld-linux.so` or similar), also known as "loader".
The `ld-linux.so` loader is typically provided by the `libc` implementation
-(i.e. Musl, glibc, ...) then maps the actual executable into memory
+(i.e. Musl, glibc, ...). It maps the actual executable into memory
with `mmap(2)`, parses the dynamic section and mmaps the used libraries
(possibly recursively since libraries may need other libraries), does
some relocations if applicable and then jumps to the entry point address.
The kernel itself actually has no concept of libraries. Thanks to this
-mechanism, it doesn't have to.
+mechanism, it doesn't even have to.
The whole process of using an interpreter is actually done recursively. An
interpreter can in-turn also have an interpreter. For instance if you exec
@@ -39,9 +39,9 @@ no interpreter field set, so the kernel maps it into memory, extracts the
entry point address and runs it.
If `/bin/sh` were statically linked, the last step would be missing and the
-kernel would start executing right there. It should also be noted that Linux
-has a hard limit for interpreter recursion depth, typically set to 3 to
-support this exact standard case (script, interpreter, loader).
+kernel would start executing right there. Linux actually has a hard limit for
+interpreter recursion depth, typically set to 3 to support this exact standard
+case (script, interpreter, loader).
The entry point of the ELF file that the loader jumps to is of course NOT
the `main` function of the C program. It points to setup code provided by
@@ -56,7 +56,7 @@ against this object file and expects it to have a symbol called `_start`. The
entry point address of the ELF file is set to the location of `_start` and the
interpreter is set to the path of the loader.
-Finally, somewhere inside the `main` function of `/bin/sh` is run, it opens
+Finally, the `main` function of `/bin/sh` is run, which eventually opens
the file it has been provided on the command line and starts interpreting your
shell script.
diff --git a/kernel.md b/kernel.md
index 4ea3025..5dd04fa 100644
--- a/kernel.md
+++ b/kernel.md
@@ -23,10 +23,10 @@ PC, [Syslinux](https://syslinux.org/) or (rarely outside of the PC world)
[GNU GRUB](https://www.gnu.org/software/grub/).
The final stage boot loader then takes care of loading the Linux kernel into
-memory and executing it. The boot loader passes along some informational data
-structures that it writes into memory and passes a pointer to this information
-to the kernel boot code. Besides system information (e.g. RAM layout), this
-typically also contains a command line for the kernel.
+memory and executing it. The boot loader typically generates some informational
+data structures in memory and passes a pointer to the kernel boot code. Besides
+system information (e.g. RAM layout), this typically also contains a command
+line for the kernel.
On a very high level, after the boot loader jumps into the kernel, the kernel
decompresses itself and does some internal initialization, initializes built-in
@@ -43,17 +43,16 @@ For very simple setups, it can be sufficient to pass a command line option to
the kernel that tells it what device to mount for the root filesystem. For more
complex setups, Linux supports mounting an *initial ramdisk*.
-In addition to the kernel and command line, the boot loader loads a
-compressed [cpio](https://en.wikipedia.org/wiki/Cpio) archive into memory and
-passes a pointer to the kernel where it can find it. The kernel then mount
-an in-memory filesystem as root filesystem and unpacks the cpio archive into
-it. Alternatively, the Linux build system can create this archive during kernel
-build and bake it directly into the kernel binary.
+An initial ram disk is a compressed archive that the boot loader loads into
+memory along with the kernel. In addition to the kernel command line, the
+boot loader gives the kernel a pointer to the start of the archive in memory.
-This cpio archive usually contains a small rescue shell and some helper
-programs. The process that the kernel executes as PID 1 is usually a shell
-script that does more sophisticated filesystem setup, transitions to the
-actual root filesystem and does an `exec` to the actual `init`.
+The kernel then mounts an in-memory filesystem as root filesystem, unpacks the
+archive into it and runs the PID 1 process from there. Typically this is a
+script or program that then does a more complex mount setup, transitions to
+the actual root file system and does an `exec` to start the actual PID 1
+process. If it fails at some point, it usually drops you into a tiny rescue
+shell that is also packed into the archive.
Systems typically use [BusyBox](https://busybox.net/) as a tiny shell
interpreter. BusyBox is a collection of tiny command line programs that
@@ -66,6 +65,9 @@ symlinks or hard links are created that point to the binary and BusyBox, when
run, will determine what utility to execute from the path through which it has
been started.
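The argv[0] trick BusyBox relies on can be demonstrated with a tiny hypothetical multi-call script (not BusyBox itself; the names are invented for the example):

```shell
# Hypothetical multi-call script demonstrating the trick BusyBox uses:
# one program, behavior selected by the name it was invoked through.
cat > /tmp/multicall <<'EOF'
#!/bin/sh
case "$(basename "$0")" in
    hello) echo "hello applet" ;;
    bye)   echo "bye applet" ;;
    *)     echo "unknown applet" ;;
esac
EOF
chmod +x /tmp/multicall
ln -sf /tmp/multicall /tmp/hello
ln -sf /tmp/multicall /tmp/bye
/tmp/hello    # the script sees it was invoked as "hello"
/tmp/bye      # ...and here as "bye"
```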
+For historical reasons, Linux uses [cpio](https://en.wikipedia.org/wiki/Cpio)
+archives for the initial ramdisk.
+
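As a minimal sketch, packing a directory tree into the gzip compressed newc cpio format that Linux accepts for an initial ramdisk looks roughly like this (the paths and the toy `init` are illustrative, and `cpio` must be installed):

```shell
# Minimal sketch: pack a directory tree into a gzip compressed newc
# cpio archive, the format Linux expects for an initial ramdisk.
mkdir -p /tmp/initrd-demo
printf '#!/bin/sh\necho init\n' > /tmp/initrd-demo/init
chmod +x /tmp/initrd-demo/init
( cd /tmp/initrd-demo && find . | cpio -o -H newc ) | gzip -9 > /tmp/initramfs.gz
```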
### Device Tree
TODO: explain
@@ -346,21 +348,27 @@ controlled by the GPU, since the SoC is basically a GPU with an ARM CPU slapped
on to it.
The GPU loads a binary called `bootcode.bin` from the SD card, which contains a
-proprietary boot loader blob for GPU. It does some initialization and chain
-loads `start.elf` which contains a firmware blob for the GPU. The GPU is running
-an RTOS called [ThreadX OS](https://en.wikipedia.org/wiki/ThreadX) and somewhere
-around [1M lines](https://www.raspberrypi.org/forums/viewtopic.php?t=53007#p406247)
+proprietary boot loader blob for the GPU. This in turn does some initialization
+and chain loads `start.elf` which contains a firmware blob for the GPU. The GPU
+is running an RTOS called [ThreadX OS](https://en.wikipedia.org/wiki/ThreadX)
+and somewhere around [>1M lines](https://www.raspberrypi.org/forums/viewtopic.php?t=53007#p406247)
worth of firmware code.
There are different versions of `start.elf`. The one called `start_x.elf`
contains an additional driver for the camera interface, `start_db.elf` is a
debug version and `start_cd.elf` is a version with a cut-down memory layout.
+The `start.elf` file uses an additional file called `fixup.dat` to configure
+the RAM partitioning between the GPU and the CPU.
+
In the end, the GPU firmware loads and parses a file called `config.txt` from
the SD card, which contains configuration parameters, and `cmdline.txt` which
contains the kernel command line. After parsing the configuration, it finally
loads the kernel, the initrd, the device tree binaries and runs the kernel.
+Depending on the configuration, the GPU firmware may patch the device tree
+in-memory before running the kernel.
+
### Copying the Files Over
First, we need a micro SD card with a FAT32 partition on it. How to create the
@@ -389,11 +397,12 @@ Then, we'll put the [cmdline.txt](firmware/cmdline.txt) onto the SD card:
console=tty0
-The `console` parameter tells the kernel what to use as a console device. We
-tell it to use the first video console which is what we will get at the HDMI
+The `console` parameter tells the kernel which tty it should print its boot
+messages on, and which tty to use as standard input/output for our init script.
+We tell it to use the first video console which is what we will get at the HDMI
output of the Raspberry Pi.
-Whats left is the device tree binaries and lastly the kernel and initrd:
+What's left are the device tree binaries and lastly the kernel and initrd:
mkdir -p overlays
cp $SYSROOT/boot/dts/*-rpi-3-*.dtb .