From bf376b4c70e4b7c7623008ff95be2d498cc6f4f2 Mon Sep 17 00:00:00 2001
From: David Oberhollenzer
Date: Sat, 13 Feb 2021 14:57:50 +0100
Subject: Cleanup: prefix the individual chapters with a numeric index

Signed-off-by: David Oberhollenzer
---
 00_setup.md   |  72 +++++++
 01_crosscc.md | 626 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 02_kernel.md  | 475 ++++++++++++++++++++++++++++++++++++++++++++
 README.md     |  12 +-
 crosscc.md    | 626 ----------------------------------------------------------
 kernel.md     | 475 --------------------------------------------
 setup.md      |  72 -------
 7 files changed, 1179 insertions(+), 1179 deletions(-)
 create mode 100644 00_setup.md
 create mode 100644 01_crosscc.md
 create mode 100644 02_kernel.md
 delete mode 100644 crosscc.md
 delete mode 100644 kernel.md
 delete mode 100644 setup.md

diff --git a/00_setup.md b/00_setup.md
new file mode 100644
index 0000000..465c1d7
--- /dev/null
+++ b/00_setup.md
@@ -0,0 +1,72 @@
+# Prerequisites and Directory Setup
+
+This section deals with the packages we need on our system to cross bootstrap
+our mini distro, as well as the basic directory setup before we get started.
+
+## Prerequisites
+
+For compiling the packages you will need:
+
+* gcc
+* g++
+* make
+* flex
+* bison
+* gperf
+* makeinfo
+* ncurses (with headers)
+* awk
+* automake
+* help2man
+* curl
+* pkg-config
+* libtool
+* openssl (with headers)
+
+
+In case you wonder: even if you don't build any C++ package, you need the C++
+compiler to build GCC. The GCC code base is mostly C-style code, but it uses
+a conservative subset of C++ features on top, so a C++ compiler is required
+to build it. `makeinfo` is needed by the build systems of the GNU utilities,
+which generate their info pages from texinfo sources. ncurses is mainly
+needed by the kernel build system for `menuconfig`. OpenSSL is also required
+to compile the kernel later on.
+
+The list should be fairly complete, but I can't guarantee that I didn't miss
+something. Normally I work on systems with tons of development tools and
+libraries already installed, so if something is missing, please install it
+and maybe let me know.
+
+## Directory Setup
+
+First of all, you should create an empty directory somewhere you want to
+build the cross toolchain and later the entire system in.
+
+For convenience, we will store the absolute path to this directory inside a
+shell variable called **BUILDROOT** and create a few directories to organize
+our stuff in:
+
+    BUILDROOT=$(pwd)
+
+    mkdir -p "build" "src" "download" "toolchain/bin" "sysroot"
+
+I stored the downloaded packages in the **download** directory and extracted
+them to a directory called **src**.
+
+We will later build packages outside the source tree (GCC even requires that
+nowadays), inside a sub directory of **build**.
+
+Our final toolchain will end up in a directory called **toolchain**.
+
+We store the toolchain location inside another shell variable that I called
+**TCDIR** and prepend the executable path of our toolchain to the **PATH**
+variable:
+
+    TCDIR="$BUILDROOT/toolchain"
+    export PATH="$TCDIR/bin:$PATH"
+
+
+The **sysroot** directory will hold the cross compiled binaries for our target
+system, as well as the headers and libraries used for cross compiling stuff.
+It is basically the `/` directory of the system we are going to build. For
+convenience, we will also store its absolute path in a shell variable:
+
+    SYSROOT="$BUILDROOT/sysroot"
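+
+Since these variables only live in your current shell session, it can be
+handy to also store them in a small file that you can source in a new
+session. This is merely a convenience sketch; the file name `env.sh` is my
+own choice and nothing later on depends on it:
+
+    cat > "$BUILDROOT/env.sh" <<_EOF
+    BUILDROOT="$BUILDROOT"
+    TCDIR="\$BUILDROOT/toolchain"
+    SYSROOT="\$BUILDROOT/sysroot"
+    export PATH="\$TCDIR/bin:\$PATH"
+    _EOF
+
+In a fresh shell, simply change into the build directory and run `. ./env.sh`
+to get the variables back.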
diff --git a/01_crosscc.md b/01_crosscc.md
new file mode 100644
index 0000000..74f2d07
--- /dev/null
+++ b/01_crosscc.md
@@ -0,0 +1,626 @@
+# Building a Cross Compiler Toolchain
+
+As it turns out, building a cross compiler toolchain with recent GCC and
+binutils is a lot easier nowadays than it used to be.
+
+I'm building the toolchain on an AMD64 (aka x86_64) system. The steps have
+been tried on [Fedora](https://getfedora.org/) as well as on
+[OpenSUSE](https://www.opensuse.org/).
+
+The toolchain we are building generates 32 bit ARM code intended to run on
+a Raspberry Pi 3. [Musl](https://www.musl-libc.org/) is used as a C standard
+library implementation.
+
+## Downloading and unpacking everything
+
+The following source packages are required for building the toolchain. The
+links below point to the exact versions that I used.
+
+* [Linux](https://github.com/raspberrypi/linux/archive/raspberrypi-kernel_1.20201201-1.tar.gz).
+  Linux is a very popular OS kernel that we will use on our target system.
+  We need it to build the C standard library for our toolchain.
+* [Musl](https://www.musl-libc.org/releases/musl-1.2.2.tar.gz). A tiny
+  C standard library implementation.
+* [Binutils](https://ftp.gnu.org/gnu/binutils/binutils-2.36.tar.xz). This
+  contains the GNU assembler, linker and various tools for working with
+  executable files.
+* [GCC](https://ftp.gnu.org/gnu/gcc/gcc-10.2.0/gcc-10.2.0.tar.xz), the GNU
+  compiler collection. Contains compilers for C and other languages.
+
+Simply download the packages listed above into `download` and unpack them
+into `src`.
+
+For convenience, I provided a small shell script called `download.sh` that,
+when run inside `$BUILDROOT`, does this and also verifies the `sha256sum`
+of the packages, which will further make sure that you are using the **exact**
+same versions as I am.
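+
+If you would rather do the downloading and checking by hand, the following
+sketch shows the general idea. The `urls.txt` and `sha256sums.txt` files are
+hypothetical stand-ins here (one URL per line, and `<checksum>  <filename>`
+pairs respectively); the real thing is in `download.sh`:
+
+    cd "$BUILDROOT/download"
+
+    while read -r url; do
+        curl -L -O "$url"
+    done < urls.txt
+
+    sha256sum -c sha256sums.txt
+
+    for f in *.tar.gz *.tar.xz; do
+        tar -C "$BUILDROOT/src" -xf "$f"
+    done
+
+    cd "$BUILDROOT"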
+
+Right now, you should have a directory tree that looks something like this:
+
+* build/
+* toolchain/
+  * bin/
+* src/
+  * binutils-2.36/
+  * gcc-10.2.0/
+  * musl-1.2.2/
+  * linux-raspberrypi-kernel_1.20201201-1/
+* download/
+  * binutils-2.36.tar.xz
+  * gcc-10.2.0.tar.xz
+  * musl-1.2.2.tar.gz
+  * raspberrypi-kernel_1.20201201-1.tar.gz
+* sysroot/
+
+For building GCC, we will need to download some additional support libraries,
+namely gmp, mpfr, mpc and isl, which have to be unpacked inside the GCC source
+tree. Luckily, GCC nowadays provides a shell script that will do that for us:
+
+    cd "$BUILDROOT/src/gcc-10.2.0"
+    ./contrib/download_prerequisites
+    cd "$BUILDROOT"
+
+
+# Overview
+
+From here on, the process consists of the following steps:
+
+1. Installing the kernel headers to the sysroot directory.
+2. Compiling cross binutils.
+3. Compiling a minimal GCC cross compiler with minimal `libgcc`.
+4. Cross compiling the C standard library (in our case Musl).
+5. Compiling a full version of the GCC cross compiler with complete `libgcc`.
+
+The main reason for compiling GCC twice is the inter-dependency between the
+compiler and the standard library.
+
+First of all, the GCC build system needs to know *what* kind of C standard
+library we are using and *where* to find it. For dynamically linked programs,
+it also needs to know what loader we are going to use, which is typically
+also provided by the C standard library. For more details, you can read this
+high level overview of [how dynamically linked ELF programs are run](elfstartup.md).
+
+Second, there is [libgcc](https://gcc.gnu.org/onlinedocs/gccint/Libgcc.html).
+`libgcc` contains low level platform specific helpers (like exception handling,
+soft float code, etc.) and is automatically linked to programs built with GCC.
+The `libgcc` source code comes with GCC and is compiled by the GCC build system
+specifically for our cross compiler & libc combination.
+
+However, some functions in `libgcc` need functions from the C standard
+library, and some libc implementations directly use utility functions from
+`libgcc`, such as the stack unwinding helpers (provided by `libgcc_s`).
+
+After building a GCC cross compiler, we need to cross compile `libgcc`, so we
+can *then* cross compile other stuff that needs `libgcc`, **like the libc**.
+But we need an already cross compiled libc in the first place for
+compiling `libgcc`.
+
+The solution is to build a minimalist GCC that targets an internal stub libc
+and provides a minimal `libgcc` that has lots of features disabled and uses
+the stubs instead of linking against libc.
+
+We can then cross compile the libc and let the compiler link it against the
+minimal `libgcc`.
+
+With that, we can then compile the full GCC, pointing it at the C standard
+library for the target system, and build a fully featured `libgcc` along with
+it. We can simply install it *over* the existing GCC and `libgcc` in the
+toolchain directory (dynamic linking to the rescue).
+
+## Autotools and the canonical target tuple
+
+Most of the software we are going to build uses autotools based build
+systems. There are a few things we should know when working with autotools
+based packages.
+
+GNU autotools makes cross compilation easy and has checks and workarounds for
+the most bizarre platforms and their misfeatures. This was especially important
+in the early days of the GNU project, when there were dozens of incompatible
+Unices on widely varying hardware platforms and the GNU packages were supposed
+to build and run on all of them.
+
+Nowadays, autotools has *decades* of practical use behind it and is in my
+experience a lot more mature than many modern build systems. Also, having a
+semi-standard way of cross compiling stuff with standardized configuration
+knobs is very helpful.
+
+In contrast to many modern build systems, you don't need autotools itself to
+run an autotools based build system. The final build system it generates for
+the release tarballs just uses shell and `make`.
+
+### The configure script
+
+Pretty much every novice Ubuntu user has probably already seen this on Stack
+Overflow (and copy-pasted it) at least once:
+
+    ./configure
+    make
+    make install
+
+
+The `configure` shell script generates the actual `Makefile` from a
+template (`Makefile.in`) that is then used for building the package.
+
+The `configure` script itself and the `Makefile.in` are completely independent
+from autotools and were generated by `autoconf` and `automake`.
+
+If we don't want to clobber the source tree, we can also build a package
+*outside the source tree* like this:
+
+    ../path/to/source/configure
+    make
+
+The `configure` script contains *a lot* of system checks and default flags that
+we can use for telling the build system how to compile the code.
+
+The main ones we need to know about for cross compiling are the following
+three options:
+
+* The **--build** option specifies what system we are *building* the
+  package on.
+* The **--host** option specifies what system the binaries will run on.
+* The **--target** option is specific to packages that contain compilers
+  and specifies what system to generate output for.
+
+Those options take as an argument a dash separated tuple that describes
+a system and is made up the following way:
+
+    <machine>-<vendor>-<kernel>-<userspace>
+
+The vendor part is completely optional and we will only use 3 components to
+describe our toolchain. So our 32 bit ARM system, running a Linux kernel
+with a Musl based user space, is described like this:
+
+    arm-linux-musleabihf
+
+The user space component itself specifies that we use `musl` and that we want
+to adhere to the ARM embedded ABI specification (`eabi` for short) with
+hardware float (`hf`) support.
+
+If you want to determine the tuple for the system *you are running on*, you can
+use the script [config.guess](https://git.savannah.gnu.org/gitweb/?p=config.git;a=tree):
+
+    $ HOST=$(./config.guess)
+    $ echo "$HOST"
+    x86_64-pc-linux-gnu
+
+There are reasons why this script exists and why it is that long: even across
+Linux distributions, there is no consistent way to pull a machine triple out
+of a shell one-liner.
+
+Some guides out there suggest using the shell builtin **MACHTYPE**:
+
+    $ echo "$MACHTYPE"
+    x86_64-redhat-linux-gnu
+
+The above is what I got on Fedora, however on Arch Linux I got this:
+
+    $ echo "$MACHTYPE"
+    x86_64
+
+Some other guides suggest using `uname` and **OSTYPE**:
+
+    $ HOST=$(uname -m)-$OSTYPE
+    $ echo $HOST
+    x86_64-linux-gnu
+
+This works on Fedora and Arch Linux, but fails on OpenSuSE:
+
+    $ HOST=$(uname -m)-$OSTYPE
+    $ echo $HOST
+    x86_64-linux
+
+If you want to save yourself a lot of headache, refrain from using such
+ad-hockery and simply use `config.guess`. I only listed this here to warn you,
+because I have seen some guides and tutorials out there using this nonsense.
+
+As you saw here, I'm running on an x86_64 system and my user space is `gnu`,
+which tells autotools that the system is using `glibc`.
+
+You also saw that the `vendor` field is sometimes used for branding, so use
+that field if you must, because the others have exact meanings and are parsed
+by the build system.
+
+### The Installation Path
+
+When running `make install`, there are two ways to control where the program
+we just compiled is installed to.
+
+First of all, the `configure` script has an option called `--prefix`. It can
+be used like this:
+
+    ./configure --prefix=/usr
+    make
+    make install
+
+In this case, `make install` will e.g. install the program to `/usr/bin` and
+install resources to `/usr/share`. The important thing here is that the prefix
+is used to generate path variables, and the program "knows" what its prefix
+is, i.e. it will fetch resources from `/usr/share`.
+
+But if instead we run this:
+
+    ./configure --prefix=/opt/yoyodyne
+    make
+    make install
+
+The same program is installed to `/opt/yoyodyne/bin` and its resources end up
+in `/opt/yoyodyne/share`. The program again knows to look in the latter path
+for its resources.
+
+The second option we have is a Makefile variable called `DESTDIR`, which
+controls the behavior of `make install` *after* the program has been compiled:
+
+    ./configure --prefix=/usr
+    make
+    make DESTDIR=/home/goliath/workdir install
+
+In this example, the program is installed to `/home/goliath/workdir/usr/bin`
+and the resources to `/home/goliath/workdir/usr/share`, but the program itself
+doesn't know that and "thinks" it lives in `/usr`. If we try to run it, it
+tries to load resources from `/usr/share` and will be sad because it can't
+find its files.
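+
+This staged install mechanism is exactly what we will use later to populate
+our sysroot directory. As a quick illustration, for a hypothetical package
+called `yoyodyne`, the staged install from the example above might produce a
+tree like this (the exact files of course depend on the package):
+
+    $ find /home/goliath/workdir
+    /home/goliath/workdir/usr
+    /home/goliath/workdir/usr/bin
+    /home/goliath/workdir/usr/bin/yoyodyne
+    /home/goliath/workdir/usr/share
+    /home/goliath/workdir/usr/share/yoyodyne
+    /home/goliath/workdir/usr/share/yoyodyne/data.dat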
+
+## Building our Toolchain
+
+At first, we set a few handy shell variables that will store the configuration
+of our toolchain:
+
+    TARGET="arm-linux-musleabihf"
+    HOST="x86_64-linux-gnu"
+    LINUX_ARCH="arm"
+    MUSL_CPU="arm"
+    GCC_CPU="armv6"
+
+The **TARGET** variable holds the *target triplet* of our system as described
+above.
+
+We also need the triplet for the local machine that we are going to build
+things on. For simplicity, I also set this manually.
+
+The **MUSL_CPU**, **GCC_CPU** and **LINUX_ARCH** variables hold the target
+CPU architecture. The variables are used for musl, gcc and linux respectively,
+because they cannot agree on consistent architecture names (except sometimes).
+
+### Installing the kernel headers
+
+We create a build directory called **$BUILDROOT/build/linux**. Building the
+kernel outside its source tree works a bit differently compared to autotools
+based stuff.
+
+To keep things clean, we use a shell variable **srcdir** to remember where
+we kept the kernel source, a pattern that we will repeat later:
+
+    export KBUILD_OUTPUT="$BUILDROOT/build/linux"
+    mkdir -p "$KBUILD_OUTPUT"
+
+    srcdir="$BUILDROOT/src/linux-raspberrypi-kernel_1.20201201-1"
+
+    cd "$srcdir"
+    make O="$KBUILD_OUTPUT" ARCH="$LINUX_ARCH" headers_check
+    make O="$KBUILD_OUTPUT" ARCH="$LINUX_ARCH" INSTALL_HDR_PATH="$SYSROOT/usr" headers_install
+    cd "$BUILDROOT"
+
+
+According to the Makefile in the Linux source, you can either specify an
+environment variable called **KBUILD_OUTPUT**, or set a Makefile variable
+called **O**, where the latter overrides the environment variable. The snippet
+above shows both ways.
+
+The *headers_check* target runs a few trivial sanity checks on the headers
+we are going to install. It checks if a header includes something nonexistent,
+if the declarations inside the headers are sane and if kernel internals are
+leaked into user space. For stock kernel tarballs, this shouldn't be
+necessary, but it could come in handy when working with kernel git trees,
+potentially with local modifications.
+
+Lastly (before switching back to the root directory), we actually install the
+kernel headers into the sysroot directory where the libc later expects them
+to be.
+
+The `sysroot` directory should now contain a `usr/include` directory with a
+number of sub directories that contain kernel headers.
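+
+You can do a quick sanity check at this point. This is roughly what shows up
+on my machine; the exact set of directories may differ between kernel
+versions:
+
+    $ ls "$SYSROOT/usr/include"
+    asm  asm-generic  drm  linux  misc  mtd  rdma  scsi  sound  video  xen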
+
+Since I've seen the question in a few forums: it doesn't matter if the kernel
+version exactly matches the one running on your target system. The kernel
+system call ABI is stable, so you can use an older kernel. Only if you use a
+much newer kernel might the libc end up exposing or using features that your
+kernel does not yet support.
+
+If you have some embedded board with a heavily modified vendor kernel (such as
+in our case) and little to no upstream support, the situation is a bit more
+difficult and you may prefer to use the exact kernel.
+
+Even then, if you have some board where the vendor tree breaks the
+ABI, **take the board and burn it** (preferably outside; don't inhale
+the fumes).
+
+### Compiling cross binutils
+
+We will compile binutils outside the source tree, inside the directory
+**build/binutils**. So first, we create the build directory and switch into
+it:
+
+    mkdir -p "$BUILDROOT/build/binutils"
+    cd "$BUILDROOT/build/binutils"
+
+    srcdir="$BUILDROOT/src/binutils-2.36"
+
+From the binutils build directory we run the configure script:
+
+    $srcdir/configure --prefix="$TCDIR" --target="$TARGET" \
+                      --with-sysroot="$SYSROOT" \
+                      --disable-nls --disable-multilib
+
+We use the **--prefix** option to actually let the toolchain know that it is
+being installed in our toolchain directory, so it can locate its resources and
+helper programs when we run it.
+
+We also set the **--target** option to tell the build system what target the
+assembler, linker and other tools should generate **output** for. We don't
+explicitly set the **--host** or **--build** options, because we are compiling
+binutils to run on the local machine.
+
+We would only set the **--host** option to cross compile binutils itself with
+an existing toolchain to run on a different system than ours.
+
+The **--with-sysroot** option tells the build system that the root directory
+of the system we are going to build is in `$SYSROOT` and it should look inside
+that to find libraries.
+
+We disable the feature **nls** (native language support, i.e. cringeworthy
+translations of error messages to your native language, such as Deutsch
+or 中文), mainly because we don't need it and not doing something typically
+saves time.
+
+Regarding the multilib option: some architectures support executing code for
+other, related architectures (e.g. an x86_64 machine can run 32 bit x86 code).
+On GNU/Linux distributions that support that, you typically have different
+versions of the same libraries (e.g. in *lib/* and *lib32/* directories) with
+programs for different architectures being linked to the appropriate libraries.
+We are only interested in a single architecture and don't need that, so we
+set **--disable-multilib**.
+
+
+Now we can compile and install binutils:
+
+    make configure-host
+    make
+    make install
+    cd "$BUILDROOT"
+
+The first make target, *configure-host*, is binutils specific and just tells it
+to check out the system it is *being built on*, i.e. your local machine, and
+make sure it has all the tools it needs for compiling. If it reports a problem,
+**go fix it before continuing**.
+
+We then go on to build the binutils. You may want to speed up compilation by
+running a parallel build with **make -j NUMBER-OF-PROCESSES**.
+
+Lastly, we run *make install* to install the binutils in the configured
+toolchain directory and go back to our root directory.
+
+The `toolchain/bin` directory should now already contain a bunch of executables
+such as the assembler, linker and other tools that are prefixed with the target
+triplet.
+
+There is also a new directory called `toolchain/arm-linux-musleabihf` which
+contains a secondary system root with programs that aren't prefixed, and some
+linker scripts.
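+
+A quick way to check that the tools are in place and that our **PATH** is set
+up correctly (the version banner may of course look slightly different):
+
+    $ arm-linux-musleabihf-as --version | head -n 1
+    GNU assembler (GNU Binutils) 2.36
+    $ arm-linux-musleabihf-ld --version | head -n 1
+    GNU ld (GNU Binutils) 2.36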
+
+### First pass GCC
+
+Similar to above, we create a directory for building the compiler, change
+into it and store the source location in a variable:
+
+    mkdir -p "$BUILDROOT/build/gcc-1"
+    cd "$BUILDROOT/build/gcc-1"
+
+    srcdir="$BUILDROOT/src/gcc-10.2.0"
+
+Notice how the build directory is called *gcc-1*. For the second pass, we
+will later create a different build directory. Not only does this out of tree
+build allow us to cleanly start afresh (because the source is left untouched),
+but current versions of GCC will *flat out refuse* to build inside the
+source tree.
+
+    $srcdir/configure --prefix="$TCDIR" --target="$TARGET" --build="$HOST" \
+                      --host="$HOST" --with-sysroot="$SYSROOT" \
+                      --disable-nls --disable-shared --without-headers \
+                      --disable-multilib --disable-decimal-float \
+                      --disable-libgomp --disable-libmudflap \
+                      --disable-libssp --disable-libatomic \
+                      --disable-libquadmath --disable-threads \
+                      --enable-languages=c --with-newlib \
+                      --with-arch="$GCC_CPU" --with-float=hard \
+                      --with-fpu=neon-vfpv3
+
+The **--prefix**, **--target** and **--with-sysroot** options work just like
+above for binutils.
+
+This time we explicitly specify **--build** (i.e. the system that we are going
+to compile GCC on) and **--host** (i.e. the system that the GCC will run on).
+In our case those are the same. I set those explicitly for GCC because the GCC
+build system is notoriously fragile. Yes, *I have seen* older versions of GCC
+throw a fit or assume complete nonsense if you don't explicitly specify those,
+and at this point I'm no longer willing to trust it.
+
+The option **--with-arch** gives the build system slightly more specific
+information about the target processor architecture. The two options after that
+are specific to our target and tell the build system that GCC should use the
+hardware floating point unit and can emit neon instructions for vectorization.
+
+We also disable a bunch of stuff we don't need. I already explained *nls*
+and *multilib* above. We also disable a bunch of optimization stuff and helper
+libraries. Among other things, we also disable support for dynamic linking and
+threads, as we don't have the libc yet.
+
+The option **--without-headers** tells the build system that we don't have the
+headers for the libc *yet* and it should use minimal stubs instead where it
+needs them. The **--with-newlib** option is *more of a hack*. It tells the
+build system that we are going to use [newlib](http://www.sourceware.org/newlib/)
+as the C standard library. This isn't actually true, but it forces the build
+system to disable some
+[libgcc features that depend on the libc](https://gcc.gnu.org/ml/gcc-help/2009-07/msg00368.html).
+
+The option **--enable-languages** accepts a comma separated list of languages
+that we want to build compilers for. For now, we only need a C compiler for
+compiling the libc.
+
+If you are interested: [Here is a detailed list of all GCC configure options.](https://gcc.gnu.org/install/configure.html)
+
+Now, let's build the compiler and `libgcc`:
+
+    make all-gcc all-target-libgcc
+    make install-gcc install-target-libgcc
+
+    cd "$BUILDROOT"
+
+We explicitly specify the make targets for *GCC* and *cross-compiled libgcc*
+for our target. We are not interested in anything else.
+
+For the first make, you **really** want to specify a *-j NUM-PROCESSES* option
+here. Even the first pass GCC we are building here will take a while to compile
+on an ordinary desktop machine.
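+
+The first pass compiler cannot link a complete program yet, since there is no
+libc to link against, but we can already feed it a trivial source file (the
+file name `foo.c` is of course arbitrary) and have it produce an object file:
+
+    $ echo 'int foo(int x) { return x + 1; }' > foo.c
+    $ arm-linux-musleabihf-gcc -c foo.c
+    $ file foo.o
+    foo.o: ELF 32-bit LSB relocatable, ARM, EABI5 version 1 (SYSV), not stripped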
+
+### C standard library
+
+We create our build directory and change there:
+
+    mkdir -p "$BUILDROOT/build/musl"
+    cd "$BUILDROOT/build/musl"
+
+    srcdir="$BUILDROOT/src/musl-1.2.2"
+
+Musl is quite easy to build, but it requires some special handling, because it
+doesn't use autotools. The configure script is actually a hand-written shell
+script that tries to emulate some of the typical autotools handling:
+
+    CC="${TARGET}-gcc" $srcdir/configure --prefix=/usr --target="$TARGET"
+
+We override the shell variable **CC** to point to the cross compiler that we
+just built. Remember, we added **$TCDIR/bin** to our **PATH**.
+
+We do the same thing for actually compiling musl and we explicitly set the
+**DESTDIR** variable for installing:
+
+    CC="${TARGET}-gcc" make
+    make DESTDIR="$SYSROOT" install
+
+    cd "$BUILDROOT"
+
+The important part here, which later also applies to the autotools based
+packages, is that we don't set **--prefix** to the sysroot directory. We set
+the prefix so that the build system "thinks" it compiles the library to be
+installed in `/usr`, but then we install the compiled binaries and headers to
+the sysroot directory.
+
+The `sysroot/usr/include` directory should now contain a bunch of standard
+headers. Likewise, the `sysroot/usr/lib` directory should now contain a
+`libc.so`, a bunch of dummy libraries, and the startup object code provided
+by Musl.
+
+Despite the prefix we set, Musl installs a `sysroot/lib/ld-musl-armhf.so.1`
+symlink which points to `/usr/lib/libc.so`. Dynamically linked programs built
+with our toolchain will have `/lib/ld-musl-armhf.so.1` set as their loader.
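+
+We can verify that the symlink is in place and points where we expect it to:
+
+    $ readlink "$SYSROOT/lib/ld-musl-armhf.so.1"
+    /usr/lib/libc.so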
+
+### Second pass GCC
+
+We are reusing the same source code from the first stage, but in a different
+build directory:
+
+    mkdir -p "$BUILDROOT/build/gcc-2"
+    cd "$BUILDROOT/build/gcc-2"
+
+    srcdir="$BUILDROOT/src/gcc-10.2.0"
+
+Most of the configure options should be familiar already:
+
+    $srcdir/configure --prefix="$TCDIR" --target="$TARGET" --build="$HOST" \
+                      --host="$HOST" --with-sysroot="$SYSROOT" \
+                      --disable-nls --enable-languages=c,c++ \
+                      --enable-c99 --enable-long-long \
+                      --disable-libmudflap --disable-multilib \
+                      --disable-libsanitizer --with-arch="$GCC_CPU" \
+                      --with-native-system-header-dir="/usr/include" \
+                      --with-float=hard --with-fpu=neon-vfpv3
+
+For the second pass, we also build a C++ compiler. The options **--enable-c99**
+and **--enable-long-long** are actually C++ specific. When our final compiler
+runs in C++98 mode, we allow it to expose C99 functions from the libc through
+a GNU extension. We also allow it to support the *long long* data type
+standardized in C99.
+
+You may wonder why we didn't have to build a **libstdc++** between the
+first and second pass, like the libc. The source code for *libstdc++*
+comes with the **g++** compiler and is built automatically like `libgcc`.
+On the one hand, it is really just a library that adds C++ stuff
+*on top of libc*, mostly header only code that is compiled with the actual
+C++ programs. On the other hand, C++ does not have a standard ABI and it is
+all compiler and OS specific. So compiler vendors will typically ship their
+own `libstdc++` implementation with the compiler.
+
+We **--disable-libsanitizer** because it simply won't build for musl. I tried
+fixing it, but it simply assumes too much about the nonstandard internals
+of the libc. A quick Google search reveals that it has **lots** of similar
+issues with all kinds of libc & kernel combinations, so even if I fixed it on
+my system, you may run into other problems on your system or with different
+versions of packages. It even has different problems with different versions
+of glibc. Projects like buildroot simply disable it when using musl. It "only"
+provides the runtime libraries for the compiler's instrumentation based
+sanitizers (such as the address sanitizer).
+
+The option **--with-native-system-header-dir** is of special interest for our
+cross compiler. We explicitly tell it to look for headers in `/usr/include`,
+relative to our **$SYSROOT** directory. We could just as easily have placed
+the headers somewhere else in the previous steps and have it look there.
+
+All that's left now is building and installing the compiler:
+
+    make
+    make install
+
+    cd "$BUILDROOT"
+
+This time, we are going to build and install *everything*. You *really* want to
+do a parallel build here. On my AMD Ryzen based desktop PC, building with
+`make -j 16` takes about 3 minutes. On my Intel i5 laptop it takes circa 15
+minutes. If you are using a laptop, you might want to open a window (assuming
+it is cold outside, i.e. it won't help if you are in Taiwan).
+
+### Testing the Toolchain
+
+We quickly write our average hello world program into a file called **test.c**:
+
+    #include <stdio.h>
+
+    int main(void)
+    {
+        puts("Hello, world");
+        return 0;
+    }
+
+We can now use our cross compiler to compile this C file:
+
+    $ ${TARGET}-gcc test.c
+
+Running the program `file` on the resulting `a.out` will tell us that it has
+been properly compiled and linked for our target machine:
+
+    $ file a.out
+    a.out: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), dynamically linked, interpreter /lib/ld-musl-armhf.so.1, not stripped
+
+Of course, you won't be able to run the program on your build system. You also
+won't be able to run it on Raspbian or similar, because it has been linked
+against our cross compiled Musl.
+
+Statically linking it should solve the problem:
+
+    $ ${TARGET}-gcc -static test.c
+    $ file a.out
+    a.out: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), statically linked, with debug_info, not stripped
+    $ readelf -d a.out
+
+    There is no dynamic section in this file.
+
+This binary now does not require any libraries or interpreters and does
+system calls directly. It should run on your favourite Raspberry Pi
+distribution as-is.
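+
+Since the second pass also gave us a C++ compiler, we can give that one a
+quick test as well, e.g. with a statically linked C++ version of the same
+program:
+
+    cat > test.cpp <<_EOF
+    #include <iostream>
+
+    int main()
+    {
+        std::cout << "Hello, world" << std::endl;
+        return 0;
+    }
+    _EOF
+
+    ${TARGET}-g++ -static test.cpp
+
+Running `file` on the resulting `a.out` should once again report a statically
+linked ARM executable.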
diff --git a/02_kernel.md b/02_kernel.md
new file mode 100644
index 0000000..78a8cbc
--- /dev/null
+++ b/02_kernel.md
@@ -0,0 +1,475 @@
+# Building a Bootable Kernel and Initial RAM Filesystem
+
+This section outlines how to use the cross compiler toolchain you just built
+for cross-compiling a bootable kernel, and how to get the kernel to run on
+the Raspberry Pi.
+
+## The Linux Boot Process at a High Level
+
+When your system is powered on, it usually won't run the Linux kernel
+directly, even on a very tiny embedded board that has the kernel baked into
+a flash memory soldered directly next to the CPU. Instead, a chain of boot
+loaders springs into action that do basic board bring-up and initialization.
+Parts of this chain are typically proprietary blobs from CPU or board vendors
+who consider hardware initialization a mystical secret that must not be
+shared. Each part of the boot loader chain is typically very restricted in
+what it can do, hence the need to chain load a more complex loader after
+doing some hardware initialization.
+
+The chain of boot loaders typically starts with some mask ROM baked into the
+CPU and ends with something like [U-Boot](https://www.denx.de/wiki/U-Boot),
+[BareBox](https://www.barebox.org/), or in the case of an x86 system like your
+PC, [Syslinux](https://syslinux.org/) or (rarely outside of the PC world)
+[GNU GRUB](https://www.gnu.org/software/grub/).
+
+The final stage boot loader then takes care of loading the Linux kernel into
+memory and executing it. The boot loader typically generates some informational
+data structures in memory and passes a pointer to the kernel boot code. Besides
+system information (e.g. RAM layout), this typically also contains a command
+line for the kernel.
+
+On a very high level, after the boot loader jumps into the kernel, the kernel
+decompresses itself and does some internal initialization, initializes built-in
+hardware drivers and then attempts to mount the root filesystem. After mounting
+the root filesystem, the kernel creates the very first process with PID 1.
+
+At this point, bootstrapping is done as far as the kernel is concerned. The
+process with PID 1 usually spawns (i.e. `fork` + `exec`) and manages a bunch
+of daemon processes, some of which allow users to log in and get a shell.
+
+### Initial RAM Filesystem
+
+For very simple setups, it can be sufficient to pass a command line option to
+the kernel that tells it what device to mount for the root filesystem. For more
+complex setups, Linux supports mounting an *initial RAM filesystem*.
+
+This basically means that in addition to the kernel, the boot loader loads
+a compressed archive into memory. Along with the kernel command line, the boot
+loader gives the kernel a pointer to the start of the archive in memory.
+
+The kernel then mounts an in-memory filesystem as root filesystem, unpacks the
+archive into it and runs the PID 1 process from there. Typically this is a
+script or program that then does a more complex mount setup, transitions to
+the actual root file system and does an `exec` to start the actual PID 1
+process. If it fails at some point, it usually drops you into a tiny rescue
+shell that is also packed into the archive.
+
+For historical reasons, Linux uses [cpio](https://en.wikipedia.org/wiki/Cpio)
+archives for the initial RAM filesystem.
+
+Systems typically use [BusyBox](https://busybox.net/) as a tiny shell
+interpreter. BusyBox is a collection of tiny command line programs that
+implement basic commands available on Unix-like systems, ranging from `echo`
+or `cat` all the way to a small `vi` and `sed` implementation, and it includes
+two different shell implementations to choose from.
+
+BusyBox gets compiled into a single, monolithic binary. For the utility
+programs, symlinks or hard links are created that point to the binary.
+BusyBox, when run, will determine what utility to execute from the path
+through which it has been started.
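+
+You can try this out on your build machine, if it has BusyBox installed. The
+symlink name is all that matters:
+
+    $ busybox echo "Hello"
+    Hello
+    $ ln -s "$(command -v busybox)" echo
+    $ ./echo "Hello"
+    Hello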
+
+**NOTE**: The initial RAM filesystem, or **initramfs**, should not be confused
+with the older concept of an initial RAM disk, or **initrd**. The initial RAM
+disk actually uses a disk image instead of an archive and the kernel internally
+emulates a block device that reads blocks from RAM. A regular filesystem driver
+is used to mount the RAM backed block device as root filesystem.
+
+### Device Tree
+
+On a typical x86 PC, your hardware devices are attached to the PCI bus and the
+kernel can easily scan it to find everything. The devices have nice IDs that
+the kernel can query and the drivers tell the kernel what IDs they can handle.
+
+On embedded machines running e.g. ARM based SoCs, the situation is a bit
+different. The various SoC vendors buy licenses for all the hardware
+"IP cores", slap them together and multiplex them onto the CPU core's memory
+bus. The hardware registers end up mapped to SoC specific memory locations
+and there is no real way to scan for possibly present hardware.
+
+In the past, Linux had something called "board files", which were SoC specific
+C files containing SoC & board specific initialization code, but this was
+considered too inflexible.
+
+Linux eventually adopted the concept of a device tree binary, which is
+basically a binary blob that hierarchically describes the hardware present on
+the system and how the kernel can interface with it.
+
+The boot loader loads the device tree into memory and tells the kernel where it
+is, just like it already does for the initial ramfs and command line.
+
+In theory, a kernel binary can now be started on a number of different boards
+with the same CPU architecture, without recompiling (assuming it has all the
+drivers). It just needs the correct device tree binary for the board.
+
+The device tree binary (dtb) itself is generated from a number of source
+files (dts) located in the kernel source tree under
+`arch/<architecture>/boot/dts`. They are compiled together with the kernel
+using a device tree compiler that is also part of the kernel source.
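+
+If you are curious what is inside such a blob, you can disassemble a compiled
+`dtb` back into readable `dts` source with the device tree compiler. The
+invocation below assumes that you have `dtc` installed on your build machine
+and uses one of the Raspberry Pi 3 device tree binaries as an example:
+
+    dtc -I dtb -O dts -o devicetree.dts bcm2710-rpi-3-b.dtb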
+
+On a side note, the device tree format originates from the BIOS equivalent
+of SPARC workstations. The format is now standardized through a specification
+provided by the Open Firmware project and Linux considers it part of its ABI,
+i.e. a newer kernel should *always* work with an older DTB file.
+
+## Overview
+
+In this section, we will cross compile BusyBox, build a small initial ramfs,
+cross compile the kernel and get all of this to run on the Raspberry Pi.
+
+Unless you have used the `download.sh` script from [the cross toolchain](01_crosscc.md),
+you will need to download and unpack the following:
+
+* [BusyBox](https://busybox.net/downloads/busybox-1.32.1.tar.bz2)
+* [Linux](https://github.com/raspberrypi/linux/archive/raspberrypi-kernel_1.20201201-1.tar.gz)
+
+You should still have the following environment variables set from building the
+cross toolchain:
+
+    BUILDROOT=$(pwd)
+    TCDIR="$BUILDROOT/toolchain"
+    SYSROOT="$BUILDROOT/sysroot"
+    TARGET="arm-linux-musleabihf"
+    HOST="x86_64-linux-gnu"
+    LINUX_ARCH="arm"
+    export PATH="$TCDIR/bin:$PATH"
+
+
+## Building BusyBox
+
+The BusyBox build system is basically the same as the Linux kernel build system
+that we already used for [building a cross toolchain](01_crosscc.md).
+
+Just like the kernel (which we haven't built yet), BusyBox has a configuration
+file that contains a list of key-value pairs for enabling and tuning features.
+
+I prepared a file `bbstatic.config` with the configuration that I used. I
+disabled a lot of stuff that we don't need inside an initramfs, but most
+importantly, I changed the following settings:
+
+- **CONFIG_INSTALL_NO_USR** set to yes, so BusyBox creates a flat hierarchy
+  when installing itself.
+- **CONFIG_STATIC** set to yes, so BusyBox is statically linked and we don't
+  need to pack any libraries or a loader into our initramfs.
+
+If you want to customize my configuration, copy it into a freshly extracted
+BusyBox tarball, rename it to `.config` and run the menuconfig target:
+
+    mv bbstatic.config .config
+    make menuconfig
+
+The `menuconfig` target builds and runs an ncurses based dialog that lets you
+browse and configure features.
+
+Alternatively, you can start from scratch by creating a default configuration:
+
+    make defconfig
+    make menuconfig
+
+To compile BusyBox, we'll first do the usual setup for the out-of-tree build:
+
+    srcdir="$BUILDROOT/src/busybox-1.32.1"
+    export KBUILD_OUTPUT="$BUILDROOT/build/bbstatic"
+
+    mkdir -p "$KBUILD_OUTPUT"
+    cd "$KBUILD_OUTPUT"
+
+At this point, you have to copy the BusyBox configuration into the build
+directory. Either use your own, or copy my `bbstatic.config` over and rename
+it to `.config`.
+
+By running `make oldconfig`, we let the build system sanity check the config
+file and have it ask what to do if any option is missing.
+
+    make -C "$srcdir" CROSS_COMPILE="${TARGET}-" oldconfig
+
+We need to edit 2 settings in the config file: the path to the sysroot and
+the prefix for the cross compiler executables. This can be done easily with
+two lines of `sed`:
+
+    sed -i "$KBUILD_OUTPUT/.config" -e 's,^CONFIG_CROSS_COMPILE=.*,CONFIG_CROSS_COMPILE="'$TARGET'-",'
+    sed -i "$KBUILD_OUTPUT/.config" -e 's,^CONFIG_SYSROOT=.*,CONFIG_SYSROOT="'$SYSROOT'",'
+
+All that is left now is compiling BusyBox.
+
+    make -C "$srcdir" CROSS_COMPILE="${TARGET}-"
+
+Before returning to the build root directory, I installed the resulting binary
+to the sysroot directory as `bbstatic`.
+
+    mkdir -p "$SYSROOT/bin"
+    cp busybox "$SYSROOT/bin/bbstatic"
+    cd "$BUILDROOT"
+
+## Compiling the Kernel
+
+First, we do the same dance again for the kernel out of tree build:
+
+    srcdir="$BUILDROOT/src/linux-raspberrypi-kernel_1.20201201-1"
+    export KBUILD_OUTPUT="$BUILDROOT/build/linux"
+
+    mkdir -p "$KBUILD_OUTPUT"
+    cd "$KBUILD_OUTPUT"
+
+I provided a configuration file in `linux.config` which you can simply copy
+to `$KBUILD_OUTPUT/.config`.
+
+Or you can do the same as I did and start out by initializing a default
+configuration for the Raspberry Pi and customizing it:
+
+    make -C "$srcdir" ARCH="$LINUX_ARCH" bcm2709_defconfig
+    make -C "$srcdir" ARCH="$LINUX_ARCH" menuconfig
+
+I mainly changed **CONFIG_SQUASHFS** and **CONFIG_OVERLAY_FS**, turning them
+both from `<M>` to `<*>`, so they get built in instead of being built as
+modules.
+
+Hint: you can also search for things in the menu config by typing `/` and then
+browsing through the popup dialog. Pressing the number printed next to any
+entry brings you directly to the option. Be aware that names in the menu
+generally don't contain the **CONFIG_** prefix.
+
+Same as with BusyBox, we insert the cross compile prefix into the configuration
+file:
+
+    sed -i "$KBUILD_OUTPUT/.config" -e 's,^CONFIG_CROSS_COMPILE=.*,CONFIG_CROSS_COMPILE="'$TARGET'-",'
+
+And then finally build the kernel:
+
+    make -C "$srcdir" ARCH="$LINUX_ARCH" CROSS_COMPILE="${TARGET}-" oldconfig
+    make -C "$srcdir" ARCH="$LINUX_ARCH" CROSS_COMPILE="${TARGET}-" zImage dtbs modules
+
+The `oldconfig` target does the same as for BusyBox. More interesting are the
+three make targets in the second line: the `zImage` target builds the
+compressed kernel binary, the `dtbs` target builds the device tree binaries
+and the `modules` target builds the loadable kernel modules (i.e. drivers).
+You really want to insert a `-j NUMBER_OF_JOBS` in the second line, or it may
+take a considerable amount of time.
+
+Also, you *really* want to specify an argument after `-j`, otherwise the kernel
+build system will spawn processes until kingdom come (i.e. until your system
+runs out of resources and the OOM killer steps in).
+
+Lastly, I installed all of it into the sysroot for convenience:
+
+    mkdir -p "$SYSROOT/boot"
+    cp arch/arm/boot/zImage "$SYSROOT/boot"
+    cp -r arch/arm/boot/dts "$SYSROOT/boot"
+
+    make -C "$srcdir" ARCH="$LINUX_ARCH" CROSS_COMPILE="${TARGET}-" INSTALL_MOD_PATH="$SYSROOT" modules_install
+    cd "$BUILDROOT"
+
+The `modules_install` target creates a directory hierarchy `sysroot/lib/modules`
+containing a sub directory for each kernel version with the kernel modules and
+dependency information.
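+
+In our case, there should be exactly one sub directory, named after the
+kernel version, something like this (the exact name depends on the kernel
+version and configuration):
+
+    $ ls "$SYSROOT/lib/modules"
+    5.4.83-v7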
+
+The kernel binary will be circa 6 MiB in size and produce another circa 55 MiB
+worth of modules, because the Raspberry Pi default configuration has all bells
+and whistles turned on. Feel free to adjust the kernel configuration and throw
+out everything you don't need.
+
+## Building an Initial RAM Filesystem
+
+First of all, although we do everything by hand here, we are going to create a
+build directory to keep everything neatly separated:
+
+    mkdir -p "$BUILDROOT/build/initramfs"
+    cd "$BUILDROOT/build/initramfs"
+
+Technically, the initramfs image is a simple cpio archive. However, there are
+some pitfalls here:
+
+* There are various versions of the cpio format, some binary, some text based.
+* The `cpio` command line tool is utterly horrible to use.
+* Technically, the POSIX standard considers it legacy. See the big fat warning
+  in the man page.
+
+So instead of the `cpio` tool, we are going to use a tool from the Linux kernel
+tree called `gen_init_cpio`:
+
+    gcc "$BUILDROOT/src/linux-raspberrypi-kernel_1.20201201-1/usr/gen_init_cpio.c" -o gen_init_cpio
+
+This tool allows us to create a cpio image from a very simple file listing and
+produces exactly the format that the kernel understands.
+
+Here is the simple file listing that I used:
+
+    cat > initramfs.files <<_EOF
+    dir boot 0755 0 0
+    dir dev 0755 0 0
+    dir lib 0755 0 0
+    dir bin 0755 0 0
+    dir sys 0755 0 0
+    dir proc 0755 0 0
+    dir newroot 0755 0 0
+    slink sbin bin 0777 0 0
+    nod dev/console 0600 0 0 c 5 1
+    file bin/busybox $SYSROOT/bin/bbstatic 0755 0 0
+    slink bin/sh /bin/busybox 0777 0 0
+    file init $BUILDROOT/build/initramfs/init 0755 0 0
+    _EOF
+
+In case you are wondering about the first and last line, this is called a
+[heredoc](https://en.wikipedia.org/wiki/Here_document) and can be copy/pasted
+into the shell as is.
+
+The format itself is actually pretty self-explanatory. The `dir` lines are
+directories that we want in our archive, with the permission and ownership
+information after the name. The `slink` entry creates a symlink, namely
+redirecting `/sbin` to `/bin`.
+
+The `nod` entry creates a device file. In this case, a character
+device (hence `c`) with device number `5:1`. Just like how symlinks are special
+files that have a target string stored in them and get special treatment from
+the kernel, a device file is also just a special kind of file that has a device
+number stored in it. When a program opens a device file, the kernel maps the
+device number to a driver and redirects file I/O to that driver.
+
+The device number `5:1` refers to a special text console on which the kernel
+prints out messages during boot. BusyBox will use this as standard input/output
+for the shell.
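+
+On a running Linux system, you can see those device numbers in the output of
+`ls -l`, in the column where regular files show their size (this is from my
+desktop machine, the timestamp will of course vary):
+
+    $ ls -l /dev/console
+    crw------- 1 root root 5, 1 Feb 13 14:57 /dev/console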
+
+Next, we actually pack our statically linked BusyBox into the archive, but
+under the name `/bin/busybox`. We then create a symlink to it, called `bin/sh`.
+
+The last line packs a script called `init` (which we haven't written yet) into
+the archive as `/init`.
+
+The script called `/init` is what we later want the kernel to run as the PID 1
+process. For the moment, there is not much to do and all we want is to get
+a shell when we power up our Raspberry Pi, so we start out with this stub
+script:
+
+    cat > init <<_EOF
+    #!/bin/sh
+
+    PATH=/bin
+
+    /bin/busybox --install
+    /bin/busybox mount -t proc none /proc
+    /bin/busybox mount -t sysfs none /sys
+    /bin/busybox mount -t devtmpfs none /dev
+
+    exec /bin/busybox sh
+    _EOF
+
+Running `busybox --install` will cause BusyBox to install tons of symlinks to
+itself in the `/bin` directory, one for each utility program. The next three
+lines run the `mount` utility of BusyBox to mount the following pseudo
+filesystems:
+
+* `proc`, the process information filesystem, which maps processes and various
+  other kernel variables to a directory hierarchy. It is mounted to `/proc`.
+  See `man 5 proc` for more information.
+* `sysfs`, a more generic, cleaner variant than `proc` for exposing kernel
+  objects to user space as a filesystem hierarchy. It is mounted to `/sys`.
+  See `man 5 sysfs` for more information.
+* `devtmpfs` is a pseudo filesystem that takes care of managing device files
+  for us. We mount it over `/dev`.
+
+We can now finally put everything together into an XZ compressed archive:
+
+    ./gen_init_cpio initramfs.files | xz --check=crc32 > initramfs.xz
+    cp initramfs.xz "$SYSROOT/boot"
+    cd "$BUILDROOT"
+
+The option `--check=crc32` forces the `xz` utility to create CRC-32 checksums
+instead of using sha256. This is necessary because the kernel's built-in
+xz library cannot do sha256 and will refuse to unpack the image otherwise,
+meaning the system won't boot.
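+
+Whether the image really uses CRC32 checksums can be double checked with
+`xz -l` (the sizes shown here are only an example and will differ for you):
+
+    $ xz -l initramfs.xz
+    Strms  Blocks   Compressed Uncompressed  Ratio  Check   Filename
+        1       1  1,012.1 KiB    2,344.0 KiB  0.432  CRC32   initramfs.xz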
+
+## Putting everything on the Raspberry Pi and Booting it
+
+Remember how I mentioned earlier that the last step of our boot loader chain
+would involve something sane, like U-Boot or BareBox? Well, not on the
+Raspberry Pi.
+
+In addition to the already bizarro hardware, the Raspberry Pi has a lot of
+proprietary magic baked directly into the hardware. The boot process is
+controlled by the GPU, since the SoC is basically a GPU with an ARM CPU slapped
+onto it.
+
+The GPU loads a binary called `bootcode.bin` from the SD card, which contains a
+proprietary boot loader blob for the GPU. This in turn does some initialization
+and chain loads `start.elf`, which contains a firmware blob for the GPU. The GPU
+is running an RTOS called [ThreadX OS](https://en.wikipedia.org/wiki/ThreadX)
+and somewhere around [>1M lines](https://www.raspberrypi.org/forums/viewtopic.php?t=53007#p406247)
+worth of firmware code.
+
+There are different versions of `start.elf`. The one called `start_x.elf`
+contains an additional driver for the camera interface, `start_db.elf` is a
+debug version and `start_cd.elf` is a version with a cut-down memory layout.
+
+The `start.elf` file uses an additional file called `fixup.dat` to configure
+the RAM partitioning between the GPU and the CPU.
+
+In the end, the GPU firmware loads and parses a file called `config.txt` from
+the SD card, which contains configuration parameters, and `cmdline.txt`, which
+contains the kernel command line. After parsing the configuration, it finally
+loads the kernel, the initramfs and the device tree binaries and runs the
+kernel.
+
+Depending on the configuration, the GPU firmware may patch the device tree
+in-memory before running the kernel.
+
+### Copying the Files Over
+
+First, we need a micro SD card with a FAT32 partition on it. How to create the
+partition is left as an exercise to the reader.
+
+Onto this partition, we copy the proprietary boot loader blobs:
+
+* [bootcode.bin](firmware/bootcode.bin)
+* [fixup.dat](firmware/fixup.dat)
+* [start.elf](firmware/start.elf)
+
+We create a minimal [config.txt](firmware/config.txt) in the root directory:
+
+    dtparam=
+    kernel=zImage
+    initramfs initramfs.xz followkernel
+
+The first line makes sure the boot loader doesn't mangle the device tree. The
+second one specifies the kernel binary that should be loaded and the last one
+specifies the initramfs image. Note that there is no `=` sign in the last
+line. This field has a different format and the boot loader will ignore it if
+there is an `=` sign. The `followkernel` attribute tells the boot loader to put
+the initramfs into memory right after the kernel binary.
+
+Then, we'll put the [cmdline.txt](firmware/cmdline.txt) onto the SD card:
+
+    console=tty0
+
+The `console` parameter tells the kernel the tty where it prints its boot
+messages and that it uses as the standard input/output tty for our init script.
+We tell it to use the first video console, which is what we will get at the
+HDMI output of the Raspberry Pi.
+
+What's left are the device tree binaries and lastly the kernel and initramfs:
+
+    mkdir -p overlays
+    cp $SYSROOT/boot/dts/*-rpi-3-*.dtb .
+    cp $SYSROOT/boot/dts/overlays/*.dtbo overlays/
+
+    cp $SYSROOT/boot/initramfs.xz .
+    cp $SYSROOT/boot/zImage .
+
+If you are done, unmount the micro SD card and plug it into your Raspberry Pi.
+
+
+### Booting It Up
+
+If you connect the HDMI port and power up the Raspberry Pi, it should boot
+directly into the initramfs and you should get a BusyBox shell.
+
+The PATH is properly set and the most common shell commands should be there, so
+you can poke around the root filesystem, which is in memory and has been
+unpacked from the `initramfs.xz`.
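+
+A few things you can try from the BusyBox shell (everything lives in RAM, so
+there is no harm done if you break something, a reboot gives you a fresh
+copy):
+
+    cat /proc/cpuinfo
+    ls /sys/class
+    mount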
+
+Don't be alarmed when the kernel boot messages suddenly stop. Even after the
+BusyBox shell starts, the kernel continues spewing messages for a short while
+and you may not see the shell prompt. Just hit the enter key a couple of
+times.
+
+Also, the shell itself is running as PID 1. If you exit it, the kernel panics,
+because PID 1 just died.
diff --git a/README.md b/README.md
index 9008315..2da222b 100644
--- a/README.md
+++ b/README.md
@@ -13,12 +13,12 @@ command lines around (I'm looking at you, LFS).
 
 This guide is divided into the following parts:
 
-* [Basic Setup](setup.md). Lists some tools that you should have installed and
-  walks through the steps of setting up the directory tree that we work in, as
-  well as a few handy environment variables.
-* [Building a cross compiler toolchain](crosscc.md).
-* [Cross compiling a statically linked BusyBox and the kernel](kernel.md). The
-  BusyBox is packaged into a small initrd. We will make it boot on the
+* [Basic Setup](00_setup.md). Lists some tools that you should have
+  installed and walks through the steps of setting up the directory tree that
+  we work in, as well as a few handy environment variables.
+* [Building a cross compiler toolchain](01_crosscc.md).
+* [Cross compiling a statically linked BusyBox and the kernel](02_kernel.md).
+  The BusyBox is packaged into a small initramfs. We will make it boot on the
   Raspberry Pi and explore some parts of the Linux boot process.
 * [Building a more sophisticated userland](userland.md). Mostly a
   Linux-From-Scratch-Style "let's build some packages". The userland will be
diff --git a/crosscc.md b/crosscc.md
deleted file mode 100644
index 74f2d07..0000000
--- a/crosscc.md
+++ /dev/null
-
-Those options take as an argument a dash separated tuple that describes
-a system and is made up the following way:
-
-    <cpu>-<vendor>-<kernel>-<userspace>
-
-The vendor part is completely optional and we will only use 3 components to
-describe our toolchain. So our 32 bit ARM system, running a Linux kernel
-with a Musl based user space, is described like this:
-
-    arm-linux-musleabihf
-
-The user space component itself specifies that we use `musl` and we want to
-adhere to the ARM embedded ABI specification (`eabi` for short) with hardware
-float (`hf`) support.
-
-If you want to determine the tuple for the system *you are running on*, you can
-use the script [config.guess](https://git.savannah.gnu.org/gitweb/?p=config.git;a=tree):
-
-    $ HOST=$(./config.guess)
-    $ echo "$HOST"
-    x86_64-pc-linux-gnu
-
-There are reasons why this script exists and why it is that long. Even
-on Linux distributions, there is no consistent way to pull a machine triple
-out of a shell one liner.
-
-Some guides out there suggest using the shell builtin **MACHTYPE**:
-
-    $ echo "$MACHTYPE"
-    x86_64-redhat-linux-gnu
-
-The above is what I got on Fedora, however on Arch Linux I got this:
-
-    $ echo "$MACHTYPE"
-    x86_64
-
-Some other guides suggest using `uname` and **OSTYPE**:
-
-    $ HOST=$(uname -m)-$OSTYPE
-    $ echo $HOST
-    x86_64-linux-gnu
-
-This works on Fedora and Arch Linux, but fails on OpenSuSE:
-
-    $ HOST=$(uname -m)-$OSTYPE
-    $ echo $HOST
-    x86_64-linux
-
-If you want to save yourself a lot of headache, refrain from using such
-ad-hockery and simply use `config.guess`. I only listed this here to warn you,
-because I have seen some guides and tutorials out there using this nonsense.
-
-As you saw here, I'm running on an x86_64 system and my user space is `gnu`,
-which tells autotools that the system is using `glibc`.
-
-You also saw that the `vendor` field is sometimes used for branding, so use that
-field if you must, because the others have an exact meaning and are parsed by
-the build system.
-
-### The Installation Path
-
-When running `make install`, there are two ways to control where the program
-we just compiled is installed to.
-
-First of all, the `configure` script has an option called `--prefix`. That can
-be used like this:
-
-    ./configure --prefix=/usr
-    make
-    make install
-
-In this case, `make install` will e.g. install the program to `/usr/bin` and
-install resources to `/usr/share`. The important thing here is that the prefix
-is used to generate path variables and the program "knows" what its prefix is,
-i.e. it will fetch resources from `/usr/share`.
-
-But if instead we run this:
-
-    ./configure --prefix=/opt/yoyodyne
-    make
-    make install
-
-The same program is installed to `/opt/yoyodyne/bin` and its resources end up
-in `/opt/yoyodyne/share`. The program again knows to look in the latter path
-for its resources.
-
-The second option we have is a Makefile variable called `DESTDIR`, which
-controls the behavior of `make install` *after* the program has been compiled:
-
-    ./configure --prefix=/usr
-    make
-    make DESTDIR=/home/goliath/workdir install
-
-In this example, the program is installed to `/home/goliath/workdir/usr/bin`
-and the resources to `/home/goliath/workdir/usr/share`, but the program itself
-doesn't know that and "thinks" it lives in `/usr`. If we try to run it, it
-tries to load resources from `/usr/share` and will be sad because it can't
-find its files.
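-
-This `--prefix` plus `DESTDIR` combination is exactly the pattern we will use
-later for cross compiled packages: configure the package with the prefix it
-will have *on the target*, but stage the installation into our sysroot. A
-minimal sketch (the package is hypothetical):
-
-    ./configure --prefix=/usr --host=arm-linux-musleabihf
-    make
-    make DESTDIR="$SYSROOT" install
-
-The files end up in `$SYSROOT/usr`, while the installed program still
-"thinks" it lives in `/usr`, which is true from the point of view of the
-target system.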
-
-## Building our Toolchain
-
-At first, we set a few handy shell variables that will store the configuration
-of our toolchain:
-
-    TARGET="arm-linux-musleabihf"
-    HOST="x86_64-linux-gnu"
-    LINUX_ARCH="arm"
-    MUSL_CPU="arm"
-    GCC_CPU="armv6"
-
-The **TARGET** variable holds the *target triplet* of our system as described
-above.
-
-We also need the triplet for the local machine that we are going to build
-things on. For simplicity, I also set this manually.
-
-The **MUSL_CPU**, **GCC_CPU** and **LINUX_ARCH** variables hold the target
-CPU architecture. The variables are used for musl, gcc and linux respectively,
-because they cannot agree on consistent architecture names (except sometimes).
-
-### Installing the kernel headers
-
-We create a build directory called **$BUILDROOT/build/linux**. Building the
-kernel outside its source tree works a bit differently compared to autotools
-based stuff.
-
-To keep things clean, we use a shell variable **srcdir** to remember where
-we kept the kernel source. A pattern that we will repeat later:
-
-    export KBUILD_OUTPUT="$BUILDROOT/build/linux"
-    mkdir -p "$KBUILD_OUTPUT"
-
-    srcdir="$BUILDROOT/src/linux-raspberrypi-kernel_1.20201201-1"
-
-    cd "$srcdir"
-    make O="$KBUILD_OUTPUT" ARCH="$LINUX_ARCH" headers_check
-    make O="$KBUILD_OUTPUT" ARCH="$LINUX_ARCH" INSTALL_HDR_PATH="$SYSROOT/usr" headers_install
-    cd "$BUILDROOT"
-
-According to the Makefile in the Linux source, you can either specify an
-environment variable called **KBUILD_OUTPUT**, or set a Makefile variable
-called **O**, where the latter overrides the environment variable. The snippet
-above shows both ways.
-
-The *headers_check* target runs a few trivial sanity checks on the headers
-we are going to install. It checks if a header includes something nonexistent,
-if the declarations inside the headers are sane and if kernel internals are
-leaked into user space. For stock kernel tarballs, this shouldn't be
-necessary, but it could come in handy when working with kernel git trees,
-potentially with local modifications.
-
-Lastly (before switching back to the root directory), we actually install the
-kernel headers into the sysroot directory where the libc later expects them
-to be.
-
-The `sysroot` directory should now contain a `usr/include` directory with a
-number of sub directories that contain kernel headers.
-
-Since I've seen the question in a few forums: it doesn't matter if the kernel
-version exactly matches the one running on your target system. The kernel
-system call ABI is stable, so you can use an older kernel. Only if you use a
-much newer kernel might the libc end up exposing or using features that your
-kernel does not yet support.
-
-If you have some embedded board with a heavily modified vendor kernel (such as
-in our case) and little to no upstream support, the situation is a bit more
-difficult and you may prefer to use the exact kernel.
-
-Even then, if you have some board where the vendor tree breaks the
-ABI, **take the board and burn it** (preferably outside; don't inhale
-the fumes).
-
-### Compiling cross binutils
-
-We will compile binutils outside the source tree, inside the directory
-**build/binutils**. So first, we create the build directory and switch into
-it:
-
-    mkdir -p "$BUILDROOT/build/binutils"
-    cd "$BUILDROOT/build/binutils"
-
-    srcdir="$BUILDROOT/src/binutils-2.36"
-
-From the binutils build directory we run the configure script:
-
-    $srcdir/configure --prefix="$TCDIR" --target="$TARGET" \
-        --with-sysroot="$SYSROOT" \
-        --disable-nls --disable-multilib
-
-We use the **--prefix** option to actually let the toolchain know that it is
-being installed in our toolchain directory, so it can locate its resources and
-helper programs when we run it.
-
-We also set the **--target** option to tell the build system what target the
-assembler, linker and other tools should generate **output** for. We don't
-explicitly set the **--host** or **--build** options because we are compiling
-binutils to run on the local machine.
-
-We would only set the **--host** option to cross compile binutils itself with
-an existing toolchain, to run on a different system than ours.
-
-The **--with-sysroot** option tells the build system that the root directory
-of the system we are going to build is in `$SYSROOT` and it should look inside
-that to find libraries.
-
-We disable the feature **nls** (native language support, i.e. cringe worthy
-translations of error messages to your native language, such as Deutsch
-or 中文), mainly because we don't need it and not doing something typically
-saves time.
-
-Regarding the multilib option: Some architectures support executing code for
-other, related architectures (e.g. an x86_64 machine can run 32 bit x86 code).
-On GNU/Linux distributions that support that, you typically have different
-versions of the same libraries (e.g. in *lib/* and *lib32/* directories) with
-programs for different architectures being linked to the appropriate libraries.
-We are only interested in a single architecture and don't need that, so we
-set **--disable-multilib**.
-
-
-Now we can compile and install binutils:
-
-    make configure-host
-    make
-    make install
-    cd "$BUILDROOT"
-
-The first make target, *configure-host*, is binutils specific and just tells it
-to check out the system it is *being built on*, i.e. your local machine, and
-make sure it has all the tools it needs for compiling. If it reports a problem,
-**go fix it before continuing**.
-
-We then go on to build the binutils. You may want to speed up compilation by
-running a parallel build with **make -j NUMBER-OF-PROCESSES**.
-
-Lastly, we run *make install* to install the binutils in the configured
-toolchain directory and go back to our root directory.
-
-The `toolchain/bin` directory should now already contain a bunch of executables
-such as the assembler, linker and other tools that are prefixed with the target
-triplet.
-
-There is also a new directory called `toolchain/arm-linux-musleabihf` which
-contains a secondary system root with programs that aren't prefixed, and some
-linker scripts.
-
-### First pass GCC
-
-Similar to above, we create a directory for building the compiler, change
-into it and store the source location in a variable:
-
-    mkdir -p "$BUILDROOT/build/gcc-1"
-    cd "$BUILDROOT/build/gcc-1"
-
-    srcdir="$BUILDROOT/src/gcc-10.2.0"
-
-Notice how the build directory is called *gcc-1*. For the second pass, we
-will later create a different build directory. Not only does this out of tree
-build allow us to cleanly start afresh (because the source is left untouched),
-but current versions of GCC will *flat out refuse* to build inside the
-source tree.
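-
-Before configuring GCC, it can't hurt to quickly verify that the cross
-binutils we just installed are the ones picked up through our **PATH** (the
-path in the output below is just an illustration):
-
-    $ command -v "${TARGET}-as"
-    /home/goliath/workdir/toolchain/bin/arm-linux-musleabihf-as
-
-With that out of the way, we configure the first pass GCC: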
-
-    $srcdir/configure --prefix="$TCDIR" --target="$TARGET" --build="$HOST" \
-        --host="$HOST" --with-sysroot="$SYSROOT" \
-        --disable-nls --disable-shared --without-headers \
-        --disable-multilib --disable-decimal-float \
-        --disable-libgomp --disable-libmudflap \
-        --disable-libssp --disable-libatomic \
-        --disable-libquadmath --disable-threads \
-        --enable-languages=c --with-newlib \
-        --with-arch="$GCC_CPU" --with-float=hard \
-        --with-fpu=neon-vfpv3
-
-The **--prefix**, **--target** and **--with-sysroot** options work just like
-above for binutils.
-
-This time we explicitly specify **--build** (i.e. the system that we are going
-to compile GCC on) and **--host** (i.e. the system that the GCC will run on).
-In our case those are the same. I set those explicitly for GCC, because the GCC
-build system is notoriously fragile. Yes, *I have seen* older versions of GCC
-throw a fit or assume complete nonsense if you don't explicitly specify those,
-and at this point I'm no longer willing to trust it.
-
-The option **--with-arch** gives the build system slightly more specific
-information about the target processor architecture. The two options after that
-are specific to our target and tell the build system that GCC should use the
-hardware floating point unit and can emit neon instructions for vectorization.
-
-We also disable a bunch of stuff we don't need. I already explained *nls*
-and *multilib* above. We also disable a bunch of optimization stuff and helper
-libraries. Among other things, we also disable support for dynamic linking and
-threads, as we don't have the libc yet.
-
-The option **--without-headers** tells the build system that we don't have the
-headers for the libc *yet* and it should use minimal stubs instead where it
-needs them. The **--with-newlib** option is *more of a hack*. It claims that we
-are going to use [newlib](http://www.sourceware.org/newlib/) as the C standard
-library. This isn't actually true, but it forces the build system to disable some
-[libgcc features that depend on the libc](https://gcc.gnu.org/ml/gcc-help/2009-07/msg00368.html).
-
-The option **--enable-languages** accepts a comma separated list of languages
-that we want to build compilers for. For now, we only need a C compiler for
-compiling the libc.
-
-If you are interested: [Here is a detailed list of all GCC configure options.](https://gcc.gnu.org/install/configure.html)
-
-Now, let's build the compiler and `libgcc`:
-
-    make all-gcc all-target-libgcc
-    make install-gcc install-target-libgcc
-
-    cd "$BUILDROOT"
-
-We explicitly specify the make targets for *GCC* and *cross-compiled libgcc*
-for our target. We are not interested in anything else.
-
-For the first make, you **really** want to specify a *-j NUM-PROCESSES* option
-here. Even the first pass GCC we are building here will take a while to compile
-on an ordinary desktop machine.
-
-### C standard library
-
-We create our build directory and change there:
-
-    mkdir -p "$BUILDROOT/build/musl"
-    cd "$BUILDROOT/build/musl"
-
-    srcdir="$BUILDROOT/src/musl-1.2.2"
-
-Musl is quite easy to build, but requires some special handling, because it
-doesn't use autotools. The configure script is actually a hand written shell
-script that tries to emulate some of the typical autotools handling:
-
-    CC="${TARGET}-gcc" $srcdir/configure --prefix=/usr --target="$TARGET"
-
-We override the shell variable **CC** to point to the cross compiler that we
-just built. Remember, we added **$TCDIR/bin** to our **PATH**.
-
-We do the same thing for actually compiling musl, and we explicitly set the
-**DESTDIR** variable for installing:
-
-    CC="${TARGET}-gcc" make
-    make DESTDIR="$SYSROOT" install
-
-    cd "$BUILDROOT"
-
-The important part here, which later also applies to autotools based packages,
-is that we don't set **--prefix** to the sysroot directory. We set the prefix
-so that the build system "thinks" it compiles the library to be installed
-in `/usr`, but then we install the compiled binaries and headers to the sysroot
-directory.
-
-The `sysroot/usr/include` directory should now contain a bunch of standard
-headers. Likewise, the `sysroot/usr/lib` directory should now contain a
-`libc.so`, a bunch of dummy libraries, and the startup object code provided
-by Musl.
-
-Despite the prefix we set, Musl installs a `sysroot/lib/ld-musl-armhf.so.1`
-symlink which points to `/usr/lib/libc.so`. Dynamically linked programs built
-with our toolchain will have `/lib/ld-musl-armhf.so.1` set as their loader.
-
-### Second pass GCC
-
-We are reusing the same source code from the first stage, but in a different
-build directory:
-
-    mkdir -p "$BUILDROOT/build/gcc-2"
-    cd "$BUILDROOT/build/gcc-2"
-
-    srcdir="$BUILDROOT/src/gcc-10.2.0"
-
-Most of the configure options should be familiar already:
-
-    $srcdir/configure --prefix="$TCDIR" --target="$TARGET" --build="$HOST" \
-        --host="$HOST" --with-sysroot="$SYSROOT" \
-        --disable-nls --enable-languages=c,c++ \
-        --enable-c99 --enable-long-long \
-        --disable-libmudflap --disable-multilib \
-        --disable-libsanitizer --with-arch="$GCC_CPU" \
-        --with-native-system-header-dir="/usr/include" \
-        --with-float=hard --with-fpu=neon-vfpv3
-
-For the second pass, we also build a C++ compiler. The options **--enable-c99**
-and **--enable-long-long** are actually C++ specific. When our final compiler
-runs in C++98 mode, we allow it to expose C99 functions from the libc through
-a GNU extension. We also allow it to support the *long long* data type
-standardized in C99.
-
-You may wonder why we didn't have to build a **libstdc++** between the
-first and second pass, like the libc. The source code for the *libstdc++*
-comes with the **g++** compiler and is built automatically like `libgcc`.
-On the one hand, it is really just a library that adds C++ stuff
-*on top of libc*, mostly header only code that is compiled with the actual
-C++ programs. On the other hand, C++ does not have a standard ABI and it is
-all compiler and OS specific. So compiler vendors will typically ship their
-own `libstdc++` implementation with the compiler.
-
-We **--disable-libsanitizer** because it simply won't build for musl. I tried
-fixing it, but it simply assumes too much about the nonstandard internals
-of the libc. A quick Google search reveals that it has **lots** of similar
-issues with all kinds of libc & kernel combinations, so even if I fix it on
-my system, you may run into other problems on your system or with different
-versions of packages. It even has different problems with different versions
-of glibc. Projects like buildroot simply disable it when using musl. It "only"
-provides the runtime support libraries for the compiler's sanitizers.
-
-The option **--with-native-system-header-dir** is of special interest for our
-cross compiler. We explicitly tell it to look for headers in `/usr/include`,
-relative to our **$SYSROOT** directory. We could just as easily have placed the
-headers somewhere else in the previous steps and have it look there.
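-
-Once this second pass compiler is installed (which we will do next), you can
-ask it where its sysroot ended up, e.g. when debugging header lookup problems.
-The output shown here is just an illustration:
-
-    $ ${TARGET}-gcc -print-sysroot
-    /home/goliath/workdir/sysroot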
-
-All that's left now is building and installing the compiler:
-
-    make
-    make install
-
-    cd "$BUILDROOT"
-
-This time, we are going to build and install *everything*. You *really* want to
-do a parallel build here. On my AMD Ryzen based desktop PC, building with
-`make -j 16` takes about 3 minutes. On my Intel i5 laptop it takes circa 15
-minutes. If you are using a laptop, you might want to open a window (assuming
-it is cold outside, i.e. it won't help if you are in Taiwan).
-
-### Testing the Toolchain
-
-We quickly write our average hello world program into a file called **test.c**:
-
-    #include <stdio.h>
-
-    int main(void)
-    {
-        puts("Hello, world");
-        return 0;
-    }
-
-We can now use our cross compiler to compile this C file:
-
-    $ ${TARGET}-gcc test.c
-
-Running the program `file` on the resulting `a.out` will tell us that it has
-been properly compiled and linked for our target machine:
-
-    $ file a.out
-    a.out: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), dynamically linked, interpreter /lib/ld-musl-armhf.so.1, not stripped
-
-Of course, you won't be able to run the program on your build system. You also
-won't be able to run it on Raspbian or similar, because it has been linked
-against our cross compiled Musl.
-
-Statically linking it should solve the problem:
-
-    $ ${TARGET}-gcc -static test.c
-    $ file a.out
-    a.out: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), statically linked, with debug_info, not stripped
-    $ readelf -d a.out
-
-    There is no dynamic section in this file.
-
-This binary now does not require any libraries or interpreters and makes
-system calls directly. It should now run on your favourite Raspberry Pi
-distribution as-is.
diff --git a/kernel.md b/kernel.md
deleted file mode 100644
index 33a5793..0000000
--- a/kernel.md
+++ /dev/null
@@ -1,475 +0,0 @@
-# Building a Bootable Kernel and Initial RAM Filesystem
-
-This section outlines how to use the cross compiler toolchain you just built
-for cross-compiling a bootable kernel, and how to get the kernel to run on
-the Raspberry Pi.
-
-## The Linux Boot Process at a High Level
-
-When your system is powered on, it usually won't run the Linux kernel directly,
-not even on a very tiny embedded board that has the kernel baked into flash
-memory soldered directly next to the CPU. Instead, a chain of boot loaders will
-spring into action that do basic board bring-up and initialization. Part of this
-chain is typically comprised of proprietary blobs from the CPU or board vendor,
-who consider hardware initialization a mystical secret that must not be
-shared. Each part of the boot loader chain is typically very restricted in what
-it can do, hence the need to chain load a more complex loader after doing some
-hardware initialization.
-
-The chain of boot loaders typically starts with some mask ROM baked into the
-CPU and ends with something like [U-Boot](https://www.denx.de/wiki/U-Boot),
-[BareBox](https://www.barebox.org/), or in the case of an x86 system like your
-PC, [Syslinux](https://syslinux.org/) or (rarely outside of the PC world)
-[GNU GRUB](https://www.gnu.org/software/grub/).
-
-The final stage boot loader then takes care of loading the Linux kernel into
-memory and executing it. The boot loader typically generates some informational
-data structures in memory and passes a pointer to the kernel boot code. Besides
-system information (e.g. RAM layout), this typically also contains a command
-line for the kernel.
-
-On a very high level, after the boot loader jumps into the kernel, the kernel
-decompresses itself and does some internal initialization, initializes built-in
-hardware drivers and then attempts to mount the root filesystem. After mounting
-the root filesystem, the kernel creates the very first process with PID 1.
-
-At this point, bootstrapping is done as far as the kernel is concerned. The
-process with PID 1 usually spawns (i.e. `fork` + `exec`) and manages a bunch
-of daemon processes, some of them allowing users to log in and get a shell.
-
-### Initial RAM Filesystem
-
-For very simple setups, it can be sufficient to pass a command line option to
-the kernel that tells it what device to mount for the root filesystem. For more
-complex setups, Linux supports mounting an *initial RAM filesystem*.
-
-This basically means that in addition to the kernel, the boot loader loads
-a compressed archive into memory. Along with the kernel command line, the boot
-loader gives the kernel a pointer to the start of the archive in memory.
-
-The kernel then mounts an in-memory filesystem as root filesystem, unpacks the
-archive into it and runs the PID 1 process from there. Typically this is a
-script or program that then does a more complex mount setup, transitions to
-the actual root file system and does an `exec` to start the actual PID 1
-process. If it fails at some point, it usually drops you into a tiny rescue
-shell that is also packed into the archive.
-
-For historical reasons, Linux uses [cpio](https://en.wikipedia.org/wiki/Cpio)
-archives for the initial RAM filesystem.
-
-Systems typically use [BusyBox](https://busybox.net/) as a tiny shell
-interpreter. BusyBox is a collection of tiny command line programs that
-implement basic commands available on Unix-like systems, ranging from `echo`
-or `cat` all the way to a small `vi` and `sed` implementation, and including
-two different shell implementations to choose from.
-
-BusyBox gets compiled into a single, monolithic binary. For the utility
-programs, symlinks or hard links are created that point to the binary.
-BusyBox, when run, will determine what utility to execute from the path
-through which it has been started.
-
-**NOTE**: The initial RAM filesystem, or **initramfs**, should not be confused
-with the older concept of an initial RAM disk, or **initrd**. The initial RAM
-disk actually uses a disk image instead of an archive and the kernel internally
-emulates a block device that reads blocks from RAM. A regular filesystem driver
-is used to mount the RAM backed block device as root filesystem.
-
-### Device Tree
-
-On a typical x86 PC, your hardware devices are attached to the PCI bus and the
-kernel can easily scan it to find everything. The devices have nice IDs that
-the kernel can query and the drivers tell the kernel what IDs they can handle.
-
-On embedded machines running e.g. ARM based SoCs, the situation is a bit
-different. The various SoC vendors buy licenses for all the hardware "IP cores",
-slap them together and multiplex them onto the CPU core's memory bus. The
-hardware registers end up mapped to SoC specific memory locations and there is
-no real way to scan for possibly present hardware.
-
-In the past, Linux had something called "board files", which were SoC specific
-C files containing SoC & board specific initialization code, but this was
-considered too inflexible.
-
-Linux eventually adopted the concept of a device tree binary, which is
-basically a binary blob that hierarchically describes the hardware present on
-the system and how the kernel can interface with it.
-
-The boot loader loads the device tree into memory and tells the kernel where it
-is, just like it already does for the initial ramfs and command line.
-
-In theory, a kernel binary can now be started on a number of different boards
-with the same CPU architecture, without recompiling (assuming it has all the
-drivers). It just needs the correct device tree binary for the board.
-
-The device tree binary (dtb) itself is generated from a number of source
-files (dts) located in the kernel source tree under `arch/<architecture>/boot/dts`.
-They are compiled together with the kernel using a device tree compiler that
-is also part of the kernel source.
-
-On a side note, the device tree format originates from the BIOS equivalent
-of SPARC workstations. The format is now standardized through a specification
-provided by the Open Firmware project and Linux considers it part of its ABI,
-i.e. a newer kernel should *always* work with an older DTB file.
-
-## Overview
-
-In this section, we will cross compile BusyBox, build a small initial ramfs,
-cross compile the kernel and get all of this to run on the Raspberry Pi.
-
-Unless you have used the `download.sh` script from [the cross toolchain](crosscc.md),
-you will need to download and unpack the following:
-
-* [BusyBox](https://busybox.net/downloads/busybox-1.32.1.tar.bz2)
-* [Linux](https://github.com/raspberrypi/linux/archive/raspberrypi-kernel_1.20201201-1.tar.gz)
-
-You should still have the following environment variables set from building the
-cross toolchain:
-
-    BUILDROOT=$(pwd)
-    TCDIR="$BUILDROOT/toolchain"
-    SYSROOT="$BUILDROOT/sysroot"
-    TARGET="arm-linux-musleabihf"
-    HOST="x86_64-linux-gnu"
-    LINUX_ARCH="arm"
-    export PATH="$TCDIR/bin:$PATH"
-
-
-## Building BusyBox
-
-The BusyBox build system is basically the same as the Linux kernel build system
-that we already used for [building a cross toolchain](crosscc.md).
-
-Just like the kernel (which we haven't built yet), BusyBox has a
-configuration file that contains a list of key-value pairs for enabling and
-tuning features.
-
-I prepared a file `bbstatic.config` with the configuration that I used. I
-disabled a lot of stuff that we don't need inside an initramfs, but most
-importantly, I changed the following settings:
-
- - **CONFIG_INSTALL_NO_USR** set to yes, so BusyBox creates a flat hierarchy
-   when installing itself.
- - **CONFIG_STATIC** set to yes, so BusyBox is statically linked and we don't
-   need to pack any libraries or a loader into our initramfs.
-
-If you want to customize my configuration, copy it into a freshly extracted
-BusyBox tarball, rename it to `.config` and run the menuconfig target:
-
-    mv bbstatic.config .config
-    make menuconfig
-
-The `menuconfig` target builds and runs an ncurses based dialog that lets you
-browse and configure features.
-
-Alternatively, you can start from scratch by creating a default configuration:
-
-    make defconfig
-    make menuconfig
-
-To compile BusyBox, we'll first do the usual setup for the out-of-tree build:
-
-    srcdir="$BUILDROOT/src/busybox-1.32.1"
-    export KBUILD_OUTPUT="$BUILDROOT/build/bbstatic"
-
-    mkdir -p "$KBUILD_OUTPUT"
-    cd "$KBUILD_OUTPUT"
-
-At this point, you have to copy the BusyBox configuration into the build
-directory. Either use your own, or copy my `bbstatic.config` over and rename
-it to `.config`.
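-
-Assuming you kept `bbstatic.config` in `$BUILDROOT`, this boils down to:
-
-    cp "$BUILDROOT/bbstatic.config" "$KBUILD_OUTPUT/.config"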
-
-By running `make oldconfig`, we let the build system sanity check the config
-file and have it ask what to do if any option is missing.
-
-    make -C "$srcdir" CROSS_COMPILE="${TARGET}-" oldconfig
-
-We need to edit two settings in the config file: the path to the sysroot and
-the prefix for the cross compiler executables. This can be done easily with
-two lines of `sed`:
-
-    sed -i "$KBUILD_OUTPUT/.config" -e 's,^CONFIG_CROSS_COMPILE=.*,CONFIG_CROSS_COMPILE="'$TARGET'-",'
-    sed -i "$KBUILD_OUTPUT/.config" -e 's,^CONFIG_SYSROOT=.*,CONFIG_SYSROOT="'$SYSROOT'",'
-
-All that is left now is to compile BusyBox:
-
-    make -C "$srcdir" CROSS_COMPILE="${TARGET}-"
-
-Before returning to the build root directory, I installed the resulting binary
-to the sysroot directory as `bbstatic`:
-
-    mkdir -p "$SYSROOT/bin"
-    cp busybox "$SYSROOT/bin/bbstatic"
-    cd "$BUILDROOT"
-
-## Compiling the Kernel
-
-First, we do the same dance again for the kernel out of tree build:
-
-    srcdir="$BUILDROOT/src/linux-raspberrypi-kernel_1.20201201-1"
-    export KBUILD_OUTPUT="$BUILDROOT/build/linux"
-
-    mkdir -p "$KBUILD_OUTPUT"
-    cd "$KBUILD_OUTPUT"
-
-I provided a configuration file in `linux.config` which you can simply copy
-to `$KBUILD_OUTPUT/.config`.
-
-Or you can do the same as I did and start out by initializing a default
-configuration for the Raspberry Pi and customizing it:
-
-    make -C "$srcdir" ARCH="$LINUX_ARCH" bcm2709_defconfig
-    make -C "$srcdir" ARCH="$LINUX_ARCH" menuconfig
-
-I mainly changed **CONFIG_SQUASHFS** and **CONFIG_OVERLAY_FS**, turning them
-both from `<M>` to `<*>`, so they get built in instead of being built as
-modules.
-
-Hint: you can also search for things in the menu config by typing `/` and then
-browsing through the popup dialog. Pressing the number printed next to any
-entry brings you directly to the option. Be aware that names in the menu
-generally don't contain the **CONFIG_** prefix.
-
-Same as with BusyBox, we insert the cross compile prefix into the configuration
-file:
-
-    sed -i "$KBUILD_OUTPUT/.config" -e 's,^CONFIG_CROSS_COMPILE=.*,CONFIG_CROSS_COMPILE="'$TARGET'-",'
-
-And then finally build the kernel:
-
-    make -C "$srcdir" ARCH="$LINUX_ARCH" CROSS_COMPILE="${TARGET}-" oldconfig
-    make -C "$srcdir" ARCH="$LINUX_ARCH" CROSS_COMPILE="${TARGET}-" zImage dtbs modules
-
-The `oldconfig` target does the same as on BusyBox. More interesting are the
-three make targets in the second line. The `zImage` target is the compressed
-kernel binary, the `dtbs` target builds the device tree binaries and `modules`
-builds the loadable kernel modules (i.e. drivers). You really want to insert
-a `-j NUMBER_OF_JOBS` in the second line, or it may take a considerable amount
-of time.
-
-Also, you *really* want to specify an argument after `-j`, otherwise the kernel
-build system will spawn processes until kingdom come (i.e. until your system
-runs out of resources and the OOM killer steps in).
-
-Lastly, I installed all of it into the sysroot for convenience:
-
-    mkdir -p "$SYSROOT/boot"
-    cp arch/arm/boot/zImage "$SYSROOT/boot"
-    cp -r arch/arm/boot/dts "$SYSROOT/boot"
-
-    make -C "$srcdir" ARCH="$LINUX_ARCH" CROSS_COMPILE="${TARGET}-" INSTALL_MOD_PATH="$SYSROOT" modules_install
-    cd "$BUILDROOT"
-
-The `modules_install` target creates a directory hierarchy `sysroot/lib/modules`
-containing a sub directory for each kernel version with the kernel modules and
-dependency information.
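-
-If you are curious, you can peek at the result; the version string shown here
-is just an illustration and depends on the exact kernel sources used:
-
-    $ ls "$SYSROOT/lib/modules"
-    5.4.79-v7+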
-
-The kernel binary will be circa 6 MiB in size and produce another circa 55 MiB
-worth of modules, because the Raspberry Pi default configuration has all bells
-and whistles turned on. Feel free to adjust the kernel configuration and throw
-out everything you don't need.
-
-## Building an Initial RAM Filesystem
-
-First of all, although we do everything by hand here, we are going to create a
-build directory to keep everything neatly separated:
-
-    mkdir -p "$BUILDROOT/build/initramfs"
-    cd "$BUILDROOT/build/initramfs"
-
-Technically, the initramfs image is a simple cpio archive. However, there are
-some pitfalls here:
-
-* There are various versions of the cpio format, some binary, some text based.
-* The `cpio` command line tool is utterly horrible to use.
-* Technically, the POSIX standard considers it legacy. See the big fat warning
-  in the man page.
-
-So instead of the `cpio` tool, we are going to use a tool from the Linux kernel
-tree called `gen_init_cpio`:
-
-    gcc "$BUILDROOT/src/linux-raspberrypi-kernel_1.20201201-1/usr/gen_init_cpio.c" -o gen_init_cpio
-
-This tool allows us to create a cpio image from a very simple file listing and
-produces exactly the format that the kernel understands.
-
-Here is the simple file listing that I used:
-
-    cat > initramfs.files <<_EOF
-    dir boot 0755 0 0
-    dir dev 0755 0 0
-    dir lib 0755 0 0
-    dir bin 0755 0 0
-    dir sys 0755 0 0
-    dir proc 0755 0 0
-    dir newroot 0755 0 0
-    slink sbin bin 0777 0 0
-    nod dev/console 0600 0 0 c 5 1
-    file bin/busybox $SYSROOT/bin/bbstatic 0755 0 0
-    slink bin/sh /bin/busybox 0777 0 0
-    file init $BUILDROOT/build/initramfs/init 0755 0 0
-    _EOF
-
-In case you are wondering about the first and last line, this is called a
-[heredoc](https://en.wikipedia.org/wiki/Here_document) and can be copy/pasted
-into the shell as is.
-
-The format itself is actually pretty self explanatory. The `dir` lines are
-directories that we want in our archive, with the permission and ownership
-information after the name. The `slink` entry creates a symlink, namely
-redirecting `/sbin` to `/bin`.
-
-The `nod` entry creates a device file. In this case, a character
-device (hence `c`) with device number `5:1`. Just like how symlinks are special
-files that have a target string stored in them and get special treatment from
-the kernel, a device file is also just a special kind of file that has a device
-number stored in it. When a program opens a device file, the kernel maps the
-device number to a driver and redirects file I/O to that driver.
-
-The device number `5:1` refers to a special text console on which the kernel
-prints out messages during boot. BusyBox will use this as standard input/output
-for the shell.
-
-Next, we actually pack our statically linked BusyBox into the archive, but
-under the name `/bin/busybox`. We then create a symlink to it, called `bin/sh`.
-
-The last line packs a script called `init` (which we haven't written yet) into
-the archive as `/init`.
-
-The script called `/init` is what we later want the kernel to run as the PID 1
-process. For the moment, there is not much to do, and all we want is to get
-a shell when we power up our Raspberry Pi, so we start out with this stub
-script:
-
-    cat > init <<_EOF
-    #!/bin/sh
-
-    PATH=/bin
-
-    /bin/busybox --install
-    /bin/busybox mount -t proc none /proc
-    /bin/busybox mount -t sysfs none /sys
-    /bin/busybox mount -t devtmpfs none /dev
-
-    exec /bin/busybox sh
-    _EOF
-
-Running `busybox --install` will cause BusyBox to install tons of symlinks to
-itself in the `/bin` directory, one for each utility program. The next three
-lines run the `mount` utility of BusyBox to mount the following pseudo
-filesystems:
-
-* `proc`, the process information filesystem which maps processes and other
-  various kernel variables to a directory hierarchy. It is mounted to `/proc`.
-  See `man 5 proc` for more information.
-* `sysfs`, a more generic, cleaner variant than `proc` for exposing kernel
-  objects to user space as a filesystem hierarchy. It is mounted to `/sys`.
-  See `man 5 sysfs` for more information.
-* `devtmpfs` is a pseudo filesystem that takes care of managing device files
-  for us. We mount it over `/dev`.
-
-We can now finally put everything together into an XZ compressed archive:
-
-    ./gen_init_cpio initramfs.files | xz --check=crc32 > initramfs.xz
-    cp initramfs.xz "$SYSROOT/boot"
-    cd "$BUILDROOT"
-
-The option `--check=crc32` forces the `xz` utility to create CRC-32 checksums
-instead of using SHA-256. This is necessary because the kernel's built-in
-xz library cannot do SHA-256 and would otherwise refuse to unpack the image,
-meaning the system won't boot.
-
-
-## Putting everything on the Raspberry Pi and Booting it
-
-Remember how I mentioned earlier that the last step of our boot loader chain
-would involve something sane, like U-Boot or BareBox? Well, not on the
-Raspberry Pi.
-
-In addition to the already bizarro hardware, the Raspberry Pi has a lot of
-proprietary magic baked directly into the hardware. The boot process is
-controlled by the GPU, since the SoC is basically a GPU with an ARM CPU slapped
-onto it.
-
-The GPU loads a binary called `bootcode.bin` from the SD card, which contains a
-proprietary boot loader blob for the GPU. This in turn does some initialization
-and chain loads `start.elf` which contains a firmware blob for the GPU. The GPU
-is running an RTOS called [ThreadX OS](https://en.wikipedia.org/wiki/ThreadX)
-and somewhere around [>1M lines](https://www.raspberrypi.org/forums/viewtopic.php?t=53007#p406247)
-worth of firmware code.
-
-There are different versions of `start.elf`. The one called `start_x.elf`
-contains an additional driver for the camera interface, `start_db.elf` is a
-debug version and `start_cd.elf` is a version with a cut-down memory layout.
-
-The `start.elf` file uses an additional file called `fixup.dat` to configure
-the RAM partitioning between the GPU and the CPU.
-
-In the end, the GPU firmware loads and parses a file called `config.txt` from
-the SD card, which contains configuration parameters, and `cmdline.txt` which
-contains the kernel command line. After parsing the configuration, it finally
-loads the kernel, the initramfs and the device tree binaries, and runs the
-kernel.
-
-Depending on the configuration, the GPU firmware may patch the device tree
-in-memory before running the kernel.
-
-### Copying the Files Over
-
-First, we need a micro SD card with a FAT32 partition on it. How to create the
-partition is left as an exercise to the reader.
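-
-If you want a rough starting point anyway, something along these lines should
-work (assuming the card shows up as `/dev/sdX`; double check the device name,
-the commands below are destructive):
-
-    sudo parted --script /dev/sdX mklabel msdos mkpart primary fat32 1MiB 100%
-    sudo mkfs.vfat -F 32 /dev/sdX1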
-
-Onto this partition, we copy the proprietary boot loader blobs:
-
-* [bootcode.bin](firmware/bootcode.bin)
-* [fixup.dat](firmware/fixup.dat)
-* [start.elf](firmware/start.elf)
-
-We create a minimal [config.txt](firmware/config.txt) in the root directory:
-
-    dtparam=
-    kernel=zImage
-    initramfs initramfs.xz followkernel
-
-The first line makes sure the boot loader doesn't mangle the device tree. The
-second one specifies the kernel binary that should be loaded and the last one
-specifies the initramfs image. Note that there is no `=` sign in the last
-line. This field has a different format and the boot loader will ignore it if
-there is an `=` sign. The `followkernel` attribute tells the boot loader to put
-the initramfs into memory right after the kernel binary.
-
-Then, we'll put the [cmdline.txt](firmware/cmdline.txt) onto the SD card:
-
-    console=tty0
-
-The `console` parameter tells the kernel the tty where it prints its boot
-messages and that it uses as the standard input/output tty for our init script.
-We tell it to use the first video console, which is what we will get at the
-HDMI output of the Raspberry Pi.
-
-What's left are the device tree binaries, and lastly the kernel and initramfs:
-
-    mkdir -p overlays
-    cp $SYSROOT/boot/dts/*-rpi-3-*.dtb .
-    cp $SYSROOT/boot/dts/overlays/*.dtbo overlays/
-
-    cp $SYSROOT/boot/initramfs.xz .
-    cp $SYSROOT/boot/zImage .
-
-Once you are done, unmount the micro SD card and plug it into your Raspberry
-Pi.
-
-
-### Booting It Up
-
-If you connect the HDMI port and power up the Raspberry Pi, it should boot
-directly into the initramfs and you should get a BusyBox shell.
-
-The PATH is properly set and the most common shell commands should be there,
-so you can poke around the root filesystem, which is in memory and has been
-unpacked from the `initramfs.xz`.
-
-Don't be alarmed if the kernel messages suddenly stop. Even after the
-BusyBox shell starts, the kernel continues spewing messages for a short while
-and you may not see the shell prompt. Just hit the enter key a couple of times.
-
-Also, the shell itself is running as PID 1. If you exit it, the kernel panics
-because PID 1 just died.
diff --git a/setup.md b/setup.md
deleted file mode 100644
index 465c1d7..0000000
--- a/setup.md
+++ /dev/null
@@ -1,72 +0,0 @@
-# Prerequisites and Directory Setup
-
-This section deals with the packages we need on our system to cross bootstrap
-our mini distro, as well as the basic directory setup before we get started.
-
-## Prerequisites
-
-For compiling the packages you will need:
-
-* gcc
-* g++
-* make
-* flex
-* bison
-* gperf
-* makeinfo
-* ncurses (with headers)
-* awk
-* automake
-* help2man
-* curl
-* pkg-config
-* libtool
-* openssl (with headers)
-
-
-In case you wonder: even if you don't build any C++ package, you need the C++
-compiler to build GCC. The GCC code base mainly uses C99, but with some
-additional C++ features. `makeinfo` is used by the GNU utilities that generate
-info pages from texinfo. ncurses is mainly needed by the kernel build system
-for `menuconfig`. OpenSSL is also required to compile the kernel later on.
-
-The list should be fairly complete, but I can't guarantee that I didn't miss
-something. Normally I work on systems with tons of development tools and
-libraries already installed, so if something is missing, please install it
-and maybe let me know.
-
-## Directory Setup
-
-First of all, you should create an empty directory somewhere where you want
-to build the cross toolchain and later the entire system.
- -For convenience, we will store the absolute path to this directory inside a -shell variable called **BUILDROOT** and create a few directories to organize -our stuff in: - - BUILDROOT=$(pwd) - - mkdir -p "build" "src" "download" "toolchain/bin" "sysroot" - -I stored the downloaded packages in the **download** directory and extracted -them to a directory called **src**. - -We will later build packages outside the source tree (GCC even requires that -nowadays), inside a sub directory of **build**. - -Our final toolchain will end up in a directory called **toolchain**. - -We store the toolchain location inside another shell variable that I called -**TCDIR** and prepend the executable path of our toolchain to the **PATH** -variable: - - TCDIR="$BUILDROOT/toolchain" - export PATH="$TCDIR/bin:$PATH" - - -The **sysroot** directory will hold the cross compiled binaries for our target -system, as well as headers and libraries used for cross compiling stuff. It is -basically the `/` directory of the system we are going to build. For -convenience, we will also store its absolute path in a shell variable: - - SYSROOT="$BUILDROOT/sysroot" -- cgit v1.2.3