How the kernel is compiled
Process of the Linux kernel building
Introduction
I won't tell you how to build and install a custom Linux kernel on your machine. If you need help with this, you can find many resources that will help you do it. Instead, we will learn what occurs when you execute make in the root directory of the Linux kernel source code.
When I started to study the source code of the Linux kernel, the makefile was the first file that I opened. And it was scary :). The makefile contained 1591 lines of code when I wrote this part and the kernel was the 4.2.0-rc3 release.
This makefile is the top makefile in the Linux kernel source code and the kernel build starts here. Yes, it is big, and on top of that, if you've read the source code of the Linux kernel you may have noticed that every directory containing source code has its own makefile. Of course it is not possible to describe how each source file is compiled and linked, so we will only study the standard compilation case. You will not find here the building of the kernel's documentation, cleaning of the kernel source code, tags generation, cross-compilation related stuff, etc. We will start from the execution of make with the standard kernel configuration file and will finish with the building of the bzImage.
It would be better if you're already familiar with the make util, but I will try to describe every piece of code in this part anyway.
So let's start.
Preparation before the kernel compilation
There are many things to prepare before the kernel compilation can start. The main points here are to find and configure the type of compilation, to parse the command line arguments that are passed to make, etc. So let's dive into the top Makefile of the Linux kernel.
The top Makefile of the Linux kernel is responsible for building two major products: vmlinux (the resident kernel image) and the modules (any module files). The Makefile of the Linux kernel starts with the definition of the following variables:
These variables determine the current version of the Linux kernel and are used in different places, for example in forming the KERNELVERSION variable in the same Makefile:
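To make the forming of KERNELVERSION a bit more concrete, here is a minimal Python sketch of how I understand the version string is put together. The field names VERSION, PATCHLEVEL, SUBLEVEL and EXTRAVERSION are the ones I recall from the top Makefile, so treat them (and the exact joining rule) as my assumption rather than a copy of the real definition.

```python
def kernel_version(version, patchlevel="", sublevel="", extraversion=""):
    """Join the version fields into one string (assumed joining rule)."""
    result = version
    if patchlevel:
        result += "." + patchlevel
        # SUBLEVEL is only appended when PATCHLEVEL is present.
        if sublevel:
            result += "." + sublevel
    # EXTRAVERSION already carries its own separator, e.g. "-rc3".
    return result + extraversion


print(kernel_version("4", "2", "0", "-rc3"))  # prints 4.2.0-rc3
```

All fields are kept as strings on purpose: make variables are strings too, so a SUBLEVEL of "0" still counts as set.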
After this we can see a couple of ifeq conditions that check some of the parameters passed to make. The Linux kernel makefiles provide a special make help target that prints all available targets and some of the command line arguments that can be passed to make. For example, make V=1 gives a verbose build. The first ifeq checks whether the V=n option is passed to make:
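The effect of this check, and of the quiet and Q variables that depend on it, can be modeled roughly like this. It is only a Python sketch of the logic, not the makefile code itself, and the concrete value quiet_ for the non-verbose case is my recollection of the kbuild convention, so take it as an assumption.

```python
def verbosity(cmdline_vars):
    """Model how V on the command line drives KBUILD_VERBOSE, quiet and Q."""
    # KBUILD_VERBOSE takes the value of V when it was passed, otherwise 0.
    kbuild_verbose = cmdline_vars.get("V", "0")
    if kbuild_verbose == "1":
        # Verbose build: commands are echoed in full, nothing is hidden.
        quiet, q = "", ""
    else:
        # Quiet build: "@" in front of a command stops make from echoing it.
        quiet, q = "quiet_", "@"
    return {"KBUILD_VERBOSE": kbuild_verbose, "quiet": quiet, "Q": q}


print(verbosity({}))
print(verbosity({"V": "1"}))
```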
If this option is passed to make, we set the KBUILD_VERBOSE variable to the value of the V option. Otherwise we set KBUILD_VERBOSE to zero. After this we check the value of KBUILD_VERBOSE and set the values of the quiet and Q variables depending on it. The @ symbol stops make from echoing the command it runs, so when it is present before a command the output is a short line such as CC scripts/mod/empty.o instead of the full command line. In the end we just export all of these variables.
The next ifeq statement checks whether the O=/dir option was passed to make. This option places all output files in the given dir:
We check KBUILD_SRC, which represents the top directory of the kernel source code, to see whether it is empty (it is empty when the makefile is executed for the first time). We then set the KBUILD_OUTPUT variable to the value passed with the O option (if this option was passed). In the next step we check this KBUILD_OUTPUT variable and, if it is set, we do the following things:
Store the value of KBUILD_OUTPUT in the temporary saved-output variable;
Try to create the given output directory;
Check that the directory was created, and print an error message otherwise;
If the custom output directory was created successfully, execute make again with the new directory (see the -C option).
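The four steps above can be sketched in Python as follows. The function name prepare_output_dir, the /path/to/linux source directory and the exact arguments of the second make call (KBUILD_SRC=... and -f .../Makefile) are my assumptions for illustration; only the -C option comes from the text.

```python
import os
import tempfile


def prepare_output_dir(o_value, src_dir, goals=()):
    """Create the O= directory and return the command for the second make run."""
    saved_output = o_value  # remember what the user typed for the error message
    try:
        os.makedirs(o_value, exist_ok=True)
    except OSError:
        pass  # the check below reports the failure
    if not os.path.isdir(o_value):
        raise SystemExit(f'output directory "{saved_output}" does not exist')
    out_dir = os.path.realpath(o_value)
    # Run make again, this time from inside the output directory (-C).
    return ["make", "-C", out_dir, f"KBUILD_SRC={src_dir}",
            "-f", os.path.join(src_dir, "Makefile"), *goals]


with tempfile.TemporaryDirectory() as tmp:
    command = prepare_output_dir(os.path.join(tmp, "build"),
                                 "/path/to/linux", ["vmlinux"])
    print(command)
```

The command is only returned here, not executed, so the sketch is safe to run anywhere.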
The next ifeq statements check whether the C or M options were passed to make:
The C option tells the makefile that we need to check all C source code with the tool provided by the $CHECK environment variable; by default it is sparse. The second option, M, builds external modules (we will not see this case in this part). We also check whether the KBUILD_SRC variable is set, and if it isn't, we set the srctree variable to .:
That tells the Makefile that the kernel source tree will be in the current directory where make was executed. We then set objtree and other variables to this directory and export them. The next step is to get the value for the SUBARCH variable that represents what the underlying architecture is:
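As a rough illustration of this step, here is a Python sketch that takes the machine name (what uname -m would print) and normalizes it with a few substitution rules. The rule list is a small subset written from my memory of the makefile, so consider the patterns an assumption, not the complete set.

```python
import platform
import re

# (pattern, replacement) pairs, applied in order like a chain of sed -e rules.
# Subset only; the real list covers more architectures.
SUBARCH_RULES = [
    (r"i.86", "x86"),
    (r"x86_64", "x86"),
    (r"arm.*", "arm"),
    (r"aarch64.*", "arm64"),
    (r"ppc.*", "powerpc"),
]


def subarch(machine):
    """Map a machine name to the name of its kernel architecture directory."""
    for pattern, replacement in SUBARCH_RULES:
        # count=1 mirrors sed's s/// without the g flag.
        machine = re.sub(pattern, replacement, machine, count=1)
    return machine


print(subarch(platform.machine()))
```

On a 64-bit x86 Linux box the machine name is x86_64, which the second rule turns into x86.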
As you can see, it executes the uname util, which prints information about the machine, operating system and architecture. It takes the output of uname, parses it and assigns the result to the SUBARCH variable. Now that we have SUBARCH, we set the SRCARCH variable, which provides the directory of the particular architecture, and hdr-arch, which provides the directory for the header files:
Note that ARCH takes the value of SUBARCH unless it is set explicitly. In the next step we set the KCONFIG_CONFIG variable, which represents the path to the kernel configuration file; if it was not set before, it is set to .config by default:
and the shell that will be used during kernel compilation:
The next set of variables is related to the compilers used during Linux kernel compilation. We set the host compilers for C and C++ and the flags to be used with them:
Next we get to the CC variable, which represents a compiler too, so why do we need the HOST* variables? CC is the target compiler that will be used during kernel compilation, but HOSTCC will be used during compilation of the set of host programs (we will see it soon). After this we can see the definition of the KBUILD_MODULES and KBUILD_BUILTIN variables that are used to determine what to compile (modules, kernel, or both):
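The decision can be modeled with a short Python sketch. The set of goal names (all, _all, modules) is what I remember from the makefile, so treat it as an assumption; the function what_to_build is purely illustrative.

```python
def what_to_build(goals, config_modversions=False):
    """Decide whether built-in code, modules, or both get compiled."""
    build_modules = False
    build_builtin = True
    # "make modules" alone skips built-in code unless CONFIG_MODVERSIONS is set.
    if list(goals) == ["modules"]:
        build_builtin = bool(config_modversions)
    # No goal at all, or one of the "everything" goals, also builds modules.
    if not goals or any(g in ("all", "_all", "modules") for g in goals):
        build_modules = True
    return {"KBUILD_BUILTIN": build_builtin, "KBUILD_MODULES": build_modules}


print(what_to_build([]))
print(what_to_build(["modules"]))
```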
Here we can see the definition of these variables; the value of the KBUILD_BUILTIN variable depends on the CONFIG_MODVERSIONS kernel configuration parameter if we pass only modules to make. The next step is to include the kbuild file.
The Kbuild or Kernel Build System is a special infrastructure to manage building the kernel and its modules. kbuild files have the same syntax as makefiles. The scripts/Kbuild.include file provides some generic definitions for the kbuild system. After including this kbuild file (back in the makefile) we can see the definitions of the variables that are related to the different tools used during kernel and module compilation (like the linker, compilers, utils from binutils, etc.):
We then define two other variables, USERINCLUDE and LINUXINCLUDE, which specify paths to header file directories (public for users in the first case and for the kernel in the second case):
And the standard flags for the C compiler:
These are not the final compilation flags, as they can be updated in other makefiles (for example kbuilds from arch/). After all of this, all variables will be exported to be available in the other makefiles. The RCS_FIND_IGNORE and RCS_TAR_IGNORE variables contain patterns for version control system files and directories that should be ignored:
With that, we have finished all preparations. The next step is building the vmlinux target.
Directly to the kernel build
We have now finished all the preparations, and the next step in the main makefile is related to the kernel build. Before this moment, nothing has been printed to the terminal by make. But now the first steps of the compilation are started. We need to go to line 598 of the Linux kernel top makefile and we will find the vmlinux target there:
Don't worry that we have skipped many lines in the Makefile that are between export RCS_FIND_IGNORE..... and all: vmlinux...... This part of the makefile is responsible for the make *.config targets and, as I wrote in the beginning of this part, we will only look at building the kernel in a general way.
The all: target is the default when no target is given on the command line. You can see that we include the architecture specific makefile there (in our case it will be arch/x86/Makefile). From this moment we will continue from this makefile. As we can see, the all target depends on the vmlinux target that is defined a little lower in the top makefile:
The vmlinux is the Linux kernel in a statically linked executable file format. The scripts/link-vmlinux.sh script links and combines the different compiled subsystems into vmlinux. The second target is vmlinux-deps, which is defined as:
It consists of the set of built-in.o files from each top-level directory of the Linux kernel. Later, when we go through all the directories in the Linux kernel, Kbuild will compile all the $(obj-y) files. It then calls $(LD) -r to merge these files into one built-in.o file. At this moment we have no vmlinux-deps, so the vmlinux target will not be executed yet. For me vmlinux-deps contains the following files:
The next target that can be executed is the following:
As we can see, vmlinux-dirs depends on two targets: prepare and scripts. prepare is defined in the top Makefile of the Linux kernel and executes three stages of preparations:
The first, prepare0, expands to archprepare, which expands to archheaders and archscripts, which are defined in the x86_64 specific Makefile. Let's look at it. The x86_64 specific makefile starts with the definition of the variables that are related to the architecture-specific configs (defconfig, etc.). After this it defines flags for compiling the 16-bit code, calculates the BITS variable that can be 32 for i386 or 64 for x86_64, and defines flags for the assembly source code, flags for the linker and many more (you can find all the definitions in arch/x86/Makefile). The first target in the makefile is archheaders, and it generates the syscall table:
The second target in this makefile is archscripts:
We can see that it depends on the scripts_basic target from the top Makefile. First we can see the scripts_basic target, which executes make for the scripts/basic makefile:
The scripts/basic/Makefile contains targets for the compilation of two host programs: fixdep and bin2c:
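The conditional part of such a kbuild file relies on a naming idiom: the value of a configuration option becomes the suffix of the variable name, so a program only lands in the -y list when its option is enabled. Here is a hedged Python sketch of that idiom; the function collect_hostprogs and the dictionary-based config are illustrative assumptions, not kbuild code.

```python
from collections import defaultdict


def collect_hostprogs(config):
    """Model the 'hostprogs-$(CONFIG_...)' idiom with a dict keyed by suffix."""
    hostprogs = defaultdict(list)
    # Unconditional entry: always goes into the "y" list.
    hostprogs["y"].append("fixdep")
    # Conditional entry: the option value ("y" or empty) picks the list.
    suffix = config.get("CONFIG_BUILD_BIN2C", "")
    hostprogs[suffix].append("bin2c")
    # Only the "y" list is actually built.
    return hostprogs["y"]


print(collect_hostprogs({"CONFIG_BUILD_BIN2C": "y"}))  # ['fixdep', 'bin2c']
print(collect_hostprogs({}))                           # ['fixdep']
```

When the option is unset, bin2c ends up in a list with an empty suffix that nobody reads, which is how it silently drops out of the build.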
The first program is fixdep, which optimizes the list of dependencies generated by gcc that tells make when to remake a source code file. The second program is bin2c, which depends on the value of the CONFIG_BUILD_BIN2C kernel configuration option and is a very small C program that converts a binary on stdin to a C include on stdout. You can note a strange notation here: hostprogs-y, etc. This notation is used in all kbuild files and you can read more about it in the documentation. In our case hostprogs-y tells kbuild that there is one host program named fixdep that will be built from fixdep.c, which is located in the same directory as the Makefile. The first output after we execute make in our terminal will be the result of this kbuild file:
Once the scripts_basic target has been executed, the archscripts target will execute make for the arch/x86/tools makefile with the relocs target:
relocs_32.c and relocs_64.c, which deal with relocation information, will be compiled, and we will see it in the make output:
After relocs.c is compiled, version.h is checked:
We can see it in the output:
and the building of the generic assembly headers in arch/x86/include/generated/asm with the asm-generic target, which is defined in the top Makefile of the Linux kernel. After the asm-generic target, archprepare is done, so the prepare0 target will be executed. As I wrote above:
Note the build variable. It is defined in scripts/Kbuild.include and looks like this:
Or in our case it is the current source directory, .:
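To show what such an invocation boils down to, here is a small Python model. The helper names build_args and kbuild_file are mine, and both the shape of the arguments (-f .../scripts/Makefile.build obj=DIR) and the rule that a Kbuild file is preferred over a plain Makefile are written from my understanding of kbuild, so treat them as assumptions.

```python
import os


def build_args(srctree, obj):
    """Arguments that make receives when it descends into directory obj."""
    return ["-f", os.path.join(srctree, "scripts", "Makefile.build"),
            f"obj={obj}"]


def kbuild_file(srctree, src):
    """Pick the file to include for a directory: Kbuild if present, else Makefile."""
    kbuild_dir = src if os.path.isabs(src) else os.path.join(srctree, src)
    candidate = os.path.join(kbuild_dir, "Kbuild")
    if os.path.exists(candidate):
        return candidate
    return os.path.join(kbuild_dir, "Makefile")


# For the current source directory the sub-make call looks like this:
print(["make", *build_args(".", ".")])
```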
The scripts/Makefile.build tries to find the Kbuild file in the directory given via the obj parameter, includes this Kbuild file:
and builds targets from it. In our case . contains the Kbuild file that generates kernel/bounds.s and arch/x86/kernel/asm-offsets.s. After this the prepare target has finished its work. The vmlinux-dirs target also depends on a second target, scripts, which compiles the following programs: file2alias, mk_elfconfig, modpost, etc. After the compilation of scripts/host-programs our vmlinux-dirs target can be executed. First of all let's try to understand what vmlinux-dirs contains. In my case it contains the paths of the following kernel directories:
We can find the definition of vmlinux-dirs in the top Makefile of the Linux kernel:
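The definition relies on two make text functions, filter and patsubst. Their combined effect can be modeled in Python like this; the sample directory list is made up for the example, and the helpers only handle patterns with a single % wildcard.

```python
def _split(pattern):
    """Split a make pattern such as '%/' into the text before and after '%'."""
    prefix, _, suffix = pattern.partition("%")
    return prefix, suffix


def _matches(pattern, word):
    prefix, suffix = _split(pattern)
    return (len(word) >= len(prefix) + len(suffix)
            and word.startswith(prefix) and word.endswith(suffix))


def make_filter(pattern, words):
    """Keep only the words that match the pattern, like $(filter ...)."""
    return [w for w in words if _matches(pattern, w)]


def patsubst(pattern, replacement, words):
    """Rewrite matching words, like $(patsubst ...); others pass through."""
    prefix, suffix = _split(pattern)
    result = []
    for w in words:
        if _matches(pattern, w):
            stem = w[len(prefix):len(w) - len(suffix)]
            w = replacement.replace("%", stem, 1)
        result.append(w)
    return result


def vmlinux_dirs(entries):
    # Keep entries that name a directory (end in "/"), then drop that slash.
    return patsubst("%/", "%", make_filter("%/", entries))


print(vmlinux_dirs(["init/", "usr/", "arch/x86/", "kernel/", "lib/lib.a"]))
```

Entries that are not directories, such as the hypothetical lib/lib.a above, are dropped by the filter step before the slash is stripped.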
Here we remove the / symbol from each directory with the help of the patsubst and filter functions and put the result into vmlinux-dirs. So we have a list of directories in vmlinux-dirs and the following code:
$@ stands for the current target, here each directory from vmlinux-dirs in turn, which means that make will go recursively over all directories from vmlinux-dirs and their internal directories (depending on configuration) and will execute make in there. We can see it in the output:
Source code in each directory will be compiled and linked into built-in.o:
Ok, all the built-in.o files are built, so now we can get back to the vmlinux target. As you remember, the vmlinux target is in the top Makefile of the Linux kernel. Before the linking of vmlinux it builds samples, Documentation, etc., but I will not describe it here, as I wrote in the beginning of this part.
As you can see, its main purpose is to call the scripts/link-vmlinux.sh script, which links all the built-in.o files into one statically linked executable and creates System.map. In the end we will see the following output:
and vmlinux and System.map in the root of the Linux kernel source tree:
That's all, vmlinux is ready. The next step is the creation of the bzImage.
Building bzImage
The bzImage file is the compressed Linux kernel image. We can get it by executing make bzImage after vmlinux is built. Alternatively, we can just execute make without any argument and we will get bzImage anyway because it is the default image:
in arch/x86/Makefile. Let's look at this target; it will help us to understand how this image is built. As I already said, the bzImage target is defined in arch/x86/Makefile and looks like this:
We can see here that first of all make is called for the boot directory, in our case it is:
The main goal now is to build the source code in the arch/x86/boot and arch/x86/boot/compressed directories, build setup.bin and vmlinux.bin, and build the bzImage from them in the end. The first target in arch/x86/boot/Makefile is $(obj)/setup.elf:
We already have the setup.ld linker script in the arch/x86/boot directory and the SETUP_OBJS variable, which expands to the object files built from the source files in the boot directory. We can see the first output:
The next source file is arch/x86/boot/header.S, but we can't build it now because this target depends on the following two header files:
The first is voffset.h, generated by a sed script that gets two addresses from vmlinux with the nm util:
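What the sed script does can be modeled in a few lines of Python: scan the symbol listing, keep the two symbols of interest and turn each into a #define. The symbol names _text and _end and the VO_ prefix are what I remember from the kernel sources, and the addresses in the sample are made up, so please treat all of these as assumptions.

```python
import re

# One line of nm output: address, one-letter symbol type, symbol name.
SYMBOL_LINE = re.compile(r"^([0-9a-fA-F]+) [ABCDGRSTVW] (_text|_end)$")


def voffset_header(nm_output):
    """Turn the _text and _end lines of an nm listing into #define lines."""
    defines = []
    for line in nm_output.splitlines():
        match = SYMBOL_LINE.match(line)
        if match:
            address, name = match.groups()
            defines.append(f"#define VO_{name} 0x{address}")
    return "\n".join(defines)


# Made-up listing in the format nm prints.
sample = """\
ffffffff81000000 T _text
ffffffff81000110 T startup_64
ffffffff81b9e000 B _end
"""
print(voffset_header(sample))
```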
They are the start and the end of the kernel. The second is zoffset.h, which depends on the vmlinux target from arch/x86/boot/compressed/Makefile:
The $(obj)/compressed/vmlinux target depends on vmlinux-objs-y, which compiles the source code files from the arch/x86/boot/compressed directory, generates vmlinux.bin and vmlinux.bin.bz2, and compiles the mkpiggy program. We can see this in the output:
Here vmlinux.bin is the vmlinux file with debugging information and comments stripped, and vmlinux.bin.bz2 is the compressed vmlinux.bin.all followed by the u32 size of vmlinux.bin.all. vmlinux.bin.all is vmlinux.bin + vmlinux.relocs, where vmlinux.relocs is the vmlinux that was handled by the relocs program (see above). Once we have these files, the piggy.S assembly file will be generated with the mkpiggy program and compiled:
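To make the layout of vmlinux.bin.bz2 described above more tangible, here is a Python sketch that compresses a payload and appends its uncompressed size as a u32. I assume little-endian byte order for the size field, and the payload is just a stand-in, not real kernel data.

```python
import bz2
import struct


def compress_with_size(payload):
    """Return bzip2(payload) followed by the 32-bit size of payload."""
    # "<I" = little-endian unsigned 32-bit integer (assumed byte order).
    return bz2.compress(payload) + struct.pack("<I", len(payload))


def split_size(blob):
    """Undo the layout: separate the compressed data from the trailing size."""
    (size,) = struct.unpack("<I", blob[-4:])
    return blob[:-4], size


image = b"\x90" * 4096  # stand-in for vmlinux.bin.all
blob = compress_with_size(image)
compressed, size = split_size(blob)
print(size, len(compressed))
```

Storing the size at the end lets the decompressor learn how much memory the unpacked kernel needs before it starts unpacking.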
This assembly file will contain the computed offset of the compressed kernel. After this we can see that zoffset.h is generated:
Now that zoffset.h and voffset.h are generated, the compilation of the source code files from arch/x86/boot can continue:
Once all the source code files are compiled, they will be linked into setup.elf:
or:
The last two things are the creation of setup.bin, which will contain the compiled code from the arch/x86/boot/* directory:
and the creation of vmlinux.bin from vmlinux:
In the end we compile the host program arch/x86/boot/tools/build.c, which will create our bzImage from setup.bin and vmlinux.bin:
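As a very rough model of what this step produces, the sketch below joins two byte strings the way I understand the image is laid out: the setup part padded to a whole number of 512-byte sectors, followed by the kernel part. The padding rule is my recollection, and the real tool does more than this (as far as I know it also fills in header fields and adds a checksum), so take it as a simplification under those assumptions.

```python
SECTOR_SIZE = 512  # assumed sector size used for padding the setup part


def concatenate_image(setup, kernel):
    """Pad setup to a sector boundary and append the kernel bytes."""
    padding = (-len(setup)) % SECTOR_SIZE
    return setup + b"\0" * padding + kernel


# Stand-in data instead of the real setup.bin and vmlinux.bin.
image = concatenate_image(b"S" * 700, b"K" * 10)
print(len(image))  # prints 1034
```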
Actually the bzImage is the concatenation of setup.bin and vmlinux.bin. In the end we will see the output that is familiar to everyone who has built the Linux kernel from source:
That's all.
Conclusion
This is the end of this part, and here we saw all the steps from the execution of the make command to the generation of the bzImage. I know, the Linux kernel makefiles and the process of building the Linux kernel may seem confusing at first glance, but it is not so hard. I hope this part will help you understand the process of building the Linux kernel.