start_kernel function. This function initializes all the kernel features (including architecture-dependent features) before the kernel runs the first
init process. You may remember that we built early page tables, identity page tables, and fixmap page tables at boot time. No complicated memory management is working yet. When the
start_kernel function is called, we will see the transition to more complex data structures and techniques for memory management. For a good understanding of the initialization process in the Linux kernel, we need a clear understanding of these techniques. This chapter provides an overview of the different parts of the Linux kernel memory management framework and its API, starting from the
Logical Memory Block, but with the patch by Yinghai Lu, it was renamed to the
memblock. Since the Linux kernel for the
x86_64 architecture uses this method, we already met
memblock in the Last preparations before the kernel entry point part. Now it's time to get better acquainted with it and see how it is implemented.
bottom_up, which allows allocating memory in bottom-up mode when it is
true. The next field is
current_limit. This field describes the size limit of the memory block. The next three fields describe the type of the memory block. It can be: reserved, memory, and physical memory (physical memory is available if the
CONFIG_HAVE_MEMBLOCK_PHYS_MAP configuration option is enabled). Now we see yet another data structure -
memblock_type. Let's look at its definition:
memblock_region is a structure which describes a memory region. Its definition is:
memblock_region provides the base address and size of the memory region as well as a flags field which can have the following values:
memblock_region are the main ones in the
Memblock. Now we know about them and can look at the Memblock initialization process.
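To make the relationship between the three structures easier to picture, here is a simplified user-space sketch. The field names follow the description above, but the types (`uint64_t` standing in for the kernel's physical address type), the omitted fields (such as the physical memory type and the node id), and the helper `memblock_type_total` are assumptions made for illustration. It is not a copy of the kernel's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* One contiguous region: [base, base + size). */
struct memblock_region {
    uint64_t base;
    uint64_t size;
    unsigned long flags;
};

/* An array of regions of one kind (for example memory or reserved). */
struct memblock_type {
    unsigned long cnt;        /* number of regions in use */
    unsigned long max;        /* capacity of the regions array */
    uint64_t total_size;      /* sum of the sizes of all regions */
    struct memblock_region *regions;
};

/* The top-level structure that ties everything together. */
struct memblock {
    bool bottom_up;           /* allocate bottom-up when true */
    uint64_t current_limit;   /* upper limit for allocations */
    struct memblock_type memory;
    struct memblock_type reserved;
};

/* Hypothetical helper: recompute the total size of one type. */
static uint64_t memblock_type_total(const struct memblock_type *type)
{
    uint64_t total = 0;

    for (unsigned long i = 0; i < type->cnt; i++)
        total += type->regions[i].size;
    return total;
}
```

A `memblock` owns several `memblock_type` instances, and each of those owns an array of `memblock_region` entries. That nesting is all the rest of the API works with.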
memblock structure, which has the same name as the structure -
memblock. First of all, note the
__initdata_memblock. The definition of this macro looks like:
CONFIG_ARCH_DISCARD_MEMBLOCK. If this configuration option is enabled, memblock code will be put into the
.init section and will be released after the kernel is booted up.
memblock_type physmem fields of the
memblock structure. Here we are interested only in the
memblock_type.regions initialization process. Note that every
memblock_type field is initialized by an array of
__initdata_memblock macro which we already saw in the
memblock structure initialization (read above if you've forgotten).
bottom_up allocation is disabled and the limit of the current Memblock is:
memblock structure has been finished and we can have a look at the Memblock API.
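The static initialization described above can be sketched in user space with designated initializers. The names `INIT_REGIONS`, `ALLOC_ANYWHERE`, and `memblock_sketch`, the capacity of 128, and the use of an all-ones value as the "no limit" marker are assumptions chosen for this illustration. The kernel's own macros and section annotations are left out.

```c
#include <stdbool.h>
#include <stdint.h>

#define INIT_REGIONS   128              /* assumed capacity of the static arrays */
#define ALLOC_ANYWHERE (~(uint64_t)0)   /* assumed "no limit" marker */

struct memblock_region {
    uint64_t base, size;
    unsigned long flags;
};

struct memblock_type {
    unsigned long cnt, max;
    uint64_t total_size;
    struct memblock_region *regions;
};

struct memblock {
    bool bottom_up;
    uint64_t current_limit;
    struct memblock_type memory, reserved;
};

/* Backing arrays live in static storage, so no allocator is needed. */
static struct memblock_region memory_init_regions[INIT_REGIONS];
static struct memblock_region reserved_init_regions[INIT_REGIONS];

static struct memblock memblock_sketch = {
    .memory.regions   = memory_init_regions,
    .memory.cnt       = 1,              /* one empty placeholder entry */
    .memory.max       = INIT_REGIONS,

    .reserved.regions = reserved_init_regions,
    .reserved.cnt     = 1,              /* one empty placeholder entry */
    .reserved.max     = INIT_REGIONS,

    .bottom_up        = false,          /* bottom-up allocation disabled */
    .current_limit    = ALLOC_ANYWHERE,
};
```

Because everything is initialized at compile time, the structure is usable before any dynamic memory management exists, which is the point of an early-boot allocator.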
memblock structure and now we can look at the Memblock API and its implementation. As I said above, the implementation of
memblock is located entirely in mm/memblock.c. To understand how
memblock works and how it is implemented, let's look at its usage first. There are a couple of places in the Linux kernel where memblock is used. For example, let's take the
memblock_x86_fill function from arch/x86/kernel/e820.c. This function goes through the memory map provided by the e820 and adds memory regions reserved by the kernel to the
memblock_add function. Since we have met the
memblock_add function first, let's start with it.
memblock_add function does not do anything special in its body, but just calls the:
memory, the physical base address and the size of the memory region, the maximum number of nodes, which is 1 if
CONFIG_NODES_SHIFT is not set in the configuration file or
1 << CONFIG_NODES_SHIFT if it is set, and the flags. The
memblock_add_range function adds a new memory region to the memory block. It starts by checking the size of the given region, and if it is zero, it just returns. After this,
memblock_add_range checks the existence of the memory regions in the
memblock structure with the given
memblock_type. If there are no memory regions, we just fill a new
memblock_region with the given values and return (we already saw the implementation of this in the First touch of the Linux kernel memory manager framework). If
memblock_type is not empty, we start to add a new memory region to the
memblock with the given
base + size will not overflow. Its implementation is pretty easy:
memblock_cap_size returns the new size, which is the smaller of the given size and
ULLONG_MAX - base.
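The clamping rule is easy to try out in isolation. The following is a minimal sketch, assuming a 64-bit address type and a hypothetical function name `cap_size` that returns the capped value directly. It only illustrates the arithmetic described above.

```c
#include <stdint.h>

/*
 * Clamp size so that base + size cannot wrap around past the largest
 * representable address. Returns the smaller of size and the room
 * left between base and UINT64_MAX.
 */
static uint64_t cap_size(uint64_t base, uint64_t size)
{
    uint64_t room = UINT64_MAX - base;

    return size < room ? size : room;
}
```

For a region that fits, the size comes back unchanged. For a region that starts close to the top of the address space, only the part that still fits below the maximum is kept.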
memblock_add_range checks for overlap and merge conditions with memory regions that have been added before. Insertion of the new memory region into the
memblock consists of two steps:
memblock, we insert this region into the memblock. In the first step, we check whether the new region can fit into the memory block, and otherwise call
memblock_double_array:
memblock_double_array doubles the size of the given regions array. Then we set
true and go to the
repeat label. In the second step, starting from the
repeat label, we go through the same loop and insert the current memory region into the memory block with the
true in the first step, now
memblock_insert_region will be called.
memblock_insert_region has almost the same implementation that we saw when we inserted a new region into the empty
memblock_type (see above). This function gets the last memory region:
memblock_region fields of the new memory region (base, size, etc.) and increases the size of the
memblock_type. At the end of the execution,
memblock_merge_regions, which merges neighboring compatible regions in the second step.
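The insertion step can be sketched as a small array operation: make room at the target index by shifting later entries up, fill in the new entry, and update the counters. The types `struct region` and `struct region_list`, the fixed capacity `MAX_REGIONS`, and the function name `insert_region` are assumptions for this illustration. The sketch does not grow the array the way the doubling step described above does.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_REGIONS 8   /* assumed fixed capacity for this sketch */

struct region {
    uint64_t base, size;
};

struct region_list {
    unsigned long cnt;
    uint64_t total_size;
    struct region regions[MAX_REGIONS];
};

/* Insert [base, base + size) at index idx, keeping later entries in order. */
static void insert_region(struct region_list *list, unsigned long idx,
                          uint64_t base, uint64_t size)
{
    /* The caller must guarantee there is room and that idx is valid. */
    assert(list->cnt < MAX_REGIONS && idx <= list->cnt);

    struct region *rgn = &list->regions[idx];

    /* Shift entries idx..cnt-1 one slot towards the end of the array. */
    memmove(rgn + 1, rgn, (list->cnt - idx) * sizeof(*rgn));
    rgn->base = base;
    rgn->size = size;
    list->cnt++;
    list->total_size += size;
}
```

Inserting at `idx == cnt` appends, and inserting anywhere else keeps the array sorted as long as the caller picks the right index.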
memblock with the following base address and size:
0x1000 in our case, and insert it as we already did in the second step with:
overlapping portion (we insert only the higher portion, because the lower portion is already in the overlapped memory region), then the remaining portion, and merge these portions with
memblock_merge_regions. As I said above, the
memblock_merge_regions function merges neighboring compatible regions. It goes through all memory regions from the given
memblock_type, takes two neighboring memory regions -
type->regions[i + 1] and checks whether they can be merged: the regions must have the same flags, belong to the same node, and the end address of the first region must equal the base address of the second region. If any of these conditions fails, the pair is skipped:
next) memory region one index backwards with the
memmove here moves all regions which are located after the
next region to the base address of the
next region. In the end, we just decrease the count of the memory regions which belong to the
this region and shifted all regions which are located after the
next region to its place.
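The merge pass can be sketched as follows. This is a simplified user-space version under stated assumptions: the types `struct region` and `struct region_list`, the `nid` field, the fixed capacity, and the name `merge_regions` are illustrative, and the variable `cur` plays the role of the `this` region from the text.

```c
#include <stdint.h>
#include <string.h>

#define MAX_REGIONS 8   /* assumed fixed capacity for this sketch */

struct region {
    uint64_t base, size;
    unsigned long flags;
    int nid;            /* node the region belongs to */
};

struct region_list {
    unsigned long cnt;
    struct region regions[MAX_REGIONS];
};

/* Merge neighboring regions that are adjacent and compatible. */
static void merge_regions(struct region_list *list)
{
    unsigned long i = 0;

    while (i + 1 < list->cnt) {
        struct region *cur = &list->regions[i];
        struct region *next = &list->regions[i + 1];

        /* Not adjacent, or different node or flags: try the next pair. */
        if (cur->base + cur->size != next->base ||
            cur->nid != next->nid || cur->flags != next->flags) {
            i++;
            continue;
        }

        /* Grow the first region and close the gap left by next. */
        cur->size += next->size;
        memmove(next, next + 1, (list->cnt - (i + 2)) * sizeof(*next));
        list->cnt--;
    }
}
```

Note that the index is not advanced after a merge, so the grown region is compared again with its new neighbor. This lets a run of several adjacent regions collapse into one.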
memblock_reserve function, which does the same as
memblock_add, but with one difference. It stores
memblock_type.reserved in the memblock instead of
reserved memory regions, but also:
memblock_remove - removes a memory region from memblock;
memblock_find_in_range - finds a free area in the given range;
memblock_free - releases a memory region in memblock;
for_each_mem_range - iterates through memblock areas.
memblock. It is split into two parts:
get_allocated_memblock_memory_regions_info - getting info about memory regions;
get_allocated_memblock_reserved_regions_info - getting info about reserved regions.
memblock contains reserved memory regions. If
memblock does not contain reserved memory regions, we just return zero. Otherwise, we write the physical address of the reserved memory regions array to the given address and return the aligned size of the allocated array. Note that the
PAGE_ALIGN macro is used for alignment. Actually, it depends on the size of a page:
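As an illustration of how rounding up to a page boundary works, here is a minimal sketch. The `SKETCH_` names are hypothetical, and the 4 KiB page size (a shift of 12) is an assumption. The arithmetic relies on the page size being a power of two.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12   /* assumes 4 KiB pages */
#define SKETCH_PAGE_SIZE  ((uint64_t)1 << SKETCH_PAGE_SHIFT)

/*
 * Round x up to the next multiple of the page size: add (page size - 1)
 * and then clear the low bits with a mask.
 */
#define SKETCH_PAGE_ALIGN(x) \
    (((uint64_t)(x) + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1))
```

With these assumptions, a size of 1 is rounded up to 4096, a size of exactly 4096 stays 4096, and a size of 4097 becomes 8192.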
get_allocated_memblock_memory_regions_info function is the same. It has only one difference:
memblock_type.memory is used instead of
memblock_dbg in the memblock implementation. If you pass the
memblock=debug option to the kernel command line, this function will be called. Actually,
memblock_dbg is just a macro which expands to