The setup_arch function performs x86_64 architecture-specific initialization. The setup_arch function is big, and in the previous part we stopped at the setup of the two exception handlers for the following two exceptions:
#DB - debug exception, transfers control from the interrupted process to the debug handler;
#BP - breakpoint exception, caused by the int 3 instruction.
We saw the set_system_intr_gate_ist functions in the previous part, and now we will look at the implementation of these two exception handlers.
We set up the handlers in the early_trap_init function for the #DB and #BP exceptions, and now it is time to consider their implementations. But before we do this, let's look at the details of these exceptions.
The debug exception occurs when a debug event occurs, for example an attempt to change the contents of a debug register. Debug registers are special registers that were introduced in x86 processors starting from the Intel 80386 processor, and as the name suggests, the main purpose of these registers is debugging.
#DB exception, but not the
The vector number of the #DB exception is 1 (we pass it as X86_TRAP_DB), and as we may read in the specification, this exception has no error code:
The breakpoint exception occurs when the processor executes the int 3 instruction. Unlike the #DB exception, the #BP exception may occur in userspace. We can add it anywhere in our code; for example, let's look at a simple program:
set_system_intr_gate_ist functions take the addresses of exception handlers as their second parameter. In our case the two exception handlers will be:
*.c/*.h files contain only the declarations of these functions, which are located in the arch/x86/include/asm/traps.h kernel header file:
asmlinkage directive in the definitions of these functions. The directive is a special gcc specifier. For C functions that are called from assembly, we need an explicit declaration of the function calling convention. In our case, if a function is declared with the asmlinkage descriptor, then gcc will compile the function to retrieve parameters from the stack.
SIGILL signal, and so on.
idtentry macro from the arch/x86/entry/entry_64.S assembly source code file, so let's look at the implementation of this macro. As we may see, the idtentry macro takes five arguments:
sym - defines a global symbol with the .globl name, which will be the entry point of the exception handler;
do_sym - symbol name which represents the secondary entry of an exception handler;
has_error_code - indicates whether the exception has an error code;
paranoid - shows how we need to check the current mode (explained in detail later);
shift_ist - shows whether an exception runs on the Interrupt Stack Table.
Before we look inside the idtentry macro, we should know the state of the stack when an exception occurs. As we may read in the Intel® 64 and IA-32 Architectures Software Developer’s Manual 3A, the state of the stack when an exception occurs is the following:
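As a rough illustration, the frame the processor pushes in 64-bit mode can be modelled as a C structure, lowest address first. The structure name hw_exception_frame and the helper rip_offset are hypothetical; only the order of the fields (SS, RSP, RFLAGS, CS, RIP, then an optional error code) comes from the architecture.

```c
#include <stddef.h>
#include <stdint.h>

/* Words pushed by the CPU on exception entry in 64-bit mode.
 * The stack grows down, so the last word pushed has the lowest address. */
struct hw_exception_frame {
    uint64_t error_code; /* pushed last, only by exceptions that provide one */
    uint64_t rip;        /* address of the interrupted instruction stream */
    uint64_t cs;         /* code segment selector of the interrupted code */
    uint64_t rflags;
    uint64_t rsp;        /* stack pointer of the interrupted code */
    uint64_t ss;         /* pushed first */
};

/* Byte offset of the saved RIP from the stack pointer at handler entry. */
size_t rip_offset(int has_error_code)
{
    size_t with_code = offsetof(struct hw_exception_frame, rip);
    return has_error_code ? with_code : with_code - sizeof(uint64_t);
}
```

The one-word difference returned by rip_offset is exactly the inconsistency an entry stub has to hide from the code that runs after it.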
Both the #DB and #BP exception handlers are defined as:
debug and int3 names, and both of these exception handlers will call the do_debug and do_int3 secondary handlers after some preparation. The third parameter defines the existence of an error code, and as we may see, neither of our exceptions has one. As we may see on the diagram above, the processor pushes an error code on the stack if an exception provides it. In our case, the debug and int3 exceptions do not have error codes. This may bring some difficulties, because the stack will look different for exceptions which provide an error code and for exceptions which do not. That's why the implementation of the idtentry macro starts by putting a fake error code on the stack if an exception does not provide it:
-1 also represents an invalid system call number, so the system call restart logic will not be triggered.
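The effect of the fake error code can be sketched with a simulated stack. Everything here is illustrative: the sim_stack type, the helper names and the placeholder register values are assumptions, not kernel code. The point is only that after the stub runs, every exception ends up with a frame of the same size.

```c
#include <stdint.h>

#define FAKE_ERROR_CODE (-1)   /* doubles as an invalid system call number */
#define SLOTS 8

/* Simulated stack that grows toward lower indices, like the x86 stack. */
struct sim_stack {
    int64_t slot[SLOTS];
    int top;                   /* index of the most recently pushed slot */
};

static void push(struct sim_stack *s, int64_t value)
{
    s->slot[--s->top] = value;
}

/* Hardware part: five words always, plus an error code for some vectors. */
static void cpu_pushes_frame(struct sim_stack *s, int has_error_code,
                             int64_t error_code)
{
    push(s, 0x2b);      /* SS     (placeholder values throughout) */
    push(s, 0x7000);    /* RSP    */
    push(s, 0x202);     /* RFLAGS */
    push(s, 0x33);      /* CS     */
    push(s, 0x401000);  /* RIP    */
    if (has_error_code)
        push(s, error_code);
}

/* Entry-stub part: pad the frame when the CPU pushed no error code. */
static void stub_normalizes_frame(struct sim_stack *s, int has_error_code)
{
    if (!has_error_code)
        push(s, FAKE_ERROR_CODE);
}

/* Value found in the error-code slot once entry is complete. */
int64_t error_slot_after_entry(int has_error_code, int64_t error_code)
{
    struct sim_stack s = { .top = SLOTS };
    cpu_pushes_frame(&s, has_error_code, error_code);
    stub_normalizes_frame(&s, has_error_code);
    return s.slot[s.top];
}

/* Number of words on the stack once entry is complete. */
int frame_words_after_entry(int has_error_code)
{
    struct sim_stack s = { .top = SLOTS };
    cpu_pushes_frame(&s, has_error_code, 0);
    stub_normalizes_frame(&s, has_error_code);
    return SLOTS - s.top;
}
```

Both kinds of exception leave six words on the simulated stack, so the code that follows can use fixed offsets regardless of the vector.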
paranoid allow us to know whether an exception handler runs on a stack from the Interrupt Stack Table or not. You may already know that each kernel thread in the system has its own stack. In addition to these stacks, there are some specialized stacks associated with each processor in the system. One of these stacks is the exception stack. The x86_64 architecture provides a special feature called the Interrupt Stack Table. This feature allows switching to a new stack for designated events such as atomic exceptions like a double fault, etc. So the shift_ist parameter tells us whether we need to switch to an IST stack for an exception handler or not.
paranoid defines the method that helps us determine whether we came to an exception handler from userspace or not. The easiest way to determine this is via the Current Privilege Level in the CS segment register. If it is equal to 3, we came from userspace; if it is zero, we came from kernel space:
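The check described here amounts to looking at the two lowest bits of the saved CS selector. A minimal C sketch, with hypothetical helper names; the selector values mentioned in the note below are the usual ones on x86_64 Linux and are given only as examples:

```c
#include <stdint.h>

#define CPL_MASK 0x3u   /* the privilege level lives in bits 0-1 of CS */

/* Privilege level encoded in a saved code segment selector. */
unsigned cpl_of(uint64_t saved_cs)
{
    return (unsigned)(saved_cs & CPL_MASK);
}

/* Fast check: ring 3 means the exception interrupted userspace code. */
int came_from_userspace(uint64_t saved_cs)
{
    return cpl_of(saved_cs) == 3;
}
```

For example, came_from_userspace(0x33) is true for a typical user code selector, while came_from_userspace(0x10) is false for a typical kernel code selector.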
if we are in an NMI/MCE/DEBUG/whatever super-atomic entry context, which might have triggered right after a normal entry wrote CS to the stack but before we executed SWAPGS, then the only safe way to check for GS is the slower method: the RDMSR.
NMI could happen inside the critical section of a swapgs instruction. In this case we should check the value of the MSR_GS_BASE model specific register, which stores a pointer to the start of the per-cpu area. So to check whether we came from userspace or not, we should check the value of the MSR_GS_BASE model specific register: if it is negative we came from kernel space, otherwise we came from userspace:
We read the value of the MSR_GS_BASE model specific register into the edx:eax pair. We can't set a negative value to gs from userspace. On the other hand, we know that the direct mapping of physical memory starts from the 0xffff880000000000 virtual address. In this way, MSR_GS_BASE will contain an address from 0xffff880000000000 to 0xffffc7ffffffffff. After the rdmsr instruction is executed, the smallest possible value in the %edx register will be 0xffff8800, which is -30720 as a signed 4-byte value. That's why the kernel space gs, which points to the start of the per-cpu area, will contain a negative value.
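The arithmetic behind this sign test can be checked with a small C sketch. The function names are hypothetical; the sketch only models what lands in %edx after rdmsr (the upper 32 bits of the MSR value) and interprets it as a signed number.

```c
#include <stdint.h>

/* Upper half of a 64-bit MSR value, as rdmsr leaves it in %edx,
 * reinterpreted as a signed 32-bit integer. */
int32_t edx_after_rdmsr(uint64_t msr_value)
{
    return (int32_t)(uint32_t)(msr_value >> 32);
}

/* Slow but safe check: a negative upper half means a kernel GS base. */
int gs_base_is_kernel(uint64_t gs_base)
{
    return edx_after_rdmsr(gs_base) < 0;
}
```

With the start of the direct mapping, edx_after_rdmsr(0xffff880000000000) gives -30720, while any canonical userspace address such as 0x00007f0000000000 has a non-negative upper half.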
int3 exceptions. In this case we check the selector from the CS segment register and jump to the 1f label if we came from userspace; otherwise paranoid_entry will be called.
1 label starts with a call of the
SAVE_EXTRA_REGS, the stack will look like this:
%RIP was reported. Anyway, in both cases the SWAPGS instruction will be executed and the values of MSR_KERNEL_GS_BASE and MSR_GS_BASE will be swapped. From this moment the %gs register will point to the base address of kernel structures. So, the SWAPGS instruction is called and it was the main point of the
idtentry macro. We may see the following assembler code after the call of
task_pt_regs macro, which is defined in the arch/x86/include/asm/processor.h header file, stores it in the stack pointer, and returns it. The task_pt_regs macro expands to the address of thread.sp0, which represents a pointer to the normal kernel stack:
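A minimal sketch of the idea, assuming the saved register frame sits immediately below the top of the kernel stack that thread.sp0 points to. The regs_sketch structure and the regs_from_sp0 helper are illustrative stand-ins, not the kernel's own definitions.

```c
#include <stdint.h>

/* Illustrative register frame: general purpose registers followed by the
 * error code slot and the words pushed by the CPU. */
struct regs_sketch {
    uint64_t gpr[15];
    uint64_t orig_ax;
    uint64_t ip, cs, flags, sp, ss;
};

/* Given the top of a kernel stack, return where the frame would live:
 * one whole frame below that top. */
struct regs_sketch *regs_from_sp0(uintptr_t sp0)
{
    return (struct regs_sketch *)(sp0 - sizeof(struct regs_sketch));
}
```

With this layout the frame occupies the highest addresses of the stack, so switching the stack pointer to the returned address leaves the saved registers directly on top of the normal kernel stack.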
sync_regs we switch the stack:
pt_regs structure, which contains the preserved general purpose registers, to the %rdi register. The error code goes to the %rsi register, as it will be the second argument of an exception handler, and we set it to -1 on the stack for the same purpose as before: to prevent the restart of a system call:
%esi register above in case an exception does not provide an error code.
int 3 exception. In this part we will not look at the implementations of the secondary handlers, because they are very specific, but we will see some of them in one of the next parts.
idtentry macro is defined with paranoid=1 for this exception. This value of paranoid means that we should use the slower way that we saw at the beginning of this part to check whether we really came from kernel space or not. The paranoid_entry routine allows us to know this:
SWAPGS in case we came from userspace, we should do the same as we did before: we need to put a pointer to the structure which holds the general purpose registers into %rdi (which will be the first parameter of a secondary handler) and put the error code, if the exception provides it, into %rsi (which will be the second parameter of a secondary handler):
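In C terms, this preparation corresponds to calling a two-argument function. The sketch below is an assumption-laden model: the regs_sketch type, the handler type that returns a value, and the demo helpers are all hypothetical and exist only to make the data flow visible.

```c
#include <stdint.h>

struct regs_sketch {
    int64_t orig_ax;   /* error code slot; -1 when nothing was pushed */
};

/* Hypothetical secondary handler type: (saved registers, error code). */
typedef long (*secondary_handler)(struct regs_sketch *regs, long error_code);

/* Models the last steps of the entry stub before the secondary handler. */
long call_secondary(secondary_handler do_sym, struct regs_sketch *regs,
                    int has_error_code)
{
    long error_code = 0;                   /* no code: second argument is 0 */

    if (has_error_code) {
        error_code = (long)regs->orig_ax;  /* take what the CPU pushed */
        regs->orig_ax = -1;                /* no system call to restart */
    }
    return do_sym(regs, error_code);       /* regs first, code second */
}

/* Demo handler: echoes the error code if the slot was reset correctly. */
static long demo_handler(struct regs_sketch *regs, long error_code)
{
    return regs->orig_ax == -1 ? error_code : -999;
}

/* Demo driver: `pushed` is used only when has_error_code is non-zero. */
long demo_dispatch(int has_error_code, long pushed)
{
    struct regs_sketch regs = { .orig_ax = has_error_code ? pushed : -1 };
    return call_secondary(demo_handler, &regs, has_error_code);
}
```

In both branches the secondary handler sees -1 in the error code slot of the saved registers, and receives either the real error code or zero as its second argument.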
shift_ist as an argument of the idtentry macro. Here we check its value, and if it is not equal to -1, we get a pointer to a stack from the Interrupt Stack Table by the shift_ist index and set it up.
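One plausible reading of "set it up" can be sketched as follows. As an assumption, the sketch lowers the selected IST slot by a fixed stack size while the secondary handler runs and restores it afterwards, so that a nested exception of the same kind would start on a fresh region. The tss_sketch type, STACK_SIZE value and helper names are hypothetical; only the seven IST slots with 1-based indices come from the x86_64 architecture.

```c
#include <stdint.h>

#define IST_ENTRIES 7        /* the x86_64 TSS holds seven IST pointers */
#define NO_IST      (-1)     /* shift_ist value meaning "no IST stack" */
#define STACK_SIZE  4096u    /* assumed size of one exception stack */

struct tss_sketch {
    uint64_t ist[IST_ENTRIES];   /* ist[0] holds IST1, and so on */
};

typedef uint64_t (*handler_fn)(const struct tss_sketch *tss, int shift_ist);

/* Shift the IST slot down for the duration of the handler call. */
uint64_t call_on_ist(struct tss_sketch *tss, int shift_ist, handler_fn handler)
{
    uint64_t result;

    if (shift_ist != NO_IST)
        tss->ist[shift_ist - 1] -= STACK_SIZE;
    result = handler(tss, shift_ist);
    if (shift_ist != NO_IST)
        tss->ist[shift_ist - 1] += STACK_SIZE;
    return result;
}

/* Demo handler: reports the slot value visible while it is running. */
static uint64_t peek_slot(const struct tss_sketch *tss, int shift_ist)
{
    return tss->ist[shift_ist - 1];
}

/* Slot value as seen from inside the handler. */
uint64_t demo_slot_during_handler(uint64_t initial)
{
    struct tss_sketch tss = { .ist = { 0, 0, initial } };
    return call_on_ist(&tss, 3, peek_slot);
}

/* Slot value after the handler has returned. */
uint64_t demo_slot_after_handler(uint64_t initial)
{
    struct tss_sketch tss = { .ist = { 0, 0, initial } };
    call_on_ist(&tss, 3, peek_slot);
    return tss.ist[2];
}
```

The demo uses slot 3 arbitrarily. While the handler runs, the slot points one stack size lower; once it returns, the original value is back.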
paranoid=0 and we may use the fast method to determine where we came from.
idtentry macro, and the next step will be a jump to the error_exit function defined in the same arch/x86/entry/entry_64.S assembly source code file. The main goal of this function is to determine where we came from (userspace or kernel space) and execute SWAPGS depending on this, restore the registers to their previous state, and execute the iret instruction to transfer control to the interrupted task.
#BP gates and started to dive into the preparation before control is transferred to an exception handler and the implementation of some interrupt handlers in this part. In the next part we will continue to dive into this theme, go further through the setup_arch function, and try to understand interrupt handling related stuff.