Implementation of some exception handlers
This is the fifth part of the chapter about interrupt and exception handling in the Linux kernel. In the previous part we stopped at the setting of interrupt gates in the Interrupt Descriptor Table, which is done in the trap_init function. We saw only the setting of these interrupt gates in the previous part; in the current part we will look at the implementation of the exception handlers for these gates. The preparation before an exception handler is executed is done in assembly and occurs in the idtentry macro that defines exception entry points:
The idtentry macro does the following preparation before an actual exception handler (do_divide_error for divide_error, do_overflow for overflow, etc.) gets control. In other words, the idtentry macro allocates space for the registers (the pt_regs structure) on the stack, pushes a dummy error code for stack consistency if an interrupt/exception has no error code, checks the segment selector in the cs segment register and switches depending on the previous state (userspace or kernelspace). After all of these preparations it makes a call to the actual interrupt/exception handler:
where INTERRUPT_RETURN is:
divide_error
overflow
invalid_op
coprocessor_segment_overrun
invalid_TSS
segment_not_present
stack_segment
alignment_check
As we can see, the DO_ERROR macro takes 4 parameters:
Vector number of an interrupt;
Signal number which will be sent to the interrupted process;
String which describes an exception;
Exception handler entry point.
This macro is defined in the same source code file and expands to a function with the do_handler name:
The do_error_trap function starts and ends with the following two functions:
The context_tracking_enter function informs the context tracking subsystem that a processor is going to enter user mode from kernel mode. We can see the following code between exception_enter and exception_exit:
and returns the result of the atomic_notifier_call_chain function with the die_chain:
which just expands to the atomic_notifier_head structure that contains a lock and a notifier_block:
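To make the notifier chain idea more concrete, here is a minimal user-space sketch. It is not the kernel implementation: the sketch_ names, the demo callbacks and the lock-free head are assumptions made for illustration, and only the general shape (a singly-linked list of notifier_block entries, called in turn until one asks to stop) follows the description in this part.

```c
#include <assert.h>
#include <stddef.h>

/* Return codes modeled on the kernel's notifier conventions. */
#define NOTIFY_DONE      0x0000
#define NOTIFY_OK        0x0001
#define NOTIFY_STOP_MASK 0x8000
#define NOTIFY_STOP      (NOTIFY_OK | NOTIFY_STOP_MASK)

struct notifier_block {
    int (*notifier_call)(struct notifier_block *nb, unsigned long event, void *data);
    struct notifier_block *next;
    int priority;
};

/* Simplified head (hypothetical): the real atomic_notifier_head also has a lock. */
struct sketch_notifier_head {
    struct notifier_block *head;
};

/* Insert a block so that higher priorities are called first. */
static void sketch_chain_register(struct sketch_notifier_head *nh, struct notifier_block *n)
{
    struct notifier_block **p = &nh->head;

    while (*p && (*p)->priority >= n->priority)
        p = &(*p)->next;
    n->next = *p;
    *p = n;
}

/* Call every block in turn; stop early if a callback sets the stop bit.
 * The value of the last callback that ran is returned. */
static int sketch_chain_call(struct sketch_notifier_head *nh, unsigned long event, void *data)
{
    int ret = NOTIFY_DONE;

    for (struct notifier_block *nb = nh->head; nb; nb = nb->next) {
        ret = nb->notifier_call(nb, event, data);
        if (ret & NOTIFY_STOP_MASK)
            break;
    }
    return ret;
}

/* Demo callback: counts how many times it was invoked. */
static int count_calls(struct notifier_block *nb, unsigned long event, void *data)
{
    (void)nb; (void)event;
    ++*(int *)data;
    return NOTIFY_OK;
}

/* Demo callback: claims the event and stops the chain. */
static int stop_chain(struct notifier_block *nb, unsigned long event, void *data)
{
    (void)nb; (void)event; (void)data;
    return NOTIFY_STOP;
}

/* Usage example for the sketch. */
static void sketch_demo(void)
{
    struct sketch_notifier_head chain = { NULL };
    struct notifier_block low = { count_calls, NULL, 0 };
    struct notifier_block high = { stop_chain, NULL, 10 };
    int calls = 0;

    sketch_chain_register(&chain, &low);
    assert(sketch_chain_call(&chain, 1, &calls) == NOTIFY_OK);
    assert(calls == 1);

    /* The higher-priority block runs first and stops the chain. */
    sketch_chain_register(&chain, &high);
    assert(sketch_chain_call(&chain, 1, &calls) == NOTIFY_STOP);
    assert(calls == 1);
}
```

A handler that returns NOTIFY_STOP prevents the remaining callbacks from running, which is why callers compare the result of the chain call against NOTIFY_STOP.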
The do_trap_no_signal function makes two checks:
Did we come from kernel space.
And send the given signal to the interrupted process:
This is the end of the do_trap function. We just saw the generic implementation for eight different exceptions which are defined with the DO_ERROR macro. Now let's look at other exception handlers.
The next exception is #DF or Double fault. This exception occurs when the processor detects a second exception while calling an exception handler for a prior exception. We set the trap gate for this exception in the previous part:
The handler of the double fault exception is split into two parts. The first part checks whether the fault is a non-IST fault on the espfix64 stack. The iret instruction restores only the bottom 16 bits of the stack pointer when returning to a 16 bit segment, and the espfix feature solves this problem. So if a non-IST fault occurs on the espfix64 stack, we modify the stack to make it look like a General Protection Fault:
In the second case we do almost the same as in the previous exception handlers. The first step is the call of the ist_enter function that discards the previous context, user in our case:
After this we fill the interrupted process with the vector number of the Double fault exception and the error code, as we did in the previous handlers:
And die:
That's all.
The next exception is #NM or Device not available. The Device not available exception can occur depending on these things:
The processor executed a wait or fwait instruction while the MP and TS flags of register cr0 were set;
In the next step we check that the FPU is not eager:
When we switch into a task or interrupt we may avoid loading the FPU state. If a task uses it, we catch the Device not Available exception. If we load the FPU state during task switching, the FPU is eager. In the next step we check the EM flag of the cr0 control register, which shows whether the x87 floating point unit is present (flag clear) or not (flag set):
The next exception is #GP or General protection fault. This exception occurs when the processor detects one of a class of protection violations called general-protection violations. It can be:
Exceeding the segment limit when accessing the cs, ds, es, fs or gs segments;
Loading the ss, ds, es, fs or gs register with a segment selector for a system segment;
Violating any of the privilege rules;
and others...
As long mode does not support this mode, we will not consider exception handling for this case. In the next step we check whether the previous mode was kernel mode and try to fix the trap. If we can't fix the current general protection fault exception, we fill the interrupted process with the vector number and error code of the exception and add it to the notify_die chain:
If we can fix the exception, we go to the exit label which exits from the exception state:
If we came from user mode, we send the SIGSEGV signal to the interrupted process, as we did in the do_trap function:
That's all.
After an exception handler finishes its work, the idtentry macro restores the stack and general purpose registers of the interrupted task and executes the following instruction:
You can read more about the idtentry macro in the third part of the chapter. Now that we have seen the preparation before an exception handler is executed, it is time to look at the handlers. First of all let's look at the following handlers:
All these handlers are defined with the DO_ERROR macro:
Note the ## tokens. This is a special preprocessor feature that concatenates two given tokens. For example, the first DO_ERROR in our example will expand to:
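Token pasting with ## can be tried outside the kernel. The sketch below is a simplified, user-space imitation of the pattern: the trap_record structure, the last_trap variable and the reduced do_error_trap signature (without registers) are assumptions made for illustration and are not the kernel's definitions.

```c
#include <assert.h>
#include <signal.h>
#include <string.h>

#define X86_TRAP_DE 0   /* divide error vector */
#define X86_TRAP_UD 6   /* invalid opcode vector */

/* Hypothetical record of the last trap, used only to observe the calls. */
struct trap_record {
    int trapnr;
    int signr;
    const char *str;
    long error_code;
};

static struct trap_record last_trap;

/* Simplified stand-in for the generic handler: it only records its arguments. */
static void do_error_trap(long error_code, const char *str, int trapnr, int signr)
{
    last_trap.trapnr = trapnr;
    last_trap.signr = signr;
    last_trap.str = str;
    last_trap.error_code = error_code;
}

/* do_##name pastes the tokens "do_" and the value of name into one identifier. */
#define DO_ERROR(trapnr, signr, str, name)              \
static void do_##name(long error_code)                  \
{                                                       \
    do_error_trap(error_code, str, trapnr, signr);      \
}

/* Each line generates one function: do_divide_error() and do_invalid_op(). */
DO_ERROR(X86_TRAP_DE, SIGFPE, "divide error", divide_error)
DO_ERROR(X86_TRAP_UD, SIGILL, "invalid opcode", invalid_op)

/* Usage example for the sketch. */
static void sketch_demo(void)
{
    do_divide_error(0);
    assert(last_trap.trapnr == X86_TRAP_DE && last_trap.signr == SIGFPE);
    assert(strcmp(last_trap.str, "divide error") == 0);

    do_invalid_op(0);
    assert(last_trap.trapnr == X86_TRAP_UD && last_trap.signr == SIGILL);
}
```

The benefit of this design is that the per-exception functions differ only in their constant arguments, so one macro line per exception replaces a hand-written wrapper.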
We can see that all functions which are generated by the DO_ERROR macro just make a call to the do_error_trap function. Let's look at the implementation of the do_error_trap function.
Context tracking in the Linux kernel is a subsystem which provides kernel boundary probes to keep track of the transitions between level contexts with two basic initial contexts: user or kernel. The exception_enter function checks that context tracking is enabled. If it is enabled, exception_enter reads the previous context and compares it with CONTEXT_KERNEL. If the previous context is user, we call the context_tracking_exit function, which informs the context tracking subsystem that a processor is exiting user mode and entering kernel mode:
If the previous context is not user, we just return it. The pre_ctx has the enum ctx_state type, which looks as follows:
The second function is exception_exit. It is defined in the same file, checks that context tracking is enabled and calls the context_tracking_enter function if the previous context was user:
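The pairing of exception_enter and exception_exit can be modeled with a small user-space sketch. The tracking_enabled and current_ctx variables are hypothetical stand-ins for the kernel's per-CPU state, and the enum is reduced to the two contexts discussed here, so treat this as an illustration of the control flow rather than the kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Reduced to the two contexts discussed in the text. */
enum ctx_state {
    CONTEXT_KERNEL = 0,
    CONTEXT_USER,
};

/* Hypothetical stand-ins for the kernel's per-CPU tracking state. */
static bool tracking_enabled = true;
static enum ctx_state current_ctx = CONTEXT_USER;

/* The processor leaves the given context and enters kernel mode. */
static void context_tracking_exit(enum ctx_state state)
{
    if (current_ctx == state)
        current_ctx = CONTEXT_KERNEL;
}

/* The processor leaves kernel mode and enters the given context. */
static void context_tracking_enter(enum ctx_state state)
{
    current_ctx = state;
}

/* Remember where we came from and switch tracking to kernel context. */
static enum ctx_state exception_enter(void)
{
    enum ctx_state prev_ctx;

    if (!tracking_enabled)
        return CONTEXT_KERNEL;

    prev_ctx = current_ctx;
    if (prev_ctx != CONTEXT_KERNEL)
        context_tracking_exit(prev_ctx);
    return prev_ctx;
}

/* Restore the context that was saved by exception_enter(). */
static void exception_exit(enum ctx_state prev_ctx)
{
    if (tracking_enabled && prev_ctx != CONTEXT_KERNEL)
        context_tracking_enter(prev_ctx);
}

/* Usage example for the sketch. */
static void sketch_demo(void)
{
    enum ctx_state prev;

    /* Exception taken while running user code. */
    current_ctx = CONTEXT_USER;
    prev = exception_enter();
    assert(prev == CONTEXT_USER && current_ctx == CONTEXT_KERNEL);
    exception_exit(prev);
    assert(current_ctx == CONTEXT_USER);

    /* Exception taken while already in the kernel: nothing changes. */
    current_ctx = CONTEXT_KERNEL;
    prev = exception_enter();
    exception_exit(prev);
    assert(prev == CONTEXT_KERNEL && current_ctx == CONTEXT_KERNEL);
}
```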
First of all it calls the notify_die function. To get notified of events such as this one, the caller needs to insert itself in the notify_die chain, and the notify_die function does it.
The Linux kernel has a special mechanism that allows the kernel to ask to be told when something happens; this mechanism is called notifiers or notifier chains. It is used, for example, for USB hotplug events, for memory events (see the hotplug_memory_notifier macro, etc.), system reboots, etc. A notifier chain is thus a simple, singly-linked list. When a Linux kernel subsystem wants to be notified of specific events, it fills out a special notifier_block structure and passes it to the notifier_chain_register function. An event can be sent with the call of the notifier_call_chain function.
First of all the notify_die function fills the die_args structure with the trap number, trap string, registers and other values:
The atomic_notifier_call_chain function calls each function in a notifier chain in turn and returns the value of the last notifier function called. If notify_die in do_error_trap does not return NOTIFY_STOP, we execute the conditional_sti function, which checks the value of the interrupt flag and enables interrupts depending on it:
You can read more about the local_irq_enable macro in the second part of this chapter. The next and last call in do_error_trap is the do_trap function. First of all the do_trap function defines the tsk variable, which has the task_struct type and represents the current interrupted process. After the definition of tsk, we can see the call of the do_trap_no_signal function:
Did we come from Virtual 8086 mode;
We will not consider the first case because long mode does not support Virtual 8086 mode. In the second case we invoke the fixup_exception function, which will try to recover from the fault, and die if we can't:
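The kernel-mode branch described above can be sketched in user space as follows. The sketch_regs and sketch_task structures and the boolean fields are hypothetical simplifications (the real function works on pt_regs and task_struct and also handles the Virtual 8086 case), so this only illustrates the decision flow and the return values.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily reduced stand-ins for pt_regs and task_struct. */
struct sketch_regs {
    bool from_user;   /* did the exception come from user mode? */
    bool has_fixup;   /* is there a recovery entry for the faulting address? */
};

struct sketch_task {
    long error_code;
    int trap_nr;
    bool died;        /* set when die() was reached */
};

/* Stand-in: pretend the fixup lookup succeeded if the flag says so. */
static bool fixup_exception(const struct sketch_regs *regs)
{
    return regs->has_fixup;
}

/* Stand-in: the real die() prints diagnostics; here we only record the call. */
static void die(struct sketch_task *tsk)
{
    tsk->died = true;
}

/* Returns 0 if the trap was fully handled in kernel context,
 * -1 if the caller has to continue and deliver a signal. */
static int do_trap_no_signal(struct sketch_task *tsk, int trapnr,
                             const struct sketch_regs *regs, long error_code)
{
    if (!regs->from_user) {
        if (!fixup_exception(regs)) {
            tsk->error_code = error_code;
            tsk->trap_nr = trapnr;
            die(tsk);
        }
        return 0;
    }
    return -1;
}

/* Usage example for the sketch. */
static void sketch_demo(void)
{
    struct sketch_task tsk = { 0, 0, false };
    struct sketch_regs user = { true, false };
    struct sketch_regs kernel_fixable = { false, true };
    struct sketch_regs kernel_fatal = { false, false };

    /* From user mode: nothing is handled here, the caller sends a signal. */
    assert(do_trap_no_signal(&tsk, 0, &user, 0) == -1 && !tsk.died);

    /* From kernel mode with a fixup: recovered silently. */
    assert(do_trap_no_signal(&tsk, 0, &kernel_fixable, 0) == 0 && !tsk.died);

    /* From kernel mode without a fixup: trap data is saved and we die. */
    assert(do_trap_no_signal(&tsk, 6, &kernel_fatal, 42) == 0);
    assert(tsk.died && tsk.trap_nr == 6 && tsk.error_code == 42);
}
```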
The die function prints useful information about the stack, registers and kernel modules, and causes a kernel oops. If we came from userspace, the do_trap_no_signal function returns -1 and the execution of the do_trap function continues. If we passed through the do_trap_no_signal function and did not exit from do_trap after this, it means that the previous context was user.
Most exceptions caused by the processor are interpreted by Linux as error conditions, for example division by zero, invalid opcode, etc. When an exception occurs, the Linux kernel sends a signal to the interrupted process that caused the exception to notify it of an incorrect condition. So, in the do_trap function we need to send a signal with the given number (SIGFPE for the divide error, SIGILL for an illegal instruction, etc.). First of all we save the error code and vector number in the current interrupted process by filling thread.error_code and thread.trap_nr:
After this we check whether we need to print information about unhandled signals for the interrupted process. We check that the show_unhandled_signals variable is set, that the unhandled_signal function reports the signal as unhandled, and that the rate limit allows printing:
Note that this exception runs on the DOUBLEFAULT_STACK, which has index 1:
The double_fault function is the handler for this exception. It starts with the definition of two variables, a string that describes the exception and the interrupted process, as in other exception handlers:
Next we print useful information about the double fault, such as the content of the registers:
The processor executed a floating-point instruction while the EM flag in cr0 was set;
The processor executed an x87 FPU, MMX or SSE instruction while the TS flag in control register cr0 was set and the EM flag is clear.
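The conditions from this list can be written down as a small predicate over the cr0 flags. The bit positions of MP, EM and TS are those of the x86 cr0 register; the insn_kind enum and the function name are hypothetical and exist only for this sketch, which groups instructions into just two kinds (floating-point and wait/fwait).

```c
#include <assert.h>
#include <stdbool.h>

/* Bits of the cr0 control register used by the #NM conditions. */
#define X86_CR0_MP (1UL << 1)   /* Monitor Coprocessor */
#define X86_CR0_EM (1UL << 2)   /* Emulation */
#define X86_CR0_TS (1UL << 3)   /* Task Switched */

/* Hypothetical classification of the executed instruction. */
enum insn_kind {
    INSN_FPU,    /* a floating-point instruction */
    INSN_WAIT,   /* wait or fwait */
};

/* Returns true if executing the instruction with this cr0 raises #NM. */
static bool raises_device_not_available(unsigned long cr0, enum insn_kind insn)
{
    bool mp = cr0 & X86_CR0_MP;
    bool em = cr0 & X86_CR0_EM;
    bool ts = cr0 & X86_CR0_TS;

    switch (insn) {
    case INSN_WAIT:
        /* wait/fwait trap only when both MP and TS are set. */
        return mp && ts;
    case INSN_FPU:
        /* Either emulation is requested, or the state was not loaded yet. */
        return em || (ts && !em);
    }
    return false;
}

/* Usage example for the sketch. */
static void sketch_demo(void)
{
    assert(!raises_device_not_available(0, INSN_FPU));
    assert(raises_device_not_available(X86_CR0_EM, INSN_FPU));
    assert(raises_device_not_available(X86_CR0_TS, INSN_FPU));
    assert(!raises_device_not_available(X86_CR0_TS, INSN_WAIT));
    assert(raises_device_not_available(X86_CR0_MP | X86_CR0_TS, INSN_WAIT));
}
```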
The handler of the Device not available exception is the do_device_not_available function. It starts and ends with getting the previous context, like the other traps which we saw at the beginning of this part:
If the x87 floating point unit is not present, we enable interrupts with conditional_sti, fill the math_emu_info structure with the registers of the interrupted task and call the math_emulate function. As you can understand from the function's name, it emulates the x87 FPU unit (more about the x87 we will know in the special chapter). Otherwise, if the X86_CR0_EM flag is clear, which means that the x87 FPU unit is present, we call the fpu__restore function, which copies the FPU registers from the fpustate to the live hardware registers. After this, FPU instructions can be used:
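The two branches of the handler can be imitated in user space. Everything in the sketch below is a model: the live_regs variable plays the role of the hardware registers, sketch_cr0 plays the role of cr0, and clearing TS after the restore is an assumption of this sketch that reflects the statement that FPU instructions can be used afterwards.

```c
#include <assert.h>

#define X86_CR0_EM (1UL << 2)   /* Emulation: no x87 unit, emulate in software */
#define X86_CR0_TS (1UL << 3)   /* Task Switched: FPU state not loaded yet */

/* Hypothetical model of the FPU register file. */
struct sketch_fpu_state {
    double st[8];
};

struct sketch_task {
    struct sketch_fpu_state saved;   /* state saved in memory for this task */
};

static struct sketch_fpu_state live_regs;        /* models the hardware registers */
static unsigned long sketch_cr0 = X86_CR0_TS;    /* models the cr0 register */
static int emulated_calls;                       /* how often emulation was used */

/* Stand-in for math_emulate(): only counts the call. */
static void sketch_math_emulate(void)
{
    emulated_calls++;
}

/* Stand-in for fpu__restore(): copy the saved state into the live registers
 * and clear TS so that later FPU instructions do not trap again. */
static void sketch_fpu_restore(const struct sketch_task *tsk)
{
    live_regs = tsk->saved;
    sketch_cr0 &= ~X86_CR0_TS;
}

/* Model of the #NM handler's decision. */
static void sketch_device_not_available(const struct sketch_task *tsk)
{
    if (sketch_cr0 & X86_CR0_EM) {
        sketch_math_emulate();
        return;
    }
    sketch_fpu_restore(tsk);
}

/* Usage example for the sketch. */
static void sketch_demo(void)
{
    struct sketch_task tsk = { { { 1.5 } } };

    /* FPU present, state not loaded: the saved state becomes live. */
    sketch_cr0 = X86_CR0_TS;
    sketch_device_not_available(&tsk);
    assert(live_regs.st[0] == 1.5);
    assert(!(sketch_cr0 & X86_CR0_TS));
    assert(emulated_calls == 0);

    /* No FPU: the instruction is emulated instead. */
    sketch_cr0 = X86_CR0_EM;
    sketch_device_not_available(&tsk);
    assert(emulated_calls == 1);
}
```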
The exception handler for this exception is the do_general_protection function. It starts and ends, like other exception handlers, with getting the previous context:
After this we enable interrupts if they were disabled and check whether we came from Virtual 8086 mode:
This is the end of the fifth part of the chapter, in which we saw the implementation of some interrupt handlers. In the next part we will continue to dive into interrupt and exception handlers and will see more handlers, including the handling of the math and coprocessor exceptions and many more.
If you have any questions or suggestions, write me a comment or ping me.
Please note that English is not my first language, and I am really sorry for any inconvenience. If you find any mistakes, please send me a PR.