 .. SPDX-License-Identifier: GPL-2.0

 =============================================
 Porting an architecture to support PREEMPT_RT
 =============================================

 :Author: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

 This list outlines the architecture-specific requirements that must be
 implemented in order to enable PREEMPT_RT. Once all required features are
 implemented, ARCH_SUPPORTS_RT can be selected in the architecture’s Kconfig to
 make PREEMPT_RT selectable.
 Many prerequisites (genirq support for example) are enforced by the common code
 and are omitted here.

 The optional features are not strictly required but are worth considering.

 Requirements
 ------------

 Forced threaded interrupts
   CONFIG_IRQ_FORCED_THREADING must be selected. Any interrupts that must
   remain in hard-IRQ context must be marked with IRQF_NO_THREAD. This
   requirement applies for instance to clocksource event interrupts,
   perf interrupts and cascading interrupt-controller handlers.
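
The effect of the flag can be illustrated with a small user-space model. This
is only a sketch: ``MODEL_IRQF_NO_THREAD``, ``struct model_irq_request`` and
``model_runs_in_thread()`` are hypothetical stand-ins invented for this
example, and the real decision in the genirq core considers more cases than
this one flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's IRQF_NO_THREAD; the value is arbitrary. */
#define MODEL_IRQF_NO_THREAD (1u << 0)

struct model_irq_request {
	const char *name;
	unsigned int flags;
};

/*
 * Model only: returns true if the primary handler would be moved into a
 * kernel thread. With forced threading active, that is every handler that
 * was not requested with the NO_THREAD flag.
 */
static bool model_runs_in_thread(const struct model_irq_request *req,
				 bool forced_threading)
{
	assert(req != NULL);
	if (!forced_threading)
		return false;
	return (req->flags & MODEL_IRQF_NO_THREAD) == 0;
}

/* A timer interrupt must stay in hard-IRQ context; a NIC interrupt need not. */
static const struct model_irq_request model_timer_irq = {
	.name = "timer", .flags = MODEL_IRQF_NO_THREAD,
};
static const struct model_irq_request model_nic_irq = {
	.name = "nic", .flags = 0,
};
```

In this model the NIC handler is threaded once forced threading is active,
while the timer handler keeps running in hard-IRQ context.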

 PREEMPTION support
   Kernel preemption must be supported and requires that
   CONFIG_ARCH_NO_PREEMPT remain unselected. Scheduling requests, such as those
   issued from an interrupt or other exception handler, must be processed
   immediately.

 POSIX CPU timers and KVM
   POSIX CPU timers must expire from thread context rather than directly within
   the timer interrupt. This behavior is enabled by setting the configuration
   option CONFIG_HAVE_POSIX_CPU_TIMERS_TASK_WORK.
   When virtualization support, such as KVM, is enabled,
   CONFIG_VIRT_XFER_TO_GUEST_WORK must also be set to ensure
   that any pending work, such as POSIX timer expiration, is handled before
   transitioning into guest mode.

 Hard-IRQ and Soft-IRQ stacks
   Soft interrupts are handled in the thread context in which they are raised. If
   a soft interrupt is triggered from hard-IRQ context, its execution is deferred
   to the ksoftirqd thread. Preemption is never disabled during soft interrupt
   handling, which makes soft interrupts preemptible.
   If an architecture provides a custom __do_softirq() implementation that uses a
   separate stack, it must select CONFIG_HAVE_SOFTIRQ_ON_OWN_STACK. The
   functionality should only be enabled when CONFIG_SOFTIRQ_ON_OWN_STACK is set.

 FPU and SIMD access in kernel mode
   FPU and SIMD registers are typically not used in kernel mode and are therefore
   not saved during kernel preemption. As a result, any kernel code that uses
   these registers must be enclosed within a kernel_fpu_begin() and
   kernel_fpu_end() section.
   The kernel_fpu_begin() function usually invokes local_bh_disable() to prevent
   interruptions from softirqs and to disable regular preemption. This allows the
   protected code to run safely in both thread and softirq contexts.
   On PREEMPT_RT kernels, however, kernel_fpu_begin() must not call
   local_bh_disable(). Instead, it should use preempt_disable(), since softirqs
   are always handled in thread context under PREEMPT_RT. In this case, disabling
   preemption alone is sufficient.
   The crypto subsystem operates on memory pages and requires users to "walk and
   map" these pages while processing a request. This operation must occur outside
   the kernel_fpu_begin()/kernel_fpu_end() section because it requires preemption
   to be enabled. These preemption points are generally sufficient to avoid
   excessive scheduling latency.
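
The two rules above (which primitive the begin function uses, and where the
page walk happens) can be sketched as a user-space model. Everything prefixed
with ``model_`` is hypothetical and exists only for this illustration: the
counters stand in for the kernel's preemption and bottom-half state, and
``model_preempt_rt`` stands in for a CONFIG_PREEMPT_RT check. It is not an
architecture's actual kernel_fpu_begin() implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy counters standing in for the preempt-disable and BH-disable depth. */
static int preempt_depth;
static int bh_depth;
static bool model_preempt_rt = true;	/* stands in for CONFIG_PREEMPT_RT */

static void model_fpu_begin(void)
{
	if (model_preempt_rt)
		preempt_depth++;	/* models preempt_disable() */
	else
		bh_depth++;		/* models local_bh_disable() */
}

static void model_fpu_end(void)
{
	if (model_preempt_rt)
		preempt_depth--;	/* models preempt_enable() */
	else
		bh_depth--;		/* models local_bh_enable() */
}

/* Models the "walk and map" step, which must run outside the FPU section. */
static size_t model_walk_next(size_t remaining, size_t chunk)
{
	assert(preempt_depth == 0 && bh_depth == 0);
	return remaining < chunk ? remaining : chunk;
}

/* Process a request chunk by chunk; returns the number of FPU sections used. */
static int model_process(size_t len, size_t chunk)
{
	int sections = 0;

	assert(chunk > 0);
	while (len > 0) {
		size_t n = model_walk_next(len, chunk);

		model_fpu_begin();
		/* SIMD work on n bytes would go here. */
		model_fpu_end();	/* preemption is possible again here */

		len -= n;
		sections++;
	}
	return sections;
}
```

Because each chunk gets its own short section, the gaps between sections act
as the preemption points mentioned above.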

 Exception handlers
   Exception handlers, such as the page fault handler, typically enable interrupts
   early, before invoking any generic code to process the exception. This is
   necessary because handling a page fault may involve operations that can sleep.
   Enabling interrupts is especially important on PREEMPT_RT, where certain
   locks, such as spinlock_t, become sleepable. For example, handling an
   invalid opcode may result in sending a SIGILL signal to the user task. A
   debug exception will send a SIGTRAP signal.
   In both cases, if the exception occurred in user space, it is safe to enable
   interrupts early. Sending a signal requires both interrupts and kernel
   preemption to be enabled.

 Optional features
 -----------------

 Timer and clocksource
   A high-resolution clocksource and clockevents device are recommended. The
   clockevents device should support the CLOCK_EVT_FEAT_ONESHOT feature for
   optimal timer behavior. In most cases, microsecond-level accuracy is
   sufficient.

 Lazy preemption
   This mechanism allows an in-kernel scheduling request for non-real-time tasks
   to be delayed until the task is about to return to user space. It helps avoid
   preempting a task that holds a sleeping lock at the time of the scheduling
   request.
   With CONFIG_GENERIC_IRQ_ENTRY enabled, supporting this feature requires
   defining a bit for TIF_NEED_RESCHED_LAZY, preferably near TIF_NEED_RESCHED.
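
The difference between the two flags can be shown with a toy model of the
thread-info flag word. The bit numbers and all ``model_`` names below are
assumptions made for this sketch; each architecture picks its own bit numbers,
and the real checks live in the scheduler and the generic entry code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit numbers, chosen adjacent as the text recommends. */
#define MODEL_TIF_NEED_RESCHED		3
#define MODEL_TIF_NEED_RESCHED_LAZY	4

#define MODEL_BIT(n)	(1ul << (n))

/* Record a scheduling request; lazy requests use the separate lazy bit. */
static unsigned long model_request_resched(unsigned long ti_flags, bool lazy)
{
	/* The two requests must be distinguishable from each other. */
	assert(MODEL_TIF_NEED_RESCHED != MODEL_TIF_NEED_RESCHED_LAZY);
	return ti_flags | MODEL_BIT(lazy ? MODEL_TIF_NEED_RESCHED_LAZY
					 : MODEL_TIF_NEED_RESCHED);
}

/* An in-kernel preemption point honours only the immediate request. */
static bool model_preempt_in_kernel(unsigned long ti_flags)
{
	return (ti_flags & MODEL_BIT(MODEL_TIF_NEED_RESCHED)) != 0;
}

/* The return to user space honours both the immediate and the lazy request. */
static bool model_resched_on_user_exit(unsigned long ti_flags)
{
	return (ti_flags & (MODEL_BIT(MODEL_TIF_NEED_RESCHED) |
			    MODEL_BIT(MODEL_TIF_NEED_RESCHED_LAZY))) != 0;
}
```

In this model a lazy request leaves in-kernel code running, including code
that holds a sleeping lock, and takes effect only on the way back to user
space.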

 Serial console with NBCON
   With PREEMPT_RT enabled, all console output is handled by a dedicated thread
   rather than directly from the context in which printk() is invoked. This design
   allows printk() to be safely used in atomic contexts.
   However, this also means that if the kernel crashes and cannot switch to the
   printing thread, no output will be visible, which prevents the system from
   printing its final messages.
   There are exceptions for immediate output, such as during panic() handling. To
   support this, the console driver must implement new-style lock handling. This
   involves setting the CON_NBCON flag in console::flags and providing
   implementations for the write_atomic, write_thread, device_lock, and
   device_unlock callbacks.
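
As a rough illustration of that checklist, the following user-space mock-up
tests whether a driver provides everything listed above. ``struct
model_console``, ``struct model_write_ctx``, ``MODEL_CON_NBCON`` and the
``model_uart_*`` callbacks are hypothetical stand-ins; the real definitions
are in the kernel's console header, and the check follows the list in this
text rather than the checks the printk core actually performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_CON_NBCON	(1u << 8)	/* placeholder value for the flag */

struct model_console;

/* Stand-in for the write context handed to the write callbacks. */
struct model_write_ctx {
	const char *outbuf;
	unsigned int len;
};

struct model_console {
	unsigned int flags;
	void (*write_atomic)(struct model_console *con, struct model_write_ctx *wctxt);
	void (*write_thread)(struct model_console *con, struct model_write_ctx *wctxt);
	void (*device_lock)(struct model_console *con, unsigned long *flags);
	void (*device_unlock)(struct model_console *con, unsigned long flags);
};

/* True if the flag is set and all four callbacks named in the text exist. */
static bool model_supports_atomic_output(const struct model_console *con)
{
	assert(con != NULL);
	return (con->flags & MODEL_CON_NBCON) &&
	       con->write_atomic && con->write_thread &&
	       con->device_lock && con->device_unlock;
}

/* Empty callbacks of a hypothetical UART driver. */
static void model_uart_write(struct model_console *con, struct model_write_ctx *wctxt)
{
	(void)con; (void)wctxt;		/* a driver would push outbuf to the UART */
}

static void model_uart_lock(struct model_console *con, unsigned long *flags)
{
	(void)con; *flags = 0;		/* a driver would take its port lock */
}

static void model_uart_unlock(struct model_console *con, unsigned long flags)
{
	(void)con; (void)flags;		/* a driver would release its port lock */
}

static struct model_console model_uart_console = {
	.flags		= MODEL_CON_NBCON,
	.write_atomic	= model_uart_write,
	.write_thread	= model_uart_write,
	.device_lock	= model_uart_lock,
	.device_unlock	= model_uart_unlock,
};
```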