Scheduler Activations: Effective Kernel Support for the User-Level Management of Parallelism
    This paper describes the design, implementation, and performance
    of a kernel interface and a user-level thread package that
    together combine the performance of user-level threads (in the
    common case of thread operations that can be implemented entirely
    at user level) with the functionality of kernel threads (correct
    behavior in the infrequent case when the kernel must be involved).
    
      - Kernel-level threads
        
          - Kernel directly schedules each application's threads
          onto physical processors.
          
          - Poor performance: Although typically an order of magnitude
          better than that of traditional processes, kernel-thread
          performance has typically been an order of magnitude worse
          than the best-case performance of user-level threads.

          - Expensive: Must cross an extra protection boundary on
          every thread operation, even when the processor is being
          switched between threads in the same address space.

          - Inflexible: Overhead is imposed on applications that do
          not use a particular feature of the kernel thread system,
          and the scheduling policy can't be customized.
        
 
       - User-level threads
        
          - Managed by runtime library routines linked into each
          application so that thread management operations require no
          kernel intervention.
          
          - Excellent performance: Cost of user-level thread
          operations is within an order of magnitude of the cost of a
          procedure call.

          - Flexible: Can be customized to the needs of the language
          or user without kernel modification.

          - User-level threads are multiplexed across real, physical
          processors by the underlying kernel.  In the presence of
          "real world" operating system activity
          (e.g. multiprogramming, I/O, and page faults), user-level
          threads built on top of traditional processes can exhibit
          poor performance or even incorrect behavior (e.g. priority
          inversion).
          
          - Problems with user-level threads built on top of
          kernel-level threads:

              - Kernel threads block, resume, and are preempted
              without notification to the user level.

              - Kernel threads are scheduled obliviously with
              respect to the user-level thread state.
            
 
         
        
       - Scheduler Activations
        
          - Serves as a vessel, or execution context, for running
          user-level threads, in exactly the same way that a kernel
          thread does.
          
          - Notifies the user-level thread system of a kernel event.
          (Called a "scheduler activation" because each vectored event
          causes the user-level thread system to reconsider its
          scheduling decision of which threads to run on which
          processors.)

          - Provides space in the kernel for saving the processor
          context of the activation's current user-level thread when
          the thread is stopped by the kernel (e.g., because the
          thread blocks in the kernel on I/O or the kernel preempts
          its processor).
        
 
       - By using scheduler activations, the kernel is able to
      maintain the invariant that there are always exactly as many
      running scheduler activations as there are processors assigned
      to the address space.
      
       - Kernel creates a scheduler activation, assigns it to a
       processor, and upcalls into the application address space at a
       fixed entry point on the following events:
        
          - New processor available.  User-level thread system
          executes a runnable user-level thread.
          
          - Processor has been preempted.  User-level thread system
          returns to the ready list the user-level thread that was
          executing in the context of the preempted scheduler
          activation.  The thread's register state is saved by
          low-level kernel routines (such as the interrupt and page
          fault handlers), and the kernel passes this state to the
          user level as part of the upcall notifying the address space
          of the preemption.

          - Scheduler activation has blocked.  User-level thread
          system now knows that the blocked scheduler activation is no
          longer using its processor, and can pick another user thread
          to run on the new activation.

          - Scheduler activation has unblocked.  User-level thread
          system returns to the ready list the user-level thread that
          was executing in the context of the blocked scheduler
          activation.  The thread's register state is saved by
          low-level kernel routines (such as the interrupt and page
          fault handlers), and the kernel passes this state to the
          user level as part of the upcall notifying the address space
          of the I/O completion.
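
      The four upcall events above can be sketched as a toy
      dispatcher.  This is a hypothetical illustration, not the
      paper's interface: the event names and the UserLevelScheduler
      class are invented, and each notification is simplified to name
      the affected activation directly (in the real system the
      preemption and unblock notices ride on an upcall delivered on
      another processor).

```python
from collections import deque

# Hypothetical names for the four upcall points (not from the paper).
NEW_PROCESSOR, PREEMPTED, BLOCKED, UNBLOCKED = range(4)

class UserLevelScheduler:
    """Toy user-level thread system driven by kernel upcalls."""

    def __init__(self):
        self.ready = deque()   # runnable user-level threads
        self.running = {}      # scheduler activation -> user-level thread
        self.blocked = {}      # activation blocked in the kernel -> its thread

    def upcall(self, event, act, old_act=None):
        if event == NEW_PROCESSOR:
            # New processor available: run a ready thread on this activation.
            if self.ready:
                self.running[act] = self.ready.popleft()
        elif event == PREEMPTED:
            # act's processor was taken away; its thread goes back on the
            # ready list (its register state arrived with the upcall).
            thread = self.running.pop(act, None)
            if thread is not None:
                self.ready.append(thread)
        elif event == BLOCKED:
            # old_act blocked in the kernel; its processor now runs the
            # fresh activation act, so pick another thread for it.
            self.blocked[old_act] = self.running.pop(old_act)
            if self.ready:
                self.running[act] = self.ready.popleft()
        elif event == UNBLOCKED:
            # old_act's thread can run again: back on the ready list.
            self.ready.append(self.blocked.pop(old_act))
```

      Because every stop and resume is vectored through upcall(), the
      user level always knows which of its threads occupy processors,
      which is exactly the information an oblivious kernel scheduler
      withholds.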
        
 
       - Some details:
        
          - User-level priority scheduling: may need to pull a
          lower-priority user thread off of another activation (only
          the kernel can preempt a processor).  User-level thread
          system asks the kernel to preempt a certain thread's
          processor and use that processor to do an upcall, allowing
          the user-level thread system to transfer the preempted
          thread to the ready list and run a higher-priority thread.

          - Kernel's interaction with the application is entirely in
          terms of scheduler activations.  The kernel needs no
          knowledge of the data structures used to represent
          parallelism at the user level.

          - Scheduler activations work properly even when a
          preemption or a page fault occurs in the user-level thread
          manager while no user-level thread is running (the kernel
          saves the thread manager's state).  The only added
          complication for the kernel is that an upcall to notify the
          program of a page fault may in turn page fault on the same
          location; the kernel must check for this and, when it
          occurs, delay the subsequent upcall until the page fault
          completes.
        
 
       - Address space notifies the kernel whenever it makes a
       transition to a state where it has more runnable threads than
       processors, or more processors than runnable threads.  (The
       kernel need not be notified of additional idle processors or
       additional parallelism if there are already idle processors or
       all processors are busy.)  These notifications are:
        
          - Add more processors.  Kernel allocates more processors
          to this address space and starts them running scheduler
          activations.

          - This processor is idle.  Kernel preempts this processor
          if another address space needs it.
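
      The notify-only-on-transitions rule above can be captured in a
      small helper.  The function name and return strings are
      hypothetical; only the rule itself comes from the paper.

```python
def notification(prev_runnable, prev_procs, runnable, procs):
    """Decide which downcall, if any, the address space should make
    after its runnable-thread / processor counts change.  Only
    transitions are reported: if the kernel already knows the space
    is short of processors (or has idle ones), no new call is made."""
    was_short = prev_runnable > prev_procs    # already wanted more processors
    was_surplus = prev_procs > prev_runnable  # already had an idle processor
    if runnable > procs and not was_short:
        return "add more processors"
    if procs > runnable and not was_surplus:
        return "this processor is idle"
    return None  # no transition: kernel already has the information
```

      For example, going from 2 runnable threads on 2 processors to 3
      on 2 triggers "add more processors", but going from 3 on 2 to 4
      on 2 triggers nothing, since the kernel was already told the
      space could use more processors.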
        
 
       - Some more details:
        
          - What happens if a user-level thread is executing in a
          critical section at the instant when it is blocked or
          preempted?  Adopt a solution based on recovery: when an
          upcall informs the user-level thread system that a thread
          has been preempted or unblocked, the thread system checks
          whether the thread was executing in a critical section.  If
          so, the thread is continued temporarily via a user-level
          context switch.  When the continued thread exits the
          critical section, it relinquishes control back to the
          original upcall, again via a user-level context switch.  It
          is now safe to place the user-level thread back on the
          ready list.

          - Critical sections are detected by keeping a hash table
          of section begin/end addresses that are computed by placing
          special assembly instructions around critical sections in
          the object code and then post-processing the object code.
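
      A minimal sketch of the recovery check, under stated
      assumptions: the addresses, function names, and callback style
      are all invented here; the real system compares the stopped
      thread's saved program counter against begin/end addresses
      recovered from the post-processed object code.

```python
# Hypothetical (begin, end) program-counter ranges for critical
# sections, standing in for the hash table built from the object code.
CRITICAL_SECTIONS = [(0x1000, 0x1040), (0x2000, 0x2010)]

def in_critical_section(pc):
    """Was the stopped thread's saved PC inside a critical section?"""
    return any(begin <= pc < end for begin, end in CRITICAL_SECTIONS)

def handle_stopped_thread(saved_pc, continue_thread, put_on_ready_list):
    """Upcall-side recovery: if the thread was stopped inside a
    critical section, continue it (a user-level context switch) until
    it exits the section and yields back to this upcall; only then is
    it safe to place the thread on the ready list."""
    if in_critical_section(saved_pc):
        continue_thread()   # runs to the section's end, then returns here
    put_on_ready_list()
```

      Note the boundary convention assumed here: a PC equal to the
      section's end address counts as already outside the section.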
        
 
     
    
    Elaine Cheong
Last modified: Mon Jul 30 23:25:20 PDT 2001