Scheduler Activations: Effective Kernel Support for the User-Level Management of Parallelism
    This paper describes the design, implementation, and performance
    of a kernel interface and a user-level thread package that
    together combine the performance of user-level threads (in the
    common case of thread operations that can be implemented entirely
    at user level) with the functionality of kernel threads (correct
    behavior in the infrequent case when the kernel must be involved).
    
      - Kernel-level threads
        
          - Kernel directly schedules each application's threads
          onto physical processors.
          
          - Poor performance: Although typically an order of magnitude
          better than that of traditional processes, kernel-thread
          performance has typically been an order of magnitude worse
          than the best-case performance of user-level threads.

          - Expensive: Must cross an extra protection boundary on
          every thread operation, even when the processor is being
          switched between threads in the same address space.

          - Inflexible: Overhead is imposed on applications that do
          not use a particular feature of the kernel thread system,
          and the scheduling policy can't be customized.
        
 
       - User-level threads
        
          - Managed by runtime library routines linked into each
          application so that thread management operations require no
          kernel intervention.
          
          - Excellent performance: Cost of user-level thread
          operations is within an order of magnitude of the cost of a
          procedure call.

          - Flexible: Can be customized to the needs of the language
          or user without kernel modification.

          - User-level threads are multiplexed across real, physical
          processors by the underlying kernel.  In the presence of
          "real world" operating system activity
          (e.g. multiprogramming, I/O, and page faults), user-level
          threads built on top of traditional processes can exhibit
          poor performance or even incorrect behavior (e.g. priority
          inversion).
          
          - Problems with user-level threads built on top of
          kernel-level threads:

              - Kernel threads block, resume, and are preempted
              without notification to the user level.

              - Kernel threads are scheduled obliviously with
              respect to the user-level thread state.
            
 
         
        
       - Scheduler Activations
        
          - Serves as a vessel, or execution context, for running
          user-level threads, in exactly the same way that a kernel
          thread does.
          
          - Notifies the user-level thread system of a kernel event.
          (Called a "scheduler activation" because each vectored event
          causes the user-level thread system to reconsider its
          scheduling decision of which threads to run on which
          processors.)

          - Provides space in the kernel for saving the processor
          context of the activation's current user-level thread when
          the thread is stopped by the kernel (e.g., because the
          thread blocks in the kernel on I/O or the kernel preempts
          its processor).
        
 
       - By using scheduler activations, the kernel is able to
      maintain the invariant that there are always exactly as many
      running scheduler activations as there are processors assigned
      to the address space.
      
       - Kernel creates a scheduler activation, assigns it to a
       processor, and upcalls into the application address space at a
       fixed entry point on the following events:
        
          - New processor available.  User-level thread system
          executes a runnable user-level thread.
          
          - Processor has been preempted.  User-level thread system
          returns to the ready list the user-level thread that was
          executing in the context of the preempted scheduler
          activation.  The thread's register state is saved by
          low-level kernel routines (such as the interrupt and page
          fault handlers), and the kernel passes this state to the
          user level as part of the upcall notifying the address space
          of the preemption.

          - Scheduler activation has blocked.  User-level thread
          system now knows that the blocked scheduler activation is no
          longer using its processor, and can pick another user thread
          to run on the new activation.

          - Scheduler activation has unblocked.  User-level thread
          system returns to the ready list the user-level thread that
          was executing in the context of the blocked scheduler
          activation.  The thread's register state is saved by
          low-level kernel routines (such as the interrupt and page
          fault handlers), and the kernel passes this state to the
          user level as part of the upcall notifying the address space
          of the I/O completion.
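
      The four upcall events above can be sketched as a toy
      dispatcher.  This is a hypothetical illustration, not the
      paper's interface: the event names and the UserLevelScheduler
      class are invented, and each notification is simplified to name
      the affected activation directly (in the real system the
      preemption and unblock notices ride on an upcall delivered on
      another processor).

```python
from collections import deque

# Hypothetical names for the four upcall points (not from the paper).
NEW_PROCESSOR, PREEMPTED, BLOCKED, UNBLOCKED = range(4)

class UserLevelScheduler:
    """Toy user-level thread system driven by kernel upcalls."""

    def __init__(self):
        self.ready = deque()   # runnable user-level threads
        self.running = {}      # scheduler activation -> user-level thread
        self.blocked = {}      # activation blocked in the kernel -> its thread

    def upcall(self, event, act, old_act=None):
        if event == NEW_PROCESSOR:
            # New processor available: run a ready thread on this activation.
            if self.ready:
                self.running[act] = self.ready.popleft()
        elif event == PREEMPTED:
            # act's processor was taken away; its thread goes back on the
            # ready list (its register state arrived with the upcall).
            thread = self.running.pop(act, None)
            if thread is not None:
                self.ready.append(thread)
        elif event == BLOCKED:
            # old_act blocked in the kernel; its processor now runs the
            # fresh activation act, so pick another thread for it.
            self.blocked[old_act] = self.running.pop(old_act)
            if self.ready:
                self.running[act] = self.ready.popleft()
        elif event == UNBLOCKED:
            # old_act's thread can run again: back on the ready list.
            self.ready.append(self.blocked.pop(old_act))
```

      Because every stop and resume is vectored through upcall(), the
      user level always knows which of its threads occupy processors,
      which is exactly the information an oblivious kernel scheduler
      withholds.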
        
 
       - Some details:
        
          - User-level priority scheduling: may need to pull a
          lower-priority user thread off of another activation (only
          the kernel can preempt a processor).  User-level thread
          system asks the kernel to preempt a certain thread's
          processor and use that processor to do an upcall, allowing
          the user-level thread system to transfer the preempted
          thread to the ready list and run a higher-priority thread.

          - Kernel's interaction with the application is entirely in
          terms of scheduler activations.  The kernel needs no
          knowledge of the data structures used to represent
          parallelism at the user level.

          - Scheduler activations work properly even when a
          preemption or a page fault occurs in the user-level thread
          manager while no user-level thread is running (the kernel
          saves the thread manager's state).  The only added
          complication for the kernel is that an upcall to notify the
          program of a page fault may in turn page fault on the same
          location; the kernel must check for this and, when it
          occurs, delay the subsequent upcall until the page fault
          completes.
        
 
       - Address space notifies the kernel whenever it makes a
       transition to a state where it has more runnable threads than
       processors, or more processors than runnable threads.  (The
       kernel need not be notified of additional idle processors or
       additional parallelism if there are already idle processors or
       all processors are busy.)  These notifications are:
        
          - Add more processors.  Kernel allocates more processors
          to this address space and starts them running scheduler
          activations.

          - This processor is idle.  Kernel preempts this processor
          if another address space needs it.
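
      The notify-only-on-transitions rule above can be captured in a
      small helper.  The function name and return strings are
      hypothetical; only the rule itself comes from the paper.

```python
def notification(prev_runnable, prev_procs, runnable, procs):
    """Decide which downcall, if any, the address space should make
    after its runnable-thread / processor counts change.  Only
    transitions are reported: if the kernel already knows the space
    is short of processors (or has idle ones), no new call is made."""
    was_short = prev_runnable > prev_procs    # already wanted more processors
    was_surplus = prev_procs > prev_runnable  # already had an idle processor
    if runnable > procs and not was_short:
        return "add more processors"
    if procs > runnable and not was_surplus:
        return "this processor is idle"
    return None  # no transition: kernel already has the information
```

      For example, going from 2 runnable threads on 2 processors to 3
      on 2 triggers "add more processors", but going from 3 on 2 to 4
      on 2 triggers nothing, since the kernel was already told the
      space could use more processors.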
        
 
       - Some more details:
        
          - What happens if a user-level thread is executing in a
          critical section at the instant when it is blocked or
          preempted?  Adopt a solution based on recovery: when an
          upcall informs the user-level thread system that a thread
          has been preempted or unblocked, the thread system checks
          whether the thread was executing in a critical section.  If
          so, the thread is continued temporarily via a user-level
          context switch.  When the continued thread exits the
          critical section, it relinquishes control back to the
          original upcall, again via a user-level context switch.  It
          is now safe to place the user-level thread back on the
          ready list.

          - Critical sections are detected by keeping a hash table
          of section begin/end addresses that are computed by placing
          special assembly instructions around critical sections in
          the object code and then post-processing the object code.
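
      A minimal sketch of the recovery check, under stated
      assumptions: the addresses, function names, and callback style
      are all invented here; the real system compares the stopped
      thread's saved program counter against begin/end addresses
      recovered from the post-processed object code.

```python
# Hypothetical (begin, end) program-counter ranges for critical
# sections, standing in for the hash table built from the object code.
CRITICAL_SECTIONS = [(0x1000, 0x1040), (0x2000, 0x2010)]

def in_critical_section(pc):
    """Was the stopped thread's saved PC inside a critical section?"""
    return any(begin <= pc < end for begin, end in CRITICAL_SECTIONS)

def handle_stopped_thread(saved_pc, continue_thread, put_on_ready_list):
    """Upcall-side recovery: if the thread was stopped inside a
    critical section, continue it (a user-level context switch) until
    it exits the section and yields back to this upcall; only then is
    it safe to place the thread on the ready list."""
    if in_critical_section(saved_pc):
        continue_thread()   # runs to the section's end, then returns here
    put_on_ready_list()
```

      Note the boundary convention assumed here: a PC equal to the
      section's end address counts as already outside the section.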
        
 
     
    
    Elaine Cheong
Last modified: Mon Jul 30 23:25:20 PDT 2001