Note: The RTDS scheduler was included as an experimental feature in Xen 4.5 and is still under development.

Real-Time-Deferrable-Server(RTDS)-Based CPU Scheduler


The Real-Time Deferrable Server (RTDS) scheduler is a real-time CPU scheduler built to provide guaranteed CPU capacity to guest VMs on SMP hosts. It was introduced under the name rtds in Xen 4.5 as an experimental scheduler. In Xen 4.7, the RTDS scheduler changed its scheduling model from quantum-driven to event-driven. Because the event-driven model avoids invoking the scheduler unnecessarily, it incurs less scheduling overhead.


Each VCPU of each domain is assigned a budget and a period. A VCPU with parameters <budget>, <period> is supposed to run for <budget>us (not necessarily continuously) in every <period>us.

Note: The VCPUs of the same domain have the same parameters right now.
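
As a rough illustration of what a <budget>, <period> pair reserves, the following sketch computes the fraction of one physical CPU that a VCPU is entitled to. The function name cpu_share and the example values are hypothetical and are not part of Xen or xl.

```python
def cpu_share(budget_us: int, period_us: int) -> float:
    """Return the fraction of one physical CPU reserved for a VCPU.

    Hypothetical helper for illustration only; not a Xen API.
    """
    if period_us <= 0:
        raise ValueError("period must be positive")
    if not 0 < budget_us <= period_us:
        raise ValueError("budget must be greater than 0 and at most the period")
    return budget_us / period_us


# Example values (assumed): budget 4000us in every period of 10000us.
share = cpu_share(4000, 10000)
print(f"reserved CPU share: {share:.0%}")
```

With these example values the VCPU is reserved 40% of one CPU, spread over each 10000us period rather than granted as one continuous slice.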


The xl sched-rtds command can be used to tune the scheduler parameters of each guest VM.


-d DOMAIN, --domain=DOMAIN Specify domain for which scheduler parameters are to be modified or retrieved. Mandatory for modifying scheduler parameters.

-v VCPUID/all, --vcpuid=VCPUID/all Specify vcpu for which scheduler parameters are to be modified or retrieved.

-p PERIOD, --period=PERIOD Period of time, in microseconds, over which to replenish the budget.

-b BUDGET, --budget=BUDGET Amount of time, in microseconds, that the VCPU will be allowed to run every period.

-c CPUPOOL, --cpupool=CPUPOOL Restrict output to domains in the specified cpupool.

Details on using the xl sched-rtds command can be found by searching for sched-rtds in this documentation[1].
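
The options listed above can be combined into a single command line. The sketch below only builds the argument list and does not run xl. The helper name sched_rtds_argv and the domain name vm1 are hypothetical, and the check that the budget does not exceed the period is an assumed sanity check rather than documented xl behaviour.

```python
from typing import List, Optional


def sched_rtds_argv(domain: str, period_us: int, budget_us: int,
                    vcpu: Optional[str] = None) -> List[str]:
    """Build an argument list for xl sched-rtds (hypothetical helper)."""
    # Assumed sanity check: a budget larger than the period cannot be served.
    if budget_us > period_us:
        raise ValueError("budget must not exceed period")
    argv = ["xl", "sched-rtds", "-d", domain]
    if vcpu is not None:
        # Either a VCPU id such as "0", or "all".
        argv += ["-v", vcpu]
    argv += ["-p", str(period_us), "-b", str(budget_us)]
    return argv


# "vm1" is a placeholder domain name.
example = sched_rtds_argv("vm1", 10000, 4000, vcpu="all")
print(" ".join(example))
```

The resulting list could be passed to subprocess.run() on a host where xl is installed.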


The design of this rtds scheduler is as follows:

Each VCPU has a dedicated period and budget. The deadline of a VCPU is at the end of each period. A VCPU has its budget replenished at the beginning of each period. While scheduled, a VCPU burns its budget. The VCPU needs to finish its budget before its deadline in each period, and it discards any unused budget at the end of each period. If a VCPU runs out of budget in a period, it has to wait until the next period.

Each VCPU is implemented as a deferrable server. When a VCPU has a task running on it, its budget is continuously burned. When a VCPU has no task but still has budget left, its budget is preserved.
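
The budget rules above can be modelled in a few lines. This is a minimal sketch of a deferrable server, not the code in sched_rt.c. The class name and its methods are hypothetical, and time is passed in explicitly in microseconds.

```python
class DeferrableServerVcpu:
    """Toy model of one VCPU's budget accounting (illustrative only)."""

    def __init__(self, budget_us: int, period_us: int, now_us: int = 0):
        self.budget_us = budget_us
        self.period_us = period_us
        self.cur_budget_us = budget_us           # full budget at period start
        self.deadline_us = now_us + period_us    # deadline is the period end

    def run(self, duration_us: int) -> int:
        """Burn budget while a task runs; return the time actually granted."""
        ran = min(duration_us, self.cur_budget_us)
        self.cur_budget_us -= ran
        return ran

    def replenish_if_due(self, now_us: int) -> bool:
        """At a period boundary, drop leftover budget and refill it."""
        if now_us < self.deadline_us:
            return False  # still inside the current period
        # Skip over any periods that have fully elapsed.
        periods = (now_us - self.deadline_us) // self.period_us + 1
        self.deadline_us += periods * self.period_us
        self.cur_budget_us = self.budget_us
        return True


v = DeferrableServerVcpu(budget_us=4000, period_us=10000)
ran_first = v.run(1500)                     # task runs, budget burns
# The VCPU now idles: no method is called, so its budget is preserved.
ran_second = v.run(5000)                    # only the remaining budget is granted
refilled_early = v.replenish_if_due(9000)   # period has not ended yet
refilled = v.replenish_if_due(10000)        # period boundary reached
```

Idle time is modelled by simply not calling run(), which is what distinguishes a deferrable server from one that loses budget while idle.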

This scheduler follows the Preemptive Global Earliest Deadline First (EDF) theory from the real-time field to schedule these VCPUs. At any scheduling point, the VCPU with the earlier deadline has the higher priority. The scheduler always picks the highest-priority VCPU to run on a feasible PCPU. A PCPU is feasible for a VCPU if the PCPU is idle or has a lower-priority VCPU running on it.
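
The feasibility rule above can be sketched as follows. The function find_feasible_pcpu and its dictionary representation are hypothetical. Preferring an idle PCPU over preemption, and preempting the VCPU with the latest deadline, are assumptions made for this sketch rather than statements about the Xen implementation.

```python
from typing import Dict, Optional


def find_feasible_pcpu(new_deadline_us: int,
                       running: Dict[int, Optional[int]]) -> Optional[int]:
    """Return a PCPU id on which the new VCPU may run, or None.

    `running` maps a PCPU id to the deadline of the VCPU currently
    running there, or to None when the PCPU is idle.
    """
    if not running:
        return None
    # An idle PCPU is always feasible.
    for pcpu, deadline in running.items():
        if deadline is None:
            return pcpu
    # Otherwise consider the PCPU running the lowest-priority VCPU,
    # i.e. the one with the latest deadline.
    victim = max(running, key=lambda p: running[p])
    if running[victim] > new_deadline_us:
        return victim  # the new VCPU has an earlier deadline: preempt
    return None        # every running VCPU has higher or equal priority


busy = {0: 30000, 1: 50000}
preempt_target = find_feasible_pcpu(40000, busy)
no_target = find_feasible_pcpu(60000, busy)
idle_target = find_feasible_pcpu(60000, {0: 30000, 1: None})
```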

Queue scheme: a global runqueue and a global depletedq for each CPU pool. The runqueue holds all runnable VCPUs with budget, sorted by deadline. The depletedq holds all VCPUs without budget, unsorted.
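
A minimal sketch of this queue scheme, assuming one RtQueues object per CPU pool. The class and the VCPU names are hypothetical, and a sorted Python list stands in for whatever list structure the hypervisor actually uses.

```python
import bisect
from typing import List, Optional, Tuple


class RtQueues:
    """Toy runqueue and depletedq for one CPU pool (illustrative only)."""

    def __init__(self) -> None:
        self.runq: List[Tuple[int, str]] = []  # (deadline_us, vcpu), sorted
        self.depletedq: List[str] = []         # VCPUs without budget, unsorted

    def insert(self, vcpu: str, deadline_us: int, cur_budget_us: int) -> None:
        if cur_budget_us > 0:
            # Keep the runqueue ordered by deadline, earliest first.
            bisect.insort(self.runq, (deadline_us, vcpu))
        else:
            self.depletedq.append(vcpu)

    def pick(self) -> Optional[str]:
        """Remove and return the VCPU with the earliest deadline, if any."""
        if not self.runq:
            return None
        return self.runq.pop(0)[1]


q = RtQueues()
q.insert("d1v0", deadline_us=30000, cur_budget_us=1000)
q.insert("d2v0", deadline_us=10000, cur_budget_us=500)
q.insert("d3v0", deadline_us=5000, cur_budget_us=0)   # no budget left
picked = q.pick()
```

Note that d3v0 has the earliest deadline but is not picked, because a VCPU without budget waits in the depletedq until it is replenished.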

In Xen 4.7, budget replenishment and enforcement are separated by adding a replenishment timer, which fires at the next most imminent release time of all runnable vcpus.

A replenishment queue has been added to keep track of all replenishment events.
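
One way to picture the replenishment queue and its timer is a min-heap keyed by release time, with the timer programmed for the earliest entry. This is a sketch under that assumption and does not reproduce the data structures in sched_rt.c. The class ReplenishmentQueue and its method names are hypothetical.

```python
import heapq
from typing import List, Optional, Tuple


class ReplenishmentQueue:
    """Toy replenishment event queue (illustrative only)."""

    def __init__(self) -> None:
        # Heap entries: (release_time_us, vcpu, period_us)
        self._events: List[Tuple[int, str, int]] = []

    def add(self, release_time_us: int, vcpu: str, period_us: int) -> None:
        heapq.heappush(self._events, (release_time_us, vcpu, period_us))

    def next_expiry(self) -> Optional[int]:
        """Time at which the timer should fire next, or None if empty."""
        return self._events[0][0] if self._events else None

    def handle_timer(self, now_us: int) -> List[str]:
        """Replenish every VCPU whose release time has arrived."""
        replenished = []
        while self._events and self._events[0][0] <= now_us:
            release, vcpu, period = heapq.heappop(self._events)
            replenished.append(vcpu)
            # Queue the next release strictly after now_us.
            missed = (now_us - release) // period + 1
            heapq.heappush(self._events,
                           (release + missed * period, vcpu, period))
        return replenished


rq = ReplenishmentQueue()
rq.add(10000, "a", 10000)
rq.add(4000, "b", 4000)
first_expiry = rq.next_expiry()
first_batch = rq.handle_timer(4000)
second_batch = rq.handle_timer(10000)
```

Because the timer is only armed for next_expiry(), no work is done between replenishment events, which is the point of the event-driven model.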

The following functions have major changes to manage the replenishment events and timer.

repl_handler(): It is a timer handler which is re-programmed to fire at the nearest vcpu deadline to replenish vcpus.

rt_schedule(): picks the highest-priority runnable vcpu based on cpu affinity; ret.time is passed to schedule(). If an idle vcpu is picked, -1 is returned to avoid busy-waiting.

repl_update() has been removed.

rt_vcpu_wake(): when a vcpu wakes up, it tickles instead of picking one from the run queue.

rt_context_saved(): when context switching is finished, the preempted vcpu will be put back into the run queue and it tickles.

Simplified functional graph: (diagram not preserved; surviving labels: snext = runq_pick(), sched_rt.c, TIMER_SOFTIRQ, for_each_vcpu_on_q(i))

Features Improved in Xen 4.7

  • Burn budget in finer granularity instead of 1ms;
  • Use separate timer per VCPU for each VCPU's budget replenishment, instead of scanning the full runqueue every now and then;
  • Toolstack supports assigning and displaying the parameters of each VCPU of each domain.

Features In the Future

  • Work-conserving version of RTDS. Allow VCPUs (or VMs) to use (unreserved) idle time left in the system.
  • Handle time stolen from domU by the hypervisor. When Xen runs on a machine with many sockets and many cores, the spin-lock for the global runqueue used in the rtds scheduler could eat up time from domU, leaving domU with less budget than it requires.

Glossary of Terms

  • us: microsecond
  • Host: The physical hardware running Xen and hosting guest VMs.
  • VM/domU: Guest virtual machine.
  • VCPU: Virtual CPU (one or more per VM).
  • CPU/PCPU: Physical host CPU.
  • Period: The interval at which a VCPU's budget is replenished and any unused budget is discarded.
  • Budget: The amount of time a VCPU can execute within its period.


Please cite our research paper below if you use RTDS for a publication.

Also See