Large deviations analysis of scheduling policies for a web server
Abstract
With increasing demand and availability of bandwidth resources, there has been tremendous
growth in the scale and speed of web servers. In web servers, scheduling plays an important
role in resource allocation (for instance, bandwidth allocation, processor allocation,
etc). However, as the scale of a system increases so does the number of activities/events
in the system (e.g., job arrivals), as a consequence of which the analysis of scheduling
becomes increasingly harder. In particular, the possible ways in which scheduling failure
(e.g., queue overflow, excessively large delay, instability of a system) can occur becomes
increasingly greater, thus making it more difficult to understand the behavior and develop
design rules for scheduling algorithms. However, a well-known observation from large devi
viations theory that large scale systems fails in a “most likely way” can potentially be used
to simplify the design and analysis of scheduling. In this thesis, we study the implications
and applications of this effect on scheduling in a web server accessed by a large number of
sources.
We analyze the delay distribution of scheduling policies for web servers under a
many sources large deviation regime which models web servers in a large scale system
well. Due to the difficulties brought on considering a large number of sources, only a small
number of scheduling policies, such as First-Come-First-Serve (FCFS), General-ProcessorSharing
(GPS), and Priority Queueing policies have been analyzed under the many sources
regime. In particular, in a single queue single server setup the delay characteristics of only
FCFS, Shortest-Job-First (SJF), and Longest-Job-First (LJF) has been analyzed.
In this thesis, we study the Two-Dimensional-Queueing (2DQ) framework, a unifying
queueing framework that allows the identification of the “most likely way” in which
delay occurs, to analyze the delay of various unexplored scheduling policies. In conjunction
with the 2DQ framework, we develop a new “cycle based” technique for understanding the
large deviations tail probability of more complex policies.
Using the combination of the 2DQ framework and the cycle based analysis, we
first analyze two interesting scheduling policies, i.e., Shortest-Remaining-Processing-Time
(SRPT) policy (which is mean delay optimal) and Processer-Sharing (PS) policy (which is a
“fair” policy). We derive the asymptotic delay distributions (rate functions) of both policies
and study their behavior across job sizes. Next, we address three problems in implementing
the aforementioned scheduling policies: (i) end receivers may have bandwidth constraints
that are not taken account in SRPT, (ii) the remaining processing time information might
not be available to the web-server, and (iii) most actual implementations are variants of
SRPT to reflect other implementation constraints and/or to jointly optimize other metrics
in addition to delay, i.e., jitter, fairness, etc. To address these, we first develop finite-SRPT
that takes into account the bandwidth constraint at the end receiver, and show that the policy
shifts between SRPT and a PS-like policy depending on the bandwidth constraint. Second,
we study the Least-Attained-Service (LAS) policy which is viewed as a good substitute
for SRPT when the remaining job size is not available and we analyze the penalty associated
with not using the remaining size information directly. Lastly, we analyze a class of
scheduling policies known as SMART that contains many variants of SRPT with different
fairness properties and show that all policies in the class have the same tail probability of
delay across job sizes for a many sources regime. The results of this thesis facilitate the
understanding of various scheduling policies under the many sources regime and provides
an analytical queueing framework that can be used to understand other scheduling policies.