Predictable Performance Management and Simulation
Members: Adam Crume, Carlos Maltzahn, John Bent
Hard disk drive latency models are needed as components of storage system simulations and for storage Quality-of-Service systems. Existing models are based on white-box approaches and are difficult to configure and keep up to date. FLAMBES is a black-box alternative that automatically builds behavioral models of hard disk drive latency with little or no expert knowledge.
Other black-box models have not been able to predict per-request latencies. We recognized that a key reason for this inability is the existence of periodicity with high, unknown frequencies, which many machine learning algorithms do not handle well. To account for this, FLAMBES generates neural net models that use request data augmented with periodicity information to predict individual request latencies.
Members: Andrew Shewmaker, Katia Obraczka, Carlos Maltzahn, Scott Brandt
Good average case network performance isn't necessarily good enough at scale. If a worst-case network delay occurs only one time out of a thousand, then a system serving many thousands of users every second will feel its impact constantly. And tightly coupled scientific simulations are even more sensitive, since all tasks have to wait for one that is slow. Congestion is the cause of long tail delay distributions in a network, and totally eliminating it will require dealing with the issue on individual links, trunked links, and multiple paths. We propose a new optimal multi-queue scheduler that solves the local congestion sub-problems and provides information that should prove useful for global load balancing problem and troubleshooting.
The envisioned queueing discipline builds on processor scheduling work done previously at the UCSC Systems Research Lab. Rate Based Earliest Deadline (RBED) is a generalization of Earliest Deadline First (EDF) that enables integrated scheduling of best effort tasks with tasks requiring absolute rate and deadline guarantees. Reduction to Uniprocessor (RUN) provides an elegant way to semi-partition the scheduling of an array of resources such that a single resource scheduler like EDF can be used to generate schedules for any feasible task set. A multi-queue scheduler based on RBED and RUN could eliminate congestion on a switch port and provide detailed metrics for global load balancing algorithms.
Members: Andrew Shewmaker, Katia Obraczka, Carlos Maltzahn, Scott Brandt, Ivo Jimenez
While Data Center TCP can improve the behavior of data center networks, deploying it comes with deployability challenges that should be addressed. Can DCTCP avoid getting bullied by traditional TCPs without requiring separation of traffic? What are the best options for dealing with the situation where the receiver has not been modified? In what ways can DCTCP be enhanced such that it becomes an option for communication between Data Centers and clients outside the datacenter? Can the benefits of DCTCP be extended outside data centers?
In addition to deployability, questions arise concerning DCTCP’s stability, robustness, and ability to support different Quality of Service (QoS) requirements. What can be done to make DCTCP more stable in general and less vulnerable to ACK-loss? If DCTCP succeeds in making packet losses due to congestion rare, should the response to congestion still be a halving of the window? Can enhancements for QoS be made practical?
RBED: Rate Based Earliest Deadline First
Members: Scott Brandt, Scott Banachowski, Caixue Lin, Timothy Bisson
Real-time systems are growing in complexity and real- time and soft real-time applications are becoming common in general-purpose computing environments. Thus, there is a growing need for scheduling solutions that simultaneously support processes with a variety of different timeliness constraints. Toward this goal we have developed the Resource Allocation/Dispatching (RAD) integrated scheduling model and the Rate-Based Earliest Deadline (RBED) integrated multi-class real-time scheduler based on this model. We present RAD and the RBED scheduler and formally prove the correctness of the operations that RBED employs. We then describe our implementation of RBED and present results demonstrating how RBED simultaneously and seamlessly supports hard real-time, soft real-time, and best-effort processes.
DP-FAIR: A Simple Model for Understanding Optimal Multiprocessor Scheduling
Members: Greg Levin, Caitlin Sadowski, Ian Pye, Scott Brandt (with Shelby Funk)
We consider the problem of optimal real-time scheduling of periodic and sporadic tasks for identical multiprocessors. A number of recent papers have used the notions of fluid scheduling and deadline partitioning to guarantee optimality and improve performance. Prior work has been theoretically challenging and algorithmically complex. With this project, we have developed a unifying theory with the DP-FAIR scheduling policy to clarify the problem of optimal scheduling. We have also devised a simple scheduling algorithm, DP-WRAP, which serves as a least common ancestor to more complex recent algorithms. We are now working to extend the DP-FAIR scheduling theory to broader problem domains, and to refine DP-WRAP to improve its performance.
SNS: A Simple Model for Understanding Optimal Hard Real-Time Multiprocessor Scheduling, Greg Levin, Caitlin Sadowski, Ian Pye, Scott Brandt. UCSC Tech Report. (This is the preliminary version of the work that would eventually be published, in collaboration with Shelby Funk, as DP-FAIR. It contains a section on Scheduling Duality and an extended survey of other work in the field.)
RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor
Members: Greg Levin, Scott Brandt (with Paul Regnier, George Lima, Ernesto Massa)
Existing optimal multiprocessor real-time schedulers incur significant overhead for preemptions and migrations. We present RUN, a new approach to optimal scheduling which reduces the multiprocessor problem to a series of uniprocessor problems. RUN is observed to have no more than three preemptions per job, reduces to Partitioned EDF whenever a proper partitioning is found, and significantly outperforms existing optimal algorithms.
RAD-Flows: Buffer Management for Predictable Communication
Members: Roberto Pineiro, Scott Brandt, Carlos Maltzahn, Kleoni Ioannidou
Real-time systems and applications are becoming increasingly complex and often comprise multiple communicating tasks. The management of the individual tasks is well-understood, but the interaction of communicating tasks with different timing characteristics is less well-understood. We address several representative inter-task communication flows via reserved memory buffers (possibly interconnected via a real-time network) and present RAD-Flows, a model for managing these interactions. We provide proofs and simulation results demonstrating the correctness and effectiveness of RAD-Flows, allowing system designers to determine the amount of memory required based upon the characteristics of the interacting tasks and to guarantee real-time operation of the system as a whole.
Fahrrad: Efficient Performance Guarantees on Storage Devices
Members: Dimitris Skourtis, Scott Brandt, Carlos Maltzahn, Kleoni Ioannidou
Alumni: Tim Kaldewey, Anna Povzner
Guaranteed I/O performance is needed for a variety of applications ranging from real-time data collection to desktop multimedia to large-scale scientific simulations. Reservations on throughput, the standard measure of disk performance, fail to effectively manage disk performance due to the orders of magnitude difference between best-, average-, and worst-case response times, allowing reservation of less than 0.01% of the achievable bandwidth. We show that by reserving disk resources in terms of utilization it is possible to create a disk scheduler that supports reservation of nearly 100% of the disk resources, provides arbitrarily hard or soft guarantees depending upon application needs, and yields efficiency as good or better than best-effort disk schedulers tuned for performance. We present the architecture of our scheduler, prove the correctness of its algorithms, and provide results demonstrating its effectiveness.