Chetan V. Kokate


Oct 19, 2013




Advanced Operating Systems

FALL 2008


Dynamic Load Sharing and Balancing

Process Migration

Process Implementation

Real Time Scheduling

Stork and Data-Aware Schedulers [1].

The unbounded increase in the computation and data requirements of
scientific applications has necessitated the use of widely distributed
computing and storage resources to meet the demand.

Traditional systems closely couple data placement and computation, and
consider data placement as a side effect of computation.

The insufficiency of the traditional systems and existing CPU
schedulers in dealing with the complex data handling problem has yielded
a new emerging era: the data-aware schedulers.

One of the first examples of such schedulers is the Stork data placement
scheduler.

Data placement jobs are categorized into different types because each type
can have different priorities and different optimization mechanisms.

Examples: Register, Unregister, Transfer, Release, etc.
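The idea that each job type carries its own priority can be sketched as a
small priority queue. This is a minimal illustration only: the class name
PlacementQueue and the priority values are hypothetical and are not taken
from Stork.

```python
import heapq
from enum import Enum
from itertools import count


class JobType(Enum):
    REGISTER = "register"
    UNREGISTER = "unregister"
    TRANSFER = "transfer"
    RELEASE = "release"


# Hypothetical per-type priorities (lower value is served first).
# These numbers are assumptions for illustration, not Stork's settings.
PRIORITY = {
    JobType.RELEASE: 0,
    JobType.TRANSFER: 1,
    JobType.REGISTER: 2,
    JobType.UNREGISTER: 2,
}


class PlacementQueue:
    """Queue of data placement jobs ordered by the priority of their type."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker: FIFO order among equal priorities

    def submit(self, job_type, payload):
        entry = (PRIORITY[job_type], next(self._seq), job_type, payload)
        heapq.heappush(self._heap, entry)

    def next_job(self):
        _, _, job_type, payload = heapq.heappop(self._heap)
        return job_type, payload
```

Keeping the priority table separate from the queue means a per-type policy
can change without touching the code that submits jobs.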

In the traditional systems, data placement is handled by the scientific
applications themselves. An example is a well-known bioinformatics
application: Blast.

Blast aims to decode genetic information and map genomes of different species
including humankind. Blast uses comparative analysis techniques while doing this
and searches for sequence similarities in protein and DNA databases by comparing
unknown genetic sequences (on the order of billions) to the known ones.

Primarily oriented toward scientific applications.

Stochastic Approach to Scheduling [2].

The focus is on scheduling multiple divisible tasks on a heterogeneous
distributed computing system.

The “stochastic” approach, which was previously applied to DAG scheduling, is
employed for scheduling a group of multiple divisible as well as whole
independent tasks.

It explicitly considers the standard deviations (temporal heterogeneity) in
addition to the mean execution times in deriving a schedule, in order to
model more closely what would actually happen “on average” on a temporally
heterogeneous system (instead of approximating the random weights by their
means only as in other approaches).

Through an extensive computer simulation, it has been shown that the
approach can improve schedules significantly over those by a scheme which
uses the average weights only.

A conventional scheduling scheme which considers only the means of task execution
times is not able to find the best possible schedule in a heterogeneous environment.

A new approach to scheduling a group of independent divisible and
non-divisible tasks is the stochastic approach.

The first implementation of the approach is based on the Max-Min and Min-Min
algorithms, called the stochastic Max-Min and Min-Min.

The schedules derived by the proposed approach are significantly better in terms of
the average parallel execution time than those by the static Max-Min and Min-Min,
which consider only the average execution times of tasks.

Also, the stochastic Max-Min and Min-Min are able to accurately predict the actual
performance one can expect on a temporally heterogeneous distributed computing
system, i.e. the schedule length obtained by the stochastic Max-Min and Min-Min
is very close to the average parallel execution time.
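To make the role of the standard deviation concrete, here is a minimal
Min-Min sketch in which each (task, machine) pair has a mean and a standard
deviation of execution time. The cost "mean + risk * std" is a simple
stand-in chosen for illustration; it is an assumption and not the stochastic
formulation of [2]. With risk = 0 the sketch reduces to a Min-Min that uses
mean execution times only.

```python
def min_min(tasks, n_machines, risk=0.0):
    """Greedy Min-Min scheduling of whole (non-divisible) tasks.

    tasks: {task_name: [(mean, std) for each machine]}
    risk:  weight on the standard deviation (assumed cost model,
           not the one used in the cited paper)
    Returns (assignment, schedule_length).
    """
    ready = [0.0] * n_machines   # time at which each machine becomes free
    assignment = {}
    remaining = dict(tasks)
    while remaining:
        best = None              # (completion_time, task_name, machine)
        for name, stats in remaining.items():
            for m, (mean, std) in enumerate(stats):
                finish = ready[m] + mean + risk * std
                if best is None or finish < best[0]:
                    best = (finish, name, m)
        # Commit the pair with the smallest completion time overall.
        finish, name, m = best
        assignment[name] = m
        ready[m] = finish
        del remaining[name]
    return assignment, max(ready)
```

For a single task with statistics [(4.0, 2.0), (5.0, 0.0)], risk=0.0 picks
machine 0 (the smaller mean), while risk=1.0 picks machine 1, because the
variability of machine 0 now counts against it.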

Long-Term CPU Load Prediction System [3].

There exist distributed processing environments composed of many
heterogeneous computers.

It is required to schedule distributed parallel processes in an appropriate manner.

For such scheduling, predicting the execution load of a process is an effective
way to exploit the resources of the environment.

A neural network that selects the appropriate prediction module according to
the state of the changing CPU load has recently been proposed.

The selection is expected to improve prediction accuracy in both steady and
unsteady states, because each prediction method predicts more accurately
under a different condition.

The NN is a 3-layer perceptron, and each layer is fully connected.

The input layer has 6 cells and the output layer has 3.

The last three observed load values V(T), V(T − 1), V(T − 2) and
their three differences ΔV(T), ΔV(T − 1), ΔV(T − 2) are fed to the
input layer.

Each cell in the output layer corresponds to one prediction module
(LAST, the process search method, or the runtime-based method), and the
cell that fires most strongly indicates the prediction module to be
selected at that time.
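The selection step can be sketched as a forward pass through a small fully
connected network followed by an argmax over the three output cells. This is
an illustration under assumptions: the differences are taken as successive
differences of the observed loads (which needs a fourth observation), the
activation is a sigmoid, and the weights are supplied by the caller; none of
these details come from [3].

```python
import math

MODULES = ["LAST", "process search", "runtime-based"]


def features(loads):
    """Build the 6 inputs from four observations, oldest first.

    loads = [V(T-3), V(T-2), V(T-1), V(T)]; the successive-difference
    definition used here is an assumption.
    """
    v3, v2, v1, v0 = loads
    values = [v0, v1, v2]                 # V(T), V(T-1), V(T-2)
    diffs = [v0 - v1, v1 - v2, v2 - v3]   # assumed differences
    return values + diffs


def layer(inputs, weights, biases):
    """One fully connected layer with a sigmoid activation."""
    out = []
    for row, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(row, inputs)) + b
        out.append(1.0 / (1.0 + math.exp(-z)))
    return out


def select_module(loads, w_hidden, b_hidden, w_out, b_out):
    """Return the module whose output cell fires most strongly."""
    hidden = layer(features(loads), w_hidden, b_hidden)
    out = layer(hidden, w_out, b_out)
    return MODULES[out.index(max(out))]
```

In practice the weights would come from training on observed load traces;
here they are plain nested lists so the sketch stays dependency-free.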

As future work, parameters can be refined to improve prediction accuracy,
and scheduling algorithms can be developed to exploit long-term CPU load
predictions.

Enhancing Job Scheduling on NOWs [4].

Using Simulation, Historical and Hybrid Estimation Systems.

An estimation engine termed CISNE has been introduced
into the job scheduling system.

Three different estimation methods have been proposed and
implemented in the CISNE system: a simulation tool, a
historical system and an integration of both (hybrid).

CISNE is a new scheduling environment.

The main objective of the CISNE system is to manage parallel applications in a
non-dedicated environment, ensuring benefits for the parallel applications while
preserving the local task responsiveness.

When a parallel job is submitted to the CISNE system, the job waits in a queue until
the Queues Manager decides to schedule it.

This decision is taken according to the computational requirements of each
parallel job waiting in the queue together with the Node State received from
each node. The Node State includes the local load and the amount of idle
computational resources on each node.

Once a job is selected from the Jobs Queue, CISNE will select the best subset
of nodes to execute it.

Job Selection Policy (JSP) is the policy for selecting the next job to run from
the waiting queue. This could depend on the job's priorities (order in the queue)
and the cluster state (intrusion level into the local workload, the
Multiprogramming Level (MPL) of parallel applications, the memory and CPU usage
and the available nodes).

Node Selection Policy (NSP) is the policy for distributing the parallel tasks
among the nodes. This depends on the cluster state and the parallel job's
characteristics.
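The interplay of the two policies can be sketched as follows. The concrete
rules are assumptions made for illustration (a first-fit JSP in queue order,
and an NSP that prefers the least locally loaded nodes under a load
threshold); they are not the policies implemented in CISNE, and all names
below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    local_load: float   # fraction of CPU used by local tasks, 0..1
    idle_mem_mb: int    # memory not used by the local workload


@dataclass
class Job:
    name: str
    n_tasks: int
    mem_per_task_mb: int


def select_nodes(job, nodes, max_local_load=0.5):
    """Hypothetical NSP: pick the least-loaded nodes that can host a task."""
    eligible = [
        n for n in nodes
        if n.local_load <= max_local_load
        and n.idle_mem_mb >= job.mem_per_task_mb
    ]
    if len(eligible) < job.n_tasks:
        return None  # not enough suitable nodes right now
    eligible.sort(key=lambda n: n.local_load)
    return eligible[:job.n_tasks]


def select_job(queue, nodes):
    """Hypothetical JSP: first job in queue order whose tasks can be placed."""
    for job in queue:
        chosen = select_nodes(job, nodes)
        if chosen is not None:
            return job, chosen
    return None
```

The load threshold is what protects local task responsiveness in this
sketch: a node that is busy with local work is never offered to a parallel
job.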

Dynamic Scheduling with Process Migration [5].

The migration cost has been modeled to introduce an effective method to predict the cost
of process migration.

The dynamic scheduling mechanism considers migration cost as well as other
conventional influential factors for performance optimization in a shared,
heterogeneous environment.
Experimental results show that the proposed dynamic scheduling system is feasible and
improves the system performance considerably.

The design of a migration-based dynamic scheduling is fourfold: reschedule
triggering, migration cost modeling, task scheduling, and parameter measurement.

Select the destination machine based on an estimate of the completion time of the
migrated process. When an application consists of multiple processes running
concurrently on different machines, we need to consider the overall application
completion time as a selection criterion.

Assumption: an application is located on machine, m

Objective: dynamically reallocate an application when an abnormality is noticed


On receiving the triggering signal:

List a set of idle machines that are lightly loaded over an observed time period,
M = {m_1, m_2, ..., m_q};

p = 1;

For each machine m_k (1 ≤ k ≤ q),

    Calculate the migration cost,

    Calculate the mean of the remote task execution time,

    Calculate the application completion time, T_k,

    If T_p > T_k, then p = k;

End For

Migrate the application from its current machine to m_p.
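The selection loop can be sketched in a few lines. The completion-time model
used here (migration cost plus mean remote execution time) is an assumption
for illustration; the cost model in [5] is more detailed, and the function
name pick_destination is hypothetical.

```python
def pick_destination(candidates):
    """Choose the machine with the smallest estimated completion time.

    candidates: list of (machine, migration_cost, mean_remote_exec_time)
    for the idle, lightly loaded machines. The sum used below is an
    assumed completion-time model, not the one from the cited paper.
    Returns (machine, estimated_completion_time).
    """
    best_machine, best_time = None, float("inf")
    for machine, migration_cost, mean_exec in candidates:
        completion = migration_cost + mean_exec
        if completion < best_time:
            best_machine, best_time = machine, completion
    return best_machine, best_time
```

A machine that is cheap to migrate to is not automatically the best choice:
in the call pick_destination([("m1", 5.0, 20.0), ("m2", 12.0, 10.0)]) the
second machine wins despite its higher migration cost.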



A Real Time Scheduler Using Generic Neural Network [6].

A generic neural network scheduler has recently been proposed for
scheduling a set of jobs with deadlines on a set of resources in
critical real time applications, in which a schedule is to be obtained
within a short time span.

It is based on the GENET network model with progressive stochastic search.

To cope with the bi-criterion of deadlines and optimization, a
heuristic policy which is modified from the earliest deadline first
policy and an optimal mechanism are embedded into the proposed
scheduler.
The real time scheduling problem could be viewed as a constraint satisfaction
problem (CSP).

The following constraints need to be taken into account:

Order Constraints

Overlap Constraints

Deadline Constraints
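Viewing the problem as a CSP means a candidate schedule is acceptable only
if it violates none of these constraints. The checker below is a minimal
sketch; the interpretation of each constraint type (precedence between jobs,
no two jobs overlapping on one resource, finishing by the deadline) is the
usual one and is assumed here rather than taken from [6].

```python
from dataclasses import dataclass


@dataclass
class Slot:
    job: str
    resource: str
    start: int
    duration: int
    deadline: int

    @property
    def end(self):
        return self.start + self.duration


def violations(slots, order):
    """List the constraint violations of a candidate schedule.

    order: list of (before, after) job pairs (assumed precedence relation).
    """
    by_job = {s.job: s for s in slots}
    found = []
    # Order constraints: 'before' must finish before 'after' starts.
    for before, after in order:
        if by_job[before].end > by_job[after].start:
            found.append(("order", before, after))
    # Overlap constraints: jobs on the same resource must not overlap in time.
    for i, a in enumerate(slots):
        for b in slots[i + 1:]:
            if a.resource == b.resource and a.start < b.end and b.start < a.end:
                found.append(("overlap", a.job, b.job))
    # Deadline constraints: every job must finish by its deadline.
    for s in slots:
        if s.end > s.deadline:
            found.append(("deadline", s.job))
    return found
```

A local search scheduler can use the number of violations as the quantity to
drive toward zero; an empty list means the schedule is a solution of the CSP.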

GENET is a local search approach with a neural network connectionist
architecture for solving finite CSPs with binary constraints.

In the GENET network, a binary CSP (U, D, C) can be represented by a set of label
nodes and weighted incompatible connections.

The GENET neural network scheduler consists of 3 types of neurons.

Energy-Aware Soft Real-Time Scheduling for Multi-Radio Embedded Devices [7].

An energy-efficient scheduling algorithm for the data communications of
soft real-time periodic tasks on multi-radio embedded devices was recently
proposed.
To cope with the dynamic fluctuation of channel conditions, a feedback
mechanism that monitors radio throughput is introduced to guarantee
real-time behaviors.

A formal analysis of the scheduling has also been done, exploring the
relationship between tardiness bounds and energy savings and that between
network stability and (m, k)-firm deadline guarantees.
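An (m, k)-firm guarantee is commonly defined as: among any k consecutive
jobs of a task, at least m meet their deadlines. The sliding-window check
below is a minimal sketch of that standard definition; the function name is
hypothetical and the code is not part of the analysis in [7].

```python
def satisfies_mk_firm(met, m, k):
    """Check an (m, k)-firm guarantee over a record of deadline outcomes.

    met: list of booleans, True where a job met its deadline.
    Returns True if every window of k consecutive jobs has at least m hits.
    """
    if len(met) < k:
        return True  # no complete window of k jobs has been observed yet
    hits = sum(met[:k])          # hits in the first window
    if hits < m:
        return False
    for i in range(k, len(met)):
        # Slide the window by one job: add the new outcome, drop the oldest.
        hits += met[i] - met[i - k]
        if hits < m:
            return False
    return True
```

For example, with m = 2 and k = 3 the record [True, False, True, True,
False, True] passes, while [True, False, False, True] fails because its
first window contains only one met deadline.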

This approach is applicable to real-time communication scheduling problems in
which background interference and other environmental factors are not known a
priori.
The method is local in that it does not require broad
knowledge about the networking environment, and is
therefore suited to scenarios where interference is

The method results in significant energy savings by
allowing radios to be turned off in ways that still allow
real-time guarantees to be made.

There are several interesting areas of future work.


[1] “A new paradigm in data intensive computing: Stork and the data-aware
schedulers” Kosar, T.; Challenges of Large Applications in Distributed
Environment, 2006 IEEE 19 June 2006 Page(s):5


[2] “Stochastic Approach to Scheduling Multiple Divisible Tasks on a
Heterogeneous Distributed Computing System” Kamthe, A.; Lee, S.-Y.; Parallel
and Distributed Processing Symposium, 2007. IPDPS 2007. IEEE International
26-30 March 2007 Page(s): 1

[3] “Long-Term CPU Load Prediction System for Scheduling of Distributed
Processes and Implementation” Sugaya, Y.; Tatsumi, H.; Kobayashi, M.; Aso, H.;
Advanced Information Networking and Applications, 2008. AINA 2008. 22nd
International Conference on 25 March 2008 Page(s):971


[4] “Using Simulation, Historical and Hybrid Estimation Systems for Enhancing
Job Scheduling on NOWs” Hanzich, M.; Hernandez, P.; Luque, E.; Gine, F.;
Solsona, F.; Lerida, J.L.; Cluster Computing, 2006 IEEE International
Conference on 25-28 Sept. 2006 Page(s):1-12.

[5] “Dynamic Scheduling with Process Migration” Du, Cong; Sun, Xian-He; Wu, Ming;
Cluster Computing and the Grid, 2007. CCGRID 2007. Seventh IEEE International
Symposium on May 2007 Page(s):92


[6] “A Real Time Scheduler Using Generic Neural Network for Scheduling with
Deadlines” Xin Feng; Lixin Tang; Hofung Leung; Neural Networks and Brain, 2005.
ICNN&B ‘05. International Conference on Volume 1, 15 Oct. 2005 Page(s):504


[7] “Energy-Aware Soft Real-Time Scheduling for Multi-Radio Embedded Devices”