
Jesus Antonio Gonzalez Bernal

University of Texas at Arlington

(email: gonzalez@cse.uta.edu)

CSE6306

April 23, 1999




ARTIFICIAL INTELLIGENCE AND DISTRIBUTED OPERATING SYSTEMS

Abstract

The interactions between Artificial Intelligence applications and Distributed Systems architectures are shown through examples taken from different applications. Most of the examples relate to the area of "Machine Learning" applied to real-world domains. In this paper we argue that the way to speed up Artificial Intelligence techniques is through the use of Distributed Systems environments.


1 Introduction

In this paper we describe how Artificial Intelligence (AI) and Distributed Operating Systems (DS) interact in the development of applications. There are two ways in which AI and DS might interact. The first is the use of DS as the base platform on which to develop an AI application. The second is the use of AI techniques to develop a control module for one or more of the services that a DS provides.

In the AI field there are several reasons to use DS. One is the need to reduce the running time of AI programs by exploiting multiprocessors. Another stems from the nature of some applications, such as the coordination of agents that reside on different computers and need to communicate to accomplish their goal; this communication can be achieved using different approaches provided by DS. A further reason is the use of distributed knowledge bases.


In the field of DS, AI paradigms are used as the decision-making tool for service assignment; an example is a load-balancing module. We refer to several applications below to show some of these AI and DS interactions.


2 Distributed Natural Language Summarization [Dragomir].

We have experienced that in some cases a search on the web consumes a lot of time, especially when reviewing the information retrieved for a specific topic. Furthermore, there are cases in which we would like to automatically receive updates about that topic as soon as they appear.

The purpose of this application is to solve those problems by providing summaries of the information found in the different sources; the summaries are shown to the user in natural language, and new information about the topic is automatically processed and sent to the user in the same format. For example, if the user is interested in the DS topic, the objective of the system is to continuously send him summaries (in natural language) of the updates on the topic from different sources. It is important to mention that the summaries consider multiple articles from multiple sources.

The approach used to solve the problem considers several types of agents. The first type has the task of collecting information from different sources and is called the "data collector". The second type of agent is responsible for the task of summarizing the information; this agent is called the "summarizer". The third type of agent functions as a database server and provides ontological information used to add or relate information in the natural language construction phase; this agent is called the "database server". The fourth type of agent is the "planner", which is responsible for maintaining the coordination between the facilitator agents to update the summarizer knowledge base (keeping track of the information that has already been sent to the user).

The communication module is used for the interchange of messages in the distributed environment. This module uses the KQML [Finin et al. 1994] standard, which provides the communication protocol and the message format.
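
To make the message format concrete, here is a small sketch that composes a KQML-style performative as plain text (the performative, agent names, ontology, and the kqml_message helper are illustrative assumptions, not details from Radev's system):

    # Minimal sketch of composing a KQML-style message in Python.
    # KQML messages are s-expressions with a performative and :keyword fields.
    # All field values below (agent names, ontology, content) are hypothetical.
    def kqml_message(performative, **fields):
        body = " ".join(":%s %s" % (key.replace("_", "-"), value)
                        for key, value in fields.items())
        return "(%s %s)" % (performative, body)

    msg = kqml_message(
        "tell",
        sender="data-collector-1",
        receiver="summarizer",
        language="KIF",
        ontology="news-articles",          # hypothetical ontology name
        content='"(article ds-update-42)"',
    )
    print(msg)
    # (tell :sender data-collector-1 :receiver summarizer :language KIF
    #  :ontology news-articles :content "(article ds-update-42)")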

The environment is naturally distributed, since the information to summarize is spread over many servers, the ontological information resides on different servers (as do the different types of agents), and the users are at different locations. This is an example of those applications that require a DS model because the parts of the application are essentially distributed. The application performance depends strongly on its communication capability. Some of the essential properties of the application are location transparency and communication transparency.


3 Distributed Knowledge Networks [Honavar et al].

Technology advances such as the increase in disk storage capacity and the increase in networking communication speed make it possible to develop applications that exploit very large, geographically distributed sources of information through agents that provide information to a decision-maker. The effectiveness of this type of application depends on two essential factors directly related to the decision-maker. First, the decision-maker needs the right information at the right time, and second, the decision-maker should not lose time reviewing irrelevant information.

An elegant approach to address this problem is the use of Distributed Knowledge Networks (DKN). Distributed Knowledge Networks make use of different types of agents and tools to solve the different problems found in creating an application in the environment described above. The problems to overcome and goals to achieve are:


- Geographically distributed data sources. DKN use information assistants (agents) to perform intelligent, selective, and context-sensitive data gathering at the different locations. Ideally these agents learn about user preferences so that they work according to the user's interests.

- Transmitting the least possible amount of information over the network. DKN use mobile software agents to provide this capability. The concept is that the agent is transported to the appropriate site, performs its computation there, and transmits only the results over the network.

- Working in heterogeneous systems, because data resides on machines with different hardware (including sensors) and software. DKN intelligent software agents are able to operate in this environment.

- Working with different types of data, such as text, images, relational databases, etc. DKN put the information into a data warehouse that is used to perform further analysis using data discovery techniques.

- Information in the data sources changes continuously. The DKN software agents are able to detect those changes and update the knowledge repositories.

- Managing large volumes of data. The purpose of the software agents is not only to provide the information found to the decision-maker, but also to search for hidden information that might be implied by the relations among the specific information in the databases. DKN handle this using the knowledge-discovery process, which includes the use of data-mining tools.

- Modular design. The complexity of the overall approach requires partitioning the problem into sub-tasks that are more easily solved. DKN consider each data source as a module, which is managed by "more or less" autonomous agents. This creates the need to coordinate those agents.

- Other issues to consider for the development of DKN applications are those concerned with security, reliability, fault tolerance, and performance.


Having described the type of problems to address in the construction of DKN applications, let us describe how it is done, what the modules are, and which technology is used to implement each module of the DKN. The DKN consists of a mobile agent infrastructure composed of intelligent agents for information retrieval, information extraction, assimilation, and knowledge discovery; it also requires software to deal with the coordination and control of the interacting agents.

The system is based on an object design. Mobile agents can move from one site to another to achieve their tasks. This is done to avoid sending large amounts of useless information across the network; instead, the information is processed on-site and only the result is transmitted through the network. The mobile agent infrastructure needed to accomplish this task is provided by three components: agent servers, an agent interface, and agent brokers. A platform-independent implementation has been designed using Java and CORBA (Common Object Request Broker Architecture).
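
As a minimal sketch of that "process on-site, transmit only the result" idea (the AgentServer class, the record format, and the query are hypothetical; the real infrastructure is the Java/CORBA design described above):

    # Sketch: a mobile agent runs at the data site and returns only a summary.
    # Everything here (AgentServer, the records, the query) is hypothetical.
    class AgentServer:
        """Stands in for a remote site that hosts data and runs visiting agents."""
        def __init__(self, records):
            self.records = records          # large local dataset; never leaves the site

        def run_agent(self, agent_code):
            # The visiting agent computes against the local data in place.
            return agent_code(self.records)

    def count_matches_agent(records):
        # Agent logic shipped to the site: count records mentioning a topic.
        return sum(1 for r in records if "distributed" in r)

    site = AgentServer(records=["distributed systems intro",
                                "cooking recipes",
                                "distributed knowledge networks"])
    # Only the integer result crosses the "network", not the records.
    print(site.run_agent(count_matches_agent))   # 2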


Several types of agents can be used in the implementation. We can use reactive agents, deliberative agents, utility-driven agents, learning agents, or a combination of them. Machine learning techniques have been used to achieve the tasks of information retrieval, information assimilation, and knowledge discovery. Data mining tools are used for the knowledge discovery task applied to the constructed data warehouse. Some of the data mining tools that can be used for this purpose are neural networks, statistical methods, syntactic methods, rule induction, and evolutionary techniques, depending on what is being looked for. The contract net protocol is used to perform the coordination task of the agents.
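
The contract net protocol mentioned above is, at its core, an announce/bid/award cycle. A toy sketch, assuming a lowest-estimated-cost award rule and made-up cost models:

    # Toy contract net: a manager announces a task, contractors bid,
    # and the manager awards the task to the best (here: cheapest) bid.
    def contract_net(task, contractors):
        # 1. Announce: every contractor estimates its cost for the task.
        bids = {name: estimate(task) for name, estimate in contractors.items()}
        # 2. Award: pick the contractor with the lowest estimated cost.
        winner = min(bids, key=bids.get)
        return winner, bids[winner]

    contractors = {
        "agent-a": lambda task: 10.0,      # hypothetical cost models
        "agent-b": lambda task: 4.5,
        "agent-c": lambda task: 7.2,
    }
    winner, cost = contract_net("summarize-source-3", contractors)
    print(winner, cost)                     # agent-b 4.5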

This paper clearly illustrates through DKN how a Distributed Operating Systems platform and Artificial Intelligence techniques are used together to construct an application that provides an intelligent service over geographically dispersed information sources. This architecture has been designed to provide services that are distributed in essence but that require intelligence to achieve an efficient result.


4 Parallel Search Using Machine Learning [Cook & Varnell]

The main idea presented by the authors of the paper is how parallelism improves the performance of search using the Iterative Deepening A* (IDA*) algorithm. IDA* is an optimal and complete algorithm that uses an evaluation function to decide which node is expanded next, according to:

f(n) = g(n) + h(n)


where g(n) is the cost of the path so far and h(n) is the estimated cost to the goal. IDA* can be run on a MIMD parallel processor, on a distributed network of workstations, or on a single-processor machine with multithreading.
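
For reference, the sequential IDA* loop that the parallel variants below build on can be sketched as follows (the successor and heuristic callables are assumed interfaces; cycle checking is omitted for brevity):

    import math

    def ida_star(start, is_goal, successors, h):
        """Sequential IDA*: repeated depth-first searches bounded by f = g + h."""
        threshold = h(start)
        while True:
            result = _bounded_dfs(start, 0, threshold, is_goal, successors, h)
            if isinstance(result, list):          # a solution path was found
                return result
            if result == math.inf:                # search space exhausted
                return None
            threshold = result                    # smallest f that exceeded the bound

    def _bounded_dfs(node, g, threshold, is_goal, successors, h):
        f = g + h(node)
        if f > threshold:
            return f                              # report the overflow value
        if is_goal(node):
            return [node]
        minimum = math.inf
        for child, step_cost in successors(node):
            result = _bounded_dfs(child, g + step_cost, threshold,
                                  is_goal, successors, h)
            if isinstance(result, list):
                return [node] + result
            minimum = min(minimum, result)
        return minimum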

Now let us see how the tasks can be distributed. One method is Parallel Window Search (PWS). In this approach a copy of the entire search tree is given to each processor, which performs a search using a specific threshold (or depth), so all the processors search at different thresholds. If a processor finishes the search at its threshold without finding a solution, a different threshold is assigned to it for another search. If an optimal solution is required, processors that find a solution must wait for processors that have not yet finished but that are using a smaller threshold. A second method consists of distributing the search tree among the available processors; this approach is called Distributed Tree Search (DTS). Here each processor receives a node of the tree and is responsible for the search in the whole sub-tree rooted at the received node. In this approach, all the processors search at the same threshold. If there are not as many nodes as processors, it is possible to start with a breadth-first expansion to get enough nodes to distribute among the processors.
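
PWS can then be sketched on top of the _bounded_dfs helper from the IDA* sketch above. Threads stand in for processors and the threshold set is an assumed input; a full PWS would also hand new thresholds to workers that finish without a solution:

    from concurrent.futures import ThreadPoolExecutor

    def parallel_window_search(start, is_goal, successors, h, thresholds):
        """PWS sketch: every worker searches the whole tree at its own
        threshold. Results come back in ascending-threshold order, so the
        first solution returned is the one found under the smallest bound."""
        def search_at(threshold):
            result = _bounded_dfs(start, 0, threshold, is_goal, successors, h)
            return result if isinstance(result, list) else None

        with ThreadPoolExecutor(max_workers=len(thresholds)) as pool:
            for solution in pool.map(search_at, sorted(thresholds)):
                if solution is not None:
                    return solution
        return None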

There will be cases where some processors finish their work before others. If we want to minimize the idle time of our processors we need to implement a load-balancing technique. Load balancing consists of two steps: the first is the selection of a processor to request work from; in the second step, the selected processor decides which work it is going to give away. For the first step, an idle processor can request work from its nearest neighbor. A second method is to randomly select a processor to request work from. A third method is to assign a master processor to keep track of the workload of every processor and send information to loaded processors about idle processors, so that they can send work to those idle processors. A good approach is to request work when the processor is about to finish, but with enough time to receive the work before it finishes.
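
The random-selection strategy, for instance, can be sketched in a few lines (the per-worker deques and the give-half rule are illustrative assumptions; the paper compares several such strategies rather than prescribing one):

    import random
    from collections import deque

    def steal_work(workers, idle_id):
        """Random polling: the idle worker picks a victim at random, and the
        victim decides what to give away (here: half of its pending nodes)."""
        victims = [w for w, queue in workers.items() if w != idle_id and queue]
        if not victims:
            return False                          # nobody has spare work
        victim = random.choice(victims)
        donated = len(workers[victim]) // 2       # victim decides what to give
        for _ in range(donated):
            workers[idle_id].append(workers[victim].pop())
        return donated > 0

    random.seed(0)
    workers = {0: deque(), 1: deque(["n1", "n2", "n3", "n4"]), 2: deque(["n5"])}
    steal_work(workers, idle_id=0)
    print({w: list(q) for w, q in workers.items()})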

The paper also presents the EUREKA system, which implements the strategies explained above. The EUREKA system contains a module that uses the C4.5 learning system to select the best strategies for a new problem instance.

The use of parallel processing involves message-passing tools to coordinate the processors and to distribute the work through scheduling and load balancing. In this implementation, PVM was used.


5 Reinforcement Learning and Scheduling [Zhang & Dietterich].

Reinforcement learning techniques can be used to solve the problem of job scheduling. A reinforcement learning algorithm learns policies that specify which action should be taken in each state. During the learning phase, a "reward" is assigned to each action so that good actions can be differentiated from bad actions.

The reinforcement function R(s, a, s') is the positive or negative reward for going to state s' by taking action a in state s. If s' is a good state on the path to the goal, the reward is positive (or greater); otherwise the reward is negative (or smaller).

To apply repair-based reinforcement learning to the scheduling problem, we start with an initial critical-path schedule that does not consider resource constraints. This is the initial state S_0. A reward of -0.001 is assigned to each schedule that still violates resource constraints. If the schedule is free of violations, the reward is the negative of the Resource Dilation Factor, defined (normalizing by the resource-unconstrained initial schedule S_0) as:

RDF(S) = TRUI(S) / TRUI(S_0)

where TRUI(S) is the total resource utilization index, the sum of the resource utilization index taken over all n resources and all l time units of a schedule of length l:

TRUI(S) = sum_{i=1..n} sum_{t=1..l} RUI_i(t)

The resource utilization index RUI is defined as:

RUI_i(t) = max(1, u_i(t) / c_i)

where u_i(t) is the utilization of resource i at time t, and c_i is the capacity of resource i.
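
A tiny numeric check of these definitions, under the reconstruction above (the max(1, .) floor and all utilization and capacity numbers are made up for illustration):

    # Toy check of the RDF/TRUI/RUI reconstruction above.
    # Two resources over a schedule of length l = 3; all numbers are made up.
    def rui(u, c):
        return max(1.0, u / c)     # over-allocated slots cost more than 1

    def trui(schedule, capacity):
        # schedule[i][t] = utilization of resource i at time t
        return sum(rui(schedule[i][t], capacity[i])
                   for i in range(len(schedule))
                   for t in range(len(schedule[i])))

    capacity  = [2.0, 4.0]
    s_initial = [[4.0, 4.0, 0.0], [8.0, 2.0, 0.0]]   # short but over-allocated
    s_final   = [[2.0, 2.0, 2.0], [4.0, 4.0, 2.0]]   # repaired, violation-free

    rdf = trui(s_final, capacity) / trui(s_initial, capacity)
    print("reward = -RDF =", -rdf)                    # prints roughly -0.667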

In the reinforcement learning algorithm we apply the policy π to the current state to get the best action that can be taken in state s:

a = π(s)

We also define an associated function f^π that gives the cumulative reward that will be received from state s_k onward:

f^π(s_k) = sum_{j=k..N-1} R(s_j, π(s_j), s_{j+1})

where N is the number of steps needed to find a conflict-free schedule. In order to learn this function, a feed-forward neural network f(s; w) is used to approximate f^π(s), where w is the vector of weights of the network. The temporal difference error at step j is:

e_j = R(s_j, a_j, s_{j+1}) + f(s_{j+1}; w) - f(s_j; w)

The smoothed gradient is defined by:

g_j = ∇_w f(s_j; w) + λ g_{j-1}

and the updates to the weights are given by:

w <- w + α e_j g_j

where α is the learning rate and λ is the smoothing parameter.
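
The sketch below runs one episode of these updates with a linear approximator standing in for the paper's feed-forward network (the features, constants, and the linear model itself are assumptions; the update rule is the one reconstructed above):

    import numpy as np

    def td_lambda_episode(features, rewards, w, alpha=0.01, lam=0.7):
        """One episode of the TD(lambda) updates reconstructed above.
        features[j] is the feature vector of state s_j; rewards[j] is
        R(s_j, a_j, s_{j+1}). A linear model f(s; w) = w . x stands in
        for the feed-forward network."""
        g = np.zeros_like(w)                       # smoothed gradient, g_{-1} = 0
        for j in range(len(rewards)):
            v_now  = w @ features[j]
            v_next = w @ features[j + 1]
            e = rewards[j] + v_next - v_now        # temporal difference error
            g = features[j] + lam * g              # grad of a linear model is x
            w = w + alpha * e * g                  # weight update
        return w

    rng = np.random.default_rng(0)
    states  = rng.normal(size=(6, 4))              # 6 states, 4 toy features each
    rewards = np.full(5, -0.001)                   # violation penalty per step
    print(td_lambda_episode(states, rewards, np.zeros(4)))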

The features used to train the neural network are:

- Mean and standard deviation of the free pool capacity for bottleneck pools.

- Mean and standard deviation of slacks.

- Modified RDF.

- Over-allocation index.

- Percentage of windows in violation.

- Percentage of windows in violation that can be resolved by pool reassignment.

- Percentage of time units in violation.

- First violated window index (normalized).


Although the authors of the paper did not apply the method to scheduling in a computing environment (assigning processor time, memory, I/O, etc.), I think it would be possible to try it in that environment and compare its performance with other algorithms. Something else to consider is that this algorithm could be used to schedule jobs onto multiple processors, but we are not considering having the scheduler itself distributed among several processors.


11

6 Decision Trees on Parallel Processors [Kufrin].

Decision Trees (DT) are a tool used to classify examples of data from a domain. If enough examples are used to construct the tree, it can be used to classify new examples. This process is called training the DT, so DT are considered a supervised learning method. DT have been successfully used for the task of Knowledge Discovery in Databases (KDD). Since the KDD process is applied to real-world databases, large amounts of training data are common, so the training method must be very efficient in order not to consume too much time and space.

Parallelism is one way to speed up the training phase. The architecture used might be a multiprocessor machine or a network of workstations using software like PVM; the result is Parallel Decision Trees (PDT).

DT algorithms like ID3 and C4.5 use the divide-and-conquer approach to find the solution. ID3's classification algorithm looks like this (a minimal sketch follows the list):

1. Examine the examples; if all belong to the same class, terminate.

2. Evaluate potential splits, use a goodness measure, and select the best split.

3. Divide the examples into two groups corresponding to the previous split.

4. Recursively apply the DT algorithm to the previous groups.
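
Below is a hedged sketch of that recursion using information gain as the goodness measure (the dataset format, lists of (attributes, class) pairs, is an assumption; C4.5's refinements such as gain ratio and pruning are omitted):

    import math
    from collections import Counter

    def entropy(examples):
        counts = Counter(cls for _, cls in examples)
        total = len(examples)
        return -sum((n / total) * math.log2(n / total) for n in counts.values())

    def id3(examples, attributes):
        classes = {cls for _, cls in examples}
        if len(classes) == 1:                       # step 1: pure node
            return classes.pop()
        if not attributes:                          # fall back to majority class
            return Counter(cls for _, cls in examples).most_common(1)[0][0]
        # Step 2: pick the attribute with the highest information gain.
        def gain(a):
            split = {}
            for attrs, cls in examples:
                split.setdefault(attrs[a], []).append((attrs, cls))
            return entropy(examples) - sum(
                len(part) / len(examples) * entropy(part)
                for part in split.values())
        best = max(attributes, key=gain)
        # Steps 3-4: partition on the best attribute and recurse.
        branches = {}
        for attrs, cls in examples:
            branches.setdefault(attrs[best], []).append((attrs, cls))
        remaining = [a for a in attributes if a != best]
        return (best, {value: id3(part, remaining)
                       for value, part in branches.items()})

    # Example call (hypothetical weather data):
    # id3([({"outlook": "sunny"}, "no"), ({"outlook": "rain"}, "yes")], ["outlook"])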


Some issues a DT algorithm must consider are the types of values it deals with (categorical, ordinal, and continuous), the method for choosing the partitions, the way those partitions are evaluated, and the pruning strategies.

One approach to parallelizing the DT work is to assign a node of the DT to each processor. The problem with this model is that some processors will finish early and remain idle. This problem can be solved with a master coordinator that keeps track of the workload of each node and assigns work to the idle processors.

A second method is called "Attribute-Based Decomposition". In this case, each processor is assigned an attribute to work with, so that the evaluation of all the attributes can proceed concurrently.

In the third approach the data is parallelized, so each processor works with a different subset of the data. This makes more efficient the basic operation of ID3, which consists of counting the examples that belong to a class with an attribute set to a specific value (the frequencies of attribute-value pairs per class over the training examples). In this approach we have a coordinator processor which does not keep any data but receives the counts from the worker nodes to decide which split is best. The coordinator also notifies each of the workers which split they have to make, and it keeps the structure of the DT. If parallelization is done over a network of workstations, a message passing system is needed to transfer the information required to accomplish the task. The authors argue that the network communication overhead does not hurt the DT algorithm's performance, because only small count tables are transmitted while the expensive counting work is sped up by the data parallelism.
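
A sketch of that data-parallel counting step: each worker computes (attribute, value, class) counts over its own shard and the coordinator merges the small count tables (the shard layout and the use of multiprocessing in place of PVM workers are assumptions):

    from collections import Counter
    from multiprocessing import Pool

    def shard_counts(shard):
        """Worker side: count (attribute, value, class) triples in one shard."""
        counts = Counter()
        for attrs, cls in shard:
            for attribute, value in attrs.items():
                counts[(attribute, value, cls)] += 1
        return counts

    def merge_counts(shards):
        """Coordinator side: sum the workers' count tables; only these small
        tables cross the network, never the raw training data."""
        total = Counter()
        with Pool() as pool:
            for counts in pool.map(shard_counts, shards):
                total.update(counts)
        return total

    if __name__ == "__main__":
        shards = [
            [({"outlook": "sunny"}, "no"), ({"outlook": "rain"}, "yes")],
            [({"outlook": "sunny"}, "no"), ({"outlook": "overcast"}, "yes")],
        ]
        print(merge_counts(shards))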


7 Conclusion

The description of the previous applications and approaches, which solve problems by integrating Artificial Intelligence techniques with Distributed Systems architectures, gives us an idea of some of the current areas of study in Artificial Intelligence.

Artificial Intelligence approaches are usually expensive, and there is constant interest in reducing the response time of their algorithms. One way to do this is to create new algorithms that outperform the previous ones, but this is not always possible. A second way is to wait for faster processors, but there will always be something that even the fastest processor cannot handle. Finally, the most viable option is the use of Distributed Systems techniques that help us maximize the use of the available resources and in that way make it possible to speed up our applications.

Another important achievement obtained with the use of Distributed Systems architectures is the ability to build complex, geographically distributed environments coordinated by intelligent agents.

In the case of the use of Artificial Intelligence techniques to control Distributed Systems services, we did not cover specific mechanisms that have been created for that purpose; instead we showed some basic techniques that can be applied to it (like the job scheduling technique already explained).



References

Cook, D. J. and Varnell, R. C., "Maximizing the Benefits of Parallel Search Using Machine Learning", Proceedings of the National Conference on Artificial Intelligence, 1997.

Finin, T., Fritzson, R., McKay, D. and McEntire, R., "KQML as an Agent Communication Language", Proceedings of the Third International Conference on Information and Knowledge Management (CIKM '94), 1994.

Honavar, V., Miller, L. and Wong, J., "Distributed Knowledge Networks", Proceedings of the IEEE Information Technology Conference, Syracuse, NY, 1998.

Kufrin, R., "Decision Trees on Parallel Processors", Third Workshop on Parallel Processing for Artificial Intelligence, 1997.

Radev, Dragomir R., "An Architecture for Distributed Natural Language Summarization", Proceedings of the 8th International Workshop on Natural Language Generation: Demonstrations and Posters, pages 45-48, Herstmonceux, England, June 1996.

Zhang, W. and Dietterich, T. G., "A Reinforcement Learning Approach to Job-Shop Scheduling", Proceedings of IJCAI-95, 1995.