
EINSTEIN COLLEGE OF ENGINEERING

ARTIFICIAL INTELLIGENCE

Lecture Notes - CS1351

SUJATHA.K

12/15/2010




Authors of reference book: Stuart Russell & Peter Norvig


Chapter 1

Characterizations of Artificial Intelligence

Artificial Intelligence is not an easy science to describe, as it has fuzzy borders with mathematics, computer science, philosophy, psychology, statistics, physics, biology and other disciplines. It is often characterized in various ways, some of which are given below. I'll use these categorizations to introduce various important issues in AI.

1.1 Long Term Goals

Just what is the science of Artificial Intelligence trying to achieve? At a very high level, you will hear AI researchers categorized as either 'weak' or 'strong'. The 'strong' AI people think that computers can achieve consciousness (although they may not be working on consciousness issues). The 'weak' AI people don't go that far. Other people talk of the difference between 'Big AI' and 'Small AI'. Big AI is the attempt to build robots of intelligence equaling that of humans, such as Lieutenant Commander Data from Star Trek. Small AI is all about getting programs to work for small problems and trying to generalize the techniques to work on larger problems. Most AI researchers don't worry about things like consciousness and concentrate on some of the following long term goals.

Firstly, many researchers want to:

Produce machines which exhibit intelligent behavior.

Machines in this sense could simply be personal computers, or they could be robots with embedded systems, or a mixture of both. Why would we want to build intelligent systems? One answer appeals to the reasons why we use computers in general: to accomplish tasks which, if we did them by hand, would be error prone. For instance, how many of us would not reach for our calculator if required to multiply two six digit numbers together? If we scale this up to more intelligent tasks, then it should be possible to use computers to do some fairly complicated things reliably. This reliability may be very useful if the task is beyond some cognitive limitation of the brain, or when human intuition is counter-constructive, such as in the Monty Hall problem described below, which many people - some of whom call themselves mathematicians - get wrong.

Another reason we might want to construct intelligent machines is to enable us to do things we couldn't do before. A large part of science is dependent on the use of computers already, and more intelligent applications are increasingly being employed. The ability for intelligent software to increase our abilities is not limited to science, of course, and people are working on AI programs which can have a creative input to human activities such as composing, painting and writing.

Finally, in constructing intelligent machines, we may learn something about intelligence in humanity and other species. This deserves a category of its own. Another reason to study Artificial Intelligence is to help us to:

Understand human intelligence in society.

AI can be seen as just the latest tool in the philosopher's toolbox for answering questions about the nature of human intelligence, following in the footsteps of mathematics, logic, biology, psychology, cognitive science and others. Some obvious questions that philosophy has wrangled with are: "We know that we are more 'intelligent' than the other animals, but what does this actually mean?" and "How many of the activities which we call intelligent can be replicated by computation (e.g., algorithmically)?"

For example, the ELIZA program discussed below is a classic example from the sixties where a very simple program raised some serious questions about the nature of human intelligence. Amongst other things, ELIZA helped philosophers and psychologists to question the notion of what it means to 'understand' in natural language (e.g., English) conversations.

By stating that AI helps us understand the nature of human intelligence in society, we should note that AI researchers are increasingly studying multi-agent systems, which are, roughly speaking, collections of AI programs able to communicate and cooperate/compete on small tasks towards the completion of larger tasks. This means that the social, rather than individual, nature of intelligence is now a subject within range of computational studies in Artificial Intelligence.

Of course, humans are not the only life-forms, and the questions of life (including intelligent life) pose even bigger questions. Indeed, some Artificial Life (ALife) researchers have grand plans for their software. They want to use it to:

Give birth to new life forms.

A study of Artificial Life will certainly throw light on what it means for a complex system to be 'alive'. Moreover, ALife researchers hope that, in creating artificial life-forms, given time, intelligent behaviour will emerge, much like it did in human evolution. Hence, there may be practical applications of an ALife approach. In particular, evolutionary algorithms (where programs and parameters are evolved to perform a particular task, rather than to exhibit signs of life) are becoming fairly mainstream in AI.

A less obvious long term goal of AI research is to:

Add to scientific knowledge.

This is not to be confused with the applications of AI programs to other sciences, discussed later. Rather, it is worth pointing out that some AI researchers don't write intelligent programs and are certainly not interested in human intelligence or breathing life into programs. They are really interested in the various scientific problems that arise in the study of AI. One example is the question of algorithmic complexity - how bad will a particular algorithm get at solving a particular problem (in terms of the time taken to find the solution) as the problem instances get bigger. These kinds of studies certainly have an impact on the other long term goals, but the pursuit of knowledge itself is often overlooked as a reason for AI to exist as a scientific discipline. We won't be covering issues such as algorithmic complexity in this course, however.

1.2 Inspirations

Artificial Intelligence research can be characterised in terms of how the following question has been answered:

"Just how are we going to get a computer to perform intelligent tasks?"

One way to answer the question is to say that:

Logic makes a science out of various forms of reasoning, which play their part in intelligence. So, let's build our programs as implementations of logical theories.

This has led to the use of logic - drawing on mathematics and philosophy - in a great deal of AI research. This means that we can be very precise about the algorithms we implement, write our programs in very clear ways using logic programming languages, and even prove things about the programs we produce.

However, while it's theoretically possible to do certain intelligent things (such as prove some easy mathematics theorems) with programs based on logic alone, such methods are held back by the very large search spaces involved. People began to think about heuristics - rules of thumb - which they could use to enable their programs to get jobs done in a reasonable time. They answered the question like this:

We're not sure that humans reason with perfect logic all the time, but we are certainly intelligent. So, let's use introspection and tell our AI programs how to think like us.

In answering this question, AI researchers started building expert systems, which encapsulated factual, procedural and heuristic knowledge about particular domains.

1.4 General Tasks to Accomplish

Once you've worried about why you're doing AI, what has inspired you and how you're going to approach the job, then you can start to think about what task it is that you want to automate. AI is so often portrayed as a set of problem-solving techniques, but I think the relentless shoe-horning of intelligent tasks into one problem formulation or another is holding AI back. That said, we have determined a number of problem solving tasks in AI - most of which have been hinted at previously - which can be used as a characterization. The categories overlap a little because of the generality of the techniques. For instance, planning could be found in many categories, as this is a fundamental part of solving many types of problem.

1.5 Generic Techniques Developed

In the pursuit of solutions to various problems in the above categories, various individual techniques have sprung up which have been shown to be useful for solving a range of problems (usually within the general problem category). These techniques are established enough now to have a name and provide at least a partial characterisation of AI. The following list is not intended to be complete, but rather to introduce some techniques you will learn later in the course. Note that some of these overlap with the general techniques above.



Forward/backward chaining (reasoning)
Resolution theorem proving (reasoning)
Proof planning (reasoning)
Constraint satisfaction (reasoning)
Davis-Putnam method (reasoning)
Minimax search (games)
Alpha-Beta pruning (games)
Case-based reasoning (expert systems)
Knowledge elicitation (expert systems)
Neural networks (learning)
Bayesian methods (learning)
Explanation based (learning)
Inductive logic programming (learning)
Reinforcement (learning)
Genetic algorithms (learning)
Genetic programming (learning)
Strips (planning)
N-grams (NLP)
Parsing (NLP)
Behavior-based (robotics)
Cell decomposition (robotics)



1.6 Representations/Languages Used

Many people are taught AI with the opening line: "The three most important things in AI are representation, representation and representation". While choosing the way of representing knowledge in AI programs will always be a key concern, many techniques now have well-chosen ways to represent data which have been shown to be useful for that technique. Along the way, much research has been undertaken into discovering the best ways to represent certain types of knowledge. The way in which knowledge can be represented is often taken as another way to characterize Artificial Intelligence. Some general representation schemes include:



First order logic
Higher order logic
Logic programs
Frames
Production Rules
Semantic Networks
Fuzzy logic
Bayes nets
Hidden Markov models
Neural networks
Strips

Some standard AI programming languages have been developed in order to build intelligent programs efficiently and robustly. These include:

Prolog
Lisp
ML

Note that other languages are used extensively to build AI programs, including:

Perl
C++
Java
C

1.7 Application Areas

Individual applications often drive AI research much more than the long term goals described above. Much of AI literature is grouped into application areas, some of which are:





Agriculture
Architecture
Art
Astronomy
Bioinformatics
Email classification
Engineering
Finance
Fraud detection
Information retrieval
Law
Mathematics
Military
Music
Scientific discovery
Story writing
Telecommunications
Telephone services
Transportation
Tutoring systems
Video games
Web search engines





Chapter 2

Artificial Intelligence Agents

In the previous lecture, we discussed what we will be talking about in Artificial Intelligence and why those things are important. This lecture is all about how we will be talking about AI, i.e., the language, assumptions and concepts which will be common to all the topics we cover.

These notions should be considered before undertaking any large AI project. Hence, this lecture also serves to add to the systems engineering information you have/will be studying. For AI software/hardware, of course, we have to worry about which programming language to use, how to split the project into modules, etc. However, we also have to worry about higher level notions, such as: what does it mean for our program/machine to act rationally in a particular domain, how will it use knowledge about the environment, and what form will that knowledge take? All these things should be taken into consideration before we worry about actually doing any programming.

2.1 Autonomous Rational Agents

In many cases, it is inaccurate to talk about a single program or a single robot, as the combination of hardware and software in some intelligent systems is considerably more complicated. Instead, we will follow the lead of Russell and Norvig and describe AI through the autonomous, rational intelligent agents paradigm. We're going to use the definitions from chapter 2 of Russell and Norvig's textbook, starting with these two:

An agent is anything that can be viewed as perceiving its environment through sensors and acting upon that environment through effectors.

A rational agent is one that does the right thing.

We see that the word 'agent' covers humans (where the sensors are the senses and the effectors are the physical body parts) as well as robots (where the sensors are things like cameras and touch pads and the effectors are various motors) and computers (where the sensors are the keyboard and mouse and the effectors are the monitor and speakers).

To determine whether an agent has acted rationally, we need an objective measure of how successful it has been and we need to worry about when to make an evaluation using this measure. When designing an agent, it is important to think hard about how to evaluate its performance, and this evaluation should be independent from any internal measures that the agent undertakes (for example as part of a heuristic search - see the next lecture). The performance should be measured in terms of how rationally the program acted, which depends not only on how well it did at a particular task, but also on what the agent experienced from its environment, what the agent knew about its environment and what actions the agent could actually undertake.



Acting Rationally

Al Capone was finally convicted for tax evasion. Were the police acting rationally?

To answer this, we must first look at how the performance of police forces is viewed: arresting and convicting the people who have committed a crime is a start, but their success in getting criminals off the street is also a reasonable, if contentious, measure. Given that they didn't convict Capone for the murders he committed, they failed on that measure. However, they did get him off the street, so they succeeded there. We must also look at what the police knew and what they had experienced about the environment: they had experienced murders which they knew were undertaken by Capone, but they had not experienced any evidence which could convict Capone of the murders. However, they had evidence of tax evasion. Given the knowledge about the environment that they can only arrest if they have evidence, their actions were therefore limited to arresting Capone on tax evasion. As this got him off the street, we could say they were acting rationally.

This answer is controversial, and highlights the reason why we have to think hard about how to assess the rationality of an agent before we consider building it.

To summarize, an agent takes input from its environment and affects that environment. The rational performance of an agent must be assessed in terms of the task it was meant to undertake, its knowledge and experience of the environment and the actions it was actually able to undertake. This performance should be objectively measured independently of any internal measures used by the agent.

In English language usage, autonomy means an ability to govern one's actions independently. In our situation, we need to specify the extent to which an agent's behavior is affected by its environment. We say that:

The autonomy of an agent is measured by the extent to which its behaviour is determined by its own experience.



At one extreme, an agent might never pay any attention to the input from its environment, in which case its actions are determined entirely by its built-in knowledge. At the other extreme, if an agent does not initially act using its built-in knowledge, it will have to act randomly, which is not desirable. Hence, it is desirable to have a balance between complete autonomy and no autonomy. Thinking of human agents, we are born with certain reflexes which govern our actions to begin with. However, through our ability to learn from our environment, we begin to act more autonomously as a result of our experiences in the world. Imagine a baby learning to crawl around. It must use in-built information to enable it to correctly employ its arms and legs, otherwise it would just thrash around. However, as it moves, and bumps into things, it learns to avoid objects in the environment. When we leave home, we are (supposed to be) fully autonomous agents ourselves. We should expect similar of the agents we build for AI tasks: their autonomy increases in line with their experience of the environment.

2.3 Internal Structure of Agents

We have looked at agents in terms of their external influences and behaviors: they take input from the environment and perform rational actions to alter that environment. We will now look at some generic internal mechanisms which are common to intelligent agents.



Architecture and Program

The program of an agent is the mechanism by which it turns input from the environment into an action on the environment. The architecture of an agent is the computing device (including software and hardware) upon which the program operates. On this course, we mostly concern ourselves with the intelligence behind the programs, and do not worry about the hardware architectures they run on. In fact, we will mostly assume that the architecture of our agents is a computer getting input through the keyboard and acting via the monitor.

RHINO (the robotic museum tour guide used as a running example in these notes) consisted of the robot itself, including the necessary hardware for locomotion (motors, etc.) and state of the art sensors, including laser, sonar, infrared and tactile sensors. RHINO also carried around three on-board PC workstations and was connected by a wireless Ethernet connection to a further three off-board SUN workstations. In total, it ran up to 25 different processes at any one time, in parallel. The program employed by RHINO was even more complicated than the architecture upon which it ran. RHINO ran software which drew upon techniques ranging from low level probabilistic reasoning and visual information processing to high level problem solving and planning using logical representations.



An agent's program will make use of knowledge about its environment and methods for deciding which action to take (if any) in response to a new input from the environment. These methods include reflexes, goal based methods and utility based methods.


Knowledge of the Environment

We must distinguish between knowledge an agent receives through its sensors and knowledge about the world from which the input comes. Knowledge about the world can be programmed in, and/or it can be learned through the sensor input. For example, a chess playing agent would be programmed with the positions of the pieces at the start of a game, but would maintain a representation of the entire board by updating it with every move it is told about through the input it receives. Note that the sensor inputs are the opponent's moves and this is different to the knowledge of the world that the agent maintains, which is the board state.

There are three main ways in which an agent can use knowledge of its world to inform its actions. If an agent maintains a representation of the world, then it can use this information to decide how to act at any given time. Furthermore, if it stores its representations of the world, then it can also use information about previous world states in its program. Finally, it can use knowledge about how its actions affect the world.

The RHINO agent was provided with an accurate metric map of the museum and exhibits beforehand, carefully mapped out by the programmers. Having said this, the layout of the museum changed frequently as routes became blocked and chairs were moved. By updating its knowledge of the environment, however, RHINO consistently knew where it was, to an accuracy better than 15cm. RHINO didn't move objects other than itself around the museum. However, as it moved around, people followed it, so its actions really were altering the environment. It was because of this (and other reasons) that the designers of RHINO made sure it updated its plan as it moved around.



Reflexes

If an agent decides upon and executes an action in response to a sensor input without consultation of its world, then this can be considered a reflex response. Humans flinch if they touch something very hot, regardless of the particular social situation they are in, and this is clearly a reflex action. Similarly, chess agents are programmed with lookup tables for openings and endings, so that they do not have to do any processing to choose the correct move, they simply look it up. In timed chess matches, this kind of reflex action might save vital seconds to be used in more difficult situations later.



Unfortunately, relying on lookup tables is not a sensible way to program intelligent agents: a chess agent would need 35^100 entries in its lookup table (considerably more entries than there are atoms in the universe). And if we remember that the world of a chess agent consists of only 32 pieces on 64 squares, it's obvious that we need more intelligent means of choosing a rational action.

For RHINO, it is difficult to identify any reflex actions. This is probably because performing an action without consulting the world representation is potentially dangerous for RHINO, because people get everywhere, and museum exhibits are expensive to replace if broken!



Goals

One possible way to improve an agent's performance is to enable it to have some details of what it is trying to achieve. If it is given some representation of the goal (e.g., some information about the solution to a problem it is trying to solve), then it can refer to that information to see if a particular action will lead to that goal. Such agents are called goal-based. Two tried and trusted methods for goal-based agents are planning (where the agent puts together and executes a plan for achieving its goal) and search (where the agent looks ahead in a search space until it finds the goal). Planning and search methods are covered later in the course.

In RHINO, there were two goals: get the robot to an exhibit chosen by the visitors and, when it gets there, provide information about the exhibit. Obviously, RHINO used information about its goal of getting to an exhibit to plan its route to that exhibit.



Utility Functions

A goal based agent for playing chess is infeasible: every time it decides which move to play next, it sees whether that move will eventually lead to a checkmate. Instead, it would be better for the agent to assess its progress not against the overall goal, but against a localized measure. Agents' programs often have a utility function which calculates a numerical value for each world state the agent would find itself in if it undertook a particular action. Then it can check which action would lead to the highest value being returned from the set of actions it has available. Usually the best action with respect to a utility function is taken, as this is the rational thing to do. When the task of the agent is to find something by searching, if it uses a utility function in this manner, this is known as a best-first search.

RHINO searched for paths from its current location to an exhibit, using the distance from the exhibit as a utility function. However, this was complicated by visitors getting in the way.
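
As a small illustration of utility-based action selection, the sketch below scores each available action by the utility of the state it would lead to and picks the highest scorer. The function names choose_action, result and utility, and the toy "distance to the exhibit" utility, are assumptions made for the example.

# A minimal sketch (assumed names) of utility-based action selection.

def choose_action(state, actions, result, utility):
    """Return the action whose resulting state has the highest utility.

    result(state, action) -> next state; utility(state) -> number.
    Both are supplied by the agent designer.
    """
    return max(actions, key=lambda a: utility(result(state, a)))


if __name__ == "__main__":
    # Toy example in the spirit of RHINO: utility = negative distance to the exhibit.
    position, exhibit = 0, 5
    moves = [-1, +1]
    print(choose_action(position, moves,
                        result=lambda s, a: s + a,
                        utility=lambda s: -abs(exhibit - s)))  # +1, i.e. move towards the exhibit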



2.4 Environments

We have seen that intelligent agents should take into account certain information when choosing a rational action, including information from their sensors, information from the world, information from previous states of the world, information from their goal and information from their utility function(s). We also need to take into account some specifics about the environment an agent works in. On the surface, this consideration would appear to apply more to robotic agents moving around the real world. However, the considerations also apply to software agents which are receiving data and making decisions which affect the data they receive - in this case we can think of the environment as the flow of information in the data stream. For example, an AI agent may be employed to dynamically update web pages based on the requests from internet users.

We follow Russell and Norvig's lead in characterizing information about the environment:



Accessibility

In some cases, certain aspects of an environment which should be taken into account in decisions about actions may be unavailable to the agent. This could happen, for instance, because the agent cannot sense certain things. In these cases, we say the environment is partially inaccessible. In this case, the agent may have to make (informed) guesses about the inaccessible data in order to act rationally.

The builders of RHINO talk about "invisible" objects that RHINO had to deal with. These included glass cases and bars at various heights which could not be detected by the robotic sensors. These are clearly inaccessible aspects of the environment, and RHINO's designers took this into account when designing its programs.



Determinism

If we can determine what the exact state of the world will be after an agent's action, we say the environment is deterministic. In such cases, the state of the world after an action is dependent only on the state of the world before the action and the choice of action. If the environment is non-deterministic, then utility functions will have to make (informed) guesses about the expected state of the world after possible actions if the agent is to correctly choose the best one.

RHINO's world was non-deterministic because people moved around, and they moved objects such as chairs around. In fact, visitors often tried to trick the robot by setting up roadblocks with chairs. This was another reason why RHINO's plan was constantly updated.





Episodes

If an agent's current choice of action does not depend on its past actions, then the environment is said to be episodic. In non-episodic environments, the agent will have to plan ahead, because its current action will affect subsequent ones.

Considering only the goal of getting to and from exhibits, the individual trips between exhibits can be seen as episodes in RHINO's actions. Once it had arrived at one exhibit, how it got there would not normally affect its choices in getting to the next exhibit. If we also consider the goal of giving a guided tour, however, RHINO must at least remember the exhibits it had already visited, in order not to repeat itself. So, at the top level, its actions were not episodic.



Static or Dynamic

An environment is static if it doesn't change while an agent's program is making the decision about how to act. When designing agents to operate in dynamic (non-static) environments, the underlying program may have to refer to the changing environment while it deliberates, or to anticipate the change in the environment between the time when it receives an input and when it has to take an action.

RHINO was very fast in making decisions. However, because of the amount of visitor movement, by the time RHINO had planned a route, that plan was sometimes wrong because someone was now blocking the route. However, because of the speed of decision making, instead of referring to the environment during the planning process, as we have said before, the designers of RHINO chose to enable it to continually update its plan as it moved.



Discrete or Continuous

The nature of the data coming in from the environment will affect how the agent should be designed. In particular, the data may be discrete (composed of a limited number of clearly defined parts) or continuous (seemingly without discernible sections). Of course, given the nature of computer memory (in bits and bytes), even streaming video can be shoe-horned into the discrete category, but an intelligent agent will probably have to deal with this as if it is continuous. The mathematics in your agent's programs will differ depending on whether the data is taken to be discrete or continuous.




Chapter 3

Search in Problem Solving

If Artificial Intelligence can inform the other sciences about anything, it is about problem solving and, in particular, how to search for solutions to problems. Much of AI research can be explained in terms of specifying a problem, defining a search space which should contain a solution to the problem, choosing a search strategy and getting an agent to use the strategy to find a solution.

If you are hired as an AI researcher/programmer, you will be expected to come armed with a battery of AI techniques, many of which we cover later in the course. However, perhaps the most important skill you will bring to the job is to effectively seek out the best way of turning some vague specifications into concrete problems requiring AI techniques. Specifying those problems in the most effective way will be vital if you want your AI agent to find the solutions in a reasonable time. In this lecture, we look at how to specify a search problem.

3.1 Specifying Search Problems

In our agent terminology, a problem to be solved is a specific task where the agent starts with the environment in a given state and acts upon the environment until the altered state has some pre-determined quality. The set of states which are possible via some sequence of actions the agent takes is called the search space. The series of actions that the agent actually performs is its search path, and the final state is a solution if it has the required property. There may be many solutions to a particular problem. If you can think of the task you want your agent to perform in these terms, then you will need to write a problem solving agent which uses search.

It is important to identify the scope of your task in terms of the problems which will need to be solved. For instance, there are some tasks which are single problems solved by searching, e.g., find a route on a map. Alternatively, there are tasks such as winning at chess, which have to be broken down into sub-problems (searching for the best move at each stage). Other tasks can be achieved without searching whatsoever, e.g., multiplying two large numbers together - you wouldn't dream of searching through the number line until you came across the answer!

There are three initial considerations in problem solving (as described in Russell and Norvig):

Initial State

Firstly, the agent needs to be told exactly what the initial state is before it starts its search, so that it can keep track of the state as it searches.



Operators

An operator is a function taking one state to another via an action undertaken by the agent. For example, in chess, an operator takes one arrangement of pieces on the board to another arrangement by the action of the agent moving a piece.



Goal Test

It is essential when designing a problem solving agent to know when the problem has been solved, i.e., to have a well defined goal test. Suppose the problem we had set our agent was to find a name for a newborn baby, with some properties. In this case, there are lists of "accepted" names for babies, and any solution must appear in that list, so goal-checking amounts to simply testing whether the name appears in the list. In chess, on the other hand, the goal is to reach a checkmate. While there are only a finite number of ways in which the pieces on a board can represent a checkmate, the number of these is huge, so checking a position against them is a bad idea. Instead, a more abstract notion of checkmate is used, whereby our agent checks that the opponent's king cannot move without being captured.
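
Putting the three considerations together, a search problem can be packaged as an initial state, a set of operators and a goal test. The Python sketch below is one way to express that, using the baby-name example mentioned above; the class name SearchProblem and the tiny stand-in name list are assumptions made for illustration.

# A minimal sketch of a search problem specification: initial state,
# operators and goal test. Names are illustrative only.

class SearchProblem:
    def __init__(self, initial_state, operators, goal_test):
        self.initial_state = initial_state
        self.operators = operators          # list of functions: state -> state
        self.goal_test = goal_test          # function: state -> bool

    def successors(self, state):
        """Apply every operator to a state, yielding (operator, next state) pairs."""
        return [(op, op(state)) for op in self.operators]


# The naming example: states are strings over the letters D, N and A.
ACCEPTED_NAMES = {"DAN", "ANN"}   # stand-in for the book of accepted names (assumed)

naming_problem = SearchProblem(
    initial_state="",
    operators=[lambda s, c=c: s + c for c in "DNA"],
    goal_test=lambda s: s in ACCEPTED_NAMES,
)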

3.2 General Considerations for Search

If we can specify the initial state, the operators and the goal check for a search problem, then we know where to start, how to move and when to stop in our search. This leaves the important question of how to choose which operator to apply to which state at any stage during the search. We call an answer to this question a search strategy. Before we worry about exactly what strategy to use, the following need to be taken into consideration:



Path or Artifact

Broadly speaking, there are two different reasons to undertake a search: to find an artifact (a particular state), or to find a path from one given state to another given state. Whether you are searching for a path or an artifact will affect many aspects of your agent's search, including its goal test, what it records along the way and the search strategies available to you.

For example, in the maze below, the game involves finding a route from the top left hand corner to the bottom right hand corner. We all know what the exit looks like (a gap in the outer wall), so we do not search for an artifact. Rather, the point of the search is to find a path, so the agent must remember where it has been.




However, in other searches, the point of the search is to find something, and it may be immaterial how you found it. For instance, suppose we play a different game: to find an anagram of the phrase:

ELECTING NEIL

The answer is, of course: (FILL IN THIS GAP AS AN EXERCISE). In this case, the point of the search is to find an artifact - a word which is an anagram of "electing neil". No-one really cares in which order to actually re-arrange the letters, so we are not searching for a path.



Completeness

It's also worth trying to estimate the number of solutions to a problem, and the density of those solutions amongst the non-solutions. In a search problem, there may be any number of solutions, and the problem specification may involve finding just one, finding some, or finding all the solutions. For example, suppose a military application searches for routes that an enemy might take. The question: "Can the enemy get from A to B" requires finding only one solution, whereas the question: "How many ways can the enemy get from A to B" will require the agent to find all the solutions.

When an agent is asked to find just one solution, we can often program it to prune its search space quite heavily, i.e., rule out particular operators at particular times to be more efficient. However, this may also prune some of the solutions, so if our agent is asked to find all of them, the pruning has to be controlled so that we know that pruned areas of the search space either contain no solutions, or contain solutions which are repeated in another (non-pruned) part of the space.
If our search strategy is guaranteed to find all the solutions eventually, then we say that it is complete. Often, it is obvious that all the solutions are in the search space, but in other cases, we need to prove this fact mathematically to be sure that our space is complete. A problem with complete searches is that - while the solution is certainly there - it can take a very long time to find the solution, sometimes so long that the strategy is effectively useless. Some people use the word exhaustive when they describe complete searches, because the strategy exhausts all possibilities in the search space.



Time and Space Tradeoffs

In practice, you are going to have to stop your agent at some stage if it has not found a solution by then. Hence, if we can choose the fastest search strategy, then this will explore more of the search space and increase the likelihood of finding a solution. There is a problem with this, however. It may be that the fastest strategy is the one which uses most memory. To perform a search, an agent needs at least to know where it is in a search space, but lots of other things can also be recorded. For instance, a search strategy may involve going over old ground, and it would save time if the agent knew it had already tried a particular path. Even though RAM capacities in computers are going steadily up, for some of the searches that AI agents are employed to undertake, they often run out of memory. Hence, as in computer science in general, AI practitioners often have to devise clever ways to trade memory and time in order to achieve an effective balance.



Soundness

You may hear in some application domains - for example automated theorem proving - that a search is "sound and complete". Soundness in theorem proving means that the search to find a proof will not succeed if you give it a false theorem to prove. This extends to searching in general, where a search is unsound if it finds a solution to a problem with no solution. This kind of unsound search may not be the end of the world if you are only interested in using it for problems where you know there is a solution (and it performs well in finding such solutions). Another kind of unsound search is when a search finds the wrong solution to a problem. This is more worrying and the problem will probably lie with the goal testing mechanism.



Additional Knowledge in Search

The amount of extra knowledge available to your agent will affect how it performs. In the following sections of this lecture, we will look at uninformed search strategies, where no additional knowledge is given, and heuristic searches, where any information about the goal, intermediate states and operators can be used to improve the efficiency of the search strategy.



3.3 Uninformed Search Strategies

To be able to undertake an uninformed search, all our agent needs to know is the initial state, the possible operators and how to check whether the goal has been reached. Once these have been described, we must then choose a search strategy for the agent: a pre-determined way in which the operators will be applied.

The example we will use is the case of a genetics professor searching for a name for her newborn baby boy - of course, it must only contain the letters D, N and A. The states in this search are strings of letters (but only Ds, Ns and As), and the initial state is an empty string. Also, the operators available are: (i) add a 'D' to an existing string, (ii) add an 'N' to an existing string and (iii) add an 'A' to an existing string. The goal check is possible using a book of boys' names against which the professor can check a string of letters.

To help us think about the different search strategies, we use two analogies. Firstly, we suppose that the professor keeps an agenda of actions to undertake, such as: add an 'A' to the string 'AN'. So, the agenda consists of pairs (S,O) of states and operators, whereby the operator is to be applied to the state. The action at the top of the agenda is the one which is carried out, then that action is removed. How actions are added to the agenda differs for each search strategy. Secondly, we think of a search graphically: by making each state a node in a graph and each operator an edge, we can think of the search progressing as movement from node to node along edges in the graph. We then allow ourselves to talk about nodes in a search space (rather than the graph) and we say that a node in a search space has been expanded if the state that node represents has been visited and searched from. Note that graphs which have no cycles in them are called trees, and many AI searches can be represented as trees.



Breadth First Search

Given a set of operators o1, ..., on in a breadth first search, every time a new state s is reached, an action for each operator on s is added to the bottom of the agenda, i.e., the pairs (s,o1), ..., (s,on) are added to the end of the agenda in that order.

In our example, the first three actions on the agenda would be:

1. (empty, add 'D')
2. (empty, add 'N')
3. (empty, add 'A')

Once the 'D' state has been found by carrying out the first action, the actions ('D', add 'D'), ('D', add 'N') and ('D', add 'A') are added to the bottom of the agenda, so it would look like this:

1. (empty, add 'D')
2. (empty, add 'N')
3. (empty, add 'A')
4. ('D', add 'D')
5. ('D', add 'N')
6. ('D', add 'A')

However, we can remove the first agenda item as this action has been undertaken. Hence there are actually 5 actions on the agenda after the first step in the search space. Indeed, after every step, one action will be removed (the action just carried out), and three will be added, making a total addition of two actions to the agenda. It turns out that this kind of breadth first search leads to the name 'DAN' after 20 steps. Also, after the 20th step, there are 43 tasks still on the agenda to do.

It's useful to think of this search as the evolution of a tree, and the diagram below shows how each string of letters is found via the search in a breadth first manner. The numbers above the boxes indicate at which step in the search the string was found.

We see that each node leads to three others, which corresponds to the fact that after every step, three more steps are put on the agenda. This is called the branching rate of a search, and seriously affects both how long a search is going to take and how much memory it will use up.

Breadth first search is a complete strategy: given enough time and memory, it will find a solution if one exists. Unfortunately, memory is a big problem for breadth first search. We can think about how big the agenda grows, but in effect we are just counting the number of states which are still 'alive', i.e., there are still steps in the agenda involving them. In the above diagram, the states which are still alive are those with fewer than three arrows coming from them: there are 14 in all.

It's fairly easy to show that in a search with a branching rate of b, if we want to search all the way to a depth of d, then the largest number of states the agent will have to store at any one time is b^(d-1). For example, if our professor wanted to search for all names up to length 8, she would have to remember (or write down) 2187 different strings to complete a breadth first search. This is because she would need to remember 3^7 strings of length 7 in order to be able to build all the strings of length 8 from them. In searches with a higher branching rate, the memory requirement can often become too large for an agent's processor.
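
To make the agenda mechanics concrete, here is a small breadth first search for the professor's naming problem, sketched in Python. The agenda holds (state, operator) pairs and new actions are appended to the end; the function name and the stand-in set of accepted names are assumptions for the example, not part of the original notes.

from collections import deque

# A minimal breadth first search sketch for the D/N/A naming example.
# The agenda holds (state, letter-to-add) pairs; new actions go on the end.

ACCEPTED_NAMES = {"DAN"}          # stand-in for the book of boys' names (assumed)
LETTERS = "DNA"

def breadth_first_name_search(max_length=3):
    agenda = deque(("", letter) for letter in LETTERS)
    while agenda:
        state, letter = agenda.popleft()     # take the action at the top of the agenda
        new_state = state + letter           # apply the operator
        if new_state in ACCEPTED_NAMES:
            return new_state
        if len(new_state) < max_length:
            # add an action for each operator on the new state to the bottom
            agenda.extend((new_state, l) for l in LETTERS)
    return None

if __name__ == "__main__":
    print(breadth_first_name_search())       # 'DAN'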



Depth First Search

Depth first search is very similar to breadth first, except that things are added to the top of the agenda rather than the bottom. In our example, the first three things on the agenda would still be:

1. (empty, add 'D')
2. (empty, add 'N')
3. (empty, add 'A')

However, once the 'D' state had been found, the actions ('D', add 'D'), ('D', add 'N') and ('D', add 'A') would be added to the top of the agenda, so it would look like this:

1. ('D', add 'D')
2. ('D', add 'N')
3. ('D', add 'A')
4. (empty, add 'N')
5. (empty, add 'A')

Of course, carrying out the action at the top of the agenda would introduce the string 'DD', but then this would cause the action ('DD', add 'D') to be added to the top, and the next string found would be 'DDD'. Clearly, this can't go on indefinitely, and in practice, we must specify a depth limit to stop it going down a particular path forever. That is, our agent will need to record how far down a particular path it has gone, and avoid putting actions on the agenda if the state in the agenda item is past a certain depth.
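
The agenda mechanics above differ from breadth first search only in where new actions are placed. A depth-limited variant could be sketched as below; again the function name and the depth limit value are illustrative assumptions.

# A minimal depth first search sketch with a depth limit, reusing the
# D/N/A naming example. New actions go on the *front* of the agenda.

ACCEPTED_NAMES = {"DAN"}          # stand-in for the book of names (assumed)
LETTERS = "DNA"

def depth_first_name_search(depth_limit=3):
    agenda = [("", letter) for letter in LETTERS]
    while agenda:
        state, letter = agenda.pop(0)            # action at the top of the agenda
        new_state = state + letter
        if new_state in ACCEPTED_NAMES:
            return new_state
        if len(new_state) < depth_limit:
            # add new actions to the top, so the search goes deep before wide
            agenda[:0] = [(new_state, l) for l in LETTERS]
    return None

if __name__ == "__main__":
    print(depth_first_name_search())             # 'DAN'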



Note that our search for names is special: no matter what state we reach, there will always be three actions to add to the agenda. In other searches, the number of actions available to undertake on a particular state may be zero, which effectively stops that branch of the search. Hence, a depth limit is not always required.

Returning to our example, if the professor stipulated that she wanted very short names (of three or fewer letters), then the search tree would look like this:

We see that 'DAN' has been reached after the 12th step, so there is an improvement on the breadth first search. However, it was lucky in this case that the first letter explored is 'D' and that there is a solution at depth three. If the depth limit had been set at 4 instead, the tree would have looked very much different:




It looks like it will be a long time until it finds 'DAN'. This highlights an important drawback to depth first search. It can often go deep down paths which have no solutions, when there is a solution much higher up the tree, but on a different branch. Also, depth first search is not, in general, complete.

Rather than simply adding the next agenda item directly to the top of the agenda, it might be a better idea to make sure that every node in the tree is fully expanded before moving on to the next depth in the search. This is the kind of depth first search which Russell and Norvig explain. For our DNA example, if we did this, the search tree would look like this:




The big advantage to depth first search is that it requires much less memory to operate than breadth first search. If we count the number of 'alive' nodes in the diagram above, it amounts to only 4, because the ones on the bottom row are not to be expanded due to the depth limit. In fact, it can be shown that if an agent wants to search for all solutions up to a depth of d in a space with branching factor b, then in a depth first search it only needs to remember up to a maximum of b*d states at any one time.

To put this in perspective, if our professor wanted to search for all names up to length 8, she would only have to remember 3 * 8 = 24 different strings to complete a depth first search (rather than 2187 in a breadth first search).



Iterative Deepening Search

So, breadth first search is guaranteed to find a solution (if one exists), but it eats all the memory. Depth first search, however, is much less memory hungry, but not guaranteed to find a solution. Is there any other way to search the space which combines the good parts of both?

Well, yes, but it sounds silly. Iterative Deepening Search (IDS) is just a series of depth first searches where the depth limit is increased by one every time. That is, an IDS will do a depth first search (DFS) to depth 1, followed by a DFS to depth 2, and so on, each time starting completely from scratch. This has the advantage of being complete, as it covers all depths of the search tree. Also, it only requires the same memory as depth first search (obviously).

However, you will have noticed that this means that it completely re-searches the entire space searched in the previous iteration. This kind of redundancy will surely make the search strategy too slow to contemplate using in practice? Actually, it isn't as bad as you might think. This is because, in a depth first search, most of the effort is spent expanding the last row of the tree, so the repetition over the top part of the tree is not a major factor. In fact, the effect of the repetition reduces as the branching rate increases. In a search with branching rate 10 and depth 5, the number of states searched is 111,111 with a single depth first search. With an iterative deepening search, this number goes up to 123,456. So, there is only a repetition of around 11%.
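
Expressed in code, iterative deepening is only a loop around a depth-limited depth first search. The sketch below is self-contained for the naming example; the function names and the maximum limit are assumptions for illustration.

# A minimal iterative deepening sketch for the D/N/A naming example:
# repeat a depth-limited depth first search with limits 1, 2, 3, ...

ACCEPTED_NAMES = {"DAN"}      # stand-in for the book of names (assumed)
LETTERS = "DNA"

def depth_limited_search(limit):
    agenda = [("", letter) for letter in LETTERS]
    while agenda:
        state, letter = agenda.pop(0)
        new_state = state + letter
        if new_state in ACCEPTED_NAMES:
            return new_state
        if len(new_state) < limit:
            agenda[:0] = [(new_state, l) for l in LETTERS]
    return None

def iterative_deepening_search(max_limit=8):
    for limit in range(1, max_limit + 1):        # each iteration starts from scratch
        result = depth_limited_search(limit)
        if result is not None:
            return result
    return None

if __name__ == "__main__":
    print(iterative_deepening_search())          # 'DAN'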



Bidirectional Search

We've concentrated so far on searches where the point of the search is to find a solution, not the path to the solution. In other searches, we know the solution, and we know the initial state, but we don't know how to get from one to the other, and the point of the search is to find a path. In these cases, in addition to searching forward from the initial state, we can sometimes also search backwards from the solution. This is called a bidirectional search.

For example, consider the 8-puzzle game in the diagram below, where the point of the game is to move the pieces around so that they are arranged in the right hand diagram. It's likely that in the search for the solution to this puzzle (given an arbitrary starting state), you might start off by moving some of the pieces around to get some of them in their end positions. Then, as you got closer to the solution state, you might work backwards: asking yourself, how can I get from the solution to where I am at the moment, then reversing the search path. In this case, you've used a bidirectional search.




Bidirectional search has the advantage that search in both directions is only required to go to a depth half that of normal searches, and this can often lead to a drastic reduction in the number of paths looked at. For instance, if we were looking for a path from one town to another through at most six other towns, we only have to look for a journey through three towns from both directions, which is fairly easy to do, compared to searching all paths through six towns in a normal search.

Unfortunately, it is often difficult to apply a bidirectional search because (a) we don't really know the solution, only a description of it, (b) there may be many solutions, and we have to choose some to work backwards from, (c) we cannot reverse our operators to work backwards from the solution and (d) we have to record all the paths from both sides to see if any two meet at the same point - this may take up a lot of memory, and checking through both sets repeatedly could take up too much computing time.
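
Where both directions are available, one simple way to realise the idea is to grow a breadth first frontier from each end and stop when the frontiers meet. The sketch below does this over an undirected graph of towns, matching the route-finding example above; the graph, the function name and the town names are invented for illustration.

from collections import deque

# A minimal bidirectional breadth first search sketch over an undirected
# graph: expand a frontier from each end and stop when they meet.

def bidirectional_search(graph, start, goal):
    if start == goal:
        return [start]
    parents_fwd, parents_bwd = {start: None}, {goal: None}
    frontier_fwd, frontier_bwd = deque([start]), deque([goal])

    def expand(frontier, parents, other_parents):
        node = frontier.popleft()
        for neighbour in graph.get(node, []):
            if neighbour not in parents:
                parents[neighbour] = node
                frontier.append(neighbour)
                if neighbour in other_parents:   # the two searches have met
                    return neighbour
        return None

    while frontier_fwd and frontier_bwd:
        meet = expand(frontier_fwd, parents_fwd, parents_bwd) or \
               expand(frontier_bwd, parents_bwd, parents_fwd)
        if meet:
            # stitch the two half-paths together at the meeting point
            path, node = [], meet
            while node is not None:
                path.append(node)
                node = parents_fwd[node]
            path.reverse()
            node = parents_bwd[meet]
            while node is not None:
                path.append(node)
                node = parents_bwd[node]
            return path
    return None

if __name__ == "__main__":
    towns = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
    print(bidirectional_search(towns, "A", "D"))   # ['A', 'B', 'C', 'D']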


3.5 Heuristic Search Strategies

Generally speaking, a heuristic search is one which uses a rule of thumb to improve an agent's performance in solving problems via search. A heuristic search is not to be confused with a heuristic measure. If you can specify a heuristic measure, then this opens up a range of generic heuristic searches which you can try to improve your agent's performance, as discussed below. It is worth remembering, however, that any rule of thumb, for instance, choosing the order of operators when applied in a simple breadth first search, is a heuristic.

In terms of our agenda analogy, a heuristic search chooses where to put a (state, operator) pair on the agenda when it is proposed as a move in the state space. This choice could be fairly complicated and based on many factors. In terms of the graph analogy, a heuristic search chooses which node to expand at any point in the search. By definition, a heuristic search is not guaranteed to improve performance for a particular problem or set of problems, but heuristic searches are implemented in the hope of either improving the speed with which a solution is found and/or the quality of the solution found. In fact, we may be able to find optimal solutions, which are as good as possible with respect to some measure.



Optimality

The path cost of a solution is calculated as the sum of the costs of the actions which led to that solution. This is just one example of a measure of value on the solution of a search problem, and there are many others. These measures may or may not be related to the heuristic functions which estimate the likelihood of a particular state being in the path to a solution. We say that - given a measure of value on the possible solutions to a search problem - one particular solution is optimal if it scores higher than all the others with respect to this measure (or costs less, in the case of path cost). For example, in the maze example given in section 3.2, there are many paths from the start to the finish of the maze, but only one which crosses the fewest squares. This is the optimal solution in terms of the distance travelled.

Optimality can be guaranteed through a particular choice of search strategy (for instance the uniform path cost search described below). Alternatively, an agent can choose to prove that a solution is optimal by appealing to some mathematical argument. As a last resort, if optimality is necessary, then an agent must exhaust a complete search strategy to find all solutions, then choose the one scoring the highest (alternatively costing the lowest).



Uniform Path Cost Search

A breadth first search will find the solution with the shortest path length from the initial state to the goal state. However, this may not be the least expensive solution in terms of the path cost. A uniform path cost search chooses which node to expand by looking at the path cost for each node: the node which has cost least to get to is expanded first. Hence, if, as is usually the case, the path cost of a node increases with the path length, then this search is guaranteed to find the least expensive solution. It is therefore an optimal search strategy. Unfortunately, this search strategy can be very inefficient.
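
One common way to realise "expand the cheapest node first" is a priority queue keyed on the path cost accumulated so far. The sketch below works under that assumption; the graph format (state -> list of (next state, step cost) pairs) and the function name are invented for illustration.

import heapq

# A minimal uniform path cost search sketch: always expand the node that
# has cost least to reach so far.

def uniform_cost_search(graph, start, goal_test):
    frontier = [(0, start, [start])]       # (path cost so far, state, path)
    best_cost = {start: 0}
    while frontier:
        cost, state, path = heapq.heappop(frontier)
        if goal_test(state):
            return cost, path
        for next_state, step_cost in graph.get(state, []):
            new_cost = cost + step_cost
            if new_cost < best_cost.get(next_state, float("inf")):
                best_cost[next_state] = new_cost
                heapq.heappush(frontier, (new_cost, next_state, path + [next_state]))
    return None

if __name__ == "__main__":
    roads = {"A": [("B", 1), ("C", 5)], "B": [("C", 1)], "C": []}
    print(uniform_cost_search(roads, "A", lambda s: s == "C"))  # (2, ['A', 'B', 'C'])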



Greedy Search

If we have a heuristic function for states, defined as above, then we can simply measure each state with respect to this measure and order the agenda items in terms of the score of the state in the item. So, at each stage, the agent determines which state scores lowest and puts agenda items on the top of the agenda which contain operators acting on that state. In this way, the most promising nodes in a search space are expanded before the less promising ones. This is a type of best first search known specifically as a greedy search.

In some situations, a greedy search can lead to a solution very quickly. However, a greedy search can often go down blind alleys, which look promising to start with but ultimately don't lead to a solution. Often the best states at the start of a search are in fact really quite poor in comparison to those further into the search space. One way to counteract this blind-alley effect is to turn off the heuristic until a proportion of the search space has been covered, so that the truly high scoring states can be identified. Another problem with a greedy search is that the agent has to keep a record of which states have been explored in order to avoid repetitions (and ultimately ending up in a cycle), so a greedy search must keep all the agenda items it has undertaken in its memory. Also, this search strategy is not optimal, because the optimal solution may have nodes on the path which score badly for the heuristic function, and hence a non-optimal solution will be found before an optimal one. (Remember that the heuristic function only estimates the path cost from a node to a solution.)
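
As a rough sketch (again assuming the same hypothetical successors/goal-test interface as above), greedy search is the same agenda-driven loop, but ordered purely by the heuristic h(n); note the visited set needed to avoid cycles, and the lack of any optimality guarantee.

    import heapq
    from itertools import count

    def greedy_search(start, is_goal, successors, h):
        """Greedy best-first search: always expand the state with the lowest h value."""
        tie = count()
        agenda = [(h(start), next(tie), start, [])]
        visited = {start}                        # remember explored states to avoid cycles
        while agenda:
            _, _, state, actions = heapq.heappop(agenda)
            if is_goal(state):
                return actions
            for action, nxt, _cost in successors(state):
                if nxt not in visited:
                    visited.add(nxt)
                    heapq.heappush(agenda, (h(nxt), next(tie), nxt, actions + [action]))
        return None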



A* Search

A* search combines the best parts of uniform cost search, namely the fact that it's optimal and complete, and the best parts of greedy search, namely its speed. This search strategy simply combines the path cost function g(n) and the heuristic function h(n) by summing them to form a new heuristic measure f(n):

f(n) = g(n) + h(n)

Remembering that g(n) gives the path cost from the start state to state n and h(n) estimates the path cost from n to a goal state, we see that f(n) estimates the cost of the cheapest solution which passes through n.

The most important aspect of A* search is that, given one restriction on h(n), it is possible to prove that the search strategy is complete and optimal. The restriction is that h(n) must always underestimate the cost to reach a goal state from n. Such heuristic measures are called admissible. See Russell and Norvig for a proof that A* search with an admissible heuristic is complete and optimal.
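
A minimal sketch of the idea, under the same assumed interface as the earlier sketches: the agenda is ordered by f(n) = g(n) + h(n), and with an admissible h the first goal taken off the agenda is an optimal solution.

    import heapq
    from itertools import count

    def a_star_search(start, is_goal, successors, h):
        """A* search: expand the node with the lowest f(n) = g(n) + h(n)."""
        tie = count()
        frontier = [(h(start), 0, next(tie), start, [])]   # (f, g, tie, state, actions)
        best_g = {start: 0}
        while frontier:
            f, g, _, state, actions = heapq.heappop(frontier)
            if is_goal(state):
                return g, actions
            if g > best_g.get(state, float("inf")):
                continue                                   # stale queue entry
            for action, nxt, cost in successors(state):
                new_g = g + cost
                if new_g < best_g.get(nxt, float("inf")):
                    best_g[nxt] = new_g
                    heapq.heappush(frontier, (new_g + h(nxt), new_g, next(tie),
                                              nxt, actions + [action]))
        return None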



IDA* Search

A* search is a sophisticated and successful search strategy. However, a problem with A* search is that it must keep all states in its memory, so memory is often a much bigger consideration than time in designing agents to undertake A* searches. We overcame the same problem with breadth first search by using an iterative deepening search (IDS), and we do something similar with A*.

Like IDS, an IDA* search is a series of depth first searches where the depth is increased after each iteration. However, the depth is not measured in terms of the path length, as it is in IDS, but rather in terms of the A* combined function f(n) as described above. To do this, we need to define contours as regions of the search space containing states where f is below some limit for all the states, as shown pictorially here:


Each node in a contour scores less than a particular value, and IDA* search agents are told how much to increase the contour boundary by on each iteration. This defines the depth for successive searches. When using contours, it is useful for the function f(n) to be monotonic, i.e., f is monotonic if whenever an operator takes a state s1 to a state s2, then f(s2) >= f(s1). In other words, if the value of f never decreases along a path, then f is monotonic. As an exercise, why do we need monotonicity to ensure optimality in IDA* search?
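
As a hedged illustration of the contour idea (not taken from the notes, and omitting cycle checking for brevity), each pass is a depth first search cut off at the current f bound, and the bound then rises to the smallest f value that exceeded it:

    def ida_star_search(start, is_goal, successors, h):
        """IDA*: repeated depth first searches bounded by the combined function f = g + h."""
        def dfs(state, g, bound, path):
            f = g + h(state)
            if f > bound:
                return f, None                     # report how far we overshot the contour
            if is_goal(state):
                return f, list(path)
            smallest = float("inf")
            for action, nxt, cost in successors(state):
                path.append(action)
                overshoot, solution = dfs(nxt, g + cost, bound, path)
                path.pop()
                if solution is not None:
                    return overshoot, solution
                smallest = min(smallest, overshoot)
            return smallest, None

        bound = h(start)
        while True:
            bound, solution = dfs(start, 0, bound, [])
            if solution is not None:
                return solution
            if bound == float("inf"):
                return None                        # no contour left to expand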



SMA* Search

IDA* search is very good from a memory point of view. In fact, it can be criticised for not using enough memory: using more memory can increase the efficiency, so really our search strategies should use all the available memory. Simplified Memory-Bounded A* search (SMA*) is a search which does just that. This is a complicated search strategy, with details given in Russell and Norvig.



Hill Climbing


As we've seen, in some problems, finding the search path from initial to goal state is the point of the exercise. In other problems, the path and the artefact at the end of the path are both important, and we often try to find optimal solutions. For a certain set of problems, the path is immaterial, and finding a suitable artefact is the sole purpose of the search. In these cases, it doesn't matter whether our agent searches down a path for 10 or 1000 steps, as long as it finds a solution in the end.

For example, consider the 8-queens problem, where the task is to find an arrangement of 8 queens on a chess board such that no one can "take" another (one queen can take another if it's on the same horizontal, vertical or diagonal line). A solution to this problem is:


One way to specify this problem is with states where there are a number of queens (1 to 8) on the board, and an action is to add a queen in such a way that it can't take another. Depending on your strategy, you may find that this search requires much back-tracking, i.e., towards the end, you find that you simply can't put the last queens on anywhere, so you have to move one of the queens you put down earlier (you go back up the search tree).

An alternative way of specifying the problem is that the states are boards with 8 queens already on them, and an action is a movement of one of the queens. In this case, our agent can use an evaluation function and do hill climbing. That is, it counts the number of pairs of queens where one can take the other, and only moves a queen if that movement reduces the number of pairs. When there is a choice of movements both resulting in the same decrease, the agent can choose one randomly from the choices. In the 8-queens problem, there are only 56 * 8 = 448 possible ways to move one queen, so our agent only has to calculate the evaluation function 448 times at each stage. If it only chooses moves where the situation with respect to the evaluation function improves, it is doing hill climbing (or gradient descent if it's better to think of the agent going downhill rather than uphill).
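
A rough sketch of this hill climbing agent follows (not from the notes; for simplicity each queen is kept in its own column, so a move changes only a queen's row, and the evaluation function is the number of attacking pairs, to be minimised):

    import random

    def attacking_pairs(board):
        """board[col] is the row of the queen in that column; count attacking pairs."""
        pairs = 0
        for c1 in range(len(board)):
            for c2 in range(c1 + 1, len(board)):
                same_row = board[c1] == board[c2]
                same_diag = abs(board[c1] - board[c2]) == abs(c1 - c2)
                if same_row or same_diag:
                    pairs += 1
        return pairs

    def hill_climb(n=8, max_steps=1000):
        """Move one queen within its column whenever that reduces attacking pairs."""
        board = [random.randrange(n) for _ in range(n)]
        for _ in range(max_steps):
            current = attacking_pairs(board)
            if current == 0:
                return board                      # solution: no queen can take another
            best_moves, best_score = [], current
            for col in range(n):
                original_row = board[col]
                for row in range(n):
                    if row == original_row:
                        continue
                    board[col] = row
                    score = attacking_pairs(board)
                    if score < best_score:
                        best_moves, best_score = [(col, row)], score
                    elif best_score < current and score == best_score:
                        best_moves.append((col, row))
                board[col] = original_row
            if not best_moves:
                return None                       # stuck: only non-improving moves remain
            col, row = random.choice(best_moves)  # tie-break randomly, as described above
            board[col] = row
        return None

A random re-start, as described in the next paragraph, would simply call hill_climb repeatedly from fresh random boards until it returns a solution.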

A common problem with this search strategy is local maxima: the search has not yet reached a solution, but it can only go downhill in terms of the evaluation function. For example, we might get to the stage where only two queens can take each other, but moving any queen increases this number to at least three. In cases like this, the agent can do a random re-start, whereby it randomly chooses a state to start the whole process from again. This search strategy has the appeal of never requiring to store more than one state at any one time (the part of the hill the agent is on). Russell and Norvig make the analogy that this kind of search is like trying to climb Mount Everest in the fog with amnesia, but they do concede that it is often the search strategy of choice for some industrial problems. Local/global maxima/minima are represented in the diagram below:




Simulated Annealing

One way to get around the problem of local maxima, and related problems such as ridges and plateaux in hill climbing, is to allow the agent to go downhill to some extent. In simulated annealing - named because of an analogy with cooling a liquid until it freezes - the agent chooses to consider a random move. If the move improves the evaluation function, then it is always carried out. If the move doesn't improve the evaluation function, then the agent will carry out the move with some probability between 0 and 1. The probability decreases as the move gets worse in terms of the evaluation function, so really bad moves are rarely carried out. This strategy can often nudge a search out of a local maximum and the search can continue towards the global maximum.
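
As a small hedged sketch (the exponential acceptance rule exp(delta/T) and the geometric cooling schedule are the usual textbook choices, assumed here rather than taken from the notes; neighbours and value are hypothetical placeholders):

    import math
    import random

    def simulated_annealing(initial, neighbours, value,
                            temperature=10.0, cooling=0.995, min_temp=1e-3):
        """Maximise value(state): uphill moves are always accepted, downhill moves
        are accepted with probability exp(delta / T), which shrinks as T falls."""
        state = initial
        t = temperature
        while t > min_temp:
            candidate = random.choice(neighbours(state))
            delta = value(candidate) - value(state)
            if delta > 0 or random.random() < math.exp(delta / t):
                state = candidate
            t *= cooling          # cool down, so bad moves become ever less likely
        return state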



Random Search


Some problems to be solved by a search agent are more creative in nature, for example, writing poetry. In this case, it is often difficult to project the word 'creative' on to a program, because it is possible to completely understand why it produced an artefact by looking at its search path. In these cases, it is often a good idea to try some randomness in the search strategy, for example randomly choosing an item from the agenda to carry out, or assigning values from a heuristic measure randomly. This may add to the creative appeal of the agent, because it makes it much more difficult to predict what the agent will do.

3.6 Assessing Heuristic Searches

Given a particular problem you want to build an agent to solve, there may be more than one way of specifying it as a search problem, more than one choice for the search strategy and different possibilities for heuristic measures. To a large extent, it is difficult to predict what the best choices will be, and it will require some experimentation to determine them. In some cases - if we calculate the effective branching rate, as described below - we can tell for sure if one heuristic measure is always being out-performed by another.



The Effective Branching Rate

Assessing heuristic functions is an important part of AI research: a particular heuristic function may sound like a good idea, but in practice give no discernible increase in the quality of the search. Search quality can be determined experimentally in terms of the output from the search, and by using various measures such as the effective branching rate. Suppose a particular problem P has been solved by search strategy S by expanding N nodes, and the solution lay at depth D in the space. Then the effective branching rate of S for P is calculated by comparing S to a uniform search U. An example of a uniform search is a breadth first search where the number of branches from any node is always the same (as in our baby naming example). We then suppose the (uniform) branching rate of U is such that, on exhausting its search to depth D, it too would have expanded exactly N nodes. This imagined branching rate, written b*, is the effective branching rate of S and is calculated thus:

N = 1 + b* + (b*)^2 + ... + (b*)^D


Rearranging this equation will provide a value for b*. For example (taken from Russell and Norvig), suppose S finds a solution at depth 5 having expanded 52 nodes. In this case:

52 = 1 + b* + (b*)^2 + ... + (b*)^5

and it turns out that b* = 1.91. To calculate this, we use the well known identity for the sum of a geometric series:

1 + b* + (b*)^2 + ... + (b*)^D = ((b*)^(D+1) - 1) / (b* - 1)

This enables us to write a polynomial for which b* is a zero, and we can solve this using numerical techniques such as Newton's method.
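
As a small illustration (names and method assumed, not from the notes), b* can also be found numerically with simple bisection rather than Newton's method, since the sum grows monotonically with b*:

    def effective_branching_rate(n_nodes, depth, tolerance=1e-6):
        """Solve N = 1 + b + b^2 + ... + b^D for b by bisection."""
        def total(b):
            return sum(b ** i for i in range(depth + 1))

        low, high = 1.0, float(n_nodes)      # search between 1 and N (adequate when N > D + 1)
        while high - low > tolerance:
            mid = (low + high) / 2
            if total(mid) < n_nodes:
                low = mid
            else:
                high = mid
        return (low + high) / 2

    print(round(effective_branching_rate(52, 5), 2))   # roughly 1.91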

It is usually the case that the effective branching rate of a search strategy is similar over all the problems it is used for, so it is acceptable to average b* over a small set of problems to give a valid account. If a heuristic search has a branching rate near to 1, then this is a good sign. We say that one heuristic function h1 dominates another h2 if the search using h1 always has a lower effective branching rate than h2. Having a lower effective branching rate is clearly desirable because it means a quicker search.











Chapter 4


Knowledge Representation

To recap, we now have some characterizations of AI, so that when an AI problem arises, you will be able to put it into context, find the correct techniques and apply them. We have introduced the agents language so that we can talk about intelligent tasks and how to carry them out. We have also looked at search in the general case, which is central to AI problem solving. Most pieces of software have to deal with data of some type, and in AI we use the more grandiose title of "knowledge" to stand for data including (i) facts, such as the temperature of a patient, (ii) procedures, such as how to treat a patient with a high temperature, and (iii) meaning, such as why a patient with a high temperature should not be given a hot bath. Accessing and utilizing all these kinds of information will be vital for an intelligent agent to act rationally. For this reason, knowledge representation is our final general consideration before we look at particular problem types.

To a large extent, the way in which you organize information available to and generated by your intelligent agent will be dictated by the type of problem you are addressing. Often, the best ways of representing knowledge for particular techniques are known. However, as with the problem of how to search, you will need a lot of flexibility in the way you represent information. Therefore, it is worth looking at four general schemes for representing knowledge, namely logic, semantic networks, production rules and frames. Knowledge representation continues to be a much-researched topic in AI because of the realization fairly early on that how information is arranged can often make or break an AI application.

4.1 Logical Representations

If all human beings spoke the same language, there would be a lot less misunderstanding in the world. The problem with software engineering in general is that there are often slips in communication which mean that what we think we've told an agent and what we've actually told it are two different things. One way to reduce this, of course, is to specify and agree upon some concrete rules for the language we use to represent information. To define a language, we need to specify the syntax of the language and the semantics. To specify the syntax of a language, we must say what symbols are allowed in the language and what are legal constructions (sentences) using those symbols. To specify the semantics of a language, we must say how the legal sentences are to be read, i.e., what they mean. If we choose a particular well defined language and stick to it, we are using a logical representation.


Certain logics are very popular for the representation of information, and they range in terms of their expressiveness. More expressive logics allow us to translate more sentences from our natural language (e.g., English) into the language defined by the logic.

Some popular logics are:



Propositional Logic

This is a fairly restrictive logic, which allows us to write sentences about propositions - statements about the world - which can either be true or false. The symbols in this logic are (i) capital letters such as P, Q and R, which represent propositions such as "It is raining" and "I am wet", (ii) connectives, which are: and (∧), or (∨), implies (→) and not (¬), (iii) brackets and (iv) T, which stands for the proposition "true", and F, which stands for the proposition "false". The syntax of this logic consists of the rules specifying where in a sentence the connectives can go, for example a binary connective such as ∧ must go between two propositions, or between a bracketed conjunction of propositions, etc.

The semantics of this logic are rules about how to assign truth values to a sentence if we know whether the propositions mentioned in the sentence are true or not. For instance, one rule is that the sentence P ∧ Q is true only in the situation when both P and Q are true. The rules also dictate how to use brackets. As a very simple example, we can represent the knowledge in English that "I always get wet and annoyed when it rains" as:

It is raining → (I am wet ∧ I am annoyed)

Moreover, if we program our agent with the semantics of propositional logic, then if at some stage we tell it that it is raining, it can infer that I will get wet and annoyed.
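
As a tiny hedged illustration of these semantics (the tuple representation of sentences is an assumption made for this sketch, not something from the notes), truth values can be assigned recursively:

    def evaluate(sentence, truth):
        """Assign a truth value to a propositional sentence.

        A sentence is either a proposition name (looked up in the dict `truth`) or a
        tuple ('not', s), ('and', s1, s2), ('or', s1, s2) or ('implies', s1, s2).
        """
        if isinstance(sentence, str):
            return truth[sentence]
        op, *args = sentence
        if op == 'not':
            return not evaluate(args[0], truth)
        a, b = (evaluate(s, truth) for s in args)
        if op == 'and':
            return a and b
        if op == 'or':
            return a or b
        if op == 'implies':
            return (not a) or b
        raise ValueError("unknown connective: " + op)

    # "It is raining -> (I am wet and I am annoyed)"
    rule = ('implies', 'raining', ('and', 'wet', 'annoyed'))
    print(evaluate(rule, {'raining': True, 'wet': True, 'annoyed': True}))   # True
    print(evaluate(rule, {'raining': True, 'wet': False, 'annoyed': True}))  # False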



First Order Predicate Logic

This is a more expressive logic because it builds on propositional logic by allowing us to use constants, variables, predicates, functions and quantifiers in addition to the connectives we've already seen. For instance, the sentence "Every Monday and Wednesday I go to John's house for dinner" can be written in first order predicate logic as:

∀X ((day_of_week(X, monday) ∨ day_of_week(X, wednesday)) → (go_to(me, house_of(john)) ∧ eat_meal(me, dinner))).


Here, the symbols monday, wednesday, me, dinner and john are all constants: base-level objects in the world about which we want to talk. The symbols day_of_week, go_to and eat_meal are predicates which represent relationships between the arguments which appear inside the brackets. For example in eat_meal, the relationship specifies that a person (first argument) eats a particular meal (second argument). In this case, we have represented the fact that me eats dinner. The symbol X is a variable, which can take on a range of values. This enables us to be more expressive, and in particular, we can quantify X with the 'forall' symbol ∀, so that our sentence of predicate logic talks about all possible X's. Finally, the symbol house_of is a function, and - if we can - we are expected to replace house_of(john) with the output of the function (john's house) given the input to the function (john).

The syntax and semantics of predicate logic are covered in more detail as part of the lectures on automated reasoning.



Higher Order Predicate Logic

In first order predicate logic, we are only allowed to quantify over objects. If we allow ourselves to quantify over predicate or function symbols, then we have moved up to the more expressive higher order predicate logic. This means that we can represent meta-level information about our knowledge, such as "For all the functions we've specified, they return the number 10 if the number 7 is input":

∀f, (f(7) = 10).



Fuzzy Logic

In the logics described above, we have been concerned with truth: whether propositions and sentences are true. However, with some natural language statements, it's difficult to assign a "true" or "false" value. For example, is the sentence "Prince Charles is tall" true or false? Some people may say true, and others false, so there's an underlying probability that we may also want to represent. This can be achieved with so-called "fuzzy" logics. The originator of fuzzy logics, Lotfi Zadeh, advocates not thinking about particular fuzzy logics as such, but rather thinking of the "fuzzification" of current theories, and this is beginning to play a part in AI. The combination of logics with theories of probability, and programming agents to reason in the light of uncertain knowledge, are important areas of AI research. Various representation schemes such as Stochastic Logic Programs have an aspect of both logic and probability.



Other logics


Other logics you may consider include:

Multiple valued logics, where different truth values such as "unknown" are allowed. These have some of the advantages of fuzzy logics, without necessarily worrying about probability.

Modal logics, which cater for individual agents' beliefs about the world. For example, one agent could believe that a certain statement is true, but another may not. Modal logics help us deal with statements that may be believed to be true by some, but not all, agents.

Temporal logics, which enable us to write sentences involving considerations of time, for example that a statement may become true some time in the future.

It's not difficult to see why logic has been a very popular representation scheme in AI:

- It's fairly easy to represent knowledge in this way. It allows us to be expressive enough to represent most knowledge, while being constrained enough to be precise about that knowledge.

- There are whole branches of mathematics devoted to the study of it.

- We get a lot of reasoning for free (theorems can be deduced about information in a logical representation, and patterns can be similarly induced).

- Some programming languages grew from logical representations, in particular Prolog. So, if you understand the logic, it's fairly easy to write programs.






Chapter 5


Game Playing

We have now dispensed with the necessary background material for AI problem solving techniques, and we can move on to looking at particular types of problems which have been addressed using AI techniques. The first type of problem we'll look at is getting an agent to compete, either against a human or another artificial agent. This area has been extremely well researched over the last 50 years. Indeed, some of the first chess programs were written by Alan Turing, Claude Shannon and other fore-fathers of modern computing. We only have one lecture to look at this topic, so we'll restrict ourselves to looking at two person games such as chess played by software agents. If you are interested in games involving more teamwork and/or robotics, then a good place to start would be with the RoboCup project.
5.1 MinMax Search

Parents often get two children to share a cake fairly by asking one to cut the cake
and the other to choose which half they want to eat. In this two player cake