Artificial Intelligence: A Modern Approach

Artificial Intelligence
A Modern Approach
Stuart J. Russell and Peter Norvig
Contributing writers:
John F. Canny, Jitendra M. Malik, Douglas D. Edwards
Prentice Hall, Englewood Cliffs, New Jersey 07632
Library of Congress Cataloging-in-Publication Data
Russell, Stuart J. (Stuart Jonathan)
Artificial intelligence : a modern approach / Stuart Russell, Peter Norvig.
p. cm.
Includes bibliographical references and index.
ISBN 0-13-103805-2
1. Artificial intelligence I. Norvig, Peter. II. Title.
Q335.R86 1995
006.3-dc20 94-36444
CIP
Publisher: Alan Apt
Production Editor: Mona Pompili
Developmental Editor: Sondra Chavez
Cover Designers: Stuart Russell and Peter Norvig
Production Coordinator: Lori Bulwin
Editorial Assistant: Shirley McGuire
© 1995 by Prentice-Hall, Inc.
A Simon & Schuster Company
Englewood Cliffs, New Jersey 07632
The author and publisher of this book have used their best efforts in preparing this book. These efforts
include the development, research, and testing of the theories and programs to determine their
effectiveness. The author and publisher shall not be liable in any event for incidental or consequential
damages in connection with, or arising out of, the furnishing, performance, or use of these programs.
All rights reserved. No part of this book may be
reproduced, in any form or by any means,
without permission in writing from the publisher.
Printed in the United States of America
10 9 8 7 6 5 4 3 2 1
ISBN 0-13-103805-2
Prentice-Hall International (UK) Limited, London
Prentice-Hall of Australia Pty. Limited, Sydney
Prentice-Hall Canada, Inc., Toronto
Prentice-Hall Hispanoamericana, S.A., Mexico
Prentice-Hall of India Private Limited, New Delhi
Prentice-Hall of Japan, Inc., Tokyo
Simon & Schuster Asia Pte. Ltd., Singapore
Editora Prentice-Hall do Brasil, Ltda., Rio de Janeiro
Preface
There are many textbooks that offer an introduction to artificial intelligence (AI). This text has
five principal features that together distinguish it from other texts.
1. Unified presentation of the field.
Some texts are organized from a historical perspective, describing each of the major
problems and solutions that have been uncovered in 40 years of AI research. Although
there is value to this perspective, the result is to give the impression of a dozen or so barely
related subfields, each with its own techniques and problems. We have chosen to present
AI as a unified field, working on a common problem in various guises. This has entailed
some reinterpretation of past research, showing how it fits within a common framework
and how it relates to other work that was historically separate. It has also led us to include
material not normally covered in AI texts.
2. Intelligent agent design.
The unifying theme of the book is the concept of an intelligent agent. In this view, the
problem of AI is to describe and build agents that receive percepts from the environment
and perform actions. Each such agent is implemented by a function that maps percepts
to actions, and we cover different ways to represent these functions, such as production
systems, reactive agents, logical planners, neural networks, and decision-theoretic systems.
We explain the role of learning as extending the reach of the designer into unknown
environments, and show how it constrains agent design, favoring explicit knowledge
representation and reasoning. We treat robotics and vision not as independently defined
problems, but as occurring in the service of goal achievement. We stress the importance of
the task environment characteristics in determining the appropriate agent design.
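The percept-to-action mapping described above can be sketched in a few lines. This fragment is not from the book (whose accompanying code is in Common Lisp); the two-square vacuum world and the names `reflex_vacuum_agent` and `run` are our own, chosen only to illustrate the idea of an agent as a function from percepts to actions.

```python
def reflex_vacuum_agent(percept):
    """A reflex agent: map a (location, status) percept directly to an action."""
    location, status = percept
    if status == "Dirty":
        return "Suck"
    return "Right" if location == "A" else "Left"

def run(agent, world, location="A", steps=4):
    """Drive the agent in a hypothetical two-square world; return its action trace."""
    trace = []
    for _ in range(steps):
        action = agent((location, world[location]))
        trace.append(action)
        if action == "Suck":
            world[location] = "Clean"
        elif action == "Right":
            location = "B"
        else:
            location = "A"
    return trace

print(run(reflex_vacuum_agent, {"A": "Dirty", "B": "Dirty"}))
```

The agent here is a pure function of the current percept; the richer representations the book covers (logical planners, decision-theoretic systems, and so on) replace this function with one that consults internal state and knowledge.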
3. Comprehensive and up-to-date coverage.
We cover areas that are sometimes underemphasized, including reasoning under uncertainty,
learning, neural networks, natural language, vision, robotics, and philosophical
foundations. We cover many of the more recent ideas in the field, including simulated
annealing, memory-bounded search, global ontologies, dynamic and adaptive probabilistic
(Bayesian) networks, computational learning theory, and reinforcement learning. We also
provide extensive notes and references on the historical sources and current literature for
the main ideas in each chapter.
4. Equal emphasis on theory and practice.
Theory and practice are given equal emphasis. All material is grounded in first principles
with rigorous theoretical analysis where appropriate, but the point of the theory is to get the
concepts across and explain how they are used in actual, fielded systems. The reader of this
book will come away with an appreciation for the basic concepts and mathematical methods
of AI, and also with an idea of what can and cannot be done with today's technology, at
what cost, and using what techniques.
5. Understanding through implementation.
The principles of intelligent agent design are clarified by using them to actually build agents.
Chapter 2 provides an overview of agent design, including a basic agent and environment
project. Subsequent chapters include programming exercises that ask the student to add
capabilities to the agent, making it behave more and more interestingly and (we hope)
intelligently. Algorithms are presented at three levels of detail: prose descriptions and
pseudo-code in the text, and complete Common Lisp programs available on the Internet or
on floppy disk. All the agent programs are interoperable and work in a uniform framework
for simulated environments.
This book is primarily intended for use in an undergraduate course or course sequence. It
can also be used in a graduate-level course (perhaps with the addition of some of the primary
sources suggested in the bibliographical notes). Because of its comprehensive coverage and the
large number of detailed algorithms, it is useful as a primary reference volume for AI graduate
students and professionals wishing to branch out beyond their own subfield. We also hope that
AI researchers could benefit from thinking about the unifying approach we advocate.
The only prerequisite is familiarity with basic concepts of computer science (algorithms,
data structures, complexity) at a sophomore level. Freshman calculus is useful for understanding
neural networks and adaptive probabilistic networks in detail. Some experience with nonnumeric
programming is desirable, but can be picked up in a few weeks' study. We provide implementations
of all algorithms in Common Lisp (see Appendix B), but other languages such as Scheme, Prolog,
Smalltalk, C++, or ML could be used instead.
Overview of the book
The book is divided into eight parts. Part I, "Artificial Intelligence," sets the stage for all the others,
and offers a view of the AI enterprise based around the idea of intelligent agents—systems that
can decide what to do and do it. Part II, "Problem Solving," concentrates on methods for deciding
what to do when one needs to think ahead several steps, for example in navigating across country
or playing chess. Part III, "Knowledge and Reasoning," discusses ways to represent knowledge
about the world—how it works, what it is currently like, what one's actions might do—and how
to reason logically with that knowledge. Part IV, "Acting Logically," then discusses how to
use these reasoning methods to decide what to do, particularly by constructing plans. Part V,
"Uncertain Knowledge and Reasoning," is analogous to Parts III and IV, but it concentrates on
reasoning and decision-making in the presence of uncertainty about the world, as might be faced,
for example, by a system for medical diagnosis and treatment.
Together, Parts II to V describe that part of the intelligent agent responsible for reaching
decisions. Part VI, "Learning," describes methods for generating the knowledge required by these
decision-making components; it also introduces a new kind of component, the neural network,
and its associated learning procedures. Part VII, "Communicating, Perceiving, and Acting,"
describes ways in which an intelligent agent can perceive its environment so as to know what is
going on, whether by vision, touch, hearing, or understanding language; and ways in which it can
turn its plans into real actions, either as robot motion or as natural language utterances. Finally,
Part VIII, "Conclusions," analyses the past and future of AI, and provides some light amusement
by discussing what AI really is and why it has already succeeded to some degree, and airing the
views of those philosophers who believe that AI can never succeed at all.
Using this book
This is a big book; covering all the chapters and the projects would take two semesters. You will
notice that the book is divided into 27 chapters, which makes it easy to select the appropriate
material for any chosen course of study. Each chapter can be covered in approximately one week.
Some reasonable choices for a variety of quarter and semester courses are as follows:
• One-quarter general introductory course:
Chapters 1, 2, 3, 6, 7, 9, 11, 14, 15, 18, 22.
• One-semester general introductory course:
Chapters 1, 2, 3, 4, 6, 7, 9, 11, 13, 14, 15, 18, 19, 22, 24, 26, 27.
• One-quarter course with concentration on search and planning:
Chapters 1, 2, 3, 4, 5, 6, 7, 9, 11, 12, 13.
• One-quarter course with concentration on reasoning and expert systems:
Chapters 1, 2, 3, 6, 7, 8, 9, 10, 11, 14, 15, 16.
• One-quarter course with concentration on natural language:
Chapters 1, 2, 3, 6, 7, 8, 9, 14, 15, 22, 23, 26, 27.
• One-semester course with concentration on learning and neural networks:
Chapters 1, 2, 3, 4, 6, 7, 9, 14, 15, 16, 17, 18, 19, 20, 21.
• One-semester course with concentration on vision and robotics:
Chapters 1, 2, 3, 4, 6, 7, 11, 13, 14, 15, 16, 17, 24, 25, 26.
These sequences could be used for both undergraduate and graduate courses. The relevant parts
of the book could also be used to provide the first phase of graduate specialty courses. For
example, Part VI could be used in conjunction with readings from the literature in a course on
machine learning.
We have decided not to designate certain sections as "optional" or certain exercises as
"difficult," as individual tastes and backgrounds vary widely. Exercises requiring significant
programming are marked with a keyboard icon, and those requiring some investigation of the
literature are marked with a book icon. Altogether, over 300 exercises are included. Some of
them are large enough to be considered term projects. Many of the exercises can best be solved
by taking advantage of the code repository, which is described in Appendix B. Throughout the
book, important points are marked with a pointing icon.
If you have any comments on the book, we'd like to hear from you. Appendix B includes
information on how to contact us.
Acknowledgements
Jitendra Malik wrote most of Chapter 24 (Vision) and John Canny wrote most of Chapter
25 (Robotics). Doug Edwards researched the Historical Notes sections for all chapters and wrote
much of them. Tim Huang helped with formatting of the diagrams and algorithms. Maryann
Simmons prepared the 3-D model from which the cover illustration was produced, and Lisa
Marie Sardegna did the postprocessing for the final image. Alan Apt, Mona Pompili, and Sondra
Chavez at Prentice Hall tried their best to keep us on schedule and made many helpful suggestions
on design and content.
Stuart would like to thank his parents, brother, and sister for their encouragement and their
patience at his extended absence. He hopes to be home for Christmas. He would also like to
thank Loy Sheflott for her patience and support. He hopes to be home some time tomorrow
afternoon. His intellectual debt to his Ph.D. advisor, Michael Genesereth, is evident throughout
the book. RUGS (Russell's Unusual Group of Students) have been unusually helpful.
Peter would like to thank his parents (Torsten and Gerda) for getting him started, his advisor
(Bob Wilensky), supervisors (Bill Woods and Bob Sproull) and employer (Sun Microsystems)
for supporting his work in AI, and his wife (Kris) and friends for encouraging and tolerating him
through the long hours of writing.
Before publication, drafts of this book were used in 26 courses by about 1000 students.
Both of us deeply appreciate the many comments of these students and instructors (and other
reviewers). We can't thank them all individually, but we would like to acknowledge the especially
helpful comments of these people:
Tony Barrett, Howard Beck, John Binder, Larry Bookman, Chris Brown, Lauren
Burka, Murray Campbell, Anil Chakravarthy, Roberto Cipolla, Doug Edwards,
Kutluhan Erol, Jeffrey Forbes, John Fosler, Bob Futrelle, Sabine Glesner, Barbara Grosz,
Steve Hanks, Othar Hansson, Jim Hendler, Tim Huang, Seth Hutchinson, Dan
Jurafsky, Leslie Pack Kaelbling, Keiji Kanazawa, Surekha Kasibhatla, Simon Kasif,
Daphne Koller, Rich Korf, James Kurien, John Lazzaro, Jason Leatherman, Jon
LeBlanc, Jim Martin, Andy Mayer, Steve Minton, Leora Morgenstern, Ron Musick,
Stuart Nelson, Steve Omohundro, Ron Parr, Tony Passera, Michael Pazzani, Ira
Pohl, Martha Pollack, Bruce Porter, Malcolm Pradhan, Lorraine Prior, Greg Provan,
Philip Resnik, Richard Scherl, Daniel Sleator, Robert Sproull, Lynn Stein, Devika
Subramanian, Rich Sutton, Jonathan Tash, Austin Tate, Mark Torrance, Randall
Upham, Jim Waldo, Bonnie Webber, Michael Wellman, Dan Weld, Richard Yen,
Shlomo Zilberstein.
Summary of Contents
I Artificial Intelligence 1
1 Introduction ................................................................. 3
2 Intelligent Agents ............................................................ 31
II Problem-solving 53
3 Solving Problems by Searching .............................................. 55
4 Informed Search Methods ................................................... 92
5 Game Playing ................................................................ 122
III Knowledge and reasoning 149
6 Agents that Reason Logically ................................................ 151
7 First-Order Logic ............................................................ 185
8 Building a Knowledge Base .................................................. 217
9 Inference in First-Order Logic ............................................... 265
10 Logical Reasoning Systems ................................................... 297
IV Acting logically 335
11 Planning ..................................................................... 337
12 Practical Planning ........................................................... 367
13 Planning and Acting ......................................................... 392
V Uncertain knowledge and reasoning 413
14 Uncertainty .................................................................. 415
15 Probabilistic Reasoning Systems ............................................. 436
16 Making Simple Decisions .................................................... 471
17 Making Complex Decisions .................................................. 498
VI Learning 523
18 Learning from Observations ................................................. 525
19 Learning in Neural and Belief Networks ..................................... 563
20 Reinforcement Learning ..................................................... 598
21 Knowledge in Learning ...................................................... 625
VII Communicating, perceiving, and acting 649
22 Agents that Communicate ................................................... 651
23 Practical Natural Language Processing ...................................... 691
24 Perception ................................................................... 724
25 Robotics ..................................................................... 773
VIII Conclusions 815
26 Philosophical Foundations ................................................... 817
27 AI: Present and Future ...................................................... 842
A Complexity analysis and O() notation ........................................ 851
B Notes on Languages and Algorithms ......................................... 854
Bibliography 859
Index 905
Contents
I Artificial Intelligence 1
1 Introduction 3
1.1 What is AI? .................................... 4
Acting humanly: The Turing Test approach .................... 5
Thinking humanly: The cognitive modelling approach .............. 6
Thinking rationally: The laws of thought approach ................ 6
Acting rationally: The rational agent approach .................. 7
1.2 The Foundations of Artificial Intelligence ..................... 8
Philosophy (428 B.C.-present) .......................... 8
Mathematics (c. 800-present) ........................... 11
Psychology (1879-present) ............................ 12
Computer engineering (1940-present) ...................... 14
Linguistics (1957-present) ............................ 15
1.3 The History of Artificial Intelligence ....................... 16
The gestation of artificial intelligence (1943-1956) ............... 16
Early enthusiasm, great expectations (1952-1969) ................ 17
A dose of reality (1966-1974) ........................... 20
Knowledge-based systems: The key to power? (1969-1979) .......... 22
AI becomes an industry (1980-1988) ....................... 24
The return of neural networks (1986-present) .................. 24
Recent events (1987-present) ........................... 25
1.4 The State of the Art ................................ 26
1.5 Summary ...................................... 27
Bibliographical and Historical Notes ........................... 28
Exercises ......................................... 28
2 Intelligent Agents 31
2.1 Introduction .................................... 31
2.2 How Agents Should Act .............................. 31
The ideal mapping from percept sequences to actions .............. 34
Autonomy ..................................... 35
2.3 Structure of Intelligent Agents ........................... 35
Agent programs .................................. 37
Why not just look up the answers? ........................ 38
An example .................................... 39
Simple reflex agents ................................ 40
Agents that keep track of the world ........................ 41
Goal-based agents ................................. 42
Utility-based agents ................................ 44
2.4 Environments ................................... 45
Properties of environments ............................ 46
Environment programs .............................. 47
2.5 Summary ...................................... 49
Bibliographical and Historical Notes ........................... 50
Exercises ......................................... 50
II Problem-solving 53
3 Solving Problems by Searching 55
3.1 Problem-Solving Agents .............................. 55
3.2 Formulating Problems ............................... 57
Knowledge and problem types .......................... 58
Well-defined problems and solutions ....................... 60
Measuring problem-solving performance ..................... 61
Choosing states and actions ............................ 61
3.3 Example Problems ................................. 63
Toy problems ................................... 63
Real-world problems ............................... 68
3.4 Searching for Solutions .............................. 70
Generating action sequences ............................. 70
Data structures for search trees .......................... 72
3.5 Search Strategies .................................. 73
Breadth-first search ................................ 74
Uniform cost search ................................ 75
Depth-first search ................................. 77
Depth-limited search ................................ 78
Iterative deepening search ............................. 78
Bidirectional search ................................ 80
Comparing search strategies ............................ 81
3.6 Avoiding Repeated States ............................. 82
3.7 Constraint Satisfaction Search ........................... 83
3.8 Summary ...................................... 85
Bibliographical and Historical Notes ........................... 86
Exercises ......................................... 87
4 Informed Search Methods 92
4.1 Best-First Search .................................. 92
Minimize estimated cost to reach a goal: Greedy search ............. 93
Minimizing the total path cost: A* search .................... 96
4.2 Heuristic Functions ................................ 101
The effect of heuristic accuracy on performance ................. 102
Inventing heuristic functions ............................ 103
Heuristics for constraint satisfaction problems .................. 104
4.3 Memory Bounded Search ............................. 106
Iterative deepening A* search (IDA*) ....................... 106
SMA* search ................................... 107
4.4 Iterative Improvement Algorithms ........................ 111
Hill-climbing search ................................ 111
Simulated annealing ................................ 113
Applications in constraint satisfaction problems ................. 114
4.5 Summary ...................................... 115
Bibliographical and Historical Notes ........................... 115
Exercises ......................................... 118
5 Game Playing 122
5.1 Introduction: Games as Search Problems ..................... 122
5.2 Perfect Decisions in Two-Person Games ..................... 123
5.3 Imperfect Decisions ................................ 126
Evaluation functions ................................ 127
Cutting off search ................................. 129
5.4 Alpha-Beta Pruning ................................ 129
Effectiveness of alpha-beta pruning ........................ 131
5.5 Games That Include an Element of Chance .................... 133
Position evaluation in games with chance nodes ................. 135
Complexity of expectiminimax .......................... 135
5.6 State-of-the-Art Game Programs ......................... 136
Chess ........................................ 137
Checkers or Draughts ............................... 138
Othello ....................................... 138
Backgammon ................................... 139
Go ......................................... 139
5.7 Discussion ..................................... 139
5.8 Summary ...................................... 141
Bibliographical and Historical Notes ........................... 141
Exercises ......................................... 145
III Knowledge and reasoning 149
6 Agents that Reason Logically 151
6.1 A Knowledge-Based Agent ............................ 151
6.2 The Wumpus World Environment ......................... 153
Specifying the environment ............................ 154
Acting and reasoning in the wumpus world .................... 155
6.3 Representation, Reasoning, and Logic ...................... 157
Representation ................................... 160
Inference ...................................... 163
Logics ....................................... 165
6.4 Propositional Logic: A Very Simple Logic .................... 166
Syntax ....................................... 166
Semantics ..................................... 168
Validity and inference ............................... 169
Models ....................................... 170
Rules of inference for propositional logic ..................... 171
Complexity of propositional inference ...................... 173
6.5 An Agent for the Wumpus World ......................... 174
The knowledge base ................................ 174
Finding the wumpus ................................ 175
Translating knowledge into action ......................... 176
Problems with the propositional agent ...................... 176
6.6 Summary ...................................... 178
Bibliographical and Historical Notes ........................... 178
Exercises ......................................... 180
7 First-Order Logic 185
7.1 Syntax and Semantics ............................... 186
Terms ....................................... 188
Atomic sentences ................................. 189
Complex sentences ................................ 189
Quantifiers ..................................... 189
Equality ...................................... 193
7.2 Extensions and Notational Variations ....................... 194
Higher-order logic ................................. 195
Functional and predicate expressions using the λ operator ............ 195
The uniqueness quantifier ∃! ........................... 196
The uniqueness operator ι ............................. 196
Notational variations ................................ 196
7.3 Using First-Order Logic .............................. 197
The kinship domain ................................ 197
Axioms, definitions, and theorems ........................ 198
The domain of sets ................................. 199
Special notations for sets, lists and arithmetic ................... 200
Asking questions and getting answers ....................... 200
7.4 Logical Agents for the Wumpus World ...................... 201
7.5 A Simple Reflex Agent .............................. 202
Limitations of simple reflex agents ........................ 203
7.6 Representing Change in the World ........................ 203
Situation calculus ................................. 204
Keeping track of location ............................. 206
7.7 Deducing Hidden Properties of the World .................... 208
7.8 Preferences Among Actions ............................ 210
7.9 Toward a Goal-Based Agent ............................ 211
7.10 Summary ...................................... 211
Bibliographical and Historical Notes ........................... 212
Exercises ......................................... 213
8 Building a Knowledge Base 217
8.1 Properties of Good and Bad Knowledge Bases .................. 218
8.2 Knowledge Engineering .............................. 221
8.3 The Electronic Circuits Domain .......................... 223
Decide what to talk about ............................. 223
Decide on a vocabulary .............................. 224
Encode general rules ................................ 225
Encode the specific instance ............................ 225
Pose queries to the inference procedure ...................... 226
8.4 General Ontology ................................. 226
Representing Categories .............................. 229
Measures ...................................... 231
Composite objects ................................. 233
Representing change with events ......................... 234
Times, intervals, and actions ............................ 238
Objects revisited .................................. 240
Substances and objects .............................. 241
Mental events and mental objects ......................... 243
Knowledge and action ............................... 247
8.5 The Grocery Shopping World ........................... 247
Complete description of the shopping simulation ................. 248
Organizing knowledge ............................... 249
Menu-planning ................................... 249
Navigating ..................................... 252
Gathering ..................................... 253
Communicating .................................. 254
Paying ....................................... 255
8.6 Summary ...................................... 256
Bibliographical and Historical Notes ........................... 256
Exercises ......................................... 261
9 Inference in First-Order Logic 265
9.1 Inference Rules Involving Quantifiers ....................... 265
9.2 An Example Proof ................................. 266
9.3 Generalized Modus Ponens ............................ 269
Canonical form .................................. 270
Unification ..................................... 270
Sample proof revisited ............................... 271
9.4 Forward and Backward Chaining ......................... 272
Forward-chaining algorithm ............................ 273
Backward-chaining algorithm ........................... 275
9.5 Completeness ................................... 276
9.6 Resolution: A Complete Inference Procedure ................... 277
The resolution inference rule ........................... 278
Canonical forms for resolution .......................... 278
Resolution proofs ................................. 279
Conversion to Normal Form ............................ 281
Example proof ................................... 282
Dealing with equality ............................... 284
Resolution strategies ................................ 284
9.7 Completeness of resolution ............................ 286
9.8 Summary ...................................... 290
Bibliographical and Historical Notes ........................... 291
Exercises ......................................... 294
10 Logical Reasoning Systems 297
10.1 Introduction .................................... 297
10.2 Indexing, Retrieval, and Unification ........................ 299
Implementing sentences and terms ........................ 299
Store and fetch ................................... 299
Table-based indexing ............................... 300
Tree-based indexing ................................ 301
The unification algorithm ............................. 302
10.3 Logic Programming Systems ........................... 304
The Prolog language ................................ 304
Implementation .................................. 305
Compilation of logic programs .......................... 306
Other logic programming languages ....................... 308
Advanced control facilities ............................ 308
10.4 Theorem Provers .................................. 310
Design of a theorem prover ............................ 310
Extending Prolog ................................. 311
Theorem provers as assistants ........................... 312
Practical uses of theorem provers ......................... 313
10.5 Forward-Chaining Production Systems ...................... 313
Match phase .................................... 314
Conflict resolution phase ............................. 315
Practical uses of production systems ....................... 316
10.6 Frame Systems and Semantic Networks ..................... 316
Syntax and semantics of semantic networks ................... 317
Inheritance with exceptions ............................ 319
Multiple inheritance ................................ 320
Inheritance and change .............................. 320
Implementation of semantic networks ....................... 321
Expressiveness of semantic networks ....................... 323
10.7 Description Logics ................................. 323
Practical uses of description logics ........................ 325
10.8 Managing Retractions, Assumptions, and Explanations ............. 325
10.9 Summary ...................................... 327
Bibliographical and Historical Notes ........................... 328
Exercises ......................................... 332
IV Acting logically 335
11 Planning 337
11.1 A Simple Planning Agent ............................. 337
11.2 From Problem Solving to Planning ........................ 338
11.3 Planning in Situation Calculus . .......................... 341
11.4 Basic Representations for Planning ........................ 343
Representations for states and goals ........................ 343
Representations for actions ............................ 344
Situation space and plan space .......................... 345
Representations for plans ............................. 346
Solutions ...................................... 349
11.5 A Partial-Order Planning Example ........................ 349
11.6 A Partial-Order Planning Algorithm ....................... 355
11.7 Planning with Partially Instantiated Operators .................. 357
11.8 Knowledge Engineering for Planning ....................... 359
The blocks world ................................. 359
Shakey's world ................................... 360
11.9 Summary ...................................... 362
Bibliographical and Historical Notes ........................... 363
Exercises ......................................... 364
12 Practical Planning 367
12.1 Practical Planners ................................. 367
Spacecraft assembly, integration, and verification ................. 367
Job shop scheduling ................................ 369
Scheduling for space missions ........................... 369
Buildings, aircraft carriers, and beer factories ................... 371
12.2 Hierarchical Decomposition ............................ 371
Extending the language .............................. 372
Modifying the planner ............................... 374
12.3 Analysis of Hierarchical Decomposition ..................... 375
Decomposition and sharing ............................ 379
Decomposition versus approximation ....................... 380
12.4 More Expressive Operator Descriptions . ..................... 381
Conditional effects ................................. 381
Negated and disjunctive goals ........................... 382
Universal quantification .............................. 383
A planner for expressive operator descriptions .................. 384
12.5 Resource Constraints ............................... 386
Using measures in planning ............................ 386
Temporal constraints ................................ 388
12.6 Summary ...................................... 388
Bibliographical and Historical Notes ........................... 389
Exercises ......................................... 390
13 Planning and Acting 392
13.1 Conditional Planning ............................... 393
The nature of conditional plans .......................... 393
An algorithm for generating conditional plans .................. 395
Extending the plan language ............................ 398
13.2 A Simple Replanning Agent ............................ 401
Simple replanning with execution monitoring ................... 402
13.3 Fully Integrated Planning and Execution ..................... 403
13.4 Discussion and Extensions ............................ 407
Comparing conditional planning and replanning ................. 407
Coercion and abstraction ............................. 409
13.5 Summary ...................................... 410
Bibliographical and Historical Notes ........................... 411
Exercises ......................................... 412
V Uncertain knowledge and reasoning 413
14 Uncertainty 415
14.1 Acting under Uncertainty ............................. 415
Handling uncertain knowledge .......................... 416
Uncertainty and rational decisions . ........................ 418
Design for a decision-theoretic agent ....................... 419
14.2 Basic Probability Notation . ............................ 420
Prior probability .................................. 420
Conditional probability .............................. 421
14.3 The Axioms of Probability ............................ 422
Why the axioms of probability are reasonable .................. 423
The joint probability distribution ......................... 425
14.4 Bayes' Rule and Its Use .............................. 426
Applying Bayes' rule: The simple case ...................... 426
Normalization ................................... 427
Using Bayes' rule: Combining evidence ..................... 428
14.5 Where Do Probabilities Come From? ....................... 430
14.6 Summary ...................................... 431
Bibliographical and Historical Notes ........................... 431
Exercises ......................................... 433
15 Probabilistic Reasoning Systems 436
15.1 Representing Knowledge in an Uncertain Domain ................ 436
15.2 The Semantics of Belief Networks ........................ 438
Representing the joint probability distribution .................. 439
Conditional independence relations in belief networks .............. 444
15.3 Inference in Belief Networks ........................... 445
The nature of probabilistic inferences ....................... 446
An algorithm for answering queries ........................ 447
15.4 Inference in Multiply Connected Belief Networks ................ 453
Clustering methods ................................ 453
Cutset conditioning methods ........................... 454
Stochastic simulation methods .......................... 455
15.5 Knowledge Engineering for Uncertain Reasoning ................ 456
Case study: The Pathfinder system ........................ 457
15.6 Other Approaches to Uncertain Reasoning .................... 458
Default reasoning ................................. 459
Rule-based methods for uncertain reasoning ................... 460
Representing ignorance: Dempster-Shafer theory ................ 462
Representing vagueness: Fuzzy sets and fuzzy logic ............... 463
15.7 Summary ...................................... 464
Bibliographical and Historical Notes ........................... 464
Exercises ......................................... 467
16 Making Simple Decisions 471
16.1 Combining Beliefs and Desires Under Uncertainty ................ 471
16.2 The Basis of Utility Theory ............................ 473
Constraints on rational preferences ........................ 473
... and then there was Utility ........................... 474
16.3 Utility Functions .................................. 475
The utility of money ................................ 476
Utility scales and utility assessment ........................ 478
16.4 Multiattribute utility functions ........................... 480
Dominance ..................................... 481
Preference structure and multiattribute utility ................... 483
16.5 Decision Networks ................................. 484
Representing a decision problem using decision networks ............ 484
Evaluating decision networks ........................... 486
16.6 The Value of Information ............................. 487
A simple example ................................. 487
A general formula ................................. 488
Properties of the value of information ....................... 489
Implementing an information-gathering agent .................. 490
16.7 Decision-Theoretic Expert Systems ........................ 491
16.8 Summary ...................................... 493
Bibliographical and Historical Notes ........................... 493
Exercises ......................................... 495
17 Making Complex Decisions 498
17.1 Sequential Decision Problems ........................... 498
17.2 Value Iteration ................................... 502
17.3 Policy Iteration . .................................. 505
17.4 Decision-Theoretic Agent Design ......................... 508
The decision cycle of a rational agent ....................... 508
Sensing in uncertain worlds ............................ 510
17.5 Dynamic Belief Networks ............................. 514
17.6 Dynamic Decision Networks ........................... 516
Discussion ..................................... 518
17.7 Summary ...................................... 519
Bibliographical and Historical Notes ........................... 520
Exercises ......................................... 521
VI Learning 523
18 Learning from Observations 525
18.1 A General Model of Learning Agents ....................... 525
Components of the performance element ..................... 527
Representation of the components ......................... 528
Available feedback ................................. 528
Prior knowledge .................................. 528
Bringing it all together ............................... 529
18.2 Inductive Learning ................................. 529
18.3 Learning Decision Trees .............................. 531
Decision trees as performance elements ...................... 531
Expressiveness of decision trees .......................... 532
Inducing decision trees from examples ...................... 534
Assessing the performance of the learning algorithm ............... 538
Practical uses of decision tree learning ...................... 538
18.4 Using Information Theory ............................. 540
Noise and overfitting ................................ 542
Broadening the applicability of decision trees ................... 543
18.5 Learning General Logical Descriptions ...................... 544
Hypotheses ..................................... 544
Examples ...................................... 545
Current-best-hypothesis search .......................... 546
Least-commitment search ............................. 549
Discussion ..................................... 552
18.6 Why Learning Works: Computational Learning Theory ............. 552
How many examples are needed? ......................... 553
Learning decision lists ............................... 555
Discussion ..................................... 557
18.7 Summary ...................................... 558
Bibliographical and Historical Notes ........................... 559
Exercises ......................................... 560
19 Learning in Neural and Belief Networks 563
19.1 How the Brain Works ............................... 564
Comparing brains with digital computers ..................... 565
19.2 Neural Networks .................................. 567
Notation ...................................... 567
Simple computing elements ............................ 567
Network structures ................................. 570
Optimal network structure ............................. 572
19.3 Perceptrons .................................... 573
What perceptrons can represent .......................... 573
Learning linearly separable functions ....................... 575
19.4 Multilayer Feed-Forward Networks ........................ 578
Back-propagation learning ............................. 578
Back-propagation as gradient descent search ................... 580
Discussion ..................................... 583
19.5 Applications of Neural Networks ......................... 584
Pronunciation ................................... 585
Handwritten character recognition ........................ 586
Driving ....................................... 586
19.6 Bayesian Methods for Learning Belief Networks ................. 588
Bayesian learning ................................. 588
Belief network learning problems ......................... 589
Learning networks with fixed structure ...................... 589
A comparison of belief networks and neural networks .............. 592
19.7 Summary ...................................... 593
Bibliographical and Historical Notes ........................... 594
Exercises ......................................... 596
20 Reinforcement Learning 598
20.1 Introduction .................................... 598
20.2 Passive Learning in a Known Environment .................... 600
Naïve updating ................................... 601
Adaptive dynamic programming ......................... 603
Temporal difference learning ........................... 604
20.3 Passive Learning in an Unknown Environment .................. 605
20.4 Active Learning in an Unknown Environment .................. 607
20.5 Exploration .................................... 609
20.6 Learning an Action-Value Function ........................ 612
20.7 Generalization in Reinforcement Learning .................... 615
Applications to game-playing ........................... 617
Application to robot control ............................ 617
20.8 Genetic Algorithms and Evolutionary Programming ............... 619
20.9 Summary ...................................... 621
Bibliographical and Historical Notes ........................... 622
Exercises ......................................... 623
21 Knowledge in Learning 625
21.1 Knowledge in Learning .............................. 625
Some simple examples .............................. 626
Some general schemes ............................... 627
21.2 Explanation-Based Learning ........................... 629
Extracting general rules from examples ...................... 630
Improving efficiency ................................ 631
21.3 Learning Using Relevance Information ...................... 633
Determining the hypothesis space ......................... 633
Learning and using relevance information .................... 634
21.4 Inductive Logic Programming ........................... 636
An example .................................... 637
Inverse resolution ................................. 639
Top-down learning methods ............................ 641
21.5 Summary ...................................... 644
Bibliographical and Historical Notes ........................... 645
Exercises ......................................... 647
VII Communicating, perceiving, and acting 649
22 Agents that Communicate 651
22.1 Communication as Action ............................. 652
Fundamentals of language ............................. 654
The component steps of communication ..................... 655
Two models of communication .......................... 659
22.2 Types of Communicating Agents ......................... 659
Communicating using Tell and Ask ........................ 660
Communicating using formal language ...................... 661
An agent that communicates ............................ 662
22.3 A Formal Grammar for a Subset of English .................... 662
The Lexicon of E0 ................................. 664
The Grammar of E0 ................................ 664
22.4 Syntactic Analysis (Parsing) ............................ 664
22.5 Definite Clause Grammar (DCG) ......................... 667
22.6 Augmenting a Grammar .............................. 668
Verb Subcategorization .............................. 669
Generative Capacity of Augmented Grammars .................. 671
22.7 Semantic Interpretation .............................. 672
Semantics as DCG Augmentations ........................ 673
The semantics of "John loves Mary" ....................... 673
The semantics of E1 ................................ 675
Converting quasi-logical form to logical form .................. 677
Pragmatic Interpretation .............................. 678
22.8 Ambiguity and Disambiguation .......................... 680
Disambiguation .................................. 682
22.9 A Communicating Agent ............................. 683
22.10 Summary ...................................... 684
Bibliographical and Historical Notes ........................... 685
Exercises ......................................... 688
23 Practical Natural Language Processing 691
23.1 Practical Applications ............................... 691
Machine translation ................................ 691
Database access .................................. 693
Information retrieval ................................ 694
Text categorization ................................. 695
Extracting data from text ............................. 696
23.2 Efficient Parsing .................................. 696
Extracting parses from the chart: Packing ..................... 701
23.3 Scaling Up the Lexicon .............................. 703
23.4 Scaling Up the Grammar ............................. 705
Nominal compounds and apposition ....................... 706
Adjective phrases ................................. 707
Determiners .................................... 708
Noun phrases revisited ............................... 709
Clausal complements ............................... 710
Relative clauses .................................. 710
Questions ..................................... 711
Handling agrammatical strings .......................... 712
23.5 Ambiguity ..................................... 712
Syntactic evidence ................................. 713
Lexical evidence .................................. 713
Semantic evidence ................................. 713
Metonymy ..................................... 714
Metaphor ...................................... 715
23.6 Discourse Understanding ............................. 715
The structure of coherent discourse ........................ 717
23.7 Summary ...................................... 719
Bibliographical and Historical Notes ........................... 720
Exercises ......................................... 721
24 Perception 724
24.1 Introduction .................................... 724
24.2 Image Formation .................................. 725
Pinhole camera ................................... 725
Lens systems .................................... 727
Photometry of image formation .......................... 729
Spectrophotometry of image formation ...................... 730
24.3 Image-Processing Operations for Early Vision .................. 730
Convolution with linear filters ........................... 732
Edge detection ................................... 733
24.4 Extracting 3-D Information Using Vision ..................... 734
Motion ....................................... 735
Binocular stereopsis ................................ 737
Texture gradients .................................. 742
Shading ...................................... 743
Contour ...................................... 745
24.5 Using Vision for Manipulation and Navigation .................. 749
24.6 Object Representation and Recognition ...................... 751
The alignment method ............................... 752
Using projective invariants ............................ 754
24.7 Speech Recognition ................................ 757
Signal processing ................................. 758
Defining the overall speech recognition model .................. 760
The language model: P(words) .......................... 760
The acoustic model: P(signal | words) ....................... 762
Putting the models together ............................ 764
The search algorithm ............................... 765
Training the model ................................. 766
24.8 Summary ...................................... 767
Bibliographical and Historical Notes ........................... 767
Exercises ......................................... 771
25 Robotics 773
25.1 Introduction .................................... 773
25.2 Tasks: What Are Robots Good For? . ....................... 774
Manufacturing and materials handling ...................... 774
Gofer robots .................................... 775
Hazardous environments .............................. 775
Telepresence and virtual reality .......................... 776
Augmentation of human abilities ......................... 776
25.3 Parts: What Are Robots Made Of? ........................ 777
Effectors: Tools for action ............................. 777
Sensors: Tools for perception ........................... 782
25.4 Architectures .................................... 786
Classical architecture ............................... 787
Situated automata ................................. 788
25.5 Configuration Spaces: A Framework for Analysis ................ 790
Generalized configuration space .......................... 792
Recognizable Sets ................................. 795
25.6 Navigation and Motion Planning ......................... 796
Cell decomposition ................................ 796
Skeletonization methods .............................. 798
Fine-motion planning ............................... 802
Landmark-based navigation ............................ 805
Online algorithms ................................. 806
25.7 Summary ...................................... 809
Bibliographical and Historical Notes ........................... 809
Exercises ......................................... 811
VIII Conclusions 815
26 Philosophical Foundations 817
26.1 The Big Questions ................................. 817
26.2 Foundations of Reasoning and Perception .................... 819
26.3 On the Possibility of Achieving Intelligent Behavior ............... 822
The mathematical objection ............................ 824
The argument from informality .......................... 826
26.4 Intentionality and Consciousness ......................... 830
The Chinese Room ................................ 831
The Brain Prosthesis Experiment ......................... 835
Discussion ..................................... 836
26.5 Summary ...................................... 837
Bibliographical and Historical Notes ........................... 838
Exercises ......................................... 840
27 AI: Present and Future 842
27.1 Have We Succeeded Yet? ............................. 842
27.2 What Exactly Are We Trying to Do? ....................... 845
27.3 What If We Do Succeed? ............................. 848
A Complexity analysis and O() notation 851
A.1 Asymptotic Analysis ................................ 851
A.2 Inherently Hard Problems ............................. 852
Bibliographical and Historical Notes ........................... 853
B Notes on Languages and Algorithms 854
B.1 Defining Languages with Backus-Naur Form (BNF) ............... 854
B.2 Describing Algorithms with Pseudo-Code .................... 855
Nondeterminism .................................. 855
Static variables ................................... 856
Functions as values ................................ 856
B.3 The Code Repository ............................... 857
B.4 Comments ..................................... 857
Bibliography 859
Index 905
Part I
ARTIFICIAL INTELLIGENCE
The two chapters in this part introduce the subject of Artificial Intelligence or AI
and our approach to the subject: that AI is the study of agents that exist in an
environment and perceive and act.
The Foundations of Artificial Intelligence
and subtracting machine called the Pascaline. Leibniz improved on this in 1694, building a
mechanical device that multiplied by doing repeated addition. Progress stalled for over a century
until Charles Babbage (1792-1871) dreamed that logarithm tables could be computed by machine.
He designed a machine for this task, but never completed the project. Instead, he turned to the
design of the Analytical Engine, for which Babbage invented the ideas of addressable memory,
stored programs, and conditional jumps. Although the idea of programmable machines was
not new—in 1805, Joseph Marie Jacquard invented a loom that could be programmed using
punched cards—Babbage's machine was the first artifact possessing the characteristics necessary
for universal computation. Babbage's colleague Ada Lovelace, daughter of the poet Lord Byron,
wrote programs for the Analytical Engine and even speculated that the machine could play chess
or compose music. Lovelace was the world's first programmer, and the first of many to endure
massive cost overruns and to have an ambitious project ultimately abandoned. Babbage's basic
design was proven viable by Doron Swade and his colleagues, who built a working model using
only the mechanical techniques available at Babbage's time (Swade, 1993). Babbage had the
right idea, but lacked the organizational skills to get his machine built.
AI also owes a debt to the software side of computer science, which has supplied the
operating systems, programming languages, and tools needed to write modern programs (and
papers about them). But this is one area where the debt has been repaid: work in AI has pioneered
many ideas that have made their way back to "mainstream" computer science, including time
sharing, interactive interpreters, the linked list data type, automatic storage management, and
some of the key concepts of object-oriented programming and integrated program development
environments with graphical user interfaces.
Linguistics (1957-present)
In 1957, B. F. Skinner published Verbal Behavior. This was a comprehensive, detailed account
of the behaviorist approach to language learning, written by the foremost expert in the field. But
curiously, a review of the book became as well-known as the book itself, and served to almost kill
off interest in behaviorism. The author of the review was Noam Chomsky, who had just published
a book on his own theory, Syntactic Structures. Chomsky showed how the behaviorist theory did
not address the notion of creativity in language—it did not explain how a child could understand
and make up sentences that he or she had never heard before. Chomsky's theory—based on
syntactic models going back to the Indian linguist Panini (c. 350 B.C.)—could explain this, and
unlike previous theories, it was formal enough that it could in principle be programmed.
Later developments in linguistics showed the problem to be considerably more complex
than it seemed in 1957. Language is ambiguous and leaves much unsaid. This means that
understanding language requires an understanding of the subject matter and context, not just an
understanding of the structure of sentences. This may seem obvious, but it was not appreciated
until the early 1960s. Much of the early work in knowledge representation (the study of how to
put knowledge into a form that a computer can reason with) was tied to language and informed
by research in linguistics, which was connected in turn to decades of work on the philosophical
analysis of language.
She also gave her name to Ada, the U.S. Department of Defense's all-purpose programming language.
1
INTRODUCTION
In which we try to explain why we consider artificial intelligence to be a subject most
worthy of study, and in which we try to decide what exactly it is, this being a good
thing to decide before embarking.
Humankind has given itself the scientific name homo sapiens—man the wise—because our
mental capacities are so important to our everyday lives and our sense of self. The field of
artificial intelligence, or AI, attempts to understand intelligent entities. Thus, one reason to
study it is to learn more about ourselves. But unlike philosophy and psychology, which are
also concerned with intelligence, AI strives to build intelligent entities as well as understand
them. Another reason to study AI is that these constructed intelligent entities are interesting and
useful in their own right. AI has produced many significant and impressive products even at this
early stage in its development. Although no one can predict the future in detail, it is clear that
computers with human-level intelligence (or better) would have a huge impact on our everyday
lives and on the future course of civilization.
AI addresses one of the ultimate puzzles. How is it possible for a slow, tiny brain, whether
biological or electronic, to perceive, understand, predict, and manipulate a world far larger and
more complicated than itself? How do we go about making something with those properties?
These are hard questions, but unlike the search for faster-than-light travel or an antigravity device,
the researcher in AI has solid evidence that the quest is possible. All the researcher has to do is
look in the mirror to see an example of an intelligent system.
AI is one of the newest disciplines. It was formally initiated in 1956, when the name
was coined, although at that point work had been under way for about five years. Along with
modern genetics, it is regularly cited as the "field I would most like to be in" by scientists in other
disciplines. A student in physics might reasonably feel that all the good ideas have already been
taken by Galileo, Newton, Einstein, and the rest, and that it takes many years of study before one
can contribute new ideas. AI, on the other hand, still has openings for a full-time Einstein.
The study of intelligence is also one of the oldest disciplines. For over 2000 years, philoso-
phers have tried to understand how seeing, learning, remembering, and reasoning could, or should,
be done.1 The advent of usable computers in the early 1950s turned the learned but armchair
speculation concerning these mental faculties into a real experimental and theoretical discipline.
Many felt that the new "Electronic Super-Brains" had unlimited potential for intelligence. "Faster
Than Einstein" was a typical headline. But as well as providing a vehicle for creating artificially
intelligent entities, the computer provides a tool for testing theories of intelligence, and many
theories failed to withstand the test—a case of "out of the armchair, into the fire." AI has turned
out to be more difficult than many at first imagined, and modern ideas are much richer, more
subtle, and more interesting as a result.
AI currently encompasses a huge variety of subfields, from general-purpose areas such as
perception and logical reasoning, to specific tasks such as playing chess, proving mathematical
theorems, writing poetry, and diagnosing diseases. Often, scientists in other fields move gradually
into artificial intelligence, where they find the tools and vocabulary to systematize and automate
the intellectual tasks on which they have been working all their lives. Similarly, workers in AI
can choose to apply their methods to any area of human intellectual endeavor. In this sense, it is
truly a universal field.
1.1 WHAT IS AI?
We have now explained why AI is exciting, but we have not said what it is. We could just say,
"Well, it has to do with smart programs, so let's get on and write some." But the history of science
shows that it is helpful to aim at the right goals. Early alchemists, looking for a potion for eternal
life and a method to turn lead into gold, were probably off on the wrong foot. Only when the aim
changed, to that of finding explicit theories that gave accurate predictions of the terrestrial world,
in the same way that early astronomy predicted the apparent motions of the stars and planets,
could the scientific method emerge and productive science take place.
Definitions of artificial intelligence according to eight recent textbooks are shown in Fig- j
ure 1.1. Thes e definition s var y alon g two mai n dimensions. The ones on top are concerne d
with thought processes and reasoning, whereas the ones on the bottom address behavior. Also,!
the definition s on the lef t measur e succes s in terms of human performance, wherea s the one s 1
on the right measure against an ideal concept of intelligence, which we will call rationality. A!
system is rationa l if it does the right thing. Thi s gives us four possibl e goal s to pursue in artificia l j
intelligence, as seen in the caption of Figure 1.1.
Historically, all four approaches have been followed. As one might expect, a tension exists
between approaches centered around humans and approaches centered around rationality.2 A
human-centered approach must be an empirical science, involving hypothesis and experimental
1 A more recent branch of philosophy is concerned with proving that AI is impossible. We will return to this interesting
viewpoint in Chapter 26.
2 We should point out that by distinguishing between human and rational behavior, we are not suggesting that humans
are necessarily "irrational" in the sense of "emotionally unstable" or "insane." One merely need note that we often make
mistakes; we are not all chess grandmasters even though we may know all the rules of chess; and unfortunately, not
everyone gets an A on the exam. Some systematic errors in human reasoning are cataloged by Kahneman et al. (1982).
"The exciting new effort to make computers think ... machines with minds, in the full and literal sense" (Haugeland, 1985)

"[The automation of] activities that we associate with human thinking, activities such as decision-making, problem solving, learning ..." (Bellman, 1978)

"The art of creating machines that perform functions that require intelligence when performed by people" (Kurzweil, 1990)

"The study of how to make computers do things at which, at the moment, people are better" (Rich and Knight, 1991)

"The study of mental faculties through the use of computational models" (Charniak and McDermott, 1985)

"The study of the computations that make it possible to perceive, reason, and act" (Winston, 1992)

"A field of study that seeks to explain and emulate intelligent behavior in terms of computational processes" (Schalkoff, 1990)

"The branch of computer science that is concerned with the automation of intelligent behavior" (Luger and Stubblefield, 1993)

Figure 1.1 Some definitions of AI. They are organized into four categories:
Systems that think like humans.
Systems that act like humans.
Systems that think rationally.
Systems that act rationally.
confirmation. A rationalist approach involves a combination of mathematics and engineering.
People in each group sometimes cast aspersions on work done in the other groups, but the truth
is that each direction has yielded valuable insights. Let us look at each in more detail.
TURING TEST
KNOWLEDGE REPRESENTATION
AUTOMATED REASONING
MACHINE LEARNING
Acting humanly: The Turing Test approach
The Turing Test, proposed by Alan Turing (1950), was designed to provide a satisfactory
operational definition of intelligence. Turing defined intelligent behavior as the ability to achieve
human-level performance in all cognitive tasks, sufficient to fool an interrogator. Roughly
speaking, the test he proposed is that the computer should be interrogated by a human via a
teletype, and passes the test if the interrogator cannot tell if there is a computer or a human at the
other end. Chapter 26 discusses the details of the test, and whether or not a computer is really
intelligent if it passes. For now, programming a computer to pass the test provides plenty to work
on. The computer would need to possess the following capabilities:
◊ natural language processing to enable it to communicate successfully in English (or some
other human language);
◊ knowledge representation to store information provided before or during the interrogation;
◊ automated reasoning to use the stored information to answer questions and to draw new
conclusions;
◊ machine learning to adapt to new circumstances and to detect and extrapolate patterns.
Turing's test deliberately avoided direct physical interaction between the interrogator and the
computer, because physical simulation of a person is unnecessary for intelligence. However,
TOTAL TURING TEST
COMPUTER VISION
ROBOTICS

the so-called total Turing Test includes a video signal so that the interrogator can test the
subject's perceptual abilities, as well as the opportunity for the interrogator to pass physical
objects "through the hatch." To pass the total Turing Test, the computer will need
◊ computer vision to perceive objects, and
◊ robotics to move them about.
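The interrogation protocol just described can be sketched in a few lines of Python. This is our own toy illustration, not anything from the text: the `ask`, `verdict`, `human`, and `machine` callables are hypothetical stand-ins for the interrogator and the two hidden parties.

```python
import random

def imitation_game(ask, verdict, human, machine, rounds=3):
    """One session of the test: an interrogator (ask/verdict) converses by
    teletype with hidden players 'X' and 'Y', one human and one machine,
    then names the label it believes is the machine. The machine passes
    if the interrogator guesses wrong."""
    assignment = {"X": human, "Y": machine}
    if random.random() < 0.5:              # hide who is behind each label
        assignment = {"X": machine, "Y": human}
    transcript = []
    for _ in range(rounds):
        question = ask(transcript)
        for label in ("X", "Y"):
            transcript.append((label, assignment[label](question)))
    machine_label = "X" if assignment["X"] is machine else "Y"
    return verdict(transcript) != machine_label
```

The capabilities in the list above are exactly what a `machine` callable would need in order to produce replies indistinguishable from the `human` one over such a transcript.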
Within AI, there has not been a big effort to try to pass the Turing test. The issue of acting
like a human comes up primarily when AI programs have to interact with people, as when an
expert system explains how it came to its diagnosis, or a natural language processing system has
a dialogue with a user. These programs must behave according to certain normal conventions of
human interaction in order to make themselves understood. The underlying representation and
reasoning in such a system may or may not be based on a human model.
COGNITIVE SCIENCE
Thinking humanly: The cognitive modelling approach
If we are going to say that a given program thinks like a human, we must have some way of
determining how humans think. We need to get inside the actual workings of human minds.
There are two ways to do this: through introspection—trying to catch our own thoughts as they
go by—or through psychological experiments. Once we have a sufficiently precise theory of
the mind, it becomes possible to express the theory as a computer program. If the program's
input/output and timing behavior matches human behavior, that is evidence that some of the
program's mechanisms may also be operating in humans. For example, Newell and Simon, who
developed GPS, the "General Problem Solver" (Newell and Simon, 1961), were not content to
have their program correctly solve problems. They were more concerned with comparing the
trace of its reasoning steps to traces of human subjects solving the same problems. This is in
contrast to other researchers of the same time (such as Wang (1960)), who were concerned with
getting the right answers regardless of how humans might do it. The interdisciplinary field of
cognitive science brings together computer models from AI and experimental techniques from
psychology to try to construct precise and testable theories of the workings of the human mind.
Although cognitive science is a fascinating field in itself, we are not going to be discussing
it all that much in this book. We will occasionally comment on similarities or differences between
AI techniques and human cognition. Real cognitive science, however, is necessarily based on
experimental investigation of actual humans or animals, and we assume that the reader only has
access to a computer for experimentation. We will simply note that AI and cognitive science
continue to fertilize each other, especially in the areas of vision, natural language, and learning.
The history of psychological theories of cognition is briefly covered on page 12.
SYLLOGISMS
Thinking rationally: The laws of thought approach
The Greek philosopher Aristotle was one of the first to attempt to codify "right thinking," that is,
irrefutable reasoning processes. His famous syllogisms provided patterns for argument structures
that always gave correct conclusions given correct premises. For example, "Socrates is a man;
all men are mortal; therefore Socrates is mortal." These laws of thought were supposed to govern
the operation of the mind, and initiated the field of logic.

LOGIC
LOGICIST
The development of formal logic in the late nineteenth and early twentieth centuries, which
we describe in more detail in Chapter 6, provided a precise notation for statements about all kinds
of things in the world and the relations between them. (Contrast this with ordinary arithmetic
notation, which provides mainly for equality and inequality statements about numbers.) By 1965,
programs existed that could, given enough time and memory, take a description of a problem
in logical notation and find the solution to the problem, if one exists. (If there is no solution,
the program might never stop looking for it.) The so-called logicist tradition within artificial
intelligence hopes to build on such programs to create intelligent systems.
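As a toy illustration of such mechanical inference (our own sketch, not a program from the text), a few lines of Python can apply syllogistic rules of the form "all A are B" to individual facts until no new conclusions appear:

```python
def forward_chain(facts, rules):
    """Repeatedly apply 'all A are B' rules to (individual, category) facts,
    adding derived facts until no new conclusions appear."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for a, b in rules:                        # rule: all a are b
            for individual, category in list(derived):
                if category == a and (individual, b) not in derived:
                    derived.add((individual, b))  # conclude: individual is b
                    changed = True
    return derived

facts = {("Socrates", "man")}
rules = [("man", "mortal")]
print(("Socrates", "mortal") in forward_chain(facts, rules))  # True
```

The termination caveat in the text applies even here: with rules that keep generating new categories, the loop can run without bound.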
There are two main obstacles to this approach. First, it is not easy to take informal
knowledge and state it in the formal terms required by logical notation, particularly when the
knowledge is less than 100% certain. Second, there is a big difference between being able to
solve a problem "in principle" and doing so in practice. Even problems with just a few dozen
facts can exhaust the computational resources of any computer unless it has some guidance as to
which reasoning steps to try first. Although both of these obstacles apply to any attempt to build
computational reasoning systems, they appeared first in the logicist tradition because the power
of the representation and reasoning systems is well-defined and fairly well understood.
AGENT
Acting rationally: The rational agent approach
Acting rationally means acting so as to achieve one's goals, given one's beliefs. An agent is just
something that perceives and acts. (This may be an unusual use of the word, but you will get
used to it.) In this approach, AI is viewed as the study and construction of rational agents.
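The perceive-and-act cycle that defines an agent here can be written down directly. The sketch below is our own minimal illustration (the class name, rule table, and percepts are all invented), not a design from the text:

```python
class SimpleReflexAgent:
    """An agent is just something that perceives and acts: this one maps
    each percept it receives to an action via a fixed rule table."""

    def __init__(self, rules, default="do nothing"):
        self.rules = rules          # percept -> action
        self.default = default

    def act(self, percept):
        return self.rules.get(percept, self.default)

# Hypothetical rule: the reflex of pulling a hand off a hot stove.
agent = SimpleReflexAgent({"hand touching hot stove": "withdraw hand"})
print(agent.act("hand touching hot stove"))  # withdraw hand
```

Such a reflex agent acts without any inference at all, which is exactly the point made below: rational action need not always involve reasoning.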
In the "laws of thought" approach to AI, the whole emphasis was on correct inferences.
Making correct inferences is sometimes part of being a rational agent, because one way to act
rationally is to reason logically to the conclusion that a given action will achieve one's goals,
and then to act on that conclusion. On the other hand, correct inference is not all of rationality,
because there are often situations where there is no provably correct thing to do, yet something
must still be done. There are also ways of acting rationally that cannot be reasonably said to
involve inference. For example, pulling one's hand off of a hot stove is a reflex action that is
more successful than a slower action taken after careful deliberation.
All the "cognitive skills" needed for the Turing Test are there to allow rational actions. Thus,
we need the ability to represent knowledge and reason with it because this enables us to reach
good decisions in a wide variety of situations. We need to be able to generate comprehensible
sentences in natural language because saying those sentences helps us get by in a complex society.
We need learning not just for erudition, but because having a better idea of how the world works
enables us to generate more effective strategies for dealing with it. We need visual perception not
just because seeing is fun, but in order to get a better idea of what an action might achieve—for
example, being able to see a tasty morsel helps one to move toward it.
The study of AI as rational agent design therefore has two advantages. First, it is more
general than the "laws of thought" approach, because correct inference is only a useful mechanism
for achieving rationality, and not a necessary one. Second, it is more amenable to scientific
LIMITED RATIONALITY
development than approaches based on human behavior or human thought, because the standard
of rationality is clearly defined and completely general. Human behavior, on the other hand,
is well-adapted for one specific environment and is the product, in part, of a complicated and
largely unknown evolutionary process that still may be far from achieving perfection. This
book will therefore concentrate on general principles of rational agents, and on components for
constructing them. We will see that despite the apparent simplicity with which the problem can
be stated, an enormous variety of issues come up when we try to solve it. Chapter 2 outlines
some of these issues in more detail.
One important point to keep in mind: we will see before too long that achieving perfect
rationality—always doing the right thing—is not possible in complicated environments. The
computational demands are just too high. However, for most of the book, we will adopt the
working hypothesis that understanding perfect decision making is a good place to start. It
simplifies the problem and provides the appropriate setting for most of the foundational material
in the field. Chapters 5 and 17 deal explicitly with the issue of limited rationality—acting
appropriately when there is not enough time to do all the computations one might like.
1.2 THE FOUNDATIONS OF ARTIFICIAL INTELLIGENCE
In this section and the next, we provide a brief history of AI. Although AI itself is a young field,
it has inherited many ideas, viewpoints, and techniques from other disciplines. From over 2000
years of tradition in philosophy, theories of reasoning and learning have emerged, along with the
viewpoint that the mind is constituted by the operation of a physical system. From over 400 years
of mathematics, we have formal theories of logic, probability, decision making, and computation.
From psychology, we have the tools with which to investigate the human mind, and a scientific
language within which to express the resulting theories. From linguistics, we have theories of
the structure and meaning of language. Finally, from computer science, we have the tools with
which to make AI a reality.
Like any history, this one is forced to concentrate on a small number of people and events,
and ignore others that were also important. We choose to arrange events to tell the story of how
the various intellectual components of modern AI came into being. We certainly would not wish
to give the impression, however, that the disciplines from which the components came have all
been working toward AI as their ultimate fruition.
Philosophy (428 B.C.-present)
The safest characterization of the European philosophical tradition is that it consists of a series
of footnotes to Plato.
—Alfred North Whitehead
We begin with the birth of Plato in 428 B.C. His writings range across politics, mathematics,
physics, astronomy, and several branches of philosophy. Together, Plato, his teacher Socrates,
DUALISM
MATERIALISM
EMPIRICIST
INDUCTION
and his student Aristotle laid the foundation for much of western thought and culture. The
philosopher Hubert Dreyfus (1979, p. 67) says that "The story of artificial intelligence might well
begin around 450 B.C." when Plato reported a dialogue in which Socrates asks Euthyphro,3 "I
want to know what is characteristic of piety which makes all actions pious... that I may have it
to turn to, and to use as a standard whereby to judge your actions and those of other men."4 In
other words, Socrates was asking for an algorithm to distinguish piety from non-piety. Aristotle
went on to try to formulate more precisely the laws governing the rational part of the mind. He
developed an informal system of syllogisms for proper reasoning, which in principle allowed one
to mechanically generate conclusions, given initial premises. Aristotle did not believe all parts
of the mind were governed by logical processes; he also had a notion of intuitive reason.
Now that we have the idea of a set of rules that can describe the working of (at least part
of) the mind, the next step is to consider the mind as a physical system. We have to wait for
René Descartes (1596-1650) for a clear discussion of the distinction between mind and matter,
and the problems that arise. One problem with a purely physical conception of the mind is that
it seems to leave little room for free will: if the mind is governed entirely by physical laws, then
it has no more free will than a rock "deciding" to fall toward the center of the earth. Although a
strong advocate of the power of reasoning, Descartes was also a proponent of dualism. He held
that there is a part of the mind (or soul or spirit) that is outside of nature, exempt from physical
laws. On the other hand, he felt that animals did not possess this dualist quality; they could be
considered as if they were machines.
An alternative to dualism is materialism, which holds that all the world (including the
brain and mind) operates according to physical law.5 Gottfried Wilhelm Leibniz (1646-1716) was
probably the first to take the materialist position to its logical conclusion and build a mechanical
device intended to carry out mental operations. Unfortunately, his formulation of logic was so
weak that his mechanical concept generator could not produce interesting results.
It is also possible to adopt an intermediate position, in which one accepts that the mind
has a physical basis, but denies that it can be explained by a reduction to ordinary physical
processes. Mental processes and consciousness are therefore part of the physical world, but
inherently unknowable; they are beyond rational understanding. Some philosophers critical of
AI have adopted exactly this position, as we discuss in Chapter 26.
Barring these possible objections to the aims of AI, philosophy had thus established a
tradition in which the mind was conceived of as a physical device operating principally by
reasoning with the knowledge that it contained. The next problem is then to establish the
source of knowledge. The empiricist movement, starting with Francis Bacon's (1561-1626)
Novum Organum,6 is characterized by the dictum of John Locke (1632-1704): "Nothing is in
the understanding, which was not first in the senses." David Hume's (1711-1776) A Treatise
of Human Nature (Hume, 1978) proposed what is now known as the principle of induction:
3 The Euthyphro describes the events just before the trial of Socrates in 399 B.C. Dreyfus has clearly erred in placing it
51 years earlier.
4 Note that other translations have "goodness/good" instead of "piety/pious."
5 In this view, the perception of "free will" arises because the deterministic generation of behavior is constituted by the
operation of the mind selecting among what appear to be the possible courses of action. They remain "possible" because
the brain does not have access to its own future states.
6 An update of Aristotle's Organon, or instrument of thought.
LOGICAL POSITIVISM
OBSERVATION SENTENCES
CONFIRMATION THEORY
MEANS-ENDS ANALYSIS
that general rules are acquired by exposure to repeated associations between their elements.
The theory was given more formal shape by Bertrand Russell (1872-1970) who introduced
logical positivism. This doctrine holds that all knowledge can be characterized by logical
theories connected, ultimately, to observation sentences that correspond to sensory inputs.7 The
confirmation theory of Rudolf Carnap and Carl Hempel attempted to establish the nature of the
connection between the observation sentences and the more general theories—in other words, to
understand how knowledge can be acquired from experience.
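Hume's principle lends itself to a tiny worked example. The sketch below is our own toy illustration (the observations and threshold are invented), not a program or theory from the text: a general rule is proposed once two elements have been associated often enough.

```python
from collections import Counter

def induce_rules(observations, threshold=3):
    """Propose the general rule 'A -> B' whenever the pair (A, B) has been
    observed together at least `threshold` times."""
    counts = Counter(observations)
    return {f"{a} -> {b}" for (a, b), n in counts.items() if n >= threshold}

obs = [("smoke", "fire")] * 4 + [("smoke", "no fire")]
print(induce_rules(obs))  # {'smoke -> fire'}
```

The choice of threshold is exactly the kind of question confirmation theory wrestled with: how much repeated association justifies a general rule.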
The final element in the philosophical picture of the mind is the connection between
knowledge and action. What form should this connection take, and how can particular actions
be justified? These questions are vital to AI, because only by understanding how actions are
justified can we understand how to build an agent whose actions are justifiable, or rational.
Aristotle provides an elegant answer in the Nicomachean Ethics (Book III.3, 1112b):
We deliberate not about ends, but about means. For a doctor does not deliberate whether he
shall heal, nor an orator whether he shall persuade, nor a statesman whether he shall produce
law and order, nor does any one else deliberate about his end. They assume the end and
consider how and by what means it is attained, and if it seems easily and best produced
thereby; while if it is achieved by one means only they consider how it will be achieved by
this and by what means this will be achieved, till they come to the first cause, which in the
order of discovery is last ... and what is last in the order of analysis seems to be first in the
order of becoming. And if we come on an impossibility, we give up the search, e.g. if we
need money and this cannot be got: but if a thing appears possible we try to do it.
Aristotle's approach (with a few minor refinements) was implemented 2300 years later by Newell
and Simon in their GPS program, about which they write (Newell and Simon, 1972):
The main methods of GPS jointly embody the heuristic of means-ends analysis. Means-ends
analysis is typified by the following kind of common-sense argument:
I want to take my son to nursery school. What's the difference between what I
have and what I want? One of distance. What changes distance? My automobile.
My automobile won't work. What is needed to make it work? A new battery.
What has new batteries? An auto repair shop. I want the repair shop to put in a
new battery; but the shop doesn't know I need one. What is the difficulty? One
of communication. What allows communication? A telephone ... and so on.
This kind of analysis—classifying things in terms of the functions they serve and oscillating
among ends, functions required, and means that perform them—forms the basic system of
heuristic of GPS.
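The nursery-school argument can be turned into a toy program. The sketch below is our own greatly simplified illustration of means-ends analysis, not GPS itself: it ignores state updates and operator deletions, and the three operators are invented stand-ins for the story above.

```python
def means_ends(state, goal, operators):
    """Pick an operator whose effects reduce the difference between the
    current state and the goal, recursively achieving its preconditions
    first. States and goals are sets of facts; an operator is a tuple
    (name, preconditions, added facts). Returns a plan or None."""
    if goal <= state:
        return []                              # no difference left
    for name, preconds, adds in operators:
        if adds & (goal - state):              # operator reduces a difference
            subplan = means_ends(state, preconds, operators)
            if subplan is not None:
                return subplan + [name]
    return None

# Toy operators for the nursery-school story:
ops = [
    ("drive son to school", {"car works"}, {"son at school"}),
    ("install new battery", {"have battery"}, {"car works"}),
    ("get battery from repair shop", set(), {"have battery"}),
]
print(means_ends(set(), {"son at school"}, ops))
# ['get battery from repair shop', 'install new battery', 'drive son to school']
```

The oscillation the quote describes is visible in the recursion: each call classifies the remaining difference, picks a means that addresses it, and descends to the means' own preconditions until a first cause is reached.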
Means-ends analysis is useful, but does not say what to do when several actions will achieve the
goal, or when no action will completely achieve it. Arnauld, a follower of Descartes, correctly
described a quantitative formula for deciding what action to take in cases like this (see Chapter 16).
John Stuart Mill's (1806-1873) book Utilitarianism (Mill, 1863) amplifies on this idea. The more
formal theory of decisions is discussed in the following section.
7 In this picture, all meaningful statements can be verified or falsified either by analyzing the meaning of the words or
by carrying out experiments. Because this rules out most of metaphysics, as was the intention, logical positivism was
unpopular in some circles.
Mathematics (c. 800-present)
Philosophers staked out most of the important ideas of AI, but to make the leap to a formal
science required a level of mathematical formalization in three main areas: computation, logic,
and probability.

ALGORITHM

The notion of expressing a computation as a formal algorithm goes back to
al-Khowarazmi, an Arab mathematician of the ninth century, whose writings also introduced
Europe to Arabic numerals and algebra.
Logic goes back at least to Aristotle, but it was a philosophical rather than mathematical
subject until George Boole (1815-1864) introduced his formal language for making logical
inference in 1847. Boole's approach was incomplete, but good enough that others filled in the
gaps. In 1879, Gottlob Frege (1848-1925) produced a logic that, except for some notational
changes, forms the first-order logic that is used today as the most basic knowledge representation
system.8 Alfred Tarski (1902-1983) introduced a theory of reference that shows how to relate
the objects in a logic to objects in the real world. The next step was to determine the limits of
what could be done with logic and computation.
David Hilbert (1862-1943), a great mathematician in his own right, is most remembered
for the problems he did not solve. In 1900, he presented a list of 23 problems that he correctly
predicted would occupy mathematicians for the bulk of the century. The final problem asks
if there is an algorithm for deciding the truth of any logical proposition involving the natural
numbers—the famous Entscheidungsproblem, or decision problem. Essentially, Hilbert was
asking if there were fundamental limits to the power of effective proof procedures. In 1930, Kurt
Gödel (1906-1978) showed that there exists an effective procedure to prove any true statement in