# Exam part 2 - Artificial Intelligence


BKI-212: Artificial Intelligence: from search to planning

Exam part 2, April 19th, 2007, 8.45-10.30

Write your student number and “BKI 212 exam part 2 06/07” at the top of the answer sheet.

There are 4 questions. Answer the questions in a full, clear, but also relevant way. Omissions, unclarities as well as irrelevant elaborations will lower your grade.

The maximum attainable score is 100 points.

Question 1. MDPs and Reinforcement Learning (max. 5+5+5+5+5 = 25 points)

Consider the following world:

| a | b | +1 |
| c | d | -1 |

In this world there are 4 possible (deterministic) actions: left (←), up (↑), right (→), and down (↓), resulting in moving to the given neighbour state when possible, or staying in the same state when the action is not possible.

Choose the discount factor γ equal to 0.1. The reward of the states a, b, c, and d is 0; the reward of the state marked by +1 is +1; the reward of the state marked by -1 is -1. The states with reward +1 and -1 are terminal states.
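For self-checking parts a-c below, the utilities can be computed with a few lines of value iteration. This is only an illustrative sketch, not part of the exam; it assumes the 2x3 grid layout shown above and the convention U(s) = R(s) + γ · max over actions of U(s′), with terminal utilities fixed at their rewards. All names are hypothetical.

```python
# Minimal value-iteration sketch for the 2x3 grid above.
# Assumptions: layout row 0 = a, b, +1 and row 1 = c, d, -1;
# Bellman convention U(s) = R(s) + gamma * max_a U(s'),
# terminal utilities fixed at their rewards; gamma = 0.1.

GAMMA = 0.1
GRID = [["a", "b", "+1"],
        ["c", "d", "-1"]]
REWARD = {"a": 0, "b": 0, "c": 0, "d": 0, "+1": 1, "-1": -1}
TERMINAL = {"+1", "-1"}

def successors(r, c):
    """States reached from (r, c) by left, up, right, down; off-grid moves stay put."""
    states = []
    for dr, dc in [(0, -1), (-1, 0), (0, 1), (1, 0)]:
        nr, nc = r + dr, c + dc
        if 0 <= nr < len(GRID) and 0 <= nc < len(GRID[0]):
            states.append(GRID[nr][nc])
        else:
            states.append(GRID[r][c])  # action not possible: stay in place
    return states

def value_iteration(sweeps=50):
    U = {s: REWARD[s] for row in GRID for s in row}
    for _ in range(sweeps):
        new_U = dict(U)
        for r in range(len(GRID)):
            for c in range(len(GRID[0])):
                s = GRID[r][c]
                if s not in TERMINAL:
                    new_U[s] = REWARD[s] + GAMma * max(U[n] for n in successors(r, c)) if False else REWARD[s] + GAMMA * max(U[n] for n in successors(r, c))
        U = new_U
    return U

U = value_iteration()
```

Under these assumptions the utilities converge quickly (for example, U(b) = γ · 1 = 0.1), which gives a way to verify hand-computed answers.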

a. Compute the utilities for the states: U(a), U(b), U(c), and U(d).

b. Compute the Q-function for the state-action pairs (up, b), (left, b), (right, b), and (down, b) in this world.

c. What is/are the optimal polic(y)(ies) for an agent in this world?

Answer the following questions in a general world with states s, s′, actions a, a′, utility estimates U(s), and Q-function estimates Q(a, s).

d. Explain how TD learning is applied when learning a Q-function.

e. Why does an optimistic utility estimate cause exploration?
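As background for part d, the temporal-difference update used when learning a Q-function (Q-learning) can be sketched as follows. The dict-based representation, the action set, and the learning rate α = 0.5 are illustrative assumptions, not prescribed by the exam:

```python
# Hypothetical sketch of the TD backup used in Q-learning:
#   Q(a, s) <- Q(a, s) + alpha * (r + gamma * max_a' Q(a', s') - Q(a, s))
from collections import defaultdict

ACTIONS = ("left", "up", "right", "down")

def td_update(Q, s, a, reward, s_next, alpha=0.5, gamma=0.1):
    """One temporal-difference backup of Q(a, s) from an observed (s, a, reward, s_next)."""
    best_next = max(Q[(a2, s_next)] for a2 in ACTIONS)
    Q[(a, s)] += alpha * (reward + gamma * best_next - Q[(a, s)])

Q = defaultdict(float)                 # Q-values default to 0
td_update(Q, "b", "right", 1.0, "+1")  # observed transition b --right--> +1, reward 1
```

Part e relates to how such a table is initialised: starting the estimates optimistically high makes unvisited state-action pairs look attractive, so a greedy agent keeps trying them until their estimates are driven down.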

Question 2. Machine Learning (max. 8+8+9 = 25 points)

Given are the following training points (x, y):

- the points (0,0) and (2,0) as negative instances, and
- the points (0,1) and (0,-1) as positive instances.

Construct a support vector machine which classifies these examples correctly. Take the values -1 and +1 (instead of 0 and 1) for the input and output values.

a. Draw the input vectors in the u-v plane defined by u = x and v = y^2 - 1.

b. Draw the “maximal margin separator” in the u-v plane.

c. Also draw the corresponding decision line/curve in the original Euclidean plane defined by x and y.
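The effect of the feature map in part a can be checked with a few lines of code. This is a sketch; the function name phi is illustrative:

```python
# Hypothetical sketch: map each training point through u = x, v = y**2 - 1
# and observe that the two classes become linearly separable in the u-v plane.
def phi(x, y):
    return (x, y ** 2 - 1)

negatives = [(0, 0), (2, 0)]   # class -1
positives = [(0, 1), (0, -1)]  # class +1

mapped_neg = [phi(x, y) for x, y in negatives]  # both land on the line v = -1
mapped_pos = [phi(x, y) for x, y in positives]  # both land on the line v = 0
```

Every negative point maps to v = -1 and every positive point to v = 0, so a horizontal line such as v = -1/2 separates the classes with maximal margin in the u-v plane.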


Question 3. Ensemble Learning (max. 7+18 = 25 points)

a. Suppose you have a classifier that is an ensemble of decision trees generated using bagging. Denote the type of this classifier with T.

Do you expect to gain in accuracy if you produce an ensemble of classifiers of type T (an ensemble of ensembles)?
b. Consider a two-class classification task. Assume we have three classifiers A, B, and C, with errors e (0 < e < 1) each.

Consider the ensemble of classifiers consisting of the 3 classifiers A, B, and C, and using majority voting for classification.

Estimate the error of this ensemble when e = 1/3 and when e = 2/3 for all three classifiers A, B, and C:

i. In the worst case;

ii. In the best case;

iii. In the case that the errors of the classifiers are completely independent.
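For case iii, when the three errors are fully independent, the majority vote is wrong exactly when at least two of the three classifiers are wrong. A small sketch (the function name is illustrative):

```python
# Hypothetical sketch for case iii: with independent errors of equal rate e,
# the 3-classifier majority vote errs iff at least 2 of the 3 err:
#   P(wrong) = 3 * e^2 * (1 - e) + e^3
def majority_error(e):
    return 3 * e ** 2 * (1 - e) + e ** 3
```

This gives 7/27 for e = 1/3 (an improvement over 1/3) and 20/27 for e = 2/3 (worse than 2/3), illustrating that majority voting only helps when the individual classifiers are better than chance.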

Question 4. PAC Learning (max. 13+12 = 25 points)

a. Consider the space of instances X corresponding to all points in the x-y plane. Give the VC-dimension of the following hypothesis spaces:

i. H_r = the set of all rectangles in the x-y plane. Points inside the rectangle are classified as positive examples.

ii. H_c = the set of all triangles in the x-y plane. Points inside the triangle are classified as positive examples.

b. For PAC Learning with a finite hypothesis space H, which contains the concept to be learned, Haussler derived the following formula for the number of training examples needed:

m ≥ (1/ε) (ln |H| + ln(1/δ))

i. Explain how to apply this formula (give the meaning of m, ε, |H|, δ, and the conditions for both training instances and test instances).

ii. This formula follows from the inequality:

|H| (1 - ε)^m ≤ |H| e^(-εm) ≤ δ

Explain this inequality and also explain why this inequality results in an overestimation of the number of training examples needed.
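As a numeric illustration of how the bound in part b is applied, the smallest sufficient sample size can be computed directly. This sketch assumes the standard form m ≥ (1/ε)(ln |H| + ln(1/δ)); the function name and the example values for ε, δ, and |H| are illustrative:

```python
import math

# Hypothetical sketch: smallest integer m satisfying the (assumed) bound
#   m >= (1/eps) * (ln|H| + ln(1/delta))
def pac_sample_size(eps, delta, h_size):
    return math.ceil((math.log(h_size) + math.log(1 / delta)) / eps)

# e.g. error at most 0.1 with probability at least 0.95 over |H| = 1000 hypotheses
m = pac_sample_size(eps=0.1, delta=0.05, h_size=1000)
```

Because the bound only uses |H| and the loose exponential inequality above, the resulting m is a worst-case overestimate; far fewer examples often suffice in practice.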