SSDI

1

Heuristic Search


Algorithms that maintain some form of consistency remove (many?) redundant values but, not being complete, do not eliminate the need for search.

Even when a constraint network is consistent, enumeration is subject to failure. In fact, a consistent constraint network may not even be satisfiable (nor is a satisfiable constraint network necessarily consistent).

All that is guaranteed by maintaining some type of consistency is that the networks are equivalent: solutions are not "lost", since the reduced network, despite having fewer redundant values, has all the solutions of the former.

2

Heuristic Search


Hence, domain pruning does not, in general, eliminate the need for search. The search space is usually organised as a tree, and the search becomes some form of tree search.

As usual, the various branches down from one node of the search tree correspond to the assignment of the different values in the domain of a variable. As such, a tree leaf corresponds to a complete compound label (including all the problem variables).

A depth-first search in the tree, resorting to backtracking when a node corresponds to a dead end (unsatisfiability), corresponds to an incremental completion of partial solutions until a complete one is found.

3

Heuristic Search


Given the execution model of constraint logic programming (or any algorithm that interleaves search with constraint propagation)

    Problem(Vars):-
        Declaration of Variables and Domains,
        Specification of Constraints,
        Labelling of the Variables.

the enumeration of the variables determines the shape of the search tree, since the nodes that are reached depend on the order in which variables are enumerated.

Take for example two distinct enumerations of variables whose domains have different cardinality, e.g. X in 1..2, Y in 1..3 and Z in 1..4.

4

Heuristic Search

    enum([X,Y,Z]) :-
        indomain(X),
        propagation,    % 'propagation' stands for the solver's propagation step
        indomain(Y),
        propagation,
        indomain(Z).

# of nodes = 32 (2 + 6 + 24)


[Figure: the search tree for the ordering X, Y, Z: 2 branches for X, then 3 for Y, then 4 for Z.]

5

Heuristic Search

    enum([X,Y,Z]) :-
        indomain(Z),
        propagation,
        indomain(Y),
        propagation,
        indomain(X).

# of nodes = 40 (4 + 12 + 24)

[Figure: the search tree for the ordering Z, Y, X: 4 branches for Z, then 3 for Y, then 2 for X.]
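The two counts are instances of one formula (a standard counting argument, made explicit here for completeness): enumerating in the order o(1), ..., o(n), every non-root node of the tree corresponds to a prefix of assignments, so

    N(O) = \sum_{j=1}^{n} \prod_{i=1}^{j} \#D_{o(i)}

For the order X, Y, Z (domain sizes 2, 3, 4) this gives 2 + 2·3 + 2·3·4 = 32; for Z, Y, X it gives 4 + 4·3 + 4·3·2 = 40.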

6

Heuristic Search


The order in which variables are enumerated may have an important impact on the efficiency of the tree search, since:

- The number of internal nodes is different, despite the same number of leaves, or potential solutions, \prod_i \#D_i.
- Failures can be detected differently, favouring some orderings of the enumeration.
- Depending on the propagation used, different orderings may lead to different prunings of the tree.

The ordering of the domains has no direct influence on the search space, although it may have great importance in finding the first solution.

7

Heuristic Search



To control the efficiency of tree search one should in principle adopt appropriate heuristics to select:

- The next variable to label;
- The value to assign to the selected variable.

Since heuristics for value choice will not affect the size of the search tree to be explored, particular attention will be paid to the heuristics for variable selection.

8

Variable Selection Heuristics


There are two types of heuristics that can be considered for variable selection:

- Static: the ordering of the variables is set up before starting the enumeration, not taking into account the possible effects of propagation.
- Dynamic: the selection of the variable is determined after analysis of the problem that resulted from previous enumerations (and propagation).

Static heuristics are based on some properties of the underlying constraint graphs, namely their width and bandwidth, so we will first present and discuss these properties.

9

Node Width


To define the width of a graph we will first define the notion of the width of a node.

Definition (Node width, given ordering O):
Given some total ordering, O, of the nodes of a graph, the width of a node N, induced by ordering O, is the number of lower order nodes that are adjacent to N.

As an example, given any ordering that is increasing from the root to the leaves of a tree, all nodes of the tree (except the root) have width 1.

[Figure: a tree with nodes ordered from the root down; each non-root node has exactly one lower-order neighbour, its parent.]

10

Graph Width

Definition (Width of a graph G, induced by O):
Given some ordering, O, of the nodes of a graph, G, the width of G induced by ordering O is the maximum width of its nodes, given that ordering.

Definition (Width of a graph G):
The width of a graph G is the lowest width of the graph induced by any of its orderings O.

It is apparent from these definitions that a tree is a special graph whose width is 1.
11

Graph Width

Example:
In the graph below, we may consider various orderings of its nodes, inducing different widths. The width of the graph is 3 (this is, for example, the width induced by ordering O1dir).

[Figure: a 7-node graph shown under two orderings, O1 = [1,2,3,4,5,6,7] and O2 = [4,3,2,5,6,7,1], annotated with the induced widths:]

w(O1dir) = 3 (nodes 4, 5, 6 and 7)
w(O1inv) = 5 (node 1)
w(O2dir) = 5 (node 1)
w(O2inv) = 6 (node 4)

12

Graph Width


As shown before, to get a backtrack-free search in a tree, it would be enough to guarantee that, for each node N, all of its children (adjacent nodes with higher order) have values that support the values in the domain of node N.

This result can be generalised to arbitrary graphs, given some ordering of their nodes. If, according to some ordering, the graph has width w, and if enumeration follows that ordering, backtrack-free search is guaranteed if the network is strongly k-consistent (with k > w).

In fact, as in the case of trees, it would be enough to maintain some kind of directed strong consistency, if the labelling of the nodes is done in "increasing" order.

13

Graph Width

Example:
With ordering O1dir, strong 4-consistency guarantees backtrack-free search. Such consistency guarantees that any 3-compound label that satisfies the relevant constraints may be extended to a 4th variable, also satisfying the relevant constraints.

In fact, any value v1 from the domain of variable 1 may be selected. If strong 4-consistency is maintained, values from the domains of the other variables will possibly be removed, but no variable will have its domain emptied.

[Figure: the 7-node graph with ordering O1 = [1,2,3,4,5,6,7].]

14

Graph Width

Example (cont.):
Since the ordering induces width 3, every variable Xk connected to variable X1 is connected to at most 2 other variables "lower" than Xk. For example, variable X6, connected to variable X1, is also connected to the lower variables X3 and X4 (the same applies to X5-X4-X1 and X4-X3-X2-X1).

Hence, if the network is strongly 4-consistent, and if the label X1-v1 was there, this means that any 3-compound label {X1-v1, X3-v3, X4-v4} could be extended to a 4-compound label {X1-v1, X3-v3, X4-v4, X6-v6} satisfying the relevant constraints. When X1 is enumerated to v1, values v3, v4 and v6 (at least) will be kept in their variable domains.

15

Graph Width


This example shows that, after the enumeration of the "lowest" variable (in an ordering that induces a width w on the graph) of a strongly k-consistent network (with k > w), the remaining variables still have values in their domains, so backtracking is avoided.

The network being strongly k-consistent, all variables connected to node 1 (variable X1) are connected to at most w-1 (< k) other variables with lower order. Hence, if some value v1 was in the domain of variable X1, then for all these sets of w variables, some w-compound label {X1-v1, ..., X(w-1)-v(w-1)} could be extended with some label Xw-vw. Hence, when enumerating X1 = v1, the removal of the other values of X1 still leaves the w-compound label {X1-v1, ..., Xw-vw} satisfying the relevant constraints.

16

Graph Width


The process can proceed recursively, by successive enumeration of the variables in increasing order. After each enumeration the network has one variable less, and values may possibly be removed from the domains of the remaining variables, ensuring that the simplified network remains strongly 4-consistent.

Eventually, a network with only 4 variables is reached, and an enumeration is obviously possible without ever needing to backtrack.

17

Graph Width


The ideas shown in the previous example could lead to a formal proof (by induction) of the following theorem:

Any strongly k-consistent constraint network may be enumerated backtrack-free if there is an ordering O according to which the constraint graph has a width less than k.

In practice, if such an ordering O exists, the enumeration is done in increasing order of the variables, maintaining the system k-consistent after the enumeration of each variable.

This result is particularly interesting for large and sparse networks, where the added cost of maintaining, say, path consistency (polynomial) may be compensated by not incurring backtracking (exponential).

18

MWO Heuristics


Strong k-consistency is of course very costly to maintain, in computational terms, and this is usually not done.

Nevertheless, and especially in constraint networks with low density, where the widths of the nodes vary significantly with the orderings used, the orderings leading to lower graph widths may be used to heuristically select the variable to label. Specifically, one may define the MWO (Minimum Width Ordering) heuristics:

The Minimum Width Ordering heuristics suggests that the variables of a constraint problem are enumerated, increasingly, in some ordering that leads to a minimal width of the primal constraint graph.

19

MWO Heuristics


The definition refers to the primal graph of a constraint problem, which, for binary constraints (arcs), coincides with the constraint graph. For n-ary constraints, the primal graph includes an arc between any variables connected by a hyper-arc in the problem hyper-graph.

For example, the graph being used could be the primal graph of a problem with 2 quaternary constraints (C1245 and C1346) and 3 ternary constraints (C123, C457 and C467):

    C123  --> arcs a12, a13 and a23
    C1245 --> arcs a12, a14, a15, a24, a25 and a45
    C1346 --> arcs a13, a14, a16, a34, a36 and a46
    C457  --> arcs a45, a47 and a57
    C467  --> arcs a46, a47 and a67

20

MWO Heuristics



The application of the MWO heuristics requires the determination of orderings O leading to the lowest primal constraint graph width. The following greedy algorithm can be used to determine such orderings (a runnable Prolog version is sketched below):

    function sorted_vars(V, C): Node List;
        if V = {N} then                 % only one variable
            sorted_vars <- N
        else
            N <- arg min_Vi {degree(Vi, C) | Vi in V}
                                        % N is one of the nodes with fewest neighbours
            C' <- C \ arcs(N, C)
            V' <- V \ {N}
            sorted_vars <- sorted_vars(V', C') & N
        end if
    end function
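A runnable Prolog transcription of this greedy procedure (a sketch of ours, under assumed representations: nodes as a list, arcs as a list of A-B pairs):

    :- use_module(library(lists)).   % member/2, select/3, append/3

    % sorted_vars(+Nodes, +Arcs, -Order): greedy minimum-width ordering.
    sorted_vars([N], _, [N]).
    sorted_vars(Ns, As, Order) :-
        Ns = [_,_|_],
        min_degree_node(Ns, As, N),
        findall(A, (member(A, As), \+ touches(A, N)), As1),  % drop N's arcs
        select(N, Ns, Ns1),
        sorted_vars(Ns1, As1, Order1),
        append(Order1, [N], Order).     % least-degree node goes last

    min_degree_node(Ns, As, N) :-
        findall(D-V, (member(V, Ns), node_degree(V, As, D)), Ps),
        keysort(Ps, [_-N|_]).           % pick one node of least degree

    node_degree(V, As, D) :-
        findall(A, (member(A, As), touches(A, V)), L),
        length(L, D).

    touches(A-B, N) :- ( N == A ; N == B ).

For the primal graph of the previous slide the arcs would be [1-2,1-3,1-4,1-5,1-6,2-3,2-4,2-5,3-4,3-6,4-5,4-6,4-7,5-7,6-7]; ties among equal-degree nodes may be broken differently from the slides, yielding a different but equally valid greedy ordering.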

21

MWO Heuristics


Example:
The ordering [1,2,3,4,5,6,7] obtained leads to a width 3 for the graph.

1. In the graph, node 7 has least degree (= 3). Once node 7 is removed, nodes 5 and 6 have least degree (= 3).
2. Node 6 is now removed.
3. Nodes 5, 4 (degree 3), 3 (degree 2) and 2 (degree 1) are subsequently removed, as shown below.

[Figure: the graph and the successively smaller graphs obtained as nodes 7, 6, 5, 4, 3 and 2 are removed.]

22

MDO Heuristics


An approximation of the MWO heuristics is the MDO heuristics, which avoids the computation of an ordering O leading to the lowest constraint graph width.

MDO Heuristics (Maximum Degree Ordering):
The Maximum Degree Ordering heuristics suggests that the variables of a constraint problem are enumerated by decreasing order of their degree in the constraint graph.

Example:
The MDO heuristics would use an ordering starting with nodes 4 (d = 6) and 1 (d = 5) and ending in node 7 (d = 3). Nodes 2, 3, 5 and 6 would be sorted arbitrarily.

23

MDO and MWO Heuristics


Both the MWO and the MDO heuristics start the enumeration with the variables that have more adjacent variables in the graph, aiming at the earliest detection of dead ends. (Notice that in the algorithm to detect a minimal width ordering, the last variables are those with least degree.)

Example:
MWO and MDO orderings are not necessarily coincident, and may be used to break ties. For example, the two MDO orderings

    O1 = [4,1,5,6,2,3,7]
    O2 = [4,1,2,3,5,6,7]

induce different widths (4 and 3).

24

Cycle-cut sets

The MWO heuristic is particularly useful when some high level of consistency is maintained, as shown in the theorem relating it with strong k-consistency, which generalises to arbitrary constraint networks the result obtained with constraint trees.

However, since such consistency is hard to maintain, an alternative is to enumerate the problem variables so as to simplify, as soon as possible, the constraint network into a constraint tree, from which a backtrack-free search may proceed (provided directed arc consistency is maintained).

This is the basic idea of cycle-cut sets.

25

Cycle-cut Sets

Example:
Enumerating first variables 1 and 4, the graph becomes:

[Figure: the graph restricted to nodes 2, 3, 5, 6 and 7, containing the cycle 2-3-6-7-5-2.]

Therefore, adding any other node to the set {1,4} eliminates cycle 2-3-6-7-5-2 and turns the remaining nodes into a tree. Sets {1,4,2}, {1,4,3}, {1,4,5}, {1,4,6} and {1,4,7} are thus cycle-cut sets.

For example, after enumeration of the variables from the cycle-cut set {1,4,7}, the remaining constraint graph is a tree.

26

Cycle-cut sets

Obviously, one is interested in cycle-cut sets with the lowest cardinality.

Consider a constraint network with n variables with domains of size d. If a cycle-cut set of cardinality k is found, the search complexity is reduced to:

1. enumeration of a tuple in these k nodes, with time complexity O(d^k); and
2. maintenance of (directed) arc consistency in the remaining tree with n-k nodes, with time complexity O(a d^2).

Since a = n-k-1 for a tree with n-k nodes, and assuming that k is "small", the total time complexity is thus O(n d^(k+2)).

27

Cycle-cut sets

One may thus define the CCS (Cycle-Cut Sets) heuristics:

The Cycle-Cut Sets heuristics suggests that the first variables of a constraint problem to be enumerated are those that form a cycle-cut set with least cardinality.

Unfortunately, there seems to be no good algorithm to determine optimal cycle-cut sets (i.e. with least cardinality).

Two possible approximations correspond to using the sorted_vars algorithm with the inverse ordering, or simply starting with the nodes with maximum degree (MDO).

28

Cycle-cut sets

Using algorithm sorted_vars, but selecting first the nodes with highest degree, we would get:

1. Remove node 4 (degree 6);
2. Remove node 1 (degree 4);
3. Now, inclusion of any of the other nodes (all with degree 2) turns the constraint graph into a tree. Selecting node 2, we get the cycle-cut set {1,2,4}, which is optimal.

[Figure: the graph after removing node 4, and after removing nodes 4 and 1.]

29

MBO Heuristics



In contrast with the MWO and MDO heuristics, which aim to detect failures in the enumeration as soon as possible, the MBO heuristic has the goal of eliminating irrelevant backtracking.

For this purpose, the aim is that, as much as possible, if a variable X cannot be enumerated, the variables that may cause a conflict with X should immediately precede X in the enumeration order, to guarantee efficient backtracking. In fact, if backtracking is not done to a variable Y connected to X by some constraint, changing Y will not remove the previous conflict, and the failure will repeat.

Such ideas are captured in the notion of Graph Bandwidth.

30

Graph Bandwidth


Definition (Bandwidth of a graph G, induced by O):
Given a total ordering O of the nodes of a graph, the bandwidth of a graph G, induced by O, is the maximum distance (in the ordering) between adjacent nodes.

Example:
With ordering O1 = [1,2,3,4,5,6,7], the bandwidth of the graph is 5, the distance between adjacent nodes 1/6 and 2/7. If the choice of variable X6 determines the choice of X1, changes of X1 will only occur after irrelevantly backtracking variables X5, X4, X3 and X2!

[Figure: a 7-node graph laid out under ordering O1.]

31

Graph Bandwidth


Definition (Bandwidth of a graph G):
The bandwidth of a graph G is the lowest bandwidth of the graph induced by any of its orderings O.

Example:
With ordering O2, the bandwidth is 3, the distance between adjacent nodes 2/5, 3/6 and 4/7. No ordering induces a lower bandwidth on the graph; therefore, the graph bandwidth is 3!

[Figure: the same graph laid out under ordering O2.]
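This definition is also direct to compute for a given ordering. A minimal Prolog sketch of ours, with the same hypothetical edge/2 representation as in the width sketch:

    :- use_module(library(lists)).   % max_member/2

    % bandwidth(+Order, -B): maximum ordering distance between adjacent nodes.
    bandwidth(Order, B) :-
        findall(D, ( edge(U, V),
                     position(U, Order, Pu),
                     position(V, Order, Pv),
                     D is abs(Pu - Pv) ),
                Ds),
        max_member(B, Ds).

    position(X, [X|_], 1).
    position(X, [_|T], P) :- position(X, T, P0), P is P0 + 1.

As with width, finding the ordering that minimises this quantity is the hard part (see case 3 ahead).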

32

MBO Heuristics


The concept of bandwidth is the basis of the following MBO (Minimum Bandwidth Ordering) heuristics:

The Minimum Bandwidth Ordering heuristics suggests that the variables of a constraint problem are enumerated, increasingly, in some ordering that leads to the minimal bandwidth of the primal constraint graph.

Example:
The MBO heuristic suggests the use of an ordering such as O2, which induces a bandwidth of 3.

33

MBO Heuristics



The

use

of

the

MBO

heuristics

with

constraint

propagation

is

somewhat

problematic

since
:


The

constraint

graphs

in

which

it

is

more

useful

should

be

sparse

and

possessing

no

node

with

high

degree
.

In

the

latter

case,

the

distance

to

the

farthest

apart

adjacent

node

dominates
.



The

principle

exploited

by

the

heuristics,

avoid

irrelevant

backtracking,

is

obtained,

hopefully

more

efficiently,

by

constraint

propagation
.


No

efficient

algorithms

exist

to

compute

the

bandwidth

for

general

graphs

(see

[Tsan
93
],

ch
.

6
)
.

34

MBO Heuristics

These cases are illustrated by some examples.

Case 1:
The constraint graphs for which MBO is most useful should be sparse, with no node of high degree; otherwise, the distance to the farthest adjacent node dominates.

Example: Node 4, with degree 6, determines that the bandwidth may be no less than 3 (for orderings with 3 nodes before and 3 nodes after node 4). Many orderings exist with this bandwidth. However, if node 3 were connected to node 7, the bandwidth would be 4.

[Figure: the 7-node graph, with node 4 adjacent to all other nodes.]

35

MBO Heuristics

Case 2:
Irrelevant backtracking is handled by constraint propagation.

Example: Assume the choice of X6 is determinant for the choice of X1. Constraint propagation will possibly empty the domain of X6 if a bad choice is made in labelling X1, before (some of) the variables X2, X3, X4 and X5 are labelled, thus avoiding their irrelevant backtracking.

The MBO heuristics is thus more appropriate for backtracking algorithms without constraint propagation.

36

MBO Heuristics

Case 3:
No efficient algorithms exist to compute the bandwidth of general graphs.

In general, lower widths correspond to lower bandwidths, but the best orderings in the two cases are usually different.

[Figure: a graph on nodes A-H under two orderings: one of minimal width (width = 2, bandwidth = 5) and one of minimal bandwidth (width = 3, bandwidth = 4).]

37

Dynamic Heuristics



In contrast to the static heuristics discussed (MWO, MDO, CCS and MBO), variable selection may be determined dynamically. Instead of being fixed before enumeration starts, the variable is selected taking into account the propagation of previous variable selections (and labellings).

In addition to problem-specific heuristics, there is a general principle that has shown great potential: the first-fail principle.

The principle is simple: when a problem includes many interdependent "tasks", start by solving those that are most difficult. It is not worth wasting time on the easiest ones, since they may turn out to be incompatible with the results of the difficult ones.

38

First-Fail Heuristics

There are many ways of interpreting and implementing this generic first-fail principle.

Firstly, the tasks to be performed to solve a constraint satisfaction problem may be considered to be the assignments of values to the problem variables. How do we measure their difficulty?

Enumerating is by itself easy (a simple assignment). What makes the tasks difficult is to assess whether the choice is viable, after constraint propagation. This assessment is hard to make in general, so we may consider features that are easy to measure, such as:

- The domain size of the variables;
- The number of constraints (degree) they participate in.

(A selector based on the first measure is sketched below.)
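A hand-rolled version of the first measure, ours and only illustrative (in practice one would simply use the ff option of labeling/2, shown later); it relies on the SICStus reflexive predicate fd_size/2, presented near the end of these slides:

    :- use_module(library(clpfd)).

    % Label the variables, always picking one with the smallest current domain.
    ff_label([]).
    ff_label([V|Vs]) :-
        smallest_domain(Vs, V, X, Rest),
        indomain(X),
        ff_label(Rest).

    % smallest_domain(+Vars, +Best0, -Best, -Rest): linear scan on fd_size/2.
    smallest_domain([], Best, Best, []).
    smallest_domain([V|Vs], Best0, Best, Rest) :-
        fd_size(V, Sv), fd_size(Best0, S0),
        (   Sv < S0 ->
            smallest_domain(Vs, V, Best, Rest0), Rest = [Best0|Rest0]
        ;   smallest_domain(Vs, Best0, Best, Rest0), Rest = [V|Rest0]
        ).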

39

First-Fail Heuristics

The domain of the variables:

Intuitively, if variables X1/X2 have m1/m2 values in their domains, and m2 > m1, it is preferable to assign values to X1, because there is less choice available! In the limit, if variable X1 has only one value in its domain (m1 = 1), there is no possible choice and the best thing to do is to immediately assign the value to the variable.

Another way of seeing the issue is the following:

- On the one hand, the "chance" of assigning a good value to X1 is higher than that for X2.
- On the other hand, if that value proves to be a bad one, a larger proportion of the search space is eliminated.

40

First-Fail Heuristics: Example

Example:
In the 8-queens problem, where queens Q1, Q2 and Q3 were already enumerated, we have the following domains for the other queens:

    Q4 in {2,7,8}, Q5 in {2,4,8}, Q6 in {4}, Q7 in {2,4,8}, Q8 in {2,4,6,7}.

Hence, the best variable to enumerate next should be Q6, not Q4, which would follow in the "natural" order.

In this extreme case of singleton domains, node consistency achieves pruning similar to arc consistency, at a lower computational cost!

[Figure: chessboard with queens Q1-Q3 placed; numbers mark which queen attacks each free square.]

41

First-Fail Heuristics

The number of constraints (degree) of the variables:

This heuristics is basically the Maximum Degree Ordering (MDO) heuristics, but now the degree of the variables is assessed dynamically, after each variable enumeration.

Clearly, the more constraints a variable is involved in, the more difficult it is to assign a good value to it, since the value has to satisfy a larger number of constraints.

Of course, as in the case of the domain size, this decision is purely heuristic. The effect of the constraints depends greatly on their propagation, which depends in turn on the problem at hand, which is hard to anticipate.

42

Problem Dependent Heuristics

In certain types of problems, there might be heuristics specially adapted to the problems being solved.

For example, in scheduling problems, where tasks should not overlap but have to take place within a certain period of time, it is usually a good heuristic to "scatter" them as much as possible within the allowed period. This suggests that one should start by enumerating the variables corresponding to tasks that may be performed at the beginning and at the end of the allowed period, thus allowing "space" for the others to execute.

In such a case, the dynamic choice of the variable would take into account the values in its domain, namely the minimum and maximum values.

43

Mixed Heuristics

Taking into account the features of the heuristics discussed so far, one may consider the use of mixed strategies that incorporate some of these heuristics.

For example, the static cycle-cut sets (CCS) heuristics suggests a set of variables to enumerate first, so as to turn the constraint graph into a tree. Within this cycle-cut set one may use a first-fail dynamic heuristic (e.g. smallest domain size).

On the other hand, even a static heuristic like Minimum Width Ordering (MWO) may be made "dynamic", by reassessing the width of the graphs after a certain number of enumerations, to take into account the results of propagation.

44

Value Choice Heuristics

Once a variable is selected for labelling, a value within its domain has to be chosen.

There are few generic methods to handle value choice. The only widely used one is the principle of choosing the value with the higher "likelihood" of success!

The reason for this is obvious. In contrast with variable selection, value choice will not determine the size of the search space, so one should be interested in finding, as quickly as possible, a path to a solution.

Of course, the application of this principle is highly dependent on the problem (or even the instance of the problem) being solved.

45

Value Choice Heuristics

Some forms of assigning likelihoods are the following:

Ad hoc choice:
Again in scheduling problems, once the variable with the lowest/highest values in its domain is selected, the natural choice for the value will be the lowest/highest, which somehow "optimises" the likelihood of success.

Lookahead:
One may try to anticipate, for each of the possible values, the likelihood of success, by evaluating after its propagation the effect on an aggregate indicator of the size of the domains not yet assigned (as done with the kappa indicator), choosing the value that maximises that indicator.

46

Value Choice Heuristics

Optimisation:
In optimisation problems, where there is some function to maximise/minimise, one may get bounds for that function when the alternative values are chosen for the variable, or check how they change with the selected value.

Of course, the heuristic will choose the value that either optimises the bounds in consideration, or improves them the most.

Notice that in this case the computation of the bounds may be performed either before propagation takes place (less computation, but also less information) or after such propagation.

47

Heuristics in SICStus



Being based on the Constraint Logic Programming paradigm, a program in SICStus has the structure described before:

    Problem(Vars) :-
        Declaration of Variables and Domains,
        Specification of Constraints,
        Labelling of the Variables.

In the labelling of the variables X1, X2, ..., Xn of some list Lx, one should specify the intended heuristics.

Although these heuristics may be programmed explicitly, there are some facilities that SICStus provides, both for variable selection and value choice.

48

Heuristics in SICStus



The simplest form to specify enumeration is through a built-in predicate, labeling/2, where:

- the 1st argument is a list of options, possibly empty;
- the 2nd argument is a list Lx = [X1, X2, ..., Xn] of variables to enumerate.

By default, labeling([], Lx) selects variables X1, X2, ..., Xn from list Lx according to their position, "from left to right". The value chosen for the variable is the least value in its domain.

This predicate can be used with no options for static heuristics, provided that the variables are sorted in the list Lx according to the intended ordering.
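A minimal, self-contained usage sketch (the predicate small/1 and the toy constraints are ours, not from the slides):

    :- use_module(library(clpfd)).

    small([X,Y,Z]) :-
        domain([X,Y,Z], 1, 4),     % declaration of variables and domains
        X #< Y, Y #< Z,            % specification of constraints
        labeling([], [X,Y,Z]).     % default: leftmost variable, least value first

    % ?- small(L).
    % L = [1,2,3] ? ;
    % L = [1,2,4] ? ...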

49

Heuristics in SICStus



With an empty list of options, predicate labeling([], L) is in fact equivalent to predicate enumerating(L) below:

    enumerating([]).
    enumerating([Xi|T]) :-
        indomain(Xi),
        enumerating(T).

where the built-in predicate indomain(Xi) chooses values for variable Xi in increasing order.

There are other possibilities for user control of value choice. The current domain of a variable may be obtained with the built-in fd_predicate fd_dom/2. For example:

    ?- X in 1..5, X #\= 3, fd_dom(X, D).
    D = (1..2)\/(4..5),
    X in (1..2)\/(4..5) ?

50

Heuristics in SICStus



Usually, it is not necessary to reach this low level of programming, and a number of predefined options for predicate labeling/2 can be used.

The options of interest for value choice for the selected variable are up and down, with the obvious meaning of choosing the values from the domain in increasing and decreasing order, respectively.

Hence, to guarantee that the value of some variable is chosen in decreasing order without resorting to lower-level fd_predicates, it is sufficient to call predicate labeling/2 with option down:

    labeling([down], [Xi])


51

Heuristics in SICStus


The options of interest for variable selection are leftmost, min, max, ff, ffc and variable(Sel):

- leftmost: the default mode. Variables are simply selected by their order in the list.
- min, max: the variable with the lowest/highest value in its domain is selected. Useful, for example, in many applications of scheduling, as discussed.
- ff, ffc: implement the first-fail heuristics, selecting a variable with a domain of least size; ffc additionally breaks ties by the number of constraints in which the variable is involved.

52

Heuristics in SICStus


variable(Sel):

This is the most general possibility. Sel must be defined in the program as a predicate whose last 3 parameters are Vars, Selected and Rest. Given the list of Vars to enumerate, the predicate should return Selected as the variable to select, Rest being the list with the remaining variables.

Other parameters may be used before the last 3. For example, if option variable(includes(5)) is used, then some predicate includes/4 must be specified, such as

    includes(V, Vars, Selected, Rest)

which should choose, from the Vars list, a variable, Selected, that includes V in its domain.
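A possible definition of such a predicate (a sketch of ours; fd_set/2 and fdset_member/2 are SICStus reflexive primitives, and the cut commits to the leftmost variable whose domain contains V):

    :- use_module(library(clpfd)).
    :- use_module(library(lists)).   % select/3

    includes(V, Vars, Selected, Rest) :-
        select(Selected, Vars, Rest),
        fd_set(Selected, Set),
        fdset_member(V, Set), !.
    includes(_, [X|Xs], X, Xs).      % fallback: plain leftmost selection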

53

Heuristics in SICStus


Notice that all these options of predicate labeling/2 may be programmed at a lower level, using the adequate primitives available from SICStus for inspection of the domains. These Reflexive Predicates, named fd_predicates, include

    fd_min(?X, ?Min)
    fd_max(?X, ?Max)
    fd_size(?X, ?Size)
    fd_degree(?X, ?Degree)

with the obvious meaning. For example,

    ?- X in 3..8, Y in 1..5, X #< Y,
       fd_size(X,S), fd_max(X,M), fd_degree(Y,D).
    D = 1, M = 4, S = 2,
    X in 3..4, Y in 4..5 ?

54

Heuristics in SICStus: Example


Program queens_fd_h solves the n-queens problem

    queens(N, M, O, S, F)

with various labelling options:

Variable Selection
- Option ff is used.
- Variables that are passed to predicate labeling/2 may or may not (M = 2/1) be sorted from the middle to the ends (for example, [X4,X5,X3,X6,X2,X7,X1,X8]) by predicate my_sort(2, Lx, Lo).

Value Choice
- Rather than starting enumeration from the lowest value, an offset (parameter O) is specified to start enumeration "half-way".
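The program itself is not reproduced in these slides. The following minimal n-queens sketch in SICStus (ours; without the M/O/S/F options) shows the skeleton such a program builds on, with first-fail labelling:

    :- use_module(library(clpfd)).

    queens(N, Qs) :-
        length(Qs, N),
        domain(Qs, 1, N),
        safe(Qs),
        labeling([ff], Qs).        % first-fail variable selection

    safe([]).
    safe([Q|Qs]) :- no_attack(Q, Qs, 1), safe(Qs).

    % Q must not share a row or a diagonal with any later queen.
    no_attack(_, [], _).
    no_attack(Q, [Q1|Qs], D) :-
        Q #\= Q1,
        Q #\= Q1 + D,
        Q #\= Q1 - D,
        D1 is D + 1,
        no_attack(Q, Qs, D1).

    % ?- queens(8, Qs).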

55

Heuristics in SICStus


Constraint Logic Programming uses, by default, depth-first search with backtracking in the labelling phase.

Despite being "interleaved" with constraint propagation, and the use of heuristics, the efficiency of search depends critically on the first choices made, namely the values assigned to the first variables selected. Backtracking "chronologically", these values may only change when the values of the remaining k variables are fully considered (after some O(2^k) time in the worst case).

Hence, alternatives have been proposed to pure depth-first search with chronological backtracking, namely:

- Intelligent backtracking;
- Iterative broadening;
- Limited discrepancy; and
- Incremental time-bound search.

56

Heuristics in SICStus



In chronological backtracking, when the enumeration of a variable fails, backtracking is performed on the variable that immediately preceded it, even if this variable is not to blame for the failure.

Various techniques for intelligent backtracking, or dependency-directed search, aim at identifying the causes of the failure and backtracking directly to the first variable that participates in the failure.

Some variants of intelligent backtracking are:

- Backjumping;
- Backchecking; and
- Backmarking.

57

Intelligent Backtracking

Backjumping

When the labelling of a variable fails, the variables that cause the failure of each of its values are analysed, and the "highest" of the "least" such variables is backtracked to.

In the example, variable Q6 could not be labelled, and backtracking is performed on Q4, the "highest of the least" variables involved in the failure of Q6. All positions of Q6 are, in fact, incompatible with the value of some variable no higher than Q4.

[Figure: for each candidate position of Q6, the queens in conflict with it; the "highest of the least" is Q4.]

58

Intelligent Backtracking

Backchecking and Backmarking

These techniques may be useful when the testing of constraints on different variables is very costly. The key idea is to memorise previous conflicts, in order to avoid repeating them.

- In backchecking, only the assignments that caused conflicts are memorised.
- In backmarking, the assignments that did not cause conflicts are also memorised.

The use of these techniques with constraint propagation is usually not very effective (with a possible exception of SAT solvers, with nogood clause learning), since propagation anticipates the conflicts, somehow avoiding irrelevant backtracking.

59

Iterative Broadening


In iterative broadening, a limit b is assigned to the number of times that a node may be visited (both the initial visit and those by backtracking), i.e. the number of values that may be chosen for a variable. If this value is exceeded, the node and its successors are not explored any further.

In the example, assuming that b = 2, the pruned search space is shadowed.

Of course, if the search fails for a given b, this value is iteratively increased, hence the "iterative broadening" qualification.

[Figure: a search tree in which, with b = 2, only the first two branches of each node are explored; the rest is shadowed.]
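A sketch of the idea applied to labelling (ours, and only an approximation: it bounds the number of values tried per variable rather than node visits in general; fd_set/2 and fdset_to_list/2 are SICStus primitives, the rest is assumed structure):

    :- use_module(library(clpfd)).
    :- use_module(library(lists)).   % member/2

    ib_label(Vars) :- ib_label(1, Vars).

    ib_label(B, Vars) :-
        (   label_b(B, Vars) -> true
        ;   B1 is B + 1,             % breadth exhausted: broaden and retry
            ib_label(B1, Vars)
        ).
    % caveat: on an unsatisfiable problem this loops; a real implementation
    % would cap B at the largest domain size.

    label_b(_, []).
    label_b(B, [X|Xs]) :-
        fd_set(X, S), fdset_to_list(S, Vs),
        first_n(B, Vs, Tries),       % at most B values per variable
        member(V, Tries),
        X #= V,
        label_b(B, Xs).

    first_n(_, [], []).
    first_n(0, _, []) :- !.
    first_n(N, [V|Vs], [V|Ts]) :- N > 0, N1 is N - 1, first_n(N1, Vs, Ts).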

60

Limited Discrepancy


Limited discrepancy assumes that the value choice heuristic may only fail a (small) number of times. It directs the search to the regions where solutions more likely lie, by limiting to some number d the number of times that the suggestion made by the heuristic is not taken.

In the example, assuming the heuristic always suggests the leftmost option and d = 3, the pruned search space is shadowed.

Again, if the search fails, d may be incremented and the search space is iteratively enlarged.

[Figure: a search tree with the subtrees beyond d = 2 and d = 3 discrepancies shadowed.]

61

Incremental Time Bound Search



In ITBS, the goal is similar to iterative broadening or limited discrepancy, but implemented differently. Once the values for the first k variables are chosen, for each label {X1-v1, ..., Xk-vk} search is allowed for a given time T.

If no solution is found, another labelling is tested. Of course, if the search fails for a certain value of T, this may be increased incrementally in the next iterations, guaranteeing that the search space also increases iteratively.

In all these algorithms (iterative broadening, limited discrepancy and incremental time-bound search) parts of the search space may be revisited. Nevertheless, the worst-case time complexity of the algorithms is not worsened.

62

Incremental Time Bound Search


For

example,

in

the

case

of

incremental

time
-
bound

search

if

the

successive

and

failed

iterations

increase

the

time

limit

by

some

factor

a
,

i
.
e
.

T
j+
1

=

a
Tj,

the

ite牡tions

will

la獴




T
1

+

T
2

+

...

+

T
j



=

T

(

1
+

a

+

a
2

+

...
+

a
j
-
1
)

=



=

T(
1

-

a
j
)/(
1
-

a
)



T

a
j


If

a

solution

is

found

in

the

iteration

j+
1
,

then



the

time

spent

in

the

previous

iterations

is

T
a
j
.



iteration

j+
1

lasts

for

T
a
j

and,

in

average,

the

solution

is

found

in

half

this

time
.


Hence,

the

“wasted”

time

is

of

the

same

magnitude

of

the

“useful”

time

spent

in

the

search

(in

iteration

j+
1
)
.