1
Heuristic Search
• Algorithms that maintain some form of consistency remove (many) redundant values but, not being complete, do not eliminate the need for search.
• Even when a constraint network is consistent, enumeration is subject to failure.
• In fact, a consistent constraint network may not even be satisfiable (nor is a satisfiable constraint network necessarily consistent).
• All that is guaranteed by maintaining some type of consistency is that the networks are equivalent.
• Solutions are not “lost” in the reduced network, which, despite having fewer redundant values, has all the solutions of the original.
2
Heuristic Search
• Hence, domain pruning does not, in general, eliminate the need for search. The search space is usually organised as a tree, and the search becomes some form of tree search.
• As usual, the various branches down from one node of the search tree correspond to the assignment of the different values in the domain of a variable.
• As such, a tree leaf corresponds to a complete compound label (including all the problem variables).
• A depth-first search in the tree, resorting to backtracking when a node corresponds to a dead end (unsatisfiability), corresponds to an incremental completion of partial solutions until a complete one is found.
3
Heuristic Search
• Given the execution model of constraint logic programming (or any algorithm that interleaves search with constraint propagation)

Problem(Vars):-
    Declaration of Variables and Domains,
    Specification of Constraints,
    Labelling of the Variables.

the enumeration of the variables determines the shape of the search tree, since the nodes that are reached depend on the order in which variables are enumerated.
• Take for example two distinct enumerations of variables whose domains have different cardinality, e.g. X in 1..2, Y in 1..3 and Z in 1..4.
4
Heuristic Search
enum([X,Y,Z]) :-
    indomain(X), propagation,
    indomain(Y), propagation,
    indomain(Z).

# of nodes = 32 (2 + 6 + 24)

[Figure: search tree for the enumeration order X, Y, Z]
5
Heuristic Search
enum([X,Y,Z]) :-
    indomain(Z), propagation,
    indomain(Y), propagation,
    indomain(X).

# of nodes = 40 (4 + 12 + 24)

[Figure: search tree for the enumeration order Z, Y, X]
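The node counts above can be checked directly: level i of the search tree contains one node for every combination of the first i enumerated domains. A minimal sketch (in Python, outside any Prolog system):

```python
from math import prod

def tree_nodes(domain_sizes):
    """Number of search-tree nodes (excluding the root) when variables
    are enumerated in the given order: level i holds the product of the
    first i domain sizes."""
    return sum(prod(domain_sizes[:i + 1]) for i in range(len(domain_sizes)))

# X in 1..2, Y in 1..3, Z in 1..4
assert tree_nodes([2, 3, 4]) == 2 + 6 + 24 == 32    # order X, Y, Z
assert tree_nodes([4, 3, 2]) == 4 + 12 + 24 == 40   # order Z, Y, X
```

Enumerating the smaller domains first yields the smaller tree, even though both trees have the same 24 leaves.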
6
Heuristic Search
• The order in which variables are enumerated may have an important impact on the efficiency of the tree search, since
– The number of internal nodes is different, despite the same number of leaves, or potential solutions, ∏i #Di.
– Failures can be detected differently, favouring some orderings of the enumeration.
– Depending on the propagation used, different orderings may lead to different prunings of the tree.
• The ordering of the domains has no direct influence on the search space, although it may have great importance in finding the first solution.
7
Heuristic Search
• To control the efficiency of tree search one should in principle adopt appropriate heuristics to select
• The next variable to label
• The value to assign to the selected variable
• Since heuristics for value choice will not affect the size of the search tree to be explored, particular attention will be paid to the heuristics for variable selection.
8
Variable Selection Heuristics
• There are two types of heuristics that can be considered for variable selection.
– Static: the ordering of the variables is set up before starting the enumeration, not taking into account the possible effects of propagation.
– Dynamic: the selection of the variable is determined after analysis of the problem that resulted from previous enumerations (and propagation).
• Static heuristics are based on some properties of the underlying constraint graphs, namely their width and bandwidth, so we will first present and discuss these properties.
9
Node Width
• To define the width of a graph we will first define the notion of the width of a node.

Definition (Node width, given ordering O): Given some total ordering, O, of the nodes of a graph, the width of a node N, induced by ordering O, is the number of lower-order nodes that are adjacent to N.

As an example, given any ordering that is increasing from the root to the leaves of a tree, all nodes of the tree (except the root) have width 1.

[Figure: a tree whose 9 nodes are numbered in root-to-leaf increasing order]
10
Graph Width
Definition (Width of a graph G, induced by O): Given some ordering, O, of the nodes of a graph, G, the width of G induced by ordering O is the maximum width of its nodes, given that ordering.

Definition (Width of a graph G): The width of a graph G is the lowest width of the graph induced by any of its orderings O.

It is apparent from these definitions that a tree is a special graph whose width is 1.

[Figure: the same 9-node tree, numbered in root-to-leaf order]
11
Graph Width
Example: In the graph below, we may consider various orderings of its nodes, inducing different widths. The width of the graph is 3 (this is, for example, the width induced by ordering O1dir).

[Figure: a 7-node graph shown with two orderings, O1 = [1,2,3,4,5,6,7] and O2 = [4,3,2,5,6,7,1]]

w(O1dir) = 3 (nodes 4, 5, 6 and 7)
w(O1inv) = 5 (node 1)
w(O2dir) = 5 (node 1)
w(O2inv) = 6 (node 4)
12
Graph Width
• As shown before, to get a backtrack-free search in a tree, it would be enough to guarantee that, for each node N, all of its children (adjacent nodes with higher order) have values that support the values in the domain of node N.
• This result can be generalised to arbitrary graphs, given some ordering of their nodes. If, according to some ordering, the graph has width w, and if enumeration follows that ordering, backtrack-free search is guaranteed if the network is strongly k-consistent (with k > w).
• In fact, as in the case of trees, it would be enough to maintain some kind of directed strong consistency, if the labelling of the nodes is done in “increasing” order.
13
Graph Width
Example:
• With ordering O1dir, strong 4-consistency guarantees backtrack-free search. Such consistency guarantees that any 3-compound label that satisfies the relevant constraints may be extended to a 4th variable, still satisfying the relevant constraints.
• In fact, any value v1 from the domain of variable 1 may be selected. If strong 4-consistency is maintained, values from the domains of the other variables will possibly be removed, but no variable will have its domain emptied.

[Figure: the 7-node graph with ordering O1 = [1,2,3,4,5,6,7]]
14
Graph Width
Example: Since the ordering induces width 3, every variable Xk connected to variable X1 is connected to at most 2 other variables “lower” than Xk. For example, variable X6, connected to variable X1, is also connected to the lower variables X3 and X4 (the same applies to X5-X4-X1 and X4-X3-X2-X1).

Hence, if the network is strongly 4-consistent, and if the label X1-v1 was there, this means that any 3-compound label {X1-v1, X3-v3, X4-v4} could be extended to a 4-compound label {X1-v1, X3-v3, X4-v4, X6-v6} satisfying the relevant constraints. When X1 is enumerated to v1, values v3, v4 and v6 (at least) will be kept in their variable domains.

[Figure: the 7-node graph with ordering O1 = [1,2,3,4,5,6,7]]
15
Graph Width
• This example shows that, after the enumeration of the “lowest” variable (in an ordering that induces a width w on the graph) of a strongly k-consistent network (with k > w), the remaining variables still have values in their domains (avoiding backtracking).
• The network being strongly k-consistent, all variables connected to node 1 (variable X1) are only connected to at most w-1 (< k) other variables with lower order.
• Hence, if some value v1 was in the domain of variable X1, then for all these sets of w variables, some (w-1)-compound label {X1-v1, ..., X(w-1)-v(w-1)} could be extended with some label Xw-vw. Hence, when enumerating X1 = v1, the removal of the other values of X1 still leaves the w-compound label {X1-v1, ..., Xw-vw} satisfying the relevant constraints.
16
Graph Width
• The process can proceed recursively, by successive enumeration of the variables in increasing order. After each enumeration the network will have one variable less, and values may be removed from the domains of the remaining variables, ensuring that the simplified network remains strongly 4-consistent.
• Eventually, a network with only 4 variables is reached, and an enumeration is obviously possible without ever needing to backtrack.

[Figure: the 7-node graph with ordering O1 = [1,2,3,4,5,6,7]]
17
Graph Width
• The ideas shown in the previous example could lead to a formal proof (by induction) of the following

Theorem: Any strongly k-consistent constraint network may be enumerated backtrack-free if there is an ordering O according to which the constraint graph has a width less than k.

• In practice, if such an ordering O exists, the enumeration is done in increasing order of the variables, maintaining the system k-consistent after the enumeration of each variable.
• This result is particularly interesting for large and sparse networks, where the added cost of maintaining, say, path-consistency (polynomial) may be compensated by not incurring backtracking (exponential).
18
MWO Heuristics
• Strong k-consistency is of course very costly to maintain, in computational terms, and this is usually not done.
• Nevertheless, and especially in constraint networks with low density and where the widths of the nodes vary significantly with the orderings used, the orderings leading to lower graph widths may be used to heuristically select the variable to label. Specifically, one may define the

MWO Heuristic (Minimum Width Ordering): The Minimum Width Ordering heuristic suggests that the variables of a constraint problem are enumerated, increasingly, in some ordering that leads to a minimal width of the primal constraint graph.
19
MWO Heuristics
• The definition refers to the primal graph of a constraint problem, which coincides with the constraint graph for binary constraints (arcs). For n-ary constraints, the primal graph includes an arc between any variables connected by a hyper-arc in the problem hyper-graph.
• For example, the graph being used could be the primal graph of a problem with 2 quaternary constraints (C1245 and C1346) and 3 ternary constraints (C123, C457 and C467).

C123  -> arcs a12, a13 and a23
C1245 -> arcs a12, a14, a15, a24, a25 and a45
C1346 -> arcs a13, a14, a16, a34, a36 and a46
C457  -> arcs a45, a47 and a57
C467  -> arcs a46, a47 and a67

[Figure: the resulting 7-node primal graph]
20
MWO Heuristics
• The application of the MWO heuristic requires the determination of orderings O leading to the lowest primal constraint graph width. The following greedy algorithm can be used to determine such orderings.

function sorted_vars(V,C): Node List;
    if V = {N} then                          % only one variable
        sorted_vars <- N
    else
        N <- arg min {degree(Vi,C) | Vi in V}
                                             % N is one of the nodes with fewest neighbours
        C' <- C \ arcs(N,C)
        V' <- V \ {N}
        sorted_vars <- sorted_vars(V',C') & N
    end if
end function
21
MWO Heuristics
Example: The ordering [1,2,3,4,5,6,7] obtained leads to a width of 3 for the graph.

[Figure: the successive graphs obtained as the nodes are removed]

1. In the graph, node 7 has least degree (= 3). Once node 7 is removed, nodes 5 and 6 have least degree (= 3).
2. Node 6 is now removed.
3. Nodes 5, 4 (degree 3), 3 (degree 2) and 2 (degree 1) are subsequently removed, as shown in the figure.
22
MDO Heuristics
• An approximation of the MWO heuristic is the MDO heuristic, which avoids the computation of the ordering O leading to the lowest constraint graph width.

MDO Heuristic (Maximum Degree Ordering): The Maximum Degree Ordering heuristic suggests that the variables of a constraint problem are enumerated by decreasing order of their degree in the constraint graph.

• Example: The MDO heuristic would use an ordering starting with nodes 4 (d = 6) and 1 (d = 5) and ending with node 7 (d = 3). Nodes 2, 3, 5 and 6 would be sorted arbitrarily.

[Figure: the 7-node primal graph]
23
MDO and MWO Heuristics
• Both the MWO and the MDO heuristics start the enumeration with the variables that have the most adjacent variables in the graph, aiming at the earliest detection of dead ends.
• (Notice that in the algorithm to determine a minimal width ordering, the last variables are those with least degree.)

Example: MWO and MDO orderings are not necessarily coincident, and MWO may be used to break ties. For example, the two MDO orderings

O1 = [4,1,5,6,2,3,7]
O2 = [4,1,2,3,5,6,7]

induce different widths (4 and 3).

[Figure: the 7-node primal graph]
24
Cycle-cut sets
• The MWO heuristic is particularly useful when some high level of consistency is maintained, as shown in the theorem relating it to strong k-consistency, which generalises to arbitrary constraint networks the result obtained with constraint trees.
• However, since such consistency is hard to maintain, an alternative is to enumerate the problem variables so as to simplify, as soon as possible, the constraint network into a constraint tree, from where a backtrack-free search may proceed (provided directed arc consistency is maintained).
• This is the basic idea of cycle-cut sets.
25
Cycle-cut Sets
Example: Enumerating first variables 1 and 4, the graph becomes the one shown.

[Figure: the graph after removing nodes 1 and 4; only the cycle 2-3-6-7-5-2 remains]

Therefore, adding any other node to the set {1,4} eliminates cycle 2-3-6-7-5-2 and turns the remaining nodes into a tree. Sets {1,4,2}, {1,4,3}, {1,4,5}, {1,4,6} and {1,4,7} are thus cycle-cut sets.

For example, after enumeration of the variables from the cycle-cut set {1,4,7}, the remaining constraint graph is a tree.
26
Cycle-cut sets
Obviously, one is interested in cycle-cut sets with the lowest cardinality.
• Consider a constraint network with n variables with domains of size d. If a cycle-cut set of cardinality k is found, the search complexity is reduced to
1. enumerating a tuple in these k nodes, with time complexity O(d^k); and
2. maintenance of (directed) arc consistency in the remaining tree with n-k nodes, with time complexity O(a·d^2).
• Since a = n-k-1 for a tree with n-k nodes, and assuming that k is “small”, the total time complexity is thus O(n·d^(k+2)).
27
Cycle-cut sets
• One may thus define the

CCS Heuristic (Cycle-Cut Sets): The Cycle-Cut Sets heuristic suggests that the first variables of a constraint problem to be enumerated are those that form a cycle-cut set with least cardinality.

• Unfortunately, there seems to be no good algorithm to determine optimal cycle-cut sets (i.e. those with least cardinality).
• Two possible approximations are to use the sorted_vars algorithm with the inverse ordering, or simply to start with the nodes with maximum degree (MDO).
28
Cycle-cut sets
Using algorithm sorted_vars but selecting first the nodes with highest degree, we would get:
1. Remove node 4 (degree 6).
2. Remove node 1 (degree 4).
3. Now, the inclusion of any of the other nodes (all with degree 2) turns the constraint graph into a tree. Selecting node 2, we get the cycle-cut set {1,2,4}, which is optimal.

[Figure: the graph after removing node 4, and after removing nodes 4 and 1]
29
MBO Heuristics
• In contrast with the MWO and MDO heuristics, which aim to detect failures in the enumeration as soon as possible, the MBO heuristic has the goal of eliminating irrelevant backtracking.
• For this purpose, the intention is that, as much as possible, if a variable X cannot be enumerated, the variables that may cause a conflict with X should immediately precede X in the enumeration order, to guarantee efficient backtracking.
• In fact, if backtracking is not done to a variable Y connected to X by some constraint, changing Y will not remove the previous conflict, and the failure will repeat.
• Such ideas are captured in the notion of Graph Bandwidth.
30
Graph Bandwidth
Definition (Bandwidth of a graph G, induced by O): Given a total ordering O of the nodes of a graph, the bandwidth of a graph G, induced by O, is the maximum distance, in the ordering, between adjacent nodes.

Example: With ordering O1, the bandwidth of the graph is 5, the distance between nodes 1/6 and 2/7. If the choice of variable X6 determines the choice of X1, changes to X1 will only occur after irrelevantly backtracking variables X5, X4, X3 and X2!

[Figure: a 7-node graph with ordering O1 = [1,2,3,4,5,6,7]]
31
Graph Bandwidth
Definition (Bandwidth of a graph G): The bandwidth of a graph G is the lowest bandwidth of the graph induced by any of its orderings O.

Example:
• With ordering O2, the bandwidth is 3, the distance between the adjacent nodes 2/5, 3/6 and 4/7.
• No ordering induces a lower bandwidth on the graph. Therefore, the graph bandwidth is 3!

[Figure: the same graph with an ordering O2 of bandwidth 3]
32
MBO Heuristics
• The concept of bandwidth is the basis of the following

MBO Heuristic (Minimum Bandwidth Ordering): The Minimum Bandwidth Ordering heuristic suggests that the variables of a constraint problem are enumerated, increasingly, in some ordering that leads to the minimal bandwidth of the primal constraint graph.

Example:
• The MBO heuristic suggests the use of an ordering such as O2, which induces a bandwidth of 3.

[Figure: the graph with orderings O1 = [1,2,3,4,5,6,7] and O2]
33
MBO Heuristics
• The use of the MBO heuristic with constraint propagation is somewhat problematic, since:
– The constraint graphs for which it is most useful should be sparse, with no node of high degree. Otherwise, the distance to the farthest adjacent node dominates.
– The principle exploited by the heuristic (avoiding irrelevant backtracking) is achieved, hopefully more efficiently, by constraint propagation.
– No efficient algorithms exist to compute the bandwidth of general graphs (see [Tsan93], ch. 6).
34
MBO Heuristics
These cases are illustrated by some examples.

Case 1: The constraint graphs for which the heuristic is most useful should be sparse, with no node of high degree; otherwise the distance to the farthest adjacent node dominates.

Example:
• Node 4, with degree 6, determines that the bandwidth may be no less than 3 (for orderings with 3 nodes before and 3 nodes after node 4).
• Many orderings exist with this bandwidth. However, if node 3 were connected to node 7, the bandwidth would be 4.

[Figure: the 7-node graph]
35
MBO Heuristics
Case 2: Irrelevant backtracking is handled by constraint propagation.

Example: Assume the choice of X6 is determinant for the choice of X1. Constraint propagation will possibly empty the domain of X6 if a bad choice is made in labelling X1, before (some of) the variables X2, X3, X4 and X5 are labelled, thus avoiding their irrelevant backtracking.

The MBO heuristic is thus more appropriate for backtracking algorithms without constraint propagation.

[Figure: the graph with ordering O1 = [1,2,3,4,5,6,7]]
36
MBO Heuristics
Case 3: No efficient algorithms exist to compute the bandwidth of general graphs.
• In general, lower widths correspond to lower bandwidths, but the best orderings in the two cases are usually different.

[Figure: a graph over nodes A-H shown with different orderings: one of minimal width (width = 2, bandwidth = 5) and one of minimal bandwidth (width = 3, bandwidth = 4)]
37
Dynamic Heuristics
• In contrast to the static heuristics discussed (MWO, MDO, CCS and MBO), variable selection may be determined dynamically. Instead of being fixed before enumeration starts, the variable is selected taking into account the propagation of previous variable selections (and labellings).
• In addition to problem-specific heuristics, there is a general principle that has shown great potential: the first-fail principle.
• The principle is simple: when a problem includes many interdependent “tasks”, start by solving those that are most difficult. It is not worth wasting time on the easiest ones, since they may turn out to be incompatible with the results of the difficult ones.
38
First-Fail Heuristics
• There are many ways of interpreting and implementing this generic first-fail principle.
• Firstly, the tasks to perform to solve a constraint satisfaction problem may be considered to be the assignment of values to the problem variables. How should their difficulty be measured?
• Enumerating is by itself easy (a simple assignment). What makes the tasks difficult is to assess whether the choice is viable, after constraint propagation. This assessment is hard to make in general, so we may consider features that are easy to measure, such as
– The domain size of the variables
– The number of constraints (degree) they participate in.
39
First-Fail Heuristics
The domain of the variables
• Intuitively, if variables X1/X2 have m1/m2 values in their domains, and m2 > m1, it is preferable to assign values to X1, because there is less choice available!
• In the limit, if variable X1 has only one value in its domain (m1 = 1), there is no possible choice and the best thing to do is to immediately assign that value to the variable.
• Another way of seeing the issue is the following:
– On the one hand, the “chance” of assigning a good value to X1 is higher than that for X2.
– On the other hand, if that value proves to be a bad one, a larger proportion of the search space is eliminated.
40
First-Fail Heuristics: Example
Example: In the 8-queens problem, where queens Q1, Q2 and Q3 were already enumerated, we have the following domains for the other queens:
• Q4 in {2,7,8}, Q5 in {2,4,8}, Q6 in {4}, Q7 in {2,4,8}, Q8 in {2,4,6,7}.
• Hence, the best variable to enumerate next should be Q6, not Q4, which would follow in the “natural” order.
• In this extreme case of singleton domains, node-consistency achieves pruning similar to arc-consistency with less computational cost!

[Figure: an 8x8 board showing the squares attacked after placing Q1, Q2 and Q3]
41
First-Fail Heuristics
The number of constraints (degree) of the variables
• This heuristic is basically the Maximum Degree Ordering (MDO) heuristic, but now the degree of the variables is assessed dynamically, after each variable enumeration.
• Clearly, the more constraints a variable is involved in, the more difficult it is to assign a good value to it, since the value has to satisfy a larger number of constraints.
• Of course, as in the case of the domain size, this decision is purely heuristic. The effect of the constraints depends greatly on their propagation, which depends in turn on the problem at hand, and is hard to anticipate.
42
Problem Dependent Heuristics
• In certain types of problems, there might be heuristics specially adapted to the problems being solved.
• For example, in scheduling problems, where tasks should not overlap but have to take place within a certain period of time, it is usually a good heuristic to “scatter” them as much as possible within the allowed period.
• This suggests that one should start by enumerating the variables corresponding to tasks that may be performed at the beginning and at the end of the allowed period, thus leaving “space” for the others to execute.
• In such a case, the dynamic choice of the variable would take into account the values in its domain, namely the minimum and maximum values.
43
Mixed Heuristics
• Taking into account the features of the heuristics discussed so far, one may consider the use of mixed strategies that incorporate several of these heuristics.
• For example, the static cycle-cut sets (CCS) heuristic suggests a set of variables to enumerate first, so as to turn the constraint graph into a tree.
• Within this cycle-cut set one may use a first-fail dynamic heuristic (e.g. smallest domain size).
• On the other hand, even a static heuristic like Minimal Width Ordering (MWO) may be made “dynamic”, by reassessing the width of the graphs after a certain number of enumerations, to take into account the results of propagation.
44
Value Choice Heuristics
• Once a variable is selected for labelling, a value within its domain has to be chosen.
• There are few generic methods to handle value choice. The only one widely used is the principle of choosing the value with the highest “likelihood” of success!
• The reason for this is obvious. In contrast with variable selection, value choice does not determine the size of the search space, so one should be interested in finding, as quickly as possible, a path to a solution.
• Of course, the application of this principle is highly dependent on the problem (or even the instance of the problem) being solved.
45
Value Choice Heuristics
• Some forms of assigning likelihoods are the following:

Ad hoc choice: Again in scheduling problems, once the variable with the lowest/highest values in its domain is selected, the natural choice for the value will be the lowest/highest, which somehow “optimises” the likelihood of success.

Lookahead: One may try to anticipate, for each of the possible values, its likelihood of success, by evaluating, after its propagation, the effect on an aggregate indicator of the size of the domains not yet assigned (as done with the kappa indicator), choosing the value that maximises such an indicator.
46
Value Choice Heuristics
Optimisation: In optimisation problems, where there is some function to maximise/minimise, one may obtain bounds for that function when the alternative values are chosen for the variable, or check how they change with the selected value.
• Of course, the heuristic will choose the value that either optimises the bounds under consideration, or that improves them the most.
• Notice that in this case the computation of the bounds may be performed either before propagation takes place (less computation, but also less information) or after such propagation.
47
Heuristics in SICStus
• Being based on the Constraint Logic Programming paradigm, a program in SICStus has the structure described before

Problem(Vars):-
    Declaration of Variables and Domains,
    Specification of Constraints,
    Labelling of the Variables.

• In the labelling of the variables X1, X2, ..., Xn of some list Lx, one should specify the intended heuristics.
• Although these heuristics may be programmed explicitly, there are some facilities that SICStus provides, both for variable selection and value choice.
48
Heuristics in SICStus
• The simplest way to specify enumeration is through a built-in predicate, labeling/2, where
– the 1st argument is a list of options, possibly empty
– the 2nd argument is a list Lx = [X1, X2, ..., Xn] of variables to enumerate
• By default, labeling([], Lx) selects the variables X1, X2, ..., Xn from list Lx according to their position, “from left to right”. The value chosen for the variable is the least value in its domain.
• This predicate can be used with no options for static heuristics, provided that the variables are sorted in the list Lx according to the intended ordering.
49
Heuristics in SICStus
• With an empty list of options, predicate labeling([],L) is in fact equivalent to predicate enumerating(L) below

enumerating([]).
enumerating([Xi|T]) :-
    indomain(Xi),
    enumerating(T).

where the built-in predicate indomain(Xi) chooses values for variable Xi in increasing order.
• There are other possibilities for user control of value choice. The current domain of a variable may be obtained with the built-in fd_predicate fd_dom/2. For example

?- X in 1..5, X #\= 3, fd_dom(X,D).
D = (1..2)\/(4..5),
X in (1..2)\/(4..5) ?
50
Heuristics in SICStus
• Usually, it is not necessary to reach this low level of programming, and a number of predefined options for predicate labeling/2 can be used.
• The options of interest for value choice for the selected variable are up and down, with the obvious meaning of choosing the values from the domain in increasing or decreasing order, respectively.
• Hence, to guarantee that the value of some variable is chosen in decreasing order, without resorting to lower-level fd_predicates, it is sufficient to call predicate labeling/2 with option down

labeling([down],[Xi])
51
Heuristics in SICStus
The options of interest for variable selection are leftmost, min, max, ff, ffc and variable(Sel)
– leftmost: the default mode.
• Variables are simply selected by their order in the list.
– min, max: the variable with the lowest/highest value in its domain is selected.
• Useful, for example, in many scheduling applications, as discussed.
– ff, ffc: implement the first-fail heuristic, selecting the variable with the domain of least size; ffc breaks ties with the number of constraints in which the variable is involved.
52
Heuristics in SICStus
– variable(Sel)
• This is the most general possibility. Sel must be defined in the program as a predicate whose last 3 parameters are Vars, Selected and Rest. Given the list of Vars to enumerate, the predicate should return in Selected the variable to select, Rest being the list with the remaining variables.
• Other parameters may be used before the last 3. For example, if option variable(includes(5)) is used, then some predicate includes/4 must be specified, such as

includes(V, Vars, Selected, Rest)

which should choose, from the Vars list, a variable, Selected, that includes V in its domain.
53
Heuristics in SICStus
• Notice that all these options of predicate labeling/2 may be programmed at a lower level, using the adequate primitives available in SICStus for inspection of the domains. These Reflexive Predicates, named fd_predicates, include

fd_min(?X, ?Min)
fd_max(?X, ?Max)
fd_size(?X, ?Size)
fd_degree(?X, ?Degree)

with the obvious meaning. For example,

?- X in 3..8, Y in 1..5, X #< Y,
   fd_size(X,S), fd_max(X,M), fd_degree(Y,D).
D = 1, M = 4, S = 2,
X in 3..4, Y in 4..5 ?
54
Heuristics in SICStus: Example
• Program queens_fd_h solves the n-queens problem, queens(N,M,O,S,F), with various labelling options:
• Variable Selection
– Option ff is used.
– Variables that are passed to predicate labeling/2 may or may not be (M=2/1) sorted from the middle to the ends (for example, [X4,X5,X3,X6,X2,X7,X1,X8]) by predicate my_sort(2, Lx, Lo).
• Value Choice
– Rather than starting enumeration from the lowest value, an offset (parameter O) is specified to start enumeration “half-way”.
55
Heuristics in SICStus
• Constraint Logic Programming uses, by default, depth-first search with backtracking in the labelling phase.
• Despite being “interleaved” with constraint propagation, and despite the use of heuristics, the efficiency of search depends critically on the first choices made, namely the values assigned to the first variables selected.
• Backtracking “chronologically”, these values may only change when the values of the remaining k variables are fully considered (after some O(2^k) time in the worst case). Hence, alternatives have been proposed to pure depth-first search with chronological backtracking, namely
• Intelligent backtracking;
• Iterative broadening;
• Limited discrepancy; and
• Incremental time-bound search.
56
Heuristics in SICStus
• In chronological backtracking, when the enumeration of a variable fails, backtracking is performed on the variable that immediately preceded it, even if this variable is not to blame for the failure.
• Various techniques for intelligent backtracking, or dependency-directed search, aim at identifying the causes of the failure and backtracking directly to the first variable that participates in the failure.
• Some variants of intelligent backtracking are:
• Backjumping;
• Backchecking; and
• Backmarking.
57
Intelligent Backtracking
Backjumping
• When the labelling of a variable fails, all the variables that cause the failure of each of its values are analysed, and the “highest” of the “least” such variables is backtracked to.

In the example, variable Q6 could not be labelled, and backtracking is performed on Q4, the “highest of the least” variables involved in the failure of Q6. All positions of Q6 are, in fact, incompatible with the value of some variable lower than Q4.

[Figure: a queens board marking, for each position of Q6, the earlier queens that exclude it]
58
Intelligent Backtracking
Backchecking and Backmarking
• These techniques may be useful when the testing of constraints on different variables is very costly. The key idea is to memorise previous conflicts, in order to avoid repeating them.
– In backchecking, only the assignments that caused conflicts are memorised.
– In backmarking, the assignments that did not cause conflicts are also memorised.
• The use of these techniques with constraint propagation is usually not very effective (with a possible exception of SAT solvers, with nogood clause learning), since propagation anticipates the conflicts, somehow avoiding irrelevant backtracking.
59
Iterative Broadening
• In iterative broadening, a limit b is imposed on the number of times that a node may be visited (both the initial visit and those by backtracking), i.e. on the number of values that may be chosen for a variable. If this limit is exceeded, the node and its successors are not explored any further.
• In the example, assuming that b = 2, the pruned search space is shadowed.
• Of course, if the search fails for a given b, this value is iteratively increased, hence the name iterative broadening.

[Figure: a search tree with the subtrees pruned for b = 2 shadowed]
60
Limited Discrepancy
• Limited discrepancy search assumes that the value choice heuristic may only fail a (small) number of times. It directs the search to the regions where solutions more likely lie, by limiting to some number d the number of times that the suggestion made by the heuristic is not followed.
• In the example, assuming the heuristic prefers the leftmost options and d = 3, the pruned search space is shadowed.
• Again, if the search fails, d may be incremented and the search space is increased accordingly.

[Figure: a search tree with the subtrees exceeding the discrepancy limit shadowed]
61
Incremental Time Bound Search
• In ITBS, the goal is similar to iterative broadening or limited discrepancy, but implemented differently. Once the values for the first k variables are chosen, for each label {X1-v1, ..., Xk-vk} search is allowed for a given time T.
• If no solution is found, another labelling is tested. Of course, if the search fails for a certain value of T, this may be increased incrementally in the next iterations, guaranteeing that the search space also increases iteratively.
• In all these algorithms (iterative broadening, limited discrepancy and incremental time bound) parts of the search space may be revisited. Nevertheless, the worst-case time complexity of the algorithms is not worsened.
62
Incremental Time Bound Search
• For example, in the case of incremental time-bound search, if the successive and failed iterations increase the time limit by some factor a, i.e. T(j+1) = a·Tj, the iterations will last

T1 + T2 + ... + Tj = T(1 + a + a^2 + ... + a^(j-1)) = T(a^j - 1)/(a - 1) ≈ T·a^j

• If a solution is found in iteration j+1, then
– the time spent in the previous iterations is ≈ T·a^j;
– iteration j+1 lasts T·a^j and, on average, the solution is found in half this time.
• Hence, the “wasted” time is of the same magnitude as the “useful” time spent in the search (in iteration j+1).