# slides - Computing Science and Mathematics

Oct 23, 2013

The Necessity of Metabias in Metaheuristics

John.woodward@nottingham.edu.cn

www.cs.nott.ac.uk/~jrw

Abstract

Bias is necessary for learning, and is a probability distribution over a
search space. It is usually introduced implicitly. Each time a metaheuristic
is executed it may give a different solution. However, if executed
repeatedly, it will give the same solution on average. In other words, the
bias is static (even if we include a self-adaptive component in the search
algorithm). One desirable property of metaheuristics is that they converge.
This means that there is a non-zero probability of visiting each item in the
search space. Search algorithms are intended to be reused on many instances
of a problem. These instances can be considered to be drawn from a
probability distribution. In other words, a search algorithm and a problem
class can both be viewed as probability distributions over the search space.
If the bias of a search algorithm does not match the bias of a problem class,
it will underperform; if they do match, it will perform well. Therefore we
need some mechanism for altering the initial bias of the search algorithm to
coincide with that of the problem class. This mechanism can be realized by a
meta level which alters the bias of the base level. In other words, if a
search algorithm is to be applied to many instances of a problem, then meta
bias is necessary. This implies that convergence at the meta level means a
search algorithm can shift its bias to any probability distribution.
Additionally, shifting bias is equivalent to automating the design of search
algorithms.

Outline

Need and application of metaheuristics.

Preliminaries (search space, problem instance, problem class, metaheuristic)

Generate-and-Test and Convergence

Bias and Probability

Many problem instances.

Bias and Problem Class

Summary and Conclusions

The Need for Heuristics

Many computational problems of industrial interest are intractable to search.

The combinatorial explosion associated with many problems means the search space grows too rapidly for practical purposes.

For example, with the travelling salesman problem, the size of the search space grows as O(n!), where n is the number of cities.

Therefore we need “non-exact” heuristics.
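
The factorial growth can be made concrete with a small sketch. It assumes a symmetric travelling salesman problem and counts tours that differ only in start city or direction as one tour, which gives (n-1)!/2; that is still O(n!). The function name `tour_count` is illustrative and not from the slides.

```python
from math import factorial


def tour_count(n_cities: int) -> int:
    """Distinct tours of a symmetric TSP (assumption: start city and direction ignored)."""
    if n_cities < 3:
        return 1  # with one or two cities there is only one possible tour
    return factorial(n_cities - 1) // 2


# Print how quickly the search space grows with the number of cities.
for n in (5, 10, 15, 20):
    print(n, tour_count(n))
```

Even at 20 cities the count is far beyond what exhaustive enumeration can handle, which is the motivation for heuristics given above.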

Examples of Metaheuristics

Hill Climbing

Simulated Annealing

Genetic Algorithms

Ant colony algorithms

Particle swarm optimization

…the list continues to grow…

Applications of Metaheuristics

Function Optimization

Function Regression

Combinatorial Problems

Bin Packing

Knapsack

Travelling Salesman

Program Induction

Reinforcement Learning

Preliminaries

A search space is a finite set of objects.

A cost function assigns a value to each object in the search space.

A problem instance is a search space and cost function.

A problem class is a probability distribution over a set of problem instances with the same search space.

A metaheuristic samples the search space in order to find a “good quality” solution.
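
These definitions can be written down as data structures. The following is a minimal sketch, assuming a search space of integers; the class names, the two toy cost functions and the probabilities 0.75 and 0.25 are all hypothetical choices made for illustration.

```python
import random
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class ProblemInstance:
    """A search space together with a cost function over it."""
    search_space: Sequence[int]
    cost: Callable[[int], float]


@dataclass(frozen=True)
class ProblemClass:
    """A probability distribution over instances that share one search space."""
    instances: Sequence[ProblemInstance]
    probabilities: Sequence[float]

    def draw(self, rng: random.Random) -> ProblemInstance:
        # Sample one instance according to the class distribution.
        return rng.choices(self.instances, weights=self.probabilities, k=1)[0]


# Hypothetical example: two instances over the same five-item search space.
space = (1, 2, 3, 4, 5)
inst_a = ProblemInstance(space, cost=lambda x: abs(x - 2))
inst_b = ProblemInstance(space, cost=lambda x: abs(x - 4))
problem_class = ProblemClass((inst_a, inst_b), (0.75, 0.25))
drawn = problem_class.draw(random.Random(0))
```

A metaheuristic, in this picture, is any procedure that takes a `ProblemInstance` and returns an item of its search space.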

Generate-and-Test/Search

A candidate solution is generated and tested.

This is repeated until some termination condition is met (usually a fixed number of iterations).

Generate-and-test is also called “search”.

In other words, we are sampling the search space.

[Diagram: a loop between “generate” and “test”]
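
The loop described on this slide can be sketched as follows. This is a minimal version that assumes candidates are generated uniformly at random and that termination is a fixed number of iterations; a real metaheuristic would generate candidates in a more informed way. The function and parameter names are illustrative.

```python
import random
from typing import Callable, Sequence, Tuple


def generate_and_test(
    space: Sequence[int],
    cost: Callable[[int], float],
    iterations: int,
    rng: random.Random,
) -> Tuple[int, float]:
    """Sample the search space `iterations` times and keep the cheapest item seen."""
    if iterations < 1:
        raise ValueError("iterations must be at least 1")
    best = rng.choice(space)                # generate the first candidate
    best_cost = cost(best)                  # test it
    for _ in range(iterations - 1):
        candidate = rng.choice(space)       # generate
        candidate_cost = cost(candidate)    # test
        if candidate_cost < best_cost:
            best, best_cost = candidate, candidate_cost
    return best, best_cost


# Hypothetical usage: minimise (x - 4)^2 over the integers 0..9.
best, best_cost = generate_and_test(
    list(range(10)), lambda x: (x - 4) ** 2, 50, random.Random(0)
)
```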

Property of Convergence

Convergence: given enough time, a metaheuristic will eventually reach (sample/hit) the global optimum.

There is a non-zero probability of visiting each point in the search space.

This criterion is usually easy to meet.

“Premature convergence” and “getting stuck in a local optimum” are major issues for many metaheuristics.
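
Why a non-zero probability is enough can be seen with a small calculation. The sketch below assumes, as a simplification, that samples are drawn independently from a fixed distribution that gives the global optimum probability `p_optimum`; real metaheuristics usually produce dependent samples. The function name is illustrative.

```python
def hit_probability(p_optimum: float, samples: int) -> float:
    """Probability of sampling the optimum at least once in `samples` independent draws."""
    if not 0.0 <= p_optimum <= 1.0:
        raise ValueError("p_optimum must be a probability")
    return 1.0 - (1.0 - p_optimum) ** samples


# With any p_optimum > 0 the value approaches 1 as the number of samples grows;
# with p_optimum == 0 it stays at 0 however long the search runs.
for t in (10, 100, 1000, 10000):
    print(t, hit_probability(0.001, t))
```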

Bias and Probability

The literature refers to “the bias of a metaheuristic”.

Bias is the probability distribution over the search space (a simplifying assumption).

A metaheuristic is a random variable.

Let us call this “base bias”.

Sources of bias: any design decision, e.g. cooling schedule, crossover operator, parameters.

It is unlikely that the designer makes the correct choices, or that every choice has much effect (i.e. tuning one parameter may be more effective than tuning a different parameter).
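
Under the simplified view above, the base bias of a metaheuristic can be estimated empirically by running it many times and recording how often each item is returned. The sketch below uses a hypothetical stand-in, `toy_metaheuristic`, whose weights are invented for illustration; any function that takes a random number generator and returns an item could be substituted.

```python
import random
from collections import Counter
from typing import Callable, Dict


def estimate_base_bias(
    run: Callable[[random.Random], int], runs: int, seed: int = 0
) -> Dict[int, float]:
    """Relative frequency with which `run` returns each item over `runs` executions."""
    rng = random.Random(seed)
    counts = Counter(run(rng) for _ in range(runs))
    return {item: n / runs for item, n in counts.items()}


def toy_metaheuristic(rng: random.Random) -> int:
    # Hypothetical stand-in: returns an item in 1..5 with fixed, made-up weights.
    return rng.choices([1, 2, 3, 4, 5], weights=[0.1, 0.2, 0.4, 0.2, 0.1], k=1)[0]


bias = estimate_base_bias(toy_metaheuristic, runs=10_000)
```

Each single run gives a different item, but the estimated distribution is stable from one batch of runs to the next, which is the sense in which the bias is static.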

One problem instance to the next one

[Diagram: metaheuristic A applied to problem instance 1, then metaheuristic A applied to problem instance 2]

Metaheuristic A operates in the same way regardless of the underlying
problem instance (1 then 2 then 1). There is no mechanism to alter
the bias from one instance to the next (except case-based reasoning).

Mechanisms do exist which alter the bias during the application of the
metaheuristic on a single problem instance, but rarely across problem
instances. We argue this is essential.

The metaheuristic may learn on one problem instance, but is not learning
across problem instances, which is essential if it is applied to many instances.

One problem instance to the next 2

[Diagram: metaheuristic A applied to problem instance 1, then problem instance 2, then problem instance 1 again, labelled “Meta bias”]

Meta bias can affect how a metaheuristic operates on different instances.

As a special case, if we return to problem instance 1 (or something similar),
we now at least have a mechanism to allow improved performance.
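
One way a meta level could alter the base bias between instances is sketched below. This is an illustration under strong assumptions, not the mechanism proposed in the slides: the base bias is an explicit weight vector over a five-item search space, and after each instance the meta level moves a fraction `rate` of the probability mass toward the item that won. All names and numbers are hypothetical.

```python
import random
from typing import Callable, List, Sequence


def solve(space: Sequence[int], cost: Callable[[int], float],
          bias: Sequence[float], samples: int, rng: random.Random) -> int:
    """Base level: sample the space according to `bias` and return the cheapest item."""
    candidates = rng.choices(space, weights=bias, k=samples)
    return min(candidates, key=cost)


def shift_bias(bias: Sequence[float], space: Sequence[int],
               winner: int, rate: float) -> List[float]:
    """Meta level: move a fraction `rate` of the probability mass toward `winner`."""
    target = [1.0 if item == winner else 0.0 for item in space]
    return [(1 - rate) * b + rate * t for b, t in zip(bias, target)]


space = [1, 2, 3, 4, 5]
bias = [0.2] * 5                      # arbitrary initial base bias
rng = random.Random(0)
# Hypothetical sequence of instances: optimum at 2, then 2 again, then 3.
instance_costs = [lambda x: abs(x - 2), lambda x: abs(x - 2), lambda x: abs(x - 3)]
for cost in instance_costs:
    winner = solve(space, cost, bias, samples=3, rng=rng)
    bias = shift_bias(bias, space, winner, rate=0.3)   # bias carried to the next instance
```

The point of the design is that `bias` persists across the loop, so what is learned on one instance changes how the next one is sampled.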

Bias and Problem Class

If no quality solutions exist in a certain part of the search space, there is no need to sample it, and therefore we do not require the property of convergence.

We want the algorithm to learn, over a number of instances, which are the promising areas of the search space.

Convergence at the meta level means that the probability of sampling can change from an arbitrary initial distribution to the best one for the class. This cannot be achieved by parameter tuning alone.

[Figure: two plots of probability against items in the search space: the distribution of quality solutions over a problem class, and the distribution of sampling by a metaheuristic]
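
How far a sampling bias is from a problem class can be quantified once both are written as probability vectors over the same items. The sketch below uses total variation distance, which is an illustrative choice and not a measure named in the slides; the numbers in `quality` are invented and are not read from the figure.

```python
from typing import Sequence


def total_variation(p: Sequence[float], q: Sequence[float]) -> float:
    """Total variation distance between two distributions over the same items."""
    if len(p) != len(q):
        raise ValueError("distributions must cover the same items")
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


# Hypothetical distributions over items 1..5.
quality = [0.0, 0.1, 0.6, 0.3, 0.0]   # where quality solutions lie across the class
uniform = [0.2] * 5                   # a metaheuristic that samples every item equally
matched = list(quality)               # a metaheuristic whose bias matches the class

mismatch_uniform = total_variation(quality, uniform)
mismatch_matched = total_variation(quality, matched)
```

In this toy case the matched bias gives no probability to items 1 and 5, so it does not converge in the base-level sense, yet its distance from the class distribution is zero.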

Summarizing our contributions

1. If we apply our metaheuristics to many problem instances, then meta bias is necessary so the base bias of the metaheuristic converges towards the global optima. Altering bias at the meta level is equivalent to automatically designing metaheuristics.

2. At the meta level, convergence means that the bias can shift to any base bias (i.e. any probability distribution).

3. Problem classes defined as probability distributions are an essential part of the machine learning methodology. A problem class defines a niche in which a suitable metaheuristic can fit. Therefore algorithms should be tested on problem instances which are drawn from this distribution and NOT on randomly selected benchmark instances.
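
The third point suggests an evaluation procedure along the following lines. This is a sketch under stated assumptions: the problem class is a hypothetical one in which the optimum lies at item 2, 3 or 4 with made-up probabilities, and the algorithm under test is a simple biased sampler. Fresh instances are drawn from the class distribution and the mean best cost is reported.

```python
import random
from statistics import mean
from typing import Callable, Sequence

SPACE = [1, 2, 3, 4, 5]


def draw_instance(rng: random.Random) -> Callable[[int], int]:
    """Hypothetical problem class: the optimum is at 2, 3 or 4 with fixed probabilities."""
    optimum = rng.choices([2, 3, 4], weights=[0.2, 0.6, 0.2], k=1)[0]
    return lambda x: abs(x - optimum)


def evaluate(bias: Sequence[float], n_instances: int, samples: int,
             rng: random.Random) -> float:
    """Mean best cost of a biased sampler over instances drawn from the class."""
    best_costs = []
    for _ in range(n_instances):
        cost = draw_instance(rng)                       # test instance from the class
        drawn = rng.choices(SPACE, weights=bias, k=samples)
        best_costs.append(min(cost(x) for x in drawn))
    return mean(best_costs)


score = evaluate([0.2] * 5, n_instances=200, samples=3, rng=random.Random(0))
```

Two candidate biases can then be compared by calling `evaluate` on each with instances from the same class, rather than on an unrelated benchmark set.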

The End

Thank you

I would be glad to take any questions.
