Submitted to: Knowledge Acquisition Workshop 1999

Top-down Design and Construction of Knowledge-Based Systems
with Manual and Inductive Techniques

Floor Verdenius
PO Box 1
6700 AA Wageningen
The Netherlands

Maarten W. van Someren
Department of Social Science Informatics
University of Amsterdam
Roeterstraat 15
1018 WB Amsterdam
The Netherlands



In this paper we present the outline of a method for planning the design and construction of
knowledge-based systems that combines a divide-and-conquer approach, as commonly used in
knowledge acquisition and software engineering, with the use of inductive techniques.
Decomposition of a knowledge acquisition problem into sub-problems is guided by the expected
costs and benefits of applying elicitation or induction to acquire the knowledge for part of the
target knowledge. The method is illustrated with a rational reconstruction of a knowledge
acquisition process that involved inductive techniques.

Keywords: Inductive Techniques, Knowledge Acquisition, Learning goals, Decomposition



Knowledge acquisition involves the formalisation of (human) knowledge about a certain
expert task. The aim is to build a system that can execute this task with a performance
that is comparable to the human expert. The knowledge on which a system is based can
be acquired in different ways. The classical way is to elicit knowledge from a human
expert and formalise this in an operational language, e.g. using an expert system shell.
This is often extended to acquiring knowledge not only from a single human expert but
also from several experts with complementary expertise and from relatively unstructured
information such as textbooks and manuals. A different approach that we shall call the
"inductive approach" is based on induction from observations or machine learning.
Expert performance is sampled or observations in a domain are collected and inductive
methods are used to construct a system that performs the task of the human expert or that
can make predictions about a domain.

Both the knowledge acquisition and the inductive approach have their strengths and
weaknesses. Consequently, for many problems a pure approach is not optimal. Here we
briefly review the separate approaches and then mixed approaches.

Knowledge Acquisition

Many design problems in knowledge-based system construction are characterised by the
availability of a number of knowledge sources that can be used to construct a system.
Examples of sources are documents, human expertise on related tasks, collections of
observations or even existing knowledge in computational form.

If a domain involves a complex relation between problem data and solutions then direct
knowledge elicitation is not effective: it will lead to questions to a domain expert that are
too global and will therefore not result in useful knowledge. For example, if only the
possible problems and solutions are known, a knowledge engineer can only ask a very
general question like:

How do you find solutions like S from problem data like P?

For a human expert such general questions are hard to answer for a complex domain. If
possible problems and solutions and something about intermediate reasoning steps are
known then more specific questions can be asked.

Research in Knowledge Acquisition has led to the formulation of predefined methods
and general conceptual models or ontologies. An ontology is an abstract description of
concepts that play a role in problem solving in a particular domain. If a method or
ontology is found to fit a particular knowledge acquisition problem then this can act as
a basis for a dialogue between knowledge engineer and expert. These methods and
ontologies serve two important functions. They provide a common language that can be
used to phrase more specific questions to the expert and they make it possible to
decompose the knowledge acquisition problem into sub-problems, a form of
divide-and-conquer.

One difficulty with this approach is the selection of an appropriate method and ontology.
The range of methods and ontologies is likely to be very large. Another problem is that
this approach does not take into account the economic aspect of the process: once a
method is selected it may turn out to be very expensive to acquire the knowledge for each
component of the method. Below we give a method for design and construction of
knowledge-based systems that optimises the use of available resources.

The Inductive Approach

Machine learning technology gives the prospect of (partially) automating the construction
of knowledge-based systems. There are several ways to apply machine learning
techniques to knowledge acquisition. The most direct approach is to collect a set of
examples of problems with solutions that are provided or approved by a human expert
and to apply an induction technique to automatically construct a knowledge-based
system. This requires no intermediate knowledge. In some cases, such data are easy to
obtain but for many applications, obtaining data is expensive. If the task is difficult then
the underlying relation may be rather complex. In that case many data are needed to
acquire adequate knowledge, which makes this option expensive.

Other approaches focus on refining or debugging knowledge that was acquired manually
(e.g. Shapiro, 1982; Ginsberg, 1988; Craw and Sleeman, 1990). Either approach can be
improved by using domain specific prior knowledge, for example in the form of
descriptions of rules that are to be learned.

Like the elicitation-based approach, direct application of an inductive approach also
encounters problems if the knowledge acquisition problem is complex. One complication
is that uncovering a complex structure will require many data. Especially if
measurements are noisy or the structure in the domain is probabilistic, many data are
needed to obtain reliable results. Because for many applications data are difficult and
expensive to acquire, and because it does not exploit available explicit knowledge, a
straightforward inductive approach is often not optimal. Here again we find the need for a
structured and economic approach to the acquisition problem. Decomposing the learning
problem and learning components separately may reduce the number of examples that are
needed (e.g. Shapiro, 1987).

In our method we integrate the elicitation-based approach (including the use of
predefined methods and ontologies) with the inductive approach and we show how
optimal use can be made of available sources of knowledge using systematic
decomposition of the learning task.

Combining Knowledge Acquisition and Induction

Several authors have presented approaches and techniques for combining inductive
techniques and knowledge acquisition. Morik et al. (1993) introduced the principle of
balanced co-operation: divide the tasks between human and computer to optimise the
combination. This was elaborated in the MOBAL system (Morik et al., 1993), which
allowed the user access to all data and generalisations, included meta-rules as
constraints on possible generalisations, and supported automated refinement of knowledge
that was explicitly entered by the user (see also Craw and Sleeman, 1990; Ginsberg,
1988; Aben and van Someren, 1990) or other forms of constraints on possible
generalisations. Other approaches enable the user to visualise aspects of the data to
select an appropriate analysis technique (e.g. Kohavi, Sommerfield and Dougherty,
1997). However, these approaches do not address one of the key issues in knowledge
acquisition: decomposition of the acquisition problem. They assume acquisition problems
that may be large in terms of the number of variables but that can actually be addressed
as a single problem without decomposition. Many realistic problems require a
divide-and-conquer approach and combined use of different acquisition methods. In this paper
we present an approach to such problems that incorporates principles of knowledge
acquisition and of induction and that is based on divide-and-conquer and economy.

This paper is organised as follows: section 2 briefly introduces the MeDIA framework for
designing inductive applications. Section 3 illustrates the application of the framework by
detailing the design process of a fruit treatment planning system. In section 4 we
discuss the implications and applicability of this method, and discuss some directions for
further work.


MeDIA: A Method for Design of Inductive Applications

Knowledge acquisition and machine learning differ in another respect which creates a problem
with combining them. Acquisition based on elicitation starts with a phase in which informal
representations are used, whereas machine learning projects often start with a large set of data. Our
goal is to define a methodology that covers the entire process, from specifying the problem and
identifying available sources to a running knowledge-based system. At which point should
decisions about the representation and the format of data be taken? At which point should a tool
for induction or elicitation be selected?

Figure 1: Overview of the MeDIA model

In line with earlier work (Verdenius and Engels, 1997) we view the development of inductive
systems as proceeding in three hierarchical levels according to the Method for Designing Inductive
Applications (the MeDIA model, see Figure 1); within these levels, a total of six activities is
located (see Table 1):


Application Level

Requirements Definition: deriving, from the problem owner and/or potential user, the
requirements for the application to deliver

Source Identification: available data and knowledge resources that are relevant
for the current domain are identified

Acquisition planning: a resource directed decomposition process, to be detailed
in this paper

Analysis Level

Data analysis: deriving an explicit description of data characteristics that are
required to select and tune inductive techniques

Technique selection: selecting the appropriate technique for implementing each

Technique Level

Technique implementation: setting parameters of the selected technique, and (if
required) delivering a tuned model

The levels are performed sequentially as much as possible. Iteration of design steps may however
still occur at two locations: between acquisition planning and data analysis, and between
technique selection and technique implementation.

A knowledge base of resources is available for acquisition planning. In this knowledge base, the
input and output of resources is defined. Moreover, it records whether the
knowledge for a component of the target system obtained using this resource can be acquired by
means of inductive techniques.

When implementing a technique, the assumptions used in technique selection may prove
inaccurate: the selected technique proves inadequate for performing the task, in spite of earlier
indications. In that case, a new technique selection has to be made.

For performing the acquisition planning a divide-and-conquer approach is proposed, configured
for optimal use of available sources of knowledge. In both software engineering (e.g.
Sommerville, 1995) and knowledge engineering (e.g. Schreiber, 1993) divide-and-conquer
approaches are standard. Many authors have defined languages that support top-down
development (e.g. Schreiber, 1993; Terpstra, van Heijst, Wielinga and Shadbolt, 1993). However,
most methods are not very specific about how to reduce a complex problem into simpler
sub-problems, and if presented, decomposition choices are mainly motivated by computational
efficiency or design modularity.

Table 1: MeDIA activities and their input and output.

[Table layout lost in extraction. Surviving cell entries: Knowledge Base; Domain Ontology;
Knowledge Sources; Data (source); Primitive Task; Data Sets; Data Source; Search Method;
Task Technique; Training Algorithm; Technique tuning.]

In this paper, we describe a method for decomposing knowledge acquisition problems into
sub-problems that can be solved better than the original problem. Here, the design choices for
decomposing a performance task are strongly motivated by the possibility to acquire an adequate
component against reasonable costs. The main differences with standard approaches for the
development of knowledge systems are that:

1. a wide range of types of knowledge sources and acquisition methods (e.g. induction,
elicitation, decomposition) are considered and evaluated as possible sources of knowledge
for a specific component of the target system. The options are evaluated on the basis of
cost and accuracy

2. the decomposition process is also directed by the structure of the available knowledge
and economy

In data mining, decomposition of a problem is normally based on the requirements for steps
like data selection, data cleaning and data conversion (see for example Engels, 1997;
d et al., 1996). Here we base decomposition on a deconstruction of the required
functionality into functionalities of reduced size and complexity to minimise the acquisition costs.

Compared to knowledge acquisition methods such as CommonKADS (Schreiber et al., 1999) our
approach defines several extensions. First, it considers data as a potential source for acquiring
knowledge. In CommonKADS, human experts are considered the main source of knowledge,
and most tools and techniques focus on extracting knowledge from human experts. Second, our
approach explicitly takes the costs and benefits of acquisition into account. Finally, we explicitly
consider induction as an acquisition method when decomposing tasks. Decomposition of a task into
subtasks can take place if data types for the subtasks are available in the domain ontology.
Moreover, at least one primitive task needs to have the same input or output structure as the
subtask to decompose. When this is not possible, or not preferred by the user, manual
decomposition has to be performed by the user.

Compared to other approaches for designing inductive applications (e.g. Brodley and Smyth,
1997), our model explicitly separates the design at a conceptual level from the actual
implementation. At the application level, implementation issues are not assessed.

Requirement Definition and Source Identification

The first steps are to define the (functional and non-functional) system requirements, to describe
the domain ontology and to identify and collect resources that can supply knowledge for the
target system. With functional requirements, we refer to the input-output mapping of the problem
to be solved. This is typically a (formal or verbal) definition of available and demanded data
items and their semantics. Non-functional requirements refer to all aspects of the solution that
are not of relevance for the mapping, but that still have an effect on the acceptance and
satisfaction of the problem owner about the offered solution. Examples of non-functional
requirements are the hardware and/or software platform, the response time or the preferred
layout of the user interface. The domain ontology is the set of relevant concepts that can be used
to express problems and solutions. In CommonKADS (Schreiber et al., 1999), the notion of
domain ontology is further defined.

The knowledge resources are collected in the next step. This is done largely on the basis of the
domain ontology and the explicit information in the requirement definition.

Acquisition planning

The next step is to plan the use of the available resources for the actual knowledge acquisition.
The result of this step is a plan that specifies which resources are to be used to acquire the
knowledge for the components of a target system. The structure in which these components are
connected to form the final system is also produced as output.

In general, a knowledge acquisition problem can be solved in three ways:

by direct elicitation of the knowledge from a source, e.g. a human expert or a document,

by induction from observations, or

by further decomposition into sub-problems.
Decomposition continues until the sub-problems can be mapped onto acquirable knowledge
resources: a set of resources is found, connectable in a data-flow structure, from which
knowledge can be acquired to perform the task of the target system. When a task is directly
mapped onto a source that is not a decomposition source it is referred to as a primitive task. Our
notions of task and primitive task are close to the notions of task and inference as used in
knowledge acquisition (e.g. CommonKADS, Schreiber et al., 1999).
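
The stopping condition above (a set of resources that can be connected in a data-flow structure so that the target output becomes reachable) can be sketched as a simple forward-chaining check. This is only an illustrative sketch: the `Source` class, the `derivable` function and the simplified input/output names of the three example sources are assumptions made for this example, not the representation used in the actual knowledge base.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """Hypothetical description of a knowledge source by its data-flow signature."""
    name: str
    inputs: frozenset
    output: str


def derivable(available, target, sources):
    """Forward-chain over the sources.

    Returns the names of the sources that fired, in firing order, if the
    target data type becomes reachable; returns None otherwise. The list is
    not guaranteed to be minimal.
    """
    known = set(available)
    fired = []
    progress = True
    while target not in known and progress:
        progress = False
        for s in sources:
            if s.output not in known and s.inputs <= known:
                known.add(s.output)
                fired.append(s.name)
                progress = True
    return fired if target in known else None


# Simplified, assumed signatures for three sources named in the example domain.
sources = [
    Source("Sample Products", frozenset({"batch", "sampling instructions"}), "quality data"),
    Source("Estimate Quality", frozenset({"quality data"}), "batch quality"),
    Source("Specify Recipe", frozenset({"batch quality", "due date"}), "recipe"),
]

plan = derivable({"batch", "sampling instructions", "due date"}, "recipe", sources)
print(plan)
```

If a required data type is missing (for instance the due date), `derivable` returns `None`, which corresponds to a sub-problem that still needs further decomposition or an additional source.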

Each of the options, direct elicitation and induction from observations, involves
further choices. The main criterion for these choices is acquisition economy: the balance between
the expected costs of implementing the option and the expected accuracy of the resulting
knowledge (O'Hara and Shadbolt, 1996). In case of direct elicitation, that is, in the case of manual
acquisition methods, the accuracy depends on the quality of the sources (e.g. human experts) and
of the communication process. In case of induction from observations the accuracy of the result
will depend on the availability of reliable data, the complexity of the actual relations and on
knowledge about the type of relation that is to be induced. If many reliable data are available, if
the underlying relation is not very complex and if the type of function is known then induction is
likely to be successful. Otherwise there is a risk of constructing knowledge that is incorrect.

The idea of decomposition into sub-problems is based directly on the top-down approach of
stepwise refinement (e.g. Wirth, 1971, 1976). The main difference is that in software engineering
the main principle that guides decomposition is minimising the complexity of the resulting
systems and thereby supporting activities like debugging, maintaining and re-using the system.
In knowledge acquisition the acquisition of the knowledge is usually the main factor that
determines the costs and benefits and therefore this guides the decomposition. Decomposition is
useful if (cheap and accurate) sources of knowledge are available for sub-tasks of the overall
knowledge acquisition task but not for the overall task. For example, there may be abundant data
for one sub-problem and a communicative expert for another sub-problem but not for the
problem as a whole. This is a reason to split the knowledge acquisition problem into
sub-problems that are then acquired separately. Another situation where a decomposition can be
cheaper and give more accurate results than a single step inductive approach is when there is no
prior knowledge to bias induction on the overall problem.

Acquisition Economy

To decide if decomposition is a good idea, we compare the expected costs and benefits of
acquisition with and without decomposition. The benefit of an acquisition operation (elicitation
based or induction based) is the accuracy of the resulting knowledge. The costs of the acquisition
process depend on the acquisition process. In case of elicitation, this involves time of the expert
and of the knowledge engineer, equipment, etc. In case of an inductive approach this involves the
costs of collecting and cleaning data and of applying an induction system to the result. If we
decompose the acquisition problem, the costs and benefits are simply the sum of those of the
sub-problems. So we get for elicitation/induction:

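\[ EG(\text{elicitation/induction}) = w_1 \cdot EA(\text{elicitation/induction}) - EC(\text{elicitation/induction}) \]

and for decomposition:

\[ EG(\text{decomposition}) = w_2 \cdot \min_i EA(\text{operation}_i) - \sum_i EC(\text{operation}_i) \]

where

EG = expected gain

EC = expected costs

EA = expected accuracy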

Here the weight parameters w1 and w2 indicate the importance of accuracy relative to the costs
of acquisition; if the accuracy is translated as annual benefits, w1 and w2 are related to
return on investment. The expected accuracy of a compound acquisition is derived from the minimal
accuracy of its components, which is a pessimistic estimate. As we argued above, in some cases
elicitation is almost impossible because the expert cannot answer very global questions. This
means that the costs are high and the accuracy of the knowledge is 0. In machine learning
applications the costs of actually running a system are usually rather small compared to other
costs, such as designing the target system, collecting data and tuning the induction tool, so this
could be left out.
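
As a minimal sketch of this trade-off, the following functions read the expected gain as weighted expected accuracy minus expected cost, take the minimum component accuracy as the pessimistic accuracy of a compound acquisition, and sum the component costs. The function names and all numbers in the example are assumptions chosen for illustration; they are not estimates from the project.

```python
def expected_gain(weight, accuracy, cost):
    """Expected gain of a single acquisition operation: weight * EA - EC."""
    return weight * accuracy - cost


def decomposition_gain(weight, components):
    """Expected gain of a decomposition.

    `components` is a list of (expected_accuracy, expected_cost) pairs, one
    per sub-problem. Accuracy is estimated pessimistically by the weakest
    component; costs are summed.
    """
    if not components:
        raise ValueError("a decomposition needs at least one component")
    weakest_accuracy = min(acc for acc, _ in components)
    total_cost = sum(cost for _, cost in components)
    return expected_gain(weight, weakest_accuracy, total_cost)


# Illustrative (assumed) estimates: one expensive, inaccurate single-step
# acquisition versus a decomposition into two cheaper components.
single_step = expected_gain(3, 0.3, 3.0)
decomposed = decomposition_gain(3, [(0.9, 0.5), (0.8, 1.1)])
print(single_step, decomposed)
```

Under these assumed numbers the decomposition has the higher expected gain, which is the situation in which splitting the acquisition problem pays off.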

The Decomposition Process

A decomposition is constructed by

inserting a source description that is connected to one or more types of data in the current
structure,

adding or deleting a connection in the data-flow structure, or

inserting a method (a sub-procedure) for a component in the data-flow structure.

The method for decomposing a knowledge acquisition problem is based on the idea that the
reasons for decomposing a knowledge acquisition problem that we gave above are applied in the
order given above. The method is a form of best-first search that uses expected costs and
benefits to evaluate candidate decompositions.

In case of further decomposition, the method is applied recursively to the sub-problems. The
algorithm is depicted as algorithm 1. If costs and accuracies cannot be estimated the alternative is
to perform a pilot study to assess the costs and expected accuracy. In the context of elicitation
this amounts to performing elicitation on part of the task and evaluating the result. In the context
of induction it amounts to comparative studies by cross validation. The main goal of such studies is
to select the best techniques in terms of the balance between costs and accuracy expressed above.
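
The recursive use of cost and accuracy estimates can be illustrated with a small sketch. Note that this is a simplified greedy recursion, not the search procedure of algorithm 1: each sub-problem is planned independently, decompositions are assumed to be acyclic, and the table format, the `best_plan` function and the numbers are all assumptions introduced for this example.

```python
def best_plan(task, direct, decompositions, weight):
    """Pick the option with the highest expected gain for `task`.

    direct:         dict task -> list of (method, expected_accuracy, expected_cost)
    decompositions: dict task -> list of candidate sub-task lists (assumed acyclic)
    Returns (expected_accuracy, expected_cost, plan) or None if no option exists.
    """
    candidates = []
    # Options 1 and 2: acquire the task in a single step (elicitation or induction).
    for method, accuracy, cost in direct.get(task, []):
        candidates.append((accuracy, cost, (task, method)))
    # Option 3: decompose and plan every sub-task recursively.
    for subtasks in decompositions.get(task, []):
        parts = [best_plan(s, direct, decompositions, weight) for s in subtasks]
        if any(p is None for p in parts):
            continue  # some sub-task cannot be acquired at all
        accuracy = min(p[0] for p in parts)   # pessimistic accuracy
        cost = sum(p[1] for p in parts)       # costs add up
        candidates.append((accuracy, cost, (task, [p[2] for p in parts])))
    if not candidates:
        return None
    return max(candidates, key=lambda c: weight * c[0] - c[1])


# Assumed estimates for a toy problem with one possible decomposition.
direct = {
    "recipe": [("induction", 0.3, 3.0)],
    "quality": [("induction", 0.9, 0.6)],
    "specify": [("induction", 0.8, 1.0)],
}
decompositions = {"recipe": [["quality", "specify"]]}

result = best_plan("recipe", direct, decompositions, weight=3)
print(result)
```

With these assumed estimates the decomposed plan wins over single-step induction. A pilot study, as described above, would be the way to obtain the accuracy and cost figures when they cannot be estimated directly.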

Data Analysis

In some cases, the resource can simply be included in the target system but in most cases the
knowledge must be "extracted" from the resource, using an acquisition technique. The
acquisition plan specifies the resource to be used but not the acquisition technique. Data analysis
applies to situations in which acquisition means induction from data. The purpose of this step is
to measure properties of the data that are relevant for selecting a technique. Selecting and
applying a technique will be the final step. Currently there is no comprehensive and practically
applicable method for this. As observed by Verdenius and van Someren (1997) many application
projects that use inductive techniques do not reason about selection of a technique. Often, designers
only consider one single induction technique. If necessary, the problem is transformed to make it
suitable for the chosen technique.

Several studies report experiments about the relation between properties of the data and the
performance of learning systems. The ESPRIT project Machine Learning Toolbox
(Kodratoff et al., 1994) has gathered heuristics for technique selection for classification. The
heuristics focus on several aspects of the learning problem. Based on descriptions of various aspects
of the learning process the user is provided with a number of alternative techniques that can be
applied to the learning task at hand. Relevant aspects include, besides the aspects of the data
mentioned above, the nature of the learning task, uncertainty, the availability of background
knowledge, and user interaction. The heuristics are implemented in an automated tool for user
support (Craw et al., 1994). The STATLOG project (Michie et al., 1994) provides an experimental
comparison of more than twenty different classification techniques on some thirty different data
sets. The analysis of the results indicates strong and weak points of the different techniques.
Moreover, additional analysis of the STATLOG results (Brazdil et al., 1994) generalises over the
results in an attempt to formulate comprehensive heuristics.

In general, analysis consists of selecting a form for the hypothesis and transforming the data into a
suitable format so that an appropriate learning method can be applied. Langley (1996) gives an
extensive description of various forms of hypotheses and corresponding learning methods but
less is known about which properties of a dataset indicate which hypothesis and which learning
method are optimal. Currently this problem is handled by experimentally trying out methods and
evaluating them by cross validation. We expect that better understanding of properties of the
dataset that discriminate between different classes of hypotheses will enable more rational
selection of the form of the hypothesis.
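
The current practice of trying out methods and evaluating them by cross validation can be sketched as follows. The two learners (a majority-class baseline and a one-nearest-neighbour classifier) and the synthetic data are toy stand-ins chosen only to keep the example self-contained; they are assumptions, not the techniques used in the project.

```python
import random
from collections import Counter


def k_fold_indices(n, k, seed=0):
    """Split the indices 0..n-1 into k disjoint folds after shuffling."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def majority_learner(train):
    """Baseline: always predict the most frequent label in the training set."""
    label = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: label


def nearest_neighbour_learner(train):
    """Predict the label of the closest training example (squared Euclidean distance)."""
    def predict(x):
        closest = min(train, key=lambda ex: sum((a - b) ** 2 for a, b in zip(ex[0], x)))
        return closest[1]
    return predict


def cross_validate(learner, data, k=5, seed=0):
    """Return the fraction of examples classified correctly when held out."""
    correct = 0
    for fold in k_fold_indices(len(data), k, seed):
        held_out = set(fold)
        train = [ex for i, ex in enumerate(data) if i not in held_out]
        model = learner(train)
        correct += sum(model(data[i][0]) == data[i][1] for i in fold)
    return correct / len(data)


# Synthetic data: two well separated clusters on one attribute.
data = [((i / 10,), "low") for i in range(10)] + [((5 + i / 10,), "high") for i in range(10)]

scores = {
    "majority": cross_validate(majority_learner, data),
    "1-NN": cross_validate(nearest_neighbour_learner, data),
}
print(scores)
```

The technique with the best estimated accuracy would then be weighed against its cost, in line with the acquisition economy described earlier.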


Example: The Product Treatment Support System

We illustrate the method with the example introduced above, on planning systematic treatments
for fruit ripening (Verdenius, 1996). This problem involved both knowledge acquisition and
machine learning. The project was not run with this method in mind but our description can be
viewed as a post hoc design rationale. The initial acquisition goal is:

construct a knowledge-based system that takes as input information on a batch of fruits that
arrives from abroad and that produces a recipe for storing the fruits

Figure 2: Description of the input and output that defines the learning problem

Requirement Definition and Source Identification

Figure 2 shows the overall learning problem. The outcome of the task, that is, the solution to this
planning problem, is a treatment recipe. A recipe is a prescription of the values c of a set of
treatment conditions that applies to a specific time interval. The time interval is
subdivided in fixed-duration time-slices j. Storage conditions include attributes like temperature,
relative humidity and ethylene concentration; relevance of conditions is determined by the
product type.

Figure 3: Part of the ontology

The first step is to identify available sources of knowledge for this task. Figure 3 illustrates part
of the domain ontology as developed for this application. This should be read as a schema that
can be instantiated with specific knowledge. The following data about a batch of fruit (grouped)
are available:

batch data, such as origin, product cultivar etc.

commercial data, mainly the required due date of the product treatment

product data, being a number of values for attributes such as colour, shape, firmness, weight
etc., describing per individual product in a batch various quality aspects at the start of the
recipe. It is assumed here that a fixed final quality is delivered for all recipes.

Table 2 lists some of the sources that are available in the fruit storage planning domain. These
sources cover the application of machine learning, knowledge elicitation from experts and
extraction from documents. The sources include information that is not part of the original
problem. For example, the source Sample Products refers to Batch (of Fruit) and
Sampling Instructions, and it delivers Quality data. The latter two are not
mentioned in the original problem statement.

In this stage, only the resources are identified but no effort is made to extract the actual
knowledge. At this level the sources are not bound to any of the acquisition means. Available
resources may not be used and actual acquisition of the knowledge is postponed until a complete
plan is available. The actual knowledge is to be obtained by applying an acquisition technique to
the resource: a human expert, a document, a set of data or an existing system. At this point, no
choice for a technique is made either because this will depend on details of the resource that are
not relevant for this stage of the design process.

Table 2: Some sources of knowledge for the example

[Table layout lost in extraction. Surviving entries: column heading "Source ID"; source names
Quality 1, Recipe 1 and Adapt recipe; data items such as batch, due date, cultivar and
I = 1… recipe duration; notes "<induce from all available", "<induce from selected and
processed data>", "<standard r" and "<apply heuristic in adapt".]

Acquisition is guided by economic principles and therefore an estimate must be made of the
costs and the expected accuracy of the knowledge that can be acquired from a
resource. Accuracy ought to be estimated independently of the technique that will be used.

Some sources may have no costs if the knowledge already exists. Moreover, note that for a
(sub)problem and (sub)solution combination there may be more than one way to acquire the
knowledge. For example, it may be possible to directly acquire knowledge that relates
Planning Destination to a Detailed Recipe.

The costs of using these resources and the expected accuracies were estimated using rules of
thumb. For example, Specify Recipe involves finding a detailed recipe specification from product
data, batch data and the required due date of the batch. The cost is estimated from the availability
of resources and the complexity of the task. The size of the space defined by the properties in
Specify Recipe gives an indication of the number of data that must be acquired to obtain a certain
accuracy in case of an inductive approach. This in turn gives an estimate of the costs. The
relation is likely to be complex and this suggests that many cases are needed. Costs and accuracy
of an elicitation approach are estimated from the time that it takes to acquire the expertise for a
task. If this is unknown then a rough estimate is made based on the complexity, as for the
inductive approach. The accuracy is estimated from a pilot experiment.

Acquisition Planning

Figure 4 shows the final decomposition and Table 3 gives an overview of the sources and
techniques that were actually used to acquire knowledge for the various components. Below we
reconstruct the process that led to this decomposition and choice of acquisition methods.

Figure 4: Decomposition of the knowledge acquisition problem

The estimated costs of acquiring the complete system by elicitation are very high (because there
is no expert) and the same is true for induction. Without further analysis, there are about 15
input variables and between 12 and 52 output variables (depending on the duration of the
storage). The relation is therefore likely to be very complex and it would take many data to find
an initial model, if it is possible at all. We estimate costs and accuracies of single step acquisition
(elicitation or induction). The estimates are in Table 4. The last column gives the expected gain
using a value of 3 for the weights w1 and w2, as a reasonable value for the ROI (Return On Investment).

We now consider decomposition. The available sources and the causal and temporal structures
define a number of possible decompositions of the initial knowledge acquisition problem. There
are many possibilities and here we describe some of them with estimates of the expected
accuracy and acquisition costs.

Decomposition 1

From the initial problem of Figure 3, the first decomposition step is to abstract from the quality
data on individual products to the quality of the batch. This requires a number of measurements.
Taking the average of a number of measurements requires a large product sample (see [...]
1996). In Figure 5, the resulting decomposition is depicted. The expected gain of this
decomposition is: 3 * 0.35 - 1.6 = -0.55

Figure 5: Decomposition 1

Decomposition 2

The next step in the decomposition aims at overcoming the weakest point in decomposition 1.
Postulate Recipe 3 has a poor cost/accuracy ratio. It can be replaced by a two step approach,
where recipe specification is followed by a recipe design. The resulting decomposition is shown
in Figure 6. The expected gain of this option is: 3 * 0.8 - 2 = 0.4. This is already a non-negative
outcome, but still worse than the original problem formulation (over the ROI).

Figure 6: Decomposition 2

Final Decomposition

The final decomposition is again motivated by first identifying the weakest point in the best
decomposition so far, and then identifying a task combination with a better pay-off. Here, it
appears that estimating the product quality can be optimised by first drawing a small sample
from the total data set, and using these data to estimate the quality. Due to the sample
reduction the benefit increases. The resulting decomposition was shown in Figure 2. The
expected gain of this is: 3 * 0.8 - 1.6 = 0.8
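
The comparison of the three decompositions can be sketched in a few lines of Python. The sketch
assumes that the expected gain is the weighted accuracy minus the acquisition cost, which is how
the figures quoted above combine; the function and variable names are illustrative and not part
of the original method description.

```python
# Weight on accuracy used in the text (w = 3, taken as a reasonable ROI).
W = 3.0


def expected_gain(accuracy, cost, weight=W):
    """Expected gain of an acquisition plan: weighted accuracy minus cost.

    Assumption: gain = weight * accuracy - cost, inferred from the
    arithmetic quoted for the decompositions.
    """
    return weight * accuracy - cost


# (estimated accuracy, estimated cost) as quoted for each decomposition.
decompositions = {
    "Decomposition 1": (0.35, 1.6),
    "Decomposition 2": (0.8, 2.0),
    "Final decomposition": (0.8, 1.6),
}

gains = {
    name: expected_gain(accuracy, cost)
    for name, (accuracy, cost) in decompositions.items()
}
best = max(gains, key=gains.get)

for name, gain in gains.items():
    print(f"{name}: {gain:+.2f}")
print("highest expected gain:", best)
```

Under these assumptions the final decomposition has the highest expected gain, because it keeps
the accuracy of decomposition 2 while lowering the cost.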

Acquiring Components

We can now concentrate on the actual acquisition of the knowledge for the components. The
first, Select Products, implements a sampling procedure. For each product, a number of
assess data items are available. Based on these items, the product is classified as being

. This is a classification task. Historic data on the relation between product descriptors and
batch mean are available. On the other hand, for humans, looking at this relation is fairly
uncommon. Consequently, elicitation of knowledge from human experts is not an option (low
accuracy vs. high costs). Data analysis may show that the underlying type of function is
relatively simple, although not fully orthogonal to the data axes. Interpretability may be a
(non-functional) requirement, as the resulting knowledge has to be applied by human experts in
order to select fruits. In the actual planner, this component has been implemented by means of
a decision rule learner. The rules are extracted and handed over to a human expert, who
performs the actual selection on location.
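
To illustrate what an interpretable, rule-based classifier of this kind might look like, the
following sketch learns a single threshold rule and prints it in a form a human expert could
apply. It is a simplified stand-in, not the rule learner used in the actual planner; the
attribute names (`firmness`, `colour`), the example values and the class labels are all
hypothetical.

```python
def learn_threshold_rule(examples, attributes):
    """Return the single-attribute threshold rule with the best training accuracy.

    examples: list of (measurements, label) pairs, where measurements maps an
    attribute name to a number and label is True for products to be selected.
    """
    best = None
    for attr in attributes:
        values = sorted({m[attr] for m, _ in examples})
        # Candidate thresholds lie halfway between consecutive observed values.
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            for positive_above in (True, False):
                hits = sum(
                    ((m[attr] >= threshold) == positive_above) == label
                    for m, label in examples
                )
                accuracy = hits / len(examples)
                if best is None or accuracy > best["accuracy"]:
                    best = {
                        "attribute": attr,
                        "threshold": threshold,
                        "positive_above": positive_above,
                        "accuracy": accuracy,
                    }
    return best


def rule_as_text(rule, positive="select", negative="skip"):
    """Render a learned rule as a sentence a human expert can apply."""
    op = ">=" if rule["positive_above"] else "<"
    return (
        f"IF {rule['attribute']} {op} {rule['threshold']:g} "
        f"THEN {positive} ELSE {negative}"
    )


# Hypothetical historic data: cheap product descriptors and a known class.
examples = [
    ({"firmness": 7.1, "colour": 3.0}, True),
    ({"firmness": 6.8, "colour": 5.0}, True),
    ({"firmness": 6.5, "colour": 2.0}, True),
    ({"firmness": 5.0, "colour": 4.0}, False),
    ({"firmness": 4.6, "colour": 3.5}, False),
    ({"firmness": 4.2, "colour": 2.5}, False),
]

rule = learn_threshold_rule(examples, ["firmness", "colour"])
print(rule_as_text(rule))
```

A rule of this form is easy to hand over for selection on location, which is the reason
interpretability matters for this component.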

The next component is the actual assessment of the batch quality. This is simply averaging of
the measurements. The main difference between the two available Estimate Quality sources can be
found in the number of (expensive) measurements that is required for unselected and for
selected estimation. The former requires between 60 and 200 expensive measurements to be
taken. In the latter case, only 5-10 are needed. This does not dramatically affect the
accuracy, but dramatically reduces the costs.
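
The size of this cost reduction can be made concrete with a small calculation. The sketch below
assumes that the measurement cost is proportional to the number of measurements, which is a
simplification introduced here; the function names are illustrative.

```python
from statistics import mean


def estimate_batch_quality(measurements):
    """Batch quality as the plain average of the product measurements."""
    return mean(measurements)


def saving(selected_n, unselected_n):
    """Fraction of measurements saved by taking selected_n instead of unselected_n."""
    return 1 - selected_n / unselected_n


# Numbers of expensive measurements quoted in the text.
UNSELECTED_RANGE = (60, 200)
SELECTED_RANGE = (5, 10)

# Least favourable case: 10 measurements instead of 60.
smallest_saving = saving(SELECTED_RANGE[1], UNSELECTED_RANGE[0])
# Most favourable case: 5 measurements instead of 200.
largest_saving = saving(SELECTED_RANGE[0], UNSELECTED_RANGE[1])

print(f"measurements saved: {smallest_saving:.0%} to {largest_saving:.1%}")
```

Even in the least favourable case, more than four out of five expensive measurements are
avoided.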

For acquisition of recipe specification, the two options of elicitation and induction must be
evaluated. Human experts are not used to specifying recipes at batch level, i.e. expertise is
not available. Historic data are available for induction of the required knowledge. On the
input side, 21 attributes are used. The size of the output space is limited (in the actual fruit
planning system, only 1 parameter was output; a maximum of 4 output values can be imagined).
Based on a comparison between linear and non-linear models, a preference was developed for
non-linear models. These have been implemented in the form of a neural network.



We presented a rational reconstruction of decisions to use machine learning in a knowledge
acquisition context. Applications of machine learning to knowledge acquisition involve more
than selecting and applying an appropriate induction tool. In general, knowledge or data are not
or only partially available, and decisions must be taken on how to acquire them. Knowledge
acquisition problems are often better solved using a divide-and-conquer approach that reduces
the overall problem to sub-problems that can be solved by machine learning or direct
elicitation. This process of decomposition is guided by estimates of the costs of the
acquisition process and of the expected accuracy of the result.

In this section, we discuss the relation between the approach advocated here and some of the
approaches that are of use for knowledge acquisition or induction. Finally, we discuss options
for further work.

Comparison with other methods

Knowledge acquisition methods

Many existing knowledge acquisition methods rely heavily on the idea of decomposition (e.g.
Terpstra, 1993; Schreiber, 1993a; Marcus, 1988). However, these methods focus on languages and
rarely make explicit the underlying principles that are needed for a rational application of
the methods. These methods also do not cover the use of inductive techniques. Here we
reconstruct the rationale behind these methods and use this to extend them towards the use of
machine learning methods. We presented criteria and a method for decomposing knowledge
acquisition problems into simpler sub-problems and illustrated this with a reconstruction of a
real-world application. This method can be applied to inductive methods as well as to knowledge
elicitation and other manual acquisition methods.

In modern approaches to knowledge acquisition, especially in CommonKADS, the starting point
for divide-and-conquer approaches is identified from libraries of standard models. For
example, suppose that the acquisition problem is to construct a system that can design storage
recipes for fruits. The knowledge engineer may decide to adopt a model from a library of
methods (Breuker and Van de Velde, 1994). First, the problem is specified as before:

Input: Fruits Characteristics, Current Quality, Required Quality, Recipe Duration

Output: Storage Recipe, i.e. condition set-points for a series of time

The KADS library offers the following models:

Needs and desires
Components, required structure, constraints,
Initial state, goal state, world description, plan description, plan

It is not obvious which of these is appropriate here. Recipe Duration can be viewed as needs
and desires, constraints, requirements and plan description. Fruits Characteristics do not have
an immediate counterpart in the terminology above. The Storage Recipe corresponds most closely
to an assignment, although it can also be viewed as a plan, a design solution or a
configuration. Although assignment and scheduling sounds like a good choice, the models for
this type of task concern allocation of resources to tasks in a schedule. This does not
correspond to our task.

Planning is a better term. The inputs of the most general model for planning (Valente, 1995)
are: initial state, goal state, world description, plan description and plan model. A plan is
an ordered set of actions that starts in the initial state and ends with a state that satisfies
the requirements of a goal state. The world knowledge describes general information about the
world in which the actions will take place.

In our example, Fruits Characteristics and Product Quality can be viewed as "initial state".
However, the storage recipe does not involve discrete states and therefore a planning process
is problematic. Even if the process is somehow discretised, there are very many possibilities
and the goal provides little guidance for the evaluation of intermediate states. Another
problem is that, if we compare this to the available resources in Table 2, we see that the
resulting model is not
. The
Fruits Characteristics and Current Quality are not the description of the initial state
parameter of the planning operators. The approach outlined in Breuker and Van de Velde does not
tell us what to do now. An obvious step is to apply the whole approach recursively to the task
of finding the input of the planning operators from Current Quality. We shall not pursue this
here. It should be noted, however, that the planning model cannot actually be applied because
of the continuous character of the operators and the process, which is not mentioned as a
prerequisite in the description of the model. Moreover, the analysis process is about the same
as that of our approach. This is because the flow structure of the available knowledge is of
much more importance at this stage than the structure of the data and the knowledge. Our
approach postpones the choice between discrete models and continuous models until later and
only then selects a modelling technique.

Inductive Methods

Compared with inductive engineering methods, our methodology has a broader scope than most.
MeDIA includes the identification of resources, and takes into account economic factors and
the structuring of the acquisition problem. Machine learning technology plays a specific role
in the overall method. A straightforward inductive approach to this problem would probably
have been more expensive and less successful. The reason lies in the complexity of the relation
between the "raw" data about a batch of fruits and its destination and the recipe, and in the
costs of collecting data.

Further Work

The main "hole" in the methodology is selection of a model for the hypothesis and related data
rmation. We intend to review the literature on this question and summarise the state of the
art. After this, we intend to do more empirical evaluations of the methodology.



The MeDIA approach is based on separation of planning and implementation of the knowledge
acquisition process and on a "divide and conquer" approach to the planning problem. This
approach is possible if enough information about sources of knowledge is available. This
information can often be obtained by heuristics and cheap measurements on the data. In
knowledge acquisition, these are part of the "experience" of knowledge engineers. In machine
learning and in statistical data analysis, rules of thumb and experience are used to estimate
the expected accuracy of the result of applying an induction system. For example, for many
statistical techniques, rules of thumb relate the number of variables, the complexity of the
function to be induced and the number of data to an estimate of accuracy. The main alternative,
if there is no prior knowledge, is currently a "reactive" approach. The expected accuracy of
applying an operator can be determined empirically by trying it out. For inductive techniques,
this is done by cross-validation, resulting in an estimate of the accuracy. In knowledge
elicitation, this is done by simply asking an expert to provide the knowledge. If this fails, it
is concluded that decomposition is necessary. See Brodley (1995) for a method following this
approach. Graner and Sleeman (1993) follow a similar approach in the context of knowledge
acquisition. Their
model does not include search through possible decompositions or the use of estimated costs and
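
The "reactive" estimate for inductive techniques mentioned above can be sketched as a plain
k-fold cross-validation. The learner below is a one-nearest-neighbour classifier on a single
numeric feature, chosen only to keep the sketch self-contained; the data and all function names
are hypothetical.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def train_nearest_neighbour(examples):
    """Return a classifier that copies the label of the closest training example."""
    def predict(x):
        nearest = min(examples, key=lambda example: abs(example[0] - x))
        return nearest[1]
    return predict


def cross_validated_accuracy(examples, train, k=5, seed=0):
    """Estimate accuracy by training on k-1 folds and testing on the held-out fold."""
    correct = 0
    for fold in k_fold_indices(len(examples), k, seed):
        held_out = set(fold)
        training = [ex for i, ex in enumerate(examples) if i not in held_out]
        model = train(training)
        correct += sum(model(examples[i][0]) == examples[i][1] for i in fold)
    return correct / len(examples)


# Hypothetical, well-separated data: (feature value, class label).
examples = [(float(x), "low") for x in range(10)]
examples += [(float(x), "high") for x in range(20, 30)]

accuracy = cross_validated_accuracy(examples, train_nearest_neighbour, k=5)
print(f"estimated accuracy: {accuracy:.2f}")
```

The resulting estimate plays the role of the accuracy term when an induction step is compared
with elicitation or with a further decomposition.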

The method outlined here can be extended to include the expected gain of having the resulting
system. This would give a more comprehensive model including both the costs of acquisition
and the costs of having and using the acquired knowledge. See van Someren et al. (1997) for a
model of induction methods that includes costs of measurements and costs of errors, in the
context of learning decision trees. These two models can be integrated into a single model; see
for example DesJardins (1995) for a similar model for robot exploration.

The MeDIA method involves decomposition before formalisation and data analysis (when data
analysis detects the need for different types of hypotheses and thus leads to decomposition).
Some heuristics for estimation of expected accuracy are stated in terms of statistical
properties of the data (see the STATLOG results). This suggests that data collection and data
analysis should be integrated more tightly with decomposition. However, we expect that this is
in general not correct. Accuracy can be estimated relatively well without using properties of
the data.


, M. and M. W. van Someren (1990), Heuristic Refinement of Logic Programs, in: L.C. Aiello
(ed.), Proceedings ECAI, London: Pitman, 7

P. Brazdil, J. Gama and B. Henery (1994), Characterising the Applicability of Classification
Algorithms Using Meta-Level Learning, in: F. Bergadano and L. de Raedt (eds.), Proceedings of
, Springer Verlag, Berlin, pp. 84

J. Breuker and W. van de Velde (1994), CommonKADS Library for Expertise Modelling, IOS Press,
Amsterdam

Brodley, C. (1995), Recursive bias selection for classifier construction. Machine Learning,
pp. 63

C.E. Brodley and P. Smyth (1997), Applying Classification Algorithms in Practice, Statistics and
7, pp. 45

Craw, S., and Sleeman, D. (1990), Automating the refinement of knowledge-based systems. In:
Aiello, L. C., ed., Proceedings ECAI, pp. 167-172. London: Pitman.

DesJardins, M. (1995), Goal-directed learning: a decision-theoretic model for deciding what to
learn next. In: D. Leake and A. Ram (eds.), Goal-Driven Learning

R. Engels (1996), Planning Tasks for Knowledge Discovery in Databases; Performing Task-Oriented
User Guidance, in: Proceedings of the 2nd Int. Conf. on KDD

Engels, R., Lindner, G., and Studer, R. (1997), A guided tour through the data mining jungle.
Proceedings of the 3rd International Conference on Knowledge Discovery in Databases

U.M. Fayyad, G. Piatetsky-Shapiro and P. Smyth (1996), From Data Mining to Knowledge
Discovery: An Overview, in: U.M. Fayyad et al. (eds.), Advances in Knowledge Discovery and
Data Mining, pp. 1

Ginsberg, A. (1988). Refinement of Expert System Knowledge Bases: A Metalinguistic Framework
for Heuristic Analysis.

Graner, N. (1993). The Muskrat system. In: Proceedings second workshop on multistrategy
, George Mason University.

Kodratoff, Y., et al. Will Machine Learning solve my problem, Applied Artificial Intelligence

Kohavi, R., D. Sommerfield, and J. Dougherty (1997), Data Mining using MLC++, a Machine
Learning Library in C++. International Journal on Artificial Intelligence Tools, vol. 6.

P. Langley and H.A. Simon (1994), Applications of Machine Learning and Rule Induction, in:
Communications of the ACM.

Langley, P. (1997). Elements of Machine Learning. Morgan Kaufmann.

Marcus, S., ed. (1988), Automating knowledge acquisition for expert systems. Boston: Kluwer.

J. McDermott (1988), Preliminary Steps Toward a Taxonomy of Problem Solving Methods, in:
S. Marcus (ed.), Automating Knowledge Acquisition for Expert Systems; Kluwer Academic,
Dordrecht (NL)

Michie, D., Spiegelhalter, D. J. and Taylor, C. C. (eds.) (1994), Machine Learning, Neural and
Statistical Classification. Ellis Horwood, London.

T.M. Mitchell (1997), Machine Learning, McGraw-Hill, New York

Morik, K., Wrobel, S., Kietz, J. and Emde, W. (1993), Knowledge acquisition and machine
learning, London: Academic Press.

O'Hara, K., and Shadbolt, N. (1996). The thin end of the wedge: Efficiency and the generalised
directive model methodology. In Shadbolt, N.; O'Hara, K.; and Schreiber, G., eds., Advances in
Knowledge Acquisition. Springer Verlag. 33

Polderdijk, J.; Verdenius, F.; Janssen, L.; van Leusen, R.; den Uijl, A.; and de Naeyer, M.
(1996). Quality measurement during the post-harvest distribution chain of tropical products. In
Proceedings of the Congress Global Commercialization of Tropical Fruits, volume 2. 185

J.R. Quinlan (1993), C4.5: Programs for Machine Learning, Morgan Kaufmann, San Mateo

A. Rudstrom (1995), Applications of Machine Learning, Report 95-018, University of

Schreiber, A.T.; Wielinga, B.J.; and Breuker, J.A., eds. (1993). KADS: A Principled Approach to
Knowledge-Based System Development, London: Academic Press.

Shapiro, E.Y. (1982). Algorithmic Program Debugging. ACM Distinguished Dissertations series.
Cambridge, Massachusetts: MIT Press.

Shapiro, A. (1987), Structured induction in expert systems, Addison-Wesley.

M.W. van Someren, C. Torres and F. Verdenius (1997), A Systematic Description of Greedy
Optimization Algorithms for Cost Sensitive Generalisation, in: X. Liu and P. Cohen,
Proceedings of IDA, Springer Verlag, Berlin (Ge), pp. 247

I. Sommerville (1995), Software Engineering, Addison-Wesley, UK

L. Steels (1990), Components of Expertise, AI Magazine 11:2, pp. 29


Terpstra, P.; van Heijst, G.; Wielinga, B.; and Shadbolt, N. (1993). Knowledge acquisition
support through generalised directive models. In David, J.-M.; Krivine, J.-P.; and Simmons, R.,
Second Generation Expert Systems. Berlin Heidelberg, Germany: Springer-Verlag. 428

J.L. Top (1993), Conceptual Modelling of Physical Systems, PhD thesis, Enschede (NL)

Valente, A. (1995), Planning, in: J. Breuker and W. van de Velde (1994), CommonKADS Library
for Expertise Modelling, IOS Press, Amsterdam

F. Verdenius (1996), Managing Product Inherent Variance During Treatment, Computers and
Electronics in Agriculture 15, pp. 245

F. Verdenius (1997), Developing an Embedded Neural Network Application: The making of the
PTSS, in: B. Kappen and S. Gielen, Neural Networks, Best Practice in Europe, World Scientific,
Singapore, pp. 193

F. Verdenius and M.W. van Someren (1997), Applications of Inductive Techniques: a Survey in
the Netherlands, in: AI Communications, 10, pp. 3

F. Verdenius, A.J.M. Timmermans and R.E. Schouten (1997), Process Models for Neural
Network Application in Agriculture, in: AI Applications in Natural Resources, Agriculture and
Environmental Sciences, 11 (3)

F. Verdenius and R. Engels (1997), A Process Model for Developing Inductive Applications,
Proceedings of Benelearn, Tilburg University (NL), pp. 119

S.M. Weiss and C.A. Kulikowski (1991), Computer Systems that Learn, Morgan Kaufmann, Palo Alto

S.M. Weiss and N. Indurkhya (1998), Predictive Data Mining, Morgan Kaufmann, San Francisco

N. Wirth (1971), Program Development by stepwise refinement, Comm ACM, 14 (4), 221

N. Wirth (1976), Systematic Programming, An introduction, Englewood Cliffs, NJ: Prentice Hall