Using Bayesian networks with rule extraction to infer the risk of weed infestation in a corn-crop
Gláucia M. Bressan a, Vilma A. Oliveira a,*, Estevam R. Hruschka Jr. b, Maria C. Nicoletti b
a Universidade de São Paulo, Departamento de Engenharia Elétrica, 13566-590 São Carlos, SP, Brazil
b Universidade Federal de São Carlos, Departamento de Computação, 13565-905 São Carlos, SP, Brazil
Article info
Article history:
Received 30 November 2007
Received in revised form 20 January 2009
Accepted 24 March 2009
Available online 14 May 2009
Keywords:
Bayesian network
Naïve Bayes
Rule extraction
Weed infestation
Kriging
Abstract
This paper describes the modeling of a weed infestation risk inference system that implements a collaborative inference scheme based on rules extracted from two Bayesian network classifiers. The first Bayesian classifier infers a categorical variable value for the weed–crop competitiveness using as input categorical variables for the total density of weeds and the corresponding proportions of narrow- and broad-leaved weeds. The inferred categorical variable values for the weed–crop competitiveness, along with three other categorical variables extracted from estimated maps for the weed seed production and weed coverage, are then used as input for a second Bayesian network classifier to infer categorical variable values for the risk of infestation. Weed biomass and yield loss data samples are used to learn the probability relationship among the nodes of the first and second Bayesian classifiers, respectively, in a supervised fashion. For comparison purposes, two types of Bayesian network structures are considered, namely an expert-based Bayesian classifier and a naïve Bayes classifier. The inference system focuses on knowledge interpretation by translating a Bayesian classifier into a set of classification rules. The results obtained for the risk inference in a corn-crop field are presented and discussed.
© 2009 Elsevier Ltd. All rights reserved.
1. Introduction
Agricultural procedures may modify the ecological balance of a field due to the tilling procedures growers use to prepare the land, quite often leading to a population explosion or infestation of some inconvenient plants commonly known as weeds. Weed control is a fundamental part of all crop production systems. Yield reductions due to weeds are a commonly known obstacle in harvest operations, as weeds lower crop quality by competing with the crop for limited resources such as water, nutrients and light. Oerke et al. (1994) estimated that a 10% loss of worldwide agricultural production might be a consequence of weed activity.
In general, the main components of weed management systems are herbicides. Usually, herbicides are uniformly spread over the entire field aiming at weed control. A uniform application rate is often based on a visual evaluation of the weed density, with no procedure used to evaluate the risks associated with under- and over-spraying (Faechner et al., 2002). However, weed infestation does not occur over the entire field and the amount of herbicides could be reduced by spraying only over the weed patches (Wallinga et al., 1998; Jurado-Expósito et al., 2004). The prediction of weed dispersion can be efficiently used in preventing infestations by applying herbicides only in specific regions (Jurado-Expósito et al., 2003; Faechner et al., 2002). Reducing the quantity of herbicides potentially reduces herbicide residues in water, food crops and the environment, and it may prevent the development of weed resistance (Aitkenhead et al., 2003).
In the literature, a considerable diversity of weed management decision models can be found. There are many different approaches, ranging from empirical functions to mechanistic simulation models. As surveyed by Wilkerson et al. (2002), some of the models are too simple, as they do not include all factors that can influence weed competition or other issues farmers consider when deciding how to manage weeds. Other models can be excessively complex, given that many users might find it difficult to obtain the needed information or might not have the required equipment for acquiring the data. According to Wilkerson et al. (2002), weed management decision models must be built and evaluated from three perspectives: biological accuracy, quality of recommendations and ease of use. In addition, another important issue to be taken into account when building weed management systems is the interpretability of the model. The latter is of particular interest in the experiments conducted in this paper.
There are a few formalisms that can be used to model weed infestation in a crop field. Primot et al. (2006) developed 20 simple models (five are linear regression models and the other 15 are logistic regression models). The models were evaluated for their ability to discriminate the fields with a high level of weed infestation from the fields with a low level of infestation; the parameters of the 20 models were estimated using 3 years of experimental data. The models can be used to help farmers decide what type of weed control (chemical, mechanical or biological) to use.

Engineering Applications of Artificial Intelligence 22 (2009) 579–592
journal homepage: www.elsevier.com/locate/engappai
0952-1976/$ - see front matter © 2009 Elsevier Ltd. All rights reserved.
doi:10.1016/j.engappai.2009.03.006
* Corresponding author. Tel.: +551633739336; fax: +551633739372.
E-mail address: vilmao@sel.eesc.usp.br (V.A. Oliveira).
The risk of a weed-infested crop can be inferred from the mathematical modeling of the weed behavior, based on experimental data. Dynamic models for weed seed populations describe the population size at lifecycle $t$ as a function of the population size at lifecycle $t - 1$ using difference equations (Sakai, 2001; Cousens and Mortimer, 1995). The dynamic models indicate that infestation depends not only on the weed density but also on the competitiveness of the weed species (Park et al., 2003; Firbank and Watkinson, 1985; Kropff and Spitters, 1991). More recently, competitive indexes and weed ranking were used to quantify the weed competitiveness in a soybean field (Hock et al., 2006). Although purely mathematical models can be used for modeling the weed risk of infestation with good performance, as described in several of the previous references, most of them lack flexibility and, more importantly, lack interpretability: they work as 'black boxes' where the user feeds in a few values and the system outputs a diagnosis.
A particular class of models is based on probability. Of special interest in this paper is the class of Bayesian network (BN) models, which are based on the probability that a given set of measurements defines objects as belonging to a certain class. In the literature, Bayesian-based methods have already been used for modeling similar problems (Hughes and Madden, 2003; Smith and Blackshaw, 2002; Banerjee et al., 2005). In particular, Hughes and Madden (2003) proposed a risk assessment methodology to identify which exotic plant species, among those presented for import, are a threat (to agricultural and ecological systems) and which are not. Bayesian theory has also been employed in the agriculture domain as the basis for developing classification systems, as described in Granitto et al. (2002). In their work, the performance of a naïve Bayes classifier (BC) is used as the selection criterion for identifying a nearly optimal set of 12 seed characteristics, such as coloration, morphological and textural features, further used as classification parameters. Considering the seed identification problem, the work described in Granitto et al. (2005) compared naïve Bayes classifier performance to an artificial neural network (NN) based classifier. In this particular experiment the naïve Bayes classifier with an adequately selected set of classification features outperformed the NN-based classifier. A similar result was also obtained in Marchant and Onyango (2003), but with a Bayesian classifier and a multilayer feed-forward neural network in a task for discriminating plants, weeds and soil in color images.
The main goal of this paper is to propose and describe the use of Bayesian network methods to infer the risk of weed infestation in a corn-crop as well as to present and discuss the results obtained in a real application domain based on empirical data. The procedure is implemented as a collaborative system that integrates two classification tasks. The first uses a Bayesian network to infer the competitiveness of weeds, expressed by their biomass, using as input the total density of weeds and the corresponding narrow- and broad-leaved proportions. The second task assesses the risk of infestation, expressed by the yield loss, using as input the previously inferred competitiveness as well as features extracted from the weed seed density, weed coverage and weed seed patches. The last three variables are estimated with a geostatistics method called kriging (Brooker, 1979; Isaaks and Srivastana, 1989) and image objects (Gonzalez and Woods, 2002) from weed seed density and weed coverage data samples.
In addition, the paper also presents the translation of the induced Bayesian networks into a set of classification rules, aiming at a more comprehensible knowledge representation. As mentioned before, this is an important aspect of knowledge-based system construction, since it gives the system credibility, a quality that other types of representation lack. Therefore, the main idea of the conducted experiments is not to show that the translation method is better than traditional classifiers (such as C4.5, for instance) or rule extraction methods. The claim is that it is possible to take advantage of both the causal knowledge representation (which can be adequately represented in a BN or BC) and the high accuracy of a Bayesian classifier to obtain a set of classification rules (extracted from the BC) as a knowledge base.
For both classification tasks implemented by the collaborative system, two different Bayesian network structures are used for comparison purposes. One is induced by the naïve Bayes algorithm (Duda and Hart, 1973) using empirical data and the other, an unrestricted Bayesian network, is designed and refined by an expert using the same empirical data. The networks in this paper are referred to as naïve Bayes and expert-based networks, respectively. Due to their different architectures, the two Bayesian networks have different performances, depending on the available information. A set of probabilistic classification rules is then extracted from each of the Bayesian networks using a Markov-based strategy proposed in Hruschka et al. (2008). To reduce the number of rules where the Markov-based strategy does not remove categorical variables, a pruning strategy is proposed. The pruning strategy is mainly motivated by the fact that no extra computational effort is needed: the pruning can be done by considering only the rules having an estimated probability higher than a predefined threshold. This paper is an extended and revised version of two earlier conference papers, namely Bressan et al. (2007a, b).
The remainder of this paper is organized as follows. Section 2 describes the basics of Bayesian networks and naïve Bayes classifiers and discusses the importance of improving their understandability. Section 3 focuses on two important issues: the approach used to collect and to interpolate empirical data, and the construction of the collaborative system that integrates two Bayesian classifiers. Section 4 presents the results of the collaborative system, focusing on the results of the individual classifiers, that is, the Bayesian network and the naïve Bayes classifiers. Finally, Section 5 presents some concluding remarks and highlights the next steps for this research work.
2. Basics of Bayesian networks, Markov blanket and classification rules
As pointed out in Heckerman et al. (2000), Bayesian networks and Bayesian classifiers are usually employed in data mining tasks mainly because they (i) may deal with incomplete data sets straightforwardly; (ii) can learn causal relationships; (iii) may combine prior knowledge with patterns learnt from data and (iv) can help to avoid overfitting.
A Bayesian network can be viewed as a form of probabilistic graphical model used for knowledge representation and reasoning about data domains. Instead of encoding a joint probability distribution over a set of random variables, as usually done by a Bayesian network, a Bayesian classifier usually aims to correctly predict the value of a discrete class variable given the value of a vector of features (predictors). Since Bayesian classifiers are a particular type of Bayesian network, the concepts and results described in this section are valid for both.
A Bayesian network consists of two components: a network structure, which is a directed acyclic graph, and a set of probability tables. The nodes of the Bayesian network represent variables and the arcs between nodes represent dependence relations between the corresponding variables. An arc starting at a node X (representing variable X) and ending at a node Y (representing variable Y) establishes X as a parent of Y and Y as a child of X. A Bayesian network can be used to compute the conditional probability of one node, given values assigned to the other nodes. Hence, a Bayesian network can be used as a classifier that gives the posterior probability distribution of the class node given the values of other attributes. When learning Bayesian networks from datasets, nodes are used to represent dataset features.
Consider a finite set $\{X_i,\ i = 1, \ldots, n\}$ of discrete random variables where each variable may take on values (represented by lower case letters) from a finite set. As formally stated in Cheng et al. (2002), a Bayesian network is represented by $BN = \langle N, A, \Theta \rangle$, where the component $\langle N, A \rangle$ is a directed acyclic graph with nodes $X_i \in N$, $i = 1, \ldots, n$, representing domain variables and arcs $a \in A$ between nodes representing a probabilistic dependency between the associated nodes. Finally, denoting $\pi_{X_i}$ as the set of parents of $X_i$ in $\langle N, A \rangle$, the last component of BN is given by $\Theta = \{\theta_{X_i \mid \pi_{X_i}} = P(X_i \mid \pi_{X_i})\}$ for each possible value $x_i$ of $X_i$ and $\pi_{x_i}$ of $\pi_{X_i}$, which collectively represents a conditional probability distribution (CP-table) that quantifies how a node $X_i \in N$ depends on its parents. The conditional independence assumption (Markov condition) allows the calculation of the joint probability distribution function over the variables $\{X_i,\ i = 1, \ldots, n\}$ based on the background knowledge as

$$P(X_1, X_2, \ldots, X_n) = \prod_{i=1}^{n} P(X_i \mid \pi_{X_i}) = \prod_{i=1}^{n} \theta_{X_i \mid \pi_{X_i}}, \qquad (1)$$

where $n = |N|$. Therefore, a Bayesian network can be used as a knowledge representation that allows inferences.
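The factorization in (1) can be illustrated with a minimal Python sketch. The three-node network, its node names and all probability values below are hypothetical and chosen only for illustration; they are not the networks or tables used in this paper.

```python
from math import prod

# Hypothetical three-node network: Density -> Competitiveness <- BroadLeaved.
# Node names and probabilities are illustrative assumptions only.
parents = {
    "Density": (),
    "BroadLeaved": (),
    "Competitiveness": ("Density", "BroadLeaved"),
}

# Each CP-table maps a tuple of parent values to {node value: probability}.
cp_tables = {
    "Density": {(): {"low": 0.6, "high": 0.4}},
    "BroadLeaved": {(): {"low": 0.7, "high": 0.3}},
    "Competitiveness": {
        ("low", "low"): {"low": 0.9, "high": 0.1},
        ("low", "high"): {"low": 0.6, "high": 0.4},
        ("high", "low"): {"low": 0.5, "high": 0.5},
        ("high", "high"): {"low": 0.2, "high": 0.8},
    },
}


def joint_probability(assignment):
    """Eq. (1): product over all nodes of P(X_i = x_i | parents of X_i)."""
    return prod(
        cp_tables[node][tuple(assignment[p] for p in parents[node])][assignment[node]]
        for node in parents
    )


p = joint_probability(
    {"Density": "high", "BroadLeaved": "high", "Competitiveness": "high"}
)
```

Each factor is a single CP-table lookup, so the joint probability of a complete assignment costs one lookup per node.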
Bayesian networks can be built by an expert or learnt from data. The learning of a Bayesian network can be divided into two procedures: one responsible for learning the network structure and the other responsible for learning the conditional probability tables for that structure. The learning of these tables can be carried out using empirical conditional frequencies from data (Cheng et al., 2002). When building a Bayesian network based on subject specialist knowledge, the major problem is the definition of the conditional probability distributions. This is due to the human tendency to miscalculate probabilities (Tversky and Kahneman, 1974). To avoid this difficulty it is possible to use expert knowledge to build only the Bayesian network structure and then use learning algorithms to induce $\Theta$ from data.
2.1. Markov blanket
In a Bayesian network structure, with $\lambda_{X_i}$ as the set of children of node $X_i$ and $\pi_{X_i}$ as the set of parents of node $X_i$, the subset of nodes containing $\pi_{X_i}$, $\lambda_{X_i}$ and the parents of $\lambda_{X_i}$ is called the Markov blanket of $X_i$, as shown in Fig. 1. As stated in Pearl (1988), in a Bayesian network the only nodes that have influence on the conditional probability distribution of a given node $X_i$ are the nodes that belong to the Markov blanket of $X_i$. Thus, after learning a Bayesian network classifier from data, the Markov blanket of the node that represents the class can be used as a feature subset selection method, in order to identify, from all the nodes that define the network, those that influence the class node.
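As an illustration of this definition, the sketch below collects the Markov blanket of a node from a structure stored as a mapping from each node to its set of parents. The function name and the small example graph are hypothetical.

```python
def markov_blanket(parents, node):
    """Return the parents, the children and the children's other parents of `node`.

    `parents` maps every node of the directed acyclic graph to the set of its parents.
    """
    children = {n for n, ps in parents.items() if node in ps}
    co_parents = {p for child in children for p in parents[child]}
    return (set(parents[node]) | children | co_parents) - {node}


# Hypothetical structure: A -> X, X -> C, B -> C, C -> D, and E isolated.
dag = {
    "A": set(),
    "B": set(),
    "E": set(),
    "X": {"A"},
    "C": {"X", "B"},
    "D": {"C"},
}
blanket_of_x = markov_blanket(dag, "X")
```

For node X the blanket contains its parent A, its child C and the other parent of C, namely B; node D, a grandchild, and the isolated node E are left out, which is what makes the blanket usable for feature subset selection.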
As previously mentioned, Bayesian networks can also be used as classifiers. A Bayesian network, however, is not designed to optimize the conditional likelihood of the class given the other features (Domingos and Pazzani, 1997). Consequently, Bayesian networks may not produce good classification results. Actually, even the naïve Bayes classifier can outperform more complex Bayesian network classifiers in some domains (Friedman et al., 1997).
A naïve Bayes classifier is a Bayesian network with a fixed structure, in which the class node has no parents and each feature has the class node as its unique parent. Since naïve Bayes classifiers have their structure predefined, only the numerical parameters need to be learnt; thus only information about the features and their corresponding values is needed to estimate probabilities. The computational time complexity of learning a naïve Bayes classifier is linear with respect to the number of training instances. The construction is also space efficient, requiring only the information provided by two-dimensional tables (CP-tables), in which each entry corresponds to a probability estimated for a given value of a particular feature. However, the naïve Bayes classifier makes a strong and unrealistic assumption: all the features are conditionally independent given the value of the class.
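The linear-time learning of the CP-tables can be sketched as a single counting pass over the training instances. The code below is a minimal illustration using plain relative frequencies with no smoothing, so values never seen with a class get no entry; the function name, feature names and training rows are hypothetical.

```python
from collections import Counter, defaultdict


def learn_naive_bayes(rows, class_name):
    """Estimate P(C) and P(X_i = v | C) by relative frequencies in one pass."""
    class_counts = Counter()
    value_counts = defaultdict(Counter)  # (feature, class) -> Counter of values
    for row in rows:
        c = row[class_name]
        class_counts[c] += 1
        for feature, value in row.items():
            if feature != class_name:
                value_counts[(feature, c)][value] += 1
    total = sum(class_counts.values())
    prior = {c: n / total for c, n in class_counts.items()}
    cp_tables = {
        key: {v: n / class_counts[key[1]] for v, n in counts.items()}
        for key, counts in value_counts.items()
    }
    return prior, cp_tables


# Hypothetical training instances (not the data collected for this paper).
rows = [
    {"density": "high", "broad": "high", "comp": "high"},
    {"density": "high", "broad": "low", "comp": "high"},
    {"density": "low", "broad": "low", "comp": "low"},
    {"density": "low", "broad": "high", "comp": "low"},
    {"density": "high", "broad": "low", "comp": "low"},
]
prior, cp_tables = learn_naive_bayes(rows, "comp")
```

Each instance is visited once and each feature value updates one counter, which is where the linear time and the two-dimensional table per feature come from.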
2.2. Classification rules
The knowledge represented by a Bayesian classifier is not as comprehensible as some other forms of knowledge representation, such as classification rules. In the literature there are a few works that aim at improving the readability/understandability of Bayesian classifiers; for instance, Možina et al. (2004) implement a visualization process of a naïve Bayes model in the form of a nomogram. In Hruschka et al. (2008), after inducing the Bayesian classifier, the BayesRule method improves the understandability by implementing its translation into a set of probabilistically qualified if–then rules of the form

If condition then class with certainty F, (2)

where the condition is called the antecedent and F is a percentage value.
In the BayesRule method, the a posteriori probability for the rules is evaluated as follows. Let $v_1, v_2, \ldots, v_n, c$ be the sets of categorical variable values for $X_1, X_2, \ldots, X_n$ and $C$, respectively. Also, let $v_i = \{v_{i1}, \ldots, v_{ij_i}\}$, that is, $|v_i| = j_i$, $i = 1, \ldots, n$, and $c = \{c_1, \ldots, c_j\}$, that is, $|c| = j$.
By using the BayesRule method, the number of variables involved in the condition part of a rule is reduced since the method only considers the Markov blanket of the class variable $C$. Considering a particular situation where the Markov blanket of the class variable $C$ is the set $\{X_1, \ldots, X_k\}$, the a posteriori probability of class $C = c_\ell \in \{c_1, \ldots, c_j\}$ given the values of the variables in the Markov blanket of class $C$ for a particular instantiation of indexes $J_i$, $i = 1, \ldots, k$, is

$$P(C = c_\ell \mid v_{1,J_1}, \ldots, v_{k,J_k}) = \arg\max_{J \in \{1, \ldots, j\}} \{P(C = c_J \mid v_{1,J_1}, \ldots, v_{k,J_k})\}, \qquad (3)$$

with

$$P(C = c_J \mid v_{1,J_1}, \ldots, v_{k,J_k}) = \frac{P(C = c_J)\, P(v_{1,J_1}, \ldots, v_{k,J_k} \mid C = c_J)}{P(v_{1,J_1}, \ldots, v_{k,J_k})}. \qquad (4)$$

Fig. 1. A network structure and the Markov blanket of node $X_i$ represented by shadowed nodes.
For the naïve Bayes network model the features are assumed to be independent given the class $C = c_J$ and (4) becomes

$$P(C = c_J \mid v_{1,J_1}, \ldots, v_{k,J_k}) \propto P(C = c_J) \prod_{i=1}^{k} P(v_{i,J_i} \mid C = c_J). \qquad (5)$$
A categorical probabilistic if–then rule has the form

$R_r$: If $X_1$ is $v_{1,J_1}$ and $\ldots$ and $X_k$ is $v_{k,J_k}$ then $C$ is $c_J$ with certainty $F$ given by (3), (6)

where index $r$ is used for referencing the rules given by the BayesRule method.
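The rule form (6) can be illustrated for the naïve Bayes case with the sketch below. It enumerates every combination of values of the variables in the Markov blanket, scores each class with (5), normalizes the scores, and emits one rule for the most probable class, keeping only rules whose certainty exceeds a threshold. This is a simplified illustration under stated assumptions, not the BayesRule implementation of Hruschka et al. (2008); the function name, the tables and the threshold value are hypothetical.

```python
from itertools import product
from math import prod


def extract_rules(prior, likelihood, domains, threshold=0.0):
    """Enumerate rules 'If X_1 is v_1 and ... then C is c with certainty F'.

    prior:      {class: P(C = class)}
    likelihood: {(feature, class): {value: P(value | class)}}
    domains:    {feature: list of values} for the Markov blanket of the class
    Only rules with certainty F higher than `threshold` are kept.
    """
    features = list(domains)
    rules = []
    for values in product(*(domains[f] for f in features)):
        # Unnormalized naive Bayes scores, Eq. (5).
        score = {
            c: prior[c] * prod(likelihood[(f, c)][v] for f, v in zip(features, values))
            for c in prior
        }
        total = sum(score.values())
        if total == 0.0:
            continue  # combination with zero probability under every class
        best = max(score, key=score.get)
        certainty = score[best] / total
        if certainty > threshold:
            rules.append((dict(zip(features, values)), best, certainty))
    return rules


# Hypothetical tables for two features and a binary risk class.
prior = {"low": 0.5, "high": 0.5}
likelihood = {
    ("coverage", "low"): {"low": 0.8, "high": 0.2},
    ("coverage", "high"): {"low": 0.3, "high": 0.7},
    ("seeds", "low"): {"low": 0.6, "high": 0.4},
    ("seeds", "high"): {"low": 0.1, "high": 0.9},
}
domains = {"coverage": ["low", "high"], "seeds": ["low", "high"]}

for antecedent, cls, certainty in extract_rules(prior, likelihood, domains, threshold=0.6):
    condition = " and ".join(f"{x} is {v}" for x, v in antecedent.items())
    print(f"If {condition} then risk is {cls} with certainty {certainty:.0%}")
```

The threshold test reuses the certainty already computed for each rule, so the pruning itself adds no further probability computations.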
The confidence of a rule can be defined using inferential results. In doing so, the probability given to the inferred class may be used as a confidence value, and it is embedded in the inference algorithm. Among the many methods for data understanding, the BayesRule method focuses on translating a Bayes classifier into a set of classification rules in their simplest, propositional form, as a way of promoting the understandability of the corresponding Bayesian network classifier. Reasoning with logical rules is more acceptable to users than the recommendations given by black-box systems. Moreover, reasoning with rules is comprehensible, provides explanations, and can be validated by human inspection.
3. Bayesian network inference modeling
To present the collaborative inference system for the risk of weed infestation in a corn-crop, this section is organized into two parts. The first describes the procedure for collecting and preparing the data and the second describes how the data were used to model two Bayesian network classifiers (the naïve Bayes and the expert-based) for inferring the risk of a weed infestation in a corn-crop. Fig. 2 presents a schematic diagram of the proposed collaborative inference system.
3.1. Collecting and preparing the data
In the experiments described in this paper, data from a corn-crop field located in an experimental farm of the Empresa Brasileira de Pesquisa Agropecuária (Embrapa), in Sete Lagoas, Minas Gerais, Brazil, were used.¹ A field with an area of 49 ha was tilled on 16–20 November 2004 and again on 15–19 May 2006. The area contains 41 experimental field parcels 100 m distant from each other. The parcels are rectangular, measuring 4 m (east–west direction) by 3 m (north–south direction), with five corn rows separated from each other by 0.7 m, starting at 0.1 m from the bottom edge. Before the crop development, the herbicide glyphosate, 2.4 kg active ingredient (a.i.) ha$^{-1}$, was applied outside the parcels. Also, after the crop development, the herbicides nicosulfuron, 0.04 kg (a.i.) ha$^{-1}$, and atrazine, 1 kg (a.i.) ha$^{-1}$, were applied all over the field, except on the parcels. The samples per parcel were obtained in April 2005 and October 2006 for two different corn-crops, except for the yield loss, which was evaluated in June 2005 and November 2006.
To obtain the weed density data, that is, the number of weeds per m$^2$ in each parcel, and the biomass of the species, four squares measuring 0.5 m × 0.5 m were randomly placed within each parcel and the narrow-leaved and broad-leaved weed species were collected and counted. Then, the weed species were separated into bags and kept in a greenhouse at a temperature of 105 °C until their mass became constant. At this point, the biomass of the species, defined as the amount of dry material per m$^2$ of the aerial part of weeds, was measured. The weed density and the biomass samples were collected in each experimental parcel. Therefore, 82 data instances were obtained, that is, two data instances for each of the 41 parcels. Analyzing the collected data, 11 data instances were identified as outliers and removed.
To obtain weed seed production per m$^2$, the weed seeds of one weed from each species were counted and multiplied by the number of weeds found in the squares. The weed coverage data were estimated by visual observation of the percentage of surface infested by weeds. This coverage is mainly due to the germination of weed seeds from the previous weed population. The weed seed density, associated with the seed production, and the weed coverage samples were collected, as described above, from each experimental field parcel, also in two different corn-crops, resulting in 82 data instances.

Fig. 2. Input–output of the proposed collaborative classification system. [Diagram: the first Bayesian network classifier takes the total, broad-leaved and narrow-leaved weed densities and infers the weed–crop competitiveness; the second takes this output together with features obtained from the weed seed density and weed coverage through geostatistics and image analysis (inputs $x_1$ to $x_4$) and infers the risk of infestation.]

¹ Embrapa, Project 55.2004.509.00: Rede de Conhecimento em Agricultura de Precisão para Condições do Cerrado e dos Campos Gerais.
To evaluate the yield loss per experimental parcel, first the yield was measured by the mass of the corn grains. The mass of the corn grains was adjusted for a humidity of 13% and converted from g m$^{-2}$ to kg ha$^{-1}$. Then, the yield loss was evaluated as

$$P_{RR_i} = \frac{Y_0 - Y_i}{Y_0}, \quad i = 1, \ldots, 41, \qquad (7)$$

where $Y_0$ denotes the maximum yield found in the witness (control) parcels (with herbicide application) and $Y_i$ the yield of each experimental parcel (without herbicide application).
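As a small worked example of (7), the sketch below computes the relative yield loss of a few parcels. The yields are hypothetical round numbers in kg ha$^{-1}$, not the values measured in the experiment.

```python
def yield_loss(control_yield, parcel_yields):
    """Eq. (7): relative yield loss of each parcel with respect to Y_0."""
    return [(control_yield - y) / control_yield for y in parcel_yields]


# Hypothetical yields in kg/ha (illustrative only).
losses = yield_loss(8000.0, [8000.0, 6000.0, 4000.0])
```

A parcel that matches the control yield has a loss of 0, and a parcel yielding half of it has a loss of 0.5.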
3.2. Inferring the risk of infestation
The proposed collaborative inference system is based on inputs given by the discretized values of infestation features for the weed coverage, weed seed production density, weed seed patches and the weed–crop competitiveness, denoted as $x_1, x_2, x_3, x_4$, respectively. As already mentioned, the feature values for the weed–crop competitiveness are inferred by the first network classifier of the collaborative system. The following describes how the weed seed production and weed coverage maps were estimated with kriging and subsequently treated as images so as to obtain the other features.
3.2.1. Kriging and maps
Interpolation methods have been used in precision farming to infer values at non-sampled locations. As already mentioned, the estimation method used was the geostatistics method called kriging, an interpolation approach that provides optimal estimates of regionalized variables with minimum variance and without bias, using a theoretical variogram (Isaaks and Srivastana, 1989; Shiratsuchi, 2001). A variogram, also referred to as a semivariogram, shows the degree of spatial dependence among the samples and generally is a monotonically increasing function that reaches a plateau. The distance at which the variogram reaches the plateau is called the range. The frequently used models for the theoretical variograms are described in detail in Isaaks and Srivastana (1989).
The parameters of the theoretical variogram used in an interpolation problem are selected from an experimental variogram. The experimental variogram $\gamma^*(h)$ is given by the following equation:

$$\gamma^*(h) = \frac{1}{2N_h} \sum_{(i,j)\,:\,|h_{ij}| = h} [Z(j) - Z(i)]^2, \qquad (8)$$

where $N_h$ is the number of pairs of data whose locations are separated by $h$, $i$ and $j$ represent the locations $i$ and $j$, respectively, $Z(i)$ is the value of the variable $Z$ at location $i$, $|h_{ij}|$ is the Euclidean norm of the vector $h_{ij}$ and $h_{ij}$ is the vector from location $i$ to location $j$.
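Equation (8) can be sketched in Python as follows. Because sampled locations are rarely separated by exactly the same distance, the sketch groups pairs whose separation lies within a tolerance of each lag; this tolerance, the function name and the sample data are assumptions made for illustration and are not taken from the paper.

```python
from itertools import combinations
from math import dist


def experimental_variogram(coords, values, lags, tol):
    """Eq. (8) with a tolerance: pairs with | |h_ij| - h | <= tol are assigned to lag h.

    Returns {h: (variogram value or None, number of pairs N_h)}.
    """
    result = {}
    for h in lags:
        sq_diffs = [
            (values[j] - values[i]) ** 2
            for i, j in combinations(range(len(values)), 2)
            if abs(dist(coords[i], coords[j]) - h) <= tol
        ]
        n = len(sq_diffs)
        result[h] = (sum(sq_diffs) / (2 * n) if n else None, n)
    return result


# Hypothetical samples on a line, 100 m apart (illustrative only).
coords = [(0, 0), (100, 0), (200, 0), (300, 0)]
values = [1.0, 3.0, 2.0, 6.0]
variogram = experimental_variogram(coords, values, lags=[100, 200], tol=10)
```

At the 100 m lag there are three pairs with squared differences 4, 1 and 16, giving 21 / (2 × 3) = 3.5; at the 200 m lag there are two pairs with squared differences 1 and 9, giving 10 / (2 × 2) = 2.5.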
Aiming at finding the most suitable model, the collected samples were used with the exponential, Gaussian and spherical variogram models. The exponential model was chosen based on the criteria suggested in Iwashita and Landim (2003) since it provided the smallest fit index (FI) for the sample set, as shown in Table 1 for data collected in 2005 and 2006. The exponential variogram model is given by

$$\gamma(h) = \begin{cases} C_0 + C_1\left[1 - e^{-h/a}\right], & 0 < h \le a, \\ C_0 + C_1, & h > a, \end{cases} \qquad (9)$$
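with $C_0$ the nugget effect, $C_1$ the variance of variable $Z$, $C_0 + C_1$ the sill, and $a$ the range. The fit index over the pairs of data whose locations are separated by all $N$ vectors $h$ named $h_k$, $k = 1, \ldots, N$, is defined as

$$FI = \frac{1}{N} \sum_{k=1}^{N} \frac{\sqrt{\left(\gamma^*(h_k) - \gamma(h_k)\right)^2}}{\gamma^*(h_k)}, \qquad (10)$$

with $\gamma^*(h_k)$ the experimental and $\gamma(h_k)$ the theoretical variogram value at $h_k$.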
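The exponential model (9) and a fit index in the spirit of (10) can be sketched as below. The sketch assumes that the deviation is normalized by the experimental value; the nugget and sill are those reported for the 2005 weed coverage in Fig. 3(a), the lags and experimental values are the first three 2005 weed coverage entries of Table 2, and the range of 200 m is an assumption based on the reported extent of the spatial dependence. Function names are hypothetical.

```python
from math import exp


def exponential_variogram(h, nugget, sill, a):
    """Eq. (9) for h > 0: C0 + C1 * (1 - exp(-h / a)) up to the range a, the sill beyond."""
    if h > a:
        return sill
    return nugget + (sill - nugget) * (1.0 - exp(-h / a))


def fit_index(lags, experimental, model):
    """Mean absolute deviation between model and experimental values,
    relative to the experimental value (assumed normalization)."""
    return sum(abs(g - model(h)) / g for h, g in zip(lags, experimental)) / len(lags)


def coverage_2005(h):
    # Nugget and sill from Fig. 3(a); the 200 m range is an assumption.
    return exponential_variogram(h, nugget=0.038, sill=0.05, a=200.0)


lags = [95.0, 126.63, 223.81]         # first three 2005 weed coverage lags of Table 2
experimental = [0.045, 0.042, 0.047]  # corresponding experimental variogram values
fi = fit_index(lags, experimental, coverage_2005)
```

Computing the index for each candidate model (exponential, Gaussian, spherical) and keeping the smallest value corresponds to the selection criterion described above.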
The fitted variograms for the weed coverage and weed seed production for the 2005 and 2006 data are shown in Figs. 3 and 4, respectively. Each point of the variogram represents pairs of data equally apart and is described in Table 2 for the 2005 and 2006 data. As the spatial dependence defined by the variograms exists up to 200 m, the weed data for non-sampled locations were estimated with kriging, based on the collected data, to obtain the spatial representation of the weeds.
The interpolation grid was selected as 20 m × 20 m, which is within the variogram range. The corn-crop of 49 ha (700 m × 700 m) was then divided into 35 cells per axis, each 20 m × 20 m. Estimated maps for the weed coverage at the current lifecycle and weed seed production at the subsequent lifecycle were thus generated. The maps are shown in Figs. 5 and 6 for the 2005 and 2006 data, respectively.
The estimation quality with kriging was evaluated by cross-validation (Isaaks and Srivastana, 1989). Three characteristics of the residuals, namely mean close to zero, constant variance and normality, were analyzed, indicating a good estimate. Table 3 shows the results of the cross-validation for the 2005 and 2006 data. As the intervals for the residual means contain zero, the null hypothesis of the mean being close to zero is not rejected. The variances are considered constant, with $\bar{R}$ the residual size, and the Anderson–Darling test is used to check the normality of the residual distribution at the 95% confidence level. As the p-values for the residuals are larger than 0.05, the hypothesis that the residuals have a normal distribution is not rejected.
3.2.2. Map objects and features
Weeds have a tendency to aggregate in clusters. This tendency explains why certain regions of a field are free of weeds. Due to the spatial variability of weeds in agricultural fields, it is possible to detect clusters from maps. Let $\mathcal{R}(u, v)$ represent the entire map region, with $(u, v)$ the spatial coordinates of the intensities in the map. The clusters detected in $\mathcal{R}(u, v)$ associated with the weed maps provide three features to infer the weed infestation risk. Assuming the features have three categorical conditions, the clusters in $\mathcal{R}(u, v)$ are described by connected objects obtained as follows.

First, to form a map $I(u, v)$ with coded intensities, the intensities $f(u, v)$ of $\mathcal{R}(u, v)$ are quantized into three levels $L_1, L_2, L_3$ associated with equally spaced ranges of $f(u, v)$ by an encoder $Q$ as follows:

$$I(u, v) = Q(f(u, v)) = t, \qquad (11)$$

where

$$t = \begin{cases} 1 & \text{if } f(u, v) \le L_1, \\ 2 & \text{if } L_1 < f(u, v) \le L_2, \\ 3 & \text{if } L_2 < f(u, v) \le L_3. \end{cases}$$
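The encoder of (11) can be sketched as below. The sketch assumes that the three ranges split the interval between the smallest and largest map intensity into equal widths, so that $L_3$ is the maximum intensity; the function name and the example map are hypothetical.

```python
def quantize(intensity_map, levels=3):
    """Eq. (11): code each intensity as 1..levels using equal-width ranges.

    Assumption: the ranges split [min, max] of the map evenly and the last
    upper bound equals the maximum intensity.
    """
    flat = [f for row in intensity_map for f in row]
    lo, hi = min(flat), max(flat)
    width = (hi - lo) / levels
    # Upper bounds L_1 < L_2 < ... with the last one set to the maximum intensity.
    bounds = [lo + width * t for t in range(1, levels)] + [hi]

    def code(f):
        return next(t for t, bound in enumerate(bounds, start=1) if f <= bound)

    return [[code(f) for f in row] for row in intensity_map]


# Hypothetical 2 x 3 map of intensities (illustrative only).
coded = quantize([[0, 10, 45], [50, 80, 90]])
```

With intensities between 0 and 90 the bounds are 30, 60 and 90, so the example map is coded as [[1, 1, 2], [2, 3, 3]].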
Table 1
Fit index of theoretical variogram models.

Model          Weed seed          Weed coverage
               2005     2006      2005     2006
Exponential    0.12     0.05      0.11     0.06
Gaussian       0.25     0.16      0.22     0.16
Spherical      0.16     0.08      0.15     0.08
Table 2
Results for the variograms.

                               Number of pairs        Distance h_k (m)      γ*(h_k)
                               separated by h_k
                               2005       2006        2005       2006       2005           2006
Weed seed production per m²    14.50      14.50       101.57     101.52     6.19 × 10^6    2.06 × 10^6
                               80.50      98          141.64     139.22     6.79 × 10^6    2.77 × 10^6
                               129        157         245.59     245.06     5.78 × 10^6    2.34 × 10^6
                               116.50     146.50      344.63     344.73     6.47 × 10^6    2.58 × 10^6
                               90         16          445.61     439.74     7.74 × 10^6    2.56 × 10^6
                               54         98.50       544.41     543.19     6.08 × 10^6    2.57 × 10^6
Weed coverage in %             1          16.50       95         101.57     0.045          0.048
                               113        106         126.63     139.47     0.042          0.063
                               142.50     172.50      223.81     245.13     0.047          0.057
                               210        161         330.39     344.85     0.044          0.063
                               142.50     130.50      435.92     440.20     0.054          0.071
                               98         112.50      528.15     543.11     0.061          0.063
                               59.50      –           619.44     –          0.062          –
                               4          –           690.96     –          0.048          –
Fig. 3. Theoretical variograms for the 2005 data obtained with an exponential model (solid line) and the corresponding experimental variogram (points) for (a) the weed coverage with $C_0 = 0.038$, $C_0 + C_1 = 0.05$ and for (b) the weed seed production with $C_0 = 5.09 \times 10^6$, $C_0 + C_1 = 6.50 \times 10^6$. [Plots of $\gamma^*(h)$ and $\gamma(h)$ against distance (m).]
Fig. 4. Theoretical variograms for the 2006 data obtained with an exponential model (solid line) and the corresponding experimental variogram (points) for (a) the weed coverage with $C_0 = 0.048$, $C_0 + C_1 = 0.063$ and for (b) the weed seed production with $C_0 = 2.0 \times 10^6$, $C_0 + C_1 = 2.60 \times 10^6$. [Plots of $\gamma^*(h)$ and $\gamma(h)$ against distance (m).]
Fig. 5. Maps estimated with kriging for data collected in 2005 associated with (a) the weed coverage map at the current lifecycle and (b) the weed seed production map at the subsequent lifecycle. The upper right corner of both maps represents the irregular contour of the corn-crop field. The gray scale in (a) represents percentage and in (b) represents the number of seeds per m$^2$.
Fig. 6. Maps estimated with kriging for data collected in 2006, associated with (a) the weed coverage map at the current life-cycle and (b) the weed seed production map at the subsequent life-cycle. The upper right corner of both maps represents the irregular contour of the corn-crop field. The gray scale in (a) represents percentage and in (b) represents the number of seeds per m^2.
Table 3
Cross validation for kriging estimation for 2005 and 2006 data.
                         Residual mean     Mean interval      Constant variance    Anderson–Darling test
Weed coverage
  2005                   0.50 × 10^-2      [-0.07, 0.06]      R̄ = 0.25             p-value = 0.61
  2006                   0.11 × 10^-1      [-0.10, 0.07]      R̄ = 0.36             p-value = 0.19
Weed seed production
  2005                   0.40 × 10^-2      [-0.17, 0.18]      R̄ = 0.26             p-value = 0.45
  2006                   0.82 × 10^-1      [-0.68, 0.51]      R̄ = 0.21             p-value = 0.16
The pixels in $I(u,v)$ may represent the same intensity range but may belong to different clusters within the image. Connected objects are thus obtained by image analysis using a 4-connected model (Gonzalez and Woods, 2002). In this model, two pixels that are four-neighbors are connected if they have the same value. The four-neighbors of pixel $p$ at coordinates $(u,v)$ are given by

$$(u+1, v),\; (u-1, v),\; (u, v+1),\; (u, v-1). \tag{12}$$

The four-connected model is implemented by generating a binary matrix called $I_b(u,v)$ as

$$\text{If } I(u,v) \neq 0 \text{ then } I_b(u,v) = 1. \tag{13}$$

Finally, the connected objects are organized in a matrix called $T(u,v)$. In gray scale, Fig. 7 shows the image of the labels given to the nine connected objects identified in both the maps of the weed coverage and weed seed production for the 2005 data, and Fig. 8 shows the same images for the 2006 data.
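
The four-connected grouping of Eqs. (12) and (13) can be sketched as a breadth-first flood fill. This is only a minimal illustration and not the authors' implementation: the function name `label_four_connected` is hypothetical, the labels are simply numbered in scan order, and the example matrix is the 5 × 5 region reported in Table 4.

```python
from collections import deque


def label_four_connected(image):
    """Label 4-connected objects made of equal, non-zero pixel values.

    Returns (labels, n_objects); labels[u][v] is 0 for background pixels
    and a number in 1..n_objects otherwise.
    """
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    n_objects = 0
    for u0 in range(rows):
        for v0 in range(cols):
            # Eq. (13): only non-zero pixels belong to the binary mask I_b.
            if image[u0][v0] == 0 or labels[u0][v0] != 0:
                continue
            n_objects += 1
            labels[u0][v0] = n_objects
            queue = deque([(u0, v0)])
            while queue:
                u, v = queue.popleft()
                # Eq. (12): visit the four-neighbors of (u, v).
                for nu, nv in ((u + 1, v), (u - 1, v), (u, v + 1), (u, v - 1)):
                    if (0 <= nu < rows and 0 <= nv < cols
                            and labels[nu][nv] == 0
                            and image[nu][nv] == image[u][v]):
                        labels[nu][nv] = n_objects
                        queue.append((nu, nv))
    return labels, n_objects


# Region R_25 of Table 4 (values are the intensities t).
EXAMPLE = [
    [1, 1, 1, 2, 2],
    [1, 1, 1, 2, 2],
    [2, 2, 2, 2, 2],
    [2, 2, 2, 2, 1],
    [2, 2, 2, 2, 1],
]

labels, n_objects = label_four_connected(EXAMPLE)
print(n_objects)  # 3
```

For this region the sketch finds two objects of intensity 1 and one object of intensity 2, which agrees with the object counts listed in Table 4.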
Fig. 7. Maps of connected objects representing (a) the matrix $T(u,v)$ for the weed coverage and (b) the weed seed production (2005 data).

Fig. 8. Maps of connected objects representing (a) the matrix $T(u,v)$ for the weed coverage and (b) the weed seed production (2006 data).

Using the connected objects defined above, features for the infestation were selected (Bressan et al., 2008). The features were evaluated per region, with regions of size not exceeding the spatial dependence of the data sets. Let $R_i$, $i = 1, \ldots, N_R$, denote subregion $i$ of $R$, $p_i^t$ the number of connected objects in $R_i$ such that $T(u,v) = t$, and $k_i^t$ the number of pixels with intensity equal to $t$ in $R_i$. The features were established as follows:

$x_1$: Feature for the weed coverage per region. Indicates the percentage of surface infested by emergent weeds in each region. In each region $R_i$ it is obtained from the weighted intensities $T(u,v)$, as follows:

$$x_1(i) = \frac{1}{\text{number of elements of } R_i}\, \frac{\sum_{(u,v) \in R_i} T(u,v)}{\sum_{t=1}^{3} t}. \tag{14}$$

$x_2$: Feature for the weed seed production per region. Characterizes the locations of seeds which can germinate in each region and is associated with the weed seed production. It is obtained in the same way as feature $x_1$.

$x_3$: Feature for the weed seed patches per region. Represents how the seeds contribute to weed proliferation in the surroundings of each region. The worst case of a patch distribution is one patch covering all the cells of the image, representing 100% of occupation. If a weed seed value occupies a large part of a region, that is, several pixels contain this value, then the object formed by this value has a high influence on the weed seed patch calculation. In each region $R_i$, it is obtained as the average of the weighted intensities $T(u,v)$ of the connected objects, as follows:

$$x_3(i) = \frac{1}{\text{number of elements of } R_i}\, \frac{\sum_{t \mid p_i^t \neq 0} t\, k_i^t / p_i^t}{\sum_{t=1}^{3} t}. \tag{15}$$

$x_4$: Feature for the competitiveness. Reflects the high level of competitiveness of certain species of weeds and their proliferation, and is based on the weed biomass. The higher the weed biomass, the higher the competitiveness. It is obtained as the output of the first Bayesian network as a categorical variable for each region using the resulting rule set.

An example of the evaluation of the features $x_2$ and $x_3$ in region $R_{25}$ for the 2006 data is presented in Table 4.
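
As a check on Eqs. (14) and (15), the values reported in Table 4 can be recomputed from the matrix $T$ and the counts $k_{25}^t$ and $p_{25}^t$ given there. The sketch below is illustrative only; the function names `intensity_feature` and `patch_feature` and the `t_max` parameter are hypothetical names introduced for this example.

```python
def intensity_feature(T, t_max=3):
    """Eq. (14): mean of T over the region, divided by sum_{t=1}^{t_max} t."""
    n_elements = sum(len(row) for row in T)
    total = sum(sum(row) for row in T)
    return total / n_elements / sum(range(1, t_max + 1))


def patch_feature(k, p, n_elements, t_max=3):
    """Eq. (15): k[t] pixels and p[t] connected objects for each intensity t."""
    weighted = sum(t * k[t] / p[t] for t in k if p.get(t, 0) != 0)
    return weighted / n_elements / sum(range(1, t_max + 1))


# Data of Table 4 for region R_25 (2006 data).
T_25 = [
    [1, 1, 1, 2, 2],
    [1, 1, 1, 2, 2],
    [2, 2, 2, 2, 2],
    [2, 2, 2, 2, 1],
    [2, 2, 2, 2, 1],
]
k_25 = {1: 8, 2: 17}   # pixels per intensity
p_25 = {1: 2, 2: 1}    # connected objects per intensity

x2 = intensity_feature(T_25)
x3 = patch_feature(k_25, p_25, n_elements=25)
print(round(x2, 4), round(x3, 4))  # 0.28 0.2533
```

Here $x_2 = (8 \cdot 1 + 17 \cdot 2)/(25 \cdot 6) = 0.28$ and $x_3 = (1 \cdot 8/2 + 2 \cdot 17/1)/(25 \cdot 6) \approx 0.2533$, matching Table 4.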
Along with features $x_i$, $i = 1, 2, 3$, which had to be discretized in order to build matrices of categorical variables, the matrix of categorical variables representing the feature for the competitiveness was an input to the network that classifies the risk of infestation for each region $R_i$. The features $x_1$ and $x_4$ were calculated at the current life-cycle and the features $x_2$ and $x_3$ at the subsequent life-cycle of the weed population. The features as well as the yield loss were obtained for both the 2005 and 2006 life-cycle weed–crop data sets. The infestation features were evaluated per region: the crop was divided into 49 regions of $5 \times 5$ cells, each one measuring 100 m × 100 m, not exceeding the data set spatial dependence of 200 m. The variables used to train the networks are the categorical variables obtained by region, normalized in $[0, 1]$. Therefore, two instances for each of the 49 regions were considered, resulting in 98 instances.

In order to extract probabilistic rules using the BayesRule method, the values of all the variables had to be discretized. The discretization was conducted by an expert, who proposed the three intervals described in Table 5, represented by categorical variables.
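
The expert intervals of Table 5 can be applied with a small lookup, sketched below under stated assumptions: the dictionary `CUTS` and the function `discretize` are hypothetical names, the cut points are copied from Table 5, and the function returns an interval index (0, 1 or 2) rather than a label so that no label spelling has to be assumed.

```python
# Cut points (upper end of the first interval, lower end of the third interval),
# copied from Table 5. Values are assumed to be normalized in [0, 1].
CUTS = {
    "WCoverage": (0.35, 0.70),
    "WSeed": (0.35, 0.70),
    "WSPatch": (0.40, 0.80),
    "TWeed": (0.20, 0.60),
    "NLWeed": (0.20, 0.60),
    "BLWeed": (0.25, 0.75),
    "WBiomass": (0.20, 0.60),
    "YLoss": (0.15, 0.45),
}


def discretize(variable, value):
    """Return 0, 1 or 2 for the first, second or third interval of Table 5."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("value must be normalized in [0, 1]")
    low_max, high_min = CUTS[variable]
    if value <= low_max:    # closed interval [0, low_max]
        return 0
    if value < high_min:    # open interval ]low_max, high_min[
        return 1
    return 2                # closed interval [high_min, 1]


print(discretize("WSeed", 0.28), discretize("WSPatch", 0.2533))  # 0 0
```

With the values of Table 4, both $x_2 = 0.28$ and $x_3 = 0.2533$ fall in the first interval of their variables.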
3.2.3. Bayesian network structures

Since the risk can be explained by the yield loss, this was defined as the class variable. Fig. 9 shows the structure of the Bayesian network classifier represented by the parent–children relationships, defined by a subject specialist. The node identified as weed biomass is the class node from which the competitiveness is inferred.

Fig. 10 shows the naïve Bayes classifier structure that represents the same problem, in which the class variable has no parents and all the features are conditionally independent given the class variable. For the purpose of rule comparison, only the node competitiveness from the first collaborative network is included in the learning of the naïve Bayes classifier.
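
For reference, the conditional independence assumption of a naïve Bayes structure corresponds to the standard factorization below. The symbols $C$ (class variable) and $X_1, \ldots, X_n$ (features) are generic notation assumed for this note, not the paper's own.

```latex
P(C, X_1, \ldots, X_n) = P(C) \prod_{j=1}^{n} P(X_j \mid C),
\qquad
P(C \mid X_1, \ldots, X_n) \propto P(C) \prod_{j=1}^{n} P(X_j \mid C).
```

For the classifier of Fig. 10, $C$ would be the yield loss and $n = 4$ (weed seed, weed coverage, weed seed patches and competitiveness).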
It is evident by inspecting the expert-based Bayesian network classifier depicted in Fig. 9 that the weed coverage, total weed, broad-leaved weed and narrow-leaved weed nodes do not belong to the Markov blanket of the class node defined as the yield loss. Therefore, these nodes will not be taken into account by the BayesRule method (Hruschka et al., 2008).
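
The Markov blanket of a node consists of its parents, its children and the other parents of its children. The sketch below computes it for a small directed graph. The edge list is hypothetical: the arcs of Fig. 9 are not listed in the text, so `EDGES` is only one arrangement that is consistent with the statement above, and `markov_blanket` is a name introduced for this example.

```python
def markov_blanket(edges, target):
    """Parents, children and the children's other parents of `target` in a DAG."""
    parents = {a for a, b in edges if b == target}
    children = {b for a, b in edges if a == target}
    spouses = {a for a, b in edges if b in children and a != target}
    return parents | children | spouses


# Hypothetical arcs (assumption): NOT taken from Fig. 9, whose edges are not
# given in the text. They are chosen only to agree with the paragraph above.
EDGES = [
    ("TWeed", "WBiomass"),
    ("NLWeed", "WBiomass"),
    ("BLWeed", "WBiomass"),
    ("WCoverage", "WSeed"),
    ("WBiomass", "YLoss"),
    ("WSeed", "YLoss"),
    ("WSPatch", "YLoss"),
]

print(sorted(markov_blanket(EDGES, "YLoss")))
# ['WBiomass', 'WSPatch', 'WSeed']
```

Under this assumed structure, the weed coverage, total weed, broad-leaved weed and narrow-leaved weed nodes fall outside the blanket of the yield loss, so they would not appear in the rule antecedents.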
Once both Bayesian networks had their structure defined, the next step was to learn the conditional probability distribution associated with their nodes. This was accomplished as part of the BayesRule method, using a free software called Genie.² As mentioned in Section 1, the knowledge represented by a Bayesian classifier is not easily understood by human beings. A way of promoting its understandability is by translating it into a more
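Table 4
Objects and features for $R_{25}$ with $R_i \in \mathbb{R}^{5 \times 5}$.

$$T = \begin{bmatrix} 1 & 1 & 1 & 2 & 2 \\ 1 & 1 & 1 & 2 & 2 \\ 2 & 2 & 2 & 2 & 2 \\ 2 & 2 & 2 & 2 & 1 \\ 2 & 2 & 2 & 2 & 1 \end{bmatrix}$$

$k_{25}^{1} = 8$, $p_{25}^{1} = 2$
$k_{25}^{2} = 17$, $p_{25}^{2} = 1$
$x_2 = 0.2800$, $x_3 = 0.2533$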
Table 5
Discrete intervals for the risk of infestation categorical variables.
Node variables                         Intervals
Weed coverage (WCoverage) (%)          Thin (Th)     Average (A)     Thick (k)
                                       [0,0.35]      ]0.35,0.70[     [0.70,1]
Weed seed (WSeed) (m^2)                Low (L)       Medium (M)      High (H)
                                       [0,0.35]      ]0.35,0.70[     [0.70,1]
Weed seed patches (WSPatch) (m^2)      Small (S)     Regular (R)     Large (G)
                                       [0,0.40]      ]0.40,0.80[     [0.80,1]
Total weed (TWeed) (m^2)               Low (L)       Medium (M)      High (H)
                                       [0,0.20]      ]0.20,0.60[     [0.60,1]
Narrow-leaved weed (NLWeed) (m^2)      Low (L)       Medium (M)      High (H)
                                       [0,0.20]      ]0.20,0.60[     [0.60,1]
Broad-leaved weed (BLWeed) (m^2)       Low (L)       Medium (M)      High (H)
                                       [0,0.25]      ]0.25,0.75[     [0.75,1]
Weed biomass (WBiomass) (m^2)          Low (L)       Medium (M)      High (H)
                                       [0,0.20]      ]0.20,0.60[     [0.60,1]
Yield loss (YLoss) (output) (m^2)      Low (L)       Medium (M)      High (H)
                                       [0,0.15]      ]0.15,0.45[     [0.45,1]
[Figure: network with nodes Weed seed, Weed coverage, Narrow-leaved weed, Broad-leaved weed, Total weed, Weed biomass, Weed seed patches and Yield loss.]
Fig. 9. The expert-based Bayesian network classifier to infer the risk of infestation.
² http://genie.sis.pitt.edu
suitable representation, such as classification rules. As the standard propositional if–then classification rule is the simplest and most comprehensible way to represent a classification procedure, it has been adopted by the BayesRule method (Hruschka et al., 2008), which implements the translation process.
3.2.4. Pruning strategy

As stated in the literature, there are many rule interestingness metrics, such as support, confidence, lift, correlation and collective strength. Such metrics are often used to determine the more relevant rules from a rule set (as done by pruning strategies, for instance). Many of these measures, however, provide conflicting information about the interestingness of a pattern. Therefore, the best metric to use for a given application domain is hard to define.

It is not claimed that the rule estimated probability given by the BC is the best measure to be used in a rule set pruning task, since different measures have different intrinsic properties. However, it is a measure that is easy to implement and to understand, and it helped the pruning (as the experiments showed). See Tan et al. (2002) for a more detailed description of the properties of some of the most commonly used rule interestingness measures.

In the experiments described in this paper, focusing on the weed infestation domain, a pruning strategy based on the rule estimated probability given by the BC is proposed to reduce the number of rules in each rule set when the Markov blanket strategy was unable to reduce the number of rules. The pruning strategy is based on a very simple idea and is mainly motivated by the fact that it can be applied without any extra computational effort. Considering the rule set as a list ordered by estimated probability, the pruning can be done by taking into account only the rules having an estimated probability higher than a predefined threshold. When pruning is applied, the number of rules tends to be smaller and the comprehensibility tends to be higher. On the other hand, having fewer rules may imply having a less detailed overview of the problem (with fewer rules and fewer antecedents). Thus, the trade-off between accuracy and complexity is a very important issue to be analyzed in each specific application domain.
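
The threshold pruning described above can be sketched in a few lines. The example is illustrative rather than the authors' code: `prune` is a hypothetical helper, the probabilities are those listed for the 27 expert-based risk rules in Table 13, and a rule is kept when its probability is greater than or equal to the threshold (an assumption; no listed probability equals a threshold exactly, so the choice does not affect the result here).

```python
# Estimated probabilities of rules 1..27, copied from Table 13.
TABLE_13_PROB = [
    0.66, 0.50, 0.47, 0.41, 0.72, 0.48, 0.36, 0.51, 0.77,
    0.66, 0.61, 0.93, 0.41, 0.84, 0.54, 0.36, 0.55, 0.92,
    0.50, 0.67, 1.00, 0.84, 1.00, 0.58, 0.62, 0.50, 1.00,
]


def prune(probabilities, threshold):
    """Return the 1-based numbers of the rules kept, most probable first."""
    kept = [(p, i) for i, p in enumerate(probabilities, start=1) if p >= threshold]
    kept.sort(key=lambda item: (-item[0], item[1]))
    return [i for _, i in kept]


kept = prune(TABLE_13_PROB, threshold=0.70)
print(sorted(kept))   # [5, 9, 12, 14, 18, 21, 22, 23, 27]
print(len(kept) + 1)  # 10, counting the default rule D
```

The nine rules kept at the 0.70 threshold are the ones listed in Table 16, and adding the default rule D gives the rule counts reported for the expert-based network in Table 19 (15, 10 and 8 rules for thresholds of 60%, 70% and 80%).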
4. Collaborative classifiers results

Using an expert-based Bayesian classifier and a naïve Bayes classifier, the numerical parameters of the classifiers were obtained using the Genie software. The BayesRule method was then used in conjunction with each classifier in order to extract the corresponding classification rules to infer the infestation risk. In order to do that, the values of all features were discretized in the conditions of each rule as in Table 5, except $x_4$, which was inferred from the weed biomass as a categorical variable, thus having the same intervals as the weed biomass.

The number of rules represents all the variable combinations and their categorical variables. Each rule has an associated value that represents the probability of its class value, given the values of its antecedent variables. Using a 10-fold cross validation procedure, 10 Bayesian networks were trained using 10 different training sets and the extracted rules were evaluated using each of the 10 corresponding testing sets. The same testing sets were used to evaluate the extracted rules with and without pruning from both the expert-based and naïve networks of the collaborative system. In the pruning for each one of the 10 cross validation sets, the rules with probability below a certain threshold, which were generated from an a priori probability of the class variable obtained from the numerical parameters of the network, were removed and a default class was introduced. The most probable value for the class variable was taken as the categorical value medium (M). The default class named D is then defined as the most frequent class.

Considering that a 10-fold cross validation strategy was used in the experiments, only one of the 10 testing sets was chosen to be presented in the paper for each classifier. The remaining fold results are obtained in the same way. In what follows, the results for both networks of the collaborative system used to infer the risk of infestation are presented.
4.1. Competitiveness weed–crop classification results

The inference for the competitiveness of weed–crop is performed by the first classification task in the collaborative system. The BayesRule method extracted a set of rules $\{R_1, \ldots, R_r\}$ with $r = 27$ probabilistic rules from each Bayes classifier (three variables, each having three possible values).
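
The rule count follows from enumerating every combination of antecedent values, which can be sketched with `itertools.product`. The helper `antecedents` and the generic value labels are assumptions made for this illustration; with the value order H, L, M the positions happen to coincide with the rule numbers of the expert-based rules listed in Table 7 (for instance, rule 6 is H, L, M).

```python
from itertools import product


def antecedents(variables, values=("H", "L", "M")):
    """All combinations of one categorical value per antecedent variable."""
    return list(product(values, repeat=len(variables)))


competitiveness = antecedents(["BLWeed", "NLWeed", "TWeed"])
risk_naive = antecedents(["WCoverage", "WSeed", "WCompetitiveness", "WSPatch"])

print(len(competitiveness), len(risk_naive))  # 27 81
print(competitiveness[6 - 1])                 # ('H', 'L', 'M')
```

Three antecedent variables give $3^3 = 27$ rules, while the four antecedents of the naïve Bayes risk classifier give $3^4 = 81$, the number reported in Table 19.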
The evaluation results obtained for one of the 10 testing sets, including the accuracy and the corresponding class probability, are shown in Table 6. For this case, the rules are 50% in agreement with the testing set, since 3 out of 6 data instances were correctly classified. Table 7 shows the pruned Bayesian rule set, which presents rules with probability above a threshold of 70% as well as the default rule D, and Table 8 shows the results of the testing set using the pruned rule set of Table 7. For this testing set, rules 9 and 21 were replaced by the default rule and the pruned rule set was 83.33% in agreement with the testing set, since 5 out of 6 instances were correctly classified. In this particular modeling, the classification rate has improved. Over all the 10 testing sets, 71 data instances were tested. The results indicate 63.39% of agreement, since 45 of 71 testing data were correctly classified. By replacing the rules with probability less than 70% by the default rule D, this percentage became 64.79%, since 46 out of 71 testing instances were correctly classified. These results are shown in Table 9. Table 10 shows the results for one testing set when considering the Bayesian rule set extracted from the naïve Bayes classifier, which reveal that the rules are 50% in agreement
[Figure: network with nodes Yield loss, Weed seed, Weed coverage, Weed seed patches and Competitiveness.]
Fig. 10. The naïve Bayes classifier to infer the risk of infestation.
Table 6
Competitiveness expert-based Bayesian network testing data set results for the rules.
BLWeed   NLWeed   TWeed   WBiomass   R_r     Test        P(Rule R_r | X_{1,J_1}, X_{2,J_2}, X_{3,J_3}) (%)
H        M        M       H          R_9     Incorrect   52
M        H        M       M          R_21    Incorrect   55
H        L        M       M          R_6     Correct     71
H        L        M       M          R_6     Correct     71
H        M        M       M          R_9     Incorrect   52
M        M        L       L          R_26    Correct     80
with the testing set, since 3 out of 6 instances were correctly classified. Table 11 shows the pruned Bayesian rule set and Table 12 shows the results of the testing set using the pruned rule set. For this testing set, rules 3, 13, 25 and 27 were replaced by the default rule and the pruned rule set was 66.66% in agreement with the testing set, since 4 out of 6 data instances were correctly classified. In this particular modeling, the classification has also improved.

The results of the classification for all testing sets using other thresholds for pruning the rule sets of the expert-based and naïve Bayesian classifiers are also presented in Table 9.
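
How a pruned rule set with a default rule classifies a testing set can be illustrated with the expert-based rules of Table 7 and the six instances of Table 8. This is a sketch for illustration: the dictionary layout and the names `PRUNED_RULES`, `DEFAULT_CLASS` and `classify` are assumptions of this example, while the rules and instances are copied from the two tables.

```python
# Pruned expert-based rule set of Table 7: (BLWeed, NLWeed, TWeed) -> WBiomass.
PRUNED_RULES = {
    ("H", "H", "H"): "H",  # rule 1
    ("H", "L", "H"): "M",  # rule 4
    ("H", "L", "M"): "M",  # rule 6
    ("H", "M", "H"): "H",  # rule 7
    ("L", "H", "L"): "M",  # rule 11
    ("M", "H", "H"): "L",  # rule 19
    ("M", "L", "L"): "L",  # rule 23
    ("M", "L", "M"): "L",  # rule 24
    ("M", "M", "L"): "L",  # rule 26
    ("M", "M", "M"): "M",  # rule 27
}
DEFAULT_CLASS = "M"  # default rule D

# Testing instances of Table 8: (BLWeed, NLWeed, TWeed, observed WBiomass).
TEST_SET = [
    ("H", "M", "M", "H"),
    ("M", "H", "M", "M"),
    ("H", "L", "M", "M"),
    ("H", "L", "M", "M"),
    ("H", "M", "M", "M"),
    ("M", "M", "L", "L"),
]


def classify(antecedent):
    """Apply the matching rule, or the default rule D when no rule matches."""
    return PRUNED_RULES.get(antecedent, DEFAULT_CLASS)


correct = sum(classify(row[:3]) == row[3] for row in TEST_SET)
print(f"{correct}/{len(TEST_SET)} = {100 * correct / len(TEST_SET):.2f}%")
# 5/6 = 83.33%
```

The three instances whose antecedents have no surviving rule fall back to the default class M, which is how the agreement on this testing set rises from 3 to 5 correct instances out of 6.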
4.2. Risk of infestation classification results

The second classification task was performed in order to generate a set of classification rules to infer the risk of infestation. As in the case of the competitiveness, an expert-based Bayesian network classifier and a naïve Bayes classifier were considered. As before, the numerical parameters were defined using the Genie software. The BayesRule method was then used in conjunction with each classifier in order to extract the corresponding classification rules to infer the infestation risk. The BayesRule method extracted a set of 27 probabilistic rules from each expert-based Bayesian network classifier, as shown in Table 13.

By considering the Bayesian rule set extracted from the expert-based network, Table 14 displays the results obtained for the test set illustrated here. The rules presented 89% of agreement, given that 8 of 9 tested instances were correctly classified. By considering the Bayesian rule set extracted from the naïve Bayes classifier, the results are shown in Table 15.

As before, the rules with probability below 70% were removed and a default class D, also taken as the categorical value M, was used. The pruning strategy was applied after the Markov blanket had reduced the number and the complexity (regarding conditions in their antecedent part) of the classification rules. Thus, the pruning strategy was applied to the rules shown in Table 13 and the reduced set of rules is shown in Table 16.

By considering the pruned Bayesian rule set extracted from the expert-based network, Table 17 displays the results obtained again for the test set illustrated here. The rules presented, as before, 89% of agreement. By considering the pruned Bayesian rule set extracted from the naïve Bayes classifier, the results are shown in Table 18. The rules presented 88.9% of agreement. For all the 10 testing set cases, the results are shown in Table 19. Also, to verify whether the rule sets have a positive impact on the results, the results obtained using the default class D in all the 10 testing set cases
Table 7
Pruned expert-based Bayesian rule set using the default rule D with a threshold of probability 0.7.
1    If (BLWeed is H) and (NLWeed is H) and (TWeed is H) then WBiomass is H (0.72)
4    If (BLWeed is H) and (NLWeed is L) and (TWeed is H) then WBiomass is M (1.00)
6    If (BLWeed is H) and (NLWeed is L) and (TWeed is M) then WBiomass is M (0.72)
7    If (BLWeed is H) and (NLWeed is M) and (TWeed is H) then WBiomass is H (0.83)
11   If (BLWeed is L) and (NLWeed is H) and (TWeed is L) then WBiomass is M (1.00)
19   If (BLWeed is M) and (NLWeed is H) and (TWeed is H) then WBiomass is L (1.00)
23   If (BLWeed is M) and (NLWeed is L) and (TWeed is L) then WBiomass is L (0.79)
24   If (BLWeed is M) and (NLWeed is L) and (TWeed is M) then WBiomass is L (0.72)
26   If (BLWeed is M) and (NLWeed is M) and (TWeed is L) then WBiomass is L (0.80)
27   If (BLWeed is M) and (NLWeed is M) and (TWeed is M) then WBiomass is M (0.80)
D    Otherwise WBiomass is M (1.00)
Table 8
Pruned expert-based Bayesian network testing data set results using the default rule D with a threshold of probability 0.7.
BLWeed   NLWeed   TWeed   WBiomass   R_r     Test        P(Rule R_r | X_{1,J_1}, X_{2,J_2}, X_{3,J_3}) (%)
H        M        M       H          R_D     Incorrect   100
M        H        M       M          R_D     Correct     100
H        L        M       M          R_6     Correct     72
H        L        M       M          R_6     Correct     72
H        M        M       M          R_D     Correct     100
M        M        L       L          R_26    Correct     80
Table 9
Competitiveness classification results with expert-based and naïve Bayesian networks for all 10 folds.
                            Expert-based                        Naïve
                            Accuracy (%)    Number of rules     Accuracy (%)    Number of rules
Markov blanket rules set    63.39           27                  57.75           27
Pruned rules set
  Threshold = 60%           64.79           12                  60.56           15
  Threshold = 70%           64.79           11                  61.97           9
  Threshold = 80%           60.65           7                   60.56           3
  Threshold = none          60.00           1                   60.50           1
Table 10
Naïve Bayes testing data set results for the rules.
BLWeed   NLWeed   TWeed   WBiomass   R_r     Test        P(Rule R_r | X_{1,J_1}, X_{2,J_2}, X_{3,J_3}) (%)
H        M        H       H          R_3     Incorrect   62
L        H        L       M          R_13    Correct     52
M        M        M       M          R_27    Correct     48
M        H        M       M          R_25    Incorrect   50
M        M        M       M          R_27    Correct     48
H        L        M       L          R_20    Incorrect   91
Table 11
Pruned naïve Bayes rule set with a threshold of probability 0.7.
2    If (BLWeed is H) and (NLWeed is L) and (TWeed is H) then WBiomass is M (0.80)
5    If (BLWeed is L) and (NLWeed is L) and (TWeed is H) then WBiomass is M (0.73)
18   If (BLWeed is M) and (NLWeed is M) and (TWeed is L) then WBiomass is L (0.71)
19   If (BLWeed is H) and (NLWeed is H) and (TWeed is M) then WBiomass is M (0.76)
20   If (BLWeed is H) and (NLWeed is L) and (TWeed is M) then WBiomass is M (0.91)
21   If (BLWeed is H) and (NLWeed is M) and (TWeed is M) then WBiomass is M (0.78)
23   If (BLWeed is L) and (NLWeed is L) and (TWeed is M) then WBiomass is M (0.85)
D    Otherwise WBiomass is M (1.00)
whatever the rule antecedent is, and also the results for thresholds of 60%, 70% and 80%, are shown in Table 19.
5. Conclusions

This work explores Bayesian network based methods to infer the risk of weed infestation in a corn-crop. The proposed inference system is implemented as a collaboration between two classification tasks. The first one infers the competitiveness (expressed by the biomass) of weeds and the second infers the risk of infestation (expressed by the yield loss), using as input the inferred competitiveness, the weed seed density, weed coverage and weed seed patches. The last three features are inferred from kriging and image objects. For both classification tasks, two different Bayesian network structures, a naïve Bayes and an expert-based network structure, were used for comparison purposes. The numeric parameters of both Bayesian models were learned from the empirical data collected from a corn-crop field.

A hybrid approach, implemented by the BayesRule method, which articulates Bayes and categorical rules, was used to improve the model's understandability by extracting classification rules from each model. The Markov blanket concept was used in the BayesRule method to reduce the number and the complexity of classification rules. When pruning is applied, the number of rules tends to be smaller and the comprehensibility tends to be higher. On the other hand, having fewer rules may imply having a less detailed overview of the problem (with fewer rules and fewer antecedents). Thus, the trade-off between accuracy and complexity is a very important issue to be analyzed in each specific application domain.

In this work, for the expert-based network, the Markov blanket concept was sufficient to prune the rule set efficiently, since the results indicate 72.5% and 66.3% of agreement without and with the pruning strategy, respectively. In addition, the results reveal that the expert-based Bayesian network classifier yields a higher accuracy than the naïve Bayes classifier. In the latter, the application of the pruning strategy made no difference in the results. The strong and unrealistic assumption (that all the features are independent given the class), which is an intrinsic aspect of any naïve Bayes classifier, may have contributed to this behavior. It is worthwhile mentioning that the results presented are specific to a particular crop field, subject to the conditions described in Section 3.1. Further work includes the use of extensive simulations and experiments to generalize the obtained results. It is also worth looking into the use of the proposed pruning strategy in other domains in order to confirm its relevance.
Table 13
Expert-based Bayesian rules set for the risk of infestation.
1    If (WSeed is H) and (WCompetitiveness is H) and (WSPatch is G) then YLoss is H (0.66)
2    If (WSeed is H) and (WCompetitiveness is H) and (WSPatch is S) then YLoss is H (0.50)
3    If (WSeed is H) and (WCompetitiveness is H) and (WSPatch is R) then YLoss is H (0.47)
4    If (WSeed is H) and (WCompetitiveness is L) and (WSPatch is G) then YLoss is H (0.41)
5    If (WSeed is H) and (WCompetitiveness is L) and (WSPatch is S) then YLoss is M (0.72)
6    If (WSeed is H) and (WCompetitiveness is L) and (WSPatch is R) then YLoss is M (0.48)
7    If (WSeed is H) and (WCompetitiveness is M) and (WSPatch is G) then YLoss is M (0.36)
8    If (WSeed is H) and (WCompetitiveness is M) and (WSPatch is S) then YLoss is M (0.51)
9    If (WSeed is H) and (WCompetitiveness is M) and (WSPatch is R) then YLoss is M (0.77)
10   If (WSeed is L) and (WCompetitiveness is H) and (WSPatch is G) then YLoss is H (0.66)
11   If (WSeed is L) and (WCompetitiveness is H) and (WSPatch is S) then YLoss is M (0.61)
12   If (WSeed is L) and (WCompetitiveness is H) and (WSPatch is R) then YLoss is H (0.93)
13   If (WSeed is L) and (WCompetitiveness is L) and (WSPatch is G) then YLoss is L (0.41)
14   If (WSeed is L) and (WCompetitiveness is L) and (WSPatch is S) then YLoss is M (0.84)
15   If (WSeed is L) and (WCompetitiveness is L) and (WSPatch is R) then YLoss is L (0.54)
16   If (WSeed is L) and (WCompetitiveness is M) and (WSPatch is G) then YLoss is M (0.36)
17   If (WSeed is L) and (WCompetitiveness is M) and (WSPatch is S) then YLoss is M (0.55)
18   If (WSeed is L) and (WCompetitiveness is M) and (WSPatch is R) then YLoss is M (0.92)
19   If (WSeed is M) and (WCompetitiveness is H) and (WSPatch is G) then YLoss is H (0.50)
20   If (WSeed is M) and (WCompetitiveness is H) and (WSPatch is S) then YLoss is M (0.67)
21   If (WSeed is M) and (WCompetitiveness is H) and (WSPatch is R) then YLoss is H (1.00)
22   If (WSeed is M) and (WCompetitiveness is L) and (WSPatch is G) then YLoss is M (0.84)
23   If (WSeed is M) and (WCompetitiveness is L) and (WSPatch is S) then YLoss is M (1.00)
24   If (WSeed is M) and (WCompetitiveness is L) and (WSPatch is R) then YLoss is M (0.58)
25   If (WSeed is M) and (WCompetitiveness is M) and (WSPatch is G) then YLoss is M (0.62)
26   If (WSeed is M) and (WCompetitiveness is M) and (WSPatch is S) then YLoss is M (0.50)
27   If (WSeed is M) and (WCompetitiveness is M) and (WSPatch is R) then YLoss is H (1.00)
Table 12
Pruned naïve Bayes testing data set results for the rules with a threshold of probability 0.7.
BLWeed   NLWeed   TWeed   WBiomass   R_r     Test        P(Rule R_r | X_{1,J_1}, X_{2,J_2}, X_{3,J_3}) (%)
H        M        H       H          R_D     Incorrect   100
L        H        L       M          R_D     Correct     100
M        M        M       M          R_D     Correct     100
M        H        M       M          R_D     Correct     100
M        M        M       M          R_D     Correct     100
H        L        M       L          R_20    Incorrect   91
Table 14
Expert-based Bayesian network testing data set results for the infestation risk.
WSeed   WSPatch   WCompetitiveness   YLoss   R_r     Test        P(Rule R_r | V_{1,J_1}, V_{2,J_2}, V_{3,J_3}, V_{4,J_4}) (%)
L       S         L                  M       R_14    Correct     84
L       S         M                  M       R_17    Correct     55
L       S         L                  M       R_14    Correct     84
L       S         M                  M       R_17    Correct     55
M       R         M                  M       R_27    Incorrect   100
L       S         L                  M       R_14    Correct     84
L       S         H                  M       R_11    Correct     61
L       S         H                  M       R_11    Correct     61
M       R         H                  H       R_21    Correct     100
Table 15
Naïve Bayes classifier testing data set results for the infestation risk.
WCoverage   WSeed   WCompetitiveness   WSPatch   YLoss   R_r     Test        P(Rule R_r | V_{1,J_1}, V_{2,J_2}, V_{3,J_3}, V_{4,J_4}) (%)
Th          L       L                  S         M       R_41    Correct     82
Th          L       L                  S         M       R_41    Correct     82
Th          L       L                  G         M       R_40    Correct     61
Th          L       L                  R         M       R_42    Correct     70
Th          L       L                  R         M       R_42    Correct     70
Th          L       L                  S         M       R_41    Correct     82
Th          L       L                  R         H       R_42    Incorrect   70
A           L       L                  R         H       R_51    Correct     75
A           L       L                  G         H       R_49    Correct     82
Table 16
Pruned expert-based Bayesian rules set for the risk of infestation.
5    If (WSeed is H) and (WCompetitiveness is L) and (WSPatch is S) then YLoss is M (0.72)
9    If (WSeed is H) and (WCompetitiveness is M) and (WSPatch is R) then YLoss is M (0.77)
12   If (WSeed is L) and (WCompetitiveness is H) and (WSPatch is R) then YLoss is H (0.93)
14   If (WSeed is L) and (WCompetitiveness is L) and (WSPatch is S) then YLoss is M (0.84)
18   If (WSeed is L) and (WCompetitiveness is M) and (WSPatch is R) then YLoss is M (0.92)
21   If (WSeed is M) and (WCompetitiveness is H) and (WSPatch is R) then YLoss is H (1.00)
22   If (WSeed is M) and (WCompetitiveness is L) and (WSPatch is G) then YLoss is M (0.84)
23   If (WSeed is M) and (WCompetitiveness is L) and (WSPatch is S) then YLoss is M (1.00)
27   If (WSeed is M) and (WCompetitiveness is M) and (WSPatch is R) then YLoss is H (1.00)
D    Otherwise YLoss is M (1.00)
Table 17
Pruned expert-based Bayesian network testing data set results for the infestation risk with a threshold of probability 0.7.
WSeed   WSPatch   WCompetitiveness   YLoss   R_r     Test        P(Rule R_r | X_{1,J_1}, X_{2,J_2}, X_{3,J_3}, X_{4,J_4}) (%)
L       S         L                  M       R_14    Correct     84
L       S         M                  M       R_D     Correct     100
L       S         L                  M       R_14    Correct     84
L       S         M                  M       R_D     Correct     100
M       R         M                  M       R_27    Incorrect   100
L       S         L                  M       R_14    Correct     84
L       S         H                  M       R_D     Correct     100
L       S         H                  M       R_D     Correct     100
M       R         H                  H       R_21    Correct     100
Table 18
Pruned naïve Bayes classifier testing data set results for the infestation risk with a threshold of probability 0.7.
WCoverage   WSeed   WCompetitiveness   WSPatch   YLoss   R_r     Test        P(Rule R_r | X_{1,J_1}, X_{2,J_2}, X_{3,J_3}, X_{4,J_4}) (%)
Th          L       L                  S         M       R_41    Correct     82
Th          L       L                  S         M       R_41    Correct     82
Th          L       L                  G         M       R_D     Correct     100
Th          L       L                  R         M       R_42    Correct     70
Th          L       L                  R         M       R_42    Correct     70
Th          L       L                  S         M       R_41    Correct     82
Th          L       L                  R         H       R_42    Incorrect   70
A           L       L                  R         H       R_51    Correct     75
A           L       L                  G         H       R_49    Correct     82
Table 19
Risk classification results with expert-based and naïve Bayesian networks for all 10 folds.
                            Expert-based                        Naïve
                            Accuracy (%)    Number of rules     Accuracy (%)    Number of rules
Markov blanket rules set    72.5            27                  71.4            81
Pruned rules set
  Threshold = 60%           63              15                  69.3            68
  Threshold = 70%           66.3            10                  71.4            52
  Threshold = 80%           66.3            8                   66.3            44
  Threshold = none          65.3            1                   66.32           1
Acknowledgments
This work was partially supported by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) under the Programa Nacional de Cooperação Acadêmica (PROCAD), the Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) and the Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP). We thank Dr. Décio Karam from Embrapa Milho e Sorgo, Sete Lagoas, MG, for helping to define the Bayesian network structures and for providing the data used in the experiments described in this paper.
References
Aitkenhead, M.J., Dalgetty, I.A., Mullins, C.E., McDonald, A.J.S., Strachan, N.J.C., 2003. Weed and crop discrimination using image analysis and artificial intelligence methods. Computers and Electronics in Agriculture 39 (3), 157–171.
Banerjee, S., Johnson, G.A., Schneider, N., Durgan, B.R., 2005. Modelling replicated weed growth data using spatially-varying growth curves. Environmental and Ecological Statistics 12 (4), 357–377.
Bressan, G.M., Koenigkan, L.V., Oliveira, V.A., Cruvinel, P.E., Karam, D., 2008. A classification methodology for the risk of weed infestation using fuzzy logic. Weed Research 48 (5), 470–479.
Bressan, G.M., Oliveira, V.A., Hruschka, E.R.J., Nicoletti, M.C., 2007a. Biomass based weed–crop competitiveness classification using Bayesian networks. In: Seventh International Conference on Intelligent Systems Design and Applications, IEEE Press, Rio de Janeiro, pp. 121–126.
Bressan, G.M., Oliveira, V.A., Hruschka, E.R.J., Nicoletti, M.C., 2007b. A probability estimation based strategy to optimize the classification rule set extracted from Bayesian network classifiers. In: VIII Simpósio Brasileiro de Automação Inteligente, Florianópolis, paper ID 306511.
Brooker, P.I., 1979. Kriging. Engineering and Mining Journal 180 (9), 148–153.
Cheng, J., Greiner, R., Kelly, J., Bell, D., Liu, W., 2002. Learning Bayesian networks from data: an information-theory based approach. Artificial Intelligence 137 (1), 43–90.
Cousens,R.,Mortimer,M.,1995.Dynamics of Weed Populations.Cambridge
University Press,Cambridge,UK.
Domingos,P.,Pazzani,M.,1997.On the optimality of the simple Bayesian classiﬁer
under zero–one loss.Machine Learning 29 (2–3),103–130.
Duda,R.O.,Hart,P.E.,1973.Pattern Classiﬁcation and Scene Analysis.Wiley,
New York.
Faechner,T.,Norrena,K.,Thomas,A.G.,Deutsch,C.V.,2002.A riskqualiﬁed
approach to calculate locally varying herbicide application rates.Weed
Research 42 (6),476–485.
Firbank,L.G.,Watkinson,A.R.,1985.A model of interference within plant
monocultures.Journal of Theoretical Biology 116 (2),291–311.
Friedman, N., Geiger, D., Goldszmidt, M., 1997. Bayesian network classifiers.
Machine Learning 29 (2–3), 131–163.
Gonzalez, R.C., Woods, R.E., 2002. Digital Image Processing, second ed. Prentice
Hall, Upper Saddle River, NJ.
Granitto, P.M., Navone, H.D., Verdes, P.F., Ceccatto, H.A., 2002. Weed seeds
identification by machine vision. Computers and Electronics in Agriculture
33 (2), 91–103.
Granitto, P.M., Verdes, P.F., Ceccatto, H.A., 2005. Large-scale investigation of weed
seed identification by machine vision. Computers and Electronics in
Agriculture 47 (1), 15–24.
Heckerman, D., Chickering, D.M., Meek, C., Rounthwaite, R., Kadie, C., 2000.
Dependency networks for inference, collaborative filtering, and data
visualization. Journal of Machine Learning Research 1 (1), 49–75.
Hock, S.M., Knezevic, S.Z., Martin, A., Lindquist, J.L., 2006. Soybean row spacing and
weed emergence time influence weed competitiveness and competitive
indices. Weed Science 54 (1), 38–46.
Hruschka, E., Nicoletti, M., Oliveira, V., Bressan, G.M., 2008. BayesRule: a Markov
blanket based procedure for extracting a set of probabilistic rules from Bayesian
classifiers. International Journal of Hybrid Intelligent Systems 5 (2), 83–96.
Hughes, G., Madden, L.V., 2003. Evaluating predictive models with application in
regulatory policy for invasive weeds. Agricultural Systems 76 (2), 755–774.
Isaaks, E.H., Srivastava, R.M., 1989. An Introduction to Applied Geostatistics. Oxford
University Press, New York.
Iwashita, F., Landim, P.B., 2003. GEOMATLAB: geostatistics using MATLAB
(in Portuguese). Instituto de Geologia e Ciências Exatas, Universidade Estadual
Paulista (UNESP), Rio Claro, SP, pp. 1–17, Texto Didático 12.
Jurado-Expósito, M., López-Granados, F., García-Torres, L., García-Ferrer, A.,
Sánchez de la Orden, M., Atenciano, S., 2003. Multi-species weed spatial
variability and site-specific management maps in cultivated sunflower. Weed
Science 51 (3), 319–328.
Jurado-Expósito, M., López-Granados, F., González-Andujar, J.L., García-Torres, L.,
2004. Spatial and temporal analysis of Convolvulus arvensis L. populations over
four growing seasons. European Journal of Agronomy 21 (3), 287–296.
Kropff, M.J., Spitters, C.J.T., 1991. A simple model of crop loss by weed competition
from early observations on relative leaf area of the weeds. Weed Research 31
(2), 97–107.
Marchant, J.A., Onyango, C.M., 2003. Comparison of a Bayesian classifier with a
multilayer feedforward neural network using the example of plant/weed/soil
discrimination. Computers and Electronics in Agriculture 39 (1), 3–22.
Možina, M., Demšar, J., Kattan, M., Zupan, B., 2004. Nomograms for visualization of
naïve Bayesian classifier. In: Proceedings of the Eighth European Conference on
Principles and Practice of Knowledge Discovery in Databases, Pisa, Italy, pp.
337–348.
Oerke, E.C., Dehne, H.W., Schonbeck, F., Weber, A., 1994. Crop Production and
Crop Protection. Estimated Losses in Major Food and Cash Crops. Elsevier,
Amsterdam.
Park, S.E., Benjamin, L.R., Watkinson, A.R., 2003. The theory and application of
plant competition models: an agronomic perspective. Annals of Botany 92 (6),
741–748.
Pearl, J., 1988. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible
Inference. Morgan Kaufmann, San Mateo, CA.
Primot, S., Valantin-Morison, M., Makowski, D., 2006. Predicting the risk of weed
infestation in winter oilseed rape crops. Weed Research 46 (1), 22–33.
Sakai, K., 2001. Nonlinear Dynamics and Chaos in Agricultural Systems.
Developments in Agricultural Systems, Elsevier, Amsterdam, Netherlands.
Shiratsuchi, L.S., 2001. Mapping weed spatial variability using precision farming
tools (in Portuguese). Master’s Thesis, Escola Superior de Agricultura Luiz de
Queiroz, Universidade de São Paulo, Piracicaba, SP.
Smith, A.M., Blackshaw, R.E., 2002. Crop/weed discrimination using remote
sensing. Geoscience and Remote Sensing Symposium 4, 1962–1964.
Tan, P., Kumar, V., Srivastava, J., 2002. Selecting the right interestingness measure
for association patterns. In: Eighth ACM SIGKDD International Conference on
Knowledge Discovery and Data Mining, Edmonton, Alberta, Canada, pp. 32–41,
DOI: http://doi.acm.org/10.1145/775047.775053.
Tversky, A., Kahneman, D., 1974. Judgment under uncertainty: heuristics and
biases. Science 185 (4157), 1124–1131.
Wallinga, J., Groeneveld, R.M.W., Lotz, L.A.P., 1998. Measures that describe weed
spatial patterns at different levels of resolution and their applications for patch
spraying of weeds. Weed Research 38 (5), 351–359.
Wilkerson, G.G., Wiles, L.J., Bennett, A.C., 2002. Weed management decision
models: pitfalls, perceptions, and possibilities of the economic threshold
approach. Weed Science 50 (4), 411–422.