
Semi-Supervised Learning and Its Applications

Yuanyuan Guo
Faculty of Computer Science
University of New Brunswick, Canada
yuanyuan.guo@unb.ca

Outline

- Introduction of Semi-Supervised Learning (SSL)
- Our ISBOLD algorithm
- Applications of SSL in Semantics
- Conclusions
- Q & A

For many real-world problems, labeled data (L) can be scarce or expensive: human effort is needed to annotate or categorize the data. Examples: web pages, images, speech, medical outcomes, …

Unlabeled data (U) is much cheaper or easier to collect.



Can we utilize the cheap and abundant unlabeled data?




Semi-supervised learning uses both labeled data and unlabeled data to learn better classifiers.

1. Semi-supervised Learning

Paradigms of commonly used SSL methods:

- Self-training
- Co-training
- Generative models
- Semi-supervised Support Vector Machines
- Graph-based algorithms

We focus on classification tasks.
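Several of these paradigms have off-the-shelf implementations. As a quick orientation (not from the slides), here is a minimal scikit-learn sketch of two of them; the synthetic dataset and the chosen hyperparameters are illustrative, and marking unlabeled instances with -1 is scikit-learn's convention.

```python
# Minimal sketch (illustrative, not from the slides): two SSL paradigms
# as implemented in scikit-learn. Unlabeled instances carry the label -1.
import numpy as np
from sklearn.semi_supervised import SelfTrainingClassifier, LabelSpreading
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.random((200, 5))                 # synthetic features
y = (X[:, 0] > 0.5).astype(int)          # synthetic binary labels
y[20:] = -1                              # keep only 20 labels; rest unlabeled

# Self-training: wrap a base classifier that outputs probabilities.
self_training = SelfTrainingClassifier(SVC(probability=True), threshold=0.95)
self_training.fit(X, y)

# Graph-based: spread labels over a k-nearest-neighbour similarity graph.
graph_based = LabelSpreading(kernel='knn', n_neighbors=7)
graph_based.fit(X, y)
```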



General idea of self-training:

Iteratively select some unlabeled examples according to a given selection criterion, then move them (together with the labels assigned by the classifier) into the training data to build a better classifier. A common selection method is to pick the unlabeled instances on which the current classifier has high prediction confidence. The final classifier is then trained on all the "labeled" data.
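The loop below is a minimal sketch of this general idea (my own rendering, not a particular published implementation; the base classifier, confidence threshold, and iteration cap are illustrative choices):

```python
# Minimal self-training loop: a sketch of the general idea above.
# Assumes a scikit-learn-style classifier with predict_proba.
import numpy as np
from sklearn.naive_bayes import GaussianNB

def self_train(X_L, y_L, X_U, threshold=0.95, max_iter=80):
    X_train, y_train, clf = X_L.copy(), y_L.copy(), GaussianNB()
    for _ in range(max_iter):
        clf.fit(X_train, y_train)
        if len(X_U) == 0:
            break
        proba = clf.predict_proba(X_U)
        picked = proba.max(axis=1) >= threshold   # selection criterion:
        if not picked.any():                      # high prediction confidence
            break
        # Move selected instances, with classifier-assigned labels, into L.
        X_train = np.vstack([X_train, X_U[picked]])
        y_train = np.concatenate([y_train,
                                  clf.classes_[proba[picked].argmax(axis=1)]])
        X_U = X_U[~picked]
    return clf.fit(X_train, y_train)   # final classifier on all "labeled" data
```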


Existing variants of self-training algorithms presented by researchers:

- Using all the unlabeled instances to enlarge the training set, so that no selection criterion is needed, e.g. semi-supervised EM.
- Using active learning to select the most informative unlabeled instances and ask experts to label them. In principle no mislabelled examples will occur (though this is not guaranteed), and it is not applicable when experts are unavailable.
- Using different selection techniques, e.g. the Value Difference Metric (Wang et al. 2008) or the SETRED data editing method (Li et al. 2005).



2. Our ISBOLD algorithm

Unlabeled data does not always help: if many wrong labels are assigned to the selected unlabeled instances, the final performance will be jeopardized by the accumulation of errors in the expanded training set.

We therefore presented a new Instance Selection method Based on the Original Labelled Data (ISBOLD) to improve the performance of self-training.

Main idea of ISBOLD: in each iteration, after selecting the most confident unlabeled instances, the accuracy of the current classifier on the original labelled data is computed and then used to decide whether to add the selected instances to the training set in the next iteration.




ISBOLD for self-training: [flowchart figure omitted]
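Read literally, the main idea above can be sketched as follows. This is my reading of the slide text, not the published ISBOLD code: I assume a batch of confident instances is kept only when accuracy on the original labeled data L does not drop, and discarded otherwise.

```python
# Sketch of the ISBOLD idea (an interpretation of the slides, not the
# authors' code): accuracy on the ORIGINAL labeled data L gates whether
# each batch of confidently self-labeled instances is kept.
import numpy as np
from sklearn.naive_bayes import GaussianNB

def isbold_self_train(X_L, y_L, X_U, threshold=0.95, max_iter=80):
    X_train, y_train = X_L.copy(), y_L.copy()
    clf = GaussianNB().fit(X_train, y_train)
    best_acc = clf.score(X_L, y_L)        # baseline accuracy on original L
    for _ in range(max_iter):
        if len(X_U) == 0:
            break
        proba = clf.predict_proba(X_U)
        picked = proba.max(axis=1) >= threshold
        if not picked.any():
            break
        # Tentatively enlarge the training set with the confident instances.
        X_cand = np.vstack([X_train, X_U[picked]])
        y_cand = np.concatenate([y_train,
                                 clf.classes_[proba[picked].argmax(axis=1)]])
        cand = GaussianNB().fit(X_cand, y_cand)
        acc = cand.score(X_L, y_L)
        if acc >= best_acc:               # keep additions only if accuracy
            X_train, y_train, clf = X_cand, y_cand, cand   # on L does not drop
            best_acc = acc
        # Either way, remove the batch from U and continue (one plausible
        # reading; the slide leaves this detail open).
        X_U = X_U[~picked]
    return clf
```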


Experiments & Results

Experimental settings:

- 10 runs of 4-fold cross-validation:
  - 25% -> testing
  - 75% -> training, split into:
    - 75% * lp -> L (labeled data)
    - 75% * (1 - lp) -> U (unlabeled data)
- lp is set to 5%
- The maximum number of iterations is 80
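A sketch of this protocol is shown below. Only the split proportions come from the slide; the stratified L/U split, the seeding, and the names X, y (a preprocessed dataset) are my assumptions, and isbold_self_train is the sketch above.

```python
# Sketch of the evaluation protocol above. Stratification and seeding are
# assumptions; the slide specifies only the split proportions.
from sklearn.model_selection import KFold, train_test_split

lp = 0.05                                  # labeled proportion of training data
for run in range(10):                      # 10 runs ...
    folds = KFold(n_splits=4, shuffle=True, random_state=run)
    for train_idx, test_idx in folds.split(X):   # ... of 4-fold CV: 75% / 25%
        X_tr, y_tr = X[train_idx], y[train_idx]
        X_te, y_te = X[test_idx], y[test_idx]
        # Split the training part into L (75% * lp) and U (75% * (1 - lp)).
        X_L, X_U, y_L, _ = train_test_split(
            X_tr, y_tr, train_size=lp, stratify=y_tr, random_state=run)
        clf = isbold_self_train(X_L, y_L, X_U, max_iter=80)
        print(run, clf.score(X_te, y_te))  # accuracy on the held-out fold
```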


26 UCI datasets:

- 18 binary-class datasets
- 8 multi-class datasets

Data preprocessed in WEKA:

- Replace missing values;
- Unsupervised 10-bin discretization;
- Remove attributes that have little contribution to classification, e.g. "Instance_name".
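The slides did this with WEKA filters; a rough scikit-learn equivalent (my approximation of the same steps, not the original pipeline) would be:

```python
# Rough scikit-learn approximation of the WEKA preprocessing above
# (the original used WEKA's filters; this mirrors the steps, not the code).
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import KBinsDiscretizer

preprocess = make_pipeline(
    SimpleImputer(strategy='most_frequent'),       # replace missing values
    KBinsDiscretizer(n_bins=10, encode='ordinal',  # unsupervised 10-bin
                     strategy='uniform'),          # (equal-width) discretization
)
# Identifier-like attributes (e.g. "Instance_name") are dropped beforehand;
# X_raw stands for the remaining feature matrix.
X = preprocess.fit_transform(X_raw)
```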


Performance comparison on Accuracy [results figure omitted]

Performance comparison on AUC [results figure omitted]




3. Applications of SSL on Semantics

[R. J. Kate 07] applies semi-supervised learning to semantic parsing.

Semantic Parsing: transforming natural language (NL) sentences into computer-executable, complete meaning representations (MRs) for domain-specific applications.

Example: Geoquery, a database query application.

  NL question:  Which rivers run through the states bordering Texas?
      | Semantic Parsing
      v
  Query (MR):   answer(traverse(next_to(stateid('texas'))))
      | execution
      v
  Answer:       Arkansas, Canadian, Cimarron, Gila, Mississippi, Rio Grande …

* This slide is originally from R. J. Kate

SEMISUP-KRISP: Semi-Supervised Semantic Parser Learner

Supervised corpus (NL sentences annotated with MRs):
  Which rivers run through the states bordering Texas?
    answer(traverse(next_to(stateid('texas'))))
  What is the lowest point of the state with the largest area?
    answer(lowest(place(loc(largest_one(area(state(all)))))))
  ……

Unsupervised corpus (unannotated NL sentences):
  Which states have a city named Springfield?
  What is the capital of the most populous state?
  How many rivers flow through Mississippi?
  How many states does the Mississippi run through?
  …….

[Diagram: KRISP collects positive (+) and negative (-) labeled examples from the supervised corpus and trains SVM classifiers for semantic parsing.]

* This slide is originally from R. J. Kate

SEMISUP-KRISP: Semi-Supervised Semantic Parser Learner (contd.)

[Diagram, continuing the previous slide with the same two corpora: in addition to the labeled examples collected from the supervised corpus, SEMISUP-KRISP collects unlabeled examples from the unsupervised corpus and trains the SVM classifiers transductively, yielding the learned semantic parser.]

* This slide is originally from R. J. Kate

Experiments in [R. J. Kate 07]

- Compared the performance of SEMISUP-KRISP and KRISP on the Geoquery domain
- Labeled data: 250 natural language sentences annotated with their correct meaning representations
- Unlabeled data: 1037 unannotated sentences
- Increased the amount of supervised training data and measured the best F-measure

[Results chart omitted; this slide is originally from R. J. Kate]




Other applications:

- Semantic Role Labeling
- Relation Extraction
- etc.

SSL can benefit various NLP tasks (such as information extraction, question analysis, …).




Some references of SSL applications on semantics:

- Self-training and Co-training for Semantic Role Labeling: Primary Report, Shan He and Daniel Gildea, Technical Report, 2011
- A self-training approach for resolving object coreference on the semantic web, Wei Hu et al., WWW 2011
- Adapting Self-training for Semantic Role Labeling, R. S. Z. Kaljahi, ACL 2010 Student Research Workshop
- Semantic Relation Extraction Based on Semi-Supervised Learning, Li et al., AAIRS 2010
- A Graph-Based Semi-Supervised Learning for Question Semantic Labeling, Celikyilmaz et al., NAACL/HLT 2010 Workshop on Semantic Search
- Semi-Supervised Learning of Semantic Classes for Query Understanding - from the Web and for the Web, Microsoft Research, CIKM 2009
- Semi-supervised semantic role labeling, Hagen Fürstenau, EACL 2009
- Semantic Concept Classification by Joint SSL of Feature Subspaces and SVM, Jiang et al., ECCV 2008
- Semi-Supervised Learning for Semantic Parsing using Support Vector Machines, R. J. Kate et al., NAACL/HLT 2007
- More to list……

4. Conclusion

- Automatic methods of collecting data make it more important than ever to develop methods that make use of unlabeled data.
- Semi-supervised learning uses both labeled data and unlabeled data to learn better classifiers.
- Much research has been done to improve the performance of semi-supervised learning methods.
- Applying SSL in different areas such as web mining and Natural Language Processing is popular.
- There are still many potential applications to explore…


Q & A

Thank you!

Questions? Feel free to contact me!
yuanyuan.guo@unb.ca