Mathematics/Applied Mathematics/ Computer Science
Machine Learning and Prediction
Hiroshi Mamitsuka, Kyoto University
Learning Probabilistic Models for Mining Labeled Ordered Trees
Learning a probabilistic model is a long-standing, highly regarded approach in machine
learning. In this approach, a probabilistic model is first designed based on
background knowledge of the target application, and the parameters of the model are
then estimated from given (training) examples. Using the model with its estimated
parameters, we can find patterns hidden in the training data and assign a score
to a newly given example. Finding frequent patterns is a general and important issue
in data mining, and prediction is likewise a major and useful goal in machine learning.
In this talk, I'll focus on labeled ordered trees, a typical example of structured
data that appears in many applications, including text, the web and biology. As an
example of probabilistic model learning, I'll present a probabilistic model for
labeled ordered trees and an efficient scheme for learning it.
For strings, a hidden Markov model is a general and widely used probabilistic model in
many applications such as speech recognition, natural language processing and
bioinformatics. In our approach, we extend the hidden Markov model and its standard
learning scheme to labeled ordered trees. In this talk, I will describe the
structure of our model for labeled ordered trees and how the model parameters are
estimated from given training examples, comparing both with their counterparts for a
hidden Markov model.
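As a point of reference for the string case, the way a hidden Markov model assigns a score to a newly given example can be sketched with the textbook forward algorithm. This is only the standard string procedure, not the tree model presented in the talk; the function name and the two-state parameter values below are hypothetical and chosen purely for illustration.

```python
def forward_likelihood(obs, init, trans, emit):
    """Return P(obs) under an HMM using the forward algorithm.

    obs   : non-empty sequence of symbols
    init  : init[s]      = P(first state is s)
    trans : trans[r][s]  = P(next state is s | current state is r)
    emit  : emit[s][sym] = P(symbol sym | state s)
    """
    n_states = len(init)
    # alpha[s] = P(symbols seen so far, current state = s)
    alpha = [init[s] * emit[s][obs[0]] for s in range(n_states)]
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[r] * trans[r][s] for r in range(n_states)) * emit[s][symbol]
            for s in range(n_states)
        ]
    # Summing over the final state gives the likelihood of the whole string.
    return sum(alpha)


# Hypothetical two-state model over the alphabet {"a", "b"} (illustrative values).
init = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [{"a": 0.9, "b": 0.1}, {"a": 0.2, "b": 0.8}]

score = forward_likelihood("ab", init, trans, emit)
```

In this sketch, parameter estimation is left out: in the standard setting the values of init, trans and emit would be learned from training strings rather than written by hand.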
Finally, I will demonstrate the predictive performance of our approach using synthetic
datasets as well as real datasets derived from glycobiology. I will also assess the
results obtained on the real data from a biological viewpoint, verifying
known facts in glycobiology.
Managing and Analyzing Carbohydrate Data.
Aoki, K. F., Ueda, N., Yamaguchi, A.,
Akutsu, T., Kanehisa, M. and Mamitsuka, H.
ACM SIGMOD Record
Application of a New Probabilistic Model for Recognizing Complex Patterns in
K. F., Ueda, N., Yamaguchi, A., Kanehisa, M., Akutsu, T. and
Proceedings of the Twelfth International Conference on Intelligent
Systems for Molecular Biology (ISMB/ECCB 2004), Supplement 1, i14), Glasgow, UK,
August, 2004, Oxford University Press.
Probabilistic Model for Mining Labeled Ordered Trees: Capturing Patterns in
Carbohydrate Sugar Chains.
Ueda, N., Aoki-Kinoshita, K. F., Yamaguchi, A., Akutsu, T. and Mamitsuka, H.
IEEE Transactions on Knowledge and Data Engineering
ProfilePSTMM: Capturing Tree-structure Motifs in Carbohydrate Sugar Chains.
K. F., Ueda, N., Mamitsuka, H. and Kanehisa, M.
Proceedings of the Fourteenth International Conference on Intelligent Systems for
Molecular Biology (ISMB 2006), e34), Fortaleza, Brazil, August, 2006, Oxford University
Press.
A New Efficient Probabilistic Model for Mining Labeled Ordered Trees.
Kinoshita, K. F., Ueda, N., Kanehisa, M. and Mamitsuka, H.
Proceedings of the Twelfth ACM SIGKDD International Conference on Knowledge Discovery
and Data Mining (KDD 2006), Philadelphia, PA, USA, August 2006, ACM Press.