
16 Oct 2013

Machine Learning Methods for Property Prediction in Chemoinformatics

Name: Rong Li

Outline


INTRODUCTION


LEARNING APPROACHES


MODELS DESCRIPTION


FUTURE WORK AND CONCLUSION


INTRODUCTION


Main challenges


QSAR


Table 1. Chemoinformatics Tasks and the Appropriate Machine Learning Concepts and Methods

LEARNING APPROACHES

Two main Models


Frequentist



Bayesian

Three levels of statistical modeling


Deterministic discriminative


Probabilistic discriminative


Generative


MODELS DESCRIPTION

Input/Output Matching


Unsupervised learning


The data are used without distinguishing between "input" and "output" variables. The goal is to analyze the data distribution, reduce data dimensionality, or reveal patterns hidden in the data.
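
As one illustration of revealing hidden patterns without any output labels, the sketch below runs plain k-means clustering on a handful of descriptor vectors. The numbers, the function name kmeans, and the choice of k-means itself are assumptions made for this example; the slides do not name a specific algorithm.

```python
from math import dist

def kmeans(points, centroids, n_iter=10):
    """Plain k-means: alternate nearest-centroid assignment and centroid update."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(n_iter):
        # Assign each point to the index of its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points
        ]
        # Move each centroid to the mean of the points assigned to it.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids

# Hypothetical two-descriptor vectors for six molecules (made-up numbers).
X = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8), (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]

# No property values (Y) are used anywhere: only the descriptor vectors.
labels, centroids = kmeans(X, centroids=[X[0], X[3]])
print(labels)
```

The initial centroids are fixed here so the run is deterministic; in practice they are usually chosen at random and the procedure is repeated several times.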

Input/Output Matching

Supervised learning


Each training example contains both inputs (X) and output labels (Y), and the task is to predict outputs for given inputs.
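
A minimal sketch of the supervised setting: a one-descriptor least-squares line is fitted to (X, Y) pairs and then used to predict Y for a new X. The training values and the helper names fit_line and predict are invented for illustration; any regression or classification method would fit the same pattern.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (single input)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    """Apply a fitted (slope, intercept) model to a new input."""
    slope, intercept = model
    return slope * x + intercept

# Made-up training set: one descriptor value (X) and one property value (Y)
# per molecule. The values lie exactly on y = 2x + 1.
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [3.0, 5.0, 7.0, 9.0]

model = fit_line(train_x, train_y)
y_new = predict(model, 5.0)  # prediction for an input not seen in training
print(model, y_new)
```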

Input/Output Matching


Semi-Supervised


Transductive Learning


Active Learning

Types of Models


Linear Models


Nonlinear Models


Logical Models

Logical Models:


ILP (Inductive Logic Programming)


ILP has been successfully applied to:

mutagenicity prediction

toxicity prediction

pharmacophore discovery


Model Interference: Inductive Transfer of Knowledge



Multitask



Feature Net learning




Tasks:


Regression


Classification


Density Estimation

Duality of Models


Primal



Dual Representations
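
To make the primal/dual distinction concrete, the sketch below uses ridge regression as an assumed example (the slides do not name a method). The primal form learns one weight per descriptor; the dual form learns one coefficient per training molecule and only needs dot products between molecules. Both give the same prediction. The data, the regularization value lam, and all function names are illustrative.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def ridge_primal(X, y, lam):
    """Primal: w = (X^T X + lam I)^-1 X^T y, one weight per descriptor."""
    d = len(X[0])
    cols = list(zip(*X))
    A = [[dot(cols[i], cols[j]) + (lam if i == j else 0.0) for j in range(d)]
         for i in range(d)]
    b = [dot(cols[i], y) for i in range(d)]
    return solve(A, b)

def ridge_dual(X, y, lam):
    """Dual: alpha = (X X^T + lam I)^-1 y, one coefficient per molecule."""
    n = len(X)
    K = [[dot(X[i], X[j]) + (lam if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    return solve(K, y)

# Made-up data: three molecules, two descriptors each.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [1.0, 2.0, 3.5]
lam = 0.5

w = ridge_primal(X, y, lam)
alpha = ridge_dual(X, y, lam)

x_new = [2.0, 1.0]
pred_primal = dot(w, x_new)
pred_dual = sum(a * dot(xi, x_new) for a, xi in zip(alpha, X))
print(pred_primal, pred_dual)
```

Because the dual form touches the data only through dot products, those can be replaced by a kernel function, which is the usual reason for working in the dual representation.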

Data Types



Chemistry mainly deals with chemical
structures and their transformations.


Molecular descriptors are used to map
molecular graphs to vectors

Three types of vectors are used in chemoinformatics:

(i) vectors of bits (bitstrings) corresponding to screens or fingerprints

(ii) vectors of integer values formed by fragment descriptors (counts of substructures)

(iii) vectors of real-valued numbers involving other types of descriptors
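
The three vector types can be pictured with a few lines of Python. All values below are made up for two hypothetical molecules. The Tanimoto coefficient is included as a common way to compare bitstring fingerprints; it is an addition for illustration and is not mentioned in the slides.

```python
# (i) Bit vectors: each position records presence (1) or absence (0)
# of a structural feature. Values are invented.
fp_a = [1, 0, 1, 1, 0, 0, 1, 0]
fp_b = [1, 1, 1, 0, 0, 0, 1, 0]

# (ii) Integer vector: each position counts occurrences of a fragment.
counts_a = [2, 0, 1, 3]

# (iii) Real-valued vector: other descriptors (hypothetical values).
real_a = [0.42, -1.30, 7.05]

def tanimoto(a, b):
    """Tanimoto coefficient: shared on-bits divided by bits set in either vector."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0

similarity = tanimoto(fp_a, fp_b)
print(similarity)
```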


FUTURE WORK AND CONCLUSION


Conclusion


In this article, we have discussed the most promising ideas, achievements, and approaches in machine learning that could be useful in chemoinformatics for improving the accuracy of predictions and the efficiency of virtual screening.

Future Work


Most of these methods are implemented in freely available software but are still little known in the chemoinformatics community. We hope that some of the recommendations given here will enrich the "modeling kit" used in computer-aided molecular design.


Questions and Thanks!