Data Mining with Oracle

Nov 25, 2013

Data Mining with Oracle using Classification and Clustering Algorithms

Presented by Nhamo Mdzingwa

Supervisor: John Ebden

Overview of Presentation


Recap of Proposal


Classification of Data Mining & DM Algorithms


Oracle Data Mining


Data Mining Process


Evaluation of Results


Progress so far


Updated Timeline


Plans

Objective


Investigate two types of algorithms (classification and clustering) available in Oracle10g for data mining (ODM).

Apply the two algorithms to actual data.

Analyse and evaluate the results in terms of performance.

Classification of Data Mining



Directed DM / supervised learning

builds a model that describes one particular attribute (the target) in terms of the rest of the data.


Undirected DM / unsupervised learning

builds a model that establishes the relationships amongst all the input attributes by grouping them.
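To make the directed case concrete, the following is a minimal sketch of a categorical Naive Bayes classifier, one of the classification algorithms named in this presentation. It is a generic textbook version with add-one smoothing, not Oracle's implementation; the weather-style rows and the attribute names (`outlook`, `windy`, `play`) are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict
import math


def train_naive_bayes(rows, target):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(row[target] for row in rows)
    value_counts = defaultdict(Counter)  # (attribute, class) -> value counts
    values_seen = defaultdict(set)       # attribute -> distinct values
    for row in rows:
        for attr, value in row.items():
            if attr == target:
                continue
            value_counts[(attr, row[target])][value] += 1
            values_seen[attr].add(value)
    return class_counts, value_counts, values_seen


def predict(model, row):
    """Return the class with the highest (log) posterior score."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n in class_counts.items():
        score = math.log(n / total)  # class prior
        for attr, value in row.items():
            if attr not in values_seen:
                continue  # ignore attributes never seen in training
            count = value_counts[(attr, label)][value]
            # Add-one (Laplace) smoothing avoids log(0) for unseen values.
            score += math.log((count + 1) / (n + len(values_seen[attr])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training data: "play" is the target attribute.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "overcast", "windy": "yes", "play": "yes"},
]
model = train_naive_bayes(rows, "play")
print(predict(model, {"outlook": "sunny", "windy": "yes"}))
print(predict(model, {"outlook": "overcast", "windy": "no"}))
```

The target attribute is described in terms of the remaining attributes, which is what distinguishes the directed case from the undirected one, where no target attribute is singled out.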



Classification of Data Mining algorithms

DM strategies

Unsupervised learning (input attributes but no output attributes)

    Clustering: k-Means, O-Cluster

Supervised learning (input attributes and one or more output attributes)

    Classification: Naive Bayes, Model Seeker, Adaptive Bayes

    Estimation

    Prediction

    Predictive variance

Association Discovery

Visualization

Algorithms offered in Oracle10g

Classification

1. Adaptive Bayes Network
2. Naive Bayes
3. Model Seeker

Clustering

1. k-Means
2. O-Cluster
3. Predictive variance

Association rules

1. Apriori
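For the clustering family, the basic idea behind k-Means can be sketched in a few lines. This is the generic assign-then-update loop (Lloyd's algorithm) in plain Python, not the ODM implementation; the function names, the sample points, and the fixed random seed are assumptions made for illustration.

```python
import random


def assign(points, centroids):
    """Put each point into the cluster of its nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(
            range(len(centroids)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
        )
        clusters[nearest].append(p)
    return clusters


def k_means(points, k, iterations=20, seed=0):
    """Alternate assignment and centroid update until nothing changes."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    for _ in range(iterations):
        clusters = assign(points, centroids)
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                # New centroid is the coordinate-wise mean of the members.
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members))
                )
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, assign(points, centroids)


# Hypothetical 2-D data with two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters = k_means(points, k=2)
for centre, members in zip(centroids, clusters):
    print(centre, members)
```

No output attribute is supplied: the grouping comes only from distances between the input attributes, and the number of clusters `k` has to be chosen by the analyst.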

Evaluation of Results


Evaluation of supervised learning models involves determining the level of predictive accuracy.

Models are evaluated using test data sets.

Compare the confidence and support levels of models created from the same training data to determine accuracy.
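The measures named above can be illustrated with a short sketch: predictive accuracy on a held-out test set, and the support and confidence of an association rule. The definitions are the standard ones; the function names and the sample data are hypothetical and do not come from the project.

```python
def accuracy(predicted, actual):
    """Fraction of test cases whose predicted class matches the actual class."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent) / support(antecedent).

    Assumes the antecedent occurs in at least one transaction.
    """
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical test-set predictions versus the known classes.
predicted = ["yes", "no", "yes", "no"]
actual = ["yes", "no", "no", "no"]
print(accuracy(predicted, actual))  # 0.75

# Hypothetical market-basket transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
print(support(transactions, {"bread", "milk"}))  # 0.5
print(confidence(transactions, {"bread"}, {"milk"}))
```

Two models built from the same training data can then be compared on the same test set, so that differences in the scores reflect the models rather than the data.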

Progress


Literature Survey

Oracle10g installed on Athena in Hons Lab

Exploring the Oracle9i and 10g Suite, including JDeveloper

Member of MetaLink (Oracle’s online support service)



Updated Timeline

Continuation from literature and tutorials: done

Investigate Clustering & Classification algorithms (theory): done

Find suitable computerised case studies of the use of the above algorithms, with or without Oracle: done

Search for datasets for testing (possibilities: AIDS data & faculty data): in progress

Apply algorithms to the data found, then critically analyse & assess results: second semester

Write up paper: September vacation and 3rd term

Final project write-up: due 7/11