CIS 690 (Implementation of High-Performance Data Mining Systems ...


Kansas State University

Department of Computing and Information Sciences

CIS 690: Data Mining Systems

Lab 0

Monday, May 15, 2000


William H. Hsu

Department of Computing and Information Sciences, KSU

http://www.cis.ksu.edu/~bhsu

High-Performance Data Mining:

Problems and Current Tools


Data Mining Software Practicum


Lab and Implementation Project Objectives


Learn to work with some KDD software tools


Understand modern ML-based, object-oriented HP-KDD systems


Main tools: MLC++ / MineSet and NCSA D2K


Implementation Project Milestones


Project plan


Due Monday, May 22, 2000 (Homework 1)


Short writeup: 1-2 pages of design plus a simple specification document


Project interim report


Due Wednesday, May 31, 2000 (Homework 2)


Preliminary itinerary and documentation


Comparative results


Using software tools


Due Thursday, June 8, 2000 (machine problem; Homework 3)


Exposure to: MineSet; SNNS, NS3; Hugin, BKD; GPSys, Genesis


Software Environments for KDD:

NCSA D2K


KDD and Software Engineering:

D2K Framework

Rapid KDD Development Environment


Performance Element:

Decision Support Systems (DSS)

Figure: effort (%) by KDD stage (Objective Determination, Data Preparation, Machine Learning, Analysis & Assimilation); axis from 0 to 60

Figure: Environment (Data Model), Learning Element, Knowledge Base, Performance Element


Model Identification (Relational Database)


Specify data model


Group attributes by type (dimension)


Define queries


Prediction Objective Identification


Identify target function


Define hypothesis space


Transformation of Data


Reduce data: e.g., decrease frequency

Select relevant data channels (given prediction objective)

Integrate models, sources of data (e.g., interactively elicited rules)


Supervised Learning


Analysis and Assimilation: Performance Evaluation using DSS
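
The steps in the outline above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the workflow of MLC++, MineSet, or D2K: the record layout, the channel names (temp, humidity, alarm), and every function name are hypothetical, and the learner is a one-attribute threshold rule chosen as the smallest possible hypothesis space.

```python
from typing import Dict, List, Sequence

Record = Dict[str, float]  # hypothetical layout: one dict per time step


def reduce_frequency(records: Sequence[Record], step: int) -> List[Record]:
    """Reduce: keep every `step`-th record (simple decimation)."""
    return list(records[::step])


def select_channels(records: Sequence[Record], channels: Sequence[str]) -> List[Record]:
    """Select: keep only the channels relevant to the prediction objective."""
    return [{name: rec[name] for name in channels} for rec in records]


def fit_threshold(xs: Sequence[float], ys: Sequence[bool]) -> float:
    """Supervised learning over the hypothesis space {x > t}.

    Candidate thresholds are the observed values; the first one with the
    fewest training errors is returned.
    """
    def errors(t: float) -> int:
        return sum((x > t) != y for x, y in zip(xs, ys))

    return min(sorted(set(xs)), key=errors)


def accuracy(threshold: float, xs: Sequence[float], ys: Sequence[bool]) -> float:
    """Performance evaluation: fraction of examples the rule labels correctly."""
    return sum((x > threshold) == y for x, y in zip(xs, ys)) / len(xs)


def to_examples(records: Sequence[Record]):
    """Split records into the input attribute and the target function's values."""
    return [r["temp"] for r in records], [r["alarm"] == 1.0 for r in records]


# Made-up data: the target function is "alarm is raised when temp exceeds 45".
raw = [
    {"temp": float(20 + 5 * i), "humidity": 40.0, "alarm": float(20 + 5 * i > 45)}
    for i in range(12)
]

reduced = reduce_frequency(raw, step=2)               # halve the sampling rate
train = select_channels(reduced, ["temp", "alarm"])   # drop the irrelevant channel
holdout = select_channels(raw[1::2], ["temp", "alarm"])  # records the reduction skipped

train_x, train_y = to_examples(train)
test_x, test_y = to_examples(holdout)

threshold = fit_threshold(train_x, train_y)
train_acc = accuracy(threshold, train_x, train_y)
holdout_acc = accuracy(threshold, test_x, test_y)

print(threshold, train_acc, holdout_acc)
```

Evaluating on the records that the reduction step skipped gives a rough check of how much information the lower sampling rate discarded; a real study would use a proper train/test protocol.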


Some Interesting Industrial Applications

Reasoning (Inference, Decision Support)

Cartia ThemeScapes - http://www.cartia.com

6500 news stories from the WWW in 1997

Planning, Control

Figure labels: Normal, Ignited, Engulfed, Destroyed, Extinguished; Fire Alarm, Flooding

DC-ARM - http://www-kbs.ai.uiuc.edu

Database Mining

NCSA D2K - http://www.ncsa.uiuc.edu/STI/ALG