Data Mining 1x - Akson

27
-
18 September 2012

Data Mining

dr Iwona Schab

Semester timetable

ORGANIZATIONAL ISSUES, INTRODUCTION TO DATA MINING




1 Sources of data in business, administration, science and technology.
2 The process of discovering knowledge in data; the role of data mining in this process.
3 Data mining and Business Intelligence.
4 SEMMA methodology.
5 Data preparation: sampling, cleaning, normalization and standardization.
6 Association rules discovery.
7 Classification problems: case studies.


8 Rule induction systems: algorithms, knowledge representation.
9 Decision trees: partition rules and pruning.
10 Classification based on probability distributions: naive Bayes estimation and Bayesian networks.
11 Grouping problems: case studies.
12 Cluster analysis: combinatorial and hierarchical methods.
13 Modeling response to direct mail marketing.
14 Churn analysis.
15 Text mining.
16 Web mining.
17 Data mining in Life Science.
18 Comparative analysis of algorithms implemented in SAS Enterprise Miner and WEKA software.
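
Topic 5 in the timetable distinguishes normalization from standardization. The sketch below is only an illustration under common textbook assumptions: normalization is taken to mean min-max rescaling to [0, 1], and standardization to mean z-scores using the population standard deviation. The function names and the sample values are hypothetical and do not come from the course materials.

```python
from statistics import mean, pstdev


def min_max_normalize(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Made-up illustrative data, not taken from the course.
incomes = [20.0, 30.0, 40.0, 50.0, 60.0]
normalized = min_max_normalize(incomes)
standardized = standardize(incomes)
```

Min-max rescaling keeps the values in a fixed range but is sensitive to outliers, while z-scores are unbounded and express each value as a distance from the mean in standard deviations.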

Literature

Basic

Paolo Giudici, Applied Data Mining. Statistical Methods for Business and Industry, Wiley, New York 2011

Supplementary

Selected papers to be circulated

Daniel T. Larose, Discovering Knowledge in Data: An Introduction to Data Mining, Wiley, New York 2005

Daniel T. Larose, Data Mining Methods and Models, Wiley, New York 2006

Statistical Analysis?

Data Mining

to mine = to extract (e.g. precious, hidden resources from the Earth)

Different definitions and understandings, depending on the user

New discipline developed from computing and statistics

In-depth search to find additional information (previously unnoticed in the mass of data available)

Data preparation and "structuring the unstructured" needed

Machine learning = finding relations and regularities in data

Generalisation from the observed data to new, unobserved cases

KDD Process

(Knowledge Discovery in Databases)
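
The slide's diagram is not reproduced in this text. As a rough illustration only, the sketch below follows the commonly cited five-step description of KDD (selection, preprocessing, transformation, data mining, interpretation/evaluation); whether the slide used exactly these steps is an assumption. All records, function names, and the 0.5 support threshold are hypothetical.

```python
from collections import Counter

# Made-up raw records, for illustration only.
raw = [
    {"customer": "A", "age": 34, "basket": ["bread", "milk"]},
    {"customer": "B", "age": None, "basket": ["bread", "butter"]},
    {"customer": "C", "age": 51, "basket": ["Milk", "bread"]},
    {"customer": "D", "age": 29, "basket": []},
]


def select(records):
    """Selection: keep only the field relevant to the question asked."""
    return [r["basket"] for r in records]


def preprocess(baskets):
    """Preprocessing (cleaning): drop empty baskets."""
    return [b for b in baskets if b]


def transform(baskets):
    """Transformation: unify item spelling and represent each basket as a set."""
    return [frozenset(item.lower() for item in b) for b in baskets]


def mine(baskets):
    """Data mining: count in how many baskets each item occurs."""
    return Counter(item for b in baskets for item in b)


def evaluate(counts, n_baskets, min_support=0.5):
    """Interpretation/evaluation: keep items whose support reaches the threshold."""
    return {
        item: count / n_baskets
        for item, count in counts.items()
        if count / n_baskets >= min_support
    }


baskets = transform(preprocess(select(raw)))
frequent = evaluate(mine(baskets), len(baskets))
```

In this arrangement the data mining step is a single function among five, which is the point usually made about its role in the wider knowledge discovery process.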





Software

www.sgh.waw.pl/ogolnouczelniane/ci/aplikacje/oprogramowanie/

SAS/STAT

SAS Enterprise Miner

---

Other: Statistica, SPSS

WEKA