SAP Curriculum Congress 2010


Nov 20, 2013


Business Intelligence with SAP BI and SAP BusinessObjects Software


Christine Davis


University of Arkansas

Nitin Kale


University of Southern California


© SAP AG 2010. All rights reserved.

Introduction to Data Mining

Data Mining Process

Data Mining Methods

Data Mining Case Studies

Resources

SAP University Alliances


Module BI1-M6


Introduction to Data Mining

The majority of reports are based on known facts

BUT

We don’t know what we don’t know


What is Driving Data Mining?

Changes in Technology:

Increased usage of the Internet

Appearance of data warehouses

Increase in computing power

Better modeling approaches

Changes in Competition:

Evolution of strategies: Mass marketing vs. One-to-One marketing

Increased competition

Fast-paced environment

Emergence of niche players

Changes in Customer Behavior:

Better informed

More demanding

Increased willingness to switch to competitors

Evolution of needs: more complex, harder to satisfy


Definition

Data mining is the process of discovering meaningful new correlations, patterns and trends by "mining" large amounts of stored data using pattern recognition technologies, as well as statistical and mathematical techniques.

(Ashby, Simms (1998))



Data Mining Examples

Market Basket Analysis and Up-Selling/Cross-Selling

Pharmaceutical Industry: Drug Effectiveness by Patient Type

Defect Analysis in Manufacturing

University and Employee Recruitment

Employee Turnover Predictions

Credit Risk Determination

Credit Card Fraud

Customer Grouping and Behavior Prediction


CRISP-DM: Overview



Knowledge Discovery in Databases (KDD)

Knowledge Discovery in Data is the non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data.


Advances in Knowledge Discovery and Data Mining, Fayyad, Piatetsky-Shapiro, Smyth, and Uthurusamy, (Chapter 1), AAAI/MIT Press 1996


Data Mining Models


Predictive

Supervised Learning


Data Mining Models


Explorative

Unsupervised Learning


Predictive: Decision Tree*

Customers - Historical Data (query)

Customer        Income      Age   Credit Rating   Etc.   Buying Behavior
Mick Jones      $100,000    48    Excellent              Yes
Elton Brown     $130,000    22    Fair                   No
Jack Turner     $118,000    36    Excellent              Yes
Etc.

How will other customers behave?

New Data (query)

Customer        Income      Age   Credit Rating   Etc.   Buying Behavior
Willie Nelson   $165,000    34    Fair                   ?
Carol Lee       $80,000     63    Excellent              ?
Etc.                                                     ?

Identify the factors driving customer behavior and predict future behavior

*Ayati: This example shows the common features of Decision Tree and Decision Table, which is the underlying principle of Expert Systems


Predictive: Decision Tree

Model process:

A record in the query starts at the root node

A test (in the model) determines which node the record should go to next

All records end up in a leaf node

Interpreting the Results

Read the tree from top to bottom

Rule:

If Age is less than 35 and Income is greater than $5000 and Credit standing is Excellent, then the customer has a 35% chance of buying the product

Age, then Income and credit rating, are the most influential attributes determining buying behavior.

[Figure: decision tree. Root node: Age (test: <35 or >= 35). Decision nodes: Income (test: >$5000 or <=$5000) and Credit Rating (test: Fair or Excellent). Leaf nodes: Buy 100%; Won’t Buy 100%; Will Buy 35% / Won’t Buy 65%.]
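
The model process described on this slide can be sketched in a few lines of Python. This is only an illustration, not SAP's implementation: the tuple layout, the names `TREE` and `classify`, and the sample record are hypothetical. The 35% leaf comes from the rule on the slide; which branches lead to the two 100% leaves is an assumption, and the leaf for a Fair credit rating could not be recovered, so it is left unknown.

```python
# A decision node is (attribute, test, node_if_true, node_if_false).
# A leaf node is a dict holding the estimated probability of buying.
TREE = (
    "age", lambda age: age < 35,
    (
        "income", lambda income: income > 5000,
        (
            "credit", lambda credit: credit == "Excellent",
            {"p_buy": 0.35},   # stated in the slide's rule
            {"p_buy": None},   # "Fair" leaf: value not recoverable, left unknown
        ),
        {"p_buy": 0.0},        # ASSUMPTION: this branch is the "Won't Buy 100%" leaf
    ),
    {"p_buy": 1.0},            # ASSUMPTION: this branch is the "Buy 100%" leaf
)


def classify(record, node=TREE):
    """Route one record from the root node to a leaf; return (p_buy, path)."""
    path = []
    while isinstance(node, tuple):            # still at a decision node
        attribute, test, if_true, if_false = node
        passed = test(record[attribute])      # the test picks the next node
        path.append((attribute, passed))
        node = if_true if passed else if_false
    return node["p_buy"], path


# Hypothetical record matching the slide's rule (young, higher income, Excellent).
p_buy, path = classify({"age": 30, "income": 60000, "credit": "Excellent"})
print(p_buy, path)
```

Every record follows exactly one path, so the returned path doubles as the readable rule that produced the prediction.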


A tree showing survival of passengers on the Titanic ("sibsp" is the number of spouses or siblings aboard). The figures under the leaves show the probability of survival and the percentage of observations in the leaf.

Source: Wikipedia.org


Source: Wikipedia.org


Decision Tree: Practical Applications

How can we reduce customer fraud?


Analyze customer characteristics:


Fraudulent behavior (Y or N), age, education, occupation, frequency of purchase,
dollar value of purchase, etc.

Who is likely to “churn” (stop buying from us)?


Analyze customer characteristics; who is:


(1) still with us, and


(2) no longer “on board”,


Plus other demographic or transactional attributes...

Who is likely to be a credit risk?


Analyze customer characteristics; who has:


(1) not been a credit risk in the past, and


(2) been a credit risk in the past



Include relevant customer characteristics




Weighted Score Tables

Customer group   Age     Points (Age)   Income    Points (Income)   Region   Points (Region)
Weight                   30%                      50%                        20%
1                10-19   7              25 000    2                 South    5
2                20-29   10             50 000    5                 West     3
3                30-39   2              120 000   8                 East     7

Calculated score for Customer 2:

= (10 x 30%) + (5 x 50%) + (3 x 20%) = 6.1

Use weighted scoring to rank customers according to the importance of certain attributes.
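
The calculation for Customer 2 can be repeated for all three customer groups with a short Python sketch. The points and weights are taken from the table on this slide; the names `WEIGHTS`, `POINTS` and `weighted_score` are illustrative only. Weights are kept as whole percentages so the arithmetic stays exact until the final division.

```python
# Attribute weights in percent, as given in the table's "Weight" row.
WEIGHTS = {"age": 30, "income": 50, "region": 20}

# Points per customer group, copied from the table.
POINTS = {
    1: {"age": 7, "income": 2, "region": 5},
    2: {"age": 10, "income": 5, "region": 3},
    3: {"age": 2, "income": 8, "region": 7},
}


def weighted_score(points, weights=WEIGHTS):
    """Sum of points times weight (in percent), divided by 100."""
    return sum(points[attr] * weights[attr] for attr in weights) / 100


scores = {group: weighted_score(p) for group, p in POINTS.items()}

# Rank customer groups from highest to lowest score.
ranking = sorted(scores, key=scores.get, reverse=True)

print(scores)
print(ranking)
```

Changing a weight shifts the ranking without touching the points, which is the practical appeal of a weighted score table.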


Predictive: Regression

Linear Regression

Nonlinear Regression

Use regression to predict the impact of one (or more) variables on another.

Example: impact of price reduction on sales in Regions NY, PA and TX.

Example: impact of age, income, household (HH) size, region, and length of subscription on canceling a subscription
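
As a minimal sketch of the first example, the snippet below fits a straight line by ordinary least squares with one predictor. The data are made up for illustration (they are not from the slides), and `fit_line` is a hypothetical helper name. Nonlinear regression and several predictors are outside what this sketch covers.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# HYPOTHETICAL data: price reduction in percent vs. units sold in one region.
price_cut_pct = [0, 5, 10, 15]
units_sold = [100, 110, 120, 130]

slope, intercept = fit_line(price_cut_pct, units_sold)

# Predicted sales for a 20% price reduction under the fitted line.
predicted = intercept + slope * 20
print(slope, intercept, predicted)
```

The slope is the estimated change in sales per percentage point of price reduction, which is the "impact" the slide refers to.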


Informative: Clustering

Clustering is a data mining technique that creates groups of records that are:

Similar to each other within a particular group

Very different across different groups

The degree of association between members is measured by all the characteristics specified in the analysis

Clustering helps the user explore vast amounts of data and organize it in a systematic way
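
The slides do not name a specific clustering algorithm. The sketch below uses k-means, one common choice, purely as an illustration: similarity is measured by Euclidean distance over all attributes, and each record joins the group with the nearest center. The customer data, the starting centers and the function name `k_means` are assumptions made for this example.

```python
from math import dist  # Euclidean distance, Python 3.8+


def k_means(points, centroids, max_iter=100):
    """Assign each point to its nearest centroid, then move the centroids."""
    centroids = list(centroids)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:      # assignments stopped changing
            break
        labels = new_labels
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:               # keep the old center if a cluster is empty
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centroids


# HYPOTHETICAL customers as (age, income in thousands).
customers = [(22, 30), (25, 35), (27, 32), (55, 90), (60, 95), (58, 88)]

# Fixed starting centers keep the run deterministic.
labels, centers = k_means(customers, centroids=[customers[0], customers[3]])
print(labels)
print(centers)
```

In practice the attributes are usually rescaled first, since an attribute with a large numeric range would otherwise dominate the distance.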



Informative: Clustering

[Figure: clustering plot with axes Income and Age, each ranging from Low to High.]


Informative: Clustering Process



Informative: Association Analysis

Association Analysis uncovers the hidden patterns, correlations or causal structures among a set of items or objects.

It is typically used for Market Basket Analysis (MBA).

It allows the user to:

Understand and quantify the relationship between different items (e.g. products, clickstream, etc.)

Group different items by affinity

Create readily-understandable rules describing ....

Organize web pages in order to optimize user accessibility
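
One way to quantify the relationship between two items is with the standard rule measures support, confidence and lift. The sketch below computes them for a rule "antecedent -> consequent". The baskets are invented for illustration, `rule_measures` is a hypothetical helper, and the slides themselves do not confirm which measures they use. The function assumes both items occur in at least one basket.

```python
def rule_measures(baskets, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    n = len(baskets)
    n_a = sum(1 for b in baskets if antecedent in b)
    n_c = sum(1 for b in baskets if consequent in b)
    n_both = sum(1 for b in baskets if antecedent in b and consequent in b)
    support = n_both / n                 # share of baskets containing both items
    confidence = n_both / n_a            # share of antecedent baskets with the consequent
    lift = (n_both * n) / (n_a * n_c)    # confidence divided by the consequent's base rate
    return support, confidence, lift


# HYPOTHETICAL market baskets; each set is one customer's purchase.
baskets = [
    {"A", "B"},
    {"A", "B", "C"},
    {"A", "C"},
    {"B", "D"},
    {"A", "B", "D"},
]

support, confidence, lift = rule_measures(baskets, "A", "B")
print(support, confidence, lift)
```

A lift above 1 suggests the two items are bought together more often than chance would predict; a lift below 1, as in this toy data, suggests the opposite.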





Informative: Association Analysis - Example

What products / services are typically bought together?

[Figure: data mining on customer purchases of products A to E yields cross-selling rules; the rules are exported to the Web Shop and used in merchandising.]

Amazon using Association Analysis


Informative: Association Analysis - Measures


Informative: ABC Classification

Use ABC to classify objects (such as customers, employees, vendors or
products) based on a particular measure (such as revenue or profit).

Examples:


Customers with revenue >$100M = Class “A”, etc


Customers who generate top 20% of our revenue = Class “A”, etc


Rank customers by their revenue:


The top 20% on the list = Class “A”, etc OR


The first 50 customers = Class “A”, etc

Practical applications


Classify customers into Platinum, Gold, Silver


Rank vendors based on product quality (returned goods)
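
The rank-based variant ("the top 20% on the list = Class A") can be sketched as follows. Only the 20% cutoff for Class A comes from the slide; the 30% cutoff for Class B, the customer names, the revenue figures and the function name `abc_by_rank` are assumptions made for illustration.

```python
def abc_by_rank(revenue_by_customer, a_share=0.2, b_share=0.3):
    """Rank customers by revenue and label the top shares A, then B, then C."""
    ranked = sorted(revenue_by_customer, key=revenue_by_customer.get, reverse=True)
    n = len(ranked)
    a_end = round(n * a_share)           # number of Class A customers
    b_end = a_end + round(n * b_share)   # Class B ends here (ASSUMED cutoff)
    return {
        name: "A" if i < a_end else "B" if i < b_end else "C"
        for i, name in enumerate(ranked)
    }


# HYPOTHETICAL revenue per customer: C01 has 1000, C02 has 900, ... C10 has 100.
revenue = {f"C{i:02d}": (11 - i) * 100 for i in range(1, 11)}

classes = abc_by_rank(revenue)
print(classes)
```

A threshold-based variant (for example, revenue above a fixed amount = Class A) would replace the rank cutoffs with direct comparisons on the measure.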




Informative: ABC Analysis - Example


Data Mining: Terrorism



Seisint’s Artificial Intelligence + Billions of Public Records + FAA Public Record Information + Seisint’s Data Supercomputer

Within 16 hours Seisint delivered 419 names of interest on September 14, 2001

Delivered list to authorities prior to names being made public

Five were active FBI terrorist investigations

Including hijacker: Marwin Youseff Alsherri


Data Mining: Examples

Banking

Lloyds TSB
Saved $35 million by reducing credit card fraud

HSBC
4x more leads, 37% more asset potential

Bank Financial
7x increase in response rates, 80% reduction in costs

Insurance

Aegon
Generated $30M additional revenue in service call center

FBTO
Decreased direct mailing costs by 35%, increased conversion rates by 40%, increased profit by 29%

Telecommunications

Verizon Wireless
Cut churn by 20%, saved 33% of “at-risk” clients and reduced marketing costs by 60%

Telstra
Increased sales in call centers by 120%

Other industries

Experian
Generated $2.5 million in catalog revenue while reducing hardware and software maintenance costs by 80%

Center Parcs
Added $3 million to their bottom line
Reduced mail costs by 46%

Sofmap.com (retail)
Tripled profitability of online store

De Telegraaf (media)
Reduced acquisition cost per subscription by 90%

www.spss.com/events/e_id_2247/presentation.ppt


Data Mining: Resources

Data Mining Resources Blog


http://dataminingresources.blogspot.com/

Data Mining@CCSU


http://www.ccsu.edu/datamining/resources.html

The Data Warehousing Institute


www.tdwi.org



SAP Resources


SAP University Alliances community
http://www.sdn.sap.com/irj/uac


Collaboration workspace from SAP
https://cw.sdn.sap.com/cw/index.jspa


Business Intelligence workspace: content and discussions
https://cw.sdn.sap.com/cw/community/uac/bi


SAP BusinessObjects Community
http://www.sdn.sap.com/irj/boc


University of Arkansas, Walton College Enterprise Systems
http://enterprise.waltoncollege.uark.edu/


University of Southern California, Viterbi School of Engineering, Information
Technology Program/SAP Program
http://itp.usc.edu/sap






Contact

Christine Davis

Nitin Kale


University of Southern California

3650 McClintock Ave, OHE 412

Los Angeles, CA 90089


T: +01 (213) 740-7083

F: +01 (213) 740-1051


kale@usc.edu


Thank you!