1
Data Mining
What is Data Mining?
It is a data modeling process that covers a broad range of techniques
used in a variety of industries involved with marketing, risk,
and customer relationship management.
The success of any modeling project requires not only a good
understanding of the methodologies but also solid knowledge of the data,
the market, and the overall business objectives.
Effective use of data mining techniques is a delicate blend of art
and science.
2
Steps for Preparing a Data Mining Project
1. Setting the Objectives
2. Selecting the Data Sources
3. Preparing the Data for Modeling
4. Selecting and Transforming the Variables
5. Processing and Evaluating the Model
6. Validating the Model
7. Implementing and Maintaining the Model
8. Applications
3
Defining the Goal
To measure or to Predict?
Predictive models estimate values that represent future activity.
A descriptive model creates rules that are used to group subjects into
descriptive categories.
From a business point of view, companies use predictive and
descriptive models to attract and retain profitable customers.
4
One way to determine the objective of a targeting model or profiling
project is to ask the following questions:
Do you want to attract new customers?
Do you want those new customers to be profitable?
Do you want to avoid high-risk customers?
Do you want to understand the characteristics of your current
customers?
Do you want to make your unprofitable customers more profitable?
Do you want to retain your profitable customers?
Do you want to win back your lost customers?
Do you want to improve customer satisfaction?
Do you want to increase sales?
Do you want to reduce expenses?
5
Do you want to attract new customers?
Targeted response modeling
Do you want those new customers to be profitable?
Lifetime value modeling
Do you want to avoid high-risk customers?
Risk or approval models
Do you want to understand the characteristics of your current
customers?
Segmenting and profile analysis
Do you want to make your unprofitable customers more profitable?
Cross-sell and up-sell targeting models
Do you want to retain your profitable customers?
Retention or churn models
Do you want to win back your lost customers?
Win-back models
Do you want to improve customer satisfaction?
Do you want to increase sales?
Do you want to reduce expenses?
6
Some Terminologies
Profile Analysis
It measures common characteristics within a population of interest.
Demographics as well as consumption behaviors are typically the key
variables to be analyzed.
Segmentation
Use profile analysis to separate customers by profitability and
market potential, or by profit and risk.
Response
The goal of a response model is to predict who will respond to
an offer for a product or a service.
Risk
Approval or risk models are unique to the banking and insurance
industries, which assume the potential for loss when offering a product
or service.
Activation
Activation models predict whether a prospect will become a
full-fledged customer.
Cross-sell and up-sell
7
Cross-sell models are used to predict the probability or value of a
current customer buying a different product or service from the same
company.
Up-sell models predict the probability or value of a customer buying
more of the same products or services.
Attrition
Attrition is defined as a decrease in the use of a product or service.
The issue is to predict the act of reducing or ending the use of a
product or service after an account has been activated.
Net Present Value
A net present value (NPV) model attempts to predict the overall
profitability of a product for a predetermined length of time.
Lifetime Value
A lifetime value model attempts to predict the overall profitability of
a customer for a predetermined length of time.
8
Choose the Modeling Methodology
(Details to be discussed)
Linear Regression
Logistic Regression
Multivariate techniques for Clustering and classification
Neural Networks
Genetic Algorithms
Classification Trees
For an analytic approach to succeed, every area of the company must be
willing to work toward the same goals. Teamwork among the finance,
accounting, marketing, and information technology groups is especially
important.
9
Selecting the Data Source
There are three basic types of data:
Demographic data: describes personal or household characteristics
Gender, age, marital status, income, home ownership, dwelling type,
education level, ethnicity, presence of children, …
Behavior data: records or measurements of actions or behavior
Sales amount, types and dates of purchases, payment patterns,
customer service activities, insurance claims, bankruptcy behavior, …
Psychographic or attitudinal data: indicates intended behavior and is
characterized by opinions, lifestyle characteristics, or personal values
10
Sources of Data: Internal Sources
Customer database
Customer ID, household ID, account number, customer name,
address, phone number, demographics, products or services, offer
details, model scores, …
Each customer has a record.
Transaction database
Customer ID, account number, sales activity, date of activity
Each transaction has a record.
Offer history database
This contains details about offers made to prospects, customers, or
both.
Data warehouse
A data warehouse is a structure that links information from two or
more databases.
It is effective to integrate all internal databases into an information
data mart for general applications.
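As an illustration of linking the customer and transaction databases on a shared key, here is a minimal Python sketch. The records, the field names (customer_id, amount), and the helper link_customers_to_transactions are hypothetical and exist only for this example; a real data mart would normally be built with a database or ETL tool.

```python
from collections import defaultdict

# Hypothetical customer database: one record per customer.
customers = [
    {"customer_id": "C001", "name": "A. Smith", "household_id": "H10"},
    {"customer_id": "C002", "name": "B. Jones", "household_id": "H11"},
]

# Hypothetical transaction database: one record per transaction.
transactions = [
    {"customer_id": "C001", "date": "2024-01-05", "amount": 120.0},
    {"customer_id": "C001", "date": "2024-02-11", "amount": 80.0},
    {"customer_id": "C002", "date": "2024-01-20", "amount": 45.5},
]


def link_customers_to_transactions(customers, transactions):
    """Return one summary row per customer, joined on customer_id."""
    by_customer = defaultdict(list)
    for t in transactions:
        by_customer[t["customer_id"]].append(t)

    rows = []
    for c in customers:
        history = by_customer.get(c["customer_id"], [])
        rows.append({
            **c,
            "n_transactions": len(history),
            "total_sales": sum(t["amount"] for t in history),
        })
    return rows


mart = link_customers_to_transactions(customers, transactions)
for row in mart:
    print(row)
```

Each output row keeps the customer fields and adds behavior summaries, which is the shape a modeling data set usually needs: one record per customer.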
External Sources: list sellers and compilers for new customers
11
Selecting the best data for targeting model development requires
a thorough understanding of the market and the objective.
More and more companies are forming affinity relationships with
other companies to pool resources and increase profits.
Tip: strive to have the population from which the data is extracted be
representative of the population to be scored.
Data for prospecting: Data from a prior campaign for the same
product and to the same group is the optimal choice for data in any
targeting model. Often, new lists can be obtained from list providers
or alliances.
Data for customer models
Phone surveys of existing customers for cross-sell analysis.
Data for risk models
Credit and insurance risk data.
Usually, it is more effective to use both internal and external data
sources to build risk models over a pre-specified period.
Population and Sampling Methods
Simple random sampling, stratified random sampling
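The two sampling methods named above can be sketched in Python as follows. This is a minimal illustration using only the standard library; the function names, the 20% sampling fraction, and the made-up population of responders and non-responders are assumptions for the example.

```python
import random
from collections import defaultdict


def simple_random_sample(records, n, seed=0):
    """Draw n records without replacement, each equally likely."""
    rng = random.Random(seed)
    return rng.sample(records, n)


def stratified_random_sample(records, key, fraction, seed=0):
    """Sample the same fraction from every stratum defined by key(record)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for r in records:
        strata[key(r)].append(r)

    sample = []
    for members in strata.values():
        # Keep at least one record per stratum so small groups are not lost.
        k = max(1, round(fraction * len(members)))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical population: 10 responders and 90 non-responders.
population = [{"id": i, "responded": i < 10} for i in range(100)]

srs = simple_random_sample(population, 20)
strat = stratified_random_sample(population, lambda r: r["responded"], 0.2)
n_responders = sum(r["responded"] for r in strat)
print(len(srs), len(strat), n_responders)
```

A simple random sample may by chance contain very few responders, whereas the stratified sample keeps the responder share of the population fixed, which helps when the group of interest is rare.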
12
Prepare the data for modeling
Fixed format versus variable format
Qualitative data versus Quantitative data
Nominal data versus Cardinal data
Interval data versus Continuous data
Cleaning the Data
Examine data for possible errors, outliers, and missing values.
Need examples here for (a) identifying errors and outliers, (b)
methods of dealing with outliers, and (c) methods to replace missing
values.
Continuous variables and categorical variables should be discussed
separately. For continuous variables, unusual data points are normally
deleted or replaced. For categorical variables, missing data can often
be treated as a new category.
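One possible set of examples for points (a) to (c) is sketched below in Python. The interquartile-range rule for flagging outliers, capping at the fences, and median replacement are common choices assumed here for illustration, not methods prescribed by these notes; the data and the label "MISSING" are made up.

```python
import statistics


def iqr_bounds(values, k=1.5):
    """Return (low, high) fences; points outside are flagged as outliers."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def clean_continuous(values):
    """Replace missing values (None) with the median, then cap outliers."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    low, high = iqr_bounds(observed)
    cleaned = []
    for v in values:
        if v is None:
            v = median               # (c) replace a missing value
        v = min(max(v, low), high)   # (b) cap an outlier at the fence
        cleaned.append(v)
    return cleaned


def clean_categorical(values, missing_label="MISSING"):
    """Treat missing categorical values as their own category."""
    return [missing_label if v in (None, "") else v for v in values]


# Hypothetical data: ages with one missing value and one impossible entry.
ages = [34, 41, 29, None, 38, 45, 31, 230]
dwelling = ["house", None, "apartment", "", "house"]

low, high = iqr_bounds([a for a in ages if a is not None])
outliers = [a for a in ages if a is not None and not low <= a <= high]  # (a)
print(outliers)
print(clean_continuous(ages))
print(clean_categorical(dwelling))
```

Capping is one form of replacement. Deleting the flagged records instead is equally consistent with the text and may be preferable when the value is clearly a data entry error.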
13
Defining the Objective Function and Variable Selection
With respect to the goal of the study, a specific objective should be
formulated.
For instance,
Net present value of a product (NPV)
NPV = probability of activation * risk index * profit - marketing expenses
Each component of the above formula needs to be modeled and
estimated.
The overall NPV should be estimated by combining the estimates for the
segments of the entire market.
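The NPV formula and its combination over segments can be sketched in Python as follows. The Segment fields, the example numbers, and the choice to combine segments by weighting each per-prospect NPV by segment size are assumptions made for illustration; in practice each component would come from its own fitted model.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str
    size: int              # number of prospects in the segment
    p_activation: float    # estimated probability of activation
    risk_index: float      # estimated risk adjustment (1.0 = no adjustment)
    profit: float          # expected profit per activated customer
    marketing_cost: float  # marketing expense per prospect


def npv_per_prospect(s):
    """NPV = probability of activation * risk index * profit - marketing expenses."""
    return s.p_activation * s.risk_index * s.profit - s.marketing_cost


def overall_npv(segments):
    """Combine segments by weighting each per-prospect NPV by segment size."""
    return sum(s.size * npv_per_prospect(s) for s in segments)


# Hypothetical segments with made-up estimates.
segments = [
    Segment("low risk", 1000, 0.10, 1.0, 200.0, 5.0),
    Segment("high risk", 500, 0.20, 0.5, 200.0, 5.0),
]
for s in segments:
    print(s.name, npv_per_prospect(s))
print("overall", overall_npv(segments))
```

In this made-up case the high-risk segment activates twice as often, but its risk index halves the expected profit, so both segments end up with the same per-prospect NPV.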
Methods of variable reduction: ratios, summarization, aggregation.
Segmentation, transformation of data.
Building linear predictors, interactions, threshold models.
14
Model selection
Criteria
Stepwise, forward, backward
Stability and homogeneity of results
Validation of the model (Model Checking)
Splitting the data for fitting and validation
Resampling methods
Model Modifications
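Splitting the data and resampling can be sketched in Python as follows. The 30% validation fraction, the function names, and the made-up response outcomes are assumptions for illustration. The bootstrap is used here as one example of a resampling method; the notes do not say which method is intended.

```python
import random
import statistics


def split_data(records, validation_fraction=0.3, seed=0):
    """Shuffle a copy of the records and split into fitting and validation sets."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_valid = round(len(shuffled) * validation_fraction)
    return shuffled[n_valid:], shuffled[:n_valid]


def bootstrap_means(values, n_resamples=200, seed=0):
    """Resample with replacement and return the mean of each resample."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    ]


# Hypothetical outcome data: 1 = responded, 0 = did not respond.
outcomes = [1] * 20 + [0] * 80

fit_set, validation_set = split_data(outcomes)
means = bootstrap_means(validation_set)
print(len(fit_set), len(validation_set))
print(min(means), max(means))
```

The spread of the bootstrap means gives a rough sense of how stable the validation response rate is. A wide spread suggests the validation set is too small to support firm conclusions.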
15
Implementing and maintaining the model
Once a useful model is built, its parameters should be updated
and its validation measurements examined periodically.