Chapter 9 - Cluster Analysis

savagelizard, Artificial Intelligence and Robotics

25 Nov 2013 (3 years and 11 months ago)

Multivariate Data Analysis

Chapter 9 - Cluster Analysis


Section 3: Interdependence Techniques


What Is Cluster Analysis (Q analysis)?


Define groups of homogeneous objects (e.g., individuals,
firms, products, or behaviors)


Maximize the homogeneity of objects within the clusters
while also maximizing the heterogeneity between clusters


Segmentation and target marketing


Compare with Factor Analysis


How Does Cluster Analysis Work?


Measuring Similarity (Euclidean distance)


Forming Clusters (hierarchical procedure, agglomerative method)


Determining the Number of Clusters in the Final Solution (entropy group)
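
The three steps above (measure similarity, form clusters, decide how many to keep) can be sketched in a few lines of Python. This is a minimal illustration, not the procedure from the slides: the five data points are hypothetical, the helper names `euclidean` and `agglomerate` are made up for this example, and single linkage is assumed as the merging rule because the outline does not say which rule was used.

```python
import math

def euclidean(a, b):
    """Straight-line (Euclidean) distance between two observations."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def agglomerate(points):
    """Agglomerative procedure: start with one cluster per object and
    repeatedly merge the two closest clusters (single linkage assumed).

    Returns a list of (members_of_merged_cluster, merge_distance) steps.
    """
    clusters = [[i] for i in range(len(points))]
    steps = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(euclidean(points[p], points[q])
                        for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
        steps.append((sorted(merged), d))
    return steps

# Hypothetical data: five objects measured on two variables.
points = [(1.0, 1.0), (1.0, 2.0), (5.0, 5.0), (5.0, 6.0), (9.0, 1.0)]
steps = agglomerate(points)
for members, dist in steps:
    print(members, round(dist, 3))
```

With this data the merges occur at distances 1, 1, 5, and about 5.66. A large jump in merge distance (here from 1 to 5) is one common heuristic for where to stop merging, which would suggest keeping three clusters in this toy case.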

Cluster Analysis Decision Process


Stage One: Objectives of Cluster Analysis


Taxonomy description


Data simplification


Relationship identification


Selection of Clustering Variables



Characterize the objects being clustered


Relate specifically to the objectives of the cluster
analysis


Cluster Analysis Decision Process (Cont.)


Stage 2: Research Design in Cluster Analysis


Detecting Outliers


Similarity Measures (Interobject similarity)


Correlational Measures


Distance Measures


Comparison to Correlational Measures


Types of Distance Measures (Euclidean distance)


Impact of Unstandardized Data Values (Mahalanobis Distance, D²)


Association Measures


Standardizing the Data


Standardizing By Variables (normalized distance function)


Standardizing By Observation (within-case or row-centering standardization)
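
The two standardization options can be illustrated as follows. This is a sketch under assumptions: the function names and both small data sets are hypothetical, standardizing by variable is taken to mean converting each column to z-scores, and standardizing by observation is taken to mean subtracting each respondent's own mean (row-centering).

```python
import statistics

def standardize_by_variable(rows):
    """Convert each column (variable) to z-scores: mean 0, std. deviation 1."""
    columns = list(zip(*rows))
    means = [statistics.mean(c) for c in columns]
    sds = [statistics.stdev(c) for c in columns]  # sample standard deviation
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]

def standardize_by_observation(rows):
    """Row-centering: subtract each respondent's own mean from their scores."""
    return [[x - statistics.mean(row) for x in row] for row in rows]

# Hypothetical data: income in dollars and a 10-point rating.
# Unstandardized, income would dominate any distance measure.
mixed_scales = [[20000, 3], [30000, 5], [40000, 7]]
z = standardize_by_variable(mixed_scales)
print(z)

# Hypothetical data: two respondents with the same rating pattern,
# but one uses the high end of the scale and one the low end.
ratings = [[8, 9, 10], [2, 3, 4]]
centered = standardize_by_observation(ratings)
print(centered)
```

After z-scoring, both variables contribute on the same scale. After row-centering, the two respondents have identical profiles, which removes the response-style difference but also discards any real difference in overall level.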


Cluster Analysis Decision Process (Cont.)


Stage 3: Assumptions in Cluster Analysis


Representativeness of the Sample


Impact of Multicollinearity
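
Correlated clustering variables are implicitly weighted more heavily in a Euclidean distance. The Mahalanobis distance D² is one way to compensate, because it adjusts for the covariances among the variables. The sketch below handles only the two-variable case so that the covariance matrix can be inverted by hand; the function names and the four data points are hypothetical.

```python
def covariance_2d(rows):
    """Sample variances and covariance for two variables: (sxx, sxy, syy)."""
    n = len(rows)
    mx = sum(r[0] for r in rows) / n
    my = sum(r[1] for r in rows) / n
    sxx = sum((r[0] - mx) ** 2 for r in rows) / (n - 1)
    syy = sum((r[1] - my) ** 2 for r in rows) / (n - 1)
    sxy = sum((r[0] - mx) * (r[1] - my) for r in rows) / (n - 1)
    return sxx, sxy, syy

def mahalanobis_sq(p, q, cov):
    """D^2 = d' S^-1 d for two variables, using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    dx, dy = p[0] - q[0], p[1] - q[1]
    det = sxx * syy - sxy ** 2
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det

def euclidean_sq(p, q):
    """Squared Euclidean distance, for comparison."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

# Hypothetical data with two positively correlated variables.
data = [(1, 1), (2, 3), (3, 2), (4, 4)]
cov = covariance_2d(data)

along = ((1, 1), (4, 4))    # pair separated along the correlation
across = ((2, 3), (3, 2))   # pair separated against the correlation

print(euclidean_sq(*along), mahalanobis_sq(*along, cov))
print(euclidean_sq(*across), mahalanobis_sq(*across, cov))
```

In this toy data the squared Euclidean distances are 18 and 2, while both pairs have a D² of about 6: a small gap that runs against the correlation counts as much as a large gap that runs along it.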

Cluster Analysis Decision Process (Cont.)


Stage 4: Deriving Clusters and Assessing Overall Fit


Clustering Algorithms


Hierarchical Cluster Procedures



Single Linkage


Complete Linkage


Average Linkage


Ward's Method


Centroid Method
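
The five hierarchical procedures differ only in how they define the distance between two clusters. The sketch below uses the usual textbook definitions; the function names and the two small clusters are hypothetical, and plain (unsquared) Euclidean distance is assumed for every rule except Ward's, which is expressed as the increase in within-cluster sum of squares caused by a merge.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(cluster):
    """Mean of each variable across the cluster's members."""
    return tuple(sum(col) / len(cluster) for col in zip(*cluster))

def single_linkage(a, b):
    """Shortest distance between any member of a and any member of b."""
    return min(euclidean(p, q) for p in a for q in b)

def complete_linkage(a, b):
    """Longest distance between any member of a and any member of b."""
    return max(euclidean(p, q) for p in a for q in b)

def average_linkage(a, b):
    """Mean distance over all between-cluster pairs."""
    return sum(euclidean(p, q) for p in a for q in b) / (len(a) * len(b))

def centroid_distance(a, b):
    """Distance between the two cluster centroids."""
    return euclidean(centroid(a), centroid(b))

def within_ss(cluster):
    """Sum of squared distances from each member to the cluster centroid."""
    c = centroid(cluster)
    return sum(euclidean(p, c) ** 2 for p in cluster)

def ward_increase(a, b):
    """Ward's criterion: growth in within-cluster sum of squares if merged."""
    return within_ss(a + b) - within_ss(a) - within_ss(b)

# Hypothetical clusters of two objects each.
cluster_a = [(0.0, 0.0), (0.0, 2.0)]
cluster_b = [(3.0, 0.0), (3.0, 2.0)]

for rule in (single_linkage, complete_linkage, average_linkage,
             centroid_distance, ward_increase):
    print(rule.__name__, round(rule(cluster_a, cluster_b), 3))
```

At each step an agglomerative procedure merges the pair of clusters with the smallest value of whichever rule is chosen, so the choice of rule can change which clusters form.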


Nonhierarchical Clustering Procedures



Sequential Threshold


Parallel Threshold


Optimization


Selecting Seed Points



Should Hierarchical or Nonhierarchical Methods Be Used?



Pros and Cons of Hierarchical Methods


Emergence of Nonhierarchical Methods



A Combination of Both Methods



How Many Clusters Should Be Formed?


Should the Cluster Analysis Be Respecified?
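
A nonhierarchical procedure needs the number of clusters and a set of seed points up front, which is why it is often combined with a hierarchical run that supplies both. The sketch below is a k-means-style routine, one common form of nonhierarchical clustering; the slides do not say which algorithm they have in mind. The function name `kmeans`, the data, and the seed points are all hypothetical, and the seeds are simply assumed to come from an earlier hierarchical step.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, seeds, max_iter=100):
    """Assign each point to its nearest centroid, recompute the centroids,
    and repeat until the assignments stop changing."""
    centroids = [tuple(s) for s in seeds]
    assignment = None
    for _ in range(max_iter):
        new_assignment = [
            min(range(len(centroids)), key=lambda k: euclidean(p, centroids[k]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no object changed cluster
        assignment = new_assignment
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return assignment, centroids

# Hypothetical data: six objects measured on two variables.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]

# Hypothetical seed points, assumed to come from a prior hierarchical run
# that suggested a two-cluster solution.
seeds = [(1, 1), (8, 8)]

assignment, centroids = kmeans(points, seeds)
print(assignment)
print(centroids)
```

Unlike a hierarchical procedure, this routine lets an object change clusters on a later pass, so an early poor assignment is not locked in. The result still depends on the seeds, which is the motivation for taking them from a hierarchical solution rather than choosing them arbitrarily.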

Cluster Analysis Decision Process (Cont.)


Stage 5: Interpretation of the Clusters


Stage 6: Validation and Profiling of the Clusters


Validating the Cluster Solution


Criterion or predictive validity


Profiling the Cluster Solution



Summary of the Decision Process


An Illustrative Example


Stage 1: Objectives of the Cluster Analysis


Segment objects (customers) into groups with
similar perceptions of HATCO


HATCO can then formulate strategies with
different appeals for the separate groups.


Stage 2: Research Design of the Cluster Analysis


Identify any outliers


Similarity measure (multicollinearity: D²)


Stage 3: Assumptions in Cluster Analysis


An Illustrative Example (Cont.)


Stage 4: Deriving Clusters and Assessing Overall Fit


Step 1: Hierarchical Cluster Analysis


Step 2: Nonhierarchical Cluster Analysis


Stage 5: Interpretation of the Clusters


Two-cluster solution


Four-cluster solution


Stage 6: Validation and Profiling of the Clusters


Managerial view