A Transaction Pattern Analysis System Based on Neural Network


Tzu-Chuen Lu* and Kun-Yi Wu

Department of Information Management,
Chaoyang University of Technology, Taichung 41349, Taiwan, R.O.C.
E-mail: tclu@cyut.edu.tw, kenwu@dsc.com.tw















Correspondence address:
Tzu-Chuen Lu
Department of Information Management,
Chaoyang University of Technology,
168, Jifong East Road, Wufong Township,
Taichung County 41349, Taiwan (R.O.C.)
FAX: +886-4-23742337
E-mail: tclu@cyut.edu.tw
URL: http://www.cyut.edu.tw/~tclu


A Transaction Pattern Analysis System Based on
Neural Network


Abstract

Customer segmentation is a key element of target marketing and market segmentation. Although quite a lot of segmentation methods are available today, most of them emphasize numeric calculation rather than commercial goals. In this study, we propose an improved segmentation method called Transaction Pattern based Customer Segmentation with Neural Network (TPCSNN), based on customers' historical transaction patterns. First, it filters the transaction database for records with typical patterns. Next, it reduces the inter-cluster correlation coefficient and increases inner-cluster density by iterative calculation to achieve customer segmentation. Then it utilizes a neural network to mine the patterns of consumption behavior; the results can be used to segment new customers. In this way, customer segmentation can be implemented in a very short time and at little cost. Furthermore, the segmentation results are also analyzed and explained in this study.

Keywords: market segmentation, clustering, customer segmentation, association rule mining, neural network


1. Introduction

The consumer market changes rapidly and without any settled logic, and it contains all kinds of demands. Customers' requirements will never be satisfied by merely one or two products; however, excessive products can be a burden or a risk to a company's operation [9,13,17,19]. Therefore, in order to satisfy various customer requirements within the company's capacity, we need to split the consumer market into several segments and find appropriate marketing strategies for them [3,4,5,6,14,15].

The spirit of strategic marketing proposed by Kotler is STP: Segmentation, Targeting, and Positioning [14,15]. Based on customer diversity, the complicated real-world market can be separated into several small markets with similar properties. Among them, companies can find their target markets and positions. This strategy remains a golden rule in today's business. In recent years, Kotler proposed a new brand marketing model, Create Communicate Deliver Value Target Profit (CCDVTP). It tries to create new communication channels and deliver brand values, then conducts marketing toward specific targets, and finally achieves profits. To stipulate a marketing strategy, market segmentation is the first step [14,15].

Today, many market segmentation methods are available, but most of them are based on existing clustering methods such as K-means and Density-based Spatial Clustering of Applications with Noise (DBSCAN) [1,2,9,7,8,10,11,16,18]. The user has to choose an appropriate clustering method based on the goals to be resolved or the characteristics of the database. After that decision, the user needs to find suitable parameters for the clustering, which requires being very familiar with the problem or the data characteristics in order to obtain an optimized result. In practice, this is nearly impossible for ordinary companies. Therefore, in this study we propose an easy and understandable method for ordinary users that can implement segmentation rapidly and correctly, so that business operation can be supported by the theory. Besides, most existing methods are not designed for business purposes. As a result, some adjustments to the data are required during application, and such adjustments may drive the results away from the target problems.

In 2004, Changchien and Kuo proposed a customer segmentation method called Transaction Pattern based Customer Segmentation (TPCS) [3]. Their segmentation method covers both marketing and business purposes: it utilizes customers' historical transaction data and groups the customers by similar transaction patterns, while taking marketing and business goals into consideration. However, their method is incapable of analyzing or explaining the segmentation results. In this study, we improve the TPCS method by adding an extraction mechanism for transaction data and an analysis of the segmentation results. Furthermore, neural network technology is adopted in our improvement, enabling users to obtain the segment of a new customer quickly.

The remainder of this paper is organized as follows. Section 2 is a literature review, which introduces the TPCS method and other relevant technologies. Section 3 presents our improved segmentation method. Section 4 evaluates the improvements with simulated and actual data, validated with the real system. Section 5 is a summary.


2. Related Works

2.1 Transaction Pattern based Customer Segmentation (TPCS)


TPCS looks for patterns of consumer behavior in transaction data. The segmentation can be adjusted by customer weighting to become customer oriented; this prevents a single customer's transactions from falling into different segments. Then, a Segmentation Correlation Matrix (SCM) is produced from the consumer behavior patterns and correlation coefficients to demonstrate the degree of intersection among the segments. The segmentation can be further adjusted to achieve the lowest inter-segment correlation. After that, a density indicator is added to measure inner-cluster correlation. By reducing inter-segment correlation and enhancing inner-cluster density, customer segmentation can be achieved using a combination of merging and separation strategies. However, this method has two limitations:

1. During transaction rule mining, attention is paid to the items purchased rather than the amounts.

2. A transaction is treated as a collection of purchased items; purchase order is not considered, and single-item purchases are included.

We explain the steps of TPCS segmentation with 7 transaction records involving 4 customers and 4 different items. Table 1(a) is the original transaction data. The Item column lists the commodities purchased by each customer; A, B, C, and D are item codes, and TID is the transaction number. First, we convert this table into Table 1(b), where 1 stands for yes and 0 stands for no. Minimum support is applied at this time to get rid of rare items.

Table 1. The transaction records and the converted data

(a) Original transaction data

TID  Customer ID  Item
1    Alpha        A, B, D
2    Beta         A, B
3    Charlie      C, D
4    Delta        B, C, D
5    Alpha        A, B, C
6    Beta         A, B, D
7    Delta        C, D

(b) Converted data

TID    A  B  C  D
1      1  1  0  1
2      1  1  0  0
3      0  0  1  1
4      0  1  1  1
5      1  1  1  0
6      1  1  0  1
7      0  0  1  1
Count  4  5  4  5
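The conversion from Table 1(a) to the binary form of Table 1(b) can be sketched as follows. This is an illustrative sketch rather than the authors' code; the item and customer names are those of Table 1.

```python
# Convert the raw transactions of Table 1(a) into the binary matrix
# of Table 1(b): 1 stands for "purchased", 0 for "not purchased".
ITEMS = ["A", "B", "C", "D"]

transactions = [  # (TID, customer, items purchased)
    (1, "Alpha", {"A", "B", "D"}),
    (2, "Beta", {"A", "B"}),
    (3, "Charlie", {"C", "D"}),
    (4, "Delta", {"B", "C", "D"}),
    (5, "Alpha", {"A", "B", "C"}),
    (6, "Beta", {"A", "B", "D"}),
    (7, "Delta", {"C", "D"}),
]

binary = {tid: [1 if item in items else 0 for item in ITEMS]
          for tid, _, items in transactions}

# Vertical sums per item (the Count row of Table 1(b))
counts = [sum(row[j] for row in binary.values()) for j in range(len(ITEMS))]
print(counts)  # [4, 5, 4, 5]
```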


Assume that the customers can be separated into cluster X and cluster Y, as shown in Table 2. We can then calculate the customers' Cluster Correlation Matrix for these two segments using the expected value and probability of the rules, together with the permutations and combinations of the four items, as shown in Table 3. If we assume that the profit of each product is 1, the probability column in the table gives the occurrence probability of each rule in the cluster, and the expected-value column gives the corresponding expected value. For example, rule 12 is 1100; it indicates that the occurrence probability of transactions purchasing both A and B in cluster X is 1/7, and the expected value is 2/7.


Table 2. Initial segmentation results

(a) Cluster X

TID  A  B  C  D
1    1  1  0  1
2    1  1  0  0
3    1  1  1  0
4    1  1  0  1

(b) Cluster Y

TID  A  B  C  D
1    0  0  1  1
2    0  1  1  1
3    0  0  1  1


Table 3. Customer Correlation Matrix

               Cluster X             Cluster Y
Rule           Prob.    Exp. value   Prob.    Exp. value
 1   0001      0        0            0        0
 2   0010      0        0            0        0
 3   0011      0        0            2/7      4/7
 4   0100      0        0            0        0
 5   0101      0        0            0        0
 6   0110      0        0            0        0
 7   0111      0        0            1/7      3/7
 8   1000      0        0            0        0
 9   1001      0        0            0        0
10   1010      0        0            0        0
11   1011      0        0            0        0
12   1100      1/7      2/7          0        0
13   1101      2/7      6/7          0        0
14   1110      1/7      3/7          0        0
15   1111      0        0            0        0
Correlation Coefficient (CC): -0.1721


Then, we calculate the inner-cluster density. The number of data records is divided by the sum of distances between each transaction record and the cluster centroid; the result serves as a density indicator of inner-cluster compactness. The distances are calculated as Euclidean distances. We can calculate the centroid of every cluster from Table 2; Table 4 shows the results.


Table 4. Centroid list

(a) Cluster X

Item  Count  Mean
A     4      1
B     4      1
C     1      1/4
D     2      1/2

(b) Cluster Y

Item  Count  Mean
A     0      0
B     1      1/3
C     3      1
D     3      1


From Table 4, we find that the density of cluster X is 2.2857 and the density of cluster Y is 4.4998; the effect of the two clusters is 42.4953. The algorithm pairs every two clusters and looks for the combination with the worst effect for adjustment. If all possible combinations have been tested and none of them is better than the original clustering, the segmentation is finished.
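The density calculation can be reproduced with a short sketch. One assumption is worth flagging: the worked values 2.2857 and 4.4998 (which appears to be a rounded 4.5) come out when the denominator is the sum of squared Euclidean distances to the centroid, so that variant is used here.

```python
# Inner-cluster density: number of records divided by the total
# (squared) distance between each record and the cluster centroid.
def density(cluster):
    n = len(cluster)
    m = len(cluster[0])
    centroid = [sum(row[j] for row in cluster) / n for j in range(m)]
    total = sum((row[j] - centroid[j]) ** 2
                for row in cluster for j in range(m))
    return n / total

# The binary patterns of Table 2
cluster_X = [[1, 1, 0, 1], [1, 1, 0, 0], [1, 1, 1, 0], [1, 1, 0, 1]]
cluster_Y = [[0, 0, 1, 1], [0, 1, 1, 1], [0, 0, 1, 1]]

print(round(density(cluster_X), 4))  # 2.2857
print(round(density(cluster_Y), 4))  # 4.5
```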


2.2 Back-Propagation Network (BPN)

Although segmentation can be implemented effectively by TPCS, the method has no explanatory capability. Therefore, BPN is adopted for the analysis and learning of the segmentation [12,13]. BPN is a kind of supervised learning network that is widely used in many applications [20], with very good performance in diagnosis, prediction, classification, and optimization. A BPN has three kinds of layers: an input layer, hidden layers, and an output layer. Each layer contains several processing units. The input layer feeds in data and the output layer sends out results; there may be several hidden layers between them, which represent the interactions among the input processing units. Today, BPN is the most widely adopted network, with many success stories, high learning precision, and fast recall.


3. The Proposed Method

In this study, we propose an improved method named Transaction Pattern based Customer Segmentation with Neural Network (TPCSNN). The following goals are expected to be achieved.

1. Minimized inter-cluster correlation: the correlation can be calculated with the Cluster Correlation Matrix (CCM). Segmentation is achieved by reducing inter-cluster correlation.

2. Maximized inner-cluster correlation: Euclidean distance is used to generate a density indicator in order to improve the incompact structure within each cluster.

3. A fast responding system: normal algorithms consume a great deal of time in the segmentation operation. In this study, neural network technology is adopted to find the best cluster for a new customer.

The major steps are shown below.

1. Extract customer data.

2. Initial segmentation: transaction records with equivalent patterns are classified into one cluster.

3. Adjust the segmentation to be customer oriented: the initial segmentation in Step 2 is based on transaction patterns, so transaction records of the same customer may be grouped into different clusters. The clusters should be converted to be customer oriented to ensure that every customer belongs to a single cluster.

4. Transaction rule mining on all clusters: transaction patterns are mined from each cluster; these rules represent the consumer properties of the cluster.

5. Inter-cluster correlation calculation: the cluster correlation matrix is generated using the correlation coefficients among the clusters.

6. Inner-cluster density calculation: divide the number of transaction records in the cluster by the sum of distances between the centroid and all transaction data. The result evaluates the compactness of the cluster.

7. Evaluating clustering: clusters are paired freely to evaluate the clustering. The worst cluster pair is picked as the improvement candidate.

8. Improvement: through three segmentation adjustment strategies, we find the best improvement. If none of the strategies works, go back to Step 7 and pick the second worst pair. Repeat this operation until all combinations are tested. If an improvement is found, go back to Step 3 and repeat the calculation.

9. Result analysis: an interface is provided to the user for analyzing inter-cluster or inner-cluster transaction situations.

10. Building the BPN model: the segmentation results generated by the above steps are used to train a BPN capable of rapid segmentation.


3.1 Extraction of customers' transaction data

Take small and medium businesses as examples. Generally, many products are sold and thousands of customers take part in the transactions, so we do not include every item in the analysis. We focus on the important products with top purchases, and the segmentation likewise targets the customers who may purchase these products. Thus, the company can conduct further classification and marketing for the customers who may purchase popular items.

During the import from the Enterprise Resource Planning (ERP) system, product numbers are often established for services or expenses just for convenience, such as installation and delivery expenses, gifts, and rent. These products are not purchased by the customers on their own initiative; they are attached by the manufacturers, and should be removed when filtering the data. Besides, the product numbers have different leading codes indicating their types; this filtering condition can be provided to the users. Figure 1 shows the conditions of data extraction.


Figure 1. Screenshot of ERP data filtering


3.2 Initial segmentation

In this stage, we convert the customers' transaction records into the format required by the system. Transaction data with equivalent rules are merged into the same cluster; at this point, we do not care about the owner of the transaction. Table 1 demonstrates the transformation.


3.3 Adjust the segmentation to be customer oriented

In the previous stage, the same customer can be put into different clusters, which conflicts with our purpose. To ensure that all records of the same customer fall into the same cluster, we determine the master cluster by transaction ratio. If two or more clusters tie for the highest percentage, benefits are considered; currently, the first such cluster is taken as the master cluster. Table 5 demonstrates how the choosing works. In this table, the probabilities of classifying the transaction records of customer Alpha into clusters 1, 2, 3, and 4 are 50%, 25%, 12.5%, and 12.5% respectively. Therefore, cluster 1 becomes the master cluster for customer Alpha.



Table 5. Choosing the master cluster

Customer  Cluster  Transaction IDs  Probability  Master cluster
Alpha     1        3, 4, 5, 6       50%          Yes
          2        2, 3             25%
          3        1                12.5%
          4        5                12.5%
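The master-cluster choice of this step can be sketched as below. The helper name is hypothetical; the tie-breaking rule (take the first cluster) follows the text.

```python
# Step 3 (customer-oriented adjustment): the master cluster of a
# customer is the cluster holding the largest share of that customer's
# transaction records; ties go to the first (smallest-id) cluster.
def master_cluster(transaction_clusters):
    """transaction_clusters: one cluster id per transaction record."""
    counts = {}
    for c in transaction_clusters:
        counts[c] = counts.get(c, 0) + 1
    # highest count wins; on a tie, the smaller cluster id wins
    return max(counts, key=lambda c: (counts[c], -c))

# Customer Alpha from Table 5: 8 records spread over four clusters
alpha = [1, 1, 1, 1, 2, 2, 3, 4]   # shares: 50%, 25%, 12.5%, 12.5%
print(master_cluster(alpha))  # 1
```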



3.4 Transaction rule mining on the clusters

Suppose all transaction data are represented as T = {t_1, t_2, ..., t_n}, and the products purchased in each record are represented as t_i = (b_i1, b_i2, ..., b_im), where b_ij is 0 or 1 and m is the number of items. If b_ij = 1, product j is purchased in transaction i; if b_ij = 0, product j is not purchased in that transaction. For example, with four available products A, B, C, and D, the transaction record expression 1101 indicates that A, B, and D are purchased in that transaction.

After all transaction data have been converted, we calculate the vertical sum for each product, as shown in Table 1(b). At this point, minimum support enters the judgment. If the minimum support is 0.6 and there are 7 transactions, a single product must have more than 7 x 0.6 = 4.2 occurrences to be included in the calculation. As a result, products A and C listed in Table 1(b) are excluded from the calculation. After removing the products with low occurrence, we find the transaction patterns based on the remaining products.

In this study, we ask the users to choose the products with top sales for analysis in the stage of customer data extraction. Since that selection is based on subjective judgment, minimum support is not needed there.
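The minimum-support judgment described above (with support 0.6 over 7 transactions, an item needs more than 4.2 occurrences) can be sketched as:

```python
# Keep only the items whose occurrence count exceeds
# n_transactions * min_support.
def frequent_items(counts, n_transactions, min_support):
    threshold = n_transactions * min_support
    return [item for item, c in counts.items() if c > threshold]

counts = {"A": 4, "B": 5, "C": 4, "D": 5}   # the Count row of Table 1(b)
print(frequent_items(counts, 7, 0.6))  # ['B', 'D']
```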

3.5 Inter-cluster correlation

The correlation coefficient is a statistical measure of the linear relationship between two variables and of their compactness: the closer its absolute value is to 1, the closer the correlation between the two clusters. To achieve low inter-cluster correlation, the lower the correlation coefficient, the better. We use the expected profit of each transaction pattern to calculate the correlation coefficient between two clusters and present it in the cluster correlation matrix. Denoting the correlation coefficient of clusters X and Y by CC(X, Y), the calculation formula is

  CC(X, Y) = Sum_r (E_X(r) - mean(E_X)) (E_Y(r) - mean(E_Y)) / sqrt( Sum_r (E_X(r) - mean(E_X))^2 * Sum_r (E_Y(r) - mean(E_Y))^2 ),   (1)

where the patterns r range over the combinations of the products to be analyzed, E_X(r) and E_Y(r) are the expected profits of pattern r in each cluster, and mean(E_X) and mean(E_Y) are the average expected profits over all patterns.

To calculate the expected profit of a transaction pattern, we also define the occurrence probability of pattern r in cluster X as

  P_X(r) = N_X(r) / N,   (2)

where N_X(r) is the number of transaction records in cluster X matching pattern r and N is the total number of records in the transaction database. As shown in Table 3, the transaction pattern 1101 has two occurrences in cluster X and the total number of transactions is 7, so the occurrence probability of this pattern in cluster X is 2/7, or about 0.2857.

Then, we utilize

  E_X(r) = Sum_{j=1..m} profit_j * b_j(r) * P_X(r)   (3)

to calculate the expected profit of the transaction pattern. Here, we put our emphasis on the purchase pattern, so the profit can be ignored: we assume the profit of every product is 1. In the equation, b_j(r), indicated by 0 or 1, represents whether item j is purchased in pattern r under cluster X.

For the transaction pattern 1101 listed in Table 3, the expected profit is 0.8571, since (1 x 1 x 0.2857) + (1 x 1 x 0.2857) + (1 x 0 x 0.2857) + (1 x 1 x 0.2857) = 0.8571. The average expected profit of a cluster is the sum of all its expected profits divided by the number of patterns. For cluster X listed in Table 3, the average expected profit is 0.1048, since (2/7 + 6/7 + 3/7) / 15 = 0.1048.
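Putting the probability and expected-profit definitions together, the correlation coefficient of Table 3 can be reproduced with a short sketch, assuming unit profits and a Pearson correlation over all 15 patterns:

```python
from math import sqrt

def pattern_expected_profits(cluster, n_total):
    """Expected profit of every non-empty pattern over 4 items."""
    profits = []
    for r in range(1, 16):                      # patterns 0001 .. 1111
        bits = [(r >> (3 - j)) & 1 for j in range(4)]
        occurrences = sum(1 for row in cluster if row == bits)
        p = occurrences / n_total               # Equation (2)
        profits.append(sum(bits) * p)           # Equation (3), profit = 1
    return profits

def correlation(ex, ey):
    """Pearson correlation of two expected-profit vectors."""
    n = len(ex)
    mx, my = sum(ex) / n, sum(ey) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(ex, ey))
    vx = sum((a - mx) ** 2 for a in ex)
    vy = sum((b - my) ** 2 for b in ey)
    return cov / sqrt(vx * vy)

cluster_X = [[1, 1, 0, 1], [1, 1, 0, 0], [1, 1, 1, 0], [1, 1, 0, 1]]
cluster_Y = [[0, 0, 1, 1], [0, 1, 1, 1], [0, 0, 1, 1]]
cc = correlation(pattern_expected_profits(cluster_X, 7),
                 pattern_expected_profits(cluster_Y, 7))
print(round(cc, 4))  # -0.1721
```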


3.6 Calculation of inner-cluster density

In most cases, the number of products with historical transactions is far smaller than the company's total number of products. For example, we may analyze 50 of the company's products while fewer than 10 products appear in any single transaction record. Therefore, we define the density

  D_c = n_c / d_c   (4)

as the number of records per unit of distance. This formula uses Euclidean distance to calculate the total distance d_c between each transaction and the centroid. As shown in Equation 5, the smaller the distance, the more similar the cluster members:

  d_c = Sum_{i=1..n_c} Sum_{j=1..m} (x_ij - xbar_j)^2,   (5)

where x_ij is the value of product j in transaction i, xbar_j is the average value of product j over all transactions in the cluster, and n_c is the total number of transactions in the cluster.




3.7 Evaluations

We merge the inter-cluster correlation indicator and the inner density indicator described in the previous steps. The numerator is the sum of the densities of the two clusters: the larger, the better. The denominator is the correlation between the two clusters, in other words the correlation coefficient: the smaller, the better. We can then generate an evaluation formula for segmentations. This paper uses the Total Clustering Effectiveness (TCE) to describe the effect of a segmentation:

  TCE = Sum_{i=1..k-1} Sum_{j=i+1..k} (D_i + D_j) / |CC(i, j)|,   (6)

where k is the total number of clusters, i and j are any two clusters, D_i is the inner-cluster density of cluster i, and CC(i, j) is the inter-cluster correlation between clusters i and j. As shown in Equation 6, the larger the value, the better.
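As a sketch, assuming the reconstructed forms ECE(i, j) = (D_i + D_j) / |CC(i, j)| and TCE as the sum of ECE over all cluster pairs (note that with the rounded example values 2.2857, 4.4998, and -0.1721 this gives about 39.43 rather than the 42.4953 reported in Section 2.1, so the paper's exact denominator may differ slightly):

```python
from itertools import combinations

def ece(d_i, d_j, cc_ij):
    # effectiveness of one cluster pair: density sum over correlation
    return (d_i + d_j) / abs(cc_ij)

def tce(densities, cc):
    """densities: {cluster: D}; cc: {(i, j): correlation coefficient}."""
    return sum(ece(densities[i], densities[j], cc[(i, j)])
               for i, j in combinations(sorted(densities), 2))

densities = {"X": 2.2857, "Y": 4.4998}
cc = {("X", "Y"): -0.1721}
print(round(tce(densities, cc), 2))  # 39.43
```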


3.8 Improvements

After the first segmentation, we continue to look for possible improvements to the current clustering. We pick out the combination with the worst performance, in other words the combination with the lowest Each Cluster Effectiveness (ECE), defined as

  ECE(i, j) = (D_i + D_j) / |CC(i, j)|.   (7)

After finding the worst combination, we adjust it with three different strategies and see whether the segmentation can be improved. If the TCE value becomes better after an adjustment, the adjustment is accepted and the clusters are reallocated. If the TCE value is not better than before, the original segmentation is kept and the second worst combination is picked for adjustment. If the TCE value shows no improvement after all cluster combinations have been adjusted, we can confirm that the current segmentation result is the best. The three strategies for segmentation improvement are:

1. Merge: the original two clusters may be so similar that they should actually become one. Keeping them as two clusters results in an excessive correlation coefficient and a lowered TCE value.

2. Splitting into three: the intersection between the two clusters can be excessively large, which results in an excessively large correlation coefficient. The intersection may be better treated as an independent cluster.

3. The owner of the intersection: another possibility is that the intersection has the wrong owner; for example, a part that should belong to cluster A is actually under cluster B. Therefore, on a per-customer basis, check the distance between the transaction records and the centroids, and put each customer under the right cluster.


3.9 Result analysis

After the segmentation is finished, we provide an interface for users to analyze inter-cluster or inner-cluster transaction properties, as shown in Figure 2 and Figure 3. Here, we use the cluster centroids and Euclidean distance to calculate the similarity. The formula is

  Sim = 1 - d / d_max,   (8)

where Sim is the similarity, d is the actual distance, and d_max is the maximum possible distance. We normalize by the maximum distance so that the similarity is a ratio, which is easy for the users to understand.

For example, suppose there are ten products in total. The maximum distance occurs between 0000000000 and 1111111111 and is about 3.1623. We then calculate the actual distance between the two clusters and normalize it to obtain the proportion of this distance in the maximum distance. For instance, if the actual distance is 1.0123, the distance ratio between the clusters is 1.0123 / 3.1623 = 0.3201, which means that the similarity between the two clusters is 1 - 0.3201 = 67.99%.
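The similarity measure of Equation (8) can be sketched as follows, using the ten-product example from the text:

```python
from math import sqrt

def similarity(centroid_a, centroid_b):
    """Inter-cluster distance normalized by the maximum possible
    distance for m binary products (all-zeros vs. all-ones)."""
    m = len(centroid_a)
    d = sqrt(sum((a - b) ** 2 for a, b in zip(centroid_a, centroid_b)))
    d_max = sqrt(m)
    return 1 - d / d_max

# Ten products, actual distance 1.0123 (the example in the text):
print(round(1 - 1.0123 / sqrt(10), 4))  # 0.6799
```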


Figure 2. Purchase analysis

Figure 3. Customer similarity analysis


3.10 Building the BPN Model

BPN is well suited to segmentation thanks to its learning precision and rapid recall. Therefore, we build a BPN for rapid customer segmentation using PCNeuron, with our customer transaction data and the final segmentation results as training data. The processing unit collection for the input layer is {x_1, ..., x_n}, where n is the total number of products in the transactions. The processing unit collection for the output layer is {y_1, ..., y_m}, where m is the final number of clusters.

The hidden layers may consist of 1-2 layers, which is enough to reflect most situations; excessive layers only increase complexity. After testing, we find that one hidden layer already gives good results, so only one hidden layer is adopted. The number of units in the hidden layer changes according to the number of products being analyzed and the final segmentation result. Here, we define:

  Units in hidden layer = (n + m) / 2.   (9)

The initial weight region is set to 0.3 and the weights are generated by a random number generator (seed 0.456), so the weight between two processing units is a random number between -0.3 and +0.3. The weight region and random seed can be user defined. The number of training iterations can be set to 50. The learning rate is a value between 0 and 1; tested by experiment, the learning rate should be above 0.5 for a better effect. The configuration menu is shown in Figure 4.



Figure 4. Configuration of BPN training parameters
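A one-hidden-layer back-propagation network of the kind described above can be sketched in a few lines. This is a generic numpy sketch, not PCNeuron's implementation: it assumes sigmoid units, the uniform initialization in [-0.3, +0.3] mentioned in the text, and uses the tiny Table 2 data set in place of real training data.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def train_bpn(X, Y, n_hidden, lr=0.5, epochs=5000, init=0.3):
    # weights drawn uniformly from [-init, +init], as in Section 3.10
    W1 = rng.uniform(-init, init, (X.shape[1], n_hidden))
    W2 = rng.uniform(-init, init, (n_hidden, Y.shape[1]))
    for _ in range(epochs):
        H = sigmoid(X @ W1)               # hidden activations
        O = sigmoid(H @ W2)               # output activations
        dO = (Y - O) * O * (1 - O)        # output-layer delta
        dH = (dO @ W2.T) * H * (1 - H)    # back-propagated hidden delta
        W2 += lr * H.T @ dO
        W1 += lr * X.T @ dH
    return W1, W2

def predict(X, W1, W2):
    return sigmoid(sigmoid(X @ W1) @ W2).argmax(axis=1)

# Patterns of Table 2 with their cluster labels (X -> 0, Y -> 1)
X = np.array([[1, 1, 0, 1], [1, 1, 0, 0], [1, 1, 1, 0], [1, 1, 0, 1],
              [0, 0, 1, 1], [0, 1, 1, 1], [0, 0, 1, 1]], dtype=float)
Y = np.eye(2)[[0, 0, 0, 0, 1, 1, 1]]      # one-hot cluster labels
W1, W2 = train_bpn(X, Y, n_hidden=3)      # (4 + 2) / 2 = 3 hidden units
print(predict(X, W1, W2))
```

After training, segmenting a new customer amounts to one forward pass through `predict`, which is what makes the recall step fast.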


4. Results and Discussions

4.1 Virtual data

In the experiments, we run tests on 51 virtual transactions with 17 customers and 27 products in total. The expected number of segments is 5; the result consists of 7 clusters.



Figure 5. The distribution of purchases


We can also examine whether or not a customer is suitable for its cluster. From Figure 6, we can see that all 5 customers are closest to the centroid of cluster 2, which means they are the most similar customers. Therefore, we can say that the segmentation is 100% correct.



Figure 6. The similarity of customers in the same cluster to other clusters


4.2 Actual data

In this study, we carry out segmentation analysis on three different types of companies in order to validate the correctness of the TPCSNN method. For the sake of privacy, the real data is replaced by codes, as shown in the following.

Table 6. Company types

Types                 Business range                              Number of customers  Number of products
Rice company          Production and sales of rice                10,564               268
Candy dealer          Japanese candies, cookies, chocolate, etc.  1,179                1,958
Home appliance agent  Agents for Japanese home appliances         1,990                9,305


Table 7 lists the results of the validation based on real data. We analyze the correctness of every cluster to see the chance of misjudgment. The numbers listed in the table are the final segmentation results.






Table 7. The segmentation results

(a) Segmentation result for Rice Company

Rice Company: 50 products, 50 analyzed customers, duration 3 months, 2,189 transactions

Inter-cluster similarity
Cluster     Cluster 1  Cluster 2  Cluster 3
Cluster 1   100.00%    79.04%     58.48%
Cluster 2   79.04%     100.00%    71.30%
Cluster 3   58.48%     71.30%     100.00%
Inner cluster correctness: 89.96%, 100.00%, 100.00%


(b) Segmentation results for Candy Dealer

Candy Dealer: 50 products, 100 customers, duration 3 months, 1,354 transactions

Inter-cluster similarity
Cluster     Cluster 1  Cluster 2  Cluster 3  Cluster 4  Cluster 5
Cluster 1   100.00%    79.58%     76.78%     86.94%     78.42%
Cluster 2   79.58%     100.00%    70.91%     77.52%     67.00%
Cluster 3   76.78%     70.91%     100.00%    70.21%     66.91%
Cluster 4   86.94%     77.52%     70.21%     100.00%    71.01%
Cluster 5   78.42%     67.00%     66.91%     71.01%     100.00%
Inner cluster correctness: 90.07%, 100.00%, 100.00%, 100.00%, 100.00%


(c) Segmentation result for Home Appliance Agent

Home Appliance Agent: 50 products, 50 analyzed customers, duration 3 months, 4,633 transactions

Inter-cluster similarity
Cluster     Cluster 1  Cluster 2  Cluster 3
Cluster 1   100.00%    93.32%     93.74%
Cluster 2   93.32%     100.00%    89.79%
Cluster 3   93.74%     89.79%     100.00%
Inner cluster correctness: 89.96%, 97.87%, 100.00%



4.3 Building the BPN Model

We train the BPN using the segmentation results generated by the previous steps. After that, when a new customer needs to be segmented, we simply input the target products' transaction records and quickly obtain the segmentation result. Let's validate this with the data of the Rice Company. Table 8 lists the procurement probabilities of the various products after segmentation.


Table 8. Procurement probabilities for various clusters of Rice Company

Item  Cluster 1  Cluster 2  Cluster 3
 1    0.0524     0.3465     0.9630
 2    0.4178     0          0
 3    0.2581     0.0990     0
 4    0.0078     0.1386     0
 5    0          0.2475     0
 6    0.2475     0.1683     0
 7    0.0262     0.3267     0.7778
 8    0.0204     0.1485     0
 9    0.0238     0.0693     0
10    0.1087     0.0198     0
11    0.0068     0          0
12    0.0213     0          0.6667
13    0.3658     0.3168     0
14    0.0344     0.3564     0.8148
15    0.0844     0.2277     0.4444
16    0.3372     0.1782     0.2222
17    0.3857     0          0
18    0.3338     0.1287     0
19    0.1339     0.1980     0.0741


We simulate the data distribution of a customer in cluster 2 and input two transaction records for validation; it is expected that this customer is segmented into cluster 2. The previous segmentation results are adopted for training. The result shows that this customer is indeed segmented into cluster 2, just as expected, and the judgment takes only a few seconds, as shown in Figure 7.



Figure 7. The segmentation result of the simulated data


4.4 Discussions

After the validation with real data, some points are worth discussing.

4.4.1 Time costs of TPCSNN

The TPCSNN method adopted in this study requires many repeated calculations. For example, if we choose n products for the analysis, there are up to 2^n - 1 patterns to be calculated, and the calculation has to be repeated whenever a cluster is modified, so the time consumption is quite considerable. On a Pentium 4 3.0 GHz PC, it takes about one day to finish the required calculation for 50 products; for 200 products, the calculation may take up to 5 days. This situation can be improved in future research. However, this cost is incurred only for the first execution of the analysis.


4.4.2 Filtering customer transaction data

TPCSNN takes a long time in the calculation, and a company generally has many products available, so we want the analysis to concentrate on focused products. Here, we choose the most popular products for the analysis rather than selecting by price: expensive items have fewer customers and are not useful in the segmentation test. Besides, the sources of this study are filtered ERP shipping orders. Items such as gifts or shipment charges are not of interest, yet such products occur frequently on shipping orders; they should be filtered out to avoid confusion.



4.3.3 The impact of transaction data on the segmentation

If you study the actual segmentation, you can find that the customers in the same
cluster are not always very similar in purchase
properties.
This is because there are a
lot of purchase
patterns

in the reality and a lot of transactions follow these
patterns
.
Furthermore, the products and customers selected by us are all important. It’s possible
that the company has several major cust
omers and these customers cover most of the
transactions. It’s quite common that these customers’ transaction data are close to the
others and result in
a clustering effect centered on this customer. For example, A is
similar to B and C. But B is not simil
ar to C. However, all of three are supposed to be
in the same cluster. This brings some confusion in the analysis.

Besides, if there are excessive transaction patterns, the program may not handle the segmentation very precisely. For example, suppose A and B each have 5 transactions and 5 transaction patterns, but only one pattern is common to both. Since two highly related clusters are merged during the TCE calculation, A and B will end up in the same cluster because of the high TCE value, even though they share only a single pattern. Nevertheless, this behavior is reasonable.
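A hypothetical sketch of this merge behavior, using a simple Jaccard overlap in place of the paper's TCE formula (which is not reproduced here): with a low enough merge threshold, a single shared pattern out of five per customer is already sufficient to put A and B in the same cluster.

```python
# Assumed pattern sets: 5 patterns each, exactly one ("p5") in common.
A = {"p1", "p2", "p3", "p4", "p5"}
B = {"p5", "p6", "p7", "p8", "p9"}

MERGE_THRESHOLD = 0.1  # assumed; low enough that one shared pattern suffices

# Jaccard overlap as a stand-in relatedness score: |A & B| / |A | B| = 1/9.
score = len(A & B) / len(A | B)
merged = score >= MERGE_THRESHOLD

print(round(score, 3))  # 0.111
print(merged)           # True
```

The point is that the merge decision hinges on the threshold: a single common pattern can clear it even when the two customers' pattern sets are otherwise disjoint.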


4.3.4 Validation of neural network segmentation

In this study, a neural network is adopted to analyze and learn the segmentation results. Validated by experiment, the trained neural network demonstrates very good precision. Using this method, we avoid the time-consuming segmentation calculation: when a new customer is to be analyzed, we only need to feed in its data, and the segmentation result is available on the fly. This makes TPCSNN acceptable to users.
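A minimal sketch of why classifying a new customer is cheap (the weights, layer sizes and features below are hypothetical, not the network trained in the study): segment assignment is a single feed-forward pass rather than a re-run of the clustering.

```python
import math

# Hypothetical trained weights for a 2-input, 2-hidden, 2-output network
# (one output unit per customer segment).
W1 = [[0.9, -0.4], [-0.7, 0.8]]   # input  -> hidden
b1 = [0.1, -0.1]
W2 = [[1.2, -1.0], [-1.1, 1.3]]   # hidden -> output
b2 = [0.0, 0.0]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def segment_of(features):
    """Assign a customer feature vector to a segment via one forward pass."""
    hidden = [sigmoid(sum(w * f for w, f in zip(row, features)) + b)
              for row, b in zip(W1, b1)]
    output = [sum(w * h for w, h in zip(row, hidden)) + b
              for row, b in zip(W2, b2)]
    return output.index(max(output))  # index of the winning segment

new_customer = [0.8, 0.2]  # normalized transaction features (hypothetical)
print(segment_of(new_customer))
```

However the network is trained, inference costs only a handful of multiplications per customer, which is what makes the on-the-fly result described above possible.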


5. Conclusions

This study is based on a segmentation method that uses customers' historical transaction records. Additional business logic for extracting customer transaction data, analysis and explanation of the segmentation results, and the application of a neural network help users obtain the segmentation of new customers without spending a lot of time. Validated with both simulated and real data, the segmentation method is shown to work well in commercial segmentation.


References

[1] Aldenderfer, M. S. and Blashfield, R. K., Cluster Analysis, Sage Publications, Inc., 1984.

[2] Bezdek, J. C. and Pal, N. R., "Some New Indexes of Cluster Validity," IEEE Transactions on Systems, Man, and Cybernetics - Part B: Cybernetics, Vol. 28, No. 3, pp. 301-315, 1998.

[3] Changchien, S. W. and Kuo, S. Y., A Customer Segmentation Method Based on Transaction Patterns, Thesis for the Degree of Master, Chaoyang University of Technology, 2004.

[4] Changchien, S. W. and Lu, T. C., "Mining Association Rules Procedure to Support On-line Recommendation by Customers and Products Fragmentation," Expert Systems with Applications, Vol. 20, No. 4, pp. 325-335, 2001.

[5] Changchien, S. W. and Lu, T. C., "Knowledge Discovery from Object-Oriented Databases Using an Association Rules Mining Algorithm," Proceedings of the 5th International Conference on Knowledge-Based Intelligent Information Engineering Systems (KES) and Allied Technologies, Osaka, Japan, 2001.

[6] Dennis, C., Marsland, D., Cockett, T. and Hlupic, V., "Market Segmentation and Customer Knowledge for Shopping Centers," Proceedings of the 25th International Conference on Information Technology Interfaces, pp. 16-19, 2003.

[7] Ester, M., Kriegel, H. P., Sander, J. and Xu, X., "A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise," Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining (KDD), pp. 226-231, 1996.

[8] Filippone, M., Camastra, F., Masulli, F. and Rovetta, S., "A Survey of Kernel and Spectral Methods for Clustering," Pattern Recognition, Vol. 41, pp. 176-190, 2008.

[9] Guha, S., Rastogi, R. and Shim, K., "ROCK: A Robust Clustering Algorithm for Categorical Attributes," Information Systems, Vol. 25, No. 5, pp. 345-366, 2000.

[10] Guldemir, H. and Sengur, A., "Comparison of Clustering Algorithms for Analog Modulation Classification," Expert Systems with Applications, Vol. 30, No. 4, pp. 642-649, 2006.

[11] Gunter, S. and Bunke, H., "Validation Indices for Graph Clustering," Pattern Recognition Letters, Vol. 24, pp. 1107-1113, 2003.

[12] Han, J. and Kamber, M., Data Mining: Concepts and Techniques, Morgan Kaufmann Publishers, 2001.

[13] Hagan, M. T., Demuth, H. B. and Beale, M. H., Neural Network Design, PWS Publishing Company, 1996.

[14] Kotler, P., Marketing Management: Analysis, Planning, Implementation, and Control, 8th ed., Prentice Hall, Inc., 1994.

[15] Kotler, P. and Armstrong, G., Marketing: An Introduction, 4th ed., Prentice Hall, Inc., 1997.

[16] Kumar, M. and Patel, N. R., "Clustering Data with Measurement Errors," Computational Statistics and Data Analysis, Vol. 51, pp. 6084-6101, 2007.

[17] Lee, D. H., Kim, S. H. and Ahn, B. S., "A Conjoint Model for Internet Shopping Malls Using Customer's Purchasing Data," Expert Systems with Applications, Vol. 19, No. 1, pp. 59-66, 2000.

[18] Liu, M. and Samal, A., "Cluster Validation Using Legacy Delineations," Image and Vision Computing, Vol. 20, pp. 459-467, 2002.

[19] Schiffman, L. G. and Kanuk, L. L., Consumer Behavior, 15th ed., Prentice-Hall, Inc., 2000.

[20] Wang, T. Y. and Huang, C. Y., "Optimizing Back-Propagation Network via a Calibrated Heuristic Algorithm with an Orthogonal Array," Expert Systems with Applications, Vol. 34, pp. 1630-1641, 2008.