An Innovative Personalized Recommendation System Integrated Collaborative Filtering and Decision Trees
Tien-Chin Wang¹, Hsien-Da Lee²
¹ Department of Information Management, I-Shou University, Taiwan
tcwang@isu.edu.tw
² Department of Information Engineering, I-Shou University, Taiwan
leesd@center.fotech.edu.tw
Abstract
With the explosive growth of information on the Internet, customers may spend more time finding suitable products when they shop on the web. To boost sales and enhance customer loyalty, developing an intelligent personalized recommendation system (PRS) is a good way to help customers find suitable products efficiently in today's information-overloaded Internet environment. Traditional collaborative filtering algorithms have been widely accepted as the most popular PRS approaches. However, collaborative filtering approaches face problems such as sparsity and cold-start that limit the applicability of PRS. This paper proposes a PRS which integrates collaborative filtering and Shannon’s entropy theory. The entropy-based collaborative filtering algorithm is intended to improve accuracy and performance. Decision tree induction is used to identify customer patterns. Three measures (precision, recall and F1-measure) are used to evaluate the performance of the system. Experimental results show the applicability of the proposed system.
1. Introduction
With the boom of the Internet, e-commerce attracts millions of people to buy and sell products online. In order to enlarge market share and create more business opportunities, enterprises have been developing new business portals and providing large amounts of product information, so customers have more opportunities to choose products that meet their needs. However, the explosive growth of information may cause customers to spend more time and effort finding products. At the same time, companies want to collect customer information in order to provide suitable products that meet customer needs. To address information overload and to identify customer purchase behavior, developing a web-based personalized recommendation system is one of the most feasible approaches: it alleviates the information overload burden and provides users with personalized information to meet different needs. According to [1], “personalization is defined as any action that adapts the information or services provided by a web site to the knowledge gained from the users’ navigational behavior and individual interests, in combination with the content and the structure of the site”.
With personalized recommendation systems, consumers can efficiently obtain the information they are interested in and save the effort of reading a large number of web pages to compare similar product features. In addition, enterprises can classify customers’ previous purchasing behaviors and then develop appropriate marketing strategies to enhance customer loyalty.
Most recommendation systems can be divided into two major categories: the content-based approach and the collaborative filtering approach. The content-based approach recommends products or services that are similar to those the user has been interested in in the past [2]. The collaborative filtering approach recommends products or services to customers based on other customers with similar interests [3]. Building on these two approaches, researchers have studied many AI techniques in order to generate accurate recommendations and improve the efficiency and effectiveness of recommendation systems, such as Bayesian networks [4], clustering [5],[6], singular value decomposition [8],[9] and association rule mining [10],[11].
Collaborative filtering, one of the earliest and most successful recommendation technologies, observes the behavior of individuals in a like-minded peer group and makes recommendations to individual users based on the behavior pattern of the peer group. Examples of systems using this approach include Ringo [3] and GroupLens [12]. Although collaborative filtering approaches have been successfully applied to many domains, they are constrained by three major limitations that need to be solved: the sparsity, cold-start and scalability problems, which restrict the feasibility and spread of a practical PRS.
This paper proposes an entropy-based collaborative filtering algorithm and implements it in a personalized recommendation system in order to improve performance. The remainder of the paper is organized as follows. Section 2 presents the research background, including an overview of personalized recommendation systems and entropy. Section 3 explains the implementation issues of the proposed method. Section 4 reports the experimental process and the results of the study. Finally, the conclusion is given in Section 5.
2. Background
Each day, as more and more web pages appear in cyberspace, people become overwhelmed by information overload when searching for information or purchasing products on the Internet. In response to the challenge of information overload, many researchers devote their efforts to developing effective personalized recommendation systems. Personalization, a special form of differentiation, means that a website can respond to a customer’s unique and particular needs. Mobasher et al. defined Web personalization as an act of response according to the individual user’s interests and hobbies in Internet usage [13]. A personalized recommendation system can provide personal service to customers based on their past purchasing patterns and through inference from other users with similar preferences. The aim of personalization is to offer customers what they want without asking them explicitly and to capture the social component of interpersonal interaction [14].
2.1. Personal recommendation system
Personalized recommendation systems can be categorized into two approaches: the content-based approach and the collaborative filtering approach. In the content-based approach, products are described by a set of attributes or by the content of the items. The approach analyzes the content of items that a person has selected in the past and recommends items with similar content [15]. Content-based filtering adopts concepts from artificial intelligence such as information retrieval and information filtering. The items recommended by content-based filtering are often textual, such as news websites and documents, and are usually described by keywords and their corresponding weights. Clustering techniques are usually used to analyze the feature content of products and recommend suitable content based on feature characteristics or customer preferences. The challenges of this approach include limited content analysis because of limited keywords, overspecialization and the new-user problem.
On the other hand, the collaborative filtering (CF) approach builds a dataset of customer ratings and presents recommendations by means of a collaborative algorithm. The collaborative filtering approach identifies other users who have shown similar preferences to a given user and recommends what they liked [16]. It is based on the idea that target users are likely to rate products in a way similar to their nearest neighbors.
Collaborative filtering approaches usually consist of three steps. First, a user-item rating matrix is constructed to represent user ratings of items. Second, nearest-neighbor clustering techniques are applied by computing the similarities for all pairs of users. Finally, recommendations are generated by aggregating the ratings given to the target item by the target user’s neighbors. These steps can be described as follows:
1. Ratings matrix construction: The users’ judgments or preferences are explicitly represented by an m×n user-item ratings matrix R, where m is the number of users and n is the number of items. In R = ($r_{ij}$), the value $r_{ij}$ is the rating that user i gives to item j. In e-commerce recommendation systems, the entry represents a user’s tendency toward the rated item: the higher the value, the more positive the user’s preference. However, the user need not have purchased the rated item before.
2. Neighborhood similarity clustering: Clustering is a form of unsupervised learning, i.e., the available data are not labeled and the output is a set of clusters containing similar points. Based on the clustering concept, the K-nearest-neighbors (KNN) technique is applied. In KNN, the similarities between the target user and all other users in the system are computed in order to find the set of the K most similar users, the nearest neighbors. The K nearest neighbors are sorted by similarity. To a great extent, the efficiency and effectiveness of collaborative filtering recommendation algorithms depend on the efficiency and effectiveness of the K-nearest-neighbors algorithm.
3. Recommendation generation: Based on the nearest-neighbor set, the predicted ratings of the items unrated by the target user can be computed, and the recommendations are generated by triggering rules whose conditions match the threshold.
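The three steps can be illustrated with a minimal sketch. The paper does not state which similarity measure or aggregation it uses, so cosine similarity over co-rated items and a similarity-weighted average are assumptions made here for illustration; the toy matrix `R` and the function names are hypothetical.

```python
import math

# Step 1: toy user-item ratings matrix R (rows: users, columns: items).
# A value of 0 means "not rated". All numbers are made up for illustration.
R = [
    [5, 4, 0, 1],   # target user (index 0) has not rated item 2
    [5, 5, 4, 1],
    [4, 4, 5, 2],
    [1, 2, 1, 5],
]

def cosine(u, v):
    """Cosine similarity computed over the items both users have rated."""
    common = [(a, b) for a, b in zip(u, v) if a > 0 and b > 0]
    if not common:
        return 0.0
    dot = sum(a * b for a, b in common)
    norm_u = math.sqrt(sum(a * a for a, _ in common))
    norm_v = math.sqrt(sum(b * b for _, b in common))
    return dot / (norm_u * norm_v)

def nearest_neighbors(R, target, k):
    """Step 2: the k users most similar to the target, sorted by similarity."""
    sims = [(cosine(R[target], R[u]), u) for u in range(len(R)) if u != target]
    sims.sort(reverse=True)
    return sims[:k]

def predict(R, target, item, k=2):
    """Step 3: similarity-weighted average of the neighbors' ratings."""
    neighbors = [(s, u) for s, u in nearest_neighbors(R, target, k)
                 if R[u][item] > 0]
    total = sum(s for s, _ in neighbors)
    if total == 0:
        return None  # no neighbor has rated the item
    return sum(s * R[u][item] for s, u in neighbors) / total

print(nearest_neighbors(R, 0, 2))
print(predict(R, 0, 2))
```

In this toy data users 1 and 2 rate like the target user, so the prediction for the unrated item is pulled toward their ratings rather than toward the dissimilar user 3.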
Although collaborative filtering technology has been successfully used in many applications, its major limitations, including data sparsity, cold-start and scalability, have restricted its widespread use in practical e-commerce systems.
The first limitation is the sparsity problem [17]. Conventional collaborative filtering recommendation systems require users to explicitly input preference ratings for many products. In a large e-commerce system, the number of items rated by a user is usually less than one percent of the total items. The percentage of items rated by two or more users is much smaller still, which results in a very sparse user-item ratings matrix. With a large-scale matrix and sparse ratings, the cost of computing similarities between users is high while the results may not be acceptable. As a result, the accuracy of predicted ratings degrades significantly when the received ratings are sparse.
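As a rough worked check of how sparse such a matrix is, the MovieLens figures quoted in Section 4.1 (100,000 ratings, 943 users, 1681 movies) give the following fill rate. This is illustrative arithmetic only, not a result reported by the authors.

```latex
\[
\text{density} = \frac{100{,}000}{943 \times 1681}
               = \frac{100{,}000}{1{,}585{,}183} \approx 0.063,
\qquad
\text{sparsity} = 1 - \text{density} \approx 0.937
\]
```

So even in this comparatively dense research dataset, where every user rated at least 20 movies, roughly 94% of the user-item entries are empty.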
Scalability is also a common concern for CF [18]. As the number of users and items grows, the computational complexity increases rapidly. The user-based collaborative filtering algorithm requires computation that grows with both the number of users and the number of items. An e-commerce site usually has millions of users and items, so a typical web-based recommender system running the CF algorithm will suffer serious scalability problems.
2.2. Shannon’s entropy
The concept of “entropy” originally comes from thermodynamics. In thermodynamic systems, entropy is defined in terms of heat divided by the absolute temperature. In decision tree induction, the entropy measure is used to calculate the information gain, which reflects the quality of an attribute as the branching attribute [7].
An information-based heuristic selects the attribute providing the highest information gain. A data set with some discrete-valued condition attributes and one discrete-valued decision attribute can be presented in the form of a knowledge representation system $J = (U, C \cup D)$, where $U = \{u_1, u_2, \ldots, u_s\}$ is the set of data samples, $C = \{c_1, c_2, \ldots, c_n\}$ is the set of condition attributes and $D = \{d\}$ is the one-element set with the decision attribute or class label attribute. Suppose this class label attribute has $m$ distinct values defining $m$ distinct classes $d_i$ ($i = 1, 2, \ldots, m$), and let $s_i$ be the number of samples of $U$ in class $d_i$. The expected information or entropy needed to classify a given sample is given by

$$I(s_1, s_2, \ldots, s_m) = -\sum_{i=1}^{m} p_i \log_2 p_i \qquad (1)$$
where $p_i$ is the probability that an arbitrary sample belongs to class $d_i$ and is estimated by $s_i / s$ ($s$ is the number of all samples). Let attribute $c_i$ have $v$ distinct values $\{A_1, A_2, \ldots, A_v\}$. Attribute $c_i$ can be used to partition $U$ into $v$ subsets $\{S_1, S_2, \ldots, S_v\}$, where $S_j$ ($j = 1, 2, \ldots, v$) contains those samples in $U$ that have value $A_j$ of $c_i$. Let $s_{ij}$ be the number of samples of class $d_i$ in subset $S_j$. The entropy of attribute $c_i$ is given by

$$E(c_i) = \sum_{j=1}^{v} \frac{s_{1j} + s_{2j} + \cdots + s_{mj}}{s}\, I(s_{1j}, s_{2j}, \ldots, s_{mj}) \qquad (2)$$
The term $\frac{s_{1j} + s_{2j} + \cdots + s_{mj}}{s}$ acts as the weight of the $j$th subset and is the number of samples in the subset divided by the total number of samples. The smaller the entropy value, the greater the purity of the subset partitions. Thus the attribute that leads to the largest information gain is selected as the branching attribute. For a given subset $S_j$, the expected information is expressed as

$$I(s_{1j}, s_{2j}, \ldots, s_{mj}) = -\sum_{i=1}^{m} p_{ij} \log_2 p_{ij} \qquad (3)$$
where $p_{ij} = s_{ij} / |S_j|$ ($|S_j|$ is the number of samples in the subset $S_j$) is the probability that a sample in $S_j$ belongs to class $d_i$. The information gain of attribute $c_i$ is then given by

$$Gain(c_i) = I(s_1, s_2, \ldots, s_m) - E(c_i) \qquad (4)$$
We compute the information gain of each condition attribute; the attribute with the highest information gain is the most informative and the most discriminating attribute of the given set.
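The attribute-selection step in eqs. (1), (2) and (4) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the dictionary-based sample format and the four toy samples are all hypothetical.

```python
import math
from collections import Counter

def expected_information(labels):
    """Eq. (1): I(s_1..s_m) = -sum_i p_i log2 p_i, with p_i = s_i / s."""
    s = len(labels)
    return -sum((c / s) * math.log2(c / s) for c in Counter(labels).values())

def attribute_entropy(rows, labels, attr):
    """Eq. (2): size-weighted expected information of the subsets S_j."""
    s = len(rows)
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[attr], []).append(label)
    return sum(len(sub) / s * expected_information(sub)
               for sub in subsets.values())

def gain(rows, labels, attr):
    """Eq. (4): Gain(c_i) = I(s_1..s_m) - E(c_i)."""
    return expected_information(labels) - attribute_entropy(rows, labels, attr)

# Toy samples (made up): condition attributes "age" and "gender",
# class label = rating class.
rows = [
    {"age": "kid", "gender": "M"},
    {"age": "kid", "gender": "F"},
    {"age": "adult", "gender": "M"},
    {"age": "adult", "gender": "F"},
]
labels = ["high", "high", "low", "low"]

# The branching attribute is the one with the highest information gain.
best = max(["age", "gender"], key=lambda a: gain(rows, labels, a))
print(best, gain(rows, labels, "age"), gain(rows, labels, "gender"))
```

In this toy data "age" separates the classes perfectly while "gender" carries no information, so "age" is chosen as the branching attribute.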
2.2.1. Item entropy
Based on Shannon’s entropy, we define the item entropy as

$$H(I_x) = -\sum_{i=1}^{n} p_{i,x} \log_2 p_{i,x} \qquad (5)$$

where $n$ is the total number of users and $H(I_x)$ is the entropy measure of item $x$. If user $i$ rates item $x$, the probability $p_{i,x}$ is calculated as

$$p_{i,x} = \frac{\dfrac{\text{number of items rated by user } i}{\text{total number of items}}}{\sum_{k=1}^{m} \dfrac{\text{number of items rated by user } k}{\text{total number of items}}} \qquad (6)$$
According to the definition of entropy, the more users rate an item, the larger the item's entropy. From the user perspective, the more items a user rates, the more influence that user has on item entropy. An item with a large item entropy value indicates that users are more interested in that item than in other items.
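A small sketch may help make the item entropy of eqs. (5) and (6) concrete. It assumes that the normalizing sum in eq. (6) runs over the users who rated item x, which the text does not state explicitly; the data layout (a dictionary mapping each user to the set of items rated) and all names are hypothetical.

```python
import math

def item_entropy(ratings, item, total_items):
    """Item entropy of `item`, following eqs. (5) and (6).

    ratings: dict mapping user -> set of items that user rated.
    Assumption: probabilities are normalized over the users who rated `item`.
    """
    raters = [u for u, items in ratings.items() if item in items]
    # Numerator of eq. (6): share of the catalogue each rater has rated.
    shares = [len(ratings[u]) / total_items for u in raters]
    norm = sum(shares)
    if norm == 0:
        return 0.0  # nobody rated the item
    probs = [s / norm for s in shares]
    return -sum(p * math.log2(p) for p in probs)

# Toy data (made up): four users, three items.
ratings = {
    "u1": {"a", "b"},
    "u2": {"a", "b"},
    "u3": {"a"},
    "u4": {"b", "c"},
}
for item in ("a", "b", "c"):
    print(item, item_entropy(ratings, item, total_items=3))
```

Item "c" has a single rater and therefore zero entropy, while items "a" and "b" have three raters each and a higher entropy, which matches the statement that items rated by more users have larger item entropy.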
3. Proposed methodology
In this section, a web-based personalized intelligent recommendation system based on Shannon’s entropy measure is proposed. The objective of our system is to recommend a unique set of objects that satisfies the needs of each active user. The proposed system is composed of the following major parts:
1. Data Representation Module: Data need to be pre-processed into a structured form.
2. Decision Trees and Similarity Calculation Module: The nearest neighbors of the target user are generated from the ratings matrix. In addition, the entropy of every item attribute is computed, and ID3 is applied to construct a decision tree that identifies user preference patterns.
3. Generation of recommendations: Recommendations are generated by triggering rules whose conditions match the thresholds in customers’ inputs. For better performance quality, a threshold is defined for the requirements to be met. Products in the action parts of the fired rules are the potential candidates for recommendation.
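The rule-triggering step in part 3 can be sketched as follows. The rule format, the field names and the threshold values are hypothetical choices made for illustration; the two confidence values are the drama-rating probabilities reported for student users in Section 4.3.

```python
# Each rule: a condition part ("if"), an action part ("then") and the
# probability of a "high" rating read off the decision tree.
# The dictionary layout is an assumption, not the authors' data structure.
RULES = [
    {"if": {"occupation": "student", "age_group": "kid"},
     "then": "drama", "confidence": 0.75},
    {"if": {"occupation": "student", "age_group": "middle-aged"},
     "then": "drama", "confidence": 0.972},
]

def recommend(user, rules, threshold):
    """Fire every rule whose conditions match the user and whose
    confidence meets the threshold; return the action parts, best first."""
    fired = [
        r for r in rules
        if r["confidence"] >= threshold
        and all(user.get(k) == v for k, v in r["if"].items())
    ]
    fired.sort(key=lambda r: r["confidence"], reverse=True)
    return [r["then"] for r in fired]

kid = {"occupation": "student", "age_group": "kid"}
older = {"occupation": "student", "age_group": "middle-aged"}
print(recommend(kid, RULES, threshold=0.8))
print(recommend(older, RULES, threshold=0.8))
```

With a threshold of 0.8 only the rule for middle-aged students fires; lowering the threshold to 0.7 would also fire the rule for the "kid" group, which shows how the threshold trades coverage against confidence.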
4. Experiment
4.1. Data sets
We use the well-known MovieLens dataset (available for download from http://movielens.umn.edu) collected by GroupLens Research at the University of Minnesota. MovieLens contains 100,000 ratings from 943 users for 1681 movies [19]. Each user has rated at least 20 movies, and each movie has been rated at least once. We divided the database into an 80% training set and a 20% test set. The training set is used to generate the recommendation model. Our recommendation system is then evaluated by comparing the Top-N recommendations it makes, given the test data, with the set of deleted items.
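The 80%/20% division can be sketched as a seeded random split over rating records. The paper does not say how the split was drawn, so a uniform random split over (user, movie, rating) triples is an assumption; the function name and the synthetic ratings are hypothetical.

```python
import random

def split_ratings(ratings, test_fraction=0.2, seed=0):
    """Randomly hold out `test_fraction` of the rating records as a test set.

    Returns (train, test). A fixed seed makes the split reproducible.
    """
    rng = random.Random(seed)
    shuffled = ratings[:]          # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Synthetic (user, movie, rating) triples standing in for MovieLens records.
ratings = [(u, m, (u + m) % 5 + 1) for u in range(10) for m in range(10)]
train, test = split_ratings(ratings)
print(len(train), len(test))
```

The held-out records play the role of the "deleted items": a model built from `train` is judged by how many of its Top-N recommendations reappear in `test`.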
4.2. Evaluation metrics
Many metrics have been proposed for assessing the accuracy of a collaborative filtering system. To evaluate the effectiveness of the proposed system, we apply the most commonly used metric, the F1 measure, for evaluating recommendation quality [20]. F1 is calculated as follows:

$$F1 = \frac{2 \times \text{precision} \times \text{recall}}{\text{precision} + \text{recall}} \qquad (7)$$
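Precision is the ratio of the accurate items identified to the size of the top-N set. It is computed as the ratio of the number of relevant recommendations to the total number of recommendations that a recommendation system produces:

$$\text{precision} = \frac{|\text{test} \cap \text{top\_N}|}{N}$$

where test is the set of items in the test set and top_N is the set of N recommended items.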
Recall measures the ability of a recommendation system to recommend all the products that are likely to interest the customers. It is the ratio of the number of recommendations that are correctly generated by the recommendation system to the size of the test set:

$$\text{recall} = \frac{|\text{test} \cap \text{top\_N}|}{|\text{test}|}$$
These two measures, precision and recall, often conflict with each other. For example, increasing the number N tends to increase recall but decrease precision. Because both are critical for judging quality, we use a combination of the two: the standard F1 metric balances precision and recall.
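The three measures can be computed directly from a Top-N list and a test set, as in the short sketch below. The function names and the example movie identifiers are hypothetical; the last line recomputes F1 from the precision and recall reported for the Action genre in Table 1.

```python
def f1_score(precision, recall):
    """Eq. (7): harmonic combination of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def evaluate_top_n(top_n, test_items):
    """Return (precision, recall, F1) for one Top-N recommendation list."""
    hits = len(set(top_n) & set(test_items))       # |test ∩ top_N|
    precision = hits / len(top_n) if top_n else 0.0
    recall = hits / len(test_items) if test_items else 0.0
    return precision, recall, f1_score(precision, recall)

# Hypothetical example: 4 recommendations, 5 held-out items, 2 hits.
top_n = ["m1", "m2", "m3", "m4"]
test_items = ["m2", "m4", "m7", "m8", "m9"]
print(evaluate_top_n(top_n, test_items))

# Action row of Table 1: precision 45%, recall 3.77%.
print(round(f1_score(0.45, 0.0377), 3))
```

Because F1 is a harmonic mean, it stays close to the smaller of the two inputs, which is why the low recall values dominate the F1 column of Table 1.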
4.3. Experiments
As described in Section 3, an innovative personalized recommendation system which integrates collaborative filtering and decision trees is presented. In the first phase, data representation, we select the movie-rating table from several database tables according to their genres. The movie table can be separated into “Action”, “Adventure”, …, “Western” tables. The genre is used as the decisive attribute. In order to reduce computational complexity, we further divide each table into several sub-tables according to user occupation. For instance, there is a table called the student-drama rating table, which contains only the users who are students. We intend to explore user preference patterns toward each movie genre.
Which user group is more likely to watch drama movies than other user groups? Do male users like sci-fi movies better than romance movies? We also divide users’ ages into several groups. For example, a user whose age is less than ten belongs to the group “kid”. There are five groups: “kid”, “teenager”, “young adult”, “adult” and “middle-aged”.
The rating values in MovieLens are also classified as “high” or “low”. If the value is 3, 4 or 5, we classify it as “high”; otherwise it is classified as “low”.
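The discretization described above can be sketched as two small helper functions. Only the first age boundary (under ten is "kid") and the rating rule (3, 4 or 5 is "high") come from the text; the remaining age boundaries are hypothetical placeholders, as are the function names.

```python
import bisect

# Upper bounds (exclusive) of the first four age groups.
# Only the first bound (10) is given in the text; 20, 30 and 45 are
# hypothetical values chosen purely for illustration.
AGE_BOUNDS = [10, 20, 30, 45]
AGE_GROUPS = ["kid", "teenager", "young adult", "adult", "middle-aged"]

def age_group(age):
    """Map a numeric age to one of the five age groups."""
    return AGE_GROUPS[bisect.bisect_right(AGE_BOUNDS, age)]

def rating_class(rating):
    """Ratings of 3, 4 or 5 are 'high'; ratings of 1 or 2 are 'low'."""
    return "high" if rating >= 3 else "low"

# A few made-up (age, rating) records after pre-processing.
records = [(9, 5), (17, 2), (52, 3)]
print([(age_group(a), rating_class(r)) for a, r in records])
```

Discretizing both attributes keeps every condition and decision attribute discrete-valued, which is what the entropy formulas in Section 2.2 require.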
After pre-processing the data, we develop decision trees and conduct data analysis through similarity calculation. Taking the student-drama table as an example, we developed a decision tree by calculating the entropy of every attribute. According to eq. (5), the “age” entropy is 0.00329 and the “gender” entropy is 0.000034. The attribute “age” is selected as the split node. Then we calculated each age group’s rating probability according to eq. (6). The resulting decision tree is shown in Fig. 1.
Figure 1. Decision tree of students rating drama movies
The main advantage of a decision tree is that it is easy to interpret. From Figure 1, we can generate some useful decision rules. Taking the group “kid” as an example, if a student’s age is less than 10 (kid), the probability that he or she rates drama movies as high, i.e., highly recommends them, is about 75%. The probability that “middle-aged” students highly recommend drama movies rises to 97.2%, which is the highest among the five age groups. This may indicate that middle-aged students strongly favor drama movies.
Based on the decision tree, we can evaluate the effectiveness of the proposed system using the F1 metric. We randomly selected 20% of the cases as the test set. Then we applied eq. (7) to compute the F1 values. Table 1 shows the results:
Table 1. Performance of the entropy-based model on the test dataset

Movie type     Precision (%)   Recall (%)   F1
Action         45              3.77         0.070
Adventure      36              4.20         0.075
Animation      32              2.46         0.046
Children       25              4.60         0.078
Comedy         38              3.80         0.069
Crime          36              4.00         0.072
Documentary    39              7.00         0.119
Drama          56              5.20         0.095
Fantasy        43              6.50         0.113
Film-Noir      37              8.60         0.140
Horror         53              11.20        0.185
Musical        49              10.80        0.177
Mystery        45              4.90         0.088
Romance        36              5.80         0.100
Sci-fi         41              6.80         0.117
Thriller       38              5.60         0.098
War            23              7.00         0.107
Western        12              3.77         0.057
5. Conclusions
As the Internet has become an important part of everyone’s daily life, recommendation systems have emerged as a powerful new technology for extracting valuable information effectively from the Web. Recommendation systems help customers find suitable products and boost company sales, and they are quickly becoming a crucial tool in e-commerce. In this paper, we presented an innovative personalized recommendation system that integrates collaborative filtering and Shannon’s entropy concept. Based on Shannon’s entropy, a recommendation decision tree is constructed. Decision trees have the advantage of being easy to comprehend and implement. The experimental results show the applicability of the proposed system: across the 18 genres in Table 1, precision ranges from 12% to 56% and F1 from 0.046 to 0.185.
6. References
[1] R. Agrawal and R. Srikant, “Fast Algorithms for Mining Association Rules”, Proc. of the 20th VLDB, J. Bocca, M. Jarke, and C. Zaniolo (Eds.), Morgan Kaufmann, 1994, pp. 487-499.
[2] K. Lang, “Newsweeder: Learning to Filter Netnews”, Proceedings of the 12th International Conference on Machine Learning, Tahoe City, California, 1995.
[3] U. Shardanand and P. Maes, “Social Information Filtering: Algorithms for Automating ‘Word of Mouth’”, Proceedings of the Conference on Human Factors in Computing Systems (CHI’95), Denver, Co., May 1995.
[4] D. Chickering and D. Heckerman, “Efficient Approximations for the Marginal Likelihood of Bayesian Networks with Hidden Variables”, Machine Learning, 1997, 29: pp. 181-212.
[5] A. Dempster, N. Laird, and D. Rubin, “Maximum Likelihood from Incomplete Data via the EM Algorithm”, Journal of the Royal Statistical Society, 1977.
[6] B. Thiesson, C. Meek, D. Chickering, and D. Heckerman, “Learning Mixture of DAG Models”, Technical Report MSR-TR-97-30, Microsoft Research, Redmond, WA, 1997.
[7] X. Yang, “A Maximum Entropy Model Application on Recognition of Metaphor Phenomena”, Proceedings of the 15th International Conference on Computing (CIC'06), 2006.
[8] B.M. Sarwar, G. Karypis, J.A. Konstan, and J. Riedl, “Application of Dimensionality Reduction in Recommender System - A Case Study”, In ACM WebKDD 2000 Workshop, 2000.
[9] C.C. Aggarwal, “On the Effects of Dimensionality Reduction on High Dimensional Similarity Search”, ACM PODS Conference, 2001.
[10] A. Zheng, Y.Y. Zhu, and B.L. Shi, “Collaborative Filtering Recommendation Algorithm Based on Item Rating Prediction”, Journal of Software, 13(4), 2002.
[11] B. Sarwar, G. Karypis, J. Konstan, and J. Riedl, “Analysis of Recommendation Algorithms for E-Commerce”, ACM Conference on Electronic Commerce, 2000, pp. 158-167.
[12] J. Konstan, B. Miller, D. Maltz, J. Herlocker, L. Gordon, and J. Riedl, “Grouplens: Applying Collaborative Filtering to Usenet News”, Communications of the ACM, 40(3), 1997, pp. 77-87.
[13] B. Mobasher, H. Dai, and T. Luo, “Discovery and Evaluation of Aggregate Usage Profiles for Web Personalization”, Data Mining and Knowledge Discovery, 6(1), 2002, pp. 61-82.
[14] B. Mittal and W. Lassar, “The Role of Personalization in Service Encounters”, Journal of Retailing, 72(1), 1996, pp. 95-109.
[15] P.S. Yu, “Data Mining and Personalization Technologies”, Proceedings of the Sixth International Conference on Database System for Advanced Application, Hsinchu, Taiwan, 1999, pp. 6-13.
[16] J.S. Breese, D. Heckerman, and C. Kadie, “Empirical Analysis of Predictive Algorithms for Collaborative Filtering”, Proceedings of the 14th Conference on Uncertainty in Artificial Intelligence (UAI-98), 1998, pp. 43-52.
[17] K.W. Cheung, J.T. Kwok, M.H. Law, and K.C. Tsui, “Mining Customer Product Ratings for Personalized Marketing”, Journal of Decision Making, 35, 2003, pp. 231-243.
[18] Y.H. Cho, J.K. Kim, and S.H. Kim, “A Personalized Recommender System Based on Web Usage Mining and Decision Tree Induction”, Journal of Expert Systems with Applications, 23(3), 2002, pp. 329-342.
[19] J.K. Herlocker, J.A. Konstan, A. Borchers, and J. Riedl, “An Algorithmic Framework for Performing Collaborative Filtering”, Proceeding of ACM SIGIR’99, pp. 230-237.