MINING SEMANTIC DATA FOR SOLVING FIRST-RATER AND COLD-START PROBLEMS IN RECOMMENDER SYSTEMS

María N. Moreno, Saddys Segrera, Vivian F. López, M. Dolores Muñoz and Ángel Luis Sánchez



Department of Computing and Automatic

Data Mining Research Group

http://mida.usal.es

IDEAS 2011

Lisbon

21-23 September

CEDI 2010

Mining Semantic Data for Solving First-rater and Cold-start Problems in Recommender Systems

María N. Moreno, Saddys Segrera, Vivian F. López, M. Dolores Muñoz and Ángel Luis Sánchez

Contents


Introduction


Recommender Systems


Recommendation framework


Case Study


Conclusions

Introduction

[Figure: diagram with the labels Client, Server, Catalog, commerce]

Recommender systems

Applications: e-commerce, e-learning, tourism, news pages…

Drawbacks: low performance, low reliability of recommendations…

Recommender systems provide users with intelligent mechanisms to find products to purchase

Introduction


Proposal

Objective: overcome critical drawbacks in recommender systems

Methodology: Semantic-based Web Mining

Associative classification (Web Mining)

  Machine learning technique that combines concepts from classification and association

Domain-specific ontology (Semantic Web)

  Enrichment of the data to be mined with semantic annotations
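
To make the idea of associative classification concrete, the sketch below mines class association rules (rules whose consequent is a class label) by brute-force enumeration and classifies with the best-ranked matching rule. It is only an illustration under stated assumptions: the item strings, the labels "recommend" and "skip", the thresholds, and the function names `mine_class_rules` and `classify` are all hypothetical, and the code is not an implementation of CBA, CMAR, FOIL or CPAR.

```python
from collections import Counter
from itertools import combinations


def mine_class_rules(records, min_support=0.2, min_confidence=0.7, max_len=2):
    """Enumerate rules 'antecedent -> class label' that meet both thresholds.

    records: list of (set_of_items, class_label) pairs (hypothetical format).
    """
    n = len(records)
    antecedent_counts = Counter()
    rule_counts = Counter()
    for items, label in records:
        for size in range(1, max_len + 1):
            for combo in combinations(sorted(items), size):
                antecedent_counts[combo] += 1
                rule_counts[(combo, label)] += 1

    rules = []
    for (combo, label), count in rule_counts.items():
        support = count / n
        confidence = count / antecedent_counts[combo]
        if support >= min_support and confidence >= min_confidence:
            rules.append((combo, label, support, confidence))
    # Rank by confidence, then support, then shorter antecedent.
    rules.sort(key=lambda r: (-r[3], -r[2], len(r[0])))
    return rules


def classify(rules, items, default=None):
    """Return the label of the first (best-ranked) rule whose antecedent matches."""
    for combo, label, _support, _confidence in rules:
        if set(combo) <= set(items):
            return label
    return default


# Toy training data (invented for illustration only).
records = (
    [({"age=18-24", "genre=Comedy"}, "recommend")] * 3
    + [({"age=18-24", "genre=Drama"}, "skip")] * 2
    + [({"age=35-44", "genre=Drama"}, "recommend")] * 3
)
rules = mine_class_rules(records)
print(classify(rules, {"age=18-24", "genre=Drama"}))
```

Because the rules are induced once and then only matched at prediction time, a classifier of this kind fits the model-based family of recommendation methods.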

Recommender Systems


Classification of recommendation methods

Content-based: compare text documents to user profiles

Collaborative filtering: based on opinions of other users (ratings)

  Memory based (User-based): find users with similar preferences (neighbors) by means of statistical techniques

  Model based (Item-based): use data mining techniques to develop a model of user ratings

Recommender Systems


Critical drawbacks

Sparsity: the number of ratings needed for prediction is greater than the number of ratings obtained from users

Scalability: performance problems arise mainly in memory-based methods, where the computation time grows linearly with both the number of customers and the number of products in the site

First-rater problem: new products have never been rated, therefore they cannot be recommended

Cold-start problem: new users cannot receive recommendations since they have not yet rated any products

Recommendation framework


Associative classification (Web Mining)

  Sparsity: only slightly sensitive to sparse data

  Scalability: model-based approach

Domain-specific ontology (Semantic Web)

  First-rater problem:

    Use of taxonomies to classify products

    Induction of abstract patterns which relate user profiles with categories of products

  Cold-start problem:

    Recommendations based on user profiles
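
The role of the taxonomy can be sketched as a fallback between two levels: a pattern tied to a specific product is used when one exists, and otherwise a more abstract pattern tied to the product's category is used. The example below is a simplified stand-in, not the framework itself: it replaces induced associative rules with plain mean ratings, and the profile labels, movie names, genre mapping and function names are all hypothetical.

```python
from collections import defaultdict


def _mean(values):
    return sum(values) / len(values)


def build_models(ratings, movie_genre):
    """Build a product-level and a category-level table of mean ratings.

    ratings: list of (user_profile, movie, rating); movie_genre: movie -> genre.
    """
    low = defaultdict(list)   # (profile, movie) -> ratings
    high = defaultdict(list)  # (profile, genre) -> ratings
    for profile, movie, rating in ratings:
        low[(profile, movie)].append(rating)
        high[(profile, movie_genre[movie])].append(rating)
    return (
        {key: _mean(vals) for key, vals in low.items()},
        {key: _mean(vals) for key, vals in high.items()},
    )


def predict(low, high, movie_genre, profile, movie):
    """Use the product-level pattern if available, else fall back to the category level."""
    if (profile, movie) in low:
        return low[(profile, movie)], "low"
    key = (profile, movie_genre.get(movie))
    if key in high:
        return high[key], "high"
    return None, "none"


# Toy data (invented for illustration only). "New Movie" has no ratings yet.
movie_genre = {"Movie A": "Comedy", "Movie B": "Comedy", "Movie C": "Drama", "New Movie": "Comedy"}
ratings = [
    ("18-24", "Movie A", 5),
    ("18-24", "Movie B", 4),
    ("18-24", "Movie C", 2),
    ("35-44", "Movie C", 5),
]
low, high = build_models(ratings, movie_genre)
print(predict(low, high, movie_genre, "18-24", "New Movie"))
```

Because the key is a user profile rather than a user identifier, the same lookup also serves a new user whose profile attributes are known, which is how profile-based recommendations address the cold-start case.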


Case Study


MovieLens Data

User Data
  ID          Num.
  Age         Num.
  Gender      Binary
  Occupation  String
  Zip         Num.

Movies Data
  ID          Num.
  Title       String
  Genre       Binary (19 attributes)

Ratings Data
  ID          Num.
  User ID     Num.
  Movie ID    Num.
  Rating      Num. (1-5)


Case Study

MovieLens Data

  ID               Num.
  User Gender      Binary
* User Age         < 18, [18, 24], [25, 34], [35, 44], [45, 49], [50, 55], > 55
  User Occupation  String
  Movie Title      String
* Movie Genre      String
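
The User Age intervals listed above amount to a discretization of the numeric age attribute. A minimal sketch of that mapping, assuming integer ages (the function name `age_group` and the constant names are hypothetical):

```python
from bisect import bisect_right

# Lower bound of every interval after the first one ("< 18").
_LOWER_BOUNDS = [18, 25, 35, 45, 50, 56]
_LABELS = ["< 18", "[18, 24]", "[25, 34]", "[35, 44]", "[45, 49]", "[50, 55]", "> 55"]


def age_group(age):
    """Map an integer age to the label of the interval that contains it."""
    return _LABELS[bisect_right(_LOWER_BOUNDS, age)]


for sample in (17, 18, 34, 49, 55, 56):
    print(sample, age_group(sample))
```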

Case Study


Ontology definition


Case Study


Results


Associative classification methods (CBA, CMAR, FOIL and CPAR) were compared to non-associative classification algorithms

Conclusions


A framework for recommender systems is proposed in order to overcome some critical drawbacks

The proposal combines web mining methods and domain-specific ontologies in order to induce models at two abstraction levels:

  The low-level model relates users, movies and ratings for making the recommendations

  The high-level model is used to recommend unrated movies and to make recommendations to new users, overcoming the first-rater and cold-start problems

The off-line model induction avoids scalability problems at recommendation time

Associative classification methods provide a way to deal with the sparsity problem

THANKS FOR YOUR ATTENTION!

MINING SEMANTIC DATA FOR SOLVING FIRST-RATER AND COLD-START PROBLEMS IN RECOMMENDER SYSTEMS

María N. Moreno*, Saddys Segrera, Vivian F. López, M. Dolores Muñoz & Ángel Luis Sánchez

*mmg@usal.es

Department of Computing and Automatic

IDEAS 2011

Lisbon

21-23 September