Music Recommendation A Data Mining Approach - Graph-RAT

Nov 20, 2013

Music Recommendation

A Data Mining Approach

Daniel McEnnis

2nd year PhD

Overview


High level overview


Toolkit Improvements


Experiments


Evaluation


Algorithms research


Data


Future work

Project Goals


Integrate social information


Make algorithms ‘culturally aware’


Implement existing algorithms


Systematic evaluation framework

Similarity Algorithms


Create new relations based on some
aspect of similarity


6 different varieties of similarity


Each algorithm can use one of 6
distance functions
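To make the idea concrete, here is a minimal sketch of a similarity step with a pluggable distance function. It is not the Graph-RAT API: the function names, the `profiles` data, and the choice of cosine and Euclidean distance are assumptions made for illustration, since the slides do not name the six distance functions.

```python
import math
from itertools import combinations

def cosine_distance(a, b):
    """1 - cosine similarity; returns 1.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0

def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def similarity_links(profiles, distance, threshold):
    """Create a new (actor, actor, distance) relation for every pair of
    actors whose profile vectors lie within `threshold` of each other."""
    links = []
    for (u, pu), (v, pv) in combinations(sorted(profiles.items()), 2):
        d = distance(pu, pv)
        if d <= threshold:
            links.append((u, v, d))
    return links

# Hypothetical play-count vectors over three artists.
profiles = {
    "alice": [5, 0, 3],
    "bob": [4, 0, 2],
    "carol": [0, 6, 0],
}
links = similarity_links(profiles, cosine_distance, threshold=0.1)
euclid_links = similarity_links(profiles, euclidean_distance, threshold=2.0)
```

Swapping `cosine_distance` for `euclidean_distance` changes only the notion of closeness, not the link-building logic, which is presumably why each similarity variety can be paired with any distance function.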

Aggregator Algorithms


Takes data from one set of actors and
moves it to another


6 different varieties



Each variety uses one of 7 aggregator
functions


Basic building block of Graph-RAT applications
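A rough sketch of what an aggregator does, under stated assumptions: values attached to one set of actors (here, users) are carried across links to another set (here, artists) and combined with an interchangeable aggregator function. The `aggregate` helper and the sample data are hypothetical, and `sum`, `mean`, and `max` stand in for the seven aggregator functions, which the slides do not list.

```python
from collections import defaultdict
from statistics import mean

def aggregate(links, values, aggregator):
    """Move values from source actors to target actors.

    links:      iterable of (source, target) pairs
    values:     mapping source -> number
    aggregator: function that reduces a list of numbers to one number
    """
    gathered = defaultdict(list)
    for source, target in links:
        if source in values:
            gathered[target].append(values[source])
    return {target: aggregator(vals) for target, vals in gathered.items()}

# Hypothetical data: which user listens to which artist, and each user's age.
listens = [("alice", "artist_a"), ("bob", "artist_a"), ("carol", "artist_b")]
ages = {"alice": 30, "bob": 40, "carol": 25}

mean_age = aggregate(listens, ages, mean)    # typical listener age per artist
total_age = aggregate(listens, ages, sum)    # same links, different aggregator
oldest = aggregate(listens, ages, max)
```

Because the link traversal and the reduction are separate, the same aggregator variety can be reused with any of the aggregator functions.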


Graph Triples Census


Probably a novel algorithm


Proof of Correctness Completed


Proof of Time Complexity Completed


Literature review in progress
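The slides do not describe the algorithm itself, so the following is only a brute-force baseline, assuming "triples census" means counting every three-node subset of a graph by how many of its possible edges are present (the undirected form of the triad census from social network analysis). It is not the author's algorithm; `triple_census` and the sample graph are illustrative names and data.

```python
from itertools import combinations

def triple_census(nodes, edges):
    """Count all 3-node subsets of an undirected graph by number of edges (0-3).

    Runs in O(n^3) time, which is the naive cost a specialised census
    algorithm would aim to improve on.
    """
    edge_set = {frozenset(e) for e in edges}
    census = {0: 0, 1: 0, 2: 0, 3: 0}
    for triple in combinations(nodes, 3):
        present = sum(
            frozenset(pair) in edge_set for pair in combinations(triple, 2)
        )
        census[present] += 1
    return census

nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
census = triple_census(nodes, edges)
```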

SUCCESS!


Graph-RAT programming language now functioning


Graph-RAT integrates social, cultural, personal, and audio data into algorithms


Includes most commercial algorithms


Contains primitives for existing academic
systems


Evaluation is entirely automated

PROBLEMS

Evaluation Exploration


9 types of music recommendation


Personalized versus generic


Open query versus targeted query


Dynamic versus static data


New music versus all music

Personalized Radio


Open query with personalized
presentation


Static data vs dynamic data


New-item prediction vs predicting any item

Targeted Search


Not personalized


Similarity queries


Automatically generating targeted lists
for a browsing hierarchy


New music vs all music


Static vs dynamic data


Personalized Tag Radio


Create a personalized play list matching
a given query


New music vs all music


Static vs dynamic data

Excluded Types


‘Top 40’ prediction


Rendered obsolete by other types

Existing Algorithms


Item-to-item collaborative filtering


7 variations


User-to-user collaborative filtering


7 variations


Associative mining collaborative filtering


Direct machine learning on playlist data


Direct machine learning on audio data
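As a reference point for the first item in the list above, here is a minimal sketch of one common form of item-to-item collaborative filtering: cosine similarity between item columns of a user-by-item play-count matrix, followed by a weighted score for items the user has not played. Which seven variations the slides refer to is not stated; the function names and the `plays` data are assumptions for illustration.

```python
import math
from collections import defaultdict

def item_similarities(plays):
    """plays: user -> {item: play count}. Returns {(item_i, item_j): cosine}."""
    columns = defaultdict(dict)          # item -> {user: count}
    for user, items in plays.items():
        for item, count in items.items():
            columns[item][user] = count
    norms = {
        item: math.sqrt(sum(c * c for c in col.values()))
        for item, col in columns.items()
    }
    sims = {}
    for i in columns:
        for j in columns:
            if i == j:
                continue
            dot = sum(c * columns[j].get(u, 0) for u, c in columns[i].items())
            denom = norms[i] * norms[j]
            sims[(i, j)] = dot / denom if denom else 0.0
    return sims

def recommend(plays, user, sims):
    """Rank unseen items by similarity to what the user already played."""
    seen = plays[user]
    scores = defaultdict(float)
    for (i, j), s in sims.items():
        if s > 0 and i in seen and j not in seen:
            scores[j] += s * seen[i]
    return sorted(scores, key=lambda item: (-scores[item], item))

# Hypothetical play counts.
plays = {
    "u1": {"song_a": 3, "song_b": 2},
    "u2": {"song_a": 1, "song_b": 4, "song_c": 5},
    "u3": {"song_c": 2, "song_d": 1},
}
sims = item_similarities(plays)
recommended = recommend(plays, "u1", sims)
```

Here `song_c` is recommended to `u1` because it co-occurs with both songs `u1` already plays, while `song_d` shares no listeners with them and is left out.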

Novel Algorithms


Machine learning over profile data


Machine learning over cultural and profile
data


Machine learning on different concatenations


Audio


Playlist


Profile


Cultural


Initial Data


LiveJournal


Separating music data is difficult


No tag info or audio content


Not enough musical data


LastFM by User


No audio content


Data cleaning is an issue


Current Data


1940s Jazz Recordings


1800 annotated recordings from 70 CDs


Covers nearly all 1940s popular music


LastFM by Song


Retrieves tag and user info by song


Data cleaning on user playcounts needed

Data Cleaning Tags


Polysemy


Synonymy


Disjoint


Hypernymy


Hyponymy



Initial algorithms developed
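The slides do not say what the initial tag-cleaning algorithms are. As one hedged illustration of the simplest case, surface-form synonymy (the same tag written with different case, spacing, or punctuation), the sketch below merges variants under the most frequent spelling. `normalize_tag`, `merge_tag_variants`, and the tag counts are hypothetical; polysemy, hypernymy, and hyponymy would need semantic information this sketch does not use.

```python
import re
from collections import Counter

def normalize_tag(tag):
    """Lower-case a tag and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]+", "", tag.lower())

def merge_tag_variants(tag_counts):
    """Merge tags that differ only in case, spacing, or punctuation.

    The most frequent spelling in each group becomes the canonical tag
    (ties broken alphabetically) and receives the group's total count.
    """
    groups = {}
    for tag, count in tag_counts.items():
        groups.setdefault(normalize_tag(tag), Counter())[tag] += count
    merged = {}
    for variants in groups.values():
        canonical = max(variants, key=lambda t: (variants[t], t))
        merged[canonical] = sum(variants.values())
    return merged

# Hypothetical tag usage counts.
tag_counts = {"hip-hop": 40, "Hip Hop": 25, "hiphop": 10, "jazz": 30, "Jazz": 5}
merged = merge_tag_variants(tag_counts)
```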


Future Work: Programming


Radically different programming
environment


SQL


LINQ library package in C#

Future Work: Scalability


Distributed SQL database
implementation


Just-in-time compilation


Event-based recalculation of algorithm results


Parallel execution of algorithms


Multi-threaded algorithms