VLDB 2012 Summary
Day 1: QDB Workshop
(MPI, creator of YAGO), Michael
(Oxford), Michael Ley (Trier, creator of
DBLP), Luna Dong (
Entity Resolution by Erhard
Data Cleaning by
Paper from our group: truth finding on
Observation: Data quality issues draw more
attention in today’s big data environment.
An analysis of structured data on the web.
: Crowdsourcing Entity Resolution.
CDAS: A Crowdsourcing Data Analytics System
Entity Resolution Tutorial by
Panel on big data (quote: hot term played up by machine learning people,
but VERY LARGE database has cared about big data since the very beginning,
people need to catch up)
Observation: crowdsourcing is becoming popular. One
important issue is still data quality, but there are some
new aspects, e.g. cost.
Gave two talks:
Marina’s pattern mining paper.
Truth finding paper.
PARIS: probabilistic alignment of relations, instances and schema
(Matching of two knowledge bases)
Learning expressive linkage rules using genetic programming (Entity
Supercharging recommender systems using taxonomies for learning
user purchase behavior (Utilizing taxonomies for recommendation)
Who tags what? An analysis framework (social tagging)
Whom to ask? Jury selection for decision making tasks on micro-blog
services (Crowdsourcing, considering cost)
Multilingual schema matching for Wikipedia infoboxes
REX: explaining relations between entity pairs (efficiently finding graph
patterns linking two entities).
Quite a few machine learning style papers.
However, the general audience is not very familiar with
machine learning, although they are very interested in
tutorial, she asked how many
people had heard of LDA: around 10 raised
hands; and Markov Logic
(including me and
If you have a new machine learning method that
happens to be scalable, and has interesting
community), consider sending
it to VLDB.