Incorporating User Provided Constraints into Document Clustering

5143 Cass Avenue
431 State Hall
Detroit, Michigan 48202
+1.313.577.2477
Fax +1.313.577.6868
http://www.cs.wayne.edu






Incorporating User Provided Constraints into Document Clustering



Yanhua Chen
Dept. of Computer Science
Wayne State University


Tuesday, February 12, 2008
3:00pm Rm 110 Purdy-Kresge Library




Abstract:


Document clustering without any prior knowledge or background information is a
challenging problem. In this talk, we introduce SS-NMF: a semi-supervised nonnegative
matrix factorization framework for document clustering. In SS-NMF, users are able to
provide supervision for document clustering in terms of pairwise constraints on a few
documents specifying whether they “must” or “cannot” be clustered together. Through an
iterative algorithm, we perform symmetric tri-factorization of the document similarity
matrix to infer the document clusters. Theoretically, we show that SS-NMF provides a
general framework for semi-supervised clustering and that existing approaches can be
considered as special cases of SS-NMF. Through extensive experiments conducted on
publicly available data sets, we demonstrate the superior performance of SS-NMF for
clustering documents.
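
To make the idea in the abstract concrete, the following is a minimal sketch and not the
speaker's implementation. It assumes that pairwise constraints are folded into the
similarity matrix by an additive reward or penalty, and that the adjusted matrix A is
approximated as G S G^T with nonnegative factors using multiplicative updates. The
function names (apply_constraints, tri_factorize, cluster_labels), the weight parameter,
the clamp at zero, and the square-root damping of the G update are all assumptions made
for illustration.

```python
import random


def matmul(X, Y):
    """Plain-Python matrix product of two matrices given as lists of rows."""
    Yt = list(zip(*Y))
    return [[sum(x * y for x, y in zip(row, col)) for col in Yt] for row in X]


def transpose(X):
    return [list(col) for col in zip(*X)]


def apply_constraints(A, must_link, cannot_link, weight=1.0):
    """Return a copy of similarity matrix A adjusted by pairwise constraints.

    Must-link pairs are rewarded and cannot-link pairs are penalized.
    The additive weight and the clamp at zero are assumptions of this sketch.
    """
    adjusted = [row[:] for row in A]
    for i, j in must_link:
        adjusted[i][j] = adjusted[j][i] = A[i][j] + weight
    for i, j in cannot_link:
        adjusted[i][j] = adjusted[j][i] = max(A[i][j] - weight, 0.0)
    return adjusted


def tri_factorize(A, k, iters=300, seed=0, eps=1e-9):
    """Approximate a nonnegative symmetric A by G S G^T with G, S >= 0."""
    rng = random.Random(seed)
    n = len(A)
    G = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    # Symmetric starting point for the k x k cluster-interaction matrix S.
    S = [[1.0 if a == b else 0.1 for b in range(k)] for a in range(k)]
    for _ in range(iters):
        Gt = transpose(G)
        GtG = matmul(Gt, G)
        # S <- S * (G^T A G) / (G^T G S G^T G), elementwise
        num = matmul(matmul(Gt, A), G)
        den = matmul(matmul(GtG, S), GtG)
        S = [[S[a][b] * num[a][b] / (den[a][b] + eps) for b in range(k)]
             for a in range(k)]
        # G <- G * sqrt((A G S) / (G S G^T G S)), elementwise.
        # The square root is a damping choice assumed here for stability.
        num = matmul(matmul(A, G), S)
        den = matmul(matmul(matmul(G, S), GtG), S)
        G = [[G[i][c] * (num[i][c] / (den[i][c] + eps)) ** 0.5 for c in range(k)]
             for i in range(n)]
    return G, S


def cluster_labels(G):
    """Label each document by its largest entry after scaling columns to unit max."""
    col_max = [max(col) or 1.0 for col in zip(*G)]
    return [max(range(len(row)), key=lambda c: row[c] / col_max[c]) for row in G]


# Toy usage: four documents, one must-link pair and one cannot-link pair.
similarity = [[1.0, 0.9, 0.1, 0.1],
              [0.9, 1.0, 0.1, 0.1],
              [0.1, 0.1, 1.0, 0.9],
              [0.1, 0.1, 0.9, 1.0]]
adjusted = apply_constraints(similarity, must_link=[(0, 1)], cannot_link=[(1, 2)])
G, S = tri_factorize(adjusted, k=2)
print(cluster_labels(G))
```

The columns of G are rescaled before labels are read off because the product G S G^T is
unchanged when a column of G is scaled and S is rescaled to compensate, so raw column
magnitudes carry no cluster information on their own.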





Biography:


Yanhua Chen received the MS degree in Computer Science and Engineering from
Michigan State University, East Lansing, MI, in 2004. She is currently a PhD student at
the Machine Vision and Pattern Recognition Laboratory of the Department of Computer
Science, Wayne State University, Detroit, MI. Her research interests are in the areas of
pattern recognition, machine learning, data mining, graph theory, and information
retrieval.




DEPARTMENT OF COMPUTER SCIENCE