UNIVERSITY GRADUATE SCHOOL BULLETIN ANNOUNCEMENT Florida International University



University Graduate School


Doctoral Dissertation Defense


Abstract


Document Understanding using Data Mining and Machine Learning Techniques


by


Dingding Wang



With the explosive growth in the volume and complexity of document data (e.g., news,
emails, blogs, web pages), it has become necessary to semantically understand documents and
deliver meaningful information to users. Research on these problems spans data mining,
information retrieval, and machine learning. For example, document clustering and
summarization are two fundamental techniques for understanding document data and have
attracted much attention in recent years. Given a collection of documents, document clustering
aims to partition them into different groups to provide efficient document browsing and
navigation mechanisms. One underexplored problem in document clustering is how to generate
a meaningful interpretation for each document cluster produced by the clustering process.
Document summarization is another effective technique for document understanding, which
generates a summary by selecting sentences that deliver the major or topic-relevant
information in the original documents. How to improve automatic summarization performance
and how to apply it to newly emerging problems are two valuable research directions.
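
As a rough illustration of the sentence-selection idea described above, the following minimal sketch scores each sentence by the average corpus frequency of its words and keeps the top-scoring ones. It is not the method developed in the dissertation; the function names, the stopword list, and the frequency-based score are all assumptions chosen for brevity.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "are", "for", "on", "it", "that", "with"}


def split_sentences(text):
    # Naive splitter: break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercase alphabetic tokens with stopwords removed.
    return [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOPWORDS]


def summarize(text, k=1):
    """Return the k highest-scoring sentences, in their original order."""
    sentences = split_sentences(text)
    # Word frequencies over the whole input stand in for "major" information.
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        # Average frequency, so long sentences are not favored automatically.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)[:k]
    return [sentences[i] for i in sorted(ranked)]
```

Calling `summarize(text, k=2)` on a short passage returns the two sentences whose words recur most often across the passage. A practical system would replace the raw frequency score with a stronger relevance model and handle sentence boundaries more carefully.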


To help people capture the semantics of documents effectively and efficiently, the dissertation
focuses on developing effective data mining and machine learning algorithms and systems for
(1) integrating document clustering and summarization to obtain meaningful document clusters
with summarized interpretations, (2) improving document summarization performance and
building document understanding systems to address real-world applications, and
(3) summarizing the differences and evolution of multiple document sources.


Date: September 8, 2010

Department: School of Computing and Information Sciences

Time: 2:00 p.m.

Major Professor: Dr. Tao Li

Place: ECS 349