MIS 542 Data Warehousing and Data Mining

Spring 2004


Presentation Papers

A rough list of presentation papers

See me to schedule a presentation day and paper

[R] reserved


Due: 22.03.2004

Data Preprocessing

1. V. Raman and J. M. Hellerstein. Potter's Wheel: An Interactive Data Cleaning System. Proc. 2001 Int. Conf. on Very Large Data Bases (VLDB'01), Rome, Italy, pp. 381-390, Sept. 2001.

2. H. Galhardas, D. Florescu, D. Shasha, E. Simon, and C.-A. Saita. Declarative Data Cleaning: Language, Model, and Algorithms. Proc. 2001 Int. Conf. on Very Large Data Bases (VLDB'01), Rome, Italy, pp. 371-380, Sept. 2001.


3. T. Dasu, T. Johnson, S. Muthukrishnan, and V. Shkapenyuk. Mining Database Structure; Or, How to Build a Data Quality Browser. Proc. 2002 ACM-SIGMOD Int. Conf. Management of Data (SIGMOD'02), Madison, WI, pp. 240-251, June 2002.

Due: 29.03.2004

Data Warehouse, OLAP, and Data Generalization

1. S. Sarawagi, R. Agrawal, and N. Megiddo. Discovery-driven exploration of OLAP data cubes. In Proc. Int. Conf. of Extending Database Technology (EDBT'98), Valencia, Spain, pp. 168-182, March 1998.

2. L. V. S. Lakshmanan, J. Pei, and J. Han. Quotient Cube: How to Summarize the Semantics of a Data Cube. Proc. 2002 Int. Conf. on Very Large Data Bases (VLDB'02), Hong Kong, China, Aug. 2002.

3. J. Han. Towards on-line analytical mining in large databases. ACM SIGMOD Record, 27:97-107, 1998.

4. J. Han, Y. Cai, and N. Cercone. Knowledge Discovery in Databases: An Attribute-Oriented Approach. In VLDB'92, pp. 547-559, Vancouver, Canada, August 1992.

5. G. Sathe and S. Sarawagi. Intelligent Rollups in Multidimensional OLAP Data. In Proc. Int. Conf. of Very Large Data Bases (VLDB'01), Rome, Italy, pp. 531-540.

Due: 12.04.2004

Cluster Analysis


1. R. Ng and J. Han. Efficient and effective clustering method for spatial data mining. In VLDB'94, pp. 144-155, Santiago, Chile, Sept. 1994.

2. T. Zhang, R. Ramakrishnan, and M. Livny. BIRCH: An efficient data clustering method for very large databases. In SIGMOD'96, pp. 103-114, Montreal, Canada, June 1996.

3. S. Guha, R. Rastogi, and K. Shim. CURE: An efficient clustering algorithm for large databases. In SIGMOD'98, pp. 73-84, Seattle, Washington, June 1998.

4. S. Guha, R. Rastogi, and K. Shim. ROCK: A robust clustering algorithm for categorical attributes. In ICDE'99, pp. 512-521, Sydney, Australia, March 1999.

5. M. Ankerst, M. Breunig, H.-P. Kriegel, and J. Sander. OPTICS: Ordering points to identify the clustering structure. In SIGMOD'99, pp. 49-60, Philadelphia, PA, June 1999.

6. Beil F., Ester M., Xu X.: "Frequent Term-Based Text Clustering", Proc. 8th Int. Conf. on Knowledge Discovery and Data Mining (KDD'02), Edmonton, Alberta, Canada, 2002.

7. Huang Z. “Extensions to the K-Means Algorithm for Clustering Large Datasets with Categorical Values”, Data Mining and Knowledge Discovery, 2:283-304, 1998.

8. Huang Z. “Clustering Large Datasets with Mixed Numeric and Categorical Values”


Due: 03.05.2004

Mining Frequent Patterns and Association Rules in Large Databases


2. J. Han and Y. Fu. Discovery of multiple-level association rules from large databases. In VLDB'95, pp. 420-431, Zürich, Switzerland, Sept. 1995.

3. R. Srikant and R. Agrawal. Mining generalized association rules. In VLDB'95, pp. 407-419, Zürich, Switzerland, Sept. 1995.

4. R. Srikant and R. Agrawal. Mining quantitative association rules in large relational tables. In SIGMOD'96, pp. 1-12, Montreal, Canada, June 1996.

5. B. Lent, A. Swami, and J. Widom. Clustering association rules. In ICDE'97, pp. 220-231, Birmingham, England, April 1997.

6. S. Brin, R. Motwani, and C. Silverstein. Beyond market basket: Generalizing association rules to correlations. In SIGMOD'97, pp. 265-276, Tucson, Arizona, May 1997.

7. A. Savasere, E. Omiecinski, and S. B. Navathe. Mining for Strong Negative Associations in a Large Database of Customer Transactions. In ICDE'98, Orlando, Florida, Feb. 1998.

8. E. Omiecinski. Alternative Interest Measures for Mining Associations. IEEE Trans. Knowledge and Data Engineering, 15(1):57-69, 2003.

9. Y.-K. Lee, W.-Y. Kim, Y. D. Cai, and J. Han, “CoMine: Efficient Mining of Correlated Patterns”, Proc. 2003 Int. Conf. on Data Mining (ICDM'03), Melbourne, FL, Nov. 2003.


Due: 03.05.2004

Mining Sequential Patterns, Structured Patterns, and Time-Series Mining

1. Mannila H., Toivonen H., and Verkamo A. I. Discovery of Frequent Episodes in Event Sequences. Data Mining and Knowledge Discovery, vol. 1, no. 3, pp. 259-289, 1997.

2. [R] J. Han, G. Dong, and Y. Yin. Efficient mining of partial periodic patterns in time series database. In ICDE'99, pp. 106-115, Sydney, Australia, April 1999.

3. J. Pei, J. Han, and W. Wang, “Mining Sequential Patterns with Constraints in Large Databases”, Proc. 2002 Int. Conf. on Information and Knowledge Management (CIKM'02), Washington, D.C., Nov. 2002.

4. X. Yan and J. Han, “gSpan: Graph-Based Substructure Pattern Mining”, Proc. 2002 Int. Conf. on Data Mining (ICDM'02), Maebashi, Japan, Dec. 2002.

5. X. Yan and J. Han, “CloseGraph: Mining Closed Frequent Graph Patterns”, Proc. 2003 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD'03), Washington, D.C., Aug. 2003.

6. X. Yan, J. Han, and R. Afshar, “CloSpan: Mining Closed Sequential Patterns in Large Datasets”, Proc. 2003 SIAM Int. Conf. on Data Mining (SDM'03), San Francisco, CA, May 2003.

Due: 24.05.2004

Classification and Prediction


1. J. Shafer, R. Agrawal, and M. Mehta. SPRINT: A scalable parallel classifier for data mining. In VLDB'96, pp. 544-555, Bombay, India, Sept. 1996.

2. J. Gehrke, V. Ganti, R. Ramakrishnan, and W.-Y. Loh. BOAT -- Optimistic Decision Tree Construction. In SIGMOD'99, Philadelphia, Pennsylvania, 1999.

3. B. Liu, W. Hsu, and Y. Ma. Integrating Classification and Association Rule Mining. Proc. 1998 Int. Conf. Knowledge Discovery and Data Mining (KDD'98), New York, NY, Aug. 1998.

4. W. Li, J. Han, and J. Pei. CMAR: Accurate and Efficient Classification Based on Multiple Class-Association Rules. Proc. 2001 Int. Conf. on Data Mining (ICDM'01), San Jose, CA, Nov. 2001.

5. X. Yin and J. Han, “CPAR: Classification based on Predictive Association Rules”, Proc. 2003 SIAM Int. Conf. on Data Mining (SDM'03), San Francisco, CA, May 2003.

6. X. Yin, J. Han, J. Yang, and P. S. Yu, “CrossMine: Efficient Classification across Multiple Database Relations”, Proc. 2004 Int. Conf. on Data Engineering (ICDE'04), Boston, MA, March 2004.

Other topics about Classification or Prediction

Classification by Genetic Algorithms
Classification by Rough Sets
Classification by Fuzzy Set Approach
Support Vector Machines
Case-Based Reasoning
Regression Trees

Due: 24.05.2004

Web Mining


1. S. Chakrabarti, B. E. Dom, S. R. Kumar, P. Raghavan, S. Rajagopalan, A. Tomkins, D. Gibson, and J. Kleinberg. Mining the Web's link structure. Computer, 32(8):60-67, 1999.

2. J. M. Kleinberg. Authoritative Sources in a Hyperlinked Environment. Journal of the ACM, 46(5):604-632, 1999.

3. K. Wang, S. Zhou, and S. C. Liew. Building hierarchical classifiers using class proximity. In VLDB'99, Edinburgh, UK, Sept. 1999.

4. J. Han and K. C.-C. Chang, “Data Mining for Web Intelligence”, Computer, Nov. 2002.

5. Corin R. Anderson, Pedro Domingos, and Daniel S. Weld. Personalizing Web Sites for Mobile Users. In WWW 2001, pp. 565-575, 2001.

Due: 24.05.2004

Data Mining Applications and Trends in Data Mining


1. C. Clifton and D. Marks. Security and Privacy Implications of Data Mining. In Proc. 1996 SIGMOD'96 Workshop on Research Issues on Data Mining and Knowledge Discovery (DMKD'96), Montreal, Canada, pp. 15-20, June 1996.

2. R. Agrawal and R. Srikant. Privacy-preserving data mining. In Proc. 2000 ACM-SIGMOD Int. Conf. Management of Data (SIGMOD'00), pp. 439-450, Dallas, TX, May 2000.


3. H. V. Jagadish, J. Madar, and R. Ng. Semantic compression and pattern extraction with fascicles. In Proc. 1999 Int. Conf. Very Large Data Bases (VLDB'99), pp. 186-197, Edinburgh, UK, Sept. 1999.

4. Qiming Chen, Umesh Dayal, and Meichun Hsu. OLAP-based Scalable Profiling of Customer Behavior. In Proc. 1999 Int. Conf. Data Warehousing and Knowledge Discovery (DaWaK'99), Italy, 1999.


5. Ron Kohavi. Mining E-Commerce Data: The Good, the Bad, and the Ugly. KDD 2001, 2001.

6. [R] P. Domingos and M. Richardson. Mining the Network Value of Customers. In Proc. 2001 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, pp. 57-66, 2001. San Francisco, CA: ACM Press.