Homework/Programming Assignments for Data Mining Spring 2008


Nov 25, 2013



You will test and compare at least two algorithms for classification, two for clustering, and two for creating
association rules. You may use any available data mining software package, or you may implement the
algorithms yourself.
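
For the association rule part, comparisons are usually stated in terms of support and confidence. The following is a minimal sketch of those two measures in plain Python, not a full rule-mining algorithm. The function names and the toy transactions are hypothetical and chosen only for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) over the transactions.

    Assumes the antecedent occurs in at least one transaction;
    otherwise the division below raises ZeroDivisionError.
    """
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)


# Hypothetical toy transactions, for illustration only.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

print("support({bread, milk}) =", support(baskets, {"bread", "milk"}))
print("confidence(bread -> milk) =", confidence(baskets, {"bread"}, {"milk"}))
```

Rules produced by two different algorithms (or by a software package) can be scored with the same two functions, so that both are judged on the same footing.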


Specifications

For each pair of algorithms, you will write a brief report that will:

1. Describe the algorithms you chose. Use algorithms we discussed in class.

2. Describe the datasets you used. The size of at least one dataset must be more than 1000 tuples. Use
larger datasets if possible.

3. If you implemented the algorithms, describe your implementations (including instructions on how they
are used).

4. If you used a data mining software package, describe how these algorithms are used from the package,
and provide a URL or other reference where more information may be found.

5. Analyze the performance of the algorithms in your tests, discussing any advantages and/or
disadvantages that the tests indicate.

6. Discuss any conclusions you can draw from the analyses.

7. If you implemented all or any part of the algorithms, upload your code on eccentric. You may also be
asked for hard copies of your code.
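
As an illustration of item 5, one possible way to compare two classifiers is to train both on the same data and measure accuracy and running time on the same held-out tuples. The sketch below assumes a simple holdout split and uses two stand-in classifiers (a majority-class baseline and 1-nearest-neighbor) on synthetic data. These are not necessarily the algorithms discussed in class, and every name in the code is hypothetical.

```python
import random
import time


def majority_classifier(train):
    """Baseline: always predict the most common training label."""
    labels = [label for _, label in train]
    most_common = max(set(labels), key=labels.count)
    return lambda x: most_common


def nearest_neighbor_classifier(train):
    """1-nearest-neighbor using squared Euclidean distance."""
    def predict(x):
        def dist(row):
            return sum((a - b) ** 2 for a, b in zip(row[0], x))
        return min(train, key=dist)[1]
    return predict


def evaluate(build, train, test):
    """Return (accuracy, seconds) for one classifier on one split."""
    start = time.perf_counter()
    predict = build(train)
    correct = sum(1 for x, label in test if predict(x) == label)
    return correct / len(test), time.perf_counter() - start


def make_data(n, seed=0):
    """Hypothetical synthetic data: two well-separated 2-D classes."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.choice([0, 1])
        center = 0.0 if label == 0 else 10.0
        rows.append(((center + rng.random(), center + rng.random()), label))
    return rows


data = make_data(1200)                 # more than 1000 tuples
train, test = data[:900], data[900:]   # same split for both classifiers
results = {
    "majority": evaluate(majority_classifier, train, test),
    "1-NN": evaluate(nearest_neighbor_classifier, train, test),
}
for name, (accuracy, seconds) in results.items():
    print(f"{name}: accuracy={accuracy:.3f}, time={seconds:.3f}s")
```

A single split gives only one estimate per classifier. Repeating the comparison over several splits or several datasets would give firmer grounds for the conclusions asked for in item 6.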


For example, an outline for a report on clustering might look like this:

Clustering Algorithms for Data Mining

Introduction

A brief explanation of what clustering algorithms do and what they are used for

Algorithms

A description of the algorithms you used

Software

A description of your implementations or of the software package you used

Data

A description of the datasets you used

Results and Discussion

Describe and discuss the results of using the algorithms for clustering

Conclusion

Draw conclusions. Is one algorithm clearly better than the others? Are there some things that one does
better, and other things that another does better? Is there reason to believe that other tests would indicate
different advantages and disadvantages?
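
For the Results and Discussion section of such a clustering report, one common way to put two clusterings of the same data on a common scale is the sum of squared errors (SSE): the total squared distance from each point to the centroid of its cluster. The sketch below is a minimal plain-Python version. The function names, the toy points, and the two cluster assignments are hypothetical.

```python
def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def sse(points, assignment):
    """Sum of squared distances from each point to its cluster centroid."""
    clusters = {}
    for point, cluster_id in zip(points, assignment):
        clusters.setdefault(cluster_id, []).append(point)
    total = 0.0
    for members in clusters.values():
        center = centroid(members)
        for p in members:
            total += sum((a - b) ** 2 for a, b in zip(p, center))
    return total


# Hypothetical toy data and two hypothetical cluster assignments.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
labels_a = [0, 0, 1, 1]   # groups nearby points together
labels_b = [0, 1, 0, 1]   # groups distant points together

print("SSE A:", sse(points, labels_a))
print("SSE B:", sse(points, labels_b))
```

Lower SSE indicates tighter clusters, but the measure is only comparable between results with the same number of clusters, since adding clusters can only keep SSE the same or reduce it.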


Grading Criteria

Assignments will be graded on the following criteria:

1. Appropriate use of the algorithms.

2. Using appropriate datasets.

3. Completeness, accuracy and clarity of the report.

4. Thoroughness of your analysis.

5. Drawing of reasonable conclusions.


Tentative Due Dates

All assignments are due at the beginning of class on the due date.

Classification


Due date to be determined later



Clustering



Association Rules