STAT480 Data Mining for Statistics
Lecturer: Assist. Prof. Dr. Mete Eminağaoğlu
Quiz No. 3
Date: 31st May, 2013
Duration: 60 minutes
Name & Surname: __________________________________________
Signature: _________________________________________________
General Instructions:
Open your web browser and go to http://meminagaoglu.yasar.edu.tr/
Then select the menu “STAT480” and then the sub-menu “Quiz 3”.
To open this page, you must enter the correct password. Password: Behemoth
Q1. (30 points)
Click on Question1 and download new-sales.zip (zipped .arff file). You will use only these two regression algorithms in Weka: Linear Regression and Simple Linear Regression.
1.1. (15 points)
For each of these two algorithms: first train the model with “new-sales-train.arff”, then test the model with “new-sales-test.arff”. Analyze the results that you have derived from both regression algorithms' train and test performances.
Which algorithm is better (more accurate / reliable)? _____________________________________
Why? (You must answer by giving the necessary and relevant results that you have found in Weka.)
1.2. (15 points)
Write down the regression equation that you have found for the best algorithm in part 1.1. Using this equation, calculate the “total-no-sales” prediction for:
others-price = 9.5
our-price = 10
our-cost = 9.18
inflation-rate = 3.47
_____________________________________________________________________
Q2. (20 points)
outlook   temperature   humidity   day-of-week   play
rainy     mild          high       Saturday      no
sunny     hot           high       other         no
sunny     hot           high       Sunday        no
rainy     mild          high       other         no
rainy     cool          normal     Sunday        no
rainy     cool          normal     Saturday      no
sunny     mild          low        other         yes
sunny     hot           normal     other         yes
sunny     mild          high       Saturday      yes
sunny     hot           normal     Sunday        yes
rainy     mild          normal     Sunday        yes
sunny     mild          low        Saturday      yes
Suppose that you use the Tertius algorithm on the above data set and find the following rule:
If (humidity=normal) AND (play=yes) THEN (temperature=hot) OR (day-of-week=Sunday)
According to the Tertius algorithm, find the values that would be obtained for this rule. You must show all the necessary calculations.
Expected = ?
Observed = ?
Confirmation = ?
True Positive rate = ?
False Positive rate = ?
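The raw counts behind this rule can be tallied as in the minimal Python sketch below. It only counts how many of the twelve instances satisfy the rule's body and head. It does not apply any formula for Expected, Observed or Confirmation, which should follow the definitions given in the course. The helper names `body` and `head` are illustrative assumptions, not part of Weka.

```python
# Q2 data set: (outlook, temperature, humidity, day_of_week, play)
rows = [
    ("rainy", "mild", "high", "Saturday", "no"),
    ("sunny", "hot", "high", "other", "no"),
    ("sunny", "hot", "high", "Sunday", "no"),
    ("rainy", "mild", "high", "other", "no"),
    ("rainy", "cool", "normal", "Sunday", "no"),
    ("rainy", "cool", "normal", "Saturday", "no"),
    ("sunny", "mild", "low", "other", "yes"),
    ("sunny", "hot", "normal", "other", "yes"),
    ("sunny", "mild", "high", "Saturday", "yes"),
    ("sunny", "hot", "normal", "Sunday", "yes"),
    ("rainy", "mild", "normal", "Sunday", "yes"),
    ("sunny", "mild", "low", "Saturday", "yes"),
]

def body(r):
    # Rule body: (humidity=normal) AND (play=yes)
    return r[2] == "normal" and r[4] == "yes"

def head(r):
    # Rule head: (temperature=hot) OR (day-of-week=Sunday)
    return r[1] == "hot" or r[3] == "Sunday"

n = len(rows)
n_body = sum(body(r) for r in rows)                     # body holds
n_head = sum(head(r) for r in rows)                     # head holds
n_both = sum(body(r) and head(r) for r in rows)         # rule confirmed
n_counter = sum(body(r) and not head(r) for r in rows)  # body holds, head fails

print(n, n_body, n_head, n_both, n_counter)  # prints: 12 3 6 3 0
```

These counts are the inputs to the contingency-table style calculations the question asks for.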
Q3. (30 points)
hair     height    weight    burned
blonde   average   light     yes
blonde   tall      average   no
brown    short     average   no
blonde   short     average   yes
red      average   heavy     yes
brown    tall      heavy     no
blonde   short     light     no
Suppose that you have the above original data set. You will use it for machine learning classification. “burned” is the class attribute.
Q3.1. If you use the k-NN classifier algorithm (taking k = 3 and using Manhattan distance for the distance function), what will be predicted as the class of this record: yes or no? You must show all the necessary calculations.
hair     height   weight
blonde   short    average
Q3.2. If you use the NN simple instance-based algorithm (using Euclidean distance for the distance function), what will be predicted as the class of this record: yes or no? You must show all the necessary calculations.
hair    height   weight
brown   tall     heavy
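The distance calculations in Q3.1 and Q3.2 can be checked with a minimal Python sketch such as the one below. It assumes that every attribute is treated as nominal, so the per-attribute difference is 0 when two values match and 1 otherwise. If the course treats height and weight as ordinal, the distances would change. The function names are illustrative, not Weka's API.

```python
from collections import Counter

# Q3 training data: ((hair, height, weight), burned)
train = [
    (("blonde", "average", "light"), "yes"),
    (("blonde", "tall", "average"), "no"),
    (("brown", "short", "average"), "no"),
    (("blonde", "short", "average"), "yes"),
    (("red", "average", "heavy"), "yes"),
    (("brown", "tall", "heavy"), "no"),
    (("blonde", "short", "light"), "no"),
]

def mismatches(a, b):
    # Assumption: nominal attributes differ by 1 when unequal, 0 when equal.
    return [0 if x == y else 1 for x, y in zip(a, b)]

def manhattan(a, b):
    return sum(mismatches(a, b))

def euclidean(a, b):
    return sum(d * d for d in mismatches(a, b)) ** 0.5

def knn_predict(query, k, distance):
    # Sort training records by distance to the query (stable sort, so ties
    # keep their original order) and take a majority vote among the first k.
    ranked = sorted(train, key=lambda item: distance(query, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

q31 = knn_predict(("blonde", "short", "average"), 3, manhattan)
q32 = knn_predict(("brown", "tall", "heavy"), 1, euclidean)
print(q31, q32)  # prints: no no
```

In Q3.1 three training records tie at distance 1, but all three have class "no", so the vote does not depend on how the tie is broken under this assumption.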
Q4. (20 points)
Team Name   no-of-wins (this season)   no-of-wins (predicted - next season)
TS          20                         22
BNK         14                         13
AU          0                          2
SSN         7                          7
VK          12                         10
RM          20                         5
PO          2                          0
RYL         10                         11
The report above gives the next-season performance predictions for eight different football teams. It was produced by some machine learning numeric prediction method. According to this report, calculate the evaluation values given below. You must show all the necessary calculations.
MSE (mean squared error) = ?
Mean absolute error = ?
Relative squared error = ?
According to these evaluation values, what can you say about this prediction performance and data sample? What could you do to improve the prediction performance?
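The three evaluation values can be checked with a minimal Python sketch such as the one below. It assumes that the "this season" column holds the actual values and the "predicted - next season" column holds the predictions, and that relative squared error is the sum of squared errors divided by the sum of squared deviations of the actual values from their mean. The variable names are illustrative.

```python
# Q4 table, in row order: TS, BNK, AU, SSN, VK, RM, PO, RYL
actual = [20, 14, 0, 7, 12, 20, 2, 10]     # assumption: "this season" column
predicted = [22, 13, 2, 7, 10, 5, 0, 11]   # "predicted - next season" column

n = len(actual)
errors = [p - a for p, a in zip(predicted, actual)]

# Mean squared error and mean absolute error
sse = sum(e * e for e in errors)
mse = sse / n
mae = sum(abs(e) for e in errors) / n

# Relative squared error: squared error relative to always predicting the mean
mean_actual = sum(actual) / n
rse = sse / sum((a - mean_actual) ** 2 for a in actual)

print(mse, mae, round(rse, 4))  # prints: 30.375 3.125 0.6233
```

Listing the individual errors shows that a single team (RM, with an error of -15) contributes 225 of the 243 units of total squared error, which is worth keeping in mind when interpreting the MSE.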