regression powerpoint slides

unknownlipps, Artificial Intelligence and Robotics

16 Oct 2013

Regression

Population Covariance and Correlation

Sample Correlation

Examples: r = .98, r = -.04, r = -.79
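The sample correlation can be computed directly from its definition. The following is a minimal sketch in Python (the slides themselves use R); the function name `sample_correlation` and the data values are made up for illustration and are not the data behind the examples above.

```python
import math

def sample_correlation(xs, ys):
    """Pearson sample correlation r between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations.
    # The 1/(n-1) factors of the sample covariance and variances cancel in r.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative data with a strong positive linear relationship.
heights = [1.0, 2.0, 3.0, 4.0, 5.0]
growth = [2.1, 3.9, 6.2, 8.0, 9.8]
r = sample_correlation(heights, growth)
print(round(r, 3))
```

A value near 1 or -1 indicates a strong linear relationship; a value near 0 indicates little linear relationship.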

Linear Model

Figure: DATA with REGRESSION LINE

(Still) Linear Model

Figure: DATA with REGRESSION CURVE

Parameter Estimation

Minimize SSE over possible parameter values
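For a straight-line model the SSE-minimizing intercept and slope have a closed form. A minimal sketch in Python, assuming a single predictor; the helper names `fit_line` and `sse` and the data are hypothetical, chosen only for illustration.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) minimizing the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def sse(xs, ys, intercept, slope):
    """Sum of squared errors of the line intercept + slope * x."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Made-up illustrative data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.0, 9.8]

intercept, slope = fit_line(xs, ys)
best_sse = sse(xs, ys, intercept, slope)

# Any other parameter values give a larger SSE on the same data.
worse_sse = sse(xs, ys, intercept + 0.1, slope)
print(intercept, slope, best_sse, worse_sse)
```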

Fitting a linear model in R

Intercept parameter has p-value .0623, so it is significant at the .10 level but not at .05

Slope parameter is significant at the .001 level, so reject the null hypothesis that the slope is zero

Residual Standard Error:

R-squared is the correlation squared, also the % of variation explained by the linear regression
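The relationship between R-squared and the correlation can be checked numerically for a one-predictor fit. This is a sketch in Python rather than the R output shown on the slides; the data are made up, and the residual standard error is computed as sqrt(SSE / (n - 2)), the usual definition for a line with two fitted parameters.

```python
import math

# Made-up illustrative data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.0, 9.8]
n = len(xs)

mean_x = sum(xs) / n
mean_y = sum(ys) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)  # total variation in y

# Least-squares line.
slope = sxy / sxx
intercept = mean_y - slope * mean_x
fitted = [intercept + slope * x for x in xs]

# Variation left unexplained by the line.
sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))

r = sxy / math.sqrt(sxx * syy)           # sample correlation
r_squared = 1.0 - sse / syy              # fraction of variation explained
residual_standard_error = math.sqrt(sse / (n - 2))

print(r ** 2, r_squared, residual_standard_error)
```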

Create a Best Fit Scatter Plot

Add X and Y Labels

Inspect Residuals

Multiple Regression

Example: we could try to predict change in diameter using change in height, starting height, and fertilizer

All variables are significant at the .05 level

The error went down and R-squared went up (this is good)

Can even handle categorical variables
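One common way to handle a categorical variable is to code it as a 0/1 indicator column and fit by least squares as usual. The sketch below does this in plain Python by solving the normal equations; it is an illustration under stated assumptions, not the R fit from the slides. The data, the coefficient values used to generate them, and the helper names `solve` and `fit_multiple` are all made up, and the response is generated without noise so the fit recovers the coefficients exactly.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_multiple(rows, ys):
    """Least-squares coefficients (intercept first) from the normal equations."""
    design = [[1.0] + list(row) for row in rows]
    k = len(design[0])
    xtx = [[sum(x[i] * x[j] for x in design) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * y for x, y in zip(design, ys)) for i in range(k)]
    return solve(xtx, xty)

# Made-up plants: (change in height, starting height, fertilizer yes/no).
plants = [
    (2.0, 10.0, "no"), (3.0, 12.0, "yes"), (1.0, 9.0, "no"),
    (4.0, 15.0, "yes"), (2.5, 11.0, "yes"), (3.5, 14.0, "no"),
]

# Code the categorical variable as a 0/1 indicator column.
rows = [(dh, h0, 1.0 if fert == "yes" else 0.0) for dh, h0, fert in plants]

# Noise-free made-up response so the recovered coefficients can be checked.
ys = [0.5 + 0.3 * dh + 0.1 * h0 + 0.4 * f for dh, h0, f in rows]

coefficients = fit_multiple(rows, ys)
print([round(c, 6) for c in coefficients])
```

The last coefficient is the estimated shift in the response for fertilized plants relative to unfertilized ones, holding the other variables fixed.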

Regression from a Machine Learning Point of View


Let’s “train” (fit) different models to a training data set

Then see how well they do at predicting a different “validation” data set (this is how ML competitions on Kaggle work)

http://archive.ics.uci.edu/ml/datasets/YearPredictionMSD

Response: Music Year

Predictors: Timbre (90 attributes)

Create a random sample of size 10,000 from the original 515,345 songs

Assign the first 5,000 to the training data set; the second 5,000 are saved for validation
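The sampling and splitting step can be sketched as follows. This is a Python illustration, not the R code from the slides; the function name `sample_and_split`, the fixed seed, and the use of row indices as a stand-in for the actual song records are assumptions made for the example.

```python
import random

def sample_and_split(rows, sample_size, seed=0):
    """Draw a random sample without replacement, then split it in half."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    sample = rng.sample(rows, sample_size)
    half = sample_size // 2
    return sample[:half], sample[half:]

# Stand-in for the 515,345 songs: row indices instead of real records.
song_ids = list(range(515345))

training, validation = sample_and_split(song_ids, 10000)
print(len(training), len(validation))
```

Because the sample is drawn without replacement, no song appears in both the training and validation sets.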

Fit a linear model and a generalized boosted regression model (other popular choices include random forests and neural networks)

The period after the tilde denotes that we will use all 91 variables for training; the -V1 throws out V1 (since this is what we’re predicting)

Next we make predictions for the validation data set

We compare the models by calculating the sum of squared errors (SSE) for each model
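The comparison step can be sketched on a toy problem. In this Python illustration the two models are a mean-only baseline and a one-predictor least-squares line, standing in for the linear and boosted models of the slides; the data and all names are made up. Both models are fitted on the training data only and scored on the validation data.

```python
def sse(actual, predicted):
    """Sum of squared errors between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))

# Made-up training and validation data.
train_x = [1.0, 2.0, 3.0, 4.0, 5.0]
train_y = [2.1, 3.9, 6.2, 8.0, 9.8]
valid_x = [6.0, 7.0, 8.0]
valid_y = [12.1, 13.8, 16.0]

# Model A: always predict the training mean.
mean_y = sum(train_y) / len(train_y)
pred_mean = [mean_y for _ in valid_x]

# Model B: least-squares line fitted on the training data.
mean_x = sum(train_x) / len(train_x)
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(train_x, train_y))
         / sum((x - mean_x) ** 2 for x in train_x))
intercept = mean_y - slope * mean_x
pred_line = [intercept + slope * x for x in valid_x]

# Score each model on data it has not seen.
validation_sse = {
    "mean only": sse(valid_y, pred_mean),
    "linear": sse(valid_y, pred_line),
}
best_model = min(validation_sse, key=validation_sse.get)
print(validation_sse, best_model)
```

The model with the smaller validation SSE is preferred, since it predicts unseen data better.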
