Building and Deploying an Early Warning System in Wisconsin

Jared Knowles
Research Analyst
Wisconsin Department of Public Instruction

Statewide PBIS Network Conference
Wisconsin Dells, Wisconsin
August 20th, 2013

Agenda


Principles for a Dropout Early Warning
System


Building a Statewide DEWS


Piloting a DEWS


Learning from the Pilot


DEWS Refresher


The DEWS score is calculated using a combination of demographic and student outcome measures to improve accuracy

Attendance, disciplinary events, assessment scores, and student mobility

Student risk is calculated individually for each student

Students are classified as at risk if their score crosses a threshold set by DPI; districts can use this threshold or ignore it

DEWS Refresher


DPI's early warning system is called the Dropout Early Warning System, or DEWS


DEWS provides a score from 0-100 for current 6th, 7th, and 8th graders

The score represents the rate at which students in previous cohorts who were similar to the current student graduated

A score of 75 means that 75% of prior students with similar characteristics graduated on time

DPI’s System is in Development


More than 60% of students who eventually do not graduate after 4 years of high school can be identified with current data before the start of 7th grade

DPI is working to improve on this with better techniques that allow students to be identified earlier and with more accuracy

The system will continually improve with better data, better mathematical models, and more real-time results


Classification

Project Plan


DEWS was developed during the 2012-13 school year

Pilot group of 34 schools identified in early 2013

Pilot materials delivered electronically in mid-April 2013, with participation in a follow-up survey:

Interpretative guide

Student reports for all current 7th graders

School report

School roster

Pilot materials mimic WISEdash; the final version is scheduled for a September 2013 rollout in WISEdash

Awareness and Communications


Title I Coordinators

Accountability Trainers

Statewide PBIS Network

CESA Support Network

SSEDAC

School Administrators Alliance

School Counselors Association

WERAC

National Forum on Education Statistics

REL Midwest

Partners at WCER

Department of Children and Families

Members of WISEexplore

DEWS Process

(Diagram:) STATE DATA (demographics) feed Student Risk Identification; LOCAL KNOWLEDGE (teacher / program context, parent input, special circumstances) adds CONTEXT; together these inform Intervention Strategies

Pilot Reports

DEWS Pilot Schools

Washington and Lincoln Mid.; Kenosha
Gilmore and Starbuck Mid.; Racine
Deer Creek Inter.; St. Francis
Franklin and Edison Mid.; Janesville
Aldrich Middle; Beloit
Toki and James Wright Mid.; MMSD
Riverview Elementary; Silver Lake
Oconto Mid.; Oconto
Menominee Indian Mid.; Menominee Indian
James Williams Mid.; Rhinelander
Riverview Mid.; Barron
Cumberland Mid.; Cumberland
Lac du Flambeau Elem.;
Lancaster Middle; Lancaster
Tomah Middle; Tomah
River Valley Middle; River Valley
Spring Hill Middle; Wisc. Dells
Waupaca Middle; Waupaca
Roosevelt Middle; Appleton
Webster Stanley Middle; Oshkosh
L.B. Clarke Middle; Two Rivers
Random Lake Mid.; Random Lake
Washington and Edison Mid.; Green Bay
D.C. Everest Mid.; DC Everest
Colfax Elementary; Colfax
Bloomer Middle; Bloomer
DeLong Middle; Eau Claire

Pilot Reports - Student

Pilot Reports - School

Survey Results


Survey sought to identify the utility of the DEWS reports in relation to existing early warning system / identification measures

Asked about:

usefulness of the DEWS report

usefulness of the interpretation guides

desire to have DEWS available

WISEdash usage

likelihood to use WISEdash if DEWS is included

Survey Summary


18 of the 34 participating pilot schools have responded to the survey so far (52.9%)

15 of the respondents indicated they "fully reviewed" the results

3 schools have been interviewed, with 3 more interviews scheduled

5 schools said staff reviewed the reports individually; 11 said staff reviewed them together as a group



DEWS Overall Valuable!

DEWS Identifies Students Missed

DEWS Does Not Miss Many Students

Student reports are the most positive element

The Student Roster is Valuable

School reports are well liked

Most respondents expect DEWS to be used at least annually

Annual Delivery Before the School Year is Strongly Preferred

DEWS can drive WISEdash usage

Principals and Student Services Staff Must Have WISEdash access

DEWS Beyond Fall 2013

DEWS as it exists is just a start. Several extensions of DEWS may be desired:

Deeper WISEdash integration?

Communication and professional development to raise awareness and to inform the use of interventions?

Extend coverage to earlier and later grades?

Increase accuracy?

Add college enrollment as a secondary warning?


Sample student report screens (all data is fictitious and for demonstration purposes only):

Student Overview

Get More Information

Mobility History

Detailed Assessment History

EWS in a Multi-Level System of Support (MLSS or RtI)

(Diagram:) Student Data Collected → At Risk Students Identified by EWS → Local Review of Results & Local Data → Plan Interventions → Determine How Well Interventions Work

Current and Future Partners

Current:

Title I Coordinators

Accountability Trainers

Statewide PBIS Network

CESA Support Network

SSEDAC

Future:

WCER / VARC

RtI Center

Pupil Services State Organizations

DPI Divisions and Teams?
In the Works

Research grant with WCER, funded through the Institute of Education Sciences (IES), to explore DEWS usage and dropout prevention strategies

Goal is to provide districts an answer to the "What now?" question that DEWS poses

Grant submission in September; notification by February 1, 2014



Extending Grades

Increase Accuracy with New Datasets

Inputs:

ISES / WSAS / SBAC

Attendance

Discipline

Mobility

Interventions

Outputs:

Student-specific identification

WISEdash Dashboard

RtI module

Local analysis

(The slide lays these inputs and outputs out on a NOW → LATER timeline.)

Model Types

Models Tried:

Probit (winner)

Logit

HLM

k-nearest neighbors (knn)

Gradient Boosted Machines

Random Forests

Models Not Yet Tried:

Cubist

Support Vector Machines

Multivariate Adaptive Regression Splines

Discriminant Analysis

Neural networks

Bayesian Model Averaging

Currently a manual process; automation is the next step
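Since the comparison is currently manual, here is a minimal sketch of how the model classes above could be compared on common cross-validation folds with the caret package; the data frame dews_data and its two-level factor outcome grad are illustrative names, not DPI's actual objects.

# Sketch: compare candidate model classes on the same CV folds.
# `dews_data` and `grad` (a two-level factor) are illustrative names.
library(caret)

set.seed(442)
ctrl <- trainControl(method = "cv", number = 5,
                     classProbs = TRUE, summaryFunction = twoClassSummary)

fits <- list(
  probit = train(grad ~ ., data = dews_data, method = "glm",
                 family = binomial(link = "probit"),
                 metric = "ROC", trControl = ctrl),
  gbm    = train(grad ~ ., data = dews_data, method = "gbm",
                 metric = "ROC", trControl = ctrl, verbose = FALSE),
  rf     = train(grad ~ ., data = dews_data, method = "rf",
                 metric = "ROC", trControl = ctrl)
)

# Compare cross-validated ROC across the model classes
summary(resamples(fits))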

Questions and Contact Info


E-mail: jared.knowles@dpi.wi.gov

Web: www.jaredknowles.com

Code: github.com/jknowles

Twitter: @jknowles


LET’S GET TECHNICAL


Free and Open Source Platform


Fully modular


Empirically Derived


Flexible


Extensible




DPI DEWS Features


A key feature of the DPI DEWS is that it is built on free and open source technologies

It is a series of 5 modules:

Data import

Data recoding / cleaning

Model selection

Prediction

Data export

It has some prerequisites to work


Free and Open Source


The EWS is written in R, the open-source statistical computing language

It is a series of modular scripts that perform some basic functions and may not be necessary everywhere

Each module expects data in certain formats and returns data in a specific format

It is currently entirely local to Wisconsin, but improvements made during the pilot phase should allow time to generalize it further



Technologies

Modules

Data Import

(Screenshot: all data is fictitious and for demonstration purposes only.)

Extract raw data from an Oracle data warehouse

The extract needs all records for a grade of students from grade 7 to graduation

The extract will be reused to get data on current grade 7 students for prediction
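A minimal sketch of what such an extract can look like in R using the DBI and ROracle packages; the connection details, table name (LDS_STUDENT_YEAR), and column names below are illustrative placeholders, not DPI's actual warehouse schema.

# Sketch: pull every yearly record for one cohort, grade 7 onward.
# Connection details and schema names are placeholders.
library(DBI)
library(ROracle)

con <- dbConnect(dbDriver("Oracle"), username = "dews",
                 password = "...", dbname = "warehouse")

cohort <- dbGetQuery(con, "
  SELECT student_id, school_year, grade, attendance_rate,
         discipline_events, assessment_score, school_id
  FROM   LDS_STUDENT_YEAR
  WHERE  cohort_year = 2005 AND grade >= 7")

dbDisconnect(con)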


Data Recoding and Cleaning

Data recoding is the only place where decisions are forced on the statistical model

Administrative records need to be reshaped to fit the statistical procedures

Business rules need to be in place to enforce standardization of fields

Example: FRL is coded as "F", "R", "N", "A", "P"

Need to reduce this to "F" and "N", or to "F", "R", and "N"

Use business rules from the Strategic Data Project

Enforce some rules to make the statistical model easier to fit (grouping categories to increase cell size)
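As a sketch of one such business rule, the FRL collapse described above might look like the following in R; the mapping chosen for the "A" and "P" codes here is an assumption for illustration, and DPI's actual rules follow the Strategic Data Project.

# Sketch: collapse the five raw FRL codes to "F" / "N".
# The treatment of "A" and "P" is an illustrative assumption.
recode_frl <- function(x) {
  x <- as.character(x)
  out <- ifelse(x %in% c("F", "R", "A", "P"), "F",
         ifelse(x == "N", "N", NA))
  factor(out, levels = c("F", "N"))
}

cohort$frl <- recode_frl(cohort$frl_raw)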

Inputs and Outputs

(Screenshot: all data is fictitious and for demonstration purposes only.)


Model Selection

Fit a basic statistical model regressing an indicator of whether or not students graduated on a subset of the data about those students in 7th grade

More variables are added to the model, and the prediction rate of each successive model is evaluated on a test set of data

Finally, when all variables have been exhausted, or the best possible prediction rate has been achieved, the process is stopped

This is repeated for other classes of models / functional forms until the best model from the best of each class is identified
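A hedged sketch of this forward variable-addition loop in R; the variable names, the train/test split objects, and the use of test-set AUC as the "prediction rate" are illustrative assumptions, with grad taken as a 0/1 on-time-graduation indicator.

# Sketch: add variables one at a time, keep each only if it improves
# test-set AUC. Names (grad, train_df, test_df) are illustrative.
library(pROC)

candidate_vars <- c("assessment_score", "attendance_rate",
                    "discipline_events", "mobility")
kept <- character(0)
best_auc <- 0

for (v in candidate_vars) {
  form <- reformulate(c(kept, v), response = "grad")
  fit  <- glm(form, data = train_df, family = binomial(link = "probit"))
  auc_v <- as.numeric(auc(test_df$grad,
                          predict(fit, newdata = test_df, type = "response")))
  if (auc_v > best_auc) {   # keep the variable only if test AUC improves
    kept <- c(kept, v)
    best_auc <- auc_v
  }
}

# Refit the winning specification for later use
dews_fit <- glm(reformulate(kept, "grad"), data = train_df,
                family = binomial(link = "probit"))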


Depending on the data available, the factors included in
the model will change, as will their weight in predicting
the outcome


The system is flexible to this, so it can expand as new
data comes online, and as more longitudinal data is
available on cohorts


For now, in Wisconsin, for two cohorts, these factors
seem to matter


Assessments


Attendance


Mobility


Discipline


School of attendance

Model Selection

ROC Curve

Receiver Operating Characteristic (ROC): a measure of signal to noise in binary classification. http://en.wikipedia.org/wiki/Receiver_operating_characteristic
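For example, the ROC curve and its area-under-curve summary can be produced with the pROC package; dews_fit and test_df are the illustrative objects carried over from the selection sketch above.

# Sketch: ROC curve for the selected model on the held-out test set.
library(pROC)

scores  <- predict(dews_fit, newdata = test_df, type = "response")
roc_obj <- roc(test_df$grad, scores)

plot(roc_obj)  # trace the true-positive vs. false-positive tradeoff
auc(roc_obj)   # summarize the whole curve as a single number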

Binary Outcome Tradeoffs

Prediction

Prediction is handled by determining the risk score of an individual student and the uncertainty around that score

A threshold is set above which students are flagged

Districts will see both the score and the flag

The flag is based on a predetermined level of confidence in the prediction

e.g., 50% of flagged students are true dropouts, 50% are false positives
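A minimal illustration of the thresholding step; the 0.70 cut and the column names are made-up examples, not DPI's settings, and because the scores in these sketches are graduation probabilities the flag fires below the cut (the slides frame the same decision as a risk score crossing a threshold).

# Sketch: turn probabilities into a 0-100 score and an at-risk flag.
# The 0.70 cut is an arbitrary example, not DPI's actual threshold.
threshold <- 0.70

risk <- data.frame(
  student_id = test_df$student_id,
  dews_score = round(scores * 100),  # report on the 0-100 scale
  flag       = ifelse(scores < threshold, "At risk", "Not flagged"),
  stringsAsFactors = FALSE
)
head(risk)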

How?

Example of Predictions

(Screenshot: all data is fictitious and for demonstration purposes only.)


Reporting

Predictions are output to a data store, where they are loaded into our statewide reporting instance via ETL

Working on building the prediction module into the ETL process (easily done)

This allows the scores to be updated when new data is available

Crucial as the state transitions to a statewide Student Information System, allowing more frequent data updates

Theoretically, any reporting environment could be hooked up to the system
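As a sketch, the export step might write the scores to a staging table that the reporting ETL picks up; the table name DEWS_SCORES and the connection details are placeholders, not DPI's actual pipeline.

# Sketch: stage predictions for the reporting ETL. Names are placeholders.
library(DBI)

con <- dbConnect(ROracle::Oracle(), username = "dews",
                 password = "...", dbname = "warehouse")
dbWriteTable(con, "DEWS_SCORES", risk, overwrite = TRUE)
dbDisconnect(con)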


SAMPLE REPORTS


Requirements

1 cohort of students who have valid and reliable measurements of several attributes in the prediction year, and observed graduation, dropout, or transfer

Serious computing resources (depending on data size and complexity)

Preferences

Multiple measures and more than 1 cohort

No selection bias in the students in the data



Flexibility

Open source code that can be viewed, modified, copied, enhanced

System is built on few assumptions; it learns from the data it is fed

Can input data from a variety of formats and output data in a variety of formats (JSON, SQL, Oracle, CSV, etc.)

Modular: use only the pieces needed
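For instance, the same prediction frame from the sketches above can be written to several of the listed formats with standard R packages; the file names here are arbitrary.

# Sketch: one frame, several output formats.
library(jsonlite)

write.csv(risk, "dews_scores.csv", row.names = FALSE)
write_json(risk, "dews_scores.json")
# (a database export is shown in the ETL sketch above)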




Empirically Derived

The predictive model does not make assumptions about factors that increase or decrease risk

It searches among the data provided to identify the combination of factors that provides the best prediction

Factors that matter more are given more weight; those that matter less are discarded

Depending on the data available, this may change dramatically

With the data provided, the system will search for the best available model

Recap


Each student receives a score from 0 to 1 (or 0-100) representing the probability of graduating within 4 years of HS

DPI can transform this into a binary indicator (on-track, not-on-track) based on historical information about the prediction (above or below a threshold)

DPI can work on calibrating this binary indicator
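A rough sketch of the kind of calibration check this points to: within each 10-point score band, the observed on-time graduation rate should track the predicted rate. The objects carry over from the earlier sketches and are illustrative, with grad a 0/1 graduation indicator.

# Sketch: compare predicted vs. observed graduation rates by score band.
risk$grad01 <- test_df$grad
risk$band   <- cut(risk$dews_score, breaks = seq(0, 100, by = 10),
                   include.lowest = TRUE)
aggregate(cbind(predicted = dews_score / 100, observed = grad01) ~ band,
          data = risk, FUN = mean)
# Well-calibrated scores show observed rates close to predicted rates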