RStat: Release 1.2
Ali-Zain Rahim, Strategic Product Manager
March 18, 2010
Agenda:
Differentiators and Benefits
Review 1.2 Enhancements
Survival Analysis demo - Child welfare
Questions
RStat: Differentiators & Benefits
Based on R-Project
Open Source
Maintained by a worldwide consortium of universities, scientists, government-funded research organizations, and statisticians
Over 2000 packages
RStat is a GUI to R
Intuitive guided approach to modeling
Simple model evaluation
Intended both for business analysts and advanced modelers
Single BI and Predictive Modeling Environment
Re-use metadata and queries
Perform data manipulation and sampling
Build scoring applications
Unique Deployment Method for Scoring Solutions
Scoring models are built directly into WebFOCUS (WF) metadata
Deployment on any platform and operating system: Windows, Unix, Linux, z/OS, and i Series
RStat 1.2 Enhancements:
New Modeling Technique:
Survival Analysis:
Two techniques: Cox Regression and Parametric Time Regression
Cox Regression: risk scoring routine
Parametric regression: time scoring routine
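As a rough illustration of how a risk scoring routine differs from a time scoring routine, the sketch below computes a Cox-style relative risk score, exp(b·x), and an expected event time from an exponential parametric model, exp(b0 + b·x). The coefficients, the intercept, and the field names are hypothetical, and this is not the code RStat generates.

```python
import math

def cox_risk_score(coefs, row):
    """Relative hazard exp(sum(b_i * x_i)) versus a baseline case with all x_i = 0."""
    return math.exp(sum(b * row[name] for name, b in coefs.items()))

def exponential_expected_time(intercept, coefs, row):
    """Expected time to event under an exponential parametric (accelerated failure time) model."""
    return math.exp(intercept + sum(b * row[name] for name, b in coefs.items()))

# Hypothetical coefficients and a hypothetical record (illustration only).
cox_coefs = {"age_at_entry": -0.10, "prior_placements": 0.25}
aft_coefs = {"age_at_entry": 0.08, "prior_placements": -0.20}
case = {"age_at_entry": 5, "prior_placements": 2}

risk = cox_risk_score(cox_coefs, case)                        # unitless relative risk
expected_time = exponential_expected_time(2.0, aft_coefs, case)  # in the model's time units
```

A risk score above 1 means a higher hazard than the baseline case, while the time score is expressed directly in the time units of the training data.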
What Survival Analysis does and when to use it
Survival analysis encompasses a wide variety of methods for analyzing the timing of events with censored data. (Censoring: nearly every sample contains some cases that do not experience an event during the observation period.)
Use it to study the causes and timing of events such as:
Births and Deaths
Marriages and Divorces
Arrests and Convictions
Job Changes and Promotions
Bankruptcies and Mergers
Wars and Revolutions
Residence Changes
Consumer Purchases
Adoption of Innovations
Hospitalizations
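To make the idea of censoring concrete, here is a minimal sketch of the Kaplan-Meier survival estimator, a standard non-parametric survival method. It is a different technique from the Cox and parametric models in RStat 1.2, and the data below is invented. Cases flagged as not observed are censored: they count as "at risk" up to their last known time but never as events.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each time with at least one event.

    durations: time each case was followed
    observed:  True if the event happened at that time, False if censored
    """
    event_times = sorted({t for t, e in zip(durations, observed) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        # Everyone still being followed at time t, censored or not, is at risk.
        at_risk = sum(1 for d in durations if d >= t)
        events = sum(1 for d, e in zip(durations, observed) if d == t and e)
        survival *= 1.0 - events / at_risk
        curve.append((t, survival))
    return curve

# Invented example: five cases, two of them censored (no event seen).
durations = [2, 3, 3, 5, 8]
observed = [True, True, False, True, False]
curve = kaplan_meier(durations, observed)
```

Dropping the censored cases instead would bias the estimate, because it would discard the information that those cases lasted at least as long as they were followed.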
RStat 1.2 Enhancements - cont'd
New Scoring Routines:
Neural Network model with comprehensive output: enables users to compile NNET models into WebFOCUS functions for creation of applications.
Transformation capabilities for scoring routines: allows for data manipulation within the RStat tool. Methods include imputation, scaling, and remapping.
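The three transformation types named above can be sketched in a few lines. This is only an illustration of the general ideas (mean imputation, min-max scaling, category remapping); the function names and sample values are hypothetical, and RStat may offer different or additional variants of each.

```python
def impute_mean(values):
    """Replace missing values (None) with the mean of the known values."""
    known = [v for v in values if v is not None]
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in values]

def scale_min_max(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def remap(values, mapping, default="Other"):
    """Recode categories; anything not in the mapping falls into a default group."""
    return [mapping.get(v, default) for v in values]

# Hypothetical sample data.
ages = impute_mean([2, None, 6, 10])
scaled_ages = scale_min_max(ages)
placements = remap(["kin", "foster", "grp"],
                   {"kin": "Relative", "foster": "Foster home"})
```

Whatever transformation is applied when a model is built has to be applied identically when new records are scored, which is why it helps to keep these steps with the scoring routine.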
Enhanced statistical output:
Significance indicators in the ANOVA table for Regression models: enable users to determine which variables are significant to the model.
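Significance indicators are usually markers attached to each p-value. The sketch below follows the star convention that R prints in its model summaries (`***`, `**`, `*`, `.`); whether RStat uses exactly these cutoffs and symbols is an assumption, and the variable names and p-values are invented.

```python
def significance_marker(p_value):
    """Map a p-value to a marker; smaller p-values get more stars."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    if p_value < 0.1:
        return "."
    return ""

# Invented p-values for three hypothetical model variables.
p_values = {"age_at_entry": 0.0004, "prior_placements": 0.03, "county": 0.4}
markers = {name: significance_marker(p) for name, p in p_values.items()}
```

A variable with no marker is a candidate for removal when simplifying the model, though that decision should also rest on subject-matter knowledge.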
Performance and Usability optimization
Auto sampling for faster visualization of large data sets in the KMeans model: enables more efficient resource usage when displaying Cluster model statistics and data plots.
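The idea behind auto sampling can be sketched as follows: when a data set is larger than a plotting budget, draw a random subset and plot only that. The threshold, the seed, and the function name are assumptions for illustration; the slides do not describe how RStat chooses its sample.

```python
import random

def sample_for_plot(rows, max_points=1000, seed=0):
    """Return all rows if there are few enough, otherwise a random subset."""
    if len(rows) <= max_points:
        return list(rows)
    rng = random.Random(seed)  # fixed seed so the same plot is reproducible
    return rng.sample(rows, max_points)

# Hypothetical data: 10,000 rows reduced to 1,000 for display.
rows = list(range(10000))
plotted = sample_for_plot(rows)
```

In this sketch only the display is sampled; the cluster model itself would still be fitted on the full data.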
Model optimization: only the variables used to create the model are included in the exported C file. (In RStat 1.1, all variables selected by the user were included in the model.)
Enhanced Log functionality: allows users to create R scripts for use with other applications, such as a Dialogue Manager application.
Process Cancellation capability: allows users to cancel a long-running process from within RStat.
Special characters functionality: enables efficient handling of data with special characters.
Timestamp within the RConsole and Log Textview: enables users to match log entries with any errors received, allowing for easier troubleshooting.
Copyright 2007, Information Builders.
Demo: Child Welfare Use Case
Goal: identify the children who will stay in Child Welfare programs, and the age at which they will leave the programs (a time-to-event analysis)
Foster Care Analytical Framework: Background and Optimization Goals
Half a million children in foster care
Managed by county departments and the private agencies who train families
It is a team effort to find a child a permanent home
Severe consequences of bad foster care: youth who leave the system are more likely to be homeless, incarcerated, unemployed, and unskilled.
Foster Care Analytical Framework: Goals & Benefits
Provide a better understanding of the factors that contribute to better foster care to all parties involved in the process
Provide a standardized analytic and reporting system
Match children with better foster parents
Optimize child foster care duration
Survival Analysis - Child Welfare
Thank you!
"...if you are serious about statistics as a career, you need to become familiar with R because it is the most powerful and flexible language available, and may become the lingua franca of statistical programming in the near future."
Source: "Statistics in a Nutshell" by Sarah Boslaugh, published by O'Reilly