RStat: Release 1.2

Ali-Zain Rahim, Strategic Product Manager

March 18, 2010

Agenda:


Differentiators and Benefits


Review 1.2 Enhancements


Survival Analysis demo: Child welfare


Questions


RStat: Differentiators & Benefits



Based on the R Project


Open Source


Maintained by a worldwide consortium of universities, scientists, government-funded research organizations, and statisticians


Over 2000 packages


RStat is a GUI to R


Intuitive guided approach to modeling


Simple model evaluation


Intended both for business analysts and advanced modelers


Single BI and Predictive Modeling Environment


Reuse metadata and queries


Perform data manipulation and sampling


Build scoring applications


Unique Deployment Method for Scoring Solutions



Scoring models are built directly into WebFOCUS (WF) metadata


Deployment on any platform and operating system: Windows, Unix, Linux, z/OS, and iSeries


RStat 1.2 Enhancements:


New Modeling Technique:


Survival Analysis:


Two techniques: Cox Regression and Parametric Time Regression


Cox Regression: risk scoring routine


Parametric regression: time scoring routine
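As a rough illustration of the difference between a risk score and a time score, the sketch below computes a Cox-style relative risk, exp(beta . x), and a predicted median time from a Weibull accelerated-failure-time model. The coefficients, the covariate names, and the choice of a Weibull distribution are assumptions made for illustration only; the slides do not say which parametric distribution RStat fits or how its scoring routines are written.

```python
import math

# Hypothetical coefficients; not taken from any RStat model.
COX_BETA = {"age": 0.03, "prior_placements": 0.40}
AFT_INTERCEPT = 3.0
AFT_BETA = {"age": -0.02, "prior_placements": -0.25}
AFT_SCALE = 0.8  # assumed Weibull AFT scale parameter (sigma)


def cox_risk_score(row):
    """Relative risk exp(beta . x): a unitless risk score, not a time."""
    linear = sum(COX_BETA[name] * row[name] for name in COX_BETA)
    return math.exp(linear)


def weibull_median_time(row):
    """Median event time under a Weibull AFT model: a time score."""
    linear = AFT_INTERCEPT + sum(AFT_BETA[name] * row[name] for name in AFT_BETA)
    # log T = linear + sigma * W, where W follows the standard minimum
    # extreme value distribution, whose median is log(log 2).
    return math.exp(linear) * math.log(2) ** AFT_SCALE


case_a = {"age": 10, "prior_placements": 0}
case_b = {"age": 10, "prior_placements": 2}

risk_a, risk_b = cox_risk_score(case_a), cox_risk_score(case_b)
time_a, time_b = weibull_median_time(case_a), weibull_median_time(case_b)
```

With these made-up coefficients, the second case gets a higher relative risk and a shorter predicted median time than the first. The Cox score only ranks cases against each other, while the parametric score is expressed in time units.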


What survival analysis does and when to use it


Survival analysis encompasses a wide variety of methods for analyzing the timing of events with censored data. (Censoring: nearly every sample contains some cases that do not experience the event during the observation period.)
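To make the role of censoring concrete, here is a minimal sketch of the Kaplan-Meier estimator, a standard survival-analysis method that is not named in the slides and is used here only as an illustration. The durations and event flags are hypothetical. Censored cases stay in the risk set until the time they drop out, but never count as events.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(time, survival probability)] at each distinct event time."""
    events = Counter(t for t, e in zip(durations, observed) if e)
    exits = Counter(durations)  # events and censored cases both leave the risk set
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical data: months observed, and whether the event occurred (True)
# or the case was censored (False).
durations = [2, 3, 3, 5, 8, 8, 12]
observed = [True, True, False, True, False, True, False]
curve = kaplan_meier(durations, observed)
```

Dropping the censored cases instead would bias the estimate, because those cases are known to have lasted at least as long as they were observed.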


How to study the causes of:


Births and Deaths


Marriages and Divorces


Arrests and Convictions


Job Changes and Promotions


Bankruptcies and Mergers


Wars and Revolutions


Residence Changes


Consumer Purchases


Adoption of Innovations


Hospitalizations

RStat 1.2 Enhancements (cont'd)


New Scoring Routines:


Neural Network model with comprehensive output


Enables users to compile NNET models into WebFOCUS functions for use in building applications.


Transformation capabilities for scoring routines


Allows data manipulation within the RStat tool. Available methods include imputation, scaling, and remapping.
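The three transformations named above can be sketched in a few lines. The specific variants shown (mean imputation, min-max scaling, dictionary-based remapping) and the function names are assumptions for illustration; the slides do not say which variants RStat offers.

```python
from statistics import mean


def impute_mean(values):
    """Replace missing values (None) with the mean of the observed ones."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def scale_min_max(values):
    """Rescale values linearly onto [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def remap(values, mapping, default=None):
    """Recode category labels, sending unknown labels to a default."""
    return [mapping.get(v, default) for v in values]


# Hypothetical inputs.
ages = [4, None, 10, 16]
imputed = impute_mean(ages)
scaled = scale_min_max(imputed)
regions = remap(["N", "S", "N", "X"], {"N": "North", "S": "South"}, default="Other")
```

For scoring, the same fill value, range, and mapping learned from the training data would have to be applied to new records, which is presumably why the transformations belong in the scoring routine.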


Enhanced statistical output:


Significance indicators added to the ANOVA table of Regression models


Enables users to determine which variables are significant to the model.
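One plausible form for such indicators is the star notation that R prints in its model summaries. The sketch below maps p-values to those codes; the thresholds follow R's convention, but whether RStat uses exactly this notation is an assumption, and the variable names and p-values are hypothetical.

```python
def significance_code(p_value):
    """Map a p-value to an R-style significance code."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p-value must be between 0 and 1")
    for cutoff, code in ((0.001, "***"), (0.01, "**"), (0.05, "*"), (0.1, ".")):
        if p_value <= cutoff:
            return code
    return ""


# Hypothetical p-values for three model variables.
p_values = {"age": 0.0004, "prior_placements": 0.03, "gender": 0.2}
codes = {name: significance_code(p) for name, p in p_values.items()}
```

Here `age` and `prior_placements` would be flagged, while `gender` would carry no indicator.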


Performance and Usability optimization


Auto sampling for faster visualization of large data sets in the KMeans model


Enables more efficient resource usage when displaying Cluster model statistics and data plots.
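The general idea of sampling before plotting can be sketched as below. The function name, the point cap, and the use of uniform random sampling are assumptions; the slides do not describe how RStat chooses its sample.

```python
import random


def sample_for_plot(rows, max_points=1000, seed=0):
    """Return at most max_points rows, chosen uniformly without replacement."""
    if len(rows) <= max_points:
        return list(rows)
    rng = random.Random(seed)  # fixed seed so the plot is reproducible
    return rng.sample(rows, max_points)


# Hypothetical data set, far larger than a scatter plot can usefully show.
points = [(x, x % 7) for x in range(50_000)]
plot_points = sample_for_plot(points, max_points=1000)
```

Only the plot uses the sample in this sketch; the cluster model itself would still be fitted on the full data.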



Performance and Usability optimization (cont'd)


Model optimization


Includes only the variables actually used to build the model in the exported C file. (In RStat 1.1, all variables selected by the user were included in the model.)


Enhanced Log functionality


Allows users to create R scripts for use with other applications, such as a Dialogue Manager application.


Process Cancellation capability


Allows users to cancel a long-running process from within RStat.


Special characters functionality


Enables efficient handling of data with special characters.


Timestamp within the RConsole and Log Textview


Enables users to match log entries with any errors received, making troubleshooting easier.


Copyright 2007, Information Builders.

Demo: Child Welfare Use Case



Identify which children will stay in Child Welfare programs, and at what age children will leave the programs


A time-to-event analysis
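A time-to-event analysis like this one needs each case reduced to a duration and an event flag. The sketch below shows one way to derive them from dates; the field layout, the study cutoff date, and the two records are all hypothetical and do not come from the demo data.

```python
from datetime import date

STUDY_END = date(2009, 12, 31)  # hypothetical observation cutoff


def to_survival_row(birth, entry, exit_date):
    """Return (age_at_entry_years, duration_days, event_observed)."""
    age_at_entry = (entry - birth).days / 365.25
    if exit_date is not None and exit_date <= STUDY_END:
        return age_at_entry, (exit_date - entry).days, True
    # Still in the program at the cutoff: the exit time is censored.
    return age_at_entry, (STUDY_END - entry).days, False


# Hypothetical records: (birth date, program entry date, exit date or None).
records = [
    (date(2000, 5, 1), date(2006, 5, 1), date(2008, 5, 1)),
    (date(1998, 1, 15), date(2009, 1, 15), None),
]
rows = [to_survival_row(*record) for record in records]
```

The second child has not left the program by the cutoff, so that row records only that the stay lasted at least as long as the observed duration.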

Foster Care Analytical Framework: Background and Optimization Goals


Half a million children in foster care


Managed by county departments and the private agencies who train families


It is a team effort to find a child a permanent home


Severe consequences of poor foster care:


Youth who leave the system are more likely to be homeless, incarcerated, unemployed, and unskilled.



Foster Care Analytical Framework: Goals & Benefits


Give all parties involved in the process a better understanding of the factors that contribute to better foster care


Provide a standardized analytic and reporting system


Match children with better foster parents


Optimize child foster care duration





Survival Analysis: Child Welfare (demo slides)


Thank you!



"...if you are serious about statistics as a career, you need to become familiar with R because it is the most powerful and flexible language available, and may become the lingua franca of statistical programming in the near future."



Source: "Statistics in a Nutshell" by Sarah Boslaugh, published by O'Reilly