Statistica


David Young, STAT 582, 5-1-10

STATISTICA


Introduction:



STATISTICA is a software package that deals with statistics and analytics, produced by StatSoft.
Some of the applications that STATISTICA provides include data analysis, data management, data
mining, and data visualization. The base version of STATISTICA contains standard statistical procedures
including descriptive and summary statistics, exploratory data analysis, correlations, probability
calculators, group difference tests including t-tests, ANOVA, and non-parametric tests, frequency tables
and cross-tabs, multiple response analysis, regression methods including multiple regression and logistic
regression, non-parametric studies, and distribution fitting.

Beyond the base version, STATISTICA houses many additional modules that contain more advanced
features including cluster analysis, simulation, power calculations, neural networks, quality control,
design of experiments, and data mining. Beyond producing numerical results and tables, STATISTICA also
provides its users with the ability to produce two- and three-dimensional graphic visualizations for
data exploration, analysis, and presentation.


One advantage of STATISTICA is that it combines a user-friendly platform with powerful
statistical computation. It is also one of the more cost-effective and easily customized statistical
software solutions available.

Starting the program simply involves loading a table of data and applying the easy-to-navigate
functions, which are driven by pull-down menus. Within these menus users may tell STATISTICA which
variables to use and which types of analysis to conduct, then explore their results in graphical or
tabular outputs.

Throughout this paper we shall explore several of the functions that STATISTICA is readily able to
handle. The list below contains the categories of statistical data analysis that we shall investigate.



Descriptive and Basic Statistics

Tests of Differences (ANOVA and Kruskal-Wallis test)

Regression

Power Analysis

Quality Control



Descriptive and Basic Statistics:


Some basic statistics any researcher will need to explore include variable distributions and
correlations. These, along with many other descriptive measures, are provided in the framework of
STATISTICA. Once users input their data table, they can simply choose whichever statistical summary
reports and graphs they would like, and STATISTICA will output the requested results. Figure 1a below
presents the density and distribution function of a variable. Figure 1b shows a histogram of the data
to give the user a feel for how the data are distributed. Figure 1c shows an example of some
scatterplots to gauge the pairwise correlations between a set of variables.


Figure 1a



Figure 1b



Figure 1c
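For readers who want to see what these summaries compute, the following is a minimal Python sketch, not STATISTICA code. The function names (describe, pearson_r, histogram_counts) and the sample data are hypothetical and chosen only for illustration; the sketch assumes a plain list of numbers stands in for a column of the data table.

```python
import math
import statistics


def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "std_dev": statistics.stdev(values),  # sample standard deviation (n - 1)
        "min": min(values),
        "max": max(values),
    }


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def histogram_counts(values, n_bins):
    """Count how many values fall into each of n_bins equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:  # all values identical: everything lands in the first bin
        return [len(values)] + [0] * (n_bins - 1)
    counts = [0] * n_bins
    for v in values:
        # The maximum value is placed in the last bin rather than overflowing.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts


# Made-up illustrative data, not taken from the paper.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(describe(x))
print(pearson_r(x, y))
print(histogram_counts(x, 2))
```

The histogram function returns only bin counts; drawing the bars, as in Figure 1b, is left to whatever plotting tool is at hand.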


Tests of Differences:


One of the main methods for testing differences between groups is Analysis of Variance (ANOVA).
To perform a one-way ANOVA in STATISTICA, a user can simply follow the steps provided:

Step 1: Choose the Statistics Option from the Menu Bar

Step 2: Choose Statistics/Tables

Step 3: Choose Breakdown and One-Way ANOVA and click “OK”

Step 4: Go to Individual Tables tab and click “Variables”

Step 5: Select dependent and grouping variables and click “OK”

Step 6: Click on Analysis of Variance


Following these simple steps will provide the user with an ANOVA table. Choosing other options within
this menu can also provide plots. Figure 2 below shows an example of an interaction plot for the set of
variables.


Figure 2
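The quantities in a one-way ANOVA table can be sketched in a few lines of Python. This is an illustration of the standard sums-of-squares calculation, not a reproduction of STATISTICA's output; the function name one_way_anova and the example groups are hypothetical. The p-value is omitted because it requires the F distribution, which is not in the Python standard library.

```python
def one_way_anova(groups):
    """Compute the entries of a one-way ANOVA table.

    `groups` is a list of lists, one list of observations per group.
    """
    all_values = [v for g in groups for v in g]
    n_total = len(all_values)
    k = len(groups)
    grand_mean = sum(all_values) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Within-group sum of squares: spread of observations around their group mean.
    ss_within = sum(
        (v - m) ** 2 for g, m in zip(groups, group_means) for v in g
    )

    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


# Made-up illustrative data: three groups of three observations each.
table = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(table)
```

A large F value relative to the F distribution with (df_between, df_within) degrees of freedom suggests that at least one group mean differs from the others.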



The Kruskal-Wallis test is an example of a non-parametric test of differences. To perform the
Kruskal-Wallis test in STATISTICA for a one-way test of differences, the user may follow the steps below:


Step 1: Choose the Statistics Option from the Menu Bar

Step 2: Choose Nonparametrics

Step 3: Choose Comparing multiple indep. Samples (Groups)

Step 4: Click “Variables”

Step 5: Select dependent and grouping variables and click “OK”

Step 6: Click on Summary
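As a rough illustration of the statistic behind this test, the sketch below computes the Kruskal-Wallis H statistic in Python with the usual correction for ties. It is not STATISTICA code; the function names average_ranks and kruskal_wallis_h and the example data are hypothetical. The p-value, which is normally taken from a chi-square distribution with (number of groups - 1) degrees of freedom, is not computed here.

```python
from collections import Counter


def average_ranks(values):
    """Rank values from 1..N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with tie correction.

    Assumes the pooled data are not all identical (otherwise the
    tie correction factor is zero).
    """
    pooled = [v for g in groups for v in g]
    n_total = len(pooled)
    ranks = average_ranks(pooled)

    weighted_rank_sums = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        weighted_rank_sums += rank_sum ** 2 / len(g)
        start += len(g)

    h = 12 / (n_total * (n_total + 1)) * weighted_rank_sums - 3 * (n_total + 1)

    # Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N) over tie groups.
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - tie_term / (n_total ** 3 - n_total)
    return h / correction


# Made-up illustrative data: three independent samples.
print(kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```

Because the statistic depends only on ranks, it is unaffected by monotone transformations of the data, which is what makes the test non-parametric.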


Regression:



To perform ordinary least squares regression on a set of data, a user can follow the steps below:


Step 1: Choose the Statistics Option from the Menu Bar

Step 2: Choose Multiple Regression

Step 3: Choose dependent and independent variables and click “OK”

Step 4: Go to Quick tab and click on “Summary: Regression results”


This process will provide output that includes parameter estimates with standard errors and p-values,
along with an ANOVA table. Figure 3 below shows a scatter plot with a trend line and confidence
intervals, along with other important regression information.


Figure 3
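The parameter estimates and standard errors in such output can be illustrated with a short Python sketch. For brevity it handles only a single predictor (simple linear regression) rather than the multiple regression the menu supports; the function name simple_ols and the data are hypothetical. P-values are omitted because they require the t distribution, which is not in the Python standard library.

```python
import math


def simple_ols(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns the estimates, their standard errors, and R-squared.
    Requires at least three observations.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual and total sums of squares (the ANOVA table decomposition).
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    sst = sum((b - mean_y) ** 2 for b in y)
    mse = sse / (n - 2)  # residual variance estimate

    return {
        "slope": slope,
        "intercept": intercept,
        "se_slope": math.sqrt(mse / sxx),
        "se_intercept": math.sqrt(mse * (1 / n + mean_x ** 2 / sxx)),
        "r_squared": 1 - sse / sst,
    }


# Made-up illustrative data, not taken from the paper.
fit = simple_ols([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(fit)
```

Dividing each estimate by its standard error gives the t statistic that a package would compare against a t distribution with n - 2 degrees of freedom.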



Power Analysis:




Power analysis is important to consider when designing an experiment. Whether the researcher
is concerned with designing an experiment with a large enough sample size, or needs to conduct a
post-hoc analysis to help determine why significant results are not being achieved, it is often
worthwhile to conduct a power analysis. STATISTICA allows a user to compute sample sizes and power
estimates with a quick input of the appropriate table. Figure 4 below shows a power curve for various
sample sizes and power levels, based on certain levels of expected differences, type I error rates, etc.


Figure 4
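To make the relationship between sample size, effect size, and power concrete, here is a minimal Python sketch. It assumes a two-sided, two-sample z-test with a known common standard deviation, which is a simplification chosen for illustration and not necessarily the test behind Figure 4; the function names and parameters are hypothetical.

```python
from statistics import NormalDist


def two_sample_z_power(delta, sigma, n_per_group, alpha=0.05):
    """Power of a two-sided two-sample z-test.

    delta: true difference in means; sigma: known common standard deviation;
    n_per_group: observations in each of the two groups.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Standardized shift of the test statistic under the alternative.
    shift = delta / (sigma * (2 / n_per_group) ** 0.5)
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


def sample_size_for_power(delta, sigma, target_power=0.8, alpha=0.05,
                          max_n=100_000):
    """Smallest per-group sample size reaching target_power, or None."""
    for n in range(2, max_n + 1):
        if two_sample_z_power(delta, sigma, n, alpha) >= target_power:
            return n
    return None


# Points along a power curve for a half-standard-deviation difference.
for n in (10, 20, 40, 80):
    print(n, round(two_sample_z_power(0.5, 1.0, n), 3))
print(sample_size_for_power(0.5, 1.0))
```

Plotting the printed pairs with sample size on the horizontal axis gives a power curve of the same general shape as the one a package would draw.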



Quality Control:


The final type of analysis we will investigate in this primer is that of quality control. In quality
control we are often interested in how the average value and range of a specific measurement change
over time and across samples. Figure 5 below shows both an x-bar and R chart for a specific variable.
These charts also show histograms to determine the distribution of the sample averages and ranges.




Figure 5
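The center lines and control limits of x-bar and R charts can be sketched as follows. This Python example uses the commonly tabulated control chart constants A2, D3, and D4 for subgroup sizes 2 through 5 (rounded values); the function name xbar_r_limits and the measurements are hypothetical, and this is not a description of how STATISTICA computes its limits.

```python
# Rounded standard control chart constants by subgroup size: (A2, D3, D4).
CHART_CONSTANTS = {
    2: (1.880, 0.0, 3.267),
    3: (1.023, 0.0, 2.574),
    4: (0.729, 0.0, 2.282),
    5: (0.577, 0.0, 2.114),
}


def xbar_r_limits(subgroups):
    """Return (lower limit, center line, upper limit) for x-bar and R charts.

    `subgroups` is a list of equal-sized lists of measurements.
    """
    n = len(subgroups[0])
    if any(len(s) != n for s in subgroups):
        raise ValueError("all subgroups must have the same size")
    if n not in CHART_CONSTANTS:
        raise ValueError("no constants tabulated here for this subgroup size")
    a2, d3, d4 = CHART_CONSTANTS[n]

    means = [sum(s) / n for s in subgroups]
    ranges = [max(s) - min(s) for s in subgroups]
    grand_mean = sum(means) / len(means)    # center line of the x-bar chart
    mean_range = sum(ranges) / len(ranges)  # center line of the R chart

    return {
        "xbar": (grand_mean - a2 * mean_range,
                 grand_mean,
                 grand_mean + a2 * mean_range),
        "r": (d3 * mean_range, mean_range, d4 * mean_range),
    }


# Made-up illustrative measurements: three subgroups of size three.
limits = xbar_r_limits([[9, 10, 11], [10, 10, 10], [8, 10, 12]])
print(limits)
```

A subgroup mean or range falling outside its limits is the usual signal that the process may have shifted.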



Another important aspect of quality control involves the exploration of Operating Characteristic (OC)
curves. These are used when a researcher or quality control engineer wants to investigate the
likelihood of finding a sample outside a set of calculated control limits. Figure 6 below shows an
example of an OC curve for an x-bar chart for a set of data.


Figure 6
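One common way to compute an OC curve for an x-bar chart is sketched below in Python. It assumes normally distributed measurements, a known process standard deviation, and limits set at 3 standard errors, and it returns the probability that a subgroup mean still falls inside the limits after the process mean shifts; these assumptions and the function name oc_curve_beta are illustrative and may differ from the settings used for Figure 6.

```python
from statistics import NormalDist


def oc_curve_beta(shift_in_sigmas, subgroup_size, limit_width=3.0):
    """Probability that a subgroup mean stays inside the control limits.

    shift_in_sigmas: shift of the process mean, in units of the process
    standard deviation; limit_width: half-width of the limits in standard
    errors of the subgroup mean (3 for conventional 3-sigma limits).
    """
    std_normal = NormalDist()
    scaled_shift = shift_in_sigmas * subgroup_size ** 0.5
    return (std_normal.cdf(limit_width - scaled_shift)
            - std_normal.cdf(-limit_width - scaled_shift))


# Points along an OC curve for subgroups of size 4.
for k in (0.0, 0.5, 1.0, 1.5, 2.0):
    print(k, round(oc_curve_beta(k, 4), 4))
```

One minus the returned value is the probability that a single subgroup signals the shift, so the curve falls as the shift or the subgroup size grows.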



Conclusion:



Beyond the types of analyses discussed in this paper, STATISTICA provides a wide range of
statistical analyses, similar in breadth to that of SPSS. The ease and speed with which STATISTICA
provides statistical results make this software a practical and helpful tool. The one downside of
STATISTICA is that, compared to its contemporaries SPSS and Minitab, documentation is not as readily
available.


Sources Used:


http://www.statsoft.com/textbook/


http://en.wikipedia.org/wiki/Statistica


http://www.ats.ucla.edu/stat/Statistica/notes/default.htm