Online analytical processing

Nov 25, 2013 (3 years and 11 months ago)

From Wikipedia, the free encyclopedia


Online analytical processing, or OLAP (IPA: /ˈoʊlæp/), is an approach to quickly answer multi-dimensional analytical queries.[1] OLAP is part of the broader category of business intelligence, which also encompasses relational reporting and data mining.[2] The typical applications of OLAP are in business reporting for sales, marketing, management reporting, business process management (BPM), budgeting and forecasting, financial reporting and similar areas. The term OLAP was created as a slight modification of the traditional database term OLTP (Online Transaction Processing).[3]
Databases configured for OLAP use a multidimensional data model, allowing for complex analytical and ad-hoc queries with a rapid execution time. They borrow aspects of navigational databases and hierarchical databases that are faster than relational databases.[4]

Nigel Pendse has suggested that an alternative and perhaps more descriptive term for the concept of OLAP is Fast Analysis of Shared Multidimensional Information (FASMI).[5]

The output of an OLAP query is typically displayed in a matrix (or pivot) format. The dimensions form the rows and columns of the matrix; the measures form the values.
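The matrix layout described above can be illustrated with a minimal Python sketch. The store names, quarters and amounts are made-up sample data, and `pivot` is a hypothetical helper rather than part of any OLAP product; it simply sums a measure for every pair of dimension values.

```python
from collections import defaultdict

# Hypothetical fact records: (store, quarter, sale_amount).
facts = [
    ("North", "Q1", 100.0),
    ("North", "Q2", 150.0),
    ("South", "Q1", 80.0),
    ("South", "Q1", 20.0),
    ("South", "Q2", 120.0),
]


def pivot(records):
    """Sum the measure for every (row dimension, column dimension) pair."""
    cells = defaultdict(float)
    for row_key, col_key, amount in records:
        cells[(row_key, col_key)] += amount
    rows = sorted({r for r, _ in cells})
    cols = sorted({c for _, c in cells})
    # Missing combinations are shown as 0.0 so the matrix stays rectangular.
    matrix = [[cells.get((r, c), 0.0) for c in cols] for r in rows]
    return rows, cols, matrix


rows, cols, matrix = pivot(facts)
print("      " + "  ".join(f"{c:>7}" for c in cols))
for r, values in zip(rows, matrix):
    print(f"{r:<6}" + "  ".join(f"{v:7.1f}" for v in values))
```

Here the store dimension forms the rows, the quarter dimension forms the columns, and the summed sale amount is the measure in each cell.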

Contents

1 Concept
    1.1 Multidimensional databases
2 Aggregations
3 Types
    3.1 Multidimensional
    3.2 Relational
    3.3 Hybrid
    3.4 Comparison
    3.5 Other types
4 APIs and query languages
5 Products
    5.1 History
    5.2 Market structure
6 See also
7 Bibliography
8 References

Concept

At the core of any OLAP system is the concept of an OLAP cube (also called a multidimensional cube or a hypercube). It consists of numeric facts called measures, which are categorized by dimensions. The cube metadata is typically created from a star schema or snowflake schema of tables in a relational database. Measures are derived from the records in the fact table and dimensions are derived from the dimension tables.

Each measure can be thought of as having a set of labels, or metadata, associated with it. A dimension is what describes these labels; it provides information about the measure.

A simple example would be a cube that contains a store's sales as a measure, and Date/Time as a dimension. Each sale has a Date/Time label that describes more about that sale.

Any number of dimensions can be added to the structure, such as Store, Cashier, or Customer, by adding a column to the fact table. This allows an analyst to view the measures along any combination of the dimensions.

For example:

   Sales Fact Table
+-------------+---------+
| sale_amount | time_id |
+-------------+---------+               Time Dimension
|     2008.08 |    1234 |---+   +---------+-------------------+
+-------------+---------+   |   | time_id | timestamp         |
                            |   +---------+-------------------+
                            +-->|    1234 | 20080902 12:35:43 |
                                +---------+-------------------+
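The fact table and dimension table in the diagram can be sketched in Python as follows. Only the row with `time_id` 1234 comes from the example above; the other rows, the `year` and `month` attributes, and the `sales_by` helper are assumptions added for illustration.

```python
from collections import defaultdict

# Dimension table: time_id -> attributes describing that point in time.
# Only time_id 1234 appears in the example; the rest is hypothetical.
time_dimension = {
    1234: {"timestamp": "20080902 12:35:43", "year": 2008, "month": 9},
    1235: {"timestamp": "20080915 09:10:00", "year": 2008, "month": 9},
    1236: {"timestamp": "20081001 17:45:12", "year": 2008, "month": 10},
}

# Fact table: one row per sale, holding the measure and a key into the dimension.
sales_facts = [
    {"sale_amount": 2008.08, "time_id": 1234},
    {"sale_amount": 100.00, "time_id": 1235},
    {"sale_amount": 50.50, "time_id": 1236},
]


def sales_by(level):
    """Total the sale_amount measure per value of one time attribute."""
    totals = defaultdict(float)
    for fact in sales_facts:
        label = time_dimension[fact["time_id"]][level]
        totals[label] += fact["sale_amount"]
    return dict(totals)


print(sales_by("month"))
print(sales_by("year"))
```

Adding another dimension such as Store would mean adding a `store_id` key to each fact row and a second lookup table, which is the "adding a column to the fact table" step described in the text.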

Multidimensional databases

Multidimensional structure is defined as “a variation of the relational model that uses multidimensional structures to organize data and express the relationships between data” (O'Brien & Marakas, 2009, p. 177). The structure is broken into cubes, and the cubes are able to store and access data within the confines of each cube. “Each cell within a multidimensional structure contains aggregated data related to elements along each of its dimensions” (p. 178). Even when data is manipulated, it remains easy to access and the database stays compact. The data still remains interrelated. Multidimensional structure is quite popular for analytical databases that use online analytical processing (OLAP) applications (O’Brien & Marakas, 2009). Analytical databases use this structure because of its ability to deliver answers quickly to complex business queries. Data can be viewed from different angles, which gives a broader picture of a problem than other models do (Williams, Garza, Tucker & Marcus, 1994).

Aggregations

It has been claimed that for complex queries OLAP cubes can produce an answer in around 0.1% of the time required for the same query on OLTP relational data.[6][7]

The most important mechanism in OLAP which allows it to achieve such performance is the use of aggregations. Aggregations are built from the fact table by changing the granularity on specific dimensions and aggregating up data along these dimensions. The number of possible aggregations is determined by every possible combination of dimension granularities.

The combination of all possible aggregations and the base data contains the answers to every query which can be answered from the data.[8]
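To make "every possible combination of dimension granularities" concrete, the following Python sketch enumerates the combinations for three hypothetical dimensions. The dimension names and their levels are assumptions chosen for illustration; "ALL" stands for a dimension that has been aggregated away entirely.

```python
from itertools import product
from math import prod

# Hypothetical dimensions, each listed from finest to coarsest granularity.
# "ALL" means the dimension is aggregated away entirely.
granularities = {
    "time": ["day", "month", "year", "ALL"],
    "store": ["store", "region", "ALL"],
    "product": ["item", "category", "ALL"],
}


def all_aggregations(levels_by_dimension):
    """Yield one dict per combination of granularity levels."""
    names = list(levels_by_dimension)
    for combo in product(*(levels_by_dimension[n] for n in names)):
        yield dict(zip(names, combo))


aggregations = list(all_aggregations(granularities))
# The count is the product of the number of levels per dimension.
# It includes the finest combination (day, store, item), i.e. the base data.
expected = prod(len(levels) for levels in granularities.values())
print(len(aggregations), expected)  # 36 36
```

Even this small example yields 36 combinations, which suggests why the number grows quickly as dimensions and hierarchy levels are added.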

Because there are usually many aggregations that can be calculated, often only a predetermined number are fully calculated; the remainder are solved on demand. The problem of deciding which aggregations (views) to calculate is known as the view selection problem. View selection can be constrained by the total size of the selected set of aggregations, the time to update them from changes in the base data, or both. The objective of view selection is typically to minimize the average time to answer OLAP queries, although some studies also minimize the update time. View selection is NP-complete. Many approaches to the problem have been explored, including greedy algorithms, randomized search, genetic algorithms and the A* search algorithm.
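As an illustration of the greedy approach mentioned above, here is a minimal Python sketch. It is one simple heuristic under stated assumptions, not the method of any particular product or paper: each view is a set of dimensions, the row counts in `SIZES` are invented, a query costs as many rows as the smallest materialized view that contains its dimensions, and every view is assumed to be queried equally often.

```python
# Hypothetical row counts for every possible group-by (view) over 3 dimensions.
SIZES = {
    frozenset({"time", "store", "product"}): 6_000_000,  # base data
    frozenset({"time", "store"}): 800_000,
    frozenset({"time", "product"}): 6_000_000,
    frozenset({"store", "product"}): 6_000_000,
    frozenset({"time"}): 200_000,
    frozenset({"store"}): 10_000,
    frozenset({"product"}): 100_000,
    frozenset(): 1,  # grand total
}
BASE = frozenset({"time", "store", "product"})


def query_cost(query, materialized):
    """Rows scanned: size of the smallest materialized view covering the query."""
    return min(SIZES[v] for v in materialized if query <= v)


def total_cost(materialized):
    """Cost of answering each possible group-by once (uniform workload assumed)."""
    return sum(query_cost(q, materialized) for q in SIZES)


def greedy_select(k):
    """Pick up to k extra views, each time taking the one that cuts cost most."""
    chosen = {BASE}  # the base data is always available
    for _ in range(k):
        candidates = [v for v in SIZES if v not in chosen]
        if not candidates:
            break
        best = min(candidates, key=lambda v: total_cost(chosen | {v}))
        if total_cost(chosen | {best}) >= total_cost(chosen):
            break  # no remaining view helps
        chosen.add(best)
    return chosen


selected = greedy_select(2)
print(sorted(sorted(v) for v in selected), total_cost(selected))
```

A greedy choice like this does not guarantee an optimal set, which is consistent with the problem being NP-complete. A size or update-time budget, as described in the text, could replace the fixed count `k`.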

A very effective way to support aggregation and other common OLAP operations is the use of bitmap indexes.
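A bitmap index keeps, for each distinct value of a column, one bit per row. The Python sketch below is a simplified illustration that uses plain integers as bit vectors; the sample rows and the `build_bitmap_index` helper are assumptions, and real systems typically compress the bitmaps.

```python
# Hypothetical fact rows: (store, payment_type, sale_amount).
rows = [
    ("North", "card", 10.0),
    ("South", "cash", 20.0),
    ("North", "cash", 5.0),
    ("North", "card", 7.5),
]


def build_bitmap_index(rows, column):
    """Map each distinct value in a column to an integer used as a bit vector.

    Bit i is set when row i holds that value.
    """
    index = {}
    for i, row in enumerate(rows):
        index[row[column]] = index.get(row[column], 0) | (1 << i)
    return index


store_index = build_bitmap_index(rows, 0)
payment_index = build_bitmap_index(rows, 1)

# Rows matching store = North AND payment_type = card: a single bitwise AND.
matches = store_index["North"] & payment_index["card"]

count = bin(matches).count("1")
total = sum(rows[i][2] for i in range(len(rows)) if (matches >> i) & 1)
print(count, total)  # 2 17.5
```

Combining filters on several dimensions reduces to bitwise operations, and counting matching rows reduces to counting set bits, which is why the technique suits aggregation queries.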

Types

OLAP systems have been traditionally categorized using the following taxonomy.[9]

Multidimensional

Main article: MOLAP

MOLAP is the 'classic' form of OLAP and is sometimes referred to as just OLAP. MOLAP stores data in optimized multidimensional array storage, rather than in a relational database. It therefore requires the pre-computation and storage of information in the cube, an operation known as processing.

Relational

Main article: ROLAP

ROLAP works directly with relational databases. The base data and the dimension tables are stored as relational tables, and new tables are created to hold the aggregated information. It depends on a specialized schema design.

Hybrid

Main article: HOLAP

There is no clear agreement across the industry as to what constitutes "Hybrid OLAP", except that a database will divide data between relational and specialized storage. For example, for some vendors, a HOLAP database will use relational tables to hold the larger quantities of detailed data, and use specialized storage for at least some aspects of the smaller quantities of more-aggregate or less-detailed data.

Comparison

Each type has certain benefits, although there is disagreement about the specifics of the benefits between providers.

Some MOLAP implementations are prone to database explosion. Database explosion is a phenomenon causing vast amounts of storage space to be used by MOLAP databases when certain common conditions are met: a high number of dimensions, pre-calculated results and sparse multidimensional data. The typical mitigation technique for database explosion is not to materialize all the possible aggregations, but only the optimal subset of aggregations based on the desired performance vs. storage trade-off.

MOLAP generally delivers better performance due to specialized indexing and storage optimizations. MOLAP also needs less storage space compared to ROLAP because the specialized storage typically includes compression techniques.[10]

ROLAP is generally more scalable.[10] However, large-volume pre-processing is difficult to implement efficiently, so it is frequently skipped. ROLAP query performance can therefore suffer tremendously.

Since ROLAP relies more on the database to perform calculations, it has more limitations in the specialized functions it can use.

HOLAP encompasses a range of solutions that attempt to mix the best of ROLAP and MOLAP. It can generally pre-process quickly, scale well, and offer good function support.
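The database explosion conditions listed above (many dimensions and sparse data) can be put into rough numbers. The following Python sketch uses invented dimension cardinalities and an invented fact count; it only compares the number of cells a fully dense cube would address with the number of cells that can actually hold data.

```python
from math import prod

# Hypothetical number of distinct members per dimension.
cardinalities = {"day": 365, "store": 200, "product": 5_000, "customer": 100_000}
# Hypothetical number of rows in the fact table.
fact_rows = 50_000_000


def potential_cells(cards):
    """Cells in a fully dense cube: the product of the dimension cardinalities."""
    return prod(cards.values())


def max_density(cards, rows):
    """Upper bound on the populated share, assuming each row fills its own cell."""
    return rows / potential_cells(cards)


cells = potential_cells(cardinalities)
density = max_density(cardinalities, fact_rows)
print(f"{cells:,} potential cells, at most {density:.7%} populated")
```

With these assumed figures, far less than one cell in a thousand can be populated, so storing or pre-calculating every cell would mostly store emptiness. This is the situation in which materializing only a subset of aggregations, as described above, becomes attractive.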

Other types

The following acronyms are also sometimes used, although they are not as widespread as the ones above:

WOLAP - Web-based OLAP
DOLAP - Desktop OLAP
RTOLAP - Real-Time OLAP

APIs and query languages

Unlike relational databases, which had SQL as the standard query language and widespread APIs such as ODBC, JDBC and OLEDB, there was no such unification in the OLAP world for a long time. The first real standard API was the OLE DB for OLAP specification from Microsoft, which appeared in 1997 and introduced the MDX query language. Several OLAP vendors, both server and client, adopted it. In 2001 Microsoft and Hyperion announced the XML for Analysis specification, which was endorsed by most of the OLAP vendors. Since this also used MDX as a query language, MDX became the de facto standard.[11]

Products

History

The first product that performed OLAP queries was Express, which was released in 1970 (and acquired by Oracle in 1995 from Information Resources).[12] However, the term did not appear until 1993, when it was coined by Ted Codd, who has been described as "the father of the relational database". Codd's paper[1] resulted from a short consulting assignment which Codd undertook for the former Arbor Software (later Hyperion Solutions, and in 2007 acquired by Oracle), as a sort of marketing coup. The company had released its own OLAP product, Essbase, a year earlier. As a result, Codd's "twelve laws of online analytical processing" were explicit in their reference to Essbase. There was some ensuing controversy, and when Computerworld learned that Codd was paid by Arbor, it retracted the article. The OLAP market experienced strong growth in the late 1990s, with dozens of commercial products coming to market. In 1998, Microsoft released its first OLAP server, Microsoft Analysis Services, which drove wide adoption of OLAP technology and moved it into the mainstream.

Market structure

Below is a list of top OLAP vendors in 2006, with figures in millions of United States Dollars.[13]

Vendor                            Global Revenue
Microsoft Corporation                      1,801
Hyperion Solutions Corporation             1,077
Cognos                                       735
Business Objects                             416
MicroStrategy                                416
SAP AG                                       330
Cartesis SA                                  210
Applix                                       205
Infor                                        199
Oracle Corporation                           159
Others                                       152
Total                                      5,700

Microsoft was the only vendor that continuously exceeded the industry average growth during 2000-2006. Since the above data was collected, Hyperion has been acquired by Oracle, Cartesis by Business Objects, Business Objects by SAP, Applix by Cognos, and Cognos by IBM.[14]

See also

Business intelligence
Data warehousing
Data mining
Predictive analytics
Business analytics
OLTP

Bibliography

Daniel Lemire (2007-12). "Data Warehousing and OLAP - A Research-Oriented Bibliography". http://www.daniel-lemire.com/OLAP/.

Erik Thomsen (1997). OLAP Solutions: Building Multidimensional Information Systems, 2nd Edition. John Wiley & Sons. ISBN 978-0471149316.

O’Brien, J. A., & Marakas, G. M. (2009). Management information systems (9th ed.). Boston, MA: McGraw-Hill/Irwin.

Williams, C., Garza, V. R., Tucker, S., & Marcus, A. M. (1994, January 24). Multidimensional models boost viewing options. InfoWorld, 16(4).

References

1. Codd E.F., Codd S.B., and Salley C.T. (1993). "Providing OLAP (On-line Analytical Processing) to User-Analysts: An IT Mandate". Codd & Date, Inc. http://www.fpm.com/refer/codd.html. Retrieved on 2008-03-05.
2. Deepak Pareek (2007). Business Intelligence for Telecommunications. CRC Press. 294 pp. ISBN 0849387922. http://books.google.com/books?id=M-UOE1Cp9OEC. Retrieved on 2008-03-18.
3. "OLAP Council White Paper" (PDF). OLAP Council. 1997. http://www.symcorp.com/downloads/OLAP_CouncilWhitePaper.pdf. Retrieved on 2008-03-18.
4. Hari Mailvaganam (2007). "Introduction to OLAP - Slice, Dice and Drill!". Data Warehousing Review. http://www.dwreview.com/OLAP/Introduction_OLAP.html. Retrieved on 2008-03-18.
5. Nigel Pendse (2008-03-03). "What is OLAP? An analysis of what the often misused OLAP term is supposed to mean". OLAP Report. http://www.olapreport.com/fasmi.htm. Retrieved on 2008-03-18.
6. MicroStrategy, Incorporated (1995). "The Case for Relational OLAP" (PDF). http://www.cs.bgu.ac.il/~dbm031/dw042/Papers/microstrategy_211.pdf. Retrieved on 2008-03-20.
7. Surajit Chaudhuri and Umeshwar Dayal (1997). "An overview of data warehousing and OLAP technology". SIGMOD Rec. (ACM) 26: 65. doi:10.1145/248603.248616. http://doi.acm.org/10.1145/248603.248616. Retrieved on 2008-03-20.
8. Gray, Jim; Chaudhuri, Surajit; Layman, Andrew; Reichart, Hamid; Venkatrao; Pellow; Pirahesh (1997). "Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals". J. Data Mining and Knowledge Discovery 1 (1): pp. 29-53. http://citeseer.ist.psu.edu/gray97data.html. Retrieved on 2008-03-20.
9. Nigel Pendse (2006-06-27). "OLAP architectures". OLAP Report. http://www.olapreport.com/Architectures.htm. Retrieved on 2008-03-17.
10. Bach Pedersen, Torben; S. Jensen (December 2001). "Multidimensional Database Technology" (PDF). Distributed Systems Online (IEEE): 40-46. ISSN 0018-9162. http://ieeexplore.ieee.org/iel5/2/20936/00970558.pdf.
11. Nigel Pendse (2007-08-23). "Commentary: OLAP API wars". OLAP Report. http://www.olapreport.com/Comment_APIs.htm. Retrieved on 2008-03-18.
12. Nigel Pendse (2007-08-23). "The origins of today’s OLAP products". OLAP Report. http://olapreport.com/origins.htm. Retrieved on November 27.
13. Nigel Pendse (2006). "OLAP Market". OLAP Report. http://www.olapreport.com/market.htm. Retrieved on 2008-03-17.
14. Nigel Pendse (2008-03-07). "Consolidations in the BI industry". http://www.olapreport.com/consolidations.htm. Retrieved on 2008-03-18.



OLAP

From Wikipedia, the free encyclopedia.

OLAP, or On-line Analytical Processing, is the ability to manipulate and analyze a large volume of data from multiple perspectives.

OLAP applications are used by managers at any level of the organization to enable comparative analyses that support their day-to-day decision making.

OLAP is classified into DOLAP, ROLAP, MOLAP and HOLAP.

External links

OLAP - On Line Analytical Processing

What is OLAP

OLAP (On Line Analytical Processing) is the technology that gives users (generally directors, presidents and managers) fast access to view and analyze data with high flexibility and performance. This high performance comes from the multidimensional model, which simplifies the search process. It is classified into DOLAP, ROLAP, MOLAP and HOLAP.


DOLAP - Desktop On Line Analytical Processing

These are tools that send a query from the workstation to the server, which in turn sends a micro-cube back to be analyzed on the client's workstation.

Advantage: Little network traffic, since processing takes place on the client's workstation. Faster data analysis.

Disadvantage: The micro-cube cannot be large; otherwise the analysis becomes slow, and the client machine may not cope, depending on its configuration.


ROLAP - Relational On Line Analytical Processing

These are tools that send SQL queries to the relational database server, where they are processed. Processing therefore takes place only on the server.

Advantage: Allows the analysis of large volumes of data, because processing is done on the server side rather than on the client's workstation.

Disadvantage: If many requests are made to the server simultaneously, it may become slow or even unavailable, depending on its configuration. This happens precisely because it has to process every request from every client.


MOLAP - Multidimensional On Line Analytical Processing

These are tools that send their requests directly to the multidimensional database server. The user manipulates the data directly on the server.

Advantage: Better performance, and large volumes of data can be queried because processing is done directly on the server.

Disadvantage: The tool is expensive, and there is also a scalability problem.


HOLAP - Hybrid On Line Analytical Processing

These are hybrid tools, that is, a combination of ROLAP and MOLAP.

Advantage: Mixing the two technologies obtains the best of each: ROLAP (scalability) + MOLAP (high performance).

Disadvantage: The tool is expensive.



OLAP (On Line Analytical Processing, which can be rendered in Portuguese as "Processo Analítico On Line") is the technology that gives users (generally directors, presidents and managers) fast access to view and analyze data with high flexibility and performance. This high performance comes from the multidimensional model, which simplifies the search process. It is classified into DOLAP, ROLAP, MOLAP and HOLAP.

Business Objects
Cognos
Hyperion
Microstrategy
MV Business Analytics Suite
Oracle BI Enterprise Edition
Pentaho




What is OLAP?

An analysis of what the often misused OLAP term is supposed to mean

You can contact Nigel Pendse, the author of this section, by e-mail on NigelP@olapreport.com if you have any comments or observations. Last updated on March 3, 2008.


The term, of course, stands for ‘On-Line Analytical Processing’. Unfortunately, this is neither a meaningful definition nor a description of what OLAP means. It certainly gives no indication of why you would want to use an OLAP tool, or even what an OLAP tool actually does. And it gives you no help in deciding if a product is an OLAP tool or not. It was simply chosen as a term to contrast with OLTP, on-line transaction processing, which is much more meaningful.

We hit this problem as soon as we started researching The OLAP Report in late 1994, as we needed to decide which products fell into the category. Deciding what is an OLAP product has not got any easier since then, as more and more vendors claim to have ‘OLAP compliant’ products, whatever that may mean (often they don’t even know). It is not possible to rely on the vendors’ own descriptions, and membership of the long-defunct OLAP Council was not a reliable indicator of whether or not a company produces OLAP products. For example, several significant OLAP vendors were never members or resigned, and several members were not OLAP vendors. Membership of the instantly moribund replacement Analytical Solutions Forum was even less of a guide, as it was intended to include non-OLAP vendors.

The Codd rules also turned out to be an unsuitable way of detecting ‘OLAP compliance’, so we were forced to create our own definition. It had to be simple, memorable and product-independent, and the resulting definition is the ‘FASMI’ test. The key thing that all OLAP products have in common is multidimensionality, but that is not the only requirement for an OLAP product.


This is copyright material. You can make brief references to it freely, with attribution, but not reproduce large sections or the entire article without permission from the publisher. You are free to link to this page without permission.







The FASMI test

We wanted to define the characteristics of an OLAP application in a specific way, without dictating how it should be implemented. As our research has shown, there are many ways of implementing OLAP compliant applications, and no single piece of technology should be officially required, or even recommended. Of course, we have studied the technologies used in commercial OLAP products and this report provides many such details. We have suggested in which circumstances one approach or another might be preferred, and have also identified areas where we feel that all the products currently fall short of what we regard as a technology ideal.

Our definition is designed to be short and easy to remember (12 rules or 18 features are far too many for most people to carry in their heads); we are pleased that we were able to summarize the OLAP definition in just five key words: Fast Analysis of Shared Multidimensional Information, or FASMI for short.

This definition was first used by us in early 1995, and we are very pleased that it has not needed revision in the years since. This definition has now been widely adopted and is cited in over 120 Web sites in about 30 countries.

FAST means that the system is targeted to deliver most responses to users in less than five seconds, with the simplest analyses taking no more than one second and very few taking more than 20 seconds. Even if users have been warned that it will take more than a few seconds, they are soon likely to get distracted and lose their chain of thought, so the quality of analysis suffers. This speed is not easy to achieve with large amounts of data, particularly if on-the-fly and ad hoc calculations are required. Vendors resort to a wide variety of techniques to achieve this goal, including specialized forms of data storage, extensive pre-calculations and specific hardware requirements, but we do not think any products are yet fully optimized, so we expect this to be an area of developing technology. In particular, the full pre-calculation approach fails with very large, sparse applications as the databases simply get too large (the database explosion problem), whereas doing everything on-the-fly is much too slow with large databases, even if exotic hardware is used. Even though it may seem miraculous at first if reports that previously took days now take only minutes, users soon get bored of waiting, and the project will be much less successful than if it had delivered a near instantaneous response, even at the cost of less detailed analysis. The BI and OLAP Surveys have found that slow query response is consistently the most often-cited technical problem with OLAP products, so too many deployments are clearly still failing to pass this test. Indeed, there are strong indications that users are becoming ever more demanding, so query responses that would have been considered adequate just a few years ago are now regarded as painfully slow. After all, if Google can search a large proportion of all the on-line information in the world in a quarter of a second, why should relatively tiny amounts of management information take orders of magnitude longer to query?
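The response-time targets in the FAST criterion can be checked against measured query latencies with a short Python sketch. The text says only "most" and "very few", so the cut-offs used here (more than half under five seconds, at most 5% over 20 seconds) are assumptions for illustration, as are the sample latencies and the `fast_report` helper.

```python
def fast_report(latencies_s, most=0.5, very_few=0.05):
    """Summarize query latencies (in seconds) against the FAST targets.

    `most` and `very_few` are assumed thresholds; the FASMI text does not
    give exact percentages.
    """
    if not latencies_s:
        raise ValueError("at least one latency sample is required")
    n = len(latencies_s)
    share_under_5s = sum(t < 5 for t in latencies_s) / n
    share_over_20s = sum(t > 20 for t in latencies_s) / n
    return {
        "share_under_5s": share_under_5s,
        "share_over_20s": share_over_20s,
        "passes": share_under_5s > most and share_over_20s <= very_few,
    }


# Hypothetical measurements from ten queries.
samples = [0.4, 0.9, 2.5, 3.1, 4.8, 6.0, 7.2, 1.1, 0.7, 25.0]
report = fast_report(samples)
print(report)
```

With this sample, 70% of queries finish in under five seconds, but one in ten takes longer than 20 seconds, so the workload fails under the assumed "very few" threshold.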

ANALYSIS means that the system can cope with any business logic and statistical analysis that is relevant for the application and the user, and keep it easy enough for the target user. Although some pre-programming may be needed, we do not think it acceptable if all application definitions have to be done using a professional 4GL. It is certainly necessary to allow the user to define new ad hoc calculations as part of the analysis and to report on the data in any desired way, without having to program, so we exclude products (like Oracle Discoverer) that do not allow adequate end-user oriented calculation flexibility. We do not mind whether this analysis is done in the vendor's own tools or in a linked external product such as a spreadsheet, simply that all the required analysis functionality be provided in an intuitive manner for the target users. This could include specific features like time series analysis, cost allocations, currency translation, goal seeking, ad hoc multidimensional structural changes, non-procedural modeling, exception alerting, data mining and other application dependent features. These capabilities differ widely between products, depending on their target markets.

SHARED means that the system implements all the security requirements for confidentiality (possibly down to cell level) and, if multiple write access is needed, concurrent update locking at an appropriate level. Not all applications need users to write data back, but for the growing number that do, the system should be able to handle multiple updates in a timely, secure manner. This is a major area of weakness in many OLAP products, which tend to assume that all OLAP applications will be read-only, with simplistic security controls. Even products with multi-user read-write often have crude security models; an example is Microsoft OLAP Services.

MULTIDIMENSIONAL is our key requirement. If we had to pick a one-word definition of OLAP, this is it. The system must provide a multidimensional conceptual view of the data, including full support for hierarchies and multiple hierarchies, as this is certainly the most logical way to analyze businesses and organizations. We are not setting up a specific minimum number of dimensions that must be handled as it is too application dependent and most products seem to have enough for their target markets. Again, we do not specify what underlying database technology should be used providing that the user gets a truly multidimensional conceptual view.

INFORMATION is all of the data and derived information needed, wherever it is and however much is relevant for the application. We are measuring the capacity of various products in terms of how much input data they can handle, not how many Gigabytes they take to store it. The capacities of the products differ greatly: the largest OLAP products can hold at least a thousand times as much data as the smallest. There are many considerations here, including data duplication, RAM required, disk space utilization, performance, integration with data warehouses and the like.

We think that the FASMI test is a reasonable and understandable definition of the goals OLAP is meant to achieve. We encourage users and vendors to adopt this definition, which we hope will avoid the controversies of previous attempts.

The techniques used to achieve it include many flavors of client/server architecture, time series analysis, object-orientation, optimized proprietary data storage, multithreading and various patented ideas that vendors are so proud of. We have views on these as well, but we would not want any such technologies to become part of the definition of OLAP. Vendors who are covered in this report had every chance to tell us about their technologies, but it is their ability to achieve OLAP goals for their chosen application areas that impressed us most.




Dr Edgar “Ted” Codd (1923-2003)

It is with sadness that I learned of the death last week of Dr Ted Codd, the inventor of the relational database model. I was fortunate enough to meet Dr Codd in October 1994, shortly after he, in a white paper commissioned by Arbor Software (now part of Hyperion Solutions), first coined the term OLAP. I was chairing a conference in London (the same conference at which I first met Nigel Pendse) and Dr Codd gave the keynote address. He explained how analytical databases were a necessary companion to databases built on the relational model which he invented in 1969. It is easy to forget today, when the relational database is ubiquitous, that there was a time when it was far from the dominant standard and, in fact, competed with network, hierarchical and other types of databases. Dr Codd defended his invention strongly. Even when Honeywell MRDS, the first commercial relational database, was released in 1976, there were still many detractors. By the time Oracle released its relational database in 1979 and started to gain traction with the market, Dr Codd had spent ten long years defending his invention. It was not until the early 80’s that the relational database emerged as a clear standard.

Subsequently I was fortunate enough to share the podium with Dr Codd and his knowledgeable wife, Sharon, as we gave many presentations on the subject of OLAP at conferences around North America. This gave me a chance to get to know both Ted and Sharon on a more personal level. To hear Ted explain how he landed flying boats on lakes in Africa during the Second World War made me realize that there was much more to Ted than the public face of this man who revolutionized computing in his lifetime.

The invention of the relational model is well understood to be a major factor in making modern computing what it is today. ERP systems could not have evolved to where they are without a strong database standard such as the relational model. Modern e-commerce Web sites are dependent on relational technology. But relational technology is equally crucial to those of us in the OLAP world. The source data for our OLAP system comes almost exclusively from relational sources, and it is reassuring to know that the man who invented the relational model also recognized that it could not provide, without help, the rich analytics that business needs. In the 1994 white paper Dr Codd wrote, “Attempting to force one technology or tool to satisfy a particular need for which another tool

The Codd rules and features

In 1993, E.F. Codd & Associates published a white paper, commissioned by Arbor Software (now Hyperion Solutions), entitled ‘Providing OLAP (On-line Analytical Processing) to User-Analysts: An IT Mandate’. The late Dr Codd was very well known as a respected database researcher from the 1960s through to the late 1980s and is credited with being the inventor of the relational database model in 1969. Unfortunately, his OLAP rules proved to be controversial due to being vendor-sponsored, rather than mathematically based.

It is also unclear how much involvement Dr Codd himself had with the OLAP work, but it seems likely that his role was very limited, with more of the work being done by his wife and a temporary researcher than by Dr Codd himself. Several of the rules seem to have been invented by the sponsoring vendor, not Dr Codd. The white paper should therefore be regarded as a vendor-published brochure (which it was) rather than as a serious research paper (which it was not). Note that this paper was not published by Codd & Date, and Chris Date has never endorsed Codd’s OLAP work.

The OLAP white paper included 12 rules, which are now well known (and available for download from vendors’ Web sites). They were followed by another six (much less well known) rules in 1995 and Dr Codd also restructured the rules into four groups, calling them ‘features’. The features are briefly described and evaluated here, but they are now rarely quoted and little used.

Basic Features B

F1: Multidimensional Conceptual View (Original Rule 1). Few would argue with this feature; like Dr Codd, we believe this to be the central core of OLAP. Dr Codd included ‘slice and dice’ as part of this requirement.

F2: Intuitive Data Manipulation (Original Rule 10). Dr Codd preferred data manipulation to be done through direct actions on cells in the view, without recourse to menus or multiple actions. One assumes that this is by using a mouse (or equivalent), but Dr Codd did not actually say so. Many products fail on this, because they do not necessarily support double clicking or drag and drop. The vendors, of course, all claim otherwise. In our view, this feature adds little value to the evaluation process. We think that products should offer a choice of modes (at all times), because not all users like the same approach.

F3: Accessibility: OLAP as a Mediator (Original Rule 3). In this rule, Dr Codd essentially described OLAP engines as middleware, sitting between heterogeneous data sources and an OLAP front-end. Most products can achieve this, but often with more data staging and batching than vendors like to admit.

F4: Batch Extraction vs Interpretive (New). This rule effectively required that products offer both their own staging database for OLAP data as well as offering live access to external data. We agree with Dr Codd on this feature and are disappointed that only a minority of OLAP products properly comply with it, and even those products do not often make it easy or automatic. In effect, Dr Codd was endorsing multidimensional data staging plus partial pre-calculation of large multidimensional databases, with transparent reach-through to underlying detail. Today, this would be regarded as the definition of a hybrid OLAP, which is indeed becoming a popular architecture, so Dr Codd has proved to be very perceptive in this area.
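
The combination described under F4 (staged, partially pre-calculated data with reach-through to detail) can be illustrated with a small Python sketch. It is not taken from any product: the names `staged`, `detail_rows` and `query` and the sample figures are assumptions made up for this example.

```python
# Hypothetical detail rows held in an external (e.g. relational) source:
# (product, month, amount)
detail_rows = [
    ("Widgets", "Jan", 100.0),
    ("Widgets", "Jan", 20.0),
    ("Gadgets", "Jan", 30.0),
]

# Hypothetical staging store: only some aggregates were pre-calculated.
staged = {("Widgets", "Jan"): 120.0}


def query(product, month):
    """Answer from the staged aggregate if present, else reach through to detail."""
    key = (product, month)
    if key in staged:
        return staged[key], "staged"
    # Reach-through: compute the aggregate live from the underlying detail.
    total = sum(amount for p, m, amount in detail_rows if (p, m) == key)
    return total, "reach-through"
```

In this sketch the caller never needs to know which path produced the answer; the second element of the returned tuple is exposed only to make the two paths visible.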

F5: OLAP Analysis Models (New). Dr Codd required that OLAP products should support all four analysis models that he described in his white paper (Categorical, Exegetical, Contemplative and Formulaic). We hesitate to simplify Dr Codd’s erudite phraseology, but we would describe these as parameterized static reporting, slicing and dicing with drill down, ‘what if?’ analysis and goal seeking models, respectively. All OLAP tools in this Report support the first two (but some other claimants do not fully support the second), most support the third to some degree (but probably less than Dr Codd would have liked) and few support the fourth to any usable extent. Perhaps Dr Codd was anticipating data mining in this rule?

F6: Client Server Architecture (Original Rule 5). Dr Codd required not only that the product should be client/server but that the server component of an OLAP product should be sufficiently intelligent that various clients could be attached with minimum effort and programming for integration. This is a much tougher test than simple client/server, and relatively few products qualify. We would argue that this test is probably tougher than it needs to be, and we prefer not to dictate architectures. However, if you do agree with the feature, then you should be aware that most vendors who claim compliance, do so wrongly.
In effect, this is also an indirect requirement for openness on the desktop. Perhaps Dr Codd, without ever using the term, was thinking of what the Web would one day deliver? Or perhaps he was anticipating a widely accepted API standard, which still does not really exist. Perhaps, one day, XML for Analysis will fill this gap.

F7: Transparency (Original Rule 2). This test was also a tough but valid one. Full compliance means that a user of, say, a spreadsheet should be able to get full value from an OLAP engine and not even be aware of where the data ultimately comes from. To do this, products must allow live access to heterogeneous data sources from a full function spreadsheet add-in, with the OLAP server engine in between. Although all vendors claimed compliance, many did so by outrageously rewriting Dr Codd’s words. Even Dr Codd’s own vendor-sponsored analyses of Essbase and (then) TM/1 ignore part of the test. In fact, there are a few products that do fully comply with the test, including Analysis Services, Express, and Holos, but neither Essbase nor iTM1 (because they do not support live, transparent access to external data), in spite of Dr Codd’s apparent endorsement. Most products fail to give either full spreadsheet access or live access to heterogeneous data sources. Like the previous feature, this is a tough test for openness.

F8: Multi-User Support (Original Rule 8). Dr Codd recognized that OLAP applications were not all read-only and said that, to be regarded as strategic, OLAP tools must provide concurrent access (retrieval and update), integrity and security. We agree with Dr Codd, but also note that many OLAP applications are still read-only. Again, all the vendors claim compliance but, on a strict interpretation of Dr Codd’s words, few are justified in so doing.

Special Features S

F9: Treatment of Non-Normalized Data (New). This refers to the integration between an OLAP engine and denormalized source data. Dr Codd pointed out that any data updates performed in the OLAP environment should not be allowed to alter stored denormalized data in feeder systems. He could also be interpreted as saying that data changes should not be allowed in what are normally regarded as calculated cells within the OLAP database. For example, Essbase allows this, and Dr Codd would perhaps have disapproved.

F10: Storing OLAP Results: Keeping Them Separate from Source Data (New). This is really an implementation rather than a product issue, but few would disagree with it. In effect, Dr Codd was endorsing the widely-held view that read-write OLAP applications should not be implemented directly on live transaction data, and OLAP data changes should be kept distinct from transaction data. The method of data write-back used in Microsoft Analysis Services is the best implementation of this, as it allows the effects of data changes even within the OLAP environment to be kept segregated from the base data.

F11: Extraction of Missing Values (New). All missing values are cast in the uniform representation defined by the Relational Model Version 2. We interpret this to mean that missing values are to be distinguished from zero values. In fact, in the interests of storing sparse data more compactly, a few OLAP tools such as TM1 do break this rule, without great loss of function.

F12: Treatment of Missing Values (New). All missing values to be ignored by the OLAP analyzer regardless of their source. This relates to Feature 11, and is probably an almost inevitable consequence of how multidimensional engines treat all data.
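
To make Features 11 and 12 concrete, here is a minimal Python sketch of one way to keep missing values distinct from zeros and to ignore them in a calculation. The sparse dictionary layout, the names `cells`, `read_cell` and `average`, and the figures are illustrative assumptions, not how any particular OLAP engine stores data.

```python
# Hypothetical sparse store: only populated cells have an entry.
# Key: (product, month).
cells = {
    ("Widgets", "Jan"): 120.0,
    ("Widgets", "Feb"): 0.0,  # a real zero: the value is known to be nothing
    # ("Widgets", "Mar") has no entry at all: the value is missing
}


def read_cell(product, month):
    """Return the stored value, or None when the cell is missing."""
    return cells.get((product, month))


def average(product, months):
    """Average over non-missing cells only; missing cells are ignored."""
    values = []
    for month in months:
        value = read_cell(product, month)
        if value is not None:
            values.append(value)
    return sum(values) / len(values) if values else None
```

Here the February zero counts towards the average while the missing March cell does not, so the average over the three months is taken over two values. A store that collapsed missing cells into zeros would instead divide by three.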

Reporting Features R

F13: Flexible Reporting (Original Rule 11). Dr Codd required that the dimensions can be laid out in any way that the user requires in reports. We would agree, and most products are capable of this in their formal report writers. Dr Codd did not explicitly state whether he expected the same flexibility in the interactive viewers, perhaps because he was not aware of the distinction between the two. We prefer that it is available, but note that relatively few viewers are capable of it. This is one of the reasons that we prefer that analysis and reporting facilities be combined in one module.

F14: Uniform Reporting Performance (Original Rule 4). Dr Codd required that reporting performance be not significantly degraded by increasing the number of dimensions or database size. Curiously, nowhere did he mention that the performance must be fast, merely that it be consistent. In fact, our experience suggests that merely increasing the number of dimensions or database size does not affect performance significantly in fully pre-calculated databases, so Dr Codd could be interpreted as endorsing this approach, which may not be a surprise given that Arbor Software sponsored the paper. However, reports with more content or more on-the-fly calculations usually take longer (in the good products, performance is almost linearly dependent on the number of cells used to produce the report, which may be more than appear in the finished report) and some dimensional layouts will be slower than others, because more disk blocks will have to be read. There are differences between products, but the principal factor that affects performance is the degree to which the calculations are performed in advance and where live calculations are done (client, multidimensional server engine or RDBMS). This is far more important than database size, number of dimensions or report complexity.

F15: Automatic Adjustment of Physical Level (Supersedes Original Rule 7). Dr Codd required that the OLAP system adjust its physical schema automatically to adapt to the type of model, data volumes and sparsity. We agree with him, but are disappointed that most vendors fall far short of this noble ideal. We would like to see more progress in this area and also in the related area of determining the degree to which models should be pre-calculated (a major issue that Dr Codd ignores). The Panorama technology, acquired by Microsoft in October 1996, broke new ground here, and users can now benefit from it in Microsoft Analysis Services.

Dimension Control D

F16: Generic Dimensionality (Original Rule 6). Dr Codd took the purist view that each dimension must be equivalent in both its structure and operational capabilities. This may not be unconnected with the fact that this is an Essbase characteristic. However, he did allow additional operational capabilities to be granted to selected dimensions (presumably including time), but he insisted that such additional functions should be grantable to any dimension. He did not want the basic data structures, formulae or reporting formats to be biased towards any one dimension. This has proven to be one of the most controversial of all the original 12 rules. Technology focused products tend to largely comply with it, so the vendors of such products support it. Application focused products usually make no effort to comply, and their vendors bitterly attack the rule. With a strictly purist interpretation, few products fully comply. We would suggest that if you are purchasing a tool for general purpose, multiple application use, then you want to consider this rule, but even then with a lower priority. If you are buying a product for a specific application, you may safely ignore the rule.

F17: Unlimited Dimensions & Aggregation Levels (Original Rule 12). Technically, no product can possibly comply with this feature, because there is no such thing as an unlimited entity on a limited computer. In any case, few applications need more than about eight or ten dimensions, and few hierarchies have more than about six consolidation levels. Dr Codd suggested that if a maximum must be accepted, it should be at least 15 and preferably 20; we believe that this is too arbitrary and takes no account of usage. You should ensure that any product you buy has limits that are greater than you need, but there are many other limiting factors in OLAP products that are liable to trouble you more than this one. In practice, therefore, you can probably ignore this requirement.

F18: Unrestricted Cross-dimensional Operations (Original Rule 9). Dr Codd asserted, and we agree, that all forms of calculation must be allowed across all dimensions, not just the ‘measures’ dimension. In fact, many products which use only relational storage are weak in this area. Most products, such as Essbase, with a multidimensional database are strong. These types of calculations are important if you are doing complex calculations, not just cross tabulations, and are particularly relevant in applications that analyze profitability.





Category:OLAP History


Contents

1 The History of OLAP
2 Birth of the Multidimensional Analysis through the APL
3 Express, an Enduring Example
4 System W for Financial Applications
5 Metaphor, the Beginning of the Client/Server
6 The New MIS Using GUI
7 PowerOLAP, Real-time Data and Excel Integration
8 The Spread of Spreadsheets

The History of OLAP

OLAP is not a new concept; its origins can be traced back to 1962. The term OLAP itself was not coined until 1993, in a white paper by the highly esteemed database researcher Ted Codd, who also established the 12 rules for an OLAP product. Like many other technologies, it has gone through several stages of evolution, and the pattern of its progress is relatively intricate to follow.

Birth of the Multidimensional Analysis through the APL

It was Kenneth Iverson who first laid the foundation of OLAP through his book “A Programming Language”, which defined a mathematical language with processing operators and multidimensional variables. APL was regarded as the first multidimensional language, and it was implemented as a computer programming language by IBM during the late 1960s.

Iverson created brief notations by employing Greek symbols as operators. At the time, high-resolution GUIs had not yet surfaced and, because APL uses Greek symbols, it required special hardware such as special keyboards, screens and printers. On top of this, since early APL programs were interpreted rather than compiled, they used machine resources inefficiently and were known for consuming too much RAM, to name only a few of the drawbacks. Maintenance of APL-based mainframe products is very costly, and most programmers encounter difficulties in programming multidimensional applications using arrays in other languages.

Eventually, the market significance of APL declined, but it still survives to a limited degree. Although it was not deemed a modern OLAP tool, several of its ideas live on in some modern multidimensional applications.

Express, an Enduring Example

A new multidimensional product, Express, emerged during the 1970s and became a popular OLAP offering. It was the first multidimensional tool aimed at marketing-related applications. It later evolved into a hybrid OLAP product after its acquisition by Oracle and has thrived for more than three decades. It remains, even now, one of the well-marketed multidimensional products. One of Express’ better-known successors is Oracle9i OLAP. Although several enhanced versions have been released over the years, the concepts and data models remain unchanged.

The 1980s played a significant role in the advancement of the OLAP industry, as the decade saw the rise of many multidimensional products.


System W for Financial Applications

In 1981, Comshare developed a new decision support system, System W, in an attempt to expand the scope of its market and services. System W was the first OLAP tool to cater to financial applications and the first to apply the hypercube approach in its multidimensional modeling. Although it proved to be a profitable venture for Comshare for quite some time, it did not achieve much success in the market and was even less favored by technical people, as it was more difficult to program than other software of its kind. Furthermore, it consumed considerable machine resources and often suffered from database explosion.

UNIX also released APL but never promoted it as an OLAP tool. System W is no longer marketed but still runs, to a limited extent, on a few IBM mainframes. Other products that replicated System W concepts followed, such as Comshare's One-Up for DOS and the Windows-based Commander Prism, but they did not make a significant mark in the industry. In 1992, Essbase was launched by Arbor Software (later Hyperion Solutions), and by 1997 it had become a major OLAP server product. But just like the original product, this descendant also suffered from database explosion. Hyperion was finally able to resolve the problem of exploding databases with the release of its Essbase 7X version.

Metaphor, the Beginning of the Client/Server

A couple of years or so after the release of System W, Metaphor, generally considered the first ROLAP product, entered the OLAP market. This multidimensional product established new concepts such as client/server computing, multidimensional processing on relational data, workgroup processing and object-oriented development, and was designed mainly for consumer goods companies. The vendor of Metaphor was compelled to create proprietary PCs and networks, since hardware in those days could barely support Metaphor’s requirements.

In 1991, IBM acquired Metaphor and launched the product under the new name IDS. The product still remains operational to
support remaining loyal users.

The New MIS Using GUI

A new type of Management Information System product emerged during the mid 1980s in the form of the Executive Information System, more commonly known as EIS, which emphasizes the use of graphical user interfaces (GUIs). In 1985, Pilot Command Center, branded as the first client/server EIS, was released.

Other client/server products that came out include Strategy, Holos, and Information Advantage. Pilot decided to phase out Command Center but implemented some of its concepts in its Lightship Server product. Some Command Center concepts, such as automatic time series handling, multidimensional client/server processing and simplified human factors, can still be seen in some modern OLAP products.


PowerOLAP, Real-time Data and Excel Integration

Founded in 1997, PARIS Technologies published PowerOLAP™, which represents a milestone in the evolution of OLAP (on-line analytical processing) technology. Like any important evolutionary event, PowerOLAP combines the most advanced features of what came before it with new capabilities. Most significantly, PowerOLAP enables users to reach through seamlessly to access transactional data in a relational database for dynamic OLAP manipulations in a true multidimensional environment. In addition, PowerOLAP employs Excel and the Web as a front end, connecting users throughout an organization with underlying data sources via the tools they know best, direct to their desktops.


The Spread of Spreadsheets

A new end-user analysis tool was becoming a favorite during the late 1980s. The spreadsheet market was growing fast, which compelled some vendors to create multidimensional applications that could reside in a spreadsheet environment.

Compete was the first to open the market for a multidimensional spreadsheet. Computer Associates later acquired it from its original vendor, adding it to its other spreadsheet products such as SuperCalc and 20/20, then advertised it heavily and offered it at a lower price, but even so it did not gain much market significance. CA later released version 5 of SuperCalc, which was clearly influenced by the almost defunct Compete product.

Improv from Lotus followed Compete. Lotus began to develop Improv for the NeXT machine under the code name ‘BackBay’, and Improv was later launched on NeXT machines. This became a phenomenal success and considerably augmented Lotus’ sales until after the efforts to port Improv to Windows and Macintosh system software. The rise of Microsoft’s competing Excel product marked the beginning of the decline of Lotus. Lotus attempted to move Improv down the market in the hope of increasing its marketability, but this did not work out. Excel steadily gained on 1-2-3 and ultimately proved to be the superior product, which dominated the market. Microsoft’s integration of the PivotTable feature in Excel was probably one of the most important enhancements of the product, as PivotTables became the most popular and widely used tool for multidimensional analysis. Over the years, Microsoft continued to produce new and enhanced versions of Excel, such as Excel 2000 and Excel 2003, which showcase a more sophisticated PivotTable feature that functions both as a desktop OLAP (small cubes, generated from large databases, but downloaded to PCs for processing, even though, in Web implementations, the cubes usually reside on the server) and as a client to Microsoft Analysis Services.

Sinper Corporation entered the OLAP market during the late 1980s with its multidimensional analysis software for DOS and Windows, then known as TM/1. Sinper turned TM/1 into a multidimensional back-end server for Excel and 1-2-3. Essbase by Arbor followed suit. The market for multidimensional spreadsheets boomed, and more and more vendors were attracted to this growing business. Traditional vendors of host-oriented products like Acumate, Express, Gentia, Holos, Hyperion, Mineshare, MetaCube, PowerPlay and WhiteLight all offer products which provide highly integrated spreadsheet access to their OLAP servers.

Soon after came the release of the OLAP@Work Excel Add-In, with features that enable users to make full use of OLAP Services. In 2004, Excel add-ins went mainstream: vendors like Business Objects, Cognos, Microsoft, MicroStrategy and Oracle launched their own versions of the product. Concurrently, IntelligentApps, a main vendor of Analysis Services Excel add-ins, was acquired by Sage.

Microsoft released PerformancePoint, which delivers more functionality for performance management, in 2007, having announced the product the previous year.




OLAP AND OLAP SERVER DEFINITIONS

OLAP: ON-LINE ANALYTICAL PROCESSING

Defined terms

On-Line Analytical Processing (OLAP) is a category of software technology that enables analysts, managers and executives to gain insight into data through fast, consistent, interactive access to a wide variety of possible views of information that has been transformed from raw data to reflect the real dimensionality of the enterprise as understood by the user.

OLAP functionality is characterized by dynamic multi-dimensional analysis of consolidated enterprise data supporting end user analytical and navigational activities including:

calculations and modeling applied across dimensions, through hierarchies and/or across members
trend analysis over sequential time periods
slicing subsets for on-screen viewing
drill-down to deeper levels of consolidation
reach-through to underlying detail data
rotation to new dimensional comparisons in the viewing area

OLAP is implemented in a multi-user client/server mode and offers consistently rapid response to queries, regardless of database size and complexity. OLAP helps the user synthesize enterprise information through comparative, personalized viewing, as well as through analysis of historical and projected data in various "what-if" data model scenarios. This is achieved through use of an OLAP Server.


OLAP SERVER

An OLAP server is a high-capacity, multi-user data manipulation engine specifically designed to support and operate on multi-dimensional data structures. A multi-dimensional structure is arranged so that every data item is located and accessed based on the intersection of the dimension members which define that item. The design of the server and the structure of the data are optimized for rapid ad-hoc information retrieval in any orientation, as well as for fast, flexible calculation and transformation of raw data based on formulaic relationships. The OLAP Server may either physically stage the processed multi-dimensional information to deliver consistent and rapid response times to end users, or it may populate its data structures in real-time from relational or other databases, or offer a choice of both. Given the current state of technology and the end user requirement for consistent and rapid response times, staging the multi-dimensional data in the OLAP Server is often the preferred method.

OLAP GLOSSARY

Defined terms:

AGGREGATE
ANALYSIS, MULTI-DIMENSIONAL
ARRAY, MULTI-DIMENSIONAL
CALCULATED MEMBER
CELL
CHILDREN
COLUMN DIMENSION
CONSOLIDATE
CUBE
DENSE
DERIVED DATA
DERIVED MEMBERS
DETAIL MEMBER
DIMENSION
DRILL DOWN/UP
FORMULA
FORMULA, CROSS-DIMENSIONAL
GENERATION, HIERARCHICAL
HIERARCHICAL RELATIONSHIPS
HORIZONTAL DIMENSION
HYPERCUBE
INPUT MEMBERS
LEVEL, HIERARCHICAL
MEMBER, DIMENSION
MEMBER COMBINATION
MISSING DATA, MISSING VALUE
MULTI-DIMENSIONAL DATA STRUCTURE
MULTI-DIMENSIONAL QUERY LANGUAGE
NAVIGATION
NESTING (OF MULTI-DIMENSIONAL COLUMNS AND ROWS)
NON-MISSING DATA
OLAP CLIENT
PAGE DIMENSION
PAGE DISPLAY
PARENT
PIVOT
PRE-CALCULATED/PRE-CONSOLIDATED DATA
REACH THROUGH
ROLL-UP
ROTATE
ROW DIMENSION
SCOPING
SELECTION
SLICE
SLICE AND DICE
SPARSE
VERTICAL DIMENSION

Definitions:

AGGREGATE

See: Consolidate

ANALYSIS, MULTI-DIMENSIONAL

The objective of multi-dimensional analysis is for end users to gain insight into the meaning contained in databases. The multi-dimensional approach to analysis aligns the data content with the analyst's mental model, hence reducing confusion and lowering the incidence of erroneous interpretations. It also eases navigating the database, screening for a particular subset of data, asking for the data in a particular orientation and defining analytical calculations. Furthermore, because the data is physically stored in a multi-dimensional structure, the speed of these operations is many times faster and more consistent than is possible in other database structures. This combination of simplicity and speed is one of the key benefits of multi-dimensional analysis.

ARRAY, MULTI-DIMENSIONAL

A group of data cells arranged by the dimensions of the data. For example, a spreadsheet exemplifies a two-dimensional array with the data cells arranged in rows and columns, each being a dimension. A three-dimensional array can be visualized as a cube with each dimension forming a side of the cube, including any slice parallel with that side. Higher dimensional arrays have no physical metaphor, but they organize the data in the way users think of their enterprise. Typical enterprise dimensions are time, measures, products, geographical regions, sales channels, etc.

Synonyms: Multi-dimensional Structure, Cube, Hypercube


CALCULATED MEMBER

A calculated member is a member of a dimension whose value is determined from other members' values (e.g., by application of a mathematical or logical operation). Calculated members may be part of the OLAP server database or may have been specified by the user during an interactive session. A calculated member is any member that is not an input member.
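
As a hedged illustration of the definition above, the following Python sketch treats "Margin" as a calculated member of a measures dimension, derived from the input members "Sales" and "Costs". The member names, figures and the `get` function are assumptions invented for this example.

```python
# Input members: values that are loaded, not derived.
inputs = {
    ("Sales", "Jan"): 120.0,
    ("Costs", "Jan"): 70.0,
    ("Sales", "Feb"): 95.0,
    ("Costs", "Feb"): 60.0,
}

# Calculated members: each is defined by a formula over other members.
formulas = {
    "Margin": lambda get, month: get("Sales", month) - get("Costs", month),
}


def get(measure, month):
    """Return a cell value, evaluating the formula for calculated members."""
    if measure in formulas:
        return formulas[measure](get, month)
    return inputs[(measure, month)]
```

Because the formula is evaluated on request, a new calculated member could be added to `formulas` during a session without touching the stored input data, which mirrors the two cases named in the definition.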

CELL

A single datapoint that occurs at the intersection defined by selecting one member from each dimension in a multi-dimensional array. For example, if the dimensions are measures, time, product and geography, then the dimension members: Sales, Janu






OLAP Functionality

At the core of any OLAP system is the concept of an OLAP cube (also called a multidimensional cube or a hypercube). It consists of numeric facts called measures which are categorized by dimensions. The cube metadata is typically created from a star schema or snowflake schema of tables in a relational database. Measures are derived from the records in the fact table and dimensions are derived from the dimension tables.



OLAP Cube

An OLAP cube is a data structure that allows fast analysis of data. The arrangement of data into cubes overcomes a limitation of relational databases. Relational databases are not well suited for near instantaneous analysis and display of large amounts of data. Instead, they are better suited for creating records from a series of transactions known as OLTP or On-Line Transaction Processing. Although many report-writing tools exist for relational databases, these are slow when the whole database must be summarized.

Contents

1 Background
    1.1 Functionality
    1.2 Pivot
    1.3 Hierarchy
    1.4 OLAP operations
    1.5 Linking cubes and sparsity
    1.6 Variance in products
2 Technical definition

Background

OLAP cubes can be thought of as extensions to the two-dimensional array of a spreadsheet. For example, a company might wish to analyze some financial data by product, by time-period, by city, by type of revenue and cost, and by comparing actual data with a budget. These additional methods of analyzing the data are known as dimensions. Because there can be more than three dimensions in an OLAP system, the term hypercube is sometimes used.

Functionality

The OLAP cube consists of numeric facts called measures which are categorized by dimensions. The cube metadata is typically created from a star schema or snowflake schema of tables in a relational database. Measures are derived from the records in the fact table and dimensions are derived from the dimension tables.
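
A minimal Python sketch may help show how measures come from fact records and dimensions from dimension tables. The table contents, key names and the `build_cube` helper are hypothetical; a real system would read these from a relational database rather than from in-memory dictionaries.

```python
from collections import defaultdict

# Hypothetical dimension tables: surrogate key -> descriptive attributes.
dim_product = {1: {"name": "Widgets"}, 2: {"name": "Gadgets"}}
dim_city = {10: {"name": "Paris"}, 11: {"name": "Lyon"}}

# Hypothetical fact table: foreign keys into each dimension plus a numeric fact.
fact_sales = [
    {"product_id": 1, "city_id": 10, "amount": 100.0},
    {"product_id": 1, "city_id": 10, "amount": 50.0},
    {"product_id": 2, "city_id": 11, "amount": 30.0},
]


def build_cube(facts):
    """Sum the 'amount' measure for each (product, city) member combination."""
    cube = defaultdict(float)
    for row in facts:
        product = dim_product[row["product_id"]]["name"]
        city = dim_city[row["city_id"]]["name"]
        cube[(product, city)] += row["amount"]
    return dict(cube)


cube = build_cube(fact_sales)
```

The fact rows carry only keys and numbers; the member names that label the cube come entirely from the dimension tables, which is the division of roles the paragraph above describes.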

Pivot

A financial analyst might want to view or "pivot" the data in various ways, such as displaying all the cities down the page and all the products across the page. This could be for a specified period, version and type of expenditure. Having seen the data in this particular way, the analyst might then immediately wish to view it in another way. The cube could effectively be re-oriented so that the data displayed now had periods across the page and type of cost down the page. Because this re-orientation involved re-summarizing very large amounts of data, this new view had to be generated efficiently to avoid wasting the analyst's time, i.e. within seconds, rather than the hours a relational database and conventional report-writer might have taken.
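As a rough sketch of what re-orienting a view involves, the following Python function re-summarizes the same cells along whichever two axes the analyst picks. The cell layout (city, product, period), the sample figures and the pivot function itself are assumptions made for illustration; a real OLAP engine would rely on pre-computed aggregates rather than a scan like this.

```python
def pivot(cube, row_axis, col_axis, fixed):
    """Return {row_member: {col_member: total}} for cells matching `fixed`.

    `cube` maps tuples of dimension members to numbers, `row_axis` and
    `col_axis` are positions in those tuples, and `fixed` maps an axis
    position to the single member that axis must equal.
    """
    view = {}
    for key, value in cube.items():
        if any(key[axis] != member for axis, member in fixed.items()):
            continue
        row = view.setdefault(key[row_axis], {})
        row[key[col_axis]] = row.get(key[col_axis], 0) + value
    return view


# Hypothetical cells keyed by (city, product, period).
cube = {
    ("Boston", "Widget", "2005-05"): 10,
    ("Boston", "Gadget", "2005-05"): 4,
    ("Denver", "Widget", "2005-05"): 7,
    ("Denver", "Widget", "2005-06"): 3,
}

# Cities down the page, products across, for one specified period.
by_city = pivot(cube, row_axis=0, col_axis=1, fixed={2: "2005-05"})

# Re-oriented: periods down the page, cities across, summed over products.
by_period = pivot(cube, row_axis=2, col_axis=0, fixed={})
```

The second call shows why re-orientation implies re-summarizing: the product axis is no longer displayed, so its values have to be added up again for every cell of the new view.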

Hierarchy

Each of the elements of a dimension could be summarized using a hierarchy. The hierarchy is a series of parent-child relationships, typically where a parent member represents the consolidation of the members which are its children. Parent members can be further aggregated as the children of another parent.

For example, May 2005 could be summarized into Second Quarter 2005, which in turn would be summarized into the Year 2005. Similarly, the cities could be summarized into regions, countries and then global regions; products could be summarized into larger categories; and cost headings could be grouped into types of expenditure. Conversely, the analyst could start at a highly summarized level, such as the total difference between the actual results and the budget, and drill down into the cube to discover which locations, products and periods had produced this difference.
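The parent-child structure described above can be sketched as a simple mapping from each member to its parent. The member names, the sample values and the roll_up and children helpers are hypothetical and serve only to illustrate consolidation and drill-down.

```python
# Hypothetical parent-child links for a time dimension.
parent = {
    "Apr 2005": "Q2 2005",
    "May 2005": "Q2 2005",
    "Jun 2005": "Q2 2005",
    "Q2 2005": "2005",
}

# Hypothetical values stored at the leaf (month) level.
leaf_values = {"Apr 2005": 5, "May 2005": 7, "Jun 2005": 8}


def roll_up(leaf_values, parent):
    """Add every leaf value into each of its ancestors."""
    totals = dict(leaf_values)
    for member, value in leaf_values.items():
        ancestor = parent.get(member)
        while ancestor is not None:
            totals[ancestor] = totals.get(ancestor, 0) + value
            ancestor = parent.get(ancestor)
    return totals


def children(member, parent):
    """Drill down one level: list the members consolidated by `member`."""
    return sorted(child for child, p in parent.items() if p == member)


totals = roll_up(leaf_values, parent)
print(totals["Q2 2005"], totals["2005"])
print(children("Q2 2005", parent))
```

Drilling down from a summarized figure amounts to calling children on the member of interest and reading the totals of the members it returns.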

OLAP operations

The analyst can understand the meaning contained in the databases using multi-dimensional analysis. By aligning the data content with the analyst's mental model, the chances of confusion and erroneous interpretations are reduced. The analyst can navigate through the database and screen for a particular subset of the data, changing the data's orientations and defining analytical calculations. The user-initiated process of navigating by calling for page displays interactively, through the specification of slices via rotations and drill down/up, is sometimes called "slice and dice". Common operations include slice and dice, drill down, roll up, and pivot.

Slice: A slice is a subset of a multi-dimensional array corresponding to a single value for one or more members of the dimensions not in the subset.

Dice: The dice operation is a slice on more than two dimensions of a data cube (or more than two consecutive slices).

Drill Down/Up: Drilling down or up is a specific analytical technique whereby the user navigates among levels of data ranging from the most summarized (up) to the most detailed (down).

Roll-up: A roll-up involves computing all of the data relationships for one or more dimensions. To do this, a computational relationship or formula might be defined.

Pivot: To change the dimensional orientation of a report or page display.
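The slice and dice operations can be illustrated on a cube stored as a dictionary keyed by member tuples. This is a minimal sketch: the function names and sample data are hypothetical, and dice is modelled here as restricting several dimensions to chosen sets of members, which is one common reading and an assumption rather than a definition taken from the glossary cited above.

```python
def slice_cube(cube, axis, member):
    """Fix one dimension to a single member and drop that axis from the keys."""
    return {
        key[:axis] + key[axis + 1:]: value
        for key, value in cube.items()
        if key[axis] == member
    }


def dice(cube, allowed):
    """Keep cells whose members fall in the allowed sets; all axes are kept."""
    return {
        key: value
        for key, value in cube.items()
        if all(key[axis] in members for axis, members in allowed.items())
    }


# Hypothetical cells keyed by (city, product, quarter).
cube = {
    ("Boston", "Widget", "Q1"): 1,
    ("Boston", "Gadget", "Q1"): 2,
    ("Denver", "Widget", "Q2"): 3,
}

q1 = slice_cube(cube, axis=2, member="Q1")
boston_widgets = dice(cube, {0: {"Boston"}, 1: {"Widget"}})
```

A slice reduces the number of dimensions by one, while the dice sketched here returns a smaller cube with the same dimensions.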

Linking cubes and sparsity

The commercial OLAP products have different methods of creating the cubes and hypercubes and of linking cubes and hypercubes (see Types of OLAP in the article on OLAP).

Linking cubes is a method of overcoming sparsity. Sparsity arises when not every cell in the cube is filled with data, so valuable processing time is taken by effectively adding up zeros. For example, revenues may be available for each customer and product, but cost data may not be available at this level of analysis. Instead of creating a sparse cube, it is sometimes better to create another separate, but linked, cube in which a subset of the data can be analyzed in great detail. The linking ensures that the data in the cubes remain consistent.
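One way to picture the revenue and cost example is as two cubes of different granularity that share a product dimension. The sketch below assumes this reading; the data, the density measure and the margin_by_product helper are hypothetical, and commercial products implement linking in their own ways.

```python
from math import prod


def density(cells, dimension_sizes):
    """Fraction of the possible cells that actually hold data."""
    return len(cells) / prod(dimension_sizes)


# Detailed cube: revenue by (customer, product). Only populated cells are stored.
revenue = {
    ("Acme", "Widget"): 120.0,
    ("Acme", "Gadget"): 80.0,
    ("Bolt", "Widget"): 60.0,
}

# Separate, coarser cube: cost is only known by product.
cost = {"Widget": 100.0, "Gadget": 50.0}


def margin_by_product(revenue, cost):
    """Combine the two cubes through the product dimension they share."""
    totals = {}
    for (_customer, product), amount in revenue.items():
        totals[product] = totals.get(product, 0.0) + amount
    return {product: totals.get(product, 0.0) - cost[product] for product in cost}


margins = margin_by_product(revenue, cost)
revenue_density = density(revenue, (2, 2))
```

Keeping cost in its own cube avoids padding the customer-by-product cube with cost cells that would never be filled.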

Variance in products

The data in cubes may be updated at times, perhaps by different people. Techniques are therefore often needed to lock parts of the cube while one of the users is writing to it and to recalculate the cube's totals. Other facilities may raise an alert showing that previously calculated totals are no longer valid after new data has been added, but some products only calculate the totals when they are needed.
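The last approach, calculating totals only when they are needed, can be sketched as a cached total that is marked stale on every write. The LazyTotalCube class is a hypothetical illustration; it leaves out locking entirely and is not modelled on any specific product.

```python
class LazyTotalCube:
    """Cube whose grand total is recomputed only when it is requested."""

    def __init__(self):
        self._cells = {}
        self._total = None  # None marks the cached total as stale

    def write(self, key, value):
        """Store a cell and invalidate the previously calculated total."""
        self._cells[key] = value
        self._total = None

    def is_stale(self):
        """Report whether the cached total is no longer valid."""
        return self._total is None

    def total(self):
        """Return the total, recalculating it only if it is stale."""
        if self._total is None:
            self._total = sum(self._cells.values())
        return self._total


cube = LazyTotalCube()
cube.write(("Boston", "Widget"), 10)
first_total = cube.total()
cube.write(("Denver", "Widget"), 5)
stale_after_write = cube.is_stale()
second_total = cube.total()
```

The is_stale check plays the role of the alert mentioned above: it tells a reader that a previously calculated total can no longer be trusted.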

Technical definition

In database theory, an OLAP cube is an abstract representation of a projection of an RDBMS relation. Given a relation of order N, consider a projection that subtends X, Y, and Z as the key and W as the residual attribute. Characterizing this as a function,

W : (X, Y, Z) → W

the attributes X, Y, and Z correspond to the axes of the cube, while the W value into which each (X, Y, Z) triple maps corresponds to the data element that populates each cell of the cube.

Insofar as two-dimensional output devices cannot readily characterize four dimensions, it is more practical to project "slices" of the data cube (we say project in the classic vector analytic sense of dimensional reduction, not in the SQL sense, although the two are clearly conceptually homologous), perhaps

W : (X, Y) → W

which may suppress a primary key, but still have some semantic significance, perhaps a slice of the triadic functional representation for a given Z value of interest.

The motivation behind OLAP displays harks back to the cross-tabbed report paradigm of 1980s DBMS. One may wish for a spreadsheet-style display, where, to appropriate the Microsoft Excel paradigm, values of X populate row $1; values of Y populate column $A; and values of W : (X, Y) → W populate the individual cells "southeast of" $B2, so to speak, $B2 itself included. While one can certainly use the DML (Data Manipulation Language) of traditional SQL to display (X, Y, W) triples, this output format is not nearly as convenient as the cross-tabbed alternative: certainly, the former requires one to hunt linearly for a given (X, Y) pair in order to determine the corresponding W value, while the latter enables one to more conveniently scan for the intersection of the proper X column with the proper Y row.
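The difference between a list of (X, Y, W) triples and the cross-tabbed layout can be made concrete with a short Python sketch. The crosstab function and the sample triples are hypothetical; the grid it returns puts X values in the first row and Y values in the first column, as in the spreadsheet layout described above.

```python
def crosstab(triples):
    """Arrange (x, y, w) triples as a grid: X across the top, Y down the side."""
    xs = sorted({x for x, _, _ in triples})
    ys = sorted({y for _, y, _ in triples})
    lookup = {(x, y): w for x, y, w in triples}
    grid = [[""] + xs]  # header row; the corner cell is left empty
    for y in ys:
        grid.append([y] + [lookup.get((x, y), "") for x in xs])
    return grid


# Hypothetical triples, as a plain SQL query might return them.
triples = [
    ("East", "Q1", 10),
    ("West", "Q1", 7),
    ("East", "Q2", 12),
]

grid = crosstab(triples)
for row in grid:
    print(row)
```

Finding the value for a given pair in the grid means locating one row and one column, whereas the triple list has to be scanned from the top.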

See also
Cube



OLAP


OLAP, or On-Line Analytical Processing, is a software technology that enables analysts, managers and executives to gain insight into data through fast, consistent, interactive access to a wide variety of possible views of information that has been transformed from raw data to reflect the real dimensionality of the enterprise as understood by a user. OLAP functionality is characterized by dynamic multi-dimensional analysis of consolidated enterprise data supporting end user analytical and navigational activities. OLAP tools do not store individual transaction records in two-dimensional, row-by-column formats, like a worksheet, but instead use multi-dimensional database structures, known as Cubes in OLAP terminology, to store arrays of consolidated information. The data and formulas are stored in an optimized multidimensional database, while views of the data are created on demand. Analysts can take any view, or Slice, of a Cube to produce a worksheet-like view of points of interest.



Category:OLAP and Excel


The Power of Excel-Friendly OLAP

Should Excel be a key component of your company’s BPM system?

There’s no doubt how most IT managers would answer this question. Name IT’s top ten requirements for a successful BPM system, and they’ll quickly explain how Excel violates dozens of them. Even the user community is concerned. Companies are larger and more complex now than in the past; they seem too complex for Excel. Managers need information more quickly now; they can’t wait for another Excel report.

Excel spreadsheets don’t scale well. They can’t be used by many different users. Excel reports have many errors. Excel security is a joke. Excel output is ugly. Excel consolidation occupies a large corner of Spreadsheet Hell. And Sarbanes-Oxley has changed everything.

Or so we’re told.

For these reasons, and many more, a growing number of companies of all sizes have concluded that it’s time to replace Excel.

But before your company takes that leap of hope or faith, perhaps you should take another look at Excel…particularly when Excel can be enhanced by an Excel-friendly OLAP database.

Excel-friendly OLAP could force your company to take another look at Excel. That technology helps to eliminate many of the classic objections to using Excel for business performance management.

Introducing OLAP

Excel-friendly OLAP products cure many of the problems that both users and IT managers have with Excel. But before I explain why this is so, I should explain what OLAP is, and how it can be Excel-friendly.

Although OLAP technology has been available for years, it’s still quite obscure. One reason is that “OLAP” is an acronym for four words that are remarkably devoid of meaning: On-Line Analytical Processing.

OLAP databases are more easily understood when they’re compared with relational databases. Both “OLAP” and “relational” are names for a type of database technology. Oversimplified, relational databases contain lists of stuff; OLAP databases contain cubes of stuff.

For example, you could keep your accounting general ledger data in a simple cube with three dimensions: Account, Division, and Month. At the intersection of any particular account, division, and month you would find one number. By convention, a positive number would be a debit and a negative number would be a credit.
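A three-dimensional general ledger cube of this kind can be sketched as a dictionary keyed by (account, division, month). The account names, amounts and the balance helper below are hypothetical; the sign convention follows the one just described, with debits positive and credits negative.

```python
# Hypothetical ledger cells keyed by (account, division, month).
ledger = {
    ("Cash", "East", "Jan"): 500.0,      # debit
    ("Revenue", "East", "Jan"): -500.0,  # credit
    ("Cash", "West", "Jan"): 200.0,      # debit
    ("Revenue", "West", "Jan"): -200.0,  # credit
}


def balance(ledger, account=None, division=None, month=None):
    """Sum every cell that matches the members given; None matches anything."""
    wanted = (account, division, month)
    return sum(
        value
        for key, value in ledger.items()
        if all(w is None or w == k for w, k in zip(wanted, key))
    )


cash_total = balance(ledger, account="Cash")
east_jan_revenue = balance(ledger, account="Revenue", division="East", month="Jan")
overall = balance(ledger)
```

Naming all three members returns the one number at that intersection, and summing the whole cube gives zero when debits and credits balance.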

Most cubes have more than three dimensions. And they typically contain a wide variety of business data, not merely General Ledger data. OLAP cubes also could contain monthly headcounts, currency exchange rates, daily sales detail, budgets, forecasts, hourly production data, the quarterly financials of your publicly traded competitors, and so on.

You can define any consolidation hierarchy for any of a cube’s dimensions. For example, in the Month dimension every month could roll up into quarters, which could roll up into years. Months also could roll up into year-to-date categories. Users treat both the “leaf” members and the consolidated members as equivalent sources of data. To illustrate, users could choose data from a leaf member like Aug-2006 just as easily as they could choose from a consolidated member like Aug-2006-YTD.
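The idea that one month can roll up into several consolidations at once can be sketched as follows. The member naming scheme (for example Q3-2006) and the consolidations and value helpers are assumptions made for illustration; only the Aug-2006 and Aug-2006-YTD labels come from the text.

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def consolidations(year):
    """Map each consolidated member to the leaf months it adds up."""
    labels = [f"{month}-{year}" for month in MONTHS]
    members = {}
    for i, label in enumerate(labels):
        members[f"{label}-YTD"] = labels[: i + 1]
    for q in range(4):
        members[f"Q{q + 1}-{year}"] = labels[3 * q: 3 * q + 3]
    members[str(year)] = labels
    return members


def value(member, leaves, members):
    """Read a leaf or a consolidated member through the same call."""
    if member in members:
        return sum(leaves.get(leaf, 0) for leaf in members[member])
    return leaves.get(member, 0)


members = consolidations(2006)
leaves = {f"{month}-2006": 1 for month in MONTHS}  # hypothetical data

aug = value("Aug-2006", leaves, members)
aug_ytd = value("Aug-2006-YTD", leaves, members)
q3 = value("Q3-2006", leaves, members)
```

Because value hides the distinction, a report can ask for Aug-2006 or Aug-2006-YTD in exactly the same way, which is what treating leaf and consolidated members as equivalent means in practice.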

Other dimensions typically have their own roll-up structures. An Account dimension could roll up accounts into traditional financial statement hierarchies. A Division dimension could roll up divisions into the corporate reporting hierarchy. And a Product dimension could roll up products into one or more product structures.

Excel-Friendly OLAP

You probably could find at least 50 OLAP products on the market. But most of them lack a key characteristic: spreadsheet
functions.

Excel-friendly OLAP products offer a wide variety of spreadsheet functions that read data from cubes into Excel. Most such products also offer spreadsheet functions that can write to the OLAP database from Excel…with full security, of course.

Read-write security typically can be defined down to the cell level by user. Therefore, only certain analysts can write to a forecast cube. A department manager can read only the salaries of people who report to him. And the OLAP administrator must use a special password to update the General Ledger cube.

Other OLAP products push data into Excel; Excel-friendly OLAPs pull data into Excel. To an Excel user, the difference between push and pull is significant.

Using the push technology, users typically must interact with their OLAP product’s user interface to choose data and then write it as a block of numbers to Excel. If a report relies on five different views of data, users must do this five times. Worse, the data typically isn’t written where it’s needed within the body of the report. Instead, the data merely is parked in the spreadsheet for use somewhere else.

Using the pull technology, spreadsheet users can write formulas that pull the data from any number of cells in any number of