BigData_and_BusinessIntelligence_EclipseCon2013x


Nov 12, 2013

Actuate Corporation © 2012

Big Data and Business Intelligence

Virgil Dodson


Today’s Agenda and Goals

Introduction to Big Data
Eclipse Survey Results
Independent Survey Results
Introduction to BIRT
Big Data Connections
Live Demo
Questions


Big Data Definition

Big data is a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications.

web logs, RFID, sensors, social networks, Internet text, search indexes, call detail records, astronomy, atmospheric info, genomics, biogeochemical, biological, military surveillance, medical records, photographs, video, large-scale e-commerce

- Wikipedia



IDC 2013 Big Data Predictions

The “Digital Universe” will expand to over 4 zettabytes … Over 50% growth from 2012
The Big Data focus will shift “up the stack”, toward analytics and discovery, and analytic applications
Spending will reach $10 billion in 2013, over $20 billion by 2016

Source: IDC, IDC Predictions 2013 presentation
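
The spending figures on this slide imply a growth rate that is easy to check. The sketch below computes the compound annual growth rate needed to go from $10 billion in 2013 to $20 billion in 2016. Because the slide says "over $20 billion", the result should be read as a lower bound. The function name `implied_cagr` is illustrative and does not come from the IDC material.

```python
def implied_cagr(start_value, end_value, years):
    """Compound annual growth rate that turns start_value into end_value over `years` years."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Figures from the slide: $10 billion (2013) and at least $20 billion (2016).
rate = implied_cagr(10.0, 20.0, 2016 - 2013)
print(f"Implied minimum annual growth: {rate:.1%}")
```

Doubling in three years works out to roughly 26% growth per year.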



Eclipse BIRT Survey - Oct/Nov 2012

Big Data or Little Data - How Do You Display Yours?

The Eclipse Foundation would like to better understand how developers are using Eclipse with big data and reporting projects.

We ran this survey to get the pulse of what technologies were in demand related to Eclipse/BIRT technologies.
Eclipse promoted the survey.
60% of 518 responders claimed to be big data users


Eclipse BIRT Survey - Technology Choices

What big data technologies are you using with Eclipse?

Hadoop               28.5%
Cassandra             7.3%
MongoDB              17.0%
BIRT                 20.6%
Hive                 10.9%
Talend Open Studio    7.3%
Mahout                7.9%
R                    12.1%
None                 40.0%

Note: Responders could choose more than one option


Eclipse BIRT Survey - Other Mentions

Home grown
Jasper
Greenplum
jdt
Netezza
ZEND
StreamBase
hypertable
HBase
CouchDB
torque
Pentaho
OOZIE
Sqoop
IBM InfoSphere Streams
Karmasphere
Bigtop
BerkeleyDB-JE
Next-generation sequencing (BAM)


Eclipse BIRT Survey - Data Visualization

How Important is Data Visualization/Reporting to Your Projects?

Essential              52%
Sometimes important    28%
Occasionally useful    13%
Never needed            7%


Report/Visualization Tools

How do you create and/or use data display tools or libraries in development?

I use open source data reporting/visualization tools                  70.9%
I use commercial data reporting/visualization tools                   20.0%
I use home grown routines or open source libraries to display data    39.4%
My projects don't require reporting or data visualization              7.9%

Note: Responders could choose more than one option


Independent Big Data Survey - Sept/Oct 2012

Goals:
  How many large firms (>$1B) are conducting Big Data projects
  What are such companies doing with their Big Data projects
  What are the expected benefits for those Big Data initiatives
  What are the inhibitors

King Research received 516 surveys
316 completed and 200 partially completed surveys
Completed surveys were the primary source of analysis
32% of those who completed the survey (98 respondents) work at companies with revenue of $1B or more



Independent Big Data Survey - Key Findings

26% of large companies have Big Data projects. 40% have not evaluated Big Data or have evaluated and decided not to proceed. The balance (34%) are either evaluating or planning such initiatives.

“Not enough staff with expertise” and “Expected cost of Big Data initiatives” are the major inhibitors

Major benefits expected from Big Data initiatives are:
  Make better decisions, faster
  Gain competitive advantage
  Improve efficiency
  Improve customer targeting

Major benefits realized from Big Data initiatives are:
  Gain competitive advantage
  Improve customer targeting
  Make better decisions, faster
  Improve efficiency


Independent Big Data Survey - Big Data Usage

Does your organization have a Big Data implementation today?

[Bar chart: $1B+ Revenue vs. Universe of Respondents. Categories: No - Have not evaluated Big Data; No - Evaluated and decided not to proceed; Evaluating; Planning to use in the short term - less than 1 year; Planning to use in the long term - more than 1 year; Yes - We have a Big Data implementation today]

More large companies have implemented Big Data projects (26%) than the universe of companies represented in this survey (19%)
Conversely, far fewer respondents at large companies responded “No” to this question (40% versus the universe of respondents 49%)


Independent Big Data Survey - Big Data Technologies

What Big Data technologies do you plan to use? (eval/planning)

[Bar chart: $1B+ Revenue vs. Universe of Respondents. Categories: Apache Hadoop, Cloudera Hadoop, Apache Hive, Apache HBase, EMC Greenplum HD]

We asked about their planned use of 15 technologies, and the top 5, in descending order of frequency of mention, are displayed above
Other technologies planned for use at $1B+ organizations include: Apache Cassandra, 12%; Hortonworks Hadoop, 12%; Amazon DynamoDB, 9%; Apache CouchDB, 9%; VoltDB, 9%; HyperTable, 6%; 10gen MongoDB, 3%; Datastax Cassandra, 3%


Independent Big Data Survey - Application Types

What are likely to be your Big Data applications? (responses from those who are evaluating or planning Big Data implementations)

[Bar chart of the 14 most frequently indicated applications]

Our survey listed 23 frequently reported Big Data applications. When asked which of these they have evaluated or planned to use, respondents indicated an average of 4.5 apps each.
Shown above are the 14 apps that were most frequently indicated


Independent Big Data Survey - Number of End Users

How many people in your organization will consume information from or use your Big Data applications? (evaluating/planning)

[Bar chart. Categories: 1 - 9 people, 10 - 49 people, 50 - 99 people, 100 - 499 people, 500 or more people]

Clearly companies with revenues of $1B or greater plan to share their Big Data information with large audiences across their companies


Actuate Launches the BIRT Project

AUGUST 2004
Actuate Joins Eclipse Foundation as Strategic Developer and Board Member

Actuate proposed and started BIRT: Business Intelligence and Reporting Tools Project … a top-level Eclipse project
Adds BI and Reporting as Open Source Project

Professional open source
Primary development resources funded by Actuate
Contributions from many sources: IBM, Innovent Solutions and community


A New Generation of Data Visualization Technology

BIRT: Business Intelligence and Reporting Tools

Simplicity that makes simple layouts easy
Power to create very complex layouts

Makes all data-driven content development easy
Modern, web-page design metaphor
Open and standards-based
Flexible with rich programmatic control
Full support for libraries and reuse
Foundation for a range of solutions


BIRT Release History

September 2004       BIRT Project proposal accepted, and project launched
June 2005       1.0  Eclipse Report Designer, Report Engine, Chart Engine
December 2005   2.0  Support for a wide variety of common layouts
June 2006       2.1  Advanced parameters, ability to join data sets, …
June 2007       2.2  Dynamic crosstab support, web services data source, …
June 2008       2.3  JavaScript Debugger, BiDi Support, Charts in Crosstabs, …
June 2009       2.5  Page aggregates, Multiple drill-downs in Charts, …
June 2010       2.6  New charts, more chart control, developer productivity, …
June 2011       3.7  POJO Runtime, Hive/Hadoop, Open Office emitters…
June 2012       4.2  Maven Support, Excel Data Source, Relative Time Periods…

Ground-up initiative: Innovative approach to layout and design
Developed in the open with community feedback at all stages


BIRT Example Key Capabilities

Very Simple to Very Complex Layouts
  Listings, cross-tab, dashboard, pixel-perfect, charts …
  Grouping, advanced aggregations, sub-totals, calculations
  Multi-section and sub-reports
  Conditional sections and logic
  Full programmatic control/scripting
  Embedded images…

Comprehensive Data Access
  SQL databases, Web Services, Flat Files, XML, scripted data sources …
  Multiple data sources in one design…

Output Formats
  HTML, PDF, Excel, Word, PowerPoint…
  Internationalization of labels and text
  Bi-Directional language display

Re-use and Developer Productivity
  Library support for publishing and sharing components
  Leverages common standards (SQL, HTML, JavaScript, Java, XML)
  Cascading Style Sheets
  Built-in debugger…

Interactivity and Linking
  Data driven hyperlinks
  Drill-through charts and graphics…

Multiple Usage and Productivity Aids
  Graphical layout and design
  Query & metadata editors
  Formatting Builder
  Grouping Builder
  Customizable cheat sheets and templates…


Getting to Know BIRT

DEMO


BIRT Design Gallery

Charts and Tables

Listing with Groups and Sub-Totals


BIRT Design Gallery

Crosstabs

Crosstab and Charts


BIRT Design Gallery

Forms

Calendar / Schedule


BIRT Design Gallery

Dashboards

Multi-Language and Bi-Directional


BIRT Chart Gallery


BIRT Chart Gallery


BIRT Chart Gallery



High-Level BIRT Architecture

[Architecture diagram. BIRT Designer: Eclipse Designer, Chart Designer, Eclipse DTP, WTP, … BIRT Engine: Design Engine, Generation Services, Presentation Services, Data Services, Charting Engine. Other labels: Data, XML Design, Document. Output formats: HTML, PDF, Excel, Word, PowerPoint, PostScript, …]


High Level BIRT Architecture
Core BIRT Open Source Products

Design Engine (DE API): Produces XML Report, Templates, and Library Designs
Report Engine (RE API): Runs reports and produces output: PDF, HTML, Doc, XLS, PS, PPT, etc.
Chart Engine (CE API): Consumes Chart EMF model and produces chart output. Supports 14 main types and many sub types. Outputs to PNG, JPG, BMP, SVG, PDF, SWT, and SWING

All engines can be run with or without OSGi

Report Designer
Chart Builder
Example Viewer

Can be run outside of BIRT


BIRT AJAX Based Viewer



BIRT Data Access

BIRT offers many ways to get data

Standard Data Sources
  Flat File (CSV, TSV, SSV, PSV)
  Hive Data Source
  Cassandra Scripted Data Source
  JDBC Textual or Graphical
  Web Service - XPath syntax
  XML - XPath syntax
  XLS/XLSX
Scripted Data Source Written in Java or JavaScript
Open Data Access (ODA) DTP Project
Extensible JDBC Driver Framework

Community Contributions
  GoogleDocs
  XML/A
  Cassandra
  REST
  MongoDB
  Multi-Flat File
  GitHub
  Twitter JSON Search
  Dropbox usage
  YQL
  Google Analytics
  LinkedIn
  Facebook FQL
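
As a rough illustration of the flat-file styles named on this slide (CSV, TSV, SSV, PSV), the sketch below reads delimited text with Python's standard `csv` module. It is not BIRT code. The mapping of SSV to semicolons and PSV to pipes is an assumption about what the abbreviations mean here, and `read_flat_file` and the sample rows are hypothetical.

```python
import csv
import io

# Assumed meaning of each flat-file style abbreviation (not taken from the slides).
DELIMITERS = {"CSV": ",", "TSV": "\t", "SSV": ";", "PSV": "|"}


def read_flat_file(text, style):
    """Parse delimited text into a list of dicts keyed by the header row."""
    delimiter = DELIMITERS[style.upper()]
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return list(reader)


# Made-up sample data in pipe-separated form.
sample = "country|orders\nUSA|112\nFrance|37\n"
rows = read_flat_file(sample, "PSV")
for row in rows:
    print(row["country"], row["orders"])
```

Only the delimiter changes between styles, which is why a single reader parameterized by style is enough for this sketch.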


Live Demo


New MongoDB ODA

DEMO


Connecting to Hadoop


Hive JDBC


HQL Sub Query Example
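
The query shown on this slide is not preserved in the text. As a stand-in, the sketch below shows the general shape of a subquery in the FROM clause, with the inner query given an alias as HiveQL requires. To keep it runnable without a cluster, it executes against an in-memory SQLite database rather than Hive. The `weblogs` table, its columns, and the rows are invented for illustration.

```python
import sqlite3

# Hypothetical web-log table; SQLite stands in for Hive so the example runs anywhere.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE weblogs (host TEXT, status INTEGER, bytes INTEGER)")
conn.executemany(
    "INSERT INTO weblogs VALUES (?, ?, ?)",
    [
        ("a.example.com", 200, 512),
        ("a.example.com", 404, 128),
        ("b.example.com", 200, 2048),
        ("b.example.com", 200, 1024),
    ],
)

# The inner query aggregates per host; the outer query filters on the aggregate.
# The subquery sits in the FROM clause and carries an alias (t).
QUERY = """
SELECT t.host, t.total_bytes
FROM (
    SELECT host, SUM(bytes) AS total_bytes
    FROM weblogs
    WHERE status = 200
    GROUP BY host
) t
WHERE t.total_bytes > 1000
ORDER BY t.host
"""

results = conn.execute(QUERY).fetchall()
print(results)
```

In a BIRT report the same kind of statement would be the text of a JDBC data set pointed at Hive; dialect differences between SQLite and HiveQL still apply to anything beyond this simple form.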


Hive JDBC


get_json_object UDF
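
The query shown on this slide is not preserved in the text. Hive's `get_json_object(json_string, path)` UDF pulls a value out of a JSON string using a `$.field` style path. The sketch below imitates that behaviour in plain Python for simple paths only (dotted keys and `[n]` array indexes). It illustrates the idea and is not Hive's implementation: it returns Python values rather than strings, the helper name `get_json_value` is hypothetical, and the sample record is made up.

```python
import json
import re

# Matches one path step: either .key or [index].
_STEP = re.compile(r"\.([A-Za-z_][A-Za-z0-9_]*)|\[(\d+)\]")


def get_json_value(json_string, path):
    """Return the value at a simple '$.a.b[0]' path, or None if it cannot be resolved."""
    if not path.startswith("$"):
        return None
    try:
        current = json.loads(json_string)
    except (TypeError, ValueError):
        return None  # invalid JSON resolves to None

    pos = 1  # skip the leading '$'
    while pos < len(path):
        match = _STEP.match(path, pos)
        if match is None:
            return None  # unsupported path syntax
        key, index = match.group(1), match.group(2)
        if key is not None:
            if not isinstance(current, dict) or key not in current:
                return None
            current = current[key]
        else:
            i = int(index)
            if not isinstance(current, list) or i >= len(current):
                return None
            current = current[i]
        pos = match.end()
    return current


record = '{"user": {"name": "ada", "langs": ["java", "sql"]}, "ok": true}'
print(get_json_value(record, "$.user.name"))
print(get_json_value(record, "$.user.langs[1]"))
print(get_json_value(record, "$.missing"))
```

In Hive the equivalent call would sit inside a SELECT over a string column holding the JSON, which lets a report treat semi-structured records as ordinary columns.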


Hive JDBC


RegExp Example
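
The query shown on this slide is not preserved in the text. Hive offers regular-expression functions such as `regexp_extract(subject, pattern, index)`, which returns the capture group numbered `index`. The sketch below mirrors that call shape with Python's `re` module to pull fields out of a web-log line. Hive uses Java regular expressions, so syntax can differ in corner cases. The log line, the pattern, and the choice to return `None` when nothing matches are assumptions of this sketch.

```python
import re


def regexp_extract(subject, pattern, index):
    """Return capture group `index` of the first match, or None when there is no match."""
    match = re.search(pattern, subject)
    if match is None:
        return None
    return match.group(index)


# Hypothetical access-log line in common log format.
line = '10.0.0.7 - - [12/Nov/2013:10:15:32 +0000] "GET /reports/sales.rptdesign HTTP/1.1" 200 5120'
LOG_PATTERN = r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+) [^"]*" (\d{3}) (\d+)$'

print(regexp_extract(line, LOG_PATTERN, 1))  # client address
print(regexp_extract(line, LOG_PATTERN, 4))  # requested path
print(regexp_extract(line, LOG_PATTERN, 5))  # HTTP status
```

Extracting one group per call is how a raw log line stored as a single string column can be split into report columns at query time.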


Hive JDBC


HQL Hints example


Hive JDBC


Transform Example
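
The query shown on this slide is not preserved in the text. Hive's `TRANSFORM ... USING` clause streams rows through an external script: by default Hive writes each row to the script's standard input as tab-separated text and reads tab-separated rows back from its standard output. The sketch below is one possible script body in Python. The column layout (`userid`, `unixtime`) and the function name are hypothetical. To stay self-contained it runs on an in-memory list; a deployed script would pass `sys.stdin` instead.

```python
from datetime import datetime, timezone


def transform_rows(lines):
    """Map 'userid<TAB>unixtime' rows to 'userid<TAB>weekday' rows (1 = Monday ... 7 = Sunday)."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank input lines
        userid, unixtime = line.split("\t")
        # Interpret the timestamp in UTC so the result does not depend on the local time zone.
        weekday = datetime.fromtimestamp(int(unixtime), tz=timezone.utc).isoweekday()
        yield f"{userid}\t{weekday}"


# In-memory stand-in for the rows Hive would stream to the script.
sample_input = ["42\t0\n", "7\t259200\n"]
output = list(transform_rows(sample_input))
for row in output:
    print(row)
```

On the Hive side, a script like this is typically registered with `ADD FILE` and invoked with something like `SELECT TRANSFORM (userid, unixtime) USING 'python weekday_mapper.py' AS (userid, weekday) FROM some_table`, where the script and table names are placeholders.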


BIRT Exchange Community Site

Explore
  Search/sort
  Rate, comment
  Forums

Download
  Documentation
  Software
  Examples

Contribute
  BIRT designs, code
  Technical tips
  Contests

Centralized hub for BIRT developers
Access demos, tutorials, tips and techniques, documentation…
Enables developers to be more productive and build applications faster
Marketplace for applications


Plug in to BIRT Spring 2013 Contest

Visit BIRT Exchange for full contest details
Contest runs from March 28, 2013 to April 30, 2013

Plug-In Categories
  Open Data Access (ODA) Drivers
  Output Emitters
  Report Item Extensions
  Chart Extensions

New iPad for Top 3 Plug-Ins!


Big Data and Business Intelligence

Virgil Dodson

vdodson@actuate.com

Questions?