DATA WAREHOUSE /DATA MINING ROAD MAP

fantasicgilamonster, Data Management

20 Nov 2013




1. WHAT DOES THE USER WANT?

Does the organization need a data warehouse/data mart? (Why a warehouse?)

What are the business objectives? In government these may include fulfilling
social sector projects, resource mobilization, financial issues, target-based
projects, how the resources are utilized, etc.

Scope of the DW? This is very important. Not every system needs a DW.

Where will the data come from? From Blocks, Districts, States, PSUs, Markets,
etc.

Perform a cost/benefit analysis. A DW is an expensive proposition; it needs
careful analysis and justification.

Prepare the project estimate. How much time will it take to establish a DW
system: project approval, release of funds, execution, and cost of the project.

Perform a risk assessment. What are the positive and negative aspects of the
project? Is the rate of return justified?
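The cost/benefit and rate-of-return questions above come down to simple arithmetic. The following is a minimal sketch in Python, assuming a plain ROI formula, (benefit - cost) / cost, and a simple payback period. The function names and all figures are hypothetical and serve only as illustration; a real analysis would also consider discounting and risk.

```python
def simple_roi(total_benefit: float, total_cost: float) -> float:
    """Return ROI as a fraction: (benefit - cost) / cost."""
    if total_cost <= 0:
        raise ValueError("total_cost must be positive")
    return (total_benefit - total_cost) / total_cost


def payback_period_years(initial_cost: float, annual_net_benefit: float) -> float:
    """Years needed for the yearly net benefit to cover the initial cost."""
    if annual_net_benefit <= 0:
        raise ValueError("annual_net_benefit must be positive")
    return initial_cost / annual_net_benefit


# Hypothetical figures for illustration only (not taken from the roadmap).
initial_cost = 500_000.0        # hardware, licences, build effort
annual_running_cost = 50_000.0  # operations and staff training
annual_benefit = 250_000.0      # estimated yearly savings or added value
years = 5

total_cost = initial_cost + annual_running_cost * years
total_benefit = annual_benefit * years
roi = simple_roi(total_benefit, total_cost)
payback = payback_period_years(initial_cost, annual_benefit - annual_running_cost)

print(f"ROI over {years} years: {roi:.1%}, payback: {payback:.1f} years")
```

With these assumed numbers the project recovers its initial cost in 2.5 years; changing any input shows how sensitive the justification is to the estimates.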

2. DETERMINE DBMS SERVER PLATFORM

Which database servers do you already have? Are they good enough to handle a
large number of databases and the proposed DWs/Data Marts?

Determine cost, interoperability and staff training considerations.

Determine the DBMS server platform based on Return on Investment (ROI).


3. DETERMINE HW PLATFORM

Where should the data warehouse or data mart be housed? At the center and/or
state level?

Which hardware platforms do you already have? Is it sufficient?

Determine cost, interoperability and staff training considerations.

Determine hardware based on Return on Investment (ROI).


4. INFORMATION AND DATA MODELLING

Building an Information Model

How do I use a data model to express relationships between the data? There are
hundreds of departments, which means thousands of OLTP applications.

Resolve access and usage issues.

Determine the logical and physical design of the data warehouse or data mart.


5. CONSTRUCT THE META DATA REPOSITORY

How do I keep track of what means what, who can access it, and how and when it
will be accessed? Who can access it, fully or partially?

Build a repository.

Determine the business user's view of the metadata repository.
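As an illustration of "what means what" and "who can access it, fully or partially", here is a minimal in-memory sketch in Python. The class names, fields, role names and the sample entry are all hypothetical; a production repository would normally live in a database or a dedicated metadata tool.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class Access(Enum):
    NONE = 0
    PARTIAL = 1   # e.g. aggregated or masked view only
    FULL = 2


@dataclass
class MetadataEntry:
    name: str               # technical table or column name
    business_meaning: str   # "what means what"
    source_system: str      # where the data comes from
    refresh_schedule: str   # when it is loaded
    access_by_role: Dict[str, Access] = field(default_factory=dict)


class MetadataRepository:
    def __init__(self) -> None:
        self._entries: Dict[str, MetadataEntry] = {}

    def register(self, entry: MetadataEntry) -> None:
        self._entries[entry.name] = entry

    def describe(self, name: str) -> str:
        return self._entries[name].business_meaning

    def access_level(self, name: str, role: str) -> Access:
        # Roles that are not listed get no access by default.
        return self._entries[name].access_by_role.get(role, Access.NONE)


# Hypothetical entry for illustration.
repo = MetadataRepository()
repo.register(MetadataEntry(
    name="dist_rev_amt",
    business_meaning="Revenue collected per district",
    source_system="district accounting OLTP",
    refresh_schedule="nightly",
    access_by_role={"state_analyst": Access.FULL,
                    "district_officer": Access.PARTIAL},
))

print(repo.describe("dist_rev_amt"))
print(repo.access_level("dist_rev_amt", "district_officer"))
```

The business user's view is then just the subset of fields (meaning, refresh schedule) that the repository chooses to expose to non-technical roles.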



6. DATA ACQUISITION AND CLEANSING

How do I:

Extract data from multiple sources across multiple database/OS/HW platforms.

Cleanse the data to an acceptable level of information (add, delete, etc.).

Scrub

Reconcile

Aggregate

Summarize
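The steps above can be sketched on a tiny scale in plain Python. This is only an illustration under stated assumptions: the two source lists, the field names and the rejection rule (drop rows whose amount is not a number) are hypothetical, and real cleansing rules depend on the agreed acceptable level of information.

```python
from collections import defaultdict

# Hypothetical rows as they might arrive from two source systems.
source_a = [
    {"district": " north ", "amount": "1200"},
    {"district": "North", "amount": "300"},
    {"district": "South", "amount": ""},       # missing amount
]
source_b = [
    {"district": "SOUTH", "amount": "450"},
    {"district": "North", "amount": "abc"},    # not a number
]


def cleanse(row):
    """Return a cleaned row, or None if the row is not acceptable."""
    district = row["district"].strip().title()  # scrub whitespace and case
    try:
        amount = float(row["amount"])
    except ValueError:
        return None                             # reject unusable amounts
    return {"district": district, "amount": amount}


def aggregate(rows):
    """Sum amounts per district."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["district"]] += row["amount"]
    return dict(totals)


# Reconcile: after scrubbing, both sources use the same district spelling,
# so their rows can be combined before aggregation.
cleaned = [c for c in map(cleanse, source_a + source_b) if c is not None]
totals = aggregate(cleaned)

print(totals)
```

Two of the five rows are rejected here; in practice rejected rows are usually logged for review rather than silently dropped.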

7. DATA TRANSFORMATION, TRANSPORTATION & POPULATION

How do I build the data warehouse/data mart?



Transform the data.



Transport the data.



Populate the data warehouse or data mart.
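A minimal sketch of the transform, transport and populate steps, using Python's standard sqlite3 module with an in-memory database as a stand-in for the warehouse. The table name fact_revenue, its columns and the sample rows are hypothetical; "transport" is reduced here to passing rows in memory, whereas a real system would move them across a network or through files.

```python
import sqlite3

# Hypothetical source rows, assumed to be already extracted and cleansed.
source_rows = [
    ("2013-11-01", "North", 1200.0),
    ("2013-11-01", "South", 450.0),
    ("2013-11-02", "North", 300.0),
]


def transform(row):
    """Reshape a source row into the warehouse layout (date split into parts)."""
    date_text, district, amount = row
    year, month, _day = (int(part) for part in date_text.split("-"))
    return (year, month, district, amount)


def populate(conn, rows):
    """Create the target table if needed and load the transformed rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_revenue ("
        "year INTEGER, month INTEGER, district TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO fact_revenue VALUES (?, ?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
populate(conn, [transform(r) for r in source_rows])

row_count, total = conn.execute(
    "SELECT COUNT(*), SUM(amount) FROM fact_revenue"
).fetchone()
print(row_count, total)
```

Checking the row count and a control total after each load is a cheap way to confirm that nothing was lost in transport.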

8. DETERMINE MIDDLEWARE CONNECTIVITY

How do I connect the source data to the target data warehouse or data mart?



Ongoing connection



Direct data access

9. PROTOTYPING, QUERYING & REPORTING

How do I:



Implement a prototype with user involvement?



Develop applications?



Use query and reporting tools?

10. DATA MINING

How do I find patterns in the data?



How can those patterns be used for revenue growth?



How can tools be used to further identify patterns in the data warehouse or data
mart?
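One simple way to look for patterns is to count which items occur together often. The sketch below is an assumption-laden illustration, not a method prescribed by the roadmap: the transactions, the item names and the 50% support threshold are all hypothetical, and dedicated data mining tools offer far richer techniques.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: each is the set of items recorded together.
transactions = [
    {"seed", "fertilizer", "pesticide"},
    {"seed", "fertilizer"},
    {"seed", "pesticide"},
    {"fertilizer", "pesticide"},
    {"seed", "fertilizer"},
]


def frequent_pairs(transactions, min_support):
    """Return {pair: support} for item pairs that appear together in at
    least min_support (a fraction) of the transactions."""
    counts = Counter()
    for items in transactions:
        # Sorting makes each pair a stable, order-independent key.
        for pair in combinations(sorted(items), 2):
            counts[pair] += 1
    n = len(transactions)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}


pairs = frequent_pairs(transactions, min_support=0.5)
print(pairs)
```

A pair with high support is a candidate pattern; whether it can drive revenue growth still needs a business interpretation.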

11. OLAP

How can users analyze the data?

How can the necessary number of dimensions be determined?



How can users see the data represented in multiple dimensions?



How can they use the OLAP tools?
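To make "multiple dimensions" concrete, the following Python sketch rolls a small fact set up along any chosen combination of dimensions. The dimension names (year, district, scheme) and the figures are hypothetical; OLAP tools perform the same kind of grouping interactively and at much larger scale.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, district, scheme, amount).
DIMENSIONS = ("year", "district", "scheme")
facts = [
    (2012, "North", "Health", 100.0),
    (2012, "North", "Education", 50.0),
    (2012, "South", "Health", 70.0),
    (2013, "North", "Health", 120.0),
    (2013, "South", "Education", 30.0),
]


def roll_up(facts, by):
    """Sum the amount grouped by the chosen dimensions."""
    idx = [DIMENSIONS.index(d) for d in by]
    totals = defaultdict(float)
    for row in facts:
        key = tuple(row[i] for i in idx)
        totals[key] += row[-1]   # the amount is the last field
    return dict(totals)


by_year = roll_up(facts, by=("year",))
by_year_district = roll_up(facts, by=("year", "district"))
grand_total = roll_up(facts, by=())

print(by_year)
print(by_year_district)
print(grand_total)
```

Each extra dimension multiplies the number of possible views, which is one practical reason to limit the dimensions to those users actually analyze by.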


12. DEPLOYMENT & SYSTEM MANAGEMENT

How can I provide security, backup, recovery and the necessary capabilities for a
production data warehouse or data mart?



How can resources be allocated best?



How can the growth of the data be accommodated?



How can everything be kept running smoothly?



2.12 Data Warehouse "Roadmap" and Roll out Strategy


The following table provides the framework and the general "roadmap" for the
rollout of the Enterprise Data Warehouse.


Data Warehouse Rollout Map & Related Activities

Warehouse Task: Create Project Plan
Activities: Develop new project plan for data warehouse increment(s)
Objectives: Gain consensus as to project tasks and schedule
Implementation Details: Develop Plan and Schedule

Warehouse Task: Agree to Scope from Technical Requirements
Activities: Identify and document scope data and functionality
Objectives: Gain consensus on specific deliverables (data, functionality,
  structures) of enterprise project
Implementation Details: Obtain Business Community Agreement

Warehouse Task: Data Acquisition
Activities: Identify, evaluate, and design elements pertinent to data
  acquisition
Objectives: Document and agree on design components (design standards, data
  source mapping, ETT, load, refresh and purge modules, as well as data mart
  design)
Implementation Details: Develop the Data Acquisition Plan, Development
  Standards and Component Implementation Strategies

Warehouse Task: Data Quality
Activities: Assess Data Quality, Develop Design & Build Standards
Objectives: Define objectives and requirements for data (cleansing, error
  handling, audit & control), assess quality of source data, identify data
  management procedures, finalize tool selection
Implementation Details: Develop Data Quality Components Design and Test Plans

Warehouse Task: Data Warehouse Technical Architecture
Activities: Identify, plan, and design capacity, hardware, and software
  components and test criteria
Objectives: Ensure adequate capacity, identify process flow of data, plan
  integration of modules, data, access, and meta data
Implementation Details: Plan Implementation and Test Environments for
  Warehouse Architecture Components

Warehouse Task: Warehouse Administration
Activities: Develop, Design and Build Standards for Warehouse Administration
Objectives: Design and plan version control, data archiving, scheduling,
  usage, data governing, backup, restoration, query profiles, and security
Implementation Details: Plan for Development and Test Environments for
  Warehouse Administration Components

Warehouse Task: Meta data Management
Activities: Identify and document Meta data Requirements, as well as design
  and build standards
Objectives: Collect meta data needed by technical and user communities,
  identify required meta data tools, and develop standards to which meta data
  should adhere
Implementation Details: Develop a Meta data Management Plan to manage and
  monitor meta data, Develop Design and Build Standards (or identify
  appropriate tool)

Warehouse Task: Data Access
Activities: Determine Access Requirements and develop design and build
  standards
Objectives: Collect specific access requirements to support analysis
  capabilities, data manipulation functionality, and user interface. Develop,
  design and build standards to support query and reporting and query
  criteria, user security, and confirmation of data availability
Implementation Details: Develop Access Plan and Determine access tools.
  Develop Standards and Confirm Data Availability

Warehouse Task: Design and Build
Activities: Design and Build Development Standards and Modules for all
  Components of Data Warehouse Solution
Objectives:
  - Design, build, load and test Data Acquisition Modules (ETT, load, refresh,
    purge, data mart)
  - Design, develop, and generate or implement Data Quality Modules
    (cleansing, error handling, audit & control)
  - Design, build, load and test Architecture Components (Multidimensional,
    Test and production Databases)
  - Design, build, load and test Warehouse Administration Modules (versioning,
    scheduling, backup, restoration, performance, security)
  - Build or Implement Meta data Modules
  - Design, build, load and test Data Access Components (reports and query
    criteria, user security, user access and style specifications)
Implementation Details: Design, Build, Populate and Test:
  - Data Acquisition Modules
  - Data Quality Modules
  - Databases and other Schemas
  - Warehouse Administration Modules
  - Meta data Modules
  - Data Access Modules

Warehouse Task: Documentation
Activities: Define and Produce Documentation Standards, Procedures and
  Environment
Objectives: Specify and develop documentation deliverables (User, Technical,
  Operational, Reference)
Implementation Details: Develop Documentation Requirements, Standards, and
  Delivery Strategy and Produce Final Documentation

Warehouse Task: Testing
Activities: Define testing strategies, develop test procedures and perform
  testing specific to the scope of the Data Warehouse solution
Objectives: Design, develop and implement testing plans and strategies for
  ETT, Performance, Interfaces, Integration, Volume, Query Profiles and other
  Components
Implementation Details: Develop and Implement Testing Strategy, Plan, Models
  and Integration requirements

Warehouse Task: Training
Activities: Define and develop increment training requirements and plans
Objectives: Identify and document training requirements for technical and end
  user staff, identify specific roles who should receive training, and create
  training databases
Implementation Details: Develop Training Strategy, Requirements, and Class
  Material. Create Training Databases

Warehouse Task: Installation
Activities: Define and develop installation plan
Objectives: Develop an installation plan to support the production, test and
  other maintenance environments for the data warehouse solution
Implementation Details: Develop Sequential or repeatable (Step by Step)
  Installation Plan

Warehouse Task: Transition
Activities: Define transition strategy to production environment
Objectives: Identify how the transition to production will occur including
  planning for data acquisition, preparation of production database(s),
  developer preparations and other cut-over issues
Implementation Details: Develop Cut-Over Plan, Implement Maintenance,
  Production and Regression Environments

Warehouse Task: Production Support
Activities: Measure and support the production systems
Objectives: Evaluate and audit the system for performance, faults, use,
  growth, recovery and tuning issues
Implementation Details: Develop Library for production support metrics,
  corrections, enhancements, results

Warehouse Task: End Phase
Activities: Prepare for and complete phase end activities
Objectives: Secure acceptance of phase end deliverables, release resources,
  assess and audit deliverables
Implementation Details: Prepare Phase End Report and perform Quality
  Assessment activities

Warehouse Task: Post Implementation Support
Activities: Evaluate implemented increment, and non-implemented requirements,
  data warehouse architecture and plans
Objectives: Assess responsiveness of solution to stated need, assess
  performance of data warehouse, identify next increment
Implementation Details: Document Evaluation of Data Warehouse Solution and
  Architecture, Identify Next Increment Opportunities, Assess Performance of
  Project to Plan

Warehouse Task: End DW Increment
Activities: Prepare for and complete final project activities
Objectives: Secure final acceptance of project deliverables
Implementation Details: Conduct Discovery Meetings with Business Community

NOTE: Input for the above was taken from the Internet.