Leveraging Data Warehousing Assets in Enterprise Directory Design

Brendan Bellina

Identity Services Architect

University of Southern California

bbellina@usc.edu

http://isd.usc.edu/~bbellina


October 21, 2005

Data Warehousing Experience

- Corporate Data Warehouse Architect, Jayco Enterprises, 1997-1999
  - Certified Data Warehousing, Atre Institute
  - IBM AS/400, MS SQL Server
  - Cognos Data Mining Tools (Impromptu, PowerPlay) used to construct Sales Data Mart
- Information Engineering Manager, University of Notre Dame, 1999-2001
  - Metadata Repository
  - Logical Data Modeling
  - Business Objects Universe Designer
  - University Data Warehouse architecture
  - Admissions Data Mart design

Directory Services Experience

- Enterprise Directory Architect, University of Notre Dame, 2000-2005
  - Architect of Enterprise Directory Service
  - Editor, NMI/Internet2 Metadirectories White Paper
  - Author, NMI/Internet2 Local Domain Person Survey Results
  - Author, NMI component: Look (LDAP Operational ORCA Collector)
  - Active participant in Internet2 MACE-Dir working group
  - Directory Consulting: George Mason University, Pittsburg State Kansas, USC

Directory Services Experience

- Identity Services Architect, University of Southern California, 2005-present
  - Global Directory Service implementation (includes Person Registry, LDAP service)
  - Metadirectory process implementation
  - Role on many campus committees related to IdM policies, direction, GDS operations
  - Author, NMI/Internet2 Higher Education Person document (currently in draft)
  - Speaker on directory services at AACRAO, EDUCAUSE regional, EDUCAUSE national, Internet2 member meeting, Internet2 CAMP, CUMREC (for access to presentations go to http://isd.usc.edu/~bbellina)

Typical Asset Development Path

Data Warehouse Designer

Business Application Programmer -> Business Application Analyst -> Data Base Administrator -> Data Warehouse Designer

Often the designer’s educational background is not in computer science or engineering.

Typical Asset Development Path

Enterprise Directory Architect

“OS Geek” / “scripter” -> Systems Programmer -> System Administrator -> Email administrator and/or LDAP administrator -> Enterprise Directory Architect

Usually the architect’s background is computer science or engineering.


Asset Stereotyping

DW Architect
- Thorough Perfectionist
- Wants to understand and accommodate Customer Requirements
- Excited by “meaning” and “relations”
- Tendency toward extensive prep

Directory Architect
- Creative Implementer
- Wants to contribute and rely upon open-source and developing standards
- Excited by “blinking lights”
- Tendency toward “Q&D” (i.e., Quick and Dirty, not Quality and Documentation)

NO COMMUNICATION

Technological Similarities

- Process, not project
- Long ROI: DW 18-36 months, DS 9-24 months
- Long-term investments requiring continual attention, funding, care and feeding
- DW design goes stale due to changing sources, changing reporting needs, and changing performance characteristics
- EDS design goes stale for the same reasons as DW, and also due to changes in external standards and privacy laws

Architectural Similarities

Typical Data Warehouse (diagram): Sources, ETL, Staging Area, Data Warehouse, Metadata, and Data Marts, feeding Reporting Tools.

Typical Enterprise Directory (diagram): Systems of Record (SOR), Metadirectory Process, Registry, Enterprise Directory, AuthN (Kerb), and AuthZ (LDAP), accessed by Applications over the LDAP Protocol.

Semantics and Methods

Data Collection
  DW: Multiple internal and external Data Sources
  EDS: Multiple internal Systems of Record

Data Migration
  DW: ETL Tools
  EDS: Metadirectory scripts

Phased Implementation
  DW: Content-specific “data marts”
  EDS: Organizational Units (People, Groups, Courses, etc.)

Data Restrictions
  DW: Views and db permissions
  EDS: LDAP Access Controls

Data Access
  DW: Designed for high-volume read, low-volume write. Reporting Tools, Data Marts
  EDS: Designed for high-volume read, low-volume write. Applications, End-users, Application/NOS directories
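The Data Restrictions row can be illustrated with a minimal sketch. It assumes a hypothetical `person` table and a hypothetical per-requester allow-list (`READABLE_ATTRS`); real directory servers express access controls in their own, vendor-specific configuration, and SQLite has no GRANT statement, so only the view half of "views and db permissions" is shown.

```python
import sqlite3

# Warehouse side: a view exposes only the columns a reporting role may see.
# Table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, dept TEXT, ssn TEXT);
    INSERT INTO person VALUES (1, 'Pat Lee', 'Admissions', '000-00-0001');
    CREATE VIEW person_public AS SELECT id, name, dept FROM person;
""")
cursor = conn.execute("SELECT * FROM person_public")
view_columns = [col[0] for col in cursor.description]

# Directory side: a stand-in for access controls. Each requester may read
# only the attributes on its allow-list (hypothetical policy, not real ACL syntax).
READABLE_ATTRS = {
    "anonymous": {"cn", "mail"},
    "hr-service": {"cn", "mail", "employeeNumber"},
}

def visible_attributes(entry, requester):
    """Return only the attributes of `entry` that `requester` may read."""
    allowed = READABLE_ATTRS.get(requester, set())
    return {attr: values for attr, values in entry.items() if attr in allowed}

entry = {"cn": ["Pat Lee"], "mail": ["pat@example.edu"], "employeeNumber": ["1001"]}
```

In both cases the restriction lives in the data service rather than in each consuming application: the view hides the `ssn` column, and the allow-list hides `employeeNumber` from anonymous readers.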

Design Process Comparison

Executive Support
  DW and EDS: Both require complete executive support to ensure that departments are motivated to establish standards, provide data specialists, and contribute data.

Data Interviews
  DW: Define scope; understand business policies, data entry, data ownership, and meaning.
  EDS: Same. Scope may pertain to initial applications, organizational unit type (people, groups, courses, etc.)

Data Modeling
  DW: Logical Data Model, denormalization, aggregation (e.g., monthly totals).
  EDS: Leverage standard object classes, create auxiliary classes, aggregation (e.g., PostalAddress), combining related attributes.

Schema Design
  DW: Star Schema, Snowflake, Dimensional
  EDS: DIT structure (flat or tall, fat or thin)
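The two aggregation examples in the Data Modeling row can be sketched side by side. This is an illustration only: the function names and sample inputs are assumptions, and the directory half follows the Postal Address syntax of RFC 4517, where address lines are joined with "$" and literal "\" and "$" characters are hex-escaped.

```python
from collections import defaultdict

def monthly_totals(daily_sales):
    """Warehouse-style aggregation: roll daily amounts up to monthly totals.

    daily_sales is an iterable of (ISO date 'YYYY-MM-DD', amount) pairs.
    """
    totals = defaultdict(float)
    for day, amount in daily_sales:
        totals[day[:7]] += amount  # 'YYYY-MM' identifies the month
    return dict(totals)

def postal_address(lines):
    """Directory-style aggregation: combine address lines into one
    postalAddress value (RFC 4517 Postal Address syntax)."""
    def escape(line):
        # Escape the backslash first so the '$' escape is not double-escaped.
        return line.replace("\\", "\\5C").replace("$", "\\24")
    return "$".join(escape(line) for line in lines)
```

For example, `postal_address(["PO Box 1000000", "Anytown, CA 12345", "USA"])` yields a single attribute value instead of three separate, locally invented attributes.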

Design Process Comparison

Indexing
  DW: Based on anticipated reports.
  EDS: Based on standards and anticipated application needs.

Data Access Protection
  DW: Views and permissions. Special database accounts.
  EDS: Access Controls. Service accounts for specialized access.

Data Extraction
  DW: Usually batched from Sources or ODS. ETL tools.
  EDS: May be batched or near real-time via triggers. Metadirectory scripts.

Data Transformation
  DW: Cleansing (75% of time), aggregation, splitting fields. Intended to create information useful for reporting. ETL Tools.
  EDS: Same issues. In addition, accommodate standard attributes, object classes, syntaxes, and domains. Metadirectory scripts.

Data Loading
  DW: ETL Tools.
  EDS: Metadirectory scripts.
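A metadirectory script of the kind named in the transformation and loading rows might look like the sketch below. It is not the author's tooling: the system-of-record field names (`login`, `first_name`, `last_name`, `email`) and the base DN are hypothetical, and the LDIF writer assumes plain ASCII values that need no base64 encoding or DN escaping.

```python
def transform(sor_record):
    """Cleanse a hypothetical system-of-record row and map it onto
    standard person attributes (uid, givenName, sn, cn, mail)."""
    given = sor_record["first_name"].strip().title()
    family = sor_record["last_name"].strip().title()
    return {
        "uid": sor_record["login"].strip().lower(),
        "givenName": given,
        "sn": family,
        "cn": f"{given} {family}",  # combine related fields into one attribute
        "mail": sor_record["email"].strip().lower(),
    }

def to_ldif(attrs, base_dn="ou=People,dc=example,dc=edu"):
    """Render one entry as LDIF text for loading into the directory.

    Assumes simple ASCII values; the base DN is a placeholder.
    """
    lines = [
        f"dn: uid={attrs['uid']},{base_dn}",
        "objectClass: top",
        "objectClass: person",
        "objectClass: organizationalPerson",
        "objectClass: inetOrgPerson",
    ]
    lines += [f"{name}: {value}" for name, value in attrs.items()]
    return "\n".join(lines) + "\n"
```

The cleansing step (trimming whitespace, normalizing case) is the same work an ETL tool does; the directory-specific part is mapping onto standard object classes and attribute names rather than inventing local ones.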

Design Process Comparison

Operational Planning (done in conjunction with security, network, and system admins)
  DW and EDS: System security, redundancy, recovery, monitoring, audit.

Data Access Protection
  DW: Views and permissions. Special database accounts.
  EDS: Access Controls. Service accounts for specialized access.

Data Propagation
  DW: Data Marts, Metadata Repository.
  EDS: Dependent directories. NOS directories. AuthN, AuthZ systems.

Performance Monitoring
  DW and EDS: Query response time. Frequency. Changing indexing needs.

Policies
  DW and EDS: Data Requests with involvement of data stewards. Addition and retirement of data sources.

Key Differences

Availability
  DW: Historically not 24x7, but moving toward 24x7 reporting.
  EDS: 24x7x365

Data Currency
  DW: Historically daily or weekly or sometimes monthly depending on reporting requirements.
  EDS: Near real-time. Daily at worst.

Language
  DW: Robust admin language: SQL.
  EDS: LDAP is not robust. Will require admin scripts or tools.

Fundamental Design Principle
  DW: Meet known customer reporting needs. Attributes are cheap.
  EDS: Appreciation for standards to ensure compatibility with future vendor applications. Unique attributes are expensive.

Optimization Strategy
  DW: Optimized for reporting.
  EDS: Optimized for entry retrieval.

External Access
  DW: Internally accessible only (usually)
  EDS: Both internally and externally accessible (usually)
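The Language row can be made concrete with a small sketch. SQL can aggregate inside the query, while an LDAP search only returns matching entries, so any counting is left to an admin script. The table, the sample entries, and the function names are assumptions; the `entries` list stands in for the results of a real LDAP search, and the filter escaping follows RFC 4515.

```python
import sqlite3
from collections import Counter

def escape_filter_value(value):
    """Escape characters that are special in an LDAP search filter (RFC 4515)."""
    replacements = {"\\": r"\5c", "*": r"\2a", "(": r"\28", ")": r"\29", "\x00": r"\00"}
    return "".join(replacements.get(ch, ch) for ch in value)

def person_filter(surname):
    """Build a filter that retrieves person entries by surname."""
    return f"(&(objectClass=person)(sn={escape_filter_value(surname)}))"

# Warehouse side: the aggregation is expressed in SQL itself.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (uid TEXT, affiliation TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [("a", "student"), ("b", "staff"), ("c", "student")],
)
sql_counts = dict(
    conn.execute("SELECT affiliation, COUNT(*) FROM person GROUP BY affiliation")
)

# Directory side: the search returns entries; the script does the counting.
entries = [  # stand-in for LDAP search results
    {"uid": "a", "eduPersonAffiliation": ["student"]},
    {"uid": "b", "eduPersonAffiliation": ["staff"]},
    {"uid": "c", "eduPersonAffiliation": ["student"]},
]
ldap_counts = Counter(
    value for entry in entries for value in entry["eduPersonAffiliation"]
)
```

Both approaches arrive at the same counts, but the directory version needs client-side code for what is a single statement in SQL.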

Summary

- Broadly similar, though specifically dissimilar
- DW Designer/Analyst can bring important skills to a Directory team
  - Data Interviewing Techniques
  - Understanding of need for Data Stewardship Policies
  - Metadata management
- Challenges
  - Scripting and LDAP instead of SQL
  - Metadirectory tools are not as mature as ETL tools
  - Acceptance of standards as primary influence to prevent creation of unique attributes and ensure future compatibility
  - Real-time considerations
  - High availability considerations
  - External visibility of the service

Questions

Contact Information:


Brendan Bellina

bbellina@usc.edu

http://isd.usc.edu/~bbellina

Copyright Brendan Bellina, 2005. This work is the intellectual property of the author. Permission is granted for this material to be shared for non-commercial, educational purposes, provided that this copyright statement appears on the reproduced materials and notice is given that the copying is by permission of the author. To disseminate otherwise or to republish requires written permission from the author.