Drupal Reuse - Project Collaboration Space - The Ohio State ...


Feb 5, 2013 (4 years and 7 months ago)


THE OHIO STATE UNIVERSITY


WEBSIG
KMDATA

KmData & Friends

OCIO, ENG, ASC, VetMed, KSA, etc.


Introduction:
Many Isolated Web Groups


Introduction:
Reinventing the Wheel

1891 Patent for Wheel

Implementing a Directory


Duplicated Effort

2-3 weeks * 50 groups

2-3 years of labor


Missing the Target

Divergent Efforts


Introduction:
Many Data Sources, Policies & Tech


Introduction:
Reinventing the Data

A faculty member with profiles/information on at least 8 different sites.


Choose: Stale Information or
Duplicated Effort


8 sites x 1 hr x 20,000 employees = 160,000 hours, or about 80 years of labor (worst case)
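The two labor estimates in this introduction (duplicated directories and duplicated profiles) can be checked with a few lines of arithmetic. This is a minimal sketch: the slides give only the inputs and the rounded results, so the conversion factors (52 weeks per year, about 2,000 working hours per year) and the function names are assumptions.

```python
# Assumptions (not stated on the slides): 52 weeks per year and
# about 2,000 working hours per work-year.
WEEKS_PER_YEAR = 52
HOURS_PER_WORK_YEAR = 2000


def directory_years(weeks_per_group, groups=50):
    """Total labor, in years, if every group builds its own directory."""
    return weeks_per_group * groups / WEEKS_PER_YEAR


def profile_years(sites=8, hours_per_site=1, employees=20000):
    """Worst-case labor, in work-years, to update every profile on every site."""
    return sites * hours_per_site * employees / HOURS_PER_WORK_YEAR


low, high = directory_years(2), directory_years(3)
print(f"Directories: {low:.1f} to {high:.1f} years")
print(f"Profiles: {profile_years():.0f} work-years")
```

Under these assumptions the results round to the figures on the slides: roughly 2-3 years for directories and 80 years for profiles.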


Introduction:
Hard to Answer Questions

Do students who took a course in Java have a better chance of getting a co-op or internship?

Data sources: SIS & Courses, Career Services


What are Professor X’s strengths and weaknesses relative to his department peers (publications, grants, teaching evaluations)?

Data sources: OSU:pro, SIS, HR


Introduction:
Agenda

An information system
to connect the dots.


Tools / Methods for
Collaboration


Reusable Integration to aid adoption.


Introduction to KMData

Vedu Hariths

John Wilkins

kmdata.osu.edu

[Diagram: nodes labeled IR-1 through IR-8]

Our Digital Landscape



decentralized & distributed



interdependencies



size & magnitude



changing & ever-evolving



digital repositories



middleware solutions

Purpose

To create a dynamic, community-sourced way for university customers to discover, access, and analyze enhanced institutional data that was formerly isolated in localized silos across campus. Products of this project include:

Open-source API

Dashboard

Reporting tools





Why?


To provide a more holistic view of our institution’s knowledge assets and enable the aggregation of academic data for analytics, research, and administration.



To build better data services with less effort, higher accuracy
and more context.



To allow us to ask and answer questions about the academic
community like never before.



To share information based on your specific requirements.



Existing Architecture and Features


Server architecture and configuration:


- Web servers (Dev/Prod): Apache + JBoss + Phusion Passenger + Rails on CentOS

- Database servers: PostgreSQL with PAM authentication and UUID on CentOS

- SVN/Trac/Wiki: SVN + Trac on CentOS



Community design and scoping sessions developed:


Core KmData database schema


Core Rails UI/WS platform


Use Cases and user stories



API Documentation Available:

https://kmdata-svn.osumc.edu/wiki/api


KMData Requirements


Use Cases


Department directories on
web sites


Course listings tied to
syllabi


Need centralized data mart
and API to access the
combined information


Data is merged into a core schema capable of fitting different types of elements into a relatively small number of tables

[Diagram: three "User Data" sources]

Open-Source Model


This data belongs to all of OSU's disparate systems.


We need active participation from representatives of these systems to help us
activate KMData and integrate or connect multiple systems with the core.


A Trac site is available to anyone in the OSU community who would like to
participate. The ticketing system component is open to anyone with OSU
credentials while the rest of the site is open to individuals who have expressed
interest in more involved participation.


As we move forward, all code and artifacts produced by the project will be
available via this site to browse as well as download.


http://kmdata.osu.edu/

Partners


Office of Academic Affairs


Health Sciences Library


Knowlton School of Architecture, CoE


Engineering Computing Services, CoE


Department of Physics


University Libraries


Department of Statistics


Department of Sociology


ASC Technology


Office of Research

Get Involved


Communicate this project and concept to your local
leadership



Research/construct questions and ideas around the use
of this data



Volunteer to pick up available development tasks



Commit time, resources or skills to the effort



Contact Vedu Hariths (.1) for more information.

Data Sources -> ETLs -> Target Schemas -> Core Schema -> Web Services -> Adaptor APIs

Data sources: HR/SIS (Oracle), OSU:pro (MS SQL Server), Digital Library (MySQL)

Adaptor APIs: Drupal API, PHP API, Ruby API, Java API, .NET API

KMData Architecture

[Diagram. Legend: direction of data flow; direction of API calls; focus of today’s presentation; focus of DrupalCamp Ohio presentation]

External data silos (various sources)

Pentaho Data Integration (Kettle) ETLs, scheduled nightly with crontab

KMData database (PostgreSQL 9.0 on CentOS): HR/SIS schema, OSU:pro schema, Digital Library schema, KMData core schema

Merge stored procedures (PL/pgSQL)

KMData application: web service API written in Ruby on Rails (WS methods)

Drupal API

Data Source, ETL


Various data silos exist across the organization

Database sources include Oracle, MS SQL Server, MySQL, Postgres, and others

The KMData database is currently PostgreSQL 9.0 running on CentOS (to be upgraded soon)

The Kettle ETL tool is the common means of loading data and transforming it when necessary


Transformations run nightly

Currently, only limited information is updated on a nightly basis

Around the next upgrade, more data updates will become automated

Target Schemas


Data is cached in individual schemas, each representing one of the data silos from the earlier steps

Individual Kettle ETLs perform most of the data transformation before the data is stored

Target schemas make it possible to trace data back to its source system within the same database as the core schema

Privileges negotiated with source systems are honored in KMData (items marked private are excluded)

No binary data is stored
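The target-schema rules above (tag rows with their source, exclude private items, store no binary data) can be sketched as a small filter. This is only an illustration: the real work is done by Kettle ETLs, and the function name, the `private` flag, and the row layout here are hypothetical.

```python
def load_target_rows(source_name, rows):
    """Keep only non-private rows, drop binary fields, and tag each row with its source."""
    loaded = []
    for row in rows:
        if row.get("private"):
            continue  # items marked private in the source system are excluded
        cleaned = {
            key: value
            for key, value in row.items()
            if key != "private" and not isinstance(value, (bytes, bytearray))
        }
        cleaned["source_system"] = source_name  # allows tracing back to the source
        loaded.append(cleaned)
    return loaded


# Hypothetical rows from one data silo
osupro_rows = [
    {"id": 1, "title": "Public narrative", "private": False},
    {"id": 2, "title": "Hidden narrative", "private": True},
    {"id": 3, "title": "With photo", "private": False, "photo": b"\x89PNG"},
]
target = load_target_rows("OSU:pro", osupro_rows)
print([row["id"] for row in target])
```

Treating "no binary data" as "drop binary fields but keep the row" is an assumption; the slides do not say how such fields are handled.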




Merge Procedures


Data is transferred from the ETL target schemas into the core KMData schema

This logic is implemented in PL/pgSQL stored procedures, which are run from within Kettle jobs

Many internal stored procedures exist to make data comply with the generic KMData table definitions
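The general shape of a merge step (update rows that already exist in the core table, insert the rest) can be sketched as follows. This is not the project's PL/pgSQL: it uses Python's built-in SQLite so it runs anywhere, and every table and column name is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical target-schema table (loaded by an ETL) and core table
CREATE TABLE osupro_people (person_id TEXT PRIMARY KEY, full_name TEXT);
CREATE TABLE core_users (person_id TEXT PRIMARY KEY, display_name TEXT, source TEXT);
INSERT INTO osupro_people VALUES ('100', 'Ada Example'), ('200', 'Bo Sample');
INSERT INTO core_users VALUES ('100', 'A. Example', 'OSU:pro');
""")


def merge_people(conn):
    """Copy target-schema rows into the core table: update matches, insert the rest."""
    with conn:  # one transaction, committed on success
        conn.execute("""
            UPDATE core_users
               SET display_name = (SELECT p.full_name FROM osupro_people p
                                    WHERE p.person_id = core_users.person_id)
             WHERE person_id IN (SELECT person_id FROM osupro_people)
        """)
        conn.execute("""
            INSERT INTO core_users (person_id, display_name, source)
            SELECT p.person_id, p.full_name, 'OSU:pro'
              FROM osupro_people p
             WHERE p.person_id NOT IN (SELECT person_id FROM core_users)
        """)


merge_people(conn)
rows = conn.execute(
    "SELECT person_id, display_name FROM core_users ORDER BY person_id"
).fetchall()
print(rows)
```

Running the merge a second time leaves the core table unchanged, which is the property a nightly job needs.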


Example: OSU:pro database silo (MS SQL Server) -> Kettle transformation (ETL) -> KMData OSU:pro schema -> merge to core procedures

Core Schema


Contains a unified view of the data

Core tables include users, works, groups, locations, and narratives

The global resource table contains the unique KMData identifier

Identifies the source of the data and its current location within KMData
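One way to picture the global resource table is as a registry that assigns every record an identifier and remembers where it came from and where it now lives. The sketch below uses SQLite and invented table and column names; the actual KMData schema is PostgreSQL and is not shown in the slides.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical layout, for illustration only
CREATE TABLE resources (
    id TEXT PRIMARY KEY,          -- unique KMData identifier
    source_system TEXT NOT NULL,  -- where the record came from
    core_table TEXT NOT NULL      -- where it lives inside the core schema
);
CREATE TABLE users (
    resource_id TEXT PRIMARY KEY REFERENCES resources(id),
    display_name TEXT NOT NULL
);
""")


def register(conn, source_system, core_table):
    """Create a resource entry and return its new identifier."""
    resource_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO resources VALUES (?, ?, ?)",
        (resource_id, source_system, core_table),
    )
    return resource_id


rid = register(conn, "HR", "users")
conn.execute("INSERT INTO users VALUES (?, ?)", (rid, "Example Person"))
row = conn.execute(
    """SELECT r.source_system, r.core_table, u.display_name
         FROM resources r JOIN users u ON u.resource_id = r.id
        WHERE r.id = ?""",
    (rid,),
).fetchone()
print(row)
```

Given only an identifier, a caller can look up both the originating system and the core table that holds the record.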



Core Schema: Groups


KMData features a robust user/group system
with a web service API


This can be used to create departmental
listings or custom groups


Organizational structures are read from HR


[Diagram: groups containing users]

Core Schema: Groups UI


The KMData Groups tool
allows administrators to log
in and manage groups from
the KMData application



Groups tool is part of
KMData UI application



The web service API for
group management can
also be utilized



Groups may be expanded
from merely groups of users
to groups of other entities



Web Services

The current REST web services are written in Ruby on Rails

The implementation may be changed to the Java Spring framework

Output is available in XML or JSON

Use of adaptors is encouraged to abstract web service calls on the client platform

A PHP adaptor written by a cross-department development team currently powers KMDrupal
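The adaptor idea (client code calls a small library instead of building web service requests itself) can be sketched in a few lines. The project's adaptor is written in PHP; this Python class, its method names, and the example host are hypothetical, and it assumes the service returns JSON at the requested URL.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def http_get(url):
    """Default transport: fetch a URL and return the response body as text."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


class KmDataAdaptor:
    """Hypothetical client-side adaptor that hides URLs and parsing from callers."""

    def __init__(self, base_url, fetch=http_get):
        self.base_url = base_url.rstrip("/")
        self.fetch = fetch  # injectable, so the adaptor can be tested offline

    def get(self, method, argument):
        """Call one web service method and return the decoded JSON."""
        url = f"{self.base_url}/{method}/{quote(str(argument), safe='')}"
        return json.loads(self.fetch(url))


# Offline usage with a stub transport instead of a real server
def fake_fetch(url):
    return json.dumps({"requested": url})


adaptor = KmDataAdaptor("http://kmdata.example/", fetch=fake_fetch)
result = adaptor.get("users", "jane doe")
print(result["requested"])
```

If the web service implementation changes (for example from Rails to Spring), only the adaptor needs to change, not the sites that use it.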


KMData Web Services API

https://kmdata-svn.osumc.edu/wiki/api


Status and Next Steps


PostgreSQL upgrade from 9.0 to 9.1


New database server


Additional information added to nightly update
(currently periodic manual updates)


Courses information from SIS to be added to
database and KMData API


Expansion of enterprise information from HR
and SIS made available to KMData


Potential use of Java Spring Framework in place
of Ruby on Rails for KMData Web Service API

Demo


Search for a user

http://kmdata.osu.edu/searchApi_users/96069775


Search for a narrative with a term:

http://kmdata.osu.edu/searchApi_narratives/men


Search for a word in a title for any published work:

http://kmdata.osu.edu/searchApi_publications/women
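The three demo URLs share one pattern, so a client could build them with a small helper. This is a sketch based only on the URLs shown above; the helper name is made up, and percent-encoding the search term is an assumption about how the service expects multi-word terms.

```python
from urllib.parse import quote

BASE_URL = "http://kmdata.osu.edu"
SEARCH_KINDS = ("users", "narratives", "publications")  # the three endpoints in the demo


def search_url(kind, term):
    """Build a demo-style search URL such as .../searchApi_users/96069775."""
    if kind not in SEARCH_KINDS:
        raise ValueError(f"unknown search kind: {kind}")
    return f"{BASE_URL}/searchApi_{kind}/{quote(str(term), safe='')}"


print(search_url("users", 96069775))
print(search_url("publications", "women"))
```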




PHP Reuse:
PHP API


Drupal Reuse:
Directory/Profiles as First Use Case

Note: Directories and profiles are 20-30% of traffic on some department sites.


Drupal Reuse:
engineering.osu.edu/directory


Drupal Reuse:
Unthemed


Drupal Reuse:
Add a person a la carte


Drupal Reuse:
Add dynamic groups of people (HR)


Drupal Reuse:
$#@! That’s not my address!

Common problems…



I want different contact info on different sites.



I want different bios on different sites (with different audiences).



I want some feature not supported by the upstream data
source (images)



Drupal Reuse:
Overrides


Drupal Reuse:
Directory Sum Up

An editor can build a directory in minutes, not weeks.

Faculty can update their own profile.

Central and distributed information systems increase in quality.

People can still get the job done without the runaround (overrides).


Drupal Reuse:
Directory Status & Roadmap

The version on engineering.osu.edu is alpha1 (with very slight variation)

Plans to begin large-scale rollout in late January (40+ sites)

Overall works well now.

Still needs

Better group import support.

Ability to override sub-objects.

Ability to create people and other objects not in kmdata.


Permissions refinements.


Stability improvements.


Performance improvements.


Additional configuration points.



Drupal Reuse:
Beyond Directory


Drupal Reuse:
Extensible Architecture

[Diagram components]

Drupal Reusable Components: KmPerson API (structured caching for views), KmPerson CCK Field, KmData API (serialized caching), PHP OO WS Client (library in Drupal; KmObject, KmPerson)

Drupal Directory Feature: Person Content Type, KmPerson CCK Field Instance, Directory View, Management View (VBO), Other Features Goodness, Theme Functions, etc.

KmData Web Services
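The architecture above mentions serialized caching in the KmData API layer. The slides give no detail, so the following is only a generic sketch of the idea (store a serialized copy of each web service result and reuse it until it expires); the class name, the five-minute lifetime, and the loader function are all assumptions.

```python
import pickle
import time


class SerializedCache:
    """Cache results as serialized bytes and reload them after a time limit."""

    def __init__(self, loader, ttl_seconds=300, clock=time.monotonic):
        self.loader = loader      # function that fetches a fresh value
        self.ttl = ttl_seconds    # how long a cached value stays valid
        self.clock = clock
        self._store = {}          # key -> (expires_at, pickled bytes)

    def get(self, key):
        entry = self._store.get(key)
        now = self.clock()
        if entry is not None and entry[0] > now:
            return pickle.loads(entry[1])  # cache hit: deserialize a copy
        value = self.loader(key)           # cache miss: fetch and store
        self._store[key] = (now + self.ttl, pickle.dumps(value))
        return value


calls = []


def slow_lookup(key):
    """Stand-in for a web service call."""
    calls.append(key)
    return {"id": key, "name": "Example"}


cache = SerializedCache(slow_lookup)
first = cache.get("p1")
second = cache.get("p1")
print(len(calls), first == second)
```

The second request is served from the cache, so the stand-in web service is called only once.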


Drupal Reuse:
Contexts and Overrides

Context Module


A collection of configured overrides applied to a certain context.

Think input formats for KmObjects.

Can be associated with multiple uses.

Override Module(s)

Can override arbitrary values.

Can be automatic, possibly from Active Directory.

Can be user interactive, allowing a user to override manually.

Can apply to a class of KmObject.

Operate in a stack.
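A stack of overrides grouped into a context can be sketched as follows. The real modules are Drupal/PHP; these Python classes, their names, and the sample data are hypothetical and only illustrate the "later override wins" behavior of a stack.

```python
class ManualOverride:
    """User-entered values for specific fields (hypothetical module)."""

    def __init__(self, values):
        self.values = values

    def apply(self, record):
        return {**record, **self.values}


class DirectoryOverride:
    """Automatic values, e.g. looked up from a directory service (stubbed with a dict)."""

    def __init__(self, lookup):
        self.lookup = lookup

    def apply(self, record):
        extra = self.lookup.get(record.get("id"), {})
        return {**record, **extra}


class Context:
    """A named, ordered stack of overrides; later overrides win."""

    def __init__(self, name, overrides):
        self.name = name
        self.overrides = list(overrides)

    def resolve(self, record):
        for override in self.overrides:
            record = override.apply(record)
        return record


# Upstream data as it might arrive from the web service
upstream = {"id": "p1", "name": "Pat Example", "phone": "555-0100", "bio": "Central bio"}
directory = Context("directory", [
    DirectoryOverride({"p1": {"phone": "555-0199"}}),
    ManualOverride({"bio": "Bio written for this site"}),
])
resolved = directory.resolve(upstream)
print(resolved["phone"], "|", resolved["bio"])
```

The upstream record is never modified, so a different context on another site can resolve the same person with different contact details or a different bio.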



Drupal Reuse:
Contexts and Overrides

[Diagram components]

Use Cases (context: field setting or API arg): Directory Profile, Directory View, User Pages

Contexts (groups of overrides): Directory (KmPerson), Default (KmObject)

Override Modules: Manual Field Override, Active Directory, LDAP


Drupal Reuse:
Beyond People

Course Listings


Publication Listings


Reports


Upcoming Job Opportunities

News


Events


Institutional Tagging




Collaboration:
Solutions

Use various open source tools internally

Git/gitweb/gitosis

Distributed version control

Code repositories

Drupal instance with project module

Release management and versioning

Update repositories

Issue tracking

Usage statistics

Shibboleth account integration


Collaboration:
Git



Open source, distributed version control system

Every Git clone contains a full copy of the current version and past versions of the repository

Now the primary VCS used on drupal.org




Collaboration:
Gitweb



http://code.web.engadmin.ohio-state.edu



Collaboration:
Drupal Source Control

Drupal.org provides excellent project management and source control in a tightly integrated package

Also serves updates for modules and themes

Implemented many d.o modules and conventions in an internal source control site

Allows for easy cross-posting to drupal.org

Hosted at http://source.web.engadmin.ohio-state.edu

Currently hosting over 60 projects



Collaboration:
Drupal Source Control Benefits

Customizable

Supports non-Drupal projects

Issue Queue

Automatic development builds

Automatic site updates

Drush make integration

Usage statistics

Easy cross-posting to drupal.org


Collaboration:
Other Efforts


In addition to technical solutions, we have also adopted other “open
source” collaborative practices



Collaboration:
Outcomes

Starting slow, but growing

College of Engineering

College of the Arts and Sciences

College of Veterinary Medicine

Office of the Chief Information Officer

Knowlton School of Architecture

Health Sciences Library

Overall, positive

Lots of room for improvement


Closing:
Summary

An information system
to connect the dots.


Tools / Methods for
Collaboration


Reusable Integration to aid adoption.



Closing:
How to Help


It’s a do-ocracy.

Try out the KmData feeds.

https://kmdata-svn.osumc.edu/wiki/api (API documentation)

Try the Drupal projects.

http://source.web.engadmin.ohio-state.edu/project/km_demo (install profile)

kmdemo.asc.ohio-state.edu (demo)

Join the discussion

On the lists: kmdrupal@lists.osu.edu

In the issue queue @ http://source.web.engadmin.ohio-state.edu

In person: weekly kmdrupal dev/design meetings, Thurs @ 11a in 328 Bolz

Contribute your use case or patch. Report a bug. Suggest a solution.





Closing:
Questions

Office of the Chief Information Officer

College of Engineering

College of the Arts and Sciences

College of Veterinary Medicine

Knowlton School of Architecture

Health Sciences Library

http://kmdata.osu.edu/

http://source.web.engadmin.ohio-state.edu/projects/kmdata