Mining Web 2.0 content for enterprise gold


Corporate User Technologies

© 2007, 2008 IBM Corporation

Web 2.0 and the enterprise


June 2008

Mining Web 2.0 content for enterprise gold



Michael Priestley, Lead IBM® DITA Architect

March 2008


Feedable, portable, mashable, DITAble


Overview


What is DITA?


What about Web 2.0?


The problem


The solution (or part of it)


Scenarios: DITA and Wikis


Scenarios: DITA and mashups


Insights


What is DITA?

(the Darwin Information Typing Architecture)


It’s an OASIS standard for designing, authoring, and publishing modular information, such as technical publications, help sets, or Web sites

It’s a markup language: topics for content, maps for collecting and publishing content

And it’s an architecture: specializing to create new types of topics and maps, with inheritance of existing processing

Supported by an open-source toolkit, a wide range of products, and an active community of users
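As a minimal sketch of "topics for content, maps for collecting content", the following builds one generic DITA topic and one map that references topics by file name. The element names (`topic`, `title`, `body`, `p`, `map`, `topicref`) come from DITA; the helper names, ids, and file names are invented for illustration, and the DOCTYPE declarations a real DITA file would carry are omitted.

```python
import xml.etree.ElementTree as ET


def make_topic(topic_id: str, title: str, paragraph: str) -> ET.Element:
    """Build a minimal generic DITA topic: an id, a title, one body paragraph."""
    topic = ET.Element("topic", {"id": topic_id})
    ET.SubElement(topic, "title").text = title
    body = ET.SubElement(topic, "body")
    ET.SubElement(body, "p").text = paragraph
    return topic


def make_map(title: str, hrefs: list[str]) -> ET.Element:
    """Build a minimal DITA map that collects topics by reference."""
    ditamap = ET.Element("map")
    ET.SubElement(ditamap, "title").text = title
    for href in hrefs:
        ET.SubElement(ditamap, "topicref", {"href": href})
    return ditamap


# Hypothetical content: the topic holds the words, the map holds the collection.
topic = make_topic("install", "Installing the product", "Run the installer.")
ditamap = make_map("User guide", ["install.dita", "configure.dita"])

print(ET.tostring(topic, encoding="unicode"))
print(ET.tostring(ditamap, encoding="unicode"))
```

The topic never mentions the map, so the same topic can appear in any number of collections.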


Vendor response: http://dita.xml.org/products-services



"PTC expects that by the end of 2008, up to 80% of all new XML publishing installations will be based on DITA."

From PTC news release on Arbortext(R) 5.3.

"Nearly 50% of the respondents estimated they reuse their content and are investigating the implementation of DITA within their organization."

From results of web survey by Astoria Software.

And others: Elkera, Doczone, DITA Storm, in.Vision…


Why DITA?


Information quality (MasterCard, Avaya, Business Objects, Sybase, RIM)


Reduced translation costs (IBI, RIM, ATI/AMD)


Ability to reuse across products/product variants (Adobe, Nokia, IBM,
Sterling Commerce, Teradata)


Speed in responding to changes


Flexibility in responding to organizational change (Teradata, IBM)


Better management of workload (IBM, IBI)


Ability to specialize to meet domain needs (Siemens Medical, Nokia, Kone)


Ability to reuse across kinds of content (marketing, education, support…)
(Business Objects, Nokia, IBM)


Ability to reuse across companies (Siemens Medical, IBM)


Vendor independence (because open standard)


Ease of incremental adoption (Comet, Schlumberger, RIM)

From an informal survey of DITA users at recent conferences


An open standard for architected content


Navigation


DITA maps manage
relationships among topics


Tables of contents, site maps,
related links…


Metadata


Can be managed at topic level
(content) or map level
(collection)


Content


DITA topics, which can be
specialized to support specific
information types, for example
DITA task


Separates core content from
metadata and links
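One way to picture "specialized topics inherit existing processing" is DITA's `class` attribute, which records an element's ancestry (for example, a task is a kind of topic). The sketch below assumes a processor that looks up handlers by class token, most specific first; the handler functions and the sample task are hypothetical, not the DITA Open Toolkit's actual code.

```python
import xml.etree.ElementTree as ET

# A specialized topic type: the class attribute lists its ancestry, general to specific.
TASK = """<task id="backup" class="- topic/topic task/task ">
  <title class="- topic/title ">Backing up</title>
</task>"""


def class_tokens(element: ET.Element) -> list[str]:
    """Return the module/type tokens of a DITA class attribute, general to specific."""
    return [tok for tok in element.get("class", "").split() if "/" in tok]


def pick_handler(element: ET.Element, handlers: dict):
    """Choose the most specific registered handler, falling back to ancestors."""
    for token in reversed(class_tokens(element)):
        if token in handlers:
            return handlers[token]
    raise LookupError("no handler for " + element.tag)


def render_topic(element: ET.Element) -> str:
    return "topic:" + element.get("id")


def render_task(element: ET.Element) -> str:
    return "task:" + element.get("id")


task = ET.fromstring(TASK)

# A processor that only knows generic topics can still process the task.
generic_only = {"topic/topic": render_topic}
print(pick_handler(task, generic_only)(task))

# A processor that also knows tasks uses the more specific handler.
task_aware = {"topic/topic": render_topic, "task/task": render_task}
print(pick_handler(task, task_aware)(task))
```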



What about Web 2.0?

“Web 2.0 is of course a piece of jargon, nobody even knows what it means.”

-- Tim Berners-Lee (that guy who invented the Web)




“Web 2.0 … refers to a perceived or proposed second generation of Internet-based services, such as social networking sites, wikis, communication tools, and folksonomies, that emphasize online collaboration and sharing among users.”

-- http://en.wikipedia.org/wiki/Web_2.0



Pick 2


Wikis


Create content collaboratively


Blogs


Social networking


Mashups


Combine content from multiple sources


Folksonomies


Why Wikis and mashups?


Powerful enterprise tools


Enable fast, easy, open collaboration on content
using Wikis


Create new content quickly


Enable fast, flexible development of tactical
applications using mashups


Leverage investment in trusted content/data


Easier collaboration, faster innovation



The problem with Wikis...


Content is unstructured


There may be templates
and implied semantics, but
no validation


Content is non-standard

Moving content out of a Wiki, even between Wikis, is hard


Content is tangled


Selecting a subset of
content results in broken
links
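The "tangled content" problem can be sketched in a few lines: take a subset of wiki pages and list every link that points outside the subset. This assumes a simple `[[Target]]` or `[[Target|label]]` link syntax; the page names and the `dangling_links` helper are made up for illustration.

```python
import re

# Matches [[Target]] and [[Target|label]]; captures only the target name.
LINK = re.compile(r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]")


def dangling_links(pages: dict[str, str], subset: set[str]) -> dict[str, list[str]]:
    """For each page in the subset, list link targets that fall outside the subset."""
    report = {}
    for title in sorted(subset):
        targets = [t.strip() for t in LINK.findall(pages[title])]
        missing = [t for t in targets if t not in subset]
        if missing:
            report[title] = missing
    return report


# Hypothetical wiki content.
pages = {
    "Install": "See [[Configure]] and [[Troubleshooting|help]].",
    "Configure": "Back to [[Install]].",
    "Troubleshooting": "Ask on the [[Forum]].",
}

report = dangling_links(pages, {"Install", "Configure"})
print(report)
```

Exporting only "Install" and "Configure" leaves the link to "Troubleshooting" broken, which is the failure the slide describes.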



The problem with mashups...


Sources of content aren’t
standard


Every new source means a
new widget or control


Mashups aren’t standard


Can’t share mashup definitions
with other applications or even
other mashup engines


Mashups don’t stack


Every new mashup is a new source of non-standard content


Sum: Wikis don’t mash well


Faster creation of silo’d
content


Faster creation of
redundant content


Faster creation of more
content you can’t reuse




Standard solutions


XML:


Separate content from
application


Then share content across
applications


DITA:


Standard content sources
emphasizing reuse


Stackable collection standard: let collections reuse collections


New content types and
collection types work with
existing applications
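"Collections reuse collections" can be sketched as maps that reference other maps. The example below models each map as a list of references and expands nested maps into a flat topic list. The in-memory dictionary, the file names, and the rule that any reference ending in `.ditamap` is a nested map are simplifying assumptions, not how a DITA processor stores maps.

```python
def flatten(map_name: str, maps: dict[str, list[str]], _path: frozenset = frozenset()) -> list[str]:
    """Expand a map into its topics, following references to nested maps."""
    if map_name in _path:
        raise ValueError("circular map reference: " + map_name)
    path = _path | {map_name}
    topics = []
    for href in maps[map_name]:
        if href.endswith(".ditamap"):
            # A collection reusing another collection.
            topics.extend(flatten(href, maps, path))
        else:
            topics.append(href)
    return topics


# Hypothetical maps: two deliverables reuse the same installation collection.
maps = {
    "install.ditamap": ["prereqs.dita", "install.dita"],
    "suite.ditamap": ["overview.dita", "install.ditamap", "backup.dita"],
    "quickstart.ditamap": ["install.ditamap"],
}

print(flatten("suite.ditamap", maps))
print(flatten("quickstart.ditamap", maps))
```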


Scenarios


Wikis


Create DITA, publish to Wiki


Create DITA, feed to Wiki


Create DITA, migrate to Wiki


Create Wiki, feed to DITA


Create Wiki, migrate to DITA


Or: a native DITA wiki: portable content


Mashups


With standardized sources


With added semantics


Create DITA, publish to Wiki


DITA remains source


Wiki is published out to
provide forum for
comments on source


Example: maintain common
source for multiple Wikis:


Different audiences


Different products


Different platforms


Wiki published from DITA: example


Create DITA, feed to Wiki


DITA remains source


Surface some DITA content
in specific Wiki contexts


Disable editing in Wiki for
just the derived topics


Example: tech support
database


When answer moves into
product docs, replace tech
support doc with feed from
product doc


Create DITA, migrate to Wiki


DITA stops being source


Use as seed content for new
cycle of development


Example: collaborate on
scenarios for proposed features
in new product


Port previous release’s
scenarios from DITA to wiki


Collaborate until design
approved


Then port back to DITA to track
approvals, changes, etc. and
add reuse/conditionality


Create Wiki, feed to DITA


Wiki remains source, but makes
Wiki source reusable by DITA
applications


Gets rid of dangling links,
formalizes semantics


Does not provide validation,
conditional processing, advanced
DITA features


Example: OLPC reuse of Wikipedia
content into class curriculum
(proposed design)


Export/feed specialized topics for
different article types


Export/feed wiki slices to DITA maps


Allows integration of content across
multiple Wikis/repositories


Allows specialized processing for
specific article types (e.g., biology)
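A rough sketch of a Wiki-to-DITA feed: each exported wiki page becomes a generic DITA topic, links to pages inside the exported slice become `xref` elements, and links to pages outside the slice are flattened to plain text so nothing dangles. The `[[Target]]` link syntax, the `slug` naming rule, and the sample article are assumptions for illustration; this is not the OLPC project's actual design.

```python
import re
import xml.etree.ElementTree as ET

LINK = re.compile(r"\[\[([^\]|]+)\]\]")


def slug(title: str) -> str:
    """Hypothetical naming rule: lower-case the title and replace spaces."""
    return title.strip().lower().replace(" ", "_")


def append_text(parent: ET.Element, last_child, text: str) -> None:
    """Append text after the last child if there is one, else to the parent itself."""
    if last_child is None:
        parent.text = (parent.text or "") + text
    else:
        last_child.tail = (last_child.tail or "") + text


def wiki_to_topic(title: str, text: str, exported: set[str]) -> ET.Element:
    """Convert one wiki page to a DITA topic, keeping only links inside the slice."""
    topic = ET.Element("topic", {"id": slug(title)})
    ET.SubElement(topic, "title").text = title
    body = ET.SubElement(topic, "body")
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        p = ET.SubElement(body, "p")
        last, pos = None, 0
        for match in LINK.finditer(para):
            plain, target = para[pos:match.start()], match.group(1).strip()
            if target in exported:
                append_text(p, last, plain)
                last = ET.SubElement(p, "xref", {"href": slug(target) + ".dita"})
                last.text = target
            else:
                # Target is outside the slice: keep the words, drop the link.
                append_text(p, last, plain + target)
            pos = match.end()
        append_text(p, last, para[pos:])
    return topic


article = wiki_to_topic("Mitosis", "The [[Cell]] is studied in [[Biology]].", {"Mitosis", "Cell"})
print(ET.tostring(article, encoding="unicode"))
```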


Create Wiki, migrate to DITA


DITA becomes source


Example: After brainstorming to create new scenarios, move into DITA for formal use

Begin topic analysis and associate requirements, tasks, features, etc.

Begin reusing: identifying parts of scenario that apply to multiple products, etc.



Or: a native DITA wiki


Feed back and forth
between systems with no
loss of semantics


Port content to the system
that meets its needs easily,
reliably, repeatably


Integrate with new systems
quickly based on shared
content standards


Portable content means repeatable collaboration

[Diagram: two three-step cycles, "1. Design, 2. Develop, 3. Deploy" and "1. Market, 2. Train, 3. Support", with the roles involved: authors, architects, developers, editors, translators, marketers, trainers, technical communicators, tech support, users.]

Move the content to the people, not the people to the content


Mashup scenarios


With standardized sources


Combine Wikipedia country information with specific city articles, tourist sites, Google maps, and WikiTravel notes, based on title keywords

Generate printable PDF with index, TOC: custom travel guide; or create a hyperguide you can use on your phone/PDA


With added semantics


Educational: Generate lists of
countries by population density
(combining population and area)


Recreational: Create a “see” list
for European capitals
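The population-density example is the simplest kind of "added semantics" mashup: two sources each expose one typed value per country, and the mashup derives a third. A minimal sketch, with invented country names and numbers standing in for the real sources:

```python
def density_ranking(population: dict[str, float], area_km2: dict[str, float]) -> list[tuple[str, float]]:
    """Combine two sources keyed by country and rank by people per square kilometre."""
    shared = population.keys() & area_km2.keys()  # only countries present in both sources
    rows = [(name, population[name] / area_km2[name]) for name in shared]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Illustrative values only, not real statistics.
population = {"Country A": 1000, "Country B": 500, "Country C": 750}
area_km2 = {"Country A": 10, "Country B": 2}

ranking = density_ranking(population, area_km2)
for name, density in ranking:
    print(name, density)
```

The combination only works because both sources agree on what "population" and "area" mean and on how countries are named, which is the role the slides assign to standardized, semantically marked-up content.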


DITA mashup example


IBM® Custom Content Assembler

DITA feeds for Lotus® product documentation

Dynamic publishing for user-selected and organized topics

User-created collections are themselves searchable and reusable


Collection includes DITA standard content types
plus DITA specialized content for learning/training,
plus DITA metadata wrappers for multimedia/Flash


Dynamic content delivery


DITA feeds


DITA feeds: subscribable, organizable, taggable


Find the topics you want


Create the book you want


Insights


Lots of different types of content in Wikis


Range of formality/structure, range of mechanisms for
enforcing


Not a single type of content: a phase in the content lifecycle


As requirements change over time, let content move to the
application that best supports those requirements


The conflict between structure and collaboration is
resolvable


All you need is standardized modular content


DITA as a common currency

DITA preserves semantics and structure through a feed

Provides scalable semantic bandwidth: same feed can be used by both low-semantics and high-semantics applications

Preserve investment in structure and semantics, even add semantics through DITA maps

Validate, integrate, automate

[Diagram: feeds between DITA, hybrid, semi-structured, and other sources. RSS throws away structure/semantics; ATOM+DITA preserves structure/semantics.]
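"Scalable semantic bandwidth" can be sketched with a single Atom entry that carries a DITA task as its content. A low-semantics reader uses only the Atom title; a high-semantics reader digs into the task's steps. The Atom namespace is the real one, but the entry itself, the `application/dita+xml` content type, and both reader functions are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"

# Hypothetical feed entry: an Atom wrapper around an unmodified DITA task.
FEED_ENTRY = f"""<entry xmlns="{ATOM}">
  <title>Backing up the database</title>
  <id>urn:example:backup</id>
  <updated>2008-03-01T00:00:00Z</updated>
  <content type="application/dita+xml">
    <task xmlns="" id="backup">
      <title>Backing up the database</title>
      <taskbody>
        <steps>
          <step><cmd>Stop the server.</cmd></step>
          <step><cmd>Copy the data directory.</cmd></step>
        </steps>
      </taskbody>
    </task>
  </content>
</entry>"""


def headline(entry: ET.Element) -> str:
    """Low-semantics consumer: any feed reader can show the Atom title."""
    return entry.findtext(f"{{{ATOM}}}title")


def steps(entry: ET.Element) -> list[str]:
    """High-semantics consumer: read the task's steps from the embedded DITA."""
    path = f"{{{ATOM}}}content/task/taskbody/steps/step/cmd"
    return [cmd.text for cmd in entry.iterfind(path)]


entry = ET.fromstring(FEED_ENTRY)
print(headline(entry))
print(steps(entry))
```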


A semantic ecosystem: feedable, portable, mashable content

[Diagram: content lifecycle stages 1. Design content, 2. Draft content, 3. Review/edit, 4. Approved content, 5. Public infocenter/wiki, 6. Articles/new content, 7. Tech support; related sources A. External sources, B. Design artifacts, C. Solution artifacts, D. Developer/partner artifacts; Taxonomies.]


DITAble: use, reuse, specialize, collaborate


Across tools and silos


Standards-based reuse even across customized solutions/tools: allows specialized solutions, still supports content interchange



Across views and output types


Separates content from metadata and
navigation, allows use of content for different
purposes



Across communities and industries


Integrate information from multiple sources
(structured topics, design documents, blogs…)


Share infrastructure across multiple industries
(retail, government, software…)


The DITA community


OASIS DITA Technical Committee now working on DITA 1.2


http://oasis-open.org/committees/dita


Tool vendors (Adobe, Idiom, In.vision, Ixiasoft, Justsystems, Lionbridge, Mekon, PTC,
RSI, Syntext, Siberlogic, XyEnterprise…)


Consultants (Comtech, Innodata-Isogen, Mulberrytech, Rockley, Flatirons, Comet…)


Users (BMC, Business Objects, Boeing, Freescale, Gambro, IBM, Intel, Lucent, Nokia,
Novartis, Oracle, US DoD, Sun, RIM, STC…)


Subcommittees: Semiconductor industry, Machine industry, Learning and Training,
Translation, Enterprise Business Documents, Online Help...


DITA-OT as Open Source on SourceForge

http://dita-ot.sourceforge.net

Reference implementation: continuing to improve with many contributors


Plugin architecture for new capabilities and specializations



DITA focus area and Wiki:
http://dita.xml.org


Michael Priestley’s blog: http://dita.xml.org/blog/25



DITA users mailing list: http://groups.yahoo.com/group/dita-users



More Wiki/DITA stuff


Building a DITA-Wiki hybrid (with Lisa Dyer, Anne Gentle):
http://www.stc.org/intercom/PDFs/2008/200804_18-21.pdf

The DITA Maturity Model (with Amber Swope):
http://dita.xml.org/wiki/the-dita-maturity-model

What does Wiki have to do with DITA? (Bob Doyle):
http://dita.xml.org/what-does-wiki-have-do-dita

DITA/Wiki/OLPC project:
http://wiki.laptop.org/go/Projects/Wikislice



Next steps


How do we get there?


Just one tool will never cut it, so insist on support for standards like DITA and XML, so you can chain multiple tools together

Blog about it, ask about it on Wikis, log requirements on SourceForge, with your vendors...


Let people know what you want or you won’t get it



And when you have it working, share your
experiences



Backup



DITA and the Web


The Semantic Web


The Structured Web


The Social Web


DITA and the Semantic Web


The Semantic Web


Formal expression of concepts and relationships within a given knowledge
domain


Ontologies, taxonomies, metadata and relationships


The problem


Requires special skills and knowledge to create


Typically not part of authoring process, so content may be at odds with ontology, or out of synch


The opportunity


Simplify the problem: integrate metadata management with the authoring
process


Consolidate formats: use DITA maps to manage relationships and metadata
for shareable content, DITA topics for definitions


Specialize: create special-purpose map formats for particular problem areas


DITA and the Structured Web


The Structured Web


The convergence of structured authoring and information architecture


Adding structure and semantics to the way information is designed,
organized, and delivered


The problem



Requires specialized skills and tools to create structured content


Information architecture gets out of synch with content


The opportunity


Simplify the tooling: use DITA as common base for structured content


Integrate processes: keep information architecture relevant by making
it part of delivery architecture using DITA maps



DITA and the Social Web


The Social Web


Easy to create content, collaborate, and manage relationships


Easy to build new applications


The problem


Hard to move content between systems: content can easily become silo’d

Hard to integrate structure: most content is lowest common denominator


The content assets are out of the reach of existing business processes and
applications, such as workflow, translation, etc.


The opportunity


Standardize content: Use DITA to integrate/share/move content between
systems, reduce translation and republishing costs


Support specialization: Structure and semantics at source allows robust
integration with enterprise processes, like regulatory workflows, legal
requirements




DITA: Reconciling three web models

[Diagram labels: Social web (Wikis, blogs…); Structured web (structured content and collections…); Semantic web (formal taxonomies…); folksonomies, tag clouds…; generic topics and metadata; specialized topics and maps; specialized maps and metadata; DITA.]