Taxonomy Development Workshop - KAPS Group


12 Nov 2013


Taxonomy Development

Knowledge Structures



Tom Reamy

Chief Knowledge Architect

KAPS Group

Knowledge Architecture Professional Services

http://www.kapsgroup.com

2

Agenda


Introduction



Knowledge Structures



Taxonomy Management Software



Exercises



Conclusion

3

Knowledge Structures


List of Keywords (Folksonomies)


Controlled Vocabularies, Glossaries


Thesaurus


Browse Taxonomies (Classification)


Formal Taxonomies


Faceted Classifications


Semantic Networks / Ontologies


Topic Maps


Knowledge Maps

4

Knowledge Structures

Lists of Keywords (Folksonomies)


Folksonomy (also known as collaborative tagging, social classification, social indexing, and social tagging) is the practice and method of collaboratively creating and managing tags to annotate and categorize content. Folksonomy describes the bottom-up classification systems that emerge from social tagging. [1]

In contrast to traditional subject indexing, metadata is generated not only by experts but also by creators and consumers of the content. Usually, freely chosen keywords are used instead of a controlled vocabulary. [2]

Folksonomy (from folk + taxonomy) is a user-generated taxonomy.


5

Knowledge Structures

Controlled Vocabularies, Glossaries


Controlled Vocabularies, Glossaries


Lists with minimum structure


Easy to develop


Difficult to get value from


Simple Reference resource


Thesaurus


Taxonomy-like


Less formal


BT (broader term), NT (narrower term); also RT (related term)
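
The BT / NT / RT relationships above can be sketched as a small data structure. This is a minimal illustration, not any vendor's implementation: the `Thesaurus` class and its method names are hypothetical, and the sample terms are borrowed from the ontology slide later in this deck purely as placeholders.

```python
from collections import defaultdict


class Thesaurus:
    """Toy thesaurus holding BT/NT/RT links between terms (hypothetical API)."""

    def __init__(self):
        self.broader = defaultdict(set)   # term -> its BT (broader) terms
        self.narrower = defaultdict(set)  # term -> its NT (narrower) terms
        self.related = defaultdict(set)   # term -> its RT (related) terms

    def add_broader(self, term, broader_term):
        # BT and NT are reciprocal, so record both directions at once.
        self.broader[term].add(broader_term)
        self.narrower[broader_term].add(term)

    def add_related(self, term, other):
        # RT is symmetric and non-hierarchical.
        self.related[term].add(other)
        self.related[other].add(term)

    def ancestors(self, term):
        """Return every term reachable by following BT links upward."""
        seen = set()
        stack = [term]
        while stack:
            for bt in self.broader[stack.pop()]:
                if bt not in seen:
                    seen.add(bt)
                    stack.append(bt)
        return seen


# Placeholder terms; the relationships are assumed for illustration only.
t = Thesaurus()
t.add_broader("Violins", "Instruments")
t.add_broader("Instruments", "Music")
t.add_related("Violins", "Violinists")
```

Keeping BT and NT in sync inside one method is the main design point: a thesaurus where the two directions can drift apart is hard to maintain.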



6

Two Types of Taxonomies:
Browse and Formal


Browse Taxonomy: Yahoo

7

Two Types of Taxonomies:
Formal


8

Facets and Dynamic Classification


Facets are not categories


Entities or concepts belong to a category


Entities have facets


Facets are metadata: properties or attributes


Entities or concepts fit into one category


All entities have all facets, defined by a set of values


Facets are orthogonal: mutually exclusive dimensions


An event is not a person is not a document is not a place.


Facets: variety of units, of structure


Date or price: numerical range


Location: big to small (partonomy)


Winery: alphabetical


Hierarchical: taxonomic
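
As a rough illustration of facets as orthogonal metadata, the sketch below filters a tiny wine list by color, price range, and location, and counts values for one facet. Everything here is assumed for the example: the `Wine` record, the sample wines, and the `facet_filter` / `facet_counts` helpers are hypothetical and not taken from any product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Wine:
    # Every entity carries every facet; each facet has its own kind of value.
    name: str
    color: str      # small fixed set of values
    price: float    # numerical range
    region: str     # location
    winery: str     # alphabetical list


# Made-up sample data for illustration only.
WINES = [
    Wine("Red A", "red", 18.0, "Napa", "Alpha Cellars"),
    Wine("Red B", "red", 42.0, "Sonoma", "Beta Vineyards"),
    Wine("White C", "white", 15.0, "Napa", "Alpha Cellars"),
    Wine("Red D", "red", 22.0, "Sonoma", "Gamma Estate"),
]


def facet_filter(items, color=None, price_range=None, region=None):
    """Keep items matching every facet value that was supplied."""
    results = []
    for wine in items:
        if color is not None and wine.color != color:
            continue
        if price_range is not None:
            low, high = price_range
            if not (low <= wine.price <= high):
                continue
        if region is not None and wine.region != region:
            continue
        results.append(wine)
    return results


def facet_counts(items, facet):
    """Count how many items carry each value of one facet."""
    counts = {}
    for item in items:
        value = getattr(item, facet)
        counts[value] = counts.get(value, 0) + 1
    return counts


affordable_reds = facet_filter(WINES, color="red", price_range=(10, 25))
```

Because the facets are independent, any combination of them can be applied in any order, which is what makes the navigation "dynamic".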

9

Knowledge Structures

Semantic Networks / Ontologies


Ontology: more formal


XML standards: OWL, DAML


Semantic Web: machine understanding


RDF: Noun - Verb - Object


Vice President is Officer


Build implications from properties of Officer


Semantic Network: less formal


Represent large ontologies


Synonyms and variety of relationships
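
The "Vice President is Officer" example can be sketched with plain subject-predicate-object triples and a simple inheritance rule. This is only a toy model of the idea, not RDF or OWL tooling: the predicate names, the extra `Employee` class, and the properties attached to `Officer` and `Employee` are all assumptions made up for the illustration.

```python
# Each fact is a (subject, predicate, object) triple, in the spirit of RDF.
TRIPLES = {
    ("VicePresident", "is_a", "Officer"),      # from the slide
    ("Officer", "is_a", "Employee"),           # hypothetical
    ("Officer", "can_sign", "Contracts"),      # hypothetical property
    ("Employee", "has", "EmployeeID"),         # hypothetical property
}


def superclasses(cls, triples):
    """Follow is_a links transitively and return every superclass."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in triples:
            if subject == current and predicate == "is_a" and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


def implied_properties(cls, triples):
    """Properties stated on the class itself or inherited via is_a."""
    classes = {cls} | superclasses(cls, triples)
    return {
        (predicate, obj)
        for subject, predicate, obj in triples
        if subject in classes and predicate != "is_a"
    }


vp_properties = implied_properties("VicePresident", TRIPLES)
```

Nothing is stated directly about `VicePresident` except its class, yet the implied properties follow from what is known about `Officer`; that is the kind of implication building the slide refers to.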


10

Knowledge Structures: Ontology




[Diagram: ontology graph linking Music, Instruments, Violins, Bluegrass, Violinists, and Musicians with "uses", "is a", and "create" relationships]

11

Knowledge Structures

Topic Maps


ISO Standard


See www.topicmaps.org


Topic Maps represent subjects (topics), associations, and occurrences


Similar to semantic networks


Ontology defines the types of subjects and types of relationships


Combination of semantic network and other formal structures (taxonomy or ontology)
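
The three building blocks named above (topics, associations, occurrences) can be sketched as simple records. This is a loose illustration of the concepts only; it does not follow the ISO interchange syntax, and the class names, role names, and the "employment" association type are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    name: str
    topic_type: str                                   # the type comes from the ontology
    occurrences: list = field(default_factory=list)   # links to resources about the topic


@dataclass
class Association:
    assoc_type: str
    members: dict   # role name -> topic name


class TopicMap:
    def __init__(self):
        self.topics = {}
        self.associations = []

    def add_topic(self, name, topic_type):
        self.topics[name] = Topic(name, topic_type)
        return self.topics[name]

    def associate(self, assoc_type, **roles):
        # Only topics already in the map may take part in an association.
        for topic_name in roles.values():
            if topic_name not in self.topics:
                raise KeyError(f"unknown topic: {topic_name}")
        self.associations.append(Association(assoc_type, dict(roles)))

    def associated_with(self, name):
        """Names of all topics sharing at least one association with `name`."""
        result = set()
        for assoc in self.associations:
            if name in assoc.members.values():
                result.update(t for t in assoc.members.values() if t != name)
        return result


tm = TopicMap()
tm.add_topic("Tom Reamy", "person").occurrences.append("http://www.kapsgroup.com")
tm.add_topic("KAPS Group", "organization")
tm.associate("employment", employee="Tom Reamy", employer="KAPS Group")
```

Occurrences point outward to content, while associations stay inside the map; keeping the two separate is what lets one map sit on top of many content stores.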




12

Knowledge Structure: Topic Maps




13

Knowledge Structures

Knowledge Maps


No standards; applied at high level


Ontologies plus / applied to specific environment


Map of Groups, Content Stores, Purpose, Technology


Add structure to each element


Facet Structure: filter by group, content, purpose


Strategic resource

14

Knowledge Structures: Which one to use?


Level 1: keywords, glossaries, acronym lists, search logs


Resources, inputs into upper levels


Level 2: Thesaurus, Taxonomies


Semantic Resource: foundation for applications, metadata


Level 3: Facets, Ontologies, semantic networks, topic maps


Applications


Level 4: Knowledge maps


Strategic Resource

15

Web 2.0


No need for Taxonomies etc.?



"Tags are great because you throw caution to the wind, forget about whittling down everything into a distinct set of categories and instead let folks loose categorizing their own stuff on their own terms." - Matt Haughey, MetaFilter


Tyranny of the majority: worst type of central authority


More Madness of Crowds than Wisdom of Crowds


“Things fall apart; the center cannot hold;
Mere anarchy is loosed upon the world,…
The best lack all conviction, while the worst
Are full of passionate intensity.”
- The Second Coming, W.B. Yeats




16

Advantages of Folksonomies


Simple (no complex structure to learn)


No need to learn difficult formal classification system


Lower cost of categorization


Distributes cost of tagging over large population


Open ended: can respond quickly to changes


Relevance: user’s own terms


Support serendipitous form of browsing


Easy to tag any object: photo, document, bookmark


Better than no tags at all


Getting people excited about metadata!

17

Folksonomies


Problems and Limits


Folksonomies don’t compare with taxonomies or ontologies


Serendipitous browsing is a small part of search


Limited areas of success: popular sites are popular


Quality content (finance, science, etc.): not good candidates


No mechanism for improving folksonomies


Scale: too big (a million hits) or too little (200 items); Amazon and LibraryThing


Need intrinsic value of tagging, not tagging for better tags


Bad tags: idiosyncratic or too broad, errors, limited reach


Most people can’t tag very well; it is a learned skill



18

Del.icio.us Tags


Design blog software music tools reference art video
programming webdesign web2.0 mac howto linux
tutorial web free news photography shopping blogs
css imported education travel javascript food games


Development inspiration politics flash apple tips java google osx
business windows iphone science productivity books toread health funny
internet wordpress ajax ruby research humor fun technology search
opensource


Photoshop media recipes cool work article marketing security mobile jobs rails
lifehacks tutorials resources php social download diy ubuntu freeware portfolio
photo movies writing graphics youtube audio online

19

Del.icio.us - Folksonomy Findability


Too many hits (where have we heard that before?)


Design: 1 million; software: 931,259; sex: 129,468


No plurals, stemming (singular preferred)


Folksonomy: 14,073; folksonomies: 3,843; both: 1,891


Blog: 1.7M; blogs: 516,340; Weblog: 155,917; weblogs: 36,434; blogging: 157,922; bloging: 697


Taxonomy: 9,683; taxonomies: 1,574


Personal tags: cool, fun, funny, etc.


Good for social research, not finding documents or sites


How good for personal use? Funny is time dependent
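
The plural and variant problem above can be illustrated with a rough sketch that folds plural tags into their singular form and sums the counts. The `naive_singular` rule is a deliberately crude assumption, not a real stemmer, and the function names are hypothetical. The counts are the ones quoted on this slide. Summing overstates the number of distinct items, since some bookmarks carry both variants (the slide reports 1,891 tagged with both "folksonomy" and "folksonomies").

```python
def naive_singular(tag):
    """Very rough singularizer: an assumption for illustration, not a stemmer."""
    tag = tag.lower()
    if tag.endswith("ies") and len(tag) > 4:
        return tag[:-3] + "y"
    if tag.endswith("s") and not tag.endswith("ss") and len(tag) > 3:
        return tag[:-1]
    return tag


def merge_variants(tag_counts):
    """Sum usage counts of tags that reduce to the same singular form."""
    merged = {}
    for tag, count in tag_counts.items():
        key = naive_singular(tag)
        merged[key] = merged.get(key, 0) + count
    return merged


# Counts as quoted on the slide.
TAG_COUNTS = {
    "folksonomy": 14073,
    "folksonomies": 3843,
    "taxonomy": 9683,      # printed as "9.683" on the slide; read here as 9,683
    "taxonomies": 1574,
    "weblog": 155917,
    "weblogs": 36434,
}

merged = merge_variants(TAG_COUNTS)
```

Even this crude rule shows how much findability depends on vocabulary control: misspellings such as "bloging" and synonyms such as "blog" versus "weblog" would still stay separate without a maintained list of variants.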

20

LibraryThing


Book people aren’t much better at tagging


High level concepts: psychology (55,000), religion (120,000), science (101,000)


Issue: variety of terms. Cognitive science: need at least 40 other tags to cover the actual field of cognitive science


Strange tags: book (19,000). It’s a book site?


Combination of facets and topics


Facets: Date (16th century, 1950’s, 2007) // Function (owned, not read) // Type (graphic novel, novel) // Genre (horror, mystery)


Topics: majority like Del.icio.us

21

LibraryThing


Book on Neuroscience


(Location: dining room) (1), biological (1), biology (8), box74 (1), Brain (1), brain research (1), brains (1), cognitive neuroscience (1), cognitive science (1), consciousness (1), currently reading (1), HelixHealth (1), kognitionswissenschaft (1), medical (1), medicine (1), neuroscience (19), non-fiction (5), partread (1), Psychology (4), Science (10), textbook (10), theory (1)


Too General: Science, Psychology, biology, textbook


Too specific: Location: dining room, box74


Facets: currently reading, partread

22

Better Folksonomies:


Will social networking make tags better?


Not so far: example of Del.icio.us, same tags


Quality and Popularity are very different things


Most people don’t tag, don’t re-tag


Study: folksonomies follow NISO guidelines (nouns, etc.), but do they actually work? See analysis


Most tags deal with computers and are created by people that love to do this stuff, not regular users and infrequent users


Beware true believers!

23

Browse Taxonomies:
Strengths and Weaknesses


Strengths: Browse is better than search


Context and discovery


Browse by task, type, etc.


Weaknesses:


Mix of organization


Catalogs, alphabetical listings, inventories


Subject matter, functional, publisher, document type


Vocabulary and nomenclature issues


Problems with maintenance, new material


Poor granularity and little relationship between parts


Web site unit of organization


No foundation for standards

24

Formal Taxonomies:
Strengths and Weaknesses


Strengths:


Fixed Resource: little or no maintenance


Communication Platform: share ideas, standards


Infrastructure Resource


Controlled vocabulary and keywords


More depth, finer granularity


Weaknesses:


Difficult to develop and customize


Don’t reflect users’ perspectives


Users have to adapt to language

25

Faceted Navigation:
Strengths and Weaknesses


Strengths:


More intuitive: easy to guess what is behind each door


20 questions: we know and use


Dynamic selection of categories


Allow multiple perspectives


Trick users into “using” Advanced Search


wine where color = red, price = x-y, etc.


Weaknesses:


Difficulty of expressing complex relationships


Simplicity of internal organization


Loss of Browse Context


Difficult to grasp scope and relationships


Limited Domain Applicability: type and size


Entities not concepts, documents, web sites

26

Dynamic Classification / Faceted navigation


Search and browse better than either alone


Categorized search: context


Browse as an advanced search


Dynamic search and browse is best


Can’t predict all the ways people think


Advanced cognitive differences


Panda, Monkey, Banana


Can’t predict all the questions and activities


Intersections of what users are looking for and what documents are often about


China and Biotech


Economics and Regulatory

27

Varieties of Taxonomy / Text Analytics Software


Taxonomy Management


Text Analytics


Auto-Categorization, Entity Extraction


Sentiment Analysis


Software Platforms


Content Management, Search


Application Specific


Business Intelligence

Vendors of Taxonomy / Text Analytics Software


Attensity


Business Objects - Inxight


Clarabridge


ClearForest


Data Harmony / Access Innovations


Lexalytics


Multi-Tes


Nstein


SchemaLogic


Teragram


Wikionomy


Wordmap


Lots More

28

29

Why Taxonomy Software?


If you have to ask, you can’t afford it


Spreadsheets: good for calculations, but their days as a taxonomy development tool are (almost) over


Ease of use: more productive


Increase speed of taxonomy development


Better quality: synonyms, related terms, etc.


Distributed development: lower cost, user input (good and bad)

30

Text Analytics Software


Features


Taxonomy Management Functions


Entity Extraction: multiple types, custom classes


Auto-categorization


Taxonomy Structure


Training sets: Bayesian, Vector space


Terms: literal strings, stemming, dictionary of related terms


Rules: simple, position in text (Title, body, url)


Boolean: full search syntax (AND, OR, NOT)


Advanced: NEAR (#), PARAGRAPH, SENTENCE


Advanced Features


Facts / ontologies / Semantic Web: RDF +


Sentiment Analysis
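
The rule-based side of auto-categorization (Boolean operators plus a NEAR proximity test) can be sketched as follows. This is a minimal illustration under stated assumptions, not any vendor's rule syntax: the `tokenize`, `has`, `near`, and `categorize` helpers and both rules are hypothetical, with category names loosely borrowed from the "China and Biotech" and "Economics and Regulatory" examples earlier in this deck.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def has(tokens, term):
    return term in tokens


def near(tokens, a, b, distance):
    """True if terms a and b occur within `distance` tokens of each other."""
    pos_a = [i for i, tok in enumerate(tokens) if tok == a]
    pos_b = [i for i, tok in enumerate(tokens) if tok == b]
    return any(abs(i - j) <= distance for i in pos_a for j in pos_b)


# Hypothetical rules: category name -> predicate over the token list.
RULES = {
    # AND combined with a proximity constraint.
    "China Biotech": lambda t: has(t, "china") and near(t, "china", "biotech", 5),
    # OR and NOT.
    "Regulatory": lambda t: (
        (has(t, "regulatory") or has(t, "regulation"))
        and not has(t, "deregulation")
    ),
}


def categorize(text):
    """Return every category whose rule matches the text."""
    tokens = tokenize(text)
    return [category for category, rule in RULES.items() if rule(tokens)]


labels = categorize("New regulatory rules affect biotech firms in China")
```

Rules like these are transparent and easy to tune by hand, which is the usual trade-off against training-set approaches such as Bayesian or vector-space classifiers.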

31

Conclusion


Variety of information and knowledge structures


Important to know what will solve what


Taxonomies and Facets are foundation elements


Build higher levels based on lower levels


Glossaries to Taxonomies


Taxonomy to Ontology / faceted navigation


Important to have good taxonomy and text analytics software (spreadsheets are OK for first draft)


Web 2.0/Folksonomies are not the answer

32

Resources


Books


Women, Fire, and Dangerous Things - George Lakoff


Knowledge, Concepts, and Categories - Koen Lamberts and David Shanks


The Stuff of Thought - Steven Pinker


Software


Tools & Techniques (Taxonomy Boot Camp)


Web Sites


Taxonomy Community of Practice: http://finance.groups.yahoo.com/group/TaxoCoP/


Questions?


Tom Reamy

tomr@kapsgroup.com

KAPS Group

Knowledge Architecture Professional Services

http://www.kapsgroup.com