Politics and Pragmatism in Scientific Ontology Construction

Mike Travers

Inconsistency Robustness 2011

Overview


Introduction


Two kinds of knowledge infrastructure


Ontological controversies: some examples


The nature of actual scientific representation


Representational pragmatism


Technical directions


Conclusions


My background


SSS

Artificial Intelligence

Knowledge Representation

Agent-based systems

Programming Languages


Media Science

Human Interface

Constructionism

Visual Programming

Scientific Software

(at startups, large companies, open source projects, and now SRI)

Scientific KM

Collaboration

Decision Support

Publishing

Standards

Philosophy of Science

Sociology

Narrative Theory

Cognitive Science

Synopsis


Knowledge representation inevitably involves
inconsistency, controversy, hence politics;


Scientific representation does too, but it has worked-out practices for dealing with it;


KR should work more like science rather than
the other way around;


Representational Pragmatism: a conceptual
framework to make it happen

What’s a knowledge infrastructure?


A system of


Technologies,


Institutions,


Standards,


and Practices


that serve to support knowledge


Collection


Storage


Curation


Sharing


Validation




Knowledge Infrastructure #1: Science


The scientific community


An elaborate web of


People (scientists and others)


Institutions (labs, journals, funding agencies, instrument
makers…)


Practices (publishing criteria, protocols, conferences)


Works pretty well! The gold standard for
knowledge in fact.


But there are issues of scaling, quality, inertia, siloing, epistemological closure…

Knowledge Infrastructure #2: The Semantic Web


Set of technical standards for sharing
formalized knowledge


Aspires to be a universal framework for
knowledge


A grand vision of global-scale knowledge representation


And tremendously important and needed.













Naming

Relations & Properties

Classification

Reasoning

Provenance

These two are becoming one…

Bioscience is by far the largest application area for semantic
web technology

Some non-robust properties of the semantic web


Too inexpressive

(Can’t represent default reasoning or n-way predicates)


Too complex

(Prevents widespread acceptance)


Too logic-based

(Emphasizes wrong things)

Convergence and Controversy


Ontologies are supposed to define a common
understanding of a domain


But “common” is easier said than done


In practice:


Many different constituencies


With different ideas about what’s important


Many side-factors complicate things (implementation cost, personal status, existing non-rigorous usages…)


Compromise is necessary but rarely produces elegant
results

Example: psychiatric illness



What constitutes a mental illness?


Not at all obvious that categories correspond to real
phenomena


Huge changes over time


Currently defined by DSM-IV through a highly politicized process


History of PTSD (Scott, 1990)


“combat fatigue” or cowardice



In and out of the DSM


Finally recognized as PTSD, partly as response to Vietnam
War

Psychiatric illness (2)



Homosexuality


Formerly a pathology, now not, through a highly politicized
process


Attention Deficit Disorder


Cluster of symptoms, not clear what the boundaries should be


Opinions often determined by theories of child-rearing or institutional aspects of school.


Insurers and economics are important actors in debate


Summary:


These disorders are socially constructed categories over a definite but unclear underlying reality.


Example: category fudging


In Pathway Tools, SRI’s bioinformatics knowledge base


This is a widely used system for curating genomes and metabolic pathways


Underlying frame system


Web-based interface

Example: Gene/Protein conflation


Genes and Proteins are different things


But biologists tend to want to use the same
name for a gene and its product


Tension between formal ontology and actual
scientific usage


Equivalently, an argument between the
computer scientists who build the system and
the biologists who use it and curate it

Gene (DNA)

Gene product (Protein)

trpA

Search for “trpA”

Moral of this somewhat trivial example


There are tensions (inconsistencies) between
formal representation and actual usage


And, software makers end up having to cope
with these tensions in design decisions


Usually in a kludgy way!


E.g., papering over the conflict in the user interface layer


Would be nice to have a better theory of how to do this.
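
One way to picture the "papering over" described above is to keep genes and proteins as distinct objects in the knowledge base while letting a name index return everything that answers to a shared name. The sketch below is only an illustration under that assumption; the `Gene`, `Protein`, and `NameIndex` names are hypothetical and do not describe how Pathway Tools is actually implemented.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """A stretch of DNA (hypothetical minimal record)."""
    name: str


@dataclass(frozen=True)
class Protein:
    """A gene product, kept distinct from the gene that encodes it."""
    name: str
    encoded_by: Gene


class NameIndex:
    """Maps a name, as biologists use it, to every entity that answers to it."""

    def __init__(self):
        self._by_name = defaultdict(list)

    def add(self, entity):
        # Index case-insensitively (an assumption made for this sketch).
        self._by_name[entity.name.lower()].append(entity)

    def search(self, name):
        # .get() avoids creating empty entries for names never indexed.
        return list(self._by_name.get(name.lower(), []))


trpA_gene = Gene("trpA")
trpA_protein = Protein("trpA", encoded_by=trpA_gene)  # same name by convention

index = NameIndex()
index.add(trpA_gene)
index.add(trpA_protein)

hits = index.search("trpA")
for hit in hits:
    print(type(hit).__name__, hit.name)
```

The formal model stays strict (a `Protein` is never a `Gene`), and the ambiguity of actual usage is confined to the lookup layer, which is exactly the kind of design decision the slide calls kludgy.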

Example: how do we classify
mitochondria?



Organelles (part of cell)


But descended from separate endosymbiotic organisms


With their own DNA


(Generally but not
universally accepted
theory)


There are consequences


“If we accept that mitochondria are bacteria, then the record books have to be rewritten. The first bacterial genome sequence was completed not by American arriviste Craig Venter …in 1995, but instead by … Fred Sanger, who completed the human mitochondrial genome sequence in 1981!”

Expressivity in Description Logics


Description Logics (DL) are the basis for
semantic web ontology.


Selected largely for computational tractability


But DL make it hard to do simple things such as
representing defaults


All cats have hair


Except for this one!


Expressivity has been traded away


A compromise and perhaps not the right one
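
To make the cat example concrete: read as a strict universal statement, "all cats have hair" plus one hairless cat is simply a contradiction. A frame-style representation instead treats the statement as a default that a more specific frame may override. The following is a minimal sketch of that idea; the `Frame` class and the individual names are made up for illustration and are not part of any semantic web standard.

```python
class Frame:
    """A named bundle of slots that inherits missing slots from its parent."""

    def __init__(self, name, parent=None, **slots):
        self.name = name
        self.parent = parent
        self.slots = dict(slots)

    def get(self, slot):
        # Walk from the most specific frame upward; the first frame
        # that mentions the slot wins, so exceptions shadow defaults.
        frame = self
        while frame is not None:
            if slot in frame.slots:
                return frame.slots[slot]
            frame = frame.parent
        raise KeyError(slot)


cat = Frame("Cat", has_hair=True)                        # default: cats have hair
tom = Frame("Tom", parent=cat)                           # ordinary cat, inherits the default
hairless = Frame("Baldy", parent=cat, has_hair=False)    # "except for this one!"

print(tom.get("has_hair"))       # True
print(hairless.get("has_hair"))  # False
```

Nothing here is inconsistent, because the default was never asserted as a universal law in the first place.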

Bruno Latour


French philosopher and
sociologist of science


Roundly reviled for perceived anti-realism


Started with
anthropological studies
of science in labs and
fields


Ends in a rather unique
view of representation
and even metaphysics


Latour for dummies


Science is a social construction

(but not an arbitrary one)


Network based: a network consists of humans and non-human actors (lab animals, instruments, funding institutions…)


Agonistic: trials of strength between networks


Understand how science works by tracing the flow of
inscriptions, abstractions, and power through these
networks


An enriched realism that provides a rich account of the relation between phenomena and representation


Dual face of science


Science under construction:

Unsettled

Contentious

Searching for allies (people, funding, tools)

Building networks of alliance

Social

Settled science:

“That’s the way it is”

Objective

Black-boxed

Politically Established

Natural


Science in the making:


E.g.: Watson and Crick’s work on the structure of DNA


Speculations (a three-strand model was proposed)


Contending theories


Eventually a winner emerges


Science made


Now that the structure of DNA is known,


it’s a “black box”


we can make instruments that measure it


representations of its sequence


Under construction


Black boxed


Where the representation meets the
road


Science is: “the transformation of rats and
mice into paper”


Situated representations



From phenomena


Lab notebook


Tables in articles


Laws of nature

Concrete, situated

Abstract, objective

Jeff Shrager, “Diary of an Insane Cell Mechanic”




Intercalation of representations and
the phenomenon

Analogizing to KR


Knowledge Construction:

Situated representations

Unsettled

Bottom-up

User interfaces

Ad-hoc structures


Knowledge Representation:

Realist

Objective

Settled

Factual

Established

Abstract

Graph structures

A new view of the relation between
world and representation


Latour refocuses epistemology


Less on the truth of representations,


More on their connection to the world via networks of actants.


Should be a natural fit for computationalists


Who also make systems of symbols with causal
connections to the world and each other

Realism vs. Conceptualism


Realism: a movement in philosophy of KR


Led mostly by Barry Smith, SUNY Buffalo (e.g. “Beyond Concepts: Ontology as Reality Representation”, 2004)


The problem: nobody knows what makes a good
ontology


His solution: Aristotelian universals


Bad ontologies are…those whose general terms lack
the relation to corresponding universals in reality, and
thereby also to corresponding instances. Good
ontologies are reality representations...

Realism is extremely annoying


Both vacuous and wrong


Vacuous: because it presupposes we know what is
real beforehand


Wrong: because it doesn’t correspond to actual
scientific knowledge representation


Examples of failure:


Higgs bosons: we don’t know if they are real


Genes: were hypothesized before their “implementation” was known; when were they real?


Software for synthetic chemistry: mixes real and not-yet-real molecular structures

Afferent: software for drug discovery
chemists


But Realism is Winning



Basis of BFO (Basic Formal Ontology)


Which is used by OBO Foundry and other bio-ontology efforts


Nobody wants to be against “realism”… so
they picked a good name

Realism only deals with half of science








May work for ready-made science, hopeless for science-in-the-making


Where we don’t know what’s real


And which is where the action is

Representational Pragmatism


Needed: a term with good connotations to compete
with “realism”.


Connects to a philosophical tradition (James, Peirce, Dewey, Rorty)


“It is astonishing how many philosophical disputes collapse into insignificance the moment that you subject them to this simple test of tracing a concrete consequence” -- James


Bottom-up rather than top-down; opposed to premature ontologizing; Latourian


Support the divergent representational practices of
actual science


Help science towards convergence, objectivity, and
realism, rather than demanding it upfront.

Some encouraging developments


Linked data vs. semantic web

A somewhat more bottom-up, pragmatic approach to universal knowledge infrastructure


Freebase, DBpedia, similar efforts


Open Science movement


Open Access Journals (PLoS, etc.)


Open Data (standards)


Open Notebook (practices)




BioBike: a platform for symbolic biocomputing


A web-based, programmable tool for advanced biocomputing



Knowledge-based


Programmable


Social


Really the inspiration of many of the ideas
here


Joint work with Jeff Shrager (Stanford), Jeff Elhai (VCU), and others




Reworked to be more social


Bio-blog menu

Integration with services

Bio-computation

Knowledge/data analysis

Commentary


Prototype-based KR


How the mind categorizes (Rosch, Lakoff)


A perennial minority theme in computation:


60s: Sutherland, Sketchpad


70s: Early frame-based KR systems


80s: Ungar and Smith, SELF programming language


90s: Ken Haase, Framer


Now: JavaScript


A structured way to manage inconsistency
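
As a rough illustration of why prototypes give a structured handle on inconsistency, the sketch below uses Python's standard `collections.ChainMap` to emulate prototype delegation: a new object is created from an existing one, lookups fall through to the prototype, and any disagreement with the prototype is recorded locally and can be listed. The bird example and the `exceptions` helper are assumptions introduced here for illustration; they are not taken from the systems named above.

```python
from collections import ChainMap

# A prototype is just an object; there is no separate class definition.
typical_bird = ChainMap({"flies": True, "lays_eggs": True, "has_feathers": True})

# Deriving a new object adds a layer whose lookups delegate to the prototype.
penguin = typical_bird.new_child({"flies": False, "swims": True})
emperor = penguin.new_child({"species": "emperor penguin"})


def exceptions(obj):
    """Slots where obj's own layer contradicts what it would otherwise inherit.

    Returns {slot: (inherited_value, own_value)}.
    """
    own, inherited = obj.maps[0], obj.parents
    return {
        slot: (inherited[slot], value)
        for slot, value in own.items()
        if slot in inherited and inherited[slot] != value
    }


print(emperor["flies"])      # False, found on penguin
print(emperor["lays_eggs"])  # True, found on typical_bird
print(exceptions(penguin))   # {'flies': (True, False)}
```

The point of the design is that the contradiction ("birds fly" versus "penguins do not") is not hidden or forbidden; it is an explicit, inspectable difference between an object and its prototype.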

Biology is prototype-based


Every feature of a biological class started out
as an exception to a general case!


aka mutation



Classes are Aristotelian


Prototypes are Darwinian


The Problems


Ontologies are plagued with inconsistencies (or
compromise) because they are inevitably the product of
different interests.



Ontologies generally only try to capture the settled science



Realism is vacuous, question-begging; if we knew at the start what was real we wouldn't need to do science



Knowledge construction is social, tentative, situated, multi-viewpoint, and only objective at its endpoints.




The Solutions


Tools that support how science is actually done, at web scale and with
greater visibility and traceability



A pragmatic view of scientific representation


That lets scientists work bottom-up from their results


That foregrounds the concrete relations between representation and reality (circulating reference)


That connects science in progress with settled science, supporting and preserving controversy, unsettledness, and argument structure



More simply: integrate data and knowledge and the processes that connect
them.



Open Science: institutions, standards, practices.



A representational infrastructure that supports prototypes, default
reasoning, and exceptions.
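
A minimal sketch of what "preserving controversy" might look like in a data structure: instead of storing one value per fact, the store keeps every competing claim together with who asserts it and how settled it is. The `Claim` and `ClaimStore` names, the source labels, and the status values are all hypothetical placeholders; the mitochondria example is borrowed from the earlier slide.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    subject: str
    predicate: str
    value: str
    source: str  # who asserts it (placeholder labels in this sketch)
    status: str  # e.g. "settled" or "contested" (illustrative values)


class ClaimStore:
    """Keeps competing claims side by side instead of overwriting them."""

    def __init__(self):
        self._claims = defaultdict(list)

    def assert_claim(self, claim):
        # A conflicting claim is appended, never used to replace an older one.
        self._claims[(claim.subject, claim.predicate)].append(claim)

    def ask(self, subject, predicate):
        return list(self._claims.get((subject, predicate), []))

    def is_contested(self, subject, predicate):
        # Contested means more than one distinct value has been asserted.
        return len({c.value for c in self.ask(subject, predicate)}) > 1


store = ClaimStore()
store.assert_claim(Claim("mitochondrion", "classified_as", "organelle",
                         source="cell-structure view", status="settled"))
store.assert_claim(Claim("mitochondrion", "classified_as", "bacterium",
                         source="endosymbiont-descent view", status="contested"))

for c in store.ask("mitochondrion", "classified_as"):
    print(f"{c.value:10} [{c.status}] per {c.source}")
print(store.is_contested("mitochondrion", "classified_as"))  # True
```

A query then returns the argument structure rather than a single verdict, leaving convergence to happen over time instead of being demanded upfront.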




Thank you!