Nov 17, 2013

Mental Functioning and Semantic Search in the Neuroscience Information Framework

Maryann Martone
Fahim Imam

Funded in part by the NIH Neuroscience Blueprint HHSN271200800035C via NIDA

Neuroscience Information Framework


http://neuinfo.org

The Neuroscience Information Framework: Discovery and utilization of web-based resources for neuroscience

A portal for finding and using neuroscience resources
A consistent framework for describing resources
Provides simultaneous search of multiple types of information, organized by category
Supported by an expansive ontology for neuroscience
Utilizes advanced technologies to search the “hidden web”


UCSD, Yale, Caltech, George Mason, Washington Univ.

Supported by NIH Blueprint

Literature
Database Federation
Registry

NIF takes a global view of resources

NIF’s goal: Discover and use resources
  Data
  Databases
  Tools
  Materials
  Services
Federated approach: Resources are developed and maintained by the community
  >150 data sources; 350M records
Agile approach: the NIF system is designed to be populated quickly and allow for incremental improvements to representation and search
  Contract specifies 25 sources/year

NIF’s Rules for using digital resources

#1: YOU HAVE TO FIND THEM!!!!!!!
#2: You have to access/open them
#3: You have to understand them

Neuroscience is inherently interdisciplinary; no one technique reveals all

What do you mean by data?

Databases come in many shapes and sizes

Primary data:
  Data available for reanalysis, e.g., microarray data sets from GEO; brain images from XNAT; microscopic images (CCDB/CIL)
Secondary data:
  Data features extracted through data processing and sometimes normalization, e.g., brain structure volumes (IBVD); gene expression levels (Allen Brain Atlas); brain connectivity statements (BAMS)
Tertiary data:
  Claims and assertions about the meaning of data, e.g., gene upregulation/downregulation, brain activation as a function of task

Registries:
  Metadata
  Pointers to data sets or materials stored elsewhere
Data aggregators:
  Aggregate data of the same type from multiple sources, e.g., Cell Image Library, SumsDB, Brede
Single source:
  Data acquired within a single context, e.g., Allen Brain Atlas

NIFSTD Ontologies

Set of modular ontologies
  86,000+ distinct concepts + synonyms
Expressed in OWL-DL language
  Supported by common DL reasoners
  Currently supports OWL 2
Closely follows OBO community best practices
Avoids duplication of efforts
  Standardized to the same upper-level ontologies, e.g., Basic Formal Ontology (BFO), OBO Relations Ontology (OBO-RO)
  Relies on existing community ontologies, e.g., CHEBI, GO, PRO, DOID, OBI, etc.
Modules cover orthogonal domains, e.g., Brain Regions, Cells, Molecules, Subcellular parts, Diseases, Nervous system functions, etc.

Bill Bug et al.

Importing into NIFSTD

NIF converts to OWL and aligns to BFO, if not already
  Facilitates ingestion, but can have negative consequences for search if model adds computational complexity
  Data sources do not make careful distinctions but use what is customary for the domain
Modularity: NIF seeks to have single coverage of a sub-domain
  We are not UMLS or BioPortal
NIF uses MIREOT to import individual classes or branches of classes from large ontologies
  NIF retains identifier of source
NIF uses IDs for names, not text strings
  Avoids collision
  Allows retiring of class without retiring the string

NIFSTD has evolved as the ontologies have evolved; had to make many compromises based on ontologies and tools available
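
The point about using IDs rather than text strings as class names can be illustrated with a small sketch. This is a toy registry, not NIF's implementation; the class identifiers (`EX_0001`, `EX_0002`), the source identifiers, and the `ClassRegistry` API are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class OntologyClass:
    class_id: str                    # opaque identifier that names the class
    label: str                       # human-readable string; may be shared
    source_id: Optional[str] = None  # identifier retained from the source ontology
    deprecated: bool = False


class ClassRegistry:
    """Toy registry keyed by identifier rather than by label."""

    def __init__(self) -> None:
        self._by_id: Dict[str, OntologyClass] = {}

    def add(self, cls: OntologyClass) -> None:
        # Only identifiers must be unique; labels are allowed to repeat.
        if cls.class_id in self._by_id:
            raise ValueError(f"duplicate identifier: {cls.class_id}")
        self._by_id[cls.class_id] = cls

    def retire(self, class_id: str) -> None:
        # Retiring marks the class as deprecated; the label string stays usable.
        self._by_id[class_id].deprecated = True

    def find_by_label(self, label: str) -> List[OntologyClass]:
        wanted = label.lower()
        return [c for c in self._by_id.values()
                if c.label.lower() == wanted and not c.deprecated]


registry = ClassRegistry()
# Two hypothetical classes from different sources that share one label.
registry.add(OntologyClass("EX_0001", "Nucleus", source_id="SRC_A:17"))
registry.add(OntologyClass("EX_0002", "Nucleus", source_id="SRC_B:42"))

both = [c.class_id for c in registry.find_by_label("nucleus")]
registry.retire("EX_0001")
remaining = [c.class_id for c in registry.find_by_label("nucleus")]
print(both, remaining)
```

Because the label is only an attribute, two classes can share the string "Nucleus" without colliding, and retiring one of them leaves the string available for the other.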

NIFSTD Modules and Sources

NIFSTD Modules | External Source | Import/Adapt
Organismal taxonomy | NCBI Taxonomy, GBIF, ITIS, IMSR, Jackson Labs mouse catalog; the model organisms in common use by neuroscientists are extracted from NCBI Taxonomy and kept in a separate module with mappings | Adapt
Molecules, Chemicals | IUPHAR ion channels and receptors, Sequence Ontology (SO); NIDA drug lists from ChEBI, and imported Protein Ontology (PRO) | Adapt/Import
Sub-cellular anatomy | Sub-cellular Anatomy Ontology (SAO). Extracted cell parts and subcellular structures from SAO-CORE. Imported GO Cellular Component with mapping. | Adapt/Import
Cell | CCDB, NeuronDB, NeuroMorpho.org terminologies; OBO Cell Ontology was not considered as it did not contain region-specific cell types | Adapt
Gross Anatomy | NeuroNames extended by including terms from BIRNLex, SumsDB, BrainMap.org, etc.; multi-scale representation of nervous system, macroscopic anatomy | Adapt
Nervous system function | BIRN, BrainMap.org, MeSH, and UMLS; GO biological functions | Adapt
Nervous system dysfunction | Nervous system disease from MeSH, NINDS terminology; imported Disease Ontology (DO) with mapping | Adapt/Import
Phenotypic qualities | Phenotypic Quality Ontology (PATO); imported as part of the OBO Foundry core | Import
Investigation: reagents | Overlaps with molecules above from ChEBI, SO, and PRO | Adapt/Import
Investigation: instruments, protocols, plans | CogPO, BIRNLex | Adapt
Investigation: resource type | NIF, OBI, NITRC, Biomedical Resource Ontology (BRO) | Adapt
Biological Process | Gene Ontology (GO) biological process | Import

What are the connections of the hippocampus?

Hippocampus OR “Cornu Ammonis” OR “Ammon’s horn”

Query expansion: Synonyms and related concepts
Boolean queries
Data sources categorized by “data type” and level of nervous system
Common views across multiple sources
Tutorials for using full resource when getting there from NIF
Link back to record in original source

Entity mapping

BIRNLex_435
Brodmann.3

Explicit mapping of database content helps disambiguate non-unique and custom terminology
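
As a rough illustration of entity mapping, the sketch below resolves a database's local term to an ontology identifier. Only the pairing of `Brodmann.3` with `BIRNLex_435` comes from the slide; the source names (`example_db_a`, `example_db_b`), the second local term, and the `map_entity` function are assumptions made for the example, not NIF's actual mapping tables or API.

```python
from typing import Dict, Optional, Tuple

# (source database, local term) -> ontology identifier.
# Only the "Brodmann.3" / BIRNLex_435 pairing is taken from the slide;
# the database names and the "BA3" entry are hypothetical.
ENTITY_MAP: Dict[Tuple[str, str], str] = {
    ("example_db_a", "Brodmann.3"): "BIRNLex_435",
    ("example_db_b", "BA3"): "BIRNLex_435",
}


def _key(source: str, term: str) -> Tuple[str, str]:
    # Normalize whitespace and case so trivially different spellings match.
    return (source, " ".join(term.split()).lower())


_INDEX: Dict[Tuple[str, str], str] = {
    _key(source, term): ident for (source, term), ident in ENTITY_MAP.items()
}


def map_entity(source: str, local_term: str) -> Optional[str]:
    """Return the ontology identifier for a source-specific term, if mapped."""
    return _INDEX.get(_key(source, local_term))


print(map_entity("example_db_a", "brodmann.3"))
print(map_entity("example_db_b", " BA3 "))
print(map_entity("example_db_a", "BA3"))  # mappings are per source
```

Keying the map on the source as well as the term is a deliberate choice in this sketch: a custom abbreviation is only trusted for the database that is known to use it.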


NIF Concept-Based Search

Search Google: GABAergic neuron
Search NIF: GABAergic neuron
NIF automatically searches for types of GABAergic neurons
Types of GABAergic neurons
Defined by OWL axioms
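
A minimal sketch of the idea behind concept-based search, assuming a defined class is reduced to a single property check ("a neuron that has GABA as a neurotransmitter"). A real system would hand OWL axioms to a DL reasoner; here a plain Python filter stands in for that step, and the neuron list, `DEFINED_CLASSES`, and `expand_concept_query` are illustrative names rather than NIFSTD content or NIF code.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class NeuronType:
    name: str
    neurotransmitters: Set[str] = field(default_factory=set)


# Illustrative asserted facts; not NIFSTD content.
NEURON_TYPES: List[NeuronType] = [
    NeuronType("Cerebellar Purkinje cell", {"GABA"}),
    NeuronType("Hippocampal CA1 pyramidal cell", {"Glutamate"}),
    NeuronType("Neocortex basket cell", {"GABA"}),
]

# Defined classes: search term -> neurotransmitter every member must have.
DEFINED_CLASSES: Dict[str, str] = {"gabaergic neuron": "GABA"}


def expand_concept_query(term: str, neurons: List[NeuronType]) -> List[str]:
    """Return the term followed by every neuron type inferred to belong to it."""
    transmitter = DEFINED_CLASSES.get(term.lower())
    if transmitter is None:
        return [term]  # not a defined class: nothing to infer
    members = sorted(n.name for n in neurons if transmitter in n.neurotransmitters)
    return [term] + members


print(expand_concept_query("GABAergic neuron", NEURON_TYPES))
```

The neuron types are never labeled "GABAergic" in the data; membership is inferred from the neurotransmitter property, which is what lets a search for the concept return its types.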

Ontological Query Expansion through OntoQuest

Example Query Type | Ontological Expansion
A single term query for Hippocampus and its synonyms | synonyms(Hippocampus); expands to Hippocampus OR "Cornu ammonis" OR "Ammon's horn" OR "hippocampus proper".
A conjunctive query with 3 terms | transcription AND gene AND pathway
A 6-term AND/OR query with one term expanded into synonyms | (gene) AND (pathway) AND (regulation OR "biological regulation") AND (transcription) AND (recombinant)
A conjunctive query with 2 terms, where a user chooses to select the subclasses of the 2nd term | synonyms(zebrafish AND descendants(promoter,subclassOf)); zebrafish gets expanded by synonym search and the second term transitively expands to all subclasses of promoter as well as their synonyms.
A single term query for an anatomical structure where a user chooses to select all of the anatomical parts of the term along with synonyms | synonyms(descendants(Hippocampus,partOf)); expands to all parts of hippocampus and all their synonyms through the ontology. All parts are joined as an “OR” operation.
A conjunctive query with 2 terms, where a user chooses to select all the equivalent terms for the 2nd term | synonyms(Hippocampus) AND equivalent(synonyms(memory)); the second term uses the ontology to find all terms that are equivalent to the term memory by ontological assertion, along with synonyms.
A conjunctive query with 2 terms, where a user is interested in specific subclasses for both of the terms | synonyms(x:descendants(neuron,subclassOf) where x.neurotransmitter='GABA') AND synonyms(gene where gene.name='IGF'); x is an internal variable.
A query to seek all subclasses of neuron whose soma location is in any transitive part of the hippocampus | synonyms(x:descendants(neuron,subclassOf) where x.soma.location = descendants(Hippocampus,partOf))
A query to seek a conceptual term that is semantically equivalent to a collection of terms rather than a single term | 'GABAergic neuron' AND Equivalent('GABAergic neuron'); the term is recognized as ontologically equivalent to any neuron that has GABA as a neurotransmitter and therefore expands to a list of inferred neuron types.
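
The `synonyms(...)` and `descendants(...)` expansions in the table above can be sketched over a toy graph. This is not OntoQuest's implementation: the Hippocampus synonyms are the ones quoted in the table, while the `partOf` edges, the CA1/CA3 terms, and the choice to include the starting term among its own descendants are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Synonyms quoted in the table for Hippocampus.
SYNONYMS: Dict[str, List[str]] = {
    "Hippocampus": ["Cornu ammonis", "Ammon's horn", "hippocampus proper"],
}

# (child, relation, parent) edges; illustrative only.
EDGES: List[Tuple[str, str, str]] = [
    ("CA1", "partOf", "Hippocampus"),
    ("CA3", "partOf", "Hippocampus"),
    ("CA1 stratum oriens", "partOf", "CA1"),
]


def synonyms(terms: List[str]) -> List[str]:
    """Each term followed by its synonyms, order preserved, no duplicates."""
    out: List[str] = []
    for term in terms:
        for name in [term] + SYNONYMS.get(term, []):
            if name not in out:
                out.append(name)
    return out


def descendants(term: str, relation: str) -> List[str]:
    """The term plus everything reachable by following `relation` transitively."""
    found = [term]
    frontier = [term]
    while frontier:
        parent = frontier.pop(0)  # breadth-first traversal
        for child, rel, target in EDGES:
            if rel == relation and target == parent and child not in found:
                found.append(child)
                frontier.append(child)
    return found


def to_boolean_or(terms: List[str]) -> str:
    """Join terms with OR, quoting any multi-word term."""
    return " OR ".join(f'"{t}"' if " " in t else t for t in terms)


print(to_boolean_or(synonyms(["Hippocampus"])))
print(to_boolean_or(synonyms(descendants("Hippocampus", "partOf"))))
```

The first call reproduces the expansion shown in the first table row; the second composes the two operators the way `synonyms(descendants(Hippocampus,partOf))` is described, joining all parts and their synonyms with OR.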

OntoQuest

NIF’s ontology management system for NIFSTD ontologies
  Implements various graph search algorithms for ontological graphs
  Automated query expansion for NIFSTD terms, including the ones with defined logical restrictions

Gupta et al., 2010

NIF information space

NIF developed a tiered system
Domain knowledge
  What you would teach someone coming into your domain
  NIFSTD/OntoQuest
  All upper-level BFO categories are suppressed
Claims based on data
  Bridge files across domains (constructed by NIF), databases, triple stores, text
Data
  Relational databases
  Spreadsheets

Concepts
Data
Knowledge Base
Concepts, Entities + data summaries

Scientists search via the terms they use, not what we would like them to use; NIF needs a broad net to find relevant resources

When searching across broad information sources, need to search for what people are looking for

What genes are upregulated by drugs of abuse in the adult mouse?

Gene upregulated mice illegal drug

NIF “translates” common concepts through ontology and annotation standards

What genes are upregulated by drugs of abuse in the adult mouse?
  Morphine
  Increased expression
  Adult Mouse
Arbitrary but defensible

NIFSTD and NeuroLex Wiki

Semantic wiki platform
  Provides simple forms for structured knowledge
  People can add concepts, properties, and annotations
  Generate hierarchies without having to learn complicated ontology tools
Community can contribute
  Relax rules for NIFSTD so dedicated domain scientists can contribute their knowledge and review other contributions
  Teaches structuring of knowledge via red links/blue links
Process is tracked and exposed
  Implemented versioning

Larson et al.

Readily indexed by Google; queries to NIF data via NIF navigator

NeuroLex Content Structure

Stephen D. Larson et al.

NeuroLex is becoming a significant knowledge base

Top-Down vs. Bottom-Up

Top-down ontology construction
  A select few authors have write privileges
  Maximizes consistency of terms with each other
  Making changes requires approval and re-publishing
  Works best when domain to be organized has: small corpus, formal categories, stable entities, restricted entities, clear edges
  Works best with participants who are: expert catalogers, coordinated users, expert users, people with authoritative source of judgment
Bottom-up ontology construction
  Multiple participants can edit the ontology instantly
  Semantics are limited to what is convenient for the domain
  Not a replacement for top-down construction; sometimes necessary to increase flexibility
  Necessary when domain has: large corpus, no formal categories, no clear edges
  Necessary when participants are: uncoordinated users, amateur users, naïve catalogers
  Neuroscience is a domain that is less formal and neuroscientists are more uncoordinated

Larson et al.

NIFSTD
NEUROLEX

Engaging domain scientists

[Diagram labels: Memory; Mental process; Cognitive process; Recall; Retrieval; Encoding; Disposition; Planned process; Continuant; Episodic; Non-declarative; Mental state; ?]

Mental functioning is difficult to define and dissect
  Very few behaviors are “pure”
  Operationally defined through experiments
What is a mental function?
  Activity, state, function, process
Subtypes are rarely disjoint
  Episodic memory
  Semantic memory
  Procedural memory
  Declarative memory
Distinctions among paradigms, assessments, tests, rating scales, tasks are often subtle

Early work done in BIRN; later terms added by students and curators

NeuroLex does not adhere strictly to BFO

Concepts and things happily co-exist; content gets reconciled over time

Nevertheless...
  We do not allow duplicates
  We do not allow multiple inheritance
    Use “role” to shortcut many relations
  We do try to re-factor contributions so as to avoid collisions across our domains
  But...once they are in the wiki, they will move about and be added to as necessary

Neuinfo.org/neurolex/wiki/COGPO_00123

Cognitive-related searches through NIF

fear prefrontal arousal
Attention and distraction
Passive viewing
stroop effect
sequence learning
studies done on the cognitive-behavioral model of addiction
memory recall
self-administration
Visual oddball paradigm
Sexual Orientation
Face recognition
neurophysiology of language
Olfaction
Consciousness
Gustatory

Scientists tend to focus on tests and general concepts rather than deep considerations of cognitive processes

Mental Functioning: What NIF needs

Computable taxonomies of test (assessments, paradigms, tasks) types
  Test types should be related to the function they purport to measure but will only be an approximation
  Not just human!!!
Computable operational definitions of cognitive concepts
  Translates tests into concepts used in search
    Dementia rating scale scores = Dementia
    Smoking assessment scores = smoker
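
One way to picture "translating tests into concepts used in search" is a lookup from assessment type to search concept. The two pairings below are the ones on the slide; the record layout, the score values, and the `search` function are hypothetical. The sketch also assumes that the mere presence of a score is enough to index a record under the concept, which is a simplification of a real operational definition.

```python
from typing import Dict, List, Set

# Assessment type -> search concept. Both pairs are taken from the slide.
ASSESSMENT_TO_CONCEPT: Dict[str, str] = {
    "dementia rating scale": "Dementia",
    "smoking assessment": "smoker",
}


def concepts_for_record(record: dict) -> Set[str]:
    """Concepts implied by the assessments a record carries scores for."""
    found: Set[str] = set()
    for assessment in record.get("scores", {}):
        concept = ASSESSMENT_TO_CONCEPT.get(assessment.lower())
        if concept is not None:
            found.add(concept)
    return found


def search(records: List[dict], concept: str) -> List[str]:
    """Identifiers of records indexed under the given concept."""
    return [r["id"] for r in records if concept in concepts_for_record(r)]


# Made-up records; identifiers and score values are placeholders.
records = [
    {"id": "rec-1", "scores": {"Dementia rating scale": 118}},
    {"id": "rec-2", "scores": {"Smoking assessment": 4, "Stroop task": 0.82}},
    {"id": "rec-3", "scores": {}},
]

print(search(records, "Dementia"))
print(search(records, "smoker"))
```

A search for "Dementia" then finds records that never mention the word, because the test type they contain has been mapped to that concept.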



Concluding Remarks

NIFSTD is utilized to provide a semantic index to heterogeneous data sources
BFO allows us to promote broad semantic interoperability between biomedical ontologies
The modularity principle allows us to limit the complexity of the base ontologies
NIF defines a process for attaching complex semantics to neuroscience concepts through NIFSTD and the NeuroLex collaborative environment
NIF encourages the use of community ontologies
Moving towards building a rich knowledge base for neuroscience that integrates with larger life science communities



Points of Discussion

CogPO/CogAT/NEMO/MHO Harmonization?
  What kind of interplay are we looking at?
  Is it about re-use of ontological vocabularies?
    What should be the best practice for reuse?
    Re-using URI vs. creating new class and mapping
    Non-semantic reuse of classes as entities (e.g., MIREOT)
  Is it about building new relationships between the entities covered in all four of these ontologies?
    What do we achieve through doing this?
  Are we trying to connect all the curated/annotated experimental data sets to a common semantic layer?
  All of the above?
What should be NIF's role?
  How can we help to expose your experiments and results to a broader audience through our interface?
  What kind of involvement can people have in terms of re-using your ontological content or contributing to your content?
  We want to be the 'host' of all the NS concepts and entities, but not necessarily the 'maintainer'.


What ontology isn’t (or shouldn’t be)

A rigid top-down fixed hierarchy for limiting expression in the neurosciences
  Not about restricting expression but how to express meaning clearly and in a machine-readable form
A bottomless resource-eating pit that consumes dollars and returns nothing
A cure-all for all our problems
A completely solved area
  Applied vs. theoretical
Easy to understand

Mike Bergman