INFO4990 Assignment 2_1 Literature Review
Wei Hang ZHANG
15/10/2013

Literature Review

Converting Medical Text to Diagnosis Code

Wei Hang ZHANG

1. Background

As health care delivery depends firmly on accurate and detailed clinical data, an important task of medical informatics is to facilitate access to this information and enhance its quality, thereby improving the accuracy of clinical outcomes.

The first step of such computer analysis is to extract data from various medical reports and reformat them into a structured, coded form. This conversion task is often followed by text classification: text classifiers can be built to automatically detect and extract the medical condition features in the reports and convert them into predefined medical codes or terms.

Although such a classifier could be built through manual work, the difficulty and expense of doing so, as well as the coordination required between medical experts and knowledge engineers, have led researchers to investigate inductive learning algorithms in pursuit of automatically generating classifiers for clinical reports.

2. A Snippet of Text Categorisation

Commonly, natural language processing (NLP) technologies have been used to structure narrative clinical data by extracting observations and descriptive modifiers from free-text reports, for later use by machine learning algorithms.

Aas et al. [1] gave a main structure of text categorisation in 1999. In the report, they enumerated and analysed the main methods and algorithms used within the text categorisation process, and compared their experimental results on the Reuters-21578 collection with previous work done by other researchers. Figure 1 gives the main steps of the text categorisation process.



Fig.1. Text Classification Process
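As a minimal sketch of the first stages of this pipeline (tokenisation, stop-word deletion, stemming, and vector representation), the following Python fragment illustrates the idea; the stop-word list and the suffix-stripping "stemmer" are deliberately simplified placeholders, not the methods used in [1]:

```python
import re
from collections import Counter

STOPWORDS = {"the", "of", "and", "a", "in", "is", "with"}  # tiny placeholder list

def stem(token):
    # crude suffix stripping standing in for a real stemmer (e.g. Porter)
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def vectorise(document):
    # tokenise -> delete stop-words -> stem -> bag-of-words vector (indexing)
    tokens = re.findall(r"[a-z]+", document.lower())
    kept = [stem(t) for t in tokens if t not in STOPWORDS]
    return Counter(kept)

vec = vectorise("The lungs are clear, with no evidence of pneumonia.")
```

A feature selection or transformation step would then reduce this sparse vector before it reaches the learning algorithm.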

Ikonomakis et al. [2] illustrated this procedure from the angle of machine learning techniques. They performed experiments on the Reuters Corpus Volume I (RCV1) collection, and pointed out two assumptions: a) the training corpus affects classifier performance to some degree, and b) training corpora of higher quality tend to yield classifiers with better performance.

3. Efforts on Medical Text Categorisation

Manual classification of diagnoses is obviously a labour-intensive process that consumes significant resources. It is therefore worthwhile for researchers to develop automated systems to carry out such medical text classification tasks.

3.1. Early System

Based on expert knowledge concepts, Yang et al. [3] produced a system named ExpNet, which used category-ranking methods for automatically coding the diagnosis reports at the Mayo Clinic. The ExpNet technique extended and enhanced previous techniques (Linear Least Squares Fit and Latent Semantic Indexing) and reached a level where the average precision was 83% and recall was 81%. One weakness of this system was that its automatic coding method only worked well with short phrases (fewer than six words) and merely a single diagnostic rubric.

To evaluate expert knowledge systems, Chapman et al. [4] compared the outcomes of expert-crafted rules, a Bayesian network, and a decision tree against a collection of chest X-ray reports supporting acute bacterial pneumonia. They randomly selected 292 reports encoded by a Natural Language Processing (NLP) system, and mistakes occurring in the encoded reports were manually corrected. In their implementation, three expert systems were
(Fig. 1 pipeline: Read Document → Text Tokenization → Lexical Verification → Delete Stopwords → Stemming → Vector Representation of Text (Indexing) → Feature Selection and/or Feature Transformation → Learning Algorithm → Classifier.)


employed to determine whether the encoded observations supported pneumonia. The output of each expert system was compared with that of the other two to vote on a result, and the result was further judged by four physicians. The conclusion was that all three expert systems performed comparably to the physicians.

3.2. Data Index Support for Categorisation

As data sizes grow, manual indexing remains an expensive and labour-intensive activity. Take the National Library of Medicine (NLM) [4] as an example: the total costs cover data entry, indexing and revising, staff, equipment, telecommunications, etc. Besides, indexers must be highly trained not only in indexing practice but also in the relevant domains.



MTI

To address this problem, researchers at NLM developed the Medical Text Indexer (MTI) for both semi-automated and fully automated indexing tasks. Aronson et al. [5] reported an experiment conducted on NLM's MEDLINE database to evaluate MTI's performance. They invited ten volunteer indexers to use the web-based tool DCMS to index MEDLINE citations. When they indexed each article for a journal in the experiment, MTI recommended 25 related terms, which could be included in their normal indexing. After the experiment, the volunteers were asked to complete questionnaires reflecting on MTI's performance. Their results indicated that MTI's performance varies significantly depending on the features of journals, such as titles and author names.



Cooperation of MTI and Domain Knowledge

The unsupervised methods within MTI were later integrated with machine learning techniques and successfully applied to the classification processing in the Genomics Track evaluations [6, 7]. Building on that, Aronson et al. [8] described an ensemble indexing and classification system, which turned out to perform well in information retrieval and medical text classification, and successfully completed a new task of assigning ICD-9-CM codes to clinical records and the impression sections of radiology diagnoses. They used various stacks of k-NN, SVM, a simple pattern-matching mechanism, and a modified MTI, along with a training corpus of approximately 1000 anonymised and abbreviated radiology reports governed by gold-standard ICD-9-CM assignments; this ensemble system produced an F-score of 0.85.

Their work confirmed that, with the help of advanced statistical algorithms, even basic methods combined with domain knowledge are applicable to medical text.


3.3. Deeper Focus on Domain Knowledge

NLP systems have commonly been used in the preparation phase to structure free-text clinical data by extracting observations and descriptive modifiers. To prevent substantial variation in data preparation, expert knowledge can be used to determine the subset of attributes or features for the classification work.

Wilcox and Hripcsak [9] suggested a method of using domain knowledge for feature selection to enhance the performance of machine learning algorithms. Later they delivered an analysis [10] of the effect of expert knowledge on the inductive learning process when creating classifiers for medical text reports. They randomly selected 200 reports from a set of chest radiograph data, which had already been classified by physicians into 6 clinical conditions (i.e. 6 classes). Using NLP, they restructured the medical texts and created classifiers based on various degrees and types of expert knowledge, combined with different inductive learning algorithms. They measured the cost of inducing the classifiers, training-set size efficiency, and the classifiers' performance.

The results showed that for medical text report categorisation tasks, expert knowledge acquisition was more significant and more effective than knowledge discovery. Therefore, to build classifiers, one should focus more on acquiring knowledge from experts rather than trying to learn the knowledge inductively.
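The expert-driven feature selection idea in [9, 10] can be sketched minimally as below: the attribute set is restricted to terms an expert deems relevant before any learning takes place. The feature list and report text here are invented for illustration only:

```python
from collections import Counter

# invented expert-supplied feature subset for chest-radiograph reports
EXPERT_FEATURES = {"infiltrate", "effusion", "pneumonia", "cardiomegaly"}

def select_features(bag_of_words, allowed=EXPERT_FEATURES):
    # keep only the attributes an expert judged relevant to the target classes
    return {term: n for term, n in bag_of_words.items() if term in allowed}

bag = Counter("left lower lobe infiltrate consistent with pneumonia".split())
features = select_features(bag)
```

Any inductive learner then trains on this reduced, expert-vetted attribute space rather than the full vocabulary.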

4. Medical Coding Implementations and Analyses

4.1. Base Line

Building on the groundwork laid by Yang [3], Pakhomov et al. [11] implemented an automatic diagnosis coding system that made it possible to support the specially trained medical coders who categorise diagnoses for billing and research purposes. According to a pre-defined classification scheme, their system used a certainty concept indicated by example-based classification to assign classification codes to natural-language diagnostic statements generated from the MI-indexed EMR database at the Mayo Clinic.

It was assumed that the diagnostic statements were highly repetitive, so that new diagnosis reports could be accurately and automatically coded simply by looking them up in the database of previously classified entries (this database stored 22 million manually coded entries). Therefore, codes would be generated simply by matching the diagnostic text to frequent examples in the database.

Manual review was needed only if the codes were generated at a lower certainty level. Their best result achieved a macro-averaged 98.0% precision, 98.3% recall and an f-score of 98.2%. Over two thirds of the

diagnoses were coded automatically with high accuracy.
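The lookup-with-certainty idea behind this baseline can be sketched as follows; the stored entries, codes, frequencies, and the 0.9 threshold are invented placeholders, not values from [11]:

```python
# invented store of previously coded diagnostic statements:
# statement text -> {code: how often that code was assigned to this text}
CODED_ENTRIES = {
    "acute appendicitis": {"540.9": 47, "541.0": 3},
    "type 2 diabetes mellitus": {"250.00": 120},
}

def assign_code(statement, threshold=0.9):
    counts = CODED_ENTRIES.get(statement.lower().strip())
    if not counts:
        return None, "manual review"          # statement never seen before
    code, top = max(counts.items(), key=lambda kv: kv[1])
    certainty = top / sum(counts.values())    # dominance of the top code
    route = "automatic" if certainty >= threshold else "manual review"
    return code, route

code, route = assign_code("Acute appendicitis")
```

Statements whose dominant code falls below the certainty threshold, or which have never been seen, are routed to manual review, mirroring the division of labour described above.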

4.2. ICD

The International Statistical Classification of Diseases and Related Health Problems (ICD) provides medical codes to classify diseases and a wide variety of descriptions, such as signs, symptoms, abnormal findings, complaints, social circumstances, and external causes of injury or disease. The goal of the ICD system is to represent each health condition uniformly through its schemas and thereby group diseases into categories.

Nowadays, research topics and developments on ICD are still flourishing, especially in the medical coding domain.



Application: "automatic code assignment system"

Crammer et al. [12] integrated three coding systems into one for assigning ICD-9-CM medical codes to unstructured radiology reports. In their implementation, three automated systems were developed first, along with a learning system equipped with natural language processing functionality. A rule-based system was designed to assign the codes by matching them against the medical texts and the ICD code descriptions. The rule-based system required no training process but used the ICD-9-CM codes and code descriptions. For a given report, the system parsed both the clinical history and the impression section into sentences, then checked the sentences against the code descriptions, setting a flag whenever the description words occurred. If a matched code was a disease and negation words appeared in the sentence, the flag would be removed. In the final stage, a specialised system selected the most common codes as the assignment values.

In the Computational Medicine Centre's challenge, their system was evaluated on the labelled training data (978 radiology reports, split 80/20) and performed outstandingly on the test data (976 documents). Compared with both human annotators and other automated systems, this combined system performed better than each individual system.
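The rule-based component described above can be sketched roughly as follows; the code descriptions and the negation list are tiny invented stand-ins for the ICD-9-CM data used in [12]:

```python
import re

# toy stand-ins for ICD-9-CM codes and their descriptions
CODE_DESCRIPTIONS = {"486": "pneumonia", "786.2": "cough"}
NEGATIONS = {"no", "not", "without", "denies"}

def flag_codes(report):
    flags = set()
    for sentence in re.split(r"[.;]", report.lower()):
        words = set(re.findall(r"[a-z]+", sentence))
        for code, description in CODE_DESCRIPTIONS.items():
            # set a flag when the description word occurs in the sentence,
            # unless a negation word occurs in the same sentence
            if description in words and not words & NEGATIONS:
                flags.add(code)
    return flags

codes = flag_codes("Persistent cough. No evidence of pneumonia.")
```

Real code descriptions span multiple words and negation scoping is subtler, but the flag-then-negate control flow is the essence of the rule-based pass.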



Application: "shared-task involving multi-label classification system"

Pestian et al. [13] reported their system for the same task as above. What they presented was a shared task involving a multi-label classification system. First, medical jargon, abbreviations, and acronyms were filtered out because they turned out to be ambiguous. Secondly, in consideration of patient privacy and the machine-learning methods, a) all human first names were replaced with "Jane" or "John" depending on gender, and b) all surnames were substituted with "Johnson".
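This de-identification step can be sketched with a simple substitution pass; the name tables below are invented placeholders, and a real system would decide between "Jane" and "John" from patient gender metadata rather than a lookup list:

```python
import re

# invented lookup tables; a real system would use gender metadata
FIRST_NAMES = {"mary": "Jane", "susan": "Jane", "robert": "John", "david": "John"}
SURNAMES = {"smith", "miller", "garcia"}

def deidentify(text):
    def swap(match):
        word = match.group(0).lower()
        if word in FIRST_NAMES:
            return FIRST_NAMES[word]   # "Jane" or "John" depending on gender
        if word in SURNAMES:
            return "Johnson"           # every surname becomes "Johnson"
        return match.group(0)
    return re.sub(r"[A-Za-z]+", swap, text)

clean = deidentify("Patient Mary Smith was seen by Dr. Robert Garcia.")
```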

Interestingly, manual inspection was adopted before medical code assignment. All data were manually reviewed; as a result, the data

potentially violating PHI regulations were deleted, and geographic words were changed. The data were then annotated by the coding staff and two independent coding companies. At last, they performed a majority-annotation process, analysing agreement statistics to suggest the final codes.

By the macro-averaged F-measure over 1167 correct label assignments, as well as a cost-sensitive measure, this system performed even better than Crammer's [12].

(The ICD-9 was delivered by the WHO in 1977. Development of ICD-10 began in 1983 and finished in 1992, and the first draft of ICD-11 was expected in 2008.)

4.3. SNOMED CT

The Systematized Nomenclature of Medicine (SNOMED) is another classification system, which possesses its own multi-axial and hierarchical structure.



Comparison with ICD

Helen Moore [14] made a matching comparison between SNOMED CT and ICD-10-AM. She extracted medical terms from 160 paper-based medical records, and coded the terms using the SNOMED CT and ICD-10-AM schemas separately. Based on that step, a rating process compared the two systems on two features: whether a match existed, and whether the coded terms specifically related to clinical concepts.

The outcome of her work indicated that ICD-10-AM exactly matched 2.7% of the source terms while SNOMED CT achieved 48.6%. On relevance, by contrast, ICD-10-AM was in most cases even more specific than SNOMED CT, which reached 72.2%. She suggested that SNOMED CT would be suitable for consideration for adoption as the clinical terminology system.



Application

Melton et al. [15] applied the SNOMED CT schema in their patient-based similarity metrics, which were used as an important case-based reasoning tool to assist patient care applications. All patient cases (collected from 1989 to 2003) in the Columbia University Medical Centre data repository were converted to SNOMED CT concepts using automated tools (for instance, the demographic and ICD9-CM codes were converted to SNOMED CT concepts using MRCONSO from the UMLS). All 5 metrics were computed overall and along each of the 18 SNOMED CT axes; four of the metrics were applied with SNOMED CT defined relationships.

This application showed that, when constructing the distance metrics, both the defined relationships of the terminology and the principles of information content provided valuable information; meanwhile, the SNOMED CT axes were helpful for narrowing in on the features used in expert determination of similarity.




Evaluations

For such a system, broad terminology and concept coverage is needed for the comprehensive encoding of medical diagnoses in the real world. In 2003, Wasserman et al. [16] evaluated SNOMED CT on these two features. They used a computerized physician order entry (CPOE) system to check all submitted requests for clinical terms that were not represented in SNOMED CT. The results showed that SNOMED CT covered 88.4% of their prepared diagnoses and problem-list terms, and achieved concept coverage of 98.5%. These scores indicated that SNOMED CT was "a relatively complete standardized terminology on which to base a vocabulary for the clinical problem list" [16].

Richesson et al. [17] made a similar estimation of SNOMED CT. They evaluated the coverage provided by SNOMED CT for clinical research concepts, and further the semantic nature of those concepts. They used 17 case-report forms (CRFs), from which a set of 616 items was identified and coded by the presence and nature of SNOMED CT coverage. A basic frequency analysis showed that more than 88% of the core clinical concepts from these data items were covered by SNOMED CT. They concluded that although less suited to representing all the information recorded on CRFs, SNOMED CT represented clinical concepts well.


4.4. UMLS

Since ICD-9, Read Codes, MedDRA, CPT, etc. are represented in terms of various "coding schemes", researchers have started to find ways to relate these disparate biomedical ontologies [18]. One of their achievements is the Unified Medical Language System (UMLS). As a compendium of many controlled vocabularies in the biomedical field, it provides a mapping structure between these vocabularies and therefore supports translating terms between the various terminology systems. It may also be considered a comprehensive thesaurus and ontology of biomedical concepts.
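The mapping role of the UMLS can be pictured as a concept-keyed crosswalk, sketched below; the concept identifier and all codes are invented placeholders, not real UMLS, ICD, SNOMED, or MeSH content:

```python
# Toy crosswalk in the spirit of the UMLS Metathesaurus: one concept identifier
# ties the "same meaning" together across source vocabularies.
# All identifiers and codes below are invented placeholders.
CONCEPTS = {
    "C0000001": {"ICD-9-CM": "123.4", "SNOMED-CT": "0000001", "MeSH": "D000001"},
}
TERM_TO_CONCEPT = {"pneumonia": "C0000001"}

def translate(term, target_vocabulary):
    # map a term to its concept, then read off the target vocabulary's code
    concept = TERM_TO_CONCEPT.get(term.lower())
    if concept is None:
        return None
    return CONCEPTS[concept].get(target_vocabulary)

snomed_code = translate("Pneumonia", "SNOMED-CT")
```

Because every source code points at a shared concept, translation between any pair of vocabularies reduces to two lookups through the concept layer.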



Language Support and Synonymy Description

Michael Schopen [19] reported some facts about the integrated vocabularies: they include the Medical Subject Headings (MeSH) in eight languages, ICPC-93 in 14 languages, the WHO Adverse Drug Reaction Terminology in 5 languages, SNOMED-2, SNOMED-3, and the UK Clinical Terms (formerly Read Codes). The WHO version of ICD-10 is available in two languages: English (plus an Americanized version) and German. Furthermore, the Australian modification ICD-10-AM has been integrated (also with an additional Americanized version). ICD-9 is only available in its US clinical modification.


Taking English as an example, the UMLS is built on one view of synonymy, but its structure also contains all the individual views of synonymy from its source vocabularies. Powered by NLP and statistical technologies, UMLS development became a knowledge-based automatic process, as did vocabulary maintenance; however, manual correction is still heavily used in determining synonymy.

By investigating human judgments of synonymy, Fung et al. [20] evaluated the synonym similarity between SNOMED CT and the UMLS, which involves aligning two different views of synonymy, since the two vocabulary systems have different design purposes and editing principles. 60 pairs of potentially controversial SNOMED CT synonyms were reviewed by 6 UMLS editors and 5 non-editors, who scored them by degree of synonymy. To evaluate accuracy, each subject's synonymy scores were compared to the overall average score of all subjects; the difference between UMLS editors and non-editors was assessed via their mean synonymy scores. The results showed comparable average accuracies: 71% for UMLS editors and 75% for non-editors. On this basis, Fung suggested integrating SNOMED CT into the UMLS.



Application

Based on natural language processing, Friedman et al. [21] reported and evaluated a method to automatically map an entire clinical document to codes with modifiers. They used a collection of discharge summaries from New York Presbyterian Hospital, consisting of 818,000 sentences, from which two test sets of 150 randomly selected sentences were produced. The MedLEE NLP system was employed to encode the clinical documents; an encoding table was created to select terms complementary to UMLS terms. After the known types of errors were automatically removed during table generation, all remaining entries were kept in a coding table, which was subsequently used to parse and encode sentences. The parsed sentences were used to map the medical text to codes.


One of the two test sets reached a UMLS code recall of .77 (95% CI .72-.81) with MedLEE processing, compared with .83 (.79-.87) for manual processing by seven experts. The second set was measured by precision: the automatic system reached .89 (.87-.91), while the experts ranged from .61 to .91.

This method, which combined information extraction, UMLS coding, and NLP, appeared to be comparable to or better than the experts. It successfully mapped text to codes along with other related information, rendering the coded output suitable for effective retrieval.

5. Conclusion

Based on the previous works above, it is feasible to develop an automatic

classification system to convert clinical reports to SNOMED codes. The source data consist of 394,025 supervised pathology reports from a hospital. Each report is assigned a varying number of code labels, from 1 to 15.

First of all, these text documents will be cleaned by NLP techniques to filter out noise such as formatting tags, stop-words, and personal names occurring within the reports.

The second step will be lexical verification: all word tokens will be compared with the medical vocabularies from ICD, SNOMED, and UMLS. Entropy-based weights of word tokens will be adopted to generate attribute values, and the reports can then be converted into vectors; in this way the doc-word matrix is built. Information Gain of the word tokens (the attributes) will be used to reduce dimensionality.
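As a sketch of the attribute-reduction step, the information gain of a word token can be computed over a labelled corpus as follows; the toy documents, tokens, and class labels here are invented for illustration:

```python
import math
from collections import Counter

def entropy(labels):
    # Shannon entropy of a (non-empty) label list
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(term, documents, labels):
    # IG = H(C) - p(t) * H(C | t present) - p(~t) * H(C | t absent)
    with_term = [y for doc, y in zip(documents, labels) if term in doc]
    without_term = [y for doc, y in zip(documents, labels) if term not in doc]
    p = len(with_term) / len(documents)
    return entropy(labels) - p * entropy(with_term) - (1 - p) * entropy(without_term)

docs = [{"pneumonia", "fever"}, {"fracture", "fall"}, {"pneumonia", "cough"}, {"fall"}]
y = ["resp", "trauma", "resp", "trauma"]
ig = information_gain("pneumonia", docs, y)
```

Tokens whose information gain falls below a chosen cutoff are discarded, shrinking the doc-word matrix before classifier training.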

SVM, Decision Tree, and Naive Bayes algorithms will be used to generate the classifiers. The multi-label problem will be solved with a binary approach, generating one classifier per medical code. After classification by each classifier, every test instance will be assigned a suitable combination of codes. A code-labelling rule can be set by a threshold on the scores from each code classifier.
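The per-code labelling rule can be sketched as below; the codes, keyword scorers, and threshold are invented stand-ins for the trained SVM, decision-tree, and naive Bayes classifiers:

```python
# one binary decision per code ("binary relevance"); the keyword scorers are
# invented stand-ins for trained SVM / decision-tree / naive Bayes models
CODE_KEYWORDS = {
    "CODE-A": {"inflammation", "inflammatory"},
    "CODE-B": {"malignant", "carcinoma"},
}

def score(report_tokens, keywords):
    # fraction of a code's keywords present in the report
    return len(report_tokens & keywords) / len(keywords)

def assign_labels(report_tokens, threshold=0.5):
    # a report receives every code whose classifier score clears the threshold
    return {code for code, kw in CODE_KEYWORDS.items()
            if score(report_tokens, kw) >= threshold}

labels = assign_labels({"invasive", "carcinoma", "with", "inflammation"})
```

Each code's classifier votes independently, so a report can legitimately receive anywhere from zero to all of the codes.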

To evaluate the classification performance, an empirical evaluation will be conducted. Ten-fold cross validation will be employed as the evaluation strategy, and precision, recall and F1-measure will be used as evaluation metrics. Further, the averaged measure values will show the accuracy of the system.
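The evaluation loop can be sketched as follows; the fold partitioner and the per-report precision/recall/F1 computation are generic, with invented example code sets:

```python
def ten_fold_indices(n, folds=10):
    # partition indices 0..n-1 into `folds` contiguous held-out test folds
    for k in range(folds):
        test = set(range(k * n // folds, (k + 1) * n // folds))
        yield [i for i in range(n) if i not in test], sorted(test)

def precision_recall_f1(predicted, actual):
    # per-report metrics over predicted vs. gold code sets
    tp = len(predicted & actual)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(actual) if actual else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

splits = list(ten_fold_indices(100))
p, r, f1 = precision_recall_f1({"486", "786.2"}, {"486", "511.9"})
```

In each of the ten rounds, classifiers are trained on the nine training folds and scored on the held-out fold; the ten scores are then averaged.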


Bibliography


1. Aas, K. and L. Eikvil, Text Categorisation: A Survey. 1999.

2. M. Ikonomakis, S.K., V. Tampakas, Text Classification Using Machine Learning Techniques. WSEAS Transactions on Computers, 2005. 4(8): p. 966-974.

3. Yang, Y., Expert Network: Effective and Efficient Learning from Human Decisions in Text Categorization and Retrieval. 17th Ann Int ACM SIGIR Conference on Research and Development in Information Retrieval, 1994: p. 13-22.

4. Alan R. Aronson, O.B., H. Florence Chang, Susanne M. Humphrey, James G. Mork, Stuart J. Nelson, Thomas C. Rindflesch, W. John Wilbur, The NLM Indexing Initiative. Proc AMIA Symp, 2000: p. 17-21.

5. Alan R. Aronson, J.G.M., Clifford W. Gay, Susanne M. Humphrey, Willie J. Rogers, The NLM Indexing Initiative's Medical Text Indexer. MEDINFO, 2004.

6. Aronson AR, D.-F.D., Humphrey SM, Lin J, Liu H, Ruch P, Ruiz ME, Smith LH, Tanabe LK, Wilbur WJ, Fusion of knowledge-intensive and statistical approaches for retrieving and annotating textual genomics documents. Proc TREC, 2005: p. 36-45.

7. Demner-Fushman D, H.S., Ide NC, Loane RF, Ruch P, Ruiz ME, Smith LH, Tanabe LK, Wilbur WJ, Aronson AR, Finding relevant passages in scientific articles: fusion of automatic approaches vs. an interactive team effort. Proc TREC, 2006: p. 569-576.

8. Alan R. Aronson, O.B., Dina Demner-Fushman, Kin Wah Fung, Vivian K. Lee, James G. Mork, Aurélie Névéol, Lee Peters, Willie J. Rogers, From Indexing the Biomedical Literature to Coding Clinical Text: Experience with MTI and Machine Learning Approaches. Proceedings of the ACL 2007 Workshop "BioNLP", 2007: p. 105-112.

9. Adam B. Wilcox, G.H., Knowledge discovery and data mining to assist natural language understanding. Proc AMIA Annu Fall Symp, 1998: p. 835-9.

10. Adam B. Wilcox, G.H., The Role of Domain Knowledge in Automating Medical Text Report Classification. Journal of the American Medical Informatics Association, 2003(10): p. 330-338.

11. Serguei V.S. Pakhomov, J.D.B., Christopher G. Chute, Automating the Assignment of Diagnosis Codes to Patient Encounters Using Example-based and Machine Learning Techniques. J Am Med Inform Assoc, 2006(13): p. 516-525.

12. Koby Crammer, M.D., Kuzman Ganchev, Partha Pratim Talukdar, Automatic Code Assignment to Medical Text. 2007.

13. John P. Pestian, C.B., Paweł Matykiewicz, DJ Hovermale, Neil Johnson, K. Bretonnel Cohen, Włodzisław Duch, A Shared Task Involving Multi-label Classification of Clinical Free Text. 2007.

14. Moore, H., A Comparison of SNOMED-CT and ICD-10-AM. HIC 2003 RACGP 12CC Combined Conferences, 2003. 2.

15. Genevieve B. Melton, S.P., Frances P. Morrison, Adam S. Rothschild, Marianthi Markatou and George Hripcsak, Inter-patient distance metrics using SNOMED CT defining relationships. Journal of Biomedical Informatics, 2006. 39(6): p. 697-705.

16. Henry Wasserman, J.W., An Applied Evaluation of SNOMED CT as a Clinical Vocabulary for the Computerized Diagnosis and Problem List. AMIA Annu Symp Proc 2003, 2003: p. 699-703.

17. Rachel L. Richesson, J.A., Jeffrey Krischer, Use of SNOMED CT to Represent Clinical Research Data: A Semantic Characterization of Data Items on Case Report Forms in Vasculitis Research. Journal of the American Medical Informatics Association, 2006. 13(5): p. 536-546.

18. Jeffery L. Painter, K.M.K., Gary H. Merrill, Inter-translation of Biomedical Coding Schemes Using UMLS. AAAI 2006 Fall Symposium on Semantic Web for Collaborative Knowledge Acquisition, 2006.

19. Schopen, M., ICD-10 and the Unified Medical Language System (UMLS). 2002.

20. Kin Wah Fung, W.T.H., Stuart J. Nelson, Suresh Srinivasan, Tammy Powell, Laura Roth, Integrating SNOMED CT into the UMLS: An Exploration of Different Views of Synonymy and Quality of Editing. Journal of the American Medical Informatics Association, 2005. 12(4): p. 486-494.

21. Carol Friedman, L.S., Yves Lussier, George Hripcsak, Automated Encoding of Clinical Documents Based on Natural Language Processing. J Am Med Inform Assoc, 2004. 11(5): p. 392-402.

22. Zweigenbaum, P. and P. Courtois, Acquisition of lexical resources from SNOMED for medical language processing. Medinfo, 1998. 9 Pt 1: p. 586-90.