Area 4: Secondary Use of EHR Data

Oct 16, 2013 (3 years and 5 months ago)

Strategic Health IT Advanced Research Projects (SHARP)

Area 4: Secondary Use of EHR Data

Project 3: High-Throughput Phenotyping

Project Lead: Jyotishman Pathak, PhD

PI: Christopher G. Chute, MD, DrPH

June 12, 2012

SHARPn High-Throughput Phenotyping

Electronic health records (EHRs) driven phenotyping

Overarching goal
- To develop high-throughput automated techniques and algorithms that operate on normalized EHR data to identify cohorts of potentially eligible subjects on the basis of disease, symptoms, or related findings

©2012 MFMER

Current HTP project themes
- Standardization of phenotype definitions
- Library of phenotyping algorithms
- Phenotyping workbench
- Machine learning techniques for phenotyping
- Just-in-time phenotyping



Algorithm Development Process - Modified

[Figure: algorithm development process diagram. Labels: Data, Transform, Transform, Phenotype Algorithm, Visualization, Evaluation, NLP, SQL, Rules, Mappings, Semi-Automatic Execution]

- Standardized representation of clinical data
  - Create new and re-use existing clinical element models (CEMs)
- Standardized and structured representation of phenotype definition criteria
  - Use the NQF Quality Data Model (QDM)
- Conversion of structured phenotype criteria into executable queries
  - Use JBoss® Drools (DRLs)

[Welch et al. 2012] [Thompson et al., submitted 2012] [Li et al., submitted 2012]

NQF Quality Data Model (QDM)
- Standard of the National Quality Forum (NQF)
  - A structure and grammar to represent quality measures in a standardized format
- Groups of codes in a code set (ICD-9, etc.)
  - "Diagnosis, Active: steroid induced diabetes" using "steroid induced diabetes Value Set GROUPING (2.16.840.1.113883.3.464.0001.113)"
- Supports temporality & sequences
  - AND: "Procedure, Performed: eye exam" > 1 year(s) starts before or during "Measurement end date"
- Implemented as set of XML schemas
  - Links to standardized terminologies (ICD-9, ICD-10, SNOMED-CT, CPT-4, LOINC, RxNorm, etc.)
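
The two ideas on this slide, value-set membership and a temporal relation to a reference date, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `ValueSet` and `Event` classes, the helper names, and the placeholder codes `CODE-A`/`CODE-B` are hypothetical and are not the real contents of the value set or the QDM XML schema. The temporal check is a simplified reading of "starts before or during", not the full QDM semantics.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ValueSet:
    oid: str
    name: str
    codes: frozenset  # codes drawn from a terminology such as ICD-9


@dataclass(frozen=True)
class Event:
    category: str  # e.g. "Diagnosis, Active"
    code: str
    start: date


def matches(event, category, value_set):
    """True when the event has the given category and its code is in the value set."""
    return event.category == category and event.code in value_set.codes


def starts_before_or_during(event, reference_end):
    """Simplified temporal check: the event starts on or before the reference date."""
    return event.start <= reference_end


# Placeholder codes, NOT the real members of this value set.
steroid_dm = ValueSet(
    oid="2.16.840.1.113883.3.464.0001.113",
    name="steroid induced diabetes Value Set GROUPING",
    codes=frozenset({"CODE-A", "CODE-B"}),
)

measurement_end = date(2012, 12, 31)
dx = Event("Diagnosis, Active", "CODE-A", date(2012, 3, 1))
other = Event("Diagnosis, Active", "CODE-Z", date(2012, 3, 1))
late = Event("Diagnosis, Active", "CODE-A", date(2013, 1, 15))

print(matches(dx, "Diagnosis, Active", steroid_dm))
print(matches(other, "Diagnosis, Active", steroid_dm))
print(starts_before_or_during(late, measurement_end))
```

Keeping the value set separate from the criterion mirrors the slide's point that a criterion names a group of codes rather than listing individual codes inline.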

116 Meaningful Use Phase I Quality Measures

Example: Diabetes & Lipid Mgmt. - I

Human readable HTML

Example: Diabetes & Lipid Mgmt. - II

Computable XML

Drools-based Phenotyping Architecture

[Figure: architecture diagram. Components: Clinical Element Database; Data Access Layer; Transformation Layer; Business Logic; Inference Engine (Drools); Service for Creating Output (File, Database, etc.); List of Diabetic Patients. Annotations: Transform physical representation; Normalized logical representation (Fact Model)]
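
The layering in this architecture can be sketched as a small Python pipeline. This is only an illustration of the data flow under stated assumptions: the sample rows, the `Fact` class, and all function names are hypothetical, and plain Python predicates stand in for the Drools inference engine.

```python
from dataclasses import dataclass

# Hypothetical physical rows, as a data access layer might return them.
RAW_ROWS = [
    {"pid": "001", "element": "diagnosis", "value": "diabetes"},
    {"pid": "002", "element": "diagnosis", "value": "asthma"},
    {"pid": "003", "element": "diagnosis", "value": "diabetes"},
]


@dataclass(frozen=True)
class Fact:
    """Normalized logical representation (a stand-in for the fact model)."""
    patient_id: str
    kind: str
    value: str


def data_access_layer():
    """Read rows from the (here, in-memory) clinical element store."""
    return list(RAW_ROWS)


def transformation_layer(rows):
    """Transform the physical representation into normalized facts."""
    return [Fact(r["pid"], r["element"], r["value"]) for r in rows]


def is_diabetic(fact):
    """One rule, written as a predicate over a single fact."""
    return fact.kind == "diagnosis" and fact.value == "diabetes"


def inference_engine(facts, rules):
    """Return sorted IDs of patients for whom at least one rule fires on some fact."""
    return sorted({f.patient_id for f in facts if any(rule(f) for rule in rules)})


def output_service(patient_ids):
    """Render the result; a real service might write to a file or database."""
    return "\n".join(patient_ids)


diabetic = inference_engine(transformation_layer(data_access_layer()), [is_diabetic])
print(output_service(diabetic))
```

Separating the transformation step from rule evaluation means the rules only ever see the normalized fact model, which is the design point the diagram makes.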

Automatic translation from NQF QDM criteria to Drools

[Li et al., submitted 2012]
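
To give a rough feel for what such a translation step produces, the sketch below renders one structured criterion as Drools-style rule text. It is not the translator described in the cited work: the `to_drl` function, the `Fact` type with its `category` and `valueSetOid` fields, and the `results` collection are all hypothetical, and the generated text has not been checked against a Drools compiler.

```python
def to_drl(rule_name, category, value_set_oid):
    """Render one criterion as DRL-style rule text.

    The Fact type, its fields, and the results collection are hypothetical.
    """
    return "\n".join([
        f'rule "{rule_name}"',
        "when",
        f'    $f : Fact( category == "{category}", valueSetOid == "{value_set_oid}" )',
        "then",
        "    results.add( $f.getPatientId() );",
        "end",
    ])


drl = to_drl(
    "steroid induced diabetes",
    "Diagnosis, Active",
    "2.16.840.1.113883.3.464.0001.113",
)
print(drl)
```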

The “executable” Drools flow

Phenotype library and workbench - I

1. Converts QDM to Drools
2. Rule execution by querying the CEM database
3. Generate summary reports

http://phenotypeportal.org


Phenotype library and workbench - II

http://phenotypeportal.org


Phenotype library and workbench - III

Machine learning and HTP - I
- Machine learning and association rule mining
- Manual creation of algorithms takes time
- Let computers do the “hard work”
- Validate against expert-developed ones

[Carroll et al. 2011]

Machine learning and HTP - II
- Origins from sales data
- Items (columns): co-morbid conditions
- Transactions (rows): patients
- Itemsets: sets of co-morbid conditions
- Goal: find all itemsets (sets of conditions) that frequently co-occur in patients.
  - One of those conditions should be DM.
- Support: # of transactions the itemset I appeared in
  - Support({TB, DLM, ND})=3
- Frequent: an itemset I is frequent, if support(I)>minsup

[Table: patients (rows) by conditions (columns TB, DLM, ND, IEC); Y marks a condition present. Rows as extracted: 001: Y Y Y Y; 002: Y Y Y Y; 003: Y Y; 004: Y; 005: Y Y Y. X: infrequent]

[Simon et al. 2012]
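
The definitions of support and frequent itemsets above can be made concrete with a short Python sketch. The patient data here is hypothetical: patients 001, 002 and 005 are filled in so that support({TB, DLM, ND}) = 3 as stated on the slide, while the conditions assigned to 003 and 004 are assumptions, because the original column positions are not recoverable. The search is a brute-force enumeration for illustration, not an Apriori-style miner.

```python
from itertools import combinations

# Hypothetical patient-by-condition data (see the assumptions above).
transactions = {
    "001": {"TB", "DLM", "ND", "IEC"},
    "002": {"TB", "DLM", "ND", "IEC"},
    "003": {"TB", "DLM"},   # assumed placement
    "004": {"TB"},          # assumed placement
    "005": {"TB", "DLM", "ND"},
}


def support(itemset, transactions):
    """Number of transactions (patients) that contain every item in the itemset."""
    items = set(itemset)
    return sum(1 for conditions in transactions.values() if items <= conditions)


def frequent_itemsets(transactions, minsup):
    """Brute-force enumeration of all itemsets with support > minsup."""
    all_items = sorted(set().union(*transactions.values()))
    result = {}
    for size in range(1, len(all_items) + 1):
        for candidate in combinations(all_items, size):
            s = support(candidate, transactions)
            if s > minsup:
                result[candidate] = s
    return result


print(support({"TB", "DLM", "ND"}, transactions))
frequent = frequent_itemsets(transactions, minsup=2)
for itemset, count in frequent.items():
    print(itemset, count)
```

Brute force is exponential in the number of conditions, so it only suits toy inputs; practical miners prune candidates using the fact that every subset of a frequent itemset is itself frequent.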

Electronic Health Records and Phenomics

Just-in-Time phenotyping - I
- Transfusion-related Acute Lung Injury (TRALI)
- Transfusion-associated Circulatory Overload (TACO)

Just-in-Time phenotyping - II

TRALI/TACO “sniffer”

Active Surveillance for TRALI and TACO

Of the 88 TRALI cases correctly identified by the CART algorithm, only 11 (12.5%) of these were reported to the blood bank by the clinical service.

Of the 45 TACO cases correctly identified by the CART algorithm, only 5 (11.1%) were reported to the blood bank by the clinical service.
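
The two percentages are simply reported cases divided by algorithm-identified cases. A small Python check of that arithmetic, where the `reporting_rate` helper is a name introduced only for this illustration:

```python
def reporting_rate(reported, identified):
    """Percentage of algorithm-identified cases that were also reported."""
    return 100.0 * reported / identified


trali = reporting_rate(11, 88)
taco = reporting_rate(5, 45)
print(f"TRALI: {trali:.1f}%")
print(f"TACO: {taco:.1f}%")
```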

Publications to date (conservative)

[Bar chart: publication counts for Year 1 (2011), Year 2 (2012) and Year 3 (2013), in three series: Papers, Abstracts, Under review. Data labels as extracted: 8, 6, 6, 2, 12. Vertical axis from 0 to 14.]

2011 Milestones
- Standardized definitions for phenotype criteria
- Rules-based environment for phenotype algorithm execution
- National library for standardized phenotype definitions (collaboration with eMERGE)
- Machine learning techniques for algorithm definitions
- Online, real-time phenotype execution
- Phenotyping algorithm authoring environment

2012 Milestones
- Machine learning techniques for algorithm definitions
- Online, real-time phenotype execution
- Collaboration with NQF, Query Health and i2b2 infrastructures
- Use cases and demonstrations
  - MU quality metrics (w/ NQF, Query Health)
  - Cohort identification (w/ eMERGE, PGRN)
  - Value analysis (w/ Mayo CSHCD, REP)
  - Clinical trial alerting (w/ Mayo Cancer Ctr./CTSA)

Project 3: Collaborators & Acknowledgments
- CDISC (Clinical Data Interchange Standards Consortium)
  - Rebecca Kush, Landen Bain
- Centerphase Solutions
  - Gary Lubin, Jeff Tarlowe
- Group Health Seattle
  - David Carrell
- Harvard University/MIT
  - Guergana Savova, Peter Szolovits
- Intermountain Healthcare/University of Utah
  - Susan Welch, Herman Post, Darin Wilcox, Peter Haug
- Mayo Clinic
  - Cory Endle, Rick Kiefer, Sahana Murthy, Gopu Shrestha, Dingcheng Li, Gyorgy Simon, Matt Durski, Craig Stancl, Kevin Peterson, Cui Tao, Lacey Hart, Erin Martin, Kent Bailey, Scott Tabor, Chris Chute