Translational Research IT (TraIT)


Nov 15, 2013



TraIT and OpenClinica: partners in translational research

Marinel Cavelaars, Cuneyt Parlayan, Jacob Rousseau, Sander de Ridder, Jan Willem Boiten and Jeroen Beliën

Boston; June 21st 2013


Overview

Introduction and background
CTMM
Translational Research
TraIT
Three real-life examples: OpenClinica, BMIA, tranSMART
OpenClinica.com - TraIT partnership
CTMM-TRACER and OpenClinica by Sander de Ridder
Scripts, Long Lists, Tools developed
Things we learned/found useful


Who am I?

My name: Jeroen Beliën, PhD, MSc
Associate Professor, medical informatics, dept. of Pathology, VU University medical center, Amsterdam
Digital Pathology, Image processing, IT in translational research
String of Pearls
IT-lead 2 CTMM projects: DeCoDe and TRACER
CTO CTMM-TraIT
BioMedBridges
Member of taskforce Stichting Palga
Palga: Dutch National Electronic Pathology Archive
Faculty member of NBIC
jam.belien@vumc.nl

CTMM, TIPharma and BMM offer an integrated approach for innovations in the Dutch health care sector

CTMM: diagnosis
  Early detection of disease by in-vitro and in-vivo diagnostics
  Stratification of patients for personalized treatment
  Assessing efficiency and efficacy of medicines by imaging
  Image guided delivery of medication
  Focus on cancer, cardiovascular, neurodegenerative and infectious/autoimmune disease.

TIPharma: drugs
  Translational research on novel pharmaceutical therapies
  Target finding, animal models and lead selection
  Drug formulation, delivery and targeting
  Special Theme focusing on the efficiency of the process of drug development

BMM: devices
  Smart drug delivery systems
  Innovations in contemporary organ replacement therapies
  Passive and active scaffolds, including cell signalling functions

[Diagram labels: Image guided drug delivery; Biomarkers; Drug delivery; Imaging for regenerative medicine]

Public-private partnerships: Financial model

CTMM projects: 300 mln in total
  Government: subsidy, 150 mln (50%)
  Academia: in kind, 75 mln
  Industry: cash, 37.5 mln; in kind, 37.5 mln

Subsidy: 50% of research cost

CTMM projects

Breast

Prostate

Colon

Lung

Leukemia

Heart Failure

Stroke

Diabetes

Kidney Failure

Arrhythmia

Peripheral Vascular Disease

Thrombosis

Alzheimer

Rheumatoid Arthritis

Sepsis

Translational research process

Guiding principle: connecting phenotype to biology

[Figure: translational research process. Labels: Patient enters medical center; Clinical Procedures; Imaging; Samples; Experiments; Electronic Health Record; Image database; Biobank database; Clinical database; Experimental data; External data; Data Integration; Downstream analysis; Scientific Output; Intellectual Property; Improved Healthcare]

TraIT consortium - Started Oct. 2011; status 2013: 26 partners

Growing TraIT project team


The TraIT approach

IT infrastructure = main goal
  No research on the side
Workflow-oriented approach
  Create data pipelines to link data production and data analysis
User driven priority setting
  Regular reprioritization possible (agile)
Avoid reinventing wheels
  Adopt/adapt existing technology and expertise
Connect with other initiatives
  Organizations (NBIC, EBI, PSI, IMI, etc.)
Think big; start small; act now
  Short term focus on immediate needs CTMM projects

Division in work packages

TraIT has been subdivided into four work packages (WPs) supporting data generating domains, and two work packages dealing with the overarching TraIT requirements: data integration and professional support respectively:

Five data generating work packages
  WP 1 Clinical Data
  WP 2 Clinical Imaging Data
  WP 3 Biobanking Data
  WP 4 Experimental Data
  WP 7 Pathology Imaging
Data integration & analysis across the four platforms
  WP 5 Core Infrastructure
Shared service center for hardware, training & support
  WP 6 Deployment

High-level TraIT data flows

[Figure: data flows from Hospital (IT) to Translational Research (IT). Labels: HIS; PACS; LIS; Pseudonymization; Research Data; LIMS; data domains: clinical data, imaging data, annotations, experimental data, biobanking; integrated data; translational analytics workbench; cohort explorer; Public Data. Example tools named: OpenClinica; NBIA; tranSMART/i2b2; Various solutions; CBM-NL; e.g. Galaxy; e.g. R]

TraIT Pseudonymization

[Figure labels: Hospital (IT); HIS; PACS; LIS; TTP; Translational Research (IT); Research Data; LIMS; data domains: clinical data, imaging data, experimental data, biobanking; integrated data; translational analytics workbench; Public Data. Example tools named: NBIA + AIM; CBM catalog; PhenotypeDB, Annai Systems; Galaxy, Chipster; caTissue; GEO, EMBL-EBI; tranSMART/cohort explorer; Galaxy; R]

Hospital tables (identifiable):

BSN     Name   TumorStage
274839  J.Doe  T3c
..      ..

BSN     Name   ImageID  Image
274839  J.Doe  782      ..
..      ..     ..       ..

BSN     Name   SampleID  Sample
274839  J.Doe  346       ..
..      ..     ..        ..

Research tables (pseudonymized via TTP):

SubjectID  TumorStage
Cairo_135  T3c
..         ..

SubjectID  ImageID        Image  TumorSize
Cairo_135  Cairo_img_492  ..     12
..         ..             ..     ..

SubjectID  SampleID       Volume
Cairo_135  Cairo_smpl_42  50 cc
..         ..             ..

SampleID       GeneExpProfile
Cairo_smpl_42  0.23, 012, 0.52, 1.67, …

Integrated data:

                Private Study (Cairo)      Public Study
SubjectID       Cairo_135                  Public_1931
TumorStage      T3c                        T3c
TumorSize       12                         12
ImageID         Cairo_img_492              Public_img_46
SampleID        Cairo_smpl_42              Public_smpl_23
GeneExpProfile  0.23, 012, 0.52, 1.67, ..  0.23, 012, 0.52, 1.67, ..
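The pseudonymization step on this slide can be sketched as follows. This is a minimal illustration in Python, not the TraIT or TTP implementation: the function names, the in-memory mapping and the numbering scheme are assumptions made for the example, and only the field names and sample values (BSN, Name, Cairo_135) come from the slide.

```python
import itertools

def make_pseudonymizer(study_prefix, start=1):
    """Return a function that maps a hospital identifier to a stable SubjectID.

    The mapping dict stands in for what a trusted third party would keep;
    it is hypothetical and lives only in memory here.
    """
    mapping = {}
    counter = itertools.count(start)

    def pseudonymize(bsn):
        # The same BSN always yields the same SubjectID within a study.
        if bsn not in mapping:
            mapping[bsn] = f"{study_prefix}_{next(counter)}"
        return mapping[bsn]

    return pseudonymize

def strip_identifiers(record, pseudonymize, identifying=("BSN", "Name")):
    """Drop direct identifiers and add the SubjectID, keeping research fields."""
    research = {k: v for k, v in record.items() if k not in identifying}
    return {"SubjectID": pseudonymize(record["BSN"]), **research}

to_subject = make_pseudonymizer("Cairo", start=135)
clinical = {"BSN": "274839", "Name": "J.Doe", "TumorStage": "T3c"}
sample = {"BSN": "274839", "Name": "J.Doe", "SampleID": "346"}

clinical_out = strip_identifiers(clinical, to_subject)
sample_out = strip_identifiers(sample, to_subject)
```

Because both records share one BSN, they end up under the same SubjectID, which is what allows the research tables to be joined later. The slide also renames ImageID and SampleID per study; the sketch leaves those untouched.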

TraIT - study driven approach

[Figure: study driven approach. Task 1: study selection (Study 1, Study 2, Study ...). Task 2: use cases & prototypes (UC 1, UC 2, UC …). Task 3, 4, 5: development of data integration platform, analytics workbench, shared components. Other labels: pseudo; ETL; integrated translational data warehouse; Data Integration; Analytics; Translational Analytics Workbench. Timeline: 2013, 2014]
Three real-life examples

Example 1: CTMM INCOAG
Example 2: CTMM AIRFORCE
Example 3: CTMM PCMM

[Figure labels: Hospital (IT); PACS; TTP; Translational Research (IT); clinical; imaging; integrated data; e.g. tranSMART, NBIA, OpenClinica]

Real-life example 1 - CTMM Incoag

Discover new risk factors for thrombotic diseases
Approach: Combine existing clinical studies into one OpenClinica data set for higher statistical power

OpenClinica:
  Clinical data capture
  Web-based
  Open-source
  Full audit-trail
  10,000+ installations
  TraIT tool of choice

Incoag - Technical integration

Out-of-the-box OpenClinica can be applied in most projects: currently used in CTMM projects AirForce, Cohfar, DeCoDe, Parisk, PCMM, and Tracer

Specific Incoag question: how to combine 5+ independent existing studies from mixed sources into one OpenClinica installation?

[Figure: Study 1, Study 2, Study 3 -> ? -> Sustainable storage in TraIT environment]

Incoag - Technical integration

Solution: TraIT-team created a batch upload toolbox for OpenClinica
Will be submitted to the OpenClinica open-source community

[Figure: Study 1, Study 2, Study 3 -> Sustainable storage in TraIT environment]

Incoag - Semantic integration

Second question from Incoag project: how to identify common fields and data items?
How to determine the overlap?

100-150 fields in each study
More than 100^5 combinations to consider!

Studies speak different "languages": A biomedical "Esperanto" needed

[Figure: Study 1, Study 2, Study 3, Study 4, Study 5 overlapping in "Common ground?"]
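To make the overlap question concrete, here is a small Python sketch that compares field names pairwise after a crude normalization. It is only an illustration of the problem, not a TraIT tool: the study contents and field names are made up for the example, and plain name matching is assumed to be a first pass only.

```python
from itertools import combinations

def normalize(name):
    """Crude normalization: lower-case and keep only letters and digits."""
    return "".join(ch for ch in name.lower() if ch.isalnum())

def pairwise_overlap(studies):
    """Map each pair of study names to the normalized field names they share."""
    normalized = {s: {normalize(f) for f in fields} for s, fields in studies.items()}
    return {
        (a, b): normalized[a] & normalized[b]
        for a, b in combinations(sorted(normalized), 2)
    }

# Hypothetical field lists; real studies have 100-150 fields each.
studies = {
    "Study 1": ["Patient_Weight", "Smoking_Category", "DOB"],
    "Study 2": ["patient weight", "SmokingCategory", "Height"],
    "Study 3": ["Weight_kg", "dob"],
}
overlap = pairwise_overlap(studies)
```

Note that "Weight_kg" in Study 3 is not matched to "Patient_Weight" even though both describe the same measurement. Name matching cannot see that, which is the reason a shared dictionary (the "Esperanto" on the slide) is needed.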

Incoag - Semantic integration

Project 1: Provide tools to standardize studies at data registration (as far as possible):
  TraIT building blocks to rapidly build CRFs for new studies based on common dictionary

Project 2: First test with tools for automatic "after-the-fact" harmonization for historical data:
  Automatic mapping against multiple dictionaries (SNOMED-CT, LOINC, NCI thesaurus & Gene Ontology)

[Figure: Study 1, Study 2, Study 3, Study 4, Study 5, Study n -> Harmonized Incoag dataset]

Real-life example 2 - CTMM AirForce

Personalized chemo-radiation of lung and head & neck cancer
Lung cancer patients with PET-CT (and clinical data & tissue)
VUMC, MUMC+, NKI, UMCG + 35 patients from Policlinico Gemelli in Rome (via MUMC+)
Transfer of images from Rome using TraIT's BioMedical Image Archive (www.bmia.nl)


WP2 High level design

Upload (Implemented)
Image pseudonymization pipeline (based on CTP from the RSNA)
Image storage & simple web-shop like image viewing (based on NBIA)

AirForce - de-identification of images

Install TraIT de-identification client in Rome
Adopt: Clinical Trial Processor (RSNA, open source, Java)
Configure DICOM de-identification
  Remove identifying DICOM tags
  Replace Codice Sanitario (PatientID) with AirForce ID
  Keep important tags (e.g. some tags are crucial for downstream analysis of PET)
Result: A pipeline to TraIT's BMIA from the local Rome Image Archive

[Figure labels: DICOM TAGS; DICOM IMAGE]
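The three de-identification rules above (remove, replace, keep) can be illustrated on a header represented as a plain Python dict. This is a sketch of the idea only: it does not use the Clinical Trial Processor or its configuration format, the choice of which tags count as identifying is an assumption, and the "AirForce_001" ID format is hypothetical.

```python
# Assumed set of identifying attributes; a real profile would be longer.
IDENTIFYING_TAGS = {"PatientName", "PatientBirthDate", "PatientAddress", "InstitutionName"}

def deidentify(tags, study_id):
    """Return a copy of the header with identifiers removed and PatientID replaced."""
    # Remove identifying tags; everything else (e.g. PatientWeight) is kept.
    cleaned = {k: v for k, v in tags.items() if k not in IDENTIFYING_TAGS}
    # Replace the hospital PatientID with the study-specific ID.
    cleaned["PatientID"] = study_id
    return cleaned

header = {
    "PatientName": "J.Doe",
    "PatientID": "LOCAL-HOSPITAL-ID",   # placeholder for the Codice Sanitario
    "PatientBirthDate": "19500101",
    "PatientWeight": "80",              # kept: used in downstream PET analysis
    "Modality": "PT",
}
clean = deidentify(header, "AirForce_001")
```

Header rules like these do not catch names burnt into the pixel data, which is why a manual QC step is still needed before images become visible.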

AirForce - QC of de-identification

Perform QC step by collection administrator before images are visible in BMIA to prevent privacy breach (esp. burnt-in names).

AirForce - Resulting image archive in BMIA

Collection AirForce on www.bmia.nl with 35 patients from Rome
Web shop model where you can fill a basket with patients for download


Real-life example 3 - CTMM PCMM

Develop and validate biomarkers for diagnosis of prostate cancer
Requires correlation of phenotype data to biomarker data
Potential solution: tranSMART; to be validated with real-life data from CTMM projects like PCMM

Role of tranSMART in TraIT

Can we address the generic translational question with the tranSMART solution?

PCMM - tranSMART as a candidate solution

tranSMART:
  Developed in J&J
  Made open-source
  Data workbench for translational researchers
  Searching across studies
  Data exploration
PCMM - Import of prostate data

Prostate data: Gleason score, PSA values, etc.
Usually gene expression data will be loaded as well; not yet done for PCMM
Reference to public data sources available

PCMM - QC of the data set

Drag-and-drop data parameters to create simple distribution plots and statistical values

PCMM: tranSMART for correlation analysis

Easy to create correlation plots between existing and potential predictors for prostate cancer

Second tranSMART developer/user meeting, June 17th-19th 2013, Amsterdam

[Organizations shown: CTMM-TraIT; Sanofi; Recombinant / Deloitte; University of Michigan; Thomson Reuters; Pfizer; eTRIKS / Imperial College; CDISC; University of Luxembourg; Philips; Johnson & Johnson]

OpenClinica.com - TraIT partnership

Statement of Work

TraIT: automate data capture in OC as much as possible
  E.g. automate upload of Excel data and hospital lab data
  Approach: OC's Web Services
  Requires improvements on OIDs and bug fixes
Support configurable role based authentication and authorization within OC
  E.g. Central review of images for all subjects in the different sites. Each image is reviewed by three reviewers who are not allowed to see each other's reports in the CRFs
Parameterized links in CRFs
  E.g. Links to images or to other subjects, with a dynamic URL based on data in CRF

Other wishes

Study migration
  E.g. Users want to switch to different OC server
  Currently only "ClinicalData" ODM is imported
  Studies can be exported in full detail but cannot be imported as such
Support reference to ontologies in the CRF
  Standardization of data
Easy view for data entry
  E.g. tree structure that indicates where you are while entering data for easy navigation to other CRF for subject



The load on TraIT OpenClinica increased significantly in 2012


Considerable time and energy was spent on delivery management (availability, capacity and
security) and on improvement of the TraIT OpenClinica user support

[Figure: Uptake of OpenClinica. Y-axis: Number of studies (0 to 50). X-axis: Timeline, by quarter from Q3 2008 to mid June 2013. Data labels in order: 0, 3, 3, 15, 26, 47. Markers: Start DeCoDe OpenClinica; Start TraIT OpenClinica]

Pre TraIT effect: all multicenter VUmc studies
Also multicenter studies UMCU, UMCN, EMC, Meander MC

47 studies
77 sites
256 users

Who am I?

My name: Sander de Ridder
Computer Science (MSc) & Bioinformatics (MSc)
Inflammatory Disease Profiling, Dept. of Pathology, VU University medical center, Amsterdam
Bioinformatics for Inflammatory Disease Profiling Group
IT implementation CTMM TRACER
s.deridder@vumc.nl

CTMM-TRACER

Background information on TRACER

CTMM TRACER: Rheumatoid Arthritis
  Prospective data
  Retrospective data (To Do)
Go Live:
  Wednesday the 5th of June
  Started at 9:00 - Finished at 12:00
  Approximately 1 hour/study

Prospective Studies  VERA  ERA   ESRA
Sites                4     7     7
Events               7     6     6
CRFs                 ~35   ~30   ~30
Rules                ~250  ~450  ~650

Age Calculation

After entering the DOB and the date of signing… the age is calculated

Age calculation script:
http://en.wikibooks.org/wiki/OpenClinica_User_Manual/AgeField

Created by Sander de Ridder and improved by Gerben Rienk
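The arithmetic behind such an age field can be sketched as below. This Python version only illustrates the calculation and is not the script behind the link above; the function name and the choice to reject a signing date before the date of birth are assumptions.

```python
from datetime import date

def age_in_years(dob, signed):
    """Completed years between date of birth and date of signing."""
    if signed < dob:
        raise ValueError("date of signing precedes date of birth")
    # Subtract one year if the birthday has not yet occurred in the signing year.
    before_birthday = (signed.month, signed.day) < (dob.month, dob.day)
    return signed.year - dob.year - before_birthday

age = age_in_years(date(1970, 6, 21), date(2013, 6, 5))
```

Comparing (month, day) tuples avoids dividing a day count by 365.25, which can be off by one around birthdays.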

Long List Implementation

Problem:
  Maximum of 4000 characters for single-select response options text
  Some lists need more characters: e.g. medication list > 9000 characters
Solution:
  Created external list
  Add field to CRF which opens new page with list
  Allows user to select option; selected value is copied back to CRF

ITEM_NAME         RESPONSE_TYPE  RESPONSE_OPTIONS_TEXT         RESPONSE_VALUES_OR_CALCULATIONS
Smoking_Category  single-select  Never smoked, Current smoker  1,2

Example: Medication

User selects "Other" and then clicks on question 3)'s field
A new tab/window opens with an HTML page with a single-select
The user can select desired medication from the list
Selected medication is copied to the CRF
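A rough Python sketch of the two preparatory steps: deciding whether a list exceeds the 4000-character limit quoted above, and rendering the external HTML page's single-select. The function names, the element id and the medication entries are hypothetical, and the browser-side step that copies the selected value back into the CRF is not shown.

```python
from html import escape

MAX_OPTIONS_TEXT = 4000  # limit quoted on the slide for single-select options text

def needs_external_list(options):
    """True when the comma-joined options text would exceed the limit."""
    return len(",".join(options)) > MAX_OPTIONS_TEXT

def render_select(options, element_id="longList"):
    """Render the options as an HTML single-select, escaping special characters."""
    rows = "\n".join(
        f'  <option value="{escape(o)}">{escape(o)}</option>' for o in options
    )
    return f'<select id="{element_id}">\n{rows}\n</select>'

smoking = ["Never smoked", "Current smoker"]
medications = [f"Medication {i:04d}" for i in range(700)]  # made-up entries

page = render_select(medications) if needs_external_list(medications) else None
```

Short lists such as the smoking categories stay inside the CRF definition; only lists that fail the check need the external page.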

Some tools we created: CRF validator

Compares items between CRFs based on uids and ensures they match
  CRF1 ID: Patient_Weight; DATA_TYPE: INT
  CRF2 ID: Patient_Weight; DATA_TYPE: REAL
  Mismatch for Patient_Weight!
Checks NULL-flavour coding integrity
  Coding: -1=No Information, -2=Not Applicable, -3=Unknown, …
  CRF1 RESPONSE_OPTIONS_TEXT: No Information; RESPONSE_VALUES_OR_CALCULATIONS: -2
  Incorrect NULL-flavour coding!
Prevents errors and inconsistencies
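The two checks described above could look roughly like this in Python. It is a sketch under assumptions, not the validator the team built: the input structures (dicts of item IDs to data types, parallel lists of labels and codes) are invented for the example, and only the three NULL-flavour codes listed on the slide are included.

```python
# NULL-flavour coding as listed on the slide (the slide's list continues).
NULL_FLAVOURS = {"No Information": "-1", "Not Applicable": "-2", "Unknown": "-3"}

def type_mismatches(crfs):
    """crfs: {crf_name: {item_id: data_type}} -> item IDs with conflicting types."""
    seen = {}
    for items in crfs.values():
        for item_id, dtype in items.items():
            seen.setdefault(item_id, set()).add(dtype)
    return sorted(item_id for item_id, types in seen.items() if len(types) > 1)

def null_flavour_errors(options_text, values):
    """Return (label, code) pairs whose code differs from the agreed coding."""
    return [
        (label, code)
        for label, code in zip(options_text, values)
        if label in NULL_FLAVOURS and NULL_FLAVOURS[label] != code
    ]

crfs = {
    "CRF1": {"Patient_Weight": "INT", "Smoking_Category": "INT"},
    "CRF2": {"Patient_Weight": "REAL"},
}
mismatches = type_mismatches(crfs)
errors = null_flavour_errors(["Never smoked", "No Information"], ["1", "-2"])
```

Both functions only report problems; fixing the CRF definitions is left to the person building them.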




Some tools we created: ID-Translator

Move rules file to new OC server -> replace all item IDs
Automatic translation of item identifiers in rules
Prevented replace errors and saved many hours of work
Requires:
  ViewCRFVersion file
    Contains item ID information for CRF on new server
  Rule file with properly specified header
    Contains item ID information for CRF on old server

Parse ViewCRFVersion -> mapping ITEM_NAME -> new OC_ID
  MedicatieBijgewerkt = I_TRACE_MEDICATIEBIJGEWERKT_4714
Parse Header of rule file -> mapping ITEM_NAME -> old OC_ID
  MedicatieBijgewerkt = I_TRACE_PATIENTSTUDIE_MOMENT_AFROND
Translate rule file: old OC_ID -> new OC_ID via ITEM_NAME
  I_TRACE_PATIENTSTUDIE_MOMENT_AFROND = I_TRACE_MEDICATIEBIJGEWERKT_4714

[Figure labels: ViewCRFVersion (new Server); Rules for old server; Translated Rules for new server; ITEM_NAME; OC_ID]
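The translation step (old OC_ID -> ITEM_NAME -> new OC_ID) can be sketched in Python as follows. This is not the team's ID-Translator: parsing of the ViewCRFVersion file and the rule-file header is assumed to have already produced two ITEM_NAME-to-OC_ID dicts, and the rule text "... eq 1" is an illustrative fragment rather than a complete rule file.

```python
import re

def translate_rules(rule_text, old_ids, new_ids):
    """Replace old OC_IDs in rule_text with new ones, matched via ITEM_NAME.

    old_ids and new_ids map ITEM_NAME -> OC_ID for the old and new server.
    """
    old_to_new = {old_ids[name]: new_ids[name] for name in old_ids if name in new_ids}
    if not old_to_new:
        return rule_text
    # One pass over the text, whole identifiers only, so an ID that is a
    # prefix of another ID is never partially replaced.
    pattern = re.compile(
        "|".join(rf"\b{re.escape(oid)}\b"
                 for oid in sorted(old_to_new, key=len, reverse=True))
    )
    return pattern.sub(lambda m: old_to_new[m.group(0)], rule_text)

old_ids = {"MedicatieBijgewerkt": "I_TRACE_PATIENTSTUDIE_MOMENT_AFROND"}
new_ids = {"MedicatieBijgewerkt": "I_TRACE_MEDICATIEBIJGEWERKT_4714"}
rule = "I_TRACE_PATIENTSTUDIE_MOMENT_AFROND eq 1"
translated = translate_rules(rule, old_ids, new_ids)
```

Doing all substitutions in a single regular-expression pass avoids the chained search-and-replace mistakes that manual editing invites, where an already translated ID gets replaced again.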

Things we learned/found useful

ITEM_NAME max 64 characters
  SPSS compatibility
Truly unique identifiers (description label)
  Easy to link to study definition (CTMMC)
  Useful for consistency checking
Negative NULL-flavour coding
  Prevent conflict with retrospective data
  Easy to keep NULL-flavour coding consistent
Specify identifiers in header of rule file
  Automatic translation
JavaScript code
  $.noConflict();
    Prevents our code from interfering with OC's code
  Reference to jquery
    <script src="//ajax.googleapis.com/ajax/libs/jquery/1.9.1/jquery.min.js">
    Prevents dependency on OC's jQuery version
Create a checklist and follow it during go-live

Goal: make researchers want to use OpenClinica and tranSMART

And many more…

Acknowledgements