AMIA Summit on Translational Bioinformatics 2010 - Northwest ...


Shared Genomics: Developing an accessible integrated analysis platform for Genome-Wide Association Studies

David Hoyle, Mark Delderfield, Lee Kitching, Gareth Smith, Iain Buchan

North West Institute for BioHealth Informatics, University of Manchester


Outline

Introduction

Scientific task

Shared Genomics objectives

Overview of the workbench

Demo of key features

Future development options

Download link and chance to give feedback

Scientific Task

Identify genetic variations associated with disease outcomes, and the plausible biological mechanisms, taking into account gene-environment interactions.

[Figure: "No disease" vs. "Disease" groups, compared with a χ² test]

For example, 1000 subjects, 0.6M SNPs and over 1000 clinical variables.

Manchester Asthma and Allergy Study - Birth Cohort

Salford Diabetes - Electronic Patient Records

Shared Genomics Platform

To design, build and implement an information system to help researchers efficiently analyze large-scale genetic data.

For example, genome-wide SNP pair associations = 0.6M × 0.6M × 10K / 2 tests
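
To give a feel for the scale, the count on this slide can simply be multiplied out. The sketch below is only arithmetic on the slide's own figures; the variable names are illustrative, and the slide does not say what the 10K factor counts, so it is treated here as an opaque multiplier.

```python
# Figures taken from the slide; names are illustrative only.
n_snps = 600_000      # 0.6M SNPs
multiplier = 10_000   # the slide's "10K" factor (its meaning is not stated there)

# The slide's expression: 0.6M * 0.6M * 10K / 2
n_tests = n_snps * n_snps * multiplier // 2

# For comparison: the exact number of unordered SNP pairs, n * (n - 1) / 2
n_pairs_exact = n_snps * (n_snps - 1) // 2

print(f"tests (slide formula): {n_tests:.2e}")
print(f"unordered SNP pairs:   {n_pairs_exact:.3e}")
```

The slide's n²/2 is a close approximation to the exact pair count when n is this large.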

Solution:

Deploy parallelised analysis algorithms on a High Performance Cluster

Provide an accessible workbench for clinicians
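
One way to spread pair-wise SNP tests over cluster nodes is to hand each worker a contiguous block of SNP rows holding a similar number of pairs. The slides do not describe how Shared Genomics partitions the work, so the sketch below is an assumption for illustration only; `split_rows` and `pairs_in_block` are hypothetical helper names.

```python
def split_rows(n_snps, n_workers):
    """Split SNP row indices into contiguous blocks with similar numbers of pairs.

    Row i is paired with every j > i, so it contributes n_snps - 1 - i pairs.
    Returns a list of (start, stop) half-open row ranges, one per worker.
    """
    total = n_snps * (n_snps - 1) // 2
    target = total / n_workers
    blocks, start, acc = [], 0, 0
    for i in range(n_snps):
        acc += n_snps - 1 - i
        # Close a block once it holds its share; the last block takes the rest.
        if acc >= target and len(blocks) < n_workers - 1:
            blocks.append((start, i + 1))
            start, acc = i + 1, 0
    blocks.append((start, n_snps))
    return blocks


def pairs_in_block(block, n_snps):
    """Number of (i, j) pairs with i in the block and j > i."""
    start, stop = block
    return sum(n_snps - 1 - i for i in range(start, stop))


# Small illustrative run: 1000 SNPs shared between 4 workers.
blocks = split_rows(1000, 4)
counts = [pairs_in_block(b, 1000) for b in blocks]
print(blocks)
print(counts)
```

Early rows carry more pairs than late ones, so the blocks get wider towards the end; splitting rows evenly instead would leave the first worker with most of the work.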

Shared Genomics Workbench

Support pre-processing & QC process

Run large scale analyses quickly

‘One-click’ annotation of your bio-markers (SNPs)

Tool to explore automatically tracked annotations




Support pre-processing & QC

Data can be filtered on:

SNPs, e.g. Hardy-Weinberg Equilibrium, Minor Allele Frequency (MAF), missing rate per SNP

Individuals, e.g. missing rate per person

Covariates, e.g. gender, ethnicity

Simplifies pre-processing - generates analysis input files
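
The SNP-level and person-level filters listed above can be sketched in a few lines. This is a minimal illustration, not the workbench's or PLINK's implementation: genotypes are assumed to be coded as 0, 1 or 2 copies of one allele with `None` for a missing call, the Hardy-Weinberg check is a plain χ² goodness-of-fit test, and the thresholds in `passes_qc` are placeholder values chosen for the example.

```python
import math


def snp_qc(genotypes):
    """Return (missing rate, minor allele frequency, HWE p-value) for one SNP."""
    called = [g for g in genotypes if g is not None]
    n = len(called)
    missing_rate = 1 - n / len(genotypes)
    if n == 0:
        return missing_rate, 0.0, 1.0
    p = sum(called) / (2 * n)            # frequency of the counted allele
    maf = min(p, 1 - p)
    if maf == 0:
        return missing_rate, maf, 1.0    # monomorphic: HWE test is undefined
    # Chi-square goodness of fit against Hardy-Weinberg proportions (1 d.f.)
    observed = [called.count(k) for k in (0, 1, 2)]
    expected = [n * (1 - p) ** 2, 2 * n * p * (1 - p), n * p ** 2]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    hwe_p = math.erfc(math.sqrt(chi2 / 2))
    return missing_rate, maf, hwe_p


def passes_qc(genotypes, max_missing=0.05, min_maf=0.01, min_hwe_p=1e-6):
    """Apply the three SNP filters; the default thresholds are assumptions."""
    missing_rate, maf, hwe_p = snp_qc(genotypes)
    return missing_rate <= max_missing and maf >= min_maf and hwe_p >= min_hwe_p


def person_missing_rate(calls):
    """Fraction of SNPs with no genotype call for one individual."""
    return sum(g is None for g in calls) / len(calls)


# One SNP in Hardy-Weinberg proportions and one with only heterozygotes.
good = [0] * 49 + [1] * 42 + [2] * 9 + [None] * 4
all_het = [1] * 100
print(passes_qc(good), passes_qc(all_het))
```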

Run large scale analyses quickly

Based on PLINK from Shaun Purcell (Broad Institute)

Modified algorithms to run on high performance cluster:

Basic association tests - χ²

Basic model-based calculations - Cochran-Armitage (CA) trend tests

Basic epistasis calculations - pair-wise

Basic test for association with non-genetic factor - Cochran-Mantel-Haenszel
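
For readers less familiar with the first two tests in this list, the textbook forms of the allelic χ² test and the Cochran-Armitage trend test can be written directly from count tables. This is a sketch of the standard statistics, not PLINK's or the workbench's code; the function names and the example counts are made up for illustration.

```python
import math


def chi2_p_1df(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


def allelic_chi2(case_alleles, control_alleles):
    """Pearson chi-square on a 2x2 table of allele counts (minor, major)."""
    a, b = case_alleles
    c, d = control_alleles
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def trend_chi2(case_genos, control_genos, weights=(0, 1, 2)):
    """Cochran-Armitage trend statistic from genotype counts (AA, Aa, aa)."""
    r_total, s_total = sum(case_genos), sum(control_genos)
    n = r_total + s_total
    col = [r + s for r, s in zip(case_genos, control_genos)]
    # Weighted difference between case and control genotype counts
    t = sum(w * (r * s_total - s * r_total)
            for w, r, s in zip(weights, case_genos, control_genos))
    sw = sum(w * c for w, c in zip(weights, col))
    sw2 = sum(w * w * c for w, c in zip(weights, col))
    var_t = r_total * s_total / n * (n * sw2 - sw ** 2)
    return t ** 2 / var_t


# Made-up genotype counts (AA, Aa, aa) for 100 cases and 100 controls.
cases, controls = (30, 50, 20), (50, 40, 10)
# Allele counts (a, A) derived from the genotype counts.
case_alleles = (cases[1] + 2 * cases[2], 2 * cases[0] + cases[1])
control_alleles = (controls[1] + 2 * controls[2], 2 * controls[0] + controls[1])

x2_allelic = allelic_chi2(case_alleles, control_alleles)
x2_trend = trend_chi2(cases, controls)
print(x2_allelic, chi2_p_1df(x2_allelic))
print(x2_trend, chi2_p_1df(x2_trend))
```

The trend test works on genotype counts rather than allele counts, so it does not rely on Hardy-Weinberg equilibrium in the way the allelic test does.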



Run large scale analyses quickly


Review results of analysis

Annotation Workflow

‘One click’ annotation of SNPs

Right click on ‘SNP’ provides menu of further biological annotations

Automatic capture for future review, with option to add comments

Annotation exploration

Future Development Options




Usability Testing underway

Possible options based on feedback so far:


Offer support for full WTCCC Sanger QC Protocol


Provide LD Heat Map Plots (drawn by R)


Offer a standalone annotation capture tool


Expand analysis options, e.g. IMPUTE


Your ideas?

Thanks & Download link


Microsoft External Research


OMII-UK & myGrid - workflows

Our clinical collaborators:

Prof. Adnan Custovic, Dr. Angela Simpson - University Hospital of South Manchester

Dr. John New, Dr. Martin Gibson - Salford Royal NHS Foundation Trust


Download from: www.nibhi.org.uk/sharedgenomics



Any questions?