How and Why Novartis is Exploiting GRID Technology?

HPC and Semantic Web


Prof. Manuel C. Peitsch, PhD

Global Head of Systems Biology

HPTS Asilomar / M. Peitsch / September, 2005

Mechanism-based Drug Discovery


Understanding Disease


Pathways elucidation


Target validation


Clinical PoC

New drug candidates (to be tested in PoC
studies)

Reduce project life cycle

Increase PoS after D3 (Lead optimisation)

The Challenges of Drug Discovery

Systems Biology: combination of *Omics & Mathematical Modelling


Japan



Oncology



Diabetes



Cardiovascular

Austria



Autoimmunity

Great Britain


Respiratory


Gastrointestinal

Switzerland


Muscular and Bone


Nervous system


Oncology


Transplantation


Ophthalmology


Genome and Proteome Sciences


Discovery Technologies


Discovery Chemistry


Protease Platform


GPCR

United States


Diabetes


Infectious diseases


Cardiovascular


Oncology


Discovery Technologies


Discovery Chemistry


Animal Models


Pathways


Genome and Proteome Sciences

Organizational complexity


Data and Information complexity

Raw data from instruments

Literature

Molecular Structure

[Figure: mass spectrum of the 48-residue peptide SSLLEKGLDGAKKAVGGLGKLGKDAVEDLESVGKGAVHDVKDVLDSVL; x-axis: Mass (m/z), 1500 to 5000; y-axis: % Intensity; peaks annotated with b- and y-ion fragment labels and [M+H]+]
Genomics and Proteomics


The Vision

Computational life science and HPC GRIDs

People Networks

Data, Information and Knowledge GRID

Knowledge Space / Semantic Web

Enable and transform the Drug Discovery process through:

- Comprehensive and reliable Data and Information
- Seamless information integration for easy navigation
- Turning Data into Knowledge using in silico science
- Simulating biomolecular processes using in silico science
- E-Collaboration and v-communities


Computational Aspects in Drug Discovery

Target finding

Target validation

Lead finding

Lead optim.

Bioinformatics Lab

Macromolecular Structure & Function Lab

Computational Chemistry Lab


Signal Transduction Networks

[Figure: simulated time-course plots; panel labels: control, drug; series labels: cyto, nuc; x-axis: time]



Human data

SNP

DNA Sample

Sequencing

DB

DB

SAP

Translate & Map/Align

Model & Map

DB

DB

Disease association

Validated Targets

Virtual Drug Discovery

In Silico Docking

In Silico “Chemogenomics”

Virtual Library Design

Predictive MedChem

Tox PK/PK ADME modelling

Functional and Structural insights

DB

Kinases

NR

Proteases

Structures & Modelling templates

Proteins

Compounds

QSAR

In Silico Drug Discovery


In Silico Drug Discovery Pipeline: Can it be done?

Timeline axis: 1990, 1995, 2000, 2005

3D-Crunch

Productive Automated Protein modelling email server

Productive Automated Protein modelling Web server

Genome scale Automated Protein modelling

SETI@Home

Protein Model Structure database

SETI@Home recognised as a leading new concept (ComputerWorld Award)

SWISS-MODEL and 3D-Crunch recognised as a leading new concept (ComputerWorld Award)

GeneCrunch

GeneCrunch recognised as a leading new concept (ComputerWorld Award)

First PC-GRID at Novartis

Docking in production at Novartis

Automated ToxCheck and other CIx tools

Full Transcriptome Modelling at Novartis

First automated pipelines

UD recognised for visionary use of information technology in the category of Medicine (ComputerWorld Award)

In Silico Drug Discovery and Chemogenomics pipeline


Novartis’ HPC Grid Strategy

Linux Clusters

Shared Servers

PC GRID

External Collaborations

Job submission layer


Influencing Biomolecular Processes

Target

Drug

Target = enzyme, receptor, nucleic acid, …

Ligand = substrate, hormone, other messenger, ...

Target

ACTIVE

Ligand

INACTIVE


PC Grid Success Story: Protein Kinase CK2 Inhibition

Target finding:

Protein Kinase CK2 has roles in cell growth, proliferation and
survival.

Protein Kinase CK2 has a possible role in cancer, and its
overexpression has been associated with lymphoma.


Target validation:

To elucidate the different functions and roles of CK2 and confirm
it as a drug target for oncology, one needs a potent and
selective inhibitor.


Approach:

The problem was addressed by in silico screening (docking).


Virtual Screening by in silico Docking

> 400,000 Compounds

Docking Process and Selection of possible hits

< 10 Compounds
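
The funnel above (more than 400,000 compounds in, fewer than 10 out) can be read as a score, filter and rank pipeline. The following is a minimal sketch of that idea, not the workflow used at Novartis: `dock_score` is a hypothetical placeholder for a real docking program, and the score threshold and hit count are assumed values chosen only for illustration.

```python
import heapq
import random
from typing import Callable, Iterable, List, Tuple


def dock_score(compound_id: str) -> float:
    """Hypothetical stand-in for one docking run.

    Returns a pseudo binding score (lower is better). A real pipeline
    would launch a docking program for this compound instead.
    """
    rng = random.Random(compound_id)  # deterministic per compound id
    return rng.uniform(-12.0, 0.0)


def screen(
    compounds: Iterable[str],
    score: Callable[[str], float] = dock_score,
    threshold: float = -9.0,   # assumed cut-off, not from the talk
    max_hits: int = 10,        # the slide reports "< 10" selected compounds
) -> List[Tuple[float, str]]:
    """Score every compound, keep those at or below the threshold,
    and return the best `max_hits` as (score, compound) pairs."""
    scored = ((score(c), c) for c in compounds)
    passing = (pair for pair in scored if pair[0] <= threshold)
    return heapq.nsmallest(max_hits, passing)


library = [f"CPD-{i:06d}" for i in range(5000)]  # hypothetical compound ids
hits = screen(library)
```

Each compound is scored independently of the others, so the scoring step is the part that can be spread across many machines; only the final ranking needs to see all the results.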


Important results

Conclusion

We have identified a 7-substituted Indoloquinazoline compound as a novel
inhibitor of protein kinase CK2 by virtual screening of 400 000 compounds,
of which a dozen were selected for actual testing in a biochemical assay.
The compound inhibits the enzymatic activity of CK2 with an IC50 value of
80 nM, making it the most potent inhibitor of this enzyme ever reported.
Its high potency, associated with high selectivity, provides a valuable
tool for the study of the biological function of CK2.

“The reported work clearly shows that large database
docking in conjunction with appropriate scoring and
filtering processes can be useful in medicinal chemistry.
This approach has reached a maturation stage where it
can start contributing to the lead finding process. At the
time of this study, nearly one month was necessary to
complete such a docking experiment in our laboratory
settings. The Grid computing architecture recently
developed by United Devices allows us to now perform the
same task in less than five working days using the power
of hundreds of desktop PC’s. High-throughput docking has
therefore acquired the status of a routine screening
technique.”


Major benefits of GRID computing

Optimization of resource utilization:


HPC platform usage is maximized and technology expertise is shared.


Response to additional performance requirements is easier and faster.


No service downtime, because the same job can run on many platforms
across different sites.


Enable collaboration and synergies across business units:


Single efficient access path to Data and Compute resources.


Tools are easily exchanged between scientists/programs.


Favor “out of the box” thinking:


Apply HPC to areas which one would not even have considered a year ago.
This has created fertile ground for new paradigms in Drug Discovery,
leading to Business Process transformation.


Performance of the PC-GRID (today)

Computing Power:


Theoretical >5 TeraFLOPS harvested from 3000 PCs in all
geographical locations.


Acceleration of the in silico Docking process versus 1
standard 2002 PC (start of project): ~4000x



Financial:


Immediate savings in excess of 2m$.


No need for additional data centre to support this
computing power.


Optimal use of existing hardware (associates’ PCs)
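
The computing-power figures on this slide can be put in perspective with simple arithmetic. The sketch below only restates the slide's numbers (more than 5 TeraFLOPS theoretical, 3000 PCs, roughly 4000x acceleration). The one-year baseline job is a hypothetical example, not a figure from the talk.

```python
def per_pc_gflops(total_tflops: float, n_pcs: int) -> float:
    """Average theoretical GFLOPS contributed by each PC in the grid."""
    return total_tflops * 1000.0 / n_pcs


def accelerated_hours(baseline_hours: float, speedup: float) -> float:
    """Wall-clock time on the grid for a job that takes `baseline_hours`
    on the single reference machine, given an overall speedup factor."""
    return baseline_hours / speedup


# Figures from the slide: >5 TeraFLOPS harvested from 3000 PCs, ~4000x speedup.
avg_gflops = per_pc_gflops(5.0, 3000)

# Hypothetical job: one year of compute on the reference 2002 PC.
grid_hours = accelerated_hours(365 * 24, 4000)
```

Under these assumptions the average contribution is under 2 GFLOPS per PC, and a job that would occupy the reference PC for a year would finish in a few hours on the grid.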


Building a GRID: Management focus

You need a champion!

Do not punctuate every sentence with the GRID word
and avoid the Hype!

Demonstrate value through pilots:


Think “Iterative Improvement”. The conceptual layers are there,
prototypes are emerging, improvement and optimization are
essential, maturity will follow.

Leadership, transcendence, entrepreneurship and
tenacity are the essence of transformation!


Concepts are easy to draw on a napkin over beer!


But new and great things are hard to achieve!


Use external goodwill to create internal acceptance!


Peru

Community projects help with acceptance


Building a GRID: User base

You need a clearly defined and communicated HP
Computing strategy.


Address unmet computational needs.


Apply HPC to areas which one would not even have considered
two years ago. This has created fertile ground for new
paradigms in Drug Discovery, leading to Business Process
transformation.


Are all problems “GRIDable”?


Further applications:


Sequence identification in proteomics from LC-MS/MS data


Text Mining and semantic Web infrastructure


Building a GRID: Software

The Software licensing models will have to evolve


Do not stop because of software licensing issues.


Show success with freeware and home grown algorithms.


Demonstrate business value and cost leadership.


Opportunity to develop your own code?


Unification of HPC applications environment:


Ensure that applications can run on the maximum number of
systems.


Introduce HPC software management:


Influence licensing models. The classical models do not fit the
GRID and HPC paradigm.


Building a GRID: PC owners

Education and awareness.


Ensure that the HelpDesk is well trained and gives the right
answers.


Ensure that PC owners know about the REAL impacts,
including network.


The PCs are company assets, not personal assets!


Using them when they are idle is a company decision, not a user
decision.


Address power saving policies in a transparent manner.


Knowledge Space - Vision

The "Knowledge Space Portal” is a Drug Discovery oriented
implementation of the Semantic Web. Through a single
customizable interface it:


Federates heterogeneous data resources and provides precise
organization of the content



Provides quick and intuitive access to information


Provides data extraction, analysis and exploration tools


Allows data integration, data exchange and interoperability of
applications


Provides mechanisms for data capture and annotation


Provides knowledge sharing and collaborative tools


Basic principles behind the Knowledge Space

The Knowledge Space consists of:

The collection of all types of data and information within the
scope of interest defined by a particular business. There is no
conceptual difference between internal and external
data/information.

The Meta Data and the Knowledge Map, which describe the
collection in terms of content and location.

The Text Mining platform, which allows the identification of
entities (using vocabularies) and the concepts they belong to
(using ontologies).

The Ultralinker, which associates identified entities and concepts
with specific contextual rules.

A user interface.


What is an Ultralink?

The Ultralink is an “intelligent” context-sensitive Hyperlink created at run time by the
Ultralinker.

The Ultralink is generally a menu of links instead of a single link.

This menu offers only sensible actions/options:


No dead ends, because a verification process ensures that each link has a target.


The Ultralink provides direct interaction between any type of entity (gene name,
compound name, mode of action, disease name, company name, etc.) and an
appropriate set of tools and resources as defined by the rules encoded in the
Ultralinker.


The Ultralink functionality allows the selection of any portion of text in the Web
browser and sends it as input to the Ultralinker for analysis and menu creation.


The Ultralink allows easy navigation across the information domains contained in the
Knowledge Space.


How the Ultralinker works

The Ultralinker is a Web service which analyses any information (such as a complete
web page) it receives for recognisable entities using text mining and pattern
recognition methods.

Each recognised item is mapped onto the ontologies and the Knowledge Map.

The Expert System will define what can be done with the identified entities e.g.



If a gene name is recognised then Ultralinks are created to:


get its sequence and perform sequence similarity searches;


query genetic disorder databases and map it onto the chromosome;


produce a 3D structure by comparative modelling;


look for hits from High Throughput Screening;


etc…

Automated predefined processes can thus be activated by a single click (Ultraaction or
workflow).


The Ultralinker will create a menu that will be sent to the User interface.
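
The steps described on this slide (recognise entities, map them to concepts, let rules decide which actions apply, return a menu) can be sketched in a few lines. This is an illustrative toy, not the Novartis implementation: the vocabulary, the rule table, the `example.org` URLs and the `has_target` check are all hypothetical, and plain word matching stands in for real text mining.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical vocabulary: surface term -> concept it belongs to.
VOCABULARY = {"CK2": "gene", "thrombin": "gene", "lymphoma": "disease"}

# Hypothetical rules: concept -> (menu label, URL template) pairs.
RULES = {
    "gene": [
        ("Get sequence", "https://example.org/seq?q={term}"),
        ("Build 3D model", "https://example.org/model?q={term}"),
        ("Find HTS hits", "https://example.org/hts?q={term}"),
    ],
    "disease": [
        ("Search literature", "https://example.org/lit?q={term}"),
    ],
}


@dataclass(frozen=True)
class Ultralink:
    term: str
    concept: str
    menu: Tuple[Tuple[str, str], ...]  # (label, url) pairs


def recognise(text: str) -> List[Tuple[str, str]]:
    """Return (term, concept) pairs found in the text, ordered by
    first appearance. Whole-word, case-insensitive matching only."""
    hits = []
    for term, concept in VOCABULARY.items():
        match = re.search(rf"\b{re.escape(term)}\b", text, flags=re.IGNORECASE)
        if match:
            hits.append((match.start(), term, concept))
    return [(term, concept) for _, term, concept in sorted(hits)]


def build_menus(
    text: str,
    has_target: Callable[[str], bool] = lambda url: True,
) -> List[Ultralink]:
    """Create one menu per recognised entity, keeping only links whose
    target passes the verification callback (no dead ends)."""
    links = []
    for term, concept in recognise(text):
        menu = tuple(
            (label, template.format(term=term))
            for label, template in RULES.get(concept, [])
            if has_target(template.format(term=term))
        )
        if menu:  # an entity with no valid action gets no Ultralink
            links.append(Ultralink(term, concept, menu))
    return links


links = build_menus("Overexpression of CK2 has been associated with lymphoma.")
```

Passing a stricter `has_target` callback models the verification step: any link whose target cannot be confirmed is dropped before the menu reaches the user interface.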


What constitutes the Knowledge Space

Ultralinker

Semantic Search

Text Mining

Analytics

Internet

Other

Research

Documentation

Chemistry

Biology

Literature

Comp. Inf.

Bioinformatics

Meta Data

K map

Thesauri

Ontologies

Rules

Defined workflows


Knowledge Space Search Modes

Text

Structure

Concepts


Knowledge Space: Text search

ACE modulator
Cholecystokinin modulator
Metalloprotease 4 modulator
ACE-related carboxypeptidase modulator
Chymase modulator
Metalloprotease 7 modulator
Acrosin modulator
Chymotrypsin modulator
Metalloprotease 8 modulator
Aggrecanase modulator
Clipsin modulator
Metalloprotease 9 modulator
Alpha 1 protease modulator
Collagenase modulator
Metalloprotease modulator
Alpha 1 proteinase inhibitor
Complement cascade modulator
NAALADase modulator
Alpha 1 proteinase modulator
Complement factor modulator
Pepsin modulator
Aminopeptidase modulator
Cysteine protease modulator
Peptidase modulator
Amyloid protease modulator
Dipeptidase modulator
Plasmepsin modulator
Antitrypsin modulator
Elastase modulator
Plasmin modulator
Aspartic protease modulator
Endopeptidase modulator
Protease inhibitor
Atriopeptidase modulator
Endothelin converting enzyme modulator
Protease stimulant
Calpain inhibitor
Factor IX modulator
Proteasome inhibitor
Calpain modulator
Factor VII modulator
Proteasome modulator
Carboxypeptidase modulator
Factor X modulator
Renin modulator
Caspase modulator
Factor XII modulator
Secretase modulator
Cathepsin B modulator
Gelatinase modulator
Serine protease modulator
Cathepsin D modulator
Interleukin 1 converting enzyme modulator
Thrombin modulator
Cathepsin F modulator
Kallikrein modulator
Thrombokinase modulator
Cathepsin G modulator
Metalloprotease 1 modulator
Trypsin modulator
Cathepsin K modulator
Metalloprotease 11 modulator
Tryptase modulator
Cathepsin L modulator
Metalloprotease 12 modulator
Ubiquitin-specific protease inhibitor
Cathepsin modulator
Metalloprotease 13 modulator
Ubiquitin-specific protease modulator
Cathepsin S modulator
Metalloprotease 2 modulator
Ubiquitin-specific protease stimulant
Cathepsin V modulator
Metalloprotease 3 modulator
Urokinase modulator
Cathepsin X modulator
Viral protease modulator
Antiviral
CMV protease inhibitor
CMV protease modulator
Hepatitis C protease inhibitor
Hepatitis C protease modulator
Herpes simplex virus protease inhibitor
Herpes simplex virus protease modulator
HIV protease inhibitor
HIV protease modulator
HIV-1 protease inhibitor
HIV-1 protease modulator
HIV-2 protease inhibitor
HIV-2 protease modulator
NS3 protease inhibitor
NS3 protease modulator
Picornavirus protease inhibitor
Picornavirus protease modulator
Expansion: EMTREE + Novartis proprietary dictionary

Expansion for protease modulators + respective synonyms
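
The term list on this slide is the result of expanding one query concept into its narrower terms and synonyms. A minimal sketch of such an expansion follows. The miniature thesaurus is a hypothetical stand-in for EMTREE and the proprietary dictionary, and the parent/child grouping and synonym entries are assumptions made for illustration.

```python
from collections import deque
from typing import Dict, List

# Hypothetical miniature thesaurus: term -> narrower terms.
NARROWER: Dict[str, List[str]] = {
    "protease modulator": [
        "protease inhibitor",
        "serine protease modulator",
        "viral protease modulator",
    ],
    "serine protease modulator": ["thrombin modulator", "trypsin modulator"],
    "viral protease modulator": ["HIV protease inhibitor", "NS3 protease inhibitor"],
}

# Hypothetical synonym table: term -> alternative spellings or names.
SYNONYMS: Dict[str, List[str]] = {
    "protease modulator": ["proteinase modulator"],
    "HIV protease inhibitor": ["HIV proteinase inhibitor"],
}


def expand(term: str) -> List[str]:
    """Breadth-first expansion of a term into itself, its synonyms and
    all narrower terms (with their synonyms), without duplicates."""
    seen = set()
    ordered = []
    queue = deque([term])
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.add(current)
        ordered.append(current)
        queue.extend(SYNONYMS.get(current, []))
        queue.extend(NARROWER.get(current, []))
    return ordered


def to_query(terms: List[str]) -> str:
    """Join the expanded terms into a simple OR query of quoted phrases."""
    return " OR ".join(f'"{t}"' for t in terms)


expanded = expand("protease modulator")
query = to_query(expanded)
```

The `seen` set keeps the expansion finite even if the thesaurus contains cycles or a term is reachable by more than one path.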



Analysis tools

Display - Navigation - Ultralink

Protease modulator in Literature DB (Medline-Embase)

Sort capabilities

Easy navigation in record titles

Search report: number of docs, keywords extracted

Ranking value and access to document


Document view

Take advantage of the full-text article provided by PubMed


Analysis Tools


Data Analysis


Protease modulators in CI DBs, July 2004 - ADIS & Pharmaprojects

Univariate - Companies

Univariate - MOA

Univariate - Diseases conditioned by Companies

Clustering Diseases - MOAs


Graph Navigator

Protease modulators in CI DBs, July 2004 - ADIS & Pharmaprojects


Clustering


Chemistry, Chemoinformatics and Structural Biology