Chemical Visualization: The Art of Drawing a Chemical Structure


/slides/cactvs/acswashington2000.ppt

© Ihlenfeldt 2000

C
3

Chemical Visualization:

The Art of Drawing a Chemical Structure

W. D. Ihlenfeldt

Computer-Chemistry-Center

University of Erlangen-Nuremberg

Erlangen, Germany


Topics

Motivation

Drawing One Structure

Drawing a Set of Structures

Visualizing Structure Attributes


Motivation: 3D is not Everything


3D structure displays are valuable tools, but ...

limited to viewing part of structure

unsuited for quick comparisons




Motivation: Defending 2D Plots


2D structure plots are still the core of chemical information:

show complete structure

easy recognition of patterns



Motivation



So why can’t you just draw it with [your editor of choice]?

Because nobody wants to draw millions of structures!

The trend towards combinatorial libraries and structure design requires fast, reliable and automatic drawing of enormous numbers of compounds.

Unfortunately, 2D structure drawing is hard. [Helson, Rev. Comp. Chem., 13, 313, 1999]


The Rules



Chemists know what a good plot should look like:

Not a simple projection from 3D

Complex, ill-defined set of rules


Basic Structure Data






Draw a structure from connectivity

Atom/bond table

No coordinates to begin with

Often specified in linear notation such as SMILES

O1C(/C=C/[C@@](/C=C/[C@@H](C(=C\CC[C@H]([C@@H]1C)C)/C)OC)(O)C)=O
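
To make the starting point concrete, here is a minimal sketch of an atom/bond table that carries connectivity but no coordinates. The molecule (ethanol, SMILES `CCO`) and the names `atoms`, `bonds`, and `neighbors` are illustrative assumptions, not the CACTVS data model.

```python
# Atom table: index -> element symbol (heavy atoms only, hydrogens implicit).
atoms = {0: "C", 1: "C", 2: "O"}

# Bond table: (atom index, atom index, bond order). No x/y coordinates anywhere.
bonds = [(0, 1, 1), (1, 2, 1)]

def neighbors(atoms, bonds):
    """Build an adjacency list: the connectivity a layout engine has to start from."""
    adjacency = {index: [] for index in atoms}
    for a, b, _order in bonds:
        adjacency[a].append(b)
        adjacency[b].append(a)
    return adjacency

adjacency = neighbors(atoms, bonds)
```

Everything a drawing needs beyond this (positions, ring orientation, wedges) has to be computed.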


The Complications



Orientation of ring systems






Close atom contacts caused by colliding fragments




The Complications

Rules for bridge systems

Cages


The Complications



1,3/1,4-embedded rings






Crowded connection points


The Complications



Choice of stereo attributes (wedges, etc.)






Trans bonds in rings


A Simple Structure?



dual 1,4-embedded rings

close contacts

no solution on 120° grid

16-membered ring

implicit constraints

with ±60°, ±90°, ±45°, ±35° angles

8^14 naive patterns (about 4.4∙10^12)

max. for real-time response: 250,000 (2.5∙10^5)
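
The size of the gap between naive enumeration and the real-time budget can be checked with a few lines of arithmetic. The variable names are illustrative, and reading the exponent 14 as the number of free angle decisions in the 16-membered ring is an assumption.

```python
angle_choices = 8            # +/-60, +/-90, +/-45, +/-35 degrees
free_decisions = 14          # exponent quoted on the slide (assumed: free angle choices)
naive_patterns = angle_choices ** free_decisions

realtime_budget = 250_000    # maximum quoted for real-time response
reduction_needed = naive_patterns / realtime_budget

print(naive_patterns)        # 4398046511104
print(f"{reduction_needed:.1e}")
```

Under these assumptions the search space has to shrink by roughly seven orders of magnitude before real-time response becomes feasible.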


Controlling the Number of Patterns



closing distance and angle

cis/trans information

total distance criterion

full loop criterion

no clustering of non-60° angles

pseudo-energy selector

favoring 60°, symmetry
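
A minimal sketch of how the distance and closing criteria can prune a ring enumeration. The slide does not spell out the actual CACTVS criteria, so the function name, the unit bond length, the tolerance, and the six-membered test ring are all assumptions made for illustration.

```python
import math

def enumerate_rings(n_bonds, turns_deg, tol=1e-6):
    """Enumerate closed rings of unit-length bonds using only the given turn angles."""
    solutions = []
    visited = 0

    def extend(x, y, heading, turns):
        nonlocal visited
        visited += 1
        remaining = n_bonds - (len(turns) + 1)   # bonds still to be placed
        # Total-distance criterion: the open end must still be able to reach the start.
        # With no bonds left this doubles as the closing-distance check.
        if math.hypot(x, y) > remaining + tol:
            return
        if remaining == 0:
            # Closing-angle criterion: the turn back onto the first bond must be allowed.
            closing = (-heading + 180) % 360 - 180
            if any(abs(closing - t) < tol for t in turns_deg):
                solutions.append(tuple(turns) + (closing,))
            return
        for turn in turns_deg:
            new_heading = heading + turn
            rad = math.radians(new_heading)
            extend(x + math.cos(rad), y + math.sin(rad), new_heading, turns + [turn])

    extend(1.0, 0.0, 0, [])   # first bond fixed along the x axis
    return solutions, visited

rings, visited = enumerate_rings(6, (60, -60))
```

For the six-membered test case only the two regular hexagons (all turns +60° or all turns -60°) survive, and the distance check cuts branches before they are fully expanded. The cis/trans, clustering, and pseudo-energy criteria from the list are not modeled here.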


A Simple Structure?

[Figure: plots of the same structure as produced by Isis/Draw 2.2, ChemSketch 3.5, ChemDraw 5.0, and CACTVS 3.113]


More Examples

Symmetry-preserving optimization by synchronous bond bending


More Examples

Optimization by bond shortening


More Examples

Cage system analysis (not a template)


More Examples

Complex ring system and bond arrangement analysis


More Examples: Complex Ring Systems

C\C(C(CC)C(C)/C([H])=C/C([H])=C(CO3)/C2(O)C3C(OC)C(C)=CC2C4=O)=C([H])/CC1OC5(CCC(C)C(C(C)C)O5)CC(O4)C1


Complex Ring Analysis

Not really a norbornane-type bridge system

3 trans ring bonds

2 implicit cis bonds

2 implicit trans bonds


The CACTVS Plot Algorithms



Define the state of the art

Fully stereo-aware, in chains and rings

Triple bonds in rings

Automatic wedge assignment

Intelligent pseudo-energy optimizer

C[C@H]1[C@H](C)CC\C=C/[C@H](OC)/C=C/[C@@](C)(O)/C=C/C(O1)=O


Open Access

Test it!







http://www2.ccc.uni-erlangen.de/services/gif.html

(final version soon)


Not just a Passive Image...

Web Interface

With image maps: portable, user-friendly selection/manipulation alternative

C[C@H]1[C@H](C)CC\C=C/[C@H](OC)/C=C/[C@@](C)(O)/C=C/C(O1)=O


More Examples

[Figure: further structure plots]


Just a typical small screening experiment: 1148 structures


346 Clusters

fingerprint clustering (144 bits)

cluster #192: 18 compounds


A Closer Look



Plots have similar characteristics

Structures not optimally aligned


Aligning Multiple Structures



Move Morgan-center atom to origin



Operate on hexagonal grid



Center on grid


Aligning Multiple Structures

Operations on structures:




Move to neighboring grid cell (6)

Rotate by a multiple of 60° (5)

Rotation around center or hetero atom

Horizontal and vertical flip

120° flip of a non-terminal, non-ring bond

Shuffle to other sequence position
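
The rigid-body operations in this list map naturally onto integer lattice coordinates. The sketch below assumes axial coordinates `(q, r)` on a triangular lattice, whose six unit steps are the 60° bond directions; the coordinate system and all function names are assumptions, not taken from the talk. The bond flip and sequence shuffle are left out.

```python
# Six unit steps of the lattice: the "move to neighboring grid cell" operations.
NEIGHBOR_STEPS = [(1, 0), (0, 1), (-1, 1), (-1, 0), (0, -1), (1, -1)]

def translate(cells, step):
    """Shift every occupied cell by one lattice step."""
    dq, dr = step
    return {(q + dq, r + dr) for q, r in cells}

def rotate(cells, sixths, center=(0, 0)):
    """Rotate by sixths * 60 degrees around a lattice point (center or hetero atom)."""
    cq, cr = center
    rotated = set()
    for q, r in cells:
        q, r = q - cq, r - cr
        for _ in range(sixths % 6):
            q, r = -r, q + r          # one 60 degree step in axial coordinates
        rotated.add((q + cq, r + cr))
    return rotated

def flip_vertical(cells):
    """Mirror across the horizontal axis (y -> -y)."""
    return {(q + r, -r) for q, r in cells}

def flip_horizontal(cells):
    """Mirror across the vertical axis (x -> -x)."""
    return {(-q - r, r) for q, r in cells}

ring = set(NEIGHBOR_STEPS)            # six ring atoms around an empty center cell
```

Because every operation maps lattice points to lattice points, overlap between structures can be tested by exact set intersection rather than by comparing floating-point coordinates.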


Aligning Multiple Structures

Merit function:



Multiple occupation of grid cells



Bonus for overlay of ring atoms, aromatic atoms, hetero atoms

Multiplicative rating within cell

Rate only against earlier entries in sequence, weight 1/n

Optimizer:

Tabu search
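
A rough sketch of how such a merit function and a tabu search could fit together. The bonus values, the reading of "weight 1/n" as the inverse distance between two sequence positions, the restriction to translation moves, and every name below are assumptions; the talk gives no formulas.

```python
NEIGHBOR_STEPS = [(1, 0), (0, 1), (-1, 1), (-1, 0), (0, -1), (1, -1)]
BONUS = {"ring": 2.0, "aromatic": 2.0, "hetero": 3.0}   # hypothetical weights

def shift(structure, step):
    """Move a structure (dict: grid cell -> frozenset of atom tags) by one lattice step."""
    dq, dr = step
    return {(q + dq, r + dr): tags for (q, r), tags in structure.items()}

def merit(sequence):
    total = 0.0
    for i, later in enumerate(sequence):
        for j in range(i):                        # rate only against earlier entries
            weight = 1.0 / (i - j)                # assumed reading of "weight 1/n"
            for cell, tags in later.items():
                if cell in sequence[j]:           # multiple occupation of a grid cell
                    rating = 1.0
                    for tag in tags & sequence[j][cell]:
                        rating *= BONUS.get(tag, 1.0)   # multiplicative rating
                    total += weight * rating
    return total

def tabu_align(structures, iterations=20, tenure=5):
    """Always take the best non-tabu move; recently reversed moves stay forbidden."""
    current = [dict(s) for s in structures]
    best, best_merit = current, merit(current)
    tabu = []
    for _ in range(iterations):
        candidates = []
        for idx in range(1, len(current)):        # first structure is the fixed reference
            for step in NEIGHBOR_STEPS:
                if (idx, step) in tabu:
                    continue
                trial = current[:idx] + [shift(current[idx], step)] + current[idx + 1:]
                candidates.append((merit(trial), idx, step, trial))
        if not candidates:
            break
        score, idx, step, current = max(candidates, key=lambda c: c[0])
        tabu = (tabu + [(idx, (-step[0], -step[1]))])[-tenure:]   # forbid undoing the move
        if score > best_merit:
            best, best_merit = current, score
    return best, best_merit

reference = {(0, 0): frozenset({"ring"}), (1, 0): frozenset({"ring"}),
             (0, 1): frozenset({"hetero"})}
displaced = shift(reference, (2, 0))
aligned, score = tabu_align([reference, displaced])
```

In this toy case the displaced copy is walked back onto the reference in two moves. The search keeps moving afterwards, as tabu search does, but the best alignment seen is what gets returned.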




A Simple Example

[Figure: structure plots]


The Cleaned Cluster


Atomic Charges: Wild Growth...


More Methods to Display Charges...


And Another Method!


2D Voronoi Polygons



Polygons around atoms



Full use of available space



Extra hidden points to limit area



Encode attribute by color codes



Immediate recognition
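
One simple way to picture the idea is a rasterized Voronoi diagram: every pixel is assigned to its nearest site, and sites that stand for hidden points leave their pixels uncolored, which bounds the atom polygons. This brute-force sketch is an assumption about how such a display could be built, not the method used in CACTVS; the function names and the blue/white/red color scale are hypothetical.

```python
def voronoi_raster(atoms, hidden_points, width, height):
    """Label each pixel with the index of its nearest atom, or None near a hidden point."""
    sites = [(x, y, index) for index, (x, y) in enumerate(atoms)]
    sites += [(x, y, None) for x, y in hidden_points]
    grid = []
    for py in range(height):
        row = []
        for px in range(width):
            nearest = min(sites, key=lambda s: (s[0] - px) ** 2 + (s[1] - py) ** 2)
            row.append(nearest[2])
        grid.append(row)
    return grid

def charge_color(charge, limit=1.0):
    """Map a charge in [-limit, limit] to RGB: blue negative, white zero, red positive."""
    t = max(-1.0, min(1.0, charge / limit))
    if t >= 0:
        return (255, round(255 * (1 - t)), round(255 * (1 - t)))
    return (round(255 * (1 + t)), round(255 * (1 + t)), 255)

atoms = [(2, 2), (6, 2)]              # 2D plot coordinates of two atoms
hidden = [(2, 7), (6, 7)]             # extra hidden points that cap the polygons
grid = voronoi_raster(atoms, hidden, width=9, height=9)
```

Painting each labeled pixel with `charge_color` of its atom's attribute fills the available space around the structure, which is what makes the attribute visible at a glance.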


Back to the Cluster



David Covell, NCI, NIH: "What is the essence? How can I SEE what the principle behind the cluster is?"




What are the similarities?



What are the differences?



What is the prototypical compound?


Decoding Fingerprints



Fingerprints encode presence of fragments

No count

No location

Vector %11010

Loss of information: can it be reclaimed or ignored?
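
The information loss is easy to reproduce with a toy fingerprint. In this sketch each bit stands for one bonded element pair; the five-entry fragment dictionary and the two example molecules are assumptions chosen so that the output has the same shape as the vector above, and real fingerprints use far richer fragment sets.

```python
# Hypothetical fragment dictionary: one bit per bonded element pair.
FRAGMENTS = [("C", "C"), ("C", "N"), ("C", "O"), ("N", "O"), ("C", "S")]

def fingerprint(atoms, bonds):
    """Return a bit string with 1 wherever the fragment occurs at least once."""
    present = {tuple(sorted((atoms[a], atoms[b]))) for a, b in bonds}
    return "".join("1" if fragment in present else "0" for fragment in FRAGMENTS)

# Acetaldoxime, CC=NO (bond orders ignored in this toy model).
acetaldoxime = (["C", "C", "N", "O"], [(0, 1), (1, 2), (2, 3)])
# Butanone oxime, CCC(C)=NO: more C-C bonds, in different places.
butanone_oxime = (["C", "C", "C", "C", "N", "O"],
                  [(0, 1), (1, 2), (2, 3), (2, 4), (4, 5)])

fp_small = fingerprint(*acetaldoxime)
fp_large = fingerprint(*butanone_oxime)
```

Both molecules yield the same bit string even though the larger one contains three C-C bonds instead of one: the count and the location of each fragment are gone.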


Decoding Fingerprints

Approach:





Perform statistics on all possible matches of all fragments in fingerprint set and weight atom participation

Relative occurrence of fragments on atom compared to prototype

Prototype is either virtual average cluster structure, or specific selected compound
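
One possible reading of this approach, sketched with toy fragments (bonded element pairs). The weighting scheme is an assumption: each fragment is weighted by the fraction of cluster members that contain it (a stand-in for the virtual average), and that weight is shared equally among all matches of the fragment in the compound. None of the names come from the talk.

```python
from collections import defaultdict

FRAGMENTS = [("C", "C"), ("C", "N"), ("C", "O"), ("N", "O"), ("C", "S")]

def matches(atoms, bonds, fragment):
    """All bonds in the structure that match the fragment."""
    return [(a, b) for a, b in bonds if tuple(sorted((atoms[a], atoms[b]))) == fragment]

def bit_frequencies(cluster):
    """Virtual average: fraction of cluster members in which each fragment occurs."""
    return {fragment: sum(bool(matches(atoms, bonds, fragment)) for atoms, bonds in cluster)
                      / len(cluster)
            for fragment in FRAGMENTS}

def atom_scores(atoms, bonds, frequencies):
    """Accumulate, per atom, the weight of every fragment match it takes part in."""
    scores = defaultdict(float)
    for fragment, frequency in frequencies.items():
        hits = matches(atoms, bonds, fragment)
        for a, b in hits:
            scores[a] += frequency / len(hits)   # share the weight among all matches
            scores[b] += frequency / len(hits)
    return dict(scores)

acetaldoxime = (["C", "C", "N", "O"], [(0, 1), (1, 2), (2, 3)])
butanone_oxime = (["C", "C", "C", "C", "N", "O"],
                  [(0, 1), (1, 2), (2, 3), (2, 4), (4, 5)])
ethanol = (["C", "C", "O"], [(0, 1), (1, 2)])

frequencies = bit_frequencies([acetaldoxime, butanone_oxime, ethanol])
scores = atom_scores(*acetaldoxime, frequencies)
```

Atoms that carry fragments shared by most of the cluster end up with high scores, and such per-atom values are the kind of attribute that can then be color-coded on the 2D plot.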


Results: Virtual Median Structure


Results: Choose a Prototype


Results: Choose Another Prototype


Acknowledgements



Marc Nicklaus, David Covell, et al. at NCI, NIH




BASF




Chemical Concepts




DuPont de Nemours


More Information

W. D. Ihlenfeldt

wdi@ccc.chemie.uni-erlangen.de

http://www2.ccc.uni-erlangen.de/wdi/