Version 5.3, February 2010

Scientific & technical presentation


JChem Base

Introduction to JChem Base

High-performance Java-based tools for storage, search and retrieval of chemical structures and associated data.

The components can be integrated into web-based or standalone applications in association with other ChemAxon tools.

Structural overview

Layers shown in the architecture diagram:

- Web browser, Application, Web application
- JChem Base API: Chemical logic, Structure cache
- JDBC driver: Standard interface to the RDBMS
- RDBMS (e.g. Oracle, MySQL, etc.): Storage and security

Compatibility and integration

File formats:

- SMILES
- MDL molfile (v2000 and v3000)
- MDL SDF
- RXN
- RDF
- MRV
- IUPAC name, InChI
- Markush DARC
- CDX

Integration: extensive API for

- Java
- .NET
- JChem Cartridge for Oracle

Database engines:

- Oracle
- MySQL
- MS SQL Server
- PostgreSQL
- MS Access
- IBM DB2
- Derby
- etc.

Operating systems:

- Windows
- Linux
- Mac OS X
- Solaris
- etc.

JSP example application

Features:

- Substructure, Superstructure, Full, Exact fragment, Similarity and Perfect search
- Molecular Descriptor similarity search with descriptor coloring
- Substructure hit alignment and coloring, inverse hit list
- Chemical Terms filter
- Import / Export
- Export of hits
- Insert / Modify / Delete structures
- AJAX in JChem Webservices

Structure search features

See detailed information on structure search: www.chemaxon.com/conf/Structural_Search.ppt

- Wide range of query atoms
- Query properties
- R-group queries
- Full SMARTS support
- Coordination compounds
- Link nodes
- Pseudo atoms, lone pairs
- Relative stereo
- Reaction search features
- Hit coloring, position variation
- Polymers

Search options

Some selected structure search options:

- Stereo on/off
- Ignore charge/isotope/radical/valence/polymers, etc.
- Vague bond matching options
- Chemical Terms filter
- Tautomer search
- Inverse hit list
- Maximum search time / number of hits
- Combine with non-structure conditions
- Ordering of results
- etc.

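Performance (1)

Test environment: JChem Base 5.2.2, Intel Quad Q6600 2.4 GHz, 8 GB RAM; Oracle 10.2.0.3

Compound registration (elapsed time):

| Number of compounds | Duplicates not checked | Duplicates checked |
|---------------------|------------------------|--------------------|
| 10,000              | 21 s                   | 26 s               |
| 100,000             | 2 min 4 s              | 2 min 34 s         |
| 200,000             | 4 min 24 s             | 5 min 13 s         |

Substructure search in PubChem (19.5 million compounds). The query structures were shown as images on the slide and are not reproduced here:

| Query             | Number of hits | Search time |
|-------------------|----------------|-------------|
| (structure image) | 2              | 0.91 s      |
| (structure image) | 93             | 0.98 s      |
| (structure image) | 6,001          | 1.30 s      |
| (structure image) | 146,256        | 5.66 s      |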
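The registration timings above can be turned into throughput figures with a few lines of arithmetic. The sketch below only restates the numbers from the table; the function names are illustrative and not part of any ChemAxon API.

```python
# Registration timings copied from the table above: (compounds, elapsed seconds).
NOT_CHECKED = [(10_000, 21), (100_000, 2 * 60 + 4), (200_000, 4 * 60 + 24)]
CHECKED = [(10_000, 26), (100_000, 2 * 60 + 34), (200_000, 5 * 60 + 13)]


def throughput(rows):
    """Compounds registered per second for each (count, seconds) row."""
    return [count / seconds for count, seconds in rows]


def overhead_percent(plain, checked):
    """Extra elapsed time, in percent, when duplicate checking is enabled."""
    return [100.0 * (c[1] - p[1]) / p[1] for p, c in zip(plain, checked)]


if __name__ == "__main__":
    rates_plain = throughput(NOT_CHECKED)
    rates_checked = throughput(CHECKED)
    extra = overhead_percent(NOT_CHECKED, CHECKED)
    for (count, _), fast, slow, pct in zip(NOT_CHECKED, rates_plain, rates_checked, extra):
        print(f"{count:>7} compounds: {fast:4.0f}/s unchecked, "
              f"{slow:4.0f}/s checked, +{pct:.0f}% time")
```

On these figures, duplicate checking costs roughly 19 to 24 percent more elapsed time, and throughput stays in the range of several hundred compounds per second.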


Performance (2)

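Similarity search: Tanimoto > 0.9

Test environment: JChem Base 5.2.2, Intel Quad Q6600 2.4 GHz, 8 GB RAM; Oracle 10.2.0.

The query structures were shown as images on the slide and are not reproduced here:

| Query             | Number of hits | Search time |
|-------------------|----------------|-------------|
| (structure image) | 0              | 3.39 s      |
| (structure image) | 0              | 3.82 s      |
| (structure image) | 0              | 3.33 s      |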
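For readers unfamiliar with the Tanimoto threshold used above, here is a minimal sketch of the coefficient on bit fingerprints: the number of bits set in both fingerprints divided by the number of bits set in either. The fingerprints are toy integers, and the function names and the strict "greater than" comparison are assumptions for illustration; the slides do not describe how JChem Base computes or stores its descriptors.

```python
def tanimoto(fp_a: int, fp_b: int) -> float:
    """Tanimoto coefficient of two bit fingerprints stored as Python ints."""
    union = bin(fp_a | fp_b).count("1")
    if union == 0:
        return 1.0  # convention chosen here: two empty fingerprints count as identical
    return bin(fp_a & fp_b).count("1") / union


def similar(query_fp, table, threshold=0.9):
    """Return (id, score) pairs scoring strictly above the threshold, best first."""
    scored = ((cd_id, tanimoto(query_fp, fp)) for cd_id, fp in table.items())
    return sorted((hit for hit in scored if hit[1] > threshold),
                  key=lambda hit: -hit[1])


# Toy fingerprints keyed by a hypothetical row id.
TABLE = {
    1: 0b1111111111,   # identical to the query: score 1.0
    2: 0b11111111111,  # one extra bit: 10/11
    3: 0b0111111111,   # one missing bit: 9/10, not above 0.9
    4: 0b1,            # almost nothing in common: 1/10
}

if __name__ == "__main__":
    print(similar(0b1111111111, TABLE))
```

A threshold of 0.9 is strict: a single differing bit out of ten is already enough to drop a candidate, which is consistent with very small hit lists for such searches.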

Markush structures

- Markush structure registration and search
- Markush features:
  - R-groups
  - Atom lists, bond lists
  - Position variation bond
  - Link nodes and repeating units
  - Homology variation (alkyl, aryl, etc.)
- Compatible Markush enumeration plugin


Administration with JChemManager

User interface for

- creating tables
- import
- export
- deleting rows
- dropping tables

Most functions are also available from the command line.

Standardization


Default standardization includes:

- Hydrogen removal
- Aromatization

Custom standardization can be specified for each table by supplying an XML configuration file at table creation or in the “Table Options” dialog of JChem Manager (jcman).

Custom Standardization Example

[Figure: a structure before and after custom standardization; images not reproduced here.]

Standardizer: http://www.chemaxon.com/conf/Standardizer.ppt

The property table

The property table stores information about JChem structure tables, including:

- Fingerprint parameters
- Custom standardization rules
- Other table options and information

More than one property table can be used; each property table represents a particular JChem environment.

Table types

Table types control the allowed chemical structures and the available operations:

- Molecule
- Reaction
- Markush
- Query
- Any structure

The structure of JChem tables

| Column name                      | Explanation |
|----------------------------------|-------------|
| cd_id                            | unique numeric identifier in the table |
| cd_structure                     | the imported structure in the original format, without modifications (except for the removal of data fields) |
| cd_smiles; cd_smarts; cd_markush | the standardized structure, in a format that depends on the table type; used by the search process |
| cd_formula                       | the formula of the standardized structure |
| cd_sortable_formula              | formula representation for alphanumerical sorting |
| cd_molweight                     | the molecular weight of the standardized structure |
| cd_hash; cd_flags; cd_fp…        | fields used internally for structure searching |
| cd_timestamp                     | the date and time of the insertion of the row |
| [user fields]                    | custom data fields can be added by the user |
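To make the column layout concrete, the following sketch builds a small stand-in table in SQLite with a subset of the columns listed above plus one user field. It is an illustration only: the SQL types, the table name `structures`, the `supplier` field and the zero-padded encoding in `sortable_formula` are all assumptions, the internal columns (cd_hash, cd_flags, cd_fp…) are left out, and real JChem tables are created by JChem itself rather than by hand.

```python
import re
import sqlite3


def sortable_formula(formula, width=5):
    """Zero-pad element counts so that string order follows numeric order.

    Hypothetical encoding for illustration; the actual format is not given.
    """
    parts = re.findall(r"([A-Z][a-z]?)(\d*)", formula)
    return " ".join(f"{element}{int(count or 1):0{width}d}" for element, count in parts)


def build_demo_table():
    """Create an in-memory table mimicking part of the documented layout."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE structures ("
        " cd_id INTEGER PRIMARY KEY,"
        " cd_structure TEXT,"
        " cd_smiles TEXT,"
        " cd_formula TEXT,"
        " cd_sortable_formula TEXT,"
        " cd_molweight REAL,"
        " cd_timestamp TEXT DEFAULT CURRENT_TIMESTAMP,"
        " supplier TEXT)"  # 'supplier' stands in for a user-defined field
    )
    rows = [
        ("CCO", "C2H6O", 46.07, "vendor A"),           # ethanol
        ("c1ccc2ccccc2c1", "C10H8", 128.17, "vendor B"),  # naphthalene
        ("c1ccccc1", "C6H6", 78.11, "vendor A"),        # benzene
    ]
    for smiles, formula, weight, supplier in rows:
        conn.execute(
            "INSERT INTO structures (cd_structure, cd_smiles, cd_formula,"
            " cd_sortable_formula, cd_molweight, supplier)"
            " VALUES (?, ?, ?, ?, ?, ?)",
            (smiles, smiles, formula, sortable_formula(formula), weight, supplier),
        )
    return conn


if __name__ == "__main__":
    conn = build_demo_table()
    # A non-structure condition combined with ordering by the sortable formula.
    query = ("SELECT cd_id, cd_formula FROM structures"
             " WHERE cd_molweight < 100 ORDER BY cd_sortable_formula")
    print(conn.execute(query).fetchall())  # [(1, 'C2H6O'), (3, 'C6H6')]
```

The padded column exists because plain string comparison would place C10H8 before C2H6O; with padded counts, C2 sorts before C10 as expected.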

Structural search in database

A two-stage method provides optimal performance:

1. Rapid pre-screening reduces the number of possible hit candidates
   - Chemical Hashed Fingerprints are used for substructure and superstructure searches
   - Hash code is used for duplicate filtering (usually during compound registration)
2. Graph search algorithm is used to determine the final hit list
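The two stages can be illustrated with a toy model. In the sketch below, molecules are tiny labeled graphs, the "fingerprint" hashes only atom symbols and bonded atom pairs into 64 bits, and the second stage is a brute-force subgraph match over permutations. These are simplifications chosen for brevity and stand in for, rather than reproduce, ChemAxon's Chemical Hashed Fingerprints and graph search algorithm; all names are hypothetical.

```python
import zlib
from itertools import permutations

FP_BITS = 64  # toy fingerprint length


def fingerprint(atoms, bonds):
    """Hash atom symbols and bonded symbol pairs into a bit set."""
    features = list(atoms)
    features += ["-".join(sorted((atoms[i], atoms[j]))) for i, j in bonds]
    fp = 0
    for feature in features:
        fp |= 1 << (zlib.crc32(feature.encode()) % FP_BITS)
    return fp


def passes_screen(query_fp, target_fp):
    """Stage 1: every bit of the query must also be set in the target."""
    return (query_fp & target_fp) == query_fp


def is_substructure(q_atoms, q_bonds, t_atoms, t_bonds):
    """Stage 2: exhaustive atom mapping (only viable for tiny graphs)."""
    t_bond_set = {frozenset(bond) for bond in t_bonds}
    for mapping in permutations(range(len(t_atoms)), len(q_atoms)):
        labels_ok = all(q_atoms[i] == t_atoms[m] for i, m in enumerate(mapping))
        if labels_ok and all(frozenset((mapping[i], mapping[j])) in t_bond_set
                             for i, j in q_bonds):
            return True
    return False


def search(query, table):
    """Return (screened candidate ids, verified hit ids)."""
    q_atoms, q_bonds = query
    q_fp = fingerprint(q_atoms, q_bonds)
    candidates = [cd_id for cd_id, (atoms, bonds) in table.items()
                  if passes_screen(q_fp, fingerprint(atoms, bonds))]
    hits = [cd_id for cd_id in candidates
            if is_substructure(q_atoms, q_bonds, *table[cd_id])]
    return candidates, hits


# Heavy-atom graphs: (atom symbols, bonds as index pairs).
TABLE = {
    1: (["C", "C", "O"], [(0, 1), (1, 2)]),       # ethanol
    2: (["C", "O", "C"], [(0, 1), (1, 2)]),       # dimethyl ether
    3: (["C", "C", "C"], [(0, 1), (1, 2)]),       # propane
    4: (["C", "C", "C", "O"], [(0, 1), (2, 3)]),  # ethane + methanol, disconnected
}
QUERY = (["C", "C", "O"], [(0, 1), (1, 2)])       # C-C-O chain

if __name__ == "__main__":
    candidates, hits = search(QUERY, TABLE)
    print("candidates after screening:", candidates)
    print("verified hits:", hits)
```

Row 4 shows why the second stage is needed: it contains every feature of the query (C, O, a C-C bond and a C-O bond), so it passes the screen, yet no C-C-O chain exists in it and the graph match rejects it. Screening can produce false positives but never loses a true hit, because a genuine substructure contributes only features the target also has.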

Structure Cache


- Contains Fingerprints for screening and ChemAxon Extended SMILES for ABAS (atom-by-atom search)
- Instant access to the structures for the search process
- Reduced load on the database server
- Incremental update ensures minimum overhead after changes in the table
- Small memory footprint due to
  - SMILES compression
  - Optimized storage technique
- Approximately 100 MB memory needed for 1 million typical drug-like structures (using the default 512-bit fingerprints)
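The memory figure can be sanity-checked with simple arithmetic. The sketch below assumes that "approximately 100 MB" means 10^8 bytes and that the cache holds nothing but one fingerprint and one compressed SMILES string per structure; both are simplifying assumptions, and the resulting split is an inference, not a figure from ChemAxon.

```python
FINGERPRINT_BITS = 512      # default fingerprint length quoted above
STRUCTURES = 1_000_000
TOTAL_BYTES = 100 * 10**6   # "approximately 100 MB", read here as 10^8 bytes (assumption)


def cache_budget(structures=STRUCTURES, fp_bits=FINGERPRINT_BITS, total=TOTAL_BYTES):
    """Return (bytes per fingerprint, total fingerprint bytes, bytes left per structure)."""
    fp_bytes = fp_bits // 8
    fp_total = fp_bytes * structures
    remaining_per_structure = (total - fp_total) / structures
    return fp_bytes, fp_total, remaining_per_structure


if __name__ == "__main__":
    fp_bytes, fp_total, rest = cache_budget()
    print(f"{fp_bytes} bytes per fingerprint, {fp_total / 10**6:.0f} MB for all fingerprints")
    print(f"about {rest:.0f} bytes per structure left for compressed SMILES and overhead")
```

Under these assumptions the fingerprints alone account for 64 MB, leaving on the order of 36 bytes per structure for everything else, which suggests why SMILES compression matters for the quoted footprint.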

Future plans


- Graphical user interface for R-group decomposition
- Arbitrary table structure (Java and .NET API for JChem index)
- Maximum common substructure search type
- Additional layer: JChem Server (later also as grid)
- Compound registration system API


Summary

ChemAxon’s JChem Base API provides sophisticated, high-performance tools for the developer to deal with chemical structures and associated data.

Building on the JChem API is convenient, because:

- Our various tools integrate seamlessly
- Both high and low level API classes are available
- Responsive developer-to-developer support

Links


- JChem home page: http://www.chemaxon.com/products/jchem-base
- Online tryout: http://www.chemaxon.com/jchem/examples.html
- API documentation: http://www.chemaxon.com/jchem/doc/api/index.html
- Brochure: www.chemaxon.com/brochures/JChemBase.pdf

Visit other

technical presentations

- MarvinSketch/View: http://www.chemaxon.com/MarvinSketch_View.ppt
- MarvinSpace: http://www.chemaxon.com/MarvinSpace.ppt
- Calculator Plugins: http://www.chemaxon.com/Calculator_Plugins.ppt
- JChem Base: http://www.chemaxon.com/JChem_Base.ppt
- JChem Cartridge: http://www.chemaxon.com/JChem_Cartridge.ppt
- Standardizer: http://www.chemaxon.com/Standardizer.ppt
- Screen: http://www.chemaxon.com/Screen.ppt
- JKlustor: http://www.chemaxon.com/JKlustor.ppt
- Fragmenter: http://www.chemaxon.com/Fragmenter.ppt
- Reactor: http://www.chemaxon.com/Reactor.ppt