Chemical Database Management with JChem Base and Cartridge
Szabolcs Csepregi
Solutions for Cheminformatics
Outline
• ChemAxon chemical database products
• Architecture
• Features
• Example interfaces: JSP, ASP examples
• Integration with other ChemAxon tools
• The upcoming Registration System API
• What is coming in JCB/Cartridge 5.1
Chemical database products
• JChem Base
– A library for adding chemical structures to relational database systems. Available in Java, JSP and .NET.
– An open-source example web application is available.
• JChem Cartridge for Oracle
– Extends Oracle SQL with chemical operators and an index.
– SQL interface for ChemAxon functionality.
• Instant JChem
– An all-in-one desktop chemical database application.
JChem Base application architectures
Web application
[Diagram. Client: web browser. Server: custom servlet or JSP scripts, JChem class library, JDBC driver. Back end: relational database (Oracle, MySQL, MS SQL Server, DB2, etc.). Client and server are connected over the Internet / intranet. Labeled flows: query structure, SQL, hits, structures + data.]
JChem Base application architectures
Rich client application
[Diagram. Client: rich client application with the JChem class library. Server: relational database (Oracle, MySQL, MS SQL Server, DB2, etc.), accessed through the JDBC driver. Client and server are connected over the Internet / intranet. Labeled flows: query structure, SQL, hits.]
JChem Cartridge architecture
The JChem computation engine can run on a dedicated server to balance the workload.
[Diagram. Client: client application / application server, connected over the Internet / intranet. Server: Oracle with JChem Cartridge (PL/SQL, Java stored procedures); JChem Server with JChem Cartridge Adapter, JChem Base, JChem core (search, update) and caches. Labeled connections: SQL, RMI, JDBC.]
Compatibility and integration
Supported chemical file formats:
• SMILES
• MDL MOL/RXN/SDF/RDF (v2000 and v3000)
• CML, MRV
• etc.
Database engines:
• Oracle, MySQL, MS SQL Server, MS Access, PostgreSQL, IBM DB2, Derby, etc.
All operating systems through:
• Java API (JChem Base)
• .NET API (JChem Base + JNBridge) – for Windows
• SQL (Cartridge)
Structure searching: features
• Search types: substructure, similarity, exact, exact fragment, etc.
• Wide range of query atoms
• Query properties
• R-group queries
• Full SMARTS support
• Coordination compounds
• Link nodes
• Pseudo atoms, Lone pairs
• Relative stereo
• Reaction search features
• Hit coloring ...
www.chemaxon.com/conf/Structural_Search.ppt
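For the similarity search type listed above, a common way to score two structures is the Tanimoto coefficient over binary fingerprints. The Python sketch below is a generic illustration only, not the JChem API: the fingerprints are made-up sets of "on" bit positions, and the function names and the 0.5 threshold are assumptions chosen for the example.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bit positions."""
    if not fp_a and not fp_b:
        return 1.0  # convention chosen here: two empty fingerprints count as identical
    common = len(fp_a & fp_b)
    return common / (len(fp_a) + len(fp_b) - common)


def similarity_search(query_fp, table, threshold=0.5):
    """Return (id, score) pairs with score >= threshold, most similar first."""
    scored = [(cid, tanimoto(query_fp, fp)) for cid, fp in table.items()]
    hits = [(cid, score) for cid, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical fingerprints; real ones would be derived from the structures.
TABLE = {
    1: {1, 4, 7, 9},
    2: {1, 4, 7},
    3: {2, 3, 5},
}
QUERY = {1, 4, 7, 9}

for cid, score in similarity_search(QUERY, TABLE):
    print(cid, round(score, 2))
```

Compound 3 shares no bits with the query, so it falls below the threshold and is not reported.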
Structure searching: options
Some of the structure search options:
– Chemical Terms filter constraint
– Tautomer search
– Stereo on/off
– Ignore charge/isotope/radical/valence/mixture brackets
– Vague bond matching modes: "or aromatic"; ignore bond types
– Inverse hit list
– Maximum search time / number of hits
– SQL SELECT statement for pre-filtering
– Ordering of results
– etc.
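How a few of these options interact can be sketched around an arbitrary matching function. The Python below is a hedged illustration, not JChem code: `run_search`, its parameters and the sample rows are all hypothetical, and the matcher is a toy text test standing in for a real structure comparison (it is not chemically meaningful in general). The pre-filter is modeled as a set of row IDs, as if it had been produced by a separate SQL SELECT.

```python
import time


def run_search(rows, matches, prefilter_ids=None, inverse=False,
               max_hits=None, max_seconds=None):
    """Scan (id, structure) rows and return the IDs of the hits.

    matches: callable(structure) -> bool, the actual structure comparison.
    """
    start = time.monotonic()
    hits = []
    for cid, structure in rows:
        if max_seconds is not None and time.monotonic() - start > max_seconds:
            break  # time limit reached: return the hits found so far
        if prefilter_ids is not None and cid not in prefilter_ids:
            continue  # row excluded by the pre-filter
        if bool(matches(structure)) != inverse:  # inverse=True keeps non-matching rows
            hits.append(cid)
            if max_hits is not None and len(hits) >= max_hits:
                break  # hit limit reached
    return hits


# Hypothetical table rows (SMILES strings) and a toy matcher.
ROWS = [(1, "CCO"), (2, "CCN"), (3, "c1ccccc1N"), (4, "CC(=O)O")]


def has_n(smiles):
    return "N" in smiles  # stand-in for a real substructure match


print(run_search(ROWS, has_n))                        # plain hit list
print(run_search(ROWS, has_n, inverse=True))          # inverse hit list
print(run_search(ROWS, has_n, prefilter_ids={1, 2}))  # pre-filtered rows only
print(run_search(ROWS, has_n, max_hits=1))            # stop after the first hit
```

Checking the limits inside the scan loop means a capped search returns partial results instead of running to completion.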
Structure search: performance
Test environment: JChem Base 5.0, Athlon X2 2.6GHz, 4GB RAM; Oracle 9.2.0.8.0
Compound registration (elapsed time):
Number of compounds    Duplicates not checked    Duplicates checked
10,000                 22 s                      35 s
100,000                2 min 33 s                4 min 16 s
200,000                4 min 53 s                8 min 19 s
Substructure search in a table of 3 million compounds (the query structures were shown as drawings):
Number of hits    Search time
12                0.219 s
936               0.375 s
4,608             0.734 s
65,208            5.594 s
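The registration timings can be turned into throughput figures with a little arithmetic. The Python sketch below only re-expresses the numbers from the registration table; the variable and function names are made up for the example.

```python
# (compounds, seconds without duplicate check, seconds with duplicate check)
REGISTRATION = [
    (10_000, 22, 35),
    (100_000, 2 * 60 + 33, 4 * 60 + 16),
    (200_000, 4 * 60 + 53, 8 * 60 + 19),
]


def throughput(compounds, seconds):
    """Compounds registered per second."""
    return compounds / seconds


def duplicate_check_overhead(plain_seconds, checked_seconds):
    """How many times longer registration takes with duplicate checking."""
    return checked_seconds / plain_seconds


for n, plain, checked in REGISTRATION:
    print(f"{n:>7,} compounds: "
          f"{throughput(n, plain):.0f}/s unchecked, "
          f"{throughput(n, checked):.0f}/s checked, "
          f"overhead x{duplicate_check_overhead(plain, checked):.2f}")
```

On these figures, duplicate checking makes registration roughly 1.6 to 1.7 times slower.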
Table types
Control the allowed chemical structures and available operations:
• Molecule
• Reaction
• Combinatorial Markush
• Query
• Any structure
Example interfaces: JSP, ASP
• Example web applications: open-source JSP and ASP examples
– Marvin applets are used for query drawing and structure visualization
• Demo
Integration
• Integration with other ChemAxon tools:
– Custom, uniform chemical representation (Standardizer; see the separate presentation today)
– Automatically calculated properties through Chemical Terms calculated columns (Calculator plugins)
– Additional similarity calculations (Screen; JChem Base only)
– Tautomer handling:
  • Tautomer search
  • Tautomer duplicate filter table/index option
  • Custom tautomer transforms or canonical tautomer using Standardizer
– Query drawing and structure visualization (Marvin)
Provides the most consistent interface and back end.
Integration
Additional Cartridge functionality:
– JChem index (for non-JChem tables)
– Communication with the Oracle optimizer
– Reaction-based enumeration (Reactor)
– Format conversions, including image generation
– Markush enumeration (Calculator plugins)
– Property predictions through Chemical Terms (Calculator plugins)
Registration system
• A new registration system component will be introduced in summer 2008 (API only)
• Main features:
– Customizable business logic
  • Multilevel duplication control
  • Customizable corporate registration ID
  • Handling of salts, batches, lots, samples, and mixtures
– Identification, splitting and registration of salt and solvent structures
– Storage of input structures in original format
– Mock registration (dry run)
– Pre-registration through a transitory area
– Basic, customizable implementation examples
  • Separate examples for chemists and registrars
• Web and Instant JChem interfaces will follow later
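Three of the listed features (duplicate control, a customizable corporate ID, and mock registration) can be pictured with a toy in-memory registry. This Python sketch is purely illustrative and is not the ChemAxon registration API: the `Registry` class, the `CMP-` ID pattern and the whitespace-stripping "normalization" are assumptions, and the duplicate check is single-level rather than multilevel.

```python
class Registry:
    """Toy in-memory registry with duplicate control, custom IDs and dry runs."""

    def __init__(self, id_format="CMP-{:06d}", normalize=str.strip):
        self.id_format = id_format  # customizable corporate ID pattern
        self.normalize = normalize  # stand-in for real structure standardization
        self._by_key = {}           # normalized structure -> registration ID
        self._next_number = 1

    def register(self, structure, dry_run=False):
        """Return (registration_id, status).

        status is 'duplicate', 'new' or 'would register' (dry run).
        """
        key = self.normalize(structure)
        if key in self._by_key:
            return self._by_key[key], "duplicate"
        new_id = self.id_format.format(self._next_number)
        if dry_run:
            return new_id, "would register"  # mock registration: nothing is stored
        self._by_key[key] = new_id
        self._next_number += 1
        return new_id, "new"


registry = Registry()
print(registry.register("CCO", dry_run=True))  # nothing stored yet
print(registry.register("CCO"))                # stored under the first ID
print(registry.register(" CCO "))              # same key after normalization
```

Keeping the ID pattern and the normalization function as constructor arguments is one simple way to make the business logic replaceable without touching the registration flow.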
What is coming in JChem 5.1
Structure searching:
– Position variation in Markush structures and queries
– Diastereomer search option (same tetrahedral stereo centers, but possibly different configurations)
– Check sp-hybridization search option (substructure)
Cartridge installer GUI
What is coming in JChem 5.1.X
(In a few months)
• Web Services interface for JChem Base
• Compound registration system API
Under development
• Further improvements of Markush handling (towards patents)
• Flexible 3D pharmacophore searching
• Integration of further ChemAxon functionality in the Cartridge:
– R-group decomposition
– Custom descriptors & similarity measures
• Include JDBC drivers in installer
• JChem for Excel
Summary
• JChem Base, JChem Cartridge and Instant JChem offer comprehensive and efficient chemical database solutions.
• They are integrated with many other ChemAxon products and are accessible from various interfaces.
• Registration system, JChem for Excel and patent Markush handling are coming.
Thank you for your attention!
For more information please visit www.chemaxon.com