Post on 07-Aug-2020
Zhenbin (Benjamin) Li, Ph.D. Research Data Integration & Logistics Service
Chemistry Infrastructure Migration in a Global Pharmaceutical Company: Concerns and Reality
Chemistry Infrastructure
• Chemistry Infrastructure: Computer systems, applications, or software that store, search, manipulate, calculate, and visualize chemical or biological entities and their properties. Chemistry infrastructure is indispensable computer support for the drug discovery and development processes of the pharmaceutical industry.
• Examples of Chemistry Infrastructure: Chemistry cartridge, chemistry drawing tools, structure standardization, chemical reaction and molecule visualization, etc.
• Vendors of Chemistry Cartridges: MDL Direct (Accelrys), Accord (Accelrys), JChem (ChemAxon), ICCartridge (InfoChem), Daylight (Daylight), Bingo (GGA), etc.
Zhenbin Li, Boehringer Ingelheim, ChemAxon UGM 2012 2
History of Chemistry Infrastructure
Timeline, 1950 to 2012:
•1955 CAS Laid ground work for computer-based chemical information database
•1957 Ray and Kirsch: substructure searching algorithm (atom-by-atom matching), later modified by Sussenguth (1965)
•1959 Opler and Baird: first graphical display of chemical structure
•1965 Gluck, Morgan: chemical storage and search system (Du Pont); canonical form of connection (bond-by-bond), later modified by Morgan to become the Gluck-Morgan algorithm
•1967 Armitage and Lynch, Structure similarity
•1970 Crowe et al. fragment-based screening
•1971 Hamilton, established the protein data bank (PDB) at Brookhaven National Lab
•1971 Gund et al. 3D structure searching
•1972 Wipke et al. 3D model from 2D drawing with stereochemistry
•1977 Mason, Peacock, Wipke, Molecular Design Limited, First database MACCS
•1979 Chevron Chemical Company, first company to license MDL
•1981 Lynch et al. Markush structures, 2 Patent databases Markush DARC (Derwent) and MARPAT (CAS)
•1985 First commercial sale of Rubenstein’s ChemDraw to Stu Schreiber and Yale Univ.
•1986 ChemDraw 1.0 was released
•1987 Dolata et al. 2D-3D converter and Hiller and Gasteiger CORINA
•1987 MDL listed on NASDAQ
•1988 Weininger, SMILES notation
•1988 Downs et al. Parallel computing system (Transputer)
•1989 Gasteiger and Weiske ChemInform, ChemoData, InfoChem reaction database, digitalized Beilstein Handbook
•1991 MDL ISIS Client/server application
•1992 Dalby et al. introduced MOLfile, SDfile, RDfile, RXNfile, CTAB (V2000), the de facto standard
•1996 MDL introduced V3000 format
•1997 MDL acquired by Reed Elsevier
•1998 ChemAxon formed
•2003 Elsevier MDL introduced XDfile
•2004 Launch and adoption of ChemAxon's JChem Cartridge for Oracle to medium sized CRO
•2005 Neurogen completely migrated chemistry infrastructure from MDL to ChemAxon
•2007 Symyx acquired Elsevier MDL
•2010 Accelrys merged with Symyx
•2012 ChemAxon JChem Cartridge globally licensed to 5 of the top 10 pharma
Sources: W. L. Chen, J. Chem. Inf. Model., 2006, and personal communications
When to Consider Chemistry Infrastructure Migration
Common Considerations of Chemistry Infrastructure Migration
• Certain legacy systems will not run in a new environment (hardware and operating systems), or require tremendous effort (or cost) to upgrade
• The current chemistry infrastructure technology lags behind the industry trend
• The current chemistry infrastructure cannot meet the increasing demand of in-house software development
• Dissatisfaction from technology and business demand for support and consulting
• Long-term financial gain
Challenges in BI
• Historically, systems were built on the ISIS platform and needed to be migrated away from it
• Isentris-based alternative solutions did not offer performance advantages and were therefore temporarily shelved
• In-house chemistry systems demand robust APIs to integrate and manage global workflows
Chemistry Infrastructure Selection Criteria
• Reliability
• Extensibility
• User friendliness
• Clear path for migration
• Consulting
• Support
• Expertise
• Customization
• New upgrade according to customers’ needs
• License model
• Negotiation power
• Short-term cost cut
• Long-term financial gain
• Company stability
• Size
• Culture/work ethics
• Familiarity with global pharmaceutical industry
ChemAxon as Chemistry Infrastructure
JChemBase (Java and .NET)
JChem Cartridge
Marvin/MarvinSketch
InstantJChem
Things to Consider for Chemistry Infrastructure Migration
Timeline
On-going business demand
Re-training of developers
Re-training of end-users
Legacy systems
System interdependency
Production interruption
User acceptance
Financial commitment
Resource
Expertise
Global alignment
Reliability and flexibility of the new chemistry infrastructure
Roadmap of Chemistry Infrastructure Migration
Business intention
Market options
Preliminary evaluation
Business approval
In-depth evaluation
Pilot implementation
Negotiation
Financial commitment
Acquire licenses
Consulting
In-depth system analysis
Implementation planning
Data migration
System migration
Testing
Completion
External Request Management System (ERMS) as a Pilot Project
Diagram: ERMS workflow between BI chemists and CRO chemists
• ERMS: DB and UI
• Shipping via FedEx/DHL
• Shipping sheet management, local / international: initials, amount, request date, # of steps, difficulty, ordering dates, completion dates, quantity shipped, customs issues, shipment contents
• Inventory system: transfer compound info into inventory automatically
• Logistics: calculation and reporting
ERMS as a Pilot Project
Diagram: ERMS architecture
• User Login: Authentication and Authorization
• ERMS DB
• Modules: Request Management; Logistics Calculations and Reporting; Structure Searching; Shipping and Status
• Databases: Reagent DB, Compound DB, Commercial DB, E-Notebook DB
• Access levels (legend): BI internal only; accessible by both BI and CROs; accessible by BI, but partially accessible by CROs
Reaction Scheme
* http://en.wikipedia.org/wiki/Nicolaou_Taxol_total_synthesis
Aldol condensation of acetone and ethyl acetoacetate gave β-keto-ester 3. A Grignard reaction involving
methylmagnesium bromide provided alcohol 4, which was subjected to acid catalyzed elimination to give diene 5.
Reduction and acylation gave diene 7 (Scheme 3, compound 1).
Reaction Scheme: Iteration of Compounds
• JChem allows parsing the reaction scheme into individual compounds.
• This works only when a regular arrow, rather than a reaction arrow, is used in the scheme.
• A Mol file containing a mixture of all the compounds in the reaction scheme can be separated using the getFragments() method.
• However, the order of the compounds is not necessarily consistent with the reaction scheme.
• Ideally, the developer should have some control over the order, or the ordering behavior should at least be predictable.
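One way to make the compound order predictable is to sort the separated fragments by their position in the drawing before iterating over them. The sketch below is plain Python and does not call the JChem API: the `Fragment` record, its `atoms` coordinate list, the `row_height` parameter, and the left-to-right, top-to-bottom reading order are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Fragment:
    """Hypothetical stand-in for one compound separated from a scheme."""
    name: str
    atoms: List[Tuple[float, float]]  # (x, y) drawing coordinates per atom

    def centroid(self) -> Tuple[float, float]:
        xs = [x for x, _ in self.atoms]
        ys = [y for _, y in self.atoms]
        return (sum(xs) / len(xs), sum(ys) / len(ys))


def reading_order(fragments: List[Fragment], row_height: float = 5.0) -> List[Fragment]:
    """Sort fragments left to right within a row, rows from top to bottom.

    Rows are found by bucketing the centroid's y value. y is assumed to grow
    upward, so a larger y means higher on the page.
    """
    def key(frag: Fragment) -> Tuple[int, float]:
        cx, cy = frag.centroid()
        row = -round(cy / row_height)  # negate so higher rows sort first
        return (row, cx)

    return sorted(fragments, key=key)


# Fragments arrive in arbitrary order, as they might after separation.
scheme = [
    Fragment("5", [(21.0, 0.2), (23.0, -0.1)]),
    Fragment("3", [(1.0, 0.0), (3.0, 0.4)]),
    Fragment("7", [(2.0, -10.0), (4.0, -9.6)]),
    Fragment("4", [(11.0, 0.1), (13.0, 0.3)]),
]
ordered = [f.name for f in reading_order(scheme)]
```

With real data the coordinates would come from the separated molecules, and `row_height` would need tuning to the drawing scale.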
Steps of JChem Implementation
• JChem Oracle Cartridge Installation
• Data migration using JChem Manager or plain SQL statements
• Create domain indices on the structures if data are created via SQL.
• A ChemAxon domain index can coexist with an MDL Direct index on the same database instance. This allows us to plan the data migration better, with low impact on current production systems
• Rebuild the relationships in the database
• Change application code to implement ChemAxon technology
• Replace the user interfaces with ChemAxon user interfaces
• Testing and deployment
Common Cartridge Functions
•Insertion
jchem_table_pkg.jc_insert('C1CCCCC1', 'JCHEM_Structure', null,null, null, null);
•Update
jchem_table_pkg.jc_update('c1ccccc1', 'JCHEM_STRUCTURE', cd_id, null);
•Deletion
jchem_table_pkg.jc_delete('JCHEM_STRUCTURE', 'where structure_id = 1001', null);
•Search Structure:
SELECT COUNT(*) FROM JCHEM_STRUCTURE WHERE JC_COMPARE(CD_STRUCTURE, 'C1CCCCC1', 'T:S') = 1;
Search types (t: option):
s: substructure search (default)
na: substructure search, fingerprint only
f: full structure search; query and target must have the same heavy-atom network to match
ff: full fragment search; query must fully match a target fragment
d: duplicate search
i: similarity search
u: superstructure search
http://www.chemaxon.com/jchem/doc/dev/cartridge/cartapi.html
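Application code usually has to assemble the option string that goes into the third argument of JC_COMPARE. The following is a minimal Python sketch of such a helper, using only the search-type codes listed above and the `simThreshold` option that appears in the ERMS queries. The function name, the readable search names, and the 0 to 1 range check on the threshold are assumptions, not part of the cartridge API.

```python
# Readable names (hypothetical) mapped to the t: codes listed on the slide.
SEARCH_TYPES = {
    "substructure": "s",
    "substructure_fingerprint_only": "na",
    "full": "f",
    "full_fragment": "ff",
    "duplicate": "d",
    "similarity": "i",
    "superstructure": "u",
}


def jc_compare_options(search, sim_threshold=None):
    """Build an option string such as 't:s' or 't:i simThreshold:0.8'."""
    code = SEARCH_TYPES[search]  # raises KeyError for an unknown search name
    options = f"t:{code}"
    if sim_threshold is not None:
        if code != "i":
            raise ValueError("simThreshold only applies to similarity search")
        # Assumption: the threshold is a fraction between 0 and 1.
        if not 0.0 <= sim_threshold <= 1.0:
            raise ValueError("sim_threshold must be between 0 and 1")
        options += f" simThreshold:{sim_threshold}"
    return options


substructure_opts = jc_compare_options("substructure")
similarity_opts = jc_compare_options("similarity", 0.8)
```

Keeping the codes in one table means a typo in a search type fails early in the application rather than inside the database call.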
ERMS Fully ChemAxon-enabled
ChemAxon:
select structure_id, jc_molconvertb(cd_structure, 'mol') as mole, cd_molweight as mw, cd_formula as formula, cd_smiles as smiles, ….. from jchem_structure where jc_compare(cd_structure, ?, 't:ff') = 1
select structure_id, jc_molconvertb(cd_structure, 'mol') as mole, cd_molweight as mw, cd_formula as formula, cd_smiles as smiles, ….. from jchem_structure where jc_compare(cd_structure, ?, 't:i simThreshold:?') = 1
select structure_id, jc_molconvertb(cd_structure, 'mol') as mole, cd_molweight as mw, cd_formula as formula, cd_smiles as smiles, ….. from jchem_structure where jc_compare(cd_structure, ?, 't:s') = 1
MDL:
select structure_id, molfile(molecule) as mole, molwt(molecule) as mw, molfmla(molecule) as formula, smiles, ….. from structure where flexmatch(molecule, ?, 'match=all')=1
select structure_id, molfile(molecule) as mole, chime(molecule) as chime, molwt(molecule) as mw, molfmla(molecule) as formula, smiles, ….. from structure where similar(molecule, ?, ?)=1
select structure_id, molfile(molecule) as mole, chime(molecule) as chime, molwt(molecule) as mw, molfmla(molecule) as formula, smiles, ….. from structure where sss(molecule, ?)=1
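The paired queries above differ mainly in their WHERE clause, so an application can isolate the vendor-specific part behind one function and switch backends without touching the rest of the code. The Python sketch below illustrates that idea; it is not the ERMS implementation. The function name and backend labels are hypothetical, the pairing of operators follows the order in which the queries are listed, and the second bind of `similar` is assumed to carry the threshold.

```python
def structure_filter(backend, search, query, threshold=None):
    """Return (where_clause, bind_values) for one structure search."""
    if search == "similarity":
        if threshold is None:
            raise ValueError("similarity search needs a threshold")
        threshold = float(threshold)  # keep non-numeric text out of the SQL
    if backend == "jchem":
        if search == "similarity":
            options = f"t:i simThreshold:{threshold}"
        else:
            options = {"full_fragment": "t:ff", "substructure": "t:s"}[search]
        return (f"jc_compare(cd_structure, ?, '{options}') = 1", [query])
    if backend == "mdl":
        if search == "similarity":
            # Assumption: the second bind variable is the threshold.
            return ("similar(molecule, ?, ?)=1", [query, threshold])
        clause = {
            "full_fragment": "flexmatch(molecule, ?, 'match=all')=1",
            "substructure": "sss(molecule, ?)=1",
        }[search]
        return (clause, [query])
    raise ValueError(f"unknown backend: {backend}")


jchem_where, jchem_binds = structure_filter("jchem", "similarity", "c1ccccc1", 0.8)
mdl_where, mdl_binds = structure_filter("mdl", "similarity", "c1ccccc1", 0.8)
```

The JChem variant writes the threshold into the option string while the MDL variant binds it, which is why the function returns the bind list together with the clause.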
Summary of the Pilot Project
• ChemAxon JChem allows us to capture and regenerate reaction schemes and to iterate over their compounds programmatically, which is difficult, if possible at all, to achieve with other chemistry software.
• We installed the JChem cartridge and implemented it successfully in ERMS v2.
• Switching from MDL to ChemAxon was very straightforward, suggesting that migrating the chemistry infrastructure from MDL to ChemAxon is highly feasible.
• The total time (60 hrs) needed to change completely from MDL-based to ChemAxon-based technology is very reasonable.
• This project established the benchmark for other similar applications.
• ChemAxon technology provides opportunities for BI in future system development and integration.
• The modular design of ERMS helped make the migration smooth.
System Interdependency Analysis
Migration Planning
Gantt chart: migration schedule for System A through System H, by quarter from Q3 2012 to Q2 2014
Migration planning should be based on the interdependency of systems and applications as well as
available resources.
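A dependency-driven migration order can be derived with a topological sort. The sketch below uses Python's standard `graphlib` module; the dependency map is invented for illustration, and it assumes that a system is migrated only after the systems it depends on. Systems that land in the same wave have no ordering constraint between them and can be scheduled according to available resources.

```python
from graphlib import TopologicalSorter

# Hypothetical interdependencies: each system maps to the systems it depends on.
depends_on = {
    "System B": {"System A"},
    "System C": {"System A"},
    "System D": {"System B", "System C"},
    "System E": set(),
}


def migration_waves(depends_on):
    """Group systems into waves; each wave depends only on earlier waves."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for a stable listing
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = migration_waves(depends_on)
```

A circular dependency surfaces as an error during `prepare()`, which is itself a useful finding of the interdependency analysis.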
The pilot project should provide a benchmark for estimating the time needed for each system. The
estimate should include the migration of the data, the application, and the interfaces.
Each additional system migration can serve as a further benchmark for the remaining systems, so that
the estimates can be refined.
The learning curve effect should be considered.
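One common way to put a number on a learning curve effect is Wright's model, in which the effort for the n-th repetition is the first effort multiplied by n raised to log2 of the learning rate. The presentation does not say which model, if any, was used; the sketch below applies Wright's model purely as an illustration, starting from the 60 hrs reported for the pilot, with a hypothetical learning rate of 0.8 and the simplifying assumption that all systems are of comparable size.

```python
import math


def estimated_hours(first_hours, n, learning_rate=0.8):
    """Wright's model: hours for the n-th migration (n starts at 1).

    learning_rate is the factor by which effort shrinks each time the
    number of completed migrations doubles (0.8 means 20% less).
    """
    if n < 1:
        raise ValueError("n must be 1 or greater")
    return first_hours * n ** math.log2(learning_rate)


# Pilot took 60 hrs; estimate the 1st, 2nd, and 4th migrations.
estimates = [round(estimated_hours(60, n), 1) for n in (1, 2, 4)]
```

In practice each estimate would also be scaled by the size of the system, and the learning rate refitted as each completed migration adds a data point.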
Conclusions
• This was never an easy decision, but we consider it an investment for the coming decades.
• Use a pilot project to prove the concept and address concerns, especially those from management.
• System interdependency analysis is an important step. It helps shape the migration plan regardless of which chemistry infrastructure vendor is chosen.
• The reality of chemistry infrastructure migration may not be as bad as originally feared.
• In the migration process, the data curation effort can be high.
• Strong user support, especially from the key stakeholders, is needed because there will be changes in user interfaces and even in applications.