ICIC 2014 What Can We Learn from Our Past, that Equips Us for the Future?

Post on 11-Jun-2015

This talk describes lessons learned from a 30-year career in Chemical Information Science, key influences and motivations, and signposts to the future for what we may expect. Work has changed from implicit knowledge of where to find information in books, through data stores, to the internet, and beyond to unimagined futures. The talk references the rise and fall of the UK pharmaceutical industry as a place to work, in line with changes to the provision of information to scientists. The nature of work is described in relation to the 2nd law of thermodynamics, which, with luck, offers hope for the future of chemical information for the next generation.

David Walsh, Grail Entropix

What can we learn from our past, that equips us for the future?

This talk describes lessons learned from a 30-year career in Chemical Information Science, key influences and motivations, and signposts to the future that we may expect. It describes how work has changed from implicit knowledge of where to find information in books, through data stores, to the internet, and possibly beyond. It references the rise and fall of the UK pharmaceutical industry as a place to work, in line with changes to the provision of information to scientists. The nature of work is described in relation to the 2nd law of thermodynamics, which, with luck, offers hope for the future of chemical information in the next generation.

Career Influences

My Family

1895 Born Cork City
1913 Joined the Irish Volunteer Force
1914 Joined the British Army
1917 Survived Passchendaele
1917 Joined British Royal Navy Dover Patrol
1918-1920 Worked in Zeebrugge, Belgium
1920-1940 Survived the Depression with legal and not so legal jobs
1940- Bombed out of Dover, worked in a paper mill

My Grandfather Demonstrated:

International approach

Pragmatism

Instinct for survival

“Fighting Irishman”

Hard work

Multiple skills

Found opportunities in the complex history of the 20th century

My Previous Lives

First job: archaeologist, aged 11 (£0.50/day)
Newcastle University, BSc Biochemistry (1974-1977) (worked in a paper mill like my father and grandfather)
Essex County Libraries (1977-1978)
City University London, MSc Information Science (1978-1979)
Wellcome Foundation, Dartford (1979-1987) (now rubble)
Royal Society of Chemistry, Nottingham (closed)
Royal Society of Chemistry, Cambridge
University of Manchester, MSc Bioinformatics (1993-1994)
Proteus Molecular Design (1994-1996) (now an Audi dealership)
Glaxo Wellcome, Stevenage (now GSK)
Chiroscience (became UCB, Granta Park, now Pfizer Neusentis)
Merck Sharp & Dohme, Harlow (awaiting demolition)
Pfizer, Sandwich, 2000-2011 (some buildings demolished)
Grail Entropix, Deal, 2011- (working from home, no threat of demolition, yet!)

New Scientist Magazine, 29th April 1976
Wow! You can make a living out of this!

Wellcome Dartford, 1906-2010

My first job 1979-1987

Where the UK Pharmaceutical Industry Becomes Urban Exploration

Royal Society of Chemistry University of Nottingham

Science Park Cambridge

Burlington House, London

Chemical Abstracts Service

CAS-Online

STN

Proteus Molecular Design

A revolutionary computer-aided chemical and biomolecular design biotech company
Computer-aided molecular design
Bioinformatics
Protein structure
Now an Audi dealership

Merck, Harlow

In the process of being converted to a housing estate

Pfizer, Sandwich 2000-2011

“Inside these empty halls, there are ghosts!”

PFE v AZN

No comment!

Grail Entropix

We try to convert randomness to order
We impose order from chaos
We are human machines (Jacques Monod)
We are entropically driven
Information is particularly subject to entropy
Without energy (= Hard Graft), these transformations cannot take place

“Thermodynamics is a funny subject. The first time you go through it, you don't understand it at all. The second time you go through it, you think you understand it, except for one or two small points. The third time you go through it, you know you don't understand it, but by that time you are so used to it, it doesn't bother you any more.”

A Semantic MediaWiki company covering pharmaceutical patents

Grail Entropix & Semantic MediaWiki

Conversion of XML

Lessons Learnt

Education is a lifelong component of a career

Few have the luxury of a job for life

Several of these companies no longer exist, physically or as businesses.

The world changes, radically, so you have to change with it

All experience is valuable

All experience is enjoyable (but perhaps, not at the time!)

Libraries – The Library of Babel

Jorge Luis Borges: a description of an infinity of combinations of words in books, in an almost infinite number of rooms, in a physical place that some call the Library

Where the number of books is 1.956 x 10^1834097
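The figure of 1.956 x 10^1834097 can be checked with a few lines of arithmetic. The sketch below assumes the book format given in Borges's story (25 orthographic symbols, 410 pages, 40 lines per page, about 80 characters per line); those parameters come from the story rather than from the slide, and the function name is only illustrative. Working in logarithms avoids building a number with nearly two million digits.

```python
import math

# Assumptions taken from Borges's story, not from the slide itself.
SYMBOLS = 25                       # orthographic symbols available
PAGES, LINES, CHARS = 410, 40, 80  # pages per book, lines per page, characters per line


def babel_book_count():
    """Return (mantissa, exponent) so that the count is mantissa x 10^exponent."""
    chars_per_book = PAGES * LINES * CHARS               # 1,312,000 characters
    log10_total = chars_per_book * math.log10(SYMBOLS)   # log10(25 ** 1,312,000)
    exponent = math.floor(log10_total)
    mantissa = 10 ** (log10_total - exponent)
    return mantissa, exponent


mantissa, exponent = babel_book_count()
print(f"{mantissa:.3f} x 10^{exponent}")
```

Every distinct book is one choice of symbol at each of the 1,312,000 character positions, so the count is 25 raised to that power.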

To Infinity & Beyond!

Rosetta Stone

Champollion: conversion of Greek to Egyptian Coptic allows understanding of the pictorial language of hieroglyphics

This metaphor also allows conversion of chemical nomenclature through notations (WLN, InChI, etc.) to the pictorial language of chemistry

Languages: Demotic, Coptic, English, Hieroglyphics

Chemical notations: SMILES, InChIs, IUPAC, Chemical Structure

DNA

A molecule designed to store, transfer and convert information

Remarkably faithful and energy efficient, but, eventually, prone to failure

Markush

Many Markush definitions described in patents and elsewhere can describe an almost infinite number of distinct chemical substances.

Work on Markush enumeration from ChemAxon has regularly identified 10^46 molecules, more than the number of atoms in the universe.
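A rough sense of why Markush structures "touch infinity" comes from simple counting: each variable position multiplies the number of possible substances. The sketch below is a naive upper bound under stated assumptions (independent R-groups, no symmetry or duplicate removal); it is not ChemAxon's enumeration method, and the R-group names and substituents are hypothetical.

```python
from math import prod


def count_enumerations(r_groups):
    """Naive count: the product of the number of alternatives at each position."""
    return prod(len(options) for options in r_groups.values())


# Hypothetical Markush structure: a fixed core with three variable positions.
example = {
    "R1": ["H", "CH3", "C2H5", "F", "Cl"],
    "R2": ["OH", "NH2", "OCH3"],
    "R3": ["phenyl", "pyridyl", "thienyl", "furyl"],
}
print(count_enumerations(example))  # 5 * 3 * 4 = 60

# Hypothetical broad claim: 23 positions with 100 alternatives each
# already gives 100 ** 23, which is 10 ** 46 combinations.
broad_claim = {f"R{i}": range(100) for i in range(1, 24)}
print(count_enumerations(broad_claim) == 10 ** 46)
```

The count grows exponentially with the number of variable positions, which is why exhaustive enumeration quickly becomes impractical and searching has to work on the generic structure itself.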

How we used to work

Books, monographs
Abstract journals
Data Dynamics teletype

Tips for searchers

Part seen, imagined part

Describe confidently what you know

Leave out what you cannot describe confidently

Tips for searchers

Vulcan mind meld

Imagine yourself in the mindset of the organiser of information

Tips for searchers

“Words don’t describe my meaning,
Notes cannot spell out the score,
Finding not keeping’s the best thing”
Bryan Ferry, Roxy Music, 2HB

Synonyms & spelling errors

Words really don’t describe meaning very well
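One common way of coping with synonyms and spelling errors is to map known names to a preferred term and fall back on fuzzy matching for anything unrecognised. The sketch below uses Python's standard `difflib` module; the small synonym table, the `normalise` function and the 0.8 cutoff are illustrative assumptions, not a description of any particular search system.

```python
import difflib

# Illustrative synonym table: every known name maps to one preferred term.
SYNONYMS = {
    "paracetamol": "paracetamol",
    "acetaminophen": "paracetamol",
    "aspirin": "aspirin",
    "acetylsalicylic acid": "aspirin",
}


def normalise(term, cutoff=0.8):
    """Return the preferred term for a query, tolerating small misspellings."""
    key = term.strip().lower()
    if key in SYNONYMS:
        return SYNONYMS[key]
    # Fall back to the closest known name, if any is similar enough.
    close = difflib.get_close_matches(key, list(SYNONYMS), n=1, cutoff=cutoff)
    return SYNONYMS[close[0]] if close else None


for query in ["Acetaminophen", "paracetemol", "asprin", "ibuprofen"]:
    print(query, "->", normalise(query))
```

A term that matches nothing returns `None` rather than a guess, in the spirit of leaving out what cannot be described confidently.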

Gamekeepers and Poachers

Information is owned, according to intellectual property law, or that ownership is transferred for the common good.

Whereas the opportunities may not be good for poachers, gamekeepers still thrive, and the landowners do as they always have done: make money.

How we now work

Semantic MediaWiki
MySQL, SPARQL, PHP, Python, triple stores
Name to structure
Ontology
Markush (touching infinity)
Text mining
Big Data
Free data and curated data; marrying unlike platforms
Creating structured data from multiple data sources

How we will work will become part of another undefined paradigm

Fee and Free

Gathering, analysing, storing and distributing information is entropically driven.
We expect to be able to derive payment for this work
We trade experience and effort for money
What if all information is available, free on the internet?
Somebody is paying for the conversion of data into information into knowledge
If information is free, do we have any redress against poor quality or omissions?

Lessons Learnt

Despite a period of managing, anticipating or resisting change, web-based tools and services are the only realistic option for information retrieval (at the moment).

Increasingly, there is a requirement for structured, semantic data sets

Everyone is an information scientist, except, perhaps, those that were information scientists?

What are the USPs for our profession?

Why should we exist?

USPs

What we may take for granted may be the keys to our utility
Our core skills are often unknown to us
We may find that we are good at seeking, assimilating, transforming data or information.
We translate information between languages and meta-languages.
We create order out of chaos
We bring to our work our collected experience and skill sets
Networking
Discernment
Inference
International outlook
Some of us may be good at computing.

Future Proofing

Cheminformatics
Bioinformatics
Open Source, new computing languages, Wiki
Visualisation
Big Data
Marketing, Sales
Consultancy
Networking
Collaboration and Community
Languages
“The singularity”

For the future

A widely adopted common data format for chemical information exchange

More work on Markush, please!

Structured data exchange for merging chemical data, eliminating duplication, and identifying novelty

Benchmarking quality for internet data sources.
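As a minimal sketch of what structured exchange makes possible, the example below merges two sets of compound records on a shared structure-derived key, removes duplicates, and reports which incoming records are new. It assumes each record carries an InChIKey; the record layout, field names and `merge_records` function are hypothetical, and the sample data is for illustration only.

```python
def merge_records(existing, incoming):
    """Merge compound records keyed on InChIKey.

    Returns (merged, novel): merged maps each key to one combined record,
    novel lists the keys that appeared only in the incoming data.
    """
    merged = {rec["inchikey"]: dict(rec) for rec in existing}
    novel = []
    for rec in incoming:
        key = rec["inchikey"]
        if key in merged:
            # Duplicate structure: keep existing values, fill in missing fields.
            for field, value in rec.items():
                merged[key].setdefault(field, value)
        else:
            merged[key] = dict(rec)
            novel.append(key)
    return merged, novel


# Illustrative data only.
in_house = [
    {"inchikey": "BSYNRYMUTXBXSQ-UHFFFAOYSA-N", "name": "aspirin"},
    {"inchikey": "RZVAJINKPMORJF-UHFFFAOYSA-N", "name": "paracetamol"},
]
external = [
    {"inchikey": "BSYNRYMUTXBXSQ-UHFFFAOYSA-N", "name": "acetylsalicylic acid", "cas": "50-78-2"},
    {"inchikey": "RYYVLZVUVIJVGH-UHFFFAOYSA-N", "name": "caffeine"},
]

merged, novel = merge_records(in_house, external)
print(len(merged), "unique structures;", "novel:", novel)
```

Keying on a structure-derived identifier rather than on names is the design choice that matters here: two sources can call the same substance by different names and still be recognised as duplicates.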

The Singularity

Self assembling information

The activation energy for this is reducing

Entropically driven

Fear it, Follow it

Seek Beauty in all things

May all your data be structured!

Thanks for listening – It’s been emotional - David Walsh