An Application of Text Mining in Strategic Technical Planning
Paul Frey – Search Technology, Inc.
Nils Newman – Intelligent Information Services Corp.
Robert Watts – U.S. Army TACOM
Alan Porter – Georgia Institute of Technology
Symposium on Technical Intelligence
Division of Chemical Information
American Chemical Society National Meeting
San Diego, California, April 4, 2001
Preliminary Comments
• A word of caution …
– I am one of those people that previous speakers warned you about, because …
– I make and sell “hammers.”
– a.k.a. software tools to assist (not replace!) a skilled Technical Intelligence analyst
• Work reported occurred in late 1996
– Using an earlier prototype of VantagePoint
– Screen-shots in this presentation are re-creations
– But the story is true
Statement of the Problem
Domain: U.S. Army Tank-automotive and Armaments Command (TACOM)
• Replacement of high-cost, out-of-tolerance engine components is expensive
• Potential Solution: Recondition worn-out components using thermal-spray coatings
• Institutional Barrier: Related R&D program in the early 1980s was “too early”
Innovation Forecasting
• Search on basic topical terms in multiple data sources
• Examine results to refine query and re-do search
• Plot major trends and model the life cycle
• Identify qualitative categories for assessment of technology life cycle (e.g., academic and corporate research)
• Slice data by time and compare
• Identify and analyze special areas (e.g., gap analysis)
Initial Query
• EI Compendex
• “Ceramic” NEAR “Engine” (and variants)
• ~800 records
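A proximity query of this kind can be illustrated with a minimal sketch. The window size, the prefix matching used to catch variants such as "ceramics" and "engines", and the sample titles are all assumptions made for illustration; the slides do not state how EI Compendex defines NEAR.

```python
import re

def near(text, term_a, term_b, window=5):
    """True if a word starting with term_a lies within `window` words
    of a word starting with term_b (prefix match covers variants)."""
    words = re.findall(r"[a-z]+", text.lower())
    pos_a = [i for i, w in enumerate(words) if w.startswith(term_a)]
    pos_b = [i for i, w in enumerate(words) if w.startswith(term_b)]
    return any(abs(i - j) <= window for i in pos_a for j in pos_b)

# Made-up titles, for illustration only.
records = [
    "Thermal barrier ceramic coatings for diesel engine pistons",
    "Ceramics in dental restoration",
    "Engine noise measurement; a later note mentions glass and other ceramic materials",
]

hits = [r for r in records if near(r, "ceramic", "engine")]
print(hits)
```

Only the first title qualifies: the second has no engine term, and in the third the two terms are too far apart for the assumed window.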
Examine Publication Trend by Industry Segment (Overview)
1. Import the raw data (Import Filters/Editor)
2. Extract Year of Publication from a coded field (Thesaurus)
3. Clean up “Corporate Source” field
4. Assign industry segment in “Corporate Source” field (Thesaurus Groups)
5. Create a co-occurrence matrix
6. Plot the trend (VBScript, MS Excel)
Extract Publication Year
• Capture raw data
• Apply a thesaurus to condense the raw data
User-managed thesauri are based on “Regular Expressions”
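As a rough illustration of a regular-expression thesaurus, the sketch below condenses a coded source field to a four-digit year. The thesaurus layout (pattern and replacement pairs), the function name `apply_thesaurus`, and the sample field values are hypothetical; they are not VantagePoint's actual thesaurus format.

```python
import re

# Hypothetical thesaurus: (pattern, replacement) pairs, tried in order.
# The leading ".*" is greedy, so the last year-like token in the field wins.
YEAR_THESAURUS = [
    (re.compile(r".*\b((?:19|20)\d{2})\b.*"), r"\1"),
]

def apply_thesaurus(value, thesaurus):
    """Return the replacement for the first pattern matching the whole value."""
    for pattern, replacement in thesaurus:
        match = pattern.fullmatch(value)
        if match:
            return match.expand(replacement)
    return value  # leave unmatched values unchanged for manual review

# Made-up coded fields, for illustration only.
raw_fields = [
    "J Example Eng v 116 n 3 Jul 1994 p 605-610",
    "Example Technical Paper Series 1989 8p",
    "no date given",
]
years = [apply_thesaurus(f, YEAR_THESAURUS) for f in raw_fields]
print(years)
```

Values that match no pattern pass through untouched, which keeps them visible for a manual check instead of silently dropping them.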
Clean-up Corporate Source Field
• Automatic
• With manual confirmation
– Optional, but recommended
• Save and/or merge clean-up operations into a thesaurus for re-use
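One way to picture the clean-up step is a sketch that proposes merges automatically and turns the confirmed ones into a reusable lookup. The normalization rule (lowercase, strip punctuation, drop common company suffixes), the function names, and the sample names are assumptions for illustration; the slides do not describe the tool's matching algorithm.

```python
import re
from collections import defaultdict

# Assumed list of suffixes to ignore when comparing names.
SUFFIXES = {"inc", "corp", "corporation", "co", "company", "ltd"}

def normalize(name):
    """Lowercase, strip punctuation, and drop company suffixes."""
    words = re.findall(r"[a-z0-9]+", name.lower())
    return " ".join(w for w in words if w not in SUFFIXES)

def propose_merges(names):
    """Group name variants that share a normalized form (the automatic step)."""
    groups = defaultdict(list)
    for name in names:
        groups[normalize(name)].append(name)
    return {key: variants for key, variants in groups.items() if len(variants) > 1}

def build_thesaurus(confirmed):
    """Map every variant to the first variant of its group, for re-use."""
    return {v: variants[0] for variants in confirmed.values() for v in variants}

# Made-up corporate source values, for illustration only.
names = ["Acme Ceramics Inc.", "ACME CERAMICS", "Acme Ceramics Co", "Example National Lab"]

proposals = propose_merges(names)        # an analyst would review these
thesaurus = build_thesaurus(proposals)   # here every proposal is accepted
print(thesaurus)
```

In practice the analyst would edit `proposals` before building the thesaurus; that review is the manual confirmation step the slide recommends.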
Categorize Corporate Sources
• Create groups using a thesaurus
– Corporation
– Laboratory
– University
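The grouping above can be sketched as a small set of patterns, one per category. The keyword patterns, the "Unassigned" fallback, and the sample sources are assumptions for illustration, not the thesaurus used in the study.

```python
import re

# Hypothetical group thesaurus: first matching pattern decides the group.
GROUPS = [
    ("University", re.compile(r"\b(univ|university|college|institute of technology)\b", re.I)),
    ("Laboratory", re.compile(r"\b(lab|labs|laboratory|laboratories)\b", re.I)),
    ("Corporation", re.compile(r"\b(inc|corp|corporation|co|company|ltd)\b", re.I)),
]

def categorize(source):
    """Return the industry segment for a corporate source string."""
    for group, pattern in GROUPS:
        if pattern.search(source):
            return group
    return "Unassigned"

# Made-up corporate sources, for illustration only.
sources = ["Univ of Example, Dept of Mech Eng", "Example National Lab", "Acme Ceramics Inc.", "Unknown Source"]
segments = [categorize(s) for s in sources]
print(segments)
```

Order matters here: a university laboratory is counted as a university because that pattern is tried first.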
Cumulative Publication Trends
• Co-occurrence matrix
• Plot in Excel using VBScript
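A minimal sketch of the two steps, assuming each record has already been reduced to a (year, segment) pair. The original work plotted in Excel via VBScript; Python is used here only to show the arithmetic, and the records are made up.

```python
from collections import Counter
from itertools import accumulate

# Made-up (publication year, industry segment) pairs, for illustration only.
records = [
    (1984, "University"),
    (1985, "Corporation"), (1985, "Corporation"),
    (1986, "Laboratory"), (1986, "Corporation"), (1986, "University"),
]

counts = Counter(records)                      # (year, segment) -> record count
years = sorted({year for year, _ in records})
segments = sorted({segment for _, segment in records})

# Co-occurrence matrix: one row per segment, one column per year.
matrix = {s: [counts[(y, s)] for y in years] for s in segments}

# Running totals along each row give the cumulative publication trend.
cumulative = {s: list(accumulate(row)) for s, row in matrix.items()}

for s in segments:
    print(s, dict(zip(years, cumulative[s])))
```

Each row of `cumulative` is one line on the trend plot, with `years` as the horizontal axis.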
So …
• It looks like something significant happened in the mid-1980s to radically increase the publication rate.
• It also looks like things have slowed down in (then) recent years (early- to mid-1990s).
• Has the technology matured?
• Is it ready to transition?
Analyze Selected Areas
U.S. Patents – Ceramic Coating (cumulative)
Forecasting – Counting Patents
• Plot
• Model (coefficient of determination > 0.95)
• Predict “significant technology growth in the next 9 years”
• Monitor (leap to today)
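The plot, model, predict sequence can be sketched as follows. The slides do not say which growth model was fitted, so this example swaps in a simple exponential (a straight line fitted to the logarithm of the cumulative count) purely to show how a coefficient-of-determination check gates the prediction. The patent counts are made up, and a life-cycle S-curve would need a different model.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot

# Made-up cumulative patent counts, for illustration only.
years = list(range(1985, 1993))
cumulative = [5, 8, 12, 19, 28, 43, 66, 100]

# Fit log(count) against years elapsed since the first data point.
elapsed = [y - years[0] for y in years]
slope, intercept, r2 = fit_line(elapsed, [math.log(c) for c in cumulative])

def predict(year):
    """Extrapolate the fitted exponential to a given year."""
    return math.exp(intercept + slope * (year - years[0]))

# Only trust the extrapolation when the fit clears the 0.95 threshold.
if r2 > 0.95:
    print(f"R^2 = {r2:.3f}; projected cumulative count for 1993: {predict(1993):.0f}")
```

Monitoring then means re-running the fit as new counts arrive and checking whether the observed values stay close to the earlier projection.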
Look for existing forecasts
• Create groups of records using a thesaurus (e.g., “forecast”, “projection”, “future”, “market study”)
• Made “leap” to electronics industry
• Identified expert
• The rest is history
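Grouping records by forecast-related terms can be sketched with a plain keyword test. The four terms come from the slide; the function name and the sample titles are made up for illustration.

```python
# Terms listed on the slide as the forecast thesaurus.
FORECAST_TERMS = ("forecast", "projection", "future", "market study")

def is_forecast_record(title):
    """True if the title contains any forecast-related term."""
    text = title.lower()
    return any(term in text for term in FORECAST_TERMS)

# Made-up titles, for illustration only.
titles = [
    "Market study of ceramic coatings for electronic components",
    "Plasma spray deposition parameters",
    "Future directions in thermal-spray technology",
]
forecast_group = [t for t in titles if is_forecast_record(t)]
print(forecast_group)
```

Substring matching also catches variants such as "forecasts" and "forecasting", at the cost of occasional false positives that an analyst would screen out.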
PIEZOELECTRIC DEVICES - Manufacture
Results
• 1997
– Prepared case for new effort on reconditioning engine components using thermal-spray coatings
– U.S. Army invested $2M in 5-year R&D program
• Today
– Installing reconditioning equipment at Red River Army Depot for production use
– Expected near-term payback ~ $5.5M
References and Contact Information
• “Innovation Forecasting” by Robert J. Watts and Alan L. Porter, Technological Forecasting and Social Change, 56, 25-47 (1997).
• Paul Frey, Search Technology, Inc., Atlanta, GA
– [email protected]
– 770.441.1457
• Web Sites
– www.searchtech.com
– www.TheVantagePoint.com