INTELLIGENCE ADVANCED RESEARCH PROJECTS ACTIVITY (IARPA)
Forecasting Emerging Science and Technology
Dewey Murdick, Ph.D.
16 January 2015
“Invests in high-risk/high-payoff research programs that have the potential to provide our nation with an overwhelming intelligence advantage over our future adversaries.”
http://www.iarpa.gov/
Anticipatory Intelligence Programs: “Detecting and Forecasting Significant Events”
S&T Intelligence
Detecting and forecasting the emergence of new technical capabilities.
Indications & Warning
Early warning of social and economic crises, disease outbreaks, insider threats, and cyber attacks.
Strategic Forecasting
Probabilistic forecasts of major geopolitical trends and rare events.
FUSE: Foresight and Understanding from Scientific Exposition
ForeST: Forecasting Science & Technology
OSI: Open Source Indicators
ACE: Aggregative Contingent Estimation
Detecting and Forecasting the Emergence of New Technical Capabilities
FUSE: Forecasting emergence with English & Chinese Scientific Lit/Patents “Big Data”
ForeST: Forecasting S&T milestones with the “Wisdom of the Crowd”
FUSE Core Motivation
• Provide analysts a way to find and prioritize emerging technical areas for further exploration across a broad range of disciplines in English and Chinese.
• Help people outside of any particular leading research (publishing) or innovation (patenting) community maintain needed awareness (or hire a workforce to be aware) of the otherwise overwhelming technical literature (worldwide or by sector).
FUSE Goal: Validated, early detection of technical emergence
Reduce “technical surprise” by developing reliable
• Forecasts for the future prominence of scientific and technical terms (explored, people, docs, and orgs in Phase 2) and
• Indicators that function in a wide range of disciplines (technical cultures)
as found within the English and Chinese scientific and patent literature.
Next Generation Scientific & Technical Intelligence – Automation
Term Forecast (R=2007, F=2010)
扁钢 Flat steel
轴加工 Axis machining
拟人机器人 Humanoid robot
塑料加工 Plastics processing
正极材料 Cathode material
磨头 Grinding head
甲醛释放 Formaldehyde emission
锂离子电池 Lithium-ion battery
锰钢 Manganese steel
数控机床 CNC machine tools
纳米银 Nano-Silver
等离子切割 Plasma cutting
膨润土 Bentonite
注塑 Plastic injection moulding
100k+ terms (per discipline) reduced to 1k “nominated” terms
FUSE Program
Monthly reading list: ~600k scientific articles [solid lines] and patents [dotted lines] in English and Chinese
Millions of pages of text per month
Forecasting prominence of technical concepts (2-5 years)
[Figure: number of docs by year; labeled value: 3 million]
Example U.S. Patent Indicators
Reference Period: 2007, Forecast Period: 2010
Hundreds of indicators prototyped, tested & implemented.
[Figure: Technical Terms vs. Patent Indicators]
Example English Sci. Lit. Indicators
Reference Period: 2007, Forecast Period: 2010
[Figure: Technical Terms vs. Sci Lit Indicators]
Prototype Interface (1 of 2): Term List View
Prototype Interface (2 of 2): Indicator View
Link to storyboard explanation of prominence forecast
FUSE Evidence Explanation Example (aka Storyboard)
Number of Beds & Cribs Granted US Patents by Assignee Type and Time Period
[Figure: # granted US patents (axis 0 to 3000) by time period (1981-1985, 1986-1990, 1991-1995, 1996-2000, 2001-2005, 2006-2010); series: Total, Company, Academic/Govt/Non-Profit, Individual, Unclassified]
FUSE Research Thrusts
Document Features: Patents, S&T Lit
Evidence Explanation & Demo User Interface
Indicator Development: Leading indicators
System Engineering
Nomination Quality: Forecast formulation
Prominence of terms in 2-5 years
Theory & Hypothesis Development: Supports indicator development and explanation; a robust theory is unlikely
PDF to XML Transformation
Technical Terms (Input): Term extraction and filters
Ocean acidification
Graphene
Nanocarriers
…
Next Generation Scientific & Technical Intelligence – Automation + Smart People
Ask the crowd!
If successful, will automatically alert IC analysts to emerging capabilities, and tap broadest knowledge base to assess.
10,000+
FUSE Program
ForeST Program
Probability of reaching a significant S&T milestone (months to years)
“Nanocarriers” / “Nanomedicine”: Example of Current Forecast
– FUSE (term usage forecast):
  • >100% growth in “nanocarriers” (for drug delivery) usage in Nature/Science/PNAS (scientific literature) in 2016
  • 25-50% growth in “nanomedicine” usage in 2016
– ForeST (milestone probability forecast):
  • When will a prototype of a nanomachine, containing a nanomotor and intended for drug delivery, be built in lab?
    – 104 forecasts
    – 42% chance of occurring “Between Jan 1 and Dec 31, 2016”
[Figure: “Nanocarriers” forecast made; labels: FUSE, ForeST]
Data as of 10/20/2014
Examples of Current Forecasts
FUSE Triggered ForeST
Perovskite (CaTiO₃):
• High topic entropy (2013)
• High forecasted term usage rates in Science, Nature, and PNAS (2016)
What will be the next highest published efficiency of a tin-based perovskite solar cell? 9.70% Efficiency (104 forecasts)
Thin film type solar cell:
• High forecasted term usage rates in granted patents (2015)
By the end of 2014, will the highest-reported efficiency of a cadmium telluride (CdTe) thin film solar cell be greater than the highest-reported efficiency for a CIGS [Copper Indium Gallium Selenide] solar cell? 40% Chance (62 forecasts), high in Sept
Quinone:
• High topic entropy (2013)
• High forecasted term usage rates in Science, Nature, and PNAS (2016)
Will a utility install a quinone-based organic battery for energy storage by January 1, 2017? 31% Chance (20 forecasts)
Data as of 10/20/2014
ForeST: Visit SciCast.org
Forecast-driven Test and Evaluation: Leading and Lagging Indicators
[Figure: emergence stages (Pre-Emergence, Emerging, Emerged); labels: Potential, Mysterious Process, Obvious in Retrospect, Leading Indicators, Lagging Indicators]
Measure relationship of performers’ leading indicators to Government’s lagging indicators
Results Summary
• FUSE: Hundreds of test scenarios each with thousands of evaluations in English and Chinese (Dec 2014)
  – All teams should meet 3-year term forecast targets
    Nomination: 33% precision (41%), 50% recall (71%), 10% false positive (8%)
    Evidence clarity: >90% of analysts’ evaluations pass (Apr 2014)
  – Benchmarks: simple machine learning model (2-3x improvement), status quo heuristic, chance
• ForeST: 89,500 forecasts on 893 questions (Oct 2014)
  – Benchmarks: status quo heuristic, equiprobability, other forecasts
  – Accuracy on 241 closed questions, 25-40% more accurate than benchmarks
• Publications: >100 articles for FUSE (goo.gl/u2kum) and ForeST
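The nomination figures above (precision, recall, false positive) can be illustrated with the standard confusion-matrix definitions. This is a minimal sketch, not the FUSE evaluation code: the class name, the function names, and the counts are hypothetical, chosen only so the output matches the first figure in each pair (33%, 50%, 10%). It also assumes "false positive" means the false positive rate FP / (FP + TN), which the slide does not state.

```python
from dataclasses import dataclass


@dataclass
class NominationCounts:
    """Hypothetical outcome counts for one nomination test."""
    true_positives: int   # nominated terms that did become prominent
    false_positives: int  # nominated terms that did not become prominent
    false_negatives: int  # prominent terms that were not nominated
    true_negatives: int   # non-prominent terms that were not nominated


def precision(c: NominationCounts) -> float:
    # Share of nominated terms that turned out to be prominent.
    return c.true_positives / (c.true_positives + c.false_positives)


def recall(c: NominationCounts) -> float:
    # Share of prominent terms that were nominated.
    return c.true_positives / (c.true_positives + c.false_negatives)


def false_positive_rate(c: NominationCounts) -> float:
    # Assumed definition: share of non-prominent terms that were nominated.
    return c.false_positives / (c.false_positives + c.true_negatives)


# Illustrative counts only; they are not FUSE data.
counts = NominationCounts(
    true_positives=50,
    false_positives=100,
    false_negatives=50,
    true_negatives=900,
)

print(f"precision: {precision(counts):.0%}")
print(f"recall: {recall(counts):.0%}")
print(f"false positive rate: {false_positive_rate(counts):.0%}")
```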
Forecasts & Technical Horizon Scanning
FUSE: S&T Lit, Patents
ForeST elicits crowd judgments about performance and applications
Horizon Scanning integrates FUSE, ForeST, and additional analysis to forecast technical adoption, application, and impact
Lessons Learned & Surprises
• Human judgments about technical emergence / de-emergence do not make for effective ground truth
  – Poor temporal resolution / “linearization” of memories
  – SME-favorites never die
  – Small group SME forecasting accuracy is low
• Very few terms drop out of the literature
  – Boltzmann machines, Cold Fusion, etc. (scientific literature)
  – 2000+ crib and bed patents filed each year (patents)
• S&T market questions are hard to write
  – Significant expertise required to formulate and define resolution
It is easy to get myopic and ignore the contributions of communities you don’t know exist (e.g., foreign, subfield specialties).
Early detection of technical emergence: What should work when FUSE ends?
Indicators
Forecasting Models
Term Extraction
Demo Interface
Storyboards
Term Quality, tested and +/- defined
Term families, working for demo
Nomination Quality, targets reached
Rich(er) set, tested (e.g., functional, ablation, lift)
Evidence Quality, targets reached (or maintained)
User Experience, informal
Many Challenges Remain
• S&T event coding at the societal level (IARPA-RFI-14-04)
• Technical concept resolutions (e.g., terms, related document groups, sector)
  – Facets (e.g., component, application, academic discipline, …)
  – Automated, sufficiently accurate
• Model drift for FUSE-like indicators (IARPA-RFI-14-02)
• Technical emergence
  – More use cases
  – New objective functions
  – Explore across regions, languages, and cultures
• Emergent phenomena detection & forecasting in other domains
A clear problem definition and evaluation will always be critical
Current Research, Test & Evaluation Teams
FUSE: Year 3 of 4-year program
ForeST: Year 1 of 2-year* program
1790 Analytics, Brandeis University, New York University, Rensselaer Polytechnic Institute
SciTech Strategies, University of Mass, Amherst
Intelligent Information Services Co. (IISC), University of California, Irvine, University of Illinois, Urbana-Champaign, University of Michigan
Gold Brand Software, LLC, Inkling Markets, KaDSci, LLC, Tuuyi
FUSE ForeST
Test & Evaluation
1790 Analytics
*Market technology developed in ACE Program in first 2 years.
Recent News Coverage
Questions
Dewey Murdick, Ph.D., FUSE Program Manager, IARPA
Jason Matheny, Ph.D., ForeST Program Manager, IARPA
BACKUP
Evaluation Attempt #1: Case Studies
• Drawn from diverse areas of scientific inquiry & application:
  – Biological Sciences / Biotechnology
  – Computer Science / Information Science; Engineering
  – Mathematics / Statistics
  – Physical Sciences; Earth Science
  – Medical / Clinical / Infectious Disease / Health Services
  – Social Sciences; …
• Technical emergence measured from “real world” view point, but connected to literature
• Multiple case studies to be produced; some are held back for evaluation
  – Case studies are representative but not comprehensive
  – Insufficient to train technical emergence classifiers
  – Limited examples of emergence & non-emergence (10s planned)
  – Reference baseline has limited temporal resolution (~5 year blocks)
Phase 2 Evaluation: Nomination Test
[Figure: Nomination Test diagram. Labels: Time axis with Data Period, Reference Period, gap, Forecast Period, Tnow; FUSE Document Repository; FUSE Performer System; Performer-defined indicators (I1, I2, …, In); Prominence Forecasts; T&E; Test Sample (e1, e2, e3, e4, e5, …, en); Ground Truth Data; GTF*(E,D,R,F); DRF; Compare; NQ Score; LEADING; LAGGING]
(E)ntity, (D)ata Period, (R)eference Period, (F)orecast Period
*GTF = Ground Truth Function
Prominence Score and Targets
• ‘Prominence’ is a FUSE-created function designed to model the informal conceptual notion of emergence and have the mathematical attributes of a well-behaved scoring function.
  – Primary inputs are counts in two particular years: a reference year and a forecast year
  – Some smoothing function used around reference and forecast years
• What is a reasonable threshold for ‘prominence’?
• Each line shows the prominence value associated with a particular increase during the forecast gap (e.g., 3 years or 5 years).
• Prominence function changes rapidly for low counts (dashed line = 3); quickly levels off
• Prominence = 0.3 corresponds to a ~50% increase during a forecast gap (dotted line)
  – 3 year forecast gap = ~15% increase per year
  – 5 year forecast gap = ~9% increase per year
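The per-year figures for the 3-year and 5-year forecast gaps can be checked with a short calculation. This sketch assumes compound (geometric) annual growth, which the slide does not state, and the function name annual_growth_rate is illustrative. It does not implement the prominence function itself, whose formula is not given here. Under that assumption a 50% total increase works out to about 14.5% per year over 3 years and about 8.4% per year over 5 years, close to the rounded ~15% and ~9% on the slide.

```python
def annual_growth_rate(total_increase: float, gap_years: int) -> float:
    """Compound annual growth rate that yields `total_increase` over `gap_years`.

    total_increase is a fraction, e.g. 0.5 for a 50% increase.
    Compounding is an assumption; the slide does not give its formula.
    """
    return (1.0 + total_increase) ** (1.0 / gap_years) - 1.0


# ~50% increase during the forecast gap (the Prominence = 0.3 case above)
rate_3_year_gap = annual_growth_rate(0.5, 3)
rate_5_year_gap = annual_growth_rate(0.5, 5)

print(f"3-year gap: {rate_3_year_gap:.1%} per year")
print(f"5-year gap: {rate_5_year_gap:.1%} per year")
```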
FUSE: Scientific and Patent Literature
FUSEnet Support for T&E
FUSEnet
– Government system hosted by Oak Ridge National Laboratory (ORNL)
– Protected unclassified system with remote access for all approved users
– Local hardware allowed for selected data
Updated Specifications for FUSEnet
– 770 gigaFLOPS* of maximum performance
– 16 blade servers (plus 3 support blades), each with 2 CPUs, each with 6 cores, totaling 192 cores (processors)
– 3.07 TB of RAM w/ 192 GB per node
– Disk space:
  • EMC Isilon: 480 TB (4 Isilon nodes) running NFS over 10 Gb/s Ethernet
  • HP LeftHand: 260 TB of effective disk storage used for data backup
  • Isilon disk I/O is roughly 3-10x improvement over the LeftHand Storage
– Networking: Flex-10 modules totaling ≤160 Gbits/sec bandwidth per enclosure x 2 enclosures
– Virtualized computing space through VMware
– Access and control policies are enforced by ORNL
– Call Center and metrics for service quality
* FLoating point OPerations per Second
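The core and RAM totals in the specification follow from the per-blade figures. The sketch below is only an arithmetic check: it assumes that "node" means one of the 16 compute blades (support blades excluded) and that TB is decimal (1 TB = 1000 GB). The constant names are illustrative.

```python
# Figures taken from the specification above.
COMPUTE_BLADES = 16
CPUS_PER_BLADE = 2
CORES_PER_CPU = 6
RAM_PER_NODE_GB = 192
PEAK_GIGAFLOPS = 770

# 16 blades x 2 CPUs x 6 cores
total_cores = COMPUTE_BLADES * CPUS_PER_BLADE * CORES_PER_CPU

# Assumes one node per compute blade and decimal terabytes.
total_ram_tb = COMPUTE_BLADES * RAM_PER_NODE_GB / 1000

# Derived figure, not stated on the slide.
gigaflops_per_core = PEAK_GIGAFLOPS / total_cores

print(f"total cores: {total_cores}")
print(f"total RAM: {total_ram_tb:.2f} TB")
print(f"peak per core: {gigaflops_per_core:.1f} gigaFLOPS")
```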
Data Transformation Pipeline for Chinese Scientific Literature, Phase 3
• FUSE Chinese PDF Processing Pipeline converts image and textual PDF documents, typically scientific and technical articles, into usable XML.
• The pipeline consists of several different processes that convert the document and subsequently validate the results’ compliance with a common FUSE schema.
• The virtualized processing pipeline successfully converted 96.5% of 17.4 million PDFs* into XML at ~104k PDFs per day.
* A remaining 6.6M image PDFs were not processed, but could be converted at 13k PDFs per day.
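The throughput figures imply rough processing times, sketched below. The inputs are the numbers quoted above. The derived durations are back-of-the-envelope estimates that assume a constant daily rate, and the constant names are illustrative.

```python
# Figures quoted on the slide.
TOTAL_PDFS = 17_400_000
SUCCESS_RATE = 0.965
MAIN_RATE_PER_DAY = 104_000
IMAGE_BACKLOG_PDFS = 6_600_000
IMAGE_RATE_PER_DAY = 13_000

# PDFs successfully converted to XML.
converted_pdfs = round(TOTAL_PDFS * SUCCESS_RATE)

# Days to push the 17.4M PDFs through at the quoted rate
# (assumes the rate is constant).
main_days = TOTAL_PDFS / MAIN_RATE_PER_DAY

# Days the unprocessed image PDFs would take at the slower rate.
backlog_days = IMAGE_BACKLOG_PDFS / IMAGE_RATE_PER_DAY

print(f"converted: {converted_pdfs:,} PDFs")
print(f"main run: about {main_days:.0f} days")
print(f"image backlog: about {backlog_days:.0f} days")
```

At the quoted rates the image backlog would take roughly three times as long as the main run, despite being less than half its size.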