Research Profiling – “Tech Mining” Web of Science topical search results
• Nano-enhanced Thin-film Solar Cells
Alan Porter
Director of R&D, Search Technology, Inc. [& Georgia Tech]
[email protected]
For INGENIO, Valencia, 2010
Topics
1. Introduction to “Tech Mining”
2. Research and Development (R&D) Profiling -- Illustrated for Nano-enhanced, thin-film Solar Cells
3. Discussion
Stages in Mining R&D Information Resources
1. Literature review (within research community)
2. Tech Mining
• Multiple Science, Technology and Innovation (“ST&I”) data to mine
• To generate effective intelligence
3. Research Profiling: Characterizing a body of research publication activity
• Focus on select research activities
• Characterize the “research landscape”
4. Structured Knowledge Discovery -- Literature-Based Discovery (“LBD”)
How do you extract effective intelligence from electronic ST&I information resources?
Tech Mining
Alan L. Porter and Scott W. Cunningham
John Wiley & Sons Inc., 2005
Search Technology, 2010
Start with the questions!
Tech Mining Framework: MOT Issues, Questions, and Indicators

13 MOT Issues
• R&D Portfolio Mgt
• R&D Project Initiation
• Engr Project Initiation
• New Product Development
• Strategic Planning
• Track/forecast emerging or breakthrough technologies
• etc.

39 MOT Questions
What?
• What’s hot?
• Fit into tech landscape?
• New frontiers at fringe?
• Drivers?
• Competing technologies?
• Likely development paths?
Who?
• Who are available experts?
• Which universities or labs lead?

~200 Innovation Indicators
• Mapping of topic clusters within the technology
• 3-D trend charts for topic clusters
• Ratio of conference to journal papers (benchmarked)
• Scorecard rate-of-change metrics for topic clusters
• Time slices to show evolution of topical emphases
• Topic growth modeling (S-curve) fit & extrapolation
• Profile table of main players
• Pie chart: Company vs. Academic vs. Government publishing
• Spreading (or constricting) # of players by topic
“Answers”: Innovation Indicators
• Technology Life Cycle Indicators
- e.g., growth curve location & projection
• Innovation Context Indicators
- e.g., presence or absence of success factors (funding, standards, infrastructure, etc.)
• Product Value Chain and Market Prospects Indicators
- e.g., applications, sectors engaged
How to do Tech Mining (for Research Profiling): 8 steps
1. Spell out the questions and how to answer them
2. Get suitable data
3. Search (iterate)
4. Import into text mining software (e.g., VantagePoint)
5. Clean the data
6. Analyze & interpret
7. Represent the information well – communicate!
8. Standardize and semi-automate where possible
Six information types

Technical Information
• Science, Technology & Innovation (“ST&I”) Databases (e.g., Web of Science; CSCD, Thomson Innovation)
• Internet Sources (e.g., Googling)
• Technical Expertise

Contextual Information
• Business, competition, customer, policy, popular content Databases (e.g., Thomson One)
• Internet Sources (e.g., blogs, website profiling)
• Business Expertise
VantagePoint Import Filters and Tools
A wealth of diverse information sources for innovation management

On-line Data Sources
Cambridge Scientific Abstracts, Delphion, Dialog, EBSCOHost, Ei Engineering Village, Factiva, ISI Web Of Knowledge, Lexis Nexis, Micropatent, Ovid, Patbase, Questel-Orbit, SilverPlatter, STN, Thomson Innovation

Custom Data
Comma/tab delimited tables, Microsoft Excel and Access, SmartCharts, XML

Databases
Aerospace, Art Abstracts, Biobase, Biological Abstracts, Biological Sciences, Biosis, Biotechno, Business & Industry, CAPlus (AnaVist export), Cassis, CBNB, Claims, Computer & Info Systems, Corrosion, Current Contents, Derwent Biotech Abstracts, Derwent Innovations Index, Derwent World Patent Index, Ei Compendex, EMBase, EnCompass Literature, EnCompass Patents, Energy, EnergySciTech, Engineering Materials Abstr, Envr Sci & Pollution Mgmt, ERIC, EuroPat, FamPat
Focust, Food Sci & Tech, Foodline Market, Foodline Science, Forege, Frosti, FSTA, Gale PROMT, GeoRef, Global Reporter, IFIPAT, IFIUDB, INPADOC, INSPEC, IPA, ISD, ITRD, JAPIO, JICST, Kosmet, LGST, MATBUS, Medline, METADEX, Mgmt and Org Studies, Micropatent Materials, Mobility, NSF Awards, NTIS
Pascal, Patent Citation Index, PCT, PCTPAT, Phin, Pira, Pluspat, PROMT, PsycINFO, PubMed, Rapra, Recent Refs, Reference Manager, Science Citation Index, SciSearch, Scopus, Tech Research, ToxFile, Transport, USApps, USPat, Waternet, WaterResAbs, Web of Science, WeldaSearch, Wisdomain

Record/Field Tools
• Combine duplicate records
• Remove duplicate records
• Create “frankenrecords” (merge records from dissimilar sources)
• Classify records
• Merge fields
• Clean up fields
• Apply thesauri
Case Examples
Getting to the data
- usually via internet
Getting the data
- search within databases
- retrieve abstract records electronically
Import into text mining software for cleaning & analyses
Nano-enhanced Thin-film Solar Cells
Analysis of Global Research Activities with Future Prospects
Ying Guo
Ph.D. Candidate, Beijing Institute of Technology
Visiting Student, Georgia Institute of Technology
Alan L. Porter
Lu Huang
International Association for Management of Technology, 2009
• Possible ST&I Policy-maker Question: What are national R&D strengths and weaknesses?
• Possible Technology Manager Questions: How to gauge relative opportunities for collaborative development, as well as monitor emerging competitors?
[Diagram: Tech Mining of our papers on Global Research Activities with Future Prospects, framed by Who, What, When, Where, Why, and How]
Why, How: Need more expert inputs (we’re working on this)
We look at such aspects as:
1. What research fields are involved? – science overlay maps
2. Quantity -- publication numbers and trends
3. Diversity -- national contrasts
4. Quality -- citations
5. Patterns of research networking -- using VantagePoint
6. “Hot” topics
Data:
• Basic Dataset: a global dataset of nano publications downloaded from the SCI
• Search Expression: defined “thin film and (solar or photovoltaic)” as our search expression
• Resulting Dataset: acquired the dataset containing 1659 records for the time period from 2001 to mid-2008
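The search expression “thin film and (solar or photovoltaic)” is a Boolean filter. A minimal Python sketch of the same logic applied to local text follows. The sample titles are invented for illustration, and the regular expressions are an assumption about how to match variants such as “thin-film” or “photovoltaics”; the SCI interface has its own query syntax, which this does not reproduce.

```python
import re

def matches(text: str) -> bool:
    """True if text satisfies: thin film AND (solar OR photovoltaic)."""
    t = text.lower()
    # Assumption: accept "thin film", "thin-film", and "thinfilm" (plurals included).
    has_thin_film = re.search(r"thin[\s-]?film", t) is not None
    # Assumption: accept any word that starts with "solar" or "photovoltaic".
    has_energy_term = re.search(r"\b(solar|photovoltaic)", t) is not None
    return has_thin_film and has_energy_term

# Hypothetical record titles, invented for illustration only.
records = [
    "Nanostructured ZnO thin film for dye-sensitized solar cells",
    "Thin-film transistors on flexible substrates",
    "Photovoltaic properties of CdTe thin films",
    "Solar thermal collectors: a review",
]

selected = [r for r in records if matches(r)]
for title in selected:
    print(title)
```

Only the first and third titles pass, because the second lacks a solar or photovoltaic term and the fourth lacks “thin film”.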
The Three Core VantagePoint Analytical Tools
• LISTS
• CO-OCCURRENCE MATRICES
• MAPS
KEY: Co-occurrence statistics to find relationships
• Count the relative degree to which “terms” (e.g., keywords, author names, organizations, years) appear together in particular documents in the set
• The higher the co-occurrence, the stronger the potential relationship
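The counting idea can be sketched in a few lines of Python. This is an illustration of raw pairwise co-occurrence counts only. The records and key terms are hypothetical, and VantagePoint's own matrices may weight or normalize counts differently.

```python
from collections import Counter
from itertools import combinations

# Hypothetical records: each entry is the set of key terms on one document.
records = [
    {"ZnO", "thin film", "sol-gel"},
    {"ZnO", "thin film", "TiO2"},
    {"TiO2", "thin film"},
    {"ZnO", "sol-gel"},
]

def cooccurrence(records):
    """Count, for every unordered pair of terms, the records that contain both."""
    counts = Counter()
    for terms in records:
        # Sorting gives each pair one canonical (a, b) ordering.
        for a, b in combinations(sorted(terms), 2):
            counts[(a, b)] += 1
    return counts

pairs = cooccurrence(records)
for (a, b), n in pairs.most_common():
    print(f"{a} + {b}: {n}")
```

Pairs that never appear together are simply absent from the counter, so looking them up returns 0.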
“Top N’s” Available
1. Document types (e.g., articles)
2. Publication Years (essential for trend analyses)
3. Times Cited
4. Countries
5. Affiliations
6. Funding agencies
7. Authors
8. Journals (or Sources)
9. Key terms
10. Subject Categories
11. Macro-Disciplines
12. Organization Types
Nano-enhanced Solar Cell Web of Science Subject Category Concentrations of the Leading Countries

Subject Category                      USA  India  Germany  Japan  China
Materials Science, Multidisciplinary  126    132       83     68     63
Physics, Applied                      112     56       92     68     53
Physics, Condensed Matter              59     72       80     47     46
Chemistry, Physical                    82     26       28     34     32
Energy & Fuels                         26     49       16      9     10
Materials Science, Coatings & Films    24     21       26     17     21
Academic-Corporate-Government Publishing by Country
Cross-national Collaboration
(% = International Cooperation among top 10; "-" marks an empty cell)

             %      USA  India  Germany  Japan  China  France   UK  South Korea  Mexico  Spain
USA          20.1%  288      5       16      5      6       5    3            9       8      1
India        26.4%    5    239        4     15      -       4    5           20      10      -
Germany      27.1%   16      4      195     10      2       8    8            1       -      4
Japan        24.2%    5     15       10    182      4       2    5            2       1      -
China        10.4%    6      -        2      4    182       2    2            1       2      -
France       24.8%    5      4        8      2      2     113    4            -       -      3
UK           34.5%    3      5        8      5      2       4   84            1       -      1
South Korea  52.2%    9     20        1      2      1       -    1           69       2      -
Mexico       38.5%    8     10        -      1      2       -    -            2      65      2
Spain        17.5%    1      -        4      -      -       3    1            -       2     63
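The slide does not define the “% International Cooperation” column. One reading that reproduces the USA figure is the sum of that country's counts shared with the other nine countries, divided by its diagonal count. The sketch below checks that reading against the USA row only; treat the formula as an assumption rather than the authors' stated method.

```python
# USA row of the collaboration table: 288 on the diagonal, then the counts
# shared with India, Germany, Japan, China, France, UK, South Korea, Mexico, Spain.
usa_total = 288
usa_shared = [5, 16, 5, 6, 5, 3, 9, 8, 1]

def cooperation_share(total, shared):
    """Assumed definition: shared counts with other countries as a percentage of the diagonal count."""
    return 100.0 * sum(shared) / total

usa_pct = round(cooperation_share(usa_total, usa_shared), 1)
print(f"USA: {usa_pct}%")
```

Under this reading the USA row gives 58 / 288, which rounds to the 20.1% shown on the slide.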
Dye Sensitized Solar Cell (“DSSC”) research by organization type, over Time (from SCI)
# of author affiliations/paper for DSSC publications (SCI)
Nano-Structured ZnO Thin-film Solar Cells Publication by Countries and Years
[Chart: number of publications per year, 2001 to 2007, by country: China, India, Japan, USA, Mexico, Germany, South Korea, Spain, France]
[Chart: percentage share of publications per year (0% to 100%), 2001 to 2007, by country: France, Spain, South Korea, Germany, Mexico, USA, Japan, India, China]
Nano-Structured ZnO Thin-film Solar Cells Publication: Top 10 countries by Years – note the Bumpiness for Spain
DSSC Publications (SCI) with % 2006 or later
Projecting Nano-enhanced Solar Cell Research Activity
[Chart: actual data and projected data]
Research publication activity and citation (impact) characteristics
[Scatter plot: quality (# of citations, 0 to 2000) vs. activity (# of records, 0 to 350) for USA, India, Germany, UK, Japan, China, France, South Korea, Mexico, and Spain]
• Nodes above the diagonal suggest relatively higher quality (US and UK). Below the diagonal, the closer to the diagonal, the higher the quality of that country’s research.
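The diagonal comparison amounts to asking whether a country's citations exceed what a reference citations-per-record rate would predict. The sketch below is a rough illustration. The country labels and numbers are hypothetical, not read from the chart, and using the pooled citations-per-record rate as the diagonal's slope is an assumption, since the slide does not say how the diagonal was drawn.

```python
# Hypothetical (records, citations) pairs; not values from the chart.
countries = {
    "Country A": (300, 2000),
    "Country B": (200, 800),
    "Country C": (100, 700),
}

def pooled_rate(data):
    """Assumed diagonal slope: total citations divided by total records."""
    total_records = sum(records for records, _ in data.values())
    total_citations = sum(citations for _, citations in data.values())
    return total_citations / total_records

def above_diagonal(data, slope):
    """Names of countries whose citations exceed slope * records."""
    return sorted(
        name for name, (records, citations) in data.items()
        if citations > slope * records
    )

slope = pooled_rate(countries)
above = above_diagonal(countries, slope)
print(f"slope = {slope:.2f} citations per record; above diagonal: {above}")
```

With these invented numbers, Country A and Country C sit above the line and Country B sits below it.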
“Hot topic” shown by relative trends
ZnO has attracted increasing attention in recent years and is on trend to catch up with TiO2
“Hot Topics”
Ratio-recent = Ratio of Occurrences in 2007-08 to those in 2001-06

Ratio-recent  # Records  Top 20 Key Terms
1.14           47        conjugated polymer
0.85           74        fabrication
0.85           61        TiO2
0.74           66        chemical vapor deposition
0.65           28        amorphous silicon
0.53           72        morphology
0.52           94        semiconductor
0.50           48        fullerene
0.48           49        zinc oxide
0.46           51        microstructure
0.41           65        spray pyrolysis
0.36           49        heterojunction
0.32           37        CdTe
0.29          102        electrodeposition
0.28           92        CuInSe2
0.24           21        anatase
0.22           39        chemical bath deposition
0.17           21        Cu(In
0.00           37        sol-gel
0.00           22        photoconductivity
0.44                     Top 20 Key Terms combined
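The ratio-recent measure can be computed as occurrences in 2007-08 divided by occurrences in 2001-06. The sketch below illustrates that under stated assumptions. The yearly counts are invented: they were chosen to total 47 and give 1.14, like the first row of the table, but the year-by-year split is not from the source.

```python
def recency_ratio(counts_by_year, recent_years, earlier_years):
    """Occurrences in the recent window divided by occurrences in the earlier window."""
    recent = sum(counts_by_year.get(y, 0) for y in recent_years)
    earlier = sum(counts_by_year.get(y, 0) for y in earlier_years)
    if earlier == 0:
        # Assumption: a term with no earlier occurrences is reported as infinite if active, else 0.
        return float("inf") if recent else 0.0
    return recent / earlier

# Hypothetical yearly occurrence counts for one key term.
term_counts = {2001: 1, 2002: 2, 2003: 3, 2004: 4, 2005: 5, 2006: 7, 2007: 15, 2008: 10}

ratio = recency_ratio(term_counts, range(2007, 2009), range(2001, 2007))
print(round(ratio, 2))
```

Because the dataset runs only to mid-2008, the recent window is shorter than two full years, so a ratio near the combined value of 0.44 already indicates average momentum rather than decline.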
Recent Entrants
• We need not restrict the temporal comparison to key terms or topics
• Same modus operandi can be applied to identify new or recent entrants to the research (e.g., first papers on the topic from a given organization)
• Another variant is the inverse – to look for which participants seem to have abandoned the topic (no publications since Year X)
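Both variants reduce to finding each organization's first and last publication year on the topic. The sketch below assumes that approach. The organization names, years, and function names are hypothetical.

```python
# Hypothetical (organization, publication year) pairs.
papers = [
    ("Org A", 2001), ("Org A", 2003), ("Org A", 2004),
    ("Org B", 2006), ("Org B", 2007),
    ("Org C", 2002), ("Org C", 2007),
]

def first_and_last_year(papers):
    """Map each organization to (first year, last year) of its papers."""
    span = {}
    for org, year in papers:
        first, last = span.get(org, (year, year))
        span[org] = (min(first, year), max(last, year))
    return span

def recent_entrants(papers, since):
    """Organizations whose first paper on the topic appeared in `since` or later."""
    return sorted(o for o, (first, _) in first_and_last_year(papers).items() if first >= since)

def dropouts(papers, year_x):
    """Organizations with no publications since Year X (last paper before year_x)."""
    return sorted(o for o, (_, last) in first_and_last_year(papers).items() if last < year_x)

print(recent_entrants(papers, 2006))
print(dropouts(papers, 2005))
```

In practice organization names need cleaning first, since name variants would otherwise register as spurious new entrants.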
Visualization (Maps)
1. VantagePoint Maps
- Auto-correlation maps
- Cross-correlation maps
- Factor maps
2. Social Network Analysis (SNA)
3. Science Overlay Maps
4. Geo-mapping
Factor Map (Principal Components Analysis) groups terms based on their tendency to co-occur across records
Social Network Analysis (SNA)
• VantagePoint offers several application opportunities
- Create a sub-dataset for a given country or organization
- Within that target group, for the given research topic, explore research network connections
• Examples
- Collaborations
- Shared interests
- Discrepancies between interests & collaboration
• Rich options
- Highly co-cited authors (e.g., nano in social sciences)
- Highly co-citing authors
- Bibliographic coupling (shared referencing)
NETFSC Research networking comparison
USA (dispersed) vs Germany (1 central organization)
[Two Auto-Correlation Maps of Affiliation (Name Only), one for the USA and one for Germany, with thresholds >6 and >5. Nodes shown: Emory Univ, Northwestern Univ, MIT, Univ Massachusetts, Univ Delaware, Univ Calif Los Angeles, Johns Hopkins Univ, Univ Florida, Univ Calif Santa Barbara, Penn State Univ, Univ Washington, Natl Renewable Energy Lab, Univ Leipzig, Hahn Meitner Inst Berlin GmbH, Free Univ Berlin, Univ Wurzburg, Bulgarian Acad Sci, Tech Univ Darmstadt, Univ Stuttgart, Univ Giessen, Univ Erlangen Nurnberg, Tech Univ Munich, Max Planck Inst Polymer Res, Gifu Univ]
Science Overlay Mapping
1. Start with Web of Science file in VantagePoint
• List the Subject Categories or
• Cited Subject Categories (somewhat complicated process)
• Output a vector file of SCs or Cited SCs
2. In Pajek
• Select the SCI (175 SC) or SCI+SSCI (221 SC) base map [thanks to Ismael Rafols & Loet Leydesdorff]
• Edit your map
3. In MS Powerpoint
• Overlay on the appropriate base map
4. Or, go to www.idr.gatech.edu/ -- select “Upload Map”
Science Overlay Map [see: www.idr.gatech.edu – includes “how to make your own map” and full citations]
[Science overlay map: Nano-Thin-Film Publications 2001-08 Distribution, overlaid on the base 175 Subject Category Science Map (Leydesdorff & Rafols, forthcoming). Macro-discipline labels: Geosciences, Agri Sci, Ecol Sci, Infectious Diseases, Env Sci & Tech, Biomed Sci, Chemistry, Clinical Med, Cognitive Sci, Health Sci, Engr Sci, Mtls Sci, Computer Sci, Physics. Highlighted Subject Categories: Chemistry, Physical; Energy & Fuels; Materials Science, Multidisciplinary; Physics, Applied; Physics, Condensed Matter; Materials Science, Coatings & Films]
Geo-map: Nano-enhanced Solar Cells – European Institutions >=10 papers
Resources
• www.theVantagePoint.com – offers multiple papers and some case analyses
• Tech Mining by Alan Porter and Scott Cunningham, Wiley, 2005.
• Porter, A.L., Kongthon, A., Lu, J-C., Research Profiling: Improving the Literature Review, Scientometrics, Vol. 53, p. 351-370, 2002.
• Interdisciplinarity and Science Overlay Mapping: www.idr.gatech.edu/
• Additional analyses (papers) using these tools:
• www.tpac.gatech.edu/
• www.nanopolicy.gatech.edu/
Opportunities
• Future-oriented Technology Analyses Conference, Sevilla, May 12-13, 2011 – see [email protected]
• Tech Mining Workshop + Atlanta Conference on Science, Technology & Innovation Policy, Atlanta, Sep 13-17, 2011
Discussion
1. Research profiling – the WIDENED lit review
- To inform ST&I policy-making
- To assist R&D management
- For researchers to situate their work
2. Empirical ST&I analyses of use to you?
- Web of Science (or other research databases)
- Other R&D information (e.g., patents)
- Downstream data (e.g., business, public interest)
3. Further analyses of interest to you:
- Environmental (competitive; natural; organizational)
- Economic development (clusters; geo-maps)
- ??