Comparison of methods – an unloved duty? Examples from an ongoing bibliometric study
![Page 1: Comparison of methods – an unloved duty? Examples from an ongoing bibliometric study](https://reader036.fdocuments.in/reader036/viewer/2022062523/58e8bb681a28abc9058b5041/html5/thumbnails/1.jpg)
dans.knaw.nl
DANS is an institute of KNAW and NWO
Comparison of methods – an unloved duty? Examples from an ongoing bibliometric study
Andrea Scharnhorst, Rob Koopman, Shenghui Wang
eHumanities group, Research meeting, Feb 11, 2016
![Page 2: Comparison of methods – an unloved duty? Examples from an ongoing bibliometric study](https://reader036.fdocuments.in/reader036/viewer/2022062523/58e8bb681a28abc9058b5041/html5/thumbnails/2.jpg)
What are patterns?
Related words: [indicative] [occurring] [occurrence] [patterns] [consistent] [distribution] [restricted] [portions] [origin] [distinct]
http://thoth.pica.nl/relate? Pattern
![Page 3: Comparison of methods – an unloved duty? Examples from an ongoing bibliometric study](https://reader036.fdocuments.in/reader036/viewer/2022062523/58e8bb681a28abc9058b5041/html5/thumbnails/3.jpg)
Clusters as specific patterns
Related words: [clusters] [clustering] [clustered] [distances] [molecular] [structure] [arrangement]
http://thoth.pica.nl/relate? Cluster
![Page 4: Comparison of methods – an unloved duty? Examples from an ongoing bibliometric study](https://reader036.fdocuments.in/reader036/viewer/2022062523/58e8bb681a28abc9058b5041/html5/thumbnails/4.jpg)
What are topics?
Related words: [researchers] [topics] [reviewing] [discussion] [interested] [questions] [discussing] [methodological] [suggestions] [great deal]
http://thoth.pica.nl/relate? topic
![Page 5: Comparison of methods – an unloved duty? Examples from an ongoing bibliometric study](https://reader036.fdocuments.in/reader036/viewer/2022062523/58e8bb681a28abc9058b5041/html5/thumbnails/5.jpg)
What is this all about? ‘Same data – different results’
A group of bibliometricians and sociologists of science started a project to delineate scientific topics by looking into scholarly communication, more specifically into journal articles from the field of astrophysics. They applied different methods of clustering documents into groups, representing scientific topics, based on information from the bibliographic record. They compare the methods to gain a better understanding of what kind of bibliometric approach actually produces what kind of representation of a topic.
Why is this important?
Topics, or bigger entities such as fields, are used:
- traditionally, to classify and better order knowledge (subject headings) and to let us find things more easily
In bibliometrics:
- to determine the degree of interdisciplinarity
- to understand how innovation emerges at the boundaries of fields
- to determine in which emergent fields to invest
- to evaluate individual researchers in comparison to their ‘reference field’
![Page 6: Comparison of methods – an unloved duty? Examples from an ongoing bibliometric study](https://reader036.fdocuments.in/reader036/viewer/2022062523/58e8bb681a28abc9058b5041/html5/thumbnails/6.jpg)
Can we understand the patterns (clusters) we find?
Hellsten, I., Lambiotte, R., Scharnhorst, A., & Ausloos, M. (2007). Self-citations, co-authorships and keywords: A new approach to scientists’ field mobility? Scientometrics, 72(3), 469–486. doi:10.1007/s11192-007-1680-5
![Page 7: Comparison of methods – an unloved duty? Examples from an ongoing bibliometric study](https://reader036.fdocuments.in/reader036/viewer/2022062523/58e8bb681a28abc9058b5041/html5/thumbnails/7.jpg)
Ambiguity at levels of the research process
Conceptual level
What are the atoms of science? Topics? Fields? Specialties? How are they defined?
Empirical level
Which traces should be used? Journal articles. Which part(s) of them?
Methodological level
On the basis of which approach do we group articles? Because they share references, words, authors, journals, ...?
Do the structures/patterns/clusters we see correspond to the topical structure we wanted to explore?
![Page 8: Comparison of methods – an unloved duty? Examples from an ongoing bibliometric study](https://reader036.fdocuments.in/reader036/viewer/2022062523/58e8bb681a28abc9058b5041/html5/thumbnails/8.jpg)
Background: “Same data, different results”
• Evolved from annual meetings of an advisory project funded by the German Ministry for Education and Research on ‘Measuring Diversity in Science’
• To measure the epistemic diversity of a field, the field needs to be delineated and topics identified
• Compare solutions derived from the same data set
• Series of workshops (Berlin 9/2014, Amsterdam 4/2015, Berlin 8/2015)
• Special session at ISSI 2015, July, in Istanbul
![Page 9: Comparison of methods – an unloved duty? Examples from an ongoing bibliometric study](https://reader036.fdocuments.in/reader036/viewer/2022062523/58e8bb681a28abc9058b5041/html5/thumbnails/9.jpg)
The Astro dataset
• Source: Web of Science (Thomson Reuters)
• 8 years: 2003-2010
• 59 astrophysics and astronomy journals
• 111,161 articles, letters & proceedings papers
![Page 10: Comparison of methods – an unloved duty? Examples from an ongoing bibliometric study](https://reader036.fdocuments.in/reader036/viewer/2022062523/58e8bb681a28abc9058b5041/html5/thumbnails/10.jpg)
Six teams
• Humboldt University of Berlin
• University of Michigan
• SciTech Strategies
• University of Leuven
• CWTS
• OCLC & DANS
![Page 11: Comparison of methods – an unloved duty? Examples from an ongoing bibliometric study](https://reader036.fdocuments.in/reader036/viewer/2022062523/58e8bb681a28abc9058b5041/html5/thumbnails/11.jpg)
Eight clustering solutions
![Page 12: Comparison of methods – an unloved duty? Examples from an ongoing bibliometric study](https://reader036.fdocuments.in/reader036/viewer/2022062523/58e8bb681a28abc9058b5041/html5/thumbnails/12.jpg)
Topic Extraction Workflow
T. Velden. Same Data, Different Results-- On a Comparative Topic Extraction Exercise. SIGMET Workshop at ASIST 2015
![Page 13: Comparison of methods – an unloved duty? Examples from an ongoing bibliometric study](https://reader036.fdocuments.in/reader036/viewer/2022062523/58e8bb681a28abc9058b5041/html5/thumbnails/13.jpg)
Overview Approaches
| Clustering algorithm | Direct Citation | Bibliogr. Coupling | Hybrid (bc & terms/NLP) | Semantic matrix | Projection onto Global Direct Citation Map |
|---|---|---|---|---|---|
| Infomap | UMSI | -- | -- | -- | -- |
| SLMA | CWTS | -- | -- | -- | STS |
| Memetic | HU | -- | -- | -- | -- |
| Louvain | -- | ECOOM | ECOOM | OCLC | -- |
| K-means | -- | -- | -- | OCLC | -- |

HU: Humboldt University; CWTS: Centre for Science and Technology Studies, Leiden; ECOOM: Expertisecentrum Onderzoek en Ontwikkelingsmonitoring; UMSI: University of Michigan School of Information; OCLC: Online Computer Library Center, Inc.; STS: SciTech Strategies
T. Velden. Same Data, Different Results-- On a Comparative Topic Extraction Exercise. SIGMET Workshop at ASIST 2015
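To make one of the data models in the overview concrete: bibliographic coupling links two articles when they cite the same references. The sketch below counts shared references per pair of articles. The record layout, the function name and the toy identifiers are assumptions for illustration; the slides do not say how the teams weighted or normalised coupling strengths, so raw counts are used here.

```python
from itertools import combinations


def bibliographic_coupling(references):
    """Return {(doc_a, doc_b): number of cited references both share}.

    `references` maps a document id to the list of reference ids it cites.
    Pairs without any shared reference are left out.
    """
    strengths = {}
    for a, b in combinations(sorted(references), 2):
        shared = len(set(references[a]) & set(references[b]))
        if shared:
            strengths[(a, b)] = shared
    return strengths


# Hypothetical toy records: document id -> cited reference ids.
toy_references = {
    "doc1": ["r1", "r2", "r3"],
    "doc2": ["r2", "r3", "r4"],
    "doc3": ["r5"],
}
coupling = bibliographic_coupling(toy_references)
```

The resulting weighted pairs form a network that any of the clustering algorithms in the table could, in principle, take as input.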
![Page 14: Comparison of methods – an unloved duty? Examples from an ongoing bibliometric study](https://reader036.fdocuments.in/reader036/viewer/2022062523/58e8bb681a28abc9058b5041/html5/thumbnails/14.jpg)
Cluster comparison
• Overlap measures
  • Normalised mutual information
  • Overlap index
• Visualisation
  • Thesaurus mapping
  • Semantic similarity
  • Topic affinity network
  • VOSviewer term maps
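As an illustration of the first overlap measure, the sketch below computes normalised mutual information between two clusterings of the same articles. Normalising by the mean of the two entropies is an assumption, since the slides do not state which normalisation was used, and the function names are hypothetical.

```python
import math
from collections import Counter


def entropy(counts, n):
    """Shannon entropy (natural log) of a distribution given as counts."""
    return -sum((c / n) * math.log(c / n) for c in counts if c)


def nmi(labels_a, labels_b):
    """NMI between two cluster assignments of the same items.

    labels_a[i] and labels_b[i] are the cluster ids that two different
    solutions give to item i. Assumed normalisation: 2*I / (H_a + H_b).
    """
    n = len(labels_a)
    if n == 0 or n != len(labels_b):
        raise ValueError("both assignments must cover the same non-empty item list")
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    mi = sum(
        (c / n) * math.log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in joint.items()
    )
    h_sum = entropy(count_a.values(), n) + entropy(count_b.values(), n)
    # Both solutions put everything into one cluster: treat as identical.
    return 2 * mi / h_sum if h_sum > 0 else 1.0


same_up_to_renaming = nmi([0, 0, 1, 1], [1, 1, 0, 0])
independent = nmi([0, 0, 1, 1], [0, 1, 0, 1])
```

Because the measure only looks at how items are grouped, two solutions that differ only in cluster numbering score 1, while statistically independent groupings score 0.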
![Page 15: Comparison of methods – an unloved duty? Examples from an ongoing bibliometric study](https://reader036.fdocuments.in/reader036/viewer/2022062523/58e8bb681a28abc9058b5041/html5/thumbnails/15.jpg)
Cluster labelling
• Descriptive, human-readable labels for the clusters produced by automated processes
• Different methods:
  • Internal information
  • Differential labelling
  • External knowledge
  • Experts
![Page 16: Comparison of methods – an unloved duty? Examples from an ongoing bibliometric study](https://reader036.fdocuments.in/reader036/viewer/2022062523/58e8bb681a28abc9058b5041/html5/thumbnails/16.jpg)
Mutual information based labelling
https://en.wikipedia.org/wiki/Mutual_information
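Written out for the labelling case, the linked definition reads as follows. Treating term occurrence $T$ and cluster membership $C$ as binary indicator variables per article is an assumption made here for illustration; the slide itself only points to the general definition.

$$
I(T;C) = \sum_{t \in \{0,1\}} \sum_{c \in \{0,1\}} p(t,c)\,\log\frac{p(t,c)}{p(t)\,p(c)}
$$

Here $p(t,c)$ is the share of articles with term indicator $t$ and cluster indicator $c$, and $p(t)$ and $p(c)$ are the corresponding marginal shares. A term that occurs almost only inside one cluster carries high mutual information with that cluster and is therefore a candidate label.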
![Page 17: Comparison of methods – an unloved duty? Examples from an ongoing bibliometric study](https://reader036.fdocuments.in/reader036/viewer/2022062523/58e8bb681a28abc9058b5041/html5/thumbnails/17.jpg)
Normalised mutual information
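Mutual information depends on how frequent a term and how large a cluster are, so it is usually rescaled to the range from 0 to 1. Several normalisations are in use (by the arithmetic mean, geometric mean, minimum or maximum of the two entropies), and the slide does not state which one was applied. One common choice, given here as an assumption, divides by the mean entropy:

$$
\mathrm{NMI}(T;C) = \frac{2\,I(T;C)}{H(T) + H(C)}, \qquad H(X) = -\sum_{x} p(x)\,\log p(x)
$$

where $I(T;C)$ is the mutual information between a term indicator $T$ and a cluster indicator $C$, and $H$ is the Shannon entropy of each variable.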
![Page 18: Comparison of methods – an unloved duty? Examples from an ongoing bibliometric study](https://reader036.fdocuments.in/reader036/viewer/2022062523/58e8bb681a28abc9058b5041/html5/thumbnails/18.jpg)
Labelling results
![Page 19: Comparison of methods – an unloved duty? Examples from an ongoing bibliometric study](https://reader036.fdocuments.in/reader036/viewer/2022062523/58e8bb681a28abc9058b5041/html5/thumbnails/19.jpg)
Concept map by Marcus John
![Page 20: Comparison of methods – an unloved duty? Examples from an ongoing bibliometric study](https://reader036.fdocuments.in/reader036/viewer/2022062523/58e8bb681a28abc9058b5041/html5/thumbnails/20.jpg)
Visual comparison using labels
• Select 50 most informative labels for each clustering
• Combine into one list of 61 labels
• Re-compute the NMI between each cluster and each label
• Each cluster is represented by a 61 dimensional vector
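The last two steps above can be sketched as follows: for every cluster, compute the NMI between cluster membership and the occurrence of each label, and collect the scores into one vector per cluster. The data layout, the function names, the toy labels and the mean-entropy normalisation are all assumptions for illustration, not the authors' implementation.

```python
import math


def binary_nmi(x, y):
    """NMI between two equally long 0/1 sequences (mean-entropy normalisation)."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("sequences must be non-empty and of equal length")

    def entropy(p):
        return -sum(q * math.log(q) for q in (p, 1 - p) if q > 0)

    px, py = sum(x) / n, sum(y) / n
    mi = 0.0
    for a in (0, 1):
        for b in (0, 1):
            pab = sum(1 for xi, yi in zip(x, y) if xi == a and yi == b) / n
            pa = px if a else 1 - px
            pb = py if b else 1 - py
            if pab > 0:
                mi += pab * math.log(pab / (pa * pb))
    denom = entropy(px) + entropy(py)
    return 2 * mi / denom if denom > 0 else 0.0


def fingerprints(doc_terms, doc_cluster, labels):
    """Map each cluster id to a vector of NMI scores, one entry per label.

    doc_terms[i] is the set of terms of article i, doc_cluster[i] its cluster.
    """
    presence = {
        lab: [1 if lab in terms else 0 for terms in doc_terms] for lab in labels
    }
    result = {}
    for cluster in sorted(set(doc_cluster)):
        member = [1 if c == cluster else 0 for c in doc_cluster]
        result[cluster] = [binary_nmi(member, presence[lab]) for lab in labels]
    return result


# Hypothetical toy corpus: six articles, three clusters, three labels.
toy_terms = [{"grb"}, {"grb", "burst"}, {"galaxy"}, {"galaxy"}, {"dust"}, {"dust"}]
toy_clusters = ["A", "A", "B", "B", "C", "C"]
toy_labels = ["grb", "galaxy", "dust"]
fp = fingerprints(toy_terms, toy_clusters, toy_labels)
```

With the combined label list from the slide, each vector would have 61 entries. Note that NMI does not distinguish positive from negative association, so a label that is systematically absent from a cluster also receives a non-zero score.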
![Page 21: Comparison of methods – an unloved duty? Examples from an ongoing bibliometric study](https://reader036.fdocuments.in/reader036/viewer/2022062523/58e8bb681a28abc9058b5041/html5/thumbnails/21.jpg)
Fingerprints of clusters
![Page 22: Comparison of methods – an unloved duty? Examples from an ongoing bibliometric study](https://reader036.fdocuments.in/reader036/viewer/2022062523/58e8bb681a28abc9058b5041/html5/thumbnails/22.jpg)
grb
![Page 23: Comparison of methods – an unloved duty? Examples from an ongoing bibliometric study](https://reader036.fdocuments.in/reader036/viewer/2022062523/58e8bb681a28abc9058b5041/html5/thumbnails/23.jpg)
![Page 24: Comparison of methods – an unloved duty? Examples from an ongoing bibliometric study](https://reader036.fdocuments.in/reader036/viewer/2022062523/58e8bb681a28abc9058b5041/html5/thumbnails/24.jpg)
Axes: Clusters, Labels
![Page 25: Comparison of methods – an unloved duty? Examples from an ongoing bibliometric study](https://reader036.fdocuments.in/reader036/viewer/2022062523/58e8bb681a28abc9058b5041/html5/thumbnails/25.jpg)
Ambiguity in the topic extraction workflow
Schema courtesy of Theresa Velden
Topics
Interpretation, Evaluation
• Labelling
• Visual representations
• Experts
Comparison
• Set-based
• Ensemble statistics
• Labelling
![Page 26: Comparison of methods – an unloved duty? Examples from an ongoing bibliometric study](https://reader036.fdocuments.in/reader036/viewer/2022062523/58e8bb681a28abc9058b5041/html5/thumbnails/26.jpg)
Thanks for your attention!
Twitter: @knowescape