Analyzing the Evolution of Scientific Citations & Collaborations: A Multiplex Network Approach
By Soumajit Pramanik
Guide: Dr. Bivas Mitra
Citation Network
Important author-based metrics:
• In-citation count
• H-index, etc.
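As an illustration of the two author-based metrics named above, the following is a minimal sketch. It assumes an author is represented simply by a list of per-paper citation counts; the function names and the sample numbers are hypothetical and do not come from the dataset used in the talk.

```python
def in_citation_count(citations_per_paper):
    """Total number of citations received across all of an author's papers."""
    return sum(citations_per_paper)


def h_index(citations_per_paper):
    """Largest h such that at least h papers have h or more citations each."""
    ranked = sorted(citations_per_paper, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical author with six papers.
example = [10, 8, 5, 4, 3, 0]
print(in_citation_count(example), h_index(example))
```

Both metrics depend only on the citation layer, which is why the talk looks for an additional factor that an author can influence directly.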
Co-Authorship Network
Existing Works
Previous work on citation networks mainly focused on:
◦ Analyzing the evolution of citation and collaboration networks using “Preferential Attachment” [Barabasi et al. 2002]
◦ Understanding the importance of community structure in citation networks [Chin et al. 2006]
◦ Studying the evolution of research topics [He et al. 2009]
Existing Works (continued)
Previous work on collaboration networks mainly focused on:
◦ Adopting the social network measures of degree, closeness, betweenness and eigenvector centrality to explore individuals’ positions in a given co-authorship network [Liu et al. 2005].
◦ Analyzing the importance of the geographical proximity (same university/city/country etc.) of collaborators [Divakarmurthy et al. 2011].
Motivation:
1. Existing studies focused on dominant factors like preferential attachment.
2. None of these factors can be self-regulated.
3. Does there exist any self-tunable factor (suppressed by the dominant factors) for boosting one's own citations/collaborations?
Motivation (continued)
Advantage of attending conferences: face-to-face interactions with fellow scientists.
Studying the influence of such interactions on the evolution of citation and collaboration networks.
Assumptions:
Authors whose talks are scheduled in the same technical session of a conference have a high chance of interacting.
In general, the first or the last author (or sometimes both) of a paper attends the conference.
Real Dataset:
Citations & Collaborations:
◦ DBLP dataset for the Computer Science domain (1960-2008)
◦ Around 1 million papers, along with information about author, year, venue and references
◦ 501060 authors tagged with continents (using Microsoft Academic Search)
◦ 6559415 author-wise citation links
http://arnetminer.org/citation
http://cse.iitkgp.ac.in/resgrp/cnerg/Files/resources.html
Real Dataset (continued)
Interactions:
◦ Two domains: 1. Networking & Distributed Computing 2. Artificial Intelligence
◦ Selected 3 leading conferences from each domain:
1. INFOCOM, ICDCS, IPDPS from the first domain (1982-2007)
2. AAAI, ICRA, ICDE from the second domain (1980-2008)
◦ Collected session information from DBLP and the program schedules of the conferences
Synthetic Dataset:
Purpose: to regulate some important parameters and observe their effects on the citation network.
Generated to follow the real dataset's statistics on articles per field per year, the distribution of the number of authors per paper, and citation information.
Only tunable parameter used: successful interaction rate p (p = 0.1, 0.2, …, 1)
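The slide does not say how the synthetic generator applies the rate p. One plausible reading, sketched below purely as an assumption, is that every interaction independently becomes "successful" with probability p. The function name, the seed argument and the sample pairs are hypothetical.

```python
import random


def sample_successful_interactions(interactions, p, seed=0):
    """Keep each interaction independently with probability p (assumed model)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    return [pair for pair in interactions if rng.random() < p]


# The sweep of rates mentioned on the slide: 0.1, 0.2, ..., 1.0
rates = [round(0.1 * k, 1) for k in range(1, 11)]

# Hypothetical interaction pairs for one session.
pairs = [("a", "b"), ("a", "c"), ("b", "c")]
for p in rates:
    kept = sample_successful_interactions(pairs, p)
    print(p, len(kept))
```

With p = 1 every interaction is kept, and lower rates keep a random subset, which lets the effect of p on the citation layer be studied in isolation.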
Methodology: Multiplex Network Construction
For each year t:
◦ Citation Layer: Directed author-wise citation links created at t, pointing to papers published before t (or sometimes, in t)
◦ Interaction Layer: Undirected interaction links between authors presenting in the same sessions of the selected conferences in t
◦ Co-authorship Layer: Undirected collaboration links between two authors if they co-author a paper published in those chosen conferences in t
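The three layers described above can be sketched as a per-year dictionary of edge sets. This is only an illustration under stated assumptions: the input record shapes, the function name `build_multiplex`, and the simplification of the citation layer to author-to-author pairs are hypothetical choices, not the authors' implementation.

```python
from collections import defaultdict
from itertools import combinations


def build_multiplex(citations, sessions, papers):
    """Build one three-layer network per year.

    citations: iterable of (year, citing_author, cited_author)
    sessions:  iterable of (year, [authors presenting in one session])
    papers:    iterable of (year, [co-authors of one paper])
    """
    layers = defaultdict(
        lambda: {"citation": set(), "interaction": set(), "coauthorship": set()}
    )
    for year, citing, cited in citations:
        # Directed link: order matters, so store a tuple.
        layers[year]["citation"].add((citing, cited))
    for year, presenters in sessions:
        # Undirected link between every pair of presenters in the session.
        for a, b in combinations(sorted(set(presenters)), 2):
            layers[year]["interaction"].add(frozenset((a, b)))
    for year, authors in papers:
        # Undirected link between every pair of co-authors of the paper.
        for a, b in combinations(sorted(set(authors)), 2):
            layers[year]["coauthorship"].add(frozenset((a, b)))
    return dict(layers)


# Hypothetical toy input.
net = build_multiplex(
    citations=[(2001, "a", "b")],
    sessions=[(2000, ["a", "b", "c"])],
    papers=[(2002, ["a", "b"])],
)
print(sorted(net))
```

Storing undirected links as `frozenset` pairs makes (a, b) and (b, a) the same edge, while the citation layer keeps direction by using ordered tuples.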
Evaluation Metrics:
1. Conversion Rate (CR) for a conference C over a time span T:
$$\mathrm{CR}(C, T) = \frac{\text{No. of “successful” interactions in } C \text{ during } T}{\text{Total no. of interactions in } C \text{ during } T}$$
The definition of the overall conversion rate extends directly from this.
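The conversion rate defined above can be sketched as follows. The transcript does not spell out when an interaction counts as "successful", so this sketch assumes it means that a citation link between the two interacting authors (in either direction) appears in a strictly later year. The record shapes and names are hypothetical.

```python
def conversion_rate(interactions, citations):
    """Fraction of interactions followed by a later citation between the pair.

    interactions: iterable of (year, author_a, author_b) for conference C in span T
    citations:    iterable of (year, citing_author, cited_author)
    """
    # Latest year in which each unordered author pair is linked by a citation.
    last_citation_year = {}
    for year, citing, cited in citations:
        pair = frozenset((citing, cited))
        if year > last_citation_year.get(pair, year - 1):
            last_citation_year[pair] = year

    total = 0
    successful = 0
    for year, a, b in interactions:
        total += 1
        # Assumed success criterion: some citation strictly after the interaction.
        if last_citation_year.get(frozenset((a, b)), year) > year:
            successful += 1
    return successful / total if total else 0.0


# Hypothetical toy data: only the a-b interaction is followed by a citation.
toy_interactions = [(2000, "a", "b"), (2000, "a", "c"), (2001, "b", "c"), (2001, "c", "d")]
toy_citations = [(2002, "b", "a"), (1999, "a", "c")]
print(conversion_rate(toy_interactions, toy_citations))
```

Passing the interactions of all conferences at once gives one way to obtain an overall rate.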
Evaluation Metrics (continued)
2. Induced Citation Link Repetition (LR):
LR measures the number of times each “induced” citation link appears within the recorded time period.
3. Lifespan of Induced Citation (LS):
The lifespan of an “induced” citation is the difference between the first and the last year in which the “induced” citation link appears.
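A minimal sketch of LR and LS as defined above, assuming each "induced" citation link is represented by the list of years in which it appears (one entry per appearance). The function names and sample years are hypothetical.

```python
def link_repetition(appearance_years):
    """LR: number of times the induced citation link appears."""
    return len(appearance_years)


def lifespan(appearance_years):
    """LS: difference between the last and the first appearance year."""
    return max(appearance_years) - min(appearance_years)


# Hypothetical link that appears four times between 2001 and 2006.
years = [2001, 2003, 2003, 2006]
print(link_repetition(years), lifespan(years))
```

Note that a link appearing only once has LS = 0 under this definition.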
Evaluation Metrics (continued)
4. Rate of Appearance (RA):
The rate of appearance of an “induced” citation link is the ratio of its repetition count to its lifespan. Hence RA = LR / LS
5. Influence Gap of a Successful Interaction (IG):
The influence of a “successful” interaction is measured as the latency between the “successful” interaction and the formation of the first “induced” citation.
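RA and IG can be sketched in the same style, again assuming a link is given as its list of appearance years. The slides do not say how a lifespan of zero is handled in RA = LR / LS; returning `None` in that case is an assumption of this sketch, as are the function names.

```python
def rate_of_appearance(appearance_years):
    """RA = LR / LS, or None when the lifespan is zero (assumed convention)."""
    lr = len(appearance_years)
    ls = max(appearance_years) - min(appearance_years)
    if ls == 0:
        return None
    return lr / ls


def influence_gap(interaction_year, appearance_years):
    """IG: years between the interaction and the first induced citation."""
    return min(appearance_years) - interaction_year


# Hypothetical link induced by an interaction in 2000.
link_years = [2001, 2003, 2003, 2006]
print(rate_of_appearance(link_years), influence_gap(2000, link_years))
```

A high RA can therefore come from many repetitions, from a short lifespan, or from both.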
Interactions to Citations
Real Datasets:
Conversion Rates
Networking Domain: 2.87% (381 out of 13240) for [0.9, 0.1] interaction probabilities
AI Domain: 2.1% (1291 out of 61896) for [0.9, 0.1] interaction probabilities
Synthetic Dataset (continued):
Decline near the final years due to the “boundary effect”
Heat-Maps
Networking Domain:
1. Overall value increasing
2. Distributed contribution
AI Domain:
1. Overall value slowly increasing
2. Dominated contribution
Induced Citation Repetition (LR) & Lifespan (LS)
In both domains:
1. Power-law distribution
2. A significant no. of “induced” citations repeat a high no. of times
[Figure panels: AI Domain, Networking Domain]
A significant no. of “induced” citations have high RA values.
Reasons can be a) low LS and/or b) high LR
[Figure panels: AI Domain, Networking Domain]
Continued…
[Figure panels: AI Domain, Networking Domain]
1. High RA ratio results mainly from low LS
2. A large no. of “induced” citations are missing from the right side of the plot due to the boundary effect.
[Figure panels: AI Domain, Networking Domain]
1. Aperiodicity of repetitions of “induced” citations increases almost linearly with their lifespan
2. High LR does not necessarily imply high standard deviation
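The transcript does not define "aperiodicity" precisely. The sketch below assumes it is the standard deviation of the gaps between successive appearance years of an "induced" citation link, which is one reading consistent with the "standard deviation" wording on the slides. The function name and sample years are hypothetical.

```python
from statistics import pstdev


def repetition_aperiodicity(appearance_years):
    """Population standard deviation of gaps between successive appearances."""
    years = sorted(appearance_years)
    gaps = [later - earlier for earlier, later in zip(years, years[1:])]
    if not gaps:
        # A single appearance has no gaps, so there is nothing to vary.
        return 0.0
    return pstdev(gaps)


# A link repeated every year has high LR but zero spread in its gaps.
regular = [2000, 2001, 2002, 2003, 2004]
# A link with fewer repetitions can still be irregular.
irregular = [2001, 2002, 2006]
print(repetition_aperiodicity(regular), repetition_aperiodicity(irregular))
```

The two toy links show how a high repetition count and a high standard deviation need not go together.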
Influence Gap (IG)
1. All the highly repeating “induced” citations have a low “influence” gap
[Figure panels: AI Domain, Networking Domain]
Influence of Continents
Dominance of North America-North America pairs
[Figure panels: AI Domain, Networking Domain]
Correlation Values

| Domain | LR vs LS | Standard Deviation vs LS | LR vs IG | LS vs IG |
|---|---|---|---|---|
| Artificial Intelligence | 0.57 | 0.98 | -0.13 | -0.12 |
| Networking & Distributed Systems | 0.61 | 0.97 | -0.14 | -0.13 |
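The slides do not state which correlation coefficient was used. Assuming it is the Pearson coefficient, values like those in the table could be computed from per-link metric lists as sketched below. The function name and the toy inputs are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-link values: one LR and one LS entry per induced link.
lr_values = [2, 3, 5, 8]
ls_values = [1, 2, 4, 9]
print(pearson(lr_values, ls_values))
```

Under this reading, a value near 1 (such as standard deviation vs LS) indicates a nearly linear relationship, while values near 0 (such as LR vs IG) indicate almost none.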
Citations To Collaborations
Conversion Rates
◦ 1. Considered only collaborations between established researchers (having at least 1 publication)
◦ 2. In the Networking domain, out of 8920 co-author links, 2495 (28%) exhibit a past history of mutual citations!
◦ 3. In the AI domain, 3211 out of 10192 (31.5%) are such “induced” co-author links.
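A sketch of how "induced" co-author links could be identified. It assumes that a "past history of mutual citations" means a citation between the two authors, in either direction, in a year strictly before the co-authored paper; the slides do not pin this down. The function name and toy records are hypothetical, and only the four counts in the final loop come from the slide.

```python
def induced_coauthor_links(coauthor_links, citations):
    """Return co-author links preceded by a citation between the same pair.

    coauthor_links: iterable of (year, author_a, author_b)
    citations:      iterable of (year, citing_author, cited_author)
    """
    # Earliest year in which each unordered pair is linked by a citation.
    first_citation_year = {}
    for year, citing, cited in citations:
        pair = frozenset((citing, cited))
        if year < first_citation_year.get(pair, float("inf")):
            first_citation_year[pair] = year

    return [
        (year, a, b)
        for year, a, b in coauthor_links
        if first_citation_year.get(frozenset((a, b)), float("inf")) < year
    ]


# Hypothetical toy data: only the a-b collaboration has an earlier citation.
toy_links = [(2005, "a", "b"), (2005, "c", "d"), (2003, "a", "c")]
toy_cites = [(2002, "b", "a"), (2004, "a", "c")]
print(induced_coauthor_links(toy_links, toy_cites))

# Recomputing the percentages from the counts reported on the slide.
for domain, induced, total in [("Networking", 2495, 8920), ("AI", 3211, 10192)]:
    print(f"{domain}: {100 * induced / total:.1f}%")
```

The final loop recomputes the two reported shares from the raw counts, which agree with the 28% and 31.5% given above.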
Induced Collaboration Repetition Count and Influence Gap
Here also, all highly repeating “induced” collaborations have a small “influence” gap
[Figure panels: AI Domain, Networking Domain]
Component Evolution
Networking Domain:
1. Giant component size 8152, second largest component size 63
2. 28% (167) of induced collaboration links took part in the merging process
AI Domain:
1. Giant component size 16203, second largest component size 41
2. 36.6% (263) of induced collaboration links took part in the merging process
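One way to track component evolution is a union-find structure over the co-authorship layer, processed in chronological order. The sketch below assumes that a link "takes part in the merging process" when its two endpoints lie in different components at the moment it is added; the slides do not define this, and all names and toy records are hypothetical.

```python
class DisjointSet:
    """Union-find with path compression and union by size."""

    def __init__(self):
        self.parent = {}
        self.size = {}

    def find(self, x):
        if x not in self.parent:
            self.parent[x] = x
            self.size[x] = 1
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path at the root.
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        """Merge the components of a and b; return True if they were separate."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        return True


def merging_links(links):
    """links: iterable of (year, author_a, author_b, is_induced).

    Returns (number of induced links that merged two components,
             component sizes in descending order).
    """
    ds = DisjointSet()
    merging_induced = 0
    for _year, a, b, is_induced in sorted(links, key=lambda link: link[0]):
        if ds.union(a, b) and is_induced:
            merging_induced += 1
    roots = {ds.find(x) for x in list(ds.parent)}
    sizes = sorted((ds.size[r] for r in roots), reverse=True)
    return merging_induced, sizes


# Hypothetical toy co-authorship links.
toy = [
    (2000, "a", "b", False),
    (2001, "c", "d", False),
    (2002, "b", "c", True),   # joins {a, b} with {c, d}
    (2003, "a", "d", True),   # both ends already in the same component
    (2003, "e", "f", False),
]
print(merging_links(toy))
```

The first two entries of the returned size list correspond to the giant and second largest components.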
Conclusion & Future Plans
Interactions during conferences can be used as a tool to boost one's own citation count.
This can indirectly help in creating effective future collaborations, and the cycle goes on.
Over time, researchers are becoming more and more aware of the benefits of interacting with fellow researchers during conferences.
Need to check:
1. Influence of the specific fields of interacting authors on the creation of “induced” citations
2. Effects of “induced” citations/collaborations on the citation/collaboration degree distribution
3. Modeling the dynamics
References
1. A. L. Barabasi, H. Jeong, Z. Neda, E. Ravasz, A. Schubert, and T. Vicsek: “Evolution of the social network of scientific collaborations”. Physica A: Statistical Mechanics and its Applications, 311(3-4):590-614, 2002.
2. A. Chin and M. Chignell: “A social hypertext model for finding community in blogs”. In HYPERTEXT '06: Proceedings of the seventeenth conference on Hypertext and hypermedia, pages 11-22, New York, NY, USA, 2006. ACM Press.
3. Q. He, B. Chen, J. Pei, B. Qiu, P. Mitra, and C. L. Giles: “Detecting topic evolution in scientific literature: how can citations help?”. In CIKM, pages 957-966, 2009.
4. X. Liu, J. Bollen, M. L. Nelson, and H. Van de Sompel: “Co-authorship networks in the digital library research community”. Information Processing & Management, 41(6):1462-1480, 2005.
5. P. Divakarmurthy, P. Biswas, and R. Menezes: “A temporal analysis of geographical distances in computer science collaborations”. In SocialCom/PASSAT, pages 657-660. IEEE, 2011.
Thank you…