

  • Inheritance patterns in citation networks reveal scientific memes

    Tobias Kuhn,1,* Matjaž Perc,2,3 and Dirk Helbing1,4
    1 Chair of Sociology, in particular of Modeling and Simulation, ETH Zurich, 8092 Zurich, Switzerland
    2 Faculty of Natural Sciences and Mathematics, University of Maribor, Koroška cesta 160, SI-2000 Maribor, Slovenia
    3 CAMTP – Center for Applied Mathematics and Theoretical Physics, University of Maribor, Krekova 2, SI-2000 Maribor, Slovenia
    4 Risk Center, ETH Zurich, 8092 Zurich, Switzerland

    Memes are the cultural equivalent of genes that spread across human culture by means of imitation. What makes a meme and what distinguishes it from other forms of information, however, is still poorly understood. Here we propose a simple formula for describing the characteristic properties of memes in the scientific literature, which is based on their frequency of occurrence and the degree to which they propagate along the citation graph. The product of the frequency and the propagation degree is the meme score, which accurately identifies important and interesting memes within a scientific field. We use data from close to 50 million publication records from the Web of Science, PubMed Central and the American Physical Society to demonstrate the effectiveness of our approach. Evaluations relying on human annotators, citation network randomizations, and comparisons with several alternative metrics confirm that the meme score is highly effective, while requiring no external resources or arbitrary thresholds and filters.

    Researchers take delight in meticulously evaluating scientific output and patterns of scientific collaboration. From citation distributions [1, 2], coauthorship networks [3] and the formation of research teams [4, 5], to the ranking of researchers [6–8] and the predictability of their success [9], how we do science has become a science in its own right. While the now famous works of Derek J. de Solla Price [10] and Robert K. Merton [11] from the mid 1960s have been followed by a long run-up towards maturity and mainstream popularity of the field, the rapid progress made in recent years is largely due to the increasing availability of vast amounts of digitized data. Massive publication and citation databases, also referred to as "metaknowledge" [12], along with leaps of progress in the theory and modeling of complex systems, fuel large-scale explorations of human culture that were unimaginable even a decade ago [13]. The science of science is scaling up massively as well, with studies on world citation and collaboration networks [14], the global analysis of the "scientific food web" [15], and the identification of phylomemetic patterns in the evolution of science [16], culminating in the visually compelling atlases of science [17] and knowledge [18].

    Science is central to many key pillars of human culture, and probably the most popular concept to describe the most influential aspects of our culture is that of a meme. The term "meme" was coined by Richard Dawkins in his book The Selfish Gene [19], where he argues that cultural entities such as words, melodies, recipes, and ideas evolve similarly to genes, involving replication and mutation but using human culture instead of the gene pool as their medium of propagation. Recent research on memes has enhanced our understanding of the dynamics of the news cycle [20], the tracking of information epidemics in blogspace [21], and the political polarization on Twitter [22]. It has been shown that the evolution of memes can be exploited effectively for inferring networks of diffusion and influence [23], and that information contained in memes is evolving as it is being processed collectively in online social media [24]. The question of how memes compete with each other for the limited and fluctuating resource of user attention has also attracted the attention of scientists, who showed that social network structure is crucial for understanding the diversity of memes [25] and that their competition can bring the network to the brink of criticality [26], where even minute disturbances can lead to avalanches of events that make a certain meme go viral [27].

    * Electronic address: [email protected]

    While the study of memes in mass media and popular culture has been based primarily on their aggregated wave-like occurrence patterns, the citation network of scientific literature allows for more sophisticated and fine-grained analyses. Quantum, fission, graphene, self-organized criticality, and traffic flow are examples of well-known memes from the field of physics, but what exactly makes such memes different from other words and phrases found in the scientific literature? As an answer to this question, we propose the following definition that is modeled after Dawkins' underlying definition of the word "gene" [19]: A scientific meme is a short unit of text in a publication that is replicated in citing publications and thereby distributed around in many copies; the more likely a certain sequence of words is to be broken apart, altered, or simply not present in citing publications, the less it qualifies to be called a meme. Publications that reproduce words or phrases from cited publications are thus the analog to offspring organisms that inherit genes from their parents. In contrast to existing work on scientific memes, our approach is therefore grounded in the "inheritance mechanisms" of memes and not just their accumulated frequencies. The above definition covers memes made up of exact words and phrases, but the same methods apply just as well to more abstract forms of memes, such as patterns of co-occurrence and grammatical structures.

    According to our definition, scientific memes are entities that propagate within the network of citations. To identify them and study their properties and dynamics, we therefore need databases of scientific publications that include citation data. Here we rely on 47.1 million publication records from the Web of Science, PubMed Central and the American Physical Society. Due to their representative long-term coverage of a specific field of research, we focus mainly on the titles and abstracts of almost half a million publications of the Physical Review and the pertaining citation data, which were published between July 1893 and December 2009. To demonstrate the robustness of our method, we also present results for the over 46 million publications indexed by the Web of Science, and for the over 0.6 million publications from the open access subset of PubMed Central that covers research mostly from the biomedical domain and mostly from recent years.

    arXiv:1404.3757v1 [cs.SI] 14 Apr 2014

    [Figure 1. Left network, WoS disciplines: Natural/Agricultural Sciences (except Physical Sciences); Physical Sciences; Engineering and Technology; Medical and Health Sciences; Social Sciences / Humanities. Middle network, APS Physical Review journals: A: Atomic, molecular, optical phys.; B: Condensed matter, materials phys.; C: Nuclear phys.; D: Particles, fields, gravitation, cosmology; E: Statistical, nonlinear, soft matter phys.; other journals. Right network, APS selected memes: quantum; fission; graphene; self-organized criticality; traffic flow.]

    FIG. 1: Citation networks of the Web of Science and the Physical Review datasets reveal community structures that nicely align with the scientific disciplines and the journals covering particular subfields of physics. The meme-centric perspective of the Physical Review citation graph in the right-hand picture shows that scientific memes concentrate in relatively isolated communities of publications that correspond to a particular subfield of physics. The generation of the visualizations was based on Gephi [28] and the OpenOrd plugin [29], which implements a force-directed layout algorithm that is able to handle very large graphs. For details we refer to the network legends and the main text.

    The citation graph visualizations presented in Fig. 1 give us an intuition about the structure of these networks and the spreading patterns of scientific memes therein. The leftmost network depicts the entire giant component of the citation graph of the Web of Science database, consisting of more than 33 million publications. It can be observed that different scientific disciplines form relatively compact communities. The physical sciences (cyan) are close to engineering and technology (magenta) in the top right corner of the network, but rather far from the social sciences and humanities (green) and the medical and health sciences (red), which take up the majority of the left-hand side of the network. In between we have natural and agricultural sciences (blue), which form an interface that connects the other disciplines. Zooming in on the physical sciences and switching to the dataset from the American Physical Society, we get the picture shown in the middle. The colors now encode five Physical Review journals, each covering a particular subfield of physics. We see that this citation network has a very complex structure with many small and large clusters, some tightly and others loosely connected with one another. Importantly, even though the employed layout algorithm [29] did not take the scientific disciplines and the journal information explicitly into account, the different communities are clearly inferable in the citation graphs.

    If instead of the Physical Review journals we highlight the previously mentioned memes from physics, we obtain the rightmost network presented in Fig. 1. In agreement with our definition of a scientific meme, we see that most of them appear in publications that form compact communities in the citation graph. The meme quantum is widely but by no means uniformly distributed, pervading several large clusters. Publications containing the meme fission form a few connected clusters limited to the area that makes up the journal Physical Review C, which covers nuclear physics. Similarly, the memes graphene, self-organized criticality, and traffic flow (see enlarged area) are each concentrated in their own medium-sized or small communities. These observations illustrate the meme-centric perspective that we employ in our analysis of the network of scientific publications.

    Results

    All words and phrases that occur frequently in the scientific literature can be considered important memes, but we claim that only the ones that propagate along the citation graph are actually interesting for a given scientific field. The importance of a meme $m$ is thus given by its frequency of occurrence $f_m$, which is simply the ratio between the number of publications that carry the meme and the number of all publications contained in the evaluated dataset. To quantify the degree to which a meme is interesting, we define the propagation score $P_m$, which measures the alignment of the occurrences of a given meme with the citation graph.

    The propagation score $P_m$ is high for memes that frequently appear in publications that cite meme-carrying publications (sticking) but rarely appear in publications that do not cite a publication that already contains the meme (sparking). Formally, we define the propagation score for a given meme $m$ as its sticking factor $\sigma_m$ divided by its sparking factor $\varsigma_m$. The sticking factor $\sigma_m$ quantifies the degree to which a meme replicates in a publication that cites a meme-carrying publication. Concretely, it is defined as

    $$\sigma_m = \frac{d_{m \to m}}{d_{\to m}}, \tag{1}$$

    where $d_{m \to m}$ is the number of publications that carry the meme and cite at least one publication carrying the meme, while $d_{\to m}$ is the number of all publications (meme-carrying or not) that cite at least one publication that carries the meme. Similarly, the sparking factor $\varsigma_m$ quantifies how often a meme appears in a publication without being present in any of the cited publications. It is thus defined as

    $$\varsigma_m = \frac{d_{m \to \not m}}{d_{\to \not m}}, \tag{2}$$

    where $d_{m \to \not m}$ is the number of meme-carrying publications that do not cite publications that carry the meme, and $d_{\to \not m}$ is the number of all publications (meme-carrying or not) that do not cite meme-carrying publications. For the propagation score $P_m$, we thus obtain

    $$P_m = \frac{d_{m \to m}}{d_{\to m}} \bigg/ \frac{d_{m \to \not m}}{d_{\to \not m}}. \tag{3}$$

    Intuitively, the propagation score compares how readily a meme is replicated when a publication carrying it is cited with its tendency to appear out of nothing in a publication that does not cite a meme-carrying publication. Memes with a high propagation score travel mostly along the citation graph.

    Having determined the frequency of occurrence $f_m$ and the propagation score $P_m$ for a particular meme $m$, we define the meme score $M_m$ simply as

    $$M_m = f_m P_m. \tag{4}$$
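    The definitions in Eqs. (1) to (4) can be illustrated with a minimal sketch on a toy citation graph. The function name `meme_score`, the dictionary-based graph representation, and the toy data are assumptions made for illustration, not part of the original method description. In particular, returning `None` when one of the ratios is undefined is a choice of this sketch; the paper handles such cases through the single parameter described in its Methods section.

```python
from typing import Dict, Optional, Set


def meme_score(carriers: Set[str], cites: Dict[str, Set[str]]) -> Optional[float]:
    """Return frequency * (sticking / sparking) for one meme.

    carriers: ids of publications whose text contains the meme.
    cites:    maps every publication id to the set of ids it cites.
    Returns None when a ratio is undefined (assumption of this sketch).
    """
    n_total = len(cites)
    # Publications that cite at least one meme-carrying publication.
    cite_meme = {p for p, refs in cites.items() if refs & carriers}
    # Publications that cite no meme-carrying publication.
    cite_no_meme = set(cites) - cite_meme

    d_m_to_m = len(carriers & cite_meme)          # carry the meme, cite a carrier
    d_to_m = len(cite_meme)                       # cite a carrier
    d_m_to_not_m = len(carriers & cite_no_meme)   # carry the meme, cite no carrier
    d_to_not_m = len(cite_no_meme)                # cite no carrier

    if 0 in (n_total, d_to_m, d_to_not_m, d_m_to_not_m):
        return None

    frequency = len(carriers) / n_total
    sticking = d_m_to_m / d_to_m
    sparking = d_m_to_not_m / d_to_not_m
    return frequency * (sticking / sparking)


# Hypothetical toy corpus: eight publications, four of which carry the meme.
cites = {
    "a": set(),
    "b": {"a"},
    "c": {"a", "b"},
    "d": {"b"},
    "e": set(),
    "f": {"e"},
    "g": {"e"},
    "h": set(),
}
carriers = {"a", "b", "c", "f"}
score = meme_score(carriers, cites)
print(score)
```

    In this toy corpus the frequency is 4/8, the sticking factor is 2/3 (b and c among the citing publications b, c, d) and the sparking factor is 2/5 (a and f among a, e, f, g, h), so the score evaluates to 5/6.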

    As defined, the meme score has a number of desirable properties: (i) it can be calculated exactly without the introduction of arbitrary thresholds, such as a minimal number of occurrences, a limit on the length of n-grams to consider, or the filtering out of words containing special characters; (ii) it does not depend on external resources, such as dictionaries or other linguistic data; (iii) it does not depend on filters, like stop-word lists, to remove the most common words and phrases; (iv) it is simple (we introduce only one parameter; see Methods for details) and works exceptionally well even on massive datasets; and (v) it requires virtually no preprocessing of the publication texts, apart from recommended elementary tokenization (splitting at blank spaces and detaching trailing punctuation characters) and the transformation to lower case. To test the robustness and effectiveness of the meme score, we carefully evaluate its performance by means of full and time-preserving randomization of the citation graphs, by means of manual annotation of identified terms, as well as by means of several alternative metrics, including frequency of occurrence, changes in absolute and relative trends over time, and absolute and relative differences occurring across journals. We refer to the Methods section for further details, while here we proceed with the presentation of the results obtained with the meme score.
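    The elementary preprocessing named in property (v) could look roughly like the sketch below. The function names, the exact set of characters treated as trailing punctuation, and the decision to keep detached punctuation as separate tokens are all assumptions of this illustration; the text only states that tokens are split at blank spaces, trailing punctuation is detached, and everything is lower-cased.

```python
from typing import List, Optional, Set

# Assumption: only these characters count as trailing punctuation, so that
# terms such as "x(3872)" stay intact.
TRAILING = ".,;:!?"


def tokenize(text: str) -> List[str]:
    """Lower-case, split at blank spaces, and detach trailing punctuation."""
    tokens: List[str] = []
    for raw in text.lower().split():
        core = raw.rstrip(TRAILING)
        if core:
            tokens.append(core)
        # Each detached punctuation character becomes its own token.
        tokens.extend(raw[len(core):])
    return tokens


def ngrams(tokens: List[str], max_n: Optional[int] = None) -> Set[str]:
    """Return all contiguous n-grams; no length limit unless max_n is given."""
    longest = len(tokens) if max_n is None else min(max_n, len(tokens))
    return {
        " ".join(tokens[start:start + n])
        for n in range(1, longest + 1)
        for start in range(len(tokens) - n + 1)
    }


tokens = tokenize("Traffic flow in MgB2, revisited.")
terms = ngrams(tokens)
print(tokens)
```

    No stop-word list or minimum count is applied here, in line with properties (i) to (iii): every contiguous token sequence is a candidate, and the meme score alone decides which candidates matter.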

    Calculating the meme score for all n-grams in the three datasets considered gives us the results presented in Fig. 2. The two quantities that define the meme score, namely the relative frequency and the propagation score, are plotted against each other in the form of heat maps with logarithmic scales. There is no upper limit to the length of n-grams, and the presented maps cover without exception all n-grams with a non-zero meme score. Meme scores increase towards the top-right and decrease towards the bottom-left corner. Maps A, C and D feature a broad band with a downward slope, indicating that, in general, more frequent memes tend to propagate less via the citation graph. In the lower half of each map, we see a wedge of very high densities that follows the larger band on the bottom-left edge but narrows towards the middle, where it ends. Though this wedge has a somewhat rounder and broader shape for the Web of Science (WoS) database, overall these patterns look remarkably similar across all datasets despite their differences with respect to topic, coverage, and size. This is an indication of universality in the distribution patterns of scientific memes. The 99.9%-quantile line ($M_{0.999}$) is also surprisingly stable, considering that the underlying values range over five orders of magnitude or more. Localizing the previously mentioned physics memes in the APS dataset (map A), we see that they are located on the very edge of the top-right side of the band, where the density of n-grams is very low. Indeed, interesting and important memes are located mostly in this area, which is exactly what is reflected by the meme score.

    [Figure 2. Four heat maps with logarithmic axes (horizontal: propagation score; vertical: relative frequency); the color scale encodes the density of n-grams from 10^0 to 10^5.
    A: APS, n = 1,372,365, from titles and abstracts, $M_{0.999}$ = 0.716; labeled memes: quantum, fission, graphene, self-organized criticality, traffic flow.
    B: APS, randomized (time preserving), n = 89,356, from titles and abstracts, $M_{0.999}$ = 0.121.
    C: PMC, n = 1,322,013, from titles and abstracts, $M_{0.999}$ = 0.626.
    D: WoS, n = 7,966,731, from titles only, $M_{0.999}$ = 0.560.]

    FIG. 2: Universality in the distribution patterns of scientific memes across datasets. Heat maps encode the density of n-grams with a given propagation score and frequency. The meme score increases towards the top-right and decreases towards the bottom-left corner in each map. Maps A, C and D, depicting results for publications from the American Physical Society (APS), the open access subset of PubMed Central (PMC), and the Web of Science (WoS), respectively, all feature a broad band with a downward slope, indicating that more frequent memes tend to propagate less via the citation graph. The 99.9%-quantile with respect to the meme score distribution ($M_{0.999}$) is depicted as a white line. Interesting and important memes are located mostly around the very edge of the top-right side of the band (in the vicinity of the 99.9%-quantile line). Heat map B shows the results obtained with a time-preserving randomization of the APS citation graph (see Methods for details). The universal distribution pattern clearly vanishes, thus confirming that the topology and the time structure of the citation graph alone cannot explain the observed patterns, in particular not at the top end of the meme score distribution.

    The heat map B in Fig. 2 illustrates a typical case of what happens when the APS citation graph is randomized but the time ordering of publications is preserved. The number of terms with a non-zero meme score decreases dramatically (from 1.4 million in map A to just 89,356 in map B), the universal distribution pattern of scientific memes vanishes, and the top-right part where the top-ranked memes should be located disappears completely. Naturally, if the APS citation graph is randomized without preserving the time ordering, the overlap with the original results presented in map A is even smaller (not shown). Statistical analysis reveals that median values of the meme score obtained with the randomized networks differ by more than one order of magnitude from those obtained with the original citation graph, with very little variation between different randomization runs. These results show that topology and time structure alone fail to account for the reported universality in the distribution patterns, and that the top memes thus get their high meme scores based on intricate processes and conventions that underlie the dynamics of scientific progress and the way credit is given to previous work.
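    To make the idea of a time-preserving randomization concrete, here is one possible sketch. It is not the authors' procedure, which is specified in the Methods section and not reproduced here. The scheme assumed below keeps each publication's number of references but redraws the references uniformly from publications that appeared earlier; the function name and the toy data are hypothetical.

```python
import random
from typing import Dict, List, Set


def randomize_time_preserving(
    cites: Dict[str, Set[str]],
    order: List[str],
    seed: int = 0,
) -> Dict[str, Set[str]]:
    """Redraw every reference list from earlier publications only.

    order lists publication ids from oldest to newest. The number of
    references per publication is kept whenever enough earlier
    publications exist (assumption of this sketch).
    """
    rng = random.Random(seed)
    shuffled: Dict[str, Set[str]] = {}
    for position, paper in enumerate(order):
        earlier = order[:position]
        k = min(len(cites.get(paper, set())), len(earlier))
        shuffled[paper] = set(rng.sample(earlier, k))
    return shuffled


# Hypothetical toy graph with five publications in chronological order.
order = ["p1", "p2", "p3", "p4", "p5"]
cites = {
    "p1": set(),
    "p2": {"p1"},
    "p3": {"p1", "p2"},
    "p4": {"p3"},
    "p5": {"p2", "p4"},
}
shuffled = randomize_time_preserving(cites, order, seed=42)
print(shuffled)
```

    Recomputing the meme scores on such a shuffled graph, with the texts left untouched, separates what the citation structure contributes from what the mere time ordering and reference counts would produce.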

    Table I shows the 50 top-ranked memes from the APS dataset, also indicating their agreement with human annotation and whether they can be found under a subcategory of physics in Wikipedia. Several properties are worth pointing out. First, most of the memes are noun phrases denoting real and reasonable physics concepts. This is remarkable given that the computation of the simple meme score formula assumes no linguistic knowledge whatsoever and considering that it does not filter out any tokens from the start. Second, the memes on the list consist of one, two or three words, which indicates that the meme score does not favor short or long phrases, again without applying explicit measures to balance n-gram lengths. Third, chemical formulas such as MgB2 and CuGeO3 are relatively frequent, which might suggest that conventional approaches, filtering such entities out from the start, are likely to miss many relevant memes.

    In Fig. 3 and Table II, we present results of the manual annotation of terms identified by meme score as compared to randomly selected terms. The general level of agreement between the two annotators is very good, given that the provided classification is not perfectly clear-cut. In particular, the agreement is 90% and more for the meme score and above 85% for the random terms. Each of the annotators considered around 86% of the meme score terms to be important physics concepts, agreeing on this in 81% of the cases. With respect to their linguistic categories, each annotator considered 86% of the meme score terms to be noun phrases, and the two annotators agreed on that for 83% of the terms. The respective values are much lower for the randomly extracted terms. Only 25% (non-weighted) and 19% (weighted) of terms were, in agreement, found to be important physics concepts, and only 33% (non-weighted) and 25% (weighted) to be noun phrases. The reported differences between meme score and the two random selection methods are highly significant ($p < 10^{-15}$ using Fisher's exact test on the number of agreed classifications). These results confirm that the meme score strongly favors noun phrases and important concepts, which corroborates its accuracy for the identification of memes in the scientific literature.

    1. loop quantum cosmology +*
    2. unparticle +*
    3. sonoluminescence +*
    4. MgB2 +
    5. stochastic resonance +*
    6. carbon nanotubes +*
    7. NbSe3 +
    8. black hole +*
    9. nanotubes +
    10. lattice Boltzmann +*
    11. dark energy +*
    12. Rashba
    13. CuGeO3 +
    14. strange nonchaotic
    15. in NbSe3
    16. spin Hall +
    17. elliptic flow +*
    18. quantum Hall +*
    19. CeCoIn5 +
    20. inflation +
    21. exchange bias +*
    22. Sr2RuO4 +
    23. traffic flow +*
    24. TiOCl
    25. key distribution +
    26. graphene +*
    27. NaxCoO2 +
    28. the unparticle +
    29. black
    30. electromagnetically induced transparency +*
    31. light-induced drift +
    32. proton-proton bremsstrahlung +
    33. antisymmetrized molecular dynamics +
    34. radiative muon capture +
    35. Bose-Einstein +
    36. C60 +
    37. entanglement +
    38. inspiral *
    39. spin Hall effect +*
    40. PAMELA
    41. BaFe2As2 +
    42. quantum dots +*
    43. Bose-Einstein condensates +
    44. X(3872) *
    45. relaxor +
    46. blue phases +
    47. black holes +*
    48. PrOs4Sb12 +
    49. the Schwinger multichannel method +
    50. Higgsless +

    TABLE I: Top 50 memes with respect to their meme score from the APS dataset. The symbol + indicates memes where the human annotators agreed that this is an interesting and important physics concept, while the symbol * indicates memes that are also found on the list of memes extracted from Wikipedia (see Methods for details).

    Next we compare the meme score to a number of possible alternative metrics, as defined in the Methods section, and align the identified words and phrases with a ground-truth list of terms extracted from physics-related Wikipedia titles. Figure 4 summarizes the results, showing that 70% of the top 10 memes identified by meme score correspond to terms extracted from Wikipedia, as do 55% of the top 20, 40% of the top 50, and 26% of the top 100. The largest area under the curve $A$ is obtained for a controlled noise level $\delta = 4$ (see Methods for details), which is highlighted by the thick blue line. The box plot on the right compares the outcomes of different metrics with respect to $A$, as described in the Methods section. The meme score achieves $A$-values that fall comfortably within the 40–50% agreement interval, while all the alternative metrics score considerably worse, consistently below the 20% agreement baseline. These results indicate that the simple meme score formula performs better than several alternative metrics in warranting a reasonably high level of agreement with the list of ground-truth memes extracted from Wikipedia.
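    The percentage of top-ranked terms found on a ground-truth list can be computed as in the following sketch. The function name `agreement_at` is an assumption; the data are the first ten entries of Table I together with their * marks as they appear there. The area under the curve $A$ is defined in the Methods section and is not reproduced here.

```python
from typing import List, Set


def agreement_at(ranked: List[str], ground_truth: Set[str], x: int) -> float:
    """Share of the x top-ranked terms that appear on the ground-truth list."""
    top = ranked[:x]
    if not top:
        raise ValueError("x must select at least one term")
    return sum(term in ground_truth for term in top) / len(top)


# First ten entries of Table I, in rank order.
top10 = [
    "loop quantum cosmology", "unparticle", "sonoluminescence", "MgB2",
    "stochastic resonance", "carbon nanotubes", "NbSe3", "black hole",
    "nanotubes", "lattice Boltzmann",
]
# Entries among them that carry the * mark (found on the Wikipedia list).
starred = {
    "loop quantum cosmology", "unparticle", "sonoluminescence",
    "stochastic resonance", "carbon nanotubes", "black hole",
    "lattice Boltzmann",
}
share = agreement_at(top10, starred, 10)
print(share)  # 0.7
```

    Evaluating the same function for increasing x gives one point per rank of the kind of curve shown in the left panel of Fig. 4.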

    [Figure 3. Color bars for annotators A1 and A2 over three term lists (meme score, random, weighted random); horizontal axis: terms (30 to 150). Main classes: physics concept; not a physics concept. Linguistic categories: noun phrase; verb; adjective or adverb; other.]

    FIG. 3: Human annotation agrees with the predictions of the meme score. The color bars (see legend) encode the level of agreement with the meme score list (top) and the two random lists of terms (middle and bottom). The corresponding statistical analysis and further details are provided in Table II.

    Having established an accurate meme metric, the dynamics of memes and their patterns over time are among the many things we can investigate. As a first step we can track the temporal changes of selected memes, as shown in Figure 5 for the same five exemplary memes introduced above. We see that these memes have very different histories: fission was dominant in the 1960s and 1970s, self-organized criticality and traffic flow had their heydays in the 1990s, graphene burst onto the scene only within the last three years of the dataset, while quantum is on a very long and slow but steady increase without any significant bursts. Looking at the bigger picture, Fig. 6 shows the top memes over time, revealing bursty dynamics akin to those reported previously in human dynamics [30] and the temporal distribution of words [31]. These bursts might be a reflection of the fast rise and fall of scientific memes with respect to fame and mainstream popularity. As new scientific paradigms emerge, the old ones seem to quickly lose their appeal, and only a few memes manage to top the rankings over extended periods of time. The bursty dynamics also support the idea that both the rise and fall of scientific paradigms are driven by robust principles of self-organization [32].

    Discussion

    By going back to the original analogy to genes put forward by Richard Dawkins [19], we propose a definition of scien-

    method | main class | annotator | agreement | classification (agreed) | p-value for diff. to: r, w
    physics concept | A1 | 90.0% | 85.3% | 81.3%

  • 6100 101 102 1030%

    10%

    20%

    30%

    40%

    50%

    60%

    70%

    80%

    90%

    100%

    top x terms according to meme score

    perc

    enta

    ge o

    f Wiki

    pedi

    a te

    rms

    40% of top 50 terms are found on Wikipedia list

    0 0.1 0.2 0.3 0.4 0.5

    meme score

    frequency

    maximum absolute change(over time)

    maximum relative change(over time)

    maximum absolute difference(across journals)

    maximum relative difference(across journals)

    A (area under curve)

    1FIG. 4: The meme score outperforms alternative metrics. The graph on the left shows the percentage from the x top-ranked terms accordingto the meme score that also appear on the ground-truth list of physics terms extracted from Wikipedia, as obtained for different values ofcontrolled noise (1 10, see Methods for details). Curves are shown for the individual values of , with the thick line highlighting thecase = 4, for which the agreement in terms of the area under the curve A is largest. The box plot on the right summarizes the quantitativeagreement achieved by the different metrics (see legend). While the meme score generally achieves more than 40% agreement, the alternativemetrics all perform consistently worse, almost exclusively below the 20% agreement baseline.

    0.5 1 1.5 2 2.5 3 3.5 4 4.5x 105

    0

    5

    10

    15

    publication count

    me

    me

    sco

    re (

    =

    1)

    1940

    1960

    1970

    1980

    1982

    1984

    1986

    1988

    1990

    1992

    1994

    1996

    1998

    2000

    2002

    2004

    2006

    2008

    quantumfissiongrapheneselforganized criticalitytraffic flow

    FIG. 5: The five exemplary memes exhibit very different histories in terms of their meme scores. Four of them show bursts at different pointsin time, while the fifth quantum shows a very steady and almost linear path. The time axis is scaled by publication count.

    tific memes based on their inheritance patterns on the citationgraph of publications. We present the meme score, a met-ric to identify scientific memes, defined as the product of thefrequency of occurrence and the propagation score, wherebythe latter determines the degree to which the occurrence of ameme is aligned with the citation graph.

    We have shown that the meme score can be calculated exactly without the introduction of arbitrary thresholds or filters, without the usage of external resources such as dictionaries, and without noteworthy preprocessing of the publication texts. The method is fast and reliable, and it can be applied on massive databases. We have demonstrated the effectiveness of the meme score on more than 47.1 million publication records from the Web of Science, PubMed Central, and the American Physical Society. Moreover, we have evaluated the performance of the proposed meme score by means of full and time-preserving randomization of the citation graphs, by means of manual annotation of publications, as well as by means of several alternative metrics. We have provided statistical evidence for the agreement between human annotators and the meme-score results, and we have shown that the meme score is superior to alternative metrics. We have also confirmed that the observed patterns cannot be explained by topological or temporal features alone, but are grounded in more intricate processes that determine the dynamics of scientific progress and the way credit is given to preceding publications. The top-ranking scientific memes reveal bursty time dynamics, which might be a reflection of the fierce competition of memes for the limited and fluctuating resource of scientists' attention.

    FIG. 6: Time history of top physics memes based on their meme scores obtained from the American Physical Society dataset. The time axis is scaled by publication count. Bars and labels are shown for all memes that top the rankings for at least ten out of the displayed 911 points in time. The gray area represents the second-ranked meme at the given time. The bursty dynamics seem to indicate that both rise and fall are fast, and that for the majority of scientific memes the popularity is fleeting. Axes: publication count (0.5 to 4.5 × 10^5, with the years 1940 to 2008 marked) versus meme score (0 to 12). Labeled memes: graphene, entanglement, MgB2, nanotubes, carbon nanotubes, quark, neutrino, Bose-Einstein, quantum Hall, black, C60, Hubbard model, quantum wells, graphite, reactions, photoemission, black hole, tricritical, Kondo, superconducting, fission, MeV, diffuse scattering.

    Methods

    Controlled noise and discounting free-riding

    The effectiveness of the propagation score, as defined in Eq. 3, can be further improved by adding a small amount of controlled noise δ, thus obtaining

    $$P_m = \frac{d_{m \to m}}{d_{\to m} + \delta} \bigg/ \frac{d_{m \not\to m} + \delta}{d_{\not\to m} + \delta}. \qquad (5)$$

    The controlled noise corrects for the fact that any of the four basic terms can be zero, and it also prevents phrases with a very low frequency from getting a high score by chance. To illustrate the latter, consider a publication that is cited only once, while the citing publication is never cited. If these two publications happen to share a phrase that does not exist otherwise, for example because the second publication reproduces a short sequence of the text of the first, then this phrase would get the maximum propagation score (it sparks once and then it always sticks). The controlled noise corresponds to δ fictitious publications that carry all memes and cite none, plus another δ publications that carry no memes and cite all. This decreases the sticking factors and increases the sparking factors for all memes, thereby reducing all meme scores; very slightly so for frequent memes but heavily for rare ones. Our tests show that a small amount of noise (e.g. δ = 3, as used throughout this work unless stated otherwise) is sufficient to solve the above-mentioned problem.
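The effect of the controlled noise can be illustrated with a minimal Python sketch. It assumes that the four basic terms are plain publication counts as defined in Eqs. 1 to 3 (which lie outside this section), and that the meme score is the product of frequency and propagation score. The function and argument names are hypothetical and chosen only for this illustration.

```python
def propagation_score(d_m_to_m, d_to_m, d_m_notto_m, d_notto_m, delta=3.0):
    """Propagation score with controlled noise delta, following Eq. 5.

    Assumed meaning of the four counts (illustrative reading, not from this section):
      d_m_to_m    publications that carry the meme and cite a publication carrying it
      d_to_m      publications that cite a publication carrying the meme
      d_m_notto_m publications that carry the meme without citing one carrying it
      d_notto_m   publications that do not cite any publication carrying the meme
    """
    sticking = d_m_to_m / (d_to_m + delta)
    sparking = (d_m_notto_m + delta) / (d_notto_m + delta)
    return sticking / sparking


def meme_score(frequency, d_m_to_m, d_to_m, d_m_notto_m, d_notto_m, delta=3.0):
    """Meme score as frequency times propagation score."""
    return frequency * propagation_score(
        d_m_to_m, d_to_m, d_m_notto_m, d_notto_m, delta=delta
    )


if __name__ == "__main__":
    n = 1_000_000  # hypothetical corpus size
    # A phrase shared by one publication and the single publication citing it.
    rare = (1, 1, 1, n - 1)
    # A frequent phrase with made-up counts.
    frequent = (8000, 10000, 2000, 90000)
    for name, counts in (("rare", rare), ("frequent", frequent)):
        plain = propagation_score(*counts, delta=0.0)
        noisy = propagation_score(*counts, delta=3.0)
        print(f"{name}: without noise {plain:.2f}, with delta=3 {noisy:.2f}")
```

With these made-up counts the rare phrase loses most of its propagation score once the noise is added, while the frequent phrase is almost unaffected, which is the behavior described above.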

    Another matter that deserves attention is the potential free-riding of shorter memes on longer ones. Memes can be part of larger memes, and it pays to check whether a given meme sticks on its own or just free-rides on the popularity of a larger meme. For example, consider the multi-token meme "the littlest Higgs model", which contains the specific token "littlest" that rarely occurs otherwise. The meme "littlest" therefore gets about the same propagation score as the long meme. Yet the larger meme is clearly more interesting, and we should thus discount for sticking behavior that is due only to the free-riding on larger memes. This can be achieved by redefining the term d_{m→m} in Eq. 1 to exclude publications where the given meme appears in the publication and its cited publications only within the same larger meme. If "littlest", for example, is always followed by "Higgs" in a given publication and all its cited publications, then this publication shall not contribute to the d_{m→m} term for m = "littlest".
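One possible way to detect such free-riding is sketched below in Python. This is a simplified illustration, not the procedure used for the reported results: it assumes tokenized texts and only checks whether every occurrence of the meme shares the same directly preceding or the same directly following token, i.e., whether the meme always sits inside the same larger meme that is one token longer. All function names are hypothetical.

```python
def occurrences(tokens, meme):
    """Start indices at which the token tuple `meme` occurs in `tokens`."""
    n = len(meme)
    return [i for i in range(len(tokens) - n + 1) if tuple(tokens[i:i + n]) == meme]


def neighbours(tokens, meme):
    """Tokens directly before and after each occurrence (None at text borders)."""
    n = len(meme)
    before, after = set(), set()
    for i in occurrences(tokens, meme):
        before.add(tokens[i - 1] if i > 0 else None)
        after.add(tokens[i + n] if i + n < len(tokens) else None)
    return before, after


def is_free_riding(pub_tokens, cited_tokens_list, meme):
    """True if the meme appears only within the same larger meme in the
    publication and in all of its cited publications that carry the meme."""
    before, after = set(), set()
    for tokens in [pub_tokens, *cited_tokens_list]:
        b, a = neighbours(tokens, meme)
        before |= b
        after |= a
    same_left = len(before) == 1 and None not in before
    same_right = len(after) == 1 and None not in after
    return same_left or same_right


if __name__ == "__main__":
    pub = "we study the littlest higgs model".split()
    cited = ["the littlest higgs model is attractive".split()]
    # "littlest" is always followed by "higgs", so this publication would be
    # excluded from the sticking count for the meme "littlest".
    print(is_free_riding(pub, cited, ("littlest",)))
```

A publication for which `is_free_riding` returns `True` would then be left out when counting the sticking term for the shorter meme.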

    Graph randomization

    We use graph randomizations to verify that the reported results are not rooted in generic properties of the examined network, i.e. we want to rule out that the observed effects can be explained by network topology and chance alone. For that we use networks that have exactly the same topology as the original one but where the article texts (i.e. titles and abstracts with their memes) are randomly assigned to the nodes. Each node therefore owes its position in the network to one particular publication but has text attached that comes from a different one. This kind of randomization, however, still leaves room for the unlikely possibility that the time order of the publications has a major effect on the meme score, in particular the simple fact that citations go only backwards in time. To rule this out, we also perform time-preserving randomizations of the citation network, shuffling only publications that were published within narrow consecutive time windows. Concretely, we use time windows of 1000 publications, meaning that after shuffling no publication has moved more than 1000 positions forward or backward from the original chronological order.
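Both randomizations can be sketched in a few lines of Python. The sketch assumes that the publications are held in a chronologically sorted list, that the citation links refer to list positions (so the topology stays untouched), and that the time windows are consecutive, non-overlapping blocks. The function names are hypothetical.

```python
import random


def shuffle_texts(texts, rng):
    """Full randomization: texts are randomly reassigned to the nodes,
    while the citation topology (not touched here) stays the same."""
    shuffled = list(texts)
    rng.shuffle(shuffled)
    return shuffled


def shuffle_texts_time_preserving(texts, window, rng):
    """Time-preserving randomization: `texts` must be chronologically sorted;
    texts are shuffled only within consecutive blocks of `window` publications."""
    shuffled = []
    for start in range(0, len(texts), window):
        block = list(texts[start:start + window])
        rng.shuffle(block)
        shuffled.extend(block)
    return shuffled


if __name__ == "__main__":
    rng = random.Random(42)
    texts = [f"text of publication {i}" for i in range(5000)]
    full = shuffle_texts(texts, rng)
    windowed = shuffle_texts_time_preserving(texts, window=1000, rng=rng)
    print(full[0], "|", windowed[0])
```

With non-overlapping blocks of 1000 publications, a text can move by at most 999 positions, which stays within the bound stated above.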

    Human annotation

    To test whether human annotators confirm that phrases with a high meme score are indeed interesting and important concepts in the respective scientific field such as physics, we define the following two categories for manual annotation: (i) the phrase is not a meaningful term or not an important concept of physics, and (ii) the phrase is an important concept or entity of physics, i.e., it could appear as the title of an entry of a comprehensive encyclopedia on physics. Our hypothesis is that the phrases with a high meme score would tend to end up in the second category, while the phrases with a low meme score would end up in the first. In addition, we asked our annotators to determine the linguistic categories of the phrases, for which we defined the following classes: (i) noun phrase, (ii) verb, (iii) adjective or adverb, and (iv) other. Our intuitive assumption is that memes would mostly have the form of noun phrases.

    The set of phrases used for this evaluation consisted of the top 150 memes with respect to their meme score, extracted from the American Physical Society dataset, plus another two sets for comparison of 150 randomly drawn phrases each. For the two comparison sets, we have considered all phrases that appear in at least 100 publications. From these, the first 150 terms were drawn randomly without taking into account their frequency, i.e., frequent terms had the same chance of being selected as infrequent ones, whereas the second 150 terms were drawn with a weight that corresponded to their frequency, i.e., a term appearing 10,000 times was ten times more likely to be selected than a term appearing 1000 times. Moreover, to rule out effects of different n-gram lengths, we made sure that the two batches of random terms followed exactly the same length distribution as the main sample extracted based on the meme score. The resulting 450 terms were shuffled and given to two human annotators, both PhD students with a degree in physics, who independently annotated each of the terms according to the two criteria (physics concept and linguistic category).
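The construction of the two comparison sets can be sketched as follows in Python. The sketch assumes that phrases are token tuples mapped to their publication counts, that terms are drawn without replacement, and that the length distribution is matched by drawing separately for each n-gram length. These details, like the function name, are assumptions made for illustration.

```python
import random
from collections import Counter


def draw_comparison_set(doc_counts, length_counts, rng,
                        weighted=False, min_publications=100):
    """Draw phrases so that the sample has the n-gram length distribution
    given by `length_counts` (length -> number of phrases to draw).

    doc_counts maps each phrase (tuple of tokens) to its publication count.
    With weighted=True the chance of being drawn is proportional to that count.
    """
    eligible = {p: c for p, c in doc_counts.items() if c >= min_publications}
    sample = []
    for n, k in sorted(length_counts.items()):
        pool = [p for p in eligible if len(p) == n]
        if not weighted:
            sample.extend(rng.sample(pool, k))  # uniform, without replacement
            continue
        for _ in range(k):  # frequency-weighted, one draw at a time
            weights = [eligible[p] for p in pool]
            pick = rng.choices(pool, weights=weights, k=1)[0]
            sample.append(pick)
            pool.remove(pick)
    return sample


if __name__ == "__main__":
    # Made-up publication counts, for illustration only.
    doc_counts = {
        ("quantum",): 10000, ("the",): 50000, ("fission",): 1000, ("rare",): 5,
        ("black", "hole"): 800, ("of", "the"): 20000, ("traffic", "flow"): 300,
    }
    main_sample_lengths = Counter({1: 2, 2: 2})
    rng = random.Random(0)
    print(draw_comparison_set(doc_counts, main_sample_lengths, rng))
    print(draw_comparison_set(doc_counts, main_sample_lengths, rng, weighted=True))
```

The sketch raises an error if fewer eligible phrases of a given length exist than are requested, which is the simplest way to surface an impossible length distribution.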

    Alternative metrics

    We have used the following metrics as alternatives to the meme score: (i) frequency: the most frequent terms (up to 3-grams), skipping the first x terms to filter out general words like "of" and "method" (setting x in the range from 0 to 500); (ii) maximum absolute change over time: the highest-scoring terms (up to 3-grams, occurring in at least 10 publications) with respect to maximum absolute change in frequency from one time window of x publications to the next on chronologically sorted publications (we set x in the range from 1000 to 100,000); (iii) maximum relative change over time: the same as (ii) but based on relative changes; (iv) maximum absolute difference across journals: the highest-scoring terms (up to 3-grams, occurring in at least 10 publications) with respect to maximum absolute difference in relative frequency from one journal to another, considering only journals with at least x publications and optionally excluding the old journals Physical Review Series I and II (we set x to 0, 1000, and 10,000); (v) maximum relative difference across journals: the same as (iv) but based on relative differences.

    Metric (i) is based on the assumption that important memes are frequent but not as frequent as the small class of general words that can be found in all types of texts. Metrics (ii) and (iii) are based on an idea proposed in [32], namely that interesting memes exhibit trends over time. Words like "approach" and "the" might have a very high frequency, but they are not subject to strong trends over time, as compared to terms such as "graphene". Metrics (iv) and (v) are based on the intuition that phrases occurring mostly in specific journals but not in others must be specific concepts of the particular field of research.
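Metrics (ii) and (iii) can be sketched in Python as shown below. The sketch assumes that a term's frequency in a window is the fraction of publications in that window that contain it, that an incomplete trailing window is dropped, and that a relative change is the absolute change divided by the frequency in the earlier window. None of these details are specified above, and the function names are hypothetical.

```python
def window_frequencies(carries_term, window):
    """Relative frequency of a term in consecutive windows of `window`
    publications. `carries_term` is a list of booleans over chronologically
    sorted publications; an incomplete last window is dropped."""
    freqs = []
    for start in range(0, len(carries_term) - window + 1, window):
        block = carries_term[start:start + window]
        freqs.append(sum(block) / window)
    return freqs


def max_absolute_change(freqs):
    """Largest absolute change in frequency between adjacent windows."""
    return max((abs(b - a) for a, b in zip(freqs, freqs[1:])), default=0.0)


def max_relative_change(freqs):
    """Largest change relative to the earlier window (assumed definition);
    pairs with zero frequency in the earlier window are skipped."""
    return max(
        (abs(b - a) / a for a, b in zip(freqs, freqs[1:]) if a > 0),
        default=0.0,
    )


if __name__ == "__main__":
    # Toy example: a term that is absent at first and then takes over.
    carries = [False] * 4 + [True, True, False, False] + [True] * 4
    freqs = window_frequencies(carries, window=4)
    print(freqs, max_absolute_change(freqs), max_relative_change(freqs))
```

Terms would then be ranked by either value, with a term like "the" scoring low because its frequency barely moves between windows.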

    To compare these metrics, we need to establish some sort of ground-truth list of memes to compare the extracted terms against. For that, we automatically extracted 5178 terms from Wikipedia. We collected the titles of all articles and terms redirecting to them from the categories "physics", "applied and interdisciplinary physics", "theoretical physics", "emerging technologies", and their direct sub-categories, but filtered out terms that appear in fewer than 10 publications of the American Physical Society dataset. While it is unreasonable to expect that any metric would be able to perfectly reproduce such a Wikipedia-based list of memes, a considerable overlap should nevertheless be achievable for good metrics.

    To quantify the agreement between the top memes identified by a particular metric and the Wikipedia list, we use the normalized area A under the curve as shown on the left of Figure 4. The step-shaped curve has a log-scaled x-axis running up to the number of terms s on the ground-truth meme list (s = 5178 in our case) and a y-axis running from 0 (no overlap) to 1 (perfect overlap). Limiting cases are A = 1, representing perfect agreement, and A = 0, representing no agreement between the two compared lists. Values 0 < A < 1 represent partial agreement, giving agreement between higher-ranked memes more weight than agreement between lower-ranked ones.
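A possible reading of this area measure is sketched below in Python. It assumes that the curve value at x is the fraction of the x top-ranked terms that appear on the ground-truth list, that each step from x to x + 1 has width log(x + 1) − log(x) on the log-scaled axis, and that the area is normalized by the total width log(s + 1). The exact discretization is not given above, so these choices and the function name are assumptions.

```python
import math


def normalized_area(ranked_terms, ground_truth):
    """Normalized area under the step-shaped overlap curve on a log-scaled
    x-axis running from 1 to s = len(ground_truth) (assumed discretization)."""
    truth = set(ground_truth)
    s = len(truth)
    hits = 0
    area = 0.0
    for x in range(1, s + 1):
        if x <= len(ranked_terms) and ranked_terms[x - 1] in truth:
            hits += 1
        overlap = hits / x  # fraction of the top-x terms on the ground-truth list
        area += overlap * (math.log(x + 1) - math.log(x))
    return area / math.log(s + 1)


if __name__ == "__main__":
    truth = ["graphene", "fission", "quark", "neutrino"]
    early_hit = ["graphene", "the", "of", "method"]
    late_hit = ["the", "of", "method", "graphene"]
    # The same single hit counts more when it is ranked higher.
    print(normalized_area(early_hit, truth), normalized_area(late_hit, truth))
```

Under these assumptions a ranking that reproduces the ground-truth list yields A = 1, a ranking without any overlap yields A = 0, and hits near the top of the ranking contribute more than hits further down.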

    Acknowledgments

    This research was supported by the European Commission through the ERC Advanced Investigator Grant "Momentum" (Grant No. 324247) and by the Slovenian Research Agency through the Program P5-0027. In addition, we would like to thank Karsten Donnay, Matthias Leiss, Christian Schulz, and Olivia Woolley-Meza for their help.

    [1] Redner, S. How popular is your paper? An empirical study of the citation distribution. Eur. Phys. J. B 4, 131–134 (1998).

    [2] Radicchi, F., Fortunato, S., and Castellano, C. Universality of citation distributions: Toward an objective measure of scientific impact. Proc. Natl. Acad. Sci. USA 105, 17268–17272 (2008).

    [3] Newman, M. E. J. Coauthorship networks and patterns of scientific collaboration. Proc. Natl. Acad. Sci. USA 101, 5200–5205 (2004).

    [4] Guimerà, R., Uzzi, B., Spiro, J., and Amaral, L. A. N. Team assembly mechanisms determine collaboration network structure and team performance. Science 308, 697–702 (2005).

    [5] Milojević, S. Principles of scientific research team formation and evolution. Proc. Natl. Acad. Sci. USA 111, 3984–3989 (2014).

    [6] Hirsch, J. E. An index to quantify an individual's scientific research output. Proc. Natl. Acad. Sci. USA 104, 16569 (2005).

    [7] Radicchi, F., Fortunato, S., Markines, B., and Vespignani, A. Diffusion of scientific credits and the ranking of scientists. Phys. Rev. E 80, 056103 (2009).

    [8] Petersen, A. M., Wang, F., and Stanley, H. E. Methods for measuring the citations and productivity of scientists across time and discipline. Phys. Rev. E 81, 036114 (2010).

    [9] Penner, O., Pan, R. K., Petersen, A. M., Kaski, K., and Fortunato, S. On the predictability of future impact in science. Sci. Rep. 3, 3052 (2013).

    [10] de Solla Price, D. J. Networks of scientific papers. Science 149, 510–515 (1965).

    [11] Merton, R. K. The Matthew effect in science. Science 159, 53–63 (1968).

    [12] Evans, J. A. and Foster, J. G. Metaknowledge. Science 331, 721–725 (2011).

    [13] Michel, J. B., Shen, Y. K., Presser Aiden, A., Veres, A., Gray, M. K., The Google Books Team, Pickett, J. P., Hoiberg, D., Clancy, D., Norvig, P., Orwant, J., Pinker, S., Nowak, M. A., and Lieberman Aiden, E. Quantitative analysis of culture using millions of digitized books. Science 331, 176–182 (2011).

    [14] Pan, R. K., Kaski, K., and Fortunato, S. World citation and collaboration networks: uncovering the role of geography in science. Sci. Rep. 2, 902 (2012).

    [15] Mazloumian, A., Helbing, D., Lozano, S., Light, R. P., and Börner, K. Global multi-level analysis of the scientific food web. Sci. Rep. 3, 1167 (2013).

    [16] Chavalarias, D. and Cointet, J.-P. Phylomemetic patterns in science evolution: the rise and fall of scientific fields. PLoS ONE 8, e54847 (2013).

    [17] Börner, K. Atlas of Science. MIT Press, Cambridge, MA (2010).

    [18] Börner, K. Atlas of Knowledge. MIT Press, Cambridge, MA (2014).

    [19] Dawkins, R. The Selfish Gene. Oxford University Press, Oxford (1989).

    [20] Leskovec, J., Backstrom, L., and Kleinberg, J. Meme-tracking and the dynamics of the news cycle. In Proceedings of ACM SIGKDD, 497–506 (2009).

    [21] Adar, E. and Adamic, L. A. Tracking information epidemics in blogspace. In Proceedings of IEEE/WIC/ACM, 207–214 (2005).

    [22] Conover, M., Ratkiewicz, J., Francisco, M., Gonçalves, B., Menczer, F., and Flammini, A. Political polarization on Twitter. In Proceedings of ICWSM, 89–96 (2011).

    [23] Gomez Rodriguez, M., Leskovec, J., and Krause, A. Inferring networks of diffusion and influence. In Proceedings of ACM SIGKDD, 1019–1028 (2010).

    [24] Simmons, M. P., Adamic, L. A., and Adar, E. Memes online: Extracted, subtracted, injected, and recollected. In Proceedings of ICWSM, 353–360 (2011).

    [25] Weng, L., Flammini, A., Vespignani, A., and Menczer, F. Competition among memes in a world with limited attention. Sci. Rep. 2 (2012).

    [26] Stanley, H. E. Introduction to Phase Transitions and Critical Phenomena. Clarendon Press, Oxford (1971).

    [27] Gleeson, J. P., Ward, J. A., O'Sullivan, K. P., and Lee, W. T. Competition-induced criticality in a model of meme popularity. Phys. Rev. Lett. 112, 048701 (2014).

    [28] Bastian, M., Heymann, S., and Jacomy, M. Gephi: an open source software for exploring and manipulating networks. In ICWSM, 361–362 (2009).

    [29] Martin, S., Brown, W. M., Klavans, R., and Boyack, K. W. OpenOrd: an open-source toolbox for large graph layout. In IS&T/SPIE Electronic Imaging. International Society for Optics and Photonics (2011).

    [30] Barabási, A. L. The origin of bursts and heavy tails in human dynamics. Nature 435, 207–211 (2005).

    [31] Altmann, E. G., Pierrehumbert, J. B., and Motter, A. E. Beyond word frequency: bursts, lulls, and scaling in the temporal distributions of words. PLoS ONE 4, e7678 (2009).

    [32] Perc, M. Self-organization of progress across the century of physics. Sci. Rep. 3, 1720 (2013).
