Confidential: For Review Only
The use of positive and negative words in scientific
abstracts: too good to be true?
Journal: BMJ
Manuscript ID BMJ.2015.029354.R1
Article Type: Research
BMJ Journal: BMJ
Date Submitted by the Author: 19-Nov-2015
Complete List of Authors: Vinkers, Christiaan; University Medical Center Utrecht, Department of Psychiatry
Tijdink, Joeri; VU Medical Center, Department of Internal Medicine
Otte, Willem; University Medical Center Utrecht, Department of Pediatric Neurology; University Medical Center Utrecht, Image Sciences Institute
Keywords: positive outcome bias, publication patterns, novel, innovative, robust, unprecedented, PubMed
https://mc.manuscriptcentral.com/bmj
BMJ
The use of positive and negative words in scientific abstracts: too good to be
true?
REVISED VERSION
Christiaan H Vinkers, Joeri Tijdink, Willem M Otte
Christiaan H Vinkers MD PhD, Department of Psychiatry, Brain Center Rudolf Magnus, University Medical
Center Utrecht, 3584 CX Utrecht, the Netherlands
Christiaan H Vinkers, Assistant professor
Joeri Tijdink MD, Department of Internal Medicine, VU University Medical Center, 1081 HZ Amsterdam, the
Netherlands
Joeri Tijdink, PhD student
Willem M Otte PhD, Department of Child Neurology, Brain Center Rudolf Magnus, University Medical Center
Utrecht, 3584 CX, Utrecht, the Netherlands, Biomedical MR Imaging and Spectroscopy, Center for Image
Sciences, University Medical Center Utrecht, Utrecht, the Netherlands
Willem M Otte, Assistant professor
Correspondence to: Christiaan H Vinkers, MD PhD, Department of Psychiatry, Brain Center
Rudolf Magnus, University Medical Center Utrecht, Utrecht, the Netherlands, Heidelberglaan
100, 3584 CX Utrecht, The Netherlands. Tel: +31 (0) 88 7 555 555. E-mail:
Abstract
Objective: Our perception of the world is reflected in how we use language. We aimed to
investigate whether language use in science would skew towards the use of strikingly positive
and negative words over time.
Design: Retrospective analysis of all scientific abstracts included between 1974 and 2015 in
PubMed.
Main outcome measures: Positive and negative word frequencies in comparison to
frequencies of words with a neutral and random connotation, expressed as relative change
since 1980.
Methods: The yearly frequency of 25 positive, 25 negative, and 25 neutral words, as well as
100 randomly selected words was normalized for the total number of abstracts. Sub-analyses
included pattern quantification of words in isolation, specificity for high impact journals, and
comparison between affiliations within or outside countries with English as the official
majority language. Frequency patterns were compared with 4% of all books ever printed and
digitized using the Google Books Ngram Viewer.
Results: The relative increase in word frequency over four decades was 880% for positive
words and 257% for negative words. All individual positive words contributed to the increase,
particularly the words ‘robust’, ‘novel’, ‘innovative’ and ‘unprecedented’ which increased in
relative frequency up to 15000%. Comparable but less pronounced results were obtained
when restricting the analysis to high-impact factor journals. Authors affiliated to an institute
located in a non-English speaking country used significantly more positive words. No
apparent increase was found in the use of neutral and random words, and neither did the
frequency of positive words increase in published books over the same time period.
Conclusions: Our unique lexicographic analysis convincingly demonstrates that scientific
abstracts are currently written with more positive and negative words. The remarkable
increase in frequency of positively valenced words provides a novel and unprecedented
insight into the evolution of scientific writing. Apparently scientists look on the bright side of
research results. However, whether this perception fits reality should be questioned.
Introduction
Science has shown an impressive growth over the last decades and more scientific papers are
published now than ever before.1 Between 1996 and 2011, over 15 million individuals
authored around 25 million papers.2 Due to expanding research fields, it is increasingly
difficult to get studies published in high-impact journals.3 This is important since publication
quantity and associated impact factors have a considerable effect on a scientist’s career
perspective.4 Consequently, in order to get published, scientific discoveries may sometimes be
exaggerated or the potential implications overstated.5 6 Indeed, overinterpretation,
overstatement and misreporting of scientific results have been frequently reported.7-12
However, the prevalence of this problem in the scientific literature is unclear.
There is a well-known universal tendency in humans to use positive words,13 and
exaggeration of research-related news has previously been linked to overstatements in
academic press releases.14 In the current study, we used a data-driven approach to investigate
trends in the use of positively and negatively valenced words in PubMed abstracts and titles
over the last four decades. Subsequently, positive and negative word trends were contrasted to
either neutral or random words, as well as to patterns obtained from the corpus of digitized
texts containing about 4% of all books ever printed using Google Ngram Viewer. We
hypothesized that the emergence of a culture aimed at productivity and novelty could have
affected the use of positive and negative words in scientific reporting and discussion.
Methods
The yearly frequency of 25 predefined positive, negative, and neutral words was quantified in
titles and abstracts obtained from the PubMed database (www.pubmed.gov) (Table 1).
Analyses were restricted to 1974 – 2015 to ensure that all abstract texts were available. Words
were selected before any analyses were carried out, by consensus among the authors after
discussion that included manual inspection of random abstracts and a search of
thesaurus listings. To validate the results from these pre-specified lists, additional positive
words were selected from a recent article on superlatives in news coverage of cancer drugs.15
To further exclude a bias in the choice of these words, we additionally searched for 50 nouns
and 50 adjectives randomly selected from Ogden’s 850 core words of Basic English
(https://en.wiktionary.org/wiki/Appendix:Basic_English_word_list). The yearly number of
abstracts containing one or more of the positive, negative, neutral, or random words in title or
abstract text (based on the OR operator) was divided by the total number of yearly
publications. Search queries are provided as supplementary material (supplementary data 1).
Differences between trends across the last 10 years were also summarized (mean and 95%
confidence interval (CI)) and statistically tested with unpaired t-tests. Patterns of individual
words were plotted to determine whether developments were comparable across words.
Future predictions for the word ‘novel’ were calculated with low-order polynomial
regressions. Co-occurrence of positive and negative words in abstracts was examined with
random sets of positive words using the ‘AND’ operator. All analyses were carried out using
R and plots were created with the R package ‘ggplot2’.
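As an illustration of the normalization described above, the following Python sketch divides the yearly number of matching records by the yearly total and expresses each year relative to a baseline year. The analyses in this study were run in R; this snippet is not the authors' code, the function name `relative_frequency` is hypothetical, and the counts are invented purely for demonstration.

```python
def relative_frequency(hits, totals, baseline_year=1980):
    """Return each year's share of matching records as a percentage of the baseline share.

    hits:   {year: number of records containing one or more listed words}
    totals: {year: total number of records published that year}
    """
    # Share of records per year that contain at least one listed word.
    share = {year: hits[year] / totals[year] for year in hits}
    baseline = share[baseline_year]
    # A value of 100 means "same share as in the baseline year".
    return {year: 100.0 * value / baseline for year, value in share.items()}


# Invented counts, for demonstration only (not PubMed data).
hits = {1980: 2_000, 2000: 30_000, 2015: 140_000}
totals = {1980: 100_000, 2000: 500_000, 2015: 1_000_000}
relative = relative_frequency(hits, totals)
```

Dividing by the yearly total before comparing with the baseline keeps the growth in the overall number of publications from being mistaken for a change in word use.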
To ensure that any trend in the use of positive and negative words in PubMed abstracts
was specific for science rather than reflecting general trends in word use in society, the use
of positive and negative words in published books between 1975 and 2009 was also quantified
using the Google Books Ngram Viewer that charts frequencies of any word or short sentence
found in millions of books printed between 1800 and 2009.16 We plotted average Google
Books patterns and corresponding CIs (calculated from bootstrap sampling of all individual
word frequency patterns; 1000 samples/year) to evaluate differences with the patterns
obtained from the PubMed queries.
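The bootstrap step can be sketched as follows for a single year: the individual word frequencies are resampled with replacement 1000 times and the spread of the resampled means gives the interval. This is a minimal Python illustration, not the authors' R code. The percentile method, the function name `bootstrap_mean_ci` and the input values are assumptions, since the text does not state which bootstrap interval was used.

```python
import random
import statistics


def bootstrap_mean_ci(values, n_samples=1000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    means = []
    for _ in range(n_samples):
        # Resample the words with replacement and record the resampled mean.
        resample = rng.choices(values, k=len(values))
        means.append(statistics.fmean(resample))
    means.sort()
    tail = (1.0 - level) / 2.0
    lower = means[round(tail * (n_samples - 1))]
    upper = means[round((1.0 - tail) * (n_samples - 1))]
    return statistics.fmean(values), lower, upper


# Invented relative frequencies (%) of five words in one year (not Ngram data).
one_year = [95.0, 110.0, 150.0, 120.0, 180.0]
mean, lower, upper = bootstrap_mean_ci(one_year)
```

Repeating this per year and plotting `lower` and `upper` around `mean` would give a shaded band comparable in spirit to the one described for the Google Books patterns.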
In light of the increasing number of journals and the rise of the open access movement,
we also restricted our search to 20 high impact journals which were pre-specified and based
on consensus between the authors (see supplementary table S1). Finally, we investigated a
possible cultural influence by comparing the use of positive and negative words in titles and
abstracts between authors with and without an affiliation in a country where English is the de facto official
language (Australia, New Zealand, United Kingdom and the United States).
Results
Between 1974 and 1980, the percentage of PubMed records containing one or more positive
words in title or abstract varied between 1.7 and 2.3%. This further increased to 17.5% in
2015, a relative increase of 880% (Figure 1, top left). Increases above 700% were also present
with random selections of 20 positive words each. The usage of the same positive words in
published books increased to 146% from 1975 to 2009 (Figure 1, top). Frequency patterns of
all individual words in abstracts showed increased usage although with large variation (Figure
2). In isolation, the words ‘robust’, ‘novel’, ‘innovative’ and ‘unprecedented’ increased in
relative frequency from 2500 to 15000% (Figure 2). Removal of these words still yielded a
relative frequency increase of 540%. Moreover, word trends were similar after exclusion of
low-frequency words such as ‘inventive’ and ‘astonishing’. Analyses of additional positive
words (“breakthrough”, “cure”, “marvel”, “miracle”, “revolutionary” and “transformative”)
based on a recent article15 revealed comparable and consistent increases in frequency
(supplemental figure S1). Positive word use also increased in high-impact factor journals,
with a change from 1.1 to 8.9% (relative increase of 674%; Figure 1, top left). However, the
increase in positive word use over the last ten years was significantly lower in high-impact
factor journals compared to the frequency pattern of positive words across all journals
(-159.8%, CI -92.9 to -226.7%, p-value 0.0001). Similar results were found using the top 20 list
of journals based on the journal impact factor (both for all journals and for general medical
journals, Journal Citation Reports 2014) (data not shown). Combinations of more than two
positive words in single abstracts only occurred in a minority of abstracts. Patterns in positive
and negative words significantly differed between authors with an affiliation inside or outside
an English speaking country, with lower frequency rates in the last ten years for those
affiliated with an institution in Australia, New Zealand, UK or US (-31.4%, CI -50.6 to
-12.2%, p-value 0.003; supplemental figure S3). Extrapolating the upward trend of positive
words over the last 40 years to the future, we predict that the word ‘novel’ will appear in
every record by the year 2123.
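The idea behind such an extrapolation can be sketched in Python as follows: fit a trend to the yearly percentages and solve for the year at which the fitted curve reaches 100%. The sketch uses a first-order (straight-line) least-squares fit as the simplest low-order polynomial. The order actually used in the study is not stated, the function names are hypothetical, and the data points are invented, so the output is not expected to reproduce the year 2123.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def year_reaching(target, intercept, slope):
    """Year at which the fitted line reaches `target` (requires slope > 0)."""
    if slope <= 0:
        raise ValueError("the fitted trend never reaches the target")
    return (target - intercept) / slope


# Invented percentages of records containing a word (not PubMed data).
years = [1980, 1990, 2000, 2010]
percent = [1.0, 2.0, 3.0, 4.0]
intercept, slope = fit_line(years, percent)
saturation_year = year_reaching(100.0, intercept, slope)
```

Any such projection depends heavily on the chosen polynomial order and on the assumption that the past trend continues unchanged.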
For negative words, a similar but less unequivocal increase in relative frequency was
found: up to 257% in 2015, and up to 199% when restricted to high impact journals (Figure 1, top
right). Individual negative word patterns are included as supplemental figure S2. No increase
was found in the use of neutral words and only a modest increase in relative frequency to
150% for random words (Figure 1, bottom).
Discussion
Our analysis of scientific abstracts demonstrates that positive and - to a lesser extent -
negative words have been increasingly used over the past four decades. In contrast, this increase was
absent for neutral and random words. The increase in positive words could not be attributed to
general language tendencies as represented by the corpus of millions of printed books. Nor
is the increase driven by one or two words, as all words showed increased frequency patterns.
Even though the upward trend in positive word use was conserved in high-impact journals,
this trend was significantly less pronounced (Figure 1). This could be the result of a more
thorough and critical editorial and peer review process.
Our approach has strengths and weaknesses. The main strength of our lexicographic analysis
is the inclusion of all PubMed abstracts for over four decades which prevents selection bias.
Side-by-side comparisons with patterns of other word lists and general English texts provide
robust reference data. However, our study also has limitations. First, we limited the list of
positive and negative words, and the choice of words is likely to affect the specificity of the
observed patterns. However, the general tendency was comparable across individual words
and sensitivity analyses with additional positive words yielded similar results. Second, we did
not account for changes in the maximum abstract length of PubMed abstracts over the years.
However, the upward trends are more or less linear over time, and abstract length would
likely have resulted in an increase of neutral or random words as well. Third, we did not
study the location of the words in the abstracts, or the context of their usage. Contextual
analysis of words may differentiate between the connotation of isolated words and the
connotation conditional on the sentence. Moreover, we did not directly examine the
relationship between word usage and the current scientific culture, i.e. the role of increased
publication pressure and the perceived relevance of publications for a scientific career.
Finally, we cannot exclude the possibility that the scientific process has considerably
improved over the last decades and that the more frequent use of positive words is
appropriate.
Although researchers may have adopted an increasingly optimistic writing approach and are
ever more enthusiastic about their results, another explanation is more likely: scientists
may assume that results and their implications have to be exaggerated
and overstated in order to get published. Our finding that scientific abstracts use more overt
positive language is also probably related to the emergence of a positive outcome bias that
currently dominates scientific literature.17 There is high pressure on scientists in academia to
publish as many papers as possible in order to further one’s career. As a result, we may be
afraid to break the bad news that many studies do not result in statistically significant or
clinically meaningful effects. Currently, the majority of research findings may be false or
exaggerated,6 18 and research resources are often wasted.19 Overestimation of research
findings directly impairs the ability of science to find true effects and leads to an unnecessary
focus on research marketability. The consequences of this increase are worrisome since it
makes research a survival of the fittest: the person who is best able to ‘sell’ their results may
be the most successful. It is time for a new academic culture that rewards quality over
quantity and stimulates researchers to revere nuance and objectivity. Notwithstanding the
steady increase of superlatives in science, this finding should not distract us from the fact that we
need bright, unique, innovative, phenomenal, creative, and excellent scientists.
Competing interests
All authors have completed the ICMJE uniform disclosure form at
www.icmje.org/coi_disclosure.pdf and declare: no support from any organisation for the
submitted work; no financial relationships with any organisations that might have an interest
in the submitted work in the previous three years; no other relationships or activities that
could appear to have influenced the submitted work.
Funding source
No funding source supported this study.
Licence Statement
The Corresponding Author has the right to grant on behalf of all authors and does grant on
behalf of all authors, a worldwide licence to the Publishers and its licensees in perpetuity, in
all forms, formats and media (whether known now or created in the future), to i) publish,
reproduce, distribute, display and store the Contribution, ii) translate the Contribution into
other languages, create adaptations, reprints, include within collections and create summaries,
extracts and/or, abstracts of the Contribution, iii) create any other derivative work(s) based on
the Contribution, iv) to exploit all subsidiary rights in the Contribution, v) the inclusion of
electronic links from the Contribution to third party material where-ever it may be located;
and, vi) licence any third party to do any or all of the above.
Declaration of contribution
CV, JT and WO all made substantial contributions to the conception or design of the work; or
the acquisition, analysis, or interpretation of data for the work; AND
CV, JT and WO drafted the work or revised it critically for important intellectual content;
AND
CV, JT and WO gave final approval of the version to be published; AND
CV, JT and WO all agreed to be accountable for all aspects of the work in ensuring that
questions related to the accuracy or integrity of any part of the work are appropriately
investigated and resolved.
Data sharing
No additional data available.
Transparency statement
I, Christiaan Vinkers, affirm that the manuscript is an honest, accurate, and transparent
account of the study being reported; that no important aspects of the study have been omitted;
and that any discrepancies from the study as planned have been explained.
What this paper adds
Section 1: What is already known on this subject
• Science has shown an impressive growth over the last decades and in order to get
published, scientific discoveries are sometimes exaggerated or the potential
implications overstated.
Section 2: What this study adds
• Our analysis of PubMed abstracts demonstrates that positive words have been increasingly
used over the past four decades.
• The use of more overt positive language is probably related to the emergence of a
positive outcome bias that currently dominates scientific literature.
References
1. Ridker PM, Rifai N. Expanding options for scientific publication: is more always better?
Circulation 2013;127(2):155-6.
2. Boyack KW, Klavans R, Sorensen AA, Ioannidis JP. A list of highly influential biomedical
researchers, 1996-2011. Eur J Clin Invest 2013;43(12):1339-65.
3. Fraser AG, Dunstan FD. On the impossibility of being expert. BMJ 2010;341:c6815.
4. Publish or perish. Nature 2015;521(7552):259.
5. Macleod MR, Michie S, Roberts I, Dirnagl U, Chalmers I, Ioannidis JP, et al. Biomedical
research: increasing value, reducing waste. Lancet 2014;383(9912):101-4.
6. Ioannidis JP. Why most published research findings are false. PLoS Med 2005;2(8):e124.
7. Boutron I, Dutton S, Ravaud P, Altman DG. Reporting and interpretation of randomized
controlled trials with statistically nonsignificant results for primary outcomes. JAMA
2010;303(20):2058-64.
8. Ochodo EA, de Haan MC, Reitsma JB, Hooft L, Bossuyt PM, Leeflang MM.
Overinterpretation and misreporting of diagnostic accuracy studies: evidence of
"spin". Radiology 2013;267(2):581-8.
9. Lockyer S, Hodgson R, Dumville JC, Cullum N. "Spin" in wound care research: the
reporting and interpretation of randomized controlled trials with statistically non-
significant primary outcome results or unspecified primary outcomes. Trials
2013;14:371.
10. Patel SV, Chadi SA, Choi J, Colquhoun PH. The use of "spin" in laparoscopic lower GI
surgical trials with nonsignificant results: an assessment of reporting and interpretation
of the primary outcomes. Dis Colon Rectum 2013;56(12):1388-94.
11. Boutron I, Altman DG, Hopewell S, Vera-Badillo F, Tannock I, Ravaud P. Impact of spin
in the abstracts of articles reporting results of randomized controlled trials in the field
of cancer: the SPIIN randomized controlled trial. J Clin Oncol 2014;32(36):4120-6.
12. Lazarus C, Haneef R, Ravaud P, Boutron I. Classification and prevalence of spin in
abstracts of non-randomized studies evaluating an intervention. BMC Med Res
Methodol 2015;15:85.
13. Dodds PS, Clark EM, Desu S, Frank MR, Reagan AJ, Williams JR, et al. Human language
reveals a universal positivity bias. Proc Natl Acad Sci U S A 2015;112(8):2389-94.
14. Sumner P, Vivian-Griffiths S, Boivin J, Williams A, Venetis CA, Davies A, et al. The
association between exaggeration in health related science news and academic press
releases: retrospective observational study. BMJ 2014;349:g7015.
15. McCarthy M. Superlatives are commonly used in news coverage of cancer drugs, study
finds. BMJ 2015;351:h5803.
16. Michel JB, Shen YK, Aiden AP, Veres A, Gray MK, Google Books Team, et al. Quantitative
analysis of culture using millions of digitized books. Science 2011;331(6014):176-82.
17. Dwan K, Gamble C, Williamson PR, Kirkham JJ, Reporting Bias Group. Systematic review of
the empirical evidence of study publication bias and outcome reporting bias - an
updated review. PLoS One 2013;8(7):e66844.
18. Ioannidis JP. How to make more published research true. PLoS Med
2014;11(10):e1001747.
19. Chalmers I, Glasziou P. Avoidable waste in the production and reporting of research
evidence. Lancet 2009;374(9683):86-9.
Figures and Tables
Figure 1: Relative frequency patterns of positive (top left), negative (top right), neutral
(bottom left), and random (bottom right) words in PubMed abstracts and titles over time. The
mean relative frequency patterns of the same positive and negative words in general books are
plotted in A and B (the two top panels), including 95% confidence intervals (gray shaded).
Figure 2: Relative frequencies of 24 individual positive words as used in PubMed between
1975 and 2015. The word ‘inventive’ was not plotted due to low search volumes.
Table 1: List of the positive, negative, and neutral words used in the PubMed search queries and
the Google Books search engine.
Positive words: Amazing, assuring, astonishing, bright, creative, encouraging, enormous,
excellent, favorable, groundbreaking, hopeful, innovative, inspiring, inventive,
novel, phenomenal, prominent, promising, reassuring, remarkable, robust,
spectacular, supportive, unique, unprecedented
Negative words: Detrimental, disappointing, disconcerting, discouraging, disheartening,
disturbing, frustrating, futile, hopeless, impossible, inadequate, ineffective,
insignificant, insufficient, irrelevant, mediocre, pessimistic, substandard,
unacceptable, unpromising, unsatisfactory, unsatisfying, useless, weak,
worrisome
Neutral words: Animal, blood, bone, brain, condition, design, disease, experiment, human,
intervention, kidney, liver, man, men, muscle, patient, prospective, rodent,
significant, skin, skull, treatment, vessel, woman, women
Supplementary data
Supplementary data 1: Search queries
Supplementary table S1: List of twenty journals used for positive word analysis in high
impact journals
Supplementary figure S1: Relative frequencies of positive words both combined (top left)
and in isolation selected from a recent paper on superlatives commonly used in news coverage of
cancer drugs.
Supplemental figure S2: Relative frequencies of 21 individual negative words as used in
PubMed between 1975 and 2015. Four words with low search volumes (‘disconcerting’,
‘disheartening’, ‘unpromising’ and ‘unsatisfying’) were not plotted.
Supplemental figure S3: Relative frequency patterns of positive (top left), negative (top
right), neutral (bottom left), and random (bottom right) words in PubMed abstracts and titles over time
between authors affiliated with an institution inside or outside countries with English as the official
majority language.
Supplementary data 1: Search queries
PubMed screenshot (November 12th, 2015). Shows example query for the word ‘robust’. If the search
volume is high enough, a box (marked in red) appears on the right, from which
yearly frequency counts can be downloaded as a comma-separated file.
Combined query with one or more positive words in abstracts
(Amazing OR Assuring OR Astonishing OR Bright OR Creative OR Encouraging OR Enormous OR
Excellent OR Favorable OR Groundbreaking OR Hopeful OR Innovative OR Inspiring OR Inventive
OR Novel OR Phenomenal OR Prominent OR Promising OR Reassuring OR Remarkable OR Robust
OR Spectacular OR Supportive OR Unique OR Unprecedented)
Combined query with one or more negative words in abstracts
(Detrimental OR Disappointing OR Disconcerting OR Discouraging OR Disheartening OR Disturbing
OR Frustrating OR Futile OR Hopeless OR Impossible OR Inadequate OR Ineffective OR
Insignificant OR Insufficient OR Irrelevant OR Low-quality OR Mediocre OR Pessimistic OR
Substandard OR Unacceptable OR Unpromising OR Unsatisfactory OR Unsatisfying OR Useless OR
Weak OR Worrisome)
Combined query with one or more neutral words in abstracts
(Animal OR Blood OR Bone OR Brain OR Condition OR Design OR Disease OR Experiment OR
Human OR Intervention OR Kidney OR Liver OR Man OR Men OR Muscle OR Patient OR
Prospective OR Rodent OR Significant OR Skin OR Skull OR Treatment OR Vessel OR Woman OR
Women)
Combined query with one or more random words in abstracts
(manager OR substance OR law OR dust OR bite OR butter OR fold OR mind OR protect OR
insurance OR test OR father OR letter OR friend OR power OR edge OR linen OR scale OR bread
OR statement OR weather OR smell OR glass OR food OR level OR steam OR soap OR help OR rule
OR wind OR interest OR purpose OR hole OR fight OR representative OR danger OR prose OR
change OR discussion OR company OR direction OR balance OR organisation OR size OR trade OR
rice OR invention OR heat OR road OR mountain OR electric OR good OR natural OR sweet OR
dead OR strange OR thin OR political OR open OR bitter OR dark OR complex OR warm OR full OR
red OR kind OR possible OR strong OR free OR quick OR slow OR cut OR narrow OR certain OR
dependent OR flat OR acid OR fixed OR responsible OR false OR great OR like OR green OR cold
OR poor OR low OR opposite OR bright OR military OR fertile OR second OR left OR wrong OR
hanging OR gray OR mixed OR angry OR foolish OR loose OR late)
Query to combine with other queries to select abstracts within specific journals only
(ANN INTERN MED[journal]) OR (ANNU REV IMMUNOL[journal]) OR (ARCH INTERN
MED[journal]) OR (BMJ[journal]) OR (CANCER CELL[journal]) OR (CELL[journal]) OR
(IMMUNITY[journal]) OR (JAMA[journal]) OR (LANCET[journal]) OR (BLOOD[journal]) OR (NAT
NEUROSCI[journal]) OR (NAT REV CANCER[journal]) OR (NAT REV GENET[journal]) OR
(NAT REV IMMUNOL[journal]) OR (NAT REV NEUROSCI[journal]) OR (NATURE[journal]) OR
(N ENGL J MED[journal]) OR (PLOS MED[journal]) OR (P NATL ACAD SCI U S A[journal]) OR
(SCIENCE[journal])
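For readers who want to assemble such queries programmatically, the Python sketch below builds an OR-combined word query and restricts it to a set of journals. It is an illustration only: the helper names `any_of` and `restrict_to` are hypothetical, only a subset of the word and journal lists is shown, and joining the two parts with AND is an assumption, since the text says only that the journal query is combined with the other queries.

```python
def any_of(terms):
    """Join terms with OR and wrap the result in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def restrict_to(word_query, filter_query):
    """Combine a word query with a filter query (AND is assumed here)."""
    return f"{word_query} AND {filter_query}"


# Subsets of the positive word list and the journal list, for brevity.
positive = any_of(["Novel", "Robust", "Unprecedented"])
journals = any_of(["(BMJ[journal])", "(LANCET[journal])"])
query = restrict_to(positive, journals)
```

Generating the query from a list makes a missing operator between two terms less likely than when the string is typed by hand.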
Query to combine with other queries to select abstracts with authors from within
English-speaking countries (use NOT operator to invert result)
("UK"[ad] OR "United Kingdom"[ad] OR "Great Britain"[ad] OR "US"[ad] OR "USA"[ad] OR
"United States"[ad] OR "United States of America"[ad] OR "Australia"[ad] OR "New Zealand"[ad])
Query to get total number of papers for each year
"[journal]”
Supplementary Table S1: List of twenty journals used for positive word analysis in high impact
journals.
1. Annals of Internal Medicine
2. Annual Reviews in Immunology
3. Archives of Internal Medicine
4. Blood
5. British Medical Journal
6. Cancer Cell
7. Cell
8. Immunity
9. JAMA
10. Lancet
11. Nature
12. Nature Neuroscience
13. Nature Reviews in Cancer
14. Nature Reviews in Genetics
15. Nature Reviews in Immunology
16. Nature Reviews in Neuroscience
17. New England Journal of Medicine
18. PLoS Medicine
19. PNAS
20. Science
Supplementary figure S1: Relative frequencies of positive words both combined (top left) and in
isolation selected from a recent paper on superlatives commonly used in news coverage of cancer
drugs.
Supplemental figure S2: Relative frequencies of 21 individual negative words as used in PubMed
between 1975 and 2015. Four words with low search volumes (‘disconcerting’, ‘disheartening’,
‘unpromising’ and ‘unsatisfying’) were not plotted.
Supplemental figure S3. Relative frequency patterns of positive (top left), negative (top right),
neutral (bottom left), and random (bottom right) words in PubMed abstracts and titles over time
between authors affiliated with an institution inside or outside countries with English as the official
majority language.