Don’t make a fool of yourself: Reputation and performance evaluation in academia
This presentation is licensed under a CC-BY 4.0 license. You may copy, distribute, and use the slides in your own work, as long as you give attribution to the original author on each slide that you use.
2018-06-20
PD Dr. Felix Schönbrodt
Ludwig-Maximilians-Universität München
www.nicebread.de | www.researchtransparency.org
@nicebread303
Thesis 1: Our current indicators for scientific quality do a bad job.
Thesis 2: Our current incentives foster bad science.
Thesis 3: Some ideas for how good scientific practice and incentive structures can be realigned.
Thesis 1: Our current indicators for scientific quality do a bad job.
Journal Impact Factor (JIF)
Larivière, V., Kiermer, V., MacCallum, C. J., McNutt, M., Patterson, M., Pulverer, B., Swaminathan, S., et al. (2016). A simple proposal for the publication of journal citation distributions. bioRxiv, 062109. doi:10.1101/062109
JIF = 35
76% of papers have fewer citations than that
2.1% of papers are never cited
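The gap between a JIF of 35 and the typical paper is a direct consequence of skew. A minimal sketch (hypothetical, log-normally distributed citation counts; nothing from the actual journal data): because citation distributions are heavily right-skewed, the mean — which is essentially what the JIF reports — sits far above what most papers in the journal achieve.

```python
import random
import statistics

random.seed(42)

# Hypothetical journal: 1,000 papers with log-normally distributed
# citation counts (citation distributions are heavily right-skewed).
citations = [round(random.lognormvariate(2.0, 1.2)) for _ in range(1000)]

jif_like = statistics.mean(citations)  # the JIF is essentially this mean
share_below = sum(c < jif_like for c in citations) / len(citations)

print(f"mean citations (JIF-like): {jif_like:.1f}")
print(f"median citations:          {statistics.median(citations):.0f}")
print(f"papers below the mean:     {share_below:.0%}")
```

The mean lands well above the median, and roughly three quarters of the simulated papers fall below the journal-level average — the same pattern Larivière et al. (2016) document with real citation data.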
Journal Impact Factor (JIF)
[Figure from Brown and Ramaswamy (2007): crystallographic model quality plotted against JIF; the axis label „better“ marks the direction of higher quality]
Journal Impact Factor (JIF)
• Objective quantification of crystallographic quality: higher JIF ➙ lower quality (Brown and Ramaswamy, 2007)
• Negative relationship between JIF and statistical power/sample sizes in psychology (Fraley & Vazire, 2014; Szucs & Ioannidis, 2016)
• Positive relationship between JIF and objective errors in gene names in Excel sheets (Ziemann, Eren, & El-Osta, 2016)
• Positive relationship between JIF and the frequency of retractions (Brembs, Button, & Munafò, 2013)
For an overview, see Brembs, Button, & Munafò (2013) and http://bjoern.brembs.net/2016/01/even-without-retractions-top-journals-publish-the-least-reliable-science/
„Double-blind peer review is the hallmark of scientific quality control“
Reliability for single case diagnostics
Krohne, H. W., & Hock, M. (2015). Psychologische Diagnostik: Grundlagen und Anwendungsfelder. Kohlhammer Verlag.
How well do reviewers agree in their assessment of a paper? ➙ interrater agreement
Shin, J. C., Toutkoushian, R. K., & Teichler, U. (2011). University Rankings: Theoretical Basis, Methodology and Impacts on Global Higher Education. Springer Science & Business Media, p. 151; quote first seen in a presentation by Margit Osterloh.
„When I divide the week’s contribution into two piles – one that we are going to publish and the other we are going to return – I wonder whether it would make any real difference to the journal or its readers if I exchanged one pile for another“.
Sir Theodore Fox in Lancet, 1965
Peer review
• The classic: Would top journals accept already published papers once more? Peters and Ceci (1982)
• Meta-analysis of reviewer agreement (k = 48, 19,443 manuscripts): ⌀ ICC = .34, kappa = .17 (Bornmann, Mutz, & Daniel, 2010)
• „Agreement about shared values“ ≠ „agreement about true value“ ➙ correlation with the „true value“ ≤ 1 ➙ estimated correlation of reviewers’ assessments with the „true value“ of a paper: r = .09–.27 (mean r = .18; explained variance: 3%) (Starbuck, 2004)
• Decisions depend heavily on the (random) selection of reviewers ➙ a lottery (Bornmann & Daniel, 2009)
• Summary: Pre-publication peer review in the current system has a (perceived) value as feedback (to improve a manuscript), but virtually no value as quality control; very inefficient
For an overview, see Osterloh, M., & Kieser, A. (2015).
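For intuition on how weak a kappa of .17 really is, here is a quick sketch with hypothetical accept/reject counts for two reviewers (the numbers are made up, chosen so the result lands near the meta-analytic value): 60% raw agreement sounds respectable, but most of it is expected by chance alone.

```python
# Cohen's kappa for two reviewers judging 100 manuscripts.
# Hypothetical 2x2 table:
#                      reviewer 2: accept   reject
# reviewer 1: accept          20               20
#             reject          20               40
both_accept, r1_only, r2_only, both_reject = 20, 20, 20, 40
n = both_accept + r1_only + r2_only + both_reject

p_observed = (both_accept + both_reject) / n
p1_accept = (both_accept + r1_only) / n
p2_accept = (both_accept + r2_only) / n
p_chance = p1_accept * p2_accept + (1 - p1_accept) * (1 - p2_accept)

kappa = (p_observed - p_chance) / (1 - p_chance)
print(f"observed agreement: {p_observed:.0%}")  # 60%
print(f"chance agreement:   {p_chance:.0%}")    # 52%
print(f"Cohen's kappa:      {kappa:.2f}")       # 0.17
```

Kappa discounts the agreement that two reviewers would reach by independently flipping their (marginal) accept/reject rates — which is why 60% observed agreement shrinks to .17 once chance is taken into account.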
Number of publications (Emmy-Noether-Programm)
Hornbostel, S., Böhmer, S., Klingsporn, B., Neufeld, J., & von Ins, M. (2008); Neufeld, J. (2016).
Average JIF (Emmy-Noether-Programm)
Hornbostel, S., Böhmer, S., Klingsporn, B., Neufeld, J., & von Ins, M. (2008); Neufeld, J. (2016).
Citations per paper (Emmy-Noether-Programm)
Hornbostel, S., Böhmer, S., Klingsporn, B., Neufeld, J., & von Ins, M. (2008); Neufeld, J. (2016).
Emmy-Noether-Programm
Böhmer et al. (2008)
"Taken together, the bibliometric results show remarkably small differences between funded and rejected applicants (prior to funding). Moreover, these small differences are not increased by the fact that one of both groups gets the funding of the Emmy Noether program* and the other doesn’t.“
*1 - 1.5 million €
Thesis 2: Our current incentives foster bad science.
Much of the scientific literature, perhaps half, may simply be untrue.
Part of the problem is that no one is incentivised to be right.
Richard Horton, Editor of The Lancet
Quantity, not quality
Abele-Brehm, A. E., & Bühner, M. (2016). Wer soll die Professur bekommen? Psychologische Rundschau, 67(4), 250–261. http://doi.org/10.1026/0033-3042/a000335
Actual (not desired) relevance in professorship hiring committees:
Number of peer-reviewed publications (rank 1)
Fit of the research profile to the hiring institution (rank 2)
Quality of the research talk (rank 3)
Number of publications (rank 4)
Volume of acquired third-party funding (rank 5)
Number of first authorships (rank 6)
…
Bakker, M., van Dijk, A., & Wicherts, J. M. (2012). The Rules of the Game Called Psychological Science. Perspectives on Psychological Science, 7(6), 543–554. http://doi.org/10.1177/1745691612459060
Smaldino, P. E., & McElreath, R. (2016). The natural selection of bad science. Royal Society Open Science, 3(9), 160384. http://doi.org/10.1098/rsos.160384
Ideal strategy for a high quantity of publications: small n + many studies + questionable research practices (QRPs), such as p-hacking
„The rules of the game“ ➙ „the natural selection of bad science“
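Why this strategy pays can be shown with a minimal simulation (all details hypothetical: no true effect exists, and the p-hacking researcher measures five interchangeable outcome variables per study, reporting whichever one „works“):

```python
import math
import random
from statistics import NormalDist, mean

random.seed(1)
Z = NormalDist()

def p_value(n=20):
    """Two-sided p for a two-group z-test on pure-noise data (no true effect)."""
    g1 = [random.gauss(0, 1) for _ in range(n)]
    g2 = [random.gauss(0, 1) for _ in range(n)]
    z = (mean(g1) - mean(g2)) / math.sqrt(2 / n)
    return 2 * (1 - Z.cdf(abs(z)))

n_sims = 2000
# Honest researcher: one planned outcome, reported no matter what.
honest = sum(p_value() < .05 for _ in range(n_sims)) / n_sims
# QRP researcher: five outcome variables, reports whichever one "works".
hacked = sum(any(p_value() < .05 for _ in range(5)) for _ in range(n_sims)) / n_sims

print(f"false-positive rate, one outcome:  {honest:.1%}")
print(f"false-positive rate, best of five: {hacked:.1%}")
```

With five independent outcomes, the chance of at least one false positive is 1 − .95⁵ ≈ 23% — more than four times the nominal 5%, and every one of those „findings“ is publishable under a quantity-driven incentive system.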
„Innovative, unprecedented, transformative!“ +880% from 1974 to 2014
Vinkers, C. H., Tijdink, J. K., & Otte, W. M. (2015). Use of positive and negative words in scientific PubMed abstracts between 1974 and 2014: retrospective analysis. BMJ, 351, h6467. http://doi.org/10.1136/bmj.h6467
Amazing!! Enormous!! Groundbreaking!!!
http://www.nature.com/news/2011/111005/full/478026a/box/2.html
http://retractionwatch.com/2016/03/24/retractions-rise-to-nearly-700-in-fiscal-year-2015-and-psst-this-is-our-3000th-post/
https://www.washingtonpost.com/news/speaking-of-science/wp/2016/04/01/when-scientists-lie-about-their-research-should-they-go-to-jail/
„In the past decade, the number of retraction notices has shot up 10-fold.“
Retractions: +1000% in 10 years [chart: 467 retractions in 2013 ➙ 684 in 2015]
Scientific misconduct: +1200% in 4 years
U.S. Office of Research Integrity (https://ori.hhs.gov/case_summary): 3 cases in 2009–2011 ➙ 36 cases in 2012–2015
What fraction of published findings can be independently replicated?
[Chart: share of findings successfully replicated vs. failed]
Psychology (2015; n = 97): 36% replicated
Economics* (2015; n = 67): 49% replicated
Cancer research 1 (2011; n = 53): 11% replicated
Cancer research 2 (2012; n = 67): 21% replicated
Experimental philosophy (2018; n = 40): 78% replicated
Open Science Collaboration (2015); Chang & Li (2015); Begley, C. G., & Ellis, L. M. (2012); Prinz, F., Schlange, T., & Asadullah, K. (2011); Cova et al. (2018)
* The economics data concern reproducibility, i.e., the attempt to obtain the same results by applying the original analysis to the original data set.
Early career researchers are stuck
➙ a felt contradiction between „good research“/„open research“ and „having a career in science“
What would be a good balance between Open Science and having a career in academia? […] Being open IMHO is a competitive disadvantage. Can you only afford open science when you are tenured?
My contract is limited to two years – although it would be nice to publish the data, I have no time to do it. I rather have to churn out another publication.
Why should I share my hard-won data with my rivals that presumably compete with me for the next post-doc position?
[Comic: © KC Green]
Thesis 3: Some ideas for how to realign good scientific practice and incentive structures.
Publications • Job committees • Funding/Grants
Variant 1: Post-publication peer review (PPPR)
• Reviewers agree more when evaluating bad research ➙ negative selection is possible; an assessment of excellence is hardly possible
• Therefore: only perform input control (i.e., cut out the crap); after that, use public post-publication peer review in a lively discourse to sort the excellent from the average
Publications
Kriegeskorte, N. (2012). Open evaluation: a vision for entirely transparent post-publication peer review and rating for science. Frontiers in Computational Neuroscience, 6, 79. http://doi.org/10.3389/fncom.2012.00079
Variant 2: Overlay journals
1. All papers start as preprints, which receive comments in open peer review and are eventually revised on the preprint server.
2. Journals scout for the best preprints („reader’s digest“), possibly even competing for the best papers, or authors actively submit their preprint to a journal.
3. Optionally, these papers receive an additional traditional peer review with selected reviewers.
4. The final paper stays on the preprint server and is linked from the overlay journal.
Example: Discrete Analysis (mathematics), no APCs: http://discreteanalysisjournal.com/
Publications
Potential drawbacks
• An even greater flood of papers?
  • Use intelligent recommender systems
  • Overlay journals as a reader’s digest
  • Papers without any PPPR typically are not read („a twilight zone of unevaluated papers“)
• Attention economy: success = prolific social media activity?
• See also the discussion and FAQ in Kriegeskorte (2012).
Kriegeskorte, N. (2012). Open evaluation: a vision for entirely transparent post-publication peer review and rating for science. Frontiers in Computational Neuroscience, 6, 79. http://doi.org/10.3389/fncom.2012.00079
Publications
✓ Full open access, no APCs
✓ Non-commercial institutional publisher (Linnaeus U library)
✓ Open, citable peer review (with DOI)
✓ Well-powered null results and direct replications welcomed
✓ Registered Reports as an option
✓ Mandatory open data
✓ Open Science badges (including a reproducibility badge)
Publications
https://www.psychopen.eu/
Publications
Research funding
https://dirnagl.com/2014/01/14/otto-warburgs-research-grant/
Grant proposal submitted to the precursor of the German Research Foundation, around 1921 (accepted).
Osterloh & Frey: the aleatoric principle
• Grant decisions as a lottery: it’s not a bug, it’s a feature!
• Tiered selection process:
  • Preselection by traditional peer review (but with a clear focus on negative selection only!)
  • Distribute funds either by chance or by traditional selection criteria
  • After some time: compare grant performance of the „chance track“ with the „traditional track“
• Applicable to research funding, but also to search committees for professorship positions (cf. the University of Basel in the 18th century)
Funding/Grants
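The tiered process above can be sketched in a few lines (all numbers hypothetical: 200 applicants, the bottom quartile screened out by peer review, 30 grants drawn by lot):

```python
import random

random.seed(7)

# Hypothetical pool: 200 proposals, each with a noisy quality score.
proposals = [{"id": i, "score": random.gauss(50, 10)} for i in range(200)]

# Tier 1: negative selection only. Peer review weeds out the clearly
# flawed proposals (here: the bottom quartile) without ranking the rest.
cutoff = sorted(p["score"] for p in proposals)[len(proposals) // 4]
fundable = [p for p in proposals if p["score"] >= cutoff]

# Tier 2: among the remaining fundable proposals, award 30 grants by lot.
funded = random.sample(fundable, k=30)

print(f"fundable after negative selection: {len(fundable)}")
print(f"funded by lottery: {len(funded)}")
```

The design exploits what peer review is actually reliable at — spotting clearly bad work — and replaces the unreliable fine-grained ranking at the top with an explicit, honest lottery.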
Peer-to-peer funding
http://www.laborjournal-archiv.de/epaper/LJ_17_06/index.html#22
https://dirnagl.com/2017/07/01/start-your-own-funding-organization/
Bollen, J., Crandall, D., Junk, D., Ding, Y., & Börner, K. (2014). From funding agencies to scientific agency: Collective allocation of science funding as an alternative to peer review. EMBO Reports, 15(2), 131–133. https://doi.org/10.1002/embr.201338068
Funding/Grants
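The collective-allocation idea can be sketched loosely after Bollen et al. (2014): every scientist receives an equal base grant and must redistribute a fixed fraction of their funding to colleagues of their own choosing, so money flows toward the people the community itself values. All names, amounts, and donation choices below are made up for illustration.

```python
# Every scientist receives an equal base grant each cycle and must pass
# on a fixed fraction of their previous funding to colleagues of their
# own choosing (names, amounts, and choices are hypothetical).
BASE, GIVE_FRACTION = 100_000, 0.5

scientists = ["A", "B", "C", "D"]
choices = {"A": "B", "B": "C", "C": "B", "D": "A"}  # who donates to whom
funding = {s: BASE for s in scientists}

for cycle in range(3):  # a few funding cycles
    gifts = {s: funding[s] * GIVE_FRACTION for s in scientists}
    funding = {
        s: BASE + sum(g for giver, g in gifts.items() if choices[giver] == s)
        for s in scientists
    }

print(funding)  # B, chosen by two colleagues, accumulates the most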
Quantity, not quality
Abele-Brehm, A. E., & Bühner, M. (2016). Wer soll die Professur bekommen? Psychologische Rundschau, 67(4), 250–261. http://doi.org/10.1026/0033-3042/a000335
Job committees
Actual (not desired) relevance in professorship hiring committees:
Number of peer-reviewed publications (rank 1)
Fit of the research profile to the hiring institution (rank 2)
Quality of the research talk (rank 3)
Number of publications (rank 4)
Volume of acquired third-party funding (rank 5)
Number of first authorships (rank 6)
…
Quality assessment of the best three publications (rank 17)
…
Indicators of research transparency (rank 41 of 41)
Quality, not quantity
Abele-Brehm, A. E., & Bühner, M. (2016). Wer soll die Professur bekommen? Psychologische Rundschau, 67(4), 250–261. http://doi.org/10.1026/0033-3042/a000335
Job committees
Indicators with the largest discrepancy between „desired“ and „actual“ relevance
Job committees
https://docs.google.com/document/d/1ty43Syw0Flkh8ncjW8MZArIkvYe8hLwwhLlIwbtSk_Y/edit?usp=drive_web&ouid=108982640291853577145
www.uni-koeln.de
The Department of Psychology at the Faculty of Human Sciences of the University of Cologne (UoC) seeks to appoint a
Full Professor (W3) of Social Psychology
to be filled as soon as possible.
The successful candidate is expected to have a record of excellence in social cognition, and/or related areas such as cognitive psychology or motivation science.
The candidate is also expected to strongly contribute to the UoC’s Center for Social and Economic Behavior and the Social Cognition Center Cologne of the Department of Psychology. Both structures are part of UoC’s Key Profile Area II, „Behavioral Economic Engineering and Social Cognition“.
The ideal candidate’s track record should show an excellent fit with these interrelated structures and a strong interest to bridge the fields of social cognition and behavioral economics.
The Department of Psychology aims for transparent and reproducible research (including Open Data, Open Materials, and Preregistrations). Applicants are asked to illustrate how they have pursued these goals in the past and/or how they plan to do so in the future.
We strongly encourage international applicants. Salaries and working conditions at the UoC - one of the German Universities of Excellence – meet international standards. Candidates are expected to be willing to learn the German language. The Faculties offer Bachelor, Master, and doctoral degrees. Courses are taught either in English or German.
Applicants will be hired in concordance with § 36 of the University Law of the State of North-Rhine Westphalia.
The UoC supports diversity, the multiplicity of perspectives, and equal opportunities. The University of Cologne particularly encourages applications from disabled persons. Disabled persons are given preference in case of equal qualification. Women are strongly encouraged to apply. Preferential treatment is given to women if their professional qualifications and abilities are equivalent to those of other applicants.
Applications with the usual documents (including vita, research statement, 5 most important publications, full list of publications and teaching experience, and diplomas) should be submitted via the University’s Academic Job Portal (https://berufungen.uni-koeln.de) until March 30th, 2017.
…
…
Since 2015, all professorship job descriptions use this requirement.
See more such prof job ads at: https://osf.io/7jbnt/
Hiring committees: Make „open science“ a desirable or essential job characteristic
http://www.fak11.lmu.de/dep_psychologie/osc/open-science-hiring-policy/index.html
Job committees
CV: Only the 10 best publications are allowed, plus extra information that is easy to gather
• No journal name given; the JIF is irrelevant or misleading
• Paper-level citation metrics
• Basic information for judging evidential value
• Open science indicators: judging reproducibility
• Data: own collection or reuse?
Summary
• We apply established methodological quality checks in our „normal“ scientific work – we should also apply them to performance evaluations in academia ➙ a critical analysis of the current system
• The current incentive structure fosters bad science.
• Early career researchers feel a tension: you can either do „good science“ or have a „career in academia“. Seniors have an obligation to resolve that dilemma for young scientists.
• Alternative ideas and structures for performance evaluation exist and are just waiting to be tested and evaluated.