Research Evaluation: When you measure a system, you change the system
Giorgio Sirilli
IRCrES-CNR, ROARS editorial board
ROARS
Start: 2011
Members of the editorial board: 14
Collaborators: 250
Contacts: 10.6 million (November 2011 – May 2015)
Average daily contacts: 500 in November 2011; 8,000 in 2014
Articles published: 2,000
Comments by readers: 30,000
ROARS is ranked 8th among the top national cultural blogs
ROARS, a genuine expression of democracy and participation, has become a very important player in the policy debate and in policy making
Evaluation
Evaluation may be defined as an objective process aimed at the critical analysis of the relevance, efficiency, and effectiveness of policies, programmes, projects, institutions, groups and individual researchers in the pursuance of the stated objectives.
Evaluation consists of a set of coordinated activities of comparative nature, based on formalised methods and techniques through codified procedures aimed at formulating an assessment of intentional interventions with reference to their implementation and to their effectiveness.
Internal/external evaluation
The first evaluation (Genesis)
In the beginning God created the heaven and the earth.
And God saw everything that He had made. “Behold”, God said, “it is very good”. And the evening and morning were the sixth day.
And on the seventh day God rested from all His work. His Archangel came then unto Him asking, “God, how do you know that what You have created is ‘very good’? What are Your criteria? On what data do You base Your judgement? Aren’t You a little close to the situation to make a fair and unbiased evaluation?”
God thought about these questions all that day and His rest was greatly disturbed.
On the eighth day, God said, “Lucifer, go to hell!”
(From Halcolm’s “The Real Story of Paradise Lost”)
A brief history of evaluation
Research Assessment Exercise (RAE)
Research Excellence Framework (REF) (impact)
“The REF will over time doubtless become more sophisticated and burdensome. In short we are creating a Frankenstein monster” (Ben Martin)
Italy, a latecomer
Evaluation in Italy: yes or no?
Yes, but … good evaluation
The value of science
William Gladstone, then British Chancellor of the Exchequer (minister of finance), asked Michael Faraday about the practical value of electricity.
Gladstone’s only commentary was “But, after all, what use is it?”
Faraday: “Why, sir, there is every probability that you will soon be able to tax it.”
Michael Faraday William Gladstone
The case of physicists
“Physics is a single discipline, but unfortunately physicists today are divided into two categories: the theoreticians and the experimentalists. If a theoretician does not possess extraordinary ability, his work makes no sense … In experimental work, by contrast, even a person of average ability can do useful work.”
(Enrico Fermi, 1931; translated from the Italian)
The case of graphene
Graphene is an allotrope of carbon in the form of a two-dimensional, atomic-scale, hexagonal lattice.
Graphene has many extraordinary properties. It is about 100 times stronger than steel by weight, conducts heat and electricity with great efficiency and is nearly transparent.
It was first measurably produced and isolated in the lab in 2003.
Andre Geim and Konstantin Novoselov at the University of Manchester won the Nobel Prize in Physics in 2010 "for groundbreaking experiments regarding graphene."
The global market for graphene is reported to have reached $9 million by 2014, with most sales in the semiconductor, electronics, battery energy and composites industries.
The famous paper by Andre Geim and Konstantin Novoselov was published in 2004, and by 2007 it was indeed quite famous and widely cited.
The point is whether the committee would have selected his project and awarded him an ERC Starting Grant in 2004. Judging by his citation and publication records in 2004, it is very improbable that he would have been considered among the top 10%.
The knowledge institutions
University
teaching
research
“third mission”
Research agencies
research
problem solving
management
The neo-conservative wave in Italy
Letizia Moratti, Italian minister of education and research
“First show that you use public money efficiently and effectively, then we will loosen the purse strings.” It never happened!
A management model borrowed from the firm, based on the principles of competitiveness and customer satisfaction (the market)
The catchwords: competitiveness, excellence, meritocracy
The “evaluative state” as the “minimum state”, in which the government gives up its role of political responsibility, avoids the democratic debate in search of consensus, and rests on the “automatic pilot” of techno-administrative control.
Contro l’ideologia della valutazione. L’ANVUR e l’arte della rottamazione dell’università (Against the ideology of evaluation. ANVUR and the art of scrapping the university)
“ANVUR is much more than an administrative branch. It is the outcome of a cultural and political project aimed at reducing the range of alternatives and hampering pluralism.”
Sergio Benedetto
Changes in university life
The university is now at the mercy of:
- increasing bibliometric measurement
- quality standards
- blind refereeing (someone sees you but you do not see him)
- bibliometric medians
- journal classifications (A, B, C, …)
- opportunistic citing
- academic tourism
- administrative burden
- …
The epistemic consequences of bibliometrics-based evaluation
Interviews with Italian researchers (40-65 years old)
Main results:
- A drastic change in researchers’ attitudes due to the introduction of bibliometrics-based evaluation
- Bibliometrics-based evaluation has an extremely strong normative effect on scientific practices, deeply affecting the epistemic status of the disciplines
(T. Castellani, E. Pontecorvo, A. Valente, Epistemological consequences of bibliometrics: Insights from the scientific community, Social Epistemology Review and Reply Collective, vol. 3, no. 11, 2014)
Results
1. The bibliometrics-based evaluation criteria changed the way in which scientists choose the topic of their research:
- choosing a fashionable theme
- placing the article in the wake of an important discovery (bandwagon effect)
- choosing short empirical papers
2. The hurry
3. Interdisciplinary topics are hindered. Bibliometric evaluative systems encourage researchers not to change topic during their career
4. Repetition of experiments is discouraged. Only new results are considered interesting
(T. Castellani, E. Pontecorvo, A. Valente, Epistemological consequences of bibliometrics: Insights from the scientific community, Social Epistemology Review and Reply Collective, vol. 3, no. 11, 2014)
Research evaluation
Indicators used
- bibliometrics
- R&D
- peer review
- students
- graduates
- patents
- spin-offs
- contracts and other funding
- other
The h-index (Jorge Eduardo Hirsch)
In 2005, the physicist Jorge Hirsch suggested a new index to measure the broad impact of an individual scientist’s work, the h-index.
A scientist has index h if h of his or her Np papers have at least h citations each and the other (Np − h) papers have ≤ h citations each.
In plain terms, a researcher has an h-index of 20 if he or she has published 20 articles receiving at least 20 citations each.
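The definition above can be turned into a few lines of code. The following is only a minimal sketch: the function name `h_index` and the citation counts are hypothetical and do not come from the slides.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the first `rank` papers all have >= rank citations
        else:
            break
    return h


# Hypothetical citation counts, for illustration only.
example = [25, 8, 5, 3, 3, 1, 0]
print(h_index(example))  # 3: three papers have at least 3 citations each
```

Note that a single very highly cited paper (25 citations in the example) does not raise the index, which is one reason why authors of a few landmark works can have a low h-index.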
Impact factor (Eugene Garfield)
The impact factor (IF) of an academic journal is a measure reflecting the average number of citations to recent articles published in that journal. It is frequently used as a proxy for the relative importance of a journal within its field. In any given year, the impact factor of a journal is the average number of citations received per paper published in that journal during the two preceding years. For example, if a journal has an impact factor of 3 in 2008, then its papers published in 2006 and 2007 received 3 citations each on average in 2008. ("Citable items" for this calculation are usually articles, reviews, proceedings, or notes; not editorials or letters to the editor).
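The two-year calculation described above can be sketched as follows. The function name `impact_factor` and all the counts are hypothetical; they are chosen only so that the result reproduces the impact factor of 3 used in the example.

```python
def impact_factor(citations_in_year, citable_items):
    """Two-year impact factor for year Y.

    citations_in_year: citations received in year Y by items the journal
                       published in years Y-1 and Y-2.
    citable_items:     number of citable items published in Y-1 and Y-2.
    """
    if citable_items <= 0:
        raise ValueError("the journal must have published at least one citable item")
    return citations_in_year / citable_items


# Hypothetical journal: 100 citable items in 2006 and 120 in 2007,
# which together received 660 citations during 2008.
if_2008 = impact_factor(citations_in_year=660, citable_items=100 + 120)
print(if_2008)  # 3.0
```

Because the result is an average over a skewed distribution, a handful of heavily cited papers can account for most of the 660 citations while most papers receive far fewer than 3.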
Nobel laureates and bibliometrics (Boson in 2013)
Peter Ware Higgs: 13 works, mostly in “minor” journals, h-index = 6
François Englert: 89 works, both in prestigious and minor journals, h-index = 10
W. S. Boyle: h-index = 7
G. E. Smith: h-index = 5
C. K. Kao: h-index = 1
T. Maskawa: h-index = 1
Y. Nambu: h-index = 17
Science and ideology: the impact on citations
[Figure: number of citations (0 to 3,000) by citation year for Marx and Lenin; the fall of the Berlin Wall (Berlin, Nov. 1989) is marked]
San Francisco Declaration on Research Assessment
The Journal Impact Factor, as calculated by Thomson Reuters, was originally created as a tool to help librarians identify journals to purchase, not as a measure of the scientific quality of research in an article.
With that in mind, it is critical to understand that the Journal Impact Factor has a number of well-documented deficiencies as a tool for research assessment. These limitations include:
A) citation distributions within journals are highly skewed;
B) the properties of the Journal Impact Factor are field-specific: it is a composite of multiple, highly diverse article types, including primary research papers and reviews;
C) Journal Impact Factors can be manipulated (or “gamed”) by editorial policy; and
D) data used to calculate the Journal Impact Factors are neither transparent nor openly available to the public.
San Francisco Declaration on Research Assessment
General Recommendation
Do not use journal-based metrics, such as Journal Impact Factors, as a surrogate measure of the quality of individual research articles, to assess an individual scientist’s contributions, or in hiring, promotion, or funding decisions.
The Leiden Manifesto
Bibliometrics: The Leiden Manifesto for research metrics
“Data are increasingly used to govern science. Research evaluations that were once bespoke and performed by peers are now routine and reliant on metrics. The problem is that evaluation is now led by the data rather than by judgement. Metrics have proliferated: usually well intentioned, not always well informed, often ill applied. We risk damaging the system with the very tools designed to improve it, as evaluation is increasingly implemented by organizations without knowledge of, or advice on, good practice and interpretation.”
The Leiden Manifesto – Ten principles
1) Quantitative evaluation should support qualitative, expert assessment.
2) Measure performance against the research missions of the institution, group or researcher.
3) Protect excellence in locally relevant research.
4) Keep data collection and analytical processes open, transparent and simple.
5) Allow those evaluated to verify data and analysis.
6) Account for variation by field in publication and citation practices.
7) Base assessment of individual researchers on a qualitative judgment of their portfolio.
8) Avoid misplaced concreteness and false precision.
9) Recognize the systemic effects of assessment and indicators.
10) Scrutinize indicators regularly and update them.
Ranking of universities
Four major sources of ranking
- ARWU Shanghai (Shanghai Jiao Tong University)
- QS World University Rankings
- THE University Ranking (Times Higher Education)
- US News & World Report (Best Global Universities)
Criteria selected as the key pillars of what makes a world class university:
- Research
- Teaching
- Employability
- Internationalisation
- Facilities
- Social Responsibility
- Innovation
- Arts & Culture
- Inclusiveness
- Specialist Criteria
TopUNIVERSITIES Worldwide university rankings, guides & events
Global rankings cover less than 3-5% of the world’s universities
[Figure: universities ordered by performance against number of universities: Top 20, Top 500, Next 500, other 16,500 universities]
Ranking of universities: the case of Italy
Sources: ARWU Shanghai (Shanghai Jiao Tong University); QS World University Rankings; THE University Ranking (Times Higher Education); US News & World Report (Best Global Universities)
ARWU Shanghai: Bologna 173, Milano 186, Padova 188, Pisa 190, Sapienza 191
QS World University Rankings: Bologna 182, Sapienza 202, Politecnico Milano 229
World University Ranking SA: Sapienza 95, Bologna 99, Pisa 184, Milano 193
US News & World Report: Sapienza 139, Bologna 146, Padova 146, Milano 155
The rank-ism (De Nicolao)
The vice-rector of the University of Pavia declared: “There are various rankings in the world: in each of them the University of Pavia ranks in the first 1%.”
But it is not true. According to three agencies Pavia is in the following positions:
371: QS World University Rankings
251-275: Times Higher Education
401-500: Shanghai Ranking (ARWU)
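The “first 1%” claim can be checked with simple arithmetic. This is a rough sketch under one explicit assumption: the total number of universities is taken as 17,500, the sum of the groups on the slide about global rankings (500 + 500 + 16,500). The function name `top_percent` is hypothetical, and the best position of each reported band is used, which favours Pavia.

```python
def top_percent(rank, total):
    """Percentage of institutions placed at or above the given rank."""
    return 100.0 * rank / total


# Assumption: total taken from the slide on global rankings
# (Top 500 + Next 500 + other 16,500 universities).
TOTAL_UNIVERSITIES = 500 + 500 + 16_500

# Best position of each band reported for Pavia on the slide.
pavia_ranks = {
    "QS World University Rankings": 371,
    "Times Higher Education": 251,
    "Shanghai Ranking (ARWU)": 401,
}

for ranking, rank in pavia_ranks.items():
    share = top_percent(rank, TOTAL_UNIVERSITIES)
    print(f"{ranking}: position {rank}, top {share:.2f}%")

# Under the same assumption, the top 1% ends at this position.
top_one_percent_cutoff = TOTAL_UNIVERSITIES // 100
print(f"Top 1% cutoff: position {top_one_percent_cutoff}")
```

Under this assumption the top 1% ends at position 175, so none of the three reported positions falls inside it; with a different estimate of the world total the cutoff moves accordingly.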
Evaluation is an expensive exercise
Rule of thumb: less than 1% of R&D budget devoted to its evaluation
Evaluation of the Quality of Research (VQR): 300 million euro (ROARS); 182 million euro (Geuna)
Research Assessment Exercise (RAE): 540 million euro
Research Excellence Framework (REF): 1 million pounds (500 million)
National Scientific Habilitation: 126 million euro
- Cost per application: 2,300 euro
- Cost per job assigned: 32,000 euro
Evaluation of the Quality of Research by ANVUR
Researchers’ products to be evaluated:
- journal articles
- books and book chapters
- patents
- designs, exhibitions, software, manufactured items, prototypes, etc.
University teachers: 3 “products” over the period 2004-2010
Public Research Agencies researchers: 6 “products” over the period 2004-2010
Scores: from 1 (excellent) to -1 (missing)
Attention basically here!
Indicators linked to research:
- quality (0.5)
- ability to attract resources (0.1)
- mobility (0.1)
- internationalisation (0.1)
- high level education (0.1)
- own resources (0.05)
- improvement (0.05)
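To see how much the quality indicator dominates, the weights above can be combined into a single score. This is only a sketch: the slides do not say how ANVUR aggregates the indicators, so a plain weighted sum is assumed here, and the function name `composite_score` and the indicator values are hypothetical.

```python
# Weights copied from the slide (decimal commas written as points).
RESEARCH_WEIGHTS = {
    "quality": 0.5,
    "ability to attract resources": 0.1,
    "mobility": 0.1,
    "internationalisation": 0.1,
    "high level education": 0.1,
    "own resources": 0.05,
    "improvement": 0.05,
}


def composite_score(scores, weights=RESEARCH_WEIGHTS):
    """Weighted sum of indicator values (assumed aggregation rule)."""
    missing = set(weights) - set(scores)
    if missing:
        raise KeyError(f"missing indicators: {sorted(missing)}")
    return sum(weights[name] * scores[name] for name in weights)


# Hypothetical institution: full marks on quality, zero on everything else.
only_quality = {name: 0.0 for name in RESEARCH_WEIGHTS}
only_quality["quality"] = 1.0
print(composite_score(only_quality))  # 0.5
```

The weights sum to 1, and quality alone carries half of the total, as much as the other six indicators combined.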
Indicators of the “third mission”:
- fund raising (0.2)
- patents (0.1)
- spin-offs (0.1)
- incubators (0.1)
- consortia (0.1)
- archaeological sites (0.1)
- museums (0.1)
- other activities (0.2)
Call for Papers for Philosophy and Technology’s special issue: Toward a Philosophy of Impact
There was a time when serendipity played a central role in knowledge policy. Scientific advancement was viewed as essential for social progress, but this was paired with the assumption that it was generally impossible to steer research directly toward desired outcomes. Attempts to guide the course of research or predict its societal impacts were seen as impeding the advancement of science and thus of social welfare. Driven in part by budgetary constraints, and in part by ideology, the age of serendipity is being eclipsed by the age of accountability. Society increasingly requires academics to give an account of the value of their research. The ‘audit culture’ now permeates the university from STEM (science, technology, engineering, and math) through HASS (humanities, arts, and social sciences). Academics are being asked to consider not just how their work influences their disciplines, but also other disciplines and society more generally.
A warning
“Science today is riven with perverse incentives:
Researchers judge one another not by the quality of their science (who has time to read all that?) but by the pedigree of their journal publications.
High-profile journals pursue flashy results, many of which won’t pan out on further scrutiny.
Universities reward researchers on those publication records.
Financing agencies, reliant on peer review, direct their grant money back toward those same winners.
Graduate students, dependent on their advisers and neglected by their universities, receive minimal, ad hoc training on proper experimental design, believing the system of rewards is how it always has been and how it always will be.”
Paul Voosen, “Amid a Sea of False Findings, the NIH Tries Reform”, The Chronicle of Higher Education (March 16, 2015)
Lessons from Research Evaluation
Evaluation in Italy is here to stay
The system has been measured and has changed
Awareness of the limitations of metrics
The challenge: prevent evaluation from becoming a Frankenstein monster
Main problems:
- League tables
- Competition vs cooperation of scientists
- Peer review vs bibliometrics
- NSE vs SSH
- Opportunistic behaviour
- The split of the academic community (the good and the bad guys)
- The equilibrium amongst teaching, research and third mission
- Bureaucratisation
- The use of evaluation for policy purposes