R&D evaluation: Opportunities and threats
Giorgio Sirilli
Research Director
Campus Luigi Einaudi
Outline of the presentation
- Definitions of research and evaluation
- Some data
- Evaluation in the context of S&T policy
- Some experiences of R&D evaluation
- Lessons learned
- Concluding remarks
Definitions
Evaluation
Evaluation may be defined as an objective process aimed at the critical analysis of the relevance, efficiency, and effectiveness of policies, programmes, projects, institutions, groups and individual researchers in the pursuance of the stated objectives.
Evaluation consists of a set of coordinated activities of a comparative nature, based on formalised methods and techniques and on codified procedures, aimed at forming an assessment of intentional interventions with reference to their implementation and their effectiveness.
Internal/external
The first evaluation (Genesis)
In the beginning God created the heaven and the earth. And God saw everything that He had made. “Behold”, God said, “it is very good”. And the evening and morning were the sixth day.
And on the seventh day God rested from all His work. His Archangel came then unto Him, asking: “God, how do you know that what You have created is ‘very good’? What are Your criteria? On what data do You base Your judgement? Aren’t You a little close to the situation to make a fair and unbiased evaluation?”
God thought about these questions all that day, and His rest was greatly disturbed.
On the eighth day, God said, “Lucifer, go to hell!”
(From Halcolm’s “The Real Story of Paradise Lost”)
Research and Development (R&D): definition
Research and experimental development (R&D) comprise creative work undertaken on a systematic basis in order to increase the stock of knowledge, including knowledge of man, culture and society, and the use of knowledge to devise new applications
- Basic research
- Applied research
- Experimental development
OECD Frascati Manual
R&D performing organisations
Higher education
Government
Business enterprises
Private non-profit institutions
The knowledge bundle in higher education and government
The knowledge institutions
Higher education:
- teaching
- research
- “third mission”
Research agencies:
- research
- problem solving
- management
Some data
R&D resources
(Chart; Italy highlighted. Source: OECD Science, Technology and Industry Scoreboard, 2015)
R&D expenditure/GDP (percentage), 2013
(Chart. Source: OECD Science, Technology and Industry Scoreboard, 2015)
The R&D performing sectors
R&D expenditure by performing sectors, 2013
The context: S&T policy
Vannevar Bush, “Science, the Endless Frontier”
1945
“Science, the Endless Frontier”
Problems to be addressed and to be solved in the US through science:
- defence
- health
Solution: science policy
National Science Foundation
The neo-conservative wave of the 1980s
“All areas of public expenditure should demonstrate ‘value for money’.”
Thatcher’s three Es:
- economy
- efficiency
- effectiveness
The new catchwords
- New public management
- Value for money
- Accountability
- Relevance
Methodology
Questions
Why evaluate?
How to evaluate?
What are the results of the exercise?
Why do we need evaluation?
Governments need tools to help determine:
- how much to invest in R&D
- in which areas to invest
- which institutions/organisations to finance
Types of decisions in research policy
- Distribution between sciences (e.g. physics, social sciences)
- Distribution between specialties within sciences (e.g. high-energy physics, optical physics)
- Distribution between different types of activity (e.g. university research, postgraduates, central labs)
- Distribution between centres, groups, individuals
Scope and object of evaluation
Type of research, e.g.:
- academic research vs targeted research
- international big-science programmes
Level and object of the evaluation:
- individual researcher
- research group
- project
- programme
- whole discipline
Criteria for evaluation
Criteria vary according to the scope and purpose of the evaluation; they range from criteria for identifying the quality/impact of research to criteria for identifying value for money.
Four main aspects are often distinguished:
- quantity
- quality
- impact
- utility
Criteria can be:
- internal: likely impact on the advance of knowledge
- external: likely impact on other S&T fields, the economy and society
Research evaluation
Evaluation of what:
- research
- education
- the “third mission” of universities and research agencies (consultancy, support to local authorities, etc.)
Evaluation by whom:
- experts, peers
Evaluation of which units:
- organisations (departments, universities, schools)
- programmes, projects
- individuals (professors, researchers, students)
Evaluation when:
- ex-ante
- in-itinere
- ex-post
Policy objectives
Government budget appropriations for R&D (percentage), 2013

Socio-economic objective (NABS) | Italy | UK | France | Germany
Exploration and exploitation of the earth | 5.5 | 3.0 | 1.1 | 1.7
Environment | 2.7 | 2.8 | 1.9 | 2.8
Exploration and exploitation of space | 8.7 | 4.2 | 9.7 | 4.6
Transport, telecommunication and other infrastructures | 1.2 | 3.3 | 6.1 | 1.5
Energy | 3.8 | 2.4 | 6.7 | 5.2
Industrial production and technology | 11.7 | 3.3 | 1.6 | 12.6
Health | 9.6 | 22.1 | 7.6 | 5.0
Agriculture | 3.4 | 4.0 | 2.0 | 2.8
Education | 3.9 | 0.4 | - | 1.1
Culture, recreation, religion and mass media | 0.9 | 1.8 | - | 1.2
Political and social systems, structures and processes | 5.7 | 1.4 | 5.1 | 1.8
General advancement of knowledge: R&D financed from General University Funds (GUF) | 39.4 | 23.0 | 25.3 | 40.0
General advancement of knowledge: R&D financed from sources other than GUF | 2.6 | 12.9 | 19.8 | 17.1
Defence | 0.8 | 15.3 | 6.3 | 3.7
Total civil R&D appropriations | 99.2 | 84.7 | 93.7 | 96.3
Total R&D appropriations | 100.0 | 100.0 | 100.0 | 100.0
Million euro | 8,444.3 | 11,758 | 14,981 | 25,371

Source: EUROSTAT
Evaluation: a difficult task
Difficult to evaluate science
William Gladstone, then British Chancellor of the Exchequer (minister of finance), asked Michael Faraday about the practical value of electricity. Gladstone’s only comment was: “But, after all, what use is it?”
Faraday: “Why, sir, there is every probability that you will soon be able to tax it.”
[Portraits: Michael Faraday, William Gladstone]
Difficult to evaluate science: The case of physicists
Bruno Maksimovič Pontekorvo
“Physics is a single discipline, but unfortunately nowadays physicists belong to two different groups: the theoreticians and the experimentalists. If a theoretician does not possess an extraordinary ability, his work does not make sense…. For experimentalists, even ordinary people can do useful work….” (Enrico Fermi, 1931)
Evaluation experiences
In the UK
- Research Assessment Exercise (RAE)
- Research Excellence Framework (REF) (impact)
In Italy
Evaluation of the Quality of Research (VQR)
- Model: Research Assessment Exercise (RAE)
- Objective: evaluation of Areas, Research structures and Departments (not of individual researchers)
- Reference period: 2004-2010
- Report: 2014
- Actors: ANVUR; GEV (Evaluation Groups, 14 of them, with 450 experts involved plus referees); Research structures (universities, research agencies); Departments
- Subjects evaluated: researchers (university teachers and Public Research Agency researchers)
Evaluation of the Quality of Research by ANVUR
Researchers’ products to be evaluated:
- journal articles
- books and book chapters
- patents
- designs, exhibitions, software, manufactured items, prototypes, etc.
University teachers: 3 “products” over the period 2004-2010
Public Research Agencies researchers: 6 “products” over the period 2004-2010
Scores: from 1 (excellent) to -1 (missing)
Note: attention focuses basically on the following indicators.
Evaluation of the Quality of Research by ANVUR
Indicators linked to research (weights):
- quality (0.5)
- ability to attract resources (0.1)
- mobility (0.1)
- internationalisation (0.1)
- high-level education (0.1)
- own resources (0.05)
- improvement (0.05)
Evaluation of the Quality of Research by ANVUR
Indicators of the “third mission” (weights):
- fund raising (0.2)
- patents (0.1)
- spin-offs (0.1)
- incubators (0.1)
- consortia (0.1)
- archaeological sites (0.1)
- museums (0.1)
- other activities (0.2)
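Read as an algorithm, the two ANVUR lists above define weighted sums over indicator scores (each set of weights totals 1). Below is a minimal Python sketch of that arithmetic, assuming indicator values already normalised to [0, 1]; the dictionary keys and the example numbers are hypothetical illustrations, not ANVUR's actual computation.

    # Hypothetical sketch of a weighted composite score in the style of the
    # ANVUR research-indicator weights above. Indicator values are assumed
    # to be normalised to [0, 1]; the example numbers are invented.
    RESEARCH_WEIGHTS = {
        "quality": 0.5,
        "attracted_resources": 0.1,
        "mobility": 0.1,
        "internationalisation": 0.1,
        "high_level_education": 0.1,
        "own_resources": 0.05,
        "improvement": 0.05,
    }

    def composite_score(indicators, weights):
        """Weighted sum of normalised indicator values (each in [0, 1])."""
        return sum(w * indicators.get(name, 0.0) for name, w in weights.items())

    example = {"quality": 0.8, "attracted_resources": 0.6, "mobility": 0.4,
               "internationalisation": 0.5, "high_level_education": 0.7,
               "own_resources": 0.3, "improvement": 0.5}
    print(round(composite_score(example, RESEARCH_WEIGHTS), 3))  # 0.66

The same pattern would apply to the “third mission” weights, which likewise sum to 1.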
Multi-dimensional matrix of evaluation
Unit to be evaluated:
- Individual
- Research group
- Department
- Institution
- Research area

Objective:
- Resource allocation
- Improvement of performance
- Increase of multidisciplinarity
- Increase of regional involvement
- Promotion, hiring

Dimensions of output:
- Research productivity
- Impact on scientific community
- Innovation and social benefit
- Sustainability
- Research infrastructure

Bibliometric indicators:
- Publications
- Impact factor
- Impact of citations
- International collaborations
- Prestige of citations

Other indicators:
- Peer review
- Patents, licences, spin-offs
- Invitation to conferences
- External financing
- “Quality” of PhDs
The h-index (Jorge Eduardo Hirsch)
In 2005, the physicist Jorge Hirsch suggested a new index to measure the broad impact of an individual scientist’s work: the h-index.
A scientist has index h if h of his or her Np papers have at least h citations each, and the other (Np − h) papers have no more than h citations each.
In plain terms, a researcher has an h-index of 20 if he or she has published 20 articles receiving at least 20 citations each.
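The definition translates almost line for line into code; here is a minimal sketch in plain Python (the citation counts in the example are invented for illustration):

    def h_index(citations):
        """Largest h such that h of the papers have at least h citations each."""
        ranked = sorted(citations, reverse=True)  # most-cited papers first
        h = 0
        for rank, cites in enumerate(ranked, start=1):
            if cites >= rank:  # the paper at this rank still supports h = rank
                h = rank
            else:
                break
        return h

    print(h_index([50, 30, 22, 20, 20, 8, 4, 4, 1, 0]))  # 6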
Impact factor (Eugene Garfield)
The impact factor of a journal reflects the average number of citations received in a given year by the items (articles, reviews, proceedings, etc.) that the journal published in the two preceding years.
In plain terms, if a journal has an impact factor of 3 in 2008, the items it published in 2006 and 2007 received, on average, 3 citations each during 2008.
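The arithmetic behind this is a simple ratio. A worked sketch of the two-year calculation described above, with invented counts:

    def impact_factor(cites_in_year, citable_items_prev_two_years):
        """Two-year JIF: citations received in year Y by items published in
        years Y-1 and Y-2, divided by the number of those citable items."""
        return cites_in_year / citable_items_prev_two_years

    # 600 citations in 2008 to the 200 items published in 2006-2007 -> JIF = 3.0
    print(impact_factor(600, 200))  # 3.0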
Nobel laureates and bibliometrics (Higgs boson, 2013)
Peter Ware Higgs: 13 works, mostly in “minor” journals, h-index = 6
François Englert: 89 works, in both prestigious and minor journals, h-index = 10
W. S. Boyle: h-index = 7
G. E. Smith: h-index = 5
C. K. Kao: h-index = 1
T. Maskawa: h-index = 1
Y. Nambu: h-index = 17
Performance-Based research funding systems
“The rationale of performance funding is that funds should flow to institutions where performance is manifest: ‘performing’ institutions should receive more income than lesser performing institutions, which would provide performers with a competitive edge and would stimulate less performing institutions to perform. Output should be rewarded, not input.”Performance based research funding systems are national systems of ex-post university research output evaluation used to inform distribution of funding.
Herbst, 2007
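The rationale quoted above, rewarding output rather than input, can be illustrated with a toy proportional-allocation rule. This is a hypothetical sketch, not any country's actual funding formula; the institution names and scores are invented.

    def allocate(block_grant, scores):
        """Split a block grant in proportion to assessed performance scores."""
        total = sum(scores.values())
        return {name: round(block_grant * s / total, 2) for name, s in scores.items()}

    # A toy system: 100 (million) distributed across three institutions.
    print(allocate(100.0, {"Univ A": 8.0, "Univ B": 5.0, "Univ C": 2.0}))
    # {'Univ A': 53.33, 'Univ B': 33.33, 'Univ C': 13.33}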
Performance-Based research funding systems
Criteria:
- Research (teaching excluded)
- Evaluation ex-post (ex-ante excluded)
- Research output
- Research funding must depend on the results of the evaluation
- National system (internal university evaluations excluded)
Share of university funding dependent on “Performance-Based Research Funding Systems”
Country | Share (%) | Of what
Australia | 6 | Total revenue
Italy | 2 | Block grant
New Zealand | 10 | Block grant
Norway | 2 | Total funding
Slovak Republic | 15 | Total funding
UK | 25 | Research support
Hicks D., Research Policy (2012)
Performance-Based research funding systems
Share of university funding dependent on “Performance-Based Research Funding Systems”
Hicks D., Research Policy (2012)
“The distribution of university research funding is something of an illusion”
“It is the competition for prestige that creates powerful incentives within university systems”
“Performance-based research funding systems aim at excellence: they may compromise other important values such as equity and diversity”
Performance-Based research funding systems
Ranking of universities
Four major sources of ranking
- ARWU (Academic Ranking of World Universities, Shanghai Jiao Tong University)
- QS World University Rankings
- THE World University Rankings (Times Higher Education)
- US News & World Report (Best Global Universities)
Criteria selected as the key pillars of what makes a world-class university:
- Research
- Teaching
- Employability
- Internationalisation
- Facilities
- Social Responsibility
- Innovation
- Arts & Culture
- Inclusiveness
- Specialist Criteria
Source: TopUniversities (worldwide university rankings, guides & events)
Ranking of universities
Ranking of universities: the case of Italy
- ARWU (Shanghai Jiao Tong University): Bologna 173, Milano 186, Padova 188, Pisa 190, Sapienza 191
- QS World University Rankings: Bologna 182, Sapienza 202, Politecnico di Milano 229
- World University Ranking SA: Sapienza 95, Bologna 99, Pisa 184, Milano 193
- US News & World Report: Sapienza 139, Bologna 146, Padova 146, Milano 155
Evaluation is an expensive exercise
Rule of thumb: less than 1% of R&D budget devoted to its evaluation
Evaluation of the Quality of Research (VQR): 300 million euro (180,000 “products”); 182 million euro
Research Assessment Exercise (RAE): 540 million euro
Research Excellence Framework (REF): 1 million pounds (500 million)
Cost of evaluation: the saturation effect
(Chart: the saturation effect, a systematic loss. Source: Geuna and Martin)
San Francisco Declaration on Research Assessment
General recommendation: do not use journal-based metrics, such as Journal Impact Factors, as a surrogate measure of the quality of individual research articles, to assess an individual scientist’s contributions, or in hiring, promotion, or funding decisions.
San Francisco Declaration on Research Assessment
The Journal Impact Factor, as calculated by Thomson Reuters, was originally created as a tool to help librarians identify journals to purchase, not as a measure of the scientific quality of research in an article. With that in mind, it is critical to understand that the Journal Impact Factor has a number of well-documented deficiencies as a tool for research assessment. These limitations include:
- citation distributions within journals are highly skewed;
- the properties of the Journal Impact Factor are field-specific: it is a composite of multiple, highly diverse article types, including primary research papers and reviews;
- Journal Impact Factors can be manipulated (or “gamed”) by editorial policy; and
- data used to calculate the Journal Impact Factors are neither transparent nor openly available to the public.
Over recent years, the Journal Impact Factor (JIF) has become the most prominent indicator of a journal's standing, bringing intense pressure on journal editors to do what they can to increase it. What approaches do journal editors employ to maximise it? The editorial draws three conclusions. First, in the light of ever more devious ruses of editors, the JIF indicator has now lost most of its credibility. Second, where the rules are unclear or absent, the only way of determining whether particular editorial behaviour is appropriate is to expose it to public scrutiny. Third, editors who engage in dubious behaviour thereby risk forfeiting their authority to police misconduct among authors.
Editorial in Research Policy by Ben Martin (2015)
The Leiden manifesto on bibliometrics
The Leiden Manifesto for research metrics
“Data are increasingly used to govern science. Research evaluations that were once bespoke and performed by peers are now routine and reliant on metrics. The problem is that evaluation is now led by the data rather than by judgement. Metrics have proliferated: usually well intentioned, not always well informed, often ill applied. We risk damaging the system with the very tools designed to improve it, as evaluation is increasingly implemented by organizations without knowledge of, or advice on, good practice and interpretation.”
Changes in university life
The university is now at the mercy of:
- increasing bibliometric measurement
- quality standards
- blind refereeing (someone sees you but you do not see him)
- bibliometric medians
- journal classifications (A, B, C, …)
- opportunistic citing
- academic tourism
- administrative burden
- …
Interviews with Italian researchers (40-65 years old)
Main results:
A drastic change in researchers’ attitudes following the introduction of bibliometrics-based evaluation.
Bibliometrics-based evaluation exerts an extremely strong normative function on scientific practices, which deeply impacts the epistemic status of the disciplines.
The epistemic consequences of bibliometrics-based evaluation
T. Castellani, E. Pontecorvo, A. Valente, Epistemological consequences of bibliometrics: Insights from the scientific community, 2014
Results:
1. Bibliometrics-based evaluation criteria have changed the way scientists choose their research topics:
- choosing a fashionable theme
- placing the article in the tail of an important discovery (bandwagon effect)
- choosing short empirical papers
2. Haste
3. Interdisciplinary topics are hindered; bibliometric evaluation systems encourage researchers not to change topic during their career
4. Repetition of experiments is discouraged; only new results are considered interesting
T. Castellani, E. Pontecorvo, A. Valente, Epistemological consequences of bibliometrics: Insights from the scientific community, 2014
The epistemic consequences of bibliometrics-based evaluation
Against the ideology of evaluation: How ANVUR is killing the university
“ANVUR is much more than an administrative branch. It is the outcome of a cultural and political project aimed at reducing the range of alternatives and hampering pluralism.”
D. Borrelli, Contro l’ideologia della valutazione, 2015
A model of management borrowed from firms, based on the principles of competitiveness and customer satisfaction (the market).
The catchwords:
- competitiveness
- excellence
- meritocracy
The “evaluative state” as the “minimum state”, in which the government gives up its political responsibility, avoids democratic debate in search of consensus, and rests on the “automatic pilot” of techno-administrative control.
D. Borrelli, Contro l’ideologia della valutazione, 2015
Against the ideology of evaluation: How ANVUR is killing the university
Lessons from Research Evaluation
- Evaluation should enhance efficiency and effectiveness
- Pro-active evaluation vs punitive evaluation
- Evaluation is a difficult and expensive process
- Opportunistic behaviour
- Peer review vs bibliometrics
- NSE vs SSH (natural sciences and engineering vs social sciences and humanities)
- Competition vs cooperation among scientists
- The myth of excellence
- The split of the academic community (the good and the bad guys)
- The equilibrium among teaching, research and the third mission
- Bureaucratisation
- Evaluation in the economic cycle
Some concluding remarks
- S&T systems are under scrutiny
- Evaluation as a tool for the legitimation of R&D and higher education
- Evaluation has become a key policy instrument
- R&D evaluation is “easier” than other types of evaluation
- The ideology behind R&D evaluation (concentration in “excellent” institutions, or spreading?)
- Evaluation exercises have been heavily criticised from a methodological point of view
- Impact on the scientific community and on researchers’ behaviour (“when you measure a system, you change the system”)
- Evaluation is expensive
- Evaluation in a period of shrinking resources
- Evaluation is necessary … but too much evaluation is harmful
How to proceed with evaluation?
- Respect the internal logic and rules of the scientific community
- Pro-active evaluation, not punitive evaluation
- Keep the national S&T system anchored to the international system
- Reduce/avoid negative unintended effects (opportunism, etc.)
- Don’t ask science for what science can’t deliver (e.g. jobs, competitiveness)
- The inventor and the innovator
- Design evaluation in an efficient way (costs/benefits)
- Evaluation is like dynamite: handle with care