R&D evaluation: Opportunities and threats
Giorgio Sirilli
Research Director
Campus Luigi Einaudi
Outline of the presentation
- Definitions of research and evaluation
- Some data
- Evaluation in the context of S&T policy
- Some experiences of R&D evaluation
- Lessons learned
- Concluding remarks
Definitions
Evaluation
Evaluation may be defined as an objective process aimed at the critical analysis of the relevance, efficiency, and effectiveness of policies, programmes, projects, institutions, groups and individual researchers in the pursuance of the stated objectives.
Evaluation consists of a set of coordinated activities of a comparative nature, based on formalised methods and techniques and on codified procedures, aimed at forming an assessment of intentional interventions with reference to their implementation and their effectiveness.
Internal/external
The first evaluation (Genesis)
In the beginning God created the heaven and the earth. And God saw everything that He had made. “Behold”, God said, “it is very good”. And the evening and morning were the sixth day.
And on the seventh day God rested from all His work. His Archangel came then unto Him, asking: “God, how do you know that what You have created is ‘very good’? What are Your criteria? On what data do You base Your judgement? Aren’t You a little close to the situation to make a fair and unbiased evaluation?”
God thought about these questions all that day, and His rest was greatly disturbed.
On the eighth day, God said, “Lucifer, go to hell!”
(From Halcolm’s “The Real Story of Paradise Lost”)
Research and Development (R&D): definition
Research and experimental development (R&D) comprise creative work undertaken on a systematic basis in order to increase the stock of knowledge, including knowledge of man, culture and society, and the use of knowledge to devise new applications
- Basic research
- Applied research
- Experimental development
OECD Frascati Manual
R&D performing organisations
Higher education
Government
Business enterprises
Private non-profit institutions
The knowledge bundle in higher education and government
The knowledge institutions
Higher education:
- teaching
- research
- “third mission”
Research agencies:
- research
- problem solving
- management
Some data
R&D resources
(Chart; Italy highlighted. Source: OECD Science, Technology and Industry Scoreboard, 2015)
R&D expenditure/GDP (percentage), 2013
(Chart. Source: OECD Science, Technology and Industry Scoreboard, 2015)
The R&D performing sectors
R&D expenditure by performing sectors, 2013
The context: S&T policy
Vannevar Bush, “Science, the Endless Frontier”
1945
“Science, the Endless Frontier”
Problems to be addressed and to be solved in the US through science:
- defence
- health
Solution: science policy
National Science Foundation
The neo-conservative wave of the 1980s
“All areas of public expenditure should demonstrate ‘value for money’.”
Thatcher’s three Es:
- economy
- efficiency
- effectiveness
The new catchwords
- New public management
- Value for money
- Accountability
- Relevance
Methodology
Questions
Why evaluate?
How to evaluate?
What are the results of the exercise?
Why do we need evaluation?
Governments need tools to help determine:
- how much to invest in R&D
- in which areas to invest
- which institutions/organisations to finance
Types of decisions in research policy
- Distribution between sciences (e.g. physics, social sciences)
- Distribution between specialties within sciences (e.g. high-energy physics, optical physics)
- Distribution between different types of activity (e.g. university research, postgraduates, central labs)
- Distribution between centres, groups, individuals
Scope and object of evaluation
Type of research, e.g.:
- academic research vs targeted research
- international big-science programmes
Level and object of the evaluation:
- individual researcher
- research group
- project
- programme
- whole discipline
Criteria for evaluation
Criteria vary according to the scope and purpose of the evaluation; they range from criteria for identifying the quality/impact of research to criteria for identifying value for money.
Four main aspects are often distinguished:
- quantity
- quality
- impact
- utility
Criteria can be:
- internal: likely impact on the advance of knowledge
- external: likely impact on other S&T fields, the economy and society
Research evaluation
Evaluation of what:
- research
- education
- the “third mission” of universities and research agencies (consultancy, support to local authorities, etc.)
Evaluation by whom:
- experts, peers
Evaluation of which units:
- organisations (departments, universities, schools)
- programmes, projects
- individuals (professors, researchers, students)
Evaluation when:
- ex-ante
- in-itinere
- ex-post
Policy objectives
Government budget appropriations for R&D (percentage), 2013

Socio-economic objective (NABS) | Italy | UK | France | Germany
Exploration and exploitation of the earth | 5.5 | 3.0 | 1.1 | 1.7
Environment | 2.7 | 2.8 | 1.9 | 2.8
Exploration and exploitation of space | 8.7 | 4.2 | 9.7 | 4.6
Transport, telecommunication and other infrastructures | 1.2 | 3.3 | 6.1 | 1.5
Energy | 3.8 | 2.4 | 6.7 | 5.2
Industrial production and technology | 11.7 | 3.3 | 1.6 | 12.6
Health | 9.6 | 22.1 | 7.6 | 5.0
Agriculture | 3.4 | 4.0 | 2.0 | 2.8
Education | 3.9 | 0.4 | - | 1.1
Culture, recreation, religion and mass media | 0.9 | 1.8 | - | 1.2
Political and social systems, structures and processes | 5.7 | 1.4 | 5.1 | 1.8
General advancement of knowledge: R&D financed from General University Funds (GUF) | 39.4 | 23.0 | 25.3 | 40.0
General advancement of knowledge: R&D financed from sources other than GUF | 2.6 | 12.9 | 19.8 | 17.1
Defence | 0.8 | 15.3 | 6.3 | 3.7
Total civil R&D appropriations | 99.2 | 84.7 | 93.7 | 96.3
Total R&D appropriations | 100.0 | 100.0 | 100.0 | 100.0
Million euro | 8,444.3 | 11,758 | 14,981 | 25,371

Source: EUROSTAT
Evaluation: a difficult task
Difficult to evaluate science
William Gladstone, then British Chancellor of the Exchequer (minister of finance), asked Michael Faraday about the practical value of electricity. Gladstone’s only comment was: “But, after all, what use is it?”
Faraday: “Why, sir, there is every probability that you will soon be able to tax it.”
[Portraits: Michael Faraday, William Gladstone]
Difficult to evaluate science: The case of physicists
Bruno Maksimovič Pontekorvo
“Physics is a single discipline, but unfortunately nowadays physicists belong to two different groups: the theoreticians and the experimentalists. If a theoretician does not possess an extraordinary ability, his work does not make sense…. For experimentalists, even ordinary people can do useful work….” (Enrico Fermi, 1931)
Evaluation experiences
In the UK
- Research Assessment Exercise (RAE)
- Research Excellence Framework (REF) (impact)
In Italy
Evaluation of the Quality of Research (VQR)
- Model: Research Assessment Exercise (RAE)
- Objective: evaluation of Areas, Research structures and Departments (not of individual researchers)
- Reference period: 2004-2010
- Report: 2014
- Actors: ANVUR; GEV (Evaluation Groups, 14 of them, with 450 experts involved plus referees); Research structures (universities, research agencies); Departments
- Subjects evaluated: researchers (university teachers and Public Research Agency researchers)
Evaluation of the Quality of Research by ANVUR
Researchers’ products to be evaluated:
- journal articles
- books and book chapters
- patents
- designs, exhibitions, software, manufactured items, prototypes, etc.
University teachers: 3 “products” over the period 2004-2010
Public Research Agencies researchers: 6 “products” over the period 2004-2010
Scores: from 1 (excellent) to -1 (missing)
Note: attention focuses basically on the following indicators.
Evaluation of the Quality of Research by ANVUR
Indicators linked to research (weights):
- quality (0.5)
- ability to attract resources (0.1)
- mobility (0.1)
- internationalisation (0.1)
- high-level education (0.1)
- own resources (0.05)
- improvement (0.05)
Evaluation of the Quality of Research by ANVUR
Indicators of the “third mission” (weights):
- fund raising (0.2)
- patents (0.1)
- spin-offs (0.1)
- incubators (0.1)
- consortia (0.1)
- archaeological sites (0.1)
- museums (0.1)
- other activities (0.2)
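Read as an algorithm, the two ANVUR lists above define weighted sums over indicator scores (each set of weights totals 1). Below is a minimal Python sketch of that arithmetic, assuming indicator values already normalised to [0, 1]; the dictionary keys and the example numbers are hypothetical illustrations, not ANVUR's actual computation.

    # Hypothetical sketch of a weighted composite score in the style of the
    # ANVUR research-indicator weights above. Indicator values are assumed
    # to be normalised to [0, 1]; the example numbers are invented.
    RESEARCH_WEIGHTS = {
        "quality": 0.5,
        "attracted_resources": 0.1,
        "mobility": 0.1,
        "internationalisation": 0.1,
        "high_level_education": 0.1,
        "own_resources": 0.05,
        "improvement": 0.05,
    }

    def composite_score(indicators, weights):
        """Weighted sum of normalised indicator values (each in [0, 1])."""
        return sum(w * indicators.get(name, 0.0) for name, w in weights.items())

    example = {"quality": 0.8, "attracted_resources": 0.6, "mobility": 0.4,
               "internationalisation": 0.5, "high_level_education": 0.7,
               "own_resources": 0.3, "improvement": 0.5}
    print(round(composite_score(example, RESEARCH_WEIGHTS), 3))  # 0.66

The same pattern would apply to the “third mission” weights, which likewise sum to 1.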
Multi-dimensional matrix of evaluation
Unit to be evaluated:
- Individual
- Research group
- Department
- Institution
- Research area

Objective:
- Resource allocation
- Improvement of performance
- Increase of multidisciplinarity
- Increase of regional involvement
- Promotion, hiring

Dimensions of output:
- Research productivity
- Impact on scientific community
- Innovation and social benefit
- Sustainability
- Research infrastructure

Bibliometric indicators:
- Publications
- Impact factor
- Impact of citations
- International collaborations
- Prestige of citations

Other indicators:
- Peer review
- Patents, licences, spin-offs
- Invitation to conferences
- External financing
- “Quality” of PhDs
The h-index (Jorge Eduardo Hirsch)
In 2005, the physicist Jorge Hirsch suggested a new index to measure the broad impact of an individual scientist’s work: the h-index.
A scientist has index h if h of his or her Np papers have at least h citations each, and the other (Np − h) papers have no more than h citations each.
In plain terms, a researcher has an h-index of 20 if he or she has published 20 articles receiving at least 20 citations each.
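The definition translates almost line for line into code; here is a minimal sketch in plain Python (the citation counts in the example are invented for illustration):

    def h_index(citations):
        """Largest h such that h of the papers have at least h citations each."""
        ranked = sorted(citations, reverse=True)  # most-cited papers first
        h = 0
        for rank, cites in enumerate(ranked, start=1):
            if cites >= rank:  # the paper at this rank still supports h = rank
                h = rank
            else:
                break
        return h

    print(h_index([50, 30, 22, 20, 20, 8, 4, 4, 1, 0]))  # 6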
Impact factor (Eugene Garfield)
The impact factor of a journal reflects the average number of citations received in a given year by the items (articles, reviews, proceedings, etc.) that the journal published in the two preceding years.
In plain terms, if a journal has an impact factor of 3 in 2008, the items it published in 2006 and 2007 received, on average, 3 citations each during 2008.
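The arithmetic behind this is a simple ratio. A worked sketch of the two-year calculation described above, with invented counts:

    def impact_factor(cites_in_year, citable_items_prev_two_years):
        """Two-year JIF: citations received in year Y by items published in
        years Y-1 and Y-2, divided by the number of those citable items."""
        return cites_in_year / citable_items_prev_two_years

    # 600 citations in 2008 to the 200 items published in 2006-2007 -> JIF = 3.0
    print(impact_factor(600, 200))  # 3.0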
Nobel laureates and bibliometrics (Higgs boson, 2013)
Peter Ware Higgs: 13 works, mostly in “minor” journals, h-index = 6
François Englert: 89 works, in both prestigious and minor journals, h-index = 10
W. S. Boyle: h-index = 7
G. E. Smith: h-index = 5
C. K. Kao: h-index = 1
T. Maskawa: h-index = 1
Y. Nambu: h-index = 17
Performance-Based research funding systems
“The rationale of performance funding is that funds should flow to institutions where performance is manifest: ‘performing’ institutions should receive more income than lesser performing institutions, which would provide performers with a competitive edge and would stimulate less performing institutions to perform. Output should be rewarded, not input.”Performance based research funding systems are national systems of ex-post university research output evaluation used to inform distribution of funding.
Herbst, 2007
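The rationale quoted above, rewarding output rather than input, can be illustrated with a toy proportional-allocation rule. This is a hypothetical sketch, not any country's actual funding formula; the institution names and scores are invented.

    def allocate(block_grant, scores):
        """Split a block grant in proportion to assessed performance scores."""
        total = sum(scores.values())
        return {name: round(block_grant * s / total, 2) for name, s in scores.items()}

    # A toy system: 100 (million) distributed across three institutions.
    print(allocate(100.0, {"Univ A": 8.0, "Univ B": 5.0, "Univ C": 2.0}))
    # {'Univ A': 53.33, 'Univ B': 33.33, 'Univ C': 13.33}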
Performance-Based research funding systems
Criteria:
- Research (teaching excluded)
- Evaluation ex-post (ex-ante excluded)
- Research output
- Research funding must depend on the results of the evaluation
- National system (internal university evaluations excluded)
Share of university funding dependent on “Performance-Based Research Funding Systems”
Country | Share (%) | Of what
Australia | 6 | Total revenue
Italy | 2 | Block grant
New Zealand | 10 | Block grant
Norway | 2 | Total funding
Slovak Republic | 15 | Total funding
UK | 25 | Research support
Hicks D., Research Policy (2012)
Performance-Based research funding systems
Share of university funding dependent on “Performance-Based Research Funding Systems”
Hicks D., Research Policy (2012)
“The distribution of university research funding is something of an illusion”
“It is the competition for prestige that creates powerful incentives within university systems”
“Performance-based research funding systems aim at excellence: they may compromise other important values such as equity and diversity”
Performance-Based research funding systems
Ranking of universities
Four major sources of ranking
- ARWU (Academic Ranking of World Universities, Shanghai Jiao Tong University)
- QS World University Rankings
- THE World University Rankings (Times Higher Education)
- US News & World Report (Best Global Universities)
Criteria selected as the key pillars of what makes a world-class university:
- Research
- Teaching
- Employability
- Internationalisation
- Facilities
- Social Responsibility
- Innovation
- Arts & Culture
- Inclusiveness
- Specialist Criteria
Source: TopUniversities (worldwide university rankings, guides & events)
Ranking of universities
Ranking of universities: the case of Italy
- ARWU (Shanghai Jiao Tong University): Bologna 173, Milano 186, Padova 188, Pisa 190, Sapienza 191
- QS World University Rankings: Bologna 182, Sapienza 202, Politecnico di Milano 229
- World University Ranking SA: Sapienza 95, Bologna 99, Pisa 184, Milano 193
- US News & World Report: Sapienza 139, Bologna 146, Padova 146, Milano 155
Evaluation is an expensive exercise
Rule of thumb: less than 1% of R&D budget devoted to its evaluation
Evaluation of the Quality of Research (VQR): 300 million euro (180,000 “products”); 182 million euro
Research Assessment Exercise (RAE): 540 million euro
Research Excellence Framework (REF): 1 million pounds (500 million)
Cost of evaluation: the saturation effect
(Chart: the saturation effect, a systematic loss. Source: Geuna and Martin)
San Francisco Declaration on Research Assessment
General recommendation: do not use journal-based metrics, such as Journal Impact Factors, as a surrogate measure of the quality of individual research articles, to assess an individual scientist’s contributions, or in hiring, promotion, or funding decisions.
San Francisco Declaration on Research Assessment
The Journal Impact Factor, as calculated by Thomson Reuters, was originally created as a tool to help librarians identify journals to purchase, not as a measure of the scientific quality of research in an article. With that in mind, it is critical to understand that the Journal Impact Factor has a number of well-documented deficiencies as a tool for research assessment. These limitations include:
- citation distributions within journals are highly skewed;
- the properties of the Journal Impact Factor are field-specific: it is a composite of multiple, highly diverse article types, including primary research papers and reviews;
- Journal Impact Factors can be manipulated (or “gamed”) by editorial policy; and
- data used to calculate the Journal Impact Factors are neither transparent nor openly available to the public.
Over recent years, the Journal Impact Factor (JIF) has become the most prominent indicator of a journal's standing, bringing intense pressure on journal editors to do what they can to increase it. What approaches do journal editors employ to maximise it? The editorial draws three conclusions. First, in the light of ever more devious ruses of editors, the JIF indicator has now lost most of its credibility. Second, where the rules are unclear or absent, the only way of determining whether particular editorial behaviour is appropriate is to expose it to public scrutiny. Third, editors who engage in dubious behaviour thereby risk forfeiting their authority to police misconduct among authors.
Editorial in Research Policy by Ben Martin (2015)
The Leiden manifesto on bibliometrics
The Leiden Manifesto for research metrics
“Data are increasingly used to govern science. Research evaluations that were once bespoke and performed by peers are now routine and reliant on metrics. The problem is that evaluation is now led by the data rather than by judgement. Metrics have proliferated: usually well intentioned, not always well informed, often ill applied. We risk damaging the system with the very tools designed to improve it, as evaluation is increasingly implemented by organizations without knowledge of, or advice on, good practice and interpretation.”
Changes in university life
The university is now at the mercy of:
- increasing bibliometric measurement
- quality standards
- blind refereeing (someone sees you but you do not see him)
- bibliometric medians
- journal classifications (A, B, C, …)
- opportunistic citing
- academic tourism
- administrative burden
- …
Interviews with Italian researchers (40-65 years old)
Main results:
A drastic change in researchers’ attitudes following the introduction of bibliometrics-based evaluation.
Bibliometrics-based evaluation exerts an extremely strong normative function on scientific practices, which deeply impacts the epistemic status of the disciplines.
The epistemic consequences of bibliometrics-based evaluation
T. Castellani, E. Pontecorvo, A. Valente, Epistemological consequences of bibliometrics: Insights from the scientific community, 2014
Results:
1. Bibliometrics-based evaluation criteria have changed the way scientists choose their research topics:
- choosing a fashionable theme
- placing the article in the tail of an important discovery (bandwagon effect)
- choosing short empirical papers
2. Haste
3. Interdisciplinary topics are hindered; bibliometric evaluation systems encourage researchers not to change topic during their career
4. Repetition of experiments is discouraged; only new results are considered interesting
T. Castellani, E. Pontecorvo, A. Valente, Epistemological consequences of bibliometrics: Insights from the scientific community, 2014
The epistemic consequences of bibliometrics-based evaluation
Against the ideology of evaluation: How ANVUR is killing the university
“ANVUR is much more than an administrative branch. It is the outcome of a cultural and political project aimed at reducing the range of alternatives and hampering pluralism.”
D. Borrelli, Contro l’ideologia della valutazione, 2015
A model of management borrowed from firms, based on the principles of competitiveness and customer satisfaction (the market).
The catchwords:
- competitiveness
- excellence
- meritocracy
The “evaluative state” as the “minimum state”, in which the government gives up its political responsibility, avoids democratic debate in search of consensus, and rests on the “automatic pilot” of techno-administrative control.
D. Borrelli, Contro l’ideologia della valutazione, 2015
Against the ideology of evaluation: How ANVUR is killing the university
Lessons from Research Evaluation
- Evaluation should enhance efficiency and effectiveness
- Pro-active evaluation vs punitive evaluation
- Evaluation is a difficult and expensive process
- Opportunistic behaviour
- Peer review vs bibliometrics
- NSE vs SSH (natural sciences and engineering vs social sciences and humanities)
- Competition vs cooperation among scientists
- The myth of excellence
- The split of the academic community (the good and the bad guys)
- The equilibrium among teaching, research and the third mission
- Bureaucratisation
- Evaluation in the economic cycle
Some concluding remarks
- S&T systems are under scrutiny
- Evaluation as a tool for the legitimation of R&D and higher education
- Evaluation has become a key policy instrument
- R&D evaluation is “easier” than other types of evaluation
- The ideology behind R&D evaluation (concentration in “excellent” institutions, or spreading?)
- Evaluation exercises have been heavily criticised from a methodological point of view
- Impact on the scientific community and on researchers’ behaviour (“when you measure a system, you change the system”)
- Evaluation is expensive
- Evaluation in a period of shrinking resources
- Evaluation is necessary … but too much evaluation is harmful
How to proceed with evaluation?
- Respect the internal logic and rules of the scientific community
- Pro-active evaluation, not punitive evaluation
- Keep the national S&T system anchored to the international system
- Reduce/avoid negative unintended effects (opportunism, etc.)
- Don’t ask science for what science can’t deliver (e.g. jobs, competitiveness)
- The inventor and the innovator
- Design evaluation in an efficient way (costs/benefits)
- Evaluation is like dynamite: handle with care