1Numeric

Developing a statistical framework for measuring the digitisation of Europe’s cultural heritage

Numeric

Phillip Ramsdale

The study was conducted for the European Commission by the Chartered Institute of Public Finance and Accountancy

2Numeric

Numeric: WHY? Long-term goals

• better identify the total European digitisation effort and progress;

• stimulate further digitisation by demonstrating the current progress;

• better inform stakeholders that have an interest or direct involvement in digitisation policies and funding.

3Numeric

Study objectives

The study was intended to develop statistical indicators for:

• digitisation costs, investments and funding sources;

• volume and growth of digitised resources, in relation to the analogue collections held by institutions;

• the characteristics of digitised outputs, including their formats and user access.

4Numeric

Study phasing

[Timeline chart: two twelve-month stages, each running May to April. Stage 1 - Development (2007-2008); Stage 2 - Implementation (2008-2009).]

Activities shown on the chart:

• Building networks, desk research, identifying data sources, defining sample frame, refining intended methodology

• Baseline statistical analysis, refinement of definitions

• Refinement based on feedback

• Data collection

• Data collection, analysis, interpretation, presentation, reporting, consultation, publication

• Promotion, sustainability actions

Milestones shown on the chart:

• START OF STUDY

• DESK RESEARCH CONCLUDES NEED FOR A SURVEY

• METHOD AND TEST SURVEY REVIEWED

• QUESTIONNAIRES AND SURVEY INSTRUMENTS PREPARED

• SURVEY CONDUCTED IN EACH COUNTRY

• PROPOSALS REPORTED TO EXPERTS

• WORKSHOP TO LAUNCH EU SURVEY

• DRAFT FINDINGS REVIEWED

• WORKSHOP AND FINAL REPORT

5Numeric

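Cultural heritage: A jumble of definitions!

Classification codes by Culture / Creative Industry (x = "part of")

Video, film and photography (of which: Photography)
  NACE 1.1 (EU): 22.32 x, 92.10, 92.72 x, 74.87 x, 74.81 x
  ISIC 3.1 (UN): 2230 x, 9211, 9212, 9249 x, 7499 x, 7494
  NAICS 2002 (North America): 334612 x, 5121, 56131 x, 7114 x, 54192

Music and the visual and performing arts (Sound recording and music publishing; Visual and Performing arts, including Festivals)
  NACE 1.1 (EU): 22.14, 22.31 x, 92.31, 92.32, 92.34 x, 92.72 x
  ISIC 3.1 (UN): 2213, 2230 x, 7499 x, 9214, 9219 x, 9249 x
  NAICS 2002 (North America): 5122, 334612 x, 7111, 7114 x, 7115 x, 7113, 7114 x, 56131 x

Radio and TV (Broadcasting)
  NACE 1.1 (EU): 92.20, 92.72 x
  ISIC 3.1 (UN): 9213, 9249 x
  NAICS 2002 (North America): 515, 516 x, 5175, 56131 x

Libraries (includes archives)
  NACE 1.1 (EU): 92.5
  ISIC 3.1 (UN): 9231
  NAICS 2002 (North America): 51912

Museums
  NACE 1.1 (EU): 92.5 x
  ISIC 3.1 (UN): 9232 x
  NAICS 2002 (North America): 71211

Historic and heritage sites
  NACE 1.1 (EU): 92.5 x
  ISIC 3.1 (UN): 9232 x
  NAICS 2002 (North America): 71212

Other heritage institutions
  NACE 1.1 (EU): 92.5
  ISIC 3.1 (UN): 9233 x
  NAICS 2002 (North America): 71219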

6Numeric

What objects are digitised?

Collective memory of print: books, journals, newspapers, for example

Images held by any institution

Museum objects

Archival documents

Audio-visual materials, such as films and broadcasts

Granularity becomes coarser as the classification is summarised.

7Numeric

The “Design”

Since no primary data existed for all domains, it was necessary to collect data directly from individual institutions.

Identify the appropriate institutions from which to collect the data.

Use a “Standard” questionnaire so that consistent definitions could be followed in each country.

8Numeric

The Method – put simply

• Identify those institutions holding collections that represent the significant part of the nation’s potential digital heritage.

• Survey a sample of these.

• Use the survey results to extrapolate the overall scale of digitisation activity and expenditure to all other relevant institutions.
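The inference step can be sketched as a simple grossing-up calculation. This is a minimal illustration, not the study's actual estimator: the institution types, counts and expenditure figures are hypothetical, and it assumes that responders are representative of the relevant institutions of the same type.

```python
from dataclasses import dataclass


@dataclass
class Stratum:
    """One type of institution (all figures are hypothetical)."""
    name: str
    relevant_institutions: int    # institutions judged relevant in this stratum
    responders: int               # institutions that returned the survey
    reported_expenditure: float   # total digitisation spend reported by responders


def gross_up(stratum: Stratum) -> float:
    """Scale the responders' total up to all relevant institutions in the stratum."""
    if stratum.responders == 0:
        raise ValueError(f"no responders in stratum {stratum.name!r}")
    mean_per_responder = stratum.reported_expenditure / stratum.responders
    return mean_per_responder * stratum.relevant_institutions


strata = [
    Stratum("Museums", relevant_institutions=200, responders=50,
            reported_expenditure=1_000_000.0),
    Stratum("Libraries", relevant_institutions=300, responders=100,
            reported_expenditure=900_000.0),
]

estimated_total = sum(gross_up(s) for s in strata)
print(f"Estimated expenditure across relevant institutions: {estimated_total:,.0f}")
```

Grossing up within each type, rather than across the whole sample at once, keeps a well-represented type from dominating the estimate for the others.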

9Numeric

The “Process”

Establish the number of institutions in each country. (All types / domains)

Identify the “Relevant” institutions.

Draw a representative sample.

Introduce the survey questionnaire.

Chase the response.

Check the responders’ data.

Scrutinise the survey results.

Review and refine method for the future.
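The "draw a representative sample" step could be done in several ways. The sketch below assumes proportionate stratified sampling by institution type, which is one common choice and not necessarily the one used in the study; the register of institutions and the 25% fraction are hypothetical.

```python
import random
from collections import defaultdict


def draw_stratified_sample(institutions, fraction, seed=0):
    """Draw the same fraction from every institution type (at least one each)."""
    by_type = defaultdict(list)
    for name, inst_type in institutions:
        by_type[inst_type].append(name)

    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    sample = {}
    for inst_type, names in by_type.items():
        k = max(1, round(len(names) * fraction))
        sample[inst_type] = sorted(rng.sample(names, k))
    return sample


# Hypothetical register of relevant institutions: (name, type)
register = (
    [(f"Museum {i}", "Museums") for i in range(1, 41)]
    + [(f"Library {i}", "Libraries") for i in range(1, 21)]
    + [(f"Archive {i}", "Archives") for i in range(1, 9)]
)

sample = draw_stratified_sample(register, fraction=0.25)
for inst_type, names in sample.items():
    print(inst_type, len(names))
```

Sampling each type separately guarantees that small domains, such as archives in this example, are not left out of the sample by chance.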

10Numeric

The “National” approaches provided for translations of the questionnaire into:

• Czech
• Dutch
• English
• Estonian
• French
• German
• Hungarian
• Latvian
• Lithuanian
• Polish
• Portuguese
• Romanian
• Slovenian
• Spanish

11Numeric

Designing the “TARGET”

[Diagram: three nested groups]

ALL INSTITUTIONS: Archives, Broadcasters, Film Institutes, Museums, Libraries, Heritage Agencies

RELEVANT INSTITUTIONS: Digitisation of collections will significantly enhance access to the country's cultural heritage

SAMPLE: 1,539 (> ¼ of relevant)

12Numeric

The “Relevant” institution

“Institutions possessing collections that would be of significant value to the digitised heritage of the nation”.

At what point does “Significant” become “Insignificant”?

The guidance spelt out some examples, but assumed National Co-ordinators would be better placed to decide.

13Numeric

Response aligned with distribution of institutions

[Bar chart: Number of relevant institutions vs. survey responders, by type of institution (Archives / Record Offices, Film Institutes, Museums, Libraries, Other Institutions). Data labels on the chart: 133, 41, 332, 222, 60, 848, 2754, 1932, 10, 9.]

14Numeric

Interpretation

Weighting – flexing the results to eliminate bias arising from the pattern of response.

Summary statistics – choosing the appropriate measure to describe all institutions that are so diverse in their purpose.
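One way to picture the weighting step is post-stratification: each institution type's result is weighted by its share of relevant institutions rather than by its share of responders. The sketch below is an assumption about how such a correction could work, not the study's documented procedure, and all figures are hypothetical.

```python
def poststratified_share(strata):
    """Weight each stratum's observed share by its share of relevant institutions."""
    total_relevant = sum(s["relevant"] for s in strata)
    return sum(
        (s["relevant"] / total_relevant) * (s["yes"] / s["responders"])
        for s in strata
    )


def unweighted_share(strata):
    """Raw share among responders, ignoring how response varied by stratum."""
    return sum(s["yes"] for s in strata) / sum(s["responders"] for s in strata)


# Hypothetical figures: libraries respond more readily than museums,
# and "yes" counts responders that have a digitisation plan.
strata = [
    {"type": "Museums", "relevant": 600, "responders": 60, "yes": 30},
    {"type": "Libraries", "relevant": 400, "responders": 140, "yes": 112},
]

print(f"unweighted: {unweighted_share(strata):.2f}")
print(f"weighted:   {poststratified_share(strata):.2f}")
```

In this example the unweighted figure is 0.71 and the weighted figure is 0.62, because the over-represented libraries no longer pull the overall result upwards.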

15Numeric

Choice of statistical measures

A: Mean, €1.41. Arithmetic mean (sum of values divided by number of values)

B: Mode, €1.00. Most frequently occurring value in the distribution

C: Median, €0.69. The value in the middle of the ranked distribution

[Chart: example of cost distribution for digitising text combined with images on the same page; axis from €0 to €12, with A, B and C marked.]
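The three measures can be compared on a small skewed data set using Python's standard `statistics` module. The unit costs below are hypothetical and are not the survey's figures; they only illustrate how a few expensive projects separate the measures.

```python
import statistics

# Hypothetical unit costs in euros per page; a few expensive projects skew the data.
unit_costs = [0.2, 0.3, 0.5, 0.5, 0.5, 0.8, 1.0, 1.5, 3.0, 9.0]

mean_cost = statistics.mean(unit_costs)      # sum of values / number of values
median_cost = statistics.median(unit_costs)  # middle of the ranked values
mode_cost = statistics.mode(unit_costs)      # most frequently occurring value

print(f"mean   = {mean_cost:.2f}")
print(f"median = {median_cost:.2f}")
print(f"mode   = {mode_cost:.2f}")
```

Here the mean (1.73) sits well above the median (0.65) and the mode (0.50), which is why the choice of measure matters when institutions are so diverse.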

16Numeric

Survey errors

[Chart: % with on-line catalogues, all institutions, showing the range of error for 90% confidence.]
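A range of error of this kind can be approximated for a surveyed percentage as follows. This sketch assumes a simple random sample and the normal approximation for a proportion; the slide does not state which method the study used, and the 50% result from 100 responders is hypothetical.

```python
import math
from statistics import NormalDist


def proportion_margin(p_hat: float, n: int, confidence: float = 0.90) -> float:
    """Half-width of a normal-approximation confidence interval for a proportion."""
    if not 0.0 <= p_hat <= 1.0 or n <= 0:
        raise ValueError("p_hat must lie in [0, 1] and n must be positive")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.645 for 90%
    return z * math.sqrt(p_hat * (1.0 - p_hat) / n)


# Hypothetical survey result: 50% of 100 responders have an on-line catalogue.
p_hat, n = 0.5, 100
margin = proportion_margin(p_hat, n)
print(f"{p_hat:.0%} +/- {margin:.1%} at 90% confidence")
```

The sketch ignores weighting and any finite-population correction, both of which would change the width of the interval.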

17Numeric

Sources of data

• Pathfinder: 7%

• Numeric: 77%

• National Surveys: 16%

18Numeric

A few words about quality

More than X% of responders reported on:

90%: institutional staff; possession of digitisation plans.

80%: digitisation staff; on-line catalogues; undertakers of digitisation; access policies to digitised materials.

70%: institutional budget; sources of funding for digitisation; progress towards digitisation.

19Numeric

A few more words about quality

More than X% of responders reported on:

60%: digitisation budget; availability of digitised material on the internet.

Less than X% of responders reported on:

40%: number of users accessing digitised materials.
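Figures like these are item response rates: the share of responders who answered a given question. A minimal sketch of the calculation follows; the field names and returns are hypothetical and are not taken from the questionnaire.

```python
def item_response_rates(responses, questions):
    """Share of responders who gave a non-empty answer to each question."""
    n = len(responses)
    if n == 0:
        raise ValueError("no responses supplied")
    return {
        q: sum(1 for r in responses if r.get(q) is not None) / n
        for q in questions
    }


# Hypothetical returns; None marks a question left blank.
responses = [
    {"institutional_staff": 120, "digitisation_budget": 50_000, "user_visits": None},
    {"institutional_staff": 35, "digitisation_budget": None, "user_visits": None},
    {"institutional_staff": 8, "digitisation_budget": 4_000, "user_visits": 12_500},
    {"institutional_staff": 60, "digitisation_budget": None, "user_visits": None},
]
questions = ["institutional_staff", "digitisation_budget", "user_visits"]

rates = item_response_rates(responses, questions)
for question, rate in rates.items():
    print(f"{question}: {rate:.0%}")
```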

20Numeric

Statistics summarised cover:

• budgets
• staff time
• formal plans
• funding sources
• contractors
• progress
• formats
• unit costs
• access policies
• internet availability
• cost of plans

Analysis by type of institution, and type of materials held in collections

21Numeric

Problem definitions

• User access to digitised materials

• Archive records

• Museum digitisation (catalogues / materials)

• Newspapers

• Monuments

22Numeric

“Relevant” institutions

• Archives – Government documentation / records offices.

• Museums – Collections of national importance.

• Libraries – National; University founded before 1900; Public libraries acting as the main reference centre for regions containing at least 5% of population (i.e. wider than CENL).

• Audio-visual – Members of ACE / FIAF and National Broadcaster.

• Significant others!

??

23Numeric

Summary issues (1)

• Clarify the definition of Relevant institutions

• Encourage universal approaches to measuring the analogue collections

• Develop incentives to respond to surveys

• Concentrate the questionnaire on “hard facts”, and ...

• Streamline the questionnaire

24Numeric

The “measures” – 7 Questions

Inputs
1. Cost in previous 12 months
2. Staff devoted to digitisation projects
3. Cost of planned digitisation 12/+ months

Outputs
1. Digitised pages / hours / etc last year
2. Same assumed in plans

Outcomes
1. On/off-line user visits in previous 12 months
2. Proportion of such visits that were “Free”
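The seven questions map naturally onto a small record per institution, from which indicators such as unit cost can be derived. The sketch below is illustrative only: the field names are hypothetical, and treating unit cost as last year's cost divided by last year's output is an assumption, not a definition taken from the study.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DigitisationReturn:
    """One institution's answers to the seven questions (field names are illustrative)."""
    # Inputs
    cost_last_12_months: float
    digitisation_staff: float
    planned_cost: float
    # Outputs
    units_digitised_last_year: float
    units_planned: float
    # Outcomes
    user_visits_last_12_months: int
    free_visit_share: float  # proportion between 0 and 1

    def unit_cost(self) -> Optional[float]:
        """Cost per digitised unit last year, or None if nothing was digitised."""
        if self.units_digitised_last_year <= 0:
            return None
        return self.cost_last_12_months / self.units_digitised_last_year

    def free_visits(self) -> float:
        """Estimated number of visits that were free of charge."""
        return self.user_visits_last_12_months * self.free_visit_share


# Hypothetical return from a single institution.
example = DigitisationReturn(
    cost_last_12_months=20_000.0,
    digitisation_staff=1.5,
    planned_cost=30_000.0,
    units_digitised_last_year=40_000,
    units_planned=60_000,
    user_visits_last_12_months=10_000,
    free_visit_share=0.75,
)
print(example.unit_cost(), example.free_visits())
```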

25Numeric

Summary issues (2)

• National planning returns to encourage the collection of data

• Provide a checklist of digitisation processes

• Review definitions

• Respect the considerable differences between domains, but consistently cover them all

26Numeric

Summary: Main points

• Short questionnaire for high-level national summaries – Benchmarking for more specific investigations.

• Address the definition of “Relevant” institutions.

• Review the use of the statistics on a regular basis.