Digital and institutional repositories: emerging architectures A short workshop moderated by Steve Hitchcock, School of Electronics and Computer Science (ECS), Southampton University On 21 and 22 June, 2006, at the SCONUL Conference 2006, Newcastle-upon-Tyne


Page 1

Digital and institutional repositories: emerging architectures

A short workshop moderated by Steve Hitchcock, School of Electronics and Computer Science (ECS), Southampton University

On 21 and 22 June, 2006, at the SCONUL Conference 2006, Newcastle-upon-Tyne

Page 2

The workshop brief

The question for repositories is not which software you should choose, but which applications and services you want to support. With growing numbers of institutional repositories and increasing commitment within institutions, alongside an active and growing programme of development more broadly in digital repositories, notably that sponsored by JISC in the UK, this is a picture that could change significantly in the next few years. This workshop will provide the opportunity to explore and discuss, from institutional perspectives, some of these developments, with a view to anticipating emerging architectures that could support expanded repository capabilities.

For a brief report and follow-up, see this blog entry

http://www.eprints.org/community/blog/index.php?/archives/89-Emerging-IR-architectures-investigation-by-workshop.html

Page 3

What is an IR? Lynch 2003

“a university-based institutional repository is a set of services that a university offers to the members of its community for the management and dissemination of digital materials created by the institution and its community members. It is most essentially an organizational commitment to the stewardship of these digital materials, including long-term preservation where appropriate, as well as organization and access or distribution. ….. An institutional repository is not simply a fixed set of software and hardware.”

Cliff Lynch, 2003 http://www.arl.org/newsltr/226/ir.htm

Page 4

What the workshop is NOT about

Open access

Technical interoperability (OAI-PMH, METS, Z39.50, SRW, AJAX, Web 2.0, etc., see Augmenting interoperability across scholarly repositories, New York, April 2006 http://msc.mellon.org/Meetings/Interop/ )

National repository services (Linking UK Repositories)

Departmental repositories (e.g. Caltech)

Consortium repositories (e.g. White Rose)

Page 5

Comments on following schematics

Commentary slide added post-workshop

The following three slides illustrate the sort of components that might be found in an IR:

1 Schematic, Liz Lyon, UKOLN, for eBank UK project

2 OCLC 2003 environmental scan

These are two network-based examples, but don’t have an institutional perspective

3 Chart, RepoMMan project

This is getting closer to the institutional view being investigated here, coordinating data types with users

Page 6

Schematic by Liz Lyon, UKOLN, for eBank UK project

Labels in the schematic (several are repeated in the diagram):

Learning & Teaching workflows

Research & e-Science workflows

Aggregator services:

eBank UK

Repositories: institutional, e-prints, subject, data, learning objects

Data curation: databases & databanks

Institutional presentation services: portals, Learning Management Systems, u/g, p/g courses, modules

Data creation / capture / gathering: laboratory experiments, Grids, fieldwork, surveys, media

Data analysis, transformation, mining, modelling

Peer-reviewed publications: journals, conference proceedings

Learning object creation, re-use

Quality assurance bodies

Validation

Harvesting metadata

Resource discovery, linking, embedding

Deposit / self-archiving

Publication

Searching, harvesting, embedding

Linking

Page 7

OCLC 2003 environmental scan

Page 8

RepoMMan project, Hull

Page 9

What this workshop IS about

The institutional repository

The institutional perspective

The multi-repository institution

Where does it exist? What does it look like? What will it look like?

Your view

Page 10

Edinburgh example: now

We have at present:

• a catalogue repository with records for ejournals and ebooks (and some web sites), which we feel should be accommodated in separate repositories;

• a single repository for 'research outputs' (eprints, research papers and reports, and theses);

• a proto-research publications repository (currently serving as our RAE publications repository);

• separate repositories for image and museum collections;

• a learning objects repository;

• an archives repository;

• and a proto-repository for locally digitised research collections (with little in it, but quite a lot of planning done).

Thanks to John MacColl for this example

Page 11

Edinburgh example: next?

We wish to develop this into a more efficient architecture by:

• splitting out the catalogue records by their various types;

• splitting out the research outputs into separate repositories;

• introducing an image/museum management system (currently being implemented);

• introducing a licence management system (currently being implemented);

• creating a digital records repository (not started);

• migrating the locally digitised research collections from one system (Endeavor ENCompass) to another (probably DSpace).

We are also seeking to:

• apply a federated search engine to the entire architecture (WebFeat, currently being implemented).
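
The idea of one federated search over several separate repositories can be illustrated with a minimal sketch. This is not how WebFeat or any Edinburgh system works; the `Repository` class, the `federated_search` function and the sample titles are all hypothetical, made up only to show the pattern of sending one query to many repositories and merging the hits.

```python
from dataclasses import dataclass


@dataclass
class Record:
    repository: str  # name of the repository holding the item
    title: str


class Repository:
    """Hypothetical in-memory stand-in for one institutional repository."""

    def __init__(self, name, titles):
        self.name = name
        self.titles = list(titles)

    def search(self, term):
        # Case-insensitive substring match on titles only.
        needle = term.lower()
        return [Record(self.name, t) for t in self.titles if needle in t.lower()]


def federated_search(repositories, term):
    """Send the same query to every repository and merge the hits."""
    hits = []
    for repo in repositories:
        hits.extend(repo.search(term))
    # Sort by title so results from different repositories are interleaved.
    return sorted(hits, key=lambda r: r.title.lower())


# Made-up sample data, loosely following the repository types listed above.
repos = [
    Repository("research outputs", ["Thesis on soil chemistry", "Report on library use"]),
    Repository("images", ["Soil sample photograph"]),
    Repository("archives", ["Minutes of Senate, 1950"]),
]
results = federated_search(repos, "soil")
```

A real federated search would also have to cope with different metadata schemes, ranking and slow or unavailable targets, none of which is attempted here.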

Page 12

Repository data types

Open Access (published papers, preprints, tech. reports, etc.)

Electronic Theses and Dissertations

Teaching & Learning

E-science, datasets

Research Information (CRIS)

Multimedia (audio, video, images, museum collections)

Digitisation

Publishing

Preservation

Administration

Page 13

Open source software?

Open source software: DSpace, EPrints, Fedora, Moodle, Bodington, Sakai, Plone …

Page 14

Repository data types

EPrints

DSpace

Moodle, Bodington, Sakai, Plone

EPrints

Bepress

Fedora

Open Access

ETDs

T&L

Datasets

CRIS

Multimedia

Digitisation

Publishing

Preservation

Administration

Web pages

Structured databases

Examination papers

???

Page 15

Open source or Services?

Open source software: DSpace, EPrints, Fedora, Moodle, Bodington, Sakai, Plone …

IR Services: Open Repository (based on DSpace), EPrints Services

Page 16

Multi-repository institution

INTEROPERABILITY

Open Access

ETDs

T&L

Datasets

CRIS

Multimedia

Digitisation

Publishing

Preservation

Administration

??????

Page 17

An interoperating network?

“Repository deployment is fragmented, and repositories tend to exist in isolation rather than being embedded into an interoperating network of services. “We've got bits and pieces but it doesn't operate as a whole and there are big gaps in provision in some areas.” Within institutions, repositories tend not to inter-work with other applications. Nor are they well integrated with other institutional repositories (although there are some examples of innovative workflows, for example between laboratory repository and cross-institutional repository in R4L/eBank).”

R4L: Repository for the Laboratory http://r4l.eprints.org/

eBank UK http://www.ukoln.ac.uk/projects/ebank-uk/

Rachel Heery and Andy Powell, Digital Repositories Roadmap: looking forward, JISC, April 2006 http://www.jisc.ac.uk/uploaded_documents/rep-roadmap-v15.doc

Page 18

The Institutional Repository?

Open Access

ETDs

T&L

Datasets

CRIS

Multimedia

Digitisation

Publishing

Preservation

Administration

???

???

Page 19

Open source or Services or LMS?

Open source software: DSpace, EPrints, Fedora, Moodle, Bodington, Sakai, Plone …

IR Services: Open Repository (based on DSpace), EPrints Services

IR Services + library management systems: VTLS VITAL (Fedora), ProQuest (Digital Commons), Ex Libris (DigiTool)

Page 20

The Digital Library

Open Access

ETDs

T&L

Datasets

CRIS

Multimedia

Digitisation

Publishing

Preservation

Administration

Library content

Institutionally-generated content

Library Services:

OPAC

OpenURL resolver

E-journals

Etc.

Library Management System

???

???

Page 21

What is an IR? Lynch 2006

“As I've looked more at various institutional deployments and planned deployments, I think that the distinction between digital libraries, digital collection management systems, digital archives, and institutional repositories is less clear than I might have felt in 2003.”

Cliff Lynch, 2006, quoted in http://poynder.blogspot.com/2006/03/institutional-repositories-and-little.html

Page 22

Networkflows

"Historically, users have built their workflow around the services the library provides. As we move forward, the reverse will increasingly be the case. On the network, the library needs to build its services around its users' work- and learn-flows (networkflows).

"one of the discussion points around institutional repositories is about which goals they support: open access, curation of institutional intellectual assets, reputation management. And which processes? Over time, it is clear that what we now call institutional repositories will be part of wider research process support. What is currently the institutional repository will be a component of the workflow/curation/disclosure apparatus that develops to support research activities."

Lorcan Dempsey, Networkflows, January 2006 http://orweblog.oclc.org/archives/000933.html

Page 23

Integration, workflow, portals

The role of the repository will influence the level of integration and interaction required.

Where a repository is being used for multiple content types and as an everyday working tool then greater integration is required to allow it to take on this role. Integration may also be focussed at the presentation level for end-user interaction (e.g., presenting a search or deposit screen within a portal) or can be at the data level for the exchange of information between systems.

Alma Swan and Chris Awre, LINKING UK REPOSITORIES, A6.6 Repository integration in local infrastructure

http://www.jisc.ac.uk/uploaded_documents/Linking_UK_repositories_appendix.pdf
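
As one possible illustration of integration "at the data level", another system could pull metadata from a repository over OAI-PMH, a protocol named earlier in these slides. The sketch below only builds a `ListRecords` request URL; the endpoint address and the set name `theses` are assumptions for illustration, not details from the Linking UK Repositories report, and fetching or parsing the response is left out.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for a repository endpoint."""
    # 'verb' and 'metadataPrefix' are required by the protocol for ListRecords;
    # 'set' is optional and limits the harvest to one collection.
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint and set name, for illustration only.
url = list_records_url("https://repository.example.ac.uk/oai", set_spec="theses")
```

Presentation-level integration, by contrast, would embed the repository's own search or deposit screen in a portal rather than exchanging records between systems.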

Page 24

The Digital Library

Open Access

ETDs

T&L

Datasets

CRIS

Multimedia

Digitisation

Publishing

Preservation

Administration

Library content

Institutionally-generated content

Library Services:

OPAC

OpenURL resolver

Library Management System

Portal

???

???

Page 25

Embedding IRs in institutional strategy

“Often institutions are not clear as to their strategy for establishing repositories. There are real benefits for institutions in effectively managing their digital assets (promoting research outcomes, fulfilling preservation responsibilities, facilitating added value services such as overlay journals, data mining, etc). Such benefits can be assisted by leveraging the open access agenda. Despite this, repositories are not yet fully embedded in institutional strategy and there is perhaps a misplaced confidence that institutions will take on the full range of repository business functions. Interoperability between institutional libraries, repositories, learning management systems and MIS is still rare.”

Rachel Heery and Andy Powell, Digital Repositories Roadmap: looking forward, JISC, April 2006 http://www.jisc.ac.uk/uploaded_documents/rep-roadmap-v15.doc

Page 26

Assessing the costs

From a spreadsheet on the costs of setting up and maintaining an open source repository: "There was a range, from about $6,886.62 for a set-up cost, all the way to over $1 million."

Rebecca Kemp, list posting, November 2005 http://www.library.yale.edu/~llicense/ListArchives/0511/msg00030.html

Page 27

Wider frameworks

Where do wider services – national frameworks and services, Web search and other Web services – fit into the institutional agenda?

Need for IRs to be visible, searchable and usable by local and distant users

Page 28

Summary

• One repository or multi?

• OSS vs Services vs extended LMS

• IR or DL

• Build around workflows

• Embed in institutional strategy