Data services and computing
1
Data services and computing
2
We tend to be dealt the computing environment in which we must operate.
Few of us have enough influence to steer the direction of central computing on our campus.
Thus, we try to match our computing needs with the resources provided locally.
Computing reality
3
Develop a computing strategy that identifies the hardware, the software, and the network connectivity
needed to support the level of data service you provide now and expect to provide in the near future.
Match to level of service
4
Desktop computing power: hardware
• fastest affordable processor
• largest affordable hard drive
• largest affordable monitor
• removable media drive support
Strategic factors
5
Desktop computing power: the software it should support
• metadata tools
• statistical software
• network tools
• compression utilities
Strategic factors
6
Large quantities of disk space on a fast system
• uncompressing files
• unpacking files
• packaging and compressing files
Strategic factors
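The disk-space point can be illustrated with a minimal sketch, assuming data files are exchanged as gzip-compressed tar archives (the slides do not name a format). The function names and the `headroom` factor are hypothetical; the idea is only that unpacking needs noticeably more free space than the archive itself occupies.

```python
import shutil
import tarfile
from pathlib import Path


def pack(src_dir: Path, archive: Path) -> Path:
    """Package src_dir into a gzip-compressed tar archive."""
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(src_dir, arcname=src_dir.name)
    return archive


def unpacked_size(archive: Path) -> int:
    """Total bytes the archive's regular files occupy once unpacked."""
    with tarfile.open(archive, "r:gz") as tar:
        return sum(m.size for m in tar.getmembers() if m.isfile())


def unpack(archive: Path, dest: Path, headroom: float = 2.0) -> None:
    """Unpack into an existing directory, but only if enough space is free.

    headroom is an assumed safety factor, not a recommendation from the slides.
    """
    needed = int(unpacked_size(archive) * headroom)
    free = shutil.disk_usage(dest).free
    if free < needed:
        raise OSError(f"need {needed} bytes free, only {free} available")
    with tarfile.open(archive, "r:gz") as tar:
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions support safer extraction filters.
            tar.extractall(dest, filter="data")
        else:
            tar.extractall(dest)
```

Checking the unpacked size before extracting is one way to avoid filling a shared disk halfway through a large study file.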
7
Access to at least one fast processing machine with statistical software
• powerful Unix workstation
• handle larger-scale problems
Strategic factors
8
Mass storage that supports multiple-user access to files and preferably multiple-system access
• distributed file system
• institutional repository
Strategic factors
9
Support software for data services
• statistical packages
• metadata support
• communication tools
• web tools
• blogs/wikis
• training tools
Strategic factors
10
Network connectivity permitting high-speed file transfers
• off and on campus
• transfer may require using services elsewhere on campus
Strategic factors
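Whether a connection counts as "high-speed" depends on the file sizes involved. A rough back-of-the-envelope sketch, assuming a hypothetical `efficiency` factor for the share of rated bandwidth actually achieved (the 0.7 default is an illustrative guess, not a measured value):

```python
def transfer_time_seconds(size_bytes: float, link_mbps: float,
                          efficiency: float = 0.7) -> float:
    """Estimate seconds to move size_bytes over a link rated at link_mbps.

    efficiency is the assumed fraction of the rated bandwidth that a real
    transfer achieves; it must be greater than 0 and at most 1.
    """
    if link_mbps <= 0:
        raise ValueError("link_mbps must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    bits = size_bytes * 8
    usable_bits_per_second = link_mbps * 1_000_000 * efficiency
    return bits / usable_bits_per_second
```

For example, a 1 GB file over a 100 Mbit/s link at full efficiency takes 80 seconds; the same estimate for a collection of several hundred gigabytes shows quickly whether a transfer service elsewhere on campus is needed.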
11
System administration takes a lot of time!
• Think twice about becoming your own system administrator
Implementation strategy
12
Purchase compatible computing equipment
• to receive maintenance support
• simplifies the sharing of peripheral devices
Implementation strategy
13
Investigate local computing support
• possibly a centralized high performance computing service or compute grid
• site licenses for software
Implementation strategy
14
Align with local institutional repository services and digital preservation initiatives
• introduce data to the planning of your institutional repository
Implementation strategy
15
Data infrastructure models
Data centres
• The data centre is part of the instrumentation infrastructure.
• e.g., the Large Hadron Collider
Data repositories
• The repository is part of a specific institution’s larger stewardship mandate for digital resources.
• e.g., Odesi in Scholars Portal
Domain archives
• Domain archives are institutions established explicitly to preserve and provide access to a domain’s data.
• e.g., the ICPSR and the UKDA
16
Emerging data infrastructure
Duration of access    Institutional based     Domain based
Long-term             trusted repository      trusted repository
Mid-term              digital repositories    digital repositories
Immediate             internet                internet
17
Examples
Duration of access    Institutional based      Domain based
Long-term             NARA, Scholars Portal    ICPSR, GenBank
Mid-term              DataStaR                 National Center for Ecological Analysis and Synthesis
Immediate             Local govt data          Cross-national Time-series Data Archive
18
Figure: layers of infrastructure
• GÉANT network infrastructure
• computing/data grid infrastructure
• scientific data infrastructure
• biology data, astronomy data, clinical data, LHC data, ICPSR data
Source: Mário Campolargo, Open Grid Forum, Barcelona, 3 June 2008; eSciDR study (adapted)
19
Figure labels: e-Infrastructure of repositories; e-Infrastructure for repositories

Layer                      Qualities                                Includes
Management                 Transparent, Responsive, Informed        Grids, Virtual Organisations, etc.
Repositories               Trusted, Open, Well managed              Repository management, curation, physical security, etc.
Repositories services      Ease of use, Availability, Reliability   Deposit, annotation, delivery, visualisation, search, help, etc.
Information                Authenticity, Quality, Longevity         Collections: data, work-flows, publications, learning materials, etc.
Physical infrastructure    Available, Scaleable, Reliable           Networks, computing, HPC, physical storage, etc.
Access                     Standardised, Stable, Flexible           Authentication, authorisation, logical security, federation, portals, etc.

Source: Mário Campolargo, Open Grid Forum, Barcelona, 3 June 2008; eSciDR study (adapted)
e-Infrastructure for repositories