JASMIN and CEMS: The Need for Secure Data Access in a Virtual Environment
Cloud Workshop, 23 July 2013
Philip Kershaw
Centre for Environmental Data Archival
RAL Space, STFC Rutherford Appleton Laboratory
Introduction
• JASMIN and CEMS background
  – Current phase 1 deployment
  – Plans for phase 2
• Security Requirements
• Access Control and Federated Identity Management
• Cloud and Confidentiality
• Cloud and SLAs
LST plot for the UK [John Remedios and Darren Ghent, University of Leicester].
JASMIN Phase 1
• e-Infrastructure investment (NERC and UKSA)
• 6PB fast disk (Panasas) via low latency networks
• Distributed: RAL, Leeds, Bristol, Reading
• for Climate Science and Earth Observation (CEMS) communities
• Compute cluster, virtualisation (VMware) and private cloud (vCloud)
JASMIN 2 and 3
• NERC Environmental Big Data investment (2 internal phases)
• JASMIN for use by entire NERC community
• Expand to 12PB fast disk + 1000s cores
• Provide a range of service models
  – Batch compute
  – Virtualisation
  – Cloud
• Private Cloud with capability to federate with public clouds
• Private cloud will be a host for virtual platforms
• Dynamically configured infrastructure to enable switching of storage and compute between private cloud and archive
[Diagram labels: External Cloud Providers; Cloud Federation API; Cloud burst as demand requires; Internal Private Cloud; Virtualisation; Isolated part of the network; JASMIN / CEMS Academic (R89 Building, STFC Rutherford Appleton Laboratory); Direct access to the data archive - Hosted processing and analysis environments; Data Archive and compute; Panasas Storage; Bare Metal Compute]
Evolving Security Requirements
• CEDA changing from a data provider to a data provider and hosting service
• Communities
  – JASMIN 1 + CEMS: Data for the Atmospheric Science and Earth Observation research communities
  – JASMIN 2 private cloud will serve wider NERC community
• Requirements
  1. Enforcement of licence agreements, terms of use, embargo periods or limited distributions
  2. User privacy – Data Protection Act
  3. Protection of computing resources is the critical consideration
     • Increasing importance with the provision of user hosting environments
     • To prevent:
       – Loss of service for an extended period
       – Detrimental impact on science
       – Knock-on effect of reputational loss
Interfaces
• Interfaces – critical consideration as they mark out security boundaries
• Interfaces changing and evolving with new service models: virtualisation, cloud, …
Interfaces and Usage Patterns vs. Hosting Solutions
[Diagram labels]
• Hosting solutions (Increasing virtualisation =>): High performance file system; Hosted Processing; Hosted Infrastructure; PaaS – Hosted Analysis Environments
• Cloud platform; Direct Access to the File System; Sandboxed environments
• Service Offered: Cloud Federation / Brokering; Virtualisation and networking; Virtual Storage; Application Hosting; Bare metal; SOA; Isolated network
• Increased set-up time, but longer usage
• Lower level of trust in user => <= Increased level of trust in user
• Users and usage: More dynamic and autonomous usage patterns; Great security risk usage patterns; Shared Scientific Analysis hosts; Virtual Infrastructures for other organisations
Access Control and Federated Identity Management
• RBAC (Role-Based Access Control) in place for many years
• FIM required for international collaborations
Earth System Grid Federation Security
• ESGF, a globally distributed federation of nodes initially deployed in support of CMIP5
• Requirements:
  – Access control for enforcement of licence agreements and terms of use
  – Single sign-on (SSO)
  – Authorisation overseen by PCMDI, lead organisation
• Solution:
  – SSO: OpenID for browser-based access, SLCS (Short-Lived Credential Service - X.509) for command line wget and other clients (NetCDF) and GridFTP
  – SAML for attribute query and authorisation interfaces
  – RBAC with Virtual Organisation(s) to manage access roles
  – RESTful authorisation policy
• Also adopted for CEDA’s infrastructure
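The role-based part of this solution can be illustrated with a minimal sketch. It is not the ESGF or CEDA implementation: the path prefixes, role names and the `User` class are hypothetical, and the user's roles are assumed to have been retrieved already (in ESGF this would be the job of the SAML attribute query).

```python
from dataclasses import dataclass, field

# Hypothetical policy: dataset path prefix -> roles allowed to read it.
# An empty set means the data is open and no role is required.
POLICY = {
    "/archive/cmip5/": {"cmip5_research"},
    "/archive/restricted/": {"eo_licensed"},
    "/archive/open/": set(),
}


@dataclass
class User:
    openid: str                              # identity asserted by the user's IdP
    roles: set = field(default_factory=set)  # roles granted within the virtual organisation


def required_roles(path):
    """Return the role set of the longest matching prefix, or None if no rule applies."""
    matches = [prefix for prefix in POLICY if path.startswith(prefix)]
    if not matches:
        return None
    return POLICY[max(matches, key=len)]


def is_authorised(user, path):
    """Default-deny check: the user needs at least one of the required roles."""
    roles = required_roles(path)
    if roles is None:
        return False      # no rule covers this path, so deny
    if not roles:
        return True       # open data
    return bool(user.roles & roles)
```

A user holding the `cmip5_research` role would be allowed to read anything under `/archive/cmip5/` but nothing under `/archive/restricted/`; paths with no matching rule are denied.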
Access Control and FIM for Clouds
• Build on work for ESGF
  – But ESGF designed for federated access to datasets
  – Low LoA (Level of Assurance) required for credentials
• New work with Contrail project to address some challenging use cases . . .
Contrail Project Goals
• EC FP7 Project, led by INRIA, 36 month+, completes Jan 2014
• Federation of cloud providers
• Federation with external IdPs
• Elastic CAs for dynamically created services
• Autonomous SLA management (SLA@SOI)
• IaaS and PaaS integration
• Reuse of existing open standards:
  – OVF, OCCI, CDMI
  – WS-Security, SLA@SOI models . . .
Contrail – Delegation with OAuth
[Diagram labels: Federation CLI; Browser; Federation Web Portal; Federation core; Federation Identity Provider; REST API; Online CA Service; Contrail Federation Layer; OAuth Authz Server; OAuth; Multiple delegation hops; Cloud credential mapping; Cloud Providers; External IdPs – Shib, OpenID]
Confidentiality
• Homomorphic encryption
  – Homomorphic Encryption: Theory & Application, Jaydip Sen, Department of Computer Science, National Institute of Science & Technology Odisha, INDIA
• Divide data into chunks and distribute across multiple providers
• Only the owner can re-assemble the data
• No single provider can re-assemble the data
• Computationally expensive
• ESA Project DCGO (Data Chunks to Go) exploring this technology
• Other commercial solutions
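The property that no single provider can re-assemble the data can be illustrated with simple XOR secret splitting. This is an assumption chosen for illustration only: it is not homomorphic encryption, and nothing here claims to be the scheme used by DCGO or any commercial product. The function names `split` and `combine` are hypothetical.

```python
import secrets
from typing import List


def split(data: bytes, n: int) -> List[bytes]:
    """Split data into n shares, one per provider; all n are needed to recover it."""
    if n < 2:
        raise ValueError("need at least two providers")
    # n - 1 shares are pure random bytes of the same length as the data.
    shares = [secrets.token_bytes(len(data)) for _ in range(n - 1)]
    # The final share is the data XORed with every random share.
    last = bytearray(data)
    for share in shares:
        for i, byte in enumerate(share):
            last[i] ^= byte
    shares.append(bytes(last))
    return shares


def combine(shares: List[bytes]) -> bytes:
    """XOR all shares together; only a holder of every share recovers the data."""
    out = bytearray(len(shares[0]))
    for share in shares:
        for i, byte in enumerate(share):
            out[i] ^= byte
    return bytes(out)
```

Each share on its own is indistinguishable from random bytes, so a provider holding one share learns nothing about the content. The trade-off in this sketch is that every share is as large as the original data and losing any one share loses the data.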
SLAs and Security
• Lack of standardisation and relative immaturity are problems
• Contrail project
• Extends work of SLA@SOI project
  – Support for expressing SLAs at the level of individual resources by linking to OVF (Open Virtualisation Format) descriptors
• Federated negotiation with multiple providers and the selection of the optimum SLA offer according to user criteria
• Quality of Protection (QoP) terms, such as data locality, protection, replication, …
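Selecting an offer against user criteria could look roughly like the sketch below. It does not use the SLA@SOI or Contrail models; the `SlaOffer` fields, the provider names and the cheapest-acceptable-offer rule are all assumptions chosen to show QoP terms acting as hard constraints.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlaOffer:
    provider: str
    data_location: str        # QoP term: where the data is held, e.g. "UK"
    replicas: int             # QoP term: number of replicas kept
    encrypted_at_rest: bool   # QoP term: data protection
    price_per_hour: float


def select_offer(offers, allowed_locations, min_replicas, require_encryption):
    """Filter offers on QoP constraints, then return the cheapest, or None."""
    acceptable = [
        o for o in offers
        if o.data_location in allowed_locations
        and o.replicas >= min_replicas
        and (o.encrypted_at_rest or not require_encryption)
    ]
    if not acceptable:
        return None
    return min(acceptable, key=lambda o: o.price_per_hour)


# Hypothetical offers returned by three providers during negotiation.
offers = [
    SlaOffer("provider-a", "US", 3, True, 0.08),
    SlaOffer("provider-b", "UK", 2, True, 0.12),
    SlaOffer("provider-c", "UK", 2, False, 0.05),
]
best = select_offer(offers, allowed_locations={"UK", "EU"},
                    min_replicas=2, require_encryption=True)
```

With these inputs only `provider-b` satisfies all three constraints, so it is selected even though it is the most expensive offer.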
Security, Cloud and Network Isolation
• 3 interfaces
  – Private archive
  – Private cloud
  – Public cloud (via broker)
• Private archive and private cloud in independent networks but co-located
• Key interfaces link the two, e.g. data download via OPeNDAP
• Dynamically configured infrastructure to enable switching of storage and compute between private cloud and archive
[Diagram labels: External Cloud Providers; Cloud Federation API; Cloud burst as demand requires; Internal Private Cloud; Virtualisation; Isolated part of the network; JASMIN / CEMS Academic (R89 Building, STFC Rutherford Appleton Laboratory); Direct access to the data archive - Hosted processing and analysis environments; Data Archive and compute; Panasas Storage; Bare Metal Compute]
Conclusions
• Existing climate science and earth observation security requirements understood
• Strong foundation of access control and FIM to build on
  – Need to consider LoA for new use cases
• New user communities within NERC to consider
• New challenges with requirements to protect computing resources, new interfaces (attack vectors!)
• Confidentiality and SLAs
  – Areas where much more work is needed
• Network isolation baseline for private cloud
• Clarity and clear demarcation needed for hybrid cloud (cloud federation)