1
iRODS: A Rule Oriented Data Management System
SRB Space
2
Beyond the Storage Resource Broker
• SRB is a data management system for large-scale data
• Logical name space -- Independence from Physical Pin Downs
• Integrated Data and Metadata Management
• Uniform Access Interfaces
• Caters to multiple tasks and paradigms
• Data grid Federations for distributed and replicated data handling
• Cooperating Autonomous Virtual Organizations (VO)
• Persistent Archives for long-term preservation
• Building light, dim and dark archives
• Digital Libraries for semantically searchable data sharing
• Multiple domains with collection-level functionalities
• Server-side Operations for performing data intensive operations
• Data subsetting, data fusion, administrative management
• Used in large-scale systems in production
3
What Next?
• SRB is quite complex, with many functions and operations
• More than 90 commands with many options
• Several hundred unique operations
• The intelligence is hard-coded
• Extensions/modifications require extreme care
• But the modules are fairly robust and reusable
• SRB is a one-size-fits-all architecture
• Everyone gets the same code base
• Users want more functionality
• Increased customizability
• As small a footprint as necessary
• Easy for them to modify
• Independence from developers
• Functionality to fit policies and not the other way around!
4
What Do the Users Want?
• Innovative Access Control
• Sometimes by groups, sometimes by users & sometimes by roles
• Based on their login type – how they got authenticated
• Third party authorization – outside authority agent
• Dynamically changeable access control
• Access Control Lists, Denial lists, over-rides, …
• Ticket-based short-term and controlled access
• Data Placement Strategies
• Completely user controlled – user preference policies
• Completely Administration controlled – site policies
• Group-based policies
• Over-rides, exceptions
• Based on Data characteristic or Collection characteristic
• Policies for staging, caching, archiving, purging, synchronization, …
• Ingestion Policies
• Check for authenticity – anonymization
• Pre and post process
• Replication policies, metadata extraction policies, permission policies, …
• And others …
5
Rule Oriented Data Management
• Adaptive Middleware Architecture
• Customizable and Flexible – User Configurable
• Administratively Simpler – Admin Configurable
• Build upon the experience of SRB Data Grid
• Rule-oriented Programming
• Well-defined set of functionalities --- Micro services
• Define Rules which chain micro-services
• Work-flow of micro services
• Define Rule Application Condition
• Define Recoverability for failure management
• Administrators can set site policies
• Users can encode their preferences
• Groups can set their process requirements
• Control actions at collection-level, format level, user level, resource level, …
6
Rules and Constraints
• Rule-based
• Lower-level Functions are composed of micro-services
• Higher-level Functions are composed of rules of lower-level micro-services
• Rules are interpreted using a rule engine
• Customizability
• Problems with rule composition
• Integrity checks to make sure rules do not break higher-level functionalities
• Declarative programming
• Rules define semantics
• Operational programming
• Rule invocation provides procedural interpretation
• Rules can be used as “checks and balances” to make sure that collections are self-consistent
• Example: Rule makes two copies of each file
• Constraint checking: can be used to see if the collection is consistent with this rule
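The "two copies of each file" example can be read as a constraint that is checked against a collection after the fact. The following is a minimal sketch of that idea, not iRODS code: the `DataObject` class, its `replicas` field, and the `violations` helper are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DataObject:
    """A logical file plus the storage resources holding its replicas (hypothetical model)."""
    name: str
    replicas: list[str] = field(default_factory=list)


def violations(collection: list[DataObject], required_copies: int = 2) -> list[str]:
    """Return the names of objects with fewer distinct replicas than the rule requires."""
    return [
        obj.name
        for obj in collection
        if len(set(obj.replicas)) < required_copies
    ]


# Illustrative collection: one object satisfies the rule, one does not.
collection = [
    DataObject("a.fits", ["disk1", "tape1"]),
    DataObject("b.fits", ["disk1"]),
]
inconsistent = violations(collection)
print(inconsistent)
```

An empty result would mean the collection is consistent with the rule; any names returned identify objects that need another replica.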
7
Rule-Oriented Data Systems Framework
[Figure: framework diagram. Labeled components: Client Interface, Admin Interface, Rule Invoker, Rule Engine, Rule Base, Current State, Confs, Metadata Base, Service Manager, Resources, Micro Service Modules (Resource-based Services), Micro Service Modules (Metadata-based Services), Metadata Modifier Module, Config Modifier Module, Rule Modifier Module, and three Consistency Check Modules.]
8
Rules Flow
[Figure: flowchart. Steps: Application Client Call / Server Call; Find Appropriate Rules; Select First/Next Rule; Condition Check (True/False); Execute Next Micro Service/Action; Success? (Yes/No); Execute Recovery Micro Service/Action. Outcomes: "Success: No More MS/A" and "Failure: No More Rules".]
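The rules flow can be sketched as a small interpreter loop. This is an illustrative Python model rather than the iRODS rule engine: `Rule`, `invoke`, and the micro-service functions are hypothetical. The sketch also assumes that, on failure, the failed step's recovery runs first and then the recoveries of already completed steps run in reverse order; the slide does not spell out that ordering.

```python
from dataclasses import dataclass, field
from typing import Callable

State = dict                          # session state shared by micro services (assumption)
Service = Callable[[State], bool]     # returns True on success
Recovery = Callable[[State], None]


@dataclass
class Rule:
    action: str
    condition: Callable[[State], bool]
    # Each step pairs a micro service with its recovery micro service.
    steps: list[tuple[Service, Recovery]] = field(default_factory=list)


def invoke(action: str, rule_base: list[Rule], state: State) -> bool:
    """Try each applicable rule in order until one runs all of its steps."""
    for rule in rule_base:
        # Find appropriate rules, then check the rule's condition.
        if rule.action != action or not rule.condition(state):
            continue
        completed: list[Recovery] = []
        for service, recovery in rule.steps:
            if service(state):
                completed.append(recovery)
                continue
            # Step failed: recover it, undo earlier steps, then try the next rule.
            recovery(state)
            for undo in reversed(completed):
                undo(state)
            break
        else:
            return True               # success: no more micro services/actions
    return False                      # failure: no more rules


log: list[str] = []


def create_file(s: State) -> bool:
    log.append("createFile")
    return True


def remove_file(s: State) -> None:
    log.append("removeFile")


def register_file(s: State) -> bool:
    log.append("registerFile")
    return not s.get("catalog_down", False)


def unregister_file(s: State) -> None:
    log.append("unregisterFile")


rule_base = [
    Rule("ingestObject",
         lambda s: s.get("userDept") in ("sdsc", "sio"),
         [(create_file, remove_file), (register_file, unregister_file)]),
    Rule("ingestObject", lambda s: True, [(create_file, remove_file)]),
]

ok = invoke("ingestObject", rule_base, {"userDept": "sdsc", "catalog_down": True})
```

In this run the first rule fails at `register_file`, its completed work is rolled back, and the second rule then succeeds, so `log` records both the attempt and the recovery.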
9
Sample Rules
ingestObject(*F)
    createFile(*F), registerFile(*F).
ingestObject(*F)
    $userDept == sdsc OR $userDept == sio
    createFile(*F), registerFile(*F),
    computeChkSum(*F),!,
    findBackUpRsrc(*F, *R), replicateFile(*F, *R),
    computeCheckSum(*F, *R),
    compareCheckSum(*F).
ingestObject(*F)
    $dataType == FITS Image
    createFile(*F), registerFile(*F),
    extractFITSMetadata(*F).
10
Format of a Rule
Action :- Condition | MS1, …, MSn | RMS1, …, RMSn
• Action: the action to be performed
• Condition: checked to see if the rule is applicable
• If applicable, micro services {1, …, n} are executed
• If any micro service fails, the recovery micro service(s) are executed to maintain transactional capability
• Micro service createFile(*F): recovery micro service removeFile(*F)
• Micro service ingestMetadata(*F,*M): recovery micro service rollback
Caveats:
• More than one rule can define an action
• R/MSi can be actions
• Micro services can pass parameters
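To make the three-part layout concrete, here is a rough Python sketch that splits a rule written in the slide's `Action :- Condition | MS1, …, MSn | RMS1, …, RMSn` form into its parts. It is a toy parser for this notation only, not the iRODS rule parser. `ParsedRule`, `split_top_level`, and `parse_rule` are hypothetical names, and the sketch assumes that `|` never appears inside a condition. The sample rule is assembled from the micro services named on the slide.

```python
from dataclasses import dataclass


@dataclass
class ParsedRule:
    action: str
    condition: str
    services: list[str]
    recoveries: list[str]


def split_top_level(text: str, sep: str = ",") -> list[str]:
    """Split on sep, ignoring separators nested inside parentheses."""
    parts: list[str] = []
    current: list[str] = []
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == sep and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        parts.append(tail)
    return parts


def parse_rule(line: str) -> ParsedRule:
    """Parse 'Action :- Condition | MS1, ..., MSn | RMS1, ..., RMSn'."""
    head, body = line.split(":-", 1)
    condition, services, recoveries = (part.strip() for part in body.split("|"))
    return ParsedRule(
        action=head.strip(),
        condition=condition,
        services=split_top_level(services),
        recoveries=split_top_level(recoveries),
    )


rule = parse_rule(
    "ingestObject(*F) :- $dataType == FITS | "
    "createFile(*F), ingestMetadata(*F,*M) | removeFile(*F), rollback"
)
# Pair each micro service with the recovery micro service in the same position.
pairs = list(zip(rule.services, rule.recoveries))
```

Splitting only on top-level commas keeps parameter lists such as `ingestMetadata(*F,*M)` intact, so each micro service lines up with its recovery micro service by position.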
11
AMA & ROPA: New Paradigm in Middleware Development
• Higher level Services composed of Micro-services
• Customizable at multiple levels
• Glass Box Architecture
• Can explain what happens
• Semantics can be checked
• Run-time Version Control
• Combines multiple paradigms
• Workflow systems, active databases, rule-based execution, transaction systems, data grids and remote execution of services
• Flexible Management
• Administrative ease
• Triggers for handling low/high water marks
• Periodic Job execution – backup, archive, usage control, …
12
Components of Rule System
Actions
• Name Space of Actions
• Client Call Maps to Actions
Micro Services
• Well-defined Server-side Procedures and Functions
Rule
• Definitions for Actions
• Workflow of what to do
• Composed of Actions and Micro Services
• Invoked to execute an Action
Rule Base
• Set of Rules
• Each User Community can choose their own rule base
Data Components
• Blackboard Architecture
• Used by Micro Services, Actions and Rules
13
Data Components of Rule System
Persistent Data Attributes: #
• Has an external name space
• Mapping to internal database attributes
• Persists across sessions
Session Data Attributes: $
• Has an external name space
• Mapping to internal data structures
• Used by micro-services/actions inside a session
Side Effects Set: %
• Changes effected outside the system
• File created, File Copied, Email Sent, …
• Well-defined name space of activities
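One way to picture the three data components is as a shared blackboard with a persistent store (#), a per-session store ($), and a log of side effects (%). The sketch below is an assumption-laden illustration, not the iRODS data model: `Blackboard`, `replicate_file`, and the attribute names `objPath` and `destRsrc` are all made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Blackboard:
    """Shared state read and written by micro services, actions and rules."""
    persistent: dict = field(default_factory=dict)    # '#' attributes, survive sessions
    session: dict = field(default_factory=dict)       # '$' attributes, one session only
    side_effects: list = field(default_factory=list)  # '%' activities outside the system

    def end_session(self) -> None:
        """Drop session attributes; persistent data and the activity log remain."""
        self.session.clear()


def replicate_file(bb: Blackboard) -> None:
    """Hypothetical micro service: reads $ values, updates # values, records a % activity."""
    src = bb.session["objPath"]
    dest = bb.session["destRsrc"]
    bb.persistent.setdefault(src, []).append(dest)
    bb.side_effects.append(("FileCopied", src, dest))


bb = Blackboard()
bb.session.update(objPath="/zone/a.dat", destRsrc="tape1")
replicate_file(bb)
bb.end_session()
```

After the session ends, only the persistent attribute and the recorded activity remain, which mirrors the "persists across sessions" versus "inside a session" distinction on the slide.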
14
Micro Services
Compiled Functions
Short and Well-defined functionality
Should have clear semantics
Works on $, #, %
Examples:• Metadata Extraction for DICOM
• Access Control Permission Changed to User
• Replicate a file from Source to Destination
15
Semantics
Micro Service Semantics
• Input/Output Variables (in terms of $)
• Input: what is needed
• Output: what gets changed
• Persistent Changes (in terms of #)
• Updates to Databases
• Activities Performed (in terms of %)
• External Activities Performed
16
Semantics
Rule Semantics
• Based on component micro services
Action Semantics
• Based on corresponding rules
• Only one rule's semantics applies
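A rule's semantics could be derived from its micro services roughly as follows. This is a sketch under stated assumptions: the slides say only that rule semantics is "based on component micro services", so the composition rule here (union of effects, with inputs counted only if no earlier step produces them) is one plausible reading. `Semantics`, `compose`, and the `$`, `#`, `%` names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Semantics:
    inputs: frozenset = frozenset()      # $ variables needed
    outputs: frozenset = frozenset()     # $ variables changed
    persistent: frozenset = frozenset()  # # attributes updated
    activities: frozenset = frozenset()  # % external activities performed


def compose(chain: list[Semantics]) -> Semantics:
    """Derive the semantics of a rule from its micro services, in execution order."""
    needed: set = set()
    produced: set = set()
    persistent: set = set()
    activities: set = set()
    for ms in chain:
        # An input counts as a rule input only if no earlier step produced it.
        needed |= ms.inputs - produced
        produced |= ms.outputs
        persistent |= ms.persistent
        activities |= ms.activities
    return Semantics(frozenset(needed), frozenset(produced),
                     frozenset(persistent), frozenset(activities))


create_file = Semantics(
    inputs=frozenset({"$objPath"}),
    outputs=frozenset({"$fileDesc"}),
    activities=frozenset({"%FileCreated"}),
)
register_file = Semantics(
    inputs=frozenset({"$objPath", "$fileDesc"}),
    persistent=frozenset({"#dataName"}),
)

rule_semantics = compose([create_file, register_file])
```

Because each summary is a plain value, two rules for the same action can be compared by comparing their composed semantics.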
18
Middleware
Software providing complex distributed applications/services
• Client-server
• Peer-to-peer
Web servers, Content Managers, Databases, Application Servers, …
Client access through common protocols
• RPC, Message-oriented, Object Request Broker, WSDL or service-oriented
Middleware provides a specific set of services
19
Middleware
Normal middleware is a black box
• Exposes a set of interfaces/service definitions
• No customization
• System Developer has complete control
• A service will have very few configurability options, even in open source middleware
Applications are developed on top of middleware
20
Adaptive Middleware Architecture
Similar to normal middleware
• Provides a set of services
• Has a well-defined access protocol
AMA is not a Black Box
• Admin/User Customizable Service
• Tweak services to achieve alternate goals
• Can explain at a high level what is happening
• One can compare two AMA services to see how they differ
• Useful for verification and analysis
21
Adaptive Middleware Architecture
External View – Logical Name Space
• Persistent Memory – Database
• Transient Memory – Variables
• External Side-effects
• Interaction with the outside world
• Ex. File is created, Email is sent
• Services, Methods, Actions
• Rules, Workflow
Internal View – Programmatic View
• Changes in DB Tables, internal variables/structure
• Procedures, Methods and Functions
• Drivers, Protocols
• Users, Resources, Data Objects – methods affecting them
Mapping
• External to Internal
• Capturing Semantics of Services and Rules
• Validation, Analysis, Introspection