NICTA Copyright 2012 From imagination to impact
Eliciting Operations Requirements for Applications
L. Bass, R. Jeffrey, I. Weber, H. Wada and L. Zhu
Operations Requirements
● "Through 2015, 80% of outages will be caused by people and process issues. 50% are caused by change, config and release" - Gartner
● Devs and Ops are (still) isolated, but Ops is an important source of product requirements
○ Before unit testing, less attention was paid to "testability"
○ In the DevOps era, we should incorporate "operatability" into products
● Making applications operations-process aware!
○ But where do the requirements come from?
Overview of Our Study
● Studied sources of operations requirements and discuss them in the context of our spin-out
○ Operations personnel
○ Internal development efforts
○ Operations standards
○ Organizational process descriptions
○ Academic studies
● Model processes and the product
○ Verify whether a product satisfies operations requirements
Standards and Organizational Process
● Process standards such as ISO 15504 or ITIL are a good source, but are not specific enough to turn directly into product requirements
● Organizational process descriptions tend to provide more detail
○ e.g., resource migration in Amazon Web Services [1]
● We found standards useful for (1) identifying processes to implement (automate) in a product, and (2) defining a method to validate the process carried out by operators
[1] media.amazonwebservices.com/AWS_Migrate_Resources_To_New_Region.pdf
Example Operational Requirement
● CP-6 Alternate Storage Site, NIST 800-53
○ "The organization establishes an alternate storage site including necessary agreements to permit the storage and recovery of information system backup information"
● Derived product requirement
○ "The product shall maintain a backup in an alternate storage site. The product shall provide a method to assess the recoverability of the system"
● Actual implementation in our product
○ Set up a backup site and a scheduled job as part of product initialization; otherwise, the launch fails
○ Provide a report to assess the quality of the backup (e.g., timestamp, execution time, capacity of disk, ...)
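The two implementation points above (refuse to launch without a backup setup, and report on backup quality) could be sketched roughly as follows. This is only an illustration under assumptions: `BackupConfig`, `BackupReport`, `initialize_product`, `assess_backup` and the thresholds are hypothetical names, not the actual product's code.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


class LaunchError(RuntimeError):
    """Raised when initialization cannot meet the backup requirement."""


@dataclass
class BackupConfig:
    alternate_site: Optional[str]  # hypothetical identifier of the backup site
    schedule: Optional[str]        # hypothetical schedule description


@dataclass
class BackupReport:
    timestamp: datetime        # when the last backup finished
    execution_seconds: float   # how long the backup took
    disk_free_bytes: int       # remaining capacity at the backup site


def initialize_product(config: BackupConfig) -> str:
    """Fail the launch unless a backup site and a schedule are configured."""
    if not config.alternate_site:
        raise LaunchError("no alternate storage site configured")
    if not config.schedule:
        raise LaunchError("no backup schedule configured")
    return f"backups to {config.alternate_site} on schedule {config.schedule!r}"


def assess_backup(report: BackupReport, now: datetime,
                  max_age_hours: float, min_free_bytes: int) -> List[str]:
    """Return findings about backup quality; an empty list means none found."""
    findings = []
    age_hours = (now - report.timestamp).total_seconds() / 3600
    if age_hours > max_age_hours:
        findings.append(f"last backup is {age_hours:.1f} h old")
    if report.disk_free_bytes < min_free_bytes:
        findings.append("backup site is low on disk space")
    return findings
```

Failing at initialization turns a missing backup from a latent risk into an immediate, visible launch error.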
Academic Studies
● Differences between environments are the most common source of upgrade problems [2]
○ Called "hidden dependencies" - incorrect file path, incorrect network address, library conflict, ...
● The list of hidden dependencies is a useful source of product requirements
● Actual implementation in our product
○ e.g., run a dependency check at boot. Terminate the app immediately to prevent fatal issues occurring later (e.g., getting data corrupted)
○ A boot failure is easy to detect, which makes Ops happy
[2] T. Dumitras, "Why do upgrades fail and what can we do about it?: towards dependable, online upgrades in enterprise system", Middleware 2009
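A boot-time dependency check of the kind described above might look like the following minimal sketch. It covers the three hidden-dependency examples from the slide (file path, network address, library). The helper names and the check table are assumptions for illustration, not the product's implementation.

```python
import importlib.util
import os
import socket
import sys
from typing import Callable, Dict, List

Check = Callable[[], bool]


def path_exists(path: str) -> Check:
    """Check that a required file or directory is present."""
    return lambda: os.path.exists(path)


def host_resolves(host: str) -> Check:
    """Check that a required network address can be resolved."""
    def check() -> bool:
        try:
            socket.gethostbyname(host)
            return True
        except OSError:
            return False
    return check


def module_available(name: str) -> Check:
    """Check that a required top-level library can be found."""
    return lambda: importlib.util.find_spec(name) is not None


def failed_dependencies(checks: Dict[str, Check]) -> List[str]:
    """Run every check and return the names of those that failed."""
    return [name for name, check in checks.items() if not check()]


def boot(checks: Dict[str, Check]) -> None:
    """Terminate immediately if any dependency is unmet."""
    failed = failed_dependencies(checks)
    if failed:
        sys.exit("boot aborted, unmet dependencies: " + ", ".join(failed))
```

Calling `boot(...)` before any data is touched means a misconfigured environment shows up as a failed start rather than as corruption later on.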
Internal DevOps Experience
● Context: Our spin-out provides a SaaS solution for replicating resources in AWS
● Issue: Expensive to clean up resources
○ Tests
○ Handling unexpected failures
● "undo" functionality to revert the resource status to a certain point [3]
○ Easy to run tests
○ Easy to clean up the mess
[3] I. Weber, et al., "Automatic undo for cloud management via AI planning," HotDep'12
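To make the "undo" idea concrete, here is a deliberately naive sketch that diffs a checkpoint against the current resource state and derives the actions needed to get back. It is not the AI-planning approach of [3]. The state model (resource id mapped to a status string) and all names are assumptions for illustration.

```python
from typing import Dict, List, Tuple

State = Dict[str, str]          # resource id -> status (hypothetical model)
Action = Tuple[str, str, str]   # (operation, resource id, target status)


def plan_undo(checkpoint: State, current: State) -> List[Action]:
    """Derive actions that bring `current` back to `checkpoint`."""
    actions: List[Action] = []
    # Resources created after the checkpoint must be deleted.
    for rid in sorted(current.keys() - checkpoint.keys()):
        actions.append(("delete", rid, ""))
    # Resources removed after the checkpoint must be re-created.
    for rid in sorted(checkpoint.keys() - current.keys()):
        actions.append(("create", rid, checkpoint[rid]))
    # Resources whose status changed must be set back.
    for rid in sorted(checkpoint.keys() & current.keys()):
        if checkpoint[rid] != current[rid]:
            actions.append(("set", rid, checkpoint[rid]))
    return actions


def apply_actions(state: State, actions: List[Action]) -> State:
    """Apply actions to a copy of the state and return the result."""
    result = dict(state)
    for op, rid, status in actions:
        if op == "delete":
            result.pop(rid, None)
        else:
            result[rid] = status
    return result
```

A test can take a checkpoint, run freely, and then apply `plan_undo(checkpoint, current)` to clean up. Real cloud resources have ordering constraints and irreversible operations that this simple diff ignores.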
Towards Formal Validation
● Incorporating Ops requirements into development/product is useful; however, how do we verify that the implementation is correct?
● Our ongoing work - modeling process and product together
○ Does the product satisfy Ops requirements?
○ Does the process operate the product as required?
Example
● Model the mixed-version upgrade process
● Version conflicts between clients and servers over a long-running process
● We're evaluating this method in a real system
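As a toy illustration of what such a model can expose (not the modeling method under evaluation), the sketch below enumerates the states of a rolling upgrade that moves one server at a time from "v1" to "v2". It flags the states in which a long-running client session could reach servers of both versions. The version labels and function names are assumptions.

```python
from typing import List, Tuple

ClusterState = Tuple[str, ...]  # version of each server, e.g. ("v2", "v1")


def rolling_upgrade_states(n_servers: int) -> List[ClusterState]:
    """All states visited when servers are upgraded one at a time."""
    return [
        tuple(["v2"] * upgraded + ["v1"] * (n_servers - upgraded))
        for upgraded in range(n_servers + 1)
    ]


def conflicting_states(n_servers: int) -> List[ClusterState]:
    """States where two calls of one session may hit different versions."""
    return [s for s in rolling_upgrade_states(n_servers) if len(set(s)) > 1]
```

For three servers, every intermediate state is mixed-version, so any session that spans the upgrade window is exposed unless the process or product pins it to one version.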
Conclusion
● Operations, including release, are a large source of outages
● To improve the "operatability" of products, we studied operations requirements
● Future work: validate whether implementations satisfy "operatability"