Injecting Faults for Error Evaluation
Introduction
• Applications often consist of software components plus custom development, merged into a coherent package.
  – COTS, GOTS, open source, etc.
• Source code is usually not available for review of quality and reliability.
  – Visibility into the component is limited to what is available via a public interface
  – What is the quality of that component?
  – What faults lie inside the component?
• Applications interface with hardware and other software and can be influenced by failures in those systems.
Fault Injection on Interfaces
• Interfaces (hardware, software, human) are a major source of errors and induced faults
• Software and system testing looks at anticipated off-nominal situations, but often misses unusual situations or combinations of faults
• Mishap investigation has shown that multiple faults or unexpected anomalies are key players in accidents and mission failures
Example System
[Diagram. Blocks: Application; COTS Library; COTS Operating System; Other Applications on same system; External Systems; System Hardware; Input Sensors; Control Outputs]
Fault Injection Flow Diagram
[Flow diagram. Boxes: Start; Obtain Source Code and Documentation; Identify Interfaces and Critical Sections; Error/Fault Research; Estimate Effort Required; decision "Sufficient time and funds?" (Yes/No); Importance Analysis; Select Subset; Test Case Generation; Fault Injection Testing; Document Results, Metrics, Lessons Learned; Feedback to FCF Project; End]
Interface Identification
• Artifacts and Documentation
  – Software and System Requirements and Design specifications
  – Interface Specifications
  – User and Training Manuals
  – Other project documentation
• Source code
Error Research
• Sources of Error/Fault Information
  – Vendor documentation
    • Application or Hardware Interfaces
    • White papers
    • Defect list or open issue/problem reports
  – Public bug list
  – Internet Sources
  – Software logs
  – Error databases
  – Personnel Experience
Evaluation and Scoping
• Determine level of effort, funding, time constraints
• If complete effort not possible
  – Perform importance analysis of interfaces, software units
    • Safety
    • Complexity
    • Use by other system elements
    • Expected number or types of faults
  – Prioritize and select by importance
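One way the importance analysis could be sketched is as a weighted score over the four criteria listed on this slide, followed by a greedy selection within the available effort. The weights, the 1-5 rating scale, the interface names, and the effort figures below are all hypothetical; the slides name the criteria but do not say how they are combined.

```python
from dataclasses import dataclass

# Hypothetical weights: the slides list the criteria but not their relative importance.
WEIGHTS = {"safety": 0.4, "complexity": 0.2, "usage": 0.2, "expected_faults": 0.2}


@dataclass
class Interface:
    name: str
    safety: int           # 1 (low) to 5 (high), assumed rating scale
    complexity: int
    usage: int            # use by other system elements
    expected_faults: int  # expected number or types of faults
    effort_days: float    # estimated effort to test this interface


def importance(item: Interface) -> float:
    """Weighted sum of the four criteria."""
    return sum(weight * getattr(item, key) for key, weight in WEIGHTS.items())


def select_subset(interfaces, budget_days):
    """Greedy selection: highest importance first, skipping items that no longer fit."""
    chosen, spent = [], 0.0
    for item in sorted(interfaces, key=importance, reverse=True):
        if spent + item.effort_days <= budget_days:
            chosen.append(item)
            spent += item.effort_days
    return chosen


# Made-up candidates, for illustration only.
candidates = [
    Interface("sensor_input", 5, 3, 4, 4, effort_days=10),
    Interface("config_file", 2, 2, 3, 5, effort_days=4),
    Interface("operator_display", 1, 2, 2, 2, effort_days=6),
]
selected = select_subset(candidates, budget_days=15)
for item in selected:
    print(f"{item.name}: importance {importance(item):.1f}")
```

A greedy pass is only one possible selection rule; a project might instead always include every safety-critical interface regardless of score.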
Testing
• Test case generation based on identified errors plus permutations on possible input values
• Consider multiple faults
• Consider faults while system is off-nominal from a previous fault
• Consider effects of system load/stress
• Consider state-specific effects
• Instrument software to observe effects of injected faults
  – External or observable effects
  – State changes (or lack of)
  – Effects on safety-critical functions
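The test-generation and instrumentation ideas on this slide can be illustrated with a minimal sketch. It enumerates single and double faults across system states and load levels, and wraps an interface call so that a fault can be injected and its effect logged. The fault names, states, load levels, and the `read_temperature` function are hypothetical placeholders, not taken from either project.

```python
from itertools import combinations, product

# Hypothetical catalogues, for illustration only.
FAULTS = ["sensor_timeout", "corrupt_message", "out_of_range_value"]
STATES = ["initialization", "operational", "off_nominal"]
LOADS = ["idle", "stressed"]


def generate_test_cases(faults, states, loads, max_simultaneous=2):
    """Yield (fault_set, state, load) for every combination of 1..max_simultaneous faults."""
    for count in range(1, max_simultaneous + 1):
        for fault_set in combinations(faults, count):
            for state, load in product(states, loads):
                yield fault_set, state, load


class FaultInjector:
    """Wraps an interface call, applies any active faults to its result, and logs them."""

    def __init__(self, func):
        self.func = func
        self.active = {}  # fault name -> function that alters the result
        self.log = []     # (fault name, injected value) pairs, for later review

    def __call__(self, *args, **kwargs):
        result = self.func(*args, **kwargs)
        for name, mutate in self.active.items():
            result = mutate(result)
            self.log.append((name, result))
        return result


def read_temperature():
    return 21.5  # hypothetical nominal sensor reading, degrees C


cases = list(generate_test_cases(FAULTS, STATES, LOADS))
print(len(cases), "test cases")

sensor = FaultInjector(read_temperature)
nominal = sensor()                                      # no fault active
sensor.active["out_of_range_value"] = lambda value: 1.0e6
faulty = sensor()                                       # fault injected and logged
print(nominal, faulty, sensor.log)
```

In a real campaign the wrapper would sit on the public interface of the component under test, and the log would be compared against the observable effects and state changes the slide lists.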
First Project: Tempest
• Written in Java 1.1
• Configurable
• Cross-platform operability
• Implements HTTP GET and HEAD requests and Server Side Includes
• Has some basic security features
• Debug mode monitoring
• Commercially available
Requirement Database
• Documents found contained 80 requirements that vendor says he meets (vendor’s claims)
• Requirement table is the parent table for 5 other sub-tables
  – Performance
  – Specs
  – HFE
  – Security
  – Misc
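The parent/sub-table layout could look something like the sketch below, which uses an in-memory SQLite database. Only the table names come from the slide; the column names, the foreign-key design, and the placeholder rows are assumptions, since the slides do not describe the actual schema or database product.

```python
import sqlite3

# Sub-table names from the slide; every column below is a hypothetical choice.
SUB_TABLES = ["performance", "specs", "hfe", "security", "misc"]

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.execute(
    "CREATE TABLE requirement (id INTEGER PRIMARY KEY, claim TEXT NOT NULL)"
)
for name in SUB_TABLES:
    # Each sub-table row points back at exactly one parent requirement.
    conn.execute(
        f"CREATE TABLE {name} ("
        "id INTEGER PRIMARY KEY, "
        "requirement_id INTEGER NOT NULL REFERENCES requirement(id), "
        "detail TEXT)"
    )

# Placeholder rows, not real Tempest requirements.
conn.execute("INSERT INTO requirement (id, claim) VALUES (1, 'placeholder vendor claim')")
conn.execute("INSERT INTO security (requirement_id, detail) VALUES (1, 'placeholder detail')")

rows = conn.execute(
    "SELECT r.claim, s.detail FROM requirement r "
    "JOIN security s ON s.requirement_id = r.id"
).fetchall()
print(rows)
```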
Standards Database
• Parsed commercial standards into pseudo functions and test scenarios
• Test scenarios included expected faults as well as on-the-edges and way-outside-the-box tests
• Each standard has its own table and own set of tests
Tempest Results
• Inappropriate system operation with modified configuration file
• Non-compliance with HTTP standard
• System crash with invalid port numbers
  – Port 49151.45 -> opened port 80
• File access in server machine outside of authorized directories
• System did not operate as per user documentation
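The port-number finding suggests the kind of input a fault-injection test would feed into a configuration field. The sketch below is not Tempest code; it is a hypothetical strict parser, plus a small list of invalid values (including the 49151.45 case above), showing the behavior such a test would hope to observe: rejection rather than a crash or a silent fallback to another port.

```python
def parse_port(text: str) -> int:
    """Return a TCP port from a configuration string, or raise ValueError."""
    stripped = text.strip()
    # Accept only plain ASCII digits: no sign, decimal point, or trailing characters.
    if not (stripped.isascii() and stripped.isdigit()):
        raise ValueError(f"port must be a whole number: {text!r}")
    port = int(stripped)
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range 1-65535: {port}")
    return port


# Hypothetical invalid inputs a test case might inject.
INVALID_PORTS = ["49151.45", "-1", "0", "65536", "80a", ""]


def rejected(text: str) -> bool:
    """True if parse_port refuses the input."""
    try:
        parse_port(text)
    except ValueError:
        return True
    return False


for value in INVALID_PORTS:
    print(repr(value), "rejected" if rejected(value) else "ACCEPTED")
```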
Second Project: Fluids and Combustion Facility
• Permanent, multi-user facility for ISS microgravity experiments
• Two racks (fluids/FIR, combustion/CIR)
• Operates for 10 years, so robustness important
• CANbus processors selected for fault injection
  – Health and Status Monitoring
  – Cannot be upgraded in flight
  – Mature requirements, design, and interface definition
  – Source code available
CANbus Processors
• Air Thermal Control Unit (ATCU)
• Color Camera Package (CCP)
• Common IPSU Diagnostic Board
• FOMA Control Unit
• FSAP Diagnostic Board
• Nd:YAG Laser Package
• Water Thermal Control System (WTCS)
• White Light Package
FIR System Diagram
[Diagram. Labels: IOP Main Processor; IOP HRDL Processor; IOP Video Switch Processor; IOP CAN Node Processor; Input-Output Processor (IOP); FSAP Main Processor; FSAP CAN Node Processor; FSAP; IPSU Main Processor; IPSU CAN Node Processor; Common IPSU; Laser Diode CAN Processor; White Light CAN Processor; DCM CAN Processor; Nd:YAG CAN Processor; PI Package; ECS CANbus; ATCU CAN Processor; WTCS CAN Processor; Optics Bench CANbus; Ethernet]
CANbus Processor State Diagram
[State diagram. State labels: Power On; Initialization; Operational (OP); Off-Nominal (O-N); Power Down (P); Power Off. Transition labels: Success; Error (three edges); Operational Cmd (two edges); Power Down Cmd (two edges)]
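For state-specific fault injection, a state diagram like this one can be encoded as a transition table so that test sequences can be replayed against it. The edges of the original diagram did not survive in this transcript, so the transition table below is an assumption built only from the state and transition labels; it should not be read as the actual CANbus processor design.

```python
from enum import Enum


class State(Enum):
    POWER_OFF = "Power Off"
    INITIALIZATION = "Initialization"
    OPERATIONAL = "Operational (OP)"
    OFF_NOMINAL = "Off-Nominal (O-N)"
    POWER_DOWN = "Power Down (P)"


# ASSUMED transitions: only the labels are from the slide, not the edges.
TRANSITIONS = {
    (State.POWER_OFF, "power_on"): State.INITIALIZATION,
    (State.INITIALIZATION, "success"): State.OPERATIONAL,
    (State.INITIALIZATION, "error"): State.OFF_NOMINAL,
    (State.OPERATIONAL, "error"): State.OFF_NOMINAL,
    (State.OPERATIONAL, "power_down_cmd"): State.POWER_DOWN,
    (State.OFF_NOMINAL, "operational_cmd"): State.OPERATIONAL,
    (State.OFF_NOMINAL, "power_down_cmd"): State.POWER_DOWN,
    (State.POWER_DOWN, "operational_cmd"): State.OPERATIONAL,
    (State.POWER_DOWN, "error"): State.OFF_NOMINAL,
}


def run(events, state=State.POWER_OFF):
    """Apply events in order and return every state visited.

    An event with no entry for the current state leaves the state unchanged.
    """
    path = [state]
    for event in events:
        state = TRANSITIONS.get((state, event), state)
        path.append(state)
    return path


def unhandled_pairs(events):
    """List (state, event) pairs with no defined transition: candidates for injection."""
    return [(s, e) for s in State for e in events if (s, e) not in TRANSITIONS]


path = run(["power_on", "success", "error", "operational_cmd", "power_down_cmd"])
print(" -> ".join(s.value for s in path))
print(len(unhandled_pairs(["error", "operational_cmd", "power_down_cmd"])), "undefined pairs")
```

Pairs with no defined transition (for example, an operational command arriving during initialization under these assumed edges) are natural places to inject events and observe whether the real processor ignores them, rejects them, or misbehaves.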
Next Steps
• Complete Interface Identification and prioritization
• Obtain hardware, source code for testing environment
• Error/Fault search on selected interfaces/components
• Test case generation, source code instrumentation, and test execution