Trusted Recovery is a security concept that ensures a system can recover from a crash or failure without compromising its security. It is particularly critical in high-assurance systems: the Trusted Computer System Evaluation Criteria (TCSEC, also known as the Orange Book) require it at the B3 and A1 levels. Trusted recovery comprises the processes and mechanisms that return the system to a secure state after a failure, preserving the confidentiality, integrity, and availability of the system and its data.
Key Concepts of Trusted Recovery
- Failure Preparation
- Backup Critical Information: Regularly backing up critical information is essential to enable data recovery in the event of a system failure. This ensures that important data can be restored and that the system can resume operation with minimal disruption.
- System Recovery After a Crash
- Rebooting in Single User Mode or Recovery Console:
- Purpose: The system is rebooted into a controlled environment with normal user access disabled, so the recovery process can proceed without the risk of further compromising the system.
- Recovering Active File Systems:
- Purpose: All file systems that were active during the failure are checked and recovered to ensure data integrity.
- Restoring Missing or Damaged Files:
- Purpose: Any files that were lost or damaged during the crash are restored from backups or through recovery processes.
- Recovering Security Characteristics:
- Purpose: Critical security characteristics, such as file security labels, are restored to ensure that the security policies remain intact.
- Checking Security-Critical Files:
- Purpose: Key security files, such as the system password file, are verified for integrity and restored if necessary to prevent unauthorized access.
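The backup, restore, and integrity-check steps above can be sketched in a few lines of Python. This is a minimal illustration, not a real recovery tool: the file names (`passwd`, `labels.db`, `policy.conf`), the `back_up` and `verify_and_restore` helpers, and the in-memory manifest are all assumptions made for the example. It records a SHA-256 hash of each security-critical file at backup time, then after a simulated crash restores any file that is missing or no longer matches its recorded hash.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def back_up(files: list[Path], backup_dir: Path) -> dict[str, str]:
    """Failure preparation: copy each file and record its known-good hash."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    manifest = {}
    for f in files:
        # Simplification: files are keyed by base name, so names must be unique.
        shutil.copy2(f, backup_dir / f.name)
        manifest[str(f)] = sha256_of(f)
    return manifest


def verify_and_restore(manifest: dict[str, str], backup_dir: Path) -> list[str]:
    """Recovery: restore any file that is missing or fails its hash check.

    Returns the paths that had to be restored.
    """
    restored = []
    for name, expected in manifest.items():
        target = Path(name)
        if not target.exists() or sha256_of(target) != expected:
            shutil.copy2(backup_dir / target.name, target)
            restored.append(name)
    return restored


def demo() -> list[str]:
    """Simulate a crash that corrupts one file and deletes another."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        passwd = root / "passwd"      # hypothetical stand-in for a password file
        labels = root / "labels.db"   # hypothetical stand-in for security labels
        policy = root / "policy.conf"
        passwd.write_text("root:x:0:0\n")
        labels.write_text("report.txt=SECRET\n")
        policy.write_text("mode=enforcing\n")

        manifest = back_up([passwd, labels, policy], root / "backup")

        passwd.write_text("garbage")  # damaged during the crash
        labels.unlink()               # lost during the crash

        restored = verify_and_restore(manifest, root / "backup")
        assert passwd.read_text() == "root:x:0:0\n"
        assert labels.read_text() == "report.txt=SECRET\n"
        return sorted(Path(p).name for p in restored)


if __name__ == "__main__":
    print(demo())  # ['labels.db', 'passwd']
```

In practice the manifest and the backups would themselves need protection (for example, read-only or offline storage), since an attacker who can alter both the file and its recorded hash defeats the check.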
Common Criteria Hierarchical Recovery Types
- Manual Recovery
- Description: In this type of recovery, a system administrator must manually intervene to return the system to a secure state. This approach is typically used when automated recovery is not feasible or when the failure is complex.
- Use Case: Appropriate for systems where human judgment is required to assess and correct the issue.
- Automatic Recovery
- Description: The system can return itself to a secure state without human intervention for at least one type of failure. Other types of failure may still require a system administrator to intervene.
- Use Case: Suitable for systems where minor issues can be resolved without manual intervention but where human oversight is necessary for more complex scenarios.
- Automatic Recovery Without Undue Loss
- Description: A higher level of recovery where the system not only recovers automatically but also prevents undue loss of protected objects during the recovery process. This type of recovery ensures that no critical data or security attributes are lost beyond a defined limit.
- Use Case: Ideal for high-security environments where the integrity of protected data must be maintained even during a failure.
- Function Recovery
- Description: The system can automatically restore specific functions without requiring manual intervention: after a failure, each covered function either completes successfully or rolls back to a secure state. This type of recovery focuses on resuming normal operations as quickly as possible.
- Use Case: Common in environments where maintaining service continuity is critical, and disruptions must be minimized.
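The relationship between manual and automatic recovery can be sketched as an escalation rule: repair what the system knows how to repair, and drop into a restricted maintenance mode for anything else. The sketch below is illustrative only; the `AUTO_REPAIRABLE` catalogue, the failure names, and the `recover` function are hypothetical and not taken from the Common Criteria.

```python
from enum import Enum


class Outcome(Enum):
    SECURE_STATE = "secure state restored automatically"
    MAINTENANCE_MODE = "maintenance mode: administrator required"


# Hypothetical catalogue of failure kinds this sketch can repair by itself.
AUTO_REPAIRABLE = {"dirty_filesystem", "stale_lock", "truncated_audit_log"}


def recover(detected_failures: list[str]) -> tuple[Outcome, list[str]]:
    """Repair what can be repaired automatically; escalate on anything else.

    Returns the outcome and the failures left for the administrator.
    """
    unresolved = [f for f in detected_failures if f not in AUTO_REPAIRABLE]
    if unresolved:
        # Manual recovery: stay restricted until a human intervenes.
        return Outcome.MAINTENANCE_MODE, unresolved
    return Outcome.SECURE_STATE, []


if __name__ == "__main__":
    print(recover(["dirty_filesystem"]))
    print(recover(["stale_lock", "corrupt_label_store"]))
```

The design choice worth noting is that the function fails secure: a single unrecognized failure is enough to withhold normal operation, rather than resuming service with part of the problem unresolved.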
Types of System Failure
- System Reboot
- Description: The system shuts itself down in a controlled manner after detecting inconsistent data structures or when it runs out of resources. This type of failure allows for a clean recovery process.
- Impact: Minimal, as the system initiates a controlled shutdown.
- Emergency Restart
- Description: The system fails in an uncontrolled manner, for example when a low-privileged user process attempts to access restricted memory segments, and must then be restarted. This type of restart is less orderly and may require additional recovery steps.
- Impact: Moderate, as the system may require further checks and corrections during recovery.
- System Cold Start
- Description: The system experiences a complete shutdown due to an unexpected kernel or media failure. Regular recovery procedures cannot bring the system back to a consistent state, requiring a full cold start.
- Impact: High, as this type of failure often involves significant downtime and a more complex recovery process.
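The three cases above differ along two questions: was the shutdown controlled, and can regular recovery procedures restore a consistent state? A minimal sketch of that decision, with hypothetical names (`Response`, `classify`) chosen for illustration:

```python
from enum import Enum


class Response(Enum):
    SYSTEM_REBOOT = "system reboot"
    EMERGENCY_RESTART = "emergency restart"
    COLD_START = "system cold start"


def classify(controlled_shutdown: bool, recovery_restores_consistency: bool) -> Response:
    """Map the two questions onto the three failure responses."""
    if not recovery_restores_consistency:
        # Regular recovery procedures cannot reach a consistent state.
        return Response.COLD_START
    if controlled_shutdown:
        # The system shut itself down in an orderly way.
        return Response.SYSTEM_REBOOT
    # Uncontrolled failure, but regular recovery still works.
    return Response.EMERGENCY_RESTART


if __name__ == "__main__":
    print(classify(controlled_shutdown=True, recovery_restores_consistency=True))
    print(classify(controlled_shutdown=False, recovery_restores_consistency=True))
    print(classify(controlled_shutdown=False, recovery_restores_consistency=False))
```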
Summary
- Trusted Recovery ensures that a system can recover from a crash or failure without compromising its security.
- System Recovery involves rebooting in a secure mode, recovering file systems, restoring files, and ensuring that security characteristics are intact.
- Common Criteria Recovery Types include manual recovery, automatic recovery, automatic recovery without undue loss, and function recovery, each offering varying levels of automation and security.
- System Failures can range from controlled reboots to emergency restarts and cold starts, each requiring different recovery approaches.
Trusted recovery is crucial in maintaining the security and integrity of systems, particularly in environments where data protection and system reliability are paramount.