Change Management

💎 Reliability Best Practices - Change Management

Being aware of how change affects a system allows you to plan proactively, and monitoring allows you to quickly identify trends that could lead to capacity issues or SLA breaches. In traditional environments, change-control processes are often manual and must be carefully coordinated with auditing to effectively control who makes changes and when they are made.

Using AWS, you can monitor the behavior of a system and automate the response to KPIs, for example, by adding servers as a system gains more users. You can control who has permission to make system changes and audit the history of these changes.

💎 Reliability Change Management Questions

REL 3: How does your system adapt to changes in demand?

A scalable system provides elasticity to add and remove resources automatically so that they closely match the current demand at any given point in time.
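As a rough illustration of matching resources to demand, the sketch below sizes a fleet proportionally so that a per-instance metric (for example, average CPU utilization) lands near a target value. It is a simplified rule under stated assumptions, not the algorithm any particular AWS service uses; the function name `desired_capacity` and all of its parameters are hypothetical.

```python
import math


def desired_capacity(current_capacity: int, metric_value: float,
                     target_value: float, min_capacity: int,
                     max_capacity: int) -> int:
    """Return a fleet size that would bring the metric close to the target.

    Hypothetical helper: assumes the metric scales inversely with the
    number of instances, which is only an approximation in practice.
    """
    if current_capacity <= 0 or target_value <= 0:
        raise ValueError("current_capacity and target_value must be positive")
    if min_capacity > max_capacity:
        raise ValueError("min_capacity must not exceed max_capacity")

    # Proportional rule: if the metric is 1.5x the target, run 1.5x the fleet.
    wanted = math.ceil(current_capacity * metric_value / target_value)

    # Clamp to the configured bounds so scaling never runs away in either direction.
    return max(min_capacity, min(max_capacity, wanted))


if __name__ == "__main__":
    # Four instances at 90% CPU with a 60% target: scale out.
    print(desired_capacity(4, 90.0, 60.0, min_capacity=2, max_capacity=10))
    # Four instances at 15% CPU: scale in, but not below the minimum.
    print(desired_capacity(4, 15.0, 60.0, min_capacity=2, max_capacity=10))
```

The minimum and maximum bounds matter as much as the proportional rule itself: the floor preserves a baseline of availability when demand is low, and the ceiling caps cost when a metric spikes.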

REL 4: How do you monitor your resources?

Logs and metrics are powerful tools for gaining insight into the health of your workloads. You can configure your workload to monitor logs and metrics and send notifications when thresholds are crossed or significant events occur. Ideally, when low-performance thresholds are crossed or failures occur, the workload has been architected to automatically self-heal or scale in response.
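The threshold-based notification described above can be sketched as a small evaluation function. This is a minimal illustration, assuming an alarm should fire only when the most recent datapoints all breach the threshold, so that a single noisy sample does not trigger it. The function `alarm_state` and its parameters are hypothetical, and the returned state names are chosen for illustration only.

```python
from typing import Sequence


def alarm_state(datapoints: Sequence[float], threshold: float,
                evaluation_periods: int) -> str:
    """Classify a metric series as OK, ALARM, or INSUFFICIENT_DATA.

    Hypothetical helper: the alarm fires only if every one of the last
    `evaluation_periods` datapoints is strictly above `threshold`.
    """
    if evaluation_periods <= 0:
        raise ValueError("evaluation_periods must be positive")

    # Not enough history yet to make a decision either way.
    if len(datapoints) < evaluation_periods:
        return "INSUFFICIENT_DATA"

    recent = datapoints[-evaluation_periods:]
    return "ALARM" if all(value > threshold for value in recent) else "OK"


if __name__ == "__main__":
    cpu = [50.0, 85.0, 90.0, 95.0]
    state = alarm_state(cpu, threshold=80.0, evaluation_periods=3)
    if state == "ALARM":
        # In a real workload this is where a notification or a
        # self-healing or scaling action would be triggered.
        print("threshold breached for 3 consecutive periods")
```

Requiring several consecutive breaches trades a little detection speed for fewer false alarms; the right number of periods depends on how noisy the metric is.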

REL 5: How do you implement change?

Uncontrolled changes to your environment make it difficult to predict the effect of a change. Controlled changes to provisioned resources and workloads are necessary to ensure that the workloads and the operating environment are running known software and can be patched or replaced in a predictable manner.


When you architect a system to automatically add and remove resources in response to changes in demand, this not only increases reliability but also ensures that business success doesn’t become a burden. With monitoring in place, your team will be automatically alerted when KPIs deviate from expected norms. Automatic logging of changes to your environment allows you to audit and quickly identify actions that might have impacted reliability. Controls on change management ensure that you can enforce the rules that deliver the reliability you need.
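To make the auditing point concrete, the following sketch filters a change log for actions that occurred shortly before an incident. It assumes change events are already available as simple records; the `ChangeEvent` type, its fields, and the `changes_before` function are hypothetical and do not correspond to any specific AWS log format.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical audit record: who did what to which resource, and when."""
    timestamp: datetime
    actor: str
    action: str
    resource: str


def changes_before(events: Iterable[ChangeEvent], incident_time: datetime,
                   window: timedelta) -> List[ChangeEvent]:
    """Return changes made within `window` before the incident, newest first."""
    start = incident_time - window
    hits = [e for e in events if start <= e.timestamp <= incident_time]
    # Newest first: the most recent change is usually the first suspect.
    return sorted(hits, key=lambda e: e.timestamp, reverse=True)


if __name__ == "__main__":
    incident = datetime(2024, 1, 1, 12, 0)
    log = [
        ChangeEvent(incident - timedelta(hours=3), "alice", "UpdateConfig", "db-1"),
        ChangeEvent(incident - timedelta(minutes=20), "bob", "Deploy", "web-1"),
        ChangeEvent(incident - timedelta(minutes=5), "carol", "ModifyRule", "fw-1"),
    ]
    for event in changes_before(log, incident, timedelta(hours=1)):
        print(event.timestamp.isoformat(), event.actor, event.action, event.resource)
```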