Launches or terminates instances based on specified conditions
Automatically registers new instances with specified load balancers
Can launch across Availability Zones
Can leverage On-Demand, Reserved, and Spot Instances
Scale based on schedule; scale your application ahead of known load changes
Example: Turning off your dev and test instances at night
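As an illustrative sketch (not the course's own example), a dev/test shutdown like the one above can be expressed as scheduled actions; the group name and schedule below are hypothetical, and the boto3 call is shown only as a comment:

```python
# Hypothetical scheduled actions for an Auto Scaling group named "dev-test-asg".
# With boto3: client = boto3.client("autoscaling"); then
#   client.put_scheduled_update_group_action(**stop_at_night)

stop_at_night = {
    "AutoScalingGroupName": "dev-test-asg",  # hypothetical group name
    "ScheduledActionName": "stop-overnight",
    "Recurrence": "0 20 * * MON-FRI",        # 8 PM UTC on weekdays
    "MinSize": 0,
    "MaxSize": 0,
    "DesiredCapacity": 0,                    # terminate all instances overnight
}

start_in_morning = {
    "AutoScalingGroupName": "dev-test-asg",
    "ScheduledActionName": "start-morning",
    "Recurrence": "0 6 * * MON-FRI",         # 6 AM UTC on weekdays
    "MinSize": 2,
    "MaxSize": 10,
    "DesiredCapacity": 2,                    # bring the group back up
}
```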
Excellent for general scaling
Allows your scaling to respond to unanticipated changes in traffic
Example: Scaling based on a CPU utilization CloudWatch alarm
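A minimal sketch of the CPU-based dynamic scaling above, assuming a hypothetical group name; with a target tracking policy, Auto Scaling creates and manages the CloudWatch alarms on your behalf (the boto3 call is shown only as a comment):

```python
# Hypothetical target tracking policy: keep average CPU near 70%.
# With boto3: boto3.client("autoscaling").put_scaling_policy(**cpu_policy)

cpu_policy = {
    "AutoScalingGroupName": "web-asg",       # hypothetical group name
    "PolicyName": "cpu70-target-tracking",
    "PolicyType": "TargetTrackingScaling",
    "TargetTrackingConfiguration": {
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
        "TargetValue": 70.0,                 # scale out/in around 70% CPU
    },
}
```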
Easiest to use
Scales based on machine learning algorithms
Example: Want to eliminate manual monitoring and adjustment of Auto Scaling
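One way predictive scaling can be configured is as a policy type on the Auto Scaling group itself; the sketch below is an assumption-laden illustration (hypothetical group name and targets), not the course's example:

```python
# Hypothetical predictive scaling policy driven by ML-based load forecasts.
# With boto3: boto3.client("autoscaling").put_scaling_policy(**predictive_policy)

predictive_policy = {
    "AutoScalingGroupName": "web-asg",       # hypothetical group name
    "PolicyName": "cpu-forecast",
    "PolicyType": "PredictiveScaling",
    "PredictiveScalingConfiguration": {
        "MetricSpecifications": [{
            "TargetValue": 70.0,             # target CPU utilization
            "PredefinedMetricPairSpecification": {
                "PredefinedMetricType": "ASGCPUUtilization"
            },
        }],
        "Mode": "ForecastAndScale",          # act on the forecast, not just report it
    },
}
```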
Step 1 - ELB triggers Amazon CloudWatch
In this example, the load balancer was configured to report latency to Amazon CloudWatch, which has an alarm set up to trigger when latency degrades enough to warrant adding more instances.
Step 2 - CloudWatch triggers scaling policy
When the CloudWatch alarm goes off, it triggers a scaling policy set up in the Auto Scaling group for those instances.
Step 3 - Auto Scaling scales out and registers instance with load balancer
Finally, once the scaling action is triggered, Auto Scaling launches a third instance into the group, based on the configurations specified in the Auto Scaling group, and registers that instance with the load balancer so that it will receive traffic appropriately.
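The three steps above can be sketched as a load balancer latency alarm whose alarm action is an Auto Scaling policy; every name, the threshold, and the truncated ARN below are hypothetical:

```python
# Hypothetical CloudWatch alarm on Classic Load Balancer latency (Step 1),
# wired to a scaling policy (Step 2); Auto Scaling then scales out and
# registers the new instance with the load balancer (Step 3).
# With boto3: boto3.client("cloudwatch").put_metric_alarm(**latency_alarm)

latency_alarm = {
    "AlarmName": "elb-high-latency",         # hypothetical alarm name
    "Namespace": "AWS/ELB",
    "MetricName": "Latency",
    "Dimensions": [{"Name": "LoadBalancerName", "Value": "web-elb"}],
    "Statistic": "Average",
    "Period": 60,
    "EvaluationPeriods": 3,                  # 3 consecutive minutes of poor latency
    "Threshold": 0.5,                        # seconds; hypothetical threshold
    "ComparisonOperator": "GreaterThanThreshold",
    # Hypothetical ARN of the scale-out policy in the Auto Scaling group:
    "AlarmActions": ["arn:aws:autoscaling:...:scalingPolicy:..."],
}
```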
Minimum capacity - This is the lowest number of instances the group can have. If the group is already at this number and a CloudWatch alarm requests a further scale-in, Auto Scaling won’t scale in.
Maximum capacity - This is the highest number of instances the group can have. Once the group reaches this number, Auto Scaling will not scale out any further, even if CloudWatch alarms request it.
Desired capacity - When you initially create your Auto Scaling group, the desired capacity is the number of instances your group begins with. As CloudWatch alarms go off and request scaling, Auto Scaling changes the desired capacity to the target number of instances and then launches or terminates instances to match it.
For example, you start your group with the following settings: Min: 2, Max: 10, Desired: 5.
Auto Scaling launches 5 instances.
A few minutes later, a low CPU utilization alarm goes off, and CloudWatch requests Auto Scaling to scale this group in by 1 instance. Auto Scaling changes the group’s desired capacity to 4 and terminates 1 instance, based on your termination policy. If you later change the desired capacity back to 5, Auto Scaling launches a new instance to match it.
We recommend starting with minimum and desired capacities of 2 instances (1 per Availability Zone).
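The capacity settings discussed above (min 2, max 10, desired starting at 2, one instance per Availability Zone) might be expressed as follows; the group name, launch template, and AZs are hypothetical:

```python
# Hypothetical Auto Scaling group spanning two Availability Zones.
# With boto3: boto3.client("autoscaling").create_auto_scaling_group(**group)

group = {
    "AutoScalingGroupName": "web-asg",                        # hypothetical name
    "LaunchTemplate": {"LaunchTemplateName": "web-template",  # hypothetical template
                       "Version": "$Latest"},
    "MinSize": 2,
    "MaxSize": 10,
    "DesiredCapacity": 2,                                     # 1 per AZ to start
    "AvailabilityZones": ["us-east-1a", "us-east-1b"],        # hypothetical AZs
}
```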
Avoid thrashing (rapid, repeated cycles of launching and terminating instances)
Scale out early, scale in slowly
Set the min and max capacity parameters carefully
Use lifecycle hooks (perform custom actions as Auto Scaling launches or terminates instances)
Stateful applications require additional configuration automation for instances launched into Auto Scaling groups
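The lifecycle-hook practice above can be sketched with a launch hook, which holds a new instance in a wait state while custom actions (for example, configuring a stateful application) run; the names and timeout below are hypothetical:

```python
# Hypothetical launch lifecycle hook: hold new instances in Pending:Wait
# until custom configuration completes (or the heartbeat timeout expires).
# With boto3: boto3.client("autoscaling").put_lifecycle_hook(**launch_hook)

launch_hook = {
    "AutoScalingGroupName": "web-asg",       # hypothetical group name
    "LifecycleHookName": "configure-on-launch",
    "LifecycleTransition": "autoscaling:EC2_INSTANCE_LAUNCHING",
    "HeartbeatTimeout": 300,                 # seconds allowed for custom setup
    "DefaultResult": "ABANDON",              # terminate if setup never completes
}
```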
Horizontal: Read Replicas
For read-heavy workloads
Available for Amazon Aurora, MySQL, MariaDB, and PostgreSQL
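Creating a read replica for a read-heavy workload can be as simple as naming the source instance; both identifiers below are hypothetical:

```python
# Hypothetical read replica of an RDS instance to offload read traffic.
# With boto3: boto3.client("rds").create_db_instance_read_replica(**replica)

replica = {
    "DBInstanceIdentifier": "mydb-replica-1",   # hypothetical replica name
    "SourceDBInstanceIdentifier": "mydb",       # hypothetical source instance
}
```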
For write-heavy workloads
Splits data into large chunks (shards)
Can give you higher performance and better operating efficiency in many circumstances
Vertical: Push-Button Scaling
Scale your Amazon RDS instances up or down with the RDS APIs or a few clicks in the console, often with no downtime
Scale from micro to 8xlarge and everything in between
Scale storage up with zero downtime
Scale throughput (when using provisioned IOPS RDS storage)
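Push-button vertical scaling of compute, storage, and provisioned IOPS can be combined in a single modification; the instance name and sizes below are hypothetical:

```python
# Hypothetical push-button vertical scale: bigger instance class, more
# storage (storage scales with zero downtime), and higher provisioned IOPS.
# With boto3: boto3.client("rds").modify_db_instance(**resize)

resize = {
    "DBInstanceIdentifier": "mydb",          # hypothetical instance name
    "DBInstanceClass": "db.m5.2xlarge",      # scale compute up
    "AllocatedStorage": 500,                 # GiB of storage
    "Iops": 5000,                            # provisioned IOPS throughput
    "ApplyImmediately": True,                # don't wait for the maintenance window
}
```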
Automatically adjusts read and write throughput capacity in response to dynamically changing request volumes with zero downtime
Default for all new tables
Just set your desired throughput utilization target, minimum and maximum limits
Continuously monitors actual throughput consumption using Amazon CloudWatch
No additional cost to use
Available in all AWS Regions
Best for most applications with relatively predictable scaling needs
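Under the hood, DynamoDB auto scaling is driven by Application Auto Scaling; a sketch of the read-capacity side, with a hypothetical table name, utilization target, and limits:

```python
# Hypothetical DynamoDB auto scaling setup for a table's read capacity.
# With boto3: client = boto3.client("application-autoscaling"); then
#   client.register_scalable_target(**read_target)
#   client.put_scaling_policy(**read_policy)

read_target = {
    "ServiceNamespace": "dynamodb",
    "ResourceId": "table/Orders",            # hypothetical table
    "ScalableDimension": "dynamodb:table:ReadCapacityUnits",
    "MinCapacity": 5,                        # minimum limit
    "MaxCapacity": 500,                      # maximum limit
}

read_policy = {
    "ServiceNamespace": "dynamodb",
    "ResourceId": "table/Orders",
    "ScalableDimension": "dynamodb:table:ReadCapacityUnits",
    "PolicyName": "orders-read-target",
    "PolicyType": "TargetTrackingScaling",
    "TargetTrackingScalingPolicyConfiguration": {
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "DynamoDBReadCapacityUtilization"
        },
        "TargetValue": 70.0,                 # desired throughput utilization target (%)
    },
}
```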
Flexible billing option, capable of serving thousands of requests per second without capacity planning
Uses pay-per-request pricing instead of a provisioned pricing model
DynamoDB adapts rapidly to accommodate new peaks in traffic
Best for spiky, unpredictable workloads
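Switching an existing table from provisioned capacity to on-demand is a single billing-mode change; the table name below is hypothetical:

```python
# Hypothetical switch of an existing table to on-demand (pay-per-request).
# With boto3: boto3.client("dynamodb").update_table(**switch_to_on_demand)

switch_to_on_demand = {
    "TableName": "Orders",                   # hypothetical table
    "BillingMode": "PAY_PER_REQUEST",        # no capacity planning required
}
```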