AWS Auto Scaling

AWS Auto Scaling is an AWS service that allows users to increase or decrease the number of EC2 instances within their application's architecture. With Auto Scaling, users can create collections of EC2 instances, called Auto Scaling groups. Users can create these groups from scratch, or from existing EC2 instances that are already in production, and create as many Auto Scaling groups as needed. For example, if an application consists of a web tier and an application tier, users can create two Auto Scaling groups, one for each tier. Each Auto Scaling group can contain one or more scaling policies; these policies define when Auto Scaling launches or terminates EC2 instances within the group.

Adding Auto Scaling to the network architecture is one way to maximize the benefits of the AWS cloud. With Auto Scaling, users can make applications: 

  • More fault tolerant: Auto Scaling can detect when an instance is unhealthy, terminate it, and launch a new instance to replace it.
  • More highly available: Users can configure Auto Scaling to use multiple subnets or Availability Zones. If one subnet or Availability Zone becomes unavailable, Auto Scaling can launch instances in another one to compensate.
  • Better able to increase and decrease capacity only when needed: Unlike on-premises solutions, Auto Scaling lets users scale their network dynamically, and there is no charge for Auto Scaling itself. Instead, users pay only for the EC2 instances launched, and only for as long as they use them.

 


AWS Auto Scaling Benefits

AWS Auto Scaling allows users to set target utilization levels for multiple resources in a single, intuitive interface. Users can quickly see the average utilization of all of the scalable resources without having to navigate to other consoles. For example, if the application uses Amazon EC2 and Amazon DynamoDB, users can use AWS Auto Scaling to manage resource provisioning for all of the EC2 Auto Scaling groups and database tables in the application.

Using AWS Auto Scaling, users can maintain optimal application performance and availability, even when workloads are periodic, unpredictable, or continuously changing. AWS Auto Scaling continually monitors applications to make sure they are operating at desired performance levels. When demand spikes, AWS Auto Scaling automatically increases the capacity of constrained resources so users can maintain a high quality of service.

AWS Auto Scaling allows users to build scaling plans that automate how groups of different resources respond to changes in demand. Users can optimize availability, costs, or a balance of both. AWS Auto Scaling automatically creates all of the scaling policies and sets targets based on user preferences. AWS Auto Scaling monitors the application and automatically adds or removes capacity from the resource groups in real time as demands change.

AWS Auto Scaling can help users optimize utilization and cost efficiency when consuming AWS services, so they pay only for the resources they actually use. When demand drops, AWS Auto Scaling will automatically remove any excess resource capacity to avoid overspending. AWS Auto Scaling is free to use, and allows users to optimize the costs of their AWS environment.

AWS Auto Scaling Features

Using AWS Auto Scaling, users can configure automatic scaling for all of the scalable resources powering their application from a single unified interface, including:

  • Amazon EC2: Launch or terminate Amazon EC2 instances in an Amazon EC2 Auto Scaling group
  • Amazon EC2 Spot Fleets: Launch or terminate instances from an Amazon EC2 Spot Fleet, or automatically replace instances that get interrupted for price or capacity reasons.
  • Amazon ECS: Adjust ECS service desired count up or down to respond to load variations. 
  • Amazon DynamoDB: Enable a DynamoDB table or a global secondary index to increase its provisioned read and write capacity to handle sudden increases in traffic without throttling.
  • Amazon Aurora: Dynamically adjust the number of Aurora Read Replicas provisioned for an Aurora DB cluster to handle sudden increases in active connections or workload.

AWS Auto Scaling scans the user's environment and automatically discovers the scalable cloud resources underlying the application. Using AWS Auto Scaling, users can set target utilization levels for multiple resources in a single, intuitive interface. 

  • AWS Auto Scaling will scan the user's selected AWS CloudFormation stack, or resources with the specified tags, to identify the supported AWS resource types that can be scaled. Note that currently, ECS services cannot be discovered using tags.
  • Users can quickly see the average utilization of all of their scalable resources without having to navigate to other consoles. 
  • For applications that use services such as Amazon EC2 and Amazon DynamoDB, AWS Auto Scaling manages resource provisioning for all of the EC2 Auto Scaling groups and database tables in the customer application.

Using AWS Auto Scaling, users can select one of three predefined optimization strategies designed to optimize performance, optimize costs, or balance the two. If they prefer, users can set their own target resource utilization. Using the selected scaling strategy, AWS Auto Scaling creates the scaling policies for each resource. Scaling plans are particularly useful when:

  • The application has cyclical traffic, such as high use of resources during regular business hours and low use of resources overnight.
  • The application has on-and-off workload patterns, such as batch processing, testing, or periodic analysis.
  • The application has variable traffic patterns, such as marketing campaigns with periods of spiky growth.

AWS Auto Scaling continually calculates the appropriate scaling adjustments and immediately adds and removes capacity as needed to keep the metrics on target. AWS target tracking scaling policies are self-optimizing, and learn the actual load patterns to minimize fluctuations in resource capacity. This results in smoother, smarter scaling, and users pay only for the resources they actually need.

  • AWS Auto Scaling allows customers to build scaling plans that automate how groups of different resources respond to changes in demand. 
  • AWS Auto Scaling automatically creates all of the scaling policies and sets targets for customers based on their preferences. 
  • AWS Auto Scaling monitors customers' applications and automatically adds or removes capacity from their resource groups in real time as demands change.
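As a minimal sketch of what creating such a scaling plan can look like with the AWS SDK for Python (boto3), the example below discovers resources by a hypothetical tag (env=prod) and tracks average CPU for a hypothetical Auto Scaling group named web-asg; parameter names should be verified against the scaling plans API reference before use.

    import boto3

    plans = boto3.client("autoscaling-plans")

    # Hypothetical example: discover resources tagged env=prod and keep the
    # Auto Scaling group "web-asg" at roughly 50% average CPU utilization.
    plans.create_scaling_plan(
        ScalingPlanName="web-app-plan",
        ApplicationSource={"TagFilters": [{"Key": "env", "Values": ["prod"]}]},
        ScalingInstructions=[
            {
                "ServiceNamespace": "autoscaling",
                "ResourceId": "autoScalingGroup/web-asg",
                "ScalableDimension": "autoscaling:autoScalingGroup:DesiredCapacity",
                "MinCapacity": 2,
                "MaxCapacity": 10,
                "TargetTrackingConfigurations": [
                    {
                        "PredefinedScalingMetricSpecification": {
                            "PredefinedScalingMetricType": "ASGAverageCPUUtilization"
                        },
                        "TargetValue": 50.0,
                    }
                ],
            }
        ],
    )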

Predictive Scaling predicts future traffic, including regularly-occurring spikes, and provisions the right number of EC2 instances in advance of predicted changes. Predictive Scaling’s machine learning algorithms detect changes in daily and weekly patterns, automatically adjusting their forecasts. This removes the need for manual adjustment of Auto Scaling parameters over time, making Auto Scaling simpler to configure and consume. Auto Scaling enhanced with Predictive Scaling delivers faster, simpler, and more accurate capacity provisioning resulting in lower cost and more responsive applications.

  • Load forecasting: AWS Auto Scaling analyzes up to 14 days of history for a specified load metric and forecasts the future demand for the next two days. 
  • Scheduled scaling actions: AWS Auto Scaling schedules the scaling actions that proactively add and remove resource capacity to reflect the load forecast. At the scheduled time, AWS Auto Scaling updates the resource’s minimum capacity with the value specified by the scheduled scaling action. 
  • Maximum capacity behavior: Each resource has a minimum and a maximum capacity limit between which the value specified by the scheduled scaling action is expected to lie.

AWS Auto Scaling automatically creates target tracking scaling policies for all of the resources in the scaling plan, using the selected scaling strategy to set the target values for each metric. AWS Auto Scaling also creates and manages the Amazon CloudWatch alarms that trigger scaling adjustments for each resource.

With target tracking scaling policies, users select a scaling metric and set a target value. Amazon EC2 Auto Scaling creates and manages the CloudWatch alarms that trigger the scaling policy and calculates the scaling adjustment based on the metric and the target value. The scaling policy adds or removes capacity as required to keep the metric at, or close to, the specified target value. In addition to keeping the metric close to the target value, a target tracking scaling policy also adjusts to changes in the metric due to a changing load pattern. Users can use target tracking scaling to:

  • Configure a target tracking scaling policy to keep the average aggregate CPU utilization of the Auto Scaling group at 40 percent (a sketch follows this list).
  • Configure a target tracking scaling policy to keep the request count per target of the Application Load Balancer target group at 1,000 for the Auto Scaling group.
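The 40 percent CPU example above could be expressed roughly as follows, using boto3 and a hypothetical group name my-asg:

    import boto3

    autoscaling = boto3.client("autoscaling")

    # Keep average CPU utilization of the group at (or near) 40 percent.
    autoscaling.put_scaling_policy(
        AutoScalingGroupName="my-asg",          # hypothetical group name
        PolicyName="cpu-40-target-tracking",
        PolicyType="TargetTrackingScaling",
        TargetTrackingConfiguration={
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization"
            },
            "TargetValue": 40.0,
        },
    )

Amazon EC2 Auto Scaling then creates and manages the CloudWatch alarms that drive this policy, so no alarm needs to be defined by hand.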

Auto Scaling Group Lifecycle

Like Amazon EC2 instances launched manually, instances in an Auto Scaling group follow a specific path, or lifecycle. For Auto Scaling instances, this lifecycle starts when users create a new Auto Scaling group or when a scale out event occurs. At that point, a new instance launches and is put into service by the Auto Scaling group. The lifecycle ends when a corresponding scale in event occurs, at which point the Auto Scaling group detaches the instance and terminates it.

Auto Scaling Basic Lifecycle 

The basic lifecycle of instances in an Auto Scaling group applies to most implementations of Auto Scaling and is a great place to start when considering adding Auto Scaling to an application's architecture.

Scale out event

This event informs the Auto Scaling group to launch one or more new instances and add them to the application. Some examples of scale out events:

  • Users manually choose to increase the number of instances, either by setting a new minimum number of instances, or by configuring the desired capacity for the Auto Scaling group. 
  • Create an Amazon CloudWatch alarm to monitor the application.
  • Create a schedule-based policy to scale out the application at a specific time.
  • An existing instance fails a required number of health checks, or users manually configure an instance to have an Unhealthy status.
Instances launched

After a scale out event occurs, the Auto Scaling group uses its assigned launch configuration to launch one or more Amazon EC2 instances. The number of instances launched depends on how users configured the Auto Scaling group’s scaling policies. Instances that have launched but are not yet fully configured are typically in the Pending state.

  • Users have the option of adding a hook to the Auto Scaling group that puts instances in this state into a Pending:Wait state. This state allows users to access these instances before they are put into service.
Instance attached to the Auto Scaling group

When an instance has launched and is fully configured, it is put into service and attached to the Auto Scaling group load balancer. The instance now counts against the minimum size, maximum size, and desired capacity (if set) for the Auto Scaling group. These instances are in the InService state.

Scale in event

It is important that each scale out event is matched to a corresponding scale in event. This helps ensure that the application’s resources match the demand for those resources as closely as possible. As with scale out events, a scale in event can be one of a number of actions, including manual configuration, a CloudWatch alarm, a schedule-based policy, or an instance failure.

Instance detached from an Auto Scaling group

When a scale in event occurs, the Auto Scaling group detaches one or more instances. How the Auto Scaling group determines which instance to terminate depends on its termination policy. Instances that are in the process of detaching from the Auto Scaling group and shutting down are in the Terminating state.

  • Users have the option of adding a hook to the Auto Scaling group that puts instances in this state into a Terminating:Wait state. This state allows users to access these instances before they are terminated.

Instance terminated

Finally, the instance is completely terminated.

Auto Scaling Pending State

When an Auto Scaling group reaches a scale out threshold, it launches one or more instances (as determined by its scaling policies). These instances are configured based on the launch configuration for the Auto Scaling group. While an instance is being launched and configured, it is in the Pending state. Depending on how users want to manage the Auto Scaling group, the Pending state can be divided into two additional states: Pending:Wait and Pending:Proceed.

  • Users can use these states to perform additional actions before the instances are added to the Auto Scaling group.
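As a rough sketch (boto3, with hypothetical hook and group names), a lifecycle hook that holds newly launched instances in Pending:Wait for configuration work might look like this:

    import boto3

    autoscaling = boto3.client("autoscaling")

    # Hold new instances in Pending:Wait for up to 5 minutes so they can be
    # configured (for example, software installed) before entering service.
    autoscaling.put_lifecycle_hook(
        LifecycleHookName="configure-before-service",   # hypothetical hook name
        AutoScalingGroupName="my-asg",                   # hypothetical group
        LifecycleTransition="autoscaling:EC2_INSTANCE_LAUNCHING",
        HeartbeatTimeout=300,
        DefaultResult="CONTINUE",
    )

    # Once configuration finishes, signal the hook to move the instance to
    # Pending:Proceed (and then InService):
    # autoscaling.complete_lifecycle_action(
    #     LifecycleHookName="configure-before-service",
    #     AutoScalingGroupName="my-asg",
    #     LifecycleActionResult="CONTINUE",
    #     InstanceId="i-0123456789abcdef0",              # placeholder instance ID
    # )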
Auto Scaling InService State

Instances that are functioning within your application as part of an Auto Scaling group are in the InService state. Instances remain in this state until:

  • An Auto Scaling scale in event occurs, reducing the size of the Auto Scaling group
  • Users put the instance into a Standby state. 
  • Users manually detach the instance from the Auto Scaling group.
  • The instance fails a required number of health checks, or users manually set the status of the instance to Unhealthy.
  • Any running instances that users attach to the Auto Scaling group are also in the InService state.

Users have the option of putting any InService instance into a Standby state. Instances in this state continue to be managed by the Auto Scaling group. However, they are not an active part of the application until users put them back into service.

Auto Scaling Terminating State

Instances that fail a required number of health checks are removed from an Auto Scaling group and terminated. The instances first enter the Terminating state, then the Terminated state. Depending on how users want to manage the Auto Scaling group, the Terminating state can be divided into two additional states: Terminating:Wait and Terminating:Proceed.

  • Users can use these states to perform additional actions before the instances are terminated.

Availability Zones and Regions

Amazon cloud computing resources are housed in highly available data center facilities.  To provide additional scalability and reliability, these data centers are in several physical locations categorized by regions and Availability Zones. Regions are large and widely dispersed geographic locations. Availability Zones are distinct locations within a region that are engineered to be isolated from failures in other Availability Zones and provide inexpensive, low-latency network connectivity to other Availability Zones in the same region.

  • Auto Scaling allows users to take advantage of the safety and reliability of geographic redundancy by spanning Auto Scaling groups across multiple Availability Zones within a region.
  • When one Availability Zone becomes unhealthy or unavailable, Auto Scaling launches new instances in an unaffected Availability Zone.
  • When the unhealthy Availability Zone returns to a healthy state, Auto Scaling automatically redistributes the application instances evenly across all of the designated Availability Zones.
  • An Auto Scaling group can contain EC2 instances that come from one or more EC2 Availability Zones within the same region. However, an Auto Scaling group cannot span multiple regions.
Instance Distribution and Balance Across Multiple Zones

Auto Scaling attempts to distribute instances evenly between the Availability Zones that are enabled for the Auto Scaling group. Auto Scaling does this by attempting to launch new instances in the Availability Zone with the fewest instances. If the attempt fails, however, Auto Scaling will attempt to launch in other zones until it succeeds. Certain operations and conditions can cause the Auto Scaling group to become unbalanced between the zones. Auto Scaling compensates by creating a rebalancing activity under any of the following conditions:

  • Users issue a request to change the Availability Zones for the group.
  • Users explicitly call for termination of a specific instance, causing the group to become unbalanced.
  • An Availability Zone that previously had insufficient capacity recovers and has additional capacity available.
Multi-Zone Instance Counts When Approaching Capacity

Because Auto Scaling always attempts to launch new instances before terminating old ones when attempting to balance across multiple zones, being at or near the specified maximum capacity could impede or completely halt rebalancing activities. To avoid this problem, the system can temporarily exceed the specified maximum capacity of a group by a 10 percent margin (or by a 1-instance margin, whichever is greater) during a rebalancing activity.

  • The margin is extended only if the group is at or near maximum capacity and needs rebalancing, either because of user-requested rezoning or to compensate for zone availability issues.
  • The extension lasts only as long as needed to rebalance the group, typically a few minutes.
Auto Scaling Limits

An AWS account comes with default limits on resources for Auto Scaling and other Amazon Web Services. Unless otherwise noted, each limit is per region. There is a default limit of 20 Auto Scaling groups and 100 launch configurations per region.

  • Users can go to AWS Service Limits and select Auto Scaling Limits or any other service listed on the page to see its default limits. 
  • When users reach the limit for the number of Auto Scaling groups or the number of launch configurations, they can go to the Support Center and place a request to raise the limit. 
  • Users can see the number of Auto Scaling resources currently allowed for their AWS account either by using the as-describe-account-limits command or by calling the DescribeAccountLimits action. 
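For example, a quick check of the current limits and usage with boto3 (the modern equivalent of the legacy as-describe-account-limits command) might look like this:

    import boto3

    autoscaling = boto3.client("autoscaling")

    limits = autoscaling.describe_account_limits()
    print(limits["MaxNumberOfAutoScalingGroups"])      # 20 by default
    print(limits["MaxNumberOfLaunchConfigurations"])   # 100 by default
    print(limits["NumberOfAutoScalingGroups"])         # currently in use
    print(limits["NumberOfLaunchConfigurations"])      # currently in use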

Auto Scaling components

#01

Groups

 
 

An Auto Scaling group contains a collection of Amazon EC2 instances that are treated as a logical grouping for the purposes of automatic scaling and management. An Auto Scaling group also enables users to use Amazon EC2 Auto Scaling features such as health check replacements and scaling policies. Both maintaining the number of instances in an Auto Scaling group and automatic scaling are the core functionality of the Amazon EC2 Auto Scaling service.

  • The size of an Auto Scaling group depends on the number of instances that users set as the desired capacity. Users can adjust its size to meet demand, either manually or by using automatic scaling.
  • An Auto Scaling group starts by launching enough instances to meet its desired capacity. It maintains this number of instances by performing periodic health checks on the instances in the group. If an instance becomes unhealthy, the group terminates the unhealthy instance and launches another instance to replace it.
  • Users can use scaling policies to increase or decrease the number of instances in the group dynamically to meet changing conditions. When the scaling policy is in effect, the Auto Scaling group adjusts the desired capacity of the group, between the minimum and maximum capacity values that users specify, and launches or terminates instances as needed. 
  • An Auto Scaling group can launch On-Demand Instances, Spot Instances, or both. Users can specify multiple purchase options for their Auto Scaling group only when configuring the group to use a launch template. 
  • Spot Instances provide access to unused Amazon EC2 capacity at steep discounts relative to On-Demand prices. There are key differences between Spot Instances and On-Demand Instances:

    • The price for Spot Instances varies based on demand

    • Amazon EC2 can terminate an individual Spot Instance as the availability of, or price for, Spot Instances changes
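A minimal sketch of creating such a group with boto3 follows; it assumes a pre-existing launch template named web-template, and the group name and subnet IDs are placeholders:

    import boto3

    autoscaling = boto3.client("autoscaling")

    # Create a group of 2-4 instances spread across two subnets
    # (and therefore across their Availability Zones).
    autoscaling.create_auto_scaling_group(
        AutoScalingGroupName="web-asg",                          # hypothetical name
        LaunchTemplate={"LaunchTemplateName": "web-template",    # assumed to exist
                        "Version": "$Latest"},
        MinSize=2,
        MaxSize=4,
        DesiredCapacity=2,
        VPCZoneIdentifier="subnet-11111111,subnet-22222222",     # placeholder subnets
    )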

#02

Launch Templates

 
 

 

A launch template is similar to a launch configuration, in that it specifies instance configuration information. Included are the ID of the Amazon Machine Image (AMI), the instance type, a key pair, security groups, and the other parameters that users use to launch EC2 instances. However, defining a launch template instead of a launch configuration allows users to maintain multiple versions of the template.

  • With versioning, users can create a subset of the full set of parameters and then reuse it to create other templates or template versions. For example, users can create a default template that defines common configuration parameters and allow the other parameters to be specified as part of another version of the same template.
  •  If using launch configurations with Amazon EC2 Auto Scaling, users cannot create an Auto Scaling group that launches both Spot and On-Demand Instances or that specifies multiple instance types or multiple launch templates. Users need to use a launch template to configure these features. 
  • Launch templates enable users to use newer features of Amazon EC2. This includes the current generation of EBS volume types (gp3 and io2), EBS volume tagging, T2 Unlimited instances, and Dedicated Hosts, to name a few.
  • Dedicated Hosts are physical servers with EC2 instance capacity that are dedicated to a single customer's use, while Amazon EC2 Dedicated Instances also run on dedicated hardware. The advantage of using Dedicated Hosts over Dedicated Instances is that users can bring eligible software licenses from external vendors and use them on EC2 instances.
  • In a launch template, support for Dedicated Hosts (host tenancy) is only available if a host resource group is specified. Users cannot target a specific host ID or use host placement affinity.
  • A launch template lets users configure a network type (VPC or EC2-Classic), subnet, and Availability Zone. However, these settings are ignored in favor of what is specified in the Auto Scaling group.
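A minimal sketch of defining and versioning a launch template with boto3 is shown below; the template name, AMI ID, key pair, and security group are all placeholders:

    import boto3

    ec2 = boto3.client("ec2")

    # Version 1 of a template holding the common instance settings.
    ec2.create_launch_template(
        LaunchTemplateName="web-template",                # hypothetical name
        LaunchTemplateData={
            "ImageId": "ami-0123456789abcdef0",           # placeholder AMI ID
            "InstanceType": "t3.micro",
            "KeyName": "my-key-pair",                     # placeholder key pair
            "SecurityGroupIds": ["sg-0123456789abcdef0"], # placeholder security group
        },
    )

    # Later changes become new versions of the same template, for example:
    # ec2.create_launch_template_version(
    #     LaunchTemplateName="web-template",
    #     SourceVersion="1",
    #     LaunchTemplateData={"InstanceType": "t3.small"},
    # )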

 

A launch configuration is an instance configuration template that an Auto Scaling group uses to launch EC2 instances. When creating a launch configuration, users specify information for the instances, including the ID of the Amazon Machine Image (AMI), the instance type, a key pair, one or more security groups, and a block device mapping. Users who have launched an EC2 instance before have specified the same information in order to launch the instance.  

Users can use one launch configuration with multiple Auto Scaling groups. However, they can specify only one launch configuration for an Auto Scaling group at a time, and a launch configuration cannot be modified after it is created. To change the launch configuration for an Auto Scaling group, users need to create a new launch configuration and then update the Auto Scaling group with it.

  • In order to create an Auto Scaling group, users need to specify a launch configuration, a launch template, or an EC2 instance.
  • When creating an Auto Scaling group using an EC2 instance, Amazon EC2 Auto Scaling automatically creates a launch configuration and associates it with the Auto Scaling group.
  • If using launch templates, users can specify a launch template instead of a launch configuration or an EC2 instance. 
  • Users create the launch configuration by providing information about the image Auto Scaling uses to launch EC2 instances, such as the image ID, instance type, key pairs, security groups, and block device mapping (a sketch follows this list). To learn more, see the Amazon Machine Images (AMI) documentation.
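The same information expressed as a launch configuration, plus the update that points an existing group at it, could be sketched roughly as follows (boto3, with placeholder names and IDs):

    import boto3

    autoscaling = boto3.client("autoscaling")

    # Launch configurations cannot be modified, so a changed setting
    # means creating a new one...
    autoscaling.create_launch_configuration(
        LaunchConfigurationName="web-lc-v2",              # hypothetical name
        ImageId="ami-0123456789abcdef0",                  # placeholder AMI ID
        InstanceType="t3.micro",
        KeyName="my-key-pair",                            # placeholder key pair
        SecurityGroups=["sg-0123456789abcdef0"],          # placeholder security group
    )

    # ...and then updating the Auto Scaling group to use it.
    autoscaling.update_auto_scaling_group(
        AutoScalingGroupName="web-asg",                   # hypothetical group
        LaunchConfigurationName="web-lc-v2",
    )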

Tenancy defines how EC2 instances are distributed across physical hardware and affects pricing. There are three tenancy options available:

  • Shared (default) — Multiple AWS accounts may share the same physical hardware.
  • Dedicated Instance (dedicated) — Your instance runs on single-tenant hardware.
  • Dedicated Host (host) — Your instance runs on a physical server with EC2 instance capacity fully dedicated to your use, an isolated server with configurations that you can control.

#03

Launch Configurations

 
 

#04

Scaling Plans

 
 

 

In addition to creating a launch configuration and an Auto Scaling group, users need to create a scaling plan for the Auto Scaling group. A scaling plan tells Auto Scaling when and how to scale. Users can create a scaling plan based on the occurrence of specified conditions (dynamic scaling), or they can create a plan based on a specific schedule. Amazon EC2 Auto Scaling provides several ways to scale Auto Scaling groups:

Maintain current instance levels at all times: Users can configure Auto Scaling group to maintain a specified number of running instances at all times. To maintain the current instance levels, Amazon EC2 Auto Scaling performs a periodic health check on running instances within an Auto Scaling group.

Scale manually: Manual scaling is the most basic way to scale the resources, where users specify only the change in the maximum, minimum, or desired capacity of Auto Scaling group. Amazon EC2 Auto Scaling manages the process of creating or terminating instances to maintain the updated capacity. 

Scale based on a schedule: Scaling by schedule means that scaling actions are performed automatically as a function of time and date. This is useful when users know exactly when to increase or decrease the number of instances in the group, simply because the need arises on a predictable schedule.
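As a minimal sketch of schedule-based scaling (boto3, hypothetical group name, schedule, and sizes), a recurring action that scales the group up on weekday mornings could look like this:

    import boto3

    autoscaling = boto3.client("autoscaling")

    # Scale the group up at 08:00 UTC on weekdays, when load is expected to rise.
    autoscaling.put_scheduled_update_group_action(
        AutoScalingGroupName="web-asg",           # hypothetical group
        ScheduledActionName="weekday-morning-scale-out",
        Recurrence="0 8 * * 1-5",                 # cron-style schedule (UTC)
        MinSize=4,
        MaxSize=10,
        DesiredCapacity=6,
    )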

Scale based on demand: A more advanced way to scale resources is using scaling policies, which allow users to define parameters that control the scaling process. For example, assume a web application currently runs on two instances and the user wants the CPU utilization of the Auto Scaling group to stay at around 50 percent when the load on the application changes; using this method can be the better choice.

  • Scaling based on demand is useful for scaling in response to changing conditions, when users don't know when those conditions will change. Users can set up Amazon EC2 Auto Scaling to respond on their behalf. 

Use predictive scaling: Users can use Amazon EC2 Auto Scaling in combination with AWS Auto Scaling to scale resources across multiple services. AWS Auto Scaling can help maintain optimal availability and performance by combining predictive scaling and dynamic scaling (proactive and reactive approaches, respectively) to scale Amazon EC2 capacity faster. 

Auto Scaling for AWS Services

Users have multiple options for scaling resources. To configure automatic scaling for multiple resources across multiple services, use AWS Auto Scaling to create a scaling plan for the resources underlying their application. AWS Auto Scaling is also used to create predictive scaling for EC2 resources.

  • Amazon EC2 Auto Scaling helps users have the correct number of Amazon EC2 instances available to handle the load for their application.
  • In addition, Application Auto Scaling can scale Amazon ECS services, Amazon EC2 Spot Fleets, Amazon EMR clusters, Amazon AppStream 2.0 fleets, provisioned read and write capacity for Amazon DynamoDB tables and global secondary indexes, Amazon Aurora Replicas, and Amazon SageMaker endpoint variants.

EC2 Auto Scaling

 

Amazon EC2 Auto Scaling groups enable users to launch or terminate EC2 instances in an Auto Scaling group.

  • Amazon EC2 Auto Scaling scales out the group (adds more instances) to deal with high demand at peak times, and scales in the group (runs fewer instances) to reduce costs during periods of low utilization.
  • A scaling policy instructs Amazon EC2 Auto Scaling to track a specific CloudWatch metric, and it defines what action to take when the associated CloudWatch alarm is in ALARM. 
  • The metrics that are used to trigger an alarm are an aggregation of metrics coming from all of the instances in the Auto Scaling group.

Amazon EC2 Auto Scaling supports the following types of scaling policies:

  • Target tracking scaling: Increase or decrease the current capacity of the group based on a target value for a specific metric. 
  • Step scaling: Increase or decrease the current capacity of the group based on a set of scaling adjustments, known as step adjustments, that vary based on the size of the alarm breach.
  • Simple scaling: Increase or decrease the current capacity of the group based on a single scaling adjustment.
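For example, a step scaling policy and the CloudWatch alarm that triggers it could be sketched as follows (boto3; the group name, thresholds, and step sizes are all illustrative assumptions):

    import boto3

    autoscaling = boto3.client("autoscaling")
    cloudwatch = boto3.client("cloudwatch")

    # Step policy: add 1 instance for a small breach, 3 for a large one.
    policy = autoscaling.put_scaling_policy(
        AutoScalingGroupName="web-asg",               # hypothetical group
        PolicyName="cpu-step-scale-out",
        PolicyType="StepScaling",
        AdjustmentType="ChangeInCapacity",
        MetricAggregationType="Average",
        StepAdjustments=[
            {"MetricIntervalLowerBound": 0.0, "MetricIntervalUpperBound": 20.0,
             "ScalingAdjustment": 1},
            {"MetricIntervalLowerBound": 20.0, "ScalingAdjustment": 3},
        ],
    )

    # Alarm that fires the policy when average group CPU exceeds 70%.
    cloudwatch.put_metric_alarm(
        AlarmName="web-asg-high-cpu",
        Namespace="AWS/EC2",
        MetricName="CPUUtilization",
        Dimensions=[{"Name": "AutoScalingGroupName", "Value": "web-asg"}],
        Statistic="Average",
        Period=300,
        EvaluationPeriods=2,
        Threshold=70.0,
        ComparisonOperator="GreaterThanThreshold",
        AlarmActions=[policy["PolicyARN"]],
    )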

EC2 Spot Fleet requests

 

Amazon EC2 Spot Fleet requests: Launch or terminate instances from a Spot Fleet request, or automatically replace instances that get interrupted for price or capacity reasons. Automatic scaling is the ability to increase or decrease the target capacity of the customer Spot Fleet automatically based on demand. A Spot Fleet can either launch instances (scale out) or terminate instances (scale in), within the range that was specified, in response to one or more scaling policies. Spot Fleet supports the following types of automatic scaling:

  • Target tracking scaling: Increase or decrease the current capacity of the fleet based on a target value for a specific metric. This is similar to the way a thermostat maintains the temperature of a home: select the temperature and the thermostat does the rest.
  • Step scaling: Increase or decrease the current capacity of the fleet based on a set of scaling adjustments, known as step adjustments, that vary based on the size of the alarm breach.
  • Scheduled scaling: Increase or decrease the current capacity of the fleet based on the date and time

The scaling policies created for a Spot Fleet support a cooldown period, which is the number of seconds after a scaling activity completes during which previous trigger-related scaling activities can influence future scaling events.

  • Scale based on instance metrics with a 1-minute frequency to ensure a faster response to utilization changes. Scaling on metrics with a 5-minute frequency can result in slower response times and scaling on stale metric data.

ECS Auto Scaling

 

Automatic scaling is the ability to increase or decrease the desired count of tasks in the customer's Amazon ECS service automatically. Amazon ECS leverages the Application Auto Scaling service to provide this functionality. Amazon ECS publishes CloudWatch metrics with the service's average CPU and memory usage, so that users can use these and other CloudWatch metrics to scale out the service to deal with high demand at peak times, and to scale in the service to reduce costs during periods of low utilization. Amazon ECS Service Auto Scaling supports:

  • Target tracking scaling policies
  • Step scaling policies
  • Scheduled scaling

The Application Auto Scaling service needs permission to describe the Amazon ECS services and CloudWatch alarms, and to modify the customer's service desired count on their behalf. Service Auto Scaling is a combination of the Amazon ECS, CloudWatch, and Application Auto Scaling APIs: 

  • Services are created and updated with Amazon ECS, 
  • Alarms are created with CloudWatch, and 
  • Scaling policies are created with Application Auto Scaling.
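A rough sketch of these moving parts for an ECS service, using the Application Auto Scaling API through boto3 (the cluster and service names and the 50 percent target are hypothetical):

    import boto3

    appscaling = boto3.client("application-autoscaling")

    # Register the service's desired count as a scalable target (1-10 tasks).
    appscaling.register_scalable_target(
        ServiceNamespace="ecs",
        ResourceId="service/my-cluster/my-service",     # hypothetical cluster/service
        ScalableDimension="ecs:service:DesiredCount",
        MinCapacity=1,
        MaxCapacity=10,
    )

    # Target tracking on the service's average CPU utilization; Application
    # Auto Scaling creates and manages the CloudWatch alarms behind it.
    appscaling.put_scaling_policy(
        ServiceNamespace="ecs",
        ResourceId="service/my-cluster/my-service",
        ScalableDimension="ecs:service:DesiredCount",
        PolicyName="ecs-cpu-50-target-tracking",
        PolicyType="TargetTrackingScaling",
        TargetTrackingScalingPolicyConfiguration={
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
            },
            "TargetValue": 50.0,
        },
    )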

Aurora Auto Scaling

 

Aurora Auto Scaling dynamically adjusts the number of Aurora Replicas provisioned for an Aurora DB cluster using single-master replication. Aurora Auto Scaling is available for both Aurora MySQL and Aurora PostgreSQL. Aurora Auto Scaling enables the customer Aurora DB cluster to handle sudden increases in connectivity or workload. 

  • When the connectivity or workload decreases, Aurora Auto Scaling removes unnecessary Aurora Replicas. 
  • The scaling policy defines the minimum and maximum number of Aurora Replicas that Aurora Auto Scaling can manage. 
    • Using this policy customers can define and apply a scaling policy to an Aurora DB cluster.

Aurora Auto Scaling uses a scaling policy to adjust the number of Aurora Replicas in an Aurora DB cluster. Aurora Auto Scaling has the following components:

  • A service-linked role
  • Target metric: A predefined or custom metric and a target value for the metric, specified in a target-tracking scaling policy configuration.
  • Minimum and maximum capacity: Customers can specify the minimum and maximum number of Aurora Replicas (0 to 15) to be managed by Application Auto Scaling. 
  • A cooldown period: A cooldown period blocks subsequent scale-in or scale-out requests until the period expires. These blocks slow the deletions of Aurora Replicas in the Aurora DB cluster.
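Putting those components together for a hypothetical cluster (my-aurora-cluster), a boto3 sketch might look like this; the 60 percent target and cooldown values are illustrative:

    import boto3

    appscaling = boto3.client("application-autoscaling")

    # Let Application Auto Scaling manage between 1 and 15 Aurora Replicas.
    appscaling.register_scalable_target(
        ServiceNamespace="rds",
        ResourceId="cluster:my-aurora-cluster",          # hypothetical cluster ID
        ScalableDimension="rds:cluster:ReadReplicaCount",
        MinCapacity=1,
        MaxCapacity=15,
    )

    # Target tracking on average reader CPU, with explicit cooldown periods.
    appscaling.put_scaling_policy(
        ServiceNamespace="rds",
        ResourceId="cluster:my-aurora-cluster",
        ScalableDimension="rds:cluster:ReadReplicaCount",
        PolicyName="aurora-reader-cpu-60",
        PolicyType="TargetTrackingScaling",
        TargetTrackingScalingPolicyConfiguration={
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "RDSReaderAverageCPUUtilization"
            },
            "TargetValue": 60.0,
            "ScaleInCooldown": 300,
            "ScaleOutCooldown": 300,
        },
    )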

DynamoDB Auto Scaling

 

Amazon DynamoDB auto scaling uses the AWS Application Auto Scaling service to dynamically adjust provisioned throughput capacity on users behalf, in response to actual traffic patterns. This enables a table or a global secondary index to increase its provisioned read and write capacity to handle sudden increases in traffic, without throttling. When the workload decreases, Application Auto Scaling decreases the throughput. 

Enabling auto scaling for a DynamoDB table or a global secondary index lets it increase or decrease its provisioned read and write capacity to handle increases in traffic without throttling. With Application Auto Scaling, customers can create a scaling policy for a table or a global secondary index. 

  • The scaling policy contains a target utilization, the percentage of consumed provisioned throughput at a point in time. Application Auto Scaling uses a target tracking algorithm to adjust the provisioned throughput of the table (or index) upward or downward in response to actual workloads, so that the actual capacity utilization remains at or near the customer target utilization.
  • DynamoDB auto scaling also supports global secondary indexes. Every global secondary index has its own provisioned throughput capacity, separate from that of its base table.
  • DynamoDB auto scaling modifies provisioned throughput settings only when the actual workload stays elevated (or depressed) for a sustained period of several minutes.

When users create a scaling policy, Application Auto Scaling creates a pair of Amazon CloudWatch alarms on their behalf. Each pair represents the upper and lower boundaries for provisioned throughput settings. To enable DynamoDB auto scaling for the ProductCatalog table, users need to create a scaling policy. This policy specifies:

  • The table or global secondary index that the users want to manage.
  • Which capacity type to manage (read capacity or write capacity).
  • The upper and lower boundaries for the provisioned throughput settings.
  • The target utilization.
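For the ProductCatalog table mentioned above, a minimal boto3 sketch of such a read-capacity policy could look like this (the 70 percent target and the 5 to 500 capacity range are illustrative assumptions):

    import boto3

    appscaling = boto3.client("application-autoscaling")

    # Boundaries for provisioned read capacity on the table.
    appscaling.register_scalable_target(
        ServiceNamespace="dynamodb",
        ResourceId="table/ProductCatalog",
        ScalableDimension="dynamodb:table:ReadCapacityUnits",
        MinCapacity=5,
        MaxCapacity=500,
    )

    # Keep consumed read capacity at roughly 70% of what is provisioned.
    appscaling.put_scaling_policy(
        ServiceNamespace="dynamodb",
        ResourceId="table/ProductCatalog",
        ScalableDimension="dynamodb:table:ReadCapacityUnits",
        PolicyName="product-catalog-read-utilization",
        PolicyType="TargetTrackingScaling",
        TargetTrackingScalingPolicyConfiguration={
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "DynamoDBReadCapacityUtilization"
            },
            "TargetValue": 70.0,
        },
    )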

Auto Scaling Monitoring  


Monitoring is an important part of maintaining the reliability, availability, and performance of Amazon EC2 Auto Scaling and AWS solutions. AWS provides the following monitoring tools to watch Amazon EC2 Auto Scaling, report when something is wrong, and take automatic actions when appropriate:

Health Checks

Amazon EC2 Auto Scaling periodically performs health checks on the instances in an Auto Scaling group. If an instance does not pass its health check, it is marked unhealthy and terminated, and Amazon EC2 Auto Scaling launches a new instance to replace it. 

Capacity Rebalancing

Users can enable Capacity Rebalancing on new and existing Auto Scaling groups when using Spot Instances. When Capacity Rebalancing is turned on, Amazon EC2 Auto Scaling attempts to launch a new Spot Instance whenever Amazon EC2 reports that an existing Spot Instance is at an elevated risk of interruption. After launching the new instance, it then terminates the old one. 

AWS Personal Health Dashboard

The Personal Health Dashboard (PHD) displays information, and also provides notifications that are triggered by changes in the health of AWS resources. The information is presented in two ways: on a dashboard that shows recent and upcoming events organized by category, and in a full event log that shows all events from the past 90 days. 

CloudWatch Alarms

To detect unhealthy application behavior, CloudWatch helps by automatically monitoring certain metrics for AWS resources. Users can configure a CloudWatch alarm and set up an Amazon SNS notification that sends an email when a metric’s value is not what users expect or when certain anomalies are detected.  

CloudWatch Dashboards

CloudWatch dashboards are customizable home pages in the CloudWatch console. Users can use these pages to monitor resources in a single view, even including resources that are spread across different Regions. Users can use CloudWatch dashboards to create customized views of the metrics and alarms for AWS resources. 

CloudTrail Logs

AWS CloudTrail enables users to track the calls made to the Amazon EC2 Auto Scaling API by or on behalf of their AWS account. CloudTrail stores the information in log files in the Amazon S3 bucket that users specify. Users can use these log files to monitor the activity of their Auto Scaling groups. Logs include which requests were made, the source IP addresses the requests came from, who made the request, when the request was made, and so on. 

CloudWatch Logs

CloudWatch Logs enable users to monitor, store, and access their log files from Amazon EC2 instances, CloudTrail, and other sources. CloudWatch Logs can monitor information in the log files and notify users when certain thresholds are met. Users can also archive their log data in highly durable storage. 

Amazon Simple Notification Service Notifications

Users can configure Auto Scaling groups to send Amazon SNS notifications when Amazon EC2 Auto Scaling launches or terminates instances. 
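A minimal sketch of wiring this up with boto3, assuming an existing SNS topic (the group name and topic ARN below are placeholders):

    import boto3

    autoscaling = boto3.client("autoscaling")

    # Send an SNS notification whenever the group launches or terminates instances.
    autoscaling.put_notification_configuration(
        AutoScalingGroupName="web-asg",                               # hypothetical group
        TopicARN="arn:aws:sns:us-east-1:123456789012:asg-events",     # placeholder ARN
        NotificationTypes=[
            "autoscaling:EC2_INSTANCE_LAUNCH",
            "autoscaling:EC2_INSTANCE_TERMINATE",
        ],
    )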

EventBridge

Amazon EventBridge, formerly called CloudWatch Events, delivers a near real-time stream of system events that describe changes in AWS resources. EventBridge enables automated event-driven computing, as users can write rules that watch for certain events and trigger automated actions in other AWS services when these events happen. Users can also receive a two-minute warning when Spot Instances are about to be reclaimed by Amazon EC2. 
