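Using multiple instance types

The best practice for an ASG (Auto Scaling group) is to use multiple instance types. With a single instance type, the group is exposed to capacity shortages for that type, and when capacity runs out no instances can be launched. With multiple instance types, the group is far more resilient to that risk. We also usually combine Spot and On-Demand (OD) Instances to save cost. This raises questions such as how to split capacity between Spot and OD, and how to control spending when using multiple instance types.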

price protection

Price protection is an ASG feature, described as follows:

Price protection is a feature that protects your Auto Scaling group from extreme price differences across instance types. When you create a new Auto Scaling group or update an existing Auto Scaling group with attribute-based instance type selection, we enable price protection by default. Optionally, you can choose your price protection thresholds for Spot and On-Demand Instances. When you do this, Amazon EC2 Auto Scaling doesn’t select instance types with prices higher than your specified thresholds. The thresholds represent what you are willing to pay, defined in terms of a percentage above a baseline, rather than as absolute values. The baseline is determined by the price of the least expensive current generation M, C, or R instance type with your specified attributes. If your attributes don’t match any M, C, or R instance types, we use the lowest priced instance type.

If you don’t specify a threshold, the following are used by default:

  • For On-Demand Instances, the price protection threshold is set to 20 percent.
  • For Spot Instances, the price protection threshold is set to 100 percent.

You can update these values when creating your Auto Scaling group in the Amazon EC2 Auto Scaling console. On the Choose instance launch options page, choose the desired price protection attribute from the Additional instance attributes drop-down list. Then, type or choose a value for the attribute in the text box. You can also update these values later by editing the Auto Scaling group from the console or by using the AWS CLI or an SDK.
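The threshold arithmetic described above can be sketched in Python. This is a minimal illustration, not AWS's implementation: the instance type names and hourly prices are invented, `price_ceiling` and `allowed_types` are hypothetical helper names, and the baseline is simply taken as the cheapest entry in the list rather than the cheapest current generation M, C, or R type.

```python
# Illustrative only: prices are invented and the helper names are hypothetical.

def price_ceiling(baseline_price: float, threshold_percent: float) -> float:
    """Highest acceptable price: the baseline plus threshold_percent percent of it."""
    return baseline_price * (1 + threshold_percent / 100)


def allowed_types(prices: dict[str, float], baseline_price: float,
                  threshold_percent: float) -> list[str]:
    """Instance types whose price does not exceed the ceiling."""
    ceiling = price_ceiling(baseline_price, threshold_percent)
    return sorted(name for name, price in prices.items() if price <= ceiling)


# Hypothetical prices in USD per hour.
prices = {"type-a": 0.170, "type-b": 0.192, "type-c": 0.250}
baseline = min(prices.values())  # stand-in for the real baseline selection

# 20 percent is the On-Demand default quoted above.
print(allowed_types(prices, baseline, 20))
# 100 percent is the Spot default quoted above, applied to the same list for illustration.
print(allowed_types(prices, baseline, 100))
```

With these made-up numbers, a 20 percent threshold excludes `type-c`, while a 100 percent threshold keeps all three types.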

Allocation strategies

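An allocation strategy determines how the ASG launches OD and Spot Instances from the list of instance types. The ASG first keeps the number of instances balanced across AZs, and then applies the configured allocation strategy to launch new instances.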

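Spot Instances

  • capacity-optimized: Spot capacity changes over time. This strategy launches instance types whose capacity is plentiful now and is predicted to stay plentiful for some time, so the Spot Instances are less likely to be reclaimed.

  • capacity-optimized-prioritized: when several instance types all have plentiful capacity and we also want to set a launch priority among them, use this strategy.

  • lowest-price: the ASG launches the cheapest Spot Instances. Because capacity is not considered, the Spot interruption rate can rise sharply.

  • price-capacity-optimized (recommended): considers both capacity and price.

For newcomers, price-capacity-optimized is the recommended strategy.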

On-Demand

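OD has only two allocation strategies: lowest-price and prioritized.

  • lowest-price: the ASG launches the lowest-priced instance type among all candidates.
  • prioritized: the ASG launches instance types in priority order, starting with the highest priority. If the priority order is c5.large, c4.large, c3.large, then c5.large is launched first.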
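The difference between the two On-Demand strategies can be shown with a toy Python model. It is only a sketch: the prices and the availability set are invented, `pick_lowest_price` and `pick_prioritized` are hypothetical names, and falling back to the next type when the first one has no capacity is an assumption of this sketch rather than something stated above.

```python
from typing import Optional

# Toy model of the two On-Demand strategies; prices and availability are invented.


def pick_lowest_price(prices: dict[str, float]) -> str:
    """lowest-price: choose the cheapest instance type."""
    return min(prices, key=prices.get)


def pick_prioritized(priority_order: list[str], available: set[str]) -> Optional[str]:
    """prioritized: choose the first type in priority order that has capacity."""
    for instance_type in priority_order:
        if instance_type in available:
            return instance_type
    return None  # nothing in the list can be launched


priority_order = ["c5.large", "c4.large", "c3.large"]
hypothetical_prices = {"c5.large": 0.09, "c4.large": 0.10, "c3.large": 0.08}

print(pick_lowest_price(hypothetical_prices))
print(pick_prioritized(priority_order, available={"c5.large", "c4.large", "c3.large"}))
print(pick_prioritized(priority_order, available={"c4.large", "c3.large"}))
```

With these inputs, lowest-price picks c3.large because it is the cheapest in the invented price list, while prioritized picks c5.large when it is available and c4.large when it is not.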

The price-capacity-optimized strategy for Spot

Overview

The price-capacity-optimized allocation strategy makes Spot allocation decisions based on both capacity availability and Spot prices. In comparison to the lowest-price allocation strategy, the price-capacity-optimized strategy doesn’t always attempt to launch in the absolute lowest priced Spot Instance pool. Instead, price-capacity-optimized attempts to diversify as much as possible across the multiple low-priced pools with high capacity availability. As a result, the price-capacity-optimized strategy in most cases has a higher chance of getting Spot capacity and delivers lower interruption rates when compared to the lowest-price strategy. If you factor in the cost associated with retrying the interrupted requests, then the price-capacity-optimized strategy becomes even more attractive from a savings perspective over the lowest-price strategy.

We recommend the price-capacity-optimized allocation strategy for workloads that require optimization of cost savings, Spot capacity availability, and interruption rates. For existing workloads using lowest-price strategy, we recommend price-capacity-optimized strategy as a replacement. The capacity-optimized allocation strategy is still suitable for workloads that either use similarly priced instances, or ones where the cost of interruption is so significant that any cost saving is inadequate in comparison to a marginal increase in interruptions.

Walkthrough

In this section, we illustrate how the price-capacity-optimized allocation strategy deploys Spot capacity when compared to the other two allocation strategies. The following example configuration shows how Spot capacity could be allocated in an Auto Scaling group using the different allocation strategies:

{
    "AutoScalingGroupName": "myasg",
    "MixedInstancesPolicy": {
        "LaunchTemplate": {
            "LaunchTemplateSpecification": {
                "LaunchTemplateId": "lt-abcde12345"
            },
            "Overrides": [
                {
                    "InstanceRequirements": {
                        "VCpuCount": {
                            "Min": 4,
                            "Max": 4
                        },
                        "MemoryMiB": {
                            "Min": 0,
                            "Max": 16384
                        },
                        "InstanceGenerations": [
                            "current"
                        ],
                        "BurstablePerformance": "excluded",
                        "AcceleratorCount": {
                            "Max": 0
                        }
                    }
                }
            ]
        },
        "InstancesDistribution": {
            "OnDemandPercentageAboveBaseCapacity": 0,
            "SpotAllocationStrategy": "spot-allocation-strategy"
        }
    },
    "MinSize": 10,
    "MaxSize": 100,
    "DesiredCapacity": 60,
    "VPCZoneIdentifier": "subnet-a12345a,subnet-b12345b,subnet-c12345c"
}
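The "spot-allocation-strategy" value in this configuration is a placeholder that is replaced by each strategy under test. A minimal Python sketch of that substitution follows. `with_strategy` is a hypothetical helper, and only the `InstancesDistribution` block is reproduced here. The set of valid names is taken from the Spot strategies listed earlier in this article.

```python
import copy
import json

# Spot allocation strategy names listed earlier in this article.
SPOT_STRATEGIES = {
    "capacity-optimized",
    "capacity-optimized-prioritized",
    "lowest-price",
    "price-capacity-optimized",
}

# The part of the configuration that changes between test runs.
BASE_DISTRIBUTION = {
    "OnDemandPercentageAboveBaseCapacity": 0,
    "SpotAllocationStrategy": "spot-allocation-strategy",  # placeholder
}


def with_strategy(distribution: dict, strategy: str) -> dict:
    """Return a copy of the InstancesDistribution block with the strategy set."""
    if strategy not in SPOT_STRATEGIES:
        raise ValueError(f"unknown Spot allocation strategy: {strategy}")
    updated = copy.deepcopy(distribution)  # leave the template untouched
    updated["SpotAllocationStrategy"] = strategy
    return updated


print(json.dumps(with_strategy(BASE_DISTRIBUTION, "price-capacity-optimized"), indent=4))
```

Rejecting unknown names up front catches typos such as "capacity-optimize" before the configuration is sent anywhere.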

First, Amazon EC2 Auto Scaling attempts to balance capacity evenly across Availability Zones (AZs). Next, Amazon EC2 Auto Scaling applies the Spot allocation strategy in each Availability Zone, using the 30+ instance types selected by attribute-based instance type selection. The results after testing different allocation strategies are as follows:

  • Price-capacity-optimized strategy diversifies over multiple low-priced Spot Instance pools that are optimized for capacity availability.
  • Capacity-optimized strategy identifies Spot Instance pools that are only optimized for capacity availability.
  • Lowest-price strategy by default allocates the two lowest priced Spot Instance pools that aren’t optimized for capacity availability.
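The first step, balancing capacity across Availability Zones before any allocation strategy is applied, can be sketched as follows. This is a simplified model: `balance_across_zones` is a hypothetical helper, the zone names are placeholders, each of the three subnets in the example configuration is assumed to sit in a different AZ, and which zones receive any leftover instances is an arbitrary choice of this sketch.

```python
def balance_across_zones(desired_capacity: int, zones: list[str]) -> dict[str, int]:
    """Split desired_capacity as evenly as possible across the given zones."""
    share, remainder = divmod(desired_capacity, len(zones))
    # The first `remainder` zones each take one extra instance.
    return {zone: share + (1 if i < remainder else 0) for i, zone in enumerate(zones)}


zones = ["az-1", "az-2", "az-3"]  # hypothetical names, one per subnet
print(balance_across_zones(60, zones))
```

For the example configuration (desired capacity 60, three subnets) this gives 20 instances per zone. The Spot allocation strategy then decides which pools supply those 20 instances in each zone.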

To find out how each allocation strategy fares regarding Spot savings and capacity, we compare ‘Cost of Auto Scaling group’ (number of instances x Spot price/hour for each type of instance) and ‘Spot interruptions rate’ (number of instances interrupted/number of instances launched) for each allocation strategy. We use fictional numbers for the purpose of this post. However, you can use the Cloud Intelligence Dashboards to find the actual Spot Saving, and the Amazon EC2 Spot interruption dashboard to log Spot Instance interruptions. The example results after a 30-day period are as follows:

Allocation strategy      | Instance allocation          | Cost of Auto Scaling group | Spot interruptions rate
price-capacity-optimized | 40 c6i.xlarge, 20 c5.xlarge  | $4.80/hour                 | 3%
capacity-optimized       | 60 c5.xlarge                 | $5.00/hour                 | 2%
lowest-price             | 30 c5a.xlarge, 30 m5n.xlarge | $4.75/hour                 | 20%

As per the above table, with the price-capacity-optimized strategy, the cost of the Auto Scaling group is only 5 cents (1%) higher, whereas the rate of Spot interruptions is six times lower (3% vs 20%) than the lowest-price strategy. In summary, from this exercise you learn that the price-capacity-optimized strategy provides the optimal Spot experience that is the best of both the lowest-price and capacity-optimized allocation strategies.
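The comparison in the paragraph above can be reproduced with a few lines of Python. The figures are the fictional ones from the table, and the variable names are only illustrative.

```python
# Figures copied from the example table (fictional numbers from the text).
results = {
    "price-capacity-optimized": {"cost_per_hour": 4.80, "interruption_rate": 0.03},
    "capacity-optimized": {"cost_per_hour": 5.00, "interruption_rate": 0.02},
    "lowest-price": {"cost_per_hour": 4.75, "interruption_rate": 0.20},
}

pco = results["price-capacity-optimized"]
lowest = results["lowest-price"]

# Extra hourly cost of price-capacity-optimized relative to lowest-price.
extra_cost = pco["cost_per_hour"] - lowest["cost_per_hour"]
extra_cost_pct = extra_cost / lowest["cost_per_hour"] * 100

# How many times lower the interruption rate is.
interruption_ratio = lowest["interruption_rate"] / pco["interruption_rate"]

print(f"extra cost: ${extra_cost:.2f}/hour ({extra_cost_pct:.1f}%)")
print(f"interruption rate: {interruption_ratio:.1f}x lower")
```

The extra cost works out to 5 cents per hour, a little over 1%, and the interruption rate is between six and seven times lower, which matches the rounded figures quoted in the text.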

Common use-cases of price-capacity-optimized allocation strategy

Earlier we mentioned that the price-capacity-optimized allocation strategy is recommended for most Spot workloads. To elaborate further, in this section we explore some of these common workloads.

Stateless and fault-tolerant workloads

Stateless workloads that can complete ongoing requests within two minutes of a Spot interruption notice, and the fault-tolerant workloads that have a low cost of retries, are the best fit for the price-capacity-optimized allocation strategy. This category has workloads such as stateless containerized applications, microservices, web applications, data and analytics jobs, and batch processing.

Workloads with a high cost of interruption

Workloads that have a high cost of interruption associated with an expensive cost of retries should implement checkpointing to lower the cost of interruptions. By using checkpointing, you make the price-capacity-optimized allocation strategy a good fit for these workloads, as it allocates capacity from the low-priced Spot Instance pools that offer a low Spot interruptions rate. This category has workloads such as long Continuous Integration (CI), image and media rendering, Deep Learning, and High Performance Compute (HPC) workloads.

Conclusion

We recommend that customers use the price-capacity-optimized allocation strategy as the default option. The price-capacity-optimized strategy helps Amazon EC2 Auto Scaling groups and Amazon EC2 Fleet provision target capacity with an optimal experience. Updating to the price-capacity-optimized allocation strategy is as simple as updating a single parameter in an Amazon EC2 Auto Scaling group and Amazon EC2 Fleet.

To learn more about allocation strategies for Spot Instances, visit the Spot allocation strategies documentation page.