ECS Cluster Hibernation-Scheduled Stop/Start

But Why?

Heavily used ECS clusters can cost a lot of Benjamins. In Production environments there is little room to cut that bill beyond the usual measures (provisioning appropriate instance types for the tasks, coding efficiently, architecting a well-planned infrastructure and so on), because the clusters must be running at all times. Development environments are a different story: there an Infrastructure Engineer can take action to lower the costs.

I shut down all clusters in the Development environment between 23:00 and 07:00 with a Python Lambda script that is deployed by Terraform. I stop the clusters by setting the MinSize, MaxSize and DesiredCapacity of their Auto Scaling Groups (ASGs) to 0, which makes all Container Instances shut down. But what about the initial ASG states? Where do the minimum, maximum and desired values go? I write them to a DynamoDB table before setting them to 0.

I start the clusters by reading the saved values for each cluster's ASG from that DynamoDB table and setting them back.

For the schedule, I use CloudWatch Events rules to trigger the Lambda scripts.
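To make the schedule concrete, here is a rough sketch of how the two rules could be created with boto3. It is only an illustration: the post deploys everything with Terraform, the rule names, target Id and Lambda ARNs are hypothetical, and the cron expressions assume the 23:00-07:00 window is meant in UTC, which is how these schedule expressions are evaluated.

```python
def schedule_expression(hour, minute=0):
    """Build a daily cron expression in the six-field CloudWatch Events format.

    Fields: minutes hours day-of-month month day-of-week year.
    """
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute 0-59")
    return "cron({0} {1} * * ? *)".format(minute, hour)


def create_schedule(events_client, rule_name, hour, lambda_arn):
    """Create (or update) a daily rule and point it at a Lambda function.

    events_client is expected to be boto3.client('events').
    rule_name and lambda_arn are hypothetical values supplied by the caller.
    """
    events_client.put_rule(
        Name=rule_name,
        ScheduleExpression=schedule_expression(hour),
        State="ENABLED",
    )
    events_client.put_targets(
        Rule=rule_name,
        Targets=[{"Id": rule_name + "-target", "Arn": lambda_arn}],
    )
```

One call with hour 23 for the stop function and one with hour 7 for the start function would cover the window. Each Lambda function also needs a resource-based permission that allows events.amazonaws.com to invoke it.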

Let’s see the scripts!

The Python Code

To Stop All ECS Clusters

from __future__ import print_function
import boto3
 
printer = lambda x : print(x)
 
def lambda_handler(event, context):
    as_client       = boto3.client('autoscaling')
    dynamodb_client = boto3.client('dynamodb')
 
    printer("Collecting Auto Scaling Group properties and writing them to DynamoDB...")
    asg_list = collect_as(as_client, dynamodb_client)
    printer("All Auto Scaling Group(s) is/are collected and written to DynamoDB!")
 
    printer("Setting MinSize, MaxSize and DesiredCapacity to 0...")
    stop_ecs_clusters(asg_list, as_client)
    printer("MinSize, MaxSize and DesiredCapacity are set to 0!")
 
    return "Script execution completed. See CloudWatch logs for complete output!"
 
def collect_as(as_client, dynamodb_client):
    asg_list = []
 
    asg = as_client.describe_auto_scaling_groups()['AutoScalingGroups']
 
    # enumerate() supplies the counter used in the log line below
    for i, each_asg in enumerate(asg):
        as_name = each_asg['AutoScalingGroupName']
        printer(str(i+1) + "-) " + as_name)
        asg_list.append(as_name)
 
        min_size = each_asg['MinSize']
        max_size = each_asg['MaxSize']
        des_cap  = each_asg['DesiredCapacity']
 
        write_initial_states_to_dynamodb(as_name, min_size, max_size, des_cap, dynamodb_client)
 
    return asg_list
 
def write_initial_states_to_dynamodb(as_name, min_val, max_val, des_val, dynamodb_client):
    update_query = "SET MinVal = :minimum_value, MaxVal = :maximum_value, DesVal = :desired_value"
 
    request = dynamodb_client.update_item(
        TableName                 = "ASGValues",
        Key                       = {
            'ASGName': {
                'S': as_name
            }
        },
        UpdateExpression          = update_query,
        ExpressionAttributeValues = {
            ':minimum_value': {
                'N': str(min_val)
            },
            ':maximum_value': {
                'N': str(max_val)
            },
            ':desired_value': {
                'N': str(des_val)
            }
        }
    )
 
def stop_ecs_clusters(asg_list, as_client):
    for each_asg in asg_list:
        described_as = as_client.update_auto_scaling_group(
            AutoScalingGroupName    = each_asg,
            MinSize                 = 0,
            MaxSize                 = 0,
            DesiredCapacity         = 0
        )
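Two things are worth guarding against in the stop script. If it runs twice in a row, the second run records 0/0/0 in DynamoDB and the original sizes are lost. Also, describe_auto_scaling_groups returns results in pages, so a single call may not list every group. Below is a minimal sketch of both guards. The helper names are my own and not part of the script above, and as_client is assumed to be boto3.client('autoscaling').

```python
def iter_auto_scaling_groups(as_client):
    """Yield every Auto Scaling Group, following pagination."""
    paginator = as_client.get_paginator('describe_auto_scaling_groups')
    for page in paginator.paginate():
        for group in page['AutoScalingGroups']:
            yield group


def is_already_stopped(group):
    """True when MinSize, MaxSize and DesiredCapacity are all 0."""
    return (group['MinSize'] == 0 and
            group['MaxSize'] == 0 and
            group['DesiredCapacity'] == 0)


def groups_to_stop(as_client):
    """Return only the groups whose current sizes are still worth saving."""
    return [group for group in iter_auto_scaling_groups(as_client)
            if not is_already_stopped(group)]
```

The same filter is a natural place to restrict the script to the cluster ASGs only (for example by name or tag), since the script as written touches every ASG in the account and region.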

To Start All ECS Clusters

from __future__ import print_function
import boto3
 
printer = lambda x : print(x)
 
def lambda_handler(event, context):
    asg_client      = boto3.client('autoscaling')
    dynamodb_client = boto3.client('dynamodb')
 
    printer("Collecting Auto Scaling Group properties from DynamoDB...")
    asg_dictionary = collect_as(asg_client, dynamodb_client)
    printer("Auto Scaling Group property(s) has/have been collected!")
 
    printer("Setting MinSize, MaxSize and DesiredCapacity to their actual values...")
    start_ecs_clusters(asg_dictionary, asg_client)
    printer("MinSize, MaxSize and DesiredCapacity are set to their actual values!")
 
    return "Script execution completed. See CloudWatch logs for complete output!"
 
def collect_as(asg_client, dynamodb_client):
    asg_dictionary = {}
 
    asg = asg_client.describe_auto_scaling_groups()['AutoScalingGroups']
 
    for i in range(0, len(asg)):
        asg_dictionary[i] = {}
 
        asg_name = asg[i]['AutoScalingGroupName']
        printer(str(i+1) + "-) " + asg_name)
        asg_dictionary[i]['asg_name'] = asg_name
 
        read_initial_states_from_dynamodb(i, asg_name, asg_dictionary, dynamodb_client)
 
    return asg_dictionary
 
def read_initial_states_from_dynamodb(i, asg_name, asg_dictionary, dynamodb_client):
    response = dynamodb_client.get_item(
        TableName = "ASGValues",
        Key = {
            'ASGName': {
                'S': asg_name
            }
        },
        AttributesToGet = [
            'MinVal',
            'MaxVal',
            'DesVal'
        ]
    )
 
    min_val                      = response['Item']['MinVal']['N']
    asg_dictionary[i]['min_val'] = int(min_val)
 
    max_val                      = response['Item']['MaxVal']['N']
    asg_dictionary[i]['max_val'] = int(max_val)
 
    des_val                      = response['Item']['DesVal']['N']
    asg_dictionary[i]['des_val'] = int(des_val)
 
 
def start_ecs_clusters(asg_dictionary, asg_client):
    for i in range(0, len(asg_dictionary)):
        # Restore the saved sizes; the keys must match the ones filled in
        # by read_initial_states_from_dynamodb (min_val, max_val, des_val).
        described_asg = asg_client.update_auto_scaling_group(
            AutoScalingGroupName = asg_dictionary[i]['asg_name'],
            MinSize              = asg_dictionary[i]['min_val'],
            MaxSize              = asg_dictionary[i]['max_val'],
            DesiredCapacity      = asg_dictionary[i]['des_val']
        )
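If an Auto Scaling Group has no record in the table (for example, a group created after the last stop run), get_item returns a response without an 'Item' key and the lookups in read_initial_states_from_dynamodb raise a KeyError. Here is a small sketch of a more defensive parse. parse_saved_state is a hypothetical helper, not part of the original script.

```python
def parse_saved_state(response):
    """Convert a DynamoDB get_item response into plain integers.

    Returns None when the group has no saved record or the record is
    incomplete, so the caller can skip that group instead of crashing.
    """
    item = response.get('Item')
    if not item:
        return None
    try:
        return {
            'min_val': int(item['MinVal']['N']),
            'max_val': int(item['MaxVal']['N']),
            'des_val': int(item['DesVal']['N']),
        }
    except KeyError:
        return None
```

The start loop could then skip any group for which this returns None and log its name, rather than failing on the first group without a record.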

This Post Has 3 Comments

  1. Hello Mert. Thanks for this tutorial. It is exactly what I was looking for. I am new to Python and AWS. Could you tell me what permissions are needed to run the script? Another question: where is the name of the table specified? Thanks.

    1. Hello Sergio,

      I’m glad this is helpful for you. The table name is “ASGValues”, as declared in the scripts. As for the permissions, off the top of my head, I think you’ll need autoscaling:DescribeAutoScalingGroups, autoscaling:UpdateAutoScalingGroup, dynamodb:UpdateItem and dynamodb:GetItem. Or you can use autoscaling:* and dynamodb:* if you want to get it working ASAP.

      1. Hello Mert. Thank you. I was able to implement the functions and understand the code better. I am not an AWS expert; in fact, I had to use ecs_client = boto3.client('ecs'), which can return all services in a given cluster. So I was able to change the DesiredCount attribute (the only one available), but that is enough to stop and start the services.
