Designing Proactive Monitoring Patterns of Costs and Usages using AWS Budgets

In one of our previous posts, we spoke about how AWS Cost Explorer helps us to explore and analyze spending across our AWS accounts, offering a wide range of filtering options to help us dig down into the details. 

With AWS Budgets, we can set up the reports and alarms we need to keep a close eye on our spending. Today we’re discussing a solution pattern that can be incorporated into our AWS Landscape in order to proactively monitor costs and usage based on forecasted values using AWS Budgets. 

Cost Explorer or AWS Budgets?

Cost Explorer and Budgets share some similarities. Both fit the FinOps model and the Cost Optimization pillar of the AWS Well-Architected Framework: AWS designed both services to help users identify, monitor, and manage their costs.

Both also help us avoid surprise costs by setting alerts that notify us of potential overspending or when we hit our set limit: Cost Anomaly alerts in Cost Explorer, and budget alerts in AWS Budgets.

Why AWS Budgets?

AWS Budgets is helpful when we want to set a custom spending limit and compare our actual spending to that budget. The goal is to react proactively as spending approaches the limit and to take action that mitigates overspending.

A set of alerts can notify us or other stakeholders when the cost and usage for a set of services approach or exceed a specific threshold.

Using AWS Budgets, we can also forecast the costs for the following month based on the last five months and the current month. However, forecasting needs a warm-up period: at least one month of usage data must be available before forecasts are generated.

We will showcase an example of AWS Budgets Action in use, configuring specific notifications and invocations that respond to changes in usage and cost status.

Solution Design

In this solution pattern, AWS Lambda is triggered by the SNS Topic when an AWS Budget event occurs. This function then checks to see if the event is actually over budget or forecasted to be over the predefined budget. 

The function then calls the Cost Explorer API to check which AWS service and which API calls cost the most on the previous day, and posts this information to an enriched SNS topic. The whole pattern is packaged as a customized AWS CDK L3 construct, which requires implementing the AWS Lambda function as well as wiring up the L1 construct for AWS Budgets.
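
The first step of the function, deciding whether a notification reports actual or forecasted overspend, could be sketched as follows. This is a minimal sketch and not the code from the repo: it assumes the Budgets notification arrives as plain text in the SNS `Message` field and contains a line such as `Alert Type: FORECASTED`. The names `classify_budget_alert` and `handler` are hypothetical.

```python
import re

# Assumption: the notification text contains a line like "Alert Type: ACTUAL".
ALERT_TYPE_PATTERN = re.compile(r"Alert Type:\s*(ACTUAL|FORECASTED)", re.IGNORECASE)


def classify_budget_alert(message: str) -> str:
    """Return 'ACTUAL', 'FORECASTED', or 'UNKNOWN' for a notification text."""
    match = ALERT_TYPE_PATTERN.search(message)
    return match.group(1).upper() if match else "UNKNOWN"


def handler(event, context):
    """Lambda entry point: classify every SNS record in the incoming event."""
    results = []
    for record in event.get("Records", []):
        # SNS-triggered Lambda events carry the payload under Records[].Sns.Message.
        message = record["Sns"]["Message"]
        results.append(classify_budget_alert(message))
    return results
```

Returning `UNKNOWN` instead of raising keeps the function tolerant of message formats that differ from the assumed one.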

Implementation

Here is the link to the full repo. However, we will describe each of the components of the solution construct.

AWS Budgets Alerts: AWS Budgets can trigger actions at monthly or daily granularity. We can either forecast total monthly costs and attach actions to the forecast, or set threshold values on a daily basis. Here is the AWS CDK implementation that enables the alerts:

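import { aws_budgets } from "aws-cdk-lib";

// L1 construct: a cost budget whose alert is delivered to an SNS topic.
new aws_budgets.CfnBudget(this, "cost-budget", {
  budget: {
    budgetName: props.budgetName,
    budgetType: "COST",
    timeUnit: props.granularity, // for example "DAILY" or "MONTHLY"
    budgetLimit: { amount: props.spend, unit: "USD" },
  },
  notificationsWithSubscribers: [
    {
      notification: {
        notificationType: "ACTUAL", // "FORECASTED" alerts on forecasted spend instead
        comparisonOperator: "GREATER_THAN",
        threshold: props.threshold, // percent of the budget limit
      },
      subscribers: [
        { subscriptionType: "SNS", address: snsTopic.topicArn },
      ],
    },
  ],
});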

As shown above, the threshold at which the alert fires is expressed relative to the budget limit. Currently, alert subscribers can be either an SNS topic or an email address. We have configured an SNS topic that has both an email address and the AWS Lambda function as subscribers.
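
Because the threshold is a percentage of the budget limit, the spend level at which the alert fires is simple to work out. The sketch below uses the values from the invocation example later in this post (a limit of 500 USD and a threshold of 80); the function name `alert_amount` is made up for illustration.

```python
def alert_amount(budget_limit: float, threshold_percent: float) -> float:
    """Spend level at which a percentage-based threshold is crossed."""
    return budget_limit * threshold_percent / 100.0


# A 500 USD budget with an 80 percent threshold alerts above 400 USD.
print(alert_amount(500, 80))  # 400.0
```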

AWS SNS Notification: Next, we define the SNS topic. It sends an email and also triggers the AWS Lambda function, which enriches the message and forwards it to the integrated Slack channels. Here is the AWS CDK implementation:

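import { Effect, PolicyStatement, ServicePrincipal } from "aws-cdk-lib/aws-iam";
import { Topic } from "aws-cdk-lib/aws-sns";
import { EmailSubscription } from "aws-cdk-lib/aws-sns-subscriptions";

const snsTopic = new Topic(this, "sns-budget-action-alert", {
  topicName: id + "-SNS-Budgets-Action-alert",
});

// Allow the AWS Budgets service to publish alerts to this topic.
snsTopic.addToResourcePolicy(
  new PolicyStatement({
    resources: [snsTopic.topicArn],
    effect: Effect.ALLOW,
    actions: ["SNS:Publish"],
    principals: [new ServicePrincipal("budgets.amazonaws.com")],
  })
);

// The email subscriber is optional.
if (props.email) {
  snsTopic.addSubscription(new EmailSubscription(props.email));
}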

AWS Lambda Enricher: This function implements the integration between our AWS landscape and Slack. The authentication token for connecting to Slack and sending callback messages is kept in AWS Systems Manager (SSM) Parameter Store.

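from boto3 import Session

# stsClient is defined elsewhere; presumably it holds the response of an
# STS assume_role call, which carries temporary credentials.
session = Session(
    aws_access_key_id=stsClient['Credentials']['AccessKeyId'],
    aws_secret_access_key=stsClient['Credentials']['SecretAccessKey'],
    aws_session_token=stsClient['Credentials']['SessionToken'])
ssm = session.client('ssm')

# Read the parameter and decrypt it if it is stored as a SecureString.
response = ssm.get_parameter(
    Name='/slack/bot/tool/token',
    WithDecryption=True)
slack_token = response['Parameter']['Value']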

Once we fetch this token, we use it to post messages to the relevant Slack channels. Here is a full documentation link for the initial setup for Slack-AWS integration.

Next, we configure the Lambda function to query cost and usage data through the Cost Explorer API in the Python SDK (boto3). The call returns values for the time period that is passed in as request parameters.

from datetime import date, timedelta

import boto3

ce = boto3.client('ce')
today = date.today()
yesterday = today - timedelta(days=1)

response = ce.get_cost_and_usage(
    TimePeriod={
        'Start': yesterday.isoformat(),  # For example, 2022-09-01
        'End': today.isoformat()  # The end date is exclusive
    },
    Granularity='DAILY',
    Metrics=[
        'BlendedCost',
    ],
    GroupBy=[
        {
            'Type': 'DIMENSION',
            'Key': 'SERVICE'
        }
    ]
)

groups = response['ResultsByTime'][0]['Groups']

# Find the service with the highest non-zero cost. The API returns amounts
# as strings, so convert to float before comparing.
highest_cost_service = None
highestvalue = 0.0
for group in groups:
    service = ''.join(group['Keys'])
    cost = float(group['Metrics']['BlendedCost']['Amount'])
    if cost > highestvalue:
        highest_cost_service = service
        highestvalue = cost

# Format for display, for example $1,234.56
highestvalue = '${:,.2f}'.format(highestvalue)

The above implementation finds the service with the highest cost on the previous day. We have a similar implementation that finds the most expensive API calls for that same service.
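
The per-API lookup is not shown in this post, so the following is only a sketch of how it might look: the same `get_cost_and_usage` call, filtered to one service and grouped by the `OPERATION` dimension. The helper names `top_group` and `top_operation_for_service` are hypothetical, and the Cost Explorer client is passed in as a parameter so the logic can be exercised without AWS access.

```python
from datetime import date, timedelta


def top_group(groups, metric="BlendedCost"):
    """Return (key, cost) for the group with the highest cost, or (None, 0.0)."""
    best_key, best_cost = None, 0.0
    for group in groups:
        # Amounts arrive as strings; compare them numerically.
        cost = float(group["Metrics"][metric]["Amount"])
        if cost > best_cost:
            best_key, best_cost = "".join(group["Keys"]), cost
    return best_key, best_cost


def top_operation_for_service(ce, service, day=None):
    """Find the most expensive API operation of one service for the previous day.

    `ce` is expected to be a Cost Explorer client, e.g. boto3.client("ce").
    `day` is the exclusive end date and defaults to today.
    """
    end = day or date.today()
    start = end - timedelta(days=1)
    response = ce.get_cost_and_usage(
        TimePeriod={"Start": start.isoformat(), "End": end.isoformat()},
        Granularity="DAILY",
        Metrics=["BlendedCost"],
        Filter={"Dimensions": {"Key": "SERVICE", "Values": [service]}},
        GroupBy=[{"Type": "DIMENSION", "Key": "OPERATION"}],
    )
    return top_group(response["ResultsByTime"][0]["Groups"])
```

Converting amounts to `float` before comparing matters here: compared as strings, "9.75" would rank above "12.5".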

Both values, the top service and its top API call, are then assembled into a message using the Slack message block format:

slackmessageblock.append({
    "type": "section",
    "fields": [
        {
            "type": "mrkdwn",
            "text": "*Service:*\n{}".format(slack_bot_report['service'])
        },
        {
            "type": "mrkdwn",
            "text": "*Cost:*\n{}".format(slack_bot_report['servicecost'])
        },
        {
            "type": "mrkdwn",
            "text": "*API call:*\n{}".format(slack_bot_report['apicall'])
        },
        {
            "type": "mrkdwn",
            "text": "*API call cost:*\n{}".format(slack_bot_report['apicost'])
        }
    ]
})

Together, the snippets above build the message inside the AWS Lambda function that the solution construct invokes.
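
How the finished blocks reach Slack is not shown in this post. One possible way, sketched here with only the Python standard library, is to call Slack's `chat.postMessage` Web API method with the bot token as a bearer token. The function names and the fallback text are assumptions, and the repo may well use a Slack SDK instead.

```python
import json
import urllib.request

SLACK_POST_MESSAGE_URL = "https://slack.com/api/chat.postMessage"


def build_slack_request(token, channel, blocks):
    """Build (but do not send) a chat.postMessage request carrying the blocks."""
    payload = {
        "channel": channel,
        "blocks": blocks,
        "text": "AWS Budgets alert",  # plain-text fallback for notifications
    }
    return urllib.request.Request(
        SLACK_POST_MESSAGE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json; charset=utf-8",
            "Authorization": "Bearer " + token,
        },
        method="POST",
    )


def post_to_slack(token, channel, blocks):
    """Send the message and return Slack's decoded JSON response."""
    request = build_slack_request(token, channel, blocks)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the payload easy to inspect in a unit test without network access.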

After creating this AWS Lambda function, we add it as a subscription to the newly created SNS topic:

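import * as path from "path";
import { Duration } from "aws-cdk-lib";
import * as lambda from "aws-cdk-lib/aws-lambda";
import { LambdaSubscription } from "aws-cdk-lib/aws-sns-subscriptions";
import * as lambdaPython from "@aws-cdk/aws-lambda-python-alpha";

new lambdaPython.PythonFunction(this, "reporter", {
  entry: path.join(__dirname, "..", "lambda", "reporter", "budgets"),
  index: "budget_enricher.py",
  handler: "handler",
  runtime: lambda.Runtime.PYTHON_3_8,
  timeout: Duration.minutes(1),
  environment: {
    // Fall back to an empty string when no Slack channel is configured.
    SLACK_CHANNEL: props.slackConfig?.slackChannel ?? "",
    ACCOUNT_NAME: props.accountFriendlyName,
    GRANULARITY: props.granularity,
  },
});

// Subscribe the function returned by createLambdaForSubscription to the topic.
snsTopic.addSubscription(
  new LambdaSubscription(this.createLambdaForSubscription(props))
);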

With that, the L3 construct for AWS Budgets and proactive monitoring is complete. At the time of writing, the AWS Constructs DEV Portal does not offer a ready-made construct that can be used out of the box, which is why we created this customized solution pattern using AWS CDK.

AWS CDK Invocation: Once the AWS CLI is set up for our account, we can instantiate the construct with the following parameters:

new Budgets(app, 'ForecastingBudgetscdkStack', {
  budgetName: "forecast-budget",
  accountFriendlyName: "test-construct",
  granularity: CostGranularity.MONTHLY,
  spend: 500,
  threshold: 80,
  email: "aritranag89@gmail.com",
  slackConfig: {
    slackChannel: "ABCDEFGH",
  },
});

After deploying the stack to our AWS account, we can verify that the budget has been created by opening the AWS Budgets console and finding the budget alert.

As discussed above, we can set up multiple budget actions, for example one per cost and usage granularity (daily or monthly).
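
To make the idea of several budgets concrete, here is a hedged sketch that builds one budget definition per granularity in the shape accepted by the boto3 `budgets` client's `create_budget` call. The helper name `build_budget_request`, the daily budget values, and the topic ARN are placeholders chosen for illustration; only the monthly values mirror the invocation above.

```python
def build_budget_request(name, time_unit, amount_usd, topic_arn, thresholds):
    """Build Budget and NotificationsWithSubscribers arguments for one budget.

    `thresholds` maps "ACTUAL" or "FORECASTED" to a percentage of the limit.
    """
    if time_unit not in ("DAILY", "MONTHLY", "QUARTERLY", "ANNUALLY"):
        raise ValueError("unsupported time unit: " + time_unit)
    notifications = []
    for notification_type, percent in thresholds.items():
        if notification_type not in ("ACTUAL", "FORECASTED"):
            raise ValueError("unsupported notification type: " + notification_type)
        notifications.append({
            "Notification": {
                "NotificationType": notification_type,
                "ComparisonOperator": "GREATER_THAN",
                "Threshold": float(percent),
                "ThresholdType": "PERCENTAGE",
            },
            "Subscribers": [{"SubscriptionType": "SNS", "Address": topic_arn}],
        })
    return {
        "Budget": {
            "BudgetName": name,
            "BudgetType": "COST",
            "TimeUnit": time_unit,
            "BudgetLimit": {"Amount": str(amount_usd), "Unit": "USD"},
        },
        "NotificationsWithSubscribers": notifications,
    }


# One budget per granularity, both publishing to the same placeholder topic.
topic = "arn:aws:sns:eu-west-1:123456789012:example-topic"
budget_requests = [
    build_budget_request("daily-budget", "DAILY", 20, topic, {"ACTUAL": 80}),
    build_budget_request("forecast-budget", "MONTHLY", 500, topic,
                         {"ACTUAL": 80, "FORECASTED": 100}),
]
```

Each dictionary could then be passed on as `create_budget(AccountId=..., **request)`; in the CDK construct the same structure maps onto the `budget` and `notificationsWithSubscribers` properties of `CfnBudget`.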

Once the whole stack is deployed in the account, we start receiving notifications in the Slack channel whenever the AWS Budgets action is executed. Here is an example of such a notification:

Final Thoughts

This is one pattern that can be leveraged to understand which services drive costs on both a monthly and a daily basis. Since the AWS Budgets events are published to the SNS topic, many other automations can be attached to it, such as starting or stopping RDS or EC2 instances and sending emails to the relevant stakeholders.
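
As an illustration of such a follow-up automation, the sketch below stops running EC2 instances that carry an opt-in tag. It is not part of the solution construct: the tag key `StopOnBudgetAlert` and the function name are invented for this example, pagination of `describe_instances` is ignored for brevity, and the EC2 client is passed in so the logic can be tested without AWS access.

```python
def stop_tagged_instances(ec2, tag_key="StopOnBudgetAlert", tag_value="true"):
    """Stop running EC2 instances that carry the given tag; return their IDs.

    `ec2` is expected to be an EC2 client, e.g. boto3.client("ec2").
    """
    response = ec2.describe_instances(
        Filters=[
            {"Name": "tag:" + tag_key, "Values": [tag_value]},
            {"Name": "instance-state-name", "Values": ["running"]},
        ]
    )
    instance_ids = [
        instance["InstanceId"]
        for reservation in response["Reservations"]
        for instance in reservation["Instances"]
    ]
    # Only call stop_instances when there is something to stop.
    if instance_ids:
        ec2.stop_instances(InstanceIds=instance_ids)
    return instance_ids
```

Restricting the action to explicitly tagged instances keeps a budget alert from stopping workloads that were never meant to be interrupted.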

References

  1. https://docs.aws.amazon.com/wellarchitected/latest/cost-optimization-pillar/welcome.html
  2. https://docs.aws.amazon.com/cost-management/latest/userguide/budgets-managing-costs.html
  3. https://docs.aws.amazon.com/cdk/v2/guide/getting_started.html
