Mastering AWS Account Decommissioning At Scale: A Technical Deep Dive

22 February 2024

In multi-account AWS environments, managing the entire account lifecycle, from creation and maintenance to closure, is a challenge. One routine part of that lifecycle, decommissioning an account, has traditionally been a manual and time-intensive endeavour.

This blog post outlines an automated approach to one part of that lifecycle: closing accounts. The method uses AWS Step Functions, AWS Lambda, AWS Organizations, API calls, and various other AWS services.

Understanding the Problem

An organisation that operates a complex, multi-account AWS environment creates accounts for various purposes, often for projects that eventually end, at which point the accounts need to be closed. Closing an AWS account manually is both time-consuming and error-prone.

The challenge comes from removing the resources that were deployed organisation-wide and are managed through stacks as Infrastructure as Code (IaC). In our experience, this process often takes multiple hours, especially when active resources and manual approvals are involved.

What Decommissioning Is and Why It Should Be Done

Decommissioning an AWS account involves carefully removing active resources and associated components, with the goal of optimising the account before shutting it down. This process is crucial in promoting efficiency, cost-effectiveness, and reinforcing security controls in cloud environments. 

At the completion of projects, proper decommissioning is essential to reducing unnecessary expenses and mitigating security risks associated with unused, individual accounts. Failing to decommission accounts can leave businesses vulnerable to security breaches, resource inefficiencies, and complications in managing their cloud systems. 

Furthermore, non-compliance issues may arise, potentially damaging the organisation's reputation and compromising its financial integrity. By implementing decommissioning practices, businesses can help protect the performance and security of their cloud operations.

The Manual Decommissioning Process

Manually decommissioning an AWS account is a time-intensive and error-prone process. It often takes multiple hours, because people have to be informed and approvals have to be obtained at several stages.

The process involves checking for active resources, informing users by email, and, if no active resources are found, removing AWS CloudFormation stack sets in each region along with organisational resources. Every one of these steps matters, and together they demand substantial time and effort.

Automated Decommissioning Advantage

In contrast, the automated process handles these steps without manual intervention, which significantly reduces the time required for decommissioning and makes the closure of AWS accounts more consistent and less error-prone.

Automating AWS Account Decommissioning: A Step-by-Step Approach

The automated approach to AWS account decommissioning unfolds through the following key steps:

Step 1: Decommission Approval Workflow

AWS Step Functions starts the process by invoking a Lambda function, which sends an approval request by email to the account owner. Once the owner approves, the automated sequence continues.
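One common way to build such an approval step is the Step Functions task token callback: the state machine pauses, the Lambda function emails links that carry the token, and a second handler resumes or fails the execution. The sketch below assumes that pattern; the post does not say which mechanism was used. The function names, the approval URL, and the use of Amazon SES for email are illustrative assumptions. The `ses` and `sfn` arguments are expected to be `boto3.client("ses")` and `boto3.client("stepfunctions")`.

```python
import json
from typing import Any
from urllib.parse import urlencode


def build_approval_links(base_url: str, task_token: str) -> dict[str, str]:
    """Return approve/reject URLs that carry the Step Functions task token."""
    links = {}
    for decision in ("approve", "reject"):
        query = urlencode({"decision": decision, "taskToken": task_token})
        links[decision] = f"{base_url}?{query}"
    return links


def send_approval_request(ses: Any, sender: str, owner_email: str,
                          account_id: str, links: dict[str, str]) -> None:
    """Email the account owner the two decision links (SES is an assumption)."""
    body = (
        f"Account {account_id} is scheduled for decommissioning.\n"
        f"Approve: {links['approve']}\n"
        f"Reject: {links['reject']}\n"
    )
    ses.send_email(
        Source=sender,
        Destination={"ToAddresses": [owner_email]},
        Message={
            "Subject": {"Data": f"Approval needed: decommission {account_id}"},
            "Body": {"Text": {"Data": body}},
        },
    )


def record_decision(sfn: Any, task_token: str, decision: str, account_id: str) -> str:
    """Resume the paused execution on approval, fail it on anything else."""
    if decision == "approve":
        sfn.send_task_success(
            taskToken=task_token,
            output=json.dumps({"accountId": account_id, "approved": True}),
        )
        return "approved"
    sfn.send_task_failure(
        taskToken=task_token,
        error="DecommissionRejected",  # hypothetical error name
        cause=f"Owner rejected decommissioning of {account_id}",
    )
    return "rejected"
```

Treating every answer other than an explicit "approve" as a rejection keeps the workflow from continuing on a malformed or tampered link.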

Step 2: Identifying Consumer-Created Resources

Utilising the AWS Cost Explorer API, Lambda functions dynamically analyse the last 24 hours of data to identify active resources. Instead of removing these resources, the system gathers information, including the region where they exist. This information is used to notify account owners/consumers about the active resources.
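As a rough illustration of this step, the sketch below queries Cost Explorer for the previous day, grouped by service and region, and keeps every pair that incurred a cost. It assumes the call is made from the management account with a `LINKED_ACCOUNT` filter, uses the most recent daily bucket as an approximation of "the last 24 hours", and treats non-zero cost as a proxy for an active resource. The function name and the output shape are hypothetical. `ce` is expected to be `boto3.client("ce")`.

```python
from datetime import date, timedelta
from typing import Any


def find_active_resources(ce: Any, account_id: str, today: date) -> list[dict[str, Any]]:
    """Return service/region pairs in the account that incurred cost yesterday."""
    response = ce.get_cost_and_usage(
        # End is exclusive, so this covers exactly one daily bucket.
        TimePeriod={
            "Start": (today - timedelta(days=1)).isoformat(),
            "End": today.isoformat(),
        },
        Granularity="DAILY",
        Metrics=["UnblendedCost"],
        Filter={"Dimensions": {"Key": "LINKED_ACCOUNT", "Values": [account_id]}},
        GroupBy=[
            {"Type": "DIMENSION", "Key": "SERVICE"},
            {"Type": "DIMENSION", "Key": "REGION"},
        ],
    )
    # Pagination via NextPageToken is omitted to keep the sketch short.
    active = []
    for period in response["ResultsByTime"]:
        for group in period["Groups"]:
            service, region = group["Keys"]
            amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
            if amount > 0:
                active.append({"service": service, "region": region, "cost": amount})
    return active
```

The resulting list can be formatted straight into the notification for the account owner. Cost data can lag behind actual usage, so a very recently created resource may not appear yet.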

Step 3: Removing AWS Landing Zone Resources

AWS CloudFormation StackSets, using the OrganizationAccountAccessRole, delete the stacks that were deployed to the account across the organisation, driven by API calls. Access policies keep the resource removal secure and efficient.
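A minimal sketch of the stack removal, assuming stack sets with self-managed permissions (service-managed stack sets take `DeploymentTargets` instead of `Accounts`). For each stack set it looks up the regions in which the account has a stack instance and requests their deletion. The function name and the list of stack set names are hypothetical inputs. `cfn` is expected to be `boto3.client("cloudformation")` in the administrator account.

```python
from typing import Any


def delete_account_stack_instances(cfn: Any, stack_set_names: list[str],
                                   account_id: str) -> dict[str, list[str]]:
    """Request deletion of the account's stack instances; return regions per stack set."""
    requested: dict[str, list[str]] = {}
    for name in stack_set_names:
        regions: set[str] = set()
        token = None
        while True:
            kwargs = {"StackSetName": name, "StackInstanceAccount": account_id}
            if token:
                kwargs["NextToken"] = token
            page = cfn.list_stack_instances(**kwargs)
            regions.update(summary["Region"] for summary in page["Summaries"])
            token = page.get("NextToken")
            if not token:
                break
        if not regions:
            continue  # this stack set was never deployed to the account
        cfn.delete_stack_instances(
            StackSetName=name,
            Accounts=[account_id],
            Regions=sorted(regions),
            RetainStacks=False,  # delete the underlying stacks, not just the instances
        )
        requested[name] = sorted(regions)
    return requested
```

`delete_stack_instances` only starts an asynchronous operation, so a real workflow would still need to wait for each operation to finish before moving to the next step.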

Step 4: Removing Non-Default VPC and Associated Resources

Lambda functions orchestrate the dismantling of non-default VPCs, managing components such as VPCs, subnets, ENIs, IAM roles, and security groups.
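The order of deletion matters here, because a VPC cannot be deleted while dependent resources still exist. The sketch below shows one possible ordering for a single region: network interfaces, then subnets, then non-default security groups, then the VPC itself. It is deliberately incomplete: it assumes the network interfaces are already detached, and it leaves out IAM roles, internet gateways, route tables, and other dependencies a real account may have. The function name is hypothetical. `ec2` is expected to be `boto3.client("ec2", region_name=...)`, called once per region.

```python
from typing import Any


def delete_non_default_vpcs(ec2: Any) -> list[str]:
    """Delete non-default VPCs and their dependents; return IDs in deletion order."""
    deleted: list[str] = []
    vpcs = ec2.describe_vpcs(
        Filters=[{"Name": "is-default", "Values": ["false"]}]
    )["Vpcs"]
    for vpc in vpcs:
        vpc_id = vpc["VpcId"]
        in_vpc = [{"Name": "vpc-id", "Values": [vpc_id]}]

        # 1. Network interfaces block subnet and security group deletion.
        for eni in ec2.describe_network_interfaces(Filters=in_vpc)["NetworkInterfaces"]:
            ec2.delete_network_interface(NetworkInterfaceId=eni["NetworkInterfaceId"])
            deleted.append(eni["NetworkInterfaceId"])

        # 2. Subnets.
        for subnet in ec2.describe_subnets(Filters=in_vpc)["Subnets"]:
            ec2.delete_subnet(SubnetId=subnet["SubnetId"])
            deleted.append(subnet["SubnetId"])

        # 3. Security groups; the VPC's default group is removed with the VPC.
        for group in ec2.describe_security_groups(Filters=in_vpc)["SecurityGroups"]:
            if group["GroupName"] == "default":
                continue
            ec2.delete_security_group(GroupId=group["GroupId"])
            deleted.append(group["GroupId"])

        # 4. The VPC itself, once nothing depends on it.
        ec2.delete_vpc(VpcId=vpc_id)
        deleted.append(vpc_id)
    return deleted
```

Returning the IDs in deletion order gives the workflow something concrete to log or include in the final notification.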

Step 5: Consumer Notification and Account Closure

Lambda functions notify the account owner/consumer that decommissioning has completed successfully, and the account is closed at the organisation level.
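A hedged sketch of this final step: it checks the account's status, requests closure through AWS Organizations if the account is still active, and then emails the owner. Closing first and notifying afterwards is an assumption about ordering, as is the use of Amazon SES for the email. The function name and message text are hypothetical. `org` is expected to be `boto3.client("organizations")` called from the management account, and `ses` is expected to be `boto3.client("ses")`.

```python
from typing import Any


def close_and_notify(org: Any, ses: Any, account_id: str,
                     sender: str, owner_email: str) -> str:
    """Request account closure if the account is active, then email the owner."""
    status = org.describe_account(AccountId=account_id)["Account"]["Status"]
    if status == "ACTIVE":
        org.close_account(AccountId=account_id)
        outcome = "closure requested"
    else:
        # Already suspended or pending closure: do not call close_account again.
        outcome = f"skipped, account status is {status}"

    ses.send_email(
        Source=sender,
        Destination={"ToAddresses": [owner_email]},
        Message={
            "Subject": {"Data": f"Account {account_id} decommissioning"},
            "Body": {"Text": {"Data": f"Decommissioning result for {account_id}: {outcome}."}},
        },
    )
    return outcome
```

Checking the status first makes the step safe to retry, which matters when the workflow is re-run after a partial failure.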

Addressing Manual Challenges

The blog post explores challenges associated with manual account removal and the deletion of organisation-created resources. The automated approach facilitates the identification of active resources and ensures a seamless closure. Consumer-created resources are identified with region-specific details, and consumers are notified, emphasising their responsibility for removal.

The AWS Nuke Dilemma: Not a Fit for a Production Landing Zone

AWS Nuke is a powerful tool for decommissioning standalone AWS accounts, but it is not recommended for use in production environments. This is because AWS Nuke is destructive and can accidentally delete important resources. Additionally, AWS Nuke is not idempotent, which means that running it multiple times can have unintended consequences.

Below are some of the specific reasons why AWS Nuke is not a good fit for production environments:

It can be destructive: AWS Nuke can remove every resource in an account. This can be a disaster if you run it against the wrong account or if you do not have a backup of your data.

It is not idempotent: Running AWS Nuke multiple times can have unintended consequences. For example, if a resource is restored or recreated after one run, the next run will delete it again. This can be a problem if you need to keep the restored resource.

It is not well-suited for complex environments: AWS Nuke is designed to be a simple tool for decommissioning simple/standalone accounts. It is not well-suited for decommissioning complex accounts that are part of AWS Organizations, have many resources and are used in a production environment.

https://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/automate-deletion-of-aws-resources-by-using-aws-nuke.html

If you need to decommission an AWS account in a multi-account production environment, it is best to use a more controlled approach.

Conclusion

In conclusion, the automated approach offers advantages over manual decommissioning, including efficiency, cost savings, and enhanced security. The custom solution provides finer-grained control, seamless integration, greater flexibility, and heightened security by leveraging AWS Step Functions, Lambda, API calls, and other AWS services, streamlining the decommissioning process for a more efficient AWS environment.
