Persisting Docker Volumes in ECS using EFS

CATEGORIES

Blog

Last week we faced a new challenge to persist our Docker Volume using EFS. Sounds easy, right? Well, it turned out to be a bit more challenging than expected and we were only able to find a few tips here and there. That is why we wrote this post so others may succeed faster.

Before digging into the solution, let’s take a minute to describe our context to elaborate a bit more on the challenge.
First of all, we believe in Infrastructure as Code and thereby we use CloudFormation to be able to recreate our environments. Luckily Amazon provides a working sample and we got EFS working quite easily. The next part was to get Docker to use a volume from EFS. We got lucky a second time as Amazon provides another working sample.

We managed to combine these resources and everything looked alright, but a closer look revealed that the changes did not persist. We found one explanation for why it didn’t work. It appears that we mount EFS after the Docker daemon starts and therefore the volume mounts an empty non-existing directory. In order to fix that we did two things, first we orchestrated the setup and then we added EFS to fstab in order to auto-mount on reboot.

The solution looks a bit like the following:

  
EcsCluster:
    Type: AWS::ECS::Cluster
    Properties: {}
  LaunchConfiguration:
    Type: AWS::AutoScaling::LaunchConfiguration
    Metadata:
      AWS::CloudFormation::Init:
        configSets:
          MountConfig:
          - setup
          - mount
        setup:
          packages:
            yum:
              nfs-utils: []
          files:
            "/home/ec2-user/post_nfsstat":
              content: !Sub |
                #!/bin/bash

                INPUT="$(cat)"
                CW_JSON_OPEN='{ "Namespace": "EFS", "MetricData": [ '
                CW_JSON_CLOSE=' ] }'
                CW_JSON_METRIC=''
                METRIC_COUNTER=0

                for COL in 1 2 3 4 5 6; do

                 COUNTER=0
                 METRIC_FIELD=$COL
                 DATA_FIELD=$(($COL+($COL-1)))

                 while read line; do
                   if [[ COUNTER -gt 0 ]]; then

                     LINE=`echo $line | tr -s ' ' `
                     AWS_COMMAND="aws cloudwatch put-metric-data --region ${AWS::Region}"
                     MOD=$(( $COUNTER % 2))

                     if [ $MOD -eq 1 ]; then
                       METRIC_NAME=`echo $LINE | cut -d ' ' -f $METRIC_FIELD`
                     else
                       METRIC_VALUE=`echo $LINE | cut -d ' ' -f $DATA_FIELD`
                     fi

                     if [[ -n "$METRIC_NAME" && -n "$METRIC_VALUE" ]]; then
                       INSTANCE_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
                       CW_JSON_METRIC="$CW_JSON_METRIC { \"MetricName\": \"$METRIC_NAME\", \"Dimensions\": [{\"Name\": \"InstanceId\", \"Value\": \"$INSTANCE_ID\"} ], \"Value\": $METRIC_VALUE },"
                       unset METRIC_NAME
                       unset METRIC_VALUE

                       METRIC_COUNTER=$((METRIC_COUNTER+1))
                       if [ $METRIC_COUNTER -eq 20 ]; then
                         # 20 is max metric collection size, so we have to submit here
                         aws cloudwatch put-metric-data --region ${AWS::Region} --cli-input-json "`echo $CW_JSON_OPEN ${!CW_JSON_METRIC%?} $CW_JSON_CLOSE`"

                         # reset
                         METRIC_COUNTER=0
                         CW_JSON_METRIC=''
                       fi
                     fi



                     COUNTER=$((COUNTER+1))
                   fi

                   if [[ "$line" == "Client nfs v4:" ]]; then
                     # the next line is the good stuff
                     COUNTER=$((COUNTER+1))
                   fi
                 done <<< "$INPUT"
                done

                # submit whatever is left
                aws cloudwatch put-metric-data --region ${AWS::Region} --cli-input-json "`echo $CW_JSON_OPEN ${!CW_JSON_METRIC%?} $CW_JSON_CLOSE`"
              mode: '000755'
              owner: ec2-user
              group: ec2-user
            "/home/ec2-user/crontab":
              content: "* * * * * /usr/sbin/nfsstat | /home/ec2-user/post_nfsstat\n"
              owner: ec2-user
              group: ec2-user
          commands:
            01_createdir:
              command: !Sub "mkdir -p /${MountPoint}"
        mount:
          commands:
            01_mount:
              command:
                Fn::Join:
                  - ""
                  - - "mount -t nfs4 -o nfsvers=4.1 "
                    - Fn::ImportValue:
                        Ref: FileSystem
                    - ".efs."
                    - Ref: AWS::Region
                    - ".amazonaws.com:/ /"
                    - Ref: MountPoint
            02_fstab:
              command:
                Fn::Join:
                  - ""
                  - - "echo \""
                    - Fn::ImportValue:
                        Ref: FileSystem
                    - ".efs."
                    - Ref: AWS::Region
                    - ".amazonaws.com:/ /"
                    - Ref: MountPoint
                    - " nfs4 nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2 0 0\" >> /etc/fstab"
            03_permissions:
              command: !Sub "chown -R ec2-user:ec2-user /${MountPoint}"
            04_restart_docker_and_ecs:
              command: !Sub "service docker restart && start ecs"
    Properties:
      AssociatePublicIpAddress: true
      ImageId:
        Fn::FindInMap:
        - AWSRegionArch2AMI
        - Ref: AWS::Region
        - Fn::FindInMap:
          - AWSInstanceType2Arch
          - Ref: InstanceType
          - Arch
      InstanceType:
        Ref: InstanceType
      KeyName:
        Ref: KeyName
      SecurityGroups:
      - Fn::ImportValue:
          Ref: SecuritygrpEcsAgentPort
      - Ref: InstanceSecurityGroup
      IamInstanceProfile:
        Ref: CloudWatchPutMetricsInstanceProfile
      UserData:
        Fn::Base64: !Sub |
          #!/bin/bash -xe
          echo ECS_CLUSTER=${EcsCluster} >> /etc/ecs/ecs.config
          yum update -y
          yum install -y aws-cfn-bootstrap
          /opt/aws/bin/cfn-init -v --stack ${AWS::StackName} --resource LaunchConfiguration --configsets MountConfig --region ${AWS::Region}
          crontab /home/ec2-user/crontab
          /opt/aws/bin/cfn-signal -e $? --stack ${AWS::StackName} --resource AutoScalingGroup --region ${AWS::Region}
    DependsOn:
    - EcsCluster


Here is what we did compared to the original AWS provided template:
  1. extracted FileSystem EFS into another CF template and exported the EFS identifier so that we can use ImportValue
  2. added -p to the mkdir command just in case
  3. enhanced mount to use imported filesystem reference
  4. added mount to fstab so that we auto-mount on reboot
  5. recursive changed EFS mount ownership
  6. restarted Docker daemon to include mounted EFS and started ECS as it does not automatically restart when the Docker daemon restarts
  7. added ECS cluster info to ECS configuration
  8. added ECS agent security group so that port 51678 which the ECS agent uses is open
  9. added yum update just in case
  10. included launch configuration into auto scaling group for the ECS cluster and added depends on ECS cluster

We were a bit surprised that EFS does not require an additional volume driver to function. It appears to work out-of-the-box and turned out to be quite straightforward. Thank you for reading and enjoy using EFS as a means to persist your Docker Volumes in your ECS cluster!

Get in Touch.

Let’s discuss how we can help with your cloud journey. Our experts are standing by to talk about your migration, modernisation, development and skills challenges.









    If your cloudformation deployments are failing, this is why

    CATEGORIES

    Blog

    Update [16:00UTC]: AWS were quick to release a fix (aws-cfn-bootstrap-1.4-26) and -25 is still in the yum repositories. Unless you were unlucky and froze your environment today, the problem should solve itself.


    The latest version of aws-cfn-bootstrap package aws-cfn-bootstrap-1.4-25.17.amzn1.noarch that was signed November 2 around 21:00 UTC changed how cfn-signal works. cfn-signal now picks up the the instance profile role’s api keys and try to sign the request by default. This causes the signal to fail if the instances IAM role does not have cloudformation:SignalResource permission.

    cfn-signal has always supported signed requests but if access keys were not provided the following authentication method was used.

    cfn-signal does not require credentials, so you do not need to use the –access-key, –secret-key, –role, or –credential-file options. However, if no credentials are specified, AWS CloudFormation checks for stack membership and limits the scope of the call to the stack that the instance belongs to.

    http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/cfn-signal.html

    This will only affect users that either build ami’s or update system packages on bootup. If you normally do a yum update replace it with yum -y upgrade –security or yum -y upgrade –exclude=aws-cfn-bootstrap

    You could also add the Iam policy statement below to your instance role.

    {

    “Action”: [

    “cloudformation:DescribeStackResource”,

    “cloudformation:SignalResource”

    ],

    “Effect”: “Allow”,

    “Resource”: {

    “Fn::Sub”: “arn:aws:cloudformation:${AWS::Region}:${AWS::AccountId}:stack/${AWS::StackName}/*”

    }

    }

    Please contact Nordcloud for more information on CloudFormation

    Get in Touch.

    Let’s discuss how we can help with your cloud journey. Our experts are standing by to talk about your migration, modernisation, development and skills challenges.









      Keeping up with the latest skills: AWS IoT, Polly, and Rekognition

      CATEGORIES

      Blog

      Recently, I secured a number of AWS IoT Buttons for our office to play with and wanted to try to see how easy they would be to set-up and use in various mock-up applications. In the spirit of playing around with the buttons and keeping up my technical skills related to the AWS platform, I decided to make a small proof-of-concept project around them by collecting some old Android devices I had lying around, and various bits and pieces of AWS services such as Image recognition.

      The concept I finally settled with is a remote surveillance camera solution which can be triggered remotely with the AWS IoT Button, and which performs simple image recognition labelling the image content in the form of gender, roughage, mood, and other parameters. The solution will update a “monitoring” website where the latest surveillance image will be shown and the recognised characteristics spoke out for the viewer, removing the need to read the monitor in detail.

      For building the actual solution I selected the following tools and technologies together with the AWS platform:

      • Android tablet – I like to repurpose and recycle old and unused items, so I decided to use a decommissioned tablet as the IoT device which will act as the camera module for the system. Android devices are, in my opinion, one of the best toys to have lying around for building solutions requiring mobile, IoT, or embedded components. The platform is quite easy to use and easy to write applications in.
      • NodeRed – Since I didn’t want to spend too much time configuring and setting up the IoT libraries and framework in the Android devices, I decided to use NodeRed as the solution providing the MQTT protocol support, as it provides easy to use programming tools for doing quick PoCs around IoT. Running NodeRed requires SSH-access to the device, which I established using Termux and associated modules or controlling the camera etc.
      • The AWS IoT Button – This was an obvious choice as it was one of the technology components I wanted to test and one that also made me start working with the project in the first place.

      As the main idea of the solution was to build something around the AWS IoT Button and see how easy it is to set-up and use, this meant using the AWS platform as the IoT “backend”. For the rest of the solution, (as I didn’t want to start maintaining or setting up servers myself) I decided to use as many platform services as possible in AWS. I ended up working with the following AWS services:

      AWS IoT

      Using the AWS IoT platform for the message brokering, connectivities, and overall management of the IoT solution.

      AWS IAM

      The requirement here was to configure the various access roles and rights for all the architectural components in a secure way.

      AWS S3

      Using two distinct S3 buckets. One for uploading the images taken by the camera, one for hosting the website for the “monitoring” purposes.

      AWS Lambda

      Lambda functions were used to perform the required calculations and actions in a “serverless”-fashion and to remove the need for maintaining infrastructure components.

      AWS Polly

      Text-to-speech service used for creating the audio-streams required by the solution.

      AWS Rekognition

      Image recognition service used for analysing, and labelling the images.

      AWS CloudWatch and logs

      Used for monitoring and debugging the solution during the project.

      AWS CloudFormation

      Used for creating the resources, functions, roles etc. in the solution.

      Python/Boto3

      I selected to use Python as the programming language as the Boto3 libraries provide easy APIs to utilise the AWS services. Python was used to write all the Lambda functions to perform the processing required by the overall solution.

      How everything was brought together

      After registering the AWS IoT button (which was easily done with the AWS Android app), and Android devices to AWS IoT framework and provisioning the security credentials for them, they were good to be used as part of the solution. The architectural idea was to press a button to trigger a Lambda function which will do a few checks on the “upload” S3 bucket, creating a temporary signed URL for the S3 bucket. It will then use the AWS IoT topic to notify the Android devices on the image capture trigger. The Android device would then take the picture of whatever is standing in front of the camera and upload it securely to the “upload” S3 bucket using the temporary upload URL provided via the MQTT message it received earlier.

      Whenever new images are uploaded to the S3 bucket, this will trigger another serverless action in the background. This Lambda-function will take the image and use AWS Rekognition for performing the image recognition on it. The recognised labels and objects will then be run through AWS Polly to create the required audio stream. After all the new content is created, the Lambda-function will upload the content to the other S3 bucket where the website is hosted to show and play the content for whoever is watching the “monitoring” website. The separation of the S3 buckets provides added security measures, (a DMZ of sorts) to safeguard the website for the potentially harmful content which could, in theory, be uploaded to the upload bucket if the temporary upload URL was somehow acquired by an attacker.

      The whole solution is secured by AWS IAM by providing the least amount of necessary privileges for all the components to perform their actions in the exact resources they are using.

      Enabling Cloudwatch monitoring and logging is a good choice for debugging the solution, at least during the development phase. This enabled me to catch unnecessary typing errors in the granular IAM policies in the Lambda function’s IAM Role during the set-up.

      My findings

      This was a rather quick and fun project to work with and provided some insight into using the AWS IoT Button and Android devices as part of the AWS IoT ecosystem. The devices themselves were rather easy to get registered and functioning in the set-up. Of course in a large-scale real-world environment the set-up, certification creation, and installation of the IoT devices would need to be automated as well to make it feasible. Incorporating small Lambda-functions with image recognition and text-to-speech was quite straightforward and worked as a good learning platform for the technologies.

      When applying the project to a customer situation, I would definitely improve it by adding image transcoding for different screen sizes, create a proper web-service with searchable UI and proper picture database/index etc. All in all, I can highly recommend playing around with the IoT framework, IoT button, and NodeRed in Android. Creating these kinds of small side-projects is the perfect platform for people in our business to continue improving our skills and know-how around the ever-expanding technology selection in modern IT environment.

      Nordcloud offers deep-dive workshop which will help to identify the opportunities that impact your business and help you shape data-driven solutions which will take your business to the next level, contact us for more information.

      Get in Touch.

      Let’s discuss how we can help with your cloud journey. Our experts are standing by to talk about your migration, modernisation, development and skills challenges.