Notes from AWS Chalk session at AWS re:Invent 2018 – Lambda@Edge optimisations
Lambda@Edge makes it possible to run Lambda code in Edge locations to modify viewer/origin requests. This can be used to modify HTTP headers, change content based on user-agent and more. We’ve written about it previously, so feel free to read this blog post if you want an introduction: https://nordcloud.com/aws-lambdaedge-running-lambda-code-in-edge-locations/
There are quite a few limitations for Lambda@Edge which depends on which request event you are responding on. For example the maximum response that is generated by the lambda function is totally different depending on if it is a viewer or origin response (40 Kb vs 1 Mb). The function in itself also has limits, such as maximum 3GB of memory allocation and 50 Mb zipped deployment package size.
This means that most use-cases have a need for optimisation. First thing first: evaluate if you really need to use Lambda@Edge. Cloudfront currently have a lot of functionality that is possible to take part of before trying to reinvent the wheel – caching depending on device, selecting headers to base caching on, regional blocks with WAF, etc. Even your origin can sometimes handle header rewrites and other header manipulation, which means that there is no need to spend the time to build it yourself. So you should only use Lambda@Edge if you know that cloudfront can’t do it and that there will be a benefit to rendering or serving your content at the edge.
Optimise before the function
If you’ve decided to use Lambda@Edge you should first look into the optimisations you can do before the function is invoked by the event. Cloudfront does a lot of optimisations for you. It groups requests so that if the response time of the object fetch is the same it will put them together and do only one get instead of sending all of them to the origin. Note that cloudfront is a multilayered CDN which will try to catch the cache from the closest location in cloudfront on miss in a specific region as well, so there is no need to build multi-region caching yourself. Another thing to look at in cloudfront is the origin paths that the event reacts upon. Perhaps the function only needs to react on a very specific HTTP path. If possible it is also always better to let the function react on origin events instead of viewer events which in turn makes the amount of events to react upon fewer and you have higher limitations for function size, response time and resource allocation.
When writing the function you should try to utilise global variables as much as possible since they are re-used between invocations and cached on the workers for a couple of hours. Small things such as keeping TCP sockets usable and perhaps using UDP instead of TCP can make a difference especially since Lambda@Edge is synchronous.
When deploying the function, look at minimising the code with different tools such as browserify. Also note that Lambda@Edge can be deployed with different memory allocations so make sure that you test which size gives you the best bang for the buck – sometimes raising the memory usage from 128 Mb to 256 Mb gives you much faster responses without costing that much more.
If you are fetching content from S3, try using S3 Select to get just what you need from a subset of data from an object by using simple SQL expressions, and even better, try to use cached content in Cloudfront instead of trying to fetch it from S3 or other origins. This makes a lot of sense especially if the data can be cached.
Last but not least: Remove the function when not in use. Don’t use Lambda@Edge if you don’t need to anymore.
If you’d like to learn more about moving your business to the Cloud, please contact us here.
Get in Touch.
Let’s discuss how we can help with your cloud journey. Our experts are standing by to talk about your migration, modernisation, development and skills challenges.