A year with AWS Lambda

Last year could largely be summarized as `A year with AWS Lambda.` I’ve been a huge advocate for serverless architectures and, after using AWS Lambda and AWS API Gateway for a number of projects, I am extremely bullish on their future. Along the way, I’ve started to understand which use cases are better suited to Lambda, and have collected some general lessons learned (good and bad). As with any AWS offering, capabilities and availability are subject to change at any time, so the information provided here will drift.

As it stands currently, my go-to Lambda stack looks like: TypeScript + NodeJS + Serverless

Would love to hear what your `Lambda Learnings` are — pile on by leaving a comment.

Use Cases

Ideal For | Not Ideal For
Light-weight triggers | High volume microservices
Event driven architectures | Real time eventing
Low volume microservices | Global solutions where portability is key concern
Short-lived, stateless operations | Long-running tasks

Best Practices

Concurrency By default, Lambda is limited to 100 concurrent executions, per account, per region. It will handle short-term bursts that exceed that limit, but will start throttling requests, returning HTTP status code 429. The current default is quite low and one of the primary reasons Lambda is currently unsuitable for high scale/volume use cases – microservices or otherwise. It is possible to increase this limit (on a per-account basis) by reaching out to Amazon.
If a function is defined within a VPC, the number of concurrent calls is limited by the number of free IPs in the subnets associated with that function. The number of required IPs can be approximately determined by the following formula:

Projected peak concurrent executions * (Memory in GB / 1.5GB)
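
As a rough illustration of the formula above, here is a minimal sketch in NodeJS. The helper name estimateRequiredIps is hypothetical (it is not an AWS API), and the result is only an approximation for capacity planning.

```javascript
// Rough estimate of the free subnet IPs a VPC-attached function needs,
// using: peak concurrent executions * (memory in GB / 1.5 GB).
// estimateRequiredIps is a hypothetical helper name, not an AWS API.
function estimateRequiredIps(peakConcurrentExecutions, memoryMb) {
  // 1.5 GB expressed in MB, so the division stays in a single unit.
  const mbPerIp = 1.5 * 1024;
  // Round up: a fractional result still consumes a whole address.
  return Math.ceil((peakConcurrentExecutions * memoryMb) / mbPerIp);
}

// 200 concurrent executions at 512MB each.
console.log(estimateRequiredIps(200, 512));
```

Comparing this number against the free IPs in the function's subnets gives an early warning before invocations start failing.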

recommendation: Have a good understanding of your application’s expected capacity and performance characteristics. If necessary, reach out to your AWS TAM and request a higher concurrency limit. In either case, be sure adequate monitoring is in place to detect when throttling occurs, which could impact end user experience. It’s entirely possible Lambda is not an ideal solution choice for your use case.

Performance / SLAs Amazon has no formal service-level agreements around the setup, execution or teardown of your function, or around the appearance of its logs in Cloudwatch. AWS does guarantee that `there are no maintenance windows or scheduled downtimes`, even during Lambda function updates/changes. The overhead of Lambda is measured in 100s of milliseconds (even for a `warm` function). For applications that require high volume and low latency, or that have SLA requirements for sub-100ms response times, Lambda is unlikely to be an ideal choice and other solutions should be considered.

Keep in mind that Lambda functions are dependent on EC2 and Cloudfront APIs. While AWS outages are rare, they do happen. There have been instances where EC2 APIs are unavailable, impacting the availability of Lambda functions. Amazon is working on a solution to avoid this dependency.

recommendation: Consider other solutions if low response times and SLA requirements are critical to your application and user experience.

Accessing VPC Resources To successfully access other VPC-based resources, you must assign a Lambda function to a VPC. As a general rule, this is not recommended, as it means longer start-up times and requires a large enough pool of free ENIs and IPs. You can easily run out of free IPs, which means subsequent Lambda invocations will fail. Also, if the subnets allocated to that Lambda function are shared with other applications, those applications could be affected as well.

recommendation: If possible, avoid accessing VPC resources in a Lambda function. If necessary, just be mindful of the possible side effects of doing so.

Language Selection Lambda currently supports Java, C#, Python and NodeJS (ECMAScript 5, some support for 6). Any of these four is perfectly fine, and the choice should align with team skill sets and preference. However, the underlying runtimes for both Python and NodeJS are faster than those for Java/C#. If performance and start-up time are a paramount concern, lean towards one of the former.

recommendation: Lean towards Python or NodeJS. If team expertise/comfort is JVM- or .NET-centric and performance is less of a concern, go with Java/C#.

Package Size Lambda currently has a 50MB limit on package sizes. The larger a package, generally the longer it will take to build and deploy. Likewise, a larger application footprint impacts start-up time and can also result in greater latencies when viewing logs in Cloudwatch. Dependencies should be prudently considered in any context, but doubly so when using Lambda. Only use what you need and keep the package size as small as possible. In general, Python and NodeJS applications end up having smaller footprints. I’ve had immense success in minimizing package size by using Webpack with my NodeJS Lambda functions. Effectively, only JavaScript that is referenced in the dependency graph is included in the final bundle.

recommendation: Judiciously consider and limit dependencies to reduce build, deployment and start-up times.

Portability Lambda is a serverless implementation specific to Amazon Web Services. Other contemporaries include Azure Functions and Google Cloud Functions. While most companies and products are moving towards `cloud-first`, largely focused on AWS, it’s prudent to be mindful of potential vendor lock-in. Lambda is also not available in all regions. If globalization is a concern, Amazon may not be able to support applications that rely on Lambda functions and have specific region requirements.

recommendation: All Lambda handler code should be isolated: handlers should be extremely thin shims over logic that lives in other modules/classes. This increases reusability and, in the event that a refactor is necessary to move out of Lambda, makes that work much easier and more straightforward. This also facilitates unit testing.
An example of a thin Lambda handler:

var healthController = require('./health_controller');
// Logger and createResponse are helpers defined elsewhere in the project (not shown).
module.exports.healthCheck = (event, context, callback) => {
  Logger.verbose(`GET /health: ${JSON.stringify(event)}`);
  healthController.doHealthChecks().subscribe(results => {
    callback(null, createResponse(200, results));
  });
};

Deployment & CI/CD Deploying Lambda via the AWS console or AWS CLI is precarious. A number of OSS tools exist that provide robust deployment and CD/CI capabilities. Inevitably, if you don’t end up taking advantage of one of these tools, you’ll end up re-writing portions of them.

recommendation: Use Serverless to quickly and efficiently deploy Lambda functions and AWS API Gateway, in a repeatable, standardized way. We have multiple examples of how to do this, in GitHub. Serverless is great for local development and can be leveraged in the same way for seamless CD/CI on Jenkins, CircleCI or similar. Serverless is an OSS framework that uses standard AWS technologies under the hood (e.g. IAM roles, Cloudformation templates, AWS CLI).

Other Considerations

Event Sources Lambda has an ever-growing list of event sources. Currently you can invoke a Lambda function via:

  • Amazon S3
  • Amazon DynamoDB
  • Amazon Kinesis Streams
  • Amazon Simple Notification Service
  • Amazon Simple Email Service
  • Amazon Cognito
  • AWS CloudFormation
  • Amazon CloudWatch Logs
  • Amazon CloudWatch Events
  • AWS CodeCommit
  • Scheduled Events (powered by Amazon CloudWatch Events)
  • AWS Config
  • Amazon Echo
  • Amazon Lex
  • Amazon API Gateway
  • Other Event Sources: Invoking a Lambda Function On Demand

It’s often neglected, but Lambda functions can also be directly invoked via the AWS CLI/APIs with an access key/ID.

Cost Lambda is a very cost-effective AWS solution; for low-volume or bursty workloads it can be orders of magnitude cheaper than an always-on EC2 equivalent. The main cost factors are:

  • Total # of requests
  • Duration of code execution
  • Allocated memory

Currently, you are charged $0.00001667 for every GB-second used. Duration is calculated from the time your code begins executing until it returns or otherwise terminates, rounded up to the nearest 100ms. Amazon publishes the price per 100ms for each memory size.

Additional charges also occur when your Lambda function utilizes other AWS services and initiates data transfers. To get an idea of monthly costs, use Amazon’s Lambda cost calculator. In general, millions of requests per month will result in costs of 10s of dollars, per month.
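
For a back-of-the-envelope feel for the duration and memory factors, here is a minimal sketch. It only covers the GB-second (compute) charge quoted above; per-request fees, the free tier and data transfer are deliberately left out, and estimateMonthlyComputeCost is a hypothetical helper name. Treat the price constant as a snapshot, since AWS pricing changes.

```javascript
// Price per GB-second quoted in this post; check current AWS pricing before relying on it.
const PRICE_PER_GB_SECOND = 0.00001667;

// estimateMonthlyComputeCost is a hypothetical helper, not an AWS API.
function estimateMonthlyComputeCost(requests, avgDurationMs, memoryMb) {
  // Duration is billed in 100ms increments, rounded up.
  const billedSeconds = (Math.ceil(avgDurationMs / 100) * 100) / 1000;
  // GB-seconds = requests * billed seconds * allocated memory in GB.
  const gbSeconds = requests * billedSeconds * (memoryMb / 1024);
  return gbSeconds * PRICE_PER_GB_SECOND;
}

// 5 million requests averaging 120ms (billed as 200ms) with 1024MB allocated.
console.log(estimateMonthlyComputeCost(5000000, 120, 1024).toFixed(2));
```

Halving the allocated memory halves this figure, which is why memory tuning matters.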

recommendation: Memory allocation makes a big difference in cost. Tune your memory allocation to your application. Again, Python and NodeJS typically have smaller footprints here – usually between 40 – 120MB. Try to tune memory to be no more than ~2X a typical workload. 512 and 1024MB are good places to start. Again, your mileage will vary… measure and tune over time.

Warm vs. Cold Lambda functions exist in one of two states: cold or warm.

Cold functions have a noticeably longer start-up time – on the order of 1s+. You’ll notice that an initial request takes much longer than subsequent ones. Once `initialized`, the function is now warm. Amazon has confirmed that this is indeed expected behavior – a byproduct of the underlying infrastructure (Lambda currently utilizes EC2 and Cloudfront API). It’s possible this will be addressed in a future release.

Functions can be kept `warm` by periodically hitting them (via monitoring or otherwise). Fair warning: a cold function that receives a high volume of traffic will return failures (500s) rather quickly, until initialized (warm).

Security Lambda functions use IAM roles to define what a Lambda function can and can’t access.

All Lambda usage is subject to security and architecture review. External surface area, information disclosure, data privacy and acceptable usage are all considerations that should undergo a design review.

Configuration Configuration information can be provided to a Lambda function in a number of ways. Simple name-value configuration values can be baked into the application itself (generated or written by the build system) or even checked in to source. However, extra care must be taken with sensitive configuration values (e.g. API keys, database credentials, etc.). No repository should ever contain sensitive configuration information. Acceptable approaches include:

1) Use environment variables. Lambda has had full environment variable support since November 2016.
2) Have the build system generate config file(s) to a secure S3 bucket, which is only accessible by the Lambda function.
3) Leverage KMS.

recommendation: As outlined in the Twelve-factor App, use environment variables for configuration-related data. It’s a well-known, supported and language-agnostic way of providing configuration data to an application.
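
A minimal sketch of environment-based configuration in NodeJS follows. The helper name loadConfig and the variable names (DB_HOST, API_KEY, LOG_LEVEL) are purely illustrative assumptions; substitute whatever your function actually needs.

```javascript
// loadConfig is a hypothetical helper; the variable names are illustrative.
function loadConfig(env) {
  const required = ['DB_HOST', 'API_KEY'];
  const missing = required.filter(name => !env[name]);
  if (missing.length > 0) {
    // Fail fast at start-up instead of midway through a request.
    throw new Error(`Missing configuration: ${missing.join(', ')}`);
  }
  return {
    dbHost: env.DB_HOST,
    apiKey: env.API_KEY,
    // Optional value with a default.
    logLevel: env.LOG_LEVEL || 'info'
  };
}

// In a handler module you would call: const config = loadConfig(process.env);
console.log(loadConfig({ DB_HOST: 'db.internal', API_KEY: 'example-key' }).logLevel);
```

Passing the environment in as an argument, rather than reading process.env inside the function, keeps the loader trivial to unit test.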

Logging Lambda uses basic logging facilities (e.g. console.log in NodeJS), which end up in Cloudwatch. Cloudwatch is a robust, if unwieldy, offering. There is no SLA/guarantee as to when (or if) Lambda logs will appear in Cloudwatch. Overall package size and start-up time appear to influence this. Latencies of up to 30 minutes have been observed, though typically it’s on the order of minutes, if that.

recommendation: Log early and often. Specify config-based log levels (e.g. verbose/info/warning/error) and use appropriately throughout your function.
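
Config-based log levels can be as simple as the sketch below. createLogger is a hypothetical helper and LOG_LEVEL is an assumed environment variable name; anything written with console.log in a NodeJS function ends up in Cloudwatch.

```javascript
const LEVELS = ['verbose', 'info', 'warning', 'error'];

// createLogger is a hypothetical helper; sink defaults to console.log.
function createLogger(minLevel, sink) {
  const index = LEVELS.indexOf(minLevel);
  // Unknown level names fall back to 'info'.
  const threshold = index === -1 ? LEVELS.indexOf('info') : index;
  const write = sink || console.log;
  const logger = {};
  LEVELS.forEach((level, i) => {
    logger[level] = message => {
      // Drop anything below the configured threshold.
      if (i >= threshold) write(`[${level}] ${message}`);
    };
  });
  return logger;
}

const Logger = createLogger(process.env.LOG_LEVEL || 'info');
Logger.verbose('skipped unless LOG_LEVEL is verbose');
Logger.error('written at every level');
```

Switching LOG_LEVEL to verbose while chasing a problem, then back afterwards, avoids a redeploy just to see more detail.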

Monitoring Out-of-the-box, Lambda offers little in the vein of monitoring or alerting. Both New Relic and Datadog have good Lambda integration. While these offerings provide basic telemetry, it’s still fairly limited in nature. If more meaningful monitoring is required, the recommendation would be to implement light-weight, fire-n-forget instrumentation (e.g. New Relic Insights, or similar).

At a minimum, monitoring and alerting thresholds should be set up for any business critical functionality:

  • Throttling (exceeding concurrency)
  • Max execution duration

Environments Unlike traditional cloud or on premise hosting, the concept of different environments isn’t exactly clear in Lambda.

recommendation: Have 1:1 parity with existing environments, on a per function basis. If exposing via API Gateway, use the concept of `stages` to define different environments. Developers should have their own stage and a stage should exist for all other non-prod/prod environments. This also helps segregate cost.

Tips, Tricks & Gotchas

Debugging / Running Locally In general, debugging a Lambda function is a clumsy affair. Runtime failures often manifest as the infamous, generic ‘internal server error,’ leaving no context as to the root issue. As cited above in the Logging section, log early and often. For RESTful services, use a well-defined error contract to return additional error information and HTTP status codes. Use a config-based setting to toggle ‘debug’ vs. ‘release’ level error details.
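
As an illustration of an error contract with a debug toggle, here is a minimal sketch. It assumes the function sits behind API Gateway using the Lambda proxy integration, which expects a statusCode, headers and a string body. The helper name createErrorResponse, the DEBUG_ERRORS variable and the body shape are assumptions for the example, not a standard.

```javascript
// createErrorResponse is a hypothetical helper; DEBUG_ERRORS is an assumed setting.
function createErrorResponse(statusCode, error, debug) {
  // Clients always get the same documented shape: { code, message }.
  const body = {
    code: statusCode,
    // Never leak internal failure text for server-side errors by default.
    message: statusCode >= 500 ? 'Internal server error' : error.message
  };
  if (debug) {
    // 'debug' builds expose the underlying details to speed up diagnosis.
    body.detail = error.message;
    body.stack = error.stack;
  }
  return {
    statusCode: statusCode,
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body)
  };
}

const debug = process.env.DEBUG_ERRORS === 'true';
console.log(createErrorResponse(404, new Error('Order not found'), debug).body);
```

Wrapping the handler body in a try/catch that funnels into a helper like this replaces the generic ‘internal server error’ with something a caller can act on.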

Mapping errors between Lambda and API Gateway can be challenging; it’s one of the things that Serverless addresses well.

As described in Portability above, structure your application in very distinct classes and modules. Judiciously use IoC and DI patterns to allow for quick and easy unit testing. If designed and tested well, there should be few to no surprises once deployed and running in Lambda.
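
Here is a minimal sketch of what dependency injection can look like for a controller behind a thin handler. The HealthController class and its checks are hypothetical, and it uses plain Promises rather than the observable-style subscribe call in the handler example earlier, purely to keep the sketch dependency-free.

```javascript
// HealthController is a hypothetical class. Its dependencies are passed in
// rather than required directly, so tests can substitute stubs.
class HealthController {
  constructor(checks) {
    // Map of check name -> function returning a Promise of true/false.
    this.checks = checks;
  }

  doHealthChecks() {
    const names = Object.keys(this.checks);
    return Promise.all(names.map(name => this.checks[name]())).then(results => {
      const report = {};
      names.forEach((name, i) => {
        report[name] = results[i] ? 'ok' : 'failed';
      });
      return report;
    });
  }
}

// In a unit test the real dependencies are replaced with stubs; no AWS access is needed.
const controller = new HealthController({
  database: () => Promise.resolve(true),
  cache: () => Promise.resolve(false)
});
controller.doHealthChecks().then(report => console.log(report));
```

Because nothing in the class knows about Lambda, the same tests run locally, in CI and after any move off Lambda.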

Function Dependencies Lambda functions can in fact invoke other Lambda functions. In general, for reasons of readability, traceability and separation of concerns, avoid ‘chaining’ together Lambda functions. Use Event Sources as a decoupled messaging system (e.g. a lambda function does work, writes to S3, which triggers another Lambda function), rather than tightly coupling Lambda functions.

In this vein, it’s also best to avoid results/outcomes tied to sequenced Lambda functions (think async.js). However, AWS recently launched Step Functions, which coordinates Lambda functions via stateful workflows.

Exposing Externally Lambda functions are not exposed externally, but instead triggered by one of the Event Sources defined above. Unless you’re exposing a RESTful interface, there’s no need to make a Lambda function externally accessible.

Lambda functions can be invoked directly via the AWS CLI/APIs, but this requires an access ID/key and is in general a pattern to be avoided. For RESTful services, the natural inclination is to use AWS API Gateway.

Closest-edge Function Invocation Lambda functions are currently deployed on a per-region basis. There is currently no way to deploy a Lambda function to edge nodes (a la Cloudfront) so that a caller is routed, via anycast DNS, to the geographically closest copy of the function. All requests to a Lambda function are routed to its specified region. Routing Lambda requests to a closer region yourself would add a significant amount of overhead and still have caveats.

Lambda@Edge is currently in preview and allows functions to respond to Cloudfront events, but it still doesn’t address this particular issue. It also only supports NodeJS.
