How to do queues for AWS Lambda (part 2)

Part 1 of this post covers the background, in case you haven’t read it. The takeaway from that article is that DynamoDB works well as a queue system for Lambda. This article follows up with a little more detail.

We’ve been building on an AWS Lambda backend, using serverless ideas (not the framework) and event driven patterns, for the last 18 months. In that time, we’ve realised that message queues are (of course) one of the key elements of an event driven pattern.

Originally we took an SQS approach. SQS is a pull queue, so something has to keep checking the queue in some form, and that made the system harder to build than it needed to be. For some use cases (things like dead letter queues) a pull queue is a really good idea. But mainly we have worked with DynamoDB and DynamoDB Streams.

Unfortunately, there are a few quirks with using DynamoDB for this. If the use case fits, though, these quirks can be really useful.

The main thing that we’ve found is that using DynamoDB with DynamoDB Streams and triggering AWS Lambda means you have to code your Lambda function in a certain way.

The key quirk is that if the Lambda function produces an error (it does not complete), the DynamoDB Stream will retry the function with the same records pretty much forever (a quick test today on a dummy function ran for 5 hours, invoking roughly once a minute). In practice the retries only stop when the records expire from the stream, which takes 24 hours. The downside is that because the Lambda function is stateless, each retry is likely to produce the same result.

This means that when we occasionally hit an edge case, we see a large number of errors on our system. They are all the same error, but there’s another issue.

The errors… they go on, and on, and on, and are essentially “blocking” the rest of the queue!!

The good thing is that those edge cases are easy to identify and fix in code, but the errors keep occurring until you fix them.

The bad thing is that they might not actually be errors at all, in which case the failure is in how the function was coded. It’s very easy to use the error state within a Lambda to tell you something (and to make it easy to find in the logs), even when nothing has really gone wrong.

This is not a bad thing per se, but if you’re using the system as a queue, then really you want error data to go into a Dead Letter Queue (SQS is a great fit for this, as it is a pull queue) instead of into a constant error cycle (multiple retries).
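As a rough illustration of that idea, here is a minimal Python sketch of a stream-triggered handler that only treats real errors as errors, and parks failed records on an SQS queue instead of failing the whole batch. The `process` function, the `ExpectedSkip` exception, the `payload` attribute and the `DLQ_URL` environment variable are all hypothetical names invented for this example, not part of any AWS API.

```python
import json
import os


class ExpectedSkip(Exception):
    """Hypothetical marker for records that are not real errors."""


def process(record):
    """Hypothetical business logic; replace with your own processing."""
    image = record["dynamodb"].get("NewImage")
    if image is None:
        raise ExpectedSkip("no new image, e.g. a REMOVE event")
    if "payload" not in image:
        raise ValueError("record has no payload attribute")
    return image["payload"]


def send_to_dlq(record, error):
    """Park a failed record on an SQS queue for later inspection."""
    import boto3  # imported lazily so the sketch can be exercised without AWS

    boto3.client("sqs").send_message(
        QueueUrl=os.environ["DLQ_URL"],  # assumed environment variable
        MessageBody=json.dumps({"error": str(error), "record": record}),
    )


def handler(event, context, park=send_to_dlq):
    parked = 0
    for record in event.get("Records", []):
        try:
            process(record)
        except ExpectedSkip:
            continue  # not an error, so do not fail the batch
        except Exception as error:
            park(record, error)  # real error: park it and carry on
            parked += 1
    # Returning normally marks the batch as done, so the stream moves on.
    return {"parked": parked}
```

If parking the record itself fails (for example, SQS is unreachable), the exception still propagates and the stream retries the batch, which is probably what you want in that case.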

In most circumstances, an SNS topic appears to be a better fit than a DynamoDB queue, at least in our case. The problem is that it’s not as easily auditable: you can’t see what’s coming up in the queue if there’s a problem, or what has been through it before.

But the advantage of SNS is that you can specify a number of retries for the triggered Lambda in the configuration. This is a big advantage.

Add to this a Dead Letter Queue on the Lambda function itself: Lambda attempts a failing asynchronous invocation three times in total (the default is the initial attempt plus two retries) and then sends the event to a Dead Letter Queue (either SQS or SNS; my pref is SQS) for processing later.
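For reference, attaching a Dead Letter Queue to a function can be done with boto3's `update_function_configuration` call and its `DeadLetterConfig` parameter. The sketch below is a minimal example; the helper name `attach_dead_letter_queue` and the function name and ARN in the usage comment are made up for illustration.

```python
def attach_dead_letter_queue(function_name, queue_arn, client=None):
    """Point a Lambda function's Dead Letter Queue at an SQS queue or SNS topic."""
    if client is None:
        import boto3  # imported lazily so the helper can be tried with a stub

        client = boto3.client("lambda")
    return client.update_function_configuration(
        FunctionName=function_name,
        DeadLetterConfig={"TargetArn": queue_arn},
    )


# Example usage (hypothetical names):
# attach_dead_letter_queue(
#     "my-sns-consumer",
#     "arn:aws:sqs:eu-west-1:123456789012:my-dead-letters",
# )
```

Two things to keep in mind: the function’s execution role needs permission to send to the target (`sqs:SendMessage` for SQS), and this Dead Letter Queue only applies to asynchronous invocations such as those from SNS, not to DynamoDB Streams triggers.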

It’s all a little bit more complicated than I thought it was last year to do queues in Lambda, and some of these things (Dead Letter Queues configuration for Lambda for example) weren’t there then.

But the upshot is that SQS still isn’t a very good queue for an event driven process (it’s still “pull”). That’s great for storing messages for later processing, as in a Dead Letter Queue. For an event driven architecture, though, it’s probably best to start with SNS. If you really need to be able to see what’s in the queue, use a DynamoDB table with DynamoDB Streams, BUT make sure you manage errors really well in your Lambda function (basically, only let it fail if it’s actually an error). Otherwise you can easily end up with a blocking scenario and problems later on.

Or, as I suggested to someone else, use a DynamoDB table with a triggered Lambda that only pushes to an SNS topic, which in turn triggers a Lambda with a Dead Letter Queue. That way you can audit the queue easily and utilise the best parts of the system.
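The forwarding Lambda in that setup can be tiny, which is the point: the less it does, the less likely it is to error and block the stream. Here is a minimal Python sketch, assuming the stream is configured to include new images and that the topic ARN is supplied in a `TOPIC_ARN` environment variable (a name chosen for this example).

```python
import json
import os


def publish_to_sns(message):
    """Publish one message to the SNS topic that triggers the worker Lambda."""
    import boto3  # imported lazily so the sketch can be exercised without AWS

    boto3.client("sns").publish(
        TopicArn=os.environ["TOPIC_ARN"],  # assumed environment variable
        Message=message,
    )


def handler(event, context, publish=publish_to_sns):
    forwarded = 0
    for record in event.get("Records", []):
        if record.get("eventName") != "INSERT":
            continue  # only newly queued items are forwarded
        image = record.get("dynamodb", {}).get("NewImage")
        if image is None:
            continue  # stream view type does not include new images
        publish(json.dumps(image))
        forwarded += 1
    return {"forwarded": forwarded}
```

The DynamoDB table stays as the auditable record of what went through the queue, while retries and the Dead Letter Queue are handled on the SNS-triggered Lambda.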

Well, that’s a lot more complicated than I thought it would be!

My AWS wishlist got a bit longer: Much better handling of errors from DynamoDB Streams would be awesome (make it like SNS and I’d be very happy)!

Written by

ServerlessDays CoFounder (Jeff), ex AWS Serverless Snr DA, experienced CTO/Interim, Startups, Entrepreneur, Techie, Geek and Christian
