Best solution for passing massive message spikes to AWS SQS - node.js

The app I'm working on is expected to face occasional spikes of incoming messages to be processed (Facebook webhook events). Live tests of this app haven't been done yet, but based on experience with similar projects, these spikes are expected to start sharply and hold at ~0.8-3k messages/sec for several hours. The beginning of a spike is predictable to within several seconds to tens of seconds.
It seems rational to pass these messages to a queue like AWS SQS and then process them at a comfortable speed. If so, what would be the optimal solution for forwarding such message waves to SQS so that the listening app is always available, especially at the beginning of the spike (otherwise Facebook will probably show a 503 error, "Your webhook is down"):
hosting the listening app on AWS EC2 with a load balancer;
hosting the listening app on AWS Lambda (probably implementing some Lambda-warming measures like these)
other ideas? It would have been convenient if SQS could confirm subscription to Messenger webhooks so that Facebook would send those messages directly to SQS, but that's unfortunately not possible due to the "passive" nature of SQS.
Thanks in advance.

hosting the listening app on AWS EC2 with a load balancer
I think you can simplify it by going Serverless.
hosting the listening app on AWS Lambda (probably implementing some Lambda-warming measures
I don't think Lambda will be a good option because of the default limit of 1,000 concurrent executions, although it can be increased.
other ideas? It would have been convenient if SQS could confirm subscription to Messenger webhooks so that Facebook would send those messages directly to SQS but that's unfortunately not possible due to "passive" nature of SQS
I'd suggest using AWS API Gateway with an AWS service integration to SQS. You can configure Facebook webhook events to go directly to your API Gateway REST endpoint, and configure authentication and throttling in API Gateway as your requirements dictate.
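As a rough illustration of that integration, here is a minimal sketch using the AWS CDK in Node.js. The construct names (WebhookQueue, WebhookApi, etc.) are placeholders, and you would still need to add your own authentication and throttling settings:

```js
// Sketch: expose an SQS queue directly behind an API Gateway REST endpoint,
// so webhook POSTs are enqueued without any application server in between.
const { Stack } = require('aws-cdk-lib');
const sqs = require('aws-cdk-lib/aws-sqs');
const iam = require('aws-cdk-lib/aws-iam');
const apigw = require('aws-cdk-lib/aws-apigateway');

class WebhookStack extends Stack {
  constructor(scope, id, props) {
    super(scope, id, props);

    const queue = new sqs.Queue(this, 'WebhookQueue');

    // Role that API Gateway assumes in order to call sqs:SendMessage.
    const role = new iam.Role(this, 'ApiGatewaySqsRole', {
      assumedBy: new iam.ServicePrincipal('apigateway.amazonaws.com'),
    });
    queue.grantSendMessages(role);

    const api = new apigw.RestApi(this, 'WebhookApi');
    const integration = new apigw.AwsIntegration({
      service: 'sqs',
      path: `${this.account}/${queue.queueName}`,
      integrationHttpMethod: 'POST',
      options: {
        credentialsRole: role,
        requestParameters: {
          'integration.request.header.Content-Type':
            "'application/x-www-form-urlencoded'",
        },
        // Forward the raw webhook body as the SQS message body.
        requestTemplates: {
          'application/json':
            'Action=SendMessage&MessageBody=$util.urlEncode($input.body)',
        },
        integrationResponses: [{ statusCode: '200' }],
      },
    });

    api.root.addResource('webhook').addMethod('POST', integration, {
      methodResponses: [{ statusCode: '200' }],
    });
  }
}

module.exports = { WebhookStack };
```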

I like the architecture with a thin frontend application whose only responsibility is to accept the incoming request and immediately offload it to a queue or a stream. At the other end of the stream, a worker application takes care of the actual event processing.
other ideas?
Personally, I would use API Gateway mapped to Kinesis (or Kinesis Firehose) for the frontend application instead of EC2. That way, I can rely on AWS to provide load balancing, autoscaling, OS patches, network configuration, and so on. Moreover, Kinesis offers significant buffering capabilities, so there is no need to resend the message spike. For the worker part of the application, it depends on what action needs to be performed. For short-lived operations, I recommend Lambda, but AWS also offers integrations with EMR, Redshift, Elasticsearch, and so on.
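For the worker side, a Lambda function subscribed to the Kinesis stream could look roughly like this. This is only a sketch; processEvent is a placeholder for whatever your actual processing is:

```js
// Sketch of a Lambda worker consuming a Kinesis stream. Lambda delivers
// Kinesis records base64-encoded in event.Records[].kinesis.data.
exports.handler = async (event) => {
  for (const record of event.Records) {
    const payload = JSON.parse(
      Buffer.from(record.kinesis.data, 'base64').toString('utf8')
    );
    await processEvent(payload);
  }
};

// Hypothetical processing step; replace with your own logic.
async function processEvent(payload) {
  console.log('processing webhook event', payload);
}
```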

Related

How should an API gateway handle a slow microservice when scaling

I'm in the process of creating a Kubernetes application. I have one microservice that performs an operation on data that it receives.
My API does some basic authentication and request validation checks; however, here is where I'm confused about what to do.
The API gateway has two endpoints: one performs an individual operation, forwarding the request to the microservice; the other receives an array of data, which it then sends off as individual requests to the microservice, allowing it to scale independently.
The main thing I'm worried about is scaling. The microservice depends on external APIs and could take a while to respond, or could fail to scale quickly enough. How would this impact the API gateway?
I'm worried that the API gateway could end up being overwhelmed by external requests. What is the best way to implement this?
Should I use some kind of custom metric and somehow tell Kubernetes not to send traffic to API gateway pods that are handling more than X requests? Or should I set a hard cap using a counter on the API gateway, limiting the number of requests the pod is handling by returning an error?
I use Node.js for the API gateway code, so aside from memory limits, I'm not sure if there's an upper limit to how many requests the gateway can handle.
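One hedged sketch of the hard-cap idea mentioned above, assuming an Express-based gateway (the cap of 100 is arbitrary and should be tuned against real measurements):

```js
// Sketch: shed load in a Node.js (Express) API gateway once too many
// requests are in flight, instead of letting them queue up unbounded.
const express = require('express');

const MAX_IN_FLIGHT = 100; // arbitrary cap; tune for your workload
let inFlight = 0;

const app = express();

app.use((req, res, next) => {
  if (inFlight >= MAX_IN_FLIGHT) {
    // Tell the client (or an upstream load balancer) to back off and retry.
    res.set('Retry-After', '1');
    return res.status(503).json({ error: 'server busy, try again' });
  }
  inFlight++;
  // 'close' fires once the response is done or the connection drops.
  res.once('close', () => { inFlight--; });
  next();
});

// Hypothetical health endpoint: a Kubernetes readiness probe could consult
// this so traffic stops being routed to a pod that is already saturated.
app.get('/health', (req, res) => res.json({ inFlight }));

app.listen(3000);
```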

Adding more verbose detail to DocuSign Connect and API logging

We are encountering inconsistent results with a DocuSign Connect integration, where some envelope transactions are received by the listening server while others are not. The integration involves two web servers behind a load balancer (the listening app runs on both). The existing log details lack enough troubleshooting data to explain the reason for the Connect failure events.
DocuSign Connect logs the responses it receives from the customer's Connect servers (or "listeners"). But Connect is merely the client of your servers.
The best logging will be your servers' logs.
Are your servers processing the incoming notifications synchronously, or putting the notification messages onto a queue and then processing them asynchronously later? The latter is the more reliable, recommended pattern.
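As a sketch of that pattern, assuming an Express listener and an SQS queue (the queue URL below is a placeholder), the listener only enqueues and acknowledges:

```js
// Sketch: acknowledge DocuSign Connect notifications immediately after
// enqueueing them, and do the real processing elsewhere.
const express = require('express');
const { SQSClient, SendMessageCommand } = require('@aws-sdk/client-sqs');

const QUEUE_URL =
  'https://sqs.us-east-1.amazonaws.com/123456789012/connect-queue'; // placeholder
const sqsClient = new SQSClient({});
const app = express();

app.use(express.text({ type: '*/*' })); // Connect can post XML or JSON

app.post('/connect', async (req, res) => {
  try {
    await sqsClient.send(new SendMessageCommand({
      QueueUrl: QUEUE_URL,
      MessageBody: req.body,
    }));
    res.sendStatus(200); // fast ack so Connect records a successful delivery
  } catch (err) {
    console.error('enqueue failed', err);
    res.sendStatus(500); // Connect will retry the notification
  }
});

app.listen(3000);
```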
You may want to consider switching to a PaaS solution that incorporates asynchronous queuing:
If you use a PaaS (Platform as a Service) system such as AWS or Azure, the cost of the intermediate system will be zero (AWS) or very low (Azure, etc.) for as many as a million messages per month.
In addition, the PaaS pattern will enable your application to receive the notification messages from behind your firewall, with no changes to the firewall.
More information: https://www.docusign.com/blog/dsdev-webhook-listeners-part-4/
Code examples for AWS, Azure, and Google Cloud in C# .NET Core, Java, Node.js, PHP, and Python are available from DocuSign's GitHub repository; the repo names all start with Connect-.
After your application receives a notification message, it can use it to trigger downloading and storing the envelope's documents, to start a new process now that the envelope has been signed, etc.
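The worker at the other end of the queue might, as a sketch, long-poll SQS and hand each notification to your actual processing step (handleNotification below is a placeholder):

```js
// Sketch: long-poll SQS for Connect notifications and process them
// asynchronously, deleting each message only after processing succeeds.
const {
  SQSClient,
  ReceiveMessageCommand,
  DeleteMessageCommand,
} = require('@aws-sdk/client-sqs');

const QUEUE_URL =
  'https://sqs.us-east-1.amazonaws.com/123456789012/connect-queue'; // placeholder
const sqsClient = new SQSClient({});

async function handleNotification(body) {
  // Placeholder: parse the Connect payload, download documents, etc.
  console.log('processing notification', body.slice(0, 80));
}

async function pollForever() {
  for (;;) {
    const { Messages = [] } = await sqsClient.send(new ReceiveMessageCommand({
      QueueUrl: QUEUE_URL,
      MaxNumberOfMessages: 10,
      WaitTimeSeconds: 20, // long polling
    }));
    for (const message of Messages) {
      await handleNotification(message.Body);
      await sqsClient.send(new DeleteMessageCommand({
        QueueUrl: QUEUE_URL,
        ReceiptHandle: message.ReceiptHandle,
      }));
    }
  }
}

pollForever().catch(console.error);
```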

Node.js RESTful API server on AWS EC2 vs AWS API Gateway

I have a Node.js RESTful API application. There is no web interface (at least for now); it is just an API endpoint called by other services.
I want to host it on Amazon's AWS cloud. I am torn between two options:
Use normal EC2 hosting and just provide the hosting url as the API endpoint
OR
Use Amazon's API Gateway and run my code on AWS Lambda
Or can I just run my code on EC2 and use API Gateway?
I am confused about how EC2 and API Gateway differ when it comes to a Node.js RESTful API application.
Think of API Gateway as an API management service. It doesn't host your application code; rather, it provides a centralized interface for all your APIs and lets you configure things like access restrictions, response caching, rate limiting, and version management.
When you use API Gateway, you still have to host your API's back-end application code somewhere, such as Lambda or EC2, so you should compare Lambda and EC2 to determine which best suits your needs. EC2 provides a virtual Linux or Windows server that you can install anything on, but you pay for every second that the server is running. With EC2 you also have to think about scaling your application across multiple servers and load balancing the requests. AWS Lambda hosts your functions, executes them on demand, and scales out the number of function containers automatically; you pay only per execution (and a large number of free executions is included every month). Lambda is going to cost much less unless you have a very large number of API requests every month.
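To make the Lambda option concrete, a Node.js handler behind an API Gateway proxy integration is little more than this (a sketch only; the route and response are illustrative):

```js
// Sketch: Node.js Lambda handler for an API Gateway (REST, proxy
// integration) request. API Gateway handles HTTPS, auth, and throttling;
// the function contains only the API logic itself.
exports.handler = async (event) => {
  if (event.httpMethod === 'GET' && event.path === '/items') {
    return {
      statusCode: 200,
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ items: [] }), // placeholder response
    };
  }
  return { statusCode: 404, body: JSON.stringify({ error: 'not found' }) };
};
```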

How does Elastic Beanstalk know if an EC2 server is busy?

I've got a Node.js application that does some moderately intense logic work when a user requests it. For example, a user on the frontend can click Analyze, and the server will perform the work, which could take 30 seconds to 1 minute (non-blocking).
My app is not aimed at the wide public but at an audience of a few thousand. So there is a chance that several people might analyze at the same time.
I'm currently planning to deploy the app via Elastic Beanstalk, but I am not sure how it will deal with an instance that is busy, or whether I have to implement some kind of custom signal to tell the load balancer to send requests to another instance while the current one is performing an analysis.
I understand that Lambdas are often held up as an option in this case, but I would much prefer to keep it simple and keep the code in my Node app.
How should I design this to ensure the app could handle doing analysis and still handling other requests normally?
Elastic Beanstalk uses an Auto Scaling group to launch and maintain the EC2 instances required to run the application. With Auto Scaling groups you can increase or decrease the EC2 instance count dynamically via scaling policies. Out of the box, Auto Scaling supports scaling on metrics such as CPU utilization, network in, network out, request count, and latency. You can use any of these metrics to scale your infrastructure dynamically.
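As an illustration of the mechanism, here is what a target-tracking scaling policy on an Auto Scaling group looks like, expressed with the AWS CDK in Node.js. Note that Elastic Beanstalk manages its own Auto Scaling group, so with Beanstalk you would set the equivalent triggers through environment configuration rather than CDK; the 60% target below is an arbitrary example:

```js
// Sketch: attach a CPU-based target-tracking scaling policy to an
// existing Auto Scaling group using the AWS CDK.
const autoscaling = require('aws-cdk-lib/aws-autoscaling');

function addCpuScaling(asg) {
  // asg is an existing autoscaling.AutoScalingGroup. Auto Scaling adds
  // instances when average CPU rises above the target and removes them
  // when it falls below.
  asg.scaleOnCpuUtilization('CpuScaling', {
    targetUtilizationPercent: 60, // arbitrary target; tune for your workload
  });
}

module.exports = { addCpuScaling };
```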
You can refer to the AWS documentation on scaling policies for more information.

Kafka as Messaging queue in Microservices

To give you background on the question: I am considering Kafka as a channel for inter-service communication between microservices. Most of my microservices are web-facing (either web servers, REST servers, or SOAP servers that communicate with existing endpoints).
At some point I need an asynchronous channel between microservices, so I am considering Kafka as the message broker.
In my scenario, I have a RESTful microservice that pushes messages to a Kafka queue once its job is done. Another microservice, which is also a web server (embedded Tomcat) with a small REST layer, would consume those messages.
The reason for considering a messaging queue is that even if my receiving microservice is down for some reason, all incoming messages will be added to the queue and the data flow won't be disturbed. Another reason is that Kafka is a persistent queue.
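For illustration, here is the shape of such a producer/consumer channel sketched in Node.js with kafkajs, to match the rest of this page; the broker address and topic name are placeholders, and since the services in question are Java, treat this as illustrating the pattern rather than the exact code:

```js
// Sketch: producing and consuming ends of an async channel with kafkajs.
const { Kafka } = require('kafkajs');

const kafka = new Kafka({
  clientId: 'example-service',
  brokers: ['localhost:9092'], // placeholder broker address
});

// Producing side: the REST microservice publishes once its job is done.
async function publishResult(result) {
  const producer = kafka.producer();
  await producer.connect();
  await producer.send({
    topic: 'job-results', // placeholder topic name
    messages: [{ value: JSON.stringify(result) }],
  });
  await producer.disconnect();
}

// Consuming side: the receiving microservice processes messages as they
// arrive; if it is down, messages simply wait in the topic.
async function consumeResults() {
  const consumer = kafka.consumer({ groupId: 'result-consumers' });
  await consumer.connect();
  await consumer.subscribe({ topic: 'job-results', fromBeginning: false });
  await consumer.run({
    eachMessage: async ({ message }) => {
      const result = JSON.parse(message.value.toString());
      console.log('consumed', result); // placeholder processing
    },
  });
}

module.exports = { publishResult, consumeResults };
```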
Typically, Kafka consumers are multi-threaded.
My question is this: since the receiving microservice is a web server, I am concerned about creating user threads in a servlet-container-managed environment. It might work, but given that user-created threads are considered bad practice within a web application, I am confused. What would be your approach in this scenario?
Please suggest.
