Does AWS Lambda (Node.js) share memory across different executions?

I have a Lambda function that responds to an HTTP GET event. Each request carries a pair of headers, headerA and headerB.
The Lambda is written in Node.js / TypeScript.
In the implementation I have a global variable defined as a collection of objects:
let storage: { [myIndex: string]: storageDrivers | undefined } = {} // indexed by the headerA value
When I get a request, I store a new instance of a class I have into the global variable above, at the intended index, if it is not yet present:
if (!storage[headerA]) {
  storage[headerA] = new MyStorageDriver(headerB)
}
What I am experiencing is that, given two distinct requests close to each other:
1st request: headerA1, headerB1
2nd request: headerA1, headerB2
after the 2nd request, storage[headerA1] contains an instance of MyStorageDriver(headerB1) and not MyStorageDriver(headerB2),
as if the global scope were shared or reused across separate Lambda executions.
Is this expected behaviour with AWS Lambda, or is something else leading to this unexpected behaviour?
My current workaround (also a way to double-check this behaviour) is to change the global variable like this:
let storage: {
  [myIndexB: string]: {                              // indexed by the headerB value
    [myIndexA: string]: storageDrivers | undefined   // indexed by the headerA value
  } | undefined
} = {}
and then assign like this on requests:
if (!storage[headerB]) {
  storage[headerB] = {}
}
if (!storage[headerB][headerA]) {
  storage[headerB][headerA] = new MyStorageDriver(headerB)
}

Yes, it does happen, and no, you shouldn't rely on it.
What happens is that each time there is a request for a Lambda to act, the AWS backend looks to see what's available. If no running container exists for that Lambda, it will spin one up, create the container and the Lambda inside it, and then use it. Calling the Lambda is an 'Invoke'. The process of starting up a container and initializing the Lambda is called the cold start, and it is a bit of a problem: it can take upwards of 15-30 seconds depending on the complexity of your Lambda and its layers.
The next time AWS goes to Invoke that Lambda, if an existing container is still running, it will attempt to reuse that container. This is known as 'keeping the Lambda warm', so that it can be Invoked without the cold start, resulting in a much faster response time.
If it needs to respond to multiple requests at once (i.e. scale), it will spin up additional containers. These are known as concurrent executions; if you view the metrics for your Lambda you may find thousands of Invokes but only 3 or 4 concurrent executions.
The logic AWS uses to decide when to spin up and when to shut down concurrent executions is based on a predictive understanding of your traffic, your need, and whatever settings you have for provisioned capacity.
Across any Invoke that reuses an existing container, global variables are maintained, because it is the same environment. This is why you are not supposed to use global variables for anything that may change during the execution of the Lambda. You cannot rely on an invocation landing in the same container as a previous execution, and you cannot rely on it not doing so either.
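To make the container-reuse effect concrete, here is a minimal hedged sketch in TypeScript (all names are illustrative, not taken from the question's actual code) showing why a cache keyed only on headerA keeps serving the driver built from the first headerB it saw:
// Illustrative sketch: module scope is initialized once per container
// (cold start) and survives warm invocations; handler scope is per request.
class MyStorageDriver {
  constructor(public readonly headerB: string) {}
}

// Survives across warm invocations of the same container.
const cachedDrivers: { [headerA: string]: MyStorageDriver | undefined } = {};

export const handler = async (event: { headers: Record<string, string> }) => {
  const headerA = event.headers["headerA"];
  const headerB = event.headers["headerB"];

  // Problematic: the cache key ignores headerB, so a warm container keeps
  // returning the driver built from the first headerB seen for this headerA.
  if (!cachedDrivers[headerA]) {
    cachedDrivers[headerA] = new MyStorageDriver(headerB);
  }

  // Safer: key the cache on everything the driver depends on,
  // or build per-request state inside the handler instead.
  const safeKey = `${headerA}:${headerB}`;

  return {
    statusCode: 200,
    body: JSON.stringify({ cachedHeaderB: cachedDrivers[headerA]?.headerB, safeKey }),
  };
};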

Related

Update (skip) wait duration on AWS Step Function

I have a Step Function set up that has a 'wait' state (eg, 999999 seconds). Once the wait is over, the Step Function invokes a Lambda. Sometimes, I will want to interrupt the wait time and trigger the Lambda immediately. Is this possible?
I thought I could do it by using the aws-sdk with the Step Functions API to manually skip the wait; but I've been experimenting with no success.
I tried the API's StartExecution method, but it only starts the entire Step Function (https://docs.aws.amazon.com/step-functions/latest/apireference/API_StartExecution.html). I can't find anything for manipulating individual steps.
I can use GetExecutionHistory to return an event object that describes the Wait step, eg:
{
  timestamp: 2022-10-17T08:38:27.849Z,
  type: 'WaitStateEntered',
  id: 2,
  previousEventId: 0,
  stateEnteredEventDetails: {
    name: 'Wait',
    input: '{\n "Comment": "Insert your JSON here"\n}',
    inputDetails: { truncated: false }
  }
},
But there doesn't seem to be a way to manipulate this event to move to the next step.
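For reference, a hedged sketch of the SDK call that returns events like the one above (aws-sdk v2, matching the snippet later in this post; the execution ARN is a placeholder):
import { StepFunctions } from "aws-sdk";

const stepFunctions = new StepFunctions({ region: "eu-west-2" });

// Lists the events recorded so far for one execution; the WaitStateEntered
// event shown above is one entry in the returned `events` array.
const history = await stepFunctions
  .getExecutionHistory({
    executionArn: "...", // ARN of the running execution
    reverseOrder: true,  // newest events first
  })
  .promise();

console.log(history.events);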
I've spoken to AWS tech support who have confirmed that there is nothing in the aws-sdk or the aws-cdk that provides for the update of an existing state (eg, a 'wait' state) while it is running. There are some workarounds:
AWS tech support suggest Iterating a loop using a Lambda. This basically loops over a Choice>Wait>Lambda>(repeat) where the Lambda returns an output that tells the Choice whether to continue with the loop or else direct the Execution to another state. The advantage of this is that we don't need to cancel the Execution and we maintain a simpler record of activities. The disadvantage is that we are regularly invoking a Lambda.
As per #Guy's suggestion, we could split the Step Function into two separate Step Functions. This means we could cancel the initial Step Function and then trigger the latter Step Function manually.
We can cancel the execution of a Step Function with stopExecution. For example, using the aws-sdk:
import { config, Credentials, StepFunctions } from "aws-sdk"; // package.json: "aws-sdk": "^2.1232.0"

config.update({ region: "eu-west-2" });
const stepFunctions = new StepFunctions();

const stoppedExecution = await stepFunctions
  .stopExecution({
    executionArn: "...",
    cause: "...",
    error: "...",
  })
  .promise();
We can then trigger a new Step Function with startExecution
Step Functions also allow us to wait for a callback with a Task Token. Basically, the step sends a task token (e.g. to a Lambda), then waits for that Task Token to be returned. Once it is received, the Execution proceeds to the next step.
There are two ways I can think of proceeding from item 3 above:
a. Configure a Heartbeat Timeout for a Waiting Task. If the Heartbeat Timeout ends without a response token being received, the task fails with a States.Timeout error name. We can (I assume) handle the error in the Task rule to trigger the next step anyway. So the default behaviour is now to trigger the next step after a duration elapses, and then we also have the facility to skip the wait duration by sending the Task Token back to the Execution.
b. Use another Service to perform the wait function and return the Task Token after the wait duration has elapsed.
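For completeness, a hedged sketch of returning the task token with the same aws-sdk v2 as above (the token value is a placeholder); sendTaskSuccess resumes a task that is waiting for a callback:
import { StepFunctions } from "aws-sdk";

const stepFunctions = new StepFunctions({ region: "eu-west-2" });

// Resume a task that is waiting for a callback (.waitForTaskToken) by
// returning its task token; the execution then moves on to the next state.
await stepFunctions
  .sendTaskSuccess({
    taskToken: "...", // the token handed out when the task started waiting
    output: JSON.stringify({ skippedWait: true }),
  })
  .promise();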
Option 3 of your answer would still require some service/process to handle/poll whether or not to continue, I believe.
I've implemented a pattern very similar to your description, but it can be defined in a single Step Function definition.
Note: I consider this a hack/abuse of the States Language, but it has the benefit of keeping a single state machine definition/execution and avoids paying for the excessive state transitions of the looping method:
Put your Wait state in a new Parallel state.
Add a wait-for-callback type task (DynamoDB, SQS, etc.) in the Parallel state, making sure to configure its timeout/heartbeat to the same duration as the "neighboring" Wait state.
If/when you want to "short circuit" the wait duration, query wherever you stored the task token and send a SendTaskFailure with a unique cause/error payload (a sketch of this call follows at the end of this answer).
Configure the Catch (fallback state) for the Parallel state to point to your "Invoke Lambda" state.
Also configure the default Next field for the Parallel block to point to your "Invoke Lambda" state.
This may not be very intuitive, but it relies on the fact that if any state defined in a Parallel state fails, that entire block fails immediately. With some custom error handling, though, you can "ignore" the special sentinel error, thus short-circuiting the long wait duration and proceeding to your next state.
It's definitely not perfect, and you'll have to play around with the errors/timeouts/heartbeats that make sense for your use case.
Depending on how many executions/transitions you expect, the easiest thing I've found is making sure the task token ends up in a predictable CloudWatch log group, then querying it with CloudWatch Logs Insights when I need to retrieve it again.
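A hedged sketch of that failure-as-signal call, again with aws-sdk v2 (the error name is an arbitrary sentinel chosen here, not an AWS-defined value):
import { StepFunctions } from "aws-sdk";

const stepFunctions = new StepFunctions({ region: "eu-west-2" });

// Fail the callback task on purpose; the Parallel state's Catch rule matching
// this sentinel error routes the execution straight to the "Invoke Lambda" state.
await stepFunctions
  .sendTaskFailure({
    taskToken: "...",           // retrieved from wherever it was stored
    error: "SkipWaitRequested", // arbitrary sentinel matched by the Catch rule
    cause: "Wait short-circuited by an external request",
  })
  .promise();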

Are there conditions under which variables in an AWS Node Lambda persist between invocations?

I wouldn't think so, but I don't have another good explanation for what I observed. Here's a rough version of the relevant code, which is inside the handler function (i.e., it would not be expected to persist between invocations):
const res = await graphqlClient.query({my query})
const items = res.data.items
console.log(items) // <- this is the line that logs the output below
items.push({id: 'some-id'})
const itemResults = await Promise.all(items.map((item) => etc etc)
Over successive invocations from my client, spaced less than ten seconds apart, some-id was repeatedly added to items. On the first invocation, this is what was logged in CloudWatch after const items = res.data.items:
[
  {
    anotherId: 'foo',
    id: 'bar',
  }
]
The 2nd time it was invoked, a few seconds later, this was written to the logs before the call to items.push():
[
  {
    anotherId: 'foo',
    id: 'bar',
  },
  { id: 'some-id' }
]
The 3rd time, again written to the logs before the call to items.push():
[
  {
    anotherId: 'foo',
    id: 'bar',
  },
  { id: 'some-id' },
  { id: 'some-id' }
]
items is never written to persistent storage. It is only modified twice: when it's set from the value returned by the graphql query, and when I manually push another value onto it. I can prevent this bug by checking whether some-id is already in the array, so I'm unblocked for now, but how could it persist over successive runs? I never would've expected a Lambda to behave that way! I thought each invocation was stateless.
AWS Lambda is kind of stateless, but not fully; you have to take care yourself that this is really true. Since your example code above is missing a handler function, I assume you didn't provide the full code and that you have defined const items outside of your handler function. A rough explanation based on my assumption:
Everything outside of your handler function is initialized once when your Lambda function starts for the first time (i.e. a 'cold start'). You can initialize variables, database connections, etc. and reuse them in every invocation as long as the instance of your Lambda function stays alive.
Then, your handler function is invoked after the initialization steps and again for each future invocation. If you change values/objects outside of your handler function, they'll survive the current invocation and you can use them in your next invocation. This way, you can cache some expensive data or do some other optimizations. For example:
const items = []

exports.handler = function(event, context) {
  // ...
  items.push(/* ... */)
  // ...
}
This is also true for Java and Python Lambda functions, and I believe for most other runtimes as well. Now, this is probably the explanation of what you observe: in one invocation you push something to items, and in the next invocation the previous data has survived because it was stored outside of the handler function.
Suggestion in your case: if you want fully stateless functions, don't modify data outside of your handler function; only store values inside it. But be aware that this can slow down your Lambda functions if you need to initialize data on each invocation.
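A minimal hedged sketch of that separation (the graphql client module is illustrative, not the asker's actual code): keep expensive, reusable setup at module scope and per-invocation mutable state inside the handler:
// Illustrative sketch: module scope is for things that are safe to reuse across
// warm invocations (clients, config), not for per-request data.
import { SomeGraphqlClient } from "./graphqlClient"; // hypothetical module

const graphqlClient = new SomeGraphqlClient(); // created once per container

export const handler = async () => {
  // Per-invocation state lives here, so it cannot leak into the next request.
  const res = await graphqlClient.query({ /* my query */ });

  // Copy before mutating, in case the client caches and hands back the same
  // array object on a warm invocation.
  const items = [...res.data.items];
  items.push({ id: "some-id" });

  return { statusCode: 200, body: JSON.stringify(items) };
};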
Since this behavior of AWS Lambda is often used for caching data, there are a few blog posts covering this topic as well and how the code is handling it. They usually provide more visual explanations and example code:
Caching in AWS Lambda (note: my own blog post)
Leveraging Lambda Cache for Serverless Cost-Efficiency
All you need to know about caching for serverless applications (this covers much more about caching, but part of it also looks at caching inside a Lambda function)
There's much more happening behind the scenes, of course. If you are interested in how this whole process works, I recommend taking a look at the Execution Environment details. The article is more focused on giving background for building extensions and how the process outside of your code works, but it might help you understand what's happening behind the scenes.

Appropriate solution for long running computations in Azure App Service and .NET Core 3.1?

What is an appropriate solution for long running computations in Azure App Service and .NET Core 3.1 in an application that has no need for a database and no IO to anything outside of this application? It is a computation task.
Specifically, the following is unreliable and needs a solution.
[Route("service")]
[HttpPost]
public Outbound Post(Inbound inbound)
{
    Debug.Assert(inbound.Message.Equals("Hello server."));
    Outbound outbound = new Outbound();
    long Billion = 1000000000;
    for (long i = 0; i < 33 * Billion; i++) // 230 seconds
        ;
    outbound.Message = String.Format("The server processed inbound object.");
    return outbound;
}
This sometimes returns a null object to the HttpClient (not shown). A smaller workload will always succeed; for example, 3 billion iterations always succeeds. A bigger number would be nice; specifically, 240 billion is a requirement.
I think in the year 2020 a reasonable goal in Azure App Service with .NET Core might be to have a parent thread count to 240 billion with the help of 8 child threads, so each child counts to 30 billion, and the parent divides an 8 MB inbound object into smaller objects inbound to each child. Each child receives a 1 MB inbound object and returns a 1 MB outbound object to the parent. The parent re-assembles the result into an 8 MB outbound object.
Obviously the elapsed time would then be 12.5% (one-eighth) of what a single-threaded implementation would need. The time to cut up and re-assemble the objects is small compared to the computation time. I am assuming the time to transmit the objects is also very small compared to the computation time, so the 12.5% expectation is roughly accurate.
If I can get 4 or 8 cores, that would be good. If I can get threads that give me, say, 50% of the cycles of a core, then I would need maybe 8 or 16 threads. If each thread gives me 33% of the cycles of a core, then I would need 12 or 24 threads.
I am considering the BackgroundService class but I am looking for confirmation that this is the correct approach. Microsoft says...
BackgroundService is a base class for implementing a long running IHostedService.
Obviously, if something is long running it would be better to make it finish sooner by using multiple cores via System.Threading, but this documentation seems to mention System.Threading only in the context of starting tasks via System.Threading.Timer. My example code shows there is no timer needed in my application; an HTTP POST serves as the occasion to do work. Typically I would use System.Threading.Thread to instantiate multiple objects in order to use multiple cores. I find the absence of any mention of multiple cores a glaring omission in the context of a solution for work that takes a long time, but maybe there is some reason Azure App Service doesn't deal with this matter. Perhaps I am just not able to find it in tutorials and documentation.
The task is initiated by the illustrated HTTP POST controller. Suppose the longest job takes 10 minutes. The HTTP client (not shown) sets the timeout limit to 1000 seconds, which is much more than 10 minutes (600 seconds), in order to leave a margin of safety. HttpClient.Timeout is the relevant property. For the moment I am presuming the HTTP timeout is a real limit rather than some non-binding (fake) limit where some other constraint causes the user to wait 9 minutes and then receive an error message. A real, binding limit is one for which I can say "but for this timeout it would have succeeded". If the HTTP timeout is not the real binding limit and something else constrains the system, I can adjust my HTTP controller to instead have three (3) POST methods: POST1 would mean start a task with the inbound object, POST2 would mean tell me if it is finished, and POST3 would mean give me the outbound object.
What is an appropriate solution for long running computations in Azure App Service and .NET Core 3.1 in an application that has no need for a database and no IO to anything outside of this application ? It is a computation task.
Prologue
A few years ago I ran into a pretty similar problem. We needed a service that could process large amounts of data. Sometimes the processing would take 10 seconds, other times it could take an hour.
At first we did it the way your question illustrates: send a request to the service, the service processes the data from the request and returns the response when finished.
Issues At Hand
This was fine when the job only took around a minute or less, but for anything above that, the server would shut down the session and the caller would report an error.
Servers have a default of around 2 minutes to produce a response before they give up on the request. They don't quit processing the request... but they do quit the HTTP session. It doesn't matter what parameters you set on your HttpClient; the server is the one that decides how long is too long.
Reasons For Issues
All this is for good reasons. Server sockets are extremely expensive. You have a finite amount to go around. The server is trying to protect your service by severing requests that are taking longer than a specified time in order to avoid socket starvation issues.
Typically you want your HTTP requests to take only a few milliseconds. If they are taking longer than this, you will eventually run in to socket issues if your service has to fulfil other requests at a high rate.
Solution
We decided to go the route of IHostedService, specifically the BackgroundService. We use this service in conjunction with a queue. This way you can set up a queue of jobs and the BackgroundService will process them one at a time (in some instances we have the service processing multiple queue items at once, in others we scaled horizontally, producing two or more queues).
Why an ASP.NET Core service running a BackgroundService? I wanted to handle this without tightly coupling to any Azure-specific constructs, in case we needed to move out of Azure to some other cloud service (back in the day we were contemplating this for other reasons we had at the time).
This has worked out quite well for us and we haven't seen any issues since.
The work flow goes like this:
Caller sends a request to the service with some parameters
Service generates a "job" object and returns an ID immediately via a 202 (Accepted) response
Service places this job in to a queue that is being maintained by a BackgroundService
Caller can query the job status and get information about how much has been done and how much is left to go using this job ID
Service finishes the job, puts the job in to a "completed" state and goes back to waiting on the queue to produce more jobs
Keep in mind your service has the capability to scale horizontally where there would be more than one instance running. In this case I am using Redis Cache to store the state of the jobs so that all instances share the same state.
I also added in a "Memory Cache" option to test things locally if you don't have a Redis Cache available. You could run the "Memory Cache" service on a server, just know that if it scales then your data will be inconsistent.
Example
Since I'm married with kids, I really don't do much on Friday nights after everyone goes to bed, so I spent some time putting together an example that you can try out. The full solution is also available for you to try out.
QueuedBackgroundService.cs
This class implementation serves two specific purposes: One is to read from the queue (the BackgroundService implementation), the other is to write to the queue (the IQueuedBackgroundService implementation).
public interface IQueuedBackgroundService
{
    Task<JobCreatedModel> PostWorkItemAsync(JobParametersModel jobParameters);
}

public sealed class QueuedBackgroundService : BackgroundService, IQueuedBackgroundService
{
    private sealed class JobQueueItem
    {
        public string JobId { get; set; }
        public JobParametersModel JobParameters { get; set; }
    }

    private readonly IComputationWorkService _workService;
    private readonly IComputationJobStatusService _jobStatusService;

    // Shared between BackgroundService and IQueuedBackgroundService.
    // The queueing mechanism could be moved out to a singleton service. I am doing
    // it this way for simplicity's sake.
    private static readonly ConcurrentQueue<JobQueueItem> _queue =
        new ConcurrentQueue<JobQueueItem>();
    private static readonly SemaphoreSlim _signal = new SemaphoreSlim(0);

    public QueuedBackgroundService(IComputationWorkService workService,
        IComputationJobStatusService jobStatusService)
    {
        _workService = workService;
        _jobStatusService = jobStatusService;
    }

    /// <summary>
    /// Transient method via IQueuedBackgroundService
    /// </summary>
    public async Task<JobCreatedModel> PostWorkItemAsync(JobParametersModel jobParameters)
    {
        var jobId = await _jobStatusService.CreateJobAsync(jobParameters).ConfigureAwait(false);
        _queue.Enqueue(new JobQueueItem { JobId = jobId, JobParameters = jobParameters });
        _signal.Release(); // signal for background service to start working on the job
        return new JobCreatedModel { JobId = jobId, QueuePosition = _queue.Count };
    }

    /// <summary>
    /// Long running task via BackgroundService
    /// </summary>
    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        while(!stoppingToken.IsCancellationRequested)
        {
            JobQueueItem jobQueueItem = null;
            try
            {
                // wait for the queue to signal there is something that needs to be done
                await _signal.WaitAsync(stoppingToken).ConfigureAwait(false);

                // dequeue the item
                jobQueueItem = _queue.TryDequeue(out var workItem) ? workItem : null;

                if(jobQueueItem != null)
                {
                    // put the job in to a "processing" state
                    await _jobStatusService.UpdateJobStatusAsync(
                        jobQueueItem.JobId, JobStatus.Processing).ConfigureAwait(false);

                    // the heavy lifting is done here...
                    var result = await _workService.DoWorkAsync(
                        jobQueueItem.JobId, jobQueueItem.JobParameters,
                        stoppingToken).ConfigureAwait(false);

                    // store the result of the work and set the status to "finished"
                    await _jobStatusService.StoreJobResultAsync(
                        jobQueueItem.JobId, result, JobStatus.Success).ConfigureAwait(false);
                }
            }
            catch(TaskCanceledException)
            {
                break;
            }
            catch(Exception ex)
            {
                try
                {
                    // something went wrong. Put the job in to an errored state and continue on
                    await _jobStatusService.StoreJobResultAsync(jobQueueItem.JobId, new JobResultModel
                    {
                        Exception = new JobExceptionModel(ex)
                    }, JobStatus.Errored).ConfigureAwait(false);
                }
                catch(Exception)
                {
                    // TODO: log this
                }
            }
        }
    }
}
It is registered like so:
services.AddHostedService<QueuedBackgroundService>();
services.AddTransient<IQueuedBackgroundService, QueuedBackgroundService>();
ComputationController.cs
The controller used to read/write jobs looks like this:
[ApiController, Route("api/[controller]")]
public class ComputationController : ControllerBase
{
    private readonly IQueuedBackgroundService _queuedBackgroundService;
    private readonly IComputationJobStatusService _computationJobStatusService;

    public ComputationController(
        IQueuedBackgroundService queuedBackgroundService,
        IComputationJobStatusService computationJobStatusService)
    {
        _queuedBackgroundService = queuedBackgroundService;
        _computationJobStatusService = computationJobStatusService;
    }

    [HttpPost, Route("beginComputation")]
    [ProducesResponseType(StatusCodes.Status202Accepted, Type = typeof(JobCreatedModel))]
    public async Task<IActionResult> BeginComputation([FromBody] JobParametersModel obj)
    {
        return Accepted(
            await _queuedBackgroundService.PostWorkItemAsync(obj).ConfigureAwait(false));
    }

    [HttpGet, Route("computationStatus/{jobId}")]
    [ProducesResponseType(StatusCodes.Status200OK, Type = typeof(JobModel))]
    [ProducesResponseType(StatusCodes.Status404NotFound, Type = typeof(string))]
    public async Task<IActionResult> GetComputationResultAsync(string jobId)
    {
        var job = await _computationJobStatusService.GetJobAsync(jobId).ConfigureAwait(false);
        if(job != null)
        {
            return Ok(job);
        }
        return NotFound($"Job with ID `{jobId}` not found");
    }

    [HttpGet, Route("getAllJobs")]
    [ProducesResponseType(StatusCodes.Status200OK,
        Type = typeof(IReadOnlyDictionary<string, JobModel>))]
    public async Task<IActionResult> GetAllJobsAsync()
    {
        return Ok(await _computationJobStatusService.GetAllJobsAsync().ConfigureAwait(false));
    }

    [HttpDelete, Route("clearAllJobs")]
    [ProducesResponseType(StatusCodes.Status200OK)]
    [ProducesResponseType(StatusCodes.Status401Unauthorized)]
    public async Task<IActionResult> ClearAllJobsAsync([FromQuery] string permission)
    {
        if(permission == "this is flakey security so this can be run as a public demo")
        {
            await _computationJobStatusService.ClearAllJobsAsync().ConfigureAwait(false);
            return Ok();
        }
        return Unauthorized();
    }
}
Working Example
For as long as this question is active, I will maintain a working example you can try out. For this specific example, you can specify how many iterations you would like to run. To simulate long-running work, each iteration is 1 second. So, if you set the iteration value to 60, it will run that job for 60 seconds.
While it's running, run the computationStatus/{jobId} or getAllJobs endpoint. You can watch all the jobs update in real time.
This example is far from a fully-functioning-covering-all-edge-cases-full-blown-ready-for-production example, but it's a good start.
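As a rough illustration of how a caller might consume those endpoints, here is a hedged TypeScript sketch (the routes follow the example controller above; the status property name and values are assumptions about the job model, not code from this answer):
// Hypothetical client-side view of the start-then-poll pattern, using fetch.
async function runComputation(baseUrl: string, parameters: unknown): Promise<unknown> {
  // 1. Start the job; the service answers immediately with 202 and a job ID.
  const startRes = await fetch(`${baseUrl}/api/computation/beginComputation`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(parameters),
  });
  const { jobId } = await startRes.json();

  // 2. Poll the status endpoint until the job reaches a terminal state.
  while (true) {
    await new Promise((resolve) => setTimeout(resolve, 5000)); // poll every 5 s
    const statusRes = await fetch(`${baseUrl}/api/computation/computationStatus/${jobId}`);
    const job = await statusRes.json();
    if (job.status === "Success" || job.status === "Errored") {
      return job; // 3. The completed job carries the result (or the error details).
    }
  }
}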
Conclusion
After a few years of working in the back end, I have seen a lot of issues arise from not knowing all the "rules" of the back end. Hopefully this answer sheds some light on issues I had in the past, and hopefully it saves you from having to deal with said problems.
One option could be to try out Azure Durable Functions, which are more oriented to long-running jobs that warrant checkpoints and state, as opposed to attempting to finish within the context of the triggering request. They also have the concept of fan-out/fan-in, in case what you're describing could be divided into smaller jobs with an aggregated result.
If just raw compute is the goal, Azure Batch might be a better option since it facilitates that scaling.
I assume the actual work that needs be done is something other than iterating over a loop doing nothing, so in terms of possible parallelization I can't offer much help right now. Is the work CPU intensive or IO related?
When it comes to long running work in an Azure App Service, one of the options is to use a WebJob. A possible solution would be to post the request for computation to a queue (Storage Queue or Azure Service Bus Queue). The WebJob then processes those messages and possibly puts a new message on another queue that the requester can use to handle the results.
If the time needed for processing is guaranteed to be less than 10 minutes, you could replace the WebJob with a queue-triggered Azure Function. It is a serverless offering on Azure with great scaling possibilities.
Another option is indeed using a Worker Service or an instance of an IHostedService and doing some queue processing there.
Since you're saying that your computation succeeds at fewer iterations, a simple solution is to save your results periodically and resume the computation.
For example, say you need to perform 240 billion iterations and you know that the highest number of iterations you can perform reliably is 3 billion. I would set up the following:
A slave, which actually performs the task (240 billion iterations)
A master, which periodically receives progress input from the slave
The slave can periodically send a message to the master (say, once every 2 billion iterations?). This message should contain whatever is relevant to resume the computation should the computation be interrupted.
The master should keep track of the slave. If the master determines that the slave has died / crashed / whatever, the master should simply create a new slave which should resume computation from the last reported position.
How exactly you implement the master and slave is a matter of your personal preference.
Rather than have a single loop perform 240 billion iterations, if you can split your computation across nodes, I would try to simultaneously compute the solution in parallel across as many nodes as possible.
I personally use node.js for multicore projects. Although you are using asp.net, I include this example of node.js to illustrate the architecture that works for me.
Node.js on multi-core machines
https://dzone.com/articles/multicore-programming-in-nodejs
As Noah Stahl has mentioned in his answer, Azure Durable Functions and Azure Batch seem like options to help you achieve your goal on your platform. Please see his answer for more details.
The standard answer is to use asynchronous messaging. I have a blog series on the topic. This is particularly the case since you're already in Azure.
You already have an Azure web app service, but now you want to run code outside of a request - "request-extrinsic code". The proper way to run that code is in a separate process - Azure Functions or Azure WebJobs are a good match for Azure webapps.
First, you want a durable queue. Azure Storage Queues are a good fit since you're in Azure anyway. Then your webapi can just write a message into the queue and return. The important part here is that this is a durable queue, not an in-memory queue.
Meanwhile, the Azure Function / WebJob is processing that queue. It will pick up the work from the queue and execute it.
The final piece of the puzzle is the completion notification. This is a pretty common approach:
I can adjust my HTTP controller to instead have three (3) POST methods. Thus POST1 would mean start a task with the inbound object. POST2 means tell me if it is finished. POST3 means give me the outbound object.
To do this, your background processor should save the "in-progress" / "complete/result" state somewhere where the webapi process can access it. If you already have a shared database (and it makes sense to keep results), then this may be the easiest choice. I would also consider using Azure Cosmos DB, which has a nice time-to-live setting so the background service can inject the results that are "good for 24 hours" or whatever, after which they're automatically cleaned up.

Should an AWS Lambda function instance in Node.js pick up another request during an async await?

Let's say I've got a queue of requests for my Lambda, and inside the Lambda there might be an external service call that takes 500 ms, which is wrapped in async/await, like:
async callSlowService(serializedObject: string): Promise<void> {
  await slowServiceClient.post(serializedObject);
}
Should I expect that my Lambda instance will pick up another request off the queue while awaiting the slow call? I know it'll also spin up new Lambda instances, but that's not what I'm asking about; I mean interleaving requests on a single instance.
I'm asking because I would think that it should do this, however I'm testing with a sleep function and a load generator and it's not happening. My code actually looks like this:
async someCoreFunction(): Promise<void> {
  // Business logic
  console.log("Before wait");
  await sleep(2000);
  console.log("After wait");
}

const sleep = (milliseconds) => {
  return new Promise(resolve => setTimeout(resolve, milliseconds))
};
And while it definitely is taking 2 seconds between the "Before wait" and "After wait" statements, there's no new logs being written in that time.
No.
Lambda as a service is largely unaware of what your code is doing. It simply takes a request, invokes your code and then waits for it to return.
I would not expect AWS to implement a feature like interleaving any time soon. It would require the Lambda runtime to have substantial knowledge of how your code behaves (for example, you may be awaiting two concurrent long asynchronous calls within one invocation, so simply interrupting when you hit your first await would be incorrect). It would also cause no end of issues for people using the shared scope outside of the handler for common setup/teardown.
As you pay per invocation and per unit of time, I don't really see that there is much difference between interleaving and processing the queue in parallel (which Lambda natively supports), considering that time spent awaiting still requires some compute. If interleaving ever happens, I'd expect it to be a way for AWS to reduce the drain on their own resources.
N.B. If you are awaiting for a long time in a Lambda function, then there is probably a better way of doing things. For example, Step Functions provide a great way to kick off and poll long running tasks. Similarly, the pattern of using a session variable in your payload is a good way of allowing a long-running service to call back into Lambda without having the Lambda sit idle.

AWS Lambda does not run independently

I am using Node.js with AWS Lambda.
As far as I know, each Lambda invocation is handled in an independent, parallel process.
However, the following example shows a different result than I expected.
// test.js
const now = new Date();

module.exports = () => {
  console.log(now);
};

// handler.js
const test = require('./test');

module.exports.hello = async (event, context) => {
  test();
  return {
    statusCode: 200,
    body: null
  };
};
RESULT: (hello handler log, screenshot omitted)
As I intended it, each invocation should execute independently, so the value of console.log(now) should always be the point at which it was executed.
However, in the actual log, the value of now keeps showing the point of the very first execution rather than each invocation's execution time.
The logged value was still the same after 5 minutes.
The value did change after 12 hours, but after that it showed the same problem again.
This result makes us think seriously about how to manage DB connections. There are two assumptions, one for each case of Lambda's recycling:
If Lambda recycles modules like test.js:
it is better to use a connection pool
it also makes sense to use an ORM such as Sequelize, which requires initialization
If not:
it is better to use simple connections and plain queries, and consume connections quickly
How can we use Lambda with maximum performance?
How can we interpret the test results above?
AWS Lambda creates and reuses the containers, so you need to understand the impact of this practice on the programming model.
The first time a function executes, a new container will be created to execute it.
Let’s say your function finishes, and some time passes, then you call it again. Lambda may create a new container all over again. However, if you haven’t changed the Lambda function code and not too much time has gone by, Lambda may reuse the previous container. This offers performance advantages: Lambda gets to skip the nodejs language initialization, and you get to skip initialization in your code (so you can reuse DB connections, for example); files that you wrote to /tmp last time around will still be there if the container gets reused; anything you initialized globally outside of the Lambda function handler persists.
For more see Understanding Container Reuse in AWS Lambda.
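A hedged sketch of what that reuse means for the DB-connection question above (the 'pg' client is just an example; any client with an explicit connect step behaves the same way):
// Illustrative sketch: the pool is created once per container (cold start) and
// reused while the container stays warm; per-request values stay in the handler.
import { Pool } from "pg"; // assumes the 'pg' package; substitute your own client

const pool = new Pool({ max: 2 }); // module scope: survives warm invocations

export const handler = async () => {
  // Acquired and released per invocation; the pool itself is shared.
  const { rows } = await pool.query("SELECT now() AS db_time");
  return {
    statusCode: 200,
    body: JSON.stringify({ dbTime: rows[0].db_time, invokedAt: new Date().toISOString() }),
  };
};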
The behavior that you have described is a result of AWS optimizations. It looks like your Lambda is very fast, and it is more efficient for AWS to use only one unit of execution (process/container/instance). Try simulating a long-running process and you will see that the actual timestamps differ in that case.
