I'm working with a Redis cluster that has two or more nodes. I'm trying to figure out which tool best fits for handling concurrency: transactions or locking. Transactions are well documented, but I couldn't find a good best-practice example for Redlock. I also wonder why two tools exist and what the use case for each is.
For simplicity, let's assume I want to do a concurrent increment and there is no INCR command in Redis.
Option 1. Using Transactions
If I understand correctly, the Node.js pseudocode would look like this:
const transactIncrement = async (key) => {
  await redisClient.watch(key);
  let value = Number(await redisClient.get(key)); // note: get must be awaited
  value = value + 1;
  const multi = redisClient.multi();
  multi.set(key, value);
  try {
    await multi.exec();
  } catch (e) {
    // most probably the error was thrown because the transaction was aborted
    // TODO: think about whether restarting in every case is a good idea,
    // since it introduces a potential infinite loop
    // whatever, restart
    await transactIncrement(key);
  }
};
Bad things I can see above are:
the try-catch block
the possibility to use transactions with multiple keys is limited on a Redis cluster
Option 2. Redlock
Is it true that trying to lock a resource that's already locked would not cause an immediate failure, so that Redlock retries N times before erroring?
If true then here's my pseudocode:
const redlockIncrement = async (key) => {
  // lock a separate resource name rather than the data key itself, with a
  // TTL in milliseconds (a TTL of 1 would expire almost immediately)
  const lock = await redlock.lock(`locks:${key}`, 1000);
  // below this line it's guaranteed that other "threads" are put on hold
  // and cannot access the key, right?
  let value = Number(await redisClient.get(key));
  value = value + 1;
  await redisClient.set(key, value);
  await lock.unlock();
};
Summary
If I got things right, then Redlock is definitely the more powerful technique. Please correct me if I'm wrong in the above assumptions. It would also be really great if someone could provide example code solving a similar problem, because I couldn't find one.
Redlock is useful when you have a distributed set of components that you want to coordinate to create an atomic operation.
You wouldn't use it for operations that affect a single Redis node. That's because Redis already has much simpler and more reliable means of ensuring atomicity for commands that use its single-threaded server: transactions or scripting. (You didn't mention Lua scripting, but that's the most powerful way to create custom atomic commands).
Since INCR operates on a single key, and therefore on a single node, the best way to implement that would be with a simple Lua script.
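For illustration, here's a minimal sketch of that, assuming node-redis v4 and a plain string counter stored at the key (the script runs atomically on whichever node owns the key's hash slot):

const luaIncrement = async (key) => {
  // GET, add one, SET: all three steps execute as one atomic unit on the server
  const script = `
    local value = tonumber(redis.call('GET', KEYS[1]) or '0')
    value = value + 1
    redis.call('SET', KEYS[1], value)
    return value
  `;
  return redisClient.eval(script, { keys: [key] });
};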
Now, if you want to use a sequence of commands that spans multiple nodes, neither transactions nor scripting will work. In that case you could use Redlock or a similar distributed lock. However, you would generally try to avoid that in your Redis design. Specifically, you would use hash tags to force certain keys to reside on the same node:
Hash tags are a way to ensure that multiple keys are allocated in the same hash slot. This is used in order to implement multi-key operations in Redis Cluster.
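For example (a hedged sketch, assuming node-redis v4): both keys below contain the hash tag {user:1000}, so they map to the same hash slot and can safely participate in a single MULTI/EXEC on a cluster:

// only the text between { and } is hashed, so both keys land in one slot
await redisClient.multi()
  .set('{user:1000}:name', 'Alice')
  .set('{user:1000}:visits', '0')
  .exec();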
Related
I am facing a concurrency problem with Redis. My API is built on Node.js Fastify and I am using fastify-redis in my API call.
I am using two simple Redis methods, ZRANGEBYSCORE and ZINCRBY.
The problem is that under high concurrency the ZINCRBY executes late, which results in the same value being returned for multiple requests.
How can I prevent this under high concurrency? Is there any method to lock the key that was previously read?
Here is an example of my code:
const numbers = await redis.zrangebyscore(
  `user:${req.query.key}:${state}`, // key
  0, // min value
  50, // max value
  "LIMIT",
  0, // offset
  1 // limit
);
if (numbers.length > 0) {
  await redis.zincrby(`user:${req.query.key}:${state}`, 1, numbers[0]);
  res.send(numbers[0]);
}
The issue isn't concurrency per se, it's that you have a series of operations that need to be atomic and you haven't done anything to ensure that.
Redis has facilities to try to ensure atomicity, as described here. In your case, since the value from the first operation is used in the second, you couldn't do a simple MULTI and EXEC. You'd have to instead WATCH the key and then retry the operation if it aborted.
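A hedged sketch of the WATCH route, assuming fastify-redis hands you an ioredis client (in ioredis, exec() resolves to null when a watched key changed, which is the signal to retry):

async function claimNumber(key) {
  while (true) {
    await redis.watch(key);
    const numbers = await redis.zrangebyscore(key, 0, 50, "LIMIT", 0, 1);
    if (numbers.length === 0) {
      await redis.unwatch();
      return null;
    }
    // a null result means the watched key was modified, so loop and retry
    const result = await redis.multi().zincrby(key, 1, numbers[0]).exec();
    if (result !== null) {
      return numbers[0];
    }
  }
}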
The simpler, and recommended approach, though, is just to put your above code into a Lua script, where it can be executed on the server as a single atomic operation.
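A hedged sketch of that, again assuming an ioredis client (defineCommand registers the script and exposes it as a method on the client):

// the Lua script performs the range read and the increment as one atomic operation
redis.defineCommand('claimNumber', {
  numberOfKeys: 1,
  lua: `
    local numbers = redis.call('ZRANGEBYSCORE', KEYS[1], 0, 50, 'LIMIT', 0, 1)
    if #numbers > 0 then
      redis.call('ZINCRBY', KEYS[1], 1, numbers[1])
      return numbers[1]
    end
    return nil
  `,
});

// usage in the request handler:
const number = await redis.claimNumber(`user:${req.query.key}:${state}`);
if (number) {
  res.send(number);
}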
I am getting n POST requests (one per webhook trigger) from a webhook. The data is identical on all requests that come from the same trigger - they all have the same 'orderId'. I'm interested in saving only one of these requests, so on each endpoint hit I check whether this specific orderId already exists as a row in my database, and otherwise create it.
if ((await orderIdExists) === null) {
  await Order.create({
    userId,
    status: PENDING,
    price,
    ...
  });
  await sleep(3000);
}
return res.status(HttpStatus.OK).send({ success: true });
} catch (error) {
  return res.status(HttpStatus.INTERNAL_SERVER_ERROR).send({ success: false });
}
} else {
  return res.status(HttpStatus.UNAUTHORIZED).send(responseBuilder(false, responseErrorCodes.INVALID_API_KEY, {}, req.t));
}
}

function sleep(ms) {
  return new Promise((resolve) => {
    setTimeout(resolve, ms);
  });
}
The problem is that before Sequelize manages to save the newly created order in the db (all of the n POST requests hit the endpoint within a second or less), I already get another endpoint hit from the other n POST requests while orderIdExists still equals null, so it ends up creating more identical orders. One (not so good) solution is to make orderId unique in the db, which prevents the creation of an order with the same orderId, but the insert is still attempted anyway, which leaves gaps in the auto-incremented ids in the db. Any idea would be greatly appreciated.
P.S. As you can see, I tried adding a 'sleep' function, to no avail.
Your database is failing to complete its save operation before the next request arrives. The problem is similar to the Dogpile Effect or a "cache slam".
This requires some more thinking about how you are framing the problem: in other words the "solution" will be more philosophical and perhaps have less to do with code, so your results on StackOverflow may vary.
The "sleep" solution is no solution at all: there's no guarantee how long the database operation might take or how long you might wait before another duplicate request arrives. As a rule of thumb, any time "sleep" is deployed as a "solution" to problems of concurrency, it usually is the wrong choice.
Let me posit two possible ways of dealing with this:
Option 1: write-only: i.e. don't try to "solve" this by reading from the database before you write to it. Just keep the pipeline leading into the database as dumb as possible and keep writing. E.g. consider a "logging" table that just stores whatever the webhook throws at it -- don't try to read from it, just keep inserting (or upserting). If you get 100 ping-backs about a specific order, so be it: your table would log it all and if you end up with 100 rows for a single orderId, let some other downstream process worry about what to do with all that duplicated data. Presumably, Sequelize is smart enough (and your database supports whatever process locking) to queue up the operations and deal with write repetitions.
An upsert operation here would be helpful if you do want to have a unique constraint on the orderId (this seems sensible, but you may be aware of other considerations in your particular setup).
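A hedged sketch of that, assuming a Sequelize model with a unique index on orderId (upsert inserts the row when it's missing and updates it in place otherwise, so repeated deliveries collapse into one row):

// assumes Order declares a unique constraint on orderId
await Order.upsert({
  orderId,
  userId,
  status: PENDING,
  price,
});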
Option 2: use a queue. This is decidedly more complex, so weigh carefully whether or not your use-case justifies the extra work. Instead of writing data immediately to the database, throw the webhook data into a queue (e.g. a first-in-first-out FIFO queue). Ideally, you would want to choose a queue that supports de-duplication so that the messages in it are guaranteed to be unique, but that implies state, and that usually relies on a database of some sort, which is sort of the problem to begin with.
The most important thing a queue would do for you is it would serialize the messages so you can deal with them one at a time (instead of multiple database operations kicking off concurrently). You can upsert data into the database when you read a message out of the queue. If the webhook keeps firing and more messages enter the queue, that's fine because the queue forces them all to line up single-file and you can handle each insertion one at a time. You'll know that each database operation has completed before it moves on to the next message so you never "slam" the DB. In other words, putting a queue in front of the database will allow it to handle data when the database is ready instead of whenever a webhook comes calling.
The idea of a queue here is similar to what a semaphore accomplishes. Note that your database interface may already implement a kind of queue/pool under-the-hood, so weigh this option carefully: don't reinvent a wheel.
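As a bare-bones illustration of that serialization idea (a minimal in-process sketch, assuming a single Node instance; a real queue service would replace this in a multi-instance deployment):

// each webhook body is appended to a promise chain, so the database writes
// run strictly one at a time, in arrival order
let chain = Promise.resolve();

function enqueue(orderData) {
  chain = chain
    .then(() => Order.upsert(orderData)) // one write finishes before the next starts
    .catch(console.error); // keep the chain alive after a failure
  return chain;
}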
Hope those ideas are useful.
You saved my time, #Everett and #april-henig. I found that saving directly into the database led to duplicate records. Storing the records in an object and dealing with one record at a time helped me a lot.
Maybe I should share my solution; perhaps someone may find it useful in the future.
Create an empty object to save success request
export const queueAllSuccessCallBack = {};
Save POST request in object
if (status === 'success') { // only handle requests that completed successfully
  const findKeyTransaction = queueAllSuccessCallBack[client_reference_id];
  if (!findKeyTransaction) { // check that the id has not been added yet, to avoid duplicates
    queueAllSuccessCallBack[client_reference_id] = {
      transFound,
      body,
    }; // save the new request id as the key, and whatever data you want as the value
  }
}
Access the object to save into database
const keys = Object.keys(queueAllSuccessCallBack);
keys.forEach(async (key) => {
  ...
  // Do extra checks if you want to do so
  // Or save in the database directly
});
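One caveat with the snippet above: Array.prototype.forEach ignores the promises an async callback returns, so the iterations actually run concurrently. A for...of loop with await keeps the one-record-at-a-time behavior this answer is going for (a hedged sketch; saveTransaction is a hypothetical helper that writes to the database):

for (const key of Object.keys(queueAllSuccessCallBack)) {
  const { transFound, body } = queueAllSuccessCallBack[key];
  await saveTransaction(transFound, body); // hypothetical db-write helper
}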
I need NodeJS to prevent concurrent operations for the same requests. From what I understand, if NodeJS receives multiple requests, this is what happens:
REQUEST1 ---> DATABASE_READ
REQUEST2 ---> DATABASE_READ
DATABASE_READ complete ---> EXPENSIVE_OP() --> REQUEST1_END
DATABASE_READ complete ---> EXPENSIVE_OP() --> REQUEST2_END
This results in two expensive operations running. What I need is something like this:
REQUEST1 ---> DATABASE_READ
DATABASE_READ complete ---> DATABASE_UPDATE
DATABASE_UPDATE complete ---> REQUEST2 ---> DATABASE_READ --> REQUEST2_END
---> EXPENSIVE_OP() --> REQUEST1_END
This is what it looks like in code. The problem is the window between when the app starts reading the cache value and when it finishes writing to it. During this window, the concurrent requests don't know that there is already one request with the same itemID running.
app.post("/api", async function(req, res) {
const itemID = req.body.itemID
// See if itemID is processing
const processing = await DATABASE_READ(itemID)
// Due to how NodeJS works,
// from this point in time all requests
// to /api?itemID="xxx" will have processing = false
// and will conduct expensive operations
if (processing == true) {
// "Cheap" part
// Tell client to wait until itemID is processed
} else {
// "Expensive" part
DATABASE_UPDATE({[itemID]: true})
// All requests to /api at this point
// are still going here and conducting
// duplicate operations.
// Only after DATABASE_UPDATE finishes,
// all requests go to the "Cheap" part
DO_EXPENSIVE_THINGS();
}
}
Edit
Of course I can do something like this:
const lockedIDs = {}
app.post("/api", function(req, res) {
  const itemID = req.body.itemID
  const locked = lockedIDs[itemID] ? true : false // sync equivalent to async DATABASE_READ(itemID)
  if (locked) {
    // Tell client to wait until itemID is processed
    // No need to do expensive operations
  } else {
    lockedIDs[itemID] = true // sync equivalent to async DATABASE_UPDATE({[itemID]: true})
    // Do expensive operations
    // itemID is now "locked", so subsequent requests will not go here
  }
})
lockedIDs here behaves like an in-memory synchronous key-value database. That is OK if it is just one server. But what if there are multiple server instances? Then I need to have separate cache storage, like Redis. And I can access Redis only asynchronously. So this will not work, unfortunately.
Ok, let me take a crack at this.
So, the problem I'm having with this question is that you've abstracted the problem so much that it's really hard to help you optimize. It's not clear what your "long running process" is doing, and what it is doing will affect how to solve the challenge of handling multiple concurrent requests. What's your API doing that you're worried about consuming resources?
From your code, at first I guessed that you're kicking off some kind of long-running job (e.g. file conversion or something), but then some of the edits and comments make me think that it might be just a complex query against the database which requires a lot of calculations to get right and so you want to cache the query results. But I could also see it being something else, like a query against a bunch of third party APIs that you're aggregating or something. Each scenario has some nuance that changes what's optimal.
That said, I'll explain the 'cache' scenario and you can tell me if you're more interested in one of the other solutions.
Basically, you're in the right ballpark for the cache already. If you haven't already, I'd recommend looking at cache-manager, which simplifies your boilerplate a little for these scenarios (and lets you set cache invalidation and even have multi-tier caching). The piece that you're missing is that you essentially should always respond with whatever you have in the cache, and populate the cache outside the scope of any given request. Using your code as a starting point, something like this (leaving off all the try..catches and error checking and such for simplicity):
// A GET is OK here, because no matter what we're firing back a response quickly,
// and semantically this is a query
app.get("/api", async function(req, res) {
  const itemID = req.query.itemID
  // In this case, I'm assuming you have a cache object that basically gets whatever
  // is cached in your cache storage and can set new things there too.
  let item = await cache.get(itemID)
  // Item isn't in the cache at all, so this is the very first attempt.
  if (!item) {
    // go ahead and let the client know we'll get to it later. 202 Accepted should
    // be fine, but pick your own status code to let them know it's in process.
    // Other good options include 503 Service Unavailable with a Retry-After
    // header and 420 Enhance Your Calm (non-standard, but funny)
    res.status(202).send({ id: itemID });
    // put an empty object in there so we know it's working on it.
    await cache.set(itemID, {});
    // start the long-running process, which should update the cache when it's done
    await populateCache(itemID);
    return;
  }
  // Here we have an item in the cache, but it's not done processing. Maybe you
  // could just check to see if it's an empty object or not, but I'm assuming
  // that we've setup a boolean flag on the cached object for when it's done.
  if (!item.processed) {
    // The client should try again later like above. Exit early. You could
    // alternatively send the partial item, an empty object, or a message.
    return res.status(202).send({ id: itemID });
  }
  // if we get here, the item is in the cache and done processing.
  return res.send(item);
})
Now, I don't know precisely what all your stuff does, but if it's me, populateCache from above is a pretty simple function that just calls whatever service we're using to do the long-running work and then puts it into the cache.
async function populateCache(itemId) {
  const item = await service.createThisWorkOfArt(itemId);
  await cache.set(itemId, item);
  return;
}
Let me know if that's not clear or if your scenario is really different from what I'm guessing.
As mentioned in the comments, this approach will cover most normal issues you might have with your described scenario, but it will still allow two requests to both fire off the long-running process, if they come in faster than the write to your cache store (e.g. Redis). I judge the odds of that happening are pretty low, but if you're really concerned about that then the next more paranoid version of this would be to simply remove the long-running process code from your web API altogether. Instead, your API just records that someone requested that stuff to happen, and if there's nothing in the cache then respond as I did above, but completely remove the block that actually calls populateCache altogether.
Instead, you would have a separate worker process running that would periodically (how often depends on your business case) check the cache for unprocessed jobs and kick off the work for processing them. By doing it this way, even if you have 1000's of concurrent requests for the same item, you can ensure that you're only processing it one time. The downside of course is that you add whatever the periodicity of the check is to the delay in getting the fully processed data.
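A hedged sketch of that worker, reusing the cache and populateCache pieces from above (listUnprocessed is a hypothetical helper that returns the itemIDs whose cached entries lack the processed flag):

async function workerLoop() {
  // hypothetical helper: cache entries that are still waiting to be processed
  const pending = await cache.listUnprocessed();
  for (const itemID of pending) {
    // strictly one at a time, so each item is only ever processed once
    await populateCache(itemID);
  }
  // the periodicity depends on your business case
  setTimeout(workerLoop, 5000);
}
workerLoop();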
You could create a local Map object (in memory for synchronous access) that contains any itemID as a key that is being processed. You could make the value for that key be a promise that resolves with whatever the result is from anyone who has previously processed that key. I think of this like a gate keeper. It keeps track of which itemIDs are being processed.
This scheme tells future requests for the same itemID to wait and does not block other requests - I thought that was important rather than just using a global lock on all requests related to itemID processing.
Then, as part of your processing, you first check the local Map object. If that key is in there, then it's currently being processed. You can then just await the promise from the Map object to see when it's done being processed and get any result from prior processing.
If it's not in the Map object, then it's not being processed now and you can immediately put it in Map to mark it as "in process". If you set a promise as the value, then you can resolve that promise with whatever result you get from this processing of the object.
Any other requests that come along will end up just waiting on that promise and you will thus only process this ID once. The first one to start with that ID will process it and all other requests that come along while it's processing will use the same shared result (thus saving the duplication of your heavy computation).
I tried to code up an example, but did not really understand what your pseudo-code was trying to do well enough to offer a code example.
Systems like this have to have perfect error handling so that all possible error paths handle the Map and promise embedded in the Map properly.
Based on your fairly light pseudo-code example, here's a similar pseudo code example that illustrates the above concept:
const itemInProcessCache = new Map();

app.get("/api", async function(req, res) {
  const itemID = req.query.itemID
  let gate = itemInProcessCache.get(itemID);
  if (gate) {
    gate.then(val => {
      // use cached result here from previous processing
    }).catch(err => {
      // decide what to do when previous processing had an error
    });
  } else {
    let p = DATABASE_UPDATE({itemID: true}).then(result => {
      // expensive processing done
      // return final value so any others waiting on the gate can just use that value
      // decide if you want to clear this item from itemInProcessCache or not
    }).catch(err => {
      // error on expensive processing
      // remove from the gate cache because we didn't get a result
      // expensive processing will have to be done by someone else
      itemInProcessCache.delete(itemID);
    });
    // mark this item as being processed
    itemInProcessCache.set(itemID, p);
  }
});
Note: This relies on the single-threadedness of node.js. No other request can get started until the request handler here returns so that itemInProcessCache.set(itemID, p); gets called before any other requests for this itemID could get started.
Also, I don't know databases very well, but this seems very much like a feature that a good multi-user database might have built in or have supporting features that makes this easier since it's not an uncommon idea to not want to have multiple requests all trying to do the same database work (or worse yet, trouncing each other's work).
I'm dealing with a situation where multiple threads are accessing this method
using (var tx = StateManager.CreateTransaction())
{
    var item = await reliableDictionary.GetAsync(tx, key);
    ... // Do work on a copy of item
    await reliableDictionary.SetAsync(tx, key, item);
    await tx.CommitAsync();
}
Single-threaded this works well, but when I try accessing the dictionary this way using multiple threads, I encounter a System.TimeoutException.
The only way I've been able to get around it is to use LockMode.Update on the GetAsync(...) method. Has anyone here experienced something like this?
I'm wondering if there is a way to read with snapshot isolation, which would allow a read with no lock on it, as opposed to a read with a shared lock on the record.
I've tried doing this with both a shared transaction as shown above as well as individual transactions for the get and the set. Any help would be appreciated.
The default lock when reading (caused by GetAsync) is a shared lock.
If you want to write, you need an exclusive lock. You can't get it if shared locks exist.
Getting the first lock as an update lock prevents this, like you noticed.
Snapshot isolation happens when enumerating records, which you're not doing with GetAsync.
More info here.
I'm using Redis to generate IDs for my in-memory stored models. The Redis client requires a callback to the INCR command, which means the code looks like:
client.incr('foo', function(err, id) {
  ... continue on here
});
The problem is that I have already written the other part of the app, which expects the incr call to be synchronous and just return the ID, so that I can use it like:
var id = client.incr('foo');
The reason why I got to this problem is that up until now, I was generating the IDs just in memory with a simple closure counter function, like
var counter = (function() {
  var count = 0;
  return function() {
    return ++count;
  }
})();
to simplify the testing and just general setup.
Does this mean that my app is flawed by design and I need to rewrite it to expect callback on generating IDs? Or is there any simple way to just synchronize the call?
Node.js in its essence is an async I/O library (with plugins). So, by definition, there's no synchronous I/O there and you should rewrite your app.
It is a bit of a pain, but what you have to do is wrap the logic that you had after the counter was generated into a function, and call that from the Redis callback. If you had something like this:
var id = get_synchronous_id();
processIdSomehow(id);
you'll need to do something like this.
var runIdLogic = function(id){
  processIdSomehow(id);
}

client.incr('foo', function(err, id) {
  runIdLogic(id);
});
You'll need the appropriate error checking, but something like that should work for you.
There are a couple of sequential programming layers for Node (such as TameJS) that might help with what you want, but those generally do recompilation or things like that: you'll have to decide how comfortable you are with that if you want to use them.
#Sergio said this briefly in his answer, but I wanted to write a little more of an expanded answer. node.js is an asynchronous design. It runs in a single thread, which means that in order to remain fast and handle many concurrent operations, all blocking calls must have a callback for their return value to run them asynchronously.
That does not mean that synchronous calls are not possible. They are, and it's a concern for how much you trust 3rd-party plugins. If someone decides to write a call in their plugin that does block, you are at the mercy of that call, and it might even be something internal that is not exposed in their API. Thus, it can block your entire app. Consider what might happen if Redis took a significant amount of time to return, and then multiply that by the number of clients that could potentially be accessing that same routine. The entire logic has been serialized and they all wait.
In answer to your last question, you should not work towards accommodating a blocking approach. It may seem like a simple solution now, but it's counter-intuitive to the benefits of node.js in the first place. If you are only comfortable in a synchronous design workflow, you may want to consider another framework that is designed that way (with threads). If you want to stick with node.js, rewrite your existing logic to conform to a callback style. From the code examples I have seen, it tends to look like a nested set of functions, as callback uses callback, etc., until it can return from that recursive stack.
The application state in node.js is normally passed around as an object. What I would do is closer to:
var state = {}
client.incr('foo', function(err, id) {
state.id = id;
doSomethingWithId(state.id);
});
function doSomethingWithId(id) {
// reuse state if necessary
}
It's just a different way of doing things.
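(As a side note: a hedged sketch of the same flow using Node's built-in util.promisify, assuming a callback-style client like the one above; it flattens the nesting without changing the asynchronous design.)

const { promisify } = require('util');
// wrap the callback-style incr in a promise-returning function
const incrAsync = promisify(client.incr).bind(client);

async function main() {
  const id = await incrAsync('foo'); // still non-blocking under the hood
  processIdSomehow(id);
}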