Redis Bitset operations in Node.js / Express.js - node.js

I'm very new to Node.js and Redis. I want to use a bitset to store all the user information for my Express.js app, as described in this article: http://blog.getspool.com/2011/11/29/fast-easy-realtime-metrics-using-redis-bitmaps/
I'm having a bit of trouble. In my function, I get the current year, month, and date, and then use client.setbit() to set the appropriate key and value. But how can I count all the keys? I'm on Redis 2.4.x, and the BITCOUNT command only arrives in 2.6. Is there any other way? The article uses a Java bitset, which is a different thing, and I don't quite understand it.
How could I use, for example, a for loop, to count all the bits set to 1? Is there any operation to count the size of the bitset, so I could do something like this:
for (var i = initial_offset; i < bitset_length; i++){
if (i == 1){
total_users++;
}
}
Or am I going about it in a totally wrong way?

You need to count the number of bits of a given string stored in Redis.
There are basically two ways to do this:
you can try to do it on server-side with Redis 2.6 and the new BITCOUNT/BITOP operations.
you can retrieve the whole string (containing all the bits) and process the data on client side. In the original article, the author retrieves the Redis string and converts it to a Java bitset on which bit-level algorithms can be applied. The same strategy can be applied with any client, any language: you just have to find a good library to deal with arrays of bits, or implement one by yourself (it is not that hard). It would work with Redis 2.2 or higher.
A strategy that would not work very well is to iterate on the client side and check each individual bit by executing the GETBIT command. With one round trip per bit, it would be really inefficient.
With node.js, here are a few resources you may want to use to implement the second option:
https://gist.github.com/1455345
https://github.com/bramstein/bit-array
How do I create bit array in Javascript?
Node.js is not a very good environment to implement CPU consuming operations, but in the worst case, should you have very large bitsets, you can still rely on an efficient C++ implementation to be called from Node.js. You have a good one in boost::dynamic_bitset.
Here is a Node.js example with a very simple (and probably inefficient) counting algorithm:
var redis = require('redis')
var rc = redis.createClient(6379, 'localhost', {return_buffers:true} );
var bitcnt = [ 0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4,1,2,2,3,2,3,3,4,2,3,3,4,3,4,4,5,1,2,2,3,2,3,3,4,2,3,3,4,3,4,4,5,2,3,3,4,3,4,4,5,3,4,4,5,4,5,5,6,1,2,2,3,2,3,3,4,2,3,3,4,3,4,4,5,2,3,3,4,3,4,4,5,3,4,4,5,4,5,5,6,2,3,3,4,3,4,4,5,3,4,4,5,4,5,5,6,3,4,4,5,4,5,5,6,4,5,5,6,5,6,6,7,1,2,2,3,2,3,3,4,2,3,3,4,3,4,4,5,2,3,3,4,3,4,4,5,3,4,4,5,4,5,5,6,2,3,3,4,3,4,4,5,3,4,4,5,4,5,5,6,3,4,4,5,4,5,5,6,4,5,5,6,5,6,6,7,2,3,3,4,3,4,4,5,3,4,4,5,4,5,5,6,3,4,4,5,4,5,5,6,4,5,5,6,5,6,6,7,3,4,4,5,4,5,5,6,4,5,5,6,5,6,6,7,4,5,5,6,5,6,6,7,5,6,6,7,6,7,7,8]
function count(b)
{
var cnt = 0
for (var i=0; i<b.length; ++i ) {
cnt += bitcnt[ b[i] ]
}
return cnt
}
function fetch( callback )
{
rc.get( 'mybitset', function(err,reply) {
callback(reply)
});
}
function fill( callback )
{
rc.setbit( 'mybitset', 0, 1 )
rc.setbit( 'mybitset', 10, 1 )
rc.setbit( 'mybitset', 20, 1 )
rc.setbit( 'mybitset', 60, 1, function(err,reply) {
callback()
});
}
rc.flushall( function(err,rr) {
fill( function() {
fetch( function(b) {
console.log( "Count = ",count(b) );
});
})
})
Please note the {return_buffers:true} option is used to be sure Redis output is processed as binary data (ignoring possible character conversion).
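For the for-loop approach asked about in the question, here is a minimal sketch that works on the Buffer returned by GET instead of calling GETBIT once per bit. It assumes the Redis bit ordering, where offset 0 is the most significant bit of the first byte. The helper names getBit and countBits and the sample buffer are illustrative only and not part of any library.

```javascript
// Test the bit at a Redis-style offset inside a Buffer.
// Assumption: offset 0 is the most significant bit of byte 0, as with SETBIT/GETBIT.
function getBit(buf, offset) {
  var byteIndex = offset >> 3;
  if (byteIndex >= buf.length) return 0; // past the end of the string: treated as 0
  return (buf[byteIndex] >> (7 - (offset & 7))) & 1;
}

// Count the set bits with a plain loop over every offset.
function countBits(buf) {
  var total = 0;
  for (var i = 0; i < buf.length * 8; i++) {
    if (getBit(buf, i) === 1) total++;
  }
  return total;
}

// Hand-built bytes with bits 0, 10, 20 and 60 set, mirroring the fill() example above.
var sample = Buffer.from([0x80, 0x20, 0x08, 0x00, 0x00, 0x00, 0x00, 0x08]);
console.log(countBits(sample)); // 4
```

The lookup-table version above does less work per byte; this loop is mainly useful when you also need to know which offsets are set.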

Related

How can I run multiple instances of a for loop in NodeJS?

I have a function which returns the usage of a CPU core with the help of a library called cpu-stat:
const cpuStat = require('cpu-stat')
var coreCount = cpuStat.totalCores()
var memArr = []
function getCoreUsage(i) {
return new Promise(async(resolve) => {
if (i === 0) {
cpuStat.usagePercent({coreIndex: i,sampleMs: 1000,},
async function(err, percent, seconds) {
if (err) {resolve(console.log(err))}
x = await percent
resolve("Core0: " + x.toFixed(2) + "%");
});
} else {
cpuStat.usagePercent({coreIndex: i,sampleMs: 1000,},
async function(err, percent, seconds) {
if (err) {resolve(console.log(err))}
x = await percent
resolve(x);
});
}
})
}
This function is called whenever a client requests a specific route:
function singleCore() {
return new Promise(async(resolve) => {
for (i=0; i <= coreCount; i++) {
if (i < coreCount) {core = await getCoreUsage(i), memArr.push(core)}
else if (i === coreCount) {resolve(memArr), memArr = []}
}
})
}
Now, this works just fine on machines which have fewer than 8 cores. The problem I am running into is that if I (hypothetically) use a high-core-count CPU like a Xeon or a Threadripper, the time it takes to get the usage will be close to a minute or so, because they can have 56 or 64 cores respectively. To solve this, I thought of executing the for loop for each core on different threads so that the time comes down to one or two seconds (high-core-count CPUs have a lot of threads as well, so this probably won't be a problem).
But, I can't figure out how to do this. I looked into the child_process documentation and I think this can probably be done. Please correct me if I am wrong. Also, please suggest a better way if you know one.
This usagePercent function works by
1. looking at the CPU time counters in os.cpus()[index], in the array returned by Node's os module.
2. waiting for the chosen sample time, probably with setTimeout.
3. looking at the counters again and computing the difference.
You'll get reasonably valid results if you use much shorter time intervals than one second.
Or you can rework the code in the package to do the computation for all cores in step 3 and return an array rather than just one number.
Or you can use Promise.all() to run these tests concurrently.
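The "all cores in one pass" idea can be sketched with Node's os module alone, without touching the cpu-stat package. This is an illustration under assumptions, not the package's actual code: snapshot and allCoreUsage are hypothetical names, and usage is taken as the non-idle share of the time elapsed between two os.cpus() readings.

```javascript
const os = require('os');

// Record the cumulative idle and total time counters for every core.
function snapshot() {
  return os.cpus().map((cpu) => {
    const t = cpu.times;
    return { idle: t.idle, total: t.user + t.nice + t.sys + t.idle + t.irq };
  });
}

// Resolve with one usage percentage per core after a single wait of sampleMs.
function allCoreUsage(sampleMs) {
  const before = snapshot();
  return new Promise((resolve) => {
    setTimeout(() => {
      const after = snapshot();
      resolve(after.map((a, i) => {
        const total = a.total - before[i].total;
        const idle = a.idle - before[i].idle;
        // If no time was accounted for in the interval, report 0 rather than NaN.
        return total > 0 ? 100 * (1 - idle / total) : 0;
      }));
    }, sampleMs);
  });
}

// Usage: every core is sampled during the same 200 ms window.
allCoreUsage(200).then((usage) => {
  usage.forEach((pct, i) => console.log('Core' + i + ': ' + pct.toFixed(2) + '%'));
});
```

Because there is only one timer, the total wait stays at sampleMs regardless of the core count.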

CosmosDB insertion loop stops inserting after a certain number of iterations (Node.js)

I'm doing a few tutorials on CosmosDB. I've got the database set up with the Core (SQL) API, and I'm using Node.js to interface with it. For development, I'm using the emulator.
This is the bit of code that I'm running:
const CosmosClient = require('@azure/cosmos').CosmosClient
process.env.NODE_TLS_REJECT_UNAUTHORIZED = "0";
const options = {
endpoint: 'https://localhost:8081',
key: REDACTED,
userAgentSuffix: 'CosmosDBJavascriptQuickstart'
};
const client = new CosmosClient(options);
(async () => {
let cost = 0;
let i = 0
while (i < 2000) {
i += 1
console.log(i+" Creating record, running cost:"+cost)
let response = await client.database('TestDB').container('TestContainer').items.upsert({}).catch(console.log);
cost += response.requestCharge;
}
})()
This, without fail, stops at around iteration 1565 and doesn't continue. I've tried it with different payloads, without much difference (it may do a few more or a few fewer iterations, but it almost always seems to be around that number).
On the flipside, a similar .NET Core example works great to insert 10,000 documents:
double cost = 0.0;
int i = 0;
while (i < 10000)
{
i++;
ItemResponse<dynamic> resp = await this.container.CreateItemAsync<dynamic>(new { id = Guid.NewGuid() });
cost += resp.RequestCharge;
Console.WriteLine("Created item {0} Operation consumed {1} RUs. Running cost: {2}", i, resp.RequestCharge, cost);
}
So I'm not sure what's going on.
So, after a bit of fiddling, this doesn't seem to have anything to do with CosmosDB or its library.
I was running this in the debugger, and Node would just crap out after x iterations. I noticed that if I didn't use console.log, it would actually work. Also, if I ran the script with node file.js, it worked as well. So there seems to be some sort of issue with debugging the script while also printing to the console. I'm not exactly sure what's up with that, but I'm going to go ahead and mark this as solved.

How to Increment Value Atomically with Redis

I have the code below in Node.js:
const decrease = async (userId, points) => {
const user = await redisClient.hgetall(userId);
if(user.points - points >= 0) {
await redisClient.hset(userId, userId, user.points - points);
}
}
Since async/await does not block execution, if there are multiple requests for the same userId, the code does not run atomically. That means the user's points may be decreased multiple times even when there are not enough points left in the user's account. How can I make the method run atomically?
I have checked the Redis MULTI command, and it works for multiple Redis statements. But in my case, I need to calculate the user's points, which is not part of a Redis command. So how can I make them run as one atomic function?
I also read the INCR pattern: https://redis.io/commands/incr
But it doesn't seem to fix my issue. The patterns listed there rely on EXPIRE, and I have no requirement to set a specific timeout value.
Use the power of (Redis) server-side Lua scripts by calling EVAL. It should probably look something like this:
const lua = `
local p = redis.call('HGET',KEYS[1],'points')
local d = p - ARGV[1]
if d >= 0 then
redis.call('HSET', KEYS[1], 'points', d)
end`
const decrease = async (userId, points) => {
await redisClient.eval(lua, 1, userId, points);
}
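To see why the read-then-write version from the question can overdraw an account, here is a small simulation that needs no Redis server. It is only a sketch: makeFakeClient is a hypothetical in-memory stand-in whose calls yield to the event loop the way a network round trip would, and decreaseUnsafe mirrors the question's logic using a 'points' field.

```javascript
// Hypothetical in-memory stand-in for a Redis hash client (not a real client API).
function makeFakeClient(initialPoints) {
  const store = { points: initialPoints };
  // Each call yields to the event loop once, like a network round trip.
  const tick = () => new Promise((resolve) => setImmediate(resolve));
  return {
    async hget(_key, field) { await tick(); return store[field]; },
    async hset(_key, field, value) { await tick(); store[field] = value; },
  };
}

// The read-then-write pattern from the question: the check and the write are separate steps.
async function decreaseUnsafe(client, userId, points) {
  const current = await client.hget(userId, 'points');
  if (current - points >= 0) {
    await client.hset(userId, 'points', current - points);
    return true;
  }
  return false;
}

// Two concurrent requests both read 10 before either one writes.
async function demo() {
  const client = makeFakeClient(10);
  const results = await Promise.all([
    decreaseUnsafe(client, 'user:1', 8),
    decreaseUnsafe(client, 'user:1', 8),
  ]);
  return { results: results, remaining: await client.hget('user:1', 'points') };
}

demo().then((r) => console.log(r)); // both calls report success: 16 points spent from a balance of 10
```

Running the check and the write inside a single server-side script removes the window between the read and the write, which is what the EVAL approach above relies on.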

How should I avoid out of memory using nodejs?

var pass = require('./pass.js');
var fs = require('fs');
var path = "password.txt";
var name ="admin";
var
remaining = "",
lineFeed = "\r\n",
lineNr = 0;
var log =
fs.createReadStream(path, { encoding: 'utf-8' })
.on('data', function (chunk) {
// store the actual chunk into the remaining
remaining = remaining.concat(chunk);
// look that we have a linefeed
var lastLineFeed = remaining.lastIndexOf(lineFeed);
// if we don't have any we can continue the reading
if (lastLineFeed === -1) return;
var
current = remaining.substring(0, lastLineFeed),
lines = current.split(lineFeed);
// store from the last linefeed or empty it out
remaining = (lastLineFeed > remaining.length)
? remaining.substring(lastLineFeed + 1, remaining.length)
: "";
for (var i = 0, length = lines.length; i < length; i++) {
// process the actual line
var account={
username:name,
password:lines[i],
};
pass.test(account);
}
})
.on('end', function (close) {
// TODO I'm not sure this is needed, it depends on your data
// process the reamining data if needed
if (remaining.length > 0) {
var account={
username:name,
password:remaining,
};
pass.test(account);
};
});
I am trying to test passwords for the account "admin"; pass.test is a function that tests one password. I downloaded a weak-password dictionary with a very large number of lines, so I looked for a way to read that many lines. But with the code above, the lines array becomes too large and the process runs out of memory. What should I do?
As far as my limited understanding goes, you need to watch for a memory limit of roughly 1 GB, which I believe is imposed by the V8 engine. (One article I found actually puts the limit at 1.4 GB, currently, and lists the different parameters used to change it manually.) Depending on where you host your Node app(s), you can raise this limit with a parameter passed on the command line when node is started, for example --max-old-space-size.
Also, you might want to make sure that, whenever possible, you use buffers instead of converting things like data streams (from a DB, for instance) to arrays or similar structures, as that loads the entire dataset into memory. Buffer contents are allocated outside the V8 heap, so they don't count against that heap limit, although they still use process memory.
One thing that doesn't make sense, and that seems very inefficient in your app, is that on reading each chunk of data you check your username against EVERY entry you've amassed so far in your lines array, instead of only the LAST one. What your app should do is keep track of the last username and password combination you've read, and then delete all data before it from your remaining variable, so you keep your memory down. And since that variable is no longer a hold-all repository for every line of your password file, you should probably rename it to something like buffer. This means you'd remove your for loop, since you're already "looping" through the data in your password file by reading it in chunk by chunk.
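One way to keep memory bounded is to handle one line at a time and wait for each test to finish before reading further. The sketch below uses Node's readline module; testEachLine is a hypothetical helper, and it assumes the password test can be wrapped in a function that returns a promise, which the original pass.test may not do.

```javascript
const readline = require('readline');

// Feed each non-empty line of `input` (a readable stream) to `testFn`,
// waiting for every call to finish before moving on to the next line.
async function testEachLine(input, testFn) {
  // crlfDelay: Infinity treats "\r\n" as a single line break.
  const rl = readline.createInterface({ input: input, crlfDelay: Infinity });
  let count = 0;
  for await (const line of rl) {
    if (line.length === 0) continue; // skip blank lines
    await testFn(line);
    count++;
  }
  return count; // number of lines tested
}

// Usage (assumption: pass.test returns a promise):
// const fs = require('fs');
// testEachLine(fs.createReadStream('password.txt'), function (password) {
//   return pass.test({ username: 'admin', password: password });
// });
```

Only the current line and readline's internal chunk are held at any moment, so the size of the dictionary file no longer matters.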

Redis: How to check if exists in while loop

I'm using Redis in my application and one thing is not clear to me. I save an object with a randomly generated string as its key. However, I would like to check whether that key exists. I am planning to use a while loop, but I am not sure how to structure it with Redis. If I wanted to check only once, I would do:
redisClient.get("xPQ", function(err,result){
if(result==null)
exists = false
});
But I would like to use a while loop such as:
while(exists == false)
However I cannot build the code structure in my head. Would the while be inside the function or outside the function?
In general, you shouldn't check for existence of a key on the client side. It leads to race conditions. For example, another thread could insert the key after the first thread checked for its presence.
You should use the commands ending with NX, for example SETNX and HSETNX. These insert the key only if it doesn't already exist, and the operation is guaranteed to be atomic.
I do not understand why you need to implement active polling to check whether a key exists (there are much better ways to handle this kind of situations), but I will try to answer the question.
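You should not use a while loop at all (inside or outside the function). Because of the asynchronous nature of node.js, a synchronous loop would never let the callback run, so these loops are better implemented with asynchronous recursion, where the function reschedules itself from its own callback. Here is an example: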
var redis = require('redis')
var rc = redis.createClient(6379, 'localhost');
function wait_for_key( key, callback ) {
rc.get( key, function(err,result) {
if ( result == null ) {
console.log( "waiting ..." )
setTimeout( function() {
wait_for_key(key,callback);
}, 100 );
} else {
callback(key,result);
}
});
}
wait_for_key( "xPQ", function(key,value) {
console.log( key+" exists and its value is: "+value )
});
There are multiple ways to simplify these expressions using dedicated libraries (using continuation passing style, or fibers). For instance you may want to check the whilst and until functions of the async.js package.
https://github.com/caolan/async
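On Node versions with async/await, the same polling can also be written as an ordinary loop, because each await hands control back to the event loop. This is a sketch only: waitForKey and sleep are hypothetical helpers, and getValue stands for any function that returns a promise for the key's value (or null), such as a promisified Redis GET.

```javascript
// Resolve after `ms` milliseconds.
function sleep(ms) {
  return new Promise(function (resolve) { setTimeout(resolve, ms); });
}

// Poll getValue(key) until it yields a non-null value, then return that value.
// Gives up with an error after maxAttempts tries.
async function waitForKey(key, getValue, intervalMs, maxAttempts) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const value = await getValue(key);
    if (value !== null && value !== undefined) return value;
    await sleep(intervalMs);
  }
  throw new Error('Gave up waiting for ' + key);
}

// Usage with a fake lookup that finds the key on the third try:
let lookups = 0;
const fakeGet = async function () { return ++lookups < 3 ? null : 'some value'; };
waitForKey('xPQ', fakeGet, 100, 50).then(function (value) {
  console.log('xPQ exists and its value is: ' + value);
});
```

The maxAttempts bound is an addition of this sketch; the callback version above polls indefinitely.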

Resources