Redis: How to check if exists in while loop - node.js

I'm using Redis in my application and one thing is not clear to me. I save an object with a randomly generated string as its key, and I would like to check whether that key exists. I am planning to use a while loop, but I am not sure how to structure it around Redis, since if I wanted to check only once, I would do:
redisClient.get("xPQ", function(err,result){
if(result==null)
exists = false
});
But I would like to use the while loop as:
while(exists == false)
However, I cannot build the code structure in my head. Would the while be inside the function or outside the function?

In general, you shouldn't check for the existence of a key on the client side. It leads to race conditions. For example, another client could insert the key right after the first one checked for its presence.
You should use the commands ending with NX, for example SETNX and HSETNX. These insert the key only if it doesn't already exist, and they are guaranteed to be atomic.
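For instance, here is a minimal sketch with the node_redis client (the key and value are made up for illustration):

var redis = require('redis');
var rc = redis.createClient(6379, 'localhost');

// SETNX returns 1 if the key was set, 0 if it already existed.
rc.setnx('xPQ', 'some-value', function(err, reply) {
    if (reply === 1) {
        console.log('key did not exist, it is now set');
    } else {
        console.log('key already exists, nothing was written');
    }
});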

I do not understand why you need to implement active polling to check whether a key exists (there are much better ways to handle this kind of situation), but I will try to answer the question.
You should not use a while loop at all (inside or outside the function). Because of the asynchronous nature of node.js, these loops are better implemented using tail recursion. Here is an example:
var redis = require('redis');
var rc = redis.createClient(6379, 'localhost');

function wait_for_key(key, callback) {
    rc.get(key, function(err, result) {
        if (result == null) {
            console.log("waiting ...");
            setTimeout(function() {
                wait_for_key(key, callback);
            }, 100);
        } else {
            callback(key, result);
        }
    });
}

wait_for_key("xPQ", function(key, value) {
    console.log(key + " exists and its value is: " + value);
});
There are multiple ways to simplify these expressions using dedicated libraries (continuation passing style, or fibers). For instance, you may want to check out the whilst and until functions of the async.js package.
https://github.com/caolan/async
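For example, here is a hedged sketch of the same polling loop written with async.until (callback-style API from async 1.x/2.x; the key name comes from the question):

var async = require('async');
var redis = require('redis');
var rc = redis.createClient(6379, 'localhost');

var value = null;
async.until(
    function() { return value !== null; },   // stop once the key exists
    function(cb) {
        rc.get("xPQ", function(err, result) {
            value = result;
            setTimeout(cb, 100, err);         // poll again in 100ms
        });
    },
    function(err) {
        if (err) return console.log(err);
        console.log("xPQ exists and its value is: " + value);
    }
);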

Related

Exclusive access to variable in mapLimit return

I'm reasonably new to programming in nodejs but not to programming (C/C++/Python/Shaders), and I have a question about exclusive access to a global variable when e.g. async.mapLimit returns its callbacks.
Example:
var async = require('async');
var exec = require('child_process').exec;

var myGlobalCounter = 0;

function executeDownload(item, callback) {
    exec('./ascriptthatdownload.sh ' + item, function (error, stdout, stderr) {
        // Here do I have exclusive access to myGlobalCounter
        // so that I could do this or update, let's say, a UI component?
        myGlobalCounter++;
        console.log('Downloads ready:', myGlobalCounter);
        callback(error, item); // mapLimit needs each iteratee to call back
    });
}

function downloadSomeFiles() {
    var listOfFiles = [];
    // create download links
    async.mapLimit(listOfFiles, 4, executeDownload, function(err, results) {
    });
}
I can get this to work, but I don't know if it is safe enough. Other suggestions are also appreciated. In C/C++ I would have used a mutex to guard against simultaneous access to myGlobalCounter.
Edit: I want to be able to safely increment myGlobalCounter by 1 each time a download is ready and then pass it on, either to console.log or to another component.
The kudos goes to @CertainPerformance's explanation in his comment (which I can't accept as an answer):
Yes, Javascript is single-threaded - a synchronous block of code will run to the end before any other callback can run; look up the event loop. You almost never have to worry about shared mutable state in JS.
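A minimal self-check of that claim (hypothetical, no downloads involved): a thousand async callbacks each increment a counter, and because each callback body runs to completion before the next one starts, no increment is ever lost:

var counter = 0;
var pending = 1000;
for (var i = 0; i < 1000; i++) {
    setImmediate(function () {
        counter++;                 // one synchronous block, never torn
        if (--pending === 0) {
            console.log(counter);  // always prints 1000
        }
    });
}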

Can this code cause a race condition in socket io?

I am very new to node.js and socket.io. Can this code lead to a race condition on the counter variable? Should I use a locking library to safely update the counter variable?
"use strict";
module.exports = function (opts) {
var module = {};
var io = opts.io;
var counter = 0;
io.on('connection', function (socket) {
socket.on("inc", function (msg) {
counter += 1;
});
socket.on("dec" , function (msg) {
counter -= 1;
});
});
return module;
};
No, there is no race condition here. Javascript in node.js is single threaded and event driven so only one socket.io event handler is ever executing at a time. This is one of the nice programming simplifications that come from the single threaded model. It runs a given thread of execution to completion and then and only then does it grab the next event from the event queue and run it.
Hopefully you do realize that the same counter variable is accessed by all socket.io connections. While this isn't a race condition, it means that there's only one counter that all socket.io connections are capable of modifying.
If you wanted a per-connection counter (a separate counter for each connection), then you could define the counter variable inside the io.on('connection', ...) handler.
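A minimal sketch of that per-connection variant (same handlers as above, only the declaration moves):

io.on('connection', function (socket) {
    var counter = 0; // each connected socket gets its own copy
    socket.on("inc", function (msg) {
        counter += 1;
    });
    socket.on("dec", function (msg) {
        counter -= 1;
    });
});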
The race conditions you do have to watch out for in node.js are when you make an async call and then continue the rest of your coding logic in the async callback. While the async operation is underway, other node.js code can run and can change publicly accessible variables you may be using. That is not the case in your counter example, but it does occur with lots of other types of node.js programming.
For example, this could be an issue:
var fs = require('fs');

var flag = false;

function doSomething() {
    // set flag indicating we are in a fs.readFile() operation
    flag = true;
    fs.readFile("somefile.txt", function(err, data) {
        // do something with data
        // clear flag
        flag = false;
    });
}
In this case, immediately after we call fs.readFile(), we return control back to node.js, which is then free to run other operations. If another operation could also run this code, it would tromp on the value of flag and we'd have a concurrency issue.
So you have to be aware that anytime you start an async operation and continue the rest of your logic in its callback, other code can run in the meantime and access any shared variables. You either need to make a local copy of shared data or provide appropriate protections for it.
In this particular case, the flag could be incremented and decremented rather than simply set to true or false, and that would probably serve the desired purpose of keeping track of whether this file is currently being read or not.
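A hedged sketch of that counting variant (the variable name is made up):

var fs = require('fs');

var pendingReads = 0;    // counts reads currently in flight

function doSomething() {
    pendingReads++;      // one more read started
    fs.readFile("somefile.txt", function(err, data) {
        // do something with data
        pendingReads--;  // this read finished
        // pendingReads > 0 means some read is still in progress
    });
}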
Shorter answer:
"Race condition" is when you execute a series of ordered asynchronous functions and because of their async nature they won't finish processing in their original order.
In your code, you are executing a series of ordered synchronous process (increasing or decreasing the counter), So they finish instantly after they start, resulting in ordered output. So no racing here!

How to define functions in Redis / Lua?

I'm using Node.js with the 'redis-scripto' module, and I'm trying to define a function in Lua:
var redis = require("redis");
var redisClient = redis.createClient("6379","127.0.0.1");
var Scripto = require('redis-scripto');
var scriptManager = new Scripto(redisClient);
var scripts = {'add_script':'function add(i,j) return (i+j) end add(i,j)'};
scriptManager.load(scripts);
scriptManager.run('add_script', [], [1,1], function(err, result){
console.log(err || result);
});
so I'm getting this error:
[Error: ERR Error running script (call to .... #enable_strict_lua:7: user_script:1: Script attempted to create global variable 'add']
so I've found that it's a protection, as explained in this thread:
"The doc-string of scriptingEnableGlobalsProtection indicates that intent is to notify script authors of common mistake (not using local)."
but I still didn't understand - where is this scripting.c? And the solution of changing global tables seems risky to me.
Is there no simple way of defining functions in Redis using Lua?
It looks like the script runner is running your code in a function. So when you create function add(), it thinks you're doing it inside a function by accident. It'll also, likely, have a similar issue with the arguments to the call to add(i,j).
If this is correct then this should work:
local function add(i,j) return (i+j) end return add(1,1)
and if THAT works then hopefully this will too with your arguments passed from the js:
local function add(i,j) return (i+j) end return add(...)
You have to assign an anonymous function to a variable (using local).
local just_return_it = function(value)
    return value
end

--// result contains the string "hello world"
local result = just_return_it("hello world")
Alternately, you can create a table and add functions as fields on the table. This allows a more object oriented style.
local obj = {}

function obj.name()
    return "my name is Tom"
end

--// result contains "my name is Tom"
local result = obj.name()
Redis requires this to prevent access to the global scope. The entire scripting mechanism is designed to keep persistence out of the scripting layer. Allowing things in the global scope would create an easy way to subvert this, making it much more difficult to track and manage state on the server.
Note: this means you'll have to add the same functions to every script that uses them. They aren't persistent on the server.
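A hedged client-side workaround is to keep shared helpers in a single string and prepend it to every script body before loading (the script names here are made up):

// Each script that needs add() gets the helper prepended at load time.
var luaHelpers = 'local function add(i, j) return i + j end\n';
var scripts = {
    'add_script': luaHelpers + 'return add(ARGV[1], ARGV[2])',
    'add_three_script': luaHelpers + 'return add(add(ARGV[1], ARGV[2]), ARGV[3])'
};
scriptManager.load(scripts);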
This worked for me (changing line 5 from the question above):
var scripts = {
    'add_script': 'local function add(i,j) return (i+j) end return(add(ARGV[1],ARGV[2]))'
};
To pass variables to the script, use KEYS[] and ARGV[], as explained in the redis-scripto guide.
There's also a good example here: "Lua: A Guide for Redis Users".
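For instance, here is a hedged sketch that passes both a key name and values (the script and key names are made up; the first array fills KEYS[], the second fills ARGV[]):

var scripts = {
    'store_sum': 'local sum = tonumber(ARGV[1]) + tonumber(ARGV[2]) ' +
                 'redis.call("SET", KEYS[1], sum) ' +
                 'return sum'
};
scriptManager.load(scripts);
scriptManager.run('store_sum', ['sum_key'], [1, 2], function(err, result) {
    console.log(err || result); // 3, and the sum is also stored at sum_key
});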

Perform non-blocking eval reads in MongoDB

I figured out how to run javascript code on the MongoDB server, from a node.js client:
db.eval("function(x){ return x*10; }", 1, function (err, retval) {
console.log('err: '+err);
console.log('retval: '+retval);
});
And that works fine. But the docs say that db.eval() issues a write lock, so that nothing else can read or write to the database. I do not want that.
It also says that eval has no such limitation, but I do not know where to find it. From the way they're talking about it, it seems as if regular eval is only available in the mongo shell, and so not from the client side.
So: how can I run these stored procedures on the mongodb server without blocking everything?
You can pass an object with the field nolock set to true as an optional third parameter to eval:
db.eval('function (x) { return x*10; }', [1], {nolock: true}, function(err, retval) {
    console.log('err: ' + err);
    console.log('retval: ' + retval);
});
Note that this prevents eval from setting an obligatory write-lock, but it doesn't prevent any operations inside your function from creating write-locks on their own.
Source: the documentation.
Note that the term "stored procedure" is wrong in this case. A stored procedure refers to code which is stored on the database itself and not delivered by the application layer. MongoDB can also do this utilizing the special collection db.system.js, but doing this is discouraged: http://docs.mongodb.org/manual/applications/server-side-javascript/#storing-functions-server-side
By the way: MongoDB wasn't designed for stored procedures. It is usually recommended to implement any advanced logic on the application layer. The practice of implementing even trivial operations as stored procedures, as is sometimes done in SQL databases, is discouraged.
This is the way to store your functions on the server side, and you can use them as shown below:
db.system.js.save({ _id: "myAddFunction", value: function (x, y) { return x + y; } });
db.system.js.find()
{ "_id" : "myAddFunction", "value" : function (x,y){ return x + y; } }
db.eval("myAddFunction(1, 2)")
3
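From Node.js, the same stored function can presumably be invoked through the driver's eval, in the same style as the question's code (a hedged sketch, not verified against every driver version):

// Calls the server-side function saved above.
db.eval('myAddFunction(1, 2)', function (err, retval) {
    console.log('retval: ' + retval); // 3
});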

How to populate mongoose with a large data set

I'm attempting to load a store catalog into MongoDb (2.2.2) using Node.js (0.8.18) and Mongoose (3.5.4) -- all on Windows 7 64bit. The data set contains roughly 12,500 records. Each data record is a JSON string.
My latest attempt looks like this:
var fs = require('fs');
var odir = process.cwd() + '/file_data/output_data/';
var mongoose = require('mongoose');
var Catalog = require('./models').Catalog;
var conn = mongoose.connect('mongodb://127.0.0.1:27017/sc_store');

exports.main = function(callback){
    var catalogArray = fs.readFileSync(odir + 'pc-out.json', 'utf8').split('\n');
    var i = 0;
    Catalog.remove({}, function(err){
        while(i < catalogArray.length){
            new Catalog(JSON.parse(catalogArray[i])).save(function(err, doc){
                if(err){
                    console.log(err);
                } else {
                    i++;
                }
            });
            if(i === catalogArray.length - 1) return callback('database populated');
        }
    });
};
I have had a lot of problems trying to populate the database. Under previous scenarios (and this one), node pegs the processor and eventually runs out of memory. Note that in this scenario, I'm trying to allow Mongoose to save a record, and then iterate to the next record once the record saves.
But the iterator inside of the Mongoose save function never gets incremented, and it never throws any errors. If I put the increment of the iterator (i) outside of the asynchronous call to Mongoose, it will work, provided the number of records I try to load is not too big (I have successfully loaded 2,000 this way).
So my questions are: Why isn't the iterator inside of the Mongoose save call ever incremented? And, more importantly, what is the best way to load a large data set into MongoDb using Mongoose?
Rob
i is your index into catalogArray for pulling input data, but you're also trying to use it to keep track of how many documents have been saved, which isn't possible. Try tracking them separately, like this:
var i = 0;
var saved = 0;
Catalog.remove({}, function(err){
    while(i < catalogArray.length){
        new Catalog(JSON.parse(catalogArray[i])).save(function(err, doc){
            saved++;
            if(err){
                console.log(err);
            } else {
                if(saved === catalogArray.length) {
                    return callback('database populated');
                }
            }
        });
        i++;
    }
});
UPDATE
If you want to add tighter flow control to the process, you can use the async module's forEachLimit function to limit the number of outstanding save operations to whatever you specify. For example, to limit it to one outstanding save at a time:
Catalog.remove({}, function(err){
    async.forEachLimit(catalogArray, 1, function (catalog, cb) {
        new Catalog(JSON.parse(catalog)).save(function (err, doc) {
            if (err) {
                console.log(err);
            }
            cb(err);
        });
    }, function (err) {
        callback('database populated');
    });
});
Rob,
The short answer:
You created an infinite loop. You're thinking synchronously and with blocking; Javascript functions asynchronously and without blocking. What you are trying to do is like trying to directly turn the feeling of hunger into a sandwich. You can't. The closest thing is to use the feeling of hunger to motivate you to go to the kitchen and make one. Don't try to make Javascript block. It won't work. Now, learn async.forEachLimit. It will work for what you want to do here.
You should probably review asynchronous design patterns and understand what it means on a deeper level. Callbacks are not simply an alternative to return values. They are fundamentally different in how and when they are executed. Here is a good primer: http://cs.brown.edu/courses/csci1680/f12/handouts/async.pdf
The long answer:
There is an underlying problem here, and that is your lack of understanding of what non-blocking IO and asynchronous mean. I'm not sure if you are breaking into node development, or this is just a one-off project, but if you do plan to continue using node (or any asynchronous language) then it is worth the time to understand the difference between synchronous and asynchronous design patterns, and the motivations for them. That is why you have a logic error: putting the loop-invariant increment inside an asynchronous callback creates an infinite loop.
In non-computer-science terms, that means that your increment to i will never occur. The reason is that Javascript executes a single block of code to completion before any asynchronous callbacks are called. So in your code, your loop runs over and over without i ever incrementing, and in the background you are queueing the same document insert over and over. Each iteration of the loop starts sending the document at index 0 to mongo; the callback can't fire until your loop ends and all other code outside the loop runs to completion, so the callback queues up. But your loop runs again, since i++ is never executed (remember, the callback is queued until your code finishes), inserting record 0 again and queueing another callback to execute AFTER your loop is complete. This goes on and on until your memory is filled with callbacks waiting to inform your infinite loop that document 0 has been inserted millions of times.
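A minimal sketch of the same trap, stripped down to its core (no mongo involved):

// The setTimeout callback that would flip `done` can never run, because
// the current synchronous block (the while loop) never finishes.
var done = false;
setTimeout(function () { done = true; }, 0);
while (!done) {
    // spins forever; the callback sits in the event queue
}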
In general, there is no way to make Javascript block without doing something really, really bad. For example, something tantamount to setting your kitchen on fire to fry some eggs for that sandwich I talked about in the "short answer".
My advice is to take advantage of libs like async: https://github.com/caolan/async. JohnnyHK mentioned it here, and he was right to do so.
