Randomly generating a value not in mongodb? - node.js

I'm trying to generate session ids in Node and store them in MongoDB. For my application I only want to generate a random 6 digit number (this number is used to sync two sessions so it's short for users to easily copy). I want to generate a number that is not already in my sessions collection in MongoDB. My first thought which will probably work 99% of the time is this:
function GetSessionKey(cb) {
  var key = parseInt(Math.random() * 1000000).toString();
  while (key.length < 6) key = '0' + key;
  db.sessions.find({ Key: key }, function (err, docs) {
    if (!err && docs.length == 0) {
      cb(key);
    } else {
      GetSessionKey(cb);
    }
  });
}
But there is a slim chance that the generator keeps picking keys that already exist in Mongo, or that all keys are already in use (which I do not expect to happen).
I know Node is asynchronous, but does that mean a recursive async call like this will fill up the stack if it keeps getting called? Or do async calls not get put on the stack? Is there a better way to generate short unique keys? Old keys will be removed and can be reused, so I don't think it'd be a ticking time bomb. I'm just worried that the random aspect may cause trouble.

An alternative is to use MongoDB to enforce uniqueness. For example, create a unique index on the Key field in the Mongo shell:
db.sessions.ensureIndex( { Key: 1 }, { unique: true } )
Modify the Node.js code to insert the generated random value and test the err variable for duplicate key errors. If there is no error, you're good to go. Otherwise, regenerate the value and try again. You can test for duplicate key errors like this:
...
db.sessions.insert({ Key: key }, function (err, docs) {
  ...
  if (err) {
    if (err.code && err.code == 11000) {
      // Duplicate key error. Generate a new random value and insert again.
    } else {
      throw err;
    }
  }
  ...
If you can store the generated key in the _id field instead of Key, then you can skip creating the index yourself, as MongoDB automatically creates a unique index on _id.
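The insert-and-retry idea above can be sketched roughly as follows. This is only a sketch under assumptions: `createSession`, `randomSixDigitKey` and the `maxAttempts` limit are hypothetical names, and `collection` is assumed to be an object with a promise-based `insertOne()` (as in the official Node.js driver) sitting on top of a unique index on `Key`.

```javascript
const DUPLICATE_KEY = 11000; // MongoDB duplicate key error code, as tested in the answer

// Zero-padded random string in the range "000000".."999999".
function randomSixDigitKey() {
  return Math.floor(Math.random() * 1000000).toString().padStart(6, "0");
}

// Hypothetical helper: tries up to maxAttempts keys and resolves with the one that was inserted.
// makeKey is injectable so the retry logic can be exercised with predictable keys.
async function createSession(collection, maxAttempts = 10, makeKey = randomSixDigitKey) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const key = makeKey();
    try {
      await collection.insertOne({ Key: key });
      return key; // the unique index accepted the key
    } catch (err) {
      if (err.code !== DUPLICATE_KEY) throw err; // unrelated failure: do not retry
      // Duplicate key: fall through and try another random value.
    }
  }
  throw new Error("No free session key found after " + maxAttempts + " attempts");
}
```

Because the uniqueness check and the write happen in one operation, two concurrent requests cannot both claim the same key, which the find-then-insert version cannot guarantee.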

A recursive async call will not fill up the stack. Each retry runs from a fresh callback after the previous call has returned, so the calls just keep the CPU busy until they find a free key or some kind of error happens.
I suggest setting a reasonable maximum on the number of guesses, though. And yes, base64 (as suggested above) or base36 (that's simply .toString(36)) will increase the number of available keys, reducing the chance of a collision if you're worried about that.
Edit: there is indeed a better way: sequential keys.
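To illustrate the base36 suggestion, here is a minimal sketch of a 6-character base36 key. The names `KEY_LENGTH`, `KEY_SPACE` and `randomBase36Key` are made up for this example; the only technique taken from the answer is `.toString(36)`.

```javascript
const KEY_LENGTH = 6;
// 36^6 = 2,176,782,336 possible keys, compared with 1,000,000 for six decimal digits.
const KEY_SPACE = Math.pow(36, KEY_LENGTH);

function randomBase36Key() {
  const n = Math.floor(Math.random() * KEY_SPACE);
  // Pad with leading zeros so every key has the same length.
  return n.toString(36).padStart(KEY_LENGTH, "0");
}

console.log(randomBase36Key());
```

The key stays six characters long for the user to copy, while the much larger key space makes repeated collisions far less likely.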


How to keep json object with highest value among duplicates with nodejs

I have JSON objects imported from an external system, some of which are duplicates in an ID value.
For example:
{
"ID": "1",
"name": "Bob",
"ink": "100"
},
{
"ID":"2",
"Name": "George",
"ink": "100"
},
{
"ID":"1",
"name": "Bob",
"ink":"200"
}
I am manipulating the information for each object, then pushing them into a new JSON array:
var array = {};
array.users = [];
for (let user of users) {
  ...
  array.users.push(user);
}
I need to remove all duplicates save the one with the highest value in the ink key.
I found solutions to do this for the array AFTER it is constructed, but that means I use system resources for nothing - no reason to manipulate users that will be removed anyway.
I am looking for a way to check, for each new user, whether a user with that ID already exists in the array.users[] array. If it does, compare the values of the ink key; if the new one is higher, remove the existing entry from the array. Then I can continue with my manipulation code and push the new user into the array.
Any ideas of what would be the most elegant/efficient/shortest way to accomplish this?
I am not really sure I fully understood your question. If I understand correctly, you don't want to pass through the entire array after it is constructed to check for duplicates?
"If in doubt, throw a hash map at the problem." Use a map instead of a plain array. The map key stores the ID, and your fields are saved as the value. If a key already exists, you can just check which value is higher.
The code should look something like this:
let userMap = new Map();
for (let user of users) {
  if (userMap.has(user["ID"])) {
    // Compare the ink values and keep the entry with the larger one
  } else {
    // Store new entry
  }
}
EDIT: My solution does require an extra step though and is not directly done in the original array. However, I still think that maps are probably one of the most efficient ways to handle this...
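A fuller version of the map approach might look like the sketch below. The sample data mirrors the objects in the question; `keepHighestInk` is a hypothetical helper name, and converting `ink` with `Number()` assumes the values are always numeric strings.

```javascript
// Sample input shaped like the objects in the question (ink is a numeric string).
const users = [
  { ID: "1", name: "Bob", ink: "100" },
  { ID: "2", name: "George", ink: "100" },
  { ID: "1", name: "Bob", ink: "200" },
];

// Single pass: for each ID, remember only the user with the highest ink seen so far.
function keepHighestInk(list) {
  const byId = new Map();
  for (const user of list) {
    const existing = byId.get(user.ID);
    if (existing === undefined || Number(user.ink) > Number(existing.ink)) {
      byId.set(user.ID, user);
    }
  }
  return Array.from(byId.values());
}

const array = { users: keepHighestInk(users) };
console.log(array.users);
```

Running the de-duplication first and the expensive manipulation afterwards means only the surviving users are processed, which addresses the concern about wasting work on entries that would be dropped anyway.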
var array = {};
array.users = users.filter((user) => {
  for (let userSecond of users) {
    // The objects use "ID" (upper case) as the key, and ink is a numeric string
    if (userSecond.ID === user.ID && +userSecond.ink > +user.ink) {
      return false;
    }
  }
  return true;
});
Not the cleanest solution perhaps, but it should do the job. Basically you filter through users. Within the filter you go through every user again to check whether any of them has the same ID and more ink; if so, the current user is discarded by returning false. If no user is found with the same ID and more ink, the current user stays in the array. Note that two duplicates with exactly the same ink value will both be kept.

MongoDB: findOneAndUpdate seems to be not atomic

I am using nodejs + mongodb as a backend for a largely distributed web application. I have a series of events, that need to be in a specific order. There are multiple services generating these events and my application should process and store them as they come in and at any given time I want to have them in the correct order.
I cannot rely on timestamps since javascript only provides timestamps in milliseconds, which is not accurate enough for my case.
I have two collections in my database. One stores the events and one stores an index, which represents my event order. I have tried using findOneAndUpdate in order to increase my index atomically. This however does not seem to be working.
console.log('Adding');
console.log(event.type);
this._db.collection('evtidx').findOneAndUpdate({ id : 'index' }, { $inc: { value : 1 } }, (err, res) => {
  console.log('For '+event.type);
  console.log('Got value: '+res.value.value);
  event.index = res.value.value;
  this._db.collection('events').insertOne(event, (err, evtres) => {
    if (err) {
      throw err;
    }
  });
});
When I check the output of the code above I see:
Adding
Event1
Adding
Event2
Adding
Event3
Adding
Event4
For Event1
Got value: 1
For Event3
Got value: 4
For Event2
Got value: 2
For Event4
Got value: 3
From this I conclude that my code is not working atomically.
The events come in in the correct order, but the index attached to them after findOneAndUpdate does not reflect that order. Could anyone help me out?
Atomic database operations do not lock the database while the request is running. You may be issuing the requests in order, but they are not necessarily executed in that order, either in the backend or in the database.
What you need to do is read the last document index from the 'events' collection. If it's one less than your current request index, then insert; otherwise wait and retry.
This can cause problems, though, if one event fails because of a network error or something else: your request processing would then stop.
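The difference between "atomic" and "ordered" can be seen in a small simulation. This is not MongoDB: `fakeFindOneAndUpdate` is a made-up stand-in for an atomic server-side counter, and the latencies are invented to mimic requests that reach the server in a different order than they were issued.

```javascript
// Simulation only: an in-process stand-in for an atomic server-side counter.
let counter = 0;

// Each request reaches the "server" after its own latency.
// The increment itself cannot be interleaved with another increment.
function fakeFindOneAndUpdate(latencyMs) {
  return new Promise((resolve) => {
    setTimeout(() => {
      counter += 1;
      resolve(counter);
    }, latencyMs);
  });
}

// Requests are issued in the order Event1..Event4, with assumed latencies.
const requests = [
  { type: "Event1", latencyMs: 5 },
  { type: "Event2", latencyMs: 40 },
  { type: "Event3", latencyMs: 15 },
  { type: "Event4", latencyMs: 25 },
];

const results = Promise.all(
  requests.map(async (req) => ({
    type: req.type,
    index: await fakeFindOneAndUpdate(req.latencyMs),
  }))
);

results.then((rows) => console.log(rows));
```

Every event receives a distinct index, which is what atomicity guarantees, but the indexes follow the order in which the increments were executed rather than the order in which the requests were sent.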

Using Riak.js / Riak, how do I do an "AND" select?

I am trying to determine the existence of an object to decide whether to create a new object with a new key or to update an existing object. The goal here is to match on two Secondary Indexes.
db.query(bucket, {end: null, definition_id: id}, function(err, data) {
  if (err) {
    res.send(err);
  } else {
    if (data.length === 0) {
      // write new obj
    } else {
      // add to current obj
    }
  }
});
If there is an easy way to do this with the HTTP API I would be game for that, too, just can't seem to find it in the docs.
Thanks.
Riak's secondary indexing doesn't support querying two indexes simultaneously. You would need to query each index separately and then intersect the result sets.
However, if you routinely need to query the same pair of indexes, you can create a composite index in addition to the others. So if you are indexing end and definition_id, also create an end-def index whose values are end and definition_id concatenated with a separator.
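Both options can be sketched with plain helper functions. These are hypothetical helpers, not part of riak-js: `intersectKeys` assumes each index query has already returned a list of object keys, and the `|` separator in `compositeIndexValue` is an arbitrary choice that must not occur in either value.

```javascript
// Option 1: intersect the key lists returned by two separate index queries.
function intersectKeys(keysA, keysB) {
  const inB = new Set(keysB);
  return keysA.filter((key) => inB.has(key));
}

// Option 2: build the value to store in a composite "end-def" index.
// The separator is an assumption; pick one that cannot appear in the data.
function compositeIndexValue(end, definitionId, separator = "|") {
  return String(end) + separator + String(definitionId);
}

console.log(intersectKeys(["a", "b", "c"], ["b", "c", "d"]));
console.log(compositeIndexValue(null, 42));
```

The composite value has to be written whenever the object is stored, so that a single index query on the concatenated value can replace the two separate queries.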

How do I create an entry with a compound key with Couchbase?

I have some code running in NodeJS that sets the doc in the database:
cb.set(req.body.id, req.body.value, function (err, meta) {
res.send(req.body);
});
I have read about compound keys and it seems that this feature could simplify my life. The question is how to properly add an entry with a compound key. The code below fails with a message that a string was expected, not an array.
cb.set([req.body.id, generate_uuid()], req.body.value, function (err, meta) {
res.send(req.body);
});
So should I convert my array to a string like '["patrick_bateman", "uuid_goes_here"]'?
If you're speaking about these "compound keys"...
These compound keys aren't set by the user directly. They are created by the Couchbase server when you use a view. In a Couchbase view you can create map functions that use "compound keys". Example:
map: function (doc, meta) {
  if (doc.type === "mytype") {
    emit([doc.body.id, doc.uuid], null);
  }
}
In this case Couchbase will build an index on that "compound key", and when you query the view you'll be able to specify "two" keys.
This is useful, for example, when you need to get documents that fall within some time range. Say you have docs with type "message" and you want to get all docs that were created between time 4 and time 7.
In this case the map function will look like:
map: function (doc, meta) {
  if (meta.type === "json") {
    emit([doc.type, doc.timestamp], null);
  }
}
and the query will contain the params startKey=["message", 4] and endKey=["message", 7].
You can also create complex string keys like "message:4" and then fetch them via a simple get. For example, if you use sequential ids (by using the increment function) for those messages, you can easily iterate through them using a simple for loop and the couchbase.get function.
Also check this blog post by Tug Grall about creating chat application with nodejs and couchbase.
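For the string-key approach, a small sketch of how such keys could be built is shown below. `compoundKey` and `messageKeys` are hypothetical helpers, and the `::` and `:` separators are arbitrary conventions rather than anything Couchbase requires.

```javascript
// Joins the parts of a "compound" identity into the single string key that set() expects.
function compoundKey(id, uuid, separator = "::") {
  return String(id) + separator + String(uuid);
}

// Builds the sequential keys "message:<from>" .. "message:<to>" (inclusive).
function messageKeys(from, to) {
  const keys = [];
  for (let i = from; i <= to; i++) {
    keys.push("message:" + i);
  }
  return keys;
}

console.log(compoundKey("patrick_bateman", "uuid_goes_here"));
console.log(messageKeys(4, 7));
```

In the code from the question, the first argument to cb.set would then be compoundKey(req.body.id, generate_uuid()) instead of an array, and each key from messageKeys could be fetched with a plain get.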

emit doc twice with different key in couchdb

Say I have a doc to save with couchDB and the doc looks like this:
{
"email": "lorem#gmail.com",
"name": "lorem",
"id": "lorem",
"password": "sha1$bc5c595c$1$d0e9fa434048a5ae1dfd23ea470ef2bb83628ed6"
}
and I want to be able to query the doc either by 'id' or 'email'. So when I save this as a view, I write:
db.save('_design/users', {
  byId: {
    map: function(doc) {
      if (doc.id && doc.email) {
        emit(doc.id, doc);
        emit(doc.email, doc);
      }
    }
  }
});
And then I could query like this:
db.view('users/byId', {
  key: key
}, function(err, data) {
  if (err || data.length === 0) return def.reject(new Error('not found'));
  data = data[0] || {};
  data = data.value || {};
  self.attrs = _.clone(data);
  delete self.attrs._rev;
  delete self.attrs._id;
  def.resolve(data);
});
And it works just fine. I could load the data either by id or email. But I'm not sure if I should do so.
I have another solution, which is to save the same doc with two different views, like byId and byEmail, but this way I save the same doc twice and obviously that costs space in the database.
Not sure which solution is better.
The canonical solution would be to have two views, one by email and one by id. To avoid wasting space on the document, you can just emit null as the value and then use the include_docs=true query parameter when you query the view.
Also, you might want to use _id instead of id. That way, CouchDB ensures that the ID will be unique and you don't have to use a view to look up documents.
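A sketch of the two-view layout with null values could look like this. The object has the same shape as the one passed to db.save in the question; the `emit` function and `emitted` array defined here are only local stand-ins so the map functions can be tried outside CouchDB, where emit is provided by the view server.

```javascript
// Local stand-in for CouchDB's emit(), used only to exercise the map functions here.
const emitted = [];
function emit(key, value) {
  emitted.push({ key: key, value: value });
}

// One view per lookup key. Emitting null keeps the document out of the view index.
const usersDesign = {
  byId: {
    map: function (doc) {
      if (doc.id) {
        emit(doc.id, null);
      }
    },
  },
  byEmail: {
    map: function (doc) {
      if (doc.email) {
        emit(doc.email, null);
      }
    },
  },
};
```

This object would be saved under '_design/users' as in the question, and each view queried with include_docs set to true so that the full document comes back alongside the null value.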
I'd change to two separate views. That's explicit and clear. When you emit the same doc twice in a single view, once by id and once by e-mail, you're effectively combining the two views into one. You may think of it as a search tree with two root branches. I don't see any reason for doing that, and would suggest leaving the data access and storage optimization job to the database.
Combining the views may also yield tricky bugs if for some reason you confuse an id with an e-mail.
There is absolutely nothing wrong with emitting the same document multiple times with a different key. It's about what makes most sense for your application.
If id and email are always valid and interchangeable ways to identify a user then a single view is perfect. For example, when id is some sort of unique account reference and users are allowed to use that or their (more memorable) email address to login.
However, if you need to differentiate between the two values, e.g. id is only meant for application administrators, then separate views are probably better. (You could probably use a complex key instead ... but that's another answer.)
