Using Riak.js / Riak, how do I do an "AND" select? - node.js

I am trying to determine the existence of an object to decide whether to create a new object with a new key or to update an existing object. The goal here is to match on two Secondary Indexes.
db.query(bucket, {end: null, definition_id: id}, function(err, data) {
  if (err) {
    res.send(err);
  } else {
    if (data.length === 0) {
      // write new obj
    } else {
      // add to current obj
    }
  }
});
If there is an easy way to do this with the HTTP API I would be game for that too; I just can't seem to find it in the docs.
Thanks.

Riak's secondary indexing doesn't support querying two indexes simultaneously; you would need to query each index separately and then intersect the result sets.
However, if you routinely need to query the same pair of indexes, you can create a composite index in addition to the others. So if you are indexing end and definition_id, also create an end_def index whose value is end and definition_id concatenated with a separator.
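As a rough sketch, here is what maintaining and querying such a composite index could look like with riak-js; the end_def index name and the :: separator are assumptions for illustration, not anything Riak prescribes:
// Store the object with a composite "end_def" secondary index
// alongside the two single-field indexes (hypothetical names).
var sep = '::';
db.save(bucket, key, obj, {
  index: {
    end: end,
    definition_id: id,
    end_def: String(end) + sep + id  // e.g. "null::42"
  }
}, function(err) {
  if (err) return res.send(err);
  // The "AND" check then becomes a single query on the composite index:
  db.query(bucket, { end_def: String(end) + sep + id }, function(err, data) {
    if (err) return res.send(err);
    if (data.length === 0) {
      // write new obj
    } else {
      // add to current obj
    }
  });
});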

Related

The batchWrite method of DocumentClient and GSI

I'm trying to delete some items from a DynamoDB table. My table has a global secondary index, and I'm wondering if it's possible to use the batchWrite method of the DocumentClient to delete items from the GSI. Or can we use the GSI for fetching data only?
var params = {
  RequestItems: {
    'Table-1': [
      {
        DeleteRequest: {
          Key: { HashKey: 'someKey' }
        }
      }
    ]
  }
};
documentClient.batchWrite(params, function(err, data) {
  if (err) console.log(err);
  else console.log(data);
});
If it's possible, please provide an example of the params.
You cannot delete from a GSI. These indexes are essentially read-only: you can't mutate the data in the table through a global secondary index, so no inserting, deleting, or updating.
You can only read from the GSI and then implement the necessary logic to delete the items from the main table by their primary key.
Also, a batch operation doesn't make deleting those items a whole lot more efficient: yes, it saves on network calls (up to 25:1), but not on consumed write capacity.
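A sketch of that read-then-delete pattern; the table, index, and attribute names here are invented for illustration:
// Hypothetical setup: table 'Table-1' with partition key 'HashKey',
// and a GSI 'Status-index' whose partition key is 'status'.
documentClient.query({
  TableName: 'Table-1',
  IndexName: 'Status-index',
  KeyConditionExpression: '#s = :s',
  ExpressionAttributeNames: { '#s': 'status' },
  ExpressionAttributeValues: { ':s': 'stale' },
  // Project only the base table's key so we can build DeleteRequests
  ProjectionExpression: 'HashKey'
}, function(err, data) {
  if (err) return console.log(err);
  if (!data.Items.length) return;
  var params = {
    RequestItems: {
      // Deletes always target the table itself, never the index
      'Table-1': data.Items.map(function(item) {
        return { DeleteRequest: { Key: { HashKey: item.HashKey } } };
      })
    }
  };
  // Note: batchWrite accepts at most 25 requests per call
  documentClient.batchWrite(params, function(err, data) {
    if (err) console.log(err);
    else console.log(data);
  });
});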

Google Datastore combine (union) multiple sets of entity results to achieve OR condition

I am working with NodeJS on Google App Engine with the Datastore database.
Due to the fact that Datastore does not support the OR operator, I need to run multiple queries and combine the results.
I am planning to run multiple queries and then combine the results into a single array of entity objects. I have a single query working already.
Question: What is a reasonably efficient way to combine two (or more) sets of entities returned by Datastore including de-duplication? I believe this would be a "union" operation in terms of set theory.
Here is the basic query outline that will be run multiple times with some varying filters to achieve the OR conditions required.
//Set requester username
const requester = req.user.userName;
//Create datastore query on Transfer Request kind table
const task_history = datastore.createQuery('Task');
//Set query conditions
task_history.filter('requester', requester);
//Run datastore query
datastore.runQuery(task_history, function(err, entities) {
  if (err) {
    console.log('Task History JSON unable to return data results. Error message: ', err);
    return;
  } else if (entities[0]) {
    //If the query works and returns any entities
    res.json(entities); //HOW TO COMBINE (UNION) MULTIPLE SETS OF ENTITIES EFFICIENTLY?
    return;
  }
});
Here is my original post: Google Datastore filter with OR condition
IMHO the most efficient way would be to use keys-only queries in the first stage, then combine the keys obtained into a single de-duplicated list, followed by obtaining the entities simply by key lookup. From Projection queries:
Keys-only queries
A keys-only query (which is a type of projection query) returns just the keys of the result entities instead of the entities themselves, at lower latency and cost than retrieving entire entities.
It is often more economical to do a keys-only query first, and then fetch a subset of entities from the results, rather than executing a general query which may fetch more entities than you actually need.
Here's how to create a keys-only query:
const query = datastore.createQuery()
  .select('__key__')
  .limit(1);
This method addresses several problems you may encounter when trying to directly combine lists of entities obtained through regular, non-keys-only queries:
you can't de-duplicate properly because you can't tell the difference between distinct entities with identical values and the same entity appearing in multiple query results
comparing entities by property values can be tricky and is definitely slower/more computationally expensive than comparing just entity keys
if you can't process all the results in a single request you're incurring unnecessary datastore costs for reading them without actually using them
it is much simpler to split processing of entities in multiple requests (via task queues, for example) when handling just entity keys
There are some disadvantages as well:
it may be a bit slower because you're going to the datastore twice: once for the keys and once to get the actual entities
you can't take advantage of getting just the properties you need via non-keys-only projection queries
Here is the solution I created based on the advice provided in the accepted answer.
/*History JSON*/
module.exports.treqHistoryJSON = function(req, res) {
  if (!req.user) {
    req.user = {};
    res.json();
    return;
  }
  //Set Requester username
  const loggedin_username = req.user.userName;
  //Get records matching Requester OR Dataowner
  //Google Datastore OR conditions are not supported
  //Workaround: run separate parallel queries for Requester and Dataowner, then combine the results
  async.parallel({
    //Get entity keys matching Requester
    requesterKeys: function(callback) {
      getKeysOnly('TransferRequest', 'requester_username', loggedin_username, (treqs_by_requester) => {
        //Callback passes the response in as a parameter
        callback(null, treqs_by_requester);
      });
    },
    //Get entity keys matching Dataowner
    dataownerKeys: function(callback) {
      getKeysOnly('TransferRequest', 'dataowner_username', loggedin_username, (treqs_by_dataowner) => {
        callback(null, treqs_by_dataowner);
      });
    }
  }, function(err, getEntities) {
    if (err) {
      console.log('Transfer Request History JSON unable to get entity keys Transfer Request. Error message: ', err);
      return;
    } else {
      //Combine two arrays of entity keys into a single de-duplicated array of entity keys
      let entity_keys_union = unionEntityKeys(getEntities.requesterKeys, getEntities.dataownerKeys);
      //Get key values from entity key 'symbol' object type
      let entity_keys_only = entity_keys_union.map((ent) => {
        return ent[datastore.KEY];
      });
      //Pass in array of entity keys to get full entities
      datastore.get(entity_keys_only, function(err, entities) {
        if (err) {
          console.log('Transfer Request History JSON unable to lookup multiple entities by key for Transfer Request. Error message: ', err);
          return;
        } else {
          processEntitiesToDisplay(res, entities);
        }
      });
    }
  });
};
/*
 * Get keys-only entities by kind and property
 * #kind string name of kind
 * #property_type string property filtering by in query
 * #filter_value string of filter value to match in query
 * #getEntitiesCallback callback to collect results
 */
function getKeysOnly(kind, property_type, filter_value, getEntitiesCallback) {
  //Create datastore query
  const keys_query = datastore.createQuery(kind);
  //Set query conditions
  keys_query.filter(property_type, filter_value);
  //Select KEY only
  keys_query.select('__key__');
  datastore.runQuery(keys_query, function(err, entities) {
    if (err) {
      console.log('Get Keys Only query unable to return data results. Error message: ', err);
      return;
    } else {
      getEntitiesCallback(entities);
    }
  });
}
/*
 * Union two arrays of entity keys, de-duplicating based on ID value
 * #arr1 array of entity keys
 * #arr2 array of entity keys
 */
function unionEntityKeys(arr1, arr2) {
  //Create new array
  let arr3 = [];
  //For each element in array 1
  for (let i in arr1) {
    let shared = false;
    for (let j in arr2) {
      //If ID in array 1 is same as array 2 then this is a duplicate
      if (arr2[j][datastore.KEY]['id'] == arr1[i][datastore.KEY]['id']) {
        shared = true;
        break;
      }
    }
    //If IDs are not the same, add element to new array
    if (!shared) {
      arr3.push(arr1[i]);
    }
  }
  //Concat array 2 and new array 3
  arr3 = arr3.concat(arr2);
  return arr3;
}
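As a side note, the nested loops above make the union O(n×m); if the result sets get large, a map keyed on the entity ID does the same de-duplication in linear time. A sketch under the same assumptions (keys carrying a numeric id; keys with names rather than auto-generated IDs would need the name instead):
function unionEntityKeys(arr1, arr2) {
  //Key each entity by its Datastore ID; duplicates collapse onto one map entry
  const byId = new Map();
  for (const ent of arr1.concat(arr2)) {
    byId.set(ent[datastore.KEY]['id'], ent);
  }
  //The map's values are the de-duplicated union
  return Array.from(byId.values());
}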
I just wanted to write in for folks who stumble upon this...
There is a workaround for some cases of not having the OR operator if you can restructure your data a bit, using Array properties: https://cloud.google.com/datastore/docs/concepts/entities#array_properties
From the documentation:
Array properties can be useful, for instance, when performing queries with equality filters: an entity satisfies the query if any of its values for a property matches the value specified in the filter.
So, if you needed to query for all entities bearing one of multiple potential values, putting all of the possibilities for each entity into an Array property and then indexing it for your query should yield the results you want. (But, you'd need to maintain that additional property, or replace your existing properties with that Array implementation if it could work for all of what you need.)
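For instance, a rough sketch with invented kind and property names: if an entity should match a query for either of two usernames, store both in a single array property and use one equality filter.
// Hypothetical 'Task' entity: both usernames in one array property
const task = {
  key: datastore.key('Task'),
  data: {
    description: 'transfer request',
    // An equality filter matches if ANY element equals the value
    participants: ['requester_alice', 'dataowner_bob']
  }
};
datastore.save(task, function(err) {
  if (err) return console.log(err);
  // One query now covers what would otherwise need an OR
  const query = datastore.createQuery('Task')
    .filter('participants', 'requester_alice');
  datastore.runQuery(query, function(err, entities) {
    if (err) return console.log(err);
    console.log(entities);
  });
});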

Remove a single object from an array of objects in a single document of a collection

I have a problem with which I have been struggling for quite some time. Suppose my document is like this:
{"owner":"princu7", "books":[{"name":"the alchemist"}, {"name":"the alchemist"}, {"name":"the alchemist"}]}
Now what do I do if I have to remove just one single element from the books array based on matching the name? I did it like this:
var bookName="the alchemist";
var obj={"name":bookName}
db.collection("colName").update({"owner":"princu7"}, {$pull:{books:obj}}, {multi:false})
But the problem is that it removes all the entries in the array whose name matches "the alchemist". What I wanted was this:
{"owner":"princu7", "books":[{"name":"the alchemist"}, {"name":"the alchemist"}]}
But what I got was this:
{"owner":"princu7", "books":[]}
Upon reading the documentation, it says that $pull removes all the instances that match the specified condition, so maybe that's why it's removing all the other entries of the array which match the condition. So what should I do here? Thanks for reading; I really appreciate your help.
See this issue:
https://jira.mongodb.org/browse/SERVER-1014
You cannot achieve what you are trying to do in a single update. One way would be to modify the record in your application and save the changes.
You could use collection.updateOne() with upsert set to true to re-write the record in place. The idea is you get the original document, modify it in your app logic, then re-apply it to the database after removing the element from the array.
function upsert(collection, query, json) {
  var col = db.collection(collection);
  col.updateOne(query,
    { $set: json },
    { upsert: true },
    function (err, result) {
      if (err) {
        log('error = ', err);
      } else {
        // no error, call the next function
      }
    }
  );
}
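Usage could then look something like this; the findIndex/splice step is just one way to drop a single matching element in app code:
// Fetch the document, remove the first matching book in app logic,
// then write the modified array back with the upsert helper.
db.collection('colName').findOne({ owner: 'princu7' }, function (err, doc) {
  if (err) return log('error = ', err);
  var idx = doc.books.findIndex(function (b) { return b.name === 'the alchemist'; });
  if (idx !== -1) doc.books.splice(idx, 1);  // remove only that one element
  upsert('colName', { owner: 'princu7' }, { books: doc.books });
});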
By design, MongoDB's $pull operator removes from an existing array all instances of a value or values that match the specified condition. Therefore it will remove all matching {"name":"the alchemist"} elements from the array.
Note that $pop won't help here either: it only removes the first or last element of an array and cannot take a match condition. A common two-step workaround is to $unset the first matching element via the positional operator, then $pull the resulting null:
db.collection("colName").update(
  {"owner": "princu7", "books.name": "the alchemist"},
  {$unset: {"books.$": 1}}
);
db.collection("colName").update(
  {"owner": "princu7"},
  {$pull: {"books": null}}
);

Why can't I seem to merge a normal Object into a Mongo Document?

I have a data feed from a 3rd party server that I am pulling in and converting to JSON. The data feed will never have my MongoDB's auto-generated _ids in it, but there is a unique identifier called vehicle_id.
The function below takes the data-feed-generated JSON object fresh_event_obj and copies its values into a mongo document if there is a mongo document with the same vehicle_id.
function update_vehicle(fresh_event_obj) {
  console.log("Updating Vehicle " + fresh_event_obj.vehicleID + "...");
  Vehicle.find({ vehicleID: fresh_event_obj.vehicleID }, function (err, event_obj) {
    if (err) {
      handle_error(err);
    } else {
      var updated = _.merge(event_obj[0], fresh_event_obj);
      updated.save(function (err) {
        if (err) {
          handle_error(err);
        } else {
          console.log("Vehicle Updated");
        }
      });
    }
  });
}
The structures of event_obj[0] and fresh_event_obj are identical, except that event_obj[0] has _id and __v while the "normal" object doesn't.
When I run _.merge on these two, or even my own recursive function that just copies values from the latter to the former, nothing in the updated object is different from the event_obj[0], despite fresh_event_obj having all new values.
Does anyone have any idea what I'm doing wrong? I feel it is obvious and I'm just failing to see it.
The problem is that if you don't have properties defined in your schema, and if they don't already exist, you can't create them with
doc.prop = value
even if you have {strict:false} in your schema.
The only way to set new properties is to do
doc.set('prop', value)
(You still need {strict: false} in your schema if that property isn't defined there.)
As for having too many properties to define in the schema, you can always use a for-in loop to go through the object's properties:
for (const key in fresh_event_obj)
  event_obj.set(key, fresh_event_obj[key]);
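Applied to the original update_vehicle function, the fix could look something like this (a sketch assuming event_obj[0] is a Mongoose document and the schema has {strict: false}):
function update_vehicle(fresh_event_obj) {
  console.log("Updating Vehicle " + fresh_event_obj.vehicleID + "...");
  Vehicle.find({ vehicleID: fresh_event_obj.vehicleID }, function (err, event_obj) {
    if (err) return handle_error(err);
    var doc = event_obj[0];
    // Use set() so values for properties outside the schema are actually written
    for (const key in fresh_event_obj) {
      doc.set(key, fresh_event_obj[key]);
    }
    doc.save(function (err) {
      if (err) handle_error(err);
      else console.log("Vehicle Updated");
    });
  });
}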

How do I create an entry with a compound key with Couchbase?

I have some code running in NodeJS that sets the doc in the database:
cb.set(req.body.id, req.body.value, function (err, meta) {
  res.send(req.body);
});
I have read about compound keys, and it seems that feature could simplify my life. The question is how to properly add an entry with a compound key. The code below fails with a message that a string was expected, not an array.
cb.set([req.body.id, generate_uuid()], req.body.value, function (err, meta) {
  res.send(req.body);
});
So should I convert my array to a string like '["patrick_bateman", "uuid_goes_here"]'?
If you're speaking about these "compound keys"...
Compound keys aren't set by the user directly; they are built by the Couchbase server when you use a view. In a Couchbase view you can create map functions that emit compound keys. Example:
map: function(doc, meta) {
  if (doc.type === "mytype") {
    emit([doc.body.id, doc.uuid], null);
  }
}
In this case Couchbase will create an index on that compound key, and when you query the view you'll be able to pass two-part keys.
This is useful, for example, when you need to get documents within some time range. Say you have docs with type "message" and you want all docs created from time 4 to 7.
In this case the map function will look like:
map: function(doc, meta) {
  if (meta.type === "json") {
    emit([doc.type, doc.timestamp], null);
  }
}
and the query will contain the params startkey=["message", 4] and endkey=["message", 7].
But you can also create complex keys like "message:4" and then fetch them via a simple get. I.e., if you use sequential IDs (via the increment function) for those messages, you can easily iterate through them using a simple for loop and the couchbase get function.
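To illustrate that last point with the same client API used in the question (the separator and key layout are just one possible convention):
// Build the "compound" key yourself as a plain string
var key = req.body.id + ':' + generate_uuid();  // e.g. "patrick_bateman:6f1c..."
cb.set(key, req.body.value, function (err, meta) {
  if (err) return res.send(err);
  // Fetch it back later by reconstructing the same key -- no view needed
  cb.get(key, function (err, result) {
    res.send(result);
  });
});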
Also check out this blog post by Tug Grall about creating a chat application with Node.js and Couchbase.
