I am trying to query some data from a Cosmos DB (DocumentDB) database.
Here is my data:
{
  "id": "**********",
  "triggers": [
    {
      "type": "enter_an_area",
      "trigger_identity": "***********"
    },
    {
      "type": "enter_an_area",
      "trigger_identity": "********"
    },
    {
      "type": "exit_an_area",
      "trigger_identity": "*******"
    }
  ]
}
This is one document from my collection, which holds one document per user, each with a unique ID. What I want to do is count the number of users that use a specific trigger. A user may have the same trigger multiple times (as you can see in the example, "enter_an_area" has more than one entry), but I still want to count that user only once.
I use this query to get the count for a specific trigger:
SELECT VALUE COUNT(1) FROM u JOIN t in u.triggers WHERE CONTAINS(t.type, "enter_an_area")
But for the example above this returns 2, where I want it to return 1.
Is there a query to do this in DocumentDB? If not, is there a way to return results without duplicates? As a workaround I thought I could return the IDs of the users that have this specific trigger, but then I would get duplicate IDs when a user has more than one entry for a trigger.
It seems that what you need is something like DISTINCT over the results of a query that joins an array. I suggest using a stored procedure as a workaround.
Please refer to my sample stored procedure:
function sample() {
var collection = getContext().getCollection();
var isAccepted = collection.queryDocuments(
collection.getSelfLink(),
'SELECT u.id FROM u JOIN t in u.triggers WHERE CONTAINS(t.type, "enter_an_area")',
function (err, feed, options) {
if (err) throw err;
if (!feed || !feed.length) getContext().getResponse().setBody('no docs found');
else {
var returnResult = [];
var temp =''
for(var i = 0;i<feed.length;i++){
var valueStr = feed[i].id;
if(valueStr != temp){
temp = valueStr;
returnResult.push(feed[i])
}
}
getContext().getResponse().setBody(returnResult.length);
}
});
if (!isAccepted) throw new Error('The query was not accepted by the server.');
}
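The loop above only drops a duplicate when it directly follows the row with the same id, so it relies on the rows of one document arriving next to each other. A minimal sketch of an order-independent count is shown below; the helper name countDistinctIds and the sample feed are made up for illustration and are not part of the stored procedure API.

```javascript
// Count distinct ids in a query feed, regardless of the order in which rows arrive.
function countDistinctIds(feed) {
    var seen = {};   // ids that have already been counted
    var count = 0;
    for (var i = 0; i < feed.length; i++) {
        var id = feed[i].id;
        if (!Object.prototype.hasOwnProperty.call(seen, id)) {
            seen[id] = true;
            count++;
        }
    }
    return count;
}

// Illustrative feed: user "a" matches the trigger twice, user "b" once.
var sampleFeed = [{ id: "a" }, { id: "b" }, { id: "a" }];
console.log(countDistinctIds(sampleFeed)); // prints 2
```

Inside the stored procedure, the result of countDistinctIds(feed) could be passed to setBody in place of returnResult.length.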
Hope it helps you.
In a Cosmos DB stored procedure, I'm using an inline SQL query to try and retrieve the distinct count of a particular user id.
I'm using the SQL API for my account. I've run the below query in Query Explorer in my Cosmos DB account and I know that I should get a count of 10 (There are 10 unique user ids in my collection):
SELECT VALUE COUNT(1) FROM (SELECT DISTINCT c.UserId FROM root c) AS t
However when I run this in the Stored Procedure portal, I either get 0 records back or 18 records back (total number of documents). The code for my Stored Procedure is as follows:
function GetDistinctCount() {
var collection = getContext().getCollection();
var isAccepted = collection.queryDocuments(
collection.getSelfLink(),
'SELECT VALUE COUNT(1) FROM (SELECT DISTINCT c.UserId FROM root c) AS t',
function(err, feed, options) {
if (err) throw err;
if (!feed || !feed.length) {
var response = getContext().getResponse();
var body = {code: 404, body: "no docs found"}
response.setBody(JSON.stringify(body));
} else {
var response = getContext().getResponse();
var body = {code: 200, body: feed[0]}
response.setBody(JSON.stringify(body));
}
}
)
}
After looking at various feedback forums and documentation, I don't think there's an elegant solution for me to do this as simply as it would be in normal SQL.
The UserId is my partition key, which I'm passing through in my C# code and when I test it in the portal, so there are no additional parameters that I need to set when calling the stored procedure. I'm calling this stored procedure via C# and adding any further parameters will have an effect on my tests for that code, so I'm keen not to introduce any parameters if I can.
Your problem is caused by not setting the partition key when executing your stored procedure.
Please see the statements in the official document:
And this:
So, when you execute a stored procedure against a partitioned collection, you need to pass the partition key parameter. It's necessary! (This case also explains it: Documentdb stored proc cross partition query)
Back to your question: you never pass a partition key, which is equivalent to passing null or "" as the partition key, so the query returns no data because none of your documents has a UserId equal to null or "".
My advice:
You could use the normal query SDK to execute your SQL and set enableCrossPartitionQuery: true, which allows you to scan the entire collection without setting a partition key. Please refer to this tiny sample: Can't get simple CosmosDB query to work via Node.js - but works fine via Azure's Query Explorer
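As a rough sketch of that advice, assuming the legacy documentdb Node.js SDK, where DocumentClient.queryDocuments returns an iterator with a toArray method: the function name countDistinctUserIds and the collection link are hypothetical, and the duplicates are removed on the client rather than in the query.

```javascript
// Sketch only: `client` is assumed to be a DocumentClient from the legacy
// `documentdb` package; `collectionLink` is a hypothetical link to the collection.
function countDistinctUserIds(client, collectionLink, callback) {
    var options = { enableCrossPartitionQuery: true }; // scan every partition
    client.queryDocuments(collectionLink, "SELECT c.UserId FROM c", options)
        .toArray(function (err, rows) {
            if (err) return callback(err);
            var seen = {};   // user ids that have already been counted
            var count = 0;
            rows.forEach(function (row) {
                if (!Object.prototype.hasOwnProperty.call(seen, row.UserId)) {
                    seen[row.UserId] = true;
                    count++;
                }
            });
            callback(null, count);
        });
}

// Possible usage (not run here):
// countDistinctUserIds(client, "dbs/mydb/colls/mycoll", function (err, count) { ... });
```

This reads every UserId back to the client, so it trades request units for simplicity; it is meant to show where the option goes, not to be the cheapest way to count.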
So I found a solution that returns the result I need. My stored procedure now looks like this:
function GetPaymentCount() {
var collection = getContext().getCollection();
var isAccepted = collection.queryDocuments(
collection.getSelfLink(),
'SELECT DISTINCT VALUE(doc.UserId) from root doc' ,
{pageSize:-1 },
function(err, feed, options) {
if (err) throw err;
if (!feed || !feed.length) {
var response = getContext().getResponse();
var body = {code: 404, body: "no docs found"}
response.setBody(JSON.stringify(body));
} else {
var response = getContext().getResponse();
var body = {code: 200, body: JSON.stringify(feed.length)}
response.setBody(JSON.stringify(body));
}
}
)
}
Essentially, I changed the pageSize parameter to -1 which returned all the documents I knew would be returned in the result. I have a feeling that this will be more expensive in terms of RU/s cost, but it solves my case for now.
If anyone has more efficient alternatives, please comment and let me know.
I have a list of documents that belong to a partitioned collection. Instead of querying for every document from the .NET client and either do update or insert, I thought I could use a Stored Procedure to accomplish this.
What I did not initially realize is that Stored Procedures are executed in the transaction scope of a single partition key. So I am getting PartitionKey value must be supplied for this operation.
The thing is that the documents (that I am trying to upsert) may belong to different partitions. How can I accomplish this in the Stored Procedure? In my case, the SP is useless unless it can operate on multiple partitions.
This is how I constructed my SP:
function upsertEcertAssignments(ecerts) {
var collection = getContext().getCollection();
var collectionLink = collection.getSelfLink();
var response = getContext().getResponse();
// Validate input
if (!ecerts) throw new Error("The ecerts is null or undefined");
if (ecerts.length == 0) throw new Error("The ecerts list size is 0");
// Recursively call the 'process' function
processEcerts(ecerts, 0);
function processEcerts(ecerts, index) {
if (index >= ecerts.length) {
response.setBody(index);
return;
}
var query = {query: "SELECT * FROM DigitalEcerts c WHERE c.code = @code AND c.collectionType = @type", parameters: [{name: "@code", value: ecerts[index].code}, {name: "@type", value: 0}]};
var isQueryAccepted = collection.queryDocuments(collectionLink, query, {partitionKey: ecerts[index].code}, function(err, foundDocuments, foundOptions) {
if (err) throw err;
if (foundDocuments.length > 0) {
var existingEcert = foundDocuments[0];
ecerts[index].id = existingEcert.id;
var isAccepted = __.replaceDocument(existingEcert._self, ecerts[index], function(err, updatedEcert, replacedOptions) {
if (err) throw err;
processEcerts(ecerts, index + 1);
});
if (!isAccepted) {
response.setBody(index);
}
} else {
var isAccepted = __.createDocument(__.getSelfLink(), ecerts[index], function(err, insertedEcert, insertedOptions) {
if (err) throw err;
processEcerts(ecerts, index + 1);
});
if (!isAccepted) {
response.setBody(index);
}
}
});
if (!isQueryAccepted)
response.setBody(index);
}
}
From .NET, if I call it like this, I get the partitionKey value problem:
var continuationIndex = await _docDbClient.ExecuteStoredProcedureAsync<int>(UriFactory.CreateStoredProcedureUri(_docDbDatabaseName, _docDbDigitalEcertsCollectionName, "UpsertDigitalMembershipEcertAssignments"), digitalEcerts);
If I call it with a partition key, it works...but it is useless:
var continuationIndex = await _docDbClient.ExecuteStoredProcedureAsync<int>(UriFactory.CreateStoredProcedureUri(_docDbDatabaseName, _docDbDigitalEcertsCollectionName, "UpsertDigitalMembershipEcertAssignments"), new RequestOptions { PartitionKey = new PartitionKey(digitalEcerts[0].Code) }, digitalEcerts.Take(1).ToList());
I appreciate any pointer.
Thanks.
By the sound of it, your unique id is a combination of code and type. I would recommend making your id property the combination of the two.
This guarantees that your id is unique but also eliminates the need to query for it.
If the collection the stored procedure is registered against is a
single-partition collection, then the transaction is scoped to all the
documents within the collection. If the collection is partitioned,
then stored procedures are executed in the transaction scope of a
single partition key. Each stored procedure execution must then
include a partition key value corresponding to the scope the
transaction must run under.
You can refer to the description above, which is mentioned here. We can query documents across partitions by setting EnableCrossPartitionQuery to true in the FeedOptions parameter. However, RequestOptions has no such property for executing a stored procedure.
So it seems you have to provide a partition key when you execute the SP. Of course, the SP can be replaced by the upsert function. That makes the SP of little use from the perspective of business logic, but for bulk operations the SP can relieve some of the performance pressure because it runs on the server side.
Hope it helps you.
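Since each execution is scoped to one partition key, one option is to split the list on the client and call the stored procedure once per key. Here is a minimal sketch of the grouping step, written in JavaScript for brevity although the question's client is .NET; groupByPartitionKey and the sample documents are hypothetical.

```javascript
// Group documents by their partition key value so that each group can be
// sent to the stored procedure in a separate, single-partition call.
function groupByPartitionKey(docs, keyOf) {
    var groups = Object.create(null); // no inherited keys to collide with
    docs.forEach(function (doc) {
        var key = keyOf(doc);
        if (!groups[key]) groups[key] = [];
        groups[key].push(doc);
    });
    return groups;
}

// Illustrative data: `code` plays the role of the partition key.
var batches = groupByPartitionKey(
    [{ code: "A", n: 1 }, { code: "B", n: 2 }, { code: "A", n: 3 }],
    function (doc) { return doc.code; }
);
console.log(Object.keys(batches)); // prints [ 'A', 'B' ]
```

Each entry of batches would then be passed to one stored procedure execution together with its key as the partition key.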
I have a collection of articles in MongoDB. I choose an article that I want to render, and I want two other articles chosen randomly. I want to pick two articles in my collection that are not the same, and are not the article I have chosen before.
I have been on this problem for hours. I searched for a solution but only found how to pick an element randomly, not how to pick one while excluding another...
Here is what I have now:
article.find({}, function(err, articles){
var articleChosen = articles.filter(selectArticleUrl, articleUrl)[0];
article.find({})
.lean()
.distinct("_id")
.exec(function(err, arrayIds){
var articleChosenIndex = arrayIds.indexOf(articleChosen._id);
arrayIds.splice(articleChosenIndex, 1);
chooseRdmArticle(arrayIds, function(articleRdm1Id){
var articleRmd1 = articles.filter(selectArticleId, articleRdm1Id)[0];
var articleRdm1Index = arrayIds.indexOf(articleRdm1Id);
arrayIds.splice(articleRdm1Index, 1);
chooseRdmArticle(arrayIds, function(articleRdm2Id){
var articleRmd2 = articles.filter(selectArticleId, articleRdm2Id)[0];
// do stuff with articleChosen, articleRmd1 and articleRmd2
})
})
})
})
where the function which chooses a random article is:
function chooseRdmArticle(articles, callback){
var min = Math.ceil(0);
var max = Math.floor(articles.length);
var rdm = Math.floor(Math.random() * (max - min)) + min;
callback(articles[rdm])
}
and the function which selects the article from its url is:
function selectArticleUrl(element){
return element.url == this
}
My idea was to work on the array containing all the ObjectId (arrayIds here), to choose two Ids randomly after removing the articleChosen id. But I understood that arrayIds.indexOf(articleRdm1Id); couldn't work because ObjectIds are not strings ... Is there a method to find the index of the Id I want? Or any better idea ?
Thanks a lot !
Run two queries where the first fetches the chosen document and the other uses the aggregation framework to run a pipeline with the $sample operator to return 2 random documents from the collection except the chosen one.
The following query uses Mongoose's built-in Promises to demonstrate this:
let chosenArticle = article.find({ "url": articleUrl }).exec();
let randomArticles = article.aggregate([
{ "$match": { "url": { "$ne": articleUrl } } },
{ "$sample": { "size": 2 } }
]).exec();
Promise.all([chosenArticle, randomArticles]).then(articles => {
console.log(articles);
});
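On the indexOf part of the question: indexOf compares with ===, so two ObjectId instances holding the same value do not match. If the ids are already in memory, comparing their string forms is one way around that. The sketch below assumes ObjectId values convert to their hex string via String(); the helper pickTwoOthers is hypothetical.

```javascript
// Pick up to two distinct random ids from `ids`, never returning `chosenId`.
function pickTwoOthers(ids, chosenId) {
    // Compare string forms, because identical ObjectId values are still
    // different objects and would not match with === or indexOf.
    var pool = ids.filter(function (id) {
        return String(id) !== String(chosenId);
    });
    var picked = [];
    while (picked.length < 2 && pool.length > 0) {
        var index = Math.floor(Math.random() * pool.length);
        picked.push(pool.splice(index, 1)[0]); // remove so it cannot be picked twice
    }
    return picked;
}
```

For large collections the $sample pipeline above avoids loading every id, so this helper mainly suits small collections.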
MongoDB has the $sample aggregation stage, which returns a random selection of documents.
Example from the documentation :
db.users.aggregate( [ { $sample: { size: 3 } } ] )
I had the same problem and this works for me
const suggestedArticles = await Article.find({
articleId: { $ne: req.params.articleId },
}).limit(2);
I need to store a user's info in DynamoDB and send a mail to that user if they don't already exist in the DynamoDB table. I am doing this in a for loop. The list contains only 2 records. The issue is that only the second record gets inserted into the table and the mail is sent twice to the same user. Here is the code:
module.exports.AddUser = function(req, res, usersList, departmentId) {
var _emailId = "";
var _userName = "";
var _departmentId = departmentId;
for (var i = 0; i < usersList.length; i++) {
_emailId = usersList[i].emailId;
_userName = usersList[i].userName;
var params = {
TableName: "UsersTable",
Key: {
"emailId": _emailId,
"departmentId": _departmentId
}
};
docClient.get(params, function(err, data) {
if (!err) {
if (!data.items)
AddUserAndSendEmail("UsersTable", _emailId, _userName);
//The above function is being called twice but for the same user.
//It has a check so not inserting the same record twice but
//sending two mails to the same user.
}
});
}
res.end("success");
}
function AddUserAndSendEmail(tableName, emailId, _userName) {
var params = {
TableName: tableName,
Item: {
"emailId": emailId,
"departmentId": 101//Default Department
}
};
docClient.put(params, function(err, data) {
if (!err) {
//Send Email Code Here
} else {
console.log("error");
}
});
}
What could be the reason for this strange behavior? Really frustrated, I am about to give up on this.
1) Please note that DynamoDB reads are eventually consistent by default. If you insert an item and immediately check whether it exists, the read may not always find the item in the database.
This means the second iteration of the loop may not always find the first item inserted into the table.
2) If the item already exists in the table, the Put API will replace the item and return a successful response.
This means the Put will be successful for the same email id and department id in the second iteration because it replaces the record if it is already present.
GetItem – The GetItem operation returns a set of Attributes for an
item that matches the primary key. The GetItem operation provides an
eventually consistent read by default. If eventually consistent reads
are not acceptable for your application, use ConsistentRead.
PutItem – Creates a new item, or replaces an old item with a new item
(including all the attributes). If an item already exists in the
specified table with the same primary key, the new item completely
replaces the existing item. You can also use conditional operators to
replace an item only if its attribute values match certain conditions,
or to insert a new item only if that item doesn’t already exist.
Based on the above points, there is a possibility to get two emails if you have same email id and department id in the array.
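A hedged sketch of how the two quoted options could be applied with the DocumentClient: a strongly consistent read via ConsistentRead, plus a conditional put that succeeds only when the item is new. The helper addUserIfMissing and its onAdded callback are hypothetical, and the put reuses the lookup key for simplicity. Each call also receives its own user as an argument, so the callbacks do not share the loop's variables.

```javascript
// Sketch only: `docClient` is assumed to be an AWS.DynamoDB.DocumentClient.
function addUserIfMissing(docClient, tableName, user, departmentId, onAdded) {
    var getParams = {
        TableName: tableName,
        Key: { emailId: user.emailId, departmentId: departmentId },
        ConsistentRead: true // ask for a strongly consistent read
    };
    docClient.get(getParams, function (err, data) {
        // A found item is returned in `data.Item`.
        if (err || data.Item) return; // lookup failed or user already present
        var putParams = {
            TableName: tableName,
            Item: { emailId: user.emailId, departmentId: departmentId },
            // Reject the write if an item with this key already exists.
            ConditionExpression: "attribute_not_exists(emailId)"
        };
        docClient.put(putParams, function (putErr) {
            if (!putErr) onAdded(user); // e.g. send the email here
        });
    });
}
```

With this shape the email is sent from onAdded, which runs only when the conditional put actually created the item.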
I'm trying to figure out how Mongoose and MongoDB work... I'm really new to them, and I can't seem to figure out how to return values based on a find statement where some of the parameters given in the query may be null. Is there an attribute I can set for this or something?
To explain it further, I have a web page that has different input fields that are used to search for a company, however they're not all mandatory.
var Company = mongoose.model('Company');
Company.find({companyName: req.query.companyName, position: req.query.position,
areaOfExpertise: req.query.areaOfExpertise, zip: req.query.zip,
country: req.query.country}, function(err, docs) {
res.json(docs);
});
By filling out all the input fields on the webpage I get a result back, but only that specific one which matches. Let's say I only fill out country, it returns nothing because the rest are empty, but I wish to return all rows which are e.g. in Germany. I hope I expressed myself clearly enough.
You need to wrap the queries with the $or logical operator, for example:
var Company = mongoose.model('Company');
Company.find(
{
"$or": [
{ "companyName": req.query.companyName },
{ "position": req.query.position },
{ "areaOfExpertise": req.query.areaOfExpertise },
{ "zip": req.query.zip },
{ "country": req.query.country }
]
}, function(err, docs) {
res.json(docs);
}
);
Another approach would be to construct a query that checks for empty parameters and includes only those that are not null. For example, you can just use the req.query object as your query, assuming its keys are the same as your document's fields, as in the following:
/*
the req.query object will only have two parameters/keys e.g.
req.query = {
position: "Developer",
country: "France"
}
*/
var Company = mongoose.model('Company');
Company.find(req.query, function(err, docs) {
if (err) throw err;
res.json(docs);
});
In the above, the req.query object acts as the query and has an implicit logical AND operation since MongoDB provides an implicit AND operation when specifying a comma separated list of expressions. Using an explicit AND with the $and operator is necessary when the same field or operator has to be specified in multiple expressions.
If you are after a query that satisfies both logical AND and OR i.e. return all documents that match the conditions of both clauses for example given a query for position AND country OR any other fields then you may tweak the query to:
var Company = mongoose.model('Company');
Company.find(
{
"$or": [
{ "companyName": req.query.companyName },
{
"position": req.query.position,
"country": req.query.country
},
{ "areaOfExpertise": req.query.areaOfExpertise },
{ "zip": req.query.zip }
]
}, function(err, docs) {
res.json(docs);
}
);
but then again this could be subject to what query parameters need to be joined as mandatory etc.
I simply ended up deleting the parameters in the query in case they were empty. It seems all the text fields in the submit are submitted as "" (empty). Since there are no such values in the database, it would return nothing. So simple it never crossed my mind...
Example:
if (req.query.companyName == "") {
delete req.query.companyName;
}
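The same idea can be applied to all fields at once. Below is a small sketch that copies only the non-empty, expected fields into a new filter object instead of deleting keys from req.query one by one; the helper buildFilter is hypothetical, and the field list is taken from the question.

```javascript
// Build a find() filter from only the query fields that were actually filled in.
function buildFilter(query, allowedFields) {
    var filter = {};
    allowedFields.forEach(function (field) {
        var value = query[field];
        // Skip missing fields and fields submitted as empty or blank text.
        if (typeof value === "string" && value.trim() !== "") {
            filter[field] = value;
        }
    });
    return filter;
}

// Illustrative input: only the country field was filled in on the form.
var exampleFilter = buildFilter(
    { companyName: "", position: "", country: "Germany" },
    ["companyName", "position", "areaOfExpertise", "zip", "country"]
);
console.log(exampleFilter); // prints { country: 'Germany' }
```

The result could then be passed as Company.find(buildFilter(req.query, fields), callback). Restricting the copy to a known list of fields also keeps unexpected query-string keys out of the filter.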