Azure DocumentDB: order by and filter by DateTime - azure

I have the following query:
SELECT * FROM c
WHERE c.DateTime >= "2017-03-20T10:07:17.9894476+01:00" AND c.DateTime <= "2017-03-22T10:07:17.9904464+01:00"
ORDER BY c.DateTime DESC
So as you can see I have a WHERE condition for a property with the type DateTimeand I want to sort my result by the same one.
The query ends with the following error:
Order-by item requires a range index to be defined on the corresponding index path.
I have absolutely no idea what this error message is about :(
Has anybody any idea?

You can also do one thing that don't require indexing explicitly. Azure documentBD is providing indexing on numbers field by default so you can store the date in long format. Because you are already converting date to string, you can also convert date to long an store, then you can implement range query.

I think I found a possible solution, thanks for pointing out the issue with the index.
As stated in the following article https://learn.microsoft.com/en-us/azure/documentdb/documentdb-working-with-dates#indexing-datetimes-for-range-queries I changed the index for the datatype string to RangeIndex, to allow range queries:
DocumentCollection collection = new DocumentCollection { Id = "orders" };
collection.IndexingPolicy = new IndexingPolicy(new RangeIndex(DataType.String) { Precision = -1 });
await client.CreateDocumentCollectionAsync("/dbs/orderdb", collection);
And it seems to work! If there are any undesired side effects I will let you know.

Related

Timeseries differencing - ArangoDB (AQL or Python)

I have a collection which holds documents, with each document having a data observation and the time that the data was captured.
e.g.
{
_key:....,
"data":26,
"timecaptured":1643488638.946702
}
where timecaptured for now is a utc timestamp.
What I want to do is get the duration between consecutive observations, with SQL I could do this with LAG for example, but with ArangoDB and AQL I am struggling to see how to do this at the database. So effectively the difference in timestamps between two documents in time order. I have a lot of data and I don't really want to pull it all into pandas.
Any help really appreciated.
Although the solution provided by CodeManX works, I prefer a different one:
FOR d IN docs
SORT d.timecaptured
WINDOW { preceding: 1 } AGGREGATE s = SUM(d.timecaptured), cnt = COUNT(1)
LET timediff = cnt == 1 ? null : d.timecaptured - (s - d.timecaptured)
RETURN timediff
We simply calculate the sum of the previous and the current document, and by subtracting the current document's timecaptured we can therefore calculate the timecaptured of the previous document. So now we can easily calculate the requested difference.
I only use the COUNT to return null for the first document (which has no predecessor). If you are fine with having a difference of zero for the first document, you can simply remove it.
However, neither approach is very straight forward or obvious. I put on my TODO list to add an APPEND aggregate function that could be used in WINDOW and COLLECT operations.
The WINDOW function doesn't give you direct access to the data in the sliding window but here is a rather clever workaround:
FOR doc IN collection
SORT doc.timecaptured
WINDOW { preceding: 1 }
AGGREGATE d = UNIQUE(KEEP(doc, "_key", "timecaptured"))
LET timediff = doc.timecaptured - d[0].timecaptured
RETURN MERGE(doc, {timediff})
The UNIQUE() function is available for window aggregations and can be used to get at the desired data (previous document). Aggregating full documents might be inefficient, so a projection should do, but remember that UNIQUE() will remove duplicate values. A document _key is unique within a collection, so we can add it to the projection to make sure that UNIQUE() doesn't remove anything.
The time difference is calculated by subtracting the previous' documents timecaptured value from the current document's one. In the case of the first record, d[0] is actually equal to the current document and the difference ends up being 0, which I think is sensible. You could also write d[-1].timecaptured - d[0].timecaptured to achieve the same. d[1].timecaptured - d[0].timecaptured on the other hand will give you the inverted timestamp for the first record because d[1] is null (no previous document) and evaluates to 0.
There is one risk: UNIQUE() may alter the order of the documents. You could use a subquery to sort by timecaptured again:
LET timediff = doc.timecaptured - (
FOR dd IN d SORT dd.timecaptured LIMIT 1 RETURN dd.timecaptured
)[0]
But it's not great for performance to use a subquery. Instead, you can use the aggregation variable d to access both documents and calculate the absolute value of the subtraction so that the order doesn't matter:
LET timediff = ABS(d[-1].timecaptured - d[0].timecaptured)

Get all collection result if null value appears in match filter in mongoose

Really don't know how to describe my problem. I tried my best. I have a collection and I want anyone to search through multiple parameters. I used the $match filter for this. here is my code.
db.product.aggregate([{$match:{lastStatus:req.query.status,name:req.query.name}}])
This is working fine. but I am taking the match values from the user via API and there is a chance that the user didn't pass these values, in that case, I return the products based on the applied match. suppose the user only passes the name, then the result comes on the basis of the name, not on lastStatus. right now this query does not work if any of the provided values is empty or undefined.
what can be the solutions
You can check the condition if any of the field has value then set the value in the match variable.
let match = {};
// MATCH STATUS
if (req.query.status) match["lastStatus"] = req.query.status;
// MATCH NAME
if (req.query.name && req.query.name.trim() != "") match["name"] = req.query.name.trim();
db.product.aggregate([{ $match: match }]);

Getting index of the resultset

Is there a way to get the index of the results within an aql query?
Something like
FOR user IN Users sort user.age DESC RETURN {id:user._id, order:{index?}}
If you want to enumerate the result set and store these numbers in an attribute order, then this is possible with the following AQL query:
LET sorted_ids = (
FOR user IN Users
SORT user.age DESC
RETURN user._key
)
FOR i IN 0..LENGTH(sorted_ids)-1
UPDATE sorted_ids[i] WITH { order: i+1 } IN Users
RETURN NEW
A subquery is used to sort users by age and return an array of document keys. Then a loop over a numeric range from the first to the last index of the that array is used to iterate over its elements, which gives you the desired order value (minus 1) as variable i. The current array element is a document key, which is used to update the user document with an order attribute.
Above query can be useful for a one-off computation of an order attribute. If your data changes a lot, then it will quickly become stale however, and you may want to move this to the client-side.
For a related discussion see AQL: Counter / enumerator
If I understand your question correctly - and feel free to correct me, this is what you're looking for:
FOR user IN Users
SORT user.age DESC
RETURN {
id: user._id,
order: user._key
}
The _key is the primary key in ArangoDB.
If however, you're looking for example data entered (in chronological order) then you will have to have to set the key on your inserts and/or create a date / time object and filter using that.
Edit:
Upon doing some research, I believe this link might be of use to you for AI the keys: https://www.arangodb.com/2013/03/auto-increment-values-in-arangodb/

How truncate time while querying documents for date comparison in Cosmos Db

I have document contains properties like this
{
"id":"1bd13f8f-b56a-48cb-9b49-7fc4d88beeac",
"name":"Sam",
"createdOnDateTime": "2018-07-23T12:47:42.6407069Z"
}
I want to query a document on basis of createdOnDateTime which is stored as string.
query e.g. -
SELECT * FROM c where c.createdOnDateTime>='2018-07-23' AND c.createdOnDateTime<='2018-07-23'
This will return all documents which are created on that day.
I am providing date value from date selector which gives only date without time so, it gives me problem while comparing date.
Is there any way to remove time from createdOnDateTime property or is there any other way to achieve this?
CosmosDB clients are storing timestamps in ISO8601 format and one of the good reasons to do so is that its lexicographical order matches the flow of time. Meaning - you can sort and compare those strings and get them ordered by time they represent.
So in this case you don't need to remove time components just modify the passed in parameters to get the result you need. If you want all entries from entire date of 2018-07-23 then you can use query:
SELECT * FROM c
WHERE c.createdOnDateTime >= '2018-07-23'
AND c.createdOnDateTime < '2018-07-24'
Please note that this query can use a RANGE index on createdOnDateTime.
Please use User Defined Function to implement your requirement, no need to update createdOnDateTime property.
UDF:
function con(date){
var myDate = new Date(date);
var month = myDate.getMonth()+1;
if(month<10){
month = "0"+month;
}
return myDate.getFullYear()+"-"+month+"-"+myDate.getDate();
}
SQL:
SELECT c.id,c.createdOnDateTime FROM c where udf.con(c.createdOnDateTime)>='2018-07-23' AND udf.con(c.createdOnDateTime)<='2018-07-23'
Output :
Hope it helps you.

Alfresco slingshot search numberFound and totalRecords number different

I'm using Alfresco 5.0.d
I see in the search json result (with firebug console panel) that in addition to result items , 2 other properties are returned : numberFound and totalRecords. It seems that Alfresco search engine considers numberFound as total items found.
So it display "numberFound results founded" to user.
The problem is that numberFound is not equals to totalRecords.
I see that totalRecords is the correct number of search result (in fact search always return totalRecords number of items).
So i decided to see in the webscript that performs search (alfresco-remote-api-5.0.d.jar\alfresco\templates\webscripts\org\alfresco\slingshot\search\search.lib.js).
We can easly see that the numberFound property comes from this statement
var rs = search.queryResultSet(queryDef);
var numberFound = rs.meta.numberFound ;
About totalRecords property, it comes from the same statement but a little bit different:
var totalRecords = rs.nodes.length
which is the correct value of number of item really found.
So is it an Alfresco api bug?
If no , is it possible that error comes from my query parameters?
Can someone explains me what does mean the numberFound property?
Thank you.
Below is the URL of java file which is getting called when you are executing search.queryResultSet(queryDef) code.
you can refer below method in the java file.It is adding all the things.
https://svn.alfresco.com/repos/alfresco-open-mirror/alfresco/HEAD/root/projects/repository/source/java/org/alfresco/repo/jscript/Search.java
public Scriptable queryResultSet() //This is java method which is getting called.
Below is the code which is written for what you are getting in result.
meta:
{
numberFound: long, // total number found in index, or -1 if not known or not supported by this resultset
facets:
{ // facets are returned for each field as requested in the SearchParameters fieldfacets
field:
{ // each field contains a map of facet to value
facet: value,
},
}
}

Resources