Firebase Firestore Query Between Two Timestamps - node.js

I want to find events that are on now or upcoming (in the next 30 days), excluding events that are already in the past.
When I run this as a cloud function, I get "Cannot have inequality filters on multiple properties". How am I meant to get this type of data?
(Ignore the fact that the date stuff is a bit messy; I am still playing around.)
// Create date 30 days in future
const searchData: Date = new Date();
searchData.setDate(searchData.getDate() + 30);
// Load data and handle empty response
const response: admin.firestore.QuerySnapshot = await admin
.firestore()
.collection(Collections.EVENTS)
.where("startDate", "<=", admin.firestore.Timestamp.fromMillis(searchData.valueOf()))
.where("endDate", ">=", admin.firestore.Timestamp.fromMillis(new Date().valueOf()))
.where("public", "==", true)
.limit(NUMBER_OF_EVENTS)
.get();
Edit:
I would like to know the data structure/query method that will allow me to return all events in the events collection that are on now or that will start in the next month. Additionally, I would like the query to exclude events that have already finished. Each document (event) has a startDate and endDate timestamp on it.

Since you have two fields to check with ranges, I'm not sure this is doable with a single query. What you can do instead is perform two queries, merge the results on the client, and perform a final filter to get the exact matches.
Make an assumption about the maximum duration for an event. Call that amount of time "X".
Query for all documents where startDate is greater than now - X, and also less than now + 30 days. Call this result set "A".
Query for all documents where endDate is greater than now, and also less than now + 30 days. Call this result set "B".
On the client, iterate all the results from A and B, checking to see if the start and end dates fit the criteria you want.
I can't think of a way to structure your data that will do this with a single query.
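The client-side merge and final filter could look roughly like the sketch below. It is only an illustration: the EventDoc shape, the id field and the mergeAndFilter name are assumptions, and the two input arrays stand in for result sets A and B after they have been read from Firestore.

```typescript
// Hypothetical event shape; only startDate and endDate come from the question.
interface EventDoc {
  id: string;
  startDate: Date;
  endDate: Date;
}

// Merge result sets A and B, drop duplicates by id, and keep only events that
// have not finished yet and that start within the look-ahead window.
function mergeAndFilter(a: EventDoc[], b: EventDoc[], now: Date, windowDays: number): EventDoc[] {
  const horizon = now.getTime() + windowDays * 24 * 60 * 60 * 1000;
  const byId = new Map<string, EventDoc>();
  for (const ev of a.concat(b)) {
    byId.set(ev.id, ev); // an event present in both sets is stored once
  }
  return Array.from(byId.values()).filter(
    (ev) => ev.endDate.getTime() >= now.getTime() && ev.startDate.getTime() <= horizon
  );
}
```

Deduplicating by document id matters because an event that is on now and ends within 30 days will appear in both A and B.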

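I know this is kind of an old thread, but the answer might be good for others.
What I ended up doing is taking a start date ($start) and a target date ($target), and limiting the query to the number of days between them. Note that this only works if there is exactly one document per day, because the limit is simply the number of days in the range.
You then do it like this: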
<?php
$start = strtotime('2021-11-22');
$target = strtotime('2022-01-01 0:00:00');
$limit = (($target - $start) / 86400);
$query = $col_data->where('date.day_start', '>=', $start);
$query = $query->limit($limit);
?>
Not bad, eh? You're welcome!

From the docs:
You can only perform range comparisons (<, <=, >, >=) on a single field, and you can include at most one array-contains or array-contains-any clause in a compound query:
citiesRef.where("state", ">=", "CA").where("state", "<=", "IN");
citiesRef.where("state", "==", "CA").where("population", ">", 1000000);
https://firebase.google.com/docs/firestore/query-data/queries

Related

Timeseries differencing - ArangoDB (AQL or Python)

I have a collection which holds documents, with each document having a data observation and the time that the data was captured.
e.g.
{
_key:....,
"data":26,
"timecaptured":1643488638.946702
}
where timecaptured for now is a utc timestamp.
What I want to do is get the duration between consecutive observations, with SQL I could do this with LAG for example, but with ArangoDB and AQL I am struggling to see how to do this at the database. So effectively the difference in timestamps between two documents in time order. I have a lot of data and I don't really want to pull it all into pandas.
Any help really appreciated.
Although the solution provided by CodeManX works, I prefer a different one:
FOR d IN docs
SORT d.timecaptured
WINDOW { preceding: 1 } AGGREGATE s = SUM(d.timecaptured), cnt = COUNT(1)
LET timediff = cnt == 1 ? null : d.timecaptured - (s - d.timecaptured)
RETURN timediff
We simply calculate the sum of the previous and the current document, and by subtracting the current document's timecaptured we can therefore calculate the timecaptured of the previous document. So now we can easily calculate the requested difference.
I only use the COUNT to return null for the first document (which has no predecessor). If you are fine with having a difference of zero for the first document, you can simply remove it.
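The arithmetic behind this query can be checked outside the database. The sketch below is not AQL and is not meant to replace the query; it only emulates a window of "previous plus current" over an already sorted array, and the consecutiveDiffs name is made up for illustration.

```typescript
// Emulates WINDOW { preceding: 1 } with SUM and COUNT over an already sorted
// list of timestamps; returns null for the first entry, as the AQL query does.
function consecutiveDiffs(sorted: number[]): (number | null)[] {
  return sorted.map((current, i) => {
    // The window holds the previous value (if any) and the current one.
    const windowValues = sorted.slice(Math.max(0, i - 1), i + 1);
    const sum = windowValues.reduce((acc, v) => acc + v, 0);
    const count = windowValues.length;
    const previous = sum - current; // recover the previous value from the sum
    return count === 1 ? null : current - previous;
  });
}
```

With fractional timestamps, going through the sum can introduce small floating-point rounding differences compared to subtracting the two values directly.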
However, neither approach is very straightforward or obvious. I put on my TODO list to add an APPEND aggregate function that could be used in WINDOW and COLLECT operations.
The WINDOW function doesn't give you direct access to the data in the sliding window but here is a rather clever workaround:
FOR doc IN collection
SORT doc.timecaptured
WINDOW { preceding: 1 }
AGGREGATE d = UNIQUE(KEEP(doc, "_key", "timecaptured"))
LET timediff = doc.timecaptured - d[0].timecaptured
RETURN MERGE(doc, {timediff})
The UNIQUE() function is available for window aggregations and can be used to get at the desired data (previous document). Aggregating full documents might be inefficient, so a projection should do, but remember that UNIQUE() will remove duplicate values. A document _key is unique within a collection, so we can add it to the projection to make sure that UNIQUE() doesn't remove anything.
The time difference is calculated by subtracting the previous document's timecaptured value from the current document's one. In the case of the first record, d[0] is actually equal to the current document and the difference ends up being 0, which I think is sensible. You could also write d[-1].timecaptured - d[0].timecaptured to achieve the same. d[1].timecaptured - d[0].timecaptured on the other hand will give you the inverted timestamp for the first record because d[1] is null (no previous document) and evaluates to 0.
There is one risk: UNIQUE() may alter the order of the documents. You could use a subquery to sort by timecaptured again:
LET timediff = doc.timecaptured - (
FOR dd IN d SORT dd.timecaptured LIMIT 1 RETURN dd.timecaptured
)[0]
But it's not great for performance to use a subquery. Instead, you can use the aggregation variable d to access both documents and calculate the absolute value of the subtraction so that the order doesn't matter:
LET timediff = ABS(d[-1].timecaptured - d[0].timecaptured)

Firebase querying by date range always returns empty array

In my database, all the documents were saved on the 23rd of this month, as the image below shows.
Even if I pass a start date of the 20th, the query returns an empty result.
Here is my query snippet:
const documents = await admin.firestore()
.collection('orders')
.orderBy('payment_type')
.startAt('2020-06-22')
.endAt('2020-06-23')
.get()
I'm new to Firebase, so I don't know what I'm missing, and the docs were not much help.
Also, I'm using Cloud Functions, if that information helps.
The values you pass to startAt and endAt have to correspond to the fields you're using for ordering. Since you're ordering on payment_type, the start and end values don't make sense: they describe a range of dates, which won't match any payment_type values. Also, it doesn't make sense to use startAt and endAt unless you are trying to do pagination, which it doesn't look like you are here.
If you're trying to do a range query between two dates and then order the results by payment_type, that's actually not possible with Firestore using the data you have now. You can't have a range filter on a field that is different from the first field you're ordering by. Note the limitation in the documentation:
If you include a filter with a range comparison (<, <=, >, >=), your
first ordering must be on the same field:
citiesRef.where("population", ">", 100000).orderBy("population")
So, you can try a range query by date:
const documents = await admin.firestore()
.collection('orders')
.where('date', '>', '2020-06-22')
.where('date', '<', '2020-06-23')
.get()
Then order the results on the client by payment_type if you want.
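Ordering on the client could be as simple as the following sketch. The Order interface and the sortByPaymentType name are illustrative assumptions; in practice the array would be built from the documents in the query snapshot.

```typescript
// Hypothetical order shape; the field names follow the question.
interface Order {
  date: string;
  payment_type: string;
}

// Plain code-unit comparison, which is enough for ISO-style date strings.
function compareStrings(p: string, q: string): number {
  return p < q ? -1 : p > q ? 1 : 0;
}

// Return a sorted copy: by payment_type first, then by date within each type.
function sortByPaymentType(orders: Order[]): Order[] {
  return orders
    .slice()
    .sort((x, y) => compareStrings(x.payment_type, y.payment_type) || compareStrings(x.date, y.date));
}
```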
You're ordering by payment_type, which isn't a date value. I suggest you order by date instead, so that startAt and endAt target the correct field.
const documents = await admin.firestore()
.collection('orders')
.orderBy('date')
.startAt('2020-06-22')
.endAt('2020-06-25')
.get()
Also, extend the value of your endAt to encompass June 24th.

How truncate time while querying documents for date comparison in Cosmos Db

I have a document that contains properties like this:
{
"id":"1bd13f8f-b56a-48cb-9b49-7fc4d88beeac",
"name":"Sam",
"createdOnDateTime": "2018-07-23T12:47:42.6407069Z"
}
I want to query documents based on createdOnDateTime, which is stored as a string.
For example:
SELECT * FROM c where c.createdOnDateTime>='2018-07-23' AND c.createdOnDateTime<='2018-07-23'
This should return all documents which were created on that day.
The date value comes from a date selector that gives only the date without the time, which causes problems when comparing dates.
Is there any way to remove time from createdOnDateTime property or is there any other way to achieve this?
CosmosDB clients are storing timestamps in ISO8601 format and one of the good reasons to do so is that its lexicographical order matches the flow of time. Meaning - you can sort and compare those strings and get them ordered by time they represent.
So in this case you don't need to remove the time components; just modify the passed-in parameters to get the result you need. If you want all entries from the entire day of 2018-07-23, you can use this query:
SELECT * FROM c
WHERE c.createdOnDateTime >= '2018-07-23'
AND c.createdOnDateTime < '2018-07-24'
Please note that this query can use a RANGE index on createdOnDateTime.
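Since the date picker only yields a date, the upper bound has to be computed. A minimal sketch of that step, assuming the input is always a YYYY-MM-DD string (the dayRange name is made up for illustration):

```typescript
// Given a date-only string such as "2018-07-23", return the half-open range
// [start, end) where end is the following day, both as "YYYY-MM-DD" strings.
function dayRange(day: string): { start: string; end: string } {
  const d = new Date(day + "T00:00:00Z"); // parse as UTC midnight
  if (isNaN(d.getTime())) {
    throw new Error("Expected a date in YYYY-MM-DD format: " + day);
  }
  d.setUTCDate(d.getUTCDate() + 1); // rolls over month and year boundaries
  return { start: day, end: d.toISOString().slice(0, 10) };
}
```

The two strings can then be passed to the query as parameters (for example @start and @end) in place of the literal dates.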
You can use a user-defined function (UDF) to implement your requirement; there is no need to update the createdOnDateTime property.
UDF:
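function con(date){
// Use the UTC getters so the result does not depend on the server's time zone
var myDate = new Date(date);
var month = myDate.getUTCMonth()+1;
if(month<10){
month = "0"+month;
}
// Zero-pad the day as well, otherwise the string comparison breaks for days 1-9
var day = myDate.getUTCDate();
if(day<10){
day = "0"+day;
}
return myDate.getUTCFullYear()+"-"+month+"-"+day;
}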
SQL:
SELECT c.id,c.createdOnDateTime FROM c where udf.con(c.createdOnDateTime)>='2018-07-23' AND udf.con(c.createdOnDateTime)<='2018-07-23'
Hope it helps you.

Couchdb - date range + multiple query parameters

I want to be able to query CouchDB between dates. I know that this can be done with startkey and endkey (it works fine), but is it possible to do a query like this, for example:
SELECT *
FROM TABLENAME
WHERE
DateTime >= '2011-04-12T00:00:00.000' AND
DateTime <= '2012-05-25T03:53:04.000'
AND
Status = 'Completed'
AND
Job_category = 'Installation'
Generally speaking, establishing indexes on multiple fields grows in complexity as the number of fields increases.
My main question is: do Status and Job_category need to be queried dynamically too? If not, your view is simple:
function (doc) {
if (doc.Status === 'Completed' && doc.Job_category === 'Installation') {
emit(doc.DateTime); // this line may change depending on how you break up and emit the datetimes
}
}
Views are fairly cheap (depending on the size of your database), so don't be afraid to establish several that cover different cases. I would expect something like Status to have a predefined list of available options, as opposed to Job_category, which seems like it could be more related to user input.
If you need those fields to be dynamic, you can just add them to the index as well:
function (doc) {
emit([ doc.Status, doc.Job_category, doc.DateTime ]);
}
Then you can use an array as your start_key. For example:
start_key=["Completed", "Installation", ...]
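To combine the two fixed fields with a date range, both the start and the end key carry the same Status and Job_category and differ only in the last element. A rough sketch of building the query string, assuming the view emits [Status, Job_category, DateTime] as above (buildRangeQuery is a made-up helper name, and startkey/endkey are the parameter names mentioned in the question):

```typescript
// Build the query string for a view whose keys are [Status, Job_category, DateTime].
// Keys must be JSON-encoded and then URL-encoded.
function buildRangeQuery(status: string, category: string, from: string, to: string): string {
  const startkey = JSON.stringify([status, category, from]);
  const endkey = JSON.stringify([status, category, to]);
  return "startkey=" + encodeURIComponent(startkey) + "&endkey=" + encodeURIComponent(endkey);
}
```

The resulting string would be appended to the view URL after a question mark.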
tl;dr: Use "static" views where you have a predetermined list of values for a given field. While it is possible to query "dynamic" views with multiple fields, the complexity grows very quickly.

CouchDB function to sample records at a given interval.

I have records with a time value and need to be able to query them for a span of time and return only records at a given interval.
For example I may need all the records from 12:00 to 1:00 in 10 minute intervals giving me 12:00, 12:10, 12:20, 12:30, ... 12:50, 01:00. The interval needs to be a parameter and it may be any time value. 15 minutes, 47 seconds, 1.4 hours.
I attempted to do this doing some kind of reduce but that is apparently the wrong place to do it.
Here is what I have come up with. Comments are welcome.
Created a view for the time field so I can query a range of times. The view outputs the id and the time.
function(doc) {
emit([doc.rec_id, doc.time], [doc._id, doc.time])
}
Then I created a list function that accepts a param called interval. In the list function I work through the rows and compare the current row's time to the last accepted time. If the span is greater than or equal to the interval, I add the row to the output and JSON-ify it.
function(head, req) {
// default to 30000ms or 30 seconds.
var interval = 30000;
// get the interval from the request.
if (req.query.interval) {
interval = req.query.interval;
}
// setup
var row;
var rows = [];
var lastTime = 0;
// go thru the results...
while (row = getRow()) {
// if the time from the view is at least the interval
// past our last accepted time then add it.
if (row.value[1] - lastTime >= interval) {
lastTime = row.value[1];
rows.push(row);
}
}
// JSON-ify!
send(JSON.stringify({'rows' : rows}));
}
So far this is working well. I will test against some large data to see how the performance is. Any comments on how this could be done better or would this be the correct way with couch?
CouchDB is relaxed. If this is working for you, then I'd say stick with it and focus on your next top priority.
One quick optimization is to try not to build up a final answer in the _list function, but rather send() little pieces of the answer as you know them. That way, your function can run on an unlimited result size.
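The idea of sending pieces as you go can be sketched as follows. This is not a drop-in _list function: nextRow and send are stand-ins for CouchDB's getRow() and send(), passed in as parameters so the logic can be run on its own, and the Row shape is an assumption based on the view in the question.

```typescript
// Assumed row shape: value is [doc id, time in ms], as emitted by the view.
interface Row {
  value: [string, number];
}

// Streams {"rows":[...]} piece by piece instead of collecting rows in memory.
function streamSampled(nextRow: () => Row | null, send: (chunk: string) => void, interval: number): void {
  let lastTime = 0;
  let first = true;
  let row: Row | null;
  send('{"rows":[');
  while ((row = nextRow()) !== null) {
    if (row.value[1] - lastTime >= interval) {
      lastTime = row.value[1];
      // Emit a comma before every accepted row except the first one.
      send((first ? "" : ",") + JSON.stringify(row));
      first = false;
    }
  }
  send("]}");
}
```

Only the last accepted time and a flag are kept between rows, so memory use stays constant regardless of how many rows the view returns.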
However, as you suspected, you are using a _list function basically to do an ad-hoc query which could be problematic as your database size grows.
I'm not 100% sure what you need, but if you are looking for documents within a time frame, there's a good chance that emit() keys should primarily sort by time. (In your example, the primary (leftmost) sort value is doc.rec_id.)
For a map function:
function(doc) {
var key = doc.time; // Just sort everything by timestamp.
emit(key, [doc._id, doc.time]);
}
That will build a map of all documents, ordered by the time timestamp. (I will assume the time value is like JSON.stringify(new Date), i.e. "2011-05-20T00:34:20.847Z".)
To find all documents within a 1-hour interval, just query the map view with ?startkey="2011-05-20T00:00:00.000Z"&endkey="2011-05-20T01:00:00.000Z".
If I understand your "interval" criteria correctly, then with 10-minute intervals and records at 00:00, 00:15, 00:30, 00:45, 00:50, only 00:00, 00:30, 00:50 should be in the final result. Therefore, you are filtering the normal couch output to cut out unwanted results. That is a perfect job for a _list function. Simply use req.query.interval and only send() the rows that match the interval.
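One way to read "match the interval" in the example above is that a row is kept when its time falls exactly on a multiple of the interval, counted from the start of the queried range. The sketch below implements that reading only; the sampleOnInterval name and the use of millisecond timestamps are assumptions.

```typescript
// Keep only timestamps (in milliseconds) that fall on a multiple of
// intervalMs counted from startMs.
function sampleOnInterval(times: number[], startMs: number, intervalMs: number): number[] {
  if (intervalMs <= 0) {
    throw new Error("intervalMs must be positive");
  }
  return times.filter((t) => t >= startMs && (t - startMs) % intervalMs === 0);
}
```

Applied to records at minutes 0, 15, 30, 45 and 50 with a 10-minute interval, this keeps minutes 0, 30 and 50, matching the example in the answer.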
