In the MongoDB docs (https://docs.mongodb.com/manual/changeStreams/) there is a quote:
The oplog must have enough history to locate the operation associated with the token or the timestamp, if the timestamp is in the past.
So it seems that it is possible to resume and get all the events that were added to the oplog from a certain time.
There is a parameter that seems to accomplish what I need:
watch([], {startAtOperationTime: ...})
https://github.com/mongodb/specifications/blob/master/source/change-streams/change-streams.rst#startatoperationtime
The parameter is a Timestamp, but I don't understand how to translate a particular date into the correct timestamp.
startAtOperationTime is a new parameter for change streams, introduced in MongoDB 4.0 and newer driver versions. It allows you to ensure that you're not missing any writes in case the stream was interrupted and you don't have access to the resume token.
One caveat of using startAtOperationTime is that your app needs to be prepared to accept that it may see a write event twice when resuming the change stream, since you're resuming from an arbitrary point in time.
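One way to cope with that caveat is to make the consumer idempotent. Below is a minimal sketch under stated assumptions: handleChange is a hypothetical helper (not part of the driver), and it relies on the fact that each change event's _id is its resume token and uniquely identifies the event.

```javascript
// Minimal idempotency sketch: remember the _ids of events already processed
// and skip duplicates delivered after a resume. handleChange is a
// hypothetical helper, not part of the MongoDB driver.
const seen = new Set();

function handleChange(event, apply) {
  const key = JSON.stringify(event._id); // _id is the event's resume token
  if (seen.has(key)) return false;       // already processed: skip duplicate
  seen.add(key);
  apply(event);
  return true;
}
```

In a real app the seen set would need to be bounded (for example, keep only recent tokens) or persisted alongside your data, so it survives restarts.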
In node, this can be done by constructing a Timestamp object and passing it into watch():
const { MongoClient, Timestamp } = require('mongodb')

async function run() {
  // uri is your MongoDB connection string
  const con = await MongoClient.connect(uri, { useNewUrlParser: true })
  const ts = new Timestamp(1, 1560812065)
  con.db('test').collection('test')
    .watch([], { startAtOperationTime: ts })
    .on('change', console.log)
}
The Timestamp object itself is created with the form of:
new Timestamp(ordinal, unix_epoch_in_seconds)
A detailed explanation can be found in BSON Timestamp.
In node, you can get the current epoch (in milliseconds) using e.g.:
(new Date).getTime()
bearing in mind that this needs to be converted to seconds for creating the Timestamp object needed for startAtOperationTime.
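Putting that together, here is a small sketch of the milliseconds-to-seconds conversion (pure JavaScript; the commented-out line assumes the driver's Timestamp class is imported as in the example above):

```javascript
// BSON timestamps count seconds since the Unix epoch, while JavaScript dates
// count milliseconds, so divide by 1000 and truncate.
const seconds = Math.floor(new Date('2019-06-17T22:54:25Z').getTime() / 1000);
console.log(seconds); // 1560812065

// With the driver's Timestamp class this would then become:
// const ts = new Timestamp(1, seconds);
```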
Recently I decided to replace arrays with priority queues for storing a user's list of jobs in MongoDB. I use NodeJS and ExpressJS for the backend. The priority queue I attempted to store is from an external package, which can be installed by running the following command in a terminal:
yarn add js-priority-queue
For some reason the priority queue works perfectly prior to storing it into MongoDB. However, the next time I attempt to take it out of MongoDB and use it, its functionality is missing. I declare its type as Schema.Types.Mixed in the Schema. Am I doing something wrong or is it not possible to store instantiated class objects into MongoDB?
As far as I know, when you store things in MongoDB they are serialized to BSON (binary JSON), which can only represent what extended JSON (EJSON) can express:
const { EJSON } = require('bson');
const test = EJSON.stringify({a: new Date(), foo:function(){console.log('foo');}})
console.log(test) // "{"a":{"$date":"2020-07-07T14:45:49.475Z"}}"
So any sort of function is lost.
I am using agenda.js in my Node project, backed by a MongoDB database, to handle batch processes we need to run. This is working well. I do have a question about timezones, however. When I use the every() operation, it seems to accept the job name and the schedule. So I have been seeding jobs to the database like so:
for (let job of dbJobs) {
await agenda.every(schedule, job.name);
}
Note that for the above, schedule is in cron format -- 00 05 * * 1-5.
This works. However, from what I can tell, every() doesn't accept an argument for repeatTimezone. So what does it do to calculate the timezone in those cases?
To clarify, when I look at the document in the database after the job has been added using every(), the repeatTimezone property exists, but its value is set to null.
Other agenda operations, like repeatEvery(), do accept an argument for timezone, like so:
job.repeatEvery('0 6 * * *', {
timezone: 'America/New_York'
});
Since I'm using every(), I have been managing this by first seeding the database using every(), and then running a Mongo updateMany() to add the timezone explicitly to all jobs:
async function addTimezoneToJobs() {
  try {
    const db = await client.db(dbName);
    await db.collection('batch_processes').updateMany({}, {
      $set: {
        repeatTimezone: 'America/New_York'
      }
    });
  } catch (error) {
    console.log(error);
  }
}
But strangely enough, agenda seems to calculate the same run time whether or not I explicitly add the repeatTimezone property value to the jobs.
What's happening here that I'm not understanding? How is the runtime calculated with every(), and is there a way to pass in timezone?
FYI: I am not in the same timezone as that which needs to be set in the db.
Your question seems to have two parts; I'm not sure I can explain it perfectly, but let me try.
First:
However, from what I can tell, every() doesn't accept an argument for repeatTimezone.
Technically you can pass a timezone option to every() as well, because internally this method calls job.repeatEvery, and as you already know you can pass a timezone to that. Two pieces of evidence support this:
From the documentation, every() accepts 4 parameters:
every(interval, name, [data], [options])
options is an optional argument that will be passed to job.repeatEvery. In order to use this argument, data must also be specified.
So you can technically pass a timezone if you pass data as well.
From the source code, you can see it calls job.repeatEvery(interval, options) internally.
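Putting the two pieces together, here is a hedged sketch of the seeding loop from the question with the options argument added. It is based only on the every(interval, name, [data], [options]) signature quoted above; the empty data object is required to reach the options argument.

```javascript
// Sketch based on the every(interval, name, [data], [options]) signature:
// pass an empty data object so the options (and its timezone) reach
// job.repeatEvery internally. seedJobs is a hypothetical helper.
async function seedJobs(agenda, dbJobs, schedule) {
  for (const job of dbJobs) {
    await agenda.every(schedule, job.name, {}, {
      timezone: 'America/New_York',
    });
  }
}
```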
Now, to your second question:
what does it do to calculate the timezone in those cases?
Agenda has a method named computeNextRunAt(). Going through the source code, I figured out that it computes when the next run of your job will be, based on the start time and interval.
Your code works because you once (initially) told the job to follow the America/New_York timezone, so every subsequent interval is calculated based on that; that's why you don't need to specify it again.
If you had never specified the timezone attribute, you would have gotten your local timezone; but since you did, the next interval is calculated based on it.
Using the node-mongodb-native npm package, in a node.js app, if I acquire a collection object early in a long-running node.js async script, like this:
var collection = await db.collection(collectionName);
If the collection gets modified before I execute the find() method of this collection object, will the results of find({}) be current, or will it only show data as it was at the time I acquired the collection object?
For example, let's hypothetically assume that 10 minutes later the script gets to a line like this:
let cursor = await collection.find({});
Additionally assume that during this lapse of time, items were added, removed and modified before find() was called.
Will the resulting cursor navigate current data or will the data be as it was at the time that I acquired the collection object (at the beginning of the script)?
I really doubt it would take a snapshot of the collection when you acquire it.
See:
https://docs.mongodb.com/manual/reference/method/db.getCollection/
Return value of find will be a cursor to the current state.
Will the resulting cursor navigate current data or will the data be as it was at the time that I acquired the collection object (at the beginning of the script)?
The resulting cursor runs through current data.
I am using Node to fetch data from MySQL. In the database, I have a record like 2013-08-13 15:44:53. But when Node fetches it from the database, it assigns a value like 2013-08-19T07:54:33.000Z.
I just need the time format as it is in the MySQL table. (By the way, my column type is DATETIME in MySQL.)
In Node :
connection.query(post, function (error, results, fields) {
  userSocket.emit('history :', {
    dataMode: 'history',
    msg: results,
  });
});
When retrieving it from the database you most likely get a Date object which is exactly what you should work with (strings are only good to display dates, but working on a string representation of a date is nothing you want to do).
If you need a certain string representation, create it based on the data stored in the Date object - or even better, get some library that adds a proper strftime-like method to its prototype.
The best choice for such a library is moment.js which allows you to do this to get the string format you want:
moment('2013-08-19T07:54:33.000Z').format('YYYY-MM-DD hh:mm:ss')
// output (in my case on a system using UTC+2 as its local timezone):
// "2013-08-19 09:54:33"
However, when sending it through a socket (which requires a string representation) it's a good idea to use the default one since you can pass it to the new Date(..) constructor on the client side and get a proper Date object again.
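To illustrate that last point, here is a small sketch of the round-trip (plain Node, no libraries): a Date serialized for the socket becomes an ISO-8601 string, which the client turns back into a proper Date.

```javascript
// A Date serialized with JSON.stringify becomes an ISO-8601 string (via its
// toJSON method); the client can revive it with the Date constructor.
const row = { created_at: new Date('2013-08-19T07:54:33.000Z') };
const wire = JSON.stringify(row);             // what gets sent over the socket
const parsed = JSON.parse(wire);              // on the client: a plain string
const revived = new Date(parsed.created_at);  // a proper Date object again

console.log(parsed.created_at);                              // '2013-08-19T07:54:33.000Z'
console.log(revived.getTime() === row.created_at.getTime()); // true
```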
I want to create a "prepared statement" in postgres using the node-postgres module. I want to create it without binding it to parameters because the binding will take place in a loop.
In the documentation I read:
query(object config, optional function callback) : Query
If _text_ and _name_ are provided within the config, the query will result in the creation of a prepared statement.
I tried
client.query({"name":"mystatement", "text":"select id from mytable where id=$1"});
but when I try passing only the text & name keys in the config object, I get an exception :
(translated) message is binding 0 parameters but the prepared statement expects 1
Is there something I am missing? How do you create/prepare a statement without binding it to specific values, to avoid re-preparing the statement in every step of a loop?
I just found an answer on this issue by the author of node-postgres.
With node-postgres the first time you issue a named query it is
parsed, bound, and executed all at once. Every subsequent query issued
on the same connection with the same name will automatically skip the
"parse" step and only rebind and execute the already planned query.
Currently node-postgres does not support a way to create a named,
prepared query and not execute the query. This feature is supported
within libpq and the client/server protocol (used by the pure
javascript bindings), but I've not directly exposed it in the API. I
thought it would add complexity to the API without any real benefit.
Since named statements are bound to the client in which they are
created, if the client is disconnected and reconnected or a different
client is returned from the client pool, the named statement will no
longer work (it requires a re-parsing).
You can use pg-prepared for that:
var prep = require('pg-prepared')
// First prepare the statement without binding parameters
var item = prep('select id from mytable where id=${id}')
// Then execute the query, binding parameters in a loop
for (const id of [1, 2, 3]) {
  client.query(item({ id: id }), function (err, result) { /* ... */ })
}
Update: Reading your question again, here's what I believe you need to do: you need to pass a "values" array as well.
Just to clarify: where you would normally "prepare" your query, just build the config object you pass to it, without the values array. Then, where you would normally "execute" the query, set the values array on the object and pass it to query(). The first time around, the driver will do the actual prepare for you, and simply bind and execute for the rest of the iterations.
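A hedged sketch of that pattern follows. stmt and bind are illustrative names (not part of node-postgres), and client is assumed to be a connected node-postgres client, so the query call itself is left commented out.

```javascript
// Build the named-query config once, without values...
const stmt = {
  name: 'fetch-by-id',
  text: 'select id from mytable where id = $1',
};

// ...then attach a fresh values array at each "execute" site. The driver
// parses the named statement on the first call and only binds/executes
// on subsequent calls with the same name.
function bind(statement, values) {
  return { ...statement, values };
}

// In the loop (client is assumed to be a connected node-postgres client):
// for (const id of [1, 2, 3]) {
//   client.query(bind(stmt, [id]), function (err, result) { /* ... */ });
// }
```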