MongoDB, Mongoose - dynamically set TTL time

I am referring to the question "Time to live in mongodb, mongoose dont work. Documents doesnt get deleted" to ask my own:
Is it possible to set the TTL for MongoDB documents dynamically?
Suppose you have a token collection that you want to use for different purposes. In that case it would be nice to set a specific TTL for each token at the moment it is created.
If this is possible, could you please provide a code snippet?

To set a TTL per document you can use the same kind of index, but on a separate date field such as expireAt in the schema:
expireAt: {
  type: Date,
  default: null,
}
Then create an index like this (example for mongoose):
schema.index({ expireAt: 1 }, { expireAfterSeconds: 0 });
Now, for every document that you want to expire, set expireAt to the exact datetime at which it should be removed. Documents whose expireAt is left at the default null won't expire.
You can see the same example here in MongoDB Docs.
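As a minimal sketch of the per-token expiry described above, the helper below computes the expireAt value to store on each document. The purpose names and lifetimes in TTL_SECONDS are made up for illustration, and the Token model and its value field mentioned in the comments are hypothetical; only the date arithmetic actually runs here.

```javascript
// Hypothetical lifetimes per token purpose, in seconds (illustrative values).
const TTL_SECONDS = {
  emailVerification: 24 * 60 * 60, // 1 day
  passwordReset: 60 * 60,          // 1 hour
};

// Returns the Date at which a token created at `now` should expire,
// or null for purposes with no configured lifetime (such documents never expire).
function expireAtFor(purpose, now = new Date()) {
  if (!Object.prototype.hasOwnProperty.call(TTL_SECONDS, purpose)) return null;
  return new Date(now.getTime() + TTL_SECONDS[purpose] * 1000);
}

// With a Mongoose model built from the schema above (assumed here to be named Token),
// the value would be stored on creation, for example:
//   await Token.create({ value: '...', expireAt: expireAtFor('passwordReset') });
const created = new Date('2020-01-01T00:00:00.000Z');
console.log(expireAtFor('passwordReset', created).toISOString());
console.log(expireAtFor('apiKey', created));
```

Note that MongoDB removes expired documents with a background task that runs about once a minute, so a document can outlive its expireAt by a short time.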

If you define a TTL index on a collection, MongoDB will periodically remove old documents from the collection.
db.events.createIndex({ time: 1 }, { expireAfterSeconds: 3600 })
It uses the index to handle TTL. The expireAfterSeconds value is fixed for the whole index, so with an index like this there is no way to define a different lifetime for each document. In your scenario I recommend using a messaging system like RabbitMQ along with MongoDB:
https://www.rabbitmq.com/ttl.html

Related

Trying to understand mongodb indexes for finding documents with exact and unique value(s)

I am reading through the MongoDB docs for the Node.js driver, particularly this index section https://www.mongodb.com/docs/drivers/node/current/fundamentals/indexes/#geospatial-indexes and it looks like all of the indexes they mention are for sortable / searchable data. So I wanted to ask whether I need indexes for the following use case:
I have this user document structure
{
email: string,
version: number,
otherData: ...
}
As far as I understand, I can query each user by _id, and this already has a default unique index applied to it? I also want to query users by email, so I created the following unique index:
collection.createIndex({ email: 1 }, { unique: true })
Is my understanding correct that by creating this index I guarantee that:
Email is always unique
My queries like collection.findOne({email: 'my@email.com'}) are optimised?
Next, I want to perform update operations on user documents, but only on specific versions, so:
collection.updateOne({email: '...', version: 2}, update)
What index do I need to create in order to optimise this query? Should I be somehow looking into compound indexes for this as I am now using email and version?

Yes, the unique constraint is enforced at the db layer, so by definition the email will be unique. It is worth mentioning that this can affect insert/update performance, as the check has to be executed on each of these operations. From my experience you only start feeling this overhead at larger scale (hundreds of millions of documents in a single collection plus thousands of inserts per minute).
Yes, there is no other way to optimize this further.
What index do I need to create in order to optimise this query? Should I be somehow looking into compound indexes for this as I am now using email and version?
You want to create a compound index; the syntax looks like this:
collection.createIndex({ email: 1, version: 1 }, { unique: true })
I will just say that the (first) email index already ensures uniqueness, so any additional filtering you add to the query and index will not really affect anything, as there will always be only one document with a given email in the DB. Basically, why bother adding a "version" field to the query? If you need it for filtering, that's fine, but then you won't need to alter the existing index.

How to add timestamps to existing records in MongoDB

I have a collection 'users' in MongoDB that contains multiple records without timestamps. I am using that collection with my Node application and have set timestamps to true as shown:
const userSchema = new Schema({
  ...
}, {
  timestamps: true
});
I want to apply timestamps to the existing records and use them with my Node application in future. If I add new fields 'createdAt' and 'updatedAt', will they work with my Mongoose schema? Or if there is an alternative way to achieve this, please enlighten me, as I am new to Node and MongoDB in general.

First of all, I think this option cannot affect the documents that already exist in the database, because these fields are only written when you insert or update documents through Mongoose.
In MongoDB, everything is just a document, and by default MongoDB does not care what data you store inside a collection; there is no validation here. Mongoose comes in to handle that validation and so on. If you change a schema, it only affects requests from now on. Be careful, though: if conflicting fields exist, you will get an error when reading the collection.
In short: MongoDB does not know when data was stored or edited,
but you can get the creation timestamp from the MongoDB ObjectId:
https://docs.mongodb.com/manual/reference/method/ObjectId.getTimestamp/index.html
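To illustrate the ObjectId approach: the first 4 bytes of an ObjectId encode its creation time in seconds since the Unix epoch, so a createdAt value for existing records can be derived from _id. The sketch below works on the hex string form without any driver; the function name is an assumption, and the sample id is only an example value.

```javascript
// Derives the creation time from the 24-character hex form of an ObjectId.
// The first 8 hex characters (4 bytes) are a big-endian Unix timestamp in seconds.
function objectIdToDate(hexId) {
  if (!/^[0-9a-fA-F]{24}$/.test(hexId)) {
    throw new TypeError('expected a 24-character hex ObjectId string');
  }
  const seconds = parseInt(hexId.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

// Example value only; any existing document's _id string would do.
const createdAt = objectIdToDate('5d2399b83e9148db977859ea');
console.log(createdAt.toISOString());
```

A one-off backfill could loop over the existing users and set createdAt (and, lacking better information, updatedAt) from this value; how you run that loop depends on your setup.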

How to expose MongoDB documents primary keys in a REST API?

I am building a REST API with MongoDB + nodeJS. All the documents are stored and are using _id as the primary key. I've read here that we should not expose the _id and we should use another ID which is not incremental.
In the DB, a document is represented as:
{
  _id: ObjectId("5d2399b83e9148db977859ea"),
  bookName: "My book"
}
For the following endpoints, how should the documents be exposed?
GET /books
GET /books/{bookId}
Currently my API returns:
{
  _id: "5d2399b83e9148db977859ea",
  bookName: "My book"
}
but should it instead return something like:
{
  id: "some-unique-id-generated-on-creation",
  bookName: "My book"
}
Questions
Should I expose the _id so that one can make queries such as:
GET /books/5d2399b83e9148db977859ea
Should I use a UUID for my ID instead of ObjectId?
Should I keep the internal _id (but never expose it) and create another attribute id which would use UUID or another custom generated ID ?
Is it a good practice to work with _id in my backend or should I only make queries using my own custom ID? Example: find({ id: }) instead of find({ _id: })

To answer your questions.
You can expose _id so that authenticated users can make requests such as GET, PUT and PATCH on that _id.
MongoDB allows you to generate your own BSON ID and use it, instead of letting MongoDB create its own _id during the insert.
There is no need to duplicate logic. The main purpose of _id is to identify each document uniquely, and having two id fields means you are storing redundant data. Follow the DRY (Don't Repeat Yourself) principle wherever possible.
It's not a bad practice to work with _id in your backend.
Hope this helps!

Given you're using Mongoose, you can use 'virtuals', which are essentially fake fields that Mongoose creates. They're not stored in the DB, they just get populated at run time:
// Duplicate the ID field.
Schema.virtual('id').get(function () {
  return this._id.toHexString();
});
// Ensure virtual fields are serialised.
Schema.set('toJSON', {
  virtuals: true
});
Any time toJSON is called on the Model you create from this Schema, it will include an 'id' field that matches the _id field Mongo generates. Likewise you can set the behaviour for toObject in the same way.
You can refer to the following docs:
1) https://mongoosejs.com/docs/api.html
2) toObject method
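If you would rather not depend on serialization settings, the same renaming can be done by hand at the API boundary. This is a plain JavaScript sketch, not Mongoose API: toApiBook is a hypothetical helper, and the input is assumed to be a plain object whose _id can be converted with String().

```javascript
// Maps a stored book document to the shape returned by the REST API:
// `_id` is exposed as a string `id`, and the internal `_id` and `__v` keys are dropped.
function toApiBook(doc) {
  const { _id, __v, ...rest } = doc;
  return { id: String(_id), ...rest };
}

const stored = { _id: '5d2399b83e9148db977859ea', __v: 0, bookName: 'My book' };
console.log(toApiBook(stored));
```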

In my case, security risk or not, my _id is a concatenation of the fields in my document that are semantically considered keys. For example, if First Name, Last Name, and Email are my identifier and a fourth field such as Age is an attribute, then _id is the concatenation of those three fields. It is not difficult to get and update such a record as long as I have the First Name, Last Name, and Email available.

How to automatically update date fields in mongoDB collection on insert/update documents?

I'm using MongoDB v3.6.3 with PyMongo.
Here's my document structure:
{
"process_id": number,
"created_dttm": date,
"updated_dttm": date
}
I want to do two things:
Whenever a new document is inserted, created_dttm and updated_dttm should have the current system date.
Whenever an existing document is updated, updated_dttm should be set to the current system date at that time.
I have done this using MongoEngine models by overriding the save() and update() methods.
Is there any other way to do this using PyMongo, other than explicitly handling it programmatically on insert/update?

Unfortunately this doesn't come out of the box with mongodb/pymongo. The only thing you get is that, if you use ObjectIds as primary keys for your documents, you can extract the creation timestamp from them with
from bson import ObjectId
oid = ObjectId()
oid.generation_time  # is a datetime.datetime
For the update timestamps, you'll need to handle that in your application code. There are usually two ways of doing this: either you emit and store audit events in a separate collection, or you wrap your update method and modify a last_update_timestamp every time it is called.
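The second option, wrapping the update, can be sketched as a function that adds the timestamp fields to an update document before it is sent. It is written in JavaScript for consistency with the other examples on this page; the resulting document uses the standard $set and $setOnInsert operators, so the same shape can be built as a dict in PyMongo. The helper name and the status field are assumptions for illustration.

```javascript
// Returns a copy of `update` that also sets updated_dttm on every write and
// created_dttm only when an upsert inserts a new document.
function withTimestamps(update, now = new Date()) {
  return {
    ...update,
    $set: { ...(update.$set || {}), updated_dttm: now },
    $setOnInsert: { ...(update.$setOnInsert || {}), created_dttm: now },
  };
}

const now = new Date('2020-01-01T00:00:00.000Z');
const update = withTimestamps({ $set: { status: 'done' } }, now);
console.log(JSON.stringify(update));
```

The $setOnInsert part only takes effect when the update is run with the upsert option; for plain updates of existing documents it is ignored.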

Sub documents vs Mongoose population

I have the following scenario:
A user can log in to a website. A user can add/delete a poll (a question with two options). Any user can give their opinion on a poll by selecting any one of the options.
Considering the above scenario, I have three models: Users, Polls, and Options. They are as follows, in order of dependency:
Option Schema
var optionSchema = new Schema({
  optionName : {
    type : String,
    required : true,
  },
  optionCount : {
    type : Number,
    default : 0
  }
});
Poll Schema
var pollSchema = new Schema({
  question : {
    type : String,
    required : true
  },
  options : [optionSchema]
});
User Schema: parent schema
var usersSchema = new Schema({
  username : {
    type : String,
    required : true
  },
  email : {
    type : String,
    required : true,
    unique : true
  },
  password : String,
  polls : [pollSchema]
});
How do I implement the above relation between those documents? What exactly is Mongoose population? How is it different from subdocuments? Should I go for subdocuments, or should I use Mongoose population?

MongoDB does not have joins the way relational databases do, so population is something like a hidden join. It means that when you have the User model and you populate the Poll model, Mongoose will do something like this:
fetch User
fetch related Polls, by ObjectIds which are stored in User document
put fetched Polls documents into User document
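The three steps above can be sketched with plain in-memory data. This is not Mongoose's implementation, only an illustration of the idea; the sample ids, the sample questions and the populatePolls name are made up.

```javascript
// Polls stored separately from users, keyed by id (stands in for a polls collection).
const pollsById = new Map([
  ['p1', { _id: 'p1', question: 'Tabs or spaces?' }],
  ['p2', { _id: 'p2', question: 'Vim or Emacs?' }],
]);

// Step 1: the fetched user document holds only references to its polls.
const user = { username: 'alice', polls: ['p1', 'p2'] };

// Steps 2 and 3: look up each referenced poll and put the documents in place of the ids.
// References that no longer resolve to a poll are dropped.
function populatePolls(userDoc, polls) {
  return {
    ...userDoc,
    polls: userDoc.polls.map((id) => polls.get(id)).filter((p) => p !== undefined),
  };
}

console.log(populatePolls(user, pollsById));
```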
When you make User the document and Polls a subdocument, it simply means that you put all the data in a single document. On one hand, this means that to fetch a user's polls Mongoose doesn't need to run two queries (it only needs to fetch the User document, because the Polls data is already there).
But which is better to choose? It depends on the case.
If your Polls documents will be referenced from other documents (you need access to Polls from documents User, A, B, C), it could be better to populate, but not for sure. The advantage of populating is that when you need to change some Polls fields, you don't need to change that data in every document that refers to the Polls document (as you would if it were a subdocument); instead of updating User, A, B, C, you only update the Polls document. As you can see, that's nice. I said it's not certain that populating will be better in that case because I don't know how you need to retrieve your Polls data. If you store your data the wrong way, you will get performance issues or have trouble fetching the data easily.
Subdocuments are the basic way of storing data. They are great when Polls are only referenced by User. There is a performance advantage, since Mongoose needs one query instead of the two used by population, and the update disadvantage mentioned earlier does not apply, because you store the Polls data in a single place and there is no need to update other documents.
Basically, MongoDB was designed mostly around subdocuments; after all, it is a non-relational database. So in most cases I prefer to use subdocuments. I can't say which way will be better in your case, because I don't know what your full DB looks like or how you want to retrieve your data.
There is some useful info in the official documentation:
http://mongoosejs.com/docs/subdocs.html
http://mongoosejs.com/docs/populate.html
Take a look at that.
Edit
As I prefer to fetch data easily, care about performance, and know that data redundancy is common in MongoDB, I would choose to store this data as subdocuments.
