CouchDB: how to get the timestamp of the last modified database?

Does anyone know how to get the last timestamp that a specific database was modified?
The _changes API does not provide that information. Thank you.
UPDATE
How can I retrieve the last date/time that the database had a new document inserted or an existing one modified?

CouchDB does not record the time that each change occurred, so if you need this functionality you need to add a timestamp to each document, e.g.
{
  "_id": "myid",
  "name": "Bob",
  "email": "bob@aol.com",
  "timestamp": 1657614546263
}
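A minimal sketch of such a write, assuming the nano client and a local database named mydb (both are assumptions, not part of the original answer):
// a minimal sketch, assuming the "nano" CouchDB client and a local
// database named "mydb" (both assumptions, not from the answer)
const nano = require("nano")("http://admin:password@localhost:5984");
const db = nano.db.use("mydb");

async function saveWithTimestamp(doc) {
  // stamp the document at write time with epoch milliseconds
  doc.timestamp = Date.now();
  return db.insert(doc);
}

saveWithTimestamp({ _id: "myid", name: "Bob", email: "bob@aol.com" });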
Then a MapReduce view will allow you to query documents by timestamp:
function (doc) {
  emit(doc.timestamp);
}
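For context, that map function is stored inside a design document; a minimal sketch (with names chosen to match the query URL below) might look like this:
{
  "_id": "_design/myview",
  "views": {
    "myview": {
      "map": "function (doc) { emit(doc.timestamp); }"
    }
  }
}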
To get the latest change, you would query the resulting view with ?descending=true&limit=1 to fetch the most recently modified document:
GET /mydb/_design/myview/_view/myview?descending=true&limit=1&include_docs=true
Alternatively, you can use a document id that has a timestamp encoded within it. See this blog post, which shows how documents with time-sortable ids make it easy to query for the most recently added documents.
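As a rough illustration of that idea (this helper is hypothetical, not from the blog post), an id that sorts by time can be built from a timestamp prefix:
// a hypothetical helper: a zero-padded timestamp prefix keeps _id
// order aligned with insertion time; the random suffix avoids collisions
function timeSortableId() {
  const ts = String(Date.now()).padStart(14, "0");
  const rand = Math.random().toString(36).slice(2, 8);
  return `${ts}-${rand}`;
}

// with such ids, the newest document is simply
// GET /mydb/_all_docs?descending=true&limit=1&include_docs=true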

Related

Find differences between current document and previous revision

Is there a way to determine what changes were made to a document? Here is a document and a later revision of it:
{
  "_id": "panel100000",
  "_rev": "1-b4f55d0e03fbfaef0822a0607d5d6ad0",
  "name": "Maya Jambalaya",
  "maritalstatus": "Married",
  "employed": "Full time",
  "education": "College graduate"
}
{
  "_id": "panel100000",
  "_rev": "2-caab684a341da5185546a028cfb5b0d9",
  "name": "Maya Papaya",
  "maritalstatus": "Married",
  "employed": "Full time",
  "education": "College graduate"
}
In this example, the name field has changed between revisions. Is there a way to find the changes between a document and its previous revisions?
1. Is there anything built-in that does or could track such changes?
2. Is it possible to access a document's revisions from a design document?
3. If the answer to #2 is "yes", does anyone have a template of a design document with which to compare them?
1. No. If you want to track changes, you would probably need a data model adapted for that purpose. Otherwise, CouchDB keeps revisions of documents and you can query them to compute the diff manually; however, there is no guarantee that old revisions will not be removed by compaction.
2. No. Design documents are built with the latest revision of each document.
...
If you want to be sure to keep every change to a document, you would need to create a document for each change. Those change documents could be grouped by a uniqueId, and you could use a map/reduce view to get the latest value of a document (a minimal sketch follows). The diff would still need to be computed manually, though. The advantage is that you can easily get the state of a document at a given point in time.
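A minimal sketch of that view, assuming each change document carries the shared uniqueId and a changedAt timestamp (the field names are illustrative, not from the answer):
// map: key each change document by [uniqueId, changedAt] so the
// changes for one logical document sort together in time order
function (doc) {
  emit([doc.uniqueId, doc.changedAt], null);
}

// querying with descending=true, startkey=["panel100000", {}],
// endkey=["panel100000"], limit=1, include_docs=true returns the
// latest state of the logical document "panel100000"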

Why are there two ways to update documents?

As a CouchDB beginner I'm having a hard time understanding how to update documents.
When I read the docs I find the following, which is quite confusing to me:
1) Updating an Existing Document
To update an existing document you must specify the current revision number within the _rev parameter.
Source: Chapter 10.4.1 /db/doc
2) Update Functions
Update handlers are functions that clients can request to invoke server-side logic that will create or update a document.
Source: Chapter 6.1.4 Design Documents
Could you please tell me which way you prefer to update your documents?
Edit 1:
Let's say the data structure is just a simple car document with some basic fields.
{
  "_id": "123",
  "name": "911",
  "brand": "Porsche",
  "maxHP": "100",
  "owner": "Lorna"
}
Now the owner changes: would you still use option 1? Option 1 has quite a downside, because I can't just edit one field. I need to retrieve the whole document first, edit just the owner field, and then send the whole document back. I just tried it and found it quite long-winded. Hmmm...
Most of the time you want option 1, "Updating an Existing Document"; this operates on a regular document that stores data in the database. The other option relates to design documents, such as views (which are also documents, which is admittedly confusing for new CouchDB users), and is something completely different.
Stick with option 1, and good luck :)
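To make option 1 concrete, here is a minimal sketch of the read-modify-write cycle over plain HTTP (the database name, credentials, and fetch-based client are assumptions, not part of the answer):
// a minimal sketch of option 1, assuming a local CouchDB and a
// database named "cars"; fetch the doc, change one field, PUT it back
const base = "http://admin:password@localhost:5984/cars";

async function changeOwner(id, newOwner) {
  // GET the current document (includes the current _rev)
  const doc = await (await fetch(`${base}/${id}`)).json();
  doc.owner = newOwner;
  // PUT the whole document back; the embedded _rev must match,
  // otherwise CouchDB responds with a 409 conflict
  const res = await fetch(`${base}/${id}`, {
    method: "PUT",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(doc),
  });
  return res.json();
}

changeOwner("123", "Pauline");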

How can I reduce a collection by keeping only change points?

I have a collection like the example below. The data is pulled from an endpoint by a cron job every twenty minutes.
{"id":AFFD6,"empty":8,"capacity":15,"ready":6,"t":1474370406,"_id":"kROabyTIQ5eNoIf1"}
{"id":AFFD6,"empty":9,"capacity":15,"ready":5,"t":1474116005,"_id":"kX0DpoZ5fkMr2ezg"}
{"id":AFFD6,"empty":9,"capacity":15,"ready":5,"t":1474684808,"_id":"ken1WRN47PTW159H"}
{"id":AFFD6,"empty":9,"capacity":15,"ready":5,"t":1474117205,"_id":"kes1gDlG1sBjgV1R"}
{"id":AFFD6,"empty":10,"capacity":15,"ready":4,"t":1474264806,"_id":"khILUjzGEPOn0c2P"}
{"id":AFFD6,"empty":9,"capacity":15,"ready":5,"t":1474275606,"_id":"ko9r8u860es7E2hI"}
{"id":AFFD6,"empty":9,"capacity":15,"ready":5,"t":1474591207,"_id":"kpLS6mCtkIiffTrN"}
I want to discard any document (row) that doesn't show a change in empty (and, consequently, ready). My goal is to find the most recent timestamp at which these values changed within this collection.
Better illustrated, I want to reduce it to the rows where the values change, like so:
{"id":AFFD6,"empty":8,"capacity":15,"ready":6,"t":1474370406,"_id":"kROabyTIQ5eNoIf1"}
{"id":AFFD6,"empty":9,"capacity":15,"ready":5,"t":1474117205,"_id":"kes1gDlG1sBjgV1R"}
{"id":AFFD6,"empty":10,"capacity":15,"ready":4,"t":1474264806,"_id":"khILUjzGEPOn0c2P"}
{"id":AFFD6,"empty":9,"capacity":15,"ready":5,"t":1474591207,"_id":"kpLS6mCtkIiffTrN"}
Can I do this in a MongoDB query? Or am I better off with a JavaScript filter function?
MongoDB allows you to specify a unique constraint on an index. These constraints prevent applications from inserting documents that have duplicate values for the indexed fields.
Use the following code to make the field unique:
db.collection.createIndex( { "id": 1 }, { unique: true } )
Also refer to the MongoDB documentation for more clarification.
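As for the JavaScript-filter option raised in the question, here is a rough sketch (which rows are kept depends on how ties and sort order are handled):
// sort by the t timestamp, then keep only the documents whose
// "empty" value differs from the previous document's, i.e. the
// change points; the last element is then the most recent change
function changePoints(docs) {
  const sorted = [...docs].sort((a, b) => a.t - b.t);
  return sorted.filter(
    (doc, i) => i === 0 || doc.empty !== sorted[i - 1].empty
  );
}

// const latest = changePoints(rows).pop();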

MongoDB: big data structure

I'm rebuilding my website, which is a search engine for nicknames from the most active forum in France: you search for a nickname and you get all of its messages.
My current database contains more than 60 GB of data, stored in a MySQL database. I'm now rewriting it into a MongoDB database, and after importing 1 million messages (1 message = 1 document), find() started to take a while.
The structure of a document is as follows:
{
  "_id": ObjectId(),
  "message": "<p>Hai guys</p>",
  "pseudo": "mahnickname",            // the nickname ("pseudo" in my db)
  "ancre": "774497928",               // its id in the forum
  "datepost": "30/11/2015 20:57:44"
}
I set the id ancre as unique, so I don't get the same entry twice.
Then the user enters a nickname and the site finds all documents that have that nickname.
Here is the request:
Model.find({pseudo: "danickname"}).sort('-datepost').skip((r_page -1) * 20).limit(20).exec(function(err, bears)...
Should I structure it differently? Instead of having one document for each message, I'm having a document for each nickname and I update the document once I get a new message from that nickname?
I was using the first method with MySQL and it wasn't taking that long.
Edit: Or maybe should I just index the nicknames (pseudo)?
Thanks!
Here are some recommendations for your problem about big data:
The ObjectId already contains a timestamp, and you can sort on it, so you could save some disk space by removing the datepost field.
Do you absolutely need the ancre field? The ObjectId is already unique and indexed. If you absolutely need it, and need to keep datepost separate too, you could replace the _id field with your ancre field.
As many mentioned, you should add an index on pseudo. This will make the "get all messages where the pseudo is mahnickname" search much faster.
If the amount of messages per user is low, you could store all of them inside a single Document per user. This would avoid having to skip to a specific page, which can be slow. However, be aware of the 16mb limit. I would personally still have them in multiple documents.
To keep fast query speeds, ensure that all your indexed fields fit in RAM. You can see the RAM consumption of indexed fields by typing db.collection.stats() and looking at the indexSizes sub-document.
Could you avoid skipping documents altogether and instead page by the time a message was written to the database? If so, use the datepost field or the timestamp in _id for your paging strategy. If you decide on datepost, create a compound index on pseudo and datepost (as sketched below).
As for your benchmarks, you can closely monitor MongoDB by using mongotop and mongostat.
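Putting the indexing and paging advice into a sketch (mongo-shell syntax; the collection name "messages" is an assumption):
// index the nickname for fast lookups, plus a compound index for
// nickname + newest-first paging
db.messages.createIndex({ pseudo: 1 })
db.messages.createIndex({ pseudo: 1, datepost: -1 })

// range-based paging instead of skip(): use the last datepost seen
// on the previous page as a cursor (assumes datepost is stored in a
// sortable form, e.g. a Date or an ISO string)
db.messages.find({ pseudo: "mahnickname", datepost: { $lt: lastSeen } })
           .sort({ datepost: -1 })
           .limit(20)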

How to delete a document in DocumentDB using C# by querying on a timestamp

I have a document with data in the following format:
[
  {
    "TimeStamp": "3/18/2015 7:57:21 PM"
  }
]
I need to query this document by the date only (3/18/2015) and delete it using C#. The time portion of the TimeStamp (7:57:21 PM) should not be part of the query. How can I do it?
If there are multiple documents with the same date, I need to delete all of them.
Thanks,
Naveen
Today you need to know a document's _self link to delete it.
So do a query for the document(s) based on any criteria you like, including the timestamp (there is no reason not to use it; it's there for exactly this purpose). Then execute a delete against each document found.
If you want to do this in a batch to minimize server round trips, consider doing it in a stored procedure, similar to our published BulkImport.js script.
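A rough sketch of such a stored procedure in DocumentDB's server-side JavaScript (the procedure name, parameter, and the STARTSWITH-on-a-date-prefix approach are our assumptions, not the BulkImport.js script itself):
// delete every document whose TimeStamp string begins with the given
// date prefix, e.g. "3/18/2015"
function bulkDeleteByDate(datePrefix) {
  var collection = getContext().getCollection();
  var response = getContext().getResponse();
  var query = 'SELECT * FROM c WHERE STARTSWITH(c.TimeStamp, "' + datePrefix + '")';

  collection.queryDocuments(collection.getSelfLink(), query, {}, function (err, docs) {
    if (err) throw new Error(err.message);
    docs.forEach(function (doc) {
      // the _self link identifies the document to delete
      collection.deleteDocument(doc._self, {}, function (err) {
        if (err) throw new Error(err.message);
      });
    });
    response.setBody(docs.length + " delete(s) issued");
  });
}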
