Use linux timestamp in CouchDB map function - couchdb

Trying to update an existing CouchDB map function so that it only returns docs created in the past 24 hours.
The current map is very simple:
function(doc) {
  if (doc.email && doc.type == 'user')
    emit(doc.email, doc);
}
I'd like to get the current linux timestamp value and compare that to the creationTime.unix value stored in the doc.
Is that possible?
N.B. I'm building the view in Futon.

I do not know whether you can do that, but if you can, it would be very bad for the sanity of your CouchDB database.
A map function should always emit the same values for the same document each time it is invoked (provided that the document has not changed in the meantime). This is important because CouchDB stores the emitted data in the index and does not recalculate it until necessary. If map functions could emit different values for the same doc, the index would be unusable.
So, no, do not try that.
The good news is that you can easily achieve what you need without that. If you emit the creation time as the key, then you can query your view for just the docs whose creation time falls in a certain interval, as in:
/blog/_design/docs/_view/by_date?startkey="2010/01/01 00:00:00"&endkey="2010/02/00 00:00:00"
Read more about how you can query your views in CouchDB: The Definitive Guide.
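Applied to the documents in the question, this could look roughly like the sketch below. It assumes that creationTime.unix holds seconds since the epoch (the question does not say), and the database name mydb, the design document users and the view name by_creation are made up for illustration. The emit stub is only there so the map function can be run outside CouchDB.

```javascript
// Stand-in for CouchDB's built-in emit(), only so the sketch runs in Node.
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Map function: depends only on the document, never on the current time.
// Only this function would go into the design document.
const byCreation = function (doc) {
  if (doc.email && doc.type == 'user' && doc.creationTime) {
    emit(doc.creationTime.unix, doc.email);
  }
};

// The "past 24 hours" cutoff is computed by the client at query time.
// Assumption: creationTime.unix is in seconds, not milliseconds.
function last24hQuery(nowMs) {
  const cutoff = Math.floor(nowMs / 1000) - 24 * 60 * 60;
  // Hypothetical database, design document and view names.
  return '/mydb/_design/users/_view/by_creation?startkey=' + cutoff;
}

byCreation({ type: 'user', email: 'a@example.com', creationTime: { unix: 1357390271 } });
console.log(rows);
console.log(last24hQuery(Date.now()));
```

The index stays valid because the map output never changes for an unchanged document; only the startkey sent by the client moves with the clock.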

Related

Couchdb View does not re-index

I have a view which looks like this:
function(doc) {
  if (doc.type === 'article' && (Date.parse(doc.published) < (Date.now() - 30 * 60 * 1000))) {
    emit(doc._id, doc._rev);
  }
}
The view basically emits articles that are stale, i.e. {{published date}} < {{present - 30 minutes}}.
Now, the issue is as follows: The view does not update itself after the first read. The first access builds the view on all documents as expected. But thereafter it seems like it only updates itself on change (delete, create or update of new documents).
This is however an issue and not what I desire. I have other articles that are getting stale as time progresses therefore I would like couch to return these articles too but since they are not changed they don't come up in the view.
What I just described seems to be expected CouchDB behavior (?). But is there a way to show aging articles too?
PS: An easy way to test this is to insert a document with published=Date.now() and type="article" and run this view. After 30 minutes you will see the document is actually stale as per the view definition but it will not show up in the view.
30 minutes is just a number. You can reduce it to a smaller time frame if you want. Thanks in advance for your help !
Yes, this is by design. The view index only changes in response to the Create, Update and Delete parts of CRUD; the map function is not re-run for a document that has not changed.
Filtering by something dynamic during a Read is done with a key.
In your case you would probably want to emit as follows:
emit(doc.published, doc);
Then in your call to CouchDB you would add parameters that would further filter on the published date.
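As a rough sketch of that idea: the variant below emits the parsed publication time as a numeric key instead of the raw doc.published string, so that the range filter does not depend on the string format. That substitution is my assumption, not part of the answer above, and it only works if Date.parse understands the stored format. The emit stub exists only to make the snippet runnable outside CouchDB.

```javascript
// Stand-in for CouchDB's built-in emit(), only so the sketch runs in Node.
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Map function: no Date.now() here, so the index never goes out of date.
const byPublished = function (doc) {
  if (doc.type === 'article' && doc.published) {
    emit(Date.parse(doc.published), doc._rev);
  }
};

// "Stale" is decided at read time: everything published more than
// 30 minutes before the moment of the request.
function staleQuery(nowMs) {
  const cutoff = nowMs - 30 * 60 * 1000;
  return '?endkey=' + cutoff + '&inclusive_end=false';
}

byPublished({ _id: 'a1', _rev: '1-abc', type: 'article', published: '2013-01-05T12:00:00Z' });
console.log(rows);
console.log(staleQuery(Date.now()));
```

The query string would be appended to the view URL; inclusive_end=false mirrors the strict "<" comparison of the original map function.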

Pagination in CouchDB using variable keys

There's a bunch of questions on here related to pagination using CouchDB, but none that quite fit what I'm wondering about.
Basically, I have a result set ranked by number of votes, and I want to page through the set in descending order.
Here's the map for reference.
function(doc) {
  emit(doc.votes);
}
Now, the problem. I found out that startkey_docid doesn't work on its own. You have to use it in combination with startkey. The thing is, for the query, I don't use a startkey parameter (I'm not looking to restrict the results, just get the most->least). I was thinking I could just use startkey={{doc.votes}}&startkey_docid={{doc._id}} instead, but the number of votes for a document could have changed by the time someone clicks the "Next Page" link.
The way to solve this seemed obvious: just set startkey=99999999 so that it will return all documents in the database and I can just use startkey_docid to start at the one where we left off last time. Oddly, when I do that, the startkey_docid stopped working and just allowed all results to be returned again. Apparently startkey needs to exactly equal the key on the document whose _id is used in startkey_docid.
What I'm asking is whether anyone knows a workaround for using startkey_docid to page when the actual startkey could have changed by the time you want to use it. Should my application just look up the document by _id and immediately use the doc.votes value, hoping it hasn't changed in the few milliseconds between requests? Even that doesn't seem very reliable.
EDIT: Ended up switching to Mongo for the speed, so this question turned out to be kinda moot.
I have never done something like this, but I think I have an idea of how to do it. You can take a snapshot of the ratings and refer to it on every page. You probably want your view not to consume too much space, so you should not map separate copies of documents whose votes have not changed since the snapshot was taken. So, you can do the following:
1. Add a history of ratings with timestamps to your document.
2. Map the ratings AND the history (see Ad.2 below).
3. In your app, get the current time (start_time = Date.now()) and query all pages.
4. Clean up history entries older than the oldest active session.
The problem is that if you emit [votes, date] and try to paginate, you never know how many documents you have to fetch to get the desired number per page. There can always be some older version that you have to skip, which forces another request to the DB. That is why you can consider emitting [date, votes], always reading the view twice (once for start_time and once for the current time), and merging and sorting the results (as in merge sort).
Ad.1:
{ ...,
  votes: 12,
  history: [
    {date: 1357390271342, votes: 10},
    {date: 1357390294682, votes: 11}
  ]
}
Ad.2:
function (doc) {
  emit([{}, doc.votes], null);
  doc.history && doc.history.forEach(function(h) {
    emit([h.date, h.votes], null);
  });
}
Ad.3:
?startkey=[start_time, votes]&limit=items_per_page_plus1
?startkey=[{}, votes]&limit=items_per_page_plus1
Merge the lists and sort by votes in your app (or in a list function).
If you have problems with using startkey_docid, then you can emit [date, votes, id] and query with the ID explicitly. Even when this particular doc changes its votes, it will still be available in the history.
Ad.4:
If you emit [date, votes], then you can get the outdated history entries with ?startkey=[0]&endkey=[oldest_active_session_time]&inclusive_end=false and clean them up with an update handler:
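function(doc, req) {
  // Nothing to clean up if the document or its history is missing.
  if (!doc || !doc.history) return [null, 'Error'];
  // Cutoff timestamp, passed as ?date=... in the request.
  var oldest = +(req.query.date);
  // Keep only the history entries that are not older than the cutoff.
  doc.history = doc.history.filter(function(h) {
    return h.date >= oldest;
  });
  return [doc, 'OK'];
}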
Note: I have not tested it, so it is expected not to run without modifications :)
As far as I know, CouchDB uses B-tree shadowing to make updates, so in principle it should be possible to access older revisions of the view. I am not familiar with CouchDB's internal design, so this is just a guess, and there does not seem to be any (documented) API for it.
I can't think of a simple solution right now, but there are options:
Replicate your sorted list to a small dedicated DB at infrequent intervals, so that it is much more stale than stale=ok.
Modify your schema so that you can sort by some more stable data. Look at the banking/ledger example in the CouchDB guide: http://guide.couchdb.org/draft/recipes.html#banking. Try to log every vote and reduce them hourly, for example. As a bonus you'll get history/trends :)
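A very rough sketch of the second option, assuming one small document per vote with a made-up shape {type: 'vote', target: <id of the voted document>, at: <timestamp in ms>}. The map function buckets votes by hour; in the design document the reduce would simply be the built-in _sum. The emit stub and totalsPerTarget only imitate locally what CouchDB would do (the latter roughly what a query with group_level=1 returns).

```javascript
// Stand-in for CouchDB's built-in emit(), only so the sketch runs in Node.
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Map function for the hypothetical vote documents.
const votesByHour = function (doc) {
  if (doc.type === 'vote' && doc.target && doc.at) {
    var hour = Math.floor(doc.at / 3600000); // whole hours since the epoch
    emit([doc.target, hour], 1);
  }
};

// Local imitation of reduce "_sum" queried with group_level=1,
// i.e. one total per voted document. For illustration only.
function totalsPerTarget(emitted) {
  const totals = {};
  emitted.forEach(function (row) {
    const target = row.key[0];
    totals[target] = (totals[target] || 0) + row.value;
  });
  return totals;
}

[
  { type: 'vote', target: 'a', at: 1357390271342 },
  { type: 'vote', target: 'a', at: 1357390294682 },
  { type: 'vote', target: 'b', at: 1357390294682 }
].forEach(votesByHour);

console.log(totalsPerTarget(rows));
```

Because vote documents are only ever added and never modified, the hourly totals for past hours stay stable, which is what makes them usable as a sort basis while paging.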
I'm kind of surprised this question has been left unanswered because the functionality of CouchDB Futon basically does this when you are paginating through the results of a map function. I opened up firebug to see what was happening in the javascript console as I paginated and saw that for every set of paginated results it is passing the startkey along with startkey_docid. So although the question is how do I paginate without including startkey, CouchDB specifies that the startkey is required and demonstrates how it can work. The endkey is not specified, so if there is only one result for the specified startkey, the next set of paginated results will also contain the next key of the sorted results that do not match the startkey.
So to clarify a bit, the answer to this problem is that as you are paginating and keeping track of the startkey_docid, you also need to capture the startkey of the same document that will be the start of the next set of results. When you are calling the paginated results use both the captured startkey and startkey_docid as couchdb requires. Leave endkey off so that the results will continue on to the next key of the sorted results.
The use case for wanting to paginate without specifying a key is kind of odd. So let's say that the start docid of the next paginated result did change its key value drastically from a 9 to a 3. And we are also assuming that there is only one instance of the docid existing in the map results, even though it could potentially appear multiple times (which I believe is why the startkey needs to be specified). As the user is clicking the next button, the user's paginated results will have now moved from looking at rank 9 to rank 3. But if you are including the startkey in addition to the startkey_docid, the paginated results would just start all over at the beginning of the rank 9 results, which is a more logical progression than potentially jumping over a large set of results.
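In code, the bookkeeping described above could look something like this sketch. It assumes every page is requested with descending=true and limit = pageSize + 1, so that the one extra row is the first row of the next page; nextPageQuery is a made-up helper name, not a CouchDB API.

```javascript
// Given the rows of the current response (fetched with limit = pageSize + 1),
// build the query string for the next page, or return null on the last page.
function nextPageQuery(rows, pageSize) {
  if (rows.length <= pageSize) {
    return null; // no extra row came back, so there is no next page
  }
  const next = rows[pageSize]; // first row of the next page
  const params = new URLSearchParams({
    descending: 'true',
    limit: String(pageSize + 1),
    startkey: JSON.stringify(next.key), // keys are passed JSON-encoded
    startkey_docid: next.id
  });
  return params.toString();
}

const page = [
  { id: 'a', key: 9 },
  { id: 'b', key: 9 },
  { id: 'c', key: 7 }
];
console.log(nextPageQuery(page, 2));
```

Only the first pageSize rows are shown to the user; the captured key and id of the extra row travel with the "Next Page" link.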

couchdb, get last 10 documents

Using mysql this would be:
SELECT * FROM thetable ORDER BY id DESC LIMIT 10
How can I do this in couchdb for all documents with "type":"message"? (without pulling all documents with type:message)
Thanks
Create a view that emits all doc ids. The view keys will be used for sorting automatically.
function(doc) {
  if (doc.type && doc.type === 'message') {
    emit(doc._id, null);
  }
}
Then execute a query: http://host/yourdb/_design/yourdesigndoc/_view/viewname?limit=10&include_docs=true&descending=true
Because you want the full document, we didn't include anything as the value in the view. Instead, we add include_docs=true to fetch the full document for each view entry.
Note that there is also a builtin view that does the same: http://host/yourdb/_all_docs?limit=10&include_docs=true&descending=true
PS: You should be aware of the fact that CouchDB by default uses UUIDs as IDs, which will render the sorting more or less useless, if you really want to get the latest docs. Either provide your own incremental IDs (what about distribution/replication?) or use a new field that stores the time the doc has been created and use in the view as well.
If your docs have a created field (e.g. a UNIX timestamp, JavaScript Date.now() or even an RFC 3339-like string), you can build an index on these values.
Here is the time-based view:
function(doc) {
  if (doc.type && doc.type === 'message' && doc.created) {
    emit(doc.created, null);
  }
}
Note that we will not emit the doc._id itself. However, CouchDB stores the doc._id where the data came from for each emitted key/value pair automatically, so we can again use include_docs=true to fetch the complete docs.
Query http://host/yourdb/_design/yourdesigndoc/_view/viewname?limit=10&include_docs=true&descending=true
If the IDs of your documents are already incremental, instead of the default UUIDs of CouchDB, you do not even need to define a view, you can just use the default _all_docs view, e.g.
http://couchdb_host/your_db/_all_docs?limit=10&descending=true&include_docs=true

couchdb design views, updating fields on doc creation

Is it possible to have couch update or change fields on the fly when you create/update a doc? For example in the design view.... validate_doc_update:
function(newDoc, oldDoc, userCtx) {
}
Within that function I can throw errors like:
if (!newDoc.user_email && !newDoc.user_name && !newDoc.user_password) {
  throw({forbidden : 'all fields required'});
}
My question is: how would I reassign a field? I tried this:
newDoc.user_password ="changed";
with changed being some new value or hashed value. My overall goal is to build a user registration/login system with node and couchdb and have not found very good examples.
The validate_doc_update function cannot have any side effects and cannot change the document before storage. It only has the power to block an update or to let it through. This is important, because the function is not only called when a user requests an update, but also when changes are replicated from one CouchDB instance to another. So the function can be called multiple times for one document.
However, CouchDB now supports Document Update Handlers that can modify a document or even build it from scratch. These can be used to convert non-JSON input data into usable documents. You can find some documentation in the CouchDB Wiki.
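A minimal sketch of such an update handler for the registration case, using the field names from the question (user_email, user_name). The created_at field and the handler name registerUser are made up, and as far as I know req.body holds the raw request body while req.uuid is a fresh UUID supplied by CouchDB; treat both as assumptions to verify against the documentation for your version.

```javascript
// Update handler: receives the existing doc (or null) and the request,
// returns [documentToSave, responseBody]. Returning null as the first
// element means nothing is saved.
const registerUser = function (doc, req) {
  if (doc) {
    return [null, 'User already exists'];
  }
  var input = JSON.parse(req.body); // throws on invalid JSON; fine for a sketch
  if (!input.user_email || !input.user_name) {
    return [null, 'all fields required'];
  }
  var newDoc = {
    _id: req.uuid, // assumption: CouchDB provides a UUID on the request
    type: 'user',
    user_email: String(input.user_email).toLowerCase(), // field changed on the fly
    user_name: input.user_name,
    created_at: new Date().toISOString() // hypothetical extra field
  };
  return [newDoc, 'Created'];
};

// Local call with a fake request object, just to show the shape.
const created = registerUser(null, {
  uuid: 'abc123',
  body: '{"user_email":"Jane@Example.com","user_name":"jane"}'
});
console.log(created[0]);
```

Stored under updates in a design document, a handler like this is invoked through /db/_design/<designdoc>/_update/<handlername>. Unlike validate_doc_update it only runs on requests sent to that URL, not during replication.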
Before you build your own user registration/login system, I'd suggest you look into the built-in CouchDB security features (if you haven't - some information here). They might not be enough for you (e.g. if you need email validation or something similar), but maybe you can build on them.

CouchDB views - Multiple join... Can it be done?

I have three document types MainCategory, Category, SubCategory... each have a parentid which relates to the id of their parent document.
So I want to set up a view so that I can get a list of SubCategories which sit under the MainCategory (preferably just using a map function)... I haven't found a way to arrange the view so this is possible.
I currently have set up a view which gets the following output -
{"total_rows":16,"offset":0,"rows":[
{"id":"11098","key":["22056",0,"11098"],"value":"MainCat...."},
{"id":"11098","key":["22056",1,"11098"],"value":"Cat...."},
{"id":"33610","key":["22056",2,"null"],"value":"SubCat...."},
{"id":"33989","key":["22056",2,"null"],"value":"SubCat...."},
{"id":"11810","key":["22245",0,"11810"],"value":"MainCat...."},
{"id":"11810","key":["22245",1,"11810"],"value":"Cat...."},
{"id":"33106","key":["22245",2,"null"],"value":"SubCat...."},
{"id":"33321","key":["22245",2,"null"],"value":"SubCat...."},
{"id":"11098","key":["22479",0,"11098"],"value":"MainCat...."},
{"id":"11098","key":["22479",1,"11098"],"value":"Cat...."},
{"id":"11810","key":["22945",0,"11810"],"value":"MainCat...."},
{"id":"11810","key":["22945",1,"11810"],"value":"Cat...."},
{"id":"33123","key":["22945",2,"null"],"value":"SubCat...."},
{"id":"33453","key":["22945",2,"null"],"value":"SubCat...."},
{"id":"33667","key":["22945",2,"null"],"value":"SubCat...."},
{"id":"33987","key":["22945",2,"null"],"value":"SubCat...."}
]}
Which query string parameters would I use to get, say, the rows whose key starts with ["22945", ... when all I have at query time is the id "11810"? (At query time I don't know the id "22945".)
If any of that makes sense.
Thanks
The way you store your categories seems to be suboptimal for the query you are trying to perform on them.
MongoDB.org has a page on various strategies for implementing tree structures (they should apply to Couch and other document DBs as well). You should consider Array of Ancestors, where you always store the full path to your node. This makes updating/moving categories more difficult, but querying is easy and fast.
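A sketch of what Array of Ancestors could look like with a plain map function. The field names type and ancestors are assumptions about how the documents would be restructured, and the emit stub (which also records the emitting document's id, as CouchDB does automatically) is only there to run the example locally.

```javascript
// Stand-in for CouchDB's emit(); CouchDB records the doc id by itself.
const rows = [];
let currentId = null;
function emit(key, value) {
  rows.push({ id: currentId, key: key, value: value });
}

// Map function: one row per ancestor, so any ancestor id finds the doc.
const byAncestor = function (doc) {
  if (doc.type && doc.ancestors) {
    doc.ancestors.forEach(function (ancestorId) {
      emit([ancestorId, doc.type], null);
    });
  }
};

// Hypothetical restructured documents: each stores its full path.
const docs = [
  { _id: '11810', type: 'MainCategory', ancestors: [] },
  { _id: '22945', type: 'Category', ancestors: ['11810'] },
  { _id: '33123', type: 'SubCategory', ancestors: ['11810', '22945'] },
  { _id: '33610', type: 'SubCategory', ancestors: ['11098', '22056'] }
];
docs.forEach(function (doc) {
  currentId = doc._id;
  byAncestor(doc);
});

// Local equivalent of querying the view with ?key=["11810","SubCategory"]
const wanted = JSON.stringify(['11810', 'SubCategory']);
const subCategoryIds = rows
  .filter(function (row) { return JSON.stringify(row.key) === wanted; })
  .map(function (row) { return row.id; });
console.log(subCategoryIds);
```

With this layout the id "11810" alone is enough at query time; the intermediate Category id never has to be known.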
