CouchDB Read Configuration from design document - couchdb

I would like to store a value in the config file and look it up in the design document for comparing against update values. I'm sure I have seen this but, for the life of me, I can't seem to remember how to do this.
UPDATE
I realize (after the first answer) that there was more than one way to interpret my question. Hopefully this example clears it up a little. Given a configuration:
curl -X PUT http://localhost:5984/_config/shared/token -d '"0123456789"'
I then want to be able to look it up in my design document
{
"_id": "_design/loadsecrets",
"validate_doc_update": {
"test": function (newDoc,oldDoc) {
if (newDoc.supersecret != magicobject.config.shared.token){
throw({unauthorized:"You don't know the super secret"});
}
}
}
}
It's the ability to do something like the magicobject.config.shared.token that I am looking for.
UPDATE 2
Another potentially useful (contrived) scenario
curl -X PUT http://trustedemployee:5984/_config/eventlogger/detaillevel -d '"0"'
curl -X PUT http://employee:5984/_config/eventlogger/detaillevel -d '"2"'
curl -X PUT http://vicepresident:5984/_config/eventlogger/detaillevel -d '"10"'
Then on devices tracking employee behaviour:
{
"_id": "_design/logger",
"updates": {
"logger": function (doc,req) {
if (!doc) {
doc = {_id:req.id};
}
if(req.level < magicobject.config.eventlogger.detaillevel ){
doc.details = req.details;
}
return [doc, req.details];
}
}
}

Here's a follow-up to my last answer with more general info:
There is no general way to use configuration, because CouchDB is designed with scalability, stability and predictability in mind. It has been designed using many principles of functional programming and pure functions, avoiding side effects as much as possible. This is a Good Thing™.
However, each type of function has additional parameters that you can use, depending on the context the function is called with:
show, list, update and filter functions are executed for each request, so they get the request object. Here you have the req.secObj and req.userCtx to (ab)use for common configuration. Also, AFAIK the this keyword is set to the current design document, so you can use the design doc to get common configuration (at least up to CouchDB 1.6 it worked).
view functions (map, reduce) don't have additional parameters, because the results of a view are written to disk and reused in subsequent calls. map functions must be pure (so don't use e.g. Math.random()). For shared configuration across view functions within a single design doc you can use CommonJS require(), but only within the views.lib key.
validate doc update functions are not necessarily executed within a user-triggered http request (they are called before each write, which might not be triggered only via http). So they have the userCtx and secObj added as separate parameters in their function signature.
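As a rough sketch of the views.lib approach mentioned above: the design document below stores a small CommonJS module under views.lib and loads it from a map function with require(). The design doc name "_design/example" and the module name "config" are made up for illustration, and this assumes a CouchDB version whose JavaScript view server supports CommonJS modules in views.

```javascript
// Sketch of a design document that shares configuration between views.
// "_design/example" and the module name "config" are hypothetical.
var ddoc = {
  _id: '_design/example',
  views: {
    lib: {
      // CommonJS module source, stored as a string under views.lib.
      config: 'exports.PI = 3.14;'
    },
    test: {
      // The map function loads the module by its path inside the design doc.
      map: [
        'function (doc) {',
        '  var config = require("views/lib/config");',
        '  emit(doc._id, config.PI);',
        '}'
      ].join('\n')
    }
  }
};

// Print the JSON body you would PUT to /<db>/_design/example.
console.log(JSON.stringify(ddoc, null, 2));
```

Keep in mind that editing views.lib.config changes the view definition, so the indexes of that design doc are rebuilt.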
So to sum up, you can use the following places for configuration:
userCtx for user-specific config. Use a special role (e.g. with a prefix) for storing small config bits. For example superLogin does this.
secObj for database-wide config. Use a special member name for small bits (as you should normally use roles instead of explicit user names, secObj.members.names or secObj.admins.names is a good place).
the design doc itself for design-doc-wide config. Best use the this.views.lib.config for this, as you can also read this key from within views. But keep in mind that all views are invalidated as soon as you change this key. So if the view results will stay the same no matter what the config values are, it might be better to use a this.config key.
Hope this helps! I can also add examples if you wish.
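A minimal sketch of the secObj idea, under a made-up convention: config entries are stored in secObj.members.names as strings of the form "config:<key>=<value>". The helper readConfig and that naming scheme are assumptions for illustration, not anything CouchDB defines. Inside a real design doc the helper would have to be inlined into the function body, since each function is stored as a separate string.

```javascript
// Hypothetical convention: config lives in secObj.members.names
// as entries shaped like "config:<key>=<value>".
function readConfig(secObj, key) {
  var names = (secObj && secObj.members && secObj.members.names) || [];
  var prefix = 'config:' + key + '=';
  for (var i = 0; i < names.length; i++) {
    if (names[i].indexOf(prefix) === 0) {
      return names[i].slice(prefix.length);
    }
  }
  return undefined; // key not configured
}

// validate_doc_update receives userCtx and secObj as 3rd and 4th arguments.
function validate(newDoc, oldDoc, userCtx, secObj) {
  var token = readConfig(secObj, 'token');
  // Reject when no token is configured, so a missing config cannot
  // accidentally match a missing supersecret field.
  if (token === undefined || newDoc.supersecret !== token) {
    throw { unauthorized: "You don't know the super secret" };
  }
}

// Local smoke run with a fake security object.
var fakeSecObj = { members: { names: ['config:token=0123456789'], roles: [] } };
validate({ supersecret: '0123456789' }, null, {}, fakeSecObj);
console.log('document accepted');
```

In show, list, update and filter functions the same object is reachable as req.secObj, so the same lookup could be reused there.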

I think I know what you're talking about, and if I'm right then what you are asking for is no longer possible. (at least in v1.6 and v2.0, I'm not sure when this feature was removed)
There was a lesser-known trick that allowed a view/show/list/validation/etc function to access the parent design document as this in your function. For example:
{
"_id": "_design/hello-world",
"config": {
"PI": 3.14
},
"views": {
"test": {
"map": "function (doc) { emit(this.config.PI); }"
}
}
}
This was a really crazy idea, and I imagine it was removed because it created a circular dependency between the design document and the code of the view that made the process of invalidating/rebuilding a view index a very tricky affair.
I remember using this trick at some point in the distant past, but the feature is definitely gone now. (and likely to never return)

For your special use-case (validating a document with a secret token), there might be a workaround, but I'm not sure if the token might leak in some place. It all depends what your security requirements are.
You could abuse the 4th parameter to validate_doc_update, the securityObject (see the CouchDB docs) to store the secret token as the first admin name:
{
"test": "function (newDoc, oldDoc, userCtx, secObj) {
var token = secObj.admins.names[0];
if (newDoc.supersecret != token) {
throw({unauthorized:"You don't know the super secret"});
}
}"
}
So if you set the db's security object to {admins: {names: ["s3cr3t-t0k3n"], roles: ["_admin"]}}, you have to pass 's3cr3t-t0k3n' as the doc's supersecret property.
This is obviously a dirty hack, but as far as I remember, the security object may only be read or modified by admins, so you wouldn't immediately leak your token to the public. Still, consider adding a separate layer between CouchDB and your caller if you need "real" security.

Related

NodeJS - Simplify/Resolve GraphQL query

I am currently writing a Lambda authorizer for an AWS AppSync API, however the authorization depends on the target resource being accessed.
Every resource has their own ACL listing the users and conditions for allowing access to it.
Currently the best I could find would be to get the identity of the caller, look at all the ACLs, and authorize the call while denying access to all the other resources, which is not only highly inefficient, but also extremely impractical, if not impossible.
The solution I had originally come up with was to get the target resource, retrieve the ACL and check if the user fits the specified criteria. The problem is that I am unable to reliably determine what the target resource is. What I get from AWS is a request like this:
{
"authorizationToken": "ExampleAUTHtoken123123123",
"requestContext": {
"apiId": "aaaaaa123123123example123",
"accountId": "111122223333",
"requestId": "f4081827-1111-4444-5555-5cf4695f339f",
"queryString": "mutation CreateEvent {...}\n\nquery MyQuery {...}\n",
"operationName": "MyQuery",
"variables": {}
}
}
So, I only have the query string and variables, leaving the actual parsing of this to me. I managed to convert it to an AST using graphql-js, but it's still extremely verbose and, most importantly, its structure varies greatly.
My first code to retrieve the target worked for the AppSync console queries, but not the Amplify Front-End, for example. I also can't rely on something as simple as the variable name, as an attacker could quite easily craft a query with an arbitrary name, or even not use variables at all.
I thought about implementing this authorization logic within Lambda Resolvers, which should be simpler in a way, but would require me to use resolvers as authorizers, which doesn't seem ideal, and to implement the entire resolver logic when I just want the most trivial possible resolvers.
Ideally I'd like something like this:
/* Schema:
type Query {
operationName(key: KEY!): responseType
}*/
/* Query:
query arbitraryQueryName($var1: KEY!) {
operationName(key: $var1) {
field1
field2
}
}*/
/* Variables:
{ "var1": "value1" } */
parsedQuery = {
operation: "operationName",
params: { "key": "value1" },
fields: [ "field1", "field2" ]
};
Is there any way to resolve/simplify the queries from GraphQL to JSON/similar in a way that this information can be easily extracted?
Well, couldn't find anything on it, so I made something myself.
On the off chance someone needs something similar, here's the gist with the code I used: https://gist.github.com/Iorpim/6544dad46060522dd0b17477871bc434
I didn't make it a proper full lib, as it's a very specific use case and it's likely a one-off, and I am also not sure how reliable it is, but it solves my problem!
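For readers who cannot reach the gist, here is a rough sketch of the general idea. It is not the code from the gist. It assumes the AST has the shape produced by graphql-js parse() (Document, OperationDefinition, Field, Argument and Variable nodes), and the function names simplify and resolveValue are made up. It ignores fragments, aliases, directives and nested selections, and only resolves variables and scalar literals.

```javascript
// Resolve one argument value node: variables are looked up, scalar
// literals (StringValue, IntValue, ...) carry their text in `value`.
function resolveValue(node, variables) {
  if (node.kind === 'Variable') {
    return variables[node.name.value];
  }
  return node.value;
}

// Reduce a parsed query document to [{ operation, params, fields }],
// one entry per root field of the selected operation.
function simplify(ast, operationName, variables) {
  var vars = variables || {};
  var ops = ast.definitions.filter(function (d) {
    return d.kind === 'OperationDefinition';
  });
  var op = ops.filter(function (d) {
    return d.name && d.name.value === operationName;
  })[0] || (ops.length === 1 ? ops[0] : undefined);
  if (!op) {
    throw new Error('operation not found: ' + operationName);
  }
  return op.selectionSet.selections
    .filter(function (s) { return s.kind === 'Field'; })
    .map(function (field) {
      var params = {};
      (field.arguments || []).forEach(function (arg) {
        params[arg.name.value] = resolveValue(arg.value, vars);
      });
      var sub = field.selectionSet ? field.selectionSet.selections : [];
      return {
        operation: field.name.value,
        params: params,
        fields: sub
          .filter(function (s) { return s.kind === 'Field'; })
          .map(function (s) { return s.name.value; })
      };
    });
}
```

With graphql-js installed, this would be called roughly as simplify(parse(requestContext.queryString), requestContext.operationName, requestContext.variables). It returns an array because one operation can select several root fields, and an authorizer would have to check every one of them.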

Referencing external doc in CouchDB view

I am scraping a 90K-record database using JSON-RPC and I am trying to put in some basic error checking. I want to start by scraping the database twice using two different settings and adding a prefix to the second scrape. This way I can check that the two settings are not producing different records (due to dropped updates, etc.). I wanted to implement the comparison using a view which compares each document from the first scrape with its twin produced by the second scrape and then emits the names of records that differ.
However, I cannot quite figure out how to pull in another doc in the view, everything I have read only discusses external docs using the emit() function, which is too late to permit me to compare it. In the example below, the lookup() function would grab the referenced document.
Is this just not possible?
function(doc) {
if(doc._id.slice(0,1)!=='$' && doc._id.slice(0,1)!== "_"){
var otherDoc = lookup('$test' + doc._id);
if(otherDoc){
var keys = doc.value.keys();
var same = true;
keys.forEach(function(key) {
if ((key.slice(0,1) !== '_') && (key.slice(0,1) !=='$') && (key!=='expires')) {
if (!Object.equal(otherDoc[key], doc[key])) {
same = false;
}
}
});
if(!same){
emit(doc._id, 1);
}
}
}
}
Context
You are correct that this is not possible in CouchDB. The whole point of the map function is that it must be idempotent, otherwise you lose all the other nice benefits of a pre-calculated index.
This is why you cannot access external resources in the map function, whether they be other records or the clock. Any time you run a map you must always get the same result if you put the same record into it. Since there are no relationships between records in CouchDB, you cannot promise that this is possible.
Solution
However, you can still achieve your end goal, just by different means. Some possibilities...
Assuming there is some meaningful numeric value in each doc, you could use a view to take the sum of all those values and group them by which import you did ({key: <batch id>, value: <meaningful number>}). Then compare the two numbers in your client or the browser to see if they match.
A brute force approach would be to use a view to pair the docs that should match. Each doc is on a different row, but they're grouped by a common field. Then iterate through the entire index comparing the pairs. This would certainly be the quickest to code and doesn't depend on your application or data.
Implement a validation function to enforce a schema on your data. Just be warned that this will reduce your write throughput since each written record will be piped out of Erlang and into the JS engine. Also, this is only applicable if you're worried about properly formed records instead of their precise content, which might not be the case.
Instead of your different batch jobs creating different docs, have them place them into the same doc. The structure might look like this: { "_id": "something meaningful", "batch_one": { ..data.. }, "batch_two": { ..data.. } } Then your validation function could compare them or you could create a view that indexes all the docs that don't match. All depends on where in your pipeline you want to do the error checking and correction.
Personally I like the last option better, but only if you don't plan to use the database as is in production. I.e., you wouldn't want to carry around all that extra data in each record.
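A minimal sketch of that last option, assuming both batch jobs write into the same document under batch_one and batch_two as described above. The emit stub at the top only exists so the snippet runs outside CouchDB. The rule for skipped keys is borrowed from the question, and comparing values via JSON.stringify is a simplification that is sensitive to key order in nested objects.

```javascript
// Stub so the map function can be exercised outside CouchDB.
var emitted = [];
function emit(key, value) { emitted.push([key, value]); }

// Map function: emits the names of fields that differ between batches.
function map(doc) {
  if (!doc.batch_one || !doc.batch_two) {
    return; // only one scrape has written this record so far
  }
  function skip(k) {
    return k.charAt(0) === '_' || k.charAt(0) === '$' || k === 'expires';
  }
  var keys = {};
  Object.keys(doc.batch_one).concat(Object.keys(doc.batch_two))
    .forEach(function (k) { if (!skip(k)) { keys[k] = true; } });
  var differing = Object.keys(keys).filter(function (k) {
    return JSON.stringify(doc.batch_one[k]) !== JSON.stringify(doc.batch_two[k]);
  });
  if (differing.length > 0) {
    emit(doc._id, differing);
  }
}

// Local run with two sample records.
map({ _id: 'same', batch_one: { n: 1, expires: 5 }, batch_two: { n: 1, expires: 9 } });
map({ _id: 'diff', batch_one: { n: 1 }, batch_two: { n: 2 } });
console.log(JSON.stringify(emitted));
```

Because the function only looks at the document it is given, it stays deterministic, which is what the view index requires.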
Hope that helps.
Cheers.

CouchDB: Single document vs "joining" documents together

I'm trying to decide on the best approach for a CouchApp (no middleware). Since there are similarities to my idea, let's assume we have a Stack Overflow page stored in CouchDB. In essence it consists of the actual question on top, answers and comments. Those are basically three layers.
There are two ways of storing it. Either within a single document containing a suitable JSON representation of the data, or store each part of the entry within a separate document combining them later through a view (similar to this: http://www.cmlenz.net/archives/2007/10/couchdb-joins)
Now, both approaches may be fine, yet both have massive downsides from my current point of view. Storing a busy document (many changes by multiple users are expected) as a single entity would cause conflicts. If user A stores his/her changes to the document, user B would receive a conflict error once he/she has finished typing his/her update. I can imagine it's possible to fix this without the user's knowledge by re-downloading the document before retrying.
But what if the document is rather big? I expect them to grow considerably over time, which would add a noticeable delay to the save process, especially if the retry has to happen multiple times because many users are updating a document at the same time.
Another problem I'd see is editing. Every user should be allowed to edit his/her contributions. Now, if they're stored within one document it might be hard to write a solid auth handler.
Ok, now lets look at the multiple documents approach. Question, Answers and Comments would be stored within their own documents. Advantage: only the actual owner of the document can cause conflicts, something that won't happen too often. Being rather small elements of the whole, redownloading wouldn't take much time. Furthermore the auth routine should be quite easy to realize.
Now here's the downside. The single document is really easy to query and display. Having a lot of unsorted snippets lying around seems messy, since I haven't managed to get a view to present me with a 100% ready-to-use JSON object containing the entire item in an ordered and structured format.
I hope I've been able to communicate the actual problem. I try to decide which solution would be more suitable for me, which problems easier to overcome. I imagine the first solution to be the prettier one in terms of storage and querying, yet the second one the more practical one solvable through better key management within the view (I'm not entirely into the principle of keys yet).
Thank you very much for your help in advance :)
Go with your second option. It's much easier than having to deal with the conflicts. Here are some example docs how I might structure the data:
{
_id: '12345',
type: 'question',
slug: 'couchdb-single-document-vs-joining-documents-together',
markdown: 'Im tryting to decide the best approach for a CouchApp (no middleware). Since there are similarities to...' ,
user: 'roman-geber',
date: 1322150148041,
'jquery.couch.attachPrevRev' : true
}
{
_id: '23456',
type: 'answer',
question: '12345',
markdown: 'Go with your second option...',
user : 'ryan-ramage',
votes: 100,
date: 1322151148041,
'jquery.couch.attachPrevRev' : true
}
{
_id: '45678',
type: 'comment',
question: '12345',
answer: '23456',
markdown : 'I really like what you have said, but...' ,
user: 'somedude',
date: 1322151158041,
'jquery.couch.attachPrevRev' : true
}
To store revisions of each one, I would store the old versions as attachments on the doc being edited. If you use the jquery client for couchdb, you get it for free by adding the jquery.couch.attachPrevRev = true. See Versioning docs in CouchDB by jchris
Create a view like this
fullQuestion : {
map : function(doc) {
if (doc.type == 'question') emit([doc._id, null, null], null);
if (doc.type == 'answer') emit([doc.question, doc._id, null], null);
if (doc.type == 'comment') emit([doc.question, doc.answer, doc._id], null) ;
}
}
And query the view like this
http://localhost:5984/so/_design/app/_view/fullQuestion?startkey=["12345"]&endkey=["12345",{},{}]&include_docs=true
(Note: I have not url encoded this query, but it is more readable)
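For completeness, a small sketch of how the encoded URL could be built. The keys must be valid JSON, so they are serialized with JSON.stringify and then escaped with encodeURIComponent. The helper name viewUrl is made up for this example.

```javascript
// Build an encoded view URL for all rows belonging to one question.
function viewUrl(base, questionId) {
  var startkey = JSON.stringify([questionId]);
  var endkey = JSON.stringify([questionId, {}, {}]); // {} sorts after strings
  return base +
    '?startkey=' + encodeURIComponent(startkey) +
    '&endkey=' + encodeURIComponent(endkey) +
    '&include_docs=true';
}

console.log(viewUrl(
  'http://localhost:5984/so/_design/app/_view/fullQuestion',
  '12345'
));
```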
This will get you all of the related documents for the question that you will need to build the page. The only thing is that they will not be sorted by date. You can sort them on the client side (in javascript).
EDIT: Here is an alternative option for the view and query
Based on your domain, you know some facts. You know an answer can't exist before a question existed, and a comment on an answer can't exist before an answer existed. So let's make a view that might make it faster to create the display page, respecting the order of things:
fullQuestion : {
map : function(doc) {
if (doc.type == 'question') emit([doc._id, doc.date], null);
if (doc.type == 'answer') emit([doc.question, doc.date], null);
if (doc.type == 'comment') emit([doc.question, doc.date], null);
}
}
This will keep all the related docs together, and keep them ordered by date. Here is a sample query
http://localhost:5984/so/_design/app/_view/fullQuestion?startkey=["12345"]&endkey=["12345",{}]&include_docs=true
This will get back all the docs you will need, ordered from oldest to newest. You can now zip through the results, knowing that the parent objects will be before the child ones, like this:
function addAnswer(doc) {
$('.answers').append(answerTemplate(doc));
}
function addCommentToAnswer(doc) {
$('#' + doc.answer).append(commentTemplate(doc));
}
$.each(results.rows, function(i, row) {
if (row.doc.type == 'question') displayQuestionInfo(row.doc);
if (row.doc.type == 'answer') addAnswer(row.doc);
if (row.doc.type == 'comment') addCommentToAnswer(row.doc);
});
So then you don't have to perform any client-side sorting.
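If you would rather have one nested JSON object than DOM updates, the same single pass works. This is only a sketch: it assumes the rows come from the date-ordered view queried with include_docs=true, so parents arrive before their children, and the function name buildQuestion and the output shape are made up.

```javascript
// Fold date-ordered view rows into { doc, answers: [{ doc, comments }] }.
function buildQuestion(rows) {
  var question = null;
  var answersById = {};
  rows.forEach(function (row) {
    var doc = row.doc;
    if (doc.type === 'question') {
      question = { doc: doc, answers: [] };
    } else if (doc.type === 'answer' && question) {
      answersById[doc._id] = { doc: doc, comments: [] };
      question.answers.push(answersById[doc._id]);
    } else if (doc.type === 'comment' && answersById[doc.answer]) {
      answersById[doc.answer].comments.push(doc);
    }
  });
  return question;
}

// Local run with rows shaped like a view response.
var sample = [
  { doc: { _id: '12345', type: 'question' } },
  { doc: { _id: '23456', type: 'answer', question: '12345' } },
  { doc: { _id: '45678', type: 'comment', question: '12345', answer: '23456' } }
];
console.log(JSON.stringify(buildQuestion(sample), null, 2));
```

Comments whose answer has not been seen yet are silently dropped here. A real page would probably want to buffer them instead.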
Hope this helps.

Change notification in CouchDB when a field is set

I'm trying to get notifications in a CouchDB change poll as soon as a predefined field is set or changed. I've already had a look at filters that can be used for filtering change events (db/_changes?filter=myfilter). However, I've not yet found a way to include this temporal information, because you can only get the current version of the document in these filter functions.
Is there any possibility to create such a filter?
If it does not work, I could export my field to a separate database and then only poll for changes in that db, but I'd prefer to keep my data together for obvious reasons.
Thanks in advance!
You are correct: filters and _changes feeds can only see snapshots of a document. What you need is a function which can see the old document and the new document and act correctly. But that is unavailable in _filters and _changes.
Obviously your client code knows if it updates that field, so you could change your client code. However, there is a better solution.
Update functions can access both documents. I suggest you make an _update
function which notices the field change and flags that in the document. Next you
have a simple filter checking for that flag. The best part is, you can use a
rewrite function to make the HTTP API exactly the same as before.
1. Create an update function to flag interesting updates
Your _design/myapp would be {"updates": {"smart_updater": "(see below)"}}.
Update functions are very flexible (see my recent update handlers
walkthrough). However we only want to mimic the normal HTTP/JSON API.
Your updates.smart_updater field would look like this:
function (doc, req) {
var INTERESTING = 'dollars'; // Set me to the interesting field.
var newDoc = JSON.parse(req.body);
if(newDoc.hasOwnProperty(INTERESTING)) {
// dollars was set (which includes 0, false, null, undefined
// values). You might test for newDoc[INTERESTING] if those
// values should not trigger this code.
if((doc === null) || (doc[INTERESTING] !== newDoc[INTERESTING])) {
// The field was changed or created!
newDoc.i_was_changed = true;
}
}
if(!newDoc._id) {
// A UUID generator would be better here.
newDoc._id = req.id || Math.random().toString();
}
// Return the same JSON the vanilla Couch API does.
return [newDoc, {json: {'id': newDoc._id}}];
}
Now you can PUT or POST to /db/_design/myapp/_update/[doc_id] and it will feel
just like the normal API except if you update the dollars field, it will add
an additional flag, i_was_changed. That is how you will find this change
later.
2. Filter for documents with the changed field
This is very straightforward:
function(doc, req) {
return doc.i_was_changed;
}
Now you can query the _changes feed with a ?filter= parameter. (Replication
also supports this filter, so you could pull to your local system all documents
which most recently changed/created the field.)
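One caveat with the flag: a client that reads a flagged document and writes it back unchanged will send i_was_changed along in the body, so the filter keeps matching. A possible variant of the flagging step, sketched here as a standalone helper (the name flagChange is made up), clears the flag whenever the field did not change. Like the original, it compares with !==, so it only works reliably for primitive field values.

```javascript
// Set i_was_changed only when `field` was created or changed in this
// write; otherwise remove a stale flag carried over from an earlier read.
function flagChange(oldDoc, newDoc, field) {
  var wasSet = Object.prototype.hasOwnProperty.call(newDoc, field);
  var changed = wasSet &&
    (oldDoc === null || oldDoc[field] !== newDoc[field]);
  if (changed) {
    newDoc.i_was_changed = true;
  } else {
    delete newDoc.i_was_changed;
  }
  return newDoc;
}

// Local run: first write flags, an unchanged rewrite clears the flag.
var created = flagChange(null, { _id: 'a', dollars: 5 }, 'dollars');
var rewritten = flagChange({ _id: 'a', dollars: 5 },
                           { _id: 'a', dollars: 5, i_was_changed: true },
                           'dollars');
console.log(created.i_was_changed, rewritten.i_was_changed);
```

Inside the update function this logic would replace the if block that sets newDoc.i_was_changed.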
That is the basic idea. The remaining steps will make your life easier if you
already have lots of client code and do not want to change the URLs.
3. Use rewriting to keep the HTTP API the same
This is available in CouchDB 0.11, and the best resource is Jan's blog post,
nice URLs in CouchDB.
Briefly, you want a vhost which sends all traffic to your rewriter (which itself
is a flexible "bouncer" to all design doc functionality based on the URL).
curl -X PUT http://example.com:5984/_config/vhosts/example.com \
-d '"/db/_design/myapp/_rewrite"'
Then you want a rewrites field in your design doc, something like (not
tested)
[
{
"comment": "Updates should go through the update function",
"method": "PUT",
"from": "db/*",
"to" : "db/_design/myapp/_update/*"
},
{
"comment": "Creates should go through the update function",
"method": "POST",
"from": "db/*",
"to" : "db/_design/myapp/_update/*"
},
{
"comment": "Everything else is just like normal",
"from": "*",
"to" : "../../../*"
}
]
(Once again, I got this code from examples and existing code I have lying
around, but it's not 100% debugged. However, I think it makes the idea very clear.
Also remember this step is optional. The advantage is that you never have to
change your client code.)

How to name application specific fields in couchdb

I tried adding my own fields with names like _myappvar and _myotherappvar to documents to distinguish them from data fields. At first it worked but at some point futon starts to complain.
What is the right way to go?
I am using couchdb 0.9.0. This may be old, but I will not be able to upgrade in this iteration.
Edit: I guess _* is reserved for couchdb vars. I could choose something else, but is there a best practice, or how are you solving this?
Edit2: This is somehow severe for my application, because it is already live with those fields. I wonder under which circumstances I can keep the parts that work and only apply a new naming for future fields.
You are correct. The CouchDB Document API, Special Fields section explains it.
Top-level fields may not begin with _.
CouchDB is relaxed, so the best way to go is the easiest thing for your application. About your specific edits:
One idea is to use the _ suffix instead of a prefix. Another idea is a .myapp field which is an object(namespace) for your internal data. You could combine them too:
{
"type": "the document type",
"var1": "Normal variable 1",
"var2": true,
"myapp_": {
"var": "Something internal",
"othervar": null
}
}
Now you can reference doc.myapp_.var in your view maps, reduces, etc.
You have a choice. You can bite the bullet and change all documents right now. I don't know your app, but I prefer that because you are playing with fire using a _ prefix.
However, you could also have both types of document and simply teach your map() function how to handle both of them.
function(doc) {
if(doc.type == "the document type") {
if(doc._myappvar) {
emit(doc._id, doc._myappvar); // The old way
} else if(doc.myapp_) {
emit(doc._id, doc.myapp_.var); // The new way
}
}
}
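If you do decide to convert the existing documents, the rewrite itself is small. The sketch below is meant to run in a client-side script, not inside CouchDB. It moves every non-reserved top-level field starting with _ into the myapp_ namespace, dropping the leading underscore. That naming rule and the function name migrate are assumptions for illustration, and the RESERVED list is not exhaustive, so check it against the special fields your CouchDB version uses.

```javascript
// Special fields that must stay at the top level (not an exhaustive list).
var RESERVED = ['_id', '_rev', '_attachments', '_deleted'];

// Return a copy of doc with custom "_name" fields moved to myapp_.name.
function migrate(doc) {
  var out = {};
  var internal = {};
  var moved = false;
  Object.keys(doc).forEach(function (key) {
    if (key.charAt(0) === '_' && RESERVED.indexOf(key) === -1) {
      internal[key.slice(1)] = doc[key];
      moved = true;
    } else {
      out[key] = doc[key];
    }
  });
  if (moved) {
    // Values already stored under myapp_ win over migrated ones.
    out.myapp_ = Object.assign({}, internal, doc.myapp_);
  }
  return out;
}

console.log(JSON.stringify(migrate({
  _id: 'a', _rev: '1-x', type: 'the document type', _myappvar: 'Something internal'
})));
```

Each migrated document would then be written back with its current _rev, for example through a bulk update.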
Good luck!
