Cloudant - apply a view/mapReduce to a geospatial query - couchdb

HI I'm new to cloudant (and couch and asking questions on stackoverflow so I hope I manage to be vaguely clear about what I'm asking ) and I'm trying to do probably the second most basic geo task but am hitting a dead end.
I've got a database of docs which are geojson objects, I've created an index so I can query for intersections etc but it seems the only options I have in the url is the format=legacy (gives me the ids) and the format=geojson and the include_docs parameter - what I'd like to do is give back a particular view of the result set - I'm not interested in the geometry of the object (which is a big lump of data and it's likely that a number of other properties may be in the document that I'd rather filter out)
is there a correct way to do this in a single api call or do I need to fetch the doc ids (legacy format) and then issue a second query to bring back my chosen 'view' for each document id given in the result of format=legacy response
Thanks

Related

Get IDs of nodes via the edge collection only

I am writing an application that stores external data in ArangoDB for further processing inside the application. Let's assume I am talking about Photos in Photosets here.
Due to the nature of used APIs, I need to fetch Photosets befor I can load Photos. In the Photosets API reply, there is a list of Photo IDs that I later use to fetch the Photos. So I created an edge collection called photosInSets and store the edges between Photosets and Photos, although the Photos are not there yet.
Later on, I need to get a list of all needed Photos to load them via the API. All IDs are numeric. At the moment, I use the following AQL query to fetch the IDs of all required Photos:
FOR edge
IN photosInSets
RETURN DISTINCT TO_NUMBER(
SUBSTITUTE(edge._from, "photos/", "")
)
However... this does not look like a nice solution. I'd like to (at least) get rid of the string operation to remove the collection name. What's the nice way to do that?
One way you can find this is with a join on the photosInSets edge collection back to the photos collection.
Try a query that looks like this:
FOR e IN photoInSets
LET item = (FOR v IN photos FILTER e._from == v._id RETURN v._key)
RETURN item
This joins the _from reference in photoInSets with the _id back in the photos collection, then pulls the _key from photos, which won't have the collection name as part of it.
Have a look at a photo item and you'll see there is _id, _key and _rev as system attributes. It's fine to use the _key value if you want a string, it's not necessary to implement your own unique id unless there is a burning reason why you can't expose _key.
With a little manipulation, you could even return an array of objects stating which photo._key is a member of which photoSet, you'll just have to have two LET commands and return both results. One looking at the Photo, one looking at the photoSet.
I'm not official ArangoDB support, but I'm interested if they have another way of doing this.

DocumentDB and client side javascript API for querying a collection

I'm playing with DocumentDB's client side javascript API. I want to be able to query a collection. I'd like to use a collection URL, something like:
"https://mydocumentdb.documents.azure.com:9443/dbs/my_db/colls/my_users"
but there appears to be no API function for me to query a documentdb collection without first having the database "self link" and then in turn getting the collections "self link". The only way to get these self links appears to be to first iterate through all my DB's and then pull the right self link, then iterate through my collections, get the collection, finally, use my self link that I got from the service to query the collection.
Really???
Not exactly.
You're correct in that you have to query for a collection's self-link prior to querying the collection. (I know... this can be quite annoying, and is being looked in to by the DocDB team).
However, there is no need to iterate through all of your DB/collections to retrieve self-links as they are indexed server-side.
It is better to query directly for the particular DB/collection you're looking for, which looks something like: client.queryCollections(database._self, 'SELECT * FROM collections c WHERE c.id="' + collectionId + '"'), where collectionId is an identifier you assign.

REST API design for a large mongo document

I'm having a large document (stored in a Mongo db) and i should expose this document as a REST API. By large I mean more then 200 fields with nested documents and nested list of documents.
My question is simple, what is the best approach to design a REST api for such document.
I see 2 options :
1/ Design a single endpoint for the document
[GET] /api/documents ==> will return an array with 1 doc ...
[GET] /api/documents/:id ==> will return the document by it's id
2/ design multiple endpoints for the document
[GET] /api/documents ==> returning all the first level details of the document
[GET] /api/documents/id/field1 ==> returning all inner doc (array of object) from field1 of the document
[GET] /api/documents/id/field1/nid ==> returning object nid from field1 of the document
The application which will consume the REST api will read and modify the data.
This question may seems tedious but for me this is fundamental to the good design of the application which will consume these REST services.
Thanks in advance for your help.
I would suggest going with your first approach, with the addition of a query parameter "depth". The parameter would indicate how many levels deep to fill in the document that's being returned. It can default to 1 or ALL, depending on your clients' needs.
GET /api/documents/342?depth=8
GET /api/documents/78?depth=ALL
That gives them the flexibility to pull as much information as they need without slamming them with the whole document subtree when they just need one node.
It would also be a standard practice for /documents to always return a collection of documents, not a single document. You could use a query parameter to find the root documents:
GET /api/documents?root=true
and then use the id of the root document to pull down however much you need from /api/documents/{id}.

How to get Post with Comments Count in single query with CouchDB?

How to get Post with Comments Count in single query with CouchDB?
I can use map-reduce to build standalone view [{key: post_id, value: comments_count}] but then I had to hit DB twice - one query to get the post, another to get comments_count.
There's also another way (Rails does this) - count comments manually, on the application server and save it in comment_count attribute of the post. But then we need to update the whole post document every time a new comment added or deleted.
It seems to me that CouchDB is not tuned for such a way, unlike RDBMS when we can update only the comment_count attribute in CouchDB we are forced to update the whole post document.
Maybe there's another way to do it?
Thanks.
The view's return json includes the document count as 'total_rows', so you don't need to compute anything yourself, just emit all the documents you want counted.
{"total_rows":3,"offset":0,"rows":[
{"id":...,"key":...,value:doc1},
{"id":...,"key":...,value:doc2},
{"id":...,"key":...,value:doc3}]
}

CouchDB views - Multiple join... Can it be done?

I have three document types MainCategory, Category, SubCategory... each have a parentid which relates to the id of their parent document.
So I want to set up a view so that I can get a list of SubCategories which sit under the MainCategory (preferably just using a map function)... I haven't found a way to arrange the view so this is possible.
I currently have set up a view which gets the following output -
{"total_rows":16,"offset":0,"rows":[
{"id":"11098","key":["22056",0,"11098"],"value":"MainCat...."},
{"id":"11098","key":["22056",1,"11098"],"value":"Cat...."},
{"id":"33610","key":["22056",2,"null"],"value":"SubCat...."},
{"id":"33989","key":["22056",2,"null"],"value":"SubCat...."},
{"id":"11810","key":["22245",0,"11810"],"value":"MainCat...."},
{"id":"11810","key":["22245",1,"11810"],"value":"Cat...."},
{"id":"33106","key":["22245",2,"null"],"value":"SubCat...."},
{"id":"33321","key":["22245",2,"null"],"value":"SubCat...."},
{"id":"11098","key":["22479",0,"11098"],"value":"MainCat...."},
{"id":"11098","key":["22479",1,"11098"],"value":"Cat...."},
{"id":"11810","key":["22945",0,"11810"],"value":"MainCat...."},
{"id":"11810","key":["22945",1,"11810"],"value":"Cat...."},
{"id":"33123","key":["22945",2,"null"],"value":"SubCat...."},
{"id":"33453","key":["22945",2,"null"],"value":"SubCat...."},
{"id":"33667","key":["22945",2,"null"],"value":"SubCat...."},
{"id":"33987","key":["22945",2,"null"],"value":"SubCat...."}
]}
Which QueryString parameters would I use to get say the rows which have a key that starts with ["22945".... When all I have (at query time) is the id "11810" (at query time I don't have knowledge of the id "22945").
If any of that makes sense.
Thanks
The way you store your categories seems to be suboptimal for the query you try to perform on it.
MongoDB.org has a page on various strategies to implement tree-structures (they should apply to Couch and other doc dbs as well) - you should consider Array of Ancestors, where you always store the full path to your node. This makes updating/moving categories more difficult, but querying is easy and fast.

Resources