CouchDB views - Multiple join... Can it be done?

I have three document types, MainCategory, Category, and SubCategory; each has a parentid that refers to the id of its parent document.
I want to set up a view so that I can get a list of SubCategories which sit under a given MainCategory (preferably just using a map function), but I haven't found a way to arrange the view so this is possible.
The view I currently have set up produces the following output:
{"total_rows":16,"offset":0,"rows":[
{"id":"11098","key":["22056",0,"11098"],"value":"MainCat...."},
{"id":"11098","key":["22056",1,"11098"],"value":"Cat...."},
{"id":"33610","key":["22056",2,"null"],"value":"SubCat...."},
{"id":"33989","key":["22056",2,"null"],"value":"SubCat...."},
{"id":"11810","key":["22245",0,"11810"],"value":"MainCat...."},
{"id":"11810","key":["22245",1,"11810"],"value":"Cat...."},
{"id":"33106","key":["22245",2,"null"],"value":"SubCat...."},
{"id":"33321","key":["22245",2,"null"],"value":"SubCat...."},
{"id":"11098","key":["22479",0,"11098"],"value":"MainCat...."},
{"id":"11098","key":["22479",1,"11098"],"value":"Cat...."},
{"id":"11810","key":["22945",0,"11810"],"value":"MainCat...."},
{"id":"11810","key":["22945",1,"11810"],"value":"Cat...."},
{"id":"33123","key":["22945",2,"null"],"value":"SubCat...."},
{"id":"33453","key":["22945",2,"null"],"value":"SubCat...."},
{"id":"33667","key":["22945",2,"null"],"value":"SubCat...."},
{"id":"33987","key":["22945",2,"null"],"value":"SubCat...."}
]}
Which query string parameters would I use to get, say, the rows whose key starts with ["22945", ...], when all I have at query time is the id "11810" (at query time I don't know the id "22945")?
If any of that makes sense.
Thanks

The way you store your categories seems suboptimal for the query you are trying to perform on it.
MongoDB.org has a page on various strategies for implementing tree structures (they should apply to CouchDB and other document databases as well) - you should consider Array of Ancestors, where you always store the full path to your node. This makes updating/moving categories more difficult, but querying is easy and fast.
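As a rough sketch of that idea in CouchDB terms (the type and ancestors field names are assumptions, not taken from the question), each SubCategory document would carry the ids of every document above it, and a map function would emit one row per ancestor:
// Example SubCategory document (hypothetical field names):
// { "_id": "33610", "type": "SubCategory", "name": "SubCat....",
//   "ancestors": ["11810", "22945"] }   // [MainCategory id, Category id]
// Map function: emit one row per ancestor so a SubCategory can be found
// by any id on its path.
function (doc) {
  if (doc.type === "SubCategory" && Array.isArray(doc.ancestors)) {
    doc.ancestors.forEach(function (ancestorId) {
      emit(ancestorId, doc.name);
    });
  }
}
Querying that view with ?key="11810" would then return every SubCategory under that MainCategory in a single request, without needing to know the intermediate Category id "22945".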

Related

Cloudant - apply a view/mapReduce to a geospatial query

Hi, I'm new to Cloudant (and CouchDB, and to asking questions on Stack Overflow, so I hope I manage to be vaguely clear about what I'm asking). I'm trying to do probably the second most basic geo task but am hitting a dead end.
I've got a database of docs which are GeoJSON objects, and I've created an index so I can query for intersections etc. However, it seems the only options I have in the URL are format=legacy (which gives me the ids), format=geojson, and the include_docs parameter. What I'd like is to get back a particular view of the result set - I'm not interested in the geometry of the object (which is a big lump of data), and it's likely that a number of other properties in the document should be filtered out as well.
Is there a correct way to do this in a single API call, or do I need to fetch the doc ids (legacy format) and then issue a second query to bring back my chosen 'view' for each document id returned in the format=legacy response?
Thanks

CouchDB: In a checklist app, new file for every list item?

I'm working on a checklist web app.
Every user can have lots of checklists, each with many items on them.
Would it be a good idea to keep the items in a JS object inside the individual checklist document? This would have been my first approach, since there wouldn't be a lot of sorting or anything happening on those items.
Now I'm thinking about putting every item into its own document (because I might do things like deadlines and assignments for individual items).
This seems like a lot of documents to me. Maybe I underestimate CouchDB. Would this be a good approach to the problem?
Store every list item as a new document.
Every list item document should have properties like listname, deadline and/or assignment.
You don't need (but can have) extra docs to express hierarchy or nested relations. That's what CouchDB views are for - e.g. you would build a view getListItemsByListname with the listname as key to get all items of one list.
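A minimal sketch of such a map function, assuming each item document carries a type marker alongside the listname/deadline/assignment properties mentioned above:
// Map function for the getListItemsByListname view.
// Emits one row per list item, keyed by the name of the list it belongs to.
function (doc) {
  if (doc.type === "listItem" && doc.listname) {
    emit(doc.listname, {
      deadline: doc.deadline || null,
      assignment: doc.assignment || null
    });
  }
}
Querying the view with ?key="shopping" (or adding include_docs=true to get the full item documents) then returns all items of that one list.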

Search for documents by key using Domino Data Service

The Domino Data Service is a good thing, but is it possible to search for documents by key?
I didn't find anything about this in the API or the URL parameters.
I tried the calls shown in the other answer, and the requests usually fail with a server timeout after 30 seconds. Calls to /api/data/documents won't serve this purpose with parameters like sortcolumn or keysexactmatch; calls to /api/data/collections should be used for these instead.
Also, I don't think arguments like sortcolumn would work on a document collection, because there isn't a column to sort on in the first place - columns exist in views, not in documents - so a view collection should be queried instead. That also mimics the behavior of the getDocumentByKey method, which can't be called against a document, only against a view. So instead of:
http://HOSTNAME/DATABASE.nsf/api/data/documents?search=QUERY&searchmaxdocs=N
I would call
http://HOSTNAME/DATABASE.nsf/api/data/collections/name/viewname?search=QUERY&searchmaxdocs=N
and instead of
http://HOSTNAME/DATABASE.nsf/api/data/documents?sortcolumn=COLUMN&sortorder=ascending&keys=ROWVALUE&keysexactmatch=true
I would call:
http://HOSTNAME/DATABASE.nsf/api/data/collections/name/viewname?sortcolumn=COLUMN&sortorder=ascending&keys=ROWVALUE&keysexactmatch=true
where 'viewname' is the name of the view that is searched.
That is much faster, which comes in handy when working with larger databases.
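As a small illustration of the collections-based key lookup (hostname, database, view, column, key value and credentials are all placeholders, and Node's global fetch is just one way to issue the GET):
// Key lookup against a view-backed collection; all names below are placeholders.
const url = "http://HOSTNAME/DATABASE.nsf/api/data/collections/name/viewname"
  + "?sortcolumn=COLUMN&sortorder=ascending"
  + "&keys=" + encodeURIComponent("ROWVALUE")
  + "&keysexactmatch=true";
const response = await fetch(url, {
  headers: { Authorization: "Basic " + Buffer.from("user:password").toString("base64") }
});
const entries = await response.json();   // array of matching view entries
console.log(entries.length, "matching view entries");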
You would do something like the following:
GET http://HOSTNAME/DATABASE.nsf/api/data/documents?search=QUERY&searchmaxdocs=N
N would be the total number of documents to return and QUERY would be your search phrase; the query works the same way as a full-text search.
For column lookups it should be something like this:
GET http://HOSTNAME/DATABASE.nsf/api/data/documents?sortcolumn=COLUMN&sortorder=ascending&keys=ROWVALUE&keysexactmatch=true
COLUMN would be the column name. ROWVALUE would be the key you are looking for.
There are further options for this. More details here.
http://infolib.lotus.com/resources/domino/8.5.3/doc/designer_up1/en_us/DominoDataService.html#migratingtowebsphereportalversion7.0

How can I configure Sitecore search to retrieve custom values from the search index

I am using the AdvancedDatabaseCrawler as a base for my search page. I have configured it so that I can search for what I want, and it is very fast. The problem is that as soon as you want to do anything with the search results that requires accessing field values, performance degrades dramatically.
The main search results part is fine: even if there are 1000 results returned from the search, I only show 10 or 20 results per page, which means I only have to retrieve 10 or 20 items. However, in the sidebar I list various filtering options with the number of results associated with each option (eBay style). In order to retrieve these filter options I perform a relationship search based on the search results. Since the search results only contain SkinnyItems, it has to call GetItem() on every single result to get the actual item and read the value I'm filtering by. In other words it will call Database.GetItem(id) 1000 times! Obviously that is not terribly efficient.
Am I missing something here? Is there any way to configure Sitecore search to retrieve custom values from the search index? If I can search on the values in the index, why can't I also retrieve them? If I can't, how else can I process the results without getting each individual item from the database?
Here is an idea of the functionality that I’m after: http://cameras.shop.ebay.com.au/Digital-Cameras-/31388/i.html
Klaus answered on SDN: use faceting with Apache Solr or similar.
http://sdn.sitecore.net/SDN5/Forum/ShowPost.aspx?PostID=35618
I've currently resolved this by defining dynamic fields for every field that I need to filter by or return in the search result collection. That way I can achieve the faceted searching that is required without needing to grab field values from the database. I'm assuming that adding the dynamic fields means taking a performance hit when rebuilding the index, but I can live with that.
In the future we'll probably look at utilizing a product like Apache Solr.

Solr Merging Results of 2 Cores Into Only Those Results That Have A Matching Field

I am trying to figure out how I can accomplish the following, and none of the answers I have found so far seem to fit:
I have a fairly static and large set of resources I need to have indexed and searchable. Solr seems to be a perfect fit for that. In addition I need to have the ability for my users to add resources from the main data set to a 'Favourites' folder (which can include a few more tags added by them). The Favourites needs to be searchable in the same manner as the main data set, across all the same fields plus the additional ones.
My first thought was to have two separate schemas
- the first for the main data set and its metadata
- the second for the Favourites folder with all of the metadata from the main set copied over and then adding the additional fields.
Then I thought that would probably waste quite a bit of space (the number of users is much larger than the number of main resources).
So then I thought I could have the main data set with its metadata (Core0), same as above, with the resourceId as the unique identifier. Then there would be a second core (Core1) for the Favourites folder, with a unique id made of the resourceId, userId, grade, and folder all concatenated; the resourceId would also be a separate field. In addition, I would create another schema/core (Core3) with all the fields from the other two and have a request handler defined on it that searches across the other two cores and returns the results through this core.
This third core would have searches run against it where the results are expected to be returned for a single user only. For example, a user searches their Favourites folder for all the items with Foo. The result is only those items the user has added to their Favourites that have Foo somewhere in their main data set metadata. I guess the request handler on Core3 would break the search up into a search for all documents with Foo in Core0 and a search across Core1 for the userId and folder, then match up the resourceIds from both and eliminate those not in both. Or it could run a search on Core1 with the userId and folder and, having gotten that result set back, extract all the resourceIds and append an AND onto the Core0 query, like: AND (resourceId:1232232312 OR resourceId:838388383 OR resourceId:8637626491).
Could this be made to work? Or is there some simpler mechanism in Solr to resolve the merging of two searches across two cores, returning only the results that match on a (not necessarily unique) field in both?
Thanks.
The problem looks like a database join of two tables with the resource id as the foreign key.
Ignore this post if what I understood is wrong.
First, I would probably do it with a single core, with a field userid (indexed, but not stored), and reindex a document every time a new user favourites it by appending their user id (delimited by something the analyzer ignores).
So searching gets easier (userId:"kaka's id" will fetch all my favourites).
I think it takes some work to do this, and if the number of users who can favourite a document grows, the userid field gets really long.
So in that case I would move on to my next idea, which is similar to yours: have a second core with (userid, resourceid) and write a wrapper which first searches this core for all the favourites, then searches the other core for all those resources in a where-style condition (see the sketch below). But again, if a user favourites many resources, the query might exceed the GET method's size limit.
If neither seems to work, it's time to think of something more scalable, which leaves us with the same space-wasting option.
Am I missing something?
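A rough sketch of that wrapper, just to make the flow concrete (the core names favourites/resources, field names, the 1000-row cap and the local Solr URL are all assumptions for illustration; Node 18+ with the global fetch):
// 1. Fetch the resourceIds the user has favourited from the favourites core.
// 2. Query the main core with a filter query restricted to those ids.
async function searchFavourites(userId, folder, text) {
  const solr = "http://localhost:8983/solr";
  const favUrl = solr + "/favourites/select?" + new URLSearchParams({
    q: 'userId:"' + userId + '" AND folder:"' + folder + '"',
    fl: "resourceId",
    rows: "1000",
    wt: "json"
  });
  const favs = await (await fetch(favUrl)).json();
  const ids = favs.response.docs.map(function (d) { return d.resourceId; });
  if (ids.length === 0) return [];
  const mainUrl = solr + "/resources/select?" + new URLSearchParams({
    q: text,
    fq: "resourceId:(" + ids.join(" OR ") + ")",
    wt: "json"
  });
  const main = await (await fetch(mainUrl)).json();
  return main.response.docs;
}
Putting the id list in fq rather than appending it to q keeps the main query cache-friendly, and sending the second request as a POST to /select would sidestep the GET size limit mentioned above.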
