I would like to do more complex queries on jsonb/documents that contain arrays of objects. Is there any library anyone would recommend for Node? I am using pg but I want to do more advanced queries like select the document where a document has an array with an object with a certain key/value. If there aren't any libraries that do this, does anyone know how I could do it with json functions/etc in psql? or point me to a book/resource where I could learn this advanced querying?
If you need to do really complicated things you're going to be writing SQL no matter what. But for basic queries that involve working with JSONB fields Massive (full disclosure, it's my project) has you covered, and executing handwritten prepared statements is as easy as anything else since scripts are loaded into the API.
Searching an embedded array falls into the 'really complicated' category, unfortunately, but if you know your element positions you could do this quite simply with Massive:
await db.mytable.find({
'somejson.arrayfield[0].key': 'value'
});
This would return all records from mytable where the somejson column has an arrayfield array, the first element in which contains a "key": "value" pair.
For searching, check out the Postgres docs. The specific question you have requires a lateral join on the jsonb_array_elements function like so:
SELECT somejson
FROM mytable
JOIN LATERAL jsonb_array_elements(mytable.somejson->'arrayfield') AS elements
ON TRUE
WHERE elements->>'key' = $1;
With Massive, you'd put this query in a script in your application's /db directory and run it as db.myScriptName('value'). You can use folders to group similar scripts too.
We want to check if a document already exists in the database with the same fields and values of a new object we are trying to save to prevent duplicated item.
Note: This question is not about updating documents or about duplicated document IDs, we only check the data to prevent saving a new document with the same data of an existing one.
Preferably we'd like to accomplish this with Mango/Cloudant queries and not rely on views.
The idea so far is:
1) Scan the the data that we are trying to save and dynamically create a selector that matches that document's structure. (We can't have the selectors hardcoded because we have types of many documents)
2) Query de DB with for any documents matching that selector to if any document already exists that matches those criteria.
However I wonder about the performance of this approach since many of the selector fields will not be indexed.
I also much rather follow best practices than create something out of the blue, but haven't been able to find any known solutions for this specific scenario.
If you happen to know of any, please share.
Option 1 - Define a meaningful ID for your documents
The ID could be a logical coposition or a computed hash from the values that should be unique
If you want to check if a document ID already exists you can use the HEAD method
HEAD /db/docId
which returns 200-OK if the docId exits on the database.
If you would like to check if you have the same content in the new document and in the previous one, you may use the Validate Document Update Function which allows to compare both documents.
function(newDoc, oldDoc, userCtx, secObj) {
...
}
Option 2 - Use content hash computed outside CouchDB
Before create or update a document a hash should be computed using the values of the attributes that should be unique.
The hash is included in the document in a new attribute i.e. "key_hash"
Create a mango index using the "key_hash" attribute
When a new doc should be inserted, the hash should be computed and find for documents with the same hash value using a mango expression before the doc is inserted.
Option 3 - Compute hash in a View
Define a view which emit the computed hash for each document as key
Couchdb Javascript support does not include hashing functions, this could be difficult to include in a design document.
Use erlang to define the map function, where you can access to the erlang support for hashing.
Before creating a new document you should query the view using a the hash that you need to compute previously.
One solution would be to take Juanjo's and Alexis's comment one step further.
Select the keys you wish to keep unique
Put the values in a string and generate a hash
Set the document's _id to that hash
PUT the document on the database.
check return for failure
If another document already exists on the database with the same _id value, the PUT request will fail.
Folks, I was wondering what is the best way to model document and/or map functions that allows me "Not Equals" queries.
For example, my documents are:
1. { name : 'George' }
2. { name : 'Carlin' }
I want to trigger a query that returns every documents where name not equals 'John'.
Note: I don't have all possible names before hand. So the parameters in query can be any random text like 'John' in my example.
In short: there is no easy solution.
You have four options:
sending a multi range query
filter the view response with a server-side list function
using a CouchDB plugin
use the mango query language
sending a multi range query
You can request the view with two ranges defined by startkey and endkey. You have to choose the range so, that the key John is not requested.
Unfortunately you have to find the commit request that somewhere exists and compile your CouchDB with it. Its not included in the official source.
filter the view response with a server-side list function
Its not recommended but you can use a list function and ignore the row with the key John in your response. Its like you will do it with a JavaScript array.
using a CouchDB plugin
Create an additional index with e.g. couchdb-lucene. The lucene server has such query capabilities.
use the "mango" query language
Its included in the CouchDB 2.0 developer preview. Not ready for production but will be definitely included in the stable release.
Domino Data Service is a good thing but is it possible to search for documents by key.
I didnt find anything in the api and the url parameters about it.
I tried the above and the requests usually fail on the server timeout after 30 seconds. Calls to /api/data/documents won't serve the purpose with parameters like sortcolumn or keysexactmatch, therefore calls to
/api/data/collections should be used for these.
Also, I don't think that arguments like sortcolumn would work on a document collection, because there isn't a column to be sorted in the first place, columns are in the views and not in documents, so view collection should be queried instead. That also mimics the behavior of getDocumentByKey method, which can't be called against document, but against the view. So instead:
http://HOSTNAME/DATABASE.nsf/api/data/documents?search=QUERY&searchmaxdocs=N
I would call
http://HOSTNAME/DATABASE.nsf/api/data/collections/name/viewname?search=QUERY&searchmaxdocs=N
and instead of
http://HOSTNAME/DATABASE.nsf/api/data/documents?sortcolumn=COLUMN&sortorder=ascending&keys=ROWVALUE&keysexactmatch=true
I would call:
http://HOSTNAME/DATABASE.nsf/api/data/collections/name/viewname?sortcolumn=COLUMN&sortorder=ascending&keys=ROWVALUE&keysexactmatch=true
where 'viewname' is the name of the view that is searched.
That is much faster, which comes in handy when working with larger databases.
You would do something like the following:
GET http://HOSTNAME/DATABASE.nsf/api/data/documents?search=QUERY&searchmaxdocs=N
N would be the total number of documents to return and QUERY would be your search phrase. The QUERY would be the same as doing a full text search.
For column lookups it should be something like this:
GET http://HOSTNAME/DATABASE.nsf/api/data/documents?sortcolumn=COLUMN&sortorder=ascending&keys=ROWVALUE&keysexactmatch=true
COLUMN would be the column name. ROWVALUE would be the key you are looking for.
There are further options for this. More details here.
http://infolib.lotus.com/resources/domino/8.5.3/doc/designer_up1/en_us/DominoDataService.html#migratingtowebsphereportalversion7.0
I have three document types MainCategory, Category, SubCategory... each have a parentid which relates to the id of their parent document.
So I want to set up a view so that I can get a list of SubCategories which sit under the MainCategory (preferably just using a map function)... I haven't found a way to arrange the view so this is possible.
I currently have set up a view which gets the following output -
{"total_rows":16,"offset":0,"rows":[
{"id":"11098","key":["22056",0,"11098"],"value":"MainCat...."},
{"id":"11098","key":["22056",1,"11098"],"value":"Cat...."},
{"id":"33610","key":["22056",2,"null"],"value":"SubCat...."},
{"id":"33989","key":["22056",2,"null"],"value":"SubCat...."},
{"id":"11810","key":["22245",0,"11810"],"value":"MainCat...."},
{"id":"11810","key":["22245",1,"11810"],"value":"Cat...."},
{"id":"33106","key":["22245",2,"null"],"value":"SubCat...."},
{"id":"33321","key":["22245",2,"null"],"value":"SubCat...."},
{"id":"11098","key":["22479",0,"11098"],"value":"MainCat...."},
{"id":"11098","key":["22479",1,"11098"],"value":"Cat...."},
{"id":"11810","key":["22945",0,"11810"],"value":"MainCat...."},
{"id":"11810","key":["22945",1,"11810"],"value":"Cat...."},
{"id":"33123","key":["22945",2,"null"],"value":"SubCat...."},
{"id":"33453","key":["22945",2,"null"],"value":"SubCat...."},
{"id":"33667","key":["22945",2,"null"],"value":"SubCat...."},
{"id":"33987","key":["22945",2,"null"],"value":"SubCat...."}
]}
Which QueryString parameters would I use to get say the rows which have a key that starts with ["22945".... When all I have (at query time) is the id "11810" (at query time I don't have knowledge of the id "22945").
If any of that makes sense.
Thanks
The way you store your categories seems to be suboptimal for the query you try to perform on it.
MongoDB.org has a page on various strategies to implement tree-structures (they should apply to Couch and other doc dbs as well) - you should consider Array of Ancestors, where you always store the full path to your node. This makes updating/moving categories more difficult, but querying is easy and fast.