I have a bunch of MP3 metadata in CouchDB. I want to return every album that appears in that metadata, without duplicates.
A typical document looks like this:
{
"_id": "005e16a055ba78589695c583fbcdf7e26064df98",
"_rev": "2-87aa12c52ee0a406084b09eca6116804",
"name": "Fifty-Fifty Clown",
"number": 15,
"artist": "Cocteau Twins",
"bitrate": 320,
"album": "Stars and Topsoil: A Collection (1982-1990)",
"path": "Cocteau Twins/Stars and Topsoil: A Collection (1982-1990)/15 - Fifty-Fifty Clown.mp3",
"year": 0,
"genre": "Shoegaze"
}
I believe your map/reduce would look something like:
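// Map: emit each album name as the key.
function (doc) {
  emit(doc.album, null);
}
// Reduce: the value is irrelevant; grouping by key removes the duplicates.
function (keys, values) {
  return null;
}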
Remember to query the view with the extra parameter group=true.
Have a look at the "Get Unique Values" section of the View Cookbook for SQL Jockeys.
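To see why grouping returns each album only once, here is a minimal in-memory sketch in plain JavaScript. It is not CouchDB's API: queryGrouped is a hypothetical helper, emit is passed in as an argument (CouchDB provides it as a global inside view functions), the keys argument of the reduce is simplified, and key ordering is plain string sorting rather than CouchDB's collation. The second album name is made-up sample data.

```javascript
// Hypothetical stand-in for querying a CouchDB view with group=true.
function queryGrouped(docs, mapFn, reduceFn) {
  const groups = new Map();
  for (const doc of docs) {
    // Collect every emitted value under its key.
    mapFn(doc, (key, value) => {
      if (!groups.has(key)) groups.set(key, []);
      groups.get(key).push(value);
    });
  }
  // One output row per distinct key (string keys only in this sketch).
  return [...groups.keys()]
    .sort()
    .map((key) => ({ key, value: reduceFn([key], groups.get(key)) }));
}

// Sample data: two tracks share an album, one belongs to another.
const docs = [
  { name: "Track 1", album: "Stars and Topsoil: A Collection (1982-1990)" },
  { name: "Track 2", album: "Another Album" },
  { name: "Track 3", album: "Stars and Topsoil: A Collection (1982-1990)" },
];

const rows = queryGrouped(
  docs,
  (doc, emit) => emit(doc.album, null), // same logic as the map function
  () => null                            // same logic as the reduce function
);
const albums = rows.map((row) => row.key);
console.log(albums); // two entries, one per distinct album
```

Against a real database the equivalent step is reading the key of each row in the view response.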
I am trying to query on attributes of a populated child document using Mongoose, but it doesn't work: the query returns an empty array every time.
Even hardcoding values that I know exist in the database returns an empty array.
My schema is a business schema with a one-to-one relationship to a user schema through the createdBy attribute. The user schema has a name attribute, which is what I am trying to query on.
So if I make a query like this:
business.find({'createdBy.name': {$regex:"steve"}}).populate('createdBy')
The above never returns any documents, although without the find condition everything works fine.
Can I search by the name inside a populated child or not? All the tutorials say this should work, but it just doesn't.
EDIT: an example of what a record looks like:
{
  "_id": "5fddedd00e8a7e069085964f",
  "status": 6,
  "addInfo": "",
  "descProduit": "",
  "createdBy": {
    "_id": "5f99b1bea9ba194dec3bd6aa",
    "status": 1,
    "fcmtokens": [],
    "emailVerified": 1,
    "phoneVerified": 0,
    "userType": "User",
    "name": "steve buschemi",
    "firstName": "steve",
    "lastName": "buschemi",
    "tel": "",
    "email": "steve@buschemi.com",
    "register_token": "747f1e1e8fa1ecd2f1797bb402563198",
    "createdAt": "2020-10-28T18:00:30.814Z",
    "updatedAt": "2020-12-18T13:52:07.430Z",
    "__v": 19,
    "business": "5f99b1e101bfff39a8259457",
    "credit": 635
  },
  "createdAt": "2020-12-19T12:10:57.703Z",
  "updatedAt": "2020-12-19T12:11:16.538Z",
  "__v": 0,
  "nid": "187"
}
It seems there is no way to filter parent documents by conditions on child documents:
From the official documentation:
In general, there is no way to make populate() filter stories based on properties of the story's author. For example, the below query won't return any results, even though author is populated.
const story = await Story.
findOne({ 'author.name': 'Ian Fleming' }).
populate('author').
exec();
story; // null
If you want to filter stories by their author's name, you should use denormalization.
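If denormalization is not an option, one workaround is to query in two steps: first find the matching users, then find the businesses whose createdBy is one of their ids. A minimal sketch, assuming Business and User are Mongoose models like those in the question and that createdBy stores the user's ObjectId; the helper name findBusinessesByCreatorName is hypothetical.

```javascript
// Two-step lookup: resolve the child ids first, then filter parents by reference.
// Business and User are assumed to be Mongoose models; pattern is a regex string.
async function findBusinessesByCreatorName(Business, User, pattern) {
  // Step 1: ids of the users whose name matches (case-insensitive).
  const users = await User.find({ name: { $regex: pattern, $options: "i" } })
    .select("_id")
    .exec();
  const userIds = users.map((user) => user._id);

  // Step 2: businesses created by any of those users, with createdBy populated.
  return Business.find({ createdBy: { $in: userIds } })
    .populate("createdBy")
    .exec();
}
```

It would be called as `await findBusinessesByCreatorName(Business, User, "steve")`. This costs two round trips to the database, which is usually acceptable when the number of matching users is small.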
I have a CouchDB database with documents that look like this:
{
"_id": "000040cc-e3b4-47cc-b051-a5508efb8996",
"_rev": "1-882d7f88cc2e1e767b55d0c82fb638d2",
"state": "uploaded",
"state_since": "2020-02-17T11:20:55.1450252Z"
// more metadata ...
"_attachments": {
"large.jpg": {
"content_type": "image/jpeg",
"revpos": 1,
"digest": "md5-NK7ejYjrErhMAs7tZ4+R8w==",
"length": 87846,
"stub": true
},
"medium.jpg": {
...
},
"small.jpg": {
...
}
}
}
Let's assume I want to query a set of images like this:
{
"selector": {
"state": "uploaded"
},
"sort": ["state_since"],
"limit": 100
}
If I want to display the thumbnails of those 100 images, I have to iterate through the result list and download the corresponding attachment for each document. That is 101 requests in total.
I could also do it in one request by specifying that I want to fetch the documents with their attachments, but this would return all the (potentially very large) attachments.
I know that I can set the fields property in my query to return only the fields I need. Can I apply this to attachments, too? And if so, how?
No, there's no way to do what you're requesting. The only ways to fetch a subset of attachments are to fetch them one at a time, or to use the atts_since parameter when fetching a single document, which is intended for use in replication.
Consider redesigning your documents instead. For example, you could store the thumbnails in a separate document that contains nothing but thumbnails.
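If the documents stay as they are, fetching one attachment at a time comes down to building one URL per document in the query result. A rough sketch, assuming the query response has already been parsed into an object with a docs array; the base URL, the database name images, and the helper thumbnailUrls are placeholders that do not come from the question.

```javascript
// Build one attachment URL per document returned by the query.
// CouchDB serves a single attachment at /{db}/{docid}/{attname}.
function thumbnailUrls(baseUrl, db, findResult, attachmentName) {
  const root = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return findResult.docs.map((doc) =>
    [
      root,
      encodeURIComponent(db),
      encodeURIComponent(doc._id),
      encodeURIComponent(attachmentName),
    ].join("/")
  );
}

// Sample result with a single document (id taken from the question).
const sampleResult = {
  docs: [{ _id: "000040cc-e3b4-47cc-b051-a5508efb8996", state: "uploaded" }],
};
const urls = thumbnailUrls("http://localhost:5984/", "images", sampleResult, "small.jpg");
console.log(urls[0]);
// http://localhost:5984/images/000040cc-e3b4-47cc-b051-a5508efb8996/small.jpg
```

This does not reduce the number of requests; it only keeps each one limited to the small attachment.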
I would like to create a map/reduce function that filters documents based on a nested value in a child document but retrieves the parent document.
I have the following documents:
{
"_id": "1",
"_rev": "1-991baf1d86435a73a3460335cc19063c",
"configuration_id": "225f9d47-841c-43c2-90c2-e65bb49083d3",
"name": "test",
"image": "",
"type": "A",
"created": "",
"updated": 1,
"destroyed": ""
}
{
"_id": "225f9d47-841c-43c2-90c2-e65bb49083d3",
"_rev": "1-3e3a1c357c86cbd1cd42b5980b9655a4",
"configuration_packages_id": "cd19b0ba-157d-4dd4-adac-56fd470bfed4",
"configuration_distribution_id": "5b538411-ca99-46c7-ac3c-1f382e4577a9",
"type": "CONFIGURATION",
"configuration": {
"hostname": "example123",
"images": [
"image1",
"image2"
]
}
}
Now I would like to retrieve all the documents of type A whose configuration has the hostname example123.
At the moment I retrieve all the documents of type A like this:
function (doc) {
  if (doc.type === "A") {
    emit([doc.updated], doc);
  }
}
But now I would also like to filter on the hostname.
I'm not sure how to achieve this with CouchDB.
TL;DR
You cannot do this directly.
Details
Your "nested" document is a separate document, so it is only reachable through a join; a view cannot filter parents on its contents.
The correct way to support this kind of query natively would have been to nest the configuration inside the parent document. Keeping the documents separate has a cost.
Join example
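function (doc) {
  if (doc.type === "A") {
    // Row 0: the parent document itself.
    emit([doc.updated, 0], null);
    // Row 1: with include_docs=true, a value containing "_id" makes CouchDB return the linked document.
    emit([doc.updated, 1], { "_id": doc.configuration_id });
  }
}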
If you query the view with include_docs=true, each parent produces two rows: one carrying the parent document itself (0) and one carrying the linked configuration document (1). You can then fetch the rows, merge each configuration (1) with its parent (0) on the client, and filter on the hostname there.
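The client-side merge and filter could be sketched as follows. This assumes each row of the view response has the shape { id, key: [updated, 0 | 1], value, doc }, where id is the document that emitted the row (the parent in both cases); the helper name parentsWithHostname and the sample rows are illustrative only.

```javascript
// Merge rows from the join view (queried with include_docs=true) and keep the
// parents whose linked configuration has the wanted hostname.
function parentsWithHostname(rows, hostname) {
  const byParentId = new Map();
  for (const row of rows) {
    // Both rows of a pair share row.id, the id of the emitting parent document.
    const entry = byParentId.get(row.id) || {};
    if (row.key[1] === 0) {
      entry.parent = row.doc; // the parent itself
    } else {
      entry.config = row.doc; // the linked configuration document
    }
    byParentId.set(row.id, entry);
  }
  return [...byParentId.values()]
    .filter((e) => e.parent && e.config && e.config.configuration &&
      e.config.configuration.hostname === hostname)
    .map((e) => e.parent);
}

// Sample rows for two parents; only the first links to "example123".
const sampleRows = [
  { id: "1", key: [1, 0], value: null, doc: { _id: "1", type: "A" } },
  { id: "2", key: [1, 0], value: null, doc: { _id: "2", type: "A" } },
  { id: "1", key: [1, 1], value: { _id: "c1" },
    doc: { _id: "c1", configuration: { hostname: "example123" } } },
  { id: "2", key: [1, 1], value: { _id: "c2" },
    doc: { _id: "c2", configuration: { hostname: "other" } } },
];
const matches = parentsWithHostname(sampleRows, "example123");
console.log(matches.map((doc) => doc._id)); // [ '1' ]
```

Pairing by row.id rather than by row position keeps the merge correct when several parents share the same updated value.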
How would I create a view equivalent to a SQL query like this?
SELECT * FROM bucket WHERE (uid='$uid' AND accepted='Y') OR (uid='$uid' AND authorid='$logginid')
My data is stored this way:
{
"id": 9476183,
"authorid": 85490,
"content": "some text here",
"uid": 41,
"accepted": "Y",
"time": "2014-12-09 10:44:01",
"type": "testimonial"
}
function(doc) {
  if (doc.accepted == 'Y') {
    emit(doc.uid, null);
  }
  emit([doc.uid, doc.authorid], null);
}
One request is enough. You can query the view written by @Simon (reproduced above) using POST with the parameter keys: [[uid, authorid], uid].
See http://docs.couchdb.org/en/latest/api/ddoc/views.html#post--db-_design-ddoc-_view-view for more details.
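As a rough illustration of the single-request approach, the sketch below builds the POST body and removes duplicate rows from the response. A document that is accepted and also written by the given author matches both keys, so it can come back twice. The helper names buildKeysBody and dedupeRows and the sample rows are assumptions for illustration.

```javascript
// Body for POST /{db}/_design/{ddoc}/_view/{view}: both keys in one request.
function buildKeysBody(uid, authorId) {
  return { keys: [[uid, authorId], uid] };
}

// Keep only the first row per document id.
function dedupeRows(rows) {
  const seen = new Set();
  return rows.filter((row) => {
    if (seen.has(row.id)) return false;
    seen.add(row.id);
    return true;
  });
}

const body = buildKeysBody(41, 85490);
console.log(JSON.stringify(body)); // {"keys":[[41,85490],41]}

// Sample response rows: the first document matched both keys.
const responseRows = [
  { id: "doc-a", key: [41, 85490], value: null },
  { id: "doc-a", key: 41, value: null },
  { id: "doc-b", key: 41, value: null },
];
const uniqueRows = dedupeRows(responseRows);
console.log(uniqueRows.map((row) => row.id)); // [ 'doc-a', 'doc-b' ]
```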
A view could look like this:
function(doc) {
  if (doc.accepted == 'Y') {
    emit(doc.uid, null);
  }
  emit([doc.uid, doc.authorid], null);
}
You would query it with key=$uid first. If there is no match, you would query it with key=[$uid,$loginid].
I want to model a parent/child relation in Elasticsearch and run some analytics over it. My use case is something like this.
I have a shop owner record like this:
"_source": {
"shopId": 5,
"distributorId": 4,
"stateId": 1,
"partnerId": 2,
}
and now have child records (for each day) like this:
"_source": {
"shopId": 5,
"date" : 2013-11-13,
"transactions": 150,
"amount": 1980,
}
The parent is one record per store, while each child holds the transactions a store makes in one day. Now I want to run a more complex query, for example:
Find the total transactions for each day over the last 30 days where the distributor is 5.
POST /newdb/shopsDaily/_search
{
"query": {
"match_all": {}
},
"filter": {
"has_parent": {
"type": "shop",
"query": {
"match": {
"distributorId": "5"
}
}
}
},
"facets": {
"date": {
"histogram": {
"key_field": "date",
"value_field": "transactions",
"interval": 100
}
}
}
}
But the results I get do not take the filter I applied into account.
So I changed the query to this:
POST /newdb/shopDaily/_search
{
"query": {"filtered": {
"query": {"match_all": {}},
"filter": { "has_parent": {
"type": "shop",
"query": {"match": {
"distributorId": "13"
}}
}}
}},
"facets": {
"date": {
"histogram": {
"key_field": "date",
"value_field": "transactions",
"interval": 100
}
}
}
}
And then the final histogram facet took the filtering into account.
While browsing around, I found out that this is because I used filtered (which can only be used inside the query clause, not outside it like filter) rather than filter.
But the same source also mentioned that for fast searches you should use filter. Will searching as I did in the second step (with filtered instead of filter) affect the performance of Elasticsearch? If so, how can I make my facets honor filters without hurting performance?
Thanks for your time.
Filters inside a filtered query (that is, inside the query clause) are cached, which makes them fast. They affect both the search results and the facet counts.
Filters outside the query clause are ignored when facets are calculated; they apply only to the search results, because facets are computed from the query clause alone. If you want such a filter to apply to a facet as well, you need to set it on each facet clause.
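Setting the filter on the facet itself could look like the sketch below, which builds the request body as a JavaScript object. It assumes the legacy facets API used in the question, where a facet accepts a facet_filter entry; the helper name buildFacetRequest is hypothetical, and the field names are copied from the question.

```javascript
// Build a search body in which the facet carries its own copy of the filter.
function buildFacetRequest(distributorId) {
  const parentFilter = {
    has_parent: {
      type: "shop",
      query: { match: { distributorId: String(distributorId) } },
    },
  };
  return {
    query: { match_all: {} },
    filter: parentFilter, // restricts the returned hits only
    facets: {
      date: {
        histogram: {
          key_field: "date",
          value_field: "transactions",
          interval: 100,
        },
        facet_filter: parentFilter, // restricts what this facet counts
      },
    },
  };
}

const requestBody = buildFacetRequest(5);
console.log(JSON.stringify(requestBody, null, 2));
```

The serialized object would be sent as the body of the same _search request shown in the question.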