How to find documents which match a subset of keys - couchdb

Let's say we have the following data structure:
{
"name": "",
"tags": []
}
With the following example data:
{
"name": "Test1",
"tags": [
"Laptop",
"Smartphone",
"Tablet"
]
}
{
"name": "Test2",
"tags": [
"Computer",
"Laptop",
"Smartphone",
"Tablet"
]
}
{
"name": "Test3",
"tags": [
"Smartphone",
"Tablet"
]
}
Now I am trying to find:
Find all documents with Smartphone AND Tablet in tags. This should return all documents.
I can't figure out how this works with CouchDB. I tried adding the tags as keys and played with startkey / endkey, with no luck.
Hope somebody can help me out.
Greetings,
Ben

I see two possible solutions. Why don't you have a view with a map function like:
function(doc) {
doc.tags && doc.tags.forEach(function(tag) {
emit(tag, null);
});
}
Now if you query this view with keys=["Smartphone", "Tablet"] you will get the following lines:
{id: "A", key: "Smartphone", value: null},
{id: "B", key: "Smartphone", value: null},
{id: "C", key: "Smartphone", value: null},
{id: "A", key: "Tablet", value: null},
{id: "B", key: "Tablet", value: null},
{id: "C", key: "Tablet", value: null}
Now you have to parse this response well on the client side, to filter out the ids which don't show up for all the keys of your query. In this case all the documents (A, B, C) show up, so this is your result. Once you have it, you can use bulk get to fetch the values with:
POST http://...:../db/_all_docs?include_docs=true
with keys=["A", "B", "C"]
This is how I'd do it.
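The client-side filtering step described above can be sketched in a few lines. This is only a minimal Python illustration, assuming the view response has already been parsed into a list of row dicts shaped like the ones shown; the function name ids_matching_all_keys and the extra document "D" are made up for the example.

```python
from collections import defaultdict


def ids_matching_all_keys(rows, wanted_keys):
    """Return the ids that have a row for every key in wanted_keys."""
    wanted = set(wanted_keys)
    keys_seen = defaultdict(set)  # doc id -> set of requested keys it matched
    for row in rows:
        if row["key"] in wanted:
            keys_seen[row["id"]].add(row["key"])
    # Keep only the ids that matched the complete set of requested keys.
    return sorted(doc_id for doc_id, keys in keys_seen.items() if keys == wanted)


# Rows as returned by the view for keys=["Smartphone", "Tablet"], plus a
# hypothetical document "D" that is tagged with "Smartphone" only.
rows = [
    {"id": "A", "key": "Smartphone", "value": None},
    {"id": "B", "key": "Smartphone", "value": None},
    {"id": "C", "key": "Smartphone", "value": None},
    {"id": "D", "key": "Smartphone", "value": None},
    {"id": "A", "key": "Tablet", "value": None},
    {"id": "B", "key": "Tablet", "value": None},
    {"id": "C", "key": "Tablet", "value": None},
]
matching_ids = ids_matching_all_keys(rows, ["Smartphone", "Tablet"])
print(matching_ids)  # ['A', 'B', 'C']
```

The resulting list of ids is what you would then send as keys to _all_docs.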
The second approach is to have a map function which emits all possible subsets of doc.tags (see: Find all possible subset combos in an array?). With an index structured like this, you can get the desired documents with a single query, simply using:
key=["Smartphone", "Tablet"] and include_docs=true.
However, keep in mind that this means emitting 2**n rows per document in your view (n being the number of tags), so use this approach only if you are sure that each doc has just a few tags.
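To get a feel for how many keys the subset approach produces, here is a rough Python sketch of the subset generation. The real map function would have to be written in JavaScript; subset_keys is a hypothetical helper. It skips the empty subset (so it yields 2**n - 1 keys) and sorts the tags first, on the assumption that query keys are also sent in sorted order, since array keys only match when the elements are in the same order.

```python
from itertools import combinations


def subset_keys(tags):
    """Return every non-empty subset of tags as a sorted list."""
    unique_sorted = sorted(set(tags))
    keys = []
    for size in range(1, len(unique_sorted) + 1):
        # combinations() preserves the input order, so each subset stays sorted.
        keys.extend(list(combo) for combo in combinations(unique_sorted, size))
    return keys


keys = subset_keys(["Laptop", "Smartphone", "Tablet"])
print(len(keys))                        # 7, i.e. 2**3 - 1
print(["Smartphone", "Tablet"] in keys)  # True
```

With 10 tags the same document would already produce 1023 rows, which is why this only suits documents with few tags.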

Related

How to project specific fields from a queried document inside an array?

here is the document
formId: 123,
title:"XYZ"
eventDate:"2022-04-15T05:40:57.182Z"
responses:[
{
orderId:98422,
name:"XYZ1",
email:"a@gmal.com",
paymentStatus:"pending",
amount:250,
phone:123456789
},
{
orderId:98422,
name:"XYZ1",
email:"a@gmal.com",
paymentStatus:"success",
amount:250,
phone:123456791
}
]
I used $elemMatch to filter the array such that I get only the matched object.
const response = await Form.findOne({ formId:123 }, {
_id:0,
title: 1,
eventDate: 1,
responses: {
$elemMatch: { orderId: 98422 },
},
})
But this returns all the fields inside the object present in the array "responses".
title:"XYZ"
eventDate:"2022-04-15T05:40:57.182Z"
responses:[
{
orderId:98422,
name:"XYZ1",
email:"a@gmal.com",
paymentStatus:"pending",
amount:250,
phone:123456789
}
]
But I want only specific fields to be returned inside the object like this
title:"XYZ"
eventDate:"2022-04-15T05:40:57.182Z"
responses:[
{
name:"XYZ1",
email:"a@gmal.com",
paymentStatus:"pending",
}
]
How can I do that?
Query
This uses aggregation, which lets you keep only some members of the array and reshape them as well.
$map over responses: if orderId matches, keep only the fields you want; otherwise replace the member with null.
$filter then removes those nulls (the members that didn't match).
Here there are 2 matches. If you want to keep only one member of the array, you can wrap the filter in $first:
{"$first": {"$filter": ...}}
* The $elemMatch that you used can be combined with the positional $ projection operator to avoid the aggregation, but with the $ operator we get the whole matching member (here you want only some fields, so I think aggregation is the way to go).
Playmongo
aggregate(
[{"$match": {"formId": {"$eq": 123}}},
{"$project":
{"_id": 0,
"title": 1,
"eventDate": 1,
"responses":
{"$map":
{"input": "$responses",
"in":
{"$cond":
[{"$eq": ["$$this.orderId", 98422]},
{"name": "$$this.name",
"email": "$$this.email",
"paymentStatus": "$$this.paymentStatus"},
null]}}}}},
{"$set":
{"responses":
{"$filter":
{"input": "$responses", "cond": {"$ne": ["$$this", null]}}}}}])
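To make the two stages easier to follow, here is the same logic written as plain Python over an ordinary dict. It is only an illustration of what $map/$cond and $filter do to the array, not MongoDB driver code; project_responses is a made-up name, and the sample document is copied from the question.

```python
def project_responses(doc, order_id, fields):
    """Mimic the $map / $cond / $filter stages on a plain dict."""
    # $map + $cond: matching members are reduced to the wanted fields,
    # every other member becomes None (null in the pipeline).
    mapped = [
        {field: response.get(field) for field in fields}
        if response.get("orderId") == order_id
        else None
        for response in doc["responses"]
    ]
    # $filter: drop the None placeholders.
    kept = [member for member in mapped if member is not None]
    return {"title": doc["title"], "eventDate": doc["eventDate"], "responses": kept}


form = {
    "formId": 123,
    "title": "XYZ",
    "eventDate": "2022-04-15T05:40:57.182Z",
    "responses": [
        {"orderId": 98422, "name": "XYZ1", "email": "a@gmal.com",
         "paymentStatus": "pending", "amount": 250, "phone": 123456789},
        {"orderId": 98422, "name": "XYZ1", "email": "a@gmal.com",
         "paymentStatus": "success", "amount": 250, "phone": 123456791},
    ],
}
result = project_responses(form, 98422, ["name", "email", "paymentStatus"])
print(result["responses"])
```

Both sample responses share orderId 98422, so both survive the filter, each with only the three requested fields.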

Cosmos Db: How to query for the maximum value of a property in an array of arrays?

I'm not sure how to query when using CosmosDb as I'm used to SQL. My question is about how to get the maximum value of a property in an array of arrays. I've been trying subqueries so far but apparently I don't understand very well how they work.
In a structure such as the one below, how do I query for the city with the largest population among all states, using the Data Explorer in Azure:
{
"id": 1,
"states": [
{
"name": "New York",
"cities": [
{
"name": "New York",
"population": 8500000
},
{
"name": "Hempstead",
"population": 750000
},
{
"name": "Brookhaven",
"population": 500000
}
]
},
{
"name": "California",
"cities":[
{
"name": "Los Angeles",
"population": 4000000
},
{
"name": "San Diego",
"population": 1400000
},
{
"name": "San Jose",
"population": 1000000
}
]
}
]
}
This is currently not possible as far as I know.
It would look a bit like this:
SELECT TOP 1 state.name as stateName, city.name as cityName, city.population FROM c
join state in c.states
join city in state.cities
--order by city.population desc <-- this does not work in this case
You could write a user defined function that will allow you to write the query you probably expect, similar to this: CosmosDB sort results by a value into an array
The result could look like:
SELECT c.name, udf.OnlyMaxPop(c.states) FROM c
function OnlyMaxPop(states) {
    // Order states so that the one whose remaining city is largest comes first.
    function compareStates(stateA, stateB) {
        return stateB.cities[0].population - stateA.cities[0].population;
    }
    // For each state, keep only its most populous city (or cities, on a tie).
    var onlyWithOneCity = states.map(s => {
        var maxpop = Math.max.apply(Math, s.cities.map(o => o.population));
        return {
            name: s.name,
            cities: s.cities.filter(x => x.population === maxpop)
        };
    });
    return onlyWithOneCity.sort(compareStates)[0];
}
You would probably need to adapt the function to your exact query needs, but I am not certain what your desired result would look like.
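If a UDF is not an option, the same selection can be done on the client once the document has been fetched. The following is a plain Python sketch over an already-loaded dict, not a Cosmos DB feature; largest_city is a hypothetical helper, and the output field names simply mirror the aliases used in the query attempt above.

```python
def largest_city(doc):
    """Return the most populous city across all states of one document."""
    candidates = [
        (city["population"], state["name"], city["name"])
        for state in doc["states"]
        for city in state["cities"]
    ]
    # Tuples compare element by element, so population decides first.
    population, state_name, city_name = max(candidates)
    return {"stateName": state_name, "cityName": city_name, "population": population}


doc = {
    "id": 1,
    "states": [
        {"name": "New York", "cities": [
            {"name": "New York", "population": 8500000},
            {"name": "Hempstead", "population": 750000},
            {"name": "Brookhaven", "population": 500000},
        ]},
        {"name": "California", "cities": [
            {"name": "Los Angeles", "population": 4000000},
            {"name": "San Diego", "population": 1400000},
            {"name": "San Jose", "population": 1000000},
        ]},
    ],
}
print(largest_city(doc))
```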

How to extract selected key and value from nested dictionary object in a list?

I have a list, example_list, that contains two dict objects. It looks like this:
[
{
"Meta": {
"ID": "1234567",
"XXX": "XXX"
},
"bbb": {
"ccc": {
"ddd": {
"eee": {
"fff": {
"xxxxxx": "xxxxx"
},
"www": [
{
"categories": {
"ppp": [
{
"content": {
"name": "apple",
"price": "0.111"
},
"xxx": "xxx"
}
]
},
"date": "A2020-01-01"
}
]
}
}
}
}
},
{
"Meta": {
"ID": "78945612",
"XXX": "XXX"
},
"bbb": {
"ccc": {
"ddd": {
"eee": {
"fff": {
"xxxxxx": "xxxxx"
},
"www": [
{
"categories": {
"ppp": [
{
"content": {
"name": "banana",
"price": "12.599"
},
"xxx": "xxx"
}
]
},
"date": "A2020-01-01"
}
]
}
}
}
}
}
]
Now I want to filter the items and keep only "ID": "xxx" and the corresponding value for "price": "0.111". The expected result can be something similar to:
[{"ID": "1234567", "price": "0.111"}, {"ID": "78945612", "price": "12.599"}]
or something like {"1234567":"0.111", "78945612":"12.599" }
Here's what I've tried:
map_list=[]
map_dict={}
for item in example_list:
    # get 'ID' for each item in 'Meta'
    map_dict['ID'] = item['Meta']['ID']
    # get 'price'
    data_list = item['bbb']['ccc']['ddd']['eee']['www']
    for data in data_list:
        for dataitem in data['categories']['ppp']:
            map_dict['price'] = dataitem["content"]["price"]
    map_list.append(map_dict)
print(map_list)
The result doesn't look right. It feels like the items aren't being iterated properly; it gives me this result:
[{"ID": "78945612", "price": "12.599"}, {"ID": "78945612", "price": "12.599"}]
It gave me a duplicated result for the second ID, but where is the first ID?
Can someone take a look for me please, thanks.
Update:
From some comments on another question, I understand that the output keeps being overwritten because the key name in the dict is always the same. But I'm not sure how to fix this, because the key and the value need to be extracted at different levels of the for loops. Any help would be appreciated, thanks.
As @Scott Hunter has mentioned, you need to create a new map_dict every time you do this. Here is a quick fix to your solution (I am sadly not able to test it right now, but it seems right to me).
map_list = []
for item in example_list:
    # get 'price'
    data_list = item['bbb']['ccc']['ddd']['eee']['www']
    for data in data_list:
        for dataitem in data['categories']['ppp']:
            # create a fresh dict for every entry so earlier results are not overwritten
            map_dict = {}
            map_dict['ID'] = item['Meta']['ID']
            map_dict['price'] = dataitem["content"]["price"]
            map_list.append(map_dict)
print(map_list)
But what you are doing here is basically just "forcing" your way through ... I recommend taking a break and checking out some kind of tutorial, which will help you understand how it really works behind the scenes. This is how I would have written it:
list_dicts = []
for item in example_list:
    for www in item['bbb']['ccc']['ddd']['eee']['www']:
        # each 'www' entry holds its items under categories -> ppp
        for www_item in www['categories']['ppp']:
            list_dicts.append({
                'ID': item['Meta']['ID'],
                'price': www_item["content"]["price"]
            })
Good luck with this problem and hope it helps :)
You need to create a new dictionary for map_dict for each ID.
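For completeness, both output shapes mentioned in the question can be built from one small generator. This is just a sketch on a trimmed-down copy of the sample data (only the keys that are actually read are kept), and iter_id_price is a name invented for this example.

```python
# Trimmed copy of the question's data: only the keys that are read below.
example_list = [
    {"Meta": {"ID": "1234567"},
     "bbb": {"ccc": {"ddd": {"eee": {"www": [
         {"categories": {"ppp": [{"content": {"name": "apple", "price": "0.111"}}]}}
     ]}}}}},
    {"Meta": {"ID": "78945612"},
     "bbb": {"ccc": {"ddd": {"eee": {"www": [
         {"categories": {"ppp": [{"content": {"name": "banana", "price": "12.599"}}]}}
     ]}}}}},
]


def iter_id_price(items):
    """Yield one (ID, price) pair per entry under www -> categories -> ppp."""
    for item in items:
        for www in item["bbb"]["ccc"]["ddd"]["eee"]["www"]:
            for entry in www["categories"]["ppp"]:
                yield item["Meta"]["ID"], entry["content"]["price"]


# A new dict is created for every pair, so nothing gets overwritten.
as_list = [{"ID": doc_id, "price": price} for doc_id, price in iter_id_price(example_list)]
# If an ID has several prices, this form keeps only the last one.
as_dict = dict(iter_id_price(example_list))
print(as_list)
print(as_dict)
```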

Python dictionary: how can I create a (structured) unique list of dictionaries if a key contains a list of values for other keys?

I have the unstructured list of dictionaries below, in which a single key holds the values of several entries in a list.
I am not sure if my question is strange. This is the actual payload that we receive from the source, and it is not aligned with the respective entries.
[
{
"dsply_nm": [
"test test",
"test test",
"",
""
],
"start_dt": [
"2021-04-21T00:01:00-04:00",
"2021-04-21T00:01:00-04:00",
"2021-04-21T00:01:00-04:00",
"2021-04-21T00:01:00-04:00"
],
"exp_dt": [
"2022-04-21T00:01:00-04:00",
"2022-04-21T00:01:00-04:00",
"2022-04-21T00:01:00-04:00",
"2022-04-21T00:01:00-04:00"
],
"hrs_pwr": [
"14",
"12",
"13",
"15"
],
"make_nm": "test",
"model_nm": "test",
"my_yr": "1980"
}
]
Note: the length of the lists cannot be predicted; some keys may hold more than 4 values and others fewer.
#Expected:
I need to check whether the dictionary above has a proper structure and, based on that, return the proper list of dictionaries, one for each item.
For example (pseudocode):
def get_dict_list(items):
    if items is not structured:
        result = get_associated_dict_items_mapped
        return result
    else:
        return items
#Final result
expected_dict_list=
[{"dsply_nm":"test test","start_dt":"2021-04-21T00:01:00-04:00","exp_dt":"2022-04-21T00:01:00-04:00","hrs_pwr":"14"},
{"dsply_nm":"test test","start_dt":"2021-04-21T00:01:00-04:00","exp_dt":"2022-04-21T00:01:00-04:00","hrs_pwr":"12","make_nm": "test","model_nm": "test","my_yr": "1980"},
{"dsply_nm":"","start_dt":"2021-04-21T00:01:00-04:00","exp_dt":"2022-04-21T00:01:00-04:00","hrs_pwr":"13"},
{"dsply_nm":"","start_dt":"2021-04-21T00:01:00-04:00","exp_dt":"2022-04-21T00:01:00-04:00","hrs_pwr":"15"}
]
In the payload above, the part below is associated with the second dictionary item and has to be mapped accordingly:
"make_nm": "test",
"model_nm": "test",
"my_yr": "1980"
}
Can anyone help on this?
Thanks
Since customer_details is a list:
dict(zip(customer_details[0], list(customer_details[0].values())))
this yields:
{'insured_details': ['asset', 'asset', 'asset'],
'id': ['213', '214', '233'],
'dept': ['account', 'sales', 'market'],
'salary': ['12', '13', '14']}
I think a couple of list comprehensions will get you going. If you would like me to unwind them into more traditional for loops, just let me know.
import json

def get_dict_list(item):
    # A dict whose values are plain scalars is already a single record.
    first_value = list(item.values())[0]
    if not isinstance(first_value, list):
        return [item]
    # Otherwise build one record per index, taking the i-th value of every key.
    return [{key: item[key][i] for key in item.keys()} for i in range(len(first_value))]

customer_details = [
    {
        "insured_details": "asset",
        "id": "xxx",
        "dept": "account",
        "salary": "12"
    },
    {
        "insured_details": ["asset", "asset", "asset"],
        "id": ["213", "214", "233"],
        "dept": ["account", "sales", "market"],
        "salary": ["12", "13", "14"]
    }
]
customer_details_cleaned = []
for detail in customer_details:
    customer_details_cleaned.extend(get_dict_list(detail))
print(json.dumps(customer_details_cleaned, indent=4))
That should give you:
[
{
"insured_details": "asset",
"id": "xxx",
"dept": "account",
"salary": "12"
},
{
"insured_details": "asset",
"id": "213",
"dept": "account",
"salary": "12"
},
{
"insured_details": "asset",
"id": "214",
"dept": "sales",
"salary": "13"
},
{
"insured_details": "asset",
"id": "233",
"dept": "market",
"salary": "14"
}
]
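The question also mentions lists of different lengths and keys that hold a plain value next to keys that hold lists. One possible way to cope with that is sketched below. The choices made here are assumptions rather than something the question specifies: short lists are padded with None, and plain values are copied onto every record. split_columns is a hypothetical name.

```python
from itertools import zip_longest


def split_columns(item, fill=None):
    """Turn a dict of parallel lists into one dict per list position."""
    list_keys = [key for key, value in item.items() if isinstance(value, list)]
    scalars = {key: value for key, value in item.items() if not isinstance(value, list)}
    if not list_keys:
        return [dict(item)]  # already a single, flat record
    rows = []
    # zip_longest pads the shorter lists with `fill` instead of stopping early.
    for values in zip_longest(*(item[key] for key in list_keys), fillvalue=fill):
        row = dict(zip(list_keys, values))
        row.update(scalars)  # assumption: plain values belong to every record
        rows.append(row)
    return rows


payload = {
    "dsply_nm": ["test test", ""],
    "hrs_pwr": ["14", "12", "13"],
    "make_nm": "test",
}
rows = split_columns(payload)
print(rows)
```

If the plain values should only be attached to one particular record, as in the expected output of the question, that rule would have to be added explicitly, because the payload itself does not say which record they belong to.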

Searching parent id in cloudant

I have a Cloudant DB with the following structure:
{id: 1, resource: "john doe", manager: "john smith", amount: 13}
{id: 2, resource: "mary doe", manager: "john smith", amount: 3}
{id: 3, resource: "john smith", manager: "peter doe", amount: 10}
I needed a query to return the sum of amount, so I've built a query with emit(doc.manager, doc.amount) which returns
{"rows":[
{"key":"john smith","value":16},
{"key":"peter doe","value":10}]}
It is working like a charm. However I need the manager ID along with Manager name. The result I am looking for is:
{"rows":[
{"key":{"john smith",3},"value":16},
{"key":{"peter doe",null},"value":10}]}
How should I build a map view to search the parent ID?
Thanks,
Erik
Unfortunately I don't think there's a way to do exactly what you want in one query. Assuming you have the following three documents in your database:
{
"_id": "1",
"resource": "john doe",
"manager": "john smith",
"amount": 13
}
--
{
"_id": "2",
"resource": "mary doe",
"manager": "john smith",
"amount": 3
}
--
{
"_id": "3",
"resource": "john smith",
"manager": "peter doe",
"amount": 10
}
The closest thing to what you want would be the following map function (which uses a compound key) and a _sum reduce:
function(doc) {
emit([doc.manager, doc._id], doc.amount);
}
This would give you the following results with reduce=false:
{"total_rows":3,"offset":0,"rows":[
{"id":"1","key":["john smith","1"],"value":13},
{"id":"2","key":["john smith","2"],"value":3},
{"id":"3","key":["peter doe","3"],"value":10}
]}
With reduce=true and group_level=1, you essentially get the same results as what you already have:
{"rows":[
{"key":["john smith"],"value":16},
{"key":["peter doe"],"value":10}
]}
If you instead do reduce=true and group=true (exact grouping) then you get the following results:
{"rows":[
{"key":["john smith","1"],"value":13},
{"key":["john smith","2"],"value":3},
{"key":["peter doe","3"],"value":10}
]}
Each unique combination of the manager and _id fields is summed, which unfortunately doesn't give you what you want. To get the result you are after, I think your best bet would be to sum up the values after querying the database.
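Summing after the query could look roughly like the Python sketch below. It assumes the three documents have already been fetched in full (for example with include_docs=true) and that resource names are unique, so a manager's id can be looked up by name; totals_with_manager_id is an invented helper.

```python
def totals_with_manager_id(docs):
    """Sum amount per manager and attach the manager's own _id if known."""
    # A manager's id is the _id of the document where they are the resource.
    id_by_resource = {doc["resource"]: doc["_id"] for doc in docs}
    totals = {}
    for doc in docs:
        manager = doc["manager"]
        entry = totals.setdefault(
            manager, {"manager_id": id_by_resource.get(manager), "total": 0}
        )
        entry["total"] += doc["amount"]
    return totals


docs = [
    {"_id": "1", "resource": "john doe", "manager": "john smith", "amount": 13},
    {"_id": "2", "resource": "mary doe", "manager": "john smith", "amount": 3},
    {"_id": "3", "resource": "john smith", "manager": "peter doe", "amount": 10},
]
totals = totals_with_manager_id(docs)
print(totals)
```

Managers without a document of their own, such as peter doe here, end up with a manager_id of None.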
