Get all CouchDB documents which contain a subset of view keys

Assume I have the following documents:
{
  _id: 'id1',
  tags: ['a', 'b']
}
{
  _id: 'id2',
  tags: ['b', 'c', 'd']
}
{
  _id: 'id3',
  tags: ['c', 'd', 'e']
}
Now I want to get all documents whose tags are ALL contained in a given set of keys. For example, the view keys ['a','b','c','d'] should return documents 'id1' and 'id2' but not 'id3', because 'id3' contains the tag 'e', which is not among the requested keys.
How would you write such a view in CouchDB?
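The condition being asked for can be stated precisely as a set-containment test. The following is a minimal Python sketch of that test applied to documents that have already been fetched; it is not a CouchDB view, and the function name docs_with_tags_within is hypothetical. It only pins down which documents such a view would be expected to return.

```python
def docs_with_tags_within(docs, allowed_keys):
    """Return the docs whose tags are all contained in allowed_keys."""
    allowed = set(allowed_keys)
    # issubset is True only when every tag of the doc appears in the allowed set.
    return [doc for doc in docs if set(doc["tags"]).issubset(allowed)]


# The three example documents from the question.
docs = [
    {"_id": "id1", "tags": ["a", "b"]},
    {"_id": "id2", "tags": ["b", "c", "d"]},
    {"_id": "id3", "tags": ["c", "d", "e"]},
]

matching_ids = [doc["_id"] for doc in docs_with_tags_within(docs, ["a", "b", "c", "d"])]
print(matching_ids)  # ['id1', 'id2']
```

Document 'id3' is rejected because its tag 'e' is missing from the requested keys, which matches the behaviour described in the question.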

Related

Restructure TSV to list of list of dicts

A simplified look at my data right at parse:
[
{'id':'group1'},
{'id':'member1', 'parentId':'group1', 'size':51},
{'id':'member2', 'parentId':'group1', 'size':16},
{'id':'group2'},
{'id':'member1', 'parentId':'group2', 'size':21},
...
]
The desired output should be like this:
data =
[
[
{'id':'group1'},
{'id':'member1', 'parentId':'group1', 'size':51},
{'id':'member2', 'parentId':'group1', 'size':16}
],
[
{'id':'group2'},
{'id':'member1', 'parentId':'group2', 'size':21},
]
]
The difficulty is that the groups vary in length: some might have 10 members and some might have 3, so it is unclear where each sublist should begin and end. The rows are also not uniform: group rows have only an 'id' entry, with no 'parentId' or 'size'.
master_data = []
for i in range(len(tsv_data)):
    temp = {}
    for j in range(?????):
        ???
How can Python handle arranging vanilla .tsv data into a list of lists as seen above?
I thought a reasonable first step was to tally something simple before tackling the whole data set, so I tried to count all occurrences of group1, based on this discussion:
group_counts = {}
for member in data:
    group = member.get('group1')
    try:
        group_counts[group] += 1
    except KeyError:
        group_counts[group] = 1
However, this returned:
'list' object has no attribute 'get'
This leads me to believe that counting text occurrences may not be the solution after all.

You could first collect all the groups into a new data structure and then add the items to them:
data = [
{
'id': 'group1'
}, {
'id': 'member1',
'parentId': 'group1',
'size': 51
}, {
'id': 'member2',
'parentId': 'group1',
'size': 16
}, {
'id': 'group2'
}, {
'id': 'member1',
'parentId': 'group2',
'size': 21
}, {
'id': 'member3',
'parentId': 'group1',
'size': 16
}
]
result = {}  # Use a dict keyed by group id for easier grouping.
# Extract all groups.
for dct in data:
    if 'group' in dct['id']:
        result[dct['id']] = [dct]
# Extract all items and add them to their groups.
for dct in data:
    if 'parentId' in dct:
        result[dct['parentId']].append(dct)
# Dicts preserve insertion order, so the groups come out in their original order.
nestedListResult = list(result.values())
Out:
[
[
{
'id': 'group1'
}, {
'id': 'member1',
'parentId': 'group1',
'size': 51
}, {
'id': 'member2',
'parentId': 'group1',
'size': 16
}, {
'id': 'member3',
'parentId': 'group1',
'size': 16
}
], [{
'id': 'group2'
}, {
'id': 'member1',
'parentId': 'group2',
'size': 21
}]
]
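If the rows always arrive in the order shown in the question, with each group row immediately followed by its members, a single pass is enough. The following is a minimal sketch under that ordering assumption; the function name group_rows is hypothetical. Unlike the dict-based approach above, it would misplace a member that appears after a later group has started (such as member3 in the answer's sample data).

```python
def group_rows(rows):
    """Split a flat, ordered list of rows into one sublist per group."""
    grouped = []
    for row in rows:
        if "parentId" not in row:
            # A row without a parent is a group header: start a new sublist.
            grouped.append([row])
        elif grouped:
            # A member row is attached to the most recently seen group.
            # Member rows that appear before any group header are skipped.
            grouped[-1].append(row)
    return grouped


rows = [
    {"id": "group1"},
    {"id": "member1", "parentId": "group1", "size": 51},
    {"id": "member2", "parentId": "group1", "size": 16},
    {"id": "group2"},
    {"id": "member1", "parentId": "group2", "size": 21},
]

data = group_rows(rows)
print([len(group) for group in data])  # [3, 2]
```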

Cassandra UDT converted to string by Loopback 4 not persisted to database

The code converts the JSON object in the request payload to a string for the UDT column. We debugged this and traced the issue to the Cassandra.prototype.create function in loopback-connector-cassandra/lib/cassandra.js.
LOGS:
Payload object:
{
id: 'c',
name: 'otlsgy4',
phone: '88774622572',
address: {
street: 'Rajpur Road',
city: 'Dehradun',
state_or_province: 'Uttarakhand',
postal_code: '248001',
country: 'India'
}
}
Insert Query by loopback:
INSERT INTO "hotels" ("id","name","phone","address")
VALUES(?,?,?,?)
[
'c',
'otlsgy4',
'88774622572',
'{
"street":"Rajpur Road",
"city":"Dehradun",
"state_or_province":"Uttarakhand",
"postal_code":"248001",
"country":"India"
}'
]
Expected Result:
INSERT INTO "hotels" ("id","name","phone","address")
VALUES(?,?,?,?)
[
'c',
'otlsgy4',
'88774622572',
{
"street":"Rajpur Road",
"city":"Dehradun",
"state_or_province":"Uttarakhand",
"postal_code":"248001",
"country":"India"
}
]
Link to the repository:
https://github.com/nishankpathak/cassandrademo
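The difference between the two parameter lists above is only the type of the last element: a JSON-encoded string in the observed query, a structured object in the expected one. The following Python sketch is not LoopBack or connector code; it merely illustrates that distinction with a hypothetical helper, stringified_positions, that flags parameters which are objects serialized to strings.

```python
import json

address = {
    "street": "Rajpur Road",
    "city": "Dehradun",
    "state_or_province": "Uttarakhand",
    "postal_code": "248001",
    "country": "India",
}

# Observed: the UDT value has been serialized to a string.
observed_params = ["c", "otlsgy4", "88774622572", json.dumps(address)]
# Expected: the UDT value is passed through as a structured object.
expected_params = ["c", "otlsgy4", "88774622572", address]


def stringified_positions(params):
    """Return the indexes of parameters that are JSON objects encoded as strings."""
    positions = []
    for index, value in enumerate(params):
        if not isinstance(value, str):
            continue
        try:
            decoded = json.loads(value)
        except ValueError:
            continue  # an ordinary string, not JSON
        if isinstance(decoded, dict):
            positions.append(index)
    return positions


print(stringified_positions(observed_params))  # [3]
print(stringified_positions(expected_params))  # []
```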

Python: Add a Key-Value pair when a condition is true

The starting point is the following two lists of dicts:
product = [
{'_id': '5678', 'variantIds':[{'id':'1'},{'id':'2'},{'id':'3'}]},
{'_id': '1234', 'variantIds':[{'id':'1'},{'id':'2'},{'id':'3'}]}
]
inventoryItem = [
{'_id': 'a6fdcf69', 'productId': '1234', 'variants': [{'variantId': '1', 'quantity': 0}, {'variantId': '2', 'quantity': 100}]},
{'_id': 'a6fdcf70', 'productId': '5678', 'variants': [{'variantId': '1', 'quantity': 0}, {'variantId': '2', 'quantity': 199}, {'variantId': '3', 'quantity': 299}]},
{'_id': 'a6fdcf77', 'productId': '9999', 'variants': [{'variantId': '1', 'quantity': 1111}, {'variantId': '2', 'quantity': 2222}, {'variantId': '3', 'quantity': 3333}]}
]
What I want is to add the quantity from inventoryItem, as a key-value pair 'stockQuantity': <quantity>, to the first list, product. Specifically, I want to add it to each dict in the sub-list 'variantIds', and only when product['_id'] == inventoryItem['productId'] AND product['variantIds']['id'] == inventoryItem['variants']['variantId'], so that we get the following output:
product = [
{'_id': '5678', 'variantIds':[{'id':'1', 'stockQuantity':0},{'id':'2', 'stockQuantity':199},{'id':'3', 'stockQuantity':299}]},
{'_id': '1234', 'variantIds':[{'id':'1', 'stockQuantity':0},{'id':'2', 'stockQuantity':100},{'id':'3', 'stockQuantity':0}]}
]
I can loop and add everything as long as the order in one list corresponds to the order in the other. But when it does not, which can happen, I struggle to address the right index in the second list. How do you do that?
I think this is my closest attempt, but I already get stuck on line 2, because I do not know the corresponding index in the inventoryItem list:
for i in product:
    if i['_id'] == inventoryItem['productId']:
        for j in i['variantIds']:
            if j['id'] == inventoryItem['variants']['variantId']:
                j['stock'] = inventoryItem['variants']['quantity']
print(product)

product = [
{'_id': '5678', 'variantIds':[{'id':'1'},{'id':'2'},{'id':'3'}]},
{'_id': '1234', 'variantIds':[{'id':'1'},{'id':'2'},{'id':'3'}]}
]
inventoryItem = [
{'_id': 'a6fdcf69', 'productId': '1234', 'variants': [{'variantId': '1', 'quantity': 0}, {'variantId': '2', 'quantity': 100}]},
{'_id': 'a6fdcf70', 'productId': '5678', 'variants': [{'variantId': '1', 'quantity': 0}, {'variantId': '2', 'quantity': 199}, {'variantId': '3', 'quantity': 299}]},
{'_id': 'a6fdcf77', 'productId': '9999', 'variants': [{'variantId': '1', 'quantity': 1111}, {'variantId': '2', 'quantity': 2222}, {'variantId': '3', 'quantity': 3333}]}
]
for prod in product:
    try:
        # Find the inventory entry whose productId matches this product.
        loc = [i['productId'] for i in inventoryItem].index(prod['_id'])
    except ValueError:
        # No inventory entry for this product; leave it unchanged.
        continue
    variantList = inventoryItem[loc]['variants']
    for item in prod['variantIds']:
        try:
            subloc = [i['variantId'] for i in variantList].index(item['id'])
            quantity = variantList[subloc]['quantity']
        except ValueError:
            # Variant missing from the inventory entry: use a quantity of 0.
            quantity = 0
        item.update({'stockQuantity': quantity})
print(product)
This may not be the best way to do it, but it's pretty straightforward.
It is based on what I understood about the structure of the data. There may be more to the data, in which case it could be optimized further, but it works on the example set you provided.
What it does: it takes each item (dictionary) from the product list, searches inventoryItem for the entry with a matching id, and gets the index of that entry. It then looks up each variant in that entry and updates the product accordingly.
Note: I have assumed here that there is no specific order to anything.
If any part needs further explanation, let me know.

The thing is that you mixed lists and dictionaries, so you need to index into the list of variantIds. I think you can do it as follows:
for i in range(len(product)):
    for j in range(len(product[i]['variantIds'])):
        product[i]['variantIds'][j]['stockQuantity'] = 0
print(product)
Output:
[{'_id': '5678', 'variantIds': [{'id': '1', 'stockQuantity': 0}, {'id': '2', 'stockQuantity': 0}, ...
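As an alternative to searching the inventory list by index for every product, the inventory can first be turned into a nested lookup table keyed by productId and then variantId. This is a minimal sketch using the sample data from the question; the names inventory_item and stock_by_product are hypothetical, and it assumes, as in the desired output, that a variant missing from a matching inventory entry gets a quantity of 0.

```python
product = [
    {"_id": "5678", "variantIds": [{"id": "1"}, {"id": "2"}, {"id": "3"}]},
    {"_id": "1234", "variantIds": [{"id": "1"}, {"id": "2"}, {"id": "3"}]},
]
inventory_item = [
    {"_id": "a6fdcf69", "productId": "1234",
     "variants": [{"variantId": "1", "quantity": 0}, {"variantId": "2", "quantity": 100}]},
    {"_id": "a6fdcf70", "productId": "5678",
     "variants": [{"variantId": "1", "quantity": 0}, {"variantId": "2", "quantity": 199},
                  {"variantId": "3", "quantity": 299}]},
    {"_id": "a6fdcf77", "productId": "9999",
     "variants": [{"variantId": "1", "quantity": 1111}, {"variantId": "2", "quantity": 2222},
                  {"variantId": "3", "quantity": 3333}]},
]

# Map each productId to a {variantId: quantity} lookup table.
stock_by_product = {
    entry["productId"]: {v["variantId"]: v["quantity"] for v in entry["variants"]}
    for entry in inventory_item
}

for prod in product:
    stock = stock_by_product.get(prod["_id"])
    if stock is None:
        continue  # no inventory entry for this product; leave it unchanged
    for variant in prod["variantIds"]:
        # Variants missing from the inventory entry default to 0.
        variant["stockQuantity"] = stock.get(variant["id"], 0)

print(product)
```

Because the lookups are by key, the result does not depend on the order of either list.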

Groovy groupBy on multiple properties and extracting only value

Given a list of elements:
List list = [
    [category: 'A', name: 'a', value: 10],
    [category: 'A', name: 'b', value: 20],
    [category: 'B', name: 'a', value: 30],
    [category: 'B', name: 'c', value: 40],
    [category: 'B', name: 'd', value: 50],
]
I want to transform it into a nested map:
Map map = [
A: [a: 10, b: 20],
B: [a: 30, c: 40, d: 50],
]
The only solution I have come up with is to do something like this:
list.groupBy(
    { it.category }, { it.name }
).collectEntries { category, names ->
    [(category): names.collectEntries { name, values ->
        [(name): values.value[0]]
    }]
}
However, I will have to deal with more than 2 levels of nesting in the future, and this approach will not scale to that.
Is there a neat, more flexible way to obtain this result in Groovy?
EDIT:
By more than 2 levels of nesting I mean converting structure like:
List list = [
[category: 'A', subcategory: 'I', group: 'x', name: 'a', value: 10],
[category: 'A', subcategory: 'I', group: 'y', name: 'b', value: 20],
]
Into:
Map map = [
A: [I: [
x: [a: 10],
y: [b: 20],
]],
]
By adding nesting (depth) it would require more nested collectEntries calls, which will become unreadable.

I have found a neat solution to the problem by using Map's withDefault method and recursive Closure calls:
Map map = { [:].withDefault { owner.call() } }.call()
list.each {
    map[it.category][it.name] = it.value
}
Or for the second case:
Map map = { [:].withDefault { owner.call() } }.call()
list.each {
    map[it.category][it.subcategory][it.group][it.name] = it.value
}
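For comparison, the same idea (a map that creates nested maps on first access) can be sketched in Python with a recursive defaultdict. This is an illustration rather than a Groovy answer; the helper names nested_dict, to_plain and nest are hypothetical, and the list of path keys is passed in so that the nesting depth is not hard-coded.

```python
from collections import defaultdict


def nested_dict():
    """Create a dict whose missing keys are filled with further nested dicts."""
    return defaultdict(nested_dict)


def to_plain(value):
    """Recursively convert defaultdicts into plain dicts."""
    if isinstance(value, dict):
        return {key: to_plain(item) for key, item in value.items()}
    return value


def nest(rows, path_keys, value_key):
    """Nest rows by the fields in path_keys, storing row[value_key] at the leaf."""
    root = nested_dict()
    for row in rows:
        node = root
        # Walk (and create on demand) every level except the last one.
        for key in path_keys[:-1]:
            node = node[row[key]]
        node[row[path_keys[-1]]] = row[value_key]
    return to_plain(root)


rows = [
    {"category": "A", "subcategory": "I", "group": "x", "name": "a", "value": 10},
    {"category": "A", "subcategory": "I", "group": "y", "name": "b", "value": 20},
]

nested = nest(rows, ["category", "subcategory", "group", "name"], "value")
print(nested)  # {'A': {'I': {'x': {'a': 10}, 'y': {'b': 20}}}}
```

Adding another level of nesting only means adding another field name to the path list.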

Elasticsearch order facets based on another index

I'm not sure of the right term to describe this problem so I'll go with a simple example.
I have an index of blog posts, each with a tags field (not_analyzed):
{
  id: 1,
  tags: ['a']
},
{
  id: 2,
  tags: ['a', 'b']
},
{
  id: 3,
  tags: ['a', 'b', 'c']
}
and if I run a search for tag:c I successfully get 3 post results and facets on tags:
{
term: 'a',
count: 3
},
{
term: 'b',
count: 2
}
In the result above, the facets are ordered by count. My question is whether it is possible to boost them with values taken from another index. Suppose I have another index of tags with (arbitrary) scores:
{
tag: 'a',
score: 0
},
{
tag: 'b',
score: 10
},
{
tag: 'c',
score: 0
},
Is it possible to achieve the following facets with scores calculated from another index?
{
  tag: 'b',
  score: 12 //10 + 2, where 2 is the count
},
{
  tag: 'a',
  score: 3 //0 + 3, where 3 is the count
}
*I'm aware that facets are being deprecated and I should update my code to use aggregations.
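The desired ordering is the facet count plus the per-tag score, sorted in descending order. Whether Elasticsearch can compute this across two indices in a single query is the open question here; as a fallback, the two results could be merged on the client after fetching both. The following is a minimal sketch of that merge, where the plain dicts stand in for the two responses and the function name rank_facets is hypothetical.

```python
# term -> count, as returned by the facet on the posts index
facet_counts = {"a": 3, "b": 2}
# tag -> score, as stored in the separate tags index
tag_scores = {"a": 0, "b": 10, "c": 0}


def rank_facets(counts, scores):
    """Add each tag's stored score to its facet count and sort by the total."""
    combined = [
        # Tags without a stored score contribute only their count.
        {"tag": tag, "score": scores.get(tag, 0) + count}
        for tag, count in counts.items()
    ]
    return sorted(combined, key=lambda facet: facet["score"], reverse=True)


ranked = rank_facets(facet_counts, tag_scores)
print(ranked)  # [{'tag': 'b', 'score': 12}, {'tag': 'a', 'score': 3}]
```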
