List iteration in Python with MongoDB - python-3.x

I am working on a small Python project where I need to create MongoDB entries.
This is the list of values I received from another collection:
["India", "Australia", "South Africa"]
The list above contains three items. What I want in my next collection is:
{
"_id": ObjectId('some id'),
"name": "Player",
"value": "India"
}
{
"_id": ObjectId('some id'),
"name": "Player",
"value": "Australia"
}
{
"_id": ObjectId('some id'),
"name": "Player",
"value": "South Africa"
}
I want each value from the list to go into the value key, while the name stays constant. The name should repeat in every document, and the number of documents should match the number of entries in the list.
How do I approach this problem in Python?

You can approach this problem in different ways. A very basic one is to use list comprehensions like this:
values_list = ["India", "Australia", "South Africa"]
names_list = ["Peter", "Paul", "Mary"]
def create_objects(name, values):
    # Returns a list of dicts; adapt this to create 'real' MongoDB objects/entries
    return [{"_id": "some id", "name": name, "value": value} for value in values]
# One list of dicts per name
objects = [create_objects(name, values_list) for name in names_list]
print(objects)
Another way is to compute all possible combinations beforehand (called product in itertools), which avoids the two nested for loops:
from itertools import product
objects = [{"_id": "some id", "name": name, "value": value} for name, value in product(names_list, values_list)]
print(objects)
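For the case described in the question, where the name stays constant and only the value changes, a minimal sketch could look like the following. The function name build_player_docs and the `collection` object mentioned in the comment are assumptions for illustration. The _id field is left out on purpose so that MongoDB can generate an ObjectId for each document on insert.

```python
def build_player_docs(values, name="Player"):
    """Return one document per value, all sharing the same name."""
    return [{"name": name, "value": value} for value in values]


values_list = ["India", "Australia", "South Africa"]
docs = build_player_docs(values_list)
print(docs)

# With PyMongo, the documents could then be inserted in one call
# (assumption: `collection` is an existing pymongo Collection object):
# collection.insert_many(docs)
```

Building the list first and inserting it in one call keeps the database round trips to a minimum compared with inserting inside the loop.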

Related

SQL Query CONTAINS property value in array of objects

I am trying to create a SQL query to get a list of companies that a user belongs to. The database is Cosmos DB Serverless, and the container is called "Companies" with multiple company items inside.
The structure of the company items is as follows:
{
  "id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
  "name": "Company Name",
  "users": [
    {
      "id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
      "name": "Susan Washington",
      "email": "susan.washington@gmail.com",
      "createdBy": "xxxx@gmail.com",
      "createdServerDateUTC": "2022-01-12T19:21:10.0644424Z",
      "createdLocalTime": "2022-01-12T19:21:09Z"
    },
    {
      "id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
      "name": "Kerwin Evans",
      "title": "Test Dev",
      "email": "kerwin.e@yahoo.com",
      "createdBy": "xxxx@gmail.com",
      "createdServerDateUTC": "2022-01-12T19:21:10.0644424Z",
      "createdLocalTime": "2022-01-12T19:21:09Z"
    },
    ETC.
  ]
}
And this is the SQL query I was trying to use, where user is an email that I pass in:
SELECT *
FROM c
WHERE IS_NULL(c.deletedServerDateUTC) = true
AND CONTAINS(c.users, user)
ORDER BY c.name DESC
OFFSET 0 LIMIT 10
This doesn't work, because the users property is an array. So I believe I need to check each object in the users array to see if the email property matches the user I enter in.
You can query the array with ARRAY_CONTAINS(). For example, this returns the company names for a given username that you specify:
SELECT c.name
FROM c
WHERE ARRAY_CONTAINS(c.users,{'name': username}, true)
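Setting the third parameter to true enables partial matching: an array element matches if it contains the given properties, even when it has other properties as well. To match on the email address instead, use the email property in the search object.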
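When the third argument of ARRAY_CONTAINS is true, the search object only has to be a subset of an array element. The following pure-Python sketch mimics that check for flat objects on in-memory data. It does not call Cosmos DB, and the function name array_contains_partial and the shortened sample data are made up for illustration.

```python
def array_contains_partial(array, search):
    """Return True if any element contains every key/value pair of `search`."""
    return any(
        isinstance(element, dict)
        and all(key in element and element[key] == value
                for key, value in search.items())
        for element in array
    )


users = [
    {"name": "Susan Washington", "email": "susan.washington@gmail.com"},
    {"name": "Kerwin Evans", "title": "Test Dev", "email": "kerwin.e@yahoo.com"},
]

# Partial match: the element has more properties than the search object
print(array_contains_partial(users, {"name": "Kerwin Evans"}))
# Exact match (like omitting the third argument): the whole element must be equal
print({"name": "Kerwin Evans"} in users)
```

The first call finds a match because only the listed properties are compared, while the exact comparison fails because the stored element has additional properties.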

Count the number of dictionaries from a list in a dictionary

I have a dictionary that contains a list of dictionaries. I want to count the number of dictionaries within the list. My data looks like this:
{
  "Ratings": [
    {
      "User": "User1",
      "Rating": 2
    },
    {
      "User": "User2",
      "Rating": 4
    }
  ]
}
I tried the following, which worked, but I wanted to know if there is a better way.
print(sum(map(len, dict_1.values())))
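If the goal is only to count the entries under one known key, a minimal sketch is to take the length of that list directly. This assumes the dictionary is named dict_1, as in the line above, and holds the sample data from the question.

```python
dict_1 = {
    "Ratings": [
        {"User": "User1", "Rating": 2},
        {"User": "User2", "Rating": 4},
    ]
}

# Count the dictionaries stored under one specific key
ratings_count = len(dict_1["Ratings"])
print(ratings_count)

# Count list entries across all keys (equivalent to the sum/map version)
total_count = sum(len(entries) for entries in dict_1.values())
print(total_count)
```

The first form says exactly which list is being counted, while the second only makes sense if every value in the dictionary is a list.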

Merge documents by fields

I have two types of documents: main documents and documents with additional info for them.
{
  "id": "371",
  "name": "Mike",
  "location": "Paris"
},
{
  "id": "371-1",
  "age": 20,
  "lastname": "Piterson"
}
I need to merge them by id to get a single result document. The result should look like:
{
  "id": "371",
  "name": "Mike",
  "location": "Paris",
  "age": 20,
  "lastname": "Piterson"
}
Using COLLECT / INTO, SPLIT(), and MERGE():
FOR doc IN collection
  COLLECT id = SPLIT(doc.id, '-')[0] INTO groups
  RETURN MERGE(MERGE(groups[*].doc), {id})
Result:
[
{
"id": "371",
"location": "Paris",
"name": "Mike",
"lastname": "Piterson",
"age": 20
}
]
This will:
Split each id attribute at any - and take the first part
Group the results into separate arrays (groups)
Merge #1: Merge all objects into one
Merge #2: Merge the id into the result
See REMOVE & INSERT or REPLACE for write operations.
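The same group-and-merge logic can be sketched in plain Python on in-memory data, which may help when checking what the AQL query is expected to return. This is only an illustration of the steps, not a replacement for the query; the function name merge_by_id_prefix is hypothetical.

```python
from collections import defaultdict

docs = [
    {"id": "371", "name": "Mike", "location": "Paris"},
    {"id": "371-1", "age": 20, "lastname": "Piterson"},
]


def merge_by_id_prefix(documents):
    """Group documents by the part of id before the first '-' and merge each group."""
    groups = defaultdict(dict)
    for doc in documents:
        base_id = doc["id"].split("-")[0]   # split the id and keep the first part
        groups[base_id].update(doc)         # merge 1: combine all documents of the group
        groups[base_id]["id"] = base_id     # merge 2: put the shared id into the result
    return list(groups.values())


merged = merge_by_id_prefix(docs)
print(merged)
```

As in the AQL version, attributes of later documents overwrite attributes with the same name from earlier ones, so the id has to be set again after each merge.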

Access a dictionary value based on the list of keys

I have a nested dictionary with keys and values as shown below.
j = {
    "app": {
        "id": 0,
        "status": "valid",
        "Garden": {
            "Flowers": {
                "id": "1",
                "state": "fresh"
            },
            "Soil": {
                "id": "2",
                "state": "stale"
            }
        },
        "BackYard": {
            "Grass": {
                "id": "3",
                "state": "dry"
            },
            "Soil": {
                "id": "4",
                "state": "stale"
            }
        }
    }
}
Currently, I have a Python method that returns the route of keys leading to a value. For example, if I want to access the "1" value, the method returns a list of strings with the route of keys to get to "1", which is ["app", "Garden", "Flowers"].
I am designing a service using Flask, and I want to return JSON output based on that route of keys, such as the following:
{
"id": "1",
"state": "fresh"
}
The problem:
I am unsure how to produce the output shown above, since I need to walk through the dictionary "j" in order to build it.
I tried something like the following.
def build_dictionary(key_chain):
    d_temp = list(d.keys())[0]
    ...unsure on how to
# Here key_chain contains ["app","Garden", "Flowers"], sent from the method that parses the dictionary and stores the key route to the value, in this case "1".
Can someone please help me build the dictionary that I would pass to the jsonify method? Any help would be appreciated.
Hope this is what you are asking for:
def build_dictionary(key_chain, j):
    # Walk down the nested dictionary one key at a time
    for k in key_chain:
        j = j.get(k)
    return j
kchain = ["app","Garden", "Flowers"]
>>> build_dictionary(kchain, j)
{'id': '1', 'state': 'fresh'}
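A variation of the same idea, sketched with functools.reduce and operator.getitem, fails loudly when a key in the route does not exist instead of passing None along. The function name get_by_key_chain and the shortened sample dictionary are assumptions for illustration.

```python
from functools import reduce
from operator import getitem


def get_by_key_chain(data, key_chain):
    """Follow key_chain through nested dicts; a missing dict key raises KeyError."""
    return reduce(getitem, key_chain, data)


j = {
    "app": {
        "id": 0,
        "status": "valid",
        "Garden": {
            "Flowers": {"id": "1", "state": "fresh"},
            "Soil": {"id": "2", "state": "stale"},
        },
    }
}

result = get_by_key_chain(j, ["app", "Garden", "Flowers"])
print(result)
```

In a Flask view, the returned dictionary could be passed to jsonify, and a KeyError could be caught there and turned into an error response.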

How to search through data with arbitrary amount of fields?

I have a web-form builder for science events. The event moderator creates a registration form with an arbitrary number of boolean, integer, enum and text fields.
The created form is used to:
register a new member for the event;
search through registered members.
What is the best search tool for the second task (searching the members of an event)? Is Elasticsearch a good fit for this task?
I wrote a post about how to index arbitrary data into Elasticsearch and then search it by specific fields and values, all without blowing up your index mapping.
The post is here: http://smnh.me/indexing-and-searching-arbitrary-json-data-using-elasticsearch/
In short, you will need to do the following steps to get what you want:
Create a special index described in the post.
Flatten the data you want to index using the flattenData function:
https://gist.github.com/smnh/30f96028511e1440b7b02ea559858af4.
Create a document with the original and flattened data and index it into Elasticsearch:
{
"data": { ... },
"flatData": [ ... ]
}
Optional: use Elasticsearch aggregations to find which fields and types have been indexed.
Execute queries on the flatData object to find what you need.
Example
Based on your original question, let's assume that the first event moderator created a form with the following fields to register members for the science event:
name string
age long
sex long - 0 for male, 1 for female
In addition to this data, the related event probably has some sort of id, let's call it eventId. So the final document could look like this:
{
"eventId": "2T73ZT1R463DJNWE36IA8FEN",
"name": "Bob",
"age": 22,
"sex": 0
}
Now, before we index this document, we will flatten it using the flattenData function:
flattenData(document);
This will produce the following array:
[
{
"key": "eventId",
"type": "string",
"key_type": "eventId.string",
"value_string": "2T73ZT1R463DJNWE36IA8FEN"
},
{
"key": "name",
"type": "string",
"key_type": "name.string",
"value_string": "Bob"
},
{
"key": "age",
"type": "long",
"key_type": "age.long",
"value_long": 22
},
{
"key": "sex",
"type": "long",
"key_type": "sex.long",
"value_long": 0
}
]
Then we wrap this data in a document, as shown earlier, and index it.
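To make the flatten-and-wrap step concrete, here is a Python sketch that reproduces the output shown above for top-level scalar fields. The original flattenData function lives in the linked gist and may handle more cases (nested objects, arrays, other types); the function name flatten_data and the "boolean" and "double" type names are assumptions made for this illustration.

```python
def flatten_data(document):
    """Flatten top-level scalar fields into key/type/value entries."""
    flat = []
    for key, value in document.items():
        # bool must be checked before int, because bool is a subclass of int
        if isinstance(value, bool):
            type_name = "boolean"   # assumed type name
        elif isinstance(value, int):
            type_name = "long"
        elif isinstance(value, float):
            type_name = "double"    # assumed type name
        elif isinstance(value, str):
            type_name = "string"
        else:
            raise TypeError(f"unsupported value type for key {key!r}")
        flat.append({
            "key": key,
            "type": type_name,
            "key_type": f"{key}.{type_name}",
            f"value_{type_name}": value,
        })
    return flat


document = {
    "eventId": "2T73ZT1R463DJNWE36IA8FEN",
    "name": "Bob",
    "age": 22,
    "sex": 0,
}

# Wrap the original and the flattened data in the document to be indexed
indexed_document = {"data": document, "flatData": flatten_data(document)}
print(indexed_document["flatData"])
```

Because the value is stored under a type-specific field such as value_long or value_string, two forms can use the same key with different types without causing a mapping conflict.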
Then the second event moderator creates another form. It has one new field (city), one field with the same name and type as before (name), and one field with the same name but a different type (sex):
name string
city string
sex string - "male" or "female"
This event moderator decided that instead of having 0 and 1 for male and female, his form will allow choosing between two strings - "male" and "female".
Let's try to flatten the data submitted by this form:
flattenData({
"eventId": "F1BU9GGK5IX3ZWOLGCE3I5ML",
"name": "Alice",
"city": "New York",
"sex": "female"
});
This will produce the following data:
[
{
"key": "eventId",
"type": "string",
"key_type": "eventId.string",
"value_string": "F1BU9GGK5IX3ZWOLGCE3I5ML"
},
{
"key": "name",
"type": "string",
"key_type": "name.string",
"value_string": "Alice"
},
{
"key": "city",
"type": "string",
"key_type": "city.string",
"value_string": "New York"
},
{
"key": "sex",
"type": "string",
"key_type": "sex.string",
"value_string": "female"
}
]
Then, after wrapping the flattened data in a document and indexing it into Elasticsearch we can execute complicated queries.
For example, to find members named "Bob" registered for the event with ID 2T73ZT1R463DJNWE36IA8FEN we can execute the following query:
{
  "query": {
    "bool": {
      "must": [
        {
          "nested": {
            "path": "flatData",
            "query": {
              "bool": {
                "must": [
                  {"term": {"flatData.key": "eventId"}},
                  {"match": {"flatData.value_string.keyword": "2T73ZT1R463DJNWE36IA8FEN"}}
                ]
              }
            }
          }
        },
        {
          "nested": {
            "path": "flatData",
            "query": {
              "bool": {
                "must": [
                  {"term": {"flatData.key": "name"}},
                  {"match": {"flatData.value_string": "bob"}}
                ]
              }
            }
          }
        }
      ]
    }
  }
}
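Since every condition in such a query follows the same pattern (one nested clause per key/value pair), the request body can be assembled programmatically. The following Python sketch builds the same structure as the query above as a plain dictionary; the helper names nested_clause and build_query are hypothetical, and sending the body to Elasticsearch is left out.

```python
import json


def nested_clause(key, value_field, value):
    """Build one nested clause that matches a flatData entry by key and value."""
    return {
        "nested": {
            "path": "flatData",
            "query": {
                "bool": {
                    "must": [
                        {"term": {"flatData.key": key}},
                        {"match": {f"flatData.{value_field}": value}},
                    ]
                }
            },
        }
    }


def build_query(clauses):
    """Combine nested clauses so that all of them must match."""
    return {"query": {"bool": {"must": list(clauses)}}}


query = build_query([
    nested_clause("eventId", "value_string.keyword", "2T73ZT1R463DJNWE36IA8FEN"),
    nested_clause("name", "value_string", "bob"),
])
print(json.dumps(query, indent=2))
```

Each condition needs its own nested clause because the key and the value must match inside the same flatData entry.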
Elasticsearch automatically detects the type of a field's content in order to index it correctly, even if no mapping has been defined beforehand. So yes, Elasticsearch is well suited to these cases.
However, you may want to fine-tune this behavior, or the default mapping applied by Elasticsearch may not correspond to what you need. In that case, take a look at the default mapping or, for even more control, the dynamic templates feature.
If you let your end users decide the keys you store things in, you'll have an ever-growing mapping and cluster state, which is problematic.
This case and a suggested solution are covered in this article on common problems with Elasticsearch.
Essentially, you want to have everything that can possibly be user-defined as a value. Using nested documents, you can have a key-field and differently mapped value fields to achieve pretty much the same.
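As a rough sketch of that idea, an index mapping could declare a nested field with one keyword field for the key and one value field per type. The mapping below is hypothetical: it is written as a Python dictionary using the typeless mapping syntax of Elasticsearch 7 and later, the field names follow the flatData examples above, and the chosen types are assumptions rather than the mapping from any of the linked articles.

```python
import json

# Hypothetical index body: user-defined names are stored as values of "key",
# so new form fields never add new entries to the mapping.
index_body = {
    "mappings": {
        "properties": {
            # Original document kept for retrieval only, not indexed
            "data": {"type": "object", "enabled": False},
            "flatData": {
                "type": "nested",
                "properties": {
                    "key": {"type": "keyword"},
                    "type": {"type": "keyword"},
                    "key_type": {"type": "keyword"},
                    # One value field per supported type
                    "value_string": {
                        "type": "text",
                        "fields": {"keyword": {"type": "keyword"}},
                    },
                    "value_long": {"type": "long"},
                },
            },
        }
    }
}

print(json.dumps(index_body, indent=2))
```

With a fixed mapping like this, the number of mapped fields stays constant no matter how many different form fields the moderators define.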

Resources