Count the number of dictionaries from a list in a dictionary - python-3.x

I have a dictionary that contains a list of dictionaries. I want to count the number of dictionaries within the list. My data looks like this:
{
"Ratings": [
{
"User": "User1",
"Rating": 2
},
{
"User": "User2",
"Rating": 4
}
]
}
I tried the following which worked but wanted to know if there is a better way.
print(sum(map(len, dict_1.values())))
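If only the list under "Ratings" matters, calling len on that list is the more direct form. The following is a minimal sketch, assuming the dictionary is named dict_1 as in the question and holds data shaped like the sample; the isinstance filter is an extra assumption for the case where some values are not lists of dictionaries.

```python
dict_1 = {
    "Ratings": [
        {"User": "User1", "Rating": 2},
        {"User": "User2", "Rating": 4},
    ]
}

# Direct count for one known key.
ratings_count = len(dict_1["Ratings"])

# Count across every list value, keeping only items that are dictionaries.
total_dicts = sum(
    isinstance(item, dict)
    for value in dict_1.values()
    if isinstance(value, list)
    for item in value
)

print(ratings_count, total_dicts)
```

The sum(map(len, ...)) version from the question gives the same number as long as every value is a list that contains only dictionaries.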

Related

MongoDB compare users based on nested array

I have a collection of users, each with a nested array answers. I need to compare their responses: every time two users gave the same answer, I add that answer's coeff value to a running total. Then, if the total is above a threshold (for example 10), I send back all the matching users along with their own total coeff.
So the question is how to compare users by walking the nested array (answers), checking whether the same answer (answerNumber) has the same value (answerChoice), adding another field of the nested array (answerCoeff) to a total, and returning the total coeff together with the users that reach a given amount.
{
"id": "string",
"birthdate": "2021-06-18T13:53:30.443Z",
"userName": "string",
"pictures": [
"string"
],
"answers": [
{
"answerNumber": 0,
"answerChoice": 3,
"answerCoeff": 2
},
{
"answerNumber": 1,
"answerChoice": 2,
"answerCoeff": 5
}
...
],
}
Expected output:
{
"matchs": [
{
"ids": [
"string"
],
"userName": "string",
"pictures": [
"string"
],
"coeff": 0
}
]
}
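The comparison described in this question can be sketched in plain Python on documents already loaded into memory. This is not a MongoDB query, and several points are assumptions: the function names total_coeff and find_matches are hypothetical, the coeff that gets added is the other user's answerCoeff, and "above 10" is read as strictly greater than the threshold.

```python
def total_coeff(reference, other):
    """Sum the other user's answerCoeff for each answer both users chose identically."""
    ref_choices = {a["answerNumber"]: a["answerChoice"] for a in reference["answers"]}
    return sum(
        a["answerCoeff"]
        for a in other["answers"]
        if a["answerNumber"] in ref_choices
        and ref_choices[a["answerNumber"]] == a["answerChoice"]
    )


def find_matches(reference, users, threshold=10):
    """Return the users whose total coeff against the reference exceeds the threshold."""
    matches = []
    for user in users:
        if user["id"] == reference["id"]:
            continue  # do not compare the reference user with itself
        coeff = total_coeff(reference, user)
        if coeff > threshold:
            matches.append({
                "ids": [user["id"]],
                "userName": user["userName"],
                "pictures": user["pictures"],
                "coeff": coeff,
            })
    return {"matchs": matches}


# Made-up sample users, only for illustration.
me = {"id": "me", "userName": "me", "pictures": [], "answers": [
    {"answerNumber": 0, "answerChoice": 3, "answerCoeff": 2},
    {"answerNumber": 1, "answerChoice": 2, "answerCoeff": 5},
]}
user_a = {"id": "a", "userName": "a", "pictures": [], "answers": [
    {"answerNumber": 0, "answerChoice": 3, "answerCoeff": 6},
    {"answerNumber": 1, "answerChoice": 2, "answerCoeff": 5},
]}
user_b = {"id": "b", "userName": "b", "pictures": [], "answers": [
    {"answerNumber": 0, "answerChoice": 3, "answerCoeff": 6},
    {"answerNumber": 1, "answerChoice": 1, "answerCoeff": 5},
]}

result = find_matches(me, [me, user_a, user_b])
print(result)
```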

List iteration on python with mongodb

I am working on a small python project where I need to create a mongodb entry.
This is the list of values I receive from another collection:
["India", "Australia", "South Africa"]
So the above list contains three items. What I want from my next collection is:
{
"_id": ObjectId('some id'),
"name": "Player",
"value": "India"
}
{
"_id": ObjectId('some id'),
"name": "Player",
"value": "Australia"
}
{
"_id": ObjectId('some id'),
"name": "Player",
"value": "South Africa"
}
I only want the values from the list to go into the value key, while the name stays constant. The document should repeat for each entry, with only the value key changing according to the entries in the list.
How do I approach this problem in python?
You can approach this problem in different ways. A very basic one uses list comprehensions like this:
values_list = ["India", "Australia", "South Africa"]
names_list = ["Peter", "Paul", "Mary"]
def create_objects(name, values):
    # This returns a plain list of dicts and should be adapted to create 'real' MongoDB objects/entries.
    return [{"_id": "some id", "name": name, "value": value} for value in values]
objects = [create_objects(name, values_list) for name in names_list]
print(objects)
Another way is to compute all possible combinations beforehand (the Cartesian product, available as product in itertools), which avoids the two nested for-loops:
from itertools import product
objects = [{"_id": "some id", "name": name, "value": value} for name, value in product(names_list, values_list)]
print(objects)
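For the exact case in the question, where the name is always "Player", a single comprehension over the values is enough. The sketch below only builds the documents; the helper name build_player_docs is hypothetical, and "_id" is left out on the assumption that MongoDB should generate it on insert.

```python
values_list = ["India", "Australia", "South Africa"]


def build_player_docs(values, name="Player"):
    # "_id" is omitted on purpose: MongoDB assigns an ObjectId when a document is inserted.
    return [{"name": name, "value": value} for value in values]


docs = build_player_docs(values_list)
print(docs)

# With pymongo and a reachable server, the documents could then be stored with
# collection.insert_many(docs), where `collection` is a hypothetical pymongo
# Collection object that this sketch does not create.
```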

How to Generate Counts of Elements Returned from Map Function?

I have a map function
function (doc) {
  for (var n = 0; n < doc.Observations.length; n++) {
    emit(doc.Scenario, doc.Observations[n].Label);
  }
}
the above returns the following:
{"key":"Splunk","value":"Organized"},
{"key":"Splunk","value":"Organized"},
{"key":"Splunk","value":"Organized"},
{"key":"Splunk","value":"Generate"},
{"key":"Splunk","value":"Ingest"}
I'm looking to design a reduce function that will then return the counts of the above values, something akin to:
Organized: 3
Generate: 1
Ingest: 1
My map function has to filter on my Scenario field, hence why I have it as an emitted key in the map function.
I've tried a number of the built-in reduce functions, but I end up getting a count of rows, or nothing at all, because the available functions don't apply.
I just need the counts of each of the elements that appear in the values field. Also, the values shown here are only representative; there could be hundreds of different values in that field, for what it's worth.
I really appreciate the help!
Here's sample input:
{
"_id": "dummyId",
"test": "test",
"Team": "Alpha",
"CreatedOnUtc": "2019-06-20T21:39:09.5940830Z",
"CreatedOnLocal": "2019-06-20T17:39:09.5940830-04:00",
"Participants": [
{
"Name": "A",
"Role": "Person"
}
],
"Observations": [
{
"Label": "Report"
},
{
"Label": "Ingest"
},
{
"Label": "Generate"
},
{
"Label": "Ingest"
}
]
}
Emit each value (the label) as part of the key in your map function and emit 1 as the value, so that the reduce step sums the increments per key and a count is maintained for each label. The reduced output should then look like what you are asking for.
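The idea can be checked outside CouchDB with a small plain-Python simulation. This is only a sketch of the logic, not view code: map_doc and count_rows are hypothetical names, and the composite (Scenario, Label) key is an assumption made so that filtering by Scenario stays possible. In CouchDB itself, the built-in _sum or _count reduce queried with group=true should give one row per distinct key once the label is part of the emitted key.

```python
from collections import Counter


def map_doc(doc):
    """Mimic the map step: one ((Scenario, Label), 1) row per observation."""
    for observation in doc["Observations"]:
        yield (doc["Scenario"], observation["Label"]), 1


def count_rows(rows):
    """Mimic a grouped sum reduce: add up the emitted 1s for each key."""
    counts = Counter()
    for key, value in rows:
        counts[key] += value
    return counts


# Sample document shaped like the rows shown in the question.
doc = {
    "Scenario": "Splunk",
    "Observations": [{"Label": "Organized"}] * 3
    + [{"Label": "Generate"}, {"Label": "Ingest"}],
}

counts = count_rows(map_doc(doc))
for (scenario, label), n in counts.items():
    print(f"{label}: {n}")
```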

How to search through data with arbitrary amount of fields?

I have a web-form builder for science events. The event moderator creates a registration form with an arbitrary number of boolean, integer, enum and text fields.
The created form is used to:
register a new member for the event;
search through registered members.
What is the best search tool for the second task (searching the members of an event)? Is Elasticsearch a good fit for this task?
I wrote a post about how to index arbitrary data into Elasticsearch and then to search it by specific fields and values. All this, without blowing up your index mapping.
The post is here: http://smnh.me/indexing-and-searching-arbitrary-json-data-using-elasticsearch/
In short, you will need to do the following steps to get what you want:
Create a special index described in the post.
Flatten the data you want to index using the flattenData function:
https://gist.github.com/smnh/30f96028511e1440b7b02ea559858af4.
Create a document with the original and flattened data and index it into Elasticsearch:
{
"data": { ... },
"flatData": [ ... ]
}
Optional: use Elasticsearch aggregations to find which fields and types have been indexed.
Execute queries on the flatData object to find what you need.
Example
Based on your original question, let's assume that the first event moderator created a form with the following fields to register members for the science event:
name string
age long
sex long - 0 for male, 1 for female
In addition to this data, the related event probably has some sort of id, let's call it eventId. So the final document could look like this:
{
"eventId": "2T73ZT1R463DJNWE36IA8FEN",
"name": "Bob",
"age": 22,
"sex": 0
}
Now, before we index this document, we will flatten it using the flattenData function:
flattenData(document);
This will produce the following array:
[
{
"key": "eventId",
"type": "string",
"key_type": "eventId.string",
"value_string": "2T73ZT1R463DJNWE36IA8FEN"
},
{
"key": "name",
"type": "string",
"key_type": "name.string",
"value_string": "Bob"
},
{
"key": "age",
"type": "long",
"key_type": "age.long",
"value_long": 22
},
{
"key": "sex",
"type": "long",
"key_type": "sex.long",
"value_long": 0
}
]
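To make the transformation concrete, here is a rough Python equivalent for flat documents. It is not the implementation from the gist: flatten_data is a hypothetical name, nested objects and arrays are not handled, and the "boolean" and "double" type names are assumptions (only "string" and "long" appear in the example above).

```python
def flatten_data(document):
    """Turn a flat dict into a list of key/type/value entries."""
    flat = []
    for key, value in document.items():
        # bool must be tested before int, because bool is a subclass of int.
        if isinstance(value, bool):
            type_name = "boolean"
        elif isinstance(value, int):
            type_name = "long"
        elif isinstance(value, float):
            type_name = "double"
        elif isinstance(value, str):
            type_name = "string"
        else:
            raise TypeError(f"unsupported value type for key {key!r}")
        flat.append({
            "key": key,
            "type": type_name,
            "key_type": f"{key}.{type_name}",
            f"value_{type_name}": value,
        })
    return flat


document = {
    "eventId": "2T73ZT1R463DJNWE36IA8FEN",
    "name": "Bob",
    "age": 22,
    "sex": 0,
}
flat = flatten_data(document)
print(flat)
```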
Then we will wrap this data in a document as shown before and index it.
Then the second event moderator creates another form. It has one new field, one field with the same name and type as in the first form, and one field with the same name but a different type:
name string
city string
sex string - "male" or "female"
This event moderator decided that instead of having 0 and 1 for male and female, his form will allow choosing between two strings - "male" and "female".
Let's try to flatten the data submitted by this form:
flattenData({
"eventId": "F1BU9GGK5IX3ZWOLGCE3I5ML",
"name": "Alice",
"city": "New York",
"sex": "female"
});
This will produce the following data:
[
{
"key": "eventId",
"type": "string",
"key_type": "eventId.string",
"value_string": "F1BU9GGK5IX3ZWOLGCE3I5ML"
},
{
"key": "name",
"type": "string",
"key_type": "name.string",
"value_string": "Alice"
},
{
"key": "city",
"type": "string",
"key_type": "city.string",
"value_string": "New York"
},
{
"key": "sex",
"type": "string",
"key_type": "sex.string",
"value_string": "female"
}
]
Then, after wrapping the flattened data in a document and indexing it into Elasticsearch we can execute complicated queries.
For example, to find members named "Bob" registered for the event with ID 2T73ZT1R463DJNWE36IA8FEN we can execute the following query:
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "flatData",
"query": {
"bool": {
"must": [
{"term": {"flatData.key": "eventId"}},
{"match": {"flatData.value_string.keyword": "2T73ZT1R463DJNWE36IA8FEN"}}
]
}
}
}
},
{
"nested": {
"path": "flatData",
"query": {
"bool": {
"must": [
{"term": {"flatData.key": "name"}},
{"match": {"flatData.value_string": "bob"}}
]
}
}
}
}
]
}
}
}
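Since every condition follows the same nested pattern, the query body can be assembled programmatically. The sketch below builds the same structure as the query above as a Python dict; nested_clause and build_query are hypothetical helper names, and sending the body to Elasticsearch is left out.

```python
import json


def nested_clause(key, value_field, value, match_type="match"):
    """Build one nested condition: flatData.key equals `key` and the value field matches."""
    return {
        "nested": {
            "path": "flatData",
            "query": {
                "bool": {
                    "must": [
                        {"term": {"flatData.key": key}},
                        {match_type: {f"flatData.{value_field}": value}},
                    ]
                }
            },
        }
    }


def build_query(*clauses):
    """Combine nested conditions so that all of them must hold."""
    return {"query": {"bool": {"must": list(clauses)}}}


query = build_query(
    nested_clause("eventId", "value_string.keyword", "2T73ZT1R463DJNWE36IA8FEN"),
    nested_clause("name", "value_string", "bob"),
)
print(json.dumps(query, indent=2))
```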
Elasticsearch automatically detects the field content in order to index it correctly, even if the mapping hasn't been defined previously. So yes: Elasticsearch is well suited to these cases.
However, you may want to fine-tune this behavior, or the default mapping applied by Elasticsearch may not correspond to what you need. In that case, take a look at the default mapping or, for even further control, the dynamic templates feature.
If you let your end users decide the keys you store things in, you'll have an ever-growing mapping and cluster state, which is problematic.
This case and a suggested solution are covered in this article on common problems with Elasticsearch.
Essentially, you want to have everything that can possibly be user-defined as a value. Using nested documents, you can have a key-field and differently mapped value fields to achieve pretty much the same.

Forming a view/reduce for couchdb

I'm fairly new to couchDB and the concept of views and reduces, and I could not find anything that would help me get my data in the format I want to consume it in.
My Data - Each set is its own document
{
"_id": "2012-10-28",
"scores" : [
{
"bob": 3,
"dole": 5
}
]
}
{
"_id" : "2012-10-29",
"scores" : [
{
"bob": 3,
"dole": 6
}
]
}
I would like a view/reduce that returns something like:
"bob" : {
"2012-10-27": 3,
"2012-10-28": 3,
...
},
"dole": {
"2012-10-27": 5,
"2012-10-28": 6,
...
}
If this is not possible with my source data, I can reorganize it, but it will be tough.
Any help is greatly appreciated. I would also like to know of any good resources that explain the best practices for views and reduces.
Unless all the dates are known and you can hardcode them in the reduce function, I think it's a bit difficult to do what you need with map/reduce functions.
If it is ok to output something like:
{
"key": ["bob", "2012-10-27"],
"value": 3
}
Then this map function should work:
var scoresMapFn = function (doc) {
  var scores = doc.scores[0];
  for (var k in scores) {
    emit([k, doc._id], scores[k]);
  }
};
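Rows keyed by [name, date] can be regrouped into the per-name shape from the question on the client side. This is a small Python sketch under assumptions: group_scores is a hypothetical helper, the rows are taken to be already fetched from the view, and each row's value is the plain score, as the map function above emits it.

```python
from collections import defaultdict


def group_scores(rows):
    """Regroup view rows keyed by [name, date] into {name: {date: score}}."""
    grouped = defaultdict(dict)
    for row in rows:
        name, date = row["key"]
        grouped[name][date] = row["value"]
    return dict(grouped)


# Rows as the view would return them for the two sample documents.
rows = [
    {"key": ["bob", "2012-10-28"], "value": 3},
    {"key": ["bob", "2012-10-29"], "value": 3},
    {"key": ["dole", "2012-10-28"], "value": 5},
    {"key": ["dole", "2012-10-29"], "value": 6},
]

by_name = group_scores(rows)
print(by_name)
```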
Note that the structure of the original document could be optimised, in my opinion. You have an array for scores, but it holds only one element: an object with one key per name/player. This could be changed to:
{
"_id": "2012-10-28",
"scores": [
{
"name": "bob",
"score": 3
},
{
"name": "dole",
"score": 5
}
]
}
which would make it easier to manipulate.
Hope this helps a bit.

Resources