Create one nested object with two objects from dictionary - python-3.x

I'm not sure the title of my question correctly describes the issue I'm facing.
I'm reading the following table of data from a spreadsheet and passing it as a dataframe:
Name Description Value
foo foobar 5
baz foobaz 4
bar foofoo 8
I need to transform this table of data to json following a specific schema.
I'm trying to get the following output:
{'global': {'Name': 'bar', 'Description': 'foofoo', 'spec': {'Value': '8'}}}
So far I'm able to get the global and spec objects but I'm not sure how I should combine them to get the expected output above.
I wrote this:
for index, row in df.iterrows():
    if row['Description'] == 'foofoo':
        # `global` is a reserved keyword in Python, so the dict is named global_obj
        global_obj = row.to_dict()
        spec = row.to_dict()
        del global_obj['Value']
        del spec['Name']
        del spec['Description']
        print("global:", global_obj)
        print("spec:", spec)
with the following output:
global: {'Name': 'bar', 'Description': 'foofoo'}
spec: {'Value': '8'}
How can I combine these two objects to get to the desired output?

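This should give you that output (the dict is named global_obj here because `global` is a reserved keyword in Python and cannot be used as a variable name):
global_obj['spec'] = spec
combined = {'global': global_obj}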
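Put together, the split-and-nest step can be sketched as a small helper. This is a minimal sketch, assuming the row has already been converted with row.to_dict(); the helper name nest_row and its spec_keys parameter are hypothetical and not part of pandas.

```python
def nest_row(record, spec_keys=("Value",)):
    """Return {'global': {..., 'spec': {...}}} built from a flat dict.

    `record` stands in for the result of row.to_dict(); `spec_keys`
    (a hypothetical parameter) lists the keys that belong under 'spec'.
    """
    # Keys not listed in spec_keys stay at the top level of 'global'.
    global_obj = {k: v for k, v in record.items() if k not in spec_keys}
    # Keys listed in spec_keys move into the nested 'spec' object.
    global_obj["spec"] = {k: record[k] for k in spec_keys if k in record}
    return {"global": global_obj}


row = {"Name": "bar", "Description": "foofoo", "Value": "8"}
combined = nest_row(row)
print(combined)
```

Building new dicts instead of deleting keys leaves the original row dict untouched, and json.dumps(combined) from the standard library would turn the result into a JSON string.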

Try this and see if it runs faster; the slow speed might be due to iterrows. I suggest exporting the dataframe to a dictionary first and iterating over that dictionary instead.
Name Description Value
0 foo foobar 5
1 baz foobaz 4
2 bar foofoo 8
# Export the dataframe to a dictionary, using the 'index' option
M = df.to_dict('index')
r = {}
q = []
# iterate through the dictionary items (key, value pairs)
for i, j in M.items():
    # assign the row dict to the key 'global'
    r['global'] = j
    # popitem() removes and returns the last (key, value) pair,
    # here the 'Value' column, much like pop() on a list;
    # wrapping it in dict() and assigning it to 'spec'
    # nests the spec key inside the global key
    r['global']['spec'] = dict([j.popitem()])
    # dict(r) makes a shallow copy so entries already in q are not overwritten;
    # you could use copy.copy or copy.deepcopy to preserve the same state
    q.append(dict(r))
for item in q:
    print(item)
{'global': {'Name': 'foo', 'Description': 'foobar', 'spec': {'Value': 5}}}
{'global': {'Name': 'baz', 'Description': 'foobaz', 'spec': {'Value': 4}}}
{'global': {'Name': 'bar', 'Description': 'foofoo', 'spec': {'Value': 8}}}
dict popitem

Related

How to remove common item from list of dictionaries after grouping

I have a list of dictionaries like the one below. I want to group the dictionaries by grade and convert the list into a single dictionary whose keys are the grade values and whose values are lists of dictionaries.
Input:
[
{'name':'abc','mark':'99','grade':'A'},
{'name':'xyz','mark':'90','grade':'A'},
{'name':'123','mark':'70','grade':'C'},
]
I want my output like below:
{
A: [ {'name': 'abc','mark':'99'}, {'name': 'xyz','mark':'90'} ],
C: [ {'name': '123','mark':'70'} ]
}
I tried sorted and groupby, but I was not able to remove grade from the dictionaries.
Use a loop with dict.setdefault:
l = [{'name':'abc','mark':'99','grade':'A'},
     {'name':'xyz','mark':'90','grade':'A'},
     {'name':'123','mark':'70','grade':'C'},
    ]
out = {}
for d in l:
    # avoid mutating the original dictionaries
    d = d.copy()
    # get grade, try to get the key in "out"
    # if the key doesn't exist, initialize with an empty list
    out.setdefault(d.pop('grade'), []).append(d)
print(out)
Output:
{'A': [{'name': 'abc', 'mark': '99'},
{'name': 'xyz', 'mark': '90'}],
'C': [{'name': '123', 'mark': '70'}],
}
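Since the question mentions sorted and groupby, the same grouping can also be sketched with itertools.groupby. This is an alternative under the assumption that every dictionary has a 'grade' key; the variable names records and grouped are illustrative only.

```python
from itertools import groupby
from operator import itemgetter

records = [
    {'name': 'abc', 'mark': '99', 'grade': 'A'},
    {'name': 'xyz', 'mark': '90', 'grade': 'A'},
    {'name': '123', 'mark': '70', 'grade': 'C'},
]

by_grade = itemgetter('grade')

# groupby only merges adjacent items, so the records are sorted by grade first.
# The inner comprehension rebuilds each dict without its 'grade' key,
# which leaves the original dictionaries unchanged.
grouped = {
    grade: [{k: v for k, v in d.items() if k != 'grade'} for d in group]
    for grade, group in groupby(sorted(records, key=by_grade), key=by_grade)
}
print(grouped)
```

The setdefault loop does the same work in one pass without sorting, so groupby is mainly worth it when the data is already sorted by the grouping key.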

How to get a dict from a list of dict according to some threshold

I have a list of dicts like the one below:
list_dict = [{'Name': 'Andres', 'score': 0.17669814825057983},
{'Name': 'Paul', 'score': 0.14028045535087585},
{'Name': 'Feder', 'score': 0.1379694938659668},
{'Name': 'James', 'score': 0.1348174512386322}]
I want to output another list of dicts containing only the entries whose score is higher than a threshold of 0.15.
Expected output: [{'Name':'Andres', 'score' : 0.1766..}]
I did this, but the code is terrible and the output is wrongly formatted:
l = []
for i in range(len(list_dict)):
    for k in list_dict[i]['Name']:
        if list_dict[i]['score']>0.15:
            print(k)
Maybe this is what you're looking for?
You're actually very close; you just missed a few details.
Each item in list_dict is a dictionary, so you can iterate over the items directly and check each score. There is no need to index into the list or to loop over the name.
new_dc = list()
for item in list_dict:  # each item is a dictionary
    if item['score'] > 0.15:  # it's better to use a meaningful variable name
        new_dc.append(item)
print(new_dc)  # [{'Name': 'Andres', 'score': 0.17669814825057983}]
Alternatively, you can use a list comprehension:
output = [item for item in list_dict if item['score'] > 0.15]
assert new_dc == output  # silence means they're the same
1st approach, using a loop:
final_list = []
for each in list_dict:  # simply iterate through each dict in the list and compare its score
    if each['score']>0.15:
        final_list.append(each)
print(final_list)
2nd approach, using a list comprehension:
final_list = [item for item in list_dict if item['score']>0.15]
print(final_list)

Issues Sorting a list, list out of order

Hi, I'm trying to write a basic GUI using Python and PyQt5 that displays information received from an API call. When the information is returned from the API it seems to be out of order, and I can't figure out how to sort the data before I send it to the GUI. The list is a list of dictionaries:
interfaces = [
{'name': 'GigabitEthernet 0/0'},
{'name': 'GigabitEthernet 1/0/1'},
{'name': 'GigabitEthernet 1/0/10'},
{'name': 'GigabitEthernet 1/0/11'},
...
]
Any advice would be appreciated. I would expect the data to be sorted so that 1/0/1 through 1/0/9 all come before 1/0/10.
Thanks
Try this:
# Your list of dicts
interfaces = [
    {'name': 'GigabitEthernet 0/0'},
    {'name': 'GigabitEthernet 1/0/11'},
    {'name': 'GigabitEthernet 1/0/10'},
    {'name': 'GigabitEthernet 1/0/1'},
    {'name': 'GigabitEthernet 1/0/2'}
]
new_interfaces = []
# Create a list of dicts whose value corresponding to the 'name' key is a list
# containing the ethernet type (in this case, 'GigabitEthernet') and a list of
# the interface number parts as strings (like ['1', '0', '11'])
for entry in [i["name"] for i in interfaces]:
    new_interfaces.append({'name': [entry.split()[0], entry.split()[1].split("/")]})
# If one of those number lists contains only 2 elements, add a third
# element with a value of '0'
for entry in new_interfaces:
    if len(entry['name'][1]) == 2:
        entry['name'][1].append('0')
# Sort the list according to the joined interface number
new_interfaces = sorted(new_interfaces, key=lambda d: int(''.join(d['name'][1])))
# Re-create the original list with sorted values and original structure
interfaces = []
for entry in new_interfaces:
    interfaces.append({'name': f"{entry['name'][0]} {'/'.join(entry['name'][1])}"})
print(interfaces)
Output:
[{'name': 'GigabitEthernet 0/0/0'}, {'name': 'GigabitEthernet 1/0/1'}, {'name': 'GigabitEthernet 1/0/2'}, {'name': 'GigabitEthernet 1/0/10'}, {'name': 'GigabitEthernet 1/0/11'}]
This algorithm joins the parts of the interface number into a string without the '/'s, so that it can be converted to an integer and compared numerically.
Example:
1/0/10 would be 1010
1/0/1 would be 101
Since 101 is lower than 1010 as a number, sorted() puts it first.
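One caveat with joining the digits: 1/1/1 becomes 111, which is lower than 1010, so it would sort before 1/0/10. A sketch of an alternative is to compare the parts as a tuple of integers. It assumes every name has the form "<type> <n>/<n>/..." with purely numeric parts; the helper name interface_key is hypothetical.

```python
def interface_key(entry):
    """Sort key: interface type first, then each number part as an int."""
    # Split 'GigabitEthernet 1/0/10' into the type and the '1/0/10' part.
    kind, _, numbers = entry['name'].partition(' ')
    # Tuples compare element by element, so (1, 0, 2) < (1, 0, 10) < (1, 1, 1).
    return kind, tuple(int(part) for part in numbers.split('/'))


interfaces = [
    {'name': 'GigabitEthernet 1/1/1'},
    {'name': 'GigabitEthernet 0/0'},
    {'name': 'GigabitEthernet 1/0/11'},
    {'name': 'GigabitEthernet 1/0/10'},
    {'name': 'GigabitEthernet 1/0/1'},
    {'name': 'GigabitEthernet 1/0/2'},
]

sorted_interfaces = sorted(interfaces, key=interface_key)
for entry in sorted_interfaces:
    print(entry['name'])
```

This version also leaves the names as they were received, so 0/0 is not rewritten to 0/0/0.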

Python extract unknown string from dataframe column

New to python - using v3. I have a dataframe column that looks like
object
{"id":"http://Demo/1.7","definition":{"name":{"en-US":"Time Training New"}},"objectType":"Activity"}
{"id":"http://Demo/1.7","definition":{"name":{"en-US":"Time Influx"}},"objectType":"Activity"}
{"id":"http://Demo/1.7","definition":{"name":{"en-US":"Social"}},"objectType":"Activity"}
{"id":"http://Demo/2.18","definition":{"name":{"en-US":"Personal"}},"objectType":"Activity"}
I need to extract the activity, which starts in a variable place and is of variable length. I do not know what the activities are. All the questions I've found are to extract a specific string or pattern, not an unknown one. If I use the code below
dataExtract['activity'] = dataExtract['object'].str.find('en-US":"')
Will give me the start index and this
dataExtract['activity'] = dataExtract['object'].str.rfind('"}}')
Will give me the end index. So I have tried combining these
dataExtract['activity'] = dataExtract['object'].str[dataExtract['object'].str.find('en-US":"'):dataExtract['object'].str.rfind('"}}')]
But that just generates "NaN", which is clearly wrong. What syntax should I use, or is there a better way to do it? Thanks
I suggest converting the values to nested dictionaries and then extracting by nested keys:
#if necessary
#import ast
#dataExtract['object'] = dataExtract['object'].apply(ast.literal_eval)
dataExtract['activity'] = dataExtract['object'].apply(lambda x: x['definition']['name']['en-US'])
print (dataExtract)
object activity
0 {'id': 'http://Demo/1.7', 'definition': {'name... Time Training New
1 {'id': 'http://Demo/1.7', 'definition': {'name... Time Influx
2 {'id': 'http://Demo/1.7', 'definition': {'name... Social
3 {'id': 'http://Demo/2.18', 'definition': {'nam... Personal
Details:
print (dataExtract['object'].apply(lambda x: x['definition']))
0 {'name': {'en-US': 'Time Training New'}}
1 {'name': {'en-US': 'Time Influx'}}
2 {'name': {'en-US': 'Social'}}
3 {'name': {'en-US': 'Personal'}}
Name: object, dtype: object
print (dataExtract['object'].apply(lambda x: x['definition']['name']))
0 {'en-US': 'Time Training New'}
1 {'en-US': 'Time Influx'}
2 {'en-US': 'Social'}
3 {'en-US': 'Personal'}
Name: object, dtype: object
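Because the sample values are valid JSON, the standard library's json.loads is another way to parse them when the column holds strings. The sketch below uses a plain list in place of the dataframe column; the function name extract_activity and its lang parameter are hypothetical.

```python
import json

# Stand-in for the string values in dataExtract['object'].
raw_objects = [
    '{"id":"http://Demo/1.7","definition":{"name":{"en-US":"Time Training New"}},"objectType":"Activity"}',
    '{"id":"http://Demo/2.18","definition":{"name":{"en-US":"Personal"}},"objectType":"Activity"}',
]


def extract_activity(raw, lang="en-US"):
    """Parse one JSON string and return the activity name for `lang`."""
    parsed = json.loads(raw)
    return parsed["definition"]["name"][lang]


activities = [extract_activity(raw) for raw in raw_objects]
print(activities)
```

With pandas, the same function could be passed to dataExtract['object'].apply(extract_activity), assuming each cell is a JSON string rather than an already parsed dict.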

filter dataframe columns as you iterate through rows and create dictionary

I have the following table of data in a spreadsheet:
Name Description Value
foo foobar 5
baz foobaz 4
bar foofoo 8
I'm reading the spreadsheet and passing the data as a dataframe.
I need to transform this table of data to json following a specific schema.
I have the following script:
for index, row in df.iterrows():
    if row['Description'] == 'foofoo':
        print(row.to_dict())
which returns:
{'Name': 'bar', 'Description': 'foofoo', 'Value': '8'}
I want to be able to filter out a specific column. For example, to return this:
{'Name': 'bar', 'Description': 'foofoo'}
I know that I can print only the columns I want with print(row['Name'], row['Description']); however, this returns only the values, and I also want the keys.
How can I do this?
I wrote this entire thing only to realize that @anky_91 had already suggested it. Oh well...
import pandas as pd
data = {
    "name": ["foo", "abc", "baz", "bar"],
    "description": ["foobar", "foofoo", "foobaz", "foofoo"],
    "value": [5, 3, 4, 8],
}
df = pd.DataFrame(data=data)
print(df, end='\n\n')
# keep only the matching rows and the two wanted columns, one dict per row
rec_dicts = df.loc[df["description"] == "foofoo", ["name", "description"]].to_dict(
    "records"
)
print(rec_dicts)
Output:
name description value
0 foo foobar 5
1 abc foofoo 3
2 baz foobaz 4
3 bar foofoo 8
[{'name': 'abc', 'description': 'foofoo'}, {'name': 'bar', 'description': 'foofoo'}]
After converting the row to a dictionary, you can delete the key you don't need with:
d = row.to_dict()
del d['Value']
Now the dictionary will have only Name and Description.
You can try this:
import io
import pandas as pd
s="""Name,Description,Value
foo,foobar,5
baz,foobaz,4
bar,foofoo,8
"""
df = pd.read_csv(io.StringIO(s))
for index, row in df.iterrows():
    if row['Description'] == 'foofoo':
        print(row[['Name', 'Description']].to_dict())
Result:
{'Name': 'bar', 'Description': 'foofoo'}
