Getting error to normalize nested list in Python

I have a list of nested dictionaries. The following is just the first element of the list:
{'id': 'abcde',
'authorization': None,
'operation_type': 'xx',
'method': 'card',
'transaction_type': 'asd',
'card': {'type': 'dd',
'brand': 'vv',
'address': {'line1': 'xxxxxxx',
'line2': '',
'line3': '',
'state': 'xx',
'city': 'xxx',
'postal_code': '12345',
'country_code': 'xx'},
'card_number': '123456XXXXXX7890',
'holder_name': 'name user,
'expiration_year': '20',
'expiration_month': '02',
'allows_charges': True,
'allows_payouts': True,
'bank_name': 'abc bank',
'bank_code': '000'},
'status': 'fgh',
'conciliated': True,
'creation_date': '2018-09-23T23:58:17-05:00',
'operation_date': '2018-09-23T23:58:17-05:00',
'description': 'asdmdefdsa',
'error_message': 'sdaskjflj',
'order_id': 'ashdgjasdfhk',
'amount': 418.0,
'customer': {'name': 'abc',
'last_name': 'xyz',
'email': 'abcdef#hotmail.com',
'phone_number': '12345678',
'address': None,
'creation_date': '2018-09-23T23:58:18-05:00',
'external_id': None,
'clabe': None},
'fee': {'amount': 0.56, 'tax': 0.91, 'currency': 'XXX'},
'currency': 'XXX'},
{'id': 'abcde',
'authorization': None,
'operation_type': 'xx',
'method': 'card',
'transaction_type': 'asd',
'card': {'type': 'dd',
'brand': 'vv',
'address': {'line1': 'xxxxxxx',
'line2': '',
'line3': '',
'state': 'xx',
'city': 'xxx',
'postal_code': '12345',
'country_code': 'xx'},
'card_number': '123456XXXXXX7890',
'holder_name': 'name user,
'expiration_year': '20',
'expiration_month': '02',
'allows_charges': True,
'allows_payouts': True,
'bank_name': 'abc bank',
'bank_code': '000'},
'status': 'fgh',
'conciliated': True,
'creation_date': '2018-09-23T23:58:17-05:00',
'operation_date': '2018-09-23T23:58:17-05:00',
'description': 'asdmdefdsa',
'error_message': 'sdaskjflj',
'order_id': 'ashdgjasdfhk',
'amount': 418.0,
'customer': {'name': 'abc',
'last_name': 'xyz',
'email': 'abcdef#hotmail.com',
'phone_number': '12345678',
'address': None,
'creation_date': '2018-09-23T23:58:18-05:00',
'external_id': None,
'clabe': None},
'fee': {'amount': 0.56, 'tax': 0.91, 'currency': 'XXX'},
'currency': 'XXX'}
I want to normalize the data into a DataFrame. I wrote the code as json_normalize(data), but I am getting the following error:
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
in ()
----> 1 df = json_normalize(data)

/anaconda3/lib/python3.6/site-packages/pandas/io/json/normalize.py in json_normalize(data, record_path, meta, meta_prefix, record_prefix, errors, sep)
    201 # TODO: handle record value which are lists, at least error
    202 # reasonably
--> 203 data = nested_to_record(data, sep=sep)
    204 return DataFrame(data)
    205 elif not isinstance(record_path, list):

/anaconda3/lib/python3.6/site-packages/pandas/io/json/normalize.py in nested_to_record(ds, prefix, sep, level)
     86 else:
     87 v = new_d.pop(k)
---> 88 new_d.update(nested_to_record(v, newkey, sep, level + 1))
     89 new_ds.append(new_d)
     90

/anaconda3/lib/python3.6/site-packages/pandas/io/json/normalize.py in nested_to_record(ds, prefix, sep, level)
     82 new_d[newkey] = v
     83 if v is None: # pop the key if the value is None
---> 84 new_d.pop(k)
     85 continue
     86 else:

KeyError: 'address'
I understand that the error occurs because address is None, but I don't know how to fix it. Any help in this regard would be highly appreciated. Thanks in advance.
(Please note that the data is dummy data)

The dictionary is badly formatted. First of all, you have lines like the following:
'holder_name': 'name user,
where the value 'name user is not a valid string, since it's not enclosed by a single quote character on the right.
Second, in your code you have two elements of a list, that is, two dictionaries, each of them starting with {'id': ..., as opposed to a single element as claimed.
After fixing the value of 'holder_name' in both dictionaries and wrapping them in a two-member list, you can go ahead and use json_normalize, and you would get output like the following (printed to stdout):
   amount authorization card.address.city card.address.country_code ... operation_type     order_id status transaction_type
0   418.0          None               xxx                        xx ...             xx ashdgjasdfhk    fgh              asd
1   418.0          None               xxx                        xx ...             xx ashdgjasdfhk    fgh              asd
[2 rows x 42 columns]

I tried to reproduce this error but couldn't. After creating a Python 3 venv and installing pandas with pip, I copied your code (a Python dictionary, not JSON; my mistake, thanks @AlessandroCosentino, +1 from me) into an editor and found that lines 16 and 56 are missing a closing single quote:
'holder_name': 'name user,
and should be
'holder_name': 'name user',
from pandas.io.json import json_normalize
data = {'id': 'abcde',
'authorization': None,
'operation_type': 'xx',
'method': 'card',
'transaction_type': 'asd',
'card': {'type': 'dd',
'brand': 'vv',
'address': {'line1': 'xxxxxxx',
'line2': '',
'line3': '',
'state': 'xx',
'city': 'xxx',
'postal_code': '12345',
'country_code': 'xx'},
'card_number': '123456XXXXXX7890',
'holder_name': 'name user',
'expiration_year': '20',
'expiration_month': '02',
'allows_charges': True,
'allows_payouts': True,
'bank_name': 'abc bank',
'bank_code': '000'},
'status': 'fgh',
'conciliated': True,
'creation_date': '2018-09-23T23:58:17-05:00',
'operation_date': '2018-09-23T23:58:17-05:00',
'description': 'asdmdefdsa',
'error_message': 'sdaskjflj',
'order_id': 'ashdgjasdfhk',
'amount': 418.0,
'customer': {'name': 'abc',
'last_name': 'xyz',
'email': 'abcdef#hotmail.com',
'phone_number': '12345678',
'address': None,
'creation_date': '2018-09-23T23:58:18-05:00',
'external_id': None,
'clabe': None},
'fee': {'amount': 0.56, 'tax': 0.91, 'currency': 'XXX'},
'currency': 'XXX'},
{'id': 'abcde',
'authorization': None,
'operation_type': 'xx',
'method': 'card',
'transaction_type': 'asd',
'card': {'type': 'dd',
'brand': 'vv',
'address': {'line1': 'xxxxxxx',
'line2': '',
'line3': '',
'state': 'xx',
'city': 'xxx',
'postal_code': '12345',
'country_code': 'xx'},
'card_number': '123456XXXXXX7890',
'holder_name': 'name user',
'expiration_year': '20',
'expiration_month': '02',
'allows_charges': True,
'allows_payouts': True,
'bank_name': 'abc bank',
'bank_code': '000'},
'status': 'fgh',
'conciliated': True,
'creation_date': '2018-09-23T23:58:17-05:00',
'operation_date': '2018-09-23T23:58:17-05:00',
'description': 'asdmdefdsa',
'error_message': 'sdaskjflj',
'order_id': 'ashdgjasdfhk',
'amount': 418.0,
'customer': {'name': 'abc',
'last_name': 'xyz',
'email': 'abcdef#hotmail.com',
'phone_number': '12345678',
'address': None,
'creation_date': '2018-09-23T23:58:18-05:00',
'external_id': None,
'clabe': None},
'fee': {'amount': 0.56, 'tax': 0.91, 'currency': 'XXX'},
'currency': 'XXX'}
print(json_normalize(data))
With the quotes fixed, this prints the normalized DataFrame without raising the error.
The missing quote could easily be caught by a smart editor (e.g. Sublime Text) with Python syntax highlighting. What editor are you using?
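For readers who still hit the KeyError from the question and cannot change their pandas version, one possible workaround is to flatten each record by hand before building the DataFrame. The following is only a minimal sketch under that assumption, not pandas' own implementation: flatten_record is a hypothetical helper, the "." separator is chosen to mimic the column names shown above, and the sample record is a trimmed version of the question's data.

```python
def flatten_record(record, prefix="", sep="."):
    """Flatten nested dicts into one level with dotted keys (hypothetical helper).

    None values are kept as they are instead of being popped.
    """
    flat = {}
    for key, value in record.items():
        new_key = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested dictionaries, carrying the dotted prefix along.
            flat.update(flatten_record(value, new_key, sep))
        else:
            flat[new_key] = value
    return flat


# Trimmed sample based on the question's data (assumption: same shape as the real records).
records = [
    {"id": "abcde",
     "card": {"brand": "vv", "address": {"city": "xxx"}},
     "customer": {"name": "abc", "address": None}},
]

flat_records = [flatten_record(r) for r in records]
print(flat_records[0])
```

The resulting list of flat dictionaries can then be passed to pandas.DataFrame(flat_records), which accepts a list of dicts.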

Related

Filtering list of dicts based on a key value in python

I have a list of dictionaries in python which looks like below
list = [{'entityType': 'source', 'databaseName': 'activities', 'type': 'POSTGRES', 'children': [{'id': '3c144414-0c73-41df-9f0e-4dd7cb5af46e',
'path': ['Activities (DEV)', 'public'],
'type': 'CONTAINER',
'containerType': 'FOLDER'}], 'checkTableAuthorizer': False},
{'entityType': 'source', 'databaseName': 'pd-prod-dev', 'type': 'POSTGRES', 'children':
[{'id': '75d84ead-a9fe-4949-bc21-d4deb34e1ae1',
'path': ['pg-prd (DEV-RR)', 'pghero'],
'tag': 'PWGqdrkcD08=',
'type': 'CONTAINER',
'containerType': 'FOLDER'},
{'id': 'facc2c20-7561-430f-ac35-547b5bc7a92f',
'path': ['pg-prd (DEV-RR)', 'public'],
'tag': 'gcUL0NTOc+4=',
'type': 'CONTAINER',
'containerType': 'FOLDER'}], 'checkTableAuthorizer': False},
{'entityType': 'source', 'databaseName': 'pd-prod-prd', 'type': 'POSTGRES', 'children':
[{'id': '75d84ead-a9fe-4949-bc21-d4deb34e1ae1',
'path': ['pg-prd (PRD-RR)', 'pghero'],
'tag': 'PWGqdrkcD08=',
'type': 'CONTAINER',
'containerType': 'FOLDER'},
{'id': 'facc2c20-7561-430f-ac35-547b5bc7a92f',
'path': ['pg-prd (PRD-RR)', 'public'],
'tag': 'gcUL0NTOc+4=',
'type': 'CONTAINER',
'containerType': 'FOLDER'}], 'checkTableAuthorizer': False}]
This is just a sample. The actual list has 30 dictionaries. What I am trying to do is keep, in each dictionary, only the nested children entries whose path contains the 'public' schema. So my expected output would be
public_list = [{'entityType': 'source', 'databaseName': 'activities', 'type': 'POSTGRES', 'children': [{'id': '3c144414-0c73-41df-9f0e-4dd7cb5af46e',
'path': ['Activities (DEV)', 'public'],
'type': 'CONTAINER',
'containerType': 'FOLDER'}], 'checkTableAuthorizer': False},
{'entityType': 'source', 'databaseName': 'pd-prod-dev', 'type': 'POSTGRES', 'children':
[{'id': 'facc2c20-7561-430f-ac35-547b5bc7a92f',
'path': ['pg-prd (DEV-RR)', 'public'],
'tag': 'gcUL0NTOc+4=',
'type': 'CONTAINER',
'containerType': 'FOLDER'}], 'checkTableAuthorizer': False},
{'entityType': 'source', 'databaseName': 'pd-prod-prd', 'type': 'POSTGRES', 'children':
[{'id': 'facc2c20-7561-430f-ac35-547b5bc7a92f',
'path': ['pg-prd (PRD-RR)', 'public'],
'tag': 'gcUL0NTOc+4=',
'type': 'CONTAINER',
'containerType': 'FOLDER'}], 'checkTableAuthorizer': False}]
I tried accessing the nested children list by iterating, but I can't work out what condition to filter on:
for d in list:
    for k, v in d.items():
        if k == 'children':
            print(v)
I would love to apply this as a function, since I'll be reusing it on a pandas column of lists of dicts.

You could create a function that gets the public data for children of each entry:
def get_public_data(data):
    result = []
    children = data.get("children")
    if children:
        for row in children:
            path = row.get("path")
            if path and "public" in path:
                result.append(row)
    return result
And then create a new list of entries where you just replace the children key:
public_list = []
for x in entities:  # entities is the input list of dicts from the question
    public_data = get_public_data(x)
    if public_data:
        public_list.append({**x, "children": public_data})
Combine these two and you'll get the function you need.
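As a rough illustration of what combining the two pieces might look like, here is a minimal sketch. The name filter_public and the shortened sample entries are assumptions made for the example (the third entry is invented to show an entry being dropped); only the "children" and "path" keys come from the question.

```python
def get_public_data(entry):
    """Return the children of one entry whose path contains 'public'."""
    return [
        child for child in entry.get("children") or []
        if "public" in (child.get("path") or [])
    ]


def filter_public(entries):
    """Keep entries that have at least one 'public' child, dropping their other children."""
    result = []
    for entry in entries:
        public_children = get_public_data(entry)
        if public_children:
            # Copy the entry and replace only its children list.
            result.append({**entry, "children": public_children})
    return result


# Shortened, partly hypothetical sample in the shape of the question's data.
entities = [
    {"databaseName": "activities",
     "children": [{"id": "a1", "path": ["Activities (DEV)", "public"]}]},
    {"databaseName": "pd-prod-dev",
     "children": [{"id": "b1", "path": ["pg-prd (DEV-RR)", "pghero"]},
                  {"id": "b2", "path": ["pg-prd (DEV-RR)", "public"]}]},
    {"databaseName": "no-public",
     "children": [{"id": "c1", "path": ["other", "pghero"]}]},
]

public_list = filter_public(entities)
print([e["databaseName"] for e in public_list])
```

Because filter_public takes a list and returns a list, it should also be usable on a pandas column that holds such lists, for example through Series.apply.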

IIUC, you want to collect the entries where all children have a public schema?
Assuming every entry has a 'children' key and each child's 'path' always has two elements, you can use a simple comprehension:
[d for d in lst
 if all(e['path'][1] == 'public' for e in d['children'])
]
NB: I called your input lst, since list is a Python builtin.

Making a list of dict from a list of lists with the list[0] as keys and other lists as values

Could you tell me how I can get a list of dicts from the list below, with a[0] as the keys of each dict and each row of a[1:] as the corresponding values?
a = [['PORT', 'NAME', 'STATUS', 'VLAN', 'DUPLEX', 'SPEED', 'TYPE', 'FC_MODE'], ['Gi1/0/1', 'S1-P1-01 Cisco_Roo', 'connected', '248', 'a-full', 'a-1000', '10/100/1000BaseTX', ''], ['Gi1/0/2', '', 'notconnect', '121', 'auto', 'auto', '10/100/1000BaseTX', ''], ['Gi1/0/3', '', 'notconnect', '121', 'auto', 'auto', '10/100/1000BaseTX', '']]
I want to get:
[{'PORT' : 'Gi1/0/1',
'NAME' : 'S1-P1-01 Cisco_Roo',
.
.
.
},
{'PORT' : 'Gi1/0/2',
'NAME' : '',
.
.
.
}]

The easiest is probably:
[dict(zip(a[0], x)) for x in a[1:]]
This walks through each element from 1 onwards, and combines it with the first element, converting to a dictionary.
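To make the one-liner concrete, here is a small self-contained sketch on a shortened version of the input. The last, incomplete row is an assumption added to show one caveat: zip stops at the shorter sequence, so a row with missing cells silently produces a dict with fewer keys.

```python
a = [
    ["PORT", "NAME", "STATUS"],
    ["Gi1/0/1", "S1-P1-01 Cisco_Roo", "connected"],
    ["Gi1/0/2", "", "notconnect"],
    ["Gi1/0/3"],  # hypothetical short row, not in the original data
]

# The first row holds the keys; every following row holds the values.
header, *rows = a
records = [dict(zip(header, row)) for row in rows]
print(records)
```

If short rows should be padded rather than truncated, itertools.zip_longest could be used in place of zip.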

Does this work?
a = [
['PORT', 'NAME', 'STATUS', 'VLAN', 'DUPLEX', 'SPEED', 'TYPE', 'FC_MODE'],
['Gi1/0/1', 'S1-P1-01 Cisco_Roo', 'connected', '248', 'a-full', 'a-1000', '10/100/1000BaseTX', ''],
['Gi1/0/2', '', 'notconnect', '121', 'auto', 'auto', '10/100/1000BaseTX', ''],
['Gi1/0/3', '', 'notconnect', '121', 'auto', 'auto', '10/100/1000BaseTX', '']
]
import pandas as pd
pd.DataFrame(a[1:],columns=a[0]).to_dict(orient="records")
Creates this output.
[{'PORT': 'Gi1/0/1',
'NAME': 'S1-P1-01 Cisco_Roo',
'STATUS': 'connected',
'VLAN': '248',
'DUPLEX': 'a-full',
'SPEED': 'a-1000',
'TYPE': '10/100/1000BaseTX',
'FC_MODE': ''},
{'PORT': 'Gi1/0/2',
'NAME': '',
'STATUS': 'notconnect',
'VLAN': '121',
'DUPLEX': 'auto',
'SPEED': 'auto',
'TYPE': '10/100/1000BaseTX',
'FC_MODE': ''},
{'PORT': 'Gi1/0/3',
'NAME': '',
'STATUS': 'notconnect',
'VLAN': '121',
'DUPLEX': 'auto',
'SPEED': 'auto',
'TYPE': '10/100/1000BaseTX',
'FC_MODE': ''}]

Having issue while merging graph in xlsxwriter

I am trying to combine two charts in xlsxwriter:
1: a line chart
2: a stacked column chart on the y2 axis.
After combining them, the second chart starts from the negative value of the line chart. I want it to start from the x axis only, i.e. from 0. Here is the code I have used for both charts.
line_chart.set_x_axis({
'name_font': {'size': 14, 'bold': True},
'major_gridlines': {'visible': False},
'major_tick_mark': 'none',
'minor_tick_mark': 'none'
})
line_chart.set_y_axis({
'name': 'Sales',
'name_font': {'size': 14, 'bold': True},
'major_gridlines': {'visible': False},
'major_tick_mark': 'none',
'minor_tick_mark': 'none',
})
line_chart.add_series({
'name': ' ',
'categories': '=data_sheet!$F$2:$F$25',
'values': '=data_sheet!$D$2:$D$25',
'line': {'color': '#FFCC00'},
'marker': {'type': 'square',
'border': {'color': 'red'},
'fill': {'color': 'red'}
},
})
bar_chart = workbook.add_chart({'type': 'column', 'subtype': 'stacked'})
bar_chart.set_x_axis({
'name_font': {'size': 14, 'bold': True},
'major_gridlines': {'visible': False},
'major_tick_mark': 'none',
'minor_tick_mark': 'none'
})
bar_chart.set_y2_axis({
'name': 'Traffic',
'name_font': {'size': 14, 'bold': True},
'major_gridlines': {'visible': False},
'major_tick_mark': 'none',
'minor_tick_mark': 'none'
})
bar_chart.add_series({
'name': '',
'fill': {'color': 'white'},
'categories': '=data_sheet!$F$2:$F$25',
'values': '=data_sheet!$B$2:$B$25',
'y2_axis': True,
})
bar_chart.add_series({
'name': '',
'fill': {'color': '#117A9A'},
'categories': '=data_sheet!$F$2:$F$25',
'values': '=data_sheet!$A$2:$A$25',
'y2_axis': True,
})
line_chart.combine(bar_chart)
line_chart.set_legend({'position': 'low'})
line_chart.set_size({'width': 1025, 'height': 360})
line_chart.set_chartarea({'border': {'none': True}})

Match key and value from two different dictionaries and merge them

I have these two dictionaries:
{'data': {'id': '001_101_001', 'name': 'chview', 'type': 'multiple', 'mapping': {}},
{'id': '001_102_001', 'name': 'view', 'type': 'binary', 'mapping': {'abc':'exp'}}
And:
{'queries':{'view': 'text', 'chview': 'text1'}}
The desired output should be:
{'new_data': {'001_101_001': {'query': 'text1', 'type': 'multiple', 'mapping': {}},
'001_102_001': {'query': 'text', 'type': 'binary', 'mapping': {'abc': 'exp'}}}}
Because there are a lot of these dictionaries, I need to match them by 'name' so that each query ends up under the corresponding id. Any ideas?

Your first dictionary has a problem: it is not valid Python, because the second inner dictionary has no key. It should be a list of dictionaries.
{"data" :[
{'id': '001_101_001', 'name': 'chview', 'type': 'multiple', 'mapping': {}},
{'id': '001_102_001', 'name': 'view', 'type': 'binary', 'mapping': {'abc':'exp'}}
]}
Complete code:
data = {"data" :[
{'id': '001_101_001', 'name': 'chview', 'type': 'multiple', 'mapping': {}},
{'id': '001_102_001', 'name': 'view', 'type': 'binary', 'mapping': {'abc':'exp'}}
]}
queries = {"queries" : {'view': 'text', 'chview': 'text1'}}
new_data = {}
for d in data["data"]:
    item = {d["id"]: {
        "query": queries["queries"][d["name"]],
        "type": d["type"],
        "mapping": d["mapping"]
    }}
    new_data.update(item)
print({"new_data": new_data})
OUTPUT:
{'new_data': {'001_101_001': {'query': 'text1', 'type': 'multiple', 'mapping': {}}, '001_102_001': {'query': 'text', 'type': 'binary', 'mapping': {'abc': 'exp'}}}}
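The same matching can also be written as a dict comprehension. The sketch below is one possible variant, not the only way to do it: it uses dict.get so that an entry whose name has no matching query gets None instead of raising KeyError. The third entry, named 'unknown', is a hypothetical addition to show that case.

```python
data = {"data": [
    {"id": "001_101_001", "name": "chview", "type": "multiple", "mapping": {}},
    {"id": "001_102_001", "name": "view", "type": "binary", "mapping": {"abc": "exp"}},
    # Hypothetical entry with no matching query, not part of the original data.
    {"id": "001_103_001", "name": "unknown", "type": "binary", "mapping": {}},
]}
queries = {"queries": {"view": "text", "chview": "text1"}}

lookup = queries["queries"]  # maps name -> query text
new_data = {
    d["id"]: {
        "query": lookup.get(d["name"]),  # None when the name is not in queries
        "type": d["type"],
        "mapping": d["mapping"],
    }
    for d in data["data"]
}
print({"new_data": new_data})
```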

TypeError: string indices must be integers - json

I want to get the value 'abc.com/p/B3N' from this JSON:
{'id': 123456, 'parent_id': 0, 'number': '23856', 'order_key': 'abc', 'created_via': 'checkout', 'version': '3.6.4', 'status': 'processing', 'currency': 'USD', 'date_created': '2019-10-05T13:18:49', 'date_created_gmt': '2019-10-05T13:18:49', 'date_modified': '2019-10-05T13:19:20', 'date_modified_gmt': '2019-10-05T13:19:20', 'discount_total': '0.00', 'discount_tax': '0.00', 'shipping_total': '0.00', 'shipping_tax': '0.00', 'cart_tax': '0.00', 'total': '0.40', 'total_tax': '0.00', 'prices_include_tax': False, 'customer_id': 0, 'customer_ip_address': '111.101.111.111', 'customer_user_agent': 'Mozilla/5.0 (Linux; Android 8.0.0; SAMSUNG SM-J337P) AppleWebKit/537.36 (KHTML, like Gecko) SamsungBrowser/10.1 Chrome/71.0.3578.99 Mobile Safari/537.36', 'customer_note': '', 'billing': {'first_name': '', 'last_name': '', 'company': '', 'address_1': '', 'address_2': '', 'city': '', 'state': '', 'postcode': '', 'country': '', 'email': 'abc#gmail.com', 'phone': ''}, 'shipping': {'first_name': '', 'last_name': '', 'company': '', 'address_1': '', 'address_2': '', 'city': '', 'state': '', 'postcode': '', 'country': ''}, 'payment_method': 'paypal', 'payment_method_title': 'PayPal', 'transaction_id': '851R', 'date_paid': '2019-10-05T13:19:20', 'date_paid_gmt': '2019-10-05T13:19:20', 'date_completed': None, 'date_completed_gmt': None, 'cart_hash': '0675772a1e', 'meta_data': [{'id': 123456, 'key': 'is_vat_exempt', 'value': 'no'}, {'id': 123456, 'key': 'Payment type', 'value': 'instant'}, {'id': 274929, 'key': '_paypal_status', 'value': 'completed'}, {'id': 123456, 'key': 'PayPal Transaction Fee', 'value': '0.32'}], 'line_items': [{'id': 10927, 'name': 'Jeans', 'product_id': 1234, 'variation_id': 0, 'quantity': 1, 'tax_class': '', 'subtotal': '0.10', 'subtotal_tax': '0.00', 'total': '0.10', 'total_tax': '0.00', 'taxes': [], 'meta_data': [{'id': 100000, 'key': '', 'value': 'Views $0.00 × 500'}, {'id': 100001, 'key': '', 'value': 'Worldwide'}, {'id': 100002, 'key': '', 'value': 
'abc.com/p/B3N'}, {'id': 100003, 'key': '', 'value': '17'}], 'sku': '', 'price': 0.1}], 'tax_lines': [], 'shipping_lines': [], 'fee_lines': [{'id': 10928, 'name': 'PayPal Fee (Free Fee for order over $5)', 'tax_class': '0', 'tax_status': 'taxable', 'amount': '0.3', 'total': '0.30', 'total_tax': '0.00', 'taxes': [], 'meta_data': [{'id': 122543, 'key': '_legacy_fee_key', 'value': 'paypal-fee'}]}], 'coupon_lines': [], 'refunds': [], '_links': {'self': [{'href': 'abc.com'}], 'collection': [{'href': 'abc.com'}]}}
This is my code:
m = (wcapi.get(order + ordernumber).json())
n = json.dumps(m)
o = json.loads(n)
for i in o:
    if i['id'] == '100002':
        print(i['value'])
        break
and I got this error:
if i['id'] == '100002':
TypeError: string indices must be integers
I have searched other topics but could not find a solution. Thanks for your help!

When you do for i in o and o is a dictionary, the for loop iterates over the keys of o, which are strings in your case. So i is a string, hence the error.
To get the value, you need to know the exact structure of o.
Here are some examples:
o['id'] # 123456
o['billing']['email'] # "abc#gmail.com"
Now to get the value you want (note that the ids are integers, not strings):
first_line_items_meta = o['line_items'][0]['meta_data']
for item in first_line_items_meta:
    if item['id'] == 100002:
        print(item['value']) # "abc.com/p/B3N"
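The two points above can be checked on a trimmed copy of the order. This is only a sketch: find_meta_value is a hypothetical helper, and the dictionary keeps just the keys needed for the lookup. It searches every line item rather than only the first one, which is an assumption about what is wanted when an order has several items.

```python
# Trimmed copy of the order from the question (only the keys used here).
o = {
    "id": 123456,
    "line_items": [
        {"id": 10927,
         "meta_data": [
             {"id": 100001, "key": "", "value": "Worldwide"},
             {"id": 100002, "key": "", "value": "abc.com/p/B3N"},
         ]},
    ],
}

# Iterating over a dict yields its keys (strings here), which is why
# i['id'] raised "string indices must be integers" in the question.
keys = list(o)


def find_meta_value(order, meta_id):
    """Return the 'value' of the first line-item meta entry with this id, else None."""
    for line_item in order.get("line_items", []):
        for meta in line_item.get("meta_data", []):
            if meta.get("id") == meta_id:
                return meta.get("value")
    return None


print(keys)
print(find_meta_value(o, 100002))
```

Passing the id as the string '100002', as in the question's code, finds nothing because the ids in the data are integers.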
