Convert multiple lines to a single list in Python - python-3.x

I have a file as follows:
Using Python, I want to convert these multiple lines into a single list, as in the output below. Please help me with this.
x = {
    "name": "Ken",
    "age": 45,
    "married": True,
    "children": ("Alice", "Bob"),
    "pets": ['Dog'],
    "cars": [
        {"model": "Audi A1", "mpg": 15.1},
        {"model": "Zeep Compass", "mpg": 18.1}
    ],
}
pdf = FPDF()
pdf.add_page()
pdf.set_font("Arial", size=12)
x.keys()
for key in x.keys():
    print(key)
    keys = key.strip().split("\n")
    keys = list(key)
    print(keys)
With this, I'm getting the output below:
['name']
['age']
['married']
['children']
['pets']
['cars']
Expected output:
['name','age','married','children','pets','cars']

from fpdf import FPDF

x = {
    "name": "Ken",
    "age": 45,
    "married": True,
    "children": ("Alice", "Bob"),
    "pets": ['Dog'],
    "cars": [
        {"model": "Audi A1", "mpg": 15.1},
        {"model": "Zeep Compass", "mpg": 18.1}
    ],
}
pdf = FPDF()
pdf.add_page()
pdf.set_font("Arial", size=12)
list_of_keys = [i for i in x.keys()]
print(list_of_keys)
Result:
['name', 'age', 'married', 'children', 'pets', 'cars']
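As a side note, the comprehension is not needed here; `list()` (or iterating the dict directly) does the same thing. A minimal sketch:

```python
x = {"name": "Ken", "age": 45, "married": True}

# list() consumes the keys view directly; no comprehension needed
list_of_keys = list(x.keys())
print(list_of_keys)  # ['name', 'age', 'married']

# Iterating a dict yields its keys, so list(x) is equivalent
assert list(x) == list_of_keys
```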

Related

Is this the best way to parse a Json output from Google Ads Stream

Is this the best way to parse a JSON output from a Google Ads stream? I am parsing the JSON with pandas and it is taking too much time; the record count is around 700K.
[{
    "results": [
        {
            "customer": {
                "resourceName": "customers/12345678900",
                "id": "12345678900",
                "descriptiveName": "ABC"
            },
            "campaign": {
                "resourceName": "customers/12345678900/campaigns/12345",
                "name": "Search_Google_Generic",
                "id": "12345"
            },
            "adGroup": {
                "resourceName": "customers/12345678900/adGroups/789789",
                "id": "789789",
                "name": "adgroup_details"
            },
            "metrics": {
                "clicks": "500",
                "conversions": 200,
                "costMicros": "90000000",
                "allConversionsValue": 5000.6936,
                "impressions": "50000"
            },
            "segments": {
                "device": "DESKTOP",
                "date": "2022-10-28"
            }
        }
    ],
    "fieldMask": "segments.date,customer.id,customer.descriptiveName,campaign.id,campaign.name,adGroup.id,adGroup.name,segments.device,metrics.costMicros,metrics.impressions,metrics.clicks,metrics.conversions,metrics.allConversionsValue",
    "requestId": "fdhfgdhfgjf"
}]
This is the sample JSON. I am saving the stream in a JSON file, then reading it with pandas and trying to dump it to a CSV file. I want to convert it to CSV format, like:
import pandas as pd

with open('Adgroups.json', encoding='utf-8') as inputfile:
    df = pd.read_json(inputfile)

df_new = pd.DataFrame(columns=['Date', 'Account_ID', 'Account', 'Campaign_ID', 'Campaign',
                               'Ad_Group_ID', 'Ad_Group', 'Device',
                               'Cost', 'Impressions', 'Clicks', 'Conversions', 'Conv_Value'])

for i in range(len(df['results'])):
    results = df['results'][i]
    for result in results:
        new_row = pd.Series({'Date': result['segments']['date'],
                             'Account_ID': result['customer']['id'],
                             'Account': result['customer']['descriptiveName'],
                             'Campaign_ID': result['campaign']['id'],
                             'Campaign': result['campaign']['name'],
                             'Ad_Group_ID': result['adGroup']['id'],
                             'Ad_Group': result['adGroup']['name'],
                             'Device': result['segments']['device'],
                             'Cost': result['metrics']['costMicros'],
                             'Impressions': result['metrics']['impressions'],
                             'Clicks': result['metrics']['clicks'],
                             'Conversions': result['metrics']['conversions'],
                             'Conv_Value': result['metrics']['allConversionsValue']})
        df_new = df_new.append(new_row, ignore_index=True)

df_new.to_csv('Adgroups.csv', encoding='utf-8', index=False)
Don't use df.append. It's very slow, because it has to copy the whole dataframe on every call; it was deprecated (and removed in pandas 2.0) for that reason.
You can build the rows using list comprehension before constructing the data frame:
import json
import pandas as pd

with open("Adgroups.json") as fp:
    data = json.load(fp)

columns = [
    "Date",
    "Account_ID",
    "Account",
    "Campaign_ID",
    "Campaign",
    "Ad_Group_ID",
    "Ad_Group",
    "Device",
    "Cost",
    "Impressions",
    "Clicks",
    "Conversions",
    "Conv_Value",
]

records = [
    (
        r["segments"]["date"],
        r["customer"]["id"],
        r["customer"]["descriptiveName"],
        r["campaign"]["id"],
        r["campaign"]["name"],
        r["adGroup"]["id"],
        r["adGroup"]["name"],
        r["segments"]["device"],
        r["metrics"]["costMicros"],
        r["metrics"]["impressions"],
        r["metrics"]["clicks"],
        r["metrics"]["conversions"],
        r["metrics"]["allConversionsValue"],
    )
    for d in data
    for r in d["results"]
]

df = pd.DataFrame(records, columns=columns)
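As a possible alternative (not part of the original answer), `pandas.json_normalize` can flatten this shape directly; nested dicts become dot-separated columns such as `customer.id`, so a rename step would still be needed to get the exact headers above. A sketch on a trimmed-down record:

```python
import pandas as pd

data = [{
    "results": [{
        "customer": {"id": "12345678900", "descriptiveName": "ABC"},
        "campaign": {"id": "12345", "name": "Search_Google_Generic"},
        "adGroup": {"id": "789789", "name": "adgroup_details"},
        "metrics": {"clicks": "500", "conversions": 200},
        "segments": {"device": "DESKTOP", "date": "2022-10-28"},
    }],
    "requestId": "fdhfgdhfgjf",
}]

# record_path descends into each "results" list; nested dicts flatten
# into dot-separated columns such as "customer.id" and "segments.date"
df = pd.json_normalize(data, record_path="results")
print(sorted(df.columns))
```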

Get a dict inside a list by value, without a for loop

I have this dict:
data_flights = {
    "prices": [
        {"city": "Paris", "iataCode": "AAA", "lowestPrice": 54, "id": 2},
        {"city": "Berlin", "iataCode": "BBB", "lowestPrice": 42, "id": 3},
        {"city": "Tokyo", "iataCode": "CCC", "lowestPrice": 485, "id": 4},
        {"city": "Sydney", "iataCode": "DDD", "lowestPrice": 551, "id": 5},
    ],
    "date": "31/03/2022"
}
Can I access a dict using a key value from one of the dicts, without using a for loop? Something like this:
data_flights["prices"]["city" == "Berlin"]
You can achieve this with either a comprehension or the built-in filter.
Comprehension:
[e for e in data_flights['prices'] if e['city'] == 'Berlin']
Filter:
list(filter(lambda e: e['city'] == 'Berlin', data_flights['prices']))
Both would result in:
[{'city': 'Berlin', 'iataCode': 'BBB', 'lowestPrice': 42, 'id': 3}]
You can use a list comprehension:
x = [a for a in data_flights["prices"] if a["city"] == "Berlin"]
>>> x
[{'city': 'Berlin', 'iataCode': 'BBB', 'lowestPrice': 42, 'id': 3}]
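If only the first match is needed, `next` over a generator expression stops at the first hit instead of building a whole list; a minimal sketch (the `None` default guards against no match):

```python
data_flights = {
    "prices": [
        {"city": "Paris", "iataCode": "AAA", "lowestPrice": 54, "id": 2},
        {"city": "Berlin", "iataCode": "BBB", "lowestPrice": 42, "id": 3},
    ],
    "date": "31/03/2022",
}

# next() stops at the first matching dict instead of scanning the whole list
berlin = next((e for e in data_flights["prices"] if e["city"] == "Berlin"), None)
print(berlin)  # {'city': 'Berlin', 'iataCode': 'BBB', 'lowestPrice': 42, 'id': 3}
```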

How can I create a data structure with a list of key-value pairs?

I have the data below as input and am looking to create the data structure shown after it.
Input:
Key,type,alias
Aggregator_aggregator_se,Sorter,So_so
Aggregator_aggregator_se,Sorter,So_so
Aggregator_aggregator_se,Sorter,So_so
Expression_expr_se,Aggregator,Ag_ag
Expression_expr_se,Aggregator,Ag_ag
Expression_expr_se,Aggregator,Ag_ag
Expression_expr_se,Aggregator,Ag_ag
Expression_expr_se,Aggregator,Ag_ag
Expression_expr_se,Sorter,So_so
Expression_expr_se,Sorter,So_so
Expression_expr_se,Aggregator,Ag_ag
Expression_expr_se,Aggregator,Ag_ag
Filter_filter_se,Expression,Ex_ex
Filter_filter_se,Expression,Ex_ex
Filter_filter_se,Expression,Ex_ex
Filter_filter_se,Expression,Ex_ex
Filter_filter_se,Expression,Ex_ex
Output:
{ 'Aggregator_aggregator_se': [{type: 'Sorter', count: 3, value: 'So_so'}],
  'Expression_expr_se': [{type: 'Aggregator', count: 7, value: 'Ag_ag'}, {type: 'Sorter', count: 2, value: 'So_so'}],
  'Filter_filter_se': [{type: 'Expression', count: 5, value: 'Ex_ex'}]
}
How should I achieve this data structure?
I am very new to Python, so I need some help.
Try:
import csv

out = {}
with open("your_data_file.txt") as f_in:
    reader = csv.reader(f_in)
    # skip header
    next(reader)
    for line in reader:
        # skip empty lines
        if not line:
            continue
        key, type_, value = line
        out.setdefault(key, {}).setdefault(type_, {}).setdefault(value, 0)
        out[key][type_][value] += 1

out = {
    k: [
        {"type": kk, "count": vvv, "value": kkk}
        for kk, vv in v.items()
        for kkk, vvv in vv.items()
    ]
    for k, v in out.items()
}
print(out)
Prints:
{
"Aggregator_aggregator_se": [
{"type": "Sorter", "count": 3, "value": "So_so"}
],
"Expression_expr_se": [
{"type": "Aggregator", "count": 7, "value": "Ag_ag"},
{"type": "Sorter", "count": 2, "value": "So_so"},
],
"Filter_filter_se": [{"type": "Expression", "count": 5, "value": "Ex_ex"}],
}
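As a variation on the approach above, `collections.Counter` can replace the nested `setdefault` chain by counting whole `(key, type, alias)` rows in one pass; a sketch assuming the rows are already parsed into tuples:

```python
from collections import Counter

# Parsed (Key, type, alias) rows from the input above (abbreviated)
rows = [
    ("Aggregator_aggregator_se", "Sorter", "So_so"),
    ("Aggregator_aggregator_se", "Sorter", "So_so"),
    ("Expression_expr_se", "Aggregator", "Ag_ag"),
    ("Expression_expr_se", "Sorter", "So_so"),
]

# One pass counts each distinct (key, type, alias) triple
counts = Counter(rows)

out = {}
for (key, type_, value), n in counts.items():
    out.setdefault(key, []).append({"type": type_, "count": n, "value": value})
print(out)
```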

Python dictionary: how can I create a (structured) unique dictionary list if a key contains a list of values of other keys?

I have the unstructured dictionary list below, in which some keys hold lists of values.
I am not sure if this question is strange; this is the actual dictionary payload that we receive from the source, where values are not aligned with their respective entries.
[
    {
        "dsply_nm": [
            "test test",
            "test test",
            "",
            ""
        ],
        "start_dt": [
            "2021-04-21T00:01:00-04:00",
            "2021-04-21T00:01:00-04:00",
            "2021-04-21T00:01:00-04:00",
            "2021-04-21T00:01:00-04:00"
        ],
        "exp_dt": [
            "2022-04-21T00:01:00-04:00",
            "2022-04-21T00:01:00-04:00",
            "2022-04-21T00:01:00-04:00",
            "2022-04-21T00:01:00-04:00"
        ],
        "hrs_pwr": [
            "14",
            "12",
            "13",
            "15"
        ],
        "make_nm": "test",
        "model_nm": "test",
        "my_yr": "1980"
    }
]
"the length of list cannot not be expected and it could be more than 4 sometimes or less in some keys"
#Expected:
i need to check if the above dictionary are in proper structure or not and based on that it should return the proper dictionary list associate with each item
for eg:
def get_dict_list(items):
    if type(items == not structure)
        result = get_associated_dict_items_mapped
        return result
    else:
        return items
#Final result
expected_dict_list = [
    {"dsply_nm": "test test", "start_dt": "2021-04-21T00:01:00-04:00", "exp_dt": "2022-04-21T00:01:00-04:00", "hrs_pwr": "14"},
    {"dsply_nm": "test test", "start_dt": "2021-04-21T00:01:00-04:00", "exp_dt": "2022-04-21T00:01:00-04:00", "hrs_pwr": "12", "make_nm": "test", "model_nm": "test", "my_yr": "1980"},
    {"dsply_nm": "", "start_dt": "2021-04-21T00:01:00-04:00", "exp_dt": "2022-04-21T00:01:00-04:00", "hrs_pwr": "13"},
    {"dsply_nm": "", "start_dt": "2021-04-21T00:01:00-04:00", "exp_dt": "2022-04-21T00:01:00-04:00", "hrs_pwr": "15"}
]
In the above dictionary payload, the part below is associated with the second dictionary item and has to be mapped accordingly:
"make_nm": "test",
"model_nm": "test",
"my_yr": "1980"
Can anyone help with this?
Thanks
Since customer_details is a list, you can rebuild its first entry by zipping the keys with the values:
dict(zip(customer_details[0], customer_details[0].values()))
This yields:
{'insured_details': ['asset', 'asset', 'asset'],
 'id': ['213', '214', '233'],
 'dept': ['account', 'sales', 'market'],
 'salary': ['12', '13', '14']}
I think a couple of list comprehensions will get you going. If you would like me to unwind them into more traditional for loops, just let me know.
import json

def get_dict_list(item):
    first_value = list(item.values())[0]
    if not isinstance(first_value, list):
        return [item]
    return [{key: item[key][i] for key in item.keys()} for i in range(len(first_value))]

customer_details = [
    {
        "insured_details": "asset",
        "id": "xxx",
        "dept": "account",
        "salary": "12"
    },
    {
        "insured_details": ["asset", "asset", "asset"],
        "id": ["213", "214", "233"],
        "dept": ["account", "sales", "market"],
        "salary": ["12", "13", "14"]
    }
]

customer_details_cleaned = []
for detail in customer_details:
    customer_details_cleaned.extend(get_dict_list(detail))

print(json.dumps(customer_details_cleaned, indent=4))
That should give you:
[
    {
        "insured_details": "asset",
        "id": "xxx",
        "dept": "account",
        "salary": "12"
    },
    {
        "insured_details": "asset",
        "id": "213",
        "dept": "account",
        "salary": "12"
    },
    {
        "insured_details": "asset",
        "id": "214",
        "dept": "sales",
        "salary": "13"
    },
    {
        "insured_details": "asset",
        "id": "233",
        "dept": "market",
        "salary": "14"
    }
]
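The original payload also mixes list fields with scalar fields like `make_nm`. A sketch of one way to handle that, under the simplifying assumption that scalar fields should be repeated into every row (the question's expected output attaches them to only one row, which would need extra rules):

```python
def explode_payload(item):
    """Split a dict of parallel lists into one dict per index.

    Assumes all list values share the same length; scalar values
    (like "make_nm") are repeated into every resulting row.
    """
    list_lens = [len(v) for v in item.values() if isinstance(v, list)]
    if not list_lens:
        return [item]  # nothing to split
    n = max(list_lens)
    return [
        {k: (v[i] if isinstance(v, list) else v) for k, v in item.items()}
        for i in range(n)
    ]

payload = [{
    "dsply_nm": ["a", "b"],
    "hrs_pwr": ["14", "12"],
    "make_nm": "test",
}]

rows = [row for item in payload for row in explode_payload(item)]
print(rows)
# [{'dsply_nm': 'a', 'hrs_pwr': '14', 'make_nm': 'test'},
#  {'dsply_nm': 'b', 'hrs_pwr': '12', 'make_nm': 'test'}]
```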

Creating Nested JSON from Dataframe

I have a dataframe and have to convert it into nested JSON.
countryname name text score
UK ABC Hello 5
Right now, I have some code that generates JSON, grouping countryname and name.
However, I want to first group by countryname and then group by name. Below is the code and output:
cols = test.columns.difference(['countryname', 'name'])
j = (test.groupby(['countryname', 'name'])[cols]
     .apply(lambda x: x.to_dict('records'))
     .reset_index(name='results')
     .to_json(orient='records'))
test_json = json.dumps(json.loads(j), indent=4)
Output:
[
    {
        "countryname":"UK"
        "name":"ABC"
        "results":[
            {
                "text":"Hello"
                "score":"5"
            }
        ]
    }
]
However, I am expecting an output like this:
[
    {
        "countryname":"UK"
        {
            "name":"ABC"
            "results":[
                {
                    "text":"Hello"
                    "score":"5"
                }
            ]
        }
    }
]
Can anyone please help in fixing this?
This would be the valid JSON. Note the comma usage; the commas are required, as you can confirm with any JSON validator.
[
    {
        "countryname": "UK",
        "name": "ABC",
        "results": [
            {
                "text": "Hello",
                "score": "5"
            }
        ]
    }
]
The other output you are trying to achieve is also not according to the standard:
[{
    "countryname": "UK",
    "you need a name in here": {
        "name": "ABC",
        "results": [{
            "text": "Hello",
            "score": "5"
        }]
    }
}]
I improved that so you can figure out what name to use.
For custom JSON output, you will need a custom function to reformat your object first.
l = df.to_dict('records')[0]  # get the first row as a dict
print(l, type(l))  # {'countryname': 'UK', 'name': 'ABC', 'text': 'Hello', 'score': 5} <class 'dict'>
e = l['countryname']
print(e)  # UK
o = [{
    "countryname": l['countryname'],
    "you need a name in here": {
        "name": l['name'],
        "results": [{
            "text": l['text'],
            "score": l['score']
        }]
    }
}]
print(o)  # [{'countryname': 'UK', 'you need a name in here': {'name': 'ABC', 'results': [{'text': 'Hello', 'score': 5}]}}]
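For a dataframe with more than one row, the same reshaping can be done per group; a minimal sketch assuming the same four columns (the `"group"` wrapper key is made up here, since valid JSON requires a key at that level):

```python
import json
import pandas as pd

df = pd.DataFrame([
    {"countryname": "UK", "name": "ABC", "text": "Hello", "score": 5},
    {"countryname": "UK", "name": "ABC", "text": "Bye", "score": 3},
])

out = []
for (country, name), grp in df.groupby(["countryname", "name"]):
    out.append({
        "countryname": country,
        "group": {  # hypothetical wrapper key; valid JSON needs a key here
            "name": name,
            "results": grp[["text", "score"]].to_dict("records"),
        },
    })
print(json.dumps(out, indent=4))
```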
