Pandas Dataframe to JSON add JSON Object Name - python-3.x

I have a dataframe that I'm converting to JSON but I'm having a hard time naming the object. The code I have:
j = (df_import.groupby(['Item', 'Subinventory', 'TransactionUnitOfMeasure', 'TransactionType', 'TransactionDate', 'TransactionSourceId', 'OrganizationName'])
     .apply(lambda x: x[['LotNumber', 'TransactionQuantity']].to_dict('records'))
     .reset_index()
     .rename(columns={0: 'lotItemLots'})
     .to_json(orient='records'))
The result I'm getting:
[
{
"Item": "000400MCI00099",
"OrganizationName": "OR",
"Subinventory": "LAB R",
"TransactionDate": "2021-08-19 00:00:00",
"TransactionSourceId": 3000001595xxxxx,
"TransactionType": "Account Alias Issue",
"TransactionUnitOfMeasure": "EA",
"lotItemLots": [
{
"LotNumber": "00040I",
"TransactionQuantity": -5
}
]
}
]
The result I need (the transactionLines part), but can't figure out:
{
"transactionLines":[
{
"Item":"000400MCI00099",
"Subinventory":"LAB R",
"TransactionQuantity":-5,
"TransactionUnitOfMeasure":"EA",
"TransactionType":"Account Alias Issue",
"TransactionDate":"2021-08-20 00:00:00",
"OrganizationName":"OR",
"TransactionSourceId": 3000001595xxxxx,
"lotItemLots":[{"LotNumber":"00040I", "TransactionQuantity":-5}]
}
]
}
Index,Item Number,5 Digit,Description,Subinventory,Lot Number,Quantity,EOM,[Qty],Transaction Type,Today's Date,Expiration Date,Source Header ID,Lot Interface Number,Transaction Source ID,TransactionType,Organization Name
1,000400MCI00099,40,ACANTHUS OAK LEAF,LAB R,00040I,-5,EA,5,Account Alias Issue,2021/08/25,2002/01/01,160200,160200,3000001595xxxxx,Account Alias Issue,OR
Would appreciate any guidance on how to get the transactionLines name in there. Thank you in advance.

It seems to me you could simply parse the JSON output and then reshape it the way you want:
import json
import pandas as pd
data = [{'itemID': 0, 'itemprice': 100}, {'itemID': 1, 'itemprice': 200}]
data = pd.DataFrame(data)
pd_json = data.to_json(orient='records')
new_collection = []  # store our reshaped records
# loop over the parsed JSON and reshape each record the way we want
for record in json.loads(pd_json):
    nested = {'transactionLines': [record]}  # matching the spec in the question
    new_collection.append(nested)
new_json = json.dumps(new_collection)  # convert back to a JSON string
print(new_json)
Which results in:
[
{"transactionLines": [{"itemID": 0, "itemprice": 100}]},
{"transactionLines": [{"itemID": 1, "itemprice": 200}]}
]
Note that you could probably do this more concisely, without the intermediate JSON conversion.
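If the goal is the single top-level object shown in the question, one possible variation is to wrap the whole list of records once, rather than once per record. This is only a minimal sketch: the hard-coded pd_json string is an assumption standing in for the output of to_json(orient='records'), so the snippet runs without pandas.

```python
import json

# Assumption: stand-in for df.to_json(orient='records'), hard-coded so this runs without pandas.
pd_json = '[{"itemID": 0, "itemprice": 100}, {"itemID": 1, "itemprice": 200}]'

# Wrap the full list of records under one key instead of wrapping each record.
wrapped = {"transactionLines": json.loads(pd_json)}
new_json = json.dumps(wrapped)
print(new_json)
```

The same wrapping should apply to the grouped output in the question, since that is also a JSON array of records.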

Related

How to append multiple JSON object in a custom list using python?

I have two dictionaries (business and business1). I convert them to JSON strings (a and b), then combine the two JSON objects into a custom list called "all".
Here the list creation is static. I need to make it dynamic because the number of dictionaries can vary, but the output should keep the same structure.
Here is my code section
Python Code
import json
import something as b
business = {
    "id": "04",
    "target": b.YesterdayTarget,
    'Sales': b.YSales,
    'Achievements': b.Achievement
}
business1 = {
    "id": "05",
    "target": b.YesterdayTarget,
    'Sales': b.YSales,
    'Achievements': b.Achievement
}
# Convert the dictionaries to JSON strings
a = json.dumps(business, indent=5)
b = json.dumps(business1, indent=5)  # note: this rebinds b, the module alias above
all = '[' + a + ',\n' + b + ']'
print(all)
Output Sample
[{
"id": "04",
"target": 55500000,
"Sales": 23366927,
"Achievements": 42.1
},
{
"id": "05",
"target": 55500000,
"Sales": 23366927,
"Achievements": 42.1
}]
Thanks for your suggestions and efforts.
Try this one.
import ast, re
lines = open(path_to_your_file).read().splitlines()
result = [ast.literal_eval(re.search('({.+})', line).group(0)) for line in lines]
print(len(result))
print(result)
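For the dynamic case the question asks about, one possible approach is to collect the dictionaries in a Python list and serialize the list with a single json.dumps call, instead of concatenating strings by hand. A minimal sketch; the businesses name and the literal values are assumptions for illustration (the original reads the values from a module).

```python
import json

# Assumption: literal values stand in for b.YesterdayTarget, b.YSales and b.Achievement.
businesses = []
for business_id in ["04", "05"]:  # any number of ids works here
    businesses.append({
        "id": business_id,
        "target": 55500000,
        "Sales": 23366927,
        "Achievements": 42.1,
    })

# Serializing the list yields a JSON array with one object per dictionary.
all_json = json.dumps(businesses, indent=5)
print(all_json)
```

Because the brackets and commas come from json.dumps, the output keeps the same array-of-objects structure however many dictionaries are appended.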

"TypeError: string indices must be integers" error while iterating through nested dictionary with numeric key

My dictionary is as shown below. This dictionary is stored in the variable 'api'. I have used only an excerpt of the dictionary as the whole dictionary is very big and follows similar format.
{
"sitedata":[
{
"info":{
"source":"https://thevirustracker.com/"
}}],
"countryitems":[{
"1":{
"ourid":1,
"title":"Afghanistan",
"code":"AF",
"source":"https://thevirustracker.com/afghanistan-coronavirus-information-af",
"total_cases":2335,
"total_recovered":310,
"total_unresolved":0,
"total_deaths":68,
"total_new_cases_today":0,
"total_new_deaths_today":0,
"total_active_cases":1957,
"total_serious_cases":7 },
"2":{
"ourid":2,
"title":"Albania",
"code":"AL",
"source":"https://thevirustracker.com/albania-coronavirus-information-al",
"total_cases":782,
"total_recovered":488,
"total_unresolved":0,
"total_deaths":31,
"total_new_cases_today":0,
"total_new_deaths_today":0,
"total_active_cases":263,
"total_serious_cases":4 },
"3":{
"ourid":3,
"title":"Algeria",
"code":"DZ",
"source":"https://thevirustracker.com/algeria-coronavirus-information-dz",
"total_cases":4154,
"total_recovered":1821,
"total_unresolved":0,
"total_deaths":453,
"total_new_cases_today":0,
"total_new_deaths_today":0,
"total_active_cases":1880,
"total_serious_cases":22 },
"4":{
"ourid":4,
"title":"Angola",
"code":"AO",
"source":"https://thevirustracker.com/angola-coronavirus-information-ao",
"total_cases":30,
"total_recovered":11,
"total_unresolved":0,
"total_deaths":2,
"total_new_cases_today":0,
"total_new_deaths_today":0,
"total_active_cases":17,
"total_serious_cases":0 },
"5":{
"ourid":5,
"title":"Argentina",
"code":"AR",
"source":"https://thevirustracker.com/argentina-coronavirus-information-ar",
"total_cases":4532,
"total_recovered":1292,
"total_unresolved":0,
"total_deaths":225,
"total_new_cases_today":0,
"total_new_deaths_today":0,
"total_active_cases":3015,
"total_serious_cases":157 },
.
.
.
"stat":"ok"
}]}
I am trying to iterate through this dictionary to fetch the country names using the below code:
api_request = requests.get('https://api.thevirustracker.com/free-api?countryTotals=ALL')
api = json.loads(api_request.content)
dict = api['countryitems'][0]
for key in dict:
    country = api['countryitems'][0][key]['title']
    print(country)
But I am getting the error "TypeError: string indices must be integers".
Can someone please advise what exactly is going wrong here.
I am using this code on Python 3.7 (Tkinter)
While iterating through the dictionary, the loop reaches a key called "stat" whose value is just the string "ok". All the other elements of the dictionary contain another dictionary.
So when you try to access ['title'] on that string, it throws the above error.
I have also removed the unicode characters from the dictionary by adding the last line below:
import ast
api = {}
api_request = requests.get('https://api.thevirustracker.com/free-api?countryTotals=ALL')
api = json.loads(api_request.content)
api = ast.literal_eval(json.dumps(api))
Here is my answer, with sample code below:
import requests
import json
import ast
api = {}
api_request = requests.get('https://api.thevirustracker.com/free-api?countryTotals=ALL')
api = json.loads(api_request.content)
api = ast.literal_eval(json.dumps(api))
dict = api['countryitems'][0]
for key in dict:
    if str(key).isdigit():
        country = dict[key]['title']
        print(country)
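An alternative to testing whether the key is numeric is to test whether the value is a dictionary before indexing it. A minimal sketch, assuming a trimmed, hard-coded copy of the response so that it runs without a network call:

```python
# Assumption: a trimmed, hard-coded copy of the response so the example runs offline.
api = {
    "countryitems": [{
        "1": {"ourid": 1, "title": "Afghanistan", "code": "AF"},
        "2": {"ourid": 2, "title": "Albania", "code": "AL"},
        "stat": "ok",
    }]
}

countries = []
for key, value in api["countryitems"][0].items():
    # "stat" maps to the string "ok"; indexing a string with 'title' raises the TypeError.
    if isinstance(value, dict):
        countries.append(value["title"])

print(countries)
```

Checking the value type keeps working if the API adds other non-country keys alongside "stat".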

How do I use this list as a parameter for this function?

I'm new to Python and I'm using it to write a Spotify app with Spotipy. Basically, I have a dictionary of tracks called topSongs. I can access a track and its name/ID and stuff with
topSongs['items'][0]
topSongs['items'][3]['id']
topSongs['items'][5]['name']
So there's a function I'm trying to use:
recommendations(seed_artists=None, seed_genres=None, seed_tracks=None, limit=20, country=None, **kwargs)
With this function I'm trying to use seed_tracks, which requires a list of track IDs. So ideally I want to input topSongs['items'][0]['id'], topSongs['items'][1]['id'], topSongs['items'][2]['id'], etc. How would I do this? I've read about the * operator but I'm not sure how I can use that or if it applies here.
You can try something like the following.
ids = [item["id"] for item in topSongs["items"]]
Here is a simple example:
>>> topSongs = {
... "items": [
... {
... "id": 1,
... "name": "Alejandro"
... },
... {
... "id": 22,
... "name": "Waiting for the rights"
... }
... ]
... }
>>>
>>> seed_tracks = [item["id"] for item in topSongs["items"]]
>>>
>>> seed_tracks
[1, 22]
>>>
Important note about using the * operator:
The * operator can be used in this case, but you first need to build a list/tuple containing the arguments the function takes, in order. You have to define all of the variables, like seed_tracks above. Something like:
data = [seed_artists, seed_genres, seed_tracks, limit, country]
And finally,
recommendations(*data)
Important note about using the ** operator:
If you want to use the ** operator, the data will look like
data = {"seed_artists": seed_artists, "seed_genres": seed_genres, "seed_tracks": seed_tracks, "limit": limit, "country": country}
Finally,
recommendations(**data)
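Since seed_tracks is a keyword parameter that takes a list, it may be simplest to pass the list directly by keyword, with no unpacking at all. A minimal sketch: the recommendations function below is a hypothetical stand-in that copies the signature quoted in the question and only echoes its arguments, and the track data is made up.

```python
# Hypothetical stand-in with the signature quoted in the question; the real
# method would call the Spotify API instead of echoing its arguments.
def recommendations(seed_artists=None, seed_genres=None, seed_tracks=None,
                    limit=20, country=None, **kwargs):
    return {"seed_tracks": seed_tracks, "limit": limit}

# Assumption: made-up track data shaped like topSongs in the question.
topSongs = {"items": [{"id": "a1", "name": "First"}, {"id": "b2", "name": "Second"}]}

ids = [item["id"] for item in topSongs["items"]]

# Passing the list by keyword needs no * or ** unpacking.
result = recommendations(seed_tracks=ids, limit=10)
print(result)
```

The * and ** forms are mainly useful when the arguments are already collected in a list or dictionary.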

AvroTypeException: When writing in python3

My avsc file is as follows:
{"type":"record",
"namespace":"testing.avro",
"name":"product",
"aliases":["items","services","plans","deliverables"],
"fields":
[
{"name":"id", "type":"string" ,"aliases":["productid","itemid","item","product"]},
{"name":"brand", "type":"string","doc":"The brand associated", "default":"-1"},
{"name":"category","type":{"type":"map","values":"string"},"doc":"the list of categoryId, categoryName associated, send Id as key, name as value" },
{"name":"keywords", "type":{"type":"array","items":"string"},"doc":"this helps in long run in long run analysis, send the search keywords used for product"},
{"name":"groupid", "type":["string","null"],"doc":"Use this to represent or flag value of group to which it belong, e.g. it may be variation of same product"},
{"name":"price", "type":"double","aliases":["cost","unitprice"]},
{"name":"unit", "type":"string", "default":"Each"},
{"name":"unittype", "type":"string","aliases":["UOM"], "default":"Each"},
{"name":"url", "type":["string","null"],"doc":"URL of the product to return for more details on product, this will be used for event analysis. Provide full url"},
{"name":"imageurl","type":["string","null"],"doc":"Image url to display for return values"},
{"name":"updatedtime", "type":"string"},
{"name":"currency","type":"string", "default":"INR"},
{"name":"image", "type":["bytes","null"] , "doc":"fallback in case we cant provide the image url, use this judiciously and limit size"},
{"name":"features","type":{"type":"map","values":"string"},"doc":"Pass your classification attributes as features in key-value pair"}
]}
I am able to parse this, but when I try to write with it as follows, I keep getting an AvroTypeException. What am I missing? This is in Python 3. I verified that it is well-formed JSON, too.
from avro import schema as sc
from avro import datafile as df
from avro import io as avio
import os
_prodschema = 'product.avsc'
_namespace = 'testing.avro'
dirname = os.path.dirname(__file__)
avroschemaname = os.path.join( os.path.dirname(__file__),_prodschema)
sch = {}
with open(avroschemaname,'r') as f:
    sch = f.read().encode(encoding='utf-8')  # the with block closes the file
proschema = sc.Parse(sch)
print("Schema processed")
writer = df.DataFileWriter(open(os.path.join(dirname,"products.json"),'wb'),
avio.DatumWriter(),proschema)
print("Just about to append the json")
writer.append({ "id":"23232",
"brand":"Relaxo",
"category":[{"123":"shoe","122":"accessories"}],
"keywords":["relaxo","shoe"],
"groupid":"",
"price":"799.99",
"unit":"Each",
"unittype":"Each",
"url":"",
"imageurl":"",
"updatedtime": "03/23/2017",
"currency":"INR",
"image":"",
"features":[{"color":"black","size":"10","style":"contemperory"}]
})
writer.close()
What am I missing here?
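Comparing the datum with the schema above suggests a likely cause. This is an inference from the schema, not a confirmed diagnosis: category and features are declared as maps but passed as lists containing one dict, price is declared as double but passed as a string, and image is declared as bytes or null but passed as a str. A minimal sketch of a datum shaped to match the declared types, checked with plain isinstance tests rather than the avro library:

```python
# Sketch only: a datum reshaped to match the declared schema types.
datum = {
    "id": "23232",
    "brand": "Relaxo",
    "category": {"123": "shoe", "122": "accessories"},  # map, not a list of maps
    "keywords": ["relaxo", "shoe"],
    "groupid": "",
    "price": 799.99,  # double, not the string "799.99"
    "unit": "Each",
    "unittype": "Each",
    "url": "",
    "imageurl": "",
    "updatedtime": "03/23/2017",
    "currency": "INR",
    "image": None,  # union of bytes and null, so None or a bytes value
    "features": {"color": "black", "size": "10", "style": "contemperory"},
}

# Plain-Python checks mirroring the schema (not the avro validator itself).
map_fields_ok = all(
    isinstance(datum[name], dict)
    and all(isinstance(k, str) and isinstance(v, str) for k, v in datum[name].items())
    for name in ("category", "features")
)
price_ok = isinstance(datum["price"], float)
image_ok = datum["image"] is None or isinstance(datum["image"], bytes)
print(map_fields_ok, price_ok, image_ok)
```

In the original setup, a dict shaped like this would be passed to writer.append(...) in place of the one in the question.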

Changing the values of a map nested in another map

Using HQL queries I've been able to generate the following map, where the keys represent the month number constant defined in java.util.Calendar, and every value is a map:
[
0:[ client_a:[order1, order2, order3]],
1:[ client_b:[order4], client_c:[order5, order6], client_d:[order7]],
2:[ client_e:[order8, order9], client_f:[order10]]
]
order1, order2, ... are instances of a domain class called Order:
class Order {
String description
Date d
int quantity
}
Now I've got that structure containing orders that belong to some specific year, but I don't really care about the Order object itself. I just want the sum of the quantities of all the orders of each month. So the structure should look something like this:
[
0:[ client_a:[321]],
1:[ client_b:[9808], client_c:[516], client_d:[20]],
2:[ client_e:[22], client_f:[10065]]
]
I don't mind if the values are lists of one element or not lists at all. If this is possible, it would be fine anyway:
[
0:[ client_a:321 ],
1:[ client_b:9808, client_c:516, client_d:20 ],
2:[ client_e:22, client_f:10065 ]
]
I know I have to apply something like .sum{it.quantity} to every list of orders to get the result I want, but I don't know how to iterate over them as they are nested within another map.
Thank you.
Here you go:
class Order {
String description
Date d
int quantity
}
def orders = [
0:[ client_a:[new Order(quantity:1), new Order(quantity:2), new Order(quantity:3)]],
1:[ client_b:[new Order(quantity:4)], client_c:[new Order(quantity:5), new Order(quantity:6)], client_d:[new Order(quantity:7)]],
2:[ client_e:[new Order(quantity:8), new Order(quantity:9)], client_f:[new Order(quantity:10)]]
]
def count = orders.collectEntries { k, v ->
    def nv = v.collectEntries { nk, nv ->
        [(nk): nv*.quantity.sum()]
    }
    [(k):(nv)]
}
assert count == [0:[client_a:6], 1:[client_b:4, client_c:11, client_d:7],2:[client_e:17, client_f:10]]
def map = [
0:[ client_a:[[q: 23], [q: 28], [q: 27]]],
1:[ client_b:[[q: 50]], client_c:[[q: 100], [q: 58]], client_d:[[q: 90]]],
2:[ client_e:[[q: 48], [q: 60]], client_f:[[q: 72]]]
]
map.collectEntries { k, v ->
    [ k, v.collectEntries { key, val ->
        [ key, val*.q.sum() ]
    } ]
}
You can also use val.sum { it.q } instead of val*.q.sum().
