How to append multiple JSON object in a custom list using python? - python-3.x

I have two dictionary (business and business 1). I convert this dictionary into JSON file as (a and b). Then i append this two JSON object in a custom list called "all".
Here, list creation is static, i have to make it dynamic because the number of dictionary could be random. But output should be in same structure.
Here is my code section
Python Code
import something as b
business = {
"id": "04",
"target": b.YesterdayTarget,
'Sales': b.YSales,
'Achievements': b.Achievement
}
business1 = {
"id": "05",
"target": b.YesterdayTarget,
'Sales': b.YSales,
'Achievements': b.Achievement
}
# Convert Dictionary to json data
a= str(json.dumps(business, indent=5))
b= str(json.dumps(business1, indent=5))
all = '[' + a + ',\n' + b + ']'
print(all)
Output Sample
[{
"id": "04",
"target": 55500000,
"Sales": 23366927,
"Achievements": 42.1
},
{
"id": "05",
"target": 55500000,
"Sales": 23366927,
"Achievements": 42.1
}]
Thanks for your suggestions and efforts.

Try this one.
import ast, re
lines = open(path_to_your_file).read().splitlines()
result = [ast.literal_eval(re.search('({.+})', line).group(0)) for line in lines]
print(len(result))
print(result)

Related

Replace value of a specific key in json file for multiple files

I am preparing a test data for the feature testing .
I have 1000 json file with same below structure in local folder .
I need to replace two key from all the places and create 1000 different json .
Key that needs to be replaced is
"f_ID": "80510926" and "f_Ctb": "10333"
"f_ID": "80510926" appears in all the array but "f_Ctb": "10333" appear only once .
The replaced value can be running number 1 to 100 in all files .
Can some one suggest how we can do this python to create 1000 files
{
"t_A": [
{
"f_ID": "80510926",
"f_Ctb": "10333",
"f_isPasswordProtected": "False"
}
],
"t_B": [
{
"f_ID": "80510926",
"f_B_Company_ID": "10333",
"f_B_ID": "1",
"f_ArriveDate": "20151001 151535.775"
},
{
"f_ID": "80510926",
"f_B_Company_ID": "10333",
"f_B_ID": "1700",
"f_ArriveDate": "20151001 151535.775"
}
],
"t_C": [
{
"f_ID": "80510926",
"f_Set_Code": "TRBC ",
"f_Industry_Code": "10 ",
"f_InsertDate": "20151001 151535.775"
},
],
"t_D": [
{
"f_ID": "80510926",
"f_Headline": "Global Reinsurance: Questions and themes into Monte Carlo",
"f_Synopsis": ""
}
]
}
here is the solution run in the folder where you have 1000 json files.
It will read all the json files and replace the f_ID and f_Ctb with the count and write the file with same file name in output folder.
import os
import json
all_files = os.listdir()
json_files = {f: f for f in all_files if f.endswith(".json")}
json_files_keys = list(json_files.keys())
json_files_keys.sort()
OUTPUT_FOLDER = "output"
if not os.path.exists(OUTPUT_FOLDER):
os.mkdir(OUTPUT_FOLDER)
for file_name in json_files_keys:
f_read = open(file_name, "r").read()
data = json.loads(f_read)
output_data = {}
count = 1
for key, val_list in data.items():
for nest_dict in val_list:
if "f_ID" in nest_dict:
nest_dict["f_ID"] = count
if "f_Ctb" in nest_dict:
nest_dict["f_Ctb"] = count
if key in output_data:
output_data[key].append(nest_dict)
else:
output_data[key] = [nest_dict]
else:
output_data[key] = val_list
count += 1
output_file = open(f"{OUTPUT_FOLDER}/{file_name}", "w")
output_file.write(json.dumps(output_data))
output_file.close()

Pandas Dataframe to JSON add JSON Object Name

I have a dataframe that I'm converting to JSON but I'm having a hard time naming the object. The code I have:
j = (df_import.groupby(['Item', 'Subinventory', 'TransactionUnitOfMeasure', 'TransactionType', 'TransactionDate', 'TransactionSourceId', 'OrganizationName'])
.apply(lambda x: x[['LotNumber', 'TransactionQuantity']].to_dict('records'))
.reset_index()
.rename(columns={0: 'lotItemLots'})
.to_json(orient='records'))
The result I'm getting:
[
{
"Item": "000400MCI00099",
"OrganizationName": "OR",
"Subinventory": "LAB R",
"TransactionDate": "2021-08-19 00:00:00",
"TransactionSourceId": 3000001595xxxxx,
"TransactionType": "Account Alias Issue",
"TransactionUnitOfMeasure": "EA",
"lotItemLots": [
{
"LotNumber": "00040I",
"TransactionQuantity": -5
}
]
}
]
The result I need (the transactionLines part), but can't figure out:
{
"transactionLines":[
{
"Item":"000400MCI00099",
"Subinventory":"LAB R",
"TransactionQuantity":-5,
"TransactionUnitOfMeasure":"EA",
"TransactionType":"Account Alias Issue",
"TransactionDate":"2021-08-20 00:00:00",
"OrganizationName":"OR",
"TransactionSourceId": 3000001595xxxxx,
"lotItemLots":[{"LotNumber":"00040I", "TransactionQuantity":-5}]
}
]
}
Index,Item Number,5 Digit,Description,Subinventory,Lot Number,Quantity,EOM,[Qty],Transaction Type,Today's Date,Expiration Date,Source Header ID,Lot Interface Number,Transaction Source ID,TransactionType,Organization Name
1,000400MCI00099,40,ACANTHUS OAK LEAF,LAB R,00040I,-5,EA,5,Account Alias Issue,2021/08/25,2002/01/01,160200,160200,3000001595xxxxx,Account Alias Issue,OR
Would appreciate any guidance on how to get the transactionLines name in there. Thank you in advance.
It would seem to me you could simply parse the json output, and then re-form it the way you want:
import pandas as pd
import json
data = [{'itemID': 0, 'itemprice': 100}, {'itemID': 1, 'itemprice': 200}]
data = pd.DataFrame(data)
pd_json = data.to_json(orient='records')
new_collection = [] # store our reformed records
# loop over parsed json, and reshape it the way we want
for record in json.loads(pd_json):
nested = {'transactionLines': [record]} # matching specs of question
new_collection.append(nested)
new_json = json.dumps(new_collection) # convert back to json str
print(new_json)
Which results in:
[
{"transactionLines": [{"itemID": 0, "itemprice": 100}]},
{"transactionLines": [{"itemID": 1, "itemprice": 200}]}
]
Note that of course you could probably do this in a more concise manner, without the intermediate json conversion.

Split 30 Gb json file into smaller files

I am facing memory issue in reading a json file which is 30 GB in size. Is there any direct way in Python3.x like we have in unix where we can split the json file into smaller files based on the lines.
e.g. first 100000 records go in first slit file and then rest go to subsequent child json file?
Depending on your input data and if its structure is known and consistent, it will be more hard or easy.
In my example here the idea is to read the file line by line with a lazy generator and write new files whenever a valid object can be constructed from the input. Its a bit like manual parsing.
In the real world case this logic when to write to a new file would highly depend on your input and what you are trying to achieve.
Some sample data
[
{
"color": "red",
"value": "#f00"
},
{
"color": "green",
"value": "#0f0"
},
{
"color": "blue",
"value": "#00f"
},
{
"color": "cyan",
"value": "#0ff"
},
{
"color": "magenta",
"value": "#f0f"
},
{
"color": "yellow",
"value": "#ff0"
},
{
"color": "black",
"value": "#000"
}
]
# create a generator that yields each individual line
lines = (l for l in open('data.json'))
# o is used to accumulate some lines before
# writing to the files
o=''
# itemCount is used to count the number of valid json objects
itemCount=0
# read the file line by line to avoid memory issues
i=-1
while True:
try:
line = next(lines)
except StopIteration:
break
i=i+1
# ignore first square brackets
if i == 0:
continue
# in this data I know every 5th lines a new object will begin
# this logic depends on your input data
if i%4==0:
itemCount+=1
# at this point I am able to create avalid json object
# based on my knowledge of the input file structure
validObject=o+line.replace("},\n", '}\n')
o=''
# now write each object to its own file
with open(f'item-{itemCount}.json', 'w') as outfile:
outfile.write(validObject)
else:
o+=line
Here is a repl with the working example: https://replit.com/#bluebrown/linebyline

List comprehension to dynamically populate attribute of object

What’s the correct way to dynamically populate the list given to the attribute Value? At the moment its its fixed to 3 (0-2), but would be more useful if it was based on INSTANCE_PARAMS[i]["instances"].
I was thinking along the lines of list comprehension but unsure how to write it into the code.
for i in INSTANCE_PARAMS:
output = template.add_output([
Output(
str(INSTANCE_PARAMS[i]["name"]).capitalize() + "Ips",
Description="IPs for " + str(INSTANCE_PARAMS[i]["name"]),
Value=Join(",", [
GetAtt(ec2.Instance(INSTANCE_PARAMS[i]["name"] + "0"), "PrivateIp"),
GetAtt(ec2.Instance(INSTANCE_PARAMS[i]["name"] + "1"), "PrivateIp"),
GetAtt(ec2.Instance(INSTANCE_PARAMS[i]["name"] + "2"), "PrivateIp"),
],
),
)
],
)
INSTANCE_PARAMS = {
"masters": {
"name": "master",
"instances": 3,
"image_id": "ami-xxxxxxxxxx",
"instance_type": "t1.micro",
"security_group_id": [
"MasterSG",
"ClusterSG"
],
},
}
Achieved it with the following:
template = Template()
for i in INSTANCE_PARAMS:
# declared a new list
tag_value = []
# got the counter
for r in range(INSTANCE_PARAMS[i]["instances"]):
# added each one to the new list
tag_value.append(GetAtt(ec2.Instance(INSTANCE_PARAMS[i]["name"] + str(r)), "PrivateIP"))
output = template.add_output([
Output(
str(INSTANCE_PARAMS[i]["name"]).capitalize() + "Ips",
Description="IPs for " + str(INSTANCE_PARAMS[i]["name"]),
# passed in the list
Value=Join(",", tag_value,
),
)
],
)
print(template.to_yaml())

How do I use this list as a parameter for this function?

I'm new to Python and I'm using it to write a Spotify app with Spotipy. Basically, I have a dictionary of tracks called topTracks. I can access a track and its name/ID and stuff with
topSongs['items'][0]
topSongs['items'][3]['id']
topSongs['items'][5]['name']
So there's a function I'm trying to use:
recommendations(seed_artists=None, seed_genres=None, seed_tracks=None, limit=20, country=None, **kwargs)
With this function I'm trying to use seed_tracks, which requires a list of track IDs. So ideally I want to input topSongs['items'][0]['id'], topSongs['items'][1]['id'], topSongs['items'][2]['id'], etc. How would I do this? I've read about the * operator but I'm not sure how I can use that or if it applies here.
You can try something like shown below.
ids = [item["id"] for item in topSongs["items"]]
Here, I have just formed a simple example.
>>> topSongs = {
... "items": [
... {
... "id": 1,
... "name": "Alejandro"
... },
... {
... "id": 22,
... "name": "Waiting for the rights"
... }
... ]
... }
>>>
>>> seed_tracks = [item["id"] for item in topSongs["items"]]
>>>
>>> seed_tracks
[1, 22]
>>>
Imp note about using * operator »
* operator is used in this case but for that, you will need to form a list/tuple containing the list of data the function takes. Something like
You have to form all the variables like seed_tracks above.
data = [seed_artists, seed_genres, seed_tracks, limit, country]
And finally,
recommendations(*data)
Imp note about using ** operator »
And if you are willing to use ** operator, the data will look like
data = {"seed_artists": seed_artists, "seed_genres": seed_genres, "seed_tracks": seed_tracks, "limit": limit, "country": country}
Finally,
recommendations(**data)

Resources