SF OCAPI pagination loop

I am trying to put a pagination loop inside another loop that queries multiple environments in Salesforce OCAPI. My code goes like this. First I declare some variables to use in the loop:
URLs=["it","at","lv","ee","lt"]
base_url = "https://example.net/s/-/dw/data/v22_4/customer_lists/"
start = 0
count = 200
Then I create the body for the requests:
body = """
{
"query": {
"bool_query": {
"must": [
{
"filtered_query": {
"query": {
"match_all_query": {}
},
"filter": {
"range_filter": {
"field": "creation_date",
"from": """ + start_time_formatted + """,
"to": """ + end_time_formatted + """
}
}
}
}
]
}
},
"expand":["primary_address"],
"select": "(**)",
"count": """ + str(count) + """,
"start": """ + str(start) + """
}"""
Based on these variables I create an initial for loop that queries each URL environment. I tried to put a while loop for pagination inside the for loop, but it does not give me the expected result. The code is below:
for country in URLs:
    url = base_url + str(country) + "/customer_search"
    response = requests.post(url, headers={'Content-Type': 'application/json','Accept': 'application/json','Authorization': 'Bearer ' + oauth_token}, data=body)
    data = response.json()
    while (start < total):
        start = start * count
        response = requests.post(url, headers={'Content-Type': 'application/json','Accept': 'application/json','Authorization': 'Bearer ' + oauth_token}, data=body)
        data = response.json()
        dataframe = pd.json_normalize(data['hits'], max_level=2)
        dfs.append(dataframe)
df = pd.concat(dfs, ignore_index=True)
The JSON output that I usually get from the query looks similar to this (with records inside hits, of course):
{'_v': '22.4',
'_type': 'customer_search_result',
'count': 200,
'expand': ['primary_address'],
'hits': [],
'next': {'_type': 'result_page', 'count': 200, 'start': 200},
'query': {'bool_query': {'_type': 'bool_query',
'must': [{'filtered_query': {'_type': 'filtered_query',
'filter': {'range_filter': {'_type': 'range_filter',
'field': 'creation_date',
'from': '2022-01-01T00:00:00.000Z',
'to': '2022-05-09T17:03:16.000Z'}},
'query': {'match_all_query': {'_type': 'match_all_query'}}}}]}},
'select': '(**)',
'start': 0,
'total': 650}
What I need is for the while loop to increase the starting point on each iteration and collect all records (up to total) for each URL, then stop and return the results. Do you have any idea how I can write this loop? count is 200 because that is the maximum number of records I am allowed to query per call. count should also adapt to the number of records remaining: for example, if it goes in batches of 200 and 150 records remain at the end, count should change to 150 for the last call.
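A minimal sketch of the loop described above, assuming each response carries total and hits as in the sample output. The names fetch_all_hits, post_page and fake_post_page are hypothetical; post_page stands in for one customer_search request and returns the parsed JSON. The fake version exists only so the loop can be run without network access.

```python
def fetch_all_hits(post_page, page_size=200):
    """Collect every hit for one environment by advancing 'start'.

    post_page(start, count) is a hypothetical callable that sends one
    request and returns the parsed JSON response as a dict.
    """
    hits = []
    start = 0
    total = None  # unknown until the first response arrives
    while total is None or start < total:
        # Shrink the last request so it asks only for what remains.
        count = page_size if total is None else min(page_size, total - start)
        data = post_page(start, count)
        total = data.get("total", 0)
        page = data.get("hits", [])
        if not page:
            break  # guard against an endless loop on an empty page
        hits.extend(page)
        start += len(page)
    return hits


# Stand-in for the real request, used only to show the loop's behavior.
def fake_post_page(start, count, total=650):
    end = min(start + count, total)
    return {"total": total, "start": start,
            "hits": [{"id": i} for i in range(start, end)]}


all_hits = fetch_all_hits(fake_post_page)
```

In the real script, post_page would rebuild the request body with the current start and count on every call (for instance with json.dumps on a dict) before calling requests.post. The body string in the question is built once, so its start never changes. Calling the function once per country also resets start to 0 for each environment.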

Related

Python: Google Analytics API not retrieving adImpressions and adClicks

I am trying to retrieve some information about my ads in Google. I have done the same for campaigns, ad groups and keywords without a single issue. However, when it comes to ads and their metrics, I cannot extract anything other than adCost.
I have the following code to retrieve the info I need from Google Ads:
def ad_data(service, view_id, start_date, end_date):
    # Initialize request with first page of results
    request = {
        'reportRequests': [
            {
                'viewId': view_id,
                'dateRanges': [
                    {
                        'startDate': start_date,
                        'endDate': end_date
                    }
                ],
                'metrics': [
                    {'expression': 'ga:adCost'},
                    {'expression': 'ga:adClicks'},
                    {'expression': 'ga:adImpressions'}
                ],
                'dimensions': [
                    {'name': 'ga:date'},
                    {'name': 'ga:campaign'},
                    {'name': 'ga:adContent'}
                ],
                'pageSize': 1000
            }
        ]
    }
    all_data = []
    while True:
        # Make API call and get response
        response = service.reports().batchGet(body=request).execute()
        report = response.get('reports', [])[0]
        rows = report.get('data', {}).get('rows', [])
        all_data.extend(rows)
        # Check if there are more pages of results
        page_token = report.get('nextPageToken', None)
        if not page_token:
            break
        # Update request with next page of results
        request['reportRequests'][0]['pageToken'] = page_token
    return all_data
The issue is that, for some reason, I am not able to retrieve metrics other than adCost; the request fails with the following error:
HttpError: <HttpError 400 when requesting https://analyticsreporting.googleapis.com/v4/reports:batchGet?alt=json returned "Unknown metric(s): ga:adImpressions
For details see https://developers.google.com/analytics/devguides/reporting/core/dimsmets.". Details: "Unknown metric(s): ga:adImpressions
For details see https://developers.google.com/analytics/devguides/reporting/core/dimsmets.">
Could anyone help me out?

Python nested json

Does anyone have a solution for this? I want the API data in the following shape.
I want the data for the same state to come under one object rather than in separate objects; data for different states can be in different objects.
data = [{
    "state_name": "New Jersey", "data": {
        "category": "Phishing",
        "sub_cat_data": [{
            "name": "SubCat1",
            "count": 20
        },
        {
            "name": "SubCat2",
            "count": 30
        }]
    }
    "category": "malware",
    "sub_cat_data": [{
        "name": "SubCat1",
        "count": 20
    },
    {
        "name": "SubCat2",
        "count": 30
    }]
},
{
    "state_name": "Washington", "data": {
        "category": "Phishing",
        "data": [{
            "name": "SubCat1",
            "count": 20
        },
        {
            "name": "SubCat2",
            "count": 30
        }]
    }
}]
But my API response is:
{
"state": "South Carolina",
"state_count": 2,
"Website Compromise/Intrusion": {
"sub_category": {
"Insecure Direct Object Reference": 2,
"Memory Corruption": 2,
"SQLI": 1,
"Stack Overflow": 1,
"XSRF": 1,
"Heap Overflow": 1,
"Security Misconfiguration": 1
}
}
},
{
"state": "South Carolina",
"state_count": 1,
"Phishing": {
"sub_category": {
"Spear Phishing Attacks": 2,
"Fast Flux": 2,
"Rock fish": 2,
"Identify Theft/Social Engineering": 1,
"Phishing Redirector": 1,
"Pharming": 1,
"Exploitation of Hardware Vulnerability": 1
}
}
},
I want the data for the same state to be in the same object, but in my case the state data comes in separate objects because the data is built per category.
My logic is below:
cat_count = incnum.values('incident_category__cat_name','incident_category__cat_id').annotate(count=Count('incident_category__cat_id'))
subcat_count = incnum.values('incident_sub_category__sub_cat_name','incident_sub_category__cat_id','incident_sub_category__id').annotate(count=Count('incident_sub_category__cat_id'))
reporter_state_count1 = incnum.values('incident_category__cat_id','reporter__comp_individual_state','reporter__comp_individual_state__name').annotate(count=Count('incident_category__cat_id'))
for x, state_ in enumerate(reporter_state_count1):
    for i, cat_ in enumerate(cat_count):
        if state_['incident_category__cat_id'] == cat_['incident_category__cat_id']:
            arr16.append({'state':state_['reporter__comp_individual_state__name'], 'state_count':state_['count'], cat_['incident_category__cat_name']:{'sub_category':{}}})
            for sub_ in subcat_count:
                if cat_['incident_category__cat_id'] == sub_['incident_sub_category__cat_id']:
                    arr16[i][cat_['incident_category__cat_name']]['sub_category'].update({sub_['incident_sub_category__sub_cat_name']:sub_['count']})

cat_count = incnum.values('incident_category__cat_name', 'incident_category__cat_id').annotate(
count=Count('incident_category__cat_id'))
subcat_count = incnum.values('incident_sub_category__sub_cat_name', 'incident_sub_category__cat_id',
'incident_sub_category__id').annotate(count=Count('incident_sub_category__cat_id'))
reporter_state_count1 = incnum.values('incident_category__cat_id', 'reporter__comp_individual_state',
'reporter__comp_individual_state__name').annotate(
count=Count('incident_category__cat_id'))
arr16 = []
for state_ in reporter_state_count1:
    state_data = {"state_name" : state_['reporter__comp_individual_state__name'], "data":[]}
    for cat_ in cat_count:
        if state_['incident_category__cat_id'] == cat_['incident_category__cat_id']:
            sub_cat_data = [{sub_['incident_sub_category__sub_cat_name']: sub_['count']} for sub_ in subcat_count if cat_['incident_category__cat_id'] == sub_['incident_sub_category__cat_id']]
            category_data = {"category": cat_['incident_category__cat_name'], "sub_cat_data": sub_cat_data}
            state_data["data"].append(category_data)
    arr16.append(state_data)
One state might have multiple categories. The way you are structuring your API response, it cannot show multiple categories for a state. That is why I modified it a little: you will find all the categories as a list inside the state object.
Edit
Create a dictionary that stores the category id as key and all the subcategories of that category as value:
cat_to_subcat_list = {}
for cat_ in cat_count:
    sub_cat_data = [{"name":sub_['incident_sub_category__sub_cat_name'],"count": sub_['count']} for sub_ in subcat_count if
                    cat_['incident_category__cat_id'] == sub_['incident_sub_category__cat_id']]
    cat_to_subcat_list[cat_['incident_category__cat_id']] = {"category": cat_['incident_category__cat_name'], "sub_cat_data": sub_cat_data}
Create a dictionary that stores the state name as key and a list of category objects as value:
state_data = {}
for state_ in reporter_state_count1:
    if state_['reporter__comp_individual_state__name'] not in state_data:
        # The state name is not in the dictionary yet,
        # so start it with an empty list as its value.
        state_data[state_['reporter__comp_individual_state__name']] = []
    state_data[state_['reporter__comp_individual_state__name']].append(cat_to_subcat_list[state_['incident_category__cat_id']])
Reformat the result into the shape the API needs:
arr16 = [
    {
        "state_name": state_name,
        "data": categories
    } for state_name, categories in state_data.items()
]
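The same group-by-state idea can be tried without a database. The sketch below assumes plain dicts shaped roughly like the API response in the question; the rows list and the group_by_state helper are hypothetical and only illustrate the grouping step.

```python
from collections import defaultdict

# Hypothetical rows: one dict per (state, category) pair,
# similar to the ungrouped API response in the question.
rows = [
    {"state": "South Carolina", "category": "Phishing",
     "sub_category": {"Pharming": 1, "Fast Flux": 2}},
    {"state": "South Carolina", "category": "Malware",
     "sub_category": {"SQLI": 1}},
    {"state": "Washington", "category": "Phishing",
     "sub_category": {"Pharming": 3}},
]


def group_by_state(rows):
    grouped = defaultdict(list)  # state name -> list of category objects
    for row in rows:
        grouped[row["state"]].append({
            "category": row["category"],
            "sub_cat_data": [{"name": name, "count": count}
                             for name, count in row["sub_category"].items()],
        })
    # One output object per state, holding all of its categories.
    return [{"state_name": state, "data": categories}
            for state, categories in grouped.items()]


result = group_by_state(rows)
```

Using defaultdict(list) removes the explicit "is this state already present" check, since a missing key starts out as an empty list.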

Updating Cols in google Sheets

I am trying to copy columns from one sheet to another. I get the columns in the response from the source sheet and need to insert them into the target sheet. Since methods like insertDimension and insertRange cannot do this, I used request = service.spreadsheets().values().update(spreadsheetId=to_spreadsheet_id, range=range_, valueInputOption = "USER_ENTERED", body={"values": response}), but it gives me this error: googleapiclient.errors.HttpError: <HttpError 400 when requesting https://sheets.googleapis.com/v4/spreadsheets/1sERXk6YshuNOKi4ggp11a36uf5SGutLg7DAP5vitOoQ/values/Working%20Analysis%21G2?valueInputOption=USER_ENTERED&alt=json returned "Invalid values[1][0]: struct_value {
fields {
key: "effectiveFormat"
value {
struct_value {
fields {
key: "backgroundColor"
value {
struct_value {
fields {
key: "blue"
value {
number_value: 1.0
}
}
fields {
key: "green"
value {
number_value: 1.0
}
(The error message continues at length.)
And if I use the other approach, which is commented out in the code block, I get the following error:
Details: "Invalid requests[0].updateCells: Attempting to write row: 15000, beyond the last requested row of: 14999">
# to_sheet_id is listed before the defaulted parameters, because a
# parameter without a default cannot follow one that has a default.
def copy_column(service, from_spreadsheet_id, to_spreadsheet_id, to_sheet_id, from_sheet='Analysis', from_column='F', from_column_till='K', to_column='G'):
    request = service.spreadsheets().get(spreadsheetId=from_spreadsheet_id, ranges=[
        from_sheet + "!" + from_column + ":" + from_column_till], includeGridData=True)
    response = request.execute()["sheets"][0]["data"][0]["rowData"]
    range_ = "Working Analysis!G2"
    print(response)
# value_range_body = {
# "requests": {
# "insertDimension": {
# "range": {
# "sheetId": to_sheet_id,
# "dimension": "COLUMNS",
# "startIndex": 6,
# "endIndex": 11
# },
# "inheritFromBefore": True
# }
# }
# }
# request_1 = service.spreadsheets().batchUpdate(spreadsheetId=to_spreadsheet_id, body=value_range_body)
# response_1 = request_1.execute()
# body = {
# "requests": {
# "updateCells": {
# "rows": response,
# "fields": "userEnteredFormat, userEnteredValue",
# # "start":{
# # "sheetId": to_sheet_id,
# # "rowIndex": 1,
# # "columnIndex": 6
# # },
# "range": {
# "sheetId": to_sheet_id,
# "startRowIndex": 1,
# "startColumnIndex": 6,
# "endColumnIndex": 13
# },
# }
# }
# }
    request = service.spreadsheets().values().update(spreadsheetId=to_spreadsheet_id,
        range=range_, valueInputOption = "USER_ENTERED", body={"values": response})
    response = request.execute()
    return print('Done')

I think that the response values from service.spreadsheets().get() cannot be used directly with service.spreadsheets().values().update(). From your commented-out script, I guessed that you might want to copy not only the values but also the cell format.
In this case, how about the following modification?
Modified script:
sheetId = "###" # Please set the sheet ID of the sheet "Working Analysis"
request = service.spreadsheets().get(spreadsheetId=from_spreadsheet_id, ranges=[from_sheet + "!" + from_column + ":" + from_column_till], includeGridData=True)
response = request.execute()["sheets"][0]["data"][0]["rowData"]
requests = {
    "requests": [
        {
            "updateCells": {
                "start": {"sheetId": sheetId, "rowIndex": 1, "columnIndex": 6},
                "rows": response,
                "fields": "*",
            }
        }
    ]
}
request = service.spreadsheets().batchUpdate(spreadsheetId=to_spreadsheet_id, body=requests)
response = request.execute()
print("Done")
For example, if you want to copy only the values, you can use service.spreadsheets().values().update() as follows.
request = service.spreadsheets().values().get(spreadsheetId=from_spreadsheet_id, range=from_sheet + "!" + from_column + ":" + from_column_till)
response = request.execute()["values"]
range_ = "Working Analysis!G2"
request = service.spreadsheets().values().update(spreadsheetId=to_spreadsheet_id, range=range_, valueInputOption="USER_ENTERED", body={"values": response})
response = request.execute()
return print("Done")
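If the rowData from spreadsheets().get() with includeGridData=True has already been fetched, a third option is to reduce it to a plain two-dimensional list before calling values().update(). The sketch below is only an illustration under that assumption: row_data_to_values is a hypothetical helper, and it reads the formattedValue field of each cell, so numbers arrive as displayed text.

```python
def row_data_to_values(row_data):
    """Turn rowData (a list of rows, each with a 'values' list of cells)
    into a plain list of lists suitable as the 'values' of an update."""
    values = []
    for row in row_data:
        cells = row.get("values", [])
        # formattedValue is the displayed text; empty cells have none.
        values.append([cell.get("formattedValue", "") for cell in cells])
    return values


# Hand-written sample in the shape of rowData, for illustration only.
sample = [
    {"values": [{"formattedValue": "a", "effectiveFormat": {}},
                {"formattedValue": "1"}]},
    {"values": [{}, {"formattedValue": "2"}]},
    {},
]
converted = row_data_to_values(sample)
```

This drops all formatting, which is why the "Invalid values[1][0]: struct_value" error no longer applies: the update body then contains only strings, not cell objects.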
References:
Method: spreadsheets.batchUpdate
Method: spreadsheets.values.get

JSON Extract to dataframe using python

I have a JSON file with the structure shown below.
I am trying to get all the details into a dataframe or tabular form. I tried denormalizing it but could not get the expected result.
{
"body": [{
"_id": {
"s": 0,
"i": "5ea6c8ee24826b48cc560e1c"
},
"fdfdsfdsf": "V2_1_0",
"dsd": "INDIA-",
"sdsd": "df-as-3e-ds",
"dsd": 123,
"dsds": [{
"dsd": "s_10",
"dsds": [{
"dsdsd": "OFFICIAL",
"dssd": {
"dsds": {
"sdsd": "IND",
"dsads": 0.0
}
},
"sadsad": [{
"fdsd": "ABC",
"dds": {
"dsd": "INR",
"dfdsfd": -1825.717444
},
"dsss": [{
"id": "A:B",
"dsdsd": "A.B"
}
]
}, {
"name": "dssadsa",
"sadds": {
"sdsads": "INR",
"dsadsad": 180.831415
},
"xcs": "L:M",
"sds": "L.M"
}
]
}
]
}
]
}
]
}

This structure is far too nested to put directly into a dataframe. First, you'll need to use the ol' flatten_json function. This function isn't in a library (to my knowledge), but you see it around a lot. Save it somewhere.
def flatten_json(nested_json):
    """
    Flatten json object with nested keys into a single level.
    Args:
        nested_json: A nested json object.
    Returns:
        The flattened json object if successful, None otherwise.
    """
    out = {}
    def flatten(x, name=''):
        if type(x) is dict:
            for a in x:
                flatten(x[a], name + a + '_')
        elif type(x) is list:
            i = 0
            for a in x:
                flatten(a, name + str(i) + '_')
                i += 1
        else:
            out[name[:-1]] = x
    flatten(nested_json)
    return out
Applying it to your data:
import json
import pandas as pd
with open('deeply_nested.json', 'r') as f:
    flattened_json = flatten_json(json.load(f))
df = pd.json_normalize(flattened_json)
df.columns
Index(['body_0__id_s', 'body_0__id_i', 'body_0_schemaVersion',
'body_0_snapUUID', 'body_0_jobUUID', 'body_0_riskSourceID',
'body_0_scenarioSets_0_scenario',
'body_0_scenarioSets_0_modelSet_0_modelPolicyLabel',
'body_0_scenarioSets_0_modelSet_0_valuation_pv_unit',
'body_0_scenarioSets_0_modelSet_0_valuation_pv_value',
'body_0_scenarioSets_0_modelSet_0_measures_0_name',
'body_0_scenarioSets_0_modelSet_0_measures_0_value_unit',
'body_0_scenarioSets_0_modelSet_0_measures_0_value_value',
'body_0_scenarioSets_0_modelSet_0_measures_0_riskFactors_0_id',
'body_0_scenarioSets_0_modelSet_0_measures_0_riskFactors_0_underlyingRef',
'body_0_scenarioSets_0_modelSet_0_measures_1_name',
'body_0_scenarioSets_0_modelSet_0_measures_1_value_unit',
'body_0_scenarioSets_0_modelSet_0_measures_1_value_value',
'body_0_scenarioSets_0_modelSet_0_measures_1_riskFactors',
'body_0_scenarioSets_0_modelSet_0_measures_1_underlyingRef'],
dtype='object')
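To see how the column names above come about, here is a small worked example on a hand-made input. The function is a compact restatement of the same flattening idea (using isinstance and a configurable separator, both of which are my own choices), and the sample dict is invented for illustration.

```python
def flatten_json(nested, sep="_"):
    out = {}

    def walk(value, prefix):
        if isinstance(value, dict):
            for key, child in value.items():
                walk(child, prefix + key + sep)
        elif isinstance(value, list):
            # List positions become part of the key: 0, 1, 2, ...
            for index, child in enumerate(value):
                walk(child, prefix + str(index) + sep)
        else:
            # Drop the trailing separator before storing the leaf value.
            out[prefix[:-len(sep)]] = value

    walk(nested, "")
    return out


sample = {"body": [{"_id": {"s": 0},
                    "measures": [{"name": "ABC"}, {"name": "XYZ"}]}]}
flat = flatten_json(sample)
```

Each leaf ends up under a key made of the path to it, such as body_0_measures_1_name. Note that empty dicts and empty lists contribute no keys at all, so they disappear from the flattened result.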

How to insert another item programmatically into body?

I am trying to build a free/busy request body for the Google Calendar API with Python 3.8. However, when I try to insert a new item into the request body, I get a bad request and cannot use it.
This code is working:
SUBJECTA = '3131313636#resource.calendar.google.com'
SUBJECTB = '34343334#resource.calendar.google.com'
body = {
"timeMin": now,
"timeMax": nownext,
"timeZone": 'America/New_York',
"items": [{'id': SUBJECTA},{"id": SUBJECTB} ]
}
Good Body result:
{'timeMin': '2019-11-05T11:42:21.354803Z',
'timeMax': '2019-11-05T12:42:21.354823Z',
'timeZone': 'America/New_York',
'items': [{'id': '131313636#resource.calendar.google.com'},
{'id': '343334#resource.calendar.google.com'}]}
However,
While using this code:
items = "{'ID': '1313636#resource.calendar.google.com'},{'ID': '3383137#resource.calendar.google.com'},{'ID': '383733#resource.calendar.google.com'}"
body = {
"timeMin": now,
"timeMax": nownext,
"timeZone": 'America/New_York',
"items": items
}
The Body results contain additional quotes at the start and end position, failing the request:
{'timeMin': '2019-11-05T12:04:41.189784Z',
'timeMax': '2019-11-05T13:04:41.189804Z',
'timeZone': 'America/New_York',
'items': ["{'ID': 13131313636#resource.calendar.google.com},{'ID':
53333383137#resource.calendar.google.com},{'ID':
831383733#resource.calendar.google.com},{'ID':
33339373237#resource.calendar.google.com},{'ID':
393935323035#resource.calendar.google.com}"]}
What is the proper way to handle it and send the item list in an accurate way?

In your situation, the value of items is the string "{'ID': '1313636#resource.calendar.google.com'},{'ID': '3383137#resource.calendar.google.com'},{'ID': '383733#resource.calendar.google.com'}".
You want to parse that string into an object with Python.
The result you expect is [{'ID': '1313636#resource.calendar.google.com'}, {'ID': '3383137#resource.calendar.google.com'}, {'ID': '383733#resource.calendar.google.com'}].
You have already been able to use the Calendar API.
If my understanding is correct, how about this answer? Please think of this as just one of several possible answers.
Sample script:
import json # Added
items = "{'ID': '1313636#resource.calendar.google.com'},{'ID': '3383137#resource.calendar.google.com'},{'ID': '383733#resource.calendar.google.com'}"
items = json.loads(("[" + items + "]").replace("\'", "\"")) # Added
body = {
"timeMin": now,
"timeMax": nownext,
"timeZone": 'America/New_York',
"items": items
}
print(body)
Result:
If now and nownext are the strings "now" and "nownext", respectively, the result is as follows.
{
"timeMin": "now",
"timeMax": "nownext",
"timeZone": "America/New_York",
"items": [
{
"ID": "1313636#resource.calendar.google.com"
},
{
"ID": "3383137#resource.calendar.google.com"
},
{
"ID": "383733#resource.calendar.google.com"
}
]
}
Note:
If you can retrieve the IDs as string values, I recommend building the list directly, as in the following sample script.
ids = ['1313636#resource.calendar.google.com', '3383137#resource.calendar.google.com', '383733#resource.calendar.google.com']
items = [{'ID': id} for id in ids]
If I misunderstood your question and this was not the result you want, I apologize.
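As a variation on the replace-and-json.loads approach, the string can also be parsed with ast.literal_eval, which understands single-quoted Python literals directly. This is only a sketch: the variable names are mine, and renaming the key to lowercase id is an assumption based on the working request body shown in the question.

```python
import ast

items_text = ("{'ID': '1313636#resource.calendar.google.com'},"
              "{'ID': '3383137#resource.calendar.google.com'}")

# Wrap the text in brackets so it becomes a list literal, then parse it.
parsed = ast.literal_eval("[" + items_text + "]")

# The working body in the question uses the lowercase key 'id',
# so each entry is rebuilt with that key (an assumption, see above).
items = [{"id": entry["ID"]} for entry in parsed]
```

Because literal_eval only evaluates literals, it does not break on values that themselves contain a double quote, which the quote-replacement approach would.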
