Why Do I Keep Receiving A "requests.exceptions.InvalidSchema: No Connection Adapters Were Found For '0" Error? - python-3.x

I'm trying to create a script that returns domain and backlink counts from the SEMrush API for each URL held in a dataframe.
The dataframe containing the URLs looks like this:
0
0 www.ig.com/jp/trading-strategies/swing-trading...
1 www.ig.com/it/news-e-idee-di-trading/criptoval...
2 www.ig.com/uk/news-and-trade-ideas/the-omicron...
[1468 rows x 1 columns]
When I run my script I get the following error:
requests.exceptions.InvalidSchema: No connection adapters were found for '0 https://api.semrush.com/analytics/v1/?key=1f0e...\nName: 0, dtype: object'
Here is the part of the code that generates the error:
for index, url in gsdf.iterrows():
    rr = requests.request("GET", "https://api.semrush.com/analytics/v1/?key=" + API_KEY + "&type=backlinks_tld&target=" + url + "&target_type=url&export_columns=domains_num,backlinks_num&display_limit=1", headers=headers, data=payload)
    data = json.loads(rr.text.encode('utf8'))
    srdf = srdf.append({domains_num: data, backlinks_num: data}, ignore_index=True)
I'm not sure why this happens as I'm new to Python. Can you help?
Kind thanks
Mark
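The error message holds the clue: the value being glued into the URL is the string form of a whole pandas Series ('0 https://...\nName: 0, dtype: object'), not a single URL. DataFrame.iterrows() yields (index, row) pairs where row is a Series, so url here is the entire row. A sketch of a likely fix, assuming the URLs sit in column 0 of gsdf as the printout suggests:

# Sketch: pull the scalar URL out of each row instead of concatenating the Series.
for index, row in gsdf.iterrows():
    url = row[0]  # column name 0 is assumed from the dataframe printout above
    rr = requests.get(
        "https://api.semrush.com/analytics/v1/",
        params={
            "key": API_KEY,
            "type": "backlinks_tld",
            "target": url,
            "target_type": "url",
            "export_columns": "domains_num,backlinks_num",
            "display_limit": 1,
        },
        headers=headers,
    )

Letting requests build the query string via params also spares you the manual string concatenation.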

Related

Issues with Python Lambda with filter() function

Using multiple if conditions to filter a list works. However, I am looking for a better alternative. As a beginner Python user, I cannot make the filter() and lambda functions work, even after using this resource: https://www.geeksforgeeks.org/python-filter-list-of-strings-based-on-the-substring-list/. Any help will be appreciated.
The following code block (Method 1) works.
mylist = ['MEPS HC-226: MEPS Panel 23 Three-Year Longitudinal Data File',
          'HC-203: 2018 Jobs File',
          'HC-051H 2000 Home Health',
          'NHC-001F Facility Characteristics Round 1',
          'HC-IC Linked Data 1999',
          'HC-004 1996 Employment Data/Family-Level Weight (HC-004 replaced by HC-012)',
          'HC-030 1998 MEPS HC Survey Data (CD-ROM)']
sublist1 = []
for element in mylist:
    if element.startswith(("MEPS HC", "HC")):
        if "-IC" not in element:
            if "replaced" not in element:
                if "CD-ROM" not in element:
                    sublist1.append(element)
print(sublist1)
(Output below)
['MEPS HC-226: MEPS Panel 23 Three-Year Longitudinal Data File', 'HC-203: 2018 Jobs File', 'HC-051H 2000 Home Health']
However, I run into issues with the following code block (Method 2).
sublist2 = []
excludes = ['-IC', 'replaced', 'CD-ROM']
for element in mylist:
    if element.startswith(("MEPS HC", "HC")):
        sublist2 = mylist(filter(lambda x: any(excludes in x for exclude in excludes), mylist))
        sublist2.append(element)
print(sublist2)
TypeError: 'list' object is not callable
My code block with multiple if conditions (Method 1) to filter the list works. However, I could not figure out why the code block with filter() and lambda functions (Method 2) does not work. I was expecting the same results I got from Method 1. I am open to other solutions as an alternative to Method 1.
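For what it's worth, the TypeError comes from mylist(...): that calls the list object itself instead of the built-in list(...). The lambda also tests excludes in x where it should test exclude in x, the condition needs a not any(...) to exclude rather than keep those items, and the filter belongs outside the loop. A sketch of a corrected Method 2, using the same mylist and excludes as above:

# Sketch: one filter() pass combining the prefix test and the exclusions.
sublist2 = list(filter(
    lambda x: x.startswith(("MEPS HC", "HC"))
              and not any(exclude in x for exclude in excludes),
    mylist,
))
print(sublist2)  # expected to match Method 1's output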

How to handle errors with TimeDelta and Integers in Python

I need to calculate the distance between two dates.
df3['dist_2_1'] = (df3['Date2'] - df3['Date1'])
When I save this into my SQLite DB the format is terrible, so I decided to use an integer format which is much better.
df3['dist_2_1'] = (df3['Date2'] - df3['Date1']).astype('timedelta64[D]').astype(int)
So far so good, but in a similar case I have NULL values, which cause an error when I try to take the difference between dates.
df3['dist_B_3'] = df3['Break_date'] - df3['Date3']
The Break_date can be null, and in that case I want the final result in dist_B_3 to be 0, but right now it raises an error that breaks everything. I tested this so far, but it doesn't work:
try:
    if df3['Break_date'] == 'NaT':
        df3['dist_B_3'] = 0
    else:
        df3['dist_B_3'] = df3['Break_date'] - df3['Date3']
        #().astype('timedelta64[D]').astype(int)
except Exception:
    print("error in the dist_B_3")
My df3['Break_date'] column is the one below, so the NaT values are the ones creating the error.
0 2022-07-13
1 2022-07-12
2 2022-07-14
3 2022-07-14
4 NaT
5 NaT
Any idea on how to handle this?
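One vectorised sketch of a fix, assuming both columns are real datetime64 values as printed above: subtract first, convert to whole days, and only then replace the missing results with 0. (Comparing a column against the string 'NaT' never matches, and putting a whole Series in an if raises its own error, which is why the try/except version can't work.)

# Sketch: do the subtraction for every row, then fill the NaT-derived gaps.
# Assumes df3['Break_date'] and df3['Date3'] are datetime64 columns.
diff_days = (df3['Break_date'] - df3['Date3']).dt.days  # NaN where Break_date is NaT
df3['dist_B_3'] = diff_days.fillna(0).astype(int)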

pandas groupby trying to optimise several steps

I've been trying to optimise a Bokeh server that calculates live Covid-19 stats for a selected country.
I found myself repeating a groupby call to compute new columns, and I was wondering, having built the groupby once, whether I could then apply it to multiple columns in a way similar to .agg().
For example:
dfall = pd.DataFrame(db("SELECT * FROM C19daily"))
dfall.set_index(['geoId', 'date'], drop=False, inplace=True)
dfall = dfall.sort_index(ascending=True)
dfall.head()
                      id        date geoId  cases  deaths          auid
geoId date
AD    2020-03-03   70119  2020-03-03    AD      1       0  AD03/03/2020
      2020-03-14   70118  2020-03-14    AD      1       0  AD14/03/2020
      2020-03-16   70117  2020-03-16    AD      3       0  AD16/03/2020
      2020-03-17   70116  2020-03-17    AD      9       0  AD17/03/2020
      2020-03-18   70115  2020-03-18    AD      0       0  AD18/03/2020
I need to create new columns based on 'cases' and 'deaths' by applying various functions like cumsum(). Currently I do this the long way:
dfall['ccases'] = dfall.groupby(level=0)['cases'].cumsum()
dfall['dpc_cases'] = dfall.groupby(level=0)['cases'].pct_change(fill_method='pad', periods=7)
.....
dfall['cdeaths'] = dfall.groupby(level=0)['deaths'].cumsum()
dfall['dpc_deaths'] = dfall.groupby(level=0)['deaths'].pct_change(fill_method='pad', periods=7)
I tried to optimise the groupby call like this:
with dfall.groupby(level=0) as gr:
    gr = g['cases'].cumsum()...
But the error suggests the class doesn't support this:
AttributeError: __enter__
I thought I could use .agg({}) and supply a dictionary:
g = dfall.groupby(level=0).agg({'cc' : 'cumsum', 'cd' : 'cumsum'})
but that produces another error
pandas.core.base.SpecificationError: nested renamer is not supported
I have plenty of other bits to optimise; I thought this Python part would be the easiest and would save a few ms!
Could anyone nudge me in the right direction?
To avoid repeating dfall.groupby(level=0) you can just save it in a variable:
gb = dfall.groupby(level=0)
gb_cases = gb['cases']
dfall['ccases'] = gb_cases.cumsum()
dfall['dpc_cases'] = gb_cases.pct_change(fill_method='pad', periods=7)
...
And to run multiple aggregations using a single expression, I think you can use named aggregation. But I have no clue whether it will be more performant or not. Either way, it's better to profile the code and improve the actual bottlenecks.
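For reference, a minimal sketch of what named aggregation looks like on this frame. Note it returns one reduced row per geoId (sums, means, and the like), so it does not replace per-row transforms like cumsum; the earlier .agg({'cc': 'cumsum', 'cd': 'cumsum'}) attempt also fails because the dictionary form expects existing column names as keys.

# Sketch: named aggregation gives one row per group with freely named outputs.
# Each output column = (source column, aggregation function).
summary = dfall.groupby(level=0).agg(
    total_cases=('cases', 'sum'),
    total_deaths=('deaths', 'sum'),
)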

Why am I getting <searchconsole.query.Report(rows=1)> instead of numbers/strings

I'm working with the Search Console API and have made it through the basics.
Now I'm stuck on splitting and arranging the data: when trying to split it, I get NaN, and nothing I try works.
46 ((174.0, 3753.0, 0.04636290967226219, 7.816147...
47 ((93.0, 2155.0, 0.0431554524361949, 6.59025522...
48 ((176.0, 4657.0, 0.037792570324243074, 6.90251...
49 ((20.0, 1102.0, 0.018148820326678767, 7.435571...
50 ((31.0, 1133.0, 0.02736098852603707, 8.0935569...
Name: test, dtype: object
When I try to manipulate the data like this (and with similar interactions):
data=source['test'].tolist()
data
it's clear that the data is not really available:
[<searchconsole.query.Report(rows=1)>,
<searchconsole.query.Report(rows=1)>,
<searchconsole.query.Report(rows=1)>,
<searchconsole.query.Report(rows=1)>,
<searchconsole.query.Report(rows=1)>]
Does anyone have an idea how I can interact with my data?
Thanks.
For reference, this is the code and the library I work with:
account = searchconsole.authenticate(client_config='client_secrets.json', credentials='credentials.json')
webproperty = account['https://www.example.com/']

def APIsc(date, keyword):
    results = webproperty.query.range(date, days=-30).filter('query', keyword, 'contains').get()
    return results

source['test'] = source.apply(lambda x: APIsc(x.date, x.keyword), axis=1)
source
made by: https://github.com/joshcarty/google-searchconsole
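The column holds Report objects, not values: .get() returns a searchconsole Report, and pandas simply stores the object, which is why splitting yields NaN. A sketch of one way to unpack it, assuming Report exposes .rows (a list of row tuples) and .to_dataframe() as the library's README describes:

# Sketch: return the first row of metrics instead of the Report object itself.
# Report.rows is assumed per the library's README; field names depend on your query.
def APIsc(date, keyword):
    report = webproperty.query.range(date, days=-30).filter('query', keyword, 'contains').get()
    return report.rows[0] if report.rows else None

source['test'] = source.apply(lambda x: APIsc(x.date, x.keyword), axis=1)
# A single Report can also be converted wholesale with report.to_dataframe().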

How to fix KeyError: "groups" when using the Foursquare API in Python?

I am trying to list nearby venues using a getNearbyVenues function that was defined earlier. Every line worked fine, and then I could no longer label nearby venues properly using Foursquare, although it had been working (I had to reset my ID and secret as it just stopped working). I'm using Python 3.5 in a Jupyter Notebook.
What am I doing wrong? Thank you!
BT_venues = getNearbyVenues(names=BT_df['Sector'],
                            latitudes=BT_df['Latitude'],
                            longitudes=BT_df['Longitude']
                            )
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-99-563e09cdcab5> in <module>()
      1 BT_venues=getNearbyVenues(names=BT_df['Sector'],
      2                           latitudes=BT_df['Latitude'],
----> 3                           longitudes=BT_df['Longitude']
      4                           )

<ipython-input-93-cfc09962ae0b> in getNearbyVenues(names, latitudes, longitudes, radius)
     18
     19     # make the GET request
---> 20     results = requests.get(url).json()['response']['groups'][0]['items']
     21
     22     # return only relevant information for each nearby venue

KeyError: 'groups'
As for groups, this was the code:
venues = res['response']['groups'][0]['items']
nearby_venues = json_normalize(venues)  # flatten JSON

# columns only
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]

# only one category per row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# columns cleaning up
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]
nearby_venues.head()
Check response['meta']; you may have exceeded your quota.
If you need an instant resolution, create a new Foursquare account, then create a new application and use the new client ID and secret to call the API.
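A defensive sketch of the request step, assuming the standard Foursquare v2 response shape, so a quota or auth problem surfaces as a readable message instead of a KeyError:

# Sketch: inspect the 'meta' block before indexing into 'groups'.
# 'url' is assumed to be the venue-search URL built inside getNearbyVenues().
res = requests.get(url).json()
meta = res.get('meta', {})
if meta.get('code') != 200:
    # e.g. quota_exceeded or invalid_auth; details sit in meta itself
    raise RuntimeError("Foursquare API error: {}".format(meta))
results = res['response']['groups'][0]['items']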
