GeoCode Results: Index Out of Range - python-3.x

I am writing a program that converts an address into latitude/longitude coordinates. I expected output along the lines of 39.962714, -83.003419, but instead it raised an IndexError:
File "main.py", line 10, in <module>
result = requests.get('http://maps.googleapis.com/maps/api/geocode/json', params=params).json()['results'][0]
IndexError: list index out of range
python3 exited with code 1...
I haven't tried much because I couldn't find any situations similar to mine, and I don't understand exactly why the error comes up, because there should be at least one value in results. Here is my code anyway:
import requests

params = {
    'address': '90 W Broad St',
    'sensor': 'false',
    'region': 'Ohio'
}
result = requests.get('http://maps.googleapis.com/maps/api/geocode/json', params=params).json()['results'][0]
geodata = dict()
geodata['lat'] = result['geometry']['location']['lat']
geodata['lng'] = result['geometry']['location']['lng']
print('{lat}, {lng}'.format(**geodata))
I am very sorry if the answer is obvious, but thank you for taking the time to read this. :)
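A minimal sketch of one way to diagnose this: the Geocoding API returns a status field (e.g. ZERO_RESULTS, REQUEST_DENIED) alongside results, so checking it before indexing shows why the list is empty. In particular, requests to this endpoint without a key parameter are commonly rejected with REQUEST_DENIED, which leaves results empty and triggers exactly this IndexError.

import requests

params = {
    'address': '90 W Broad St',
    'region': 'Ohio'
}
response = requests.get('https://maps.googleapis.com/maps/api/geocode/json',
                        params=params).json()
if response['status'] != 'OK' or not response['results']:
    # e.g. ZERO_RESULTS for an unmatched address, REQUEST_DENIED without an API key
    print('Geocoding failed:', response['status'])
else:
    location = response['results'][0]['geometry']['location']
    print('{lat}, {lng}'.format(**location))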


How to solve ValueError: Sample larger than population or is negative without updating the list size in Python?

I have a list of 36 color names, shown below:
MAIN_COLORS = ['darkolivegreen', 'darkseagreen', 'darkorange', 'darkslategrey', 'darkturquoise', 'darkgreen', 'darkviolet', 'darkgray', 'darkmagenta', 'darkblue', 'darkkhaki','darkcyan', 'darkred', 'darksalmon', 'darkslategray', 'darkgoldenrod', 'darkgrey', 'darkslateblue', 'darkorchid','skyblue','yellow','orange','red','pink','violet','green','brown','gold','Olive','Maroon', 'blue', 'cyan', 'black','olivedrab', 'lightcyan', 'silver']
I also have a classes.txt file containing 459 labels in total.
Now when I run the code snippet below:
import random
import sys

try:
    with open('classes.txt', 'r') as cls:
        classes = cls.readlines()
    classes = [c.strip() for c in classes]
except IOError:
    print("[ERROR] Please create classes.txt and put all your classes in it")
    sys.exit(1)

COLORS = random.sample(set(MAIN_COLORS), len(classes))
I am getting the error below:
Traceback (most recent call last):
File "D:/Projects/YoloV3_Annotation_Tool-master/YoloV3_Annotation_Tool-master/main.py", line 42, in
COLORS = random.sample(set(MAIN_COLORS), len(classes))
File "C:\Users\prateek.g\AppData\Local\Continuum\anaconda3\envs\myNewEnv\lib\random.py", line 321, in sample
raise ValueError("Sample larger than population or is negative")
ValueError: Sample larger than population or is negative
From the error I understand that I need more color names in my list, but adding colors by hand seems impractical, since the number of labels in classes.txt may increase.
So is there any way I can fix this problem? Please suggest.
You can expand your MAIN_COLORS list by concatenating it to itself as many times as needed, for example:
while len(MAIN_COLORS) < len(classes):
    MAIN_COLORS = MAIN_COLORS + MAIN_COLORS
However, this creates duplicate values in your list, and your code wraps the list in set(), so no matter how many times you duplicate it, set() will collapse it back down to the original 36 unique colors. You would also need to drop the set() call for this to work.
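An alternative sketch that sidesteps the duplication issue entirely: sample with replacement using random.choices (Python 3.6+), which allows the number of labels to exceed the number of distinct colors, at the cost of repeated colors:

import random

# Sampling with replacement: any k is valid, colors may simply repeat.
COLORS = random.choices(MAIN_COLORS, k=len(classes))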

Please help me fix the "list index out of range" error

I wrote a program to calculate the ratio of the minor population (under age 20) in each prefecture of Japan, and it keeps producing "list index out of range" at line 19: ratio = (agerange[1]+agerange[2]+agerange[3]+agerange[4])/population*100.0
Link to csv: https://drive.google.com/open?id=1uPSMpgHw0csRx1UgAJzRLit9p6NrztFY
f = open("population.csv", "r")
header = f.readline()
header = header.rstrip("\r\n")
while True:
    line = f.readline()
    if line == "":
        break
    line = line.rstrip("\r\n")
    field = line.split(sep=",")
    population = 0
    ratio = 0
    agerange = ["pref"]
    for age in range(1, len(field)):
        agerange.append(int(field[age]))
        population += int(field[age])
    ratio = (agerange[1]+agerange[2]+agerange[3]+agerange[4])/population*100.0
    print(field[0], ratio)
On line 17, I assume you meant to do the following:
ratio =(agerange[0]+agerange[1]+agerange[2]+agerange[3])/population*100.0
Next time, please describe your error in more detail.
What you could do instead is get the sums of populations in the required age ranges and then perform the ratio calculation.
In Python, you can use the map function to convert the values in an iterable to ints, and make that into a list.
Once you have the list, you can use the sum function on it, or a part of it.
So, I came up with:
f = open("population.csv", "r")
header = f.readline()
header = header.rstrip("\r\n")
while True:
    line = f.readline()
    if line == "":
        break
    line = line.rstrip("\r\n")
    field = line.split(sep=",")
    popData = list(map(int, field[1:]))
    youngPop = sum(popData[:4])
    oldPop = sum(popData[4:])
    ratio = youngPop / (youngPop + oldPop)
    print(field[0].ljust(12), ratio)
f.close()
Which outputs (just showing a portion here):
Hokkaido 0.1544532130777903
Aomori 0.1564945226917058
Iwate 0.16108452950558214
Miyagi 0.16831683168316833
Akita 0.14357429718875503
Yamagata 0.16515426497277677
Fukushima 0.16586921850079744
(I don't really know Python, so there could be some "better" or more conventional way.)
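For what it's worth, a slightly more conventional sketch using the csv module and a with block (assuming the same file layout, and skipping short rows rather than crashing on them):

import csv

with open("population.csv", newline="") as f:
    reader = csv.reader(f)
    next(reader)  # skip the header row
    for row in reader:
        if len(row) < 5:
            continue  # a short row is what triggers the original IndexError
        counts = [int(x) for x in row[1:]]
        ratio = sum(counts[:4]) / sum(counts) * 100.0
        print(row[0].ljust(12), ratio)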

KeyError: "None of [Index(...')] are in the [columns]"

I am using the recordlinkage library with pandas. In the first step, I create the index; the parameters are:
indexer = recordlinkage.Index()
indexer.block(fr.iloc[:, 2])
pairs = indexer.index(fr)
Note that the project's page indicates the following usage:
indexer = recordlinkage.Index()
indexer.block('orignal_link')
candidate_links = indexer.index(dfA, dfB)
I replaced the column label with the same position using .iloc, since it couldn't find the column by name. However, when I explicitly asked for the column names, I got the following output:
Index(['_id', 'doi', 'orignal_link', 'title', 'authors', 'affiliation', 'citation', 'abstract', 'paper', 'references'], dtype='object')
Anyway, after the replacement, the error produced is the following:
KeyError: "None of [Index([('https://aip.scitation.org/doi/full/10.1063/1.5097416', 'https://aip.scitation.org/doi/full/10.1063/1.5110298', 'https://aip.scitation.org/doi/full/10.1063/1.5096407', 'https://aip.scitation.org/doi/full/10.1063/1.5093609', 'https://aip.scitation.org/doi/full/10.1063/1.5094748', 'https://aip.scitation.org/doi/full/10.1063/1.5098007', 'https://aip.scitation.org/doi/full/10.1063/1.5095979', 'https://aip.scitation.org/doi/full/10.1063/1.5109249', 'https://iopscience.iop.org/article/10.1088/1367-2630/12/7/073006/meta')], dtype='object')] are in the [columns]"
If it's not finding values, how can it print them out?
Any help?
Thank you
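A likely explanation, sketched under the assumption that fr is the DataFrame shown above: indexer.block expects a column label, but fr.iloc[:, 2] passes the column's values (a Series of URLs), so pandas then looks those URLs up as column names, which is why they appear inside the KeyError. Passing the label itself, looked up by position if need be, avoids this:

import recordlinkage

indexer = recordlinkage.Index()
# fr.columns[2] is the *label* at position 2 ('orignal_link' here),
# whereas fr.iloc[:, 2] is that column's *values*.
indexer.block(fr.columns[2])
pairs = indexer.index(fr)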

ValueError, though a check has already been performed for this

I'm getting a little stuck with NaN data. This program trawls through a folder on an external hard drive, loads each txt file as a dataframe, and should read the very last value of the last column. As some of the last rows are incomplete for whatever reason, I have chosen to take the row before (or that's what I hope to have done). Here is the code; I have commented the lines that I think are causing the trouble:
#!/usr/bin/env python3
import glob
import math
import pandas as pd
import numpy as np

def get_avitime(vbo):
    try:
        df = pd.read_csv(vbo,
                         delim_whitespace=True,
                         header=90)
        row = next(df.iterrows())
        t = df.tail(2).avitime.values[0]
        return t
    except:
        pass

def human_time(seconds):
    secs = seconds / 1000
    mins, secs = divmod(secs, 60)
    hours, mins = divmod(mins, 60)
    return '%02d:%02d:%02d' % (hours, mins, secs)

def main():
    path = 'Z:\\VBox_Backup\\**\\*.vbo'
    events = {}
    customers = {}
    for vbo_path in glob.glob(path, recursive=True):
        path_list = vbo_path.split('\\')
        event = path_list[2].upper()
        customer = path_list[3].title()
        avitime = get_avitime(vbo_path)
        if not avitime:  # this is to check there is a number
            continue
        else:
            if event not in events:
                events[event] = {customer: avitime}
                print(event)
            elif customer not in events[event]:
                events[event][last_customer] = human_time(events[event][last_customer])
                print(events[event][last_customer])
                events[event][customer] = avitime
            else:
                total_time = events[event][customer]
                total_time += avitime
                events[event][customer] = total_time
        last_customer = customer
    events[event][customer] = human_time(events[event][customer])
    df_events = pd.DataFrame(events)
    df_events.to_csv('event_track_times.csv')

main()
I put in a line to check for a value, but I am guessing that NaN is not a null value, hence it hasn't quite worked.
(C:\Users\rob.kinsey\AppData\Local\Continuum\Anaconda3) c:\Users\rob.kinsey\Programming>python test_single.py
BARCELONA
03:52:42
02:38:31
03:21:02
00:16:35
00:59:00
00:17:45
01:31:42
03:03:03
03:16:43
01:08:03
01:59:54
00:09:03
COTA
04:38:42
02:42:34
sys:1: DtypeWarning: Columns (0) have mixed types. Specify dtype option on import or set low_memory=False.
04:01:13
01:19:47
03:09:31
02:37:32
03:37:34
02:14:42
04:53:01
LAGUNA_SECA
01:09:10
01:34:31
01:49:27
03:05:34
02:39:03
01:48:14
SILVERSTONE
04:39:31
01:52:21
02:53:42
02:10:44
02:11:17
02:37:11
01:19:12
04:32:21
05:06:43
SPA
Traceback (most recent call last):
File "test_single.py", line 56, in <module>
main()
File "test_single.py", line 41, in main
events[event][last_customer] = human_time(events[event][last_customer])
File "test_single.py", line 23, in human_time
The output starts out correctly, except for the sys:1 warning, but at least it carries on until the final error that stalls the program completely. How can I get past this NaN issue? All the variables I am working with should be floats, or should have been ignored; everything should be strings or floats until the time conversion, which uses integers.
OK, even though no one answered, I am compelled to answer my own question, as I am not convinced I am the only person who has had this problem.
There are three main reasons for receiving NaN in a dataframe. Most of them revolve around infinity, such as using 'inf' as a value or dividing by zero, which can also produce NaN as a result. The Wikipedia page was the most helpful for me in solving this issue:
https://en.wikipedia.org/wiki/NaN
One other important point about NaN is that it works a little like a virus: anything that touches it in any calculation will result in NaN, so the problem can get exponentially worse. What you are actually dealing with is missing data, and until you realize that, NaN is the least useful and most frustrating thing, because it is a value, not an error, yet any mathematical operation on it ends in NaN. Beware!
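A quick illustration of both points (a minimal sketch): NaN propagates through arithmetic, and it is truthy, so a plain "if not x" check, like the one in the code above, does not catch it:

import math

x = float('nan')
print(x + 1, x * 0)   # nan nan -- NaN propagates through arithmetic
print(bool(x))        # True    -- NaN is truthy, so "if not x" misses it
print(math.isnan(x))  # True    -- an explicit check works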
The reason on this occasion was that a specific line number was used to locate the headers when reading in the file, and although that worked for the majority of these files, some of them had the headers I was after on a different line. As a result, the headers imported into the dataframe were either part of the data itself or null values, so trying to access a column by header name resulted in NaN, and, as discussed earlier, this proliferated through the program, causing a few problems I had used workarounds to combat. One of those workarounds is actually acceptable, which is to add this line:
df = df.fillna(0)
after the first definition of the df variable, in this case:
df = pd.read_csv(vbo,
                 delim_whitespace=True,
                 header=90)
The bottom line is that if you are receiving this value, the best thing really is to work out why you are getting NaN in the first place; then it is easier to make an informed decision as to whether replacing NaN with '0' is a viable choice.
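A minimal sketch of that diagnostic step (hypothetical data, assuming pandas): count the missing values per column before deciding whether filling with 0 is sensible:

import pandas as pd

df = pd.DataFrame({'avitime': [1000.0, None, 3000.0]})
print(df.isna().sum())  # how many values are missing, per column
df = df.fillna(0)       # only fill once you know 0 is a sensible stand-in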
I sincerely hope this helps anyone who finds it.
Regards
iFunction

SelectKBest, DecisionTreeClassifier, small error somewhere :(

I'm doing a course and I am stuck on what I think has to be a small problem.
I want to find out with SelectKBest which are the most important features (I vary k over 2, 4, 6, 8).
I load the data:
data_dict = pickle.load(open("final_project_dataset.pkl", "r") )
my_dataset = data_dict
data = featureFormat(my_dataset, feature_combo, sort_keys = True)
labels, features = targetFeatureSplit(data)
kbest = SelectKBest(k=2)
train_new= kbest.fit_transform(features,labels)
With get_support I find out the most important features and then try to use them with my classifier:
from sklearn import tree
clf1 = tree.DecisionTreeClassifier(min_samples_split=2)
test_classifier(clf1, my_dataset, feature_lists2)
First I used a feature list with all the features, which I called combo:
feature_combo=['poi','salary','bonus','total_stock_value','long_term_incentive','restricted_stock_deferred','from_this_person_to_poi','shared_receipt_with_poi','newfeature_ratio','total_payments','deferral_payments','loan_advances', 'restricted_stock','director_fees','to_messages','from_messages']
After getting the most important ones, I created feature lists like:
feature_lists2=['salary','bonus']
When I run it I get a cryptic error:
Traceback (most recent call last):
File "C:\Users\Stephan\Downloads\ud120-projects\final_project\poi_id.py", line 62, in <module>
train_new= kbest.fit_transform(features,labels)
File "C:\Users\Stephan\Anaconda\lib\site-packages\sklearn\base.py", line 429, in fit_transform
return self.fit(X, y, **fit_params).transform(X)
File "C:\Users\Stephan\Anaconda\lib\site-packages\sklearn\feature_selection\univariate_selection.py", line 300, in fit
self._check_params(X, y)
File "C:\Users\Stephan\Anaconda\lib\site-packages\sklearn\feature_selection\univariate_selection.py", line 405, in _check_params
% self.k)
ValueError: k should be >=0, <= n_features; got 2.Use k='all' to return all features.
[Finished in 0.5s with exit code 1]
Can anyone see what I am doing wrong? (I am a beginner.)
Your question is a bit unclear to me. I'm not sure exactly when you are getting this error, and you did not supply data to make a reproducible example. However, if you read your error message, it states the issue pretty clearly:
ValueError: k should be >=0, <= n_features; got 2.Use k='all' to return all features.
This means that the k parameter in your SelectKBest() object was not within the appropriate range. Specifically, k=2 was greater than n_features, which means that the data you passed into your kbest.fit_transform() call had fewer than 2 columns. Without seeing any data I can't say why that is happening, but it is almost surely the source of your error.
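One defensive fix, sketched with the names from the question (features and labels as produced by targetFeatureSplit): clamp k to the number of columns actually present before fitting:

import numpy as np
from sklearn.feature_selection import SelectKBest

X = np.asarray(features)
k = min(2, X.shape[1])  # never request more features than exist
kbest = SelectKBest(k=k)
train_new = kbest.fit_transform(X, labels)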
