Error: IndexError: list index out of range - python-3.x

I am new to Python and I am getting an error while executing the code below. I would really appreciate it if anybody could help me understand it.
About the data: the dataframe is stored in "train" and the column name is "neighborhood". Values in "neighborhood" look like "#Queens#jackson heights" or "#Manhattan#uppereast side". I am trying to split on the hashtags and then consider only the first word in each row (i.e. Queens, Manhattan, etc.).
It does print the expected output, but with this error:
IndexError Traceback (most recent call last)
<ipython-input-89-b199ce84fe1c> in <module>()
5 for row in train['neighborhood'].str.split('#'):
6 # if more than a value,
----> 7 if len(row[1]) == 5 :
8 # Append a num grade
9 grades.append('1')
IndexError: list index out of range
train = pd.DataFrame(train, columns = ['id','listing_type','floor','latitude','longitude','price','beds','baths','total_rooms','square_feet','pet_details','neighborhood'])
# Create a list to store the data
grades = []
# For each row in the column,
for row in train['neighborhood'].str.split('#'):
    # if the first word is Queens,
    if row[1] == 'Queens':
        # append grade 1
        grades.append('1')
    # else, if the first word is Manhattan,
    elif row[1] == 'Manhattan':
        # append grade 2
        grades.append('2')
    # else, if the first word is Bronx,
    elif row[1] == 'Bronx':
        # append grade 3
        grades.append('3')
    # else, if the first word is Brooklyn,
    elif row[1] == 'Brooklyn':
        # append grade 4
        grades.append('4')
    # else, for any other value,
    else:
        # append the fallback grade
        grades.append('5')
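One likely cause, which is an assumption because the data itself is not shown, is that at least one value in the column contains no '#'. For such a value split('#') returns a list with a single item, so row[1] does not exist and raises IndexError after the earlier rows have already been processed. A minimal sketch of a guarded lookup follows; borough_grade and GRADES are hypothetical names, and the mapping is copied from the question.

```python
# Mapping taken from the question; the names GRADES and borough_grade are hypothetical.
GRADES = {'Queens': '1', 'Manhattan': '2', 'Bronx': '3', 'Brooklyn': '4'}

def borough_grade(value, default='5'):
    """Return the grade for a '#Borough#area' string, or default if it is malformed."""
    if not isinstance(value, str):
        return default              # e.g. NaN for a missing value
    parts = value.split('#')
    if len(parts) < 2:              # no '#' at all, so parts[1] would raise IndexError
        return default
    return GRADES.get(parts[1], default)

samples = ['#Queens#jackson heights', '#Manhattan#uppereast side', 'no hashtag', '']
grades = [borough_grade(v) for v in samples]
print(grades)
```

With the dataframe from the question, train['neighborhood'].apply(borough_grade) should give the same result row by row, assuming the column holds strings or missing values.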

Related

To find the location at which the error has occurred

I need to validate that the values in a column fall within a given range. If a value is greater or less than the given range, an error should be raised and the row number or index where it occurred should be displayed.
My data is as follows:
Draft_Fore
12
14
87
16
90
It should report an error for the values 87 and 90, since the column values must be greater than 5 and less than 20.
The code I have tried is as follows:
def validate_rating(Draft_Fore):
    Draft_Fore = int(Draft_Fore)
    if Draft_Fore > 5 and Draft_Fore <= 20:
        return True
    return False
df = pd.read_csv("/home/anu/Desktop/dr.csv")
for i, Draft_Fore in enumerate(df):
    try:
        validate_rating(Draft_Fore)
    except Exception as e:
        print('Error at index {}: {!r}'.format(i, Draft_Fore))
        print(e)
To print the location where the error has occurred in the row
A little explanation to clarify my comment. Assuming your dataframe looks like
df = pd.DataFrame({'col1': [12, 14, 87, 16, 90]})
you could do
def check_in_range(v, lower_lim, upper_lim):
    if lower_lim < v <= upper_lim:
        return True
    return False
lower_lim, upper_lim = 5, 20
for i, v in enumerate(df['col1']):
    if not check_in_range(v, lower_lim, upper_lim):
        print(f"value {v} at index {i} is out of range!")
# --> gives you
value 87 at index 2 is out of range!
value 90 at index 4 is out of range!
So your check function is basically fine. However, if you call enumerate on a df, the values you get are the column names, so int() is applied to the string 'Draft_Fore' rather than to the numbers. What you need is to enumerate the specific column.
Concerning your idea of raising an exception, I'd suggest having a look at raise and assert.
So you could e.g. use raise:
for i, v in enumerate(df['col1']):
    if not check_in_range(v, lower_lim, upper_lim):
        raise ValueError(f"value {v} at index {i} is out of range")
# --> gives you
ValueError: value 87 at index 2 is out of range
or assert:
for i, v in enumerate(df['col1']):
    assert v > lower_lim and v <= upper_lim, f"value {v} at index {i} is out of range"
# --> gives you
AssertionError: value 87 at index 2 is out of range
Note: If you have a df, why not use its features for convenience? To get the in-range values of the column, you could just do
df[(df['col1'] > lower_lim) & (df['col1'] <= upper_lim)]
# --> gives you
col1
0 12
1 14
3 16
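Note that the raise and assert variants stop at the first offending value, so 90 is never reported. If every bad row should be listed before failing, one option is to collect them first. The sketch below uses a plain list in place of the column; out_of_range and validate_range are hypothetical helper names, and the limits are the ones used above.

```python
def out_of_range(values, lower_lim, upper_lim):
    """Return (index, value) pairs for every value outside (lower_lim, upper_lim]."""
    return [(i, v) for i, v in enumerate(values) if not lower_lim < v <= upper_lim]

def validate_range(values, lower_lim, upper_lim):
    """Raise one ValueError that names every offending index, if there are any."""
    problems = out_of_range(values, lower_lim, upper_lim)
    if problems:
        details = ", ".join(f"{v} at index {i}" for i, v in problems)
        raise ValueError(f"values out of range: {details}")

draft_fore = [12, 14, 87, 16, 90]   # the sample column from the question

try:
    validate_range(draft_fore, 5, 20)
except ValueError as err:
    print(err)
```

The same functions should accept df['col1'] directly, since they only iterate over the values.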

Was solving a hacker problem and some test cases didn't pass

Given the names and grades for each student in a Physics class of students, store them in a nested list and print the name(s) of any student(s) having the second lowest grade.
Note: If there are multiple students with the same grade, order their names alphabetically and print each name on a new line.
Input Format
The first line contains an integer, N, the number of students.
The 2N subsequent lines describe each student over 2 lines; the first line contains a student's name, and the second line contains their grade.
Constraints
There will always be one or more students having the second lowest grade.
Output Format
Print the name(s) of any student(s) having the second lowest grade in Physics; if there are multiple students, order their names alphabetically and print each one on a new line.
This is my code:
list = []
for _ in range(int(input())):
    name = input()
    score = float(input())
    new = [name, score]
    list.append(new)
def snd_highest(val):
    return val[1]
list.sort(key = snd_highest)
list.sort()
value = list[1]
grade = value[1]
for a,b in list:
    if b == grade:
        print (a)
This is the test case:
4
Rachel
-50
Mawer
-50
Sheen
-50
Shaheen
51
The expected output is Shaheen, but I got the other three.
Please explain.
To find the second lowest grade, you have only sorted your list in ascending order and taken the second item in it:
value = list[1]
grade = value[1]
For your test case, the list after both sorts (the final list.sort() orders it by name) is:
[['Mawer', -50.0], ['Rachel', -50.0], ['Shaheen', 51.0], ['Sheen', -50.0]]
So value = list[1] picks ['Rachel', -50.0].
The rest of your program takes the grade from this value (-50.0) and prints every name with that grade, which is why you get the other three students. The program assumes that the second lowest grade is always at index 1 of the list, which fails whenever several students share the lowest grade. You need logic that finds the lowest grade first and then the lowest grade among the remaining students.
Try doing this:
if __name__ == '__main__':
    students = []
    for _ in range(int(input())):
        name = input()
        score = float(input())
        new = [name, score]
        students.append(new)
    def removeMinimum(oldlist):
        # drop every student who has the lowest grade
        oldlist = sorted(oldlist, key=lambda x: x[1])
        min_ = min(oldlist, key=lambda x: x[1])
        newlist = []
        for a in range(0, len(oldlist)):
            if min_[1] != oldlist[a][1]:
                newlist.append(oldlist[a])
        return newlist
    students = removeMinimum(students)
    # find the second minimum value
    min_ = min(students, key=lambda x: x[1])
    # sort in alphabetical order
    students = sorted(students, key=lambda x: x[0])
    for a in range(0, len(students)):
        if min_[1] == students[a][1]:
            print(students[a][0])
I hope this may help you to pass all your test cases. Thank you.
# These functions will be used for sorting
def getSecond(ele):
    return ele[1]
def getFirst(ele):
    return ele[0]
studendList = []
sortedList = []
secondLowestStudents = []
# Read input from STDIN and save it in a nested list [["stud1", <score>], ["stud2", <score>]]
for _ in range(int(input())):
    name = input()
    score = float(input())
    studendList.append([name, score])
# Sort the list by score and save the distinct scores in sortedList (duplicates are skipped by "if x[1] not in sortedList")
studendList.sort(key=getSecond)
[sortedList.append(x[1]) for x in studendList if x[1] not in sortedList]
# Get the second lowest grade
secondLowest = sortedList[1]
# Now sort the original list by name and fetch the students having the secondLowest grade
studendList.sort(key=getFirst)
[secondLowestStudents.append(x[0]) for x in studendList if x[1] == secondLowest]
# Print the names of the students having the second lowest grade
for st in secondLowestStudents:
    print(st)
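Both answers come down to the same idea: take the distinct grades, pick the second smallest, then print the matching names in alphabetical order. A compact sketch of that idea is below; second_lowest_names is a hypothetical helper name, and the records are the failing test case from the question rather than input read from STDIN.

```python
def second_lowest_names(records):
    """Return the names with the second lowest grade, in alphabetical order."""
    grades = sorted({score for _, score in records})   # distinct grades, ascending
    if len(grades) < 2:
        return []                                      # no second lowest grade exists
    second = grades[1]
    return sorted(name for name, score in records if score == second)

# The failing test case from the question.
records = [['Rachel', -50.0], ['Mawer', -50.0], ['Sheen', -50.0], ['Shaheen', 51.0]]
for name in second_lowest_names(records):
    print(name)
```

Using a set removes the duplicate lowest grades, which is exactly the case that list[1] in the question's code gets wrong.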

Python: what to fix in the following code to make it run?

I have the following code, which raises an error whose cause I cannot identify. The code loads a .json file that holds words and their meanings, and finds the exact or nearest match for the word the user enters, along with its meanings. The code was running fine until I tried to modify it a little: I wanted to also match words whose first letter is capitalized, by changing the following line, after which it started throwing an exception:
Changed line:
if (word != "") and ((word in data.keys()) or (word.capitalize() in data.keys())):
Code:
import json
import difflib
def searchWord(word):
    if (word != "") and ((word in data.keys()) or (word.capitalize() in data.keys())):
        return data[word]
    else:
        closematch = difflib.get_close_matches(word,data.keys())[0]
        confirmation = (input(f"\nDid you mean: {closematch} (y/n): ")).lower()
        if confirmation == 'y':
            return data[closematch]
        else:
            return 'Word Not Found in Dictionary'
print('Loading Data...\n')
data = json.load(open('data.json'))
print('Data Loaded!\n')
word = (input('Enter word to lookup in dictionary: ')).lower()
meanings = searchWord(word)
if meanings == list:
    for meaning in meanings:
        print("\n"+meaning)
else:
    print(meanings[0])
Error:
Loading Data...
Data Loaded!
Enter word to lookup in dictionary: delhi
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
E:\Learning\Python\AdvancedPython\PythonMegaCourse\DictionaryApp\dictionary.py in <module>()
20 word = (input('Enter word to lookup in dictionary: ')).lower()
21
---> 22 meanings = searchWord(word)
23 if meanings == list:
24 for meaning in meanings:
E:\Learning\Python\AdvancedPython\PythonMegaCourse\DictionaryApp\dictionary.py in searchWord(word)
4 def searchWord(word):
5 if (word != "") and ((word in data.keys()) or (word.capitalize() in data.keys())):
----> 6 return data[word]
7 else:
8 closematch = difflib.get_close_matches(word,data.keys())[0]
KeyError: 'delhi'
The .json file has a key named Delhi; however, capitalize() doesn't seem to work.
Your condition checks word.capitalize(), but when you then access the dictionary you still use the uncapitalized word, so data['delhi'] raises the KeyError.
This is not the cleanest way to handle it, but it gives you the idea:
if (word != "") and (word in data.keys()):
    return data[word]
if (word != "") and (word.capitalize() in data.keys()):
    return data[word.capitalize()]
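A slightly tidier variant of the same idea is to look for the key that actually exists and return that key, so the lookup always uses the spelling found in the dictionary. The sketch below is only an illustration: find_key and suggest are hypothetical helper names, and the small data dict stands in for the contents of data.json. It also guards the difflib call, since get_close_matches returns an empty list when nothing is close enough and indexing it with [0] would then raise IndexError.

```python
import difflib

# Hypothetical stand-in for the contents of data.json.
data = {"Delhi": ["The capital of India."], "rain": ["Water falling from clouds."]}

def find_key(word, data):
    """Return the key in data that matches word in some casing, or None."""
    for candidate in (word, word.capitalize(), word.upper()):
        if candidate in data:
            return candidate
    return None

def suggest(word, data):
    """Return the closest key to word, or None when nothing is close enough."""
    matches = difflib.get_close_matches(word, list(data.keys()), n=1)
    return matches[0] if matches else None

key = find_key("delhi", data)
meanings = data[key] if key is not None else None
# isinstance is the usual way to test for a list; "meanings == list" compares with the type itself
if isinstance(meanings, list):
    for meaning in meanings:
        print(meaning)
```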

Getting error in list assignment about index?

I recently started studying Python, and when I tried to run some code from a book (with my modifications) I got the error:
IndexError: list assignment index out of range
in: `Names[len(Names)]=name`
I read some questions about this error on the web but can't figure it out.
Names=[]
num=0
name=''
while True :
    print('Enter the name of person '+str(len(Names)+1) + '(or Enter nothing to stop)')
    name=input()
    if name == '' :
        break
    Names[len(Names)]=name
print('the person names are:')
for num in range(len(Names)+1) :
    print(' '+Names[num])
Looks like you want to append something to an existing list. Why not use .append()? This won't give you the IndexError.
Names.append(name)
The same kind of error occurs in your print loop: you shouldn't write range(len(Names) + 1), because the last index it produces, len(Names), does not exist. range(len(Names)) is enough to iterate through the whole list:
for num in range(len(Names)):
    print(' '+Names[num])
Another suggestion: you don't need the for loop to print the result at all. Just use str.join(), which prints all the names on one line:
print(' '.join(Names))
You cannot assign to an index that is out of range.
Example:
>>> l = [1,2,3]
>>> l = [0,1,2]
>>> l[3] = "New"
Traceback (most recent call last):
File "<pyshell#2>", line 1, in <module>
l[3] = "New"
IndexError: list assignment index out of range
Instead, you have to append the new data to the list.
>>> l.append("new")
>>> l
[0, 1, 2, 'new']
You can try:
Names=[]
num=0
name=''
while True :
    print('Enter the name of person '+str(len(Names)+1) + '(or Enter nothing to stop)')
    name=input()
    if name == '' :
        break
    Names.append(name)
print('the person names are:')
for num in range(len(Names)) :
    print(' '+Names[num])
Just use the append method to add the new name to the existing list of names.
Syntax:
If you want to append 'foo' to the existing list 'Names', type Names.append('foo').
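Putting the suggestions together, the loop can be written without any index arithmetic at all. The sketch below is one possible shape, not the book's version: read_names is a hypothetical helper, and canned answers stand in for keyboard input so that it runs unattended. It relies on the two-argument form of iter(), which keeps calling a function until it returns the sentinel value.

```python
def read_names(read_line=input):
    """Collect names until an empty line is read; read_line defaults to input()."""
    names = []
    # iter(callable, sentinel) keeps calling read_line() until it returns ''
    for name in iter(read_line, ''):
        names.append(name)      # append grows the list; names[len(names)] = name would fail
    return names

# Canned answers stand in for keyboard input so the sketch runs unattended.
canned = iter(['Ann', 'Bob', ''])
names = read_names(lambda: next(canned))

print('the person names are:')
for name in names:              # iterate over the items directly, no index needed
    print(' ' + name)
```

Calling read_names() with no argument should read from the keyboard in the same way as the original loop, minus the prompt text.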

Specific Fields Python3

I am trying to select specific fields from my Qdata.txt file and use fields[2] to calculate the average for each year separately. My code gives only the total average.
The data file looks like this (the first day of each year is 0101 and the last is 1231):
Date 3700300 6701500
20000101 21.00 223.00
20000102 20.00 218.00
. .
20001231 7.40 104.00
20010101 6.70 104.00
. .
20130101 8.37 111.63
. .
20131231 45.00 120.98
import sys
td=open("Qdata.txt","r") # open file Qdata
total=0
count=0
row1=True
for row in td :
    if (row1) :
        row1=False # row1 is for topic
    else:
        fields=row.split()
        try:
            total=total+float(fields[2])
            count=count+1
        # Errors.
        except IndexError:
            continue
        except ValueError:
            print("File is incorrect.")
            sys.exit()
print("Average in 2000 was: ",total/count)
You could use itertools.groupby using the first four characters as the key for grouping.
import itertools
with open("data.txt") as f:
    next(f) # skip first line
    groups = itertools.groupby(f, key=lambda s: s[:4])
    for k, g in groups:
        print(k, [s.split() for s in g])
This gives you the entries grouped by year, for further processing.
Output for your example data:
2000 [['20000101', '21.00', '223.00'], ['20000102', '20.00', '218.00'], ['20001231', '7.40', '104.00']]
2001 [['20010101', '6.70', '104.00']]
2013 [['20130101', '8.37', '111.63'], ['20131231', '45.00', '120.98']]
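As one way to take the grouped rows further, the sketch below turns each group into a yearly average of the third field. It is only an illustration: yearly_averages is a hypothetical helper, the rows are the sample lines from the question held in a list instead of a file, and it assumes the rows are already ordered by date, because groupby only merges neighbouring rows with the same key.

```python
import itertools

# Sample rows in the same layout as Qdata.txt (header first); values from the question.
lines = [
    "Date 3700300 6701500",
    "20000101 21.00 223.00",
    "20000102 20.00 218.00",
    "20001231 7.40 104.00",
    "20010101 6.70 104.00",
    "20130101 8.37 111.63",
    "20131231 45.00 120.98",
]

def yearly_averages(rows):
    """Average the third field per year; rows must already be ordered by date."""
    rows = iter(rows)
    next(rows)                                   # skip the header line
    averages = {}
    for year, group in itertools.groupby(rows, key=lambda s: s[:4]):
        parts = (s.split() for s in group)
        values = [float(p[2]) for p in parts if len(p) >= 3]   # ignore short rows
        if values:
            averages[year] = sum(values) / len(values)
    return averages

for year, avg in yearly_averages(lines).items():
    print(f"Average in {year} was: {avg:.2f}")
```

Passing an open file object instead of the list should work the same way, since the function only iterates over the lines.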
You could create a dict (or even a defaultdict) for total and count instead:
import sys
from collections import defaultdict
td=open("Qdata.txt","r") # open file Qdata
total=defaultdict(float)
count=defaultdict(int)
row1=True
for row in td :
    if (row1) :
        row1=False # row1 is for topic
    else:
        fields=row.split()
        try:
            year = int(fields[0][:4])
            total[year] += float(fields[2])
            count[year] += 1
        # Errors.
        except IndexError:
            continue
        except ValueError:
            print("File is incorrect.")
            sys.exit()
print("Average in 2000 was: ",total[2000]/count[2000])
Every year separately? You have to divide your input into groups. Something like this might be what you want:
from collections import defaultdict
td = open("Qdata.txt", "r")
row1 = True
year_sums = defaultdict(list)
for row in td:
    if row1:
        row1 = False
        continue
    fields = row.split()
    year = fields[0][:4]
    year_sums[year].append(float(fields[2]))
for year in year_sums:
    average = sum(year_sums[year])/len(year_sums[year])
    print("Average in {} was: {}".format(year, average))
That is just some example code; I have not run it against your file, but it should give you an idea of what you can do. year_sums is a defaultdict containing lists of values grouped by year. You can then use it for other statistics if you want.
