How do I get my Python function to apply IF-ELIF-ELSE statements correctly to all rows in my pandas dataframe? - python-3.x

I am trying to calculate the CGPA of a number of students. The idea here is that each student takes N courses (in this case, N = 3). Every course has its course load which is an integer and can range from 1 to 6. At the end of the semester, the CGPA is calculated based on the unit load of all the courses taken by each student and the grades obtained.
I am trying to do this using a for statement to loop through the entire dataset a row at a time, and then an if suite to determine the number of units to assign to each student according to the grade scored. The problem is that the code runs but does not evaluate each row separately. So if the first student in the dataframe had an A in course1, the code gives him 15 units, and all other students also get 15 units regardless of whether they scored a D or an F.
I really want to know what I am doing wrong and how I can fix it. I would also appreciate it if you can suggest smarter ways of accomplishing this task. Thanks.
I have added breaks in the first course section but I am afraid the code is still not generalizing well.
A = 5; B = 4; C = 3; D = 2; E = 1; F = 0;
course1_cl = 3; course2_cl = 3; course3_cl = 3
def calculate_CGPA(dataframe, a, b, c, d):
    for row in dataframe[d]:
        if dataframe[a].any()=='A':
            dataframe['units'] = A * course1_cl
            break
        elif dataframe[a].any()=='B':
            dataframe['units'] = B * course1_cl
            break
        elif dataframe[a].any()=='C':
            dataframe['units'] = C * course1_cl
            break
        elif dataframe[a].any()=='D':
            dataframe['units'] = D * course1_cl
            break
        elif dataframe[a].any()=='E':
            dataframe[units] = E * course1_cl
        else:
            dataframe[units]= 0
    print("Done generating units for: "+ format(a))
    for row in dataframe[d]:
        if dataframe[b].any()=='A':
            dataframe['units2']=A * course2_cl
        elif dataframe[b].any()=='B':
            dataframe['units2'] = B*course2_cl
        elif dataframe[b].any()=='C':
            dataframe['units2'] = C*course2_cl
        elif dataframe[b].any()=='D':
            dataframe['units2'] = D*course2_cl
        elif dataframe[b].any()=='E':
            dataframe['units2'] = E*course2_cl
        else:
            dataframe['units2'] = 0
    print("Done generating units for: "+format(b))
    for row in dataframe[d]:
        if dataframe[c].any()=='A':
            dataframe['units3']= A * course3_cl
        elif dataframe[c].any()=='B':
            dataframe['units3'] = B*course3_cl
        elif dataframe[c].any()=='C':
            dataframe['units3'] = C*course3_cl
        elif dataframe[c].any()=='D':
            dataframe['units3'] = D*course3_cl
        elif dataframe[c].any()=='E':
            dataframe['units3'] = E*course3_cl
        else:
            dataframe['units3'] = 0
    print("Done generating units for: "+format(c))
    df['CGPA'] = (dataframe['units'] + dataframe['units2'] + dataframe['units3'])/(course1_cl + course2_cl + course3_cl)
The resulting dataframe should have 4 newly added columns: One units column for each of the three courses and a CGPA column as seen below. The values in the units and CGPA columns should change dynamically based on the grades scored by the individual.
S/N,Name,ExamNo,Course1,Course2,Course3,Units,Units2,Units3,CGPA
1,Mary Beth,A1,A,A,B,15,15,12,4.67
2,Elizabeth Fowler,A2,B,A,A,12,15,15,4.67
3,Bright Thompson,A12,C,C,B,9,9,12,3.33
4,Jack Daniels,A24,C,E,C,9,3,9,2.33
5,Ciroc Brute,A31,A,B,C,15,12,9,4.0

I do not know how complicated your actual data is, but for your sample data you do not need the if statements:
import pandas as pd
from io import StringIO
# sample data
s = """S/N,Name,ExamNo,Course1,Course2,Course3
1,Mary Beth,A1,A,A,B
2,Elizabeth Fowler,A2,B,A,A
3,Bright Thompson,A12,C,C,B
4,Jack Daniels,A24,C,E,C
5,Ciroc Brute,A31,A,B,C"""
df = pd.read_csv(StringIO(s))
# create a dict
d = {'A':5, 'B':4, 'C':3, 'D':2, 'E':1, 'F':0}
# replace the letter grade with number and assign it to units cols
df[['Units', 'Units2', 'Units3']] = df[['Course1','Course2','Course3']].replace(d) * 3
# calc CGPA with sum div 3
df['CGPA'] = df[['Course1','Course2','Course3']].replace(d).sum(1) / 3
   S/N              Name ExamNo Course1 Course2 Course3  Units  Units2  Units3      CGPA
0    1         Mary Beth     A1       A       A       B     15      15      12  4.666667
1    2  Elizabeth Fowler     A2       B       A       A     12      15      15  4.666667
2    3   Bright Thompson    A12       C       C       B      9       9      12  3.333333
3    4      Jack Daniels    A24       C       E       C      9       3       9  2.333333
4    5       Ciroc Brute    A31       A       B       C     15      12       9  4.000000
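The question mentions that course loads can range from 1 to 6, while the answer above relies on every course having a load of 3. A minimal sketch of how the same dictionary lookup might be extended to unequal loads, using Series.map. The course_loads values (3, 2, 4) and the Units_ column names are hypothetical and not from the original post:

```python
import pandas as pd

# Grade points, as in the question.
grade_points = {'A': 5, 'B': 4, 'C': 3, 'D': 2, 'E': 1, 'F': 0}
# Hypothetical course loads (assumption: not from the original post).
course_loads = {'Course1': 3, 'Course2': 2, 'Course3': 4}

df = pd.DataFrame({
    'Name': ['Mary Beth', 'Jack Daniels'],
    'Course1': ['A', 'C'],
    'Course2': ['A', 'E'],
    'Course3': ['B', 'C'],
})

unit_cols = []
for course, load in course_loads.items():
    col = 'Units_' + course
    # map() looks up each row's grade individually,
    # so every student gets a value based on their own grade.
    df[col] = df[course].map(grade_points) * load
    unit_cols.append(col)

# CGPA = total weighted points / total course load.
df['CGPA'] = df[unit_cols].sum(axis=1) / sum(course_loads.values())
print(df)
```

Because the lookup is done per element, no explicit loop over rows or if/elif chain is needed.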

Related

How to fill a matrix with equal sum of rows and columns?

I have an N x N matrix with integer elements.
There are two inputs: n and k.
There are two conditions for solving this problem:
1- The sum of each row and each column of the matrix should be equal to k.
2- The difference between the max and min numbers in the matrix should be minimal.
I wrote some code in Python, but it doesn't work well.
n , k = map(int,input().split())
matrix = [[k//n]*n for i in range(n)]
def row_sum(matrix,row):
    return sum(matrix[row])
def col_sum(matrix,col):
    res = 0
    for i in matrix:
        res += i[col]
    return res
for i in range(n):
    for j in range(n):
        if (row_sum(matrix,i) != k) and (col_sum(matrix, j) != k):
            matrix[i][j] += 1
for i in matrix:
    print(*i)
For example, for a 5x5 matrix whose rows and columns should each sum to 6:
input : 5 6
output :
2 1 1 1 1
1 2 1 1 1
1 1 2 1 1
1 1 1 2 1
1 1 1 1 2
But it gives the wrong result in this case:
input : 6 11
output:
2 2 2 2 2 1
2 2 2 2 2 1
2 2 2 2 2 1
2 2 2 2 2 1
2 2 2 2 2 1
1 1 1 1 1 2
I spent a lot of time on this and I can't solve it. Please help!
(This problem is not a homework or something like that. It's a question from an algorithm contest and the contest is over!)
The solution is to work out the first row (using the code you already have), and then set each row to be the row above it rotated one position.
So for example if the first row has the values
a b c d e
then you rotate one position each row to get
a b c d e
b c d e a
c d e a b
d e a b c
e a b c d
Since each value gets placed in each column once the columns will contain one of each value and so add up to the same total, and since each row has the same values just moved around all the rows add up the same too.
Code:
n , k = map(int,input().split())
matrix = [[k//n]*n for i in range(n)]
def row_sum(matrix,row):
    return sum(matrix[row])
for j in range(n):
    if (row_sum(matrix,0) != k):
        matrix[0][j] += 1
for i in range(1, n):
    for j in range(n):
        matrix[i][j] = matrix[i-1][(j+1)%n]
for i in matrix:
    print(*i)
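The rotation idea can also be written as a small function that is easy to check. This is only a sketch under the assumption that n is a positive integer; the name build_matrix is made up for illustration. It builds the first row with divmod and then rotates it left by one position per row, as described above:

```python
def build_matrix(n, k):
    """Return an n x n matrix whose rows and columns each sum to k."""
    base, extra = divmod(k, n)
    # First row: `extra` cells get base + 1, the remaining cells get base.
    first = [base + 1] * extra + [base] * (n - extra)
    # Row i is the first row rotated left by i positions.
    return [first[i:] + first[:i] for i in range(n)]

matrix = build_matrix(6, 11)
for row in matrix:
    print(*row)

# Check every row sum and every column sum.
assert all(sum(row) == 11 for row in matrix)
assert all(sum(col) == 11 for col in zip(*matrix))
```

Since the entries only take the values base and base + 1, the difference between the largest and smallest entry is at most 1.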

Creating two new columns from a function that returns two values with dataframe apply

I have a DataFrame:
num
1
2
3
def foo(x):
    return x**2, x**3
When I run df['sq','cube'] = df['num'].apply(foo), it creates a single column like below:
num (sq,cub)
1 (1,1)
2 (4,8)
3 (9,27)
I want these columns separated, each with its own values:
num sq cub
1 1 1
2 4 8
3 9 27
How can I achieve this...?
obj = df['num'].apply(foo)
df['sq'] = obj.str[0]
df['cube'] = obj.str[1]
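Another way to split the tuples, shown here only as a sketch, is to turn the list of tuples into its own DataFrame and join it back. The intermediate name parts is made up for illustration:

```python
import pandas as pd

def foo(x):
    return x**2, x**3

df = pd.DataFrame({'num': [1, 2, 3]})

# apply() returns a Series of tuples; tolist() gives a list of tuples,
# which the DataFrame constructor splits into one column per element.
parts = pd.DataFrame(df['num'].apply(foo).tolist(),
                     columns=['sq', 'cube'],
                     index=df.index)
df = df.join(parts)
print(df)
```

Passing index=df.index keeps the new columns aligned with the original rows even if the dataframe does not use the default 0..n-1 index.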

How to generate pyramid of numbers (using only 1-3) using Python?

I'm wondering how to create a pyramid using only the elements (1,2,3), regardless of how many rows there are.
For example, with Rows = 7:
1
22
333
1111
22222
333333
1111111
I have tried creating a normal pyramid with numbers according to rows,
e.g.
1
22
333
4444
55555
666666
Code that I tried for a normal pyramid:
n = int(input("Enter the number of rows:"))
for rows in range (1, n+1):
    for times in range (rows):
        print(rows, end=" ")
    print("\n")
You need to adjust your ranges and use the modulo operator %, which gives you the remainder of a number divided by another number. Modulo 3 returns 0, 1 or 2. Add 1 to get your desired range of values:
1 % 3 = 1
2 % 3 = 2 # 2 "remain" as 2 // 3 = 0 - so remainder is: 2 - (2//3)*3 = 2 - 0 = 2
3 % 3 = 0 # no remainder, as 3 // 3 = 1 - so remainder is: 3 - (3//3)*3 = 3 - 1*3 = 0
Full code:
n = int(input("Enter the number of rows: "))
print()
for rows in range (0, n): # start at 0
    for times in range (rows+1): # start at 0
        print( rows % 3 + 1, end=" ") # print 0 % 3 +1 , 1 % 3 +1, ..., etc.
    print("")
Output:
Enter the number of rows: 6
1
2 2
3 3 3
1 1 1 1
2 2 2 2 2
3 3 3 3 3 3
See:
Modulo operator in Python
What is the result of % in Python?
binary-arithmetic-operations
A one-liner (just for the record):
>>> n = 7
>>> s = "\n".join(["".join([str(1+i%3)]*(1+i)) for i in range(n)])
>>> s
'1\n22\n333\n1111\n22222\n333333\n1111111'
>>> print(s)
1
22
333
1111
22222
333333
1111111
Nothing special: you have to use the modulo operator to cycle the values.
"".join([str(1+i%3)]*(1+i)) builds the (i+1)-th line: i+1 times 1+i%3 (that is 1 if i=0, 2 if i=1, 3 if i=2, 1 if i=3, ...).
Repeat for i=0..n-1 and join with an end-of-line character.
Using cycle from itertools, which returns an endless iterator over the given values.
from itertools import cycle
n = int(input("Enter the number of rows:"))
a = cycle((1,2,3))
for x,y in zip(range(1,n+1),a):
    print(str(y)*x)
(update) Rewritten as a two-liner
from itertools import cycle
n = int(input("Enter the number of rows:"))
print(*[str(y)*x for x,y in zip(range(1,n+1),cycle((1,2,3)))],sep="\n")

How to use pandas df column value in if-else expression to calculate additional columns

I am trying to calculate additional metrics from existing pandas dataframe by using an if/else condition on existing column values.
if(df['Sell_Ind']=='N').any():
    df['MarketValue'] = df.apply(lambda row: row.SharesUnits * row.CurrentPrice, axis=1).astype(float).round(2)
elif(df['Sell_Ind']=='Y').any():
    df['MarketValue'] = df.apply(lambda row: row.SharesUnits * row.Sold_price, axis=1).astype(float).round(2)
else:
    df['MarketValue'] = df.apply(lambda row: 0)
For the if condition the MarketValue is calculated correctly, but for the elif condition it's not giving the correct value.
Can anyone point out what I am doing wrong in this code?
I think you need numpy.select; apply can be removed, and the columns can be multiplied with mul:
import numpy as np
m1 = df['Sell_Ind']=='N'
m2 = df['Sell_Ind']=='Y'
a = df.SharesUnits.mul(df.CurrentPrice).astype(float).round(2)
b = df.SharesUnits.mul(df.Sold_price).astype(float).round(2)
df['MarketValue'] = np.select([m1, m2], [a,b], default=0)
Sample:
import numpy as np
import pandas as pd
df = pd.DataFrame({'Sold_price':[7,8,9,4,2,3],
                   'SharesUnits':[1,3,5,7,1,0],
                   'CurrentPrice':[5,3,6,9,2,4],
                   'Sell_Ind':list('NNYYTT')})
#print (df)
m1 = df['Sell_Ind']=='N'
m2 = df['Sell_Ind']=='Y'
a = df.SharesUnits.mul(df.CurrentPrice).astype(float).round(2)
b = df.SharesUnits.mul(df.Sold_price).astype(float).round(2)
df['MarketValue'] = np.select([m1, m2], [a,b], default=0)
print (df)
CurrentPrice Sell_Ind SharesUnits Sold_price MarketValue
0 5 N 1 7 5.0
1 3 N 3 8 9.0
2 6 Y 5 9 45.0
3 9 Y 7 4 28.0
4 2 T 1 2 0.0
5 4 T 0 3 0.0
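The reason the original if/elif does not work per row is that .any() reduces the whole comparison to a single True/False, so the first branch runs for the entire dataframe whenever at least one row has 'N'. A small sketch of the difference, reusing the Sell_Ind values from the sample above:

```python
import pandas as pd

df = pd.DataFrame({'Sell_Ind': list('NNYYTT')})

# .any() collapses the whole column into one truth value.
has_n = (df['Sell_Ind'] == 'N').any()
print(has_n)

# The comparison itself keeps one boolean per row;
# np.select uses such masks to choose a value row by row.
mask = df['Sell_Ind'] == 'N'
print(mask.tolist())
```

A plain if statement can only branch once on a single value, which is why the row-wise masks are passed to np.select instead.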

python3 modifying rows in a dataframe based on a condition

I have a dataframe like this:
A B C
1 4 x
2 8 y
3 7 z
4 12 y
5 10 b
I need to modify column B based on conditions like these:
if B <= 5 then B = 1
if B > 5 and B <= 10 then B = 2
if B > 10 and B < 15 then B = 3
so that my dataframe becomes:
A B C
1 1 x
2 2 y
3 2 z
4 3 y
5 2 b
I am okay if I have to add a new column first and then drop column B. Could anyone help, please?
You can use the apply function to implement this. First define a function that maps a row to its new value:
def check(row):
    if (row['B']) <= 5:
        return 1
    elif (row['B'] > 5) and (row['B'] <= 10):
        return 2
    elif (row['B'] > 10) and (row['B'] <= 15):
        return 3
Then apply the function to each row (axis = 1) so the checks are performed row by row:
df['B'] = df.apply(check, axis = 1)
Then the resulting DF would look like:
A B C
1 1 x
2 2 y
3 2 z
4 3 y
5 2 b
More documentation available here.
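For range-based conditions like these, pandas.cut is a possible vectorized alternative to apply. The sketch below assumes the upper bound of 15 is inclusive, as in the check function above (the question itself writes B < 15), and that no value of B exceeds 15; a larger value would become NaN and the final astype(int) would fail:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4, 5],
                   'B': [4, 8, 7, 12, 10],
                   'C': list('xyzyb')})

# Bins are right-inclusive by default: (-inf, 5], (5, 10], (10, 15].
bins = [float('-inf'), 5, 10, 15]
df['B'] = pd.cut(df['B'], bins=bins, labels=[1, 2, 3]).astype(int)
print(df)
```

pd.cut returns a categorical column, so the astype(int) call converts the labels back to plain integers.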
