Python adding column with condition not working as expected - python-3.x

I am trying to add a column based on a condition like
for i in range(len(data) - 1):
    a = data.loc[i, 'column1']
    b = data.loc[i+1, 'column1']
    if a == b:
        data['column2'] = 1
    else:
        data['column2'] = 0
This raises no error and the new column is created. However, the logic does not work: the new column is filled with 0 regardless of the condition. I have checked a == b by printing a, b, and a == b, and the results are as I expected. I am not sure why that is.
Note that I am using if statements because my actual logic involves if/and and else-if conditions.

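Inside the if statement, it should be data.loc[i, 'column2'] = 1 (and = 0 in the else branch). That way the value is written only at that location. data['column2'] = 1 assigns to the entire column on every iteration, so the column ends up holding whatever the last comparison produced.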
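As a minimal sketch of that fix, assuming pandas is installed and the frame has the default integer index: the sample values in column1 are made up, and column2 is pre-filled with 0 so the last row (which has no successor) also gets a value.

```python
import pandas as pd

# Made-up sample data; only the column names come from the question.
data = pd.DataFrame({"column1": [5, 5, 7, 7, 7, 2]})
data["column2"] = 0  # default for every row, including the last one

for i in range(len(data) - 1):
    a = data.loc[i, "column1"]
    b = data.loc[i + 1, "column1"]
    if a == b:
        data.loc[i, "column2"] = 1  # write to row i only
    else:
        data.loc[i, "column2"] = 0

column2 = data["column2"].tolist()
print(column2)
```

Each row now gets its own value instead of the whole column being overwritten on every pass.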

Related

Why does this fibonacci sequence work? Can someone help me?

Python beginner problem
(I am a beginner programmer)
So I was making a simple Fibonacci sequence generator and I made a working version, but I'm confused about how it works. In the code (lines 8-9), the first number (0) is being made the new value of the second value (1). But then that should make all the other numbers 0 as well, but it seems like the defining process is backward. Generally, the new value is on the left of the equal sign and the old value is on the right. But then that means everything should turn to 0. But actually, every number turns to 0 if I try to re-define the variables in the regular way (b = a; c = b). Why is this? I've attached my code at the bottom.
def seq_loop():
    a = 0
    b = 1
    for i in range(15):
        print(a)
        c = a + b
        a = b
        b = c
print(seq_loop())
Have a good look at your code. Walk through it, and write down a, b and c in each iteration (or, if you know how to use a debugger, set a breakpoint in the loop to verify your variables' values). You'll see that your thinking is not right: in your first iteration, c = a + b has already set c to 0 + 1 = 1, so a = b is equivalent to a = 1 and b = c is equivalent to b = 1.
I am not sure that your interpretation of assignments is correct: An assignment goes right to left. The value (of the variable) on the right side of = is assigned to the variable on the left side of =
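To see why the order of the assignments matters, here is a minimal sketch that runs both variants. The function names working_order and reversed_order are made up for this illustration; the second one uses the b = a; c = b ordering mentioned in the question.

```python
def working_order(iterations):
    """Collect the values the original loop prints (a = b, then b = c)."""
    a, b = 0, 1
    printed = []
    for _ in range(iterations):
        printed.append(a)
        c = a + b  # uses the old a and the old b
        a = b      # a takes the old b
        b = c      # b takes the sum computed above
    return printed


def reversed_order(iterations):
    """Same loop, but with the assignments written as b = a; c = b."""
    a, b = 0, 1
    printed = []
    for _ in range(iterations):
        printed.append(a)
        c = a + b
        b = a      # b is overwritten with a, which is still 0
        c = b      # the sum is thrown away; a is never updated
    return printed


working = working_order(6)
broken = reversed_order(6)
print(working)
print(broken)
```

In the reversed version a is never assigned inside the loop, so it stays 0 and every printed value is 0.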
I don't really understand what you are asking, but a good way to understand code is to simply execute it by hand with pen&paper and keep a list of all the variables in scope and their current values. So, let's do that now.
In the first line, we define and initialize the variable a to the value 0. Our variables in scope and their values are { a = 0 }.
In the second line, we define and initialize the variable b to the value 1. Our variables in scope and their values are { a = 0; b = 1 }.
In the first line of the loop, we print the current value of the variable a, which we can look up in our variable list is 0. We haven't assigned any variables, so our variables in scope are still unchanged: { a = 0; b = 1 }. And the console looks like this:
0
In the second line of the loop, we define and initialize the variable c to the result of evaluating the expression a + b. We have to dereference the two variables, i.e. look up their values in our list, and their values are 0 and 1. 0 + 1 is 1, which means we initialize c to 1. Our variables in scope and their values are { a = 0; b = 1; c = 1 }.
In the third line of the loop, we assign the variable a. The value we assign to the variable a is the current value of the variable b. So, we look at our variable list, and we see that the current value of b is 1, which means we assign 1 to a. Our variables in scope and their values are { a = 1; b = 1; c = 1 }.
In the fourth line of the loop, we assign the variable b. The value we assign to the variable b is the current value of the variable c. So, we look at our variable list, and we see that the current value of c is 1, which means we assign 1 to b. Our variables in scope and their values are { a = 1; b = 1; c = 1 }.
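After the fourth line of the loop, the variable c goes out of scope for our purposes: Python technically keeps c around until the function returns, but its value is overwritten before it is read again, so we can drop it from our list. Our variables in scope and their values are now { a = 1; b = 1 }.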
As you can see, we started the first iteration of the loop with { a = 0; b = 1 } but after the first iteration of the loop, we have { a = 1; b = 1 }, which is what we start the second iteration of the loop with. It is important that something has changed here, otherwise each iteration of the loop would do the same thing, and we would always get the same result.
So, let's look at the second iteration of the loop.
In the first line of the loop, we print the current value of the variable a, which we can look up in our variable list is 1. We haven't assigned any variables, so our variables in scope are still unchanged: { a = 1; b = 1 }. And the console looks like this:
0
1
In the second line of the loop, we define and initialize the variable c to the result of evaluating the expression a + b. We look up their values in our list, and their values are 1 and 1. 1 + 1 is 2, which means we initialize c to 2. Our variables in scope and their values are { a = 1; b = 1; c = 2 }.
In the third line of the loop, we assign the variable a. The value we assign to the variable a is the current value of the variable b. So, we look at our variable list, and we see that the current value of b is 1, which means we assign 1 to a. Our variables in scope and their values are { a = 1; b = 1; c = 2 }.
In the fourth line of the loop, we assign the variable b. The value we assign to the variable b is the current value of the variable c. So, we look at our variable list, and we see that the current value of c is 2, which means we assign 2 to b. Our variables in scope and their values are { a = 1; b = 2; c = 2 }.
After the fourth line of the loop, the variable c goes out of scope. Our variables in scope and their values are now { a = 1; b = 2 }.
Now for the third iteration.
In the first line of the loop, we print the current value of the variable a, which we can look up in our variable list is 1. We haven't assigned any variables, so our variables in scope are still unchanged: { a = 1; b = 2 }. And the console looks like this:
0
1
1
In the second line of the loop, we define and initialize the variable c to the result of evaluating the expression a + b. We look up their values in our list, and their values are 1 and 2. 1 + 2 is 3, which means we initialize c to 3. Our variables in scope and their values are { a = 1; b = 2; c = 3 }.
In the third line of the loop, we assign the variable a. The value we assign to the variable a is the current value of the variable b. So, we look at our variable list, and we see that the current value of b is 2, which means we assign 2 to a. Our variables in scope and their values are { a = 2; b = 2; c = 3 }.
In the fourth line of the loop, we assign the variable b. The value we assign to the variable b is the current value of the variable c. So, we look at our variable list, and we see that the current value of c is 3, which means we assign 3 to b. Our variables in scope and their values are { a = 2; b = 3; c = 3 }.
After the fourth line of the loop, the variable c goes out of scope. Our variables in scope and their values are now { a = 2; b = 3 }.
And this is the fourth iteration of the loop.
In the first line of the loop, we print the current value of the variable a, which we can look up in our variable list is 2. We haven't assigned any variables, so our variables in scope are still unchanged: { a = 2; b = 3 }. And the console looks like this:
0
1
1
2
In the second line of the loop, we define and initialize the variable c to the result of evaluating the expression a + b. We look up their values in our list, and their values are 2 and 3. 2 + 3 is 5, which means we initialize c to 5. Our variables in scope and their values are { a = 2; b = 3; c = 5 }.
In the third line of the loop, we assign the variable a. The value we assign to the variable a is the current value of the variable b. So, we look at our variable list, and we see that the current value of b is 3, which means we assign 3 to a. Our variables in scope and their values are { a = 3; b = 3; c = 5 }.
In the fourth line of the loop, we assign the variable b. The value we assign to the variable b is the current value of the variable c. So, we look at our variable list, and we see that the current value of c is 5, which means we assign 5 to b. Our variables in scope and their values are { a = 3; b = 5; c = 5 }.
After the fourth line of the loop, the variable c goes out of scope. Our variables in scope and their values are now { a = 3; b = 5 }.
Let's do one last iteration of the loop.
In the first line of the loop, we print the current value of the variable a, which we can look up in our variable list is 3. We haven't assigned any variables, so our variables in scope are still unchanged: { a = 3; b = 5 }. And the console looks like this:
0
1
1
2
3
In the second line of the loop, we define and initialize the variable c to the result of evaluating the expression a + b. We look up their values in our list, and their values are 3 and 5. 3 + 5 is 8, which means we initialize c to 8. Our variables in scope and their values are { a = 3; b = 5; c = 8 }.
In the third line of the loop, we assign the variable a. The value we assign to the variable a is the current value of the variable b. So, we look at our variable list, and we see that the current value of b is 5, which means we assign 5 to a. Our variables in scope and their values are { a = 5; b = 5; c = 8 }.
In the fourth line of the loop, we assign the variable b. The value we assign to the variable b is the current value of the variable c. So, we look at our variable list, and we see that the current value of c is 8, which means we assign 8 to b. Our variables in scope and their values are { a = 5; b = 8; c = 8 }.
After the fourth line of the loop, the variable c goes out of scope. Our variables in scope and their values are now { a = 5; b = 8 }.
I hope it is somewhat clearer now.
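The pen-and-paper walk above can also be reproduced in code. This is only a sketch: trace_fibonacci is a hypothetical helper that records the (a, b) pair before the first iteration and after each following one.

```python
def trace_fibonacci(iterations):
    """Return the (a, b) pairs seen before and after each loop iteration."""
    a, b = 0, 1
    states = [(a, b)]  # state before the first iteration
    for _ in range(iterations):
        c = a + b
        a = b
        b = c
        states.append((a, b))  # state after this iteration
    return states


states = trace_fibonacci(5)
for step, (a, b) in enumerate(states):
    print(f"after {step} iterations: a = {a}; b = {b}")
```

The last recorded pair matches the { a = 5; b = 8 } reached by hand after five iterations.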
One thing that is very important in programming is naming. Good names should be intention-revealing. In this case, none of the names are very good: what does the name a tell you about what the intent of the programmer is? The same goes for b, c, and i. seq_loop is also not very intention-revealing, i.e. what does this function actually do? It prints the Fibonacci sequence. How can I tell from the name that it prints the Fibonacci sequence? Well, I simply can't!
So, here is the code with some better names, which should clear up some confusion:
def print_fibonacci_sequence():
    previous = 0
    current = 1
    for _ in range(15):
        print(previous)
        after = previous + current
        previous = current
        current = after
print(print_fibonacci_sequence())
You might ask yourself "How is _ a more intention-revealing name than i?" The reason is that _ has a well-known meaning in the Python community: it is used as the name for a variable that is being ignored. Which is exactly what we are doing in this case.
Also, why after and not next? next is already defined in Python and it is considered bad style to shadow or even worse redefine Python builtins.
There are a couple of other oddities in the code. For example, the function prints the elements of the Fibonacci sequence and it doesn't return anything. And then you call the function and print the result of the call … but there is no result because the function doesn't return anything! Renaming the function to have print in its name so that it is clear that it doesn't return anything but prints it, makes that mistake more obvious:
print(print_fibonacci_sequence())
You can immediately see that you print something which prints something, which makes no sense. It should just be
print_fibonacci_sequence()
Another oddity is that the function always prints the first 15 terms of the Fibonacci sequence. Usually, you would want to let the caller decide how many terms to print. Maybe the caller only needs 3? Maybe 20? So, let's do that:
def print_fibonacci_sequence(number_of_elements):
    previous = 0
    current = 1
    for _ in range(number_of_elements):
        print(previous)
        after = previous + current
        previous = current
        current = after
print_fibonacci_sequence(15)
Speaking of letting the caller decide what to do, what if the caller doesn't want to print the Fibonacci sequence? What if the caller wants to format it as HTML or insert it into an Excel table?
You should always separate computation from input/output. So, in this case, instead of printing the Fibonacci sequence, we will return the Fibonacci sequence, and then the caller can print it if they want to:
def fibonacci_sequence(number_of_elements):
    fibonacci_sequence = []
    previous = 0
    current = 1
    for _ in range(number_of_elements):
        fibonacci_sequence.append(previous)
        after = previous + current
        previous = current
        current = after
    return fibonacci_sequence
print(fibonacci_sequence(15))
Actually, if you think about it, having the caller pass the number of terms as an argument is still somewhat restrictive. What if the caller doesn't want the first n terms, but the first terms smaller than n? In that case, they would have to compute the Fibonacci sequence first to see how many terms there are smaller than n and then ask for that number of terms of the Fibonacci sequence, which obviously defeats the purpose of having the function in the first place.
The best solution for the caller would be to produce an infinite number of terms and let the caller decide which condition to use to decide how many terms to take:
def fibonacci_sequence():
    previous = 0
    current = 1
    while True:
        yield previous
        after = previous + current
        previous = current
        current = after
Now you can use it like this:
from itertools import islice, takewhile
first_15_terms = islice(fibonacci_sequence(), 15)
terms_less_than_100 = takewhile(lambda term: term < 100, fibonacci_sequence())
And you can decide what to do with the result. For example print it:
for term in first_15_terms:
    print(term)
for term in terms_less_than_100:
    print(term)
Or you can turn it into a list:
list_of_first_15_terms = list(first_15_terms)
list_of_terms_less_than_100 = list(terms_less_than_100)
And many other things.
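One caveat with the snippets above: islice and takewhile return one-shot iterators, so the for loops and the list(...) calls are alternatives, not steps to run one after the other on the same object. A small sketch (the generator is repeated here so the example runs on its own, and the variable names are made up):

```python
from itertools import islice


def fibonacci_sequence():
    previous = 0
    current = 1
    while True:
        yield previous
        after = previous + current
        previous = current
        current = after


first_5_terms = islice(fibonacci_sequence(), 5)
first_pass = list(first_5_terms)   # consumes the iterator
second_pass = list(first_5_terms)  # nothing is left to consume
print(first_pass)
print(second_pass)
```

If the terms are needed more than once, turning them into a list first (or calling fibonacci_sequence() again) avoids the empty second pass.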
This is a general principle in programming, software engineering, and API design: not only separate input/output from computation, but also break up the computation into separate parts for producing values, consuming values, transforming values, filtering values, representing values as strings (to be output later), parsing string representations into values (coming in from the outside), and so on.
In one sentence, we could say: separate the things that change.
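As a rough illustration of that separation, the sketch below keeps producing, limiting, filtering, and formatting in separate steps. The generator is repeated so the example is self-contained; the choice of "even terms below 100" is arbitrary.

```python
from itertools import takewhile


def fibonacci_sequence():
    # Producing values: an endless stream of Fibonacci terms.
    previous, current = 0, 1
    while True:
        yield previous
        previous, current = current, previous + current


# Limiting: stop at the first term that is not below 100.
terms_less_than_100 = takewhile(lambda term: term < 100, fibonacci_sequence())

# Filtering: keep only the even terms.
even_terms = (term for term in terms_less_than_100 if term % 2 == 0)

# Representing as a string, to be output later by the caller.
formatted = ", ".join(str(term) for term in even_terms)
print(formatted)
```

Each step can be swapped out without touching the others, for example replacing the filter or the formatting while leaving the generator alone.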

How to replace a variable with nothing if certain conditions are met using a if statement (Python 3)

I'm trying to make a completing-the-square calculator. I replicated some of the lengthy code to show where I'm getting my issue:
a = 1
if (a == 1):
    print()
print("bcdf" + str(a))
Output: bcdf1
In this case, I want it to output bcdf
I genuinely looked all over the place trying to find an answer to this.
I'm not sure I understand the problem correctly, but this is probably the solution you're looking for:
a = 1
if (a == 1):
    print("bcdf")
else:
    print("bcdf" + str(a))
You can find more information regarding if..else statements in the documentation.
Since a is a number you could set it to None to reset its value. Also, if you are using Python 3.6 or later, you have f-strings to print what you want nicely in one line:
a = 1
if a == 1:
    a = None
print(f"bcdf{a if a is not None else ''}")
This says to evaluate to an empty string if a is None.
If you do not have f-strings you can do it the old way with format, it does exactly the same thing:
a = 1
if a == 1:
    a = None
print("bcdf{}".format(a if a is not None else ""))
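Another option, sketched here under the assumption that a should keep its numeric value for later calculations, is to leave a untouched and build only the text conditionally. The function name label is hypothetical.

```python
def label(a):
    """Return "bcdf" followed by a, except that a value of 1 is left out."""
    suffix = "" if a == 1 else str(a)
    return "bcdf" + suffix


for value in (1, 3):
    print(label(value))
```

This avoids overwriting a with None, which matters if a is used again after printing.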

Using Comparative Operators to find local maxima in column of data frame

I have an Excel file where each column represents a successive time period. I want to determine the local maxima for each time period (i.e., within each column). This is the code I have thus far:
import pandas as pd
df = pd.read_excel('my_sheet.xlsx', sheetname='Sheet1')
b = df['time_period_1']
i = 1
for i in b:
    if b[i] > b[i-1] and b[i] > b[i+1]:
        print(b[i])
    i=i+1
Which gives the error
KeyError: 24223
24223 is the first value in the column
Any idea what is going on?
Thanks!
Beware that you are using "i" both as the counter incremented inside the loop and as an element of b. "for i in b" puts the first value of "b" in "i", which is 24223, so b[i] looks up the key 24223. That key does not exist, as your column certainly does not contain 24223 elements, hence the KeyError.
Basically, change the name of one of your indices. For instance:
k = 1
for i in b:
    if i > b[k-1] and i > b[k+1]:
        print(b[k])
    k += 1
(not sure here if you want to print(b[k]) or print(i), but you'll get the point)
Edit 1: you should use "i" as an element of "b", as you don't need it to be an index. When you use "for i in b", i is already the content of the cell. Use "b[i]" only when i is an index. So here, use either:
for i in range(len(b)):
    if b[i] > ...
where i is an index, or:
for i in b:
    if i > ...
where i is an element.
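When both the index and the element are needed, the built-in enumerate provides them together. A minimal sketch on a plain Python list with made-up values (only 24223 comes from the question):

```python
values = [24223, 25000, 24100]  # made-up sample column

pairs = []
for index, value in enumerate(values):
    # index is the position (0, 1, 2, ...); value is the cell content
    pairs.append((index, value))
    print(index, value)
```

This keeps the two roles in separate, clearly named variables, so the element is never used as a key by accident.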
Edit 2: to avoid accessing b[k+1] for the last index (which triggers an error as k+1 does not exist), use an if-condition. Also, for clarity, I would recommend using only one index. Note that to achieve this you also need a condition for the first row: b[-1] would refer to the last element of a plain Python list (and, as far as I know, raises a KeyError on a pandas Series with the default integer index), and I assume that is not what you want the first row to be compared to. Here is code that should work:
for i in range(len(b)):  # range begins implicitly at 0
    if i > 0 and i < len(b)-1:  # so you are not on the first nor last row
        if b[i] > b[i-1] and b[i] > b[i+1]:
            print(b[i])
    elif i == 0:  # first row, to be compared with the following one only
        if b[i] > b[i+1]:
            print(b[i])
    else:  # last row, to be compared with the previous one only
        if b[i] > b[i-1]:
            print(b[i])
There would be more concise and elegant ways to do so, but I think this one is the clearest.
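As one possible more concise version, assuming pandas is available and the column has no missing values, the neighbours can be compared with shift instead of a loop. The sample Series is made up; the first and last rows are compared against only their single neighbour, as in the loop above.

```python
import pandas as pd

b = pd.Series([1, 3, 2, 5, 4, 6])  # made-up sample column

# Missing neighbours at the edges become -inf so that the first and last
# rows are only compared against the one neighbour they have.
previous_value = b.shift(1).fillna(float("-inf"))
next_value = b.shift(-1).fillna(float("-inf"))

is_local_max = (b > previous_value) & (b > next_value)
local_maxima = b[is_local_max].tolist()
print(local_maxima)
```

Because the comparison is done on whole columns at once, the same three lines could be applied to every time-period column in turn.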

How do i add a variable that changes value into different sections of a list? (Python 3)

I'm not sure if this has been asked, but I was having trouble with some code for a project I'm doing in class. For this part I need to make a function that takes in a variable (y) and adds it to a specific spot in a list: [0,0,0,0,0] -> [y,0,0,0,0]. Then it adds 1 to the variable and places it in the next spot: [y,y+1,0,0,0].
This is what I have currently
There's not much reason to start with the list being full if you're going to just overwrite the contents, so the simplest code would probably be:
year_list = []
for year in range(y, y + 5):
    year_list.append(year)
You could also just take the range and turn it into a list:
year_list = list(range(y, y + 5))
Since you said you need to insert the value at a particular position, below is the code for that.
y_pos = 0  # enter the position of y, minus 1, here (a zero-based index).
           # 0 means it will be inserted at the beginning, 1 means
           # inserted at the 2nd position
year_list = []
# this initialises the list up to y_pos + 5
for i in range(y_pos + 5):
    year_list.append(0)
for i in range(y_pos, y_pos + 5):
    year_list[i] = y + (i - y_pos)
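If the assignment really requires starting from an existing list of zeros and filling it spot by spot, a function along these lines might be closer to what was asked. It is only a sketch: fill_years and its parameter names are hypothetical, and 2020 is an arbitrary sample value for y.

```python
def fill_years(slots, y):
    """Overwrite each spot in slots with y, y + 1, y + 2, ... in order."""
    for position in range(len(slots)):
        slots[position] = y + position
    return slots


year_list = fill_years([0, 0, 0, 0, 0], 2020)
print(year_list)
```

The list keeps its original length, and each pass of the loop fills exactly one spot, matching the [y,0,0,0,0] -> [y,y+1,0,0,0] progression in the question.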

too many arguments for this formula

I am building an Excel sheet, but Excel will not accept any further checks and gives the error "too many arguments for this formula".
Kindly give me a solution.
=IF(E2="NL",N2-(J2*0.08),IF(AND(E2="PA",H2="Trip Travel"),(N2*0.01)+N2,(N2*0.02)+N2,IF(AND(E2="PK",H2="Trip Travel"),(J2*0.01)+N2,(J2*0.02)+N2)))
Edit to clarify possible scenarios:
1. E is "NL" and H is "Other" : return N2+(J2*0.08)
2. E is "NL" and H is "Trip Travel" : return N2+(J2*0.08)
3. E is "PK" and H is "Other" : return N2+(J2*0.01)
4. E is "PK" and H is "Trip Travel" : return N2+(J2*0.01)
5. E is "PA" and H is "Other" : return N2+(N2*0.02)
6. E is "PA" and H is "Trip Travel" : return N2+(N2*0.01)
The problem is the 2nd IF function, which has 4 arguments given. IF only takes 3:
The condition to test
The result if the condition is true
The result if the condition is false
The problematic section is:
IF(AND(E2="PA",H2="Trip Travel"), (N2*0.01)+N2, (N2*0.02)+N2, IF(AND(E2="PK",H2="Trip Travel"),(J2*0.01)+N2,(J2*0.02)+N2))
Notice how there are 4 arguments for the IF function in that section:
The condition: AND(E2="PA",H2="Trip Travel")
The result if true: (N2*0.01)+N2
The result if false: (N2*0.02)+N2
And then this extra argument: IF(AND(E2="PK",H2="Trip Travel"),(J2*0.01)+N2,(J2*0.02)+N2)
I'm not sure how exactly to correct this since I can't determine the logic you want from that formula. However, I'm guessing that either (N2*0.01)+N2 or (N2*0.02)+N2 belongs in a different section.
Also, just to help simplify the formula, the "+N2" could be put before the second IF and remove all the "+N2"s since all outcomes have N2 added to them. Something like the following (which still has the 4 arguments since I'm not sure how you want the logic corrected):
=IF(E2="NL",N2-(J2*0.08), N2+IF(AND(E2="PA",H2="Trip Travel"),(N2*0.01),(N2*0.02),IF(AND(E2="PK",H2="Trip Travel"),(J2*0.01),(J2*0.02))))
Edit in response to the edited question:
This formula should cover the six possibilities in your spreadsheet:
=IFERROR(N2+IF(E2="NL",J2*0.08,IF(E2="PK",J2*0.01,IF(E2="PA",IF(H2="Trip Travel",N2*0.01,IF(H2="Other",N2*0.02,"n/a")),"n/a"))),"n/a")
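To check the branches of that last formula outside Excel, its logic can be mirrored in a few lines of Python. This is only a sketch for testing the six scenarios: adjusted_amount is a hypothetical name, the parameters e, h, n and j stand in for cells E2, H2, N2 and J2, and IFERROR's handling of non-numeric cells is not modelled.

```python
def adjusted_amount(e, h, n, j):
    """Mirror of the nested IF formula; returns "n/a" for unknown inputs."""
    if e == "NL":
        return n + j * 0.08
    if e == "PK":
        return n + j * 0.01
    if e == "PA":
        if h == "Trip Travel":
            return n + n * 0.01
        if h == "Other":
            return n + n * 0.02
    return "n/a"  # any other combination, as in the formula


for e in ("NL", "PK", "PA"):
    for h in ("Other", "Trip Travel"):
        print(e, h, adjusted_amount(e, h, 100, 50))
```

Each early return corresponds to one IF whose third argument holds the next nested IF, which is why no IF ever needs more than three arguments.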
