In column A I have these values:
1
0
3
2
0
5
1
1
1
0
2
1
1
1
0
2
1
1
1
0
0
3
0
2
0
0
3
1
This list grows every day.
I need a formula for every cell of column B that counts, going upwards, how many values greater than 1 there are before the next 1 is reached.
In other words, I need to count how many values larger than 1 lie between the 1s.
The intended result would be something like this:
1
0
3
2
0
5
1 3
1
0
2
1 1
1
0
2
1 1
1
0
0
3
0
2
0
0
3
1 3
Thanks in Advance
I would use a helper column, if this is acceptable.
So to create a running count of numbers greater than one which resets each time it encounters a '1', enter this starting in B2 and pull down (I'm assuming the data has a heading and the list starts with a 1) :-
=IF(A2=1,0,B1+(A2>1))
Then to display the counts at each '1' value (but not for repeated ones) enter this in C2 and pull down:-
=IF(AND(A2=1,A1<>1,ISNUMBER(A1)),B1,"")
It's also possible to do it with an array formula, but not sure if it's worth the effort:-
=IF(AND(A2=1,A1<>1),
COUNTIF(
OFFSET(
A$1,
MAX(ROW(A1:A$2)*(A1:A$2=1))-ROW(A$1)+1,,
MAX(ROW(A1))-MAX(ROW(A1:A$2)*(A1:A$2=1))),
">"&0),
"")
This is entered in B2 with Ctrl+Shift+Enter and pulled down.
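For anyone who wants to cross-check the helper-column logic outside Excel, here is a minimal Python sketch of the same idea: a running count of values greater than 1 that resets at each 1 and is reported only at the first 1 after a run of other values. The function name `counts_between_ones` is made up for illustration, and the sketch assumes the list starts directly with the data (no heading row).

```python
def counts_between_ones(values):
    """Mirror the two helper-column formulas for a plain list of numbers."""
    running = 0   # values > 1 seen since the last 1 (the column B idea)
    prev = None   # previous value; None stands in for the heading row
    out = []
    for v in values:
        if v == 1:
            # Report only at the first 1 after non-1 values (the column C idea).
            out.append(running if prev is not None and prev != 1 else "")
            running = 0
        else:
            out.append("")
            running += v > 1   # True counts as 1, False as 0
        prev = v
    return out


data = [1, 0, 3, 2, 0, 5, 1, 1, 1, 0, 2, 1, 1, 1,
        0, 2, 1, 1, 1, 0, 0, 3, 0, 2, 0, 0, 3, 1]
result = counts_between_ones(data)
reported = [(i, c) for i, c in enumerate(result) if c != ""]
print(reported)
```

With the sample data from the question, the non-blank counts land on the same rows as in the asker's intended result.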
Related
I have a dataframe with 50 columns, and each column contains either 0 or 1. How do I return all rows that have an equal number (a tie) of 0s and 1s (25 "0" and 25 "1")?
An example with 4 columns:
A B C D
1 1 0 0
1 1 1 0
1 0 1 0
0 0 0 0
Based on the above example, it should return the first and the third rows:
A B C D
1 1 0 0
1 0 1 0
Because every value is either 0 or 1, a row has equally many 0s and 1s exactly when its mean is 0.5 (two 1s out of four columns in your example). So, please try
df[df.mean(1).eq(0.5)]
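As a quick check, here is the same one-liner applied to the four-column example from the question. This is only a sketch; the variable name `tied` is chosen here for illustration.

```python
import pandas as pd

# The four-column example from the question.
df = pd.DataFrame({
    "A": [1, 1, 1, 0],
    "B": [1, 1, 0, 0],
    "C": [0, 1, 1, 0],
    "D": [0, 0, 0, 0],
})

# With only 0s and 1s, a row mean of 0.5 means as many 1s as 0s.
tied = df[df.mean(axis=1).eq(0.5)]
print(tied)
```

The same test should carry over to 50 columns unchanged, since 25 ones out of 50 also give a mean of exactly 0.5.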
I have a df which contains customer data without a primary key. The same customer might show up multiple times.
I have a field (df2['campaign']) that is an int and reflects how many times the customer shows up in the df. There are also many customer attributes.
In my example, going from top to bottom, for each row (i.e. customer), I would like to find all n rows (i.e. all n customers) whose values of the education and default columns are the same. Remember n is the int contained in df2['campaign']
So as shown below, for row 0 and 1 I should search 1 row but find nothing because there are no matching values for education-default combinations.
For row 2 I should search 1 row (because campaign == 1) where education-default values match, and find 1 row in index 4.
df2.head()
job marital education default campaign housing loan contact
0 3 1 0 0 1 0 0 1
1 7 1 3 1 1 0 0 1
2 7 1 3 0 1 2 0 1
3 0 1 1 0 1 0 0 1
4 7 1 3 0 1 0 2 1
Use df2_sorted = df2.sort_values(['education', 'default'], ascending=[True, True]).
Then if your data is not noisy, the rows should become neighbors.
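A small sketch of this on the five rows shown in the question. The sorting is the suggestion above; the `groupby(...).groups` lookup is an extra step added here as an assumption about what "finding the matching rows" could look like, not part of the original answer.

```python
import pandas as pd

# The five rows from df2.head() in the question.
df2 = pd.DataFrame({
    "job":       [3, 7, 7, 0, 7],
    "marital":   [1, 1, 1, 1, 1],
    "education": [0, 3, 3, 1, 3],
    "default":   [0, 1, 0, 0, 0],
    "campaign":  [1, 1, 1, 1, 1],
    "housing":   [0, 0, 2, 0, 0],
    "loan":      [0, 0, 0, 0, 2],
    "contact":   [1, 1, 1, 1, 1],
})

# After sorting, rows with the same education/default pair sit next to each other.
df2_sorted = df2.sort_values(["education", "default"], ascending=[True, True])
order = df2_sorted.index.tolist()

# Optional: map each (education, default) pair to the row labels that share it.
matches = df2.groupby(["education", "default"]).groups
print(df2_sorted)
print(matches[(3, 0)])
```

Here rows 2 and 4 share education 3 and default 0, so they end up adjacent after the sort and appear together in the group lookup.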
Sorry for my bad English.
I have some cells with the values 0 and 1 in Microsoft Excel, and I want to display the 0 values as "notvalid" and the 1 values as "valid" without affecting the formulas.
My current sheet:
x A B C D E F
1 1 0 0 0 1 0
2 0 1 1 0 0 1
3 1 0 1 0 0 1
4 0 1 1 1 1 1
5 0 0 1 0 1 0
What I want:
valid notvalid notvalid notvalid valid notvalid
0 1 1 0 0 1
1 0 1 0 0 1
0 1 1 1 1 1
0 0 1 0 1 0
Use a custom number format (ctrl+1) of [Color13][=1]v\ali\d;[Color9][=0]\notv\ali\d;; on the cells.
In addition to the valid/notvalid display text, I've added dark blue font for the valids and dark red for notvalid.
Considering you are now working in Worksheet1, if you don't want to edit the formula you currently have in cells A1:F5, you can:
either go to/create Worksheet2 and select the cell A1 OR select the cell A7 in Worksheet1.
write in Worksheet1!A7 or Worksheet2!A1 the following formula:
=IF(Worksheet1!A1=1,"valid","notvalid")
copy the formula dragging the fill handle as needed.
I hope I understood well what you would like to do.
Having this kind of data:
A B C D E
1 1 0 1 0 0
2 0 1 1 0 1
3 1 0 1 1 0
4 0 1 0 1 0
I would like to show TRUE/FALSE in column F depending on whether columns A, C and E all have the value 1.
So I am not looking for a value in a range, but in separate columns.
You can use the AND function, something like:
=IF(AND(A1=1,C1=1,E1=1),"TRUE","FALSE")
=IF(A2;IF(C2;IF(E2;TRUE;FALSE);FALSE);FALSE)
This will display TRUE if ALL three cells are 1, else FALSE.
I have a dataframe which has a value of either 0 or 1 in "column 2" and either 0 or 1 in "column 1". I would like to find, and append as a column, the index value of the last row where column 1 = 1, but only for rows where column 2 = 1. This might be easier to see than read:
import pandas as pd

d = {'C1': pd.Series([1, 0, 1, 0, 0], index=[1, 2, 3, 4, 5]),
     'C2': pd.Series([0, 0, 0, 1, 1], index=[1, 2, 3, 4, 5])}
df = pd.DataFrame(d)
print(df)
C1 C2
1 1 0
2 0 0
3 1 0
4 0 1
5 0 1
#I've left out my attempts as they don't even get close
df['C3'] = IF C2 = 1: Call Function that gives Index Value of last place where C1 = 1 Else 0 End
This would result in this result set:
C1 C2 C3
1 1 0 0
2 0 0 0
3 1 0 0
4 0 1 3
5 0 1 3
I was trying to get a function to do this, as there are roughly 2 million rows in my data set but only ~10k where C2 = 1.
Thank you in advance for any help, I really appreciate it. I only started programming with Python a few weeks ago.
It is not so straightforward; you need a few steps to get this result. The key here is the fillna method, which can do forward and backward filling.
It is often the case that a pandas method does more than one thing, which makes it hard to figure out which method to use for what.
So let me talk you through this code.
First we need to set C3 to NaN, otherwise there is nothing for the forward fill to fill later.
Then we set C3 to the index, but only where C1 == 1 (the mask does this).
After this we can forward-fill (fillna with method='ffill', or the equivalent ffill method in newer pandas) to propagate the last observation forwards.
Then we have to mask away all the values where C2 == 0, the same way we set the index earlier, with a mask.
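import numpy as np

df['C3'] = np.nan                           # pd.np is gone in newer pandas; use numpy directly
mask = df['C1'] == 1
df.loc[mask, 'C3'] = df.index[mask].copy()  # record the index wherever C1 == 1
df['C3'] = df['C3'].ffill()                 # same as fillna(method='ffill')
mask = df['C2'] == 0
df.loc[mask, 'C3'] = 0                      # zero out rows where C2 == 0
df['C3'] = df['C3'].fillna(0).astype(int)   # no earlier C1 == 1 becomes 0; back to integers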
df
C1 C2 C3
1 1 0 0
2 0 0 0
3 1 0 0
4 0 1 3
5 0 1 3
EDIT:
Added a .copy() to the index, otherwise we overwrite it and the index gets all full of zeroes.
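For reference, the same steps can be written more compactly with Series.where, which avoids assigning into the frame piece by piece. This is a sketch on the five-row example only; the intermediate name `last_c1` is chosen here for illustration, and treating "no earlier C1 == 1" as 0 is an assumption.

```python
import pandas as pd

df = pd.DataFrame(
    {"C1": [1, 0, 1, 0, 0], "C2": [0, 0, 0, 1, 1]},
    index=[1, 2, 3, 4, 5],
)

# Index label where C1 == 1, NaN elsewhere, then carried forward row by row.
last_c1 = df.index.to_series().where(df["C1"] == 1).ffill()

# Keep the carried label only where C2 == 1; every other row becomes 0.
# fillna(0) covers rows with C2 == 1 that have no earlier C1 == 1 (assumption).
df["C3"] = last_c1.where(df["C2"] == 1, 0).fillna(0).astype(int)
print(df)
```

Everything here is vectorized, so it should scale to a couple of million rows without an explicit Python loop.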