Populate cells based on x by y cell value - excel

I'm trying to populate cells based on values from two different cells.
The values in each column need to run from 0 to (n-1), where n is the input, with each value repeated as many times as the other cell's value.
For example, I have input:
x y
2 5
Output should be:
x should have 0 and 1; each repeated five times
y should have 0, 1, 2, 3, 4; each repeated twice
x1 y1
0 0
0 1
0 2
0 3
0 4
1 0
1 1
1 2
1 3
1 4
I used:
=IF(ROW()<=C2+1,K2-1,"")
and
=IF(ROW()<=d2+1,K2-1,"")
but it is not repeating and I only see:
x y
0 0
1 1
__ 2
__ 3
__ 4
(C2 and D2 are where values for x and y are, K is the number of items.)
Are there any suggestions on how I can do this?

In row 2 and copied down to suit:
=IF(ROW()<=1+C$2*D$2,INT((ROW()-2)/D$2),"")
and
=IF(ROW()<=1+C$2*D$2,MOD(ROW()-2,D$2),"")
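If it helps to see why the quotient/remainder arithmetic produces the repeats, here is a rough Python sketch of the same idea (the values x = 2 and y = 5 are taken from the example above):
x, y = 2, 5
pairs = []
for r in range(x * y):      # r plays the role of ROW()-2
    x1 = r // y             # INT((ROW()-2)/D$2)
    y1 = r % y              # MOD(ROW()-2, D$2)
    pairs.append((x1, y1))
print(pairs)
# [(0, 0), (0, 1), ..., (1, 4)]: each x value repeated y times, with y cycling 0..y-1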

Related

Using groupby() with appending additional rows

With the following CSV input file:
ID,Name,Metric,Value
0,K1,M1,200
0,K1,M2,5
1,K2,M1,1
1,K2,M2,10
2,K2,M1,500
2,K2,M2,8
This code groups the rows by the Name column (giving two groups) and then appends the values as columns for the same Name.
import pandas as pd

df = pd.read_csv('test.csv', usecols=['ID', 'Name', 'Metric', 'Value'])
print(df)

my_array = []
for name, df_group in df.groupby('Name'):
    my_array.append(pd.concat(
        [g.reset_index(drop=True) for _, g in df_group.groupby('ID')['Value']],
        axis=1))
print(my_array)
The output looks like
ID Name Metric Value
0 0 K1 M1 200
1 0 K1 M2 5
2 1 K2 M1 1
3 1 K2 M2 10
4 2 K2 M1 500
5 2 K2 M2 8
[ Value
0 200
1 5, Value Value
0 1 500
1 10 8]
For example, my_array[1], which is K2, has two rows corresponding to M1 and M2. I would like to keep the IDs as well in the final data frames in my_array, so I want to add a third row holding them (M1, M2 and ID). Therefore, the final my_array should be
[ Value
0 200
1 5
2 0, Value Value
0 1 500 <-- For K2, there are two M1 (1 and 500)
1 10 8 <-- For K2, there are two M2 (10 and 8)
2 1 2] <-- For K2, there are two ID (1 and 2)
How can I modify the code for that purpose?
You can use DataFrame.pivot on the per-group DataFrames and then append df1.columns via np.vstack:
import numpy as np

my_array = []
for name, df_group in df.groupby('Name'):
    df1 = df_group.pivot('Metric', 'ID', 'Value')
    my_array.append(pd.DataFrame(np.vstack([df1, df1.columns])))
print(my_array)
[ 0
0 200
1 5
2 0, 0 1
0 1 500
1 10 8
2 1 2]
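Note that, depending on your pandas version, DataFrame.pivot may require keyword arguments rather than positional ones. A self-contained sketch of the same answer written that way (assuming the test.csv from the question):
import numpy as np
import pandas as pd

df = pd.read_csv('test.csv', usecols=['ID', 'Name', 'Metric', 'Value'])

my_array = []
for name, df_group in df.groupby('Name'):
    # keyword arguments for pivot work across pandas versions
    df1 = df_group.pivot(index='Metric', columns='ID', values='Value')
    # stack the ID labels underneath the pivoted values as an extra row
    my_array.append(pd.DataFrame(np.vstack([df1, df1.columns])))
print(my_array)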

Pandas Flag Rows with Complementary Zeros

Given the following data frame:
import pandas as pd
df = pd.DataFrame({'A': [0, 4, 4, 4],
                   'B': [0, 4, 4, 0],
                   'C': [0, 4, 4, 4],
                   'D': [4, 0, 0, 4],
                   'E': [4, 0, 0, 0],
                   'Name': ['a', 'a', 'b', 'c']})
df
A B C D E Name
0 0 0 0 4 4 a
1 4 4 4 0 0 a
2 4 4 4 0 0 b
3 4 0 4 4 0 c
I'd like to add a new field called "Match_Flag" which labels unique combinations of rows if they have complementary zero patterns (as rows 0, 1 and 2 do) AND have the same name (true only for rows 0 and 1). The flag should hold the name of the rows that match.
The desired result is as follows:
A B C D E Name Match_Flag
0 0 0 0 4 4 a a
1 4 4 4 0 0 a a
2 4 4 4 0 0 b NaN
3 4 0 4 4 0 c NaN
Caveat:
The patterns may vary, but should still be complementary.
Thanks in advance!
UPDATE
Sorry for the confusion.
Here is some clarification:
The reason why rows 0 and 1 are "complementary" is that they have opposite patterns of zeros in their columns: 0,0,0,4,4 vs. 4,4,4,0,0.
The number 4 is arbitrary; it could just as easily be 0,0,0,4,2 and 65,770,23,0,0. So if two such rows are indeed complementary and they have the same name, I'd like them to be flagged with that same name under the "Match_Flag" column.
You can identify a complement when the element-wise product of two rows is zero everywhere (they do not overlap) and their element-wise sum is nowhere zero (together they are complete).
import numpy as np

def complements(df):
    v = df.drop('Name', axis=1).values
    n = v.shape[0]
    row, col = np.triu_indices(n, 1)
    # ensure two rows are complete:
    # their sum contains no zeros
    c = ((v[row] + v[col]) != 0).all(1)
    complete = set(row[c]).union(col[c])
    # ensure two rows do not overlap:
    # their product is zero everywhere
    o = (v[row] * v[col] == 0).all(1)
    non_overlap = set(row[o]).union(col[o])
    # we are a complement iff we do
    # not overlap and we are complete
    complement = list(non_overlap.intersection(complete))
    # return slice
    return df.Name.iloc[complement]
Then groupby('Name') and apply our function:
df['Match_Flag'] = df.groupby('Name', group_keys=False).apply(complements)
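As a quick sanity check of the product/sum test, here is a small sketch using the values from rows 0 and 1 of the sample frame above:
import numpy as np

r0 = np.array([0, 0, 0, 4, 4])      # row 0 (Name 'a')
r1 = np.array([4, 4, 4, 0, 0])      # row 1 (Name 'a')
no_overlap = (r0 * r1 == 0).all()   # product is zero everywhere
complete = ((r0 + r1) != 0).all()   # sum is nowhere zero
print(no_overlap and complete)      # True -> the rows are complementary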

Excel check for value in different cells per row

Having this kind of data:
A B C D E
1 1 0 1 0 0
2 0 1 1 0 1
3 1 0 1 1 0
4 0 1 0 1 0
I would like to show TRUE/FALSE in column F if columns A, C and E have the value 1.
So I'm not looking for a value in a range, but in specific columns.
You can use the AND function, something like:
=IF(AND(A1=1,C1=1,E1=1),"TRUE","FALSE")
or, with nested IFs (note the semicolon argument separators used in some locales):
=IF(A2;IF(C2;IF(E2;TRUE;FALSE);FALSE);FALSE)
Either will display TRUE if ALL three cells are 1, else FALSE.
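Note that =AND(A1=1,C1=1,E1=1) on its own already evaluates to the boolean TRUE or FALSE, so the surrounding IF is only needed if you specifically want the text strings.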

Index Value of Last Matching Row Python Pandas DataFrame

I have a dataframe which has a value of either 0 or 1 in "column 1" and either 0 or 1 in "column 2". I would like to find, and append as a column, the index value of the last row where column 1 = 1, but only for rows where column 2 = 1. This might be easier to see than read:
import pandas as pd

d = {'C1': pd.Series([1, 0, 1, 0, 0], index=[1, 2, 3, 4, 5]),
     'C2': pd.Series([0, 0, 0, 1, 1], index=[1, 2, 3, 4, 5])}
df = pd.DataFrame(d)
print(df)
C1 C2
1 1 0
2 0 0
3 1 0
4 0 1
5 0 1
#I've left out my attempts as they don't even get close
df['C3'] = IF C2 = 1: Call Function that gives Index Value of last place where C1 = 1 Else 0 End
This would result in this result set:
C1 C2 C3
1 1 0 0
2 0 0 0
3 1 0 0
4 0 1 3
5 0 1 3
I was trying to get a function to do this as there are roughly 2 million rows in my data set, but only ~10k where C2 = 1.
Thank you in advance for any help, I really appreciate it - I only started
programming with Python a few weeks ago.
It is not so straightforward; you have to do a few steps to get this result. The key here is the fillna method, which can do forward and backward filling.
It is often the case that pandas methods do more than one thing, which makes it very hard to figure out what methods to use for what.
So let me talk you through this code.
First we need to set C3 to NaN, otherwise we cannot use fillna later.
Then we set C3 to be the index, but only where C1 == 1 (the mask does this).
After this we can use fillna with method='ffill' to propagate the last observation forwards.
Then we have to mask away all the values where C2 == 0, the same way we set the index earlier, with a mask.
import numpy as np

df['C3'] = np.nan   # use numpy's nan; the old pd.np alias is deprecated/removed in newer pandas
mask = df['C1'] == 1
df['C3'].loc[mask] = df.index[mask].copy()
df['C3'] = df['C3'].fillna(method='ffill')
mask = df['C2'] == 0
df['C3'].loc[mask] = 0
df
C1 C2 C3
1 1 0 0
2 0 0 0
3 1 0 0
4 0 1 3
5 0 1 3
EDIT:
Added a .copy() to the index, otherwise we overwrite it and the index gets all full of zeroes.
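For what it's worth, on recent pandas versions the same idea can be written without the chained .loc assignments (a rough sketch, assuming the df defined above):
import numpy as np

# index of the last row where C1 == 1, carried forward
last_c1 = pd.Series(np.where(df['C1'] == 1, df.index, np.nan), index=df.index).ffill()
# keep it only where C2 == 1, otherwise 0 (values come out as floats here)
df['C3'] = np.where(df['C2'] == 1, last_c1, 0)
df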

Comparing boolean cells in excel?

I have a problem with comparing boolean cells.
Let's say the data looks like this:
A B C D E F G H I
1 1 1 1 1 0
1 0 1 1 1 1
0 1 1 0 1 1
0 0 0 1 1 1
In cell G2 I need to calculate and compare every 3 cells.
If the "sum" of A2:C2 = 3, then return 0; but if the "sum" of A2:C2 <= 2, then "sum" B2:D2, and if the "sum" of B2:D2 = 3, return 1, else 0.
In cell H2 I need to do the same:
if the "sum" of B2:D2 = 3, then return 0; but if the "sum" of B2:D2 <= 2, then "sum" C2:E2, and if the "sum" of C2:E2 = 3, return 1, else 0,
and so on.
I already tried IF, AND and SUM, but it is still not working.
Thanks for your help...
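For concreteness, here is a rough Python sketch of the rule described above, applied to the first data row (1,1,1,1,1,0); it is only an illustration of the stated logic, not an Excel formula:
row = [1, 1, 1, 1, 1, 0]   # columns A..F of row 2

def window_flag(cells, start):
    # 0 if the first 3-cell window sums to 3, else 1 if the next window sums to 3, else 0
    if sum(cells[start:start + 3]) == 3:
        return 0
    return 1 if sum(cells[start + 1:start + 4]) == 3 else 0

g2 = window_flag(row, 0)   # windows A2:C2 and B2:D2
h2 = window_flag(row, 1)   # windows B2:D2 and C2:E2
print(g2, h2)              # 0 0 for this row, since A2:C2 already sums to 3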
