How to generate all arrangements of two values in 5 columns using Excel? - excel

I have two values, 1 and 0, and 5 columns. I need to generate all possible arrangements in Excel.
For example, with 2 columns and 2 values (0 and 1) there are only 4 possible arrangements (with repetition):
1 | 0
0 | 1
0 | 0
1 | 1
I need to generate all possible arrangements of 1 and 0 for 5 columns. The number of possible arrangements with repetition is given by the formula n^k, where n is the number of values and k is the number of columns.
So, for 5 columns and 2 values it is 2^5 = 32 arrangements.
In Excel:
and so on
Is it possible to automate it without typing ones and zeros manually?

You basically want to count from 0 to 31 in binary and then split the binary result out over the columns. You can do it like this:
Column A - just the numbers 0, 1, 2, ... up to 31, starting in A2
Column B - =DEC2BIN(A2,5)
Columns C to G - =MID($B2,C$1,1) and then drag down and across
This assumes row 1 of columns C to G holds the digit positions 1 to 5, so the C$1 reference makes each column pick out the matching binary digit from column B.
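For readers who want to check the result outside Excel, the same idea (count from 0 to 31, pad to 5 binary digits, split into columns) can be sketched in Python. The function name binary_rows is made up for this illustration; it is not part of the answer.

```python
# Count from 0 to 2**k - 1 and split each number into k binary digits,
# mirroring DEC2BIN(A2,5) followed by MID(..., position, 1).
def binary_rows(k=5):
    rows = []
    for n in range(2 ** k):
        bits = format(n, f"0{k}b")  # zero-padded binary string, e.g. "00101"
        rows.append([int(ch) for ch in bits])
    return rows

rows = binary_rows(5)
for row in rows[:4]:
    print(" | ".join(str(d) for d in row))
```

Each row corresponds to one line of columns C to G in the sheet, and there are 2^5 = 32 of them.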

Related

How to generate a Dataframe whose length equals to the product of all columns lengths?

I am looking for a quick way to generate a long dataframe. For example, the input is:
Column "color": [1,2,3] (length: 3)
Column "weekday": [0,1] (length: 2)
The expected output is:
color weekday
1 0
2 0
3 0
1 1
2 1
3 1
And this output dataframe has the length as 2*3 = 6.
Is there a quick way to generate such dataframes based on the series as the input? And it is possible that there are many columns. Thanks.
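One possible way to build such rows, sketched here with only the Python standard library, is a Cartesian product of the column values. The helper name cross_rows is hypothetical, and the row order is chosen to match the expected output above (the first column varies fastest).

```python
from itertools import product

def cross_rows(columns):
    """columns: dict of name -> list of values; returns (names, rows)."""
    names = list(columns)
    # product() varies its last argument fastest, so feed the columns in
    # reverse order and flip each tuple back so the first column varies fastest.
    reversed_values = [columns[name] for name in reversed(names)]
    rows = [tuple(reversed(combo)) for combo in product(*reversed_values)]
    return names, rows

names, rows = cross_rows({"color": [1, 2, 3], "weekday": [0, 1]})
print(names)
for row in rows:
    print(*row)
```

The resulting list of tuples could then be handed to a dataframe constructor together with the column names. With more columns, the number of rows is the product of all the column lengths.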

Excel: How to get average of the between nonzero values?

I would need to get the average of the 2nd to the 8th nonzero value per row. Meaning, it would always move depending where the nonzero begins at.
I think what I would need is to determine the following:
Location of the 2nd nonzero value
Location of the 8th nonzero value
Average of numbers between 2nd and 8th nonzero values
Is that possible?
For example
0 | 6 | 10 |5| 9 | 0 | 6 | 0 |3 | 10| 1|9|
Those in bold have to be averaged. The zeroes in between have to be ignored
If you are using Windows Excel 2019, then:
=AVERAGE(FILTERXML("<t><s>" &TEXTJOIN("</s><s>",TRUE,IF(row_ref<>0,row_ref,""))&"</s></t>","//s[position()>1 and position()<10]"))
TEXTJOIN extracts only those values which are non-zero (and non-blank)
By using the appropriate delimiters, we create an XML
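The XPath with the position function then extracts positions 2 through 9, i.e. eight values starting at the 2nd non-zero value (change position()<10 to position()<9 to stop at the 8th)
AVERAGE then averages the extracted values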
Assuming your data is laid out as per the example below, place the following formula in column T and copy down as required.
=IFERROR(AVERAGE(INDEX(22:22,AGGREGATE(15,6,COLUMN(B22:S22)/(B22:S22<>0),ROW($2:$9)))),"Less than " & rows($2:$9) & " non zero numerical entries")
B22:S22 - represents the row of data you are looking at. Feel free to change the column reference letters to suit your needs. Just ensure all the references within the formula match.
$2:$9 - represents the entries you want to use as part of the average. 2 is the starting number and corresponds to the 2nd non-zero value. 9 is the last number you want to include, which makes a total of 8 numbers. Adjust these numbers to change the range of values you want to include.
Ensure you keep the $ to prevent the row number from changing as the formula is copied.
AGGREGATE performs array-like operations. As a result, it may cause your system to bog down or crash if you are looking at an excessive number of rows. Also, full column references should generally be avoided within the AGGREGATE function to avoid excess calculation.
I placed my answer not in A1 to make sure it will work anywhere on your sheet.
Assuming a layout like this (your first row of numbers from D2 to T2):
Paste this to Y2 (in the combined) column:
=AVERAGEIF(OFFSET(C2,0,AGGREGATE(15,3,((D2:T2)>0)/((D2:T2)>0)*COLUMN(D2:T2)-COLUMN(C2),2)):OFFSET(C2,0,AGGREGATE(15,3,((D2:T2)>0)/((D2:T2)>0)*COLUMN(D2:T2)-COLUMN(C2),8)),">0")
Then copy down to the other rows.
Data I used:
1 2 0 6 4 9 0 7 3 7 1 1 1 2 6 9 7
0 0 7 0 1 1 4 4 1 5 8 3 5 6 4 6 4
1 7 2 8 0 6 4 9 9 7 8 4 6 9 4 2 9
0 1 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0
8 7 8 4 10 0 2 2 10 4 4 8 3 3 0 4 10
In this formula, C2 is the cell (empty or not) before the beginning of the row of numbers and not included in the numbers to be counted.
Assuming you have Excel 365, this can easily be done using just the FILTER, INDEX and AVERAGE functions:
=AVERAGE(INDEX(FILTER(row_ref,row_ref>0),{2,3,4,5,6,7,8,9}))
Example
=AVERAGE(INDEX(FILTER(B5:N5,B5:N5>0),{2,3,4,5,6,7,8,9}))
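The logic shared by these answers (drop the zeros, pick a run of positions among the remaining values, average them) can be sketched in Python for checking results by hand. The function name average_nonzero_slice and its parameters are made up for this illustration. It filters on "not equal to 0", as the first two answers do, and uses the example row from the question.

```python
def average_nonzero_slice(values, first=2, last=9):
    """Average the first-th through last-th non-zero values (1-based, inclusive)."""
    nonzero = [v for v in values if v != 0]
    if len(nonzero) < last:
        raise ValueError(f"Less than {last} non-zero entries")
    picked = nonzero[first - 1:last]
    return sum(picked) / len(picked)

row = [0, 6, 10, 5, 9, 0, 6, 0, 3, 10, 1, 9]
print(average_nonzero_slice(row, 2, 9))  # positions 2 through 9, as in {2,...,9}
print(average_nonzero_slice(row, 2, 8))  # stops at the 8th non-zero value
```

The defaults of 2 and 9 mirror the positions used in the formulas above. Pass last=8 if the average should stop at the 8th non-zero value, as the question literally asks.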

Is there a way to convert my column of incrementing integers separated by zero to the number of intervals encountered so far in a pandas dataframe?

I'm working in pandas and I have a column in my dataframe filled with 0s and incrementing integers starting at one. I would like to add another column of integers, but that column would be a counter of how many intervals separated by zero we have encountered up to this point. For example, my data would look like
Index
1
2
3
0
1
2
0
1
and I would like it to look like
Index IntervalCount
1 1
2 1
3 1
0 1
1 2
2 2
0 2
1 2
Is it possible to do this with vectorized operation or do I have to do this iteratively? Note, it's not important that it be a new column could also overwrite the old one.
You can use the cumsum function.
df["IntervalCount"] = (df["Index"] == 1).cumsum()
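To see what this running sum produces without pandas, here is a minimal standard-library sketch of the same logic on the sample column: every value equal to 1 marks the start of an interval, and the counter is the running total of those marks. The variable names are made up for this illustration.

```python
from itertools import accumulate

index = [1, 2, 3, 0, 1, 2, 0, 1]

# 1 where an interval starts (value == 1), 0 elsewhere, then a running sum,
# which is what (df["Index"] == 1).cumsum() computes.
interval_count = list(accumulate(1 if v == 1 else 0 for v in index))

for value, count in zip(index, interval_count):
    print(value, count)
```

Note that under this rule the final 1 in the sample data starts a third interval, so its counter is 3. This relies on every interval starting with the value 1, as described in the question.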

Compare multiple data from rows

I'm looking for a way to compare multiple rows of data with each other, trying to find the best possible match. Each number in every column must approximately match the other numbers in the same column.
Example:
Customer #1: 1 5 10 9 7 7 8 2 3
Customer #2: 10 5 9 3 5 7 4 3 2
Customer #3: 1 4 10 9 8 7 6 2 2
Customer #4: 9 5 6 7 2 1 10 5 6
In this example customers #1 and #3 are quite similar, and I need a way to highlight or sort the rows so I can easily find the best match.
I've tried using conditional formatting to highlight the numbers that are similar, but that is quite confusing because the amount of data is quite big.
Any ideas of how I could solve this?
Thanks!
The following formula entered in (say) L1 and pulled down gives the best match with the current row based on the sum of the absolute differences between corresponding cells:-
=MIN(IF(ROW($C$1:$K$4)<>ROW(),(MMULT(ABS($C1:$K1-$C$1:$K$4),TRANSPOSE(COLUMN($C$1:$K$4))^0))))
It is an array formula and must be entered with Ctrl+Shift+Enter.
You can then sort on column L to bring the customers with lowest similarity scores to the top or use conditional formatting to highlight rows with a certain similarity value.
EDIT
If you wanted to penalise large differences in individual columns more heavily than small differences to try and avoid pairs of customers which are fairly similar except for having some columns very different, you could try something like the square of the differences:-
=MIN(IF(ROW($C$1:$K$4)<>ROW(),(MMULT(($C1:$K1-$C$1:$K$4)^2,TRANSPOSE(COLUMN($C$1:$K$4))^0))))
then the scores for your test data would come out as 7,127,7,127.
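As a cross-check of both scoring rules, here is a small Python sketch that computes, for each customer, the smallest total difference to any other customer, first with absolute differences and then with squared differences. The function name best_match_scores is hypothetical; the data is the four customers from the question.

```python
customers = [
    [1, 5, 10, 9, 7, 7, 8, 2, 3],
    [10, 5, 9, 3, 5, 7, 4, 3, 2],
    [1, 4, 10, 9, 8, 7, 6, 2, 2],
    [9, 5, 6, 7, 2, 1, 10, 5, 6],
]

def best_match_scores(rows, cost):
    """For each row, the minimum summed cost of differences to any other row."""
    scores = []
    for i, row in enumerate(rows):
        distances = [
            sum(cost(a - b) for a, b in zip(row, other))
            for j, other in enumerate(rows)
            if j != i  # skip comparing a row with itself
        ]
        scores.append(min(distances))
    return scores

abs_scores = best_match_scores(customers, abs)
squared_scores = best_match_scores(customers, lambda d: d * d)
print(abs_scores)
print(squared_scores)
```

A lower score means a closer match, so customers #1 and #3 come out as the most similar pair under both rules.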
I'm assuming you want to compare customers 2-4 with customer 1 and that you are comparing only within each column. In this case, you could implement a 'scoring system' using multiple IFs. For example, with this layout:
  | A | B | C | D | E
1 | Customer 1 | 1 | 1 | 2 |
2 | Customer 2 | 1 | 2 | 2 |
3 | Customer 3 | 0 | 1 | 0 |
you could use in E2
=if(B2=$B$1,1,0)+if(C2=$C$1,1,0)+if(D2=$D$1,1,0)
This will return a 'score' of 1 when you have a match and a 'score' of 0 when you don't. It then adds up the scores and your highest value will be your best match. Copying down would then give
  | A | B | C | D | E
1 | Customer 1 | 1 | 1 | 2 |
2 | Customer 2 | 1 | 2 | 2 | 2
3 | Customer 3 | 0 | 1 | 0 | 1
so customer 2 is the best match.
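The same match-counting score can be sketched in Python, using the small table above. The names reference, others and match_score are made up for this illustration.

```python
reference = [1, 1, 2]  # Customer 1, columns B to D
others = {
    "Customer 2": [1, 2, 2],
    "Customer 3": [0, 1, 0],
}

def match_score(row, ref):
    """Count the columns where row and ref hold exactly the same value."""
    return sum(1 for a, b in zip(row, ref) if a == b)

scores = {name: match_score(row, reference) for name, row in others.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

Unlike the difference-based scores, this counts exact matches only, so values that are close but not equal contribute nothing.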

Find summation and count only if they are EQUAL in Excel

In an Excel sheet I have 1728 rows and 2 columns (L and O). I add these 2 columns together in column P. I then want to count the occurrences in column P where the sum equals 2, 4, 6 or 8, but only for rows where L and O are equal.
This means that only rows where L and O hold "1+1", "2+2", "3+3" or "4+4" should be counted. Sums such as "1+3" or "4+2" should not be counted.
=COUNTIF(P:P,4)
does not work.
L | O | P  | M
1 | 1 | 2  | 1 (NO OF 2'S)
2 | 2 | 4  | 1 (NO OF 4'S)
3 | 3 | 6  | 1 (NO OF 6'S)
1 | 3 | 4* | NOT TO BE COUNTED
4 | 4 | 8  | 1 (NO OF 8'S)
2 | 4 | 6* | NOT TO BE COUNTED
4 | 2 | 6* |
As seen above, the result of the counting is stored in M. Let me know the formula.
=IF(L29=M29,SUMPRODUCT(--($L$29:$L$35=$M$29:$M$35)*(L29=$L$29:$L$35)),"Not Counted")
My data started in row 29 so you will need to adjust the references. It counts the entire table in 1 shot. So if you added a row to the bottom that had 1 and 1 and 2, the results in column M in your first row would become 2 and the same for the row you just added.
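The counting rule behind this formula (only rows where both columns are equal are counted, grouped by their sum) can be sketched in Python using the sample rows from the question. The variable names are made up for this illustration.

```python
from collections import Counter

col_l = [1, 2, 3, 1, 4, 2, 4]
col_o = [1, 2, 3, 3, 4, 4, 2]

# Count each sum only for rows where both columns hold the same value.
counts = Counter(a + b for a, b in zip(col_l, col_o) if a == b)

# Per-row result: the count for that row's sum, or a marker for unequal rows.
per_row = [
    counts[a + b] if a == b else "Not Counted"
    for a, b in zip(col_l, col_o)
]
print(dict(counts))
print(per_row)
```

With values from 1 to 4 in both columns, equal pairs can only sum to 2, 4, 6 or 8, so no separate check on the sum is needed in this sketch.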
Will this formula help...?
=IF(AND(A1=B1,OR(SUM(A1,B1)=2,SUM(A1,B1)=4,SUM(A1,B1)=6,SUM(A1,B1)=8)),SUM(A1,B1),"NOT TO BE COUNTED")
Just drag the formula till you have data. You will need to adjust the references.
Here is the reference data.
