All combinations of 4 out of 7 columns with totals using excel - excel-formula

I have 7 columns to choose from and I need to pick 4 of those columns and generate a total for each row. I also need every combination of 4, which means I'll have 35 new columns with the totals for each of those combinations showing in each row. I need the code for this and if it can be done only using Excel. Here is an image of the columns and the grayed ones are the 7 columns I'm talking about. My knowledge of Excel is very limited. There are over 1,500 rows if that matters.

This is a multi-step approach that uses some helper rows. There may be a more elegant formula for this, and much slicker options in VBA, but this is a formula-only approach.
Step 1 - Generate the List of Column Combinations
To generate the list, 4 helper rows need to be inserted at the top of your data, either above or below your header row. These 4 rows hold the numbers of the columns you are going to pick. To keep the math simple, I assumed 1 for the first column and 7 for the last column; those numbers get converted later to account for the columns in between in your spreadsheet. For the sake of this example, the first combination sum will occur in column AO and the first helper row will be row 1. The first combination is hard-coded and seeds the pattern for the remaining column combinations. Enter the following values in the corresponding cells:
AO1 = 1
AO2 = 2
AO3 = 3
AO4 = 4
In the adjacent column, formulas are placed and copied to the right. They increase the bottom value by 1 until it hits its maximum value, at which point the value in the row above increases by 1 and the current value becomes 1 more than the cell above it. This produces a pattern that covers all 35 combinations by the time column BW is reached. Place the formulas below in the appropriate cells and copy to the right:
AP1
=IF(AO2=5,AO1+1,AO1)
AP2
=IF(AO2=5,AP1+1,IF(AO3=6,AO2+1,AO2))
AP3
=IF(AO3=6,AP2+1,IF(AO4=7,AO3+1,AO3))
AP4
=IF(AO4=7,AP3+1,AO4+1)
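As a sanity check on the pattern, the four helper-row formulas can be replayed outside Excel. The following is a minimal Python sketch, not part of the spreadsheet; the function names are made up for illustration. It applies the AP1:AP4 rules to the seed 1,2,3,4 and compares the result with the standard list of 4-of-7 combinations.

```python
from itertools import combinations

def next_column(prev):
    """Apply the AP1:AP4 formulas to the previous helper column (a1..a4)."""
    a1, a2, a3, a4 = prev
    p1 = a1 + 1 if a2 == 5 else a1                            # AP1
    p2 = p1 + 1 if a2 == 5 else (a2 + 1 if a3 == 6 else a2)   # AP2
    p3 = p2 + 1 if a3 == 6 else (a3 + 1 if a4 == 7 else a3)   # AP3
    p4 = p3 + 1 if a4 == 7 else a4 + 1                        # AP4
    return (p1, p2, p3, p4)

def helper_columns(count=35):
    """Start from the hard-coded seed in column AO and fill to the right."""
    cols = [(1, 2, 3, 4)]
    while len(cols) < count:
        cols.append(next_column(cols[-1]))
    return cols

combos = helper_columns()
print(len(combos), combos[0], combos[-1])  # 35 (1, 2, 3, 4) (4, 5, 6, 7)
```

The 35 generated columns match `itertools.combinations(range(1, 8), 4)` in the same order, which is why the pattern ends exactly at column BW.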
Step 2 - Sum the Appropriate Columns
I was hoping to use some sort of array-type operation to read through the column reference numbers above, but I could not get my head around it. Since there were just 4 entries to worry about, I simply added each reference manually. The important thing to note is that the INDEX function is applied over the 13 columns that span your 7 columns, so to convert the index number worked out above into something that grabs every second column, the number is multiplied by 2 and then 1 is subtracted. That means 1,2,3,4 for the first column combination becomes 1,3,5,7. You can see this in the following formula. Place it in the appropriate cell and copy down and to the right as needed.
AO5
=INDEX($AB5:$AN5,AO$1*2-1)+INDEX($AB5:$AN5,AO$2*2-1)+INDEX($AB5:$AN5,AO$3*2-1)+INDEX($AB5:$AN5,AO$4*2-1)
Pay careful attention to the $ signs, which lock the row or column references and prevent them from changing as the formula is copied.
You may need to adjust the cell references to match your sheet.
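The index conversion in the AO5 formula can be illustrated with a short Python sketch. It assumes, as the answer does, that the 7 wanted columns are every second column of the 13-column range AB:AN; the sample row values and the function name are invented for illustration.

```python
def combination_total(row_values, picks):
    """row_values: the 13 cells AB..AN of one row.
    picks: four column numbers in 1..7 taken from the helper rows.
    INDEX position 2*k-1 is 1-based, so the Python index is 2*k-2."""
    return sum(row_values[2 * k - 2] for k in picks)

# Hypothetical row: wanted values sit in positions 1, 3, 5, ..., 13
row = [10, 0, 20, 0, 30, 0, 40, 0, 50, 0, 60, 0, 70]

print(combination_total(row, (1, 2, 3, 4)))  # 100
print(combination_total(row, (4, 5, 6, 7)))  # 220
```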

Related

How to create a dynamic formula to find the average of a set of values for a given vector

I am trying to create a formula that gives me the average of the last 12 entries in a given dataset depending on the associated vector.
Let's make an example:
I have headers in F2, G2, H2 and I2: Date, Company1, Company2 and Company3 respectively. Then from row 3 to row 33 I have monthly dates starting from May 2016.
Date     Company1      Company2     Company3
May-16                 2,453,845
Jun-16                 13,099,823
Jul-16                 14,159,037
Aug-16   38,589,050    8,866,101
Sep-16   63,290,285    13,242,522
Oct-16   94,005,364    14,841,793
Nov-16   123,774,792   7,903,600    41,489,883
Dec-16   93,355,037    12,449,604   69,117,105
Jan-17   47,869,982    13,830,712   83,913,764
Feb-17   77,109,905    10,361,555   68,176,643
The goal is to create a formula that, when I drag it down, correctly calculates the average of the last 12 values for a given company.
So, for example, I would have, say in table "B2:C5":
Company1 76,856,345
Company2 11,120,859
Company3 65,674,349
And, if a new Company4 is added to the list, I just have to drag the formula down to calculate the average of the last 12 months for Company4.
So far, I have come up with this formula:
=AVERAGE(LOOKUP(LARGE(IF(ISNUMBER(G:G),ROW(G:G)),ROW(INDIRECT("1:"&MIN(12,COUNT(G:G))))),ROW(G:G),G:G ))
This formula correctly calculates the average of a given column, considering only the last 12 values. The last step would be to come up with a formula that includes all the columns and then calculates the average for the given company.
Thanks!
I recommend that you use a named range to define your data in columns G:I. When a company is added, just modify the named range's specs. I used the name Target. Of course, you can replace it with $G:$I if you feel so inclined but I would rather recommend reducing the number of rows in the range, which is easier to manage when it is named.
Use the formula below to extract the company names from the first row of Target into the first column of your averages table. This is to ensure that the names are spelled identically in both locations.
=INDEX(Target,1,ROW()-2)
The number 2 is the number of rows above the row containing the formula; it was copied here from cell M3. There, ROW()-2 produces the number 1, and it counts up sequentially as the formula is copied down.
Now I have the formula below in cell N3, copied down.
=SUM(INDEX(Target,0,MATCH($M3,INDEX(Target,1,0),0)))
Copied down three rows, this formula simply sums columns G, H, and I, one column per row.
In the final step I inserted the range definition established above, that is, the formula without the SUM() wrapper, into your existing formula.
=AVERAGE(LOOKUP(LARGE(IF(ISNUMBER(INDEX(Target,0,MATCH($M3,INDEX(Target,1,0),0))),ROW(INDEX(Target,0,MATCH($M3,INDEX(Target,1,0),0)))),ROW(INDIRECT("1:"&MIN(12,COUNT(INDEX(Target,0,MATCH($M3,INDEX(Target,1,0),0))))))),ROW(INDEX(Target,0,MATCH($M3,INDEX(Target,1,0),0))),INDEX(Target,0,MATCH($M3,INDEX(Target,1,0),0))))
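What the final formula computes can be sketched in Python for checking the expected results. This is only a model of the logic (look up the company's column by header, keep the numeric cells, average the last 12 of them), not a translation of the Excel formula; the dictionary layout and function name are assumptions, with blanks represented as None.

```python
def average_of_last(values, n=12):
    """Average the last n numeric entries, skipping blanks (None)."""
    numbers = [v for v in values if isinstance(v, (int, float))]
    tail = numbers[-n:]
    return sum(tail) / len(tail)

# Columns keyed by header, rows from May-16 to Feb-17 as in the question
target = {
    "Company1": [None, None, None, 38589050, 63290285, 94005364,
                 123774792, 93355037, 47869982, 77109905],
    "Company2": [2453845, 13099823, 14159037, 8866101, 13242522,
                 14841793, 7903600, 12449604, 13830712, 10361555],
    "Company3": [None, None, None, None, None, None,
                 41489883, 69117105, 83913764, 68176643],
}

for name, column in target.items():
    print(name, round(average_of_last(column)))
```

With the sample data this reproduces the three averages listed in the question.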

Excel - function to find the highest sum in a table using each row and column only once

I've got a table in excel with 10 rows and 10 columns.
The table contains 100 different values between 1 and 3.
I want to find the highest sum of 10 values using only 1 value from each row and 1 from each column.
Do you know a function that finds the highest sum? I've tried to do it manually, but there are too many combinations!
Hope it makes sense.
Thanks in advance:)
My solution builds on what I wrote in the comment: you first take the maximum value in the 10x10 array, then the maximum in the 9x9 array (excluding the row and column of the first maximum), and so on. I did not try to do everything in one formula; instead I added a few helper columns and quite a few helper rows (it is quick and dirty, but it works and is easy to audit and understand). You can always do this on a separate worksheet, which you could hide if needed.
The screenshot above goes from cell A1 till Y31.
The key formulas:
3.55 is the result of =MAX(B2:K11)
The first gray cell is =IFNA(MATCH($M12;B2:B11;0);""), and you drag this 9 cells to the right. This tries to find a match with the max result in each column of the table.
The 10 to the right of the 3.55 is =MATCH(TRUE;INDEX(ISNUMBER(P12:Y12);0);0), and gives the column number of the max value.
The 2 next to the 10 is =INDEX(P12:Y12;N12) and gives the row number of the max value.
The 1 in cell B12 is =IF(OR(B$1=$N12;$A12=$O12);0;1), and creates a 10x10 matrix with a row and column with zeroes where the previous max value was found.
Then you multiply this with the preceding matrix to create a new 10x10 matrix below: enter {=B2:K11*B12:K21} as an array formula (Ctrl+Shift+Enter) in B22:K31.
You then copy/paste rows 12 to 31 nine times below.
The 23.02 is the total =SUM($M$12:$M$211) of all 10 maximum values and is the result you are looking for. The 10 is just a check with =COUNT($M$12:$M$211).
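The procedure above is a greedy selection: take the largest remaining value, strike out its row and column, and repeat. A minimal Python sketch of that idea follows, next to a brute-force check over all permutations. Both function names are made up for illustration. A greedy pass is not guaranteed to reach the true maximum, and the 2x2 example shows a case where it falls short. The brute-force version is only practical for small tables.

```python
from itertools import permutations

def greedy_sum(matrix):
    """Repeatedly take the largest remaining value, then drop its row and column."""
    rows = list(range(len(matrix)))
    cols = list(range(len(matrix)))
    total = 0
    while rows:
        r, c = max(((r, c) for r in rows for c in cols),
                   key=lambda rc: matrix[rc[0]][rc[1]])
        total += matrix[r][c]
        rows.remove(r)
        cols.remove(c)
    return total

def exact_sum(matrix):
    """Try every assignment of one column per row (n! cases)."""
    n = len(matrix)
    return max(sum(matrix[r][p[r]] for r in range(n))
               for p in permutations(range(n)))

tricky = [[3, 2],
          [2, 0]]
print(greedy_sum(tricky), exact_sum(tricky))  # 3 4
```

For an exact answer on a 10x10 table, this is the classic assignment problem, which dedicated solvers handle directly.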

Excel: Merge two columns into one column with alternating values

How can I merge two columns of data into one, like the following:
Col1    Col2    Col3
========================
A       1       A
B       2       1
C       3       B
                2
                C
                3
You can use the following formula in column D as per my example. Keep in mind to increase the $A$1:$B$6 range according to your data.
=INDEX($A$1:$B$6,INT((ROWS(D$2:D2)-1)/2)+1,MOD(ROWS(D$2:D2)-1,2)+1)
Result:
Thank you to @Koby Douek for the answer. Just an addition: if you are using OpenOffice Calc, replace the commas with semicolons.
=INDEX($A$1:$B$6;INT((ROWS(D$2:D2)-1)/2)+1;MOD(ROWS(D$2:D2)-1;2)+1)
Expanding @Koby Douek's answer to more columns and explaining some of the terms.
Original formula for 2 columns to 1, alternating:
=INDEX($A$1:$B$6,INT((ROWS(D$2:D2)-1)/2)+1,MOD(ROWS(D$2:D2)-1,2)+1)
$A$1:$B$6 defines the columns and rows to source the final set of data from. The $ signs are only there to keep the formula from changing the selected columns and rows if it is copied and pasted or dragged.
To make it work on whatever values you put into the columns, without having to expand the range every time, change it to $A:$B (or A:B, so you can easily copy it to other sets of columns and create new merges). This will also return the 1st value in every column as one of the alternating values. If you have headers instead, you can use a large fixed range such as $A$1:$B$99999, or A$1:B$99999 if you want to paste and move the columns; which is better depends on the situation.
Let's assume you are fine with including the values in the 1st row.
This changes the formula to
=INDEX($A:$B,INT((ROWS(D$2:D2)-1)/2)+1,MOD(ROWS(D$2:D2)-1,2)+1)
Now on to D$2:D2.
This is the range used to calculate the difference between the current row the formula is in (D2) and the reference row (D$2). The important thing is to set the reference row number to the 1st row you will be putting values in. If the 1st row of the results column is a header, use the 2nd row as the reference; if your values in the combined column D begin on the 3rd row, the reference row would be D$3.
Since I like the more general form where the 1st row isn't a header row, I'll use D$1:D1. You could still merge source rows without headers into a combined column with a header of as many rows as you like, just by setting that reference row number to the 1st row where your values should begin.
This changes the formula to
=INDEX($A:$B,INT((ROWS(D$1:D1)-1)/2)+1,MOD(ROWS(D$1:D1)-1,2)+1)
Now INT((ROWS(D$1:D1)-1)/2)+1 and MOD(ROWS(D$1:D1)-1,2)+1.
INT returns an integer value, so any decimal places are dropped; it essentially rounds down to the nearest whole number.
MOD returns the remainder of a division. Its result will be a whole number between 0 and n-1, where n is the number we are dividing by (e.g. MOD(0,3)=0; MOD(1,3)=1; MOD(2,3)=2; MOD(3,3)=0; MOD(4,3)=1; etc.).
Now the -1)/2)+1 and -1,2)+1 parts.
The first value is again the difference between the current row and the reference row. ROWS(D$1:D1) is the count of the rows, which is 1 in the first cell, so we have to correct for the row count starting at 1 instead of 0, which would throw off our calculations. Both expressions therefore use -1 to reduce the row count by 1.
The /2 and ,2 are both there because we are dividing by 2. In the first expression it is a normal division by 2 (/2); in the modulus expression it is an argument of the MOD function (,2).
Finally, we need to add 1 using +1, because INDEX needs row and column numbers that begin at 1.
INT((ROWS(D$1:D1)-1)/2)+1 finds the row number to select the value from.
MOD(ROWS(D$1:D1)-1,2)+1 finds the column number to select the value from.
Thus we can change /2 and ,2 to /3 and ,3 to do this with 3 columns. The source range must then cover 3 columns as well, so $A:$B becomes $A:$C.
This yields:
=INDEX($A:$C,INT((ROWS(D$1:D1)-1)/3)+1,MOD(ROWS(D$1:D1)-1,3)+1)
Maybe that's the confusing way to look at it, but it's closer to how my mind works on it. Here is an alternative view:
=INDEX([RANGE],[ROW_#],[COLUMN_#]) returns the value from a range of rows and columns.
Using the example:
=INDEX($A:$C,INT((ROWS(D$1:D1)-1)/3)+1,MOD(ROWS(D$1:D1)-1,3)+1)
[RANGE] = $A:$C. This is the range of source columns.
[ROW_#] = INT((ROWS(D$1:D1)-1)/3)+1
INT([VALUE_A])+1 returns an integer value, so any decimal places are dropped, and then adds one to it. We add one because the result of the next steps will be 1 less than the value we need.
[VALUE_A] = (ROWS(D$1:D1)-1)/3
ROWS(D$1:D1) returns the number of rows in the range down to the current row in the results column. We use D$1 to designate the row where the values in the results column begin, and D1 is the current row in the results column, which gives us a range we can count the rows of. We subtract 1 from this value to get the difference between the starting row and the current row. This is then divided by 3 because we have three columns to look through in this example, so we only change rows when the result is divisible by 3. INT drops any decimal places, as mentioned, so the row number only increments when the value is cleanly divisible by 3.
[COLUMN_#] = MOD(ROWS(D$1:D1)-1,3)+1
MOD([VALUE],[DIVISOR])+1 returns the remainder of the value when divided by the divisor, plus one.
Using the example:
MOD(ROWS(D$1:D1)-1,3)+1
In this case we still divide by 3, but it's an argument to the MOD function. We still need to count the number of rows and subtract 1 before dividing. This returns 0, 1, or 2 for the column, but as above we are shifted backwards by 1 because column numbers begin at 1, so as before we must add 1.
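The INT/MOD index arithmetic explained above can be replayed in a few lines of Python. This is only an illustrative sketch with an invented function name; the counter k plays the role of ROWS(D$1:D1), and the source data is a list of rows.

```python
def interleave(rows, n_cols):
    """Walk k = 1, 2, 3, ... like ROWS(D$1:D1) and pick INDEX(range, row_no, col_no)."""
    out = []
    for k in range(1, len(rows) * n_cols + 1):
        row_no = (k - 1) // n_cols + 1   # INT((k-1)/n)+1
        col_no = (k - 1) % n_cols + 1    # MOD(k-1,n)+1
        out.append(rows[row_no - 1][col_no - 1])  # INDEX is 1-based
    return out

print(interleave([["A", 1], ["B", 2], ["C", 3]], 2))
# ['A', 1, 'B', 2, 'C', 3]
print(interleave([["A", 1, "x"], ["B", 2, "y"]], 3))
# ['A', 1, 'x', 'B', 2, 'y']
```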
And here is how to alternate columns A and D.
There are two different formulas, depending on whether you add the formula to an odd row or an even row.
https://1drv.ms/x/s!AncAhUkdErOkguUaToQkVkl5Qw-l_g?e=5d9gVM
Odd Start row
=INDEX($A$2:$D$9;ROUND(ROW(A1)/2;0);IF(MOD(ROW()-ROW($A$2);2)=1;4;1))
Even Start row
=INDEX($A$2:$D$9;ROUND(ROW(A1)/2;0);IF(MOD(ROW()-ROW($A$1);2)=1;4;1))
The cell shown as A1 in the picture is the cell directly above your first data cell.
If you want to place it on a different sheet, just add the sheet name:
=INDEX(MySheet!$A$2:$D$9;ROUND(ROW(MySheet!A1)/2;0);IF(MOD(ROW()-ROW(MySheet!$A$2);2)=1;4;1))
=INDEX(MySheet!$A$2:$D$9;ROUND(ROW(MySheet!A1)/2;0);IF(MOD(ROW()-ROW(MySheet!$A$1);2)=1;4;1))

Excel: Obtain a column by sorting another column's values

I need to automatically obtain a sorted column of values from another given column's values, like in the sample:
I have    I need (A unchanged, and also B obtained from A)
A         A    B
-----------------
1         1    0
0         0    0
3         3    1
8         8    3
0         0    8
I mean that if the values in A change, B should change accordingly...
Is that possible in MS Excel?
Here a sandbox and sample:
http://1drv.ms/1SkqMhS
If you put the formula =SMALL(A:A,ROW()) in B1 and copy down, then the cells in B will be linked to the cells in A in such a way that the numbers in B will be the numbers in A in sorted order. This won't be efficient for larger ranges but will work fine for small to medium size ranges.
If you want the numbers to start in a lower row, say B2 because you have a header in B1, adjust ROW() to something like ROW()-1.
A word of warning: use of ROW() can make a spreadsheet somewhat fragile, in that formulas that involve it can change their meaning if rows are inserted or deleted, or if the block containing the formula is moved somewhere else. Rather than using ROW(), there is something to be said for adding a helper column which numbers the data in A (the data would then be in e.g. B) and referring to these numbers rather than to ROW(). For example, in:
If I put the formula
=SMALL($B$2:$B$5,A2)
in C2 and copy down, it works as intended. In response to a question you raised in the comments, I added still another column which gives the index where the corresponding value occurs. To do this I wrote in D2 (then copied down) the formula
=MATCH(C2,$B$2:$B$5,0)
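For readers who want to check what SMALL and MATCH return here, the two functions can be modeled in Python. This is a small sketch with illustrative function names, using the sample values from the question; positions are 1-based as in Excel, and the exact-match lookup returns the first occurrence.

```python
def small(values, k):
    """k-th smallest value, like SMALL(range, k) with k starting at 1."""
    return sorted(values)[k - 1]

def match_exact(value, values):
    """1-based position of the first exact match, like MATCH(value, range, 0)."""
    return values.index(value) + 1

data = [1, 0, 3, 8, 0]
sorted_column = [small(data, k) for k in range(1, len(data) + 1)]
positions = [match_exact(v, data) for v in sorted_column]

print(sorted_column)  # [0, 0, 1, 3, 8]
print(positions)      # [2, 2, 1, 3, 4]
```

Note that duplicate values (the two zeros) both map to the position of the first occurrence.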
Of course. Highlight your range and in the Data tab, click "Sort", then you can choose how you want to sort your data:
If column B has information that is to be used with Column A (like next to A1 is "Car"), and you want to sort the whole table, based on Column A, then just select Columns A and B, then sort by column A.
Found the answer, thanks to John Coleman!
Just some minor details remained, like fixing cell references (with $, as in A$2) and the -1+ROW adjustment for the 1 header row!

Ranking in Excel with multiple criteria

For example, I need to create a merit list of a few students based on total marks (column C), then on higher marks in math (column B):
A B C D
-------------------------
Student1 80 220 1
Student2 88 180 3
Student3 90 180 2
Expected merit position is given in column D.
I can use the RANK function, but only for one column (total marks). If the total marks of multiple students are equal, I could not find any solution for this.
You can try this one in D1
=COUNTIF($C$1:$C$99,">"&C1)+1+SUMPRODUCT(--($C$1:$C$99=C1),--($B$1:$B$99>B1))
and then copy/fill down.
Let me know if this helps.
Explanation
Your first criterion sits in column C, and the second criterion sits in column B.
First, the formula counts the number of entries in $C$1:$C$99 that are bigger than the entry itself (C1). For the first one in the ranking you will get zero, so you need to add 1 to each result (+1).
Up to this point, you will get duplicate rankings if the same value occurs twice. Therefore you need to add another term that does some extra calculation based on the second criterion:
To resolve ties, you take the SUMPRODUCT of two arrays and add the result to the previous term. The goal is to count the entries that are equal to this entry ($C$1:$C$99=C1) and have a bigger value in the second criterion column ($B$1:$B$99>B1):
You add -- to convert TRUE and FALSE to 1s and 0s so that you can multiply them:
SUMPRODUCT(--($C$1:$C$99=C1),--($B$1:$B$99>B1))
The first array marks the entries that tie with this one on the first criterion, and the second array marks the entries with a bigger second-criterion value than the entry itself.
Note that you can add as many entries as you like to your columns, but remember to update the ranges in the formula. Currently it is set to 99 rows; you can extend it to as many rows as you want.
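The two parts of the formula (the COUNTIF term and the SUMPRODUCT tie-break term) can be modeled in Python to check the expected merit positions. This is a minimal sketch with an invented function name, using the three students from the question.

```python
def merit_rank(totals, maths):
    """Rank by total (higher is better), breaking ties by math marks."""
    ranks = []
    for t, m in zip(totals, maths):
        # COUNTIF($C$1:$C$99,">"&C1): entries with a strictly higher total
        higher_total = sum(1 for x in totals if x > t)
        # SUMPRODUCT(--(C=C1),--(B>B1)): same total but better math marks
        tie_better_math = sum(1 for x, y in zip(totals, maths)
                              if x == t and y > m)
        ranks.append(higher_total + 1 + tie_better_math)
    return ranks

totals = [220, 180, 180]   # column C
maths = [80, 88, 90]       # column B
print(merit_rank(totals, maths))  # [1, 3, 2]
```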
Sometimes a helper column will provide a quick and calculation-efficient solution. Adding the math marks to the total marks as a decimal fraction should produce a number that ranks according to your criteria, as long as the totals are whole numbers and the math marks stay below 1000. In an unused column to the right (column D here), use this formula in row 2:
=C2+B2/1000
Fill down as necessary. You can now use a conventional RANK function on this helper column like =RANK(D2, D$2:D$9) for your ranking ordinals.
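The helper-column idea can be tried out in Python as a quick check. This is a sketch under the assumption that totals are whole numbers and math marks are below 1000, so the fraction never spills into the integer part; the function names are illustrative, and the ranking mimics RANK in descending order.

```python
def helper_key(total, math):
    """Same as =C2+B2/1000: total marks with math marks as a decimal fraction."""
    return total + math / 1000

def rank_descending(value, values):
    """Like RANK(value, range): 1 + number of strictly larger entries."""
    return 1 + sum(1 for v in values if v > value)

totals = [220, 180, 180]
maths = [80, 88, 90]
keys = [helper_key(t, m) for t, m in zip(totals, maths)]
print([rank_descending(k, keys) for k in keys])  # [1, 3, 2]
```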
A very simple 'math' solution (or, at least, much simpler than the one in the best answer): do a linear combination with weights.
Do something like
weighted_marks = 1000*colC + colB
where the weight on colC (here 1000) must be larger than the largest possible value in colB, then rank the weighted marks using a simple RANK function.
This solves your problem by building the ranking you need.
If you don't like to limit the number of rows or the numbers used in the criteria, Jeeped's approach can be extended. You can use the following formulas in cells D2 to L2, assuming that there are three criteria, the first one in column A, the second one in column B, and the third one in column C:
=RANK($A2,$A:$A,1)
=RANK($B2,$B:$B,1)
=D2*2^27+E2
=RANK(F2,F:F,1)
=RANK($C2,$C:$C,1)
=G2*2^27+H2
=RANK(I2,I:I,1)
=J2*2^27-ROW()
=RANK(K2,K:K,0)
The formulas have to be copied down. The result is in column L. Ties are broken using the row number.
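The chain of nine formulas can be traced in Python to see how the intermediate ranks combine. The sketch below assumes that RANK with order 1 returns 1 plus the number of smaller entries and with order 0 returns 1 plus the number of larger entries; the function names and sample data are invented for illustration. List names follow the column letters D to L.

```python
def excel_rank(value, values, ascending):
    """Model of RANK(value, range, order): ties share the same rank."""
    if ascending:
        return 1 + sum(1 for v in values if v < value)
    return 1 + sum(1 for v in values if v > value)

def multi_rank(a, b, c, first_row=2):
    n = len(a)
    d = [excel_rank(x, a, True) for x in a]                   # =RANK($A2,$A:$A,1)
    e = [excel_rank(x, b, True) for x in b]                   # =RANK($B2,$B:$B,1)
    f = [d[r] * 2**27 + e[r] for r in range(n)]               # =D2*2^27+E2
    g = [excel_rank(x, f, True) for x in f]                   # =RANK(F2,F:F,1)
    h = [excel_rank(x, c, True) for x in c]                   # =RANK($C2,$C:$C,1)
    i = [g[r] * 2**27 + h[r] for r in range(n)]               # =G2*2^27+H2
    j = [excel_rank(x, i, True) for x in i]                   # =RANK(I2,I:I,1)
    k = [j[r] * 2**27 - (first_row + r) for r in range(n)]    # =J2*2^27-ROW()
    return [excel_rank(x, k, False) for x in k]               # =RANK(K2,K:K,0)

# Rows 2..5: the first two rows tie on all three criteria
print(multi_rank([5, 5, 3, 5], [2, 2, 9, 1], [7, 7, 0, 7]))  # [1, 2, 4, 3]
```

In the example, the first two rows are identical, so the earlier row gets the better rank, which matches the row-number tie-break.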
If you like to add a fourth criterion, you can do the following after having the formulas above in place:
Add the new criterion between columns C and D.
Insert three new columns between columns I and J.
Copy columns G:I to the new columns J:L.
Copy column G to column M, overwriting its content.
Change the formula in column L to point to the new criterion.
The factor 2^27 used in the formulas balances the precision of 53 bits available in double-precision numbers. This is enough to cover the row limit of current versions of Excel.

Resources