Sum of the greatest value in one column, plus the sum of the other values in another column - excel-formula

Consider the following sheet/table:
A B
1 90 71
2 40 25
3 60 16
4 110 13
5 87 82
I want to have a general formula in cell C1 that sums the greatest value in column A (which is 110), plus the sum of the other values in column B (which are 71, 25, 16 and 82). I would appreciate if the formula wasn't an array formula (as in requiring Ctrl + Shift + Enter). I don’t have Office 365, I have Excel 2019.
My attempt
Getting the greatest value in column A is easy, we use MAX(A1:A5).
So the formula I want in cell C1 should be something like:
=MAX(A1:A5) + SUM(array_of_values_to_be_summed)
Obtaining the values of the other rows in column B (what I called array_of_values_to_be_summed in the previous formula) is the hard part. I've read about using INDEX, MATCH, their combination, and obtaining arrays by using parenthesis and equal signs, and I've tried that, without success so far.
For example, I noticed that NOT((A1:A5 = MAX(A1:A5))) yields an array/list containing ones (or TRUEs) for the relative position of the rows to be summed, and containing a zero (or FALSE) for the relative position of the row to be omitted. Maybe this is useful, I couldn't find how.
Any ideas? Thanks.
Edit 1 (solution)
I managed to obtain what I wanted. I simply multiplied the array obtained with the NOT formula, by the range B1:B5. The final formula is:
=MAX(A1:A5) + SUM(NOT((A1:A5 = MAX(A1:A5))) * B1:B5)
Edit 2 (duplicate values)
I forgot to explain what the formula should do if there are duplicates in column A. In that case, the first term of my final formula (the term that has the MAX function) would be the one whose corresponding value in column B is smallest, and the value in column B of the other duplicates would be used in the second term (the one containing the SUM function).
For example, consider the following sheet/table:
A B
1 90 71
2 110 25
3 60 16
4 110 13
5 110 82
Based on the above table, the formula should yield 110 + (71 + 25 + 16 + 82) = 304.
Just to give context, the reason I want such a formula is because I’m writing a spreadsheet that automatically calculates the electric current rating of the short-circuit protective device of the feeder of a group of electric motors in a house or building or mall, as required by the article 430.62(A) of the US National Electrical Code. Column A is the current rating of the short-circuit protective device of the branch-circuit of each motors, and column B is the full-load current of each motor.

You can use this formula
=MAX(A1:A5)
+SUM(B1:B5)
-AGGREGATE(15,6,(B1:B5)/(A1:A5=MAX(A1:A5)),1)
Based on #Anupam Chand's hint for max-value-duplicates there could also be min-value-duplicates in column B for corresponding max-value-duplicates in column A. :) This formula would account for that
=SUM(B1:B5)
+(MAX(A1:A5)-AGGREGATE(15,6,(B1:B5)/(A1:A5=MAX(A1:A5)),1))
*SUMPRODUCT((A1:A5=MAX(A1:A5))*(B1:B5=AGGREGATE(15,6,(B1:B5)/(A1:A5=MAX(A1:A5)),1)))
Or with #Anupam Chand's shorter and better readable and overall better style :)
=SUM(B1:B5)
+(MAX(A1:A5)-MINIFS(B1:B5,A1:A5,MAX(A1:A5)))
*COUNTIFS(A1:A5,MAX(A1:A5),B1:B5,MINIFS(B1:B5,A1:A5,MAX(A1:A5)))
The explanation works for bot solutions:
The SUM-part just sums the whole list.
The second line gets the max-value for column A and the corresponding min-value of column B for the max-values in column A and adds or subtracts it respectively.
The third line counts, how many times the corresponding min-value for the max-value occurs and multiplies it with the second line.

Can you try this ?
=MAX(A1:A5)+SUM(B1:B5)-MINIFS(B1:B5,A1:A5,MAX(A1:A5))
What we're doing is adding the max of A to all rows of B and then subtracting the min value of B where A is the max.

If you have Excel 365 you can use the following LET-Formula
=LET(A,A1:A5,
B,B1:B5,
MaxA,MAX(A),
MinBExclude, MINIFS(B,A,MaxA),
sumB1,SUMPRODUCT(B*(A=MaxA)*(B<>MinBExclude)),
sumB2,SUMPRODUCT(B*(A<>MaxA)),
MaxA +sumB1+sumB2
A and B are shortcuts for the two ranges
MaxA returns the max value for A (110)
MinBExclude filters the values of column B by the MaxA-value (25, 13, 82) and returns the min-value of the filtered result (13)
sumB1 returns the sum of the other MaxA values from column B (26 + 82)
sumB2 returns the sum of the values from B where value in A <> MaxA (71 + 60)
and finally the result is returned
If you don't have Excel 365 you can add helper columns for MaxA, MinBExclude, sumB1 and sumB2 and the final result

Related

TRUE/FALSE ← VLOOKUP ← Identify the ROW! of the first negative value within a column

Firstly, we have an array of predetermined factors, ie. V-Z;
their attributes are 3, the first two (•xM) multiplied giving the 3rd.
f ... factors
• ... cap, the values in the data set may increase max
m ... fixed multiplier
p ... let's call it power
This is a separate, standalone array .. we'd access with eg. VLOOKUP
f • m pwr
V 1 9 9
W 2 8 16
X 3 7 21
Y 4 6 24
Z 5 5 25
—————————————————————————————————————————————
Then we have 6 columns, in which the actual data to be processed is in, & thereof derive the next-level result, based on the interaction of both samples introduced.
In addition, there are added two columns, for balance & profit.
Here's a short, 6-row data sample:
f • m bal profit
V 2 3 377 1
Y 2 3 156 7
Y 1 1 122 0
X 1 2 -27 2
Z 3 3 223 3
—————————————————————————————————————————————
Ultimately, starting at the end, we are comparing IF -27 inverted → so 27 is within the X's power range ie. 21 (as per the first sample) .. which is then fed into a bigger formula, beyond the scope of this post.
This can be done with VLOOKUP, all fine by now.
—————————————————————————————————————————————
To get to that .. for the working example, we are focusing coincidentally on row5, since that's the one with the first negative value in the 'balance' column, so ..
on factorX = which factor exactly is to us unknown &
balance -27 = which we have to locate amongst potentially dozens to hundreds of rows.
Why!?
Once we know that the factor is X, based on the * & multiplier pertaining to it, then we also know which 'power' (top array) to compare -27, as the identified first negative value in the balance column, to.
Is that clear?
I'd like to know the formula on how to achieve that, & (get to) move on with the broader-scope work.
—————————————————————————————————————————————
The main issue for me is not knowing how to identify the first negative or row -27 pertains to, then having that piece of information how to leverage it to get the X or identify the factor type, especially since its positioned left of the latter & to the best of my knowledge I cannot use negative column index number (so, latter even if possible is out of the question anyway).
To recap;
IF(21>27) = IF(-21<-27)
27 → LOCATE ROW with the first negative number (-27)
21 → IDENTIFY the FACTOR TYPE, same row as (-27)
→ VLOOKUP pwr, based on factor type identified (top array, 4th column right)
→ invert either 21 to a negative number or (-27) to the positive number
= TRUE/FALSE
Guessing your columns I'll say your first chart is in columns A to D, and the second in columns G to K
You could find the letter of that factor with something like this:
=INDEX(G:G,XMATCH(TRUE,INDEX(J:J<0)))
INDEX(J:J<0) converts that column to TRUE and FALSE depending on being negative or not and with XMATCH you find the first TRUE. You could then use that in VLOOKUP:
=VLOOKUP(INDEX(G:G,XMATCH(TRUE,INDEX(J:J<0))),A:D,4,0)
That would return the 21. You can use the first concept too to find the the -27 and with ABS have its "positive value"
=VLOOKUP(INDEX(G:G,XMATCH(TRUE,INDEX(J:J<0))),A:D,4,0) > INDEX(J:J,XMATCH(TRUE,INDEX(J:J<0)))
That should return true or false in the comparison

Presenting a value based on number or text in cell

I have a list of 4 values in Sheet1 and 4 values in Sheet2.
In Sheet3 I will combine a random selection of these numbers and return the value in a column. (edit: no random selection from Excel, its a part picked from a bucket)
(A fifth column in Sheet3 will be used to do calculations with ValueS1 and ValueS2)
Sheet1
NumberS1
ValueS1
1
17.10
2
17.20
3
17.12
4
17.15
Sheet2
NumberS2
ValueS2
1
16.10
2
16.20
3
16.12
4
16.15
Sheet3
NumberS1
NumberS2
ValueS1
ValueS2
1
3
17.10
16.12
2
2
17.20
16.20
4
1
17.15
16.10
3
4
17.12
16.15
What kind of function can give the desired return?
I have looked into examples using "Indirect" but cannot see how they will solve my problem.
for the randomization: =ROUNDUP(RAND()*4,0)
rand() gives you a number between 0 and 1, so rand()*4 gives you a number between 0 and 4.
roundup(x,y) round up the number x with y digits you want to round the number up to (in our case 0).
for import the right number from sheet 1 or 2: =VLOOKUP(A1,Sheet1!A1:B2,2,0)
A1 - The value you look for in sheet 1 or 2.
Sheet1!A1:B4 - The array he look for your value on the firs column, always on the first column.
2 - The column you want to import the value from. (because we write an array of tow columns. we can write here only 1 or 2)
0 - it's an Optionally index (0 or 1). o is if you want an exact match of the return value.
Regular Lookup could do:
=LOOKUP(A2:A5,Sheet1!A2:A5,Sheet1!B2:B5) in Sheet3!C2
And
=LOOKUP(B2:B5,Sheet2!A2:A5,Sheet2!B2:B5) in Sheet3!D2
Note that LOOKUP will give the result to the closest match smaller than the search value.
Or VLOOKUP:
=VLOOKUP(A2:A5,Sheet1!A2:B5,2,0) / =VLOOKUP(B2:B5,Sheet2!A2:B5,2,0)
VLOOKUP will error if the value is not found (the way used above). It uses arguments like this:
=VLOOKUP(What you want to look up, where you want to look for it, the column number in the range containing the value to return, return an Approximate or Exact match – indicated as 1/TRUE, or 0/FALSE)
Office 365 has XLOOKUP which combines the logic of the two above and some more:
=XLOOKUP(A2:A5,Sheet1!A2:A5,Sheet1!B2:B5,"value not found",0)
XLOOKUP uses the following arguments:
=XLOOKUP(lookup_value, lookup_array, return_array, [if_not_found], [match_mode], [search_mode])

Using Offset to get to the next instance of a variable?

I am attempting to find the next instance of a variable in order to generate a list base on another variable:
Mkt ID
10 908
15 915
15 416
25 312
25 215
32 482
Similar to the above. There are two drop downs, one for market and one for ID. I want the user to be able to select a market and in the ID drop down have the data validation filter to that list of IDs respective to the market in the first drop down.Let's say the market dropdown is $G$2. Market is Column A, and ID is column B.
Here's the formula I have so far:
OFFSET(ADDRESS(MATCH($G$2,A:A,0),1),0,1,COUNTIFS(A:A,$G$2),1)
This formula references the market, offsets by 0 rows and 1 column, counts the number of that market instance for height, and 1 row in width. I do not see why this is not working. Excel just gives the typical, are you really trying to type a formula? error code.
ADDRESS returns a string that looks like a cell reference. You need INDIRECT to turn that into a real cell reference that OFFSET can use.
=OFFSET(indirect(ADDRESS(MATCH($G$2, A:A, 0), 1)), 0, 1, COUNTIFS(A:A, $G$2), 1)

Nested array formulas

I want a summation of rates. Let me explain it: I would like to sum the numbers in column D that matches 2 conditions (green rows in excel). First one: column F equal to "closed". Second one: column C equal to those numbers which in turn matches the following condition: column F equal to "Partial Sold". At the same time, EACH number in the previous summation might be divided by the number of column D which matches "Partial Sold" in column F.
There is the example with the table/figure I attached: (4510 / 9820) + (6500 / 9820) + (9100 / 15400) + (2388 / 2995) + (12400 / 9820) + (2904 / 5855).
My cells would be: (D69 / D66) + (D70 / D66) + (D76 / D74) + (D82 / D78) + (D83 / D66) + (D84 / D72).
#Jeeped with your cells would be: (D6 / D3) + (D7 / D3) + (D13 / D11) + (D19 / D15) + (D20 / D3) + (D21 / D9)
.. C D E F
65 # Total Side Condition
66 1 9820 Buy Partial Sold
67 2 3850 Buy Closed
68 3 7151 Buy Partial Sold
69 1 4510 Sell Closed
70 1 6500 Sell Closed
71 4 8180 Buy Open
72 5 5855 Buy Partial Sold
73 6 2553 Buy Open
74 7 15400 Buy Partial Sold
75 2 4600 Sell Closed
76 7 9100 Sell Closed
77 8 7531 Buy Open
78 9 2995 Buy Partial Sold
79 3 3000 Sell Closed
80 10 8691 Buy Open
81 3 2500 Sell Closed
82 9 2388 Sell Closed
83 1 12400 Sell Closed
84 5 2904 Sell Closed
85 11 3848 Buy Open
86 12 7745 Buy Open
To do it in one step with an array formula you could use:
=SUM(IFERROR((D66:D86*(F66:F86="Closed"))/((C66:C86=TRANSPOSE(C66:C86))*TRANSPOSE(D66:D86*(F66:F86="Partial Sold"))),0))
This is an array formula and must be confirmed with Ctrl+Shift+Enter↵.
it will generate a 2D-array holding the original values for closed as rows and divides this 1D array by this:
buils up 2D-array by col C = transposed col C
multiply each row by col D
set all items in each row to 0 if not "Partial Sold"
for each div by 0 the IFERROR will set it to 0
and this all in the SUM will give you your output
I too would recommend using a helper column. You don't need to use array formulas to get to the answer though. You can use the following in the next available column:
=IF(F66="Closed",IFERROR(D66/SUMIFS($D$66:$D$86,$F$66:$F$86,"Partial Sold",$C$66:$C$86,C66),0),0)
This will return values for everything that matches your criteria, and zeroes for everything else. Then you can just take the sum of this helper column as your final summation of rates.
If you really don't want to use a helper column, you can wrap the helper column formula in a SUM and swap out the individual cell references for arrays (i.e. swap F66 for $F$66:$F$86, and so on), then enter it as an array formula with Ctrl+Shift+Enter↵. The whole thing would look like this:
=SUM(IF($F$66:$F$86="Closed",IFERROR($D$66:$D$86/SUMIFS($D$66:$D$86,$F$66:$F$86,"Partial Sold",$C$66:$C$86,$C$66:$C$86),0),0))
I do not see this being done without a helper column. In an unused column to the right of F66, put this array¹ formula.
=IF(AND(OR(C66=INDEX(C$66:C$86*(F$66:F$86="Partial Sold"), , )), F66="Closed"), D66/INDEX(D$66:D$86, AGGREGATE(15, 6, ROW($1:$21)/((C$66:C$86=C66)*(F$66:F$86="Partial Sold")), 1)), "")
Fill down as necessary. The result will be the sum of those 'helper' numbers.
        
Even if this could be done in a single formula, the calculation overhead would likely be prohibitive. Splitting a portion of the array calculations off to a helper column that can directly reference the value in column C for another lookup reduces this significantly.
¹ Array formulas need to be finalized with Ctrl+Shift+Enter↵. Once entered into the first cell correctly, they can be filled or copied down or right just like any other formula. Try and reduce your full-column references to ranges more closely representing the extents of your actual data. Array formulas chew up calculation cycles logarithmically so it is good practise to narrow the referenced ranges to a minimum. See Guidelines and examples of array formulas for more information.

Extracting the upper quartile data from an array and placing it in new column

Hi and thanks for any help with this in advance
Below is a hypothetical data set; abundance = count data; mud% = the mud content in which the animals were found; mud bin = bins i've made up depending on the mud%; and UQ = upper quartile of the abundance data from its corresponding mud bin (i.e. the upper quartile for the abundance data in mud bin 1 is 17.25 etc).
Problem:
In excel, for abundance data in each of the four mud bins, I'm wanting to extract any values in the abundance column that are >= the upper quartile value for that particular mud bin and place these in a new column on the same sheet (with no gaps between rows from values that didn't meet the criteria) along with their corresponding mud% value in the neighboring cell. I've added the new columns to the below sheet to give you an idea of what I'm after.
abundance | mud% | mud bin | UQ | | New column | Mud% |
18 10.9 1 18 10.9(mud bid 1)
15 6.5 1 44 38.9(mud bin 1)
6 13.4 1 45 38 (mud bin 2)
13 42.1 1 37 37.8(mud bin 2)
15 36.4 1 etc
44 38.9 1 17.25 etc
22 46 2
30 36.4 2
45 38 2
29 35.3 2
37 37.8 2
29 41.8 2 35.25
11 44.4 3
17 47.8 3
21 40.7 3
15 13.9 3
35 13.9 3
14 13.9 3
15 13.9 3 19
19 12 4
14 12 4
10 12 4
12 12 4
14 12 4
13 12 4
45 9.525 4
66 9.525 4
78 9.525 4 45
The reality is I have a rather large dataset containing abundance data for a number of species, all on the same excel sheet and would greatly appreciate any insight into how I might achieve this in the most efficient manor.
For starters, to make this explanation simpler, I will assume that the last row of data is in row 100.
Populate Upper Quartile values for all line items
First you'll need to use the Quartile formula; however, since you want to find the upper quartile within a bin, you'll have to use an array formula. Put this formula in your UQ column (place in cell D2 and drag down). When entering the formula Be sure to press Ctrl+Shift before pressing Enter
=QUARTILE(IF($C$2:$C$100=C2,$A$2:$A$100,""),3)
The first part of this formula, $C$2:$C$100=C2 is your condition. Everywhere this condition is met, you will get the corresponding value in $A$2:$A$100; otherwise, you'll get a blank value. This will give you an array of abundance values that matches the indicated mudbin, C2. now that you have your subset of data, the quartile function will give you the value in the 3rd quartile (17.25 for mudbin 1, which will be placed next to every row that has a mudbin of 1).
Now that we have all the quartiles, we can get all the abundance values that are greater than the UQ for that mudbin. This is done in two parts
Get abundance values greater than mudbin UQ
First, you need to select one column of cells that has the same number of rows as your data (for example, select cells F2:F100)
Enter the following formula into the formula bar (while F2:F100 are highlighted) and press Ctrl+Shift, then enter
=IF($A$2:$A$100>$D$2:$D$100,$A$2:$A$100,"")
Similar to the IF statement used before, this formula finds all the abundance values that are greater than their corresponding UQ value. Now column F will have an abundance number where it is greater than it's UQ value, and a blank where it is not. Now onto the final step.
Populate abundance values that are greater than the UQ value, without the blanks
Select G2:G100 (your "New Column" in your sample data)
Enter the following formula into the formula bar (while G2:G100 are highlighted) and press Ctrl+Shift, then enter
=INDEX(F2:F100,SMALL(IF(F2:F100<>"",ROW(F2:F100)-1),ROW()-ROW($F$1)))
Looking at the IF statement again, this will find every value in F2:F100 that is not blank, but instead of grabbing the values, we'll keep track of the row number of that non blank value (done by ROW(F2:F100)-1
). Now that we have the row numbers of all the non blank values, we can grab the non-blank values in order and populate them in G2:G100. ROW()-ROW($F$1) is a counter, and SMALL will use the counter to determine the nth smallest number to return. Once we have our row number of the non blank value, INDEX returns that value
Finally, to populate the Mud%, you'll need to use the row number of the non blank values to get the mud% and the mud bin (You have the formula already to get the row number of the non blank value).
It's not a simple answer, but at least you won't have to use VBA.

Resources