Excel SUMPRODUCT to find the % of a whole number - excel-formula

The Excel SUMPRODUCT function can be quite powerful in different scenarios, including:
Doing the product
Sum with condition
Division
Weighted average
Calculating the % given two sets of value
Finding text in a cell given a range of values
But what I couldn't work out is how to use it to find the contribution ratio (%) of two numbers.
Say something like:
A     B     Contribution %
414   24    5,80%
754   36    4,78%
            5,13%
Can I use the SUMPRODUCT to obtain the above Contribution (B/A) but across the whole set?
The formula =SUMPRODUCT(B1:B2;0,1/A1:B2) gives a value close to it, but not the correct one.
Perhaps I'm overlooking something, or simply asking for something impossible.
Thanks for any input.
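To make the arithmetic behind the question concrete, here is a minimal Python sketch (not an Excel formula) using the two rows from the table. It assumes the "whole set" contribution means the sum of B divided by the sum of A, which in Excel terms would be SUM over B divided by SUM over A. The variable names are illustrative only.

```python
# Values taken from the table in the question.
a = [414, 754]
b = [24, 36]

# Per-row contribution B/A.
row_ratios = [bi / ai for ai, bi in zip(a, b)]

# Contribution across the whole set: total of B over total of A.
overall = sum(b) / sum(a)

# The same number written as an average of the row ratios weighted by A,
# which is the shape a SUMPRODUCT-style calculation would take.
weighted = sum(r * ai for r, ai in zip(row_ratios, a)) / sum(a)

# A plain (unweighted) mean of the row ratios is a different quantity.
mean_of_ratios = sum(row_ratios) / len(row_ratios)

print(f"overall: {overall:.4%}, mean of ratios: {mean_of_ratios:.4%}")
```

The overall ratio comes out at roughly 5.1%, close to the last figure in the table, while the unweighted mean of the two row percentages is noticeably higher.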

Related

Excel: Add number before multiplying with PRODUCT(...)

I am calculating the geometric mean of a row in MS Excel by using the GEOMEAN(...) command.
What is the geometric mean: The row could be A1:A10. A geometric mean with
GEOMEAN(A1:A10)
is the product of all 10 cell values (multiplied together) after which the 10th root is taken (mathematically: nth_root(A_1 x A_2 x ... x A_n) ).
The issue: The command GEOMEAN(A1:A10) works fine as long as no cells contain negative values (actually just as long as the product ends up positive). If one cell has a negative value, then taking the root is mathematically an invalid action and Excel gives an error.
The solution: I can work around this by adding a large enough number such as 1000000 to each value before doing GEOMEAN(A1:A10) and afterwards subtracting 1000000 from the result. This is a mathematical approximation to the pure geometric mean.
The question: But how do I add +1000000 to each value in Excel? A solution would be to create a whole new extra row where the number is added, and then doing GEOMEAN on this row and subtracting the number from the result. But I would really like to avoid creating a new row, since I have many long data sets to perform this command on.
Is there a way to add the number inside the command itself? To add it onto each value before it is multiplied? Something along the lines of:
GEOMEAN(A1:A10+1000000)-1000000
Solution to avoid the work-around
Based on the answer from and discussion with @ImaginaryHuman072889
It turns out that a working command that avoids any work-around is:
IFERROR(GEOMEAN(A1:A10);-GEOMEAN(ABS(A1:A10)))
If an error is caught by IFERROR, then we know that a negative result would have appeared, so it is constructed manually in that case.
However, this does not cover the case mentioned by @ImaginaryHuman072889, because Excel seems to reject any negative input to GEOMEAN, not only the case where the product is negative. For example, both GEOMEAN(-2,-2) and GEOMEAN(-2,-2,-2) give errors in Excel, even though both should be mathematically valid, giving the results 2 and -2, respectively. To overcome this Excel issue, we can write out the same calculation manually:
IFERROR(PRODUCT(A1:A10)^(1/COUNTA(A1:A10));-(PRODUCT(ABS(A1:A10))^(1/COUNTA(A1:A10))))
I add this solution to aid any passers-by who have the same issue. This works mathematically, but the fact that -2 and -2 have the geometric mean 2 does seem a bit odd and not at all like a useful value of a "mean". It is still mathematically legal as far as I can find (WolframAlpha has no issue with it and the Wikipedia article never mentions a sign).
Your "workaround" of doing this:
GEOMEAN(A1:A10+1000000)-1000000
Is completely wrong. This is absolutely not equal to GEOMEAN(A1:A10).
Simple counter-example:
GEOMEAN({2,8}) returns the value of 4, which is the geometric mean of 2 and 8.
GEOMEAN({2,8}+1)-1 is equal to GEOMEAN({3,9})-1 which is approximately 4.196.
What is a valid workaround is if you multiply each value inside GEOMEAN by a certain value, then divide the result by that value.
Simple example:
GEOMEAN({2,8}*3)/3 is equal to GEOMEAN({6,24})/3 which is 4.
However, this method of multiplying by a constant does not help your situation, since this won't get rid of negative values.
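The two counter-examples above can be checked outside Excel. The following is a minimal Python sketch; `geomean` is a hypothetical helper written for this illustration (it is not Excel's GEOMEAN) and only handles positive inputs.

```python
import math

def geomean(values):
    """n-th root of the product of the values (positive inputs only)."""
    return math.prod(values) ** (1 / len(values))

data = [2, 8]

plain = geomean(data)                              # geometric mean of 2 and 8
shifted = geomean([v + 1 for v in data]) - 1       # add 1, then subtract 1
scaled = geomean([v * 3 for v in data]) / 3        # multiply by 3, then divide by 3

print(plain, shifted, scaled)
```

Shifting changes the result (about 4.196 instead of 4), while scaling by a positive constant leaves it unchanged, which matches the point made above.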
Mathematically speaking, the geometric mean of a positive number and a negative number is an imaginary number, which is presumably why Excel cannot handle it.
Example:
2*-8 = -16
sqrt(-16) = 4i
Therefore, 4i is the geometric mean of 2 and -8. Notice how it has the same magnitude as GEOMEAN({2,8}), just that it is an imaginary number.
All that said, here is what I recommend:
Return two results: one is the magnitude of the geometric mean and the other is the phase of the geometric mean.
Formula for magnitude:
= GEOMEAN(ABS(A1:A10))
(Note, this is an array formula, so you'd have to press Ctrl+Shift+Enter instead of just Enter after typing this formula.) The use of ABS converts all negative numbers to positive before the GEOMEAN calculation, guaranteeing a positive geometric mean.
Formula for phase, I would just do something like this:
= IF(PRODUCT(A1:A10)>=0,"Real","Imaginary")
Which obviously returns Real if the geometric mean is a real number and returns Imaginary if the geometric mean is an imaginary number.
EDIT
Technically speaking, some of what I said wasn't completely precise, although the magnitude formula above still stands.
Some things I want to clarify:
If PRODUCT(data) is positive (or zero), then the geometric mean of data is positive (or zero).
If PRODUCT(data) is negative and if the number of entries in data is odd, then the geometric mean of data is negative (but still real).
If PRODUCT(data) is negative and if the number of entries in data is even, then the geometric mean of data is imaginary.
That said... if you want these formulas to be a bit more technically accurate, I would modify to this:
Adjusted formula for magnitude:
= GEOMEAN(ABS(A1:A10))*IF(AND(PRODUCT(A1:A10)<0,MOD(COUNT(A1:A10),2)=1),-1,1)
Adjusted formula for phase:
= IF(AND(PRODUCT(A1:A10)<0,MOD(COUNT(A1:A10),2)=0),"Imaginary","Real")
If the geometric mean is real, it returns the precise geometric mean (whether it is positive or negative), and if the geometric mean is imaginary, it returns a positive real value with the correct magnitude.
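The three cases listed in the edit can be sketched in Python as follows. This is only an illustration of the case analysis, not a translation of the Excel formulas; the function name `signed_geomean` and its return shape are assumptions made for this example.

```python
import math

def signed_geomean(values):
    """Return (value, kind) following the three cases described above.

    kind is "Real" or "Imaginary". For the imaginary case only the
    (positive) magnitude is returned.
    """
    n = len(values)
    product = math.prod(values)
    magnitude = math.prod(abs(v) for v in values) ** (1 / n)

    if product >= 0:
        return magnitude, "Real"        # positive (or zero) product
    if n % 2 == 1:
        return -magnitude, "Real"       # negative product, odd count
    return magnitude, "Imaginary"       # negative product, even count

print(signed_geomean([-2, -2]))
print(signed_geomean([-2, -2, -2]))
print(signed_geomean([2, -8]))
```

For the inputs discussed in this thread, (-2, -2) is real with value 2, (-2, -2, -2) is real with value -2, and (2, -8) is imaginary with magnitude 4.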
So, I just found the answer, although I have no idea why this works.
Doing GEOMEAN(A1:A10+1000000)-1000000 is actually possible. But after pressing Enter, the error #VALUE! is displayed. You must press Ctrl+Shift+Enter to have the actual result displayed.
According to this: https://www.mrexcel.com/forum/excel-questions/264366-calculating-geometric-mean-some-negative-values.html
If anyone has an explanation for this, I am very interested.

Use If flexibly to get average

I want to calculate the average response time across multiple subjects with one restriction: only response times from correct trials should be included in the average. The structure of my data looks like below (for simplicity, I show only 3 subjects and 10 trials; in reality, I have many more).
I would like to get the average RT across subj1, subj2, and subj3 for each of the trials. Only correct trials are included in the average. 0 and 1 denote incorrect and correct trials, respectively. For instance, for cell G2, I would only include B2 and D2 in the average; F2 is left out since the ACC for that trial from that subject is 0. I imagined using IF with AND to include the appropriate RT, but with many subjects this becomes very clumsy. Does anyone have a clever solution to this?
Since 0 * anything = 0, G2 = SUM(A2*B2,C2*D2,E2*F2)/SUM(A2,C2,E2)
You can do this with AVERAGE and an array formula which can be easily extended to larger ranges, i.e.
=AVERAGE(IF((RIGHT(B$1:F$1,2)="RT")*(A2:E2=1),B2:F2))
confirm with CTRL+SHIFT+ENTER
...or even simpler with AVERAGEIFS like this
=AVERAGEIFS(B2:F2,A2:E2,1,B$1:F$1,"*RT")
note the “offset” ranges
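The logic shared by the answers above can be sketched in Python. The row layout (ACC, RT, ACC, RT, ... per subject) is inferred from the cell references in the question, and the numbers are made-up sample values, not the asker's data.

```python
def mean_correct_rt(row):
    """Average the RT values whose paired ACC flag is 1.

    row is assumed to alternate ACC, RT for each subject:
    [acc1, rt1, acc2, rt2, ...]. Returns None if no trial is correct.
    """
    acc = row[0::2]   # columns A, C, E in the question's layout
    rt = row[1::2]    # columns B, D, F in the question's layout
    correct = [r for a, r in zip(acc, rt) if a == 1]
    return sum(correct) / len(correct) if correct else None

def mean_correct_rt_by_product(row):
    """Same result via the 0 * anything = 0 trick from the first answer."""
    acc = row[0::2]
    rt = row[1::2]
    return sum(a * r for a, r in zip(acc, rt)) / sum(acc)

# Hypothetical trial: subjects 1 and 2 correct, subject 3 incorrect.
trial = [1, 520, 1, 480, 0, 610]
print(mean_correct_rt(trial), mean_correct_rt_by_product(trial))
```

Note that the product version divides by the number of correct trials, so it fails when no subject answered correctly, whereas the filtering version can return a placeholder instead.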

Optimization of a list in Excel with Variables

I have a list of 153 golfers with associated salaries and average scores.
I want to find the combination of 6 golfers that optimizes avg score and keeps salary under $50,000.
I've tried using Solver, but I am stuck! Can anyone help please? :)
Illustrating a solution that is pretty close to what @ErwinKalvelagen suggested.
Column A is the names of the 153 golfers
Column B is the golfers' salaries (generated by =RANDBETWEEN(50, 125)*100, filled down, then Copy/Paste Values)
Column C is the golfers' average scores (generated by =RANDBETWEEN(70, 85), filled down, then Copy/Paste Values)
Column D is a 0 or 1 to indicate if the golfer is included.
Cell F2 is the total salary, given by =SUMPRODUCT(B2:B154,D2:D154)
Cell G2 is the number of golfers, given by =SUM(D2:D154)
Cell H2 is the average score of the team, given by =SUMPRODUCT(C2:C154,D2:D154)/G2
The page looks like this, before setting up Solver ...
The Solver setup looks like this ...
According to the help, it says to use Evolutionary engine for non-smooth problems. In Options, I needed to increase the Maximum Time without improvement from 30 to 300 (60 may have been good enough).
It took a couple of minutes for it to complete. It reached the solution of 70 fairly quickly, but spent more time looking for a better answer.
And here are the six golfers it came up with.
Of the golfers with an average of 70, it could have found a lower salary.
In cell I2, I added the formula =F2+F2*(H2-70), which is essentially salary penalized by increases in average score above 70 ...
... and use the same Solver setup, except to minimize Cell I2 instead of H2 ...
and these are the golfers it chose ...
Again - it looks like there is still a better solution. It could have picked Name97 instead of Name96.
This is a simple optimization problem that can be solved using Excel Solver (just use the "Simplex LP" solver, which is somewhat of a misnomer as we will use it here to solve an integer programming, or MIP, problem).
You need one column with 153 binary (BIN) variables (Excel's limit is, I believe, 200). Make sure you add a constraint to set the values to binary. Let's call this column INCLUDE; Solver will fill it with 0 or 1 values. Sum these values, and add a constraint with SUMINCLUDE = 6. Then add a column with INCLUDE * SCORE. Sum this column and this is your objective (optimizing the average is the same as optimizing the sum). Then add a column with INCLUDE * SALARY and sum these. Add a constraint with SUMSALARY <= 50k. Press Solve and you are done.
I don't agree with claims that Excel will crash on this or that this does not fit within the limits of Excel's Solver. (I really tried this out.)
I prefer the simplex method over the evolutionary solver as the simplex solver is more suitable for this problem: it is faster (simplex takes less than 1 second) and provides optimal solutions (the evolutionary solver often gives suboptimal solutions).
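To show what "optimal" means for this selection problem, here is a minimal Python sketch that enumerates every team on a tiny, made-up data set. It is not what Solver does internally, and plain enumeration is only practical for small inputs (choosing 6 of 153 gives on the order of 10^10 teams). The golfer data, the function name, and the assumption that a lower score is better are all illustrative.

```python
from itertools import combinations

def best_team(golfers, team_size, salary_cap):
    """Return (total_score, team) with the lowest total score under the cap.

    golfers is a list of (name, salary, avg_score) tuples.
    Returns None if no team fits under the salary cap.
    """
    best = None
    for team in combinations(golfers, team_size):
        if sum(g[1] for g in team) > salary_cap:
            continue  # violates the salary constraint
        total_score = sum(g[2] for g in team)
        if best is None or total_score < best[0]:
            best = (total_score, team)
    return best

# Hypothetical data: (name, salary, average score).
sample = [
    ("G1", 9000, 70),
    ("G2", 8000, 71),
    ("G3", 6000, 72),
    ("G4", 5000, 75),
    ("G5", 7000, 73),
]

result = best_team(sample, team_size=2, salary_cap=15000)
print(result)
```

Minimizing the total score is equivalent to minimizing the average here because the team size is fixed, which is the same observation made in the answer above.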
If you want to solve this problem with Matlab a function to look at is intlinprog (Optimization Toolbox).
To be complete: this is the mathematical model we are solving here:
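The model itself is not reproduced in the text here, so the following is a reconstruction from the description above rather than the answerer's original notation. The symbols are assumptions: $x_i$ is the binary INCLUDE variable, $s_i$ the average score and $c_i$ the salary of golfer $i$, and minimization assumes a lower score is better.

```latex
\begin{aligned}
\min_{x} \quad & \sum_{i=1}^{153} s_i \, x_i \\
\text{subject to} \quad & \sum_{i=1}^{153} x_i = 6, \\
& \sum_{i=1}^{153} c_i \, x_i \le 50000, \\
& x_i \in \{0, 1\}, \quad i = 1, \dots, 153.
\end{aligned}
```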
Results with random data:
....

Generate N random numbers whose sum is a constant K - Excel

How can I generate those numbers in Excel?
I have to generate 8 random numbers whose sum is always 320. I need around 100 sets or so.
http://en.wikipedia.org/wiki/User:Skinnerd/Simplex_Point_Picking. Two methods are explained here.
Or any other way so I can do it in Excel.
You could use the RAND() function to generate N numbers (8 in your case) in column A.
Then, in column B you could use the following formulas: B1=A1/SUM(A:A)*320, B2=A2/SUM(A:A)*320 and so on (where 320 is the sum that you are interested in).
So you can just enter =RAND() in A1, then drag it down to A8. Then enter =A1/SUM(A:A)*320 in B1 and drag it to B8. B1:B8 now contains 8 random numbers that sum up to 320.
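The same normalization idea can be sketched in Python: draw N uniform random numbers, then rescale them so they add up to K. The function name and the seeded generator are choices made for this illustration only.

```python
import random

def random_parts(n, total, rng):
    """Return n random non-negative floats that sum to total."""
    raw = [rng.random() for _ in range(n)]     # like =RAND() in A1:A8
    scale = total / sum(raw)                   # like 320/SUM(A:A)
    return [r * scale for r in raw]            # like column B

rng = random.Random(0)  # seeded only so the example is repeatable
sets = [random_parts(8, 320, rng) for _ in range(100)]
print(sets[0], sum(sets[0]))
```

Because of floating-point rounding the sum of each set may differ from 320 in the last few digits, so compare with a tolerance rather than with exact equality.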
Sample output:
I'm a bit late to the game here, but FYI, if only integers are required then:
=LET(x_,RANDARRAY(8,1,1,1000000,1),y_,ROUND(x_*320/SUM(x_),0),y_)
is similar to the favourite solution above, but more parsimonious (a formula in a single cell produces the desired array, with no helper column). It also removes the insignificant decimal places, although you may need to allocate back the deficit or surplus caused by the occasional rounding error, which may yield a sum of 321 or 319. This could be done randomly using something like index(y_,randbetween(1,8))+320-sum(y_) in the formula above, or by resorting to a helper function.
Someone commented that the favourite solution above (and thus mine, since it stems from a similar approach) is not uniform. I'm not sure this was required; a uniform spread would impede the random nature (and is arguably far simpler, as you simply divide a sizeable range into distinct octiles and follow the same approach already laid out here). I'm not sure where or why the notion arose that a random spread should be mechanically forced into some type of non-random spread. Anyway, I may not have read the problem properly.
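The integer variant, including the step of allocating the rounding deficit or surplus back to a randomly chosen entry, can be sketched in Python. This mirrors the idea of the LET formula rather than reproducing it; the function name is hypothetical, and the adjusted entry could in rare cases end up slightly below the others.

```python
import random

def random_int_parts(n, total, rng):
    """Return n integers that sum exactly to total."""
    raw = [rng.randint(1, 1_000_000) for _ in range(n)]
    raw_sum = sum(raw)
    parts = [round(r * total / raw_sum) for r in raw]

    # Rounding can leave the sum a little above or below the target,
    # so push the difference onto one randomly chosen entry.
    idx = rng.randrange(n)
    parts[idx] += total - sum(parts)
    return parts

rng = random.Random(1)  # seeded only so the example is repeatable
int_sets = [random_int_parts(8, 320, rng) for _ in range(100)]
print(int_sets[0], sum(int_sets[0]))
```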

SUM the column in parts and it calculates different to the entire column SUM

If I sum a large column separately in individual sections, and sum those results, it adds up to be slightly different than if I sum the entire column at once.
The value is off by ~ 0.00000000001 - but my conditional formatting picks this up and it is different - despite the fact they are summing the same values.
The formatting of all cells is set to 'Number'.
I can't figure out why or how this would happen - does anyone have some idea? Has something like this happened to you before when working with accurate values?
I found this article on Microsoft's web site. It discusses limitations in Excel's arithmetic and possible ways to deal with them.
I can't imagine that your input numbers have 15 digits of precision, so probably the easiest solution is to round the results of your multiplications, divisions, etc. (which I assume are what produce the 15 decimal digits in the first place).
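The effect described in the question can be reproduced outside Excel. Python floats use the same IEEE 754 double-precision format as Excel, so this minimal sketch shows how summing in parts can differ from summing in one pass, and how rounding or a tolerance removes the discrepancy. The data is an arbitrary example, not the asker's column.

```python
import math

values = [0.1] * 10

whole = sum(values)                          # one pass over the column
parts = sum(values[:5]) + sum(values[5:])    # two sections, then added

print(whole == parts)                        # exact comparison is fragile
print(round(whole, 9) == round(parts, 9))    # compare after rounding
print(math.isclose(whole, parts))            # or compare with a tolerance
```

The two totals differ only in the last binary digits, which is the kind of tiny difference an exact-match conditional format will still flag.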

Resources