I am calculating the geometric mean of a row in MS Excel by using the GEOMEAN(...) command.
What is the geometric mean: The row could be A1:A10. A geometric mean with
GEOMEAN(A1:A10)
is the product of all 10 cell values (multiplied together) after which the 10th root is taken (mathematically: nth_root(A_1 x A_2 x ... x A_n) ).
The issue: The command GEOMEAN(A1:A10) works fine as long as no cells contain negative values (actually just as long as the product ends up positive). If one cell has a negative value, then taking the root is mathematically an invalid action and Excel gives an error.
The solution: I can work around this by adding a large enough number such as +1000000 to each value before doing GEOMEAN(A1:A10) and afterwards subtracting 1000000 from the result. This is a mathematical approximation to the pure geometric mean.
The question: But how do I add +1000000 to each value in Excel? A solution would be to create a whole new extra row where the number is added, and then doing GEOMEAN on this row and subtracting the number from the result. But I would really like to avoid creating a new row, since I have many long data sets to perform this command on.
Is there a way to add the number inside the command itself? To add it onto each value before it is multiplied? Something along the lines of:
GEOMEAN(A1:A10+1000000)-1000000
Solution to avoid the work-around
Based on the answer from and discussion with #ImaginaryHuman072889
It turns out that a working command that avoids any work-around is:
IFERROR(GEOMEAN(A1:A10);-GEOMEAN(ABS(A1:A10)))
If an error is caught by the IFERROR, then we know that a negative result would have appeared, so this is constructed manually in that case.
BUT: This does not take into account the case mentioned by #ImaginaryHuman072889, though, because Excel seems to forbid any negative numbers involved and not just if the inner product is negative. For example, both GEOMEAN(-2,-2) as well as GEOMEAN(-2,-2,-2) give errors in Excel, even though they both should be mathematically valid, giving the results 2 and -2, respectively. To overcome this Excel-issue, we can simply write out the exact same command line manually:
IFERROR(PRODUCT(A1:A10)^(1/COUNTA(A1:A10));-(PRODUCT(ABS(A1:A10))^(1/COUNTA(A1:A10))))
I add this solution to aid any passers-by who have the same issue. This mathematically works, but the fact that -2 and -2 have the geometric mean 2 does seem a bit odd and not at all like a useful value of a "mean". It is still mathematically legal as far as I can find (WolframAlpha has no issue with it and the Wikipedia article never mentions a sign).
Your "workaround" of doing this:
GEOMEAN(A1:A10+1000000)-1000000
Is completely wrong. This is absolutely not equal to GEOMEAN(A1:A10).
Simple counter-example:
GEOMEAN({2,8}) returns the value of 4, which is the geometric mean of 2 and 8.
GEOMEAN({2,8}+1)-1 is equal to GEOMEAN({3,9})-1 which is approximately 4.196.
What is a valid workaround is if you multiply each value inside GEOMEAN by a certain value, then divide the result by that value.
Simple example:
GEOMEAN({2,8}*3)/3 is equal to GEOMEAN({6,24})/3 which is 4.
However, this method of multiplying by a constant does not help your situation, since this won't get rid of negative values.
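The two counter-examples above can be checked numerically. This is only a minimal Python sketch, not Excel; the helper name geomean is my own and it assumes all inputs are positive.

```python
import math

def geomean(values):
    """Geometric mean of positive numbers: n-th root of their product."""
    return math.prod(values) ** (1.0 / len(values))

data = [2, 8]

plain = geomean(data)                          # sqrt(2 * 8)
shifted = geomean([v + 1 for v in data]) - 1   # add 1 to each value, subtract 1 after
scaled = geomean([v * 3 for v in data]) / 3    # multiply each value by 3, divide after

print(plain, shifted, scaled)
```

The shifted result differs from the plain one, while the scaled result agrees with it.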
Mathematically speaking, the geometric mean of a positive number and a negative number is an imaginary number, which is presumably why Excel cannot handle it.
Example:
2*-8 = -16
sqrt(-16) = 4i
Therefore, 4i is the geometric mean of 2 and -8. Notice how it has the same magnitude as GEOMEAN({2,8}), just that it is an imaginary number.
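For anyone who wants to verify the 2 and -8 example outside Excel, here is a small Python sketch using the standard cmath module (Python writes the imaginary unit as j rather than i):

```python
import cmath

product = 2 * -8                # -16
root = cmath.sqrt(product)      # principal square root, returned as a complex number

print(root)         # a purely imaginary value
print(abs(root))    # its magnitude, equal to the geometric mean of 2 and 8
```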
All that said... here is what I recommend doing:
I suggest you return two results, one result is the magnitude of the geometric mean and the other is the phase of the geometric mean.
Formula for magnitude:
= GEOMEAN(ABS(A1:A10))
(Note, this is an array formula, so you'd have to press Ctrl+Shift+Enter instead of just Enter after typing this formula.) The use of ABS converts all negative numbers to positive before the GEOMEAN calculation, guaranteeing a positive geometric mean.
Formula for phase, I would just do something like this:
= IF(PRODUCT(A1:A10)>=0,"Real","Imaginary")
Which obviously returns Real if the geometric mean is a real number and returns Imaginary if the geometric mean is an imaginary number.
EDIT
Technically speaking, some of what I said wasn't completely precise, although the magnitude formula above still stands.
Some things I want to clarify:
If PRODUCT(data) is positive (or zero), then the geometric mean of data is positive (or zero).
If PRODUCT(data) is negative and if the number of entries in data is odd, then the geometric mean of data is negative (but still real).
If PRODUCT(data) is negative and if the number of entries in data is even, then the geometric mean of data is imaginary.
That said... if you want these formulas to be a bit more technically accurate, I would modify to this:
Adjusted formula for magnitude:
= GEOMEAN(ABS(A1:A10))*IF(AND(PRODUCT(A1:A10)<0,MOD(COUNT(A1:A10),2)=1),-1,1)
Adjusted formula for phase:
= IF(AND(PRODUCT(A1:A10)<0,MOD(COUNT(A1:A10),2)=0),"Imaginary","Real")
If the geometric mean is real, it returns the precise geometric mean (whether it is positive or negative), and if the geometric mean is imaginary, it returns a positive real value with the correct magnitude.
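To make the three cases from the EDIT concrete, here is a rough Python sketch of the same logic as the adjusted formulas. The function name signed_geomean and the (value, kind) return shape are my own choices for illustration, not anything Excel provides.

```python
import math

def signed_geomean(values):
    """Return (value, kind), where kind is "Real" or "Imaginary".

    For the imaginary case only the (positive) magnitude is returned,
    mirroring the adjusted magnitude and phase formulas.
    """
    n = len(values)
    magnitude = math.prod(abs(v) for v in values) ** (1.0 / n)
    product_negative = math.prod(values) < 0
    if product_negative and n % 2 == 0:
        return magnitude, "Imaginary"     # even count, negative product
    if product_negative:
        return -magnitude, "Real"         # odd count, negative product
    return magnitude, "Real"              # product is positive or zero

print(signed_geomean([-2, -2]))       # positive real
print(signed_geomean([-2, -2, -2]))   # negative real
print(signed_geomean([2, -8]))        # imaginary, magnitude only
```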
So, I just found the answer - although I have no idea why this works.
Doing GEOMEAN(A1:A10+1000000)-1000000 is actually possible. But pressing Enter displays a #VALUE! error. You must press Ctrl+Shift+Enter to have the actual result displayed.
According to this: https://www.mrexcel.com/forum/excel-questions/264366-calculating-geometric-mean-some-negative-values.html
If anyone has an explanation for this, I am very interested.
Related
This is a little bit complicated to describe, but I will try my best. I have a total, let's say 1000. I want to split it by percentages, and the number of positions varies each time, so there can be 3 or 70 or 130 positions or whatever. The sum of the split amounts should then equal the target value.
Here is an example of the case:
I input names under Customer request
I enter percentage for position under Percentage
In the amount calculation I use =CEILING($C$5*C10;10), and the same in all the other cells, to make the numbers look nice. It works fine, but the problem is that the totals no longer match. It should end up at 15550, but the total calculated after the split is 15660.
Are there any ideas for what kind of clever formula could do the trick: produce nice-looking numbers while making Total (calculated) match Total (target) in the end, given that the percentages add up to 100%?
P.S. Any ideas are welcomed as well. The target is to have nice looking, rounded numbers that will sum in the same number as target - total.
Since you are using CEILING, your output number (e.g. 15660) is guaranteed to be greater than or equal to your input number (e.g. 15550). This is because any time a "perfect match" isn't found, it rounds up.
My first suggestion is to use ROUND instead of CEILING. Right off the bat this will perform better than CEILING because ROUND can round up or down but CEILING can only round up.
E.g. try this:
= ROUND($C$5*C10,-1)
Since you provide no details as to "how" the data needs to be adjusted to meet your input value, I can't really provide any automatic solution.
One manual solution is that you can make a new column which indicates whether the data was rounded up or rounded down, and you can adjust the percentages manually to get the data you're looking for.
Here's a formula to tell you if the data is rounded up or down (e.g. put formula in cell E10 and drag down):
= CHOOSE(SIGN(D10-($C$5*C10))+2,"Round Down","Perfect Match","Round Up")
You can use this information to manually tweak your percentages. For example... if your output value is too high, you can slightly decrease some of the higher percentages that "Round Up" and slightly increase some of the lower percentages (e.g. if you have 10% and 3%, maybe change them to 10.1% and 2.9% to see if that makes a difference.)
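As a rough illustration of the CEILING versus ROUND difference and of the "Round Up" / "Round Down" labels, here is a Python sketch. The four percentages are hypothetical (the question does not list its actual values), the helper names are mine, and exact fractions are used so that floating-point noise does not affect the comparison. It assumes positive amounts only.

```python
import math
from fractions import Fraction

TOTAL = 15550                    # target total from the question
SHARES_PCT = [21, 23, 25, 31]    # hypothetical percentages, summing to 100

def ceiling_to(x, step=10):
    """Round up to a multiple of step (like CEILING for positive inputs)."""
    return math.ceil(x / step) * step

def round_to(x, step=10):
    """Round to the nearest multiple of step, halves up (like ROUND(x,-1) for positive inputs)."""
    return math.floor(x / step + Fraction(1, 2)) * step

def direction(rounded, exact):
    """Same three labels as the CHOOSE(SIGN(...)) formula."""
    if rounded > exact:
        return "Round Up"
    if rounded < exact:
        return "Round Down"
    return "Perfect Match"

exact = [Fraction(TOTAL * p, 100) for p in SHARES_PCT]
ceiled = [ceiling_to(x) for x in exact]
rounded = [round_to(x) for x in exact]
labels = [direction(r, x) for r, x in zip(rounded, exact)]

print(sum(ceiled), sum(rounded))
print(labels)
```

With these particular percentages the rounded total is closer to the target than the ceiling total but still does not hit it exactly, which is why some adjustment of the shares may still be needed.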
I want to generate a single column of 6000 numbers with a normal distribution, with a mean of 30.15, standard deviation of 49.8, minimum of -11.5, maximum 133.5.
I am a total newbie at this, so I tried to use the following formula in a cell and then just drag it down to cell 6000:
=NORMINV(RANDBETWEEN(-11.5,133.5)/100,30.15,49.8)
It returns a value but sometimes it returns #NUM! error. Thank you!
Unfortunately NORMINV expects a probability for the argument, which must be a value in the interval (0, 1). Any parameter outside that range will yield #NUM!.
What you're asking cannot be done directly with a normal distribution since that has no constraints on the minimum and maximum values.
One approach is to use a primary column to generate the normally distributed numbers, then filter out the ones you want in the adjacent column. But this will cause even the mean (let alone higher moments) to go off quite considerably due to your minimum and maximum values not being equidistant from the mean. You could get round this by recentering the distribution and adjusting afterwards.
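The "generate, then filter" approach can be sketched in Python as follows. This is only an illustration of the idea under my own assumptions (the function name, the fixed seed, and drawing until 6000 values have been kept are all my choices), not an Excel recipe.

```python
import random
import statistics

MEAN, SD = 30.15, 49.8
LOW, HIGH = -11.5, 133.5
COUNT = 6000

def truncated_normal_sample(count, mean, sd, low, high, seed=0):
    """Draw normal values and keep only those inside [low, high]."""
    rng = random.Random(seed)
    kept = []
    while len(kept) < count:
        x = rng.gauss(mean, sd)
        if low <= x <= high:
            kept.append(x)
    return kept

sample = truncated_normal_sample(COUNT, MEAN, SD, LOW, HIGH)

# The filtered sample no longer has the requested mean and standard deviation.
print(statistics.mean(sample), statistics.stdev(sample))
```

Because the lower bound sits much closer to the mean than the upper bound, more of the left tail is discarded and the sample mean ends up above 30.15.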
I have a column of positive and negative numbers, which when summed should balance to zero (it's an accounting sheet).
However, if I use a SUMIF formula, instead of 0, I get:
1.81899E-12 or -9.09495E-13 or similar. (I don't know what this sort of result is called, but I think they represent very large or very small numbers)
I have created a sample document which shows the issue.
It returns a zero if the cell is formatted as a number, but the above result if formatted as general.
I often also find that even the simple SUM function also returns a similar result, as does the SUM in the status bar at the bottom of excel, so it is not just the SUMIF function I am struggling with. However, I have been unable to recreate the issue with the SUM function in my example spreadsheet.
I'm using Excel as part of Home and Business 2013.
Thanks for your help.
As #Dominique pointed out, xxxE-12 is a very, very small number. It is very, very close to zero.
xxxE-12 is Excel's (and most programming languages') way of writing xxx * 10^-12.
As you guessed, this is due to rounding. It also illustrates how computers handle floating-point (decimal) numbers; what you think is 1 / 3 = 0.333 might be represented internally as something like 0.333333681. See https://en.wikipedia.org/wiki/Floating-point_arithmetic, or notably https://en.wikipedia.org/wiki/Floating-point_arithmetic#Accuracy_problems.
Secondly, why does this appear when the cell is formatted as "General" but not as "Number"? With "Number", you expect an integer part and at most, say, 3 decimals. x.xxE-12 has its largest non-zero component at the 12th (!) decimal, so when displayed it gets rounded to a nice zero. "General", however, attempts to display the number as close to the actual value as possible, which in this case is the xxxE-12.
Also note that this might give you issues if you try to compare your calculated value with zero. Say, =IF(SUMIF(...) = 0, ...; it might not evaluate to TRUE even when you think it does (due to the very small value). The solution is instead to compare the difference of calculated value to zero: =IF(ABS(SUMIF(...) - 0) < 1E-9, ....
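The same effect shows up in most programming languages. Here is a tiny Python sketch with made-up entries (not the values from the sample document) that "should" net to zero, together with the tolerance comparison suggested above:

```python
entries = [0.1, 0.2, -0.3]    # hypothetical ledger entries that should sum to zero

total = sum(entries)
TOLERANCE = 1e-9              # same threshold as in the ABS(...) < 1E-9 formula

print(total)                      # a tiny non-zero residue
print(total == 0)                 # exact comparison fails
print(abs(total) < TOLERANCE)     # tolerance comparison succeeds
```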
I have an array of cells that contain all numerators (A2:A500) and an array of cells that contain all denominators (B2:B500) think of them as a fraction.
Is there a way to put them to the smallest common fraction and sum them up in one line? I can do it with the use of another column with multiplied numerators but I struggle to make it a one liner. How can something like this be achieved ?
You will easily get an overflow if the LCM of column B is too high. Anyway, this formula will get you the numerator of the sum fraction:
=SUMPRODUCT(A2:A500/B2:B500)*LCM(B2:B500)
Obviously the denominator is =LCM(B2:B500)
p.s.: I assumed you wanted to have a Fraction as a result. If you want just the resulting number, Garry's answer is the straightforward way to go.
An example for three cells:
=SUMPRODUCT(A1:A3/B1:B3)
And you may calculate the GCD(numerator,denominator) to obtain the reduced fraction.
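The LCM and GCD steps can be sketched in Python with integer arithmetic. The function names are hypothetical and the sketch assumes positive integer denominators.

```python
from functools import reduce
from math import gcd

def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)

def sum_fractions(numerators, denominators):
    """Return (numerator, common_denominator, reduced_fraction)."""
    common = reduce(lcm, denominators)
    # Scale every numerator up to the common denominator and add them.
    numerator = sum(n * (common // d) for n, d in zip(numerators, denominators))
    divisor = gcd(numerator, common)
    return numerator, common, (numerator // divisor, common // divisor)

print(sum_fractions([1, 1, 1], [2, 3, 4]))   # 1/2 + 1/3 + 1/4
print(sum_fractions([1, 1, 1], [2, 3, 6]))   # 1/2 + 1/3 + 1/6
```

Python integers do not overflow, so the overflow warning above applies to Excel rather than to this sketch.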
How can I generate those numbers in Excel?
I have to generate 8 random numbers whose sum is always 320. I need around 100 sets or so.
http://en.wikipedia.org/wiki/User:Skinnerd/Simplex_Point_Picking. Two methods are explained here.
Or any other way so I can do it in Excel.
You could use the RAND() function to generate N numbers (8 in your case) in column A.
Then, in column B you could use the following formula B1=A1/SUM(A:A)*320, B2=A2/SUM(A:A)*320 and so on (where 320 is the sum that you are interested in).
So you can just enter =RAND() in A1, then drag it down to A8. Then enter =A1/SUM(A:A)*320 in B1 and drag it to B8. B1:B8 now contains 8 random numbers that sum up to 320.
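The same normalising idea, written as a short Python sketch for comparison. The function name random_parts and the optional seed are my own additions.

```python
import random

def random_parts(count=8, total=320, seed=None):
    """Draw `count` uniform random numbers and rescale them to sum to `total`."""
    rng = random.Random(seed)
    raw = [rng.random() for _ in range(count)]   # plays the role of column A
    scale = total / sum(raw)
    return [r * scale for r in raw]              # plays the role of column B

# Around 100 sets, as asked for in the question.
sets = [random_parts(seed=i) for i in range(100)]
print(sets[0], sum(sets[0]))
```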
I'm a bit late to the game here - but fyi if only integers required then:
=LET(x_,RANDARRAY(8,1,1,1000000,1),y_,ROUND(x_*320/SUM(x_),0),y_)
is somewhat similar to the favourite soln above, albeit parsimonious (formula in single cell required to produce desired array , no helper column). Also addresses insignificant decimal points, albeit you may need to allocate back the deficit / surplus due to the occasional rounding error which may yield a sum total of 321 or 319. Could do this in a random fashion again using something like index(y_,randbetween(1,8))+320-sum(y_) in formula above - or resort to the infamous helper fn..
Someone commented that the favourite soln above (and thus mine, since it stems from a similar concept/approach) is not uniform. I'm not sure this was required; a uniform spread would impede the random nature (and is arguably far simpler, as you simply divide a sizeable range into distinct octiles and follow the same approach already laid out here). Not sure where/why the notion arises that a random spread should be arbitrarily/mechanically 'forced' into adopting some type of non-random spread... anyway, I obviously haven't read the problem properly (ahem).
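For what it's worth, the integer variant with the deficit / surplus handed back to one random position might look like this in Python. It is a sketch under my own assumptions: the function name is made up, and Python's round() rounds halves to even, unlike Excel's ROUND, which should not matter much here.

```python
import random

def random_integer_parts(count=8, total=320, seed=None):
    """Integer parts that sum exactly to `total`."""
    rng = random.Random(seed)
    raw = [rng.randint(1, 1_000_000) for _ in range(count)]
    raw_sum = sum(raw)
    parts = [round(r * total / raw_sum) for r in raw]
    # Rounding can leave the sum a little off; give the difference
    # to one randomly chosen position.
    parts[rng.randrange(count)] += total - sum(parts)
    return parts

sets = [random_integer_parts(seed=i) for i in range(100)]
print(sets[0], sum(sets[0]))
```

In rare cases the adjusted position could end up slightly negative if it was already close to zero; adding the difference to the largest value instead would avoid that.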