How do I avoid repeating long formulas in Excel when working with comparisons? - excel

I know that something like the following
=IF(ISERROR(LONG_FORMULA), 0, LONG_FORMULA)
can be replaced with
=IFERROR(LONG_FORMULA, 0)
However I am looking for an expression to avoid having to type REALLY_LONG_FORMULA twice in
=IF(REALLY_LONG_FORMULA < threshold, 0, REALLY_LONG_FORMULA)
How can I do this?

I was able to come up with the following:
=IFERROR(EXP(LN(REALLY_LONG_FORMULA - threshold)) + threshold, 0)
It works because LN of a negative number (or of zero) produces an error, and because EXP and LN are inverses of each other.
The biggest benefit of this is that it avoids accidentally introducing errors into your spreadsheet when you change something in one copy of REALLY_LONG_FORMULA without remembering to apply the same change to the other copy of REALLY_LONG_FORMULA in your IF statement.
Greater-than-or-equal comparisons such as
=IF(REALLY_LONG_FORMULA>=threshold,0,REALLY_LONG_FORMULA)
can be replaced with
=IFERROR(threshold-EXP(LN(threshold-REALLY_LONG_FORMULA)),0)
Example below (provided by @Jeeped):
For strict inequality comparisons use SQRT(_)^2, as pointed out by @Tom Sharpe.
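To check the greater-than-or-equal variant outside Excel, here is a minimal Python sketch. It assumes IFERROR can be modelled with try/except (math.log raises ValueError for zero or negative input, much as LN returns #NUM!). The function names are hypothetical, and long_value stands in for the result of REALLY_LONG_FORMULA.

```python
import math

def with_if(long_value, threshold):
    # =IF(REALLY_LONG_FORMULA>=threshold, 0, REALLY_LONG_FORMULA)
    return 0 if long_value >= threshold else long_value

def with_exp_ln(long_value, threshold):
    # =IFERROR(threshold-EXP(LN(threshold-REALLY_LONG_FORMULA)), 0)
    try:
        return threshold - math.exp(math.log(threshold - long_value))
    except ValueError:  # log of zero or of a negative number
        return 0

# Compare both versions below, at, and above the threshold.
for value in (1.0, 4.5, 5.0, 7.25):
    assert math.isclose(with_if(value, 5.0), with_exp_ln(value, 5.0), abs_tol=1e-9)
```

Because the value makes a round trip through EXP and LN, the result can differ from the original in the last few digits, which is why the comparison uses a tolerance.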

If you're comparing against a threshold amount, I would consider checking out ExcelJet's recent blog post about Replacing Ugly IFs with MAX() or MIN().
Also, the MAX() and MIN() functions are much more intuitive than lesser-known functions like EXP() and LN().
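As a rough illustration in Python rather than Excel, max and min clamp a value against a threshold without repeating the value. One assumption to note: this clamps to the threshold itself, so it corresponds to =IF(x<threshold, threshold, x), not to the question's version, which returns 0. The function names are hypothetical.

```python
def floor_at(value, threshold):
    # Same result as: threshold if value < threshold else value
    return max(value, threshold)

def cap_at(value, threshold):
    # Same result as: threshold if value > threshold else value
    return min(value, threshold)

print(floor_at(-3, 0))   # 0
print(floor_at(7, 0))    # 7
print(cap_at(120, 100))  # 100
```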

Comparing EXP(LN()) with SQRT()^2:
The two differ at zero, because SQRT(0) gives 0 but LN(0) gives #NUM!
So you can choose which one to use depending on whether you want the equality case included or not.
These also work for negative numbers, in theory.
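The boundary behaviour can be reproduced in a small Python sketch. It assumes that math.log and math.sqrt raising ValueError is a fair stand-in for Excel's #NUM!; the helper names are hypothetical and None marks the error case.

```python
import math

def exp_ln(diff):
    # EXP(LN(diff)); None stands in for Excel's #NUM! error
    try:
        return math.exp(math.log(diff))
    except ValueError:
        return None

def sqrt_squared(diff):
    # SQRT(diff)^2; None stands in for Excel's #NUM! error
    try:
        return math.sqrt(diff) ** 2
    except ValueError:
        return None

# Negative input fails in both, zero fails only in exp_ln.
for diff in (-1.0, 0.0, 4.0):
    print(diff, exp_ln(diff), sqrt_squared(diff))
```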

Related

Unclear why functions from Data.Ratio are not exposed and how to work around

I am implementing an algorithm using Data.Ratio (convergents of continued fractions).
However, I encounter two obstacles:
The algorithm starts with the fraction 1%0 - but this throws a zero denominator exception.
I would like to pattern match the constructor a :% b
I was exploring on Hackage, and in particular the source seems to be using exactly these features (e.g. defining infinity = 1 :% 0, or pattern matching for numerator).
As a beginner, I am also confused about where it is determined that (%), numerator and such are exposed to me, but not infinity and (:%).
I have already made a dirty workaround using a tuple of integers, but it seems silly to reinvent the wheel about something so trivial.
It would also be nice to learn how to read from the source which functions are exposed.
They aren't exported precisely to prevent people from doing stuff like this. See, the type
data Ratio a = a:%a
contains too many values. In particular, e.g. 2/6 and 3/9 are actually the same number in ℚ and both represented by 1:%3. Thus, 2:%6 is in fact an illegal value, and so is, sure enough, 1:%0. Or it might be legal but all functions know how to treat them so 2:%6 is for all observable means equal to 1:%3 – I don't in fact know which of these options GHC chooses, but at any rate it's an implementation detail and could change in future releases without notice.
If the library authors themselves use such values for e.g. optimisation tricks that's one thing – they have after all full control over any algorithmic details and any undefined behaviour that could arise. But if users got to construct such values, it would result in brittle code.
So – if you find yourself starting an algorithm with 1/0, then you should indeed not use Ratio at all there but simply store numerator and denominator in a plain tuple, which has no such issues, and only make the final result a Ratio with %.
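The suggested approach can be sketched as follows, in Python rather than Haskell, with fractions.Fraction standing in for Ratio (Fraction(1, 0) also raises a zero-denominator error). The function name convergents and the use of the standard recurrence for continued-fraction convergents are assumptions for illustration, not the asker's actual algorithm.

```python
from fractions import Fraction

def convergents(terms):
    """Yield the convergents of the continued fraction [a0; a1, a2, ...]."""
    prev = (0, 1)  # h(-2), k(-2)
    curr = (1, 0)  # h(-1), k(-1): the "1/0" seed, kept as a plain tuple
    for a in terms:
        # h(n) = a*h(n-1) + h(n-2), and likewise for k
        prev, curr = curr, (a * curr[0] + prev[0], a * curr[1] + prev[1])
        # Only build a real fraction once the denominator is nonzero.
        yield Fraction(*curr)

print(list(convergents([3, 7, 15, 1])))
# [Fraction(3, 1), Fraction(22, 7), Fraction(333, 106), Fraction(355, 113)]
```

The illegal value 1/0 never becomes a fraction object; it only exists as a pair of integers inside the loop.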

Why is POISSON function not consistent in Microsoft Excel?

The definition of the POISSON function says:
#NUM! error – Occurs if either:
The given value of x is less than zero;
The given value of mean is less than zero.
But when I try this in Excel 2013, it gives me a different value. Here is my example:
=POISSON(0,-0.5,FALSE)
the result is: 1.648721271
instead of #NUM!
Any thoughts?
Speculatively, the bug might have come about as an optimization. POISSON(x,m,FALSE) is defined as e^(-m)*(m^x)/x!. One way to compute m^x when m is floating-point is as e^(x*Ln(m)). In a spreadsheet, you can observe that
=POISSON(A1,A2,FALSE) - EXP(-A2)*EXP(A1*LN(A2))/FACT(A1)
always evaluates to exactly 0 whenever A1,A2 are in the correct domain (and not e.g. 0.0000000001 as might be the case if the calculation had used a different approach).
Furthermore, EXP(-A2)*EXP(A1*LN(A2))/FACT(A1) fails when it should fail, giving #NUM! when fed 0, -0.5. My speculation is that the Excel programmers initially used a formula which failed when it should have failed, letting the called functions raise the error when appropriate. Then someone had the bright idea of just returning EXP(-mean) when x = 0 (since in that case the rest of the expression is 1 when it is defined at all). After all -- why bother to compute something when you know that it is 1?
What I find astonishing is that the bug is still there with POISSON.DIST. Excel has been (and still is, although to a lesser extent) heavily criticized for the accuracy of its statistical functions and tests. So much so that "Friends don't let friends use Excel for statistics" is a relatively well-known saying among statisticians. See this for a discussion. The dotted statistical functions such as POISSON.DIST were explicitly designed to address the many complaints which had piled up. POISSON itself is just kept around for backwards compatibility. It is strange how this bug slipped through what should have been a thorough rewriting of these functions from the ground up.
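The speculation above can be mimicked in a short Python sketch. This is only a model of the guess, not Excel's actual implementation: pmf_direct and pmf_with_shortcut are hypothetical names, and math.log raising ValueError stands in for LN returning #NUM!.

```python
import math

def pmf_direct(x, mean):
    # e^(-m) * m^x / x!, with m^x computed as e^(x*ln(m)).
    # math.log raises ValueError for mean <= 0, so bad input fails here.
    return math.exp(-mean) * math.exp(x * math.log(mean)) / math.factorial(x)

def pmf_with_shortcut(x, mean):
    # Hypothetical shortcut: skip the m^x / x! factor when x == 0.
    # This also skips the call that would have rejected a negative mean.
    if x == 0:
        return math.exp(-mean)
    return pmf_direct(x, mean)

print(pmf_with_shortcut(0, -0.5))  # approximately 1.648721271

try:
    pmf_direct(0, -0.5)
except ValueError as exc:
    print("direct formula fails:", exc)
```

With the shortcut in place, the input from the question returns EXP(0.5) instead of an error, which matches the value reported.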

Python3, range(a,b) function behaviour and empty lists

I am a bit confused about the behavior of the range() function in a specific use case.
When I was testing some code I wrote using nested FOR loops, in some cases, the statements in certain loops never seemed to execute at all. I eventually realized that I was in some cases feeding a range() call with an input like:
range(i,2) # where i is 2, giving range(2,2)
...which threw no error, but apparently never executed the for loop contents. After some reading on Python3's FOR implementation, I then added "else:" statements to my loop:
for i in range(a,b): # where a=b, i.e. range(2,2)
    [skipped code]
else:
    [other code]
...and the else-case code executed fine, as I guess all possible iterators for the given range values were (already) exhausted, and the for-else case was triggered as it's designed to be when that happens.
From what I can see in the documentation for range(), I found: "A range object will be empty if r[0] does not meet the value constraint." ( https://docs.python.org/3/library/stdtypes.html#range ). I'm not quite sure what the "value constraint" is in this case, but if I'm understanding right, "range(a,b)" will return an empty list if a >= b.
My question is, is my understanding correct about when range() returns []? Also, are there any other kinds of input cases where range(a,b) returns [], or other obscure edge case behaviors I should be aware of? Thank you.
As you can see in the documentation, when you use range(a,b) you are setting its start and stop parameters.
What you need to know is that the stop parameter is always excluded, just as in list slicing.
You can also set the step, so with a negative step you can actually use a >= b, as in this case:
range(10,4,-1)
Also note that all parameters need to be integers.
I recommend reading the documentation linked above; it's quite helpful.
range(n) produces the integers starting with 0 and ending with (n-1).
With reference to your FOR loop, it was not executed because the last value the range could produce (i.e. stop - 1 = 2 - 1 = 1) is less than the starting number, 2. Since the step argument is omitted in your FOR loop, it defaults to 1. The step can be either negative or positive, but not zero.
Syntax:
range(begin, end[, step])
Examples:
Both examples below will produce an empty list.
list(range(0))
list(range(2,2))
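The edge cases discussed above can be checked directly. Note that range returns a lazy range object rather than a list, so this sketch wraps it in list() to show the contents.

```python
# Positive step: empty whenever start >= stop.
print(list(range(2, 2)))       # []
print(list(range(5, 2)))       # []

# Negative step: empty whenever start <= stop.
print(list(range(2, 5, -1)))   # []
print(list(range(10, 4, -1)))  # [10, 9, 8, 7, 6, 5]

# A zero step is rejected outright.
try:
    range(0, 3, 0)
except ValueError as exc:
    print("ValueError:", exc)

# for/else: the else branch runs when the loop finishes without break,
# which includes the case where the range was empty.
ran_else = False
for i in range(2, 2):
    pass
else:
    ran_else = True
print(ran_else)  # True
```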

Shorter version of IF(ISERROR(...))

I'm wondering if there is a way to check if a formula returns an error, and if not, use the value found, without doing the following:
=IF(ISERROR(A1/B1), 0, A1/B1)
The syntax I'm looking for is something like this:
=EQ([value_if_not_error], [value_if_error])
The solution might contain more stuff, IF and ISERROR is perfectly fine, as long as I avoid having the main function several times in each cell. The reason why I want to do this is that my equations are quite long and the readability is drastically reduced when I have to write the equation twice, or even more (if several ifs).
Is there a simple solution to this?
Try
=IFERROR([formula], [value_if_error])
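For readers who think in code, the same idea can be sketched in Python: evaluate the expression once and fall back only if it fails. The helper iferror is hypothetical and only an analogy for the Excel function; a1 and b1 stand in for the cells A1 and B1.

```python
def iferror(compute, value_if_error):
    # Run the computation once; return the fallback if it raises.
    try:
        return compute()
    except (ArithmeticError, ValueError):
        return value_if_error

a1, b1 = 10, 0
print(iferror(lambda: a1 / b1, 0))  # 0, because dividing by zero fails

a1, b1 = 10, 4
print(iferror(lambda: a1 / b1, 0))  # 2.5
```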

Why do most programming languages only give one answer to square root of 4?

Most programming languages give 2 as the answer to square root of 4. However, there are two answers: 2 and -2. Is there any particular reason, historical or otherwise, why only one answer is usually given?
Because:
In mathematics, √x commonly, unless otherwise specified, refers to the principal (i.e. positive) root of x [http://mathworld.wolfram.com/SquareRoot.html].
Some languages don't have the ability to return more than one value.
Since you can just apply negation, returning both would be redundant.
If the square root method returned two values, then one of those two would practically always be discarded. In addition to wasting memory and complexity on the extra return value, it would be little used. Everyone knows that you can multiply the answer returned by -1 and get the other root.
I expect that only mathematical languages would return multiple values here, perhaps as an array or matrix. But for most general-purpose programming languages, there is negligible gain and non-negligible cost to doing as you suggest.
Some thoughts:
Historically, functions were defined as procedures which returned a single value.
It would have been fiddly (using primitive programming constructs) to define a clean function which returned multiple values like this.
There are always exceptions to the rule:
0 for example only has a single root (0).
You cannot take the square root of a negative number (unless the language supports complex numbers). This could be treated as an exception (like "divide by 0") in languages which don't support imaginary numbers or the complex number system.
It is usually simple to deduce the 2 square roots (simply negate the value returned by the function). This was probably left as an exercise by the caller of the sqrt() function, if their domain depended on dealing with both the positive (+) and negative (-) roots.
It's easier to return one number than to return two. Most engineering decisions are made in this manner.
There are many functions which return only one answer out of two or more possibilities. Arc tangent, for example: the arc tangent of 1 is returned as 45 degrees, but it could also be 225 or even 405. As with many things in life and programming, there is a convention we know and can rely on. That square root functions return positive values is one such convention. It is up to us, the programmers, to keep in mind that there are other solutions and to act on them in code if needed.
By the way this is a common issue in robotics when dealing with kinematics and inverse kinematics equations where there are multiple solutions of links positions corresponding to Cartesian positions.
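The arc tangent convention can be seen in a few lines of Python, using only the standard math module. The angles 45, 225 and 405 are the ones mentioned above.

```python
import math

# atan returns only the principal value, in the range (-90, 90) degrees.
principal = math.degrees(math.atan(1))
print(principal)  # about 45

# Yet all three angles have a tangent of 1 (up to rounding).
for angle in (45, 225, 405):
    assert math.isclose(math.tan(math.radians(angle)), 1.0)

# atan2 uses the signs of both arguments to pick the quadrant:
# the point (-1, -1) lies at -135 degrees, which is the same direction as 225.
third_quadrant = math.degrees(math.atan2(-1, -1))
print(third_quadrant)  # about -135
```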
In mathematics, by convention it's always assumed that you want the positive square root of something unless you explicitly say otherwise. The square root of four really is two. If you want the negative answer, put a negative sign in front. If you want both, put the plus-or-minus sign. Without this convention it would be impossible to write equations; you would never know what the person intended even if they did put a sign in front (because it could be the negative of the negative square root, for example). Also, how exactly would you write any kind of computer code involving mathematics if operators started returning two values? It would break everything.
The unfortunate exception to this convention is when solving for variables. In the following equation:
x^2 = 4
You have no choice but to consider both possible values for x. If you take the square root of both sides, you get x = 2, but now you must put in the plus or minus sign to make sure you aren't missing any possible solutions. Also, remember that in this case it's technically x that can be either plus or minus, not the square root of four.
Because multiple return types are annoying to implement. If you really need the other result, isn't it easy enough to just multiply the result by -1?
Because most programmers only want one answer.
It's easy enough to generate the negative value from the positive value if the caller wants it. For most code the caller only uses the positive value.
However, nowadays it's easy to return two values in many languages. In JavaScript:
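var sqrts = function(x) {
    var s = Math.sqrt(x);
    if (s > 0) {
        return [s, -s];   // two real roots
    } else if (s === 0) {
        return [0];       // zero has a single root
    } else {
        return [];        // Math.sqrt gave NaN: no real roots
    }
};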
As long as the caller knows to iterate through the array that comes back, you're gold.
>sqrts(2)
[1.4142135623730951, -1.4142135623730951]
I think because the function is called "sqrt", and if you wanted multiple roots, you would have to call the function "sqrts", which doesn't exist, so you can't do it.
The more serious answer is that you're suggesting a specific instance of a larger issue. Many equations, and commonly inverse functions (including sqrt) have multiple possible solutions, such as arcsin, etc, and these are, in general, an issue. With arcsin, for example, should one return an infinite number of answers? See, for example, discussions about branch cuts.
Because it was historically defined{{citation needed}} as the function which gives the side length of a square of known surface. And length is positive in that context.
You can always tell what the other number is, so maybe it's not necessary to return both of them.
It's likely because when people use a calculator to figure out a square root, they only want the positive value.
Go one step further and ask why your calculator won't let you take the square root of a negative number. It's possible, using imaginary numbers, but the average user has absolutely zero use for this.
On imaginary numbers.
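Python happens to illustrate both behaviours with its standard library: math.sqrt refuses negative input, while cmath.sqrt returns the principal complex root. This is offered only as an example of one language's choice.

```python
import cmath
import math

try:
    math.sqrt(-4)
except ValueError as exc:
    print("math.sqrt rejects negative input:", exc)

root = cmath.sqrt(-4)
print(root)        # 2j, the principal root
print(-root)       # the other root, obtained by negation
print(root ** 2)   # (-4+0j)
```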
