Excel-VBA UDF: Keep 2 values but display only 1 - excel

I wrote a user-defined function that displays fractions with superscript and subscript digits (available in Unicode), with a denominator no larger than a limit the user specifies. For example, "=Fraction(PI(),30)" turns π into ²²/₇, since no other fraction with a denominator smaller than or equal to 30 is closer to π.
Then I'm thinking of writing an InvFraction function as well, to get from a string generated by the Fraction function into an actual number. As you can imagine, though, the value is not π anymore, but 3.142857... (i.e. ²²/₇). So I'm postponing the writing until I remove that sense of chasing a ghost I'm feeling about it.
I saw that one could make the Fraction function return a two-element array and then either let the user pick one element with the INDEX function, or enter Fraction as an array formula covering 2 cells. Neither is ideal from my perspective: with the first option, the second value (which could be π) is lost through the index choice and is no longer retrievable; the second option forces two cells to contain the data (though I guess I COULD end up living with it).
I also thought of using a user-defined type containing the string value for the fraction and the double value for the original input, but I noticed that such types don't work in the actual sheet, and found informal confirmation here: Call VBA function that returns custom type from spreadsheet
Does anyone have an idea of how to tackle this? Thanks anyway for taking the time to read.
Edit: To put it simply, if I were to program the InvFraction function as I conceived it with the tools and ideas I have, I could only manage to have “=InvFraction(Fraction(PI(),30))” to equal 3.142857... (22 divided by 7), but I would rather like it to generate 3.14159265... (π).
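To make the information loss concrete, here is a minimal Python sketch of the round trip (not the VBA from the question). The names fraction_text and inv_fraction are hypothetical stand-ins for Fraction and InvFraction, and a plain "/" separator is assumed:

```python
from fractions import Fraction
import math

# Translation tables between ASCII digits and Unicode super/subscript digits.
SUPER = str.maketrans("0123456789", "⁰¹²³⁴⁵⁶⁷⁸⁹")
SUB = str.maketrans("0123456789", "₀₁₂₃₄₅₆₇₈₉")
UNSUPER = str.maketrans("⁰¹²³⁴⁵⁶⁷⁸⁹", "0123456789")
UNSUB = str.maketrans("₀₁₂₃₄₅₆₇₈₉", "0123456789")


def fraction_text(value, max_den):
    """Closest fraction with denominator <= max_den, rendered as text."""
    f = Fraction(value).limit_denominator(max_den)
    return str(f.numerator).translate(SUPER) + "/" + str(f.denominator).translate(SUB)


def inv_fraction(text):
    """Parse the text back into a number; only the fraction survives."""
    num, den = text.split("/")
    return int(num.translate(UNSUPER)) / int(den.translate(UNSUB))


text = fraction_text(math.pi, 30)
print(text)                # the displayed string
print(inv_fraction(text))  # 22/7 as a float, not pi
```

The string carries only the numerator and denominator, so any inverse that sees nothing but the string can recover 22/7 at best. The original input has to be stored somewhere else.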

Related

Excel's LAMBDA with a "kind of" composite function

Ever since I learnt that Excel is now Turing-complete, I understood that I can now "program" Excel using exclusively formulas, therefore excluding any use of VBA whatsoever.
I do not know if my conclusion is right or wrong. In reality, I do not mind.
However, to my satisfaction, I have been able to "program" the two most basic structures of program flow inside formulas: 1- branching the control flow (using an IF function has no secrets in excel) and 2- loops (FOR, WHILE, UNTIL loops).
Let me explain my findings in a little more detail. (Remark: because I am using a Spanish version of Excel 365, the field separator in formulas is the semicolon (";") instead of the comma (",").)
A- Accumulator in a FOR loop
B- Factorial (using product)
C- WHILE loop
D- UNTIL loop
E- The notion of INTERNAL/EXTERNAL SCOPE
And now, the time of my question has arrived:
I want to use a formula that is really an array of formulas
I want to use an accumulator for the first number in the "tuple" whereas I want a factorial for the second number in the tuple. And all this using a single excel formula. I think I am not very far away from succeeding.
The REDUCE function accepts a LET function that contains 2 LAMBDAs instead of a single LAMBDA function. Up to here, everything works. However, the LET function seems to return only a single function instead of a tuple of functions.
I can return (in the picture) function "x" or function "y" but not the tuple (x,y).
I have tried to use HSTACK(x,y), but it does not seem to work.
I am aware that this is a complex question, but I've done my best to make myself understood.
Can anybody give me any clues as to how I could solve my problem?
Very nice question.
I noticed that in your attempts you have given REDUCE() a single constant value in the 1st parameter. Funnily enough, the documentation nowhere states that you can't pass an array there. Hence you could use the 1st parameter to pass all the constants as an array (horizontal, in your case), and while you loop through the array of the 2nd parameter you can apply the different types of logic using CHOOSE():
=REDUCE({0,1},SEQUENCE(5),LAMBDA(a,b,CHOOSE({1,2},a+b,a*b)))
This way you have a single REDUCE() function whose internal processing updates the constants given as an array in the 1st parameter. You can now start stacking multiple functions horizontally and input an array of constants, for example:
=REDUCE({0,1,100},SEQUENCE(5),LAMBDA(a,b,CHOOSE({1,2,3},a+b,a*b,a/b)))
I suppose you'd have to use {0\1} and {1\2} like I'd have to in my Dutch version of Excel.
Given your accumulator:
Formula in A1:
=REDUCE(F1:G1,SEQUENCE(F3),LAMBDA(a,b,CHOOSE({1,2},a+b,a*b)))
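For readers who think in code, the same idea can be sketched in Python (an analogy only, not Excel): the accumulator is a tuple, and each step updates every component with its own rule, just as CHOOSE({1,2},a+b,a*b) does for the array accumulator. The names step and result are made up for this sketch:

```python
from functools import reduce


def step(acc, b):
    """Update both components of the accumulator for the next value b."""
    total, product = acc
    return (total + b, product * b)


# Initial value (0, 1) plays the role of {0,1}; range(1, 6) plays SEQUENCE(5).
result = reduce(step, range(1, 6), (0, 1))
print(result)  # (15, 120): the sum 1+2+3+4+5 and the factorial 5!
```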

Frequency() with arrays: adds an element to return arrays

I'm using the following formula as a named formula (via the Name Manager). It is then used in a larger SUMPRODUCT(). The goal is to ensure that, with an array calculation, the calculation is only made once for certain groups of rows (e.g. you have the same data repeated across many rows for category A. I only need to know how many people are in category A once).
=IF(FREQUENCY(IF(LEN(tdata[reportUUID])>0,MATCH(tdata[reportUUID],
tdata[reportUUID],0),0),IF(LEN(tdata[reportUUID])>0,MATCH(tdata[reportUUID],
tdata[reportUUID],0),0))>0,TRUE)
Let's step through the results one by one with Evaluate Formula in Excel. Sorry for the screenshots, but Excel doesn't allow copying the actual steps with real data.
In order of steps:
In the last image, there's now a 7th item in my array. I only have 6 rows of data, which is why the previous steps only had 6 items in the array, as expected.
This is messing up my calculations, because the array returned by this function gets multiplied by other arrays which all have 6 items (or whatever number of data rows I have).
What is this 7th item, and how can I either get rid of it or prevent it from returning errors?
I did try to wrap some of the formula in IFERROR() or IFNA(), but it doesn't feel clean. I feel this might backfire and isn't a robust way to handle this. I would rather fix it at the source.
EDIT: For example of use with other arrays:
{=SUMPRODUCT(--IFERROR((tdata[_isVisible]=1)*(f_uniqueUUIDfactor),0))}
Where f_uniqueUUIDfactor is the formula from the initial post. tdata[_isVisible]=1 is used as a way to filter data on the dashboard (e.g. through dropdown, the users can set ranges for dates, and with VBA I hide the rows in the raw data NOT within the range).
The point is that SUMPRODUCT() ends up multiplying the arrays for each raw data row together as 0s and 1s, so that only the rows meeting all the criteria get returned. The IFERROR() above is the workaround for the extra array element introduced by FREQUENCY(). It works as is, but if a cleaner way exists I'd prefer that. I would also be keen on understanding why that element gets added.
This is a good example of why it is preferable to use multiple, nested IF statements when evaluating arrays over multiple criteria, rather than form the product of those arrays.
Firstly, though, before coming to the reason for that statement, I should point out a few minor technical inaccuracies/flaws with your construction also.
1) By including a value_if_false clause in your constructions being passed as FREQUENCY's data_array and bins_array parameters, you are risking incorrect results, since zero is a valid numerical to be considered by FREQUENCY, whereas a Boolean FALSE (which would be the equivalent entry in the resulting array had you omitted the value_if_false clause altogether) is disregarded by this function.
2) MATCH with an exact (i.e. 0, or FALSE) match_type parameter is a relatively resource-heavy construction, particularly if the range to be considered is quite large. As such, and since it is not necessary to use this construction for FREQUENCY's bins_array parameter, it is preferable to use the more efficient:
ROW(tdata[reportUUID])-MIN(ROW(tdata[reportUUID]))+1
Moreover, note that repetition of the IF(LEN construction is also not necessary within this second parameter.
In all, then:
IF(FREQUENCY(IF(LEN(tdata[reportUUID])>0,MATCH(tdata[reportUUID],tdata[reportUUID],0)),ROW(tdata[reportUUID])-MIN(ROW(tdata[reportUUID]))+1)>0,TRUE)
is considerably more rigorous and more efficient than the version you give.
To answer your main question, it is well-documented that FREQUENCY always returns an array having a number of entries one greater than that of the bins_array passed.
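A rough Python model of that behaviour, as a sketch only: it assumes the bins are sorted ascending and distinct (true for the 1..n bins used here), and the helper name frequency is made up for this illustration. The last slot counts values greater than every bin, which is where the extra element comes from:

```python
from bisect import bisect_left


def frequency(data, bins):
    """Count values per bin (value <= bin, > previous bin), plus one overflow slot."""
    counts = [0] * (len(bins) + 1)
    for value in data:
        # Index of the first bin that is >= value; len(bins) means "above all bins".
        counts[bisect_left(bins, value)] += 1
    return counts


# Hypothetical MATCH results for 6 rows: the first position at which each ID occurs.
match_positions = [1, 1, 3, 3, 5, 1]
bins = [1, 2, 3, 4, 5, 6]

counts = frequency(match_positions, bins)
print(counts)       # [3, 0, 2, 0, 1, 0, 0]
print(len(counts))  # 7, one more than the 6 bins
```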
As mentioned in my comment to your post, the resolution to the problem you are facing largely depends on precisely what further manipulation you are intending for the resulting array.
However, let's assume for the sake of an explanation that you simply wish to multiply the array resulting from your FREQUENCY construction by some other column within your table, tdata[Column2] say, and then sum the result.
The difference between:
=SUM(IF(FREQUENCY(IF(LEN(tdata[reportUUID])>0,MATCH(tdata[reportUUID],tdata[reportUUID],0)),ROW(tdata[reportUUID])-MIN(ROW(tdata[reportUUID]))+1)>0,TRUE)*tdata[Column2])
i.e. using multiplication of the two arrays, and:
=SUM(IF(FREQUENCY(IF(LEN(tdata[reportUUID])>0,MATCH(tdata[reportUUID],tdata[reportUUID],0)),ROW(tdata[reportUUID])-MIN(ROW(tdata[reportUUID]))+1)>0,tdata[Column2]))
i.e. using a straightforward IF clause, is here crucial.
In fact, the former will always return an error, whereas the latter, in general, will not.
The reason is that the former will resolve to (assuming that your table has e.g. 10 rows' worth of data and assuming some random Boolean results to the FREQUENCY construction):
=SUM(IF({TRUE;TRUE;TRUE;FALSE;FALSE;FALSE;FALSE;FALSE;TRUE;TRUE;FALSE},TRUE)*tdata[Column2])
which is, since the value_if_true clause is superfluous here:
=SUM({TRUE;TRUE;TRUE;FALSE;FALSE;FALSE;FALSE;FALSE;TRUE;TRUE;FALSE}*tdata[Column2])
whereas the second construction I give will resolve to:
=SUM(IF({TRUE;TRUE;TRUE;FALSE;FALSE;FALSE;FALSE;FALSE;TRUE;TRUE;FALSE},tdata[Column2]))
The two may look identical, but the fact that the former is using multiplication to resolve the array, whereas the latter is not, is the key difference.
Although in both cases the array resulting from the FREQUENCY construction, i.e.:
{TRUE;TRUE;TRUE;FALSE;FALSE;FALSE;FALSE;FALSE;TRUE;TRUE;FALSE}
comprises 11 entries (i.e. 1 more than the number of entries in the second array being considered), the difference is that, when you then attempt to multiply an 11-element array with a 10-element array (i.e. tdata[Column2]), Excel, rather than outright disallowing such an operation, artificially redimensions the smaller of the two arrays such that it matches the dimensions of the larger.
In doing so, however, any additional entries are automatically set as #N/A error values.
Effectively, then:
=SUM({TRUE;TRUE;TRUE;FALSE;FALSE;FALSE;FALSE;FALSE;TRUE;TRUE;FALSE}*tdata[Column2])
is resolved as:
=SUM({TRUE;TRUE;TRUE;FALSE;FALSE;FALSE;FALSE;FALSE;TRUE;TRUE;FALSE}*{38;67;49;3;10;11;97;20;3;57;#N/A})
i.e., as mentioned, the second, 10-element array is redimensioned to one of 11 elements in an attempt to form a legitimate operation. And, as also mentioned, that 11th element is #N/A, which means of course that the entire construction will also result in that value.
In the non-multiplication version, however, i.e.:
=SUM(IF({TRUE;TRUE;TRUE;FALSE;FALSE;FALSE;FALSE;FALSE;TRUE;TRUE;FALSE},tdata[Column2]))
although the same redimensioning also takes place, we are saved by our use of an IF clause in place of multiplication, since the above resolves to:
=SUM(IF({TRUE;TRUE;TRUE;FALSE;FALSE;FALSE;FALSE;FALSE;TRUE;TRUE;FALSE},{38;67;49;3;10;11;97;20;3;57;#N/A}))
and the Boolean FALSE in the 11th position here 'overrides' the error value in the equivalent position from the second array, since the above resolves to:
=SUM({38;67;49;FALSE;FALSE;FALSE;FALSE;FALSE;3;57;FALSE})
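The same contrast can be mimicked in Python, as an analogy under stated assumptions: NaN stands in for #N/A (both poison any arithmetic they touch), and the flags and values are the example arrays used above, with the value array already padded to 11 elements:

```python
import math

flags = [True, True, True, False, False, False, False, False, True, True, False]
values = [38, 67, 49, 3, 10, 11, 97, 20, 3, 57, math.nan]  # nan plays the role of #N/A

# Multiplication touches every element, so the padded error leaks into the sum.
product_sum = sum(flag * value for flag, value in zip(flags, values))

# Conditional selection never looks at the value when the flag is False.
if_sum = sum(value for flag, value in zip(flags, values) if flag)

print(product_sum)  # nan
print(if_sum)       # 214
```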
Regards

Is there Infinity in Spreadsheets?

I am wondering if there is any way to represent infinity (or a sufficiently high number) in MS Excel.
I am particularly looking for something like Double.POSITIVE_INFINITY or Double.MAX_VALUE in Java.
I like to use 1e99 as it gives the largest number with the fewest keystrokes, but I believe the absolute maximum that can be typed into a cell is actually 9.99999999999999E+307. At that end of the number spectrum I don't think there is much difference as far as Excel is concerned.
I think it's worth adding that, Infinity as well as other special values can be returned from a vba function (How do you get VB6 to initialize doubles with +infinity, -infinity and NaN?):
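Function Infinity(Optional Recalc) As Double
    ' Recalc is a dummy argument used only to trigger recalculation.
    ' Ignore the division-by-zero error raised by the next line.
    On Error Resume Next
    Infinity = 1 / 0
End Function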
When entered as a cell formula a large number is shown (2^1024). You can set a conditional format to show "+Infinity" as a number format with a formula condition:
=AND(ISNUMBER(A1),A1>2^1023*(2-2^-52))
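The threshold in that condition is the largest finite IEEE 754 double, so only +Infinity can exceed it. A short Python sketch of the same test, for illustration only (the names largest_finite and is_positive_infinity are made up here, and Python floats are assumed to be IEEE 754 doubles):

```python
import math
import sys

# Same expression as in the conditional-format formula: 2^1023 * (2 - 2^-52).
largest_finite = 2.0**1023 * (2 - 2.0**-52)


def is_positive_infinity(x):
    """True only for a float strictly greater than every finite double."""
    return isinstance(x, float) and x > largest_finite


print(largest_finite == sys.float_info.max)  # True
print(is_positive_infinity(math.inf))        # True
print(is_positive_infinity(1e99))            # False
```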
A dummy argument containing a dynamic reference can be inserted so that values are recalculated when the workbook is opened, for example:
=Infinity(IF(,) IF(,))
With LibreOffice 6 I use 1.79769313486231E+308, which seems to be the largest number it allows me to enter, but I miss having an exact representation of ±infinity, also because I suspect the number above is implementation-specific...
This is another point that makes me think that spreadsheets are great for visualising, editing and simple computations on tabular data, but for more complex operations or modelling a real programming language is a must...

Trying to show only a certain amount of numbers

To make the sale to my customer I need to import numbers from a report into an Excel document. For example the number coming in will be 14.182392. The only reason for my guy not to buy the product is because he only wants to view 14.182 on the Excel sheet. Okay so the other catch is, the number CANNOT be rounded in any shape or form.
So what I need is a way to show only part of the number, WITHOUT ROUNDING.
Is this possible? Any ideas of how I could get around this would be fantastic.
Please try:
=TEXT(INT(A1)+VALUE(LEFT(MOD(A1,1),5)),"00.000")
Firstly =TRUNC is a better answer (much shorter). My version was connected with uncertainty in your requirement (it is odd!) and in the hope it might be easier to adjust if not exactly what you/your boss wanted.
TRUNC literally just truncates the decimals (no rounding!) to a length to suit (ie 3 if to show nn.182 given nn.182392 or say nn.182999).
LEFT may also be a better choice, but that depends upon knowing how large the integer part of your number is. =LEFT(A1,6) would display 14.189 given say 14.189999 in A1. However it would show 1.4189 given 1.4189999 in A1 (ie four decimal places).
The formula above combines text manipulation with number manipulation:
INT takes just the integer value (here 14).
MOD takes just the modulus – the residual that is not an integer after division, in this case by 1. So just the .182392 part. LEFT is then applied here in a similar way to as used above, but without needing to concern oneself with the length of the integer part of the source value (ie 14 or 1 etc does not matter).
VALUE then converts the result back into numeric format (string manipulation functions such as LEFT always return text format) so our abbreviated decimal string can then be added to our integer.
Finally, the TEXT part is for formatting but is hard or impossible to justify! About the only use is that it displays the result left-justified in the cell – perhaps a little warning that the number displayed is not the “true” value (eg it won’t SUM) because, as a result of a formula, it won’t be marked with a little green warning triangle.
The displayed values can use the TRUNC function like this,
=TRUNC(A1, 3)
But you must use A1 in any calculations to retain the precision of the raw value.
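The difference between truncating and rounding is easy to check outside Excel. A small Python sketch, for illustration only (the helper name truncate is made up; Decimal is used so that binary floating point does not blur the digits):

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP


def truncate(text, places):
    """Drop digits beyond `places` decimals without rounding (toward zero)."""
    step = Decimal(1).scaleb(-places)  # e.g. 0.001 for places=3
    return Decimal(text).quantize(step, rounding=ROUND_DOWN)


print(truncate("14.182392", 3))  # 14.182
print(truncate("14.182999", 3))  # 14.182, still not rounded up
print(Decimal("14.182999").quantize(Decimal("0.001"), rounding=ROUND_HALF_UP))  # 14.183
```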
Easiest way I know:
=LEFT(A1; x)
where x = the number of characters you want. Mind that the dot counts as a character as well.

Excel array function for checking monthly values

I have an array equation to tell me the number of unique values in a column (D) based on whether the date field in another column (B) is in a particular month.
My equation is:
=SUM(IF(MONTH($B$2:$B$63)=10,(IF(FREQUENCY(IF(LEN(D2:D63)>0,MATCH(D2:D63,D2:D63,0),""), IF(LEN(D2:D63)>0,MATCH(D2:D63,D2:D63,0),""))>0,1))),0)
This works great for October, and when I change the 10 to another number it works for all months except January. In case I have made a copying error, here is the formula for January:
=SUM(IF(MONTH($B$2:$B$63)=1,(IF(FREQUENCY(IF(LEN(D2:D63)>0,MATCH(D2:D63,D2:D63,0),""), IF(LEN(D2:D63)>0,MATCH(D2:D63,D2:D63,0),""))>0,1))),0)
This always returns #N/A.
Any ideas why?
There are a few things wrong with your construction.
Firstly, the array you are using for the bins_array parameter, which is derived from your MATCH construction combined with an IF statement, is forcing FREQUENCY to return an array containing fewer than 62 elements.
When this array is then compared with the initial IF clause, i.e. IF(MONTH($B$2:$B$63)=1, which does contain 62 elements, you have an issue, and, where possible, the way in which Excel resolves a comparison between two arrays of differing sizes is to artificially increase the smaller of the two so that it is of a dimension equal to that of the larger.
Of course, in doing this, it fills in the missing values with #N/As (what else could it do?). Hence your result.
In any case, repetition of the MATCH construction is not necessary for the bins_array parameter, and forces unnecessary extra calculation. As such, I am always surprised to see how many sources still recommend this set-up.
Finally, any IF clauses should appear within the FREQUENCY construction, not without.
Overall:
=SUM(IF(FREQUENCY(IF(LEN(D2:D63)>0,IF(MONTH($B$2:$B$63)=1,MATCH(D2:D63,D2:D63,0))),ROW(D2:D63)-MIN(ROW(D2:D63))+1),1))
is what you should be using.
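As a cross-check of what the formula is meant to compute, here is a small Python sketch of "distinct non-empty values among rows whose date falls in a given month". The data and the name distinct_in_month are invented for the illustration and are not taken from the question:

```python
from datetime import date


def distinct_in_month(dates, values, month):
    """Number of distinct non-empty values on rows whose date is in `month`."""
    return len({v for d, v in zip(dates, values) if v != "" and d.month == month})


# Hypothetical stand-ins for columns B (dates) and D (values).
dates = [date(2023, 1, 5), date(2023, 1, 9), date(2023, 10, 2),
         date(2023, 1, 20), date(2023, 10, 3)]
values = ["a", "b", "a", "a", ""]

print(distinct_in_month(dates, values, 1))   # 2 ("a" and "b")
print(distinct_in_month(dates, values, 10))  # 1 ("a"; the blank is ignored)
```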
Regards

Resources