How to get first significant figure from a number in Excel? - excel

I have a column of numbers in Excel 2016. The numbers span many orders of magnitude, but are all positive. Some are less than zero. How can I return the first significant figure of each cell in a new column?
For example, for the number 1.9 the result should be 1. For the number 0.9 the result should be 9.
Things I've tried:
Using LEFT() to get the first character. This works for values greater than 1, but for numbers between 0 - 1 it returns 0 (that is, LEFT(0.3, 1) returns 0). I've tried using this with scientific notation formatting and it returns the same result.
I've searched Google and SO for solutions to this problem. There are many posts about rounding to significant figures, but I'm looking to truncate, not round.
Reading through Office's online docs regarding scientific notation.

You could use scientific notation:
=LEFT(TEXT(A1,"0.000000000000000E+00"))
Note: You can only have 15 digits of precision in Excel so this should be OK.

you can multiply the number by a factor of 10 significant enough to deal with any 0 not wanted:
=--LEFT(A1*10^LEN(A1),1)

Read the cell value as text, replace dots and zeros (. / 0) with nothing, return the leftmost "character"; multiply it by 1 to coerce it back into a number:
=LEFT(SUBSTITUTE(SUBSTITUTE(TEXT(A1,"#"),".",""),"0",""))*1

You can also create a custom UDF (User Defined Function) that uses Regular Expressions to accomplish this task. This would require copy/paste VBA knowledge, as well as you setting a reference to:
Microsoft VBScript Regular Expressions 5.5
(which can be done by going to the VBE (Alt+F11), Tools > References. Then check the box of the reference listed above)
Paste the following UDF into a standard code module within the VBE:
Public Function SigNum(ByVal InputNumber As Double) As Long
Dim s As SubMatches
With New RegExp
.Pattern = "\.0*([^0])|^([^0])"
If .test(InputNumber) Then
Set s = .Execute(InputNumber)(0).SubMatches
If s(0) > 0 Then ' This is before the period
SigNum = s(0)
Else
SigNum = s(1)
End If
End If
End With
End Function
On your worksheet, you would be able to use your newly created formula as such:
=SigNum(A1)
You can see what it matches in the example on regex101. When viewing this site, green highlighted numbers are what would be returned if the value is < 0, and red would be what is returned if the value > 0). If the value = 0, this will return 0.
Breaking Down the Pattern
Here's how the pattern \.0*([^0])|^([^0]) works. First, you can see that there is a |, which essentially acts like an Or statement, so we will split these into two sections.
First Section \.0*([^0])
\. will match a literal period. This ensures that we are looking at a value that is less than 0.
0* matches all zeros, 0 to unlimited * times. We use * (zero to unlimited) instead of + (1 to unlimited) because a zero is not required to be in front of the significant number - but the zero itself isn't significant.
[^0] This is a negated character class [^...]. This means it will match anything that is not in this class. Since our significant number should be a value other than zero, we do not want to match a zero. And because it's surrounded by a capturing group (...), this is what is returned back to the function.
Second Section ^([^0])
We've established that since the first section didn't match, then the value must be greater than 0.
^ this is an anchor point that matches the beginning of the string. On the first section we didn't require it because we essentially used the period \. as our anchor. Since our value is greater than 0, we need to ensure we are starting from the absolute left of the input number.
(...) Capturing Group. Anything within this group will be returned as a submatch and ultimately back to the function as it's return value.
[^0] Negated Character class. It will match anything except a 0.

Related

How do I find the last number in a string with excel formulas

I'm parsing strings in excel, and I need to return everything through the last number. For example:
Input: A00XX
Output: A00
In my case, I know the last number will be between index 3 and 5, so I'm brute-forcing it with:
=LEFT([#Point],
IF(SUM((MID([#Point],5,1)={"0","1","2","3","4","5","6","7","8","9"})+0),5,
IF(SUM((MID([#Point],4,1)={"0","1","2","3","4","5","6","7","8","9"})+0),4,
IF(SUM((MID([#Point],3,1)={"0","1","2","3","4","5","6","7","8","9"})+0),3,
))))
Unfortunately, I've run into some edge cases where the numbers extended beyond index 5. Is there a generic way to find the last number in a string using excel formulas?
Note:
I've tried =MAX(SEARCH(... but it returns the index of the first number, not the last.
As a starting point: if we know the position of the last number, we can use LEFT to get the string to that point. Suppose that the position is 5:
=LEFT(A1, 5)
But, we don't know the position of the last number. Now, what if the only valid number was 0, and it only appeared once: then we could use FIND to locate the position of the number:
=LEFT(A1, FIND(0, A1))
But, we have more than one valid number. Suppose that we had all the numbers from 0 through 9, but each number could only appear once — then we could use MAX on a FIND array, to tell us which of the numbers is the last one:
=LEFT(A1, MAX(FIND({0,1,2,3,4,5,6,7,8,9}, A1)))
Unfortunately, FIND will throw a #VALUE! error any number doesn't appear, which will then make MAX return the same error. So, we need to fix that with IFERROR:
=LEFT(A1, MAX(IFERROR(FIND({0,1,2,3,4,5,6,7,8,9}, A1), 0)))
However, numbers can appear more than once. As such, we need a method to find the last occurrence of a value in a string (since FIND and SEARCH will, by default, return the first occurrence).
The SUBSTITUTE function has 3 mandatory arguments — Initial String, Value to be Replaced, Value to Replace with — and one Optional argument — the occurrence to replace. Normally, this is omitted, so that all occurrences are replaced. But, if we know how many times a character appears in a string, then we can replace just the last instance with a special/uncommon sub-string to search for.
To count how many times a character appears in a String, just start with the length of the String, then subtract the length when you SUBSTITUTE all copies of that character for Nothing:
=LEN(A1) - LEN(SUBSTITUTE(A1, 0, ""))
This means we can now replace the last occurrence of the character with, for example, ">¦<", and then FIND that:
=FIND(">¦<", SUBSTITUTE(A1, 0, ">¦<", LEN(A1) - LEN(SUBSTITUTE(A1, 0, ""))))
Of course, we want to do this for all the numbers from 0 to 9, and take the MAX value (remembering our IFERROR), so we need to put the Array of values back in:
=MAX(IFERROR(FIND(">¦<", SUBSTITUTE(A1, {0,1,2,3,4,5,6,7,8,9}, ">¦<", LEN(A1) - LEN(SUBSTITUTE(A1, {0,1,2,3,4,5,6,7,8,9}, "")))), 0))
Then, we plug that all back into our initial LEFT function:
=LEFT(A1, MAX(IFERROR(FIND(">¦<", SUBSTITUTE(A1, {0,1,2,3,4,5,6,7,8,9}, ">¦<", LEN(A1) - LEN(SUBSTITUTE(A1, {0,1,2,3,4,5,6,7,8,9}, "")))), 0)))
An alternative, assuming that the length of the string in question will never be more than 9 characters (which seems a safe assumption based on your description):
=LEFT(A1,MATCH(0,0+ISERR(0+MID(A1,{1;2;3;4;5;6;7;8;9},1))))
This, depending on your version of Excel, may or may not require committing with CTRL+SHIFT+ENTER.
Note also that the separator within the array constant {1;2;3;4;5;6;7;8;9} is the semicolon, which, for English-language versions of Excel, represents the row-separator. This may require amending if you are using a non-English-language version.
Of course, we can replace this static constant with a dynamic construction. However, since we are already making the assumption that 9 is an upper limit on the number of characters for the string in question, this would not seem to be necessary.
If you have the newest version of Excel, you can try something like:
=LEFT(D1,
LET(x, SEQUENCE(LEN(D1)),
MAX(IF(ISNUMBER(NUMBERVALUE(MID(D1, SEQUENCE(LEN(D1)), 1))), x))))
For example:

Excel: How to find closest number in table, many times

Excel
Need to find nearest float in a table, for each integer 0..99
https://www.excel-easy.com/examples/closest-match.html explains a great technique for finding the CLOSEST number from an array to a constant cell.
I need to perform this for many values (specifically, find nearest to a vertical list of integers 0..99 from within a list of floats).
Array formulas don't allow the compare-to value (integers) to change as we move down the list of integers, it treats it like a constant location.
I tried Tables, referring to the integers (works) but the formula from the above web site requires an Array operation (F2, control shift Enter), which are not permitted in Tables. Correction: You can enter the formula, control-enter the array function for one cell, copy the formulas, then insert table. Don't change the search cell reference!
Update:
I can still use array operations, but I manually have to copy the desired function into each 100 target cells. No biggie.
Fixed typo in formula. See end of question for details about "perfection".
Example code:
AI4=some integer
AJ4=MATCH(MIN(ABS(Table[float_column]-AI4)), ABS(Table[float_column]-AI4), 0)
repeat for subsequent integers in AI5...AI103
Example data:
0.1 <= matches 0
0.5
0.95 <= matches 1
1.51 <= matches 2
2.89
Consider the case where target=5, and 4.5, 5.5 exist in the list. One gives -0.5 and the other +0.5. Searching for ABS(-.5) will give the first one. Either one is decent, unless your data is non-monotonic.
This still needs a better solution.
Thanks in advance!
I had another problem, which pushed to a better solution.
Specifically, since the Y values for the X that I am interested in can be at varying distances in X, I will interpolate X between the X point before and after. Ie search for less than or equal, also greater than or equal, interpolate the desired X, then interpolate the Y values.
I could go a step further and interpolate N - 1 to N + 1, which will give cleaner results for noisy data.

PowerQuery M Conditional Column 'count' argument is out of range

I have a sheet with dates as MMDDYYY with no leading 0's if month number is single digit. For example, 1012018 or 12312018. Each record has a date, and each date is either 7 or 8 characters in length.
Here is the code I am using to convert the numbers to dates:
if Text.Length([ContractDate]) = 7
then
Text.Range([ContractDate],0,1)&"/"&Text.Range([ContractDate],1,2)&"/"&Text.Range([ContractDate],4,4)
else
Text.Range([ContractDate],0,2)&"/"&Text.Range([ContractDate],2,2)&"/"&Text.Range([ContractDate],4,4)
The code works fine for the "else" condition but I am getting error "Expression.Error: The 'count' argument is out of range. Details: 4" for all records where Text.Length() = 7. I verified this by adding a second column to get Length of ContractDate.
What am I missing?
EDIT: Problem Solved - I'm an idiot. I was getting an error because in the "then" condition, I am extracting a substring of (4,4) from a value that only has Len=7. I can't get 4 characters out of a 7 character string when starting at index of 4.
I know you found the issue with your code, but worth pointing out some things that might be good to know.
Text.Range with no character count will pull in all characters past the start point (so Text.Range([ContractDate], 4) would work for both).
Text.Middle operates like Text.Range but will not cause an error if you select a range that expands past the size of the string. This can be useful if for some reason you were dealing with variable size strings where you need a specific number of characters up to a limit past a certain position.
You could also use Text.PadStart([ContractDate], 8, "0") to pad the 7 length strings with a 0 at the start, and avoid the need for a conditional check all together.

binary sequence subsum combinations

Given a sequence a1a2....a_{m+n} with n +1s and m -1s, if for any 1=< i <=m+n, we have
sum(ai) >=0, i.e.,
a1 >= 0
a1+a2>=0
a1+a2+a3>=0
...
a1+a2+...+a_{m+n}>=0
then the number of sequence that meets the requirement is C(m+n,n) - C(m+n,n-1), where the first item is the total number of sequence, and the second term refers to those sub-sum < 0.
I was wondering whether there is a similar formula for the bi-side sequence number :
a1 >= 0
a1+a2>=0
a1+a2+a3>=0
...
a1+a2+...+a_{m+n}>=0
a_{m+n}>=0
a_{m+n-1}+a_{m+n}>=0
...
a1+a2+...+a_{m+n}>=0
I feel like it can be derived similarly with the single-side subsum problem, but the number C(m+n,n) - 2 * C(m+n,n-1) is definitely incorrect. Any ideas ?
A clue: the first case is a number of paths (with +-1 step) from (0,0) to (n+m, n-m) point, where path never falls below zero line. (Like Catalan numbers for parenthesis pairs, but without balance requirement n=2m)
Desired formula is a number of (+-1) paths which never rise above (n-m) line. It is possible to get recursive formulas. I hope that compact formula exists for it.
If we consider lattice path at nxm grid, where horizontal step for +1 and vertical step for -1, then we need a number of paths restricted by parallelogramm with (n-m) base

Case Function Equivalent in Excel

I have an interesting challenge - I need to run a check on the following data in Excel:
| A - B - C - D |
|------|------|------|------|
| 36 | 0 | 0 | x |
| 0 | 600 | 700 | x |
|___________________________|
You'll have to excuse my wonderfully bad ASCII art. So I need the D column (x) to run a check against the adjacent cells, then convert the values if necessary. Here's the criteria:
If column B is greater than 0, everything works great and I can get coffee. If it doesn't meet that requirement, then I need to convert A1 according to a table - for example, 32 = 1420 and place into D. Unfortunately, there is no relationship between A and what it needs to convert to, so creating a calculation is out of the question.
A case or switch statement would be perfect in this scenario, but I don't think it is a native function in Excel. I also think it would be kind of crazy to chain a bunch of =IF() statements together, which I did about four times before deciding it was a bad idea (story of my life).
Sounds like a job for VLOOKUP!
You can put your 32 -> 1420 type mappings in a couple of columns somewhere, then use the VLOOKUP function to perform the lookup.
Without reference to the original problem (which I suspect is long since solved), I very recently discovered a neat trick that makes the Choose function work exactly like a select case statement without any need to modify data. There's only one catch: only one of your choose conditions can be true at any one time.
The syntax is as follows:
CHOOSE(
(1 * (CONDITION_1)) + (2 * (CONDITION_2)) + ... + (N * (CONDITION_N)),
RESULT_1, RESULT_2, ... , RESULT_N
)
On the assumption that only one of the conditions 1 to N will be true, everything else is 0, meaning the numeric value will correspond to the appropriate result.
If you are not 100% certain that all conditions are mutually exclusive, you might prefer something like:
CHOOSE(
(1 * TEST1) + (2 * TEST2) + (4 * TEST3) + (8 * TEST4) ... (2^N * TESTN)
OUT1, OUT2, , OUT3, , , , OUT4 , , <LOTS OF COMMAS> , OUT5
)
That said, if Excel has an upper limit on the number of arguments a function can take, you'd hit it pretty quickly.
Honestly, can't believe it's taken me years to work it out, but I haven't seen it before, so figured I'd leave it here to help others.
EDIT: Per comment below from #aTrusty:
Silly numbers of commas can be eliminated (and as a result, the choose statement would work for up to 254 cases) by using a formula of the following form:
CHOOSE(
1 + LOG(1 + (2*TEST1) + (4*TEST2) + (8*TEST3) + (16*TEST4),2),
OTHERWISE, RESULT1, RESULT2, RESULT3, RESULT4
)
Note the second argument to the LOG clause, which puts it in base 2 and makes the whole thing work.
Edit: Per David's answer, there's now an actual switch statement if you're lucky enough to be working on office 2016. Aside from difficulty in reading, this also means you get the efficiency of switch, not just the behaviour!
The Switch function is now available, in Excel 2016 / Office 365
SWITCH(expression, value1, result1, [default or value2, result2],…[default or value3, result3])
example:
=SWITCH(A1,0,"FALSE",-1,"TRUE","Maybe")
Microsoft -Office Support
Note: MS has updated that page to only document the behavior of Excel 2019. Eventually, they will probably remove references to 2019 as well... To see what the page looked like in 2016, use the wayback machine:
https://web.archive.org/web/20161010180642/https://support.office.com/en-us/article/SWITCH-function-47ab33c0-28ce-4530-8a45-d532ec4aa25e
Try this;
=IF(B1>=0, B1, OFFSET($X$1, MATCH(B1, $X:$X, Z) - 1, Y)
WHERE
X = The columns you are indexing into
Y = The number of columns to the left (-Y) or right (Y) of the indexed column to get the value you are looking for
Z = 0 if exact-match (if you want to handle errors)
I used this solution to convert single letter color codes into their descriptions:
=CHOOSE(FIND(H5,"GYR"),"Good","OK","Bad")
You basically look up the element you're trying to decode in the array, then use CHOOSE() to pick the associated item. It's a little more compact than building a table for VLOOKUP().
I know it a little late to answer but I think this short video will help you a lot.
http://www.xlninja.com/2012/07/25/excel-choose-function-explained/
Essentially it is using the choose function. He explains it very well in the video so I'll let do it instead of typing 20 pages.
Another video of his explains how to use data validation to populate a drop down which you can select from a limited range.
http://www.xlninja.com/2012/08/13/excel-data-validation-using-dependent-lists/
You could combine the two and use the value in the drop down as your index to the choose function. While he did not show how to combine them, I'm sure you could figure it out as his videos are good. If you have trouble, let me know and I'll update my answer to show you.
I understand that this is a response to an old post-
I like the If() function combined with Index()/Match():
=IF(B2>0,"x",INDEX($H$2:$I$9,MATCH(A2,$H$2:$H$9,0),2))
The if function compare what is in column b and if it is greater than 0, it returns x, if not it uses the array (table of information) identified by the Index() function and selected by Match() to return the value that a corresponds to.
The Index array has the absolute location set $H$2:$I$9 (the dollar signs) so that the place it points to will not change as the formula is copied. The row with the value that you want returned is identified by the Match() function. Match() has the added value of not needing a sorted list to look through that Vlookup() requires. Match() can find the value with a value: 1 less than, 0 exact, -1 greater than. I put a zero in after the absolute Match() array $H$2:$H$9 to find the exact match. For the column that value of the Index() array that one would like returned is entered. I entered a 2 because in my array the return value was in the second column. Below my index array looked like this:
32 1420
36 1650
40 1790
44 1860
55 2010
The value in your 'a' column to search for in the list is in the first column in my example and the corresponding value that is to be return is to the right. The look up/reference table can be on any tab in the work book - or even in another file. -Book2 is the file name, and Sheet2 is the 'other tab' name.
=IF(B2>0,"x",INDEX([Book2]Sheet2!$A$1:$B$8,MATCH(A2,[Book2]Sheet2!$A$1:$A$8,0),2))
If you do not want x return when the value of b is greater than zero delete the x for a 'blank'/null equivalent or maybe put a 0 - not sure what you would want there.
Below is beginning of the function with the x deleted.
=IF(B2>0,"",INDEX...
If you don't have a SWITCH statement in your Excel version (pre-Excel-2016), here's a VBA implementation for it:
Public Function SWITCH(ParamArray args() As Variant) As Variant
Dim i As Integer
Dim val As Variant
Dim tmp As Variant
If ((UBound(args) - LBound(args)) = 0) Or (((UBound(args) - LBound(args)) Mod 2 = 0)) Then
Error 450 'Invalid arguments
Else
val = args(LBound(args))
i = LBound(args) + 1
tmp = args(UBound(args))
While (i < UBound(args))
If val = args(i) Then
tmp = args(i + 1)
End If
i = i + 2
Wend
End If
SWITCH = tmp
End Function
It works exactly like expected, a drop-in replacement for example for Google Spreadsheet's SWITCH function.
Syntax:
=SWITCH(selector; [keyN; valueN;] ... defaultvalue)
where
selector is any expression that is compared to keys
key1, key2, ... are expressions that are compared to the selector
value1, value2, ... are values that are selected if the selector equals to the corresponding key (only)
defaultvalue is used if no key matches the selector
Examples:
=SWITCH("a";"?") returns "?"
=SWITCH("a";"a";"1";"?") returns "1"
=SWITCH("x";"a";"1";"?") returns "?"
=SWITCH("b";"a";"1";"b";TRUE;"?") returns TRUE
=SWITCH(7;7;1;7;2;0) returns 2
=SWITCH("a";"a";"1") returns #VALUE!
To use it, open your Excel, go to Develpment tools tab, click Visual Basic, rightclick on ThisWorkbook, choose Insert, then Module, finally copy the code into the editor. You have to save as a macro-friendly Excel workbook (xlsm).
Even if old, this seems to be a popular questions, so I'll post another solution, which I think is very elegant:
http://fiveminutelessons.com/learn-microsoft-excel/using-multiple-if-statements-excel
It's elegant because it uses just the IF function. Basically, it boils down to this:
if(condition, choose/use a value from the table, if(condition, choose/use another value from the table...
And so on
Works beautifully, even better than HLOOKUP or VLOOOKUP
but... Be warned - there is a limit to the number of nested if statements excel can handle.
Microsoft replace SWITCH, IFS and IFVALUES with CHOOSE only function.
=CHOOSE($L$1,"index_1","Index_2","Index_3")
Recently I unfortunately had to work with Excel 2010 again for a while and I missed the SWITCH function a lot. I came up with the following to try to minimize my pain:
=CHOOSE(SUM((A1={"a";"b";"c"})*ROW(INDIRECT(1&":"&3))),1,2,3)
CTRL+SHIFT+ENTER
where A1 is where your condition lies (it could be a formula, whatever). The good thing is that we just have to provide the condition once (just like SWITCH) and the cases (in this example: a,b,c) and results (in this example: 1,2,3) are ordered, which makes it easy to reason about.
Here is how it works:
Cond={"c1";"c2";...;"cn"} returns a N-vector of TRUE or FALSE (with behaves like 1s and 0s)
ROW(INDIRECT(1&":"&n)) returns a N-vector of ordered numbers: 1;2;3;...;n
The multiplication of both vectors will return lots of zeros and a number (position) where the condition was matched
SUM just transforms this vector with zeros and a position into just a single number, which CHOOSE then can use
If you want to add another condition, just remember to increment the last number inside INDIRECT
If you want an ELSE case, just wrap it inside an IFERROR formula
The formula will not behave properly if you provide the same condition more than once, but I guess nobody would want to do that anyway
If your using Office 2016 or later, or Office 365, there is a new function that acts similarly to a CASE function called IFS. Here's the description of the function from Microsoft's documentation:
The IFS function checks whether one or more conditions are met, and returns a value that corresponds to the first TRUE condition. IFS can take the place of multiple nested IF statements, and is much easier to read with multiple conditions.
An example of usage follows:
=IFS(A2>89,"A",A2>79,"B",A2>69,"C",A2>59,"D",TRUE,"F")
You can even specify a default result:
To specify a default result, enter TRUE for your final logical_test argument. If none of the other conditions are met, the corresponding value will be returned.
The default result feature is included in the example shown above.
You can read more about it on Microsoft's Support Documentation

Resources