Can someone confirm how Microsoft Excel 2007 internally represents numbers? - excel

I know the IEEE 754 floating point standard by heart as I had to learn it for an exam. I know exactly how floating point numbers are used and the problems that they can have. I can manually do any operation on the binary representation of floating point numbers.
However, I have not found a single source which unambiguously states that Excel uses 64-bit floating point numbers to internally represent every single cell "type" except for text. I have no idea whether some of the types use signed or unsigned integers and some use 64-bit floating point.
I have found literally trillions of articles which 1) describe floating point numbers and then 2) talk about being careful with Excel because of floating point numbers. I have not found a single statement saying "all types are 64-bit floating point numbers except text". Nor have I found a single statement which says "changing the type of a cell only changes its visual representation and not its internal representation, unless you change the type from text to a non-text type or from a non-text type to text".
This is literally all I want to know, and it's so simple and axiomatic that I am amazed that I can find trillions of articles and pages which talk around these statements but do not state them directly.

Excel 2007 supports the OpenXML format, which is a ZIP file (.XLSX) containing a bunch of XML files. There is an SDK for working with the OpenXML format; you can get the docs for it here, and download it here.
Basically, numbers are stored as plain text within a <v> element, so if cell A1 has the number 42 and cell B1 has the number 81.57 in the UI, the XML would look like the following fragment:
<row r="1" spans="1:2">
<c r="A1">
<v>42</v>
</c>
<c r="B1">
<v>81.569999999999993</v>
</c>
</row>
When working with OpenXML I would highly recommend just using the SDK and not going after it on your own.
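To see how the text in the <v> elements relates to doubles, here is a minimal Python sketch that parses the fragment above with the standard library. It only illustrates the text-to-double round trip; it is not how the SDK works, and a real worksheet file wraps these elements in an XML namespace that this fragment omits. The name read_cells is made up for the example.

```python
import xml.etree.ElementTree as ET

# The <row> fragment quoted above, without the surrounding worksheet markup.
FRAGMENT = """
<row r="1" spans="1:2">
  <c r="A1"><v>42</v></c>
  <c r="B1"><v>81.569999999999993</v></c>
</row>
""".strip()


def read_cells(xml_text):
    """Map each cell reference to the double parsed from its <v> text."""
    row = ET.fromstring(xml_text)
    return {c.get("r"): float(c.find("v").text) for c in row.iter("c")}


cells = read_cells(FRAGMENT)
print(cells)
```

The 17-digit text 81.569999999999993 parses to the same double as 81.57, which is why the UI can show the short form while the file carries the long one.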

This page has a reference for internal Excel data types.

It stores numbers as doubles.
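For anyone who wants to look at the bit pattern of a double, this is a small Python sketch. It assumes Python's float is the same IEEE 754 binary64 type, and it shows nothing about Excel's internals beyond that shared format. The helper name double_bits is made up.

```python
import struct


def double_bits(value):
    """Return the 8 bytes of an IEEE 754 binary64 value as big-endian hex."""
    return struct.pack(">d", value).hex()


print(double_bits(42.0))  # sign/exponent/fraction packed into 16 hex digits
print(double_bits(0.1))   # 0.1 has no exact binary representation
```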

Related

How can I convert four bytes of hexadecimal into a floating point number within Excel

I have copied a stream of hexadecimal data from a Wireshark capture into Excel as a text value with no spaces.
04000000ffffffff2b010000c900000000000000000000000000000000000000
I know how to convert a section of this to an integer in Excel using the formula:
=HEX2DEC(MID(B173,19,2) & MID(B173,17,2))
which in this case returns 299.
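For comparison outside Excel, the same little-endian integer extraction can be sketched in Python. The helper name le_uint and its byte-offset parameter are my own; MID position 17 in the formula corresponds to byte offset 8 here because each byte is two hex characters.

```python
CAPTURE = "04000000ffffffff2b010000c900000000000000000000000000000000000000"


def le_uint(hex_text, byte_offset, size):
    """Read `size` bytes at `byte_offset` as a little-endian unsigned integer."""
    # Each byte is two hex characters, so slice the text before decoding.
    chunk = hex_text[2 * byte_offset : 2 * (byte_offset + size)]
    return int.from_bytes(bytes.fromhex(chunk), "little")


print(le_uint(CAPTURE, 8, 2))
```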
But how do I do something similar to retrieve a float? Some articles I've found discuss writing a C# program, but I'm just trying to set up a little debug environment in Excel. Other articles only seem to discuss doing the conversion in the other direction, and I couldn't figure out the inverse procedure.
Edit: An example of what I am after is:
=SOMEFUNCTION("d162c240") --> 6.062 (I think)
or possibly
=SOMEFUNCTION("c240" & "d162") --> 6.062
What a mission! But for anybody interested I got there, ending up with a custom LAMBDA function HexToFloat() that was built without the need for VBA. The LAMBDA function and several other functions used are new in Excel 365 - if you're getting errors reproducing the functions below you might have an older version.
The Wikipedia page Single-precision floating-point format was my primary reference, in particular the section Converting binary32 to decimal.
Read the article for a full understanding (good luck!!), but for TLDR it's worth noting that there is an implied bit that is not actually stored in memory and when combined with the fraction component this is referred to as the "significand".
For my solution I had to build several supporting LAMBDA functions in Excel's Name Manager. In my case the data was stored as Little-Endian so if this is not applicable in your case you can skip that step.
Name: LittleEndianHex
Comment: Interpret hex data as little-endian by reversing the order of its four bytes (two hex characters each)
Refers to: =LAMBDA(HexData,MID(HexData,7,2) & MID(HexData,5,2) & MID(HexData,3,2) & MID(HexData,1,2))

Name: HEXtoBIN
Comment: Handles bigger numbers than the native Excel HEX2BIN function
Refers to: =LAMBDA(number,[places],LET(Unpadded,REDUCE("",HEX2BIN(MID(number,SEQUENCE(LEN(number)),1),4),LAMBDA(result,byte,result & byte)),REPT("0",IF(ISOMITTED(places),0,places-LEN(Unpadded))) & Unpadded))

Name: BINtoDEC
Comment: Handles bigger numbers than the native Excel BIN2DEC function
Refers to: =LAMBDA(E,SUMPRODUCT(MID("0"&E,ROW(INDIRECT("1:"&LEN("0"&E))),1)*2^(LEN("0"&E)-ROW(INDIRECT("1:"&LEN("0"&E))))))

Name: HexToFloat
Comment: Convert the hexadecimal representation of a little-endian IEEE 754 binary32 (4-byte, single-precision) number to its decimal value
Refers to: =LAMBDA(HexData,LET(LEHex,LittleEndianHex(HexData),Binary,HEXtoBIN(LEHex,32),bSign,LEFT(Binary,1),bExponent,MID(Binary,2,8),bImplicit,IF(bExponent=REPT("0",8),"0","1"),bSignificand,bImplicit & RIGHT(Binary,23),dSign,BIN2DEC(bSign),dExponent,BINtoDEC(bExponent),dSignificand,BINtoDEC(bSignificand),(-1)^dSign*(dSignificand*2^-23)*2^(MAX(dExponent,1)-127)))
Once you've done all this you can enter the HexToFloat formula directly into a cell, e.g. =HexToFloat(A1) or =HexToFloat("d162c240"). This particular example returns the result 6.07456254959106.
(PS I've never asked for votes before but this took me weeks! If you find it useful please consider giving me an up-tick.)
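As a cross-check of the HexToFloat result above, here is a Python sketch that follows the same steps (byte reversal, then sign, exponent and significand) and compares the outcome with the standard library's own binary32 decoder. The function name hex_to_float32 is made up, and infinity/NaN are deliberately left out.

```python
import struct


def hex_to_float32(hex_text):
    """Decode 8 hex digits holding a little-endian IEEE 754 binary32 value."""
    bits = int.from_bytes(bytes.fromhex(hex_text), "little")
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent == 0xFF:
        raise ValueError("infinity and NaN are not handled in this sketch")
    if exponent == 0:
        # Subnormal: no implicit leading 1, exponent fixed at -126.
        magnitude = fraction * 2.0**-23 * 2.0**-126
    else:
        # Normal: the implicit leading 1 completes the significand.
        magnitude = (1 + fraction * 2.0**-23) * 2.0 ** (exponent - 127)
    return -magnitude if sign else magnitude


manual = hex_to_float32("d162c240")
builtin = struct.unpack("<f", bytes.fromhex("d162c240"))[0]
print(manual, builtin)
```

Both routes should agree with the 6.07456254959106 reported above, up to the number of digits Excel displays.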
Short answer = DEC2HEX('Your reference cell or value')
Your formula swaps the low and high bytes when you extract the hex value from your string. Hex works like any other positional number system: the least significant digit is on the right.
This person explains working with hex values in Excel
Debug of your formula
You can use multiple stages in one formula like:
=DEC2HEX(SUM(HEX2DEC("c240");HEX2DEC("d162")))
Note: If you use US settings in Office replace ; with ,

excel power query transforming data from text to decimal

How do I transform a column with positive and negative values, which is loaded as text, into a column with decimal values without errors? The minus sign seems to cause the error.
I use a German version of Excel 365 on Windows 10.
Finally solved the problem: it was a quirky minus sign, which had to be replaced by a standard minus sign before changing the type to decimal. Unfortunately, at first sight these quirky decimals looked like normal ones.
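The fix above was done in Power Query; the same idea can be sketched in Python for anyone cleaning such a column elsewhere. Which character the "quirky" minus actually was is not stated, so the lookalikes below (U+2212, U+2013, U+2010) are assumptions, as is the German number notation; parse_german_decimal is a made-up name.

```python
# Characters that look like a minus sign but are not ASCII "-" (assumed list).
MINUS_LOOKALIKES = {
    "\u2212": "-",  # MINUS SIGN
    "\u2013": "-",  # EN DASH
    "\u2010": "-",  # HYPHEN
}


def parse_german_decimal(text):
    """Parse text such as '−1.234,5' into a float, normalising the sign first."""
    cleaned = text.strip()
    for odd, plain in MINUS_LOOKALIKES.items():
        cleaned = cleaned.replace(odd, plain)
    # German notation: "." groups thousands, "," marks the decimal point.
    cleaned = cleaned.replace(".", "").replace(",", ".")
    return float(cleaned)


print(parse_german_decimal("\u22121.234,5"))
```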

Why do Excel values in parentheses become negative values?

A colleague and I encountered a behavior in Excel which isn't clear to us.
Background:
We have a tool which converts an Excel sheet into a table format. The tool calculates the formulas in the Excel sheet and replaces variables inside them with specific values.
The tool is used by one of our customers, who uses values like (8) or (247).
These values are automatically translated by Excel to -8 or -247.
Question:
I saw that many people want to display negative numbers in parentheses. But why would Excel change values in parentheses to a negative number?
I know that I could simply change the cell config to text and this would solve the problem but I wonder if there is a reason for the behavior, since there seems to be no mathematical reason for this.
It's simply the different formats of the cells you are copying the values from and pasting to. Numbers with parentheses are in cells with the "Accounting" format, while negatives are stored in cells with the General or a standard number format. To resolve this, change the format of the destination cells to Accounting using cell formatting (Number > Accounting).
To answer the why, it's because accountants put negative numbers in brackets for readability
Unfortunately, this is one of the Excel feature/bugs that helps some folks and frustrates others. When opening a file or pasting content, Excel will immediately and always try to parse any values into formats it deems appropriate, which can mess up data like:
Zip Codes / Tel. # → Numeric: 05401 → 5401
Fractions → Dates: 11/20 → Nov, 20th YYYY
Std. Errors → Negative Numbers: (0.1) → -0.1
For some workarounds, see Stop Excel from automatically converting certain text values to dates
Once the file is open/pasted, the damage is already done. At that point, your best bet is:
Updating the field and displaying as text (prefixing with ') to prevent re-casting
Formatting the field if the operation wasn't lossy and is just presenting the info differently
Running a clean if/else to pad or otherwise convert your data based on the identified errors
Specific to displaying values back in parens: if Excel is converting them and treating them like negative numbers (which may or may not be the appropriate way to actually store the data), you can apply a different format to positive and negative numbers to wrap them back in parens.
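The "pad" repair mentioned in the list above could look like this if done outside Excel. This is a minimal Python sketch assuming five-digit ZIP codes; restore_zip and its width parameter are hypothetical.

```python
def restore_zip(value, width=5):
    """Re-pad a ZIP code that lost its leading zeros when read as a number."""
    return str(int(value)).zfill(width)


print(restore_zip(5401))
```

This only works because the expected width is known; once a value has been turned into a date or a negative number, there may be nothing left to pad.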
It is standard practice to write negative values as numbers in parentheses, especially in accounting. This makes negative values stand out much more than a simple negative hyphen; compare -1 and (1).
Excel is a tool very commonly used by accountants and supports accountant-style spreadsheets. Therefore, entering (100) means having a value of -100, even if there is no minus hyphen!
Here is a fun fact: if you enter (-10), Excel will treat it as normal text.
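A tool that needs to mirror this convention when reading values could apply a rule like the one below. This Python sketch only imitates the behaviour described in this thread ((8) becomes -8, (-10) stays text); it is not Excel's actual parser, and read_entry is a made-up name.

```python
import re

# An unsigned number wrapped in parentheses, e.g. "(8)" or "(0.1)".
_PAREN_NUMBER = re.compile(r"^\((\d+(?:\.\d+)?)\)$")


def read_entry(text):
    """Return a float for numeric entries, negating accounting-style parens."""
    stripped = text.strip()
    match = _PAREN_NUMBER.match(stripped)
    if match:
        return -float(match.group(1))
    try:
        return float(stripped)
    except ValueError:
        # Anything else, including "(-10)", is kept as text.
        return text


for entry in ["(8)", "(247)", "(0.1)", "(-10)"]:
    print(entry, "->", read_entry(entry))
```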

BUG in Excel CountIF function

I am having problems with the CountIf Function in Excel.
=COUNTIF(A:A,A2)
The A column consists of these items:
0107791489614255200011140926107503100513
0107791489614255200011140926107503100457
0107791489614255200011140926107503100518
0107791489614255200011140926107503100503
0107791489614255200011140926107503100519
0107791489614255200011140926107503100444
0107791489614255200011140926107503100521
0107791489614255200011140926107503100438
0107791489614255200011140926107503100449
0107791489614255200011140926107503100443
0107791489614255200011140926107503100501
0107791489614255200011140926107503100455
The formula returns 12, even though these strings are not the same at all. It counts them as identical; I am thinking this is related to their length?
What do you guys think? I appreciate your help.
+1, A good question. Not really a bug but a feature!
This is due to Excel implicitly converting the inputs to its internal numeric type and losing precision in doing so. Excel's internal numeric type is an IEEE 754 double-precision floating-point number, and Excel keeps only 15 significant digits. (Although it does clever things with formatting and error propagation so it appears to get sums like 1/3 + 1/3 + 1/3 correct.)
These 40-digit values differ only in their last three digits, far beyond that precision, so they all compare as mutually equal.
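The collapse can be reproduced outside Excel. In this Python sketch the twelve strings from the question are compared once as text and once after conversion to a double (Python's float is assumed to be the same IEEE 754 binary64 type; Excel's extra 15-digit truncation is not modelled).

```python
CODES = [
    "0107791489614255200011140926107503100513",
    "0107791489614255200011140926107503100457",
    "0107791489614255200011140926107503100518",
    "0107791489614255200011140926107503100503",
    "0107791489614255200011140926107503100519",
    "0107791489614255200011140926107503100444",
    "0107791489614255200011140926107503100521",
    "0107791489614255200011140926107503100438",
    "0107791489614255200011140926107503100449",
    "0107791489614255200011140926107503100443",
    "0107791489614255200011140926107503100501",
    "0107791489614255200011140926107503100455",
]

target = CODES[1]  # plays the role of A2

# Comparing as doubles: every code lands on the same floating-point value.
numeric_count = sum(float(code) == float(target) for code in CODES)

# Comparing as text: only the cell itself matches.
exact_count = sum(code == target for code in CODES)

print(numeric_count, exact_count)
```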
One remedy would be to prefix each string with ' (single quotation) which will prevent the conversion to the numeric type. Then the COUNTIF value returns 1. (At least in my version of Excel; 2013).
Preceding the strings with a single apostrophe will not remedy the situation. COUNTIF is designed to interpret data as numerical, where possible, irrespective of the datatype of the values in question. This is sometimes helpful, sometimes (as here) not.
SUMPRODUCT does not exhibit this property:
=SUMPRODUCT(0+($A$1:$A$12=A2))
will return 1, as desired.
Regards

Number representation by Excel

I'm building a VBA program in Excel 2007 that takes long strings of digits (UPCs) as input. The program usually works fine, but sometimes the digit string seems to be converted to scientific notation, and I want to avoid this, since I then VLOOKUP them.
So, I'd like to treat a textbox input as an exact string. No scientific notation, no number interpretation.
On a related note, this one really gets weird. I have two UPCs that are identical (as far as I or any text editor can tell), yet one of the values gives a successful VLOOKUP and the other does not.
Does anybody have suggestions on this one? Thanks for your time.
Long strings that look like numbers can be a pain in Excel. If you're not doing any math on the "number", it should really be treated as text. As you've discovered, when you want to force Excel to treat something as a string, precede it with an apostrophe.
There are a couple of common problems with VLOOKUP. The one you found, extra whitespace, can be avoided by using a formula such as
=VLOOKUP(TRIM(A1),B1:C100,2,FALSE)
The TRIM function will remove those extraneous spaces. The other common problem with VLOOKUP is that one argument is a string and the other is a number. I run into this one a lot with imported data. You can use the TEXT function to do the VLOOKUP without having to change the raw data
=VLOOKUP(TEXT(A1,"00000"),B1:C100,2,FALSE)
will convert A1 to a five digit string before it tries to look it up in column B. And, of course, if your data is a real mess, you may need
=VLOOKUP(TEXT(TRIM(A1),"00000"),B1:C100,2,FALSE)
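The same "normalise the key before looking it up" idea, sketched in Python for anyone cleaning the data outside Excel. The table contents, the five-digit width and the names normalise_key and lookup are all hypothetical.

```python
TABLE = {"00042": "widget", "01234": "gadget"}  # hypothetical lookup data


def normalise_key(value, width=5):
    """Turn a number or a padded/unpadded string into a fixed-width text key."""
    return str(value).strip().zfill(width)


def lookup(value):
    """Look up a value after normalising it; returns None when not found."""
    return TABLE.get(normalise_key(value))


print(lookup(42), lookup(" 1234 "))
```

Normalising both the lookup value and the table keys to text sidesteps the string-versus-number mismatch as well as the stray whitespace.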

Resources