How was non-decimal money represented in software?

A lot of the answers to questions about the accuracy of float and double recommend using decimal for monetary amounts. This works because today all currencies are decimal except MGA and MRO, and those have subunits of 1/5, so they are still decimal-friendly.
But what about the software used in U.S. stock markets when prices were quoted in 1/16ths of a dollar? The accuracy of binary data types wouldn't have been an issue then, right?
Going further back, how did pre-1971 British accounting software deal with pounds, shillings, and pence? Did their versions of COBOL have a special PIC clause for it? Were all amounts stored in pence? How was decimalisation handled?

PL/I had a type specifically for British currency - I don't know about COBOL. The British currency at one time incorporated farthings, or a quarter of a penny; I'm not sure though that computers had to deal with those, just with half pennies or ha'pennies.
Accurate accounting usually uses special types representing decimals exactly. The 2008 revision of IEEE 754 added decimal floating-point formats, and some chips (notably IBM pSeries) support them in hardware.
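For illustration, a minimal Java sketch of the difference such a decimal type makes (BigDecimal here standing in for "a type representing decimals exactly"; the same point applies to C#'s decimal or Python's decimal module):

import java.math.BigDecimal;

public class ExactDecimals {
    public static void main(String[] args) {
        // Binary floating point cannot hold 0.1 or 0.2 exactly...
        System.out.println(0.1 + 0.2);    // 0.30000000000000004

        // ...while a decimal type carries the decimal digits exactly.
        BigDecimal a = new BigDecimal("0.1");
        BigDecimal b = new BigDecimal("0.2");
        System.out.println(a.add(b));     // 0.3
    }
}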

COBOL could do it, e.g. PICTURE 9(4)D88D6 DISPLAY-ST; see http://www.computinghistory.org.uk/downloads/10924, page 117.

1/16 can be represented in four digits as .0625. For fractions of that type you just add some additional decimal places.
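For instance, with Java's BigDecimal (just as an illustration) the division terminates, so no rounding mode is even needed:

import java.math.BigDecimal;

public class Sixteenths {
    public static void main(String[] args) {
        // 1/16 has a terminating decimal expansion, so divide() succeeds without a rounding mode.
        BigDecimal sixteenth = BigDecimal.ONE.divide(new BigDecimal("16"));
        System.out.println(sixteenth);    // 0.0625

        // A price of 12 3/16 dollars, still exact:
        BigDecimal price = new BigDecimal("12").add(new BigDecimal("3").multiply(sixteenth));
        System.out.println(price);        // 12.1875
    }
}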

Related

How to handle rounding of numbers in systems

We can take Java as a perspective. Suppose we have a system that has items with a price. The price will go through several operations, let's say 15: the items' prices will be divided, multiplied, summed, and subtracted with decimals over and over. Now let's say that our system talks to another system, which also performs operations on item prices. In the end, the price values of the two systems have to match exactly, to the cent. We are hypothetically talking about accounting systems. In my experience, the chance that the two match is very small. How can we handle such a situation? Is there a rule for rounding?
I'd say always calculate with the raw numbers (i.e., a lot of decimals) and transfer that number as well. Only for the comparison, round to a less precise, previously agreed-upon degree of precision (which is still precise enough for the purpose). That way you have matching results while maintaining high enough precision.
A factor to consider is the rounding method, in case the two sides differ. There are three I know about:
Half away from zero (the one taught in school): 0.5 --> 1.0
Half towards zero: 1.5 --> 1.0 but -2.5 --> -2.0
Half to even, also called "banker's rounding": 1.5 --> 2.0 but 2.5 --> 2.0
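In Java (the perspective taken above), these three map onto BigDecimal's RoundingMode.HALF_UP, HALF_DOWN and HALF_EVEN respectively; a quick sketch:

import java.math.BigDecimal;
import java.math.RoundingMode;

public class RoundingModes {
    public static void main(String[] args) {
        String[] values = {"0.5", "1.5", "-2.5", "2.5"};
        for (String v : values) {
            BigDecimal d = new BigDecimal(v);
            System.out.printf("%5s  half-up: %s  half-down: %s  half-even: %s%n",
                    v,
                    d.setScale(0, RoundingMode.HALF_UP),
                    d.setScale(0, RoundingMode.HALF_DOWN),
                    d.setScale(0, RoundingMode.HALF_EVEN));
        }
    }
}

Whichever mode you pick, it has to be the same on both sides, otherwise the half-way cases will differ by one cent.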

Are there any open databases where I can find out how many decimal places a given currency has?

My app needs functionality where, if the user types "Euro", the app should be able to tell how many decimal places there are after the decimal point. For clarification: the dollar has 100 cents, so there are two decimal places after the decimal point, for example $32.56. My question is: where can I get this kind of data for most of the currencies in the world?
Yes, there is: the official list of currencies from ISO 4217, available as an XML feed.
It's not indexed by currency code but by country, so there is a bit of deduplication to be done. The field you're looking for is CcyMnrUnts:
<CcyNtry>
<CtryNm>FRANCE</CtryNm>
<CcyNm>Euro</CcyNm>
<Ccy>EUR</Ccy>
<CcyNbr>978</CcyNbr>
<CcyMnrUnts>2</CcyMnrUnts>
</CcyNtry>
This is the list we use in the brick/money library.
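A minimal sketch of reading that feed in Java, assuming the XML has already been downloaded to a local file (the file name here is an assumption; the element names are the ones shown above). Entries without a currency code, and "N.A." minor units (funds, metals), are skipped:

import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class MinorUnits {
    public static void main(String[] args) throws Exception {
        // Assumed local copy of the ISO 4217 XML feed.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse("list_one.xml");

        Map<String, Integer> minorUnits = new HashMap<>();
        NodeList entries = doc.getElementsByTagName("CcyNtry");
        for (int i = 0; i < entries.getLength(); i++) {
            Element entry = (Element) entries.item(i);
            NodeList code = entry.getElementsByTagName("Ccy");
            NodeList units = entry.getElementsByTagName("CcyMnrUnts");
            if (code.getLength() == 0 || units.getLength() == 0) {
                continue; // some entries list no currency at all
            }
            String text = units.item(0).getTextContent().trim();
            if (!text.matches("\\d+")) {
                continue; // "N.A." for funds and precious metals
            }
            // Duplicate entries across countries simply overwrite each other here.
            minorUnits.put(code.item(0).getTextContent().trim(), Integer.parseInt(text));
        }

        System.out.println(minorUnits.get("EUR")); // 2
    }
}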

Comparing strings for their similarities?

I want to count the number of times a certain college course occurs in a list of thousands of entries. The problem is that the course is not always spelled the same. For example, Computer Engineering can be spelled Computers Engineering. What is a proper, elegant way to test whether two strings are very similar?
I would try to canonicalize the strings using stemming. The idea is to give each string its canonical form; two different strings that represent the same word are very likely to have the same canonical form (for example, Computer and Computers will have the same canonical form, and you will get a match).
The Porter stemming algorithm is often used for canonicalization.
An alternative is to grade the strings by the distance between them; the suggested Levenshtein distance can help you with that, but personally I'd prefer canonicalization.
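If you do go the distance route, Levenshtein distance is a short dynamic-programming routine; a sketch in Java (how small a distance counts as "very similar" is up to you, e.g. relative to the string lengths):

public class Levenshtein {
    // Classic edit distance: minimum number of insertions, deletions and substitutions.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; prev = curr; curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(distance("Computer Engineering", "Computers Engineering")); // 1
    }
}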

xsd pattern acceptable for decimals?

We have a request to implement our web service response so that xsd:decimal fraction digits are zero-padded when the value is not long enough and a pattern indicates so. I am wondering whether this is a reasonable request and whether xsd:decimal is supposed to be used with patterns like these.
Here is the relevant part of the xsd according to their specs:
<xsd:simpleType>
<xsd:restriction base="xsd:decimal">
<xsd:totalDigits value="14"/>
<xsd:fractionDigits value="2"/>
<xsd:pattern value="[\-+]?[0-9]{1,12}[.][0-9]{2}"/>
</xsd:restriction>
</xsd:simpleType>
So fractionDigits is set to 2, which means the fraction can have at most 2 digits. According to http://zvon.org/xxl/XMLSchemaTutorial/Output/ser_types_st2.html it is also fine if there are fewer fraction digits (for example, a number like 5.1).
But according to the pattern ({2}) there should always be exactly 2 fraction digits.
We're developing a generic application development platform (Mendix), and there's no telling in advance what the decimal will be used for (currency, pH values, distances, etc.). This case comes from a specific project where our platform is being used, but normally we won't know what kind of data is being transferred. We could decide to just follow the WSDL in this regard, which states the value should have 2 fraction digits, but our implementation of it must be very generic.
There is nothing stating what exactly these fraction digits should be padded with, or even that we should pad instead of just leaving the fraction out altogether. In theory we could decide to pad with 5's until the value matches the pattern. As far as I know, patterns are rarely used, and when they are it's for things like passwords. The XSD specification is vague, though, so it would be appreciated if someone could shed some light on whether this is valid use of an XSD and whether it makes sense for us to pad with 0's.
To be practical, my litmus test would be to look at this from the perspective of the technology stack directly involved or, better yet, of whatever is mainstream.
I can think of JAXB on Java, or xsd/svcutil/etc.exe on .NET. A quick test of these pretty common tools against this schema fragment, using a value of 1, would fail to produce valid XML. This would send developers scrambling for all sorts of customizations to make it work as per the XSD pattern. Painful, with high development and maintenance costs...
The same applies to XSLT; there would be a need to manually format the output... Bottom line, XSD patterns are not machine-usable for "automatic" formatting; I have yet to see such a thing...
I also believe that a requirement such as this is unreasonable, and I personally feel it should be considered an antipattern when it comes to describing data being exchanged. Since there's no absolute, it is conceivable that there may be an exception; I can't think of one, but you should explore the reason why you were presented with such a requirement. I would then try to find a solution that doesn't involve this pattern...
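To illustrate the kind of customization mentioned above, here is a minimal sketch (class name hypothetical) of a JAXB XmlAdapter that pins the marshalled value to exactly two fraction digits so it satisfies the pattern:

import java.math.BigDecimal;
import java.math.RoundingMode;
import javax.xml.bind.annotation.adapters.XmlAdapter;

// Attach to the BigDecimal field with @XmlJavaTypeAdapter(TwoFractionDigitsAdapter.class).
public class TwoFractionDigitsAdapter extends XmlAdapter<String, BigDecimal> {

    @Override
    public BigDecimal unmarshal(String value) {
        return new BigDecimal(value);
    }

    @Override
    public String marshal(BigDecimal value) {
        // "1" becomes "1.00" and "5.1" becomes "5.10"; values with more than
        // two fraction digits are rounded to fit fractionDigits="2".
        return value.setScale(2, RoundingMode.HALF_UP).toPlainString();
    }
}

It works, but it is exactly the kind of per-field, hand-written plumbing the answer above warns about.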
It actually depends on what that particular decimal value represents.
XML is one of the most popular formats for transporting and storing data, and it can carry all sorts of values. I'll take two examples here:
The data is currency: here my advice is to force [0-9]*[.][0-9]{2}; to support this, our client data restoration software is designed in such a way as to pad with 0s.
The data is the pH value of a chemical: here one digit after the decimal point is mandatory, e.g. [0-1][0-9][.][0-9].
So it all depends on the object we are referring to. Unless it's really necessary, it wouldn't be fair to force the pattern :)

Rounding Standards - Financial Calculations

I am curious about the existence of any "rounding" standards when it comes to the calculation of financial data. My initial thought is to perform rounding only when the data is being presented to the user (presentation layer).
If "rounded" data is then used for further calculations, should we use the "rounded" figure or the "raw" figure? Does anyone have any advice?
Please note that I am aware of different rounding methods, i.e. Bankers Rounding etc.
The first and most important rule: use a decimal data type, never ever binary floating-point types.
When exactly rounding should be performed can be mandated by regulations, such as the conversion between the Euro and national currencies it replaced.
If there are no such rules, I'd do all calculations with high precision, and round only for presentation, i.e. not use rounded values for further calculations. This should yield the best overall precision.
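A small sketch of that advice in Java (the figures are made up): rounding each intermediate result to cents versus keeping full precision and rounding once at the end can easily drift apart by several cents.

import java.math.BigDecimal;
import java.math.RoundingMode;

public class RoundLate {
    public static void main(String[] args) {
        BigDecimal price = new BigDecimal("0.15");
        BigDecimal taxRate = new BigDecimal("0.10");
        int quantity = 10;

        // Rounding every intermediate result to cents:
        BigDecimal roundedEarly = BigDecimal.ZERO;
        for (int i = 0; i < quantity; i++) {
            BigDecimal lineTax = price.multiply(taxRate).setScale(2, RoundingMode.HALF_UP); // 0.02
            roundedEarly = roundedEarly.add(lineTax);
        }

        // Calculating at full precision and rounding only once, at the end:
        BigDecimal roundedLate = price.multiply(BigDecimal.valueOf(quantity))
                                      .multiply(taxRate)
                                      .setScale(2, RoundingMode.HALF_UP);

        System.out.println(roundedEarly); // 0.20
        System.out.println(roundedLate);  // 0.15
    }
}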
I just asked a greybeard mainframe programmer at the financial software company I work for, and he said there is no well-known standard and it's up to programmer practice.
While statisticians have been aware of the rounding issue since at least 1906, it's difficult to find a financial standard endorsing it.
According to this site, the "European Commission report The Introduction of the Euro and the Rounding of Currency Amounts suggests that there had previously been no standard approach to rounding in banking."
In general, use a symmetric rounding mode no matter what base you are working in (base-2 or base-10).
This will avoid systematic bias during calculations.
Such a mode is Round-Half-To-Even, otherwise known as "banker's rounding".
Use language tools that allow you to specify the numeric context explicitly, including the rounding and truncation modes. For example, Python's decimal module. The implicit assumptions made by the C library might not be appropriate for your computations.
http://en.wikipedia.org/wiki/Rounding#Rounding_to_integer
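The Java analogue of that advice would be BigDecimal with an explicit MathContext, where precision and rounding mode are stated in the code rather than assumed; a brief sketch:

import java.math.BigDecimal;
import java.math.MathContext;
import java.math.RoundingMode;

public class ExplicitContext {
    public static void main(String[] args) {
        // 34 significant digits with banker's rounding, spelled out explicitly.
        MathContext ctx = new MathContext(34, RoundingMode.HALF_EVEN);

        BigDecimal principal = new BigDecimal("1000.00");
        BigDecimal rate = new BigDecimal("0.0375");

        BigDecimal interest = principal.multiply(rate, ctx);
        System.out.println(interest.setScale(2, RoundingMode.HALF_EVEN)); // 37.50
    }
}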
It's frustrating that there aren't clear standards on this, both to guide the programmer, and as a defense in court. Just doing "regular" rounding toward nearest for payroll can lead to underpayment by a few pennies on a paycheck here and there, which is something labor lawyers eat up like crack.
Though a base pay rate may well only be specified in two decimal places ("You're hired at $22.71/hour"), things like blended overtime (determined by averaging multiple pay rates in a period) end up with an effective hourly rate of $23.37183475/hr.
How do you pay overtime on that?
15 hours x 23.37183475 x 1.5 = $525.87 rounded from $525.86628187
15 hours x 23.37 x 1.5 = $525.82
WHY DID YOU STEAL FIVE CENTS FROM MY CLIENT? Sadly, I'm not joking about this.
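For the record, a minimal BigDecimal sketch of the two computations above, rounding to cents with banker's rounding:

import java.math.BigDecimal;
import java.math.RoundingMode;

public class OvertimeRounding {
    public static void main(String[] args) {
        BigDecimal hours = new BigDecimal("15");
        BigDecimal overtime = new BigDecimal("1.5");

        BigDecimal fullRate = new BigDecimal("23.37183475"); // blended rate, full precision
        BigDecimal stubRate = new BigDecimal("23.37");       // the rate shown on the pay stub

        BigDecimal fromFull = hours.multiply(fullRate).multiply(overtime); // 525.866281875
        BigDecimal fromStub = hours.multiply(stubRate).multiply(overtime); // 525.825

        System.out.println(fromFull.setScale(2, RoundingMode.HALF_EVEN)); // 525.87
        System.out.println(fromStub.setScale(2, RoundingMode.HALF_EVEN)); // 525.82, five cents less
    }
}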
This gets even more uncomfortable when you calculate at the full precision value but display a truncated version: you do the first calculation above, but only display $23.37 for the rate on the pay stub.
Now the pay stub calculations don't tie out to the penny, and now you have to explain it, but even if it's in the employee's favor, it can be enough for a labor lawyer to smell blood in the water and start looking for other stuff.
One approach is to always round in favor of the employee, not in the natural direction, so there cannot ever be an accusation of systematic wage theft.
I've not seen "the one standard to rule them all"; there are any number of rounding rules (as you have referenced), and they seem to come into play based on industry, customer, and currency code (http://en.wikipedia.org/wiki/ISO_4217). Since not everyone uses 2 places after the decimal, the problem becomes even more complicated. At the end of the day, your customer needs to specify the rules they want to implement...
Consider using scaled integers.
In other words, store whole numbers of pennies instead of fractional numbers of dollars.
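A hedged sketch of the scaled-integer approach in Java: amounts held as long values in cents and converted to dollars only for display (watch for overflow on very large sums, and handle negative amounts carefully when formatting):

public class ScaledIntegers {
    public static void main(String[] args) {
        long priceCents = 1999;   // $19.99 stored as 1999 cents
        long quantity = 3;

        long totalCents = priceCents * quantity;  // exact integer arithmetic, no rounding
        System.out.printf("$%d.%02d%n", totalCents / 100, totalCents % 100); // $59.97
    }
}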
