Pentaho Data Integration rounds off decimal values after 16 digits

I am trying to load a numeric value, and I see that any number longer than 16 digits gets rounded off by PDI.
Whether I use the "Select values" step, the "Modified Javascript" step, or even the "Generate rows" step, the value gets rounded off.
For example:
Input value - 346003617942512178
Output value - 346003617942512190
As you can see, the last 2 digits got rounded off.
Is there any setting in Pentaho that needs to be changed so that this rounding doesn't happen, or that at least raises the 16-digit limit? I would like the data to load as is, without any rounding, while still being recognized as a Number and not a String.
Any help on this will be appreciated.

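Take a look at this transformation, BigNumber, where the input value is the same but the output differs depending on the data type. PDI's Number type is a double-precision floating-point value, so it cannot hold all 18 digits; the BigNumber type keeps them.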
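The effect of the data type can be reproduced outside PDI. The sketch below uses Python rather than PDI itself, and assumes that the Number type behaves like an IEEE 754 double and that BigNumber behaves like an arbitrary-precision decimal; the variable names are purely illustrative.

```python
from decimal import Decimal

raw = "346003617942512178"

# A double has a 53-bit mantissa, so at this magnitude only
# multiples of 64 can be represented exactly.
as_double = float(raw)

# An arbitrary-precision decimal keeps every digit of the input.
as_decimal = Decimal(raw)

print(int(as_double))   # 346003617942512192
print(as_decimal)       # 346003617942512178
```

The value 346003617942512190 reported in the question is consistent with that same double printed to 17 significant digits.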

Related

Controlling Excel time format input/output

Background: I have been officiating our local jogging events for about ten years now. I am responsible for handling the data of the participants (name, sporting club, bib number) split into their categories (age bracket+gender, distance). The main task is collecting their times, and processing that data (sorting the runners within their category etc). I can handle this with Excel mostly fine.
Problem: What is the ideal time format for entering the race times of the participants? The times are either in the format mm:ss or (for slower runners and/or longer distances) h:mm:ss. Excel doesn't seem to have a built-in format where the hours field is optional. For optimizing my workflow ideally I would like to have a cell format such that the input 47:12 is to be interpreted as 47 minutes and 12 seconds, and the input 1:09:38 is to be interpreted as 1 hour 9 minutes and 38 seconds. However, Excel, with the best fitting cell format that I found, will insist that the input 47:12 means 47 hours and 12 minutes. For times exceeding 1 hour I would input 1:03:00 if I meant that the seconds field is to be left with value zero.
How to make Excel realize that when the format can handle up to three numbers as inputs, it would, when given only two numbers, move them towards the end?
Thinking: I "can" key in 47 minutes and 12 seconds as 0:47:12 all right. But because most of the times are under 1 hour, that is partly wasted effort. Also, using such a format the data is displayed on the screen together with that superfluous 0:. What's worse (IIRC) those leading zeros also appear in the printed versions, which is strange (insulting even) in a shorter distance for junior participants.
My hack: I enter the times as general numbers in the mm,ss format (in these parts a comma serves as a decimal separator). Excel can sort those as numbers just fine. I then duplicate the data of that sorted column to another "printable" version (formatted as text), where the data is just copied, but I correct the times exceeding 60 minutes by hand. This works just fine as long as I'm not in a hurry (our event is not exactly the Boston Marathon; say, fewer than 200 participants), and remember to hide the column that is not supposed to be printed. This is kludgy, and there have been accidents when other officials have been rushing me to get the results printed.
I managed to create a format where the hour-field is optional. It works with a conditional format. First you format your cells as standard, so you get the times as comma-values. After that you create a conditional format for these cells, which has two rules:
if cellvalue > 0.04166667 format hh:mm:ss
if cellvalue < 0.04166666 format mm:ss
Result:
47:12
01:09:38
01:00:00
So you get what you really want and you can use the original values for sorting and so on.
EDIT:
For the input you need four additional columns. You enter the times as you want, e.g. 47:12 and 1:09:38. In the next three columns you split these values into hours, minutes and seconds, whereby the interpretation limit is 3 hours (03:00), which is 0.125: any value above it is treated as an mm:ss entry that Excel misread as hh:mm.
So, these are the formulas for the split columns (your input is in B1):
Hours: =IF(B1>0.125,0,HOUR(B1))
Minutes: =IF(B1>0.125,INT(B1)*24+HOUR(B1),MINUTE(B1))
Seconds: =IF(B1>0.125,MINUTE(B1),SECOND(B1))
And finally, you put all values together in the fourth column:
=TIME(C1,D1,E1)
and use the conditional format above.
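To see why the split formulas recover the intended time, it can help to replay them outside Excel. The following is a rough Python sketch, not Excel itself: `excel_serial` and `reinterpret` are hypothetical helper names, and the sketch assumes that Excel stores a time as a fraction of a day and reads a two-field entry such as 47:12 as hours and minutes.

```python
SECONDS_PER_DAY = 86400

def excel_serial(hours, minutes, seconds=0):
    """Day fraction Excel would store for an entry read as h:mm:ss."""
    return (hours * 3600 + minutes * 60 + seconds) / SECONDS_PER_DAY

def reinterpret(serial):
    """Mirror the three split formulas; returns (hours, minutes, seconds)."""
    total = round(serial * SECONDS_PER_DAY)
    days, rest = divmod(total, SECONDS_PER_DAY)   # INT(B1)
    hour, rest = divmod(rest, 3600)               # HOUR(B1)
    minute, second = divmod(rest, 60)             # MINUTE(B1), SECOND(B1)
    if serial > 0.125:
        # Above 3 hours: the entry was really mm:ss, so shift the fields right.
        return (0, days * 24 + hour, minute)
    return (hour, minute, second)

print(reinterpret(excel_serial(47, 12)))     # (0, 47, 12)
print(reinterpret(excel_serial(1, 9, 38)))   # (1, 9, 38)
```

The first call shows 47:12, which Excel reads as 47 hours 12 minutes, being mapped back to 47 minutes 12 seconds; the second shows a genuine h:mm:ss entry passing through unchanged.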
If you will be entering your data as
`mmm,ss`
where the comma is the decimal point, then you can convert it to "Excel Time" with the simple formula:
=DOLLARDE(A1,60)/1440
Format the result as you wish.
If you want everything displayed as h:mm:ss then use that as your custom format (Format > Cells > Number > Custom Type:...)
If you want h to be displayed only with values of 60 minutes or greater, then use
[<0.0416666666666667]mm:ss;h:mm:ss
for your cell's custom format.
Beware that seconds must be entered with two digits always. In other words
6,2 will translate to 6 min 20 sec.
6,02 will translate to 6 min 2 sec
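The arithmetic behind =DOLLARDE(A1,60)/1440 can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names `mmss_to_seconds` and `to_day_fraction` are made up here, and the entry is parsed from text with `Decimal` so that binary floating-point noise does not disturb the two-digit seconds part.

```python
from decimal import Decimal

def mmss_to_seconds(entry):
    """Convert an 'mmm,ss' entry (comma as decimal separator) to whole seconds."""
    value = Decimal(entry.replace(",", "."))
    minutes = int(value)                      # part before the separator
    seconds = int((value - minutes) * 100)    # part after it, read as two digits
    return minutes * 60 + seconds

def to_day_fraction(entry):
    """Excel time is a fraction of a day (86400 seconds, i.e. 1440 minutes)."""
    return mmss_to_seconds(entry) / 86400

print(mmss_to_seconds("6,2"))     # 380, i.e. 6 min 20 sec
print(mmss_to_seconds("6,02"))    # 362, i.e. 6 min 2 sec
print(mmss_to_seconds("47,12"))   # 2832
```

An entry of 60,00 comes out as 1/24 of a day, which is the 0.0416666666666667 threshold used in the custom format above.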
I really like IQV's answer above, but as pointed out in the comment section, the leading zero will be required for the data entry side. If for whatever reason this is not acceptable you can use the following ugly formula to convert your time entered in your usual method of mm,ss to hh:mm:ss with the hh: being displayed as required. Unfortunately it converts the whole thing to text which means you can no longer perform math operations on it.
=IF(FIND(".",MOD(D2,60)&".")=2,"0","")&MOD(D2,60)
and since you use , as your decimal separator the formula would become:
=IF(FIND(",",MOD(D2,60)&",")=2,"0","")&MOD(D2,60)
If you use ; as your list separator then your formula becomes:
=IF(FIND(",";MOD(D2;60)&",")=2;"0";"")&MOD(D2;60)
There are probably some cleaner formulas, but that will get you started. Just replace D2 with the location where your time is stored.
Again, I still prefer IQV's answer, as you can do much more with the time information when it's stored as a number and not as text.
Option 2
Let's say you change your data storage method to hhmm,ss in cell D6. You could rip apart the information and reassemble it in a display-friendly version as follows.
=IF(FIND(".",D6)<=3,LEFT(D6,2)&":"&RIGHT(D6,LEN(D6)-FIND(".",D6)),LEFT(D6,FIND(".",D6)-3)&":"&MID(D6,FIND(".",D6)-2,2)&":"&RIGHT(D6,LEN(D6)-FIND(".",D6)))
You will need to substitute your list separator for the , and then substitute a comma for the decimal point.

Numbers stored as text - when converted to numbers, digits disappear

I have a column of data with numbers stored in text.
The numbers look like this: 735999114002665788
If I select any cell in this column and refer to it with the function =value(), the number shows up as 735999114002665000.
As you can see, the last three digits are 0. This happens all the time with numbers this long, but NOT with numbers containing fewer digits.
Am I trying to convert a number that's too large or what's up? Please help! I've tried every form of text-to-number method with identical results :(
Excel's number precision is 15 digits, which is why you're losing the last three digits when converting your 18-character string.
https://support.office.com/en-us/article/excel-specifications-and-limits-1672b34d-7043-467e-8e27-269d656771c3#ID0EBABAAA=Excel_2016-2013
Excel only allows a maximum of 15 digits of precision for each number in a cell. The reason why this number:
735999114002665788
becomes this:
735999114002665000
is because Excel retains only the 15 most significant digits of the number. This means that the ones, tens, and hundreds digits are being tossed out.
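The 15-digit cut-off can be imitated in Python with a decimal context. This is a sketch of the effect only, not of Excel's internals: it assumes that the digits past the 15th are dropped rather than rounded, which matches the 735999114002665000 result reported above, and the names `fifteen_digits`, `text` and `kept` are illustrative.

```python
from decimal import Context, ROUND_DOWN

# Assumption: digits beyond the 15th are discarded, not rounded.
fifteen_digits = Context(prec=15, rounding=ROUND_DOWN)

text = "735999114002665788"
kept = fifteen_digits.create_decimal(text)   # keeps 15 significant digits

print(int(kept))   # 735999114002665000
```

If the full 18 digits matter (for an ID, for example), the value has to stay stored as text, since no numeric cell can hold it exactly.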
By the way, this question has been asked before on SuperUser, and you can read about it here:
https://superuser.com/questions/437764/why-is-excel-truncating-my-16-digit-numbers

increasing fraction digits (decimal points) in anylogic

By "result" I mean the values shown in this picture, which relate to variables and statistics. Every time I run the model, the results are shown to only 3 decimal places, and I want to increase that to 6 decimal places. I hope I have stated clearly what I mean this time.
AnyLogic truncates the visible value to 3 decimal places. If you want more, you should read your data in some other way:
export it to a text file
output it to the console with: traceln(varname)
use the text-object to display the value

Excel keeps rounding numbers into full numbers (e.g. shows 5.00) instead of displaying decimals

I have been trying to fix this for over an hour now, trying every possible answer I have read on every forum and site when Googling.
I have a series of numbers that I want to multiply by 0.15 (15 cents). However, instead of showing the actual result (33 * 0.15 = 4.95) it shows a full number.
A full number with decimals (that is, 5.00) but a full number. As you see, it is not an issue of increasing or decreasing decimals, format, etc.
Here is a screenshot
Thanks!
On the "Home" tab, click "Decrease Decimal" or "Increase Decimal" (to the right of what appears on your tab bar as "Numero").
If that doesn't work, look at the actual content of the cell: is there a formula in it rather than a number? For example, something like:
=ROUNDDOWN(A1,-2)
=ROUNDUP(A1,-2)
=ROUND(A1,-2)
They also force rounding of numbers.
And finally, the only other option I can think of is looking at the cell format, but from what it appears you already have it on "Money"...
Just in case, double check that it's not a finance format that rounds off figures, as that happens too!

Why does excel AVERAGE change when changing the number format of cells?

I've got an Excel sheet which is exhibiting strange behaviour. I have 2 values, followed by an average of those 2 values - simple enough, right?
However, if I change the number format of the top cell from 2 decimal places to 30, I get a different result:
Can anyone explain this? When a cell is formatted to 2 decimal places, does that mean all formulae using this cell are rounding the value to 2 decimal places also?
Check your Excel options (Alt+F,T) for the Advanced ► When calculating this workbook ► Set precision as displayed option. When this is checked, calculation is automatically rounded off to the displayed number of decimals rather than the internal 15 digit floating point precision. It also permanently truncates the raw value to the displayed precision so I am unclear on how you are bouncing between the two average values.
The actual average of 1.6786427146 and 1.73 is 1.7043213573 which is 1.70 when only two decimals are displayed. It would only be through Precision as displayed that 1.6786427146 would actually be converted to 1.68 making the average 1.71.
Turn the option off and the underlying raw value will be stored to a 15 digit floating point precision. The same goes for all internal formula calculations.
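The two averages can be reproduced with a short Python sketch. It is only an approximation of what Excel does: the helper `displayed` is a made-up name, `Decimal` is used to sidestep binary floating-point noise, and the sketch assumes that a two-decimal cell format rounds halves away from zero.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")

def displayed(value):
    """Round to two decimals, as a 0.00 cell format would show the value."""
    return value.quantize(CENT, rounding=ROUND_HALF_UP)

a = Decimal("1.6786427146")
b = Decimal("1.73")

# Option off: the raw values are averaged, and only the result is rounded for display.
full_precision = (a + b) / 2
# Option on: 1.6786427146 is first replaced by 1.68, then averaged.
as_displayed = (displayed(a) + displayed(b)) / 2

print(full_precision, displayed(full_precision))   # 1.7043213573 1.70
print(as_displayed, displayed(as_displayed))       # 1.705 1.71
```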
