How to add trailing zeros to a pandas dataframe column? - python-3.x

I have a pandas dataframe column that I would like to be formatted to two decimal places.
Such that:
10.1
Appears as:
10.10
How can I do that? I have already tried rounding.

This can be accomplished by mapping a string format object to the column of floats:
df.colName.map('{:.2f}'.format)
(Credit to exp1orer)

You can use:
pd.options.display.float_format = '{:,.2f}'.format
Note that this only changes how floats are displayed: every float in every dataframe is shown with two decimals (and a thousands separator), while the stored values stay the same.
To go back to normal:
pd.reset_option('display.float_format')
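As a rough illustration of the difference between the two approaches above, here is a minimal sketch on a small made-up dataframe (the data and the `as_text` name are assumptions for the example, not from the question). Mapping the format gives back strings, while the display option leaves the stored floats untouched.

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"colName": [10.1, 2.0, 3.14159]})

# Approach 1: map a format function over the column; the result holds strings.
as_text = df["colName"].map("{:.2f}".format)
print(as_text.tolist())

# Approach 2: change only how floats are displayed; df still stores floats.
pd.options.display.float_format = "{:,.2f}".format
print(df)
pd.reset_option("display.float_format")
```

If the column is needed for further arithmetic, the second approach is probably the safer choice, since the first one replaces numbers with text.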

From pyformat.info
Padding numbers
For floating points the padding value represents the length of the complete output. In the example below we want our output to have at least 6 characters with 2 after the decimal point.
'{:06.2f}'.format(3.141592653589793)
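The 06 sets the minimum total width of the output to 6 characters, counting the decimal point, and the leading 0 means shorter results are padded with zeros. Longer numbers are not truncated. The .2 indicates you want 2 places after the decimal point. The f indicates you want a float output.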
Output
003.14
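A short sketch of how the width behaves, using the same format spec on values of different sizes (the variable names are only for this example):

```python
value = 3.141592653589793

# Width 6 with zero padding: two leading zeros are added.
padded = "{:06.2f}".format(value)
print(padded)  # 003.14

# The width is only a minimum: a longer number is printed in full.
wide = "{:06.2f}".format(12345.678)
print(wide)  # 12345.68

# Without the 0 flag the padding uses spaces instead.
spaced = "{:6.2f}".format(value)
print(repr(spaced))  # '  3.14'
```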

If you are using Python 3.6 or later you can use f strings. Check out this other answer: https://stackoverflow.com/a/45310389/12229158
>>> a = 10.1234
>>> f'{a:.2f}'
'10.12'

Related

Python digit formatting

I was taking some PCAP certification practice tests when I stumbled upon some sort of digit formatting. I don't really understand what it is or how it works.
Choose the correct statements from the following.
The output of print("-%07d"%555.55) is the same as the output of print("%7d"%555.55)
The output of print("-%07d"%555.55) is the same as the output of print("-%07d"%555)
(Correct)
The output of print("-%02d"%555.55) is the same as the output of print("-%2d"%555.55)
(Correct)
The output of print("-%02d"%555.55) is the same as the output of print("-%02d"%555)
(Correct)
This is a simplified understanding of it.
The modulo or % character is the formatting character.
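The d means the value is formatted as an integer, so a floating point argument is truncated to its whole part.
The - sits before the %, so it is just a literal character printed in front of the number, which makes the output look negative.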
The integer before the d indicates the minimum number of characters desired in output.
If there is a 0 in front of the integer it means that it should be padded with leading zeros in order to reach the desired number of characters.
If the d were an f, floating point values would be kept. The 0 flag would still pad with leading zeros, while the precision (for example .2) would add trailing zeros after the decimal point.
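The rules above can be checked directly. This is a small sketch; the variable names are only for illustration, and the last line is an extra example with f that is not part of the quiz.

```python
# The "-" before the % is literal text; %07d pads the integer part to 7 digits.
a = "-%07d" % 555.55   # '-0000555'
b = "%7d" % 555.55     # '    555' (space padded, no literal "-")
c = "-%07d" % 555      # '-0000555' (same as a: the float is truncated)

# A width of 2 is smaller than the 3 digits needed, so no padding happens.
d = "-%02d" % 555.55   # '-555'
e = "-%2d" % 555.55    # '-555'

# With f, the 0 flag pads on the left and the precision pads on the right.
f = "%010.2f" % 555.5  # '0000555.50'

for text in (a, b, c, d, e, f):
    print(repr(text))
```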

Formatting number with a bunch of zeroes

Somewhat simple problem:
I need to turn a column A, which contains numbers with up to 1 decimal (20, 142, 2.5, etc.) to a string with a specific format, namely 8 whole digits and 6 decimal digits but without the actual decimal period, like so:
1 = 00000001000000
13 = 00000013000000
125 = 00000125000000
46.5 = 00000046500000
For what it's worth, the input data from column A will never be more than 3 total digits (0.5 to 999) and the decimal will always be either none or .5.
I also need for Excel to leave the zeroes alone instead of auto-formatting as a number and removing the ones at the beginning of the string.
As a makeshift solution, I've been using =CONCATENATE("'",TEXT(A1,"00000000.000000")), then copying the returning value and "pasting as value" where I actually need it.
It works fine, but I was wondering if there was a more direct solution where I don't have to manually intervene.
Thanks in advance!
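=TEXT(A1*1000000,"00000000000000") I think that's what you mean. Multiplying by 1,000,000 moves the six decimal places into the whole number, and the 14 zeros in the format (8 + 6 digits) keep the leading zeros in the resulting text.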
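For readers who want to check the arithmetic outside Excel, the same idea can be sketched in Python. This is only an illustration of the logic, not an Excel solution, and `to_fixed_digits` is a hypothetical helper name.

```python
def to_fixed_digits(value: float) -> str:
    # Shift six decimal places into the integer part, then zero-pad to 14 digits.
    return f"{round(value * 1_000_000):014d}"

# The sample values from the question.
examples = {v: to_fixed_digits(v) for v in (1, 13, 125, 46.5)}
for source, text in examples.items():
    print(source, "=", text)
```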

extraneous digits formatting within dataframe

I am running into a formatting / precision issue which I'm hoping to control
I obtain a list of numbers such as:
x = [0.009947, 0.009447, 0.008947]
The finished product I'm after is a DataFrame with a column whose value is this list but multiplied by 100 with 3 decimal places, e.g.
[0.995, 0.945, 0.895]
I proceed as follows:
x = 100*np.around([0.009947, 0.009447, 0.008947],5)
this displays as
array([0.995, 0.945, 0.895])
When I build the DataFrame:
pd.DataFrame({'test':[x]})
I get for the value in the 'test' column:
[0.9950000000000001, 0.9450000000000001, 0.895]
This does not happen in other examples and I'm not sure how to control the behavior. Appreciate any suggestions
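This is a general issue with floating-point arithmetic on computers: most decimal fractions have no exact binary representation, so the result of the multiplication can be off in the last digit. The shorter form you see for the array is just how it is printed. See the discussion of floating point in the docs.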
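One possible way to get clean values is to round after scaling rather than before, or to format the numbers only when they are displayed. The sketch below uses plain Python lists instead of NumPy to stay self-contained; that substitution and the variable names are assumptions for the example.

```python
x = [0.009947, 0.009447, 0.008947]

# Rounding first and multiplying afterwards can reintroduce binary noise.
scaled = [100 * round(v, 5) for v in x]

# Rounding again after the multiplication gives the nearest float to each target.
cleaned = [round(v, 3) for v in scaled]
print(cleaned)

# Alternatively, keep the floats as they are and format only for display.
labels = [f"{v:.3f}" for v in scaled]
print(labels)
```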

Restrict floats to allotted padding while parsing as string

I would like to print a series of floats with varying amounts of numbers to the left of the decimal place. I would like these numbers to exactly fill a padding with blank spaces, digits, and a decimal point.
Paraphrasing the data and code I have now
floats = [321.1234561, 21.1234561, 1.1234561, 0.123456, 0.02345, 0.0034, 0.0004567]
for number in floats:
    print('{:>8.6f}'.format(number))
This outputs
321.123456
21.123456
1.123456
0.123456
0.02345
0.0034
0.000457
I am looking for a way to print the following in a for loop assuming I don't know the amount of digits that will be to the left of the decimal place and the number of digits to the left never exceeds the padding which is 8 for this example.
321.1234
21.12345
1.123456
0.123456
0.02345
0.0034
0.000457
Similar questions have been asked about printing floating points with a certain width, but the width they were talking about appeared to be the precision rather than the total number of characters used to print the number.
Edit:
I have added a number to the end of the list for the following reason. The use of the specifier 'g' with 7 significant figures was recommended by attdona. This prevents the padding from being exceeded for numbers greater than or equal to 1 but not for numbers less than 1 with precision greater than 6. Using {:>8.7g} instead gives
321.1234
21.12345
1.123456
0.123456
0.02345
0.0034
0.0004567
Where the only one that exceeds the padding is the newly added one.
Use the General format type specifier g:
'{:>8.7g}'.format(number)
reference: https://docs.python.org/3/library/string.html#format-specification-mini-language
Update: For small numbers this format fails to align correctly. In this case you may adopt a mixed approach, but keep in mind that very small numbers will round to zero
for number in floats:
    fstr = '{:>8.7g}'.format(number)
    if len(fstr) > 8:
        fstr = '{:>8.6f}'.format(number)
    print(fstr)
for i in floats:
    print('{:>8}'.format(f'{i:{8}.{8-len(str(int(i)))-1}f}'.rstrip('0')))
321.1235
21.12346
1.123456
0.123456
0.02345
0.0034
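The idea behind the last snippet (derive the precision from the number of integer digits) can be wrapped in a small function. This is a minimal sketch, assuming non-negative inputs whose integer part fits in the width; `fit_width` is a hypothetical name. A value that rounds up to an extra integer digit, such as 9.9999999, would still overflow by one character.

```python
def fit_width(number: float, width: int = 8) -> str:
    """Right-align a non-negative float in exactly `width` characters."""
    int_digits = len(str(int(number)))
    # Leave room for the integer digits and the decimal point.
    precision = max(width - int_digits - 1, 0)
    text = f"{number:.{precision}f}"
    if "." in text:
        # Drop trailing zeros, and the point itself if nothing follows it.
        text = text.rstrip("0").rstrip(".")
    return text.rjust(width)

floats = [321.1234561, 21.1234561, 1.1234561, 0.123456, 0.02345, 0.0034, 0.0004567]
lines = [fit_width(number) for number in floats]
for line in lines:
    print(line)
```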

Converting padded string to fixed decimal in Alteryx

I have a large text file with no headers with fields delimited by a fixed width. All numeric fields are padded with zeros. I want to import this into Alteryx using field settings from a flat file.
Some of my fields should have the format Fixed Decimal. For example, the "Regular Cost" column is a fixed decimal 9.04: 5 digits before the decimal point and 4 after. Input example is "000026300". Desired output is 2.63.
I can't figure out the Length and Scale requirements for this to work.
Length = 9, Scale = 4 gives the error
Regular Cost: "000023600.0000" was too long to fit in this FixedDecimal.
Apparently it doesn't like the missing decimal point. If you read the file as a string, then add the decimal to the correct location in the string, e.g. read it in and force the field length to 10, then use the formula...
Left([Field_1],5) + "." + Left(Right([Field_1],4),3)
... it will look as expected. Then you can map it to a Double or a FixedDecimal 10.4
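The string manipulation itself is simple to illustrate outside Alteryx. The Python sketch below is only an analogy for the formula above, with `parse_fixed` as a hypothetical helper; unlike the formula, it keeps all four decimal digits rather than three.

```python
from decimal import Decimal

def parse_fixed(raw: str, scale: int = 4) -> Decimal:
    # The last `scale` characters are the fractional digits; insert the point before them.
    return Decimal(raw[:-scale] + "." + raw[-scale:])

cost = parse_fixed("000026300")
print(cost)  # 2.6300
```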
