Column not lining up - python-3.x

I am trying to print a column-style table for converting Celsius to Fahrenheit.
My question: when I run it, the numbers line up nicely. However, the c and f headers are left-aligned and do not sit above the numbers. Why is this?
Here is my code:
def convert(celc):
    fahr = celc * 1.8 + 32
    return fahr

def table():
    format_str = '{0:10} {1:10}'
    c = 'c'
    f = 'f'
    cfhead = format_str.format(c, f)
    print(cfhead)
    for graden in range(-40,30,10):
        result = format_str.format(convert(graden), graden)
        print(result)

table()

Use > in your format to always align to the right:
format_str = '{0:>10} {1:>10}'
The default is for strings to be left-aligned, numbers to be right-aligned.
From the Format Specification Mini-Language documentation:
'<'
Forces the field to be left-aligned within the available space (this is the default for most objects).
'>'
Forces the field to be right-aligned within the available space (this is the default for numbers).
With an explicit alignment formatter you get your desired output:
>>> table()
         c          f
     -40.0        -40
     -22.0        -30
      -4.0        -20
      14.0        -10
      32.0          0
      50.0         10
      68.0         20
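As a minimal sketch of the same fix, the rows can be built as strings so the alignment is easy to inspect. The name build_table and its parameters are made up for this illustration, and unlike the question's code it puts Celsius in the first column so the values match the c and f headers:

```python
def convert(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 1.8 + 32


def build_table(start=-40, stop=30, step=10):
    """Return the table as a list of lines, every field right-aligned to width 10."""
    # '>' forces right alignment for strings as well as numbers
    row = "{0:>10} {1:>10}"
    lines = [row.format("c", "f")]
    for celsius in range(start, stop, step):
        # Assumption: Celsius goes first so the values sit under the matching header
        lines.append(row.format(celsius, convert(celsius)))
    return lines


lines = build_table()
print("\n".join(lines))
```

Because every field uses the same width and the same alignment, the header and the numbers end in the same column.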

Related

How do I decode the text of a PDF file using Python

I have been trying to decode a PDF file using Python, and the data looks like this:
BT
/F2 8.8 Tf
1 0 0 1 36.85 738.3 Tm
0 g
0 G
[(A)31(c)-44(c)-44(o)-79(u)11(n)-79(t)5( )] TJ
ET
How do I make sense of this?
What type of data is [(A)31(c)-44(c)-44(o)-79(u)11(n)-79(t)5( )]?
BT /F2 8.8 Tf 1 0 0 1 36.85 738.3 Tm 0 g 0 G [(A)31(c)-44(c)-44(o)-79(u)11(n)-79(t)5( )] TJ ET
This is normal plain ASCII text: the content stream has already been decoded from binary to text.
Your question is
Q) How do I make sense of this??? [(A)31(c)-44(c)-44(o)-79(u)11(n)-79(t)5( )]
A) Always look at the context
BT = B(egin) T(ext)
/F2 = use F(ont) 2 for encoding (whatever that is)
8.8 = font size (if unmodified those could be 8.8 full unscaled DTP points,
but beware, point size does not necessarily correspond to any measurement
of the size of the letters on the printed page.)
Tf = T(ext) f(ont), select the font and size just named
1 0 0 1 36.85 738.3 Tm = T(ext) m(atrix), mainly placement: no scaling or rotation here, start at x = 36.85, y = 738.3
0 g and 0 G = set the fill and stroke colours to gray level 0 (black)
[ = start an array of strings and spacing adjustments
(A) = literal(ly) "A"
31 = kern the next character; a positive number moves it left (closer), here by 31 units, where a unit is 1/1000 of a text space unit, so it scales with the font size
(c) = the next glyph literal that needs to be painted on screen or paper
-44 = a negative number pushes the two (c) apart by 44 units
(c)
...
] TJ ET = close the array, show the text with those adjustments (TJ), E(nd) T(ext)
So there we have it: somewhere on the page (first or last word or anywhere in between, but most likely top left, since y is measured up from the bottom of the page by default) there is one continuous, selectable plain text string that sounds like a word in a human language, "Account", with an extra trailing space literal (which is actually unnecessary in a PDF; that and any other "word" will print well enough without one).
Why did I say "sounds like" and not "looks like"? Because those "literal" characters are not necessarily the ones presented; they are codes that select glyphs through the font's encoding.
Here :-) is how they could look if /F2 were set to a different glyph font, such as emojis or other dingbats: A is BC, c is a checkbox, u is underground and t is train. Read aloud the codes still spell "Account", but on the page they are just an account of which graphics to use.
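To tie this back to Python, here is a minimal sketch that pulls the string pieces out of a TJ array and ignores the kerning numbers. The function name tj_array_to_text is made up for this illustration, and it assumes simple literal strings (no escaped or nested parentheses, no hex strings) and a font whose encoding matches ASCII; real PDFs often break both assumptions, which is why a PDF library is normally used instead.

```python
import re


def tj_array_to_text(array_source):
    """Join the literal strings of a TJ array, skipping the spacing adjustments."""
    # Assumption: every string is a plain "(...)" literal without escapes
    # or nested parentheses, so a simple regular expression is enough.
    pieces = re.findall(r"\(([^()\\]*)\)", array_source)
    return "".join(pieces)


sample = "[(A)31(c)-44(c)-44(o)-79(u)11(n)-79(t)5( )] TJ"
text = tj_array_to_text(sample)
print(repr(text))
```

The result is only the sequence of character codes; whether it reads as "Account" on the page still depends on how font F2 maps those codes to glyphs.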

How do I parse data without using the index because some characters are different lengths

I need to parse this data so that each value in the data parsing column is deposited in its own column.
          userid         data_to_parse
0  54f3ad9a29ada  "value":"N;U;A7;W"}]
1  54f69f2de6aec  "value":"N;U;I6;W"}]
2  54f650f004474  "value":"Y;U;A7;W"}]
3  54f52e8872227  "value":"N;U;I1;W"}]
4  54f64d3075b72  "value":"Y;U;A7;W"}]
So for example, the four additional columns for the first entry would have values of “N”, “U”, “A7”, and “W”. I first attempted to split based upon index like so:
parsing_df['value_one'] = parsing_df['data_to_parse'].str[9:10]
parsing_df['value_two'] = parsing_df['data_to_parse'].str[11:12]
parsing_df['value_three'] = parsing_df['data_to_parse'].str[13:15]
parsing_df['value_four'] = parsing_df['data_to_parse'].str[16:17]
This worked really well except that there are a few that are different lengths like 937 and 938.
935  54f45edd13582  "value":"N;U;A7;W"}]  N  U  A7  W
936  54f4d55080113  "value":"N;C;A7;L"}]  N  C  A7  L
937  54f534614d44b   "value":"N;U;U;W"}]  N  U  U;  "
938  54f383ee53069   "value":"N;U;U;W"}]  N  U  U;  "
939  54f40656a4be4  "value":"Y;U;A1;W"}]  Y  U  A1  W
940  54f5d4e063d6a  "value":"N;U;A4;W"}]  N  U  A4  W
Does anyone have a solution that doesn't rely on hard-coded positions?
Thanks for the help!
A relatively simple way to approach the problem:
txt = """54f45edd13582 "value":"N;U;A7;W"}]
54f4d55080113 "value":"N;C;A7;L"}]
54f534614d44b "value":"N;U;U;W"}]
54f383ee53069 "value":"N;U;U;W"}]
54f40656a4be4 "value":"Y;U;A1;W"}]
54f5d4e063d6a "value":"N;U;A4;W"}]
"""
import pandas as pd
txt = txt.replace('}','').replace(']','').replace('"','') #first, clean up the data
#then, collect your data (it may be possible to do it w/ list comprehension, but I prefer this):
rows = []
for l in [t.split(' value:') for t in txt.splitlines()]:
    #depending on your actual data, you may have to split by "\tvalue:" or "\nvalue:" or whatever
    row = l[1].split(';')   #the four values, whatever their lengths
    row.insert(0,l[0])      #put the userid in front
    rows.append(row)
#define your columns
columns = ['userid','value_one','value_two','value_three','value_four']
#finally, create your dataframe:
pd.DataFrame(rows,columns=columns)
Output (pardon the formatting):
          userid value_one value_two value_three value_four
0  54f45edd13582         N         U          A7          W
1  54f4d55080113         N         C          A7          L
2  54f534614d44b         N         U           U          W
3  54f383ee53069         N         U           U          W
4  54f40656a4be4         Y         U          A1          W
5  54f5d4e063d6a         N         U          A4          W
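Use str.split instead of fixed positions.
E.g.
# take the part after the colon, strip the quotes and brackets, then split on ';'
values = parsing_df['data_to_parse'].str.split(':').str[1].str.strip('"}]')
chars = values.str.split(';', expand=True)
parsing_df['value_one'] = chars[0]
...
for i, col in enumerate(['value_one', 'value_two', 'value_three', 'value_four']):
    # use i to get the column and then set it to the matching piece
    parsing_df[col] = chars[i]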
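Both answers come down to splitting on the delimiter rather than on positions. A minimal sketch of that idea in plain Python, so it runs without pandas; the names VALUE_PATTERN and parse_values are made up for this illustration, and it assumes each cell contains exactly one "value":"..." fragment as in the sample rows:

```python
import re

# Assumption: the payload always looks like "value":"<items separated by ;>"
VALUE_PATTERN = re.compile(r'"value":"([^"]*)"')


def parse_values(raw):
    """Return the semicolon-separated items of one cell, whatever their lengths."""
    match = VALUE_PATTERN.search(raw)
    return match.group(1).split(";") if match else []


cells = ['"value":"N;U;A7;W"}]', '"value":"N;U;U;W"}]']
parsed = [parse_values(cell) for cell in cells]
print(parsed)
```

In pandas the same pattern could be passed to Series.str.extract, followed by Series.str.split(';', expand=True) to get one column per item.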

Trying to end up with two decimal points on a float, but keep getting 0.0

I have a float and would like to limit to just two decimals.
I've tried format(), and round(), and still just get 0, or 0.0
x = 8.972990688205408e-05
print ("x: ", x)
print ("x using round():", round(x))
print ("x using format():"+"{:.2f}".format(x))
output:
x: 8.972990688205408e-05
x using round(): 0
x using format():0.00
I'm expecting 8.98, or 8.97 depending on what method used. What am I missing?
Your number is written in scientific notation. As glhr pointed out in the comments, you are trying to round 8.972990688205408e-05 = 0.00008972990688205408. Formatting it as a fixed-point float with .2f therefore only prints the first two zeros after the decimal point, resulting in 0.00. To keep two decimals of the significand you will have to format via 0:.2e:
x = 8.972990688205408e-05
print("{0:.2e}".format(x))
This prints:
8.97e-05
You asked in one of your comments on how to get only the 8.97.
This is the way to do it:
y = x*1e+05
print("{0:.2f}".format(y))
output:
8.97
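Multiplying by 1e+05 only works when the exponent is known in advance. A minimal sketch that works out the exponent instead; the helper name mantissa_2dp is made up for this illustration, and it assumes a finite input. Note that a significand very close to 10 (for example 9.999) would print as 10.00 here:

```python
import math


def mantissa_2dp(x):
    """Format the significand of x (as in scientific notation) with two decimals."""
    if x == 0:
        return "0.00"
    # floor(log10(|x|)) is the power of ten used in scientific notation
    exponent = math.floor(math.log10(abs(x)))
    return "{:.2f}".format(x / 10 ** exponent)


x = 8.972990688205408e-05
result = mantissa_2dp(x)
print(result)
```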
In Python (and many other programming languages), a number followed by the suffix e and an integer is multiplied by 10 raised to the power of that integer.
For example
8.9729e05 = 8.9729 x 10^5 = 897290
8.9729e-05 = 8.9729 x 10^-5 = 0.000089729
8.9729e0 = 8.9729 x 10^0 = 8.9729
8.972990688205408e-05 = 8.972990688205408 x 10^-5 = 0.00008972990688205408
8.9729e # invalid syntax
As pointed out by the other answer, if you want to print the rounded number in exponential form, you need to use the correct Python string format. You have several choices, i.e.
e Floating point exponential format (lowercase, precision defaults to 6 digits)
E Floating point exponential format (uppercase, precision defaults to 6 digits)
g Same as "e" if exponent is less than -4 or not less than precision, "f" otherwise
G Same as "E" if exponent is less than -4 or not less than precision, "F" otherwise
e.g.
x = 8.972990688205408e-05
print('{:e}'.format(x)) # 8.972991e-05
print('{:E}'.format(x)) # 8.972991E-05
print('{:.2e}'.format(x)) # 8.97e-05
(Update)
OP asked for a way to remove the exponent part. Since str.format() or the "%" notation just outputs a string object, splitting that string at the "e" will do the trick.
'{:.2e}'.format(x).split("e") # ['8.97', '-05']
print('{:.2e}'.format(x).split('e')[0]) # 8.97
If I understand correctly, you only want to round the mantissa/significand? If you want to keep x as a float and output a float, just specify the precision when calling round:
x = round(8.972990688205408e-05,7)
Output:
8.97e-05
However, I recommend converting x with the decimal module first, which "provides support for fast correctly-rounded decimal floating point arithmetic" (see this answer):
from decimal import Decimal
x = Decimal('8.972990688205408e-05').quantize(Decimal('1e-7')) # output: 0.0000897
print('%.2E' % x)
Output:
8.97E-05
Or use the short form of the format method, which gives the same output:
print(f"{x:.2E}")
round() returns the closest multiple of 10 to the power minus ndigits,
so there is no chance you will get 8.98 or 8.97. You can check here also.
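A short sketch of what that means for the value from the question, showing how the ndigits argument changes the result:

```python
x = 8.972990688205408e-05

rounded_default = round(x)   # no ndigits: rounds to the nearest integer
rounded_2 = round(x, 2)      # nearest multiple of 10**-2
rounded_7 = round(x, 7)      # nearest multiple of 10**-7

print(rounded_default, rounded_2, rounded_7)
```

Only with ndigits=7 do the digits 8.97 survive, and the result is still the small number 8.97e-05, not 8.97.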

Displaying a rounded matrix

I want to display a vector with a predefined precision. For instance, let us consider the following vector,
v = [1.2346 2.0012 0.1230 0.0001 1.0000]
If I call,
mat2str(v, 1);
the output should be,
1.2 2.0 0.1 0.0 1.0
If I call,
mat2str(v, 2)
the output should be,
1.23 2.00 0.12 0.00 1.00
and so on.
I tried this code, but it resulted in an empty matrix:
function s = mat2str(mat, precision)
    s = sprintf('%.%df ', precision, round(mat, precision));
end
mat2str(similarity, 3)
ans =
Empty string: 1-by-0
How can I display a vector with a predefined number of decimal places?
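The format specifier for sprintf already provides an easy way to do this by using * for the precision field and passing that value as an argument to sprintf. Because sprintf reuses the format for every element, the precision has to be supplied once per value. Your function (which I renamed to mat2prec) can therefore be written as follows:
function s = mat2prec(mat, precision)
    % pair every value with the precision, since each '%.*f' consumes two arguments
    args = [repmat(precision, 1, numel(mat)); mat(:).'];
    s = sprintf('%.*f ', args);
end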
This one works on my MATLAB R2014b:
function s = mat2str(mat, precision)
    printstring=strcat('%',num2str(precision),'.',num2str(precision),'f','\t');
    s = sprintf(printstring, round(mat, precision));
end
function roundedmat2str(X,N)
    NN = num2str(N); % Convert the precision to a string
    for ii = size(X,1):-1:1
        out(ii,:) = sprintf(['%.' NN 'f \t'],X(ii,:)); % create string
    end
    disp(out)
end
X=magic(3)+rand(3);N=2;
roundedmat2str(X,N)
8.69 1.03 6.77
3.32 5.44 7.80
4.95 9.38 2.19
X = X(:).';
roundedmat2str(X,N)
8.69 3.32 4.95 1.03 5.44 9.38 6.77 7.80 2.19
Note that sprintf and fprintf already do implicit rounding when setting the number of decimals.
Also: please don't use existing function names for your own functions or variables. Never call a sum sum, a mean mean, or a mat2str mat2str. Do things like total, average and roundedmat2str. This makes your code portable and also makes sure you don't error out when you're using your own function but expect the default and vice-versa.
I think this is what you wanted to do in the first place:
s = sprintf(sprintf('%%.%df ', precision), mat)
EDIT
In case you want to extend your question to matrices, you could use this slightly more complicated one-liner:
s = sprintf([repmat(sprintf('%%.%df ', precision), 1, size(mat, 2)) '\n'], mat')
One noticeable difference from the previous one-liner is that each row, including the last, ends with a newline.

Finding the second largest number?

This program is meant to find the second largest number. It works for most inputs, but it does not work for the following input.
n=4
a1 = '-7 -7 -7 -7 -6'
a1=[int(arr_temp) for arr_temp in a1.strip().split(' ')]
print(a1)
largest = max(a1)
largest2 = 0
for i in range(0,len(a1)):
    if ((a1[i]>largest2 or a1[i]<0) and largest2<largest and a1[i]!=largest):
        largest2 = a1[i]
print(largest2)
Setting largest2 to 0 just complicates the if statement later. Set it to the smallest value in the array and it becomes clearer.
n=4
a1 = '-7 -7 -7 -7 -6'
a1=[int(arr_temp) for arr_temp in a1.strip().split(' ')]
print(a1)
largest = max(a1)
largest2 = min(a1)
for i in range(0,len(a1)):
    if (a1[i] > largest2) and (a1[i] < largest):
        largest2 = a1[i]
print(largest2)
Note that if the array is large, the call to min becomes non-trivial. In that case you could set largest2 to the smallest value possible (on that note, this link might be useful)
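A minimal sketch of that variant, which starts from negative infinity and so needs neither max nor min up front. The function name second_largest is made up for this illustration, and it assumes "second largest" means the largest value strictly below the maximum, as in the code above:

```python
def second_largest(values):
    """Return the largest value strictly below the maximum, or None if there is none."""
    # float("-inf") compares below every int and float, so it is a safe starting point
    largest = second = float("-inf")
    for value in values:
        if value > largest:
            # new maximum found: the old maximum becomes the runner-up
            largest, second = value, largest
        elif largest > value > second:
            second = value
    return None if second == float("-inf") else second


a1 = [int(part) for part in "-7 -7 -7 -7 -6".split()]
print(second_largest(a1))
```

This walks the list once, and returns None when all values are equal instead of silently reporting the maximum.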
