How to change behaviour of str(aList) in python - string

I'm trying to extract information from a very large XML file by parsing it with a Python (v2.7.10) script. The goal is to make this information available to my other project, written in C, as (very large) array and integer literals in a file.
The following code demonstrates what I'm doing:
import sys
myList = [[1,2,3],[11,22,33]]
size = len(myList)
print("int myList[" + str(size) + "][3] = " + str(myList) + ";\n")
The result of this is int myList[2][3] = [[1, 2, 3], [11, 22, 33]];, but what I need is C syntax: int myList[2][3] = {{1,2,3},{11,22,33}};
Is there a way to make str() use {} instead of [] when printing lists?

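You can build a translation table, trans, and pass it to str.translate(). In Python 2, which the question uses, the table comes from string.maketrans (in Python 3 the equivalent is str.maketrans):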
from string import maketrans
trans = maketrans('[]','{}')
print("int myList[" + str(size) + "][3] = " + str(myList).translate(trans) + ";\n")
This then prints:
>>> from string import maketrans
>>> myList = [[1,2,3],[11,22,33]]
>>> size = len(myList)
>>> trans = maketrans('[]','{}')
>>> print("int myList[" + str(size) + "][3] = " + str(myList).translate(trans) + ";\n")
int myList[2][3] = {{1, 2, 3}, {11, 22, 33}};
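Note that this is a plain character substitution: every '[' becomes '{' and every ']' becomes '}'. If the elements are not lists of numbers (for example, strings that contain brackets, or tuples, which str() renders with parentheses), the output may not be valid C.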
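To avoid the pitfalls of character substitution, the initializer can also be built recursively instead of post-processing str(). The following is a minimal sketch, not part of the original answer; the helper name to_c_initializer is hypothetical, and it assumes the data consists only of nested lists or tuples of numbers.

```python
def to_c_initializer(value):
    """Render nested lists/tuples of numbers as a C brace initializer."""
    if isinstance(value, (list, tuple)):
        # Recurse into each element and wrap the result in braces.
        return "{" + ", ".join(to_c_initializer(v) for v in value) + "}"
    # Leaf value: assumed to be a number whose str() form is valid C.
    return str(value)

my_list = [[1, 2, 3], [11, 22, 33]]
decl = "int myList[{}][{}] = {};".format(
    len(my_list), len(my_list[0]), to_c_initializer(my_list)
)
print(decl)  # int myList[2][3] = {{1, 2, 3}, {11, 22, 33}};
```

Because the braces are emitted by the function rather than substituted afterwards, brackets inside the data cannot be rewritten by accident, and the same code runs on Python 2 and Python 3.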

Related

Turn a str into a reasonable list python beginner

I'm trying to separate a string into a list that makes sense, for example in order to count the items on the list.
For example: str - "tomatoes,eggs,milk"
and result: lst = ['tomatoes', 'eggs', 'milk']
The code I wrote was:
def separate_groceries(str):
    lst = [1, 2, 3] # really limiting the len of str, I need it to be able to receive any str
    p = 0
    for i in str:
        pos = str.find(',')
        item = str[:pos]
        lst[p] = (item)
        p = p + 1
        str = str[pos+1:]
    return lst
str = "tomatoes,milk,eggs"
res = separate_groceries(str)
print(res)
thank you for your help!
Stimer,
As Niranjan Nagaraju said, all you may need is:
groceries_list = "tomatoes,milk,eggs"
res = groceries_list.split(',')
print(res)
You may find more information about the .split() string method in the official Python documentation.
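If the input may contain spaces around the commas or empty entries, a slightly more defensive variant can help. This is a minimal sketch; the function name split_groceries and the sample string are made up for illustration.

```python
def split_groceries(text):
    """Split a comma-separated string into trimmed, non-empty items."""
    # strip() removes surrounding whitespace; empty pieces are skipped.
    return [item.strip() for item in text.split(",") if item.strip()]

items = split_groceries("tomatoes, milk ,eggs,,")
print(items)       # ['tomatoes', 'milk', 'eggs']
print(len(items))  # 3
```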

Python Parameterize Formatting

I was wondering if there is a way to parameterize the format operator.
For example:
>>> '{:.4f}'.format(round(1.23456789, 4))
'1.2346'
However, is there any way to do something like this instead?
>>> x = 4
>>> '{:.xf}'.format(round(1.23456789, x))
'1.2346'
Yes, this is possible with a little bit of string concatenation. Check out the code below:
>>> x = 4
>>> string = '{:.' + str(x) + 'f}' # concatenate the string value of x
>>> string # you can see that string is the same as '{:.4f}'
'{:.4f}'
>>> string.format(round(1.23456789, x)) # the final result
'1.2346'
>>>
Or, if you wish to do this without the extra string variable:
>>> ('{:.' + str(x) + 'f}').format(round(1.23456789, x)) # wrap the concatenated string in parentheses
'1.2346'
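As an alternative to concatenation, str.format also accepts nested replacement fields inside the format spec, so the precision can be passed as an argument. The sketch below shows three equivalent spellings; the variable names are illustrative only. The call to round() is left out because the .Nf format spec already rounds to N decimal places.

```python
x = 4
value = 1.23456789

# Nested positional field: the inner {} is filled by the second argument.
nested = "{:.{}f}".format(value, x)
# Nested named field, which reads more clearly in longer format strings.
named = "{:.{prec}f}".format(value, prec=x)
# The built-in format() takes the spec as a plain string.
builtin = format(value, ".{}f".format(x))

print(nested, named, builtin)  # 1.2346 1.2346 1.2346
```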

What is the best way to replace a list of characters with '-' in a string?

I want to replace these symbols with '-' and I know there should be a better way than doing this:
if '/' in var1:
    var1 = var1.replace('/', '-')
if '#' in var1:
    var1 = var1.replace('#', '-')
if ';' in var1:
    var1 = var1.replace(';', '-')
if ':' in var1:
    var1 = var1.replace(':', '-')
This is what I tried, which is clearly wrong and I'm not able to properly optimize it.
str = 'Testing PRI/Sec (#434242332;PP:432:133423846,335)'
a = ['#',':',';','/']
print([str.replace(i,'-') for i in str])
replaceAll doesn't work; it gives me an error saying str does not have that attribute.
str.replaceAll("[<>]", "")
How about using str.translate()?
# make a translation table that replaces any of "#:;/" with hyphens
hyphenator = str.maketrans({c: "-" for c in "#:;/"})
# use str.translate to apply it
print("Testing PRI/Sec (#434242332;PP:432:133423846,335)".translate(hyphenator))
Or, even faster, use a compiled regex:
import re
compiled_re = re.compile("|".join(re.escape(i) for i in "#:;/"))
print(compiled_re.sub("-", "Testing PRI/Sec (#434242332;PP:432:133423846,335)"))
Both of these methods are much faster than the other methods proposed (at least on that input):
import re
import timeit
s = "Testing PRI/Sec (#434242332;PP:432:133423846,335)"
a = ["#", ":", ";", "/"]
hyphenator = str.maketrans({c: "-" for c in "#:;/"})
# compile the pattern once, outside the timed functions
compiled_re = re.compile("|".join(re.escape(i) for i in a))
def str_translate():
    s.translate(hyphenator)
def join_generator():
    "".join("-" if ch in a else ch for ch in s)
def append_in_loop():
    temp = ""
    for i in s:
        if i in a:
            temp += "-"
        else:
            temp += i
def re_sub():
    re.sub("|".join(re.escape(i) for i in a), "-", s)
def compiled_re_sub():
    compiled_re.sub("-", s)
for method in [str_translate, join_generator, re_sub, append_in_loop, compiled_re_sub]:
    # run a million iterations and report the total time
    print("{} took a total of {}s".format(method.__name__, timeit.timeit(method)))
Results on my machine:
str_translate took a total of 1.1160085709998384s
join_generator took a total of 4.599312704987824s
re_sub took a total of 4.101858579088002s
append_in_loop took a total of 4.257988628000021s
compiled_re_sub took a total of 1.0353244650177658s
s = 'Testing PRI/Sec (#434242332;PP:432:133423846,335)'
a = ['#',':',';','/']
print(''.join('-' if ch in a else ch for ch in s))
Prints:
Testing PRI-Sec (-434242332-PP-432-133423846,335)
Or using re:
s = 'Testing PRI/Sec (#434242332;PP:432:133423846,335)'
a = ['#',':',';','/']
import re
print(re.sub('|'.join(re.escape(i) for i in a), '-', s))
Prints:
Testing PRI-Sec (-434242332-PP-432-133423846,335)
Use the re package with a character class:
import re
string = 'Testing PRI/Sec (#434242332;PP:432:133423846,335)'
result = re.sub('[#:;/]',"-", string)
print(result)
Result:
Testing PRI-Sec (-434242332-PP-432-133423846,335)
Just loop through the string and add each character to a temp variable, unless the character is in the list a; in that case, add "-" to the variable instead.
str = 'Testing PRI/Sec (#434242332;PP:432:133423846,335)'
a = ['#',':',';','/']
temp = ''
for i in str:
    if i in a:
        temp = temp + "-"
    else:
        temp = temp + i
print(temp)
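The approaches in the answers above should all produce the same string. The following sketch (Python 3, with illustrative variable names) checks that for three of them; it uses the two-string form of str.maketrans, which maps each character of the first argument to the character at the same position in the second.

```python
import re

s = "Testing PRI/Sec (#434242332;PP:432:133423846,335)"
chars = "#:;/"

# Two-string maketrans: both arguments must have the same length.
via_translate = s.translate(str.maketrans(chars, "-" * len(chars)))
# Character class built from the escaped characters.
via_regex = re.sub("[" + re.escape(chars) + "]", "-", s)
# Generator expression joined back into a string.
via_join = "".join("-" if ch in chars else ch for ch in s)

assert via_translate == via_regex == via_join
print(via_translate)  # Testing PRI-Sec (-434242332-PP-432-133423846,335)
```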

How can I convert these outputted coordinates to standard-looking ones?

I have this code that outputs coordinates for a port:
import urllib
import urllib.request as request
import re
a = input("What country is your port in?: ")
b = input("What is the name of the port?: ")
url = "http://ports.com/"
country = ["united-kingdom","greece"]
ports = ["port-of-eleusis","portsmouth-continental-ferry-port","poole-harbour"]
totalurl = "http://ports.com/" + a + "/" + b + "/"
htmlfile = urllib.request.urlopen(totalurl)
htmltext = htmlfile.read()
regex = '<strong>Coordinates:</strong>(.*?)</span>'
pattern = re.compile(regex)
with urllib.request.urlopen(totalurl) as response:
    html = htmltext.decode()
num = re.findall(pattern, html)
print(num)
The output is correct and readable, but I need the coordinates in a format like 39°09'24.6''N 175°37'55.8''W instead of:
>>> [' 50&deg;48&prime;41.04&Prime;N 1&deg;5&prime;31.31&Prime;W']
This happens because HTML uses these entity codes to represent specific Unicode characters, while Python does not interpret them. To fix this, replace print(num) with print(list(i.replace('&deg;', "°").replace('&prime;', "′").replace('&Prime;', "″") for i in num))
This replaces &deg; with °, &prime; with ′, and &Prime; with ″.
>>> print(list(i.replace('&deg;', "°").replace('&prime;', "′").replace('&Prime;', "″") for i in num))
[' 50°48′41.04″N 1°5′31.31″W']
>>>
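Rather than listing each entity by hand, the standard library's html.unescape can decode all HTML entities in one call. This is a minimal sketch, assuming the scraped text contains named entities such as &deg;, &prime; and &Prime;; the sample string below is illustrative and no network request is made.

```python
import html

# Sample scraped value, assumed to contain named HTML entities.
raw = [" 50&deg;48&prime;41.04&Prime;N 1&deg;5&prime;31.31&Prime;W"]

# Decode every entity and trim the leading space.
decoded = [html.unescape(item).strip() for item in raw]
print(decoded)  # ['50°48′41.04″N 1°5′31.31″W']

# Optional: use ASCII quotes, as in the format the question asks for.
ascii_style = [d.replace("′", "'").replace("″", "''") for d in decoded]
print(ascii_style[0])  # 50°48'41.04''N 1°5'31.31''W
```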

Python write a list to a text file

I am using Python's format method to write a list of numbers into a text file. When the list length is fixed I can do:
a = [16,1,16,1]
myfile.write('{0},{1},{2},{3}'.format(*a) + "\n")
My question is how to use the format method when the size of the list is not fixed. Is there an easy way of doing this, rather than first creating a string b and then mapping a to b as shown below? I am not sure if I can use something like myfile.write('{all elements in a}'.format(*a) + "\n")
b = ''
for i in range(len(a)):
    if i < (len(a)-1):
        b = b + '{' + str(i) + '},'
    else:
        b = b + '{' + str(i) + '}'
myfile.write(b.format(*a) + "\n")
Use str.join:
>>> lis = [16,1,16,1]
>>> ','.join(str(x) for x in lis) + '\n'
'16,1,16,1\n'
>>> lis = range(10)
>>> ','.join(str(x) for x in lis) + '\n'
'0,1,2,3,4,5,6,7,8,9\n'
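Applied to the file-writing case from the question, the join approach might look like the sketch below. The helper name write_row is hypothetical, and io.StringIO stands in for the question's myfile so the example runs without touching the disk; any object with a write() method would work the same way.

```python
import io

def write_row(fileobj, values):
    """Write the values as one comma-separated line, whatever their count."""
    fileobj.write(",".join(str(v) for v in values) + "\n")

buf = io.StringIO()  # stand-in for an open text file
write_row(buf, [16, 1, 16, 1])
write_row(buf, range(5))
print(buf.getvalue(), end="")
# 16,1,16,1
# 0,1,2,3,4
```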
