I have a bunch of addresses:
123 Main Street, PO Box 345, Chicago, IL 92921
1992 Super Way, Bakersfield, CA
234 Wonderland Lane, Attn: Daffy Duck, Orlando, FL 09922
How could I cut out the second string in there, when I do myStr.split(',') on each?
The idea is that I want to return:
123 Main Street, Chicago, IL 92921
1992 Super Way, CA
234 Wonderland Lane, Orlando, FL 09922
I could loop through each part, and build yet another string, skipping the second index, but was wondering if there's a better way to do so.
What I have now:
def filter_address(address):
print("Filtering address on",address)
updated_addr = ""
indx = 0
for section in address.split(","):
if indx != 1:
updated_addr = updated_addr + "," + section
indx += 1
updated_addr = updated_addr[1:] # This is to remove the leading `,`
new_address = filter_address("123 Main Street, Chicago, IL 92921")
You could use del in python and glue back the components of the string with ", " after splitting them.
For example:
address = "123 Main Street, PO Box 345, Chicago, IL 92921".split(",")
del address[1]
pretty_address = ", ".join(address)
print(pretty_address) # Gives 123 Main Street, Chicago, IL 92921
Related
import re
text = """STAR PLUS LIMITED Unit B & C, 15/F, Casey Aberdeen House, 38 Heung Yip Road, Wong Chuk Hang, Hong Kong. Tel: (852)2511 0112 Fax: 2507 4300 Email: info#starplushk.com Ref No: LSM25781 SALES Sales Quote No: SP21-SQ10452 Buyer's Ref: LSM-021042-5 Messers JSC "Tander" Russian Federation 350002 Krasnodar"""
ref_no = re.findall(r"(?:(?<=Buyer's Ref: )|(?<=Ref No: ))[\w\d-]+",text)
print(ref_no)
Required solution: ['LSM25781', 'LSM-021042-5']
The script above outputs this, but I have man keywords, so I want to generate the regex dynamically. How can I do that?
Tried:
ref_keywords = ["Buyer's Ref:","Ref No:","Reference number:"]
b = r"(?:(?<=" + '|'.join(ref_keyword)+ r" ))[\w\d-]+"
ref_no = re.findall(b, text)
print(ref_no)
This results in the following error
Traceback (most recent call last):
File "/home/v/.config/JetBrains/PyCharm2021.3/scratches/scratch_2.py", line 7, in <module>
ref_no = re.findall(regex, text)
File "/home/v/.pyenv/versions/3.9.5/lib/python3.9/re.py", line 241, in findall
return _compile(pattern, flags).findall(string)
File "/home/v/.pyenv/versions/3.9.5/lib/python3.9/re.py", line 304, in _compile
p = sre_compile.compile(pattern, flags)
File "/home/v/.pyenv/versions/3.9.5/lib/python3.9/sre_compile.py", line 768, in compile
code = _code(p, flags)
File "/home/v/.pyenv/versions/3.9.5/lib/python3.9/sre_compile.py", line 607, in _code
_compile(code, p.data, flags)
File "/home/v/.pyenv/versions/3.9.5/lib/python3.9/sre_compile.py", line 182, in _compile
raise error("look-behind requires fixed-width pattern")
re.error: look-behind requires fixed-width pattern
Process finished with exit code 1
Is there a solution to add list of keywords inside regex. I cannot use "|" because I have many list of keywords.
key_word = ['key1', 'key2', 'key2']
combined_word = ""
for key in key_word:
combined_word += "|"+key
import re
sentence = "I wana delete key1 and key2 but also key3."
re.split(combined_word, sentence)
You can do it the following way (you almost had it, you just need to create all the regex characters for each keyword:
import re
text = """STAR PLUS LIMITED Unit B & C, 15/F, Casey Aberdeen House, 38 Heung Yip Road, Wong Chuk Hang, Hong Kong. Tel: (852)2511 0112 Fax: 2507 4300 Email: info#starplushk.com Ref No: LSM25781 SALES Sales Quote No: SP21-SQ10452 Buyer's Ref: LSM-021042-5 Messers JSC "Tander" Russian Federation 350002 Krasnodar"""
ref_keywords = ["Buyer's Ref:", "Ref No:", "Reference number:"]
def keyword_to_regex(keyword: str) -> str:
# you missed creating these for each keyword
return f"(?<={keyword} )"
regex_for_all_keywords = r"(?:" + "|".join(map(keyword_to_regex, ref_keywords)) + r")[\w\d-]+"
ref_no = re.findall(regex_for_all_keywords, text)
print(ref_no) # ['LSM25781', 'LSM-021042-5']
Code example
from itertools import *
from collections import Counter
from tqdm import *
#for i in tqdm(Iterable):
for i in combinations_with_replacement(['1','2','3','4','5','6','7','8'], 8):
b = (''.join(i))
if b == '72637721':
print (b)
when i try profuct i have
for i in product(['1','2','3','4','5','7','6','8'], 8):
TypeError: 'int' object is not iterable
How can i get all combinations ? ( i was belive it before not test , so now all what i was do wrong)
i was read about combinations_with_replacement return all , but how i see it's lie
i use python 3.8
Out put for ask
11111111 11111112 11111113 11111114 11111115 11111116 11111117
11111118 11111122 11111123 11111124 11111125 11111126 11111127
11111128 11111133 11111134 11111135 11111136 11111137 11111138
11111144 11111145 11111146 11111147 11111148 11111155 11111156
11111157 11111158 11111166 11111167 11111168 11111177 11111178
11111188 11111222 11111223 11111224 11111225 11111226 11111227
11111228 11111233 11111234 11111235 11111236 11111237 11111238
11111244 11111245 11111246 11111247 11111248 11111255 11111256
11111257 11111258 11111266 11111267 11111268 11111277 11111278
11111288
what it start give at end
56666888 56668888 56688888 56888888 58888888 77777777 77777776
77777778 77777766 77777768 77777788 77777666 77777668 77777688
77777888 77776666 77776668 77776688 77776888 77778888 77766666
77766668 77766688 77766888 77768888 77788888 77666666 77666668
77666688 77666888 77668888 77688888 77888888 76666666 76666668
76666688 76666888 76668888 76688888 76888888 78888888 66666666
66666668 66666688 66666888 66668888 66688888 66888888 68888888
88888888
more cleare think it how it be count from 1111 1111 to 8888 8888 ( but for characters , so this why i use try do it at permutation/combine with repitions...
it miss some possible combinations of that symbols.
as example what i try do , make all permutatuion of possible variants of hex numbers , like from 0 to F , but make it not only for them , make this possible for any charater.
this only at example ['1','2','3','4','5','6','7','8']
this can be ['a','b','x','c','d','g','r','8'] etc.
solition is use itertools.product instead combinations_with_replacement
from itertools import *
for i in product(['1','2','3','4','5','6','7','8'],repeat = 8):
b = (''.join(i))
if b == '72637721':
print (b)
:
itertools.product ('ABCD', 'ABCD') AA AB AC AD BA BB BC BD CA CB CC CD DA DB DC DD # full multiplication with duplicates and mirrored pairs
itertools.permutations ('ABCD', 2) -> AB AC AD BA BC BD CA CB CD DA DB DC # full multiplication without duplicates and mirrored pairs
itertools.combinations_with_replacement ('ABCD', 2) -> AA AB AC AD BB BC BD CC CD DD # no mirror pairs with duplicates
itertools.combinations ('ABCD', 2) -> AB AC AD BC BD CD # no mirrored pairs and no duplicates
Here's the updated code that will print you all the combinations. It does not matter if your list has strings and numbers.
To ensure that you are doing a combination only for the specific number of elements, I recommend that you do:
comb_list = [1, 2, 3, 'a']
comb_len = len(comb_list)
and replace the line with:
comb = combinations_with_replacement(comb_list, comb_len)
from itertools import combinations_with_replacement
comb = combinations_with_replacement([1, 2, 3, 'a'], 4)
for i in list(comb):
print (''.join([str(j) for j in i]))
This will result as follows:
1111
1112
1113
111a
1122
1123
112a
1133
113a
11aa
1222
1223
122a
1233
123a
12aa
1333
133a
13aa
1aaa
2222
2223
222a
2233
223a
22aa
2333
233a
23aa
2aaa
3333
333a
33aa
3aaa
aaaa
I don't know what you are trying to do. Here's an attempt to start a dialogue to get to the final answer:
samples = [1,2,3,4,5,'a','b']
len_samples = len(samples)
for elem in samples:
print (str(elem)*len_samples)
The output of this will be as follows:
1111111
2222222
3333333
4444444
5555555
aaaaaaa
bbbbbbb
Is this what you want? If not, explain your question section what you expect as an output.
import random
capitals = {'Alabama': 'Montgomery', 'Alaska': 'Juneau', 'Arizona': 'Phoenix',
'Arkansas': 'Little Rock', 'California': 'Sacramento', 'Colorado': 'Denver',
'Connecticut': 'Hartford', 'Delaware': 'Dover', 'Florida': 'Tallahassee',
'Georgia': 'Atlanta', 'Hawaii': 'Honolulu', 'Idaho': 'Boise', 'Illinois':
'Springfield', 'Indiana': 'Indianapolis', 'Iowa': 'Des Moines', 'Kansas':
'Topeka', 'Kentucky': 'Frankfort', 'Louisiana': 'Baton Rouge', 'Maine':
'Augusta', 'Maryland': 'Annapolis', 'Massachusetts': 'Boston', 'Michigan':
'Lansing', 'Minnesota': 'Saint Paul', 'Mississippi': 'Jackson', 'Missouri':
'Jefferson City', 'Montana': 'Helena', 'Nebraska': 'Lincoln', 'Nevada':
'Carson City', 'New Hampshire': 'Concord', 'New Jersey': 'Trenton', 'NewMexico': 'Santa Fe', 'New York': 'Albany', 'North Carolina': 'Raleigh',
'North Dakota': 'Bismarck', 'Ohio': 'Columbus', 'Oklahoma': 'Oklahoma City',
'Oregon': 'Salem', 'Pennsylvania': 'Harrisburg', 'Rhode Island': 'Providence',
'South Carolina': 'Columbia'}
for quizNum in range(10):
quizFile = open('capitalsquiz%s.txt' % (quizNum + 1), 'w')
answerKeyFile = open('capitalsquiz_answers%s.txt' % (quizNum + 1), 'w')
quizFile.write('Name:\n\nDate:\n\nPeriod:\n\n')
quizFile.write((' ' * 20) + 'State Capitals Quiz (Form %s)' % (quizNum + 1))
quizFile.write('\n\n')
states = list(capitals.keys())
random.shuffle(states)
for questionNum in range(40):
correctAnswer = capitals[states[questionNum]]
wrongAnswers = list(capitals.values())
del wrongAnswers[wrongAnswers.index(correctAnswer)]
wrongAnswers = random.sample(wrongAnswers, 3)
answerOptions = wrongAnswers + [correctAnswer]
random.shuffle(answerOptions)
quizFile.write('%s. What is the capital of %s?\n' % (questionNum + 1,states[questionNum]))
for i in range(4):
quizFile.write(' %s. %s\n' % ('ABCD'[i], answerOptions[i]))
quizFile.write('\n')
answerKeyFile.write('%s. %s\n' % (questionNum + 1, 'ABCD'[answerOptions.index(correctAnswer)]))
quizFile.close()
answerKeyFile.close()
The remaining 9 booklets are created as .txt files but are blank. Where am I going wrong ?
Questions involve the capital cities of U.S.A
Expected file form is :
Name:
Date:
Period:
State Capitals Quiz (Form 1)
What is the capital of West Virginia?
A. Hartford
B. Santa Fe
C. Harrisburg
D. Charleston
What is the capital of Colorado?
A. Raleigh
B. Harrisburg
C. Denver
D. Lincoln
3......
and so on . Like that for all the other files(remaining 9 .txt files which get created) questions would have to be printed.But they just turn out to be blank.
Each question booklets has the same questions but are in different order(to prevent cheating)
You questions loop is outside of the file loop. This is causing all of the files to be opened before any questions are written, then the last opened file is what gets the questions đź‘Ť
I can post the fix later if needed, just on my phone right now.
I am needing to convert a list of lists of strings into a three column table where the first column is 1 space longer than the longest string. I have figured out how to identify the longest string and how long it is, but getting the table to form has been quite tricky. Here is the program with the lists in it and it shows you that the longest one is 26 characters long.
def main():
mycities = [['Cape Girardeau', 'MO', '63780'], ['Columbia', 'MO', '65201'],
['Kansas City', 'MO', '64108'], ['Rolla', 'MO', '65402'],
['Springfield', 'MO', '65897'], ['St Joseph', 'MO', '64504'],
['St Louis', 'MO', '63111'], ['Ames', 'IA', '50010'], ['Enid',
'OK', '73773'], ['West Palm Beach', 'FL', '33412'],
['International Falls', 'MN', '56649'], ['Frostbite Falls',
'MN', '56650']]
col_width = max(len(item) for sub in mycities for item in sub)
print(col_width)
main()
Now I am just needing to get it to print off like this:
Cape Girardeau MO 63780
Columbia MO 65201
Kansas City MO 64108
Springfield MO 65897
St Joseph MO 64504
St Louis MO 63111
Ames IA 50010
Enid OK 73773
West Palm Beach FL 33412
International Falls MN 56649
Frostbite Falls MN 56650
You're off to the right start -- as an example given the specific structure to the lists you have, you can use the col_width you calculated to determine the number of spaces you'd need after the name of each city to append to the end of each city name:
for city in mycities:
# append the string with the number of spaces required
city_padded = city[0] + " " + " "*(col_width-len(city[0]))
print(city_padded + city[1] + " " + city[2])
Given your example, this will produce:
Cape Girardeau MO 63780
Columbia MO 65201
Kansas City MO 64108
Rolla MO 65402
Springfield MO 65897
St Joseph MO 64504
St Louis MO 63111
Ames IA 50010
Enid OK 73773
West Palm Beach FL 33412
International Falls MN 56649
Frostbite Falls MN 56650
Note in the original version of your question, you are missing commas in your sublists in your mycities variable, for which I've added in an edit.
As a side note, it is convention in Python that words be separated by underscores in variable names for readability, so you might rename mycities to my_cities.
pep8 ref: (https://www.python.org/dev/peps/pep-0008/#function-and-variable-names)
"String name".ljust(26) will add spaces to the end of your string. For example,
Ames.ljust(26) will result in 'Ames (22 spaces here)', and then the next column will print after. If you are not sure what the longest city will be, you could replace the 26 with len(cities[-1]) after ordering the cities in a list by length. To do this, you can do sortedCities = sorted(cityListVariable, key=len)
def main():
cities = ['Cape Girardeau, MO 63780', 'Columbia, MO 65201', 'Kansas City, MO 64108', 'Rolla, MO 65402',
'Springfield, MO 65897', 'St Joseph, MO 64504', 'St Louis, MO 63111', 'Ames, IA 50010',
'Enid, OK 73773', 'West Palm Beach, FL 33412', 'International Falls, MN 56649', 'Frostbite Falls, MN 56650',
'Charlotte, NC 28241', 'Upper Marlboro, MD 20774', 'Camdenton, MO 65020', 'San Fransisco, CA 94016'] #create list of information
for x in cities:
col = x.split(",")
if(len(col) == 2):
city = col[0].strip()
temp = col[1].strip()
else:
city = x[:15].strip()
temp = c[15:].strip()
state = temp[:2]
zipCode = int(temp[-5::])
print("%-20s\t%s\t%d"%(city, state, zipCode))
main()
I have a problem and i can't get a proper and good solution.
Here is the thing, i have my list, which contains addresses:
addresses = ['Paris, 9 rue VoltaIre', 'Paris, 42 quai Voltaire', 'Paris, 87 rue VoLtAiRe']
And I want to get after treatment:
to_upper_words = ['paris', 'voltaire']
Because 'Paris' and 'Voltaire' are in all strings.
The goal is to upper all common words, and results for example:
'PARIS, 343 BOULEVARD SAINT-GERMAIN'
'PARIS, 458 BOULEVARD SAINT-GERMAIN'
'Paris', 'Boulevard', 'Saint-Germain' are in upper case because all strings (two in this case) have these words in common.
I found multiple solutions but only for two strings, and the result is ugly to apply for N strings.
If anyone can help...
Thank you in advance !
Remove the commas so that the words can be picked out simply by splitting on blanks. Then find the words that common to all of the addresses using set intersection. (There is only one such word for my collection, namely Montréal.) Now look through the original collection of addresses replacing each occurrence of common words with its uppercase equivalent. The most likely mistake in doing this might be to try to assign a value to an individual value in the list. However, these are, in effect, copies which is why I arranged to assign the modified values to addressses[i].
>>> addresses = [ 'Montréal, 3415 Chemin de la Côte-des-Neiges', 'Montréal, 3655 Avenue du Musée', 'Montréal, 3475 Rue de la Montagne', 'Montréal, 3450 Rue Drummond' ]
>>> sans_virgules = [address.replace(',', ' ') for address in addresses]
>>> listes_de_mots = [sans_virgule.split() for sans_virgule in sans_virgules]
>>> mots_en_commun = set(listes_de_mots[0])
>>> for liste in listes_de_mots[1:]:
... mots_en_commun = mots_en_commun.intersection(set(liste))
...
>>> mots_en_commun
{'Montréal'}
>>> for i, address in enumerate(addresses):
... for mot in list(mots_en_commun):
... addresses[i] = address.replace(mot, mot.upper())
...
>>> addresses
['MONTRÉAL, 3415 Chemin de la Côte-des-Neiges', 'MONTRÉAL, 3655 Avenue du Musée', 'MONTRÉAL, 3475 Rue de la Montagne', 'MONTRÉAL, 3450 Rue Drummond']