--*--
-***-
--*--
(The dashes stand for blanks.)
print('', '*', ' \n', '***', ' \n', '', '*', '')
This is what I wrote, and it doesn't work. I thought '' was a blank, and that each comma adds one more blank, so there should be two blanks as a result?
Anyway, how can I do this using only one print() call?
Just put it in as a single string:
print(' * \n***\n * ')
Output:
 *
***
 *
You can do this, because Python treats \n as new line character and it will not interfere with the rest of the text, even if it "touches" it. Putting it in a single string makes it more readable. There is no reason to fragment the whole statement with commas, when you can do it all in one string.
Basically:
'' --> empty string
' ' --> one space char (or blank)
So, to fix your print, only change the first argument from '' to ' ':
print(' ', '*', ' \n', '***', ' \n', '', '*', '')
You can also simplify it by passing a single argument:
print('  * \n *** \n  * ')
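To see where the extra blanks in the original call come from: print() joins its arguments with sep, which defaults to a single space. A minimal sketch (the names args, joined and pattern are illustrative only):

```python
# print() joins its arguments with `sep`, which defaults to one space.
args = ('', '*', ' \n', '***', ' \n', '', '*', '')

# This is the text that print(*args) writes, before the final newline.
joined = ' '.join(args)
print(repr(joined))

# With sep='' no spaces are added, so only the characters you typed appear.
parts = (' ', '*', ' \n', '***', '\n', ' ', '*', ' ')
pattern = ''.join(parts)
print(*parts, sep='')
```

Each comma therefore adds one space between two arguments, including around the empty strings and the newline pieces.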
I am trying to parse a long string of 'objects' enclosed in quotes and delimited by commas. Example:
s='"12345","X","description of x","X,Y",,,"345355"'
output=['"12345"','"X"','"description of x"','"X,Y"','','','"345355"']
I am using split to break the string on commas:
s='"12345","X","description of x","X,Y",,,"345355"'
s.split(',')
This almost works, but for the segment ...,"X,Y",... the split also cuts the quoted data into "X and Y". I need the split to ignore commas inside quotes.
Is there a way to split on commas, except for those inside quotes?
I tried a regex, but it skips the ...,,,... in the data, because blank fields in the file I'm parsing have no quotes. I am not an expert with regex; the sample I used came from "Python split string on quotes". I understand what that example does, but I am not sure how to modify it to handle data that is not enclosed in quotes.
Thanks!
Split on " (quote) instead of , (comma). That gives a list with extra elements that consist only of commas, which you can then remove. Note that this also drops the empty fields.
s='"12345","X","description of x","X,Y",,,"345355"'
temp = s.split('"')
print(temp)
#> ['', '12345', ',', 'X', ',', 'description of x', ',', 'X,Y', ',,,', '345355', '']
values_to_remove = ['', ',', ',,,']
result = list(filter(lambda val: not val in values_to_remove, temp))
print(result)
#> ['12345', 'X', 'description of x', 'X,Y', '345355']
this should work:
In [1]: import re
In [2]: s = '"12345","X","description of x","X,Y",,,"345355"'
In [3]: pattern = r"(?<=[\",]),(?=[\",])"
In [4]: re.split(pattern, s)
Out[4]: ['"12345"', '"X"', '"description of x"', '"X,Y"', '', '', '"345355"']
Explanation:
(?<=...) is a "positive lookbehind assertion". It causes your pattern (in this case, just a comma, ",") to match commas in the string only if they are preceded by the pattern given by .... Here, ... is [\",], which means "either a quotation mark or a comma".
(?=...) is a "positive lookahead assertion". It causes your pattern to match commas in the string only if they are followed by the pattern specified as ... (again, [\",]: either a quotation mark or a comma).
Since both of these assertions must be satisfied for the pattern to match, it will still work correctly if any of your 'objects' begin or end with commas as well.
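As a quick check of the explanation above, here is a small sketch that compares a plain split with the lookaround split on the sample string (the names naive and fields are illustrative):

```python
import re

s = '"12345","X","description of x","X,Y",,,"345355"'

# A plain split also cuts the quoted field "X,Y" in two.
naive = s.split(',')

# Split only on commas that have a quote or comma on both sides.
fields = re.split(r'(?<=[",]),(?=[",])', s)

print(len(naive), len(fields))
print(fields)
```

The comma inside "X,Y" has a letter on each side, so neither assertion holds there and the field stays whole.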
You can replace all quotes with empty string.
s='"12345","X","description of x","X,Y",,,"345355"'
n = ''
i = 0
while i < len(s):
    if i >= len(s):
        break
    if i<len(s) and s[i] == '"':
        # skip the opening quote and copy everything up to the closing quote
        i+=1
        while i<len(s) and s[i] != '"':
            n+=s[i]
            i+=1
        i+=1
    if i < len(s) and s[i] == ",":
        # a comma outside quotes becomes the ", " separator
        n+=", "
        i+=1
print(n.split(", "))
output: ['12345', 'X', 'description of x', 'X,Y', '', '', '345355']
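A different technique, not used in the answers above: the standard library's csv module already understands quoted fields, so it may be worth trying before writing a parser by hand. A minimal sketch, assuming the data follows the usual CSV quoting rules:

```python
import csv

s = '"12345","X","description of x","X,Y",,,"345355"'

# csv.reader accepts any iterable of lines; here it gets a one-line list.
row = next(csv.reader([s]))
print(row)
```

This returns the fields without their surrounding quotes and keeps the empty fields.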
For example, I have a list of strings:
sentence = ['cracked $300 million','she\'s resolutely, smitten ', 'that\'s creative [r]', 'the market ( knowledge check : prices up!']
I want to remove the punctuation and replace numbers with the '£' symbol.
I have tried this but can only replace one or the other when I try to run them both.
my code is below
import re
s =([re.sub(r'[!":$()[]\',]',' ', word) for word in sentence])
s= [([re.sub(r'\d+','£', word) for word in s])]
print(s)
I think the problem could be in the square brackets??
thank you!
If you want to replace some specific punctuation symbols with a space and any digit chunks with a £ sign, you can use
import re
rx = re.compile(r'''[][!":$()',]|(\d+)''')
sentence = ['cracked $300 million','she\'s resolutely, smitten ', 'that\'s creative [r]', 'the market ( knowledge check : prices up!']
s = [rx.sub(lambda x: '£' if x.group(1) else ' ', word) for word in sentence]
print(s)  # => ['cracked  £ million', 'she s resolutely  smitten ', 'that s creative  r ', 'the market   knowledge check   prices up ']
Note where [] are inside a character class: when ] is at the start, it does not need to be escaped and [ does not have to be escaped at all inside character classes. I also used a triple-quoted string literal, so you can use " and ' as is without extra escaping.
So, here, [][!":$()',]|(\d+) either matches one of ], [, !, ", :, $, (, ), ' or , or it matches one or more digits and captures them into Group 1. If Group 1 matched, the replacement is the pound sign; otherwise, it is a space.
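The bracket-placement rule can be checked in isolation. A small sketch (the names first, escaped and sample are illustrative) comparing a class that starts with ] against the same class written with escapes:

```python
import re

# `]` placed first in the class is literal; `[` needs no escape inside a class.
first = re.compile(r'[][]')
# The same class written with both brackets escaped.
escaped = re.compile(r'[\[\]]')

sample = 'that s creative [r]'
print(first.findall(sample))
print(escaped.findall(sample))
```

Both patterns find the same two characters, so the choice between them is a matter of readability.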
Sorry, I didn't see the second part of your request, but you can do this for the numbers and the punctuation:
import string

sentence = ['cracked $300 million', 'she\'s resolutely, smitten ', 'that\'s creative [r]',
            'the market ( knowledge check : prices up!']

def replaceDigitAndPunctuation(newSentence):
    new_word = ""
    for char in newSentence:
        if char in string.digits:
            new_word += "£"   # every single digit becomes its own £
        elif char in string.punctuation:
            pass              # punctuation is dropped, not replaced
        else:
            new_word += char
    return new_word

for i in range(len(sentence)):
    sentence[i] = replaceDigitAndPunctuation(sentence[i])
Using your input and pattern:
>>> ([re.sub(r'[!":$()[]\',]',' ', word) for word in sentence])
['cracked $300 million', "she's resolutely, smitten ", "that's creative [r]", 'the market ( knowledge check : prices up!']
>>>
The reason is that [!":$()[] is treated as a complete character class, because the first unescaped ] closes it, and the remaining \',] is a literal pattern. The engine therefore looks for one character from the class followed by exactly ',].
With the closing bracket in the group escaped:
\]
>>> ([re.sub(r'[!":$()[\]\',]',' ', word) for word in sentence])
['cracked  300 million', 'she s resolutely  smitten ', 'that s creative  r ', 'the market   knowledge check   prices up ']
>>>
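To make the parsing of the unescaped pattern visible, here is a small sketch with a made-up test string that does contain the literal sequence the engine is looking for (the names broken, found and unchanged are illustrative):

```python
import re

broken = r'[!":$()[]\',]'

# The class ends at the first `]`, so the pattern means:
# one of  ! " : $ ( ) [   followed by the literal text  ',]
found = re.findall(broken, "a(',] b$',] c,")
print(found)

# Ordinary text never contains that sequence, so nothing is replaced.
unchanged = re.sub(broken, ' ', "she's resolutely, smitten")
print(unchanged)
```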
Edit:
If you're trying to stack multiple actions into a single list comprehension, then place your actions in a function and call the function:
def process_word(word):
    word = re.sub(r'[!":$()[\]\',]',' ', word)
    word = re.sub(r'\d+','£', word)
    return word
Results in:
>>> [process_word(word) for word in sentence]
['cracked  £ million', 'she s resolutely  smitten ', 'that s creative  r ', 'the market   knowledge check   prices up ']
I have a text full of special characters and I want to extract the numbers that have 4 digits.
mytext ="""A text including special characters like 1000+(100)=1100 """
numbers = []
seperators=[
'(', ')', '[', ']', '{', '}', ';', ':', '=', '+', '-', '/', '*', '&', '%', '$', '#', '#', '^', '*', '~', '`', '"', '>', '|', '\\', '?', '.', '<', "'"]
How can I use the split function to extract the numbers?
# Note: str.split() takes a single string separator, not a list,
# so this call raises TypeError.
for word in mytext.split(seperators):
    if word.isdigit():
        numbers.append(int(word))
#print(numbers)
for mynumbers in numbers:
    if mynumbers >999 and 10000>mynumbers: #for 4 digits
        print(mynumbers)
#this should print all the 4 digit numbers
text = "A text including special characters like 1000+(100)=1100 "
import re
numbers = [int(number) for number in re.findall(r'\b\d{4}\b', text)]
print(numbers)
# Outputs [1000, 1100]
mytext ="""Alain Fabien Maurice Marcel Delon (French: [alɛ̃ dəlɔ̃]; born 8 November 1935) is a French actor and businessman. He is known as
one of Europe's most prominent actors and screen sex symbols from the 1960s and 1970s. He achieved critical acclaim for roles in
films such as Rocco and His Brothers (1960), Plein Soleil (1960), L'Eclisse (1962), The Leopard (1963), The Yellow Rolls-
Royce (1965), Lost Command (1966), and Le Samouraï (1967). Over the course of his career Delon worked with many wellknown directors, including Luchino Visconti, Jean-Luc Godard, Jean-Pierre Melville, Michelangelo Antonioni, and Louis Malle. He
acquired Swiss citizenship in 1999"""
numbers = []
# Every character in this string is replaced with a space before splitting.
separators = '()[]{};:=+-/*&%$#^~`"><|\\?.\''
mytext2=mytext
for ch in separators:
    mytext2=mytext2.replace(ch,' ')
#print(mytext2)
for word in mytext2.split():
    if word.isdigit():
        numbers.append(int(word))
#print(numbers)
for mynumbers in numbers:
    if mynumbers >999 and 10000>mynumbers: # 4-digit numbers only
        print(mynumbers)
This code prints all the 4-digit numbers in the text. If your text contains more special characters, add them to the first part so that they are replaced as well.
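The replace-then-split idea can also be written as a single re.split call. This is only a sketch: the separator list is an illustrative subset, and the names splitter, tokens and four_digit are made up for the example.

```python
import re

text = "A text including special characters like 1000+(100)=1100 "
separators = ['(', ')', '=', '+']  # illustrative subset, extend as needed

# Build one character class from the separators plus whitespace;
# re.escape protects characters that are special inside a regex.
splitter = re.compile('[' + re.escape(''.join(separators)) + r'\s]+')

# Splitting can leave empty strings at the edges, so filter them out.
tokens = [t for t in splitter.split(text) if t]
four_digit = [int(t) for t in tokens if t.isdigit() and len(t) == 4]
print(four_digit)
```

Checking len(t) == 4 on the token keeps the test about digits rather than about numeric range.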
I have a string variable with the value given below:
string_value = "hello ' how ' are - you ? and/ nice to % meet # you"
Expected result:
hello how are you and nice to meet you
You could try just removing all non word characters:
import re
string_value = "hello ' how ' are - you ? and/ nice to % meet # you"
output = re.sub(r'\s+', ' ', re.sub(r'[^\w\s]+', '', string_value))
print(string_value)
print(output)
This prints:
hello ' how ' are - you ? and/ nice to % meet # you
hello how are you and nice to meet you
The solution I used first targets all non word characters (except whitespace) using the pattern [^\w\s]+. But, there is then the chance that clusters of two or more spaces might be left behind. So, we make a second call to re.sub to remove extra whitespace.
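The two passes described above can be separated to show what the first one leaves behind. A minimal sketch (the names stripped, output and joined are illustrative):

```python
import re

string_value = "hello ' how ' are - you ? and/ nice to % meet # you"

# Step 1: drop everything that is neither a word character nor whitespace.
stripped = re.sub(r'[^\w\s]+', '', string_value)
print(repr(stripped))  # runs of two spaces remain where symbols were removed

# Step 2: collapse each run of whitespace into one space.
output = re.sub(r'\s+', ' ', stripped)

# A variant of step 2 that also trims leading and trailing whitespace.
joined = ' '.join(stripped.split())
print(output)
```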
brackets = {')', '(', '{', '}', '[', '>', ']', '<'}
string_line = '<[2{12.5 6.0}](3 -4 5)>'
Basically, I have to add a space around every bracket in string_line that is in the set, e.g. '[' becomes ' [ '.
Assuming that I don't know which brackets the string contains, how can I avoid repeating line.replace 8 times? (There are 8 types of brackets.)
Thanks!
You could try using regular expressions. The clumsy while loop at the end is there because I do not know how to replace overlapping matches. I would be grateful for any advice on this point.
#! /usr/bin/python3
import re
string_line = '<[2{12.5 6.0}](3 -4 5)>'
while True:
    # a raw string keeps the backslashes in \[ and \] intact
    string_line, count = re.subn(r'[{}<>\[\]()][{}<>\[\]()]', lambda x: '{} {}'.format(*x.group()), string_line)
    if not count: break
print(string_line)
This yields:
< [2{12.5 6.0} ] (3 -4 5) >
This inserts a space between two adjacent brackets. If this is not the expected behaviour, please let me know the expected output.
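On the overlapping-matches point: one possible way around the loop is to match the empty position between two brackets with lookarounds, so that no bracket is consumed. A sketch under that assumption (the names bracket, gap and spaced are illustrative):

```python
import re

string_line = '<[2{12.5 6.0}](3 -4 5)>'
bracket = r'[{}<>\[\]()]'

# Match the zero-width position that has a bracket on both sides.
gap = re.compile('(?<=' + bracket + ')(?=' + bracket + ')')

# Each such position receives one space; the brackets stay untouched.
spaced = gap.sub(' ', string_line)
print(spaced)
```

Because the match itself is empty, a bracket can serve as the right side of one gap and the left side of the next.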
try with list operations:
import functools
def replaceif(x):
    if x in brackets:
        x=' '+x+' '
    return x
brackets = [')', '(', '{', '}', '[', '>', ']', '<']
string_line = '<[2{12.5 6.0}](3 -4 5)>'
print(functools.reduce(lambda x,y: replaceif(x)+replaceif(y), string_line))
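For padding every bracket in the set, as the question asks, a single re.sub with a character class built from the set is another option. This is a sketch, not a definitive solution; bracket_class, spaced and by_join are illustrative names.

```python
import re

brackets = {')', '(', '{', '}', '[', '>', ']', '<'}
string_line = '<[2{12.5 6.0}](3 -4 5)>'

# Build one character class from the set; re.escape handles [ ] ( ) { }.
bracket_class = '[' + ''.join(re.escape(b) for b in sorted(brackets)) + ']'

# Wrap each bracket (captured as group 1) in spaces.
spaced = re.sub('(' + bracket_class + ')', r' \1 ', string_line)
print(spaced)

# The same result without regular expressions.
by_join = ''.join(' ' + c + ' ' if c in brackets else c for c in string_line)
```

Either form works without knowing in advance which brackets the string contains.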