An Elegant Solution to Python's Multiline String? - python-3.x

I was trying to log the completion of a scheduled task that I run in Django. To keep the code presentable, I used a multiline string instead of putting the whole message on a single line, and passed it to the logger inside a management command's handle method. The code looks like this:
# the usual imports...
# ....
import textwrap
logger = logging.getLogger(__name__)
class Command(BaseCommand):
    def handle(self, *args, **kwargs):
        # some code here
        # ....
        final_statement = f'''\
        this is the final statements \
        with multiline string to have \
        a neater code.'''
        dedented_text = textwrap.dedent(final_statement)
        # Remove every pair of spaces (the indentation carried into the string).
        logger.info(dedented_text.replace('  ', ''))
I tried a few methods I found, but most of the quick and easy ones still left big runs of spaces in the terminal output, as shown here:
this is the final statement         with multiline string to have         a neater code.
So I came up with a workaround:
dedented_text.replace('  ', '')
This replaces each pair of spaces with nothing, so the single spaces between words are kept. It finally produced:
this is the final statement with multiline string to have a neater code.
Is this an elegant solution, or did I miss something on the internet?

You could use a regex to collapse each newline, together with the whitespace that follows it, into a single space. Additionally, wrapping it in a function leads to less repetitive code, so let's do that.
import re
def single_line(string):
    # Replace a newline plus any following indentation with one space,
    # then strip the space left over from the leading newline.
    return re.sub(r"\n\s*", " ", string).strip()
final_statement = single_line(f'''
    this is the final statements
    with multiline string to have
    a neater code.''')
print(final_statement)
Alternatively, if you wish to avoid this particular problem (and don't mind the development overhead), you could store the messages in a separate file, such as a JSON file, so you can quickly edit them while keeping your code clean.

Thanks to Neil's suggestion, I have come up with a more elegant solution: a function that replaces every pair of spaces with nothing.
def single_line(string):
    # Remove pairs of spaces (the indentation); single spaces between words stay.
    # Note: newlines inside the literal are kept as they are.
    return string.replace('  ', '')
final_statement = '''\
    this is a much neater
    final statement
    to present my code
'''
print(single_line(final_statement))
Adapted from Neil's solution, this drops the regex import. That's one line less of code!
Also, making it a function improves readability, as the whole print statement reads like English: "Print single line final statement".
Any better ideas?
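One more option that needs no regex is to let str.split() do the whitespace handling. The sketch below is only an illustration: collapse_whitespace is a made-up helper name, not something from the answers above, and it assumes you never want to keep intentional double spaces inside the message.

```python
def collapse_whitespace(text: str) -> str:
    """Join all whitespace-separated chunks of text with single spaces."""
    # str.split() with no arguments splits on any run of whitespace
    # (spaces, tabs, newlines) and ignores leading/trailing whitespace.
    return " ".join(text.split())


final_statement = collapse_whitespace("""
    this is the final statement
    with multiline string to have
    a neater code.
""")
print(final_statement)
# this is the final statement with multiline string to have a neater code.
```

Unlike replacing pairs of spaces, this does not depend on the indentation being an even number of spaces, and it also removes the newlines.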

The issue with both Neil’s and Wong Siwei’s answers is that they don’t work if your multiline string contains lines that are more indented than others:
my_string = """\
  this is my
  string and
    it has various
    indentation
  levels"""
What you want in the case above is to remove the two-space indentation, not every space at the beginning of a line.
The solution below handles that case:
import re
def dedent(s):
    indent_level = None
    # re.MULTILINE makes ^ match at the start of every line, not only the string.
    for m in re.finditer(r"^ +", s, flags=re.MULTILINE):
        line_indent_level = len(m.group())
        if indent_level is None or indent_level > line_indent_level:
            indent_level = line_indent_level
    if not indent_level:
        return s
    # Strip the common indentation from each line, keeping the newlines.
    return re.sub(r"^ {%d}" % indent_level, "", s, flags=re.MULTILINE)
It first scans the whole string to find the lowest indentation level, then uses that information to dedent every line.
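For comparison, the standard library's textwrap.dedent removes the whitespace that is common to all lines, which covers the same case. A minimal sketch, assuming the string is indented with spaces only and has a two-space base indentation (the sample text is illustrative):

```python
import textwrap

my_string = """\
  this is my
  string and
    it has various
    indentation
  levels"""

# Only the leading whitespace shared by every line is removed,
# so the deeper lines keep their extra two spaces.
dedented = textwrap.dedent(my_string)
print(dedented)
```

The relative indentation of the two inner lines is kept; only the shared prefix goes away.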
If you only care about making your code easier to read, you may instead use C-like implicit string concatenation:
my_string = (
    "this is my string"
    " and I write it on"
    " multiple lines"
)
print(repr(my_string))
# => 'this is my string and I write it on multiple lines'
You may also want to make it explicit with +s:
my_string = "this is my string" + \
    " and I write it on" + \
    " multiple lines"

Related

How can I print "\n" using exec()?

ab = open("bonj.txt","w")
exec(f'''print("Hi I'm Mark\n", file=ab)
print("\tToday I'm tired", file=ab)
''')
ab.close()
I absolutely need to use exec() to print some information to a text file. The problem is that when I use exec(), I can no longer put newlines or tabs in my text, and I don't understand why. Could you help me?
This is the error message that I receive: "SyntaxError: EOL while scanning string literal"
You just need to escape \n and \t properly:
ab = open("bonj.txt","w")
exec(f'''print("Hi I'm Mark\\n", file=ab)
print("\\tToday I'm tired", file=ab)
''')
ab.close()
You need to prevent Python from interpreting the \n early.
This can be done by specifying the string as a raw string, using the r prefix:
ab = open("bonj.txt","w")
exec(rf'''print("Hi I'm Mark\n", file=ab)
print("\tToday I'm tired", file=ab)
''')
ab.close()
That said, using exec here is odd; you would be better off writing your code as something like:
lines = ["Hi I'm Mark\n", "\tToday I'm tired"]
with open("bonj.txt", "w") as f:
    f.write("\n".join(lines) + "\n")
Note that you need "\n".join, plus a final "\n", to obtain the same result as with print, because print adds a newline after each call by default (see its end="\n" argument).
Also, when handling files, using the context manager syntax (with open ...) is good practice.
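To see why the original call fails, it can help to compare the source text that exec() actually receives in each case. The sketch below is illustrative only: it writes to an in-memory io.StringIO instead of bonj.txt, and the names plain_source, raw_source and out are made up for the example.

```python
import io

# Plain literal: Python converts \n to a real newline while building the
# string, so the source handed to exec() has a line break inside "Hi...".
plain_source = 'print("Hi\n", file=out)'
# Raw literal: the backslash and the n survive as two characters, so the
# inner string literal is still intact when exec() compiles it.
raw_source = r'print("Hi\n", file=out)'

out = io.StringIO()  # in-memory stand-in for the open file
plain_failed = False
try:
    exec(plain_source, {"out": out})
except SyntaxError:
    plain_failed = True

exec(raw_source, {"out": out})
print(plain_failed)          # True
print(repr(out.getvalue()))  # 'Hi\n\n'
```

The second newline in the output comes from print itself, which appends one after the text it is given.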

How to replace hyphen and newline in string in Python

I am working with a text in which several words are hyphenated across line breaks.
A typical string looks like this:
"this good pe-
riod has"
I tried:
my_string.replace('-'+"\r","")
However, it is not working.
I would like to get
"this good period has"
Have you tried this?
import re
text = """this good pe-
riod has"""
print(re.sub(r"-\s+", '', text))
# this good period has
After you match -, you should match the newline \n :
my_string = """this good pe-
riod has"""
print(my_string.replace("-\n",""))
# this good period has
Depending on how your lines end, you could also use my_string.replace('-\r\n', ''), or match an optional carriage return with re.sub and the pattern -(?:\r?\n|\r)
If there has to be a word character before and after the hyphen, rather than removing every hyphen at the end of a line, you could use lookarounds:
(?<=\w)-\r?\n(?=\w)
For example
import re
regex = r"(?<=\w)-\r?\n(?=\w)"
my_string = """this good pe-
riod has"""
print(re.sub(regex, "", my_string))
Output
this good period has
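The variants above can be combined into one small helper that accepts any of the three common line endings. This is a rough sketch: join_hyphenated and HYPHEN_BREAK are hypothetical names, and it assumes every end-of-line hyphen between word characters is a soft hyphen (a genuinely hyphenated word such as "well-known" broken at the hyphen would also be joined).

```python
import re

# A hyphen between two word characters, followed by \r\n, \n or \r.
HYPHEN_BREAK = re.compile(r"(?<=\w)-(?:\r\n|\n|\r)(?=\w)")


def join_hyphenated(text: str) -> str:
    """Rejoin words that were split by a hyphen at the end of a line."""
    return HYPHEN_BREAK.sub("", text)


samples = [
    "this good pe-\nriod has",    # Unix line ending
    "this good pe-\r\nriod has",  # Windows line ending
    "this good pe-\rriod has",    # old Mac line ending
]
for sample in samples:
    print(repr(join_hyphenated(sample)))
```

A hyphen that is preceded by a space, such as a list marker at the end of a line, is left alone because of the lookbehind.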

How to filter only text in a line?

I have many lines like these:
_ÙÓ´Immediate Transformation With Vee_ÙÓ´
‰ÛÏThe Real Pernell Stacks‰Û
I want to get something like this:
Immediate Transformation With Vee
The Real Pernell Stacks
I tried this:
for t in test:
    t.isalpha()
but characters like Ó count as letters as well.
I also thought about building a list of allowed characters (English letters, a space, and punctuation marks) and deleting everything in the line that is not in it, but I do not think this is the right option, since a line may contain non-English words, and that's fine.
You can use a regex.
For example:
import re
data = """_ÙÓ´Immediate Transformation With Vee_ÙÓ´
‰ÛÏThe Real Pernell Stacks‰Û"""
for line in data.splitlines(keepends=False):
    print(re.sub(r"[^A-Za-z\s]", "", line))
Output:
Immediate Transformation With Vee
The Real Pernell Stacks
Use re.split and drop the empty pieces it produces:
result = ' '.join(filter(None, re.split(r'[^A-Za-z]', s)))
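The reason isalpha() does not help on its own is that it is Unicode-aware, so letters such as Ó pass the test. A possible sketch without regex combines it with str.isascii() (available from Python 3.7). The helper name ascii_letters_only is made up for this example, and the approach assumes that dropping all non-ASCII letters is acceptable, which would also remove legitimate accented characters.

```python
def ascii_letters_only(line: str) -> str:
    """Keep ASCII letters and whitespace, then tidy up the spacing."""
    kept = "".join(
        ch for ch in line
        if (ch.isascii() and ch.isalpha()) or ch.isspace()
    )
    # Collapse any leftover runs of whitespace into single spaces.
    return " ".join(kept.split())


samples = [
    "_ÙÓ´Immediate Transformation With Vee_ÙÓ´",
    "‰ÛÏThe Real Pernell Stacks‰Û",
]
for sample in samples:
    print(ascii_letters_only(sample))
```

If non-English words have to survive, the more robust fix is usually to repair the encoding at the point where the text is read, rather than filtering characters afterwards.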

multiple variable in python regex

I have read several related posts and forums looking for an answer to my question, but nothing matched what I need.
I am trying to use variables instead of hard-coded values in a regex that searches for any one of several words in a line.
I get the desired result as long as I don't use variables.
<http://www.somesite.com/software/sub/a1#Msoffice>
<http://www.somesite.com/software/sub1/a1#vlc>
<http://www.somesite.com/software/sub2/a2#dell>
<http://www.somesite.com/software/sub3/a3#Notepad>
re.search(r"\#Msoffice|#vlc|#Notepad", line)
This regex will return the line which has #Msoffice OR #vlc OR #Notepad.
I tried defining a single variable using re.escape and that worked absolutely fine. However, I have tried many combinations using | and , (pipe and comma) without success.
Is there any way I can put #Msoffice, #vlc and #Notepad in different variables so that I can change them later?
Thanks in advance!!
If I understood you correctly, you'd like to insert variables into your regex.
You are currently using a raw string (r'...') to make the regex more readable. If you use an f-string (f'...') instead, you can insert any variable with {your_var} and build the regex as you like:
var1 = '#Msoffice'
var2 = '#vlc'
var3 = '#Notepad'
re.search(f'{var1}|{var2}|{var3}', line)
The main annoyance is that, without the r prefix, every backslash in the pattern has to be doubled (for example \\b instead of \b), unless you combine both prefixes as rf'...'.
Hope it helped
import re
lines = ["<http://www.somesite.com/software/sub/a1#Msoffice>",
         "<http://www.somesite.com/software/sub1/a1#vlc>",
         "<http://www.somesite.com/software/sub2/a2#dell>",
         "<http://www.somesite.com/software/sub3/a3#Notepad>"]
for line in lines:
    if re.search(r'\b(?:\#{}|\#{}|\#{})\b'.format('Msoffice', 'vlc', 'Notepad'), line):
        print(line)
Output :
<http://www.somesite.com/software/sub/a1#Msoffice>
<http://www.somesite.com/software/sub1/a1#vlc>
<http://www.somesite.com/software/sub3/a3#Notepad>
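If the number of alternatives may change later, one option is to keep them in a list and build the pattern from it, escaping each entry with re.escape. This is a minimal sketch; tags and pattern are names chosen for the example, not part of the question.

```python
import re

# Edit this list to change what is searched for.
tags = ["#Msoffice", "#vlc", "#Notepad"]

# re.escape guards against regex metacharacters in the tags;
# "|".join turns the list into one alternation.
pattern = re.compile("|".join(re.escape(tag) for tag in tags))

lines = [
    "<http://www.somesite.com/software/sub/a1#Msoffice>",
    "<http://www.somesite.com/software/sub1/a1#vlc>",
    "<http://www.somesite.com/software/sub2/a2#dell>",
    "<http://www.somesite.com/software/sub3/a3#Notepad>",
]
matching = [line for line in lines if pattern.search(line)]
for line in matching:
    print(line)
```

As written, a tag also matches when it is only a prefix (for example #vlc inside #vlcplayer); wrapping the alternation in a non-capturing group followed by (?!\w) would be one way to tighten that.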

replacing unigrams and n-grams in python without changing words

This seems like it should be straightforward, but it is not. I want to implement string replacement in Python where the strings to be replaced can be unigrams or n-grams, but I do not want to replace a string that is contained within a word.
So for example:
x = 'hello world'
x.replace('llo', 'll')
returns:
'hell world'
but I don't want that to happen.
Splitting the string on whitespace works for individual words (unigrams), but I also want to replace n-grams,
so:
'this world is a happy place to be'
to be converted to:
'this world is a miserable cesspit to be'
and splitting on whitespace does not work.
Is there a built-in function in Python 3 that allows me to do this?
I could do:
if len(old_string.split(' ')) > 1:
    # n-gram: fall back to a plain substring replacement
    x = x.replace(old_string, new_string)
else:
    # unigram: only replace whole whitespace-separated words
    x_array = x.split(' ')
    x_array = [new_string if y == old_string else y for y in x_array]
    x = ' '.join(x_array)
You could do this:
import re
re_search = r'(?P<pre>[^ ])llo(?P<post>[^ ])'
re_replace = r'\g<pre>ll\g<post>'
print(re.sub(re_search, re_replace, 'hello world'))
print(re.sub(re_search, re_replace, 'helloworld'))
Output:
hello world
hellworld
Note how you need to add pre and post back in the replacement.
Now I see the comments... \b may work nicer.
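A boundary-based version might look like the sketch below. It uses lookarounds, (?<!\w) and (?!\w), which behave like \b for phrases that start and end with word characters, and re.escape so the search phrase is treated literally. The function name replace_whole is hypothetical.

```python
import re


def replace_whole(text: str, old: str, new: str) -> str:
    """Replace old with new only where old is not part of a longer word."""
    pattern = r"(?<!\w)" + re.escape(old) + r"(?!\w)"
    # A function as the replacement keeps backslashes in new literal.
    return re.sub(pattern, lambda match: new, text)


print(replace_whole("hello world", "llo", "ll"))
print(replace_whole("this world is a happy place to be",
                    "happy place", "miserable cesspit"))
```

The first call leaves "hello world" unchanged because llo sits inside a word, while the second replaces the whole two-word phrase.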

Resources