Python instructional code to print long strings with line breaks - python-3.x

What's wrong with this code? Except for the print statement, it is taken directly from the answer code of a Udacity Python lesson. It suggests <br> as an HTML response, but that didn't make sense to me in Python. When I run it, the output has <br> between every letter of the string.
def breakify(strings):
    return "<br>".join(strings)
print(breakify("Haiku frogs in snow" "A limerick came from Nantucket" "Tetrametric drum-beats thrumming,"))
Output:
H<br>a<br>i<br>k<br>u<br> <br>f<br>r<br>o<br>g<br>s<br> <br>i<br>n<br> <br>s<br>n<br>o<br>w<br>A<br> <br>l<br>i<br>m<br>e<br>r<br>i<br>c<br>k<br> <br>c<br>a<br>m<br>e<br> <br>f<br>r<br>o<br>m<br> <br>N<br>a<br>n<br>t<br>u<br>c<br>k<br>e<br>t<br>T<br>e<br>t<br>r<br>a<br>m<br>e<br>t<br>r<br>i<br>c<br> <br>d<br>r<br>u<br>m<br>-<br>b<br>e<br>a<br>t<br>s<br> <br>t<br>h<br>r<br>u<br>m<br>m<br>i<br>n<br>g<br>,

The strings are being merged into one because of string literal concatenation: adjacent string literals with only whitespace between them are joined into a single string.
Put them in a list (or tuple) and separate them with commas.
Example with shorter strings for readability:
print(breakify(["Haiku", "limerick", "drum"]))
Output:
Haiku<br>limerick<br>drum
You got the output you did because str.join takes any iterable, and a string is an iterable. For example:
>>> '.'.join('hello')
'h.e.l.l.o'
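To make the two effects visible side by side, here is a small sketch. The sample strings are shortened stand-ins, not the ones from the question.

```python
def breakify(strings):
    """Join an iterable of strings with HTML line breaks."""
    return "<br>".join(strings)

# Adjacent literals with no commas are merged by the parser into ONE string,
# so join() iterates over its individual characters.
merged = "ab" "cd"
per_char = breakify(merged)

# A list with commas holds separate strings, so join() works per item.
per_item = breakify(["ab", "cd"])

print(merged)    # abcd
print(per_char)  # a<br>b<br>c<br>d
print(per_item)  # ab<br>cd
```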

Related

Python substitute column text

On executing the following line of code, I get TypeError: repl must be a string or callable
ps_df['Name'].str.replace(ps_df['Substitute'].str,'\n'+ps_df['Substitute'].str)
When I changed it to the following,
ps_df['Name'].str.replace(ps_df['Substitute'].str,'\n'+ps_df['Substitute'].str)
I get this error, TypeError: can only concatenate str (not "StringMethods") to str
According to the pandas documentation for pandas.Series.str.replace, the first argument should be a string or compiled regex.
But you are trying to pass in a whole Series of strings, and through its .str accessor at that: ps_df['Substitute'].str is a StringMethods object, not a string. Hence the error.
You should use apply to replace the strings row by row.
ps_df.apply(lambda x: x.Name.replace(x.Substitute, '\n' + x.Substitute), axis=1)

How to use f'string bytes'string together? [duplicate]

I'm looking for a formatted byte string literal. Specifically, something equivalent to
name = "Hello"
bytes(f"Some format string {name}")
Possibly something like fb"Some format string {name}".
Does such a thing exist?
No. The idea is explicitly dismissed in PEP 498:
For the same reason that we don't support bytes.format(), you may
not combine 'f' with 'b' string literals. The primary problem
is that an object's __format__() method may return Unicode data
that is not compatible with a bytes string.
Binary f-strings would first require a solution for
bytes.format(). This idea has been proposed in the past, most
recently in PEP 461. The discussions of such a feature usually
suggest either
adding a method such as __bformat__() so an object can control how it is converted to bytes, or
having bytes.format() not be as general purpose or extensible as str.format().
Both of these remain as options in the future, if such functionality
is desired.
In 3.6+ you can do:
>>> a = 123
>>> f'{a}'.encode()
b'123'
You were actually super close in your suggestion; if you add an encoding kwarg to your bytes() call, then you get the desired behavior:
>>> name = "Hello"
>>> bytes(f"Some format string {name}", encoding="utf-8")
b'Some format string Hello'
Caveat: This works in 3.8 for me, and a note at the bottom of the Bytes Objects section in the docs seems to suggest that it should work with any method of string formatting in all of 3.x (using str.format() for versions <3.6, since f-strings were only added in 3.6, but the OP specifically asks about 3.6+).
Percent formatting for bytes, available since Python 3.5 (PEP 461), works for some use cases:
print(b"Some stuff %a. Some other stuff" % my_byte_or_unicode_string)
But as AXO commented:
This is not the same. %a (or %r) will give the representation of the string, not the string itself. For example b'%a' % b'bytes' will give b"b'bytes'", not b'bytes'.
Which may or may not matter depending on if you need to just present the formatted byte_or_unicode_string in a UI or if you potentially need to do further manipulation.
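The difference between the conversions can be checked directly. A short sketch, with sample values chosen only for illustration:

```python
payload = b"bytes"

# %b (and its alias %s) inserts the bytes unchanged.
raw = b"[%b]" % payload

# %a inserts the ASCII-only repr of the object, prefix and quotes included.
as_repr = b"[%a]" % payload

# For a str argument, %a escapes any non-ASCII characters.
text_repr = b"[%a]" % "caf\u00e9"

print(raw)        # b'[bytes]'
print(as_repr)    # b"[b'bytes']"
print(text_repr)  # b"['caf\\xe9']"
```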
Bytes objects also support the %b conversion, so you can format this way:
>>> name = b"Hello"
>>> b"Some format string %b World" % name
b'Some format string Hello World'
You can see more details in PEP 461.
Note that in your example you could simply do something like:
>>> name = b"Hello"
>>> b"Some format string " + name
b'Some format string Hello'
This was one of the bigger changes made from Python 2 to Python 3: the two versions handle Unicode and strings differently.
This is how you'd convert a string to bytes.
string = "some string format"
encoded = string.encode()
print(encoded)
This is how you'd decode the bytes back to a string.
encoded.decode()
I gained a better appreciation of how the handling of Unicode changed between Python 2 and 3 through this Coursera lecture by Charles Severance. You can watch the entire 17-minute video, or skip to around 10:30 if you want to get to the differences between Python 2 and 3 in how they handle characters, and Unicode specifically.
I understand your actual question is how you could format a string that has both strings and bytes.
inBytes = b"testing"
inString = 'Hello'
type(inString) #This will yield <class 'str'>
type(inBytes) #this will yield <class 'bytes'>
Here you can see that I have a string variable and a bytes variable.
This is how you would combine bytes and a string into one string.
formattedString = inString + ' ' + inBytes.decode()
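A str and a bytes object cannot be concatenated directly, so one of them has to be converted first. A minimal sketch of both directions, assuming UTF-8 is the right encoding for the data:

```python
in_bytes = b"testing"
in_string = "Hello"

# To get a str result, decode the bytes first.
as_text = in_string + " " + in_bytes.decode("utf-8")

# To get a bytes result, encode the str instead.
as_bytes = in_string.encode("utf-8") + b" " + in_bytes

print(as_text)   # Hello testing
print(as_bytes)  # b'Hello testing'
```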

Converting string to dictionary from a opened file

A text file contains dictionary as below
{
"A":"AB","B":"BA"
}
Below are code of python file
with open('devices_file') as d:
print (d["A"])
Result should print AB.
As @rassar and @Ivrf suggested in the comments, you can use ast.literal_eval() or the json module to achieve this. Both code snippets output AB.
Solution with ast.literal_eval():
import ast
with open("devices_file", "r") as d:
    content = d.read()
    result = ast.literal_eval(content)
print(result["A"])
Solution with json.load():
import json
with open("devices_file") as d:
    content = json.load(d)
print(content["A"])
See the Python documentation for ast.literal_eval() and json.load().
Also: I noticed that the code snippet in your question is not formatted correctly. The body of the with block must be indented (PEP 8 recommends 4 spaces), and although a space between print and its parentheses is legal, PEP 8 advises against it.
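The two approaches accept slightly different input. The following self-contained sketch writes the sample content to a temporary file first; the temporary path is an assumption made only so the example can run anywhere, not part of the question.

```python
import ast
import json
import os
import tempfile

# Write the sample content to a throwaway file so the example is runnable.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "devices_file")
with open(path, "w") as f:
    f.write('{\n"A":"AB","B":"BA"\n}\n')

with open(path) as f:
    from_json = json.load(f)

with open(path) as f:
    from_literal = ast.literal_eval(f.read())

print(from_json["A"])     # AB
print(from_literal["A"])  # AB

# Single-quoted keys are a valid Python literal but not valid JSON.
python_style = "{'A': 'AB'}"
print(ast.literal_eval(python_style)["A"])  # AB
try:
    json.loads(python_style)
except json.JSONDecodeError:
    print("not valid JSON")
```

If the file is produced by another program as JSON, json.load() is the more natural fit; ast.literal_eval() is useful when the file holds a Python literal.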

print(f"...:")-statement too long - break it into multiple lines without messing up the format

I have a console program with formatted output. To always get the same printout length, I use a rather complex formatted print statement.
print(f"\n{WHITE_BG}{64*'-'}")
print(f"\nDirektvergleich{9*' '}{RED}{players[0].name}{4*' '}{GREEN}vs.{4*' '}{RED}{players[1].name}{CLEAR}\n")
print(f"""{15*'~'}{' '}{YELLOW}Gesamt{CLEAR}:{' '}{players[0].name}{' '}{GREEN}{int(player1_direct_wins)}{(int(4-len(player1_direct_wins)))*' '}-{(int(4-len(player1_direct_losses)))*' '}{int(player1_direct_losses)}{CLEAR}{' '}{players[1].name}{' '}{(28-len(players[0].name)-len(players[1].name))*'~'}\n""")
print(f"""{15*'~'}{' '}{YELLOW}Trend{CLEAR}:{' '}{players[0].name}{' '}{GREEN}{int(player1_trend_wins)}{(int(4-len(player1_trend_wins)))*' '}-{(int(4-len(player1_trend_losses)))*' '}{int(player1_trend_losses)}{CLEAR}{' '}{players[1].name}{' '}{(28-len(players[0].name)-len(players[1].name))*'~'}""")
print(f"\n{WHITE_BG}{64*'-'}")
This leads to the following output in my windows cmd
For readability, I tried to spread the print over multiple lines, and found on Stack Overflow the idea of using triple quotes. But when I break the print(f"...") statement in the middle, I mess up my formatting.
Example:
print(f"\n{WHITE_BG}{64*'-'}") # feed in as a string?!
print(f"\nDirektvergleich{9*' '}{RED}{players[0].name}{4*' '}{GREEN}vs.{4*' '}{RED}{players[1].name}{CLEAR}\n")
print(f"""{15*'~'}{' '}{YELLOW}Gesamt{CLEAR}:{' '}{players[0].name}{' '}{GREEN}{int(player1_direct_wins)}{(int(4-len(player1_direct_wins)))*' '}-
{(int(4-len(player1_direct_losses)))*' '}{int(player1_direct_losses)}{CLEAR}{' '}{players[1].name}{' '}{(28-len(players[0].name)-len(players[1].name))*'~'}\n""")
print(f"""{15*'~'}{' '}{YELLOW}Trend{CLEAR}:{' '}{players[0].name}{' '}{GREEN}{int(player1_trend_wins)}{(int(4-len(player1_trend_wins)))*' '}-
{(int(4-len(player1_trend_losses)))*' '}{int(player1_trend_losses)}{CLEAR}{' '}{players[1].name}{' '}{(28-len(players[0].name)-len(players[1].name))*'~'}""")
print(f"\n{WHITE_BG}{64*'-'}")
leads to...
Can anyone point me in the right direction on how to format my output in the displayed way, but without these absurdly long lines?
Thank you guys in advance!
Triple-quoted strings preserve newline characters, so they are indeed not what you want here. However, when it finds two adjacent string literals, the Python parser automagically concatenates them into a single string, i.e.:
s = "foo" "bar"
is equivalent to
s = "foobar"
And this works if you put your strings within parens:
s = ("foo" "bar")
in which case you can put each string on its own line as well:
s = (
    "foo"
    "bar"
)
This also applies to f-strings, so what you want is something like:
print(
    f"{15*'~'}{' '}{YELLOW}Gesamt{CLEAR}:{' '}{players[0].name}{' '}{GREEN}"
    f"{int(player1_direct_wins)}{(int(4-len(player1_direct_wins)))*' '}-"
    f"{(int(4-len(player1_direct_losses)))*' '}{int(player1_direct_losses)}"
    f"{CLEAR}{' '}{players[1].name}{' '}"
    # a {...} field must sit entirely inside one literal, so it is not split
    f"{(28-len(players[0].name)-len(players[1].name))*'~'}\n"
)
That being said, I'd rather use intermediate variables than try to cram such complex expressions into an f-string.
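As a rough sketch of the intermediate-variable approach: the colour constants, player objects, score values and the score_line helper below are hypothetical stand-ins for the question's data, and the format specs <4 and >4 replace the manual padding arithmetic on the assumption that the scores are plain digit strings.

```python
from types import SimpleNamespace

# Hypothetical stand-ins for the question's data and ANSI colour constants.
YELLOW, GREEN, CLEAR = "\033[33m", "\033[32m", "\033[0m"
players = [SimpleNamespace(name="Anna"), SimpleNamespace(name="Ben")]
player1_direct_wins = "12"
player1_direct_losses = "7"

def score_line(label, p1, p2, wins, losses):
    """Build one score row from small named pieces."""
    prefix = f"{15 * '~'} {YELLOW}{label}{CLEAR}: {p1.name} "
    # Pad wins on the right and losses on the left to 4 columns each.
    score = f"{GREEN}{int(wins):<4}-{int(losses):>4}{CLEAR}"
    filler = (28 - len(p1.name) - len(p2.name)) * "~"
    return f"{prefix}{score} {p2.name} {filler}"

line = score_line("Gesamt", players[0], players[1],
                  player1_direct_wins, player1_direct_losses)
print(line)
```

Each piece now fits on a short line, and the same helper can build the Trend row by passing a different label and values.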

replacing unigrams and n-grams in python without changing words

This seems like it should be straightforward, but it is not. I want to implement string replacement in Python. The strings to be replaced can be unigrams or n-grams, but I do not want to replace a string contained within a word.
So for example:
x='hello world'
x.replace('llo','ll')
returns:
'hell world'
but I don't want that to happen.
Splitting the string on whitespace works for individual words (unigrams), but I also want to replace n-grams,
so:
'this world is a happy place to be'
to be converted to:
'this world is a miserable cesspit to be'
and splitting on whitespace does not work.
Is there a built-in function in Python 3 that allows me to do this?
I could do:
if len(new_string.split(' '))>1:
    x = x.replace(old_string,new_string)
else:
    x_array=x.split(' ')
    x_array=[new_string if y==old_string else y for y in x_array]
    x=' '.join(x_array)
You could do this:
import re
re_search = r'(?P<pre>[^ ])llo(?P<post>[^ ])'
re_replace = r'\g<pre>ll\g<post>'
print(re.sub(re_search, re_replace, 'hello world'))
print(re.sub(re_search, re_replace, 'helloworld'))
output:
hello world
hellworld
Note how you need to add pre and post again.
Now I see the comments... \b may work more nicely.
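A minimal sketch of the \b idea follows. The helper name replace_whole is hypothetical, and the approach assumes the search string starts and ends with a word character, since \b only matches between a word character and a non-word character.

```python
import re

def replace_whole(text, old, new):
    """Replace `old` only where it is not part of a longer word."""
    # re.escape keeps any regex metacharacters in `old` literal;
    # \b matches at the boundary between a word and a non-word character.
    pattern = r"\b" + re.escape(old) + r"\b"
    return re.sub(pattern, new, text)

unchanged = replace_whole("hello world", "llo", "ll")
ngram = replace_whole("this world is a happy place to be",
                      "happy place", "miserable cesspit")

print(unchanged)  # hello world
print(ngram)      # this world is a miserable cesspit to be
```

The same function covers unigrams and n-grams, so no splitting on whitespace is needed.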

Resources