Printing a string character by character including control characters - python-3.x

EDIT - The problem I was having was with the library I was trying to use, colorama. I probably should have been more specific.
I want to be able to print a string character by character with an extremely short pause between each character for effect, but my code ignores control characters and just prints the individual characters. I'm not sure how to counter this.
Here is the part of the code that does it:
import time, sys
def slowprint(message, speed):
    for x in range(0, len(message)):
        if x == len(message)-1:
            print(message[x])
        else:
            print(message[x], end="")
            sys.stdout.flush()
            time.sleep(speed)
I'm on python 3.2.

I'm not completely sure I understand your question, but I'll assume you're trying to print control characters like '\t' or '\n'.
When you create a string like "A\tB", it's made of three characters, not four. '\t' gets converted directly to a single character.
So when you iterate through the characters, you need to map these control characters back to their string representation. For this you can use repr() (see this answer) and you're good:
>>> slowprint(repr("abs\tsd\n"), 0.1)
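A minimal sketch of the point above, using the same "A\tB" string. The variable names are only for illustration. Note that repr() also adds surrounding quotes, which can be sliced off if they are not wanted.

```python
text = "A\tB"          # three characters: 'A', a tab, 'B'
escaped = repr(text)   # the escaped form, including the surrounding quotes

print(len(text))       # 3
print(escaped)         # 'A\tB' (backslash and 't' are now two separate characters)
print(len(escaped))    # 6
print(escaped[1:-1])   # A\tB (quotes sliced off)
```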

Maybe something like this?
import time, sys
def slowprint(message, speed):
    i = iter(message)
    # output first character before pausing
    sys.stdout.write(next(i))
    sys.stdout.flush()
    for letter in i:
        time.sleep(speed)
        sys.stdout.write(letter)
        sys.stdout.flush()
EDIT: fixed

Related

If input() returns a string, why doesn't print() display the quotation marks?

I'm having trouble understanding the following:
According to my book, unless otherwise specified, input will return a string type. If a string is printed wouldn't you expect the quotes to be included in the result? Is this just how print() is designed to work, if so why?
Example problem:
x = input() # user enters 5.5
print(x) # I expect '5.5' to be printed, instead 5.5 is printed
Wouldn't it be better to print the variable x for exactly what it is?
No, you use quotes to create a literal string; quotes are not part of the string value itself. If you want to see the quotes, ask Python for the representation of your string, i.e.
print(repr(x))
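A small sketch of the difference, assuming the user typed 5.5. The literal below stands in for the value input() would return, so the snippet runs without a prompt.

```python
x = "5.5"          # stands in for the string returned by input()

shown = str(x)     # the text print(x) writes
quoted = repr(x)   # the representation, with quotes

print(shown)       # 5.5
print(quoted)      # '5.5'
```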

python converting strings into three blocks and if not two blocks

I want to write a function that converts the given string T and group them into three blocks.
However, I want to split the last block into two if it can't be broken down to three numbers.
For example, this is my code
import re
def num_format(T):
    clean_number = re.sub('[^0-9]+', '', T)
    formatted_number = re.sub(r"(\d{3})(?=(\d{3})+(?!\d{3}))", r"\1-", clean_number)
    return formatted_number
num_format("05553--70002654")
This returns '055-537-000-2654'.
However, I want it to be '055-537-000-26-54'.
I used a regular expression, but I have no idea how to split the last remaining digits into two blocks.
I would really appreciate help figuring this out.
Thanks in advance.
You can use
def num_format(T):
    clean_number = ''.join(c for c in T if c.isdigit())
    return re.sub(r'(\d{3})(?=\d{2})|(?<=\d{2})(?=\d{2}$)', r'\1-', clean_number)
See the regex demo.
Note you can get rid of all non-numeric chars using plain Python comprehension, the solution is borrowed from Removing all non-numeric characters from string in Python.
The regex matches
(\d{3}) - Group 1 (\1): three digits...
(?=\d{2}) - followed by two digits
| - or
(?<=\d{2})(?=\d{2}$) - a location between any two-digit sequence and the two digits at the end of the string.
See the Python demo:
import re
def num_format(T):
    clean_number = ''.join(c for c in T if c.isdigit())
    return re.sub(r'(\d{3})(?=\d{2})|(?<=\d{2})(?=\d{2}$)', r'\1-', clean_number)
print(num_format("05553--70002654"))
# => 055-537-000-26-54
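For comparison, here is a regex-free sketch of the same grouping. It swaps in a plain loop rather than the regex above, and it assumes the rule is "blocks of three, except that a trailing run of four digits becomes two blocks of two". The function name group_digits is hypothetical.

```python
def group_digits(text):
    """Group the digits of text into blocks of three, ending in 2-2 if four remain."""
    digits = "".join(c for c in text if c.isdigit())
    blocks = []
    i = 0
    # Take blocks of three while more than four digits are left.
    while len(digits) - i > 4:
        blocks.append(digits[i:i + 3])
        i += 3
    rest = digits[i:]
    if len(rest) == 4:
        # Four left over: split into two blocks of two.
        blocks += [rest[:2], rest[2:]]
    elif rest:
        # Two or three left over: keep them as one final block.
        blocks.append(rest)
    return "-".join(blocks)


print(group_digits("05553--70002654"))  # 055-537-000-26-54
```

Handling the tail explicitly makes the behaviour for each remainder (zero, one, or two digits beyond a multiple of three) easy to read off and to test.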

Read unknown number of ints separated by spaces or ends of lines

I am trying to find a way to get this right, I found some bits that answer this question only partially, such as:
from sys import stdin
lines = stdin.read().splitlines()
but this would only take in ints separated by lines
inp = list(map(int,input().split()))
while this only reads ints separated by spaces
I am stuck on this and couldn't figure out how to combine the two. I am trying to learn how reading until EOF works.
import sys
numbers = []
for line in sys.stdin:
    numbers += [int(number) for number in line.split()]
print(numbers)
Keep in mind that you have to explicitly send EOF on the terminal for the loop to finish (Ctrl+D in bash). Otherwise, it will be stuck in the for loop forever.
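An alternative sketch: since str.split() with no arguments splits on any run of whitespace, including newlines, the whole input can be read at once. The helper name read_ints is hypothetical, and io.StringIO stands in for sys.stdin here so the example runs without waiting for EOF.

```python
import io


def read_ints(stream):
    """Return every whitespace-separated integer found in a text stream."""
    return [int(token) for token in stream.read().split()]


# In real use, pass sys.stdin instead of this sample stream.
sample = io.StringIO("1 2 3\n4\n5 6\n")
numbers = read_ints(sample)
print(numbers)  # [1, 2, 3, 4, 5, 6]
```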

How to replace hex value in a string

While importing data from a flat file, I noticed some embedded hex-values in the string (<0x00>, <0x01>).
I want to replace them with specific characters, but am unable to do so. Removing them won't work either.
What it looks like in the exported flat file: https://i.imgur.com/7MQpoMH.png
Another example: https://i.imgur.com/3ZUSGIr.png
This is what I've tried:
(and mind, <0x01> represents a non-editable entity. It's not recognized here.)
import io
with io.open('1.txt', 'r+', encoding="utf-8") as p:
    s=p.read()
# included in case it bears any significance
import re
import binascii
s = "Some string with hex: <0x01>"
s = s.encode('latin1').decode('utf-8')
# throws e.g.: >>> UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfc in position 114: invalid start byte
s = re.sub(r'<0x01>', r'.', s)
s = re.sub(r'\\0x01', r'.', s)
s = re.sub(r'\\\\0x01', r'.', s)
s = s.replace('\0x01', '.')
s = s.replace('<0x01>', '.')
s = s.replace('0x01', '.')
or something along these lines in hopes to get a grasp of it while iterating through the whole string:
for x in s:
    try:
        base64.encodebytes(x)
        base64.decodebytes(x)
        s.strip(binascii.unhexlify(x))
        s.decode('utf-8')
        s.encode('latin1').decode('utf-8')
    except:
        pass
Nothing seems to get the job done.
I'd expect the characters to be replaceable with the methods I've dug up, but they are not. What am I missing?
NB: I have to preserve umlauts (äöüÄÖÜ)
-- edit:
Could I introduce the hex-values in the first place when exporting? If so, is there a way to avoid that?
with io.open('out.txt', 'w', encoding="utf-8") as temp:
    temp.write(s)
Judging from the images, these are actually control characters.
Your editor displays them in this greyed-out way showing you the value of the bytes using hex notation.
You don't have the characters "0x01" in your data, but really a single byte with the value 1, so unhexlify and friends won't help.
In Python, these characters can be produced in string literals with escape sequences using the notation \xHH, with two hexadecimal digits.
The fragment from the first image is probably equal to the following string:
"sich z\x01 B. irgendeine"
Your attempts to remove them were close.
s = s.replace('\x01', '.') should work.
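If several different control characters turn up, a character class can handle them in one pass. This is only a sketch: the helper name replace_control_chars is hypothetical, and keeping tab, newline and carriage return is an assumption about what should survive. Umlauts are untouched because they are not in the matched range.

```python
import re

# C0 control characters, except tab (\x09), newline (\x0a) and carriage return (\x0d).
CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def replace_control_chars(text, replacement=""):
    """Replace each matched control character in text with replacement."""
    return CONTROL_CHARS.sub(replacement, text)


print(replace_control_chars("sich z\x01 B. irgendeine", "."))  # sich z. B. irgendeine
print(replace_control_chars("äöü\x00ÄÖÜ"))                     # äöüÄÖÜ
```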

re.sub replacing string using original sub-string

I have a text file. I would like to remove all decimal points and the digits that follow them, unless text precedes the number.
e.g. 12.29,14.6,8967.334 should be replaced with 12,14,8967
e.g. happypants2.3#email.com should not be modified.
My code is:
import re
txt1 = "9.9,8.8,22.2,88.7,morris1.43#email.com,chat22.3#email.com,123.6,6.54"
txt1 = re.sub(r',\d+[.]\d+', r'\d+',txt1)
print(txt1)
Unless there is an easier way of doing this, how do I modify r'\d+' so it just returns the number without a decimal place?
You need to make use of groups in your regex. You put the digits before the '.' into parentheses, and then you can use '\1' to refer to them later:
txt1 = re.sub(r',(\d+)[.]\d+', r',\1',txt1)
Note that in your attempted replacement code you forgot to replace the comma, so your numbers would have been glommed together. This still isn't perfect though; the first number, since it doesn't begin with a comma, isn't processed.
Instead of checking for a comma, the better way is to check word boundaries, which can be done using \b. So the solution is:
import re
txt1 = "9.9,8.8,22.2,88.7,morris1.43#email.com,chat22.3#email.com,123.6,6.54"
txt1 = re.sub(r'\b(\d+)[.]\d+\b', r'\1',txt1)
print(txt1)
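To make the difference between the two patterns concrete, here is a short side-by-side sketch on the same input. The variable names comma_version and boundary_version are only for illustration.

```python
import re

txt1 = "9.9,8.8,22.2,88.7,morris1.43#email.com,chat22.3#email.com,123.6,6.54"

# Comma-anchored pattern: the first number has no leading comma, so it is skipped.
comma_version = re.sub(r',(\d+)[.]\d+', r',\1', txt1)

# Word-boundary pattern: also handles the first number, and still leaves
# morris1.43 and chat22.3 alone because no boundary precedes their digits.
boundary_version = re.sub(r'\b(\d+)[.]\d+\b', r'\1', txt1)

print(comma_version)     # 9.9,8,22,88,morris1.43#email.com,chat22.3#email.com,123,6
print(boundary_version)  # 9,8,22,88,morris1.43#email.com,chat22.3#email.com,123,6
```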
Considering these are the only two types of string present in your file, you can explicitly check for these conditions.
This may not be the most efficient way, but what I have done is split the string and check whether each piece contains #email.com. If that's true, I just append it to a new list. To satisfy your first condition, we can convert the piece to a float and then to an int, which drops the decimal part.
If you want everything back in a str variable, you can use .join().
Code:
txt1 = "9.9,8.8,22.2,88.7,morris1.43#email.com,chat22.3#email.com,123.6,6.54"
txt_list = []
for i in (txt1.split(',')):
    if '#email.com' in i:
        txt_list.append(i)
    else:
        txt_list.append(str(int(float(i))))
txt_new = ",".join(txt_list)
txt_new
Output:
'9,8,22,88,morris1.43#email.com,chat22.3#email.com,123,6'
