String replacement in Python is replacing the entire string - python-3.x

When I do a string replacement I get an unexpected result. For example, my string is my_string = '15:15' and I want to replace the 15 after the colon with 30, so that I get '15:30'. The replacement works fine for other values, for example '09:15' or '09:20'.
I have tried:
my_string = '15:15'
my_new_string = my_string.replace(my_string[-2:], '30')
What I am expecting is 15:30 but my actual output is 30:30

my_new_string = my_string.replace(my_string[-2:], '30') gets you 30:30 because you are replacing all occurrences of 15, so 15:15 becomes 30:30.
You could use str.split and str.format to get your new string:
my_string = '15:15'
my_new_string = '{}:{}'.format(my_string.split(':')[0], '30')
print(my_new_string)
Prints:
15:30

That is the expected behavior. Look at what the arguments for str.replace() mean:
replace(...)
    S.replace(old, new[, count]) -> string

    Return a copy of string S with all occurrences of substring
    old replaced by new. If the optional argument count is
    given, only the first count occurrences are replaced.
It does not replace the substring at a particular position; it replaces all occurrences of whatever you pass as the first parameter.
By calling my_string.replace(my_string[-2:], '30') you're essentially calling '15:15'.replace('15', '30') -- which will replace all occurrences of "15" by "30" so you'll end up with '30:30'.
If you want to replace the last two characters, reverse your logic: keep everything up to the last two characters and then add the '30' string you want at the end:
my_new_string = my_string[:-2] + '30'
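As a minimal sketch of the same idea, the helper below replaces only the last occurrence of a substring by splitting from the right. The name replace_last is hypothetical (it is not a str method), and this is just one of several ways to do it.

```python
def replace_last(text, old, new):
    """Replace only the right-most occurrence of old in text (hypothetical helper)."""
    # rsplit with maxsplit=1 cuts the string once, at the last occurrence of old;
    # joining the pieces with new puts the replacement in that position.
    return new.join(text.rsplit(old, 1))


my_string = '15:15'
print(my_string.replace('15', '30'))        # every occurrence is replaced: 30:30
print(my_string.replace('15', '30', 1))     # count=1 replaces only the first: 30:15
print(replace_last(my_string, '15', '30'))  # only the last one: 15:30
```

If old does not occur in text, rsplit returns the string unchanged in a one-element list, so the helper returns text as-is.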

When you use my_string[-2:] you get the string '15'. str.replace then replaces all occurrences of 15 with 30, giving you '30:30'.
Instead, you can use my_string[2:] to get the string ':15' and replace it with ':30'. If you don't include the colon, you will replace both occurrences of 15 and get '30:30'.
my_new_string = my_string.replace(my_string[2:], ':30')

Related

Regular Expression to remove substring having at least 5 Uppercases

I have a Python list and I want a regular expression that removes any substring containing at least 5 uppercase letters, and another regex that removes the part of the string from '?' up to ':'.
INPUT : list = ['helLo/aPPle/BuTTeRfLY:Missed','bliss/ScIENCEs/brew?Dyna=skjdk:Nest','Self/NESTeDsd/hello/MiSSInG:Good']
Output : list = ['helLo/aPPle/:Missed','bliss//brew:Nest','Self//hello/:Good']
Here we make 2 regexes:
(\w*[A-Z]\w*){5,} - finds a word with at least 5 uppercase letters
\?.*(?=:) - finds a substring that starts with ? and ends just before :
If a string matches the regex pattern, we replace the matched part with '' and update the value in the list.
import re
reg = r'(\w*[A-Z]\w*){5,}|\?.*(?=:)'
input_list = ["helLo/aPPle/BuTTeRfLY:Missed","bliss/ScIENCEs/brew?Dyna=skjdk:Nest","Self/NESTeDsd/hello/MiSSInG:Good"]
for data in input_list:
    match = re.finditer(reg, data)
    if match:
        for match_word in match:
            print(match_word)
            if match_word.group() in data:
                # the match has at least 5 uppercase chars (or is the ?...: part),
                # so replace this substring with ''
                final_str = data.replace(str(match_word.group()), '')
                # find the index of data in the list
                index = input_list.index(data)
                # store the new value in the list
                input_list[index] = data = final_str
print(input_list)
Output: ['helLo/aPPle/:Missed', 'bliss//brew:Nest', 'Self//hello/:Good']
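The same removal can probably be written more compactly with re.sub, which replaces every match in one call. This is only a sketch reusing the pattern from the answer above; the names PATTERN and strip_matches are made up for illustration.

```python
import re

# Same alternation as above: a word with at least 5 uppercase letters,
# or everything from '?' up to (but not including) the last ':'.
PATTERN = re.compile(r'(\w*[A-Z]\w*){5,}|\?.*(?=:)')


def strip_matches(items):
    """Return a new list with every match of PATTERN removed from each string."""
    return [PATTERN.sub('', item) for item in items]


input_list = [
    "helLo/aPPle/BuTTeRfLY:Missed",
    "bliss/ScIENCEs/brew?Dyna=skjdk:Nest",
    "Self/NESTeDsd/hello/MiSSInG:Good",
]
print(strip_matches(input_list))
```

Building a new list avoids modifying input_list while iterating over it.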

Is there a way to replace characters in a string from index 0 to index -4 (i.e. all but last 4 characters) with a '#'

For example, if my string was 'HelloWorld'
I want the output to be ######orld
My Code:
string = 'ThisIsAString'
hashedString = string.replace(string[:-4], '#')
print(hashedString)
Output >> #ring
I expected the output to have just one # symbol since it is replacing argument 1 with argument 2.
Can anyone help me with this?
You could multiply '#' by the word length minus 4 and then use string slicing to append the last 4 characters.
myString = 'HelloWorld'
print('#' * (len(myString) - 4) + myString[-4:])
myString = 'ThisIsAString'
print('#' * (len(myString) - 4) + myString[-4:])
string.replace(old, new) replaces all instances of old with new. So the code you provided is actually replacing the entire beginning of the string with a single pound sign.
You will also notice that input like abcdabcd will give the output ##, since you are replacing all 'abcd' substrings.
Using replace, you could do
hashes = '#' * len(string[:-4])
hashedString = string.replace(string[:-4], hashes, 1)
Note the string multiplication to get the right number of pound symbols, and the 1 passed to replace, which tells it only to replace the first case it finds.
A better method would be to not use replace at all:
hashes = '#' * (len(string) - 4)
leftover = string[-4:]
hashedString = hashes + leftover
This time we do the same work with getting the pound sign string, but instead of replacing we just take the last 4 characters and add them after the pound signs.
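Wrapped up as a function, the slicing approach might look like the sketch below. The name mask_prefix and its keep and fill parameters are hypothetical, and it assumes keep is not negative; strings shorter than keep are returned unchanged.

```python
def mask_prefix(text, keep=4, fill="#"):
    """Replace all but the last `keep` characters of text with `fill`."""
    # Never try to keep more characters than the string actually has.
    visible = min(keep, len(text))
    hidden = len(text) - visible
    # One fill character per hidden character, then the untouched tail.
    return fill * hidden + text[hidden:]


print(mask_prefix('HelloWorld'))     # ######orld
print(mask_prefix('ThisIsAString'))  # #########ring
print(mask_prefix('abcdabcd'))       # ####abcd
```

Because nothing is searched for or replaced, repeated substrings such as 'abcdabcd' do not cause surprises.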

Replace non-numeric characters in string

I would like to know how to replace non-numeric characters in a single string with different random integers.
I have tried the following:
text = '1$1#387'
rec_1 = re.sub("\D+",str(random.randint(0,9)),text)
It then produced:
output: 1717387
As you can see, the non-numeric characters have been replaced by the same integer. I would like each non-numeric character to be replaced by a different integer. For example:
desired output: 1714387
Please assist.
Use a function as the replacement value:
import random
import re

def replacement(match):
    return str(random.randint(0, 9))
text = '1$1#387'
rec_1 = re.sub(r"\D", replacement, text)
rec_1 is now "1011387", or "1511387", ...
That's because random.randint is called only once, before re.sub runs, so the same digit is used for every match.
You can use a lambda to get a new randint each time:
rec_1 = re.sub(r"\D", lambda x: str(random.randint(0, 9)), text)
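To illustrate why the callable form works, here is a small self-contained sketch. The helper name randomize_non_digits and its rng parameter are assumptions added for illustration. Note that the draws are independent, so two characters can still happen to receive the same digit.

```python
import random
import re


def randomize_non_digits(text, rng=random):
    """Replace each non-digit character with its own random digit."""
    # re.sub calls the function once per match, so each character gets a fresh draw.
    return re.sub(r"\D", lambda match: str(rng.randint(0, 9)), text)


text = '1$1#387'
print(randomize_non_digits(text))                    # varies from run to run
print(randomize_non_digits(text, random.Random(0)))  # seeded generator: repeatable
```

Using \D rather than \D+ matters when non-digits are adjacent: \D+ would collapse a run such as '$$' into a single digit.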

Get digits at end of string in a pythonic way

I'm using python 3.x. I'm trying to get the (int) number at the end of a string with format
string_example_1 = 'l-45-98-567-567-12'
string_example_2 = 's-89-657'
or in general, a single lowercase letter followed by a number of integers separated by '-'. What I need is to get the last number (12 and 657 in these cases). I have achieved this with the function
def ending(the_string):
    out = ''
    while the_string[-1].isdigit():
        out = the_string[-1] + out
        the_string = the_string[:-1]
    return out
but I'm sure there must be a more pythonic way to do this. In an earlier step I manually check that the string starts the way I expect by doing something like
if st[0].isalpha() and st[1]=='-' and st[2].isdigit():
    statement...
I would just split the string on -, take the last of the splits and convert it to an integer.
string_example_1 = "l-45-98-567-567-12"
string_example_2 = "s-89-657"
def last_number(s):
    return int(s.split("-")[-1])
print(last_number(string_example_1))
# 12
print(last_number(string_example_2))
# 657
Without regular expressions, you could reverse the string, take elements from the string while they're still numbers, and then reverse the result. In Python:
from itertools import takewhile
def extract_final_digits(s):
    return int(''.join(reversed(list(takewhile(lambda c: c.isdigit(), reversed(s))))))
But the simplest is to just split on a delimiter and take the final element in the split list.
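If the format check mentioned in the question should happen in the same step, a regular expression can validate the string and capture the last number at once. This is a sketch under the assumption that the format is exactly one lowercase ASCII letter followed by one or more '-<digits>' groups; the names _LABEL and last_number_checked are hypothetical.

```python
import re

# Assumed format: one lowercase letter, then one or more '-<digits>' groups.
# The final group of digits is captured.
_LABEL = re.compile(r"[a-z](?:-\d+)*-(\d+)")


def last_number_checked(s):
    """Validate the assumed format and return the final number as an int."""
    match = _LABEL.fullmatch(s)
    if match is None:
        raise ValueError(f"unexpected format: {s!r}")
    return int(match.group(1))


print(last_number_checked("l-45-98-567-567-12"))  # 12
print(last_number_checked("s-89-657"))            # 657
```

Unlike the plain split, this raises ValueError for input such as "l-45-x" instead of failing inside int().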

Split by the delimiter that comes first, Python

I have some unpredictable log lines that I'm trying to split.
The one thing I can predict is that the first field always ends with either a . or a :.
Is there any way I can automatically split the string at whichever delimiter comes first?
Look at the index of the . and : characters in the string using the index() function.
Here’s a simple implementation:
def index_default(line, char):
    """Returns the index of a character in a line, or the length of the string
    if the character does not appear.
    """
    try:
        retval = line.index(char)
    except ValueError:
        retval = len(line)
    return retval
def split_log_line(line):
    """Splits a line at either a period or a colon, depending on which appears
    first in the line.
    """
    if index_default(line, ".") < index_default(line, ":"):
        return line.split(".")
    else:
        return line.split(":")
I wrapped the index() function in an index_default() function because if the line doesn’t contain a character, index() throws a ValueError, and I wasn’t sure if every line in your log would contain both a period and a colon.
And then here’s a quick example:
mylines = [
    "line1.split at the dot",
    "line2:split at the colon",
    "line3:a colon preceded. by a dot",
    "line4-neither a colon nor a dot"
]
for line in mylines:
    print(split_log_line(line))
which returns
['line1', 'split at the dot']
['line2', 'split at the colon']
['line3', 'a colon preceded. by a dot']
['line4-neither a colon nor a dot']
Check the indexes of both characters, then use the lowest index to split your string.
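A possible alternative, sketched here with the standard re module, is to split once on a character class that contains both delimiters. The function name split_at_first_delimiter is hypothetical. Unlike str.split without a limit, this cuts only at the first delimiter and leaves any later dots or colons in the second field.

```python
import re


def split_at_first_delimiter(line):
    """Split once, at whichever of '.' or ':' appears first in the line."""
    # maxsplit=1 stops after the first delimiter, so later dots or colons
    # stay inside the second field.
    return re.split(r"[.:]", line, maxsplit=1)


print(split_at_first_delimiter("line3:a colon preceded. by a dot"))
# ['line3', 'a colon preceded. by a dot']
print(split_at_first_delimiter("a.b.c:d"))
# ['a', 'b.c:d']
```

A line with neither delimiter comes back as a one-element list, so no special case is needed.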
