Replace string only if all characters match (Thai) - python-3.x

The problem is that มาก technically is in มาก็. Because มาก็ is มาก + ็.
So when I do
"แชมพูมาก็เยอะ".replace("มาก", " X ")
I end up with
แชมพู X ็เยอะ
And what I want
แชมพู X เยอะ
What I really want is to force the last character ก็ to count as a single character, so that มาก no longer matches มาก็.

While I haven't found a proper solution, I did find a workaround: I split each string into its separate (combined) characters with the regex \X, then compare the resulting lists to each other.
import regex  # third-party module; the standard re module does not support \X
# Check whether list s occurs as a contiguous slice of list l
def is_slice_in_list(s, l):
    len_s = len(s)  # so we don't recompute the length of s on every iteration
    return any(s == l[i:len_s + i] for i in range(len(l) - len_s + 1))
def is_word_in_string(w, s):
    a = regex.findall(r'\X', w)  # combined characters of the word
    b = regex.findall(r'\X', s)  # combined characters of the string
    return is_slice_in_list(a, b)
assert is_word_in_string("มาก็", "พูมาก็เยอะ") == True
assert is_word_in_string("มาก", "พูมาก็เยอะ") == False
The regex will split like this:
พู ม า ก็ เ ย อ ะ
ม า ก
And as it compares ก็ to ก the function figures the words are not the same.
I will mark this as answered, but if someone posts a nicer or "proper" solution I will choose that one.
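For the replace itself, here is a minimal sketch that uses only the standard library. It assumes that "the match is not immediately followed by a combining mark" is a good enough test for a whole combined character, which holds for ก็ but is looser than the full \X rules. The names is_mark and replace_whole_clusters are made up for this example.

```python
import unicodedata

def is_mark(ch):
    # Unicode general categories Mn, Mc and Me are combining marks.
    return unicodedata.category(ch).startswith("M")

def replace_whole_clusters(text, word, repl):
    """Replace word with repl, skipping matches that a combining mark extends."""
    if not word:
        return text
    pieces = []
    pos = 0
    while True:
        hit = text.find(word, pos)
        if hit == -1:
            pieces.append(text[pos:])  # no more matches: keep the rest
            break
        end = hit + len(word)
        if end < len(text) and is_mark(text[end]):
            # The last matched character carries a mark, so this is a
            # different combined character: keep it and search again.
            pieces.append(text[pos:hit + 1])
            pos = hit + 1
        else:
            pieces.append(text[pos:hit])
            pieces.append(repl)
            pos = end
    return "".join(pieces)

print(replace_whole_clusters("แชมพูมาก็เยอะ", "มาก", " X "))   # string is left unchanged
print(replace_whole_clusters("แชมพูมาก็เยอะ", "มาก็", " X "))  # แชมพู X เยอะ
```

The sketch does not check whether the search word itself starts with a combining mark, so it only covers the case described in the question.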

Related

Recursive function how to manage output

I'm working on a project that generates word lists. I have a word and some rules: the character % stands for a digit and ^ stands for a special character. For example, January%%^ should create things like:
January00!
January01!
January02!
January03!
January04!
January05!
January06!
etc.
For now I'm trying to handle only digits, with a recursive function, because people can add as many digits and special characters as they want:
January^%%%^% (for example)
This is the first function I have created:
month = "January"
nbDigit = "%%%"
def addNumber(month : list, position: int):
    for i in range(position, len(month)):
        for j in range(0,10):
            month[position] = j
            if(position == len(month)-1):
                print (''.join(str(v) for v in month))
            if position < len(month):
                if month[position+1] == "%":
                    addNumber(month, position+1)
The problem is that the output is repeated once for each % that I have (with three %, I get January000-January999 three times).
When I tried to add a second function for special characters it got even worse: I can't tell when to print, because a word may end with either a special character or a digit. (AddSpecialChar is also a recursive function.)
I believe what you are looking for is the following:
month = 'January'
nbDigit = "%%"
def addNumbers(root: str, mask: str)-> list:
    # create a list of words using root followed by digits
    rslt = []
    # number of words to generate: 9 + 90 + ... + 1, i.e. 10 ** len(mask)
    mxNmb = 0
    for i in range(len(mask)):
        mxNmb += 9 * 10**i
    mxNmb += 1
    for i in range(mxNmb):
        # left-pad the counter with zeros to the width of the mask
        word = f"{root}{((str(i).rjust(len(mask), '0')))}"
        rslt.append(word)
    return rslt
this will produce:
['January00',
'January01',
'January02',
'January03',
'January04',
'January05',
'January06',
'January07',
'January08',
'January09',
'January10',
'January11',
'January12',
'January13',
'January14',
'January15',
'January16',
'January17',
'January18',
'January19',
'January20',
'January21',
'January22',
'January23',
'January24',
'January25',
'January26',
'January27',
'January28',
'January29',
'January30',
'January31',
'January32',
'January33',
'January34',
'January35',
'January36',
'January37',
'January38',
'January39',
'January40',
'January41',
'January42',
'January43',
'January44',
'January45',
'January46',
'January47',
'January48',
'January49',
'January50',
'January51',
'January52',
'January53',
'January54',
'January55',
'January56',
'January57',
'January58',
'January59',
'January60',
'January61',
'January62',
'January63',
'January64',
'January65',
'January66',
'January67',
'January68',
'January69',
'January70',
'January71',
'January72',
'January73',
'January74',
'January75',
'January76',
'January77',
'January78',
'January79',
'January80',
'January81',
'January82',
'January83',
'January84',
'January85',
'January86',
'January87',
'January88',
'January89',
'January90',
'January91',
'January92',
'January93',
'January94',
'January95',
'January96',
'January97',
'January98',
'January99']
Adding another % to the nbDigit variable will produce the numeric sequence from 000 to 999.
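To cover masks that mix % and ^, such as January^%%%^%, one option is to skip recursion and let itertools.product enumerate the combinations. This is only a sketch: the function name expand_mask and the set of special characters in SPECIALS are assumptions, since the question does not say which special characters are allowed.

```python
import itertools

DIGITS = "0123456789"
SPECIALS = "!@#"  # assumed set; the question does not define it

def expand_mask(root, mask, specials=SPECIALS):
    """Return every word formed by root plus one choice per mask position."""
    pools = []
    for ch in mask:
        if ch == "%":
            pools.append(DIGITS)    # any digit
        elif ch == "^":
            pools.append(specials)  # any special character
        else:
            pools.append(ch)        # any other character is kept literally
    # product() yields one tuple per combination, in left-to-right order
    return [root + "".join(combo) for combo in itertools.product(*pools)]

words = expand_mask("January", "%%^", specials="!")
print(words[:3])   # ['January00!', 'January01!', 'January02!']
print(len(words))  # 100
```

Because each combination is produced exactly once, there is no repeated output to manage, and the order of digits and special characters in the mask does not matter.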

I want to compress each letter in a string with a specific length

I have the following string:
x = 'aaabbbbbaaaaaacccccbbbbbbbbbbbbbbb'. I want to get an output like this: abaacbbb, in which every run of three "a" characters is compressed to a single "a" and every run of five "b" characters is compressed to a single "b". I used the following function, but it removes all the adjacent duplicates, so the output is abacb:
def remove_dup(x):
    if len(x) < 2:
        return x
    if x[0] != x[1]:
        return x[0] + remove_dup(x[1:])
    return remove_dup(x[1:])
x = 'aaabbbbbaaaaaacccccbbbbbbbbbbbbbbb'
print(remove_dup(x))
It would be wonderful if somebody could help me with this.
Thank you!
Unless this is a homework question with special constraints, this would be more conveniently and arguably more readably implemented with a regex substitution that replaces desired quantities of specific characters with a single instance of the captured character:
import re
def remove_dup(x):
    return re.sub(r'(a){3}|([bc]){5}', r'\1\2', x)
x = 'aaabbbbbaaaaaacccccbbbbbbbbbbbbbbb'
print(remove_dup(x))
This outputs:
abaacbbb
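If the run lengths need to be configurable, the same idea can be driven by a dictionary. This is a sketch under the assumption that every key is a single character and every length is at least 1; compress_runs is a hypothetical name.

```python
import re

def compress_runs(text, run_lengths):
    """Collapse each run of run_lengths[ch] copies of ch into a single ch."""
    if not run_lengths:
        return text
    # Build one alternative per character, e.g. (?:a){3}|(?:b){5}
    pattern = "|".join(
        f"(?:{re.escape(ch)}){{{n}}}" for ch, n in run_lengths.items()
    )
    # Every match is n copies of one character; keep only the first copy.
    return re.sub(pattern, lambda m: m.group(0)[0], text)

x = 'aaabbbbbaaaaaacccccbbbbbbbbbbbbbbb'
print(compress_runs(x, {"a": 3, "b": 5, "c": 5}))  # abaacbbb
```

Characters that are not in the dictionary, and leftover runs shorter than the configured length, are copied through unchanged.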

Select part of a string and change it in lowercase or uppercase python 3.x

I want to convert a string so that the characters at even positions are upper case and the characters at odd positions are lower case.
Here is what I've tried so far:
def foldingo(chaine):
    chaineuh=chaine[0::2].upper()
    chaine=chaineuh[1::2].lower()
    return chaine
Your code takes every other character in chaine, uppercases them, and assigns those characters to chaineuh.
Then it takes every other character in chaineuh, lowercases them, and assigns those characters to chaine again. In other words:
abcdefg -> ACEG -> cg
You'll notice it's not keeping the characters that you're not trying to target.
You could try building all the uppercases and lowercases separately, then iterate with zip to get them together.
def fold(s):
    uppers = s[0::2].upper()
    lowers = s[1::2].lower()
    return zip(uppers, lowers)
But this doesn't quite work either, since zip gives you tuples, not strings, and will drop the last character of odd-length strings:
abcdefg -> ACEG, bdf -> ('A', 'b'), ('C', 'd'), ('E', 'f')
We could fix that by using a couple calls to str.join and using itertools.zip_longest with a fillvalue='', but it's kind of like using a wrench to hammer in a nail. It's not really the right tool for the job. For the record: it would look like:
''.join([''.join(pair) for pair in itertools.zip_longest(uppers, lowers, fillvalue='')])
yuck.
Let's instead just iterate over the string and uppercase every other letter. We can use an alternating boolean to track whether we're upper'ing or lower'ing this time around.
def fold(s):
    time_to_upper = True
    result = ""
    for ch in s:
        if time_to_upper:
            result += ch.upper()
        else:
            result += ch.lower()
        time_to_upper = not time_to_upper
    return result
You could also use enumerate and a modulo to keep track:
def fold(s):
    result = ""
    for i, ch in enumerate(s):
        ch = ch.lower() if i % 2 else ch.upper()
        result += ch
    return result
Or by using itertools.cycle, str.join, and a list comprehension, we can make this a lot shorter (possibly at the cost of readability!):
import itertools
def fold(s):
    return ''.join([op(ch) for op, ch in zip(itertools.cycle([str.upper, str.lower]), s)])

How can I delete the letter that occurs in the two strings using python?

That's the source code:
def revers_e(str_one,str_two):
    for i in range(len(str_one)):
        for j in range(len(str_two)):
            if str_one[i] == str_two[j]:
                str_one = (str_one - str_one[i]).split()
                print(str_one)
            else:
                print('There is no relation')
if __name__ == '__main__':
    str_one = input('Put your First String: ').split()
    str_two = input('Put your Second String: ')
    print(revers_e(str_one, str_two))
How can I remove a letter that occurs in both strings from the first string then print it?
How about a simple, Pythonic way of doing it:
def revers_e(s1, s2):
    print(*[i for i in s1 if i in s2])  # Print all characters to be deleted from s1
    s1 = ''.join([i for i in s1 if i not in s2])  # Delete them from s1
    return s1  # Return the result so the caller can print it
This answer says, "Python strings are immutable (i.e. they can't be modified). There are a lot of reasons for this. Use lists until you have no choice, only then turn them into strings."
First of all, you don't need the rather suboptimal combination of range and len to iterate over a string. Strings are iterable, so you can iterate over them with a simple for loop.
To find the characters the two strings have in common, you can use set.intersection, and then use str.translate to remove those characters:
intersect=set(str_one).intersection(str_two)
trans_table = dict.fromkeys(map(ord, intersect), None)
str_one.translate(trans_table)
def revers_e(str_one,str_two):
    for i in range(len(str_one)):
        for j in range(len(str_two)):
            try:
                if str_one[i] == str_two[j]:
                    first_part=str_one[0:i]
                    second_part=str_one[i+1:]
                    str_one =first_part+second_part
                    print(str_one)
                else:
                    print('There is no relation')
            except IndexError:
                return
str_one = input('Put your First String: ')
str_two = input('Put your Second String: ')
revers_e(str_one, str_two)
I've modified your code, taking out a few bits and adding a few more.
str_one = input('Put your First String: ').split()
I removed the .split(), because all this would do is create a list of length 1, so in your loop, you'd be comparing the entire string of the first string to one letter of the second string.
str_one = (str_one - str_one[i]).split()
You can't remove a character from a string with the - operator in Python, so I slice the string into two parts instead (you could also convert it to a list, as I did in other code that I have since deleted): everything before the matching character, and everything after it. The two parts are then concatenated into one string.
I used a try/except block because the outer loop runs over the original length of str_one, but the string gets shorter as characters are removed, so an index can run past the end and raise IndexError.
Lastly, I just call the function instead of printing its result, because the function returns None.
These work in Python 2.7+ and Python 3
Given:
>>> s1='abcdefg'
>>> s2='efghijk'
You can use a set:
>>> set(s1).intersection(s2)
{'f', 'e', 'g'}
Then use that set in maketrans to make a translation table to None to delete those characters:
>>> s1.translate(str.maketrans({e:None for e in set(s1).intersection(s2)}))
'abcd'
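Or use a list comprehension to collect the common characters:
>>> ''.join([e for e in s1 if e in s2])
'efg'
And a regex to produce a new string without the common characters (the characters go inside a character class, so this also works when they are not adjacent in s1):
>>> import re
>>> re.sub('[' + re.escape(''.join([e for e in s1 if e in s2])) + ']', '', s1)
'abcd'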
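The set-based approaches above can be wrapped in one small function that returns both the cleaned string and the characters that were removed. This is just a sketch; the name remove_common is made up for the example.

```python
def remove_common(s1, s2):
    """Return s1 without the characters it shares with s2, plus the shared set."""
    shared = set(s1) & set(s2)  # characters present in both strings
    kept = "".join(ch for ch in s1 if ch not in shared)
    return kept, shared

kept, shared = remove_common("abcdefg", "efghijk")
print(kept)               # abcd
print(sorted(shared))     # ['e', 'f', 'g']
print(remove_common("banana", "ant")[0])  # b
```

The "banana" case shows that every occurrence of a shared character is removed, even when the occurrences are not adjacent.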

Delete specific characters from a string (Python)

I understand that str = str.replace('x', '') will eliminate all the x's.
But let's say I have a string jxjrxxtzxz and I only want to delete the first and last x making the string jjrxxtzz. This is not string specific. I want to be able to handle all strings, and not just that specific example.
edit: assume that x is the only letter I want to remove. Thank you!
One fairly straightforward way is to just use find and rfind to locate the characters to remove:
s = 'jxjrxxtzxz'
# Remove first occurrence of 'x'
ix = s.find('x')
if ix > -1:
    s = s[:ix]+s[ix+1:]
# Remove last occurrence of 'x'
ix = s.rfind('x')
if ix > -1:
    s = s[:ix]+s[ix+1:]
Not pretty but this will work:
def remove_first_last(c, s):
    return s.replace(c,'', 1)[::-1].replace(c,'',1)[::-1]
Usage:
In [1]: remove_first_last('x', 'jxjrxxtzxz')
Out[1]: 'jjrxxtzz'
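The find/rfind idea can also be written as a single function that builds the result in one pass. A minimal sketch, assuming c is a single character (as in the question); remove_first_and_last is a hypothetical name.

```python
def remove_first_and_last(s, c):
    """Remove the first and the last occurrence of the single character c."""
    first = s.find(c)
    if first == -1:
        return s                          # c does not occur at all
    last = s.rfind(c)
    if first == last:
        return s[:first] + s[first + 1:]  # only one occurrence to remove
    # Keep everything except the characters at positions first and last.
    return s[:first] + s[first + 1:last] + s[last + 1:]

print(remove_first_and_last('jxjrxxtzxz', 'x'))  # jjrxxtzz
print(remove_first_and_last('axb', 'x'))         # ab
```

Unlike the double-reverse version, this never creates reversed copies of the string, and the single-occurrence case is handled explicitly.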

Resources