Check if text contains a string and keep matched words from original text: - python-3.x

a = "Beauty Store is all you need!"
b = "beautystore"
test1 = ''.join(e for e in a if e.isalnum())
test2 = test1.lower()
test3 = [test2]
match = [s for s in test3 if b in s]
if match != []:
print(match)
>>>['beautystoreisallyouneed']
What I want is: "Beauty Store"
I search for the keyword in the string and I want to return the keyword from the string in the original format (with capital letter and space between, whatever) of the string, but only the part that contains the keyword.

If the keyword only occurs once, this will give you the right solution:
a = "Beauty Store is all you need!"
b = "beautystore"
ind = range(len(a))
joined = [(letter, number) for letter, number in zip(a, ind) if letter.isalnum()]
searchtext = ''.join(el[0].lower() for el in joined)
pos = searchtext.find(b)
original_text = a[joined[pos][1]:joined[pos+len(b)][1]]
It saves the original position of each letter, joins them to the lowercase string, finds the position and then looks up the original positions again.

Related

Create a list of strings with one/multiple character replacement

How to create a whole list of string from one string where each string in the list containing exactly one character replacement? The string itself is consisted of only four characters (say: A, B, C, and D), so that the whole list of a string of length n would contain 3n+1 strings with exactly one character replacement.
Example:
inputstr = 'ABCD'
output = ['ABCD', 'BBCD', 'CBCD', 'DBCD', 'AACD', 'ACCD', 'ADCD', 'ABAD', 'ABBD', 'ABDD', 'ABCA', 'ABCB', 'ABCC']
I write the following python code:
strin = 'ABCD'
strout = set()
tempstr1 = ''
tempstr2 = ''
tempstr3 = ''
tempstr4 = ''
for base in range(len(strin)):
if strin[base] == 'A': #this block will be repeated for char B, C and D
tempstr1 = strin.replace(strin[base], 'A')
strout.add(tempstr1)
tempstr1 = ''
tempstr2 = strin.replace(strin[base], 'B')
strout.add(tempstr2)
tempstr2 = ''
tempstr3 = strin.replace(strin[base], 'C')
strout.add(tempseq3)
tempstr3 = ''
tempstr4 = strin.replace(strin[base], 'D')
strout.add(tempseq4)
tempstr4 = ''
return strout
and it works well as long as there is no repeated character (such as 'ABCD'). However, when the input string contains repeated character (such as 'AACD'), it will return less than 3n+1 string. I tried with 'AACD' string and it returns only 10 instead of 13 strings.
Anyone can help?
change
strout = set() ===> strout = list()
I found it. I used a slicing method to create a list of total combination of strings with one replacement.
for i in range(len(seq)):
seqxlist.append(seq[:i] + 'x' + seq[i+1:])
and after that I filter out all the x-replaced strings which are longer than the original string length:
seqxlist = [x for x in seqxlist if (len(x) == len(seq))]
Then, I changed x into any of the substitution characters:
for m in seqxlist:
tempseq1 = m.replace('x', 'A')
outseq.append(tempseq1)
tempseq2 = m.replace('x', 'B')
outseq.append(tempseq2)
tempseq3 = m.replace('x', 'C')
outseq.append(tempseq3)
tempseq4 = m.replace('x', 'D')
outseq.append(tempseq4)
This will create all the possible combinations of string replacement, but still contains duplicates. To remove duplicates, I use set() to the outseq list.

hello friends i cant execute my else condition

The program must accept a string S as the input. The program must replace every vowel in the string S by the next consonant (alphabetical order) and replace every consonant in the string S by the next vowel (alphabetical order). Finally, the program must print the modified string as the output.
s=input()
z=[let for let in s]
alpa="abcdefghijklmnopqrstuvwxyz"
a=[let for let in alpa]
v="aeiou"
vow=[let for let in v]
for let in z:
if(let=="a"or let=="e" or let=="i" or let=="o" or let=="u"):
index=a.index(let)+1
if index!="a"or index!="e"or index!="i"or index!="o"or index!="u":
print(a[index],end="")
else:
for let in alpa:
ind=alpa.index(let)
i=ind+1
if(i=="a"or i=="e" or i=="i"or i=="o"or i=="u"):
print(i,end="")
the output is :
i/p orange
pbf
the required output is:
i/p orange
puboif
I would do it like this:
import string
def dumb_encrypt(text, vowels='aeiou'):
result = ''
for char in text:
i = string.ascii_letters.index(char)
if char.lower() in vowels:
result += string.ascii_letters[(i + 1) % len(string.ascii_letters)]
else:
c = 'a'
for c in vowels:
if string.ascii_letters.index(c) > i:
break
result += c
return result
print(dumb_encrypt('orange'))
# puboif
Basically, I would use string.ascii_letters, instead of defining that anew. Also, I would not convert all to list as it is not necessary for looping through. The consonants you got right. The vowels, I would just do an uncertain search for the next valid consonant. If the search, fails it sticks back to default a value.
Here I use groupby to split the alphabet into runs of vowels and consonants. I then create a mapping of letters to the next letter of the other type (ignoring the final consonants in the alphabet). I then use str.maketrans to build a translation table I can pass to str.translate to convert the string.
from itertools import groupby
from string import ascii_lowercase as letters
vowels = "aeiou"
is_vowel = vowels.__contains__
partitions = [list(g) for k, g in groupby(letters, is_vowel)]
mapping = {}
for curr_letters, next_letters in zip(partitions, partitions[1:]):
for letter in curr_letters:
mapping[letter] = next_letters[0]
table = str.maketrans(mapping)
"orange".translate(table)
# 'puboif'

Change Letters in A String One at a Time (Pandas,Python3)

I have a list of words in Pandas (DF)
Words
Shirt
Blouse
Sweater
What I'm trying to do is swap out certain letters in those words with letters from my dictionary one letter at a time.
so for example:
mydict = {"e":"q,w",
"a":"z"}
would create a new list that first replaces all the "e" in a list one at a time, and then iterates through again replacing all the "a" one at a time:
Words
Shirt
Blouse
Sweater
Blousq
Blousw
Swqater
Swwater
Sweatqr
Sweatwr
Swezter
I've been looking around at solutions here: Mass string replace in python?
and have tried the following code but it changes all instances "e" instead of doing so one at a time -- any help?:
mydict = {"e":"q,w"}
s = DF
for k, v in mydict.items():
for j in v:
s['Words'] = s["Words"].str.replace(k, j)
DF["Words"] = s
this doesn't seem to work either:
s = DF.replace({"Words": {"e": "q","w"}})
This answer is very similar to Brian's answer, but a little bit sanitized and the output has no duplicates:
words = ["Words", "Shirt", "Blouse", "Sweater"]
md = {"e": "q,w", "a": "z"}
md = {k: v.split(',') for k, v in md.items()}
newwords = []
for word in words:
newwords.append(word)
for c in md:
occ = word.count(c)
pos = 0
for _ in range(occ):
pos = word.find(c, pos)
for r in md[c]:
tmp = word[:pos] + r + word[pos+1:]
newwords.append(tmp)
pos += 1
Content of newwords:
['Words', 'Shirt', 'Blouse', 'Blousq', 'Blousw', 'Sweater', 'Swqater', 'Swwater', 'Sweatqr', 'Sweatwr', 'Swezter']
Prettyprint:
Words
Shirt
Blouse
Blousq
Blousw
Sweater
Swqater
Swwater
Sweatqr
Sweatwr
Swezter
Any errors are a result of the current time. ;)
Update (explanation)
tl;dr
The main idea is to find the occurences of the character in the word one after another. For each occurence we are then replacing it with the replacing-char (again one after another). The replaced word get's added to the output-list.
I will try to explain everything step by step:
words = ["Words", "Shirt", "Blouse", "Sweater"]
md = {"e": "q,w", "a": "z"}
Well. Your basic input. :)
md = {k: v.split(',') for k, v in md.items()}
A simpler way to deal with replacing-dictionary. md now looks like {"e": ["q", "w"], "a": ["z"]}. Now we don't have to handle "q,w" and "z" differently but the step for replacing is just the same and ignores the fact, that "a" only got one replace-char.
newwords = []
The new list to store the output in.
for word in words:
newwords.append(word)
We have to do those actions for each word (I assume, the reason is clear). We also append the world directly to our just created output-list (newwords).
for c in md:
c as short for character. So for each character we want to replace (all keys of md), we do the following stuff.
occ = word.count(c)
occ for occurrences (yeah. count would fit as well :P). word.count(c) returns the number of occurences of the character/string c in word. So "Sweater".count("o") => 0 and "Sweater".count("e") => 2.
We use this here to know, how often we have to take a look at word to get all those occurences of c.
pos = 0
Our startposition to look for c in word. Comes into use in the next loop.
for _ in range(occ):
For each occurence. As a continual number has no value for us here, we "discard" it by naming it _. At this point where c is in word. Yet.
pos = word.find(c, pos)
Oh. Look. We found c. :) word.find(c, pos) returns the index of the first occurence of c in word, starting at pos. At the beginning, this means from the start of the string => the first occurence of c. But with this call we already update pos. This plus the last line (pos += 1) moves our search-window for the next round to start just behind the previous occurence of c.
for r in md[c]:
Now you see, why we updated mc previously: we can easily iterate over it now (a md[c].split(',') on the old md would do the job as well). So we are doing the replacement now for each of the replacement-characters.
tmp = word[:pos] + r + word[pos+1:]
The actual replacement. We store it in tmp (for debug-reasons). word[:pos] gives us word up to the (current) occurence of c (exclusive c). r is the replacement. word[pos+1:] adds the remaining word (again without c).
newwords.append(tmp)
Our so created new word tmp now goes into our output-list (newwords).
pos += 1
The already mentioned adjustment of pos to "jump over c".
Additional question from OP: Is there an easy way to dictate how many letters in the string I want to replace [(meaning e.g. multiple at a time)]?
Surely. But I have currently only a vague idea on how to achieve this. I am going to look at it, when I got my sleep. ;)
words = ["Words", "Shirt", "Blouse", "Sweater", "multipleeee"]
md = {"e": "q,w", "a": "z"}
md = {k: v.split(',') for k, v in md.items()}
num = 2 # this is the number of replaces at a time.
newwords = []
for word in words:
newwords.append(word)
for char in md:
for r in md[char]:
pos = multiples = 0
current_word = word
while current_word.find(char, pos) != -1:
pos = current_word.find(char, pos)
current_word = current_word[:pos] + r + current_word[pos+1:]
pos += 1
multiples += 1
if multiples == num:
newwords.append(current_word)
multiples = 0
current_word = word
Content of newwords:
['Words', 'Shirt', 'Blouse', 'Sweater', 'Swqatqr', 'Swwatwr', 'multipleeee', 'multiplqqee', 'multipleeqq', 'multiplwwee', 'multipleeww']
Prettyprint:
Words
Shirt
Blouse
Sweater
Swqatqr
Swwatwr
multipleeee
multiplqqee
multipleeqq
multiplwwee
multipleeww
I added multipleeee to demonstrate, how the replacement works: For num = 2 it means the first two occurences are replaced, after them, the next two. So there is no intersection of the replaced parts. If you would want to have something like ['multiplqqee', 'multipleqqe', 'multipleeqq'], you would have to store the position of the "first" occurence of char. You can then restore pos to that position in the if multiples == num:-block.
If you got further questions, feel free to ask. :)
Because you need to replace letters one at a time, this doesn't sound like a good problem to solve with pandas, since pandas is about doing everything at once (vectorized operations). I would dump out your DataFrame into a plain old list and use list operations:
words = DF.to_dict()["Words"].values()
for find, replace in reversed(sorted(mydict.items())):
for word in words:
occurences = word.count(find)
if not occurences:
print word
continue
start_index = 0
for i in range(occurences):
for replace_char in replace.split(","):
modified_word = list(word)
index = modified_word.index(find, start_index)
modified_word[index] = replace_char
modified_word = "".join(modified_word)
print modified_word
start_index = index + 1
Which gives:
Words
Shirt
Blousq
Blousw
Swqater
Swwater
Sweatqr
Sweatwr
Words
Shirt
Blouse
Swezter
Instead of printing the words, you can append them to a list and re-create a DataFrame if that's what you want to end up with.
If you are looping, you need to update s at each cycle of the loop. You also need to loop over v.
mydict = {"e":"q,w"}
s=deduped
for k, v in mydict.items():
for j in v:
s = s.replace(k, j)
Then reassign it to your dataframe:
df["Words"] = s
If you can write this as a function that takes in a 1d array (list, numpy array etc...), you can use df.apply to apply it to any column, using df.apply().

Replace substring that lies between two positions

I have a string S in Matlab. How can I replace a substring in S with some pattern P. I only know the first and the last index of substring in S. What is the approach?
How about that?
str = 'My dog is called Jim'; %// original string
a = 4; %// starting index
b = 6; %// last index
replace = 'hamster'; %// new pattern
newstr = [str(1:a-1) replace str(b+1:end)]
returns:
newstr = My hamster is called Jim
In case the pattern you want to substitute has the same number of characters as the new one, you can use simple indexing:
str(a:b) = 'cat'
returns:
str = My cat is called Jim

Split string and replace dot char in Lua

I have a string stored in sqlite database and I've assigned it to a var, e.g. string
string = "First line and string. This should be another string in a new line"
I want to split this string into two separated strings, the dot (.) must be replace with (\n) new line char
At the moment I'm stuck and any help would be great!!
for row in db:nrows("SELECT * FROM contents WHERE section='accounts'") do
tabledata[int] = string.gsub(row.contentName, "%.", "\n")
int = int+1
end
I tried the other questions posted here in stachoverflow but with zero luck
What about this solution:`
s = "First line and string. This should be another string in a new line"
a,b=s:match"([^.]*).(.*)"
print(a)
print(b)
Are you looking to actually split the string into two different string objects? If so maybe this can help. It's a function I wrote to add some additional functionality to the standard string library. You can use it as-is or rename it to what ever you like.
--[[
string.split (s, p)
====================================================================
Splits the string [s] into substrings wherever pattern [p] occurs.
Returns: a table of substrings or, if no match is made [nil].
--]]
string.split = function(s, p)
local temp = {}
local index = 0
local last_index = string.len(s)
while true do
local i, e = string.find(s, p, index)
if i and e then
local next_index = e + 1
local word_bound = i - 1
table.insert(temp, string.sub(s, index, word_bound))
index = next_index
else
if index > 0 and index <= last_index then
table.insert(temp, string.sub(s, index, last_index))
elseif index == 0 then
temp = nil
end
break
end
end
return temp
end
Using it is very simple, it returns a tables of strings.
Lua 5.1.4 Copyright (C) 1994-2008 Lua.org, PUC-Rio
> s = "First line and string. This should be another string in a new line"
> t = string.split(s, "%.")
> print(table.concat(t, "\n"))
First line and string
This should be another string in a new line
> print(table.maxn(t))
2

Resources