Python Join String to Produce Combinations For All Words in String

If my string is this: 'this is a string', how can I produce all possible combinations by joining each word with its neighboring word?
What this output would look like:
this is a string
thisis a string
thisisa string
thisisastring
thisis astring
this isa string
this isastring
this is astring
What I have tried:
s = 'this is a string'.split()
for i, l in enumerate(s):
    ''.join(s[0:i])+' '.join(s[i:])
This produces:
'this is a string'
'thisis a string'
'thisisa string'
'thisisastring'
I realize I need to change the s[0:i] part because it's statically anchored at 0, but I don't know how to move on to the next word, 'is', while still including 'this' in the output.

A simpler way (and 3x faster than the accepted answer) is to use itertools.product:
import itertools
s = 'this is a string'
s2 = s.replace('%', '%%').replace(' ', '%s')
for i in itertools.product((' ', ''), repeat=s.count(' ')):
    print(s2 % i)

You can also use itertools.product():
import itertools
s = 'this is a string'
words = s.split()
for t in itertools.product(range(2), repeat=len(words)-1):
    print(''.join([words[i]+t[i]*' ' for i in range(len(t))])+words[-1])

Well, it took me a little longer than I expected... this is actually trickier than I thought :)
The main idea:
The number of spaces when you split the string is the length of the split array minus 1. In our example there are 3 spaces:
'this is a string'
^ ^ ^
We'll take a binary representation of all the options to have/not have either one of the spaces, so in our case it'll be:
000
001
010
011
100
101
...
and for each option we'll generate the sentence respectively, where 111 represents all 3 spaces: 'this is a string' and 000 represents no-space at all: 'thisisastring'
def binaries(n):
    # All n-digit binary strings: there are 2 ** n of them
    res = []
    for x in range(2 ** n):
        tmp = bin(x)
        res.append(tmp.replace('0b', '').zfill(n))
    return res
def generate(arr, bins):
    res = []
    for pattern in bins:
        tmp = arr[0]
        i = 1
        # '1' keeps the space before the next word, '0' drops it
        for digit in pattern:
            if digit == '1':
                tmp = tmp + " " + arr[i]
            else:
                tmp = tmp + arr[i]
            i += 1
        res.append(tmp)
    return res
def combinations(string):
    s = string.split(' ')
    bins = binaries(len(s) - 1)
    res = generate(s, bins)
    return res
print(combinations('this is a string'))
# ['thisisastring', 'thisisa string', 'thisis astring', 'thisis a string', 'this isastring', 'this isa string', 'this is astring', 'this is a string']
UPDATE:
I now see that Amadan thought of the same idea - kudos for thinking of it quicker than me! Great minds think alike ;)

The easiest is to do it recursively.
Terminating condition: Schrödinger join of a single element list is that word.
Recurring condition: say that L is the Schrödinger join of all the words but the first. Then the Schrödinger join of the list consists of all elements from L with the first word directly prepended, and all elements from L with the first word prepended with an intervening space.
(Assuming you are missing thisis astring by accident. If it is deliberate, I am sure I have no idea what the question is :P )
Another, non-recursive way you can do it is to enumerate all numbers from 0 to 2^(number of words - 1) - 1, then use the binary representation of each number as a selector whether or not a space needs to be present. So, for example, the abovementioned thisis astring corresponds to 0b010, for "nospace, space, nospace".
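The recursive description above can be sketched as follows. This is only an illustration under stated assumptions: the function name schrodinger_join is made up here, and the input is assumed to be a non-empty list of words.

```python
def schrodinger_join(words):
    """Return every way of joining `words` with or without a space at each gap.

    Hypothetical helper name; assumes `words` is a non-empty list.
    """
    # Terminating condition: a single word can only be joined one way.
    if len(words) == 1:
        return [words[0]]
    # Recurring condition: join everything but the first word, then
    # prepend the first word both directly and with an intervening space.
    rest = schrodinger_join(words[1:])
    first = words[0]
    return [first + tail for tail in rest] + [first + ' ' + tail for tail in rest]

results = schrodinger_join('this is a string'.split())
for line in results:
    print(line)
```

With n words there are n - 1 gaps, so the list doubles at each level of recursion and ends up with 2^(n - 1) entries, the same count as the binary-selector approach.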

Related

Create a list from space-separated input without using .split()

I have to write an application that asks the user to enter a list of numbers separated by a space and then prints the sum of the numbers. The user can enter any number of numbers. I am not allowed to use the split function in Python. I was wondering how I can do that. Any help would be appreciated, as I'm kind of stuck on where to start.
A possible solution is to use regular expressions:
# import regular expression library
import re
# let user enter numbers and store user data into 'data' variable
data = input("Enter numbers separated by space: ")
"""
regular expression pattern '\d+' means the following:
'\d' - any digit character,
'+' - one or more occurrences of the character
're.findall' will find all occurrences of the regular expression pattern
and store them in a list like '['1', '258', '475', '2', '6']'
please note that the list items are stored as str type
"""
numbers = re.findall(r'\d+', data)
"""
list comprehension '[int(_) for _ in numbers]' converts
list items to int type
'sum()' - adds up the list items
"""
summary = sum([int(_) for _ in numbers])
print(f'Sum: {summary}')
Another solution is the following:
string = input("Enter numbers separated by space: ")
splits = []
pos = -1
last_pos = -1
while ' ' in string[pos + 1:]:
    pos = string.index(' ', pos + 1)
    splits.append(string[last_pos + 1:pos])
    last_pos = pos
splits.append(string[last_pos + 1:])
summary = sum([int(_) for _ in filter(None, splits)])
print(f'Sum: {summary}')
From my point of view, the first option is more concise and better protected from user errors.
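A third option, sketched here as an assumption rather than taken from the answers above, is to scan the input one character at a time and build each number by hand. Unlike the '\d+' pattern, this keeps a leading minus sign, because int() parses the whole token. The function name sum_numbers is hypothetical.

```python
def sum_numbers(text):
    """Sum space-separated integers in `text` without calling str.split."""
    total = 0
    current = ''
    # The appended space makes sure the last token is flushed too.
    for ch in text + ' ':
        if ch == ' ':
            if current:              # skip runs of repeated spaces
                total += int(current)
                current = ''
        else:
            current += ch
    return total

print(sum_numbers('1 258 475 2 6'))
```

A token that is not a valid integer makes int() raise a ValueError, so invalid input is reported instead of being silently ignored.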

Is there a way to substring, which is between two words in the string in Python?

My question is more or less similar to:
Is there a way to substring a string in Python?
but it's more specifically oriented.
How can I get a part of a string which is located between two known words in the initial string?
Example:
myString = "this is the initial string"
Substring = "initial"
knowing that "the" and "string" are the two known words in the string that can be used to get the substring.
Thank you!
You can start with simple string manipulation here. str.index is your best friend there, as it will tell you the position of a substring within a string; and you can also start searching somewhere later in the string:
>>> myString = "this is the initial string"
>>> myString.index('the')
8
>>> myString.index('string', 8)
20
Looking at the slice [8:20], we already get close to what we want:
>>> myString[8:20]
'the initial '
Of course, since we found the beginning position of 'the', we need to account for its length. And finally, we might want to strip whitespace:
>>> myString[8 + 3:20]
' initial '
>>> myString[8 + 3:20].strip()
'initial'
Combined, you would do this:
startIndex = myString.index('the')
substring = myString[startIndex + 3 : myString.index('string', startIndex)].strip()
If you want to look for matches multiple times, then you just need to repeat doing this while looking only at the rest of the string. Since str.index will only ever find the first match, you can use this to scan the string very efficiently:
searchString = 'this is the initial string but I added the relevant string pair a few more times into the search string.'
startWord = 'the'
endWord = 'string'
results = []
index = 0
while True:
    try:
        startIndex = searchString.index(startWord, index)
        endIndex = searchString.index(endWord, startIndex)
        results.append(searchString[startIndex + len(startWord):endIndex].strip())
        # move the index to the end
        index = endIndex + len(endWord)
    except ValueError:
        # str.index raises a ValueError if there is no match; in that
        # case we know that we’re done looking at the string, so we can
        # break out of the loop
        break
print(results)
# ['initial', 'relevant', 'search']
You can also try something like this:
mystring = "this is the initial string"
mystring = mystring.strip().split(" ")
for i in range(1,len(mystring)-1):
    if(mystring[i-1] == "the" and mystring[i+1] == "string"):
        print(mystring[i])
I suggest using a combination of list, split and join methods.
This should help if you are looking for more than 1 word in the substring.
Turn the string into a list of words:
words = string.split()
Get the index of your opening and closing markers, then return the substring:
start = words.index('the')
end = words.index('string')
substring = ' '.join(words[start+1:end])
You may want to check that both markers are present before proceeding, since list.index raises a ValueError when a word is missing.
If your problem gets more complex, i.e. multiple occurrences of the marker pair, I suggest using a regular expression:
import re
substrings = re.findall(r'the (.+?) string', string)
re.findall returns a list, so each matched substring is stored separately.
The spaces in the pattern keep the surrounding spaces out of the match; you can modify the pattern to suit your needs.
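For markers that are not hard-coded, the regular-expression idea can be wrapped in a small function. This is a sketch under assumptions: the name between is hypothetical, and both markers are assumed to be ordinary words so that the \b word boundaries make sense.

```python
import re

def between(text, start_word, end_word):
    """Return every substring found between start_word and end_word."""
    # re.escape guards against regex metacharacters in the markers;
    # \b stops 'the' from matching inside a longer word such as 'other'.
    pattern = (r'\b' + re.escape(start_word) + r'\b\s*(.*?)\s*\b'
               + re.escape(end_word) + r'\b')
    return re.findall(pattern, text)

search_string = ('this is the initial string but I added the relevant string '
                 'pair a few more times into the search string.')
print(between(search_string, 'the', 'string'))
```

The lazy group (.*?) stops at the first end marker after each start marker, which mirrors the str.index loop shown earlier.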

Apply backspace within a string

I have a string which includes backspace. Displaying it to the commandline will 'apply' the backspaces such that each backspace and the non-backspace character which immediately precedes it cannot be seen:
>> tempStr = ['aaab', char(8)]
tempStr =
aaa
Yet the deletion operation only happens when displaying the string. It still has the backspace character, and the 'b', inside it:
>> length(tempStr)
ans =
5
I'm looking for a minimal (ideally some sort of string processing built in) function which applies the backspace operation:
>>f(tempStr)
ans =
'aaa'
It may also help to know that I have an enumerations class over the alphabet 'a' to 'z' plus ' ' and backspace (to store my own personal indexing of the letters, images associated with each etc.). It'd be real spiffy to have this backspace removal operation be a method of the superclass that acts on a vector of its objects.
You can do it with a simple function using a while loop:
function s = printb(s)
    while true
        % Find backspaces
        I = strfind(s, char(8));
        % Break condition
        if isempty(I), break; end
        % Remove elements
        if I(1)==1
            s = s(2:end);
        else
            s(I(1)-1:I(1)) = [];
        end
    end
and the test gives:
s = [char(8) 'hahaha' char(8) char(8) '!'];
numel(s) % returns 10
z = printb(s) % returns 'haha!'
numel(z) % returns 5
This is not really "minimal", but as far as my knowledge goes I don't think this is feasible with regular expressions in Matlab.
Your problem can be solved very elegantly using regular expressions:
function newStr = applyBackspaces(tempStr)
    newStr = tempStr;
    while (sum(newStr==char(8))>0) % while there is at least one char(8) in newStr do:
        tmp = newStr; % work on previous result
        if (tmp(1) == char(8)) % if first character is char(8)
            newStr = tmp(2:end); % then suppress first character
        else % else delete all characters just before a char(8)
            newStr = regexprep(tmp,[ '.' char(8)],''); % as well as char(8) itself.
        end
    end
end
In essence, what my function does is delete the character just before the backspace until there are no more backspaces in your input string tempStr.
To test if it works, we check the output and the length of the string:
>> tempStr = ['abc', char(8), 'def', char(8), char(8), 'ghi']
tempStr =
abdghi
>> length(tempStr)
ans =
12
>> applyBackspaces(tempStr)
ans =
abdghi
>> length(applyBackspaces(tempStr))
ans =
6
Hence, tempStr and applyBackspaces(tempStr) show the same string, but applyBackspaces(tempStr) is the same length as the number of characters displayed.
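Both answers rescan the string once per round of deletions. The same "delete the character before each backspace" rule can also be applied in a single left-to-right pass using a stack. The sketch below is written in Python rather than MATLAB, purely to illustrate the logic; the name apply_backspaces is an assumption, and '\b' is Python's spelling of char(8).

```python
def apply_backspaces(text, backspace='\b'):
    """Apply backspace characters to `text` in one pass."""
    out = []
    for ch in text:
        if ch == backspace:
            if out:           # a backspace at the very start deletes nothing
                out.pop()
        else:
            out.append(ch)
    return ''.join(out)

print(apply_backspaces('abc\bdef\b\bghi'))
```

Each character is pushed or popped at most once, so the work grows linearly with the length of the input.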

How can I delete the letter that occurs in the two strings using python?

That's the source code:
def revers_e(str_one,str_two):
    for i in range(len(str_one)):
        for j in range(len(str_two)):
            if str_one[i] == str_two[j]:
                str_one = (str_one - str_one[i]).split()
                print(str_one)
            else:
                print('There is no relation')
if __name__ == '__main__':
    str_one = input('Put your First String: ').split()
    str_two = input('Put your Second String: ')
    print(revers_e(str_one, str_two))
How can I remove a letter that occurs in both strings from the first string then print it?
How about a simple Pythonic way of doing it:
def revers_e(s1, s2):
    print(*[i for i in s1 if i in s2]) # Print all characters to be deleted from s1
    s1 = ''.join([i for i in s1 if i not in s2]) # Delete them from s1
    return s1 # Return the result, since the caller's string is left unchanged
This answer says, "Python strings are immutable (i.e. they can't be modified). There are a lot of reasons for this. Use lists until you have no choice, only then turn them into strings."
First of all, you don't need range and len to iterate over a string; that is a fairly suboptimal approach. Strings are iterable, so you can loop over them directly.
To find the intersection of the two strings you can use set.intersection, which returns the characters common to both, and then use str.translate to remove those characters:
intersect = set(str_one).intersection(str_two)
trans_table = dict.fromkeys(map(ord, intersect), None)
result = str_one.translate(trans_table)
def revers_e(str_one,str_two):
    for i in range(len(str_one)):
        for j in range(len(str_two)):
            try:
                if str_one[i] == str_two[j]:
                    first_part=str_one[0:i]
                    second_part=str_one[i+1:]
                    str_one =first_part+second_part
                    print(str_one)
                else:
                    print('There is no relation')
            except IndexError:
                return
str_one = input('Put your First String: ')
str_two = input('Put your Second String: ')
revers_e(str_one, str_two)
I've modified your code, taking out a few bits and adding a few more.
str_one = input('Put your First String: ').split()
I removed the .split(), because all this would do is create a list of length 1, so in your loop, you'd be comparing the entire string of the first string to one letter of the second string.
str_one = (str_one - str_one[i]).split()
You can't remove a character from a string like this in Python, so I split the string into parts (you could also convert them into lists like I did in my other code which I deleted) whereby all the characters up to the last character before the matching character are included, followed by all the characters after the matching character, which are then appended into one string.
I used exception statements, because the first loop will use the original length, but this is subject to change, so could result in errors.
Lastly, I just called the function instead of also printing its result, because the function returns None, so print would only output None.
These work in Python 3; all but the str.maketrans example also work in Python 2.7+
Given:
>>> s1='abcdefg'
>>> s2='efghijk'
You can use a set:
>>> set(s1).intersection(s2)
{'f', 'e', 'g'}
Then use that set in maketrans to make a translation table to None to delete those characters:
>>> s1.translate(str.maketrans({e:None for e in set(s1).intersection(s2)}))
'abcd'
Or use list comprehension:
>>> ''.join([e for e in s1 if e in s2])
'efg'
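And a regex to produce a new string without the common characters. The characters go into a character class so that each one is removed wherever it occurs, not only when they appear as one contiguous run (this assumes at least one common character, since an empty class is not a valid pattern):
>>> import re
>>> re.sub('[' + re.escape(''.join([e for e in s1 if e in s2])) + ']', '', s1)
'abcd'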

Is there a pythonic way to insert space characters at random positions of an existing string?

is there a pythonic way to implement this:
Insert /spaces_1/ U+0020 SPACE
characters into /key_1/ at random
positions other than the start or end
of the string.
?
Here /spaces_1/ is an integer and /key_1/ is an arbitrary existing string.
Thanks.
Strings in Python are immutable, so you can't change them in place. However:
import random
def insert_space(s):
    r = random.randint(1, len(s)-1)
    return s[:r] + ' ' + s[r:]
def insert_spaces(s):
    for i in range(random.randrange(len(s))):
        s = insert_space(s)
    return s
Here's a list based solution:
import random
def insert_spaces(s):
    s = list(s)
    for i in range(len(s)-1):
        while random.randrange(2):
            s[i] = s[i] + ' '
    return ''.join(s)
I'm going to arbitrarily decide you never want two spaces inserted adjacently - each insertion point used only once - and that "insert" excludes "append" and "prepend".
First, construct a list of insertion points...
insert_points = range(1, len(mystring))
Pick out a random selection from that list, and sort it...
import random
selected = random.sample(insert_points, 5)
selected.sort()
Make a list of slices of your string...
selected.append(len(mystring)) # include the last slice
temp = 0 # start with first slice
result = []
for i in selected:
    result.append(mystring[temp:i])
    temp = i
Now, build the new string...
" ".join(result)
Just because no one used map yet:
import random
''.join(map(lambda x:x+' '*random.randint(0,1), s)).strip()
This method inserts a given number of spaces to a random position in a string and takes care that there are no double spaces after each other:
import random
def add_spaces(s, num_spaces):
    assert(num_spaces <= len(s) - 1)
    space_idx = []
    space_idx.append(random.randint(0, len(s) - 2))
    num_spaces -= 1
    while (num_spaces > 0):
        idx = random.randint(0, len(s) - 2)
        if idx not in space_idx:
            space_idx.append(idx)
            num_spaces -= 1
    result_with_spaces = ''
    for i in range(len(s)):
        result_with_spaces += s[i]
        if i in space_idx:
            result_with_spaces += ' '
    return result_with_spaces
If you want to add more than one space, then go
s[:r] + ' '*n + s[r:]
Here it comes...
import random
def thePythonWay(s,n):
    n = max(0,min(n,25))
    where = random.sample(range(1,len(s)),n)
    return ''.join("%2s" if i in where else "%s" for i in range(len(s))) % tuple(s)
We will randomly choose the locations where spaces will be added - after char 0, 1, ... n-2 of the string (n-1 is the last character, and we will not place a space after that); and then insert the spaces by replacing the characters in the specified locations with (the original character) + ' '. This is along the lines of Steve314's solution (i.e. keeping the assumption that you don't want consecutive spaces - which limits the total spaces you can have), but without using lists.
Thus:
import random
def insert_random_spaces(original, amount):
    assert amount > 0 and amount < len(original)
    insert_positions = sorted(random.sample(range(len(original) - 1), amount))
    return ''.join(
        x + (' ' if i in insert_positions else '')
        for (i, x) in enumerate(original)
    )
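Several of the answers above assume that no two inserted spaces may be adjacent, which the question itself does not require. If that restriction is not wanted, a sketch along the following lines inserts exactly the requested number of spaces and lets the same gap be chosen more than once. The function name insert_spaces_anywhere and the rng parameter are assumptions made for illustration.

```python
import random

def insert_spaces_anywhere(key, spaces, rng=random):
    """Insert exactly `spaces` spaces at random interior positions of `key`."""
    if len(key) < 2:
        raise ValueError('need at least two characters to have an interior position')
    # counts[g] is the number of spaces placed just before key[g].
    # Gap 0 (the start of the string) is never chosen, and nothing is
    # ever placed after the last character.
    counts = [0] * len(key)
    for _ in range(spaces):
        counts[rng.randint(1, len(key) - 1)] += 1
    return ''.join(' ' * counts[i] + ch for i, ch in enumerate(key))

print(insert_spaces_anywhere('abcdef', 4))
```

Because gaps can repeat, the number of spaces is not limited by the length of the string, unlike the random.sample versions.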
