Zipping strings together at arbitrary index and step (Python)

I am working in Python 2.7. I am trying to create a function which can zip a string into a larger string starting at an arbitrary index and with an arbitrary step.
For example, I may want to zip the string ##*#* into the larger string TNAXHAXMKQWGZESEJFPYDMYP starting at the 5th character with a step of 3. The resulting string should be:
TNAXHAX#MK#QW*GZ#ES*EJFPYDMYP
The working function that I came up with is
# Insert one character of string every nth position starting after ith position of text
text = "TNAXHAXMKQWGZESEJFPYDMYP"

def zip_in(string, text, i, n):
    text = list(text)
    for c in string:
        text.insert(i + n - 1, c)
        i += n
    text = ''.join(text)
    print text
This function produces the desired result, but I feel that it is not as elegant as it could be.
Further, I would like it to be general enough that I can zip in a string backwards, that is, starting at the ith position of the text, I would like to insert the string in one character at a time with a backwards step.
For example, I may want to zip the string ##*#* into the larger string TNAXHAXMKQWGZESEJFPYDMYP starting at the 22nd position with a step of -3. The resulting string should be:
TNAXHAXMKQW*GZ#ES*EJ#FP#YDMYP
With my current function, I can do this by setting n negative, but if I want a step of -3, I need to set n as -2.
All of this leads me to my question:
Is there a more elegant (or Pythonic) way to achieve my end?
Here are some related questions which don't provide a general answer:
Pythonic way to insert every 2 elements in a string
Insert element in Python list after every nth element
Merge Two strings Together at N & X

You can combine two functions to get your result: izip_longest from the standard itertools module and chunked from the third-party more_itertools library (which must be installed separately).
from itertools import izip_longest  # Python 2 name of this function
from more_itertools import chunked

# Parameters
s1 = 'ABCDEFGHIJKLMNOPQ' # your string
s2 = '####' # your string of elements to add
int_from = 4 # position from which we start adding letters
step = 2 # we will add in elements of s2 each 2 letters

return_list = list(s1)[:int_from] # keep the first int_from elements unchanged
for letter, char in izip_longest(chunked(list(s1)[int_from:], step), s2, fillvalue=''):
    return_list.extend(letter)
    return_list.append(char)
Then join the list back into a string:
''.join(return_list)
For the parameters above, the output is:
'ABCDEF#GH#IJ#KL#MNOPQ'
To see what izip_longest(chunked(list(s1)[int_from:], step), s2, fillvalue='') yields, print each pair:
for letter, char in izip_longest(chunked(list(s1)[int_from:], step), s2, fillvalue=''):
    print(letter, char)
Output:
(['E', 'F'], '#')
(['G', 'H'], '#')
(['I', 'J'], '#')
(['K', 'L'], '#')
(['M', 'N'], '')
(['O', 'P'], '')
(['Q'], '')
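Building on the chunk-and-interleave idea above, here is a minimal Python 3 sketch that uses only the standard library and also handles a negative step. The function name, the 1-based start position, and the rule that a negative step is solved by mirroring the text are assumptions chosen to reproduce the two examples in the question; they are not part of the original answer.

```python
from itertools import zip_longest


def zip_in(string, text, start, step):
    """Insert one character of `string` at every `step`-th position of `text`.

    Hypothetical helper: `start` is a 1-based position, and a negative `step`
    walks leftwards from `start` instead of rightwards.
    """
    if step < 0:
        # Mirror the problem: reverse the text, convert the 1-based position,
        # solve it left to right, then reverse the result back.
        mirrored = zip_in(string, text[::-1], len(text) - start + 1, -step)
        return mirrored[::-1]
    if step < 2:
        # Assumption of this sketch: at least one original character
        # sits between two inserted characters.
        raise ValueError("abs(step) must be at least 2")

    first = start + step - 1   # index of the first inserted character
    head, tail = text[:first], text[first:]
    width = step - 1           # original characters between two insertions
    chunks = [tail[k:k + width] for k in range(0, len(tail), width)]

    pieces = [head]
    for char, chunk in zip_longest(string, chunks, fillvalue=''):
        pieces.append(char)
        pieces.append(chunk)
    return ''.join(pieces)


text = "TNAXHAXMKQWGZESEJFPYDMYP"
print(zip_in("##*#*", text, 5, 3))    # TNAXHAX#MK#QW*GZ#ES*EJFPYDMYP
print(zip_in("##*#*", text, 22, -3))  # TNAXHAXMKQW*GZ#ES*EJ#FP#YDMYP
```

With this convention the caller passes the step that was actually intended (3 or -3), so the off-by-one adjustment mentioned in the question is no longer needed.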

Related

How to remove the alphanumeric characters from a list and split them in the result?

def tokenize(s):
    string = s.lower().split()
    getVals = list([val for val in s if val.isalnum()])
    result = "".join(getVals)
    print (result)

tokenize('AKKK#eastern B!##est!')
I'm trying to get the output ('akkkeastern', 'best'),
but the output of the code above is AKKKeasternBest.
What changes should I make?
Using a list comprehension is a good way to filter elements out of a sequence like a string. In the example below, the list comprehension is used to build a list of characters (characters are also strings in Python) that are either alphanumeric or a space - we are keeping the space around to use later to split the list. After the filtered list is created, what's left to do is make a string out of it using join and last but not least use split to break it in two at the space.
Example:
string = 'AKKK#eastern B!##est!'
# Removes non-alphanumeric chars, but preserves spaces
filtered = [
    char.lower()
    for char in string
    if char.isalnum() or char == " "
]
# String-ifies filtered list, and splits on space
result = "".join(filtered).split()
print(result)
Output:
['akkkeastern', 'best']
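The same filter-then-split idea can also be sketched with a regular expression. This version assumes ASCII input (str.isalnum also accepts non-ASCII letters and digits, which this pattern would drop), and note that it removes tabs and newlines rather than treating them as separators. The function name tokenize is borrowed from the question.

```python
import re


def tokenize(s):
    # Lowercase first, then drop everything except a-z, 0-9 and spaces.
    cleaned = re.sub(r'[^a-z0-9 ]', '', s.lower())
    # split() with no argument splits on runs of whitespace.
    return cleaned.split()


print(tokenize('AKKK#eastern B!##est!'))  # ['akkkeastern', 'best']
```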

Get digits at end of string in a pythonic way

I'm using Python 3.x. I'm trying to get the (int) number at the end of a string with the format
string_example_1 = "l-45-98-567-567-12"
string_example_2 = "s-89-657"
or, in general, a single lowercase letter followed by a number of integers separated by '-'. What I need is the last number (12 and 657 in these cases). I have achieved this with the function
def ending(the_string):
    out = ''
    while the_string[-1].isdigit():
        out = the_string[-1] + out
        the_string = the_string[:-1]
    return out
but I'm sure there must be a more Pythonic way to do this. In an earlier step I manually check that the string starts the way I expect by doing something like
if st[0].isalpha() and st[1]=='-' and st[2].isdigit():
    statement...
I would just split the string on -, take the last of the splits and convert it to an integer.
string_example_1 = "l-45-98-567-567-12"
string_example_2 = "s-89-657"
def last_number(s):
    return int(s.split("-")[-1])
print(last_number(string_example_1))
# 12
print(last_number(string_example_2))
# 657
Without regular expressions, you could reverse the string, take elements from the string while they're still numbers, and then reverse the result. In Python:
from itertools import takewhile

def extract_final_digits(s):
    return int(''.join(reversed(list(takewhile(lambda c: c.isdigit(), reversed(s))))))
But the simplest is to just split on a delimiter and take the final element in the split list.
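For comparison, here is a minimal sketch of the regular-expression route that the answer above sets aside. It does not depend on the '-' delimiter, only on the string ending in digits. Raising ValueError when there are no trailing digits is an assumption of this sketch.

```python
import re


def last_number(s):
    # \d+$ matches the run of digits at the end of the string
    # (or just before a trailing newline).
    match = re.search(r'\d+$', s)
    if match is None:
        raise ValueError("no trailing digits in {!r}".format(s))
    return int(match.group())


print(last_number("l-45-98-567-567-12"))  # 12
print(last_number("s-89-657"))            # 657
```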

Find and append characters of a String by matching with a List in Python 2.7

There is a list of character sequences such as:
seq_list = ['C','CA','CAF','CMMVF','E','CMM','CMMF','CMMFF',...]
and a string defined as:
a_str = 'CAFCMMVFCMMECMMFFCCAF'
The problem is to match, iteratively from left to right, the longest character sequence from seq_list in a_str, and to append a '|' character after each match.
For example,
a_str begins with 'C', but the sequence to match is 'CAF', because 'CAF' is longer than 'C',
so the first step should give:
a_str = 'CAF|CMMVFCMMECMMFFCCAF' # correct (longest) match
        'C|AFCMMVFCMMECMMFFCCAF' # wrong (shorter) match
After the '|' has been appended, the remaining string is a_str_r = 'CMMVFCMMECMMFFCCAF'. The process then starts over, matching the longest sequence from the list, until the end of the string is reached. The final result should be:
a_str = 'CAF|CMMVF|CMM|E|CMMFF|C|CAF|'
This was one of my attempts at the problem, but it does not give the right result:
a_str_r = []
for each in seq_list:
    for i in a_str:
        if each in i:
            a_str_r.append(i+'|')
return a_str_r
You want the leftmost, longest match at each step. That is a natural fit for a regular expression search.
import re
seq_list = ['C','CA','CAF','CMMVF','E','CMM','CMMF','CMMFF']
# Sort to put longer match strings before shorter ones
sseq_list = sorted(seq_list, key=lambda a: len(a), reverse=True)
# Turn list into a regular expression string
sseq_re = '|'.join(sseq_list)
# Compile regular expression string
rx = re.compile(sseq_re)
# Put pipe characters between the matches (no trailing pipe is added)
print '|'.join(rx.findall('CAFCMMVFCMMECMMFFCCAF'))
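A Python 3 sketch of the same technique follows, with two small changes that are assumptions rather than part of the answer above: each sequence is passed through re.escape in case it contains regex metacharacters, and a '|' is appended after every match so that the output has the trailing pipe shown in the question. The helper name mark_sequences is hypothetical, and the list is the shortened one used in the answer.

```python
import re

seq_list = ['C', 'CA', 'CAF', 'CMMVF', 'E', 'CMM', 'CMMF', 'CMMFF']
a_str = 'CAFCMMVFCMMECMMFFCCAF'

# Longest alternatives first, so the engine tries them before their prefixes.
ordered = sorted(seq_list, key=len, reverse=True)
pattern = re.compile('|'.join(re.escape(seq) for seq in ordered))


def mark_sequences(text):
    # Append '|' after every match, including the last one.
    return ''.join(match + '|' for match in pattern.findall(text))


print(mark_sequences(a_str))  # CAF|CMMVF|CMM|E|CMMFF|C|CAF|
```

Characters that are not covered by any sequence are silently skipped by findall, so this sketch assumes the whole string can be split into listed sequences.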

Select part of a string and change it in lowercase or uppercase python 3.x

I want to convert a string so that the characters at even positions are uppercase and the characters at odd positions are lowercase.
Here is what I've tried so far:
def foldingo(chaine):
    chaineuh=chaine[0::2].upper()
    chaine=chaineuh[1::2].lower()
    return chaine
Your code takes every other character in chaine, uppercases them, and assigns those characters to chaineuh.
Then it takes every other character in chaineuh, lowercases them, and assigns those characters to chaine again. In other words:
abcdefg -> ACEG -> cg
You'll notice it's not keeping the characters that you're not trying to target.
You could try building all the uppercases and lowercases separately, then iterate with zip to get them together.
def fold(s):
    uppers = s[0::2].upper()
    lowers = s[1::2].lower()
    return zip(uppers, lowers)
But this doesn't quite work either, since zip gives you tuples, not strings, and will drop the last character of odd-length strings:
abcdefg -> ACEG, bdf -> ('A', 'b'), ('C', 'd'), ('E', 'f')
We could fix that with a couple of calls to str.join and itertools.zip_longest with fillvalue='', but it's kind of like using a wrench to hammer in a nail: not really the right tool for the job. For the record, it would look like this:
''.join([''.join(pair) for pair in itertools.zip_longest(uppers, lowers, fillvalue='')])
Yuck.
Let's instead just iterate over the string and uppercase every other letter. We can use an alternating boolean to track whether we're upper'ing or lower'ing this time around.
def fold(s):
    time_to_upper = True
    result = ""
    for ch in s:
        if time_to_upper:
            result += ch.upper()
        else:
            result += ch.lower()
        time_to_upper = not time_to_upper
    return result
You could also use enumerate and a modulo to keep track:
def fold(s):
    result = ""
    for i, ch in enumerate(s):
        ch = ch.lower() if i % 2 else ch.upper()
        result += ch
    return result
Or by using itertools.cycle, str.join, and list comprehensions, we can make this a lot shorter (possibly at the cost of readability!)
import itertools

def fold(s):
    return ''.join([op(ch) for op, ch in zip(itertools.cycle([str.upper, str.lower]), s)])

How can I delete the letter that occurs in the two strings using python?

Here's the source code:
def revers_e(str_one,str_two):
    for i in range(len(str_one)):
        for j in range(len(str_two)):
            if str_one[i] == str_two[j]:
                str_one = (str_one - str_one[i]).split()
                print(str_one)
            else:
                print('There is no relation')

if __name__ == '__main__':
    str_one = input('Put your First String: ').split()
    str_two = input('Put your Second String: ')
    print(revers_e(str_one, str_two))
How can I remove a letter that occurs in both strings from the first string then print it?
How about a simple Pythonic way of doing it:
def revers_e(s1, s2):
    print(*[i for i in s1 if i in s2])  # Print all characters to be deleted from s1
    s1 = ''.join([i for i in s1 if i not in s2])  # Delete them from s1
    return s1  # Return the result; the reassignment alone is not visible to the caller
This answer says, "Python strings are immutable (i.e. they can't be modified). There are a lot of reasons for this. Use lists until you have no choice, only then turn them into strings."
First of all, you don't need range and len to iterate over a string, which is a rather suboptimal approach. Strings are iterable, so you can loop over them directly.
To find the intersection of the two strings, you can use set.intersection, which returns all the characters common to both, and then use str.translate to remove those characters from the first string:
intersect = set(str_one).intersection(str_two)
trans_table = dict.fromkeys(map(ord, intersect), None)
result = str_one.translate(trans_table)  # str_one itself is unchanged; strings are immutable
def revers_e(str_one,str_two):
    for i in range(len(str_one)):
        for j in range(len(str_two)):
            try:
                if str_one[i] == str_two[j]:
                    first_part=str_one[0:i]
                    second_part=str_one[i+1:]
                    str_one =first_part+second_part
                    print(str_one)
                else:
                    print('There is no relation')
            except IndexError:
                return

str_one = input('Put your First String: ')
str_two = input('Put your Second String: ')
revers_e(str_one, str_two)
I've modified your code, taking out a few bits and adding a few more.
str_one = input('Put your First String: ').split()
I removed the .split() because, for input without spaces, it just creates a list of length 1, so the loop would compare the entire first string against one letter of the second string.
str_one = (str_one - str_one[i]).split()
You can't remove a character from a string like this in Python: strings don't support the - operator. Instead, I slice the string into two parts (you could also convert it to a list, as I did in my other code, which I deleted): everything before the matching character and everything after it. The two parts are then concatenated into one string.
I used a try/except block because the loops use the original length of str_one, but the string gets shorter as characters are removed, so later indexes can raise an IndexError.
Lastly, I just call the function instead of printing its result, because the function returns nothing and printing it would only show None.
The set and list-comprehension examples work in Python 2.7+ and Python 3; the str.maketrans example requires Python 3.
Given:
>>> s1='abcdefg'
>>> s2='efghijk'
You can use a set:
>>> set(s1).intersection(s2)
{'f', 'e', 'g'}
Then use that set with str.maketrans to build a translation table that maps each common character to None, which deletes it:
>>> s1.translate(str.maketrans({e:None for e in set(s1).intersection(s2)}))
'abcd'
Or use a list comprehension to collect the common characters in the order they appear in s1:
>>> ''.join([e for e in s1 if e in s2])
'efg'
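And a regex to produce a new string without the common characters. The common characters go inside a character class so that each one is removed wherever it occurs (this assumes there is at least one common character):
>>> import re
>>> re.sub('[' + re.escape(''.join([e for e in s1 if e in s2])) + ']', '', s1)
'abcd'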
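Putting the set-based approaches above together, here is a minimal sketch of a function that returns both the cleaned first string and the characters that were removed. The name remove_common and the choice to return a tuple are assumptions made for illustration.

```python
def remove_common(s1, s2):
    # Build the set once so each membership test is O(1) on average.
    common = set(s1) & set(s2)
    removed = ''.join(ch for ch in s1 if ch in common)
    kept = ''.join(ch for ch in s1 if ch not in common)
    return kept, removed


kept, removed = remove_common('abcdefg', 'efghijk')
print(kept)     # abcd
print(removed)  # efg
```

Unlike a regex built from the joined common characters, this also works when the shared characters are scattered through the first string or when there are none at all.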
