Delete repeated pairs in a string - string

I have a string S='BDBCFBCFABDDEABCCDGAEAABCEAAHF'. The string S is made up of consecutive two-character pairs: 'BD', 'BC', 'FB', ..., 'HF'.
How can I delete all of the repeated pairs in this string? I would also like to delete the pairs whose two characters are the same, such as 'AA', 'BB', ..., 'ZZ'.
The output should be:
Out = 'BDBCFBCFABEABCCDGAEACEHF'

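Depending on your restrictions, maybe you're after:
U = unique(reshape(S,2,[]).','rows','stable')
Here reshape(S,2,[]).' puts one pair in each row. MATLAB fills columns first, so reshape(S,[],2) would not pair adjacent characters.
From there you can delete the rows of double letters and flatten the result back into a string:
out = U(U(:,1)~=U(:,2),:);
out = reshape(out.',1,[])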

Related

How to remove the alphanumeric characters from a list and split them in the result?

def tokenize(s):
    string = s.lower().split()
    getVals = list([val for val in s if val.isalnum()])
    result = "".join(getVals)
    print (result)

tokenize('AKKK#eastern B!##est!')
I'm trying to get the output ('akkkeastern', 'best'),
but the code above prints AKKKeasternBest.
What changes should I make?
Using a list comprehension is a good way to filter elements out of a sequence like a string. In the example below, the list comprehension is used to build a list of characters (characters are also strings in Python) that are either alphanumeric or a space - we are keeping the space around to use later to split the list. After the filtered list is created, what's left to do is make a string out of it using join and last but not least use split to break it in two at the space.
Example:
string = 'AKKK#eastern B!##est!'
# Removes non-alphanumeric chars, but preserves spaces
filtered = [
    char.lower()
    for char in string
    if char.isalnum() or char == " "
]
# String-ifies filtered list, and splits on space
result = "".join(filtered).split()
print(result)
Output:
['akkkeastern', 'best']
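The same filtering can be wrapped back into the asker's tokenize function so that it returns the tuple the question asked for. This is only a sketch: returning a tuple and using str.isspace (instead of comparing to a single space) are assumptions on my part, not something the answer above specifies.

```python
def tokenize(s):
    """Lowercase s, drop non-alphanumeric characters, and split on whitespace."""
    # Keep alphanumeric characters and whitespace; everything else is dropped.
    filtered = [ch.lower() for ch in s if ch.isalnum() or ch.isspace()]
    # Join the kept characters, then split into words on runs of whitespace.
    return tuple("".join(filtered).split())


print(tokenize('AKKK#eastern B!##est!'))  # ('akkkeastern', 'best')
```

Returning the value instead of printing it inside the function makes the result reusable by the caller.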

Finding unique letters in a string

I am trying to find the number of unique letters in a string.
I know how to count the unique characters in a string using the code below.
But what if I want to count the unique letters only, not all characters, excluding punctuation in the string?
import string
s = 'AabC'
s = s.lower()
print(sum(1 for c in string.ascii_lowercase if s.count(c) == 1))
First filter out all non-letter characters, then convert the result to a set and take its length.
s = 'AabC123qwer!!>>??'
unique = set(filter(str.isalpha, s.lower()))
print(len(unique))
Output:
7
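Note that the two snippets above count different things: the question's code counts letters that occur exactly once, while the set-based answer counts distinct letters. A small sketch using collections.Counter shows both numbers side by side. The helper name letter_counts is made up for this illustration, and str.isalpha also accepts non-ASCII letters, unlike string.ascii_lowercase.

```python
from collections import Counter


def letter_counts(s):
    """Count each letter in s, ignoring case, digits, and punctuation."""
    return Counter(ch for ch in s.lower() if ch.isalpha())


counts = letter_counts('AabC123qwer!!>>??')
distinct = len(counts)                                   # letters that occur at all
singletons = sum(1 for n in counts.values() if n == 1)   # letters that occur exactly once
print(distinct, singletons)  # 7 6
```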

Zipping strings together at arbitrary index and step (Python)

I am working in Python 2.7. I am trying to create a function which can zip a string into a larger string starting at an arbitrary index and with an arbitrary step.
For example, I may want to zip the string ##*#* into the larger string TNAXHAXMKQWGZESEJFPYDMYP starting at the 5th character with a step of 3. The resulting string should be:
TNAXHAX#MK#QW*GZ#ES*EJFPYDMYP
The working function that I came up with is
# Insert one character of string at every nth position, starting after the ith position of text
text="TNAXHAXMKQWGZESEJFPYDMYP"
def zip_in(string,text,i,n):
    text=list(text)
    for c in string:
        text.insert(i+n-1,c)
        i +=n
    text = ''.join(text)
    print text
This function produces the desired result, but I feel that it is not as elegant as it could be.
Further, I would like it to be general enough that I can zip in a string backwards, that is, starting at the ith position of the text, I would like to insert the string in one character at a time with a backwards step.
For example, I may want to zip the string ##*#* into the larger string TNAXHAXMKQWGZESEJFPYDMYP starting at the 22nd position with a step of -3. The resulting string should be:
TNAXHAXMKQW*GZ#ES*EJ#FP#YDMYP
With my current function, I can do this by setting n negative, but if I want a step of -3, I need to set n as -2.
All of this leads me to my question:
Is there a more elegant (or Pythonic) way to achieve my end?
Here are some related questions which don't provide a general answer:
Pythonic way to insert every 2 elements in a string
Insert element in Python list after every nth element
Merge Two strings Together at N & X
You can combine two functions to get your result: izip_longest from itertools (standard library; it is named zip_longest in Python 3) and chunked from the third-party more_itertools package, which has to be installed separately.
from itertools import izip_longest
from more_itertools import chunked

# Parameters
s1 = 'ABCDEFGHIJKLMNOPQ' # your string
s2 = '####' # your string of elements to add
int_from = 4 # position from which we start adding letters
step = 2 # insert one element of s2 after every 2 letters
return_list = list(s1)[:int_from] # keep the first int_from elements unchanged
for letter, char in izip_longest(chunked(list(s1)[int_from:], step), s2, fillvalue=''):
    return_list.extend(letter)
    return_list.append(char)
Then get your string back with:
''.join(return_list)
Output for the parameters above:
'ABCDEF#GH#IJ#KL#MNOPQ'
What does izip_longest(chunked(list(s1)[int_from:], step), s2, fillvalue='') return?
for letter, char in izip_longest(chunked(list(s1)[int_from:], step), s2, fillvalue=''):
    print(letter, char)
Output:
(['E', 'F'], '#')
(['G', 'H'], '#')
(['I', 'J'], '#')
(['K', 'L'], '#')
(['M', 'N'], '')
(['O', 'P'], '')
(['Q'], '')
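For the backwards case the question also asks about, one possible approach is to reverse the text, insert forwards, and reverse the result. The sketch below uses only slicing and str.join, so it needs no third-party package. The function name interleave and its parameters first and gap are hypothetical and use a different parametrization from the asker's i and n: first is the index in the original text before which the first character is inserted, and gap is the number of original characters between insertions. The argument values shown are the ones that reproduce the two example outputs from the question.

```python
def interleave(text, extra, first, gap):
    """Insert the characters of extra into text.

    The first character goes before text[first]; each following one goes
    after `gap` more characters of text. Leftover characters of extra are
    appended at the end if text runs out.
    """
    pieces = [text[:first]]
    pos = first
    for ch in extra:
        pieces.append(ch)
        pieces.append(text[pos:pos + gap])
        pos += gap
    pieces.append(text[pos:])
    return "".join(pieces)


text = "TNAXHAXMKQWGZESEJFPYDMYP"

# Forward example from the question (i=5, n=3 there: first = i+n-1, gap = n-1).
forward = interleave(text, "##*#*", 7, 2)

# Backward example: work on the reversed text, then reverse the result.
backward = interleave(text[::-1], "##*#*", 5, 2)[::-1]

print(forward)   # TNAXHAX#MK#QW*GZ#ES*EJFPYDMYP
print(backward)  # TNAXHAXMKQW*GZ#ES*EJ#FP#YDMYP
```

Building a list of slices and joining once avoids the repeated list.insert calls of the original function, each of which shifts the tail of the list.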

Delete repeated characters in strings in cell array

I have a cell array like this :
Input = {'CEEEGH';'CCEEG';'ABCDEFF';'BCFGG';'BCDEEG';'BEFFH';'AACEGH'}
How can I delete the repeated characters so that each character appears only once in each string of Input? The expected output should be like this:
Output = {'CEGH';'CEG';'ABCDEF';'BCFG';'BCDEG';'BEFH';'ACEGH'}
Use:
cellfun(@unique,Input,'UniformOutput',0)
ans =
'CEGH'
'CEG'
'ABCDEF'
'BCFG'
'BCDEG'
'BEFH'
'ACEGH'
EDIT:
To preserve the original ordering in case the letters are not sorted, as @thewaywewalk commented, you can use:
cellfun(@(x) unique(x,'stable'),Input,'UniformOutput',0)

How to change a dictionary in to a string

def dict_to_str(d):
    '''(dict) -> str
    Returns a string containing each key and value in d. Keys and values
    are separated by a space. Each key-value pair is separated by a
    comma.
    >>> dict_to_str({3:4, 5:6})
    '3 4, 5 6'
    '''
The following is in Python.
exDict = {1:3, 2:4}
# This will give you a string that looks like "{1: 3, 2: 4}".
stringDict = str(exDict)
At this point you have a string representation of the dict. What you need to do now is remove the curly brackets and colons by replacing them with an empty string. Because str() puts a space after each colon, this leaves a single space between each key and its value, which is the form you want.
for char in stringDict:
    if char in "{}:":
        stringDict = stringDict.replace(char, "")
That should do the trick.
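An alternative that avoids post-processing the output of str() is to build the string directly from the key-value pairs. The sketch below fills in the dict_to_str stub from the question with str.join. It assumes the dict's iteration order is the order you want (insertion order on Python 3.7 and later), and it is my reading of the docstring rather than the answerer's method.

```python
def dict_to_str(d):
    """Return 'key value' pairs from d, separated by ', '."""
    # Format each pair as "key value", then join the pairs with ", ".
    return ", ".join("{} {}".format(key, value) for key, value in d.items())


print(dict_to_str({3: 4, 5: 6}))  # 3 4, 5 6
```

Unlike the replace-based approach, this leaves any braces or colons that occur inside string keys or values untouched, and it does not add quotes around string keys.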
