Remove middle B from each element in the list - python-3.x

Suppose I have a list called name:
name=['ACCBCDB','CCABACB','CAABBCB']
I want to use Python to remove the middle B from each element in the list.
The output should be:
['ACCCDB','CCAACB','CAABCB']

def ter(s):
    return s[3:-3]
name=['ACBBDBA','CCABACB','CABBCBB']
xx=[ter(s) for s in name]
z=xx
print(z)
Output:
['B', 'B', 'B']
I did the reverse: I want to delete the B in the middle and keep the other parts of each element.

name = ['ACCBCDB','CCABACB','CAABBCB']
name_without_middle = []
for oldstr in name:
    midlen = len(oldstr) // 2  # index of the middle character
    newstr = oldstr[:midlen] + oldstr[midlen+1:]
    name_without_middle.append(newstr)
print(name_without_middle)
Returns
['ACCCDB', 'CCAACB', 'CAABCB']
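The loop above can also be wrapped in a small helper and used in a list comprehension. This is only a minimal sketch: the name remove_middle is made up for illustration, and it assumes, like the answer above, that "middle" means the character at index len(s) // 2 (for an even-length string that is the right-hand one of the two central characters).

```python
def remove_middle(s):
    """Return s without the character at index len(s) // 2 (hypothetical helper)."""
    mid = len(s) // 2
    # Everything before the middle index plus everything after it.
    return s[:mid] + s[mid + 1:]


name = ['ACCBCDB', 'CCABACB', 'CAABBCB']
name_without_middle = [remove_middle(s) for s in name]
print(name_without_middle)
```

Note that this removes whatever character sits in the middle; it does not check that the character is actually a B.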

Related

Replace an item in list if it starts with a precise character

I am fairly new to python and have a task to solve.
I have a list of strings, each of which is a hexadecimal number. I want to replace some items with '0' if they do not start with the right characters.
So, for example, I have
List = ['0800096700000000', '090000000000025d', '0b0000000000003c', '0500051b014f0000']
and I want, say, to only have the data that starts with "0b" and "05", and I want to replace the others by "0".
For now, I have this:
multiplex = ('0b', '05')
List = ['0800096700000000', '090000000000025d', '0b0000000000003c', '0500051b014f0000']
List = [x for x in List if x.startswith(multiplex)]
This gives me the following result:
['0b0000000000003c', '0500051b014f0000']
Although I would like the following result:
['0', '0', '0b0000000000003c', '0500051b014f0000']
I cannot index the specific item I wish to change because the actual data is way too large for that...
Can someone help?
You should use an if/else to determine what value to return, not a trailing if, which decides whether a value should be in the list at all.
my_list = ['0800096700000000', '090000000000025d', '0b0000000000003c', '0500051b014f0000']
multiplex = ('0b', '05')
my_new_list = [x if x.startswith(multiplex) else '0' for x in my_list]
print(my_new_list)
# Sample output:
# ['0', '0', '0b0000000000003c', '0500051b014f0000']
Your multiplex strings are two characters long, so a single-character string such as '0' can never start with them. Try if x.startswith(multiplex) or (len(str(x)) < 2 and x.startswith("0")), or if x.startswith(multiplex) or str(x) == "0".
List = [x if x.startswith(multiplex) else '0' for x in List]
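The difference between the two comprehension forms discussed above can be seen side by side. This is a small sketch using the sample data from the question; the names data, filtered and replaced are chosen here for illustration.

```python
multiplex = ('0b', '05')
data = ['0800096700000000', '090000000000025d',
        '0b0000000000003c', '0500051b014f0000']

# A trailing "if" filters: items that fail the test are dropped.
filtered = [x for x in data if x.startswith(multiplex)]

# A leading "a if cond else b" maps: every item is kept or replaced,
# so the result has the same length as the input.
replaced = [x if x.startswith(multiplex) else '0' for x in data]

print(filtered)
print(replaced)
```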

Find multiple words or strings in an if statement [duplicate]

How can I check if any of the strings in an array exists in another string?
For example:
a = ['a', 'b', 'c']
s = "a123"
if a in s:
    print("some of the strings found in s")
else:
    print("no strings found in s")
How can I replace the if a in s: line to get the appropriate result?
You can use any:
a_string = "A string is more than its parts!"
matches = ["more", "wholesome", "milk"]
if any(x in a_string for x in matches):
    print("at least one match found")
Similarly to check if all the strings from the list are found, use all instead of any.
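As a short sketch of the any/all distinction, using the same sample strings as above (the result names any_found and all_found are made up here). Passing a generator expression rather than a list lets any() and all() stop at the first decisive element.

```python
a_string = "A string is more than its parts!"
matches = ["more", "wholesome", "milk"]

# True as soon as one candidate is found in the string.
any_found = any(x in a_string for x in matches)

# True only if every candidate is found in the string.
all_found = all(x in a_string for x in matches)

print(any_found, all_found)
```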
any() is by far the best approach if all you want is True or False, but if you want to know specifically which string/strings match, you can use a couple things.
If you want the first match (with False as a default):
match = next((x for x in a if x in str), False)
If you want to get all matches (including duplicates):
matches = [x for x in a if x in str]
If you want to get all non-duplicate matches (disregarding order):
matches = {x for x in a if x in str}
If you want to get all non-duplicate matches in the right order:
matches = []
for x in a:
    if x in str and x not in matches:
        matches.append(x)
You should be careful if the strings in a or str get longer. The straightforward solutions take O(S*(A^2)), where S is the length of str and A is the sum of the lengths of all strings in a. For a faster solution, look at the Aho-Corasick algorithm for string matching, which runs in linear time O(S+A).
Just to add some diversity with regex:
import re
if any(re.findall(r'a|b|c', str, re.IGNORECASE)):
    print('possible matches thanks to regex')
else:
    print('no matches')
Or, if your list is too long to write out by hand: any(re.findall('|'.join(map(re.escape, a)), str, re.IGNORECASE))
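When the pattern is built by joining a list, any regex metacharacters in the list items (such as . or +) would be interpreted as pattern syntax. A minimal sketch that escapes each item first; the helper name contains_any is hypothetical.

```python
import re


def contains_any(text, needles):
    """Hypothetical helper: True if any needle occurs in text, ignoring case."""
    if not needles:
        # An empty alternation would match everything, so handle it explicitly.
        return False
    # re.escape makes each needle match literally.
    pattern = "|".join(re.escape(n) for n in needles)
    return re.search(pattern, text, re.IGNORECASE) is not None


print(contains_any("a123", ['a', 'b', 'c']))
print(contains_any("abc", ['.']))
```

re.search is used here instead of re.findall because only a yes/no answer is needed and it can stop at the first match.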
A surprisingly fast approach is to use set:
a = ['a', 'b', 'c']
str = "a123"
if set(a) & set(str):
    print("some of the strings found in str")
else:
    print("no strings found in str")
This works if a does not contain any multiple-character values (in which case use any as listed above). If so, it's simpler to specify a as a string: a = 'abc'.
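The limitation mentioned above is easy to demonstrate. In this sketch (variable names are illustrative), set(s) contains only the individual characters of s, so a multi-character candidate can never be in the intersection even when it is a substring.

```python
s = "a123"

# Single-character candidates: the set intersection works.
a = ['a', 'b', 'c']
single_char_hit = bool(set(a) & set(s))

# Multi-character candidates: the set approach misses '12',
# while the any() approach finds it as a substring.
words = ['12', 'zz']
set_hit = bool(set(words) & set(s))
any_hit = any(w in s for w in words)

print(single_char_hit, set_hit, any_hit)
```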
You need to iterate on the elements of a.
a = ['a', 'b', 'c']
str = "a123"
found_a_string = False
for item in a:
    if item in str:
        found_a_string = True
if found_a_string:
    print("found a match")
else:
    print("no match found")
a = ['a', 'b', 'c']
str = "a123"
a_match = [True for match in a if match in str]
if True in a_match:
    print("some of the strings found in str")
else:
    print("no strings found in str")
jbernadas already mentioned the Aho-Corasick-Algorithm in order to reduce complexity.
Here is one way to use it in Python:
Download aho_corasick.py from here
Put it in the same directory as your main Python file and name it aho_corasick.py
Try the algorithm with the following code:
from aho_corasick import aho_corasick #(string, keywords)
print(aho_corasick(string, ["keyword1", "keyword2"]))
Note that the search is case-sensitive
The third-party regex module, which is recommended in the Python docs, supports this:
import regex
words = {'he', 'or', 'low'}
p = regex.compile(r"\L<name>", name=words)
m = p.findall('helloworld')
print(m)
output:
['he', 'low', 'or']
A compact way to find multiple strings in another list of strings is to use set.intersection. This executes much faster than list comprehension in large sets or lists.
>>> astring = ['abc','def','ghi','jkl','mno']
>>> bstring = ['def', 'jkl']
>>> a_set = set(astring) # convert list to set
>>> b_set = set(bstring)
>>> matches = a_set.intersection(b_set)
>>> matches
{'def', 'jkl'}
>>> list(matches) # if you want a list instead of a set
['def', 'jkl']
>>>
Just some more info on how to get all list elements available in the string:
a = ['a', 'b', 'c']
str = "a123"
list(filter(lambda x: x in str, a))
It depends on the context.
If you want to check for a single literal (any single character such as a, e, w, etc.), in is enough:
original_word = "hackerearcth"
if 'h' in original_word:
    print("YES")
If you want to check whether any of the characters of original_word appears in your input, use any:
if any(your_required in yourinput for your_required in original_word):
    print("YES")
If you want all of the characters of original_word to appear in your input, use all:
original_word = ['h', 'a', 'c', 'k', 'e', 'r', 'e', 'a', 'r', 't', 'h']
yourinput = str(input()).lower()
if all(requested_word in yourinput for requested_word in original_word):
    print("yes")
flog = open('test.txt', 'r')
flogLines = flog.readlines()
strlist = ['SUCCESS', 'Done','SUCCESSFUL']
res = False
for line in flogLines:
    for fstr in strlist:
        if line.find(fstr) != -1:
            print('found')
            res = True
if res:
    print('res true')
else:
    print('res false')
I would use this kind of function for speed:
def check_string(string, substring_list):
    for substring in substring_list:
        if substring in string:
            return True
    return False
Yet another solution with set, using set intersection, as a one-liner:
subset = {"some" ,"words"}
text = "some words to be searched here"
if len(subset & set(text.split())) == len(subset):
    print("All values present in text")
if subset & set(text.split()):
    print("At least one value present in text")
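The same two word-level checks can also be written with the set comparison operators. A small sketch under the same assumption as above, namely that splitting on whitespace is good enough to find the words; the names words_in_text, all_present and any_present are illustrative.

```python
subset = {"some", "words"}
text = "some words to be searched here"

words_in_text = set(text.split())

# "<=" on sets tests for subset: every wanted word is in the text.
all_present = subset <= words_in_text

# isdisjoint() is True when the sets share nothing, so negate it.
any_present = not subset.isdisjoint(words_in_text)

print(all_present, any_present)
```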
If you want exact matches of words then consider word tokenizing the target string. I use the recommended word_tokenize from nltk:
from nltk.tokenize import word_tokenize
Here is the tokenized string from the accepted answer:
a_string = "A string is more than its parts!"
tokens = word_tokenize(a_string)
tokens
Out[46]: ['A', 'string', 'is', 'more', 'than', 'its', 'parts', '!']
The accepted answer gets modified as follows:
matches_1 = ["more", "wholesome", "milk"]
[x in tokens for x in matches_1]
Out[42]: [True, False, False]
As in the accepted answer, the word "more" is still matched. If "mo" becomes a match string, however, the accepted answer still finds a match. That is a behavior I did not want.
matches_2 = ["mo", "wholesome", "milk"]
[x in a_string for x in matches_2]
Out[43]: [True, False, False]
Using word tokenization, "mo" is no longer matched:
[x in tokens for x in matches_2]
Out[44]: [False, False, False]
That is the additional behavior that I wanted. This answer also responds to the duplicate question here.
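If installing nltk is not an option, a similar whole-word check can be approximated with the standard re module and \b word boundaries. This is a different technique from word_tokenize, shown only as a sketch: the helper name contains_word is hypothetical, and \b only behaves as expected when the search word starts and ends with a letter, digit or underscore.

```python
import re


def contains_word(text, word):
    """Hypothetical helper: True if word occurs in text as a whole word."""
    # re.escape keeps the word literal; \b requires a word boundary on both sides.
    pattern = r"\b" + re.escape(word) + r"\b"
    return re.search(pattern, text) is not None


a_string = "A string is more than its parts!"
results_1 = [contains_word(a_string, w) for w in ["more", "wholesome", "milk"]]
results_2 = [contains_word(a_string, w) for w in ["mo", "wholesome", "milk"]]
print(results_1)
print(results_2)
```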
data = "firstName and favoriteFood"
mandatory_fields = ['firstName', 'lastName', 'age']
# for each
for field in mandatory_fields:
    if field not in data:
        print("Error, missing req field {0}".format(field))
# still fine, multiple if statements
if ('firstName' not in data or
        'lastName' not in data or
        'age' not in data):
    print("Error, missing a req field")
# not very readable, list comprehension
missing_fields = [x for x in mandatory_fields if x not in data]
if len(missing_fields) > 0:
    print("Error, missing fields {0}".format(", ".join(missing_fields)))

How to Sort Alphabets

Input : abcdABCD
Output : AaBbCcDd
ms=[]
n = input()
for i in n:
    ms.append(i)
ms.sort()
print(ms)
It gives me ABCDabcd.
How to sort this in python?
Without having to import anything, you could probably do something like this:
arr = "abcdeABCDE"
temp = sorted(arr, key = lambda i: (i.lower(), i))
result = "".join(temp)
print(result) # AaBbCcDdEe
The key will take in each element of arr and sort it first by lower-casing it, then if it ties, it will sort it based on its original value. It will group all similar letters together (A with a, B with b) and then put the capital first.
Use a sorting key:
ms = "abcdABCD"
sorted_ms = sorted(ms, key=lambda letter:(letter.upper(), letter.islower()))
# sorted_ms = ['A', 'a', 'B', 'b', 'C', 'c', 'D', 'd']
sorted_str = ''.join(sorted_ms)
# sorted_str = 'AaBbCcDd'
Why this works:
You can specify the criteria by which to sort by using the key argument in the sorted function, or the list.sort() method - this expects a function or lambda that takes the element in question, and outputs a new criteria by which to sort it. If that "new criteria" is a tuple, then the first element takes precedence - if it's equal, then the second argument, and so on.
So, the lambda I provided here returns a 2-tuple:
(letter.upper(), letter.islower())
letter.upper() as the first element here means that the strings are going to be sorted lexicographically, but case-insensitively (as it will sort them as if they were all uppercase). Then, I use letter.islower() as the second element, which is True if the letter is lowercase and False otherwise. When sorting, False comes before True, which means that if you give it a capital letter and the same lowercase letter, the capital letter will come first.
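To make the tuple comparison concrete, here is a small sketch that prints the keys generated for 'a' and 'A' and then sorts the sample input. The function name sort_key is introduced here only to give the lambda above a name.

```python
def sort_key(letter):
    # First compare case-insensitively, then put uppercase (False) before lowercase (True).
    return (letter.upper(), letter.islower())


# Both letters share the first element 'A'; the second element breaks the tie.
keys = [sort_key(c) for c in "aA"]
print(keys)  # [('A', True), ('A', False)]

ms = "abcdABCD"
sorted_str = "".join(sorted(ms, key=sort_key))
print(sorted_str)  # AaBbCcDd
```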
Try this:
>>> s='abcdABCD'
>>> ''.join(sorted(s,key=lambda x:x.lower()))
'aAbBcCdD'

Why does this give me "list index out of range"?

Write a function called stop_at_z that iterates through a list of strings. Using a while loop, append each string to a new list until the string that appears is “z”. The function should return the new list.
def stop_at_z(str):
    d = 0
    x=[]
    str1 = list(str)
    while True :
        if str1[d] != 'Z' :
            x.append(str1[d])
            d+=1
        if str1[d] == 'Z' :
            break
    return x
You're getting this error because d keeps increasing infinitely if there is no uppercase 'Z' in the string. Instead, you should only stay in the while loop while the full length of the input string has not been reached:
def stop_at_z(inputstr):
    d = 0
    x=[]
    str1 = list(inputstr)
    while d<len(inputstr) :
        if str1[d] == 'z' :
            break
        else:
            x.append(str1[d])
            d+=1
    return x
Note that you can achieve the same thing using takewhile() from the itertools module:
from itertools import takewhile
def stop_at_z(inputstr):
    return list(takewhile(lambda i: i != 'z', inputstr))
print(stop_at_z("hello wzrld"))
Output:
['h', 'e', 'l', 'l', 'o', ' ', 'w']
The way you are doing it, the search for "z" is case-sensitive. Try something like:
if str1[d].strip().lower() == "z":
This strips leading and trailing whitespace and then converts the str1 element to lower case (both calls simply return a modified string, so the original is unchanged) and compares the result to a lowercase z.
What if the string 'z' is never in the list?
Then it keeps on increasing the index and eventually runs into an error.
Just restricting the loop to the length of the list should help.
def stop_at_z(str):
    x=[]
    str1 = list(str)
    # range() bounds the index, so it can never run past the end of the list
    for d in range(0,len(str1)) :
        if str1[d] != 'z' :
            x.append(str1[d])
        else:
            break
    return x
Basically, we needed to have a list that could have all the characters until we get "z". One way we could do that is we first convert the string into a list and iterate that list and add every character to a new list ls until we get "z". But the problem is we may get a string that doesn't have "z" so we need to iterate till the length of that list. I hope it is clear.
def stop_at_z(s):
    ls = []
    idx = 0
    x = list(s)
    while idx<len(x):
        if x[idx]=="z":
            break
        ls.append(x[idx])
        idx+=1
    return ls
It's my first time posting here, but I use this while loop:
def stop_at_z(input_list):
    print (input_list)
    output_list=[]
    index=0
    while index< len(input_list):
        if input_list[index] != "z":
            output_list.append(input_list[index])
            index+=1
        else:
            break
    return output_list
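The answers above all come down to checking the index against the length before reading the element. A compact sketch of that idea, with both conditions in the while header; it assumes the lowercase "z" from the task statement and is only one possible way to write it.

```python
def stop_at_z(items):
    result = []
    index = 0
    # "and" short-circuits: items[index] is only read when index is in range,
    # so an input without "z" ends the loop instead of raising IndexError.
    while index < len(items) and items[index] != "z":
        result.append(items[index])
        index += 1
    return result


print(stop_at_z(["a", "b", "z", "c"]))
print(stop_at_z(["a", "b"]))
```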

Function that prints each element of a list and its index per line

I am trying to write a function that can print the element and index of a list. I want to do this without using the enumerate built in function and do it using for loops.
I was able to print out the element but I couldn't figure out a way to loop the index of my list.
Is there any good way I could work around this? Many thanks.
You could do this, simply iterating over the range of numbers regarding the length of your list:
def item_and_index(my_list):
    for i in range(len(my_list)):
        print(my_list[i], i)
This is exactly what you need, a function using for loops and not the enumerate function.
>>> L = ['a', 'b', 'c']
>>> for i in range(len(L)):
... print(i, L[i])
...
0 a
1 b
2 c
You could also try this:
i = 0
for elem in L:
    print(i, elem)
    i += 1
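Another variation that still avoids enumerate is to pair the indices with the elements using zip. This is a sketch only: the function name item_and_index_lines is made up, and it returns the lines instead of printing them directly so the result is easy to check.

```python
def item_and_index_lines(my_list):
    """Hypothetical helper: build one 'index element' string per list item."""
    lines = []
    # zip pairs each index from range() with the element at that position.
    for i, elem in zip(range(len(my_list)), my_list):
        lines.append(f"{i} {elem}")
    return lines


for line in item_and_index_lines(['a', 'b', 'c']):
    print(line)
```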

Resources