Which character comes first? - string

The input is a word, and I want to know whether 'a' or 'b' comes first.
I can compute a_index = word.find('a') and b_index = word.find('b') and compare them; if a_index is smaller, 'a' comes first. But if 'b' isn't in word, .find() returns -1, so simply comparing b_index < a_index would wrongly report that 'b' comes first. I could handle this by adding more if-statements, but is there a cleaner way?
Function description:
input: word, [list of characters]
output: the character in the list that appears first in the word
Example: first_instance("butterfly", ['a', 'u', 'e']) returns 'u'

You can create a function that takes word and a collection of chars. Convert those chars into a set for fast lookup, then loop over word and take the first letter found, e.g.:
# Chars can be any iterable whose elements are characters
def first_of(word, chars):
    # Remove duplicates and get O(1) lookup time
    lookup = set(chars)
    # Use optional default argument to next to return `None` if no matches found
    return next((ch for ch in word if ch in lookup), None)
Example:
>>> first_of('bob', 'a')
>>> first_of('bob', 'b')
'b'
>>> first_of('abob', 'ab')
'a'
>>> first_of("butterfly", ['a', 'u', 'e'])
'u'
This way you're only ever iterating over word once and short-circuit on the first letter found instead of running multiple finds, storing the results and then computing the lowest index.
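If the position is wanted along with the character, the same single-pass idea can be sketched with enumerate. The name first_of_with_index is hypothetical and not part of the answer above.

```python
def first_of_with_index(word, chars):
    """Return (index, char) for the first char of `word` found in `chars`, or None."""
    lookup = set(chars)  # O(1) membership tests
    # enumerate yields (index, char) pairs; next() stops at the first match
    return next(((i, ch) for i, ch in enumerate(word) if ch in lookup), None)


print(first_of_with_index("butterfly", ["a", "u", "e"]))  # (1, 'u')
print(first_of_with_index("bob", "a"))  # None
```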

Make a list without the missing chars and then take the one with the lowest position.
def first_found(word, chars):
    places = [x for x in ((word.find(c), c) for c in chars) if x[0] != -1]
    if not places:
        # no char was found
        return None
    else:
        return min(places)[1]
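The min call works here because Python compares tuples element by element, so the pair with the smallest position wins. A small illustration of the intermediate values, using the example word from the question:

```python
word = "butterfly"
# Pair each character with its position; find returns -1 for a miss
pairs = [(word.find(c), c) for c in ["a", "u", "e"]]
places = [p for p in pairs if p[0] != -1]

print(pairs)        # [(-1, 'a'), (1, 'u'), (4, 'e')]
print(places)       # [(1, 'u'), (4, 'e')]
print(min(places))  # (1, 'u')
```

Without the filter, min(pairs) would be (-1, 'a'), which is exactly the problem described in the question.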

In any case you need to check the type of the input:
if isinstance(your_input, str):
    a_index = your_input.find('a')
    b_index = your_input.find('b')
    # Compare the a and b indexes
elif isinstance(your_input, list):
    # Note: list.index raises ValueError if the item is missing
    a_index = your_input.index('a')
    b_index = your_input.index('b')
    # Compare the a and b indexes
else:
    # Do something else
    pass
EDIT:
def first_instance(word, lst):
    indexes = {}
    for c in lst:
        # Record each character once, skipping those missing from word (find returns -1)
        if c not in indexes and c in word:
            indexes[c] = word.find(c)
    # default=None covers the case where no character was found
    return min(indexes, key=indexes.get, default=None)
It returns the character from the list lst that comes first in word, or None if none of them occurs.
If you need the index of that letter instead, replace the return statement with this:
return min(indexes.values(), default=None)

Related

Compare lists with multiple elements

I have a list of tuples as follows: s=[(1,300),(250,800),(900,1000),(1200,1300),(1500,2100)]
I need to compare the upper limit of each tuple with the lower limit of the next tuple. If the lower limit of the next tuple is less than the upper limit of the previous one, it should throw an error; otherwise it should pass.
Example:
s=[(1,300),(250,800),(900,1000),(1200,1300),(1500,2100)] - This should throw an error, as 250<300. If it fails for any pair, it should throw an error immediately.
s=[(1,300),(350,800),(900,1000)] - This should not throw an error, as 350>300.
I have tried something like this:
s=[(1,300),(250,800),(900,1000)]
s= (sorted(s))
print(s)
def f(mytuple, currentelement):
    return mytuple[mytuple.index(currentelement) + 1]
for i in s:
    j = f(s,i)
    if i[0]<j[1]:
        print("fail")
    else:
        print("pass")
But it's not working. Help me out here.
zip() combines lists (or any iterables) into a new iterable of tuples. It stops when the shortest input is exhausted. Imagine:
a = [1, 2, 3, 4]
b = ['a', 'b', 'c']
zipped = list(zip(a, b)) # Gives: [(1, 'a'), (2, 'b'), (3, 'c')]
# 4 is skipped, because there is no element remaining in b
We can use this to get all adjacent pairs in s in an elegant, easy-to-read form:
s=[(1,300),(250,800),(900,1000)]
s= (sorted(s))
pairs = zip(s, s[1:]) # zip s from index 0 with s from index 1
Now that we have pairs in the form ((a0, a1), (b0, b1)), you can easily check whether a1 > b0 in a loop:
for a,b in pairs:
    if a[1] > b[0]:
        print("fail")
    else:
        print("pass")
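The question asks for an error to be thrown at the first failing pair rather than a printed "fail". One way to sketch that on top of the same zip pairing is below; first_overlap and check_no_overlap are hypothetical names, and ValueError is an assumed choice of exception.

```python
def first_overlap(ranges):
    """Return the first adjacent pair (a, b) where b starts before a ends, else None."""
    ordered = sorted(ranges)
    for a, b in zip(ordered, ordered[1:]):
        if b[0] < a[1]:
            return a, b
    return None


def check_no_overlap(ranges):
    """Raise ValueError as soon as an overlapping pair is found."""
    pair = first_overlap(ranges)
    if pair is not None:
        raise ValueError(f"{pair[1]} starts before {pair[0]} ends")


check_no_overlap([(1, 300), (350, 800), (900, 1000)])  # passes silently

try:
    check_no_overlap([(1, 300), (250, 800), (900, 1000)])
except ValueError as exc:
    print("fail:", exc)
```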
Two problems I see:
1) You're running into an IndexError (list index out of range), as the last element (900,1000) tries to look up the following element, which does not exist.
You can skip the last element by slicing the list with [:-1] in your loop.
2) In addition, your "if" condition seems to be backwards. You seem to want to compare i[1] with j[0] instead of i[0] with j[1].
s=[(1,300),(250,800),(900,1000)]
s= (sorted(s))
print(s)
def f(mytuple, currentelement):
    return mytuple[mytuple.index(currentelement) + 1]
for i in s[:-1]:
    j = f(s,i)
    if i[1]>j[0]:
        print("fail")
    else:
        print("pass")
See How to loop through all but the last item of a list? for more details.

Check if element is occurring very first time in python list

I have a list with values occurring multiple times. I want to loop over the list and check whether each value is occurring for the very first time.
For example, let's say I have a list like:
L = ['a','a','a','b','b','b','b','b','e','e','e'.......]
Now, at the first occurrence of each element, I want to perform some set of tasks.
How do I detect the first occurrence of an element?
Thanks in Advance!!
Use a set to check whether you have already processed that item:
visited = set()
L = ['a','a','a','b','b','b','b','b','e','e','e'.......]
for e in L:
    if e not in visited:
        visited.add(e)
        # process first time tasks
    else:
        # process not first time tasks
        pass
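A runnable sketch of the same set-based loop, with the two placeholder branches replaced by simple bookkeeping. The function name split_first_occurrences and the shortened sample list are assumptions for illustration only.

```python
def split_first_occurrences(items):
    """Return (firsts, repeats): (index, value) pairs for first and later sightings."""
    visited = set()
    firsts, repeats = [], []
    for i, e in enumerate(items):
        if e not in visited:
            visited.add(e)
            firsts.append((i, e))   # the first-time task would go here
        else:
            repeats.append((i, e))  # the repeat task would go here
    return firsts, repeats


L = ['a', 'a', 'a', 'b', 'b', 'e', 'e']
firsts, repeats = split_first_occurrences(L)
print(firsts)  # [(0, 'a'), (3, 'b'), (5, 'e')]
```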
You can use unique_everseen from the itertools recipes.
This function returns a generator that yields only the first occurrence of each element.
Code
from itertools import filterfalse
def unique_everseen(iterable, key=None):
    "List unique elements, preserving order. Remember all elements ever seen."
    # unique_everseen('AAAABBBCCDAABBB') --> A B C D
    # unique_everseen('ABBCcAD', str.lower) --> A B C D
    seen = set()
    seen_add = seen.add
    if key is None:
        for element in filterfalse(seen.__contains__, iterable):
            seen_add(element)
            yield element
    else:
        for element in iterable:
            k = key(element)
            if k not in seen:
                seen_add(k)
                yield element
Example
lst = ['a', 'a', 'b', 'c', 'b']
for x in unique_everseen(lst):
    print(x) # Do something with the element
Output
a
b
c
The function unique_everseen also accepts a key function used to compare elements. This is useful in many cases, for example if you also need to know the position of each first occurrence.
Example
lst = ['a', 'a', 'b', 'c', 'b']
for i, x in unique_everseen(enumerate(lst), key=lambda x: x[1]):
    print(i, x)
Output
0 a
2 b
3 c
Why not use this?
L = ['a','a','a','b','b','b','b','b','e','e','e'.......]
for idxL, L_idx in enumerate(L):
    if (L.index(L_idx) == idxL):
        print("This is first occurrence")
For very long lists this is less efficient than tracking seen values in a set, because L.index rescans the list from the start on every iteration, but it is more direct to write.

Using the function Map, count the number of words that start with 'S' in list in Python3

I'd like to get the total count of elements in a list that start with 'S', using only the map function and a lambda expression. What I've tried wraps the result in list(), which is not what I want.
Below is the code I've tried, which is not the desired approach.
input_list = ['San Jose', 'San Francisco', 'Santa Fe', 'Houston']
desireList = list(map(lambda x: x if x[0] == 'S' else '', input_list))
desireList.remove('')
print(len(desireList))
It's more Pythonic to use sum with a generator expression for your purpose:
sum(w.startswith('S') for w in input_list)
or:
sum(f == 'S' for f, *_ in input_list)
or if you still would prefer to use map and lambda:
sum(map(lambda x: x[0] == 'S', input_list))
With your sample input, all of the above would return: 3
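These sums work because bool is a subclass of int in Python, so each True counts as 1. A quick check of the three variants on the sample input, plus one difference worth knowing; the variable names here are only for illustration.

```python
input_list = ['San Jose', 'San Francisco', 'Santa Fe', 'Houston']

by_startswith = sum(w.startswith('S') for w in input_list)
by_unpacking = sum(f == 'S' for f, *_ in input_list)
by_map = sum(map(lambda x: x[0] == 'S', input_list))

print(by_startswith, by_unpacking, by_map)  # 3 3 3

# Only startswith tolerates an empty string; x[0] and `f, *_` both raise on ''
print(sum(w.startswith('S') for w in ['', 'Salem']))  # 1
```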
You can try this:
count = list(map(lambda x:x[0]=='S',input_list)).count(True)
Here's an alternate approach
list( map( lambda x : x[0].lower() , input_list ) ).count('s')
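Generate a list of the lowercased first character of each item, and count the number of 's' characters in that list. Note that, unlike the other answers, this also counts items that start with a lowercase 's'.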

Python string duplicates

I have a list
a=['apple', 'elephant', 'ball', 'country', 'lotus', 'potato']
I am trying to find the longest element in the list that has no duplicate letters.
For example, the script should return "country", as it doesn't contain any duplicate letters.
Please help
You could also use collections.Counter for this:
from collections import Counter
a = ['apple', 'elephant', 'ball', 'country', 'lotus', 'potato']
a = set(a)
no_dups = []
for word in a:
    counts = Counter(word)
    if all(v == 1 for v in counts.values()):
        no_dups.append(word)
print(max(no_dups, key = len))
This follows the procedure below:
1) Converts a to a set, since we only need to look at a word once, just in case a contains duplicates.
2) Creates a Counter() object of each word.
3) Only appends words that have a count of 1 for each letter, using all().
4) Gets the longest word from this resultant list, using max().
Note: This does not handle ties, you may need to do further work to handle this.
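One possible way to handle ties is to return every word of maximal length instead of a single one. This is only a sketch: the name longest_without_repeats is hypothetical, and it uses dict.fromkeys rather than set so that the original word order is kept.

```python
from collections import Counter


def longest_without_repeats(words):
    """Return every word of maximal length whose letters are all distinct."""
    # dict.fromkeys drops duplicate words while keeping the original order
    candidates = [
        w for w in dict.fromkeys(words)
        if all(n == 1 for n in Counter(w).values())
    ]
    if not candidates:
        return []
    longest = max(len(w) for w in candidates)
    return [w for w in candidates if len(w) == longest]


a = ['apple', 'elephant', 'ball', 'country', 'lotus', 'potato']
print(longest_without_repeats(a))  # ['country']
print(longest_without_repeats(['lotus', 'apple', 'brick']))  # ['lotus', 'brick']
```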
def has_dup(x):
    unique = set(x) # pick unique letters
    return any([x.count(e) != 1 for e in unique]) # find if any letter appear more than once
def main():
    a = ['apple', 'elephant', 'ball', 'country', 'lotus', 'potato']
    a = [e for e in a if not has_dup(e)] # filter out duplicates
    chosen = max(a, key=len) # choose with max length
    print(chosen)
if __name__ == '__main__':
    main()

How to test if an input has only specific characters

I'm trying to make a script that tests if there are characters in the input that are not A, T, C, G, and if there are, then the input is false.
I don't have any clue how to start. I would love it if someone could help.
Thanks!
The following function can check a string to find out if it only contains the characters A, T, C, and G.
def check_string(code):
    return all(character in {'A', 'T', 'C', 'G'} for character in code)
Expressed using sets:
The list function takes a string and returns a list of its characters. The set function takes a list and returns a set (with duplicates discarded).
>>> def check_string(code):
...     return set(list('ACTG')).issuperset(set(list(code)))
...
>>> check_string('IT')
False
>>> check_string('ACTG')
True
>>> check_string('')
True
>>> check_string('ACT')
True
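The set view also makes it easy to report which characters are invalid, not just whether any are. A small sketch using set difference; invalid_characters and VALID are hypothetical names.

```python
VALID = set('ATCG')


def invalid_characters(code):
    """Return the sorted list of characters in `code` that are not A, T, C or G."""
    return sorted(set(code) - VALID)


print(invalid_characters('ACTG'))       # []
print(invalid_characters('GATTACA-X'))  # ['-', 'X']
```

An empty result means the input is valid, which matches check_string returning True.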
output = True
nucl_dict = {'A':'T', 'T':'A', 'C':'G', 'G':'C'}
n = input("Insert DNA sequence: ").upper()
for c in n:
    # Flag any character that is not one of the four nucleotides
    if c not in ("A", "T", "C", "G"):
        output = False
if not output:
    print('Issue detected please try again')
    print(n)
    print(''.join(nucl_dict.get(nucl, nucl) for nucl in n))
else:
    print("All good")

Resources