keys = ['a','H','c','D','m','l']
values = ['a','c','H','D']
category = []
for index, i in enumerate(keys):
    for j in values:
        if j in i:
            category.append(j)
            break
        if index == len(category):
            category.append("other")
print(category)
My expected output is ['a', 'H', 'c', 'D', 'other', 'other']
But I am getting ['a', 'other', 'H', 'c', 'D', 'other']
EDIT: OP edited his question multiple times.
Python documentation break statement:
It terminates the nearest enclosing loop.
You break out of the outer loop using the "break" statement. The execution never even reaches the inner while loop.
Now, to solve your problem of categorising strings:
xs = ['Am sleeping', 'He is walking','John is eating']
ys = ['walking','eating','sleeping']
categories = []
for i, x in enumerate(xs):
    for y in ys:
        if y in x:
            categories.append(y)
            break
    # Nothing was appended for this string, so no category matched.
    if len(categories) == i:
        categories.append("other")
print(categories) # ['sleeping', 'walking', 'eating']
Iterate over both lists and check whether any category matches. If one does, append it to the categories list and continue with the next string to categorise. If no category matched (that is, the number of entries in categories still equals the current index; the index is 0-based, so after a match the count is always one ahead of it), categorise the string as "other".
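An alternative to comparing the match count with the index is Python's for/else construct, where the else branch runs only if the loop finished without hitting break. A minimal sketch, assuming a helper named categorise and a default label parameter (both are illustrative names, not part of the original answer):

```python
def categorise(texts, labels, default="other"):
    """Return the first label found in each text, or `default` if none match."""
    result = []
    for text in texts:
        for label in labels:
            if label in text:
                result.append(label)
                break
        else:
            # Runs only when the inner loop ended without `break`,
            # i.e. no label matched this text.
            result.append(default)
    return result


xs = ['Am sleeping', 'He is walking', 'John is eating', 'She is reading']
ys = ['walking', 'eating', 'sleeping']
print(categorise(xs, ys))  # ['sleeping', 'walking', 'eating', 'other']
```

This avoids tracking the index at all, at the cost of relying on a construct some readers find unfamiliar.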
Related
In my case
list A = [a,a,a,b,b,c]
I have to find the occurrences of the elements in the list and print their counts.
For example, print a=3, b=2 and c=1.
Just use JavaScript. The filter() operation is perfect for this:
* def data = ['a', 'c', 'b', 'c', 'c', 'd']
* def count = data.filter(x => x == 'c').length
* assert count == 3
Further reading: https://github.com/karatelabs/karate#json-transforms
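For readers doing the same count in plain Python rather than in Karate, a minimal sketch using collections.Counter from the standard library could look like this (the list A mirrors the question's data; the variable names are illustrative):

```python
from collections import Counter

A = ['a', 'a', 'a', 'b', 'b', 'c']

# Counter maps each distinct element to how many times it occurs.
counts = Counter(A)

# Format the counts in the order the elements were first seen.
summary = ", ".join(f"{item}={n}" for item, n in counts.items())
print(summary)  # a=3, b=2, c=1
```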
I am trying to validate a list say:
X = ['a','c', 'c', 'b', 'd','d','d']
against a custom ordered list:
Y = ['a','b','d']
In this case X validated against Y should return True regardless of the extra elements and duplicates in it as long as it goes with the order in Y and contains at least two elements.
Case Examples:
X = ['a','b'] # Returns True
X = ['d','a', 'a', 'c','b'] # Returns False
X = ['c','a','b', 'b', 'c'] # Returns True
The most I can do right now is remove the duplicates and extra elements. I am not trying to sort them using the custom list; I just need to validate the order. What I have tried so far is to create a dictionary where the value is the index of the order. Can anyone point me in the right direction?
from itertools import zip_longest, groupby
okay = list(x == y for y, (x, _) in zip_longest(
    (y for y in Y if y in X), groupby(x for x in X if x in Y)))
print(len(okay) >= 2 and all(okay))
First we discard the unnecessary elements from both lists. Then we can use groupby to collapse runs of equal elements of X. For example, your first example ['a', 'c', 'c', 'b', 'd', 'd', 'd'] first becomes ['a', 'b', 'd', 'd', 'd'] (by discarding the unnecessary 'c'), then [('a', _), ('b', _), ('d', _)]. If we compare its keys element by element to Y without the unnecessary bits, and there are at least 2 of them, we have a match. If the order were violated (e.g. ['b', 'c', 'c', 'a', 'd', 'd', 'd']), there would be a False in okay, and the check would fail. If an extra element appeared somewhere, there would be a comparison with None (thanks to zip_longest), and again there would be a False in okay.
This can be improved by use of sets to speed up the membership lookup.
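A minimal sketch of that improvement, assuming the elements are hashable: the membership tests go through sets built once up front, so each lookup is constant time on average instead of a scan of the whole list. The function name follows_order and the minimum parameter are illustrative, not from the original answer.

```python
from itertools import groupby, zip_longest


def follows_order(X, Y, minimum=2):
    """Check that the elements of X that occur in Y appear in Y's order."""
    X_set, Y_set = set(X), set(Y)  # built once for fast membership tests
    # Elements of Y that actually occur in X, in Y's order.
    wanted = [y for y in Y if y in X_set]
    # Elements of X that occur in Y, with consecutive repeats collapsed.
    runs = [key for key, _ in groupby(x for x in X if x in Y_set)]
    # zip_longest pads the shorter side with None, which compares unequal.
    okay = [x == y for y, x in zip_longest(wanted, runs)]
    return len(okay) >= minimum and all(okay)


Y = ['a', 'b', 'd']
print(follows_order(['a', 'c', 'c', 'b', 'd', 'd', 'd'], Y))  # True
print(follows_order(['d', 'a', 'a', 'c', 'b'], Y))            # False
```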
Create a new list from X that only contains the elements from Y without duplicates. Then, similarly, remove all elements from Y not contained in X and deduplicate. Then your check is just a simple equality check.
def deduplicate(iterable):
    seen = set()
    return [seen.add(x) or x for x in iterable if x not in seen]

def goes_with_order(X, Y):
    Xs = set(X)
    Ys = set(Y)
    X = deduplicate(x for x in X if x in Ys)
    Y = deduplicate(y for y in Y if y in Xs)
    return X == Y
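The question also requires at least two matching elements, which a plain equality check does not enforce. A possible sketch that adds this, assuming a hypothetical minimum parameter and a renamed function goes_with_order_min so it does not shadow the version above:

```python
def deduplicate(iterable):
    """Keep the first occurrence of each element, preserving order."""
    seen = set()
    out = []
    for x in iterable:
        if x not in seen:
            seen.add(x)
            out.append(x)
    return out


def goes_with_order_min(X, Y, minimum=2):
    Xs, Ys = set(X), set(Y)
    kept_x = deduplicate(x for x in X if x in Ys)
    kept_y = deduplicate(y for y in Y if y in Xs)
    # Require enough shared elements as well as matching order.
    return len(kept_x) >= minimum and kept_x == kept_y


Y = ['a', 'b', 'd']
print(goes_with_order_min(['a', 'b'], Y))                 # True
print(goes_with_order_min(['d', 'a', 'a', 'c', 'b'], Y))  # False
print(goes_with_order_min(['a'], Y))                      # False
```

Because duplicates are dropped wherever they occur, an input such as ['a', 'b', 'a'] passes this check, whereas the groupby version rejects it. Which behaviour is wanted depends on how strictly "order" is meant.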
I have a list with values occurring multiple times. I want to loop over the list and check whether a value is occurring for the very first time.
For example, let's say I have a list like:
L = ['a','a','a','b','b','b','b','b','e','e','e'.......]
Now, at every first occurrence of an element, I want to perform some set of tasks.
How do I get the first occurrence of an element?
Thanks in Advance!!
Use a set to check if you had processed that item already:
visited = set()
L = ['a','a','a','b','b','b','b','b','e','e','e'.......]
for e in L:
    if e not in visited:
        visited.add(e)
        # process first time tasks
    else:
        # process not first time tasks
        pass
You can use unique_everseen from itertools recipes.
This function returns a generator which yields only the first occurrence of each element.
Code
from itertools import filterfalse
def unique_everseen(iterable, key=None):
    "List unique elements, preserving order. Remember all elements ever seen."
    # unique_everseen('AAAABBBCCDAABBB') --> A B C D
    # unique_everseen('ABBCcAD', str.lower) --> A B C D
    seen = set()
    seen_add = seen.add
    if key is None:
        for element in filterfalse(seen.__contains__, iterable):
            seen_add(element)
            yield element
    else:
        for element in iterable:
            k = key(element)
            if k not in seen:
                seen_add(k)
                yield element
Example
lst = ['a', 'a', 'b', 'c', 'b']
for x in unique_everseen(lst):
    print(x) # Do something with the element
Output
a
b
c
The function unique_everseen also accepts a key for comparing elements. This is useful in many cases, for example if you also need to know the position of each first occurrence.
Example
lst = ['a', 'a', 'b', 'c', 'b']
for i, x in unique_everseen(enumerate(lst), key=lambda x: x[1]):
    print(i, x)
Output
0 a
2 b
3 c
Why not use this?
L = ['a','a','a','b','b','b','b','b','e','e','e'.......]
for idxL, L_idx in enumerate(L):
    # L.index returns the position of the first occurrence of the value.
    if L.index(L_idx) == idxL:
        print("This is first occurrence")
For very long lists, it is less efficient than building a set prior to the loop, but seems more direct to write.
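The cost comes from L.index scanning the list from the start on every iteration. If the positions of the first occurrences are needed, one possible single-pass sketch records them in a dictionary instead (first_positions is an illustrative name, and the short list stands in for the question's elided data):

```python
def first_positions(items):
    """Map each distinct element to the index of its first occurrence."""
    first = {}
    for idx, item in enumerate(items):
        # setdefault stores the index only the first time `item` is seen.
        first.setdefault(item, idx)
    return first


L = ['a', 'a', 'a', 'b', 'b', 'e']
print(first_positions(L))  # {'a': 0, 'b': 3, 'e': 5}
```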
I have a ranked list of elements as below:
ranked_list_1 = ['G','A','M','S','D']
I need to re-rank the list as below:
1) Re-rank as:
re_ranked_list_1 = ['A','M','D','E','G','S']
Logic: 'G' and 'S' should always be in the last 2 positions, and the new element 'E' should be added to the list just before those last 2 positions.
I think this is what you want. This will put 'G' and 'S' at the end, in the order they appear in the list.
ordered_list = list()
final_terms = ['E']
for item in ranked_list_1:
    if item in ['G', 'S']:
        final_terms.append(item)
    else:
        ordered_list.append(item)
output_list = ordered_list + final_terms
print(output_list)
>>>['A', 'M', 'D', 'E', 'G', 'S']
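The same logic can be wrapped in a reusable function. This is only a sketch: the name rerank and the parameters new_item and trailing are assumptions introduced for illustration.

```python
def rerank(ranked, new_item='E', trailing=('G', 'S')):
    """Move `trailing` items to the end and place `new_item` just before them."""
    head = [item for item in ranked if item not in trailing]
    # Trailing items keep the order in which they appear in the input.
    tail = [item for item in ranked if item in trailing]
    return head + [new_item] + tail


ranked_list_1 = ['G', 'A', 'M', 'S', 'D']
print(rerank(ranked_list_1))  # ['A', 'M', 'D', 'E', 'G', 'S']
```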
I have a list that contains multiple sets of strings, and I would like to find, for each set, how many of its strings do not appear in any of the other sets.
For example, I have the following list:
targets = [{'B', 'C', 'A'}, {'E', 'C', 'D'}, {'F', 'E', 'D'}]
For the above, desired output is:
[2, 0, 1]
because in the first set, A and B are not found in any of the other sets, for the second set, there are no unique elements to the set, and for the third set, F is not found in any of the other sets.
I thought about approaching this backwards; finding the intersection of each set and subtracting the length of the intersection from the length of the list, but set.intersection(*) does not appear to work on strings, so I'm stuck:
set1 = {'A', 'B', 'C'}
set2 = {'C', 'D', 'E'}
set3 = {'D', 'E', 'F'}
targets = [set1, set2, set3]
>>> set.intersection(*targets)
set()
The issue you're having is that there are no strings shared by all three sets, so your intersection comes up empty. That's not a string issue, it would work the same with numbers or anything else you can put in a set.
The only way I see to do a global calculation over all the sets, then use that to find the number of unique values in each one is to first count all the values (using collections.Counter), then for each set, count the number of values that showed up only once in the global count.
from collections import Counter
def unique_count(sets):
    count = Counter()
    for s in sets:
        count.update(s)
    return [sum(count[x] == 1 for x in s) for s in sets]
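As a quick check against the question's data, here is a self-contained variant of the same idea. Building the Counter from itertools.chain.from_iterable is a stylistic substitution for the update loop, not a change in behaviour.

```python
from collections import Counter
from itertools import chain


def unique_count(sets):
    # Count how many sets each value appears in (values are unique per set).
    count = Counter(chain.from_iterable(sets))
    # For each set, count the values that occur in no other set.
    return [sum(count[x] == 1 for x in s) for s in sets]


targets = [{'B', 'C', 'A'}, {'E', 'C', 'D'}, {'F', 'E', 'D'}]
print(unique_count(targets))  # [2, 0, 1]
```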
Try something like below:
Get symmetric difference with every set. Then intersect with the given input set.
def symVal(index,targets):
    bseSet = targets[index]
    symSet = bseSet
    for j in range(len(targets)):
        if index != j:
            symSet = symSet ^ targets[j]
    print(len(symSet & bseSet))
for i in range(len(targets)):
    symVal(i,targets)
Your code example doesn't work because it's finding the intersection of all of the sets, which is empty (since no element occurs in every set). You want to find the difference between each set and the union of all other sets. For example:
set1 = {'A', 'B', 'C'}
set2 = {'C', 'D', 'E'}
set3 = {'D', 'E', 'F'}
targets = [set1, set2, set3]
result = []
for set_element in targets:
    result.append(len(set_element.difference(set.union(*[x for x in targets if x is not set_element]))))
print(result)
(note that [x for x in targets if x is not set_element] is just the list of all the other sets)