Is there a python function to get all indexes from unique values? - python-3.x

I know there are methods like set() or np.unique() to get the unique values from a list. But I am looking for a way to get the indexes of the values that occur exactly once.
example = [0,1,1,2,3,3,4]
What I am looking for is
desired_index_list = [0,3,6]
Any suggestions?

I don't know of any prebuilt solution, so you probably need to write your own. There are several approaches, but in plain Python you can build a count dictionary with collections.Counter and keep the indexes of those elements of the original list whose count is 1.
>>> from collections import Counter
>>> example = [0,1,1,2,3,3,4]
>>> counted = Counter(example)
>>> desired_index_list = [index for index, elem in enumerate(example) if counted[elem] == 1]
>>> desired_index_list
[0, 3, 6]

You can do this as a one-liner with a list comprehension:
from collections import Counter
[example.index(x) for x, y in Counter(example).items() if y == 1]
(Counter(example).items() yields a tuple for each item (x) and its number of occurrences (y); the comprehension returns the index of the item if its count is 1.)
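The one-liner above calls example.index(x) once per unique value, and each call scans the list again. A minimal sketch of a single-pass variant follows; the function name unique_value_indexes is hypothetical, and it assumes the values are hashable and that Python 3.7+ is used, so that dictionaries keep insertion order.

```python
def unique_value_indexes(values):
    """Return the indexes of the values that occur exactly once."""
    first_seen = {}  # value -> index of its first occurrence
    counts = {}      # value -> number of occurrences
    for index, value in enumerate(values):
        first_seen.setdefault(value, index)
        counts[value] = counts.get(value, 0) + 1
    # Dictionaries keep insertion order, so the indexes come out ascending.
    return [first_seen[value] for value, count in counts.items() if count == 1]


example = [0, 1, 1, 2, 3, 3, 4]
print(unique_value_indexes(example))  # [0, 3, 6]
```

For short lists the difference is negligible, so the one-liner is fine there.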

Related

How to subtract adjacent items in list with unknown length (python)?

I am given a list of lists, for example myList = [[70,83,90],[19,25,30]], and need to return a list of lists containing the differences between adjacent elements. For this example the result would be [[13,7],[6,5]]: the absolute values of (70-83), (83-90), (19-25), and (25-30). I'm not sure how to iterate through each list to subtract adjacent elements without already knowing its length. So far I have just separated the list of lists into two separate lists.
list_one = myList[0]
list_two = myList[1]
Please let me know what you would recommend, thank you!
A custom generator can return two adjacent items at a time from a sequence without knowing the length:
def two(sequence):
    i = iter(sequence)
    try:
        a = next(i)
    except StopIteration:
        return  # empty sequence: there are no pairs to yield
    for b in i:
        yield a, b
        a = b
original = [[70,83,90],[19,25,30]]
result = [[abs(a-b) for a,b in two(sequence)]
          for sequence in original]
print(result)
[[13, 7], [6, 5]]
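Well, for each list you can get its number of elements with len() and loop over the indexes of adjacent pairs like this:
res = []
for my_list in list_of_lists:
    res.append([])
    for i in range(len(my_list) - 1):
        # Absolute difference between element i and its right-hand neighbour
        res[-1].append(abs(my_list[i] - my_list[i + 1]))
Each result is added to res[-1], the sublist created for the current list.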
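Another way to pair adjacent elements without knowing the length is to zip each list with a copy of itself shifted by one position. This is a minimal sketch; the function name adjacent_differences is hypothetical, and it assumes the inner sequences support slicing (lists or tuples).

```python
def adjacent_differences(list_of_lists):
    """Absolute differences between neighbouring elements of each inner list."""
    # zip stops at the shorter input, so seq and seq[1:] pair up (x0, x1), (x1, x2), ...
    return [[abs(a - b) for a, b in zip(seq, seq[1:])] for seq in list_of_lists]


my_list = [[70, 83, 90], [19, 25, 30]]
print(adjacent_differences(my_list))  # [[13, 7], [6, 5]]
```

Inner lists with fewer than two elements simply produce an empty list.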

Number of elements in a nested sublist (starting from the first Index)

My input is as follows:
Nums=[['D'],['A','B'],['A','C'],['C','A']]
The output should be:
D=0
A=2
C=1
B=0
I have tried the following:
nums=[['D'],['A','B'],['A','C'],['C','A']]
d=dict()
for i in (nums):
    for j in i:
        if(len(i)==1):
            d[j]=0
        else:
            d[j]=1
print(d)
Am I on the right path in choosing a dictionary for the counts?
Suggestions using any data structure are welcome.
import collections
seen_dict = collections.Counter([x[0] for x in Nums if len(x) > 1])
To obtain a dictionary with the number of occurrences of each element minus one, you can use a dictionary comprehension over a Counter:
from collections import Counter  # Counter is part of the standard library
flat = sum(nums, [])  # concatenates the sublists into one flat list
count = Counter(flat).items()  # counts the elements; items() gives (element, count) pairs
result = {c[0]: c[1] - 1 for c in count}  # dictionary comprehension: each count minus one
Compressed form:
result = {c[0]:c[1]-1 for c in Counter(sum(nums, [])).items()}
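The question does not spell out the counting rule, so the following is only a sketch under one assumption: a letter's value is the number of sublists with more than one element that it leads, which is the reading used by the first answer. Unlike a bare Counter, it also lists the letters whose count is zero. The variable names are illustrative.

```python
from collections import Counter

nums = [['D'], ['A', 'B'], ['A', 'C'], ['C', 'A']]

# Start every letter at zero so letters that never lead a sublist still appear.
result = {letter: 0 for sublist in nums for letter in sublist}

# Count how often each letter is the first element of a multi-element sublist.
leading = Counter(sublist[0] for sublist in nums if len(sublist) > 1)
result.update(leading)

print(result)  # {'D': 0, 'A': 2, 'B': 0, 'C': 1}
```

For this input both readings (leading-letter counts and occurrences minus one) happen to give the same numbers; they differ on other inputs.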

How can I append a different element for each list in a column in pandas?

I have a dataframe, df, with lists in a specific column, col_a. For example,
df = pd.DataFrame()
df['col_a'] = [[1,2,3], [3,4], [5,6,7]]
I want to use conditions on these lists and apply specific modifications, including appends. For example, imagine that if the length of the list is > 2, I want to append another element, which is the sum of the last two elements of the current list. So, considering the first list above, I have [1, 2, 3] and I want to have [1, 2, 3, 5].
What I tried to do was:
df.loc[:, col_a] = df[col_a].apply(
    lambda value: value.append(value[-2]+value[-1])
    if len(value) > 1 else value)
But the result in that column is None for all the elements of the column.
Can someone help me, please?
Thank you very much in advance.
The issue is that append modifies the list in place and returns None, so the lambda returns None as well. You need to concatenate two lists instead, which returns a new list. A working example with a dummy column would be:
df = pd.DataFrame({'cola':[[1,2],[2,3,4]], 'dum':[1,2]})
df['cola']=df.cola.apply(lambda x: (x+[sum(x[-2:])] if len(x)>2 else x))
If you prefer a named function over a lambda, try this:
def my_logic_for_list(values):
    if len(values) > 2:
        return values + [values[-2]+values[-1]]
    return values
df['new_a'] = df['col_a'].apply(my_logic_for_list)
Calling append inside the lambda does not help, because append returns None and the lambda would return that None instead of the list.
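The difference between append and list concatenation can be seen without pandas. This is a small sketch on plain lists; the variable names are illustrative.

```python
values = [1, 2, 3]

# list.append mutates the list in place and returns None.
returned = values.append(values[-2] + values[-1])
print(returned)  # None
print(values)    # [1, 2, 3, 5]

# Concatenation builds and returns a new list, leaving the original untouched.
original = [1, 2, 3]
extended = original + [original[-2] + original[-1]]
print(extended)  # [1, 2, 3, 5]
print(original)  # [1, 2, 3]
```

This is why a function passed to apply should return the concatenated list rather than the result of append.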

Simple way to remove duplicate item in a list [duplicate]

The program says "TypeError: 'int' object is not iterable":
list=[3,3,2]
print(list)
k=0
for i in list:
    for l in list:
        if(l>i):
            k=l
    for j in k:
        if(i==j):
            del list[i]
print(list)
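An easy way to do this is with np.unique. Note that it returns a sorted NumPy array rather than a list, so the original order is not preserved.
import numpy as np
l=[3,3,2]
print(np.unique(l))  # prints [2 3]
Hope that helps!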
Without using any numpy the easiest way I can think of is to start with a new list and then loop through the old list and append the values to the new list that are new. You can cheaply keep track of what has already been used with a set.
def delete_duplicates(old_list):
    used = set()
    new_list = []
    for i in old_list:
        if i not in used:
            used.add(i)
            new_list.append(i)
    return new_list
Also, a tip on your code: you are getting the TypeError from the for j in k line. k is just an integer, so you can't iterate over it. range(k) creates an iterable of the integers from 0 to k-1, so for j in range(k) would at least run.
Just build another list:
>>> list1=[3,2,3]
>>> list2=[]
>>> for i in list1:
...     if i in list2:
...         pass
...     else:
...         list2.append(i)
...
>>> list2
[3, 2]
You can always add list1 = list2 at the end if you prefer.
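You can use set(), provided the order of the elements does not need to be preserved:
t = [3, 3, 2]
print(t) # prints [3, 3, 2]
t = list(set(t))
print(t) # prints the unique values, e.g. [2, 3]; set order is not guaranteed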
To remove duplicate items from a list and get a list of unique elements, you can use set() as below:
Example:
>>> list1 = [1,1,2,2,3,3,3]
>>> new_unique_list = list(set(list1))
>>> new_unique_list
[1, 2, 3]
You have the following line in your code which produces the error:
for j in k:
k is an int and cannot be iterated over. You probably meant to write for j in list.
There are a couple of good answers already. If you really want to write the code yourself, however, I'd recommend a functional style instead of working in place (i.e. modifying the original list). For example, the following function is basically a port of Haskell's Data.List.nub.
def nub(items):
    '''
    Remove duplicate elements from a list.
    Singleton lists and empty lists cannot contain duplicates and are therefore returned immediately.
    For lists of length two or more, split into head and tail, filter the head from the tail and then recurse on the filtered list.
    '''
    if len(items) <= 1:
        return items
    else:
        head, *tail = items
        return [head] + nub([i for i in tail if i != head])
This is, in my opinion, easier to read and saves you the trouble associated with multiple iteration indexes (since you create a new list).
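For hashable items there is also a short built-in idiom that keeps the first occurrence of each value in its original position. This is a minimal sketch; the function name unique_in_order is hypothetical, and it relies on dictionaries preserving insertion order (guaranteed from Python 3.7).

```python
def unique_in_order(items):
    """Drop duplicates while keeping the first occurrence of each value."""
    # dict keys are unique and remember the order in which they were inserted.
    return list(dict.fromkeys(items))


print(unique_in_order([3, 3, 2]))  # [3, 2]
print(unique_in_order([3, 2, 3]))  # [3, 2]
```

Unlike list(set(...)), the result order does not depend on how the set happens to store its elements.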

How to get list of indices for elements whose value is the maximum in that list

Suppose I have a list l=[3,4,4,2,1,4,6]
I would like to obtain a subset of this list containing the indices of elements whose value is max(l).
In this case, list of indices will be [1,2,5].
I am using this approach to solve a problem where a list of numbers is provided, for example
l=[1,2,3,4,3,2,2,3,4,5,6,7,5,4,3,2,2,3,4,3,4,5,6,7]
I need to identify the element with the maximum number of occurrences. However, if more than one element appears the same number of times,
I need to choose the element which is greater in magnitude.
Suppose I apply a Counter to l and get {1:5,2:5,3:4...}; I have to choose '2' instead of '1'.
Please suggest how to solve this.
Edit-
The problem begins like this,
1) a list is provided as an input
l=[1 4 4 4 5 3]
2)I run a Counter on this to obtain the counts of each unique element
3)I need to obtain the key whose value is maximum
4)Suppose the Counter object contains multiple entries whose value is maximum,
as in Counter({1:4,2:4,3:4,5:1})
I have to choose 3 as the key whose value is 4.
5)So far, I have been able to get the Counter object, and I have separated the key/value lists using k=counter.keys();v=counter.values()
6)I want to get the indices whose values are max in v
If I run v.index(max(v)), I get the first index whose value matches max value, but I want to obtain the list of indices whose value is max, so that I can obtain corresponding list of keys and obtain max key in that list.
With long lists, using NumPy or another numerical library would be helpful; otherwise you can simply use either
l.index(max(l))
or
max(range(len(l)), key=l.__getitem__)
These, however, return only one of the possibly many argmaxes (the first one).
So for your problem, since you want the maximum that appears last, you can search the reversed list:
len(l)-l[::-1].index(max(l))-1
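To collect every index that holds the maximum, rather than just one, a comprehension over enumerate is enough. The sketch below applies it to the Counter from the question's step 4; the variable names are illustrative, and the keys and values are turned into lists first because in Python 3 they are views without an index method.

```python
from collections import Counter

counter = Counter({1: 4, 2: 4, 3: 4, 5: 1})
keys = list(counter.keys())
values = list(counter.values())

highest = max(values)
# All positions whose count equals the highest count.
max_indices = [i for i, v in enumerate(values) if v == highest]
# The keys at those positions; the largest of them is the one to choose.
candidate_keys = [keys[i] for i in max_indices]

print(max_indices)          # [0, 1, 2]
print(max(candidate_keys))  # 3
```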
If I understood correctly, the following should do what you want.
from collections import Counter
def get_largest_most_freq(lst):
    c = Counter(lst)
    # get the largest frequency
    freq = max(c.values())
    # get list of all the values that occur freq times
    items = [k for k, v in c.items() if v == freq]
    # return largest most frequent item
    return max(items)

def get_indexes_of_most_freq(lst):
    _max = get_largest_most_freq(lst)
    # get list of all indexes that have a value matching _max
    return [i for i, v in enumerate(lst) if v == _max]
>>> lst = [3,4,4,2,1,4,6]
>>> get_largest_most_freq(lst)
4
>>> get_indexes_of_most_freq(lst)
[1, 2, 5]
>>> lst = [1,2,3,4,3,2,2,3,4,5,6,7,5,4,3,2,2,3,4,3,4,5,6,7]
>>> get_largest_most_freq(lst)
3
>>> get_indexes_of_most_freq(lst)
[2, 4, 7, 14, 17, 19]
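The two-step selection (highest frequency first, then largest value) can also be written as a single max call with a tuple key. This is a sketch only; the function name largest_most_frequent is hypothetical, and it assumes a non-empty list of mutually comparable values.

```python
from collections import Counter


def largest_most_frequent(lst):
    """Return the most frequent value; ties go to the larger value."""
    counts = Counter(lst)
    # Tuples compare element by element: first by count, then by the value itself.
    value, _count = max(counts.items(), key=lambda item: (item[1], item[0]))
    return value


print(largest_most_frequent([3, 4, 4, 2, 1, 4, 6]))  # 4
print(largest_most_frequent([1, 1, 2, 2, 3, 3, 5]))  # 3
```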
