I am trying to understand why the index of minus one behaves as it does with the built-in insert function.
# assigning a list to variable a.
a = [1, 2, 3, 4]
print(a) # we get [1, 2, 3, 4]
a.insert(-1, 5)
print(a) # we get [1, 2, 3, 5, 4]
# Why is this different from other list indexing using minus one?
It's not, actually. If you use
>>> a = [1, 2, 3, 4]
>>> a[-1]
4
The [-1] index gives you the last element. All is fine. So why does -1 in insert() put the new element before the last element, and not at the end? The answer is in the docs:
Insert an item at a given position. The first argument is the index of the element before which to insert, so a.insert(0, x) inserts at the front of the list, and a.insert(len(a), x) is equivalent to a.append(x).
So as you see, it might be counterintuitive, but it's all in the docs: -1 still refers to the last element, and insert places the new item before it. As a side note, I'd recommend reading wtfpython; maybe you'll find some other fancy quirks.
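To make the rule above concrete, here is a small sketch comparing the two calls the documentation mentions. The variable names are only for illustration.

```python
a = [1, 2, 3, 4]
a.insert(-1, 5)       # -1 refers to the last element, so 5 goes before it
print(a)              # [1, 2, 3, 5, 4]

b = [1, 2, 3, 4]
b.insert(len(b), 5)   # inserting before position len(b) behaves like append
print(b)              # [1, 2, 3, 4, 5]
```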
Good evening,
I have a table in the format of a list of lists. Each list is of the same length. I would like to obtain a smaller list based on the unique values of the numbers (1 list per number whichever shows up first).
tbl = [[1, 'aaa'], [2, 'aab'], [3, 'aac'], [4, 'GGC'], [4, 'GGH'], [6, 'GGS'], [7, 'aad']]
I've tried the following snippet of code:
tbl_simple = [list(x) for x in set(tuple(x) for x in tbl)]
But this line treats the whole list as one big unique value and I end up with the same table. I would like to filter on just the condition of the number (or some column of my choosing). The final result would look like this:
[[1, 'aaa'], [2, 'aab'], [3, 'aac'], [4, 'GGC'], [6, 'GGS'], [7, 'aad']]
Thank you for any assistance.
An easy non-one liner would be:
output = []
seen = set()
for num, letters in tbl:
    if num in seen:
        continue
    output.append([num, letters])
    seen.add(num)
Still thinking about a one-liner.
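One shorter variant, offered as a sketch rather than the definitive answer: a dict keeps the first row stored for each key, and on Python 3.7+ it also keeps insertion order. The helper name first_rows_by_column and its column parameter are made up here so that the filter column can be chosen.

```python
tbl = [[1, 'aaa'], [2, 'aab'], [3, 'aac'], [4, 'GGC'], [4, 'GGH'], [6, 'GGS'], [7, 'aad']]

def first_rows_by_column(rows, column=0):
    """Keep only the first row seen for each distinct value in `column`."""
    first_seen = {}
    for row in rows:
        # setdefault stores the row only if this key has not been seen yet
        first_seen.setdefault(row[column], row)
    # dicts preserve insertion order on Python 3.7+, so the original order survives
    return list(first_seen.values())

tbl_simple = first_rows_by_column(tbl)
print(tbl_simple)
# [[1, 'aaa'], [2, 'aab'], [3, 'aac'], [4, 'GGC'], [6, 'GGS'], [7, 'aad']]
```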
I have to find the harmonic mean of each row of a nested list that contains some negative values. I know the harmonic mean is only defined for positive values, so what can I do to compute it for my list?
I tried this:
x=[['a', 1, -3, 5], ['b', -2, 6, 8], ['c', 3, 7, -9]]
import statistics as s
y=[s.harmonic_mean(i[1:]) for i in x]
but I get statistics.StatisticsError for the negative values.
You probably want to use filter.
filter iterates over a list, or anything else that's iterable, and yields only the elements that satisfy a specific condition. It doesn't mutate the iterable you pass to it; it produces a new sequence of results.
For example (in Python 3, filter and map return lazy iterators, so wrap them in list() to see the values):
>>> numbers = [-1, 2, 3]
>>> list(filter(lambda i: i >= 0, numbers))
[2, 3]
Or, if you just want absolute values, you can use map, which applies a function to each element of an iterable, again without mutating it:
>>> list(map(abs, numbers))
[1, 2, 3]
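Applied to the list from the question, the filtering idea might look like the sketch below. It assumes that dropping the non-positive values is acceptable for your use case, which changes what the mean describes. It uses a list comprehension in place of filter, and the name positive_means is only illustrative.

```python
import statistics

x = [['a', 1, -3, 5], ['b', -2, 6, 8], ['c', 3, 7, -9]]

# For each row, skip the label and keep only the strictly positive numbers
positive_means = [
    statistics.harmonic_mean([value for value in row[1:] if value > 0])
    for row in x
]
print(positive_means)
```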
I have the following items from a Python dict:
[(2, [3, 4, 5]), (3, [1, 0, 0, 0, 1]), (4, [-1]), (10, [1, 2, 3])]
Now I want to sort them by the sum of each key's list of values, so for the first key the sum is 3+4+5=12.
I have written the following code that does the job:
def myComparator(a,b):
    print "Values(a,b): ",(a,b)
    sum_a=sum(a[1])
    sum_b=sum(b[1])
    print sum_a,sum_b
    print "Comparision Returns:",cmp(sum_a,sum_b)
    return cmp(sum_a,sum_b)

items.sort(myComparator)
print items
This is the output that I get after running the above:
Values(a,b): ((3, [1, 0, 0, 0, 1]), (2, [3, 4, 5]))
2 12
Comparision Returns: -1
Values(a,b): ((4, [-1]), (3, [1, 0, 0, 0, 1]))
-1 2
Comparision Returns: -1
Values(a,b): ((10, [1, 2, 3]), (4, [-1]))
6 -1
Comparision Returns: 1
Values(a,b): ((10, [1, 2, 3]), (3, [1, 0, 0, 0, 1]))
6 2
Comparision Returns: 1
Values(a,b): ((10, [1, 2, 3]), (2, [3, 4, 5]))
6 12
Comparision Returns: -1
[(4, [-1]), (3, [1, 0, 0, 0, 1]), (10, [1, 2, 3]), (2, [3, 4, 5])]
Now I am unable to understand how the comparator is working, which two values are being passed, and how many such comparisons would happen. Is it creating a sorted list of keys internally where it keeps track of each comparison made? Also, the behavior seems to be very random. I am confused; any help would be appreciated.
How many comparisons are made, and between which elements, is not documented, and in fact it can vary freely between implementations. The only guarantee is that if the comparison function is consistent, the method will sort the list.
CPython uses the Timsort algorithm to sort lists, so what you see is the order in which that algorithm performs the comparisons (if I'm not mistaken, for very short lists Timsort just uses insertion sort).
Python is not keeping track of "keys". It just calls your comparison function every time a comparison is needed, so your function can be called many more than len(items) times.
If you want to use keys you should use the key argument. In fact you could do:
items.sort(key=lambda x: sum(x[1]))
This will create the keys and then sort using the usual comparison operator on the keys. This is guaranteed to call the function passed by key only len(items) times.
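A quick way to check the "one call per item" claim is to count the calls yourself. This is a sketch in Python 3 syntax; the counter key_calls and the function sum_of_values are made-up names.

```python
items = [(2, [3, 4, 5]), (3, [1, 0, 0, 0, 1]), (4, [-1]), (10, [1, 2, 3])]
key_calls = 0

def sum_of_values(pair):
    """Key function: the sum of the list in the second slot of the pair."""
    global key_calls
    key_calls += 1
    return sum(pair[1])

items.sort(key=sum_of_values)
print(key_calls)  # 4, one call per item
print(items)      # [(4, [-1]), (3, [1, 0, 0, 0, 1]), (10, [1, 2, 3]), (2, [3, 4, 5])]
```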
Given that your list is:
[a,b,c,d]
The sequence of comparisons you are seeing is:
b < a # -1 true --> [b, a, c, d]
c < b # -1 true --> [c, b, a, d]
d < c # 1 false
d < b # 1 false
d < a # -1 true --> [c, b, d, a]
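To watch the comparison sequence on a current interpreter, here is a minimal sketch assuming Python 3, where list.sort no longer accepts a comparator directly and functools.cmp_to_key adapts one. The name compare_by_sum and the calls log are made up for illustration, and the exact pairs printed are an implementation detail that may differ between versions.

```python
from functools import cmp_to_key

items = [(2, [3, 4, 5]), (3, [1, 0, 0, 0, 1]), (4, [-1]), (10, [1, 2, 3])]
calls = []  # every (a, b) pair the sort asks about, recorded by dict key

def compare_by_sum(a, b):
    """Old-style comparator: negative, zero or positive, like Python 2's cmp()."""
    calls.append((a[0], b[0]))
    sum_a, sum_b = sum(a[1]), sum(b[1])
    return (sum_a > sum_b) - (sum_a < sum_b)

items.sort(key=cmp_to_key(compare_by_sum))
print(items)
for first, second in calls:
    print(first, "compared with", second)
```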
how the comparator is working
This is well documented:
Compare the two objects x and y and return an integer according to the outcome. The return value is negative if x < y, zero if x == y and strictly positive if x > y.
Instead of calling the cmp function you could have written:
sum_a=sum(a[1])
sum_b=sum(b[1])
if sum_a < sum_b:
    return -1
elif sum_a == sum_b:
    return 0
else:
    return 1
which two values are being passed
From your print statements you can see the two values that are passed. Let's look at the first iteration:
((3, [1, 0, 0, 0, 1]), (2, [3, 4, 5]))
What you are printing here is a tuple (a, b), so the actual values passed into your comparison function are
a = (3, [1, 0, 0, 0, 1])
b = (2, [3, 4, 5])
By means of your function, you then compare the sum of the two lists in each tuple, which you denote sum_a and sum_b in your code.
and how many such comparisons would happen?
I guess what you are really asking: How does the sort work, by just calling a single function?
The short answer is: it uses the Timsort algorithm, and in the worst case it calls the comparison function O(n * log n) times (that is, the number of calls is bounded by c * n * log n for some constant c > 0).
To understand what is happening, picture yourself sorting a list of values, say v = [4,2,6,3]. If you go about this systematically, you might do this:
1. Start at the first value, at index i = 0.
2. Compare v[i] with v[i+1].
3. If v[i+1] < v[i], swap them.
4. Increase i and repeat from step 2 until the pair at i == len(v) - 2 has been compared.
5. Start again at step 1, until a full pass makes no swaps.
So you get, i =
0: 2 < 4 => [2, 4, 6, 3] (swap)
1: 6 < 4 => [2, 4, 6, 3] (no swap)
2: 3 < 6 => [2, 4, 3, 6] (swap)
Start again:
0: 4 < 2 => [2, 4, 3, 6] (no swap)
1: 3 < 4 => [2, 3, 4, 6] (swap)
2: 6 < 4 => [2, 3, 4, 6] (no swap)
Start again - there will be no further swaps, so stop. Your list is sorted. In this example we have run through the list 3 times, and there were 3 * 3 = 9 comparisons.
Obviously this is not very efficient: for your four-element list, the sort() method only calls your comparator function 5 times. The reason is that it employs a more efficient sort algorithm than the simple one explained above.
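The simple procedure walked through above is a bubble sort. A small sketch that counts its comparisons follows; the function name bubble_sort_counting is made up for illustration.

```python
def bubble_sort_counting(values):
    """Sort a copy of `values` with the naive pass-and-swap method,
    returning the sorted list and the number of comparisons made."""
    v = list(values)
    comparisons = 0
    swapped = True
    while swapped:                       # keep making passes until one has no swaps
        swapped = False
        for i in range(len(v) - 1):      # i runs from 0 to len(v) - 2
            comparisons += 1
            if v[i + 1] < v[i]:
                v[i], v[i + 1] = v[i + 1], v[i]
                swapped = True
    return v, comparisons

print(bubble_sort_counting([4, 2, 6, 3]))  # ([2, 3, 4, 6], 9)
```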
Also the behavior seems to be very random.
Note that the sequence of values passed to your comparator function is not, in general, defined. However, the sort function performs as many comparisons as it needs to determine the final order of the values it receives.
Is it creating a sorted list of keys internally where it keeps track of each comparison made?
No, it is not keeping a list of keys internally. Rather, the sorting algorithm essentially iterates over the list you give it. In fact it works on sublists (already ordered runs that it then merges) to avoid doing too many comparisons. There is a nice visualization of how the sorting algorithm works at Visualising Sorting Algorithms: Python's timsort by Aldo Cortesi.
Basically, for a simple list such as [2, 4, 6, 3, 1] and the more complex list you provided, the sorting algorithm is the same.
The only differences are the complexity of the elements in the list and the scheme used to compare any two elements (e.g. the myComparator you provided).
There is a good description for Python Sorting: https://wiki.python.org/moin/HowTo/Sorting
First, the cmp() function:
cmp(...)
cmp(x, y) -> integer
Return negative if x<y, zero if x==y, positive if x>y.
You are using this line: items.sort(myComparator). This passes the function itself to sort, which then calls it for each pair of items it needs to compare; every such call returns -1, 0 or 1.
Since you want to sort based on the sum of each tuple's list, you could do this:
mylist = [(2, [3, 4, 5]), (3, [1, 0, 0, 0, 1]), (4, [-1]), (10, [1, 2, 3])]
sorted(mylist, key=lambda pair: sum(pair[1]))
This does, I think, exactly what you wanted: it sorts mylist based on the sum() of each tuple's list.
Is there any other argument than key, for example: value?
Arguments of sort and sorted
In Python 2, both sort and sorted have three keyword arguments: cmp, key and reverse.
L.sort(cmp=None, key=None, reverse=False) -- stable sort *IN PLACE*;
cmp(x, y) -> -1, 0, 1
sorted(iterable, cmp=None, key=None, reverse=False) --> new sorted list
Using key and reverse is preferred, because they work much faster than an equivalent cmp.
key should be a function which takes an item and returns a value to compare and sort by. reverse lets you reverse the sort order.
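As a brief illustration of key and reverse together, here is a sketch in Python 3 syntax with a made-up list of words.

```python
words = ["pear", "fig", "banana", "kiwi"]

# key=len sorts by word length; ties keep their original order (the sort is stable)
print(sorted(words, key=len))                # ['fig', 'pear', 'kiwi', 'banana']

# reverse=True flips the order, still keeping ties in their original order
print(sorted(words, key=len, reverse=True))  # ['banana', 'pear', 'kiwi', 'fig']
```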
Using key argument
You can use operator.itemgetter as a key argument to sort by second, third etc. item in a tuple.
Example
>>> from operator import itemgetter
>>> a = range(5)
>>> b = a[::-1]
>>> c = map(lambda x: chr(((x+3)%5)+97), a)
>>> sequence = list(zip(a,b,c))  # list() so it can be sorted repeatedly on Python 3 as well
# sort by first item in a tuple
>>> sorted(sequence, key = itemgetter(0))
[(0, 4, 'd'), (1, 3, 'e'), (2, 2, 'a'), (3, 1, 'b'), (4, 0, 'c')]
# sort by second item in a tuple
>>> sorted(sequence, key = itemgetter(1))
[(4, 0, 'c'), (3, 1, 'b'), (2, 2, 'a'), (1, 3, 'e'), (0, 4, 'd')]
# sort by third item in a tuple
>>> sorted(sequence, key = itemgetter(2))
[(2, 2, 'a'), (3, 1, 'b'), (4, 0, 'c'), (0, 4, 'd'), (1, 3, 'e')]
Explanation
Sequences can contain any objects, even ones that are not comparable with each other, but if we can define a function which produces something comparable for each of the items, we can pass this function as the key argument to sort or sorted.
itemgetter, in particular, creates such a function that fetches the given item from its operand. An example from its documentation:
After f = itemgetter(2), the call f(r) returns r[2].
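A short sketch of what itemgetter returns may help here, including the form with several indexes, which yields a tuple and so sorts by more than one field. The sample data is made up.

```python
from operator import itemgetter

row = (0, 4, 'd')

get_third = itemgetter(2)
print(get_third(row))              # d, the same as row[2]

get_second_then_first = itemgetter(1, 0)
print(get_second_then_first(row))  # (4, 0), a tuple of row[1] and row[0]

# Sort by the second field, breaking ties with the first field
rows = [(1, 'b'), (0, 'b'), (2, 'a')]
print(sorted(rows, key=itemgetter(1, 0)))  # [(2, 'a'), (0, 'b'), (1, 'b')]
```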
Mini-benchmark, key vs cmp
Just out of curiosity, key and cmp performance compared, smaller is better:
>>> from timeit import Timer
>>> Timer(stmt="sorted(xs,key=itemgetter(1))",setup="from operator import itemgetter;xs=range(100);xs=zip(xs,xs);").timeit(300000)
6.7079150676727295
>>> Timer(stmt="sorted(xs,key=lambda x:x[1])",setup="xs=range(100);xs=zip(xs,xs);").timeit(300000)
11.609490871429443
>>> Timer(stmt="sorted(xs,cmp=lambda a,b: cmp(a[1],b[1]))",setup="xs=range(100);xs=zip(xs,xs);").timeit(300000)
22.335839986801147
So, sorting with key seems to be at least twice as fast as sorting with cmp. Using itemgetter instead of lambda x: x[1] makes sort even faster.
Besides key=, the sort method of lists in Python 2.x could alternatively take a cmp= argument (not a good idea, it's been removed in Python 3); with either or none of these two, you can always pass reverse=True to have the sort go downwards (instead of upwards as is the default, and which you can also request explicitly with reverse=False if you're really keen to do that for some reason). I have no idea what that value argument you're mentioning is supposed to do.
Yes, it takes other arguments, but no value.
>>> print list.sort.__doc__
L.sort(cmp=None, key=None, reverse=False) -- stable sort *IN PLACE*;
cmp(x, y) -> -1, 0, 1
What would a value argument even mean?