Sort a list of tuples and return the first element of each tuple in Python [duplicate] - python-3.x

I have a dictionary of values read from two fields in a database: a string field and a numeric field. The string field is unique, so that is the key of the dictionary.
I can sort on the keys, but how can I sort based on the values?
Note: I have read Stack Overflow question here How do I sort a list of dictionaries by a value of the dictionary? and probably could change my code to have a list of dictionaries, but since I do not really need a list of dictionaries I wanted to know if there is a simpler solution to sort either in ascending or descending order.

Python 3.7+ or CPython 3.6
Dicts preserve insertion order in Python 3.7+. Same in CPython 3.6, but it's an implementation detail.
>>> x = {1: 2, 3: 4, 4: 3, 2: 1, 0: 0}
>>> {k: v for k, v in sorted(x.items(), key=lambda item: item[1])}
{0: 0, 2: 1, 1: 2, 4: 3, 3: 4}
or
>>> dict(sorted(x.items(), key=lambda item: item[1]))
{0: 0, 2: 1, 1: 2, 4: 3, 3: 4}
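For descending order, the same pattern accepts reverse=True. A minimal sketch, reusing the same x; the name by_value_desc is only for illustration and the result relies on Python 3.7+ insertion ordering:

```python
x = {1: 2, 3: 4, 4: 3, 2: 1, 0: 0}

# sorted() returns a list of (key, value) pairs; dict() rebuilds a dict
# whose insertion order is the sorted order (guaranteed from Python 3.7).
by_value_desc = dict(sorted(x.items(), key=lambda item: item[1], reverse=True))
print(by_value_desc)
```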
Older Python
It is not possible to sort a dictionary, only to get a representation of a dictionary that is sorted. Dictionaries are inherently orderless, but other types, such as lists and tuples, are not. So you need an ordered data type to represent sorted values, which will be a list—probably a list of tuples.
For instance,
import operator
x = {1: 2, 3: 4, 4: 3, 2: 1, 0: 0}
sorted_x = sorted(x.items(), key=operator.itemgetter(1))
sorted_x will be a list of tuples sorted by the second element in each tuple. dict(sorted_x) == x.
And for those wishing to sort on keys instead of values:
import operator
x = {1: 2, 3: 4, 4: 3, 2: 1, 0: 0}
sorted_x = sorted(x.items(), key=operator.itemgetter(0))
In Python 3, tuple-parameter unpacking in lambdas is no longer allowed, so we can use:
x = {1: 2, 3: 4, 4: 3, 2: 1, 0: 0}
sorted_x = sorted(x.items(), key=lambda kv: kv[1])
If you want the output as a dict, you can use collections.OrderedDict:
import collections
sorted_dict = collections.OrderedDict(sorted_x)

As simple as: sorted(dict1, key=dict1.get)
Well, it is actually possible to do a "sort by dictionary values". Recently I had to do that in a Code Golf (Stack Overflow question Code golf: Word frequency chart). Abridged, the problem was of the kind: given a text, count how often each word is encountered and display a list of the top words, sorted by decreasing frequency.
If you construct a dictionary with the words as keys and the number of occurrences of each word as value, simplified here as:
from collections import defaultdict
d = defaultdict(int)
for w in text.split():
    d[w] += 1
then you can get a list of the words, ordered by frequency of use, with sorted(d, key=d.get): the sort iterates over the dictionary keys, using the number of word occurrences as the sort key.
for w in sorted(d, key=d.get, reverse=True):
    print(w, d[w])
I am writing this detailed explanation to illustrate what people often mean by "I can easily sort a dictionary by key, but how do I sort by value" - and I think the original post was trying to address such an issue. The solution is to sort a list of the keys, based on the values, as shown above.
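Put together as a self-contained sketch; the sample text and the name top are assumptions added here only so the snippet can run on its own:

```python
from collections import defaultdict

text = "the cat and the dog and the bird"  # hypothetical sample input

# Count how often each word occurs.
d = defaultdict(int)
for w in text.split():
    d[w] += 1

# Keys sorted by their counts, most frequent first. sorted() is stable,
# so words with equal counts keep their first-seen order.
top = sorted(d, key=d.get, reverse=True)
for w in top:
    print(w, d[w])
```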

You could use:
sorted(d.items(), key=lambda x: x[1])
This will sort the dictionary by the values of each entry within the dictionary from smallest to largest.
To sort it in descending order just add reverse=True:
sorted(d.items(), key=lambda x: x[1], reverse=True)
Input:
d = {'one':1,'three':3,'five':5,'two':2,'four':4}
a = sorted(d.items(), key=lambda x: x[1])
print(a)
Output:
[('one', 1), ('two', 2), ('three', 3), ('four', 4), ('five', 5)]

Dicts can't be sorted, but you can build a sorted list from them.
A sorted list of dict values:
sorted(d.values())
A list of (key, value) pairs, sorted by value:
from operator import itemgetter
sorted(d.items(), key=itemgetter(1))

In recent Python 2.7, we have the new OrderedDict type, which remembers the order in which the items were added.
>>> d = {"third": 3, "first": 1, "fourth": 4, "second": 2}
>>> for k, v in d.items():
...     print "%s: %s" % (k, v)
...
second: 2
fourth: 4
third: 3
first: 1
>>> d
{'second': 2, 'fourth': 4, 'third': 3, 'first': 1}
To make a new ordered dictionary from the original, sorting by the values:
>>> from collections import OrderedDict
>>> d_sorted_by_value = OrderedDict(sorted(d.items(), key=lambda x: x[1]))
The OrderedDict behaves like a normal dict:
>>> for k, v in d_sorted_by_value.items():
...     print "%s: %s" % (k, v)
...
first: 1
second: 2
third: 3
fourth: 4
>>> d_sorted_by_value
OrderedDict([('first', 1), ('second', 2), ('third', 3), ('fourth', 4)])

Using Python 3.5
Whilst I found the accepted answer useful, I was also surprised that it hasn't been updated to reference OrderedDict from the standard library collections module as a viable, modern alternative - designed to solve exactly this type of problem.
from operator import itemgetter
from collections import OrderedDict
x = {1: 2, 3: 4, 4: 3, 2: 1, 0: 0}
sorted_x = OrderedDict(sorted(x.items(), key=itemgetter(1)))
# OrderedDict([(0, 0), (2, 1), (1, 2), (4, 3), (3, 4)])
The official OrderedDict documentation offers a very similar example too, but using a lambda for the sort function:
# regular unsorted dictionary
d = {'banana': 3, 'apple':4, 'pear': 1, 'orange': 2}
# dictionary sorted by value
OrderedDict(sorted(d.items(), key=lambda t: t[1]))
# OrderedDict([('pear', 1), ('orange', 2), ('banana', 3), ('apple', 4)])

Pretty much the same as Hank Gay's answer:
sorted([(value,key) for (key,value) in mydict.items()])
Or optimized slightly as suggested by John Fouhy:
sorted((value,key) for (key,value) in mydict.items())

As of Python 3.6 the built-in dict will be ordered
Good news, so the OP's original use case of mapping pairs retrieved from a database with unique string ids as keys and numeric values as values into a built-in Python v3.6+ dict, should now respect the insert order.
If say the resulting two column table expressions from a database query like:
SELECT a_key, a_value FROM a_table ORDER BY a_value;
would be stored in two Python tuples, k_seq and v_seq (aligned by numerical index and with the same length of course), then:
k_seq = ('foo', 'bar', 'baz')
v_seq = (0, 1, 42)
ordered_map = dict(zip(k_seq, v_seq))
lets you output them later with:
for k, v in ordered_map.items():
    print(k, v)
yielding in this case (for the new Python 3.6+ built-in dict!):
foo 0
bar 1
baz 42
in the same ordering per value of v.
Where in the Python 3.5 install on my machine it currently yields:
bar 1
foo 0
baz 42
Details:
As proposed in 2012 by Raymond Hettinger (cf. mail on python-dev with subject "More compact dictionaries with faster iteration") and announced in 2016 in a mail by Victor Stinner to python-dev with subject "Python 3.6 dict becomes compact and gets a private version; and keywords become ordered", the implementation of issue 27350 "Compact and ordered dict" in Python 3.6 means we can now use a built-in dict to maintain insertion order!
Hopefully this will lead to a thin layer OrderedDict implementation as a first step. As @JimFasarakis-Hilliard indicated, some see use cases for the OrderedDict type in the future as well. I think the Python community at large will carefully inspect whether this will stand the test of time, and what the next steps will be.
Time to rethink our coding habits to not miss the possibilities opened by stable ordering of:
Keyword arguments and
(intermediate) dict storage
The first because it eases dispatch in the implementation of functions and methods in some cases.
The second because it makes it easier to use dicts as intermediate storage in processing pipelines.
Raymond Hettinger kindly provided documentation explaining "The Tech Behind Python 3.6 Dictionaries" - from his San Francisco Python Meetup Group presentation 2016-DEC-08.
Quite a few highly upvoted Stack Overflow question and answer pages will probably receive variants of this information, and many high-quality answers will need a per-version update too.
Caveat Emptor (but also see below update 2017-12-15):
As @ajcr rightfully notes: "The order-preserving aspect of this new implementation is considered an implementation detail and should not be relied upon." (from the whatsnew36). Not to nitpick, but the citation was cut a bit pessimistically ;-). It continues as " (this may change in the future, but it is desired to have this new dict implementation in the language for a few releases before changing the language spec to mandate order-preserving semantics for all current and future Python implementations; this also helps preserve backwards-compatibility with older versions of the language where random iteration order is still in effect, e.g. Python 3.5)."
So as in some human languages (e.g. German), usage shapes the language, and the will now has been declared ... in whatsnew36.
Update 2017-12-15:
In a mail to the python-dev list, Guido van Rossum declared:
Make it so. "Dict keeps insertion order" is the ruling. Thanks!
So, the version 3.6 CPython side-effect of dict insertion ordering is now becoming part of the language spec (and not anymore only an implementation detail). That mail thread also surfaced some distinguishing design goals for collections.OrderedDict as reminded by Raymond Hettinger during discussion.
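One well-known behavioural difference that remains between the two types is how equality treats order. A small sketch, assuming Python 3.7+ and using illustrative names only:

```python
from collections import OrderedDict

a = {"foo": 0, "bar": 1}
b = {"bar": 1, "foo": 0}

# Built-in dicts keep insertion order ...
keys_a = list(a)
# ... but equality between plain dicts ignores that order,
plain_equal = (a == b)
# whereas comparing two OrderedDicts is order-sensitive.
ordered_equal = (OrderedDict(a) == OrderedDict(b))

print(keys_a, plain_equal, ordered_equal)
```

So OrderedDict can still be the better fit when the order itself is part of what you want to compare.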

It can often be very handy to use namedtuple. For example, you have a dictionary of 'name' as keys and 'score' as values and you want to sort on 'score':
import collections
Player = collections.namedtuple('Player', 'score name')
d = {'John':5, 'Alex':10, 'Richard': 7}
sorting with lowest score first:
worst = sorted(Player(v,k) for (k,v) in d.items())
sorting with highest score first:
best = sorted([Player(v,k) for (k,v) in d.items()], reverse=True)
Now you can get the name and score of, let's say the second-best player (index=1) very Pythonically like this:
player = best[1]
player.name
'Richard'
player.score
7

I had the same problem, and I solved it like this:
WantedOutput = sorted(MyDict, key=lambda x : MyDict[x])
(People who answer "It is not possible to sort a dict" did not read the question! In fact, "I can sort on the keys, but how can I sort based on the values?" clearly means that he wants a list of the keys sorted according to the value of their values.)
Please notice that the order is not well defined (keys with the same value will be in an arbitrary order in the output list).
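If a deterministic order among equal values matters, one option is to add the key itself as a tie-breaker. A minimal sketch with made-up data (my_dict and wanted_output are illustrative names):

```python
my_dict = {"b": 1, "a": 1, "c": 0}

# Sort keys by value first, then by the key itself to break ties.
wanted_output = sorted(my_dict, key=lambda k: (my_dict[k], k))
print(wanted_output)
```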

If values are numeric you may also use Counter from collections.
from collections import Counter
x = {'hello': 1, 'python': 5, 'world': 3}
c = Counter(x)
print(c.most_common())
>> [('python', 5), ('world', 3), ('hello', 1)]
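most_common() also accepts an optional count, which is handy when only the top entries are needed. A short sketch using the same data (top_two is an illustrative name):

```python
from collections import Counter

x = {'hello': 1, 'python': 5, 'world': 3}

# most_common(n) returns only the n entries with the largest values.
top_two = Counter(x).most_common(2)
print(top_two)
```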

Starting from Python 3.6, dict objects are now ordered by insertion order. It's officially in the specifications of Python 3.7.
>>> words = {"python": 2, "blah": 4, "alice": 3}
>>> dict(sorted(words.items(), key=lambda x: x[1]))
{'python': 2, 'alice': 3, 'blah': 4}
Before that, you had to use OrderedDict.
Python 3.7 documentation says:
Changed in version 3.7: Dictionary order is guaranteed to be insertion
order. This behavior was implementation detail of CPython from 3.6.

In Python 2.7, simply do:
from collections import OrderedDict
# regular unsorted dictionary
d = {'banana': 3, 'apple':4, 'pear': 1, 'orange': 2}
# dictionary sorted by key
OrderedDict(sorted(d.items(), key=lambda t: t[0]))
OrderedDict([('apple', 4), ('banana', 3), ('orange', 2), ('pear', 1)])
# dictionary sorted by value
OrderedDict(sorted(d.items(), key=lambda t: t[1]))
OrderedDict([('pear', 1), ('orange', 2), ('banana', 3), ('apple', 4)])
copy-paste from : http://docs.python.org/dev/library/collections.html#ordereddict-examples-and-recipes
Enjoy ;-)

This is the code:
import operator
origin_list = [
{"name": "foo", "rank": 0, "rofl": 20000},
{"name": "Silly", "rank": 15, "rofl": 1000},
{"name": "Baa", "rank": 300, "rofl": 20},
{"name": "Zoo", "rank": 10, "rofl": 200},
{"name": "Penguin", "rank": -1, "rofl": 10000}
]
print(">> Original >>")
for foo in origin_list:
    print(foo)
print("\n>> Rofl sort >>")
for foo in sorted(origin_list, key=operator.itemgetter("rofl")):
    print(foo)
print("\n>> Rank sort >>")
for foo in sorted(origin_list, key=operator.itemgetter("rank")):
    print(foo)
Here are the results:
Original
{'name': 'foo', 'rank': 0, 'rofl': 20000}
{'name': 'Silly', 'rank': 15, 'rofl': 1000}
{'name': 'Baa', 'rank': 300, 'rofl': 20}
{'name': 'Zoo', 'rank': 10, 'rofl': 200}
{'name': 'Penguin', 'rank': -1, 'rofl': 10000}
Rofl
{'name': 'Baa', 'rank': 300, 'rofl': 20}
{'name': 'Zoo', 'rank': 10, 'rofl': 200}
{'name': 'Silly', 'rank': 15, 'rofl': 1000}
{'name': 'Penguin', 'rank': -1, 'rofl': 10000}
{'name': 'foo', 'rank': 0, 'rofl': 20000}
Rank
{'name': 'Penguin', 'rank': -1, 'rofl': 10000}
{'name': 'foo', 'rank': 0, 'rofl': 20000}
{'name': 'Zoo', 'rank': 10, 'rofl': 200}
{'name': 'Silly', 'rank': 15, 'rofl': 1000}
{'name': 'Baa', 'rank': 300, 'rofl': 20}

Try the following approach. Let us define a dictionary called mydict with the following data:
mydict = {'carl':40,
'alan':2,
'bob':1,
'danny':3}
If one wanted to sort the dictionary by keys, one could do something like:
for key in sorted(mydict.iterkeys()):
    print "%s: %s" % (key, mydict[key])
This should return the following output:
alan: 2
bob: 1
carl: 40
danny: 3
On the other hand, if one wanted to sort a dictionary by value (as is asked in the question), one could do the following:
for key, value in sorted(mydict.iteritems(), key=lambda (k,v): (v,k)):
    print "%s: %s" % (key, value)
The result of this command (sorting the dictionary by value) should return the following:
bob: 1
alan: 2
danny: 3
carl: 40
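The snippets above are Python 2 (iterkeys(), iteritems() and a tuple-unpacking lambda). A rough Python 3 equivalent of the sort by value might look like this; the name ordered is only for illustration:

```python
mydict = {'carl': 40, 'alan': 2, 'bob': 1, 'danny': 3}

# items() replaces iteritems(); the lambda indexes into the (key, value)
# pair because tuple parameters are not allowed in Python 3.
ordered = sorted(mydict.items(), key=lambda kv: (kv[1], kv[0]))
for key, value in ordered:
    print("%s: %s" % (key, value))
```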

You can create an "inverted index", also
from collections import defaultdict
inverse= defaultdict( list )
for k, v in originalDict.items():
    inverse[v].append( k )
Now your inverse has the values; each value has a list of applicable keys.
for k in sorted(inverse):
    print k, inverse[k]
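A runnable Python 3 sketch of the same inverted-index idea; original_dict is hypothetical sample data, chosen so that two keys share a value:

```python
from collections import defaultdict

original_dict = {"a": 1, "b": 2, "c": 1}  # hypothetical sample data

# Map each value to the list of keys that carry it.
inverse = defaultdict(list)
for k, v in original_dict.items():
    inverse[v].append(k)

# Iterating over the sorted values gives the keys grouped by value.
for value in sorted(inverse):
    print(value, inverse[value])
```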

You can use the collections.Counter. Note, this will work for both numeric and non-numeric values.
>>> x = {1: 2, 3: 4, 4:3, 2:1, 0:0}
>>> from collections import Counter
>>> #To sort in reverse order
>>> Counter(x).most_common()
[(3, 4), (4, 3), (1, 2), (2, 1), (0, 0)]
>>> #To sort in ascending order
>>> Counter(x).most_common()[::-1]
[(0, 0), (2, 1), (1, 2), (4, 3), (3, 4)]
>>> #To get a dictionary sorted by values
>>> from collections import OrderedDict
>>> OrderedDict(Counter(x).most_common()[::-1])
OrderedDict([(0, 0), (2, 1), (1, 2), (4, 3), (3, 4)])

The collections solution mentioned in another answer is absolutely superb, because you retain a connection between the key and value which in the case of dictionaries is extremely important.
I don't agree with the number one choice presented in another answer, because it throws away the keys.
I used the solution mentioned above (code shown below) and retained access to both keys and values and in my case the ordering was on the values, but the importance was the ordering of the keys after ordering the values.
from collections import Counter
x = {'hello':1, 'python':5, 'world':3}
c=Counter(x)
print( c.most_common() )
>> [('python', 5), ('world', 3), ('hello', 1)]

You can also use a custom function that can be passed to parameter key.
def dict_val(x):
    return x[1]
x = {1: 2, 3: 4, 4: 3, 2: 1, 0: 0}
sorted_x = sorted(x.items(), key=dict_val)

You can use a skip dict which is a dictionary that's permanently sorted by value.
>>> data = {1: 2, 3: 4, 4: 3, 2: 1, 0: 0}
>>> SkipDict(data)
{0: 0.0, 2: 1.0, 1: 2.0, 4: 3.0, 3: 4.0}
If you use keys(), values() or items() then you'll iterate in sorted order by value.
It's implemented using the skip list datastructure.

Of course, remember, you need to use OrderedDict because regular Python dictionaries don't keep the original order.
from collections import OrderedDict
a = OrderedDict(sorted(originalDict.items(), key=lambda x: x[1]))
If you do not have Python 2.7 or higher, the best you can do is iterate over the values in a generator function. (There is an OrderedDict for 2.4 and 2.6 here, but
a) I don't know about how well it works
and
b) You have to download and install it of course. If you do not have administrative access, then I'm afraid the option's out.)
def gen(originalDict):
    for x, y in sorted(zip(originalDict.keys(), originalDict.values()), key=lambda z: z[1]):
        yield (x, y)
#Yields as a tuple with (key, value). You can iterate with conditional clauses to get what you want.
for bleh, meh in gen(myDict):
    if bleh == "foo":
        print(myDict[bleh])
You can also print out every value
for bleh, meh in gen(myDict):
    print(bleh, meh)
Please remember to remove the parentheses after print if not using Python 3.0 or above

from django.utils.datastructures import SortedDict
def sortedDictByKey(self,data):
    """Sorted dictionary order by key"""
    sortedDict = SortedDict()
    if data:
        if isinstance(data, dict):
            sortedKey = sorted(data.keys())
            for k in sortedKey:
                sortedDict[k] = data[k]
    return sortedDict

Here is a solution using zip on d.values() and d.keys(). A few lines down this link (on Dictionary view objects) is:
This allows the creation of (value, key) pairs using zip(): pairs = zip(d.values(), d.keys()).
So we can do the following:
d = {'key1': 874.7, 'key2': 5, 'key3': 8.1}
d_sorted = sorted(zip(d.values(), d.keys()))
print d_sorted
# prints: [(5, 'key2'), (8.1, 'key3'), (874.7, 'key1')]

As pointed out by Dilettant, Python 3.6 will now keep the order! I thought I'd share a function I wrote that eases the sorting of an iterable (tuple, list, dict). In the latter case, you can sort either on keys or values, and it can take numeric comparison into account. Only for >= 3.6!
When you try using sorted on an iterable that holds e.g. strings as well as ints, sorted() will fail. Of course you can force string comparison with str(). However, in some cases you want to do actual numeric comparison where 12 is smaller than 20 (which is not the case in string comparison). So I came up with the following. When you want explicit numeric comparison you can use the flag num_as_num which will try to do explicit numeric sorting by trying to convert all values to floats. If that succeeds, it will do numeric sorting, otherwise it'll resort to string comparison.
Comments for improvement welcome.
def sort_iterable(iterable, sort_on=None, reverse=False, num_as_num=False):
    def _sort(i):
        # sort by 0 = keys, 1 values, None for lists and tuples
        try:
            if num_as_num:
                if i is None:
                    _sorted = sorted(iterable, key=lambda v: float(v), reverse=reverse)
                else:
                    _sorted = dict(sorted(iterable.items(), key=lambda v: float(v[i]), reverse=reverse))
            else:
                raise TypeError
        except (TypeError, ValueError):
            if i is None:
                _sorted = sorted(iterable, key=lambda v: str(v), reverse=reverse)
            else:
                _sorted = dict(sorted(iterable.items(), key=lambda v: str(v[i]), reverse=reverse))
        return _sorted
    if isinstance(iterable, list):
        sorted_list = _sort(None)
        return sorted_list
    elif isinstance(iterable, tuple):
        sorted_list = tuple(_sort(None))
        return sorted_list
    elif isinstance(iterable, dict):
        if sort_on == 'keys':
            sorted_dict = _sort(0)
            return sorted_dict
        elif sort_on == 'values':
            sorted_dict = _sort(1)
            return sorted_dict
        elif sort_on is not None:
            raise ValueError(f"Unexpected value {sort_on} for sort_on. When sorting a dict, use 'keys' or 'values'")
    else:
        raise TypeError(f"Unexpected type {type(iterable)} for iterable. Expected a list, tuple, or dict")

I just learned a relevant skill from Python for Everybody.
You may use a temporary list to help you to sort the dictionary:
# Assume dictionary to be:
d = {'apple': 500.1, 'banana': 1500.2, 'orange': 1.0, 'pineapple': 789.0}
# Create a temporary list
tmp = []
# Iterate through the dictionary and append each tuple into the temporary list
for key, value in d.items():
    tmptuple = (value, key)
    tmp.append(tmptuple)
# Sort the list in ascending order
tmp = sorted(tmp)
print (tmp)
If you want to sort the list in descending order, simply change the original sorting line to:
tmp = sorted(tmp, reverse=True)
Using list comprehension, the one-liner would be:
# Assuming the dictionary looks like
d = {'apple': 500.1, 'banana': 1500.2, 'orange': 1.0, 'pineapple': 789.0}
# One-liner for sorting in ascending order
print (sorted([(v, k) for k, v in d.items()]))
# One-liner for sorting in descending order
print (sorted([(v, k) for k, v in d.items()], reverse=True))
Sample Output:
# Ascending order
[(1.0, 'orange'), (500.1, 'apple'), (789.0, 'pineapple'), (1500.2, 'banana')]
# Descending order
[(1500.2, 'banana'), (789.0, 'pineapple'), (500.1, 'apple'), (1.0, 'orange')]

Use ValueSortedDict from dicts:
from dicts.sorteddict import ValueSortedDict
d = {1: 2, 3: 4, 4:3, 2:1, 0:0}
sorted_dict = ValueSortedDict(d)
print sorted_dict.items()
[(0, 0), (2, 1), (1, 2), (4, 3), (3, 4)]

Iterate through a dict and sort it by its values in descending order:
$ python --version
Python 3.2.2
$ cat sort_dict_by_val_desc.py
dictionary = dict(siis = 1, sana = 2, joka = 3, tuli = 4, aina = 5)
for word in sorted(dictionary, key=dictionary.get, reverse=True):
    print(word, dictionary[word])
$ python sort_dict_by_val_desc.py
aina 5
tuli 4
joka 3
sana 2
siis 1

If your values are integers, and you use Python 2.7 or newer, you can use collections.Counter instead of dict. The most_common method will give you all items, sorted by the value.

This works in 3.1.x:
import operator
slovar_sorted=sorted(slovar.items(), key=operator.itemgetter(1), reverse=True)
print(slovar_sorted)

For the sake of completeness, I am posting a solution using heapq. Note that this method works for both numeric and non-numeric values.
>>> import heapq, operator
>>> x = {1: 2, 3: 4, 4:3, 2:1, 0:0}
>>> x_items = list(x.items())  # heapify needs a list; in Python 3, items() returns a view
>>> heapq.heapify(x_items)
>>> #To sort in reverse order
>>> heapq.nlargest(len(x_items),x_items, operator.itemgetter(1))
[(3, 4), (4, 3), (1, 2), (2, 1), (0, 0)]
>>> #To sort in ascending order
>>> heapq.nsmallest(len(x_items),x_items, operator.itemgetter(1))
[(0, 0), (2, 1), (1, 2), (4, 3), (3, 4)]
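heapq is arguably most useful when only the few largest or smallest entries are needed, since it avoids sorting everything. A small sketch along those lines (top_two and bottom_two are illustrative names):

```python
import heapq
from operator import itemgetter

x = {1: 2, 3: 4, 4: 3, 2: 1, 0: 0}

# nlargest/nsmallest accept any iterable, so no heapify() call is needed.
top_two = heapq.nlargest(2, x.items(), key=itemgetter(1))
bottom_two = heapq.nsmallest(2, x.items(), key=itemgetter(1))
print(top_two, bottom_two)
```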

Related

How to sort a dictionary of nested lists by value with one key ascending and one key descending?

I'm working on a problem that states the following:
Write a function telling apart accepted and refused students according to a threshold.
The function should be called select_student and takes as arguments:
A list where each element is a list of a student name, and his mark.
A mark. The student mark must be superior or equal to the given mark to be accepted.
Your function must return a dictionary with two entries:
Accepted which list the accepted students sorted by marks in the descending order.
Refused which list the refused students sorted by marks in ascending order.
Example
In [1]: from solution import select_student
In [2]: my_class = [['Kermit Wade', 27], ['Hattie Schleusner', 67], ['Ben Ball', 5], ['William Lee', 2]]
In [3]: select_student(my_class, 20)
Out[3]:
{'Accepted': [['Hattie Schleusner', 67], ['Kermit Wade', 27]],
'Refused': [['William Lee', 2], ['Ben Ball', 5]]}
In [4]: select_student(my_class, 50)
Out[4]:
{'Accepted': [['Hattie Schleusner', 67]],
'Refused': [['William Lee', 2], ['Ben Ball', 5], ['Kermit Wade', 27]]}
My code is:
from collections import OrderedDict
students = [
    ["Kermit Wade", 27],
    ["Hattie Schleusner", 67],
    ["Ben Ball", 5],
    ["William Lee", 2],
]
def select_student(students, threshold):
    output = {
        'Accepted' : [],
        'Refused' : []
    }
    for i in range(len(students)):
        if students[i][1] >= threshold:
            output['Accepted'].append(students[i])
        elif students[i][1] < threshold:
            output['Refused'].append(students[i])
    return output
My output is:
{'Accepted': [['Kermit Wade', 27], ['Hattie Schleusner', 67]], 'Refused': [['Ben Ball', 5], ['William Lee', 2]]}
The output is for these parameters
print(select_student(students, 20))
As you can see I need to reverse the order for both accepted and refused. So Hattie comes first in accepted and then William comes first in refused.
I tried to use OrderedLists and googling but because of the nested list structure required by the problem I could not find a way to sort by the grade nor could I find a way to have it both be ascending and descending depending on the dictionary's key.
Thanks in advance!
Modify your select student function to sort your accepted and refused lists as follows:
def select_student(students, threshold):
    output = {
        'Accepted' : [],
        'Refused' : []
    }
    for i in range(len(students)):
        if students[i][1] >= threshold:
            output['Accepted'].append(students[i])
        elif students[i][1] < threshold:
            output['Refused'].append(students[i])
    output['Accepted'] = sorted(output['Accepted'], key= lambda x: x[1], reverse= True)
    output['Refused'] = sorted(output['Refused'], key = lambda x: x[1])
    return output
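The same idea can be written more compactly by filtering first and sorting each group once. This is only an alternative sketch, not the answerer's code; the variable result is illustrative:

```python
def select_student(students, threshold):
    # Split on the threshold, then sort each group by mark (index 1).
    accepted = [s for s in students if s[1] >= threshold]
    refused = [s for s in students if s[1] < threshold]
    return {
        'Accepted': sorted(accepted, key=lambda s: s[1], reverse=True),
        'Refused': sorted(refused, key=lambda s: s[1]),
    }

my_class = [['Kermit Wade', 27], ['Hattie Schleusner', 67],
            ['Ben Ball', 5], ['William Lee', 2]]
result = select_student(my_class, 20)
print(result)
```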

What is the best possible way to find the first AND the last occurrences of an element in a list in Python?

The basic way I usually use is by using the list.index(element) and reversed_list.index(element), but this fails when I need to search for many elements and the length of the list is too large say 10^5 or say 10^6 or even larger than that. What is the best possible way (which uses very little time) for the same?
You can build auxiliary lookup structures:
lst = [1,2,3,1,2,3] # super long list
last = {n: i for i, n in enumerate(lst)}
first = {n: i for i, n in reversed(list(enumerate(lst)))}
last[3]
# 5
first[3]
# 2
The construction of the lookup dicts takes linear time, but then the lookup itself is constant.
Whereas calls to list.index() take linear time, and repeatedly doing so is then quadratic (given the number of lookups you make depends on the size of the list).
You could also build a single structure in one iteration:
from collections import defaultdict
lookup = defaultdict(lambda: [None, None])
for i, n in enumerate(lst):
    lookup[n][1] = i
    if lookup[n][0] is None:
        lookup[n][0] = i
lookup[3]
# [2, 5]
lookup[2]
# [1, 4]
Well, someone needs to do the work in finding the element, and in a large list this can take time! Without more information or a code example, it'll be difficult to help you, but usually the go-to answer is to use another data structure- for example, if you can keep your elements in a dictionary instead of a list with the key being the element and the value being an array of indices, you'll be much quicker.
You can just remember first and last index for every element in the list:
In [9]: l = [random.randint(1, 10) for _ in range(100)]
In [10]: first_index = {}
In [11]: last_index = {}
In [12]: for idx, x in enumerate(l):
    ...:     if x not in first_index:
    ...:         first_index[x] = idx
    ...:     last_index[x] = idx
    ...:
In [13]: [(x, first_index.get(x), last_index.get(x)) for x in range(1, 11)]
Out[13]:
[(1, 3, 88),
(2, 23, 90),
(3, 10, 91),
(4, 13, 98),
(5, 11, 57),
(6, 4, 99),
(7, 9, 92),
(8, 19, 95),
(9, 0, 77),
(10, 2, 87)]
In [14]: l[0]
Out[14]: 9
Your approach sounds good, I did some testing and:
import numpy as np
long_list = list(np.random.randint(0, 100_000, 100_000_000))
# This takes 10ms on my machine
long_list.index(999)
# This takes 1,100ms on my machine
long_list[::-1].index(999)
# This takes 1,300ms on my machine
list(reversed(long_list)).index(999)
# This takes 200ms on my machine
long_list.reverse()
long_list.index(999)
long_list.reverse()
But at the end of the day, a Python list does not seem like the best data structure for this.
As others have suggested, you can build a dict:
indexes = {}
for i, val in enumerate(long_list):
    if val in indexes:
        indexes[val].append(i)
    else:
        indexes[val] = [i]
This is memory expensive, but solves your problem (depends on how often you modify the original list).
You can then do:
# This takes 0.02ms on my machine
ix = indexes.get(999)
ix[0], ix[-1]
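The index-building loop can also be sketched with collections.defaultdict, which removes the membership check. The short long_list here is stand-in data so the snippet runs on its own:

```python
from collections import defaultdict

long_list = [1, 2, 3, 1, 2, 3]  # small stand-in for the real list

# Each value maps to the list of positions where it occurs, in order.
indexes = defaultdict(list)
for i, val in enumerate(long_list):
    indexes[val].append(i)

# First and last occurrence are the ends of the position list.
first, last = indexes[3][0], indexes[3][-1]
print(first, last)
```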

When utilizing a for loop what does each argument specify exactly?

I'm new to learning Python and have a clarifying question regarding for loops.
For instance:
dictionary_a = {"A": "Apple", "B": "Ball", "C": "Cat"}
dictionary_b = {"A": "Ant", "B": "Basket", "C": "Carrot"}
temp = ""
for k_a, v_a in dictionary_a.items():
    temp = dictionary_b[k_a]
    dictionary_b[k_a] = v_a
    dictionary_a[k_a] = temp
How exactly is k_a run through the interpreter? I understand v_a in dictionary_a.items() as simply iterating through the sequence in whatever collection.
But when for loops have the syntax for x, y in z I don't quite understand what values x takes with each iteration.
Hope I'm making some sense. Appreciate any help.
When iterating over dict.items(), each iteration yields a 2-tuple (key, value). When the for loop names two variables, the elements of each tuple are assigned to them in order, so k_a receives the key and v_a receives the value.
Here is another example to help you understand the mechanics:
coordinates = [(1, 2, 3), (4, 5, 6)]
for x, y, z in coordinates:
    print(x)
Edit: you can do even more complex unpacking. For example, if you only want the first and last items of a long list, you can proceed as follows:
long_list = 'This is a very long list to process'.split()
first_item, *_, last_item = long_list
In Python you can unpack an iterable into multiple variables in a single assignment.
Let's use this example:
>>> a, b = [1, 2]
>>> a
1
>>> b
2
The above behavior is what is happening when you loop over a dictionary with the dict.items() method.
Here is an example of what is happening:
>>> a = {"abc":123, "def":456}
>>> a.items()
dict_items([('abc', 123), ('def', 456)])
>>> for i in a.items():
...     i
...
('abc', 123)
('def', 456)
>>>
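To make the connection explicit, the two-variable loop header is roughly shorthand for unpacking each pair by hand. A minimal sketch (seen and pair are illustrative names):

```python
dictionary_a = {"A": "Apple", "B": "Ball", "C": "Cat"}

seen = []
for pair in dictionary_a.items():
    # Equivalent to writing "for k_a, v_a in dictionary_a.items():"
    k_a, v_a = pair
    seen.append((k_a, v_a))

# Starred unpacking collects everything between the first and last item.
first_item, *_, last_item = "This is a very long list to process".split()
print(seen, first_item, last_item)
```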

python return list of sorted dictionary keys

I'm sure this has been asked and answered, but I can't find it. I have this dictionary:
{'22775': 15.9,
'22778': 29.2,
'22776': 20.25,
'22773': 9.65,
'22777': 22.9,
'22774': 12.45}
a string and a float.
I want to list the key strings in a tk listbox to allow the user to select one and then use the corresponding float in a calculation to determine a delay factor in an event.
I have this code:
def dic_entry(line):
    #Create key:value pairs from string
    key, sep, value = line.strip().partition(":")
    return key, float(value)
with open(filename1) as f_obj:
    s = dict(dic_entry(line) for line in f_obj)
print (s) #for testing only
s_ord = sorted(s.items(),key=lambda x: x[1])
print (s_ord)
The first print gets me
{'22775': 15.9,
'22778': 29.2,
'22776': 20.25,
'22773': 9.65,
'22777': 22.9,
'22774': 12.45}
as expected. The second, which I hoped would give me an ordered list of keys gets me
[('22773', 9.65),
('22774', 12.45),
('22775', 15.9),
('22776', 20.25),
('22777', 22.9),
('22778', 29.2)].
I have tried using sorteddictionary from the collections module and it gives me a sorted dictionary, but I'm having trouble extracting a list of keys.
s_ord2 = []
for keys in s.items():
    s_ord2.append (keys)
print (s_ord2)
gives me a list of key value pairs:
[('22776', 20.25),
('22777', 22.9),
('22774', 12.45),
('22773', 9.65),
('22778', 29.2),
('22775', 15.9)]
I'm sure I'm doing something dumb, I just don't know what it is.
You're using items when you want to use keys:
In [1]: d = {'z': 3, 'b': 4, 'a': 9}
In [2]: sorted(d.keys())
Out[2]: ['a', 'b', 'z']
In [3]: sorted(d.items())
Out[3]: [('a', 9), ('b', 4), ('z', 3)]
d.items() gives you tuples of (key, value); d.keys() gives you just the keys.
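If the keys should be ordered by their float values rather than alphabetically, passing the dict's get method as the sort key is one option. A sketch using the dictionary from the question (keys_by_value is an illustrative name):

```python
s = {'22775': 15.9, '22778': 29.2, '22776': 20.25,
     '22773': 9.65, '22777': 22.9, '22774': 12.45}

# Iterating a dict yields its keys; key=s.get orders them by value.
keys_by_value = sorted(s, key=s.get)
print(keys_by_value)
```

The resulting list of strings could then be inserted into the listbox, with s[key] used to look up the float for the selected entry.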

How to loop through python dictionaries [duplicate]

d = {'x': 1, 'y': 2, 'z': 3}
for key in d:
    print(key, 'corresponds to', d[key])
How does Python recognize that it needs only to read the key from the dictionary? Is key a special keyword, or is it simply a variable?
key is just a variable name.
for key in d:
will simply loop over the keys in the dictionary, rather than the keys and values. To loop over both key and value you can use the following:
For Python 3.x:
for key, value in d.items():
For Python 2.x:
for key, value in d.iteritems():
To test for yourself, change the word key to poop.
In Python 3.x, iteritems() was replaced with simply items(), which returns a set-like view backed by the dict, like iteritems() but even better.
This is also available in 2.7 as viewitems().
The operation items() will work for both 2 and 3, but in 2 it will return a list of the dictionary's (key, value) pairs, which will not reflect changes to the dict that happen after the items() call. If you want the 2.x behavior in 3.x, you can call list(d.items()).
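The difference between a live view and a list snapshot can be seen in a few lines of Python 3; the names view and snapshot are illustrative:

```python
d = {'x': 1}

view = d.items()            # live view backed by the dict
snapshot = list(d.items())  # independent copy, like Python 2's items()

d['y'] = 2

# The view sees the new entry; the snapshot does not.
print(len(view), len(snapshot))
```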
It's not that key is a special word, but that dictionaries implement the iterator protocol. You could do this in your class, e.g. see this question for how to build class iterators.
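As a rough sketch of what implementing the iterator protocol in your own class can look like (Countdown is a hypothetical class invented for this example, and writing __iter__ as a generator is just one of several ways to do it):

```python
class Countdown:
    """Hypothetical example class: iterates from start down to 1."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # A generator function is a compact way to return an iterator.
        n = self.start
        while n > 0:
            yield n
            n -= 1


# The for statement calls iter() on the object, which calls __iter__.
for value in Countdown(3):
    print(value)
```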
In the case of dictionaries, it's implemented at the C level. The details are available in PEP 234. In particular, the section titled "Dictionary Iterators":
Dictionaries implement a tp_iter slot that returns an efficient
iterator that iterates over the keys of the dictionary. [...] This
means that we can write
for k in dict: ...
which is equivalent to, but much faster than
for k in dict.keys(): ...
as long as the restriction on modifications to the dictionary
(either by the loop or by another thread) are not violated.
Add methods to dictionaries that return different kinds of
iterators explicitly:
for key in dict.iterkeys(): ...
for value in dict.itervalues(): ...
for key, value in dict.iteritems(): ...
This means that for x in dict is shorthand for for x in
dict.iterkeys().
In Python 3, dict.iterkeys(), dict.itervalues() and dict.iteritems() are no longer supported. Use dict.keys(), dict.values() and dict.items() instead.
Iterating over a dict iterates through its keys in no particular order, as you can see here:
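(This is no longer the case as of CPython 3.6, where dicts keep insertion order as an implementation detail; the language guarantees insertion order from Python 3.7 onwards. The output below is from an older version.)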
>>> d = {'x': 1, 'y': 2, 'z': 3}
>>> list(d)
['y', 'x', 'z']
>>> d.keys()
['y', 'x', 'z']
For your example, it is a better idea to use dict.items():
>>> d.items()
[('y', 2), ('x', 1), ('z', 3)]
This gives you a list of tuples. When you loop over them like this, each tuple is unpacked into k and v automatically:
for k, v in d.items():
    print(k, 'corresponds to', v)
Using k and v as variable names when looping over a dict is quite common if the body of the loop is only a few lines. For more complicated loops it may be a good idea to use more descriptive names:
for letter, number in d.items():
    print(letter, 'corresponds to', number)
It's a good idea to get into the habit of using format strings:
for letter, number in d.items():
    print('{0} corresponds to {1}'.format(letter, number))
key is simply a variable.
For Python2.X:
>>> d = {'x': 1, 'y': 2, 'z': 3}
>>> for my_var in d:
...     print my_var, 'corresponds to', d[my_var]
...
x corresponds to 1
y corresponds to 2
z corresponds to 3
... or better,
d = {'x': 1, 'y': 2, 'z': 3}
for the_key, the_value in d.iteritems():
    print the_key, 'corresponds to', the_value
For Python3.X:
d = {'x': 1, 'y': 2, 'z': 3}
for the_key, the_value in d.items():
    print(the_key, 'corresponds to', the_value)
When you iterate through dictionaries using the for .. in ..-syntax, it always iterates over the keys (the values are accessible using dictionary[key]).
To iterate over key-value pairs, use the following:
for k,v in dict.iteritems() in Python 2
for k,v in dict.items() in Python 3
This is a very common looping idiom. in is an operator. For when to use for key in dict and when it must be for key in dict.keys() see David Goodger's Idiomatic Python article (archived copy).
I have a use case where I have to iterate through the dict to get the key, value pair, also the index indicating where I am. This is how I do it:
d = {'x': 1, 'y': 2, 'z': 3}
for i, (key, value) in enumerate(d.items()):
    print(i, key, value)
Note that the parentheses around key, value are important. Without them, you'd get a ValueError: "not enough values to unpack".
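A small sketch of why the parentheses matter: enumerate yields 2-tuples of (index, item), and each item is itself a (key, value) tuple, so three bare names cannot be filled from two elements.

```python
d = {'x': 1, 'y': 2, 'z': 3}

# Each element produced by enumerate is (index, (key, value)).
first = next(enumerate(d.items()))
print(first)

# Without the parentheses, Python tries to unpack a 2-tuple into three names.
try:
    for i, key, value in enumerate(d.items()):
        pass
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```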
Iterating over dictionaries using 'for' loops
d = {'x': 1, 'y': 2, 'z': 3}
for key in d:
...
How does Python recognize that it needs only to read the key from the
dictionary? Is key a special word in Python? Or is it simply a
variable?
It's not just for loops. The important word here is "iterating".
A dictionary is a mapping of keys to values:
d = {'x': 1, 'y': 2, 'z': 3}
Any time we iterate over it, we iterate over the keys. The variable name key is only intended to be descriptive - and it is quite apt for the purpose.
This happens in a list comprehension:
>>> [k for k in d]
['x', 'y', 'z']
It happens when we pass the dictionary to list (or any other collection type object):
>>> list(d)
['x', 'y', 'z']
The way Python iterates is, in a context where it needs to, it calls the __iter__ method of the object (in this case the dictionary) which returns an iterator (in this case, a keyiterator object):
>>> d.__iter__()
<dict_keyiterator object at 0x7fb1747bee08>
We shouldn't call these special methods ourselves. Instead, use the corresponding builtin function, iter:
>>> key_iterator = iter(d)
>>> key_iterator
<dict_keyiterator object at 0x7fb172fa9188>
Iterators have a __next__ method - but we call it with the builtin function, next:
>>> next(key_iterator)
'x'
>>> next(key_iterator)
'y'
>>> next(key_iterator)
'z'
>>> next(key_iterator)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
When an iterator is exhausted, it raises StopIteration. This is how Python knows to exit a for loop, or a list comprehension, or a generator expression, or any other iterative context. Once an iterator raises StopIteration it will always raise it - if you want to iterate again, you need a new one.
>>> list(key_iterator)
[]
>>> new_key_iterator = iter(d)
>>> list(new_key_iterator)
['x', 'y', 'z']
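Putting iter, next and StopIteration together, a for loop over a dict behaves roughly like the while loop below. This is a sketch for illustration, not how the interpreter is literally implemented:

```python
d = {'x': 1, 'y': 2, 'z': 3}

seen = []
iterator = iter(d)             # what the for statement does first
while True:
    try:
        key = next(iterator)   # fetch the next key
    except StopIteration:
        break                  # iterator exhausted: leave the loop
    seen.append((key, d[key]))

print(seen)
```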
Returning to dicts
We've seen dicts iterating in many contexts. What we've seen is that any time we iterate over a dict, we get the keys. Back to the original example:
d = {'x': 1, 'y': 2, 'z': 3}
for key in d:
If we change the variable name, we still get the keys. Let's try it:
>>> for each_key in d:
...     print(each_key, '=>', d[each_key])
...
x => 1
y => 2
z => 3
If we want to iterate over the values, we need to use the .values method of dicts, or for both together, .items:
>>> list(d.values())
[1, 2, 3]
>>> list(d.items())
[('x', 1), ('y', 2), ('z', 3)]
In the example given, it would be more efficient to iterate over the items like this:
for a_key, corresponding_value in d.items():
    print(a_key, corresponding_value)
But for academic purposes, the question's example is just fine.
To iterate through dictionaries, the code below can be used.
dictionary = {1: "a", 2: "b", 3: "c"}
# To iterate over the keys
for key in dictionary.keys():
    print(key)
# To iterate over the values
for value in dictionary.values():
    print(value)
# To iterate over both the keys and values
for key, value in dictionary.items():
    print(key, '\t', value)
You can check the implementation of CPython's dict type on GitHub. This is the signature of the function that implements the dict iterator:
_PyDict_Next(PyObject *op, Py_ssize_t *ppos, PyObject **pkey,
PyObject **pvalue, Py_hash_t *phash)
CPython dictobject.c
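To add keys while iterating, loop over a copy of the keys rather than over the dict itself. If you tried to do something like this:
for key in my_dict:
    my_dict[key+"-1"] = my_dict[key]-1
it would raise RuntimeError: dictionary changed size during iteration, because the loop adds keys to the dict it is iterating over. In Python 2, my_dict.keys() returns a list, so iterating over it is slower but safe; in Python 3, keys() returns a live view, so use list(my_dict) to take the snapshot. If you are not modifying the dict inside the loop, for key in my_dict is the faster choice.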
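A minimal sketch of that failure and one way around it in Python 3, where keys() is a live view and list(my_dict) is what takes a snapshot of the keys (the sample data is made up):

```python
my_dict = {'a': 1, 'b': 2}

# Adding keys while iterating over the dict itself fails.
try:
    for key in my_dict:
        my_dict[key + '-1'] = my_dict[key] - 1
except RuntimeError as exc:
    error_message = str(exc)
    print(error_message)

# Start again from clean data and iterate over a snapshot of the keys.
my_dict = {'a': 1, 'b': 2}
for key in list(my_dict):
    my_dict[key + '-1'] = my_dict[key] - 1

print(my_dict)
```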
If you are looking for a clear and visual example:
cat = {'name': 'Snowy', 'color': 'White', 'age': 14}
for key, value in cat.items():
    print('{0}: {1}'.format(key, value))
Result:
name: Snowy
color: White
age: 14
This prints the items sorted by value, in ascending order.
d = {'x': 3, 'y': 1, 'z': 2}
def by_value(item):
    return item[1]
for key, value in sorted(d.items(), key=by_value):
    print(key, '->', value)
Output:
y -> 1
z -> 2
x -> 3
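The same sort can also be written with operator.itemgetter in place of a hand-written key function, and reverse=True gives descending order. A short sketch using the same data:

```python
from operator import itemgetter

d = {'x': 3, 'y': 1, 'z': 2}

# itemgetter(1) picks the value out of each (key, value) pair.
ascending = sorted(d.items(), key=itemgetter(1))
descending = sorted(d.items(), key=itemgetter(1), reverse=True)

print(ascending)
print(descending)
```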
Let's get straight to the point. As you suspected, the word key is just a variable. The main thing to note is that when you run a for loop over a dictionary, it runs through only the keys and ignores the values.
d = {'x': 1, 'y': 2, 'z': 3}
for key in d:
    print(key, 'corresponds to', d[key])
Any other variable name works the same way:
d = {'x': 1, 'y': 2, 'z': 3}
for i in d:
    print(i, 'corresponds to', d[i])
But if you call a method such as:
d = {'x': 1, 'y': 2, 'z': 3}
print(d.keys())
then keys is not a variable; it is a method of the dictionary.
A dictionary in Python is a collection of key-value pairs. Each key is connected to a value, and you can use a key to access the value associated with that key. A key's value can be a number, a string, a list, or even another dictionary. In this case, treat each key-value pair as a separate row in a table: d is your table with two columns. The key is the first column, and d[key] is the second column. Your for loop is a standard way to iterate over the rows of that table.

Resources