Understanding frequency sort with lambda - python-3.x

I have this code which does a frequency sort with key=lambda x: (count[x])
import collections
nums = [1,1,2,2,2,3]
count = collections.Counter(nums)
output = sorted(nums, key=lambda x: (count[x]))
this gives the output
[3,1,1,2,2,2]
I would like to know why [3,1,2] isn't the output? How are the keys being repeated from the counter?

Because you are still sorting nums, which contains every element, duplicates included. key=lambda x: count[x] only determines the order of those elements; it does not remove any of them.
An equivalent but less efficient version (computing the keys takes O(n^2) instead of O(n), since nums.count scans the whole list for each element) is
sorted(nums, key=lambda x: nums.count(x))
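To see why the duplicates survive, it may help to compare sorting the full list with sorting only the distinct values. This is a minimal sketch that reuses nums and count from the question; the names full and unique are only illustrative.

```python
import collections

nums = [1, 1, 2, 2, 2, 3]
count = collections.Counter(nums)

# sorted() returns every element of its input; the key only affects ordering.
full = sorted(nums, key=lambda x: count[x])

# To get each value once, sort the Counter's keys (the distinct values) instead.
unique = sorted(count, key=lambda x: count[x])

print(full)    # [3, 1, 1, 2, 2, 2]
print(unique)  # [3, 1, 2]
```

Sorting count rather than nums is what yields [3, 1, 2], because iterating a Counter visits each distinct value once.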

Related

How does sorted work with both positive and negative values?

Ex:
dictionary = {"sand": 1, "coal": -2, "apples": -1, "corn": 5}
I want to arrange it in ascending order with the lowest value first "coal: -2"
answer = [coal, apples, sand, corn]
But when I tried to sort the dictionary using sorted function below:
print(sorted(dictionary, key=lambda x: x[1]))
I got this:
['sand', 'coal', 'corn', 'apples']
how does the sorted function work for both + and - values?
What's wrong
sorted(dictionary, key=lambda x: x[1])
The statement above does not compare by value: iterating a dictionary yields its keys, so x[1] is the second character of each key.
Solution
sorted(dictionary, key=lambda x: dictionary[x])
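The difference between the two key functions can be checked side by side. This is a small sketch using the dictionary from the question; by_second_char and by_value are illustrative names, and dictionary.get is used as a shorthand that should behave the same as lambda x: dictionary[x] here.

```python
dictionary = {"sand": 1, "coal": -2, "apples": -1, "corn": 5}

# Iterating a dict yields its keys, so x is a string such as "sand"
# and x[1] is its second character.
by_second_char = sorted(dictionary, key=lambda x: x[1])

# Looking the key up in the dict compares the numeric values instead.
by_value = sorted(dictionary, key=dictionary.get)

print(by_second_char)  # ['sand', 'coal', 'corn', 'apples']
print(by_value)        # ['coal', 'apples', 'sand', 'corn']
```

Negative and positive numbers need no special handling; once the key function returns the values, they are compared as ordinary numbers.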

Group two dimensional list records Python [duplicate]

I have a list of lists (string,integer)
eg:
my_list=[["apple",5],["banana",6],["orange",6],["banana",9],["orange",3],["apple",111]]
I'd like to sum the same items and finally get this:
my2_list=[["apple",116],["banana",15],["orange",9]]
You can use itertools.groupby on the sorted list:
from itertools import groupby
my_list=[["apple",5],["banana",6],["orange",6],["banana",9],["orange",3],["apple",111]]
my_list2 = []
for i, g in groupby(sorted(my_list), key=lambda x: x[0]):
    my_list2.append([i, sum(v[1] for v in g)])
print(my_list2)
# [['apple', 116], ['banana', 15], ['orange', 9]]
Speaking of SQL Group By and pre-sorting:
The operation of groupby() is similar to the uniq filter in Unix. It
generates a break or new group every time the value of the key
function changes (which is why it is usually necessary to have sorted
the data using the same key function). That behavior differs from
SQL’s GROUP BY which aggregates common elements regardless of their
input order.
Emphasis Mine
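The effect described in the quoted documentation can be seen by grouping the same data with and without sorting. This is only an illustrative sketch; the helper first and the variable names are made up for the example.

```python
from itertools import groupby

my_list = [["apple", 5], ["banana", 6], ["orange", 6],
           ["banana", 9], ["orange", 3], ["apple", 111]]

def first(item):
    # Key function: group on the fruit name.
    return item[0]

# Without sorting, each run of equal keys becomes its own group.
unsorted_keys = [k for k, _ in groupby(my_list, key=first)]

# Sorting by the same key first makes equal keys adjacent.
sorted_keys = [k for k, _ in groupby(sorted(my_list, key=first), key=first)]

print(unsorted_keys)  # ['apple', 'banana', 'orange', 'banana', 'orange', 'apple']
print(sorted_keys)    # ['apple', 'banana', 'orange']
```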
from collections import defaultdict
my_list= [["apple",5],["banana",6],["orange",6],["banana",9],["orange",3],["apple",111]]
result = defaultdict(int)
for fruit, value in my_list:
    result[fruit] += value
result = list(result.items())
print(result)
Or you can keep result as dictionary
Using Pandas and groupby:
import pandas as pd
>>> pd.DataFrame(my_list, columns=['fruit', 'count']).groupby('fruit').sum()
        count
fruit
apple     116
banana     15
orange      9
from itertools import groupby
[[k, sum(v for _, v in g)] for k, g in groupby(sorted(my_list), key = lambda x: x[0])]
# [['apple', 116], ['banana', 15], ['orange', 9]]
If you don't need the order to be preserved, you can use the code below.
my_list=[["apple",5],["banana",6],["orange",6],["banana",9],["orange",3],["apple",111]]
my_dict1 = {}
for d in my_list:
    if d[0] in my_dict1.keys():
        my_dict1[d[0]] += d[1]
    else:
        my_dict1[d[0]] = d[1]
my_list2 = [[k,v] for (k,v) in my_dict1.items()]

Get Key and Value of a Dictionary for max N values of the Dictionary

I have seen many posts, which use the below:
sorted(iterable, key=keyfunc, reverse=True)[0]
But how do I get both key and value ?
The above returns key only.
Should I get the keys and iterate as done in the below code or is there a simpler way to do it ?
topn_dict = sorted(similarity_dict, key=similarity_dict.get, reverse=True)[:5]
topn_pairs = {k: similarity_dict[k] for k in topn_dict}
print (topn_pairs)
Edit:
similarity_dict = {'a': 5.1, 'b': 4.99, 'c': 8.72, 'd': 6.34, 'e': 2.3, 'f': 9.1}
I would like the output as following for top 3:
f - 9.1
c - 8.72
d - 6.34
You can sort and re-create a dict directly:
topn_dict = dict(sorted(similarity_dict.items(), key=lambda x: x[1], reverse=True))
Without an explicit key (or with key=lambda x: x[0]) this sorts by the keys; the version above sorts by the values.
Note that dicts preserve insertion order only on newer Python versions; in older versions dicts are unordered anyway. (Update from DeepSpace: starting from Python 3.7 you can rely on that.)
For your example you can use:
topn_dict = dict(sorted(similarity_dict.items(), key=lambda x: x[1],
                        reverse=True)[:5])
print(topn_dict)
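When only the top few pairs are needed, heapq.nlargest from the standard library is an alternative to sorting the whole dict. This is a sketch rather than the answerer's method, and it assumes the keys are strings such as 'a' (the question shows them unquoted).

```python
import heapq

# Assumption: string keys; values are taken from the question.
similarity_dict = {"a": 5.1, "b": 4.99, "c": 8.72, "d": 6.34, "e": 2.3, "f": 9.1}

# nlargest returns the n items with the largest key, in descending order.
top3 = heapq.nlargest(3, similarity_dict.items(), key=lambda kv: kv[1])

for k, v in top3:
    print(f"{k} - {v}")
```

Each element of top3 is a (key, value) tuple, so both parts are available without a second lookup.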

Using the function Map, count the number of words that start with ‘S’ in list in Python3

I'd like to count the elements in a list that start with 'S', using only the map function and a lambda expression. My attempt wraps the result in list(), which is not what I want.
Below is the code I've tried, which does not do what I want.
input_list = ['San Jose', 'San Francisco', 'Santa Fe', 'Houston']
desireList = list(map(lambda x: x if x[0] == 'S' else '', input_list))
desireList.remove('')
print(len(desireList))
It's more Pythonic to use sum with a generator expression for your purpose:
sum(w.startswith('S') for w in input_list)
or:
sum(f == 'S' for f, *_ in input_list)
or if you still would prefer to use map and lambda:
sum(map(lambda x: x[0] == 'S', input_list))
With your sample input, all of the above would return: 3
You can try this:
count = list(map(lambda x:x[0]=='S',input_list)).count(True)
Here's an alternate approach
list( map( lambda x : x[0].lower() , input_list ) ).count('s')
Generate a list of 1st characters per item in the list, and count the number of 's' characters in that list.
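The answers above differ in two details: whether lowercase initials count, and what happens with an empty string (x[0] raises IndexError on '', while startswith simply returns False). The sketch below illustrates both; the extra entries 'seattle' and '' are hypothetical data added only for the demonstration.

```python
input_list = ['San Jose', 'San Francisco', 'Santa Fe', 'Houston']

# Case-sensitive count using only map and a lambda; True counts as 1 in sum().
exact = sum(map(lambda x: x.startswith('S'), input_list))

# Hypothetical extra data: a lowercase name and an empty string.
extended = input_list + ['seattle', '']

# startswith() returns False for '', whereas x[0] would raise IndexError.
case_insensitive = sum(map(lambda x: x.lower().startswith('s'), extended))

print(exact)             # 3
print(case_insensitive)  # 4
```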

Split/partition list based on invariant/hash?

I have a list [a1,a2,...] and would like to split it based on the value of a function f(a).
For example, if the input is the list [0,1,2,3,4] and the function is def f(x): return x % 3,
I would like to return the lists [0,3], [1,4], [2], since every element of the first group takes the value 0 under f, every element of the second group takes the value 1, etc.
Something like this works:
return [[x for x in lst if f(x) == val] for val in set(map(f,lst))],
But it does not seem optimal (nor pythonic) since the inner loop unnecessarily scans the entire list and computes same f values of the elements several times.
I'm looking for a solution that would compute the value of f ideally once for every element...
If you're not irrationally ;-) set on a one-liner, it's straightforward:
from collections import defaultdict
lst = [0,1,2,3,4]
f = lambda x: x % 3
d = defaultdict(list)
for x in lst:
    d[f(x)].append(x)
print(list(d.values()))
displays what you want. f() is executed len(lst) times, which can't be beat
EDIT: or, if you must:
from itertools import groupby
print([[pair[1] for pair in grp]
       for ignore, grp in
       groupby(sorted((f(x), x) for x in lst),
               key=lambda pair: pair[0])])
That doesn't require that f() produce values usable as dict keys, but incurs the extra expense of a sort, and is close to incomprehensible. Clarity is much more Pythonic than striving for one-liners.
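The claim that f runs once per element can be checked by wrapping the defaultdict loop in a function and recording the calls. This is a minimal sketch; partition, calls, and the counting wrapper f are hypothetical names introduced for the example.

```python
from collections import defaultdict

def partition(items, f):
    """Group items by f(item), calling f exactly once per item."""
    groups = defaultdict(list)
    for x in items:
        groups[f(x)].append(x)
    # Groups come out in order of first appearance of each key (Python 3.7+).
    return list(groups.values())

calls = []  # records every argument f is called with

def f(x):
    calls.append(x)
    return x % 3

result = partition([0, 1, 2, 3, 4], f)
print(result)      # [[0, 3], [1, 4], [2]]
print(len(calls))  # 5
```

This approach does require that f return hashable values, since they are used as dict keys.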
@Tim Peters is right; here is the mentioned setdefault approach and another itertools.groupby option.
Given
import itertools as it
iterable = range(5)
keyfunc = lambda x: x % 3
Code
setdefault
d = {}
for x in iterable:
    d.setdefault(keyfunc(x), []).append(x)
list(d.values())
groupby
[list(g) for _, g in it.groupby(sorted(iterable, key=keyfunc), key=keyfunc)]
See also more on itertools.groupby

Resources