I have an OrderedDict, shown below, in which column 1 and column 2 represent a relationship.
This gave me the following structure:
OrderedDict([('AD',
[['A', 'Q_30', 100],
['A', 'Q_24', 74],
['B', 'Q_28', 37],
['B', 'Q_30', 100],
['C', 'Q_25', 38],
['C', 'Q_30', 100],
['D', 'D_4', 44],
['E', 'D_4', 44],
['F', 'D_5', 44]])])
I would like to iterate over the elements and, for each one, look at the other rows and collect the column 2 values of the related elements.
For example:
Element A contains Q_30 and Q_24, so collect the related members from the other rows.
Element B contains Q_30, so collect Q_24, Q_28 and Q_30, and order them by column 3.
The expected result is:
OrderedDict([('AD',
[{'Q_30': 100, 'Q_24': 74, 'Q_25': 38, 'Q_28': 37}, {'D_4': 44}, {'D_5': 44}])])
If I understand this correctly, your "OrderedDict" is currently a tuple holding a list of lists, and is meant to look like this:
OrderedList = ('AD',
[['A', 'Q_30', 100],
['A', 'Q_24', 74],
['B', 'Q_28', 37],
['B', 'Q_30', 100],
['C', 'Q_25', 38],
['C', 'Q_30', 100],
['D', 'D_4', 44],
['E', 'D_4', 44],
['F', 'D_5', 44]])
and you want to convert it into a tuple with a list inside which holds dicts:
OrderedDict = ('AD',
[{'Q_30': 100,
'Q_24': 74,
'Q_25': 38,
'Q_28': 37},
{'D_4': 44},
{'D_5': 44}])
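In this case I am guessing you are looking for groupby(). Note that groupby() only groups consecutive items with the same key, so the rows must already be ordered by column 1, as they are here: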
from itertools import groupby
OrderedList = ('AD',
[['A', 'Q_30', 100],
['A', 'Q_24', 74],
['B', 'Q_28', 37],
['B', 'Q_30', 100],
['C', 'Q_25', 38],
['C', 'Q_30', 100],
['D', 'D_4', 44],
['E', 'D_4', 44],
['F', 'D_5', 44]])
for key, group in groupby(OrderedList[1], lambda x: x[0]):
    for thing in group:
        print("%s is a %s." % (thing[1], key))
Gives:
Q_30 is a A.
Q_24 is a A.
Q_28 is a B.
Q_30 is a B.
Q_25 is a C.
Q_30 is a C.
D_4 is a D.
D_4 is a E.
D_5 is a F.
This is not the full answer, only an example, as I feel a complete solution would be spoon-feeding.
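For readers who want to see where the grouping idea can lead, here is one possible sketch of the merge step described in the question. It is not the answerer's method: the function name merge_related and the variable rows are made up for illustration, and it assumes Python 3.7+ so that plain dicts keep insertion order.

```python
rows = [
    ['A', 'Q_30', 100], ['A', 'Q_24', 74],
    ['B', 'Q_28', 37], ['B', 'Q_30', 100],
    ['C', 'Q_25', 38], ['C', 'Q_30', 100],
    ['D', 'D_4', 44], ['E', 'D_4', 44], ['F', 'D_5', 44],
]


def merge_related(rows):
    """Merge elements that share a column-2 label (hypothetical helper)."""
    # Collect the {label: score} pairs owned by each column-1 element.
    members = {}
    for name, label, score in rows:
        members.setdefault(name, {})[label] = score

    groups = []  # each group is a dict of label -> score
    for labels in members.values():
        merged = dict(labels)
        untouched = []
        for group in groups:
            if group.keys() & labels.keys():
                merged.update(group)     # shares a label: absorb the group
            else:
                untouched.append(group)  # unrelated: keep as is
        groups = untouched + [merged]

    # Order every group by score (column 3), highest first.
    return [dict(sorted(g.items(), key=lambda item: item[1], reverse=True))
            for g in groups]


print(merge_related(rows))
```

With the sample rows this prints the three groups from the question's expected result. Wrapping the list back into an OrderedDict keyed by 'AD' is left out to keep the sketch short.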
I have the following set of rules for grading system
if 25 < score <= 30, grade = A.
if 20 < score <= 25, grade = B.
if 15 < score <= 20, grade = C.
if 10 < score <= 15, grade = D.
if 5 < score <= 10, grade = E.
if 0 <= score <= 5, grade = F.
So I have to write a function that takes a score as a parameter and returns the letter grade. I can do this using selection statements (if, else), but I want to do it in a different manner.
For instance, I want to declare a dictionary like the one below:
gradeDict = {
'A': [26, 27, 28, 29, 30],
'B': [21, 22, 23, 24, 25],
'C': [16, 17, 18, 19, 20],
'D': [11, 12, 13, 14, 15],
'E': [6, 7, 8, 9, 10],
'F': [0, 1, 2, 3, 4, 5]
}
So when I check a score against the values, I want to get back the key.
In Python I've learned about something like dict.get(term, 'otherwise'), but that looks up a key and gives you the value. Is there any mechanism that does the opposite, i.e. one where we pass a value and get back the key?
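As far as I know, dict has no built-in reverse lookup, but the dictionary above can be inverted once so that get() works in the desired direction. The sketch below assumes integer scores only; the names score_to_grade and grade_for are made up for illustration.

```python
gradeDict = {
    'A': [26, 27, 28, 29, 30],
    'B': [21, 22, 23, 24, 25],
    'C': [16, 17, 18, 19, 20],
    'D': [11, 12, 13, 14, 15],
    'E': [6, 7, 8, 9, 10],
    'F': [0, 1, 2, 3, 4, 5],
}

# Invert the mapping: every individual score becomes a key for its letter.
score_to_grade = {
    score: letter
    for letter, scores in gradeDict.items()
    for score in scores
}


def grade_for(score):
    """Return the letter for an integer score, or 'otherwise' if unknown."""
    return score_to_grade.get(score, 'otherwise')


print(grade_for(27))  # A
print(grade_for(31))  # otherwise
```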
The bisect module in the standard library offers an elegant solution to problems like this one. In fact, grading is one of the examples shown in the docs. Here is an adaptation of that example modeled on the OP's grading curve:
Example:
from bisect import bisect_left
def grade(score, breakpoints=[5, 10, 15, 20, 25], grades='FEDCBA'):
    i = bisect_left(breakpoints, score)
    return grades[i]
[grade(score) for score in [1, 5, 8, 10, 11, 15, 17, 20, 22, 25, 26]]
Output:
['F', 'F', 'E', 'E', 'D', 'D', 'C', 'C', 'B', 'B', 'A']
The funny thing is that you don't even need a dictionary for this, just a list. Of course, you can do it dictionary-style by declaring the following dict:
gradeDict = {
1:'F',
2:'E',
3:'D',
4:'C',
5:'B',
6:'A'
}
This dict is of little use, since its keys are just the consecutive indexes 1, 2, 3, ...
You can turn it into a list: grades_arr = ['F', 'E', 'D', 'C', 'B', 'A']
"But how can I get the letter that I need?" you may ask. Simple: integer-divide the score by 5. 21 // 5 is 4, and grades_arr[21 // 5] is 'B'.
There are two special cases:
When the score is a multiple of 5, subtract 1 from the index, because, for example, 25 // 5 is 5 but grades_arr[5] is 'A', not 'B'.
When the score is 0, do not subtract.
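Put together, the integer-division idea with its two special cases could look like the sketch below. It assumes integer scores from 0 to 30; the function name grade_by_division and the range check are additions for illustration.

```python
grades_arr = ['F', 'E', 'D', 'C', 'B', 'A']


def grade_by_division(score):
    """Map an integer score in 0..30 to a letter using score // 5."""
    if not 0 <= score <= 30:
        raise ValueError("score must be between 0 and 30")
    index = score // 5
    # Multiples of 5 belong to the lower band, except for 0 itself.
    if score % 5 == 0 and score != 0:
        index -= 1
    return grades_arr[index]


print([grade_by_division(s) for s in [0, 5, 6, 21, 25, 26, 30]])
# ['F', 'F', 'E', 'B', 'B', 'A', 'A']
```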
from bisect import bisect
grades = "FEDCBA"
breakpoints = [30, 44, 66, 75, 85]
def grade(total):
    return grades[bisect(breakpoints, total)]
print(grade(66))
print(list(map(grade, [33, 99, 77, 44, 12, 88])))
'''
C
['E', 'A', 'B', 'D', 'F', 'A']
'''
This is not my program; I imported it from Enki.
The bisect module provides support for maintaining a list in sorted order without having to sort the list after each insertion.
So, when we call grade(66), it passes 66 to the grade function, which returns C. How?
The second print statement is even more confusing:
it maps the grade function over a list.
If I try print(grades[bisect(breakpoints, grades)]),
I get an error:
TypeError: '<' not supported between instances of 'str' and 'int'
Your code produced the correct results for the data you fed it!
The breakpoint list [30, 44, 66, 75, 85] bisects the letter string "FEDCBA" as follows:
If grade < 30 then F
If 30 <= grade < 44 then E
If 44 <= grade < 66 then D
If 66 <= grade < 75 then C
If 75 <= grade < 85 then B
If 85 <= grade then A
Therefore print(grade(66)) correctly resulted in an output of C.
Similarly, your print(list(map(grade, [33, 99, 77, 44, 12, 88]))) correctly resulted in an output of ['E', 'A', 'B', 'D', 'F', 'A'].
As for the TypeError from print(grades[bisect(breakpoints, grades)]): that call asks bisect() to compare the string grades with the integers in breakpoints, which is what the error message is complaining about. It looks like you meant grades[bisect(breakpoints, total)] instead. Notice total rather than grades as the second argument to bisect().
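To make the lookup less mysterious, the small sketch below returns the index that bisect() computes alongside the letter it selects. The helper name grade_with_index is made up for illustration; the data is the same as in the question.

```python
from bisect import bisect

grades = "FEDCBA"
breakpoints = [30, 44, 66, 75, 85]


def grade_with_index(total):
    """Return (index, letter) so the bisect step is visible."""
    # bisect() counts how many breakpoints are <= total.
    index = bisect(breakpoints, total)
    return index, grades[index]


for total in [12, 30, 66, 99]:
    print(total, grade_with_index(total))
# 12 (0, 'F')
# 30 (1, 'E')
# 66 (3, 'C')
# 99 (5, 'A')
```

Since the index is just a position in the string "FEDCBA", a total of 66 passes three breakpoints (30, 44, 66) and lands on grades[3], which is 'C'.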
Here's another version of the working code, which puts all the variables at the top so you can change them more easily for testing:
from bisect import bisect
data_list = [33, 99, 77, 44, 12, 88]
grade_string = 'FEDCBA'
breakpoint_list = [30, 44, 66, 75, 85]
def grade(total, breakpoints=breakpoint_list, grades=grade_string):
    # Index of the band that total falls into
    i = bisect(breakpoints, total)
    return grades[i]
print([grade(total) for total in data_list])
The output is:
['E', 'A', 'B', 'D', 'F', 'A']
from bisect import bisect
grades = "FEDCBA"
# Note: bisect assumes a sorted list; with 95 first, this list is not sorted.
breakpoints = [95, 44, 66, 75, 85]
def grade(total):
    i = bisect(breakpoints, total)
    return grades[i]
print("Original Data:", [grade(total) for total in breakpoints ])
print("Data within print statement:",list(map(grade, [33, 99, 77, 44, 12, 88])))
Thanks Scott,
Original Data: ['A', 'D', 'C', 'B', 'A']
Data within print statement: ['F', 'A', 'B', 'D', 'F', 'A']
I was able to produce the output I wanted, for learning purposes.
I have a dataframe as given below:
import pandas as pd
data = {
    'Code': ['P', 'J', 'M', 'Y', 'P', 'Z', 'P', 'P', 'J', 'P', 'J', 'M', 'P', 'Z', 'Y', 'M', 'Z', 'J', 'J'],
    'Value': [10, 10, 20, 30, 10, 40, 50, 10, 10, 20, 10, 50, 60, 40, 30, 20, 40, 20, 10]
}
example = pd.DataFrame(data)
Using Python 3, I want to create another dataframe from example such that, for each Value, the Code that occurs most often with that Value is selected.
The new dataframe should look like solution below:
output = {'Code': ['J', 'M', 'Y', 'Z', 'P', 'M'],'Value': [10, 20, 30, 40, 50, 50]}
solution = pd.DataFrame(output)
As can be seen, J is associated with Value 10 more often than any other Code, so J is selected, and so on.
You could define a function that returns the most frequently occurring items and apply it to the grouped elements. Finally, explode the resulting lists into rows.
>>> from collections import Counter
>>> def most_occurring(grp):
...     res = Counter(grp)
...     highest = max(res.values())
...     return [k for k, v in res.items() if v == highest]
...
>>> example.groupby('Value')['Code'].apply(most_occurring).explode().reset_index()
   Value Code
0     10    J
1     20    M
2     30    Y
3     40    Z
4     50    P
5     50    M
6     60    P
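To see why Value 50 yields two rows, it can help to call most_occurring on plain lists, without pandas. This is only a quick check of the helper from the answer; the two input lists are the Code entries that accompany Value 10 and Value 50 in the question's data.

```python
from collections import Counter


def most_occurring(grp):
    """Return every item that shares the highest count."""
    res = Counter(grp)
    highest = max(res.values())
    return [k for k, v in res.items() if v == highest]


# Codes seen with Value 10: J appears four times, P three times.
print(most_occurring(['P', 'J', 'P', 'P', 'J', 'J', 'J']))  # ['J']
# Codes seen with Value 50: P and M tie, so both are returned.
print(most_occurring(['P', 'M']))  # ['P', 'M']
```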
If I understood correctly, you need something like this:
# Row positions for every (Code, Value) pair
grouped = example.groupby(['Code', 'Value']).indices
# One [Code, Value, count] entry per pair
arr_tmp = [[code, value, len(rows)] for (code, value), rows in grouped.items()]
output = pd.DataFrame(data=arr_tmp, columns=['Code', 'Value', 'index_count'])
output = output.sort_values(by=['index_count'], ascending=False)
output.reset_index(inplace=True)
output
I have a nested list of this form: dl = [[13, 22, 41], ['c', 'b', 'a']], in which each element dl[0][i] is paired with the value dl[1][i] at the same index. How can I sort the list using the dl[0] values as the sort key while keeping both sublists linked? The sublists are 'linked data', so each dl[0][i] and dl[1][i] pair must still share an index after the whole parent list is sorted by the values of the first sublist.
I expect something like:
input: dl = [ [14,22,7,17], ['K', 'M', 'F','A'] ]
output: dl = [ [7, 14, 17, 22], ['F', 'K', 'A', 'M'] ]
This was way too much fun to write. I don't doubt that this function can be greatly improved, but this is what I've gotten in a very short amount of time and should get you started.
I've included some tests just so you can verify that this does indeed do what you want.
from unittest import TestCase, main


def sort_by_first(data):
    sorted_data = []
    for seq in data:
        zipped_to_first = zip(data[0], seq)
        sorted_by_first = sorted(zipped_to_first)
        unzipped_data = zip(*sorted_by_first)
        sorted_data.append(list(tuple(unzipped_data)[1]))
    return sorted_data


class SortByFirstTestCase(TestCase):
    def test_sort(self):
        output_1 = sort_by_first([[1, 3, 5, 2, 4], ['a', 'b', 'c', 'd', 'e']])
        self.assertEqual(output_1, [[1, 2, 3, 4, 5], ['a', 'd', 'b', 'e', 'c']])
        output_2 = sort_by_first([[9, 1, 5], [21, 22, 23], ['spam', 'foo', 'bar']])
        self.assertEqual(output_2, [[1, 5, 9], [22, 23, 21], ['foo', 'bar', 'spam']])


if __name__ == '__main__':
    main()
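A more compact variant of the same zip-and-sort idea is sketched below. It is an alternative, not the answer's code: the name sort_linked is made up, and it assumes all sublists have the same length and that there is at least one element.

```python
def sort_linked(dl):
    """Sort all sublists together, ordered by the first sublist."""
    # zip(*dl) turns the parallel sublists into rows: (14, 'K'), (22, 'M'), ...
    rows = sorted(zip(*dl), key=lambda row: row[0])
    # zip(*rows) turns the sorted rows back into parallel sublists.
    return [list(column) for column in zip(*rows)]


dl = [[14, 22, 7, 17], ['K', 'M', 'F', 'A']]
print(sort_linked(dl))  # [[7, 14, 17, 22], ['F', 'K', 'A', 'M']]
```

Sorting with key=lambda row: row[0] means ties in the first sublist keep their original relative order instead of being compared on the second sublist.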
Updated for what you're looking for: this is a selection sort with one extra line that applies each swap to the second list as well, so it stays matched with the first.
for i in range(len(dl[0])):
    min_idx = i
    for j in range(i + 1, len(dl[0])):
        if dl[0][min_idx] > dl[0][j]:
            min_idx = j
    # Swap in both sublists so the pairs stay aligned
    dl[0][i], dl[0][min_idx] = dl[0][min_idx], dl[0][i]
    dl[1][i], dl[1][min_idx] = dl[1][min_idx], dl[1][i]
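You can also try solving this with a for loop. Note that this sorts each sublist independently, so the pairs stay aligned only when both sublists happen to have the same ordering, as in this example:
dl = [[3, 2, 1], ['c', 'b', 'a']]
for i in range(len(dl)):
    dl[i].sort()
print(dl)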