Related
How do I add element 1 in all the lists? After that, add element 2? I need to find the percentage of them.
rows = [["SOBs", 60, 80, 70, 75], ["Test1", 60, 50, 60, 65], ["Test2", 40, 30, 40, 45], ["Test3", 45, 90, 80, 85], ["CW", 40, 80, 70, 75]]
I have tried in this manner:
sum(sum(rows, [2])) - this doesn't work
print(sum(rows[0][1] + [1][1])) - also doesn't work
So the for element 1 in it would be 60+60+40+45+40 = 245
Then I take the 245/500*100 = 49%
You can try with list comprehension easily.
element = 1
rows = [["SOBs", 60, 80, 70, 75], ["Test1", 60, 50, 60, 65], ["Test2", 40, 30, 40, 45], ["Test3", 45, 90, 80, 85], ["CW", 40, 80, 70, 75]]
sum_1= sum([rows[i][element] for i in range(len(rows))])
element = 2
sum_2= sum([rows[i][element] for i in range(len(rows))])
print((sum_1*100)/(sum_1+sum_2))
Consider it like a table, and use comprehensions, to summarize columns.
Try this
rows = [["SOBs", 60, 80, 70, 75], ["Test1", 60, 50, 60, 65], ["Test2", 40, 30, 40, 45], ["Test3", 45, 90, 80, 85],
["CW", 40, 80, 70, 75]]
COL_LENGTH = 4 # set as 4, as you have only 4 colums to evaluate
# Create a list of column values for each column index
selected_cols = [[row[i] for row in rows] for i in range(1, COL_LENGTH)]
# Get sum of every column
sums = [sum(col) for col in selected_cols]
# get avg of every column
avgs = [_sum / 5 for _sum in sums]
im a begginer with DataFrame.
I have task that i need to write a query for DataFram
This is how my dataframe look like:
I need the first row that has minimum age and points is bigger then 100.
i tried to use min() function but i dont know how to use anther query?
How to build the DF
dict1 ={'Driver':['Hamilton', 'Vettel', 'Raikkonen',
'Verstappen', 'Bottas', 'Ricciardo',
'Hulkenberg', 'Perez', 'Magnussen',
'Sainz', 'Alonso', 'Ocon', 'Leclerc',
'Grosjean', 'Gasly', 'Vandoorne',
'Ericsson', 'Stroll', 'Hartley', 'Sirotkin'],
'Points':[408, 320, 251, 249, 247, 170, 69, 62, 56,
53, 50, 49, 39, 37, 29, 12, 9, 6, 4, 1],
'Age':[33, 31, 39, 21, 29, 29, 31, 28, 26, 24, 37,
22, 21, 32, 22, 26, 28, 20, 29, 23]}
df = pd.DataFrame(dict1)
Expected result row3: Verstappen 249 21
because he is a yungest age and points is bigger then 100
Thanks for any help!
You can use df.loc and df.idxmin() for this:
Find all rows with Points > 100 and from these rows find the index of row with min Age:
In [3124]: ix = df[df.Points.gt(100)].Age.idxmin()
Use the above index to find the row from the df:
In [3126]: df = df.loc[ix]
In [3127]: df
Out[3127]:
Driver Verstappen
Points 249
Age 21
Name: 3, dtype: object
I'm a brand new to programming and I'm stuck on a practice exercise.
EDIT: exact error code "avgDict[k] =max(sum(v)/ float(len(v)))
TypeError: 'float' object is not iterable"
If I remove max it's printing every student's avg.
# student_grades contains scores (out of 100) for 5 assignments
diction = {
'Andrew': [56, 79, 90, 22, 50],
'Colin': [88, 62, 68, 75, 78],
'Alan': [95, 88, 92, 85, 85],
'Mary': [76, 88, 85, 82, 90],
'Tricia': [99, 92, 95, 89, 99]
}
def averageGrades(diction):
avgDict = {}
for k, v in diction.items():
avgDict[k] =max(sum(v)/ float(len(v)))
return avgDict
What your code is doing right now is iterating through each key-value pair of the dictionary (where the key is the student's name and the value is the list of the student's grades), and then calculating the average for the student. Then, the way you are using max right now, it is trying to find the max of a single student's average. This is why you are receiving the error, because max expects either an iterable or multiple parameters, and a float (which is the value produced by sum(v) / float(len(v))) is not an iterable. You should instead compute all of the averages first, and then find the max value in the dictionary of averages:
diction = {
'Andrew': [56, 79, 90, 22, 50],
'Colin': [88, 62, 68, 75, 78],
'Alan': [95, 88, 92, 85, 85],
'Mary': [76, 88, 85, 82, 90],
'Tricia': [99, 92, 95, 89, 99]
}
def averageGrades(diction):
avgDict = {}
for k, v in diction.items():
avgDict[k] = sum(v) / len(v)
return max(avgDict.items(), key=lambda i: i[1]) # find the pair which has the highest value
print(averageGrades(diction)) # ('Tricia', 94.8)
Sidenote, in Python 3, using / does normal division (as opposed to integer division) by default, so casting len(v) to a float is unnecessary.
Alternatively, if you don't need to create the avgDict variable, you can just determine the max directly without the intermediate variable:
def averageGrades(diction):
return max([(k, sum(v) / len(v)) for k, v in diction.items()], key=lambda i: i[1])
print(averageGrades(diction)) # ('Tricia', 94.8)
list = [1,2,,3,4,5,6,1,2,56,78,45,90,34]
range = ["0-25","25-50","50-75","75-100"]
I am coding in python. I want to sort a list of integers in range of numbers and store them in differrent lists.How can i do it?
I have specified my ranges in the the range list.
Create a dictionary with max-value of each bin as key. Iterate through your numbers and append them to the list that's the value of each bin-key:
l = [1,2,3,4,5,6,1,2,56,78,45,90,34]
# your range covers 25 a piece - and share start/endvalues.
# I presume [0-25[ ranges
def inRanges(data,maxValues):
"""Sorts elements of data into bins that have a max-value. Max-values are
given by the list maxValues which holds the exclusive upper bound of the bins."""
d = {k:[] for k in maxValues} # init all keys to empty lists
for n in data:
key = min(x for x in maxValues if x>n) # get key
d[key].append(n) # add number
return d
sortEm = inRanges(l,[25,50,75,100])
print(sortEm)
print([ x for x in sortEm.values()])
Output:
{25: [1, 2, 3, 4, 5, 6, 1, 2], 50: [25, 45, 34],
75: [56], 100: [78, 90]}
[[1, 2, 3, 4, 5, 6, 1, 2], [25, 45, 34], [56], [78, 90]]
Another stable bin approach for your special case (regular intervaled bins) would be to use a calculated key - this would get rid of the key-search in each step.
Stable search means the order of numbers in the list is the same as in the input data:
def inRegularIntervals(data, interval):
"""Sorts elements of data into bins of regular sizes.
The size of each bin is given by 'interval'."""
# init dict so keys are ordered - collection.defaultdict(list)
# would be faster - but this works for lists of a couple of
# thousand numbers if you have a quarter up to one second ...
# if random key order is ok, shorten this to d = {}
d = {k:[] for k in range(0, max(data), interval)}
for n in data:
key = n // interval # get key
key *= interval
d.setdefault(key, [])
d[key ].append(n) # add number
return d
Use on random data:
from random import choices
data = choices(range(100), k = 50)
data.append(135) # add a bigger value to see the gapped keys
binned = inRegularIntervals(data, 25)
print(binned)
Output (\n and spaces added):
{ 0: [19, 9, 1, 0, 15, 22, 4, 9, 12, 7, 12, 9, 16, 2, 7],
25: [25, 31, 37, 45, 30, 48, 44, 44, 31, 39, 27, 36],
50: [50, 50, 58, 60, 70, 69, 53, 53, 67, 59, 52, 64],
75: [86, 93, 78, 93, 99, 98, 95, 75, 88, 82, 79],
100: [],
125: [135], }
To sort the binned lists in place, use
for k in binned:
binned[k].sort()
to get:
{ 0: [0, 1, 2, 4, 7, 7, 9, 9, 9, 12, 12, 15, 16, 19, 22],
25: [25, 27, 30, 31, 31, 36, 37, 39, 44, 44, 45, 48],
50: [50, 50, 52, 53, 53, 58, 59, 60, 64, 67, 69, 70],
75: [75, 78, 79, 82, 86, 88, 93, 93, 95, 98, 99],
100: [],
125: [135]}
This Question is about challenge number 6 in set number 1 in the challenges of "the cryptopals crypto challenges".
The challenge is:
There's a file here. It's been base64'd after being encrypted with repeating-key XOR.
Decrypt it.
After that there's a description of steps to decrypt the file, There is total of 8 steps. You can find them in the site.
I have been trying to solve this challenge for a while and I am struggling with the final two steps. Even though I've solved challenge number 3, and it contains the solution for these steps.
Note: It is, of course, possible that there is a mistake in the first 6 steps but they seems to work well after looking at the print after every step.
My code:
Written in Python 3.6.
In order to not deal with web requests, and since it is not the purpose of this challenge. I just copied the content of the file to a string in the begging, You can do this as well before running the code.
import base64
# Encoding the file from base64 to binary
file = base64.b64decode("""HUIfTQsP...JwwRTWM=""")
print(file)
print()
# Step 1 - guess key size
KEYSIZE = 4
# Step 2 - find hamming distance - number of differing bits
def hamming2(s1, s2):
"""Calculate the Hamming distance between two bit strings"""
assert len(s1) == len(s2)
return sum(c1 != c2 for c1, c2 in zip(s1, s2))
def distance(a, b): # Hamming distance
calc = 0
for ca, cb in [(a[i], b[i]) for i in range(len(a))]:
bina = '{:08b}'.format(int(ca))
binb = '{:08b}'.format(int(cb))
calc += hamming2(bina, binb)
return calc
# Test step 2
print("distance: 'this is a test' and 'wokka wokka!!!' =", distance([ord(c) for c in "this is a test"], [ord(c) for c in "wokka wokka!!!"])) # 37 - Working
print()
# Step 3
key_sizes = []
# For each key size
for KEYSIZE in range(2, 41):
# take the first KEYSIZE worth of bytes, and the second KEYSIZE worth of bytes -
# file[0:KEYSIZE], file[KEYSIZE:2*KEYSIZE]
# and find the edit distance between them
# Normalize this result by dividing by KEYSIZE
key_sizes.append((distance(file[0:KEYSIZE], file[KEYSIZE:2*KEYSIZE]) / KEYSIZE, KEYSIZE))
key_sizes.sort(key=lambda a: a[0])
# Step 4
for val, key in key_sizes:
print(key, ":", val)
KEYSIZE = key_sizes[0][1]
print()
# Step 5 + 6
# Each line is a list of all the bytes in that index
splited_file = [[] for i in range(KEYSIZE)]
counter = 0
for char in file:
splited_file[counter].append(char)
counter += 1
counter %= KEYSIZE
for line in splited_file:
print(line)
print()
# Step 7
# Code from another level
# Gets a string and a single char
# Doing a single-byte XOR over it
def single_char_string(a, b):
final = ""
for c in a:
final += chr(c ^ b)
return final
# Going over all the bytes and listing the result arter the XOR by number of bytes
def find_single_byte(in_string):
helper_list = []
for num in range(256):
helper_list.append((single_char_string(in_string, num), num))
helper_list.sort(key=lambda a: a[0].count(' '), reverse=True)
return helper_list[0]
# Step 8
final_key = ""
key_list = []
for line in splited_file:
result = find_single_byte(line)
print(result)
final_key += chr(result[1])
key_list.append(result[1])
print(final_key)
print(key_list)
Output:
b'\x1dB\x1fM\x0b\x0f\x02\x1fO\x13N<\x1aie\x1fI...\x08VA;R\x1d\x06\x06TT\x0e\x10N\x05\x16I\x1e\x10\'\x0c\x11Mc'
distance: 'this is a test' and 'wokka wokka!!!' = 37
5 : 1.2
3 : 2.0
2 : 2.5
.
.
.
26 : 3.5
28 : 3.5357142857142856
9 : 3.5555555555555554
22 : 3.727272727272727
6 : 4.0
[29, 15, 78, 31, 19, 27, 0, 32, ... 17, 26, 78, 38, 28, 2, 1, 65, 6, 78, 16, 99]
[66, 2, 60, 73, 1, 1, 30, 3, 13, ... 26, 14, 0, 26, 79, 99, 8, 79, 11, 4, 82, 59, 84, 5, 39]
[31, 31, 19, 26, 79, 47, 17, 28, ... 71, 89, 12, 1, 16, 45, 78, 3, 120, 11, 42, 82, 84, 22, 12]
[77, 79, 105, 14, 7, 69, 73, 29, 101, ... 54, 70, 78, 55, 7, 79, 31, 88, 10, 69, 65, 8, 29, 14, 73, 17]
[11, 19, 101, 78, 78, 54, 100, 67, 82, ... 1, 76, 26, 1, 2, 73, 21, 72, 73, 49, 27, 86, 6, 16, 30, 77]
('=/n?3; \x00\x13&-,>1...r1:n\x06<"!a&n0C', 32)
('b"\x1ci!!>ts es(ogg ...5i<% tc:. :oC(o+$r\x1bt%\x07', 32)
('??:<+6!=ngm2i4\x0byD...&h9&2:-)sm.a)u\x06&=\x0ct&~n +=&*4X:<(3:o\x0f1<mE gy,!0\rn#X+\nrt6,', 32)
('moI.\'ei=Et\'\x1c:l ...6k=\x1b m~t*\x155\x1ei+=+ts/e*9$sgl0\'\x02\x16fn\x17\'o?x*ea(=.i1', 32)
('+3Enn\x16Dcr<$,)\x01...i5\x01,hi\x11;v&0>m', 32)
[32, 32, 32, 32, 32]
Notice that in the printing of the key as string you cannot see it but there is 5 chars in there.
It is not the correct answer since you can see that in the forth part - after the XOR, the results do not look like words... Probably a problem in the last two functions but I couldn't figure it out.
I've also tried some other lengths and It does not seems to be the problem.
So what I'm asking is not to fix my code, I want to solve this challenge by myself :). I would like you to tell me where I am wrong? why? and how should I continue?
Thank you for your help.
After a lot of thinking and checking the conclusion was that the problem is in step number 3. The result was not good enough since I looked only at the first two blocks.
I fixed the code so it will calculate the KEYSIZE according to all of the blocks.
The code of Step 3 now look like this:
# Step 3
key_sizes = []
# For each key size
for KEYSIZE in range(2, 41):
running_sum = []
for i in range(0, int(len(file) / KEYSIZE) - 1):
running_sum.append(distance(file[i * KEYSIZE:(i + 1) * KEYSIZE],
file[(i + 1) * KEYSIZE:(i + 2) * KEYSIZE]) / KEYSIZE)
key_sizes.append((sum(running_sum)/ len(running_sum), KEYSIZE))
key_sizes.sort(key=lambda a: a[0])
Thanks for any one who tried to help.