Select nth row after orderby in pyspark dataframe - apache-spark

I want to select the second row for each group of names. I used orderby to sort by name and then the purchase date/timestamp. It is important that I select the second purchase for each name (by datetime).
Here is the data to build the dataframe:
from datetime import datetime
data = [
('George', datetime(2020, 3, 24, 3, 19, 58), datetime(2018, 2, 24, 3, 22, 55)),
('Andrew', datetime(2019, 12, 12, 17, 21, 30), datetime(2019, 7, 21, 2, 14, 22)),
('Micheal', datetime(2018, 11, 22, 13, 29, 40), datetime(2018, 5, 17, 8, 10, 19)),
('Maggie', datetime(2019, 2, 8, 3, 31, 23), datetime(2019, 5, 19, 6, 11, 33)),
('Ravi', datetime(2019, 1, 1, 4, 19, 47), datetime(2019, 1, 1, 4, 22, 55)),
('Xien', datetime(2020, 3, 2, 4, 33, 51), datetime(2020, 5, 21, 7, 11, 50)),
('George', datetime(2020, 3, 24, 3, 19, 58), datetime(2020, 3, 24, 3, 22, 45)),
('Andrew', datetime(2019, 12, 12, 17, 21, 30), datetime(2019, 9, 19, 1, 14, 11)),
('Micheal', datetime(2018, 11, 22, 13, 29, 40), datetime(2018, 8, 19, 7, 11, 37)),
('Maggie', datetime(2019, 2, 8, 3, 31, 23), datetime(2018, 2, 19, 6, 11, 42)),
('Ravi', datetime(2019, 1, 1, 4, 19, 47), datetime(2019, 1, 1, 4, 22, 17)),
('Xien', datetime(2020, 3, 2, 4, 33, 51), datetime(2020, 6, 21, 7, 11, 11)),
('George', datetime(2020, 3, 24, 3, 19, 58), datetime(2020, 4, 24, 3, 22, 54)),
('Andrew', datetime(2019, 12, 12, 17, 21, 30), datetime(2019, 8, 30, 3, 12, 41)),
('Micheal', datetime(2018, 11, 22, 13, 29, 40), datetime(2017, 5, 17, 8, 10, 38)),
('Maggie', datetime(2019, 2, 8, 3, 31, 23), datetime(2020, 3, 19, 6, 11, 12)),
('Ravi', datetime(2019, 1, 1, 4, 19, 47), datetime(2018, 2, 1, 4, 22, 24)),
('Xien', datetime(2020, 3, 2, 4, 33, 51), datetime(2018, 9, 21, 7, 11, 41)),
]
df = sqlContext.createDataFrame(data, ['name', 'trial_start', 'purchase'])
df.show(truncate=False)
I order the data by name and then purchase
df.orderBy("name","purchase").show()
to produce the result:
+-------+-------------------+-------------------+
|   name|        trial_start|           purchase|
+-------+-------------------+-------------------+
| Andrew|2019-12-12 22:21:30|2019-07-21 06:14:22|
| Andrew|2019-12-12 22:21:30|2019-08-30 07:12:41|
| Andrew|2019-12-12 22:21:30|2019-09-19 05:14:11|
| George|2020-03-24 07:19:58|2018-02-24 08:22:55|
| George|2020-03-24 07:19:58|2020-03-24 07:22:45|
| George|2020-03-24 07:19:58|2020-04-24 07:22:54|
| Maggie|2019-02-08 08:31:23|2018-02-19 11:11:42|
| Maggie|2019-02-08 08:31:23|2019-05-19 10:11:33|
| Maggie|2019-02-08 08:31:23|2020-03-19 10:11:12|
|Micheal|2018-11-22 18:29:40|2017-05-17 12:10:38|
|Micheal|2018-11-22 18:29:40|2018-05-17 12:10:19|
|Micheal|2018-11-22 18:29:40|2018-08-19 11:11:37|
|   Ravi|2019-01-01 09:19:47|2018-02-01 09:22:24|
|   Ravi|2019-01-01 09:19:47|2019-01-01 09:22:17|
|   Ravi|2019-01-01 09:19:47|2019-01-01 09:22:55|
|   Xien|2020-03-02 09:33:51|2018-09-21 11:11:41|
|   Xien|2020-03-02 09:33:51|2020-05-21 11:11:50|
|   Xien|2020-03-02 09:33:51|2020-06-21 11:11:11|
+-------+-------------------+-------------------+
How might I get the second row for each name? In pandas it was easy: I could just use nth. I have been looking at SQL but have not found a solution. Any suggestions are appreciated.
The output I am looking for would be:
+-------+-------------------+-------------------+
|   name|        trial_start|           purchase|
+-------+-------------------+-------------------+
| Andrew|2019-12-12 22:21:30|2019-08-30 07:12:41|
| George|2020-03-24 07:19:58|2020-03-24 07:22:45|
| Maggie|2019-02-08 08:31:23|2019-05-19 10:11:33|
|Micheal|2018-11-22 18:29:40|2018-05-17 12:10:19|
|   Ravi|2019-01-01 09:19:47|2019-01-01 09:22:17|
|   Xien|2020-03-02 09:33:51|2020-05-21 11:11:50|
+-------+-------------------+-------------------+

Try the row_number() window function, then keep only the second row of each group after ordering by purchase.
Example:
from pyspark.sql import Window
from pyspark.sql.functions import col, row_number
w = Window.partitionBy("name").orderBy(col("purchase"))
df.withColumn("rn", row_number().over(w)).filter(col("rn") == 2).drop("rn").show()
SQL API:
df.createOrReplaceTempView("tmp")
# lets the backtick regex below select every column except rn
spark.sql("SET spark.sql.parser.quotedRegexColumnNames=true")
spark.sql("select `(rn)?+.+` from (select *, row_number() over (partition by name order by purchase) rn from tmp) e where rn = 2").show()
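To see what the window does, the same logic can be sketched in plain Python: partition by name, sort each partition by purchase, number the rows from 1 and keep row n. This is only an illustration of the semantics, not Spark code. The helper name nth_per_group is made up for the example, and the rows are a small subset of the data in the question.

```python
from collections import defaultdict
from datetime import datetime


def nth_per_group(rows, n):
    """Return the n-th row (1-based) of each name group, ordered by purchase.

    Hypothetical helper mirroring row_number() over
    (partition by name order by purchase) followed by a filter on rn == n.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[0]].append(row)  # partition by name

    result = []
    for name in sorted(groups):
        ordered = sorted(groups[name], key=lambda r: r[2])  # order by purchase
        if len(ordered) >= n:  # groups with fewer than n rows yield nothing
            result.append(ordered[n - 1])
    return result


rows = [
    ("Ravi", datetime(2019, 1, 1, 4, 19, 47), datetime(2019, 1, 1, 4, 22, 55)),
    ("Ravi", datetime(2019, 1, 1, 4, 19, 47), datetime(2019, 1, 1, 4, 22, 17)),
    ("Ravi", datetime(2019, 1, 1, 4, 19, 47), datetime(2018, 2, 1, 4, 22, 24)),
    ("Xien", datetime(2020, 3, 2, 4, 33, 51), datetime(2020, 5, 21, 7, 11, 50)),
]

for row in nth_per_group(rows, 2):
    print(row[0], row[2])
```

A name with fewer than n purchases simply drops out of the result, which is also what the filter on rn does in the Spark version.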

Related

bokeh hbar_stack not rendering properly when using datetimes

I'm trying to plot an hbar_stack with datetimes on the x axis, with no luck. I've done normal hbar plots with datetimes before with no problems, so it has to be something with hbar_stack.
Here is the code with some static data:
start_date = datetime.datetime(2020, 7, 10, 10, 26, 15, 240666)
end_date = datetime.datetime(2020, 7, 10, 13, 27, 33, 741238)
tasks = ['task 1', 'task 2', 'task 3', 'task 4']
status = ['status_1', 'status_2', 'status_3', 'status_4']
exports = {'tasks': tasks, 'status_1': [datetime.datetime(2020, 7, 10, 13, 26, 59, 531234),
datetime.datetime(2020, 7, 10, 13, 25, 16, 666837),
datetime.datetime(2020, 7, 10, 10, 37, 16, 368927),
datetime.datetime(2020, 7, 10, 10, 26, 15, 240666)],
'status_2': [None, datetime.datetime(2020, 7, 10, 13, 27, 33, 741238),
datetime.datetime(2020, 7, 10, 11, 37, 7, 629667),
datetime.datetime(2020, 7, 10, 10, 27, 5, 540767)],
'status_3': [None, None, None, datetime.datetime(2020, 7, 10, 10, 54, 17, 738024)],
'status_4': [None, None, None, datetime.datetime(2020, 7, 10, 11, 2, 15, 196620)]}
p = figure(y_range=tasks, x_range=[start_date, end_date], x_axis_type='datetime', title="Tasks timeline",
tools=["hover,pan,reset,save,wheel_zoom"], tooltips=None)
p.xaxis.formatter = DatetimeTickFormatter(
days=["%m-%d-%Y"],
months=["%m-%d-%Y"],
years=["%m-%d-%Y"],
)
p.xaxis.major_label_orientation = radians(30)
p.hbar_stack(status, y='tasks', height=0.2, color=Spectral[11][:len(status)], source=ColumnDataSource(exports))
As one can see from the data, the datetimes are minutes apart, but the plot renders them years apart. When hovering over the data (x, y), the x value does not show a date; instead it shows a big number like 1.589e+12. Any help is appreciated.
from datetime import datetime
from bokeh.plotting import figure, show
dts = [
datetime(2020, 7, 10, 13, 26, 59, 531234),
datetime(2020, 7, 10, 13, 25, 16, 666837),
datetime(2020, 7, 10, 10, 37, 16, 368927),
datetime(2020, 7, 10, 10, 26, 15, 240666)
]
# because the datetimes are in reverse order
ends = dts[0:-1]
starts = dts[1:]
p = figure(plot_height=350, x_axis_type="datetime", y_range=["a", "b", "c"])
p.hbar(y=["b", "b", "b"], left=starts, right=ends,
line_color="white", fill_color=["red", "blue", "orange"])
p.xaxis.formatter.hours = ["%b %Y %H:%M"]
show(p)
This draws three consecutive segments on row "b", one for each gap between adjacent timestamps.
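A likely explanation for the symptom in the question: Bokeh's datetime axes work on milliseconds since the Unix epoch, which is why the hover shows values like 1.589e+12, and a stack places each stacker on top of the sum of the previous ones. Adding two absolute timestamps therefore lands decades in the future. A small sketch using only the standard library (the helper to_epoch_ms is hypothetical, and the naive datetimes are assumed to be UTC):

```python
from datetime import datetime, timezone


def to_epoch_ms(dt):
    """Milliseconds since 1970-01-01 UTC, treating a naive datetime as UTC."""
    return dt.replace(tzinfo=timezone.utc).timestamp() * 1000


status_1 = datetime(2020, 7, 10, 10, 26, 15, 240666)
status_2 = datetime(2020, 7, 10, 10, 27, 5, 540767)

first = to_epoch_ms(status_1)
second = to_epoch_ms(status_2)

# Stacking absolute timestamps sums them, so the second segment would
# end near first + second instead of at status_2.
stacked_end = datetime.fromtimestamp((first + second) / 1000, tz=timezone.utc)
print(f"{first:.3e}", stacked_end.year)

# Working with the gap between statuses keeps the scale sensible:
# here it is well under a minute.
gap_ms = second - first
print(round(gap_ms / 1000, 1), "seconds between the two statuses")
```

This is why the hbar version above passes explicit left and right edges for each segment instead of stacking the raw datetimes.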

Remove elements from a list, then display the list without the removed elements

I have this nested list:
a = [[1, 3, 6, 11, 16, 21, 25, 28, 31, 32, 33, 34, 35, 36],
[1, 2, 5, 9, 15, 20, 24, 26, 30, 36],
[1, 3, 6, 11, 16, 21, 25, 29, 31, 32, 33, 34, 35, 36],
[1, 2, 4, 8, 14, 18, 23, 36],
[1, 2, 5, 9, 15, 20, 24, 27, 30, 36],
[1, 3, 6, 11, 16, 22, 25, 28, 31, 32, 33, 34, 35, 36],
[1, 3, 7, 12, 17, 36],
[1, 2, 4, 8, 14, 19, 23, 36],
[1, 2, 5, 10, 15, 20, 24, 26, 30, 36],
[1, 3, 6, 11, 16, 22, 25, 29, 31, 32, 33, 34, 35, 36],
[1, 2, 5, 10, 15, 20, 24, 27, 30, 36],
[1, 3, 6, 11, 16, 21, 25, 28, 31, 32, 33, 35, 36],
[1, 3, 6, 11, 16, 21, 25, 28, 31, 33, 34, 35,36],
[1, 3, 6, 11, 16, 21, 25, 29, 31, 32, 33, 35, 36]]
I need to find the maximum sublist length in the nested list, then compare each sublist against it. If a sublist matches, it should be removed, and at the end the nested list should be printed without the removed items.
I hope I understand your question correctly.
You want input to be:
a = [[1, 3, 6, 11, 16, 21, 25, 28, 31, 32, 33, 34, 35, 36],
[1, 2, 5, 9, 15, 20, 24, 26, 30, 36],
[1, 3, 6, 11, 16, 21, 25, 29, 31, 32, 33, 34, 35, 36],
[1, 2, 4, 8, 14, 18, 23, 36],
[1, 2, 5, 9, 15, 20, 24, 27, 30, 36],
[1, 3, 6, 11, 16, 22, 25, 28, 31, 32, 33, 34, 35, 36],
[1, 3, 7, 12, 17, 36],
[1, 2, 4, 8, 14, 19, 23, 36],
[1, 2, 5, 10, 15, 20, 24, 26, 30, 36],
[1, 3, 6, 11, 16, 22, 25, 29, 31, 32, 33, 34, 35, 36],
[1, 2, 5, 10, 15, 20, 24, 27, 30, 36],
[1, 3, 6, 11, 16, 21, 25, 28, 31, 32, 33, 35, 36],
[1, 3, 6, 11, 16, 21, 25, 28, 31, 33, 34, 35, 36],
[1, 3, 6, 11, 16, 21, 25, 29, 31, 32, 33, 35, 36]]
We are removing every sublist that has the maximum length (14 elements), for example
[1, 3, 6, 11, 16, 22, 25, 29, 31, 32, 33, 34, 35, 36]
and
[1, 3, 6, 11, 16, 21, 25, 29, 31, 32, 33, 34, 35, 36]
since they are tied for the longest.
The output should be:
a = [[1, 2, 5, 9, 15, 20, 24, 26, 30, 36],
[1, 2, 4, 8, 14, 18, 23, 36],
[1, 2, 5, 9, 15, 20, 24, 27, 30, 36],
[1, 3, 7, 12, 17, 36],
[1, 2, 4, 8, 14, 19, 23, 36],
[1, 2, 5, 10, 15, 20, 24, 26, 30, 36],
[1, 2, 5, 10, 15, 20, 24, 27, 30, 36],
[1, 3, 6, 11, 16, 21, 25, 28, 31, 32, 33, 35, 36],
[1, 3, 6, 11, 16, 21, 25, 28, 31, 33, 34, 35, 36],
[1, 3, 6, 11, 16, 21, 25, 29, 31, 32, 33, 35, 36]]
with the previous lists removed.
Your question was not worded clearly, but I hope this is what you wanted. Here is the code:
# assume a is not empty
d = {}  # maps each max-length list (as a string) to its number of occurrences
# find the length of the longest list
maxLen = len(a[0])
for l in a:
    if len(l) > maxLen:
        maxLen = len(l)
# add lists of the same max length and their count to the dictionary
for l in a:
    if len(l) == maxLen:
        # convert list to string because a list cannot be a dictionary key
        l_string = str(l)
        if l_string in d:
            d[l_string] += 1
        else:
            d[l_string] = 1
# remove
for l_string in d:
    while d[l_string] > 0:
        # convert string back to list and remove
        a.remove(eval(l_string))
        d[l_string] -= 1
# test result if you want
for row in a:
    print(row)
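The same idea can also be written without the dictionary and eval: find the maximum length once and keep only the sublists that are shorter. A sketch on a shortened version of the list (the function name drop_longest is an assumption for illustration):

```python
def drop_longest(nested):
    """Return a new list without the sublists that have the maximum length."""
    if not nested:
        return []
    max_len = max(len(sub) for sub in nested)
    return [sub for sub in nested if len(sub) != max_len]


a = [
    [1, 3, 6, 11, 16, 21, 25, 28, 31, 32, 33, 34, 35, 36],
    [1, 2, 5, 9, 15, 20, 24, 26, 30, 36],
    [1, 3, 6, 11, 16, 21, 25, 29, 31, 32, 33, 34, 35, 36],
    [1, 3, 7, 12, 17, 36],
]

for row in drop_longest(a):
    print(row)
```

Unlike a.remove, this builds a new list and leaves a itself untouched.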

Description text in speech recognition

I'm trying to build my own speech recognition network. I understand how to pre-process the audio, but I can't figure out how to pre-process the text.
I have an alphabet:
alphabet = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5, 'f': 6, 'g': 7, 'h': 8, 'i': 9, 'j': 10, 'k': 11, 'l': 12, 'm': 13, 'n': 14,'o': 15, 'p': 16, 'q': 17, 'r': 18, 's': 19, 't': 20, 'u': 21, 'v': 22, 'w': 23, 'x': 24, 'y': 25, 'z': 26}
And I encode each letter of the sentence into a number (27 is a space):
array([list([27, 23, 8, 5, 14, 27, 8, 5, 27, 19, 16, 5, 1, 11, 19, 27, 9, 14, 27, 15, 21, 18, 27, 12, 1, 14, 7, 21, 1, 7, 5, 27, 9, 27, 3, 1, 14, 27, 9, 14, 20, 5, 18, 16, 18, 5, 20, 27, 23, 8, 1, 20, 27, 8, 5, 27, 8, 1, 19, 27, 19, 1, 9, 4, 27]),
list([27, 19, 15, 27, 14, 15, 23, 27, 9, 27, 6, 5, 1, 18, 27, 14, 15, 20, 8, 9, 14, 7, 27, 2, 5, 3, 1, 21, 19, 5, 27, 9, 20, 27, 23, 1, 19, 27, 20, 8, 15, 19, 5, 27, 15, 13, 5, 14, 19, 27, 20, 8, 1, 20, 27, 2, 18, 15, 21, 7, 8, 20, 27, 25, 15, 21, 27, 20, 15, 27, 13, 5, 27]),
list([27, 14, 9, 7, 8, 20, 27, 6, 5, 12, 12, 27, 1, 14, 4, 27, 1, 14, 27, 1, 19, 19, 15, 18, 20, 13, 5, 14, 20, 27, 15, 6, 27, 6, 9, 7, 8, 20, 9, 14, 7, 27, 13, 5, 14, 27, 1, 14, 4, 27, 13, 5, 18, 3, 8, 1, 14, 20, 19, 27, 5, 14, 20, 5, 18, 5, 4, 27, 1, 14, 4, 27, 5, 24, 9, 20, 5, 4, 27, 20, 8, 5, 27, 20, 5, 14, 20, 27]),
list([27, 9, 27, 8, 5, 1, 18, 4, 27, 1, 27, 6, 1, 9, 14, 20, 27, 13, 15, 22, 5, 13, 5, 14, 20, 27, 21, 14, 4, 5, 18, 27, 13, 25, 27, 6, 5, 5, 20, 27]),
list([27, 25, 15, 21, 27, 3, 1, 13, 5, 27, 19, 15, 27, 20, 8, 1, 20, 27, 25, 15, 21, 27, 3, 15, 21, 12, 4, 27, 12, 5, 1, 18, 14, 27, 1, 2, 15, 21, 20, 27, 25, 15, 21, 18, 27, 4, 18, 5, 1, 13, 19, 27, 19, 1, 9, 4, 27, 20, 8, 5, 27, 15, 12, 4, 27, 23, 15, 13, 1, 14, 27])],
dtype=object)
Here are 5 sentences.
I created a network with a single layer and try to feed this data into it, in order to get a number corresponding to each letter.
model = Sequential()
model.add(Dense(27, input_shape=(20,), activation='softmax'))
model.compile(loss='mean_squared_error',optimizer='Adam', metrics=['accuracy'])
for X, y in batch(X_train, y_train, 5):
    model.train_on_batch(X, y)
batch() just splits X_train and y_train into batches; 5 is the batch size.
But when I try to train the network I get an error:
Error when checking target: expected dense_25 to have shape (27,) but got array with shape (1,)
Update:
I'm using MFCC features for X:
audio, sr = librosa.load(pathTrain+"\\"+str(file), mono=True, sr=None)
fileMFCC = librosa.feature.mfcc(audio)
mean_scale = np.mean(fileMFCC, axis=0)
std_scale = np.std(fileMFCC, axis=0)
fileMFCC = (fileMFCC - mean_scale[np.newaxis, :]) / std_scale[np.newaxis, :]
X is
[array([[-4.35889894, -4.35889894, -4.35455134, ..., -3.95851777,
-3.99308173, -4.05261022],
[ 0.22941573, 0.22941573, 0.31913073, ..., 1.87189324,
1.7987301 , 1.66804349],
[ 0.22941573, 0.22941573, 0.31165866, ..., -0.27962786,
-0.19009062, -0.13788484],
...,
[ 0.22941573, 0.22941573, 0.18657944, ..., 0.14699792,
0.12751924, 0.16724807],
[ 0.22941573, 0.22941573, 0.18478513, ..., 0.00674492,
-0.04570105, 0.01231168],
[ 0.22941573, 0.22941573, 0.18232521, ..., 0.2571599 ,
0.22477036, 0.09153304]])
etc.
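One possible reading of the error: Dense(27, activation='softmax') produces a 27-element vector per sample, so each target must also have shape (27,), while a bare letter code has shape (1,). A minimal sketch of one-hot encoding the letter codes in plain Python, assuming the codes are shifted from 1..27 to class indices 0..26 (the helper one_hot is hypothetical):

```python
NUM_CLASSES = 27  # 26 letters plus the space symbol


def one_hot(code, num_classes=NUM_CLASSES):
    """Turn a letter code in 1..num_classes into a one-hot list."""
    if not 1 <= code <= num_classes:
        raise ValueError(f"code {code} is outside 1..{num_classes}")
    vector = [0] * num_classes
    vector[code - 1] = 1  # shift to a 0-based class index
    return vector


encoded = [27, 23, 8, 5, 14, 27]  # " when " from the first sentence
targets = [one_hot(code) for code in encoded]

print(len(targets), len(targets[0]))
print(targets[1].index(1))
```

In Keras, keras.utils.to_categorical does the same job on arrays. A softmax output is also usually paired with a categorical cross-entropy loss rather than mean_squared_error.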

DES implementation

I'm trying to program the Data Encryption Standard on my own and I'm struggling to program the S-boxes. I know there is already a module to encrypt and decrypt with DES, but my teacher asked me to program it myself, so here is what I have:
import random
from re import findall
class DES:
    def __init__(self):
        self.Eingabe=""
        self.Schluessel=""
        self.NachrichtBinaer=""
        self.Bitslinks=""
        self.Bitsrechts=""
        self.Teil1=""
        self.Teil2=""
        self.Subkey=""
        self.pcschluessel=""
        self.subkeyliste=[]
        self.initialpermutation=""
        self.liste1=[]
        self.Bits48=""
        self.ausgabesbox=[]
        self.liste2=[]
    def EingabeNachricht(self):
        self.Eingabe=input("Geben Sie ein Wort ein:")
        print ("Eingegebenes Wort: ",self.Eingabe)
    def Bitumwandlung(self):
        for i in range(0,len(self.Eingabe)):
            self.NachrichtBinaer=self.NachrichtBinaer+bin(ord(self.Eingabe[i]))
            self.NachrichtBinaer=self.NachrichtBinaer.replace("b","")
        if len(self.NachrichtBinaer)<64:
            self.NachrichtBinaer=self.NachrichtBinaer.rjust(64,"0")
        self.NachrichtBinaer="0000000100100011010001010110011110001001101010111100110111101111"
        print ("Nachricht in Binaer: ",self.NachrichtBinaer)
    def Teilen(self):
        self.Bitslinks=self.initialpermutation[:int(len(self.initialpermutation)/2)]
        self.Bitsrechts=self.initialpermutation[int(len(self.initialpermutation)/2):]
        print ("Teil links: ",self.Bitslinks)
        print ("Teil rechts: ",self.Bitsrechts)
    def SchluesselGenerieren(self):
        #self.Schluessel= getrandbits(64)
        self.Schluessel="0001001100110100010101110111100110011011101111001101111111110001"
    def ippermutation(self):
        ip = [57, 49, 41, 33, 25, 17, 9, 1,
              59, 51, 43, 35, 27, 19, 11, 3,
              61, 53, 45, 37, 29, 21, 13, 5,
              63, 55, 47, 39, 31, 23, 15, 7,
              56, 48, 40, 32, 24, 16, 8, 0,
              58, 50, 42, 34, 26, 18, 10, 2,
              60, 52, 44, 36, 28, 20, 12, 4,
              62, 54, 46, 38, 30, 22, 14, 6
              ]
        for i in ip:
            self.initialpermutation=self.initialpermutation+self.NachrichtBinaer[i]
        print ("Erste Permutation: ",self.initialpermutation)
    def Expandieren(self):
        # 0-based DES expansion table (rows 6 and 7 start with 19 and 23)
        ExpandierenTabelle = [
            31,0,1,2,3,4,
            3,4,5,6,7,8,
            7,8,9,10,11,12,
            11,12,13,14,15,16,
            15,16,17,18,19,20,
            19,20,21,22,23,24,
            23,24,25,26,27,28,
            27,28,29,30,31,0]
        for Elemente in ExpandierenTabelle:
            self.Bits48 = self.Bits48 + self.Bitsrechts[Elemente]
        print ("expandiert: ",self.Bits48)
    def XOR(self,wert1,wert2):
        antwort=wert1^wert2
        return antwort
    def SBox(self):
        self.sbox=[
            # S1
            [[14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7],
             [0, 15, 7, 4, 14, 2, 13, 1, 10, 6, 12, 11, 9, 5, 3, 8],
             [4, 1, 14, 8, 13, 6, 2, 11, 15, 12, 9, 7, 3, 10, 5, 0],
             [15, 12, 8, 2, 4, 9, 1, 7, 5, 11, 3, 14, 10, 0, 6, 13]],
            # S2
            [[15, 1, 8, 14, 6, 11, 3, 4, 9, 7, 2, 13, 12, 0, 5, 10],
             [3, 13, 4, 7, 15, 2, 8, 14, 12, 0, 1, 10, 6, 9, 11, 5],
             [0, 14, 7, 11, 10, 4, 13, 1, 5, 8, 12, 6, 9, 3, 2, 15],
             [13, 8, 10, 1, 3, 15, 4, 2, 11, 6, 7, 12, 0, 5, 14, 9]],
            # S3
            [[10, 0, 9, 14, 6, 3, 15, 5, 1, 13, 12, 7, 11, 4, 2, 8],
             [13, 7, 0, 9, 3, 4, 6, 10, 2, 8, 5, 14, 12, 11, 15, 1],
             [13, 6, 4, 9, 8, 15, 3, 0, 11, 1, 2, 12, 5, 10, 14, 7],
             [1, 10, 13, 0, 6, 9, 8, 7, 4, 15, 14, 3, 11, 5, 2, 12]],
            # S4
            [[7, 13, 14, 3, 0, 6, 9, 10, 1, 2, 8, 5, 11, 12, 4, 15],
             [13, 8, 11, 5, 6, 15, 0, 3, 4, 7, 2, 12, 1, 10, 14, 9],
             [10, 6, 9, 0, 12, 11, 7, 13, 15, 1, 3, 14, 5, 2, 8, 4],
             [3, 15, 0, 6, 10, 1, 13, 8, 9, 4, 5, 11, 12, 7, 2, 14]],
            # S5
            [[2, 12, 4, 1, 7, 10, 11, 6, 8, 5, 3, 15, 13, 0, 14, 9],
             [14, 11, 2, 12, 4, 7, 13, 1, 5, 0, 15, 10, 3, 9, 8, 6],
             [4, 2, 1, 11, 10, 13, 7, 8, 15, 9, 12, 5, 6, 3, 0, 14],
             [11, 8, 12, 7, 1, 14, 2, 13, 6, 15, 0, 9, 10, 4, 5, 3]],
            # S6
            [[12, 1, 10, 15, 9, 2, 6, 8, 0, 13, 3, 4, 14, 7, 5, 11],
             [10, 15, 4, 2, 7, 12, 9, 5, 6, 1, 13, 14, 0, 11, 3, 8],
             [9, 14, 15, 5, 2, 8, 12, 3, 7, 0, 4, 10, 1, 13, 11, 6],
             [4, 3, 2, 12, 9, 5, 15, 10, 11, 14, 1, 7, 6, 0, 8, 13]],
            # S7
            [[4, 11, 2, 14, 15, 0, 8, 13, 3, 12, 9, 7, 5, 10, 6, 1],
             [13, 0, 11, 7, 4, 9, 1, 10, 14, 3, 5, 12, 2, 15, 8, 6],
             [1, 4, 11, 13, 12, 3, 7, 14, 10, 15, 6, 8, 0, 5, 9, 2],
             [6, 11, 13, 8, 1, 4, 10, 7, 9, 5, 0, 15, 14, 2, 3, 12]],
            # S8
            [[13, 2, 8, 4, 6, 15, 11, 1, 10, 9, 3, 14, 5, 0, 12, 7],
             [1, 15, 13, 8, 10, 3, 7, 4, 12, 5, 6, 11, 0, 14, 9, 2],
             [7, 11, 4, 1, 9, 12, 14, 2, 0, 6, 10, 13, 15, 3, 5, 8],
             [2, 1, 14, 7, 4, 10, 8, 13, 15, 12, 9, 0, 3, 5, 6, 11]]
        ]
        for z in range(len(self.sbox)):
            for y in range(len(self.sbox[z])):
                for x in range(len(self.sbox[z][y])):
                    ausserebits=self.liste2[x][z][0]+self.liste2[x][z][-1]
                    innerebits=self.liste2[x][z][1:5]
                    print ("innere: ",innerebits)
                    print ("aussere: ",ausserebits)
    def pc1undteilen(self):
        # 0-based PC-1 table (row 4 is 18,10,2,... and row 8 is 20,12,4,...)
        pc1 = [56,48,40,32,24,16,8,
               0,57,49,41,33,25,17,
               9,1,58,50,42,34,26,
               18,10,2,59,51,43,35,
               62,54,46,38,30,22,14,
               6,61,53,45,37,29,21,
               13,5,60,52,44,36,28,
               20,12,4,27,19,11,3]
        #Subkey = ""
        for j in pc1:
            self.Subkey = self.Subkey+self.Schluessel[j]
        self.Teil1=self.Subkey[:int(len(self.Subkey)/2)]
        self.Teil2=self.Subkey[int(len(self.Subkey)/2):]
        print("Schluessel64 :",self.Subkey)
        print("SchluesselTeil1 :",self.Teil1)
        print("SchluesselTeil2 :",self.Teil2)
    def rotationundpc2(self):
        pc2 = [
            13, 16, 10, 23, 0, 4,
            2, 27, 14, 5, 20, 9,
            22, 18, 11, 3, 25, 7,
            15, 6, 26, 19, 12, 1,
            40, 51, 30, 36, 46, 54,
            29, 39, 50, 44, 32, 47,
            43, 48, 38, 55, 33, 52,
            45, 41, 49, 35, 28, 31]
        rotation = [
            1, 1, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 1]
        k=0
        # key halves as lists
        self.Teil1=list(self.Teil1)
        self.Teil2=list(self.Teil2)
        while k<16:
            # rotation
            u=0
            while u < rotation[k]:
                self.Teil1.append(self.Teil1[0])
                del self.Teil1[0]
                self.Teil2.append(self.Teil2[0])
                del self.Teil2[0]
                self.Teil1="".join(self.Teil1)
                self.Teil2="".join(self.Teil2)
                self.subschluessel=self.Teil1+self.Teil2
                print("Teil1: ",self.Teil1)
                print("Teil2: ",self.Teil2)
                print ("Subschluessel: ",self.subschluessel)
                self.Teil1=list(self.Teil1)
                self.Teil2=list(self.Teil2)
                u+=1
            # PC-2 and creation of the 16 subkeys
            for index2 in pc2:
                self.subkeyliste.append(self.subschluessel[index2])
            k+=1
        while len(self.subkeyliste)>0:
            self.liste1.append("".join(self.subkeyliste[0:48]))
            del self.subkeyliste[0:48]
        print ("Subkeyliste: ",self.liste1)
    def sechsbitunterteilung(self):
        for l in range(0,16):
            self.liste2.append(findall("......",listenachxor[l]))
        print ("liste2: ",self.liste2)
# create an object of the DES class
listenachxor=[]
Krypto=DES()
# generate the key
Krypto.SchluesselGenerieren()
Krypto.pc1undteilen()
Krypto.rotationundpc2()
Krypto.EingabeNachricht()
Krypto.Bitumwandlung()
Krypto.ippermutation()
Krypto.Teilen()
Krypto.Expandieren()
for p in range(0,16):
    listenachxor.append(bin(Krypto.XOR(int(Krypto.liste1[p],2),int(Krypto.Bits48,2))))
    listenachxor[p]=listenachxor[p].replace("b","")
print ("listenachxor: ",listenachxor)
Krypto.sechsbitunterteilung()
Krypto.SBox()
By the way, the problem is in this part of the program; the rest works fine:
for z in range(len(self.sbox)):
    for y in range(len(self.sbox[z])):
        for x in range(len(self.sbox[z][y])):
            ausserebits=self.liste2[x][z][0]+self.liste2[x][z][-1]
            innerebits=self.liste2[x][z][1:5]
            print ("innere: ",innerebits)
            print ("aussere: ",ausserebits)
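For reference, a DES S-box lookup works on one 6-bit block at a time: the first and last bits select the row, the middle four bits select the column, and the table entry is written back as 4 bits. Below is a sketch for a single block using S1 from the listing above (sbox_lookup and to_bits are hypothetical helper names, and the input value is made up). It also pads the value to 48 bits with format, because bin() drops leading zeros and the 6-bit split can then come up short or misaligned.

```python
S1 = [
    [14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7],
    [0, 15, 7, 4, 14, 2, 13, 1, 10, 6, 12, 11, 9, 5, 3, 8],
    [4, 1, 14, 8, 13, 6, 2, 11, 15, 12, 9, 7, 3, 10, 5, 0],
    [15, 12, 8, 2, 4, 9, 1, 7, 5, 11, 3, 14, 10, 0, 6, 13],
]


def to_bits(value, width=48):
    """Fixed-width binary string; unlike bin(), keeps leading zeros."""
    return format(value, f"0{width}b")


def sbox_lookup(six_bits, box):
    """Map a 6-character bit string to a 4-character bit string."""
    row = int(six_bits[0] + six_bits[-1], 2)  # outer bits -> row 0..3
    column = int(six_bits[1:5], 2)            # inner bits -> column 0..15
    return format(box[row][column], "04b")


xored = to_bits(0b011011 << 42)  # example value whose first block is 011011
blocks = [xored[i:i + 6] for i in range(0, 48, 6)]

print(blocks[0], "->", sbox_lookup(blocks[0], S1))
```

In the full cipher, block i of a round goes through S-box i, so the loop needs one index for the round and one for the block position rather than a walk over every row and column of the tables.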

How to get non repeating random integers

I am trying to assign the numbers 0 to 25 to the 26 items of a list, with no number repeated. I assume you would use an if/else statement, but this is what I have so far:
from random import randrange
def f():
    a=[0]*26
    for x in a:
        b=randrange(0,26)
        a[b]=randrange(0,26)
    return(a)
print(f())
Make a list of numbers 0..25 and shuffle it:
>>> import random
>>> a = list(range(26))
>>> a
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25]
>>> random.shuffle(a)
>>> a
[11, 3, 17, 0, 20, 13, 24, 21, 4, 12, 14, 1, 22, 18, 5, 8, 6, 10, 9, 25, 23, 19, 16, 7, 2, 15]
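If a new list is preferred over shuffling in place, random.sample does the same job in one call. A short sketch (the function name f follows the question):

```python
import random


def f():
    """Return the numbers 0..25 in random order, each exactly once."""
    return random.sample(range(26), 26)


numbers = f()
print(numbers)
```

Because the sample size equals the population size, every number appears exactly once and no duplicate check is needed.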
