Hill cipher implementation when results are not expected - security

I am working on some ciphers (just theory, no coding yet). Currently I am doing the Hill cipher and I can use it fine. However, I have come across a problem which has stumped me. Say, for example, I am encrypting the letters A and I. A would be 0 and I would be 8. Now take my encryption box to be:
K= 18 2
23 0
This is all well and good. I can encrypt as such:
A = 18*0 = 0
2 *8 = 16
The problem is that adding these results produces 16. Is 16 % 26 just 16? Is this the number that I use for my encryption? A similar problem occurs if I have an encryption where the result is 260 % 26. Does this become 10 or 0? When you divide 260 by 26 you get 10. To finish the modulo operation I would take away the whole-number part and multiply the remaining fraction by 26. Of course, if I do that in this case I get 0, which cannot be multiplied. Any suggestions?

Yes. 16 % 26 = 16 and 260 % 26 = 0. The modulo operation keeps only the remainder of the division: 260 = 10 * 26 + 0, so the result is 0, not 10.
The real problem is that your encryption matrix cannot be used as a Hill cipher encryption/decryption key.
The reason is that the encryption matrix must have an inverse matrix (modulo 26). In other words, the determinant of the matrix must be nonzero and must not be divisible by 2 or 13, i.e. it must be coprime to 26. In fact,
the determinant of your matrix is 18*0 - 2*23 = -46, which is 6 mod 26. That is divisible by 2, so it cannot satisfy this requirement of the Hill cipher, and decryption will fail.
So try to generate another encryption matrix which has the required property. For example,
3 5
1 2 can be used as an encryption matrix.
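The invertibility check described above can be sketched in a few lines of Python. This is only an illustration for 2x2 keys; the function names (`det_mod26`, `is_valid_hill_key`, `encrypt_pair`) are made up for this example, and the key is assumed to multiply a column vector of two letter values.

```python
from math import gcd

def det_mod26(key):
    """Determinant of a 2x2 key, reduced modulo 26."""
    (a, b), (c, d) = key
    return (a * d - b * c) % 26

def is_valid_hill_key(key):
    """A key is usable only if its determinant is coprime to 26."""
    return gcd(det_mod26(key), 26) == 1

def encrypt_pair(key, pair):
    """Multiply the key by a column vector of two letter values, mod 26."""
    (a, b), (c, d) = key
    x, y = pair
    return ((a * x + b * y) % 26, (c * x + d * y) % 26)

bad_key = ((18, 2), (23, 0))
good_key = ((3, 5), (1, 2))

print(det_mod26(bad_key), is_valid_hill_key(bad_key))    # 6 False
print(det_mod26(good_key), is_valid_hill_key(good_key))  # 1 True
print(encrypt_pair(good_key, (0, 8)))                    # A, I -> (14, 16)
```

Note that Python's `%` already returns the remainder, so `16 % 26` is 16 and `260 % 26` is 0.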

Related

How to compress an integer to a smaller string of text?

Given a random integer, for example, 19357982357627685397198. How can I compress these numbers into a string of text that has fewer characters?
The string of text must only contain numbers or alphabetical characters, both uppercase and lowercase.
I've tried Base64 and Huffman-coding that claim to compress, but none of them makes the string shorter when writing on a keyboard.
I also tried to make some kind of algorithm that tries to divide the integer by the numbers "2,3,...,10" and check if the last number in the result is the number it was divided by (looks for 0 in case of division by 10). So, when decrypting, you would just multiply the number by the last number in the integer. But that does not work because in some cases you can't divide by anything and the number would stay the same, and when it would be decrypted, it would just multiply it into a larger number than you started with.
I also tried to divide the integer into blocks of 2 numbers starting from left and giving a letter to them (a=1, b=2, o=15), and when it would get to z it would just roll back to a. This did not work because when it was decrypted, it would not know how many times the number rolled over z and therefore be a much smaller number than in the start.
I also tried some other common encryption strategies. For example Base32, Ascii85, Bifid Cipher, Baudot Code, and some others I can not remember.
It seems like an unsolvable problem. But because it starts with an integer, each digit can take only 10 different values, while each letter of the alphabet can take 26 different values. This means that you can store more data in 5 alphabetical letters than in a 5-digit integer. So mathematically it is possible to store more data in a string of characters than in an integer with the same number of digits, but I just can't find anyone who has ever done it.
You switch from base 10 to, e.g., base 62 by repeatedly dividing by 62 and recording the remainder from each step, like this:
Converting 6846532136 to base62:
Operation          Result      Remainder
6846532136 / 62    110427937   42
110427937 / 62     1781095     47
1781095 / 62       28727       21
28727 / 62         463         21
463 / 62           7           29
7 / 62             0           7
Then you use each remainder as an index into a base62 alphabet of your choice, e.g.:
0         1         2         3         4         5         6
01234567890123456789012345678901234567890123456789012345678901
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789
Reading the remainders from last to first gives: H (7) d (29) V (21) V (21) v (47) q (42) = HdVVvq
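The repeated-division procedure above could be written in Python roughly as follows. This is a minimal sketch that assumes non-negative integers and the alphabet shown above; `encode_base62` and `decode_base62` are names chosen for this example.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"

def encode_base62(n):
    """Convert a non-negative integer to a base62 string."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, remainder = divmod(n, 62)       # one row of the table above
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))       # last remainder comes first

def decode_base62(text):
    """Convert a base62 string back to the integer it encodes."""
    n = 0
    for ch in text:
        n = n * 62 + ALPHABET.index(ch)
    return n

print(encode_base62(6846532136))  # HdVVvq
```

Because 62 symbols carry more information per character than 10 digits, the encoded string is shorter than the decimal form for any sufficiently large number, and `decode_base62` recovers the original value exactly.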
------
It's called base10 to base62 conversion; there are a bunch of solutions and code samples on the internet.
Here is my favorite version: Base 62 conversion

Get Poisson expectation of preceding values of a time series in Python

I have some time series data (in a Pandas dataframe), d(t):
time 1 2 3 4 ... 99 100
d(t) 5 3 17 6 ... 23 78
I would like to get a time-shifted version of the data, e.g. d(t-1):
time 1 2 3 4 ... 99 100
d(t) 5 3 17 6 ... 23 78
d(t-1) NaN 5 3 17 6 ... 23
But with a complication. Instead of simply time-shifting the data, I need to take the expected value based on a Poisson-distributed shift. So instead of d(t-i), I need E(d(t-j)), where j ~ Poisson(i).
Is there an efficient way to do this in Python?
Ideally, I would be able to dynamically generate the result with i as a parameter (that I can use in an optimization).
numpy's Poisson functions seem to be about generating draws from a Poisson rather than giving a PMF that could be used to calculate expected value. If I could generate a PMF, I could do something like:
for idx in len(d(t)):
Ed(t-i) = np.multiply(d(t)[:idx:-1], PMF(Poisson, i)).sum()
But I have no idea what actual functions to use for this, or if there is an easier way than iterating over indices. This approach also won't easily let me optimize over i.
You can use scipy.stats.poisson to get the PMF.
Here's a sample:
from scipy.stats import poisson
mu = 10
# Declare 'rv' to be a poisson random variable with λ=mu
rv = poisson(mu)
# poisson.pmf(k) = (e⁻ᵐᵘ * muᵏ) / k!
print(rv.pmf(4))
For more information about scipy.stats.poisson check this doc.
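Once a PMF is available, the expectation asked about in the question is a weighted sum of earlier values. The sketch below shows one possible way to do it, under stated assumptions: the weights are computed with the standard library (they are the same values `scipy.stats.poisson.pmf(j, mu)` returns), lags that would reach before the start of the series are dropped, and the remaining weights are renormalised. The names `poisson_weights` and `expected_lagged` are hypothetical.

```python
import math

def poisson_weights(mu, n):
    """Return [P(J=0), ..., P(J=n-1)] for J ~ Poisson(mu)."""
    weights = []
    w = math.exp(-mu)              # P(J = 0)
    for j in range(n):
        if j > 0:
            w *= mu / j            # P(J = j) = P(J = j-1) * mu / j
        weights.append(w)
    return weights

def expected_lagged(d, mu):
    """For each t, the weighted average of d[t-j] with Poisson(mu) weights.

    Assumption: only lags 0 <= j <= t exist, so the weights are
    truncated there and renormalised to sum to 1.
    """
    result = []
    for t in range(len(d)):
        weights = poisson_weights(mu, t + 1)
        total = sum(weights)
        value = sum(w * d[t - j] for j, w in enumerate(weights))
        result.append(value / total)
    return result

d = [5.0, 3.0, 17.0, 6.0]
print(expected_lagged(d, 1.0))
```

Since `mu` is an ordinary float argument, a function like this can be called repeatedly from an optimiser. For long series the double loop is quadratic in the length; a vectorised convolution would be the usual next step.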

Analyse runtime of my algorithm

I am working on creating some algorithms for a course, both of which are for the vertex cover problem.
For the first part I created an algorithm that does the work via brute force: it creates every possible combination of vertices, removes the sets that are not covers, then analyses the rest. I already have the size for this one.
The second part is the same brute force with an added heuristic, where I eliminate the lower portion of combos that are unlikely to make a cover based on the number of edges.
Since both of these do work proportional to the total number of base elements across all the combos, I need to understand the size of that list.
The graphs are randomly generated with integers for vertices and edges created randomly from pairs of vertices.
import itertools
from math import ceil

combos = []
vertices = [1, 2, 3,...]
edges = [(1, 2), (2, 3),...]
E = len(edges)
V = len(vertices)
Brute force
for x in range(1, V+1):
    # every subset of exactly x vertices
    for subset in itertools.combinations(vertices, x):
        combos.append(subset)
# count the vertices across all generated subsets
total = 0
for i in combos:
    for j in i:
        total += 1
The sum of brute force is:
Heuristic:
for x in range(ceil((V**2)/E), V+1):
    # skip subset sizes below the heuristic lower bound
    for subset in itertools.combinations(vertices, x):
        combos.append(subset)
total = 0
for i in combos:
    for j in i:
        total += 1
The sum as I thought it would end up being:
However, my test runs are not matching up for the heuristic; brute force is matching up.
Sample runs:
V    E    Brute   Heuristic
5    10   80      25
6    11   192     36
7    17   448     294
8    23   1024    792
9    25   2304    1467
10   36   5120    4660
OK, so I was doing my math formula wrong. The heuristic is not the sum over all V minus the sum of the lower end. It is the sum from the lower end to all V:
I do not know exactly why this works and my original does not, since logically the other one is just an expanded version of this one.
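One way to sanity-check counts like these without building the list is to compute them directly: there are C(V, k) subsets of size k, and each contributes k elements. The sketch below is an illustration only; `subset_size_sum` and its `lower` parameter are names invented here, and the lower bound is left as an argument rather than tied to any particular heuristic formula.

```python
from math import comb

def subset_size_sum(V, lower=1):
    """Total number of elements across all subsets of size lower..V."""
    return sum(k * comb(V, k) for k in range(lower, V + 1))

# With lower=1 this is the brute-force count, which has the closed
# form V * 2**(V - 1).
for V in (5, 6, 7, 8, 9, 10):
    print(V, subset_size_sum(V), V * 2 ** (V - 1))
```

The brute-force column in the sample runs agrees with this closed form (for example 80 for V = 5). Passing the heuristic's starting subset size as `lower` gives the "sum from the lower end to all V" described above.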

Is there a J idiom for adding to a list until a certain condition is met?

Imagine you're generating the Fibonacci numbers using the obvious, brute-force algorithm. If I know the number of Fibonaccis I want to generate in advance, I can do something like this using the power conjunction ^::
(, [: +/ _2&{.)^:20 i.2
How can I instead stop when the Fibonaccis reach some limit, say 1e6? (I know how to do this inside a function using while., but that's no fun.)
I want to stress that this is a general question about J, not a specific question about Fibonacci. Don't get too distracted by Fibonacci numbers. The heart of the question is how to keep appending to a list until some condition is met.
Power also has a verb form u^:v^:n, where the second verb can be used as a check. E.g., double (+:) while (n is _) less than 100 (100&>):
+:^:(100&>)^:_ ] 1
128
+:^:(100&>)^:_ ] 3
192
As usual, to append to the result of power, you box the noun:
+:^:(100&>)^:(<_) ] 3
3 6 12 24 48 96 192
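For readers less familiar with J, the control flow of the boxed form can be sketched in Python. This is only an analogy, not a translation of J semantics: `iterate_while` is a made-up helper, and unlike ^:_ it does not stop on a fixed point, only when the check fails.

```python
def iterate_while(step, keep_going, start):
    """Apply step repeatedly while keep_going holds, keeping every value."""
    results = [start]
    while keep_going(results[-1]):
        results.append(step(results[-1]))
    return results

# Double while the value is below 100, collecting intermediate results.
print(iterate_while(lambda n: 2 * n, lambda n: n < 100, 3))
# [3, 6, 12, 24, 48, 96, 192]
```

As in the J session, the final value (192) is the first one that fails the check, so it is still included in the output.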
I think the best answer to this is in Henry Rich's book J for C Programmers. Specifically, it uses the Power Conjunction ^: . You can also use it to iterate until there is no change, so that the limit would not need to be defined. Henry uses this example:
2 *^:(100&>@:])^:_"0 (1 3 5 7 9 11)
128 192 160 112 144 176
The ^:_ Power Conjunction repeats until there is no change, and the ^:(100&>@:]) tests whether the result is less than 100. If it is, then ^: is applied with 1 and the loop body 2* is done again; if it is not less than 100, then ^: is applied with 0, which does nothing, so nothing changes and the loop exits. Because it uses "0 as the rank, the doubling function 2* is applied to each of 1 3 5 7 9 11 individually.
Henry really does explain the process better than I do, so here is the reference for further reading.
http://www.jsoftware.com/help/jforc/loopless_code_iv_irregular_o.htm#_Toc191734389

Dynamic Programming : Why the 1?

The following pseudocode finds the smallest number of coins needed to sum up to S using DP. Vj is the value of coin j, and Min represents m as described in the following line.
For each coin j with Vj≤i, look at the minimum number of coins found for the sum i-Vj (we have already found it previously). Let this number be m. If m+1 is less than the minimum number of coins already found for the current sum i, then we write the new result for it.
1  Set Min[i] equal to Infinity for all of i
2  Min[0]=0
3
4  For i = 1 to S
5      For j = 0 to N - 1
6          If (Vj<=i AND Min[i-Vj]+1<Min[i])
7              Then Min[i]=Min[i-Vj]+1
8
9  Output Min[S]
Can someone explain the significance of the "+1 " in line 6? Thanks
The +1 is because you need one extra coin. So for example, if you have:
Vj = 5
Min[17] = 4
And you want to know the number of coins it will take to get 22, then the answer isn't 4, but 5. It takes 4 coins to get to 17 (according to the previously calculated result Min[17]=4), and an additional one coin (of value Vj = 5) to get to 22.
EDIT
As requested, an overview explanation of the algorithm.
To start, imagine that somebody told you that you had access to coins of value 5, 7 and 23, and needed to find the size of the smallest combination of coins which added to 1000. You could probably work out an approach to doing this, but it's certainly not trivial.
So now let's say in addition to the above, you're also given a list of all the values below 1000, and the smallest number of coins it takes to get those values. What would your approach be now?
Well, you only have coins of value 5, 7, and 23. So go back one step- the only options you have are a combination which adds to 995 + an extra 5-value coin, a combination which adds to 993 + an extra 7-value, or a combination up to 977 + an extra 23-value.
So let's say the list has this:
...
977: 53 coins
...
993: 50 coins
...
995: 54 coins
(Those examples were off the top of my head, I'm sure they're not right, and probably don't make sense, but assume they're correct for now).
So from there, you can see pretty easily that the lowest number of coins it will take to get 1000 is 51 coins, which you do by taking the same combination as the one in the list which got 993, then adding a single extra 7-coin.
This is, more or less, what your algorithm does, except that instead of aiming just to calculate the number for 1000, its aim would be to calculate every number up to 1000. And instead of being passed the list for lower numbers from somewhere external, it would keep track of the values it had already calculated.
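The pseudocode from the question can be turned into runnable Python along these lines. This is a minimal sketch: the function name `min_coins` and the list name `best` (standing in for Min) are choices made for this example, and unreachable sums are reported as infinity.

```python
import math

def min_coins(target, coin_values):
    """Smallest number of coins summing to target, or math.inf if impossible."""
    best = [math.inf] * (target + 1)   # best[i] plays the role of Min[i]
    best[0] = 0                        # zero coins are needed for sum 0
    for i in range(1, target + 1):
        for v in coin_values:
            # best[i - v] coins reach i - v; one extra coin of value v
            # reaches i, which is where the "+1" comes from.
            if v <= i and best[i - v] + 1 < best[i]:
                best[i] = best[i - v] + 1
    return best[target]

print(min_coins(22, [5, 17]))   # 2: one 17-coin plus one 5-coin
print(min_coins(3, [5, 7]))     # inf: 3 cannot be formed
```

Each entry of `best` is filled in from smaller entries that are already final, which is the "list of all the values below 1000" from the explanation above, built up as the loop goes.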

Resources