I want to do an operation similar to matrix multiplication, except instead of multiplying I want to check equality. The effect that I want to achieve is similar to the following:
a = torch.Tensor([[1, 2, 3], [4, 5, 6]]).to(torch.uint8)
b = torch.Tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]]).to(torch.uint8)
result = [[sum(a[i] == b[j]) for j in range(len(b))] for i in range(len(a))]
Is there a way to use einsum, or any other function in PyTorch, to achieve the above efficiently?
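You can use Tensor.repeat and torch.repeat_interleave:
a = torch.Tensor([[1, 2, 3], [4, 5, 6]])
b = torch.Tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
# Repeat each row of a once per row of b, and tile b once per row of a
mask = a.repeat_interleave(b.shape[0], dim=0) == b.repeat((a.shape[0], 1))
# One count per (row of a, row of b) pair
torch.sum(mask, dim=1).reshape(a.shape[0], b.shape[0])
# output
tensor([[3, 0, 0],
        [0, 3, 0]])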
You can use broadcasting to do the same, for instance:
result = (a[:, None, :] == b[None, :, :]).sum(dim=2)
Here None just introduces a dummy dimension. Alternatively, you can use the less visual .unsqueeze() instead.
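As a quick sanity check, the broadcast expression can be compared against the explicit double loop from the question. This is only a minimal sketch; the names loop_result, broadcast_result and unsqueeze_result are illustrative and not part of the original answers.

```python
import torch

a = torch.tensor([[1, 2, 3], [4, 5, 6]], dtype=torch.uint8)
b = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=torch.uint8)

# Reference: count matching positions for every (row of a, row of b) pair.
loop_result = torch.tensor(
    [[int((a[i] == b[j]).sum()) for j in range(len(b))] for i in range(len(a))]
)

# Broadcasting: (2, 1, 3) == (1, 3, 3) -> (2, 3, 3), then reduce the last axis.
broadcast_result = (a[:, None, :] == b[None, :, :]).sum(dim=2)

# The same expression written with unsqueeze instead of None indexing.
unsqueeze_result = (a.unsqueeze(1) == b.unsqueeze(0)).sum(dim=2)

print(broadcast_result)
```

Note that the intermediate boolean tensor has shape (rows of a, rows of b, columns), so memory use grows with the product of all three sizes.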
Matrix multiplication is ij,jk->ik in einsum notation. All of the following operations are equivalent, with varying levels of verbosity:
a @ b
torch.einsum("ij,jk", a, b)
torch.einsum("ij,jk->ik", a, b)
(a[:, :, None] * b[None, :, :]).sum(1)
The last form reads as "multiply along the i and k dimensions and reduce the j dimension":
               (i, j, k)
a: (2, 3)  =>  (2, 3, None)
b: (3, 3)  =>  (None, 3, 3)
It should now be clear from this function decomposition that multiplication can be replaced with any binary operation, e.g. the equality operation.
Unfortunately, there is no generalized form of einsum (AFAIK) in pytorch that swaps the multiplication "out-of-the-box". There is however the einops library which is basically a wrapper around deep learning frameworks such as PyTorch.
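In the absence of a built-in, the decomposition above can be wrapped in a small helper. The following is a sketch under stated assumptions: pairwise_reduce is a hypothetical name, not a PyTorch function, and it compares rows of a with rows of b (the pattern "ij,kj->ik"), which is what the question asks for.

```python
import torch

def pairwise_reduce(a, b, op):
    """Apply `op` to every (row of a, row of b) pair and sum over the shared axis.

    This follows the pattern "ij,kj->ik" with multiplication replaced by `op`.
    """
    return op(a[:, None, :], b[None, :, :]).sum(dim=-1)

a = torch.tensor([[1, 2, 3], [4, 5, 6]])
b = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

eq_counts = pairwise_reduce(a, b, torch.eq)  # number of equal entries per row pair
products = pairwise_reduce(a, b, torch.mul)  # same values as a @ b.T

print(eq_counts)
```

Passing torch.mul recovers the ordinary product with the transpose of b, which is a convenient way to check that the helper does what it claims.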
Given a 1D vector x of size n, how can we construct an n-by-n matrix X consisting of all the rolled versions of x in PyTorch?
For example
x = torch.tensor([1,2,3,4])
The expected output is
tensor([[1, 2, 3, 4],
        [2, 3, 4, 1],
        [3, 4, 1, 2],
        [4, 1, 2, 3]])
Is there any better way than this?
N = x.shape[0]
A = torch.zeros(N, N, dtype=x.dtype)  # match x's dtype so the result stays integer
for i in range(N):
    A[i] = torch.roll(x, -i)
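One loop-free possibility is to build an index matrix and gather from x in a single step. This is a minimal sketch, assuming x is one-dimensional; idx and X are illustrative names.

```python
import torch

x = torch.tensor([1, 2, 3, 4])
n = x.shape[0]

# idx[i, j] = (i + j) % n, so row i is x rolled left by i positions.
idx = (torch.arange(n)[:, None] + torch.arange(n)[None, :]) % n
X = x[idx]

print(X)
```

Row i of X matches torch.roll(x, -i), and the indexing keeps the dtype of x.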
I'm working with a linear transformation of the form Y=Q(X+A), where X is the input tensor, Y is the output, and Q and A are two tensors to be learned. Q is an arbitrary tensor, so I can use nn.Linear for it. But A is a (differentiable) tensor that has a specific pattern. As a short example,
A = [[a0,a1,a2,a2,a2],
[a1,a0,a1,a2,a2],
[a2,a1,a0,a1,a2],
[a2,a2,a1,a0,a1],
[a2,a2,a2,a1,a0]].
So I cannot define such a pattern in nn.Linear. Is there any way to define such a tensor in PyTorch?
This looks like a Toeplitz matrix. A possible implementation in PyTorch is:
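def toeplitz(c, r):
    # c is the first column, r is the first row (c[0] is ignored in favour of r[0])
    vals = torch.cat((r, c[1:].flip(0)))
    shape = len(c), len(r)
    # row and column index of every entry
    i, j = torch.ones(*shape).nonzero().T
    # j - i >= 0 reads from r; negative offsets wrap around into the flipped c
    return vals[j-i].reshape(*shape)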
In your case with a0 as 0, a1 as 1 and a2 as 2:
>>> toeplitz(torch.tensor([0,1,2,2,2]), torch.tensor([0,1,2,2,2]))
tensor([[0, 1, 2, 2, 2],
        [1, 0, 1, 2, 2],
        [2, 1, 0, 1, 2],
        [2, 2, 1, 0, 1],
        [2, 2, 2, 1, 0]])
For a more detailed explanation refer to my other answer here.
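Since the question asks for a learnable A, one way to keep the pattern differentiable is to index a small parameter vector with a fixed integer matrix. The sketch below assumes three parameters a0, a1, a2 and a 5x5 matrix as in the question; params and idx are illustrative names, and the starting values are arbitrary.

```python
import torch

n = 5
params = torch.nn.Parameter(torch.tensor([10.0, 20.0, 30.0]))  # a0, a1, a2

# Distance from the diagonal, capped so everything further out reuses a2.
i = torch.arange(n)
idx = (i[:, None] - i[None, :]).abs().clamp(max=len(params) - 1)

A = params[idx]      # (5, 5), rebuilt from params on every forward pass
A.sum().backward()   # gradients flow back to the three parameters

print(params.grad)   # how many times each parameter appears in A
```

Because A is produced by indexing, an optimizer that updates params changes every tied entry of A at once.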
Here are two related SO questions 1 2 that helped me formulate my preliminary solution.
The reason for wanting to do this is to feed permutations by edit distance into a Damerau-Levenshtein NFA; the number of permutations grows fast, so it's a good idea to delay (N-C) cycle N permutations candidates until (N-C) iterations of the NFA.
I've only studied engineering math up to Differential Equations and Discrete Mathematics, so I lack the foundation to approach this task from a formal perspective. If anyone can provide reference materials to help me understand this problem properly, I would appreciate that!
Through brief empirical analysis, I've noticed that I can generate the swaps for all C cycle N permutations with this procedure:
1. Generate all 2-combinations of N elements (combs)
2. Subdivide combs into arrays where the smallest element of each 2-combination is the same (ncombs)
3. Generate the cartesian products of the (N-C)-combinations of ncombs (pcombs)
4. Sum pcombs to get a list of the swaps that will generate all C cycle N permutations (swaps)
The code is here.
My Python is a bit rusty, so helpful advice about the code is appreciated (I have the feeling that lines 17, 20, and 21 should be combined. I'm not sure if I should be making lists of the results of itertools.(combinations|product). I don't know why line 10 can't be ncombs += ... instead of ncombs.append(...)).
My primary question is how to solve this question properly. I did the rounds on my own due diligence by finding a solution, but I am sure there's a better way. I've also only verified my solution for N=3 and N=4, is it really correct?
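One way to gain confidence beyond N=3 and N=4 is to compare the procedure against brute force for small N. The sketch below is an empirical check, not a proof, and it is not the linked code: it assumes that step 3 means "choose N-C of the groups and take one swap from each", and that the swaps are applied in order to the identity arrangement. All function names are hypothetical.

```python
from itertools import combinations, permutations, product

def cycle_count(perm):
    """Number of cycles (fixed points included) of a permutation given as a tuple."""
    seen, count = set(), 0
    for start in range(len(perm)):
        if start not in seen:
            count += 1
            while start not in seen:
                seen.add(start)
                start = perm[start]
    return count

def perms_from_swaps(n, c):
    """Pick n - c groups, take one swap from each, and apply the swaps in order."""
    combs = list(combinations(range(n), 2))
    # Group the swaps by their smallest element.
    ncombs = [[s for s in combs if s[0] == low] for low in range(n - 1)]
    out = []
    for groups in combinations(ncombs, n - c):
        for swaps in product(*groups):
            arr = list(range(n))
            for i, j in swaps:
                arr[i], arr[j] = arr[j], arr[i]
            out.append(tuple(arr))
    return out

def brute_force(n, c):
    """All permutations of range(n) with exactly c cycles, by exhaustive search."""
    return {p for p in permutations(range(n)) if cycle_count(p) == c}

# The procedure should give each permutation with c cycles exactly once.
for n in range(1, 7):
    for c in range(1, n + 1):
        got = perms_from_swaps(n, c)
        assert len(got) == len(set(got)) and set(got) == brute_force(n, c)
print("procedure matches brute force for n = 1..6")
```

Under this reading, the check passes for every N up to 6, with no duplicates and no missing permutations.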
The ideal solution would be functionally identical to Heap's algorithm, except it would generate the permutations in decreasing cycle order (that is, by increasing minimum number of swaps needed to generate the permutation).
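This is far from Heap's efficiency, but it does produce only the necessary cycle combinations restricted by the desired number of cycles, k, in the permutation. We use the partitions of n into k parts to create all combinations of cycles for each partition. Enumerating the actual permutations is then a cartesian product of applying each cycle up to m-1 times, where m is the cycle length. One caveat: the powers of a single fixed cycle cover every cycle on its elements only when m is at most 3. A set of m elements has (m-1)! distinct cycles, and some powers of a longer cycle split into shorter cycles, so the code below is exact only when every cycle has length 3 or less (as in the n=4, two-cycle example shown).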
Recursive Python 3 code:
from math import ceil

# Partitions of N into exactly K parts, largest part first
def partitions(N, K, high=float('inf')):
    if K == 1:
        return [[N]]
    result = []
    low = ceil(N / K)
    high = min(high, N-K+1)
    for k in range(high, low - 1, -1):
        for sfx in partitions(N-k, K - 1, k):
            result.append([k] + sfx)
    return result

print("partitions(10, 3):\n%s\n" % partitions(10, 3))

# Ways to split ns into subsets whose sizes are given by subs
def combs(ns, subs):
    def g(i, _subs):
        if i == len(ns):
            return [tuple(tuple(x) for x in _subs)]
        res = []
        cardinalities = set()
        def h(j):
            temp = [x[:] for x in _subs]
            temp[j].append(ns[i])
            res.extend(g(i + 1, temp))
        for j in range(len(subs)):
            # Open only one empty subset per size to avoid duplicate splits
            if not _subs[j] and not subs[j] in cardinalities:
                h(j)
                cardinalities.add(subs[j])
            elif _subs[j] and len(_subs[j]) < subs[j]:
                h(j)
        return res
    _subs = [[] for x in subs]
    return g(0, _subs)

A = [1,2,3,4]
ns = [2, 2]
print("combs(%s, %s):\n%s\n" % (A, ns, combs(A, ns)))

A = [0,1,2,3,4,5,6,7,8,9,10,11]
ns = [3, 3, 3, 3]
print("num combs(%s, %s):\n%s\n" % (A, ns, len(combs(A, ns))))

# Rotate the values of A one step along cycle, in place
def apply_cycle(A, cycle):
    n = len(cycle)
    last = A[ cycle[n-1] ]
    for i in range(n-1, 0, -1):
        A[ cycle[i] ] = A[ cycle[i-1] ]
    A[ cycle[0] ] = last

def permutations_by_cycle_count(n, num_cycles):
    arr = [x for x in range(n)]
    cycle_combs = []
    for partition in partitions(n, num_cycles):
        cycle_combs.extend(combs(arr, partition))
    result = {}
    def f(A, cycle_comb, i):
        if i == len(cycle_comb):
            result[cycle_comb].append(A)
            return
        if len(cycle_comb[i]) == 1:
            f(A[:], cycle_comb, i+1)
        for k in range(1, len(cycle_comb[i])):
            apply_cycle(A, cycle_comb[i])
            f(A[:], cycle_comb, i+1)
        # One more application completes the cycle and restores A
        apply_cycle(A, cycle_comb[i])
    for cycle_comb in cycle_combs:
        result[cycle_comb] = []
        f(arr, cycle_comb, 0)
    return result

result = permutations_by_cycle_count(4, 2)
print("permutations_by_cycle_count(4, 2):\n")
for e in result:
    print("%s: %s\n" % (e, result[e]))
Output:
partitions(10, 3):
[[8, 1, 1], [7, 2, 1], [6, 3, 1], [6, 2, 2], [5, 4, 1], [5, 3, 2], [4, 4, 2], [4, 3, 3]]
# These are the cycle combinations
combs([1, 2, 3, 4], [2, 2]):
[((1, 2), (3, 4)), ((1, 3), (2, 4)), ((1, 4), (2, 3))]
num combs([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], [3, 3, 3, 3]):
15400
permutations_by_cycle_count(4, 2):
((0, 1, 2), (3,)): [[2, 0, 1, 3], [1, 2, 0, 3]]
((0, 1, 3), (2,)): [[3, 0, 2, 1], [1, 3, 2, 0]]
((0, 2, 3), (1,)): [[3, 1, 0, 2], [2, 1, 3, 0]]
((1, 2, 3), (0,)): [[0, 3, 1, 2], [0, 2, 3, 1]]
((0, 1), (2, 3)): [[1, 0, 3, 2]]
((0, 2), (1, 3)): [[2, 3, 0, 1]]
((0, 3), (1, 2)): [[3, 2, 1, 0]]
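A point worth checking for longer cycles: repeatedly applying one fixed cycle yields only its powers, and for cycle lengths of 4 or more those powers miss some cycles on the same elements (and some powers are not single cycles at all). A minimal sketch of enumerating every cycle on a subset by fixing its first element and permuting the rest follows. The helper cycles_on is hypothetical, and apply_cycle here is a non-mutating variant of the function above.

```python
from itertools import permutations

def cycles_on(subset):
    """Yield every distinct cycle on `subset`, with the first element fixed in front."""
    first, rest = subset[0], tuple(subset[1:])
    for tail in permutations(rest):
        yield (first,) + tail

def apply_cycle(arr, cycle):
    """Return a copy of `arr` with its values moved one step along `cycle`."""
    out = list(arr)
    for src, dst in zip(cycle, cycle[1:] + cycle[:1]):
        out[dst] = arr[src]
    return out

identity = list(range(4))
four_cycles = [apply_cycle(identity, c) for c in cycles_on((0, 1, 2, 3))]
print(len(four_cycles))  # 6, i.e. (4 - 1)! distinct 4-cycles
```

Replacing the "apply the cycle m-1 times" loop with one application per cycle from cycles_on would be one way to extend the approach to longer cycles, at the cost of (m-1)! candidates per subset.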
I'm trying to make a matrix that is 3 rows by 4 columns and includes the numbers 1-12. I would then like to multiply those numbers by a factor to make a new matrix.
def matrix(x):
    matrix=[[1,2,3],[4,5,6],[7,8,9],[10,11,12]]
    new_matrix=[[x*1,x*2,x*3],[x*4,x*5,x*6],[x*7,x*8,x*9],[x*10,x*11,x*12]]
    print(new_matrix)
This approach works, but it does not use loops. I'm looking for an approach that uses loops, something like this:
def matrix(x):
    for i in range(3):
        matrix.append([])
        for j in range(4):
            matrix[i].append(0)
    return matrix
You do not need to use explicit loops for something like this (unless you really want to). List comprehensions are a more concise, and usually somewhat faster, way to generate lists, and their syntax is similar to that of a for loop.
Here is a comprehension for generating any MxN matrix containing the numbers 1 through M * N:
def matrix(m, n):
    return [[x+1 for x in range(row * n, (row + 1) * n)] for row in range(m)]
Here is a comprehension for multiplying the nested list returned by matrix by some factor:
def mult(mat, fact):
    return [[x * fact for x in row] for row in mat]
Here is the result for your specific 3x4 case:
>>> m = matrix(3, 4)
>>> print(m)
[[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
>>> m2 = mult(m, 2)
>>> print(m2)
[[2, 4, 6, 8], [10, 12, 14, 16], [18, 20, 22, 24]]
If you want the indices to be swapped as in your original example, just swap the inputs m and n:
>>> matrix(4, 3)
[[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
mult will work the same for any nested list you pass in.
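If explicit loops are what you want, the same result can be built with two nested for loops. This is a minimal sketch; matrix_with_loops and its factor parameter are illustrative names, not part of the code above.

```python
def matrix_with_loops(rows, cols, factor=1):
    """Build a rows x cols nested list holding factor * (1 .. rows*cols)."""
    result = []
    value = 1
    for _ in range(rows):
        row = []
        for _ in range(cols):
            row.append(value * factor)
            value += 1
        result.append(row)
    return result

print(matrix_with_loops(3, 4))
print(matrix_with_loops(3, 4, factor=2))
```

Building each row in a local list and appending it afterwards avoids indexing into a partially built matrix.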
I have one tensor in shape (2, G) and another in shape (N, 2).
I need to add them in such a way that the output is (N, 2, G), meaning that the first tensor is replicated to (N, 2, G) and then the second tensor is added to each matrix along the third dimension. (or vice versa: the second tensor is replicated to (N, 2, G) and the first one is added to every sub-tensor along the first dimension).
How can this be done efficiently in Theano?
Thanks.
In an attempt to understand the problem, the following example is assumed to be representative.
If
A = [[1, 2, 3],
[4, 5, 6]]
and
B = [[1, 2],
[3, 4],
[5, 6],
[7, 8]]
then the result should be
C = [[[ 2. 3. 4.]
[ 6. 7. 8.]]
[[ 4. 5. 6.]
[ 8. 9. 10.]]
[[ 6. 7. 8.]
[ 10. 11. 12.]]
[[ 8. 9. 10.]
[ 12. 13. 14.]]]
Here G=3 and N=4.
To achieve this in Theano, one need only add new broadcastable dimensions and rely on broadcasting to get the desired result.
import numpy
import theano
import theano.tensor as tt
x = tt.matrix()
y = tt.matrix()
z = x.dimshuffle('x', 0, 1) + y.dimshuffle(0, 1, 'x')
f = theano.function([x, y], outputs=z)
print(f(numpy.array([[1, 2, 3], [4, 5, 6]]), numpy.array([[1, 2], [3, 4], [5, 6], [7, 8]])))
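The same broadcasting idea can be checked in plain NumPy, where indexing with None plays the role of dimshuffle's 'x'. This is a sketch for comparison only, not Theano code; A, B and C follow the example above.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])            # shape (2, G) with G = 3
B = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])  # shape (N, 2) with N = 4

# (1, 2, G) + (N, 2, 1) broadcasts to (N, 2, G).
C = A[None, :, :] + B[:, :, None]
print(C.shape)  # (4, 2, 3)
```

Each C[n] is A with B[n, 0] added to its first row and B[n, 1] added to its second row, which matches the expected result listed above.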