Is there an efficient 'Numpy'-based solution to create a 3 (or higher) dimensional diagonal matrix?
More specifically, I am looking for a shorter (and perhaps more efficient) solution to replace the following:
import numpy as np
N = 100
M = 4
d = np.random.randn(N)  # calculated in the real use case from other parameters
A = np.zeros((M, M, N), dtype=d.dtype)  # the shape must be passed as a tuple
for i in range(M):
    A[i, i, :] = d
The above solution will be slow if M is large, and I think it is not very memory-efficient, as d is copied M times in memory.
Here's one with np.einsum diag-view -
np.einsum('iij->ij',A)[:] = d
Looking at the string notation, this also translates well from the iterative part : A[i, i, :] = d.
Generalize to ndarray with ellipsis -
np.einsum('ii...->i...',A)[:] = d
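To check the diag-view idea against the loop from the question, here is a small sketch. It relies on np.einsum returning a writeable view for a pure diagonal extraction (documented for NumPy 1.10 and later); the names A_loop, diag_view and B, and the (M, M, 2, N) shape, are only chosen for illustration.

```python
import numpy as np

M, N = 4, 100
d = np.random.randn(N)

# Reference: the explicit loop from the question
A_loop = np.zeros((M, M, N), dtype=d.dtype)
for i in range(M):
    A_loop[i, i, :] = d

# einsum returns a view of the diagonal, so assigning through it fills A in place
A = np.zeros((M, M, N), dtype=d.dtype)
diag_view = np.einsum('iij->ij', A)  # shape (M, N)
diag_view[:] = d                     # d broadcasts across the M rows

# Same idea with an ellipsis, here for an (M, M, 2, N) array (illustrative shape)
B = np.zeros((M, M, 2, N), dtype=d.dtype)
np.einsum('ii...->i...', B)[:] = d

print(np.array_equal(A, A_loop))
```

Note that this removes the Python loop but not the storage: A still holds M copies of d on its diagonal.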
Related
I need help in speeding up the following block of code:
import numpy as np
x = 100
pp = np.zeros((x, x))
M = np.ones((x,x))
arrayA = np.random.uniform(0,5,2000)
arrayB = np.random.uniform(0,5,2000)
for i in range(x):
    for j in range(x):
        y = np.multiply(arrayA, np.exp(-1j*(M[j,i])*arrayB))
        p = np.trapz(y, arrayB) # Numerical evaluation/integration y
        pp[j,i] = abs(p**2)
Is there a function in numpy, or another way to rewrite this piece of code, so that the nested for-loops can be omitted? My idea is to multiply every element of M by the vector arrayB, giving a 100 x 100 matrix in which each element is itself a vector. Each of those vectors would then be multiplied by arrayA with np.multiply(), again giving a 100 x 100 matrix of vectors. Finally, numerical integration of each vector with np.trapz() would give a 100 x 100 matrix in which each element is a scalar.
My problem though is that I lack knowledge of such functions which would perform this.
Thanks in advance for your help!
Edit:
Using broadcasting with
M = np.asarray(M)[..., None]
y = 1000*arrayA*np.exp(-1j*M*arrayB)
return np.trapz(y,B)
works and I can omit the for-loops. However, this is not faster, but instead a little bit slower in my case. This might be a memory issue.
y = np.multiply(arrayA, np.exp(-1j*(M[j,i])*arrayB))
can be written as
y = arrayA * np.exp(-1j*M[:,:,None]*arrayB)
producing a (x,x,2000) array.
But the next step may need adjustment. I'm not familiar with np.trapz.
np.trapz(y, arrayB)
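For the integration step, np.trapz integrates along the last axis by default, so it can be applied directly to the broadcast (x, x, 2000) array. The following is a sketch rather than a tuned solution: the sizes are reduced and M is filled with random values (both assumptions, so that the comparison with the loop is meaningful), and the trapz alias is only there because np.trapz was renamed np.trapezoid in NumPy 2.0.

```python
import numpy as np

# Use np.trapezoid where it exists (NumPy >= 2.0), otherwise np.trapz
trapz = getattr(np, "trapezoid", None) or np.trapz

x = 20                                  # smaller than the question's 100, for a quick check
rng = np.random.default_rng(0)
M = rng.uniform(0, 1, (x, x))           # random instead of np.ones, for illustration
arrayA = rng.uniform(0, 5, 200)
arrayB = rng.uniform(0, 5, 200)

# Vectorized: broadcast M against arrayB, integrate over the last axis
y = arrayA * np.exp(-1j * M[:, :, None] * arrayB)   # shape (x, x, 200)
p = trapz(y, arrayB, axis=-1)                       # shape (x, x)
pp_vec = np.abs(p) ** 2

# Reference: the nested loops from the question
pp_loop = np.zeros((x, x))
for i in range(x):
    for j in range(x):
        yi = arrayA * np.exp(-1j * M[j, i] * arrayB)
        pp_loop[j, i] = abs(trapz(yi, arrayB) ** 2)

print(np.allclose(pp_vec, pp_loop))
```

At the question's original sizes the intermediate array has 100 x 100 x 2000 complex128 entries, which is about 320 MB, so the lack of speedup reported in the edit above may well be memory-bound. Processing M in row blocks would be one way to bound that.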
I am trying to find the indices of the n smallest values in a list of tensors in pytorch. Since these tensors might contain many non-unique values, I cannot simply compute percentiles to obtain the indices. The ordering of non-unique values does not matter however.
I came up with the following solution but am wondering if there is a more elegant way of doing it:
import torch
n = 10
tensor_list = [torch.randn(10, 10), torch.zeros(20, 20), torch.ones(30, 10)]
all_sorted, all_sorted_idx = torch.sort(torch.cat([t.view(-1) for t in tensor_list]))
cum_num_elements = torch.cumsum(torch.tensor([t.numel() for t in tensor_list]), dim=0)
cum_num_elements = torch.cat([torch.tensor([0]), cum_num_elements])
split_indeces_lt = [all_sorted_idx[:n] < cum_num_elements[i + 1] for i, _ in enumerate(cum_num_elements[1:])]
split_indeces_ge = [all_sorted_idx[:n] >= cum_num_elements[i] for i, _ in enumerate(cum_num_elements[:-1])]
split_indeces = [all_sorted_idx[:n][torch.logical_and(lt, ge)] - c for lt, ge, c in zip(split_indeces_lt, split_indeces_ge, cum_num_elements[:-1])]
n_smallest = [t.view(-1)[idx] for t, idx in zip(tensor_list, split_indeces)]
Ideally a solution would pick a random subset of the non-unique values instead of picking the entries of the first tensor of the list.
Pytorch does provide a more elegant (I think) way to do it, with torch.unique_consecutive (see here)
I'm going to work on a tensor, not a list of tensors because as you did yourself, there's just a cat to do. Unraveling the indices afterward is not hard either.
# We want to find the n=3 min values and positions in t
n = 3
t = torch.tensor([1,2,3,2,0,1,4,3,2])
# To get a random occurrence, we create a random permutation
randomizer = torch.randperm(len(t))
# first, we sort t, and get the indices
sorted_t, idx_t = t[randomizer].sort()
# small util function to extract only the n smallest values and positions
head = lambda v,w : (v[:n], w[:n])
# use unique_consecutive to remove duplicates
uniques_t, counts_t = head(*torch.unique_consecutive(sorted_t, return_counts=True))
# counts_t.cumsum gives us the position of the unique values in sorted_t
uniq_idx_t = torch.cat([torch.tensor([0]), counts_t.cumsum(0)[:-1]], 0)
# And now, we have the positions of uniques_t values in t :
final_idx_t = randomizer[idx_t[uniq_idx_t]]
print(uniques_t, final_idx_t)
#>>> tensor([0,1,2]), tensor([4,0,1])
#>>> tensor([0,1,2]), tensor([4,5,8])
#>>> tensor([0,1,2]), tensor([4,0,8])
EDIT : I think the added permutation solves your need-random-occurrence problem
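For the original list-of-tensors setting, the "cat, then unravel the indices" step can be sketched as follows. This is one possible approach, not the method above: it uses torch.topk with largest=False instead of a full sort, and torch.bucketize to map flat positions back to their source tensor. The function name n_smallest_indices and its local variables are hypothetical, and the random tie-breaking relies on the permutation only.

```python
import torch

def n_smallest_indices(tensor_list, n):
    """Return, per tensor, the flat indices of entries among the n smallest overall."""
    flat = torch.cat([t.reshape(-1) for t in tensor_list])
    # Random permutation so that ties are not always resolved in favor of the first tensor
    perm = torch.randperm(flat.numel())
    _, pos = torch.topk(flat[perm], n, largest=False)
    flat_idx = perm[pos]                      # positions in the concatenated tensor

    sizes = torch.tensor([t.numel() for t in tensor_list])
    ends = torch.cumsum(sizes, dim=0)         # exclusive end offset of each tensor
    starts = ends - sizes                     # start offset of each tensor
    owner = torch.bucketize(flat_idx, ends, right=True)  # which tensor each index belongs to
    local = flat_idx - starts[owner]          # index within that tensor
    return [local[owner == k] for k in range(len(tensor_list))]

n = 10
tensor_list = [torch.randn(10, 10), torch.zeros(20, 20), torch.ones(30, 10)]
split_idx = n_smallest_indices(tensor_list, n)
n_smallest = [t.reshape(-1)[idx] for t, idx in zip(tensor_list, split_idx)]
```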
If I have two small complex matrices, the complex multiplication is fine even when I do it manually (breaking the complex numbers into real and imaginary parts and multiplying those separately).
import numpy as np
a_shape = (3,10)
b_shape = (10,3)
# Generating the first complex matrix a
np.random.seed(0)
a_real = np.random.randn(a_shape[0], a_shape[1])
np.random.seed(1)
a_imag = np.random.randn(a_shape[0], a_shape[1])
a = a_real + a_imag*1j
# Generating the second complex matrix b
np.random.seed(2)
b_real = np.random.randn(b_shape[0], b_shape[1])
np.random.seed(3)
b_imag = np.random.randn(b_shape[0], b_shape[1])
b = b_real + b_imag*1j
# 1st approach to do complex multiplication
output1 = np.dot(a,b)
# Manual complex multiplication
output_real = np.dot(a.real,b.real) - np.dot(a.imag,b.imag)
np.array_equal(output1.real, output_real) # the results are the same
>>> True
However, if my matrices are bigger, the results obtained by np.dot(a,b) and by multiplying manually are different.
a_shape = (3,500)
b_shape = (500,3)
# Generating the first complex matrix a
np.random.seed(0)
a_real = np.random.randn(a_shape[0], a_shape[1])
np.random.seed(1)
a_imag = np.random.randn(a_shape[0], a_shape[1])
a = a_real + a_imag*1j
# Generating the second complex matrix b
np.random.seed(2)
b_real = np.random.randn(b_shape[0], b_shape[1])
np.random.seed(3)
b_imag = np.random.randn(b_shape[0], b_shape[1])
b = b_real + b_imag*1j
# 1st approach to do complex multiplication
output1 = np.dot(a,b)
# 2nd approach to do complex multiplication
output_real = np.dot(a.real,b.real) - np.dot(a.imag,b.imag)
np.array_equal(output1.real, output_real)
>>> False
I am asking this because I need to do some complex number multiplication in pytorch. pytorch doesn't support complex numbers natively, so I need to do it manually for the real and imaginary components.
The result is then slightly different from the one np.dot(a,b) gives.
Is there any solution to this problem?
Differences between the two calculations
output1.real - output_real
>>>array([[-3.55271368e-15, -2.48689958e-14, 1.06581410e-14],
[-1.06581410e-14, -5.32907052e-15, -7.10542736e-15],
[ 0.00000000e+00, -2.84217094e-14, -7.10542736e-15]])
You don't say how small the differences are but I suspect what you are seeing has nothing to do with complex numbers but with the nature of floating point arithmetic.
In particular floating point addition is not associative, that is we do not necessarily have
(a + b) + c = a + (b + c)
This would explain what you are seeing, as what you are doing is comparing
Sum{ Ra[i]*Rb[i] - Ia[i]*Ib[i]}
and
Sum{ Ra[i]*Rb[i]} - Sum{ Ia[i]*Ib[i]}
(where Ra[i] is the real part of a[i] etc)
One thing to try to see that this is the problem is to restrict the real and imaginary parts of the numbers to be, say, a whole number of sixteenths. With such numbers -- as long as you don't add an outrageous number (many many billions) of them -- double precision floating point arithmetic will be exact and so you should get identical results. For example in C you could generate such numbers by generating a bunch of random integers between say -16 and 16 and then dividing each by the (double precision) number 16.0, to get a double precision number between -1 and 1 that is a whole number of sixteenths.
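The same experiment can be sketched in NumPy rather than C. The helper names random_complex and compare are made up for this illustration. With parts that are whole sixteenths, every product and partial sum is exactly representable, so the two computations should agree bit for bit; with ordinary random data they only agree up to rounding, which is what np.allclose is meant to check.

```python
import numpy as np

# Floating point addition is not associative
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
not_associative = left != right

rng = np.random.default_rng(0)

def random_complex(shape, exact):
    """Random complex matrix; if exact, both parts are whole numbers of sixteenths."""
    if exact:
        re = rng.integers(-16, 17, size=shape) / 16.0
        im = rng.integers(-16, 17, size=shape) / 16.0
    else:
        re = rng.standard_normal(shape)
        im = rng.standard_normal(shape)
    return re + 1j * im

def compare(exact):
    """Return (bitwise equal, equal within tolerance) for the two ways of multiplying."""
    a = random_complex((3, 500), exact)
    b = random_complex((500, 3), exact)
    direct = np.dot(a, b).real
    manual = np.dot(a.real, b.real) - np.dot(a.imag, b.imag)
    return np.array_equal(direct, manual), np.allclose(direct, manual)

exact_equal, exact_close = compare(exact=True)
_, random_close = compare(exact=False)
print(not_associative, exact_equal, random_close)
```

In practice this suggests comparing floating point results with a tolerance (np.allclose, or torch.allclose on the pytorch side) instead of np.array_equal.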
Just for practice, I am using nested lists (for example, [[1, 0], [0, 1]] is the 2*2 identity matrix) as matrices. I am trying to compute the determinant by reducing the matrix to an upper triangular matrix and then multiplying its diagonal entries. To do this:
def add(A, B):
    """adds two matrices"""
    S = []
    for i in range(len(A)):
        row = []
        for j in range(len(A[0])):
            row.append(A[i][j] + B[i][j])
        S.append(row)
    return S
def scale(n, A):
    """scalar multiplication of matrix with n"""
    return [[(n)*x for x in row] for row in A]
def detr(M):
    Mi = M
    #the loops below are supposed to convert Mi
    #to upper triangular form:
    for i in range(len(Mi)):
        for j in range(len(Mi)):
            if j>i:
                k = -(Mi[j][i])/(Mi[i][i])
                Mi[j] = add( scale(k, [Mi[i]]), [Mi[j]] )[0]
    #multiplies diagonal entries of Mi:
    k = 1
    for i in range(len(Mi)):
        k = k*Mi[i][i]
    return k
Here, you can see that I have set Mi equal to M (the argument) and then operated on Mi to take it to upper triangular form. So, M is supposed to stay unmodified. But after using detr(A), print(A) prints the upper triangular matrix. I tried:
setting X = M, then Mi = X
defining kill(M): return M and then setting Mi = kill(M)
But these approaches are not working. This was causing some problems as I was trying to use detr(M) in another function, problems which I was able to bypass, but why is this happening? What is the compiler doing here, why was M modified even though I operated only on Mi?
(I am using Spyder 3.3.2, Python 3.7.1)
(I am sorry if this question is silly, but I have only started learning python and new to coding in general. This question means a lot to me because I still don't have a deep understanding of this language.)
See python documentation about assignment:
https://docs.python.org/3/library/copy.html
Assignment statements in Python do not copy objects, they create bindings between a target and an object. For collections that are mutable or contain mutable items, a copy is sometimes needed so one can change one copy without changing the other.
You need to import copy and then use Mi = copy.deepcopy(M)
See also
How to deep copy a list?
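A small sketch of the difference between binding a second name, making a shallow copy, and making a deep copy of a nested list. The variable names alias, shallow and deep are only illustrative.

```python
import copy

M = [[2.0, 1.0], [4.0, 5.0]]

alias = M                 # just another name for the same list object
alias[0][0] = 99.0        # M sees this change

shallow = list(M)         # new outer list, but the rows are still shared with M
shallow[1] = [0.0, 0.0]   # rebinding a row in the copy does not affect M
shallow[0][1] = -1.0      # mutating a shared row does affect M

deep = copy.deepcopy(M)   # independent outer list and independent rows
deep[0][0] = 0.0          # M is unaffected

print(M)
```

Because detr only rebinds whole rows (Mi[j] = ...), a shallow copy would probably be enough for that particular function, but copy.deepcopy stays safe if rows are later modified in place.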
Suppose that I have a (400,10) array called x and a (400,10) array called y. Is it possible to do a polyfit of each row in y to the corresponding row in x without iteration? With a for loop it would be something like
import numpy as np
coe = np.zeros((400,3))
for i in np.arange(y.shape[0]):
    coe[i,:] = np.polyfit(x[i,:], y[i,:], 2)
Because the 400 rows in x are all different, I cannot just apply np.polyfit with the same x coordinates to a multi-dimensional array y.
Have you tried a comprehension?
coe = [tuple(np.polyfit(x[i,:], y[i,:], 2)) for i in range(400)]
The range(400) emits the values 0 to 399 into i
For each i, you compute the polyfit for x[i,:] vs y[i,:]. np.polyfit returns an array of the three coefficients (highest power first), which tuple() converts to a tuple
The resulting list-of-tuples is assigned to coe
At the innermost level this is still an iteration. A comprehension avoids some of the interpreter overhead of an explicit for loop, but most of the time is spent inside np.polyfit itself, so expect at most a modest speedup from doing it this way.
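If the Python-level iteration itself needs to go, one option is to solve all 400 least-squares problems as a stacked linear-algebra call. The sketch below is an alternative to np.polyfit, not what np.polyfit does internally: it builds one Vandermonde matrix per row and applies np.linalg.pinv, which accepts stacks of matrices. The test data (a noisy quadratic) and the names V, coe_vec and coe_loop are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(400, 10))
y = 2.0 * x**2 - 3.0 * x + 1.0 + 0.01 * rng.standard_normal(x.shape)

deg = 2
# One Vandermonde matrix per row: columns are x**2, x**1, x**0
V = x[..., None] ** np.arange(deg, -1, -1)            # shape (400, 10, 3)
# Least-squares solution for every row at once via the batched pseudo-inverse
coe_vec = (np.linalg.pinv(V) @ y[..., None])[..., 0]  # shape (400, 3)

# Reference: row-by-row np.polyfit, as in the question
coe_loop = np.array([np.polyfit(x[i], y[i], deg) for i in range(x.shape[0])])

print(np.allclose(coe_vec, coe_loop))
```

The coefficient order matches np.polyfit (highest power first). For higher degrees or badly scaled x, the conditioning of the Vandermonde matrices may matter more than it does in this small example.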