Sparse symmetric matrix generation with non-zero diagonal elements in Python - python-3.x

I have to generate a sparse symmetric matrix of a given dimension n*n in which all diagonal elements are non-zero, and each row can have up to k non-zero off-diagonal values. For example, if k = 3 and n = 4, we should have a symmetric sparse 4*4 matrix whose diagonal elements are all non-zero; the first row can have 3 non-zero values, and everything else is zero. How can I achieve this?
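One possible approach is a sketch like the following, using scipy.sparse. The helper sparse_symmetric and the choice to sample the upper triangle are illustrative only; because the upper triangle is mirrored, a row may end up with somewhat more than k off-diagonal entries.

import numpy as np
import scipy.sparse as sp

def sparse_symmetric(n, k, seed=None):
    # sample up to k off-diagonal positions per row in the upper triangle
    rng = np.random.default_rng(seed)
    rows, cols, vals = [], [], []
    for i in range(n):
        candidates = np.arange(i + 1, n)
        m = min(k, candidates.size)
        if m > 0:
            js = rng.choice(candidates, size=m, replace=False)
            rows.extend([i] * m)
            cols.extend(js.tolist())
            vals.extend(rng.random(m).tolist())
    upper = sp.coo_matrix((vals, (rows, cols)), shape=(n, n))
    # mirror the upper triangle and add a strictly non-zero diagonal
    return (upper + upper.T + sp.diags(rng.random(n) + 1.0)).tocsr()

M = sparse_symmetric(4, 3, seed=0)
print(M.toarray())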

Related

KxK matrices from an NxKxK tensor multiplied by rows of the NxK matrices of an MxNxK tensor

I have two tensors. One is an N-batch of KxK matrices, i.e. an NxKxK tensor called A. The other is an MxNxK tensor called B. I want a new MxNxK tensor in which, for each of the M NxK slices of B, the i-th row (a 1xK vector) is multiplied by the i-th KxK matrix from A to form the i-th row of the result.
Because the KxK matrices in A are lower-triangular, it may be easier to form A from upper-triangular matrices instead and avoid transpose operations when multiplying rows of B by the KxK matrices.
I attached a screenshot to be more precise.
It seems that the solution is
torch.einsum('pts,tsk->ptk', B, A)
if A is upper-triangular.
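A quick way to check that einsum against an explicit loop (the shapes here are arbitrary, chosen only for the test):

import torch

M_, N_, K_ = 2, 3, 4
A = torch.triu(torch.rand(N_, K_, K_))   # N-batch of upper-triangular KxK matrices
B = torch.rand(M_, N_, K_)
out = torch.einsum('pts,tsk->ptk', B, A)
# reference: row B[p, t] (length K) times matrix A[t] (KxK)
ref = torch.stack([torch.stack([B[p, t] @ A[t] for t in range(N_)])
                   for p in range(M_)])
assert torch.allclose(out, ref)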

Exclude indices from a PyTorch tensor

I have an MxN table of distances between two distinct sets of points, of sizes M <= N. I would like to associate to each point of the first set a point in the second set, in the following way.
Suppose that the shortest of all pairwise distances is between point i0 of the first set and point j0 of the second. Then we pair point i0 of the first set with j0 of the second. For the second pair, I have to find i1 != i0 and j1 != j0 such that the distance is minimal among the remaining non-paired points.
I figure that I could do the first step using the torch.min function, which delivers both the minimal value and its 2-D index in the matrix. But for each subsequent step I'll need to exclude a row and a column while keeping their original indices.
In other words, if I have a 3x4 matrix and my first element is (1,2), I would like to be left with a 2x3 matrix with row indices 0,2 and column indices 0,1,3, so that if my second desired element's position in the original matrix is, say, (2,3), I will be given (2,3) as a result of performing torch.min on the matrix with the excluded row and column, rather than (1,2) again.
P.S. I could reach my goal by replacing the values in the row and column I'd like to exclude with, say, positive infinity, but I think the question is still worth asking.
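For what it's worth, a sketch of that masking idea (greedy pairing by repeatedly taking the global minimum and masking out the chosen row and column; variable names are illustrative):

import torch

D = torch.rand(3, 4)        # MxN distance matrix, M <= N
work = D.clone()
pairs = []
for _ in range(D.shape[0]):
    flat = torch.argmin(work)                  # flat index of the current minimum
    i, j = divmod(flat.item(), work.shape[1])  # recover the 2-D position
    pairs.append((i, j))
    work[i, :] = float('inf')                  # exclude row i from later steps
    work[:, j] = float('inf')                  # exclude column j
print(pairs)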

Efficient sparse matrix column change

I'm implementing an efficient PageRank algorithm, so I'm using sparse matrices. I'm close, but there's one problem: I have a matrix where I want the sum of each column to be one. This is easy to implement, but the problem occurs when I get a matrix with a zero column.
In this case, I want to set each element in the column to 1/(n-1), where n is the dimension of the matrix. I divide by n-1 rather than n because I wish to keep the diagonal zero, always.
How can I implement this efficiently? My naive solution is to determine the sum of each column, find the column indices that are zero, and replace each such column with the value 1/(n-1), like so:
# naive approach (too slow!)
# M is my n x n sparse matrix where each column sums to one
col_sums = M.sum(axis=0)
for i in range(n):
    if col_sums[0, i] == 0:
        # set entire column to 1/(n-1)
        M[:, i] = 1 / (n - 1)
        # make sure diagonal is zeroed
        M[i, i] = 0
My M matrix is very very very large and this method simply doesn't scale. How can I do this efficiently?
You can't add new nonzero values without reallocating and copying the underlying data structure. If you expect these zero columns to be very common (> 25% of the data), you should handle them in some other way, or you're better off with a dense array.
Otherwise, try this:
import scipy.sparse

M = scipy.sparse.rand(1000, 1000, density=0.001, format='csr')
# empty sparse matrix to hold the replacement weights
nz_col_weights = scipy.sparse.csr_matrix(M.shape, dtype=M.dtype)
# fill every all-zero column of M with 1/(n-1)
nz_col_weights[:, M.getnnz(axis=0) == 0] = 1 / (M.shape[0] - 1)
# keep the diagonal zero
nz_col_weights.setdiag(0)
M += nz_col_weights
This has only two allocation operations.
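As a quick sanity check (an addition, assuming the snippet above has run), no column of M should remain entirely empty afterwards:

# after the update, every column of M holds at least one stored entry
assert (M.getnnz(axis=0) > 0).all()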

Two regular loops using given values of a parameter in MATLAB

I have an S1 (21x21) matrix and a W (21x21) matrix given. I define a cell array results in which each element is a matrix, as results = {W};
Then I have two regular for loops that run over all values of the first index and then the second index, but each pass should use one specific value of k.
There are also two given vectors, cos and ens, each of dimension 21x1. Here is the code:
rowsP = 21;
M = 0;
beta = 0.9;
p = 0.5;
q = 0.5;
k = [1:rowsP-1];
for j = 1:rowsP-k
    for i = 1:rowsP-k
        R(i,j) = ( S1(i,end-k) - cos(j+k) )*ens(j) - 0.001*M + ...
            beta*( p*results{k}(i,end-j) + q*results{k}(i+1,end-j) );
        results{k+1} = fliplr(R);
    end
end
I am getting the error
Matrix Dimensions must agree.
So I am trying to calculate the matrix R each time using the two for loops, starting from results{1} = W (the given matrix) with k = 1.
Flipping that matrix left to right then gives results{2}, which in turn is used to calculate R again, but for k = 2, and this is repeated until k = 21.
As you see, I keep dropping the last column of each successive R, so results should be appended each time, giving a row of 21 cells: the given 21x21 matrix W, then a 20x20 matrix, then 19x19, and so on, down to a 1x1 matrix. I am unable to solve the problem, as MATLAB only does one iteration and does not compute the correct answer: I keep getting two cells in results, the given 21x21 matrix and the next 20x20 matrix.
I tried another for loop over k, but in that case, for a given k, starting from k = 1, it runs the whole code for j and then i, and it does not solve my problem.
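The dimension error is consistent with k being defined as the vector 1:rowsP-1 rather than a scalar. Below is a sketch of the iteration described above, with an outer loop over a scalar k; this is a reconstruction, not the asker's code, and cosv and ensv stand in for the given vectors cos and ens (those names would shadow MATLAB built-ins):

% S1, W (21x21) and cosv, ensv (21x1) are assumed given
rowsP = 21; M = 0; beta = 0.9; p = 0.5; q = 0.5;
results = {W};                    % results{1} is the given 21x21 matrix W
for k = 1:rowsP-1                 % scalar k each pass, not a vector
    R = zeros(rowsP-k);           % fresh (rowsP-k) x (rowsP-k) matrix
    for j = 1:rowsP-k
        for i = 1:rowsP-k
            R(i,j) = ( S1(i,end-k) - cosv(j+k) )*ensv(j) - 0.001*M + ...
                beta*( p*results{k}(i,end-j) + q*results{k}(i+1,end-j) );
        end
    end
    results{k+1} = fliplr(R);     % 20x20, then 19x19, ... down to 1x1
end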

Can the cosine similarity when using Locality Sensitive Hashing be -1?

I was reading this question:
How to understand Locality Sensitive Hashing?
But then I found that the equation to calculate the cosine similarity is as follows:
Cos(v1, v2) = cos(theta), where theta = (hamming distance / signature length) * pi = (h/b) * pi
This means that if the vectors are fully similar, the Hamming distance will be zero and the cosine value will be 1. But when the vectors are totally dissimilar, the Hamming distance will equal the signature length, so we have cos(pi), which results in -1. Shouldn't the similarity always be between 0 and 1?
Cosine similarity is the dot product of the vectors divided by the product of their magnitudes, so it's entirely possible to have a negative value for the angle's cosine. For example, if you have unit vectors pointing in opposite directions, you want the value to be -1. I think what's confusing you is the nature of the representation: the other post talks about angles between vectors in 2-D space, whereas it's more common to create vectors in a multidimensional space where the number of dimensions is customarily much greater than 2 and the value of each dimension is non-negative (e.g., a word occurs in a document or not), resulting in a 0 to 1 range.
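A small numeric illustration of that relation using random-hyperplane signatures (the setup below is illustrative, not from the answer):

import numpy as np

rng = np.random.default_rng(0)
b = 10000                                # signature length
planes = rng.standard_normal((b, 2))     # random hyperplanes in 2-D
v1 = np.array([1.0, 0.0])
v2 = -v1                                 # opposite direction: true cosine is -1
sig1 = planes @ v1 > 0                   # one signature bit per hyperplane
sig2 = planes @ v2 > 0
h = np.count_nonzero(sig1 != sig2)       # Hamming distance between signatures
print(np.cos(h / b * np.pi))             # estimated cosine: -1.0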
