After transforming the input text of "abracdabra!", my transformation vector is [3, 0, 5, 6, 7, 9, 10, 8, 2, 1, 4]. The text is then piped through a few more transformations and compressed to disk.
After closing the program, we obviously no longer have access to the transformation vector. Are we expected to write the transformation vector to disk? Wouldn't the size of the vector actually equal n characters? Wouldn't this actually increase the size of the compressed file?
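The Burrows-Wheeler transform is reversible without the transformation vector. The decoder only needs the transformed text plus either a unique end-of-string marker or a single integer (the row of the original string among the sorted rotations), so nothing of size n has to be written to disk.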
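To make that concrete, here is a minimal sketch of a naive forward and inverse transform. It assumes the trailing "!" is a unique end marker, and the function names bwt and inverse_bwt are just illustrative. A real compressor would use a suffix-array-based forward transform rather than sorting full rotations.

```python
def bwt(text):
    """Naive forward transform: last column of the sorted rotations."""
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column, end_marker="!"):
    """Rebuild the original text from the last column alone."""
    n = len(last_column)
    # A stable sort of the positions by character maps each row of the
    # (implicit) first column to the matching position in the last column.
    order = sorted(range(n), key=lambda i: last_column[i])
    # The row holding the original text is the one ending in the marker.
    row = last_column.index(end_marker)
    out = []
    for _ in range(n):
        row = order[row]
        out.append(last_column[row])
    return "".join(out)


text = "abracdabra!"
assert inverse_bwt(bwt(text)) == text
```

No vector is passed to inverse_bwt. The mapping it needs is recomputed from the transformed text itself.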
In PyTorch, we have nn.Linear, which applies a linear transformation to the incoming data:
y = WA+b
In this formula, W and b are our learnable parameters and A is my input data matrix. The matrix 'A' for my case is too large for RAM to complete loading, so I use it sparsely. Is it possible to perform such an operation on sparse matrices using PyTorch?
This is possible with PyTorch using sparse matrix multiply. In your case, I think you want something like:
>>> i = [[0, 1, 1],
...      [2, 0, 2]]
>>> v = [3, 4, 5]
>>> A = torch.sparse_coo_tensor(i, v, (2, 3))
>>> A.to_dense()
tensor([[0, 0, 3],
        [4, 0, 5]])
# compute W@A by computing ((A.T)@(W.T)).T because...
# at time of writing, the sparse matrix must be first in the matmul
>>> (A.t() @ W.t()).t()
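For completeness, here is a small self-contained sketch of the same trick with gradients flowing into W and b. The shapes (4 output features, 2 input features, 3 samples stored as columns of A) are assumptions for illustration only. It uses torch.sparse.mm, which takes the sparse operand first.

```python
import torch

# Sparse 2x3 input: rows are features, columns are samples (assumed layout).
indices = [[0, 1, 1],
           [2, 0, 2]]
values = [3.0, 4.0, 5.0]
A = torch.sparse_coo_tensor(indices, values, (2, 3))

# Hypothetical learnable parameters for y = W A + b.
W = torch.randn(4, 2, requires_grad=True)
b = torch.randn(4, 1, requires_grad=True)

# W @ A == (A^T @ W^T)^T, which keeps the sparse matrix as the first operand.
y = torch.sparse.mm(A.t(), W.t()).t() + b

# Gradients reach the dense parameters as usual.
y.sum().backward()
```

The result can be checked against the dense computation W @ A.to_dense() + b on a small example like this one.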
How would I go about blacking out a portion of an image or feature map such that AutoGrad can backprop through the operation?
Specifically I want to black out everything except for n layers of border pixels. So if we consider a single channel of the feature map which looks like:
[
[1, 1, 1, 1],
[1, 1, 1, 1],
[1, 1, 1, 1],
[1, 1, 1, 1],
]
I set a constant n=1 so my operation does the following to the input:
[
[1, 1, 1, 1],
[1, 0, 0, 1],
[1, 0, 0, 1],
[1, 1, 1, 1],
]
In my case I'd be doing it to a multi channel feature map and all channels would be treated the same way.
If possible, I want to do it in a functional manner.
Considering the comments you added, i.e. that you don't need the output to be differentiable with respect to the mask (said differently, the mask is constant), you could just store the indices of the 1s in the mask and act only on the corresponding elements of whatever Tensor you're considering. Or, if you don't want to deal with fancy indexing, you could keep the mask as a Tensor of 0s and 1s and multiply it element-wise with whatever Tensor you're considering. Or, if you truly only need to compute a loss along the border pixels, extract the first and last rows and the first and last columns, and avoid double-counting the corners. This last solution is essentially the first solution recast as a special case.
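A minimal sketch of the second option (a constant 0/1 mask multiplied element-wise) could look like this. The helper name border_mask and the (batch, channels, height, width) layout are assumptions for the example.

```python
import torch


def border_mask(height, width, n):
    """Constant mask: 1 on the n outermost pixel layers, 0 inside."""
    mask = torch.ones(height, width)
    mask[n:height - n, n:width - n] = 0
    return mask


# Multi-channel feature map; the 2D mask broadcasts over batch and channels.
x = torch.ones(1, 3, 4, 4, requires_grad=True)
mask = border_mask(4, 4, n=1)
masked = x * mask

# Autograd tracks the multiplication; blacked-out pixels get zero gradient.
masked.sum().backward()
```

After the backward pass, x.grad equals the mask in every channel, so only the border pixels contribute to the gradient.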
To address the question in your comment to my answer:
x = torch.tensor([[1.0,2,3],[4,5,6]], requires_grad = True)
print(x[:,0])
gives
tensor([1., 4.], grad_fn=<SelectBackward>)
So we see that slicing does not interfere with the autograd engine (it is still tracking the contribution to the gradient). It is not too surprising that this works automatically: slicing can be viewed as the (mathematical) function of projecting onto a subspace of R^n, for which the gradient is easy to compute.
I have a matrix in which many rows are already in upper triangular form. Does scipy.linalg.lu recognize this special structure and use it to decompose the matrix faster? If I decompose this matrix on paper, I only apply Gaussian elimination to the rows that are not in upper triangular form. For example, I would only transform the last row of matrix B.
import numpy as np
A = np.array([[2, 5, 8, 7, 8],
[5, 2, 2, 8, 9],
[7, 5, 6, 6, 10],
[5, 4, 4, 8, 10]])
B = np.array([[2, 5, 8, 7, 8],
[0, 2, 2, 8, 9],
[0, 0, 6, 6, 10],
[5, 4, 4, 8, 10]])
Because my square matrix is very large and this procedure is repeated thousands of times, I would like to make use of this special structure to reduce the computational complexity.
Thank you so much for your elaboration!
Not automatically.
You'll need to exploit the structure yourself if you want to. Whether you can make it faster than the built-in implementation depends on many factors (the number of zeros, etc.).
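As a rough illustration of using the structure yourself, here is a sketch for the special case in the question, where every row except the last is already upper triangular. It assumes the triangular block has a nonzero diagonal and does no pivoting, so it is not a drop-in replacement for scipy.linalg.lu. The function name lu_last_row is made up for this example.

```python
import numpy as np
from scipy.linalg import solve_triangular


def lu_last_row(B):
    """LU factors of B when only the last row needs elimination (no pivoting)."""
    n = B.shape[0]
    U11 = B[:n - 1, :n - 1]  # leading block, already upper triangular
    # Multipliers l satisfy l @ U11 = B[-1, :n-1], i.e. U11^T l = B[-1, :n-1].
    l = solve_triangular(U11, B[-1, :n - 1], trans='T', lower=False)
    L = np.eye(n)
    L[-1, :n - 1] = l
    U = B.astype(float)  # astype returns a copy, so B is left untouched
    U[-1, :n - 1] = 0.0
    U[-1, n - 1:] = B[-1, n - 1:] - l @ B[:n - 1, n - 1:]
    return L, U


B = np.array([[2, 5, 8, 7, 8],
              [0, 2, 2, 8, 9],
              [0, 0, 6, 6, 10],
              [5, 4, 4, 8, 10]])
L, U = lu_last_row(B)
assert np.allclose(L @ U, B)
```

This costs one triangular solve and one vector-matrix product instead of a full elimination. Without pivoting it can be numerically unstable, so whether it is acceptable depends on your matrices.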
I have binary images (as the one below) at the output of my net. I need the '1's to be further from each other (not connected), so that they would form a sparse binary image (without white blobs). Something like salt-and-pepper noise. I am looking for a way to define a loss (in pytorch) that would punish based on the density of the '1's.
Thanks.
It depends on how you're generating said image. Since neural networks have to be trained by backpropagation, I'm fairly sure your binary image is not the direct output of your neural network (i.e. not the thing you're applying the loss to), because gradients can't flow through binary (discrete) variables. I suspect you do something like pixel-wise binary cross entropy or similar and then threshold.
I assume your code works like this: you densely regress real-valued numbers and then apply thresholding, likely using sigmoid to map from (-inf, inf) to [0, 1]. If so, you can do the following. Build a convolution kernel which is 0 in the center and 1 elsewhere, with a size related to how big you want your "sparsity gaps" to be.
kernel = [
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1],
    [1, 1, 0, 1, 1],
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1],
]
Then you apply sigmoid to your real-valued output to squash it to [0, 1]:
squashed = torch.sigmoid(nn_output)
then you convolve squashed with kernel, which gives you the relaxed number of non-zero neighbors.
neighborhood = nn.functional.conv2d(squashed, kernel, padding=2)
and your loss will be the product of each pixel's value in squashed with the corresponding value in neighborhood:
sparsity_loss = (squashed * neighborhood).mean()
If you think of this loss applied to your binary image, for a given pixel p the product will be nonzero if and only if both p and at least one of its neighbors have value 1 (it then equals the number of such neighbors), and 0 otherwise. Since we apply it to non-binary numbers in the [0, 1] range, it is a differentiable approximation of that.
Please note that I left out some of the details from the code above (like correctly reshaping kernel to work with nn.functional.conv2d).
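Putting the pieces together, a runnable sketch might look like the following. It assumes the network output has shape (batch, 1, height, width). The function name sparsity_loss and its size parameter are illustrative.

```python
import torch
import torch.nn.functional as F


def sparsity_loss(nn_output, size=5):
    """Penalize pixels that are 'on' together with their neighbors."""
    # conv2d expects a kernel of shape (out_channels, in_channels, kH, kW).
    kernel = torch.ones(1, 1, size, size,
                        dtype=nn_output.dtype, device=nn_output.device)
    kernel[0, 0, size // 2, size // 2] = 0  # ignore the pixel itself
    squashed = torch.sigmoid(nn_output)
    # Relaxed count of 'on' neighbors around each pixel.
    neighborhood = F.conv2d(squashed, kernel, padding=size // 2)
    return (squashed * neighborhood).mean()


logits = torch.randn(2, 1, 8, 8, requires_grad=True)
loss = sparsity_loss(logits)
loss.backward()
```

In practice this term would be added to the main loss with a weighting factor. A larger size pushes the '1's further apart.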
After reading this similar question, I still can't fully understand how to go about implementing the solution I'm looking for. I have a sparse matrix, i.e.:
import numpy as np
from scipy import sparse
arr = np.array([[0,5,3,0,2],[6,0,4,9,0],[0,0,0,6,8]])
arr_csc = sparse.csc_matrix(arr)
I would like to efficiently get the top n items of each row, without converting the sparse matrix to dense.
The end result should look like this (assuming n=2):
top_n_arr = np.array([[0,5,3,0,0],[6,0,0,9,0],[0,0,0,6,8]])
top_n_arr_csc = sparse.csc_matrix(top_n_arr)
What is wrong with the linked answer? Does it not work in your case? or you just don't understand it? Or it isn't efficient enough?
I was going to suggest working out a means of finding the top values for a row of an lil format matrix, and apply that row by row. But I would just be repeating my earlier answer.
OK, my previous answer was a start, but lacked some details on iterating through the lil format. Here's a start; it probably could be cleaned up.
Make the array, and a lil version:
In [42]: arr = np.array([[0,5,3,0,2],[6,0,4,9,0],[0,0,0,6,8]])
In [43]: arr_sp=sparse.csc_matrix(arr)
In [44]: arr_ll=arr_sp.tolil()
The row function from the previous answer:
def max_n(row_data, row_indices, n):
    i = row_data.argsort()[-n:]
    # i = row_data.argpartition(-n)[-n:]
    top_values = row_data[i]
    top_indices = row_indices[i]  # do the sparse indices matter?
    return top_values, top_indices, i
Iterate over the rows of arr_ll, apply this function and replace the elements:
In [46]: for i in range(arr_ll.shape[0]):
   ....:     d,r=max_n(np.array(arr_ll.data[i]),np.array(arr_ll.rows[i]),2)[:2]
   ....:     arr_ll.data[i]=d.tolist()
   ....:     arr_ll.rows[i]=r.tolist()
   ....:
In [47]: arr_ll.data
Out[47]: array([[3, 5], [6, 9], [6, 8]], dtype=object)
In [48]: arr_ll.rows
Out[48]: array([[2, 1], [0, 3], [3, 4]], dtype=object)
In [49]: arr_ll.tocsc().A
Out[49]:
array([[0, 5, 3, 0, 0],
[6, 0, 0, 9, 0],
[0, 0, 0, 6, 8]])
In the lil format, the data is stored in 2 object type arrays, as sublists, one with the data numbers, the other with the column indices.
Viewing the data attributes of a sparse matrix is handy when doing new things. Changing those attributes has some risk, since it can mess up the whole array. But it looks like the lil format can be tweaked like this safely.
The csr format is better for accessing rows than csc. Its data is stored in 3 arrays: data, indices and indptr. The lil format effectively splits 2 of those arrays into sublists based on information in the indptr. csr is great for math (multiplication, addition etc.), but not so good when changing the sparsity (turning nonzero values into zeros).
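For comparison, here is a sketch of the same top-n selection done directly on the csr arrays, using indptr to find each row's slice of data. The function name top_n_per_row is made up for this example. It only ranks the stored values, so it assumes the implicit zeros should never count as "top" entries (which matters if a row holds negative values).

```python
import numpy as np
from scipy import sparse


def top_n_per_row(mat, n):
    """Keep the n largest stored values in each row, zeroing the rest."""
    csr = sparse.csr_matrix(mat, copy=True)
    for start, end in zip(csr.indptr[:-1], csr.indptr[1:]):
        row = csr.data[start:end]  # view into csr.data for this row
        if row.size > n:
            # Positions of the (row.size - n) smallest stored values.
            drop = np.argpartition(row, row.size - n)[:row.size - n]
            row[drop] = 0
    csr.eliminate_zeros()  # remove the zeroed entries from the sparsity pattern
    return csr


arr = np.array([[0, 5, 3, 0, 2], [6, 0, 4, 9, 0], [0, 0, 0, 6, 8]])
result = top_n_per_row(sparse.csc_matrix(arr), 2)
print(result.toarray())
```

Zeroing the values first and calling eliminate_zeros once at the end avoids changing the sparsity row by row, which is the part csr handles poorly.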