Exclude indices from Pytorch tensor - pytorch

I have an MxN table of distances between two distinct sets of points of size M <= N. I would like to find associate to each point of the first set M points in the second set in the following way.
Suppose that the shortest of all pairwise distances is between the i0 point of the first set and the j0 of the second. Then we attribute point i0 of the first set to j0 in the second. For the second pair, I have to find i1 != i0 and j1 != j0 such that the distance is minimal among remaining non-paired points.
I figure that I could do the first step by using torch.min function that will deliver me both minimal value as well as its 2d index in the matrix. But for the next steps I'll need to each time exclude a row a colunm, while keeping their original indices.
In other words, if I have a 3x4 matrix, and my first element is (1,2), I would like to be left with a 2x3 matrix with indices 0,2 and 0,1,3. So that, if my second desired element position in the original matrix is, say (2,3) I will be given (2,3) as a result of performing torch.max on the matrix with excluded row and column, rather than (1,2) again.
P.S. I could reach my goal by replacing the values in row and column I'd like to exclude by, say, positive infinities, but I think the question is still worth asking.

Related

Efficient sparse matrix column change

I'm implementing an efficient PageRank algorithm so I'm using sparse matrices. I'm close, but there's one problem. I have a matrix where I want the sum of each column to be one. This is easy to implement, but the problem occurs when I get a matrix with a zero column.
In this case, I want to set each element in the column to be 1/(n-1) where n is the dimension of the matrix. I divide by n-1 and not n because I wish to keep the diagonals zero, always.
How can I implement this efficiently? My naive solution is to just determine the sum of each column and then find the column indices that are zero and replace the entire column with an 1/(n-1) value like so:
# naive approach (too slow!)
# M is my nxn sparse matrix where each column sums to one
col_sums = M.sum(axis=0)
for i in range(n):
if col_sums[0,i] == 0:
# set entire column to 1/(n-1)
M[:, i] = 1/(n-1)
# make sure diagonal is zeroed
M[i,i] = 0
My M matrix is very very very large and this method simply doesn't scale. How can I do this efficiently?
You can't add new nonzero values without reallocating and copying the underlying data structure. If you expect these zero columns to be very common (> 25% of the data) you should handle them in some other way, or you're better off with a dense array.
Otherwise try this:
import scipy.sparse
M = scipy.sparse.rand(1000, 1000, density=0.001, format='csr')
nz_col_weights = scipy.sparse.csr_matrix(M.shape, dtype=M.dtype)
nz_col_weights[:, M.getnnz(axis=0) == 0] = 1 / (M.shape[0] - 1)
nz_col_weights.setdiag(0)
M += nz_col_weights
This has only two allocation operations

Can I compute the mean of two different axes of a 4D array using np.mean?

The data from my files is stored in 4D arrays in python of shape (64,128,64,3). The code I run is in a grid code format, so the shape tells us that there are 64 cells in the x,128 in the y, and 64 in the z. The 3 is the x, y, and z components of velocity. What I want to do is compute the average x velocity in each direction for every cell in y.
Let's start in the corner of my grid. I want the first element of my average array to be the average of the x velocity of all the x cells and all the z cells in position y[0]. The next element should be the same, but for y[1]. The end result should be an array of shape (128).
I'm fairly new to python, so I could be missing something simple, but I don't see a way to do this with one np.mean statement because you need to sum over two axes (In this case, 1 and 2 I think). I tried
velx_avg = np.mean(ds['u'][:,:,:,0],axis=1)
here, ds is the data set I've loaded in, and the module I've used to load it stores the velocity data under 'u'. This gave me an array of shape (64,64).
What is the most efficient way to produce the result that I want?
You can use the flatten command to make your life here much easier, this takes an np.ndarray and flattens it into one dimension.
The challenge here is trying to find your definition of 'efficient', but you can play around with that yourself. To do what you want, I simply iterate over the array and flatten the x and z component into a continuous array, and then take the mean of that, see below:
velx_avg = np.mean([ds['u'][:, i, :, 0].flatten() for i in range(128)], axis=1)

Bitmap pattern recogination

Describe and analyze an algorithm that finds the maximum-area rectangular
pattern that appears more than once in a given bitmap. Specifically, given
a two-dimensional array M[1 .. n, 1 .. n] of bits as input, your algorithm
should output the area of the largest repeated rectangular pattern in M.
For example, given the bitmap shown on the left in the figure below, your
algorithm should return the integer 195, which is the area of the 15 x 13
doggo. (Although it doesn’t happen in this example, the two copies of the
repeated pattern might overlap.)
Image: enter image description here
Using term rectangle instead of sub-matrix.
Brute Force Approach O(n^8)
Take two distinct positions A and B in matrix. Then consider all rectangles in the matrix with A as top left corner. Similarly consider all rectangles with B as top left corner. Suppose we have a rectangle with top-left corner A of length l and breadth b, then take corresponding rectangle with top left corner B, length l, breath b (Assuming both such rectangles exist). If every bit of both rectangles match, then we have a pattern of area of lb repeating and if lb is greater than previously seen maximum area, update the maximum area.
Repeat this procedure for all pairs of A and B.
Time Complexity : Without making analysis complex, let us assume that for every point P in matrix,for all rectangles with P as top left corner, possible maximum length of rectangle is n (But this is not true, for example consider point in the last row and last column of matrix). Let us make similar assumption for maximum possible breath of rectangle.
Then from product rule in counting, there are n^2 possible rectangles with point P as their top left corner.
This estimate is not so bad because according to this, there are total n^4 rectangles possible, as there are n^2 points in the rectangle and we have n^2 rectangles per point.
Actual answer is ( (n+1) Choose 2 ) * ( (n+1) Choose 2 ), which is in the order of n^4
So in total, for each pair, we'd compare n^2 rectangles. And similar to above, let us estimate that there are n^2 points in each rectangle for easy calculations. So for a pair of points A and B, we'll have O(n^4) comparisons because for each rectangle, we need to check all of their corresponding points.
And there are ( n^2 Choose 2 ) pairs of points in total, which is of order n^4. So we'd have overall time complexity of O(n^8).
Avoiding Repeated Calculations O(n^6)
We see that we are repeatedly checking same area again and again. For example consider two rectangles with top-left corners at A and B respectively, with length l and breadth b. Again when we check another corresponding rectangles with top-left corners at A and B with length (l+1) and breadth b, we are again checking rectangle of length l and breadth b.
So we use memoization to avoid repeated calculations.
Assume length of a rectangle is measured horizontally and breadth vertically measured.
Consider rectangles of length l+1 and breadth b with top-left corners at A and B. Now we need to compare if all values of corresponding points in rectangles match. A_r, B_r be points to right of A and B respectively. Then if rectangles have same values for all corresponding points, rectangles of length l, breadth b with top left corners A and B must repeat. Similarly rectangles of length l and breadth b with top-left corners at A_r and B_r must match.
Consider below figure :
Time Complexity By above procedure, it'd take O(1) time to compare two rectangles. So compared to above case, time complexity is reduced by a factor of n^2 (which was time required to compare rectangles in earlier case). So O(n^6) time in this case.
Let us represent (P, m, n) as rectangle with top-left corner P, length m and breadth m.
Removing unnecessary calculations O(n^5)
In the above approach, even if we know that if rectangles (A, l, b) and (B, l, b) do not match we again compare the rectangles (A, l+1, b) and (B, l+1, B).
So suppose now we have rectangles with top-left corners A and B each of length l, then what is maximum possible breadth of the rectangles?
If all the corresponding elements in top row of each rectangle do not match, then answer is 0. But if all corresponding rows match, then the answer is 1 + (max.breadth of rectangles of length l but with top-left corners below A and B).
And similar to above approach this calculation would take O(1) time, as we need breadth that of (A, B, l-1) and that of below rectangles with length l.
Refer to this answer Largest Square Block for more clarity.
Time Complexity For each pair of points in the matrix, we have to store max. breadth for length 1, 2, ...n. And there are order of n^4 such pairs. And we can check obtain each value in O(1) time. So overall time complexity is O(n^5). And finding max. area for each length of rectangle for each pair is obvious here.
Further Optimization O(n^4)
Imagine you have two copies of bitmaps. One is glued to the ground and other can move on top the glued one. Let us call fixed one as base and other one as moving bitmap.
Now top-left corner of moving bitmap can be on one of (n^2 - 1) points of base, except the case where moving bitmap sits on top of base. Now in each case, there are some points left-out i.e for some points on base bitmap, there will not be a point of moving one on top of it and vice-versa. Assume top-left corner of moving bitmap needs to have an element of base bitmap below it.
Now take one instance of these (n^2 - 1) configurations. And for all points on moving bitmap which have a point of base bitmap below them, let us construct a new matrix such that it contains "Y" if top and bottom element of both the bitmaps are same, else it'd contain "N". And remember, the size of matrix is same as no of elements on moving bitmap which have an element of base bitmap below them.
Then, maximum area of repeated pattern for that portions of base and moving bitmaps would be maximum solid block area containing all Y's, which can be done in O(n^2) time.
Our required answer is maximum of answers in all these (n^2 - 1) configurations.
Refer to Largest rectangle containing all Y's for further clarity
Time Complexity For each configuration, constructing new matrix of "Y" and "N"'s would take O(n^2) time and maximum area calculation also O(n^2) time.
And there are (n^2- 1) such configurations, so overall O(n^4) time.
Further Optimization O(n^3 polylog(n))
Consider in above bitmap :
For length : 1, a rectangle with length : 1 and breadth 3 is
repeating twice. (1st and 2nd columns are same in bitmap). So
maxBreadth(1) is 3.
For length : 2, a rectangle with length : 2 and breadth 2 is
repeating twice. So maxBreadth(2) is 2.
The rectangle is :
0
0
0
0
For length : 3, a rectangle with length : 3 and breadth 1 is
repeating twice. So maxBreadth(3) is 1.
For lengths : 4, no rectangle with length : 4 is repeating in the
bitMap, so maxBreadth(4) is 0.
Consider you have a method named maxBreadth which takes a bitmap matrix and length L as input and returns to you the maximum breadth B for which there is a repeating rectangle of with length L and breadth B in the bitmap.
Using this, can you find the area of largest repeating rectangle in a bitmap?? Iterate over each length. We now know that there is a rectangle repeating in the bitmap with area length * maxBreadth(bitMap, length). Update the corresponding maximum area encountered till now.
So now let's focus on maxBreadth method.
Observations :
If a rectangle of length 3 and breadth 5 is not repeating in given bitmap then a rectangle of length 3 and breadth 6 definitely will not be repeating in the bitmap. Generally if a rectangle of length l and breadth b does not repeat in bitMap, any rectangle of length l and breadth > b does not repeat in bitMap. Same goes for length also.
So based on above observation, you can do binary search to find maxBreadth of given length.
If rectangle of length L breadth B is
repeating, then check for breadth 2B
not repeating, then check for breadth B/2
Ok, now how to check if any rectangle of dimensions (l, b) is repeating in a bitMap? There are n^2 such rectangles, so will you compare each with all the others?
What will you do if you asked if an array of numbers is having a repeated element efficiently? Answer is to sort them and check
So take all the n^2 rectangles and sort them and check if there is any repeating one. And how to compare two 2D rectangles, just divide the rectangle into 4 quadrants and compare them individually. In this way we only need to store sorted indices for breadths 1, 2, 4, 8, ... n. and also for lengths 1, 2, 4, 8, ....n.
Each sorting takes O(n^2 log(n)) time for given (length, breadth)
And for each length we perform this operation log(n) times, so total O((n.logn)^2) time complexity for maxBreadth operation.
Finally we call the method maxBreadth n times, so overall time complexity is O(n^3 log(n)^2) and space complexity is O((n.logn)^2)
Note : This O(n^3 log(n)^2) method not only works for bitMaps but also for matrices containing any number of distinct arbitrary numbers to search for maximum repeating sub-matrix

Two regular loops with using given values for a parameter in MATLAB

I have an S1 (21x21) matrix and a W (21x21) matrix given. I define a matrix results with each element as a matrix as results = {W};
and then, I have two regular for loops such that it runs all the values in index1 and then goes to the second index; but each time it should take a specific value of k for example.
There are also two given vectors cos and ens each having dimension 21x1. Here is the code:
rowsP=21;
M=0;
beta=0.9;
p=0.5;
q=0.5;
k= [1:rowsP-1];
for j=1:rowsP-k
for i=1:rowsP-k
R(i,j) = ( S1(i,end-k) - cos(j+k) ) *ens(j)-0.001*M +
beta*(p*results{k}(i,end-j)+q*results{k}(i+1,end-j));
results{k+1}=fliplr(R);
end
end
I am getting the error
Matrix Dimensions must agree.
So I am trying to calculate a matrix results each time using two for loops given results{1}=W (a given matrix) given k=1.
Then flipping the matrix left to right, I get results{2} which then helps to calculate R again but for k=2. And this is then repeated until k=21.
As you see, I keep dropping the last column of each successive R, the matrix results should be appended each time giving a row of 21 elements each cell having 21x21 matrix (the given matrix W) and then a matrix of 20x20 and then 19x19 and so on... until a matrix of 1x1. I am unable to solve the problem as Matlab only does 1 iteration and then does not compute the correct answer. I keep getting two cells in results with a 21x21 matrix (the one given) and the next 20x20 matrix.
I tried with another for loop for k, but in that case, for a given k, starting from k=1, it runs the whole code for j and then i, but it does not solve my problem.

Probability question: Estimating the number of attempts needed to exhaustively try all possible placements in a word search

Would it be reasonable to systematically try all possible placements in a word search?
Grids commonly have dimensions of 15*15 (15 cells wide, 15 cells tall) and contain about 15 words to be placed, each of which can be placed in 8 possible directions. So in general it seems like you can calculate all possible placements by the following:
width*height*8_directions_to_place_word*number of words
So for such a grid it seems like we only need to try 15*15*8*15 = 27,000 which doesn't seem that bad at all. I am expecting some huge number so either the grid size and number of words is really small or there is something fishy with my math.
Formally speaking, assuming that x is number of rows and y is number of columns you should sum all the probabilities of every possible direction for every possible word.
Inputs are: x, y, l (average length of a word), n (total words)
so you have
horizontally a word can start from 0 to x-l and going right or from l to x going left for each row: 2x(x-l)
same approach is used for vertical words: they can go from 0 to y-l going down or from l to y going up. So it's 2y(y-l)
for diagonal words you shoul consider all possible start positions x*y and subtract l^2 since a rect of the field can't be used. As before you multiply by 4 since you have got 4 possible directions: 4*(x*y - l^2).
Then you multiply the whole result for the number of words included:
total = n*(2*x*(x-l)+2*y*(y-l)+4*(x*y-l^2)

Resources