Sum of absolute off-diagonal differences in numpy matrix - python-3.x

I have a 2d numpy matrix and want to calculate the following test statistic.
I have brute-force code to do it, but it seems like there should be a more general numpy solution that works for any 2D matrix, using things like np.diag(). I can't figure it out though.
import itertools
def bruteforce(m):
    s = 0.0
    for (i,j) in itertools.product(range(0,m.shape[0]),range(0,m.shape[0])):
        if i<j:
            n = (m[i,j]-m[j,i])**2
            d = m[i,j]+m[j,i]
            if float(d) != 0.:
                s = s+(float(n)/float(d))
            else:
                return('NA')
    return(s)
Where in this case m is an NxN matrix of integers. Is there a way to do it vectorised in numpy, avoiding brute force loops like this?

If m is a square matrix, this will do the job:
import numpy as np
np.sum((m-m.T)**2/(m+m.T))/2
Here is a function that covers the case in which there is 0 in the denominator:
def find_s(m):
    d=(m+m.T)
    off_diag_indices=np.triu_indices(len(d),1)
    if 0 in d[off_diag_indices]:
        return 'NA'
    else:
        numerator=(m-m.T)**2
        denominator=m+m.T
        return np.sum(numerator[off_diag_indices]/denominator[off_diag_indices])
I used off_diag_indices because zeros on the diagonal of m+m.T are allowed: the diagonal elements never enter the sum.
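As a small worked check of the point about the diagonal, the sketch below compares the one-liner with an off-diagonal-only version on a 3x3 matrix whose diagonal is zero. The matrix values and the name off_diag_statistic are made up for illustration and are not from the question.

```python
import numpy as np

def off_diag_statistic(m):
    """Sum of (m[i,j]-m[j,i])**2 / (m[i,j]+m[j,i]) over all pairs i < j."""
    m = np.asarray(m, dtype=float)
    i, j = np.triu_indices(m.shape[0], k=1)  # strictly upper-triangular pairs
    num = (m[i, j] - m[j, i]) ** 2
    den = m[i, j] + m[j, i]
    if np.any(den == 0):
        return 'NA'  # same convention as the brute-force version
    return float(np.sum(num / den))

# Illustrative matrix with a zero diagonal
m = np.array([[0, 2, 5],
              [4, 0, 1],
              [3, 7, 0]])

# The one-liner divides 0 by 0 on the diagonal, so its sum is nan here
with np.errstate(divide='ignore', invalid='ignore'):
    one_liner = np.sum((m - m.T) ** 2 / (m + m.T)) / 2

print(one_liner)
print(off_diag_statistic(m))
```

By hand, the three pairs contribute 4/6, 4/8 and 36/8, so the off-diagonal version should print roughly 5.667, while the one-liner is only safe when no diagonal entry of m+m.T is zero.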

Related

Is there a way to compute the matrix logarithm of a Pytorch tensor?

I am trying to compute matrix logarithms in Pytorch but I need to keep tensors because I then apply gradients which means I can't use numpy arrays.
Basically I'm trying to do the equivalent of https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.logm.html but with Pytorch tensors.
Thank you.
Unfortunately the matrix logarithm (unlike the matrix exponential) is not implemented yet, but matrix powers are. This means that
in the meantime you can approximate the matrix logarithm with its power series expansion, truncating it once you reach sufficient accuracy.
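A minimal sketch of that truncated series, assuming A is square and close enough to the identity for the series to converge (for example, a matrix norm of A - I below 1). The name logm_series and the default of 50 terms are illustrative choices, not part of PyTorch.

```python
import torch

def logm_series(A, n_terms=50):
    """Truncated series log(A) = sum_{k>=1} (-1)**(k+1) * (A - I)**k / k."""
    n = A.size(0)
    I = torch.eye(n, dtype=A.dtype, device=A.device)
    X = A - I
    term = X.clone()                # holds X**k, starting at k = 1
    result = torch.zeros_like(A)
    for k in range(1, n_terms + 1):
        result = result + ((-1) ** (k + 1)) * term / k
        term = term @ X             # advance to X**(k+1)
    return result

# Illustrative input: a small perturbation of the identity
A = torch.eye(3, dtype=torch.float64) + 0.02 * torch.tensor(
    [[0., 1., 2.], [3., 4., 5.], [6., 7., 8.]], dtype=torch.float64)
A.requires_grad_(True)

L = logm_series(A)
# Sanity check: exponentiating the result should give back A
print(torch.allclose(torch.linalg.matrix_exp(L), A, atol=1e-10))
L.sum().backward()                  # gradients flow through the plain tensor ops
```

Because only additions and matrix products are used, autograd handles the backward pass without a custom Function. The trade-off is that the series converges slowly, or not at all, for matrices far from the identity.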
Alternatively Lezcano proposes a (slow) solution of a differentiable matrix logarithm via adjoint here. I'll cite their suggested solution:
import scipy.linalg
import torch
def adjoint(A, E, f):
    A_H = A.T.conj().to(E.dtype)
    n = A.size(0)
    M = torch.zeros(2*n, 2*n, dtype=E.dtype, device=E.device)
    M[:n, :n] = A_H
    M[n:, n:] = A_H
    M[:n, n:] = E
    return f(M)[:n, n:].to(A.dtype)
def logm_scipy(A):
    return torch.from_numpy(scipy.linalg.logm(A.cpu(), disp=False)[0]).to(A.device)
class Logm(torch.autograd.Function):
    @staticmethod
    def forward(ctx, A):
        assert A.ndim == 2 and A.size(0) == A.size(1) # Square matrix
        assert A.dtype in (torch.float32, torch.float64, torch.complex64, torch.complex128)
        ctx.save_for_backward(A)
        return logm_scipy(A)
    @staticmethod
    def backward(ctx, G):
        A, = ctx.saved_tensors
        return adjoint(A, G, logm_scipy)
logm = Logm.apply
For a symmetric positive-definite matrix, such as a covariance matrix, this is easy to implement in PyTorch as follows:
import torch
a=torch.randn(5,10)
cov=torch.cov(a)
u, s, v = torch.linalg.svd(cov)
log_cov=torch.matmul(torch.matmul(u, torch.diag_embed(torch.log(s))), v)
You can easily verify that log_cov and log_cov_np are the same:
import scipy.linalg
log_cov_np=scipy.linalg.logm(cov.detach().numpy())
If cov is singular, you can regularize it (for example, by adding a small multiple of the identity) so that it has a good condition number before computing the matrix logarithm.

How can I interpolate a numpy array so that it becomes a certain length?

I have three numpy arrays each with different lengths:
A.shape = (3401,)
B.shape = (2200,)
C.shape = (4103,)
I would like to average the three arrays to produce a new array with size of the largest array (in this case C):
D.shape = (4103,)
Problem is, I don't think I can do this without adding "fake" data to A and B, by interpolation.
How can I perform interpolation on the first two numpy arrays so that they are of the same length as array C?
Do I even need to interpolate here?
First thing that comes to mind is zoom from scipy:
The array is zoomed using spline interpolation of the requested order.
Code:
import numpy as np
from scipy.ndimage import zoom
A = np.random.rand(3401)
B = np.random.rand(2200)
C = np.ones(4103)
for arr in [A, B]:
    zoom_rate = C.shape[0] / arr.shape[0]
    arr = zoom(arr, zoom_rate)
    print(arr.shape)
Output:
(4103,)
(4103,)
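If plain linear interpolation is enough, np.interp can do the same resampling without scipy. This is a sketch that assumes the three arrays sample the same interval at different rates. The helper name resample is hypothetical.

```python
import numpy as np

def resample(arr, new_len):
    """Linearly interpolate a 1-D array onto new_len evenly spaced points."""
    old_x = np.linspace(0.0, 1.0, num=len(arr))
    new_x = np.linspace(0.0, 1.0, num=new_len)
    return np.interp(new_x, old_x, arr)

A = np.random.rand(3401)
B = np.random.rand(2200)
C = np.random.rand(4103)

# Bring A and B up to the length of C, then average element-wise
D = np.mean([resample(A, len(C)), resample(B, len(C)), C], axis=0)
print(D.shape)
```

Whether interpolation is appropriate at all depends on what the index means: if the arrays cover different ranges instead of the same range at different resolutions, averaging only the overlapping parts may be the better choice.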
I think the simplest option is to do the following:
D = np.concatenate([np.average([A[:2200], B, C[:2200]], axis=0),
                    np.average([A[2200:3401], C[2200:3401]], axis=0),
                    C[3401:]])

Torch - Interpolate missing values

I have a stack of image tensors of shape NumOfImages x H x W that includes zeros. I am looking for a way to interpolate the missing values (zeros) using only the information in the same image (no connection between the images). Is there a way to do it using pytorch?
F.interpolate seems to work only for reshaping. I need to fill the zeros, while keeping the dimensions and the gradients of the tensor.
Thanks.
EDIT: Turns out the below does not answer the OP as it does not provide a solution to track gradients for back-propagation. Still leaving it as it can be used as part of a solution.
One way is to convert the tensor to a numpy array and use scipy interpolation, e.g. scipy.interpolate.LinearNDInterpolator, or another numpy array interpolation option. I'm not sure this helps, as it is not pytorch and may involve copying the tensor around.
As scipy interpolation may be slow, one possible solution is to use only the pixels adjacent to missing values for interpolation (these can be obtained easily by dilating the missing-values mask). I think this might speed things up by an order of magnitude, depending on the tensor dimensions and the number of missing values.
Edit: implemented it, seems to give a speedup of two orders of magnitude in my case.
import cv2
import numpy as np
import scipy.interpolate
def fillMissingValues(target_for_interp, copy=True,
                      interpolator=scipy.interpolate.LinearNDInterpolator):
    if copy:
        target_for_interp = target_for_interp.copy()
    def getPixelsForInterp(img):
        """
        Calculates a mask of pixels neighboring invalid values -
        to use for interpolation.
        """
        # mask invalid pixels
        invalid_mask = np.isnan(img) + (img == 0)
        kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
        # dilate to mark borders around invalid regions
        dilated_mask = cv2.dilate(invalid_mask.astype('uint8'), kernel,
                                  borderType=cv2.BORDER_CONSTANT, borderValue=int(0))
        # pixelwise "and" with valid pixel mask (~invalid_mask)
        masked_for_interp = dilated_mask * ~invalid_mask
        return masked_for_interp.astype('bool'), invalid_mask
    # Mask pixels for interpolation
    mask_for_interp, invalid_mask = getPixelsForInterp(target_for_interp)
    # Interpolate only holes, only using these pixels
    points = np.argwhere(mask_for_interp)
    values = target_for_interp[mask_for_interp]
    interp = interpolator(points, values)
    target_for_interp[invalid_mask] = interp(np.argwhere(invalid_mask))
    return target_for_interp
# For the target tensor:
target_filled = fillMissingValues(target.numpy().squeeze())
# transform back to tensor etc..
Note that interpolated values will be np.nan outside of the convex hull of valid points, as provided to LinearNDInterpolator.
If you only want nearest neighbor interpolation, you can make @Yuri Feldman's answer differentiable by returning the index mapping instead of the interpolated image.
What I did is to create a new class from scipy.interpolate.NearestNDInterpolator and override its __call__ method. It's just returning indices instead of values.
import numpy as np
from scipy.interpolate import NearestNDInterpolator
# private helper; its module path may differ between SciPy versions
from scipy.interpolate.interpnd import _ndim_coords_from_arrays
class NearestNDInterpolatorIndex(NearestNDInterpolator):
    def __init__(self, x, y, rescale=False, tree_options=None):
        NearestNDInterpolator.__init__(self, x, y, rescale=rescale, tree_options=tree_options)
        self.points = np.asarray(x)
    def __call__(self, *args):
        """
        Evaluate interpolator at given points.
        Parameters
        ----------
        xi : ndarray of float, shape (..., ndim)
            Points where to interpolate data at.
        """
        xi = _ndim_coords_from_arrays(args, ndim=self.points.shape[1])
        xi = self._check_call_shape(xi)
        xi = self._scale_x(xi)
        dist, i = self.tree.query(xi)
        return self.points[i]
Then, in fillMissingValues, instead of returning target_for_interp, we return these:
source_indices = np.argwhere(invalid_mask)
target_indices = interp(source_indices)
return source_indices, target_indices
Pass the new interpolator to fillMissingValues, then we can get the nearest neighbor interpolation of the image by
img[..., source_indices[:, 0], source_indices[:, 1]] = img[..., target_indices[:, 0], target_indices[:, 1]]
assuming that the image size is on the last two dimensions.
EDIT: As I just tested, this is not differentiable. The problem lies in the index mapping: we need to use masking instead of the in-place operation, which solves the problem.
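One possible reading of the masking suggestion is to build the filled image out of place with torch.where. The sketch below assumes source_indices and target_indices are integer (K, 2) arrays like the ones returned by the modified fillMissingValues, and that the image size is on the last two dimensions. The name fill_nearest is hypothetical.

```python
import torch

def fill_nearest(img, source_indices, target_indices):
    """Copy each target pixel into its source (missing) pixel without in-place writes to img."""
    src = torch.as_tensor(source_indices, dtype=torch.long, device=img.device)
    tgt = torch.as_tensor(target_indices, dtype=torch.long, device=img.device)
    h, w = img.shape[-2:]
    # Identity index maps: by default every pixel reads from itself
    rows = torch.arange(h, device=img.device).view(h, 1).repeat(1, w)
    cols = torch.arange(w, device=img.device).view(1, w).repeat(h, 1)
    # Redirect each missing pixel to its nearest valid neighbour
    rows[src[:, 0], src[:, 1]] = tgt[:, 0]
    cols[src[:, 0], src[:, 1]] = tgt[:, 1]
    # Boolean mask of the missing pixels
    mask = torch.zeros(h, w, dtype=torch.bool, device=img.device)
    mask[src[:, 0], src[:, 1]] = True
    gathered = img[..., rows, cols]
    # Take the gathered value where the pixel was missing, the original elsewhere
    return torch.where(mask, gathered, img)

# Tiny illustrative example: pixel (0, 1) is missing and is filled from (0, 0)
img = torch.tensor([[[1., 0.], [3., 4.]]], requires_grad=True)
out = fill_nearest(img, [[0, 1]], [[0, 0]])
out.sum().backward()
print(out)
print(img.grad)
```

The in-place assignments here touch only the integer index maps and the mask, never img itself, so the gradient of each filled pixel is routed back to the valid pixel it was copied from.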

Matrix Sum logic

I am working on a 2D matrix and finding sum of elements, below is my logic:
def calculateSum(a, x, y):
    s = 0
    for i in range(0,x+1):
        for j in range(0,y+1):
            s = s + a[i][j]
    print(s)
    return s
def check(a):
    arr = []
    x = 0
    y = 0
    for i in range(len(a)):
        row = []
        y = 0
        for j in range(len(a[i])):
            row.append(calculateSum(a, x, y))
            y = y + 1
        x = x + 1
        print(row)
check([[1, 2], [3, 4]])
calculateSum is the function that calculates sum of elements.
Now my question is: if the matrix is huge, is there a way to improve the performance of the above program?
Update:
import numpy as np
def calculateSum(a, x, y):
    return np.sum(a[x:,y:])
After switching to numpy I get the error TypeError: list indices must be integers or slices, not tuple.
As the matrix dimensions increase, efficiency will fall. The efficient way to deal with this is to parallelize the task of summing the values, which is possible because addition is associative.
Luckily for you, an optimised implementation of this summation is already available in a library known as numpy.
To get started with numpy, use pip install numpy. For an overview of the library, visit: https://www.geeksforgeeks.org/numpy-in-python-set-1-introduction/
For your question you will need the function numpy.sum().
Edit:
Also, as @Mad Physicist pointed out, numpy has a packed memory layout and its routines are implemented in C, which boosts its speed even further.
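The TypeError in the update comes from indexing a plain Python list with a tuple; converting the input with np.asarray avoids it. Note also that a[x:, y:] selects the lower-right block, whereas the original loops sum the upper-left block a[:x+1, :y+1]. The sketch below goes one step further and computes every such upper-left sum at once with two cumulative sums. The name prefix_sums is made up for this example.

```python
import numpy as np

def prefix_sums(a):
    """Entry [x, y] of the result is the sum of a[0..x][0..y]."""
    arr = np.asarray(a)  # accepts nested lists as well as arrays
    # Cumulative sum down the rows, then across the columns
    return arr.cumsum(axis=0).cumsum(axis=1)

result = prefix_sums([[1, 2], [3, 4]])
print(result.tolist())  # [[1, 3], [4, 10]]
```

This replaces one full summation per cell with a single pass over the matrix per axis, which is where most of the gain for a large matrix would come from.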

Gradient descent cost increases with each iteration in linear regression with one feature

Hi, I am learning some machine learning algorithms and, for the sake of understanding, I was trying to implement linear regression with one feature, using the residual sum of squares as the cost function for gradient descent, as below:
My pseudocode:
while not converged
    w <- w - step*gradient
Python code
Linear.py
import math
import numpy as num
def get_regression_predictions(input_feature, intercept, slope):
    predicted_output = [intercept + xi*slope for xi in input_feature]
    return(predicted_output)
def rss(input_feature, output, intercept,slope):
    return sum( [ ( output.iloc[i] - (intercept + slope*input_feature.iloc[i]) )**2 for i in range(len(output))])
def train(input_feature,output,intercept,slope):
    file = open("train.csv","w")
    file.write("ID,intercept,slope,RSS\n")
    i =0
    while True:
        print("RSS:",rss(input_feature, output, intercept,slope))
        file.write(str(i)+","+str(intercept)+","+str(slope)+","+str(rss(input_feature, output, intercept,slope))+"\n")
        i+=1
        gradient = [derivative(input_feature, output, intercept,slope,n) for n in range(0,2) ]
        step = 0.05
        intercept -= step*gradient[0]
        slope-= step*gradient[1]
    return intercept,slope
def derivative(input_feature, output, intercept,slope,n):
    if n==0:
        return sum( [ -2*(output.iloc[i] - (intercept + slope*input_feature.iloc[i])) for i in range(0,len(output))] )
    return sum( [ -2*(output.iloc[i] - (intercept + slope*input_feature.iloc[i]))*input_feature.iloc[i] for i in range(0,len(output))] )
With the main program:
import Linear as lin
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
df = pd.read_csv("test2.csv")
train = df
lin.train(train["X"],train["Y"], 0, 0)
The test2.csv:
X,Y
0,1
1,3
2,7
3,13
4,21
I recorded the value of RSS in a file and noticed that it became worse at each iteration, as follows:
ID,intercept,slope,RSS
0,0,0,669
1,4.5,14.0,3585.25
2,-7.25,-18.5,19714.3125
3,19.375,58.25,108855.953125
Mathematically I think this doesn't make any sense. I have reviewed my own code many times and I think it is correct. Am I doing something else wrong?
If your cost isn't decreasing, that's usually a sign you're overshooting with your gradient descent approach, meaning too large of a step size.
A smaller step size can help. You can also look into methods for variable step sizes, which can change each iteration to get you nice convergence properties and speed; usually, these methods change the step size with some proportionality to the gradient. Of course, the specifics depend on each problem.
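To make the step-size point concrete, here is a vectorised sketch on the data from test2.csv, run once with the question's step of 0.05 and once with 0.01. The value 0.01, the 2000 iterations and the function names are illustrative assumptions, not taken from the question.

```python
import numpy as np

# Data from test2.csv
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 3.0, 7.0, 13.0, 21.0])

def rss(intercept, slope):
    return float(np.sum((y - (intercept + slope * x)) ** 2))

def gradient_descent(step, n_iter=2000):
    """Fixed-step gradient descent on RSS; returns the fit and the RSS history."""
    intercept, slope = 0.0, 0.0
    history = [rss(intercept, slope)]
    for _ in range(n_iter):
        residual = y - (intercept + slope * x)
        grad_intercept = -2.0 * residual.sum()
        grad_slope = -2.0 * (residual * x).sum()
        intercept -= step * grad_intercept
        slope -= step * grad_slope
        history.append(rss(intercept, slope))
    return intercept, slope, history

# Step from the question: RSS grows at every iteration
_, _, diverging = gradient_descent(step=0.05, n_iter=3)
print(diverging)

# Smaller step: RSS shrinks and the fit settles
intercept, slope, history = gradient_descent(step=0.01)
print(intercept, slope, history[-1])
```

For this particular data set, a fixed step appears to be stable only below roughly 0.03 (two divided by the largest eigenvalue of the RSS Hessian), which would explain why 0.05 overshoots. Dividing the gradient by the number of samples, or rescaling the feature, are other common ways to make a given step size behave.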
