Convolution layer padding difference between pytorch and tensorflow - pytorch

torch.nn.Conv2d(7, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3))
In the above line of code for convolution in PyTorch, what does the padding=(3, 3) parameter do? I would like to apply the same padding in TensorFlow:
tf.keras.layers.Conv2D(7, 64, stride = (2,2), padding="same padding as pytorch")
How can we implement this in TensorFlow?

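Padding adds extra elements (zeros, in this case) on both sides of each spatial dimension of the image. As per the documentation,
padding controls the amount of implicit zero-paddings on both sides
for padding number of points for each dimension.
In TensorFlow, you can use tf.pad to do the same and then run the convolution on the padded tensor with padding="valid". For a batched NHWC image tensor the paddings would be [[0, 0], [3, 3], [3, 3], [0, 0]]. For example, to apply padding=(3, 3) as in PyTorch to a 2-D array, you can use the code below: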
import tensorflow as tf
image = tf.constant([[1, 2, 3], [4, 5, 6]])
paddings = tf.constant([[3, 3,], [3, 3]])
image = tf.pad(image, paddings, "CONSTANT")
print(image)
Output-
tf.Tensor(
[[0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0]
[0 0 0 1 2 3 0 0 0]
[0 0 0 4 5 6 0 0 0]
[0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0]], shape=(8, 9), dtype=int32)
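To see why the built-in padding="same" is not always a drop-in replacement for padding=(3, 3), the following is a minimal sketch of the output-size arithmetic. It assumes dilation 1 and uses the padding rule documented for TensorFlow's "same" mode. The helper names and the input sizes 224 and 225 are hypothetical and chosen only for illustration.

```python
import math


def torch_style_output_size(n, kernel, stride, padding):
    """Output length when `padding` zeros are added on both sides (dilation 1)."""
    return (n + 2 * padding - kernel) // stride + 1


def tf_same_padding(n, kernel, stride):
    """Output length and (before, after) padding for padding='same'."""
    out = math.ceil(n / stride)
    total = max((out - 1) * stride + kernel - n, 0)
    before = total // 2
    after = total - before  # any odd leftover goes at the end
    return out, (before, after)


# Hypothetical 224-pixel input, 7x7 kernel, stride 2, as in the question.
print(torch_style_output_size(224, kernel=7, stride=2, padding=3))  # 112
print(tf_same_padding(224, kernel=7, stride=2))  # (112, (2, 3))
print(tf_same_padding(225, kernel=7, stride=2))  # (113, (3, 3))
```

For the even input size the output shapes agree, but "same" pads 2 zeros before and 3 after, while padding=(3, 3) pads 3 on both sides, so the results can differ at the borders. Padding explicitly with tf.pad avoids that ambiguity.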

Related

Does 1D or 2D array matter while fitting and prediction of a ML model?

I have developed a text classification model where my X_test and X_train are 2D arrays, whereas y_test and y_train are 1D arrays. I did not encounter any error while training, fitting, or predicting with my ML model, but I don't know why I am having trouble generating the ROC score. It says AxisError: axis 1 is out of bounds for array of dimension 1.
I am unable to find a solution for this, so I am curious whether mixing 1D and 2D arrays in an ML model matters, or whether everything should be one or the other: either 1D or 2D.
Can anyone explain this?
Sample code for text classification model(to generate roc score):
from sklearn.metrics import roc_curve, roc_auc_score
r_auc = roc_auc_score(y_test, r_probs, multi_class='OVO')
I had done the following before calculating auroc;
#Prediction probabilities
r_probs = [0 for _ in range(len(y_test))]
rf_probs = RFClass.predict_proba(X_test)
dt_probs = DTClass.predict_proba(X_test)
sgdc_probs = sgdc_model.predict_proba(X_test)
#Probabilities for the positive outcome is kept.
dt_probs = dt_probs[:, 1]
sgdc_probs = sgdc_probs[:, 1]
rf_probs = rf_probs[:, 1]
y_test sample output;
Covid19 - Form
Covid19 - Phone
Covid19 - Email
Covid19 - Email
Covid19 - Phone
r_probs sample output;
[0,
0,
0,
0,
0,
...]
Here is the error;
---------------------------------------------------------------------------
AxisError Traceback (most recent call last)
/tmp/ipykernel_14270/1310904144.py in <module>
4 from sklearn.metrics import roc_curve, roc_auc_score
5
----> 6 r_auc = roc_auc_score(y_test, r_probs, multi_class='OVO')
7 #rf_auc = roc_auc_score(y_test, rf_probs, multi_class='ovr')
8 #dt_auc = roc_auc_score(y_test, dt_probs, multi_class='ovr')
packages/sklearn/metrics/_ranking.py in roc_auc_score(y_true, y_score, average, sample_weight, max_fpr, multi_class, labels)
559 if multi_class == "raise":
560 raise ValueError("multi_class must be in ('ovo', 'ovr')")
--> 561 return _multiclass_roc_auc_score(
562 y_true, y_score, labels, multi_class, average, sample_weight
563 )
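There is a mismatch between the shapes of your y_test and r_probs. For a multiclass target, roc_auc_score expects scores of shape (n_samples, n_classes), such as the full output of predict_proba, but r_probs is 1-D, which is what raises the AxisError. Also, you have set r_probs to all zeros and never updated it. Note that roc_auc_score needs more than one class in the ground truth, and constant predictions carry no ranking information.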
First some background:
Both y_test and the predictions can be 1-D or 2-D, depending on whether you have formulated the task as a binary, multi-class, or multi-label problem. Read more under the y_true and multi_class parameters in the roc_auc_score documentation:
y_true:
True labels or binary label indicators. The binary and multiclass cases expect labels with shape (n_samples,) while the multilabel case expects binary label indicators with shape (n_samples, n_classes).
multi_class:
Only used for multiclass targets. Determines the type of configuration to use. The default value raises an error, so either 'ovr' or 'ovo' must be passed explicitly.
I'd print the shapes of y_test and r_probs just before calling roc_auc_score, to be sure. Below are samples that work for the binary (1-D labels) and multi-label (2-D labels) cases:
binary (1-D) class labels:
import numpy as np
from sklearn.metrics import roc_auc_score
np.random.seed(42)
n = 100
y_test = np.random.randint(0, 2, (n,))
r_probs = np.random.randint(0, 2, (n,))
r_auc = roc_auc_score(y_test, r_probs)
print(f'Shape of y_test: {y_test.shape}')
print(f'Shape of r_probs: {r_probs.shape}')
print(f'y_test: {y_test}')
print(f'r_probs: {r_probs}')
print(f'r_auc: {r_auc}')
Output:
Shape of y_test: (100,)
Shape of r_probs: (100,)
y_test: [0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 0 1 1 1 0 1 0 1 1 1 1 1 1 1 1 0 0 1 1 1 0 1 0 0 0 0 0 1 1 1 1 1 0 1 1 0 1 0 1 0 1 1 0 0 0 0 0 0 0 0 1 1 0 1 1 1 1 0 1 0 1 1 1 0 1 0 1 0 1 0 0 1 0 1 1 1 1 1 1 1 1 1 1 1 0]
r_probs: [0 1 1 1 1 1 1 1 1 0 1 0 1 1 0 1 0 1 1 0 1 0 1 0 0 1 1 0 1 1 1 0 0 0 0 0 0 0 0 0 1 0 1 1 1 0 0 0 0 1 0 0 0 0 0 1 0 1 0 1 0 0 1 1 1 0 1 0 0 1 1 0 0 1 1 1 0 0 0 0 0 0 1 0 0 0 1 0 0 1 0 0 0 0 0 1 1 1 0 0]
r_auc: 0.5073051948051948
multi-label (2-D) class labels:
y_test = np.random.randint(0, 2, (n, 4))
r_probs = np.random.randint(0, 2, (n, 4))
r_auc = roc_auc_score(y_test, r_probs, multi_class='ovr')
print(f'Shape of y_test: {y_test.shape}')
print(f'Shape of r_probs: {r_probs.shape}')
print(f'y_test: {y_test}')
print(f'r_probs: {r_probs}')
print(f'r_auc: {r_auc}')
Output:
Shape of y_test: (100, 4)
Shape of r_probs: (100, 4)
y_test: [[0 1 0 0] [1 0 1 1] ... [1 0 0 0] [0 0 1 1]]
r_probs: [[0 1 1 1] [0 0 0 1] ... [1 1 0 1] [1 1 1 0]]
r_auc: 0.5270015526313198
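The AxisError itself can be reproduced with NumPy alone. This is only a sketch. It assumes that the multiclass code path (the _multiclass_roc_auc_score frame in the traceback) reduces the scores along axis 1, for instance to check that each row of probabilities sums to 1. The sizes and variable names are made up for illustration.

```python
import numpy as np

n_samples, n_classes = 5, 3

# 1-D scores, like the all-zero r_probs list in the question.
scores_1d = np.zeros(n_samples)
try:
    scores_1d.sum(axis=1)  # a 1-D array has no axis 1
    error_message = None
except (ValueError, IndexError) as exc:  # NumPy's AxisError subclasses both
    error_message = str(exc)
print(error_message)

# 2-D scores shaped like the output of predict_proba: one column per class.
rng = np.random.default_rng(0)
scores_2d = rng.random((n_samples, n_classes))
scores_2d /= scores_2d.sum(axis=1, keepdims=True)  # normalise rows to sum to 1
row_sums = scores_2d.sum(axis=1)
print(scores_2d.shape, row_sums)
```

In the question's setting this suggests passing the full predict_proba output (without the [:, 1] slice) when y_test holds more than two classes.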

Confused about keras Dot Layer. How is the Dot product computed?

I have read all the posts about the Dot layer, but none explains how the values, and therefore the output shape, are computed. It seems so standard, though!
How exactly are the values computed along a specific axis?
val = np.random.randint(2, size=(2, 3, 4))
a = K.variable(value=val)
val2 = np.random.randint(2, size=(2, 2, 3))
b = K.variable(value=val)
print("a")
print(val)
print("b")
print(val2)
out = Dot(axes = 2)([a,b])
print(out.shape)
print("DOT")
print(K.eval(out))
I get:
a
[[[0 1 1 1]
[1 1 0 0]
[0 0 1 1]]
[[1 1 1 0]
[0 0 1 0]
[0 1 0 0]]]
b
[[[1 0 1]
[1 0 1]]
[[1 0 1]
[1 1 0]]]
(2, 3, 3)
DOT
[[[ 3. 1. 2.]
[ 1. 2. 0.]
[ 2. 0. 2.]]
[[ 3. 1. 1.]
[ 1. 1. 0.]
[ 1. 0. 1.]]]
I cannot understand with my mathematical and algebraic matrix know-how how the heck this is computed?
Here's how the Dot product works. Internally it is calling K.batch_dot.
First, I think you might have intended to do,
val = np.random.randint(2, size=(2, 3, 4))
a = K.variable(value=val)
val2 = np.random.randint(2, size=(2, 2, 3))
b = K.variable(value=val2) # You have val here
But fortunately, you actually had the following (which may have been your intention all along; I am just pointing it out):
b = K.variable(value=val)
If you had used val2 as above, it would throw an error, because the sizes of the axis you take the dot product over don't match (4 for a versus 3 for b). Moving on,
How dot product is computed
You have
a.shape = (2,3,4)
b.shape = (2,3,4)
First, the dot product is computed separately for each element of the batch dimension, so that dimension stays as it is.
Now you can ignore the first dimension of both a and b and consider two matrices of shape (3,4) and (3,4). Taking the dot product over the last axis pairs every row of the first matrix with every row of the second, which results in a (3,3) matrix. Adding the batch dimension back, you get a
(2, 3, 3) tensor
Let's now take your example. You got,
a
[[[0 1 1 1]
[1 1 0 0]
[0 0 1 1]]
[[1 1 1 0]
[0 0 1 0]
[0 1 0 0]]]
b
[[[0 1 1 1]
[1 1 0 0]
[0 0 1 1]]
[[1 1 1 0]
[0 0 1 0]
[0 1 0 0]]]
Then, for each of the two samples, you take the dot product of every row of a with every row of b:
# 1st sample
[0 1 1 1] . [0 1 1 1]
[1 1 0 0] . [1 1 0 0]
[0 0 1 1] . [0 0 1 1]
# 2nd sample
[1 1 1 0] . [1 1 1 0]
[0 0 1 0] . [0 0 1 0]
[0 1 0 0] . [0 1 0 0]
This gives,
# 1st sample
[3 1 2]
[1 2 0]
[2 0 2]
# 2nd sample
[ 3 1 1]
[ 1 1 0]
[ 1 0 1]
Finally by adding the missing batch dimension you get,
[[[ 3. 1. 2.]
[ 1. 2. 0.]
[ 2. 0. 2.]]
[[ 3. 1. 1.]
[ 1. 1. 0.]
[ 1. 0. 1.]]]
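The same computation can be checked in plain NumPy. The sketch below is not what Keras runs internally. It only reproduces the arithmetic described above with np.einsum, using the values of a printed in the question, with b set equal to a as happened there.

```python
import numpy as np

a = np.array([[[0, 1, 1, 1],
               [1, 1, 0, 0],
               [0, 0, 1, 1]],
              [[1, 1, 1, 0],
               [0, 0, 1, 0],
               [0, 1, 0, 0]]])
b = a.copy()  # in the question, b was built from the same values as a

# For each batch element, dot every row of a with every row of b
# by contracting over the last axis (axis 2) of both tensors.
out = np.einsum("bik,bjk->bij", a, b)

# Equivalent formulation: batched matrix product with b transposed per sample.
same = a @ b.transpose(0, 2, 1)

print(out.shape)  # (2, 3, 3)
print(out)
```

Entry out[n, i, j] is the dot product of row i of a[n] with row j of b[n], which is why the result is 3x3 per sample rather than a single vector of three numbers.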

Set to 0 x% of non zero values in numpy 2d array

I tried different ways but it seems impossible for me to do it efficiently without looping through.
Input is an array y and a percentage x.
e.g. input is
y=np.random.binomial(1,1,[10,10])
x=0.5
output
[[0 0 0 0 1 1 1 1 0 1]
[1 0 1 0 0 1 0 1 0 1]
[1 0 1 1 1 1 0 0 0 1]
[0 1 0 1 1 0 1 0 1 1]
[0 1 1 0 0 1 1 1 0 0]
[0 0 1 1 1 0 1 1 0 1]
[0 1 0 0 0 0 1 0 1 1]
[0 0 0 1 1 1 1 1 0 0]
[0 1 1 1 1 0 0 1 0 0]
[1 0 1 0 1 0 0 0 0 0]]
Here's one based on masking -
import numpy as np

def set_nonzeros_to_zeros(a, setz_ratio):
    nz_mask = a!=0                                  # True where a is non-zero
    nz_count = nz_mask.sum()
    z_set_count = int(np.round(setz_ratio*nz_count))
    # Pick which of the non-zero entries get zeroed
    idx = np.random.choice(nz_count,z_set_count,replace=False)
    mask0 = np.ones(nz_count,dtype=bool)
    mask0.flat[idx] = 0
    nz_mask[nz_mask] = mask0                        # unmark the chosen entries
    a[~nz_mask] = 0                                 # modifies a in place
    return a
We skip generating all the indices with np.argwhere/np.nonzero in favor of a mask-based approach, for performance.
Sample run -
In [154]: np.random.seed(0)
...: a = np.random.randint(0,3,(5000,5000))
# number of non-0s before using solution
In [155]: (a!=0).sum()
Out[155]: 16670017
In [156]: a_out = set_nonzeros_to_zeros(a, setz_ratio=0.2) #set 20% of non-0s to 0s
# number of non-0s after using solution
In [157]: (a_out!=0).sum()
Out[157]: 13336014
# Verify
In [158]: 16670017 - 0.2*16670017
Out[158]: 13336013.6
There are a few vectorized methods that might help you, depending on what you want to do:
# Flatten the 2D array and get the indices of the non-zero elements
c = y.flatten()
d = c.nonzero()[0]
# Shuffle the indices and set the first 100x % to zero
np.random.shuffle(d)
x = 0.5
c[d[:int(x*len(d))]] = 0
# reshape to the original 2D shape
y = c.reshape(y.shape)
No doubt there are some efficiency improvements to be made here.
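As one possible tidy-up of the flatten-and-shuffle idea, here is a sketch that samples the indices directly and leaves the input untouched. The function name and the seed parameter are hypothetical and not part of either answer.

```python
import numpy as np


def zero_fraction_of_nonzeros(y, x, seed=None):
    """Return a copy of y with a fraction x of its non-zero entries set to 0."""
    rng = np.random.default_rng(seed)
    out = y.copy()
    nz = np.flatnonzero(out)  # flat indices of the non-zero entries
    k = int(round(x * nz.size))  # how many of them to zero
    chosen = rng.choice(nz, size=k, replace=False)
    out.flat[chosen] = 0
    return out


y = np.ones((10, 10), dtype=int)
result = zero_fraction_of_nonzeros(y, x=0.5, seed=0)
print(np.count_nonzero(y), np.count_nonzero(result))  # 100 50
```

Sampling without replacement guarantees that exactly k distinct non-zero entries are cleared, so the remaining count is deterministic even though the positions are random.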

Create a new large matrix by stacking in its diagonal K matrices

I have K (let K here be 7) distinct matrices of dimension (50,50).
I would like to create a new matrix L by placing the K matrices along its diagonal. Hence L is of dimension (50*K,50*K).
What have I tried?
K1=np.random.random((50,50))
N,N=K1.shape
K=7
out=np.zeros((K,N,K,N),K1.dtype)
np.einsum('ijik->ijk', out)[...] = K1
L=out.reshape(K*N, K*N) # L is of dimension (50*7,50*7)=(350,350)
It does create a new matrix L by stacking K1 seven times along its diagonal. However, I would like to stack K1, K2, K3, K4, K5, K6, K7 respectively, rather than K1 seven times.
Inputs :
K1=np.random.random((50,50))
K2=np.random.random((50,50))
K3=np.random.random((50,50))
K4=np.random.random((50,50))
K5=np.random.random((50,50))
K6=np.random.random((50,50))
K7=np.random.random((50,50))
L=np.zeros((50*7,50*7))#
Expected outputs :
L[:50,:50]=K1
L[50:100,50:100]=K2
L[100:150,100:150]=K3
L[150:200,150:200]=K4
L[200:250,200:250]=K5
L[250:300,250:300]=K6
L[300:350,300:350]=K7
You could try scipy.linalg.block_diag. If you look at the source, this function basically just loops over the given blocks the way you have written as your output. It can be used like:
import numpy as np
import scipy as sp
import scipy.linalg  # makes sp.linalg available

K1=np.random.random((50,50))
K2=np.random.random((50,50))
K3=np.random.random((50,50))
K4=np.random.random((50,50))
K5=np.random.random((50,50))
K6=np.random.random((50,50))
K7=np.random.random((50,50))
L=sp.linalg.block_diag(K1,K2,K3,K4,K5,K6,K7)
If you have your K as an ndarray of shape (7,50,50), you can unpack it directly like:
K=np.random.random((7,50,50))
L=sp.linalg.block_diag(*K)
If you don't want to import scipy, you can always just write a simple loop to do what you have written for the expected output.
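The simple loop mentioned above could look like the following sketch, which uses only NumPy. The function name block_diag_loop is hypothetical. It mirrors the slice assignments listed in the question's expected output.

```python
import numpy as np


def block_diag_loop(blocks):
    """Place 2-D blocks along the diagonal of a zero matrix, one slice at a time."""
    blocks = [np.asarray(b) for b in blocks]
    rows = sum(b.shape[0] for b in blocks)
    cols = sum(b.shape[1] for b in blocks)
    out = np.zeros((rows, cols), dtype=np.result_type(*blocks))
    r = c = 0
    for b in blocks:
        h, w = b.shape
        out[r:r + h, c:c + w] = b  # e.g. L[50:100, 50:100] = K2
        r += h
        c += w
    return out


K = np.random.random((7, 50, 50))
L = block_diag_loop(K)
print(L.shape)  # (350, 350)
```

Iterating over a (7, 50, 50) array yields the seven (50, 50) blocks in order, so a list such as [K1, K2, ..., K7] works the same way.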
Here is a way to do that with NumPy:
import numpy as np

def put_in_diagonals(a):
    n, rows, cols = a.shape
    b = np.zeros((n * rows, n * cols), dtype=a.dtype)
    a2 = a.reshape(-1, cols)
    ii, jj = np.indices(a2.shape)
    jj += (ii // rows) * cols
    b[ii, jj] = a2
    return b
# Test
a = np.arange(24).reshape(4, 2, 3)
print(put_in_diagonals(a))
Output:
[[ 0 1 2 0 0 0 0 0 0 0 0 0]
[ 3 4 5 0 0 0 0 0 0 0 0 0]
[ 0 0 0 6 7 8 0 0 0 0 0 0]
[ 0 0 0 9 10 11 0 0 0 0 0 0]
[ 0 0 0 0 0 0 12 13 14 0 0 0]
[ 0 0 0 0 0 0 15 16 17 0 0 0]
[ 0 0 0 0 0 0 0 0 0 18 19 20]
[ 0 0 0 0 0 0 0 0 0 21 22 23]]

pandas misreads lines in file [duplicate]

I'm trying to read the following file with pandas using python 3.6:
$ cat tmp2.txt
somename nan 0 0 1 0 0 1 11 0.909091 0 0 1 0 0 7 1 1 0 0 0 0 2
somename nan 0 0 1 0 0 1 36 0.972222 0 0 7 0 5 22 0 6 1 0 0 0 2
somename UgzVrvH-ahjgfT9-NfN4AaABAg.8e3_FgQnopN8e4FLHwai7v0 0 1 0 0 0 25 0.920000 0 0 0 0 2 22 0 1 0 0 0 0 0
somename UgxyXxibolL_qOhMsyZ4AaABAg.8eApKy29u5J8eAxINbTH2m0 0 1 0 0 0 13 1.000000 0 0 0 0 1 10 0 2 0 0 0 0 0
somename nan 0 0 0 0 0 2 56 0.839286 0 0 0 0 11 14 5 7 3 0 3 1 10
When I try reading it with pandas :
>>> import pandas as pd
>>> df = pd.read_csv(header=None, filepath_or_buffer="tmp2.txt", delim_whitespace=True, index_col=0)
>>> df.values[2,:]
array(['UgzVrvH-ahjgfT9-NfN4AaABAg.8e3_FgQnopN8e4FLHwai7v0', 0, 1, 0, 0,
0, 25, 0.92, 0.0, 0, 0, 0, 2, 22, 0, 1, 0, 0, 0, 0, 0, nan],
dtype=object)
>>> df.values[3,:]
array(['UgxyXxibolL_qOhMsyZ4AaABAg.8eApKy29u5J8eAxINbTH2m0', 0, 1, 0, 0,
0, 13, 1.0, 0.0, 0, 0, 0, 1, 10, 0, 2, 0, 0, 0, 0, 0, nan],
dtype=object)
>>> df.values[4,:]
array([nan, 0, 0, 0, 0, 0, 2, 56.0, 0.8392860000000001, 0, 0, 0, 0, 11,
14, 5, 7, 3, 0, 3, 1, 10.0], dtype=object)
As can be seen when I print df.values[2,:] and df.values[3,:] I get an extraneous nan at the end. It seems like this might be an issue with there being a maximum number of characters per line, but the man page for pandas.read_csv does not contain any mention of that.
QUESTION : What causes this and how can I get pandas.read_csv to correctly read this file?
It's similar to this: python pandas - trailing delimiter confuses read_csv
Your input data has trailing delimiters on some or all of the lines. Two easy fixes are to set usecols in read_csv(), or after reading do something like this:
if df[df.columns[-1]].isnull().all():
    df.drop(df.columns[-1], axis=1, inplace=True)
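Both fixes can be tried on a small in-memory sample. The sketch below uses a made-up comma-separated snippet with a trailing delimiter on every line, not the whitespace-separated file from the question, so treat it as an illustration of the two options only.

```python
import io

import pandas as pd

# Hypothetical data: each line ends with a trailing delimiter.
text = "a,1,2,\nb,3,4,\n"

# Without any fix, the trailing delimiter produces an extra all-NaN column.
df_raw = pd.read_csv(io.StringIO(text), header=None)
print(df_raw.shape)  # (2, 4)

# Fix 1: read only the columns that carry data.
df_cols = pd.read_csv(io.StringIO(text), header=None, usecols=[0, 1, 2])

# Fix 2: drop the last column after reading if it is entirely NaN.
df_dropped = df_raw.copy()
if df_dropped[df_dropped.columns[-1]].isnull().all():
    df_dropped = df_dropped.drop(columns=df_dropped.columns[-1])

print(df_cols.shape, df_dropped.shape)  # (2, 3) (2, 3)
```

usecols needs the number of real columns to be known in advance, while the drop-after-read check adapts to the file but only helps when the extra column is empty on every line.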
