What is the difference between nn.init.xavier_uniform and nn.init.xavier_uniform_ when initialising weights?
The _ convention in nn.init.xavier_uniform_ is PyTorch's way of doing an operation in place. This convention applies to many of its functions.
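For instance, a minimal sketch of the difference (the layer size here is arbitrary):

import torch.nn as nn

layer = nn.Linear(128, 64)

# Trailing underscore: fills layer.weight in place (and also returns it).
nn.init.xavier_uniform_(layer.weight)

# Without the underscore (older name): same effect, but it now just forwards
# to the in-place version and emits a deprecation warning.
# nn.init.xavier_uniform(layer.weight)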
In the latest PyTorch version they are both the same; see this for more detail.
In short, here is the relevant docstring:
Signature: nn.init.xavier_normal(*args, **kwargs)
Docstring:
xavier_normal(...)
.. warning::
This method is now deprecated in favor of :func:`torch.nn.init.xavier_normal_`.
See :func:`~torch.nn.init.xavier_normal_` for details.
Hence both are the same.
It would appear that they are the same. If you look at the source code, you can see that xavier_uniform is defined as xavier_uniform_, but with deprecation warnings added.
I can't speak to why this change was made.
I am studying some source code from PyTorch Geometric.
I have been searching Google for "from torch_sparse import SparseTensor" to figure out how to use SparseTensor.
But I cannot find any explanation. I saw many documents about COO, CSR and the like, but how can I use SparseTensor?
I read https://pytorch.org/docs/stable/sparse.html# but there is nothing like SparseTensor there.
Thank you in advance :)
I just had the same problem and stumbled upon your question, so I will just detail what I did here, maybe it helps someone. I think the main confusion results from the naming of the package. SparseTensor is from torch_sparse, but you posted the documentation of torch.sparse. The first is an individual project in the PyTorch ecosystem and part of the foundation of PyTorch Geometric, while the latter is a submodule of the actual official PyTorch package.
So, looking at the right package (torch_sparse), there is not much information about how to use the SparseTensor class there (Link).
If we look at the source code on the other hand (Link), you can see that the class has a bunch of classmethods that you can use to generate your own SparseTensor from well-documented PyTorch classes.
In my case, all I needed was a way to feed the RGCNConvLayer with just one Tensor including both the edges and edge types, so I put them together with the following line:
edge_index = SparseTensor.from_edge_index(edge_index, edge_types)
If you, however, already have a COO or CSR Tensor, you can use the appropriate classmethods instead.
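To make that concrete, here is a small sketch of a few of those classmethods; the graph, shapes and values are made up purely for illustration:

import torch
from torch_sparse import SparseTensor

# A toy graph with 3 nodes and 4 directed edges in COO layout.
edge_index = torch.tensor([[0, 1, 1, 2],
                           [1, 0, 2, 1]])
edge_types = torch.tensor([0, 0, 1, 1])

# Build a SparseTensor from an edge_index; the edge types become its values.
adj = SparseTensor.from_edge_index(edge_index, edge_types, sparse_sizes=(3, 3))

# Or start from a dense matrix / a torch.sparse COO tensor instead.
dense = torch.eye(3)
adj_from_dense = SparseTensor.from_dense(dense)
adj_from_coo = SparseTensor.from_torch_sparse_coo_tensor(dense.to_sparse())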
I am concerned whether torch.solve() examines the condition of the coefficient matrix for a linear system and employs desirable preconditioning; thus I am curious about its implementation details. I have read through several answers trying to track down the source file, but in vain. I hope somebody can help me locate its definition in the ATen library.
I think it just uses LAPACK for CPU and CUBLAS for GPU, since torch.solve is listed under "BLAS and LAPACK Operations" on the official docs.
Then we're looking for wrapper code, which I believe is this part.
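For completeness, a minimal usage sketch; note this assumes the old torch.solve signature, which did a direct LU-based solve and has since been replaced by torch.linalg.solve:

import torch

A = torch.randn(4, 4)
B = torch.randn(4, 2)

# Old API (now removed): returned the solution X of A @ X = B plus the LU factors.
# X, LU = torch.solve(B, A)

# Current equivalent:
X = torch.linalg.solve(A, B)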
After using GridSearchCV, is there any way to find out if StratifiedKFold was really used instead of KFold?
As an estimator I used SVC (Support Vector Machine) with cv=10.
I know that the documentation (scikit-learn Version 0.21.3) says that StratifiedKFold is actually used in this case. I, however, suspect that this may not have been the case.
Many thanks for your help.
If you are unsure, you can always go into the GitHub repo and read the code. Take a look here, where the function is defined.
Also, this exact line has your answer: yes, StratifiedKFold is used.
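If you want to double-check it yourself without reading the source, sklearn exposes check_cv, which (as far as I know) is the same helper GridSearchCV uses internally to resolve an integer cv:

import numpy as np
from sklearn.model_selection import check_cv

y = np.array([0, 1] * 10)  # toy classification target

# classifier=True mirrors what GridSearchCV passes for an SVC estimator.
print(check_cv(cv=10, y=y, classifier=True))   # StratifiedKFold(n_splits=10, ...)
print(check_cv(cv=10, y=y, classifier=False))  # KFold(n_splits=10, ...)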
I am trying to do research on batch normalization, and had to make some modifications to the PyTorch BN code. I dug into the PyTorch code and got stuck at torch.nn.functional.batch_norm, which references torch.batch_norm.
The problem is that torch.batch_norm cannot be further found in the torch library. Is there any way I can find the source code of this built-in function and re-implement it? Thanks!
It's there, but it's not defined in Python; it's defined in C++ in the aten/ directories.
For CPU, the implementation (one of them, it depends on whether or not the input is contiguous) is here: https://github.com/pytorch/pytorch/blob/420b37f3c67950ed93cd8aa7a12e673fcfc5567b/aten/src/ATen/native/Normalization.cpp#L61-L126
For CUDA, the implementation is here: https://github.com/pytorch/pytorch/blob/7aae51cdedcbf0df5a7a8bf50a947237ac4b3ee8/aten/src/ATen/native/cudnn/BatchNorm.cpp#L52-L143
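If the goal is just to experiment with modifications, it may be easier to start from a pure-Python approximation of the training-mode forward pass rather than the C++ kernels. This is my own sketch of what torch.nn.functional.batch_norm computes for a 4-D input, not the actual ATen code:

import torch

def batch_norm_2d(x, weight, bias, running_mean, running_var,
                  momentum=0.1, eps=1e-5):
    # x: (N, C, H, W); statistics are computed per channel.
    dims = (0, 2, 3)
    mean = x.mean(dim=dims)
    var = x.var(dim=dims, unbiased=False)  # biased variance is used for normalization

    # Running stats are updated with the unbiased variance, as PyTorch does.
    n = x.numel() / x.size(1)
    with torch.no_grad():
        running_mean.mul_(1 - momentum).add_(momentum * mean)
        running_var.mul_(1 - momentum).add_(momentum * var * n / (n - 1))

    x_hat = (x - mean[None, :, None, None]) / torch.sqrt(var[None, :, None, None] + eps)
    return x_hat * weight[None, :, None, None] + bias[None, :, None, None]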
I wanted to see how the conv1d module is implemented
https://pytorch.org/docs/stable/_modules/torch/nn/modules/conv.html#Conv1d. So I looked at functional.py but still couldn’t find the looping and cross-correlation computation.
Then I searched GitHub for the keyword 'conv1d' and checked conv.cpp https://github.com/pytorch/pytorch/blob/eb5d28ecefb9d78d4fff5fac099e70e5eb3fbe2e/torch/csrc/api/src/nn/modules/conv.cpp but still couldn't locate where the computation is happening.
My question is two-fold.
Where is the source code in which conv1d is implemented?
In general, if I want to check how the modules are implemented, where is the best place to look? Any pointer to the documentation would be appreciated. Thank you.
It depends on the backend (GPU, CPU, distributed, etc.) but in the most interesting case of GPU it's pulled from cuDNN, which is released in binary format, and thus you can't inspect its source code. It's a similar story for CPU MKLDNN. I am not aware of any place where PyTorch would "handroll" its own convolution kernels, but I may be wrong. EDIT: indeed, I was wrong, as pointed out in an answer below.
It's difficult without knowing how PyTorch is structured. A lot of code is actually being autogenerated based on various markup files, as explained here. Figuring this out requires a lot of jumping around. For instance, the conv.cpp file you're linking uses torch::conv1d, which is defined here and uses at::convolution which in turn uses at::_convolution, which dispatches to multiple variants, for instance at::cudnn_convolution. at::cudnn_convolution is, I believe, created here via a markup file and just plugs in directly to cuDNN implementation (though I cannot pinpoint the exact point in code when that happens).
Below is an answer that I got from pytorch discussion board:
I believe the "handroll"-ed convolution is defined here: https://github.com/pytorch/pytorch/blob/master/aten/src/THNN/generic/SpatialConvolutionMM.c
The NN module implementations are here: https://github.com/pytorch/pytorch/tree/master/aten/src
The GPU version is in THCUNN and the CPU version in THNN.
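If you only want to see the arithmetic rather than the optimized kernels, the operation itself (a cross-correlation) is easy to write out by hand. This is an illustrative sketch with stride 1 and no padding, not the code PyTorch actually ships:

import torch
import torch.nn.functional as F

def naive_conv1d(x, weight, bias=None):
    # x: (N, C_in, L), weight: (C_out, C_in, K)
    N, C_in, L = x.shape
    C_out, _, K = weight.shape
    out = torch.zeros(N, C_out, L - K + 1)
    for n in range(N):
        for co in range(C_out):
            for t in range(L - K + 1):
                # Slide the kernel without flipping it (cross-correlation).
                out[n, co, t] = (x[n, :, t:t + K] * weight[co]).sum()
            if bias is not None:
                out[n, co] += bias[co]
    return out

# Quick sanity check against the library implementation.
x = torch.randn(2, 3, 10)
w = torch.randn(4, 3, 5)
print(torch.allclose(naive_conv1d(x, w), F.conv1d(x, w), atol=1e-5))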