How to compile GPT-Neo without HuggingFace files? - pytorch

The only simple way to get GPT-Neo running nowadays seems to require the HuggingFace Hub, which pulls a fully converted model from the hub instead of building it from the official weights at "the-eye.eu/Ai". Since the pulled data (and library) is a black box to me, I cannot be sure what else is actually in it.
Therefore I'd prefer to build the model myself from the official pretrained weights (https://the-eye.eu/public/AI/models).
Unfortunately I haven't found any resources yet on how to do that, but the AI community would benefit from a lot more transparency in the process and more trust in EleutherAI.
Could anyone explain the lines of code required to create the original model (and a tokenizer) from the uploaded weights?
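Not a full answer, but a sketch of one possible route: as far as I know, the transformers library itself ships a conversion script for the original mesh-tensorflow GPT-Neo checkpoints, so you could download the weights from the-eye.eu yourself and convert them locally instead of pulling anything from the hub. The script name, flags, and checkpoint paths below are assumptions from memory, so verify them against your installed transformers version:

# Assumed script name and flags from the transformers source tree; verify locally.
# "./GPT3_XL" is a placeholder for wherever you downloaded the official checkpoint.
python -m transformers.models.gpt_neo.convert_gpt_neo_mesh_tf_to_pytorch \
    --tf_checkpoint_path ./GPT3_XL/checkpoint \
    --config_file ./GPT3_XL/config.json \
    --pytorch_dump_path ./gpt-neo-local

Once the converted weights and config sit in a local directory, from_pretrained accepts a local path and never contacts the hub. GPT-Neo reuses the GPT-2 tokenizer, so this assumes the GPT-2 vocab.json and merges.txt are also placed in that directory:

from transformers import GPTNeoForCausalLM, GPT2Tokenizer

# "./gpt-neo-local" is the placeholder output directory from the conversion step above.
model = GPTNeoForCausalLM.from_pretrained("./gpt-neo-local")
tokenizer = GPT2Tokenizer.from_pretrained("./gpt-neo-local")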

Related

Does nn.Transformer in PyTorch include the PositionalEncoding() process so far?

I was trying to solve a seq2seq problem with the Transformer module. According to my understanding of the source code on PyTorch's GitHub, I thought PositionalEncoding() is not yet included (correct me if I'm wrong). But I saw so many code examples (including one written by one of my tutors) using the default nn.Transformer to model seq2seq problems, which really confused me.
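For what it's worth, that reading matches the official PyTorch seq2seq tutorial: nn.Transformer only stacks the encoder/decoder layers, and embeddings plus positional encoding are left to you, which is why the tutorial defines its own module. A minimal sketch of the usual sinusoidal encoding (shapes assume the default seq-first layout, batch_first=False):

import math
import torch
import torch.nn as nn

class PositionalEncoding(nn.Module):
    # Sinusoidal positional encoding, as in the PyTorch seq2seq tutorial.
    def __init__(self, d_model, dropout=0.1, max_len=5000):
        super().__init__()
        self.dropout = nn.Dropout(p=dropout)
        position = torch.arange(max_len).unsqueeze(1)
        div_term = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))
        pe = torch.zeros(max_len, 1, d_model)
        pe[:, 0, 0::2] = torch.sin(position * div_term)
        pe[:, 0, 1::2] = torch.cos(position * div_term)
        self.register_buffer("pe", pe)

    def forward(self, x):
        # x: (seq_len, batch, d_model); broadcast the encoding over the batch dim.
        x = x + self.pe[: x.size(0)]
        return self.dropout(x)

So the usual pipeline is embedding -> PositionalEncoding -> nn.Transformer, with the encoding applied by your own code.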

'Doc2Vec' object has no attribute 'neg_labels' when trying to use pretrained model

So I'm trying to use a pretrained Doc2Vec model for my semantic search project. I tried this one, https://github.com/jhlau/doc2vec (English Wikipedia DBOW), with the forked version of Gensim (0.12.4) and Python 2.7.
It works fine when I use most_similar, but when I try to use infer_vector I get this error:
AttributeError: 'Doc2Vec' object has no attribute 'neg_labels'
What can I do to make this work?
For reasons given in this other answer, I'd recommend against using a many-years-old custom fork of Gensim. Those particular pre-trained models also look a little fishy: their sizes seem too small to actually contain all the purported per-article vectors.
But also: that error resembles a very-old bug which only showed up if Gensim was not fully installed to have the necessary Cython-optimized routines for fast training/inference operations. (That caused some older, seldom-run code to be run that had a dependency on the missing neg_labels. Newer versions of Gensim have eliminated that slow code-path entirely.)
My comment on an old Gensim issue has more details, and a workaround that might help - but really, the much better thing to do for quality results & speedy code is to use a current Gensim, & train your own model.
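For the "train your own model" route, a minimal sketch with a current Gensim (4.x API; the corpus and parameters here are placeholders):

from gensim.models.doc2vec import Doc2Vec, TaggedDocument

# Toy corpus; replace with your own tokenized documents.
corpus = [
    TaggedDocument(words=["machine", "learning", "is", "fun"], tags=[0]),
    TaggedDocument(words=["semantic", "search", "with", "doc2vec"], tags=[1]),
]

# DBOW mode (dm=0), mirroring the pre-trained models linked above.
model = Doc2Vec(corpus, dm=0, vector_size=100, window=5, min_count=1, epochs=20)

# infer_vector works on a current Gensim, where the neg_labels code path is gone.
vec = model.infer_vector(["semantic", "search", "example"])
sims = model.dv.most_similar([vec])  # model.dv holds the per-document vectors in Gensim 4.x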

Pytorch-Forecasting N-Beats model with SELU() activation function?

I am working on time series forecasting, and I've been using the PyTorch library pytorch-forecasting lately. If you don't know it, try it. It's great.
I am interested in the SELU activation function for Self-Normalizing Networks (SNNs; see, e.g., the docs). As I didn't find any N-Beats implementation adapted to use SELU and its requirements (i.e. AlphaDropout and proper weight initialization), I made an implementation myself.
It would be great if any of you with experience with these concepts - the N-Beats architecture, pytorch-forecasting, or SELU() - could review whether everything is right in my implementation.
My implementation here: https://gist.github.com/pnmartinez/fef1f488497fa85a2cc1626af2a5b4bd
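For reviewers skimming this: the SELU requirements mentioned above boil down to three things - SELU activations, AlphaDropout instead of regular Dropout, and LeCun-normal weight initialization. A minimal sketch of such a block in plain PyTorch (layer sizes are arbitrary; this is not the gist's code):

import torch.nn as nn

def selu_init(module):
    # LeCun-normal init: kaiming_normal_ with nonlinearity='linear' gives std = 1/sqrt(fan_in).
    if isinstance(module, nn.Linear):
        nn.init.kaiming_normal_(module.weight, mode="fan_in", nonlinearity="linear")
        nn.init.zeros_(module.bias)

block = nn.Sequential(
    nn.Linear(64, 64),
    nn.SELU(),                # self-normalizing activation
    nn.AlphaDropout(p=0.1),   # dropout variant that preserves SELU's self-normalizing property
    nn.Linear(64, 64),
    nn.SELU(),
)
block.apply(selu_init)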

How to hide or encrypt my own keras model file(like h5) when deploying?

I made my own model for an application and saved it in Keras as an .h5 file. I also made a GUI application using PyQt5, and this application uses the model. I'm trying to deploy this application without exposing any information about the deep learning model.
I have some questions about this situation.
Can I hide or encrypt my model to prevent exposure of its architecture and weights?
If Keras doesn't support encrypting models, are there any other libraries (like PyTorch) that support this?
I'm looking forward to hearing any advice. Thank you for your answer.
Model encryption is not officially part of either Keras or PyTorch.
I think Python itself is a big problem if you want to hide something. AFAIK it's not really possible to hide your solution well enough using it. I will outline what I would do to "protect" the model (the steps are quite lengthy, so make sure you really need this protection, and decide exactly what level of protection you need).
Provided Python solutions
There is PySyft, which handles both PyTorch and Keras, but it's meant for Secure Multi-Party Computation. As users have access to your Python code (you mentioned PyQt5) and all the sensitive data (the model, in this case), they would be able to recover it quite easily.
Possible solution
If I were you I would go for a simple password-protected archive (AES or .zip). For the AES case I've found this post and the related TFSecured repository, which does AES encryption of a TensorFlow model via Python and lets you load the saved encrypted protobuf model file in C++ (which should be your way to go; reasons below).
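If you only want the Python side of that idea (encrypt the .h5 before shipping, decrypt into memory at startup), here is a minimal sketch using the third-party cryptography package. All file names are illustrative, and note the key still has to live somewhere the user can reach, so this is obfuscation rather than real security:

import io
import h5py
from cryptography.fernet import Fernet  # pip install cryptography

# One-time: generate and store a key (ideally NOT next to the shipped app).
key = Fernet.generate_key()
f = Fernet(key)

# Before distribution: encrypt the saved model file.
with open("model.h5", "rb") as src:
    with open("model.h5.enc", "wb") as dst:
        dst.write(f.encrypt(src.read()))

# At runtime: decrypt back to bytes and load without touching disk.
with open("model.h5.enc", "rb") as src:
    raw = f.decrypt(src.read())

# Keras load_model also accepts an h5py.File object (check your Keras version).
from tensorflow import keras
model = keras.models.load_model(h5py.File(io.BytesIO(raw), "r"))

Anyone who can read the process memory or find the bundled key can still recover the model, which is why the rest of this answer pushes toward C++.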
Leave Python alone
If you want to seriously secure your model (not just apply some mere obfuscation), you shouldn't run Python on the user's side at all.
There is no reliable way to compile Python code into an uninspectable binary, especially code using heavy ML libraries like Keras, TensorFlow or PyTorch. Although there are programs like PyInstaller, it's notoriously hard to make them work with complex dependencies. Even if you do, users will still be able to get to the code, albeit it might be a little harder (PyInstaller just bundles Python, your dependencies and your app into a single archive which is later unzipped).
You could further obfuscate the code using pyarmor or similar tools, but it's all quite easily reversible if someone is determined.
C++
Whether you go for keras/tensorflow or pytorch you can go lower level and use C++ to load your network.
As it is a compiled language, all you have to do is provide a binary file (if linking statically), or a binary file plus shared libraries. Inside the C++ source code you keep your AES/zip key, as shown in the blog post about TFSecured:
#include <GraphDefDecryptor.hpp>

........

tensorflow::GraphDef graph;

// Decryption:
const std::string key = "JHEW8F7FE6F8E76W8F687WE6F8W8EF5";
auto status = tfsecured::GraphDefDecryptAES(path,   // path to the *.pb file (encrypted graph)
                                            graph,
                                            key);   // your key
if (!status.ok()) {
    std::cout << status.error_message() << std::endl;
    return;
}

// Create session (options declared here for completeness):
tensorflow::SessionOptions options;
std::unique_ptr<tensorflow::Session> session(tensorflow::NewSession(options));
status = session->Create(graph);
It would be much harder to reverse engineer compiled C++ code to get to the key buried inside. A similar procedure could be done for PyTorch as well via some third-party tools/libraries. On the other hand, you would have to rewrite your PyQt5 app in C++ with Qt5.
I just feel you need one compiled model file where your client isn't aware of the model architecture.
In that case you can have a look at this.
It supports all frameworks, and still gives you the convenience of using Python.
Have a look here.

pytorch - Where is “conv1d” implemented?

I wanted to see how the conv1d module is implemented: https://pytorch.org/docs/stable/_modules/torch/nn/modules/conv.html#Conv1d. So I looked at functional.py but still couldn't find the looping and cross-correlation computation.
Then I searched GitHub by the keyword 'conv1d' and checked conv.cpp (https://github.com/pytorch/pytorch/blob/eb5d28ecefb9d78d4fff5fac099e70e5eb3fbe2e/torch/csrc/api/src/nn/modules/conv.cpp), but still couldn't locate where the computation is happening.
My question is two-fold.
Where is the source code in which "conv1d" is implemented?
In general, if I want to check how the modules are implemented, where is the best place to find? Any pointer to the documentation will be appreciated. Thank you.
It depends on the backend (GPU, CPU, distributed, etc.), but in the most interesting case of GPU it's pulled from cuDNN, which is released in binary format, so you can't inspect its source code. It's a similar story for MKL-DNN on CPU. I was not aware of any place where PyTorch would "handroll" its own convolution kernels, but I may be wrong. EDIT: indeed, I was wrong, as pointed out in an answer below.
It's difficult without knowing how PyTorch is structured. A lot of code is actually being autogenerated based on various markup files, as explained here. Figuring this out requires a lot of jumping around. For instance, the conv.cpp file you're linking uses torch::conv1d, which is defined here and uses at::convolution which in turn uses at::_convolution, which dispatches to multiple variants, for instance at::cudnn_convolution. at::cudnn_convolution is, I believe, created here via a markup file and just plugs in directly to cuDNN implementation (though I cannot pinpoint the exact point in code when that happens).
Below is an answer that I got from the PyTorch discussion board:
I believe the "handrolled" convolution is defined here: https://github.com/pytorch/pytorch/blob/master/aten/src/THNN/generic/SpatialConvolutionMM.c
The NN module implementations are here: https://github.com/pytorch/pytorch/tree/master/aten/src
The GPU version is in THCUNN and the CPU version in THNN
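If the goal is just to see the looping and cross-correlation spelled out, rather than chase it through the dispatcher, it's easy to write the naive version in Python and check it against F.conv1d. A sketch, restricted to stride 1, no padding, no dilation, no bias:

import torch
import torch.nn.functional as F

def naive_conv1d(x, w):
    # x: (batch, in_ch, length); w: (out_ch, in_ch, k). Stride 1, no padding.
    batch, in_ch, length = x.shape
    out_ch, _, k = w.shape
    out = torch.zeros(batch, out_ch, length - k + 1)
    for b in range(batch):
        for o in range(out_ch):
            for t in range(length - k + 1):
                # Cross-correlation: the kernel is not flipped, unlike textbook convolution.
                out[b, o, t] = (x[b, :, t:t + k] * w[o]).sum()
    return out

x = torch.randn(2, 3, 10)
w = torch.randn(4, 3, 5)
print(torch.allclose(naive_conv1d(x, w), F.conv1d(x, w), atol=1e-5))  # True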
