pytorch initialize two sub-modules with same weights? - pytorch

I was wondering if there's a typical way to initialize two parts of one network to same some initial values? Say I got two separate auto-encoders for both query and document, and I would like to initialize the weights of these two auto-encoders to same weights (not sharing the weights).
Thanks!

I think the most easy way would be to init one of the sub-modules at random, save the state_dict and then load_state_dict from the other module.

Related

Using GAN to Model Posteriors with PyTorch

I have a 5-dimensional dataset and I'm interested in using a neural network to model the posterior distributions from which the data was drawn. I decided to implement a GAN to do this, and have been familiarizing myself with PyTorch.
I'm wondering how one should go about restricting what values the generator can produce for the parameters. For one of the parameters, the values must be nonnegative real values. For another case, the values must be nonnegative integer values. For the other three cases, the parameters can take on any real value.
My first idea was to control this through the transfer function applied to the nodes in the output layer of my neural network. But all of the PyTorch examples I've seen so far apply the same transfer function to all of the output nodes, which is not what I want to do. Is there a way to apply a different transfer function to each output node? Or is there maybe a better way to approach this problem?

Can I combine two ONNX graphs together, passing the output from one as input to another?

I have a model, exported from pytorch, I'll call main_model.onnx. It has an input node I'll call main_input that expects a list of integers. I can load this in onnxruntime and send a list of ints and it works great.
I made another ONNX model I'll call pre_model.onnx with input pre_input and output pre_output. This preprocesses some text so input is the text, and pre_output is a list of ints, exactly as main_model.onnx needs for input.
My goal here is, using the Python onnx.helper tools, create one uber-model that accepts text as input, and runs through my pre-model.onnx, possibly some connector node (Identity maybe?), and then through main_model.onnx all in one big combined.onnx model.
I have tried using pre_model.graph.node+Identity connector+main_model.graph.node as nodes in a new graph, but the parameters exported from pytorch are lost this way. Is there a way to keep all those parameters and everything around, and export this one even larger combined ONNX model?
This is possible to achieve albeit a bit tricky. You can explore the Python APIs offered by ONNX (https://github.com/onnx/onnx/blob/master/docs/PythonAPIOverview.md). This will allow you to load models to memory and you'll have to "compose" your combined model using the APIs exposed (combine both the GraphProto messages into one - this is easier said than done - you' ll have to ensure that you don't violate the onnx spec while doing this) and finally store the new Graphproto in a new ModelProto and you have your combined model. I would also run it through the onnx checker on completion to ensure the model is valid post its creation.
If you have static size inputs, sclblonnx package is an easy solution for merging Onnx models. However, it does not support dynamic size inputs.
For dynamic size inputs, one solution would be writing your own code using ONNX API as stated earlier.
Another solution would be converting the two ONNX models to a framework(Tensorflow or PyTorch) using tools like onnx-tensorflow or onnx2pytorch. Then pass the outputs of one network as inputs of the other network and export the whole network to Onnx format.

how can I modify a build-In model inside keras applications

I need to get several outputs from several layers from keras model instead of getting the output from the last layer.
I know how to adjust the code according to what I need, but don't know how I can use it within keras application. I meant how can I import it then. Do I need to infall the setup.py at keras-application again. I did so but nothing happen. My changes doesn't applied. Is there any different way to get the outputs from different layer within the model?
Actually the solution was so simple, you need just to call the output of specified layer.
new_model= tf.keras.Model(base_model.input, base_model.get_layer(layer_name).output)

Do images need to be numbered for Keras CNN training/testing?

I've noticed that for any tutorial or example of a Keras CNN that I've seen, the input images are numbered, e.g.:
dog0001.jpg
dog0002.jpg
dog0003.jpg
...
Is this necessary?
I'm working with an image dataset with fairly random filenames (the classes come from the directory name), e.g.:
picture_A2.jpg
image41110.jpg
cellofinterest9A.jpg
I actually want to keep the filenames because they mean something to me, but do I need to append sequential numbers to my image files?
No they can be of different names, it really depends on how you load your data. In your case, you can use flow_from_directory to generate the training data and indeed the directory will be the associated class, this is part of ImageDataGenerator.

I'm trying to implement 'multi-threading' to do both training and prediction(testing) at the same time

I'm trying to implement 'multi-threading' to do both training and prediction(testing) at the same time. And I'm gonna use the python module 'threading' as shown in https://www.tensorflow.org/api_docs/python/tf/FIFOQueue
And the followings are questions.
If I use the python module 'threading', does tensorflow use more portion of gpu or more portion of cpu?
Do I have to make two graphs(neural nets which have the same topology) in tensorflow one for prediction and the other for training? Or is it okay to make just one graph?
I'll be very grateful to anyone who can answer these questions! thanks!
If you use python threading module, it will only make use of cpu; also python threading not for run time parallelism, you should use multiprocessing.
In your model if you are using dropout or batch_norm like ops which change based on training and validation, it's a good idea to create separate graphs, reusing (validation graph will reuse all training variables) the common variable for validation/testing.
Note: you can use one graph also, with additional operations which changes behaviors based on training/validation.

Resources