I'm trying to make use of LSTMCell, following this tutorial:
https://mxnet.apache.org/versions/1.2.1/api/python/gluon/rnn.html
model = mx.gluon.rnn.SequentialRNNCell()
model.add(mx.gluon.rnn.LSTMCell(20))
model.add(mx.gluon.rnn.LSTMCell(20))
states = model.begin_state(batch_size=32)
input = mx.nd.random.uniform(shape=(32, 10))
model.initialize()
model(input, states)
I'm getting the following error:
ValueError: Deferred initialization failed because shape cannot be inferred. Operator FullyConnected registered in backend is known as FullyConnected in Python. This is a legacy operator which can only accept legacy ndarrays, while received an MXNet numpy ndarray. Please call as_nd_ndarray() upon the numpy ndarray to convert it to a legacy ndarray, and then feed the converted array to this operator.
What's wrong with it? I'm using MXNet 2.0.0 and can't find any up-to-date tutorial.
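For reference, the conversion the error message suggests looks like this (a minimal sketch assuming MXNet 2.0's mx.np namespace; which arrays end up in numpy format depends on whether numpy mode is active):
import mxnet as mx

a = mx.np.ones((32, 10))      # MXNet numpy-style ndarray (2.0 API)
legacy = a.as_nd_ndarray()    # legacy mx.nd.NDArray view of the same data
print(type(a), type(legacy))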
I am new to Jax.
I am implementing a variational autoencoder (VAE) using Jax and Flax. During training, I sample a latent code (from the distribution inferred by the encoder, which I implement using compositions of flax.linen.nn modules). Crucially, in addition to passing this code through the decoder (as is standard for a VAE), I also pass the code to an external function (the MuJoCo physics engine), which tries to assign it to a NumPy array. This unsurprisingly leads to the following error:
TracerArrayConversionError: The numpy.ndarray conversion method array() was called on the JAX Tracer object...
Fundamentally, I need to pass a concrete NumPy array to MuJoCo. How can I make my variable a NumPy array while still allowing my model to be implemented in a computationally efficient manner using abstract tracers wherever possible?
Here is a minimal working example of the problem I am facing; gym and mujoco (https://mujoco.org/) will need to be installed to run this, I believe:
import jax
import jax.numpy as np
import numpy as onp
import gym
from jax import jit
# create an instance of an open AI gym environment
env = gym.make('Humanoid-v3')
env.reset()
def this_fails(env, x):
    # this gives a TracerArrayConversionError
    env.sim.data.qpos[:] = x
    return env, x
x = np.arange(len(env.sim.data.qpos))
jit_this_fails = jax.jit(this_fails, static_argnums = 0)
env, x = jit_this_fails(env, x)
Edit: there is now a JAX FAQ entry on this topic: https://jax.readthedocs.io/en/latest/faq.html#how-can-i-convert-a-jax-tracer-to-a-numpy-array
Note: this is the answer to the OP's question as originally written. The question has been edited multiple times and no longer asks what it originally asked.
In the past this sort of thing has not been supported, but you can do this with the new jax.pure_callback feature that is part of JAX version 0.3.17, which is not yet released at the time I am writing this.
For example, say you want to call a numpy-based function from within a JAX jit-compiled function; we'll use np.sin for simplicity. You might first try something like this:
import jax
import jax.numpy as jnp
import numpy as np
@jax.jit
def this_fails(x):
    # Call a numpy function...
    return np.sin(x)
x = jnp.arange(5.0)
this_fails(x)
jax._src.errors.TracerArrayConversionError: The numpy.ndarray conversion method __array__() was called on the JAX Tracer object Traced<ShapedArray(float32[5])>with<DynamicJaxprTrace(level=0/1)>
The error occurred while tracing the function this_fails at tmp.py:7 for jit. This concrete value was not available in Python because it depends on the value of the argument 'x'.
See https://jax.readthedocs.io/en/latest/errors.html#jax.errors.TracerArrayConversionError
The result is a TracerArrayConversionError, because you're attempting to pass a traced JAX value into a function that expects a numpy array (side note: see How To Think In JAX for an introduction to JAX Tracers and related topics).
In JAX version 0.3.17 or newer, you can get around this issue using jax.pure_callback:
@jax.jit
def numpy_callback(x):
    # Need to forward-declare the shape & dtype of the expected output.
    result_shape = jax.core.ShapedArray(x.shape, x.dtype)
    return jax.pure_callback(np.sin, result_shape, x)
x = jnp.arange(5.0)
print(numpy_callback(x))
[ 0. 0.841471 0.9092974 0.14112 -0.7568025]
A few caveats to keep in mind:
the resulting execution will rely on a callback to the host, so it will be quite slow on accelerators like GPU/TPU, particularly in distributed/multi-host settings. In the case of local CPU execution, though, it avoids buffer copies and can be quite performant.
if you vmap the function, it will result in a for loop of multiple callbacks (you can specify vectorized=True if the callback function handles batches natively; see the first sketch after this list).
autodiff transformations like grad and jacobian will not work with this function, because JAX has no way of reasoning about the computations being done. If you would like to use it with autodiff transformations, you could define custom gradients as in Custom Derivative Rules (see the second sketch after this list), though this would require having access to a function that computes the gradient for your callback function.
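To illustrate both points, here is a minimal sketch assuming JAX 0.3.17+. First the vmap caveat: np.sin handles batched inputs natively, so marking the callback vectorized=True lets vmap issue one batched callback instead of a Python loop:
@jax.jit
def numpy_callback_vectorized(x):
    result_shape = jax.core.ShapedArray(x.shape, x.dtype)
    return jax.pure_callback(np.sin, result_shape, x, vectorized=True)

print(jax.vmap(numpy_callback_vectorized)(jnp.arange(10.0).reshape(2, 5)))
And the autodiff caveat: a sketch of attaching a custom JVP, assuming we know the mathematical derivative of the host-side function (cos for sin):
@jax.custom_jvp
def numpy_sin(x):
    result_shape = jax.core.ShapedArray(x.shape, x.dtype)
    return jax.pure_callback(np.sin, result_shape, x)

@numpy_sin.defjvp
def numpy_sin_jvp(primals, tangents):
    x, = primals
    x_dot, = tangents
    result_shape = jax.core.ShapedArray(x.shape, x.dtype)
    sin_x = jax.pure_callback(np.sin, result_shape, x)
    cos_x = jax.pure_callback(np.cos, result_shape, x)
    return sin_x, cos_x * x_dot

print(jax.grad(lambda x: numpy_sin(x).sum())(jnp.arange(5.0)))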
None of this is documented yet on the JAX website, but we hope to write docs for pure_callback soon!
I want to export a roberta-base-based language model to ONNX format. The model uses RoBERTa embeddings and performs a text classification task.
import os
from torch import nn
import torch.onnx
import onnx
import onnxruntime
import torch
import transformers
From the logs:
pytorch: 1.10.2+cu113
CUDA: False
device: cpu
onnxruntime: 1.10.0
onnx: 1.11.0
PyTorch export
batch_size = 3
model_input = {
'input_ids': torch.empty(batch_size, 256, dtype=torch.int).random_(32000),
'attention_mask': torch.empty(batch_size, 256, dtype=torch.int).random_(2),
'seq_len': torch.empty(batch_size, 1, dtype=torch.int).random_(256)
}
model_file_path = os.path.join("checkpoints", 'model.onnx')
torch.onnx.export(da_inference.model, # model being run
model_input, # model input (or a tuple for multiple inputs)
model_file_path, # where to save the model (can be a file or file-like object)
export_params=True, # store the trained parameter weights inside the model file
opset_version=11, # the ONNX version to export the model to
operator_export_type=torch.onnx.OperatorExportTypes.ONNX_ATEN_FALLBACK,
do_constant_folding=True, # whether to execute constant folding for optimization
input_names = ['input_ids', 'attention_mask', 'seq_len'], # the model's input names
output_names = ['output'], # the model's output names
dynamic_axes={'input_ids': {0 : 'batch_size'},
'attention_mask': {0 : 'batch_size'},
'seq_len': {0 : 'batch_size'},
'output' : {0 : 'batch_size'}},
verbose=True)
I know there may be problems converting some ATen (A Tensor Library for C++11) operators if they are included in the model architecture (see PyTorch Model Export to ONNX Failed Due to ATen).
The export succeeds if I set the parameter operator_export_type=torch.onnx.OperatorExportTypes.ONNX_ATEN_FALLBACK, which means 'leave ATen operators as-is if not supported in ONNX'.
PyTorch export function gives me the following warning:
Warning: Unsupported operator ATen. No schema registered for this operator.
Warning: Shape inference does not support models with experimental operators: ATen
It looks like the only ATen operators in the model that are not converted to ONNX are the ones involving the parameters LayerNorm.weight and LayerNorm.bias (I have several layers like that):
%1266 : Float(3, 256, 768, strides=[196608, 768, 1], requires_grad=0, device=cpu) =
onnx::ATen[cudnn_enable=1, eps=1.0000000000000001e-05, normalized_shape=[768], operator="layer_norm"]
(%1265, %model.utterance_rnn.base.encoder.layer.11.output.LayerNorm.weight,
%model.utterance_rnn.base.encoder.layer.11.output.LayerNorm.bias)
# /opt/conda/lib/python3.9/site-packages/torch/nn/functional.py:2347:0
Then the model check passes OK:
model = onnx.load(model_file_path)
# Check that the model is well formed
onnx.checker.check_model(model)
# Print a human readable representation of the graph
print(onnx.helper.printable_graph(model.graph))
I can also visualize the computation graph using Netron.
But when I try to perform inference using the exported ONNX model, it stalls with no logs or stdout. So this code hangs the system:
from typing import List
from onnxruntime import InferenceSession

model_file_path = os.path.join("checkpoints", "model.onnx")
sess_options = onnxruntime.SessionOptions()
sess_options.log_severity_level = 0
# use_gpu is defined elsewhere in the author's script
ort_providers: List[str] = ["CUDAExecutionProvider"] if use_gpu else ['CPUExecutionProvider']
session = InferenceSession(model_file_path, providers=ort_providers, sess_options=sess_options)
Are there any suggestions to overcome this problem? From the official documentation I see that torch.onnx models exported this way are probably runnable only by Caffe2.
These layers are not inside the base frozen roberta model; they are additional layers that I added myself. Is it possible to substitute the offending layers with similar ones and retrain the model?
Or is Caffe2 the best choice here, and onnxruntime will not do the inference?
Update: I retrained the model on the basis of BERT-cased embeddings, but the problem persists. The same ATen operators are not converted to ONNX.
It looks like the LayerNorm.weight and LayerNorm.bias parameters are only in the layers above BERT. So, what are your suggestions for changing these layers and enabling ONNX export?
Have you tried exporting after defining the operator for ONNX? Something along the lines of the following code by Huawei.
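For illustration (this is not the linked Huawei code), here is a minimal sketch of that idea, assuming PyTorch 1.10's torch.onnx registration API: it registers a symbolic for aten::layer_norm that decomposes layer norm into primitive ops available in opset 11, so the ATen fallback is never needed:
import torch
from torch.onnx import register_custom_op_symbolic
from torch.onnx.symbolic_helper import parse_args

# aten::layer_norm(input, normalized_shape, weight, bias, eps, cudnn_enable)
@parse_args("v", "is", "v", "v", "f", "b")
def layer_norm_symbolic(g, input, normalized_shape, weight, bias, eps, cudnn_enable):
    # Normalize over the trailing dimensions named by normalized_shape.
    axes = [-i for i in range(len(normalized_shape), 0, -1)]
    mean = g.op("ReduceMean", input, axes_i=axes)
    numerator = g.op("Sub", input, mean)
    variance = g.op("ReduceMean", g.op("Mul", numerator, numerator), axes_i=axes)
    eps_cst = g.op("Constant", value_t=torch.tensor(eps))
    denominator = g.op("Sqrt", g.op("Add", variance, eps_cst))
    normalized = g.op("Div", numerator, denominator)
    return g.op("Add", g.op("Mul", normalized, weight), bias)

register_custom_op_symbolic("aten::layer_norm", layer_norm_symbolic, 11)
If this is registered before torch.onnx.export runs, the exporter should emit these primitive ops in place of the onnx::ATen node.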
On another note, when loading a model, you can technically override anything you want: setting a specific layer equal to your own class that inherits from the original keeps the same behavior (inputs and outputs), but its execution can be modified.
You can try to use this to save the model with the problematic operators changed, convert it to ONNX, and fine-tune it in that form (or even in PyTorch).
This generally seems best solved by the ONNX team, so a long-term solution might be to post a request for that specific operator on the GitHub issues page (but that will probably be slow).
The best way to go is to rewrite the place in the model that uses these operators in a way that will convert; look at this for reference.
If, for example, the issue is layer norm, then you can write it yourself, as in the sketch below. Another thing that sometimes helps is not setting the axes as dynamic, since some ops don't support that yet.
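For instance, a minimal sketch of a hand-written layer norm (a hypothetical drop-in, not the OP's exact module) that traces to primitive ONNX ops instead of aten::layer_norm:
import torch
from torch import nn

class ExportFriendlyLayerNorm(nn.Module):
    # LayerNorm written with elementwise ops, so ONNX export under opset 11
    # produces ReduceMean/Sub/Sqrt/Div nodes instead of an ATen fallback.
    def __init__(self, normalized_shape, eps=1e-5):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(normalized_shape))
        self.bias = nn.Parameter(torch.zeros(normalized_shape))
        self.eps = eps

    def forward(self, x):
        mean = x.mean(dim=-1, keepdim=True)
        var = (x - mean).pow(2).mean(dim=-1, keepdim=True)
        return self.weight * (x - mean) / torch.sqrt(var + self.eps) + self.bias
You can copy the trained weight and bias from the original nn.LayerNorm into this module before re-exporting, so no retraining should be needed.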
I have converted a SavedModel to an ONNX model, but when loading it via onnxruntime:
import onnxruntime as rt
sess = rt.InferenceSession('model.onnx')
It throws the error below:
onnxruntime.capi.onnxruntime_pybind11_state.InvalidGraph: [ONNXRuntimeError] : 10 : INVALID_GRAPH : Load model from /mnt/model/io_files/convert/1606801475/model.onnx failed:This is an invalid model. Type Error: Type 'tensor(float)' of input parameter (const_fold_opt__342) of operator (Slice) in node (StatefulPartitionedCall/mobilenet_1.00_224/reshape_1/strided_slice) is invalid.
The savedModel I have used is Keras pretrained MobileNet from the tensorflow website: https://www.tensorflow.org/guide/saved_model.
I saw that the parameter in Netron is float, but I am unable to understand or address this issue.
Below is the snip from Netron: [screenshot of the Slice node not reproduced here]
Looks like a bug in the Keras->ONNX converter.
The starts input of Slice must be int32 or int64: https://github.com/onnx/onnx/blob/master/docs/Operators.md#Slice
You can try to patch the model using the onnx Python interface: load the model, find the node, and change the input type. But if the model has this issue, the Keras->ONNX converter is probably not very well tested, and there are likely other issues.
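A minimal sketch of such a patch, assuming the input named in the error (const_fold_opt__342) is a graph initializer that should be int64:
import numpy as np
import onnx
from onnx import numpy_helper

model = onnx.load('model.onnx')
for init in model.graph.initializer:
    if init.name == 'const_fold_opt__342':
        # Recast the float tensor feeding Slice's "starts" input to int64.
        arr = numpy_helper.to_array(init).astype(np.int64)
        init.CopyFrom(numpy_helper.from_array(arr, init.name))
onnx.checker.check_model(model)
onnx.save(model, 'model_patched.onnx')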
Can you find an equivalent PyTorch model? PyTorch->ONNX converter should be much better.
I'm trying to convert a PyTorch model to an MLModel with ONNX.
My code:
import torch
from onnx_coreml import convert
import coremltools
net = BiSeNet(19)  # BiSeNet is the author's model class, defined in model.py
net.cuda()
net.load_state_dict(torch.load('model.pth'))
#net.eval()
dummy = torch.rand(1,3,512,512).cuda()
torch.onnx.export(net, dummy, "Model.onnx", input_names=["image"], output_names=["output"], opset_version=11)
finalModel = convert(model='Model.onnx', minimum_ios_deployment_target='12')
finalModel.save('ModelML.mlmodel')
After the code runs, Model.onnx is generated; however, the .mlmodel file is not generated. There are no errors in the console. This is the output:
2020-04-15 21:49:32.367179: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
WARNING:root:TensorFlow version 2.2.0-rc2 detected. Last version known to be fully compatible is 1.14.0 .
WARNING:root:Keras version 2.3.1 detected. Last version known to be fully compatible of Keras is 2.2.4 .
1.4.0
/content/drive/My Drive/Collab/fp/model.py:116: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
size_array = [int(s) for s in feat32.size()[2:]]
/content/drive/My Drive/Collab/fp/model.py:80: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
size_array = [int(s) for s in feat.size()[2:]]
/content/drive/My Drive/Collab/fp/model.py:211: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
size_array = [int(s) for s in feat.size()[2:]]
What could be the issue?
I'm new to the machine learning domain, and I have some doubts about linear regression.
1: While practicing the sklearn linear regression model's predict method, I get the error below.
Code:
sklearn.linear_model.LinearRegression.predict(25)
Error:
"ValueError: Expected 2D array, got scalar array instead: array=25. Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample."
Do I need to pass a 2-D array? I checked the sklearn documentation page and haven't found anything about a version update.
Running my code on Kaggle:
https://www.kaggle.com/aman9d/bikesharingdemand-upx/
2: Is the index of the dataset going to affect the model's score (weights)?
First of all, you should post your code as you actually use it:
# import, instantiate, fit
from sklearn.linear_model import LinearRegression
linreg = LinearRegression()
linreg.fit(X, y)
# use the predict method
linreg.predict(25)
What you posted in the question is not executable as written: the predict method is not static on the LinearRegression class.
When you fit a model, the first step is to recognize what kind of input data it takes; in your case the input will be shaped like X. That means that if you pass the model something with a shape different from X's, it will raise an error.
In your example, X seems to be a pd.DataFrame instance with only 1 column. It is replaceable by any 2-dimensional array of shape (number of examples, number of features), so if you try:
linreg.predict([[25]])
should work.
For example, if you were trying a regression with more than 1 feature (aka column), let's say temp and humidity, your input would look like this:
linreg.predict([[25, 56]])
I hope this helps; always keep in mind the shape of your data.
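Putting it together, a minimal runnable sketch with made-up single-feature data:
import numpy as np
from sklearn.linear_model import LinearRegression

# X has shape (3, 1): 3 samples, 1 feature, which is what predict expects too.
X = np.array([[10], [20], [30]])
y = np.array([100, 200, 300])
linreg = LinearRegression()
linreg.fit(X, y)
print(linreg.predict([[25]]))  # -> [250.]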
Documentation: LinearRegression fit
X : array-like or sparse matrix, shape (n_samples, n_features)