I am trying to load a densenet121 model in a Kaggle kernel without switching on the internet.
I have done the required steps, such as adding the pre-trained weights to my input directory and copying them to '.cache/torch/checkpoints/', but it still does not work and throws a gaierror.
The following is the code snippet:
!mkdir -p /tmp/.cache/torch/checkpoints
!cp ../input/fastai-pretrained-models/densenet121-a639ec97.pth /tmp/.cache/torch/checkpoints/densenet121-a639ec97.pth
learn_cd = create_cnn(data_cd, models.densenet121,
                      metrics=[error_rate, accuracy],
                      model_dir=Path('../kaggle/working/models'),
                      path=Path('.')).to_fp16()
I have been struggling with this for a long time. Any help would be greatly appreciated.
The input path "../input/" in a Kaggle kernel is read-only. Create a folder under "/kaggle/working" instead and copy the model weights there. Example below:
import os

if not os.path.exists('/root/.cache/torch/hub/checkpoints/'):
    os.makedirs('/root/.cache/torch/hub/checkpoints/')
!mkdir '/kaggle/working/resnet34'
!cp '/root/.cache/torch/hub/checkpoints/resnet34-333f7ec4.pth' '/kaggle/working/resnet34/resnet34.pth'
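For the densenet121 case in the question, the same idea works in the other direction: copy the dataset weights into torch's cache so the download is skipped entirely. A minimal sketch, assuming the paths from the question and a recent torch cache layout (older torch versions look in '~/.cache/torch/checkpoints/' instead of '.../hub/checkpoints/'):

import os
import shutil

# create the cache folder torch looks in, then place the weights there
cache_dir = '/root/.cache/torch/hub/checkpoints/'
os.makedirs(cache_dir, exist_ok=True)
shutil.copy('../input/fastai-pretrained-models/densenet121-a639ec97.pth',
            os.path.join(cache_dir, 'densenet121-a639ec97.pth'))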
I am saving a Keras model with
model.save('model.h5')
in Databricks, but the model is not saving.
I have also tried saving to /tmp/model.h5 as mentioned here, but the model still does not save.
The saving cell executes, but when I load the model it says no model.h5 file is available.
When I do this:
dbfs_model_path = 'dbfs:/FileStore/models/model.h5'
dbutils.fs.cp('file:/tmp/model.h5', dbfs_model_path)
or try loading the model with
tf.keras.models.load_model("file:/tmp/model.h5")
I get the error message java.io.FileNotFoundException: File file:/tmp/model.h5 does not exist.
The problem is that Keras is designed to work only with local files, so it doesn't understand URIs such as dbfs:/ or file:/. You need to use local paths for saving & loading operations, and then copy files to/from DBFS (unfortunately /dbfs doesn't play well with Keras because of the way it works).
The following code works just fine. Note that dbfs:/ or file:/ are used only in the calls to the dbutils.fs commands - the Keras calls use the names of local files.
create model & save locally as /tmp/model-full.h5:
from tensorflow.keras.applications import InceptionV3
model = InceptionV3(weights="imagenet")
model.save('/tmp/model-full.h5')
copy data to DBFS as dbfs:/tmp/model-full.h5 and check it:
dbutils.fs.cp("file:/tmp/model-full.h5", "dbfs:/tmp/model-full.h5")
display(dbutils.fs.ls("/tmp/model-full.h5"))
copy file from DBFS as /tmp/model-full2.h5 & load it:
dbutils.fs.cp("dbfs:/tmp/model-full.h5", "file:/tmp/model-full2.h5")
from tensorflow import keras
model2 = keras.models.load_model("/tmp/model-full2.h5")
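To sanity-check the round trip, you can compare predictions from the original and the reloaded model on the same dummy input; a minimal sketch (299x299x3 is InceptionV3's default input shape):

import numpy as np

x = np.random.rand(1, 299, 299, 3).astype("float32")  # dummy batch
assert np.allclose(model.predict(x), model2.predict(x))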
This began from curiosity. I downloaded the pretrained yolov5s.pt from a public Google Drive and converted it to a yolov5s.onnx file with input shape [1,3,640,640] using yolov5's models/export.py. Then I used OpenVINO's deployment tools/mo.py to convert yolov5s.onnx into OpenVINO inference engine files (.xml+.bin). The conversion succeeded without errors. Finally, I ran prediction with these files using OpenVINO's demo program, and it successfully returned a result. But all the returned values are negative numbers, and the array shape is wrong. Is it impossible to convert the yolov5 weights for OpenVINO?
YOLOv5 is currently not an officially supported topology by the OpenVINO toolkit. Please see the list of validated ONNX and PyTorch topologies here https://docs.openvinotoolkit.org/latest/openvino_docs_MO_DG_prepare_model_convert_model_Convert_Model_From_ONNX.html
However, we have one recommendation for you to try, though it is not guaranteed to succeed. You can do it by changing export.py to include the Detect layer:
In yolov5/models/export.py (line 28 at commit a1c8406):
model.model[-1].export = True  # set Detect() layer export=True
The line above needs to be changed from True to False. For more detail, you can follow the thread here.
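In other words, the edited line would look like this (a sketch of the suggested change, not an official fix):
model.model[-1].export = False  # keep the Detect() layer in the ONNX export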
Then try this (use Netron to check your architecture, since the output node names can differ per model):
python mo.py --input_model yolov5s.onnx -s 255 --reverse_input_channels --output Conv_245,Conv_261,Conv_277
I am using fastai with pytorch to fine tune XLMRoberta from huggingface.
I've trained the model and everything is fine on the machine where I trained it.
But when I try to load the model on another machine I get OSError - Not Found - No such file or directory pointing to .cache/torch/transformers/. The issue is the path of a vocab_file.
I've used fastai's Learner.export to export the model in .pkl file, but I don't believe that issue is related to fastai since I found the same issue appearing in flairNLP.
It appears that the path to the cache folder, where the vocab_file is stored during the training, is embedded in the .pkl file:
The error comes from transformers' XLMRobertaTokenizer.__setstate__:
def __setstate__(self, d):
    self.__dict__ = d
    self.sp_model = spm.SentencePieceProcessor()
    self.sp_model.Load(self.vocab_file)
which tries to load the vocab_file using the path from the file.
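For context, the matching __getstate__ (paraphrased from the transformers source) drops only the unpicklable SentencePiece object, so the absolute vocab_file path stays in the pickled state:

def __getstate__(self):
    state = self.__dict__.copy()
    state["sp_model"] = None  # sp_model can't be pickled; vocab_file (an absolute path) remains
    return state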
I've tried patching __setstate__ using:
from types import MethodType
import sentencepiece as spm
from transformers import XLMRobertaTokenizer

pretrained_model_name = "xlm-roberta-base"
vocab_file = XLMRobertaTokenizer.from_pretrained(pretrained_model_name).vocab_file

def _setstate(self, d):
    self.__dict__ = d
    self.sp_model = spm.SentencePieceProcessor()
    self.sp_model.Load(vocab_file)

XLMRobertaTokenizer.__setstate__ = MethodType(_setstate, XLMRobertaTokenizer(vocab_file))
And that successfully loaded the model, but it caused other problems, like missing model attributes.
Can someone please explain why the path is embedded inside the file? Is there a way to configure it without re-exporting the model, or, if it has to be re-exported, how can it be configured dynamically using fastai, torch and huggingface?
I faced the same error. I had fine-tuned XLMRoberta on a downstream classification task with fastai version 1.0.61, and I'm loading the model inside Docker.
I'm not sure why the path is embedded, but I found a workaround. Posting it for future readers who might need one, as retraining is usually not possible.
I created /home/<username>/.cache/torch/transformers/ inside the docker image:
RUN mkdir -p /home/<username>/.cache/torch/transformers
Then I copied the files (which were not found in Docker) from my local /home/<username>/.cache/torch/transformers/ to the same path in the image:
COPY filename /home/<username>/.cache/torch/transformers/filename
I am currently trying to implement Mask RCNN by following the Matterport repo, and I have a doubt regarding the TensorBoard setup.
The dataset is similar to the COCO dataset. Inside model.py, under def train, TensorBoard is mentioned as
callbacks = [keras.callbacks.TensorBoard(log_dir=self.log_dir,
                                         histogram_freq=0,
                                         write_graph=True,
                                         write_images=False)]
But what else should I specify to use TensorBoard? When I try to run TensorBoard, it says the log file was not found. I know I am missing something somewhere! Please help me out!
In your model.train(), ensure you set the custom_callbacks=callbacks parameter.
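A minimal sketch of what that call could look like, assuming a Matterport version whose train() accepts custom_callbacks (the datasets, epochs and layers values are placeholders for your own setup):

model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE,
            epochs=30,
            layers='heads',
            custom_callbacks=callbacks)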
If you specified these parameters exactly like this, then your issue is that you are not opening the logs directory properly.
Open a terminal (inside Anaconda/PyCharm, or a separate one) and pass the absolute path (to make sure it works):
tensorboard --logdir=my_absolute_path/logs/
What I am trying to do is convert my trained CNN to TFLite and use it in my Android app. AFAIK I need the .pbtxt in order to freeze the parameters and do the conversion.
However when I save my network using this standard code:
saver = tf.train.Saver(max_to_keep=4)
saver.save(sess=session, save_path="some_path", global_step=step)
I only get the
.data
.index
.meta
checkpoint
files, but no .pbtxt.
Is there a way to convert the trained network to tflite without a pbtxt or can I obtain the pbtxt from those files?
Thank you
Simply execute:
tf.train.write_graph(session.graph.as_graph_def(),
                     "path",
                     'model.pb',
                     as_text=False)
to get a .pb, or
tf.train.write_graph(session.graph.as_graph_def(),
                     "path",
                     'model.pbtxt',
                     as_text=True)
to get the text version.
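Alternatively, if the end goal is just the .tflite file, TF 1.x can also convert straight from the live session, skipping the .pbtxt entirely. A minimal sketch, where input_tensor and output_tensor stand for your network's actual input and output tensors:

converter = tf.lite.TFLiteConverter.from_session(session, [input_tensor], [output_tensor])
tflite_model = converter.convert()
with open('model.tflite', 'wb') as f:
    f.write(tflite_model)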