I've been researching all over for a tutorial/guide to load a CivitAI model (https://civitai.com/models/4823/deliberate) into PyTorch and then use it for inference.
Most research leads to the following:
Create your base model (which should have the same architecture as the model from which the checkpoint was saved).
Then load the checkpoint using torch.load() and apply it with model.load_state_dict(loaded_checkpoint).
However, the models on CivitAI only provide the ckpt file and nothing more, so I cannot do step 1.
I do know it's possible, because the GUI version AUTOMATIC1111 is able to do it.
PS. I do know that the same Deliberate model is available on huggingface.co and can be downloaded like standard Stable Diffusion models, but I'm interested in working with the ckpt file alone and doing it the way AUTO1111 does it.
model_id = "stabilityai/stable-diffusion-2-1"
model = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
model.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
# Load the Checkpoint File
ckpt_path = '/Users/XXXX/XXXX/model.ckpt'
checkpoint = torch.load(ckpt_path, map_location="cpu")
model.load_state_dict(checkpoint['state_dict'])
model.eval()
image = model(prompt='xxxxxx')
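For what it's worth, recent versions of the diffusers library can parse an original Stable Diffusion .ckpt file directly and build a pipeline from it, which is similar in spirit to what AUTOMATIC1111 does when it reads the checkpoint's state_dict. A minimal sketch, assuming a diffusers release that provides from_single_file (check your installed version):

import torch
from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler

# Build the whole pipeline straight from the original .ckpt; diffusers converts
# the checkpoint keys into its own UNet/VAE/text-encoder layout.
ckpt_path = '/Users/XXXX/XXXX/model.ckpt'
pipe = StableDiffusionPipeline.from_single_file(ckpt_path, torch_dtype=torch.float16)
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
pipe = pipe.to("cuda")

image = pipe(prompt='xxxxxx').images[0]
image.save("out.png")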
Related
I trained a model on Google Colab and dumped it, then finished the project in VS Code on my own system. I got an error saying the versions don't match, since I had the latest versions of the libraries and Colab had older ones. I had to retrain the model on my system, which took a lot of time, as my system has a basic configuration; my only reason for using Colab was to put the training load on Colab rather than on my machine.
I didn't know there would be a version conflict, as I thought Colab would have the latest versions of the libraries.
I have TensorFlow version 2.9.2 in Colab (Google) and TensorFlow version 2.4.1 on my Raspberry Pi 4, so different versions. In Colab I built a model from pre-trained VGG19 with input_shape=(220, 220, 3) and classified two types of images.
So once I have trained the model in the Google Colab environment, I export the model and its weights from Colab.
# serialize model to JSON
model_json = loaded_model2.to_json()
with open('/content/drive/MyDrive/dataset/extract/model_5.json', "w") as json_file:
json_file.write(model_json)
# serialize weights to HDF5
loaded_model2.save_weights('/content/drive/MyDrive/model_5.h5')
print("Saved model to disk")
As I have indicated, I then intend to use that trained model on my Raspberry Pi 4. So I create a model on the Raspberry just like I did in Colab, but I don't call fit. Instead, I load the .h5 file with the weights that was generated in Google Colab.
And in principle this has worked for me:
model_new = tf.keras.Sequential()
model_new.add(tf.keras.applications.VGG19(include_top=False, weights='imagenet', pooling='avg', input_shape=(220, 220, 3)))
model_new.add(tf.keras.layers.Dense(2, activation="softmax"))
opt = tf.keras.optimizers.SGD(0.004)
model_new.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy'])
model_new.load_weights('/home/pi/projects/models/model_5.h5')
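Since the architecture was also serialized to JSON in Colab (model_5.json), a small sketch of an alternative, assuming that JSON file is copied to the Raspberry Pi next to the weights, is to rebuild the model from it instead of redefining the layers by hand:

import tensorflow as tf

# Rebuild the architecture from the JSON exported in Colab
# (the path below is an assumption; adjust it to where you copied the file)
with open('/home/pi/projects/models/model_5.json') as json_file:
    model_new = tf.keras.models.model_from_json(json_file.read())

# Load the trained weights into the reconstructed architecture
model_new.load_weights('/home/pi/projects/models/model_5.h5')

# Compile only if you need to evaluate or keep training; predict works without it
model_new.compile(loss='categorical_crossentropy',
                  optimizer=tf.keras.optimizers.SGD(0.004),
                  metrics=['accuracy'])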
I am trying to port two pre-trained Keras models to the IPU machine. I managed to load and run them using IPUStrategy.scope, but I don't know if I am doing it the right way. I have my pre-trained models in .h5 file format.
I load them this way:
def first_model():
    model = tf.keras.models.load_model("./model1.h5")
    return model
After searching your ipu.keras.models.py file I couldn't find any load method for my pre-trained models, which is why I used tf.keras.models.load_model().
Then I use this code to run them:
cfg = ipu.utils.create_ipu_config()
cfg = ipu.utils.auto_select_ipus(cfg, 1)
ipu.utils.configure_ipu_system(cfg)
ipu.utils.move_variable_initialization_to_cpu()

strategy = ipu.ipu_strategy.IPUStrategy()
with strategy.scope():
    model = first_model()
    print('compile attempt\n')
    model.compile("sgd", "categorical_crossentropy", metrics=["accuracy"])
    print('compilation completed\n')
    print('running attempt\n')
    res = model.predict(input_img)[0]
    print('run completed\n')
You can see the output here: link
So I have some difficulty understanding how, and whether, the system is working properly.
Basically, model.compile won't compile my model, but when I use model.predict the system first compiles and then runs. Why is that happening? Is there another way to run pre-trained Keras models on an IPU chip?
Another question I have is whether it's possible to load a pre-trained Keras model inside an ipu.keras model, then use model.fit/evaluate to further train and evaluate it, and then save it for future use.
One last question concerns the compilation of the graph. Is there a way to avoid recompiling the graph every time I call model.predict() in a different strategy.scope()?
I use the tensorflow 2.1.2 wheel.
Thank you for your time
To add some context, the Graphcore TensorFlow wheel includes a port of Keras for the IPU, available as tensorflow.python.ipu.keras. You can access the API documentation for IPU Keras at this link. This module contains IPU-specific optimised replacements for the TensorFlow Keras Model and Sequential classes, plus higher-performance, multi-IPU classes such as PipelineModel and PipelineSequential.
As per your specific issue, you are right when you mention that there are no IPU-specific ways to load pre-trained Keras models at present. I would encourage you, as you appear to have access to IPUs, to reach out to Graphcore Support. When doing so, please attach your pre-trained Keras model model1.h5 and a self-contained reproducer of your code.
Switching to the recompilation question: using an executable cache prevents recompilation; you can set that up with the environment variable TF_POPLAR_FLAGS='--executable_cache_path=./cache'. I'd also recommend taking a look at the following resources (a short sketch of the cache setup follows the list):
This tutorial gathers several considerations around recompilation and how to avoid it when using TensorFlow 2 on the IPU.
The Graphcore TensorFlow documentation here explains how to use the pre-compile mode on the IPU.
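As a small illustration of the cache setup (the flag value comes from the answer above; wrapping it in Python and setting it before the IPU configuration is an assumption about how your script is launched):

import os

# Must be set before the IPU system compiles any graphs,
# e.g. at the very top of your script or in the shell before launching it
os.environ['TF_POPLAR_FLAGS'] = '--executable_cache_path=./cache'

# ... then configure the IPU and build/run the model as before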
I am new to deep learning and I want to know how to save the final model in PyTorch. I tried some of the things that were mentioned, but I got confused about how to save the model and how to load it back.
to save:
# save the weights of the model to a .pt file
torch.save(model.state_dict(), "your_model_path.pt")
to load:
# load your model architecture/module
model = YourModel()
# fill your architecture with the trained weights
model.load_state_dict(torch.load("your_model_path.pt"))
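If you also want to resume training later (not just run inference), a common pattern is to save a small checkpoint dictionary instead of the bare state_dict. This is only a sketch: epoch, model and optimizer are assumed to be your own training variables.

import torch

# save a full training checkpoint (the key names are just a convention)
torch.save({
    "epoch": epoch,
    "model_state_dict": model.state_dict(),
    "optimizer_state_dict": optimizer.state_dict(),
}, "checkpoint.pt")

# load it back to resume training
checkpoint = torch.load("checkpoint.pt")
model.load_state_dict(checkpoint["model_state_dict"])
optimizer.load_state_dict(checkpoint["optimizer_state_dict"])
start_epoch = checkpoint["epoch"]

# for inference, switch to evaluation mode
model.eval()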
I am using model.save("cnn.model") and model.save("cnn.h5") to save the model after training.
What is the difference between saving the model with these two different extensions?
The file name, including the extension, doesn't matter. Whatever it is, Keras will save an HDF5-formatted model into that file.
Doc: How can I save a Keras model?
You can use model.save(filepath) to save a Keras model into a single
HDF5 file which will contain:
the architecture of the model, allowing the model to be re-created
the weights of the model
the training configuration (loss, optimizer)
the state of the optimizer, allowing to resume training exactly where you left off.
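A minimal round trip, assuming tf.keras and the file name from the question, would look like the sketch below; because the file contains the full model, it can be reloaded with load_model without redefining the architecture:

from tensorflow import keras

# "model" is the trained model from the question; "cnn.model" would hold the same HDF5 content
model.save("cnn.h5")
restored = keras.models.load_model("cnn.h5")
restored.summary()  # architecture, weights and optimizer state are all restored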
I am learning Pytorch and trying to understand how the library works for semantic segmentation.
What I've understood so far is that we can use a pre-trained model in PyTorch. I've found an article that used such a model in .eval() mode, but I have not been able to find any tutorial on training one on my own dataset. I have a very small dataset and I need transfer learning to get results. My goal is to train only the FC layers with my own data. How is that achievable in PyTorch without complicating the code with OOP or many .py files? I have had a hard time figuring out such repos on GitHub, as I am not the most proficient person when it comes to OOP. I have been using Keras for deep learning until recently, and there everything is easy and straightforward. Do I have the same options in Pycharm?
I appreciate any guidance on this. I need to run a piece of code that does the semantic segmentation and I am really confused about many of the steps I need to take.
Assume you start with a pretrained model called model. All of this occurs before you pass the model any data.
You want to find the layers you want to train by looking at all of them and then indexing them using model.children(). Running this command will show you all of the blocks and layers.
list(model.children())
Suppose you have now found the layers that you want to fine-tune (your FC layers, as you describe) and they are the last 5. The plan is to keep everything except those last 5 layers, set requires_grad to False on that part so it doesn't train when you run the training algorithm, and then attach fresh trainable layers on top. You can inspect the last 5 layers with:
list(model.children())[-5:]
Pull those layers out into their own list:
layer_list = list(model.children())[-5:]
Rebuild model using sequential:
model_small = nn.Sequential(*list(model.children())[:-5])
Set requires_grad params to False:
for param in model_small.parameters():
param.requires_grad = False
Now you have a model called model_small that has all of the layers except the ones you want to train. Now you can reattach the layers you removed (or new replacements for them), and they will by default have requires_grad set to True. When you train the model, it will only update the weights on those layers.
model_small.avgpool_1 = nn.AdaptiveAvgPool2d(output_size=(1, 1))
model_small.lin1 = nn.Linear(2048, 512)            # example sizes; match the backbone's output features
model_small.logits = nn.Linear(512, num_classes)   # num_classes: your number of output classes
model_small.softmax = nn.Softmax(dim=1)
model = model_small.to(device)
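To make the pattern concrete end to end, here is a compact sketch using torchvision's resnet18 as the pretrained backbone. resnet18 is just a stand-in (the question is about a segmentation model), but the freeze-and-replace idea is the same:

import torch
import torch.nn as nn
from torchvision import models

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Pretrained backbone (newer torchvision versions prefer the weights= argument)
model = models.resnet18(pretrained=True)

# Freeze every pretrained parameter
for param in model.parameters():
    param.requires_grad = False

# Replace the final FC layer; the new layer's parameters default to requires_grad=True
num_classes = 2  # placeholder: set this to your own number of classes
model.fc = nn.Linear(model.fc.in_features, num_classes)
model = model.to(device)

# Only the new head's parameters need to go to the optimizer
optimizer = torch.optim.SGD(model.fc.parameters(), lr=1e-3)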