I ran the following code in Colab and got the following error:
NameError: name 'MINST' is not defined
What do I need to do?
import torch
import torchvision
from torchvision.datasets import MNIST
dataset = MINST(root='data/', download=True)
len(dataset)
test_dataset = MINST(root='data/', train=False)
len(test_dataset)
dataset[0]
It is what it says: a NameError.
You imported the MNIST dataset but then tried to access MINST, which is not a defined name.
Your code should be:
import torch
import torchvision
from torchvision.datasets import MNIST
dataset = MNIST(root='data/', download=True)
len(dataset)
test_dataset = MNIST(root='data/', train=False)
len(test_dataset)
dataset[0]
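Note that dataset[0] will be a (PIL image, label) tuple, since MNIST returns PIL images by default. For training you would usually pass a transform as well; a minimal sketch:
from torchvision import transforms
from torchvision.datasets import MNIST

# The transform argument converts each sample to a tensor on access
dataset = MNIST(root='data/', download=True, transform=transforms.ToTensor())
image, label = dataset[0]  # image is a 1x28x28 float tensor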
Related
This is my first time using PyTorch; however, I have used TensorFlow as below:
path ="./Train_set"
img_gen = tf.keras.preprocessing.image.ImageDataGenerator()
test_set = img_gen.flow_from_directory(path,(224, 224),'rgb')
So my dataset is an image dataset.
Now I would like to load my dataset with PyTorch in the same manner, but with the code below I don't find anything in train_set:
import torch
import torch.nn as nn
import torch.nn.functional as F
DS="./path"
from torch.utils.data import DataLoader
train_set= DataLoader(DS, batch_size=64, shuffle=True)
So, how can I change the code?
Thanks
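For reference, the PyTorch counterpart of flow_from_directory is torchvision's ImageFolder: a DataLoader wraps a Dataset, not a path string. A minimal sketch, assuming ./Train_set contains one subfolder per class:
import torchvision.transforms as transforms
from torchvision.datasets import ImageFolder
from torch.utils.data import DataLoader

# Build a Dataset from the directory tree (labels come from subfolder names),
# then wrap it in a DataLoader for batching and shuffling
transform = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
train_ds = ImageFolder("./Train_set", transform=transform)
train_set = DataLoader(train_ds, batch_size=64, shuffle=True)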
Trying to run sample code for a Named Entity Recognition model as practice.
The reference article is: Named Entity Recognition (NER) with keras and tensorflow
github: https://github.com/nxs5899/Named-Entity-Recognition_DeepLearning-keras
However, I am stuck on a TensorFlow version difference.
Since I'm not familiar with TensorFlow, I cannot modify the sample code by following the description of the change.
I'd also appreciate it if you could share helpful articles or GitHub repositories for building a Named Entity Recognition model with original data.
Error Message
---> 11 sess = tf.Session()
12 K.set_session(sess)
AttributeError: module 'tensorflow' has no attribute 'Session'
Working Code
from sklearn.model_selection import train_test_split
import tensorflow as tf
import tensorflow_hub as hub
from tensorflow.keras.backend import eval
X_tr, X_te, y_tr, y_te = train_test_split(new_X, y, test_size=0.1, random_state=2018)
batch_size = 32
import tensorflow as tf
import tensorflow_hub as hub
from keras import backend as K
sess = tf.Session()
K.set_session(sess)
elmo_model = hub.Module("https://tfhub.dev/google/elmo/2", trainable=True)
sess.run(tf.global_variables_initializer())
sess.run(tf.tables_initializer())
What I tried to do
Following the related question Tensorflow 2.0 - AttributeError: module 'tensorflow' has no attribute 'Session', I tried to fix my code, but another error was shown.
If the new error is caused by my attempted fix, I would like to know how I should write this for the new version of TensorFlow.
Another Error
module 'tensorflow' has no attribute 'global_variables_initializer'
Fixed version
from sklearn.model_selection import train_test_split
import tensorflow as tf
import tensorflow_hub as hub
from tensorflow.keras.backend import eval
tf.compat.v1.disable_eager_execution()
X_tr, X_te, y_tr, y_te = train_test_split(new_X, y, test_size=0.1, random_state=2018)
batch_size = 32
import tensorflow as tf
import tensorflow_hub as hub
from keras import backend as K
sess = tf.compat.v1.Session()
K.set_session(sess)
elmo_model = hub.Module("https://tfhub.dev/google/elmo/2", trainable=True)
sess.run(tf.global_variables_initializer())
sess.run(tf.tables_initializer())
To execute your code in TensorFlow 2.x, you can try as shown below. Note that hub.Module is a TF1-only API; in TF 2.x a TF1-format Hub module such as ELMo is loaded with hub.load instead:
from sklearn.model_selection import train_test_split
import tensorflow as tf
import tensorflow_hub as hub
X_tr, X_te, y_tr, y_te = train_test_split(new_X, y, test_size=0.1, random_state=2018)
# hub.load replaces hub.Module in TF 2.x and understands the TF1 Hub format
elmo_model = hub.load("https://tfhub.dev/google/elmo/2")
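Once loaded this way, the module is driven through its signatures rather than a session; a minimal sketch (the example sentence is an assumption):
# The "default" signature takes a batch of strings and returns ELMo embeddings
embeddings = elmo_model.signatures["default"](tf.constant(["the cat sat on the mat"]))["elmo"]
print(embeddings.shape)  # (1, number_of_tokens, 1024)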
I ran my code and saved my model using the following code:
model_json = model.to_json()
with open(inFilePath+".json", "w") as json_file:
    json_file.write(model_json)
modWeightsFilepath=inFilePath+".weights.hdf5"
checkpoint = ModelCheckpoint(modWeightsFilepath, monitor='val_accuracy', verbose=1, save_best_only=True, save_weights_only=True, mode='auto')
And then I wanted to load my model again to make predictions:
from keras.models import model_from_json
json_file = open('/home/models/final_model.json', 'r')
loaded_model_json = json_file.read()
json_file.close()
model = model_from_json(loaded_model_json)
#load weights into new model
model.load_weights('/home/models/final_model.weights.hdf5')
print("Loaded model from disk")
But this gives me the following error:
TypeError: __init__() got an unexpected keyword argument 'ragged'
Full traceback:
And I don't quite know what's wrong. My Keras (GPU) version is 2.1.6-tf.
Edit:
In order to create the model I used:
import json
import numpy as np
from generator import DataGenerator
import tensorflow
KERAS_BACKEND=tensorflow
import keras
from keras.preprocessing import sequence
from keras.models import Sequential, Model
from keras import optimizers
from keras.layers import Dense, Dropout, Activation, Flatten, Input
from keras.layers import Conv1D, AveragePooling1D, MaxPooling1D
from keras.layers.merge import concatenate
from keras.optimizers import SGD
import os
import sys
from itertools import chain
#import matplotlib.pyplot as plt
from functools import reduce
from keras.callbacks import EarlyStopping,ModelCheckpoint
from sklearn.utils import class_weight
And in order to load the model, I imported:
from keras.models import model_from_json
after which I got the error that I told you about. And then I changed it to:
from tensorflow.keras.models import model_from_json
And the error persisted.
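One hedged observation on this: the ragged TypeError typically means the JSON was written by a newer tf.keras, whose InputLayer config includes a ragged field, while the Keras doing the loading predates that argument. Since the error persisted with the tensorflow.keras import and the installed version is 2.1.6-tf, upgrading TensorFlow on the loading side and keeping save and load in the same package is the usual cure. A minimal sketch, assuming a recent tensorflow.keras on both sides:
from tensorflow.keras.models import model_from_json

# Save and load must use the same Keras implementation, with the loading
# side at least as new as the version that wrote the JSON
with open('/home/models/final_model.json', 'r') as json_file:
    model = model_from_json(json_file.read())
model.load_weights('/home/models/final_model.weights.hdf5')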
I trained my model and saved it in .h5 format, fine-tuning the MobileNet ImageNet model with its early layers frozen.
Loading the model and trying a prediction raises the error ValueError: You are trying to load a weight file containing 58 layers into a model with 55 layers.
Training code :
# coding: utf-8
# In[1]:
import pandas as pd
import numpy as np
import os
import keras
import matplotlib.pyplot as plt
from keras.layers import Dense,GlobalAveragePooling2D
from keras.applications import MobileNet
from keras.preprocessing import image
from keras.applications.mobilenet import preprocess_input
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Model
from keras.optimizers import Adam
# In[2]:
base_model=MobileNet(weights='imagenet',include_top=False) #imports the mobilenet model and discards the last 1000 neuron layer.
x=base_model.output
x=GlobalAveragePooling2D()(x)
x=Dense(1024,activation='relu')(x) #we add dense layers so that the model can learn more complex functions and classify for better results.
x=Dense(1024,activation='relu')(x) #dense layer 2
x=Dense(512,activation='relu')(x) #dense layer 3
preds=Dense(2,activation='softmax')(x) #final layer with softmax activation
# In[3]:
model=Model(inputs=base_model.input,outputs=preds)
#specify the inputs
#specify the outputs
#now a model has been created based on our architecture
# In[4]:
for layer in model.layers[:20]:
    layer.trainable=False
for layer in model.layers[20:]:
    layer.trainable=True
# In[5]:
train_datagen=ImageDataGenerator(preprocessing_function=preprocess_input) #included in our dependencies
train_generator=train_datagen.flow_from_directory('./train/',  # this is where you specify the path to the main data folder
                                                  target_size=(224,224),
                                                  color_mode='rgb',
                                                  batch_size=64,
                                                  class_mode='categorical',
                                                  shuffle=True)
# In[33]:
model.compile(optimizer='Adam',loss='categorical_crossentropy',metrics=['accuracy'])
# Adam optimizer
# loss function will be categorical cross entropy
# evaluation metric will be accuracy
step_size_train=train_generator.n//train_generator.batch_size
model.fit_generator(generator=train_generator,
                    steps_per_epoch=step_size_train,
                    epochs=10)
# serialize model to JSON
model_json = model.to_json()
with open("mobilenet_2.json", "w") as json_file:
json_file.write(model_json)
# serialize weights to HDF5
model.save_weights("mobilenet_2.h5")
print("Saved model to disk")
Prediction code:
import keras
from keras import backend as K
from keras.layers.core import Dense, Activation
from keras.optimizers import Adam
from keras.metrics import categorical_crossentropy
from keras.preprocessing.image import ImageDataGenerator
from keras.preprocessing import image
from keras.models import Model
from keras.applications import imagenet_utils
from keras.layers import Dense,GlobalAveragePooling2D
from keras.applications import MobileNet
from keras.applications.mobilenet import preprocess_input
import numpy as np
from keras.optimizers import Adam
from keras.models import load_model
model = load_model("mobilenet_1.h5")
#mobile = keras.applications.mobilenet.MobileNet(weights="imagenet")
def prepare_image(file):
    img_path = ''
    img = image.load_img("/home/christie/mobilenet/transfer-learning/" + file, target_size=(224, 224))
    img_array = image.img_to_array(img)
    img_array_expanded_dims = np.expand_dims(img_array, axis=0)
    return keras.applications.mobilenet.preprocess_input(img_array_expanded_dims)
'''
lookup_list = ["banana","banana_palenkodan","banana_red","banana_nendran","banana_karpooravalli"]
#print(lookup_list)
if ans not in lookup_list:
    print("Not found")
    return "[None]"
'''
preprocessed_image = prepare_image('test.jpg')
predictions = model.predict(preprocessed_image)
results = imagenet_utils.decode_predictions(predictions)
print(results)
Error log :
ValueError: You are trying to load a weight file containing 58 layers
into a model with 55 layers.
The model is converted to JSON format and written to mobilenet_2.json in the local directory. The network weights are written to mobilenet_2.h5 in the local directory.
Similarly, you have to load the JSON and its corresponding weights.
Try editing as below:
# serialize model to JSON
model_json = model.to_json()
with open("mobilenet_2.json", "w") as json_file:
json_file.write(model_json)
# serialize weights to HDF5
model.save_weights("mobilenet_2.h5")
print("Saved model to disk")
# later...
# load json and create model
from keras.models import model_from_json
json_file = open('mobilenet_2.json', 'r')
loaded_model_json = json_file.read()
json_file.close()
loaded_model = model_from_json(loaded_model_json)
# load weights into new model
loaded_model.load_weights("mobilenet_2.h5")
print("Loaded model from disk")
You are saving just the weights but trying to load the model architecture and weights. If you would like to save weights and model architecture together and later load, then try the below code -
# save model and architecture to single file
model.save("model.h5")
# later...
# load model
model = load_model('model.h5')
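One more note on the prediction script: imagenet_utils.decode_predictions expects the 1000-class ImageNet output, so it will reject the 2-class softmax head built above. Mapping the argmax to your own class names is the usual replacement; a small sketch, where the class_names list is hypothetical and must match train_generator.class_indices:
import numpy as np

class_names = ['class_0', 'class_1']  # hypothetical; order must match train_generator.class_indices
predictions = model.predict(preprocessed_image)
print(class_names[int(np.argmax(predictions[0]))])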
I am trying to use PyTorch to run classification on a dataset of images of cats and dogs. In my code I am so far downloading the data and going into the folder train, which has two folders in it called "cats" and "dogs." I am then trying to load this data into a dataloader and iterate through batches, but it is giving me an error I don't understand at the iteration step.
Since it is Google Colab, I have code in there for downloading data and installing libraries. Any other advice on my code so far would be appreciated as well.
!pip install torch
!pip install torchvision
from __future__ import print_function, division
import os
import torch
import pandas as pd
import numpy as np
# For showing and formatting images
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
# For importing datasets into pytorch
import torchvision.datasets as dataset
# Used for dataloaders
import torch.utils.data as data
# For pretrained resnet34 model
import torchvision.models as models
# For optimisation function
import torch.nn as nn
import torch.optim as optim
!wget http://files.fast.ai/data/dogscats.zip
!unzip dogscats.zip
batch_size = 256
train_raw = dataset.ImageFolder(PATH+"train", transform=transforms.ToTensor())
train_loader = data.DataLoader(train_raw, batch_size=batch_size, shuffle=True)
for batch_idx, (data, target) in enumerate(train_loader):
    print("Data: ", batch_idx)
The error comes up on the last lines and is below:
RuntimeErrorTraceback (most recent call last)
<ipython-input-66-c32dd0c1b880> in <module>()
----> 1 for batch_idx, (data, target) in enumerate(train_loader):
2 print("Data: ", batch_idx)
3
/usr/local/lib/python2.7/dist-packages/torch/utils/data/dataloader.pyc in __next__(self)
257 if self.num_workers == 0: # same-process loading
258 indices = next(self.sample_iter) # may raise StopIteration
--> 259 batch = self.collate_fn([self.dataset[i] for i in indices])
260 if self.pin_memory:
261 batch = pin_memory_batch(batch)
/usr/local/lib/python2.7/dist-packages/torch/utils/data/dataloader.pyc in default_collate(batch)
133 elif isinstance(batch[0], collections.Sequence):
134 transposed = zip(*batch)
--> 135 return [default_collate(samples) for samples in transposed]
136
137 raise TypeError((error_msg.format(type(batch[0]))))
/usr/local/lib/python2.7/dist-packages/torch/utils/data/dataloader.pyc in default_collate(batch)
110 storage = batch[0].storage()._new_shared(numel)
111 out = batch[0].new(storage)
--> 112 return torch.stack(batch, 0, out=out)
113 elif elem_type.__module__ == 'numpy' and elem_type.__name__ != 'str_' \
114 and elem_type.__name__ != 'string_':
/usr/local/lib/python2.7/dist-packages/torch/functional.pyc in stack(sequence, dim, out)
62 inputs = [t.unsqueeze(dim) for t in sequence]
63 if out is None:
---> 64 return torch.cat(inputs, dim)
65 else:
66 return torch.cat(inputs, dim, out=out)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 400 and 487 in dimension 2 at /pytorch/torch/lib/TH/generic/THTensorMath.c:2897
Thanks
I think the main problem was the images being of different sizes. I may have understood ImageFolder another way, but I think you don't need labels for the images if the directory structure is as PyTorch specifies, and PyTorch will figure out the labels for you.
I would also add a transform that automatically resizes every image from the folder, such as:
normalize = transforms.Normalize(
    mean=[0.485, 0.456, 0.406],
    std=[0.229, 0.224, 0.225]
)
transform = transforms.Compose([
    transforms.Resize((224, 224)),  # Resize operates on PIL images, so it comes before ToTensor
    transforms.ToTensor(),
    normalize,
])
Also, you can use other tricks to make your DataLoader much faster, such as setting the batch_size and the number of CPU workers:
testloader = DataLoader(testset, batch_size=16,
                        shuffle=False, num_workers=4)
I think this will make your pipeline much faster.
I see two problems in your code. First, you import torch.utils.data as data and then shadow that name with the loop variable data in the DataLoader loop; keep the imported module and your variable in separate namespaces (see the sketch below). Second, I think this error could be because of the different sizes of the data returned by the DataLoader (images) and the labels: as you can see, the concatenation fails because the label size and the number of images in the folder do not match. Hope this helps.
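A small sketch of that renaming, assuming the asker's train_loader from above:
import torch.utils.data as data

# The loop variable used to be `data`, which shadowed the imported module;
# renaming it (here to `images`) keeps the two names separate
for batch_idx, (images, target) in enumerate(train_loader):
    print("Data: ", batch_idx)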
I think I was wrong in my comment to Manoj Acharya: the problem was the batch_size being passed to the DataLoader. I read the source below, and it seems you can't batch images of different sizes together:
https://medium.com/@yvanscher/pytorch-tip-yielding-image-sizes-6a776eb4115b
So in my code, after renaming the data variable Manoj points out, I changed the batch_size to 1 and the program stopped failing. I want to process it in batches, though, so I added a further CenterCrop() transform to bring all images to the same size. Below is my new code:
!pip install torch
!pip install torchvision
from __future__ import print_function, division
import os
import torch
import pandas as pd
import numpy as np
# For showing and formatting images
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
# For importing datasets into pytorch
import torchvision.datasets as dataset
# Used for dataloaders
from torch.utils.data import DataLoader
# For pretrained resnet34 model
import torchvision.models as models
# For optimisation function
import torch.nn as nn
import torch.optim as optim
# For turning data into tensors
import torchvision.transforms as transforms
!wget http://files.fast.ai/data/dogscats.zip
!unzip dogscats.zip
batch_size = 256
sz = 224
train_raw = dataset.ImageFolder(PATH+"train", transform=transforms.Compose([transforms.CenterCrop(sz),transforms.ToTensor()]))
train_loader = DataLoader(train_raw,batch_size=batch_size, shuffle=True)
for batch_idx, (images, target) in enumerate(train_loader):
    print("Data: ", batch_idx)
Thanks
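One hedged follow-up on the crop: CenterCrop(sz) assumes every source image is at least sz pixels on each side, while Resize guarantees a uniform size regardless of the original dimensions. An alternative transform sketch:
# Resize (rather than crop) every image to exactly sz x sz before tensor conversion
transform = transforms.Compose([transforms.Resize((sz, sz)), transforms.ToTensor()])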