MNIST object has no attribute data - mnist

I am trying to run the code below, but I get the error "MNIST object has no attribute data". The error is raised by the line "mnist_train_set.data.view(-1, 1, 28, 28).float()". Can someone shed some light on how to fix this? Thanks.
import torch
from torchvision import datasets
...
mnist_train_set = datasets.MNIST(data_dir + '/mnist/', train = True, download = True)
mnist_test_set = datasets.MNIST(data_dir + '/mnist/', train = False, download = True)
train_input = mnist_train_set.data.view(-1, 1, 28, 28).float()
train_target = mnist_train_set.targets
test_input = mnist_test_set.data.view(-1, 1, 28, 28).float()
test_target = mnist_test_set.targets

I have run into this same error - it's a torchvision versioning issue.
In the current version of torchvision (0.4.0), the dataset x and y properties are called "data" and "targets".
In the previous version of torchvision (0.3.0), the dataset x and y properties were called either "train_data" and "train_labels", or "test_data" and "test_labels" (depending on which you specified to be loaded).
To fix your code, either upgrade to the latest torchvision or change the code to use the property names of the previous version.
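If the code has to run under either naming scheme, one option is to look the attributes up at run time. This is a minimal sketch that assumes only the attribute names mentioned above; get_data_and_targets and the two stand-in classes are hypothetical names used for illustration and are not part of torchvision:

```python
def get_data_and_targets(dataset):
    """Return (data, targets) using whichever attribute names the dataset exposes."""
    candidates = (
        ("data", "targets"),             # newer naming
        ("train_data", "train_labels"),  # older naming, training split
        ("test_data", "test_labels"),    # older naming, test split
    )
    for data_name, target_name in candidates:
        if hasattr(dataset, data_name) and hasattr(dataset, target_name):
            return getattr(dataset, data_name), getattr(dataset, target_name)
    raise AttributeError("dataset exposes none of the known data/target attribute pairs")


class OldStyleTrainSet:
    """Hypothetical stand-in for a dataset object with the older attribute names."""
    train_data = [[0, 1], [1, 0]]
    train_labels = [7, 3]


class NewStyleSet:
    """Hypothetical stand-in for a dataset object with the newer attribute names."""
    data = [[1, 1], [0, 0]]
    targets = [2, 5]


old_data, old_targets = get_data_and_targets(OldStyleTrainSet())
new_data, new_targets = get_data_and_targets(NewStyleSet())
print(old_targets, new_targets)
```

With a real dataset this would be called as train_input, train_target = get_data_and_targets(mnist_train_set), followed by the same .view(-1, 1, 28, 28).float() as in the question.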

Here is the solution: instead of data and targets, use train_data for train_input, train_labels for train_target, test_data for test_input, and test_labels for test_target. I ran the following code without errors.
train_input = mnist_train_set.train_data.view(-1, 1, 28, 28).float()
train_target = mnist_train_set.train_labels
test_input = mnist_test_set.test_data.view(-1, 1, 28, 28).float()
test_target = mnist_test_set.test_labels


Cannot export PyTorch model to ONNX

I am trying to convert a pre-trained torch model to ONNX, but receive the following error:
RuntimeError: step!=1 is currently not supported
I'm trying this on a pre-trained colorization model: https://github.com/richzhang/colorization
Here is the code I ran in Google Colab:
!git clone https://github.com/richzhang/colorization.git
cd colorization/
import torch
import colorizers
model = colorizer_siggraph17 = colorizers.siggraph17(pretrained=True).eval()
input_names = [ "input" ]
output_names = [ "output" ]
dummy_input = torch.randn(1, 1, 256, 256, device='cpu')
torch.onnx.export(model, dummy_input, "test_converted_model.onnx", verbose=True,
                  input_names=input_names, output_names=output_names)
I appreciate any help :)
UPDATE 1: @Proko's suggestion solved the ONNX export issue. Now I have a new, possibly related problem when I try to convert the ONNX model to TensorRT. I get the following error:
[TensorRT] ERROR: Network must have at least one output
Here is the code I used:
import torch
import pycuda.driver as cuda
import pycuda.autoinit
import tensorrt as trt
import onnx
TRT_LOGGER = trt.Logger()
def build_engine(onnx_file_path):
    # initialize TensorRT engine and parse ONNX model
    builder = trt.Builder(TRT_LOGGER)
    builder.max_workspace_size = 1 << 25
    builder.max_batch_size = 1
    if builder.platform_has_fast_fp16:
        builder.fp16_mode = True
    network = builder.create_network()
    parser = trt.OnnxParser(network, TRT_LOGGER)
    # parse ONNX
    with open(onnx_file_path, 'rb') as model:
        print('Beginning ONNX file parsing')
        parser.parse(model.read())
    print('Completed parsing of ONNX file')
    # generate TensorRT engine optimized for the target platform
    print('Building an engine...')
    engine = builder.build_cuda_engine(network)
    context = engine.create_execution_context()
    print("Completed creating Engine")
    return engine, context
ONNX_FILE_PATH = 'siggraph17.onnx' # Exported using the code above
engine,_ = build_engine(ONNX_FILE_PATH)
I tried to force the build_engine function to use the output of the network by:
network.mark_output(network.get_layer(network.num_layers-1).get_output(0))
but it did not work.
I appreciate any help!
As I mentioned in a comment, this happens because torch.onnx only supports slicing with step = 1, but the model contains slices with a step of 2:
self.model2(conv1_2[:,:,::2,::2])
For now, your only option is to rewrite the slicing using other ops. You can use arange and reshape to obtain the proper indices. Consider the following function, "step-less-arange" (I hope it is generic enough for anyone with a similar problem):
def sla(x, step):
    diff = x % step
    x += (diff > 0)*(step - diff) # add length to be able to reshape properly
    return torch.arange(x).reshape((-1, step))[:, 0]
Usage:
>>> sla(11, 3)
tensor([0, 3, 6, 9])
Now you can replace every slice like this:
conv2_2 = self.model2(conv1_2[:,:,self.sla(conv1_2.shape[2], 2),:][:,:,:, self.sla(conv1_2.shape[3], 2)])
NOTE: you should optimize this. The indices are recomputed on every call, so it might be wise to pre-compute them.
I have tested it with my fork of the repo and I was able to save the model:
https://github.com/prokotg/colorization
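On the note above about pre-computing the indices: one way to avoid recomputing them is to cache the result per (length, step) pair. The sketch below is plain Python, without torch, so that the index arithmetic is easy to check; strided_indices is a hypothetical name, and in the model the returned tuple would still have to be turned into an index tensor (for example with torch.tensor).

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def strided_indices(length, step):
    """Indices 0, step, 2*step, ... below length, computed once per (length, step)."""
    # Round length up to a multiple of step, as sla does before reshaping.
    padded = length + (-length % step)
    # Each row mirrors one row of arange(padded).reshape((-1, step)).
    rows = [list(range(start, start + step)) for start in range(0, padded, step)]
    # Keep the first column, like [:, 0].
    return tuple(row[0] for row in rows)


print(strided_indices(11, 3))                # (0, 3, 6, 9), same values as sla(11, 3)
print(strided_indices(11, 3))                # second call is served from the cache
print(strided_indices.cache_info().hits)     # 1
```

The result is the same set of indices that a strided slice [::step] would select, which is why it can stand in for the unsupported slice.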
What worked for me was adding opset_version=11 to the torch.onnx.export call.
I first tried opset_version=10, but the API suggested 11, and that works.
So your call should be:
torch.onnx.export(model, dummy_input, "test_converted_model.onnx", verbose=True, opset_version=11,
                  input_names=input_names, output_names=output_names)

XGBoost: OS Error : [WinError -529697949] Windows Error 0xe06d7363 running XGBClassifier with large dataset, CPU Mode

I am getting this error while trying to run XGBClassifier with GridSearchCV for hyperparameter optimization. I have seen this issue opened on GitHub, but it was closed and marked resolved with no solution provided. Has anyone actually found a solution to this error?
My dataset:
X = np array with 350000 rows and 1715 columns (after one hot encoding)
y = 350000 rows and 1 column (target)
My Code:
X = train.drop(['Breakage'], axis=1,) #features (read from dataframe)
y = train['Breakage'] #target (read from dataframe)
X= X.as_matrix() #convert to np array
y= y.as_matrix() #convert to np array
y = np.reshape(y,(-1, 1)) #reshape array
X = X.astype('uint8') # change dtype to avoid overcommit error on Windows
y = y.astype('uint8') # change dtype to avoid overcommit error on Windows
# define estimators and learning rate
model = XGBClassifier()
n_estimators = [100, 200, 300, 400, 500]
learning_rate = [0.0001, 0.001, 0.01, 0.1]
# GridSearchCV
param_grid = dict(learning_rate=learning_rate, n_estimators=n_estimators)
kfold = StratifiedKFold(n_splits=5, shuffle=True, random_state=7)
grid_search = GridSearchCV(model, param_grid, n_jobs=-1, cv=kfold)
grid_result = grid_search.fit(X,y)
The output Error:
OSError: [WinError -529697949] Windows Error 0xe06d7363
Can anyone please tell me what I am doing wrong?
I had the same problem as you.
This answer may help you:
Windows Error using XGBoost with python
Changing the sklearn API call 'xgb.fit()' to 'xgb.train()' will solve this problem.
I had this issue and it was something to do with the installation of xgboost.
I was using code that already worked before, and although there was a lot of memory usage, it is not a memory error.
What worked for me was:
1. Close the IDE
2. Uninstall xgboost
3. Upgrade pip (not sure if necessary)
4. Reinstall xgboost
5. Rebuild and install the Python project
6. Run again
This was inspired by the following question:
Windows Error 0xe06d7363 when using Cross Validation XGboost
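The uninstall, pip upgrade and reinstall steps above can be scripted. This is only a sketch: it assumes xgboost was installed with pip into the interpreter that runs the script, and reinstall_commands and reinstall are hypothetical helper names. By default it prints the commands instead of running them.

```python
import subprocess
import sys


def reinstall_commands(package):
    """Build the pip commands for: uninstall, upgrade pip, reinstall."""
    pip = [sys.executable, "-m", "pip"]
    return [
        pip + ["uninstall", "-y", package],
        pip + ["install", "--upgrade", "pip"],
        pip + ["install", package],
    ]


def reinstall(package, dry_run=True):
    """Print the commands (dry run) or execute them one after another."""
    commands = reinstall_commands(package)
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)  # stop at the first failing step
    return commands


reinstall("xgboost")  # dry run: only prints what would be executed
```

Calling reinstall("xgboost", dry_run=False) would actually run the three commands; closing the IDE and rebuilding the project remain manual steps.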

cannot get the same output as the pytorch model with openvino

I have a strange problem in trying to use OpenVino.
I have exported my pytorch model to onnx and then imported it to OpenVino using the following command:
python /opt/intel/openvino/deployment_tools/model_optimizer/mo.py --input_model ~/Downloads/unet2d.onnx --disable_resnet_optimization --disable_fusing --disable_gfusing --data_type=FP32
So for the test case, I have disabled the optimizations.
Now, using the sample python applications, I run inference using the model as follows:
from os import path
from openvino.inference_engine import IENetwork, IECore
import numpy as np
model_xml = path.expanduser('model.xml')
model_bin = path.expanduser('model.bin')
ie = IECore()
net = IENetwork(model=model_xml, weights=model_bin)
input_blob = next(iter(net.inputs))
out_blob = next(iter(net.outputs))
net.batch_size = 1
exec_net = ie.load_network(network=net, device_name='CPU')
np.random.seed(0)
x = np.random.randn(1, 2, 256, 256) # expected input shape
res = exec_net.infer(inputs={input_blob: x})
res = res[out_blob]
The problem is that this seems to output something completely different from my onnx or the pytorch model.
Additionally, I realized that I do not even have to pass an input, so if I do something like:
x = None
res = exec_net.infer(inputs={input_blob: x})
This still returns me the same output! So it seems to suggest that somehow my input is getting ignored or something like that?
Could you try again without --disable_resnet_optimization --disable_fusing --disable_gfusing, i.e. leaving the optimizations enabled?

H2O Target Mean Encoder "frames are being sent in the same order" ERROR

I am following the H2O example to run target mean encoding in Sparkling Water (Sparkling Water 2.4.2 and H2O 3.22.04). Everything in the following paragraph runs fine:
from h2o.targetencoder import TargetEncoder
# change label to factor
input_df_h2o['label'] = input_df_h2o['label'].asfactor()
# add fold column for Target Encoding
input_df_h2o["cv_fold_te"] = input_df_h2o.kfold_column(n_folds = 5, seed = 54321)
# find all categorical features
cat_features = [k for (k,v) in input_df_h2o.types.items() if v in ('string',)]
# convert string to factor
for i in cat_features:
    input_df_h2o[i] = input_df_h2o[i].asfactor()
# target mean encode
targetEncoder = TargetEncoder(x= cat_features, y = y, fold_column = "cv_fold_te", blending_avg=True)
targetEncoder.fit(input_df_h2o)
But when I run the transform on the same data set that was used to fit the Target Encoder (see code below):
ext_input_df_h2o = targetEncoder.transform(frame=input_df_h2o,
                                           holdout_type="kfold", # mean is calculated on out-of-fold data only; loo means leave one out
                                           is_train_or_valid=True,
                                           noise = 0, # determines if random noise should be added to the target average
                                           seed=54321)
I get an error like this:
Traceback (most recent call last):
File "/tmp/zeppelin_pyspark-6773422589366407956.py", line 331, in <module>
exec(code)
File "<stdin>", line 5, in <module>
File "/usr/lib/envs/env-1101-ver-1619-a-4.2.9-py-3.5.3/lib/python3.5/site-packages/h2o/targetencoder.py", line 97, in transform
assert self._encodingMap.map_keys['string'] == self._teColumns
AssertionError
I found the assertion in the source code: http://docs.h2o.ai/h2o/latest-stable/h2o-py/docs/_modules/h2o/targetencoder.html
But how do I fix this issue? It is the same table that was used to run the fit.
The issue occurs because you are trying to encode multiple categorical features at once. I think this is a bug in H2O, but you can work around it by putting the transformer in a for loop that iterates over all the categorical column names.
import numpy as np
import pandas as pd
import h2o
from h2o.targetencoder import TargetEncoder
h2o.init()
df = pd.DataFrame({
    'x_0': ['a'] * 5 + ['b'] * 5,
    'x_1': ['c'] * 9 + ['d'] * 1,
    'x_2': ['a'] * 3 + ['b'] * 7,
    'y_0': [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]
})
hf = h2o.H2OFrame(df)
hf['cv_fold_te'] = hf.kfold_column(n_folds=2, seed=54321)
hf['y_0'] = hf['y_0'].asfactor()
cat_features = ['x_0', 'x_1', 'x_2']
for item in cat_features:
    target_encoder = TargetEncoder(x=[item], y='y_0', fold_column = 'cv_fold_te')
    target_encoder.fit(hf)
    hf = target_encoder.transform(frame=hf, holdout_type='kfold',
                                  seed=54321, noise=0.0)
hf
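For readers unfamiliar with holdout_type='kfold', the idea is that each row is encoded with the mean target of rows that share its category but lie in other folds. The following plain-Python sketch illustrates that idea only; it is not H2O's implementation (it has no blending or noise), and kfold_target_encode and the toy data are hypothetical:

```python
from collections import defaultdict


def kfold_target_encode(categories, targets, folds):
    """Encode each row with the mean target of same-category rows in OTHER folds."""
    prior = sum(targets) / len(targets)  # global mean, fallback when no out-of-fold rows exist
    fold_sum = defaultdict(float)   # target sum per (category, fold)
    fold_count = defaultdict(int)   # row count per (category, fold)
    cat_sum = defaultdict(float)    # target sum per category
    cat_count = defaultdict(int)    # row count per category
    for c, t, f in zip(categories, targets, folds):
        fold_sum[(c, f)] += t
        fold_count[(c, f)] += 1
        cat_sum[c] += t
        cat_count[c] += 1
    encoded = []
    for c, f in zip(categories, folds):
        n = cat_count[c] - fold_count[(c, f)]  # same category, other folds
        s = cat_sum[c] - fold_sum[(c, f)]
        encoded.append(s / n if n > 0 else prior)
    return encoded


categories = ['a', 'a', 'a', 'b', 'b', 'b']
targets = [1, 1, 0, 0, 0, 1]
folds = [0, 1, 1, 0, 1, 0]
print(kfold_target_encode(categories, targets, folds))
# [0.5, 1.0, 1.0, 0.0, 0.5, 0.0]
```

Because a row's own fold never contributes to its encoding, the encoded value does not leak that row's target into the feature.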
Thanks everyone for letting us know. The assertion was a precaution, as I was not sure whether the column order could change. The rest of the code was written with this assumption in mind and is therefore safe to use with a changed order anyway, but the assertion was left in and forgotten. I added a test and removed the assertion. The fix is now merged and should be available in the upcoming fix release. 0xdata.atlassian.net/browse/PUBDEV-6474

Using a `tf.Tensor` as a Python `bool` is not allowed. Use `if t is not None:` instead of `if t:` to test if a tensor is defined,

I am trying to define func(x) in order to use the genetic algs library here:
https://github.com/bobirdmi/genetic-algorithms/tree/master/examples
However, when I try to use sga.init_random_population(population_size, params, interval), the code complains that I am using tf.Tensors as Python bools.
I only reference one bool in the entire code (elitism), so I have no idea why this error is showing up. I asked others who have used sga.init_..., and my inputs and setup are fine. Any suggestions would be greatly appreciated.
Full traceback:
Traceback (most recent call last):
File "C:\Users\Eric\eclipse-workspace\hw1\ga2.py", line 74, in <module>
sga.init_random_population(population_size, params, interval)
File "C:\Program Files\Python36\lib\site-packages\geneticalgs\real_ga.py", line 346, in init_random_population
self._sort_population()
File "C:\Program Files\Python36\lib\site-packages\geneticalgs\standard_ga.py", line 386, in _sort_population
self.population.sort(key=lambda x: x.fitness_val, reverse=True)
File "C:\Program Files\Python36\lib\site-packages\tensorflow\python\framework\ops.py", line 671, in __bool__
raise TypeError("Using a `tf.Tensor` as a Python `bool` is not allowed. "
TypeError: Using a `tf.Tensor` as a Python `bool` is not allowed. Use `if t is not None:` instead of `if t:` to test if a tensor is defined, and use TensorFlow ops such as tf.cond to execute subgraphs conditioned on the value of a tensor.
Code:
import hw1
#import matplotlib
from geneticalgs import BinaryGA, RealGA, DiffusionGA, MigrationGA
#import numpy as np
#import csv
#import time
#import pickle
#import math
#import matplotlib.pyplot as plt
from keras.optimizers import Adam
from hw1 import x_train, y_train, x_test, y_test
from keras.losses import mean_squared_error
#import tensorflow as tf
from keras.models import Sequential
from keras.layers import Dense, Dropout
# GA standard settings
generation_num = 50
population_size = 16
elitism = True
selection = 'rank'
tournament_size = None # in case of tournament selection
mut_type = 1
mut_prob = 0.05
cross_type = 1
cross_prob = 0.95
optim = 'min' # minimize or maximize a fitness value? May be 'min' or 'max'.
interval = (-1, 1)
# Migration GA settings
period = 5
migrant_num = 3
cloning = True
def func(x):
    #dimensions of weights and biases
    #layer0weights = [10][23]
    #layer0biases = [10]
    #layer1weights = [10][20]
    #layer1biases = [20]
    #layer2weights = [1][20]
    #layer2biases = [1]
    #split up x for weights and biases
    lay0 = x[0:230]
    bias0 = x[230:240]
    lay1 = x[240:440]
    bias1 = x[440:460]
    lay2 = x[460:480]
    bias2 = x[480:481]
    #fit to the shape of the actual model
    lay0 = lay0.reshape(23,10)
    bias0 = bias0.reshape(10,)
    lay1 = lay1.reshape(10,20)
    bias1 = bias1.reshape(20,)
    lay2 = lay2.reshape(20,1)
    bias2 = bias2.reshape(1,)
    #set the newly shaped object to layers
    hw1.model.layers[0].set_weights([lay0, bias0])
    hw1.model.layers[1].set_weights([lay1, bias1])
    hw1.model.layers[2].set_weights([lay2, bias2])
    res = hw1.model.predict(x_train)
    error = mean_squared_error(res,y_train)
    return error
ga_model = Sequential()
ga_model.add(Dense(10, input_dim=23, activation='relu'))
ga_model.add(Dense(20, activation='relu'))
ga_model.add(Dense(1, activation='sigmoid'))
sga = RealGA(func, optim=optim, elitism=elitism, selection=selection,
             mut_type=mut_type, mut_prob=mut_prob,
             cross_type=cross_type, cross_prob=cross_prob)
params = 481
sga.init_random_population(population_size, params, interval)
optimal = sga.best_solution[0]
predict = func(optimal)
print(predict)
TensorFlow generates a computational graph of operations to be executed in a TensorFlow session.
geneticalgs.RealGA.init_random_population is an operation that uses numpy.random.uniform to generate a numpy array.
The generated population being a Tensor object could mean either of the following:
- numpy.random.uniform invoked in geneticalgs.RealGA.init_random_population was decorated to return Tensors
- numpy.random.uniform was added to the computation graph to be executed in a session
I would try executing the program eagerly by enabling eager execution:
tf.enable_eager_execution()
You can also, in a way, eagerly execute only the parts that you care about:
# placeholders get their own names so they do not shadow the `interval` tuple
size_ph = tf.placeholder(tf.int64)
dim_ph = tf.placeholder(tf.int64)
interval_ph = tf.placeholder(tf.int64, shape=(2,))
init_random_population = tf.py_func(
    sga.init_random_population, [size_ph, dim_ph, interval_ph], [])
with tf.Session() as session:
    session.run(
        init_random_population,
        {size_ph: population_size, dim_ph: params, interval_ph: interval})
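The traceback in the question ends inside self.population.sort(...), which compares fitness values with <. A likely cause, which is an assumption and not confirmed in this thread, is that func returns a symbolic tensor, because keras.losses.mean_squared_error builds a graph op rather than computing a number. The sketch below uses a hypothetical FakeTensor stand-in, not TensorFlow itself, to show why sorting such objects raises this kind of TypeError while plain floats sort without problems:

```python
class FakeTensor:
    """Hypothetical stand-in for a symbolic tensor: comparisons return another
    FakeTensor, and converting one to a Python bool is refused."""

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        return FakeTensor(self.value < other.value)

    def __bool__(self):
        raise TypeError("Using a FakeTensor as a Python bool is not allowed.")


def sort_by_fitness(fitness_values):
    # list sorting calls `<` on the elements and needs a real bool back
    return sorted(fitness_values, reverse=True)


symbolic = [FakeTensor(0.3), FakeTensor(0.1)]
try:
    sort_by_fitness(symbolic)
    failed = False
except TypeError as exc:
    failed = True
    print("sorting symbolic values failed:", exc)

concrete = [float(t.value) for t in symbolic]  # plain Python numbers
print(sort_by_fitness(concrete))
```

Under that assumption, returning a plain number from func, for example by computing the mean squared error with NumPy on the output of predict(), would avoid the comparison problem.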
