Missing method NeuralNet.train_split() in lasagne

I am learning to work with Python and Lasagne. I have the following installed on my PC:
python 3.4.3
theano 0.9.0
lasagne 0.2.dev1
and also six, scipy and numpy. When I call net.fit(), the stack trace shows a call to train_split(X, y, self), which, I guess, should split the samples into a training set and a validation set (both the inputs X and the outputs y).
But there is no method train_split(X, y, self); there is only a float field train_split, which I assume is the ratio between the training and validation set sizes. I get the following error:
Traceback (most recent call last):
File "...\workspaces\python\cnn\dl_tutorial\lasagne\Test.py", line 72, in <module>
net = net1.fit(X[0:10,:,:,:],y[0:10])
File "...\Python34\lib\site-packages\nolearn\lasagne\base.py", line 544, in fit
self.train_loop(X, y, epochs=epochs)
File "...\Python34\lib\site-packages\nolearn\lasagne\base.py", line 554, in train_loop
X_train, X_valid, y_train, y_valid = self.train_split(X, y, self)
TypeError: 'float' object is not callable
What could be wrong or missing? Any suggestions? Thank you very much.

SOLVED
In previous versions, the input parameter train_split was a number that was used by the method of the same name. In nolearn 0.6.0, it is a callable object that can implement its own logic to split the data. So instead of passing a float to the train_split parameter, I have to pass a callable instance (the default one is TrainSplit), which the training loop calls as self.train_split(X, y, self) to split the data.
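A minimal sketch of the difference, assuming nothing beyond the call shown in the traceback. TinyNet and SimpleSplit are hypothetical stand-ins written for illustration, not nolearn classes; they only mimic how the training loop calls self.train_split(X, y, self).

```python
class SimpleSplit:
    """Hypothetical stand-in for a split callable such as nolearn's TrainSplit."""

    def __init__(self, eval_size):
        self.eval_size = eval_size

    def __call__(self, X, y, net):
        # Hold out the last eval_size fraction of the samples for validation.
        n_valid = int(round(len(X) * self.eval_size))
        n_train = len(X) - n_valid
        return X[:n_train], X[n_train:], y[:n_train], y[n_train:]


class TinyNet:
    """Hypothetical stand-in for NeuralNet: stores train_split and calls it."""

    def __init__(self, train_split):
        self.train_split = train_split

    def fit(self, X, y):
        # Same call shape as in the traceback above.
        return self.train_split(X, y, self)


X = list(range(10))
y = [value % 2 for value in X]

# Old style: a float is stored and then called, which fails.
try:
    TinyNet(train_split=0.2).fit(X, y)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# New style: a callable is stored, so the call succeeds.
X_train, X_valid, y_train, y_valid = TinyNet(train_split=SimpleSplit(eval_size=0.2)).fit(X, y)
print(error_message)
print(len(X_train), len(X_valid))
```

With nolearn 0.6.0 itself, the corresponding change should be to import TrainSplit from nolearn.lasagne and pass train_split=TrainSplit(eval_size=0.2) to NeuralNet instead of train_split=0.2.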

Related

AttributeError: 'numpy.ndarray' object has no attribute 'unsqueeze'

I'm running training code that uses PyTorch and NumPy.
This is the plot_example function:
def plot_example(low_res_folder, gen):
    files=os.listdir(low_res_folder)
    gen.eval()
    for file in files:
        image=Image.open("test_images/" + file)
        with torch.no_grad():
            upscaled_img=gen(
                config1.both_transform(image=np.asarray(image))["image"]
                .unsqueeze(0)
                .to(config1.DEVICE)
            )
        save_image(upscaled_img * 0.5 + 0.5, f"saved/{file}")
    gen.train()
The problem I have is that the .unsqueeze(0) call raises this error:
File "E:\Downloads\esrgan-tf2-masteren\modules\train1.py", line 58, in train_fn
plot_example("test_images/", gen)
File "E:\Downloads\esrgan-tf2-masteren\modules\utils1.py", line 46, in plot_example
config1.both_transform(image=np.asarray(image))["image"]
AttributeError: 'numpy.ndarray' object has no attribute 'unsqueeze'
The network is a GAN, and gen() is the generator.
Make sure image is a tensor of shape [batch size, channels, height, width] before it enters any PyTorch layer.
Here you have
image=np.asarray(image)
I would remove this NumPy conversion and keep it a torch.Tensor.
Or, if you really want it to be a NumPy array, call torch.from_numpy() on it right before it enters your generator, so that it is a tensor by the time it gets unsqueezed: https://pytorch.org/docs/stable/generated/torch.from_numpy.html
This function is, of course, only an alternative if you don't want to get rid of the original conversion.
Sarthak Jain
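The torch.from_numpy() route can be sketched as follows. The array here is a made-up stand-in for whatever config1.both_transform returns; its height x width x channels layout and uint8 dtype are assumptions, not something the question states.

```python
import numpy as np
import torch

# Hypothetical stand-in for the transform output: an H x W x C uint8 image.
image_np = np.zeros((24, 32, 3), dtype=np.uint8)

# A NumPy array has no unsqueeze method, which is what the error reports.
numpy_has_unsqueeze = hasattr(image_np, "unsqueeze")

tensor = torch.from_numpy(image_np)   # shares memory with image_np, shape (24, 32, 3)
tensor = tensor.permute(2, 0, 1)      # channels first: (3, 24, 32)
tensor = tensor.float()               # convolution layers expect floating point input
batch = tensor.unsqueeze(0)           # add the batch dimension: (1, 3, 24, 32)

print(numpy_has_unsqueeze, tuple(batch.shape))
```

The resulting batch can then be moved to the target device with .to(...) as in the original code. If both_transform is an Albumentations pipeline (the image=... call suggests so, but that is a guess), ending the pipeline with ToTensorV2 from albumentations.pytorch would return a channels-first tensor directly.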

BERT NER: can't convert CUDA tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first

I want to train my BERT NER model on Colab, but the following error occurs.
Code:
tr_logits = tr_logits.detach().cpu().numpy()
tr_label_ids = torch.masked_select(b_labels, (preds_mask == 1))
tr_batch_preds = np.argmax(tr_logits[preds_mask.squeeze()], axis=1)
tr_batch_labels = tr_label_ids.to(device).numpy()
tr_preds.extend(tr_batch_preds)
tr_labels.extend(tr_batch_labels)
Error:
Using TensorFlow backend.
Saved standardized data to ./data/en/combined/train_combined.txt.
Saved standardized data to ./data/en/combined/dev_combined.txt.
Saved standardized data to ./data/en/combined/test_combined.txt.
Constructed SentenceGetter with 25650 examples.
Constructed SentenceGetter with 8934 examples.
Loaded training and validation data into DataLoaders.
Initialized model and moved it to cuda.
Initialized optimizer and set hyperparameters.
Epoch: 0% 0/5 [00:00<?, ?it/s]Starting training loop.
Epoch: 0% 0/5 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/content/FYP_Presentation/python/main.py", line 102, in <module>
valid_dataloader,
File "/content/FYP_Presentation/python/utils/main_utils.py", line 431, in train_and_save_model
tr_batch_preds = torch.max(tr_logits[preds_mask.squeeze()], axis=1)
File "/usr/local/lib/python3.6/dist-packages/torch/tensor.py", line 412, in __array__
return self.numpy()
TypeError: can't convert CUDA tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
How would I solve this issue?
The first line of your code, tr_logits = tr_logits.detach().cpu().numpy(), already turns tr_logits into a NumPy array. In the line that raises the error:
tr_batch_preds = torch.max(tr_logits[preds_mask.squeeze()], axis=1)
the program first evaluates tr_logits[preds_mask.squeeze()]. Because tr_logits is now a NumPy array, its index preds_mask must also become a NumPy array, so NumPy calls preds_mask.__array__(), which in turn calls preds_mask.numpy(). However, preds_mask is still on the GPU, hence the error.
I'd suggest using either NumPy arrays or PyTorch tensors consistently throughout the program rather than switching back and forth between them.
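One way to follow that advice is to do all masking and argmax work on tensors and convert to NumPy once at the end. This is only a sketch: the batch below is made up, and the shapes (2 sequences, 3 tokens, 4 labels, with a mask of the same leading shape) are assumptions about the asker's data.

```python
import numpy as np
import torch

# Falls back to the CPU when no GPU is available, so the sketch runs anywhere.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-ins for one batch.
torch.manual_seed(0)
tr_logits = torch.randn(2, 3, 4, device=device)
b_labels = torch.tensor([[1, 0, 3], [2, 2, 0]], device=device)
preds_mask = torch.tensor([[1, 1, 0], [1, 0, 0]], device=device)

# Stay in PyTorch while indexing, so every operand lives on the same device.
keep = preds_mask == 1
tr_batch_preds = tr_logits[keep].argmax(dim=1)      # one prediction per kept token
tr_label_ids = torch.masked_select(b_labels, keep)  # matching gold labels

# Convert to NumPy once, after moving the results to host memory.
tr_batch_preds = tr_batch_preds.detach().cpu().numpy()
tr_batch_labels = tr_label_ids.detach().cpu().numpy()

print(tr_batch_preds.shape, tr_batch_labels)
```

The resulting arrays can be passed to tr_preds.extend() and tr_labels.extend() as in the question.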

error in converting tensor to numpy array

I'm trying to convert input_image, which is a tensor, to a NumPy array. Following the already answered questions here and several others that suggested using input_image.eval() or, equivalently, sess.run() for this conversion, I did the same, but it throws an error and apparently expects a feed_dict value for sess.run(). Since I'm not trying to run an operation that depends on unknown values, I don't see the need for a feed_dict: all I'm doing is a conversion.
Besides, just to check, I also tried converting a tf.constant([1,2,3]) value right above it using the same method, and it ran successfully even though its type is the same as that of input_image. Here's my code, which is part of a larger script:
def call(self, x):
    input_image = Input(shape=(None, None, 3))
    print(input_image.shape)
    print(type(tf.constant([1,2,3])))
    print(type(input_image))
    print(type(K.get_session().run(tf.constant([1,2,3]))))
    print(type(K.get_session().run(input_image)))
and here's the error:
(?, ?, ?, 3)
<class 'tensorflow.python.framework.ops.Tensor'>
<class 'tensorflow.python.framework.ops.Tensor'>
<class 'numpy.ndarray'>
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
return fn(*args)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
target_list, run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
(0) Invalid argument: You must feed a value for placeholder tensor 'input_1' with dtype float and shape [?,?,?,3]
[[{{node input_1}}]]
[[input_1/_1051]]
(1) Invalid argument: You must feed a value for placeholder tensor 'input_1' with dtype float and shape [?,?,?,3]
[[{{node input_1}}]]
0 successful operations.
0 derived errors ignored.
I wonder why the former would work and the latter won't.
There is no such thing as "converting" a symbolic tensor to a numpy array, as the latter cannot hold the same kind of information as the former.
When you use eval() or session.run(), what you are doing is evaluating a symbolic expression to get a numerical result, which is a numpy array, but this is not a conversion. Evaluating an expression might or might not require additional input data (that's what the feed_dict is for), depending on the expression.
Evaluating a constant (tf.constant) does not require any input data, but evaluating your other expression does require the input data, so you cannot "convert" this to a numpy array.
Just adding to (or elaborating on) what @MatiasValdenegro said,
TensorFlow follows something called graph execution (or define-then-run). In other words, when you write a TensorFlow program, it defines a data-flow graph that shows how the operations you defined are related to each other. You then execute bits and pieces of that graph depending on the results you're after.
Let's consider two examples. (I am switching to a simple TensorFlow program instead of the Keras bits, as it makes things clearer. After all, K.get_session() returns a Session object.)
Example 1
Say you have the following program.
import numpy as np
import tensorflow as tf

a = tf.placeholder(shape=[2,2], dtype=tf.float32)
b = tf.constant(1, dtype=tf.float32)
c = a * b

# Wrong: This is what you're doing essentially when you do sess.run(input_image).
# Left commented out because it raises InvalidArgumentError (no value is fed for a).
# with tf.Session() as sess:
#     print(sess.run(c))

# Right: You need to feed values that c is dependent on
with tf.Session() as sess:
    print(sess.run(c, feed_dict={a: np.array([[1,2],[2,3]])}))
Whenever a resulting tensor (e.g. c) is dependent on a placeholder you cannot execute it and get the result without feeding values to all the dependent placeholders.
Example 2
When you define a tf.constant(1) this is not dependent on anything. In other words you don't need a feed_dict and can directly run eval() or sess.run() on it.
Update: Further explanation on why you need a feed_dict for input_image
TLDR: You need a feed_dict because your resulting Tensor is produced by an Input layer.
Your input_image is basically the tensor you get by feeding something to the Input layer. Usually in Keras you are not exposed to the internal placeholder-level details; the feeding happens for you when you call model.fit() or model.evaluate(). You can see in the Keras source that the Input layer in fact uses a placeholder.
I hope this makes clear that you do need to feed a value to the placeholder to evaluate the output of an Input layer, because that output is essentially a placeholder.
Update 2: How to feed to your Input layer
It appears you can use feed_dict with a Keras Input layer in the following manner. Instead of defining the shape argument, you pass a placeholder straight to the tensor argument, which bypasses the internal placeholder creation in the layer.
import numpy as np
import tensorflow as tf
import tensorflow.keras.backend as K
from tensorflow.keras.layers import Input  # the code below uses Input, not InputLayer

x = tf.placeholder(shape=[None, None, None, 3], dtype=tf.float32)
input_image = Input(tensor=x)
arr = np.array([[[[1,1,1]]]])
print(arr.shape)
print(K.get_session().run(input_image, feed_dict={x: arr}))

How to handle NaNs returned from 'roc_curve' before passing to 'auc'?

I am using 'roc_curve' from the metrics module in scikit-learn. The example shows that 'roc_curve' should be called before 'auc', similar to:
fpr, tpr, thresholds = metrics.roc_curve(y, pred, pos_label=2)
and then:
metrics.auc(fpr, tpr)
However the following error is returned:
Traceback (most recent call last):
File "analysis.py", line 207, in <module>
r = metrics.auc(fpr, tpr)
File "/apps/anaconda/1.6.0/lib/python2.7/site-packages/sklearn/metrics/metrics.py", line 66, in auc
x, y = check_arrays(x, y)
File "/apps/anaconda/1.6.0/lib/python2.7/site-packages/sklearn/utils/validation.py", line 215, in check_arrays
_assert_all_finite(array)
File "/apps/anaconda/1.6.0/lib/python2.7/site-packages/sklearn/utils/validation.py", line 18, in _assert_all_finite
raise ValueError("Array contains NaN or infinity.")
ValueError: Array contains NaN or infinity.
What does this mean in terms of results, and is there a way to overcome it?
Are you trying to use roc_curve to evaluate a multiclass classifier? If you are using roc_curve on a classification problem that is not binary, it won't work correctly. There is math out there for multidimensional ROC analysis, but the current ROC methods in Python don't implement it.
To evaluate multiclass problems, try methods such as confusion_matrix and classification_report from sklearn, and kappa() from skll.
You state this line:
fpr, tpr, thresholds = metrics.roc_curve(y, pred, pos_label=2)
which suggests that you may have copied the sklearn example, which also uses "pos_label=2".
However, in most cases you want "pos_label" to be 1. pos_label names the value in y that counts as the positive class, so if your labels are 0 and 1 and your code outputs probabilities between 0 and 1 for class 1, then your pos_label should be 1.
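A small sketch of the pos_label point, using made-up labels and scores. One way NaNs can arise is when no sample in y carries the value passed as pos_label, so the guard below checks for that before computing the curve; the guard is an illustrative addition, not something scikit-learn requires.

```python
import numpy as np
from sklearn import metrics

# Made-up binary labels (0/1) and scores for the positive class.
y = np.array([0, 0, 1, 1])
pred = np.array([0.1, 0.4, 0.35, 0.8])

pos_label = 1

# Illustrative guard: the positive label must actually occur in y.
if not np.any(y == pos_label):
    raise ValueError("pos_label does not occur in y")

fpr, tpr, thresholds = metrics.roc_curve(y, pred, pos_label=pos_label)
roc_auc = metrics.auc(fpr, tpr)
print(roc_auc)
```

With these values, three of the four positive/negative score pairs are ranked correctly, so the area under the curve is 0.75.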

How to use OneVsRestClassifier with SVC for multilabel problems?

I'm using OneVsRestClassifier for multilabel classification. It works with LinearSVC, but when I apply it to SVC, the following error appears:
classifier = OneVsRestClassifier(SVC(class_weight='balanced'))
classifier.fit(X1, y1)
y2 = classifier.predict(X2)
Traceback (most recent call last):
...
File "/usr/local/lib/python2.7/dist-packages/sklearn/multiclass.py", line 219, in predict
return predict_ovr(self.estimators_, self.label_binarizer_, X)
File "/usr/local/lib/python2.7/dist-packages/sklearn/multiclass.py", line 93, in predict_ovr
Y = np.array([_predict_binary(e, X) for e in estimators])
File "/usr/local/lib/python2.7/dist-packages/sklearn/multiclass.py", line 66, in _predict_binary
score = estimator.predict_proba(X)[:, 1]
File "/usr/local/lib/python2.7/dist-packages/sklearn/svm/base.py", line 490, in predict_proba
"probability estimates must be enabled to use this method")
NotImplementedError: probability estimates must be enabled to use this method
Does anybody know what is causing this?
This is a bug. The OneVsRestClassifier calls the predict_proba method when it finds one, but the one on SVC does not actually work unless you construct it with probability=True to get Platt scaling (which I don't actually encourage).
The reason that it works for LinearSVC is that that class does not have a predict_proba, so OvR backs off to the decision_function method.
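For completeness, the probability=True workaround can be sketched like this. The data is synthetic and the parameter values are arbitrary choices for illustration. Recent scikit-learn releases may not need the flag for predict() at all, so treat this as a sketch for the behavior described above rather than a recommendation.

```python
from sklearn.datasets import make_multilabel_classification
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

# Synthetic multilabel data: Y is a binary indicator matrix with 3 label columns.
X, Y = make_multilabel_classification(
    n_samples=100, n_features=10, n_classes=3, random_state=0
)

# probability=True enables Platt scaling, so SVC.predict_proba works.
classifier = OneVsRestClassifier(
    SVC(class_weight="balanced", probability=True, random_state=0)
)
classifier.fit(X, Y)
Y_pred = classifier.predict(X)
print(Y_pred.shape)
```

Platt scaling runs an internal cross-validation, so fitting is noticeably slower than with a plain SVC; using LinearSVC, which already works, avoids that cost.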
