I would like to compute the F1 score for a classifier trained with AllenNLP. I used working code from an AllenNLP guide, which computed accuracy rather than F1, so I tried to swap out the metric in the code.
According to the documentation, CategoricalAccuracy and FBetaMultiLabelMeasure take the same inputs (predictions: torch.Tensor of shape [batch_size, ..., num_classes]; gold_labels: torch.Tensor of shape [batch_size, ...]).
But for some reason, input that works perfectly well for the accuracy metric results in a RuntimeError when given to the multi-label F1 metric.
I condensed the problem to the following code snippet:
>>> from allennlp.training.metrics import CategoricalAccuracy, FBetaMultiLabelMeasure
>>> import torch
>>> labels = torch.LongTensor([0, 0, 2, 1, 0])
>>> logits = torch.FloatTensor([[ 0.0063, -0.0118, 0.1857], [ 0.0013, -0.0217, 0.0356], [-0.0028, -0.0512, 0.0253], [-0.0460, -0.0347, 0.0400], [-0.0418, 0.0254, 0.1001]])
>>> labels.shape
torch.Size([5])
>>> logits.shape
torch.Size([5, 3])
>>> ca = CategoricalAccuracy()
>>> f1 = FBetaMultiLabelMeasure()
>>> ca(logits, labels)
>>> f1(logits, labels)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File ".../lib/python3.8/site-packages/allennlp/training/metrics/fbeta_multi_label_measure.py", line 130, in __call__
true_positives = (gold_labels * threshold_predictions).bool() & mask & pred_mask
RuntimeError: The size of tensor a (5) must match the size of tensor b (3) at non-singleton dimension 1
Why is this error happening? What am I missing here?
You want to use FBetaMeasure, not FBetaMultiLabelMeasure. "Multilabel" means you can specify more than one correct answer, whereas CategoricalAccuracy only allows one correct answer. That means FBetaMultiLabelMeasure expects an extra dimension in your labels: a multi-hot tensor of shape [batch_size, num_classes] rather than a vector of class indices.
I suspect the documentation of FBetaMultiLabelMeasure is misleading. I'll look into fixing it.
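For example, continuing the snippet above, the following should run without the shape error (a sketch; with the default average=None, get_metric() returns per-class precision, recall, and fscore lists):
>>> from allennlp.training.metrics import FBetaMeasure
>>> f1 = FBetaMeasure()
>>> f1(logits, labels)
>>> f1.get_metric()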
To speed up my code, I rewrote it in pure NumPy so I could evaluate its runtime and then accelerate it with JAX in Python. I don't know whether my code is a good fit for JAX acceleration, but my brief prior study and experience with JAX encouraged me to try vectorizing or parallelizing the prepared NumPy code with JAX. As an initial test I put the jax.jit decorator on the function, but it got stuck at the first line of my code and raised the following error in Colab:
<__array_function__ internals> in take(*args, **kwargs)
UnfilteredStackTrace: NotImplementedError: The 'raise' mode to jnp.take is not supported.
The stack trace below excludes JAX-internal frames.
The preceding is the original exception that occurred, unmodified.
--------------------
The above exception was the direct cause of the following exception:
NotImplementedError Traceback (most recent call last)
<__array_function__ internals> in take(*args, **kwargs)
/usr/local/lib/python3.7/dist-packages/jax/_src/numpy/lax_numpy.py in _take(a, indices, axis, out, mode)
5437 elif mode == "raise":
5438 # TODO(phawkins): we have no way to report out of bounds errors yet.
-> 5439 raise NotImplementedError("The 'raise' mode to jnp.take is not supported.")
5440 elif mode == "wrap":
5441 indices = mod(indices, _constant_like(indices, a.shape[axis_idx]))
NotImplementedError: The 'raise' mode to jnp.take is not supported.
I don't know how to handle this code with JAX. The error is related to np.take, though I suspect it will get stuck again at other lines, e.g. those containing reduce.
The sample code is:
import numpy as np
import jax

pp_ = np.array([[0.75, 0.5, 0.5], [15, 10, 15], [0.5, 3., 0.35], [15, 17, 15]])
rr_ = np.array([1, 3, 2, 5], dtype=np.float64)
gg_ = np.array([-0.48305741, -1])
ee_ = np.array([[0, 2], [1, 3]], dtype=np.int64)

@jax.jit
def JAX_acc(pp_, rr_, gg_, ee_):
    rr_act = np.take(rr_, ee_)             # this is the line that fails under jit
    r_add = np.add.reduce(rr_act, axis=1)
    pc_dis = np.sum((r_add, gg_), axis=0)
    ang_ = np.arccos((rr_act ** 5 + pc_dis[:, None] ** 2) / 1e5)
    pl_rad = rr_act * np.cos(ang_)
    pp_act = np.take(pp_, ee_, axis=0)
    pc_vec = -np.subtract.reduce(pp_act, axis=1)
    pc_ = pp_act[:, 0, :] + pc_vec / np.linalg.norm(pc_vec, axis=1)[:, None] * np.abs(pl_rad[:, 0][:, None])
    return print(pc_dis, pc_, pl_rad)

JAX_acc(pp_, rr_, gg_, ee_)
Main question: could the JAX library be utilized for this example? How?
Should I use other modules instead of np.take?
I would appreciate any help curing this code with JAX.
---------------- solved by the update ----------------
I would be grateful for any other explanations on the following extraneous questions (not needed):
Which will be faster under JAX: the plain math operators (-, +, *, ...) or their NumPy equivalents (np.power, np.sum, ...)? Will the NumPy ones be handled by JAX in a better scheme (in terms of speed) than the plain math ones?
Does JAX CPU mode need a different coding style than TPU mode? I haven't used TPU mode so far.
Updates:
I have changed the code to use the corresponding jnp functions, based on @jakedvp's comment, and the np.take problem is gone:
import jax.numpy as jnp

def JAX_acc_jnp(pp_, rr_, gg_, ee_):
    rr_act = jnp.take(rr_, ee_)
    r_add = jnp.sum(rr_act, axis=1)  # .squeeze()
    pc_dis = jnp.add(r_add, gg_)
    ang_ = jnp.arccos((rr_act ** 5 + pc_dis[:, None] ** 2) / 1e5)
    pl_rad = rr_act * jnp.cos(ang_)
    pp_act = jnp.take(pp_, ee_, axis=0)
    pc_vec = jnp.diff(pp_act, axis=1).squeeze()
    pc_ = pp_act[:, 0, :] + pc_vec / jnp.linalg.norm(pc_vec, axis=1)[:, None] * jnp.abs(pl_rad[:, 0][:, None])
    return pc_dis, pc_, pl_rad
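With everything switched to jnp, the function can now be jitted; a minimal sketch of applying it (jnp.take handles out-of-bounds indices without raising, which is why NumPy's unsupported 'raise' mode no longer gets in the way):
JAX_acc_jit = jax.jit(JAX_acc_jnp)
pc_dis, pc_, pl_rad = JAX_acc_jit(pp_, rr_, gg_, ee_)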
For pc_dis and pc_ the results are correct, but pl_rad is different because ang_ comes out differently: its values are all -1.0927847e-10. Perhaps this is because the true values only differ around the 13th decimal place and JAX changed the dtype to float32; I don't know. If so, how can I specify which dtype JAX uses?
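On the dtype question: JAX defaults to 32-bit floats. If double precision matters, 64-bit mode can be enabled once at startup (a sketch; this must run before any arrays are created):
import jax
jax.config.update("jax_enable_x64", True)  # jnp arrays now default to float64/int64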
larger data sizes: pp_, rr_, gg_, ee_
I am trying to use the sklearn MinMaxScaler to rescale a dataframe column like below:
from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()
y = scaler.fit(df['total_amount'])
But got the following errors:
Traceback (most recent call last):
File "/Users/edamame/workspace/git/my-analysis/experiments/my_seq.py", line 54, in <module>
y = scaler.fit(df['total_amount'])
File "/Users/edamame/workspace/git/my-analysis/venv/lib/python3.4/site-packages/sklearn/preprocessing/data.py", line 308, in fit
return self.partial_fit(X, y)
File "/Users/edamame/workspace/git/my-analysis/venv/lib/python3.4/site-packages/sklearn/preprocessing/data.py", line 334, in partial_fit
estimator=self, dtype=FLOAT_DTYPES)
File "/Users/edamame/workspace/git/my-analysis/venv/lib/python3.4/site-packages/sklearn/utils/validation.py", line 441, in check_array
"if it contains a single sample.".format(array))
ValueError: Expected 2D array, got 1D array instead:
array=[3.180000e+00 2.937450e+03 6.023850e+03 2.216292e+04 1.074589e+04
:
0.000000e+00 0.000000e+00 9.000000e+01 1.260000e+03].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
Any idea what was wrong?
The input to MinMaxScaler needs to be array-like, with shape [n_samples, n_features]. So you can apply it on the column as a dataframe rather than a series (using double square brackets instead of single):
y = scaler.fit(df[['total_amount']])
Though from your description, it sounds like you want fit_transform rather than just fit (but I could be wrong):
y = scaler.fit_transform(df[['total_amount']])
A little more explanation:
If your dataframe had 100 rows, consider the difference in shape when you transform a column to an array:
>>> np.array(df[['total_amount']]).shape
(100, 1)
>>> np.array(df['total_amount']).shape
(100,)
The first returns a shape that matches [n_samples, n_features] (as required by MinMaxScaler), whereas the second does not.
Try it this way:
import pandas as pd
from sklearn import preprocessing

x = df.values  # returns a numpy array
min_max_scaler = preprocessing.MinMaxScaler()
x_scaled = min_max_scaler.fit_transform(x)
df = pd.DataFrame(x_scaled)
I started learning machine learning and came across neural networks. While implementing a program I got this error. I have tried checking every solution but no luck. Here's my code:
from numpy import exp, array, random, dot

class neural_network:
    def _init_(self):
        random.seed(1)
        self.weights = 2 * random.random((2, 1)) - 1

    def train(self, inputs, outputs, num):
        for iteration in range(num):
            output = self.think(inputs)
            error = outputs - output
            adjustment = 0.01 * dot(inputs.T, error)
            self.weights += adjustment

    def think(self, inputs):
        return dot(inputs, self.weights)

neural = neural_network()

# The training set
inputs = array([[2, 3], [1, 1], [5, 2], [12, 3]])
outputs = array([[10, 4, 14, 30]]).T

# Training the neural network using the training set.
neural.train(inputs, outputs, 10000)

# Ask the neural network the output
print(neural.think(array([15, 2])))
This is the error I'm getting when running neural.train:
Traceback (most recent call last):
File "neural.py", line 27, in <module>
neural.train(inputs, outputs, 10000)
File "neural.py", line 10, in train
output = self.think(inputs)
File "neural.py", line 16, in think
return (dot(inputs, self.weights))
AttributeError: 'neural_network' object has no attribute 'weights'
Though it has the attribute self.weights assigned in the class, it still says there is no such attribute.
Well, it turns out that your initialization method should be named __init__ (two underscores), not _init_...
So, changing the method to
def __init__(self):
    random.seed(1)
    self.weights = 2 * random.random((2, 1)) - 1
your code works OK:
neural.train(inputs, outputs, 10000)
print(neural.think(array([15, 2])))
# [ 34.]
Your initializing method is written wrong: it's two underscores, __init__(self):, not one underscore, _init_(self):.
Otherwise, nice code!
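Python only calls __init__ automatically when an object is constructed; a single-underscore _init_ is just an ordinary method that never runs, so self.weights is never created. A minimal illustration:
class Demo:
    def _init_(self):              # single underscores: NOT the constructor
        self.weights = [1, 2]

d = Demo()                         # _init_ is never called here
print(hasattr(d, "weights"))       # False -> accessing d.weights raises AttributeError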
I was wondering if I can build an image resize module in Pytorch that takes a torch.tensor of 3*H*W as the input and return a tensor as the resized image.
I know it is possible to convert the tensor to a PIL Image and use torchvision, but I also hope to backpropagate gradients from the resized image to the original image, and the following example raises such an error (in PyTorch 0.4.0 on Windows 10):
import numpy as np
import torch
from torchvision import transforms

t2i = transforms.ToPILImage()
i2t = transforms.ToTensor()
trans = transforms.Compose(
    [t2i, transforms.Resize(size=200), i2t]
)
test = np.random.normal(size=[3, 300, 300])
test = torch.tensor(test, requires_grad=True)
resized = trans(test)
resized.backward()
print(test.grad)
Traceback (most recent call last):
File "D:/Projects/Python/PyTorch/test.py", line 41, in <module>
main()
File "D:/Projects/Python/PyTorch/test.py", line 33, in main
resized = trans(test)
File "D:\Anaconda3\envs\pytorch\lib\site-packages\torchvision\transforms\transforms.py", line 42, in __call__
img = t(img)
File "D:\Anaconda3\envs\pytorch\lib\site-packages\torchvision\transforms\transforms.py", line 103, in __call__
return F.to_pil_image(pic, self.mode)
File "D:\Anaconda3\envs\pytorch\lib\site-packages\torchvision\transforms\functional.py", line 102, in to_pil_image
npimg = np.transpose(pic.numpy(), (1, 2, 0))
RuntimeError: Can't call numpy() on Variable that requires grad. Use var.detach().numpy() instead.
It seems like I cannot "imresize" a tensor without detaching it from autograd first, but detaching it prevents me from computing gradients.
Is there a way to build a torch function/module that does the same thing as torchvision.transforms.Resize and is autograd compatible? Any help is much appreciated!
torch.nn.functional.upsample works for me, yay!
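A minimal sketch of that approach (my assumptions: bilinear mode, and a batch dimension added since these ops expect [N, C, H, W]; F.interpolate is the newer name for upsample):
import torch
import torch.nn.functional as F

test = torch.randn(3, 300, 300, requires_grad=True)
# add a batch dimension and resize; the op stays differentiable
resized = F.interpolate(test.unsqueeze(0), size=(200, 200),
                        mode="bilinear", align_corners=False)
resized.sum().backward()   # backward() needs a scalar, hence .sum()
print(test.grad.shape)     # torch.Size([3, 300, 300])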
I just figured out how to preserve the gradients when implementing a custom loss function. The trick is to attach your result to a dummy gradient:
def custom_loss(tensor1, tensor2):
    # convert the tensors to a PIL image and do the (non-differentiable)
    # calculation there; suppose the result is:
    output = 0.123
    # grad - grad is zero, but it keeps the loss attached to the autograd graph
    grad = (tensor1 + tensor2).sum()
    loss = grad - grad + output
    return loss
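For example (toy tensors of my own; note that grad - grad contributes a zero gradient, so this makes backward() run rather than recovering gradients through the PIL step):
import torch

t1 = torch.randn(3, requires_grad=True)
t2 = torch.randn(3, requires_grad=True)
loss = custom_loss(t1, t2)
loss.backward()    # no "detach" error: loss is connected to the graph
print(t1.grad)     # tensor([0., 0., 0.])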
I am using roc_curve from the metrics module in scikit-learn. The example shows that roc_curve should be called before auc, similar to:
fpr, tpr, thresholds = metrics.roc_curve(y, pred, pos_label=2)
and then:
metrics.auc(fpr, tpr)
However, the following error is returned:
Traceback (most recent call last):
  File "analysis.py", line 207, in <module>
    r = metrics.auc(fpr, tpr)
  File "/apps/anaconda/1.6.0/lib/python2.7/site-packages/sklearn/metrics/metrics.py", line 66, in auc
    x, y = check_arrays(x, y)
  File "/apps/anaconda/1.6.0/lib/python2.7/site-packages/sklearn/utils/validation.py", line 215, in check_arrays
    _assert_all_finite(array)
  File "/apps/anaconda/1.6.0/lib/python2.7/site-packages/sklearn/utils/validation.py", line 18, in _assert_all_finite
    raise ValueError("Array contains NaN or infinity.")
ValueError: Array contains NaN or infinity.
What does this mean in terms of results, and is there a way to overcome it?
Are you trying to use roc_curve to evaluate a multiclass classifier? In other words, if you are using roc_curve on a classification problem that is not binary, then it won't work correctly. There is math out there for multidimensional ROC analysis, but the current ROC methods in Python don't implement it.
To evaluate multiclass problems, try using methods like confusion_matrix and classification_report from sklearn, and kappa() from skll.
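For example, a minimal sketch with made-up labels:
from sklearn.metrics import confusion_matrix, classification_report

y_true = [0, 1, 2, 2, 1]
y_pred = [0, 2, 2, 2, 1]
print(confusion_matrix(y_true, y_pred))       # 3x3 matrix of per-class counts
print(classification_report(y_true, y_pred))  # per-class precision/recall/F1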
You state this line:
fpr, tpr, thresholds = metrics.roc_curve(y, pred, pos_label=2)
which suggests that you may have copied the sklearn example, which also uses pos_label=2.
However, in most cases you want pos_label to be 1. So if your code outputs probabilities and they are between 0 and 1, then pos_label should be 1.
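For instance, a minimal sketch with toy scores (binary 0/1 labels, so pos_label=1):
from sklearn import metrics

y = [0, 0, 1, 1]
pred = [0.1, 0.4, 0.35, 0.8]
fpr, tpr, thresholds = metrics.roc_curve(y, pred, pos_label=1)
print(metrics.auc(fpr, tpr))  # 0.75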