How can I solve an SVM model fitting problem - svm

I'm having a problem fitting an SVM model:
from sklearn.svm import SVC
svm_model = SVC(kernel='rbf', C=8, gamma=0.1)
svm_model.fit(X_train_std, y_train)
y_pred = svm_model.predict(X_test_std)
/usr/local/lib/python3.8/dist-packages/sklearn/utils/validation.py:993: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
y = column_or_1d(y, warn=True)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-53-398f1caaa8e8> in <module>
3 svm_model = SVC(kernel='rbf', C=8, gamma=0.1)
4
----> 5 svm_model.fit(X_train_std, y_train)
6
7 y_pred = svm_model.predict(X_test_std)
2 frames
/usr/local/lib/python3.8/dist-packages/sklearn/utils/multiclass.py in check_classification_targets(y)
195 "multilabel-sequences",
196 ]:
--> 197 raise ValueError("Unknown label type: %r" % y_type)
198
199
ValueError: Unknown label type: 'continuous'
I thought it was a problem with the type of y, so I tried this:
train = pd.get_dummies(train, columns=['LSTAT'], drop_first=True)
But the problem did not go away.
Can somebody help me?
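The message "Unknown label type: 'continuous'" means that SVC, which is a classifier, received a target with non-integer float values. The warning above it says that y also arrived as a column vector of shape (n_samples, 1). Here is a minimal sketch of two possible ways out, on made-up data, since the real X_train_std and y_train are not shown. The median split used to form classes is only an assumption for illustration.

```python
import numpy as np
from sklearn.svm import SVC, SVR
from sklearn.utils.multiclass import type_of_target

# Made-up stand-ins for X_train_std / y_train (the real data is not shown).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = (2.0 * X[:, 0] + rng.normal(scale=0.1, size=100)).reshape(-1, 1)

# Flatten the column vector to shape (n_samples,) to avoid the warning.
y_flat = y.ravel()
print(type_of_target(y_flat))  # how sklearn classifies this target

# Option 1: the target really is a number to predict, so use a regressor.
svr_model = SVR(kernel='rbf', C=8, gamma=0.1)
svr_model.fit(X, y_flat)

# Option 2: the task really is classification, so turn y into discrete
# classes first (a median split here, purely as an example).
y_class = (y_flat > np.median(y_flat)).astype(int)
svm_model = SVC(kernel='rbf', C=8, gamma=0.1)
svm_model.fit(X, y_class)
y_pred = svm_model.predict(X)
```

Calling pd.get_dummies on a feature column such as LSTAT changes X, not y, which would explain why it had no effect on this error.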

Related

unhashable type: 'dict' when sampling values for class_weight in BayesSearchCV

I'm trying to tune the hyperparameters of a model using BayesSearchCV. This model has a class_weight parameter, which maps classes to weights in the form {class_label: weight}. The problem is that I get an unhashable type: 'dict' error message when I try to run the code. How can I fix this?
class_weights = [{0:1, 1:x} for x in np.arange(1.0, 5.0, 0.1)]
params = {'C': np.arange(1.0, 5.0, 0.1),
'class_weight': class_weights}
bs_cv = BayesSearchCV(LogisticRegression(), params, cv=50, n_iter=5, random_state=42, scoring=auprc, verbose=False)
history = bs_cv.fit(X_train, y_train)
best_bayes = bs_cv.best_estimator_
print("Test set AUPRC: {}".format(bs_cv.score(X_test, y_test)))
print("Best parameters are: {}".format(bs_cv.best_params_))
The error message:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-433-d9f9e9b7926b> in <module>
3 'class_weight': class_weights}
4
----> 5 bs_cv = BayesSearchCV(LogisticRegression(), params, cv=50, n_iter=5, random_state=42, scoring=auprc, verbose=False)
6 history = bs_cv.fit(X_train, y_train)
7 best_bayes = bs_cv.best_estimator_
6 frames
/usr/local/lib/python3.8/dist-packages/skopt/space/transformers.py in <dictcomp>(.0)
111 List of categories.
112 """
--> 113 self.mapping_ = {v: i for i, v in enumerate(X)}
114 self.inverse_mapping_ = {i: v for v, i in self.mapping_.items()}
115 self._lb.fit([self.mapping_[v] for v in X])
TypeError: unhashable type: 'dict'
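The traceback shows skopt building a mapping keyed by the candidate values (`{v: i for i, v in enumerate(X)}`), and dict keys must be hashable, so a list of dicts cannot serve as a categorical dimension. One possible workaround is to search over a plain float and build the dict inside the estimator. The sketch below assumes a small wrapper class (the name PosWeightLogReg and its pos_weight parameter are hypothetical) and uses sklearn's GridSearchCV on synthetic data to stay self-contained. The same float parameter could presumably be given to BayesSearchCV as a numeric range instead.

```python
from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV


class PosWeightLogReg(ClassifierMixin, BaseEstimator):
    """Hypothetical wrapper: exposes the class-1 weight as a float."""

    def __init__(self, C=1.0, pos_weight=1.0):
        self.C = C
        self.pos_weight = pos_weight

    def fit(self, X, y):
        # The dict is built here, so the search space only holds floats.
        self.model_ = LogisticRegression(
            C=self.C,
            class_weight={0: 1.0, 1: self.pos_weight},
            max_iter=1000,
        ).fit(X, y)
        self.classes_ = self.model_.classes_
        return self

    def predict(self, X):
        return self.model_.predict(X)

    def predict_proba(self, X):
        return self.model_.predict_proba(X)


# Synthetic imbalanced data standing in for X_train / y_train.
X, y = make_classification(n_samples=200, weights=[0.8, 0.2], random_state=42)

params = {'C': [1.0, 2.0], 'pos_weight': [1.0, 3.0]}
search = GridSearchCV(PosWeightLogReg(), params, cv=3,
                      scoring='average_precision')
search.fit(X, y)
print(search.best_params_)
```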

Tensorflow HammingLoss gives ValueError with keras.utils.Sequence

I am working on a multi-label image classification problem with 13 labels. I want to use Hamming Loss to evaluate the performance of the model. So I specified tfa.metrics.HammingLoss(mode = 'multilabel') in the metrics parameter during model compilation. This worked when I provided both X_train and y_train to model.fit(), but it threw a ValueError when I used a Sequence object (described below) for training.
Data Generator description
I used a keras.utils.Sequence input object similar to what is present here. The generator returns 2 numpy arrays for each batch - the first array consists of the input images of shape (128, 128, 3) and the second array consists of labels each of shape (13,).
This is what my code looks like:
model.compile(
loss='binary_crossentropy',
optimizer='rmsprop',
metrics=[tfa.metrics.HammingLoss(mode = 'multilabel')]
)
model.fit(
train_datagen,
epochs = 5,
batch_size = BATCH_SIZE,
steps_per_epoch = TOTAL // BATCH_SIZE
)
And this is the error that I obtained:
Epoch 1/5
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-140-978987a2bbaa> in <module>
3 epochs=5,
4 batch_size=BATCH_SIZE,
----> 5 steps_per_epoch = 2000 // BATCH_SIZE
6 # validation_data=validation_generator,
7 )
4 frames
/usr/local/lib/python3.7/dist-packages/tensorflow_addons/metrics/hamming.py in else_body_2()
64 try:
65 do_return = True
---> 66 retval_ = (ag__.ld(nonzero) / ag__.converted_call(ag__.ld(y_true).get_shape, (), None, fscope)[(- 1)])
67 except:
68 do_return = False
ValueError: in user code:
File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1051, in train_function *
return step_function(self, iterator)
File "/usr/local/lib/python3.7/dist-packages/tensorflow_addons/metrics/utils.py", line 66, in update_state *
matches = self._fn(y_true, y_pred, **self._fn_kwargs)
File "/usr/local/lib/python3.7/dist-packages/tensorflow_addons/metrics/hamming.py", line 133, in hamming_loss_fn *
return nonzero / y_true.get_shape()[-1]
ValueError: None values not supported.
How do I correct this? Is there any issue with the format of the labels?
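The last frame divides by `y_true.get_shape()[-1]`, the statically known number of labels. The "None values not supported" message suggests that this dimension was unknown when batches came from the Sequence, rather than that the label values themselves were malformed. To rule out the labels, one batch can be checked outside of Keras. The sketch below uses made-up arrays in place of a real batch and assumes a 0.5 threshold on the predictions, which is not necessarily what tfa.metrics.HammingLoss applies.

```python
import numpy as np
from sklearn.metrics import hamming_loss

# Made-up stand-ins for one batch: 32 samples, 13 binary labels each.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=(32, 13))
y_prob = rng.random(size=(32, 13))        # pretend sigmoid outputs
y_pred = (y_prob > 0.5).astype(int)       # assumed threshold of 0.5

# Sanity checks on the label format.
labels_ok = (y_true.ndim == 2 and y_true.shape[1] == 13
             and set(np.unique(y_true)) <= {0, 1})
print("label format OK:", labels_ok)

# Hamming loss = fraction of label slots that are predicted wrongly.
manual = np.mean(y_true != y_pred)
reference = hamming_loss(y_true, y_pred)
print(manual, reference)
```

If the labels pass this check, the remaining suspect is the missing static shape, which a metric written with the dynamic shape (`tf.shape`) would not depend on.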

LabelEncoder instance is not fitted yet

I have code that predicts labels for unseen data in a sentence classification task.
The code is:
from sklearn.preprocessing import LabelEncoder
maxlen = 1152
### PREDICT NEW UNSEEN DATA ###
tokenizer = Tokenizer()
label_enc = LabelEncoder()
X_test = ['this is boring', 'wow i like this you did a great job']
X_test = tokenizer.texts_to_sequences(X_test)
X_test = sequence.pad_sequences(X_test, maxlen=maxlen)
a = (model.predict(X_test)>0.5).astype(int).ravel()
print(a)
reverse_pred = label_enc.inverse_transform(a.ravel())
print(reverse_pred)
But I am getting this error:
[1 1]
---------------------------------------------------------------------------
NotFittedError Traceback (most recent call last)
<ipython-input-33-7e12dbe8aec1> in <module>()
39 print(a)
40
---> 41 reverse_pred = label_enc.inverse_transform(a.ravel())
42 print(reverse_pred)
1 frames
/usr/local/lib/python3.6/dist-packages/sklearn/utils/validation.py in check_is_fitted(estimator, attributes, msg, all_or_any)
965
966 if not attrs:
--> 967 raise NotFittedError(msg % {'name': type(estimator).__name__})
968
969
NotFittedError: This LabelEncoder instance is not fitted yet. Call 'fit' with appropriate arguments before using this estimator.
I used a Sequential model, and in the training part the call is written as history = model.fit(). Why am I getting this error?
Following the sklearn documentation and what is reported here, you simply have to fit your encoder before calling inverse_transform:
import numpy as np
from sklearn.preprocessing import LabelEncoder
y = ['positive','negative','positive','negative','positive','negative']
label_enc = LabelEncoder()
label_enc.fit(y)  # learns the classes: 'negative' -> 0, 'positive' -> 1
model_predictions = np.random.uniform(0,1, 3)  # stand-in for model.predict(...)
model_predictions = (model_predictions>0.5).astype(int).ravel()
model_predictions = label_enc.inverse_transform(model_predictions)
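In the question, LabelEncoder() is created fresh at prediction time, so it carries none of the state learned during training. The same holds for the new Tokenizer(). A minimal sketch of reusing the fitted encoder, assuming it can be saved when training finishes (the labels and file location below are made up):

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.preprocessing import LabelEncoder

# Training time: fit the encoder on the training labels and save it.
train_labels = ['positive', 'negative', 'positive', 'negative']  # made up
label_enc = LabelEncoder().fit(train_labels)

path = os.path.join(tempfile.mkdtemp(), 'label_enc.pkl')  # hypothetical path
with open(path, 'wb') as f:
    pickle.dump(label_enc, f)

# Prediction time: load the fitted encoder instead of creating a new one.
with open(path, 'rb') as f:
    loaded_enc = pickle.load(f)

preds = np.array([1, 1])  # the thresholded output printed in the question
decoded = loaded_enc.inverse_transform(preds)
print(decoded)
```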

Can I use probabilistic label when train model in logistic regression?

I use sklearn.linear_model.LogisticRegression and would like to use probabilistic labels when training the model.
But as the following code shows, I get an error when I try to train the logistic regression model on data with probability labels.
Is there any way to use probability labels for training a logistic regression model?
import numpy as np
from sklearn.linear_model import LogisticRegression
x = np.array([1966, 1967, 1968, 1969, 1970,
1971, 1972, 1973, 1974, 1975,
1976, 1977, 1978, 1979, 1980,
1981, 1982, 1983, 1984]).reshape(-1, 1)
y = np.array([0.003, 0.016, 0.054, 0.139, 0.263,
0.423, 0.611, 0.758, 0.859, 0.903,
0.937, 0.954, 0.978, 0.978, 0.982,
0.985, 0.989, 0.988, 0.992])
lr = LogisticRegression()
lr.fit(x, y)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-26-6f0a54f18841> in <module>()
13
14 lr = LogisticRegression()
---> 15 lr.fit(x, y) # => ValueError: Unknown label type: 'continuous'
/home/sudot/anaconda3/lib/python3.6/site-packages/sklearn/linear_model/logistic.py in fit(self, X, y, sample_weight)
1172 X, y = check_X_y(X, y, accept_sparse='csr', dtype=np.float64,
1173 order="C")
-> 1174 check_classification_targets(y)
1175 self.classes_ = np.unique(y)
1176 n_samples, n_features = X.shape
/home/sudot/anaconda3/lib/python3.6/site-packages/sklearn/utils/multiclass.py in check_classification_targets(y)
170 if y_type not in ['binary', 'multiclass', 'multiclass-multioutput',
171 'multilabel-indicator', 'multilabel-sequences']:
--> 172 raise ValueError("Unknown label type: %r" % y_type)
173
174
ValueError: Unknown label type: 'continuous'
sklearn's LogisticRegression is a classification model, so it only accepts discrete class labels as the target and rejects continuous values.
If hard 0/1 labels are acceptable, just round the values of y before fitting.
y = y.round(0) # Add this line
lr = LogisticRegression()
lr.fit(x, y)
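Rounding throws away the probability information. One commonly used alternative, sketched here as an assumption rather than an official sklearn feature for soft labels, is to list every sample twice, once with label 1 and once with label 0, and pass the probability p and 1 - p as sample_weight. The rescaling of the years and the large C (weak regularization) are choices made for this sketch only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

years = np.arange(1966, 1985).reshape(-1, 1)
p = np.array([0.003, 0.016, 0.054, 0.139, 0.263,
              0.423, 0.611, 0.758, 0.859, 0.903,
              0.937, 0.954, 0.978, 0.978, 0.982,
              0.985, 0.989, 0.988, 0.992])

# Centre and scale the feature so the solver converges easily.
x_scaled = (years - years.mean()) / years.std()

# Each sample appears twice: as class 1 with weight p, as class 0 with 1 - p.
X_dup = np.vstack([x_scaled, x_scaled])
y_dup = np.concatenate([np.ones(len(p), dtype=int),
                        np.zeros(len(p), dtype=int)])
w_dup = np.concatenate([p, 1.0 - p])

lr = LogisticRegression(C=1e3, max_iter=1000)
lr.fit(X_dup, y_dup, sample_weight=w_dup)

# Fitted probability of class 1 for each original year.
fitted = lr.predict_proba(x_scaled)[:, 1]
print(np.round(fitted, 3))
```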

Bad input shape while training a prediction model

I used the following code to train a currency exchange rate prediction model using sklearn, but I get an error:
import numpy as np
x = [[30],[40],[50],[60],[70],[80],[90],[100],[120],[130],[140],[150]]
y = ['jan','febuary','march','april','may','june','july','august','september','october','november','december']
y_2 = np.reshape(y, (-1, 2))
#reshaping because it throws in an error to reshape
from sklearn import preprocessing
le = preprocessing.LabelEncoder()
y_3 = le.fit_transform(y_2)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-144-c98a5b8bd15a> in <module>
----> 1 y_3 = le.fit_transform(y_2)
c:\users\user\appdata\local\programs\python\python37-32\lib\site-packages\sklearn\preprocessing\label.py in fit_transform(self, y)
233 y : array-like of shape [n_samples]
234 """
--> 235 y = column_or_1d(y, warn=True)
236 self.classes_, y = _encode(y, encode=True)
237 return y
c:\users\user\appdata\local\programs\python\python37-32\lib\site-packages\sklearn\utils\validation.py in column_or_1d(y, warn)
795 return np.ravel(y)
796
--> 797 raise ValueError("bad input shape {0}".format(shape))
798
799
ValueError: bad input shape (6, 2)
What do I need to do to fix this error?
There is no need for the reshape() you are applying to y. LabelEncoder expects a 1-D array of labels, and the reshape turns y into a 2-D array of shape (6, 2). The following is sufficient.
import numpy as np
x = [[30],[40],[50],[60],[70],[80],[90],[100],[120],[130],[140],[150]]
y = ['jan','febuary','march','april','may','june','july','august','september','october','november','december']
#y_2 = np.reshape(y, (-1, 2)) --> This is not needed
#reshaping because it throws in an error to reshape
from sklearn import preprocessing
le = preprocessing.LabelEncoder()
y2 = le.fit_transform(y)
print("LabelEncoder classes =", le.classes_)
# LabelEncoder classes = ['april' 'august' 'december' 'febuary' 'jan' 'july' 'june' 'march' 'may' 'november' 'october' 'september']
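To see which shapes LabelEncoder accepts, here is a small sketch on a shortened, made-up list of month names. A 1-D array works, a single column of shape (n, 1) is flattened with a warning, and any wider 2-D array raises a ValueError like the one in the question. The exact wording of the message may differ between sklearn versions.

```python
import warnings

import numpy as np
from sklearn.preprocessing import LabelEncoder

months = np.array(['jan', 'feb', 'mar', 'apr', 'may', 'jun'])  # made up

# 1-D input of shape (6,): accepted.
codes_1d = LabelEncoder().fit_transform(months)

# Column vector of shape (6, 1): flattened, with a DataConversionWarning.
with warnings.catch_warnings():
    warnings.simplefilter('ignore')
    codes_col = LabelEncoder().fit_transform(months.reshape(-1, 1))

# 2-D input of shape (3, 2): rejected.
try:
    LabelEncoder().fit_transform(months.reshape(-1, 2))
    raised = False
except ValueError as exc:
    raised = True
    print("ValueError:", exc)

print(codes_1d)
```

The codes follow the alphabetical order of the labels, not calendar order.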
