Cannot get the value of hidden weights of RNN - pytorch

I declare my RNN as
self.rnn = torch.nn.RNN(input_size=encoding_dim, hidden_size=1, num_layers=1, nonlinearity='relu')
Later
self.rnn.all_weights
# [[Parameter containing:
tensor([[-0.8099, -0.9543, 0.1117, 0.6221, 0.5034, -0.6766, -0.3360, -0.1700,
-0.9361, -0.3428]], requires_grad=True), Parameter containing:
tensor([[-0.1929]], requires_grad=True), Parameter containing:
tensor([0.7881], requires_grad=True), Parameter containing:
tensor([0.4320], requires_grad=True)]]
self.rnn.all_weights[0][0][0].values
# {RuntimeError}Could not run 'aten::values' with arguments from the 'CPU' backend. 'aten::values' is only available for these backends: [SparseCPU, Autograd, Profiler, Tracer].
Clearly I can see the values of the weights, but I cannot access them. The documentation says I need to specify requires_grad=True, but that does not work.
Is there a more elegant and usable way than self.rnn.all_weights[0][0][0]?

Use torch.nn.Module.named_parameters or torch.nn.Module.parameters.
>>> import torch.nn as nn
>>> encoding_dim = 10  # example value; the weight tensor in the question has 10 input columns
>>> model = nn.RNN(input_size=encoding_dim, hidden_size=1, num_layers=1, nonlinearity='relu')
>>> weights = []
>>> for name, parameter in model.named_parameters():
...     weights.append({name: parameter[0]})
...
>>> just_weights = []
>>> for parameter in model.parameters():
...     just_weights.append(parameter[0])
...
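As a minimal sketch of getting at the actual numbers: the error message in the question says that .values is only available for sparse backends, so it does not apply to a dense parameter. A parameter is an ordinary tensor, so detaching it and converting it should be enough. The value encoding_dim = 10 is an assumption taken from the shape printed in the question.

```python
import torch

encoding_dim = 10  # assumed, matches the 1x10 weight shown in the question
rnn = torch.nn.RNN(input_size=encoding_dim, hidden_size=1,
                   num_layers=1, nonlinearity="relu")

# Layer-0 parameters are also exposed as named attributes on the module.
w_ih = rnn.weight_ih_l0          # input-to-hidden weights, shape (1, encoding_dim)

# detach() drops the autograd history; tolist() returns plain Python floats.
w_ih_values = w_ih.detach().tolist()

# Snapshot of every parameter, keyed by name, copied so that later
# training steps do not change the stored values.
named = {name: p.detach().clone() for name, p in rnn.named_parameters()}

print(len(w_ih_values), len(w_ih_values[0]))
print(sorted(named))
```

If a NumPy array is preferred, w_ih.detach().numpy() should work for a CPU tensor as well.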

Related

Scikit learn GridSearchCV with pipeline with custom transformer

I'm trying to perform a GridSearchCV on a pipeline with a custom transformer. The transformer enriches the features "year" and "odometer" polynomially and one hot encodes the rest of the features. The ML model is a simple linear regression model.
custom transformer code:
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.preprocessing import OneHotEncoder
from sklearn.preprocessing import PolynomialFeatures
class custom_poly_features(TransformerMixin, BaseEstimator):
    def __init__(self, degree = 2, poly_features = ['year', 'odometer']):
        self.degree_ = degree
        self.poly_features_ = poly_features
    def fit(self, X, y=None):
        # Return the classifier
        return self
    def transform(self, X, y=None):
        poly_feat = PolynomialFeatures(degree=self.degree_)
        OneHot = OneHotEncoder(sparse=False)
        not_poly_features = list(set(X.columns) - set(self.poly_features_))
        poly = poly_feat.fit_transform(X[self.poly_features_].to_numpy())
        poly = np.hstack([poly, OneHot.fit_transform(X[not_poly_features].to_numpy())])
        return poly
    def get_params(self, deep=True):
        return {"degree": self.degree_, "poly_features": self.poly_features_}
pipeline & gridsearch code:
#create pipeline
from sklearn.pipeline import Pipeline
from sklearn.linear_model import LinearRegression
poly_pipeline = Pipeline(steps=[("cpf", custom_poly_features()), ("lin_reg", LinearRegression(n_jobs=-1))])
#perform gridsearch
from sklearn.model_selection import GridSearchCV
param_grid = {"cpf__degree": [3, 4, 5]}
search = GridSearchCV(poly_pipeline, param_grid, n_jobs=-1, cv=3)
search.fit(X_train_ordinal, y_train)
The custom transformer itself works fine and the pipeline also works (although the score is not great, but that is not the topic here).
poly_pipeline.fit(X_train, y_train).score(X_test, y_test)
Output:
0.543546844381771
However, when I perform the gridsearch, the scores are all nan values:
search.cv_results_
Output:
{'mean_fit_time': array([4.46928191, 4.58259885, 4.55605125]),
'std_fit_time': array([0.18111937, 0.03305779, 0.02080789]),
'mean_score_time': array([0.21119197, 0.13816587, 0.11357466]),
'std_score_time': array([0.09206233, 0.02171508, 0.02127906]),
'param_custom_poly_features__degree': masked_array(data=[3, 4, 5],
mask=[False, False, False],
fill_value='?',
dtype=object),
'params': [{'custom_poly_features__degree': 3},
{'custom_poly_features__degree': 4},
{'custom_poly_features__degree': 5}],
'split0_test_score': array([nan, nan, nan]),
'split1_test_score': array([nan, nan, nan]),
'split2_test_score': array([nan, nan, nan]),
'mean_test_score': array([nan, nan, nan]),
'std_test_score': array([nan, nan, nan]),
'rank_test_score': array([1, 2, 3])}
Does anyone know what the problem is? The transformer and the pipeline work fine on their own after all.
To debug searches in general, set error_score='raise', so that you get a full error traceback.
Your issue appears to be data-dependent; I can run this just fine on a custom dataset. That suggests to me that the comment by @Sanjar Adylov not only highlights an important issue, but the issue for your data: the train folds sometimes contain different values in some categorical feature(s) than the test folds, and so the one-hot encodings end up with different numbers of features, and the linear model justifiably breaks.
So the fix there is also as Sanjar says: instantiate the two transformers, store them as attributes, and fit them in your fit method, then use their transform methods in your transform method.
You will find there is another big issue: all the scores in cv_results_ are the same. This is because you can't actually set the hyperparameters correctly, because in __init__ you've used mismatching names (degree as the parameter but degree_ as the attribute). Read more in the developer guide. (I think you can get around this by editing set_params similar to how you edited get_params, but it would be much easier to actually rely on the BaseEstimator versions of those and just match the parameter names to the attribute names.)
Also, note that setting a parameter default to a list can have surprising effects. Consider alternatives to the default of poly_features in __init__.
class custom_poly_features(TransformerMixin, BaseEstimator):
    def __init__(self, degree=2, poly_features=['year', 'odometer']):
        self.degree = degree
        self.poly_features = poly_features
    def fit(self, X, y=None):
        self.poly_feat = PolynomialFeatures(degree=self.degree)
        # on scikit-learn >= 1.2 this argument is named sparse_output instead of sparse
        self.onehot = OneHotEncoder(sparse=False)
        self.not_poly_features_ = list(set(X.columns) - set(self.poly_features))
        self.poly_feat.fit(X[self.poly_features])
        self.onehot.fit(X[self.not_poly_features_])
        return self
    def transform(self, X, y=None):
        poly = self.poly_feat.transform(X[self.poly_features])
        poly = np.hstack([poly, self.onehot.transform(X[self.not_poly_features_])])
        return poly
There are some additional things you might want to add, like checks for whether poly_features or not_poly_features_ is empty (which would break the corresponding transformer).
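The point about matching parameter and attribute names can be checked in isolation. The following is only a sketch: MatchingNames is a hypothetical pass-through transformer, and the tuple default is just one possible alternative to a mutable list default.

```python
from sklearn.base import BaseEstimator, TransformerMixin, clone


class MatchingNames(TransformerMixin, BaseEstimator):
    """Hypothetical pass-through transformer; only the parameter handling matters."""

    def __init__(self, degree=2, poly_features=("year", "odometer")):
        # Attribute names are identical to the __init__ argument names, so the
        # get_params/set_params inherited from BaseEstimator work unchanged.
        # A tuple avoids a shared mutable default; convert it with
        # list(self.poly_features) before indexing a DataFrame.
        self.degree = degree
        self.poly_features = poly_features

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return X


est = MatchingNames()
est.set_params(degree=5)      # this is how a grid search applies each candidate
params = est.get_params()     # inherited, no override needed
cloned = clone(est)           # clone rebuilds the estimator from get_params()
print(params["degree"], cloned.degree)
```

With the degree_ naming from the question, set_params would set an attribute that transform never reads, which is why all candidates ended up with the same score.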
Finally, your custom estimator is just doing what a ColumnTransformer is meant to do. I think the only reason to prefer yours is if you need to search over which columns get which treatment; I don't think that's easy to do with a ColumnTransformer.
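from sklearn.compose import ColumnTransformer
custom_poly = ColumnTransformer(
    transformers=[('poly', PolynomialFeatures(), ['year', 'odometer'])],
    remainder=OneHotEncoder(),
)
# assumes custom_poly is used as the "cpf" step of the pipeline
param_grid = {"cpf__poly__degree": [3, 4, 5]}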
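Putting the pieces together, here is a rough end-to-end sketch of the ColumnTransformer variant inside a grid search. The data frame is synthetic and the "fuel" column is made up for illustration; handle_unknown="ignore" is my addition so that a category missing from a training fold does not break scoring, and error_score="raise" surfaces any remaining failure as a traceback instead of nan.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, PolynomialFeatures

# Synthetic stand-in for the car data (all column values are invented).
rng = np.random.default_rng(0)
n = 60
X = pd.DataFrame({
    "year": rng.integers(2000, 2020, size=n).astype(float),
    "odometer": rng.uniform(0, 200, size=n),
    "fuel": rng.choice(["gas", "diesel", "electric"], size=n),
})
y = 0.5 * (X["year"] - 2000) - 0.01 * X["odometer"] + rng.normal(size=n)

preprocess = ColumnTransformer(
    transformers=[("poly", PolynomialFeatures(), ["year", "odometer"])],
    # Every column not listed above is one-hot encoded.
    remainder=OneHotEncoder(handle_unknown="ignore"),
)
pipe = Pipeline(steps=[("cpf", preprocess), ("lin_reg", LinearRegression())])

search = GridSearchCV(
    pipe,
    {"cpf__poly__degree": [1, 2, 3]},
    cv=3,
    error_score="raise",  # fail loudly instead of recording nan
)
search.fit(X, y)
mean_scores = search.cv_results_["mean_test_score"]
print(search.best_params_)
print(mean_scores)
```

Because the encoder is fitted inside each training fold, the number of output columns stays consistent between fit and score.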

sklearn writing my own predictors with many parameters

I am writing my own Sklearn predictor calling a command-line tool that we have.
This command-line tool has many potential parameters (>200).
I understand that I need to list each argument individually in my estimator's __init__ signature, and that each argument should then be stored as an attribute in __init__.
From the documentation:
The arguments accepted by __init__ should all be keyword arguments with a default value.
Also, every keyword argument accepted by __init__ should correspond to an attribute on the instance.
def __init__(self, param1=1, param2=2):
    self.param1 = param1
    self.param2 = param2
So if I understand properly I cannot create a class for all these parameters (they will be used in several estimators and transformers)?
As I will have several estimators with these 200 parameters it is really not ideal. It will be difficult to maintain the code and it will be prone to errors.
Does anyone see a workaround for this? Maybe I misunderstood the Sklearn requirements?
Thanks.
Thibault
You might configure the command-line tool to dump its state.
For example, here's a simple CLI tool that takes two arguments and produces one output:
# File: `cli.py`
import argparse
import logging
PARSER = argparse.ArgumentParser()
PARSER.add_argument("-n1", type=int, default=1)
PARSER.add_argument("-n2", type=int, default=2)
ARGS = PARSER.parse_args()
X = [x for (_, x) in ARGS._get_kwargs()]
y = sum(X)
logging.basicConfig(
    format="%(message)s",
    filename="data.csv",
    encoding="utf-8",
    level=logging.INFO,
)
data_string = f"{y}," + ",".join([str(x) for x in X])
logging.info(data_string)
print(f"{ARGS.n1} + {ARGS.n2} = {y}")
If we call this a few times with different arguments:
python cli.py -n1 5 -n2 10
python cli.py -n2 6 -n1 4
python cli.py -n1 3
python cli.py -n2 6
python cli.py -n1 2 -n1 0 -n2 7
... our logging configuration dumps the output and the arguments into a file data.csv:
15,5,10
10,4,6
5,3,2
7,1,6
7,0,7
... which we can use to fit models or make predictions:
# File: `learn.py`
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import LeaveOneOut
import numpy as np
data = np.loadtxt("data.csv", delimiter=",")
X = data[:, 1:]
y = data[:, 0]
reg = DecisionTreeRegressor(random_state=0)
loo = LeaveOneOut()
for train_index, test_index in loo.split(X):
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]
    reg.fit(X_train, y_train)
    print("Predicted:", reg.predict(X_test), "Actual:", y_test)
# ---------- Output ------------
# Predicted: [10.] Actual: [15.]
# Predicted: [5.] Actual: [10.]
# Predicted: [10.] Actual: [5.]
# Predicted: [7.] Actual: [7.]
# Predicted: [7.] Actual: [7.]

Skipping a text line if the line is empty in tensorflow

I would like to process a text file containing sentences.
Each sentence is stored on its own line of that text file. I would like to retrieve each line using an iterator as follows:
class Reader(object):
    def __init__(self, file_name):
        dataset = tf.data.TextLineDataset(file_name)
        self._iterator = dataset.make_one_shot_iterator()
    def next_line(self):
        # What I want to do is skipping blank lines here.
        return self._iterator.get_next()
However, if the line is an empty line, I would like to skip that line. What would be the best way of implementing this skipping? I would like to implement that functionality in the above next_line method.
Any suggestions are welcome.
You just need to apply filter to the dataset:
dataset = dataset.filter(lambda line: tf.not_equal(tf.strings.length(line), 0))
Suppose your data are as follows (the file also contains blank lines, which are the ones to skip):
1
2,2
3,3,3
5,5,5
6,6,6
An example:
import tensorflow as tf
# TensorFlow 1.x API; eager execution is already the default in TensorFlow 2.x
tf.enable_eager_execution()
# keep only the lines whose length is not zero
dataset = tf.data.TextLineDataset('a.csv').filter(lambda line: tf.not_equal(tf.strings.length(line), 0))
iterator = dataset.make_one_shot_iterator()
while True:
    try:
        print(iterator.get_next())
    except tf.errors.OutOfRangeError:
        break
The result:
tf.Tensor(b'1', shape=(), dtype=string)
tf.Tensor(b'2,2', shape=(), dtype=string)
tf.Tensor(b'3,3,3', shape=(), dtype=string)
tf.Tensor(b'5,5,5', shape=(), dtype=string)
tf.Tensor(b'6,6,6', shape=(), dtype=string)
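For TensorFlow 2.x, where tf.enable_eager_execution and Dataset.make_one_shot_iterator are no longer available at the top level, the same idea might look like the sketch below. It keeps the Reader class from the question and assumes a throwaway sample file written to a temporary directory; the file contents are made up for the demonstration.

```python
import os
import tempfile

import tensorflow as tf


class Reader:
    def __init__(self, file_name):
        dataset = tf.data.TextLineDataset(file_name)
        # Keep only the lines whose length is not zero.
        dataset = dataset.filter(
            lambda line: tf.not_equal(tf.strings.length(line), 0)
        )
        # In TF 2.x a dataset is a plain Python iterable.
        self._iterator = iter(dataset)

    def next_line(self):
        # Raises StopIteration once the file is exhausted.
        return next(self._iterator).numpy().decode("utf-8")


# Hypothetical sample file containing two blank lines.
path = os.path.join(tempfile.mkdtemp(), "a.csv")
with open(path, "w") as f:
    f.write("1\n2,2\n\n3,3,3\n\n5,5,5\n")

reader = Reader(path)
lines = []
while True:
    try:
        lines.append(reader.next_line())
    except StopIteration:
        break
print(lines)
```

Note that this only drops lines that are completely empty. If lines consisting of whitespace should be skipped too, applying tf.strings.strip before the length check is one option.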

model.cuda() in pytorch

If I call model.cuda() in PyTorch, where model is a subclass of nn.Module, and I have four GPUs, how will it utilize the four GPUs, and how do I know which GPUs are being used?
If you have a custom module derived from nn.Module, then after model.cuda() all model parameters (the model.parameters() iterator can show you these) will end up on your CUDA device. By default that is the current device only, so the other GPUs are not used.
To check where your parameters are, just print them (cuda:0 in my case):
import torch.nn as nn
class M(nn.Module):
    'custom module'
    def __init__(self):
        super().__init__()
        self.lin = nn.Linear(784, 10)
m = M()
m.cuda()
# each printed parameter shows the device it lives on
for p in m.parameters():
    print(p)
# Parameter containing:
# tensor([[-0.0201, 0.0282, -0.0258, ..., 0.0056, 0.0146, 0.0220],
# [ 0.0098, -0.0264, 0.0283, ..., 0.0286, -0.0052, 0.0007],
# [-0.0036, -0.0045, -0.0227, ..., -0.0048, -0.0003, -0.0330],
# ...,
# [ 0.0217, -0.0008, 0.0029, ..., -0.0213, 0.0005, 0.0050],
# [-0.0050, 0.0320, 0.0013, ..., -0.0057, -0.0213, 0.0045],
# [-0.0302, 0.0315, 0.0356, ..., 0.0259, 0.0166, -0.0114]],
# device='cuda:0', requires_grad=True)
# Parameter containing:
# tensor([-0.0027, -0.0353, -0.0349, -0.0236, -0.0230, 0.0176, -0.0156, 0.0037,
# 0.0222, -0.0332], device='cuda:0', requires_grad=True)
You can also specify the device like this:
m.cuda('cuda:0')
With torch.cuda.device_count() you may check how many devices you have.
To expand on prosti's answer: to split your computations among multiple GPUs, you should use torch.nn.DataParallel or DistributedDataParallel.
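A minimal sketch of the DataParallel route, written so that it also runs on a machine with one GPU or none (it then falls back to a single device). The layer sizes and batch size are arbitrary choices for the demonstration.

```python
import torch
import torch.nn as nn

model = nn.Linear(784, 10)

n_gpus = torch.cuda.device_count()
device = torch.device("cuda:0" if n_gpus > 0 else "cpu")

if n_gpus > 1:
    # Replicates the module on every visible GPU and splits each input
    # batch along dimension 0; outputs are gathered back on cuda:0.
    model = nn.DataParallel(model)
model = model.to(device)

# The master copy of the parameters lives on a single device.
param_devices = {str(p.device) for p in model.parameters()}
print("GPUs visible:", n_gpus, "parameters on:", param_devices)

batch = torch.randn(32, 784, device=device)
out = model(batch)
print(out.shape)
```

To restrict which GPUs are used, DataParallel accepts a device_ids argument, and the CUDA_VISIBLE_DEVICES environment variable controls which devices the process can see at all.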

Grid search tuning

I am implementing KNN using python and it was working.
Now I get an error:
No module named 'sklearn.grid_search'
When I change the package to sklearn.model_selection, I get another error:
'GridSearchCV' object has no attribute 'grid_scores_'
Here is my code:
from sklearn.grid_search import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
import matplotlib.pyplot as plt
# define the parameter values that should be searched
# for python 2, k_range = range(1, 31)
# instantiate model
knn = KNeighborsClassifier(n_jobs=-1)
k_range = list(range(1, 31))
print(k_range)
# create a parameter grid: map the parameter names to the values that should be searched
# simply a python dictionary
# key: parameter name
# value: list of values that should be searched for that parameter
# single key-value pair for param_grid
param_grid = dict(n_neighbors=k_range)
print(param_grid)
# instantiate the grid
grid = GridSearchCV(knn, param_grid, cv=10, scoring='accuracy')
# fit the grid with data
grid.fit(X, y)
# view the complete results (list of named tuples)
grid.grid_scores_
# examine the first tuple
# we will slice the list and select its elements using dot notation and []
print('Parameters')
print(grid.grid_scores_[0].parameters)
# Array of 10 accuracy scores during 10-fold cv using the parameters
print('')
print('CV Validation Score')
print(grid.grid_scores_[0].cv_validation_scores)
# Mean of the 10 scores
print('')
print('Mean Validation Score')
print(grid.grid_scores_[0].mean_validation_score)
# create a list of the mean scores only
# list comprehension to loop through grid.grid_scores
grid_mean_scores = [result.mean_validation_score for result in grid.grid_scores_]
print(grid_mean_scores)
# plot the results
# this is identical to the one we generated above
plt.plot(k_range, grid_mean_scores)
plt.xlabel('Value of K for KNN')
plt.ylabel('Cross-Validated Accuracy')
# examine the best model
# Single best score achieved across all params (k)
print(grid.best_score_)
# Dictionary containing the parameters (k) used to generate that score
print(grid.best_params_)
# Actual model object fit with those best parameters
# Shows default parameters that we did not specify
print(grid.best_estimator_)
Try the import below:
from sklearn.model_selection import GridSearchCV
Reference:
https://scikit-learn.org/stable/auto_examples/model_selection/plot_grid_search_digits.html
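The import fixes the first error only. The grid_scores_ attribute was removed together with the old sklearn.grid_search module, and the same information now lives in the cv_results_ dictionary. Below is a sketch of how the lookups from the question could be rewritten; the iris data set is only a stand-in for the X and y used in the question.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)  # stand-in data, not from the question

k_range = list(range(1, 31))
grid = GridSearchCV(KNeighborsClassifier(), {"n_neighbors": k_range},
                    cv=10, scoring="accuracy")
grid.fit(X, y)

results = grid.cv_results_  # dict of arrays, one entry per candidate

# Replaces grid.grid_scores_[0].parameters
first_params = results["params"][0]

# Replaces grid.grid_scores_[0].cv_validation_scores (one value per fold)
first_fold_scores = [results[f"split{i}_test_score"][0] for i in range(10)]

# Replaces grid.grid_scores_[0].mean_validation_score
first_mean = results["mean_test_score"][0]

# Replaces the list comprehension over grid.grid_scores_
grid_mean_scores = list(results["mean_test_score"])

print(first_params)
print(first_mean)
print(grid.best_params_, grid.best_score_)
```

The plotting code from the question should then work unchanged with grid_mean_scores, and pandas.DataFrame(grid.cv_results_) gives a tabular view of all candidates.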
