MultiplicativeLR scheduler not working properly when calling scheduler.step() - pytorch

I am using the PyTorch Lightning framework and configure the optimizers like this:
def configure_optimizers(self):
    opt = torch.optim.Adam(self.model.parameters(), lr=cfg.learning_rate)
    # modified to fit lightning
    sch = torch.optim.lr_scheduler.MultiplicativeLR(opt, lr_lambda = 0.95) #decrease of 5% every epoch
    return [opt], [sch]
Then, in training_step, I can either call the lr_scheduler manually or let Lightning do it automatically.
Either way, I get this kind of error:
lr_scheduler["scheduler"].step()
File "/home/lsa/anaconda3/envs/randla_36/lib/python3.6/site-packages/torch/optim/lr_scheduler.py", line 152, in step
values = self.get_lr()
File "/home/lsa/anaconda3/envs/randla_36/lib/python3.6/site-packages/torch/optim/lr_scheduler.py", line 329, in get_lr
for lmbda, group in zip(self.lr_lambdas, self.optimizer.param_groups)]
File "/home/lsa/anaconda3/envs/randla_36/lib/python3.6/site-packages/torch/optim/lr_scheduler.py", line 329, in <listcomp>
for lmbda, group in zip(self.lr_lambdas, self.optimizer.param_groups)]
TypeError: 'float' object is not callable
But if I use any other scheduler, not only does VSCode recognize it as belonging to PyTorch, I also do not get this error.
Pytorch version 1.10
Lightning Version 1.5

I think you need to change the value of `lr_lambda`: it has to be a function, not a float.
Here is the link to the documentation: https://pytorch.org/docs/stable/generated/torch.optim.lr_scheduler.MultiplicativeLR.html
lr_lambda (function or list) – A function which computes a multiplicative factor given an integer parameter epoch, or a list of such functions, one for each group in optimizer.param_groups.
So, if you want a decrease of 5% every epoch, then you could do the following:
def configure_optimizers(self):
    opt = torch.optim.Adam(self.model.parameters(), lr=cfg.learning_rate)
    # modified to fit lightning
    lmbda = lambda epoch: 0.95
    sch = torch.optim.lr_scheduler.MultiplicativeLR(opt, lr_lambda = lmbda) #decrease of 5% every epoch
    return [opt], [sch]
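To see why the float fails, here is a minimal torch-free sketch of the update that the traceback points at. The helper `multiplicative_step` is hypothetical and only mimics the loop over `lr_lambdas` visible in the traceback; it is not PyTorch's actual implementation.

```python
# Torch-free stand-in for the multiplicative update seen in the traceback.
# `multiplicative_step` is a hypothetical helper, not part of PyTorch.

def multiplicative_step(lrs, lr_lambdas, epoch):
    """Multiply each learning rate by the factor its lambda returns for `epoch`."""
    return [lr * lmbda(epoch) for lr, lmbda in zip(lrs, lr_lambdas)]


lrs = [0.1]  # one learning rate per parameter group

# A callable works: every step scales the rate by 0.95.
for epoch in range(1, 4):
    lrs = multiplicative_step(lrs, [lambda epoch: 0.95], epoch)
# lrs[0] now equals 0.1 * 0.95 ** 3

# A bare float cannot be called, which reproduces the error from the question.
try:
    multiplicative_step([0.1], [0.95], epoch=1)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
```

The factor is applied to the current rate at every step, so a constant lambda gives a geometric decay.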

Related

Is it possible to use a custom generator to train multi input architecture with keras tensorflow 2.0.0?

With TF 2.0.0, I can train an architecture with one input, I can train an architecture with one input using a custom generator, and I can train an architecture with two inputs. But I can't train an architecture with two inputs using a custom generator.
To keep it minimalist, here's a simple example, with no generator and no multiple inputs to start with:
from tensorflow.keras import layers, models, Model, Input, losses
from numpy import random, array, zeros
input1 = Input(shape=2)
dense1 = layers.Dense(5)(input1)
fullModel = Model(inputs=input1, outputs=dense1)
fullModel.summary()
# Generate random examples:
nbSamples = 21
X_train = random.rand(nbSamples, 2)
Y_train = random.rand(nbSamples, 5)
batchSize = 4
fullModel.compile(loss=losses.LogCosh())
fullModel.fit(X_train, Y_train, epochs=10, batch_size=batchSize)
It's a simple dense layer that takes input vectors of size 2. The randomly generated dataset contains 21 examples and the batch size is 4. Instead of loading all the data and passing it to model.fit(), we can also pass a custom generator as input. The main advantage (for RAM consumption) is that only one batch is loaded at a time rather than the whole dataset. Here is a simple example with the previous architecture and a custom generator:
import json
# Save the last dataset in a file:
with open("./dataset1input.txt", 'w') as file:
    for i in range(nbSamples):
        example = {"x": X_train[i].tolist(), "y": Y_train[i].tolist()}
        file.write(json.dumps(example) + "\n")
def generator1input(datasetPath, batch_size, inputSize, outputSize):
    X_batch = zeros((batch_size, inputSize))
    Y_batch = zeros((batch_size, outputSize))
    i=0
    while True:
        with open(datasetPath, 'r') as file:
            for line in file:
                example = json.loads(line)
                X_batch[i] = array(example["x"])
                Y_batch[i] = array(example["y"])
                i+=1
                if i % batch_size == 0:
                    yield (X_batch, Y_batch)
                    i=0
fullModel.compile(loss=losses.LogCosh())
my_generator = generator1input("./dataset1input.txt", batchSize, 2, 5)
fullModel.fit(my_generator, epochs=10, steps_per_epoch=int(nbSamples/batchSize))
Here, the generator opens the dataset file, but loads only batch_size examples (not nbSamples examples) each time it is called and slides into the file while looping.
Now, I can build a simple functional architecture with 2 inputs, and no generator:
input1 = Input(shape=2)
dense1 = layers.Dense(5)(input1)
subModel1 = Model(inputs=input1, outputs=dense1)
input2 = Input(shape=3)
dense2 = layers.Dense(5)(input2)
subModel2 = Model(inputs=input2, outputs=dense2)
averageLayer = layers.average([subModel1.output, subModel2.output])
fullModel = Model(inputs=[input1, input2], outputs=averageLayer)
fullModel.summary()
# Generate random examples:
nbSamples = 21
X1 = random.rand(nbSamples, 2)
X2 = random.rand(nbSamples, 3)
Y = random.rand(nbSamples, 5)
fullModel.compile(loss=losses.LogCosh())
fullModel.fit([X1, X2], Y, epochs=10, batch_size=batchSize)
Up to this point, all models compile and run, but I'm not able to use a generator with the last architecture and its 2 inputs. I tried the following code (which, in my opinion, should logically work):
# Save data in a file:
with open("./dataset.txt", 'w') as file:
    for i in range(nbSamples):
        example = {"x1": X1[i].tolist(), "x2": X2[i].tolist(), "y": Y[i].tolist()}
        file.write(json.dumps(example) + "\n")
def generator(datasetPath, batch_size, inputSize1, inputSize2, outputSize):
    X1_batch = zeros((batch_size, inputSize1))
    X2_batch = zeros((batch_size, inputSize2))
    Y_batch = zeros((batch_size, outputSize))
    i=0
    while True:
        with open(datasetPath, 'r') as file:
            for line in file:
                example = json.loads(line)
                X1_batch[i] = array(example["x1"])
                X2_batch[i] = array(example["x2"])
                Y_batch[i] = array(example["y"])
                i+=1
                if i % batch_size == 0:
                    yield ([X1_batch, X2_batch], Y_batch)
                    i=0
fullModel.compile(loss=losses.LogCosh())
my_generator = generator("./dataset.txt", batchSize, 2, 3, 5)
fullModel.fit(my_generator, epochs=10, steps_per_epoch=(nbSamples//batchSize))
I obtain the following error:
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\keras\engine\training.py", line 729, in fit
use_multiprocessing=use_multiprocessing)
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py", line 224, in fit
distribution_strategy=strategy)
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py", line 547, in _process_training_inputs
use_multiprocessing=use_multiprocessing)
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py", line 606, in _process_inputs
use_multiprocessing=use_multiprocessing)
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\keras\engine\data_adapter.py", line 566, in __init__
reassemble, nested_dtypes, output_shapes=nested_shape)
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\data\ops\dataset_ops.py", line 540, in from_generator
output_types, tensor_shape.as_shape, output_shapes)
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\data\util\nest.py", line 471, in map_structure_up_to
results = [func(*tensors) for tensors in zip(*all_flattened_up_to)]
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\data\util\nest.py", line 471, in <listcomp>
results = [func(*tensors) for tensors in zip(*all_flattened_up_to)]
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\framework\tensor_shape.py", line 1216, in as_shape
return TensorShape(shape)
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\framework\tensor_shape.py", line 776, in __init__
self._dims = [as_dimension(d) for d in dims_iter]
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\framework\tensor_shape.py", line 776, in <listcomp>
self._dims = [as_dimension(d) for d in dims_iter]
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\framework\tensor_shape.py", line 718, in as_dimension
return Dimension(value)
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\framework\tensor_shape.py", line 193, in __init__
self._value = int(value)
TypeError: int() argument must be a string, a bytes-like object or a number, not 'tuple'
As explained in the docs, the x argument of model.fit() can be "A generator or keras.utils.Sequence returning (inputs, targets)", and "The iterator should return a tuple of length 1, 2, or 3, where the optional second and third elements will be used for y and sample_weight respectively." Thus, I think that it cannot take more than one generator as input. Perhaps multiple inputs are not possible with a custom generator. Could you explain why, or suggest a solution?
(Otherwise, it seems possible to go through tf.data.Dataset.from_generator() with a less custom approach, but I have difficulty understanding what to pass in the output_signature argument.)
[EDIT] Thank you for your response, @Francis Tang. In fact, it is possible to use a custom generator; your answer made me realize that I just had to change the line:
yield ([X1_batch, X2_batch], Y_batch)
To:
yield (X1_batch, X2_batch), Y_batch
Nevertheless, it is indeed perhaps better to use tf.keras.utils.Sequence. But I find it a bit restrictive.
In particular, in the example given (as well as in most of the examples I could find about Sequence), __init__() is first used to load the full dataset, which defeats the purpose of a generator.
But maybe that was just one particular example of Sequence, and there is no need to use __init__() like that: you can read a file directly and load the desired batch in __getitem__().
In that case, it seems you have to scan the data file on every call, or else create one file per batch beforehand (not really optimal).
You need to write a class using Sequence (https://www.tensorflow.org/api_docs/python/tf/keras/utils/Sequence), for example:
import pickle
from tensorflow.keras.utils import Sequence
class generator(Sequence):
    def __init__(self,filename,batch_size):
        # load the pickled dataset once, when the Sequence is created
        with open(filename,'rb') as f:
            data = pickle.load(f)
        self.X1 = data['X1']
        self.X2 = data['X2']
        self.y = data['y']
        self.bs = batch_size
    def __len__(self):
        # number of batches per epoch, counting a final partial batch
        return (len(self.y) - 1) // self.bs + 1
    def __getitem__(self,idx):
        start, end = idx * self.bs, (idx+1) * self.bs
        # the two inputs are returned as a tuple, followed by the targets
        return (self.X1[start:end], self.X2[start:end]), self.y[start:end]
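On the concern about loading the whole dataset in __init__(): one possible workaround is to record only the byte offset of each line up front and read a single batch per __getitem__() call. The sketch below is a plain-Python illustration under assumptions: the class name `LazyJsonLinesBatches` is hypothetical, the file uses the JSON-lines layout from the question, and the class follows the __len__/__getitem__ protocol without importing Keras (to use it with model.fit() you would presumably subclass Sequence and convert the lists to NumPy arrays).

```python
import json
import os
import tempfile


class LazyJsonLinesBatches:
    """Sequence-style reader: indexes line offsets once, loads one batch per call.

    Hypothetical plain-Python class; it only follows the __len__/__getitem__
    protocol and does not import Keras.
    """

    def __init__(self, path, batch_size):
        self.path = path
        self.batch_size = batch_size
        self.offsets = []  # byte offset of the start of every line
        with open(path, "rb") as f:
            offset = 0
            for line in f:
                self.offsets.append(offset)
                offset += len(line)

    def __len__(self):
        # Number of batches, counting a final partial batch.
        return (len(self.offsets) - 1) // self.batch_size + 1

    def __getitem__(self, idx):
        start = idx * self.batch_size
        end = min(start + self.batch_size, len(self.offsets))
        x1, x2, y = [], [], []
        with open(self.path, "rb") as f:
            f.seek(self.offsets[start])  # jump straight to the first line of the batch
            for _ in range(end - start):
                example = json.loads(f.readline())
                x1.append(example["x1"])
                x2.append(example["x2"])
                y.append(example["y"])
        # Inputs are grouped in a tuple, as in the fix from the question's edit.
        return (x1, x2), y


# Small demo file with 21 samples, mirroring the sizes used in the question.
path = os.path.join(tempfile.mkdtemp(), "dataset.txt")
with open(path, "w") as f:
    for i in range(21):
        f.write(json.dumps({"x1": [i, i], "x2": [i, i, i], "y": [i] * 5}) + "\n")

batches = LazyJsonLinesBatches(path, batch_size=4)
(x1_second, x2_second), y_second = batches[1]
(x1_last, x2_last), y_last = batches[len(batches) - 1]
```

Only a list of integers stays in memory, and each batch is read with one seek, so the file is neither scanned from the start nor split into one file per batch.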

Theano error when using PyMC3: theano.gof.fg.MissingInputError

I am generating some (noisy) data-points (y) with some known parameters (m,c) that represent the equation of a straight line. Using sampling-based Bayesian methods, I now want to know the true values of parameters (m,c) from the data. Therefore, I am using DE Metropolis (PyMC3) to estimate the true parameters.
I am getting the Theano error theano.gof.fg.MissingInputError: Input 0 of the graph (indices start from 0), used to compute sigmoid(c_interval__), was not provided and not given a value.
Theano version: 1.0.4
PyMC3 version: 3.9.1
import matplotlib.pyplot as plt
import numpy as np
import arviz as az
import pymc3
import theano.tensor as tt
from theano.compile.ops import as_op
plt.style.use("ggplot")
# define a theano Op for our likelihood function
class LogLike(tt.Op):
    itypes = [tt.dvector] # expects a vector of parameter values when called
    otypes = [tt.dscalar] # outputs a single scalar value (the log likelihood)
    def __init__(self, loglike, data, x, sigma):
        # add inputs as class attributes
        self.likelihood = loglike
        self.data = data
        self.x = x
        self.sigma = sigma
    def perform(self, node, inputs, outputs):
        # the method that is used when calling the Op
        theta, = inputs # this will contain my variables
        # call the log-likelihood function
        logl = self.likelihood(theta, self.x, self.data, self.sigma)
        outputs[0][0] = np.array(logl) # output the log-likelihood
def my_model(theta, x):
    y = theta[0]*x + theta[1]
    return y
def my_loglike(theta, x, data, sigma):
    model = my_model(theta, x)
    ll = -(0.5/sigma**2)*np.sum((data - model)**2)
    return ll
# set up our data
N = 10 # number of data points
sigma = 1. # standard deviation of noise
x = np.linspace(0., 9., N)
mtrue = 0.4 # true gradient
ctrue = 3. # true y-intercept
truemodel = my_model([mtrue, ctrue], x)
# make data
np.random.seed(716742) # set random seed, so the data is reproducible each time
data = sigma*np.random.randn(N) + truemodel
print(data)
ndraws = 3000 # number of draws from the distribution
# create our Op
logl = LogLike(my_loglike, data, x, sigma)
# use PyMC3 to sample from the log-likelihood
with pymc3.Model():
    # uniform priors on m and c
    m = pymc3.Uniform('m', lower=-10., upper=10.)
    c = pymc3.Uniform('c', lower=-10., upper=10.)
    # convert m and c to a tensor vector
    theta = tt.as_tensor_variable([m, c])
    # use a DensityDist (use a lambda function to "call" the Op)
    pymc3.DensityDist('likelihood', lambda v: logl(v), observed={'v': theta})
    step = pymc3.DEMetropolis()
    trace = pymc3.sample(ndraws, step)
# plot the traces
axes = az.plot_trace(trace)
fig = axes.ravel()[0].figure
fig.savefig('./trace_plots.png')
Find the full trace here:
Population sampling (4 chains)
DEMetropolis: [c, m]
Attempting to parallelize chains to all cores. You can turn this off with `pm.sample(cores=1)`.
Population parallelization failed. Falling back to sequential stepping of chains.---------------------| 0.00% [0/4 00:00<00:00]
Sampling 4 chains for 0 tune and 4_000 draw iterations (0 + 16_000 draws total) took 5 seconds.███████| 100.00% [4000/4000 00:04<00:00]
Traceback (most recent call last):
File "test.py", line 75, in <module>
trace = pymc3.sample(ndraws, step)
File "/home/csl_user/.local/lib/python3.7/site-packages/pymc3/sampling.py", line 599, in sample
idata = arviz.from_pymc3(trace, **ikwargs)
File "/home/csl_user/.local/lib/python3.7/site-packages/arviz/data/io_pymc3.py", line 531, in from_pymc3
save_warmup=save_warmup,
File "/home/csl_user/.local/lib/python3.7/site-packages/arviz/data/io_pymc3.py", line 159, in __init__
self.observations, self.multi_observations = self.find_observations()
File "/home/csl_user/.local/lib/python3.7/site-packages/arviz/data/io_pymc3.py", line 172, in find_observations
multi_observations[key] = val.eval() if hasattr(val, "eval") else val
File "/home/csl_user/.local/lib/python3.7/site-packages/theano/gof/graph.py", line 522, in eval
self._fn_cache[inputs] = theano.function(inputs, self)
File "/home/csl_user/.local/lib/python3.7/site-packages/theano/compile/function.py", line 317, in function
output_keys=output_keys)
File "/home/csl_user/.local/lib/python3.7/site-packages/theano/compile/pfunc.py", line 486, in pfunc
output_keys=output_keys)
File "/home/csl_user/.local/lib/python3.7/site-packages/theano/compile/function_module.py", line 1839, in orig_function
name=name)
File "/home/csl_user/.local/lib/python3.7/site-packages/theano/compile/function_module.py", line 1487, in __init__
accept_inplace)
File "/home/csl_user/.local/lib/python3.7/site-packages/theano/compile/function_module.py", line 181, in std_fgraph
update_mapping=update_mapping)
File "/home/csl_user/.local/lib/python3.7/site-packages/theano/gof/fg.py", line 175, in __init__
self.__import_r__(output, reason="init")
File "/home/csl_user/.local/lib/python3.7/site-packages/theano/gof/fg.py", line 346, in __import_r__
self.__import__(variable.owner, reason=reason)
File "/home/csl_user/.local/lib/python3.7/site-packages/theano/gof/fg.py", line 391, in __import__
raise MissingInputError(error_msg, variable=r)
theano.gof.fg.MissingInputError: Input 0 of the graph (indices start from 0), used to compute sigmoid(c_interval__), was not provided and not given a value. Use the Theano flag exception_verbosity='high', for more information on this error.
I've run into the same problem when following the example on how to sample from a black-box likelihood found here:
https://docs.pymc.io/notebooks/blackbox_external_likelihood.html
This seems to be a version problem. I'm on Manjaro Linux and also ran theano 1.0.4 and pymc3 3.9 using python 3.8. I could solve the issue and make the code work by downgrading to python 3.7 and pymc3 3.8. This seems to be an issue with python 3.8, as simply downgrading pymc3 did not solve the issue for me. I am far from an expert in pymc3, so I don't know how to fix this with the newest versions, but for now downgrading makes my simulations run.
Hope this helps.
Edit: The devs seem to be aware of this; there is an open issue on their GitHub page:
https://github.com/pymc-devs/pymc3/issues/4002

BoostedTreeClassifier gets stuck on loss on the first step

I'm trying to run a simple BoostedTreesClassifier on my dataset, following the example, but it seems to get stuck on the first step:
2019-06-28 11:20:31.658689: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:111] Filling up shuffle buffer (this may take a while): 84090 of 85873
2019-06-28 11:20:32.908425: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:162] Shuffle buffer filled.
I0628 11:20:34.904214 140220602029888 basic_session_run_hooks.py:262] loss = 0.6931464, step = 0
W0628 11:21:03.421219 140220602029888 basic_session_run_hooks.py:724] It seems that global step (tf.train.get_global_step) has not been increased. Current value (could be stable): 0 vs previous value: 0. You could increase the global step by passing tf.train.get_global_step() to Optimizer.apply_gradients or Optimizer.minimize.
W0628 11:21:05.555618 140220602029888 basic_session_run_hooks.py:724] It seems that global step (tf.train.get_global_step) has not been increased. Current value (could be stable): 0 vs previous value: 0. You could increase the global step by passing tf.train.get_global_step() to Optimizer.apply_gradients or Optimizer.minimize.
The same dataset seems to work fine when I pass it to other keras based model or xgboost model.
Here's the relevant code:
def make_input_fn(self, X, y, shuffle=True, num_epochs=None):
    num_samples = len(self.y_train)
    def input_fn():
        dataset = tf.data.Dataset.from_tensor_slices((dict(X), y))
        if shuffle:
            dataset = dataset.shuffle(num_samples).repeat(num_epochs).batch(self.batch_size)
        else:
            dataset = dataset.repeat(num_epochs).batch(self.batch_size)
        return dataset
    return input_fn
def ens_train(self):
    tf.compat.v1.logging.set_verbosity(tf.compat.v1.logging.DEBUG)
    train_input_fn = self.make_input_fn(self.X_train, self.y_train, num_epochs=self.epochs)
    self.model = tf.estimator.BoostedTreesClassifier(self.feature_columns,
        n_batches_per_layer = int(0.5* len(self.y_train)/self.batch_size),
        model_dir = self.ofolder,
        max_depth = 10,
        n_trees = 1000)
    self.model.train(train_input_fn, max_steps = 1000)
I was able to get a result by playing around with the learning rate and the number of epochs. The "best" parameters obtained by hyperparameter tuning on xgboost don't give similar results in BoostedTreesClassifier. It took a large number of epochs to get to around 84% accuracy (balanced dataset), whereas xgboost gave 95% without any hyperparameter tuning.

Building my own tf.Estimator, how did model_params overwrite model_dir? RuntimeWarning?

Recently I built a customized deep neural net model using TFLearn, which claims to bring deep learning to the scikit-learn estimator API. I could train models and make predictions, but I couldn't get the scoring (evaluate) function to work, so I couldn't do cross-validation. I tried to ask questions about TFLearn in various places, but I got no responses.
It appears that TensorFlow itself has an estimator class. So I am putting TFLearn aside, and I'm trying to follow the guide at https://www.tensorflow.org/extend/estimators. Somehow I'm managing to get variables where they don't belong. Can anyone spot my problem? I will post code and the output.
Note: Of course, I can see the RuntimeWarning at the top of the output. I have found references to this warning online, but so far everyone claims it's harmless. Maybe it is not...
CODE:
import tensorflow as tf
from my_library import Database, l2_angle_distance
def my_model_function(topology, params):
    # This function will eventually be a function factory. This should
    # allow easy exploration of hyperparameters. For now, this just
    # returns a single, fixed model_fn.
    def model_fn(features, labels, mode):
        # Input layer
        net = tf.layers.conv1d(features["x"], topology[0], 3, activation=tf.nn.relu)
        net = tf.layers.dropout(net, 0.25)
        # The core of the network is here (convolutional layers only for now).
        for nodes in topology[1:]:
            net = tf.layers.conv1d(net, nodes, 3, activation=tf.nn.relu)
            net = tf.layers.dropout(net, 0.25)
        sh = tf.shape(features["x"])
        net = tf.reshape(net, [sh[0], sh[1], 3, 2])
        predictions = tf.nn.l2_normalize(net, dim=3)
        # PREDICT EstimatorSpec
        if mode == tf.estimator.ModeKeys.PREDICT:
            return tf.estimator.EstimatorSpec(mode=mode,
                                              predictions={"vectors": predictions})
        # TRAIN or EVAL EstimatorSpec
        loss = l2_angle_distance(labels, predictions)
        optimizer = tf.train.GradientDescentOptimizer(learning_rate=params["learning_rate"])
        train_op = optimizer.minimize(loss=loss, global_step=tf.train.get_global_step())
        return tf.estimator.EstimatorSpec(mode, predictions, loss, train_op)
    return model_fn
##===================================================================
window = "whole"
encoding = "one_hot"
db = Database("/home/bwllc/Documents/Files for ML/compact")
traindb, testdb = db.train_test_split()
train_features, train_labels = traindb.values(window, encoding)
test_features, test_labels = testdb.values(window, encoding)
# Create the model.
tf.logging.set_verbosity(tf.logging.INFO)
LEARNING_RATE = 0.01
topology = (60,40,20)
model_params = {"learning_rate": LEARNING_RATE}
model_fn = my_model_function(topology, model_params)
model = tf.estimator.Estimator(model_fn, model_params)
print("\nmodel_dir? No? Why not? ", model.model_dir, "\n") # This documents the error
# Input function.
my_input_fn = tf.estimator.inputs.numpy_input_fn({"x" : train_features}, train_labels, shuffle=True)
# Train the model.
model.train(input_fn=my_input_fn, steps=20)
OUTPUT
/usr/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: compiletime version 3.5 of module 'tensorflow.python.framework.fast_tensor_util' does not match runtime version 3.6
return f(*args, **kwds)
INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_model_dir': {'learning_rate': 0.01}, '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': None, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f0b55279048>, '_task_type': 'worker', '_task_id': 0, '_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
model_dir? No? Why not? {'learning_rate': 0.01}
INFO:tensorflow:Create CheckpointSaverHook.
Traceback (most recent call last):
File "minimal_estimator_bug_example.py", line 81, in <module>
model.train(input_fn=my_input_fn, steps=20)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/estimator/estimator.py", line 302, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/estimator/estimator.py", line 756, in _train_model
scaffold=estimator_spec.scaffold)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/basic_session_run_hooks.py", line 411, in __init__
self._save_path = os.path.join(checkpoint_dir, checkpoint_basename)
File "/usr/lib/python3.6/posixpath.py", line 78, in join
a = os.fspath(a)
TypeError: expected str, bytes or os.PathLike object, not dict
------------------
(program exited with code: 1)
Press return to continue
I can see exactly what went wrong: model_dir (which I left as the default) somehow got bound to the value I intended for model_params. How did this happen in my code? I can't see it.
If anyone has advice or suggestions, I would greatly appreciate them. Thanks!
Simply because you're passing model_params as model_dir when you construct your Estimator.
From the TensorFlow documentation, the Estimator __init__ function is:
__init__(
    model_fn,
    model_dir=None,
    config=None,
    params=None
)
Notice how the second argument is model_dir. If you want to specify only params, you need to pass it as a keyword argument:
model = tf.estimator.Estimator(model_fn, params=model_params)
Or specify all the preceding positional arguments:
model = tf.estimator.Estimator(model_fn, None, None, model_params)
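The binding rule can be checked without TensorFlow. In this small sketch, `estimator_init` is a hypothetical stand-in that only copies the parameter order of the signature quoted above; it is not TensorFlow code.

```python
# Stand-in with the same parameter order as the quoted __init__ signature.
# `estimator_init` is a hypothetical function, not part of TensorFlow.
def estimator_init(model_fn, model_dir=None, config=None, params=None):
    return {"model_dir": model_dir, "config": config, "params": params}


model_params = {"learning_rate": 0.01}

# The second positional argument lands in model_dir, as in the question.
wrong = estimator_init(None, model_params)

# A keyword argument reaches params and leaves model_dir at its default.
right = estimator_init(None, params=model_params)
```

Here `wrong["model_dir"]` holds the dict, which matches the `'_model_dir': {'learning_rate': 0.01}` entry in the logged config.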

scikit-learn: building a learning curve with SVC

I'm trying to plot a learning curve using the SVC classifier. The dataset is kind of skewed, with classes of about 150, 1000, 1000, 1000 and 150 samples. I'm running into a problem when fitting the estimator:
File "/Users/carrier24sg/.virtualenvs/ml/lib/python2.7/site-packages/sklearn/learning_curve.py", line 135, in learning_curve
for train, test in cv for n_train_samples in train_sizes_abs)
File "/Users/carrier24sg/.virtualenvs/ml/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.py", line 644, in __call__
self.dispatch(function, args, kwargs)
File "/Users/carrier24sg/.virtualenvs/ml/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.py", line 391, in dispatch
job = ImmediateApply(func, args, kwargs)
File "/Users/carrier24sg/.virtualenvs/ml/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.py", line 129, in __init__
self.results = func(*args, **kwargs)
File "/Users/carrier24sg/.virtualenvs/ml/lib/python2.7/site-packages/sklearn/cross_validation.py", line 1233, in _fit_and_score
estimator.fit(X_train, y_train, **fit_params)
File "/Users/carrier24sg/.virtualenvs/ml/lib/python2.7/site-packages/sklearn/svm/base.py", line 140, in fit
X = atleast2d_or_csr(X, dtype=np.float64, order='C')
File "/Users/carrier24sg/.virtualenvs/ml/lib/python2.7/site-packages/sklearn/svm/base.py", line 450, in _validate_targets
% len(cls))
ValueError: The number of classes has to be greater than one; got 1
My code
df = pd.read_csv('../resources/problem2_processed_validate.csv')
data, label = preprocess_text(df)
cv = StratifiedKFold(label, 10)
plt = plot_learning_curve(estimator=SVC(), title="Learning curve", X=data, y=label.values, cv
train_sizes, train_scores, test_scores = learning_curve(
estimator, data, y=label, cv=cv, train_sizes=np.linspace(.1, 1.0, 5))
Even though I use stratified sampling, I still run into this error. I believe it's because the learning curve code doesn't perform stratification when incrementing the dataset size, so at one step all the samples have the same class label.
How should I resolve this?
You could use StratifiedShuffleSplit instead of StratifiedKFold, and then write the learning curve loop yourself, creating a new CV object at each iteration. StratifiedShuffleSplit allows you to specify a train_size and a test_size which you can increment as you create your learning curve. As long as you let train_size be greater than the number of classes, it will be able to stratify.
You are right. learning_curve doesn't perform stratification when creating a smaller data set, it just takes the first bit of the data. Lines 134-136 in learning_curve.py say
train[:n_train_samples] for n_train_samples in train_sizes_abs
You can shuffle your data in advance, so that the slice train[:n_train_samples] may (but is not guaranteed to) include data points from all classes. If you are willing to do some more work, what @eickenberg proposed will work.
PS This sounds like something that should be included in sklearn. If you do end up writing that code, please send a pull request on GitHub.
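To illustrate what stratifying at each training size means, here is a minimal pure-Python sketch. The helper `stratified_subset` is hypothetical and only approximates what StratifiedShuffleSplit does with a given train_size; it is not the scikit-learn implementation, and because of rounding the subset can differ from n_train by a few samples.

```python
import random
from collections import defaultdict


def stratified_subset(labels, n_train, seed=0):
    """Pick about n_train indices whose class mix follows the full label list.

    Hypothetical helper standing in for a stratified split; every class
    keeps at least one sample, so the estimator always sees all classes.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    if n_train < len(by_class):
        raise ValueError("n_train must be at least the number of classes")
    rng = random.Random(seed)
    chosen = []
    for indices in by_class.values():
        rng.shuffle(indices)
        # proportional share, but never fewer than one sample per class
        share = max(1, round(n_train * len(indices) / len(labels)))
        chosen.extend(indices[:share])
    return chosen


# Class sizes taken from the question; the labels are sorted on purpose,
# which is the case where a plain prefix slice contains too few classes.
labels = [0] * 150 + [1] * 1000 + [2] * 1000 + [3] * 1000 + [4] * 150
subset = stratified_subset(labels, 330)
prefix_classes = set(labels[:150])              # what train[:n] would see
subset_classes = {labels[i] for i in subset}    # what the stratified pick sees
```

In a hand-written learning curve loop, you would call this once per training size, then fit and score the estimator on the selected indices.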

Resources