How to use a tflearn trained model in an application?

I am currently trying to use a trained model in an application.
I've been using this code to generate US city names with an LSTM model. The code works fine and I do manage to get city names.
Right now, I am trying to save the model so I can load it in a different application without training the model again.
Here is the code of my basic application:
from __future__ import absolute_import, division, print_function
import os
from six import moves
import ssl
import tflearn
from tflearn.data_utils import *
path = "US_cities.txt"
maxlen = 20
X, Y, char_idx = textfile_to_semi_redundant_sequences(
    path, seq_maxlen=maxlen, redun_step=3)
# --- Create LSTM model
g = tflearn.input_data(shape=[None, maxlen, len(char_idx)])
g = tflearn.lstm(g, 512, return_seq=True, name="lstm1")
g = tflearn.dropout(g, 0.5, name='dropout1')
g = tflearn.lstm(g, 512, name='lstm2')
g = tflearn.dropout(g, 0.5, name='dropout')
g = tflearn.fully_connected(g, len(char_idx), activation='softmax', name='fc')
g = tflearn.regression(g, optimizer='adam', loss='categorical_crossentropy',
                       learning_rate=0.001)
# --- Initializing model and loading
model = tflearn.models.generator.SequenceGenerator(g, char_idx)
model.load('myModel.tfl')
print("Model is now loaded !")
#
# Main Application
#
while True:
    user_choice = input("Do you want to generate U.S. city names? [y/n] ")
    if user_choice == 'y':
        seed = random_sequence_from_textfile(path, 20)
        print("-- Test with temperature of 1.5 --")
        model.generate(20, temperature=1.5, seq_seed=seed, display=True)
    else:
        exit()
And here is the output:
Do you want to generate U.S. city names? [y/n] y
-- Test with temperature of 1.5 --
rk
Orange Park Acres
Traceback (most recent call last):
File "App.py", line 46, in <module>
model.generate(20, temperature=1.5, seq_seed=seed, display=True)
File "/usr/local/lib/python3.5/dist-packages/tflearn/models/generator.py", line 216, in generate
preds = self._predict(x)[0]
File "/usr/local/lib/python3.5/dist-packages/tflearn/models/generator.py", line 180, in _predict
return self.predictor.predict(feed_dict)
File "/usr/local/lib/python3.5/dist-packages/tflearn/helpers/evaluator.py", line 69, in predict
o_pred = self.session.run(output, feed_dict=feed_dict).tolist()
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 717, in run
run_metadata_ptr)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 894, in _run
% (np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (1, 25, 61) for Tensor 'InputData/X:0', which has shape '(?, 20, 61)'
Unfortunately, I can't see why the shape changes when generate() is called in my app. Could anyone help me solve this problem?
Thank you in advance
William

SOLVED?
One solution is to simply add "modes" to the Python script using an argument parser:
import argparse
parser = argparse.ArgumentParser()
parser.add_argument("mode", help="Train or/and test", nargs='+', choices=["train","test"])
args = parser.parse_args()
And then:
if "train" in args.mode:  # args.mode is a list, because of nargs='+'
    # define your model
    # train the model
    model.save('my_model.tflearn')
if "test" in args.mode:
    model.load('my_model.tflearn')
    # do whatever you want with your model
I don't really understand why this works, or why loading the model from a different script doesn't.
But I guess this should be fine for the moment...
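Looking back at the original error, one plausible cause (worth checking against the tflearn source) is that SequenceGenerator takes its own seq_maxlen argument, whose default (25, if I read the implementation correctly) differs from the maxlen=20 used to build the input layer; that would explain feeds of shape (1, 25, 61) against a placeholder of shape (?, 20, 61). A minimal sketch of this hypothetical fix:
# Hypothesis: pass seq_maxlen explicitly so the generator builds feeds
# matching the input layer (maxlen=20) instead of the class default.
model = tflearn.models.generator.SequenceGenerator(g, dictionary=char_idx,
                                                   seq_maxlen=maxlen)
model.load('myModel.tfl')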

Related

How to use the multiple heads option in the SelfAttention class?

I am playing around with the SelfAttention layer from the trax library.
When I set n_heads=1, everything works fine. But when I set n_heads=2, my code breaks.
I use only input activations and one SelfAttention layer.
Here is a minimal example:
import trax
import numpy as np
attention = trax.layers.SelfAttention(n_heads=2)
activations = np.random.randint(0, 10, (1, 100, 1)).astype(np.float32)
inputs = (activations,)
init = attention.init(inputs)
output = attention(inputs)
But I get an error:
File [...]/site-packages/jax/linear_util.py, line 166, in call_wrapped
ans = self.f(*args, **dict(self.params, **kwargs))
File [...]/layers/research/efficient_attention.py, line 1637, in forward_unbatched_h
return forward_unbatched(*i_h, weights=w_h, state=s_h)
File [...]/layers/research/efficient_attention.py, line 1175, in forward_unbatched
q_info = kv_info = np.arange(q.shape[-2], dtype=np.int32)
IndexError: tuple index out of range
What am I doing wrong?

Is it possible to use a custom generator to train a multi-input architecture with Keras / TensorFlow 2.0.0?

With TF 2.0.0, I can train an architecture with one input, I can train an architecture with one input using a custom generator, and I can train an architecture with two inputs. But I can't train an architecture with two inputs using a custom generator.
To keep it minimalist, here's a simple example, with no generator and no multiple inputs to start with:
from tensorflow.keras import layers, models, Model, Input, losses
from numpy import random, array, zeros
input1 = Input(shape=2)
dense1 = layers.Dense(5)(input1)
fullModel = Model(inputs=input1, outputs=dense1)
fullModel.summary()
# Generate random examples:
nbSamples = 21
X_train = random.rand(nbSamples, 2)
Y_train = random.rand(nbSamples, 5)
batchSize = 4
fullModel.compile(loss=losses.LogCosh())
fullModel.fit(X_train, Y_train, epochs=10, batch_size=batchSize)
It's a simple dense layer that takes input vectors of size 2. The randomly generated dataset contains 21 examples and the batch size is 4. Instead of loading all the data and passing it to model.fit(), we can also pass a custom generator. The main advantage (for RAM consumption) is to load only one batch at a time rather than the whole dataset. Here is a simple example with the previous architecture and a custom generator:
import json
# Save the last dataset in a file:
with open("./dataset1input.txt", 'w') as file:
    for i in range(nbSamples):
        example = {"x": X_train[i].tolist(), "y": Y_train[i].tolist()}
        file.write(json.dumps(example) + "\n")
def generator1input(datasetPath, batch_size, inputSize, outputSize):
    X_batch = zeros((batch_size, inputSize))
    Y_batch = zeros((batch_size, outputSize))
    i = 0
    while True:
        with open(datasetPath, 'r') as file:
            for line in file:
                example = json.loads(line)
                X_batch[i] = array(example["x"])
                Y_batch[i] = array(example["y"])
                i += 1
                if i % batch_size == 0:
                    yield (X_batch, Y_batch)
                    i = 0
fullModel.compile(loss=losses.LogCosh())
my_generator = generator1input("./dataset1input.txt", batchSize, 2, 5)
fullModel.fit(my_generator, epochs=10, steps_per_epoch=int(nbSamples/batchSize))
Here, the generator opens the dataset file but loads only batch_size examples (not nbSamples examples) each time it is called, advancing through the file as it loops.
Now, I can build a simple functional architecture with 2 inputs, and no generator:
input1 = Input(shape=2)
dense1 = layers.Dense(5)(input1)
subModel1 = Model(inputs=input1, outputs=dense1)
input2 = Input(shape=3)
dense2 = layers.Dense(5)(input2)
subModel2 = Model(inputs=input2, outputs=dense2)
averageLayer = layers.average([subModel1.output, subModel2.output])
fullModel = Model(inputs=[input1, input2], outputs=averageLayer)
fullModel.summary()
# Generate random examples:
nbSamples = 21
X1 = random.rand(nbSamples, 2)
X2 = random.rand(nbSamples, 3)
Y = random.rand(nbSamples, 5)
fullModel.compile(loss=losses.LogCosh())
fullModel.fit([X1, X2], Y, epochs=10, batch_size=batchSize)
Up to this point, all models compile and run, but I'm not able to use a generator with the last architecture and its 2 inputs... I tried the following code, which I would logically expect to work:
# Save data in a file:
with open("./dataset.txt", 'w') as file:
    for i in range(nbSamples):
        example = {"x1": X1[i].tolist(), "x2": X2[i].tolist(), "y": Y[i].tolist()}
        file.write(json.dumps(example) + "\n")
def generator(datasetPath, batch_size, inputSize1, inputSize2, outputSize):
    X1_batch = zeros((batch_size, inputSize1))
    X2_batch = zeros((batch_size, inputSize2))
    Y_batch = zeros((batch_size, outputSize))
    i = 0
    while True:
        with open(datasetPath, 'r') as file:
            for line in file:
                example = json.loads(line)
                X1_batch[i] = array(example["x1"])
                X2_batch[i] = array(example["x2"])
                Y_batch[i] = array(example["y"])
                i += 1
                if i % batch_size == 0:
                    yield ([X1_batch, X2_batch], Y_batch)
                    i = 0
fullModel.compile(loss=losses.LogCosh())
my_generator = generator("./dataset.txt", batchSize, 2, 3, 5)
fullModel.fit(my_generator, epochs=10, steps_per_epoch=(nbSamples//batchSize))
I obtain the following error:
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\keras\engine\training.py", line 729, in fit
use_multiprocessing=use_multiprocessing)
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py", line 224, in fit
distribution_strategy=strategy)
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py", line 547, in _process_training_inputs
use_multiprocessing=use_multiprocessing)
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py", line 606, in _process_inputs
use_multiprocessing=use_multiprocessing)
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\keras\engine\data_adapter.py", line 566, in __init__
reassemble, nested_dtypes, output_shapes=nested_shape)
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\data\ops\dataset_ops.py", line 540, in from_generator
output_types, tensor_shape.as_shape, output_shapes)
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\data\util\nest.py", line 471, in map_structure_up_to
results = [func(*tensors) for tensors in zip(*all_flattened_up_to)]
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\data\util\nest.py", line 471, in <listcomp>
results = [func(*tensors) for tensors in zip(*all_flattened_up_to)]
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\framework\tensor_shape.py", line 1216, in as_shape
return TensorShape(shape)
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\framework\tensor_shape.py", line 776, in __init__
self._dims = [as_dimension(d) for d in dims_iter]
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\framework\tensor_shape.py", line 776, in <listcomp>
self._dims = [as_dimension(d) for d in dims_iter]
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\framework\tensor_shape.py", line 718, in as_dimension
return Dimension(value)
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\framework\tensor_shape.py", line 193, in __init__
self._value = int(value)
TypeError: int() argument must be a string, a bytes-like object or a number, not 'tuple'
As explained in the docs, the x argument of model.fit() can be "A generator or keras.utils.Sequence returning (inputs, targets)", and "The iterator should return a tuple of length 1, 2, or 3, where the optional second and third elements will be used for y and sample_weight respectively". Thus, I think it cannot take more than one generator as input. Perhaps multiple inputs are simply not possible with a custom generator. Could anyone offer an explanation? A solution?
(Otherwise, it seems possible to go through tf.data.Dataset.from_generator() with a less custom approach, but I have difficulty understanding what to indicate in the output_signature argument.)
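For what it's worth, here is my understanding of output_signature as a sketch (note: output_signature only exists in newer TF releases, around 2.4+; TF 2.0.0 itself takes output_types/output_shapes instead). The signature should mirror the nesting of what the generator yields, i.e. ((x1, x2), y):
import tensorflow as tf

# Sketch only: the signature mirrors the generator's yield structure,
# ((X1_batch, X2_batch), Y_batch), with None for the batch dimension.
dataset = tf.data.Dataset.from_generator(
    lambda: generator("./dataset.txt", batchSize, 2, 3, 5),
    output_signature=(
        (tf.TensorSpec(shape=(None, 2), dtype=tf.float64),   # X1 batch
         tf.TensorSpec(shape=(None, 3), dtype=tf.float64)),  # X2 batch
        tf.TensorSpec(shape=(None, 5), dtype=tf.float64),    # Y batch
    ),
)
fullModel.fit(dataset, epochs=10, steps_per_epoch=nbSamples // batchSize)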
[EDIT] Thank you for your response @Francis Tang. In fact, it's possible to use a custom generator; your answer made me realize that I just had to change the line:
yield ([X1_batch, X2_batch], Y_batch)
to:
yield (X1_batch, X2_batch), Y_batch
Nevertheless, it is perhaps indeed better to use tf.keras.utils.Sequence, though I find it a bit restrictive.
In particular, in the example given (as in most examples about Sequence I could find), __init__() is used to load the full dataset, which defeats the purpose of a generator.
But maybe that was specific to this example, and __init__() doesn't have to be used that way: you could read the file directly and load only the desired batch inside __getitem__().
In that case, it seems you either have to scan through the data file on every call, or create one file per batch beforehand (not really optimal).
import pickle
from tensorflow.keras.utils import Sequence

class generator(Sequence):
    def __init__(self, filename, batch_size):
        data = pickle.load(open(filename, 'rb'))
        self.X1 = data['X1']
        self.X2 = data['X2']
        self.y = data['y']
        self.bs = batch_size

    def __len__(self):
        return (len(self.y) - 1) // self.bs + 1

    def __getitem__(self, idx):
        start, end = idx * self.bs, (idx + 1) * self.bs
        return (self.X1[start:end], self.X2[start:end]), self.y[start:end]
You need to write a class using Sequence: https://www.tensorflow.org/api_docs/python/tf/keras/utils/Sequence
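On the point about loading everything in __init__(): here is a minimal sketch (my own assumption, not part of the answer above) of a Sequence that keeps only per-line byte offsets in memory and reads just the requested batch in __getitem__(), assuming one JSON example per line as in the files written earlier:
import json
from numpy import array
from tensorflow.keras.utils import Sequence

class LazyJsonlSequence(Sequence):
    # Hypothetical batch-at-a-time Sequence: only line offsets live in RAM.
    def __init__(self, filename, batch_size):
        self.filename = filename
        self.bs = batch_size
        self.offsets = []
        offset = 0
        with open(filename, 'rb') as f:
            for line in f:
                self.offsets.append(offset)
                offset += len(line)

    def __len__(self):
        return (len(self.offsets) - 1) // self.bs + 1

    def __getitem__(self, idx):
        X1, X2, Y = [], [], []
        with open(self.filename, 'rb') as f:
            for off in self.offsets[idx * self.bs:(idx + 1) * self.bs]:
                f.seek(off)
                ex = json.loads(f.readline())
                X1.append(ex["x1"])
                X2.append(ex["x2"])
                Y.append(ex["y"])
        return (array(X1), array(X2)), array(Y)
Something like fullModel.fit(LazyJsonlSequence("./dataset.txt", batchSize), epochs=10) should then behave like the generator version without re-reading the whole file for each batch.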

Error loading trained en_core_web_trf spaCy v3 NER model

After loading the pretrained spaCy model for fine-tuning on custom data:
spacy.require_gpu()
nlp = spacy.load("en_core_web_trf",exclude=['tagger', 'parser', 'attribute_ruler', 'lemmatizer'])
I get an error when loading the fine-tuned model back for validation:
model=spacy.load(category_output_dir + "/%s" % itn,exclude=['tagger', 'parser', 'attribute_ruler', 'lemmatizer'])
File "/usr/local/lib/python3.6/dist-packages/spacy/init.py", line 47, in load
return util.load_model(name, disable=disable, exclude=exclude, config=config)
File "/usr/local/lib/python3.6/dist-packages/spacy/util.py", line 275, in load_model
return load_model_from_path(Path(name), **kwargs)
File "/usr/local/lib/python3.6/dist-packages/spacy/util.py", line 341, in load_model_from_path
return nlp.from_disk(model_path, exclude=exclude)
File "/usr/local/lib/python3.6/dist-packages/spacy/language.py", line 1705, in from_disk
util.from_disk(path, deserializers, exclude)
File "/usr/local/lib/python3.6/dist-packages/spacy/util.py", line 1085, in from_disk
reader(path / key)
File "/usr/local/lib/python3.6/dist-packages/spacy/language.py", line 1700, in
p, exclude=["vocab"]
File "spacy/pipeline/transition_parser.pyx", line 479, in spacy.pipeline.transition_parser.Parser.from_disk
File "/usr/local/lib/python3.6/dist-packages/thinc/model.py", line 529, in from_bytes
return self.from_dict(msg)
File "/usr/local/lib/python3.6/dist-packages/thinc/model.py", line 552, in from_dict
node.set_dim(dim, value)
File "/usr/local/lib/python3.6/dist-packages/thinc/model.py", line 188, in set_dim
raise ValueError(err)
ValueError: Attempt to change dimension 'nO' for model 'precomputable_affine' from 78 to 74
Here is the code snippet:
import random
from tqdm import tqdm
import spacy
import os
import copy
from spacy.tokens import Doc
from spacy.training import Example
from spacy.util import minibatch, compounding, decaying
spacy.require_gpu()
nlp = spacy.load("en_core_web_trf",exclude=['tagger', 'parser',
'attribute_ruler', 'lemmatizer'])
ner = nlp.get_pipe('ner')
for _, annotations in TRAIN_DATA:
for ent in annotations.get('entities'):
ner.add_label(ent[2])
for itn in range(n_iter):
random.shuffle(TRAIN_DATA)
losses = {}
for batch in tqdm(minibatch(TRAIN_DATA,size=compounding(4., 64., 1.01))):
for text, annots in batch:
examples = []
try:
examples.append(Example.from_dict(nlp.make_doc(text), annots))
losses=nlp.update(examples,sgd=optimizer,losses=losses)
except:
continue
if use_optimizer_averages:
with nlp.use_params(optimizer.averages):
os.mkdir(output_dir + "/%s" % itn)
nlp.to_disk(output_dir + "/%s" % itn)
model=spacy.load(output_dir + "/%s" % itn,exclude=['tagger', 'parser', 'attribute_ruler', 'lemmatizer'])
Train data format:
TRAIN_DATA = [["Who is Shaka Khan?", {"entities": [(7, 17, "FRIENDS")]}],["I like London.", {"entities": [(7, 13, "LOC")]}]]
It looks like you ran into a bug with release candidate v3.0.0rc2 of spaCy 3. We're working on getting this fixed for the final 3.0 release, see here: https://github.com/explosion/spaCy/pull/6460
Meanwhile, I'm afraid you won't be able to add additional labels to a transformer-based, pretrained NER model. What I would recommend, though, is that you train a new NER component from scratch.
In spaCy 3, we recommend using the config system instead of writing your own training loop (https://nightly.spacy.io/usage/training#config). In a config file, you can source components from other pipelines like so:
[components.transformer]
source = "en_core_web_trf"
component = "transformer"
This allows you to reuse the transformer weights of your pretrained model, and create a new NER model on top of that (https://nightly.spacy.io/usage/embeddings-transformers):
[components.ner]
factory = "ner"

[components.ner.model]
@architectures = "spacy.TransitionBasedParser.v1"
state_type = "ner"
extra_state_tokens = false
hidden_width = 128
maxout_pieces = 3
use_upper = false

[components.ner.model.tok2vec]
@architectures = "spacy-transformers.TransformerListener.v1"
grad_factor = 1.0

[components.ner.model.tok2vec.pooling]
@layers = "reduce_mean.v1"
You could even source the old NER component from en_core_web_trf too, just name it differently, and you'll have two NERs running in your pipeline, each with a distinct set of labels.
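In code, sourcing a pretrained component under a different name looks roughly like this (a sketch; the pipe name is illustrative, and sourcing a transformer-based component may additionally require sourcing the transformer it listens to):
import spacy

source_nlp = spacy.load("en_core_web_trf")
nlp = spacy.blank("en")
# Reuse the pretrained NER under a custom name; a freshly trained NER
# component for the new labels can then live alongside it.
nlp.add_pipe("ner", name="ner_pretrained", source=source_nlp)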
Alternatively, you can also use a non-transformer-based model such as en_core_web_lg. Adding custom labels to such a pretrained model should work fine, as before.

Theano error when using PyMC3: theano.gof.fg.MissingInputError

I am generating some (noisy) data-points (y) with some known parameters (m,c) that represent the equation of a straight line. Using sampling-based Bayesian methods, I now want to know the true values of parameters (m,c) from the data. Therefore, I am using DE Metropolis (PyMC3) to estimate the true parameters.
I am getting the Theano error theano.gof.fg.MissingInputError: Input 0 of the graph (indices start from 0), used to compute sigmoid(c_interval__), was not provided and not given a value.
Theano version: 1.0.4
PyMC3 version: 3.9.1
import matplotlib.pyplot as plt
import numpy as np
import arviz as az
import pymc3
import theano.tensor as tt
from theano.compile.ops import as_op
plt.style.use("ggplot")
# define a theano Op for our likelihood function
class LogLike(tt.Op):
    itypes = [tt.dvector]  # expects a vector of parameter values when called
    otypes = [tt.dscalar]  # outputs a single scalar value (the log likelihood)

    def __init__(self, loglike, data, x, sigma):
        # add inputs as class attributes
        self.likelihood = loglike
        self.data = data
        self.x = x
        self.sigma = sigma

    def perform(self, node, inputs, outputs):
        # the method that is used when calling the Op
        theta, = inputs  # this will contain my variables
        # call the log-likelihood function
        logl = self.likelihood(theta, self.x, self.data, self.sigma)
        outputs[0][0] = np.array(logl)  # output the log-likelihood

def my_model(theta, x):
    y = theta[0]*x + theta[1]
    return y

def my_loglike(theta, x, data, sigma):
    model = my_model(theta, x)
    ll = -(0.5/sigma**2)*np.sum((data - model)**2)
    return ll
# set up our data
N = 10 # number of data points
sigma = 1. # standard deviation of noise
x = np.linspace(0., 9., N)
mtrue = 0.4 # true gradient
ctrue = 3. # true y-intercept
truemodel = my_model([mtrue, ctrue], x)
# make data
np.random.seed(716742) # set random seed, so the data is reproducible each time
data = sigma*np.random.randn(N) + truemodel
print(data)
ndraws = 3000 # number of draws from the distribution
# create our Op
logl = LogLike(my_loglike, data, x, sigma)
# use PyMC3 to sample from the log-likelihood
with pymc3.Model():
    # uniform priors on m and c
    m = pymc3.Uniform('m', lower=-10., upper=10.)
    c = pymc3.Uniform('c', lower=-10., upper=10.)
    # convert m and c to a tensor vector
    theta = tt.as_tensor_variable([m, c])
    # use a DensityDist (use a lambda function to "call" the Op)
    pymc3.DensityDist('likelihood', lambda v: logl(v), observed={'v': theta})
    step = pymc3.DEMetropolis()
    trace = pymc3.sample(ndraws, step)
# plot the traces
axes = az.plot_trace(trace)
fig = axes.ravel()[0].figure
fig.savefig('./trace_plots.png')
Find the full trace here:
Population sampling (4 chains)
DEMetropolis: [c, m]
Attempting to parallelize chains to all cores. You can turn this off with `pm.sample(cores=1)`.
Population parallelization failed. Falling back to sequential stepping of chains.
Sampling 4 chains for 0 tune and 4_000 draw iterations (0 + 16_000 draws total) took 5 seconds.
Traceback (most recent call last):
File "test.py", line 75, in <module>
trace = pymc3.sample(ndraws, step)
File "/home/csl_user/.local/lib/python3.7/site-packages/pymc3/sampling.py", line 599, in sample
idata = arviz.from_pymc3(trace, **ikwargs)
File "/home/csl_user/.local/lib/python3.7/site-packages/arviz/data/io_pymc3.py", line 531, in from_pymc3
save_warmup=save_warmup,
File "/home/csl_user/.local/lib/python3.7/site-packages/arviz/data/io_pymc3.py", line 159, in __init__
self.observations, self.multi_observations = self.find_observations()
File "/home/csl_user/.local/lib/python3.7/site-packages/arviz/data/io_pymc3.py", line 172, in find_observations
multi_observations[key] = val.eval() if hasattr(val, "eval") else val
File "/home/csl_user/.local/lib/python3.7/site-packages/theano/gof/graph.py", line 522, in eval
self._fn_cache[inputs] = theano.function(inputs, self)
File "/home/csl_user/.local/lib/python3.7/site-packages/theano/compile/function.py", line 317, in function
output_keys=output_keys)
File "/home/csl_user/.local/lib/python3.7/site-packages/theano/compile/pfunc.py", line 486, in pfunc
output_keys=output_keys)
File "/home/csl_user/.local/lib/python3.7/site-packages/theano/compile/function_module.py", line 1839, in orig_function
name=name)
File "/home/csl_user/.local/lib/python3.7/site-packages/theano/compile/function_module.py", line 1487, in __init__
accept_inplace)
File "/home/csl_user/.local/lib/python3.7/site-packages/theano/compile/function_module.py", line 181, in std_fgraph
update_mapping=update_mapping)
File "/home/csl_user/.local/lib/python3.7/site-packages/theano/gof/fg.py", line 175, in __init__
self.__import_r__(output, reason="init")
File "/home/csl_user/.local/lib/python3.7/site-packages/theano/gof/fg.py", line 346, in __import_r__
self.__import__(variable.owner, reason=reason)
File "/home/csl_user/.local/lib/python3.7/site-packages/theano/gof/fg.py", line 391, in __import__
raise MissingInputError(error_msg, variable=r)
theano.gof.fg.MissingInputError: Input 0 of the graph (indices start from 0), used to compute sigmoid(c_interval__), was not provided and not given a value. Use the Theano flag exception_verbosity='high', for more information on this error.
I've run into the same problem when following the example of how to sample from a black-box likelihood found here:
https://docs.pymc.io/notebooks/blackbox_external_likelihood.html
This seems to be a version problem. I'm on Manjaro Linux and also ran Theano 1.0.4 and PyMC3 3.9 on Python 3.8. I was able to make the code work by downgrading to Python 3.7 and PyMC3 3.8. This appears to be an issue with Python 3.8, as downgrading PyMC3 alone did not solve it for me. I am far from an expert in PyMC3, so I don't have a fix for the newest versions, but for now downgrading makes my simulations run.
Hope this helps.
Edit: The devs seem to be aware of this; there is an open issue on their GitHub page:
https://github.com/pymc-devs/pymc3/issues/4002
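As a further, untested thought: the traceback fails while arviz tries to evaluate the observed tensor of the DensityDist. A workaround sometimes suggested for black-box likelihoods is to attach the log-likelihood term via pymc3.Potential instead, which leaves no observed dict for arviz to evaluate. A sketch, reusing logl, ndraws, and the priors from the question:
with pymc3.Model():
    m = pymc3.Uniform('m', lower=-10., upper=10.)
    c = pymc3.Uniform('c', lower=-10., upper=10.)
    theta = tt.as_tensor_variable([m, c])
    # Potential adds the black-box log-likelihood directly to the model's
    # logp, so there is no observed variable for arviz to evaluate.
    pymc3.Potential('likelihood', logl(theta))
    trace = pymc3.sample(ndraws, step=pymc3.DEMetropolis())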

Building my own tf.Estimator, how did model_params overwrite model_dir? RuntimeWarning?

Recently I built a customized deep neural net model using TFLearn, which claims to bring deep learning to the scikit-learn estimator API. I could train models and make predictions, but I couldn't get the scoring (evaluate) function to work, so I couldn't do cross-validation. I tried to ask questions about TFLearn in various places, but I got no responses.
It appears that TensorFlow itself has an estimator class. So I am putting TFLearn aside, and I'm trying to follow the guide at https://www.tensorflow.org/extend/estimators. Somehow I'm managing to get variables where they don't belong. Can anyone spot my problem? I will post code and the output.
Note: Of course, I can see the RuntimeWarning at the top of the output. I have found references to this warning online, but so far everyone claims it's harmless. Maybe it is not...
CODE:
import tensorflow as tf
from my_library import Database, l2_angle_distance
def my_model_function(topology, params):
    # This function will eventually be a function factory. This should
    # allow easy exploration of hyperparameters. For now, this just
    # returns a single, fixed model_fn.
    def model_fn(features, labels, mode):
        # Input layer
        net = tf.layers.conv1d(features["x"], topology[0], 3, activation=tf.nn.relu)
        net = tf.layers.dropout(net, 0.25)
        # The core of the network is here (convolutional layers only for now).
        for nodes in topology[1:]:
            net = tf.layers.conv1d(net, nodes, 3, activation=tf.nn.relu)
            net = tf.layers.dropout(net, 0.25)
        sh = tf.shape(features["x"])
        net = tf.reshape(net, [sh[0], sh[1], 3, 2])
        predictions = tf.nn.l2_normalize(net, dim=3)
        # PREDICT EstimatorSpec
        if mode == tf.estimator.ModeKeys.PREDICT:
            return tf.estimator.EstimatorSpec(mode=mode,
                                              predictions={"vectors": predictions})
        # TRAIN or EVAL EstimatorSpec
        loss = l2_angle_distance(labels, predictions)
        optimizer = tf.train.GradientDescentOptimizer(learning_rate=params["learning_rate"])
        train_op = optimizer.minimize(loss=loss, global_step=tf.train.get_global_step())
        return tf.estimator.EstimatorSpec(mode, predictions, loss, train_op)
    return model_fn
##===================================================================
window = "whole"
encoding = "one_hot"
db = Database("/home/bwllc/Documents/Files for ML/compact")
traindb, testdb = db.train_test_split()
train_features, train_labels = traindb.values(window, encoding)
test_features, test_labels = testdb.values(window, encoding)
# Create the model.
tf.logging.set_verbosity(tf.logging.INFO)
LEARNING_RATE = 0.01
topology = (60,40,20)
model_params = {"learning_rate": LEARNING_RATE}
model_fn = my_model_function(topology, model_params)
model = tf.estimator.Estimator(model_fn, model_params)
print("\nmodel_dir? No? Why not? ", model.model_dir, "\n") # This documents the error
# Input function.
my_input_fn = tf.estimator.inputs.numpy_input_fn({"x" : train_features}, train_labels, shuffle=True)
# Train the model.
model.train(input_fn=my_input_fn, steps=20)
OUTPUT
/usr/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: compiletime version 3.5 of module 'tensorflow.python.framework.fast_tensor_util' does not match runtime version 3.6
return f(*args, **kwds)
INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_model_dir': {'learning_rate': 0.01}, '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': None, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f0b55279048>, '_task_type': 'worker', '_task_id': 0, '_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
model_dir? No? Why not? {'learning_rate': 0.01}
INFO:tensorflow:Create CheckpointSaverHook.
Traceback (most recent call last):
File "minimal_estimator_bug_example.py", line 81, in <module>
model.train(input_fn=my_input_fn, steps=20)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/estimator/estimator.py", line 302, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/estimator/estimator.py", line 756, in _train_model
scaffold=estimator_spec.scaffold)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/basic_session_run_hooks.py", line 411, in __init__
self._save_path = os.path.join(checkpoint_dir, checkpoint_basename)
File "/usr/lib/python3.6/posixpath.py", line 78, in join
a = os.fspath(a)
TypeError: expected str, bytes or os.PathLike object, not dict
------------------
(program exited with code: 1)
Press return to continue
I can see exactly what went wrong, model_dir (which I left as the default) somehow bound to the value I intended for model_params. How did this happen in my code? I can't see it.
If anyone has advice or suggestions, I would greatly appreciate them. Thanks!
Simply because you're passing your model_params as model_dir when you construct your Estimator.
From the TensorFlow documentation:
Estimator's __init__ function:
__init__(
    model_fn,
    model_dir=None,
    config=None,
    params=None
)
Notice how the second positional argument is model_dir. If you want to specify only params, you need to pass it as a keyword argument.
model = tf.estimator.Estimator(model_fn, params=model_params)
Or specify all the preceding positional arguments:
model = tf.estimator.Estimator(model_fn, None, None, model_params)
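If you also want checkpoints written somewhere predictable rather than to an auto-generated temporary directory, you can set both explicitly (the path below is just an example):
model = tf.estimator.Estimator(
    model_fn,
    model_dir="/tmp/my_estimator_run",  # hypothetical checkpoint directory
    params=model_params)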
