How to use the multiple heads option in the SelfAttention class? - nlp

I am playing around with the SelfAttention model from the trax library.
When I set n_heads=1, everything works fine, but when I set n_heads=2, my code breaks.
I use only input activations and one SelfAttention layer.
Here is a minimal example:
import trax
import numpy as np
attention = trax.layers.SelfAttention(n_heads=2)
activations = np.random.randint(0, 10, (1, 100, 1)).astype(np.float32)
input = (activations, )
init = attention.init(input)
output = attention(input)
But I get the following error:
File [...]/site-packages/jax/linear_util.py, line 166, in call_wrapped
ans = self.f(*args, **dict(self.params, **kwargs))
File [...]/layers/research/efficient_attention.py, line 1637, in forward_unbatched_h
return forward_unbatched(*i_h, weights=w_h, state=s_h)
File [...]/layers/research/efficient_attention.py, line 1175, in forward_unbatched
q_info = kv_info = np.arange(q.shape[-2], dtype=np.int32)
IndexError: tuple index out of range
What am I doing wrong?
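(Not an answer to the trax internals, but as a general sanity check: multi-head attention splits a feature dimension across heads, so that dimension has to be divisible by n_heads, and each per-head slice still needs a (positions, features) shape. A minimal NumPy sketch of that split/merge, independent of trax and purely illustrative:)
import numpy as np

def split_heads(x, n_heads):
    # x: (batch, seq_len, d_model); d_model must be divisible by n_heads.
    batch, seq_len, d_model = x.shape
    assert d_model % n_heads == 0, "d_model must be divisible by n_heads"
    d_head = d_model // n_heads
    return x.reshape(batch, seq_len, n_heads, d_head).transpose(0, 2, 1, 3)

def merge_heads(x):
    # x: (batch, n_heads, seq_len, d_head) -> (batch, seq_len, n_heads * d_head)
    batch, n_heads, seq_len, d_head = x.shape
    return x.transpose(0, 2, 1, 3).reshape(batch, seq_len, n_heads * d_head)

activations = np.random.rand(1, 100, 4).astype(np.float32)  # feature dim 4 instead of 1
heads = split_heads(activations, n_heads=2)   # (1, 2, 100, 2)
merged = merge_heads(heads)                   # (1, 100, 4)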

Related

Why is using area under curve in manim giving me an error?

I am trying to show the area under a curve using manim.
This is my code:
from manimlib import *
import numpy as np
class GraphExample(Scene):
    def construct(self):
        ax = Axes((-3, 10), (-1, 8))
        ax.add_coordinate_labels()
        curve = ax.get_graph(lambda x: 2 * np.sin(x))
        self.add(ax, curve)
        area = ax.get_area_under_graph(graph=curve, x_range=(0, 2))
        self.add(curve, area)
        self.wait(1)
This gives the following error message:
File "c:\manim-master\manimlib\__main__.py", line 17, in main scene.run()
File "c:\manim-master\manimlib\scene\scene.py", line 75, in run self.construct()
File "test.py", line 21, in construct self.add(area)
File "c:\manim-master\manimlib\scene\scene.py", line 209, in add self.remove(*new_mobjects)
File "c:\manim-master\manimlib\scene\scene.py", line 226, in remove self.mobjects = restructure_list_to_exclude_certain_family_members(
File "c:\manim-master\manimlib\utils\family_ops.py", line 25, in restructure_list_to_exclude_certain_family_members
to_remove = extract_mobject_family_members(to_remove)
File "c:\manim-master\manimlib\utils\family_ops.py", line 5, in extract_mobject_family_members result = list(it.chain(*[
File "c:\manim-master\manimlib\utils\family_ops.py", line 6, in <listcomp>mob.get_family()
AttributeError: 'NoneType' object has no attribute 'get_family'
I don't know what I need to change; could someone please help me out here?
I changed the code to use Manim's get_area method.
get_area(graph, x_range=None, color=['#58C4DD', '#83C167'], opacity=0.3, bounded_graph=None, **kwargs)
Returns a Polygon representing the area under the graph passed.
from manim import *
import numpy as np
class GraphExample(Scene):
    def construct(self):
        ax = Axes((-3, 10), (-1, 8))
        ax.add_coordinates()
        curve = ax.get_graph(lambda x: 2 * np.sin(x))
        self.add(ax, curve)
        area = ax.get_area(graph=curve, x_range=(0, 2))
        self.add(area)
        self.wait(1)
Output: the scene renders with the region under the curve between x = 0 and x = 2 shaded.

Is it possible to use a custom generator to train a multi-input architecture with Keras (TensorFlow 2.0.0)?

With TF 2.0.0, I can train an architecture with one input, I can train an architecture with one input using a custom generator, and I can train an architecture with two inputs. But I can't train an architecture with two inputs using a custom generator.
To keep it minimalist, here's a simple example, with no generator and no multiple inputs to start with:
from tensorflow.keras import layers, models, Model, Input, losses
from numpy import random, array, zeros
input1 = Input(shape=2)
dense1 = layers.Dense(5)(input1)
fullModel = Model(inputs=input1, outputs=dense1)
fullModel.summary()
# Generate random examples:
nbSamples = 21
X_train = random.rand(nbSamples, 2)
Y_train = random.rand(nbSamples, 5)
batchSize = 4
fullModel.compile(loss=losses.LogCosh())
fullModel.fit(X_train, Y_train, epochs=10, batch_size=batchSize)
It's a simple dense layer that takes input vectors of size 2. The randomly generated dataset contains 21 examples and the batch size is 4. Instead of loading all the data and passing it to model.fit(), we can also pass a custom generator. The main advantage (for RAM consumption) is that only one batch is loaded at a time rather than the whole dataset. Here is a simple example with the previous architecture and a custom generator:
import json
# Save the last dataset in a file:
with open("./dataset1input.txt", 'w') as file:
for i in range(nbSamples):
example = {"x": X_train[i].tolist(), "y": Y_train[i].tolist()}
file.write(json.dumps(example) + "\n")
def generator1input(datasetPath, batch_size, inputSize, outputSize):
X_batch = zeros((batch_size, inputSize))
Y_batch = zeros((batch_size, outputSize))
i=0
while True:
with open(datasetPath, 'r') as file:
for line in file:
example = json.loads(line)
X_batch[i] = array(example["x"])
Y_batch[i] = array(example["y"])
i+=1
if i % batch_size == 0:
yield (X_batch, Y_batch)
i=0
fullModel.compile(loss=losses.LogCosh())
my_generator = generator1input("./dataset1input.txt", batchSize, 2, 5)
fullModel.fit(my_generator, epochs=10, steps_per_epoch=int(nbSamples/batchSize))
Here, the generator opens the dataset file but loads only batch_size examples (not nbSamples examples) each time it is called, advancing through the file as it loops.
Now, I can build a simple functional architecture with 2 inputs, and no generator:
input1 = Input(shape=2)
dense1 = layers.Dense(5)(input1)
subModel1 = Model(inputs=input1, outputs=dense1)
input2 = Input(shape=3)
dense2 = layers.Dense(5)(input2)
subModel2 = Model(inputs=input2, outputs=dense2)
averageLayer = layers.average([subModel1.output, subModel2.output])
fullModel = Model(inputs=[input1, input2], outputs=averageLayer)
fullModel.summary()
# Generate random examples:
nbSamples = 21
X1 = random.rand(nbSamples, 2)
X2 = random.rand(nbSamples, 3)
Y = random.rand(nbSamples, 5)
fullModel.compile(loss=losses.LogCosh())
fullModel.fit([X1, X2], Y, epochs=10, batch_size=batchSize)
Up to here, all the models compile and run, but I'm not able to use a generator with the last architecture and its 2 inputs. When I try the following code (which should logically work, in my opinion):
# Save data in a file:
with open("./dataset.txt", 'w') as file:
for i in range(nbSamples):
example = {"x1": X1[i].tolist(), "x2": X2[i].tolist(), "y": Y[i].tolist()}
file.write(json.dumps(example) + "\n")
def generator(datasetPath, batch_size, inputSize1, inputSize2, outputSize):
X1_batch = zeros((batch_size, inputSize1))
X2_batch = zeros((batch_size, inputSize2))
Y_batch = zeros((batch_size, outputSize))
i=0
while True:
with open(datasetPath, 'r') as file:
for line in file:
example = json.loads(line)
X1_batch[i] = array(example["x1"])
X2_batch[i] = array(example["x2"])
Y_batch[i] = array(example["y"])
i+=1
if i % batch_size == 0:
yield ([X1_batch, X2_batch], Y_batch)
i=0
fullModel.compile(loss=losses.LogCosh())
my_generator = generator("./dataset.txt", batchSize, 2, 3, 5)
fullModel.fit(my_generator, epochs=10, steps_per_epoch=(nbSamples//batchSize))
I obtain the following error:
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\keras\engine\training.py", line 729, in fit
use_multiprocessing=use_multiprocessing)
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py", line 224, in fit
distribution_strategy=strategy)
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py", line 547, in _process_training_inputs
use_multiprocessing=use_multiprocessing)
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py", line 606, in _process_inputs
use_multiprocessing=use_multiprocessing)
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\keras\engine\data_adapter.py", line 566, in __init__
reassemble, nested_dtypes, output_shapes=nested_shape)
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\data\ops\dataset_ops.py", line 540, in from_generator
output_types, tensor_shape.as_shape, output_shapes)
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\data\util\nest.py", line 471, in map_structure_up_to
results = [func(*tensors) for tensors in zip(*all_flattened_up_to)]
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\data\util\nest.py", line 471, in <listcomp>
results = [func(*tensors) for tensors in zip(*all_flattened_up_to)]
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\framework\tensor_shape.py", line 1216, in as_shape
return TensorShape(shape)
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\framework\tensor_shape.py", line 776, in __init__
self._dims = [as_dimension(d) for d in dims_iter]
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\framework\tensor_shape.py", line 776, in <listcomp>
self._dims = [as_dimension(d) for d in dims_iter]
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\framework\tensor_shape.py", line 718, in as_dimension
return Dimension(value)
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\framework\tensor_shape.py", line 193, in __init__
self._value = int(value)
TypeError: int() argument must be a string, a bytes-like object or a number, not 'tuple'
As explained in the docs, the x argument of model.fit() can be "A generator or keras.utils.Sequence returning (inputs, targets)", and "The iterator should return a tuple of length 1, 2, or 3, where the optional second and third elements will be used for y and sample_weight respectively". Thus, I think it cannot take more than one generator as input. Perhaps multiple inputs are simply not possible with a custom generator. Could you offer an explanation? A solution?
(Otherwise, it seems possible to go through tf.data.Dataset.from_generator() with a less custom approach, but I have difficulty understanding what to indicate in the output_signature argument.)
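For what it's worth, here is a rough, untested sketch of what the tf.data.Dataset.from_generator() route could look like for the two-input case, assuming the generator yields ((X1_batch, X2_batch), Y_batch) tuples. Note that the output_signature argument only exists in more recent TensorFlow releases (roughly 2.4+); on TF 2.0.0 the older output_types/output_shapes arguments would have to be used instead:
import tensorflow as tf

# The nesting of the signature mirrors the nesting of what the generator yields.
# numpy.zeros produces float64 arrays, hence the dtype below.
dataset = tf.data.Dataset.from_generator(
    lambda: generator("./dataset.txt", batchSize, 2, 3, 5),
    output_signature=(
        (tf.TensorSpec(shape=(None, 2), dtype=tf.float64),   # X1 batch
         tf.TensorSpec(shape=(None, 3), dtype=tf.float64)),  # X2 batch
        tf.TensorSpec(shape=(None, 5), dtype=tf.float64),    # Y batch
    ),
)
fullModel.fit(dataset, epochs=10, steps_per_epoch=nbSamples // batchSize)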
[EDIT] Thank you for your response @Francis Tang. In fact, it is possible to use a custom generator; your answer made me realize that I just had to change the line:
yield ([X1_batch, X2_batch], Y_batch)
To:
yield (X1_batch, X2_batch), Y_batch
Nevertheless, it is indeed perhaps better to use tf.keras.utils.Sequence, although I find it a bit restrictive.
In particular, in the example given (as well as in most of the examples I could find about Sequence), __init__() is first used to load the full dataset, which defeats the purpose of a generator.
But maybe that was just one particular example of Sequence(), and there is no need to use __init__() like that: you could read the file directly and load only the desired batch inside __getitem__() (a rough sketch of this is given after the answer below).
In that case, it seems you either have to scan the data file on every call, or create one file per batch beforehand (not really optimal).
import pickle
from tensorflow.python.keras.utils.data_utils import Sequence

class generator(Sequence):
    def __init__(self, filename, batch_size):
        data = pickle.load(open(filename, 'rb'))
        self.X1 = data['X1']
        self.X2 = data['X2']
        self.y = data['y']
        self.bs = batch_size
    def __len__(self):
        return (len(self.y) - 1) // self.bs + 1
    def __getitem__(self, idx):
        start, end = idx * self.bs, (idx + 1) * self.bs
        return (self.X1[start:end], self.X2[start:end]), self.y[start:end]
You need to write a class using Sequence: https://www.tensorflow.org/api_docs/python/tf/keras/utils/Sequence
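Following up on the point in the edit above about reading only the requested batch inside __getitem__(): here is a rough, untested sketch that re-scans the JSON-lines file written earlier on each call instead of loading everything in __init__(). The class name is just illustrative:
import json
from itertools import islice
from numpy import array
from tensorflow.keras.utils import Sequence

class JsonLinesSequence(Sequence):
    def __init__(self, filename, batch_size, n_samples):
        self.filename = filename
        self.bs = batch_size
        self.n = n_samples
    def __len__(self):
        return (self.n - 1) // self.bs + 1
    def __getitem__(self, idx):
        # Read only the lines of batch `idx`; the file is re-scanned on every call.
        with open(self.filename, 'r') as f:
            lines = list(islice(f, idx * self.bs, (idx + 1) * self.bs))
        examples = [json.loads(line) for line in lines]
        X1 = array([e["x1"] for e in examples])
        X2 = array([e["x2"] for e in examples])
        Y = array([e["y"] for e in examples])
        return (X1, X2), Y

# Usage: fullModel.fit(JsonLinesSequence("./dataset.txt", batchSize, nbSamples), epochs=10)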

RuntimeError in PyTorch when increasing batch size to more than 1

This code for my custom data loader runs smoothly with batch_size=1, but when I increase the batch size I get the following error:
RuntimeError: Expected object of scalar type Double but got scalar type Long for sequence element 1 in sequence argument at position #1 'tensors'
import numpy as np
import matplotlib.pyplot as plt
import matplotlib
matplotlib.use("TkAgg")
import os, h5py
import PIL
#------------------------------
import torch
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms
#------------------------------
from data_augmentation import *
#------------------------------
dtype = torch.cuda.FloatTensor if torch.cuda.is_available() else torch.FloatTensor
class NiftiDataset(Dataset):
    def __init__(self, transformation_params, data_path, mode='train', transforms=None):
        """
        Parameters:
            data_path (string): Root directory of the preprocessed dataset.
            mode (string, optional): Select the image_set to use, ``train``, ``valid``
            transforms (callable, optional): Optional transform to be applied
                on a sample.
        """
        self.data_path = data_path
        self.mode = mode
        self.images = []
        self.labels = []
        self.W_maps = []
        self.centers = []
        self.radiuss = []
        self.pixel_spacings = []
        self.transformation_params = transformation_params
        self.transforms = transforms
        #-------------------------------------------------------------------------------------
        if self.mode == 'train':
            self.data_path = os.path.join(self.data_path, 'train_set')
        elif self.mode == 'valid':
            self.data_path = os.path.join(self.data_path, 'validation_set')
        #-------------------------------------------------------------------------------------
        for _, _, f in os.walk(self.data_path):
            for file in f:
                hdf_file = os.path.join(self.data_path, file)
                data = h5py.File(hdf_file, 'r')  # Dictionary
                # Preprocessing of Input Image and Label
                patch_img, patch_gt, patch_wmap = PreProcessData(file, data, self.mode, self.transformation_params)
                #print(type(data))
                self.images.append(patch_img)   # 2D image
                #print('image shape is : ', patch_img.shape)
                self.labels.append(patch_gt)    # 2D label
                #print('label shape is : ', patch_img.shape)
                self.W_maps.append(patch_wmap)  # Weight_Map
                # self.centers.append(data['roi_center'][:])           # [x,y]
                # self.radiuss.append(data['roi_radii'][:])            # [R_min,R_max]
                # self.pixel_spacings.append(data['pixel_spacing'][:]) # [x , y , z]
    def __len__(self):
        return len(self.images)

    def __getitem__(self, index):
        image = self.images[index]
        label = self.labels[index]
        W_map = self.W_maps[index]
        if self.transforms is not None:
            image, label, W_map = self.transforms(image, label, W_map)
        return image, label, W_map
#=================================================================================================
if __name__ == '__main__':
    # Test routine to check your threaded dataloader
    # ACDC dataset has 4 labels
    n_labels = 4
    path = './hdf5_files'
    batch_size = 1
    # Data Augmentation Parameters
    # Set patch extraction parameters
    size1 = (128, 128)
    patch_size = size1
    mm_patch_size = size1
    max_size = size1
    train_transformation_params = {
        'patch_size': patch_size,
        'mm_patch_size': mm_patch_size,
        'add_noise': ['gauss', 'none1', 'none2'],
        'rotation_range': (-5, 5),
        'translation_range_x': (-5, 5),
        'translation_range_y': (-5, 5),
        'zoom_range': (0.8, 1.2),
        'do_flip': (False, False),
    }
    valid_transformation_params = {
        'patch_size': patch_size,
        'mm_patch_size': mm_patch_size}
    transformation_params = {'train': train_transformation_params,
                             'valid': valid_transformation_params,
                             'n_labels': 4,
                             'data_augmentation': True,
                             'full_image': False,
                             'data_deformation': False,
                             'data_crop_pad': max_size}
    #====================================================================
    dataset = NiftiDataset(transformation_params=transformation_params, data_path=path, mode='train')
    dataloader = DataLoader(dataset=dataset, batch_size=2, shuffle=True, num_workers=0)
    dataiter = iter(dataloader)
    data = dataiter.next()
    images, labels, W_map = data
    #===============================================================================
    # Data Visualization
    #===============================================================================
    print('image: ', images.shape, images.type(), 'label: ', labels.shape, labels.type(),
          'W_map: ', W_map.shape, W_map.type())
    img = transforms.ToPILImage()(images[0, 0, :, :, 0].float())
    lbl = transforms.ToPILImage()(labels[0, 0, :, :].float())
    W_mp = transforms.ToPILImage()(W_map[0, 0, :, :].float())
    plt.subplot(1, 3, 1)
    plt.imshow(img, cmap='gray', interpolation=None)
    plt.title('image')
    plt.subplot(1, 3, 2)
    plt.imshow(lbl, cmap='gray', interpolation=None)
    plt.title('label')
    plt.subplot(1, 3, 3)
    plt.imshow(W_mp, cmap='gray', interpolation=None)
    plt.title('Weight Map')
    plt.show()
I have noticed some strange things, such as the tensor types being different even though the images, labels and weight maps are all images of the same type and size.
The Error Traceback:
Traceback (most recent call last):
File "D:\Saudi_CV\Vibot\Smester_2\2_Medical Image analysis\Project_2020\OUR_Project\data_loader.py", line 118, in <module>
data = dataiter.next()
File "F:\Download_2019\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 345, in __next__
data = self._next_data()
File "F:\Download_2019\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 385, in _next_data
data = self._dataset_fetcher.fetch(index) # may raise StopIteration
File "F:\Download_2019\Anaconda3\lib\site-packages\torch\utils\data\_utils\fetch.py", line 47, in fetch
return self.collate_fn(data)
File "F:\Download_2019\Anaconda3\lib\site-packages\torch\utils\data\_utils\collate.py", line 79, in default_collate
return [default_collate(samples) for samples in transposed]
File "F:\Download_2019\Anaconda3\lib\site-packages\torch\utils\data\_utils\collate.py", line 79, in <listcomp>
return [default_collate(samples) for samples in transposed]
File "F:\Download_2019\Anaconda3\lib\site-packages\torch\utils\data\_utils\collate.py", line 64, in default_collate
return default_collate([torch.as_tensor(b) for b in batch])
File "F:\Download_2019\Anaconda3\lib\site-packages\torch\utils\data\_utils\collate.py", line 55, in default_collate
return torch.stack(batch, 0, out=out)
RuntimeError: Expected object of scalar type Double but got scalar type Long for sequence element 1 in sequence argument at position #1 'tensors'
[Finished in 19.9s with exit code 1]
The problem was solved using the solution explained on this page (link), by converting the arrays to FloatTensor:
image = torch.from_numpy(self.images[index]).type(torch.FloatTensor)
label = torch.from_numpy(self.labels[index]).type(torch.FloatTensor)
W_map = torch.from_numpy(self.W_maps[index]).type(torch.FloatTensor)
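(A sketch of where these casts would typically live, assuming the arrays produced by PreProcessData are NumPy arrays and that the transforms can handle tensors; this is illustrative rather than the poster's exact code:)
    def __getitem__(self, index):
        # Cast everything to float32 so that default_collate can stack the samples
        # of a batch without mixing Double and Long tensors.
        image = torch.from_numpy(self.images[index]).type(torch.FloatTensor)
        label = torch.from_numpy(self.labels[index]).type(torch.FloatTensor)
        W_map = torch.from_numpy(self.W_maps[index]).type(torch.FloatTensor)
        if self.transforms is not None:
            image, label, W_map = self.transforms(image, label, W_map)
        return image, label, W_map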

Why is TensorFlow not initializing variables?

At the very start of the session declared with tf.Session(), I run both the tf.global_variables_initializer and tf.local_variables_initializer functions, and unfortunately I keep receiving error messages about an "Uninitialized value Variable_1." Why?
I did some searching around and found this StackExchange Question, but the answer doesn't help my situation. So I looked through the TensorFlow API and found an operation that should return any uninitialized variables, tf.report_uninitialized_variables(). I printed the results and received an empty pair of square brackets, which doesn't make any sense considering the description of my error messages. So what's going on? I've been clawing my eyes out for a day now. Any help is appreciated.
import tensorflow as tf
import os
from tqdm import tqdm
#hyperparam
training_iterations = 100
PATH = "C:\\Users\\ratno\\Desktop\\honest chaos\\skin cam\\drive-download-20180205T055458Z-001"
#==================================import training_data=================================
def import_data(image_path):
    image_contents = tf.read_file(filename=image_path)
    modified_image = tf.image.decode_jpeg(contents=image_contents, channels=1)
    image_tensor = tf.cast(tf.reshape(modified_image, [1, 10000]), dtype=tf.float32)
    return image_tensor

#========================neural network================================
def neural_network(input_layer):
    Weight_net_1 = {'weights': tf.Variable(tf.random_normal(shape=(10000, 16))),
                    'bias': tf.Variable(tf.random_normal(shape=(1, 1)))}
    Weight_net_2 = {'weights': tf.Variable(tf.random_normal(shape=(16, 16))),
                    'bias': tf.Variable(tf.random_normal(shape=(1, 1)))}
    Weight_net_3 = {'weights': tf.Variable(tf.random_normal(shape=(16, 16))),
                    'bias': tf.Variable(tf.random_normal(shape=(1, 1)))}
    Weight_net_4 = {'weights': tf.Variable(tf.random_normal(shape=(16, 1))),
                    'bias': tf.Variable(tf.random_normal(shape=(1, 1)))}
    # Input Layer
    hypothesis = input_layer; x = hypothesis
    # Hidden Layer 1
    hypothesis = tf.nn.relu(tf.matmul(x, Weight_net_1['weights']) + Weight_net_1['bias']); x = hypothesis
    # Hidden Layer 2
    hypothesis = tf.nn.relu(tf.matmul(x, Weight_net_2['weights']) + Weight_net_2['bias']); x = hypothesis
    # Hidden Layer 3
    hypothesis = tf.nn.relu(tf.matmul(x, Weight_net_3['weights']) + Weight_net_3['bias']); x = hypothesis
    # output cell
    hypothesis = tf.nn.relu(tf.matmul(x, Weight_net_4['weights']) + Weight_net_4['bias'])
    return hypothesis

#============================training the network=========================
def train(hypothesis):
    LOSS = tf.reduce_sum(1 - hypothesis)
    tf.train.AdamOptimizer(0.01).minimize(LOSS)

#Session==================================================================
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(tf.local_variables_initializer())
    image_list = [os.path.join(PATH, file_name) for file_name in os.listdir(PATH)]
    for iteration in tqdm(range(training_iterations), desc="COMPLETION", ncols=80):
        for i in image_list:
            modified_image_tensor = sess.run(import_data(image_path=i))
            hypo = sess.run(neural_network(input_layer=modified_image_tensor))
            sess.run(train(hypothesis=hypo))
    print("\n\nTraining completed.\nRunning test prediction.\n")
    DIRECTORY = input("Directory: ")
    test_input = sess.run(import_data(DIRECTORY))
    prediction = sess.run(neural_network(input_layer=test_input))
    print(prediction)
    if prediction >= 0.5:
        print("Acne")
    else:
        print("What")
And as for the error message:
Caused by op 'Variable/read', defined at:
File "C:/Users/ratno/Desktop/honest chaos/Hotdog/HDogntoHDog.py", line 75, in <module>
hypo = sess.run(neural_network(input_layer=modified_image_tensor))
File "C:/Users/ratno/Desktop/honest chaos/Hotdog/HDogntoHDog.py", line 23, in neural_network
Weight_net_1 = {'weights': tf.Variable(tf.random_normal(shape=(10000, 16))),
File "C:\Users\ratno\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\variables.py", line 199, in __init__
expected_shape=expected_shape)
File "C:\Users\ratno\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\variables.py", line 330, in _init_from_args
self._snapshot = array_ops.identity(self._variable, name="read")
File "C:\Users\ratno\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\gen_array_ops.py", line 1400, in identity
result = _op_def_lib.apply_op("Identity", input=input, name=name)
File "C:\Users\ratno\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\op_def_library.py", line 767, in apply_op
op_def=op_def)
File "C:\Users\ratno\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\ops.py", line 2630, in create_op
original_op=self._default_original_op, op_def=op_def)
File "C:\Users\ratno\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\ops.py", line 1204, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
FailedPreconditionError (see above for traceback): Attempting to use uninitialized value Variable
[[Node: Variable/read = Identity[T=DT_FLOAT, _class=["loc:#Variable"], _device="/job:localhost/replica:0/task:0/cpu:0"](Variable)]]
Let's take a look at your main code, starting from with tf.Session() as sess:. This is the first line executed when you run your program. The next thing that happens is that you call the variable initializers -- but you have not yet declared any variables! That is because you have not called any of the other functions you defined with def. So when you then call, e.g., neural_network inside a sess.run call, the (uninitialized) variables are created as neural_network is called and are then used in that sess.run. Obviously this cannot work, since you have not initialized these newly created variables.
You have to create your network and all necessary variables in the computational graph before calling the initializers. You could try something along these lines:
data = import_data(image_path)
out = neural_network(data)
tr = train(hypothesis=out)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
By the way, your function train also has no return value, so it is unlikely to work as you are expecting it to. Please re-read the TensorFlow tutorials to understand how to operate an optimizer.
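To make that concrete, here is a rough, untested sketch (TF 1.x style; the placeholder and variable names are chosen for illustration, and import_data, neural_network, PATH and training_iterations are reused from the question) of building the graph once, having train() return the optimizer op, and only then initializing and running:
def train(hypothesis):
    loss = tf.reduce_sum(1 - hypothesis)
    return tf.train.AdamOptimizer(0.01).minimize(loss)  # return the op so it can be run later

# Build the whole graph once, outside the session.
filename_ph = tf.placeholder(tf.string, shape=[])   # fed with one image path at a time
image = import_data(image_path=filename_ph)
hypothesis = neural_network(input_layer=image)
train_op = train(hypothesis=hypothesis)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())      # all variables now exist and get initialized
    image_list = [os.path.join(PATH, f) for f in os.listdir(PATH)]
    for _ in range(training_iterations):
        for path_i in image_list:
            sess.run(train_op, feed_dict={filename_ph: path_i})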

How to use a tflearn trained model in an application?

I am currently trying to use a trained model in an application.
I've been using this code to generate US city names with an LSTM model. The code works fine and I do manage to get city names.
Right now, I am trying to save the model so I can load it in a different application without training the model again.
Here is the code of my basic application:
from __future__ import absolute_import, division, print_function
import os
from six import moves
import ssl
import tflearn
from tflearn.data_utils import *
path = "US_cities.txt"
maxlen = 20
X, Y, char_idx = textfile_to_semi_redundant_sequences(
    path, seq_maxlen=maxlen, redun_step=3)
# --- Create LSTM model
g = tflearn.input_data(shape=[None, maxlen, len(char_idx)])
g = tflearn.lstm(g, 512, return_seq=True, name="lstm1")
g = tflearn.dropout(g, 0.5, name='dropout1')
g = tflearn.lstm(g, 512, name='lstm2')
g = tflearn.dropout(g, 0.5, name='dropout')
g = tflearn.fully_connected(g, len(char_idx), activation='softmax', name='fc')
g = tflearn.regression(g, optimizer='adam', loss='categorical_crossentropy',
                       learning_rate=0.001)
# --- Initializing model and loading
model = tflearn.models.generator.SequenceGenerator(g, char_idx)
model.load('myModel.tfl')
print("Model is now loaded !")
#
# Main Application
#
while True:
    user_choice = input("Do you want to generate a U.S. city names ? [y/n]")
    if user_choice == 'y':
        seed = random_sequence_from_textfile(path, 20)
        print("-- Test with temperature of 1.5 --")
        model.generate(20, temperature=1.5, seq_seed=seed, display=True)
    else:
        exit()
And here is what I get as output:
Do you want to generate a U.S. city names ? [y/n]y
-- Test with temperature of 1.5 --
rk
Orange Park Acres
Traceback (most recent call last):
File "App.py", line 46, in <module>
model.generate(20, temperature=1.5, seq_seed=seed, display=True)
File "/usr/local/lib/python3.5/dist-packages/tflearn/models/generator.py", line 216, in generate
preds = self._predict(x)[0]
File "/usr/local/lib/python3.5/dist-packages/tflearn/models/generator.py", line 180, in _predict
return self.predictor.predict(feed_dict)
File "/usr/local/lib/python3.5/dist-packages/tflearn/helpers/evaluator.py", line 69, in predict
o_pred = self.session.run(output, feed_dict=feed_dict).tolist()
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 717, in run
run_metadata_ptr)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 894, in _run
% (np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (1, 25, 61) for Tensor 'InputData/X:0', which has shape '(?, 20, 61)'
Unfortunately, I can't see why the shape has changed when using generate() in my app. Could anyone help me solve this problem?
Thank you in advance
William
SOLVED?
One solution would be to simply add "modes" to the Python script using an argument parser:
import argparse
parser = argparse.ArgumentParser()
parser.add_argument("mode", help="Train or/and test", nargs='+', choices=["train","test"])
args = parser.parse_args()
And then (since nargs='+' makes args.mode a list, the checks test membership):
if "train" in args.mode:
    # define your model
    # train the model
    model.save('my_model.tflearn')
if "test" in args.mode:
    model.load('my_model.tflearn')
    # do whatever you want with your model
I don't really understand why this works, and why loading the model from a different script doesn't.
But I guess this should be fine for the moment...
