Tuning neural network parameters via Hyperopt: how to dump trials - Keras

When I try to save the hyperopt Trials object, which contains information about the automatic parameter tuning of a neural network,
best = fmin(fn=objective,
            space=space,
            algo=tpe.suggest,  # or rand.suggest for random parameter selection
            max_evals=max_trials,
            trials=trials)  # , rstate=np.random.RandomState(50)
pickle.dump(trials, open("neuro.hyperopt", "wb"))
it gives the error:
can't pickle _thread.RLock objects
Moreover, it writes a 10 GB file to my local drive. That is, it saves not only the trials object but the whole model.
Could you help me save the trials object with a smaller size (e.g. the XGBoost trials file is 1 MB) and avoid the error?
Thank you.

In my case it was because the models stored in the trials were not picklable.
I was trying to save a tf.keras.optimizers.Adam(learning_rate=0.001) object.
When I stored the string 'Adam' instead, the error disappeared.
Of course, this creates another problem: how to set the learning rate for the optimizer. But that seems to be the easier part. One way is to drop the stored Keras objects from the trials.trials entries before saving:
for trial in trials.trials:
    if 'result' in trial.keys():
        # delete the 'model' entry if it exists
        # https://stackoverflow.com/questions/15411107/delete-a-dictionary-item-if-the-key-exists
        trial['result'].pop('model', None)

# proceed with pickling
pickle.dump(trials, open("trials.pkl", "wb"))
(I took it from here)
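To illustrate the learning-rate point above, a minimal sketch (my own, not from the original answer): the objective can return only picklable values such as the optimizer name and the learning rate, and the optimizer can be rebuilt from them later. train_and_evaluate is a hypothetical placeholder for the real training code; trials is the Trials object from the question.
import pickle
import tensorflow as tf

def objective(params):
    lr = params['learning_rate']
    loss = train_and_evaluate(params)   # hypothetical: build, train and evaluate the model here
    return {'loss': loss,
            'status': 'ok',
            'optimizer_name': 'Adam',    # a string instead of tf.keras.optimizers.Adam(...)
            'learning_rate': lr}         # enough to rebuild the optimizer afterwards

# After fmin(), the trials object holds only small, picklable result dicts.
pickle.dump(trials, open("neuro.hyperopt", "wb"))

# Rebuild the best optimizer from the stored values when needed:
best_result = min(trials.results, key=lambda r: r['loss'])
optimizer = tf.keras.optimizers.Adam(learning_rate=best_result['learning_rate'])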

Related

MaskRCNN should find exactly one element

I trained a Mask R-CNN model (Matterport implementation) with one class to detect. It worked.
Now I want to predict some unseen images. I know that the object is present on each image and that it appears only once per image. How do I use my model to do so?
One possibility that came to my mind was:
num_results = 0
while num_results == 0:
    model = mrcnn.model.MaskRCNN(mode='inference', config=pred_config)
    model.load_weights('weight/path')
    results = model.detect([img], verbose=1)
    num_results = compute_num_of(results)
    # lower DETECTION_MIN_CONFIDENCE in pred_config before the next pass
But I think this is very time-consuming because I load the model and its weights at every step. What would be the best practice here?
Thanks
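A hedged sketch of one way to avoid the reload loop (my own suggestion, not from the original thread, assuming Matterport's standard API): since the object appears exactly once per image, the model can be built a single time with DETECTION_MIN_CONFIDENCE set very low (the threshold is baked into the inference graph, so it must be set before the model is constructed), and the highest-scoring detection can then be taken from the results. model_dir and the weight path below are placeholders.
import numpy as np
import mrcnn.model

# Build and load once; the low threshold ensures at least one candidate survives.
pred_config.DETECTION_MIN_CONFIDENCE = 0.0
model = mrcnn.model.MaskRCNN(mode='inference', config=pred_config, model_dir='logs')
model.load_weights('weight/path', by_name=True)

results = model.detect([img], verbose=1)
r = results[0]  # dict with 'rois', 'class_ids', 'scores', 'masks'

# The object appears exactly once, so keep only the top-scoring detection.
best = int(np.argmax(r['scores']))
best_box = r['rois'][best]            # [y1, x1, y2, x2]
best_mask = r['masks'][:, :, best]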

Can't get Keras Code Example #1 to work with multi-label dataset

Apologies in advance.
I am attempting to recreate this CNN (from the Keras Code Examples), with another dataset.
https://keras.io/examples/vision/image_classification_from_scratch/
The dataset I am using is one for retinal scans, and classifies images on a scale from 0-4. So, it's a multi-label image classification.
The Keras example used is binary classification (cats v dogs), though I would have hoped it wouldn't make much difference (maybe this is a big assumption on my part).
I skipped the 'image augmentation' part of the walkthrough. So, I have not created the
data_augmentation = keras.Sequential(
    [
        layers.RandomFlip("horizontal"),
        layers.RandomRotation(0.1),
    ]
)
part. So, instead of:
def make_model(input_shape, num_classes):
    inputs = keras.Input(shape=input_shape)
    # Image augmentation block
    x = data_augmentation(inputs)
    # Entry block
    x = layers.Rescaling(1.0 / 255)(x)
    .......
at the beginning of the model, I have:
def make_model(input_shape, num_classes):
    inputs = keras.Input(shape=input_shape)
    # Image augmentation block
    x = keras.Sequential(inputs)
    # Entry block
    x = layers.Rescaling(1.0 / 255)(x)
    .......
However, I keep getting different errors no matter how much I change things around, such as "TypeError: Keras symbolic inputs/outputs do not implement __len__." or "ValueError: Exception encountered when calling layer "rescaling_3" (type Rescaling).".
What am I missing here?
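For reference, a minimal sketch (mine, not from the thread) of how the augmentation block can be skipped: keras.Sequential expects a list of layers, not a symbolic tensor, so wrapping inputs in it is what triggers those errors. The single conv block below is only a stand-in for the example's full stack, and the head is switched to a multi-class one for the 0-4 labels:
from tensorflow import keras
from tensorflow.keras import layers

def make_model(input_shape, num_classes):
    inputs = keras.Input(shape=input_shape)

    # No augmentation block: feed the inputs straight into the entry block.
    x = layers.Rescaling(1.0 / 255)(inputs)

    # One conv block as a stand-in for the full architecture of the example.
    x = layers.Conv2D(32, 3, strides=2, padding="same", activation="relu")(x)
    x = layers.GlobalAveragePooling2D()(x)

    # Multi-class head: one unit per class with softmax, instead of the
    # single sigmoid unit used in the binary cats-vs-dogs version.
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return keras.Model(inputs, outputs)

model = make_model(input_shape=(180, 180, 3), num_classes=5)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])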

Do you need a for loop for IncrementalPCA in order to keep constant memory usage?

In the past, I've tried to use scikit-learn's IncrementalPCA in order to reduce memory usage. I used this answer as a template for my code. But as #aarslan said in the comment section: "I've noticed that the explained variance seems to decrease at every iteration." I've always suspected the last for loop in the given answer. So, my question is: do I need a for loop in order to keep constant memory usage during the partial_fit step, or is batch_size alone enough? Below you can find the code:
import h5py
import numpy as np
from sklearn.decomposition import IncrementalPCA

h5 = h5py.File('rand-1Mx1K.h5', 'r')
data = h5['data']   # it's ok, the dataset is not fetched into memory yet
n = data.shape[0]   # how many rows we have in the dataset
chunk_size = 1000   # how many rows we feed to IPCA at a time, a divisor of n
ipca = IncrementalPCA(n_components=10, batch_size=16)
for i in range(0, n // chunk_size):
    ipca.partial_fit(data[i*chunk_size : (i+1)*chunk_size])
An old question, but yes, the for-loop is needed. The batch_size= parameter is only used with the .fit() method, not with .partial_fit().
Scikit-learn documentation:
batch_size : int, default=None
The number of samples to use for each batch. Only used when calling fit.
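To make the distinction concrete, a small sketch (reusing the file name and shapes assumed in the question): partial_fit() only ever sees the chunk it is handed, so the explicit loop is what bounds memory, whereas fit() first converts the whole h5py dataset to an in-memory array and only then iterates over it in batch_size-sized batches.
import h5py
from sklearn.decomposition import IncrementalPCA

h5 = h5py.File('rand-1Mx1K.h5', 'r')
data = h5['data']
chunk_size = 1000

# Constant memory: only one chunk is read from disk at a time.
ipca = IncrementalPCA(n_components=10)
for i in range(0, data.shape[0] // chunk_size):
    ipca.partial_fit(data[i*chunk_size:(i+1)*chunk_size])

# By contrast, fit() materialises the full dataset in memory first;
# batch_size only controls how that in-memory array is then chunked.
# ipca_full = IncrementalPCA(n_components=10, batch_size=chunk_size).fit(data)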

Seq2seq for non-sentence, float data; stuck configuring the decoder

I am trying to apply sequence-to-sequence modelling to EEG data. The encoding works just fine, but getting the decoding to work is proving problematic. The input data has the shape None-by-3000-by-31, where the second dimension is the sequence length.
The encoder looks like this:
initial_state = lstm_sequence_encoder.zero_state(batchsize, dtype=self.model_precision)

encoder_output, state = dynamic_rnn(
    cell=LSTMCell(32),
    inputs=lstm_input,            # shape=(None, 3000, 32)
    initial_state=initial_state,  # zeroes
    dtype=lstm_input.dtype        # tf.float32
)
I use the final state of the RNN as the initial state of the decoder. For training, I use the TrainingHelper:
training_helper = TrainingHelper(target_input, [self.sequence_length])
training_decoder = BasicDecoder(
    cell=lstm_sequence_decoder,
    helper=training_helper,
    initial_state=thought_vector
)
output, _, _ = dynamic_decode(
    decoder=training_decoder,
    maximum_iterations=3000
)
My troubles start when I try to implement inference. Since I am using non-sentence data, I do not need to tokenize or embed, because the data is essentially embedded already. The InferenceHelper class seemed the best way to achieve my goal. So this is what I use. I'll give my code then explain my problem.
def _sample_fn(decoder_outputs):
    return decoder_outputs

def _end_fn(_):
    return tf.tile([False], [self.lstm_layersize])  # batch size is sequence length because of time-major

inference_helper = InferenceHelper(
    sample_fn=_sample_fn,
    sample_shape=[32],
    sample_dtype=target_input.dtype,
    start_inputs=tf.zeros(batchsize_placeholder, 32),  # the batch size varies
    end_fn=_end_fn
)
inference_decoder = BasicDecoder(
    cell=lstm_sequence_decoder,
    helper=inference_helper,
    initial_state=thought_vector
)
output, _, _ = dynamic_decode(
    decoder=inference_decoder,
    maximum_iterations=3000
)
The Problem
I don't know what the shape of the inputs should be. I know the start-inputs should be zero because it is the first time-step. But this throws errors; it expects the input to be (1,32).
I also thought I should pass the output of each time-step unchanged to the next. However, this raises a problem at run-time: the batch size varies, so the shape is only partially known. The library throws an exception when it tries to convert start_inputs to a tensor:
...
self._start_inputs = ops.convert_to_tensor(
    start_inputs, name='start_inputs')
Any ideas?
This is a lesson in poor documentation.
I fixed my problem, but failed to address the variable batch-size problem.
The _end_fn was causing problems I was unaware of. I also managed to work out what the appropriate fields are for the InferenceHelper. I've given the field names in case anyone needs guidance in the future:
def _end_fn(_):
    return tf.tile([False], [batchsize])

inference_helper = InferenceHelper(
    sample_fn=_sample_fn,
    sample_shape=[lstm_number_of_units],  # in my case, 32
    sample_dtype=tf.float32,              # depends on the data
    start_inputs=tf.zeros((batchsize, lstm_number_of_units)),
    end_fn=_end_fn
)
As for the batch-size problem, there are two things I'm considering:
1. Changing the internal state of my model object. My TensorFlow computation graph is built inside a class; a class field records the batch size. Changing this during training may work.
2. Or: padding the batches so that they are 200 sequences long. This will waste time.
Preferably I'd like a way to dynamically manage the batch sizes.
EDIT: I found a way. It involves simply substituting square-brackets for parentheses:
inference_helper = InferenceHelper(
    sample_fn=_sample_fn,
    sample_shape=[self.lstm_layersize],
    sample_dtype=target_input.dtype,
    start_inputs=tf.zeros([batchsize, self.lstm_layersize]),
    end_fn=_end_fn
)

Tensorflow image classification - eating up my memory

I am trying to write a small program which is able to classify certain pictures into categories. I create a list of pictures in the main code and pass them to the function in a loop. The code works perfectly fine, except that it does not free my memory; with each iteration the program uses more until it crashes completely.
I already tried using gc.collect() in the function to force it to clear the memory, but that does not help. Shouldn't the memory be cleared automatically after checking one file, or did I miss something here?
def classify_pictures(self, files):
    os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

    # Read the image_data
    image_data = tf.gfile.FastGFile(files, 'rb').read()

    # Loads label file, strips off carriage return
    label_lines = [line.rstrip() for line
                   in tf.gfile.GFile("tf_files/retrained_labels.txt")]

    # Unpersists graph from file
    with tf.gfile.FastGFile("tf_files/retrained_graph.pb", 'rb') as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
        _ = tf.import_graph_def(graph_def, name='')

    with tf.Session() as sess:
        # Feed the image_data as input to the graph and get first prediction
        softmax_tensor = sess.graph.get_tensor_by_name('final_result:0')
        predictions = sess.run(softmax_tensor,
                               {'DecodeJpeg/contents:0': image_data})

        # Sort to show labels of first prediction in order of confidence
        top_k = predictions[0].argsort()[-len(predictions[0]):][::-1]

        for each_picture in range(0, 10):
            human_string = label_lines[top_k[0]]
            if human_string == "selfie":
                return ("selfie")
            if "passport" in human_string:
                return ("passport")
            if "statement" in human_string:
                return ("bill")
If you call this function in a loop, the computational graph is rebuilt on every iteration, and the previous graphs are of course still there too. That is what uses all the memory.
To solve this, only do the session.run() call inside the loop.
In general, when writing TensorFlow code you should always try to keep the code that builds the graph separate from the code that executes it. In your case you are doing both inside a single function, which is then called multiple times, so a new graph is built every time.
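A hedged sketch of that restructuring (TF1-style, reusing the paths and tensor names from the question; list_of_pictures is a hypothetical stand-in for the list built in the caller's main code): the labels, graph, and session are created once, and only sess.run() happens per picture.
import tensorflow as tf

# --- Build once, outside the classification loop ---
label_lines = [line.rstrip() for line
               in tf.gfile.GFile("tf_files/retrained_labels.txt")]

with tf.gfile.FastGFile("tf_files/retrained_graph.pb", 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())
    tf.import_graph_def(graph_def, name='')

sess = tf.Session()
softmax_tensor = sess.graph.get_tensor_by_name('final_result:0')

# --- Execute many times: only sess.run() per picture ---
def classify_picture(path):
    image_data = tf.gfile.FastGFile(path, 'rb').read()
    predictions = sess.run(softmax_tensor,
                           {'DecodeJpeg/contents:0': image_data})
    return label_lines[predictions[0].argmax()]

for path in list_of_pictures:  # hypothetical list built in the main code
    print(path, classify_picture(path))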
