What initializer does 'uniform' use? - keras

Keras offers a variety of initializers for weights and biases. Which one does 'uniform' use?
I would think it would be RandomUniform, but this is not confirmed in the documentation, and I reached a dead-end in the source-code: the key 'uniform' is used as a global variable within the module, and I cannot find where the variable uniform is set.

One other way to confirm this is to look at the initializers source code:
# Compatibility aliases
zero = zeros = Zeros
one = ones = Ones
constant = Constant
uniform = random_uniform = RandomUniform
normal = random_normal = RandomNormal
truncated_normal = TruncatedNormal
identity = Identity
orthogonal = Orthogonal

I think today's answer is better, though.
Simpler solution:
From the interactive prompt,
import keras
keras.initializers.normal
# Out[3]: keras.initializers.RandomNormal
keras.initializers.uniform
# Out[4]: keras.initializers.RandomUniform
Original post:
Running the debugger to the deserialize method in initializers.py
and examining
globals()['uniform']
Shows that the value is indeed
<class 'keras.initializers.RandomUniform'>
Similarly, 'normal' is shown in the debugger to be <class 'keras.initializers.RandomNormal'>.
Note that uniform often works better than normal in practice, although the theoretical advantages of one over the other are not clear.
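The lookup described above (module-level aliases resolved through globals()) can be sketched without Keras. The classes and the resolve() helper below are hypothetical stand-ins that only illustrate the mechanism; they are not the Keras implementation.

```python
class RandomUniform:
    """Toy stand-in for an initializer class (not the Keras implementation)."""


class RandomNormal:
    """Toy stand-in for an initializer class (not the Keras implementation)."""


# Module-level compatibility aliases, following the pattern in the listing above.
uniform = random_uniform = RandomUniform
normal = random_normal = RandomNormal


def resolve(name):
    """Hypothetical helper: look a string identifier up in the module's globals."""
    try:
        return globals()[name]
    except KeyError:
        raise ValueError(f"Unknown initializer: {name!r}") from None


print(resolve("uniform").__name__)        # RandomUniform
print(resolve("random_normal").__name__)  # RandomNormal
```

Because the alias is an ordinary module-level name, no explicit registry is needed: the string 'uniform' is simply a key in the module's global namespace.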


Getting a Scoring Function by Name in scikit-learn

In scikit-learn, there is the notion of a scoring function. If we have some predicted labels and the true labels, we can get the score by calling scoring(y_true, y_predict). An example of such a scoring function is sklearn.metrics.accuracy_score.
A scoring function is not to be confused with a scorer, which is an object that can be called as scorer(estimator, X, y_true).
There are many built-in scorers in scikit-learn, and it is possible to get them by their string names. For example, we can get the scorer corresponding to the name 'accuracy' by calling sklearn.metrics.get_scorer("accuracy").
But it turns out that there is no obvious mechanism to access the built-in scoring functions by name at run time by passing in the name as a string. For example, there is no obvious way to access sklearn.metrics.accuracy_score by its name 'accuracy'.
For example, if at run time the program knows that the name of the scoring function is stored in a variable name, I am looking for a mechanism get_scoring_function() such that get_scoring_function(name) returns the scoring function itself. Note that the value of name is not known when the script is written.
Is there any way to access the built-in scoring functions by their names at run time through passing in the names as strings?
You can use the get_scorer() function, which accepts a string as an argument, and then get the _score_func attribute of the returned object.
So for example
from sklearn.metrics import get_scorer
get_scorer('accuracy')._score_func(y_true, y_pred)
is equivalent to
from sklearn.metrics import accuracy_score
accuracy_score(y_true, y_pred)
I faced this task myself, and I haven't found a better way to access metrics by name than the sklearn.metrics.get_scorer function. Its drawback is that you have to pass it an estimator, not predictions. I tried @collinb9's recommendation, but it requires accessing a protected attribute, and in my case that led to unpleasant consequences, namely incorrectly calculated metrics.
This is a short example showing this problem.
from sklearn import datasets, model_selection, linear_model, metrics
features, labels = datasets.make_regression(1000, random_state=123)
train_features, test_features, train_labels, test_labels = model_selection.train_test_split(features, labels, test_size=0.1, random_state=567)
model = linear_model.LinearRegression()
model.fit(train_features, train_labels)
print(f'variant 1 neg_mse = {metrics.get_scorer("neg_mean_squared_error")(model, test_features, test_labels)}')
print(f'variant 1 neg_rmse = {metrics.get_scorer("neg_root_mean_squared_error")(model, test_features, test_labels)}\n')
preds = model.predict(test_features)
print(f'variant 2 mse = {metrics.mean_squared_error(test_labels, preds)}')
print(f'variant 2 rmse = {metrics.mean_squared_error(test_labels, preds, squared=False)}\n')
print(f'protected neg_mse = {metrics.get_scorer("neg_mean_squared_error")._score_func(test_labels, preds)}')
print(f'protected neg_rmse = {metrics.get_scorer("neg_root_mean_squared_error")._score_func(test_labels, preds)}')
The output of this program will be:
variant 1 neg_mse = -2.142587870436064e-25
variant 1 neg_rmse = -4.628809642268803e-13
variant 2 mse = 2.142587870436064e-25
variant 2 rmse = 4.628809642268803e-13
protected neg_mse = 2.142587870436064e-25
protected neg_rmse = 2.142587870436064e-25
As you can see, the metrics calculated through the protected attribute differ. First, we asked for negated values but got positive ones (note that the variant 2 metrics are not meant to be negative). Second, the neg_mse and neg_rmse values are equal, but they should differ.
If we go to the source code of sklearn metrics, we will see:
This is how _score_func is called: its result is multiplied by sign, so that's where we lose our negative values.
This is how scorers are made: neg_root_mean_squared_error_scorer has an extra parameter squared=False. This is a documented optional parameter of metrics.mean_squared_error, so passing it is safe. We can pass it as a keyword argument to _score_func, and then we at least get the correct absolute value:
print(f'protected neg_rmse = {metrics.get_scorer("neg_root_mean_squared_error")._score_func(test_labels, preds, squared=False)}')
protected neg_rmse = 4.628809642268803e-13
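To make the two failure modes concrete, here is a minimal sketch that uses a toy scorer in place of scikit-learn's. ToyScorer and the small mean_squared_error function are hypothetical stand-ins written for this illustration; they only mirror the idea that a scorer stores a function, a sign and extra keyword arguments.

```python
import math


def mean_squared_error(y_true, y_pred, squared=True):
    """Toy MSE with a `squared` switch: returns RMSE when squared=False."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return mse if squared else math.sqrt(mse)


class ToyScorer:
    """Hypothetical stand-in for a scorer: a function plus a sign and kwargs."""

    def __init__(self, score_func, sign, **kwargs):
        self._score_func = score_func
        self._sign = sign
        self._kwargs = kwargs

    def score(self, y_true, y_pred):
        # The sign and the stored kwargs are applied here, not in _score_func.
        return self._sign * self._score_func(y_true, y_pred, **self._kwargs)


neg_rmse = ToyScorer(mean_squared_error, sign=-1, squared=False)

y_true = [3.0, 5.0, 7.0]
y_pred = [1.0, 5.0, 9.0]

raw = neg_rmse._score_func(y_true, y_pred)  # plain MSE: positive, not a root
full = neg_rmse.score(y_true, y_pred)       # negated RMSE, as the name promises

print(raw, full)
```

Calling the stored function directly skips both the sign and squared=False, which matches the two discrepancies in the output above.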
In short, I've shown what is, to my knowledge, the only way to get sklearn metrics by name (btw, you can find the full list of names here), and that it's not safe to use protected attributes that you're not supposed to use. BTW, I was using sklearn version 0.24.2.
Since the documentation is incomplete, you'll have to go directly to the source code here for the complete list of metric names:
Metric Names
Search for __all__.
@collinb9's answer should not be accepted, as it leads to incorrect calculations.
You need the other arguments (such as squared=False for RMSE) to compute the correct value. They can be accessed via the _kwargs attribute of the _BaseScorer class. If you combine _score_func and _kwargs, you get the corresponding scoring function.
The full answer to the question should be:
import sklearn.metrics
def score(scoring_name, y_true, y_pred):
    # Look up the built-in scorer, then reapply its sign and stored keyword arguments.
    sklearn_scorer = sklearn.metrics.get_scorer(scoring_name)
    return sklearn_scorer._sign * sklearn_scorer._score_func(
        y_true=y_true, y_pred=y_pred, **sklearn_scorer._kwargs
    )
score("neg_root_mean_squared_error", y_true, y_pred)

How to extract the hidden layer features in H2ODeepLearningEstimator?

I found H2O has the function h2o.deepfeatures in R to pull the hidden layer features
https://www.rdocumentation.org/packages/h2o/versions/3.20.0.8/topics/h2o.deepfeatures
train_features <- h2o.deepfeatures(model_nn, train, layer=3)
But I didn't find any example in Python. Can anyone provide some sample code?
Most Python/R API functions are wrappers around REST calls. See http://docs.h2o.ai/h2o/latest-stable/h2o-py/docs/_modules/h2o/model/model_base.html#ModelBase.deepfeatures
So, to convert an R example to a Python one, make the model the object the method is called on (the "this"), and shift all the other arguments along. I.e. the example from the manual becomes (with dots in variable names changed to underscores):
prostate_hex = ...
prostate_dl = ...
prostate_deepfeatures_layer1 = prostate_dl.deepfeatures(prostate_hex, 1)
prostate_deepfeatures_layer2 = prostate_dl.deepfeatures(prostate_hex, 2)
Sometimes the function name will change slightly (e.g. h2o.importFile() vs. h2o.import_file()), so you need to hunt for it at http://docs.h2o.ai/h2o/latest-stable/h2o-py/docs/index.html

Seq2seq for non-sentence, float data; stuck configuring the decoder

I am trying to apply sequence-to-sequence modelling to EEG data. The encoding works just fine, but getting the decoding to work is proving problematic. The input-data has the shape None-by-3000-by-31, where the second dimension is the sequence-length.
The encoder looks like this:
initial_state = lstm_sequence_encoder.zero_state(batchsize, dtype=self.model_precision)
encoder_output, state = dynamic_rnn(
    cell=LSTMCell(32),
    inputs=lstm_input,  # shape=(None,3000,32)
    initial_state=initial_state,  # zeroes
    dtype=lstm_input.dtype  # tf.float32
)
I use the final state of the RNN as the initial state of the decoder. For training, I use the TrainingHelper:
training_helper = TrainingHelper(target_input, [self.sequence_length])
training_decoder = BasicDecoder(
    cell=lstm_sequence_decoder,
    helper=training_helper,
    initial_state=thought_vector
)
output, _, _ = dynamic_decode(
    decoder=training_decoder,
    maximum_iterations=3000
)
My troubles start when I try to implement inference. Since I am using non-sentence data, I do not need to tokenize or embed, because the data is essentially embedded already. The InferenceHelper class seemed the best way to achieve my goal. So this is what I use. I'll give my code then explain my problem.
def _sample_fn(decoder_outputs):
    return decoder_outputs
def _end_fn(_):
    return tf.tile([False], [self.lstm_layersize])  # Batch-size is sequence-length because of time major
inference_helper = InferenceHelper(
    sample_fn=_sample_fn,
    sample_shape=[32],
    sample_dtype=target_input.dtype,
    start_inputs=tf.zeros(batchsize_placeholder, 32),  # the batchsize varies
    end_fn=_end_fn
)
inference_decoder = BasicDecoder(
    cell=lstm_sequence_decoder,
    helper=inference_helper,
    initial_state=thought_vector
)
output, _, _ = dynamic_decode(
    decoder=inference_decoder,
    maximum_iterations=3000
)
The Problem
I don't know what the shape of the inputs should be. I know the start-inputs should be zero because it is the first time-step. But this throws errors; it expects the input to be (1,32).
I also thought I should pass the output of each time-step unchanged to the next. However, this raises problems at run-time: the batch-size varies, so the shape is partial. The library throws an exception at this as it tries to convert the start_input to a tensor:
...
self._start_inputs = ops.convert_to_tensor(
start_inputs, name='start_inputs')
Any ideas?
This is a lesson in poor documentation.
I fixed my problem, but failed to address the variable batch-size problem.
The _end_fn was causing problems I was unaware of. I also managed to work out what the appropriate arguments are for the InferenceHelper. I've given the arguments by name in case anyone needs guidance in future:
def _end_fn(_):
    return tf.tile([False], [batchsize])
inference_helper = InferenceHelper(
    sample_fn=_sample_fn,
    sample_shape=[lstm_number_of_units],  # In my case, 32
    sample_dtype=tf.float32,  # Depends on the data
    start_inputs=tf.zeros((batchsize, lstm_number_of_units)),
    end_fn=_end_fn
)
As for the batch-size problem, there are two things I'm considering:
Changing the internal state of my model object. My TensorFlow computation graph is built inside a class. A class-field records the batch-size. Changing this during training may work. Or:
Pad the batches so that they are 200 sequences long. This will waste time.
Preferably I'd like a way to dynamically manage the batch-sizes.
EDIT: I found a way. It involves simply substituting square-brackets for parentheses:
inference_helper = InferenceHelper(
    sample_fn=_sample_fn,
    sample_shape=[self.lstm_layersize],
    sample_dtype=target_input.dtype,
    start_inputs=tf.zeros([batchsize, self.lstm_layersize]),
    end_fn=_end_fn
)
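For readers unsure what the helper's arguments control, the following is a framework-free sketch of the feedback loop that this kind of inference performs, under the assumption that each output is fed back unchanged as the next input. toy_cell and decode are hypothetical names and the cell is not an LSTM; the sketch only shows how start_inputs, sample_fn, end_fn and the iteration limit interact.

```python
def toy_cell(inputs, state):
    """Hypothetical recurrent step (not an LSTM): mixes the input into the state."""
    new_state = [0.5 * s + 0.5 * x for s, x in zip(state, inputs)]
    output = [s + 1.0 for s in new_state]
    return output, new_state


def decode(initial_state, start_inputs, maximum_iterations, sample_fn, end_fn):
    """Run the cell step by step, feeding each sampled output back as input."""
    inputs, state, outputs = start_inputs, initial_state, []
    for _ in range(maximum_iterations):
        output, state = toy_cell(inputs, state)
        outputs.append(output)
        if end_fn(output):  # never true here, so the iteration limit stops us
            break
        inputs = sample_fn(output)  # identity: pass the output on unchanged
    return outputs


units = 2
outputs = decode(
    initial_state=[2.0, 4.0],      # plays the role of the encoder's final state
    start_inputs=[0.0] * units,    # zeros for the first time step
    maximum_iterations=3,
    sample_fn=lambda out: out,
    end_fn=lambda out: False,
)
print(outputs)
```

Since end_fn never signals completion for continuous data, the sequence length is determined entirely by maximum_iterations.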

How to use get_operation_by_name() in tensorflow, from a graph built from a different function?

I'd like to build a TensorFlow graph in a separate function get_graph(), and print the value of a simple op a in the main function. I can print the value of a if I return a from get_graph(). However, if I use get_operation_by_name() to retrieve a, it prints None. What did I do wrong, and how can I fix it? Thank you!
import tensorflow as tf
def get_graph():
    graph = tf.Graph()
    with graph.as_default():
        a = tf.constant(5.0, name='a')
    return graph, a
if __name__ == '__main__':
    graph, a = get_graph()
    with tf.Session(graph=graph) as sess:
        print(sess.run(a))
        a = sess.graph.get_operation_by_name('a')
        print(sess.run(a))
it prints out
5.0
None
p.s. I'm using python 3.4 and tensorflow 1.2.
Naming conventions in TensorFlow are subtle and a bit off-putting at first.
The thing is, when you write
a = tf.constant(5.0, name='a')
a is not the constant op, but its output. The names of an op's outputs derive from the op name by appending the index of the output. Here, the constant op has only one output, so its name is
print(a.name)
# `a:0`
When you run sess.graph.get_operation_by_name('a') you do get the constant op, and running an operation (rather than a tensor) with sess.run returns None. What you actually wanted is 'a:0', the tensor that is the output of this operation, and whose evaluation returns a value.
a = sess.graph.get_tensor_by_name('a:0')
print(sess.run(a))
# 5.0
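The naming rule itself can be illustrated without TensorFlow. The two helpers below are hypothetical and written only for this sketch; they assume the '<op_name>:<output_index>' convention described above.

```python
def tensor_name_for(op_name, output_index=0):
    """Hypothetical helper: build the name of an op's output tensor."""
    return f"{op_name}:{output_index}"


def split_tensor_name(tensor_name):
    """Hypothetical helper: split '<op_name>:<output_index>' into its two parts."""
    op_name, _, index = tensor_name.rpartition(":")
    if not op_name or not index.isdigit():
        raise ValueError(f"Not a tensor name: {tensor_name!r}")
    return op_name, int(index)


print(tensor_name_for("a"))      # a:0
print(split_tensor_name("a:0"))  # ('a', 0)
```

A bare 'a' is rejected by split_tensor_name, which mirrors the distinction in the answer: 'a' names the operation, while 'a:0' names its first output.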

Why do I get different values every time I run hmmlearn.hmm.GaussianHMM.fit()?

I have a program.
import numpy as np
import pandas as pd
from hmmlearn.hmm import GaussianHMM
n = 6
data=pd.read_csv('11.csv',index_col='datetime')
volume = data['TotalVolumeTraded']
close = data['ClosingPx']
logDel = np.log(np.array(data['HighPx'])) - np.log(np.array(data['LowPx']))
logRet_1 = np.array(np.diff(np.log(close)))
logRet_5 = np.log(np.array(close[5:])) - np.log(np.array(close[:-5]))
logVol_5 = np.log(np.array(volume[5:])) - np.log(np.array(volume[:-5]))
logDel = logDel[5:]
logRet_1 = logRet_1[4:]
close = close[5:]
Date = pd.to_datetime(data.index[5:])
A = np.column_stack([logDel,logRet_5,logVol_5])
model = GaussianHMM(n_components= n, covariance_type="full", n_iter=2000).fit([A])
hidden_states = model.predict(A)
The first time I run the code, I get one value for hidden_states; the second time I run it, I get a different one.
Why do the two values of hidden_states differ?
I am not completely sure what happens here, but here are two possible explanations for the results you're seeing.
The model does not maintain any ordering over state labels, so the state labelled 1 in one run could end up being labelled 4 in another run. This is known as the label switching problem in latent variable models.
GaussianHMM initializes the emission parameters via k-means, which might converge to different values depending on the data and on its random initialization. The initial parameters are passed to the EM algorithm, which is also prone to local maxima. Therefore, different runs could result in different parameter estimates and, as a result, slightly different predictions.
Try to control the randomness by setting the seed and the random_state when you define your model. Moreover, you could initialize startprob_ and transmat_ yourself and see how the model behaves.
That way you might get a better explanation of the cause of this behavior.
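One way to check whether two runs differ only by label switching is to relabel each sequence by order of first appearance and compare the results. This is a minimal sketch with a hypothetical helper and made-up state sequences; it does not call hmmlearn.

```python
def relabel_by_first_appearance(states):
    """Hypothetical helper: rename labels to 0, 1, 2, ... in order of first use."""
    mapping = {}
    relabelled = []
    for s in states:
        if s not in mapping:
            mapping[s] = len(mapping)
        relabelled.append(mapping[s])
    return relabelled


# Made-up predictions from two runs: same segmentation, different label names.
run_1 = [1, 1, 4, 4, 0, 1]
run_2 = [3, 3, 2, 2, 5, 3]

same_up_to_labels = (
    relabel_by_first_appearance(run_1) == relabel_by_first_appearance(run_2)
)
print(same_up_to_labels)  # True
```

If the relabelled sequences still differ, the runs found genuinely different segmentations, which points to the second explanation (different local maxima) rather than to label switching alone.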
