What is the meaning of keep_vars in state_dict? - pytorch

state_dict(destination=None, prefix='', keep_vars=False)
What does changing keep_vars to True do?

In PyTorch >= 0.4 it has essentially no practical use.
keep_vars was added in the commit: Add keep_vars parameter to state_dict stating that
When keep_vars is true, it returns a Variable for each parameter
(rather than a Tensor).
In the state_dict function, _save_to_state_dict is called internally, which contains the following code:
for name, param in self._parameters.items():
    if param is not None:
        destination[prefix + name] = param if keep_vars else param.data
for name, buf in self._buffers.items():
    if buf is not None:
        destination[prefix + name] = buf if keep_vars else buf.data
The expression param if keep_vars else param.data made a difference prior to PyTorch 0.4.0, when Variable and Tensor were separate classes. Now that they are merged, keep_vars is probably present only for backward compatibility. See also: Is .data still useful in pytorch?
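One difference you can still observe today (a minimal sketch, assuming a recent PyTorch version): with keep_vars=True the returned values are the Parameter objects themselves, still attached to autograd, whereas by default you get detached tensors.
import torch
import torch.nn as nn

model = nn.Linear(3, 2)

# Default: values are plain tensors, detached from autograd
sd = model.state_dict()
print(type(sd['weight']), sd['weight'].requires_grad)            # <class 'torch.Tensor'> False

# keep_vars=True: values are the Parameters themselves, still attached to autograd
sd_vars = model.state_dict(keep_vars=True)
print(type(sd_vars['weight']), sd_vars['weight'].requires_grad)  # <class 'torch.nn.parameter.Parameter'> True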


Why would you call .detach() on a parameter when the code is within with torch.no_grad()?

So I have this code for updating the critic in SAC:
with torch.no_grad():
    _, policy_action, log_pi, _ = self.actor(next_obs)
    target_Q1, target_Q2 = self.critic_target(next_obs, policy_action)
    target_V = torch.min(target_Q1, target_Q2) - self.alpha.detach() * log_pi
    target_Q = reward + (not_done * self.discount * target_V)
This is not my code; it is code I got off GitHub. As you can see, it uses both torch.no_grad() and self.alpha.detach(). Why would you need both? This seems redundant to me: anything computed inside the with torch.no_grad() block is not added to the computational graph, and .detach() does the same thing for a single tensor. Why would you use torch.no_grad() and .detach() together?
You don't need to detach tensors from the graph when you are under the torch.no_grad() context manager, so the .detach() is indeed redundant here. I suspect this line was copied from another part of the code, where gradients are computed by default and the .detach() does matter. You could verify that by going through the training loop's source file.
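A small sketch illustrating why the .detach() is redundant inside no_grad (alpha and log_pi here are just stand-ins for the SAC quantities):
import torch

alpha = torch.tensor(0.2, requires_grad=True)
log_pi = torch.randn(4)

with torch.no_grad():
    a = alpha * log_pi            # no graph is recorded inside no_grad
    b = alpha.detach() * log_pi   # .detach() adds nothing here
print(a.requires_grad, b.requires_grad)   # False False

# Outside no_grad, .detach() does make a difference:
c = alpha * log_pi
d = alpha.detach() * log_pi
print(c.requires_grad, d.requires_grad)   # True False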

pytorch collections.OrderedDict' object has no attribute 'to'

This is my main code, but I don't know how to fix the problem:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = torch.load('./checkpoints/fcn_model_5.pth')  # load the model
model = model.to(device)
You are loading the checkpoint as a state dict; it is not an nn.Module object.
checkpoint = './checkpoints/fcn_model_5.pth'
model = your_model()  # a torch.nn.Module object
model.load_state_dict(torch.load(checkpoint))
model = model.to(device)
The source of your problem is simply that you are loading your model as a dict instead of an nn.Module. Here is another approach you can use that avoids the nn.Module boilerplate, adapted from here:
for k, v in model.items():
    model[k] = v.to(device)
Now you have an ordered dict with the tensors moved to the correct device.
Please note that you will still have an ordered dict instead of an nn.Module; you cannot run a forward pass on an ordered dict.
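If you are unsure what kind of object a checkpoint file contains, a quick check like the following (the path is just the one from the question) tells you which of the two approaches applies:
import torch
import torch.nn as nn

obj = torch.load('./checkpoints/fcn_model_5.pth', map_location='cpu')

if isinstance(obj, nn.Module):
    print('full model saved: obj.to(device) will work directly')
elif isinstance(obj, dict):
    print('state dict saved: instantiate the architecture, then load_state_dict')
    print(list(obj.keys())[:5])   # peek at the parameter names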

Why do we need state_dict = state_dict.copy()

I want to load the weights of a pre-trained model into my local model. I don't understand why state_dict = state_dict.copy() is necessary if the two networks have the same state_dict names.
# copy state_dict so _load_from_state_dict can modify it
metadata = getattr(state_dict, '_metadata', None)
state_dict = state_dict.copy()
if metadata is not None:
    state_dict._metadata = metadata

def load(module, prefix=''):
    local_metadata = {} if metadata is None else metadata.get(prefix[:-1], {})
    module._load_from_state_dict(
        state_dict, prefix, local_metadata, True, missing_keys, unexpected_keys, error_msgs)
    for name, child in module._modules.items():
        if child is not None:
            load(child, prefix + name + '.')

start_prefix = ''
# print("hasattr(model, 'bert')", hasattr(model, 'bert')) : False
if not hasattr(model, 'bert') and any(s.startswith('bert.') for s in state_dict.keys()):
    start_prefix = 'bert.'
load(model, prefix=start_prefix)
Note: the above code is from Hugging Face.
state_dict = state_dict.copy()
does exactly what it says: it makes a copy of the state_dict. A state dict holds all the parameters of your model, and copying it here lets the loading code modify its own copy (as the comment "copy state_dict so _load_from_state_dict can modify it" says) without touching the dict that was passed in. One should be careful about whether a copy or a deepcopy is needed, though: .copy() is shallow, so the values still refer to the same tensors.
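A minimal sketch of the difference between a shallow copy and a deepcopy of a state dict (nn.Linear here is just a stand-in model):
import copy
import torch
import torch.nn as nn

model = nn.Linear(2, 2)
sd = model.state_dict()

shallow = sd.copy()        # new dict, but the values are the same tensor objects
deep = copy.deepcopy(sd)   # new dict with new tensors

sd['weight'].zero_()       # modify a tensor in place
print(torch.equal(shallow['weight'], sd['weight']))  # True: the shallow copy sees the change
print(torch.equal(deep['weight'], sd['weight']))     # False: the deep copy kept the old values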

Scikit-Learn: set_param() for custom estimator sets nested parameter before component

I implemented several custom estimators, following the developer guide, so that all of them are inheriting from BaseEstimator. Some of these use other scikit-learn estimators or transformers as attributes (say for example, to build an ensemble). Inheriting from BaseEstimator should give me the convenience of accessing the parameters through get_params() and setting them through set_params() as described here, in the form component__parameter, for example for use in grid search. Find below a minimal example.
from sklearn.base import BaseEstimator
from sklearn.linear_model import LinearRegression

class MyForecaster(BaseEstimator):
    def __init__(self, base_estimator=LinearRegression()):
        self.base_estimator = base_estimator

    def fit(self, X, y):
        pass

    def predict(self, X, y):
        pass

# instantiate forecaster and set parameters
mf = MyForecaster()
mf.set_params(**{"base_estimator": "ElasticNet", "base_estimator__alpha": 0.05})
This fails with:
ValueError: Invalid parameter alpha for estimator LinearRegression. Check the list of available parameters with `estimator.get_params().keys()`.
This indicates that it tries to set the nested parameter first, instead of first checking whether I want to overwrite the "higher level" attribute (ElasticNet has the attribute alpha, LinearRegression does not).
One way to handle this would be to override set_params() for each estimator, to make sure it is handled correctly.
Is there any built-in way to achieve this that I simply overlooked, or another solution? Is this really the intended behavior of scikit-learn?
Edit:
So indeed, by quite a coincidence, a very similar issue seems to have been fixed in version 0.19.1. However, my particular case still fails; only the case with Pipelines is fixed!
To make it reproducible, I copied the current code of set_params() into my minimal example (I only added the comment in line 20):
 1 def set_params(self, **params):
 2     if not params:
 3         # Simple optimization to gain speed (inspect is slow)
 4         return self
 5     valid_params = self.get_params(deep=True)
 6
 7     nested_params = defaultdict(dict)  # grouped by prefix
 8     for key, value in params.items():
 9         key, delim, sub_key = key.partition('__')
10         if key not in valid_params:
11             raise ValueError('Invalid parameter %s for estimator %s. '
12                              'Check the list of available parameters '
13                              'with `estimator.get_params().keys()`.' %
14                              (key, self))
15
16         if delim:
17             nested_params[key][sub_key] = value
18         else:
19             setattr(self, key, value)
20             # valid_params[key] = value
21
22     for key, sub_params in nested_params.items():
23         valid_params[key].set_params(**sub_params)
24
25     return self
It fails because it sets the attribute in line 19 but does not update valid_params, so it still fails afterwards, when the nested parameter is applied to the old sub-estimator. So I added line 20, which would fix this.
The current fix in 0.19.1 does work for the cases it was tested on, but those only cover Pipelines. There, set_params() is overridden to first call _set_params() of _BaseComposition, where this is apparently handled.
Should I raise this in the scikit-learn github or reopen the other issue?
This is a bug. It was reported a week ago, and has already been fixed and backported to v0.19.1, which was released yesterday.
The easiest fix is to update scikit-learn to v0.19.1 (or to the master dev branch).
So the fix mentioned in TomDLT's answer fixed a very similar issue and led to the fix above, which will most likely make it into a future version of sklearn (9999).
So, for now: if you come across this problem in the meantime, either use the code above to override set_params(), or wait for the fix.
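In the meantime, a sketch of such an override for the minimal example above (it simply keeps valid_params in sync, i.e., the line-20 fix, and passes an ElasticNet instance rather than a string; the class name is just the one from the question):
from collections import defaultdict
from sklearn.base import BaseEstimator
from sklearn.linear_model import LinearRegression, ElasticNet

class MyForecaster(BaseEstimator):
    def __init__(self, base_estimator=LinearRegression()):
        self.base_estimator = base_estimator

    def set_params(self, **params):
        if not params:
            return self
        valid_params = self.get_params(deep=True)
        nested_params = defaultdict(dict)       # grouped by prefix
        for key, value in params.items():
            key, delim, sub_key = key.partition('__')
            if key not in valid_params:
                raise ValueError('Invalid parameter %s for estimator %s.' % (key, self))
            if delim:
                nested_params[key][sub_key] = value
            else:
                setattr(self, key, value)
                valid_params[key] = value        # keep valid_params in sync (the line-20 fix)
        for key, sub_params in nested_params.items():
            valid_params[key].set_params(**sub_params)
        return self

mf = MyForecaster()
mf.set_params(base_estimator=ElasticNet(), base_estimator__alpha=0.05)
print(mf.base_estimator.alpha)  # 0.05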

tf.global_variables_initializer() does not work

Hello TensorFlow users/developers,
Even though I call the initializer, the report tells me that none of my variables are initialized. I created them using tf.get_variable(). Here is where my session and graph objects are created:
with tf.Graph().as_default():
    # Store all scores (each score is a loss-per-episode)
    init = tf.global_variables_initializer()
    all_scores, scores = [], []
    # Build common tensors used throughout entire session
    nn.build(seq_len)
    # Generate inference and loss models
    [loss, train_op] = nn.generate_models()
    with tf.Session() as sess:
        try:
            st = time.time()
            # Initialize all variables (Note! not operation tensors, but variable tensors)
            print('Initializing variables...')
            sess.run(init)
            print('Training starts...')
            for e, (input_, target) in sample_generator:
                feed_dict = nn.prepare_dict(input_, target)
                # Run one step of the model. The return values are the activations
                # from the `train_op` (which is discarded) and the `loss` Op.
                x = sess.run(tf.report_uninitialized_variables(tf.global_variables()))
                print(x)
                _, score = sess.run([train_op, loss],
                                    feed_dict=feed_dict)
                all_scores.append(score)
                scores.append(score)
                # Assess your predictions against the target
                if e > 0 and not (e % 100):
                    print('Episode %05d: %.6f' % (e, np.mean(scores).tolist()[0]))
                    scores.clear()
        except KeyboardInterrupt:
            print('Elapsed time: %ld' % (time.time() - st))
            pass
I've called this method millions of times before and it has always worked perfectly, but right now it is leaving me in the lurch. What do you think the cause might be? Any suggestion would be really appreciated.
P.S. I tried calling tf.local_variables_initializer() too, but the report told me that I don't have any local variables at all.
Thanks in advance.
Thanks for the reply.
Well, I've figured it out. I shouldn't have executed the following assignment before I built my model:
init = tf.global_variables_initializer()
For anyone's information: you may think, "I'll execute and get the result of this operation called init only when I run it in a Session, so it doesn't matter where I do the assignment above."
No, that is not true. TensorFlow decides which variables will be initialized at the moment this line is executed; the initializer only covers the variables that already exist in the graph at that point. So call it after you have built your entire model.
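A minimal TF 1.x sketch of the corrected ordering (the two variables here are just placeholders for the real model):
import tensorflow as tf

with tf.Graph().as_default():
    # Build the entire model first, so every variable exists in the graph...
    w = tf.get_variable('w', shape=[3, 2])
    b = tf.get_variable('b', shape=[2])

    # ...and only then create the initializer, so it captures all of them
    init = tf.global_variables_initializer()

    with tf.Session() as sess:
        sess.run(init)
        print(sess.run(tf.report_uninitialized_variables()))  # empty array: nothing left uninitialized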
If it does not exist, I suspect you accidentally downgraded your TensorFlow version.
Can you try tf.initialize_all_variables?
If that does not work, can you post which version you are using?
I got the same error. However, this is my solution: just skip the early init = tf.global_variables_initializer()
and, after building the model, simply use:
sess = tf.Session()
sess.run(tf.global_variables_initializer())
