Disable grad and backward Globally? - pytorch

How can I globally disable grad, backward, and any other non-forward() functionality in Torch?
I see examples of how to do it locally, but not globally.
The docs suggest that what I'm looking for may be inference-only mode, but how do I set it globally?

You can use torch.set_grad_enabled(False) to disable gradient tracking for everything that runs afterwards in the current thread (the setting is thread-local). After you have called torch.set_grad_enabled(False), calling backward() on a newly computed result raises an exception, because that result no longer has a grad_fn.
import numpy as np
import torch

a = torch.tensor(np.random.rand(64, 5), dtype=torch.float32)
l = torch.nn.Linear(5, 10)
o = torch.sum(l(a))
print(o.requires_grad)  # True
o.backward()
print(l.weight.grad)  # shows the gradients
torch.set_grad_enabled(False)  # from here on, no graph is recorded
o = torch.sum(l(a))
print(o.requires_grad)  # False
o.backward()  # RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
print(l.weight.grad)  # never reached, because the line above raises
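Since the question mentions inference-only mode, here is a minimal sketch that contrasts the thread-wide switch with the scoped torch.inference_mode() context manager. The layer, the input shape, and the variable names are illustrative and not taken from the original post.

```python
import torch

layer = torch.nn.Linear(5, 10)
x = torch.rand(64, 5)

# Thread-wide switch: everything computed after this call is untracked.
torch.set_grad_enabled(False)
out_global = layer(x).sum()
tracked_after_switch = out_global.requires_grad

# Switch tracking back on, then disable it only inside a scoped block.
torch.set_grad_enabled(True)
with torch.inference_mode():
    out_scoped = layer(x).sum()
tracked_in_scope = out_scoped.requires_grad

# Outside the block, gradient tracking is active again.
out_tracked = layer(x).sum()
print(tracked_after_switch, tracked_in_scope, out_tracked.requires_grad)
```

Calling torch.set_grad_enabled(False) once at the top of a script gives the "global" behaviour asked for in that thread; the context manager is the safer choice when only part of the program should skip gradient tracking.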

Related

Creating tensors on M1 GPU by default on PyTorch using jupyter

Right now, if I want to create a tensor on the GPU, I have to do it manually. For context, I'm sure that GPU support is available, since
print(torch.backends.mps.is_available())  # checks that an MPS device is available on this machine
print(torch.backends.mps.is_built())  # checks that the current PyTorch installation was built with MPS activated
both return True.
I've been doing this every time:
device = torch.device("mps")
a = torch.randn((), device=device, dtype=dtype)
Is there a way to specify, for a Jupyter notebook, that all my tensors should be created on the GPU?
The convenient way
There is no convenient way to set default device to MPS as of 2022-12-22, per discussion on this issue.
The inconvenient way
You can accomplish the objective of 'I don't want to specify device= for tensor constructors, just use MPS' by intercepting calls to tensor constructors:
class MPSMode(torch.overrides.TorchFunctionMode):
    def __init__(self):
        # incomplete list; see link above for the full list
        self.constructors = {getattr(torch, x) for x in "empty ones arange eye full fill linspace rand randn randint randperm range zeros tensor as_tensor".split()}
    def __torch_function__(self, func, types, args=(), kwargs=None):
        if kwargs is None:
            kwargs = {}
        if func in self.constructors:
            if 'device' not in kwargs:
                kwargs['device'] = 'mps'
        return func(*args, **kwargs)
# sensible usage
with MPSMode():
    print(torch.empty(1).device) # prints mps:0
# sneaky usage
MPSMode().__enter__()
print(torch.empty(1).device) # prints mps:0
The recommended way:
I would lean towards just putting your device in a config at the top of your notebook and using it explicitly:
class Conf: dev = torch.device("mps")
# ...
a = torch.randn(1, device=Conf.dev)
This requires you to type device=Conf.dev throughout the code. But you can easily switch your code to different devices, and you don't have any implicit global state to worry about.
As of 2023-01-20, with the PyTorch 2.0 nightly build, you can set the default device to MPS using:
torch.set_default_device("mps")
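A minimal sketch of how that call could be used in a notebook that should also run on machines without MPS. It assumes PyTorch 2.0 or later; the fallback to "cpu" and the name default_device are illustrative choices, not part of the original answer.

```python
import torch

# Pick MPS when this machine supports it, otherwise fall back to the CPU.
default_device = "mps" if torch.backends.mps.is_available() else "cpu"
torch.set_default_device(default_device)

# Constructors called without device= now allocate on the default device.
a = torch.randn(3)
print(a.device)

# An explicit device= still overrides the default.
b = torch.zeros(3, device="cpu")
print(b.device)
```

Putting these lines in the first notebook cell keeps the rest of the notebook free of device= arguments, at the cost of the implicit global state that the answer above warns about.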

Why would you call .detach() on a parameter when the code is within with torch.no_grad()?

I have this code for updating the critic in SAC:
with torch.no_grad():
    _, policy_action, log_pi, _ = self.actor(next_obs)
    target_Q1, target_Q2 = self.critic_target(next_obs, policy_action)
    target_V = torch.min(target_Q1, target_Q2) - self.alpha.detach() * log_pi
    target_Q = reward + (not_done * self.discount * target_V)
This is not my code; I got it off GitHub. As we can see, it uses both torch.no_grad() and self.alpha.detach(). Why would you need both? This seems redundant to me: with torch.no_grad(), nothing inside the with statement is added to the computational graph, and .detach() does the same thing for a single tensor. Why would you use torch.no_grad() and detach() together?
You don't need to detach tensors from the graph when under the torch.no_grad context manager: nothing computed inside it is recorded, so self.alpha.detach() is redundant here. However, I suspect this expression was copied from a part of the code where gradients are tracked by default and the detach is actually needed. You could verify that by looking through the source file of the training loop.
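The redundancy can be checked with a small sketch. The tensors alpha and log_pi below are stand-ins for self.alpha and the actor's log-probabilities; their values and shapes are assumptions made for illustration.

```python
import torch

alpha = torch.tensor(0.2, requires_grad=True)  # stand-in for self.alpha
log_pi = torch.randn(4, requires_grad=True)    # stand-in for the actor output

with torch.no_grad():
    plain = alpha * log_pi
    detached = alpha.detach() * log_pi

# Inside no_grad both results are identical and carry no graph.
print(plain.requires_grad, detached.requires_grad)
print(torch.equal(plain, detached))

# Outside no_grad, detach() does matter: a graph is built, but alpha is left out of it.
tracked = alpha.detach() * log_pi
tracked.sum().backward()
print(alpha.grad is None, log_pi.grad is not None)
```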

How to automatically disable register_hook when model is in eval() phase in PyTorch?

I need to modify the grads of an intermediate tensor using the register_hook method. Since the tensor isn't a leaf variable, I call retain_grad() on it, after which I can use register_hook to alter the grads.
score.retain_grad()
h = score.register_hook(lambda grad: grad * torch.FloatTensor(...))
This works perfectly fine during the training (model.train()) phase. However, it gives an error during the evaluation phase (model.eval()).
The error:
  File "/home/envs/darthvader/lib/python3.6/site-packages/torch/tensor.py", line 198, in register_hook
    raise RuntimeError("cannot register a hook on a tensor that "
RuntimeError: cannot register a hook on a tensor that doesn't require gradient
How can the model automatically skip the register_hook call when it is in the eval() phase?
Removing score.retain_grad() and guarding register_hook with an if condition (if score.requires_grad) does the trick.
if score.requires_grad:
    h = score.register_hook(lambda grad: grad * torch.FloatTensor(...))
Originally answered by Alban D here.
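A minimal sketch of the guard in a self-contained form. The function name scaled_score, the scale argument, and the tensors are hypothetical. Note that model.eval() by itself does not turn off gradient tracking; the sketch assumes, as is common, that evaluation runs under torch.no_grad(), which is what makes score.requires_grad False.

```python
import torch

def scaled_score(x, weight, scale):
    """Compute an intermediate score and rescale its gradient when one exists."""
    score = x @ weight
    if score.requires_grad:  # False under torch.no_grad(), so no hook is registered
        score.register_hook(lambda grad: grad * scale)
    return score

x = torch.ones(2, 3)
weight = torch.ones(3, 1, requires_grad=True)

# Training-style call: the hook doubles the gradient flowing through score.
scaled_score(x, weight, scale=2.0).sum().backward()
print(weight.grad)

# Evaluation-style call: no graph, no hook, no RuntimeError.
with torch.no_grad():
    eval_score = scaled_score(x, weight, scale=2.0)
print(eval_score.requires_grad)
```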

Protocol problem with PyMc3 on jupyter notebook

I am working with the following code, but I get an error
import pymc3 as pm
import theano.tensor as tt
with pm.Model() as model:
    alpha = 1.0/count_data.mean()  # Recall count_data is the
                                   # variable that holds our txt counts
    lambda_1 = pm.Exponential("lambda_1", alpha)
    lambda_2 = pm.Exponential("lambda_2", alpha)
    tau = pm.DiscreteUniform("tau", lower=0, upper=n_count_data - 1)
with model:
    idx = np.arange(n_count_data) # Index
    lambda_ = pm.math.switch(tau > idx, lambda_1, lambda_2)
with model:
    observation = pm.Poisson("obs", lambda_, observed=count_data)
with model:
    step = pm.Metropolis()
    trace = pm.sample(10000, tune=5000,step=step)
But I get the error
ValueError: must use protocol 4 or greater to copy this object; since __getnewargs_ex__ returned keyword arguments.
I have Windows 10, Python 3.5.6, PyMC3 3.5, and IPython 6.5.0. Any help is deeply appreciated. Thanks in advance.
It sounds like this exception is being thrown by the joblib library, which uses pickle to send the model to different processes. The easiest fix is to use only a single core, by changing the last line to
trace = pm.sample(10000, tune=5000, step=step, cores=1, chains=4)
It will be hard to diagnose the problem with joblib without more details. Creating a fresh conda environment might help.
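To illustrate where such a message can come from, here is a small standard-library sketch that uses neither PyMC3 nor joblib. The Point class is hypothetical; it defines __getnewargs_ex__ with keyword arguments, which is the situation the error message describes. On the asker's Python 3.5 a protocol below 4 is rejected for such objects, while later Python versions may accept it, so the code records the outcome instead of assuming it.

```python
import pickle

class Point:
    """Hypothetical class whose __new__ takes keyword-only arguments."""
    def __new__(cls, *, x=0, y=0):
        self = super().__new__(cls)
        self.x, self.y = x, y
        return self

    def __getnewargs_ex__(self):
        # Tell pickle to rebuild the object via Point.__new__(Point, x=..., y=...).
        return (), {"x": self.x, "y": self.y}

p = Point(x=1, y=2)

# Protocol 4 supports keyword arguments for __new__ directly.
restored = pickle.loads(pickle.dumps(p, protocol=4))
print(restored.x, restored.y)

# An older protocol may be rejected, depending on the Python version.
try:
    pickle.dumps(p, protocol=2)
    low_protocol_ok = True
except ValueError as exc:
    low_protocol_ok = False
    print("protocol 2 failed:", exc)
```

Running with cores=1 sidesteps the problem because the model then never has to be pickled and sent to another process.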
The workaround suggested by colcarroll did not work for me. The behavior you are seeing is related to PR#3140 of PyMC3, which you may want to track there. The solution and/or workaround may depend on how you are running theano (with or without GPU support).

tf.global_variables_initializer() does not work

Hello TensorFlow users/developers,
Even though I call the initializer function, tf.report_uninitialized_variables() tells me that none of my variables is initialized. I created them using tf.get_variable(). Here is where my session and graph objects are created:
with tf.Graph().as_default():
    # Store all scores (each score is a loss-per-episode)
    init = tf.global_variables_initializer()
    all_scores, scores = [], []
    # Build common tensors used throughout entire session
    nn.build(seq_len)
    # Generate inference and loss models
    [loss, train_op] = nn.generate_models()
    with tf.Session() as sess:
        try:
            st = time.time()
            # Initialize all variables (Note that! not operation tensors; but variable tensors)
            print('Initializing variables...')
            sess.run(init)
            print('Training starts...')
            for e, (input_, target) in sample_generator:
                feed_dict = nn.prepare_dict(input_, target)
                # Run one step of the model. The return values are the activations
                # from the `train_op` (which is discarded) and the `loss` Op.
                x = sess.run(tf.report_uninitialized_variables(tf.global_variables()))
                print(x)
                _, score = sess.run([train_op, loss],
                                    feed_dict=feed_dict)
                all_scores.append(score)
                scores.append(score)
                # Assess your predictions against target
                if e > 0 and not (e%100):
                    print('Episode %05d: %.6f' % (e, np.mean(scores).tolist()[0]))
                    scores.clear()
        except KeyboardInterrupt:
            print('Elapsed time: %ld' % (time.time()-st))
            pass
I've called this method millions of times before, and it has always worked perfectly, but right now it is leaving me in the lurch. What do you think the cause might be? Any suggestion would really be appreciated.
P.S. I tried calling tf.local_variables_initializer() too, but the reporter told me that I don't have any local variables at all.
Thanks in advance.
Thanks for the reply.
Well, I've figured it out. I shouldn't have executed the following assignment before building my model:
init = tf.global_variables_initializer()
For anyone's information: you may think, "The operation called 'init' only runs when I execute it in a Session, so it doesn't matter where I make the assignment above."
No, that is not true. tf.global_variables_initializer() builds an op that initializes only the variables that already exist in the graph at the moment it is called. Thus, call it after you build your entire model.
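The ordering issue can be reproduced in a few lines. This is a sketch under assumptions: it imports tensorflow.compat.v1 so that the TF1-style calls from the question also run on TensorFlow 2.x, and the variable name "w" and the names early_init and late_init are made up for illustration.

```python
import tensorflow.compat.v1 as tf

with tf.Graph().as_default():
    # Created before any variable exists, so it has nothing to initialize.
    early_init = tf.global_variables_initializer()

    # Stand-in for "building the model".
    w = tf.get_variable("w", shape=[2], initializer=tf.zeros_initializer())

    # Created after the model is built, so it covers w.
    late_init = tf.global_variables_initializer()
    report = tf.report_uninitialized_variables()

    with tf.Session() as sess:
        sess.run(early_init)
        before = sess.run(report)  # names still uninitialized after early_init
        sess.run(late_init)
        after = sess.run(report)   # names still uninitialized after late_init

print(before, after)
```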
If it does not exist, I suspect you accidentally downgraded your TensorFlow version.
Can you try tf.initialize_all_variables?
If this does not work, can you post which version you are using?
I got the same error. My solution was to skip the separate init = tf.global_variables_initializer() assignment
and create the initializer at the point where it is run:
sess = tf.Session()
sess.run(tf.global_variables_initializer())
