I am working through the Theano deep learning tutorials and I have a question about how a Theano function's updates work. If the updates are defined in terms of some parameters, will the function pick up the new value when any of those parameters changes?
theano.function(
    [i],
    cost,
    updates=updates,
    givens={self.x: trainX[i*self.mbSize: (i+1)*self.mbSize],
            self.y: trainY[i*self.mbSize: (i+1)*self.mbSize]}
)
updates is defined as
updates = [(param, param - learnRate*grad)
           for param, grad in zip(self.params, gradients)]
Here learnRate is not a Theano variable. My question is: if I change learnRate at some point, will the Theano function use the changed value, or will it continue with the old one?
learnRate is changed like this:
learnRate = learnRate/10
Initially, learnRate was 0.05.
It will continue with the old value.
Numbers are immutable in Python, so changing the value of learnRate won't affect the value already captured in the updates list. And as far as I know, once a Theano function is compiled with the provided updates, they cannot be changed unless you compile the function again with the new value.
As you may already know, you can make the learning rate a Theano shared variable and then change its value whenever you want.
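For example, here is a minimal sketch of that approach (assuming self.params and gradients from the question are in scope):
import numpy as np
import theano

# Make the learning rate a shared variable so it can be changed
# after the function is compiled (a sketch, not the tutorial's exact code).
learnRate = theano.shared(np.float32(0.05), name='learnRate')
updates = [(param, param - learnRate*grad)
           for param, grad in zip(self.params, gradients)]
# ... compile theano.function with these updates as before ...

# Later, change the rate without recompiling:
learnRate.set_value(learnRate.get_value() / 10)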
Here's another question that might be useful: Is there a way to change a function's update list without re-compiling it in Theano?
I need an array of the sums of the 3x3 neighborhoods of another array of the same size, with each neighbor weighted by a kernel (up to this point, this is exactly scipy.ndimage.correlate). But when a value of the new array is calculated, it has to be used immediately, instead of the value from the original array, for the next computation involving that cell. I have written this slow code to implement it myself, which works perfectly fine (although too slow for me) and delivers the expected result:
def laplaceNeighborDifference(x, y):
    global w, h, AArr
    return (-AArr[y,x]
            + AArr[(y+1)%h,x]*.2 + AArr[(y-1)%h,x]*.2
            + AArr[y,(x+1)%w]*.2 + AArr[y,(x-1)%w]*.2
            + AArr[(y+1)%h,(x+1)%w]*.05 + AArr[(y-1)%h,(x+1)%w]*.05
            + AArr[(y+1)%h,(x-1)%w]*.05 + AArr[(y-1)%h,(x-1)%w]*.05)

for x in range(width):
    for y in range(height):
        AArr[y,x] += laplaceNeighborDifference(x, y)
In my approach the kernel is coded directly. Written as an array (to be used as a kernel), it would look like this:
[[.05,.2,.05],
[.2 ,-1,.2 ],
[.05,.2,.05]]
The SciPy implementation would work like this:
AArr += correlate(AArr, kernel, mode='wrap')
But obviously, when I use scipy.ndimage.correlate, it calculates the values entirely from the original array and doesn't update them as it computes them. At least I think that is the difference between my implementation and the SciPy one; feel free to point out other differences if I've missed any. My question is whether there is a function similar to the one mentioned above that produces the desired result, or whether there is an approach to code it that is faster than mine?
Thank you for your time!
You can use Numba to do that efficiently:
import numba as nb

@nb.njit
def laplaceNeighborDifference(AArr, w, h, x, y):
    return (-AArr[y,x]
            + AArr[(y+1)%h,x]*.2 + AArr[(y-1)%h,x]*.2
            + AArr[y,(x+1)%w]*.2 + AArr[y,(x-1)%w]*.2
            + AArr[(y+1)%h,(x+1)%w]*.05 + AArr[(y-1)%h,(x+1)%w]*.05
            + AArr[(y+1)%h,(x-1)%w]*.05 + AArr[(y-1)%h,(x-1)%w]*.05)

@nb.njit('void(float64[:,::1],int64,int64)')
def compute(AArr, width, height):
    for x in range(width):
        for y in range(height):
            AArr[y,x] += laplaceNeighborDifference(AArr, width, height, x, y)
Note that modulo operations are generally very slow. It is better to remove them by computing the border cells separately from the main loop. The resulting code, handling the interior without any modulus, should be much faster.
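A minimal sketch of that idea (assuming the two Numba functions above are in scope and width and height are at least 2): the interior is handled with plain indexing, and only the border rows and columns fall back to the wrapped version.
@nb.njit('void(float64[:,::1],int64,int64)')
def compute_fast(AArr, width, height):
    # Interior cells never wrap, so no modulo is needed.
    for x in range(1, width-1):
        for y in range(1, height-1):
            AArr[y,x] += (-AArr[y,x]
                          + (AArr[y+1,x] + AArr[y-1,x] + AArr[y,x+1] + AArr[y,x-1])*.2
                          + (AArr[y+1,x+1] + AArr[y-1,x+1] + AArr[y+1,x-1] + AArr[y-1,x-1])*.05)
    # Border cells: use the wrapped (modulo) version only here.
    for x in range(width):
        AArr[0,x] += laplaceNeighborDifference(AArr, width, height, x, 0)
        AArr[height-1,x] += laplaceNeighborDifference(AArr, width, height, x, height-1)
    for y in range(1, height-1):
        AArr[y,0] += laplaceNeighborDifference(AArr, width, height, 0, y)
        AArr[y,width-1] += laplaceNeighborDifference(AArr, width, height, width-1, y)
One caveat: since the update is in-place and order-dependent, this visits interior cells before border cells, so results can differ slightly from the strict x-major order of compute().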
I'm moving my comments from https://github.com/tensorflow/tensorflow/issues/8833 to StackOverflow as SO seems more appropriate.
I'm attempting to implement a sequence to sequence model using tensorflow.contrib.seq2seq and tensorflow.contrib.rnn's BasicLSTMCell. Within rnn_cell_impl.py, the line c, h = state causes the following error:
TypeError: 'Tensor' object is not iterable.
When stepping through the code, I learned that the error occurs the third time c, h = state is evaluated. The first two times, state has type <class 'tensorflow.python.ops.rnn_cell_impl.LSTMStateTuple'>, but the third time, state has type <class 'tensorflow.python.framework.ops.Tensor'>. Clearly, I want the third occurrence to have type LSTMStateTuple too, but I have no idea what might be causing the switch.
The problematic state tensor's name is define_model/define_decoder/decoder/while/Identity_3. I wrote the methods define_model() and define_decoder(), and the remaining information suggests that something is happening inside my decoder.
In case it's relevant, I'm using Python 3.6 and TensorFlow 1.2.
The answer can be found at the GitHub issue page linked above.
To briefly summarize: the problem was that my encoder used a bidirectional RNN, which produces a 2-tuple of LSTMStateTuples, i.e. one c and one h state for each directional RNN. The decoder, however, accepts a single cell with a single associated LSTMStateTuple. To solve this, you need to concatenate the c states and the h states of the two directional RNNs separately, wrap the result in a new LSTMStateTuple, and pass that as the decoder's initial state.
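A minimal sketch of that fix (assuming TF 1.x, where fw_state and bw_state stand for the LSTMStateTuples returned by tf.nn.bidirectional_dynamic_rnn):
import tensorflow as tf

# fw_state, bw_state: the forward/backward LSTMStateTuples from the encoder.
state_c = tf.concat([fw_state.c, bw_state.c], axis=1)
state_h = tf.concat([fw_state.h, bw_state.h], axis=1)
decoder_initial_state = tf.contrib.rnn.LSTMStateTuple(c=state_c, h=state_h)
# Note: the decoder cell must then use twice the encoder's hidden size.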
I think a similar answer can be found here.
The code converts a cuDNN cell state to TensorFlow's internal state. See this method:
def cudnn_lstm_state_to_state_tuples(cudnn_lstm_state):
During a Theano computation I would like to write a variable, say x, to a file. A subsequent computation requires data from a file called scores.txt, which is why x needs to be written to scores.txt. Is there any way to write the value contained in x into scores.txt? Note that scores.txt will be used by a non-differentiable function (this function is not learnt, so gradients with respect to its operations are not required), hence any method that simply stores the value of x into scores.txt during the Theano computation is sufficient.
If x is a shared variable, just use x.get_value() to extract its contents as a Python value or a NumPy array; then you can write it to a file as you normally would.
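For example (a minimal sketch, assuming x holds a 1-D or 2-D numeric array):
import numpy as np
np.savetxt('scores.txt', x.get_value())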
If x is a Theano tensor variable, you can define a theano.function that takes the inputs x depends on (which should be available as ordinary Python or NumPy data) and outputs x. You can then print and save the output of this function normally:
x_values = theano.function([x_input1, x_input2], x)
result = x_values(input1_data, input2_data)  # pass concrete NumPy data here, not the symbolic inputs
print(result)
A few months ago I started working with Python, considering the great advantages it has. But recently I used odeint from SciPy to solve a system of differential equations, and during the integration process the implemented function doesn't work as expected.
In this case, I want to solve a system of differential equations where one of the initial conditions (x[0]) varies (between 4 and 5) depending on the value the variable reaches during the integration process (it is programmed inside the function by means of an if statement):
#Control of oxygen
SO2_lower=4
SO2_upper=5
if x[0]<=SO2_lower:
    x[0]=SO2_upper
When the function is used by odeint, some lines of code inside the function seem to be ignored, even though the function changes the value of x[0]. Here is all my code:
import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt
plt.ion()
# Stoichiometric parameters
YSB_OHO_Ox=0.67 #Yield for XOHO growth per SB (Aerobic)
YSB_Stor_Ox=0.85 #Yield for XOHO,Stor formation per SB (Aerobic)
YStor_OHO_Ox=0.63 #Yield for XOHO growth per XOHO,Stor (Aerobic)
fXU_Bio_lys=0.2 #Fraction of XU generated in biomass decay
iN_XU=0.02 #N content of XU
iN_XBio=0.07 #N content of XBio
iN_SB=0.03 #N content of SB
fSTO=0.67 #Stored fraction of SB
#Kinetic parameters
qSB_Stor=5 #Rate constant for XOHO,Stor storage of SB
uOHO_Max=2 #Maximum growth rate of XOHO
KSB_OHO=2 #Half-saturation coefficient for SB
KStor_OHO=1 #Half-saturation coefficient for XOHO,Stor/XOHO
mOHO_Ox=0.2 #Endogenous respiration rate of XOHO (Aerobic)
mStor_Ox=0.2 #Endogenous respiration rate of XOHO,Stor (Aerobic)
KO2_OHO=0.2 #Half-saturation coefficient for SO2
KNHx_OHO=0.01 #Half-saturation coefficient for SNHx
#Other parameters
DT=1/86400.0
def f(x,t):
    #Control of oxygen
    SO2_lower=4
    SO2_upper=5
    if x[0]<=SO2_lower:
        x[0]=SO2_upper
    M=np.matrix([[-(1.0-YSB_Stor_Ox),-1,iN_SB,0,0,YSB_Stor_Ox],
                 [-(1.0-YSB_OHO_Ox)/YSB_OHO_Ox,-1/YSB_OHO_Ox,iN_SB/YSB_OHO_Ox-iN_XBio,0,1,0],
                 [-(1.0-YStor_OHO_Ox)/YStor_OHO_Ox,0,-iN_XBio,0,1,-1/YStor_OHO_Ox],
                 [-(1.0-fXU_Bio_lys),0,iN_XBio-fXU_Bio_lys*iN_XU,fXU_Bio_lys,-1,0],
                 [-1,0,0,0,0,-1]])
    R=np.matrix([[DT*fSTO*qSB_Stor*(x[0]/(KO2_OHO+x[0]))*(x[1]/(KSB_OHO+x[1]))*x[4]],
                 [DT*(1-fSTO)*uOHO_Max*(x[0]/(KO2_OHO+x[0]))*(x[1]/(KSB_OHO+x[1]))*(x[2]/(KNHx_OHO+x[2]))*x[4]],
                 [DT*uOHO_Max*(x[0]/(KO2_OHO+x[0]))*(x[2]/(KNHx_OHO+x[2]))*((x[5]/x[4])/(KStor_OHO+(x[5]/x[4])))*(KSB_OHO/(KSB_OHO+x[1]))*x[4]],
                 [DT*mOHO_Ox*(x[0]/(KO2_OHO+x[0]))*x[4]],
                 [DT*mStor_Ox*(x[0]/(KO2_OHO+x[0]))*x[5]]])
    Mt=M.transpose()
    MxRm=Mt*R
    MxR=MxRm.tolist()
    return [MxR[0][0],
            MxR[1][0],
            MxR[2][0],
            MxR[3][0],
            MxR[4][0],
            MxR[5][0]]
#ODE solution
t=np.linspace(0.0,3600,3600)
#Initial conditions
y0=np.array([5,176,5,30,100,5])
Var=odeint(f,y0,t,args=(),h0=1,hmin=1,hmax=1,atol=1e-5,rtol=1e-5)
Sol=Var.tolist()
plt.plot(t,Var[:,0])
Thanks very much in advance!!!!!
Short answer:
You should not modify the input state vector inside your ODE function. Instead, try the following and verify your results:
x0 = x[0]
if x0 <= SO2_lower:
    x0 = SO2_upper
# use x0 instead of x[0] in the rest of this function body
I suppose that this is your problem, but I am not sure, since you did not explain what exactly was wrong with the results. Moreover, you are not changing the "initial condition". The initial condition is
y0=np.array([5,176,5,30,100,5])
You are just changing the input state vector.
Detailed answer:
Your odeint integrator is probably using a higher-order adaptive method (SciPy's odeint wraps LSODA). Such an algorithm requires multiple ODE function evaluations to compute a single integration step, so changing the input state vector may lead to undefined results. In C++ boost::odeint it is not even possible to do so, because the input variable is const. Python, however, is not as strict as C++, and I suppose it is possible to introduce this kind of bug unintentionally (I did not try it, though).
EDIT:
OK, I understand what you want to achieve.
Your variable x[0] is constrained by modular algebra, and it is not possible to express the system in the form
x' = f(x,t)
which is one of the possible definitions of an Ordinary Differential Equation, the kind that the odeint library is meant to solve. However, a few "hacks" can be used here to bypass this limitation.
One possibility is to use a fixed-step, low-order solver (for higher-order solvers you would need to know which part of the algorithm you are actually in; see RK4 for example) and change your dx[0] equation (in your code it is the MxR[0][0] element) to:
# at the beginning of your system
if x[0] > SO2_lower:  # everything is normal here
    x0 = x[0]
    dx0 = ...  # normal equation for dx0
else:  # x[0] is too low, we must somehow force it to become SO2_upper again
    dx0 = (SO2_upper - x[0])/step_size  # so that x0_{n+1} = x0_{n} + dx0*step_size = SO2_upper
    x0 = SO2_upper
# remember to use x0 in the rest of your code and also remember to return dx0
However, I do not recommend this technique, because it makes you strongly dependent on the algorithm and you must know the exact step size (although I might recommend it for saturation constraints). Another possibility is to perform a single integration step at a time and correct your x0 whenever necessary:
// 1. do_step(sys, in, dxdtin, t, out, dt);
// 2. do something with the output
// 3. in = out
// 4. return to 1 or finish
Sorry for the C++ syntax; here is the exhaustive documentation (C++ odeint steppers), and here is its equivalent in Python (the Python ode class). The C++ odeint interface is better suited to your task, but you can achieve exactly the same in Python. Just look for:
integrate(t[, step, relax])
set_initial_value(y[, t])
in docs.
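For illustration, here is a minimal, hedged Python sketch of that step-at-a-time idea, reusing f and y0 from the question (note that scipy.integrate.ode expects f(t, x), hence the wrapper):
from scipy.integrate import ode

# Advance one step at a time and apply the reset outside the solver.
SO2_lower, SO2_upper = 4, 5
solver = ode(lambda t, x: f(x, t))
solver.set_integrator('dopri5')
solver.set_initial_value(y0, 0.0)

dt = 1.0
trajectory = [y0]
while solver.successful() and solver.t < 3600:
    x = solver.integrate(solver.t + dt)        # single integration step
    if x[0] <= SO2_lower:                      # correct the state between steps
        x[0] = SO2_upper
        solver.set_initial_value(x, solver.t)  # restart from the corrected state
    trajectory.append(x.copy())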
Does MiniZinc have any syntax to specify the distribution of value selection for a variable? For example:
var 0..100: X;
I would like to specify that X takes values in the range 0..50 90% of the time and values in 51..100 10% of the time. The syntax
solve :: int_search([X], first_fail, indomain_random, complete) satisfy;
specifies that X may get any value in 0..100 with the same probability.
MiniZinc doesn't have any syntax to state such a random distribution.
One way might be to change the FlatZinc solver so that it behaves this way when indomain_random is used, though this requires access to the source of the FlatZinc solver.
That said, what exactly is your use case for wanting this distribution?