How does one write intermediate Theano variables to file? - theano

During a Theano computation I would like to write a variable, say x, to a file. The subsequent computation reads its data from a file called 'scores.txt', so the value of x needs to be written there. Is there any way to write the value contained in x into scores.txt? Note that scores.txt will be used by a non-differentiable function (it is not learnt, so gradients with respect to its operations are not required), so any method that simply stores the value of x in scores.txt during the Theano computation is sufficient.

If x is a shared variable, call x.get_value() to get its contents as a Python value or a NumPy array, then write that to the file as you normally would.
If x is a symbolic Theano tensor variable, define a theano.function that takes the inputs x depends on and returns x. Calling that function with ordinary Python or NumPy data gives you the value of x, which you can then print and save as usual:
x_values = theano.function([x_input1, x_input2], x)
print(x_values(value1, value2))  # value1, value2: placeholder numeric data for x_input1, x_input2
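Once the value is available as a NumPy array (from x.get_value() or from the compiled function), writing it to scores.txt is ordinary file I/O. A minimal sketch, assuming the value is a small numeric array; the array below is a made-up stand-in for the real value of x:

```python
import numpy as np

# Hypothetical stand-in for the array returned by x.get_value()
# or by the compiled theano.function.
x_value = np.array([[0.1, 0.9], [0.7, 0.3]])

# Write the array as plain text, one row per line.
np.savetxt('scores.txt', x_value)

# The later, non-differentiable step can read the file back.
scores = np.loadtxt('scores.txt')
```

Because the write happens outside the compiled graph, it plays no part in any gradient computation.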

Related

How to make random choose from a set in Python reproducible

I wrote a simple code sample here. In my actual program, elements are added to or deleted from the set, and a random element is chosen from it on each iteration.
But even when I run the simplified code below, I get different output on every run. How can I make the output reproducible?
import random
random.seed(0)
x = set()
for i in range(40):
    x.add('a'+str(i))
print(random.sample(x, 1))
The problem is that a set has no defined order. The iteration order of a set of strings can change between runs (string hashing is randomized per process), so the result varies even when the random sample picks the same position. Using random.sample on a set has in fact been deprecated since Python 3.9, and from Python 3.11 on you must pass a sequence instead.
You could do this by converting the set to a sequence in a consistently ordered way, such as
x = sorted(x)
or, probably better, just use a type like list in the first place (always producing ['a24'] in your example).
x = ['a' + str(i) for i in range(40)]
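If the data really has to live in a set because elements are added and removed, sorting at the moment of sampling keeps the draw reproducible. A minimal sketch; pick_one is a hypothetical helper name, and it assumes the elements are sortable:

```python
import random

def pick_one(items, seed=0):
    """Pick one element reproducibly from a set of sortable items."""
    rng = random.Random(seed)   # private generator, independent of global state
    ordered = sorted(items)     # fixed order regardless of set iteration order
    return rng.sample(ordered, 1)

x = set()
for i in range(40):
    x.add('a' + str(i))

first = pick_one(x)
second = pick_one(x)            # same seed and same contents give the same pick
```

Sorting costs O(n log n) per draw, so for large sets that change rarely it may be cheaper to keep a sorted list alongside the set.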

Insulate a segment of code from jax tracing

Apologies in advance for how vague this question is (unfortunately I don't know enough about how jax tracing works to phrase it more precisely), but: Is there a way to completely insulate a function or code block from jax tracing?
For context, I have a function of the form:
def f(x, y):
    z = h(y)
    return g(x, z)
Essentially, I want to call g(x, z), and treat z as a constant when doing any jax transformations. However, setting up the argument z is very awkward, so the helper function h is used to transform an easier-to-specify input y into the format required by g. What I'd like is for jax to treat h as a non-traceable black box, so that doing jit(lambda x: f(x, y0)) for a particular y0 is the same as first computing z0 = h(y0) with numpy, then doing jit(lambda x: g(x, z0)) (and similar with grad or whatever other function transformations).
In my code, I've already written h to only use standard numpy (which I thought might lead to black-box behaviour), but the compile time of jit(lambda x: f(x, y0)) is noticeably longer than the compile time of jit(lambda x: g(x, z0)) for z0 = h(y0). I have a feeling the compile time may have something to do with jax tracing the many loops in h, though I'm not sure.
Some additional notes:
Writing h in a jax-friendly way would be awkward (input formatting is ragged, tons of looping/conditionals, output shape dependent on input value, etc) and ultimately more trouble than it's worth as the function is extremely cheap to execute, and I don't ever need to differentiate it (the input data is integer-based).
Thoughts?
Edit addition for clarity: I know there are maybe ways around this if, e.g. f is a top-level function. In this case it isn't such a big deal to get the user to call h first to "pre-compile" the jax-friendly inputs to g, then freely perform whatever jax transformations they want to lambda x: g(x, z0). However, I'm imagining cases in which we have many functions that we want to chain together, that have the same structure as f, where there are some jax-unfriendly inputs/computations, but these inputs will always be treated as constant to the jax part of the computation. In principle one could always pull out these pre-computations to set up the jax stuff, but this seems difficult if we have a non-trivial collection of functions of this type that will be calling each other.
Is there some way to control how f gets traced, so that while tracing it knows to just evaluate z=h(y) (instead of tracing h) then continue with tracing g(x, z)?
The static_argnums parameter of jax.jit could probably help:
f_jitted = jax.jit(f, static_argnums=1)
See https://jax.readthedocs.io/en/latest/notebooks/Common_Gotchas_in_JAX.html
You can use transformation parameters such as static_argnums for jit to avoid tracing particular arguments of transformed functions, though at the cost of more recompiles: jit recompiles for every new value of a static argument, and static arguments must be hashable.
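A minimal sketch of how this could look for the f, g and h from the question. The bodies of h and g are made-up placeholders, and y is passed as a tuple on the assumption that the awkward input can be made hashable:

```python
import numpy as np
import jax
import jax.numpy as jnp

def h(y):
    # Placeholder preprocessing in plain NumPy; y is a hashable tuple of ints.
    return np.cumsum(np.asarray(y, dtype=np.float32))

def g(x, z):
    # Placeholder JAX computation that treats z as a constant.
    return jnp.sum(x * z)

def f(x, y):
    z = h(y)          # y is static, so this runs as ordinary NumPy at trace time
    return g(x, z)

f_jitted = jax.jit(f, static_argnums=1)

x = jnp.arange(3.0)   # [0., 1., 2.]
y0 = (1, 2, 3)        # h(y0) is [1., 3., 6.]
result = f_jitted(x, y0)
```

With y marked static, h sees a real Python tuple rather than a tracer, and its result is baked into the compiled function as a constant. Each distinct y0 triggers a fresh compilation.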

What is the meaning of () inside a list eg. [()] in Python?

I came across an h5py tutorial wherein a particular index of an hdf5 file is accessed as follows:
f = h5py.File('random.hdf5', 'r')
data = f['default'][()]
f.close()
print(data[10])
In this manner, even when the file is closed, the data is still accessible. It seems adding [()] no longer makes data a simple pointer, but rather the data object itself. What is the meaning of [()]?
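() is an empty tuple. HDF5 datasets can have an arbitrary number of dimensions and support indexing, but some datasets are zero-dimensional (they store a single scalar value). For these, h5py uses indexing with an empty tuple, [()], to access that value; [0] or even [:] would not work because they imply at least one dimension to slice along. On a dataset that does have dimensions, as in your example, [()] reads the entire dataset into memory as a NumPy array, which is why data is still usable after the file is closed.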
() is an empty tuple, and indexing with an empty tuple is documented in h5py's documentation:
An empty dataset has shape defined as None, which is the best way of determining whether a dataset is empty or not. An empty dataset can be “read” in a similar way to scalar datasets, i.e. if empty_dataset is an empty dataset:
>>> empty_dataset[()]
h5py.Empty(dtype="f")
The dtype of the dataset can be accessed via .dtype as per normal. As empty datasets cannot be sliced, some methods of datasets such as read_direct will raise an exception if used on an empty dataset.
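NumPy follows the same empty-tuple indexing convention, so its behaviour can be tried without an HDF5 file. A small sketch that uses NumPy arrays as stand-ins for datasets (an analogy, not h5py itself):

```python
import numpy as np

scalar = np.array(3.5)     # zero-dimensional array, shape ()
value = scalar[()]         # the empty-tuple index extracts the single value

try:
    scalar[0]              # there is no axis 0 to index along
    index_error = False
except IndexError:
    index_error = True

grid = np.arange(6).reshape(2, 3)
whole = grid[()]           # the empty tuple selects everything, shape unchanged
```

The difference with h5py is that indexing a Dataset copies the selected data from disk into a NumPy array, which is what detaches the result from the open file.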

Define vector in Excel using the extreme interval points and a step

I want to plot a function y=f(x) in Excel and do some operations on it. The function is defined on an interval (x1,x2) with a given step xs. Obviously I can define the vector x by hand, but I cannot manage to define it automatically, the way I would in Matlab using (x1:xs:x2). Is there any way to do that?

Is there a way to supply a numerical function to JiTCODE’s function argument instead of symbolic one?

I am getting a function (a learned dynamical system) through a neural network and want to pass it to JiTCODE to calculate trajectories, Lyapunov exponents, etc. As per the JiTCODE documentation, the function f has to be a symbolic function. Is there any way to change this since ultimately JiTCODE is going to lambdify the symbolic function?
Basically, this is what I'm doing right now:
# learns derivatives from the neural-net model
# returns an array of numbers [\dot{x},\dot{y}] for input [x,y]
learned_fn = lambda t, y0: NN_model(t, y0)
ODE = jitcode_lyap(learned_fn, n_lyap=2)
ODE.set_integrator("vode")
First, beware that JiTCODE does not take regular functions like your learned_fn as an input. It takes either iterables of symbolic expressions or generator functions returning symbolic expressions. This is why your example code will likely produce an error.
What you are asking for
You can “inject” any derivative with the right signature into JiTCODE by changing the f property and telling it that it failed compiling the actual derivative. Here is a minimal example doing this:
from jitcode import jitcode, y
ODE = jitcode([0])
ODE.f = lambda t,y: y[0]
ODE.compile_attempt = False
ODE.set_integrator("dopri5")
ODE.set_initial_value([1],0.0)
for time in range(30):
    print(time,*ODE.integrate(time))
Why you probably do not want to do this
Ignoring Lyapunov exponents for a second, the entire point of JiTCODE is to hard-code your derivative for you and pass it to SciPy’s ode or solve_ivp, which perform the actual integration. Thus the above example code is just an overly complicated way of passing a function to one of SciPy’s standard integrators (here ode), with no advantage. If your NN_model is very efficiently implemented in the first place, you may not even gain a speed boost from JiTCODE’s auto-compilation.
The main reason to use JiTCODE’s Lyapunov-exponent capabilities is that it automatically obtains the Jacobian and the ODE for the tangent-vector evolution (needed for the Benettin method) from the symbolic representation of the derivative. Without a symbolic input, it cannot possibly do this. You could theoretically inject a tangent-vector ODE as well, but then again you would leave little for JiTCODE to do and you would probably be better off using SciPy’s ode or solve_ivp directly.
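For comparison, handing a purely numerical derivative straight to SciPy can be sketched as follows. Here learned_fn is a made-up stand-in for the trained model (it implements dy/dt = y), not the NN_model from the question:

```python
import numpy as np
from scipy.integrate import solve_ivp

def learned_fn(t, y):
    # Hypothetical stand-in for the neural-network derivative: dy/dt = y.
    return np.asarray(y)

# Integrate from t = 0 to t = 1, starting at y(0) = 1.
solution = solve_ivp(learned_fn, (0.0, 1.0), [1.0], rtol=1e-8, atol=1e-10)
final_value = solution.y[0, -1]   # approximates exp(1)
```

This covers trajectories only; Lyapunov exponents would still need the tangent-vector dynamics supplied by hand.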
What you probably need
If you want to use JiTCODE, you need to write a small piece of code that translates the output of your neural-network training to a symbolic representation of your ODE as needed by JiTCODE. This is probably much less scary than it sounds. You just need to obtain the trained coefficients and insert them into the equations of the general form of the neural network.
If you are lucky and your NN_model fully supports duck typing (and ), you may do something like this:
from jitcode import t,y
n = 10 # dimension of your ODE
NN_input = [y(i) for i in range(n)]
learned_fn = NN_model(t,NN_input)[1]
The idea is that you feed NN_model once with abstract symbolic input (t and NN_input). NN_model then once acts on this abstract input providing you an abstract result (here you need the duck-typing support). If I interpreted the output of your NN_model correctly, the second component of this result should be the abstract derivative as required by JiTCODE as an input.
Note that your NN_model appears to expect dimensions to be indices, but JiTCODE’s y expects dimensions to be function arguments. Thus you cannot just choose NN_input = y, but you have to transform it as above.
To quote directly from the linked documentation:
JiTCODE takes an iterable (or generator function or dictionary) of symbolic expressions, which it translates to C code, compiles on the fly,
so there is no lambdification going on; the function is parsed, not just evaluated.
But in general that should be no problem: you just use the JiTCODE-provided symbolic vector y and symbol t instead of the function arguments t, y of the right-hand side of the ODE.
