I'm trying to create the model shown below with PyMC 3 but can't figure out how to properly map probabilities to the observed data with a lambda function.
import numpy as np
import pymc as pm
data = np.array([[0, 0, 1, 1, 2],
                 [0, 1, 2, 2, 2],
                 [2, 2, 1, 1, 0],
                 [1, 1, 2, 0, 1]])
(D, W) = data.shape
V = len(set(data.ravel()))
T = 3
a = np.ones(T)
b = np.ones(V)
with pm.Model() as model:
    theta = [pm.Dirichlet('theta_%s' % i, a, shape=T) for i in range(D)]
    z = [pm.Categorical('z_%i' % i, theta[i], shape=W) for i in range(D)]
    phi = [pm.Dirichlet('phi_%i' % i, b, shape=V) for i in range(T)]
    w = [pm.Categorical('w_%i_%i' % (i, j),
                        p=lambda z=z[i][j], phi_=phi: phi_[z],  # Error is here
                        observed=data[i, j])
         for i in range(D) for j in range(W)]
The error I get is
AttributeError: 'function' object has no attribute 'shape'
In the model I'm attempting to build, the elements of z indicate which element in phi gives the probability of the corresponding observed value in data (placed in RV w). In other words,
P(data[i,j]) <- phi[z[i,j]][data[i,j]]
I'm guessing I need to define the probability with a Theano expression or use Theano as_op but I don't see how it can be done for this model.
You should specify your categorical p values as Deterministic objects before passing them on to w. Otherwise, the as_op implementation would look something like this:
@theano.compile.ops.as_op(itypes=[t.lscalar, t.dscalar, t.dscalar], otypes=[t.dvector])
def p(z=z, phi=phi):
    return [phi[z[i,j]] for i in range(D) for j in range(W)]
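The indexing the model needs, P(data[i,j]) = phi[z[i,j]][data[i,j]], can be checked outside PyMC first. What follows is only a plain NumPy sketch: z and phi are replaced by made-up fixed arrays (in the real model they are random variables), so it illustrates the shapes involved and not the PyMC API.

```python
import numpy as np

# Observed word ids from the question: D documents, W words each.
data = np.array([[0, 0, 1, 1, 2],
                 [0, 1, 2, 2, 2],
                 [2, 2, 1, 1, 0],
                 [1, 1, 2, 0, 1]])
D, W = data.shape
T, V = 3, 3

# Hypothetical fixed values standing in for the random variables.
rng = np.random.default_rng(0)
phi = rng.dirichlet(np.ones(V), size=T)  # shape (T, V): one word distribution per topic
z = rng.integers(0, T, size=(D, W))      # shape (D, W): one topic index per word

# phi[z] has shape (D, W, V): the word distribution selected for each position.
p = phi[z]

# Probability of each observed word: phi[z[i, j]][data[i, j]].
likelihood = phi[z, data]                # shape (D, W)
print(p.shape, likelihood.shape)
```

If phi were stacked into a single (T, V) tensor in the model, the same kind of integer indexing is presumably what the Deterministic (or as_op) version of p would have to express.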
I am trying to convert the rows [0-1] of a matrix to representation in number (binary equivalent), the code I have is the following:
import numpy as np
def generate_binary_matrix(matrix):
    result = []
    for i in matrix:
        val = '0b' + ''.join([str(x) for x in i])
        result.append(int(val, 2))
    result = np.array(result)
    return result
initial_matrix = np.array([[0, 1, 0], [1, 0, 0], [0, 0, 1]])
result = generate_binary_matrix(initial_matrix )
print(result)
This code works, but it is very slow. Does anyone know a faster way to do it?
You can convert a 0/1 list to its integer value using just arithmetic, which should be faster (here i is one row of the matrix, as in your loop):
from functools import reduce
b = reduce(lambda r, x: 2*r + x, i)
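Applied to the whole matrix, the reduce approach might look like the sketch below. The function name rows_to_ints is made up for illustration, and the explicit initial value 0 is an addition so that an empty row yields 0 instead of raising an error.

```python
from functools import reduce

def rows_to_ints(matrix):
    """Convert each row of 0/1 values to the integer it encodes (MSB first)."""
    # For every bit: shift the running value left by one and add the new bit.
    return [reduce(lambda r, x: 2 * r + x, row, 0) for row in matrix]

initial_matrix = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]
result = rows_to_ints(initial_matrix)
print(result)  # [2, 4, 1]
```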
Suppose your NumPy matrix is A, with m rows and n columns.
Create a vector b with n elements:
b = np.power(2, np.arange(n))[::-1]
Then your answer is A @ b.
Example:
import numpy as np
A = np.array([[0, 0, 1], [1, 0, 1]])
n = A.shape[1]
b = np.power(2, np.arange(n))[::-1]
print(A @ b)  # --> [1 5]
Update: I reversed b because the MSB (2^(n-1)) is in A[:, 0], fixed the arguments to np.power, which were mistakenly flipped, and added an example.
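As a quick sanity check, the matrix-product version can be compared against the string-based conversion from the question. This is just a sketch on the question's 3x3 matrix; the names weights, fast and slow are chosen here for illustration.

```python
import numpy as np

initial_matrix = np.array([[0, 1, 0], [1, 0, 0], [0, 0, 1]])
n = initial_matrix.shape[1]

# Weights 2^(n-1), ..., 2, 1 so that column 0 is the most significant bit.
weights = np.power(2, np.arange(n))[::-1]
fast = initial_matrix @ weights

# Reference: the string-based conversion from the question.
slow = np.array([int(''.join(str(x) for x in row), 2) for row in initial_matrix])

print(fast)  # [2 4 1]
```

One caveat: NumPy integers have a fixed width, so for very long rows (more than about 60 columns) the weights overflow, whereas the pure-Python versions do not.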
Hi, I'm a student who has just started with deep learning.
For example, I have a 1-D tensor x = [1, 2]. From this, I want to make a 2-D tensor y whose (i, j)th element has the value x[i] - x[j], i.e. y[0,:] = [0, 1], y[1,:] = [-1, 0].
Is there a built-in function like this in the PyTorch library?
Thanks.
Here you need the right tensor dimensions to get the expected result, which you can get using torch.unsqueeze:
x = torch.tensor([1 , 2])
y = x - x.unsqueeze(1)
y
tensor([[ 0, 1],
[-1, 0]])
There are a few ways you could get this result; the cleanest I can think of is using broadcasting semantics.
x = torch.tensor([1, 2])
y = x.view(-1, 1) - x.view(1, -1)
which produces
y = tensor([[0, -1],
[1, 0]])
Note: I'll try to edit this answer and remove this note if the original question is clarified.
In your question you ask for y[i, j] = x[i] - x[j], which the above code produces.
You also say that you expect y to have values
y = tensor([[ 0, 1],
[-1, 0]])
which is actually y[i, j] = x[j] - x[i] as was posted in Dishin's answer. If you instead wanted the latter then you can use
y = x.view(1, -1) - x.view(-1, 1)
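To see which operand order gives which sign convention, the two variants can be compared against explicit loops. The sketch below uses NumPy rather than PyTorch (reshape standing in for view); the broadcasting of an (n, 1) column against a (1, n) row follows the same rules.

```python
import numpy as np

x = np.array([1, 2])
n = len(x)

# Reference definitions written out with explicit loops.
i_minus_j = [[int(x[i] - x[j]) for j in range(n)] for i in range(n)]
j_minus_i = [[int(x[j] - x[i]) for j in range(n)] for i in range(n)]

# Column (n, 1) minus row (1, n) gives y[i, j] = x[i] - x[j].
col_minus_row = x.reshape(-1, 1) - x.reshape(1, -1)
# Swapping the operands gives y[i, j] = x[j] - x[i].
row_minus_col = x.reshape(1, -1) - x.reshape(-1, 1)

print(col_minus_row.tolist())  # [[0, -1], [1, 0]]
print(row_minus_col.tolist())  # [[0, 1], [-1, 0]]
```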
I want to generate a multi class test dataset using numpy only for a classification problem.
For example, X is a NumPy array of shape (m x n), y has shape (m x 1), and there are k classes. Please help me with the code.
[Here X represents the features and y represents the labels]
You can use np.random.randint like:
import numpy as np
m = 4
n = 4
k = 5
X = np.random.randint(0,2,(m,n))
X
array([[1, 1, 1, 1],
[1, 0, 0, 1],
[1, 1, 0, 0],
[1, 1, 1, 1]])
y = np.random.randint(0,k,m)
y
array([3, 3, 0, 4])
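If reproducible output matters, a seeded generator can be used instead. The following is a sketch under a few assumptions not in the answer above: the function name make_random_dataset is made up, the features are continuous values in [0, 1) rather than 0/1 integers, and the labels are drawn independently of the features, so the data is only suitable for testing shapes and plumbing, not for evaluating a classifier.

```python
import numpy as np

def make_random_dataset(m, n, k, seed=0):
    """Return X of shape (m, n) and integer labels y of shape (m,) in [0, k)."""
    rng = np.random.default_rng(seed)  # seeded so repeated calls give the same data
    X = rng.random((m, n))             # uniform features in [0, 1)
    y = rng.integers(0, k, size=m)     # class labels 0 .. k-1
    return X, y

X, y = make_random_dataset(m=4, n=4, k=5)
print(X.shape, y.shape)  # (4, 4) (4,)
```

Use y.reshape(-1, 1) if the labels are needed as an (m x 1) column.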
You can create a multi-class dataset using NumPy as follows:
import numpy as np
def generate_dataset(size, classes=2, noise=0.5):
    # Generate random datapoints
    labels = np.random.randint(0, classes, size)
    x = (np.random.rand(size) + labels) / classes
    y = x + np.random.rand(size) * noise
    # Reshape data in order to merge them
    x = x.reshape(size, 1)
    y = y.reshape(size, 1)
    labels = labels.reshape(size, 1)
    # Merge the data
    data = np.hstack((x, y, labels))
    return data
When visualised with matplotlib, the generated data looks like this: [figure not included]
You can change the number of classes and the spread of the data using the classes and noise parameters. Here I have kept a linear relation between the x-axis and y-axis values, which can also be changed as required.
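A possible way to use the result is sketched below. The function is repeated in compact form so the snippet runs on its own; the seed, the size of 200 and the names X and y are choices made here for illustration. Note that np.hstack promotes the integer labels to floats, so they are cast back when split out.

```python
import numpy as np

def generate_dataset(size, classes=2, noise=0.5):
    # Same logic as above, condensed: each class occupies its own band of x values.
    labels = np.random.randint(0, classes, size)
    x = (np.random.rand(size) + labels) / classes
    y = x + np.random.rand(size) * noise
    return np.hstack((x.reshape(size, 1), y.reshape(size, 1), labels.reshape(size, 1)))

np.random.seed(0)  # fixed seed, only so the example is repeatable
data = generate_dataset(200, classes=3, noise=0.2)

X = data[:, :2]              # two feature columns
y = data[:, 2].astype(int)   # labels come back as floats after hstack
print(X.shape, y.shape)      # (200, 2) (200,)
```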
I have two tensors - one with bin specification and the other one with observed values. I'd like to count how many values are in each bin.
I know how to do this in either NumPy or bare Python, but I need to do this in pure TensorFlow. Is there a more sophisticated version of tf.histogram_fixed_width with an argument for bin specification?
Example:
# Input - 3 bins and 2 observed values
bin_spec = [0, 0.5, 1, 2]
values = [0.1, 1.1]
# Histogram
[1, 0, 1]
This seems to work, although I consider it to be quite memory- and time-consuming.
import tensorflow as tf
bins = [-1000, 1, 3, 10000]
vals = [-3, 0, 2, 4, 5, 10, 12]
vals = tf.constant(vals, dtype=tf.float64, name="values")
bins = tf.constant(bins, dtype=tf.float64, name="bins")
resh_bins = tf.reshape(bins, shape=(-1, 1), name="bins-reshaped")
resh_vals = tf.reshape(vals, shape=(1, -1), name="values-reshaped")
left_bin = tf.less_equal(resh_bins, resh_vals, name="left-edge")
right_bin = tf.greater(resh_bins, resh_vals, name="right-edge")
resu = tf.logical_and(left_bin[:-1, :], right_bin[1:, :], name="bool-bins")
counts = tf.reduce_sum(tf.to_float(resu), axis=1, name="count-in-bins")
with tf.Session() as sess:
    print(sess.run(counts))
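The broadcasting logic can be traced step by step in NumPy, which is used here only to illustrate what the TensorFlow ops above compute; the function name bin_counts is made up. Each bin is treated as the half-open interval [left edge, right edge).

```python
import numpy as np

def bin_counts(bins, vals):
    """Count the values falling in each half-open bin [bins[b], bins[b+1])."""
    bins = np.asarray(bins, dtype=float).reshape(-1, 1)  # column of edges
    vals = np.asarray(vals, dtype=float).reshape(1, -1)  # row of values
    left = bins <= vals   # left[e, v]: edge e is at or below value v
    right = bins > vals   # right[e, v]: edge e is above value v
    # Value v is in bin b when edge b <= v and edge b+1 > v.
    in_bin = left[:-1, :] & right[1:, :]
    return in_bin.sum(axis=1)

print(bin_counts([-1000, 1, 3, 10000], [-3, 0, 2, 4, 5, 10, 12]))  # [2 1 4]
print(bin_counts([0, 0.5, 1, 2], [0.1, 1.1]))                      # [1 0 1]
```

The intermediate boolean arrays have one row per edge and one column per value, which is where the memory cost mentioned above comes from.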
I have a theano tensor and I would like to clip its values, but each index to a different range.
For example, if I have a vector [a,b,c] , I want to clip a to [0,1] , clip b to [2,3] and c to [3,5].
How can I do that efficiently?
Thanks!
The theano.tensor.clip operation supports symbolic minimum and maximum values so you can pass three tensors, all of the same shape, and it will perform an element-wise clip of the first with respect to the second (minimum) and third (maximum).
This code shows two variations on this theme. v1 requires the minimum and maximum values to be passed as separate vectors while v2 allows the minimum and maximum values to be passed more like a list of pairs, represented as a two column matrix.
import theano
import theano.tensor as tt
def v1():
    x = tt.vector()
    min_x = tt.vector()
    max_x = tt.vector()
    y = tt.clip(x, min_x, max_x)
    f = theano.function([x, min_x, max_x], outputs=y)
    print(f([2, 1, 4], [0, 2, 3], [1, 3, 5]))
def v2():
    x = tt.vector()
    min_max = tt.matrix()
    y = tt.clip(x, min_max[:, 0], min_max[:, 1])
    f = theano.function([x, min_max], outputs=y)
    print(f([2, 1, 4], [[0, 1], [2, 3], [3, 5]]))
def main():
    v1()
    v2()
main()
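For reference, the result both variants should produce can be checked with NumPy, whose np.clip also accepts array-valued bounds and clips element-wise. This is a NumPy sketch of the expected values, not Theano code; it uses the same two-column layout as v2.

```python
import numpy as np

x = np.array([2.0, 1.0, 4.0])
# One (min, max) pair per element of x.
min_max = np.array([[0.0, 1.0], [2.0, 3.0], [3.0, 5.0]])

# Element i of x is clipped to [min_max[i, 0], min_max[i, 1]].
clipped = np.clip(x, min_max[:, 0], min_max[:, 1])
print(clipped)  # [1. 2. 4.]
```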