TensorFlow Lambda layer fft2d - keras

I am building a CNN where the input is a grayscale image (256x256x1) and I want to add a Fourier transform layer which should output a shape (256x256x2), with the 2 channels for real and imaginary. I found tf.signal.fft2d on https://www.tensorflow.org/api_docs/python/tf/signal/fft2d . Unfortunately it is hard to find any example or explanation of how to use it concretely... I have tried:
X_input = Input(input_shape,)
X_input_fft=Lambda(lambda v: tf.cast(tf.compat.v1.spectral.rfft2d(v),dtype=tf.float32))(X_input)
l1Conv1 = Conv2D(filters = 16, kernel_size = (5,5), strides = 1, padding ='same',
                 data_format='channels_last',
                 kernel_initializer= initializers.he_normal(seed=None),
                 bias_initializer='zeros')(X_input_fft)
but honestly I don't know what I am doing ...
Also, for the last layer, I would like to do an inverse fft, something like:
myLastLayer= Lambda(lambda v: tf.cast(tf.compat.v1.spectral.irfft2d(tf.cast(v, dtype=tf.complex64)),dtype=tf.float32))(myBeforeLastLayer)

I'm sorry that the answer comes 2 years later, but I think it will help a lot of people dealing with TensorFlow's fft2d.
The first thing you should know is that, according to the documentation, TensorFlow computes fft2d over "the inner-most 2 dimensions of input", which simply means the last two dimensions. For a channels-last image of shape (height, width, channels) you therefore have to move the channel axis to the front before the transform and move it back afterwards.
A function that does what you need is this one. Note that it expects a single image without a batch dimension, because the permutation has three entries.
def fft2d_function(x, dtype = "complex64"):
    x = tf.transpose(x, perm = [2, 0, 1])  # (H, W, C) -> (C, H, W)
    x = tf.cast(x, dtype)  # fft2d needs a complex input
    x_f = tf.signal.fft2d(x)  # transforms the last two axes (H, W)
    x_f = tf.transpose(x_f, perm = [1, 2, 0])  # back to (H, W, C)
    real_x_f, imag_x_f = tf.math.real(x_f), tf.math.imag(x_f)
    return real_x_f, imag_x_f
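or, if you are sure that the input is a real signal, you can use rfft2d instead. Keep in mind that rfft2d returns only the non-redundant half of the spectrum, so the width of the result is W // 2 + 1 rather than W.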
def rfft2d_function(x):
    x = tf.transpose(x, perm = [2, 0, 1])
    x_f = tf.signal.rfft2d(x)
    x_f = tf.transpose(x_f, perm = [1, 2, 0])
    real_x_f, imag_x_f = tf.math.real(x_f), tf.math.imag(x_f)
    return real_x_f, imag_x_f
Besides, if you want to perform the inverse of these functions, it would look like this.
def ifft2d_function(x_r_i_tuple):
    real_x_f, imag_x_f = x_r_i_tuple
    x_f = tf.complex(real_x_f, imag_x_f)  # rebuild the complex spectrum
    x_f = tf.transpose(x_f, perm = [2, 0, 1])
    x_hat = tf.signal.ifft2d(x_f)  # result is complex; use tf.math.real(x_hat) for a real signal
    x_hat = tf.transpose(x_hat, perm = [1, 2, 0])
    return x_hat
def irfft2d_function(x_r_i_tuple):
    real_x_f, imag_x_f = x_r_i_tuple
    x_f = tf.complex(real_x_f, imag_x_f)
    x_f = tf.transpose(x_f, perm = [2, 0, 1])
    x_hat = tf.signal.irfft2d(x_f)  # result is real
    x_hat = tf.transpose(x_hat, perm = [1, 2, 0])
    return x_hat
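Finally, an important operation when working with Fourier transforms is fftshift, which moves the zero-frequency component to the center of the spectrum. TensorFlow also has it. By default it shifts every axis, so pass the axes argument if only the spatial dimensions should be shifted.
fourier_x = tf.signal.fftshift(fourier_x)
I hope this answer helps someone dealing with Fourier transforms in TensorFlow.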
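Inside a Keras model the tensors also carry a batch dimension, which the functions above do not handle. The following is only a minimal sketch of the same transpose, transform, transpose pattern for a (batch, height, width, channels) layout, written with NumPy as a stand-in because np.fft.fft2 also transforms the last two axes by default. The function names and the choice to concatenate the real and imaginary parts along the channel axis are assumptions, not part of the answer. In TensorFlow the corresponding calls would be tf.transpose, tf.signal.fft2d and tf.concat.

```python
import numpy as np

def fft2d_channels_last(x):
    """2-D FFT over height and width of a (batch, height, width, channels) array.

    Returns a real array of shape (batch, height, width, 2 * channels):
    real parts first, imaginary parts second (layout chosen for illustration).
    """
    x = np.transpose(x, (0, 3, 1, 2))        # -> (batch, channels, height, width)
    x_f = np.fft.fft2(x)                     # transforms the last two axes
    x_f = np.transpose(x_f, (0, 2, 3, 1))    # -> (batch, height, width, channels)
    return np.concatenate([x_f.real, x_f.imag], axis=-1)

def ifft2d_channels_last(y):
    """Inverse of fft2d_channels_last; returns the real part of the result."""
    channels = y.shape[-1] // 2
    x_f = y[..., :channels] + 1j * y[..., channels:]   # rebuild the complex spectrum
    x_f = np.transpose(x_f, (0, 3, 1, 2))
    x_hat = np.fft.ifft2(x_f)
    x_hat = np.transpose(x_hat, (0, 2, 3, 1))
    return x_hat.real

batch = np.random.rand(2, 256, 256, 1)       # two grayscale 256x256 images
spectrum = fft2d_channels_last(batch)
print(spectrum.shape)                        # (2, 256, 256, 2)
restored = ifft2d_channels_last(spectrum)
print(np.allclose(restored, batch))
```

With one input channel this gives the 256x256x2 layout asked for in the question, with channel 0 holding the real part and channel 1 the imaginary part.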


How to adjust the batch data by the amount of labels in PyTorch

I have made n-grams / doc-ids for document classification,
def create_dataset(tok_docs, vocab, n):
    n_grams = []
    document_ids = []
    for i, doc in enumerate(tok_docs):
        for n_gram in [doc[0][i:i+n] for i in range(len(doc[0]) - 1)]:
            n_grams.append(n_gram)
            document_ids.append(i)
    return n_grams, document_ids
def create_pytorch_datasets(n_grams, doc_ids):
    n_grams_tensor = torch.tensor(n_grams)
    doc_ids_tensor = torch.tensor(doc_ids)
    full_dataset = TensorDataset(n_grams_tensor, doc_ids_tensor)
    return full_dataset
create_dataset returns pair of (n-grams, document_ids) like below:
n_grams, doc_ids = create_dataset( ... )
train_data = create_pytorch_datasets(n_grams, doc_ids)
>>> train_data[0:100]
(tensor([[2076, 517, 54, 3647, 1182, 7086],
         [517, 54, 3647, 1182, 7086, 1149],
         ...
        ]),
 tensor([0, 0, 0, 0, 0, ..., 3, 3, 3]))
train_loader = DataLoader(train_data, batch_size = batch_size, shuffle = True)
The first tensor holds the n-grams and the second one the doc_ids.
But, as you know, the amount of training data per label varies with the length of the documents.
If a document is very long, there will be many pairs carrying its label in the training data.
I think this can cause overfitting, because the classification model will tend to assign inputs to the long documents.
So I want to draw input batches from a uniform distribution over the labels (doc_ids). How can I change the code above to do that?
P.S.
If train_data looks like the example below, I want each pair to be drawn with the probability shown next to it:
n-grams doc_ids
([1, 2, 3, 4], 1) ====> 0.33
([1, 3, 5, 7], 2) ====> 0.33
([2, 3, 4, 5], 3) ====> 0.33 * 0.25
([3, 5, 2, 5], 3) ====> 0.33 * 0.25
([6, 3, 4, 5], 3) ====> 0.33 * 0.25
([2, 3, 1, 5], 3) ====> 0.33 * 0.25
In PyTorch you can pass a sampler or a batch_sampler to the DataLoader to change how datapoints are sampled.
Docs on the DataLoader:
https://pytorch.org/docs/stable/data.html#data-loading-order-and-sampler
Documentation on the sampler: https://pytorch.org/docs/stable/data.html#torch.utils.data.Sampler
For instance, you can use the WeightedRandomSampler to assign a weight to every datapoint. The weight can be, for instance, the inverse of the document length, so that every document receives roughly the same total probability.
I would make the following modifications to the code:
def create_dataset(tok_docs, vocab, n):
    n_grams = []
    document_ids = []
    weights = []  # << list of weights for sampling
    for i, doc in enumerate(tok_docs):
        for n_gram in [doc[0][i:i+n] for i in range(len(doc[0]) - 1)]:
            n_grams.append(n_gram)
            document_ids.append(i)
            weights.append(1/len(doc[0]))  # << ngrams of long documents are sampled less often
    return n_grams, document_ids, weights
# num_samples is how many indices are drawn per epoch; with 1 the loader would yield a single sample
sampler = WeightedRandomSampler(weights, num_samples=len(weights), replacement=True)  # << create the sampler
train_loader = DataLoader(train_data, batch_size=batch_size, shuffle=False, sampler=sampler)  # << includes the sampler in the dataloader
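To check that per-sample weights give the probabilities from the table in the question, here is a small plain-Python sketch. It uses a slight variation on the weighting above: each sample is weighted by one over the number of samples that share its doc_id, so the per-document totals come out exactly equal. The helper name per_sample_weights is hypothetical.

```python
from collections import Counter

def per_sample_weights(doc_ids):
    """Weight each sample by 1 / (number of samples sharing its doc_id)."""
    counts = Counter(doc_ids)
    return [1.0 / counts[d] for d in doc_ids]

# doc_ids from the example table in the question
doc_ids = [1, 2, 3, 3, 3, 3]
weights = per_sample_weights(doc_ids)

# Normalise to see the sampling probability of each individual pair
total = sum(weights)
probabilities = [w / total for w in weights]

# Sum the probabilities per document: each document should end up with 1/3
mass = Counter()
for d, p in zip(doc_ids, probabilities):
    mass[d] += p

print(weights)
print(dict(mass))
```

A list like this can be passed as the weights argument of WeightedRandomSampler, which does not require the weights to sum to one.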

Is there any built-in function for making a 2D tensor from a 1D tensor using a specific calculation?

Hi, I'm a student who has just started with deep learning.
For example, I have a 1-D tensor x = [1, 2]. From it, I want to build a 2-D tensor y whose (i, j)th element has the value (x[i] - x[j]), i.e. y[0,:] = [0, 1], y[1,:] = [-1, 0].
Is there a built-in function like this in the PyTorch library?
Thanks.
Here the tensor needs the right dimensions to give the expected result, which you can achieve with torch.unsqueeze:
x = torch.tensor([1 , 2])
y = x - x.unsqueeze(1)
y
tensor([[ 0, 1],
[-1, 0]])
There are a few ways you could get this result; the cleanest I can think of uses broadcasting semantics.
x = torch.tensor([1, 2])
y = x.view(-1, 1) - x.view(1, -1)
which produces
y = tensor([[0, -1],
[1, 0]])
Note: I'll try to edit this answer and remove this note if the original question is clarified.
In your question you ask for y[i, j] = x[i] - x[j], which the above code produces.
You also say that you expect y to have values
y = tensor([[ 0, 1],
[-1, 0]])
which is actually y[i, j] = x[j] - x[i] as was posted in Dishin's answer. If you instead wanted the latter then you can use
y = x.view(1, -1) - x.view(-1, 1)
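When it is unclear which of the two index conventions a broadcasted expression produces, a plain-Python reference can be used to compare against. This is only a sketch for checking results on small inputs; the function name and the row_minus_col flag are made up for illustration.

```python
def pairwise_difference(x, row_minus_col=True):
    """Return the matrix of pairwise differences of a 1-D sequence.

    row_minus_col=True  -> y[i][j] = x[i] - x[j]
    row_minus_col=False -> y[i][j] = x[j] - x[i]
    """
    n = len(x)
    if row_minus_col:
        return [[x[i] - x[j] for j in range(n)] for i in range(n)]
    return [[x[j] - x[i] for j in range(n)] for i in range(n)]

x = [1, 2]
print(pairwise_difference(x))         # [[0, -1], [1, 0]]
print(pairwise_difference(x, False))  # [[0, 1], [-1, 0]]
```

The first result matches x.view(-1, 1) - x.view(1, -1), the second matches x.view(1, -1) - x.view(-1, 1).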

What does a tf placeholder function do

I am trying to figure out what this piece of code does, but I cannot work out how an image is passed to it and what it does to the image.
The main line of code is this one:
images1, images2 = preprocess(images, is_train, BATCH_SIZE, IMAGE_HEIGHT, IMAGE_WIDTH)
Pretty simple: it is a function that gets images, I would think.
Now, the parameter images is this:
images = tf.placeholder(tf.float32, [2, BATCH_SIZE, IMAGE_HEIGHT, IMAGE_WIDTH, 3], name='images')
is_train = tf.placeholder(tf.bool, name='is_train')
and this is the function for preprocess:
def preprocess(images, is_train, BATCH_SIZE, IMAGE_HEIGHT, IMAGE_WIDTH):
    def train():
        split = tf.split(images, [1, 1])
        shape = [1 for _ in range(split[0].get_shape()[1])]
        for i in range(len(split)):
            split[i] = tf.reshape(split[i], [BATCH_SIZE, IMAGE_HEIGHT, IMAGE_WIDTH, 3])
            split[i] = tf.image.resize_images(split[i], [IMAGE_HEIGHT + 8, IMAGE_WIDTH + 3])
            split[i] = tf.split(split[i], shape)
            for j in range(len(split[i])):
                split[i][j] = tf.reshape(split[i][j], [IMAGE_HEIGHT + 8, IMAGE_WIDTH + 3, 3])
                split[i][j] = tf.random_crop(split[i][j], [IMAGE_HEIGHT, IMAGE_WIDTH, 3])
                split[i][j] = tf.image.random_flip_left_right(split[i][j])
                split[i][j] = tf.image.random_brightness(split[i][j], max_delta=32. / 255.)
                split[i][j] = tf.image.random_saturation(split[i][j], lower=0.5, upper=1.5)
                split[i][j] = tf.image.random_hue(split[i][j], max_delta=0.2)
                split[i][j] = tf.image.random_contrast(split[i][j], lower=0.5, upper=1.5)
                split[i][j] = tf.image.per_image_standardization(split[i][j])
        return [tf.reshape(tf.concat(split[0], axis=0), [BATCH_SIZE, IMAGE_HEIGHT, IMAGE_WIDTH, 3]),
                tf.reshape(tf.concat(split[1], axis=0), [BATCH_SIZE, IMAGE_HEIGHT, IMAGE_WIDTH, 3])]
    def val():
        split = tf.split(images, [1, 1])
        shape = [1 for _ in range(split[0].get_shape()[1])]
        for i in range(len(split)):
            split[i] = tf.reshape(split[i], [BATCH_SIZE, IMAGE_HEIGHT, IMAGE_WIDTH, 3])
            split[i] = tf.image.resize_images(split[i], [IMAGE_HEIGHT, IMAGE_WIDTH])
            split[i] = tf.split(split[i], shape)
            for j in range(len(split[i])):
                split[i][j] = tf.reshape(split[i][j], [IMAGE_HEIGHT, IMAGE_WIDTH, 3])
                split[i][j] = tf.image.per_image_standardization(split[i][j])
        return [tf.reshape(tf.concat(split[0], axis=0), [BATCH_SIZE, IMAGE_HEIGHT, IMAGE_WIDTH, 3]),
                tf.reshape(tf.concat(split[1], axis=0), [BATCH_SIZE, IMAGE_HEIGHT, IMAGE_WIDTH, 3])]
    return tf.cond(is_train, train, val)
This is all the code that deals with the images:
if MODE == 'train':
    tarin_num_id = get_num_id(DATA_DIR, 'train')
elif MODE == 'eval':
    val_num_id = get_num_id(DATA_DIR, 'val')
images1, images2 = preprocess(images, is_train, BATCH_SIZE, IMAGE_HEIGHT, IMAGE_WIDTH)
I don't know how this processes the images and sends them to the network.
Thank you for any help with this.
The whole code I am working on comes from here
https://github.com/digitalbrain79/person-reid
The answer to this question is that the images are passed in through the feed_dict. A placeholder holds no data of its own; the feed_dict supplies its values when the graph is run, so it has to contain the images that are needed.
feed_dict = {images: test_images, is_train: False}
You load the images into an array such as test_images and then pass it in the feed_dict. This saves time, because you can feed different images without changing much of the code for training, validation, or testing.
Thank you #Chetan Vashisth for pointing me to the feed_dict dictionary.

Tensorflow histogram with custom bins

I have two tensors - one with bin specification and the other one with observed values. I'd like to count how many values are in each bin.
I know how to do this in either NumPy or bare Python, but I need to do this in pure TensorFlow. Is there a more sophisticated version of tf.histogram_fixed_width with an argument for bin specification?
Example:
# Input - 3 bins and 2 observed values
bin_spec = [0, 0.5, 1, 2]
values = [0.1, 1.1]
# Histogram
[1, 0, 1]
This seems to work, although I consider it to be quite memory- and time-consuming, since it builds a boolean matrix with one row per bin edge and one column per value.
import tensorflow as tf
bins = [-1000, 1, 3, 10000]
vals = [-3, 0, 2, 4, 5, 10, 12]
vals = tf.constant(vals, dtype=tf.float64, name="values")
bins = tf.constant(bins, dtype=tf.float64, name="bins")
resh_bins = tf.reshape(bins, shape=(-1, 1), name="bins-reshaped")
resh_vals = tf.reshape(vals, shape=(1, -1), name="values-reshaped")
left_bin = tf.less_equal(resh_bins, resh_vals, name="left-edge")
right_bin = tf.greater(resh_bins, resh_vals, name="right-edge")
resu = tf.logical_and(left_bin[:-1, :], right_bin[1:, :], name="bool-bins")
# tf.cast replaces tf.to_float, which is deprecated
counts = tf.reduce_sum(tf.cast(resu, tf.float32), axis=1, name="count-in-bins")
with tf.Session() as sess:
    print(sess.run(counts))
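The comparison logic can be checked outside a TensorFlow session with NumPy, which follows the same broadcasting rules. This is only a sketch that mirrors the approach above (left edge inclusive, right edge exclusive), not a TensorFlow solution; the function name is hypothetical.

```python
import numpy as np

def histogram_custom_bins(values, bin_edges):
    """Count values per bin; each bin is [left_edge, right_edge)."""
    values = np.asarray(values, dtype=np.float64).reshape(1, -1)    # (1, n_values)
    edges = np.asarray(bin_edges, dtype=np.float64).reshape(-1, 1)  # (n_edges, 1)
    at_or_above_left = edges[:-1] <= values   # (n_bins, n_values) via broadcasting
    below_right = edges[1:] > values          # (n_bins, n_values)
    return np.sum(at_or_above_left & below_right, axis=1)

# Example from the question: 3 bins and 2 observed values
counts = histogram_custom_bins([0.1, 1.1], [0, 0.5, 1, 2])
print(counts)  # [1 0 1]
```

Values that lie outside all bins, or exactly on the last right edge, are not counted in any bin.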

Clip parts of a tensor

I have a theano tensor and I would like to clip its values, but each index to a different range.
For example, if I have a vector [a,b,c] , I want to clip a to [0,1] , clip b to [2,3] and c to [3,5].
How can I do that efficiently?
Thanks!
The theano.tensor.clip operation supports symbolic minimum and maximum values so you can pass three tensors, all of the same shape, and it will perform an element-wise clip of the first with respect to the second (minimum) and third (maximum).
This code shows two variations on this theme. v1 requires the minimum and maximum values to be passed as separate vectors while v2 allows the minimum and maximum values to be passed more like a list of pairs, represented as a two column matrix.
import theano
import theano.tensor as tt
def v1():
    # minimum and maximum passed as two separate vectors
    x = tt.vector()
    min_x = tt.vector()
    max_x = tt.vector()
    y = tt.clip(x, min_x, max_x)
    f = theano.function([x, min_x, max_x], outputs=y)
    print(f([2, 1, 4], [0, 2, 3], [1, 3, 5]))
def v2():
    # minimum and maximum passed as the two columns of one matrix
    x = tt.vector()
    min_max = tt.matrix()
    y = tt.clip(x, min_max[:, 0], min_max[:, 1])
    f = theano.function([x, min_max], outputs=y)
    print(f([2, 1, 4], [[0, 1], [2, 3], [3, 5]]))
def main():
    v1()
    v2()
main()
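The expected result of both variants can be checked with NumPy, whose np.clip also accepts array-valued bounds and applies them element by element. This is a stand-in for illustration only, not Theano code; the variable names are chosen here.

```python
import numpy as np

x = np.array([2.0, 1.0, 4.0])
lower = np.array([0.0, 2.0, 3.0])   # per-element minimum
upper = np.array([1.0, 3.0, 5.0])   # per-element maximum

# Same layout as v1: separate minimum and maximum vectors
clipped = np.clip(x, lower, upper)
print(clipped)  # [1. 2. 4.]

# Same layout as v2: one (n, 2) matrix of [min, max] pairs
min_max = np.array([[0.0, 1.0], [2.0, 3.0], [3.0, 5.0]])
clipped_v2 = np.clip(x, min_max[:, 0], min_max[:, 1])
print(clipped_v2)  # [1. 2. 4.]
```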
