Based on the model presented in "Sharp U-Net: Depthwise convolutional network for biomedical image segmentation", I want to apply a sharpening filter to the encoder features instead of simply concatenating the encoder and decoder outputs. I have been working on this for a few days and have tried almost everything I could find on the Internet. I defined the sharpening filter, convolved it with the output of the first encoder layer, and then concatenated the result with the fourth decoder layer. The final version of the code, attached here, gives an odd error that I could not resolve.
Any help will be appreciated.
This is how I defined my filter:
sharpen_filter = np.array(([-2, -2, -2], [-2, 17, -2], [-2, -2, -2]), dtype='int')

class Constant(tf.keras.initializers.Initializer):
    def __init__(self, filter_matrix):
        self.filter_matrix = filter_matrix

    def __call__(self, shape, dtype=None):
        filter_matrix = np.zeros((3, 3, 64))
        for i in range(64):
            filter_matrix[:, :, i] = self.filter_matrix
        return K.constant(filter_matrix, shape=shape, dtype=dtype)
This is the part where I convolve the sharpening filter with enc1 and then concatenate the result with dec4:
dec4 = Conv2DTranspose(1, 3, strides=2, padding='same')(dec4)
conv_enc1 = keras.layers.Conv2D(64, kernel_size=3, activation='relu',
concat_enc1_dec5 = concatenate(([conv_enc1, dec4]))
Every approach I tried resulted in a different error; this one gives:
Tensor (it's a very long tensor) has 576 elements, but got shape (3, 3, 64, 64) with 36864 elements.
This error comes from this line of code:
return K.constant(filter_matrix, shape=shape, dtype=dtype)
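The two element counts in the error can be checked by hand: the initializer builds an array of shape (3, 3, 64), which has 3 * 3 * 64 = 576 elements, while the layer asks for a kernel of shape (3, 3, 64, 64), which has 36864. Below is a minimal NumPy-only sketch of one way to build a kernel that always matches the requested shape. The helper name build_kernel is hypothetical, and the (3, 3, 64, 1) shape assumes a Keras-style depthwise kernel layout of (height, width, channels, depth_multiplier).

```python
import numpy as np

sharpen_filter = np.array([[-2, -2, -2], [-2, 17, -2], [-2, -2, -2]], dtype="float32")


def build_kernel(filter_matrix, shape):
    """Repeat a 2-D filter across all trailing axes of the requested kernel shape."""
    kernel = np.zeros(shape, dtype="float32")
    # Reshape (3, 3) to (3, 3, 1, ..., 1) so it broadcasts over the channel axes.
    trailing_ones = (1,) * (len(shape) - filter_matrix.ndim)
    kernel[...] = filter_matrix.reshape(filter_matrix.shape + trailing_ones)
    return kernel


# What the initializer in the question builds vs. what the layer requests.
built = np.zeros((3, 3, 64))
requested_shape = (3, 3, 64, 64)
print(built.size, int(np.prod(requested_shape)))  # 576 36864

# Kernel for a regular convolution with 64 input and 64 output channels.
conv_kernel = build_kernel(sharpen_filter, requested_shape)
# Kernel for a depthwise convolution over 64 channels (assumed layout).
depthwise_kernel = build_kernel(sharpen_filter, (3, 3, 64, 1))
```

Inside an initializer, the same idea would mean building the array from the shape argument instead of hard-coding (3, 3, 64).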


PyTorch LSTM: using different sequence lengths for input and target

I want to use a denser time series to predict a less dense time series.
I first had an input (X) with shape [33405, 4, 25] and a target (Y) with shape [33405, 4, 7], where 33405 is the number of samples, 4 is the sequence length, and 25 and 7 are the feature sizes of the input and the target. They thus had the same sequence length.
I used the following model:
import torch.nn as nn

class LSTMModel(nn.Module):
    def __init__(self, input_size, hidden_size, output_size, num_layers=1, dropout=0, activation='tanh'):
        super().__init__()  # required before registering submodules
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers=num_layers, batch_first=True)
        self.dropout = nn.Dropout(dropout)
        self.linear = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        x, _ = self.lstm(x)      # [batch, seq_len, hidden_size]
        x = self.dropout(x)
        x = self.linear(x)       # [batch, seq_len, output_size]
        return x
This correctly gave me an output of shape [batch_size, 4, 7].
However, I want to do something similar, but now use a more dense time series (sequence length 92) to predict the same sequence of 4. This means that I have an input X that has a sequence length of 92 and a target Y that has a sequence length of 4. My input (X) now has shape [33540, 92, 7] and target (Y) shape [33540, 4, 7].
I use the same model, but now my output has shape [4, 92, 7]. However, I want it, again, to be [batch_size, 4, 7].
I’m a newbie with LSTM and RNNs. Is it possible to work with an X and Y that have different sequence lengths? If so, how should I alter my model to get the desired output?
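One possible way to get a [batch_size, 4, 7] output from a length-92 input is to let the LSTM read the whole input sequence, keep only its final hidden state, and project that state to 4 * 7 values that are then reshaped. This is only a sketch under that assumption, not the only option (an encoder-decoder model is another). The class name Seq2FixedLSTM, the parameter out_steps, and the hidden size of 32 are made up for illustration.

```python
import torch
import torch.nn as nn


class Seq2FixedLSTM(nn.Module):
    """Encode a long input sequence, then emit a fixed number of output steps."""

    def __init__(self, input_size, hidden_size, output_size, out_steps,
                 num_layers=1, dropout=0.0):
        super().__init__()
        self.out_steps = out_steps
        self.output_size = output_size
        self.lstm = nn.LSTM(input_size, hidden_size,
                            num_layers=num_layers, batch_first=True)
        self.dropout = nn.Dropout(dropout)
        # One linear layer produces all output steps at once.
        self.linear = nn.Linear(hidden_size, out_steps * output_size)

    def forward(self, x):
        # x: [batch, in_steps, input_size]
        _, (h_n, _) = self.lstm(x)
        last = self.dropout(h_n[-1])   # final hidden state of the top layer: [batch, hidden_size]
        out = self.linear(last)        # [batch, out_steps * output_size]
        return out.view(-1, self.out_steps, self.output_size)


model = Seq2FixedLSTM(input_size=7, hidden_size=32, output_size=7, out_steps=4)
x = torch.randn(8, 92, 7)   # batch of 8 sequences, 92 steps, 7 features
y_hat = model(x)
print(y_hat.shape)           # torch.Size([8, 4, 7])
```

The output length is decoupled from the input length because only the final hidden state is used, so the same model also accepts inputs with a different number of steps.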

keras input not matching output confusion

I am trying to implement a simple learning-to-rank example. Essentially, I have 2 descriptions of each location: size and number of bathrooms. I want to "combine" them to create a score, and then compare the scores to find the "best". I will always be comparing 3 locations at a time.
The neural network I expect to do this:
# 3 locations with 2 descriptions.
rinputs = Input(shape=(3, 2), name ='inputlayer')
# take my 3 expected inputs, split them
split = Lambda( lambda x: tf.split(x,num_or_size_splits=3,axis=1))(rinputs)
input_one_tensor = split[0]
input_two_tensor = split[1]
input_three_tensor = split[2]
# combine each set of location elements into 1 "score"
layer2 = Dense(1, name = 'Layer2', use_bias = True, activation = 'sigmoid') # 60 was better than 100
layer2a = layer2(input_one_tensor)
layer2b = layer2(input_two_tensor)
layer2c = layer2(input_three_tensor)
concatLayer = Concatenate(name = 'ConcatLayer2')([layer2a,layer2b, layer2c])
# softmax my score to get "best selection"
softmaxLayer = Dense(3, activation='softmax', name = 'softmax', use_bias = False)
softmaxLayer = softmaxLayer(concatLayer)
model = Model(inputs=rinputs, outputs=softmaxLayer)
model.compile(loss=keras.losses.categorical_crossentropy, optimizer=keras.optimizers.Adam(),metrics=['accuracy'])
I now create my test data:
loc1 = [1, 5]
loc2 = [4, 1]
loc3 = [6, 7]
# create two entries for my trial run
inputs = np.asarray([[loc1, loc2, loc3], [loc3,loc3,loc1]]).reshape(2,3,2)
ytrue = np.asarray([[1, 0, 0], [0, 0, 1]]).reshape(2,3)
model.fit(inputs, ytrue, verbose=True)
But then I get the following error about my outputs, which I don't understand:
File "/.virtualenvs/python310/lib/python3.10/site-packages/keras/", line 1990, in categorical_crossentropy
return backend.categorical_crossentropy(
File "/.virtualenvs/python310/lib/python3.10/site-packages/keras/", line 5529, in categorical_crossentropy
ValueError: Shapes (None, 3) and (None, 1, 3) are incompatible
I don't entirely understand why the shapes don't match. I expect my softmax layer to output 3 numbers that sum to 1 and can be compared to my ytrue.
Any insights appreciated.
Judging from the model architecture alone, it seems you just need to feed two-dimensional data into Layer2.
You can use a Reshape or Flatten layer to fix it.
By reshaping each output of the Lambda layer from (None, 1, 2) to (None, 2), the final output's shape becomes (None, 3), which is compatible with ytrue.
Additional notes:
As an example borrowed (with some modifications) from the TensorFlow website, let's assume we want to split an input tensor of shape (3, 2) into 3 smaller tensors along its first axis (axis=0 here, since this example has no batch dimension; with a batch dimension the same axis is axis=1):
x = tf.Variable(tf.random.uniform([3, 2], -1, 1))
s0, s1, s2 = tf.split(x, num_or_size_splits=3, axis=0)
Each of the smaller tensor splits has shape (1, 2), i.e. it is a 2D tensor consistent with the tensor it is derived from, not a vector of shape (2,). In the context of your problem, for a batch, that would be (None, 1, 2).
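To make the shape bookkeeping concrete, here is a small NumPy sketch that mimics the batched case. It uses np.split as a stand-in for tf.split and a plain matrix product as a stand-in for the shared Dense(1) layer; the weight matrix w and the sample values are made up for illustration.

```python
import numpy as np

# Two samples, three locations each, two features per location.
batch = np.arange(12, dtype="float32").reshape(2, 3, 2)

# Splitting along axis=1 keeps the split axis, so each part is 3-D.
parts = np.split(batch, 3, axis=1)
print([p.shape for p in parts])   # [(2, 1, 2), (2, 1, 2), (2, 1, 2)]

# Dropping the singleton axis gives the 2-D input a Dense layer usually expects.
flat = [p.reshape(-1, 2) for p in parts]
print([p.shape for p in flat])    # [(2, 2), (2, 2), (2, 2)]

w = np.ones((2, 1), dtype="float32")   # stand-in for the Dense(1) weights

# Without the reshape the extra axis survives the concatenation ...
scores_3d = np.concatenate([p @ w for p in parts], axis=-1)
# ... with the reshape the result lines up with labels of shape (batch, 3).
scores_2d = np.concatenate([p @ w for p in flat], axis=-1)
print(scores_3d.shape, scores_2d.shape)   # (2, 1, 3) (2, 3)
```

The (2, 1, 3) result corresponds to the (None, 1, 3) in the error message, and the (2, 3) result corresponds to the (None, 3) shape of ytrue.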

Is there a way to create my own batch of images?

I built a convolutional neural network model and I want to test it on a live camera feed. I assume it will not work if I feed it frame by frame, because the input shape of the CNN does not match a single frame. For example, the input of the network looks like this:
(50000, 32, 32, 3)
but a single frame has a shape like this:
(32, 32, 3)
So I want to know if there is a way to create a batch of frames, say, take every 5 frames together, test them with the model, then take the next 5 frames, and so on. Or could I repeat each frame and put the copies together into a batch to run the test? I don't know whether this is possible or whether there is a better way. Thanks.
I think you mean this:
f0 = np.zeros((32, 32, 3), dtype=np.uint8)
f1 = np.zeros((32, 32, 3), dtype=np.uint8) + 1
f2 = np.zeros((32, 32, 3), dtype=np.uint8) + 2
f3 = np.zeros((32, 32, 3), dtype=np.uint8) + 3
f4 = np.zeros((32, 32, 3), dtype=np.uint8) + 4
arrayOfArrays = np.array([f0,f1,f2,f3,f4])
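The leading 50000 in (50000, 32, 32, 3) is usually just the number of samples, so a model trained on that data typically accepts any batch size, including 1. Here is a small sketch of both options: adding a batch axis to a single frame, and grouping a stream of frames into batches of 5. The helper name frames_to_batches is made up for illustration, and the zero-filled frames stand in for real camera frames.

```python
import numpy as np


def frames_to_batches(frames, batch_size=5):
    """Yield arrays of shape (n, H, W, C) with n <= batch_size from an iterable of frames."""
    buffer = []
    for frame in frames:
        buffer.append(frame)
        if len(buffer) == batch_size:
            yield np.stack(buffer)
            buffer = []
    if buffer:  # leftover frames that did not fill a whole batch
        yield np.stack(buffer)


frame = np.zeros((32, 32, 3), dtype=np.uint8)

# Option 1: turn one frame into a batch of size 1.
single = np.expand_dims(frame, axis=0)
print(single.shape)   # (1, 32, 32, 3)

# Option 2: group 12 frames into batches of 5, 5 and 2.
batches = list(frames_to_batches([frame] * 12, batch_size=5))
print([b.shape for b in batches])   # [(5, 32, 32, 3), (5, 32, 32, 3), (2, 32, 32, 3)]
```

Either array can then be passed to the model's prediction call, which returns one prediction per frame in the batch.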
It may be achieved with a custom dataset. Something like this.
from torch.utils.data import Dataset

class Live_Video_Dataset(Dataset):
    def __init__(self, transform=None):
        self.transform = transform
        self.video_buffer = []
        # Fill the video buffer with frames from the camera asynchronously here.

    def __len__(self):
        return len(self.video_buffer)

    def __getitem__(self, idx):
        sample = self.video_buffer[idx]
        if sample is None:
            # Handle the missing frame here.
            raise ValueError(f"No frame available at index {idx}")
        if self.transform:
            sample = self.transform(sample)
        return sample
Since you did not specify the ML framework you are using, I assumed that you are using PyTorch. But I think that the logic is clear, you can adapt it to any ML framework. Hope it helps. Good luck.

CoreML: creating a custom layer for ONNX RandomNormal

I've trained a VAE in PyTorch that I need to convert to CoreML. Following the thread "PyTorch VAE fails conversion to onnx", I was able to get the ONNX model to export; however, this just pushed the problem one step further, to the ONNX-to-CoreML stage.
The original function that contains the torch.randn() call is the reparametrize func:
def reparametrize(self, mu, logvar):
    std = logvar.mul(0.5).exp_()
    if self.have_cuda:
        eps = torch.randn(self.bs, self.nz, device='cuda')
    else:
        eps = torch.randn(self.bs, self.nz)
    return eps.mul(std).add_(mu)
The solution is, of course, to create a custom layer, but I'm having problems creating a layer with no inputs (i.e., it's just a randn() call).
I can get the CoreML conversion to complete with this def:
from coremltools.proto import NeuralNetwork_pb2

def convert_randn(node):
    params = NeuralNetwork_pb2.CustomLayerParams()
    params.className = "RandomNormal"
    params.description = "Random normal distribution generator"
    params.parameters["dtype"].intValue = node.attrs.get('dtype', 1)
    params.parameters["bs"].intValue = node.attrs.get("shape")[0]
    params.parameters["nz"].intValue = node.attrs.get("shape")[1]
    return params
I do the conversion with:
coreml_model = convert(onnx_model, add_custom_layers=True,
                       image_input_names=['input'],
                       custom_conversion_functions={"RandomNormal": convert_randn})
I should also note that, at the completion of the mlmodel export, the following is printed:
Custom layers have been added to the CoreML model corresponding to the
following ops in the onnx model:
1/1: op type: RandomNormal, op input names and shapes: [], op output
names and shapes: [('62', 'Shape not available')]
Bringing the .mlmodel into Xcode complains that Layer '62' of type 500 has 0 inputs but expects at least 1. So I'm wondering how to specify a kind of "dummy" input to the layer, since it doesn't actually have an input -- it's just a wrapper around torch.randn() (or, more specifically, the ONNX RandomNormal op). I should clarify that I do need the whole VAE, not just the decoder, as I'm actually using the entire process to "error correct" my inputs (i.e., the encoder estimates my z vector based on an input, then the decoder generates the closest generalizable prediction of the input).
Any help greatly appreciated.
UPDATE: Okay, I finally got a version to load in Xcode (thanks to #MattijsHollemans and his book!). The originalConversion.mlmodel is the initial output of converting my model from ONNX to CoreML. To this, I had to manually insert the input for the RandomNormal layer. I made it (64, 28, 28) for no great reason — I know my batch size is 64, and my inputs are 28 x 28 (but presumably it could also be (1, 1, 1), since it's a "dummy"):
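import coremltools
from coremltools.proto import FeatureTypes_pb2 as ft  # provides ArrayFeatureType

spec = coremltools.utils.load_spec('originalConversion.mlmodel')
nn = spec.neuralNetwork
layers = {l.name: i for i, l in enumerate(nn.layers)}
layer_idx = layers["62"] # '62' is the name of the layer -- see above
layer = nn.layers[layer_idx]
layer.input.extend(["dummy_input"])  # connect the dummy input to the custom layer
inp = spec.description.input.add()
inp.name = "dummy_input"
# Shape taken from the description above: (64, 28, 28)
spec.description.input[1].type.multiArrayType.shape.extend([64, 28, 28])
spec.description.input[1].type.multiArrayType.dataType = ft.ArrayFeatureType.DOUBLE
coremltools.utils.save_spec(spec, "modelWithInsertedInput.mlmodel")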
This loads in Xcode, but I have yet to test the functioning of the model in my app. Since the additional layer is simple, and the input is literally a bogus, non-functional input (just to keep Xcode happy), I don't imagine it will be a problem, but I'll post again if it doesn't run properly.
UPDATE 2: Unfortunately, the model doesn't load at runtime. It fails with [espresso] [Espresso::handle_ex_plan] exception=Failed in 2nd reshape after missing custom layer info. What I find very strange and confusing is that, inspecting model.espresso.shape, I see that almost every node has a shape like:
"62" : {
  "k" : 0,
  "w" : 0,
  "n" : 0,
  "seq" : 0,
  "h" : 0
}
I have two questions/concerns: 1) Most obviously, why are all the values zero (this is the case for all but the input nodes), and 2) Why does it appear to be a sequential model, when it's just a fairly conventional VAE? Opening model.espresso.shape for a fully functioning GAN in the same app, I see that the nodes have this format:
"54" : {
  "k" : 256,
  "w" : 16,
  "n" : 1,
  "h" : 16
}
That is, they contain reasonable shape info, and they don't have seq fields.
Very, very confused...
UPDATE 3: I've also just noticed in the compiler report the error: IMPORTANT: new sequence length computation failed, falling back to old path. Your compilation was sucessful, but please file a radar on Core ML | Neural Networks and attach the model that generated this message.
Here's the original PyTorch model:
class VAE(nn.Module):
    def __init__(self, bs, nz):
        super(VAE, self).__init__()
        self.nz = nz
        self.bs = bs
        # nc, ndf and ngf are defined elsewhere (not shown here)
        self.encoder = nn.Sequential(
            # input is (nc) x 28 x 28
            nn.Conv2d(nc, ndf, 4, 2, 1, bias=False),
            nn.LeakyReLU(0.2, inplace=True),
            # size = (ndf) x 14 x 14
            nn.Conv2d(ndf, ndf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 2),
            nn.LeakyReLU(0.2, inplace=True),
            # size = (ndf*2) x 7 x 7
            nn.Conv2d(ndf * 2, ndf * 4, 3, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 4),
            nn.LeakyReLU(0.2, inplace=True),
            # size = (ndf*4) x 4 x 4
            nn.Conv2d(ndf * 4, 1024, 4, 1, 0, bias=False),
            nn.LeakyReLU(0.2, inplace=True),
        )
        self.decoder = nn.Sequential(
            # input is Z, going into a convolution
            nn.ConvTranspose2d(1024, ngf * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ngf * 8),
            # size = (ngf*8) x 4 x 4
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 3, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 4),
            # size = (ngf*4) x 8 x 8
            nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 2),
            # size = (ngf*2) x 16 x 16
            nn.ConvTranspose2d(ngf * 2, nc, 4, 2, 1, bias=False),
        )
        self.fc1 = nn.Linear(1024, 512)
        self.fc21 = nn.Linear(512, nz)
        self.fc22 = nn.Linear(512, nz)
        self.fc3 = nn.Linear(nz, 512)
        self.fc4 = nn.Linear(512, 1024)
        self.lrelu = nn.LeakyReLU()
        self.relu = nn.ReLU()

    def encode(self, x):
        conv = self.encoder(x)
        h1 = self.fc1(conv.view(-1, 1024))
        return self.fc21(h1), self.fc22(h1)

    def decode(self, z):
        h3 = self.relu(self.fc3(z))
        deconv_input = self.fc4(h3)
        deconv_input = deconv_input.view(-1, 1024, 1, 1)
        return self.decoder(deconv_input)

    def reparametrize(self, mu, logvar):
        std = logvar.mul(0.5).exp_()
        eps = torch.randn(self.bs, self.nz, device='cuda') # needs custom layer!
        return eps.mul(std).add_(mu)

    def forward(self, x):
        # print("x", x.size())
        mu, logvar = self.encode(x)
        z = self.reparametrize(mu, logvar)
        decoded = self.decode(z)
        return decoded, mu, logvar
To add an input to your Core ML model, you can do the following from Python:
import coremltools
spec = coremltools.utils.load_spec("YourModel.mlmodel")
nn = spec.neuralNetworkClassifier # or just spec.neuralNetwork
layers = {l.name: i for i, l in enumerate(nn.layers)}
layer_idx = layers["your_custom_layer"]
layer = nn.layers[layer_idx]
layer.input.extend(["dummy_input"])  # attach the dummy input to the custom layer
inp = spec.description.input.add()
inp.name = "dummy_input"
inp.type.doubleType.SetInParent()  # give the dummy input the type "double"
coremltools.utils.save_spec(spec, "NewModel.mlmodel")
Here, "your_custom_layer" is the name of the layer you want to add the dummy input to. In your model it looks like it's called 62. You can look at the layers dictionary to see the names of all the layers in the model.
If your model is not a classifier, use nn = spec.neuralNetwork instead of neuralNetworkClassifier.
I made the new dummy input have the type "double". That means your custom layer gets a double value as input.
You need to specify a value for this dummy input when using the model.

Input tensors to a Model must be Keras tensors

Input tensors to a Model must be Keras tensors. Found:
Tensor("my_layer/Identity:0", shape=(?, 10, 1152, 16), dtype=float32)
(missing Keras metadata).
Hi, I get this error when I try to use one layer's intermediate tensor as the input to a parallel network.
def call(self, inputs, training=None):
    inputs_expand = K.expand_dims(inputs, 1)
    tensor_b = K.tile(inputs_expand, [1, 16, 1, 1])
    tensor_a = K.map_fn(lambda x: K.batch_dot(x, self.Weights, [2, 3]), elems=tensor_b)
    # I need this tensor_a.
    # I tried many things but ended up storing it in a member variable.
    self.tensor_a = K.identity(tensor_a)
Outside the layer, when building the parallel model, I do this:
a_model = models.Model([my_layer.tensor_a],[my_layer.c])
I could not find a good solution to this problem. How can I turn the tensor into a Keras tensor?
