I'm trying to train a model in Keras to suggest the best possible next move when presented with a pawn chess board. The board is represented as a list of 64 integers (0 for empty, 1 for player, 2 for enemy). The output is a pair: a field, and the direction in which the figure on that field should move. That means I need two output layers, one of size 64 (number of fields) and one of size 5 (number of possible move directions, including two forward and "no move" for when the game is over).
I have a list of boards and a list of solutions. When I try to fit the model, however, I get the error shown below.
The exact error message is:
Epoch 1/75
Traceback (most recent call last):
File "C:\Users\lulll\Documents\CodeStuff\tfTesting\main.py", line 75, in <module>
model.fit(train_fig_starts, train_fig_moves, epochs=75)
File "C:\Users\lulll\Documents\CodeStuff\tfTesting\venv\lib\site-packages\keras\utils\traceback_utils.py", line 70, in error_handler
raise e.with_traceback(filtered_tb) from None
File "C:\Users\lulll\AppData\Local\Temp\__autograph_generated_filej0zia4d5.py", line 15, in tf__train_function
retval_ = ag__.converted_call(ag__.ld(step_function), (ag__.ld(self), ag__.ld(iterator)), None, fscope)
ValueError: in user code:
File "C:\Users\lulll\Documents\CodeStuff\tfTesting\venv\lib\site-packages\keras\engine\training.py", line 1249, in train_function *
return step_function(self, iterator)
File "C:\Users\lulll\Documents\CodeStuff\tfTesting\venv\lib\site-packages\keras\engine\training.py", line 1233, in step_function **
outputs = model.distribute_strategy.run(run_step, args=(data,))
File "C:\Users\lulll\Documents\CodeStuff\tfTesting\venv\lib\site-packages\keras\engine\training.py", line 1222, in run_step **
outputs = model.train_step(data)
File "C:\Users\lulll\Documents\CodeStuff\tfTesting\venv\lib\site-packages\keras\engine\training.py", line 1024, in train_step
loss = self.compute_loss(x, y, y_pred, sample_weight)
File "C:\Users\lulll\Documents\CodeStuff\tfTesting\venv\lib\site-packages\keras\engine\training.py", line 1082, in compute_loss
return self.compiled_loss(
File "C:\Users\lulll\Documents\CodeStuff\tfTesting\venv\lib\site-packages\keras\engine\compile_utils.py", line 265, in __call__
loss_value = loss_obj(y_t, y_p, sample_weight=sw)
File "C:\Users\lulll\Documents\CodeStuff\tfTesting\venv\lib\site-packages\keras\losses.py", line 152, in __call__
losses = call_fn(y_true, y_pred)
File "C:\Users\lulll\Documents\CodeStuff\tfTesting\venv\lib\site-packages\keras\losses.py", line 284, in call **
return ag_fn(y_true, y_pred, **self._fn_kwargs)
File "C:\Users\lulll\Documents\CodeStuff\tfTesting\venv\lib\site-packages\keras\losses.py", line 2176, in binary_crossentropy
backend.binary_crossentropy(y_true, y_pred, from_logits=from_logits),
File "C:\Users\lulll\Documents\CodeStuff\tfTesting\venv\lib\site-packages\keras\backend.py", line 5688, in binary_crossentropy
bce = target * tf.math.log(output + epsilon())
ValueError: Dimensions must be equal, but are 2 and 64 for '{{node binary_crossentropy/mul}} = Mul[T=DT_FLOAT](binary_crossentropy/Cast, binary_crossentropy/Log)' with input shapes: [?,2], [?,64].
I have absolutely no idea what is causing this. I've searched for the error already, but the only mentions I've found seem to be describing a completely different scenario.
Since it probably helps, here's the code used to create and fit the model:
inputs = tf.keras.layers.Input(shape=64)
x = tf.keras.layers.Dense(32, activation='relu')(inputs)
out_field = tf.keras.layers.Dense(64, name="field")(x)
out_movement = tf.keras.layers.Dense(5, name="movement")(x)
model = tf.keras.Model(inputs=inputs, outputs=[out_field, out_movement])
model.fit(train_fig_starts, train_fig_moves, epochs=75) #train_fig_starts and moves are defined above
EDIT 1: Here's a sample of the dataset I'm using (the whole thing is too long for the character limit)
train_fig_starts = [[0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 2, 2, 0, 1, 0, 0, 0, 0, 1, 2, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 2, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 2, 1, 0, 0, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 2, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 2, 0, 0, 0, 0, 1], [0, 0, 1, 0, 0, 0, 0, 2, 0, 0, 0, 0, 2, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 2, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 2, 0, 0, 2, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 2, 0], [0, 2, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 2, 1, 2, 2, 2, 0, 0, 0, 1, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 2, 2, 0, 0, 0, 0, 0, 1, 0, 0, 2, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 2, 0, 0, 0, 0, 1, 2, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0]]
train_fig_moves = [[0, 0], [0, 0], [0, 0], [0, 0], [15, 2], [15, 2]]
I changed the loss to SparseCategoricalCrossentropy since that seems closer to what I'm looking for. This is now the model code:
inputs = tf.keras.layers.Input(shape=64)
x = tf.keras.layers.Dense(64, activation='relu')(inputs)
out_field = tf.keras.layers.Dense(64, activation="relu", name="field")(x)
out_field = tf.keras.layers.Dense(64, activation="softmax", name="field_softmax")(out_field)
out_movement = tf.keras.layers.Dense(5, activation="relu", name="movement")(x)
out_movement = tf.keras.layers.Dense(5, activation="softmax", name="movement_softmax")(out_movement)
model = tf.keras.Model(inputs=inputs, outputs=[out_field, out_movement])
tf.keras.utils.plot_model(model, to_file='model_plot.png', show_shapes=True, show_layer_names=True)
It still throws an error; this time it's the following:
Node: 'sparse_categorical_crossentropy_1/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits'
logits and labels must have the same first dimension, got logits shape [32,5] and labels shape [64]
[[{{node sparse_categorical_crossentropy_1/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits}}]] [Op:__inference_train_function_1666]
I have no idea why it's like that. Output logits and labels should both be [64, 2]. Since I'm using sparse crossentropy, I should be able to use integers in my training data to signify the "index" of the output neuron with the highest logit, right? Correct me if I'm wrong. If it helps, here's a diagram of my model:
plot of the model
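A possible reading of the second error message, offered as a sketch rather than a confirmed diagnosis: as far as I can tell, the sparse loss flattens the label tensor before comparing it with the logits. If each batch of 32 boards carries labels of shape (32, 2) (one [field, move] pair per board) and that whole tensor is handed to the 5-way output, flattening gives 64 labels against 32 rows of logits. The batch size of 32 is the Keras default and is assumed here; the label values are placeholders.

```python
import numpy as np

batch_size = 32  # assumed: Keras' default batch size, matching "logits shape [32,5]"

# Placeholder labels shaped like train_fig_moves: one [field, move] pair per board.
labels = np.zeros((batch_size, 2), dtype="int64")

# Flattening the (32, 2) label tensor yields 64 entries,
# which cannot line up with 32 rows of logits.
flat = labels.reshape(-1)
print(flat.shape)  # (64,)
```

Under this reading, the "64" in the message is 32 * 2 and has nothing to do with the 64 board fields.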
I have now fixed the issue myself. Honestly it was a pretty stupid error to make, but the error messages didn't explain well what was going on. I switched the labels to one-hot encoding and changed the loss to CategoricalCrossentropy, which is also a better fit for a categorisation problem (the sparse variant didn't work with my integers for some reason). After that I needed to change the label list from a single list of [field, move] pairs into two separate lists, one holding the field one-hots and one holding the move one-hots. If anyone runs into a similar issue and can't make sense of it, maybe this will help.
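For anyone who wants to see the label reshaping spelled out, here is a minimal NumPy sketch of the fix described above. It uses the six sample labels from the question; the variable names `field_onehot` and `movement_onehot` are my own and not from the original code.

```python
import numpy as np

# Sample labels from the question: one [field, move] pair per board.
train_fig_moves = [[0, 0], [0, 0], [0, 0], [0, 0], [15, 2], [15, 2]]

moves = np.asarray(train_fig_moves)
field_labels = moves[:, 0]      # integer field index per board, 0..63
movement_labels = moves[:, 1]   # integer direction index per board, 0..4

# One-hot encode each target separately so it matches its own output layer.
field_onehot = np.eye(64, dtype="float32")[field_labels]
movement_onehot = np.eye(5, dtype="float32")[movement_labels]

print(field_onehot.shape, movement_onehot.shape)  # (6, 64) (6, 5)
```

The two arrays would then be passed as a list in the same order as the model outputs, for example `model.fit(train_fig_starts, [field_onehot, movement_onehot], epochs=75)`. Passing `field_labels` and `movement_labels` as two separate integer arrays may also work with the sparse loss, but I have not verified that against this model.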
My TF version is 2.9 and Python 3.8.
I have built an image binary classification CNN model and I am trying to get a confusion matrix.
The dataset structure is as follows.
│------ benign/
│------ normal/
│------ benign/
│------ normal/
The dataset configuration is as follows.
train_ds = tf.keras.utils.image_dataset_from_directory(
    directory=train_data_dir,
    image_size=(img_height, img_width),
)
val_ds = tf.keras.utils.image_dataset_from_directory(
    directory=train_data_dir,
    image_size=(img_height, img_width),
)
test_ds = tf.keras.utils.image_dataset_from_directory(
    directory=test_data_dir,
    image_size=(img_height, img_width),
)
I wrote the code referring to the following link to get the confusion matrix.
Reference Page
And this is my code about the confusion matrix.
import numpy as np
import tensorflow as tf
from sklearn.metrics import confusion_matrix

predictions = model.predict(test_ds)
y_pred = []
y_true = []
# iterate over the dataset
for image_batch, label_batch in test_ds:  # use dataset.unbatch() with repeat
    # append true labels
    y_true.append(label_batch)
    # compute predictions
    preds = model.predict(image_batch)
    # append predicted labels
    y_pred.append(np.argmax(preds, axis=-1))
# convert the true and predicted labels into tensors
true_labels = tf.concat([item for item in y_true], axis=0)
predicted_labels = tf.concat([item for item in y_pred], axis=0)
cm = confusion_matrix(true_labels, predicted_labels)
y_pred and y_true were obtained from test_ds as above, and the resulting confusion matrix was as follows.
[[200 0]
[200 0]]
So I printed true_labels and predicted_labels, and confirmed that the predicted_labels are all 0, as follows.
<tf.Tensor: shape=(400,), dtype=int32, numpy=
array([0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0,
1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0,
0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0,
0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0,
1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 0, 0,
0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1,
0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1,
1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1,
1, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1,
0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0,
1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0,
0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 1,
0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0,
1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0,
1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0,
0, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1,
1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0,
0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0,
0, 0, 1, 1])>
<tf.Tensor: shape=(400,), dtype=int64, numpy=
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0], dtype=int64)>
I'm not sure why the predicted_labels are all zero; this is clearly wrong. I think the correct result should look like the following.
[[200 0]
[0 200]]
What is wrong? I've been struggling for a few days. Please please help me.
Thanks a lot.
For binary image classification, a threshold should be applied to the output of model.predict to obtain the predicted label. I found that changing the line y_pred.append(np.argmax(preds, axis = - 1)) in my question to y_pred.append(np.where(preds > threshold, 1, 0)) solved the problem. I hope this helps someone.
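To show why argmax returned all zeros, here is a small NumPy sketch. It assumes the model ends in a single sigmoid unit, so `preds` has shape (N, 1); that detail is not stated in the question, but it is consistent with the all-zero output. The prediction values and the threshold of 0.5 are made up for illustration.

```python
import numpy as np

# Made-up sigmoid outputs for four images, shape (4, 1).
preds = np.array([[0.1], [0.8], [0.6], [0.3]])

# argmax over an axis of length 1 can only ever return index 0.
argmax_labels = np.argmax(preds, axis=-1)
print(argmax_labels)  # [0 0 0 0]

# Thresholding the probability recovers the two classes.
threshold = 0.5  # assumed cut-off
thresholded_labels = np.where(preds > threshold, 1, 0).ravel()
print(thresholded_labels)  # [0 1 1 0]
```

The `.ravel()` call flattens the (N, 1) result to shape (N,) so that it lines up with the true labels when building the confusion matrix.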
I am using a FOR LOOP to calculate a simple probability on a dataset with approximately 500K rows of data.
For loop
from collections import Counter

class_ = 4
class_freq = Counter(list_[-1] for list_ in train_list)  # Counter({5: 1476, 1: 1531, 4: 1562, 3: 1430, 2: 1498, 7: 1517, 6: 1486})

def cp(x, class_, freq_):  # x is column index passed from another function
    pos = 0
    neg = 0
    for row in train_list:
        # count rows of the given class where column x is 1 (pos) or 0 (neg)
        if row[x] == 1 and row[54] == class_:
            pos += 1
        elif row[x] == 0 and row[54] == class_:
            neg += 1
    prob_0 = (neg + 0.1) / (class_freq[class_] + 0.2)
    prob_1 = (pos + 0.1) / (class_freq[class_] + 0.2)
    if prob_1 > prob_0:
        return prob_1
    return prob_0
Train_list sample
[3050, 180, 4, 277, -3, 5782, 221, 242, 156, 2721, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
[2818, 119, 19, 30, 10, 5213, 248, 220, 92, 4497, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2]
[3182, 115, 10, 553, 10, 4605, 237, 231, 124, 1768, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 5]
[3024, 312, 18, 474, 177, 5785, 169, 224, 194, 4961, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2]
[3067, 32, 4, 30, -2, 6679, 219, 230, 147, 2947, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4]
[2716, 1, 10, 234, 27, 2100, 206, 222, 153, 5581, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4]
The for loop works as expected on a small dataset (a few hundred rows). Unfortunately, when I try to use it on 20K rows of data, the processing takes ages. I cannot imagine how long it would take to run on 500K rows.
A for loop performs very badly on a large dataset. What is an alternative? Would a lambda improve processing speed? I'd appreciate any advice and assistance, thanks.
Thanks to everyone's comments, I have tried another algorithm to replace the for loop.
def cp(x, class_, freq_):
    # keep only the rows that belong to the requested class
    filtered_list = [t for t in train_list if t[54] == class_]
    # count how often column x is 0 or 1 within that class
    count_binary = Counter(binary[x] for binary in filtered_list)
    binary_1 = count_binary[1]
    binary_0 = count_binary[0]
    prob_0 = (binary_0 + 0.1) / (class_freq[class_] + 0.2)
    prob_1 = (binary_1 + 0.1) / (class_freq[class_] + 0.2)
    if prob_1 > prob_0:
        return prob_1
    return prob_0
I am still running the above code in my program and the process is not done yet, so I can't tell whether it is much more efficient. I would appreciate it if someone could give their opinion on this new block of code.
FYI, if this is indeed better and more efficient code, then the processing speed issue most likely lies in other parts of my code.
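One further option, sketched here under the assumption that the data fits in memory as a NumPy array: convert train_list once with np.asarray and let boolean masks do the counting, so no Python-level loop runs per row. The function keeps the same smoothing constants (0.1 and 0.2) as the code above; the `data` and `label_col` parameters are my additions, and the toy array is made up.

```python
import numpy as np

def cp(data, x, class_, label_col=54):
    """Return the larger smoothed probability of column x being 0 or 1 within class_."""
    in_class = data[:, label_col] == class_   # boolean mask of rows in the class
    n_class = np.count_nonzero(in_class)      # same value as class_freq[class_]
    column = data[in_class, x]
    pos = np.count_nonzero(column == 1)
    neg = np.count_nonzero(column == 0)
    prob_1 = (pos + 0.1) / (n_class + 0.2)
    prob_0 = (neg + 0.1) / (n_class + 0.2)
    return max(prob_0, prob_1)

# Toy data: two binary feature columns followed by a class label in column 2.
toy = np.array([
    [1, 0, 4],
    [0, 1, 4],
    [1, 1, 4],
    [1, 0, 2],
])
result = cp(toy, x=0, class_=4, label_col=2)
print(result)
```

In the real program the conversion `data = np.asarray(train_list)` should happen once, outside the function. If cp is called for every column and every class, repeating the conversion or the filtering on each call is likely where most of the time goes.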
I am working on a Kaggle dataset where the task is to predict the price of an item from its description and other attributes. Here is the link to the competition. As part of an experiment, I am currently using only an item's description to predict its price. The description is free text, and I use sklearn's TfidfVectorizer with bi-grams and max_features set to 60000 to produce the input to a LightGBM model.
After training, I would like to know the most influential tokens for predicting the price. I assumed LightGBM's feature_importance method would give me this. It returns a 60000-dimensional numpy array, whose indices I can use to retrieve the tokens from the TfidfVectorizer's vocabulary dictionary.
Here is the code:
vectorizer = TfidfVectorizer(ngram_range=(1,2), max_features=60000)
x_train = vectorizer.fit_transform(train_df['text'].values.astype('U'))
x_valid = vectorizer.transform(valid_df['text'].values.astype('U'))
idx2tok = {v: k for k, v in vectorizer.vocabulary_.items()}
features = [f'token_{i}' for i in range(len(vectorizer.vocabulary_))]
get_tok = lambda x, idxmap: idxmap[int(x[6:])]
lgb_train = lgb.Dataset(x_train, y_train)
lgb_valid = lgb.Dataset(x_valid, y_valid, reference=lgb_train)
gbm = lgb.train(lgb_params, lgb_train, num_boost_round=10, valid_sets=[lgb_train, lgb_valid], early_stopping_rounds=10, verbose_eval=True)
The model trains, however, after training when I call gbm.feature_importance(), I get a sparse array of integers, that really doesn't make sense to me:
fi = gbm.feature_importance()
array([ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 10, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 33, 34, 38, 45],
I'm not sure how to interpret this. I thought that earlier indices of the feature importance array would have higher values, and thus that the tokens corresponding to those indices in the vectorizer's vocabulary would be more important/influential than other tokens. Is this assumption wrong? How do I get the most influential/important terms that determine the model outcome? Any help is appreciated.
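A note on reading the array, with a small sketch: the position in the array is the feature (column) index, not a rank, and with the default importance_type='split' each value counts how often that feature was used in a split. With only 10 boosting rounds most of the 60000 tokens are never used, hence the zeros. To rank tokens, sort the indices by importance and map them back through idx2tok. The helper name `top_tokens` and the toy values below are hypothetical.

```python
import numpy as np

def top_tokens(importances, idx2tok, k=10):
    """Return up to k (token, importance) pairs, most important first, skipping zeros."""
    importances = np.asarray(importances)
    order = np.argsort(importances)[::-1]  # feature indices, largest importance first
    return [
        (idx2tok[int(i)], int(importances[i]))
        for i in order[:k]
        if importances[i] > 0
    ]

# Toy stand-ins for gbm.feature_importance() and the idx2tok mapping from the question.
fi = [0, 3, 0, 10, 1]
idx2tok = {0: "a", 1: "b", 2: "c", 3: "d", 4: "e"}

print(top_tokens(fi, idx2tok, k=2))  # [('d', 10), ('b', 3)]
```

In the real code this would be called as `top_tokens(gbm.feature_importance(), idx2tok)`. Passing importance_type='gain' to feature_importance ranks by total gain instead of split count, which may be closer to "influence" on the prediction.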
Could you comment on these two versions of a variational autoencoder loss and show me why they give different results?
data1 = np.array([0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0,
1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0], dtype='int32')
data2 = np.array([1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0], dtype='int32')
data3 = np.array([0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0], dtype='int32')
100 samples each, so I have 300 samples.
Code 1:
def vae_loss(x, x_decoded_mean):
    xent_loss = objectives.binary_crossentropy(x, x_decoded_mean)
    kl_loss = -0.5 * K.mean(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var))
    loss = xent_loss + kl_loss
    return loss

vae.compile(optimizer='rmsprop', loss=vae_loss)
Code 2:
def zero_loss(y_true, y_pred):
    return K.zeros_like(y_pred)

class CustomVariationalLayer(Layer):
    def __init__(self, **kwargs):
        self.is_placeholder = True
        super(CustomVariationalLayer, self).__init__(**kwargs)

    def vae_loss(self, x, x_decoded_mean):
        xent_loss = original_dim * metrics.binary_crossentropy(x, x_decoded_mean)
        # the start of this statement was cut off in the post; K.sum is assumed here
        kl_loss = -0.5 * K.sum(1 + z_log_var - K.square(z_mean) -
                               K.exp(z_log_var), axis=-1)
        return K.mean(xent_loss + kl_loss)

    def call(self, inputs):
        x = inputs[0]
        x_decoded_mean = inputs[1]
        loss = self.vae_loss(x, x_decoded_mean)
        self.add_loss(loss, inputs=inputs)
        return K.ones_like(x)

loss_layer = CustomVariationalLayer()([x, x_decoded_mean])
vae = Model(x, [loss_layer])
vae.compile(optimizer='rmsprop', loss=[zero_loss])
The results are very different and I don't see why. The latent representations differ: Code 2 shows the separation between groups and Code 1 does not.
With Code 1, vae.predict... is not accurate, and with Code 2 it gives me 1 for all features.
Code 2 gives me accurate output from this code:
sent_encoded = encoder.predict(np.array(test), batch_size = batch_size)
sent_decoded = generator.predict(sent_encoded)
and Code 1 is not accurate at all.
Both experiments have the same layers. So, once again, where is the difference, and what is the best solution for a dataset like the one described above?
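One visible difference between the two snippets is how the terms are scaled. Code 1 adds the per-feature mean crossentropy to a KL term averaged over everything, while Code 2 multiplies the crossentropy by original_dim and, assuming the truncated line uses K.sum over the latent axis, sums the KL term. The NumPy sketch below reproduces both formulas on made-up numbers to show the different weighting; it is an illustration of the arithmetic, not of the full training behaviour.

```python
import numpy as np

def bce_per_sample(x, x_hat, eps=1e-7):
    """Binary crossentropy averaged over the feature axis, as Keras does."""
    x_hat = np.clip(x_hat, eps, 1 - eps)
    return -np.mean(x * np.log(x_hat) + (1 - x) * np.log(1 - x_hat), axis=-1)

def kl_terms(z_mean, z_log_var):
    return 1 + z_log_var - np.square(z_mean) - np.exp(z_log_var)

def loss_code1(x, x_hat, z_mean, z_log_var):
    # Code 1: unscaled crossentropy + KL averaged over batch and latent dims.
    return bce_per_sample(x, x_hat) - 0.5 * np.mean(kl_terms(z_mean, z_log_var))

def loss_code2(x, x_hat, z_mean, z_log_var):
    # Code 2: crossentropy scaled by original_dim + KL summed over latent dims (assumed).
    original_dim = x.shape[-1]
    xent = original_dim * bce_per_sample(x, x_hat)
    kl = -0.5 * np.sum(kl_terms(z_mean, z_log_var), axis=-1)
    return np.mean(xent + kl)

# Made-up single-sample batch: 4 features, 2 latent dimensions.
x = np.array([[1.0, 0.0, 0.0, 1.0]])
x_hat = np.array([[0.9, 0.1, 0.2, 0.8]])
z_mean = np.array([[0.5, -0.5]])
z_log_var = np.array([[0.0, 0.0]])

c1 = loss_code1(x, x_hat, z_mean, z_log_var)
c2 = loss_code2(x, x_hat, z_mean, z_log_var)
print(c1, c2)
```

With 54 features, as in the data above, Code 2 weights the reconstruction term 54 times more heavily than Code 1 does, while its KL term only grows by the latent dimension. That shift in balance is a plausible reason why Code 2 reconstructs well and separates the groups while Code 1 does not.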
I have an adjacency matrix of 16 by 16.
Adjacency = [[0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
             [1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
             [0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
             [0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
             [0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
             [0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
             [0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
             [0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0],
             [0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
             [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
             [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
             [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0],
             [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0],
             [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
             [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
             [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]
To this adjacency matrix I applied the scipy algorithm that determines the connected components, as follows:
from scipy.sparse.csgraph import connected_components
connected_components(Adjacency)
which returns 4 components :
(4, array([0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 2, 2, 2, 3, 0], dtype=int32))
So the algorithm returns 4 components (4 new nodes, or 4 supernodes 0, 1, 2, 3), and the associated adjacency matrix has dim=(4,4).
My question is as follows:
Given the initial adjacency matrix of 16 by 16 and the connected components, how can I efficiently compute the new adjacency matrix?
In other words, we need to merge all the nodes that are assigned to the same connected component.
EDIT 1 :
Here is a concrete example. Given the following adjacency matrix of 6 nodes, dim=(6,6):
and given three supernodes as follows:
supernodes[0]=[0,2]# supernode 0 merges node 0 and 2
supernodes[1]=[1,4]#supernode 1 merges node 1 and 4
supernodes[2]=[3,5]#supernode 2 merges node 3 and 5
The supposed output :
Adjacency matrix of 3 supernodes dim=(3,3)
What does it mean ?
For instance, consider the first supernode, supernodes[0]=[0,2]. The idea is as follows:
A) if i and j are in the same supernode, then adjacency[i,j]=0
B) if i and j are in the same supernode and i or j has a connection to a node other than i and j, set the corresponding entry to 1
Thank you for your help.
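One way to do the merge, sketched under my reading of rules A and B above: two supernodes are adjacent if any member of one is adjacent to any member of the other, and connections inside a supernode are dropped. With a node-to-supernode membership matrix M, this is M.T @ A @ M, thresholded and with a zeroed diagonal. The 6-node path graph below is a made-up example (the matrix from EDIT 1 is not available); the supernode assignment matches the one in the question.

```python
import numpy as np

def merge_nodes(adjacency, labels):
    """Collapse nodes that share a label into one supernode and return the new adjacency."""
    adjacency = np.asarray(adjacency)
    labels = np.asarray(labels)
    n_super = labels.max() + 1
    # membership[i, s] == 1 when node i belongs to supernode s
    membership = np.zeros((adjacency.shape[0], n_super), dtype=int)
    membership[np.arange(len(labels)), labels] = 1
    # entry (s, t) counts the edges running between supernodes s and t
    merged = membership.T @ adjacency @ membership
    merged = (merged > 0).astype(int)
    np.fill_diagonal(merged, 0)  # rule A: no connections inside a supernode
    return merged

# Hypothetical 6-node path graph 0-1-2-3-4-5.
A = np.zeros((6, 6), dtype=int)
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]:
    A[i, j] = A[j, i] = 1

# supernode 0 = {0, 2}, supernode 1 = {1, 4}, supernode 2 = {3, 5}
labels = [0, 1, 0, 2, 1, 2]
merged = merge_nodes(A, labels)
print(merged)
```

The label array returned by connected_components can be passed directly as `labels`. Note that for true connected components the merged matrix is all zeros, because by definition no edge runs between two different components; the merge only becomes interesting for hand-picked supernodes like those in EDIT 1.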