
I'm trying to implement a binary classification model using TensorFlow Keras and stumbled over a problem that I cannot grasp.

My model should classify images of houses into two classes, "old/antique" and "new/modern". I used transfer learning with the pre-trained VGG16 and have already had some success, but I would like to optimize it a bit. Here's the outline of the model:

model_cropped = keras.Sequential([
    layers.InputLayer(input_shape=(224, 224, 3)),
    augmentation,
    pre_model,
    layers.Flatten(),
    layers.Dense(256, activation='relu'),
    layers.Dropout(0.2),
    layers.Dense(128, activation='relu'),
    layers.Dropout(0.2),
    layers.Dense(64, activation='relu'),
    layers.Dropout(0.2),
    layers.Dense(1, activation='sigmoid')
])

optimizer = keras.optimizers.Nadam(learning_rate=1e-6, use_ema=True)

model_cropped.compile(
    optimizer=optimizer,
    loss=keras.losses.BinaryCrossentropy(),
    metrics=[keras.metrics.BinaryCrossentropy()]
)

I have already tried several metrics and wanted to try the BinaryCrossentropy metric, since according to the manual it's suitable for binary classification problems. However, while using binary cross entropy as the loss function with other metrics (such as BinaryAccuracy) works fine, using binary cross entropy as a metric throws an error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[201], line 4
      1 epochs = 600
      2 print("Fitting the end-to-end model")
----> 4 history_cropped = model_cropped.fit(
      5     X_train, y_train,
      6     epochs=epochs,
      7     validation_data=(X_test, y_test),
      8     batch_size=64
      9 )

File ~/lib/python3.10/site-packages/keras/src/utils/traceback_utils.py:122, in filter_traceback.<locals>.error_handler(*args, **kwargs)
    119 filtered_tb = _process_traceback_frames(e.__traceback__)
    120 # To get the full stack trace, call:
    121 # keras.config.disable_traceback_filtering()
--> 122 raise e.with_traceback(filtered_tb) from None
    123 finally:
    124     del filtered_tb

File ~/lib/python3.10/site-packages/keras/src/backend/tensorflow/nn.py:737, in binary_crossentropy(target, output, from_logits)
    734 output = tf.convert_to_tensor(output)
    736 if len(target.shape) != len(output.shape):
--> 737     raise ValueError(
    738         "Arguments target and output must have the same rank "
    739         "(ndim). Received: "
    740         f"target.shape={target.shape}, output.shape={output.shape}"
    741     )
    742 for e1, e2 in zip(target.shape, output.shape):
    743     if e1 is not None and e2 is not None and e1 != e2:

ValueError: Arguments target and output must have the same rank (ndim). Received: target.shape=(None,), output.shape=(None, 1)

Apparently, the model's output has shape (None, 1) while my targets have shape (None,), and the binary cross entropy metric expects both to have the same rank. Can someone enlighten me about what I'm misunderstanding or doing wrong? I thought that since binary cross entropy works as the loss function, it should also work as a metric.

Thanks a lot

Ada

1 Answer

As I mentioned in my comment to the OP, the problem is a dimension mismatch between your targets and the model's output. To resolve this, you can reshape your target labels (y_train and y_test) to have a shape of (None, 1) instead of (None,) by adding a trailing axis with np.expand_dims (or equivalent TF functionality). If you prefer not to modify your target data, you can instead adapt the model so its output has shape (None,) by flattening the final layer's output.
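A minimal sketch of the first option, using NumPy (the labels here are made-up toy data standing in for your y_train):

```python
import numpy as np

# Toy binary labels standing in for y_train -- shape (4,), i.e. rank 1
y_train = np.array([0, 1, 1, 0], dtype="float32")
assert y_train.shape == (4,)

# Add a trailing axis so the targets match the model's output
# shape (None, 1), which the BinaryCrossentropy metric compares against
y_train = np.expand_dims(y_train, axis=-1)
assert y_train.shape == (4, 1)
```

Apply the same transformation to y_test before calling fit, so the validation targets match as well.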

Robert Long