
I'm trying to implement a binary classification model using TensorFlow Keras and stumbled over a problem that I cannot grasp.

My model should classify images of houses into two classes, "old/antique" and "new/modern". I used transfer learning with the pre-trained VGG16 and have already had some success, but I would like to optimize it a bit. Here's the outline of the model:

model_cropped = keras.Sequential([
    layers.InputLayer(input_shape=(224, 224, 3)),
    augmentation,
    pre_model,
    layers.Flatten(),
    layers.Dense(256, activation='relu'),
    layers.Dropout(0.2),
    layers.Dense(128, activation='relu'),
    layers.Dropout(0.2),
    layers.Dense(64, activation='relu'),
    layers.Dropout(0.2),
    layers.Dense(1, activation='sigmoid')
])

optimizer = keras.optimizers.Nadam(learning_rate=1e-6, use_ema=True)

model_cropped.compile(
    optimizer=optimizer,
    loss=keras.losses.BinaryCrossentropy(),
    metrics=[keras.metrics.BinaryCrossentropy()]
)

I have already tried several metrics and wanted to try the BinaryCrossentropy metric, since according to the manual it's suitable for binary classification problems. However, while using binary cross entropy as the loss function with other metrics (such as BinaryAccuracy) works fine, using binary cross entropy as a metric throws an error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[201], line 4
      1 epochs = 600
      2 print("Fitting the end-to-end model")
----> 4 history_cropped = model_cropped.fit(
      5     X_train, y_train,
      6     epochs=epochs,
      7     validation_data=(X_test, y_test),
      8     batch_size=64
      9 )

File ~/lib/python3.10/site-packages/keras/src/utils/traceback_utils.py:122, in filter_traceback.<locals>.error_handler(*args, **kwargs)
    119 filtered_tb = _process_traceback_frames(e.__traceback__)
    120 # To get the full stack trace, call:
    121 # keras.config.disable_traceback_filtering()
--> 122 raise e.with_traceback(filtered_tb) from None
    123 finally:
    124     del filtered_tb

File ~/lib/python3.10/site-packages/keras/src/backend/tensorflow/nn.py:737, in binary_crossentropy(target, output, from_logits)
    734 output = tf.convert_to_tensor(output)
    736 if len(target.shape) != len(output.shape):
--> 737     raise ValueError(
    738         "Arguments target and output must have the same rank "
    739         "(ndim). Received: "
    740         f"target.shape={target.shape}, output.shape={output.shape}"
    741     )
    742 for e1, e2 in zip(target.shape, output.shape):
    743     if e1 is not None and e2 is not None and e1 != e2:

ValueError: Arguments target and output must have the same rank (ndim). Received: target.shape=(None,), output.shape=(None, 1)

Apparently, the model's output has shape (None, 1) while my targets have shape (None,), and the binary cross entropy metric expects both to have the same rank. Can someone enlighten me about what I'm misunderstanding or doing wrong? I thought that since binary cross entropy works as the loss function, it should also work as a metric.

Thanks a lot

Ada

1 Answer

As I mentioned in my comment to the OP, the problem is a dimension mismatch between your targets and the model's output. To resolve this, you can reshape your target labels (y_train and y_test) to have a shape of (None, 1) instead of (None,) by adding a trailing axis with np.expand_dims (or equivalent TF functionality). If you prefer not to modify your target data, you can instead adapt the model so its output has shape (None,) by flattening the final layer's output.
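A minimal sketch of the first option, using NumPy (the labels here are made-up toy data standing in for your y_train):

```python
import numpy as np

# Toy binary labels standing in for y_train -- shape (4,), i.e. rank 1
y_train = np.array([0, 1, 1, 0], dtype="float32")
assert y_train.shape == (4,)

# Add a trailing axis so the targets match the model's output
# shape (None, 1), which the BinaryCrossentropy metric compares against
y_train = np.expand_dims(y_train, axis=-1)
assert y_train.shape == (4, 1)
```

Apply the same transformation to y_test before calling fit, so the validation targets match as well.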

Robert Long