Fix label_smoothing in multidimensional CategoricalCrossentropy.
When label_smoothing is non-zero, CategoricalCrossentropy takes tf.shape(y_true)[1] as the number of classes. However, if the true values and predictions have more than two dimensions (for example, when training a POS tagger where each batch element is a sentence composed of words), dimension 1 is not the class dimension, so a wrong value is used and training does not work. This fix takes the _last_ dimension as the one containing the classes.
parent 390052e2ce
commit 5f1ee72e97
@@ -1084,7 +1084,7 @@ def categorical_crossentropy(y_true,
   label_smoothing = ops.convert_to_tensor_v2(label_smoothing, dtype=K.floatx())
 
   def _smooth_labels():
-    num_classes = math_ops.cast(array_ops.shape(y_true)[1], y_pred.dtype)
+    num_classes = math_ops.cast(array_ops.shape(y_true)[-1], y_pred.dtype)
     return y_true * (1.0 - label_smoothing) + (label_smoothing / num_classes)
 
   y_true = smart_cond.smart_cond(label_smoothing,
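
To illustrate why the index matters, here is a minimal pure-Python sketch of the smoothing formula applied to a 3-D batch. It is not the TensorFlow implementation: the helpers `shape_of` and `smooth_labels`, the `class_axis` parameter, and the toy batch (1 sentence, 2 words, 4 tags) are all assumptions made for this example.

```python
def shape_of(nested):
    """Return the shape of a regular nested list, e.g. a 2x3 list -> (2, 3)."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]
    return tuple(shape)


def smooth_labels(y_true, label_smoothing, class_axis):
    """Apply y * (1 - s) + s / num_classes, reading num_classes from class_axis.

    Hypothetical helper: class_axis=1 mimics the old behaviour,
    class_axis=-1 mimics the behaviour after this fix.
    """
    num_classes = shape_of(y_true)[class_axis]

    def smooth(node):
        # Recurse through the nesting and smooth every scalar label.
        if isinstance(node, list):
            return [smooth(child) for child in node]
        return node * (1.0 - label_smoothing) + label_smoothing / num_classes

    return smooth(y_true)


# Hypothetical POS-tagging batch: 1 sentence, 2 words, 4 tags (one-hot).
y_true = [[[1.0, 0.0, 0.0, 0.0],
           [0.0, 0.0, 1.0, 0.0]]]

old = smooth_labels(y_true, 0.2, class_axis=1)   # divides by 2, the sentence length
new = smooth_labels(y_true, 0.2, class_axis=-1)  # divides by 4, the number of tags

# Smoothed labels for one word should still sum to 1.
old_row_sum = sum(old[0][0])  # about 1.2: not a valid distribution
new_row_sum = sum(new[0][0])  # about 1.0: a valid distribution
```

With axis 1 the per-word labels no longer sum to 1 whenever the sentence length differs from the number of tags; with the last axis they do, and for 2-D inputs both indices select the same dimension, so that case is unchanged.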