Implements the focal loss function.
tfa.losses.SigmoidFocalCrossEntropy(
    from_logits: bool = False,
    alpha: tfa.types.FloatTensorLike = 0.25,
    gamma: tfa.types.FloatTensorLike = 2.0,
    reduction: str = tf.keras.losses.Reduction.NONE,
    name: str = 'sigmoid_focal_crossentropy'
)
Focal loss was first introduced in the RetinaNet paper
(https://arxiv.org/pdf/1708.02002.pdf). It is useful for classification
with highly imbalanced classes: it down-weights well-classified examples
and focuses training on hard examples, so a misclassified sample
contributes a much larger loss than a well-classified one. A typical use
case is object detection, where the imbalance between the background class
and the other classes is extremely high.
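The down-weighting described above can be made concrete with a small pure-Python sketch. It assumes the formulation from the RetinaNet paper, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), applied to probabilities rather than logits. The helper name focal_loss_single is hypothetical and is not part of tfa.

```python
import math


def focal_loss_single(y_true, p, alpha=0.25, gamma=2.0):
    """Focal loss for one binary label, given a predicted probability p.

    Illustrative helper (not part of tfa); assumes 0 < p < 1.
    """
    # p_t is the probability the model assigns to the true class.
    p_t = p if y_true == 1.0 else 1.0 - p
    # alpha_t weights positives by alpha and negatives by (1 - alpha).
    alpha_t = alpha if y_true == 1.0 else 1.0 - alpha
    # (1 - p_t) ** gamma shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss_single(1.0, 0.97)  # confident and correct
hard = focal_loss_single(1.0, 0.10)  # confident and wrong
print(easy, hard, hard / easy)
```

With the default alpha and gamma, the inputs from the Usage section (0.97 and 0.91 for positive labels, 0.03 for a negative label) give roughly 6.85e-06, 1.91e-04 and 2.06e-05, in line with the tensor values shown there.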
Usage:
fl = tfa.losses.SigmoidFocalCrossEntropy()
loss = fl(
    y_true=[[1.0], [1.0], [0.0]], y_pred=[[0.97], [0.91], [0.03]])
loss
<tf.Tensor: shape=(3,), dtype=float32, numpy=array([6.8532745e-06, 1.9097870e-04, 2.0559824e-05], dtype=float32)>
Usage with tf.keras API:
model = tf.keras.Model()
model.compile('sgd', loss=tfa.losses.SigmoidFocalCrossEntropy())
Args |
alpha | balancing factor, default value is 0.25.
gamma | modulating factor, default value is 2.0.
Returns |
Weighted loss float Tensor. If reduction is NONE, this has shape [batch_size, d0, .. dN-1] (the last axis of y_true is reduced, so a y_true of shape (3, 1) yields a loss of shape (3,)); otherwise, it is scalar.
Raises |
ValueError | If the shape of sample_weight is invalid or the value of gamma is less than zero.
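To give a feel for the gamma argument, the following sketch tabulates the modulating term (1 - p_t)^gamma for a few values of gamma. The function name modulating_factor is hypothetical, and the negative-gamma check only mirrors the ValueError described above; it is an assumption about behaviour, not a copy of the library code.

```python
def modulating_factor(p_t, gamma=2.0):
    """Return (1 - p_t) ** gamma, the focal down-weighting term.

    Hypothetical helper for illustration; p_t is the probability
    assigned to the true class.
    """
    if gamma < 0:
        # Mirrors the documented behaviour: negative gamma is rejected.
        raise ValueError("gamma must be non-negative")
    return (1.0 - p_t) ** gamma


for gamma in (0.0, 1.0, 2.0, 5.0):
    # gamma = 0 leaves every example at full weight (factor 1.0).
    factors = [round(modulating_factor(p, gamma), 4) for p in (0.1, 0.5, 0.9)]
    print(gamma, factors)
```

Larger gamma values push the factor for confident predictions (p_t near 1) toward zero faster, while gamma = 0 removes the modulation, leaving only the alpha weighting.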
Methods
from_config
@classmethod
from_config(
    config
)
Instantiates a Loss from its config (output of get_config()).
Args |
config | Output of get_config().
get_config
get_config()
Returns the config dictionary for a Loss instance.
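The contract between get_config and from_config is a round trip: the dictionary returned by one is enough to rebuild an equivalent object with the other. The sketch below shows that pattern with a stand-in class. ToyFocalLoss and its fields are hypothetical and only imitate some of the constructor arguments listed above, so the exact keys in the real config may differ.

```python
class ToyFocalLoss:
    """Stand-in for a Keras-style loss object (hypothetical, not tfa)."""

    def __init__(self, alpha=0.25, gamma=2.0, name="toy_focal_loss"):
        self.alpha = alpha
        self.gamma = gamma
        self.name = name

    def get_config(self):
        # Return plain Python values that fully describe this instance.
        return {"alpha": self.alpha, "gamma": self.gamma, "name": self.name}

    @classmethod
    def from_config(cls, config):
        # Rebuild an instance by passing the config back to the constructor.
        return cls(**config)


original = ToyFocalLoss(alpha=0.5, gamma=1.0)
restored = ToyFocalLoss.from_config(original.get_config())
print(restored.get_config() == original.get_config())
```

Keeping the config to plain Python values is what makes it possible to serialize the loss alongside a saved model and recreate it later.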
__call__
__call__(
    y_true, y_pred, sample_weight=None
)
Invokes the Loss instance.
Args |
y_true | Ground truth values. shape = [batch_size, d0, .. dN], except sparse loss functions such as sparse categorical crossentropy where shape = [batch_size, d0, .. dN-1]
y_pred | The predicted values. shape = [batch_size, d0, .. dN]
sample_weight | Optional. sample_weight acts as a coefficient for the loss. If a scalar is provided, then the loss is simply scaled by the given value. If sample_weight is a tensor of size [batch_size], then the total loss for each sample of the batch is rescaled by the corresponding element in the sample_weight vector. If the shape of sample_weight is [batch_size, d0, .. dN-1] (or can be broadcast to this shape), then each loss element of y_pred is scaled by the corresponding value of sample_weight. (Note on dN-1: all loss functions reduce by 1 dimension, usually axis=-1.)
Returns |
Weighted loss float Tensor. If reduction is NONE, this has shape [batch_size, d0, .. dN-1]; otherwise, it is scalar. (Note dN-1 because all loss functions reduce by 1 dimension, usually axis=-1.)
Raises |
ValueError | If the shape of sample_weight is invalid.
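The interaction between sample_weight and reduction can be sketched in plain Python for the simplest case, a flat list of per-sample losses. The function weighted_reduce is hypothetical. The mode strings 'none', 'sum' and 'sum_over_batch_size' are assumed to correspond to the tf.keras.losses.Reduction members, and broadcasting of higher-rank weights is not modelled.

```python
def weighted_reduce(losses, sample_weight=None, reduction="none"):
    """Scale per-sample losses by sample_weight, then reduce.

    Hypothetical helper: `losses` is a flat list of floats and
    `sample_weight` is None, a scalar, or a list of the same length.
    """
    if sample_weight is None:
        weights = [1.0] * len(losses)
    elif isinstance(sample_weight, (int, float)):
        # A scalar simply scales every loss value.
        weights = [float(sample_weight)] * len(losses)
    else:
        weights = list(sample_weight)
        if len(weights) != len(losses):
            raise ValueError("sample_weight must match the batch size")

    weighted = [w * l for w, l in zip(weights, losses)]
    if reduction == "none":
        return weighted  # one value per sample
    if reduction == "sum":
        return sum(weighted)  # scalar
    if reduction == "sum_over_batch_size":
        return sum(weighted) / len(weighted)  # scalar mean over the batch
    raise ValueError(f"unknown reduction: {reduction!r}")


losses = [0.2, 0.4, 0.6]
print(weighted_reduce(losses, sample_weight=[1.0, 0.0, 2.0]))
print(weighted_reduce(losses, sample_weight=0.5, reduction="sum"))
```

A weight of 0.0 removes a sample from the total entirely, which is one common way to mask padded or ignored entries in a batch.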