athena.loss

Loss functions for Athena's speech tasks: ASR, TTS, and speaker recognition.

Module Contents

Classes

CTCLoss

CTC loss, implemented with TensorFlow

Seq2SeqSparseCategoricalCrossentropy

Sparse categorical cross-entropy loss for sequence-to-sequence models

MPCLoss

MPC loss: L1 loss on masked acoustic features

Tacotron2Loss

Tacotron2 loss

GuidedAttentionLoss

Guided attention loss to make attention alignments more monotonic

GuidedMultiHeadAttentionLoss

Guided attention loss applied to multi-head attention

FastSpeechLoss

Loss used for training FastSpeech

FastSpeech2Loss

Loss used for training FastSpeech2

SoftmaxLoss

Softmax loss

AMSoftmaxLoss

Additive margin softmax loss

AAMSoftmaxLoss

Additive angular margin softmax loss

ProtoLoss

Prototypical loss

AngleProtoLoss

Angular prototypical loss

GE2ELoss

Generalized end-to-end loss

class athena.loss.CTCLoss(logits_time_major=False, blank_index=-1, name='CTCLoss')

Bases: tensorflow.keras.losses.Loss

CTC loss, implemented with TensorFlow.

__call__(logits, samples, logit_length=None)
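
A minimal usage sketch with tf.nn.ctc_loss, assuming samples carries the labels and their lengths (the tensor names below are illustrative, not Athena's internal keys):

    import tensorflow as tf

    def ctc_loss_sketch(logits, labels, logit_length, label_length, blank_index=-1):
        # logits: [batch, frames, num_classes]; labels: [batch, max_label_len]
        loss = tf.nn.ctc_loss(
            labels=labels,
            logits=logits,
            label_length=label_length,
            logit_length=logit_length,
            logits_time_major=False,  # matches logits_time_major=False above
            blank_index=blank_index,
        )
        return tf.reduce_mean(loss)

    # batch=2, 50 frames, 30 classes (the last one is the blank)
    logits = tf.random.normal([2, 50, 30])
    labels = tf.random.uniform([2, 10], maxval=29, dtype=tf.int32)
    loss = ctc_loss_sketch(logits, labels, tf.constant([50, 50]), tf.constant([10, 10]))
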
class athena.loss.Seq2SeqSparseCategoricalCrossentropy(num_classes, eos=-1, by_token=False, by_sequence=True, from_logits=True, label_smoothing=0.0)

Bases: tensorflow.keras.losses.CategoricalCrossentropy

Sparse categorical cross-entropy for sequence-to-sequence models: categorical cross-entropy computed at each character of each sequence in a batch, with optional label smoothing.

__call__(logits, samples, logit_length=None)
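
A sketch of the underlying computation, assuming one-hot targets with label smoothing and a padding id of 0 (both assumptions, not Athena's internals):

    import tensorflow as tf

    def seq2seq_ce_sketch(logits, labels, num_classes, label_smoothing=0.1):
        # one-hot targets; smoothing spreads label_smoothing mass over all classes
        onehot = tf.one_hot(labels, num_classes)
        loss = tf.keras.losses.categorical_crossentropy(
            onehot, logits, from_logits=True, label_smoothing=label_smoothing)
        # average over non-padding positions (pad id 0 is an assumption)
        mask = tf.cast(tf.not_equal(labels, 0), loss.dtype)
        return tf.reduce_sum(loss * mask) / tf.reduce_sum(mask)
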
class athena.loss.MPCLoss(name='MPCLoss')

Bases: tensorflow.keras.losses.Loss

Masked predictive coding (MPC) loss: an L1 loss over the masked acoustic features in a batch.

__call__(logits, samples, logit_length=None)
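
A sketch of an L1 loss restricted to masked frames; the binary mask marking which frames were masked is an assumed input:

    import tensorflow as tf

    def mpc_l1_sketch(predictions, targets, mask):
        # predictions, targets: [batch, steps, feat_dim]
        # mask: [batch, steps], 1.0 where the input frame was masked
        l1 = tf.abs(predictions - targets) * tf.expand_dims(mask, -1)
        return tf.reduce_sum(l1) / (tf.reduce_sum(mask) + 1e-8)
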
class athena.loss.Tacotron2Loss(model, guided_attn_loss_function, regularization_weight=0.0, l1_loss_weight=0.0, mask_decoder=False, pos_weight=1.0, name='Tacotron2Loss')

Bases: tensorflow.keras.losses.Loss

Tacotron2 Loss

__call__(outputs, samples, logit_length=None)
Parameters

outputs – model outputs, containing att_ws_stack (stacked attention weights), shape: [batch, y_steps, x_steps]
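
A rough sketch of a typical Tacotron2 objective combining mel reconstruction with a positively weighted stop-token term; the output names here are assumptions, and the guided attention and regularization terms configured above would be added on top:

    import tensorflow as tf

    def tacotron2_loss_sketch(before_outs, after_outs, stop_logits,
                              mel_target, stop_target,
                              l1_loss_weight=0.0, pos_weight=1.0):
        # mel reconstruction before and after the postnet
        mse = (tf.reduce_mean(tf.square(before_outs - mel_target)) +
               tf.reduce_mean(tf.square(after_outs - mel_target)))
        l1 = (tf.reduce_mean(tf.abs(before_outs - mel_target)) +
              tf.reduce_mean(tf.abs(after_outs - mel_target)))
        # stop-token loss; pos_weight offsets the rarity of the stop frame
        stop = tf.reduce_mean(tf.nn.weighted_cross_entropy_with_logits(
            labels=stop_target, logits=stop_logits, pos_weight=pos_weight))
        return mse + l1_loss_weight * l1 + stop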

class athena.loss.GuidedAttentionLoss(guided_attn_weight, reduction_factor, attn_sigma=0.4, name='GuidedAttentionLoss')

Bases: tensorflow.keras.losses.Loss

Guided attention loss to make attention alignments more monotonic.

__call__(att_ws_stack, samples)
_create_attention_masks(input_length, output_length)

Create masks based on attention locations.

Parameters
  • input_length – shape: [batch_size]

  • output_length – shape: [batch_size]

Returns

shape: [batch_size, 1, y_steps, x_steps]

Return type

masks

_create_length_masks(input_length, output_length)

Create masks based on input and output lengths.

Parameters
  • input_length – shape: [batch_size]

  • output_length – shape: [batch_size]

Returns

shape: [batch_size, 1, output_length, input_length]

Return type

masks

Examples

output_length: [6, 8], input_length: [3, 5]

masks:

[[[1, 1, 1, 0, 0],
  [1, 1, 1, 0, 0],
  [1, 1, 1, 0, 0],
  [1, 1, 1, 0, 0],
  [1, 1, 1, 0, 0],
  [1, 1, 1, 0, 0],
  [0, 0, 0, 0, 0],
  [0, 0, 0, 0, 0]],

 [[1, 1, 1, 1, 1],
  [1, 1, 1, 1, 1],
  [1, 1, 1, 1, 1],
  [1, 1, 1, 1, 1],
  [1, 1, 1, 1, 1],
  [1, 1, 1, 1, 1],
  [1, 1, 1, 1, 1],
  [1, 1, 1, 1, 1]]]
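
For reference, guided attention losses of this kind penalize attention mass far from the diagonal via a weight matrix w[t_y, t_x] = 1 - exp(-(t_x/T_x - t_y/T_y)^2 / (2 * sigma^2)). A sketch using the attn_sigma parameter above (a standard formulation; the exact implementation here may differ):

    import tensorflow as tf

    def guided_attention_weights(input_length, output_length, sigma=0.4):
        # w[t_y, t_x] = 1 - exp(-(t_x/T_x - t_y/T_y)^2 / (2 * sigma^2))
        grid_y, grid_x = tf.meshgrid(tf.range(output_length),
                                     tf.range(input_length), indexing="ij")
        grid_y = tf.cast(grid_y, tf.float32) / tf.cast(output_length, tf.float32)
        grid_x = tf.cast(grid_x, tf.float32) / tf.cast(input_length, tf.float32)
        return 1.0 - tf.exp(-tf.square(grid_x - grid_y) / (2.0 * sigma ** 2))

    # the loss is then roughly the mean of att_ws * weights over valid positions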

class athena.loss.GuidedMultiHeadAttentionLoss(guided_attn_weight, reduction_factor, attn_sigma=0.4, num_heads=2, num_layers=2, name='GuidedMultiHeadAttentionLoss')

Bases: GuidedAttentionLoss

Guided attention loss applied to multi-head attention, across attention heads and layers.

__call__(att_ws_stack, samples)
class athena.loss.FastSpeechLoss(duration_predictor_loss_weight, eps=1.0, use_mask=True, teacher_guide=False)

Bases: tensorflow.keras.losses.Loss

Loss used for training FastSpeech.

__call__(outputs, samples)

The duration targets are converted to the log domain, which makes their distribution closer to Gaussian.

Parameters
  • outputs – contains five elements:

    before_outs: outputs before postnet, shape: [batch, y_steps, feat_dim]
    teacher_outs: teacher outputs, shape: [batch, y_steps, feat_dim]
    after_outs: outputs after postnet, shape: [batch, y_steps, feat_dim]
    duration_sequences: duration predictions from the teacher model, shape: [batch, x_steps]
    pred_duration_sequences: duration predictions from the trained predictor, shape: [batch, x_steps]

  • samples – samples from the dataset
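
A sketch of the duration term under these conventions: the integer duration targets are moved to the log domain (with the eps offset above) before a mean squared error; names are illustrative:

    import tensorflow as tf

    def duration_loss_sketch(pred_duration_sequences, duration_sequences, eps=1.0):
        # predictor outputs are treated as log-durations; targets are moved
        # to the same log domain before the MSE
        log_target = tf.math.log(tf.cast(duration_sequences, tf.float32) + eps)
        return tf.reduce_mean(tf.square(pred_duration_sequences - log_target))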

class athena.loss.FastSpeech2Loss(variant_predictor_loss_weight, eps=1.0, use_mask=True)

Bases: tensorflow.keras.losses.Loss

Loss used for training FastSpeech2.

__call__(outputs, samples)
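
FastSpeech2 adds pitch and energy (variance) predictors alongside duration. A hedged sketch of how the weighted variant-predictor terms could combine with a mel loss; all tensor names are hypothetical:

    import tensorflow as tf

    def fastspeech2_loss_sketch(mel_pred, mel_target,
                                dur_pred, dur_target,
                                pitch_pred, pitch_target,
                                energy_pred, energy_target,
                                variant_predictor_loss_weight=1.0, eps=1.0):
        mel = tf.reduce_mean(tf.abs(mel_pred - mel_target))
        dur = tf.reduce_mean(tf.square(
            dur_pred - tf.math.log(tf.cast(dur_target, tf.float32) + eps)))
        pitch = tf.reduce_mean(tf.square(pitch_pred - pitch_target))
        energy = tf.reduce_mean(tf.square(energy_pred - energy_target))
        return mel + variant_predictor_loss_weight * (dur + pitch + energy)
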
class athena.loss.SoftmaxLoss(embedding_size, num_classes, name='SoftmaxLoss')

Bases: tensorflow.keras.losses.Loss

Softmax loss. Similar to this implementation: https://github.com/clovaai/voxceleb_trainer

__call__(outputs, samples, logit_length=None)
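
A sketch of a plain softmax classification head over speaker embeddings: a dense projection from embedding_size to num_classes followed by sparse cross-entropy (the layer here is illustrative):

    import tensorflow as tf

    embedding_size, num_classes = 512, 1000
    projection = tf.keras.layers.Dense(num_classes)  # trainable class weights

    def softmax_loss_sketch(embeddings, speaker_labels):
        logits = projection(embeddings)  # [batch, num_classes]
        return tf.reduce_mean(tf.keras.losses.sparse_categorical_crossentropy(
            speaker_labels, logits, from_logits=True))
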
class athena.loss.AMSoftmaxLoss(embedding_size, num_classes, m=0.3, s=15, name='AMSoftmaxLoss')

Bases: tensorflow.keras.losses.Loss

Additive margin softmax loss. Reference: “CosFace: Large Margin Cosine Loss for Deep Face Recognition” and “In defence of metric learning for speaker recognition”. Similar to this implementation: https://github.com/clovaai/voxceleb_trainer

__call__(outputs, samples, logit_length=None)
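
A sketch of the CosFace-style logits: cosine similarities between L2-normalized embeddings and class weights, with the margin m subtracted from the target class and the result scaled by s (the weight matrix is an assumed trainable variable):

    import tensorflow as tf

    def am_softmax_logits(embeddings, weights, labels, m=0.3, s=15.0):
        # weights: [embedding_size, num_classes], trainable
        emb = tf.nn.l2_normalize(embeddings, axis=1)
        w = tf.nn.l2_normalize(weights, axis=0)
        cos = tf.matmul(emb, w)                       # cosine similarities
        onehot = tf.one_hot(labels, tf.shape(weights)[1])
        return s * (cos - m * onehot)                 # margin on the target class only

    # followed by sparse cross-entropy against the speaker labels
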
class athena.loss.AAMSoftmaxLoss(embedding_size, num_classes, m=0.3, s=15, easy_margin=False, name='AAMSoftmaxLoss')

Bases: tensorflow.keras.losses.Loss

Additive angular margin softmax loss. Reference: “ArcFace: Additive Angular Margin Loss for Deep Face Recognition” and “In defence of metric learning for speaker recognition”. Similar to this implementation: https://github.com/clovaai/voxceleb_trainer

__call__(outputs, samples, logit_length=None)
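
The ArcFace variant instead adds the margin to the angle, so the target-class logit becomes cos(theta + m) = cos(theta)cos(m) - sin(theta)sin(m). A sketch (easy_margin handling omitted):

    import math
    import tensorflow as tf

    def aam_softmax_logits(embeddings, weights, labels, m=0.3, s=15.0):
        emb = tf.nn.l2_normalize(embeddings, axis=1)
        w = tf.nn.l2_normalize(weights, axis=0)
        cos = tf.matmul(emb, w)
        sin = tf.sqrt(tf.maximum(1.0 - tf.square(cos), 1e-9))
        cos_m = cos * math.cos(m) - sin * math.sin(m)  # cos(theta + m)
        onehot = tf.cast(tf.one_hot(labels, tf.shape(weights)[1]), tf.bool)
        return s * tf.where(onehot, cos_m, cos)
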
class athena.loss.ProtoLoss(name='ProtoLoss')

Bases: tensorflow.keras.losses.Loss

Prototypical loss. Reference: “Prototypical Networks for Few-shot Learning” and “In defence of metric learning for speaker recognition”. Similar to this implementation: https://github.com/clovaai/voxceleb_trainer

__call__(outputs, samples=None, logit_length=None)
Parameters

outputs – shape: [batch_size, num_speaker_utts, embedding_size]
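
A sketch of a prototypical loss over such a tensor, assuming one utterance per speaker is held out as the query and the rest average into the prototype (which utterance plays which role is illustrative):

    import tensorflow as tf

    def proto_loss_sketch(outputs):
        # outputs: [num_speakers, num_speaker_utts, embedding_size]
        prototypes = tf.reduce_mean(outputs[:, 1:, :], axis=1)  # support mean
        queries = outputs[:, 0, :]                              # held-out query
        # negative squared Euclidean distance as logits: [query, prototype]
        diff = tf.expand_dims(queries, 1) - tf.expand_dims(prototypes, 0)
        logits = -tf.reduce_sum(tf.square(diff), axis=-1)
        labels = tf.range(tf.shape(outputs)[0])  # matching speaker on the diagonal
        return tf.reduce_mean(tf.keras.losses.sparse_categorical_crossentropy(
            labels, logits, from_logits=True))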

class athena.loss.AngleProtoLoss(init_w=10.0, init_b=-5.0, name='AngleProtoLoss')

Bases: tensorflow.keras.losses.Loss

Angular prototypical loss. Reference: “In defence of metric learning for speaker recognition”. Similar to this implementation: https://github.com/clovaai/voxceleb_trainer

__call__(outputs, samples=None, logit_length=None)
Parameters

outputs – shape: [batch_size, num_speaker_utts, embedding_size]
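
The angular variant replaces Euclidean distance with cosine similarity, scaled and shifted by parameters initialized from init_w and init_b (learnable in practice); a sketch:

    import tensorflow as tf

    def angle_proto_logits(queries, prototypes, w=10.0, b=-5.0):
        # queries, prototypes: [num_speakers, embedding_size]
        q = tf.nn.l2_normalize(queries, axis=1)
        p = tf.nn.l2_normalize(prototypes, axis=1)
        return w * tf.matmul(q, p, transpose_b=True) + b  # scaled cosine similarity

    # cross-entropy against the diagonal, as in proto_loss_sketch above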

class athena.loss.GE2ELoss(init_w=10.0, init_b=-5.0, name='GE2ELoss')

Bases: tensorflow.keras.losses.Loss

Generalized end-to-end loss. Reference: “Generalized End-to-End Loss for Speaker Verification” and “In defence of metric learning for speaker recognition”. Similar to this implementation: https://github.com/clovaai/voxceleb_trainer

__call__(outputs, samples=None, logit_length=None)
Parameters

outputs – shape: [batch_size, num_speaker_utts, embedding_size]
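
GE2E scores every utterance against every speaker centroid and applies softmax cross-entropy over speakers. A compact sketch; note the full GE2E also excludes each utterance from its own centroid, which is omitted here:

    import tensorflow as tf

    def ge2e_loss_sketch(outputs, w=10.0, b=-5.0):
        # outputs: [num_speakers, num_utts, embedding_size]
        n_spk, n_utt = tf.shape(outputs)[0], tf.shape(outputs)[1]
        emb = tf.nn.l2_normalize(outputs, axis=-1)
        centroids = tf.nn.l2_normalize(tf.reduce_mean(emb, axis=1), axis=-1)
        flat = tf.reshape(emb, [n_spk * n_utt, -1])    # speaker-major order
        logits = w * tf.matmul(flat, centroids, transpose_b=True) + b
        labels = tf.repeat(tf.range(n_spk), n_utt)     # true speaker per utterance
        return tf.reduce_mean(tf.keras.losses.sparse_categorical_crossentropy(
            labels, logits, from_logits=True))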