athena.models.masked_pc

An implementation of masked predictive coding (MPC)

Module Contents

Classes

MaskedPredictCoding

Implementation of the MPC pretraining model

class athena.models.masked_pc.MaskedPredictCoding(data_descriptions, config=None)

Bases: athena.models.base.BaseModel

Implementation of the MPC pretraining model

Parameters
  • num_filters – int, the number of filters in the CNN

  • d_model – int, the dimension of the model

  • num_heads – number of attention heads in the transformer

  • num_encoder_layers – number of layers in the encoder

  • dff – int, the dimension of the feed-forward layer

  • rate – dropout rate of the dropout layers

  • chunk_size – number of consecutive frames masked together, e.g. 1 or 3

  • keep_probability – probability that a frame is not masked

  • mode – training mode, e.g. MPC for pretraining

  • max_pool_layers – index of the max-pooling layers in the encoder, default is -1
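To show how the parameters above fit together, here is a sketch of a configuration dictionary. Only the key names, the `MPC` mode and the `-1` default for `max_pool_layers` come from this page; every other value is a placeholder for illustration, not the library's default. That such a dict can be passed as `config` is an assumption based on the constructor signature.

```python
# Hypothetical configuration; values are illustrative, not library defaults.
mpc_config = {
    "num_filters": 512,         # filters in the CNN front end
    "d_model": 512,             # model dimension
    "num_heads": 8,             # attention heads in the transformer
    "num_encoder_layers": 12,   # layers in the encoder
    "dff": 1280,                # feed-forward dimension (assumed meaning)
    "rate": 0.1,                # dropout rate
    "chunk_size": 1,            # consecutive frames masked together
    "keep_probability": 0.85,   # probability that a frame is not masked
    "mode": "MPC",              # pretraining mode
    "max_pool_layers": -1,      # documented default
}

# In multi-head attention the model dimension is usually split evenly across heads.
assert mpc_config["d_model"] % mpc_config["num_heads"] == 0
```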

default_config
call(samples, training: bool = None)

used for training

Parameters
  • samples – a dict including the keys ‘input’, ‘input_length’, ‘output_length’, ‘output’. input: acoustic features, Tensor, shape is (batch, time_len, dim, 1), e.g. f-bank

Return:

MPC outputs to fit acoustic features
    encoder_outputs: Transformer encoder outputs, Tensor, shape is (batch, seqlen, dim)
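The sketch below builds a dummy `samples` dict with the four documented keys, using NumPy arrays in place of Tensors. Only the shape of `input` is stated on this page. The shapes of the other three entries, and all the sizes, are assumptions made for illustration.

```python
import numpy as np

batch, time_len, dim = 2, 100, 40  # illustrative sizes

samples = {
    # Documented: acoustic features of shape (batch, time_len, dim, 1)
    "input": np.zeros((batch, time_len, dim, 1), dtype=np.float32),
    # Assumed: number of valid frames per utterance
    "input_length": np.array([100, 80], dtype=np.int32),
    # Assumed: reconstruction target laid out as (batch, time_len, dim)
    "output": np.zeros((batch, time_len, dim), dtype=np.float32),
    # Assumed: number of valid target frames per utterance
    "output_length": np.array([100, 80], dtype=np.int32),
}

for key, value in samples.items():
    print(key, value.shape)
```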
get_loss(logits, samples, training=None)

get MPC loss

Parameters

logits – MPC output

Return:

MPC L1 loss and metrics
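A minimal NumPy sketch of an L1 reconstruction loss, assuming the error is averaged over valid (non-padded) frames only. The function name `l1_reconstruction_loss` and the length-based masking are hypothetical; the actual `get_loss` may normalize or weight the error differently.

```python
import numpy as np

def l1_reconstruction_loss(logits, targets, lengths):
    """Mean absolute error over valid frames.

    logits, targets: arrays of shape (batch, seq_len, dim)
    lengths: array of shape (batch,), valid frames per utterance
    """
    seq_len = logits.shape[1]
    # valid[b, t] is True where frame t lies inside utterance b
    valid = np.arange(seq_len)[None, :] < np.asarray(lengths)[:, None]
    # Zero out the error on padded frames
    abs_err = np.abs(logits - targets) * valid[:, :, None]
    # Normalize by the number of valid scalar entries
    denom = max(int(valid.sum()) * logits.shape[2], 1)
    return float(abs_err.sum() / denom)

# Usage: padded frames do not contribute to the loss.
pred = np.ones((2, 4, 3))
pred[1, 2:] = 100.0           # garbage in the padded region
target = np.zeros((2, 4, 3))
loss = l1_reconstruction_loss(pred, target, lengths=[4, 2])
```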
compute_logit_length(samples)

compute the logit length

generate_mpc_mask(input_data)

generate mask for pretraining

Parameters

input_data – acoustic features, e.g. F-bank

Return:

mask tensor
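One way chunked masking could work is sketched below in NumPy: a keep-or-mask decision is drawn once per chunk of `chunk_size` frames and repeated across the chunk. The function name `chunk_mask`, its arguments and the 0/1 mask convention are assumptions for illustration, not the library's implementation.

```python
import numpy as np

def chunk_mask(batch, time_len, dim, chunk_size=1, keep_probability=0.85, seed=0):
    """Return a 0/1 mask of shape (batch, time_len, dim, 1).

    1.0 means the frame is kept, 0.0 means it is masked. Default values
    are illustrative only.
    """
    rng = np.random.default_rng(seed)
    num_chunks = -(-time_len // chunk_size)  # ceiling division
    # One decision per chunk: keep with probability keep_probability
    keep = (rng.random((batch, num_chunks)) < keep_probability).astype(np.float32)
    # Repeat each decision over its chunk, then trim to time_len
    frame_keep = np.repeat(keep, chunk_size, axis=1)[:, :time_len]
    # Broadcast across the feature dimension and add the channel axis
    mask = np.broadcast_to(frame_keep[:, :, None, None], (batch, time_len, dim, 1))
    return mask.copy()

# Usage: zero out the masked frames of a feature batch.
features = np.ones((2, 10, 4, 1), dtype=np.float32)
mask = chunk_mask(2, 10, 4, chunk_size=3, keep_probability=0.5)
masked_features = features * mask
```

With `chunk_size=3`, frames 0 to 2 of each utterance share one decision, as do frames 3 to 5, and so on.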
prepare_samples(samples)

Prepares special data. Be careful: do not change the shape of samples.