athena.models.masked_pc
An implementation of masked predictive coding (MPC).
Module Contents
Classes
MaskedPredictCoding | Implementation of the MPC pretraining model
- class athena.models.masked_pc.MaskedPredictCoding(data_descriptions, config=None)
Bases: athena.models.base.BaseModel
Implementation of the MPC pretraining model.
- Parameters
num_filters – int, the number of filters in the CNN
d_model – int, the dimension of the model
num_heads – number of attention heads in the transformer
num_encoder_layers – number of layers in the encoder
dff – int, the dimension of the feed-forward layer
rate – dropout rate of the dropout layers
chunk_size – number of consecutive frames masked together, e.g. 1 or 3
keep_probability – probability of not being masked
mode – training mode, e.g. MPC for pretraining
max_pool_layers – index of the max-pool layers in the encoder; default is -1
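As a rough illustration of how the parameters above fit together, the sketch below collects them in a plain Python dict and checks a few constraints. All values are placeholder assumptions, not the library's documented defaults, and `check_mpc_config` is a hypothetical helper that is not part of athena. `mode` and `max_pool_layers` are left out because their value formats are not specified here.

```python
# Placeholder values for illustration; these are NOT athena's documented defaults.
mpc_config = {
    "num_filters": 512,
    "d_model": 512,
    "num_heads": 8,
    "num_encoder_layers": 12,
    "dff": 1280,
    "rate": 0.1,
    "chunk_size": 1,
    "keep_probability": 0.8,
}


def check_mpc_config(config):
    """Hypothetical helper: return a list of problems found in an MPC config dict."""
    problems = []
    # Usual multi-head attention constraint: heads split d_model evenly.
    if config["d_model"] % config["num_heads"] != 0:
        problems.append("d_model should be divisible by num_heads")
    if not 0.0 <= config["rate"] < 1.0:
        problems.append("rate should lie in [0, 1)")
    if not 0.0 <= config["keep_probability"] <= 1.0:
        problems.append("keep_probability should lie in [0, 1]")
    if config["chunk_size"] < 1:
        problems.append("chunk_size should be at least 1")
    return problems


print(check_mpc_config(mpc_config))
```

A dict like this would presumably be passed as the `config` argument of the constructor, but how athena merges it with `default_config` is not described on this page.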
- default_config
- call(samples, training: bool = None)
Runs the model on a batch of samples; used for training.
- Parameters
samples – a dict with the keys ‘input’, ‘input_length’, ‘output_length’, and ‘output’. input: acoustic features (e.g. f-bank), a Tensor of shape (batch, time_len, dim, 1)
Return:
MPC outputs that fit the acoustic features. encoder_outputs: Transformer encoder outputs, a Tensor of shape (batch, seqlen, dim)
- get_loss(logits, samples, training=None)
Computes the MPC loss.
- Parameters
logits – MPC output
Return:
MPC L1 loss and metrics
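To make the idea of an L1 reconstruction loss concrete, here is a minimal pure-Python sketch. It assumes the loss is the mean absolute error between the predicted and target features, computed only over the valid (unpadded) frames of each utterance. The function name, the argument layout, and the normalization are assumptions for illustration; the actual `get_loss` may weight or normalize differently and also returns metrics.

```python
def sketch_l1_loss(predictions, targets, lengths):
    """Mean absolute error over the valid frames of each utterance.

    predictions, targets: nested lists of shape (batch, time, dim).
    lengths: number of valid (unpadded) frames for each utterance.
    """
    total = 0.0
    count = 0
    for pred_utt, tgt_utt, n_valid in zip(predictions, targets, lengths):
        # Frames beyond n_valid are padding and are ignored.
        for pred_frame, tgt_frame in zip(pred_utt[:n_valid], tgt_utt[:n_valid]):
            for p, t in zip(pred_frame, tgt_frame):
                total += abs(p - t)
                count += 1
    return total / count if count else 0.0


# One utterance with two frames, of which only the first is valid.
preds = [[[1.0, 2.0], [3.0, 4.0]]]
tgts = [[[0.0, 2.0], [9.0, 9.0]]]
print(sketch_l1_loss(preds, tgts, lengths=[1]))
```

Here the second frame is treated as padding, so only the first frame contributes to the average.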
- compute_logit_length(samples)
Computes the logit length.
- generate_mpc_mask(input_data)
Generates the mask for pretraining.
- Parameters
input_data – acoustic features, e.g. f-bank
Return:
mask tensor
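The interplay of `chunk_size` and `keep_probability` can be sketched in plain Python as follows. This is a simplified assumption about the masking scheme, not the library's implementation: frames are grouped into chunks of `chunk_size` consecutive frames, and each chunk is kept with probability `keep_probability` or zeroed out otherwise. The function name `sketch_mpc_mask` and the `seed` argument are hypothetical, and the real `generate_mpc_mask` operates on a Tensor and may apply further rules.

```python
import random


def sketch_mpc_mask(time_len, dim, chunk_size=1, keep_probability=0.8, seed=None):
    """Return a (time_len, dim) nested list of 1.0 (keep) and 0.0 (masked).

    All frames inside one chunk of `chunk_size` consecutive frames share
    the same keep/mask decision.
    """
    rng = random.Random(seed)
    mask = []
    for start in range(0, time_len, chunk_size):
        keep = 1.0 if rng.random() < keep_probability else 0.0
        # The last chunk may be shorter when time_len is not a multiple of chunk_size.
        chunk_len = min(chunk_size, time_len - start)
        mask.extend([[keep] * dim for _ in range(chunk_len)])
    return mask


mask = sketch_mpc_mask(time_len=7, dim=4, chunk_size=3, keep_probability=0.5, seed=0)
for row in mask:
    print(row)
```

Multiplying the input features element-wise by such a mask zeroes the masked frames, which the model is then trained to reconstruct.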
- prepare_samples(samples)
Prepares samples for special data. Take care not to change the shape of samples.