athena.models.asr.speech_conformer_ctc

Speech transformer implementation.

Module Contents

Classes

SpeechConformerCTC

Standard implementation of a SpeechTransformer.

class athena.models.asr.speech_conformer_ctc.SpeechConformerCTC(data_descriptions, config=None)

Bases: athena.models.base.BaseModel

Standard implementation of a SpeechTransformer. The model mainly consists of the x_net for input preparation and the transformer itself.

default_config
call(samples, training: bool = None)

Calls the model on samples.

compute_logit_length(input_length)

Used to get the logit length from the input length.
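
As a rough illustration of what a logit-length computation typically does: when the front end downsamples the time axis, the model emits fewer output frames than it receives input frames, and the CTC loss needs that shorter length. The sketch below is not taken from Athena. It assumes two stride-2 layers with "same" padding, so each layer maps a length n to ceil(n / 2); the function name subsampled_length and its parameters num_layers and stride are hypothetical.

```python
def subsampled_length(input_length: int, num_layers: int = 2, stride: int = 2) -> int:
    """Length after `num_layers` strided layers with "same" padding (assumed setup)."""
    length = input_length
    for _ in range(num_layers):
        # Integer form of ceil(length / stride), avoiding float rounding.
        length = (length + stride - 1) // stride
    return length


# 100 input frames -> 50 -> 25 output frames under these assumptions.
lengths = [subsampled_length(n) for n in (100, 101, 7)]
```

If the actual front end uses a different stride, padding mode, or number of layers, the formula changes accordingly.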

_forward_encoder(speech, speech_length, training=None)
_forward_encoder_log_ctc(samples, training: bool = None)
decode(samples, hparams, lm_model=None)

Initializes the model for decoding; the decoder is called here to create predictions.

Parameters
  • samples – the data source to be decoded

  • hparams – decoding configs are included here

  • lm_model – the language model; optional, defaults to None

Returns:

predictions: the corresponding decoding results

argmax(samples, hparams)

Argmax decoding for the Conformer CTC model.

Parameters
  • samples – the data source to be decoded

  • hparams – decoding configs are included here

Returns:

predictions: the corresponding decoding results
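
For orientation, argmax (greedy) decoding of a CTC model generally starts by picking the highest-scoring label in every valid frame. The following is a minimal pure-Python sketch of that step only, not Athena's implementation; the function framewise_argmax, its arguments, and the toy scores are all hypothetical.

```python
from typing import List, Sequence


def framewise_argmax(log_probs: Sequence[Sequence[float]], length: int) -> List[int]:
    """Return the best label index for each of the first `length` frames."""
    best = []
    for frame in log_probs[:length]:
        # Index of the maximum score in this frame (first index wins on ties).
        best.append(max(range(len(frame)), key=frame.__getitem__))
    return best


# 4 frames x 3 labels; only the first 3 frames are valid, the last is padding.
scores = [
    [-0.1, -2.0, -3.0],
    [-2.5, -0.2, -3.0],
    [-2.5, -0.2, -3.0],
    [0.0, 0.0, 0.0],
]
path = framewise_argmax(scores, length=3)  # [0, 1, 1]
```

The resulting frame-level path still contains repeated labels and blanks, which a CTC decoder merges in a later step.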

merge_ctc_sequence(seqs, blank=-1)

Beam search for freeze; supports only batch=1.

Parameters
  • samples – the data source to be decoded

  • beam_size – beam size
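
The name and signature merge_ctc_sequence(seqs, blank=-1) suggest the usual CTC merge step: collapse runs of repeated labels, then drop the blank label. The sketch below shows that standard rule for a single sequence and is an assumption about the intent, not Athena's code. The helper merge_repeats_and_blanks is hypothetical, and it takes an explicit blank ID because the source does not state how a negative blank is interpreted.

```python
from itertools import groupby
from typing import List, Sequence


def merge_repeats_and_blanks(seq: Sequence[int], blank: int) -> List[int]:
    """Apply the standard CTC collapse rule to one frame-level label sequence."""
    # groupby yields one key per run of identical consecutive labels.
    collapsed = [label for label, _ in groupby(seq)]
    # Blanks are removed only after collapsing, so a blank between two
    # identical labels keeps them as two separate output tokens.
    return [label for label in collapsed if label != blank]


merged = merge_repeats_and_blanks([3, 3, 0, 3, 0, 0, 5, 5], blank=0)  # [3, 3, 5]
```

The order matters: removing blanks before collapsing would turn the example into [3, 5] and lose the genuinely repeated label.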

restore_from_pretrained_model(pretrained_model, model_type='')

Restores the model from a pretrained model.