athena.tools.ctc_scorer

CTC scorer used in joint-CTC decoding.

Module Contents

Classes

CTCPrefixScoreTH

Batch processing of CTCPrefixScore

Functions

tf_index_select(input_, dim, indices)

input_(tensor): input tensor

athena.tools.ctc_scorer.tf_index_select(input_, dim, indices)

Parameters
  • input_ (tensor) – input tensor

  • dim (int) – dimension along which to select

  • indices (list) – list of selected indices
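The sketch below illustrates the selection semantics that the parameter names suggest, on the assumption that tf_index_select mirrors torch.index_select (pick the listed indices along one dimension, in the given order). It uses plain nested lists rather than TensorFlow tensors, and index_select_2d is a hypothetical helper for illustration, not part of athena.

```python
def index_select_2d(rows, dim, indices):
    """Pick entries of a 2-D nested list along `dim`, in the order of `indices`.

    Assumption: this mirrors index_select semantics; the real function
    operates on tensors of arbitrary rank.
    """
    if dim == 0:
        # Select whole rows.
        return [list(rows[i]) for i in indices]
    if dim == 1:
        # Select columns within every row.
        return [[row[j] for j in indices] for row in rows]
    raise ValueError("this sketch only handles dim 0 or 1")


table = [[10, 11, 12],
         [20, 21, 22]]

rows = index_select_2d(table, 0, [1, 0, 1])  # indices may repeat
cols = index_select_2d(table, 1, [2, 0])
```

On TensorFlow tensors, the same selection is usually written as tf.gather(input_, indices, axis=dim); whether tf_index_select is implemented that way is not stated here.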

class athena.tools.ctc_scorer.CTCPrefixScoreTH(x, xlens, blank, eos, margin=0)

Bases: object

Batch processing of CTCPrefixScore

This is based on Algorithm 2 in Watanabe et al., “Hybrid CTC/Attention Architecture for End-to-End Speech Recognition,” but extended to efficiently compute the label probabilities for multiple hypotheses simultaneously. See also Seki et al., “Vectorized Beam Search for CTC-Attention-Based Speech Recognition,” in INTERSPEECH (pp. 3825-3829), 2019.
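For orientation, here is a minimal single-hypothesis sketch of the CTC prefix-score recursion that the batched class generalizes. It is not the class's implementation: it works in the probability domain instead of the log domain, handles one prefix and one candidate label at a time, and omits eos handling. The names initial_state, extend and brute_force_prefix_prob are hypothetical, and the blank index is assumed to be 0.

```python
from itertools import product

BLANK = 0  # assumed blank index for this sketch


def initial_state(probs, blank=BLANK):
    """Forward variables of the empty prefix: only all-blank paths exist."""
    gamma_n = [0.0] * len(probs)  # paths ending in a non-blank label
    gamma_b = []                  # paths ending in blank
    acc = 1.0
    for frame in probs:
        acc *= frame[blank]
        gamma_b.append(acc)
    return gamma_n, gamma_b


def extend(probs, prefix, c, state, blank=BLANK):
    """Return (prefix probability of prefix + [c], forward variables of prefix + [c])."""
    prev_n, prev_b = state
    num_frames = len(probs)
    gamma_n = [0.0] * num_frames
    gamma_b = [0.0] * num_frames
    if not prefix:
        gamma_n[0] = probs[0][c]  # c emitted at the very first frame
    psi = gamma_n[0]
    for t in range(1, num_frames):
        # phi: mass of `prefix` over frames 0..t-1 that c can extend at frame t.
        # A repeated label needs a blank in between, so only blank-ending paths count.
        if prefix and prefix[-1] == c:
            phi = prev_b[t - 1]
        else:
            phi = prev_n[t - 1] + prev_b[t - 1]
        gamma_n[t] = (gamma_n[t - 1] + phi) * probs[t][c]
        gamma_b[t] = (gamma_n[t - 1] + gamma_b[t - 1]) * probs[t][blank]
        psi += phi * probs[t][c]
    return psi, (gamma_n, gamma_b)


def brute_force_prefix_prob(probs, h, blank=BLANK):
    """Reference check: sum every alignment whose collapsed labels start with h."""
    total = 0.0
    for path in product(range(len(probs[0])), repeat=len(probs)):
        collapsed, prev = [], None
        for k in path:
            if k != prev and k != blank:  # merge repeats, drop blanks
                collapsed.append(k)
            prev = k
        if collapsed[:len(h)] == list(h):
            p = 1.0
            for t, k in enumerate(path):
                p *= probs[t][k]
            total += p
    return total


# Three frames, three symbols (0 = blank); each row sums to 1.
probs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.2, 0.7],
]
state = initial_state(probs)
psi_a, state_a = extend(probs, [], 1, state)
psi_ab, _ = extend(probs, [1], 2, state_a)
psi_aa, _ = extend(probs, [1], 1, state_a)
```

The brute-force function enumerates every alignment and is only practical for tiny inputs; it is included so the recursion can be checked against the definition of a prefix probability.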

__call__(y, state, scoring_ids=None, att_w=None)

Compute CTC prefix scores for next labels

Parameters
  • y – tensor(shape=[W, L]), prefix label sequences

  • state (tuple) – previous CTC state: tuple(tensor(shape=[T, 2, W]), tensor(shape=[W, O]), 0, 0)

  • scoring_ids (torch.Tensor) – scores for pre-selection of hypotheses [Beam, Beam * pre_beam_ratio]

  • att_w (torch.Tensor) – attention weights to decide CTC window

Returns
  new_state, ctc_local_scores (BW, O)
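In joint decoding, the per-label CTC scores are typically interpolated with the attention decoder's log-probabilities before beam pruning, as in the hybrid CTC/attention formulation cited above. The sketch below shows that interpolation for a single hypothesis using plain lists. The helper names and the ctc_weight parameter are hypothetical and are not taken from athena's decoder.

```python
import math


def combine_scores(att_scores, ctc_scores, ctc_weight):
    """Interpolate per-label log scores: (1 - w) * attention + w * CTC."""
    return [(1.0 - ctc_weight) * a + ctc_weight * c
            for a, c in zip(att_scores, ctc_scores)]


def top_k(scores, k):
    """Indices of the k highest-scoring labels, best first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


# Toy log-probabilities over three candidate labels for one hypothesis.
att = [math.log(p) for p in (0.7, 0.2, 0.1)]
ctc = [math.log(p) for p in (0.1, 0.6, 0.3)]

joint = combine_scores(att, ctc, ctc_weight=0.5)
best = top_k(joint, 2)
```

With equal weights, label 1 ranks first even though the attention scores alone prefer label 0, which is the kind of correction the CTC term is meant to provide.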

index_select_state(state, best_ids)

Select CTC states according to best ids

Parameters
  • state – CTC state

  • best_ids – index numbers selected by beam pruning (B, W)

Returns
  selected_state