athena.utils.misc

misc

Module Contents

Classes

athena_dialect

Describe the usual properties of Excel-generated CSV files.

Functions

mask_index_from_labels(labels, index)

insert_sos_in_labels(labels, sos)

remove_eos_in_labels(input_labels, labels_length)

Remove eos in labels; the batch size should be larger than 1.

insert_eos_in_labels(input_labels, eos, labels_length)

Insert eos in labels; the batch size should be larger than 1.

generate_square_subsequent_mask(size)

Generate a square mask for the sequence. The masked positions are filled with float(1.0).

create_multihead_mask(x, x_length, y[, reverse])

Generate a square mask for the sequence for multi-head attention.

subsequent_chunk_mask(size, chunk_size[, num_left_chunks])

Create a mask for subsequent steps (size, size) with chunk size, for the streaming encoder.

add_optional_chunk_mask(xs, masks, max_len, ...)

Apply optional mask for encoder.

generate_square_subsequent_mask_u2(size)

Generate a square mask for the sequence. The masked positions are filled with bool(True).

create_multihead_mask_u2(x, x_length, y[, reverse])

Generate a square mask for the sequence for multi-head attention.

subsequent_chunk_mask_u2(size, chunk_size[, ...])

Create a mask for subsequent steps (size, size) with chunk size, for the streaming encoder.

add_optional_chunk_mask_u2(xs, masks, max_len, ...)

Apply optional mask for encoder.

mask_finished_scores(scores, flag)

For the scores of finished hypotheses, mask the first to 0.0 and the others to -inf.

mask_finished_preds(preds, flag, eos)

For finished hypotheses, mask all selected words to eos.

gated_linear_layer(inputs, gates[, name])

validate_seqs(seqs, eos)

Discard the end symbol and all elements after it.

get_wave_file_length(wave_file)

Get the wave file length (duration) in ms.

splice_numpy(x, context)

Splice a tensor along the last dimension with context.

set_default_summary_writer([summary_directory])

tensor_shape(tensor)

Return a list with the tensor shape.

apply_label_smoothing(inputs, num_classes[, ...])

Applies label smoothing. See https://arxiv.org/abs/1512.00567.

get_dict_from_scp(vocab[, func])

read_csv_dict(csv_path)

read_csv(csv_path)

write_csv(csv_path, data)

athena.utils.misc.mask_index_from_labels(labels, index)
athena.utils.misc.insert_sos_in_labels(labels, sos)
athena.utils.misc.remove_eos_in_labels(input_labels, labels_length)

Remove eos in labels; the batch size should be larger than 1, assuming 0 is the padding and the last non-padding symbol is the eos.
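
A minimal sketch of the described behavior in plain TensorFlow (an illustration, not the library's implementation), assuming 0-padding and the eos sitting at position labels_length - 1:

import tensorflow as tf

labels = tf.constant([[5, 7, 2, 0],     # 2 = eos, followed by padding
                      [3, 4, 6, 2]])    # eos in the last position
labels_length = tf.constant([3, 4])
# Keep only the first labels_length - 1 symbols, i.e. everything before the eos.
mask = tf.sequence_mask(labels_length - 1, maxlen=tf.shape(labels)[1], dtype=labels.dtype)
print((labels * mask).numpy())
# [[5 7 0 0]
#  [3 4 6 0]]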

athena.utils.misc.insert_eos_in_labels(input_labels, eos, labels_length)

Insert eos in labels; the batch size should be larger than 1, assuming 0 is the padding.
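
A matching sketch for the insertion direction (again an illustration under the same 0-padding assumption, not necessarily what the library does internally):

import tensorflow as tf

eos = 2
input_labels = tf.constant([[5, 7, 0, 0],
                            [3, 4, 6, 0]])
labels_length = tf.constant([2, 3])
# Pad one extra column so the eos always fits, then write eos at index labels_length.
padded = tf.pad(input_labels, [[0, 0], [0, 1]])
eos_position = tf.one_hot(labels_length, depth=tf.shape(padded)[1], dtype=padded.dtype)
print((padded + eos * eos_position).numpy())
# [[5 7 2 0 0]
#  [3 4 6 2 0]]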

athena.utils.misc.generate_square_subsequent_mask(size)

Generate a square mask for the sequence. The masked positions are filled with float(1.0). Unmasked positions are filled with float(0.0).
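
A from-scratch sketch of such a mask (illustration only): the strictly upper triangle, i.e. the future positions, is set to 1.0.

import tensorflow as tf

def square_subsequent_mask_sketch(size):
    # 1.0 above the diagonal (future, masked), 0.0 on and below it (visible).
    return 1.0 - tf.linalg.band_part(tf.ones((size, size)), -1, 0)

print(square_subsequent_mask_sketch(3).numpy())
# [[0. 1. 1.]
#  [0. 0. 1.]
#  [0. 0. 0.]]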

athena.utils.misc.create_multihead_mask(x, x_length, y, reverse=False)

Generate a square mask for the sequence for multi-head attention. The masked positions are filled with float(1.0). Unmasked positions are filled with float(0.0).
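
A small illustration of the padding-mask ingredient such a mask combines with the subsequent mask above (shapes are assumed; this is not the function's actual return value):

import tensorflow as tf

x_length = tf.constant([4, 2])               # valid lengths of two padded sequences
padding_mask = 1.0 - tf.sequence_mask(x_length, maxlen=4, dtype=tf.float32)
print(padding_mask.numpy())                  # 1.0 marks padded (masked) steps
# [[0. 0. 0. 0.]
#  [0. 0. 1. 1.]]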

athena.utils.misc.subsequent_chunk_mask(size, chunk_size, num_left_chunks=-1)

Create a mask for subsequent steps (size, size) with chunk size; this is for the streaming encoder.

Parameters
  • size (int) – size of mask

  • chunk_size (int) – size of chunk

  • num_left_chunks (int) – number of history (left) chunks

Returns

mask

Return type

tf.Tensor

Examples

>>> subsequent_chunk_mask(4, 2, 1)
[[1, 1, 0, 0],
 [1, 1, 0, 0],
 [0, 1, 1, 1],
 [0, 1, 1, 1]]
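
A plain-Python sketch of one common chunk-mask formulation (each step may attend to its own chunk plus num_left_chunks history chunks); the library's exact boundary handling may differ, so treat this as an illustration of the idea rather than a reproduction of the output above:

def chunk_mask_sketch(size, chunk_size, num_left_chunks=-1):
    rows = []
    for i in range(size):
        chunk = i // chunk_size
        end = min((chunk + 1) * chunk_size, size)
        start = 0 if num_left_chunks < 0 else max((chunk - num_left_chunks) * chunk_size, 0)
        rows.append([1 if start <= j < end else 0 for j in range(size)])
    return rows
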
athena.utils.misc.add_optional_chunk_mask(xs: tensorflow.Tensor, masks: tensorflow.Tensor, max_len, use_dynamic_chunk: bool, use_dynamic_left_chunk: bool, decoding_chunk_size: int, static_chunk_size: int, num_decoding_left_chunks: int)

Apply optional mask for encoder.

Parameters
  • xs (tf.Tensor) – padded input, (B, L, D), L for max length

  • masks (tf.Tensor) – mask for xs, (B, 1, L)

  • use_dynamic_chunk (bool) – whether to use dynamic chunk or not

  • use_dynamic_left_chunk (bool) – whether to use dynamic left chunk for training.

  • decoding_chunk_size (int) – decoding chunk size for dynamic chunk. 0: default for training, use a random dynamic chunk; <0: for decoding, use the full chunk; >0: for decoding, use the fixed chunk size as set.

  • static_chunk_size (int) – chunk size for static chunk training/decoding if it is greater than 0; if use_dynamic_chunk is true, this parameter is ignored

  • num_decoding_left_chunks – number of left chunks for decoding; the chunk size is decoding_chunk_size. >=0: use num_decoding_left_chunks; <0: use all left chunks

Returns

chunk mask of the input xs.

Return type

tf.Tensor
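
A hypothetical call for dynamic-chunk training, with shapes following the parameter list above (argument values and dtypes are illustrative only):

import tensorflow as tf
from athena.utils.misc import add_optional_chunk_mask

xs = tf.random.normal([2, 8, 80])                              # (B, L, D)
masks = tf.sequence_mask([8, 6], maxlen=8)[:, tf.newaxis, :]   # (B, 1, L)
chunk_masks = add_optional_chunk_mask(
    xs, masks, 8,                                              # max_len
    use_dynamic_chunk=True, use_dynamic_left_chunk=False,
    decoding_chunk_size=0, static_chunk_size=0,
    num_decoding_left_chunks=-1)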

athena.utils.misc.generate_square_subsequent_mask_u2(size)

Generate a square mask for the sequence. The masked positions are filled with bool(True). Unmasked positions are filled with bool(False).

athena.utils.misc.create_multihead_mask_u2(x, x_length, y, reverse=False)

Generate a square mask for the sequence for multi-head attention. The masked positions are filled with bool(True). Unmasked positions are filled with bool(False).

athena.utils.misc.subsequent_chunk_mask_u2(size, chunk_size, num_left_chunks=-1)

Create a mask for subsequent steps (size, size) with chunk size; this is for the streaming encoder.

Parameters
  • size (int) – size of mask

  • chunk_size (int) – size of chunk

  • num_left_chunks (int) – number of history (left) chunks

Returns

mask

Return type

tf.Tensor

Examples

>>> subsequent_chunk_mask_u2(4, 2, 1)
[[False, False, True, True],
 [False, False, True, True],
 [True, False, False, False],
 [True, False, False, False]]
athena.utils.misc.add_optional_chunk_mask_u2(xs: tensorflow.Tensor, masks: tensorflow.Tensor, max_len, use_dynamic_chunk: bool, use_dynamic_left_chunk: bool, decoding_chunk_size: int, static_chunk_size: int, num_decoding_left_chunks: int)

Apply optional mask for encoder.

Parameters
  • xs (tf.Tensor) – padded input, (B, L, D), L for max length

  • masks (tf.Tensor) – mask for xs, (B, 1, L) or (B, 1, 1, L)

  • use_dynamic_chunk (bool) – whether to use dynamic chunk or not

  • use_dynamic_left_chunk (bool) – whether to use dynamic left chunk for training.

  • decoding_chunk_size (int) – decoding chunk size for dynamic chunk. 0: default for training, use a random dynamic chunk; <0: for decoding, use the full chunk; >0: for decoding, use the fixed chunk size as set.

  • static_chunk_size (int) – chunk size for static chunk training/decoding if it is greater than 0; if use_dynamic_chunk is true, this parameter is ignored

  • num_decoding_left_chunks – number of left chunks for decoding; the chunk size is decoding_chunk_size. >=0: use num_decoding_left_chunks; <0: use all left chunks

Returns

chunk mask of the input xs.

Return type

tf.Tensor

athena.utils.misc.mask_finished_scores(scores, flag)

For the scores of finished hypotheses, mask the first to 0.0 and the others to -inf.

Parameters
  • scores – candidate scores at the current step, shape: [batch_size*beam_size, beam_size]

  • flag – end_flag, 1 means finished, shape: [batch_size*beam_size, 1]

Returns

masked scores for finished hyps

Return type

scores
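
A from-scratch sketch of the described behavior (assumed semantics, not the library code):

import tensorflow as tf

def mask_finished_scores_sketch(scores, flag):
    # scores: [batch*beam, beam], flag: [batch*beam, 1], 1 marks finished rows.
    beam = tf.shape(scores)[1]
    finished = tf.cast(tf.broadcast_to(flag, tf.shape(scores)), tf.bool)
    first_col = tf.equal(tf.range(beam)[tf.newaxis, :], 0)
    # For finished rows: first column -> 0.0, remaining columns -> -inf.
    masked = tf.where(first_col, tf.zeros_like(scores),
                      tf.fill(tf.shape(scores), float('-inf')))
    return tf.where(finished, masked, scores)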

athena.utils.misc.mask_finished_preds(preds, flag, eos)

For finished hypotheses, mask all selected words to eos.

Parameters
  • preds – shape: [batch_size*beam_size, beam_size]

  • flag – end_flag, 1 means finished, shape: [batch_size*beam_size, 1]

Returns

masked preds for finished hyps

Return type

preds
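
Along the same lines, a hedged sketch of the described behavior:

import tensorflow as tf

def mask_finished_preds_sketch(preds, flag, eos):
    # preds: [batch*beam, beam], flag: [batch*beam, 1]; finished rows become all eos.
    finished = tf.cast(tf.broadcast_to(flag, tf.shape(preds)), tf.bool)
    return tf.where(finished, tf.fill(tf.shape(preds), tf.cast(eos, preds.dtype)), preds)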

athena.utils.misc.gated_linear_layer(inputs, gates, name=None)
athena.utils.misc.validate_seqs(seqs, eos)

Discard the end symbol and all elements after it.

Parameters
  • seqs – shape=(batch_size, seq_length)

  • eos – eos id

Returns

seqs without eos id

Return type

validated_preds
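
A plain-Python sketch of the intended effect (the real function operates on tensors and its return container may differ):

def validate_seqs_sketch(seqs, eos):
    out = []
    for seq in seqs:
        seq = list(seq)
        # Cut at the first eos; the eos itself and everything after it is discarded.
        out.append(seq[:seq.index(eos)] if eos in seq else seq)
    return out

print(validate_seqs_sketch([[5, 7, 2, 9], [3, 2, 4, 6]], 2))
# [[5, 7], [3]]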

athena.utils.misc.get_wave_file_length(wave_file)

Get the wave file length (duration) in ms.

Parameters

wave_file – the path of wave file

Returns

the length (ms) of the wave file

Return type

wav_length
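
A minimal sketch using the standard-library wave module (the toolkit's own implementation may read the header differently):

import wave

def wave_length_ms_sketch(wave_file):
    with wave.open(wave_file, 'rb') as wav:
        return wav.getnframes() / wav.getframerate() * 1000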

athena.utils.misc.splice_numpy(x, context)

Splice a tensor along the last dimension with context.

Example:

>>> t = [[[1, 2, 3],
...       [4, 5, 6],
...       [7, 8, 9]]]
>>> splice_numpy(t, [0, 1])
[[[1, 2, 3, 4, 5, 6],
  [4, 5, 6, 7, 8, 9],
  [7, 8, 9, 7, 8, 9]]]

Parameters
  • x – a tf.Tensor with shape (B, T, D), a.k.a. (N, H, W)

  • context – a list of context offsets

Returns

spliced tensor with shape (…, D * len(context))
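
A NumPy sketch of the splicing described above (assumed semantics: each context offset indexes a neighbouring frame, clipped at the sequence edges); it reproduces the example output:

import numpy as np

def splice_sketch(x, context):
    x = np.asarray(x)                        # (B, T, D)
    num_frames = x.shape[1]
    parts = [x[:, np.clip(np.arange(num_frames) + c, 0, num_frames - 1), :] for c in context]
    return np.concatenate(parts, axis=-1)    # (B, T, D * len(context))

print(splice_sketch([[[1, 2, 3], [4, 5, 6], [7, 8, 9]]], [0, 1]))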

athena.utils.misc.set_default_summary_writer(summary_directory=None)
athena.utils.misc.tensor_shape(tensor)

Return a list with tensor shape. For each dimension, use tensor.get_shape() first. If not available, use tf.shape().
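
A sketch of that idiom (static dimensions where known, runtime tf.shape() entries otherwise):

import tensorflow as tf

def tensor_shape_sketch(tensor):
    static = tensor.get_shape().as_list()
    dynamic = tf.shape(tensor)
    # Use the static size when it is known, otherwise the runtime size.
    return [dim if dim is not None else dynamic[i] for i, dim in enumerate(static)]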

athena.utils.misc.apply_label_smoothing(inputs, num_classes, smoothing_rate=0.1)

Applies label smoothing. See https://arxiv.org/abs/1512.00567.

Parameters
  • inputs – a 3d tensor with shape of [N, T, V], where V is the vocabulary size

  • num_classes – number of classes

  • smoothing_rate – smoothing rate
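
The standard label-smoothing formula from that paper, which the function is assumed to follow, redistributes smoothing_rate of the probability mass uniformly over the classes:

def label_smoothing_sketch(inputs, num_classes, smoothing_rate=0.1):
    # inputs: one-hot (or probability) targets of shape [N, T, V].
    return (1.0 - smoothing_rate) * inputs + smoothing_rate / num_classes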


athena.utils.misc.get_dict_from_scp(vocab, func=lambda x: ...)
class athena.utils.misc.athena_dialect

Bases: csv.Dialect

Describe the usual properties of Excel-generated CSV files.

delimiter =
quotechar = "
doublequote = True
skipinitialspace = True
lineterminator =
quoting
athena.utils.misc.read_csv_dict(csv_path)
athena.utils.misc.read_csv(csv_path)
athena.utils.misc.write_csv(csv_path, data: list)