athena.utils.misc

misc

Module Contents

Classes

athena_dialect

Describe the usual properties of Excel-generated CSV files.

Functions

mask_index_from_labels(labels, index)

insert_sos_in_labels(labels, sos)

remove_eos_in_labels(input_labels, labels_length)

Remove eos in labels; the batch size should be larger than 1.

insert_eos_in_labels(input_labels, eos, labels_length)

Insert eos in labels; the batch size should be larger than 1.

generate_square_subsequent_mask(size)

Generate a square mask for the sequence. The masked positions are filled with float(1.0).

create_multihead_mask(x, x_length, y[, reverse])

Generate a square mask for the sequence for multi-head attention.

subsequent_chunk_mask(size, chunk_size[, num_left_chunks])

Create a mask for subsequent steps (size, size) with chunk size, for the streaming encoder.

add_optional_chunk_mask(xs, masks, max_len, ...)

Apply optional mask for encoder.

generate_square_subsequent_mask_u2(size)

Generate a square mask for the sequence. The masked positions are filled with bool(True).

create_multihead_mask_u2(x, x_length, y[, reverse])

Generate a square mask for the sequence for multi-head attention.

subsequent_chunk_mask_u2(size, chunk_size[, ...])

Create a mask for subsequent steps (size, size) with chunk size, for the streaming encoder.

add_optional_chunk_mask_u2(xs, masks, max_len, ...)

Apply optional mask for encoder.

mask_finished_scores(scores, flag)

For the scores of finished hypotheses, mask the first to 0.0 and the others to -inf.

mask_finished_preds(preds, flag, eos)

For finished hypotheses, mask all selected words to eos.

gated_linear_layer(inputs, gates[, name])

validate_seqs(seqs, eos)

Discard the end symbol and all elements after it.

get_wave_file_length(wave_file)

Get the wave file length (duration) in ms.

splice_numpy(x, context)

Splice a tensor along the last dimension with context.

set_default_summary_writer([summary_directory])

tensor_shape(tensor)

Return a list with the tensor shape.

apply_label_smoothing(inputs, num_classes[, ...])

Applies label smoothing. See https://arxiv.org/abs/1512.00567.

get_dict_from_scp(vocab[, func])

read_csv_dict(csv_path)

read_csv(csv_path)

write_csv(csv_path, data)

athena.utils.misc.mask_index_from_labels(labels, index)
athena.utils.misc.insert_sos_in_labels(labels, sos)
athena.utils.misc.remove_eos_in_labels(input_labels, labels_length)

Remove eos in labels; the batch size should be larger than 1, assuming 0 is the padding and the last non-padding symbol is the eos.
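
A minimal sketch of the described behavior in plain TensorFlow (an illustration, not the library's implementation), assuming 0-padding and the eos sitting at position labels_length - 1:

import tensorflow as tf

labels = tf.constant([[5, 7, 2, 0],     # 2 = eos, followed by padding
                      [3, 4, 6, 2]])    # eos in the last position
labels_length = tf.constant([3, 4])
# Keep only the first labels_length - 1 symbols, i.e. everything before the eos.
mask = tf.sequence_mask(labels_length - 1, maxlen=tf.shape(labels)[1], dtype=labels.dtype)
print((labels * mask).numpy())
# [[5 7 0 0]
#  [3 4 6 0]]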

athena.utils.misc.insert_eos_in_labels(input_labels, eos, labels_length)

Insert eos in labels; the batch size should be larger than 1, assuming 0 is the padding.
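
A matching sketch for the insertion direction (again an illustration under the same 0-padding assumption, not necessarily what the library does internally):

import tensorflow as tf

eos = 2
input_labels = tf.constant([[5, 7, 0, 0],
                            [3, 4, 6, 0]])
labels_length = tf.constant([2, 3])
# Pad one extra column so the eos always fits, then write eos at index labels_length.
padded = tf.pad(input_labels, [[0, 0], [0, 1]])
eos_position = tf.one_hot(labels_length, depth=tf.shape(padded)[1], dtype=padded.dtype)
print((padded + eos * eos_position).numpy())
# [[5 7 2 0 0]
#  [3 4 6 2 0]]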

athena.utils.misc.generate_square_subsequent_mask(size)

Generate a square mask for the sequence. The masked positions are filled with float(1.0). Unmasked positions are filled with float(0.0).
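
A from-scratch sketch of such a mask (illustration only): the strictly upper triangle, i.e. the future positions, is set to 1.0.

import tensorflow as tf

def square_subsequent_mask_sketch(size):
    # 1.0 above the diagonal (future, masked), 0.0 on and below it (visible).
    return 1.0 - tf.linalg.band_part(tf.ones((size, size)), -1, 0)

print(square_subsequent_mask_sketch(3).numpy())
# [[0. 1. 1.]
#  [0. 0. 1.]
#  [0. 0. 0.]]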

athena.utils.misc.create_multihead_mask(x, x_length, y, reverse=False)

Generate a square mask for the sequence for multi-head attention. The masked positions are filled with float(1.0). Unmasked positions are filled with float(0.0).
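
A small illustration of the padding-mask ingredient such a mask combines with the subsequent mask above (shapes are assumed; this is not the function's actual return value):

import tensorflow as tf

x_length = tf.constant([4, 2])               # valid lengths of two padded sequences
padding_mask = 1.0 - tf.sequence_mask(x_length, maxlen=4, dtype=tf.float32)
print(padding_mask.numpy())                  # 1.0 marks padded (masked) steps
# [[0. 0. 0. 0.]
#  [0. 0. 1. 1.]]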

athena.utils.misc.subsequent_chunk_mask(size, chunk_size, num_left_chunks=-1)

Create a mask for subsequent steps (size, size) with chunk size; this is for the streaming encoder.

Parameters
  • size (int) – size of mask

  • chunk_size (int) – size of chunk

  • num_left_chunks (int) – number of history (left) chunks

Returns

mask

Return type

tf.Tensor

Examples

>>> subsequent_chunk_mask(4, 2, 1)
[[1, 1, 0, 0],
 [1, 1, 0, 0],
 [0, 1, 1, 1],
 [0, 1, 1, 1]]
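
A plain-Python sketch of one common chunk-mask formulation (each step may attend to its own chunk plus num_left_chunks history chunks); the library's exact boundary handling may differ, so treat this as an illustration of the idea rather than a reproduction of the output above:

def chunk_mask_sketch(size, chunk_size, num_left_chunks=-1):
    rows = []
    for i in range(size):
        chunk = i // chunk_size
        end = min((chunk + 1) * chunk_size, size)
        start = 0 if num_left_chunks < 0 else max((chunk - num_left_chunks) * chunk_size, 0)
        rows.append([1 if start <= j < end else 0 for j in range(size)])
    return rows
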
athena.utils.misc.add_optional_chunk_mask(xs: tensorflow.Tensor, masks: tensorflow.Tensor, max_len, use_dynamic_chunk: bool, use_dynamic_left_chunk: bool, decoding_chunk_size: int, static_chunk_size: int, num_decoding_left_chunks: int)

Apply optional mask for encoder.

Parameters
  • xs (tf.Tensor) – padded input, (B, L, D), L for max length

  • masks (tf.Tensor) – mask for xs, (B, 1, L)

  • use_dynamic_chunk (bool) – whether to use dynamic chunk or not

  • use_dynamic_left_chunk (bool) – whether to use dynamic left chunk for training.

  • decoding_chunk_size (int) – decoding chunk size for dynamic chunk. 0: default for training, use a random dynamic chunk; <0: for decoding, use the full chunk; >0: for decoding, use the fixed chunk size as set.

  • static_chunk_size (int) – chunk size for static chunk training/decoding if it is greater than 0; if use_dynamic_chunk is true, this parameter is ignored

  • num_decoding_left_chunks – number of left chunks for decoding; the chunk size is decoding_chunk_size. >=0: use num_decoding_left_chunks; <0: use all left chunks

Returns

chunk mask of the input xs.

Return type

tf.Tensor
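
A hypothetical call for dynamic-chunk training, with shapes following the parameter list above (argument values and dtypes are illustrative only):

import tensorflow as tf
from athena.utils.misc import add_optional_chunk_mask

xs = tf.random.normal([2, 8, 80])                              # (B, L, D)
masks = tf.sequence_mask([8, 6], maxlen=8)[:, tf.newaxis, :]   # (B, 1, L)
chunk_masks = add_optional_chunk_mask(
    xs, masks, 8,                                              # max_len
    use_dynamic_chunk=True, use_dynamic_left_chunk=False,
    decoding_chunk_size=0, static_chunk_size=0,
    num_decoding_left_chunks=-1)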

athena.utils.misc.generate_square_subsequent_mask_u2(size)

Generate a square mask for the sequence. The masked positions are filled with bool(True). Unmasked positions are filled with bool(False).

athena.utils.misc.create_multihead_mask_u2(x, x_length, y, reverse=False)

Generate a square mask for the sequence for multi-head attention. The masked positions are filled with bool(True). Unmasked positions are filled with bool(False).

athena.utils.misc.subsequent_chunk_mask_u2(size, chunk_size, num_left_chunks=-1)

Create a mask for subsequent steps (size, size) with chunk size; this is for the streaming encoder.

Parameters
  • size (int) – size of mask

  • chunk_size (int) – size of chunk

  • num_left_chunks (int) – number of history (left) chunks

Returns

mask

Return type

tf.Tensor

Examples

>>> subsequent_chunk_mask_u2(4, 2, 1)
[[False, False, True, True],
 [False, False, True, True],
 [True, False, False, False],
 [True, False, False, False]]
athena.utils.misc.add_optional_chunk_mask_u2(xs: tensorflow.Tensor, masks: tensorflow.Tensor, max_len, use_dynamic_chunk: bool, use_dynamic_left_chunk: bool, decoding_chunk_size: int, static_chunk_size: int, num_decoding_left_chunks: int)

Apply optional mask for encoder.

Parameters
  • xs (tf.Tensor) – padded input, (B, L, D), L for max length

  • masks (tf.Tensor) – mask for xs, (B, 1, L) or (B, 1, 1, L)

  • use_dynamic_chunk (bool) – whether to use dynamic chunk or not

  • use_dynamic_left_chunk (bool) – whether to use dynamic left chunk for training.

  • decoding_chunk_size (int) – decoding chunk size for dynamic chunk. 0: default for training, use a random dynamic chunk; <0: for decoding, use the full chunk; >0: for decoding, use the fixed chunk size as set.

  • static_chunk_size (int) – chunk size for static chunk training/decoding if it is greater than 0; if use_dynamic_chunk is true, this parameter is ignored

  • num_decoding_left_chunks – number of left chunks for decoding; the chunk size is decoding_chunk_size. >=0: use num_decoding_left_chunks; <0: use all left chunks

Returns

chunk mask of the input xs.

Return type

tf.Tensor

athena.utils.misc.mask_finished_scores(scores, flag)

For the scores of finished hypotheses, mask the first to 0.0 and the others to -inf.

Parameters
  • scores – candidate scores at the current step, shape: [batch_size*beam_size, beam_size]

  • flag – end_flag, 1 means finished, shape: [batch_size*beam_size, 1]

Returns

masked scores for finished hyps

Return type

scores
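
A from-scratch sketch of the described behavior (assumed semantics, not the library code):

import tensorflow as tf

def mask_finished_scores_sketch(scores, flag):
    # scores: [batch*beam, beam], flag: [batch*beam, 1], 1 marks finished rows.
    beam = tf.shape(scores)[1]
    finished = tf.cast(tf.broadcast_to(flag, tf.shape(scores)), tf.bool)
    first_col = tf.equal(tf.range(beam)[tf.newaxis, :], 0)
    # For finished rows: first column -> 0.0, remaining columns -> -inf.
    masked = tf.where(first_col, tf.zeros_like(scores),
                      tf.fill(tf.shape(scores), float('-inf')))
    return tf.where(finished, masked, scores)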

athena.utils.misc.mask_finished_preds(preds, flag, eos)

For finished hypotheses, mask all selected words to eos.

Parameters
  • preds – shape: [batch_size*beam_size, beam_size]

  • flag – end_flag, 1 means finished, shape: [batch_size*beam_size, 1]

Returns

masked preds for finished hyps

Return type

preds
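
Along the same lines, a hedged sketch of the described behavior:

import tensorflow as tf

def mask_finished_preds_sketch(preds, flag, eos):
    # preds: [batch*beam, beam], flag: [batch*beam, 1]; finished rows become all eos.
    finished = tf.cast(tf.broadcast_to(flag, tf.shape(preds)), tf.bool)
    return tf.where(finished, tf.fill(tf.shape(preds), tf.cast(eos, preds.dtype)), preds)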

athena.utils.misc.gated_linear_layer(inputs, gates, name=None)
athena.utils.misc.validate_seqs(seqs, eos)

Discard the end symbol and all elements after it.

Parameters
  • seqs – shape=(batch_size, seq_length)

  • eos – eos id

Returns

seqs without eos id

Return type

validated_preds
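
A plain-Python sketch of the intended effect (the real function operates on tensors and its return container may differ):

def validate_seqs_sketch(seqs, eos):
    out = []
    for seq in seqs:
        seq = list(seq)
        # Cut at the first eos; the eos itself and everything after it is discarded.
        out.append(seq[:seq.index(eos)] if eos in seq else seq)
    return out

print(validate_seqs_sketch([[5, 7, 2, 9], [3, 2, 4, 6]], 2))
# [[5, 7], [3]]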

athena.utils.misc.get_wave_file_length(wave_file)

Get the wave file length (duration) in ms.

Parameters

wave_file – the path of wave file

Returns

the length (ms) of the wave file

Return type

wav_length
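
A minimal sketch using the standard-library wave module (the toolkit's own implementation may read the header differently):

import wave

def wave_length_ms_sketch(wave_file):
    with wave.open(wave_file, 'rb') as wav:
        return wav.getnframes() / wav.getframerate() * 1000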

athena.utils.misc.splice_numpy(x, context)

Splice a tensor along the last dimension with context.

Example:

>>> t = [[[1, 2, 3],
...       [4, 5, 6],
...       [7, 8, 9]]]
>>> splice_numpy(t, [0, 1])
[[[1, 2, 3, 4, 5, 6],
  [4, 5, 6, 7, 8, 9],
  [7, 8, 9, 7, 8, 9]]]

Parameters
  • x – a tf.Tensor with shape (B, T, D), a.k.a. (N, H, W)

  • context – a list of context offsets

Returns

spliced tensor with shape (…, D * len(context))
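
A NumPy sketch of the splicing described above (assumed semantics: each context offset indexes a neighbouring frame, clipped at the sequence edges); it reproduces the example output:

import numpy as np

def splice_sketch(x, context):
    x = np.asarray(x)                        # (B, T, D)
    num_frames = x.shape[1]
    parts = [x[:, np.clip(np.arange(num_frames) + c, 0, num_frames - 1), :] for c in context]
    return np.concatenate(parts, axis=-1)    # (B, T, D * len(context))

print(splice_sketch([[[1, 2, 3], [4, 5, 6], [7, 8, 9]]], [0, 1]))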

athena.utils.misc.set_default_summary_writer(summary_directory=None)
athena.utils.misc.tensor_shape(tensor)

Return a list with tensor shape. For each dimension, use tensor.get_shape() first. If not available, use tf.shape().
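
A sketch of that idiom (static dimensions where known, runtime tf.shape() entries otherwise):

import tensorflow as tf

def tensor_shape_sketch(tensor):
    static = tensor.get_shape().as_list()
    dynamic = tf.shape(tensor)
    # Use the static size when it is known, otherwise the runtime size.
    return [dim if dim is not None else dynamic[i] for i, dim in enumerate(static)]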

athena.utils.misc.apply_label_smoothing(inputs, num_classes, smoothing_rate=0.1)

Applies label smoothing. See https://arxiv.org/abs/1512.00567.

Parameters
  • inputs – a 3d tensor with shape of [N, T, V], where V is the vocabulary size

  • num_classes – number of classes

  • smoothing_rate – smoothing rate
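
The standard label-smoothing formula from that paper, which the function is assumed to follow, redistributes smoothing_rate of the probability mass uniformly over the classes:

def label_smoothing_sketch(inputs, num_classes, smoothing_rate=0.1):
    # inputs: one-hot (or probability) targets of shape [N, T, V].
    return (1.0 - smoothing_rate) * inputs + smoothing_rate / num_classes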


athena.utils.misc.get_dict_from_scp(vocab, func=lambda x: ...)
class athena.utils.misc.athena_dialect

Bases: csv.Dialect

Describe the usual properties of Excel-generated CSV files.

delimiter =
quotechar = "
doublequote = True
skipinitialspace = True
lineterminator =
quoting
athena.utils.misc.read_csv_dict(csv_path)
athena.utils.misc.read_csv(csv_path)
athena.utils.misc.write_csv(csv_path, data: list)