`athena.data.datasets.asr.speech_recognition_kaldiio`¶

Audio dataset for speech recognition, backed by kaldiio.

Module Contents¶

Classes¶

SpeechRecognitionDatasetKaldiIOBuilder

class athena.data.datasets.asr.speech_recognition_kaldiio.SpeechRecognitionDatasetKaldiIOBuilder(config=None)¶

Bases: athena.data.datasets.base.SpeechBaseDatasetBuilder

property num_class¶: Return the number of classes, i.e. the maximum index in the vocabulary plus 1.
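As a rough illustration of the "maximum index plus 1" rule, here is a minimal sketch over a plain dictionary. The `vocab` mapping and the standalone `num_class` function are hypothetical stand-ins for illustration only; they are not Athena's vocabulary format or its actual implementation.

```python
# Hypothetical vocabulary mapping tokens to integer indices
# (an assumption for illustration, not Athena's vocabulary format).
vocab = {"<blank>": 0, "a": 1, "b": 2, "<unk>": 3}


def num_class(token_to_index):
    """Return the largest index plus one, i.e. the size of an index range starting at 0."""
    return max(token_to_index.values()) + 1


print(num_class(vocab))  # 4
```

Using the maximum index rather than the number of entries keeps the count valid as an output-layer size even if some indices in the range are unused.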

property sample_type¶

Returns

sample_type of the dataset:

{
    "input": tf.float32,
    "input_length": tf.int32,
    "output_length": tf.int32,
    "output": tf.int32,
}

Return type

dict

property sample_shape¶

Returns

sample_shape of the dataset:

{
    "input": tf.TensorShape([None, dim, nc]),
    "input_length": tf.TensorShape([]),
    "output_length": tf.TensorShape([]),
    "output": tf.TensorShape([None]),
}

Return type

dict

property sample_signature¶

Returns

sample_signature of the dataset:

{
    "input": tf.TensorSpec(shape=(None, None, dim, nc), dtype=tf.float32),
    "input_length": tf.TensorSpec(shape=(None), dtype=tf.int32),
    "output_length": tf.TensorSpec(shape=(None), dtype=tf.int32),
    "output": tf.TensorSpec(shape=(None, None), dtype=tf.int32),
}

Return type

dict
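Comparing `sample_shape` with `sample_signature`, each signature shape looks like the per-sample shape with a leading batch dimension of unknown size. The sketch below illustrates that relationship with plain tuples instead of TensorFlow objects; `add_batch_dim` and the values `dim = 40`, `nc = 1` are hypothetical and chosen only for illustration.

```python
# Hypothetical feature dimension and channel count, for illustration only.
dim, nc = 40, 1

# Per-sample shapes mirroring sample_shape, written as plain tuples
# (None marks a variable-length axis).
sample_shape = {
    "input": (None, dim, nc),
    "input_length": (),
    "output_length": (),
    "output": (None,),
}


def add_batch_dim(shapes):
    """Prepend an unknown-size batch dimension to every per-sample shape."""
    return {name: (None,) + tuple(shape) for name, shape in shapes.items()}


batched = add_batch_dim(sample_shape)
for name, shape in batched.items():
    print(name, shape)
```

In this sketch the scalar length fields become rank-1 shapes `(None,)`. Note that `(None)` without a trailing comma is simply `None` in Python, not a one-element tuple.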

default_config¶

preprocess_kaldi_data(file_dir, apply_sort_filter=True)¶: Generate a list of tuples (feat_key, speaker).

__getitem__(index)¶

compute_cmvn_if_necessary(is_necessary=True)¶: Compute the CMVN (cepstral mean and variance normalization) file if necessary.
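For readers unfamiliar with CMVN, the following is a minimal pure-Python sketch of the general technique: estimate a per-dimension mean and variance over feature frames, then normalize with them. It is not Athena's implementation; `compute_cmvn`, `apply_cmvn`, and the toy `frames` data are hypothetical names and values introduced for illustration.

```python
import math


def compute_cmvn(frames):
    """Return per-dimension (means, variances) over a list of equal-length frames."""
    n = len(frames)
    dim = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    variances = [
        sum((f[d] - means[d]) ** 2 for f in frames) / n for d in range(dim)
    ]
    return means, variances


def apply_cmvn(frames, means, variances, eps=1e-8):
    """Shift each dimension to zero mean and scale it to (roughly) unit variance."""
    return [
        [(f[d] - means[d]) / math.sqrt(variances[d] + eps) for d in range(len(means))]
        for f in frames
    ]


# Toy data: two frames with two feature dimensions each.
frames = [[1.0, 10.0], [3.0, 14.0]]
means, variances = compute_cmvn(frames)
normalized = apply_cmvn(frames, means, variances)
```

The small `eps` term is an assumption of this sketch; it guards against division by zero when a dimension has no variance.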
