athena.data.datasets.speech_set
Audio dataset.
Module Contents
Classes
SpeechDatasetBuilder
- class athena.data.datasets.speech_set.SpeechDatasetBuilder(config=None)
Bases:
athena.data.datasets.base.SpeechBaseDatasetBuilder
SpeechDatasetBuilder
- property num_class
@property
- Returns
the target dimension
- Return type
int
- property sample_type
@property
- Returns
sample_type of the dataset:
{
    "input": tf.float32,
    "input_length": tf.int32,
    "output": tf.float32,
    "output_length": tf.int32,
}
- Return type
dict
- property sample_shape
@property
- Returns
sample_shape of the dataset:
{
    "input": tf.TensorShape(
        [None, self.audio_featurizer.dim, self.audio_featurizer.num_channels]
    ),
    "input_length": tf.TensorShape([]),
    "output": tf.TensorShape([None, None]),
    "output_length": tf.TensorShape([]),
}
- Return type
dict
- property sample_signature
@property
- Returns
sample_signature of the dataset:
{
    "input": tf.TensorSpec(
        shape=(None, None, None, None), dtype=tf.float32
    ),
    "input_length": tf.TensorSpec(shape=([None]), dtype=tf.int32),
    "output": tf.TensorSpec(shape=(None, None, None), dtype=tf.float32),
    "output_length": tf.TensorSpec(shape=([None]), dtype=tf.int32),
}
- Return type
dict
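Comparing the two properties above, each entry in sample_signature has one more axis than the matching entry in sample_shape (the batch axis), and every axis is left unspecified. A minimal sketch of that relationship in plain Python, using tuples as stand-ins for tf.TensorShape; the helper name batched_signature_shape and the feature_dim / num_channels values are illustrative assumptions, not part of Athena:

```python
# Per-sample shapes as plain tuples (stand-ins for tf.TensorShape).
# feature_dim and num_channels are made-up values, not Athena defaults.
feature_dim, num_channels = 40, 1

sample_shape = {
    "input": (None, feature_dim, num_channels),
    "input_length": (),
    "output": (None, None),
    "output_length": (),
}


def batched_signature_shape(shape):
    """Hypothetical helper: add a batch axis and leave every axis unspecified."""
    return (None,) * (len(shape) + 1)


signature_shapes = {
    key: batched_signature_shape(shape) for key, shape in sample_shape.items()
}

for key, shape in signature_shapes.items():
    print(key, shape)
```

The resulting ranks (4, 1, 3, 1) match the TensorSpec shapes listed for sample_signature.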
- default_config
- preprocess_data(file_path)
Generate a list of tuples (wav_filename, wav_length_ms, speaker).
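As a rough illustration of the tuple layout, the sketch below parses an in-memory manifest into (wav_filename, wav_length_ms, speaker) tuples. The file format is not stated here, so the tab-separated layout, the header row, the integer length, and the function name parse_entries are all assumptions made for this example; this is not Athena's implementation.

```python
import csv
import io


def parse_entries(text):
    """Hypothetical parser: returns [(wav_filename, wav_length_ms, speaker), ...]."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    next(reader)  # skip the assumed header row
    return [(name, int(length_ms), speaker) for name, length_ms, speaker in reader]


# Made-up manifest contents, for illustration only.
manifest = (
    "wav_filename\twav_length_ms\tspeaker\n"
    "a.wav\t1200\tspk1\n"
    "b.wav\t800\tspk2\n"
)
entries = parse_entries(manifest)
print(entries)
```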
- __getitem__(index)
Get a sample.
- Parameters
index (int) – index of the entry to fetch
- Returns
sample:
{
    "input": input_data,
    "input_length": input_data.shape[0],
    "output": output_data,
    "output_length": output_data.shape[0],
}
- Return type
dict
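The returned layout can be sketched without TensorFlow. In this minimal example, nested lists stand in for the feature tensors, so len() plays the role of shape[0] (the number of frames); make_sample and the frame values are hypothetical and only mirror the documented dict keys:

```python
def make_sample(input_data, output_data):
    """Hypothetical helper mirroring the documented sample layout."""
    return {
        "input": input_data,
        "input_length": len(input_data),    # stands in for input_data.shape[0]
        "output": output_data,
        "output_length": len(output_data),  # stands in for output_data.shape[0]
    }


# Three frames of 4-dimensional features with made-up values.
frames = [[0.0] * 4 for _ in range(3)]
sample = make_sample(frames, frames)
print(sample["input_length"], sample["output_length"])
```

Storing the lengths next to the data lets a later padding or batching step recover the true number of frames in each sample.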