`athena.data.datasets.speech_set`¶

audio dataset

Module Contents¶

Classes¶

SpeechDatasetBuilder

SpeechDatasetBuilder

class athena.data.datasets.speech_set.SpeechDatasetBuilder(config=None)¶

Bases: athena.data.datasets.base.SpeechBaseDatasetBuilder

SpeechDatasetBuilder

property num_class¶

Returns

the target dimension

Return type

int

property sample_type¶

Returns

sample_type of the dataset:

{
    "input": tf.float32,
    "input_length": tf.int32,
    "output": tf.float32,
    "output_length": tf.int32,
}

Return type

dict

property sample_shape¶

Returns

sample_shape of the dataset:

{
    "input": tf.TensorShape(
        [None, self.audio_featurizer.dim, self.audio_featurizer.num_channels]
    ),
    "input_length": tf.TensorShape([]),
    "output": tf.TensorShape([None, None]),
    "output_length": tf.TensorShape([]),
}

Return type

dict
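As an illustration of how the `None` entries in these shapes can be read, the sketch below treats `None` as "any size" and checks concrete shapes against a spec. It is plain Python, not TensorFlow's own compatibility check; `matches_shape` is a hypothetical helper, and the values 40 and 1 stand in for `audio_featurizer.dim` and `audio_featurizer.num_channels`, which are assumptions here.

```python
def matches_shape(actual, spec):
    """Return True if a concrete shape fits a spec where None means 'any size'."""
    if len(actual) != len(spec):
        return False
    return all(want is None or got == want for got, want in zip(actual, spec))


# Hypothetical featurizer settings: 40 features per frame, 1 channel.
dim, num_channels = 40, 1
input_spec = [None, dim, num_channels]

# Any number of frames is accepted, but the feature dim must match.
ok = matches_shape((120, 40, 1), input_spec)
bad = matches_shape((120, 80, 1), input_spec)

# The length entries are scalars, i.e. an empty shape.
scalar_ok = matches_shape((), [])
```

Under this reading, only the time axis of "input" is free, while both axes of "output" are left unspecified.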

property sample_signature¶

Returns

sample_signature of the dataset:

{
    "input": tf.TensorSpec(
        shape=(None, None, None, None), dtype=tf.float32
    ),
    "input_length": tf.TensorSpec(shape=([None]), dtype=tf.int32),
    "output": tf.TensorSpec(shape=(None, None, None), dtype=tf.float32),
    "output_length": tf.TensorSpec(shape=([None]), dtype=tf.int32),
}

Return type

dict
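Comparing the listing above with sample_shape, each signature entry appears to have one more axis than the per-sample shape (a leading batch axis), with every dimension left as `None`. The sketch below reproduces that relationship in plain Python. `batched_rank` is a hypothetical helper, and the values 40 and 1 are assumed featurizer settings.

```python
def batched_rank(sample_shape):
    """Rank of a batch of samples: one extra leading (batch) axis."""
    return len(sample_shape) + 1


per_sample_shape = {
    "input": [None, 40, 1],  # 40 and 1 are hypothetical featurizer values
    "input_length": [],
    "output": [None, None],
    "output_length": [],
}

# Build a fully unspecified shape of the batched rank for every entry.
signature_shape = {
    key: (None,) * batched_rank(shape)
    for key, shape in per_sample_shape.items()
}
```

With these inputs, "input" becomes a rank-4 shape and "output" a rank-3 shape, which agrees with the ranks listed in sample_signature.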

default_config¶

preprocess_data(file_path)¶

Generate a list of tuples (wav_filename, wav_length_ms, speaker).
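A minimal sketch of building such a list, assuming the input is tab-separated text with a header row naming the three fields. The file layout, the `read_entries` helper, and the fallback speaker name "global" are assumptions for illustration, not the library's confirmed behavior.

```python
import csv
import io


def read_entries(csv_text):
    """Parse tab-separated rows into (wav_filename, wav_length_ms, speaker) tuples."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter="\t")
    entries = []
    for row in reader:
        # Treating the speaker column as optional is an assumption of this sketch.
        speaker = row.get("speaker") or "global"
        entries.append((row["wav_filename"], int(row["wav_length_ms"]), speaker))
    return entries


example = (
    "wav_filename\twav_length_ms\tspeaker\n"
    "a.wav\t2300\tspk1\n"
    "b.wav\t1200\tspk2\n"
)
entries = read_entries(example)
```

Each tuple keeps the fields in the documented order, so downstream code can unpack it as `wav_filename, wav_length_ms, speaker = entries[i]`.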

__getitem__(index)¶

Get the sample at the given index.

Parameters

index (int) – index of the entries

Returns

sample:

{
    "input": input_data,
    "input_length": input_data.shape[0],
    "output": output_data,
    "output_length": output_data.shape[0],
}

Return type

dict
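The structure of the returned dict can be sketched as follows. Nested lists stand in for feature arrays, so `len(x)` plays the role of `x.shape[0]` (the number of frames). The `make_sample` helper and the example values are hypothetical.

```python
def make_sample(input_data, output_data):
    """Assemble a sample dict; each length is the size of the first (time) axis."""
    return {
        "input": input_data,
        "input_length": len(input_data),
        "output": output_data,
        "output_length": len(output_data),
    }


# Three frames of two features each, as nested lists standing in for arrays.
frames = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
# Two target frames; the values are made up for illustration.
targets = [[1.0, 0.0], [0.0, 1.0]]

sample = make_sample(frames, targets)
```

The two length fields record how many frames each sequence really has, which is the information a consumer needs once variable-length samples are padded into a batch.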
