athena.data.datasets.tts.speech_fastspeech2
audio dataset
Module Contents
Classes
SpeechFastspeech2DatasetBuilder | SpeechSynthesisDatasetBuilder
Functions
- athena.data.datasets.tts.speech_fastspeech2.average_by_duration(x, durs)
- athena.data.datasets.tts.speech_fastspeech2.tf_average_by_duration(x, durs)
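Neither function is documented here. In FastSpeech 2 style pipelines, a helper with this signature typically reduces a frame-level contour (such as F0 or energy) to one value per input token by averaging over each token's frames. The sketch below illustrates that idea only. The name `average_by_duration_sketch`, the treatment of zero-valued frames as unvoiced, and the sample values are assumptions, not Athena's implementation.

```python
import numpy as np


def average_by_duration_sketch(x, durs):
    """Average frame-level values over each token's span of frames.

    x:    1-D float array with one value per frame (e.g. F0 or energy).
    durs: 1-D int array with the number of frames assigned to each token.
    Zero-valued frames are ignored (assumed here to mean "unvoiced").
    """
    x = np.asarray(x, dtype=np.float32)
    durs = np.asarray(durs, dtype=np.int64)
    # Frame boundaries of each token: [0, d0, d0 + d1, ...]
    bounds = np.concatenate(([0], np.cumsum(durs)))
    out = np.zeros(len(durs), dtype=np.float32)
    for i, (start, end) in enumerate(zip(bounds[:-1], bounds[1:])):
        segment = x[start:end]
        voiced = segment[segment != 0.0]
        # A token with no voiced frames keeps the value 0.0.
        out[i] = voiced.mean() if voiced.size > 0 else 0.0
    return out


# Hypothetical frame-level F0 for three tokens lasting 2, 2 and 3 frames.
frame_f0 = [100.0, 110.0, 0.0, 0.0, 200.0, 220.0, 240.0]
token_durs = [2, 2, 3]
token_f0 = average_by_duration_sketch(frame_f0, token_durs)
```

The `tf_` prefix on the second function suggests a TensorFlow-side counterpart of the same operation, but this page does not confirm that.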
- class athena.data.datasets.tts.speech_fastspeech2.SpeechFastspeech2DatasetBuilder(config=None)
Bases: athena.data.datasets.base.BaseDatasetBuilder
SpeechSynthesisDatasetBuilder
- property num_class
- Returns
the max_index of the vocabulary
- Return type
int
- property feat_dim
Returns the number of feature dimensions.
- property sample_type
- Returns
sample_type of the dataset:
{
    "utt_id": tf.string,
    "input": tf.int32,
    "input_length": tf.int32,
    "output_length": tf.int32,
    "output": tf.float32,
    "speaker": tf.int32,
    "duration": tf.int32
}
- Return type
dict
- property sample_shape
- Returns
sample_shape of the dataset:
{
    "utt_id": tf.TensorShape([]),
    "input": tf.TensorShape([None]),
    "input_length": tf.TensorShape([]),
    "output_length": tf.TensorShape([]),
    "output": tf.TensorShape([None, feature_dim]),
    "f0": tf.TensorShape([None]),
    "energy": tf.TensorShape([None]),
    "speaker": tf.TensorShape([]),
    "duration": tf.TensorShape([None])
}
- Return type
dict
- property sample_signature
- Returns
sample_signature of the dataset:
{
    "utt_id": tf.TensorSpec(shape=(None), dtype=tf.string),
    "input": tf.TensorSpec(shape=(None, None), dtype=tf.int32),
    "input_length": tf.TensorSpec(shape=(None), dtype=tf.int32),
    "output_length": tf.TensorSpec(shape=(None), dtype=tf.int32),
    "output": tf.TensorSpec(shape=(None, None, feature_dim), dtype=tf.float32),
    "f0": tf.TensorSpec(shape=(None, None), dtype=tf.float32),
    "energy": tf.TensorSpec(shape=(None, None), dtype=tf.float32),
    "speaker": tf.TensorSpec(shape=(None), dtype=tf.int32)
}
- Return type
dict
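Comparing the two listings, each sample_signature entry looks like the matching sample_shape entry with one extra leading dimension, which reads as a batch axis. The NumPy sketch below shows how variable-length samples of shape [None] could be padded into a [batch, max_len] array. The helper `pad_batch`, the zero padding value, and the sample token ids are assumptions for illustration, not Athena's batching code.

```python
import numpy as np


def pad_batch(sequences, pad_value=0):
    """Stack variable-length 1-D sequences into one [batch, max_len] array."""
    lengths = np.array([len(s) for s in sequences], dtype=np.int32)
    batch = np.full((len(sequences), lengths.max()), pad_value, dtype=np.int32)
    for row, seq in enumerate(sequences):
        # Copy the real tokens; the tail of the row stays at pad_value.
        batch[row, : len(seq)] = seq
    return batch, lengths


# Two hypothetical samples whose "input" has shape [None] (token ids).
inputs = [[5, 9, 2], [7, 1]]
batched_input, input_length = pad_batch(inputs)
```

Keeping the true lengths next to the padded array is what lets a model mask out the padded positions later, which matches the presence of "input_length" and "output_length" in the listings.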
- default_config
- load_duration(duration)
- preprocess_data(file_path)
Generates a list of tuples (audio_feature, wav_length_ms, transcript, duration, speaker).
- load_audio_feature(audio_feature_file)
- __getitem__(index)
- compute_cmvn_if_necessary(is_necessary=True)
Computes the CMVN (cepstral mean and variance normalization) file if necessary.
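CMVN statistics are usually a per-dimension mean and variance accumulated over every frame of every utterance. The sketch below shows that computation in NumPy. The function names, the in-memory return value, and the sample features are assumptions; how Athena stores or loads its CMVN file is not described on this page.

```python
import numpy as np


def compute_cmvn_sketch(features):
    """Return per-dimension mean and variance over all frames of all utterances.

    features: iterable of [num_frames, feat_dim] arrays, one per utterance.
    """
    total_frames = 0
    feat_sum = 0.0
    feat_sq_sum = 0.0
    for feat in features:
        feat = np.asarray(feat, dtype=np.float64)
        total_frames += feat.shape[0]
        feat_sum = feat_sum + feat.sum(axis=0)
        feat_sq_sum = feat_sq_sum + (feat ** 2).sum(axis=0)
    mean = feat_sum / total_frames
    # Var[x] = E[x^2] - E[x]^2, computed per feature dimension.
    var = feat_sq_sum / total_frames - mean ** 2
    return mean, var


def apply_cmvn(feat, mean, var, eps=1e-12):
    """Normalize features to zero mean and unit variance per dimension."""
    return (np.asarray(feat, dtype=np.float64) - mean) / np.sqrt(var + eps)


# Two hypothetical utterances with 2-dimensional features.
utts = [np.array([[1.0, 10.0], [3.0, 30.0]]), np.array([[5.0, 50.0]])]
mean, var = compute_cmvn_sketch(utts)
normalized = apply_cmvn(np.concatenate(utts), mean, var)
```

Accumulating sums rather than concatenating all features keeps memory use flat across a large corpus, which is the usual reason to compute the statistics once and cache them.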