athena.transform.feats.framepow
This model extracts framepow features per frame.
Module Contents
Classes
Framepow | Compute power of every frame in speech.
- class athena.transform.feats.framepow.Framepow(config: dict)
Bases: athena.transform.feats.base_frontend.BaseFrontend
Compute power of every frame in speech.
- Parameters
config – contains four optional parameters.
- Shape:
output: \((T, 1)\).
- Examples::
>>> config = {'window_length': 0.25, 'remove_dc_offset': True}
>>> framepow_op = Framepow.params(config).instantiate()
>>> framepow_out = framepow_op('test.wav', 16000)
- classmethod params(config=None)
Set params.
- Parameters
config – contains the following four optional parameters:
'window_length' – Window length in seconds. (float, default = 0.025)
'frame_length' – Hop length in seconds. (float, default = 0.010)
'snip_edges' – If 1, the last frame (shorter than window_length) is cut off. If 2, half a frame_length of data is padded to the input. (int, default = 1)
'remove_dc_offset' – Subtract the mean from the waveform on each frame. (bool, default = True)
Note
Return an object of class HParams, which is a set of hyperparameters as name-value pairs.
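As a rough illustration of how the four optional parameters combine with their defaults, the sketch below overlays user-supplied values on the documented defaults using a plain dict. The helper name `resolve_framepow_config` and the constant `FRAMEPOW_DEFAULTS` are hypothetical and exist only for this example; the real `params` method returns an HParams object, which is not modelled here.

```python
# Defaults copied from the parameter list above.
FRAMEPOW_DEFAULTS = {
    "window_length": 0.025,    # seconds
    "frame_length": 0.010,     # seconds (hop)
    "snip_edges": 1,
    "remove_dc_offset": True,
}


def resolve_framepow_config(config=None):
    """Hypothetical helper: overlay user values on the documented defaults."""
    resolved = dict(FRAMEPOW_DEFAULTS)
    for key, value in (config or {}).items():
        if key not in resolved:
            # Rejecting unknown keys is a choice made for this sketch,
            # not documented behaviour of Framepow.params.
            raise KeyError(f"unknown framepow option: {key}")
        resolved[key] = value
    return resolved


# Only window_length is overridden; the other three keep their defaults.
print(resolve_framepow_config({"window_length": 0.25}))
```

Passing `None` or an empty dict yields the defaults unchanged, which mirrors the `config=None` signature of `params`.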
- call(audio_data, sample_rate)
Calculate the power of every frame in speech.
- Parameters
audio_data – the audio signal from which to compute the frame power.
sample_rate – the sample rate of the signal we are working with.
- Shape:
audio_data: \((1, N)\)
sample_rate: float
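To make the input and output shapes concrete, here is a minimal NumPy sketch of a per-frame power computation. It is not Athena's implementation: the function name `frame_power_sketch` is hypothetical, power is assumed to be the sum of squared samples in each frame, and only the snip_edges = 1 behaviour (incomplete trailing frames are dropped) is modelled.

```python
import numpy as np


def frame_power_sketch(audio_data, sample_rate, window_length=0.025,
                       frame_length=0.010, remove_dc_offset=True):
    """Illustrative per-frame power; a sketch, not Athena's implementation."""
    # Accept the documented (1, N) layout and flatten it to N samples.
    samples = np.asarray(audio_data, dtype=np.float64).reshape(-1)
    win = int(round(window_length * sample_rate))  # samples per window
    hop = int(round(frame_length * sample_rate))   # samples per hop
    if samples.size < win:
        return np.zeros((0, 1))
    # Keep complete windows only (the snip_edges = 1 case).
    num_frames = 1 + (samples.size - win) // hop
    # Row t of idx holds the sample indices belonging to frame t.
    idx = np.arange(win)[None, :] + hop * np.arange(num_frames)[:, None]
    frames = samples[idx]
    if remove_dc_offset:
        # Subtract each frame's own mean.
        frames = frames - frames.mean(axis=1, keepdims=True)
    # Assumption: power = sum of squared samples per frame.
    return np.sum(frames ** 2, axis=1, keepdims=True)


# One second of a 440 Hz tone at 16 kHz, stored as (1, N).
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
audio_data = np.sin(2 * np.pi * 440.0 * t)[None, :]
print(frame_power_sketch(audio_data, sample_rate).shape)
```

With the default 25 ms window and 10 ms hop, the 16000 input samples give T = 1 + (16000 - 400) // 160 = 98 frames, so the result has shape (98, 1), matching the documented (T, 1) output layout.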
- dim()