Speech Recognition
Please refer to the results table for supported tasks/examples. To run an ASR example, execute the following commands from your Athena root directory:
source env.sh
bash examples/asr/$dataset_name/run.sh
Core Stages:
(1) Data preparation
Before you run examples/asr/$dataset_name/run.sh, download the corresponding dataset and store it in examples/asr/$dataset_name/data. The script examples/asr/$dataset_name/local/prepare_data.py then generates the required csv file describing the dataset.
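As a rough illustration of what such a csv manifest can look like, the sketch below writes one tab-separated row per utterance. The column names, the tab delimiter, and the helper names `wav_length_ms` and `write_manifest` are assumptions made for this example, not taken from prepare_data.py; check that script for the layout Athena actually expects.

```python
import csv
import wave

# Hypothetical column names; check prepare_data.py for the real ones.
COLUMNS = ["wav_filename", "wav_length_ms", "transcript"]


def wav_length_ms(path):
    """Return the duration of a PCM WAV file in milliseconds."""
    with wave.open(str(path), "rb") as wav:
        return int(1000 * wav.getnframes() / wav.getframerate())


def write_manifest(transcripts, csv_path):
    """Write a header plus one tab-separated row per utterance.

    transcripts maps a WAV file path to its transcript text.
    """
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t")
        writer.writerow(COLUMNS)
        for wav_path, text in sorted(transcripts.items()):
            writer.writerow([str(wav_path), wav_length_ms(wav_path), text])
```

Calling `write_manifest({"utt1.wav": "hello world"}, "all.csv")` would produce a header row followed by one row for `utt1.wav`, provided that file exists and is a readable WAV file.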
(2) Data normalization
With the generated csv file, first compute the CMVN (cepstral mean and variance normalization) file:
$ python athena/cmvn_main.py examples/asr/$dataset_name/configs/mpc.json examples/asr/$dataset_name/data/all.csv
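To make the idea behind CMVN concrete, here is a minimal pure-Python sketch that accumulates a per-dimension mean and variance over all feature frames and then normalizes frames with them. It assumes global statistics over the whole dataset and features stored as nested lists. The function names `compute_cmvn` and `apply_cmvn` are hypothetical, and this is not the implementation in athena/cmvn_main.py, which may, for example, keep separate statistics per speaker.

```python
import math


def compute_cmvn(utterances):
    """Accumulate per-dimension mean and variance over all frames.

    utterances is a list of utterances, each a list of frames,
    each frame a list of floats of the same dimension.
    """
    count = 0
    total = None
    total_sq = None
    for frames in utterances:
        for frame in frames:
            if total is None:
                total = [0.0] * len(frame)
                total_sq = [0.0] * len(frame)
            for i, x in enumerate(frame):
                total[i] += x
                total_sq[i] += x * x
            count += 1
    if count == 0:
        raise ValueError("no frames to accumulate")
    mean = [s / count for s in total]
    # Population variance E[x^2] - E[x]^2, clamped at zero against rounding.
    var = [max(sq / count - m * m, 0.0) for sq, m in zip(total_sq, mean)]
    return mean, var


def apply_cmvn(frames, mean, var, eps=1e-8):
    """Shift and scale each dimension to zero mean and unit variance."""
    return [
        [(x - m) / math.sqrt(v + eps) for x, m, v in zip(frame, mean, var)]
        for frame in frames
    ]
```

The statistics are computed once over the training data and reused unchanged at training and decoding time, which is why this stage runs before any model training.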
(3) Unsupervised pretraining
You can run unsupervised pretraining using the json file examples/asr/$dataset_name/mpc.json, or skip this stage.
(4) Acoustic model training
You can train a transformer model using the json file examples/asr/$dataset_name/configs/transformer.json, or train an mtl_transformer_ctc model using the json file examples/asr/$dataset_name/configs/mtl_transformer.json.
(5) Language model training
You can train an rnnlm model on the transcripts using the json file examples/asr/$dataset_name/rnnlm.json. You must first prepare the csv file for it.
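One possible way to prepare a text-only csv for language model training is to copy the transcripts out of an existing ASR csv file. The sketch below assumes a tab-separated input with a header row containing a `transcript` column and writes a one-column output. Both layouts and the function name `write_lm_csv` are assumptions for illustration; the format the rnnlm recipe actually expects should be checked against the dataset's own scripts.

```python
import csv


def write_lm_csv(asr_csv_path, lm_csv_path, text_column="transcript"):
    """Copy the non-empty transcripts of an ASR csv into a one-column csv.

    Returns the number of transcripts written.
    """
    written = 0
    with open(asr_csv_path, newline="", encoding="utf-8") as src, \
         open(lm_csv_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src, delimiter="\t")
        writer = csv.writer(dst, delimiter="\t")
        writer.writerow([text_column])
        for row in reader:
            # Skip rows whose transcript is missing or blank.
            text = (row.get(text_column) or "").strip()
            if text:
                writer.writerow([text])
                written += 1
    return written
```

Dropping empty transcripts keeps blank lines out of the language model training data. Any further text normalization (casing, punctuation, tokenization) is left out of this sketch.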
(6) Decoding
Currently, we provide a simple but not very effective way to decode an mtl_transformer_ctc model. To use it, run:
$ python athena/decode_main.py examples/asr/$dataset_name/configs/mtl_transformer.json
For more detailed results with MPC, please refer to the README section of each dataset. Download links for the MPC checkpoints can also be found there.