PyTorch implementation of the paper 'Listen, Attend and Spell' (ICASSP 2016)
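The listener in LAS is a pyramidal BiLSTM: between layers it concatenates each pair of consecutive frames, halving the time resolution per layer. A minimal plain-Python sketch of that reduction step (the function name `pyramid_reduce` is illustrative, not from this repo):

```python
def pyramid_reduce(frames):
    """One pBLSTM time-reduction step: concatenate consecutive frame pairs,
    halving the sequence length (as in the LAS listener)."""
    if len(frames) % 2:
        frames = frames[:-1]  # drop the odd trailing frame (a common convention)
    return [frames[i] + frames[i + 1] for i in range(0, len(frames), 2)]
```

Stacking this reduction over three listener layers shortens a sequence 8x, which is what makes attention over long speech inputs tractable in the paper.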
- Requirements: Python 3, PyTorch
- Dataset: librispeech-clean-100
- Features: model- and framework-agnostic
- Result: 40.43% WER after 34 epochs (train loss 0.0749, down from an initial 3.4095)
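The WER figure above is the word-level Levenshtein (edit) distance between hypothesis and reference, divided by the reference length, as in the WER script cited in [4]. A minimal sketch of that computation (the function name `wer` is illustrative):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    # d[i][j] = edits needed to turn the first i reference words
    # into the first j hypothesis words
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # i deletions
    for j in range(len(hyp) + 1):
        d[0][j] = j  # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(ref)][len(hyp)] / max(len(ref), 1)
```

Note that WER can exceed 100% when the hypothesis contains more insertions than the reference has words.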
[1] Chan, William, et al. "Listen, attend and spell: A neural network for large vocabulary conversational speech recognition." 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2016.
[2] LibriSpeech ASR corpus (http://www.openslr.org/12/)
[3] AzizCode92, Listen-Attend-and-Spell-Pytorch, 2018, GitHub repository (https://github.com/AzizCode92/Listen-Attend-and-Spell-Pytorch)
[4] zszyellow, WER-in-python, 2020, GitHub repository (https://github.com/zszyellow/WER-in-python/blob/master/wer.py#L4)