Text normalization covering grammars

This repository provides covering grammars for English and Russian text normalization as documented in:

Gorman, K., and Sproat, R. 2016. Minimally supervised number normalization. Transactions of the Association for Computational Linguistics 4: 507-519.

Ng, A. H., Gorman, K., and Sproat, R. 2017. Minimally supervised written-to-spoken text normalization. In ASRU, pages 665-670.

If you use these grammars in a publication, we would appreciate it if you cited these works.

Building

The grammars are written in Thrax and compile into OpenFst FAR (FST archive) files. To compile them, run make in the src/ directory.
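
For scripted builds, the step above can be wrapped in a small helper. This is only a sketch, not part of the repository: it assumes make, Thrax, and OpenFst are already installed, and that the compiled archives carry a .far extension and land directly in src/. The function names build_grammars and list_far_files are hypothetical.

```python
import subprocess
from pathlib import Path


def build_grammars(src_dir="src"):
    """Run `make` inside src_dir; raises CalledProcessError if the build fails."""
    subprocess.run(["make"], cwd=src_dir, check=True)


def list_far_files(src_dir="src"):
    """Return the sorted names of the *.far files found directly in src_dir.

    Assumption: compiled archives use the .far extension and are written
    to src_dir itself rather than to a subdirectory.
    """
    return sorted(p.name for p in Path(src_dir).glob("*.far"))


if __name__ == "__main__":
    build_grammars()
    print(list_far_files())
```

Run from the repository root, this is equivalent to changing into src/ and running make by hand, followed by a listing of whatever archives were produced.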

License

See LICENSE.

Mandatory disclaimer

This is not an official Google product.