This repository collects programs for preprocessing audio and images.
The Image directory includes three functions for data augmentation:
- Image rotation
- Random color
- Random Gaussian
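
To give a rough idea of what two of these augmentations do, here is a minimal standard-library sketch. It is not the code in the Image directory. It assumes a grayscale image stored as a list of rows with pixel values in 0..255, and it reads "Random Gaussian" as additive Gaussian noise, which is a guess. The names `rotate_nearest` and `add_gaussian_noise` and their parameters are hypothetical.

```python
import math
import random


def rotate_nearest(image, angle_deg, fill=0):
    """Rotate a 2-D grayscale image (list of rows) about its centre.

    Hypothetical helper: uses inverse mapping with nearest-neighbour
    sampling and keeps the output the same size as the input. Pixels
    that map outside the source image are set to `fill`.
    """
    height, width = len(image), len(image[0])
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    rotated = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx, dy = x - cx, y - cy
            # For each output pixel, look up the source pixel it came from.
            src_x = round(cos_t * dx + sin_t * dy + cx)
            src_y = round(-sin_t * dx + cos_t * dy + cy)
            if 0 <= src_x < width and 0 <= src_y < height:
                rotated[y][x] = image[src_y][src_x]
    return rotated


def add_gaussian_noise(image, sigma=10.0, rng=random):
    """Add zero-mean Gaussian noise to every pixel, clamped to 0..255.

    `rng` is assumed to be the `random` module or a `random.Random`
    instance (pass a seeded instance for reproducible augmentation).
    """
    return [
        [min(255, max(0, round(pixel + rng.gauss(0.0, sigma)))) for pixel in row]
        for row in image
    ]


if __name__ == "__main__":
    sample = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    # A random angle per call is what makes rotation an augmentation.
    angle = random.uniform(-30.0, 30.0)
    print(rotate_nearest(sample, angle))
    print(add_gaussian_noise(sample, sigma=2.0, rng=random.Random(0)))
```

With this sign convention and rows drawn top to bottom, positive angles rotate the image clockwise. In practice an imaging library would handle interpolation and colour channels; the nested-list version only shows the idea.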
The Speech directory includes four Python scripts:
- character_to_pinyin.py: converts Chinese characters to pinyin;
- trim_silence.py: trims silence from the beginning and end of an audio file;
- mp3_translate_wav.py: converts .mp3 files to .wav;
- generate_audio.py: generates audio using Baidu's API.
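
As an illustration of the silence-trimming idea, the sketch below drops quiet samples from both ends of a signal using a simple amplitude threshold. It is not the logic in trim_silence.py, which may use a different criterion such as frame energy. The function name, the `threshold` parameter, and the assumption that the audio is a list of signed 16-bit integer samples are all hypothetical.

```python
def trim_silence(samples, threshold=500):
    """Return `samples` without leading and trailing quiet samples.

    A sample counts as silence when its absolute value is at or below
    `threshold` (an assumed value; tune it to the recording). Quiet
    samples in the middle of the signal are kept.
    """
    start = 0
    end = len(samples)
    # Advance past quiet samples at the beginning.
    while start < end and abs(samples[start]) <= threshold:
        start += 1
    # Step back over quiet samples at the end.
    while end > start and abs(samples[end - 1]) <= threshold:
        end -= 1
    return samples[start:end]


if __name__ == "__main__":
    audio = [0, 3, -2, 900, -1200, 10, 800, 0, 1]
    print(trim_silence(audio))
```

A per-sample threshold is sensitive to clicks and noise, so a real script would more likely compare the energy of short frames against a threshold before cutting.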