Speech-Image_Tool

This repository collects programs for preprocessing audio and images.

Image

The Image directory includes three functions for data augmentation.

Image rotation
Random color
Random Gaussian
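
As a rough illustration of the three augmentations, here is a minimal NumPy sketch. It is not the code in the Image directory: the function names, parameters, and default values are hypothetical, rotation is limited to multiples of 90 degrees for simplicity, and "Random Gaussian" is assumed here to mean additive Gaussian noise (it could equally mean Gaussian blur).

```python
import numpy as np


def random_rotate(image, rng):
    """Rotate an H x W x C image by a random multiple of 90 degrees."""
    k = int(rng.integers(0, 4))  # 0, 1, 2 or 3 quarter turns
    return np.rot90(image, k=k, axes=(0, 1))


def random_color(image, rng, low=0.8, high=1.2):
    """Scale each color channel by an independent random factor."""
    factors = rng.uniform(low, high, size=(1, 1, image.shape[2]))
    out = image.astype(np.float32) * factors
    return np.clip(out, 0, 255).astype(np.uint8)


def random_gaussian(image, rng, sigma=10.0):
    """Add zero-mean Gaussian noise (assumed meaning of 'Random Gaussian')."""
    noise = rng.normal(0.0, sigma, size=image.shape)
    out = image.astype(np.float32) + noise
    return np.clip(out, 0, 255).astype(np.uint8)


# Example: chain the three steps on a small random 8-bit RGB image.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(4, 6, 3), dtype=np.uint8)
augmented = random_gaussian(random_color(random_rotate(image, rng), rng), rng)
```

Passing a seeded `rng` keeps the augmentations reproducible, which helps when debugging a training pipeline.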

Speech

The Speech directory includes four Python scripts.

character_to_pinyin.py: converts Chinese characters to pinyin;
trim_silence.py: trims silence from the beginning and end of an audio file;
mp3_translate_wav.py: converts .mp3 files to .wav;
generate_audio.py: generates audio using Baidu's API.
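
To illustrate the idea behind silence trimming, here is a minimal amplitude-threshold sketch. It is not the implementation in trim_silence.py, which may use a different method. It assumes mono audio already loaded as a floating-point NumPy array in the range [-1, 1]; the function signature and the `threshold` value are hypothetical.

```python
import numpy as np


def trim_silence(samples, threshold=0.01):
    """Drop leading and trailing samples whose absolute amplitude is below threshold."""
    loud = np.flatnonzero(np.abs(samples) >= threshold)
    if loud.size == 0:
        # Every sample is below the threshold: return an empty array.
        return samples[:0]
    # Keep everything between the first and last loud sample, inclusive.
    return samples[loud[0]:loud[-1] + 1]


# Example: quiet samples at both ends are removed, the quiet sample in the middle is kept.
signal = np.array([0.0, 0.001, 0.5, -0.3, 0.0, 0.2, 0.002, 0.0])
trimmed = trim_silence(signal)
```

Only the edges are trimmed, so pauses inside the utterance are preserved.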