This repo contains a PyTorch implementation of the paper "Emotion Identification from raw speech signals using DNNs" by Mousmita Sarma, Pegah Ghahremani, Daniel Povey, Nagendra Kumar Goel, Kandarpa Kumar Sarma, and Najim Dehak. The paper was published at Interspeech 2018. Paper: https://danielpovey.com/files/2018_interspeech_emotion_id.pdf
I suggest installing Anaconda3 on your system. First, download Anaconda3 from https://docs.anaconda.com/anaconda/install/hashes/lin-3-64/
bash Anaconda3-2019.03-Linux-x86_64.sh
https://github.com/KrishnaDN/x-vector-pytorch.git
Once Anaconda3 is installed successfully, install the required packages using requirements.txt
pip install -r requirements.txt
This step creates manifest files for training and testing
python dataset.py --pickle_filepath /media/newhd/IEMOCAP_dataset/data_collected_full.pickle \
    --dataset_root /media/newhd/IEMOCAP_dataset/raw_data --store_meta meta/
To add your own dataset, take a look at dataset.py and modify the code accordingly
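As a rough starting point for a custom dataset, the sketch below writes training and testing manifests from a list of (audio path, label) pairs. The manifest layout (one "audio_path label" pair per line), the function names write_manifest and split_samples, and the output file names are assumptions made for illustration; check dataset.py for the format the training script actually expects.

```python
import os


def split_samples(samples, test_fraction=0.2):
    """Deterministic split: the last `test_fraction` of the sorted samples is held out."""
    ordered = sorted(samples)
    n_test = int(len(ordered) * test_fraction)
    cut = len(ordered) - n_test
    return ordered[:cut], ordered[cut:]


def write_manifest(samples, manifest_path):
    """Write one 'audio_path label' pair per line (assumed format, not confirmed by the repo)."""
    os.makedirs(os.path.dirname(manifest_path) or ".", exist_ok=True)
    with open(manifest_path, "w") as f:
        for audio_path, label in samples:
            f.write(f"{audio_path} {label}\n")


if __name__ == "__main__":
    # Hypothetical samples: (path to wav file, integer emotion label).
    samples = [(f"my_dataset/utt_{i:03d}.wav", i % 4) for i in range(10)]
    train, test = split_samples(samples, test_fraction=0.2)
    write_manifest(train, "meta/training.txt")
    write_manifest(test, "meta/testing.txt")
```

A space-separated layout breaks if audio paths contain spaces; a tab or comma delimiter would be a safer choice in that case.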
This step starts training the model.
python training_Emo_TDNN_StatPool.py --training_filepath meta/training.txt --testing_filepath meta/testing.txt \
    --input_dim 1 --num_classes 4 --batch_size 64 --use_gpu True --num_epochs 100
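For readers adapting the training script, here is a minimal sketch of how flags like the ones above could be parsed with argparse. It is not the repo's actual parser: the defaults are copied from the example command, and the str2bool helper is an assumption added because argparse's type=bool treats any non-empty string, including "False", as True.

```python
import argparse


def str2bool(value):
    """Convert 'True'/'False' style strings to bool; bool('False') would be True."""
    lowered = value.lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    """Hypothetical parser mirroring the flags in the example training command."""
    parser = argparse.ArgumentParser(description="Training CLI sketch")
    parser.add_argument("--training_filepath", type=str, default="meta/training.txt")
    parser.add_argument("--testing_filepath", type=str, default="meta/testing.txt")
    parser.add_argument("--input_dim", type=int, default=1)
    parser.add_argument("--num_classes", type=int, default=4)
    parser.add_argument("--batch_size", type=int, default=64)
    parser.add_argument("--use_gpu", type=str2bool, default=True)
    parser.add_argument("--num_epochs", type=int, default=100)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```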
Note that this model is based on raw waveform TDNN.
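To illustrate the general idea of a raw-waveform TDNN with statistics pooling, here is a small PyTorch sketch. It is not the model implemented in this repo: the class name RawWaveformTDNN, the layer count, kernel sizes, dilations, and hidden width are all assumptions chosen for brevity. Only input_dim=1 and num_classes=4 are taken from the training command.

```python
import torch
import torch.nn as nn


class RawWaveformTDNN(nn.Module):
    """Illustrative raw-waveform TDNN with statistics pooling (hypothetical layer sizes)."""

    def __init__(self, input_dim=1, num_classes=4, hidden=64):
        super().__init__()
        # TDNN layers are 1-D convolutions over time; dilation widens the temporal context.
        self.frame_layers = nn.Sequential(
            nn.Conv1d(input_dim, hidden, kernel_size=5, stride=2),
            nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=3, dilation=2),
            nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=3, dilation=3),
            nn.ReLU(),
        )
        # Mean and standard deviation are concatenated, hence 2 * hidden inputs.
        self.classifier = nn.Linear(2 * hidden, num_classes)

    def forward(self, waveform):
        # waveform: (batch, input_dim, num_samples)
        frames = self.frame_layers(waveform)        # (batch, hidden, num_frames)
        mean = frames.mean(dim=2)                   # (batch, hidden)
        std = frames.std(dim=2)                     # (batch, hidden)
        pooled = torch.cat([mean, std], dim=1)      # (batch, 2 * hidden)
        return self.classifier(pooled)              # (batch, num_classes)


if __name__ == "__main__":
    model = RawWaveformTDNN(input_dim=1, num_classes=4)
    batch = torch.randn(2, 1, 16000)  # two random one-second clips, assuming 16 kHz audio
    print(model(batch).shape)
```

Because the statistics pooling step averages over the time axis, the same network accepts utterances of different lengths and still produces one fixed-size vector per utterance.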
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change. For any queries, contact: [email protected]