attention

Accent recognition, for great justice.

Scripts

build_config.py

  1. Parses a directory containing {.mov,.wav} files.
  2. Builds a config file with entries of the form {language, count}.
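The two steps above can be sketched as follows. This is a minimal illustration, not the script's actual code: it assumes the language is encoded in the filename as a prefix followed by digits (for example `english12.mov`), and it assumes one `language count` pair per line in the config file. The names `language_of`, `count_languages`, and `write_config` are hypothetical.

```python
import os
import re
from collections import Counter

AUDIO_EXTENSIONS = {".mov", ".wav"}


def language_of(filename):
    """Assumed naming rule: 'english12.mov' -> 'english'."""
    stem = os.path.splitext(filename)[0]
    return re.sub(r"\d+$", "", stem).lower()


def count_languages(directory):
    """Count the .mov/.wav files in `directory`, grouped by language."""
    counts = Counter()
    for name in os.listdir(directory):
        extension = os.path.splitext(name)[1].lower()
        if extension in AUDIO_EXTENSIONS:
            counts[language_of(name)] += 1
    return counts


def write_config(counts, path):
    """Write one 'language count' pair per line (assumed layout)."""
    with open(path, "w") as fh:
        for language, count in sorted(counts.items()):
            fh.write(f"{language} {count}\n")
```

Calling `write_config(count_languages("some_dir"), "dataset.conf")` would then produce a file in the assumed layout.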

get_dataset.py

  1. Parses a config file generated by build_config.py
  2. Downloads the source files (via FTP) and converts them to .wav (via ffmpeg).
  3. Uses multiprocessing.
  4. Puts all the resulting .wav files into a single directory (/data).
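The conversion step could look roughly like the sketch below, which fans ffmpeg calls out over a process pool. It is an illustration under assumptions: the FTP download is omitted, the output directory is taken to be a relative `data` folder, and the function names and worker count are hypothetical. Only the common ffmpeg flags `-y` (overwrite output) and `-i` (input file) are used.

```python
import os
import subprocess
from multiprocessing import Pool

OUT_DIR = "data"  # assumed location of the single output directory


def wav_path(source_path, out_dir=OUT_DIR):
    """Map 'downloads/english1.mov' to 'data/english1.wav'."""
    stem = os.path.splitext(os.path.basename(source_path))[0]
    return os.path.join(out_dir, stem + ".wav")


def ffmpeg_command(source_path, out_dir=OUT_DIR):
    """Build the ffmpeg argument list for one conversion."""
    return ["ffmpeg", "-y", "-i", source_path, wav_path(source_path, out_dir)]


def convert(source_path):
    """Run ffmpeg on one file; return the path and ffmpeg's exit code."""
    result = subprocess.run(
        ffmpeg_command(source_path),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return source_path, result.returncode


def convert_all(source_paths, workers=4):
    """Convert many files in parallel, one ffmpeg process per task."""
    os.makedirs(OUT_DIR, exist_ok=True)
    with Pool(workers) as pool:
        return pool.map(convert, source_paths)
```

When run as a script, `convert_all` should be called under an `if __name__ == "__main__":` guard so that worker processes can import the module safely.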

feature_extraction.py

  1. Parses the files in /data (from get_dataset.py)
  2. Extracts features (MFCC, et al.) from those files.
  3. Writes the features as serialized NumPy arrays to /processed.
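As a rough sketch of this pipeline, the code below reads a .wav file, computes a per-frame log-magnitude spectrum, and saves it with `np.save`. Note the substitution: it does not compute MFCCs, and the log spectrum stands in as a simpler feature so the example needs only NumPy and the standard library. The frame length, hop size, relative `processed` directory, and function names are all assumptions.

```python
import os
import wave

import numpy as np


def read_wav(path):
    """Read a 16-bit PCM .wav file; return (float samples, sample rate)."""
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        rate = wf.getframerate()
        channels = wf.getnchannels()
        raw = wf.readframes(wf.getnframes())
    samples = np.frombuffer(raw, dtype="<i2").astype(np.float64)
    if channels > 1:
        # Average the channels down to mono.
        samples = samples.reshape(-1, channels).mean(axis=1)
    return samples / 32768.0, rate


def log_spectrogram(samples, frame_len=400, hop=160):
    """Log-magnitude spectrum of Hann-windowed frames.

    Returns an array of shape (n_frames, frame_len // 2 + 1).
    """
    if len(samples) < frame_len:
        samples = np.pad(samples, (0, frame_len - len(samples)))
    n_frames = 1 + (len(samples) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack(
        [samples[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.log(np.abs(np.fft.rfft(frames, axis=1)) + 1e-10)


def process_file(wav_file, out_dir="processed"):
    """Extract features from one file and save them as a .npy array."""
    os.makedirs(out_dir, exist_ok=True)
    samples, _rate = read_wav(wav_file)
    features = log_spectrogram(samples)
    stem = os.path.splitext(os.path.basename(wav_file))[0]
    out_path = os.path.join(out_dir, stem + ".npy")
    np.save(out_path, features)
    return out_path
```

The saved arrays can be read back with `np.load`. Swapping in a real MFCC implementation would only change the body of `log_spectrogram`.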

Notebooks

Audiolab.ipynb

Prototyping environment: spectrograms, signal vectors.

Config Files

dataset.conf

Per-language count of source files (complete).

Data Files

  1. /data (.wav encoded audio)
  2. speech_archive_meta.tsv: complementary dataset with additional information about the speakers in each recording.

Todo:

  1. Extract features, store in a database (sqlite).
  2. Parse speech_archive_meta.tsv, put into the database.
  3. Do ML, hope for the best.
  4. Get different features, return to step 1.
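The first two items might be approached with Python's built-in `sqlite3` and `csv` modules, as sketched below. The schema is an assumption: the columns of speech_archive_meta.tsv are not documented here, so the loader takes column names from the file's header row and stores every value as text. The table names and function names are hypothetical, and features are stored as opaque bytes (for example, the contents of a saved array file).

```python
import csv
import sqlite3


def load_tsv(conn, tsv_path, table="speakers"):
    """Load a tab-separated file into `table`, one TEXT column per header field.

    Rows whose field count does not match the header are skipped.
    Column names containing double quotes are not handled by this sketch.
    """
    with open(tsv_path, newline="", encoding="utf-8") as fh:
        reader = csv.reader(fh, delimiter="\t")
        header = next(reader)
        columns = ", ".join(f'"{name}" TEXT' for name in header)
        conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({columns})')
        placeholders = ", ".join("?" for _ in header)
        rows = [row for row in reader if len(row) == len(header)]
        conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    conn.commit()
    return len(rows)


def store_features(conn, recording, feature_bytes):
    """Store one recording's serialized features as a BLOB."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS features "
        "(recording TEXT PRIMARY KEY, data BLOB)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO features VALUES (?, ?)",
        (recording, feature_bytes),
    )
    conn.commit()
```

A connection from `sqlite3.connect("attention.db")` (filename assumed) can be passed to both functions, so metadata and features end up in the same database.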
