Using Deep Convolution Generative Adversarial Networks to generate snare drum sounds
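The notebook's actual architecture isn't shown here, but a DCGAN generator for short raw-audio clips typically stacks transposed convolutions that upsample a latent vector into a waveform. The sketch below is a minimal, hypothetical example of that idea (layer counts, kernel sizes, and the 1024-sample output length are all assumptions, not the notebook's real configuration):

```python
import torch
import torch.nn as nn

class SnareGenerator(nn.Module):
    """Illustrative 1-D DCGAN generator: latent vector -> short waveform."""

    def __init__(self, latent_dim: int = 100):
        super().__init__()
        self.net = nn.Sequential(
            # (B, latent_dim, 1) -> (B, 256, 16)
            nn.ConvTranspose1d(latent_dim, 256, kernel_size=16, stride=1),
            nn.BatchNorm1d(256), nn.ReLU(),
            # (B, 256, 16) -> (B, 128, 64)
            nn.ConvTranspose1d(256, 128, kernel_size=8, stride=4, padding=2),
            nn.BatchNorm1d(128), nn.ReLU(),
            # (B, 128, 64) -> (B, 64, 256)
            nn.ConvTranspose1d(128, 64, kernel_size=8, stride=4, padding=2),
            nn.BatchNorm1d(64), nn.ReLU(),
            # (B, 64, 256) -> (B, 1, 1024); tanh keeps samples in [-1, 1]
            nn.ConvTranspose1d(64, 1, kernel_size=8, stride=4, padding=2),
            nn.Tanh(),
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # z: (batch, latent_dim) -> waveform: (batch, 1, 1024)
        return self.net(z.unsqueeze(-1))

audio = SnareGenerator()(torch.randn(2, 100))
print(audio.shape)  # torch.Size([2, 1, 1024])
```

A real snare at, say, 16 kHz would need a longer output (a few thousand samples), which just means more upsampling stages; the structure stays the same.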
After 30 epochs (around 20 minutes of training on Google Colab with GPU acceleration), the model's output clearly shows similarities with an actual snare sound:
Coloured images show the short-time Fourier transform (STFT) output
Clearly, the model has learned that a snare sound typically begins with a transient (the initial peak); however, in this example it has placed two transients. This could be due to examples in the dataset that contain two transients.
Additionally, the STFT analysis shows strong similarities, with the initial peak followed by the noisy tail.
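For reference, an STFT magnitude plot like the ones above can be computed with `scipy.signal.stft`. The snippet below runs it on a crude synthetic "snare" (an exponentially decaying noise burst; the 16 kHz sample rate and 512-sample window are assumptions, not the notebook's settings):

```python
import numpy as np
from scipy.signal import stft

sr = 16000                       # assumed sample rate
t = np.arange(sr // 4) / sr      # 250 ms clip
rng = np.random.default_rng(0)

# Crude snare-like stand-in: noise burst with a sharp attack and decaying tail
snare = rng.standard_normal(t.size) * np.exp(-t * 30.0)

# STFT: rows are frequency bins, columns are time frames
freqs, times, Z = stft(snare, fs=sr, nperseg=512)
magnitude_db = 20 * np.log10(np.abs(Z) + 1e-8)  # log-magnitude for plotting

print(magnitude_db.shape)  # (freq_bins, time_frames); 512 // 2 + 1 = 257 bins
```

In the resulting image, a transient shows up as a bright vertical stripe (energy across all frequencies at one instant), and the tail as energy that fades out over time.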
However, the tail is not smooth; it ends fairly abruptly. On listening to the output, it also sounds 'squeaky', with a noticeable wet-sounding noise. The first transient, though, sounds very good and could easily be processed within a DAW to sound much closer to an actual snare.
The losses were tracked during training:
As you can see, the discriminator's loss is converging towards zero, though not quickly or reliably; this may suggest that a different discriminator design is needed. Due to a lack of resources, it is unknown whether this model would produce better results given more training, although there is a good chance that as the discriminator's loss decreased, it would eventually start to increase again as the generator improves.
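The kind of loss bookkeeping described above usually looks like the toy loop below: per step, the discriminator is trained to score real samples as 1 and fakes as 0, the generator is trained to fool it, and both losses are appended to lists for plotting. This is an illustrative sketch with tiny linear networks on random data, not the notebook's actual training loop:

```python
import torch
import torch.nn as nn

# Tiny stand-in networks so the loop runs in seconds (hypothetical sizes)
G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
D = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

d_losses, g_losses = [], []
for step in range(50):
    real = torch.randn(32, 4) + 2.0   # stand-in for real snare features
    fake = G(torch.randn(32, 8))

    # Discriminator step: real -> 1, fake -> 0 (detach so G isn't updated)
    opt_d.zero_grad()
    loss_d = (bce(D(real), torch.ones(32, 1))
              + bce(D(fake.detach()), torch.zeros(32, 1)))
    loss_d.backward()
    opt_d.step()

    # Generator step: make D output 1 on fakes
    opt_g.zero_grad()
    loss_g = bce(D(fake), torch.ones(32, 1))
    loss_g.backward()
    opt_g.step()

    # Track both losses for plotting after training
    d_losses.append(loss_d.item())
    g_losses.append(loss_g.item())
```

Plotting `d_losses` and `g_losses` against `step` gives a curve like the one described: a discriminator loss that falls while it wins, and climbs back up once the generator catches up.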
Click on 'SnareGAN.ipynb' to view the notebook.