SimonsThijs/AML-deep-n-cheap

This is a modification of deep-n-cheap.

We added search transferring for MLPs.

Setup:

  1. Download rcv1_2000.npz from: https://drive.google.com/drive/folders/1fFy-kE_hvjEXAfDcqtcormNljj7YJc2R?usp=sharing
  2. Place the file in the root directory of this repo.
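
To verify the download before running anything, you can inspect the file's keys and array shapes (a minimal sketch; the key names follow the custom-dataset format described later in this README):

import numpy as np

# Quick sanity check on the downloaded dataset file.
data = np.load('rcv1_2000.npz')
for key in ('xtr', 'ytr', 'xte', 'yte'):
    print(key, data[key].shape)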

Running search transfer

To run a search transfer, use the following command:

python3 main.py --network 'mlp_st' --dataset 'mnist' --input_size 1 28 28 --output_size 10 --num_hidden_layers 0 2 --hidden_nodes 20 400 --numepochs 60 --bo_prior_states 15 --bo_steps 15 --bo_explore 1000 --dataset2 'fmnist' --input_size2 1 28 28 --output_size2 10 --dataset3 'rcv1_2000.npz' --input_size3 2000 --output_size3 50 --wc 0.01

This will run a search transfer from MNIST to FMNIST and RCV1. Each additional target dataset gets its own numbered argument group (--dataset2/--input_size2/--output_size2, and so on).


Extended version

Code for the extended version of Deep-n-Cheap can be found here.

Welcome

This repository implements Deep-n-Cheap – an AutoML framework to search for deep learning models. Features include:

  • Complexity oriented: Get models with good performance and low training time or parameter count
  • Customizable search space for both architecture and training hyperparameters
  • Supports CNNs, MLPs, and regression
  • Available for both tf.keras and PyTorch frameworks

Highlight: State-of-the-art performance on benchmark and custom datasets with training time orders of magnitude lower than competing frameworks and NAS efforts.

How to cite?
The original research paper was presented at ACML 2020 and is also available on arXiv. Use this BibTeX entry for citation:

@inproceedings{Dey2020_ACML,
  title = {Deep-n-Cheap: An Automated Search Framework for Low Complexity Deep Learning},
  author = {Dey, Sourya and Kanala, Saikrishna C. and Chugg, Keith M. and Beerel, Peter A.},
  booktitle = {Proc. 12th Asian Conference on Machine Learning (ACML)},
  pages = {273--288},
  year = {2020},
  month = {Nov},
  publisher = {Proceedings of Machine Learning Research (PMLR)},
  volume = {129}
}

The extended version paper is published in the journal SN Computer Science.

How to run?

$ pip install sobol_seq tqdm
$ git clone https://github.com/souryadey/deep-n-cheap.git
$ cd deep-n-cheap
$ python main.py

For help:

$ python main.py -h

Complexity customization

Set wc to high values to penalize complexity at the cost of performance:

  • --wc=0: Performance oriented search
  • --wc=0.1: Balance performance and complexity
  • --wc=10: Complexity oriented search
  • Any non-negative value of wc is supported!
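
As a rough illustration of how wc trades off the two terms, here is a hypothetical weighted cost. This is a sketch only; the exact objective used by Deep-n-Cheap is defined in the paper, and the function name and normalization below are illustrative assumptions:

import math

def search_cost(val_acc, train_time_per_epoch, wc, t_ref=1.0):
    # Hypothetical weighted objective, lower is better.
    # NOTE: illustrative only, not the exact formula from the paper.
    perf = -math.log(val_acc)                                # grows as accuracy drops
    complexity = math.log(1 + train_time_per_epoch / t_ref)  # grows with training cost
    return perf + wc * complexity

# wc = 0 ignores complexity; larger wc favors cheaper models.
print(search_cost(val_acc=0.97, train_time_per_epoch=3.0, wc=0.0))
print(search_cost(val_acc=0.97, train_time_per_epoch=3.0, wc=10.0))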

Datasets (including custom)

Set dataset to either:

  • --dataset=<dataset_name>. Currently supported values of <dataset_name> are 'mnist', 'fmnist', 'cifar10', 'cifar100'
  • --dataset='path+dataset_name.npz', where the dataset should be a .npz file with 4 keys:
    • xtr: numpy array of shape (num_train_samples, num_features...), example (50000,3,32,32) or (60000,784). Image data should be in channels_first format.
    • ytr: numpy array of shape (num_train_samples,)
    • xte: numpy array of shape (num_test_samples, num_features...)
    • yte: numpy array of shape (num_test_samples,)
  • Some datasets can be downloaded from the links in dataset_links.txt. Alternatively, define your own custom datasets (see the sketch after this list).
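
As an illustration, a compatible .npz file can be created with numpy. This is a minimal sketch; the filename and array contents are placeholders:

import numpy as np

# Build a custom dataset in the 4-key .npz format described above.
# Shapes and values are random placeholders; substitute your real data.
xtr = np.random.rand(1000, 784).astype(np.float32)  # train inputs
ytr = np.random.randint(0, 10, size=1000)           # train labels, shape (num_train_samples,)
xte = np.random.rand(200, 784).astype(np.float32)   # test inputs
yte = np.random.randint(0, 10, size=200)            # test labels
np.savez('my_dataset.npz', xtr=xtr, ytr=ytr, xte=xte, yte=yte)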

Quick Example

Download mnist.npz from https://drive.google.com/drive/folders/1fFy-kE_hvjEXAfDcqtcormNljj7YJc2R?usp=sharing and place it in this folder. Then run:

python main.py --network 'mlp' --dataset 'mnist.npz' --input_size 784 --output_size 10 --numepochs 3 --bo_prior_states 3 --bo_steps 3 --drop_probs_mlp 0 0.5

This should take about 90 seconds to run on a CPU without a GPU, reach roughly 97% validation accuracy (YMMV), and produce a file results.pkl.
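
The results can then be inspected with pickle (a minimal sketch; the exact structure of results.pkl depends on the search configuration):

import pickle

# Load and explore the search results written by main.py.
with open('results.pkl', 'rb') as f:
    results = pickle.load(f)
print(type(results))
print(results)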

Detailed Examples

  1. Search for CNNs between 4-16 layers on CIFAR-10, train each for 100 epochs, run Bayesian optimization for 15 prior points and 15 steps. Optimize for performance only. Estimated search cost: 30 GPU hours on AWS-EC2 P3-2xlarge
python main.py --network 'cnn' --dataset 'cifar10' --input_size 3 32 32 --output_size 10 --wc 0 --numepochs 100 --bo_prior_states 15 --bo_steps 15 --num_conv_layers 4 16 --dl_framework torch

Just change dl_framework for the tf.keras version:

python main.py --network 'cnn' --dataset 'cifar10' --input_size 3 32 32 --output_size 10 --wc 0 --numepochs 100 --bo_prior_states 15 --bo_steps 15 --num_conv_layers 4 16 --dl_framework tf.keras
  2. Search for CNNs between 5-10 layers on MNIST without augmentation, max channels in any conv layer 256, search for batch sizes from 64 to 128 and dropout drop probabilities in [0.1,0.2]. Optimize for fast training by using a high wc.
python main.py --network 'cnn' --dataset 'mnist' --augment False --input_size 1 32 32 --output_size 10 --wc 1 --num_conv_layers 5 10 --channels_upper 256 --batch_size 64 128 --drop_probs_cnn 0.1 0.2 --dl_framework tf.keras
  3. Search for MLPs on FMNIST without augmentation, with 3 Bayesian optimization prior states and steps for each of stage 1 and stage 3. The dl_framework can be set to either torch or tf.keras.
python main.py --network 'mlp' --dataset 'fmnist' --val_split 0 --input_size 1 28 28 --output_size 10 --numepochs 10 --bo_prior_states 3 --bo_steps 3 --bo_explore 10 --dl_framework torch
  4. Search for MLPs on the custom Reuters RCV1 dataset, which has 2000 input features and 50 output classes. Search between 0-2 hidden layers, moderately penalize parameter count, search for initial learning rates from 1e-4 to 1e-1 (the --lr flag takes log10 exponents), and run each model for 20 epochs. Use half the data for validation.
python main.py --network 'mlp' --dataset 'rcv1_2000.npz' --val_split 0.5 --input_size 2000 --output_size 50 --wc 0.05 --penalize 'numparams' --numepochs 20 --num_hidden_layers 0 2 --lr -4 -1
  5. Search for CNNs on the custom Reuters RCV1 dataset reshaped to 5 channels of 20x20 pixels each and saved as 'rcv1_2000_reshaped.npz' (a reshaping sketch follows the command below). Use default CNN search settings.
python main.py --network 'cnn' --dataset 'rcv1_2000_reshaped.npz' --input_size 5 20 20 --output_size 50
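
A file like 'rcv1_2000_reshaped.npz' could be produced from the flat dataset with a plain numpy reshape (a minimal sketch, assuming rcv1_2000.npz follows the 4-key format above):

import numpy as np

# Reshape flat 2000-feature vectors into 5 channels of 20x20 (5*20*20 = 2000),
# keeping the channels_first layout the framework expects.
data = np.load('rcv1_2000.npz')
np.savez('rcv1_2000_reshaped.npz',
         xtr=data['xtr'].reshape(-1, 5, 20, 20), ytr=data['ytr'],
         xte=data['xte'].reshape(-1, 5, 20, 20), yte=data['yte'])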
  6. Search for regression models on a given dataset. Make sure the .npz file is in the root directory, or point --data_folder to a specific path. (A sketch for generating a compatible regression dataset follows the command below.)
python3 main.py --network 'mlp' --dataset new_reg.npz --val_split 0. --input_size 1 --output_size 1 --wc 0.0 --numepochs 10 --bo_prior_states 3 --bo_steps 3 --bo_explore 10 --dl_framework tf.keras --problem_type regression --batch_size 4 8
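
The file new_reg.npz is not included in the repo; here is a minimal sketch of generating a compatible single-feature regression dataset with synthetic data (illustrative only):

import numpy as np

# Synthetic 1-input, 1-output regression data in the 4-key .npz format.
x = np.random.rand(1200, 1).astype(np.float32)
y = (3.0 * x[:, 0] + 0.1 * np.random.randn(1200)).astype(np.float32)
np.savez('new_reg.npz', xtr=x[:1000], ytr=y[:1000],
         xte=x[1000:], yte=y[1000:])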

Some results from the original research paper

  • >91% accuracy on CIFAR-10 in 9 hrs of searching with a model taking 3 sec/epoch to train.
  • >95% accuracy on Fashion-MNIST in 17 hrs of searching with a model taking 5 sec/epoch to train.
  • >91% accuracy on Reuters RCV1 in 2 hrs of searching with a model with 1M parameters.

Contact

Deep-n-Cheap is developed and maintained by the USC HAL team.
