FANN-on-MCU: Optimized FANN Inference for Microcontrollers

This repository contains optimized code for running inference with FANN-trained neural networks on microcontrollers. The currently supported platforms are the ARM Cortex-M series and Parallel Ultra-Low Power (PULP) platforms.

FANN-on-ARM: Optimized FANN Inference for ARM Cortex M-series

This repository contains optimized code for running inference with FANN-trained neural networks on the ARM Cortex-M series platform.

Given a data file and pre-trained network in FANN's format, all necessary files to run and test the network on the microcontroller are generated.

Reference/Attribution

If this code is helpful for your research, please cite

X. Wang, M. Magno, L. Cavigelli, L. Benini, "FANN-on-MCU: An Open-Source Toolkit for Energy-Efficient Neural Network Inference at the Edge of the Internet of Things", arXiv:1911.03314 [cs.LG], Nov. 2019

Prerequisites

You need data and a pre-trained network in the FANN format. The generated code uses optimized functions provided by CMSIS-DSP. Python must be installed to run the script. This code has been tested on TI's MSP432 platform and ST's STM32L475VG.

Usage

First, you need to export your data in the FANN default format and train a neural network with FANN. How to do this is explained here. You should end up with two files, a .data file and a .net file. An example can be found in the sample-data folder.

To optimize memory access, the code generation script takes the available RAM and Flash memory of the selected microcontroller into account: it stores the parameters of the trained model in the memory level closest to the processor that is still large enough to hold the model. To enable this, describe the memory configuration of your microcontroller in a file like mem_config.json and pass it to the code generation script.
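The placement rule described above can be sketched in C as follows. This only illustrates the selection logic and is not the code used by generate.py; the type mem_level_t, the function pick_mem_level, and the capacities in example_levels are hypothetical and say nothing about the actual format of mem_config.json.

```c
#include <stddef.h>

/* Hypothetical description of one memory level. */
typedef struct {
    const char *name;       /* label, used only for readability */
    size_t capacity_bytes;  /* space available for model parameters */
} mem_level_t;

/* Illustrative values only, ordered from closest to the processor to farthest. */
static const mem_level_t example_levels[] = {
    { "RAM",    64u * 1024u },
    { "Flash", 512u * 1024u },
};

/* Return the index of the first (closest) level that can hold the model,
 * or -1 if the model does not fit anywhere. */
int pick_mem_level(const mem_level_t *levels, size_t n_levels, size_t model_bytes)
{
    for (size_t i = 0; i < n_levels; ++i) {
        if (levels[i].capacity_bytes >= model_bytes) {
            return (int)i;
        }
    }
    return -1;
}
```

With these example values, a 10 KiB model would land in RAM and a 100 KiB model in Flash, while a 1 MiB model would not fit at all.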

Finally, use the generate.py script to generate the files needed to run inference on the microcontroller, for example for ARM using fixed point:

python generate.py -i sample-data/myNetwork -m fixed -p arm --mem_config mem_config.json

For more details on how to use generate.py:

python generate.py -h

Now all the *.h and *.c files in the root folder (fann.h and fann_structs.h) and in the output folder can be copied to your project. They include all the data and code needed to run the network. To call it from your code, include fann.h and call fann_type *fann_run(fann_type * input);, where fann_type is int if you started from a fixed-point model and float otherwise. Don't forget to add the files to your build scripts, makefile, or project.
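As a rough sketch of how the pointer returned by fann_run might be consumed in a classification task, the helper below picks the index of the largest output. The function name output_argmax is hypothetical, and the fann_type typedef is repeated here only so that the snippet compiles on its own; in a real project it comes from the generated headers and this typedef should be dropped.

```c
#include <stddef.h>

/* Stand-in for the type normally provided through fann.h (assumption:
 * floating-point model). Remove this line when fann.h is included. */
typedef float fann_type;

/* Return the index of the largest of the n_outputs values in outputs.
 * Ties resolve to the lowest index. Returns 0 if n_outputs is 0. */
size_t output_argmax(const fann_type *outputs, size_t n_outputs)
{
    size_t best = 0;
    for (size_t i = 1; i < n_outputs; ++i) {
        if (outputs[i] > outputs[best]) {
            best = i;
        }
    }
    return best;
}
```

In application code this would typically be used as output_argmax(fann_run(input), n_outputs), where n_outputs is the number of output neurons of your network.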

Demo Project

The folder stm32l475-onDeviceTest-linux contains a demo project running test and benchmarking code on an STM32L475 discovery board.

FANN-on-PULP: Optimized FANN Inference for PULP platforms

This repository contains optimized code for running inference with FANN-trained neural networks on PULP platforms.

Given a data file and pre-trained network in FANN's format, all necessary files to run and test the network on the microcontroller are generated.

Prerequisites

You need data and a pre-trained network in the FANN format. Python must be installed to run the script. To use the PULP platform, the PULP SDK must be installed; you can find instructions here. The generated code uses optimized functions provided by PULP-DSP. Please follow the instructions on PULP-DSP to install the library. This code has been tested with PULP Mr.Wolf.

Usage

First, you need to export your data in the FANN default format and train a neural network with FANN. How to do this is explained here. You should end up with two files, a .data file and a .net file. An example can be found in the sample-data folder.

To optimize memory access, the code generation script takes the available RAM and Flash memory of the selected microcontroller into account: it stores the parameters of the trained model in the memory level closest to the processor that is still large enough to hold the model. To enable this, describe the memory configuration of your microcontroller in a file like mem_config.json and pass it to the code generation script.

Finally, use the generate.py script to generate the files needed to run inference on the microcontroller, for example for PULP using fixed point (currently only fixed point is supported on PULP):

python generate.py -i sample-data/myNetwork -m fixed -p pulp --mem_config mem_config.json

For more details on how to use generate.py:

python generate.py -h

Now all the *.h and *.c files in the root folder (fann.h and fann_structs.h) and in the output folder can be copied to your project. They include all the data and code needed to run the network. To call it from your code, include fann.h and call fann_type *fann_run(fann_type * input);, where fann_type is int if you started from a fixed-point model and float otherwise.

Demo Project

The folder MrWolf-onBoardTest contains a demo project running test and benchmarking code on a PULP Mr. Wolf board. To run the demo, you need to install and configure the PULP SDK (instructions here). Remember to source sourceme.sh every time you open a new terminal to use the PULP SDK. After installing the PULP SDK, run generate.py, copy the generated *.h and *.c files to the MrWolf-onBoardTest folder, and run

make clean all run

Fixed-Point Remarks

FANN makes it easy to train your model and export it in fixed-point format. After training with fann_train_on_data, and optionally saving the floating-point model with fann_save, just run

decimal_point = fann_save_to_fixed(ann, "myNetwork_fixed.net");

You can also convert your training or test data to fixed-point representation this way:

test_data = fann_read_train_from_file("./diabetes_test.data");
fann_save_train_to_fixed(test_data, "diabetes_test_fixed.data", decimal_point);

However, once the code is running in-system, remember to convert the input data to the same fixed-point scale: int x_fixed = x * (1 << DECIMAL_POINT);. The decimal point constant is provided through fann_conf.h.
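The conversion above can be wrapped in small helpers, as in the following sketch. The names to_fixed and from_fixed are hypothetical, and the fallback value of DECIMAL_POINT is an arbitrary example used only so that the snippet compiles on its own; in a generated project the constant from fann_conf.h should be used. Unlike the plain cast, this variant rounds to the nearest integer, and it performs no overflow checking.

```c
#include <stdint.h>

/* Example value for illustration only; normally provided by fann_conf.h. */
#ifndef DECIMAL_POINT
#define DECIMAL_POINT 13
#endif

/* Convert a float to fixed point with DECIMAL_POINT fractional bits,
 * rounding to the nearest integer (half away from zero). */
int32_t to_fixed(float x)
{
    float scaled = x * (float)(1 << DECIMAL_POINT);
    return (int32_t)(scaled >= 0.0f ? scaled + 0.5f : scaled - 0.5f);
}

/* Convert a fixed-point value back to float, e.g. to inspect network outputs. */
float from_fixed(int32_t x_fixed)
{
    return (float)x_fixed / (float)(1 << DECIMAL_POINT);
}
```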

Furthermore, make sure that the data used to train your full-precision network is scaled to the [-1,1] interval, including a potential safety margin, and that the same scaling is applied during on-device data preparation. FANN's network quantization method assumes the data is normalized this way and quantizes using worst-case data scaling assumptions. Training the network on non-normalized data, or feeding it such data, is therefore likely to cause overflows.
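One possible way to apply such a scaling on the device is sketched below. The function normalize_input and its parameters are hypothetical: min_val and max_val stand for the value range observed in the training data, and limit is the target bound (for example 0.9 to keep a safety margin inside [-1,1]). Values outside the training range are clamped.

```c
/* Linearly map x from [min_val, max_val] to [-limit, limit] and clamp
 * anything that falls outside the range seen during training. */
float normalize_input(float x, float min_val, float max_val, float limit)
{
    float mid  = 0.5f * (max_val + min_val);
    float half = 0.5f * (max_val - min_val);
    float y;

    if (half <= 0.0f) {
        return 0.0f;  /* degenerate range: no meaningful scaling possible */
    }

    y = limit * (x - mid) / half;
    if (y > limit)  { y = limit; }
    if (y < -limit) { y = -limit; }
    return y;
}
```

The same min_val, max_val, and limit must be used when preparing the training data and on the device; otherwise the network sees differently scaled inputs than it was trained on.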

Experimental tests show that if activation functions with names containing "STEPWISE_" are already used during training, fixed-point inference loses almost no accuracy.

File Description

Constant files:

generate.py: the script that generates the network- and data-specific code files based on FANN-format data.
fann_structs.h and fann.c: contain the implementation of the NN building blocks.
fann.h: the header file to be included in your code providing the fann_type *fann_run(fann_type * input); function declaration.
sample-data/{myNetwork.net, myNetwork.data}: sample data and network pre-trained with FANN.
fann_utils.h and fann_utils.c: contain utility functions.
test.c: contains a test iterating over the exported test data. Serves as an example for 2-class classification.
mem_config.json: contains memory configurations of the selected microcontroller.
arm and pulp: contain source code for respectively ARM Cortex-M series and PULP-based MCUs. generate.py will copy the corresponding source codes for the selected MCU to the output folder.

Generated files in output folder:

fann_net.h: contains the trained parameters and the network structure.
fann_conf.h: contains some more meta information on the network; #layers, fixed-point parameters (if applicable), ...
test_data.h: contains the test input data and the expected results.
fann.c and/or fann_utils.c and fann_utils.h: the source code for the selected MCU, copied to the output folder.

License and Attribution

Please refer to the LICENSE file for the licensing of our code. We rely on the interfaces, specifications, and some code of the FANN project, which is released under the LGPL.
