This is an unofficial TensorFlow implementation of, and set of experiments on, confidence estimation for neural network predictions, as proposed in "Learning Confidence for Out-of-Distribution Detection in Neural Networks".
How to measure the confidence of a network is an interesting topic that has been attracting more and more attention recently.
One example is the workshop Uncertainty and Robustness in Deep Learning at ICML 2019.
In this paper, the authors propose a confidence branch whose estimated confidence is used to perform out-of-distribution detection.
This confidence can also be used to estimate the uncertainty of the model.
To be brief, the idea of this paper is very intuitive and easy to apply.
- Simple and intuitive.
- Quick proof of concept!
- Very few additional computations required.
However, in my opinion, there are still some issues that should be discussed and explored.
The authors have proposed some tricks to tackle these issues, and I am still working on reproducing them.
- The hyper-parameters are quite sensitive, even when using the budget proposed in the paper.
- It does not work well with high-accuracy models, due to insufficient negative samples.
All in all, if you are looking for a way to estimate the uncertainty of your model, this work is still worth a try because it won't take much time to test.
This code is implemented and tested with TensorFlow 1.13.0.
I didn't use any special operators, so it should also work with other versions of TensorFlow.
To run a simple classification example on the MNIST dataset:
python run_example.py
The code runs a very simple fully-connected model with a relatively small branch network to estimate confidence.
You can also turn off the confidence branch by setting WITH_CONF to False in run_example.py.
To see how this works in TensorBoard:
tensorboard --logdir=logs
- Set up the parameters of confidence branch.
from ood._conf.conf_net import base_conf
conf_admin = base_conf.BaseConfNet(lambda1=1.0)
- While building your own network, branch out from the penultimate layer.
confidence = conf_admin.forward(hidden)
- During training, hint your outputs using the confidence value.
conf_admin.hinting(outputs, labels)
- Add the confidence penalty to your final losses.
losses += conf_admin.calculate_loss_conf()
The outline of the approach is illustrated in the example.
The sub-net branches out from the penultimate layer, as shown in the figure.
The authors use a very lightweight sub-net with a small fully-connected layer, which adds negligible computation.
However, other network structures also work.
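As a rough, framework-free illustration of such a branch (not the code in this repository), the sketch below maps a penultimate-layer feature vector to a confidence in (0, 1) using a single fully-connected unit followed by a sigmoid. The helper name `confidence_branch` and all numbers are made up for the example.

```python
import math

def confidence_branch(hidden, weights, bias):
    """Map penultimate-layer features to a scalar confidence in (0, 1).

    Hypothetical helper: one fully-connected unit followed by a sigmoid.
    """
    logit = sum(h * w for h, w in zip(hidden, weights)) + bias
    return 1.0 / (1.0 + math.exp(-logit))

hidden = [0.5, -1.2, 3.0]   # made-up penultimate activations
weights = [0.8, 0.1, 0.4]   # made-up branch weights
confidence = confidence_branch(hidden, weights, bias=-0.5)
print(round(confidence, 3))
```

The sigmoid keeps the output bounded, so it can be used directly as an interpolation coefficient during hinting.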
The paper uses a simple linear interpolation function to give hints to the model.
In my opinion, different problems call for different interpolation functions to produce smoothly interpolated results.
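A minimal sketch of the linear interpolation in plain Python, assuming softmax probabilities and a one-hot label: each hinted probability is `c * p + (1 - c) * y`. The function name `hint` and the sample values are illustrative and are not taken from this repository.

```python
def hint(probs, one_hot_label, confidence):
    """Linearly interpolate predicted probabilities toward the one-hot label.

    confidence = 1 keeps the prediction unchanged; confidence = 0 returns the label.
    """
    return [confidence * p + (1.0 - confidence) * y
            for p, y in zip(probs, one_hot_label)]

probs = [0.2, 0.5, 0.3]   # made-up softmax output
label = [1.0, 0.0, 0.0]   # true class is index 0
hinted = hint(probs, label, confidence=0.25)
print(hinted)
```

With a low confidence, most of the probability mass moves to the true class, which lowers the task loss at the price of a larger confidence penalty.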
The weight of the confidence loss (penalty) is critical and sensitive. With a large penalty, the model avoids outputting low-confidence predictions.
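To make the role of the weight concrete, here is a small sketch of the combined objective for a single sample, assuming a negative log-likelihood task loss on the hinted probability and a confidence penalty of `-log(c)`. The function `total_loss` and the `eps` term are assumptions for the example; `lambda1` mirrors the parameter name used when constructing the confidence branch.

```python
import math

def total_loss(hinted_prob_true_class, confidence, lambda1=1.0, eps=1e-12):
    """Task loss on the hinted prediction plus a weighted confidence penalty."""
    task_loss = -math.log(hinted_prob_true_class + eps)
    conf_loss = -math.log(confidence + eps)   # grows as confidence drops
    return task_loss + lambda1 * conf_loss

# The same low-confidence prediction becomes more expensive as lambda1 grows.
for lambda1 in (0.5, 1.0, 2.0):
    print(lambda1, round(total_loss(0.8, 0.25, lambda1), 3))
```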
As proposed in the paper, hinting is applied to only half of the batch.
Activate this feature by setting half_batch to True, and use a specific batch size (not None).
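One way to picture half-batch hinting, sketched in plain Python: the confidence of the samples that should not receive hints is replaced by 1, so interpolation leaves their predictions untouched. The helper `mask_confidence` and the exact half split are assumptions for illustration (a random per-sample mask would be another option), and this is not how the repository necessarily implements it. The split relies on knowing the batch size up front.

```python
import random

def mask_confidence(confidences, rng):
    """Keep the predicted confidence for half of the batch; set the rest to 1.0.

    A confidence of 1.0 means "no hint": c * p + (1 - c) * y reduces to p.
    """
    half = len(confidences) // 2
    mask = [1] * half + [0] * (len(confidences) - half)
    rng.shuffle(mask)   # choose which samples receive hints
    return [c if m == 1 else 1.0 for c, m in zip(confidences, mask)]

rng = random.Random(0)
batch_conf = [0.2, 0.4, 0.6, 0.8]   # made-up confidences for a batch of 4
print(mask_confidence(batch_conf, rng))
```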
The authors propose a budget in the range [0.1, 1] for automatically tuning the weight of the confidence penalty.
If the confidence loss is greater than the budget, increase the weight; otherwise, decrease it.
The code for the budget will be added soon.
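A minimal sketch of a budget-based update, independent of this repository's code: the weight is raised when the confidence loss exceeds the budget (hints become more expensive) and lowered otherwise. The function name `update_lambda`, the multiplicative step of 1.01, and the loss values are assumptions chosen for the example.

```python
def update_lambda(lambda1, conf_loss, budget, step=1.01):
    """Nudge the penalty weight so the confidence loss drifts toward the budget."""
    if conf_loss > budget:
        return lambda1 * step   # over budget: make hints more expensive
    return lambda1 / step       # within budget: make hints cheaper

lambda1 = 1.0
for conf_loss in [0.9, 0.7, 0.4, 0.2]:   # made-up confidence losses per step
    lambda1 = update_lambda(lambda1, conf_loss, budget=0.3)
print(lambda1)
```

Small multiplicative steps keep the weight positive and change it gradually, which may help given how sensitive training is to this value.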
In brief, the results on MNIST look similar to those in the paper.
However, building a high-accuracy model slightly degrades OOD detection performance.
This is caused by insufficient negative samples. I'm still working on this issue.
More experimental results on the MNIST dataset will be added soon.
Although a confidence for classification problems can also be obtained from the softmax values, I found that this approach works in a more elegant and quite simple manner.
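For comparison, the sketch below reads a baseline confidence off the softmax output (its maximum probability) and applies the same threshold test that would be used on the output of a confidence branch. The threshold of 0.5, the helper names, and the scores are made up for illustration; a suitable threshold has to be chosen per task.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def is_out_of_distribution(score, threshold=0.5):
    """Flag a sample as OOD when its confidence score falls below the threshold."""
    return score < threshold

logits = [2.0, 1.0, 0.1]                # made-up network outputs
baseline_score = max(softmax(logits))   # confidence taken from the softmax
branch_score = 0.35                     # made-up output of a confidence branch

print(is_out_of_distribution(baseline_score))
print(is_out_of_distribution(branch_score))
```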
BTW, I also conducted experiments on other tasks, such as 3D hand tracking.
- Update the budget for auto-tuning the weight of the confidence penalty.
- Detailed analysis on MNIST.
- The issue of insufficient negative samples.
Please let me know if you have any suggestions. I'll be very grateful.
Code by Jia-Yau Shiau [email protected].