
Google Summer of Code 2014: Improvement of Automatic Benchmarking System


Welcome to the benchmarks wiki! This wiki describes in detail the new additions to the benchmarking system made during GSoC 2014.

Overview

This repository was created by Marcus Edel to compare the performance of various machine learning libraries across a common set of algorithms and classifiers. Until the start of GSoC 2014, the libraries were compared only on the runtime of each algorithm/classifier. Since runtime alone is not a sufficient basis for a benchmark, we made the following additions to the repository to make it more efficient, useful, and distinctive:

  • Implemented several widely used machine learning performance metrics
  • Modified the existing interface and integrated these metrics with the classifiers of the following libraries:
    • mlpack
    • scikit-learn
    • Shogun
    • Weka
    • MATLAB
    • mlpy
  • Implemented a bootstrapping framework that can be used to directly compare the performance of libraries (a conceptual sketch follows this list)
  • Developed a visualization framework that plots the metric values for all libraries as a grouped chart
  • Added a sortable table implementation so that library performance can be ordered by any metric value
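The bootstrapping framework itself is described in a later section; the snippet below is only a rough, hypothetical illustration of the underlying idea. It resamples per-run metric scores for two libraries with replacement and estimates a confidence interval for the difference in their mean accuracy. The function and variable names are illustrative and are not part of the benchmark system's API.

```python
import numpy as np

def bootstrap_mean_diff(scores_a, scores_b, n_resamples=1000, seed=0):
    """Estimate a 95% confidence interval for the difference in mean score
    between two libraries by resampling their per-run scores with replacement.
    (Illustrative sketch only; not the benchmark system's implementation.)"""
    rng = np.random.default_rng(seed)
    scores_a = np.asarray(scores_a, dtype=float)
    scores_b = np.asarray(scores_b, dtype=float)
    diffs = []
    for _ in range(n_resamples):
        sample_a = rng.choice(scores_a, size=len(scores_a), replace=True)
        sample_b = rng.choice(scores_b, size=len(scores_b), replace=True)
        diffs.append(sample_a.mean() - sample_b.mean())
    # The 2.5th and 97.5th percentiles bound the 95% confidence interval.
    return np.percentile(diffs, [2.5, 97.5])

# Hypothetical accuracy scores from repeated benchmark runs of two libraries.
library_a_acc = [0.94, 0.95, 0.93, 0.96, 0.94]
library_b_acc = [0.92, 0.93, 0.94, 0.92, 0.93]
low, high = bootstrap_mean_diff(library_a_acc, library_b_acc)
print("95% CI for accuracy difference: [%.3f, %.3f]" % (low, high))
```

If the resulting interval excludes zero, the difference between the two libraries is unlikely to be an artifact of run-to-run variation, which is what makes such a framework useful for direct comparisons.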

Metrics Implemented

The following performance metrics were implemented during the summer of 2014; a short sketch of how several of them can be computed follows the list:

  • Accuracy - the fraction of instances that are classified correctly
  • Precision - the fraction of instances predicted as positive that are truly positive
  • Recall - the fraction of truly positive instances that are predicted as positive
  • F Measure - the harmonic mean of precision and recall
  • Lift - how much better the classifier identifies positives than random selection (precision divided by the prevalence of the positive class)
  • Matthews Correlation Coefficient - a correlation between predicted and true classes, ranging from -1 to +1
  • Mean Squared Error - the average squared difference between predicted and true values
  • Mean Predictive Information
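The actual implementations live in the benchmarking scripts of this repository; as a minimal sketch (not the repository's code), the standard binary-classification metrics above can be computed directly from the confusion-matrix counts:

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Compute several standard binary-classification metrics from the
    confusion-matrix counts. (Minimal sketch; not the repository's code.)"""
    total = tp + fp + tn + fn
    accuracy  = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall    = tp / (tp + fn) if (tp + fn) else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if (precision + recall) else 0.0)
    # Lift: precision relative to the prevalence of the positive class.
    prevalence = (tp + fn) / total
    lift = precision / prevalence if prevalence else 0.0
    # Matthews correlation coefficient.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {'accuracy': accuracy, 'precision': precision, 'recall': recall,
            'f_measure': f_measure, 'lift': lift, 'mcc': mcc}

print(binary_metrics(tp=40, fp=10, tn=45, fn=5))
```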

The Bootstrapping Framework

Changes in the benchmarking API/Interface

Updated structure of the config file

How to add new metrics?

How to integrate a new metric with libraries?
