Segmentation Metrics Package

This is a simple package for computing different metrics for medical image segmentation (images with the suffix .mhd, .mha, .nii, .nii.gz, or .nrrd) and writing them to a CSV file.

If you need support for more file suffixes, let me know by opening a new issue.

Summary

Segmentation performance can be assessed in several ways. The two main families are volume-based metrics and distance-based metrics.

Metrics included

This library computes the following performance metrics for segmentation:

Voxel based metrics

Dice (F-1)
Jaccard
Precision
Recall
False positive rate
False negative rate
Volume similarity

The equations for these metrics can be found on Wikipedia.
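As a quick reference, the textbook definitions of most of these metrics can be written directly in terms of confusion-matrix counts. The sketch below is illustrative only: voxel_metrics is a hypothetical helper, not part of seg-metrics, and the package's own implementation may differ in details such as smoothing terms. Volume similarity is left out because its definition varies between sources.

```python
def voxel_metrics(tp, fp, fn, tn):
    """Textbook overlap metrics from confusion-matrix counts for one label.

    Hypothetical helper for illustration; not the seg-metrics implementation.
    """
    def ratio(num, den):
        # Return NaN instead of raising when a denominator is zero.
        return num / den if den else float("nan")

    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "jaccard": ratio(tp, tp + fp + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "fpr": ratio(fp, fp + tn),
        "fnr": ratio(fn, fn + tp),
    }


# Example counts for a single label.
scores = voxel_metrics(tp=8, fp=2, fn=2, tn=88)
print(scores)
```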

Surface Distance based metrics (with spacing as default)

Hausdorff distance
Hausdorff distance, 95th percentile
Mean (Average) surface distance
Median surface distance
Standard deviation of surface distance

Note: These metrics are symmetric, which means the distance from A to B is the same as the distance from B to A. A more detailed explanation of these surface-distance-based metrics can be found here.
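To make the symmetry concrete, here is a minimal sketch of one common way to symmetrize surface distances: compute nearest-point distances in both directions and pool them before taking statistics. This is an assumption for illustration, since this README does not state how seg-metrics combines the two directions. The function names are hypothetical, and the brute-force search only suits tiny point sets. The 95th percentile is omitted because percentile conventions vary.

```python
import math
import statistics


def directed_distances(points_a, points_b):
    """For each point in A, the distance to its nearest point in B."""
    return [min(math.dist(p, q) for q in points_b) for p in points_a]


def symmetric_surface_metrics(surface_a, surface_b):
    """Pool both directions so that swapping the inputs cannot change the result."""
    pooled = (directed_distances(surface_a, surface_b)
              + directed_distances(surface_b, surface_a))
    return {
        "hd": max(pooled),
        "msd": statistics.mean(pooled),
        "mdsd": statistics.median(pooled),
        "stdsd": statistics.pstdev(pooled),
    }


# Two tiny 2D "surfaces" given as lists of point coordinates.
a = [(0, 0), (0, 1), (0, 2)]
b = [(0, 0), (0, 1), (3, 2)]
ab = symmetric_surface_metrics(a, b)
ba = symmetric_surface_metrics(b, a)
print(ab, ba)
```

Because the distances from both directions are pooled, ab and ba hold the same values, which is the property the note above describes.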

Installation

$ pip install seg-metrics

Getting started

A tutorial is available on Colab.

The API reference is available in the documentation.

Examples can be found below.

Usage

First, import the package:

import seg_metrics.seg_metrics as sg

Evaluate two batches of images with same filenames from two different folders

labels = [0, 4, 5, 6, 7, 8]
gdth_path = 'data/gdth'  # folder containing a batch of ground truth images
pred_path = 'data/pred'  # folder containing the same number of prediction images
csv_file = 'metrics.csv'  # results are saved to this file and also printed to the terminal.
# If csv_file is not set, results are only printed to the terminal.

metrics = sg.write_metrics(labels=labels[1:],  # exclude background
                           gdth_path=gdth_path,
                           pred_path=pred_path,
                           csv_file=csv_file)
print(metrics)  # a list of dictionaries, one per image pair, holding the metrics

After running the code above, metrics is a list of dictionaries containing all the metrics. A .csv file with the same results is also written to the path given by csv_file. If csv_file is not given, the results are not saved to disk.
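To inspect saved results later without rerunning the evaluation, the CSV file can be read back with the standard library. The sketch below makes no assumption about which columns the package writes. load_metrics_csv is a hypothetical helper, not part of seg-metrics.

```python
import csv


def load_metrics_csv(csv_file):
    """Read a metrics CSV into a list of row dictionaries.

    Hypothetical helper; all values are returned as strings,
    so convert them (e.g. with float) before doing arithmetic.
    """
    with open(csv_file, newline="") as f:
        return list(csv.DictReader(f))


# Usage, assuming 'metrics.csv' was written by an earlier run:
# rows = load_metrics_csv('metrics.csv')
# print(rows[0].keys())  # shows the column names actually present
```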

Evaluate two images

labels = [0, 4, 5, 6, 7, 8]
gdth_file = 'data/gdth.mhd'  # ground truth image full path
pred_file = 'data/pred.mhd'  # prediction image full path
csv_file = 'metrics.csv'

metrics = sg.write_metrics(labels=labels[1:],  # exclude background
                  gdth_path=gdth_file,
                  pred_path=pred_file,
                  csv_file=csv_file)

After running the code above, metrics is a dictionary containing all the metrics. A .csv file with the same results is also written to the path given by csv_file.

Note:

When evaluating one image, the returned metrics is a dictionary.
When evaluating a batch of images, the returned metrics is a list of dictionaries.
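Because the return type depends on the input, downstream code that handles both cases may want to normalize it first. A minimal sketch, where as_metrics_list is a hypothetical helper and not part of seg-metrics:

```python
def as_metrics_list(metrics):
    """Wrap a single-image result in a list so both cases can be looped over."""
    return [metrics] if isinstance(metrics, dict) else list(metrics)


single = as_metrics_list({"dice": 0.9})
batch = as_metrics_list([{"dice": 0.9}, {"dice": 0.8}])
print(len(single), len(batch))
```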

Evaluate two images with specific metrics

labels = [0, 4, 5, 6, 7, 8]
gdth_file = 'data/gdth.mhd'
pred_file = 'data/pred.mhd'
csv_file = 'metrics.csv'

metrics = sg.write_metrics(labels=labels[1:],  # exclude background if needed
                  gdth_path=gdth_file,
                  pred_path=pred_file,
                  csv_file=csv_file,
                  metrics=['dice', 'hd'])
# for only one metric
metrics = sg.write_metrics(labels=labels[1:],  # exclude background if needed
                  gdth_path=gdth_file,
                  pred_path=pred_file,
                  csv_file=csv_file,
                  metrics='msd')

Pass the following names to the metrics argument to select specific metrics.

- dice:         Dice (F-1)
- jaccard:      Jaccard
- precision:    Precision
- recall:       Recall
- fpr:          False positive rate
- fnr:          False negative rate
- vs:           Volume similarity

- hd:           Hausdorff distance
- hd95:         Hausdorff distance, 95th percentile
- msd:          Mean (Average) surface distance
- mdsd:         Median surface distance
- stdsd:        Standard deviation of surface distance

For example:

labels = [1]
gdth_file = 'data/gdth.mhd'
pred_file = 'data/pred.mhd'
csv_file = 'metrics.csv'

metrics = sg.write_metrics(labels, gdth_file, pred_file, csv_file, metrics=['dice', 'hd95'])
dice = metrics['dice']
hd95 = metrics['hd95']
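Since the metrics argument accepts either a single name or a list of names, it can help to validate the names before starting a long evaluation. The sketch below uses only the names listed in this section. normalize_metric_names is a hypothetical helper, not part of seg-metrics, and how the package itself reacts to an unknown name is not covered here.

```python
VOXEL_METRICS = ("dice", "jaccard", "precision", "recall", "fpr", "fnr", "vs")
DISTANCE_METRICS = ("hd", "hd95", "msd", "mdsd", "stdsd")


def normalize_metric_names(metrics):
    """Accept one name or a sequence of names and reject unknown ones early."""
    names = [metrics] if isinstance(metrics, str) else list(metrics)
    unknown = [n for n in names if n not in VOXEL_METRICS + DISTANCE_METRICS]
    if unknown:
        raise ValueError(f"Unknown metric name(s): {unknown}")
    return names


print(normalize_metric_names("msd"))
print(normalize_metric_names(["dice", "hd95"]))
```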

Evaluate two images in memory instead of disk

Note:

Both images must be of the same type, either numpy.ndarray or SimpleITK.Image.
The input arguments are different: use gdth_img and pred_img instead of gdth_path and pred_path.
When evaluating numpy.ndarray inputs, the default spacing for distance-based metrics is 1.0 in every dimension.
To evaluate numpy.ndarray inputs with a specific spacing, pass a sequence with one value per image dimension as spacing.

import numpy as np

labels = [0, 1, 2]
gdth_img = np.array([[0,0,1], 
                     [0,1,2]])
pred_img = np.array([[0,0,1], 
                     [0,2,2]])
csv_file = 'metrics.csv'
spacing = [1, 2]
metrics = sg.write_metrics(labels=labels[1:],  # exclude background if needed
                  gdth_img=gdth_img,
                  pred_img=pred_img,
                  csv_file=csv_file,
                  spacing=spacing,
                  metrics=['dice', 'hd'])
# for only one metric
metrics = sg.write_metrics(labels=labels[1:],  # exclude background if needed
                  gdth_img=gdth_img,
                  pred_img=pred_img,
                  csv_file=csv_file,
                  spacing=spacing,
                  metrics='msd')
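To see why spacing matters for distance-based metrics, the sketch below scales index differences by a per-axis spacing before taking the Euclidean distance. This is a general illustration rather than the package's code. physical_distance is a hypothetical helper, and the mapping of spacing entries to array axes is an assumption here.

```python
import math


def physical_distance(index_a, index_b, spacing):
    """Euclidean distance between two pixel indices, scaled per axis by spacing."""
    if not (len(index_a) == len(index_b) == len(spacing)):
        raise ValueError("indices and spacing must have the same length")
    return math.sqrt(sum(((a - b) * s) ** 2
                         for a, b, s in zip(index_a, index_b, spacing)))


# Same pair of pixels, measured with unit spacing and with spacing [1, 2].
unit = physical_distance((0, 0), (1, 2), spacing=[1, 1])
scaled = physical_distance((0, 0), (1, 2), spacing=[1, 2])
print(unit, scaled)
```

With unit spacing the distance is sqrt(5), while with spacing [1, 2] the second axis counts double and the distance becomes sqrt(17), so the reported surface distances depend directly on the spacing passed in.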

About the calculation of surface distance

By default, the surface distance is calculated on a fully connected border. To change the connectivity type, set the argument fully_connected to False as follows.

metrics = sg.write_metrics(labels=[1,2,3],
                        gdth_img=gdth_img,
                        pred_img=pred_img,
                        csv_file=csv_file,
                        fully_connected=False)

In a 2D image, fully connected means 8 neighboring points, while face connected means 4. In a 3D image, fully connected means 26 neighboring points, while face connected means 6.
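These neighbor counts can be checked by enumerating the offsets around a pixel or voxel. The sketch below is for illustration only. neighbor_offsets is a hypothetical helper and not how the package builds its borders.

```python
from itertools import product


def neighbor_offsets(ndim, fully_connected=True):
    """Offsets to the neighbors of a pixel/voxel in an ndim-dimensional grid."""
    offsets = []
    for offset in product((-1, 0, 1), repeat=ndim):
        changed = sum(1 for step in offset if step != 0)
        if changed == 0:
            continue  # skip the center itself
        # Face-connected neighbors differ along exactly one axis.
        if fully_connected or changed == 1:
            offsets.append(offset)
    return offsets


for ndim in (2, 3):
    print(ndim,
          len(neighbor_offsets(ndim, fully_connected=True)),
          len(neighbor_offsets(ndim, fully_connected=False)))
```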

How to obtain more metrics, such as "False omission rate" or "Accuracy"?

Many other metrics, such as "False omission rate" or "Accuracy", can be derived from the confusion matrix. To calculate more metrics or design custom ones, use TPTNFPFN=True to return the number of voxels/pixels of true positive (TP), true negative (TN), false positive (FP), and false negative (FN) predictions. For example,

metrics = sg.write_metrics(
                        gdth_img=gdth_img,
                        pred_img=pred_img,
                        TPTNFPFN=True) 
tp, tn, fp, fn = metrics['TP'], metrics['TN'], metrics['FP'], metrics['FN']
false_omission_rate = fn/(fn+tn)
accuracy = (tp + tn)/(tp + tn + fp + fn)
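The same pattern extends to other confusion-matrix metrics. The sketch below assumes scalar counts for a single label and guards against empty denominators. extra_metrics is a hypothetical helper, not part of seg-metrics.

```python
def extra_metrics(tp, tn, fp, fn):
    """Additional metrics derived from scalar confusion-matrix counts."""
    def ratio(num, den):
        # Return NaN instead of raising when a denominator is zero.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "false_omission_rate": ratio(fn, fn + tn),
        "specificity": ratio(tn, tn + fp),
        "negative_predictive_value": ratio(tn, tn + fn),
    }


extra = extra_metrics(tp=8, tn=88, fp=2, fn=2)
print(extra)
```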

Comparison with medpy

medpy also provides functions to calculate metrics for medical images, but seg-metrics has several advantages.

Faster. seg-metrics is 10 times faster at calculating distance-based metrics. This Jupyter notebook can reproduce the results.
More convenient. seg-metrics calculates all metrics at once with a single function call, while medpy requires calling a different function for each metric, which costs more time and code.
More powerful. seg-metrics can calculate multi-label segmentation metrics and save the results to a .csv file in a well-organized manner, while medpy only provides binary segmentation metrics. A comparison can be found in this Jupyter notebook.

If this repository helps you in any way, show your love ❤️ by putting a ⭐ on this project. I would also appreciate it if you cite the package in your publication. (Note: This package is NOT approved for clinical use and is intended for research use only.)

Citation

If you use this software anywhere, we would appreciate it if you cite the following article:

Jia, Jingnan, Marius Staring, and Berend C. Stoel. "seg-metrics: a Python package to compute segmentation metrics." medRxiv (2024): 2024-02.

@article{jia2024seg,
  title={seg-metrics: a Python package to compute segmentation metrics},
  author={Jia, Jingnan and Staring, Marius and Stoel, Berend C},
  journal={medRxiv},
  pages={2024--02},
  year={2024},
  publisher={Cold Spring Harbor Laboratory Press}
}
