The official repository for Benchmarks for Corruption Invariant Person Re-identification (NeurIPS 2021 Track on Datasets and Benchmarks), with an exhaustive study of corruption invariant learning on single- and cross-modality ReID datasets, including Market-1501-C, CUHK03-C, MSMT17-C, SYSU-MM01-C, and RegDB-C.
When deploying person re-identification (ReID) models in safety-critical applications, it is pivotal to understand the robustness of the model against a diverse array of image corruptions. However, current evaluations of person ReID only consider performance on clean datasets and ignore images in various corrupted scenarios. In this work, we comprehensively establish 5 ReID benchmarks for learning corruption invariant representations.
The benchmark will be maintained by the authors. We will keep track of newly proposed ReID models and evaluate them under the CIL benchmark settings in a timely manner. We also welcome feedback on the CIL benchmark and contributions of new ReID models and their evaluations. Please feel free to contact us at [email protected].
TODO:
- other datasets configurations
- get started tutorial
- more detailed statistical evaluations
- checkpoints of the baseline models
- cross-modality person Re-ID dataset, CUHK-PEDES
- vehicle ReID datasets, like VehicleID, VeRi-776, etc.
(Note: codebase from TransReID)
1. Install dependencies
- python=3.7.0
- pytorch=1.6.0
- torchvision=0.7.0
- timm=0.4.9
- albumentations=0.5.2
- imagecorruptions=1.1.2
- h5py=2.10.0
- cython=0.29.24
- yacs=0.1.6
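Since the versions above are pinned exactly, it can help to compare them against the active environment before training. The following is a minimal sketch, not part of the repository: `PINNED`, `installed_version`, and `find_mismatches` are hypothetical names, and the mapping of `pytorch` to the pip distribution name `torch` is an assumption.

```python
# Pinned versions copied from the dependency list above.
# Assumption: pip distribution names are used, so "pytorch" appears as "torch".
PINNED = {
    "torch": "1.6.0",
    "torchvision": "0.7.0",
    "timm": "0.4.9",
    "albumentations": "0.5.2",
    "imagecorruptions": "1.1.2",
    "h5py": "2.10.0",
    "cython": "0.29.24",
    "yacs": "0.1.6",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        from importlib import metadata  # available from Python 3.8
    except ImportError:
        import pkg_resources  # fallback for Python 3.7
        try:
            return pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Map each package whose installed version differs from its pin to (expected, found)."""
    mismatches = {}
    for name, expected in pinned.items():
        found = lookup(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in find_mismatches(PINNED).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

The comparison is an exact string match, so a build with a local version suffix (for example a CUDA-specific wheel) would be reported as a mismatch even if it is compatible.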
2. Prepare dataset
Download the datasets Market-1501, CUHK03, and MSMT17. Set the root path of the dataset either in configs/Market/resnet_base.yml (DATASETS: ROOT_DIR: ('root')) or in scripts/train_market.sh (DATASETS.ROOT_DIR "('root')").
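Before launching training, a quick sanity check of the extracted dataset can catch path mistakes early. The sketch below only checks for the three split folders of the standard Market-1501 release; `missing_splits` is a hypothetical helper, and how this repository nests the dataset folder under `ROOT_DIR` is not covered here.

```python
import os

# Split folders shipped with the standard Market-1501 release.
MARKET1501_SPLITS = ("bounding_box_train", "query", "bounding_box_test")


def missing_splits(dataset_dir, splits=MARKET1501_SPLITS):
    """Return the split folders that do not exist under dataset_dir."""
    return [s for s in splits if not os.path.isdir(os.path.join(dataset_dir, s))]
```

An empty list from `missing_splits(path_to_market1501)` means all three split folders were found.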
3. Train
Train a CIL model on Market-1501,
sh ./scripts/train_market.sh
4. Test
Test the CIL model on Market-1501,
sh ./scripts/eval_market.sh
The main code of the corruption transform. (See the surrounding code in ./datasets/make_dataloader.py, line 59)
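import random

import numpy as np
from PIL import Image
from imagecorruptions.corruptions import *

# `rain` is the custom rain corruption defined in the same file
# (./datasets/make_dataloader.py); the others come from imagecorruptions.
corruption_function = [gaussian_noise, shot_noise, impulse_noise, defocus_blur,
    glass_blur, motion_blur, zoom_blur, snow, frost, fog, brightness, contrast,
    elastic_transform, pixelate, jpeg_compression, speckle_noise,
    gaussian_blur, spatter, saturate, rain]

class corruption_transform(object):
    def __init__(self, level=0, type='all'):
        self.level = level  # severity 1-5; any other value draws a random severity per image
        self.type = type    # 'all' draws a random corruption per image; otherwise a function name

    def __call__(self, img):
        # Use the fixed severity if it is valid, otherwise sample one from 1..5.
        if self.level > 0 and self.level < 6:
            level_idx = self.level
        else:
            level_idx = random.choice(range(1, 6))
        # Pick the corruption function, either at random or by name.
        if self.type == 'all':
            corrupt_func = random.choice(corruption_function)
        else:
            func_name_list = [f.__name__ for f in corruption_function]
            corrupt_idx = func_name_list.index(self.type)
            corrupt_func = corruption_function[corrupt_idx]
        # Corrupt a copy of the PIL image and convert the result back to PIL.
        c_img = corrupt_func(img.copy(), severity=level_idx)
        img = Image.fromarray(np.uint8(c_img))
        return img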
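The two constructor arguments decide which corruption and severity each image receives. The following dependency-free sketch mirrors only that selection logic so the rules can be checked without any image libraries; `pick_corruption` and `CORRUPTION_NAMES` are hypothetical names introduced for illustration, and no image is actually corrupted.

```python
import random

# Names taken from the corruption_function list above; only the names are used here.
CORRUPTION_NAMES = [
    "gaussian_noise", "shot_noise", "impulse_noise", "defocus_blur",
    "glass_blur", "motion_blur", "zoom_blur", "snow", "frost", "fog",
    "brightness", "contrast", "elastic_transform", "pixelate",
    "jpeg_compression", "speckle_noise", "gaussian_blur", "spatter",
    "saturate", "rain",
]


def pick_corruption(level=0, corruption_type="all", rng=random):
    """Return the (name, severity) pair the transform would use for one image."""
    # A level in 1..5 is used as-is; any other value means "sample a severity".
    severity = level if 0 < level < 6 else rng.choice(range(1, 6))
    if corruption_type == "all":
        name = rng.choice(CORRUPTION_NAMES)
    else:
        # list.index raises ValueError for an unknown name, as in the class above.
        name = CORRUPTION_NAMES[CORRUPTION_NAMES.index(corruption_type)]
    return name, severity


print(pick_corruption(3, "fog"))  # fixed corruption and severity
print(pick_corruption(0, "all"))  # both drawn at random
```

With the defaults (`level=0`, `type='all'`), every image gets an independently sampled corruption and severity, which is the setting used by `corruption_transform(0)`.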
Corruption robustness can be evaluated on the fly by modifying the transform function used in the test dataloader. (See details in ./datasets/make_dataloader.py, Line 266)
val_with_corruption_transforms = T.Compose([
    corruption_transform(0),
    T.Resize(cfg.INPUT.SIZE_TEST),
    T.ToTensor(),
])
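Because `corruption_transform(0)` samples a corruption and a severity for every image, a corrupted evaluation is randomized and its metrics vary from run to run. One way to summarize this, shown here as a sketch under assumptions rather than the repository's procedure, is to repeat the evaluation and report the mean and standard deviation. `summarize_runs` is a hypothetical helper, `run_eval` stands for whatever function runs one full corrupted evaluation, and `fake_eval` returns synthetic numbers only.

```python
import random
import statistics


def summarize_runs(run_eval, num_runs=10, base_seed=0):
    """Call run_eval(seed) num_runs times and return {metric: (mean, std)}."""
    results = [run_eval(base_seed + i) for i in range(num_runs)]
    summary = {}
    for metric in results[0]:
        values = [r[metric] for r in results]
        std = statistics.stdev(values) if len(values) > 1 else 0.0
        summary[metric] = (statistics.mean(values), std)
    return summary


def fake_eval(seed):
    """Hypothetical stand-in for one corrupted evaluation; values are synthetic."""
    rng = random.Random(seed)
    return {"mAP": 50.0 + rng.uniform(-1, 1), "Rank-1": 50.0 + rng.uniform(-1, 1)}


for metric, (mean, std) in summarize_runs(fake_eval, num_runs=5).items():
    print(f"{metric}: {mean:.2f} ({std:.2f})")
```

In a real setup, `run_eval` would seed the random number generators, build the test dataloader with `val_with_corruption_transforms`, and return the metrics for that run.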
We introduce a rain corruption type. Rain is a common weather condition, but it is missing from the original corruption benchmark. (See details in ./datasets/make_dataloader.py, Line 27)
import albumentations as abm  # assumed to be the module behind the `abm` alias
import numpy as np

def rain(image, severity=1):
    # Map severity 1-5 to one of the rain types supported by RandomRain.
    if severity == 1:
        rain_type = 'drizzle'
    elif severity == 2 or severity == 3:
        rain_type = 'heavy'
    elif severity == 4 or severity == 5:
        rain_type = 'torrential'
    # Higher severity means more blur and a darker image.
    blur_value = 2 + severity
    bright_value = -(0.05 + 0.05 * severity)
    rain_aug = abm.Compose([
        abm.augmentations.transforms.RandomRain(rain_type=rain_type,
            blur_value=blur_value, brightness_coefficient=1, always_apply=True),
        abm.augmentations.transforms.RandomBrightness(limit=[bright_value,
            bright_value], always_apply=True)])
    # Upscale very small PIL images to a height of 65 pixels before adding rain.
    width, height = image.size
    if height <= 60:
        scale_factor = 65.0 / height
        new_size = (int(width * scale_factor), 65)
        image = image.resize(new_size)
    return rain_aug(image=np.array(image))['image']
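The severity handling in `rain` can be read as a small lookup plus an optional upscaling step. The sketch below restates both as pure functions so the resulting settings can be inspected without albumentations; `rain_settings` and `upscaled_size` are hypothetical names, and the explicit `ValueError` for out-of-range severities is an addition that the original function does not have.

```python
def rain_settings(severity):
    """Return (rain_type, blur_value, brightness_shift) for a severity from 1 to 5."""
    if severity == 1:
        rain_type = "drizzle"
    elif severity in (2, 3):
        rain_type = "heavy"
    elif severity in (4, 5):
        rain_type = "torrential"
    else:
        # Added for illustration; not part of the original rain function.
        raise ValueError("severity must be an integer from 1 to 5")
    blur_value = 2 + severity
    brightness_shift = -(0.05 + 0.05 * severity)
    return rain_type, blur_value, brightness_shift


def upscaled_size(width, height, min_height=60, target_height=65):
    """Return the image size after the small-image upscaling step."""
    if height <= min_height:
        scale = float(target_height) / height
        return int(width * scale), target_height
    return width, height


for severity in range(1, 6):
    print(severity, rain_settings(severity))
```

The upscaling only applies to images at most 60 pixels high, so typical ReID crops (for example 64 x 128) pass through unchanged.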
- Single-modality datasets
Dataset | Method | Clean mINP | Clean mAP | Clean Rank-1 | Corruption mINP | Corruption mAP | Corruption Rank-1 |
---|---|---|---|---|---|---|---|
Market-1501 | BoT | 59.30 | 85.06 | 93.38 | 0.20 | 8.42 | 27.05 |
Market-1501 | AGW | 64.03 | 86.51 | 94.00 | 0.35 | 12.13 | 31.90 |
Market-1501 | SBS | 60.03 | 88.33 | 95.90 | 0.29 | 11.54 | 34.13 |
Market-1501 | CIL (ours) | 57.90 | 84.04 | 93.38 | 1.76 (0.13) | 28.03 (0.45) | 55.57 (0.63) |
MSMT17 | BoT | 9.91 | 48.34 | 73.53 | 0.07 | 5.28 | 20.20 |
MSMT17 | AGW | 12.38 | 51.84 | 75.21 | 0.08 | 6.53 | 22.77 |
MSMT17 | SBS | 10.26 | 56.62 | 82.02 | 0.05 | 7.89 | 28.77 |
MSMT17 | CIL (ours) | 12.45 | 52.40 | 76.10 | 0.32 (0.03) | 15.33 (0.20) | 39.79 (0.45) |
CUHK03 | AGW | 49.97 | 62.25 | 64.64 | 0.46 | 3.45 | 5.90 |
CUHK03 | CIL (ours) | 53.87 | 65.16 | 67.29 | 4.25 (0.39) | 16.33 (0.76) | 22.96 (1.04) |
- Cross-modality datasets
Note: For the RegDB dataset, Mode A and Mode B represent the visible-to-thermal and thermal-to-visible experimental settings, respectively. For the SYSU-MM01 dataset, Mode A and Mode B represent all-search and indoor-search, respectively. Only RGB (visible) images are corrupted in the corruption evaluation.
Dataset | Method | Mode A Clean mINP | Mode A Clean mAP | Mode A Clean R-1 | Mode A Corruption mINP | Mode A Corruption mAP | Mode A Corruption R-1 | Mode B Clean mINP | Mode B Clean mAP | Mode B Clean R-1 | Mode B Corruption mINP | Mode B Corruption mAP | Mode B Corruption R-1 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
SYSU-MM01 | AGW | 36.17 | 47.65 | 47.50 | 14.73 | 29.99 | 34.42 | 59.74 | 62.97 | 54.17 | 35.39 | 40.98 | 33.80 |
SYSU-MM01 | CIL (ours) | 38.15 | 47.64 | 45.51 | 22.48 (1.65) | 35.92 (1.22) | 36.95 (0.67) | 57.41 | 60.45 | 50.98 | 43.11 (4.19) | 48.65 (4.57) | 40.73 (5.55) |
RegDB | AGW | 54.10 | 68.82 | 75.78 | 32.88 | 43.09 | 45.44 | 52.40 | 68.15 | 75.29 | 6.00 | 41.37 | 67.54 |
RegDB | CIL (ours) | 55.68 | 69.75 | 74.96 | 38.66 (0.01) | 49.76 (0.03) | 52.25 (0.03) | 55.50 | 69.21 | 74.95 | 11.94 (0.12) | 47.90 (0.01) | 67.17 (0.06) |
(Note: ranked by mAP on corrupted test set)
Method | Reference | Clean mINP | Clean mAP | Clean Rank-1 | Corruption mINP | Corruption mAP | Corruption Rank-1 |
---|---|---|---|---|---|---|---|
TransReID | Shuting He et al. (2021) | 69.29 | 88.93 | 95.07 | 1.98 | 27.38 | 53.19 |
CaceNet | Fufu Yu et al. (2020) | 70.47 | 89.82 | 95.40 | 0.67 | 18.24 | 42.92 |
LightMBN | Fabian Herzog et al. (2021) | 73.29 | 91.54 | 96.53 | 0.50 | 14.84 | 38.68 |
PLR-OS | Ben Xie et al. (2020) | 66.42 | 88.93 | 95.19 | 0.48 | 14.23 | 37.56 |
RRID | Hyunjong Park et al. (2019) | 67.14 | 88.43 | 95.19 | 0.46 | 13.45 | 36.57 |
Pyramid | Feng Zheng et al. (2018) | 61.61 | 87.50 | 94.86 | 0.36 | 12.75 | 35.72 |
PCB | Yifan Sun et al. (2017) | 41.97 | 82.19 | 94.15 | 0.41 | 12.72 | 34.93 |
BDB | Zuozhuo Dai et al. (2018) | 61.78 | 85.47 | 94.63 | 0.32 | 10.95 | 33.79 |
Aligned++ | Hao Luo et al. (2019) | 47.31 | 79.10 | 91.83 | 0.32 | 10.95 | 31.00 |
AGW | Mang Ye et al. (2020) | 65.40 | 88.10 | 95.00 | 0.30 | 10.80 | 33.40 |
MHN | Binghui Chen et al. (2019) | 55.27 | 85.33 | 94.50 | 0.38 | 10.69 | 33.29 |
LUPerson | Dengpan Fu et al. (2020) | 68.71 | 90.32 | 96.32 | 0.29 | 10.37 | 32.22 |
OS-Net | Kaiyang Zhou et al. (2019) | 56.78 | 85.67 | 94.69 | 0.23 | 10.37 | 30.94 |
VPM | Yifan Sun et al. (2019) | 50.09 | 81.43 | 93.79 | 0.31 | 10.15 | 31.17 |
DG-Net | Zhedong Zheng et al. (2019) | 61.60 | 86.09 | 94.77 | 0.35 | 9.96 | 31.75 |
ABD-Net | Tianlong Chen et al. (2019) | 64.72 | 87.94 | 94.98 | 0.26 | 9.81 | 29.65 |
MGN | Guanshuo Wang et al. (2018) | 60.86 | 86.51 | 93.88 | 0.29 | 9.72 | 29.56 |
F-LGPR | Yunpeng Gong et al. (2021) | 65.48 | 88.22 | 95.37 | 0.23 | 9.08 | 29.35 |
TDB | Rodolfo Quispe et al. (2020) | 56.41 | 85.77 | 94.30 | 0.20 | 8.90 | 28.56 |
LGPR | Yunpeng Gong et al. (2021) | 58.71 | 86.09 | 94.51 | 0.24 | 8.26 | 27.72 |
BoT | Hao Luo et al. (2019) | 51.00 | 83.90 | 94.30 | 0.10 | 6.60 | 26.20 |
(Note: ranked by mAP on corrupted test set)
Method | Reference | Clean mINP | Clean mAP | Clean Rank-1 | Corruption mINP | Corruption mAP | Corruption Rank-1 |
---|---|---|---|---|---|---|---|
CaceNet | Fufu Yu et al. (2020) | 65.22 | 75.13 | 77.64 | 2.09 | 10.62 | 17.04 |
Pyramid | Feng Zheng et al. (2018) | 61.41 | 73.14 | 79.54 | 1.10 | 8.03 | 10.42 |
RRID | Hyunjong Park et al. (2019) | 55.81 | 67.63 | 74.99 | 1.00 | 7.30 | 9.66 |
PLR-OS | Ben Xie et al. (2020) | 62.72 | 74.67 | 78.14 | 0.89 | 6.49 | 10.99 |
Aligned++ | Hao Luo et al. (2019) | 47.32 | 59.76 | 62.07 | 0.56 | 4.87 | 7.99 |
MGN | Guanshuo Wang et al. (2018) | 51.18 | 62.73 | 69.14 | 0.46 | 4.20 | 5.44 |
MHN | Binghui Chen et al. (2019) | 56.52 | 66.77 | 72.21 | 0.46 | 3.97 | 8.27 |
(Note: ranked by mAP on corrupted test set)
Method | Reference | Clean mINP | Clean mAP | Clean Rank-1 | Corruption mINP | Corruption mAP | Corruption Rank-1 |
---|---|---|---|---|---|---|---|
OS-Net | Kaiyang Zhou et al. (2019) | 4.05 | 40.05 | 71.86 | 0.08 | 7.86 | 28.51 |
AGW | Mang Ye et al. (2020) | 12.38 | 51.84 | 75.21 | 0.08 | 6.53 | 22.77 |
BoT | Hao Luo et al. (2019) | 9.91 | 48.34 | 73.53 | 0.07 | 5.28 | 20.20 |
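All result tables report mINP, mAP, and Rank-1. As a reading aid, the sketch below computes the three quantities from ranked match lists using their standard definitions: average precision, inverse negative penalty (number of correct matches divided by the rank of the hardest one), and whether the top-ranked gallery item is correct. It is a simplified illustration, not the repository's evaluation code: `query_metrics` and `mean_metrics` are hypothetical names, and the usual filtering of same-camera gallery images is omitted.

```python
def query_metrics(matches):
    """Compute (INP, AP, Rank-1) for one query.

    matches: ranked gallery list, True where the item shares the query identity.
    """
    # 1-based rank positions of the correct matches.
    positions = [i + 1 for i, hit in enumerate(matches) if hit]
    if not positions:
        raise ValueError("query has no correct match in the gallery")
    num_correct = len(positions)
    # AP: mean of the precision measured at each correct match.
    ap = sum((k + 1) / pos for k, pos in enumerate(positions)) / num_correct
    # INP: number of correct matches divided by the rank of the last one.
    inp = num_correct / positions[-1]
    rank1 = 1.0 if matches[0] else 0.0
    return inp, ap, rank1


def mean_metrics(all_matches):
    """Average per-query values into (mINP, mAP, Rank-1), as percentages."""
    per_query = [query_metrics(m) for m in all_matches]
    n = len(per_query)
    return tuple(100.0 * sum(q[i] for q in per_query) / n for i in range(3))


print(mean_metrics([[True, False, True, False], [False, True]]))
```

mINP is dominated by the hardest correct match of each query, which is why it falls much further than Rank-1 once a few gallery images of an identity become difficult to retrieve.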
Kindly include a reference to this paper in your publications if it helps your research:
@misc{chen2021benchmarks,
title={Benchmarks for Corruption Invariant Person Re-identification},
author={Minghui Chen and Zhiqiang Wang and Feng Zheng},
year={2021},
eprint={2111.00880},
archivePrefix={arXiv},
primaryClass={cs.CV}
}