Negative samples for detection #1197
Replies: 12 comments 26 replies
-
Hi @themantalope, judging from the code, IMO it works when there is no bounding box. If @Can-Zhao could help double-confirm it, that would be great! Thanks in advance.
-
Thank you for reaching out.
-
Thank you for your response. I have run into some problems trying to use negative samples. Could you provide an example of how the target boxes and class data should be formatted for a negative sample? I can show what I did, which resulted in data loader and loss function errors, when I'm at my other workstation.
Thanks!
…On Mon, Jan 30, 2023 at 8:58 AM Can Zhao ***@***.***> wrote:
Thank you!
-
Will do. I can probably send some examples in another couple of hours.
Thanks!
…On Mon, Jan 30, 2023 at 9:35 AM Can Zhao ***@***.***> wrote:
Thank you! I will test it with purely negative samples and let you know in 48 hours. It would also be helpful if you could share the settings and errors from your side.
-
Hi @themantalope, the issue is not with the RetinaNet detector; it happens in the data loader. The data loader expects a ground-truth (GT) box tensor of shape (N, 4) or (N, 6), but negative samples have a GT box of shape (0,). I will work on the data loader, but it will take some time. For now, the workaround for your case is to regenerate the training data JSON file with all the negative samples removed.
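For anyone following along, here is a minimal sketch of that workaround. It assumes the datalist follows the detection tutorials' layout, with a "training" list and "box"/"label" keys; the file names are hypothetical:

```python
import json

# Keep only entries that have at least one ground-truth box and write
# the filtered datalist to a new file.
with open("dataset.json") as f:  # hypothetical input file
    datalist = json.load(f)

datalist["training"] = [
    d for d in datalist["training"] if len(d.get("box", [])) > 0
]

with open("dataset_positive_only.json", "w") as f:
    json.dump(datalist, f, indent=2)
```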
-
Hi @Can-Zhao Sorry for the late response, the day job got in the way :) I think the RetinaNet detector loss function is more the issue. Here is a somewhat contrived example:

```python
import torch
from monai.apps.detection.networks.retinanet_detector import RetinaNetDetector
from monai.apps.detection.utils.anchor_utils import AnchorGeneratorWithAnchorShape

# tra_d and env_d are training/environment config dicts loaded elsewhere.

# 2) build network
anchor_generator = AnchorGeneratorWithAnchorShape(
feature_map_scales=[2**l for l in range(len(tra_d['returned_layers']) + 1)],
base_anchor_shapes=tra_d['base_anchor_shapes'],
)
net = torch.jit.load(env_d["model_path"]).to(device)
# 3) build detector
detector = RetinaNetDetector(network=net, anchor_generator=anchor_generator, debug=False)
# set inference components
detector.set_box_selector_parameters(
score_thresh=tra_d['score_thresh'],
topk_candidates_per_level=1000,
nms_thresh=tra_d['nms_thresh'],
detections_per_img=100,
)
detector.set_sliding_window_inferer(
# roi_size=tra_d['val_patch_size'],
roi_size=[128,128,128],
overlap=0.75,
sw_batch_size=32,
mode="gaussian",
device="cpu",
)
detector.to(device)
print(f"Loaded model from {env_d['model_path']}")
detector.set_atss_matcher(num_candidates=4, center_in_gt=False)
x = torch.rand(size=(1,1,128,128,128))
c = torch.tensor([[]]) # class, empty => nothing in image
b = torch.tensor([[]]) # boxes, empty => no targets in image
b = b.float()
print(c.dtype, b.dtype) # should be float 32
y = [{'labels':c, 'boxes':b}]
detector(x, y)
```

First, the issue is that you get a ValueError from the check that the boxes are of the correct shape:

```
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In [38], line 1
----> 1 detector(x, [{'labels':c, 'boxes':b}])
File /opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py:1185, in Module._call_impl(self, *input, **kwargs)
1181 # If we don't have any hooks, we want to skip the rest of the logic in
1182 # this function, and just call forward.
1183 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1184 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1185 return forward_call(*input, **kwargs)
1186 # Do not call functions when jit is used
1187 full_backward_hooks, non_full_backward_hooks = [], []
File /opt/monai/monai/apps/detection/networks/retinanet_detector.py:479, in RetinaNetDetector.forward(self, input_images, targets, use_inferer)
477 # 1. Check if input arguments are valid
478 if self.training:
--> 479 check_training_targets(input_images, targets, self.spatial_dims, self.target_label_key, self.target_box_key)
480 self._check_detector_training_components()
482 # 2. Pad list of images to a single Tensor `images` with spatial size divisible by self.size_divisible.
483 # image_sizes stores the original spatial_size of each image before padding.
File /opt/monai/monai/apps/detection/utils/detector_utils.py:85, in check_training_targets(input_images, targets, spatial_dims, target_label_key, target_box_key)
83 raise ValueError(f"Expected target boxes to be of type Tensor, got {type(boxes)}.")
84 if len(boxes.shape) != 2 or boxes.shape[-1] != 2 * spatial_dims:
---> 85 raise ValueError(
86 f"Expected target boxes to be a tensor " f"of shape [N, {2* spatial_dims}], got {boxes.shape}."
87 )
88 return
ValueError: Expected target boxes to be a tensor of shape [N, 6], got torch.Size([1, 0]).
```

Ok, let's say we just put something silly in to get rid of the ValueError:

```python
c = torch.tensor([[]])
b = torch.tensor([[0,0,0,0,0,0]])
b = b.float()
detector(x, [{'labels':c, 'boxes':b}])
```

Now we get index errors because of the shape of `c`:

```
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
Cell In [43], line 1
----> 1 detector(x, [{'labels':c, 'boxes':b}])
File /opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py:1185, in Module._call_impl(self, *input, **kwargs)
1181 # If we don't have any hooks, we want to skip the rest of the logic in
1182 # this function, and just call forward.
1183 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1184 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1185 return forward_call(*input, **kwargs)
1186 # Do not call functions when jit is used
1187 full_backward_hooks, non_full_backward_hooks = [], []
File /opt/monai/monai/apps/detection/networks/retinanet_detector.py:514, in RetinaNetDetector.forward(self, input_images, targets, use_inferer)
512 # 6(1). If during training, return losses
513 if self.training:
--> 514 losses = self.compute_loss(head_outputs, targets, self.anchors, num_anchor_locs_per_level) # type: ignore
515 return losses
517 # 6(2). If during inference, return detection results
File /opt/monai/monai/apps/detection/networks/retinanet_detector.py:692, in RetinaNetDetector.compute_loss(self, head_outputs_reshape, targets, anchors, num_anchor_locs_per_level)
675 """
676 Compute losses.
677
(...)
689 a dict of several kinds of losses.
690 """
691 matched_idxs = self.compute_anchor_matched_idxs(anchors, targets, num_anchor_locs_per_level)
--> 692 losses_cls = self.compute_cls_loss(head_outputs_reshape[self.cls_key], targets, matched_idxs)
693 losses_box_regression = self.compute_box_loss(
694 head_outputs_reshape[self.box_reg_key], targets, anchors, matched_idxs
695 )
696 return {self.cls_key: losses_cls, self.box_reg_key: losses_box_regression}
File /opt/monai/monai/apps/detection/networks/retinanet_detector.py:789, in RetinaNetDetector.compute_cls_loss(self, cls_logits, targets, matched_idxs)
786 total_gt_classes_target_list = []
787 for targets_per_image, cls_logits_per_image, matched_idxs_per_image in zip(targets, cls_logits, matched_idxs):
788 # for each image, get training samples
--> 789 sampled_cls_logits_per_image, sampled_gt_classes_target = self.get_cls_train_sample_per_image(
790 cls_logits_per_image, targets_per_image, matched_idxs_per_image
791 )
792 total_cls_logits_list.append(sampled_cls_logits_per_image)
793 total_gt_classes_target_list.append(sampled_gt_classes_target)
File /opt/monai/monai/apps/detection/networks/retinanet_detector.py:885, in RetinaNetDetector.get_cls_train_sample_per_image(self, cls_logits_per_image, targets_per_image, matched_idxs_per_image)
883 # create the target classification with one-hot encoding
884 gt_classes_target = torch.zeros_like(cls_logits_per_image) # (sum(HW(D)A), self.num_classes)
--> 885 gt_classes_target[
886 foreground_idxs_per_image, # fg anchor idx in
887 targets_per_image[self.target_label_key][
888 matched_idxs_per_image[foreground_idxs_per_image]
889 ], # fg class label
890 ] = 1.0
892 if self.fg_bg_sampler is None:
893 # if no balanced sampling
894 valid_idxs_per_image = matched_idxs_per_image != self.proposal_matcher.BETWEEN_THRESHOLDS
IndexError: tensors used as indices must be long, byte or bool tensors
```

Ok, what if we put in some dummy value for `c`:

```python
c = torch.tensor([[-1]])
b = torch.tensor([[0,0,0,0,0,0]])
b = b.float()
detector(x, [{'labels':c, 'boxes':b}])
```

Now we get box errors because it's all zeros:

```
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In [48], line 1
----> 1 detector(x, [{'labels':c, 'boxes':b}])
File /opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py:1185, in Module._call_impl(self, *input, **kwargs)
1181 # If we don't have any hooks, we want to skip the rest of the logic in
1182 # this function, and just call forward.
1183 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1184 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1185 return forward_call(*input, **kwargs)
1186 # Do not call functions when jit is used
1187 full_backward_hooks, non_full_backward_hooks = [], []
File /opt/monai/monai/apps/detection/networks/retinanet_detector.py:514, in RetinaNetDetector.forward(self, input_images, targets, use_inferer)
512 # 6(1). If during training, return losses
513 if self.training:
--> 514 losses = self.compute_loss(head_outputs, targets, self.anchors, num_anchor_locs_per_level) # type: ignore
515 return losses
517 # 6(2). If during inference, return detection results
File /opt/monai/monai/apps/detection/networks/retinanet_detector.py:693, in RetinaNetDetector.compute_loss(self, head_outputs_reshape, targets, anchors, num_anchor_locs_per_level)
691 matched_idxs = self.compute_anchor_matched_idxs(anchors, targets, num_anchor_locs_per_level)
692 losses_cls = self.compute_cls_loss(head_outputs_reshape[self.cls_key], targets, matched_idxs)
--> 693 losses_box_regression = self.compute_box_loss(
694 head_outputs_reshape[self.box_reg_key], targets, anchors, matched_idxs
695 )
696 return {self.cls_key: losses_cls, self.box_reg_key: losses_box_regression}
File /opt/monai/monai/apps/detection/networks/retinanet_detector.py:829, in RetinaNetDetector.compute_box_loss(self, box_regression, targets, anchors, matched_idxs)
823 total_target_regression_list = []
825 for targets_per_image, box_regression_per_image, anchors_per_image, matched_idxs_per_image in zip(
826 targets, box_regression, anchors, matched_idxs
827 ):
828 # for each image, get training samples
--> 829 decode_box_regression_per_image, matched_gt_boxes_per_image = self.get_box_train_sample_per_image(
830 box_regression_per_image, targets_per_image, anchors_per_image, matched_idxs_per_image
831 )
832 total_box_regression_list.append(decode_box_regression_per_image)
833 total_target_regression_list.append(matched_gt_boxes_per_image)
File /opt/monai/monai/apps/detection/networks/retinanet_detector.py:970, in RetinaNetDetector.get_box_train_sample_per_image(self, box_regression_per_image, targets_per_image, anchors_per_image, matched_idxs_per_image)
968 box_regression_per_image_ = box_regression_per_image
969 if self.encode_gt:
--> 970 matched_gt_boxes_per_image_ = self.box_coder.encode_single(matched_gt_boxes_per_image_, anchors_per_image)
971 if self.decode_pred:
972 box_regression_per_image_ = self.box_coder.decode_single(box_regression_per_image_, anchors_per_image)
File /opt/monai/monai/apps/detection/utils/box_coder.py:166, in BoxCoder.encode_single(self, gt_boxes, proposals)
164 device = gt_boxes.device
165 weights = torch.as_tensor(self.weights, dtype=dtype, device=device)
--> 166 targets = encode_boxes(gt_boxes, proposals, weights)
167 return targets
File /opt/monai/monai/apps/detection/utils/box_coder.py:99, in encode_boxes(gt_boxes, proposals, weights)
97 # torch.log may cause NaN or Inf
98 if torch.isnan(targets).any() or torch.isinf(targets).any():
---> 99 raise ValueError("targets is NaN or Inf.")
100 return targets
ValueError: targets is NaN or Inf.
```

Ok, let's make some dummy boxes that are ok:

```python
c = torch.tensor([[-1]])
b = torch.tensor([[0,0,0,1,1,1]])
b = b.float()
detector(x, [{'labels':c, 'boxes':b}])
```

This time the call returns:

```
{'classification': tensor(7.8400e-05, grad_fn=<BinaryCrossEntropyWithLogitsBackward0>),
'box_regression': tensor(0.5932, grad_fn=<SmoothL1LossBackward0>)}
```

We get something back. Is this the correct approach? Please let me know if this is what I should be doing for negative samples. For context, I'm doing a simple experiment with one class (positive labels are …). In the next week I'll dig into the loss function to see whether this is a valid approach or whether it should be modified to handle this use case. Please let me know what your thoughts are.
-
Thank you!
I have two draft PRs submitted, one on the tutorials repo and one on MONAI core: #1256 and Project-MONAI/MONAI#6170.
They work on my machine, but I have not added unit tests yet. I will try to finalize them before the end of next week.
Best,
Can
…On Monday, March 20, 2023, themantalope ***@***.***> wrote:
@Can-Zhao
I will have time tomorrow to work on the implementation. If there is a way to DM me, or if you want to message me here about your plan for implementation, let me know. If not the whole thing, I can at least work on parts of it.
-
OK, so the boxes are no longer the issue. There is still the problem that the loss function cannot handle an empty class tensor properly. See the example below:

```python
import torch
from monai.data.box_utils import standardize_empty_box

x = torch.rand(size=(1,1,128,128,128))
c = torch.tensor([])
b = torch.tensor([])
b = b.float()
b = standardize_empty_box(b, 3)
print(b.shape)
# prints torch.Size([0, 6])
detector(x, [{'labels':c, 'boxes':b}])
```

This raises:

```
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
Cell In [43], line 1
----> 1 detector(x, [{'labels':c, 'boxes':b}])
File /opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py:1185, in Module._call_impl(self, *input, **kwargs)
1181 # If we don't have any hooks, we want to skip the rest of the logic in
1182 # this function, and just call forward.
1183 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1184 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1185 return forward_call(*input, **kwargs)
1186 # Do not call functions when jit is used
1187 full_backward_hooks, non_full_backward_hooks = [], []
File /opt/monai/monai/apps/detection/networks/retinanet_detector.py:514, in RetinaNetDetector.forward(self, input_images, targets, use_inferer)
512 # 6(1). If during training, return losses
513 if self.training:
--> 514 losses = self.compute_loss(head_outputs, targets, self.anchors, num_anchor_locs_per_level) # type: ignore
515 return losses
517 # 6(2). If during inference, return detection results
File /opt/monai/monai/apps/detection/networks/retinanet_detector.py:692, in RetinaNetDetector.compute_loss(self, head_outputs_reshape, targets, anchors, num_anchor_locs_per_level)
675 """
676 Compute losses.
677
(...)
689 a dict of several kinds of losses.
690 """
691 matched_idxs = self.compute_anchor_matched_idxs(anchors, targets, num_anchor_locs_per_level)
--> 692 losses_cls = self.compute_cls_loss(head_outputs_reshape[self.cls_key], targets, matched_idxs)
693 losses_box_regression = self.compute_box_loss(
694 head_outputs_reshape[self.box_reg_key], targets, anchors, matched_idxs
695 )
696 return {self.cls_key: losses_cls, self.box_reg_key: losses_box_regression}
File /opt/monai/monai/apps/detection/networks/retinanet_detector.py:789, in RetinaNetDetector.compute_cls_loss(self, cls_logits, targets, matched_idxs)
786 total_gt_classes_target_list = []
787 for targets_per_image, cls_logits_per_image, matched_idxs_per_image in zip(targets, cls_logits, matched_idxs):
788 # for each image, get training samples
--> 789 sampled_cls_logits_per_image, sampled_gt_classes_target = self.get_cls_train_sample_per_image(
790 cls_logits_per_image, targets_per_image, matched_idxs_per_image
791 )
792 total_cls_logits_list.append(sampled_cls_logits_per_image)
793 total_gt_classes_target_list.append(sampled_gt_classes_target)
File /opt/monai/monai/apps/detection/networks/retinanet_detector.py:885, in RetinaNetDetector.get_cls_train_sample_per_image(self, cls_logits_per_image, targets_per_image, matched_idxs_per_image)
883 # create the target classification with one-hot encoding
884 gt_classes_target = torch.zeros_like(cls_logits_per_image) # (sum(HW(D)A), self.num_classes)
--> 885 gt_classes_target[
886 foreground_idxs_per_image, # fg anchor idx in
887 targets_per_image[self.target_label_key][
888 matched_idxs_per_image[foreground_idxs_per_image]
889 ], # fg class label
890 ] = 1.0
892 if self.fg_bg_sampler is None:
893 # if no balanced sampling
894 valid_idxs_per_image = matched_idxs_per_image != self.proposal_matcher.BETWEEN_THRESHOLDS
IndexError: tensors used as indices must be long, byte or bool tensors
```
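For reference, a minimal sketch of empty targets that avoid this particular IndexError. The key assumption (mine, based on the error message rather than anything confirmed above) is that the labels tensor is used as an index and therefore must have an integer dtype:

```python
import torch

# Empty labels as a long tensor of shape (0,), and empty boxes with the
# expected (0, 6) shape for 3D.
c = torch.tensor([], dtype=torch.long)        # labels, (0,)
b = torch.zeros((0, 6), dtype=torch.float32)  # boxes, (0, 6)
targets = [{"labels": c, "boxes": b}]
```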
-
Haha! Thank you! It was mostly simple formatting, which I should have been able to figure out…
Hopefully someone else will find this instructive.
…On Tue, Mar 21, 2023 at 6:00 PM Can Zhao ***@***.***> wrote:
Thank you for the help! This is an important issue and glad that it is solved.
-
The issue is resolved by PR #1256.
-
I have this in my training code:

```python
detector.set_hard_negative_sampler(
batch_size_per_image=64,
positive_fraction=args.balanced_sampler_pos_fraction,
pool_size=20,
min_neg=16,
)
```
where `balanced_sampler_pos_fraction` is 0.3 in the config.json. Both hard-negative sampling and a balanced positive fraction appear to be used in the code, so I'm not sure which one I'm actually using. Is there an easy way to tell?
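As a hedged note, based on my reading of the RetinaNetDetector API (worth verifying against your MONAI version): the sampler is whichever one the setter call configured, so the snippet above should be using the hard-negative sampler. A sketch of the two options:

```python
# Given a RetinaNetDetector instance `detector`, constructed as earlier
# in this thread.

# Option 1: hard negative sampling (what the snippet above configures).
detector.set_hard_negative_sampler(
    batch_size_per_image=64,
    positive_fraction=0.3,
    pool_size=20,
    min_neg=16,
)

# Option 2: plain balanced positive/negative sampling instead.
detector.set_balanced_sampler(
    batch_size_per_image=64,
    positive_fraction=0.3,
)
```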
Sorry for deleting the comment; I saw that after epoch 1 it does not throw that warning anymore, so it's not a big deal in that case. It does throw the warning for every training sample in epoch 1, though.
Thanks,
Bobby
…On Tue, Aug 1, 2023 at 6:07 PM Can Zhao ***@***.***> wrote:
Could I ask if you used BalancedPositiveNegativeSampler or HardNegativeSampler?
Thanks,
Can
On Tuesday, August 1, 2023, AceMcAwesome77 ***@***.***> wrote:
The RetinaNet3D framework does now seem to be successfully training on image volumes with no boxes, which is great. However, I do get this warning every time it runs through a NIfTI file with no boxes:
"Num foregrounds 0, Num backgrounds 25898160, unable to generate class balanced samples, setting pos_ratio to 0."
This is not unexpected behavior, since those volumes don't have any foreground boxes, but is there any way to suppress this warning? 1,000 out of my 1,500 training samples are negative, so the warning shows up 1,000 times each epoch. I could comment it out in the source code, but I'm wondering if there's a more formal way to make it not say that. Thanks!
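One possible way to silence it without editing the source, assuming the message is emitted through Python's `warnings` module (if it comes from `logging` instead, attach a filter to the corresponding logger), is a message-pattern filter:

```python
import warnings

# Suppress only the class-balance warning, matched by its message text,
# leaving all other warnings intact.
warnings.filterwarnings(
    "ignore",
    message=".*unable to generate class balanced samples.*",
)
```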
-
Hi @Can-Zhao, I'm currently working on a detection task and encountered an issue related to negative samples. When I include classification labels for negative samples in the JSON file, it results in an error; the error doesn't occur if I remove the classification labels for the negative samples.
For negative samples there won't be any box labels, but they should still have classification labels. Could you help explain how to properly set up the JSON file for negative brain samples?
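Not an authoritative answer, but given how the thread resolved empty ground truth (PR #1256), a negative sample plausibly keeps both keys present and empty, with "box" and "label" lists of matching length. A sketch of two datalist entries (key names follow the detection tutorials; the paths are hypothetical):

```python
# One positive and one negative entry for the detection datalist.
positive_entry = {
    "image": "case_0001.nii.gz",
    "box": [[10.0, 10.0, 10.0, 20.0, 20.0, 20.0]],  # one 3D box
    "label": [0],                                   # its class label
}
negative_entry = {
    "image": "case_0002.nii.gz",
    "box": [],    # no boxes
    "label": [],  # correspondingly, no class labels
}
```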
-
With the RetinaNet model and the current implementation and loss function, is it possible to train with images that contain no objects (bounding boxes)? Or does the loss function expect that each image will have at least one box?
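Per the resolution above (PR #1256 together with Project-MONAI/MONAI#6170), empty targets are supported. A minimal sketch of what such a target looks like, assuming a MONAI version that includes those PRs:

```python
import torch
from monai.data.box_utils import standardize_empty_box

# An image with no objects: a (0, 6) float box tensor and a (0,) long
# label tensor.
boxes = standardize_empty_box(torch.tensor([]).float(), 3)
labels = torch.tensor([], dtype=torch.long)
targets = [{"boxes": boxes, "labels": labels}]
```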