PaddleRS Model Construction Parameters

This document describes the construction parameters of each PaddleRS model trainer, including their names, types, descriptions, and default values.

BIT

The BIT implementation based on PaddlePaddle.

The original article refers to H. Chen, et al., "Remote Sensing Image Change Detection With Transformers" (https://arxiv.org/abs/2103.00208).

This implementation adopts pretrained encoders, as opposed to the original work where weights are randomly initialized.

Parameter Name Description Default Value
in_channels (int) Number of channels of the input image 3
num_classes (int) Number of target classes 2
use_mixed_loss (bool) Whether to use mixed loss function False
losses (list) List of loss functions None
att_type (str) Spatial attention type. Possible values are 'CBAM' and 'BAM' 'CBAM'
ds_factor (int) Downsampling factor 1
backbone (str) ResNet architecture to use as backbone. Currently only 'resnet18' and 'resnet34' are supported 'resnet18'
n_stages (int) Number of ResNet stages used in the backbone, should be a value in {3, 4, 5} 4
use_tokenizer (bool) Whether to use tokenizer True
token_len (int) Length of input token 4
pool_mode (str) Pooling strategy used to obtain input tokens when 'use_tokenizer' is set to False. 'max' means global max pooling, 'avg' means global average pooling 'max'
pool_size (int) Height and width of the pooled feature map when 'use_tokenizer' is set to False 2
enc_with_pos (bool) Whether to add learned positional embeddings to the encoder's input feature sequence True
enc_depth (int) Number of attention blocks used in encoder 1
enc_head_dim (int) Embedding dimension of each encoder head 64
dec_depth (int) Number of attention blocks used in decoder 8
dec_head_dim (int) Embedding dimension for each decoder head 8
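
The constraints stated in the BIT table (supported backbones, allowed `n_stages` values, pooling modes, attention types) can be checked before a trainer is built. The following is a minimal sketch in plain Python; `BIT_DEFAULTS` and `build_bit_kwargs` are hypothetical names introduced here for illustration and are not part of PaddleRS.

```python
# Defaults copied from the BIT parameter table above.
BIT_DEFAULTS = {
    "in_channels": 3,
    "num_classes": 2,
    "use_mixed_loss": False,
    "losses": None,
    "att_type": "CBAM",
    "ds_factor": 1,
    "backbone": "resnet18",
    "n_stages": 4,
    "use_tokenizer": True,
    "token_len": 4,
    "pool_mode": "max",
    "pool_size": 2,
    "enc_with_pos": True,
    "enc_depth": 1,
    "enc_head_dim": 64,
    "dec_depth": 8,
    "dec_head_dim": 8,
}


def build_bit_kwargs(**overrides):
    """Merge overrides into the documented defaults and validate them.

    Hypothetical helper: it only enforces the value sets that the table
    states explicitly.
    """
    unknown = set(overrides) - set(BIT_DEFAULTS)
    if unknown:
        raise TypeError(f"Unknown BIT parameter(s): {sorted(unknown)}")
    kwargs = {**BIT_DEFAULTS, **overrides}
    if kwargs["backbone"] not in ("resnet18", "resnet34"):
        raise ValueError("backbone must be 'resnet18' or 'resnet34'")
    if kwargs["n_stages"] not in (3, 4, 5):
        raise ValueError("n_stages must be 3, 4, or 5")
    if kwargs["pool_mode"] not in ("max", "avg"):
        raise ValueError("pool_mode must be 'max' or 'avg'")
    if kwargs["att_type"] not in ("CBAM", "BAM"):
        raise ValueError("att_type must be 'CBAM' or 'BAM'")
    return kwargs


kwargs = build_bit_kwargs(backbone="resnet34", n_stages=3)
```

The resulting dictionary is meant to be passed as keyword arguments to the trainer constructor, for example something like `paddlers.tasks.cd.BIT(**kwargs)`, assuming the installed PaddleRS version accepts every key listed in the table. That call is not executed in the sketch.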

CDNet

The CDNet implementation based on PaddlePaddle.

The original article refers to Pablo F. Alcantarilla, et al., "Street-View Change Detection with Deconvolutional Networks" (https://link.springer.com/article/10.1007/s10514-018-9734-5).

Parameter Name Description Default Value
num_classes (int) Number of target classes 2
use_mixed_loss (bool) Whether to use mixed loss function False
losses (list) List of loss functions None
in_channels (int) Number of channels of the input image 6

ChangeFormer

The ChangeFormer implementation based on PaddlePaddle.

The original article refers to Wele Gedara Chaminda Bandara, Vishal M. Patel, "A Transformer-Based Siamese Network for Change Detection" (https://arxiv.org/pdf/2201.01293.pdf).

Parameter Name Description Default Value
num_classes (int) Number of target classes 2
use_mixed_loss (bool) Whether to use mixed loss False
losses (list) List of loss functions None
in_channels (int) Number of channels of the input image 3
decoder_softmax (bool) Whether to use softmax as the last layer activation function of the decoder False
embed_dim (int) Hidden layer dimension of the Transformer encoder 256

ChangeStar

The ChangeStar implementation with a FarSeg encoder based on PaddlePaddle.

The original article refers to Z. Zheng, et al., "Change is Everywhere: Single-Temporal Supervised Object Change Detection in Remote Sensing Imagery"(https://arxiv.org/abs/2108.07002).

Parameter Name Description Default Value
num_classes (int) Number of target classes 2
use_mixed_loss (bool) Whether to use mixed loss False
losses (list) List of loss functions None
mid_channels (int) Number of channels in the middle layer of UNet 256
inner_channels (int) Number of channels inside the attention module 16
num_convs (int) Number of convolutional layers in UNet encoder and decoder 4
scale_factor (float) Upsampling factor to scale the size of the output segmentation mask 4.0

DSAMNet

The DSAMNet implementation based on PaddlePaddle.

The original article refers to Q. Shi, et al., "A Deeply Supervised Attention Metric-Based Network and an Open Aerial Image Dataset for Remote Sensing Change Detection"(https://ieeexplore.ieee.org/document/9467555).

Parameter Name Description Default Value
num_classes (int) Number of target classes 2
use_mixed_loss (bool) Whether to use mixed loss False
losses (list) List of loss functions None
in_channels (int) Number of channels of the input image 3
ca_ratio (int) Channel compression ratio in channel attention module 8
sa_kernel (int) Kernel size in the spatial attention module 7

DSIFN

The DSIFN implementation based on PaddlePaddle.

The original article refers to C. Zhang, et al., "A deeply supervised image fusion network for change detection in high resolution bi-temporal remote sensing images"(https://www.sciencedirect.com/science/article/pii/S0924271620301532).

Parameter Name Description Default Value
num_classes (int) Number of target classes 2
use_mixed_loss (bool) Whether to use mixed loss False
losses (list) List of loss functions None
use_dropout (bool) Whether to use dropout False

FCEarlyFusion

The FC-EF implementation based on PaddlePaddle.

The original article refers to Rodrigo Caye Daudt, et al. "Fully convolutional siamese networks for change detection" (https://arxiv.org/abs/1810.08462).

Parameter Name Description Default Value
num_classes (int) Number of target classes 2
use_mixed_loss (bool) Whether to use mixed loss False
losses (list) List of loss functions None
in_channels (int) Number of channels of the input image 6
use_dropout (bool) Whether to use dropout False

FCSiamConc

The FC-Siam-conc implementation based on PaddlePaddle.

The original article refers to Rodrigo Caye Daudt, et al. "Fully convolutional siamese networks for change detection"(https://arxiv.org/abs/1810.08462).

Parameter Name Description Default Value
num_classes (int) Number of target classes 2
use_mixed_loss (bool) Whether to use mixed loss False
losses (list) List of loss functions None
in_channels (int) Number of channels of the input image 3
use_dropout (bool) Whether to use dropout False

FCSiamDiff

The FC-Siam-diff implementation based on PaddlePaddle.

The original article refers to Rodrigo Caye Daudt, et al. "Fully convolutional siamese networks for change detection"(https://arxiv.org/abs/1810.08462).

Parameter Name Description Default Value
num_classes (int) Number of target classes 2
use_mixed_loss (bool) Whether to use mixed loss function False
losses (list) List of loss functions None
in_channels (int) Number of channels of the input image 3
use_dropout (bool) Whether to use dropout False
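
The `in_channels` default of 6 for FCEarlyFusion versus 3 for FCSiamConc is consistent with how the two designs consume a bitemporal pair. The sketch below assumes, following the FC-EF design in the cited paper, that early fusion concatenates two 3-channel images along the channel axis, while a siamese network receives each image separately. Images are modelled as nested lists in C x H x W order, and `make_image` and `concat_channels` are hypothetical helpers.

```python
def make_image(channels, height, width, fill):
    """Build a C x H x W image as nested lists filled with one value."""
    return [[[fill] * width for _ in range(height)] for _ in range(channels)]


def concat_channels(image_t1, image_t2):
    """Concatenate two C x H x W images along the channel axis."""
    return image_t1 + image_t2


t1 = make_image(3, 4, 4, 0.0)  # image at time 1, 3 channels
t2 = make_image(3, 4, 4, 1.0)  # image at time 2, 3 channels

# Early fusion: one network input carrying both dates.
early_fusion_input = concat_channels(t1, t2)

# Siamese: two inputs, each processed by a shared-weight branch.
siamese_inputs = (t1, t2)

print(len(early_fusion_input))                # 6
print([len(img) for img in siamese_inputs])   # [3, 3]
```

Under this assumption, a pair of 4-band images would call for `in_channels=8` with early fusion and `in_channels=4` with a siamese model.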

FCCDN

The FCCDN implementation based on PaddlePaddle.

The original article refers to Pan Chen, et al., "FCCDN: Feature Constraint Network for VHR Image Change Detection"(https://arxiv.org/pdf/2105.10860.pdf).

Parameter Name Description Default Value
in_channels (int) Number of channels of the input image 3
num_classes (int) Number of target classes 2
use_mixed_loss (bool) Whether to use mixed loss False
losses (list) List of loss functions None

P2V

The P2V-CD implementation based on PaddlePaddle.

The original article refers to M. Lin, et al. "Transition Is a Process: Pair-to-Video Change Detection Networks for Very High Resolution Remote Sensing Images"(https://ieeexplore.ieee.org/document/9975266).

Parameter Name Description Default Value
num_classes (int) Number of target classes 2
use_mixed_loss (bool) Whether to use mixed loss False
losses (list) List of loss functions None
in_channels (int) Number of channels of the input image 3
video_len (int) Number of input video frames 8

SNUNet

The SNUNet implementation based on PaddlePaddle.

The original article refers to S. Fang, et al., "SNUNet-CD: A Densely Connected Siamese Network for Change Detection of VHR Images" (https://ieeexplore.ieee.org/document/9355573).

Parameter Name Description Default Value
in_channels (int) Number of channels of the input image
num_classes (int) Number of target classes
width (int) Output channels of the first convolutional layer 32

STANet

The STANet implementation based on PaddlePaddle.

The original article refers to H. Chen and Z. Shi, "A Spatial-Temporal Attention-Based Method and a New Dataset for Remote Sensing Image Change Detection"(https://www.mdpi.com/2072-4292/12/10/1662).

Parameter Name Description Default Value
num_classes (int) Number of target classes 2
use_mixed_loss (bool) Whether to use mixed loss False
losses (list) List of loss functions None
in_channels (int) Number of channels of the input image 3
width (int) Number of channels in the neural network 32

CondenseNetV2

The CondenseNetV2 implementation based on PaddlePaddle.

Parameter Name Description Default Value
num_classes (int) Number of target classes 2
use_mixed_loss (bool) Whether to use mixed loss function False
losses (list) List of loss functions None
in_channels (int) Number of channels of the input image 3
arch (str) Architecture of the model, which can be 'A', 'B', or 'C' 'A'

HRNet

The HRNet implementation based on PaddlePaddle.

Parameter Name Description Default Value
num_classes (int) Number of target classes 2
use_mixed_loss (bool) Whether to use mixed loss function False
losses (list) List of loss functions None

MobileNetV3

The MobileNetV3 implementation based on PaddlePaddle.

Parameter Name Description Default Value
num_classes (int) Number of target classes 2
use_mixed_loss (bool) Whether to use mixed loss function False
losses (list) List of loss functions None

ResNet50_vd

The ResNet50-vd implementation based on PaddlePaddle.

Parameter Name Description Default Value
num_classes (int) Number of target classes 2
use_mixed_loss (bool) Whether to use mixed loss function False
losses (list) List of loss functions None

DRN

The DRN implementation based on PaddlePaddle.

Parameter Name Description Default Value
losses (list) List of loss functions None
sr_factor (int) Scaling factor for super-resolution. The output image size will be the original image size multiplied by this factor. For example, if the original image is H x W, the output image will be sr_factor * H x sr_factor * W 4
min_max (None | tuple[float, float]) Minimum and maximum pixel values of the input image. If not specified, the data type's default minimum and maximum values are used None
scales (tuple[int]) Scaling factors (2, 4)
n_blocks (int) Number of residual blocks 30
n_feats (int) Number of features in the residual block 16
n_colors (int) Number of image channels 3
rgb_range (float) Range of image pixel values 1.0
negval (float) Negative value in nonlinear mapping 0.2
lq_loss_weight (float) Weight of the primal regression loss 0.1
dual_loss_weight (float) Weight of the dual regression loss 0.1
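
The `sr_factor` rule described in the table (an H x W input becomes sr_factor * H x sr_factor * W) can be written as a small function. This is an illustrative sketch; `restored_size` is a hypothetical helper, and treating `None` as "size unchanged" follows the NAFNet description elsewhere in this document.

```python
def restored_size(height, width, sr_factor):
    """Return the (height, width) of the output image.

    sr_factor=None is treated as "do not resize", which is an assumption
    based on the NAFNet entry of this document.
    """
    if sr_factor is None:
        return height, width
    if sr_factor < 1:
        raise ValueError("sr_factor must be a positive integer")
    return sr_factor * height, sr_factor * width


print(restored_size(64, 48, 4))     # (256, 192)
print(restored_size(64, 48, None))  # (64, 48)
```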

ESRGAN

The ESRGAN implementation based on PaddlePaddle.

Parameter Name Description Default Value
losses (list) List of loss functions None
sr_factor (int) Scaling factor for super-resolution. The output image size will be the original image size multiplied by this factor. For example, if the original image is H x W, the output image will be sr_factor * H x sr_factor * W 4
min_max (tuple) Minimum and maximum pixel values of the input image. If not specified, the data type's default minimum and maximum values are used None
use_gan (bool) Whether to use a GAN (Generative Adversarial Network) during training True
in_channels (int) Number of channels of the input image 3
out_channels (int) Number of channels of the output image 3
nf (int) Number of filters in the first convolutional layer of the model 64
nb (int) Number of residual blocks in the model 23

LESRCNN

The LESRCNN implementation based on PaddlePaddle.

Parameter Name Description Default Value
losses (list) List of loss functions None
sr_factor (int) Scaling factor for super-resolution. The output image size will be the original image size multiplied by this factor. For example, if the original image is H x W, the output image will be sr_factor * H x sr_factor * W 4
min_max (tuple) Minimum and maximum pixel values of the input image. If not specified, the data type's default minimum and maximum values are used None
multi_scale (bool) Whether to train on multiple scales False
group (int) Number of groups used in convolution operations. 1

NAFNet

The NAFNet implementation based on PaddlePaddle.

Parameter Name Description Default Value
losses (list) List of loss functions None
sr_factor (int) Scaling factor for image restoration. NAFNet is not suitable for image super-resolution tasks and does not change the size of the image. Please set sr_factor to None None
min_max (tuple) Minimum and maximum pixel values of the input image. If not specified, the data type's default minimum and maximum values are used None
use_tlsc (bool) Whether to use TLSC (test-time local statistics converter) during testing False
in_channels (int) Number of channels of the input image 3
width (int) Number of channels of NAFBlock 32
middle_blk_num (int) Number of NAFBlocks in middle block 1
enc_blk_nums (list[int]) Number of NAFBlocks in different layers of the encoder None
dec_blk_nums (list[int]) Number of NAFBlocks in different layers of the decoder None

SwinIR

The SwinIR implementation based on PaddlePaddle.

Parameter Name Description Default Value
losses (list) List of loss functions None
sr_factor (int) Scaling factor for image restoration. The output image size will be the original image size multiplied by this factor. For example, if the original image is H x W, the output image will be sr_factor * H x sr_factor * W 1
min_max (tuple) Minimum and maximum pixel values of the input image. If not specified, the data type's default minimum and maximum values are used None
in_channels (int) Number of channels of the input image 3
img_size (int) Input image size 128
window_size (int) Window size 8
depths (list[int]) Depth of each Swin Transformer layer [6, 6, 6, 6, 6, 6]
num_heads (list[int]) Number of attention heads in different layers [6, 6, 6, 6]
embed_dim (int) Patch embedding dimension 96
mlp_ratio (int) Ratio of MLP hidden dim to embedding dim 4

FasterRCNN

The Faster R-CNN implementation based on PaddlePaddle.

Parameter Name Description Default Value
num_classes (int) Number of target classes 80
backbone (str) Backbone network to use 'ResNet50'
with_fpn (bool) Whether to use Feature Pyramid Network (FPN) True
with_dcn (bool) Whether to use Deformable Convolutional Networks (DCN) False
aspect_ratios (list) List of aspect ratios of candidate boxes [0.5, 1.0, 2.0]
anchor_sizes (list) List of sizes of candidate boxes expressed as base sizes on each feature map [[32], [64], [128], [256], [512]]
keep_top_k (int) Number of predicted boxes to keep before the non-maximum suppression (NMS) operation 100
nms_threshold (float) NMS threshold to use 0.5
score_threshold (float) Score threshold for filtering predicted boxes 0.05
fpn_num_channels (int) Number of channels for each pyramid layer in the FPN network 256
rpn_batch_size_per_im (int) Total number of positive and negative samples per image in the RPN network 256
rpn_fg_fraction (float) Fraction of foreground samples in RPN network 0.5
test_pre_nms_top_n (int) Number of predicted boxes to keep before NMS operation when testing. If not specified, keep_top_k is used. None
test_post_nms_top_n (int) Number of predicted boxes to keep after NMS operation at test time 1000

FCOSR

The FCOSR implementation based on PaddlePaddle.

Parameter Name Description Default Value
num_classes (int) Number of target classes 80
backbone (str) Backbone network to use 'MobileNetV1'
anchors (list[list[int]]) Sizes of predefined anchor boxes [[10, 13], [16, 30], [33, 23], [30, 61], [62, 45], [59, 119], [116, 90], [156, 198], [373, 326]]
anchor_masks (list[list[int]]) Masks of predefined anchor boxes [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
ignore_threshold (float) IoU threshold used to assign predicted boxes to ground truth boxes 0.7
nms_score_threshold (float) NMS score threshold 0.01
nms_topk (int) Maximum number of detections to keep before performing NMS 1000
nms_keep_topk (int) Maximum number of prediction boxes to keep after NMS 100
nms_iou_threshold (float) NMS IoU threshold 0.45
label_smooth (bool) Whether to use label smoothing when computing losses

PPYOLO

The PP-YOLO implementation based on PaddlePaddle.

Parameter Name Description Default Value
num_classes (int) Number of target classes 80
backbone (str) Backbone network to use 'ResNet50_vd_dcn'
anchors (list[list[float]]) Sizes of predefined anchor boxes None
anchor_masks (list[list[int]]) Masks for predefined anchor boxes None
use_coord_conv (bool) Whether to use coordinate convolution True
use_iou_aware (bool) Whether to use IoU awareness True
use_spp (bool) Whether to use spatial pyramid pooling (SPP) True
use_drop_block (bool) Whether to use DropBlock True
scale_x_y (float) Parameter to scale each predicted box 1.05
ignore_threshold (float) IoU threshold used to assign predicted boxes to ground truth boxes 0.7
label_smooth (bool) Whether to use label smoothing False
use_iou_loss (bool) Whether to use IoU loss True
use_matrix_nms (bool) Whether to use Matrix NMS True
nms_score_threshold (float) NMS score threshold 0.01
nms_topk (int) Maximum number of detections to keep before performing NMS -1
nms_keep_topk (int) Maximum number of prediction boxes to keep after NMS 100
nms_iou_threshold (float) NMS IoU threshold 0.45
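
The four `nms_*` parameters describe successive filtering steps. The sketch below shows how they could interact in plain greedy NMS; it is not Matrix NMS (which `use_matrix_nms=True` selects) and not the PaddleRS implementation. Treating a negative `nms_topk` as "keep all candidates" is an assumption, and `iou` and `greedy_nms` are hypothetical helpers.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, nms_score_threshold=0.01, nms_topk=-1,
               nms_keep_topk=100, nms_iou_threshold=0.45):
    """Return indices of the boxes kept, ordered by descending score."""
    # Step 1: drop boxes whose score does not exceed the score threshold.
    order = [i for i, s in enumerate(scores) if s > nms_score_threshold]
    order.sort(key=lambda i: scores[i], reverse=True)
    # Step 2: keep at most nms_topk candidates (negative means no limit).
    if nms_topk >= 0:
        order = order[:nms_topk]
    # Step 3: suppress boxes that overlap a higher-scoring kept box.
    kept = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= nms_iou_threshold for j in kept):
            kept.append(i)
    # Step 4: keep at most nms_keep_topk boxes after suppression.
    return kept[:nms_keep_topk] if nms_keep_topk >= 0 else kept


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (0, 0, 10, 10)]
scores = [0.9, 0.8, 0.7, 0.005]
print(greedy_nms(boxes, scores))  # [0, 2]
```

In this example, box 1 overlaps box 0 with an IoU of about 0.68 and is suppressed, and box 3 falls below the score threshold.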

PPYOLOTiny

The PP-YOLO Tiny implementation based on PaddlePaddle.

Parameter Name Description Default Value
num_classes (int) Number of target classes 80
backbone (str) Backbone network to use 'MobileNetV3'
anchors (list[list[float]]) Sizes of predefined anchor boxes [[10, 15], [24, 36], [72, 42], [35, 87], [102, 96], [60, 170], [220, 125], [128, 222], [264, 266]]
anchor_masks (list[list[int]]) Masks for predefined anchor boxes [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
use_iou_aware (bool) Whether to use IoU awareness False
use_spp (bool) Whether to use spatial pyramid pooling (SPP) True
use_drop_block (bool) Whether to use DropBlock True
scale_x_y (float) Parameter to scale each predicted box 1.05
ignore_threshold (float) IoU threshold used to assign predicted boxes to ground truth boxes 0.5
label_smooth (bool) Whether to use label smoothing False
use_iou_loss (bool) Whether to use IoU loss True
use_matrix_nms (bool) Whether to use Matrix NMS False
nms_score_threshold (float) NMS score threshold 0.005
nms_topk (int) Maximum number of detections to keep before performing NMS 1000
nms_keep_topk (int) Maximum number of prediction boxes to keep after NMS 100
nms_iou_threshold (float) NMS IoU threshold 0.45

PPYOLOv2

The PP-YOLOv2 implementation based on PaddlePaddle.

Parameter Name Description Default Value
num_classes (int) Number of target classes 80
backbone (str) Backbone network to use 'ResNet50_vd_dcn'
anchors (list[list[float]]) Sizes of predefined anchor boxes [[10, 13], [16, 30], [33, 23], [30, 61], [62, 45], [59, 119], [116, 90], [156, 198], [373, 326]]
anchor_masks (list[list[int]]) Masks of predefined anchor boxes [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
use_iou_aware (bool) Whether to use IoU awareness True
use_spp (bool) Whether to use spatial pyramid pooling (SPP) True
use_drop_block (bool) Whether to use DropBlock True
scale_x_y (float) Parameter to scale each predicted box 1.05
ignore_threshold (float) IoU threshold used to assign predicted boxes to ground truth boxes 0.7
label_smooth (bool) Whether to use label smoothing False
use_iou_loss (bool) Whether to use IoU loss True
use_matrix_nms (bool) Whether to use Matrix NMS True
nms_score_threshold (float) NMS score threshold 0.01
nms_topk (int) Maximum number of detections to keep before performing NMS -1
nms_keep_topk (int) Maximum number of prediction boxes to keep after NMS 100
nms_iou_threshold (float) NMS IoU threshold 0.45

YOLOv3

The YOLOv3 implementation based on PaddlePaddle.

Parameter Name Description Default Value
num_classes (int) Number of target classes 80
backbone (str) Backbone network to use 'MobileNetV1'
anchors (list[list[int]]) Sizes of predefined anchor boxes [[10, 13], [16, 30], [33, 23], [30, 61], [62, 45], [59, 119], [116, 90], [156, 198], [373, 326]]
anchor_masks (list[list[int]]) Masks of predefined anchor boxes [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
ignore_threshold (float) IoU threshold used to assign predicted boxes to ground truth boxes 0.7
nms_score_threshold (float) NMS score threshold 0.01
nms_topk (int) Maximum number of detections to keep before performing NMS 1000
nms_keep_topk (int) Maximum number of prediction boxes to keep after NMS 100
nms_iou_threshold (float) NMS IoU threshold 0.45
label_smooth (bool) Whether to use label smoothing when computing losses False
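
The relation between `anchors` and `anchor_masks` can be shown with the default values from the table. The sketch assumes the usual YOLOv3 convention that each mask is a list of indices into `anchors`, one mask per detection head; `anchors_per_head` is a hypothetical helper, not a PaddleRS function.

```python
ANCHORS = [[10, 13], [16, 30], [33, 23], [30, 61], [62, 45],
           [59, 119], [116, 90], [156, 198], [373, 326]]
ANCHOR_MASKS = [[6, 7, 8], [3, 4, 5], [0, 1, 2]]


def anchors_per_head(anchors, anchor_masks):
    """Select the anchor sizes assigned to each detection head."""
    for mask in anchor_masks:
        for index in mask:
            if not 0 <= index < len(anchors):
                raise IndexError(f"mask index {index} is out of range")
    return [[anchors[index] for index in mask] for mask in anchor_masks]


heads = anchors_per_head(ANCHORS, ANCHOR_MASKS)
print(heads[0])  # [[116, 90], [156, 198], [373, 326]]
print(heads[2])  # [[10, 13], [16, 30], [33, 23]]
```

With these defaults, the first head receives the three largest anchors and the last head the three smallest.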

BiSeNetV2

The BiSeNet V2 implementation based on PaddlePaddle.

Parameter Name Description Default Value
in_channels (int) Number of channels of the input image 3
num_classes (int) Number of target classes 2
use_mixed_loss (bool) Whether to use mixed loss function False
losses (list) List of loss functions {}
align_corners (bool) Whether to use the corner alignment method False

DeepLabV3P

The DeepLab V3+ implementation based on PaddlePaddle.

Parameter Name Description Default Value
in_channels (int) Number of channels of the input image 3
num_classes (int) Number of target classes 2
backbone (str) Backbone network to use 'ResNet50_vd'
use_mixed_loss (bool) Whether to use mixed loss function False
losses (list) List of loss functions None
output_stride (int) Downsampling ratio of the output feature map relative to the input image 8
backbone_indices (tuple) Indices of the backbone stages whose outputs are used (0, 3)
aspp_ratios (tuple) Dilation rates of the dilated convolutions in the ASPP module (1, 12, 24, 36)
aspp_out_channels (int) Number of ASPP module output channels 256
align_corners (bool) Whether to use the corner alignment method False
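
As a quick illustration of `output_stride`, the sketch below computes the spatial size of the encoder output for a given input size. It assumes the size is the input size divided by the stride and rounded up, which holds for typical padded convolutions but is not taken from the PaddleRS source; `feature_map_size` is a hypothetical helper.

```python
import math


def feature_map_size(height, width, output_stride=8):
    """Return the (height, width) of the feature map after downsampling."""
    if output_stride <= 0:
        raise ValueError("output_stride must be positive")
    return (math.ceil(height / output_stride),
            math.ceil(width / output_stride))


print(feature_map_size(512, 512))      # (64, 64)
print(feature_map_size(512, 512, 16))  # (32, 32)
```

A smaller stride keeps a denser feature map, which usually gives finer segmentation boundaries at a higher memory cost.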

FactSeg

The FactSeg implementation based on PaddlePaddle.

The original article refers to A. Ma, J. Wang, Y. Zhong and Z. Zheng, "FactSeg: Foreground Activation-Driven Small Object Semantic Segmentation in Large-Scale Remote Sensing Imagery," in IEEE Transactions on Geoscience and Remote Sensing, vol. 60, pp. 1-16, 2022, Art no. 5606216.

Parameter Name Description Default Value
in_channels (int) Number of channels of the input image 3
num_classes (int) Number of target classes 2
use_mixed_loss (bool) Whether to use mixed loss function False
losses (list) List of loss functions None

FarSeg

The FarSeg implementation based on PaddlePaddle.

The original article refers to Zheng Z, Zhong Y, Wang J, et al. Foreground-aware relation network for geospatial object segmentation in high spatial resolution remote sensing imagery[C]//Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2020: 4096-4105.

Parameter Name Description Default Value
in_channels (int) Number of channels of the input image 3
num_classes (int) Number of target classes 2
use_mixed_loss (bool) Whether to use mixed loss function False
losses (list) List of loss functions None

FastSCNN

The Fast-SCNN implementation based on PaddlePaddle.

Parameter Name Description Default Value
in_channels (int) Number of channels of the input image 3
num_classes (int) Number of target classes 2
use_mixed_loss (bool) Whether to use mixed loss function False
losses (list) List of loss functions None
align_corners (bool) Whether to use the corner alignment method False

HRNet

The HRNet implementation based on PaddlePaddle.

Parameter Name Description Default Value
in_channels (int) Number of channels of the input image 3
num_classes (int) Number of target classes 2
width (int) Initial number of feature channels for the network 48
use_mixed_loss (bool) Whether to use mixed loss function False
losses (list) List of loss functions None
align_corners (bool) Whether to use the corner alignment method False

UNet

The UNet implementation based on PaddlePaddle.

Parameter Name Description Default Value
in_channels (int) Number of channels of the input image 3
num_classes (int) Number of target classes 2
use_deconv (bool) Whether to use deconvolution for upsampling False
use_mixed_loss (bool) Whether to use mixed loss function False
losses (list) List of loss functions None
align_corners (bool) Whether to use the corner alignment method False