This repo lists our research on image matting, including papers, code, datasets, demos, and citations. For questions, please contact Jizhizi Li at [email protected] or Sihan Ma at [email protected].
[2023-04-10]: Published the paper Deep Image Matting: A Comprehensive Survey on arXiv.
[2023-03-28]: The paper Rethinking Portrait Matting with Privacy Preserving has been accepted by the International Journal of Computer Vision (IJCV) 🎉
[2023-02-28]: The paper Referring Image Matting has been accepted by the Computer Vision and Pattern Recognition Conference (CVPR) 🎉
1. Deep Image Matting: A Comprehensive Survey, arXiv, 2023
2. Rethinking Portrait Matting with Privacy Preserving, IJCV, 2023
3. Referring Image Matting, CVPR, 2023
4. Bridging Composite and Real: Towards End-to-end Deep Image Matting, IJCV, 2022
5. Privacy-preserving Portrait Matting, ACM MM, 2021
6. Deep Automatic Natural Image Matting, IJCAI, 2021
Jizhizi Li, Jing Zhang, and Dacheng Tao
Paper | Github Code | BibTex
Image matting refers to extracting a precise alpha matte from natural images, and it plays a critical role in downstream applications such as image editing. Deep learning has revolutionized the field and given rise to multiple new techniques, including automatic, interactive, and referring image matting. Here we present a comprehensive review of recent advances in image matting in the deep learning era.
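For readers new to the term, the alpha matte can be illustrated with the standard compositing model I = αF + (1 − α)B, where matting is the inverse problem of recovering α from the observed image I. The snippet below is a minimal sketch of the forward direction only, written in plain Python; the function name `composite` and the toy single-channel values are assumptions made for illustration and do not come from any of the listed repositories.

```python
def composite(alpha, foreground, background):
    """Blend each pixel as I = alpha * F + (1 - alpha) * B.

    All three inputs are 2D lists of the same shape; alpha is in [0, 1].
    """
    return [
        [a * f + (1.0 - a) * b for a, f, b in zip(a_row, f_row, b_row)]
        for a_row, f_row, b_row in zip(alpha, foreground, background)
    ]


# Toy 1x3 single-channel example (hypothetical values).
alpha = [[0.0, 0.5, 1.0]]
foreground = [[200.0, 200.0, 200.0]]
background = [[100.0, 100.0, 100.0]]

image = composite(alpha, foreground, background)
print(image)
```

Pixels with alpha 0 take the background value, pixels with alpha 1 take the foreground value, and fractional alpha gives a mix of the two.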
Sihan Ma∗, Jizhizi Li∗, Jing Zhang, He Zhang, and Dacheng Tao. (*equal contribution)
Paper | Github Code | Dataset | Demo | BibTex
This paper introduces three variants of P3M-Net, built on both transformer and CNN backbones, for privacy-preserving portrait matting. It also devises a simple yet effective Copy and Paste strategy (P3M-CP) that enables the matting model to process both face-blurred and normal images without extra effort during inference.
Jizhizi Li, Jing Zhang, and Dacheng Tao
Paper | Github Code | Dataset | BibTex
Image matting refers to accurately extracting the foreground from an image. Current automatic methods tend to extract all salient objects in the image indiscriminately. In this paper, we propose a new task named Referring Image Matting (RIM), which aims to extract the meticulous alpha matte of the specific object that best matches a given natural language description. We then propose a large-scale dataset, RefMatte, and a carefully designed method, CLIPMat, to serve as a baseline suite for RIM. We believe the RIM task, together with RefMatte and CLIPMat, will open new research directions in this area and facilitate future studies.
Jizhizi Li∗, Jing Zhang∗, Stephen J. Maybank, and Dacheng Tao. (*equal contribution)
Paper | Github Code | Dataset | Demo | BibTex
We propose a novel Glance and Focus Matting network (GFM), which employs a shared encoder and two separate decoders to learn the glance and focus tasks collaboratively for end-to-end image matting. We also establish a novel Animal Matting dataset (AM-2k) for the end-to-end matting task. Furthermore, we systematically investigate the domain gap between composite images and natural images, and propose a carefully designed composite route, RSSN, and a large-scale high-resolution background dataset (BG-20k) to serve as better candidates for composition.
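One way to picture how two decoder outputs could be combined into a single alpha matte is sketched below. It assumes, purely for illustration, that the glance branch yields a coarse three-class label per pixel (background, transition, foreground) and that the focus branch yields alpha values that are trusted only inside the transition region. The label constants and the function `merge_glance_and_focus` are hypothetical and are not taken from the GFM code; see the linked repository for the actual method.

```python
# Hypothetical label encoding for the coarse (glance) prediction.
BACKGROUND, TRANSITION, FOREGROUND = 0, 1, 2


def merge_glance_and_focus(glance_labels, focus_alpha):
    """Combine a coarse label map with fine alpha values.

    Inside the transition region the focus alpha is kept; elsewhere the
    coarse label decides the value (1.0 for foreground, 0.0 for background).
    """
    merged = []
    for label_row, alpha_row in zip(glance_labels, focus_alpha):
        row = []
        for label, a in zip(label_row, alpha_row):
            if label == TRANSITION:
                row.append(a)
            elif label == FOREGROUND:
                row.append(1.0)
            else:
                row.append(0.0)
        merged.append(row)
    return merged


# Toy 1x3 example (hypothetical values).
glance = [[BACKGROUND, TRANSITION, FOREGROUND]]
focus = [[0.3, 0.6, 0.8]]

final_alpha = merge_glance_and_focus(glance, focus)
print(final_alpha)
```

The design choice shown here is that each branch is only responsible for the region where it is reliable: coarse semantics for solid regions, fine detail for boundaries.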
Jizhizi Li∗, Sihan Ma∗, Jing Zhang, and Dacheng Tao. (*equal contribution)
Paper | Github Code | Dataset | BibTex
This work presents P3M-10k, the first large-scale anonymized benchmark for Privacy-Preserving Portrait Matting, to address growing privacy concerns in image matting. It also proposes P3M-Net, a unified framework for both semantic perception and detail matting that specifically emphasizes the interaction between these two tasks and the encoder to facilitate the matting process.
Jizhizi Li, Jing Zhang, and Dacheng Tao
Paper | Github Code | Dataset | BibTex
We investigate the difficulties of extending automatic matting methods to natural images with salient transparent/meticulous foregrounds or non-salient foregrounds. To address them, we propose a novel end-to-end matting network that predicts a generalized trimap for any image of the above types as a unified semantic representation, and simultaneously guides the matting network to focus on the transition areas via an attention mechanism. We also construct a test set, AIM-500, which contains 500 diverse natural images covering all types along with manually labeled alpha mattes, making it feasible to benchmark the generalization ability of AIM models.
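Benchmarking against manually labeled alpha mattes typically means comparing a predicted matte with the ground truth pixel by pixel. The sketch below computes two error measures commonly reported in the matting literature, the sum of absolute differences (SAD) and the mean squared error (MSE), in plain Python. It is an illustration under assumptions, not the evaluation protocol of AIM-500: the function names, the toy values, and the absence of any scaling or region masking are choices made here for brevity.

```python
def sad(pred, gt):
    """Sum of absolute differences between two alpha mattes (2D lists)."""
    return sum(
        abs(p - g)
        for pred_row, gt_row in zip(pred, gt)
        for p, g in zip(pred_row, gt_row)
    )


def mse(pred, gt):
    """Mean squared error between two alpha mattes (2D lists)."""
    num_pixels = sum(len(row) for row in gt)
    squared_error = sum(
        (p - g) ** 2
        for pred_row, gt_row in zip(pred, gt)
        for p, g in zip(pred_row, gt_row)
    )
    return squared_error / num_pixels


# Toy 2x2 mattes (hypothetical values).
predicted = [[0.0, 0.5], [1.0, 1.0]]
ground_truth = [[0.0, 1.0], [1.0, 0.5]]

sad_value = sad(predicted, ground_truth)
mse_value = mse(predicted, ground_truth)
print(sad_value, mse_value)
```

Lower values indicate a prediction closer to the labeled matte; published benchmarks may rescale these numbers or restrict them to specific regions, so results from this sketch are not directly comparable to reported figures.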