P3M

[ACM MM 2021] Privacy-Preserving Portrait Matting


<h1 align="center">Privacy-Preserving Portrait Matting [ACM MM-21]</h1> <a href="https://arxiv.org/abs/2104.14222"><img src="https://img.shields.io/badge/arXiv-Paper-<COLOR>.svg" ></a> <a href="https://opensource.org/licenses/MIT"><img src="https://img.shields.io/badge/license-MIT-blue"></a> <a href="https://dl.acm.org/doi/10.1145/3474085.3475512"><img src="https://img.shields.io/static/v1?label=inproceedings&message=Paper&color=orange"></a> <a href="https://paperswithcode.com/sota/image-matting-on-p3m-10k"><img src="https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/privacy-preserving-portrait-matting/image-matting-on-p3m-10k"></a> <h4 align="center">This is the official repository of the paper <a href="https://arxiv.org/abs/2104.14222">Privacy-Preserving Portrait Matting</a>.</h4> <h5 align="center">Jizhizi Li∗, Sihan Ma∗, Jing Zhang, and Dacheng Tao</h5> <a href="#introduction">Introduction</a> | <a href="#ppt-setting-and-p3m-10k-dataset">PPT and P3M-10k</a> | <a href="#p3m-net">P3M-Net</a> | <a href="#benchmark">Benchmark</a> | <a href="#results">Results</a> | <a href="https://github.com/JizhiziLi/P3M/tree/master/core">Train and Test</a> | <a href="#inference-code---how-to-test-on-your-images">Inference code</a> | <a href="#statement">Statement</a>

<h3>:postbox: News</h3>
[2023-03-28]: The extended paper Rethinking Portrait Matting with Privacy Preserving has been accepted by the International Journal of Computer Vision (IJCV).

[2022-03-31]: Publish the extended version of the paper, "Rethinking Portrait Matting with Privacy Preserving". The code, dataset, and models are available on GitHub.

[2021-11-21]: Publish the dataset P3M-10k (the largest privacy-preserving portrait matting dataset, containing 10,421 high-resolution real-world face-blurred portrait images with manually labeled alpha mattes), the train code, and the test code. The dataset P3M-10k can be accessed from the following link; please make sure that you have read and agreed to the agreement. The train code and test code can be viewed on this code-base page.

[2021-12-06]: Publish the face masks for the training set and the P3M-500-P validation set of the P3M-10k dataset.

| Dataset | Dataset Link (Google Drive) | Dataset Link (Baidu Wangpan 百度网盘) | Dataset Release Agreement |
| :----: | :----: | :----: | :----: |
| P3M-10k | Link | Link (pw: fgmc) | Agreement (MIT License) |
| P3M-10k facemask (optional) | Link | Link (pw: f772) | Agreement (MIT License) |

[2021-11-20]: Publish the <a href="#inference-code---how-to-test-on-your-images">inference code</a> and the pretrained model (Google Drive | Baidu Wangpan (pw: 2308)) that can be used to test on your own privacy-preserving or normal portrait images. Some test results on P3M-10k can be viewed from this demo page.

Introduction

Recently, there has been an increasing concern about the privacy issue raised by using personally identifiable information in machine learning. However, previous portrait matting methods were all based on identifiable portrait images. To fill the gap, we present <a href="#ppt-setting-and-p3m-10k-dataset">P3M-10k</a> in this paper, which is the first large-scale anonymized benchmark for Privacy-Preserving Portrait Matting. P3M-10k consists of 10,000 high-resolution face-blurred portrait images along with high-quality alpha mattes. We systematically evaluate both trimap-free and trimap-based matting methods on P3M-10k and find that existing matting methods show different generalization capabilities when following the Privacy-Preserving Training (PPT) setting, i.e., training on face-blurred images and testing on arbitrary images. To devise a better trimap-free portrait matting model, we propose <a href="#p3m-net">P3M-Net</a>, which leverages the power of a unified framework for both semantic perception and detail matting, and specifically emphasizes the interaction between them and the encoder to facilitate the matting process. Extensive experiments on P3M-10k demonstrate that P3M-Net outperforms the state-of-the-art methods in terms of both objective metrics and subjective visual quality. Besides, it shows good generalization capacity under the PPT setting, confirming the value of P3M-10k for facilitating future research and enabling potential real-world applications.

PPT Setting and P3M-10k Dataset

PPT Setting: Due to privacy concerns, we propose the Privacy-Preserving Training (PPT) setting for portrait matting, i.e., training on privacy-preserved images (e.g., processed by face obfuscation) and testing on arbitrary images with or without private content. As an initial step towards privacy-preserving portrait matting, we define only the identifiable faces in frontal and some profile portrait images as private content in this work.

P3M-10k Dataset: To further explore the effect of the PPT setting, we establish the first large-scale privacy-preserving portrait matting benchmark, named P3M-10k. It contains 10,000 high-resolution portrait images anonymized by face obfuscation, along with high-quality ground truth alpha mattes. Specifically, we carefully collect, filter, and annotate about 10,000 high-resolution images from the Internet with free use licenses. There are 9,421 images in the training set and 500 images in the test set, denoted as P3M-500-P. In addition, we collect and annotate another 500 public celebrity images from the Internet without face obfuscation, denoted as P3M-500-NP, to evaluate the performance of matting models under the PPT setting on normal portrait images. We show some examples below, where (a) is from the training set, (b) is from P3M-500-P, and (c) is from P3M-500-NP.
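To make the face obfuscation step in the PPT setting more concrete, here is a minimal sketch in plain Python. It is not the pipeline used to build P3M-10k: the helper `blur_region`, the box-blur filter, and the hand-written face box are all assumptions chosen for illustration. It only shows the idea of blurring the face region of an image while leaving every other pixel unchanged.

```python
def blur_region(image, box, radius=1):
    """Return a copy of `image` with the pixels inside `box` box-blurred.

    image: 2D list of grayscale values (a list of rows).
    box:   (top, left, bottom, right) with bottom/right exclusive. This is a
           hypothetical face bounding box, e.g. from any face detector.
    """
    height, width = len(image), len(image[0])
    top, left, bottom, right = box
    out = [row[:] for row in image]  # copy, so the input stays untouched
    for y in range(max(top, 0), min(bottom, height)):
        for x in range(max(left, 0), min(right, width)):
            # Average over the (2 * radius + 1)^2 window, clipped at the borders.
            ys = range(max(y - radius, 0), min(y + radius + 1, height))
            xs = range(max(x - radius, 0), min(x + radius + 1, width))
            window = [image[j][i] for j in ys for i in xs]
            out[y][x] = sum(window) / len(window)
    return out


# Toy 4x4 "portrait" with a bright 2x2 "face" in the centre.
portrait = [
    [0, 0, 0, 0],
    [0, 9, 9, 0],
    [0, 9, 9, 0],
    [0, 0, 0, 0],
]
anonymized = blur_region(portrait, box=(1, 1, 3, 3), radius=1)
print(anonymized[1])  # [0, 4.0, 4.0, 0]
```

In this sketch only the image is modified. The corresponding alpha matte would be kept as the training target, which is an assumption consistent with the dataset pairing face-blurred images with alpha mattes.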

P3M-10k and the facemask are now published! You can access them from the following links; please make sure that you have read and agreed to the agreement. Note that the facemask is not used in our work, so downloading it is optional.

P3M-Net

Our proposed P3M-Net consists of four parts:

A Multi-task Framework: To benefit from explicitly modeling both the semantic segmentation and detail matting tasks and jointly optimizing them for trimap-free matting, we follow [1] and [2] and adopt a multi-task framework based on a modified version of ResNet-34. The model pretrained on ImageNet is listed below;
TFI: Tripartite-Feature Integration: The TFI module is used in each matting decoder block to model the interaction between the encoder, the segmentation decoder, and the matting decoder. TFI has three inputs: the feature map of the previous matting decoder block, the feature map from the same-level semantic decoder block, and the feature map from the symmetrical encoder block. TFI passes them through a projection layer, concatenates the outputs, and feeds them into a convolutional block to generate the output feature;
sBFI: Shallow Bipartite-Feature Integration: The sBFI module is used to model the interaction between the encoder and the matting decoder. sBFI adopts the feature map from the first encoder block as guidance to refine the output feature map from the previous matting decoder block, since shallow layers in the encoder contain many details and local structural information;
dBFI: Deep Bipartite-Feature Integration: dBFI module is used to model the interact
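
The TFI description above (project each of the three inputs, concatenate, then fuse with a convolutional block) can be sketched at the shape level in plain Python. This is an illustrative sketch, not the repository's implementation. It makes several assumptions: each input gets its own 1x1 projection, the fusing block is a single 1x1 convolution followed by ReLU, all three feature maps already share the same spatial size, and the weights are random. The names `conv1x1`, `relu`, `tfi`, and `random_tensor` are hypothetical. A real model would use a deep learning framework's convolution layers instead of nested lists.

```python
import random


def conv1x1(feature, weights):
    """1x1 convolution: mix channels independently at every pixel.

    feature: nested list shaped [C_in][H][W]; weights: [C_out][C_in].
    """
    height, width = len(feature[0]), len(feature[0][0])
    return [
        [
            [sum(w * feature[c][y][x] for c, w in enumerate(row)) for x in range(width)]
            for y in range(height)
        ]
        for row in weights
    ]


def relu(feature):
    """Clamp negative activations to zero."""
    return [[[max(v, 0.0) for v in line] for line in channel] for channel in feature]


def tfi(prev_matting, semantic, encoder, proj_weights, fuse_weights):
    """Tripartite-Feature Integration, shape-level sketch."""
    inputs = (prev_matting, semantic, encoder)
    # 1) Project each input (assumption: one separate projection per input).
    projected = [conv1x1(f, w) for f, w in zip(inputs, proj_weights)]
    # 2) Concatenate the projected maps along the channel axis.
    concatenated = [channel for feat in projected for channel in feat]
    # 3) Fuse with a convolutional block (assumption: 1x1 conv + ReLU).
    return relu(conv1x1(concatenated, fuse_weights))


rng = random.Random(0)


def random_tensor(*shape):
    """Nested list of the given shape filled with values in [-1, 1]."""
    if len(shape) == 1:
        return [rng.uniform(-1.0, 1.0) for _ in range(shape[0])]
    return [random_tensor(*shape[1:]) for _ in range(shape[0])]


# Three 4-channel 2x2 feature maps, each projected to 2 channels, fused to 4.
prev_matting = random_tensor(4, 2, 2)
semantic = random_tensor(4, 2, 2)
encoder = random_tensor(4, 2, 2)
proj_weights = [random_tensor(2, 4) for _ in range(3)]
fuse_weights = random_tensor(4, 6)  # 6 = 3 inputs x 2 projected channels

fused = tfi(prev_matting, semantic, encoder, proj_weights, fuse_weights)
print(len(fused), len(fused[0]), len(fused[0][0]))  # 4 2 2
```

Projecting before concatenation keeps the fused channel count fixed (here 6) regardless of how many channels each source feature map has, which is one plausible reason for the projection step.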
