Multiple-View Constrained Clustering For Unsupervised Face Identification In TV-Broadcast

Abstract : Our goal is to automatically identify faces in TV broadcasts without a predefined dictionary of identities. Most methods rely on identity detection (from OCR and ASR) and require a propagation strategy based on visual clustering. In TV content, people appear with many variations, which makes clustering difficult. In this case, speaker clustering can be a reliable link for face clustering. Multi-modal clustering methods assume a bipartite mapping between modalities. In this paper, we propose to automatically build an incomplete speaker-face mapping based on local evidence from OCR and lip-activity links. We then propose schemes for propagating speaker constraints to the face constrained-clustering problem. Experiments on the REPERE corpus show improved face identification when names are propagated to face clusters (+3.7% F-measure compared to the baseline).
Document type : Conference papers

https://hal-amu.archives-ouvertes.fr/hal-01194240
Contributor : Benoit Favre
Submitted on : Saturday, September 5, 2015 - 11:12:43 AM
Last modification on : Friday, March 22, 2019 - 1:35:15 AM

Citation

Meriem Bendris, Delphine Charlet, Benoit Favre, Géraldine Damnati, Rémi Auguste. Multiple-View Constrained Clustering For Unsupervised Face Identification In TV-Broadcast. ICASSP2014 - Image, Video, and Multidimensional Signal Processing (ICASSP2014 - IVMSP), May 2014, Florence, Italy. pp.494 - 498, ⟨10.1109/ICASSP.2014.6853645⟩. ⟨hal-01194240⟩
