This paper addresses the need for a consistent evaluation framework designed specifically for video content. 

Abstract

The increasing relevance of facial recognition technology in the broadcasting industry has raised the need for a consistent evaluation framework designed specifically for video content. Addressing this challenge, the European Broadcasting Union (EBU) has developed a benchmark and state-of-the-art AI models tailored for facial recognition in television programmes. This initiative involved the extensive annotation of a diverse video dataset guided by user-centric metrics that prioritise the accurate retrieval of relevant personalities, as defined by documentalists. Our machine learning models employ a unique approach that selectively identifies personalities active in the TV programme and deliberately excludes incidental characters, maximising user-centric metrics and enhancing the relevance and quality of the metadata. This strategy improves the overall performance of the facial recognition system while addressing privacy concerns by complying with the General Data Protection Regulation (GDPR) and ensuring the ethical and responsible use of facial recognition technology in the media sector.

Introduction

The increasing applicability of facial recognition technology (FRT) in the broadcast and media industries necessitates a standardised evaluation framework specifically designed for video content. The absence of such a framework complicates decisions about implementing facial recognition systems, as reliance on conventional Machine Learning (ML) metrics may lead to suboptimal choices. In fact, these metrics prioritise performance optimisation and can mask critical user-centric properties essential for practical applications, such as the relevance and accessibility of the metadata produced. To address this gap, the EBU has developed a benchmark tailored for facial recognition in television programming, accompanied by a state-of-the-art AI model optimised for this framework. This initiative involved the extensive annotation of a video dataset guided by user-centric metrics, which prioritise the accurate retrieval of relevant personalities in accordance with the requirements of documentalists. The framework has been specifically designed to handle the open-set use case, in which people appearing on screen may not belong to the set of known identities.
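
To make the idea of a user-centric metric concrete, the following minimal sketch computes programme-level precision and recall over the set of personalities a documentalist marked as relevant. It is an illustrative assumption, not the EBU benchmark's actual metric definition: the function name `programme_level_scores`, the set-based formulation, and the example labels are all hypothetical.

```python
from typing import Iterable, Set, Tuple


def programme_level_scores(
    predicted: Iterable[str],
    relevant: Iterable[str],
) -> Tuple[float, float]:
    """Return (precision, recall) for one programme.

    Hypothetical formulation: both inputs are treated as sets of names,
    so a personality counts once however many frames they appear in.
    """
    predicted_set: Set[str] = set(predicted)
    relevant_set: Set[str] = set(relevant)
    hits = predicted_set & relevant_set

    # Convention assumed here: an empty denominator yields a perfect score.
    precision = len(hits) / len(predicted_set) if predicted_set else 1.0
    recall = len(hits) / len(relevant_set) if relevant_set else 1.0
    return precision, recall


# Hypothetical programme: two relevant personalities, plus one incidental
# person that the system also reported.
relevant = {"presenter", "guest"}
predicted = {"presenter", "guest", "passer-by"}

precision, recall = programme_level_scores(predicted, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Under this formulation, reporting an incidental person lowers precision even when the identification itself is correct, which is the behaviour a frame-level accuracy figure would not capture.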