This paper proposes an objective film grain similarity model using a data-driven approach, which aligns closely with human perception, demonstrating a high correlation with subjective studies. 

Abstract

Digital cinematography has advanced, yet many artists still prefer film for its distinctive texture and essence, and film grain is integral to their artistic expression. However, Over-the-Top (OTT) providers and streamers struggle with this high-entropy signal, which compresses poorly and is therefore costly to deliver over limited bandwidth. A common way to preserve film grain is to remove it at the source and resynthesize it after decoding, a strategy supported by codecs such as AV1 and VVC, although it can compromise grain fidelity. Our subjective studies, presented at IBC 2023 (1), examined existing film grain synthesis methods and revealed shortcomings in replicating the original grain appearance. We advocate a perceptual approach to assessing film grain synthesis quality, while emphasizing the complexity of subjective evaluation. We propose an objective film grain similarity model, built with a data-driven approach, that aligns closely with human perception and correlates highly with the results of subjective studies. We use this metric to optimize the parameters of auto-regressive film grain synthesis, which yields a faithful replication of the original film grain, as confirmed by subsequent subjective studies.

Introduction

Film grain, a distinctive texture resulting from the random distribution of silver halide crystals in traditional film photography, plays a crucial role in the visual aesthetic of films. It adds depth, texture, and authenticity, and gives directors a means of shaping brightness, contrast, and mood. It also contributes to the overall look and feel of a film, enhancing certain elements and drawing attention to specific areas of the frame. Because grain is costly to compress, film grain removal and synthesis have become integral to video encoding systems, saving bandwidth while maintaining perceptual quality. The process estimates and removes grain from the source video before encoding, then reintroduces it after decoding based on the estimated grain statistics (see Figure 1).
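The remove, estimate, and resynthesize flow described above can be illustrated with a minimal sketch. It is not the method used in this paper or in any codec. It assumes a box blur as a stand-in denoiser, a single standard deviation as the only grain statistic, and white Gaussian noise as the grain model, all of which are far simpler than auto-regressive synthesis. The encode and decode steps are omitted, so the denoised frame stands in for the decoded one. All function names are hypothetical.

```python
import random
import statistics


def make_grainy_frame(height, width, sigma, seed):
    """Build a flat grey test frame with additive Gaussian 'grain' (toy data)."""
    rng = random.Random(seed)
    return [[128.0 + rng.gauss(0.0, sigma) for _ in range(width)]
            for _ in range(height)]


def box_blur(frame, radius=1):
    """Stand-in denoiser: mean over a (2r+1)x(2r+1) window, clipped at the borders."""
    h, w = len(frame), len(frame[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [frame[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            out[y][x] = sum(window) / len(window)
    return out


def estimate_grain_sigma(frame, denoised):
    """Grain statistic (assumption): standard deviation of the removed residual."""
    residual = [f - d
                for row_f, row_d in zip(frame, denoised)
                for f, d in zip(row_f, row_d)]
    return statistics.pstdev(residual)


def synthesize_grain(decoded, sigma, seed=0):
    """Decoder side: add fresh white Gaussian noise with the signalled sigma."""
    rng = random.Random(seed)
    return [[p + rng.gauss(0.0, sigma) for p in row] for row in decoded]


# Encoder side: remove grain and estimate its statistics.
source = make_grainy_frame(32, 32, sigma=8.0, seed=1)
denoised = box_blur(source)
sigma_hat = estimate_grain_sigma(source, denoised)

# Decoder side: the denoised frame stands in for the decoded frame here.
output = synthesize_grain(denoised, sigma_hat, seed=2)
print(f"estimated grain sigma: {sigma_hat:.2f}")
```

Note that the estimated standard deviation comes out below the true value, because the blur leaves part of the grain in the denoised frame. Even in this toy setting the resynthesized grain differs from the original, which is the kind of fidelity gap a grain similarity metric is meant to quantify.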