IBC2023: This Technical Paper focuses on the perceptual quality assessment of synthesised film grain in terms of its fidelity to the grain in source.
Abstract
Despite advances in digital cinematography, many artists and filmmakers still love the look and feel of content shot on film. Specifically, they regard true film grain as a signature of motion pictures and treat it as a key part of their artistic intent. The natural randomness of true film grain comes from the silver halide crystals in the film emulsion and their reaction to light, and this randomness is what fascinates content creators. However, content distributors such as OTT providers and streamers struggle with such a high-entropy signal, since randomness poses challenges to compression. Content distributors have limited bandwidth and try to compress video as much as possible to fit within it. A well-considered approach to coping with grainy content is to remove the grain at the source side and then synthesise it after decoding the compressed video. Recently developed codecs such as AV1 and VVC provide end-to-end solutions to achieve this goal; however, the faithfulness of the synthesised grain to the creative intent still requires thorough validation and investigation.
We believe that while the proposed framework is technically sound, the synthesised film grain is unlikely to satisfy filmmakers' and content creators' pursuit of the look and feel they intend to convey unless the problem is also examined from a perceptual video quality point of view. To support this hypothesis, we conducted a subjective study using content with film grain. To create different Hypothetical Reference Circuits (HRCs), standard film grain synthesis techniques such as autoregressive models were used to produce different levels of grain with the AV1 codec. The subjective data show that a substantial gap remains in the models proposed by the available codec standards.
Introduction
Film grain is a characteristic texture that appears in traditional film photography, caused by the random distribution of silver halide crystals in the emulsion layer of the film. It is a result of the light-sensitive chemicals on the film reacting to the light that passes through the camera lens during filming (1). This texture can vary in size and intensity, depending on the type of film used, the lighting conditions during filming, and the camera settings. Film grain can be an important visual element for Hollywood directors as it can contribute to the overall look and feel of a film. It can add a sense of texture, depth, and authenticity to the image, and can also help to create a certain mood or atmosphere. In addition to its aesthetic qualities, film grain can also be used strategically by directors to control the brightness and contrast of an image. By adjusting the amount of grain present in the image, a director can subtly alter the look and feel of the scene, enhancing certain elements or drawing attention to specific areas of the frame.
A large number of film grain synthesis algorithms have been proposed in the past decades, but little work has been done to quantify the perceptual quality of the synthesised film grain. In practice, researchers often use either subjective evaluations, where a group of viewers is recruited to rate the quality of the grain, or common objective video quality assessment (VQA) metrics, such as the peak signal-to-noise ratio (PSNR) and the structural similarity index (SSIM), but proper validation of these measures is missing.
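To illustrate why a pixel-wise metric such as PSNR needs validation before it is used on grain, the following is a minimal sketch rather than a method from this paper. It assumes a flat grey patch and additive Gaussian noise as a crude stand-in for film grain; the helper names `psnr` and `add_grain` and all parameter values are hypothetical.

```python
import math
import random

def psnr(ref, test, peak=255.0):
    """PSNR in dB between two equal-length sequences of pixel values."""
    if len(ref) != len(test):
        raise ValueError("inputs must have the same length")
    mse = sum((r - t) ** 2 for r, t in zip(ref, test)) / len(ref)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)

def add_grain(clean, sigma, seed):
    """Toy grain model (an assumption): additive Gaussian noise, clipped to [0, 255]."""
    rng = random.Random(seed)
    return [min(255.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in clean]

# Flat mid-grey patch, flattened to a 1-D list of pixel values.
clean = [128.0] * 10000

# "Source" grain and a re-synthesised grain with the same statistics
# but an independent random realisation.
source_grain = add_grain(clean, sigma=5.0, seed=1)
resynth_grain = add_grain(clean, sigma=5.0, seed=2)

# Reference is the grainy source in both cases.
psnr_denoised = psnr(source_grain, clean)          # grain removed entirely
psnr_resynth = psnr(source_grain, resynth_grain)   # grain statistically matched

print(f"denoised vs source:        {psnr_denoised:.2f} dB")
print(f"re-synthesised vs source:  {psnr_resynth:.2f} dB")
```

Under these assumptions the grain-free patch scores higher than the statistically matched one, by about 3 dB, because the mean squared error between two independent noise realisations is roughly twice the noise variance. A viewer comparing against the grainy source would not necessarily agree with that ordering.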
Both subjective and objective VQA methods can be employed to assess the quality of the synthesised film grain. In a subjective experiment, multiple human subjects are asked to rate or rank the quality of the synthesised grain for mean opinion score (MOS) collection. Subjective methods are highly valuable in comparing grain synthesis algorithms and in validating objective VQA methods, but they are often very time-consuming and costly. Depending on the availability of the original source, which is assumed to have perfect quality, objective VQA measures can be classified into full-reference (FR), reduced-reference (RR) and no-reference (NR) methods. Objective models can be employed to evaluate the grain quality automatically, and can also be embedded into the design and optimisation of various grain processing algorithms and systems. Notable success has been achieved in all three categories, especially in the FR case, where a number of state-of-the-art algorithms have been shown to correlate well with subjective quality ratings. The FR quality metric is the primary study topic of this paper, because it is consistent with the philosophy of preserving creative intent.
In this work, we focus on the perceptual quality assessment of synthesised film grain in terms of its fidelity to the grain in the source. We first create a database that contains different levels of synthesised grain, together with multiple compression levels, and carry out a subjective study using the database. Comprehensive subjective score analysis is conducted to compare the behaviour of different grain synthesis methods. We find that state-of-the-art VQA models only moderately correlate with subjective opinions. Closer examination reveals that popular deterministic VQA approaches such as VMAF and AVQT lack appropriate consideration of the statistical naturalness of the grain. This provides potential guidelines for designing a more effective objective grain fidelity metric in the future.
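As a minimal sketch of how MOS collection and metric validation are commonly carried out, the example below averages per-stimulus ratings into MOS values and then computes the Pearson and Spearman correlations between MOS and an objective score. The ratings and metric scores are made-up illustrative numbers, not data from this study, and the helper names are hypothetical.

```python
from statistics import mean

def mos(ratings_per_stimulus):
    """Mean opinion score: average of all subjects' ratings for each stimulus."""
    return [mean(r) for r in ratings_per_stimulus]

def pearson(x, y):
    """Pearson linear correlation coefficient (PLCC)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank-order correlation (SROCC): Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical 5-point ratings: 4 stimuli, 5 subjects each.
ratings = [
    [5, 4, 5, 4, 5],
    [4, 4, 3, 4, 3],
    [2, 3, 2, 3, 2],
    [1, 2, 1, 1, 2],
]
# Hypothetical objective metric scores for the same 4 stimuli.
metric_scores = [92.0, 80.0, 85.0, 40.0]

mos_scores = mos(ratings)
plcc = pearson(mos_scores, metric_scores)
srocc = spearman(mos_scores, metric_scores)
print("MOS:", mos_scores)
print(f"PLCC = {plcc:.3f}, SROCC = {srocc:.3f}")
```

In this toy data the metric swaps the order of the second and third stimuli relative to the MOS, which lowers the rank correlation to 0.8. Real studies typically also screen outlier subjects and report confidence intervals, which this sketch omits.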