IBC2023: This Technical Paper presents a real-time implementation of a platform jointly developed by InterDigital and Philips that showcases use cases leveraging the MPEG volumetric (MPEG-I V3C) and 2D video standards (VVC, HEVC).

Abstract

This paper presents a real-time implementation of a platform jointly developed by InterDigital and Philips that showcases use cases leveraging the MPEG volumetric (MPEG-I V3C) and 2D video standards (VVC, HEVC). We will detail how our platform enables interoperability within existing and emerging extended reality (XR) ecosystems, including the acquisition, streaming, and real-time interactive playback of volumetric video on current and future client devices, for use in applications like telelearning, free-viewpoint sport replays, and 3D telepresence in connected ecosystems like the metaverse.

MPEG’s Visual Volumetric Video-based Coding (V3C) standard is an extensive framework for the coding of volumetric video, from dynamic point clouds (V-PCC) to multi-view plus depth and multi-plane image representations (MIV), offering a single bitstream structure with a uniform bridge to systems-level standards. The V3C carriage standard defines how volumetric content can be stored, transported, and delivered to the end-user, and it repurposes existing 2D video hardware decoders and GPUs to decode and render volumetric video.
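To illustrate the idea of repurposing 2D decoders, the sketch below shows, in heavily simplified form, how a V3C player can demultiplex an access unit into an atlas (patch metadata) plus 2D geometry and texture sub-bitstreams, hand the latter to an ordinary video decoder, and back-project patches into 3D points. All names and data structures here are hypothetical placeholders for illustration, not the actual V3C syntax or any real decoder API.

```python
from dataclasses import dataclass

@dataclass
class Patch:
    # Hypothetical patch record: 2D position/size in the video frame,
    # plus the patch's offset in 3D space (simplified from real V3C atlas data).
    u0: int
    v0: int
    width: int
    height: int
    offset_3d: tuple

def decode_2d_video(sub_bitstream):
    """Stand-in for a hardware HEVC/VVC decoder: returns a 2D sample grid."""
    w, h, fill = sub_bitstream  # toy "bitstream": just frame parameters
    return [[fill] * w for _ in range(h)]

def reconstruct_points(patches, geometry, texture):
    """Back-project each patch pixel to a 3D point with an attached colour."""
    points = []
    for p in patches:
        for dv in range(p.height):
            for du in range(p.width):
                depth = geometry[p.v0 + dv][p.u0 + du]
                colour = texture[p.v0 + dv][p.u0 + du]
                x = p.offset_3d[0] + du
                y = p.offset_3d[1] + dv
                z = p.offset_3d[2] + depth
                points.append(((x, y, z), colour))
    return points

# Toy access unit: one 4x2 patch inside 8x4 geometry/texture frames.
patches = [Patch(u0=0, v0=0, width=4, height=2, offset_3d=(10, 20, 30))]
geometry = decode_2d_video((8, 4, 5))    # constant depth of 5
texture = decode_2d_video((8, 4, 128))   # constant grey colour
cloud = reconstruct_points(patches, geometry, texture)
print(len(cloud))  # 8 points: one per pixel of the 4x2 patch
```

The key point the sketch captures is that the computationally heavy part (decoding the geometry and texture frames) is ordinary 2D video decoding, which existing hardware decoders already accelerate; only the lightweight atlas parsing and 3D reconstruction are volumetric-specific.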

Introduction

The increasing popularity of XR applications is driving the media industry to explore the creation and delivery of new immersive experiences, while pushing engineers and inventors to address the challenges of handling real-world video content.

A volumetric video comprises a sequence of frames, each of which is a static 3D representation of a real-world object or scene captured at a given point in time. Volumetric video is bandwidth-heavy content that can be represented as dynamic point clouds, multi-view plus depth, or multi-plane images. These bandwidth requirements can be reduced through dedicated compression schemes adapted to these types of content, reaching data rates and file sizes that are economically viable for the industry. Standards play a crucial role in ensuring interoperability across these different types of content and experiences, and this paper presents the Moving Picture Experts Group (MPEG) Visual Volumetric Video-based Coding (V3C) standard [1] as an open standard for the efficient compression and streaming of volumetric video. The MPEG community has described use cases for the V3C codec [6] [17].
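A back-of-the-envelope calculation shows why uncompressed volumetric video is bandwidth-heavy. The figures below are illustrative assumptions (not taken from the paper): a dense point-cloud capture of roughly one million points per frame, 10-bit coordinates, 8-bit colour channels, and 30 frames per second.

```python
# Illustrative raw-bitrate estimate for an uncompressed dynamic point cloud.
# All figures are assumptions for the sake of the example.

points_per_frame = 1_000_000      # assumed dense capture of one subject
bits_per_point = 3 * 10 + 3 * 8   # XYZ at 10 bits each + RGB at 8 bits each
frames_per_second = 30

raw_bits_per_second = points_per_frame * bits_per_point * frames_per_second
print(f"{raw_bits_per_second / 1e9:.2f} Gbit/s")  # 1.62 Gbit/s uncompressed
```

At over a gigabit per second for a single captured subject, raw delivery is impractical over consumer networks, which is what motivates dedicated compression schemes such as V-PCC and MIV.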

Looking at trends towards metaverse developments, one can consider a transition path from current 2D experiences to future metaverse worlds. The presented platform was therefore designed to let a remote user access 2D content first and then engage further with the proposed topic through volumetric viewing, bringing the enriched experience to the user seamlessly.

This paper presents a real-time implementation of a platform jointly developed by InterDigital and Philips. This platform, illustrated in Figure 1, ingests pre-recorded 2D and volumetric video content and provides real-time streaming and rendering. The final content is delivered to the user on various devices such as 2D screens, smartphones or tablets, and VR/AR head-mounted displays.

The organization of the paper is as follows: the standards section will give an overview of the two implemented V3C-based volumetric codecs, namely V-PCC and MIV, along with the V3C carriage layer for systems. The platform architecture section will describe the presented Immersive Video Decoder Platform, highlighting the principal software components that enable real-time streaming and rendering and the proposed integration into the XR ecosystem. The evaluation section will provide metrics for the current implementation measured on laptop, smartphone, and tablet devices.
