Conference Agenda

Overview and details of the sessions of this conference.

Session Overview
IVC1-P: Image and Video Compression 1
Monday, 21/Sept/2020:
10:50am - 11:15am

Session Chair: Joao Ascenso
Location: Virtual platform

10:50am - 10:55am

Scalable Mesh Representation for Depth from Breakpoint-Adaptive Wavelet Coding

Yue Li, Reji Mathew, David Taubman

University of New South Wales, Australia

A highly scalable and compact representation of depth data is required in many applications, and it is especially critical for plenoptic multiview image compression frameworks that use depth information for novel view synthesis and inter-view prediction. Efficiently coding depth data can be difficult because it contains sharp discontinuities. Breakpoint-adaptive discrete wavelet transforms (BPA-DWT), currently being standardized as part of the JPEG 2000 Part 17 extensions, have been found suitable for coding spatial media with hard discontinuities. In this paper, we explore a modification to the original BPA-DWT, replacing the traditional constant extrapolation strategy with a newly proposed affine extrapolation for reconstructing depth data in the vicinity of discontinuities. We also present a depth reconstruction scheme that can directly decode the BPA-DWT coefficients and breakpoints onto a compact and scalable mesh-based representation, which has many potential benefits over a sample-based description. For performing depth-compensated view prediction, our proposed triangular mesh representation of the depth data is a natural fit for modern graphics architectures.

Li-Scalable Mesh Representation for Depth from Breakpoint-Adaptive Wavelet Coding-238.pdf
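The constant-versus-affine extrapolation distinction in the abstract above can be illustrated with a small sketch. This is our own toy example on a 1-D depth signal, not the authors' code; the function names and sample values are illustrative assumptions.

```python
# Two strategies for extending a depth signal up to a breakpoint:
# constant extrapolation repeats the last sample, while affine
# extrapolation continues the local slope, better matching sloped
# depth surfaces that end at a discontinuity.

def constant_extrapolate(samples, n):
    """Extend `samples` by n values, repeating the last sample."""
    return samples + [samples[-1]] * n

def affine_extrapolate(samples, n):
    """Extend `samples` by n values, continuing the slope of the last two."""
    slope = samples[-1] - samples[-2]
    return samples + [samples[-1] + slope * (k + 1) for k in range(n)]

# A sloped depth ramp approaching a discontinuity (breakpoint):
ramp = [10.0, 12.0, 14.0, 16.0]

print(constant_extrapolate(ramp, 3))  # ends ..., 16.0, 16.0, 16.0
print(affine_extrapolate(ramp, 3))    # ends ..., 18.0, 20.0, 22.0
```

For a flat surface the two strategies coincide; the benefit of the affine variant appears exactly where the depth map slopes toward a discontinuity.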

10:55am - 11:00am

Efficient Low Bit-Rate Intra-Frame Coding using Common Information for 360-degree Video

Fariha Afsana1, Manoranjan Paul1, Manzur Murshed2, David Taubman3

1Charles Sturt University, Australia; 2Federation University Australia; 3The University of New South Wales

With the growth of video technologies, super-resolution videos, including 360-degree immersive video, have become a reality, driven by applications such as augmented/virtual/mixed reality that offer better interaction and a wide-angle user-view experience of a scene compared to traditional video with a narrow viewing angle. This new generation of video content is bandwidth-intensive due to its high resolution, and its high bit-rate and low-latency delivery requirements pose challenges for transmission and storage. Traditional video coding schemes offer limited optimisation space for improving intra-frame coding efficiency due to the fixed size of the processing block. This paper presents a new approach for improving intra-frame coding, especially for low-bit-rate transmission of 360-degree video in the lossy mode of HEVC. Prior to traditional HEVC intra-prediction, the approach exploits the global redundancy of the entire frame by extracting common important information using a multi-level discrete wavelet transform. This paper demonstrates that encoding only the low-frequency information of a frame can outperform the HEVC standard at low bit rates. The experimental results indicate that the proposed intra-frame coding strategy achieves an average 54.07% BD-rate reduction and a 2.84 dB BD-PSNR gain in the low-bit-rate scenario compared to HEVC. It also achieves a significant reduction in encoding time of about 66.84% on average. Moreover, this finding demonstrates that the existing HEVC block partitioning can be applied in the transform domain to better exploit information concentration, since HEVC is applied to the wavelet frequency domain.

Afsana-Efficient Low Bit-Rate Intra-Frame Coding using Common Information-186.pdf
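The "common information" extraction step described above can be sketched with a minimal multi-level wavelet decomposition. This is a hedged illustration using a plain Haar transform on a 1-D row of pixels, not the authors' implementation; the toy signal and function names are our assumptions.

```python
# A multi-level Haar DWT that keeps only the low-frequency approximation,
# illustrating how a coarse "common information" description of a frame
# can be extracted before conventional intra coding.

def haar_step(signal):
    """One Haar analysis step: return (approximation, detail) subbands."""
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return approx, detail

def low_frequency(signal, levels):
    """Apply `levels` Haar steps, discarding all detail subbands."""
    approx = list(signal)
    for _ in range(levels):
        approx, _ = haar_step(approx)
    return approx

row = [8, 8, 9, 9, 50, 52, 51, 49]   # toy pixel row with a sharp edge
print(low_frequency(row, 2))          # compact low-frequency description
```

Encoding only this shrunken approximation (here 2 values instead of 8) is what allows the large bit-rate savings at low rates, at the cost of discarding the detail subbands.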

11:00am - 11:05am

A Coarse Representation of Frames Oriented Video Coding By Leveraging Cuboidal Partitioning of Image Data

Ashek Ahmmed1, Manoranjan Paul1, Manzur Murshed2, David Taubman3

1CSU, Australia; 2Federation University, Australia; 3UNSW, Australia

Video coding algorithms attempt to minimize the significant commonality that exists within a video sequence. Each new video coding standard contains tools that can perform this task more efficiently than its predecessors. In this work, we form a coarse representation of the current frame by minimizing commonality within that frame while preserving its important structural properties. The building blocks of this coarse representation are rectangular regions called cuboids, which are computationally simple and have a compact description. We then propose to employ the coarse frame as an additional source for predictive coding of the current frame. Experimental results show bit-rate savings of up to 17.46% over a reference codec for HEVC, with a minor increase in codec computational complexity.

Ahmmed-A Coarse Representation of Frames Oriented Video Coding-263.pdf
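The cuboid idea above can be made concrete with a small sketch: greedily split a 2-D block along the row or column boundary that most reduces the squared error of a per-cuboid mean approximation. This is our own illustrative reading of cuboidal partitioning, not the paper's algorithm; names and the toy block are assumptions.

```python
# Greedy single-split cuboidal partitioning: each candidate cuboid is
# summarised by its mean value, and the split that minimises the total
# squared error of that summary is chosen.

def sse(block):
    """Sum of squared errors when `block` is replaced by its mean."""
    vals = [v for row in block for v in row]
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals)

def best_split(block):
    """Return (axis, index, cost) of the split minimising total SSE."""
    rows, cols = len(block), len(block[0])
    best = ('none', 0, sse(block))
    for r in range(1, rows):                     # horizontal cuts
        cost = sse(block[:r]) + sse(block[r:])
        if cost < best[2]:
            best = ('row', r, cost)
    for c in range(1, cols):                     # vertical cuts
        left = [row[:c] for row in block]
        right = [row[c:] for row in block]
        cost = sse(left) + sse(right)
        if cost < best[2]:
            best = ('col', c, cost)
    return best

block = [
    [10, 10, 90, 90],
    [10, 10, 90, 90],
    [10, 10, 90, 90],
]
print(best_split(block))  # a vertical cut at column 2 removes all the error
```

Applied recursively, such splits yield a handful of mean-valued rectangles, which is the kind of compact, structure-preserving coarse frame the abstract uses as an extra prediction source.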

11:05am - 11:10am

Bi-directional intra prediction based measurement coding for compressive sensing images

Thuy Thi Thu Tran1, Jirayu Peetakul1, Chi Do Kim Pham1, Jinjia Zhou1,2

1Graduate School of Science and Engineering, Hosei University, Tokyo 184-8584, Japan; 2JST, PRESTO, Tokyo 332-0012, Japan

This work proposes a bi-directional intra prediction-based measurement coding algorithm for compressive sensing images. Compressive sensing is capable of reducing the size of sparse signals, in which high-dimensional signals are represented by under-determined linear measurements. In order to exploit the spatial redundancy in the measurements, the corresponding pixel-domain information is extracted using the structure of the measurement matrix. Firstly, the mono-directional prediction modes (i.e. horizontal and vertical modes), which refer to the nearest information of neighboring pixel blocks, are obtained from the structure of the measurement matrix. Secondly, we design bi-directional intra prediction modes (i.e. diagonal + horizontal, diagonal + vertical) based on the already obtained mono-directional prediction modes. Experimental results show that this work achieves a 0.01 - 0.02 dB PSNR improvement and bit-rate reductions of 19.5% on average, and up to 36%, compared to the state-of-the-art.

Tran-Bi-directional intra prediction based measurement coding-241.pdf
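The core intuition of measurement-domain prediction can be shown in a few lines: when neighbouring blocks share the same measurement matrix, similar blocks produce similar measurements, so a neighbour's measurements can predict the current block's and only a small residual needs coding. This is a hedged stand-in for the paper's horizontal mode; the matrix and block values are toy data of our own.

```python
# Measurement-domain prediction sketch for compressive sensing:
# y = Phi x, with the same Phi applied to every block, so the left
# block's measurements serve as a predictor for the current block's.

def measure(phi, x):
    """Compute y = Phi x for a block flattened into a vector."""
    return [sum(p * v for p, v in zip(row, x)) for row in phi]

phi = [[1, 0, 1, 0],
       [0, 1, 0, 1],
       [1, 1, 0, 0]]               # toy 3x4 measurement matrix

left_block    = [10, 11, 10, 11]   # reconstructed neighbouring block
current_block = [10, 12, 10, 12]   # block currently being coded

y_left = measure(phi, left_block)
y_cur  = measure(phi, current_block)

residual = [c - l for c, l in zip(y_cur, y_left)]
print(residual)   # small residual, cheaper to code than y_cur itself
```

The bi-directional modes in the paper extend this idea by combining predictors from more than one neighbouring direction before forming the residual.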

11:10am - 11:15am


Ramin Ghaznavi Youvalari, Jani Lainema

Nokia Technologies, Finland

The Cross-Component Linear Model (CCLM) is an intra prediction technique adopted into the upcoming Versatile Video Coding (VVC) standard. CCLM attempts to reduce the inter-channel correlation by using a linear model whose parameters are calculated from the reconstructed samples in the luma channel as well as the neighboring samples of the chroma coding block. In this paper, we propose a new method, called Joint Cross-Component Linear Model (J-CCLM), to improve the prediction efficiency of the tool. The proposed J-CCLM technique predicts the samples of the coding block with a multi-hypothesis approach that combines two intra prediction modes: the final prediction of the block is obtained by combining the conventional CCLM mode with an angular mode derived from the co-located luma block. Experiments conducted on the VTM-8.0 test model of VVC show that the proposed method provides, on average, more than 1.0% BD-rate gain in the chroma channels. Furthermore, weighted YCbCr bitrate savings of 0.24% and 0.54% are achieved in the 4:2:0 and 4:4:4 color formats, respectively.
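The linear model and the multi-hypothesis blend described above can be sketched compactly. This is an illustrative least-squares version, not the VVC reference derivation (which uses min/max luma samples), and the equal-weight blend and all sample values are our assumptions.

```python
# CCLM-style prediction: fit chroma ~ a*luma + b from reconstructed
# neighbouring samples, predict chroma from co-located luma, then blend
# with a second intra hypothesis (a J-CCLM-style combination).

def fit_cclm(luma_neighbors, chroma_neighbors):
    """Least-squares fit of the linear model parameters (a, b)."""
    n = len(luma_neighbors)
    mx = sum(luma_neighbors) / n
    my = sum(chroma_neighbors) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(luma_neighbors, chroma_neighbors))
    var = sum((x - mx) ** 2 for x in luma_neighbors)
    a = cov / var
    return a, my - a * mx

luma_nb   = [100, 120, 140, 160]           # reconstructed luma neighbours
chroma_nb = [ 52,  62,  72,  82]           # here chroma = 0.5*luma + 2

a, b = fit_cclm(luma_nb, chroma_nb)

luma_block   = [110, 130, 150, 170]        # co-located reconstructed luma
cclm_pred    = [a * l + b for l in luma_block]
angular_pred = [57, 67, 77, 87]            # stand-in for the derived angular mode

# Multi-hypothesis blend (equal weights assumed for illustration):
j_cclm = [(c + g) / 2 for c, g in zip(cclm_pred, angular_pred)]
print(cclm_pred)   # [57.0, 67.0, 77.0, 87.0]
```

The paper's contribution is precisely this second hypothesis: combining the CCLM output with an angular mode derived from the co-located luma block rather than relying on the linear model alone.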