All times are shown in the time zone of the conference (EEST).
IVC3-O: Image and Video Compression 3
First presentation: https://mmsp-virtual.org/presentation/oral/video-coding-machines-featurebased-ratedistortion-optimization
Second presentation: https://mmsp-virtual.org/presentation/oral/triangulationbased-backward-adaptive-motion-field-subsampling-scheme
Third presentation: https://mmsp-virtual.org/presentation/oral/graphbased-skeleton-data-compression
Fourth presentation: https://mmsp-virtual.org/presentation/oral/optical-flow-and-mode-selection-learningbased-video-coding
4:15pm - 4:30pm
⭐ This paper has been nominated for the best paper award.
Video Coding for Machines with Feature-Based Rate-Distortion Optimization
Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany
Common state-of-the-art video codecs are optimized to deliver a low bitrate at a given quality for the final human observer, which is achieved by rate-distortion optimization (RDO). However, with the steady improvement of neural networks solving computer vision tasks, more and more multimedia data is no longer observed by humans but analyzed directly by neural networks. In this paper, we propose a standard-compliant feature-based RDO (FRDO) that is designed to increase the coding performance when the decoded frame is analyzed by a neural network in a video coding for machines scenario. To that end, we replace the pixel-based distortion metrics in the conventional RDO of VTM-8.0 with distortion metrics calculated in the feature space created by the first layers of a neural network. In several tests with the segmentation network Mask R-CNN and single images from the Cityscapes dataset, we compare the proposed FRDO and its hybrid version HFRDO with different distortion measures in the feature space against the conventional RDO. With HFRDO, up to 5.49 % bitrate can be saved compared to the VTM-8.0 implementation in terms of Bjøntegaard Delta Rate, using the weighted average precision as quality metric. Additionally, allowing the encoder to vary the quantization parameter results in coding gains for the proposed HFRDO of up to 9.95 % compared to conventional VTM.
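The core idea, measuring distortion in a feature space instead of on pixels inside the Lagrangian cost J = D + λ·R, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: `toy_features` (horizontal gradients) merely stands in for the first layers of a network, and the candidate modes, rates, and λ are invented. It is not the VTM-8.0 integration described in the paper.

```python
def sse(a, b):
    """Sum of squared errors between two equally sized 2-D lists."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def toy_features(block):
    """Hypothetical stand-in for early network layers: horizontal gradients."""
    return [[row[i + 1] - row[i] for i in range(len(row) - 1)] for row in block]


def rd_cost(original, reconstruction, rate_bits, lam, feature_fn=None):
    """Lagrangian cost J = D + lam * R.

    D is measured on pixels by default, or in feature space if feature_fn is given.
    """
    if feature_fn is not None:
        original = feature_fn(original)
        reconstruction = feature_fn(reconstruction)
    return sse(original, reconstruction) + lam * rate_bits


def choose_mode(original, candidates, lam, feature_fn=None):
    """Return the name of the (name, reconstruction, rate_bits) candidate with lowest cost."""
    best = min(candidates, key=lambda c: rd_cost(original, c[1], c[2], lam, feature_fn))
    return best[0]


original = [[10, 20, 30, 40], [10, 20, 30, 40]]
candidates = [
    # Constant brightness offset: large pixel error, gradients preserved.
    ("mode_a", [[13, 23, 33, 43], [13, 23, 33, 43]], 8),
    # Small pixel error, but the gradients are distorted.
    ("mode_b", [[10, 22, 28, 40], [10, 22, 28, 40]], 8),
]

pixel_choice = choose_mode(original, candidates, lam=1.0)
feature_choice = choose_mode(original, candidates, lam=1.0, feature_fn=toy_features)
```

In this toy case the pixel-based cost prefers `mode_b`, while the feature-based cost prefers `mode_a`, whose constant offset leaves the gradients untouched.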
4:30pm - 4:45pm
A Triangulation-Based Backward Adaptive Motion Field Subsampling Scheme
1 Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany; 2 Huawei Technologies Duesseldorf GmbH
Optical flow procedures are used to generate dense motion fields which approximate true motion. Such fields contain a large amount of data and if we need to transmit such a field, the raw data usually exceeds the raw data of the two images it was computed from. In many scenarios, however, it is of interest to transmit a dense motion field efficiently. Most prominently this is the case in inter prediction for video coding.
In this paper we propose a transmission scheme based on subsampling the motion field. Since a field which was subsampled with a regularly spaced pattern usually yields suboptimal results, we propose an adaptive subsampling algorithm that preferably samples vectors at positions where changes in motion occur. The subsampling pattern is fully reconstructable without the need for signaling of position information. We show an average gain of 2.95 dB in mean squared error compared to regular subsampling. Furthermore we show that an additional prediction stage can improve the results by an additional 0.43 dB, gaining 3.38 dB in total.
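The property that the sampling pattern needs no signaling can be illustrated with a one-dimensional Python sketch. It is an assumed simplification, not the triangulation-based scheme of the paper: encoder and decoder both pick the next sample position from the samples already transmitted (the midpoint of the interval whose endpoints differ most), so only values travel in the stream. The function names and the selection rule are hypothetical.

```python
def next_position(samples):
    """Choose the next sample position using only already-known samples.

    Picks the midpoint of the interval whose endpoint values differ most,
    so samples concentrate where the motion changes.
    """
    positions = sorted(samples)
    best, best_score = None, -1.0
    for a, b in zip(positions, positions[1:]):
        if b - a < 2:
            continue  # no unsampled position inside this interval
        score = abs(samples[b] - samples[a])
        if score > best_score:
            best, best_score = (a + b) // 2, score
    return best


def encode(field, budget):
    """Return a stream of values only; positions are never written."""
    samples = {0: field[0], len(field) - 1: field[-1]}
    stream = [field[0], field[-1]]
    for _ in range(budget):
        p = next_position(samples)
        if p is None:
            break
        samples[p] = field[p]
        stream.append(field[p])
    return stream


def decode(stream, length):
    """Rebuild the sample positions by repeating the encoder's rule."""
    samples = {0: stream[0], length - 1: stream[1]}
    for value in stream[2:]:
        samples[next_position(samples)] = value
    return samples


field = [0, 0, 0, 0, 0, 4, 4, 4, 4]  # 1-D motion component with one edge
stream = encode(field, budget=3)
decoded = decode(stream, len(field))
```

With a budget of three extra samples, the decoder recovers positions 4, 5, and 6 around the motion edge in addition to the two endpoints, although the stream contains no position data.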
4:45pm - 5:00pm
Graph-based skeleton data compression
University of Southern California, United States of America
With the advancement of reliable, fast, portable acquisition systems, human motion capture data is becoming widely used in many industrial, medical, and surveillance applications. These systems can track multiple people simultaneously, providing full-body skeletal keypoints as well as more detailed landmarks on the face, hands, and feet. This leads to a huge amount of skeleton data to be transmitted or stored. In this paper, we introduce Graph Based Skeleton Compression (GSC), an efficient graph-based method for nearly lossless compression. We use a separable spatio-temporal graph transform along with non-uniform quantization, coefficient scanning, and entropy coding with run-length codes. We evaluate the compression performance of the proposed method on the large NTU-RGB activity dataset. Our method outperforms a 1D discrete cosine transform method applied along the temporal direction. In near-lossless mode, our proposed compression does not affect action recognition performance.
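The quantization and run-length stages of such a pipeline can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the GSC method: the graph transform is omitted (the input is taken to be already-transformed coefficients in scan order), the step size that grows linearly with the coefficient index is a hypothetical choice of non-uniform quantizer, and the (zero run, value) pair format is one common run-length convention.

```python
def quantize(coeffs, base_step=1.0, growth=0.5):
    """Quantize with a step that grows with the index (hypothetical non-uniform rule)."""
    return [round(c / (base_step + growth * i)) for i, c in enumerate(coeffs)]


def dequantize(levels, base_step=1.0, growth=0.5):
    """Invert quantize() up to the quantization error."""
    return [q * (base_step + growth * i) for i, q in enumerate(levels)]


def run_length_encode(levels):
    """Return (zero_run, value) pairs plus the number of trailing zeros."""
    pairs, run = [], 0
    for q in levels:
        if q == 0:
            run += 1
        else:
            pairs.append((run, q))
            run = 0
    return pairs, run


def run_length_decode(pairs, trailing_zeros):
    """Expand (zero_run, value) pairs back into the level sequence."""
    levels = []
    for run, q in pairs:
        levels.extend([0] * run)
        levels.append(q)
    levels.extend([0] * trailing_zeros)
    return levels


# Made-up transform coefficients in scan order (energy concentrated at the start).
coeffs = [12.2, -3.1, 0.4, 0.3, 2.6, 0.2, -0.1, 0.0]
levels = quantize(coeffs)
pairs, trailing = run_length_encode(levels)
restored = run_length_decode(pairs, trailing)
```

The run-length stage is lossless on the quantized levels; all loss comes from the quantizer, which is why small step sizes give the near-lossless behavior the abstract refers to.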
5:00pm - 5:15pm
⭐ This paper has been nominated for the best paper award.
Optical Flow and Mode Selection for Learning-based Video Coding
1 Orange, France; 2 Univ. Rennes, INSA Rennes, CNRS, IETR, UMR 6164, France
This paper introduces a new method for inter-frame coding based on two complementary autoencoders: MOFNet and CodecNet. MOFNet aims at computing and conveying the optical flow and a pixel-wise coding mode selection. The optical flow is used to perform a prediction of the frame to code. The coding mode selection enables competition between direct copy of the prediction and transmission through CodecNet.
The proposed coding scheme is assessed under the Challenge on Learned Image Compression 2020 (CLIC20) P-frame coding track test conditions, where it is shown to perform on par with the state-of-the-art video codec ITU/MPEG HEVC. Moreover, the possibility of copying the prediction makes it possible to learn the optical flow in an actual end-to-end fashion, i.e., without pre-training or a dedicated loss term.
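The competition between copying the prediction and transmitting through a codec can be pictured as a pixel-wise blend. The Python sketch below is a minimal illustration under assumptions: the mask `alpha`, the blend rule, and `toy_codec` (a coarse quantizer standing in for CodecNet) are hypothetical and do not reproduce the networks of the paper.

```python
def toy_codec(values, step=4):
    """Hypothetical stand-in for CodecNet: coarse scalar quantization."""
    return [step * round(v / step) for v in values]


def reconstruct(frame, prediction, alpha):
    """Blend per pixel: alpha = 0 copies the prediction, alpha = 1 uses the coded value."""
    coded = toy_codec(frame)
    return [a * c + (1 - a) * p for p, c, a in zip(prediction, coded, alpha)]


frame = [10, 20, 35, 49]       # frame to code (one row of pixels)
prediction = [10, 20, 30, 40]  # motion-compensated prediction
alpha = [0, 0, 1, 1]           # copy where the prediction is good, code elsewhere

reconstruction = reconstruct(frame, prediction, alpha)
```

Where the prediction is already accurate, copying it costs no codec bits; the remaining pixels go through the codec and inherit its quantization error.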
Conference: IEEE MMSP 2020