Conference Agenda

Session Overview
DLM1-P: Deep Learning for Multimedia 1
Tuesday, 22/Sept/2020: 10:50am - 11:05am

Session Chair: Nabajeet Barman
Location: Virtual platform

10:50am - 10:55am

Parallelized Rate-Distortion Optimized Quantization Using Deep Learning

Dana Kianfar, Auke Wiggers, Amir Said, Reza Pourreza, Taco Cohen

Qualcomm, The Netherlands

Rate-Distortion Optimized Quantization (RDOQ) has played an important role in the coding performance of recent video compression standards such as H.264/AVC, H.265/HEVC, VP9 and AV1. This scheme yields significant reductions in bit-rate at the expense of relatively small increases in distortion. Typically, RDOQ algorithms are prohibitively expensive to implement on real-time hardware encoders due to their sequential nature and their need to frequently obtain entropy coding costs. This work addresses this limitation using a neural network-based approach, which learns to trade off rate and distortion during offline supervised training. As these networks are based solely on standard arithmetic operations that can be executed on existing neural network hardware, no additional chip area needs to be reserved for dedicated RDOQ circuitry. We train two classes of neural networks, a fully-convolutional network and an auto-regressive network, and evaluate each as a post-quantization step designed to refine cheap quantization schemes such as scalar quantization (SQ). Both network architectures are designed to have a low computational overhead. After training, they are integrated into the HM 16.20 implementation of HEVC, and their video coding performance is evaluated on a subset of the H.266/VVC SDR common test sequences. Comparisons are made to the RDOQ and SQ implementations in HM 16.20. Our method outperforms the SQ baseline, and on average reaches 45% of the performance of the iterative HM RDOQ algorithm.
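
As an illustration of the rate-distortion trade-off the abstract refers to, the following is a minimal sketch, not the authors' networks and not the HM RDOQ algorithm. It compares plain scalar quantization with a per-coefficient choice that minimizes a cost J = D + lambda * R. The rate model `toy_rate`, the candidate set, and all function names are assumptions made for this example; a real encoder would obtain rates from its entropy coder and would account for dependencies between coefficients.

```python
import math


def scalar_quantize(coeffs, step):
    """Plain scalar quantization (SQ): round each coefficient to the nearest level."""
    return [int(math.copysign(math.floor(abs(c) / step + 0.5), c)) for c in coeffs]


def toy_rate(level):
    """Hypothetical rate proxy in bits (assumption, not a real entropy coder)."""
    return 1.0 if level == 0 else 2.0 * math.log2(1 + abs(level)) + 2.0


def rd_quantize(coeffs, step, lam):
    """Per coefficient, pick the level minimizing distortion + lam * rate.

    Candidates are the two levels surrounding the coefficient magnitude, plus zero.
    Coefficients are treated independently, which real RDOQ does not do.
    """
    levels = []
    for c in coeffs:
        lower = math.floor(abs(c) / step)
        candidates = {0, lower, lower + 1}
        best = min(
            candidates,
            # Ties on cost are broken in favor of the smaller level.
            key=lambda q: ((abs(c) - q * step) ** 2 + lam * toy_rate(q), q),
        )
        levels.append(int(math.copysign(best, c)) if best else 0)
    return levels


coeffs = [10.2, -3.9, 0.6, 1.2]
step = 2.0
print(scalar_quantize(coeffs, step))   # [5, -2, 0, 1]
print(rd_quantize(coeffs, step, 1.0))  # [5, -2, 0, 0]
```

With lambda set to 1.0 in this toy setting, the last coefficient is quantized to zero: the small increase in distortion is outweighed by the bits saved under the assumed rate model. With lambda set to 0, the choice reduces to nearest-level rounding.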

10:55am - 11:00am

Deep Learning Off-the-shelf Holistic Feature Descriptors for Visual Place Recognition in Challenging Conditions

Farid Alijani, Esa Rahtu

Tampere University, Finland

In this paper, we present a comprehensive study on the utility of deep learning feature extraction methods for the visual place recognition task under three challenging conditions: appearance variation, viewpoint variation, and a combination of both. We extensively compare the performance of convolutional neural network architectures with batch normalization layers in terms of the fraction of correct matches. These architectures are primarily trained for image classification and object detection problems and are used as holistic feature descriptors for the visual place recognition task. To verify the effectiveness of our results, we use four real-world place recognition datasets. Our investigation demonstrates that convolutional neural network architectures coupled with batch normalization and trained for other computer vision tasks outperform architectures that are specifically designed for place recognition.
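
To make the evaluation idea concrete, here is a minimal sketch of matching holistic descriptors by nearest neighbor and computing the fraction of correct matches. It is an illustration under stated assumptions, not the authors' pipeline: the descriptors are tiny hand-written vectors standing in for CNN features, cosine similarity is an assumed choice of similarity measure, and the `tolerance` parameter and all function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def match_places(query_descriptors, reference_descriptors):
    """For each query descriptor, return the index of the most similar reference."""
    return [
        max(
            range(len(reference_descriptors)),
            key=lambda j: cosine_similarity(q, reference_descriptors[j]),
        )
        for q in query_descriptors
    ]


def fraction_correct(matches, ground_truth, tolerance=0):
    """Share of queries whose matched index is within `tolerance` of the true index."""
    correct = sum(1 for m, g in zip(matches, ground_truth) if abs(m - g) <= tolerance)
    return correct / len(matches)


# Placeholder descriptors; in practice these would come from a pretrained network.
references = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
queries = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.2], [0.7, 0.1, 0.6]]

matches = match_places(queries, references)
score = fraction_correct(matches, ground_truth=[0, 1, 2])
print(matches, score)
```

In this toy data the third query is closer to the first reference than to its true place, so two of the three queries are matched correctly.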

11:00am - 11:05am

Learned BRIEF -- transferring the knowledge from hand-crafted to learning-based descriptors

Nina Žižakić, Aleksandra Pižurica

Ghent University, Belgium

In this paper, we present a novel approach for designing local image descriptors that learn from data and from hand-crafted descriptors. In particular, we construct a learning model that first mimics the behaviour of a hand-crafted descriptor and then learns to improve upon it in an unsupervised manner. We demonstrate the use of this knowledge-transfer framework by constructing the learned BRIEF descriptor based on the well-known hand-crafted descriptor BRIEF. We implement our learned BRIEF with a convolutional autoencoder architecture. Evaluation on the HPatches benchmark for local image descriptors shows the effectiveness of the proposed approach in the tasks of patch retrieval, patch verification, and image matching.
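
For readers unfamiliar with the hand-crafted starting point, the sketch below illustrates the basic idea of a BRIEF-style binary descriptor: a fixed set of pixel pairs is sampled once, and each bit records whether the first pixel of a pair is darker than the second. It is a simplified illustration only. It omits the patch smoothing used in the original BRIEF, it says nothing about the learned autoencoder model described in the abstract, and the function names and uniform sampling pattern are assumptions for this example.

```python
import random


def make_test_pairs(patch_size, n_bits, seed=0):
    """Sample a fixed set of pixel-pair locations, shared by all patches."""
    rng = random.Random(seed)

    def point():
        return (rng.randrange(patch_size), rng.randrange(patch_size))

    return [(point(), point()) for _ in range(n_bits)]


def brief_descriptor(patch, pairs):
    """One bit per pair: 1 if the first pixel is darker than the second, else 0."""
    return [1 if patch[y1][x1] < patch[y2][x2] else 0 for (y1, x1), (y2, x2) in pairs]


def hamming(a, b):
    """Number of positions at which two binary descriptors differ."""
    return sum(x != y for x, y in zip(a, b))


size = 8
pairs = make_test_pairs(size, n_bits=32)

# A simple patch with a unique intensity per pixel, a brighter copy, and an inverted copy.
patch = [[x + size * y for x in range(size)] for y in range(size)]
brighter = [[v + 10 for v in row] for row in patch]
inverted = [[255 - v for v in row] for row in patch]

d = brief_descriptor(patch, pairs)
print(hamming(d, brief_descriptor(brighter, pairs)))  # 0: comparisons ignore a constant offset
print(hamming(d, brief_descriptor(inverted, pairs)))
```

Because each bit depends only on the ordering of two intensities, adding a constant brightness offset leaves the descriptor unchanged, while inverting the patch flips every bit whose two sample points differ.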

Conference: IEEE MMSP 2020