IEEE MMSP 2020 Conference Agenda

Overview and details of the sessions of this conference.

Please note that all times are shown in the time zone of the conference (EEST).
Session Overview
Session
DLM2-O: Deep Learning for Multimedia 2
Time:
Tuesday, 22/Sept/2020:
4:15pm - 5:15pm

Session Chair: Patrick Le Callet
Location: Virtual platform

External resources:
Session: https://mmsp-virtual.org/presentation/oral/deep-learning-multimedia-2
First presentation: https://mmsp-virtual.org/presentation/oral/emotion-dependent-facial-animation-affective-speech
Second presentation: https://mmsp-virtual.org/presentation/oral/demi-deep-video-quality-estimation-model-using-perceptual-video-quality
Third presentation: https://mmsp-virtual.org/presentation/oral/variational-bound-mutual-information-fairness-classification
Fourth presentation: https://mmsp-virtual.org/presentation/oral/profiling-actions-sport-video-summarization-attention-signal-analysis
Presentations
4:15pm - 4:30pm

Emotion Dependent Facial Animation from Affective Speech

Rizwan Sadiq, Sasan Asadiabadi, Engin Erzin

Koç University, Turkey

In human-to-computer interaction, facial animation in synchrony with affective speech can deliver more naturalistic conversational agents. In this paper, we present a two-stage deep learning approach for affective speech driven facial shape animation. In the first stage, we classify affective speech into seven emotion categories. In the second stage, we train separate deep estimators within each emotion category to synthesize facial shape from the affective speech. Objective and subjective evaluations are performed over the SAVEE dataset. The proposed emotion dependent facial shape model performs better, in terms of Mean Squared Error (MSE) loss and in generating the landmark animations, than a universal model trained regardless of the emotion.
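To make the two-stage pipeline concrete, here is a minimal PyTorch sketch: speech features are first classified into one of seven emotion categories, and the winning category selects an emotion-specific estimator of facial landmarks. The feature size, landmark count, and layer widths below are illustrative assumptions, not the authors' architecture.

import torch
import torch.nn as nn

NUM_EMOTIONS = 7       # seven emotion categories, as in the abstract
SPEECH_DIM = 40        # assumed per-frame speech feature size (e.g. MFCCs)
LANDMARK_DIM = 68 * 2  # assumed facial shape: 68 two-dimensional landmarks

class EmotionClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(SPEECH_DIM, 128), nn.ReLU(),
            nn.Linear(128, NUM_EMOTIONS))
    def forward(self, x):
        return self.net(x)  # emotion logits

class TwoStageAnimator(nn.Module):
    def __init__(self):
        super().__init__()
        self.classifier = EmotionClassifier()
        # stage 2: one facial-shape estimator per emotion category
        self.estimators = nn.ModuleList(
            nn.Sequential(nn.Linear(SPEECH_DIM, 256), nn.ReLU(),
                          nn.Linear(256, LANDMARK_DIM))
            for _ in range(NUM_EMOTIONS))
    def forward(self, speech):
        # stage 1: pick an emotion category per input frame
        emotion = self.classifier(speech).argmax(dim=-1)
        # stage 2: route each frame to its emotion-specific estimator
        return torch.stack([self.estimators[e](f)
                            for e, f in zip(emotion.tolist(), speech)])

model = TwoStageAnimator()
frames = torch.randn(5, SPEECH_DIM)  # 5 dummy speech frames
print(model(frames).shape)           # torch.Size([5, 136]) predicted landmarks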



4:30pm - 4:45pm

DEMI: Deep Video Quality Estimation Model using Perceptual Video Quality Dimensions

Saman Zadtootaghaj (2), Nabajeet Barman (1), Rakesh Rao (3), Steve Göring (3), Maria Martini (1), Alexander Raake (3), Sebastian Möller (2,4)

(1) Kingston University, United Kingdom; (2) Quality and Usability Lab, TU Berlin, Germany; (3) Technische Universität Ilmenau, Germany; (4) DFKI Projektbüro Berlin, Germany

With the advent and integration of gaming video streaming on traditional platforms such as YouTube Gaming and Facebook Gaming, it is imperative that the proposed quality estimation metrics work for both gaming and non-gaming content. Existing works in the field of quality assessment focus separately on gaming and non-gaming content. Along with traditional modeling approaches, deep learning based approaches have been used to develop quality models due to their high prediction accuracy. Hence, we present in this paper a deep learning based quality estimation model considering both gaming and non-gaming videos. The model is developed in three phases. First, a convolutional neural network (CNN) is trained on an objective metric, which allows the CNN to learn video artifacts such as blurriness and blockiness. Next, the model is fine-tuned on a small image quality dataset using blockiness and blurriness ratings. Finally, a Random Forest is used to pool frame-level predictions and temporal information of videos in order to predict the overall video quality.
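The third phase, pooling frame-level predictions with a Random Forest, can be sketched as follows. This is a hedged reconstruction under assumed inputs: the pooled_features helper, its statistics, and the synthetic training data are illustrative stand-ins, not DEMI's actual features or dataset.

import numpy as np
from sklearn.ensemble import RandomForestRegressor

def pooled_features(frame_scores):
    """Collapse per-frame CNN quality predictions into a fixed-length vector."""
    s = np.asarray(frame_scores, dtype=float)
    return np.array([s.mean(), s.std(), s.min(), s.max(),
                     np.abs(np.diff(s)).mean()])  # crude temporal cue

# toy training data: each row is one video's pooled features, y its overall score
rng = np.random.default_rng(0)
X = np.stack([pooled_features(rng.uniform(1, 5, size=120))  # 120 frames/video
              for _ in range(200)])
y = X[:, 0] + rng.normal(0, 0.1, size=200)  # synthetic quality target

pooler = RandomForestRegressor(n_estimators=100, random_state=0)
pooler.fit(X, y)
print(pooler.predict(X[:3]))  # predicted overall quality for 3 videos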



4:45pm - 5:00pm

Variational Bound of Mutual Information for Fairness in Classification

Zahir Alsulaimawi

Oregon State University, United States of America

Machine learning applications have emerged in many aspects of our lives, such as for credit lending, insurance rates, and employment applications.

Consequently, it is required that such systems be nondiscriminatory and fair with respect to users' sensitive features, e.g., race, sexual orientation, and religion.

To address this issue, this paper develops a minimax adversarial framework, called features protector (FP) framework, to achieve the information-theoretical trade-off between minimizing distortion of target data and ensuring that sensitive features have similar distributions.

We evaluate the performance of the proposed framework on two real-world datasets. Preliminary empirical evaluation shows that our framework provides both accurate and fair decisions.
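As a rough illustration of the generic minimax idea behind such adversarial frameworks, the sketch below trains an encoder to support the target task while degrading an adversary's ability to recover the sensitive feature from the representation; the adversary's cross-entropy loss plays the role of a variational surrogate for the mutual information. All dimensions, networks, and the loss weighting are assumptions; this is not the paper's exact FP objective.

import torch
import torch.nn as nn

enc = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 8))
task = nn.Linear(8, 2)  # predicts the target label from the representation
adv = nn.Linear(8, 2)   # tries to recover the sensitive feature
opt_main = torch.optim.Adam(list(enc.parameters()) + list(task.parameters()), lr=1e-3)
opt_adv = torch.optim.Adam(adv.parameters(), lr=1e-3)
xent = nn.CrossEntropyLoss()

x = torch.randn(64, 16)          # dummy batch of features
y = torch.randint(0, 2, (64,))   # target labels
s = torch.randint(0, 2, (64,))   # sensitive attribute

for step in range(100):
    z = enc(x)
    # adversary step: learn to recover s from the (detached) representation
    opt_adv.zero_grad()
    adv_loss = xent(adv(z.detach()), s)
    adv_loss.backward()
    opt_adv.step()
    # encoder step: good on the task, bad for the adversary (minimax)
    opt_main.zero_grad()
    main_loss = xent(task(z), y) - 0.5 * xent(adv(z), s)
    main_loss.backward()
    opt_main.step()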



5:00pm - 5:15pm

Profiling Actions for Sport Video Summarization: An attention signal analysis

Melissa Sanabria (1,2,3), Frédéric Precioso (1,2,3), Thomas Menguy (4)

(1) Université Côte d'Azur; (2) Maasai, Inria Sophia Antipolis; (3) Laboratoire d'Informatique, Signaux et Systèmes de Sophia-Antipolis (I3S); (4) Wildmoka

Analyzing video content to produce summaries and extract highlights has been challenging for decades. One of the biggest challenges for automatic sports video summarization is to produce summaries almost immediately after the match ends, conveying the course of the match while preserving its emotions. Currently, in broadcast companies, many human operators select which actions should belong to the summary based on multiple rules they have built from their own experience, using different sources of information. These rules define the different profiles of actions of interest that help the operators generate better customized summaries. Most of these profiles do not directly rely on broadcast video content but rather exploit metadata describing the course of the match. In this paper, we show how the signals produced by the attention layer of a recurrent neural network can be seen as a learnt representation of these action profiles, and provide a new tool to support operators' work. The results on soccer matches show the capacity of our approach to transfer knowledge between datasets from different broadcasting companies and different leagues, and the ability of the attention layer to learn meaningful action profiles.
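To make the "attention signal as action profile" idea concrete, here is a minimal sketch of a recurrent classifier whose attention weights over time can be read out as a per-event profile. The dimensions and the additive attention form are illustrative assumptions, not the paper's architecture.

import torch
import torch.nn as nn

class AttnActionModel(nn.Module):
    def __init__(self, feat_dim=32, hidden=64, num_classes=4):
        super().__init__()
        self.rnn = nn.GRU(feat_dim, hidden, batch_first=True)
        self.score = nn.Linear(hidden, 1)       # additive attention scorer
        self.head = nn.Linear(hidden, num_classes)
    def forward(self, x):                       # x: (batch, time, feat)
        h, _ = self.rnn(x)                      # (batch, time, hidden)
        attn = torch.softmax(self.score(h).squeeze(-1), dim=1)  # (batch, time)
        ctx = (attn.unsqueeze(-1) * h).sum(dim=1)  # attention-weighted summary
        return self.head(ctx), attn             # class logits + attention signal

model = AttnActionModel()
clip = torch.randn(1, 50, 32)        # one 50-step sequence of event features
logits, profile = model(clip)
print(profile.shape)                 # torch.Size([1, 50]): the "action profile"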



 