Conference Agenda

MQA1-O: Multimedia Quality Assessment 1
Monday, 21/Sept/2020:
4:15pm - 5:15pm

Session Chair: Mylene Farias
Location: Virtual platform

4:15pm - 4:30pm

No-Reference Video Quality Assessment Using Space-Time Chips

Joshua Peter Ebenezer1, Zaixi Shang1, Yongjun Wu2, Hai Wei2, Alan Conrad Bovik1

1Laboratory for Image and Video Engineering (LIVE), The University of Texas at Austin; 2Amazon Prime Video

We propose a new model for no-reference video quality assessment (VQA) based on the natural statistics of space-time chips of videos. Space-time chips (ST-chips) are a new, quality-aware feature space which we define as space-time localized cuts of video data in directions that are determined by the local motion flow. We use parametrized statistical fits to the statistics of space-time chips to characterize quality, and show that the parameters from these models are affected by distortion and can hence be used to objectively predict the quality of videos. The proposed method, which we tentatively call ChipQA, is agnostic to the types of distortion affecting the video, and is based on identifying and quantifying deviations from the expected statistics of natural, undistorted ST-chips in order to predict video quality. We train and test our resulting model on several large VQA databases and show that our model achieves high correlation against human judgments of video quality and is competitive with state-of-the-art models.

Ebenezer-No-Reference Video Quality Assessment Using Space-Time Chips-168.pdf

4:30pm - 4:45pm

Subjective Test Dataset and Meta-data-based Models for 360° Streaming Video Quality

Stephan Fremerey1, Steve Göring1, Rakesh Rao Ramachandra Rao1, Rachel Huang2, Alexander Raake1

1TU Ilmenau, Germany; 2Huawei Technologies Co. Ltd., China

In recent years, the number of 360° videos available for streaming has rapidly increased, leading to the need for 360° streaming video quality assessment.

In this paper, we report and publish results of three subjective 360° video quality tests, with conditions used to reflect real-world bitrates and resolutions including 4K, 6K and 8K, resulting in 64 stimuli each for the first two tests and 63 for the third.

As playout devices, we used the HTC Vive for the first test and the HTC Vive Pro for the remaining two tests.

Video-quality ratings were collected using the 5-point Absolute Category Rating scale.

The 360° dataset provided with the paper contains links to the source videos used, the raw subjective scores, video-related meta-data, head rotation data and Simulator Sickness Questionnaire results per stimulus and per subject to enable reproducibility of the provided results.

Moreover, we use our dataset to compare the performance of state-of-the-art full-reference quality metrics such as VMAF, PSNR, SSIM, ADM2, WS-PSNR and WS-SSIM.

Out of all metrics, VMAF was found to show the highest correlation with the subjective scores.

Further, we evaluated a center-cropped version of VMAF ("VMAF-cc"), which was shown to perform similarly to the full VMAF.

In addition to the dataset and the objective metric evaluation, we propose two new video-quality prediction models, a bitstream meta-data-based model and a hybrid no-reference model using bitrate, resolution and pixel information of the video as input.

The new lightweight models provide performance similar to that of the full-reference models while enabling fast calculations.
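As an illustration of how a metric's correlation with subjective scores can be computed, the following is a minimal sketch in plain Python. It is not taken from the paper: the function names are hypothetical, and the abstract does not state which correlation coefficients the authors report. Pearson and Spearman correlation are shown here only because they are common choices for this kind of comparison.

```python
from math import sqrt


def pearson(x, y):
    """Pearson linear correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Calling `pearson(metric_scores, mos)` and `spearman(metric_scores, mos)` on per-stimulus metric outputs and mean subjective ratings gives one correlation value per metric, which can then be compared across metrics.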

Fremerey-Subjective Test Dataset and Meta-data-based Models-281.pdf

4:45pm - 5:00pm

Merging of MOS of Large Image Databases for No-reference Image Visual Quality Assessment

Aki Kaipio, Mykola Ponomarenko, Karen Egiazarian

Tampere University, Finland

Large specialized image databases are used to train no-reference image visual quality metrics. For the images in these databases, mean opinion scores (MOS) are obtained experimentally by collecting the judgments of many observers. The MOS of a given image reflects the averaged human perception of its visual quality. Each database has its own unknown MOS scale, which depends on the unique content of the database. No-reference metrics based on convolutional networks are usually trained on only one selected database, because all MOS values fed to the training loss function should be on the same scale. In this paper, a simple and effective method is proposed for merging several large databases into one by transforming their MOS to a common scale. The accuracy of the proposed method is analyzed. The merged MOS is used for practical training of a no-reference metric. A comparative analysis shows that the training is more effective.
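To make the idea of bringing MOS values onto one scale concrete, here is a minimal sketch of one simple possibility: a least-squares linear mapping from one database's scale to another's. This is not the method proposed in the paper, which the abstract does not describe. The sketch assumes, purely for illustration, that a few anchor images have scores on both scales; the function names and the numbers in the usage example are hypothetical.

```python
def fit_linear_map(src, dst):
    """Least-squares slope and intercept such that dst ~ slope * src + intercept.

    src and dst hold scores of the same (hypothetical) anchor images,
    expressed on the source and target MOS scales respectively.
    """
    n = len(src)
    mean_src, mean_dst = sum(src) / n, sum(dst) / n
    var_src = sum((s - mean_src) ** 2 for s in src)
    cov = sum((s - mean_src) * (d - mean_dst) for s, d in zip(src, dst))
    slope = cov / var_src
    intercept = mean_dst - slope * mean_src
    return slope, intercept


def rescale(scores, slope, intercept):
    """Apply the fitted linear map to every score of the source database."""
    return [slope * s + intercept for s in scores]


# Made-up anchor scores: a 1-5 scale mapped onto a 0-100 scale.
slope, intercept = fit_linear_map([1.0, 3.0, 5.0], [20.0, 60.0, 100.0])
merged = rescale([2.0, 4.0], slope, intercept)
```

A linear map is only the simplest choice. Differences in content between databases may call for a nonlinear or rank-based transformation instead.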

Kaipio-Merging of MOS of Large Image Databases for No-reference Image Visual Quality Assessment-249.pdf

5:00pm - 5:15pm

Translation of Perceived Video Quality Across Displays

Jessie Lin, Neil Birkbeck, Balu Adsumilli

Google, United States of America

Display devices can affect the perceived quality of a video significantly. In this paper, we focus on the scenario where video resolution does not exceed screen resolution, and investigate the relationship of perceived video quality on mobile, laptop and TV. A novel transformation of Mean Opinion Scores (MOS) among different devices is proposed and is shown to be effective at normalizing ratings across user devices for in-lab and crowdsourced subjective studies. The model allows us to perform more focused in-lab subjective studies as we can reduce the number of test devices and helps us reduce noise during crowdsourced subjective video quality tests. It is also more effective than utilizing existing device dependent objective

Lin-Translation of Perceived Video Quality Across Displays-185.pdf