MQA1-O: Multimedia Quality Assessment 1
4:15pm - 4:30pm
No-Reference Video Quality Assessment Using Space-Time Chips
1Laboratory for Image and Video Engineering (LIVE), The University of Texas at Austin; 2Amazon Prime Video
We propose a new model for no-reference video quality assessment (VQA) based on the natural statistics of space-time chips of videos. Space-time chips (ST-chips) are a new, quality-aware feature space which we define as space-time localized cuts of video data in directions that are determined by the local motion flow. We use parametrized statistical fits to the statistics of space-time chips to characterize quality, and show that the parameters from these models are affected by distortion and can hence be used to objectively predict the quality of videos. The proposed method, which we tentatively call ChipQA, is agnostic to the types of distortion affecting the video, and is based on identifying and quantifying deviations from the expected statistics of natural, undistorted ST-chips in order to predict video quality. We train and test our resulting model on several large VQA databases and show that our model achieves high correlation against human judgments of video quality and is competitive with state-of-the-art models.
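As an illustration of the kind of parametrized statistical fit the abstract refers to, the following minimal sketch estimates the shape parameter of a zero-mean generalized Gaussian distribution by moment matching. The abstract does not name the distribution or the fitting procedure used in ChipQA, so the generalized Gaussian model, the function names, and the synthetic samples are all assumptions for illustration, not the authors' implementation.

```python
import math
import random

# Candidate shape values searched by the fit (hypothetical range and step).
_SHAPE_GRID = [0.2 + 0.001 * k for k in range(9801)]  # 0.2 .. 10.0


def ggd_ratio(alpha):
    """Theoretical E[x^2] / (E|x|)^2 for a zero-mean generalized Gaussian."""
    return (math.gamma(1.0 / alpha) * math.gamma(3.0 / alpha)
            / math.gamma(2.0 / alpha) ** 2)


def fit_ggd(samples):
    """Return (shape, std) of a generalized Gaussian fitted by moment matching.

    The sample ratio E[x^2] / (E|x|)^2 is compared against the theoretical
    ratio on a grid of shape values, and the closest grid point is returned.
    """
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]
    second_moment = sum(c * c for c in centered) / n
    abs_moment = sum(abs(c) for c in centered) / n
    rho = second_moment / abs_moment ** 2
    shape = min(_SHAPE_GRID, key=lambda a: abs(ggd_ratio(a) - rho))
    return shape, math.sqrt(second_moment)


if __name__ == "__main__":
    # Synthetic stand-ins for normalized coefficients (assumed data):
    # Gaussian samples have shape near 2, Laplacian samples shape near 1.
    random.seed(0)
    gaussian = [random.gauss(0.0, 1.0) for _ in range(20000)]
    laplacian = [random.expovariate(1.0) * random.choice((-1, 1))
                 for _ in range(20000)]
    print("gaussian fit:", fit_ggd(gaussian))
    print("laplacian fit:", fit_ggd(laplacian))
```

A shape estimate that drifts away from the value observed on undistorted content is the sort of deviation a quality model could use as a feature.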
4:30pm - 4:45pm
Subjective Test Dataset and Meta-data-based Models for 360° Streaming Video Quality
1TU Ilmenau, Germany; 2Huawei Technologies Co. Ltd., China
In recent years, the number of 360° videos available for streaming has rapidly increased, leading to the need for 360° streaming video quality assessment.
In this paper, we report and publish results of three subjective 360° video quality tests, with conditions used to reflect real-world bitrates and resolutions including 4K, 6K and 8K, resulting in 64 stimuli each for the first two tests and 63 for the third.
The playout device was the HTC Vive for the first test and the HTC Vive Pro for the remaining two tests.
Video-quality ratings were collected using the 5-point Absolute Category Rating scale.
The 360° dataset provided with the paper contains links to the source videos, the raw subjective scores, video-related meta-data, head rotation data and Simulator Sickness Questionnaire results per stimulus and per subject, so that the reported results can be reproduced.
Moreover, we use our dataset to compare the performance of state-of-the-art full-reference quality metrics such as VMAF, PSNR, SSIM, ADM2, WS-PSNR and WS-SSIM.
Out of all metrics, VMAF was found to show the highest correlation with the subjective scores.
Further, we evaluated a center-cropped version of VMAF ("VMAF-cc"), which performed similarly to the full VMAF.
In addition to the dataset and the objective metric evaluation, we propose two new video-quality prediction models, a bitstream meta-data-based model and a hybrid no-reference model using bitrate, resolution and pixel information of the video as input.
The new lightweight models perform similarly to the full-reference models while enabling fast calculations.
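The evaluation steps described in this abstract (averaging Absolute Category Rating scores into a mean opinion score, then correlating an objective metric with those scores) can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function names are hypothetical, the confidence interval uses a normal approximation, and the abstract does not state which correlation coefficients were reported, so Pearson and Spearman are shown as common choices.

```python
import math


def mos_with_ci(ratings):
    """Mean opinion score and approximate 95% CI half-width for ACR ratings."""
    n = len(ratings)
    mean = sum(ratings) / n
    variance = sum((r - mean) ** 2 for r in ratings) / (n - 1)
    # 1.96 is the normal-approximation quantile (an assumption; a Student-t
    # quantile would be more appropriate for small panels).
    return mean, 1.96 * math.sqrt(variance / n)


def pearson(x, y):
    """Pearson linear correlation coefficient between two sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical ratings per stimulus (1 = bad ... 5 = excellent) and
    # hypothetical objective metric scores for the same stimuli.
    ratings_per_stimulus = [[2, 1, 2, 3], [3, 3, 4, 3], [5, 4, 4, 5]]
    metric_scores = [35.0, 62.0, 91.0]
    mos = [mos_with_ci(r)[0] for r in ratings_per_stimulus]
    print("MOS:", mos)
    print("Pearson:", pearson(metric_scores, mos))
    print("Spearman:", spearman(metric_scores, mos))
```

Rank correlation only checks that the metric orders the stimuli as viewers do, while linear correlation also rewards a linear relationship between metric values and MOS, which is why both are often reported together.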
4:45pm - 5:00pm
Merging of MOS of Large Image Databases for No-reference Image Visual Quality Assessment
Tampere University, Finland
To train no-reference image visual quality metrics, large specialized image databases are used. For the images in these databases, mean opinion scores (MOS) are obtained experimentally by collecting the judgments of many observers. The MOS of a given image reflects the averaged human perception of its visual quality. Each database has its own unknown MOS scale, which depends on the unique content of the database. No-reference metrics based on convolutional networks are therefore usually trained on a single selected database, because all MOS values fed to the training loss function must be on the same scale. In this paper, a simple and effective method is proposed for merging several large databases into one by transforming their MOS values to a common scale. The accuracy of the proposed method is analyzed. The merged MOS values are used for practical training of a no-reference metric, and a comparative analysis shows that the training is more effective.
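To make the scale problem concrete, the sketch below pools two databases after a naive per-database min-max rescaling of their MOS values to the range [0, 1]. This is explicitly not the method proposed in the paper, which the abstract does not describe; it is only a simple baseline, and the database names, image identifiers, and scores are hypothetical.

```python
def rescale_to_unit(mos_values):
    """Linearly map a database's MOS values onto the range [0, 1]."""
    low, high = min(mos_values), max(mos_values)
    if high == low:
        raise ValueError("cannot rescale a database with constant MOS")
    return [(v - low) / (high - low) for v in mos_values]


def merge_databases(databases):
    """Pool several databases into one list of (database, image_id, mos).

    `databases` maps a database name to a list of (image_id, mos) pairs,
    each on that database's own scale.
    """
    merged = []
    for name, entries in databases.items():
        scaled = rescale_to_unit([mos for _, mos in entries])
        for (image_id, _), value in zip(entries, scaled):
            merged.append((name, image_id, value))
    return merged


if __name__ == "__main__":
    # Hypothetical databases: one rated on a 1..5 scale, one on 0..100.
    databases = {
        "db_a": [("a1", 1.2), ("a2", 3.1), ("a3", 4.8)],
        "db_b": [("b1", 12.0), ("b2", 55.0), ("b3", 97.0)],
    }
    for row in merge_databases(databases):
        print(row)
```

Min-max rescaling assumes that the worst and best images of every database are of equal quality, which ignores the content dependence of each scale noted in the abstract; that limitation is the kind of issue a dedicated merging method has to address.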
5:00pm - 5:15pm
Translation of Perceived Video Quality Across Displays
Google, United States of America
Display devices can significantly affect the perceived quality of a video. In this paper, we focus on the scenario where video resolution does not exceed screen resolution, and investigate the relationship between perceived video quality on mobile, laptop and TV. A novel transformation of Mean Opinion Scores (MOS) among different devices is proposed and is shown to be effective at normalizing ratings across user devices for in-lab and crowdsourced subjective studies. The model allows us to perform more focused in-lab subjective studies, as we can reduce the number of test devices, and helps us reduce noise during crowdsourced subjective video quality tests. It is also more effective than utilizing existing device-dependent objective