Conference Agenda

Session
IVA2-P: Image and Video Analysis 2
Time:
Tuesday, 22/Sept/2020:
11:05am - 11:30am

Session Chair: Luca Magri
Location: Virtual platform

Presentations
11:05am - 11:10am

Eye Movement State Trajectory Estimator based on Ancestor Sampling

Sai Phani Kumar Malladi1, Jayanta Mukhopadhyay1, Mohamed-Chaker Larabi2, Santanu Chaudhury3

1Indian Institute of Technology Kharagpur, India; 2University of Poitiers, France; 3Indian Institute of Technology Jodhpur, India

Human gaze dynamics mainly concern the order in which three eye movements occur, namely fixations, saccades, and microsaccades. In this paper, we treat them as three different states and relate these states to eye-movement velocities. We build a state trajectory estimator based on ancestor sampling (STEAS), a model that captures the features of the human temporal gaze pattern to identify the kind of visual stimuli. We used a gaze dataset of 72 viewers watching 60 video clips, which are equally split into four visual categories. Uniformly sampled velocity vectors from the training set are used to find the most suitable parameters of the proposed statistical model. Then, the optimized model is used for both gaze data classification and video retrieval on the test set. We observed a classification accuracy of 93.265% and a mean reciprocal rank of 0.888 for video retrieval on the test set. Hence, this model can be used for viewer-independent video indexing, giving viewers an easier way to navigate through the contents.
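For readers unfamiliar with the retrieval metric quoted above, the following is a minimal sketch of how a mean reciprocal rank is typically computed. It is not taken from the paper: the function name, the clip identifiers, and the example rankings are all hypothetical, and it assumes exactly one relevant clip per query.

```python
from typing import Hashable, Sequence


def mean_reciprocal_rank(
    rankings: Sequence[Sequence[Hashable]],
    relevant: Sequence[Hashable],
) -> float:
    """Average of 1/rank of the relevant item per query (0 if it is absent)."""
    if len(rankings) != len(relevant):
        raise ValueError("one relevant item is required per ranking")
    if not rankings:
        raise ValueError("at least one query is required")
    total = 0.0
    for ranked, target in zip(rankings, relevant):
        for rank, item in enumerate(ranked, start=1):
            if item == target:
                total += 1.0 / rank
                break
    return total / len(rankings)


# Hypothetical retrieval results for three gaze queries (not data from the paper).
rankings = [
    ["clip_a", "clip_b"],
    ["clip_b", "clip_a"],
    ["clip_c", "clip_a", "clip_b"],
]
relevant = ["clip_a", "clip_a", "clip_b"]
mrr = mean_reciprocal_rank(rankings, relevant)
```

In this toy case the relevant clip appears at ranks 1, 2, and 3, so the score is the mean of 1, 1/2, and 1/3. A value close to 1 means the correct clip is usually ranked first.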

Malladi-Eye Movement State Trajectory Estimator based on Ancestor Sampling-171.pdf


11:10am - 11:15am

Online Multiple Object Tracking Using Single Object Tracker and Maximum Weight Clique Graph

Yujie Hu, Xiang Zhang, Yexin Li, Ran Tian

University of Electronic Science and Technology of China, People's Republic of China

Tracking multiple objects is a challenging task in time-critical video analysis systems. In the popular tracking-by-detection framework, the core problems of a tracker are the quality of the employed input detections and the effectiveness of the data association. To this end, we propose a multiple object tracking method that employs a single object tracker to improve unreliable detections and the data association simultaneously. In addition, we use a maximum weight clique graph algorithm to handle the optimal assignment in an online mode. In our method, a robust single object tracker is used to connect previously tracked objects, both to cope with noisy detections in the current frame and to improve the data association as a motion cue. Furthermore, we use a person re-identification network to learn the historical appearances of the tracklets in order to improve the tracker's identification ability. Extensive experiments show that our tracker achieves state-of-the-art performance on the MOT benchmark.
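As an illustration of how an assignment problem can be cast as a maximum weight clique search, here is a small sketch under stated assumptions. The abstract does not describe the graph construction, so the formulation below (one node per candidate track-detection pair, an edge between pairs that share neither a track nor a detection) is an assumption, and the scores, identifiers, and function names are hypothetical. The search is exhaustive and only suited to toy inputs.

```python
from itertools import combinations

# Hypothetical association scores: (track_id, detection_id) -> affinity.
scores = {
    ("t1", "d1"): 0.9,
    ("t1", "d2"): 0.2,
    ("t2", "d1"): 0.6,
    ("t2", "d2"): 0.7,
}


def compatible(a, b):
    """Two pairs can coexist if they share neither a track nor a detection."""
    return a[0] != b[0] and a[1] != b[1]


def max_weight_clique(weights):
    """Exhaustive search; only practical for a handful of candidate pairs."""
    nodes = list(weights)
    best, best_weight = (), 0.0
    for size in range(1, len(nodes) + 1):
        for subset in combinations(nodes, size):
            # A subset is a clique when every pair of its nodes is compatible.
            if all(compatible(a, b) for a, b in combinations(subset, 2)):
                weight = sum(weights[n] for n in subset)
                if weight > best_weight:
                    best, best_weight = subset, weight
    return set(best), best_weight


assignment, total = max_weight_clique(scores)
```

Here the clique pairing t1 with d1 and t2 with d2 wins because its summed affinity exceeds that of the alternative pairing. A real-time tracker would need a far more efficient solver than this brute-force loop.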

Hu-Online Multiple Object Tracking Using Single Object Tracker and Maximum Weight Clique Graph-101.pdf


11:15am - 11:20am

Automated Genre Classification for Gaming Videos

Steve Göring, Robert Steger, Rakesh Rao Ramachandra Rao, Alexander Raake

TU Ilmenau, Germany

Besides classical videos, gaming matches, tournaments, and sessions are streamed and viewed all over the world. The increased popularity of Twitch and YouTube Gaming shows the importance of additional research on gaming videos. One important precondition for live or offline encoding of gaming videos is knowledge of game-specific properties. Knowing or automatically predicting the genre of a gaming video enables a more advanced and optimized encoding pipeline for streaming providers, especially because gaming videos of different genres vary a lot from classical 2D video, e.g., in their CGI content, textures, or camera motion. We describe several computer-vision-based features, optimized for speed and motivated by characteristics of popular games, to automatically predict the genre of a gaming video. Our prediction system uses random forests and gradient boosting trees as underlying machine learning approaches, combined with feature selection. To evaluate our approach, we use a dataset that was built as part of this work and consists of recorded gaming sessions for 6 genres from Twitch. In total, 351 different videos are considered. We show that our prediction approach achieves good performance in terms of F1-score. Besides evaluating different machine learning approaches, we additionally investigate the influence of the hyperparameters of the algorithms.
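The F1-score mentioned above can be computed per genre and then averaged. The sketch below shows one common variant, the macro average. The abstract does not say which averaging the authors used, so that choice is an assumption, and the genre labels and predictions are made up for illustration.

```python
from typing import Dict, Hashable, Sequence


def per_class_f1(
    y_true: Sequence[Hashable], y_pred: Sequence[Hashable]
) -> Dict[Hashable, float]:
    """F1 = 2*TP / (2*TP + FP + FN), computed separately for each label."""
    scores = {}
    for label in set(y_true) | set(y_pred):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


# Hypothetical genre labels and predictions (not data from the paper).
y_true = ["fps", "fps", "moba", "moba", "racing", "racing"]
y_pred = ["fps", "moba", "moba", "moba", "racing", "fps"]
f1 = per_class_f1(y_true, y_pred)
macro_f1 = sum(f1.values()) / len(f1)  # unweighted mean over genres
```

Macro averaging weights every genre equally, which matters when some genres have fewer videos than others.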

Göring-Automated Genre Classification for Gaming Videos-212.pdf


11:20am - 11:25am

Use of a deep convolutional neural network to diagnose disease in the rose by means of a photographic image

Oleg Miloserdov1, Nadezhda Ovcharenko2, Andrey Makarenko1

1Institute of Control Sciences of Russian Academy of Sciences; 2Institute of Agriculture of Crimea

The article presents particulars of developing a plant disease detection system based on analysis of photographic images by deep convolutional neural networks. An original lightweight neural network architecture is used (only 13480 trained parameters) that is tens to hundreds of times more compact than typical solutions. Real-life field data is used for training and testing, with photographs taken in adverse conditions: variation in hardware quality, angles, lighting conditions, scales (from macro shots of individual fragments of leaf and stem to several rose bushes in one picture), and complex disorienting backgrounds. An adaptive decision-making rule, based on Bayes' theorem and Wald's sequential probability ratio test, is used to improve the reliability of the results. The following example is provided: detection of disease on leaves and stems of roses from images taken in the visible spectrum. The authors were able to attain a quality of 90.6% on real-life data (F1-score, one input image, test dataset).
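To make the sequential decision rule more concrete, here is a minimal sketch of Wald's sequential probability ratio test applied to a stream of per-image classifier outputs. It is not the authors' implementation: it assumes equal class priors (so that p / (1 - p) can stand in for the likelihood ratio via Bayes' theorem), conditionally independent images, and hypothetical error rates and probabilities.

```python
import math
from typing import Iterable, Tuple


def sprt(
    probs: Iterable[float], alpha: float = 0.05, beta: float = 0.05
) -> Tuple[str, int]:
    """Accumulate log-likelihood ratios until one of Wald's thresholds is crossed.

    probs: per-image posteriors P(diseased | image) from some classifier.
    alpha, beta: target false-positive and false-negative rates (assumed values).
    """
    upper = math.log((1 - beta) / alpha)  # decide "diseased" at or above this
    lower = math.log(beta / (1 - alpha))  # decide "healthy" at or below this
    llr, used = 0.0, 0
    for p in probs:
        p = min(max(p, 1e-6), 1 - 1e-6)  # clamp to avoid log(0)
        llr += math.log(p / (1 - p))
        used += 1
        if llr >= upper:
            return "diseased", used
        if llr <= lower:
            return "healthy", used
    return "undecided", used


# Hypothetical classifier outputs for three photographs of the same plant.
decision, n_images = sprt([0.8, 0.7, 0.9])
```

The appeal of this rule is that confident evidence ends the test early, while ambiguous images simply prompt a request for another photograph.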

Miloserdov-Use of a deep convolutional neural network to diagnose disease-181.pdf


11:25am - 11:30am

On Verification of Blur and Sharpness Metrics for No-reference Image Visual Quality Assessment

Sheyda Ghanbaralizadeh Bahnemiri, Mykola Ponomarenko, Karen Egiazarian

Tampere University, Finland

Natural images contain regions with different levels of blur, which affect image visual quality. Images are often processed to decrease the level of blur and to increase image sharpness. No-reference image visual quality metrics should be able to account effectively for the blur/sharpness level of a given image. To verify this ability, a new large test image database is proposed in the paper. An extensive comparative analysis of no-reference metrics is carried out. A new convolutional neural network for blind estimation of the difference between a blurred/sharpened image and a true source image is proposed. It is shown that the proposed network achieves the best performance on the new test set among all considered metrics in both estimation of blur level and estimation of sharpness level.

Ghanbaralizadeh Bahnemiri-On Verification of Blur and Sharpness Metrics for No-reference Image Visual Quality.pdf