IVA1-O: Image and Video Analysis 1
8:30am - 8:45am
Leveraging Active Perception for Improving Embedding-based Deep Face Recognition
Aristotle University of Thessaloniki, Greece
Even though recent advances in deep learning (DL) led to tremendous improvements for various computer and robotic vision tasks, existing DL approaches suffer from a significant limitation: they typically ignore that robots and cyber-physical systems are capable of interacting with the environment in order to better sense their surroundings. In this work we argue that perceiving the world through physical interaction, i.e., employing active perception, allows for both increasing the accuracy of DL models, as well as for deploying smaller and faster models. To this end, we propose an active perception-based face recognition approach, which is capable of simultaneously extracting discriminative embeddings, as well as predicting in which direction the robot must move in order to get a more discriminative view. To the best of our knowledge, we provide the first embedding-based active perception method for deep face recognition. As we experimentally demonstrate, the proposed method can indeed lead to significant improvements, increasing the face recognition accuracy, as well as allowing for using overall smaller and faster models.
8:45am - 9:00am
MultiANet: a Multi-Attention Network for Defocus Blur Detection
1University of Electronic Science and Technology of China, China; 2Institute for Infocomm Research, A-STAR, Singapore; 3Sichuan Police College, China
Defocus blur detection is a challenging task because of obscure homogenous regions and interferences of background clutter. Most existing deep learning-based methods mainly focus on building wider or deeper networks to capture multi-level features, neglecting to extract the feature relationships of intermediate layers, thus hindering the discriminative ability of network. Moreover, fusing features at different levels have been demonstrated to be effective. However, direct integrating without distinction is not optimal because low-level features focus on fine details only and could be distracted by background clutters. To address these issues, we propose the Multi-Attention Network for stronger discriminative learning and spatial guided low-level feature learning. Specifically, a channel-wise attention module is applied to both high-level and low-level feature maps to capture channel-wise global dependencies. In addition, a spatial attention module is employed to low-level features maps to emphasize effective detailed information.
9:00am - 9:15am
Haze-robust image understanding via context-aware deep feature refinement
University of Electronic Science and Technology of China, China, People's Republic of
Image understanding under the foggy scene is greatly challenging due to inhomogeneous visibility deterioration. Although various image dehazing methods have been proposed, they usually aim to improve image visibility (such as, PSNR/SSIM) in the pixel space rather than the feature space, which is critical for the perception of computer vision. Due to this mismatch, existing dehazing methods are limited or even adverse in facilitating the foggy scene understanding. In this paper, we propose a generalized deep feature refinement module to minimize the difference between clear images and hazy images in the feature space. It is consistent with the computer perception and can be embedded into existing detection or segmentation backbones for joint optimization. Our feature refinement module is built upon the graph convolutional network, which is favorable in capturing the contextual information and beneficial for distinguishing different semantic objects. We validate our method on the detection and segmentation tasks under foggy scenes. Extensive experimental results show that our method outperforms the state-of-the-art dehazing based pretreatments and the fine-tuning results on hazy images.
9:15am - 9:30am
Iterative Nadaraya-Watson Distribution Transfer for Colour Grading
Trinity College Dublin, Ireland
We propose a new method with Nadaraya-Watson that maps one N-dimensional distribution to another taking into account available information about correspondences. We extend the 2D/3D problem to higher dimensions by encoding overlapping neighborhoods of data points and solve the high dimensional problem in 1D space using an iterative projection approach. To show potentials of this mapping, we apply it to colour transfer between two images that exhibit overlapped scene. Experiments show quantitative and qualitative improvements over previous state of the art colour transfer methods.