DLM3-P: Deep Learning for Multimedia 3
5:25pm - 5:30pm
Generalized Operational Classifiers for Material Identification
1Northwestern Polytechnical University, China, People's Republic of; 2Tampere University, Finland; 3Qatar University, Qatar
Material is one of the intrinsic features of objects, and consequently material recognition plays an important role in image understanding. The same material may have various shapes and appearances while keeping the same physical characteristics. This poses great challenges for material recognition. Besides suitable features, a powerful classifier can also improve the overall recognition performance. Due to the limitations of the classical linear neurons used in all shallow and deep neural networks, such as CNNs, we propose to apply generalized operational neurons to construct a classifier adaptively. These generalized operational perceptrons (GOP) contain a set of linear and nonlinear neurons, and possess a structure that can be built progressively. This makes the GOP classifier more compact and able to discriminate complex classes easily. The experiments demonstrate that GOP networks trained on a small portion of the data (4%) can achieve performance comparable to state-of-the-art models trained on much larger portions of the dataset.
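To illustrate what a neuron that is not restricted to the linear model might look like, here is a minimal sketch of a single operational neuron. It assumes the neuron is composed of a per-input "nodal" operator, a "pool" operator that aggregates the results, and an activation; the operator names and the small operator sets below are hypothetical choices for illustration, not the set used by the authors. With multiplication, summation, and a linear activation, the sketch reduces to the classical linear neuron.

```python
import numpy as np

# Hypothetical operator sets, chosen only for illustration.
NODAL = {
    "multiplication": lambda w, y: w * y,            # classical linear term
    "exponential": lambda w, y: np.exp(w * y) - 1.0,
    "sinusoid": lambda w, y: np.sin(w * y),
}
POOL = {
    "summation": lambda z: z.sum(axis=-1),           # classical aggregation
    "maximum": lambda z: z.max(axis=-1),
    "median": lambda z: np.median(z, axis=-1),
}
ACTIVATION = {
    "tanh": np.tanh,
    "linear": lambda s: s,
}

def gop_neuron(y, w, b, nodal="multiplication", pool="summation", act="tanh"):
    """Apply the nodal operator per input, pool the results, add bias, activate."""
    z = NODAL[nodal](w, y)      # element-wise combination of weights and inputs
    s = POOL[pool](z) + b       # aggregate over the inputs and add the bias
    return ACTIVATION[act](s)

y = np.array([1.0, 2.0, 3.0])
w = np.array([0.5, -1.0, 2.0])

linear_out = gop_neuron(y, w, b=0.1, act="linear")                  # w.y + b
max_out = gop_neuron(y, w, b=0.1, pool="maximum", act="linear")     # max(w*y) + b
```

Searching over such operator choices per layer is one way a network could be built progressively, although the search procedure itself is not described in the abstract.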
5:30pm - 5:35pm
Few-Shot Object Detection in Real Life: Case Study on Auto-Harvest
1University of Nantes; 2CAPACITÉS, France; 3CTIFL
Confinement during COVID-19 has seriously affected agriculture all over the world. As one efficient solution, mechanical harvesting (auto-harvest) based on object detection and robotic harvesters has become an urgent need. Within the auto-harvest system, a robust few-shot object detection model is one of the bottlenecks, since the system is required to deal with new vegetable/fruit categories and the collection of large-scale annotated datasets for all the novel categories is expensive. Many few-shot object detection models have been developed by the community. Yet, whether they can be employed directly for real-life agricultural applications is still questionable, as there is a context gap between the commonly used training datasets and the images collected in real-life agricultural scenarios. To this end, in this study, we present a novel cucumber dataset and propose two data augmentation strategies that help to bridge the context gap. Experimental results show that 1) the state-of-the-art few-shot object detection model performs poorly on the novel ‘cucumber’ category; and 2) the proposed augmentation strategies outperform the commonly used ones.
5:35pm - 5:40pm
Auto-Encoder based Structured Dictionary Learning
1Zhengzhou University, China, People's Republic of; 2Hefei University of Technology, China, People's Republic of; 3The University of Queensland, Australia; 4Motovis Australia Pty Ltd Adelaide, Australia
Dictionary learning and deep learning are two popular representation learning paradigms, which can be combined to boost classification performance. However, existing combination methods often learn multiple dictionaries embedded in a cascade of layers, together with a specialized classifier. This may inadvertently lead to overfitting and high computational cost. In this paper, we present a deep auto-encoding architecture coupled with a dictionary layer that directly produces a dictionary for classification. To make the dictionary discriminative, we construct it from class-specific sub-dictionaries and introduce supervision by imposing category constraints. The proposed framework is inspired by a sparse optimization method, namely the Iterative Shrinkage Thresholding Algorithm (ISTA), which characterizes the learning process by forward-propagation-based optimization w.r.t. the dictionary only. Extensive experiments demonstrate the effectiveness of our method in image classification.
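For readers unfamiliar with the Iterative Shrinkage Thresholding Algorithm mentioned in the abstract above, the following is a minimal sketch of textbook ISTA for sparse coding with a fixed dictionary. It is not the authors' architecture: the function names, the regularization weight `lam`, and the random test dictionary are assumptions made for illustration only. Each iteration is a gradient step on the reconstruction error followed by a soft-thresholding step, which is the structure that makes ISTA easy to unroll into network layers.

```python
import numpy as np

def soft_threshold(v, tau):
    """Element-wise shrinkage: sign(v) * max(|v| - tau, 0)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def objective(D, x, z, lam):
    """Lasso objective: 0.5 * ||x - D z||^2 + lam * ||z||_1."""
    return 0.5 * np.sum((x - D @ z) ** 2) + lam * np.sum(np.abs(z))

def ista(D, x, lam=0.1, n_iter=200):
    """Approximately minimize the lasso objective over z for a fixed dictionary D."""
    # Step size 1/L, with L the largest eigenvalue of D^T D
    # (the Lipschitz constant of the gradient of the quadratic term).
    L = np.linalg.norm(D, 2) ** 2
    z = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ z - x)                      # gradient of the quadratic term
        z = soft_threshold(z - grad / L, lam / L)     # shrinkage step
    return z

# Illustrative data (assumed): a random dictionary with unit-norm atoms
# and a signal built from three of them.
rng = np.random.default_rng(0)
D = rng.standard_normal((20, 50))
D /= np.linalg.norm(D, axis=0)
z_true = np.zeros(50)
z_true[[3, 17, 42]] = [1.5, -2.0, 1.0]
x = D @ z_true

z_hat = ista(D, x, lam=0.1)
```

With the step size set to `1/L`, the objective does not increase from one iteration to the next, so `z_hat` is never worse than the all-zero starting point.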