1:30pm - 1:50pm
Multi-Attribute based Fire Detection in Diverse Surveillance Videos
Beijing University of Posts and Telecommunications, People's Republic of China
Fire detection, as an immediate response to fire accidents that can avert major disasters, has attracted considerable research attention. However, existing methods cannot effectively exploit the comprehensive attributes of fire and thus fail to deliver satisfactory accuracy. In this paper, we design a novel Multi-Attribute based Fire Detection System that combines fire's color, geometric, and motion attributes to accurately detect fire in complicated surveillance videos. For the geometric attribute, novel contour-moment and line-detection based descriptors are proposed to represent the variation of fire's shape. Furthermore, to exploit fire's instantaneous motion characteristics, we design a novel dense-optical-flow based descriptor as the motion attribute. Finally, we build a fire detection video dataset as a benchmark, containing 305 fire and non-fire videos, including 135 very challenging negative samples. Experimental results on this benchmark demonstrate that the proposed approach greatly outperforms the state of the art, with 92.30% accuracy and only 8.33% false positives.
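The abstract does not specify how the color attribute is computed, but color-based fire detectors commonly start from simple channel-ordering rules on RGB pixels. The following is a generic illustrative sketch of such a rule; the function name and thresholds are assumptions, not values from the paper.

```python
def is_fire_colored(r, g, b, r_min=190, diff_min=20):
    """Rule-of-thumb RGB test for flame-colored pixels: red dominates
    green, green dominates blue, and red is bright. The thresholds are
    illustrative assumptions, not the paper's values."""
    return r > r_min and r > g > b and (r - g) > diff_min

print(is_fire_colored(230, 140, 40))   # bright orange pixel -> True
print(is_fire_colored(120, 120, 120))  # gray wall pixel -> False
```

A per-pixel rule like this only gives candidate regions; the point of a multi-attribute system is to confirm those candidates with geometric and motion cues.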
1:50pm - 2:10pm
A Structural Coupled-layer Tracking Method Based on Correlation Filters
1CAS Key Laboratory of Electromagnetic Space Information, University of Science and Technology of China, Hefei, China; 2University at Buffalo, State University of New York, Dept. of Computer Science & Engineering, USA
A recent trend in visual tracking is to employ correlation filter based formulations for their high efficiency and superior performance. To deal with partial occlusion, part-based methods built on correlation filters have been introduced to visual tracking and have achieved promising results. However, these methods ignore the intrinsic relationships among local parts and do not consider the spatial structure inside the target. In this paper, we propose a coupled-layer tracking method based on correlation filters that resolves this problem by incorporating structural constraints between the global bounding box and local parts. In our method, the target state is optimized jointly in a unified objective function that takes into account both the appearance information of all parts and the structural constraints between them. In this way, our method not only retains the advantages of existing correlation filter trackers, such as high efficiency and robustness, and handles partial occlusion well thanks to the part-based strategy, but also preserves the object's structure. Experimental results on a challenging benchmark dataset demonstrate that our proposed method outperforms state-of-the-art trackers.
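For readers unfamiliar with the correlation filter idea the abstract builds on: the filter is correlated against a search region, and the peak of the response map locates the target. This toy 1-D sketch illustrates only that basic mechanism, not the paper's coupled-layer formulation; the signal, template, and function name are all made up for illustration.

```python
def correlation_response(search, template):
    """Circular cross-correlation of a 1-D search signal with a learned
    template; the peak of the response map estimates the target's
    location. (Real correlation filter trackers compute this in the
    Fourier domain for speed; this toy version works in the spatial
    domain.)"""
    n = len(search)
    return [sum(search[(i + j) % n] * template[j]
                for j in range(len(template)))
            for i in range(n)]

signal = [0, 0, 1, 3, 1, 0, 0, 0]   # target appearance around index 3
template = [1, 3, 1]                # learned appearance filter
resp = correlation_response(signal, template)
print(resp.index(max(resp)))        # -> 2, where the template aligns
```

A part-based tracker runs one such filter per part and, as in this paper, couples the per-part peaks through structural constraints rather than trusting each peak independently.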
2:10pm - 2:30pm
Robust Visual Tracking based on Multi-channel Compressive Features
BIT, People's Republic of China
Tracking-by-detection approaches show good performance in visual tracking; they typically train discriminative classifiers to separate the tracking target from its surrounding background. An effective and efficient image feature plays an important role in realizing an outstanding tracker: a good feature separates the tracked object from the background more easily. Moreover, the feature should effectively adapt to many confounding factors such as illumination changes, appearance changes, shape variations, and partial or full occlusions. To this end, we present a novel multi-channel compressive feature, which combines rich information from multiple channels and then projects it into a low-dimensional compressive feature space. We then design a new visual tracker based on this multi-channel compressive feature. Finally, extensive comparative experiments conducted on a series of challenging sequences demonstrate that our tracker outperforms most state-of-the-art tracking approaches, which also confirms the effectiveness of the multi-channel compressive feature.
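Compressive features of this kind are usually obtained by multiplying the high-dimensional feature vector with a very sparse random matrix, which approximately preserves distances. The sketch below shows that generic mechanism; the sparsity pattern, dimensions, and function names are illustrative assumptions, not the paper's exact construction.

```python
import random

def random_projection_matrix(n_features, n_compressed, seed=0):
    """Very sparse random matrix with entries in {-1, 0, +1}, in the
    spirit of compressive sensing: the projection x' = R x roughly
    preserves distances between feature vectors."""
    rng = random.Random(seed)

    def entry():
        u = rng.random()
        return 1 if u < 1 / 6 else (-1 if u > 5 / 6 else 0)

    return [[entry() for _ in range(n_features)]
            for _ in range(n_compressed)]

def compress(R, x):
    """Project a high-dimensional feature vector x into the
    low-dimensional compressive space."""
    return [sum(r * v for r, v in zip(row, x)) for row in R]

# Concatenated multi-channel features (e.g. color + gradient bins)
x = [0.2, 0.9, 0.1, 0.4, 0.7, 0.3, 0.8, 0.5]
R = random_projection_matrix(len(x), 3)
print(compress(R, x))  # a 3-dimensional compressive feature
```

Because R is fixed after initialization, the projection costs only a few additions per compressed dimension, which is what makes compressive features attractive for real-time tracking.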
2:30pm - 2:50pm
A Comparison of Approaches for Automated Text Extraction from Scholarly Figures
1Christian-Albrechts-Universität Kiel, Germany; 2ZBW - Leibniz Information Centre for Economics, Germany
So far, there has not been a comparative evaluation of different approaches for text extraction from scholarly figures. In order to fill this gap, we have defined a generic pipeline for text extraction that abstracts from the existing approaches as documented in the literature. In this paper, we use this generic pipeline to systematically evaluate and compare 32 configurations for text extraction over four datasets of scholarly figures of different origin and characteristics. In total, our experiments have been run over more than 400 manually labeled figures. The experimental results show that the approach BS-4OS results in the best F-measure of 0.67 for the Text Location Detection and the best average Levenshtein Distance of 4.71 between the recognized text and the gold standard on all four datasets using the Ocropy OCR engine.
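For reference, the Levenshtein distance the evaluation reports (average 4.71 between recognized and gold-standard text) is the standard edit distance; a minimal dynamic-programming implementation looks like this:

```python
def levenshtein(a, b):
    """Standard dynamic-programming edit distance (insertions,
    deletions, substitutions) between two strings, as used to score
    recognized text against a gold standard."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

print(levenshtein("kitten", "sitting"))  # -> 3
```

An average distance of 4.71 thus means the OCR output for a figure differs from the gold standard by roughly five character edits.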
2:50pm - 3:10pm
Structure-aware Image Resizing for Chinese Characters
Institute of Computer Science and Technology, Peking University, No.128, Zhongguancun Street, Haidian District, Beijing, China
This paper presents a structure-aware resizing method for Chinese character images. Compared to other image resizing approaches, the proposed method is able to preserve important features such as the width, orientation, and trajectory of each stroke of a given Chinese character. The key idea of our method is to first automatically decompose the character image into strokes, and then resize those strokes separately and naturally using a modified linear blend skinning approach and as-rigid-as-possible shape interpolation under the guidance of structure information. Experimental results not only verify the superiority of our method over the state of the art but also demonstrate its effectiveness in several real applications.
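The abstract mentions a modified linear blend skinning step without detailing the modification. As background only, standard linear blend skinning deforms each point as a weighted combination of per-bone rigid transforms; the 2-D sketch below illustrates that baseline with made-up weights and transforms.

```python
import math

def lbs_transform(point, bones):
    """Linear blend skinning in 2-D: a point is deformed as a weighted
    combination of per-bone rigid transforms (rotation angle theta plus
    a translation). Weights and transforms here are illustrative."""
    x, y = point
    out_x = out_y = 0.0
    for w, theta, (tx, ty) in bones:
        c, s = math.cos(theta), math.sin(theta)
        out_x += w * (c * x - s * y + tx)
        out_y += w * (s * x + c * y + ty)
    return out_x, out_y

# One identity bone and one 90-degree-rotated bone, equal weights
bones = [(0.5, 0.0, (0.0, 0.0)), (0.5, math.pi / 2, (0.0, 0.0))]
print(lbs_transform((1.0, 0.0), bones))  # blends (1,0) and (0,1) -> ~(0.5, 0.5)
```

Treating each stroke's skeleton as a "bone" in this sense is what lets a stroke stretch or bend during resizing while its width and local rigidity are preserved.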
3:10pm - 3:30pm
Robust Scene Text Detection for Multi-script Languages using Deep Learning
1National Key Lab for Novel Software Technology, Nanjing University, Nanjing, China; 2Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia
Text detection in natural images is in high demand for many real-life applications such as image retrieval and self-navigation. This work addresses the problem of robust text detection in natural scene images, especially for multi-script languages. Previous works treat multi-script characters as groups of text fragments; in contrast, we treat these scripts as non-connected components. We first propose a novel representation named Linked Extremal Regions (LER) to extract full characters instead of fragments from scene text. Second, we propose a two-stage convolutional neural network framework to handle the difficulty of discriminating multi-script text from clutter in complex backgrounds. Finally, we build a robust and effective multi-script text detection technique. The proposed method is effective not only for multi-script text but also for poorly connected characters. Experimental results on three well-known datasets, ICDAR 2011, ICDAR 2013, and MSRA-TD500, demonstrate that our method achieves state-of-the-art performance in scene text detection.