A Convolutional Neural Network Approach for Post-Processing in HEVC Intra Coding
Yuanying Dai, Dong Liu, Feng Wu
Univ Sci Tech China, China, People's Republic of
Lossy image and video compression algorithms produce visually annoying artifacts, including blocking, blurring, and ringing, especially at low bit-rates. Post-processing techniques have been studied extensively to reduce these artifacts. Recently, inspired by the great success of convolutional neural networks (CNNs) in computer vision, several studies have adopted CNNs for post-processing, mostly for JPEG-compressed images. In this paper, we present a CNN-based post-processing algorithm for High Efficiency Video Coding (HEVC), the state-of-the-art video coding standard. We redesign a Variable-filter-size Residue-learning CNN (VRCNN) to improve the performance and to accelerate network training. Experimental results show that using our VRCNN for post-processing yields an average bit-rate reduction of 4.6% compared to the HEVC baseline. The VRCNN outperforms previously studied networks, achieving higher bit-rate reduction, lower memory cost, and a multifold computational speedup.
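As a rough illustration of the two ideas named in the abstract (parallel filters of different sizes, and learning a residue that is added back to the decoded signal), the following is a minimal 1D sketch. It is not the authors' VRCNN: the function names, the summing of the two branches, and the hand-picked kernels are assumptions made for illustration, and a real network would learn its weights and operate on 2D frames.

```python
def conv1d_same(x, kernel):
    """Zero-padded 'same' 1D correlation with an odd-length kernel."""
    half = len(kernel) // 2
    out = []
    for i in range(len(x)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + j - half
            if 0 <= idx < len(x):
                acc += w * x[idx]
        out.append(acc)
    return out


def variable_size_residual_block(decoded, kernel_small, kernel_large):
    """Hypothetical block: two branches with different filter sizes
    predict a residue, which is added back to the decoded input."""
    branch_small = conv1d_same(decoded, kernel_small)
    branch_large = conv1d_same(decoded, kernel_large)
    residue = [a + b for a, b in zip(branch_small, branch_large)]
    return [d + r for d, r in zip(decoded, residue)]


# A decoded row with a sharp step, standing in for a blocking artifact.
decoded = [10.0, 10.0, 10.0, 20.0, 20.0, 20.0]

# With all-zero filters the predicted residue is zero, so the block
# reduces to the identity mapping.
unchanged = variable_size_residual_block(decoded, [0.0] * 3, [0.0] * 5)

# Hand-picked (not learned) size-3 filter whose residue softens the step.
smoothed = variable_size_residual_block(
    decoded, [0.25, -0.5, 0.25], [0.0] * 5
)
```

Because the block only has to model the difference between the decoded and the original signal, an untrained (all-zero) block already passes its input through unchanged, which is one common motivation for residue learning.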
11:10am - 11:30am
Phase Fourier Reconstruction for Anomaly Detection on Metal Surface Using Salient Irregularity
Tzu-Yi Hung1, Sriram Vaikundam1, Vidhya Natarajan1, Liang Tien Chia2
1Rolls-Royce@NTU Corporate Lab, Singapore; 2School of Computer Science and Engineering, Nanyang Technological University (NTU), Singapore
In this paper, we propose a phase Fourier reconstruction (PFR) approach for anomaly detection using salient irregularity. Existing phase-based methods usually work on texture-based images and assume that the majority of pixels belong to normal, patterned regions. Instead, our PFR utilizes salient irregularities in a single non-texture-based dark image. In such an image, surface details, component design, and boundaries between foreground and background become indistinct, and anomaly regions are highlighted because of diffuse reflection caused by rough surfaces. Moreover, unlike existing template matching methods, which require prior knowledge, our PFR framework is an unsupervised approach that automatically de-emphasizes regular patterns and homogeneous regions while emphasizing salient regions. Experimental results on anomaly detection clearly demonstrate the effectiveness of the proposed method, which outperforms several well-designed methods with a running time of less than 0.01 seconds per image.
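The general idea behind phase-only Fourier reconstruction can be sketched in one dimension: discard the magnitude spectrum, keep only the phase, and transform back. Regular patterns are then suppressed while an isolated irregularity stands out. The sketch below is a generic illustration of that principle using a naive DFT, not the authors' PFR pipeline; the signal, the anomaly position, and all function names are assumptions.

```python
import cmath
import math


def dft(x):
    """Naive O(n^2) discrete Fourier transform."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive O(n^2) inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def phase_only_reconstruction(signal, eps=1e-12):
    """Normalize every frequency bin to unit magnitude (keeping its phase),
    invert the transform, and return the magnitude of the result."""
    spectrum = dft(signal)
    unit_spectrum = [c / max(abs(c), eps) for c in spectrum]
    return [abs(v) for v in idft(unit_spectrum)]


N = 32
ANOMALY_AT = 13  # hypothetical defect position

# Regular pattern (a sinusoid) plus a small local irregularity.
signal = [math.sin(2 * math.pi * 4 * t / N) for t in range(N)]
signal[ANOMALY_AT] += 0.5

response = phase_only_reconstruction(signal)
peak = max(range(N), key=lambda t: response[t])
```

In the raw signal the irregularity is smaller than the pattern amplitude, yet `peak` lands on `ANOMALY_AT` because the sinusoid occupies only two frequency bins once all magnitudes are flattened. On images the same steps would use a 2D transform.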
11:30am - 11:50am
Model-Based 3D Scene Reconstruction Using a Moving RGB-D Camera
This paper presents a scalable model-based approach for 3D scene reconstruction using a moving RGB-D camera. The proposed approach enhances the accuracy of pose estimation by exploiting the rich information in the multi-channel RGB-D image data. Our approach offers many advantages in the reconstruction quality of the 3D scene over conventional approaches that use sparse features for pose estimation. The pre-learned image-based 3D model provides multiple templates for sampled views of the model, which are used to estimate the poses of the frames in the input RGB-D video without requiring a priori internal and external camera parameters. Through template-to-frame registration, the reconstructed 3D scene can be loaded in an augmented reality (AR) environment to facilitate displaying, interaction, and rendering of an image-based AR application. Finally, we verify the established reconstruction system on publicly available benchmark datasets and compare it with state-of-the-art pose estimation algorithms. The results indicate that our approach outperforms the compared methods in pose estimation accuracy.
11:50am - 12:10pm
Joint Face Detection and Initialization for Face Alignment
Zhiwei Wang, Xin Yang
Huazhong University of Science and Technology, China, People's Republic of
This paper presents a joint face detection and initialization method for cascaded face alignment. Unlike existing methods, which treat face detection and initialization as separate steps, we obtain a bounding box and initial facial landmarks (i.e., shape) concurrently in one step, yielding better accuracy and efficiency. Specifically, each image region is represented using shape-indexed features derived from different head poses. A multipose face detector is trained: regions whose shapes are roughly aligned with faces have a good feature representation and are used as positive samples; all other regions are treated as negative samples. During the face detection phase, initial landmarks can be explicitly placed on the detected faces according to the corresponding shape-indexed features. To accelerate our method, we use an ultrafast face proposal method based on a face probability map (FPM) and boosted classifiers. Experimental results on public datasets demonstrate superior efficiency and robustness compared with existing initialization schemes, and a large accuracy improvement for cascaded face alignment.
12:10pm - 12:30pm
Learning Features Robust to Image Variations with Siamese Networks for Facial Expression Recognition
Wissam Baddar, Daehoe Kim, Yong Man Ro
KAIST, Korea, Republic of (South Korea)
This paper proposes a computationally efficient method for learning features robust to image variations for facial expression recognition (FER). The proposed method minimizes the feature difference between an image subject to some image variation and a corresponding target image with the best image conditions for FER (i.e., a frontal face image with uniform illumination). This is achieved by regulating the objective function during the learning process, in which a Siamese network is employed. At the test stage, the learned network parameters are transferred to a convolutional neural network (CNN), with which features robust to image variations can be obtained. Experiments have been conducted on the Multi-PIE dataset to evaluate the proposed method under a large number of variations, including pose and illumination. The results show that the proposed method improves FER performance under different variations without requiring extra computational complexity.
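One way to read "regulating the objective function" is as a classification loss plus a term that penalizes the feature difference between the two Siamese branches. The abstract does not give the exact loss, so the sketch below is an assumption throughout: the mean squared feature distance, the `weight` parameter, and every function name are hypothetical.

```python
import math


def softmax_cross_entropy(logits, label):
    """Numerically stable cross-entropy of softmax(logits) against label."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum_exp - logits[label]


def feature_distance(features_a, features_b):
    """Mean squared difference between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(features_a, features_b)) / len(features_a)


def siamese_objective(feat_varied, feat_target, logits, label, weight=1.0):
    """Hypothetical training objective: expression classification loss
    plus a weighted penalty pulling the varied-image features toward
    the features of the ideal-condition target image."""
    classification = softmax_cross_entropy(logits, label)
    return classification + weight * feature_distance(feat_varied, feat_target)


logits = [2.0, 0.0]                 # made-up class scores
target_features = [0.2, 0.8, 0.5]   # made-up features of the frontal image

# Features that match the target add no penalty; drifted features do.
aligned = siamese_objective([0.2, 0.8, 0.5], target_features, logits, label=0)
drifted = siamese_objective([0.9, 0.1, 0.5], target_features, logits, label=0)
```

Since the penalty is only applied during training, a single branch with the learned parameters can be used at test time, which is consistent with the abstract's claim of no extra computational cost.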