Conference Agenda

Overview and details of the sessions of this conference. Select a date or location to show only the sessions on that day or at that location. Select a single session for a detailed view (with abstracts and downloads, if available).

Session Overview
Session 4A: Multimedia Features and Encoding
Thursday, 05/Jan/2017:
10:50am - 12:30pm

Session Chair: Jakub Lokoc
Location: V101
1st floor, 1st room on left.

10:50am - 11:10am

A Convolutional Neural Network Approach for Post-Processing in HEVC Intra Coding

Yuanying Dai, Dong Liu, Feng Wu

University of Science and Technology of China, People's Republic of China

Lossy image and video compression algorithms yield visually annoying artifacts, including blocking, blurring, and ringing, especially at low bit-rates. To reduce these artifacts, post-processing techniques have been extensively studied. Recently, inspired by the great success of convolutional neural networks (CNNs) in computer vision, several studies have applied CNNs to post-processing, mostly for JPEG-compressed images. In this paper, we present a CNN-based post-processing algorithm for High Efficiency Video Coding (HEVC), the state-of-the-art video coding standard. We design a Variable-filter-size Residue-learning CNN (VRCNN) to improve performance and to accelerate network training. Experimental results show that using our VRCNN as post-processing leads to an average 4.6% bit-rate reduction compared to the HEVC baseline. The VRCNN outperforms previously studied networks, achieving higher bit-rate reduction, lower memory cost, and multi-fold computational speedup.
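The residue-learning idea described in this abstract can be sketched as follows. This is a minimal illustration of the general technique, not the authors' implementation; `residue_net` is a hypothetical stand-in for the trained CNN, which predicts the compression artifacts to be removed from the decoded frame:

```python
import numpy as np

def residue_learning_post_process(decoded_frame, residue_net):
    """Restore a decoded frame by adding a predicted residue.

    Residue learning: instead of predicting the restored frame directly,
    the network predicts only the (small) difference between the original
    and the decoded frame, which is then added back.
    """
    restored = decoded_frame + residue_net(decoded_frame)
    # Pixel intensities assumed normalized to [0, 1]
    return np.clip(restored, 0.0, 1.0)
```

In practice the residue target during training would be `original - decoded`, so the network only has to model the artifacts rather than the full image content, which typically speeds up convergence.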

11:10am - 11:30am

Phase Fourier Reconstruction for Anomaly Detection on Metal Surface Using Salient Irregularity

Tzu-Yi Hung1, Sriram Vaikundam1, Vidhya Natarajan1, Liang Tien Chia2

1Rolls-Royce@NTU Corporate Lab, Singapore; 2School of Computer Science and Engineering, Nanyang Technological University (NTU), Singapore

In this paper, we propose a phase Fourier reconstruction (PFR) approach for anomaly detection using salient irregularity. Existing phase-based methods usually work on texture-based images and assume that the majority of pixels belong to normal, patterned regions. Instead, our PFR exploits salient irregularities in a single non-texture-based dark image. By doing so, surface details, component design, and foreground/background boundaries become indistinct, while anomaly regions are highlighted owing to diffuse reflection caused by rough surfaces. Moreover, unlike existing template-matching methods that require prior knowledge, our PFR framework is an unsupervised approach that automatically de-emphasizes regular patterns and homogeneous regions while simultaneously emphasizing salient regions. Experimental results on anomaly detection clearly demonstrate the effectiveness of the proposed method, which outperforms several well-designed methods with a running time of less than 0.01 seconds per image.
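The core operation behind phase-based saliency methods of this kind can be sketched with a phase-only Fourier reconstruction: keep the phase spectrum, discard the amplitude spectrum, and invert. This is a generic sketch of that well-known building block, not the authors' full PFR pipeline:

```python
import numpy as np

def phase_only_reconstruction(img):
    """Reconstruct an image from its Fourier phase alone.

    Regular, repetitive structure is carried largely by the amplitude
    spectrum; setting the amplitude to 1 suppresses it, so irregular
    (salient) structures dominate the reconstruction.
    """
    f = np.fft.fft2(img)
    phase_only = np.exp(1j * np.angle(f))  # unit amplitude, original phase
    return np.abs(np.fft.ifft2(phase_only))
```

For example, a single bright pixel on a dark background survives the reconstruction essentially unchanged, since an isolated impulse already has a flat amplitude spectrum.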

11:30am - 11:50am

Model-Based 3D Scene Reconstruction Using a Moving RGB-D Camera

Shyi-Chyi Cheng1, Jui-Yuan Su2, Ching-Min Chen1, Jun-Wei Hsieh1

1National Taiwan Ocean University, Taiwan, Republic of China; 2Ming Chuan University, Taiwan

This paper presents a scalable model-based approach for 3D scene reconstruction using a moving RGB-D camera. The proposed approach enhances the accuracy of pose estimation by exploiting the rich information in the multi-channel RGB-D image data, and offers clear advantages in reconstruction quality over conventional approaches that rely on sparse features for pose estimation. The pre-learned image-based 3D model provides multiple templates for sampled views of the model, which are used to estimate the poses of the frames in the input RGB-D video without the need for a priori internal and external camera parameters. Through template-to-frame registration, the reconstructed 3D scene can be loaded into an augmented reality (AR) environment to facilitate display, interaction, and rendering in an image-based AR application. Finally, we verify the ability of the reconstruction system on publicly available benchmark datasets and compare it with state-of-the-art pose estimation algorithms. The results indicate that our approach outperforms the compared methods in pose estimation accuracy.

11:50am - 12:10pm

Joint Face Detection and Initialization for Face Alignment

Zhiwei Wang, Xin Yang

Huazhong University of Science and Technology, China, People's Republic of

This paper presents a joint face detection and initialization method for cascaded face alignment. Unlike existing methods, which treat face detection and initialization as separate steps, we concurrently obtain a bounding box and initial facial landmarks (i.e., shape) in one step, yielding better accuracy and efficiency. Specifically, each image region is represented using shape-indexed features derived from different head poses. A multi-pose face detector is trained: regions whose shapes are roughly aligned with faces have a good feature representation and are used as positive samples, while all others are treated as negative samples. During the face detection phase, initial landmarks can be explicitly placed on the detected faces according to the corresponding shape-indexed features. To accelerate our method, an ultrafast face proposal method based on a face probability map (FPM) and boosted classifiers is employed. Experimental results on public datasets demonstrate superior efficiency and robustness compared to existing initialization schemes, and a large accuracy improvement for cascaded face alignment.

12:10pm - 12:30pm

Learning Features Robust to Image Variations with Siamese Networks for Facial Expression Recognition

Wissam Baddar, Daehoe Kim, Yong Man Ro

KAIST, Korea, Republic of (South Korea)

This paper proposes a computationally efficient method for learning features robust to image variations for facial expression recognition (FER). The proposed method minimizes the feature difference between an image under a variable image condition and a corresponding target image with the best image conditions for FER (i.e., a frontal face image with uniform illumination). This is achieved by regularizing the objective function during the learning process, where a Siamese network is employed. At the test stage, the learned network parameters are transferred to a convolutional neural network (CNN) with which features robust to image variations can be obtained. Experiments have been conducted on the Multi-PIE dataset to evaluate the proposed method under a large number of variations, including pose and illumination. The results show that the proposed method improves FER performance under different variations without requiring extra computational complexity.
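The regularized objective described here, namely a task loss plus a penalty on the feature difference between the variant image and its ideal counterpart, can be sketched as follows. This is a schematic illustration under assumed names (`ce_loss`, `lam`), not the authors' exact formulation:

```python
import numpy as np

def siamese_regularized_loss(feat_var, feat_target, ce_loss, lam=0.1):
    """Total loss = task loss + lam * squared feature difference.

    feat_var:    features of the image under pose/illumination variation
    feat_target: features of the ideal image (frontal, uniform lighting)
    ce_loss:     classification (e.g. cross-entropy) loss on the variant image
    lam:         weight of the feature-difference regularizer (assumed name)
    """
    feature_diff = np.sum((np.asarray(feat_var) - np.asarray(feat_target)) ** 2)
    return ce_loss + lam * feature_diff
```

Driving `feature_diff` toward zero during training pushes the shared network to map a varied image and its canonical version to nearby points in feature space, which is what makes the transferred CNN features robust at test time.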

Contact and Legal Notice · Contact Address:
Conference: MMM2017
Conference Software - ConfTool Pro 2.6.107+TC
© 2001 - 2017 by H. Weinreich, Hamburg, Germany