
Subject Literature Information: Computer Science Special Topic (ScienceDirect Database), Issue 7 (Cumulative Issue 24), November 15, 2018

(Place the cursor on an article title, then hold the Ctrl key and click the title to open the full text.)

Journal of Visual Communication and Image Representation

Volume 57 Pages 1-201 (November 2018)

1. A (t,n)-multi secret image sharing scheme based on Boolean operations

Saeideh Kabirirad, Ziba Eslami

Pages 39-47

Abstract

In (t,n)-multi secret image sharing (MSIS) schemes, a number of secret images are shared among n users so that the participation of at least t of them is needed to recover the shared images. Due to the high volume of images and the computing complexity of secret sharing schemes, recent Boolean-based approaches are highly desirable. Unfortunately, to the best of our knowledge, the existing literature on Boolean-based MSIS schemes only supports two cases: (2,n) and (n,n). In (n,n)-schemes, we lose fault tolerance: in the absence of even one share, the secret images cannot be recovered. On the other hand, (2,n)-MSIS seems to be quite restrictive for the wide range of applications that might occur in practice. It is therefore a challenging problem to propose a Boolean-based (t,n)-MSIS for t≠2,n. The aim of this paper is to solve this problem. We further provide formal proofs of security as well as a comparison with the existing literature.
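To make the fault-tolerance limitation of (n,n) schemes concrete, here is a minimal sketch of the classic XOR-based (n,n) construction. It is not the (t,n) scheme proposed in the paper; the function names `share_nn` and `recover_nn` and the four-pixel sample are assumptions made purely for illustration.

```python
import secrets
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings element by element."""
    return bytes(x ^ y for x, y in zip(a, b))


def share_nn(secret: bytes, n: int) -> list:
    """Split `secret` into n shares; all n are required to recover it."""
    if n < 2:
        raise ValueError("n must be at least 2")
    # n - 1 shares are uniformly random noise of the same length.
    random_shares = [secrets.token_bytes(len(secret)) for _ in range(n - 1)]
    # The last share is the secret XORed with every random share.
    last = reduce(xor_bytes, random_shares, secret)
    return random_shares + [last]


def recover_nn(shares: list) -> bytes:
    """XOR all shares together; the random shares cancel out."""
    return reduce(xor_bytes, shares)


pixels = bytes([0, 64, 128, 255])  # hypothetical grayscale pixel values
shares = share_nn(pixels, 4)
recovered = recover_nn(shares)
```

With any one share missing, the XOR of the remaining shares is independent of the secret, which is the loss of fault tolerance the abstract refers to.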

2. An optimized non-subsampled shearlet transform-based image fusion using Hessian features and unsharp masking

Amit Vishwakarma, M.K. Bhuyan, Yuji Iwahori

Pages 48-60

Abstract

Existing image fusion approaches do not efficiently capture the significant edges, texture and fine features of the source images because of ineffective and non-adaptive fusion structures. Also, for the objective evaluation of fusion algorithms, there is a need for a metric that measures the source image features preserved in the fused image. To address these issues, an optimized non-subsampled shearlet transform (NSST) is developed, which is applied to decompose the source images into low- and high-frequency bands. The low-frequency bands are fused using a proposed descriptor obtained from the superposition of scale-multiplied Canny edge detector features and Hessian features. The high-frequency bands are fused using an unsharp masking-based fusion rule. Moreover, a metric QE is formulated on the basis of the Karhunen-Loeve transform (KLT). The pixel variance information of both the source and fused images can be measured using the proposed metric QE, which indicates the amount of variance information transferred from the source images to the fused image. Both subjective and objective analyses show the efficacy of the proposed fusion structure and the metric QE.

Editor: Lu Xuemei (陆雪梅); Tel.: (3548); E-mail: [email protected]

3. HCLR: A hybrid clustering and low-rank regularization-based method for photon-limited image restoration

Xiaolong Zhu, Xiangchu Feng, Weiwei Wang, Xixi Jia, Chen Xu

Pages 61-68

Abstract

A photon-limited image can be represented as a pixel matrix limited by the relatively small number of collected photons. The image can also be seen as being contaminated by Poisson noise because the total number of photons follows the Poisson distribution. Through exploitation of the inherent properties of observation combined with application of a denoising method, an image can be significantly restored. In this paper, a hybrid clustering and low-rank regularization-based model (HCLR) is proposed based on the essential features of patch clustering and noise. An efficient Newton-type method is designed to optimize this biconvex problem. Experimental results demonstrate that HCLR achieves competitive denoising performance, especially for high noise levels, compared with state-of-the-art Poisson denoising algorithms.
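The Poisson observation model described in the abstract can be sketched as follows. This only simulates a photon-limited observation; it does not implement HCLR. The `peak` parameter (expected photon count at full intensity), the function names and the sample intensities are assumptions for illustration, and the sampler is Knuth's multiplication method, chosen here only because it needs nothing beyond the standard library.

```python
import math
import random


def sample_poisson(lam: float, rng: random.Random) -> int:
    """Draw one Poisson(lam) sample with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    # Multiply uniforms until the product falls to exp(-lam) or below.
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def photon_limited(clean, peak, rng):
    """Simulate photon counting: each pixel is Poisson with mean peak * intensity."""
    return [sample_poisson(peak * value, rng) for value in clean]


rng = random.Random(0)
clean = [0.1, 0.5, 1.0]  # hypothetical normalized intensities in [0, 1]
noisy = photon_limited(clean, peak=4.0, rng=rng)
```

Because a Poisson variable has variance equal to its mean, a smaller `peak` gives a lower signal-to-noise ratio, which corresponds to the high-noise regime the abstract highlights.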

4. A novel hypergraph matching algorithm based on tensor refining

Jun Zhou, Tao Wang, Congyan Lang, Songhe Feng, Yi Jin

Pages 69-75

Abstract

Hypergraph matching utilizes high-order constraints rather than unary or pairwise ones, aiming to establish a more reliable correspondence between two sets of image features. Although many hypergraph matching methods have been put forward over the past decade, it remains a challenging problem due to its combinatorial nature. Most of these methods are based on tensor marginalization, where the tensor entries representing joint probabilities of the assignment are fixed during the iterations while the individual assignment probabilities evolve. This leaves some information incomplete, which may hurt the matching performance. To address this issue, we propose a novel hypergraph matching algorithm based on tensor refining, accompanied by an alternative adjustment method to accelerate convergence. We compare the proposed approach with several outstanding matching algorithms on three commonly used benchmarks. The experimental results validate the superiority of our method in both matching accuracy and robustness against noise and deformation.

5. HDR video quality assessment: Perceptual evaluation of compressed HDR video

Xiaofei Pan, Jiaqi Zhang, Shanshe Wang, Shiqi Wang, Yahui Yang

Pages 76-83

Abstract

Compared with standard dynamic range (SDR) video, high dynamic range (HDR) video can provide a significantly enhanced viewing experience. In particular, compared to SDR video, HDR video has better contrast and preserves more details for the same scene. Despite the rapid development of HDR video compression technology, there is a lack of trusted quality measures for HDR video compression. In order to facilitate the future development of objective HDR quality assessment, we build an HDR video quality assessment database, in which the bitstream is created by compressing a series of HDR video sequences. In the compression, the quantization parameters (QP) are set to 12 levels according to the configuration of the codec. The subjective quality of each bitstream is rated by 22 viewers. It is revealed that the viewers arrived at a reasonable agreement on the subjective quality of different QP levels. This paper presents the results of the subjective quality assessment of HDR compressed video, which also show that there is significant room to further improve objective HDR video quality assessment algorithms.

6. Edge detection with feature re-extraction deep convolutional neural network

Changbao Wen, Pengli Liu, Wenbo Ma, Zhirong Jian, Xiaowen Shi

Pages 84-90

Abstract

In this paper, we propose an edge detector based on feature re-extraction (FRE) of a deep convolutional neural network to effectively utilize the features extracted from each stage, and design a new loss function. The proposed detector is mainly composed of three modules: backbone, side-output, and feature fusion. The backbone module provides preliminary feature extraction; the side-output module makes the network architecture map features from different stages of the backbone network to edge-pixel space more robustly by applying residual learning; and the feature fusion module generates the edge map. Generalization ability on the same distribution is verified using the BSDS500 dataset, achieving an optimal dataset scale (ODS) F-score of 0.804. Cross-distribution generalization ability is verified on the NYUDv2 dataset, achieving an ODS F-score of 0.701. In addition, we find that freezing the backbone network can significantly speed up the training process without much loss of overall accuracy (ODS F-score of 0.791 after 5.4k iterations).
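The ODS F-score quoted above can be illustrated with a short sketch, assuming the usual reading of ODS as the best F-score obtained with a single threshold fixed for the whole dataset. The per-image match counts below are invented for illustration, and the step that matches predicted edge pixels to ground truth is not shown.

```python
def f_score(tp: int, fp: int, fn: int) -> float:
    """F-score = 2PR / (P + R), written directly in terms of match counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def ods_f_score(counts_by_threshold: dict) -> tuple:
    """Pick the single threshold whose dataset-wide F-score is highest."""
    best_threshold, best_f = None, -1.0
    for threshold, per_image in counts_by_threshold.items():
        # Aggregate counts over all images before computing the score.
        tp = sum(c[0] for c in per_image)
        fp = sum(c[1] for c in per_image)
        fn = sum(c[2] for c in per_image)
        score = f_score(tp, fp, fn)
        if score > best_f:
            best_threshold, best_f = threshold, score
    return best_threshold, best_f


# Hypothetical (true positive, false positive, false negative) counts per image.
counts = {
    0.3: [(80, 40, 10), (70, 30, 20)],
    0.5: [(75, 15, 15), (65, 15, 25)],
    0.7: [(50, 5, 40), (40, 5, 50)],
}
best_threshold, best_f = ods_f_score(counts)
```

In this toy data the middle threshold wins because it balances false positives against missed edges across both images at once.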

7. Visual comparison based on linear regression model and linear discriminant analysis

Hanqin Shi, Liang Tao

Pages 118-124

Abstract

Visual comparison is the task of predicting, given two images, which one exhibits a particular visual attribute more than the other. The existing relative attribute methods rely on ranking SVM functions to conduct visual comparison; however, the ranking SVM functions are sensitive to the support vectors. When effective samples are scarce, the performance of the ranking SVM model degrades greatly. To address this issue, we propose a pairwise relative attribute method for visual comparison by training a Linear Regression Model (LRM), which can be formulated as learning a mapping function between a vector-formed feature input built from the pairwise image difference and a scalar-valued output. In addition, we propose a novel feature reduction method based on Linear Discriminant Analysis (LDA) in order to obtain a low-dimensional and discriminant feature. Experimental results on the three databases UT-Zap50K-1, OSR and PubFig demonstrate the advantages of the proposed method.

8. Reduced-reference quality assessment of multiply-distorted images based on structural and uncertainty information degradation

Saeed Mahmoudpour, Peter Schelkens

Pages 125-137

Abstract

The majority of existing objective Image Quality Assessment (IQA) methods are designed for the evaluation of images corrupted by single distortion types. However, images may be degraded by multiple distortions during processing stages. In this paper, we propose a reduced-reference IQA algorithm to predict the quality of multiply-distorted images. An image is first decomposed into predicted and disorderly portions based on the internal generative mechanism theory. The structural information is captured from the predicted image by using a shearlet representation, and Rényi directional entropy is deployed to measure the changes in the disorderly information. Finally, we introduce the application of a framework, namely Learning Using Privileged Information (LUPI), to build a quality model and obtain quality scores. During training, the LUPI framework utilizes a set of additional privileged data to learn an improved quality model. Experimental results on multiply-distorted image datasets (MLIVE and MDID2015) confirm the effectiveness of the proposed IQA model.

9. An uniformizing method of MR image intensity transformation

Jie Chang, Naijie Gu, Xiaoci Zhang, Li Yang, Junjie Su

Pages 138-151

Abstract

Scanner-dependent variations cause the intensities of MR images to fluctuate, even when the key parameters are fixed: the scanner, the patient, the body region and the type of MRI protocol. This inherent variation means that image intensities lack a standard and quantifiable interpretation. Moreover, the unbalanced distribution of intensity values lowers the accuracy and sensitivity of automatic analysis and segmentation based on MR images. We therefore propose a uniformizing method that makes the distribution even while ensuring that similar intensities of MR images reflect the same tissue after processing. Our experiments on 3D brain tumor MR images show that this method can significantly improve labeling and segmentation accuracy compared to conventional preprocessing methods.

10. Learning multi-denoising autoencoding priors for image super-resolution

Yankun Wang, Qiegen Liu, Huilin Zhou, Yuhao Wang

Pages 152-162

Abstract

Inspired by the application of denoising autoencoding priors (DAEP) to image restoration tasks, we propose a single image super-resolution (SISR) method that introduces multi-denoising autoencoding priors (MDAEP). Building on the naive DAEP, the proposed MDAEP integrates multiple DAEPs from different noisy inputs into the iterative restoration process. The combined strategy helps to alleviate the instability of the denoising autoencoders and thus to avoid falling into local solutions. Furthermore, compared with the existing SISR methods based on end-to-end mapping, MDAEP not only is trained once and applied to different magnification factors, but also can effectively preserve high-frequency information and reduce ringing effects in the reconstructed images. Both quantitative and qualitative assessments on the benchmark datasets show that the ability and the stability of the network are improved effectively. The proposed method performs better than the state-of-the-art algorithms, including the basic DAEP, in terms of PSNR and visual comparisons.

11. Extraction of PRNU noise from partly decoded video

Jian Li, Bin Ma, Chunpeng Wang

Pages 183-191

Abstract

This paper presents an algorithm for extracting PRNU noise from video files taken by a smartphone camera. We consider videos because they are quite prevalent in our lives but are not often addressed in existing research. Unlike most prior art, which tends to extract PRNU noise from completely decompressed video frames, our proposed algorithm leaves out some procedures in the video decoding process to reduce the computational complexity. In addition, we design a maximum-likelihood-estimation algorithm for extracting PRNU noise from partly decoded video frames, and also analyze the algorithm’s suitability in theory. Experimental results further prove the validity and effectiveness of the proposed algorithm.

12. Sparsity induced prototype learning via ℓp,1-norm grouping

Xingxing Zhang, Zhenfeng Zhu, Yao Zhao

Pages 192-201

Abstract

Prototype learning aims to eliminate the redundancy of large-scale data by selecting an informative subset. It is at the center of visual data analysis and processing. However, due to intrinsic structures among sample groups, the learnt prototypes are generally less representative and diversified. To alleviate this issue, we develop in this paper a structurally regularized model via ℓp,1-norm grouping, in which both the intra-group and inter-group structures of the source data in object-space are rationally exploited. Thus, while the learnt representative prototypes tend to be distributed across different groups at the inter-group level, the grouping constraint via the ℓp,1-norm enforces the greatest diversity among intra-group prototypes. Considering the convexity of the formulated model, an alternative re-weighting solver is presented to efficiently solve the proposed optimization problem. Experimental results on video summarization, scene categorization and handwriting recognition demonstrate that the proposed method is considerably superior to the state-of-the-art methods in prototype learning.

Special Issue on Multimodal Cooperation for Multimedia Computing

13. Saliency detection based on directional patches extraction and principal local color contrast

Muwei Jian, Wenyin Zhang, Hui Yu, Chaoran Cui, Yilong Yin

Pages 1-11

Abstract

Saliency detection has become an active topic in both the computer vision and multimedia fields. In this paper, we propose a novel computational model for saliency detection by integrating the holistic center-directional map with the principal local color contrast (PLCC) map. In the proposed framework, perceptual directional patches are first detected based on the discrete wavelet frame transform (DWFT) and a sparsity criterion; then the center of the spatial distribution of the extracted directional patches is utilized to locate the salient object in an image. Meanwhile, we propose an efficient local color contrast method, called principal local color contrast (PLCC), to compute the color contrast between the salient object and the image background, which is sufficient to highlight and separate salient objects from a complex background while dramatically reducing the computational cost. Finally, by incorporating the complementary visual cues of the global center-directional map with the PLCC map, a final compounded saliency map can be generated. Extensive experiments performed on three publicly available image databases verify that the proposed scheme is able to achieve satisfactory results compared to other state-of-the-art saliency-detection algorithms.

14. Graph regularized multiview marginal discriminant projection

Heng Pan, Jinrong He, Yu Ling, Lie Ju, Guoliang He

Pages 12-22

Abstract

Multi-view data has become commonplace in today's computer vision applications, since the same object can be sampled from various viewpoints or by different instruments. The large discrepancy between distinct, even heterogeneous, views makes handling multi-view data challenging. To obtain an intrinsic common representation shared by all views, this paper proposes a novel multi-view algorithm called Multiview Marginal Discriminant Projection (MMDP), which is a supervised dimensionality reduction method that searches for a latent common subspace across multiple views. MMDP takes both inter-view and intra-view discriminant information into account and can preserve the global geometric structure and local discriminant structure of the data manifold. Furthermore, the performance of MMDP is improved by imposing graph embedding as a regularization term that penalizes violations of the local geometric structure of the data; the resulting method is called Graph regularized Multiview Marginal Discriminant Projection (GMMDP). Extensive experimental results on face recognition tasks demonstrate the effectiveness and robustness of MMDP and GMMDP. Finally, this paper explores a new application scenario of multi-view learning and applies it, including the proposed GMMDP, to the hyperspectral image classification (HIC) problem, which leads to a satisfactory result.

15. Can modified minimax win in Pearl’s game?

Haoyang Cai, Haodong Ma, Linyu Li, Luming Zhang

Pages 23-27

Abstract

The Minimax algorithm is widely used for adversarial search. It is commonly believed that search depth and winning chance have a positive correlation, but Dana S. Nau pointed out in his 1982 research that pathology occurs in Pearl’s game: Minimax shows a decrease in winning chance as search depth increases. Our research proposes a possible way to fix the pathology by taking the opponent’s strategy into consideration in some specific cases. The experiment shows that the accumulation of incorrect predictions at least partially causes the pathology. Our modified version of Minimax successfully overcomes the pathology and continues to demonstrate the power of Minimax in Pearl’s game.
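For readers unfamiliar with the baseline being modified, the following is a minimal sketch of plain Minimax on a small game tree. It is not the paper's modified version and does not model Pearl's game; the nested-list tree encoding and the leaf values are assumptions for illustration.

```python
def minimax(node, maximizing: bool):
    """Return the minimax value of `node`.

    A node is either a number (leaf evaluation) or a list of child nodes.
    Players alternate: the maximizer picks the largest child value,
    the minimizer the smallest.
    """
    if not isinstance(node, list):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


# Hypothetical two-ply tree: the root player chooses a branch,
# then the opponent chooses a leaf inside it.
tree = [[3, 5], [2, 9], [0, 7]]
value_if_root_maximizes = minimax(tree, True)
value_if_root_minimizes = minimax(tree, False)
```

When the leaves are heuristic estimates rather than true outcomes, each level of `max` and `min` propagates their errors upward, which is the setting in which the pathology described above is studied.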

16. Cross-modal hashing based on category structure preserving

Fei Dong, Xiushan Nie, Xingbo Liu, Leilei Geng, Qian Wang

Pages 28-33

Abstract

Cross-modal hashing has developed rapidly in cross-modal retrieval owing to its substantial reduction in computational cost and storage. Generally, projections for each modality that map heterogeneous data into a common space are used to bridge the gap between different modalities. However, category-specific distributions are usually ignored during the projection. To address this issue, we propose a novel cross-modal hashing method, termed Category Structure Preserving Hashing (CSPH), for cross-modal retrieval. In CSPH, the category-specific distribution is preserved by a structure-preserving regularization term during hash learning. Compared with existing methods, CSPH not only preserves the local structure of each category, but also generates more stable hash codes with less training time. Extensive experiments are conducted on three benchmark datasets, and the experimental results demonstrate the superiority of CSPH under various cross-modal scenarios.
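The low retrieval cost of hashing comes from comparing short binary codes instead of real-valued features. The sketch below shows only that generic retrieval step, ranking items of one modality by Hamming distance to a query code from another; it does not implement CSPH or any hash learning. The 8-bit codes and item names are hypothetical.

```python
def hamming_distance(code_a: int, code_b: int) -> int:
    """Count the bit positions in which two binary codes differ."""
    return bin(code_a ^ code_b).count("1")


def rank_by_hamming(query_code: int, database: dict) -> list:
    """Return database keys sorted from nearest to farthest code."""
    return sorted(
        database,
        key=lambda name: hamming_distance(query_code, database[name]),
    )


# Hypothetical 8-bit codes: one image query against three text items.
image_query = 0b10110010
text_codes = {
    "text_a": 0b10110011,  # differs in 1 bit
    "text_b": 0b01001101,  # differs in all 8 bits
    "text_c": 0b10010110,  # differs in 2 bits
}
ranking = rank_by_hamming(image_query, text_codes)
```

Each comparison is one XOR and a bit count, and each item is stored in a single byte here, which is where the savings in computation and storage mentioned in the abstract come from.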

17. A robust enhancement system based on observer-backstepping controller

Li JiGuang, Chen Xin

Pages 34-38

Abstract

A large amount of data is indispensable in deep learning. The learning results can differ because of noise or contaminated tags. In this paper, a controller design method is therefore proposed to reduce the influence of noise or damaged labels. Our method is based on the backstepping control method and an observer. In our work, an adaptive function is designed to eliminate the influence of the unmodelable part of the system caused by the contaminated tags. For the noise, the observer is used to estimate it accurately and compensate for it effectively. Experimental results show the effectiveness of our method. Our modified system performs well and responds accurately to the input training data in the presence of the unmodelable part of the system and the external noise.

18. Community detection for multi-layer social network based on local random walk

XiaoMing Li, Guangquan Xu, Minghu Tang

Pages 91-98

Abstract

With the fast development of information networks, the scale of social networks has become very large, and it has become more difficult to obtain information about an entire network. In addition, current mining methods for complicated network communities utilize the information of either node links or node properties, and so cannot effectively detect communities with dense member links and highly similar properties. As a result, most current algorithms are impractical for large-scale online social networks. We propose a community detection algorithm for multi-layer social networks based on local random walk (MRLCD). The algorithm determines the core node based on the repeatability of multi-layer nodes. It expands from a core node, performs a local random walk in the multi-layer network, and identifies and controls the random walk scope of a node based on intra-layer and inter-layer trust. During the walk, the clustering coefficients of the nodes to be combined are comprehensively compared to complete a local community search, and the optimal local community is obtained through multiple iterations. Finally, multi-layer modularity is used as the indicator for measuring and evaluating algorithm performance, and the algorithm is compared with other network clustering algorithms such as GL, LART and PMM on four real multi-layer network datasets. The MRLCD algorithm can autonomously explore the local community structure of a given node and effectively improves the stability and accuracy of local community detection in multi-layer social networks.

19. Panoramic visual tracking based on adaptive mechanism

Long Liu, Zijing Yan, Qing Liu

Pages 99-106

Abstract

Panoramic visual tracking has high application value in many situations, but its visual distortion is likely to cause low tracking robustness or loss of the target. This paper presents a panoramic visual tracking method based on adaptive feature fusion. In this method, the size of the trapezoidal frame is calibrated as the target moves, and a linear model of the trapezoidal frame parameters is fitted; the trapezoidal area of the target is then extracted according to the model and transformed into the target trapezoidal region through an affine transformation. Based on a filter tracking framework, the fusion of color and shape information serves as the main feature for target tracking, and a Bayesian fusion recursive formula is used to calculate the particle weights. The experimental results show that the proposed algorithm is better than existing methods in tracking precision and occlusion resistance, and effectively improves the robustness of panoramic visual tracking.

20. Geometric discriminative deep features for traffic image analysis

Haibo Zhang

Pages 163-171

Abstract

Traffic image analysis is an important application in intelligent transportation. Because local features are robust to image variations such as scale changes and occlusions, they are widely used in image classification. However, how to integrate these local features to model traffic images optimally is still a crucial challenge. In this paper, a novel deep learning method, geometric discriminative feature fusion (GDFF), is proposed to tackle this problem. First, we use a variety of data sets to train a general convolutional neural network (CNN), which is used to extract deep-level features of the training and test sets. The deep architecture makes it possible to learn more abstract and internal features that are robust to changes in viewpoint and illumination. It can fuse geometry-related local image features, such as the RGB histograms of local regions, into high-level discriminative features, which can be used to better classify complex scene images. The central task of our framework is to build a structural kernel, called the discriminative topological kernel. First, we segment the traffic images into several regions and use a region connected graph (RCG) to model the location relationships between regions. We use a frequent subgraph mining algorithm to mine all frequent substructures (topologies) that occur in the training RCGs. A selection algorithm is then designed to select k qualified topologies from all the mined frequent topologies. We call these selected topologies geometric feature fusers; they are structures that are both highly discriminative and low in redundancy across all training RCGs. Finally, given a pair of RCGs, for each geometric fuser we extract all pairs of subgraphs sharing the same topology and calculate the distance between them. All k distances are accumulated to form the final kernel. The experimental results demonstrate the effectiveness of our method.

21. Pedestrian tracking by learning deep features

Honghe Huang, Yi Xu, Yanjie Huang, Qian Yang, Zhiguo Zhou

Pages 172-175

Abstract

Pedestrian tracking is now widely used in many intelligent systems, such as video surveillance and security regions. However, many methods are affected by illumination, human posture or human accessories. With the development of Convolutional Neural Networks (CNNs), deep features can be learned. In this paper, training images are divided into subregions to reduce the influence of human accessories, such as bags. The remaining regions are almost fixed regions. These fixed regions are then fed into our CNNs for learning deep features. In order to cope with different sizes of training images, an arbitrarily-sized pooling layer is developed in our CNN architecture. The deeply learned feature vectors can then be used in pedestrian recognition. In our work, optical flow is used for pedestrian tracking. Experimental results show that our proposed method can achieve pedestrian tracking effectively.


22. Anti-occlusion tracking algorithm of video target based on prediction and re-matching strategy

Zhen-tao Hu, Lin Zhou, Ya-nan Yang, Xian-xing Liu, Yong Jin

Pages 176-182

Abstract

Accurately locating a video target during occlusion and reappearance is very important for effective follow-up of the target. To address the poor applicability of Mean Shift and its improved variants when the target is heavily occluded, this paper proposes an anti-occlusion video target tracking algorithm based on a prediction and re-matching strategy. First, by dynamically combining the Mean Shift algorithm with the Kalman filter, this paper achieves stable tracking of the un-occluded target. Second, when the target is occluded, the Kalman filter is combined with the target prior information to predict the position of the occluded target. Finally, when the occluded target reappears, the target is re-matched through the normalized cross-correlation method to obtain its optimal position, so that the target can be quickly and accurately located. The simulation results show that the proposed method offers strong occlusion resistance and reliable tracking in the video target tracking process.
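The re-matching step relies on normalized cross-correlation (NCC). The sketch below shows NCC template matching on a one-dimensional signal to keep it short; real trackers apply the same formula to two-dimensional image patches. It is a generic illustration, not the authors' implementation, and the signal and template values are invented.

```python
import math


def ncc(a, b) -> float:
    """Zero-mean normalized cross-correlation of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    return num / den if den else 0.0


def best_match(signal, template):
    """Slide the template over the signal; return (offset, score) of the best fit."""
    width = len(template)
    scores = [
        ncc(signal[i:i + width], template)
        for i in range(len(signal) - width + 1)
    ]
    offset = max(range(len(scores)), key=scores.__getitem__)
    return offset, scores[offset]


# The window starting at index 2 equals 10 * template + 2, so its NCC is 1
# even though brightness and contrast differ from the template.
template = [1, 3, 5, 3, 1]
signal = [10, 10, 12, 32, 52, 32, 12, 10, 10]
offset, score = best_match(signal, template)
```

Because NCC subtracts the mean and divides by the energy of each window, the match is insensitive to uniform brightness and contrast changes, which is useful when a target reappears under different lighting.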

Special Issue Visual Information Processing for Virtual Reality

23. Efficient VR Video Representation and Quality Assessment
Shilin Wu, Xiaoming Chen, Jun Fu, Zhibo Chen
Pages 107-117
Abstract: VR video is increasingly popular due to recent advances in VR technology and hardware. The bulky size of VR video, however, imposes new challenges for its storage and processing. In this paper, we focus on the research problems of VR video representation and objective quality assessment. Distinct from traditional 2D video, a VR video is displayed and represented on a spherical surface. For encoding purposes, a VR video frame needs to be projected onto a 2D flat plane. Existing projection methods usually lead to much redundancy or a significant violation of the correlation between neighboring pixels, which makes them unfriendly to encoding or creates visible edge artifacts. To alleviate these problems, we propose the Quadrangle Affine Square Projection (QASP). QASP is a novel representation for VR video frames that reduces the redundant pixels relative to traditional projections. In particular, all the inner pixels in QASP remain connected, i.e., the correlations between neighboring pixels are well maintained, which is a desirable feature for video encoding and an edge-artifact-free viewing experience. We also investigate predicting the optimal rotation angle for QASP frames for higher encoding efficiency. In addition to VR video representation, we investigate accurate objective quality measurement for VR video. Traditional video quality measurements, e.g., PSNR, are not suitable for measuring the quality of VR videos since they take the redundant pixels into account. We propose a new quality measurement for VR videos, named Resized-PSNR (R-PSNR). With R-PSNR, only the “meaningful” pixels are considered for quality measurement while the redundant pixels are discarded. To evaluate QASP and R-PSNR, experiments are conducted on standard VR video sequences. The experimental results show that the proposed projection method achieves noticeable improvement over traditional methods, and the proposed quality assessment method outperforms the traditional measurements in terms of consistency with subjective evaluation.
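
The idea of scoring only the “meaningful” pixels can be illustrated with a small sketch. The abstract does not give the definition of R-PSNR, so the code below is not the authors' measure: it is ordinary PSNR restricted to a pixel mask, and the function name `masked_psnr`, the `mask` argument, and the peak value of 255 are assumptions made for illustration.

```python
import math


def masked_psnr(reference, distorted, mask, peak=255.0):
    """PSNR over the pixels where mask is True (hypothetical helper).

    reference, distorted: 2-D lists of pixel intensities of equal shape.
    mask: 2-D list of booleans marking the pixels that should count.
    """
    squared_error = 0.0
    count = 0
    for ref_row, dis_row, mask_row in zip(reference, distorted, mask):
        for ref, dis, keep in zip(ref_row, dis_row, mask_row):
            if keep:
                squared_error += (ref - dis) ** 2
                count += 1
    if count == 0:
        raise ValueError("mask selects no pixels")
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical on the selected pixels
    return 10.0 * math.log10(peak ** 2 / mse)


# Toy frame: the last column stands in for redundant (padding) pixels.
reference = [[100, 100, 0], [100, 100, 0]]
distorted = [[102, 98, 50], [100, 100, 50]]
meaningful = [[True, True, False], [True, True, False]]
everything = [[True, True, True], [True, True, True]]

print(masked_psnr(reference, distorted, meaningful))
print(masked_psnr(reference, distorted, everything))
```

In this toy frame the errors in the redundant column dominate the unmasked score, which is the effect the abstract describes for plain PSNR on projected VR frames.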

24. Chaotic particle filter for visual object tracking
Marjan Firouznia, Karim Faez, Hamidreza Amindavar, Javad Alikhani Koupaei
Pages 1-12
Abstract: In this paper, a chaotic particle filter method is introduced to improve the performance of particle filter based on chaos theory. The methodology of the algorithm includes two steps. First, the global motion estimation is used to predict target position using dynamical information of object movement over frames. Then, the color-based particle filter method is employed in the local region obtained from global motion estimation to localize the target. The algorithm significantly reduces the number of particles, search space, and the filter divergence because of high-order estimation. To verify the efficiency of the tracker, the proposed method is applied to two datasets, consisting of particle filter-based methods under the Bonn Benchmark on Tracking (BoBoT), the large Tracking Benchmark (TB), and Visual Object Tracking (VOT2014). The results demonstrate that the chaotic particle filter method outperforms other state-of-the-art methods on the abrupt motion, occlusion, and out of view. The precision of the proposed method is about 10% higher than that of other particle filter algorithms with low computational cost.

Volume 56 Pages 1-316 (October 2018)

25. Robust visual tracking via multi-feature response maps fusion using a collaborative local-global layer visual model
Haoyang Zhang, Guixi Liu, Zhaohui Hao
Pages 1-14
Abstract: This paper addresses the issue of robust visual tracking, in which an effective tracker based on multi-feature fusion under a collaborative local-global layer visual model is proposed. In the local layer, we implement a novel block tracker using a structural local color histogram feature based on the foreground-background discrimination analysis approach. In the global layer, we implement a complementary correlation filter-based tracker using the HOG feature. Finally, the local and global trackers are linearly merged at the response map level. We choose different merging factors according to the reliability of each combined tracker, and when both of the combined trackers are unreliable, an online-trained SVM detector is activated to re-detect the target. Experiments conducted on challenging sequences show that our final merged tracker achieves favorable tracking performance and outperforms several state-of-the-art trackers. In addition, the performance of the implemented block tracker is evaluated by comparison with relevant color histogram-based trackers.
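
The linear merging of response maps can be sketched as follows. The abstract does not say how reliability is measured or how the merging factors are computed, so everything specific here is an assumption: the peak response value is used as a stand-in reliability score, the weights are the normalized peaks, and `threshold` is a hypothetical cut-off below which both trackers count as unreliable.

```python
def peak(response):
    """Return (value, (row, col)) for the maximum of a 2-D response map."""
    return max(
        (value, (r, c))
        for r, row in enumerate(response)
        for c, value in enumerate(row)
    )


def fuse(local_map, global_map, w_local):
    """Linear merge of two equally sized response maps."""
    w_global = 1.0 - w_local
    return [
        [w_local * a + w_global * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(local_map, global_map)
    ]


def locate(local_map, global_map, threshold=0.3):
    """Return the fused peak position, or None when re-detection is needed."""
    local_score, _ = peak(local_map)    # assumed reliability: peak response
    global_score, _ = peak(global_map)
    total = local_score + global_score
    if total <= 0 or (local_score < threshold and global_score < threshold):
        return None  # both trackers unreliable: caller would run a detector
    w_local = local_score / total       # assumed weighting rule
    _, position = peak(fuse(local_map, global_map, w_local))
    return position


local_map = [[0.1, 0.2], [0.9, 0.1]]
global_map = [[0.1, 0.6], [0.5, 0.2]]
print(locate(local_map, global_map))
```

Returning `None` marks the case in which the abstract's online-trained SVM detector would take over; that detector is outside the scope of this sketch.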

26. No-reference image quality assessment with local features and high-order derivatives
Mariusz Oszust
Pages 15-26
Abstract: The perceptual quality of images is often affected by applied image processing techniques. Their evaluation requires tests which involve human subjects. However, in most cases, image quality assessment (IQA) should be automatic and reproducible. Therefore, in this paper, a novel no-reference IQA method is proposed. The method uses high-order derivatives to extract detailed structure deformation present in distorted images. Furthermore, it employs local features, considering that only some regions of an image carry interesting information. Then, statistics of local features are used by a support vector regression technique to provide an objective quality score. To improve the quality prediction, luminance and chrominance channels of the image are processed. Experimental results on six large-scale public IQA image datasets show that the proposed method outperforms the state-of-the-art hand-crafted and deep-learning techniques in terms of the visual quality prediction accuracy. Furthermore, the method is better than popular full-reference approaches (i.e., SSIM and PSNR).

27. Hybrid of extended locality-constrained linear coding and manifold ranking for salient object detection
Chunlei Yang, Xiangluo Wang, Jiexin Pu, Guo-Sen Xie, ... Lingfei Liang
Pages 27-37
Abstract: Recent years have witnessed great progress in salient object detection methods. However, due to emerging complex scenes, two problems urgently need to be solved: one is locating the foreground quickly while preserving precision, and the other is reducing the noise near the foreground boundary in saliency maps. In this paper, a hybrid method is proposed to ameliorate these two issues. First, to reduce the runtime of integrating prior knowledge, a novel Prior Knowledge Learning based Region Classification (PKL-RC) method is proposed for classifying image regions and preliminarily locating the foreground. Furthermore, to generate more accurate saliency, a Locality-constrained Linear self-Coding based Region Clustering (LLsC-RC) model is proposed to improve the adjacency structure of the similarity graph for Manifold Ranking (MR). Experimental results demonstrate the effectiveness and superiority of the proposed method in terms of both higher precision and better smoothness.

28. A new Wronskian change detection model based codebook background subtraction for visual surveillance applications
Deepak Kumar Panda, Sukadev Meher
Pages 52-72
Abstract: Background subtraction (BS) is a popular approach for detecting moving objects in video sequences for visual surveillance applications. In this paper, a new multi-channel and multi-resolution Wronskian change detection model (MCMRWM) based codebook background subtraction is proposed for moving object detection in the presence of dynamic background conditions. In the proposed MCMRWM, the multi-channel information helps to reduce the false negatives of the foreground object, and the multi-resolution data suppresses the background noise, resulting in reduced false positives. The proposed algorithm considers the ratio of the feature vectors of the current frame to those of the background model, or its reciprocal, in an adaptive manner depending on the l2 norm of the feature vector, which helps to detect the foreground object completely without any false negatives. Extensive experiments are carried out with challenging video sequences to show the efficacy of the proposed algorithm against state-of-the-art BS techniques.

29. General-to-specific learning for facial attribute classification in the wild
Yuechuan Sun, Jun Yu
Pages 83-91
Abstract: Recent studies have shown that facial attributes provide useful cues for a number of applications such as face verification. However, accurate facial attribute interpretation is still a formidable challenge in real life due to large head poses, occlusion and illumination variations. In this work, we propose a general-to-specific deep convolutional network architecture for predicting multiple attributes from a single image in the wild. First, we model the interdependencies among all attributes by jointly learning them all. Second, task-aware learning is adopted to explore the disparity regarding each attribute. Finally, an attribute-aware face cropping scheme is proposed to extract more discriminative features from where a certain attribute naturally shows up. The proposed learning strategy ensures both robustness and performance of our model. Extensive experiments on two challenging publicly available datasets demonstrate the effectiveness of our architecture and its superiority to state-of-the-art alternatives.

30. Visual tracking via context-aware local sparse appearance model
Guiji Li, Manman Peng, Ke Nai, Zhiyong Li, Keqin Li
Pages 92-105
Abstract: Most existing local sparse trackers are prone to drifting away as they do not make use of the discriminative information of local patches. In this paper, we propose an effective context-aware local sparse appearance model to alleviate the drift problem caused by background clutter and occlusions. First, considering that different local patches should have different impacts on the likelihood computation, we present a novel Impact Allocation Strategy (IAS) that integrates the spatial-temporal context. Varying positive impact factors are adaptively assigned to different local patches based on their ability to distinguish the spatial context, which provides discriminative information to prevent the tracker from drifting. Furthermore, we exploit the temporal context to introduce historical information for more accurate locating. Second, we present a new patch-based dictionary update method that can update each patch independently, with validation of effectiveness. On the one hand, we introduce a sparsity concentration index to check whether the local patch to be updated is a valid local patch from the target object. On the other hand, the spatial context is further employed to eliminate the effect of the background. Experimental results show the superiority and competitiveness of the proposed method on the benchmark data set compared to other state-of-the-art algorithms.

31. A new framework of action recognition with discriminative parts, spatio-temporal and causal interaction descriptors
Ming Tong, Yiran Chen, Mengao Zhao, Weijuan Tian
Pages 116-130
Abstract: To improve action recognition performance, a novel discriminative spectral clustering method is first proposed, by which candidate parts whose internal trajectories are close in spatial position, consistent in appearance and similar in motion velocity are mined. Furthermore, a discriminative constraint is introduced to select discriminative parts. Meanwhile, by fully considering the local and global distributions of the data, a new similarity matrix is constructed, which enhances the clustering effect. Second, a spatio-temporal interaction descriptor and a causal interaction descriptor are constructed, which fully mine the spatio-temporal and implicit causal interactive relationships between parts. Finally, a new framework is proposed. By using the discriminative parts together with the spatio-temporal and causal interaction descriptors as the inputs of a Latent Support Vector Machine (LSVM), the correlations between action categories and action parts as well as interaction descriptors are mined. Consequently, accuracy is enhanced. Extensive experiments demonstrate the effectiveness of the proposed method.

32. Content adaptive interpolation filters based on HEVC framework
Xiaojie Liu, Wenpeng Ding, Yunhui Shi, Baocai Yin
Pages 131-138
Abstract: Motion compensation is the key technique to reduce temporal redundancy in video coding. Interpolation filters are adopted to generate the inter frame prediction for motion compensation with fractional pixel accuracy. In existing video coding standards such as H.264/AVC and HEVC, a set of predefined interpolation filters is adopted in motion compensation. However, predefined interpolation filters cannot adapt to the video content, which may compromise the coding efficiency. In this paper, a content adaptive interpolation scheme is proposed for motion compensation. In the proposed scheme, a set of adaptive interpolation filters is derived for each frame as additional interpolation filters to minimize the inter prediction difference. Rate-distortion optimization is employed to choose between the predefined interpolation filters and the derived adaptive interpolation filters to achieve the best coding performance at low bit rates. The proposed scheme is implemented in the HM 12.1 software. Experimental results show that the proposed scheme achieves 5.13 percent, 3.42 percent and 4.07 percent bit rate savings on average compared with HEVC under the “low delay P”, “low delay B” and “random access” configurations, respectively.

33. The spatial correlation problem of noise in imaging deblurring and its solution
Chenwei Yang, Huajun Feng, Zhihai Xu, Qi Li, Yueting Chen
Pages 167-176
Abstract: We describe the spatial correlation problem of noise in colour digital images and analyse its cause. Pixel-correlated image processing procedures, such as CFA colour interpolation and colour space transformation, mainly lead to this problem. Considering this problem, we propose a new noise model based on a joint Gaussian probability distribution. Furthermore, we present an algorithm that makes the revised noise model fit the existing image deconvolution well. The parameters of our algorithm depend only on the image processing procedures of the imaging system. Finally, we apply the proposed algorithm to revise two typical image deconvolution methods and perform simulations and real-world experiments. Both the quantitative indicators and visual performance of the image deblurring results show that the revised deconvolution methods based on our noise model behave better in reducing the noise and ringing artefacts, thus improving the image quality compared with the methods that use the original noise model.

34. Moving object detection by low rank approximation and l1-TV regularization on RPCA framework
B. Shijila, Anju Jose Tom, Sudhish N. George
Pages 188-200
Abstract: The detection of moving objects and the subtraction of the scene background are significant tasks for intelligent video surveillance systems, as they are among the fundamental steps. Inspired by the challenging cases yet to be resolved in Moving Object Detection (MOD), a new formulation is proposed to detect moving objects from video sequences based on the Robust Principal Component Analysis (RPCA) principle by adopting Total Variation (TV) norm regularization and using a convergent convex optimization algorithm. While the nuclear norm exploits the low-rank property of the background, the sparsity is enhanced by the l1-norm and the foreground spatial smoothness is explored by TV regularization. The merit of this method lies in its reduced computational complexity, its speed, and its superiority in quantitative evaluation based on F-measure, Recall and Precision with respect to state-of-the-art methods.
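
The three ingredients named in this abstract (nuclear norm, l1-norm, TV regularization) are commonly combined in an objective of the following general shape. This is only a generic sketch and not necessarily the authors' formulation: the symbols are assumptions, with D the matrix whose columns are vectorized frames, L the low-rank background, S the sparse foreground, and λ1, λ2 hypothetical weighting parameters.

```latex
\min_{L,\,S}\; \|L\|_{*} \;+\; \lambda_{1}\,\|S\|_{1} \;+\; \lambda_{2}\,\|S\|_{\mathrm{TV}}
\quad \text{subject to} \quad D = L + S
```

Here the nuclear norm is the sum of the singular values of L, which favors a low-rank background, while the other two terms favor a foreground that is sparse and spatially smooth.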

35. Discriminative kernel-based metric learning for face verification
Siew-Chin Chong, Thian-Song Ong, Andrew Beng Jin Teoh
Pages 207-219
Abstract: This paper outlines a simplistic formulation for a doublet-constrained discriminative metric learning framework for face verification. The Mahalanobis distance metric of the framework is formulated by leveraging the within-class scatter matrix of the doublet and a quadratic kernel function. Unlike existing metric learning methods, the proposed framework admits an efficient solution owing to the convex nature of kernel machines. We demonstrate three realizations of the proposed framework based on well-known kernel machine instances, namely Support Vector Machine, Kernel Ridge Regression and Least Squares Support Vector Machine. Due to the wide availability of off-the-shelf kernel learner solvers, the proposed method can be easily trained and deployed. We evaluate the proposed discriminative kernel-based metric learning with two types of face verification setup, standard and unconstrained face verification, on three benchmark datasets. The promising experimental results corroborate the feasibility and robustness of the proposed framework.

36. Artistic movement recognition by consensus of boosted SVM based experts
Corneliu Florea, Fabian Gieseke
Pages 220-233
Abstract: In this work we aim to automatically recognize the artistic movement from a digitized image of a painting. Our approach uses a new system that resorts to descriptions induced by color structure histograms and by novel topographical features for texture assessment. The topographical descriptors accumulate information from the first and second local derivatives within four layers of finer representations. The classification is performed by two layers of ensembles. The first is an adapted boosted ensemble of support vector machines, which introduces further randomization over feature categories as a regularization. The training of the ensemble yields individual experts by isolating initially misclassified images and by correcting them in further stages of the process. The solution improves the performance by a second layer built upon the consensus of multiple local experts that analyze different parts of the images. The resulting performance compares favorably with classical solutions and manages to match that of modern deep learning frameworks.

37. An efficient lossless secret sharing scheme for medical images
A. Kanso, M. Ghebleh
Pages 245-255
Abstract: Medical doctors use diagnostic imaging techniques such as X-rays, CT scans and MRI for detecting diseases or narrowing down possible causes of pain. This often requires sharing and transmitting medical images over public channels. In this work we adapt Shamir’s secret sharing paradigm to propose a novel lossless scheme for secure sharing of medical images. The proposed scheme takes advantage of the redundancy in typical medical images to reduce share sizes, and hence facilitate storing and sharing. To this end, we employ a customized run-length encoding method to compress the medical image. We conduct an extensive performance analysis on the proposed scheme, including a comparison with some existing Shamir-type secret image sharing schemes.
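
The combination of run-length encoding and Shamir-type sharing can be sketched in a few lines. This is textbook Shamir (t, n) sharing over GF(257) applied to a plain run-length encoding, not the authors' customized scheme; the function names, the cap of 255 on run lengths, and the choice of prime are assumptions. Python's `random` module is used only so the example is reproducible, and a real system would need a cryptographically secure source such as `secrets`.

```python
import random

PRIME = 257  # smallest prime above 255, so every byte value is a field element


def rle_encode(data):
    """Run-length encode bytes into [value, length] pairs, lengths capped at 255."""
    runs = []
    for byte in data:
        if runs and runs[-1][0] == byte and runs[-1][1] < 255:
            runs[-1][1] += 1
        else:
            runs.append([byte, 1])
    return runs


def rle_decode(runs):
    return bytes(value for value, length in runs for _ in range(length))


def split_symbol(secret, t, n, rng):
    """Evaluate a random degree t-1 polynomial with constant term `secret` at x = 1..n."""
    coeffs = [secret] + [rng.randrange(PRIME) for _ in range(t - 1)]
    return [
        sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME
        for x in range(1, n + 1)
    ]


def recover_symbol(points):
    """Lagrange interpolation at x = 0 over GF(PRIME)."""
    secret = 0
    for j, (xj, yj) in enumerate(points):
        num, den = 1, 1
        for m, (xm, _) in enumerate(points):
            if m != j:
                num = num * (-xm) % PRIME
                den = den * (xj - xm) % PRIME
        secret = (secret + yj * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret


def share_image(pixels, t, n, seed=None):
    """Return n shares (x, values); values may reach 256 and so need 9 bits each."""
    rng = random.Random(seed)  # NOT cryptographically secure; illustration only
    symbols = [s for run in rle_encode(pixels) for s in run]
    columns = [split_symbol(s, t, n, rng) for s in symbols]
    return [(k + 1, [col[k] for col in columns]) for k in range(n)]


def recover_image(shares):
    """Rebuild the pixels from at least t shares."""
    length = len(shares[0][1])
    symbols = [
        recover_symbol([(x, values[i]) for x, values in shares])
        for i in range(length)
    ]
    return rle_decode(list(zip(symbols[0::2], symbols[1::2])))


pixels = bytes([0] * 300 + [7] * 5 + [255])
shares = share_image(pixels, t=3, n=5, seed=1)
print(len(pixels), len(shares[0][1]))
print(recover_image([shares[0], shares[2], shares[4]]) == pixels)
```

Because long runs of identical pixels collapse into a few (value, length) symbols before sharing, each share holds far fewer symbols than the image has pixels, which is the size reduction the abstract attributes to exploiting redundancy.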

38. Single image vehicle classification using pseudo long short-term memory classifier
Reza Fuad Rachmadi, Keiichi Uchimura, Gou Koutaki, Kohichi Ogata
Pages 265-274
Abstract: In this paper, we propose a pseudo long short-term memory (LSTM) classifier for single image vehicle classification. The proposed pseudo-LSTM (P-LSTM) uses spatially divided images rather than time-series images. In other words, the proposed method considers the divided images to be time-series frames. The divided images are formed by cropping input images using a two-level spatial pyramid region configuration. Parallel convolutional networks are used to extract the spatial pyramid features of the divided images. To explore the correlations between the spatial pyramid features, we attached an LSTM classifier to the end of the parallel convolutional network and treated each convolutional network as an independent timestamp. Although LSTM classifiers are typically used for time-dependent data, our experiments demonstrated that they can also be used for non-time-dependent data. We attached one fully connected layer to the end of the network to compute a final classification decision. Experiments on the MIO-TCD vehicle classification dataset show that our proposed classifier produces a high evaluation score and is comparable with several other state-of-the-art methods.

39. View synthesis using foreground object extraction for disparity control and image inpainting
Dongxue Han, Hui Chen, Changhe Tu, Yanyan Xu
Pages 287-295
Abstract: Among the rapidly growing three-dimensional technologies, multiview displays have drawn great research interest in three-dimensional television due to their adaptation to motion parallax and their wider viewing angles. However, multiview displays still suffer from dazzling discomfort on the border of viewing zones. Leveraging the separability of the scene via foreground segmentation, we propose a novel virtual view synthesis method for depth-image-based rendering to alleviate the discomfort. Foreground objects of interest are extracted to segment the whole image into multiple layers, which are further warped to the virtual viewpoint in order. To alleviate the visual discomfort, global disparity adjustments and local depth control are performed for specific objects in each layer. For the post-processing, we improve an exemplar-based inpainting algorithm to tackle the disoccluded areas. Experimental results demonstrate that our method achieves effective disparity control and generates high-quality virtual view images.

40. Attention guided U-Net for accurate iris segmentation
Sheng Lian, Zhiming Luo, Zhun Zhong, Xiang Lin, ... Shaozi Li
Pages 296-304
Abstract: Iris segmentation is a critical step for improving the accuracy of iris recognition, as well as for medical concerns. Existing methods generally use whole eye images as input for network learning and do not consider the geometric constraint that the iris only occurs in a specific area of the eye. As a result, such methods can be easily affected by irrelevant noisy pixels outside the iris region. To address this problem, we propose the ATTention U-Net (ATT-UNet), which guides the model to learn more discriminative features for separating iris and non-iris pixels. The ATT-UNet first regresses a bounding box of the potential iris region and generates an attention mask. Then, the mask is used as a weighting function to merge with discriminative feature maps in the model, making the segmentation model pay more attention to the iris region. We implement our approach on UBIRIS.v2 and CASIA.IrisV4-distance, and achieve mean error rates of 0.76% and 0.38%, respectively. Experimental results show that our method achieves consistent improvement on both visible-wavelength and near-infrared iris images with challenging scenery, and surpasses other representative iris segmentation approaches.

Special Issue on Emerging 3D and Immersive Data Processing and Evaluation Technologies

41. Cost aggregation benchmark for light field depth estimation
Williem, In Kyu Park
Pages 38-51
Abstract: Light field depth estimation has become a mature research topic, and numerous algorithms have been introduced by various research groups. However, a comprehensive and fair benchmark is difficult to apply because the steps of the introduced algorithms vary widely. It is essential to analyze each step in light field depth estimation so that better and more robust algorithms can be designed. Thus, a thorough analysis of cost aggregation is conducted in this paper to assess the performance of various cost aggregation methods on light field depth estimation. A study of the parameter setting for each cost aggregation method is performed. Then, each cost aggregation method with its optimal parameters is evaluated individually. Instead of using the standard rank system, this paper utilizes a weighted rank system based on the score difference on each criterion. Experimental results confirm that the guided-filter based method outperforms other methods on most evaluation criteria.

42. Joint foveation-depth just-noticeable-difference model for virtual reality environment
Di Liu, Yingbin Wang, Zhenzhong Chen
Pages 73-82
Abstract: In this paper, we develop a joint foveation-depth just-noticeable-difference (FD-JND) model to quantify the perceptual redundancy of images in the VR display environment. The proposed FD-JND model is developed with consideration of the effects of both foveation and depth. More specifically, experiments for the VR environment on synthesized stimuli are conducted based on luminance masking and contrast masking, and the FD-JND model is developed accordingly. Subjective quality discrimination experiments between the noise-contaminated images and the original ones validate the proposed FD-JND model.

43. Cost and power efficient FPGA based stereo vision system using directional graph transform
M. Dehnavi, M. Eshghi
Pages 106-115
Abstract: 3D information about an environment obtained from stereo cameras is important for the navigation of intelligent systems. Cost, power, accuracy, and speed are four important parameters in these systems. In this article, an accurate, real-time, low-power and low-cost system is provided to extract disparity maps in stereo vision using an FPGA hardware platform. First, a new transform based on directional graphs is proposed. Then, using this graph transform and a cross-based matching method, the disparity map is computed. By using optimized hardware for the proposed transform and algorithm, we have obtained an accurate, low-cost, low-power and fast stereo vision system. The proposed system is fully implemented on a relatively low-cost FPGA platform, XC7K160t, in order to operate as a standalone system. This system uses 40 K registers, 31 K LUTs, 215 memory blocks, and 258 DSP blocks of this FPGA. The proposed system is tested and evaluated on the Middlebury dataset. The results show that the proposed stereo system can process HD-quality video at 60 frames per second for 64 disparity levels with only 7.1% error in the final disparity map. The total power consumption of the proposed stereo vision core is about 1 W.

Special Issue on Text-based Image/video Understanding in Social Media Context

44. An overview of face-related technologies
Hongyan Fei, Bing Tu, Ququ Chen, Danbing He, ... Yishu Peng
Pages 139-143
Abstract: In recent years, information technology has developed continuously and set off an artificial intelligence boom in the field of science. The development of advanced technologies such as unmanned driving and AI chips reflects the extensive application of artificial intelligence. Face-related technologies have a wide range of applications because of their intuitive results and good concealment. Since 3D face information can provide more comprehensive facial information than 2D face information, and can solve many difficulties that cannot be solved in 2D face recognition, more and more researchers have studied 3D face recognition in recent years. Under the new circumstances, research on faces is experiencing all kinds of challenges. With the tireless efforts of many scientists, the technology is making constant progress, and among the many technologies under development it has maintained its leading position. In this paper, we briefly sort out the present development of face-related technology and outline its general evolution. Finally, the practical significance of this technology development is briefly discussed.

45. Adaptive total variation-based spectral-spatial feature extraction of hyperspectral image
Guoyun Zhang, Jinping Wang, Xiaofei Zhang, Hongyan Fei, Bing Tu
Pages 150-159
Abstract: In this paper, a simple yet quite useful hyperspectral image (HSI) classification method based on adaptive total variation filtering (ATVF) is proposed. The proposed method consists of the following steps. First, the spectral dimension of the HSI is reduced with principal component analysis (PCA). Then, ATVF is employed to extract image features, which not only reduces the noise in the image but also effectively exploits spatial–spectral information and can therefore provide an improved representation. Finally, the efficient extreme learning machine (ELM), which has a very simple structure, is used for classification. This paper analyzes in detail the influence of different parameters of the ATVF and ELM algorithms on the classification performance. Experiments are performed on three hyperspectral urban data sets. Compared with other HSI classification methods and other feature extraction methods, the proposed method based on the ATVF algorithm shows outstanding performance in terms of classification accuracy and computational efficiency.

46. Classification of hyperspectral images via weighted spatial correlation representation
Bing Tu, Nanying Li, Leyuan Fang, Hongyan Fei, Danbing He
Pages 160-166
Abstract: Superpixel segmentation has been widely applied in hyperspectral image (HSI) classification. In this letter, a weighted spatial correlation representation (WSCR) method for HSI classification is proposed. It describes an effective metric, the spatial correlation representation (SCR), that measures the correlation coefficient (CC) among different pixels in the superpixels and fully utilizes the spatial information and structural features of superpixels. In addition, considering that the contribution of each SCR is different, Gaussian weighting is applied. The proposed method includes the following steps. First, a superpixel image is obtained from the HSI based on the entropy rate superpixel (ERS) algorithm. Second, the WSCRs for the training and test samples are calculated. Then, a joint sparse representation (JSR) classification is used to obtain the representation residuals of different pixels. Finally, the class label of each pixel is determined by a defined decision function that combines the WSCR and JSR. Experimental results obtained on two real HSI datasets demonstrate the superiority of the proposed method over other widely used methods in terms of classification accuracy.

47. Analysis of security operation and maintenance system using privacy utility in media environment
Zhengwei Jiang, Guoen Chen, Xueqi Jin, Yueqiang Wang
Pages 177-181
Abstract: At present, power information rooms mostly adopt either an analog KVM matrix or a digital KVM matrix, but both methods have various defects. To solve the problems of these traditional modes, this paper develops a new security operation and maintenance management system composed of five parts. It retains the advantages of the traditional modes and overcomes their shortcomings. First, the paper introduces the advantages and disadvantages of the two traditional modes and puts forward the necessity of improvement; then, it presents the design of each part of the security operation and maintenance management system and describes how its functions are realized; finally, it carries out a security analysis. The results show that the security operation and maintenance management system improves the security of the system and helps it operate more intelligently and safely while guaranteeing the required functions.

48. A discriminative dynamic framework for facial expression recognition in video sequences

Xijian Fan, Xubing Yang, Qiaolin Ye, Yin Yang
Pages 182-187
Abstract
Facial expression involves a dynamic process, leading to the variation of different facial components over time. Thus, dynamic descriptors are essential for recognising facial expressions. In this paper, we extend the spatial pyramid histogram of gradients to the spatio-temporal domain to give 3-dimensional facial features. To enhance the spatial information, we divide the whole face region into a group of smaller local regions to extract local 3D features, and a weighting strategy based on the Fisher separation criterion is proposed to enhance the discrimination ability of local features. A multi-class classifier based on a support vector machine is applied for recognising facial expressions. Experiments on the CK+ and MMI datasets using a leave-one-out cross-validation scheme show that the proposed framework performs better than using the descriptor of simple concatenation. Compared with state-of-the-art methods, the proposed framework demonstrates superior performance.
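The abstract does not spell out how the Fisher separation criterion yields region weights. The following is a minimal sketch, assuming each local region is summarised by one scalar feature per sample and that a region's weight is proportional to its between-class scatter divided by its within-class scatter. The function names, the scalar simplification, and the small `eps` guard are assumptions for illustration, not the authors' formulation.

```python
def fisher_score(values, labels, eps=1e-12):
    """Ratio of between-class to within-class scatter for a scalar feature."""
    groups = {}
    for v, lab in zip(values, labels):
        groups.setdefault(lab, []).append(v)
    overall_mean = sum(values) / len(values)
    between = 0.0
    within = 0.0
    for members in groups.values():
        class_mean = sum(members) / len(members)
        between += len(members) * (class_mean - overall_mean) ** 2
        within += sum((v - class_mean) ** 2 for v in members)
    # eps avoids division by zero when every class is perfectly compact
    return between / (within + eps)

def region_weights(region_features, labels):
    """Normalise per-region Fisher scores so that the weights sum to 1.

    region_features: one list of scalar features per region,
                     each aligned with `labels` (one entry per sample).
    """
    scores = [fisher_score(feats, labels) for feats in region_features]
    total = sum(scores)
    if total == 0:
        # no region separates the classes; fall back to uniform weights
        return [1.0 / len(scores)] * len(scores)
    return [s / total for s in scores]
```

A region whose features separate the expression classes cleanly receives a larger weight than one whose features overlap across classes.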

49. Camera network analysis for visual surveillance in electric industrial context

Zhengwei Jiang
Pages 201-206
Abstract
Society is rapidly accepting the use of cameras in a wide variety of locations and applications: site traffic monitoring, parking lot surveillance, cars and smart spaces. These cameras provide data every day that must be analyzed effectively. Recent advances in sensor manufacturing, communications and computing are stimulating the development of new applications that can transform traditional vision systems by incorporating pervasive smart camera networks. The analysis of visual cues in multi-camera networks enables a wide range of applications, from smart home and office automation to large-area surveillance and traffic surveillance. Dense camera networks, in which most cameras have large overlapping fields of view, have been well researched; we focus instead on sparse camera networks. A sparse camera network covers a large surveillance area with as few cameras as possible, so most cameras' fields of view do not overlap. This task is challenging because of the lack of knowledge of the network topology, the changes in a target's appearance and motion across different views, and the difficulty of understanding complex events observed across the network. In this review, we present a comprehensive survey of recent studies that address topology learning, object appearance modeling and global activity understanding in sparse camera networks. In addition, some of the current open research issues are discussed.


Special Issue on Multimodal Cooperation for Multimedia Computing

50. Graph regularized low-rank tensor representation for feature selection

Yuting Su, Xu Bai, Wu Li, Peiguang Jing, ... Jing Liu
Pages 234-244
Abstract
Recently, considerable efforts have been made in feature selection to improve the original feature subspace. In this paper, we propose a graph regularized low-rank tensor representation (GRLTR) for feature selection. We jointly incorporate the low-rank representation and the graph embedding into a unified learning framework to preserve both the intrinsic global low-dimensional structure and the local geometrical structure of the data. Given the wide presence of multidimensional data, our proposed framework is based on tensors, which can faithfully maintain the information. To improve the performance of a specific clustering task, we incorporate the idea of embedded feature selection into our model to optimize the feature representation and the clustering result simultaneously. Experimental results on six available datasets suggest that our proposed approach produces superior performance compared with several state-of-the-art methods.

51. Unsupervised multi-view feature extraction with dynamic graph learning

Dan Shi, Lei Zhu, Zhiyong Cheng, Zhihui Li, Huaxiang Zhang
Pages 256-264
Abstract
Graph-based multi-view feature extraction has attracted much attention in the literature. However, conventional solutions generally rely on a manually defined affinity graph matrix, which can hardly capture the intrinsic sample relations in multiple views. In addition, graph construction and feature extraction are separated into two independent processes, which may lead to sub-optimal results. Furthermore, the raw data may contain adverse noise that reduces the reliability of the affinity matrix. In this paper, we propose a novel Unsupervised Multi-view Feature Extraction with Dynamic Graph Learning (UMFE-DGL) to address these limitations. We devise a unified learning framework which simultaneously performs dynamic graph learning and feature extraction. Dynamic graph learning adaptively captures the intrinsic multiple view-specific relations of samples. Feature extraction learns the projection matrix that accordingly preserves the dynamically adjusted sample relations modelled by the graph in the low-dimensional features. Experimental results on several public datasets demonstrate the superior performance of the proposed approach compared with state-of-the-art techniques.

52. 3D object recognition based on pairwise Multi-view Convolutional Neural Networks

Z. Gao, D.Y. Wang, Y.B. Xue, G.P. Xu, ... Y.L. Wang
Pages 305-315
Abstract
With the development of 3D sensors, it will become much easier to obtain 3D models, which will be prevalent in daily life. Although many 3D object recognition algorithms have been proposed, they still have limitations, including a lack of training samples, reliance on hand-crafted feature representations, and separate feature extraction and recognition stages. In this work, we propose a novel pairwise Multi-View Convolutional Neural Network for 3D object recognition (PMV-CNN for short), in which automatic feature extraction and object recognition are unified in a single CNN architecture. Moreover, because PMV-CNN uses a pairwise network architecture, it does not require a large number of training samples in the original dataset. In addition, the latent complementary relationships among different views can be thoroughly explored by view pooling. Large-scale experiments demonstrate that the pairwise architecture is very useful when the number of labeled training samples is very small, and that it also makes feature extraction more robust. Furthermore, because PMV-CNN employs an end-to-end network architecture, the extracted features are well suited to 3D object recognition and perform much better than hand-crafted features. In short, the proposed method outperforms state-of-the-art methods.
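View pooling is not defined in the abstract. A minimal sketch follows, assuming it is an element-wise maximum over per-view feature vectors, as in some multi-view CNN designs. The function name and the toy vectors are illustrative only and do not come from the paper.

```python
def view_pool(view_features):
    """Element-wise max across equal-length feature vectors, one per view."""
    if not view_features:
        raise ValueError("at least one view is required")
    length = len(view_features[0])
    if any(len(f) != length for f in view_features):
        raise ValueError("all views must have the same feature length")
    # zip(*...) groups the i-th feature of every view into one column
    return [max(column) for column in zip(*view_features)]

# Hypothetical features extracted from two rendered views of one object
views = [
    [0.1, 0.9, 0.3],
    [0.4, 0.2, 0.8],
]
pooled = view_pool(views)
```

Under this assumption, each pooled feature keeps the strongest response found in any view, so information visible from only one viewpoint still reaches the classifier.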
