S -R P -S S -C D L A I - Hong Kong Polytechnic Universitycslzhang/SCDL/SCDL_poster.pdf · SCDL...

1
S EMI -C OUPLED D ICTIONARY L EARNING WITH A PPLICATIONS TO I MAGE S UPER -R ESOLUTION AND P HOTO -S KETCH S YNTHESIS S HENLONG W ANG ,L EI Z HANG ,Y AN L IANG AND Q UAN P AN [email protected], [email protected], {liangyan, quanpan}@nwpu.edu.cn P ROBLEM In many computer vision and pat- tern recognition applications, people often have images of the same scene but obtained from different sources, and consequently the conversion be- tween the images of different styles are required. This is a difficult problem due to two aspects. 1. Characterizing a highly complex mapping. 2. Describing the target style. (The output should look like to be in the style) C ONTRIBUTIONS We propose a simple yet more general model to solve the cross-style image synthesis problems. Our method trains a dictionary pair and a mapping function simultaneously. The pair of dictionaries aims to characterize the two structural domains of image types, and the mapping function that reveals the intrinsic rela- tionship between the two styles. Our model can be adapted to various cross-style image applications.In summary, our paper have two main contributions: 1. Proposed a novel coupled dictionary learning approach (Two dictionaries will not be fully coupled, allowing much flexibility for synthesis). 2. Proposed a reliable style transform algorithm (Successful application in im- age super-resolution and photo-sketch transformation). R ESULTS ×2 super-resolution. From left to right: low resolution image, high resolution ground-truth, and reconstructed images by Bicubic, ScSR [1], SAI [2], SME [3] and the proposed SCDL method. Image Girl Butterfly Fence Starfish Parthenon House Foreman Leaves Average SAI[2] 34.13 29.17 23.78 30.73 27.10 32.84 37.68 28.72 30.00 SME[3] 34.03 28.65 24.53 30.35 27.13 33.15 37.17 28.21 29.93 ScSR[1] 33.29 28.27 24.05 30.35 26.46 31.78 35.68 27.52 29.19 Proposed 34.25 29.62 24.76 30.94 27.32 33.21 37.26 28.92 30.26 ×3 super-resolution. From left to right: low resolution image, high resolution ground-truth, and reconstructed images by Bicubic, ScSR [1] and the proposed SCDL method. Image Girl Butterfly Fence Starfish Parthenon House Foreman Leaves Average Bicubic 31.24 23.32 20.30 25.97 24.05 28.55 32.00 21.74 25.47 ScSR[1] 31.10 23.84 20.38 26.08 24.06 28.53 32.29 21.93 25.60 Proposed 31.90 24.61 20.96 26.60 24.68 29.25 33.37 22.64 26.32 In super-resolution, we can see that our proposed method outperforms state-of- the-arts in most cases. In sketch-synthesis, compared with the final synthesis re- sults reported in Wang et al.[4], our result seems over-smoothed. Considering we simply use the averaging strategy for fusing overlapped patches, our results have a large room to improve by coupling with some post-processing techniques. R EFERENCES [1] J. Yang, J. Wright, T. Huang, and Y. Ma. Image super-resolution via sparse representation. IEEE Trans on IP, 2010. [2] X. Zhang and X. Wu. Image interpolation by adaptive 2-d autoregressive mod- eling and soft-decision estimation. IEEE Trans on IP, 2008. [3] S. Mallat and G. Yu. Super-resolution with sparse mixing estimators. IEEE Trans on IP, 2010. [4] X. Wang and X. Tang. Face photo-sketch synthesis and recognition. IEEE Trans on PAMI, pages 1955–1967, 2008. AF UTURE D IRECTION In the future study, we will adap- t SCDL to more types of cross- modality synthesis tasks and extend it to cross-style image recognition tasks. More results and the MATLAB source codes of this paper are available at http://www.comp.polyu.edu. hk/~cslzhang/SCDL.htm. S EMI - COUPLED D ICTIONARY L EARNING Training: min {D x ,D y ,W} kX - D x Λ x k 2 F + kY - D y Λ y k 2 F + γ kΛ y - x k 2 F + λ x kΛ x k 1 + λ y kΛ y k 1 + λ W kWk 2 F Synthesis: In the synthesis stage, we still minimizing E SCDL instead of directly reconstruct the target sparse codes by transforming the source sparse codes, which will find a balance between style fidelity and prediction ability. Model Selection: In real-world data, the mappings between different styles can be complex, spatial-variant and nonlin- ear. In order to improve the robustness and stability of SCDL, a model selection (clustering) procedure can be integrated into SCDL.

Transcript of S -R P -S S -C D L A I - Hong Kong Polytechnic Universitycslzhang/SCDL/SCDL_poster.pdf · SCDL...

Page 1: S -R P -S S -C D L A I - Hong Kong Polytechnic Universitycslzhang/SCDL/SCDL_poster.pdf · SCDL instead of directly reconstruct the target sparse codes by transforming the source sparse

SEMI-COUPLED DICTIONARY LEARNING WITH APPLICATIONS TO IMAGESUPER-RESOLUTION AND PHOTO-SKETCH SYNTHESIS

SHENLONG WANG, LEI ZHANG, YAN LIANG AND QUAN [email protected], [email protected], {liangyan, quanpan}@nwpu.edu.cn

PROBLEM

In many computer vision and pat-tern recognition applications, peopleoften have images of the same scenebut obtained from different sources,and consequently the conversion be-tween the images of different styles arerequired. This is a difficult problemdue to two aspects.

1. Characterizing a highly complexmapping.

2. Describing the target style. (Theoutput should look like to be inthe style)

CONTRIBUTIONS

We propose a simple yet more general model to solve the cross-style imagesynthesis problems. Our method trains a dictionary pair and a mapping functionsimultaneously. The pair of dictionaries aims to characterize the two structuraldomains of image types, and the mapping function that reveals the intrinsic rela-tionship between the two styles. Our model can be adapted to various cross-styleimage applications.In summary, our paper have two main contributions:

1. Proposed a novel coupled dictionary learning approach (Two dictionarieswill not be fully coupled, allowing much flexibility for synthesis).

2. Proposed a reliable style transform algorithm (Successful application in im-age super-resolution and photo-sketch transformation).

RESULTS

×2 super-resolution. From left to right: low resolution image, high resolutionground-truth, and reconstructed images by Bicubic, ScSR [1], SAI [2], SME [3] andthe proposed SCDL method.

Image Girl Butterfly Fence Starfish Parthenon House Foreman Leaves AverageSAI[2] 34.13 29.17 23.78 30.73 27.10 32.84 37.68 28.72 30.00SME[3] 34.03 28.65 24.53 30.35 27.13 33.15 37.17 28.21 29.93ScSR[1] 33.29 28.27 24.05 30.35 26.46 31.78 35.68 27.52 29.19

Proposed 34.25 29.62 24.76 30.94 27.32 33.21 37.26 28.92 30.26

×3 super-resolution. From left to right: low resolution image, high resolutionground-truth, and reconstructed images by Bicubic, ScSR [1] and the proposedSCDL method.

Image Girl Butterfly Fence Starfish Parthenon House Foreman Leaves AverageBicubic 31.24 23.32 20.30 25.97 24.05 28.55 32.00 21.74 25.47ScSR[1] 31.10 23.84 20.38 26.08 24.06 28.53 32.29 21.93 25.60

Proposed 31.90 24.61 20.96 26.60 24.68 29.25 33.37 22.64 26.32

In super-resolution, we can see that our proposed method outperforms state-of-the-arts in most cases. In sketch-synthesis, compared with the final synthesis re-sults reported in Wang et al.[4], our result seems over-smoothed. Considering wesimply use the averaging strategy for fusing overlapped patches, our results havea large room to improve by coupling with some post-processing techniques.

REFERENCES

[1] J. Yang, J. Wright, T. Huang, and Y. Ma. Image super-resolution via sparserepresentation. IEEE Trans on IP, 2010.

[2] X. Zhang and X. Wu. Image interpolation by adaptive 2-d autoregressive mod-eling and soft-decision estimation. IEEE Trans on IP, 2008.

[3] S. Mallat and G. Yu. Super-resolution with sparse mixing estimators. IEEETrans on IP, 2010.

[4] X. Wang and X. Tang. Face photo-sketch synthesis and recognition. IEEE Transon PAMI, pages 1955–1967, 2008.

A FUTURE DIRECTION

In the future study, we will adap-t SCDL to more types of cross-modality synthesis tasks and extendit to cross-style image recognitiontasks. More results and the MATLABsource codes of this paper are availableat http://www.comp.polyu.edu.hk/~cslzhang/SCDL.htm.

SEMI-COUPLED DICTIONARY LEARNING

• Training: min{Dx,Dy,W} ‖X−DxΛx‖2F + ‖Y −DyΛy‖2F + γ‖Λy −WΛx‖2F + λx‖Λx‖1 + λy‖Λy‖1 + λW ‖W‖2F

• Synthesis: In the synthesis stage, we still minimizing ESCDL instead of directly reconstruct the target sparse codes bytransforming the source sparse codes, which will find a balance between style fidelity and prediction ability.

• Model Selection: In real-world data, the mappings between different styles can be complex, spatial-variant and nonlin-ear. In order to improve the robustness and stability of SCDL, a model selection (clustering) procedure can be integratedinto SCDL.