Complex-Valued Convolutional Neural Networks for …...specialize on local line features for...

1
Key question: How to automatically and robustly find instances of object classes in PolSAR images? Problem: Ad hoc assumptions about usefull features and pre- processing methods necessary for common classifiers. Solution: A complex-valued convolutional neural network includes feature extraction into the problem- specific optimization. It can also handle complex-valued data. Advantage: No ad hoc choices necessary. Figure 1 visualizes the activation of several exemplary neurons of the convolutional layer. As can be seen the activation differs greatly for different neurons. While some of them seem to specialize on local line features for different orientations, others try to capture intensity variations or even polarimetric information. Complex-Valued Convolutional Neural Networks for Object Detection in PolSAR Data Ronny Hänsch and Olaf Hellwich +49 30 314 21104 1. Introduction Technische Universität Berlin Computer Vision & Remote Sensing Information: www.cv.tu-berlin.de Contact: Ronny Hänsch +49 30 314 73107 [email protected] The presented preliminary results show that CC- NN can be successfully utilized to classify PolSAR data. Although the net topology with only one convolutional layer is quite simple, the classification performance increased significantly if compared to CV-MLPs. Indeed, the achieved error rates are comparable to those acquired by applying a speckle reduction technique before classifying with a CV- MLP. 5. Learned Kernels Convolutional Neural Networks (C-NNs) are hierarchical multi-layered neural networks, which are inspired by biological research results on the visual cortex of mammals. They were introduced in [3] and successfully applied in several computer vision tasks, like optical character recognition [4] or face detection [1, 2]. Kernel based feature extraction is a widely used approach in a lot of computer vision applications. In most cases it is an ad hoc decision which kernel is used to calculate descriptive features. Instead of applying pre-defined task specific feature operators before the classification process, C-NN include the feature extraction into the optimization problem. During the training stage the error is not only backpropagated within the net itself, but even into the kernel. Therefore, no fixed a priori choice for a specific kernel function is made, but the net choose those kernels, which are most suitable for the specific task. The architecture of a C-NN consists of several convolutional and subsequent sub-sampling layers. The subsampling reduces the computational effort during training and application. The one and only extension this work makes to standard C-NN is to use complex-valued weights and inputs in all layers. Therefore, complex domain specific learning has to be used. 4. Convolutional Neural Networks The aim of this paper is to investigate the general applicability of this learning scheme to classification of PolSAR data. Therefore, the number of convolutional layers was restricted to one in this study. Different numbers of neurons within the convolutional layer, as well as different numbers of hidden layers and neurons per hidden layer were investigated. Table 1 Each row of Table 1 denotes different topologies ranging from a single layer net (with six output neurons due to the six-class problem) to a net with two hidden layers each with fifty neurons. The columns mean different numbers of neurons in the single convolutional layer. As expected the test error usually decreases with more hidden neurons. A more complex topology lead to a better performance as well. However, the largest decrease of error can be achieved by using more features, meaning more neurons within the convolutional layer. The right side of the 2 nd row of Figure 1 shows a visual representation of the classification result. A CV-MLP (two hidden layers with 50 neurons per layer) evaluated on the same data set with identical performance measurements achieved an error rate of 34%. The introduction of contextual knowledge as well as the usage of learned convolutional features is able to significantly improve the performance. 7. Conclusion 3. Complex-Valued MLP 2. Abstract Detection methods for generic object categories, which are more sophisticated than pixel-wise classification, have been rarely introduced to polarimetric synthetic aperture radar (PolSAR) images. This paper provides a first investigation of Complex-Valued Convolutional Neural Networks (CC-NN) for object recognition in PolSAR data. Although an architecture with only one single convolutional layer was used, the results are already superior to those obtained by a standard complex-valued neural network. Fully-polarimetric SAR measures the amplitude and phase of the backscattered signal in four different combinations of transmitted and received polarisation. Therefore, it forms a complex-valued signal. Most approaches for object classification are designed for real-valued data and require a projection of complex-valued PolSAR data to the real domain. Those projections make the classification result dependend on the used method. Complex-Valued Multi-Layer Percep- trons (CV-MLPs) can be applied to complex- valued PolSAR data directly and circumvent those dependencies. In particular the lack of complex functions, which are bounded as well as analytical (Liouville's Theorem), make the extension of standard MLPs to CV-MLPs non-trivial. Nevertheless, several approaches were proposed, for example the usage of split-complex activation functions as the split-tanh-function: Considering rules of differential calculus in the complex domain the following rules for back- propergation can be obtained: 6. Results Figure 1: The images above show details of the classification process. The used data shown at the left side of the first row is a fully-polarimetric SAR image (E-SAR, L-Band, Oberpfaffenhofen). On the right side the reference data (no label (black), forest (dark green), low vegetation (light green), field (yellow), settlement (red), large building/industrial area (pink), street/railway (blue)) is shown. The obtained classification result can be seen on the right side of the second row. All other images are visualisations of the activation of a few exemplary convolutional neurons. [1] Garcia, C.; Delakis, M.: Convolutional Face Finder: A Neural Architecture for Fast and Robust Face Detection, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 26, No. 11, 2004. [2] Lawrence, S.; Gilles, C.; Tsai, A.; Back, A.: Face Recognition: A Convolutional Neural Network Approach, IEEE Trans. Neural Networks, Vol. 8, No. 1, pp. 98-113, 1997. [3] LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P.: Gradient-Based Learning Applied to Document Recognition, Proc. IEEE, Vol. 86, No. 11, pp. 2278-2324, 1998. [4] Wang, J.; Jean, J. Multi-Resolution Neural Networks for Omnifont Character Recognition, Proc. Int.Conf. Neural Networks, Vol. 3, pp. 1588-1593, 1993. References h z = tanh z i tanh z w ij l t 1 = w ji l t =1 P i l y j l 1 where = 2 P and i l = { err y i l y i l h i l err y i l * y i l * h i l ,l = L r =1 N l 1 r l 1 w ir l 1* y i l h i l r l 1* w ir l 1 y i l * h i l ,l L 5 10 50 6 54% ± 24% 35% ± 13% 25% ± 2% 10-6 41% ± 12% 30% ± 4% 24% ± 2% 50-6 42% ± 10% 28% ± 3% 24% ± 2% 10-10-6 38% ± 7% 31% ± 5% 26% ± 1% 50-50-6 31% ± 4% 26% ± 1% 24% ± 1%

Transcript of Complex-Valued Convolutional Neural Networks for …...specialize on local line features for...

Page 1: Complex-Valued Convolutional Neural Networks for …...specialize on local line features for different orientations, others try to capture intensity variations or even polarimetric

Key question: How to automatically and robustly find instances of object classes in PolSAR images?

Problem: Ad hoc assumptions about usefull features and pre-processing methods necessary for common classifiers.

Solution: A complex-valued convolutional neural network includes feature extraction into the problem-specific optimization. It can also handle complex-valued data.

Advantage: No ad hoc choices necessary.

Figure 1 visualizes the activation of several exemplary neurons of the convolutional layer. As can be seen the activation differs greatly for different neurons. While some of them seem to specialize on local line features for different orientations, others try to capture intensity variations or even polarimetric information.

Complex-Valued Convolutional Neural Networksfor Object Detection in PolSAR Data

Ronny Hänsch and Olaf Hellwich

+49 30 314 21104

1. Introduction

Technische Universität BerlinComputer Vision & Remote Sensing

Information: www.cv.tu-berlin.de Contact: Ronny Hänsch +49 30 314 73107 [email protected]

The presented preliminary results show that CC-NN can be successfully utilized to classify PolSAR data.

Although the net topology with only one convolutional layer is quite simple, the classification performance increased significantly if compared to CV-MLPs.

Indeed, the achieved error rates are comparable to those acquired by applying a speckle reduction technique before classifying with a CV-MLP.

5. Learned Kernels

Convolutional Neural Networks (C-NNs) are hierarchical multi-layered neural networks, which are inspired by biological research results on the visual cortex of mammals. They were introduced in [3] and successfully applied in several computer vision tasks, like optical character recognition [4] or face detection [1, 2].

Kernel based feature extraction is a widely used approach in a lot of computer vision applications. In most cases it is an ad hoc decision which kernel is used to calculate descriptive features. Instead of applying pre-defined task specific feature operators before the classification process, C-NN include the feature extraction into the optimization problem.

During the training stage the error is not only backpropagated within the net itself, but even into the kernel. Therefore, no fixed a priori choice for a specific kernel function is made, but the net choose those kernels, which are most suitable for the specific task.

The architecture of a C-NN consists of several convolutional and subsequent sub-sampling layers. The subsampling reduces the computational effort during training and application.

The one and only extension this work makes to standard C-NN is to use complex-valued weights and inputs in all layers. Therefore, complex domain specific learning has to be used.

4. Convolutional Neural Networks

The aim of this paper is to investigate the general applicability of this learning scheme to classification of PolSAR data. Therefore, the number of convolutional layers was restricted to one in this study.

Different numbers of neurons within the convolutional layer, as well as different numbers of hidden layers and neurons per hidden layer were investigated.

Table 1

Each row of Table 1 denotes different topologies ranging from a single layer net (with six output neurons due to the six-class problem) to a net with two hidden layers each with fifty neurons. The columns mean different numbers of neurons in the single convolutional layer.

As expected the test error usually decreases with more hidden neurons. A more complex topology lead to a better performance as well. However, the largest decrease of error can be achieved by using more features, meaning more neurons within the convolutional layer. The right side of the 2nd row of Figure 1 shows a visual representation of the classification result.

A CV-MLP (two hidden layers with 50 neurons per layer) evaluated on the same data set with identical performance measurements achieved an error rate of 34%.

The introduction of contextual knowledge as well as the usage of learned convolutional features is able to significantly improve the performance.

7. Conclusion

3. Complex-Valued MLP

2. Abstract

Detection methods for generic object categories, which are more sophisticated than pixel-wise classification, have been rarely introduced to polarimetric synthetic aperture radar (PolSAR) images.

This paper provides a first investigation of Complex-Valued Convolutional Neural Networks (CC-NN) for object recognition in PolSAR data. Although an architecture with only one single convolutional layer was used, the results are already superior to those obtained by a standard complex-valued neural network.

Fully-polarimetric SAR measures the amplitude and phase of the backscattered signal in four different combinations of transmitted and received polarisation. Therefore, it forms a complex-valued signal.

Most approaches for object classification are designed for real-valued data and require a projection of complex-valued PolSAR data to the real domain. Those projections make the classification result dependend on the used method. Complex-Valued Multi-Layer Percep-trons (CV-MLPs) can be applied to complex-valued PolSAR data directly and circumvent those dependencies.

In particular the lack of complex functions, which are bounded as well as analytical (Liouville's Theorem), make the extension of standard MLPs to CV-MLPs non-trivial. Nevertheless, several approaches were proposed, for example the usage of split-complex activation functions as the split-tanh-function:

Considering rules of differential calculus in the complex domain the following rules for back-propergation can be obtained:

6. Results

Figure 1: The images above show details of the classification process. The used data shown at the left side of the first row is a fully-polarimetric SAR image (E-SAR, L-Band, Oberpfaffenhofen). On the right side the reference data (no label (black), forest (dark green), low vegetation (light green), field (yellow), settlement (red), large building/industrial area (pink), street/railway (blue)) is shown. The obtained classification result can be seen on the right side of the second row. All other images are visualisations of the activation of a few exemplary convolutional neurons.

[1] Garcia, C.; Delakis, M.: Convolutional Face Finder: A Neural Architecture for Fast and Robust Face Detection, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 26, No. 11, 2004.[2] Lawrence, S.; Gilles, C.; Tsai, A.; Back, A.: Face Recognition: A Convolutional Neural Network Approach, IEEE Trans. Neural Networks, Vol. 8, No. 1, pp. 98-113, 1997.[3] LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P.: Gradient-Based Learning Applied to Document Recognition, Proc. IEEE, Vol. 86, No. 11, pp. 2278-2324, 1998.[4] Wang, J.; Jean, J. Multi-Resolution Neural Networks for Omnifont Character Recognition, Proc. Int.Conf. Neural Networks, Vol. 3, pp. 1588-1593, 1993.

References

h z = tanh ℜ z i tanh ℑ z

w ijl t1=w ji

l t −∑=1

P

il⋅y j

l−1

where=2⋅P

and

il={∂err∂ y i

l ⋅∂ y i

l

∂ hil

∂err∂ y i

l* ⋅∂ y i

l*

∂ hil , l=L

∑r=1

N l1 rl1w irl1*∂ y il∂ hil r

l1* wirl1∂ y i

l*

∂ h il , lL

5 10 50

6 54% ± 24% 35% ± 13% 25% ± 2%

10-6 41% ± 12% 30% ± 4% 24% ± 2%

50-6 42% ± 10% 28% ± 3% 24% ± 2%

10-10-6 38% ± 7% 31% ± 5% 26% ± 1%

50-50-6 31% ± 4% 26% ± 1% 24% ± 1%