Conv. Network for Segmentation of Mixed Crops




1 Aarhus University, Department of Agroecology

2 University of Southern Denmark, Maersk Mc-Kinney Moller Institute

3 Aarhus University, Department of Engineering

A. K. Mortensen¹, M. Dyrmann², H. Karstoft³, R. N. Jørgensen³ and R. Gislum¹

5. References:

[1] Simonyan, K., & Zisserman, A. (2015). Very Deep Convolutional Networks for Large-Scale Image Recognition. Intl. Conf. on Learning Representations (ICLR), 1–14.

[2] Long, J., Shelhamer, E., & Darrell, T. (2015). Fully Convolutional Networks for Semantic Segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 3431–3440.

[3] Sukhbaatar, S., Bruna, J., Paluri, M., Bourdev, L., & Fergus, R. (2014). Training Convolutional Networks with Noisy Labels, 1–10. Retrieved from http://arxiv.org/abs/1406.2080

1. Problem:

· What?
  · Pixel-wise classification of top-view images of mixed crops
· Why?
  · Yield estimation
  · Nitrogen-uptake estimation
  · Variable nitrogen application to fields
  · Optimal harvest time

2. Method:

· Fine-tuning of a pre-trained deep convolutional neural network
  · Based on VGG-16D for image classification [1]
  · Adapted to pixel-wise classification [2]:
    · Convert fully connected layers to convolutional layers
    · Insert a deconvolutional layer
  · First, fine-tuned on PASCAL-Context [2]
  · Then, fine-tuned on own data
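The fully-connected-to-convolutional conversion above can be checked numerically: an FC layer applied to a flattened feature map gives exactly the same output as a convolution whose kernels span the whole map, which is what lets the network accept inputs of any size. A minimal NumPy sketch (illustrative only, not the authors' implementation; toy sizes stand in for VGG's 512 x 7 x 7 → 4096 layers):

```python
import numpy as np

# Illustrative check of the FCN conversion [2]: a fully connected layer
# acting on a flattened C x H x W feature map equals a convolution whose
# kernels cover the entire map, evaluated at a single position.
rng = np.random.default_rng(0)
C, H, W, OUT = 3, 4, 4, 5            # toy sizes (assumed for illustration)
x = rng.standard_normal((C, H, W))   # feature map from the conv stack
fc_weights = rng.standard_normal((OUT, C * H * W))

fc_out = fc_weights @ x.ravel()      # ordinary fully connected layer

kernels = fc_weights.reshape(OUT, C, H, W)   # same weights, viewed as kernels
conv_out = np.array([(k * x).sum() for k in kernels])  # conv at one position

assert np.allclose(fc_out, conv_out)
```

On larger inputs the convolutional form simply slides these kernels, producing a coarse score map instead of a single vector.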

· Data set
  · 48 images
  · 400 x 400 pixels
  · 75% used for training
  · 7 classes
    · Unknown, Radish, Barley/Grass, Weed, Stump, Soil, Equipment

· Data augmentation to increase the data set
  · Flip left/right, flip up/down and transpose
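Combining the three transforms above yields the eight symmetries of a square image, so each labelled image becomes eight training samples; the same transforms must be applied to the label masks so pixels and labels stay aligned. A sketch of the scheme (illustrative, not the authors' code):

```python
import numpy as np

# Generate the 8 square symmetries used for augmentation: original and
# transpose, each also flipped left/right, up/down, and both.
def augment(img):
    variants = []
    for base in (img, img.T):                    # original and transpose
        variants.append(base)
        variants.append(np.fliplr(base))         # flip left/right
        variants.append(np.flipud(base))         # flip up/down
        variants.append(np.flipud(np.fliplr(base)))  # both flips
    return variants

img = np.arange(16).reshape(4, 4)
aug = augment(img)
# All 8 variants are distinct for a generic (asymmetric) image.
assert len({a.tobytes() for a in aug}) == 8
```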

· Learning parameters
  · Learning rate: 10⁻⁹
  · Per-pixel learning rate: ~1.6 × 10⁻⁴
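The two rates are presumably linked by the image size: a global rate applied to a loss summed over all 400 x 400 pixels corresponds to an effective per-pixel rate larger by the pixel count. A quick check of that reading (my inference; the poster lists both values without stating the relation):

```python
# Per-pixel rate as the global rate scaled by pixels per image
# (inferred relation, not stated explicitly on the poster).
global_lr = 1e-9
pixels = 400 * 400              # 160,000 pixels per image
per_pixel_lr = global_lr * pixels
assert abs(per_pixel_lr - 1.6e-4) < 1e-12
```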

[Figure: the eight augmentation variants: Original; flip left/right; flip up/down; flip l/r + u/d; transpose; transpose + flip l/r; transpose + flip u/d; transpose + flip l/r + u/d]

[Figure: network architecture. VGG-16D convolutional stack: 64, 64 (3 x 3 kernels), 2 x 2 max pool (stride 2); 128, 128 (3 x 3), pool; 256, 256, 256 (3 x 3), pool; 512, 512, 512 (3 x 3), pool; 512, 512, 512 (3 x 3), pool; converted fully connected layers as 4096, 4096 and 1000 1 x 1 kernels; final deconvolutional layer with 7 64 x 64 kernels, stride 32.]

3. Results:
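The stride-32 deconvolution mirrors the cumulative downsampling of the five pooling stages, which halve each spatial dimension five times (2⁵ = 32). A quick size check, assuming floor division at each pool (exact border sizes depend on padding, which the poster does not state):

```python
# Five 2x2/stride-2 max pools shrink each dimension by 2**5 = 32, so a
# stride-32 transposed convolution brings the coarse score map back near
# input resolution (FCNs crop/pad to match exactly [2]).
size = 400                      # input images are 400 x 400 pixels
for _ in range(5):              # five pooling stages in VGG-16
    size //= 2                  # each pool halves the spatial size
assert size == 12               # 400 -> 200 -> 100 -> 50 -> 25 -> 12
upsampled = size * 32           # stride-32 deconv scales back up by 32
assert upsampled == 384         # close to 400; the remainder is cropped/padded
```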

· Pixel-wise classification:
  · Pixel accuracy: 79%
  · Frequency-weighted IoU: 66%
· Ensemble classification:
  · Pixel accuracy: 80%
  · Frequency-weighted IoU: 67%
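Both reported metrics can be computed from a per-class confusion matrix; a minimal NumPy sketch of the standard definitions (following the metric definitions in the FCN paper [2], not the authors' evaluation code):

```python
import numpy as np

# Pixel accuracy and frequency-weighted IoU from a confusion matrix C,
# where C[i, j] counts pixels of true class i predicted as class j.
def pixel_accuracy(C):
    return np.trace(C) / C.sum()

def frequency_weighted_iou(C):
    true_per_class = C.sum(axis=1)          # pixels of each true class
    pred_per_class = C.sum(axis=0)          # pixels predicted as each class
    tp = np.diag(C)                         # correctly classified pixels
    union = true_per_class + pred_per_class - tp
    iou = tp / np.maximum(union, 1)         # per-class intersection over union
    freq = true_per_class / C.sum()         # class frequency weights
    return (freq * iou).sum()

# Toy 2-class example: 80 + 15 of 100 pixels correct.
C = np.array([[80, 5],
              [0, 15]])
assert pixel_accuracy(C) == 0.95
```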

4. Discussion and conclusion:

· Works relatively well, but only tested on a small labelled data set
· Large unlabelled data set + small labelled data set → semi-supervised learning?
  · How?
    · Auto-encoders?
    · Learning a noise model? [3]

[Figure: noise-model architecture. Source: Sukhbaatar et al. 2014 [3]]
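The noise-model idea from [3] adds a linear layer that passes the network's clean-label predictions through a label-confusion matrix Q, so training can use noisy labels while Q absorbs the systematic labelling noise. A minimal NumPy illustration of that forward pass (toy numbers of my choosing, not from the paper):

```python
import numpy as np

# Linear noise layer sketch after Sukhbaatar et al. [3]: the network
# predicts a distribution p over true classes; a row-stochastic matrix Q,
# with Q[i, j] = P(noisy label j | true label i), maps it to the
# distribution over observed (noisy) labels.
p = np.array([0.7, 0.2, 0.1])          # softmax output over 3 true classes
Q = np.array([[0.8, 0.1, 0.1],         # e.g. each class mislabelled 20% of the time
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])
noisy = p @ Q                          # distribution over noisy labels
assert np.isclose(noisy.sum(), 1.0)    # still a valid distribution
```

At test time the noise layer is dropped and the clean predictions p are used directly.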