Convolutional Neural Network

Why CNN for Image: some patterns are much smaller than the whole image; the same patterns appear in different regions; subsampling the pixels will not change the object.

1. Convolutional Neural Network
Hung-yi Lee
Can the network be simplified by considering the properties of images?

2. Why CNN for Image
• Some patterns are much smaller than the whole image.
A neuron does not have to see the whole image to discover the pattern. It can connect to a small region, which takes fewer parameters.
Example: a "beak" detector.

3. Why CNN for Image
• The same patterns appear in different regions.
An "upper-left beak" detector and a "middle beak" detector do almost the same thing, so they can use the same set of parameters.

4. Why CNN for Image
• Subsampling the pixels will not change the object.
A subsampled image of a bird is still a bird. We can subsample the pixels to make the image smaller, so the network needs fewer parameters to process it.

5. The whole CNN
Image -> Convolution -> Max Pooling -> Convolution -> Max Pooling -> Flatten -> Fully Connected Feedforward network -> cat, dog, ……
The Convolution and Max Pooling steps can repeat many times.

6. The whole CNN
Property 1 ➢ Some patterns are much smaller than the whole image: Convolution
Property 2 ➢ The same patterns appear in different regions: Convolution
Property 3 ➢ Subsampling the pixels will not change the object: Max Pooling
Convolution and Max Pooling can repeat many times, followed by Flatten.

8. CNN – Convolution
6 x 6 image:
    1 0 0 0 0 1
    0 1 0 0 1 0
    0 0 1 1 0 0
    1 0 0 0 1 0
    0 1 0 0 1 0
    0 0 1 0 1 0
Filter 1 (matrix):
     1 -1 -1
    -1  1 -1
    -1 -1  1
Filter 2 (matrix):
    -1  1 -1
    -1  1 -1
    -1  1 -1
……
The filter values are the network parameters to be learned.
Property 1: each filter detects a small pattern (3 x 3).

9. CNN – Convolution
Filter 1, stride=1. Placing the filter at the upper-left corner of the 6 x 6 image gives 3; moving it one pixel to the right gives -1.

10. CNN – Convolution
Filter 1, if stride=2. The filter moves two pixels at a time: the upper-left position of the 6 x 6 image gives 3 and the next position gives -3.
We set stride=1 below.

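11. CNN – Convolution
Filter 1, stride=1, applied over the whole 6 x 6 image gives a 4 x 4 output:
     3 -1 -3 -1
    -3  1  0 -3
    -3 -3  0  1
     3 -2 -2 -1
Property 2: the same filter is used at every position, so the same pattern is detected in different regions (here the value 3 appears at both the upper-left and the lower-left).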
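
The sliding computation on slides 9 to 11 can be sketched in NumPy as follows. This is a minimal illustration, not the lecture's own code: the helper name `conv2d` is made up for this example, and it assumes no padding and, as on the slides, no flipping of the filter.

```python
import numpy as np

# 6 x 6 image and Filter 1, copied from the slides.
image = np.array([
    [1, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 1, 0, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 0, 1, 0],
])
filter1 = np.array([
    [ 1, -1, -1],
    [-1,  1, -1],
    [-1, -1,  1],
])

def conv2d(img, kernel, stride=1):
    """Slide `kernel` over `img` (no padding) and return the feature map.

    `conv2d` is a hypothetical helper written for this sketch.
    """
    kh, kw = kernel.shape
    out_h = (img.shape[0] - kh) // stride + 1
    out_w = (img.shape[1] - kw) // stride + 1
    out = np.zeros((out_h, out_w), dtype=img.dtype)
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            patch = img[r:r + kh, c:c + kw]
            # Element-wise product of patch and filter, then sum.
            out[i, j] = np.sum(patch * kernel)
    return out

feature_map1 = conv2d(image, filter1, stride=1)     # 4 x 4, as on slide 11
feature_map1_s2 = conv2d(image, filter1, stride=2)  # 2 x 2, first row is 3, -3
print(feature_map1)
print(feature_map1_s2)
```

With stride=1 the result reproduces the 4 x 4 output on slide 11; with stride=2 the first row is 3, -3, matching slide 10.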

12. CNN – Convolution
Filter 2, stride=1. Do the same process for every filter.
Output of Filter 1:
     3 -1 -3 -1
    -3  1  0 -3
    -3 -3  0  1
     3 -2 -2 -1
Output of Filter 2:
    -1 -1 -1 -1
    -1 -1 -2  1
    -1 -1 -2  1
    -1  0 -4  3
Each filter turns the 6 x 6 image into a 4 x 4 image; stacked together they form the Feature Map.
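
"Do the same process for every filter" can also be written without explicit loops. The sketch below is one possible vectorized form, assuming NumPy 1.20 or later for `sliding_window_view`; the variable names are chosen for this example only.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

image = np.array([
    [1, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 1, 0, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 0, 1, 0],
])
filters = np.array([
    [[ 1, -1, -1], [-1, 1, -1], [-1, -1,  1]],  # Filter 1
    [[-1,  1, -1], [-1, 1, -1], [-1,  1, -1]],  # Filter 2
])

# windows[i, j] is the 3 x 3 patch whose top-left corner is at (i, j).
windows = sliding_window_view(image, (3, 3))     # shape (4, 4, 3, 3)

# For every filter f and position (i, j): sum over the patch of patch * filter.
feature_maps = np.einsum("ijkl,fkl->fij", windows, filters)

print(feature_maps.shape)  # (2, 4, 4): one 4 x 4 image per filter
print(feature_maps[1])     # output of Filter 2
```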

13. CNN – Colorful image
[Figure: a colorful image drawn as a stack of 6 x 6 matrices, one per color channel, with Filter 1 and Filter 2 each drawn as a matching stack of 3 x 3 matrices.]
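
For a color image the filter has one 3 x 3 slice per channel, and the products are summed over channels as well as over positions. The sketch below uses made-up data: the three-channel image is just the 6 x 6 example copied into R, G and B, and the filter is Filter 1 copied three times. Both are assumptions for illustration, not values from the slide.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

gray = np.array([
    [1, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 1, 0, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 0, 1, 0],
])
plane = np.array([
    [ 1, -1, -1],
    [-1,  1, -1],
    [-1, -1,  1],
])

# Hypothetical 3-channel image and filter (same pattern in every channel).
rgb = np.stack([gray, gray, gray])            # shape (3, 6, 6)
filter_rgb = np.stack([plane, plane, plane])  # shape (3, 3, 3)

# One 3 x 3 window per channel and position: shape (3, 4, 4, 3, 3).
windows = sliding_window_view(rgb, (3, 3), axis=(1, 2))

# Sum over channels c and window offsets (k, l): one number per position.
feature_map = np.einsum("cijkl,ckl->ij", windows, filter_rgb)

print(feature_map.shape)  # (4, 4)
print(filter_rgb.size)    # 27 weights in one filter for a 3-channel image
```

Even with three input channels, each filter still produces a single 4 x 4 output.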

14. Convolution v.s. Fully Connected
[Figure: top, the 6 x 6 image is convolved with Filter 1 and Filter 2. Bottom, the same image is flattened into inputs x1, x2, ……, x36 and fed to a fully-connected layer.]

15. Filter 1 applied at the upper-left corner of the 6 x 6 image gives the output 3. With the image flattened into inputs 1 to 36 (1: 1, 2: 0, 3: 0, 4: 0, …, 7: 0, 8: 1, 9: 0, 10: 0, …, 13: 0, 14: 0, 15: 1, 16: 1, …), this output neuron connects only to inputs 1, 2, 3, 7, 8, 9, 13, 14 and 15.
Fewer parameters! The neuron connects to only 9 inputs, not fully connected.

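16. Filter 1 gives 3 at the upper-left corner of the 6 x 6 image and -1 one pixel to the right. With the image flattened into inputs 1 to 36, the neuron for 3 connects to inputs 1, 2, 3, 7, 8, 9, 13, 14 and 15, and the neuron for -1 connects to inputs 2, 3, 4, 8, 9, 10, 14, 15 and 16.
Fewer parameters! The two neurons also use shared weights (the same 9 filter values): even fewer parameters!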
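
The view of convolution as a sparsely connected layer with shared weights can be checked numerically. The sketch below builds a hypothetical 16 x 36 weight matrix `W` in which every row contains the same 9 values of Filter 1 and zeros elsewhere; the construction is an illustration written for this note, not code from the lecture.

```python
import numpy as np

image = np.array([
    [1, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 1, 0, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 0, 1, 0],
])
filter1 = np.array([
    [ 1, -1, -1],
    [-1,  1, -1],
    [-1, -1,  1],
])

x = image.reshape(-1)              # inputs x1 ... x36, row by row
W = np.zeros((16, 36), dtype=int)  # one row per output neuron

for i in range(4):                 # output row
    for j in range(4):             # output column
        neuron = i * 4 + j
        for a in range(3):
            for b in range(3):
                # The same 9 filter values are reused in every row of W.
                W[neuron, (i + a) * 6 + (j + b)] = filter1[a, b]

outputs = W @ x                    # an ordinary fully-connected product
print(outputs.reshape(4, 4))       # same 4 x 4 output as the convolution
print(np.flatnonzero(W[0]) + 1)    # [ 1  2  3  7  8  9 13 14 15]
print(np.flatnonzero(W[1]) + 1)    # [ 2  3  4  8  9 10 14 15 16]
```

A dense layer of this size would have 16 x 36 = 576 independent weights; here only 9 distinct values are learned.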

18. CNN – Max Pooling
Filter 1:
     1 -1 -1
    -1  1 -1
    -1 -1  1
Output of Filter 1:
     3 -1 -3 -1
    -3  1  0 -3
    -3 -3  0  1
     3 -2 -2 -1
Filter 2:
    -1  1 -1
    -1  1 -1
    -1  1 -1
Output of Filter 2:
    -1 -1 -1 -1
    -1 -1 -2  1
    -1 -1 -2  1
    -1  0 -4  3

19. CNN – Max Pooling
6 x 6 image -> Conv -> Max Pooling -> 2 x 2 image: a new image, but smaller.
Channel from Filter 1:
    3 0
    3 1
Channel from Filter 2:
    -1 1
     0 3
Each filter is a channel.
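
The 2 x 2 channels above are consistent with taking the maximum of each non-overlapping 2 x 2 block of the two 4 x 4 convolution outputs. A minimal NumPy sketch of that step follows; the helper name `max_pool_2x2` is made up for this example, and it assumes the height and width are even.

```python
import numpy as np

# The two 4 x 4 convolution outputs from the slides, stacked as channels.
feature_maps = np.array([
    [[ 3, -1, -3, -1],
     [-3,  1,  0, -3],
     [-3, -3,  0,  1],
     [ 3, -2, -2, -1]],
    [[-1, -1, -1, -1],
     [-1, -1, -2,  1],
     [-1, -1, -2,  1],
     [-1,  0, -4,  3]],
])

def max_pool_2x2(maps):
    """Non-overlapping 2 x 2 max pooling over the last two axes.

    Hypothetical helper; assumes height and width are both even.
    """
    c, h, w = maps.shape
    # Split each axis into (block index, offset inside the block).
    blocks = maps.reshape(c, h // 2, 2, w // 2, 2)
    return blocks.max(axis=(2, 4))

pooled = max_pool_2x2(feature_maps)
print(pooled.shape)  # (2, 2, 2): two channels, each a 2 x 2 image
print(pooled)
```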

20. The whole CNN
Convolution -> Max Pooling (can repeat many times) produces a new image:
    3 0     -1 1
    3 1      0 3
It is smaller than the original image, and the number of channels is the number of filters.

21. The whole CNN
Image -> Convolution -> Max Pooling (a new image) -> Convolution -> Max Pooling (a new image) -> Flatten -> Fully Connected Feedforward network -> cat, dog, ……

22. Flatten
The two 2 x 2 channels after max pooling (3 0 / 3 1 and -1 1 / 0 3) are flattened into one vector of 8 values, which is the input to the Fully Connected Feedforward network.
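
Flattening is only a reshape. The sketch below flattens the two pooled channels and passes the vector through one dense layer whose weights are random placeholders (an assumption for illustration; a trained network would learn them). The element order shown is NumPy's channel-by-channel order and may differ from the order drawn on the slide.

```python
import numpy as np

# Pooled output: 2 channels, each 2 x 2.
pooled = np.array([
    [[ 3, 0],
     [ 3, 1]],
    [[-1, 1],
     [ 0, 3]],
])

flat = pooled.reshape(-1)  # vector of 2 * 2 * 2 = 8 values
print(flat)                # [ 3  0  3  1 -1  1  0  3]

# Hypothetical dense layer with 2 outputs (e.g. "cat" and "dog").
rng = np.random.default_rng(0)
W = rng.normal(size=(2, 8))  # placeholder weights, not trained
b = np.zeros(2)
scores = W @ flat + b
print(scores.shape)          # (2,)
```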

23. CNN in Keras
Only modified the network structure and input format (vector -> 3-D tensor).
input -> Convolution -> Max Pooling -> Convolution -> Max Pooling
Convolution: there are 25 3x3 filters.
Input_shape = (28, 28, 1): 28 x 28 pixels; the last value is 1 for black/white, 3 for RGB.
Max Pooling: a 2 x 2 block (3 -1 / -3 1) is reduced to its maximum, 3.

24. CNN in Keras
input: 1 x 28 x 28
Convolution: 25 x 26 x 26. How many parameters for each filter? 9 (3 x 3 weights on 1 input channel)
Max Pooling: 25 x 13 x 13
Convolution: 50 x 11 x 11. How many parameters for each filter? 225 (3 x 3 weights on each of 25 input channels)
Max Pooling: 50 x 5 x 5

25. CNN in Keras
input (1 x 28 x 28) -> Convolution (25 x 26 x 26) -> Max Pooling (25 x 13 x 13) -> Convolution (50 x 11 x 11) -> Max Pooling (50 x 5 x 5) -> Flatten (1250) -> Fully Connected Feedforward network -> output
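
The shapes and per-filter parameter counts on slides 24 and 25 follow from simple arithmetic. The sketch below reproduces them in plain Python rather than Keras; the helper names are made up for this example, and it assumes 3 x 3 filters, stride 1, no padding, 2 x 2 pooling, and counts weights only (no bias).

```python
def conv_output(shape, num_filters, k=3):
    """Shape (channels, height, width) after a k x k convolution, stride 1, no padding."""
    _, h, w = shape
    return (num_filters, h - k + 1, w - k + 1)

def weights_per_filter(shape, k=3):
    """Each filter has k x k weights for every input channel (bias not counted)."""
    c, _, _ = shape
    return c * k * k

def pool_output(shape, p=2):
    """Shape after non-overlapping p x p max pooling (remainder rows/columns dropped)."""
    c, h, w = shape
    return (c, h // p, w // p)

shape = (1, 28, 28)                # input: 1 x 28 x 28
trace = [("input", shape)]
params = []
for num_filters in (25, 50):       # two Convolution + Max Pooling stages
    params.append(weights_per_filter(shape))
    shape = conv_output(shape, num_filters)
    trace.append(("convolution", shape))
    shape = pool_output(shape)
    trace.append(("max pooling", shape))

flat_size = shape[0] * shape[1] * shape[2]

for name, s in trace:
    print(name, s)
print(params)     # [9, 225]
print(flat_size)  # 1250
```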

26. Live Demo

27. What does the machine learn?
Sneakers / cougar (puma)
http://newsneakernews.wpengine.netdna-cdn.com/wp-content/uploads/2016/11/rihanna-puma-creeper-velvet-release-date-02.jpg

28. First Convolution Layer
• Typical-looking filters on the trained first layer: 11 x 11 (AlexNet)
http://cs231n.github.io/understanding-cnn/

29. How about higher layers?
• Which images make a specific neuron activate?
Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik, "Rich feature hierarchies for accurate object detection and semantic segmentation", CVPR, 2014