Go to [[Week 2 - Introduction]] or back to the [[Main AI Page]]
Part of the page on [[Neural Networks]]
tl;dr: A CNN breaks an image down by simplifying it a whole bunch of times to try to recognise macro-features, then classifies the image based on the features it is fairly certain it has seen.
For example, if a convolution layer fuzzies the shit out of an image and can still see a ‘smile’ macro-feature, it passes that on. If another layer fuzzies the shit out of it and is fairly sure it sees an ‘eye’ macro-feature, it passes that on. If a fully-connected layer then gets a bunch of layers saying ‘smile’, ‘eye’, ‘hair’, ‘ears’, etc., it’ll probably come out with ‘face’ as the image’s classification.
Convolutional neural networks (CNNs) are multilayer neural networks that take inspiration from the animal visual cortex. CNNs are useful in applications such as image processing, video recognition, and natural language processing. A convolution is a mathematical operation in which a flipped copy of one function is slid across another and, at each offset, the overlapping values are multiplied together and summed, so the result is a mixture of the two functions.
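To make the 'mixture of two functions' idea concrete, here is a minimal sketch of a discrete 1D convolution in plain Python. The function name `convolve1d` and the example signal and kernels are made up for illustration; they are not from any particular library.

```python
def convolve1d(signal, kernel):
    """Full discrete convolution: out[n] = sum over k of signal[k] * kernel[n - k]."""
    out = [0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            # Each signal value contributes a scaled copy of the kernel,
            # shifted to the signal value's position.
            out[i + j] += s * k
    return out


# Hypothetical example data: a flat signal with a 'bump' in the middle.
signal = [0, 0, 1, 1, 1, 0, 0]

# An averaging kernel blurs the bump's edges.
smooth = convolve1d(signal, [0.5, 0.5])

# A difference kernel responds only where the signal changes.
edges = convolve1d(signal, [1, -1])

print(smooth)
print(edges)
```

The same input gives very different outputs depending on the kernel: the averaging kernel softens the bump, while the difference kernel is zero everywhere except where the bump starts and ends. In a CNN, the kernel values are learned rather than chosen by hand.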
Convolutions are good at detecting simple structures in an image, such as edges, and at putting those simple features together to construct more complex features.
In a convolutional network, this process occurs over a series of layers, each of which applies a convolution to the output of the previous layer, so later layers respond to progressively more complex features.
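The layering idea can be sketched with two hand-written 'layers' in plain Python. Everything here is an assumption for illustration: the helper `conv2d`, the tiny 4x4 image, and the two kernels are invented, and a real CNN would learn its kernels and usually add non-linearities between layers. Note that, like most deep learning libraries, this sketch slides the kernel without flipping it (strictly speaking a cross-correlation).

```python
def conv2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Hypothetical image: dark on the left, a bright block on the upper right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]

# Layer 1: a simple feature, 'dark-to-bright step between horizontal neighbours'.
edge_kernel = [[-1, 1]]
edges = conv2d(image, edge_kernel)

# Layer 2: works on layer 1's output and adds up three vertically stacked
# edge responses, i.e. a more complex feature, 'vertical line of length 3'.
line_kernel = [[1], [1], [1]]
lines = conv2d(edges, line_kernel)

print(edges)
print(lines)
```

The second layer never looks at raw pixels. It only sees where the first layer found edges, and it responds most strongly where three of those edges line up, which is the 'complex features from less complex ones' idea in miniature.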
A CNN is composed of several kinds of layers: