I was looking into writing a blog post. The GitHub blog post was a bit confusing, but I'll try to do it this week if I can figure out how to write one.
Traditional ML pipeline - Gather labelled training and validation data for bird images → train a CNN on the training set and evaluate it on the validation set → use the trained model on test/production data.
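Roughly what that pipeline looks like in fastai (the folder name, image filename, and epoch count below are placeholders I made up, not anything from the course):

```python
from fastai.vision.all import *

# Hypothetical folder layout: birds/train and birds/valid, one subfolder per label.
path = Path('birds')
dls = ImageDataLoaders.from_folder(path, train='train', valid='valid',
                                    item_tfms=Resize(224))

# Train a CNN on the training set; validation metrics are reported each epoch.
learn = vision_learner(dls, resnet18, metrics=accuracy)
learn.fine_tune(3)

# Use the trained model on a new (test/production) image.
pred, _, probs = learn.predict(PILImage.create('some_new_bird.jpg'))
```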
So that the variance stays close to 1 and the mean close to 0, in order to avoid the problem of vanishing and exploding gradients.
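A quick illustration of that shift/scale in PyTorch (the raw values here are just a stand-in for pixel data):

```python
import torch

x = torch.rand(1000) * 255          # e.g. raw pixel-like values in [0, 255]
x_norm = (x - x.mean()) / x.std()   # shift and scale so mean ≈ 0, std ≈ 1

print(x.mean().item(), x.std().item())            # roughly 127.5 and 73.6
print(x_norm.mean().item(), x_norm.std().item())  # roughly 0.0 and 1.0
```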
So that every slice along dim sums to 1.
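E.g. with torch.softmax:

```python
import torch

t = torch.randn(2, 3)
p = torch.softmax(t, dim=1)   # normalize along dim=1, i.e. across each row

print(p.sum(dim=1))           # tensor([1.0000, 1.0000]) – each row sums to 1
```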
It changes the shape to 3 rows and automatically calculates the size of the other dimension. E.g. say you have a tensor of shape (3, 4); then (2, -1) will change the shape to (2, 6).
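A quick check in PyTorch:

```python
import torch

t = torch.arange(12).view(3, 4)   # shape (3, 4), 12 elements in total
print(t.view(2, -1).shape)        # torch.Size([2, 6]) – the -1 is inferred as 12 // 2
print(t.view(3, -1).shape)        # torch.Size([3, 4]) – 3 rows, the rest inferred
```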
Gather data → apply transforms/preprocessing → apply various filters to extract edges and other features → use a model like SVM → evaluate.
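Here's a rough sketch of that classical pipeline, using scikit-learn's built-in digits dataset as a stand-in and Sobel edge responses as the hand-crafted features (just an illustration of the idea, not the exact setup discussed above):

```python
import numpy as np
from scipy.ndimage import sobel
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Gather data (a small built-in image dataset, used here purely for illustration).
digits = load_digits()
images, labels = digits.images, digits.target

# Hand-crafted feature extraction: Sobel edge magnitudes, flattened into vectors.
def edge_features(img):
    gx = sobel(img, axis=0)
    gy = sobel(img, axis=1)
    return np.hypot(gx, gy).ravel()

X = np.array([edge_features(img) for img in images])
X_train, X_test, y_train, y_test = train_test_split(X, labels, random_state=0)

# Classical model (SVM) trained on the hand-designed features, then evaluated.
clf = SVC().fit(X_train, y_train)
print(accuracy_score(y_test, clf.predict(X_test)))
```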
The activation function is roughly linear near the centre, so it is most sensitive to values close to the centre: small changes in the image change the result by quite a bit there. For images we are sure of (the ones at the extreme ends), some changes in the pixels don't affect the output much. But for images we are not very sure of, small changes lead to significant changes in our prediction.
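You can see this by looking at the sigmoid's slope at different inputs (a small PyTorch check; the specific input values are arbitrary):

```python
import torch

# Sigmoid is steepest (most sensitive) near the centre and nearly flat at the extremes.
for x in [-6.0, -2.0, 0.0, 2.0, 6.0]:
    t = torch.tensor(x, requires_grad=True)
    y = torch.sigmoid(t)
    y.backward()
    print(f"x = {x:+.1f}  sigmoid = {y.item():.3f}  slope = {t.grad.item():.4f}")
# The slope peaks at 0.25 at x = 0 and is almost 0 at ±6, so small input changes
# matter most for the "unsure" (mid-range) activations.
```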
A color picture would be a combination of RGB values to signify the value of a pixel. I'm having a hard time visualizing how on and off for a pixel at a certain location maps to an RGB value for color. Sorry, I'm a noob at this, but I love the classes so far and am going to go back and start from chapter 1.
The shape identifier, as you call it, is not hand-designed or hard-coded by us. We simply throw filters at our input (training) data points, and these filters learn the inherent signals corresponding to a label, e.g. what makes a dog a dog.
We don’t choose or decide which filter will do what. That is decided by the network during training. Some filters become edge detectors, some become corner detectors, some become associated with colors, and so on. But remember that we cannot control which does which.
This was done in pre-deep-learning computer vision, where feature extractors such as edge detectors were hard-coded by humans; the Sobel kernel, for example, is a good edge detector.
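For example, applying a hard-coded Sobel kernel with a plain convolution (the 28×28 random image is just a stand-in):

```python
import torch
import torch.nn.functional as F

# Sobel x-kernel: responds to horizontal intensity changes (vertical edges).
# Hard-coded by hand, not learned.
sobel_x = torch.tensor([[-1., 0., 1.],
                        [-2., 0., 2.],
                        [-1., 0., 1.]]).view(1, 1, 3, 3)

img = torch.rand(1, 1, 28, 28)            # stand-in greyscale image
edges = F.conv2d(img, sobel_x, padding=1)
print(edges.shape)                        # torch.Size([1, 1, 28, 28])
```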
You can certainly change and control the size of a kernel, but you cannot control which filter will do what.
The earlier layers usually become low-level feature extractors (such as corner detection), while the layers toward the end become high-level feature extractors (such as recognizing a face).
During inference, the filters related to the detected features get activated, and the others get turned off.
Because you can turn things off when needed, and still let (scaled) negative values through when needed (Leaky ReLU).
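A tiny comparison (the negative_slope value here is arbitrary):

```python
import torch
import torch.nn.functional as F

x = torch.tensor([-2.0, -0.5, 0.0, 1.0, 3.0])
print(F.relu(x))                            # negatives turned off: [0., 0., 0., 1., 3.]
print(F.leaky_relu(x, negative_slope=0.1))  # negatives scaled down: [-0.2, -0.05, 0., 1., 3.]
```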
No. But we can see what each filter does: what a filter does depends entirely on its weights, and which filter ends up learning which feature is essentially arbitrary.
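If you want to peek at them, you can pull the learned weights out of a trained network, e.g. with a pretrained resnet18 (assuming a recent torchvision where these weights are available):

```python
from torchvision.models import resnet18

# Look at the learned first-layer filters of a pretrained network.
model = resnet18(weights='IMAGENET1K_V1')
filters = model.conv1.weight.detach()   # shape (64, 3, 7, 7): 64 filters, RGB input, 7x7
print(filters.shape)

# Each 3x7x7 slice is one filter; plotting them shows the edge/colour detectors
# the network arrived at on its own during training.
```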
On and off pixels as such only make sense when you have a greyscale bitmap image.
But we usually deal with JPEGs or PNGs, which have three layers. The R layer contains only information about red values; no green or blue values are of concern in that layer. The same goes for G and B.
These three layers, superimposed together, form an image.
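For example (the filename is made up; any RGB image works):

```python
import numpy as np
from PIL import Image

img = np.array(Image.open('photo.jpg').convert('RGB'))
print(img.shape)                                   # (height, width, 3)

r, g, b = img[..., 0], img[..., 1], img[..., 2]    # the three superimposed layers
print(r.shape, g.shape, b.shape)                   # each one is (height, width)
```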