I was looking into writing a blog post. The GitHub blog post was a bit confusing, but I'll try to do it this week if I can figure out how to write one.
Traditional ML pipeline - Gather labelled training and validation data for bird images → train a CNN on the training set and evaluate it on the validation set → use the trained model on test/production data.
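Roughly what that pipeline looks like in fastai (the folder name, image filename, and epoch count below are placeholders I made up, not anything from the course):

```python
from fastai.vision.all import *

# Hypothetical folder layout: birds/train and birds/valid, one subfolder per label.
path = Path('birds')
dls = ImageDataLoaders.from_folder(path, train='train', valid='valid',
                                    item_tfms=Resize(224))

# Train a CNN on the training set; validation metrics are reported each epoch.
learn = vision_learner(dls, resnet18, metrics=accuracy)
learn.fine_tune(3)

# Use the trained model on a new (test/production) image.
pred, _, probs = learn.predict(PILImage.create('some_new_bird.jpg'))
```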
So that the variance stays close to 1 and the mean close to 0, in order to avoid the problem of vanishing and exploding gradients.
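A quick illustration of that shift/scale in PyTorch (the raw values here are just a stand-in for pixel data):

```python
import torch

x = torch.rand(1000) * 255          # e.g. raw pixel-like values in [0, 255]
x_norm = (x - x.mean()) / x.std()   # shift and scale so mean ≈ 0, std ≈ 1

print(x.mean().item(), x.std().item())            # roughly 127.5 and 73.6
print(x_norm.mean().item(), x_norm.std().item())  # roughly 0.0 and 1.0
```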
So that every slice along dim sums to 1.
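E.g. with torch.softmax:

```python
import torch

t = torch.randn(2, 3)
p = torch.softmax(t, dim=1)   # normalize along dim=1, i.e. across each row

print(p.sum(dim=1))           # tensor([1.0000, 1.0000]) – each row sums to 1
```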
It changes the shape to 3 rows and automatically calculates the size of the other dimension. E.g. say you have a tensor of shape (3, 4); then (2, -1) will change the shape to (2, 6).
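A quick check in PyTorch:

```python
import torch

t = torch.arange(12).view(3, 4)   # shape (3, 4), 12 elements in total
print(t.view(2, -1).shape)        # torch.Size([2, 6]) – the -1 is inferred as 12 // 2
print(t.view(3, -1).shape)        # torch.Size([3, 4]) – 3 rows, the rest inferred
```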
Gather data → apply transforms/preprocessing → apply various filters to extract edges and other features → use a model like SVM → evaluate.
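Here's a rough sketch of that classical pipeline, using scikit-learn's built-in digits dataset as a stand-in and Sobel edge responses as the hand-crafted features (just an illustration of the idea, not the exact setup discussed above):

```python
import numpy as np
from scipy.ndimage import sobel
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Gather data (a small built-in image dataset, used here purely for illustration).
digits = load_digits()
images, labels = digits.images, digits.target

# Hand-crafted feature extraction: Sobel edge magnitudes, flattened into vectors.
def edge_features(img):
    gx = sobel(img, axis=0)
    gy = sobel(img, axis=1)
    return np.hypot(gx, gy).ravel()

X = np.array([edge_features(img) for img in images])
X_train, X_test, y_train, y_test = train_test_split(X, labels, random_state=0)

# Classical model (SVM) trained on the hand-designed features, then evaluated.
clf = SVC().fit(X_train, y_train)
print(accuracy_score(y_test, clf.predict(X_test)))
```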
The activation function is roughly linear near the centre, so it is most sensitive to values close to the centre: small changes in the image change the result by quite a bit there. For images we are sure of (the ones at the extreme ends), some changes in the pixels don't affect the output much. But for images we are not very sure of, small changes lead to significant changes in our prediction.
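You can see this by looking at the sigmoid's slope at different inputs (a small PyTorch check; the specific input values are arbitrary):

```python
import torch

# Sigmoid is steepest (most sensitive) near the centre and nearly flat at the extremes.
for x in [-6.0, -2.0, 0.0, 2.0, 6.0]:
    t = torch.tensor(x, requires_grad=True)
    y = torch.sigmoid(t)
    y.backward()
    print(f"x = {x:+.1f}  sigmoid = {y.item():.3f}  slope = {t.grad.item():.4f}")
# The slope peaks at 0.25 at x = 0 and is almost 0 at ±6, so small input changes
# matter most for the "unsure" (mid-range) activations.
```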
A color picture would be a combination of RGB values to signify the value of a pixel. I'm having a hard time visualizing how on and off for a pixel at a certain location maps to an RGB value for color. Sorry, I'm a noob at this, but I love the classes so far and am going to go back and start from chapter 1.
The shape identifier, as you call it, is not hand-designed or hard-coded by us. We simply throw filters at our input (training) data points, and these filters learn the inherent signals corresponding to a label, e.g. what makes a dog a dog.
We don’t choose or decide which filter will do what. That is decided by the network during training. Some filters become edge detectors, some become corner detectors, some become associated with colors, and so on. But remember that we cannot control which does which.
This was done in pre-deep-learning computer vision, where feature extractors such as edge detectors were hard-coded by humans; the Sobel kernel, for example, is a good edge detector.
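For example, applying a hard-coded Sobel kernel with a plain convolution (the 28×28 random image is just a stand-in):

```python
import torch
import torch.nn.functional as F

# Sobel x-kernel: responds to horizontal intensity changes (vertical edges).
# Hard-coded by hand, not learned.
sobel_x = torch.tensor([[-1., 0., 1.],
                        [-2., 0., 2.],
                        [-1., 0., 1.]]).view(1, 1, 3, 3)

img = torch.rand(1, 1, 28, 28)            # stand-in greyscale image
edges = F.conv2d(img, sobel_x, padding=1)
print(edges.shape)                        # torch.Size([1, 1, 28, 28])
```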
You can certainly change and control the size of a kernel, but you cannot control which filter will do what.
The earlier layers usually become low-level feature extractors (such as corner detection), while the layers toward the end become high-level feature extractors (such as recognizing a face).
During inference, the filters related to the detected features get activated, and the others get turned off.
Because you can turn things off when needed, and still let (scaled) negative values through when needed (Leaky ReLU).
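A tiny comparison (the negative_slope value here is arbitrary):

```python
import torch
import torch.nn.functional as F

x = torch.tensor([-2.0, -0.5, 0.0, 1.0, 3.0])
print(F.relu(x))                            # negatives turned off: [0., 0., 0., 1., 3.]
print(F.leaky_relu(x, negative_slope=0.1))  # negatives scaled down: [-0.2, -0.05, 0., 1., 3.]
```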
No. But we can see what each filter does: what a filter does depends entirely on its weights, and which filter ends up learning which feature is essentially arbitrary.
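If you want to peek at them, you can pull the learned weights out of a trained network, e.g. with a pretrained resnet18 (assuming a recent torchvision where these weights are available):

```python
from torchvision.models import resnet18

# Look at the learned first-layer filters of a pretrained network.
model = resnet18(weights='IMAGENET1K_V1')
filters = model.conv1.weight.detach()   # shape (64, 3, 7, 7): 64 filters, RGB input, 7x7
print(filters.shape)

# Each 3x7x7 slice is one filter; plotting them shows the edge/colour detectors
# the network arrived at on its own during training.
```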
On and off pixels as such only make sense when you have a greyscale bitmap image.
But we usually deal with JPEGs or PNGs, which have three layers. The R layer contains only information about red values; no green or blue values are of concern in that layer. The same goes for G and B.
These three layers, superimposed together, form an image.
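For example (the filename is made up; any RGB image works):

```python
import numpy as np
from PIL import Image

img = np.array(Image.open('photo.jpg').convert('RGB'))
print(img.shape)                                   # (height, width, 3)

r, g, b = img[..., 0], img[..., 1], img[..., 2]    # the three superimposed layers
print(r.shape, g.shape, b.shape)                   # each one is (height, width)
```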