#3 PyTorch Book Thread: Sunday, 12th Sept 8AM PT

ghosh-r · September 12, 2021, 3:57pm

Added these resources:

WHAT IS TORCH.NN REALLY ? by Jeremy Howard
Dive Into Deep Learning - Full Deep Learning book (free) with full PyTorch coverage
DEEP LEARNING WITH PYTORCH: A 60 MINUTE BLITZ
WELCOME TO PYTORCH TUTORIALS Comprehensive list of PyTorch tutorials from PyTorch website
The Top 10,653 Pytorch Open Source Projects on Github 10k+ PyTorch projects with code

deep_learner_007 · September 12, 2021, 3:58pm

How do I experiment with different activation functions? I understand them mathematically, but how do I get their essence in practice?

aadil · September 12, 2021, 4:02pm

I would suggest taking any simple architecture, say ResNet-18, replacing ReLU with tanh or sigmoid, and comparing the training performance, losses, etc. You will get the gist of them.
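A minimal sketch of that kind of swap, in case it helps. The helper `replace_activation` is a hypothetical name (not a PyTorch API), and the small `nn.Sequential` stands in for ResNet-18 so the snippet runs without torchvision; the same recursive approach should carry over to a larger model.

```python
import torch
from torch import nn

def replace_activation(module: nn.Module, old: type, new_factory) -> nn.Module:
    """Recursively swap every `old` layer for a fresh `new_factory()` (hypothetical helper)."""
    # list(...) so we do not mutate the registry while iterating over it
    for name, child in list(module.named_children()):
        if isinstance(child, old):
            setattr(module, name, new_factory())
        else:
            replace_activation(child, old, new_factory)
    return module

# Small stand-in network (assumption: used here instead of ResNet-18)
model = nn.Sequential(
    nn.Linear(8, 16), nn.ReLU(),
    nn.Linear(16, 16), nn.ReLU(),
    nn.Linear(16, 1),
)

tanh_model = replace_activation(model, nn.ReLU, nn.Tanh)
out = tanh_model(torch.randn(4, 8))
print(tanh_model)
```

Training the original and the swapped model with the same data, seed, and optimizer, then plotting both loss curves, is one way to see the difference between activations.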

deep_learner_007 · September 12, 2021, 4:02pm

Recently WandB’s official YouTube channel had a video about Torch.nn as well:

Also Abhishek Thakur covered the topic: 9. Understanding torch.nn - YouTube

aadil · September 12, 2021, 4:03pm

As Jeremy once said, and Sanyam said a while ago, the more you experiment with code and get your hands dirty, the stronger your intuitions will become.

ghosh-r · September 12, 2021, 4:09pm

OrderedDicts are really helpful when you have a big architecture.

They also let you mess with individual layers whenever you want without much trouble.
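A small sketch of what this looks like with `nn.Sequential`. The layer names and sizes below are made up for illustration.

```python
from collections import OrderedDict

import torch
from torch import nn

# Layer names are illustrative assumptions; any valid identifier works
model = nn.Sequential(OrderedDict([
    ("hidden_linear", nn.Linear(1, 13)),
    ("hidden_activation", nn.Tanh()),
    ("output_linear", nn.Linear(13, 1)),
]))

# Named layers can be reached as attributes ...
print(model.hidden_linear.weight.shape)

# ... and swapped out by name without rebuilding the whole model
model.hidden_activation = nn.ReLU()

names = [name for name, _ in model.named_children()]
print(names)
```

The names also show up in `named_parameters()`, which makes a large architecture easier to inspect than the default numeric indices.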

(left at 2147 IST)

sophie_zang · September 12, 2021, 4:20pm

Why do we call super? Is that required by PyTorch?

aadil · September 12, 2021, 4:27pm

Since the class we create inherits from nn.Module, we call super to initialise the components (variables, etc.) of that base class.
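A quick sketch of what goes wrong without it. The class names `WithSuper` and `WithoutSuper` are invented for this example; the point is that `nn.Module.__init__` sets up the internal registries that track submodules and parameters, so assigning a layer before calling it fails.

```python
import torch
from torch import nn

class WithSuper(nn.Module):
    def __init__(self):
        super().__init__()  # sets up the registries nn.Module relies on
        self.linear = nn.Linear(1, 1)

    def forward(self, x):
        return self.linear(x)

class WithoutSuper(nn.Module):
    def __init__(self):
        # super().__init__() deliberately omitted
        self.linear = nn.Linear(1, 1)

ok = WithSuper()
n_params = len(list(ok.parameters()))  # weight and bias of the linear layer

try:
    WithoutSuper()
    failed = False
except AttributeError as err:
    failed = True
    print("Without super().__init__():", err)
```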

bhutanisanyam1 · September 12, 2021, 4:31pm

Suggested Homework:

Check out different loss functions in torch
Try new activation functions
Play around with NN hyperparameters
Try a new dataset from torchvision
Try torchvision transforms
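For the first homework item, a minimal sketch that runs a few built-in losses on the same toy prediction. The tensors are made-up values chosen so the results are easy to check by hand; the three losses are only a sample of what `torch.nn` offers.

```python
import torch
from torch import nn

# Toy values (assumption): one perfect prediction, one that is off by 2
pred = torch.tensor([0.0, 2.0])
target = torch.tensor([0.0, 0.0])

losses = {
    "MSELoss": nn.MSELoss(),            # mean of squared errors
    "L1Loss": nn.L1Loss(),              # mean of absolute errors
    "SmoothL1Loss": nn.SmoothL1Loss(),  # quadratic near zero, linear for large errors
}

results = {name: fn(pred, target).item() for name, fn in losses.items()}
for name, value in results.items():
    print(f"{name}: {value}")
```

The squared error penalises the large miss most heavily, which is one reason the choice of loss affects how sensitive training is to outliers.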

yuvraj · September 12, 2021, 4:31pm

Can you please talk about the hangout event once more?

matt24 · September 12, 2021, 4:33pm

You can find more info at the following link:

dhruvashist · September 15, 2021, 5:14am

Hi! I was working with the SGD optimizer and noticed that the loss becomes NaN when I use t_u (not normalized); the authors use t_un = 0.1 * t_u. Why does this happen?

dhruvashist · September 15, 2021, 5:19am

If I understand it correctly, is this the exploding gradients problem? And is normalization therefore necessary?

bhutanisanyam1 · September 15, 2021, 6:07am

@dhruvashist Welcome to the community!

Yes, you’re right about gradient explosion.
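A rough sketch that reproduces the effect, assuming a linear model `w * x + b` with mean squared error, a learning rate of 1e-2, and 100 epochs. The temperature values are assumed for illustration, and `train` is a hypothetical helper written for this post. With the raw inputs (values around 50) the gradient with respect to `w` is large enough that each SGD step overshoots, so the parameters grow until they overflow; scaling the inputs by 0.1 keeps the updates stable at the same learning rate.

```python
import math

import torch

# Assumed toy data: t_c are targets, t_u are the unnormalized inputs
t_c = torch.tensor([0.5, 14.0, 15.0, 28.0, 11.0, 8.0, 3.0, -4.0, 6.0, 13.0, 21.0])
t_u = torch.tensor([35.7, 55.9, 58.2, 81.9, 56.3, 48.9, 33.9, 21.8, 48.4, 60.4, 68.4])

def train(x, y, lr=1e-2, n_epochs=100):
    """Fit y ~ w * x + b with plain SGD and return the last loss value."""
    params = torch.tensor([1.0, 0.0], requires_grad=True)
    optimizer = torch.optim.SGD([params], lr=lr)
    for _ in range(n_epochs):
        w, b = params[0], params[1]
        loss = ((w * x + b - y) ** 2).mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return loss.item()

loss_raw = train(t_u, t_c)          # unnormalized input
loss_norm = train(0.1 * t_u, t_c)   # input scaled as t_un = 0.1 * t_u

print("raw input, loss finite? ", math.isfinite(loss_raw))
print("scaled input, loss finite?", math.isfinite(loss_norm))
```

Lowering the learning rate far enough should also keep the raw version finite, but normalizing the input lets one learning rate work reasonably for both `w` and `b`.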

dhruvashist · September 15, 2021, 9:41am

I was working on the validation set loss and found that it decreases, then increases, and then stabilizes. It seems something is wrong, but I don't know what to look for. Any advice/help is appreciated.

Also, the authors have a small difference between the training and validation loss. Mine seems quite large. Is this OK or do I need to rework it?

dhruvashist · September 15, 2021, 11:12am

The loss was reduced from 8.0 to 3.0 when training again and again with different splits. Looks like this is why cross-validation is important.
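To illustrate how much the split alone can move the validation loss on a dataset this small, here is a sketch under several assumptions: the temperature values are toy data, two samples are held out per split, and a closed-form least-squares fit replaces the SGD loop so that only the split changes between runs. `val_loss_for_split` is a hypothetical helper.

```python
import torch

# Assumed toy data, with the input already scaled by 0.1
t_c = torch.tensor([0.5, 14.0, 15.0, 28.0, 11.0, 8.0, 3.0, -4.0, 6.0, 13.0, 21.0])
t_un = 0.1 * torch.tensor([35.7, 55.9, 58.2, 81.9, 56.3, 48.9, 33.9, 21.8, 48.4, 60.4, 68.4])

def val_loss_for_split(seed, n_val=2):
    """Shuffle with the given seed, fit on the training part, return validation MSE."""
    generator = torch.Generator().manual_seed(seed)
    idx = torch.randperm(t_un.shape[0], generator=generator)
    train_idx, val_idx = idx[:-n_val], idx[-n_val:]

    # Least-squares fit of t_c ~ w * t_un + b on the training samples only
    A = torch.stack([t_un[train_idx], torch.ones(len(train_idx))], dim=1)
    w, b = torch.linalg.lstsq(A, t_c[train_idx].unsqueeze(1)).solution.squeeze(1)

    val_pred = w * t_un[val_idx] + b
    return ((val_pred - t_c[val_idx]) ** 2).mean().item()

val_losses = [val_loss_for_split(seed) for seed in range(5)]
for seed, loss in enumerate(val_losses):
    print(f"split {seed}: validation loss {loss:.2f}")
```

Averaging the validation loss over several splits, as in k-fold cross-validation, gives a steadier estimate than any single split.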

dhruvashist · September 16, 2021, 12:33pm

When using a sequential model (13 hidden neurons), these are the shapes of the parameters for both layers:

[torch.Size([13, 1]), torch.Size([13]), torch.Size([1, 13]), torch.Size([1])]

Why do biases in the first layer have shape [13] and not [13,1] and vice-versa?
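A short sketch that reproduces these shapes, assuming the model is `Linear(1, 13)`, `Tanh`, `Linear(13, 1)`. `nn.Linear` stores its weight as `(out_features, in_features)` and its bias as `(out_features,)`, since it computes `x @ weight.T + bias` with one bias value per output unit. So the first layer has 13 outputs (bias of shape [13]) and the second has 1 output (bias of shape [1]).

```python
import torch
from torch import nn

# Assumed architecture: 1 input feature, 13 hidden units, 1 output
model = nn.Sequential(nn.Linear(1, 13), nn.Tanh(), nn.Linear(13, 1))

shapes = [tuple(p.shape) for p in model.parameters()]
print(shapes)  # [(13, 1), (13,), (1, 13), (1,)]

# Check the first layer by hand: one bias value is added per output unit
first = model[0]
x = torch.randn(5, 1)
manual = x @ first.weight.T + first.bias
same = torch.allclose(first(x), manual)
print(same)
```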

tauseef · September 19, 2021, 4:26pm

I prepared a small notebook for CNNs, but it's in TensorFlow. Any feedback would be really appreciated, thanks!

bibhabasu · October 7, 2021, 6:42pm

I had read Chapter 5, "The mechanics of learning", earlier, but I recently read it again. It is basically simple high-school material: differentiating functions and finding slopes, or, as they say here, gradients.
But the whole point is that the authors have to come down from a high level of dealing with complex problems to basic statistics, make the reader understand the loss function, then differentiate it with respect to the weights and biases, and then optimize it all with code that is as clear as pen and paper. It makes us realize, as users of torch.nn or even sklearn, how close we get to the truth and yet how far we remain from it with our implementations…
The chapter starts with a simple "mx + c" and beautifully fits and optimizes everything in our world. You can only appreciate the calmness if you truly try to forget everything you have learnt about ML or DL… The only way to enjoy this chapter is to know that the derivative of x^2 with respect to x is 2x, and nothing more!
