Week 12 Discussion Thread

Sure Aman, will do! :slight_smile:

Maybe a more self-explanatory representation of the Conv2d layer would be something like:

import torch.nn as nn
conv_layer = nn.Conv2d(in_channels=1, out_channels=128, kernel_size=(3,3))

Ref.: Conv2d — PyTorch 1.9.0 documentation

Hello, can anyone tell me whether there is a forum for resolving fastbook Colab notebook errors?

In what scenario would we want to increase the stride? Just to reduce the image size faster?

You can post in this thread: Week 11 Discussion Thread - #9 by bhutanisanyam1

I guess it would be stride vs. kernel size, right? Figured it out :slight_smile: Kernel size doesn’t reduce the image size nearly as much as stride does. Without padding, a kernel of size f only trims f - 1 pixels per dimension (around the edges), whereas a stride of S divides the size by roughly S. Larger kernels also mean more weights and more computation per output pixel.

As Aman just said, I tend to think of it as covering more of the image with a larger stride which also results in reduction in size (which can be ameliorated by padding).
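To make the comparison in the last few posts concrete, here is a rough sketch that simply counts how many positions a kernel can take along one axis. The helper name `window_starts` and the 28 x 28 input size are my own assumptions for illustration, not something from the lecture.

```python
def window_starts(n, f, stride=1, padding=0):
    """Return the start index of every kernel position along one axis."""
    padded = n + 2 * padding  # axis length after zero padding on both sides
    return list(range(0, padded - f + 1, stride))

n = 28  # hypothetical input width, chosen only for illustration

small_kernel = len(window_starts(n, f=3))                       # 26 positions
large_kernel = len(window_starts(n, f=7))                       # 22 positions
strided = len(window_starts(n, f=3, stride=2))                  # 13 positions
same_size = len(window_starts(n, f=3, stride=1, padding=1))     # 28 positions
```

Going from a 3 x 3 to a 7 x 7 kernel only trims four more pixels, while stride 2 roughly halves the output, and a padding of 1 restores the original size for a 3 x 3 kernel at stride 1.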

If the input is (n x n), the padding is p on each side, the kernel size is (f x f), and the stride is S,

the size of the output is ((n + 2p - f)/S + 1) x ((n + 2p - f)/S + 1)

This is one of the formulae I recall from Andrew Ng’s classes, if I remember correctly.

https://cs231n.github.io/convolutional-networks/

The best one I follow is …

Image size: N x N x d
Filter size: F x F x d
Number of filters: M
Padding: P
Stride: S
Output size (O x O): O = (N - F + 2P)/S + 1
Number of weights in one filter: F x F x d
Number of weights for all filters in a conv layer: M x F x F x d
Number of calculations per filter (excl. activation): O x O x (F x F x d mult + F x F x d add) = O x O x 2 x F x F x d
Number of calculations per conv layer: M x O x O x 2 x F x F x d
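The list above can be turned into a few lines of Python to sanity-check the numbers. This is only a sketch: the function name `conv_layer_counts` is made up here, biases are ignored (as in the list), and the 28 x 28 input is an assumed size. The filter settings mirror the Conv2d example earlier in the thread (1 input channel, 128 filters, 3 x 3 kernel).

```python
def conv_layer_counts(N, F, d, M, P=0, S=1):
    """Weight and operation counts for one conv layer, biases ignored."""
    O = (N - F + 2 * P) // S + 1            # output width/height
    weights_per_filter = F * F * d
    weights_total = M * weights_per_filter
    # One multiply and (approximately) one add per weight per output pixel.
    ops_per_filter = O * O * 2 * weights_per_filter
    ops_total = M * ops_per_filter
    return {
        "output_size": O,
        "weights_per_filter": weights_per_filter,
        "weights_total": weights_total,
        "ops_per_filter": ops_per_filter,
        "ops_total": ops_total,
    }

# Assumed 28 x 28 single-channel input, 128 filters of size 3 x 3.
counts = conv_layer_counts(N=28, F=3, d=1, M=128)
```

With these assumed values, `counts["weights_total"]` comes out to 1152 and `counts["ops_total"]` to 1,557,504.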

Actually, the formula is (n + 2p - f)/S + 1.

Can you explain how we go from 30 filters down to 1 filter?

Thank you. Updated the message

It is actually (n + 2p - f)//S + 1, where // indicates integer (floor) division.
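A quick way to see why the floor matters is to try an input size where the stride doesn't divide evenly. A minimal sketch, where `conv_output_size` is a hypothetical helper name and the sizes are picked only for illustration:

```python
def conv_output_size(n, f, p=0, s=1):
    """Output length along one axis: (n + 2p - f) // s + 1."""
    if n + 2 * p < f:
        raise ValueError("kernel is larger than the padded input")
    return (n + 2 * p - f) // s + 1

exact = conv_output_size(7, 3, p=0, s=2)    # (7 - 3) / 2 + 1 = 3 exactly
floored = conv_output_size(8, 3, p=0, s=2)  # (8 - 3) / 2 + 1 = 3.5, floored to 3
```

Both inputs give an output of 3: with n = 8 the last column simply never gets covered by a full kernel position.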

Yes, that makes sense

But how do we determine whether it’s a 3 or a 7 with this output? Because we would be back where we were, right?

Ah, ok I missed that. Thank you!

I think I now understand how it will happen for an RGB image.

We have kernels that also sum depthwise (across the channels), as I understand it…
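If it helps, this is how I picture the depthwise sum, written as plain Python loops rather than PyTorch. It is only a sketch under my own assumptions: stride 1, no padding, no bias, and a made-up function name `conv2d_single_filter`. The point is that one filter has a slice per input channel, and all slices are summed into a single 2D output map.

```python
def conv2d_single_filter(image, kernel):
    """image: [d][H][W], kernel: [d][F][F]; stride 1, no padding.

    Returns one 2D map of size (H - F + 1) x (W - F + 1).
    """
    d = len(image)
    H, W = len(image[0]), len(image[0][0])
    F = len(kernel[0])
    out = []
    for i in range(H - F + 1):
        row = []
        for j in range(W - F + 1):
            total = 0
            for c in range(d):  # sum across the channels as well
                for u in range(F):
                    for v in range(F):
                        total += image[c][i + u][j + v] * kernel[c][u][v]
            row.append(total)
        out.append(row)
    return out

# Hypothetical 3-channel 3 x 3 image: channel c is filled with the value c + 1.
rgb = [[[c + 1] * 3 for _ in range(3)] for c in range(3)]
# One filter of shape 3 x 2 x 2 with every weight equal to 1.
ones_filter = [[[1] * 2 for _ in range(2)] for _ in range(3)]

feature_map = conv2d_single_filter(rgb, ones_filter)  # 2 x 2 map, every entry 24
```

Each entry is 4 x (1 + 2 + 3) = 24, so three input channels still produce a single output channel per filter.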

Like, is it a rank-5 tensor?

Thank you Aman for the wonderful lectures.