DenseNet

Funny thought: if we are accumulating all the features from all previous layers, does that mean that after one forward pass we can only do backpropagation on the last layer?

So @amanarora, we are making it very easy for the model to get information from the previous layers.

If we do ResNet-like blocks, there might be some information loss, as we're only relying on a string of convolutions to retain all the information from the previous layers.
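
To make the concatenation point concrete, here is a minimal PyTorch-style sketch of a dense block (the name `TinyDenseBlock` and the channel numbers are illustrative only, not the torchvision implementation). It also speaks to the backprop question above: concatenation just stacks channels, so the loss gradient flows back directly into every earlier layer's output, not only into the last layer.

```python
# Minimal sketch of dense connectivity (illustrative, not the torchvision code):
# every layer sees the concatenation of all previous feature maps, so gradients
# reach every layer through the concatenated channels, not just the last one.
import torch
import torch.nn as nn

class TinyDenseBlock(nn.Module):
    def __init__(self, in_channels: int, growth_rate: int, num_layers: int):
        super().__init__()
        self.layers = nn.ModuleList()
        for i in range(num_layers):
            self.layers.append(
                nn.Sequential(
                    nn.BatchNorm2d(in_channels + i * growth_rate),
                    nn.ReLU(inplace=True),
                    nn.Conv2d(in_channels + i * growth_rate, growth_rate,
                              kernel_size=3, padding=1, bias=False),
                )
            )

    def forward(self, x):
        features = [x]
        for layer in self.layers:
            # concatenate everything produced so far and feed it to the next layer
            out = layer(torch.cat(features, dim=1))
            features.append(out)
        return torch.cat(features, dim=1)

block = TinyDenseBlock(in_channels=64, growth_rate=32, num_layers=6)
y = block(torch.randn(1, 64, 56, 56))
print(y.shape)  # torch.Size([1, 256, 56, 56]) -> 64 + 6 * 32 channels
```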


" The difference between ResNet and DenseNet is that ResNet adopts summation to connect all preceding feature- maps while DenseNet concatenates all of them"

Ref.: https://openaccess.thecvf.com/content/WACV2021/papers/Zhang_ResNet_or_DenseNet_Introducing_Dense_Shortcuts_to_ResNet_WACV_2021_paper.pdf
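
A toy comparison of the two shortcut styles from the quote (the `f` and `g` convolutions here are made up for illustration; any sub-block with matching spatial size would do):

```python
# Summation (ResNet) vs. concatenation (DenseNet) on the same input tensor.
import torch
import torch.nn as nn

x = torch.randn(1, 64, 32, 32)
f = nn.Conv2d(64, 64, kernel_size=3, padding=1)  # ResNet branch: must keep channel count
g = nn.Conv2d(64, 32, kernel_size=3, padding=1)  # DenseNet branch: contributes new channels

resnet_out = x + f(x)                       # summation: channels stay at 64
densenet_out = torch.cat([x, g(x)], dim=1)  # concatenation: channels grow to 64 + 32

print(resnet_out.shape)    # torch.Size([1, 64, 32, 32])
print(densenet_out.shape)  # torch.Size([1, 96, 32, 32])
```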


Why does increasing the number of layers per Dense Block as we go deeper into the network help?

i.e., we have 6 layers in the first Dense Block, but 48 towards the end? Why not the opposite?

Why does this gradually increasing number of layers from left to right help? Why not right to left? Why vary it at all?
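
For reference, the 6 and 48 in the question look like DenseNet-201's block configuration (6, 12, 48, 32). A quick way to see the per-block layer counts, assuming torchvision's `DenseNet` constructor with its `block_config` argument:

```python
# Per-block layer counts for two standard variants (values from the DenseNet paper's Table 1).
from torchvision.models import DenseNet

# DenseNet-201: four dense blocks with 6, 12, 48 and 32 layers respectively.
densenet201 = DenseNet(growth_rate=32, block_config=(6, 12, 48, 32), num_init_features=64)

# DenseNet-121 for comparison: 6, 12, 24, 16 (torchvision's default config).
densenet121 = DenseNet(growth_rate=32, block_config=(6, 12, 24, 16), num_init_features=64)
```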

Maybe I need to get my eyes tested since I’m always requesting this :sweat_smile:
Could you please zoom into the PDF? :slight_smile:


Thanks @amanarora for the great description of the DenseNet model.

Thanks for the great explanation! I’m very eager to read the paper and try coding it. :slight_smile:


Thanks for the discussion. If you have time, can you address this question this week or next:
the first layer inside the DenseBlock has only 32 outputs, so the 1x1 that follows it is not a bottleneck anymore if it increases the feature maps to 128. The term bottleneck makes sense only in the later layers of the DenseBlock.
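
For context, here is a sketch of a single bottleneck layer as described in the DenseNet-B design (the helper name `dense_bottleneck_layer` is mine; `bn_size=4` mirrors torchvision's naming for the 4x factor). With growth rate 32 the 1x1 always outputs 128 channels, so early in a block it actually expands channels and only acts as a real bottleneck once the concatenated input is much wider than 128:

```python
# BN -> ReLU -> 1x1 conv to 4*growth_rate channels -> BN -> ReLU -> 3x3 conv to growth_rate.
import torch.nn as nn

def dense_bottleneck_layer(in_channels: int, growth_rate: int = 32, bn_size: int = 4) -> nn.Sequential:
    inter_channels = bn_size * growth_rate  # 4 * 32 = 128
    return nn.Sequential(
        nn.BatchNorm2d(in_channels),
        nn.ReLU(inplace=True),
        nn.Conv2d(in_channels, inter_channels, kernel_size=1, bias=False),  # the "bottleneck" 1x1
        nn.BatchNorm2d(inter_channels),
        nn.ReLU(inplace=True),
        nn.Conv2d(inter_channels, growth_rate, kernel_size=3, padding=1, bias=False),
    )

early = dense_bottleneck_layer(in_channels=64)   # 64 -> 128: the 1x1 expands, not a bottleneck
late = dense_bottleneck_layer(in_channels=992)   # 992 -> 128: the 1x1 genuinely compresses
```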


Thanks a lot for this session, Aman. I am really learning and overcoming my fear of reading research papers thanks to these sessions! :slight_smile:


Great explanation. Had fun listening to the lecture and you made it look easy. I really liked how you broke down the architecture into smaller parts and explained each part.


We know for a fact that later layers (on the right) learn higher-level abstractions (eyes, wheels, etc.) while earlier layers (on the left) learn basic structures such as lines and circles. Given this, my intuition on the reason for more layers in the later dense blocks is that it takes more layers to learn higher-level abstractions well (also the reason deeper networks are considered to learn better), and hence we increase the number of layers in the dense blocks which are deeper in the network and not the other way around. I could, however, be wrong!