Paper: Squeeze-and-Excitation Networks (arXiv:1709.01507)
Livestream on YouTube: W&B Paper Reading Group: Squeeze-and-Excitation Networks
Is the excitation block an element-wise operation or an MLP?
When we have multiple output channels/filters in each layer, they all have different weights for each input channel. Does this not give the same effect as the SE block, which modifies the input feature map? Or does the SE block make learning easier by modifying the inputs for all the next-level filters?
How do we know which conv filter (vertical, horizontal, Gaussian, …) gets applied in a convolution? In my own practice I have never understood this.
How are you reducing C to C/r and expanding back to C?
From Aman’s blog post it looks like a good old fully connected layer:
nn.Linear(c, c // r, bias=False)
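To put that layer in context, here is a minimal sketch of a full SE block in PyTorch: squeeze (global average pool), excitation (the two-layer bottleneck MLP with reduction ratio r, reducing C to C/r and back), then per-channel rescaling. Class and variable names are my own, not from the paper or the blog post.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation block sketch: squeeze (global average pool),
    excitation (two-layer MLP with reduction ratio r), then channel rescale."""
    def __init__(self, c, r=16):
        super().__init__()
        self.fc1 = nn.Linear(c, c // r, bias=False)  # reduce C -> C/r
        self.fc2 = nn.Linear(c // r, c, bias=False)  # expand C/r -> C

    def forward(self, x):
        b, c, _, _ = x.shape
        s = x.mean(dim=(2, 3))                 # squeeze: (B, C, H, W) -> (B, C)
        s = torch.relu(self.fc1(s))            # bottleneck: (B, C/r)
        s = torch.sigmoid(self.fc2(s))         # per-channel gates in (0, 1)
        return x * s.view(b, c, 1, 1)          # rescale each channel of x
```

So the excitation part is indeed a small MLP on the pooled channel descriptor, and only the final multiplication is element-wise (broadcast per channel).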
In the paper they compare results for the various places where the SE block can be added. Table 14 actually shows LOWER errors for the SE-PRE variant. Do you know why they then do NOT recommend this placement over the standard one, which puts the SE block inside the residual branch?
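To make the placement variants in that question concrete, here is a toy residual unit sketching the difference between the standard SE-ResNet placement (SE inside the residual branch, before the addition) and SE-PRE (SE applied to the input before the residual branch). This is an illustrative simplification with made-up names, not the paper's exact blocks; in both variants shown the identity skip itself is never gated.

```python
import torch
import torch.nn as nn

def se(x, fc1, fc2):
    """Per-channel gating: squeeze, bottleneck MLP, sigmoid, rescale."""
    s = torch.sigmoid(fc2(torch.relu(fc1(x.mean(dim=(2, 3))))))
    return x * s[:, :, None, None]

class ResidualSE(nn.Module):
    """Toy residual unit contrasting two SE placements:
    'standard' -> x + SE(conv(x))   (SE inside the residual branch)
    'pre'      -> x + conv(SE(x))   (SE applied before the branch)"""
    def __init__(self, c, r=4, variant="standard"):
        super().__init__()
        self.conv = nn.Conv2d(c, c, 3, padding=1, bias=False)
        self.fc1 = nn.Linear(c, c // r, bias=False)
        self.fc2 = nn.Linear(c // r, c, bias=False)
        self.variant = variant

    def forward(self, x):
        if self.variant == "pre":
            branch = self.conv(se(x, self.fc1, self.fc2))   # SE-PRE
        else:
            branch = se(self.conv(x), self.fc1, self.fc2)   # standard
        return x + branch                                   # ungated skip
```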