#7 PyTorch Book Thread: Sunday, 10th Oct 8AM PT

Just to add one more point: the problem we will face due to a highly skewed distribution is that the model might learn to output only the majority class, since doing so will give the minimum loss.
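A quick numerical illustration of why (the numbers here are made up): with 99% of labels in one class, a model that always outputs that class with high confidence still ends up with a tiny average cross-entropy.

```python
import torch
import torch.nn.functional as F

# 99% of labels are class 0; a model that always predicts class 0
# with high confidence still gets a tiny average loss
labels = torch.cat([torch.zeros(990), torch.ones(10)]).long()
logits = torch.tensor([[4.0, -4.0]]).repeat(1000, 1)  # always "class 0"
print(F.cross_entropy(logits, labels))                # ~0.08
```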

2 Likes

Thanks for the free education :)! See you around

1 Like


Excerpt from the book. So if we penalize the model for highly confident wrong predictions, we should be able to get a solution to class imbalance. Please correct me if my understanding is wrong here.
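What I have in mind is something like the focal-loss idea (down-weighting easy examples so that confident wrong predictions dominate the total loss); a rough sketch of what I mean, not the book's exact formulation:

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0):
    """Cross entropy scaled by (1 - p_t)**gamma: easy, correct predictions
    are down-weighted, so confident wrong ones dominate the total loss."""
    ce = F.cross_entropy(logits, targets, reduction="none")  # -log(p_t)
    p_t = torch.exp(-ce)                                      # prob. of the true class
    return ((1.0 - p_t) ** gamma * ce).mean()

# Toy check: one confident wrong prediction, one confident correct one
logits = torch.tensor([[4.0, -4.0], [4.0, -4.0]])
targets = torch.tensor([1, 0])       # first sample is actually class 1
print(focal_loss(logits, targets))
```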

3 Likes

Is there a sort of rule book which can tell which augmentation(s) should be applied to what kind of image domain?

TTA - Test Time Augmentation
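If it helps, here is a minimal sketch of the idea, assuming a trained classifier `model` and a CHW image tensor `image` (both placeholders): average the model's predictions over a few augmented views of the test image.

```python
import torch

def predict_with_tta(model, image):
    """Average softmax outputs over a few test-time views
    (here just the original image and its horizontal flip)."""
    model.eval()
    views = [image, torch.flip(image, dims=[-1])]   # flip along the width axis
    with torch.no_grad():
        probs = [torch.softmax(model(v.unsqueeze(0)), dim=1) for v in views]
    return torch.stack(probs).mean(dim=0)
```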

1 Like

I wonder if there is a sequence of augmentations you can apply to turn all the cats in the dataset into dogs…

2 Likes

Can we please work on the GAN to convert cats to dogs after this study group? Maybe call it WoofGAN?

5 Likes

Does the training happen on the image after all the augmentation steps, or is the model trained on every image from each augmentation step?
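To make the question concrete, here is a standard torchvision-style pipeline (the dataset path is a placeholder):

```python
from torchvision import datasets, transforms

# Chain of augmentations passed as the dataset's transform
train_tfms = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(10),
    transforms.ToTensor(),
])

# "data/train" is a placeholder path
train_ds = datasets.ImageFolder("data/train", transform=train_tfms)
```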

1 Like

Thanks for another awesome session @bhutanisanyam1! See you next week :slight_smile:

2 Likes

Could you show us the answer to the homework, transforming from TensorFlow to PyTorch?

You’re thinking in the right direction, but one problem we’re going to face in the case of high class imbalance is that, even after penalisation, the loss from the minority class will be overshadowed by the majority class. For example, say we have a dataset with two classes A & B and very high imbalance (datasets such as cancer detection, etc.). Class A makes up 99.99% of the distribution and say we have a million images. What will happen in this case is that even if our model learns just to predict class A every time, it will be correct 99.99% of the time, and the loss obtained from the wrong predictions will be very, very small due to class B’s tiny share of the dataset; it will be overshadowed and learning will be hampered.

Let’s take another example:
Say we have 5000 data points, with 4995 elements of class A and 5 elements of class B. We have a model that always gives A as output, and when its output is wrong we penalise it with a loss value of, say, 100 (our very own example loss function). Say we are using Gradient Descent for optimisation; the loss in this case will be ((5*100 + some_small_value) / 5000) ≈ 0.1, so our optimiser won’t be able to make much of a change to the parameters, and this will hamper the learning.
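The same arithmetic, spelled out (the 0.001 per correct prediction is an assumed small value):

```python
# 4995 correct predictions of class A (a tiny loss each, say 0.001)
# and 5 wrong predictions on class B, each penalised with 100
n_correct, n_wrong = 4995, 5
avg_loss = (n_wrong * 100 + n_correct * 0.001) / 5000
print(avg_loss)   # ~0.1 -- a weak average signal for the optimiser
```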

But if we have a loss function that takes the data distribution into account, i.e. the penalisation is very high when the minority class is predicted wrongly, then we might be able to overcome this problem. We have such loss functions and techniques (a minimal class-weighting sketch follows the references below). There are some great articles and papers on the topic:

  1. By Jason Brownlee
  2. Constrained Optimization to Train Neural Networks on Critical and Under-Represented Classes, Sara Sangalli et al.
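The class-weighting sketch mentioned above: pass per-class weights (inverse to class frequency) to the standard cross-entropy loss. The counts are the toy numbers from the example, and the batch is random just to show the call:

```python
import torch
import torch.nn as nn

# Toy class counts from the example above: 4995 of class A, 5 of class B
counts = torch.tensor([4995.0, 5.0])
weights = counts.sum() / (len(counts) * counts)   # inverse-frequency weights

# Mistakes on the rare class B now contribute ~1000x more to the loss
criterion = nn.CrossEntropyLoss(weight=weights)

logits = torch.randn(8, 2)               # a random toy batch of predictions
targets = torch.randint(0, 2, (8,))      # random toy labels
loss = criterion(logits, targets)
print(loss)
```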
4 Likes

I read Sentdex’s amazing beginner-friendly notebook and converted it from TensorFlow to PyTorch: PyTorch vs. Cancer.

I tried to re-create the model architecture in PyTorch, but loaded the data as the authors have done (using SimpleITK) instead of using the dicom library as in Sentdex’s notebook.
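In case it helps anyone attempting the same conversion, the SimpleITK loading step looks roughly like this (the file path is just a placeholder):

```python
import SimpleITK as sitk
import torch

# Read a scan (e.g. an .mhd volume) and convert it to a PyTorch tensor
itk_image = sitk.ReadImage("path/to/scan.mhd")    # placeholder path
volume = sitk.GetArrayFromImage(itk_image)        # numpy array: (slices, H, W)
volume_t = torch.from_numpy(volume).float()
print(volume_t.shape)
```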

Any kind of feedback is welcome. :slight_smile:

1 Like