Detecting blood cells

The blood cell classes are imbalanced: red blood cells far outnumber the other types. However, the three types studied here (red blood cells, white blood cells, and platelets) are visually distinct enough that the model performs well despite the imbalance.



Class imbalance is a common concern, but in practice nearly all datasets suffer from it to one degree or another. COCO itself has several orders of magnitude of imbalance between its most and least common classes (person and toaster, I think).

We’ve tried to create specific tools to address imbalance, such as weighted image selection during training with the --image-weights flag, but in practice simply training longer seems to be the best solution, since it naturally gives the model more exposure to less frequent classes.
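
For anyone curious what weighted image selection can look like, here is a minimal sketch of the general idea. It is not the code behind the --image-weights flag. The function names (`class_weights`, `image_weights`, `sample_indices`) and the toy labels are all made up for illustration, and inverse-frequency weighting is just one reasonable choice I am assuming here.

```python
import random
from collections import Counter


def class_weights(labels_per_image, num_classes):
    """Inverse-frequency weight per class, normalized to sum to 1.

    labels_per_image: one list of class ids per image (hypothetical format).
    Classes with no instances get weight 0.
    """
    counts = Counter(c for labels in labels_per_image for c in labels)
    raw = [1.0 / counts[c] if counts[c] else 0.0 for c in range(num_classes)]
    total = sum(raw)
    return [w / total for w in raw]


def image_weights(labels_per_image, cls_w):
    """Weight of an image = sum of the class weights of its labels.

    Images containing rare classes end up with larger weights;
    images with no labels get weight 0.
    """
    return [sum(cls_w[c] for c in labels) for labels in labels_per_image]


def sample_indices(img_w, k, seed=0):
    """Draw k image indices with replacement, proportional to img_w."""
    rng = random.Random(seed)
    return rng.choices(range(len(img_w)), weights=img_w, k=k)


# Toy data: 0 = red blood cell, 1 = white blood cell, 2 = platelet
labels = [[0, 0, 0, 0], [0, 0, 1], [0, 2]]
cls_w = class_weights(labels, num_classes=3)
img_w = image_weights(labels, cls_w)
epoch_indices = sample_indices(img_w, k=len(labels))
```

In this toy set, the image containing only red blood cells gets the smallest weight, so it is drawn less often than the two images that also contain a rarer class. Note that sampling fails if every weight is zero (for example, a dataset with no labels at all).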


Hey @maria_rodriguez :wave:, the article was a great read. I really appreciate you open-sourcing the code (which btw is very readable, a rare feat in the medical imaging domain IMHO) and the thoroughness of your experimentation :grin:.


