Real-world data are never perfectly balanced. In this episode I explain a simple yet effective trick for training models on highly imbalanced data. Enjoy the show!

Sponsors

Get one of the best VPNs at a massive discount with coupon code DATASCIENCE. It gives you an 83% discount, which unlocks the best price on the market, plus 3 extra months for free. Here is the link: https://surfshark.deals/DATASCIENCE

References

L. Breiman, Random Forests (2001)

C. Chen, A. Liaw, L. Breiman, Using Random Forest to Learn Imbalanced Data (2004)

The podcast and the accompanying cover art on this page belong to Francesco Gadaleta. The content of the podcast is created by Francesco Gadaleta and not by, or together with, Poddtoppen.