
[D] Why does hierarchical Bayesian regression work well on imbalanced data?

I have a dataset of electrical outages that is extremely imbalanced: fewer than 2% of the observations belong to the positive class. I am using weather station data to predict the probability of an outage occurring near each weather station.

With any other model I have tried, I have to rebalance the data to get good results. However, I recently tried hierarchical Bayesian logistic regression and it performs just fine without resampling. In my methodology, each individual weather station has its own intercept and coefficients, but these are all drawn from a shared parent distribution.
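
For concreteness, here is a minimal sketch of that kind of model in PyMC. This is not the poster's actual code; the simulated data, feature count, priors, and variable names are all assumptions. It just illustrates station-level intercepts and coefficients partially pooled toward population-level parent distributions.

```python
import numpy as np
import pymc as pm

# Hypothetical data: X holds weather features, station_idx maps each row to
# one of n_stations weather stations, y is the 0/1 outage label (~2% positive).
rng = np.random.default_rng(0)
n_obs, n_features, n_stations = 2000, 3, 20
X = rng.normal(size=(n_obs, n_features))
station_idx = rng.integers(0, n_stations, size=n_obs)
y = rng.binomial(1, 0.02, size=n_obs)

with pm.Model() as model:
    # Parent (population-level) distributions shared by all stations
    mu_a = pm.Normal("mu_a", 0.0, 2.0)
    sigma_a = pm.HalfNormal("sigma_a", 1.0)
    mu_b = pm.Normal("mu_b", 0.0, 1.0, shape=n_features)
    sigma_b = pm.HalfNormal("sigma_b", 1.0, shape=n_features)

    # Station-specific intercepts and coefficients, drawn from the parents
    a = pm.Normal("a", mu=mu_a, sigma=sigma_a, shape=n_stations)
    b = pm.Normal("b", mu=mu_b, sigma=sigma_b, shape=(n_stations, n_features))

    # Per-row logit: the row's station intercept plus its coefficients
    # dotted with that row's weather features
    logit_p = a[station_idx] + (b[station_idx] * X).sum(axis=1)
    pm.Bernoulli("outage", logit_p=logit_p, observed=y)

    idata = pm.sample(1000, tune=1000, target_accept=0.9)
```

Because the station-level parameters share parent distributions, stations with few (or no) positive cases are shrunk toward the population-level estimates rather than fit independently.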

What I would like to discuss is: why does the hierarchical approach perform so much better on the imbalanced dataset?

submitted by /u/paulie007