I have a dataset of electrical outages and it is extremely imbalanced: fewer than 2% of the observations belong to the positive class. I am using weather station data to try to predict the probability of an outage occurring near each weather station.
When I try any other model, I have to rebalance the data to get any good results. However, I recently tried hierarchical Bayesian logistic regression and it performs just fine without resampling. In my methodology, every individual weather station has its own intercept and coefficients, but they are each drawn from a shared parent distribution.
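To make the setup concrete, here is a minimal sketch of the kind of model I mean (illustrative PyMC code with toy stand-in data; `X`, `y`, and `station_idx` are placeholders for my real covariates, outage labels, and station indices, and the priors are just examples):

```python
import numpy as np
import pymc as pm

# Toy stand-in for the real data: weather covariates X, a 0/1 outage label y,
# and the index of the weather station each observation belongs to.
rng = np.random.default_rng(0)
n_stations, n_features, n_obs = 20, 3, 5000
station_idx = rng.integers(0, n_stations, size=n_obs)
X = rng.normal(size=(n_obs, n_features))
y = rng.binomial(1, 0.02, size=n_obs)  # ~2% positive class, as in the real data

with pm.Model() as model:
    # Parent (hyper) distributions shared across all stations
    mu_a = pm.Normal("mu_a", 0.0, 2.0)
    sigma_a = pm.HalfNormal("sigma_a", 1.0)
    mu_b = pm.Normal("mu_b", 0.0, 1.0, shape=n_features)
    sigma_b = pm.HalfNormal("sigma_b", 1.0, shape=n_features)

    # Per-station intercepts and coefficients, partially pooled toward the parents
    a = pm.Normal("a", mu=mu_a, sigma=sigma_a, shape=n_stations)
    b = pm.Normal("b", mu=mu_b, sigma=sigma_b, shape=(n_stations, n_features))

    # Logistic link: each observation uses its own station's parameters
    logit_p = a[station_idx] + (b[station_idx] * X).sum(axis=1)
    pm.Bernoulli("outage", logit_p=logit_p, observed=y)

    idata = pm.sample(1000, tune=1000, target_accept=0.9)
```

The key feature is the partial pooling: each station's intercept and coefficients are shrunk toward the parent distribution, so stations with very few positive examples still get reasonable estimates.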
What I would like to discuss is why the hierarchical approach performs so much better on the imbalanced dataset.
submitted by /u/paulie007