[D] Random Forests and Decision Trees
I'm working on a binary classification problem. Right now I train a decision tree on the data with 100 different random seeds, then take a majority vote over the 100 outputs to get the final predicted class. So if a sample comes out 1 75 times and 0 25 times, the final prediction is 1. I use a pure majority vote (in the event of a tie, I go with 0).

Would there be any benefit to running the exact same procedure, but with 100 random forests instead? In other words, would a decision tree and a random forest tend to get the same examples wrong, or would the forests get some right that the trees miss? I'm trying to push the accuracy a little higher; the current setup works reasonably well, coming in at about 65% accuracy.
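For concreteness, here's roughly what the current setup looks like as a minimal scikit-learn sketch (the make_classification data, the fixed X_eval hold-out, and max_depth=5 are placeholders I've added so the sketch runs and the 100 vote vectors line up; my real pipeline differs in the details):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, random_state=0)  # placeholder data

# Fixed evaluation set so all 100 vote vectors refer to the same samples.
X_pool, X_eval, y_pool, y_eval = train_test_split(
    X, y, test_size=0.2, random_state=42
)

votes = []
for seed in range(100):
    # Each run reseeds both the inner train-test split and the tree (see P.P.S.).
    X_tr, _, y_tr, _ = train_test_split(
        X_pool, y_pool, test_size=0.2, random_state=seed
    )
    tree = DecisionTreeClassifier(max_depth=5, random_state=seed)
    tree.fit(X_tr, y_tr)
    votes.append(tree.predict(X_eval))

votes = np.vstack(votes)                         # shape: (100, n_eval)
majority = (votes.sum(axis=0) > 50).astype(int)  # strict majority of 1s; a 50/50 tie goes to 0
accuracy = (majority == y_eval).mean()
```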
P.S. I do all the normal stuff like a train-test split, limiting the number of branches of the decision tree, etc.
P.P.S. I should note that the random seed changes for both the train-test split and the decision tree each time the next tree is run.
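The variant I'm asking about would just swap the tree for a forest, something like this (continuing from the sketch above, so X_pool, X_eval, etc. carry over; n_estimators=100 is a placeholder):

```python
from sklearn.ensemble import RandomForestClassifier

votes_rf = []
for seed in range(100):
    # Same reseeding scheme; X_pool / X_eval come from the sketch above.
    X_tr, _, y_tr, _ = train_test_split(
        X_pool, y_pool, test_size=0.2, random_state=seed
    )
    forest = RandomForestClassifier(n_estimators=100, max_depth=5, random_state=seed)
    forest.fit(X_tr, y_tr)
    votes_rf.append(forest.predict(X_eval))

majority_rf = (np.vstack(votes_rf).sum(axis=0) > 50).astype(int)  # same tie-to-0 rule
```

(A single random forest already takes an internal majority vote over its bagged trees, so this variant is effectively a vote over votes.)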