[D] Random Forests and Decision Trees

Written by torontoai on November 5, 2019. Posted in Reddit MachineLearning.

I am working on a binary classification problem. I currently fit a decision tree on the data with 100 different random seeds, then tally the 100 outputs to get the final predicted class. So if it comes out 1 75 times and 0 25 times, then the final prediction is 1. I am using a pure majority vote (in the event of a tie, I go with 0). Would there be any benefit to running the exact same thing, but with 100 different random forests? In other words, will a decision tree and a random forest get the same examples wrong, but get different ones right? I am trying to find a way to push the accuracy a little higher. It currently works well, coming in at about 65% accuracy.
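For concreteness, the voting rule described above could be sketched as follows. This is only an illustration of the tally, not the actual code used; the function name `majority_vote` and the variable names are made up for the example.

```python
from typing import Sequence


def majority_vote(votes: Sequence[int]) -> int:
    """Return 1 only if strictly more than half of the votes are 1.

    Ties go to 0, matching the rule described in the post.
    """
    ones = sum(votes)
    return 1 if 2 * ones > len(votes) else 0


# Example from the post: 75 runs predict 1 and 25 runs predict 0.
example_votes = [1] * 75 + [0] * 25
final_prediction = majority_vote(example_votes)

# A 50/50 split falls back to class 0 under the tie rule.
tie_prediction = majority_vote([1] * 50 + [0] * 50)
```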

P.S. I do all the normal stuff like a train-test split, limiting the number of branches in the decision tree, etc.

P.P.S. I should note that the random seed changes for both the train-test split and the decision tree on each run.
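One way to check the "same wrong ones" question directly would be to compare which samples each model misclassifies. The sketch below assumes both models are scored on the same held-out samples (which is not the case if the split changes between runs), and the helper names and toy labels are made up for illustration.

```python
from typing import Sequence, Set


def wrong_indices(y_true: Sequence[int], y_pred: Sequence[int]) -> Set[int]:
    """Indices of the samples a model misclassified."""
    return {i for i, (t, p) in enumerate(zip(y_true, y_pred)) if t != p}


def error_overlap(
    y_true: Sequence[int], pred_a: Sequence[int], pred_b: Sequence[int]
) -> float:
    """Jaccard overlap of two models' error sets (1.0 means identical mistakes)."""
    errors_a = wrong_indices(y_true, pred_a)
    errors_b = wrong_indices(y_true, pred_b)
    union = errors_a | errors_b
    # If neither model makes a mistake, the (empty) error sets are identical.
    return len(errors_a & errors_b) / len(union) if union else 1.0


# Made-up toy data, only to show the calculation.
y_true = [1, 0, 1, 1, 0, 0]
tree_pred = [1, 1, 0, 1, 0, 0]    # wrong at indices 1 and 2
forest_pred = [1, 1, 1, 1, 0, 1]  # wrong at indices 1 and 5
overlap = error_overlap(y_true, tree_pred, forest_pred)
```

An overlap near 1.0 would suggest the two models fail on mostly the same samples, so combining them is unlikely to help much; a lower overlap leaves more room for a combined vote to fix errors.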

submitted by /u/spot4992
