[D] Tips on improving random forest predictive accuracy when # of features is really low?
Working on a random forest predictive model with a continuous response variable and two continuous features. Normally when I do RF projects I use some sort of feature selection method to choose which features to use, then fit the RF model on those features. Then to test accuracy and related metrics I use cross validation, confusion matrices (on classification projects), etc.
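For a continuous response the cross-validation step would score with a regression metric like R² rather than a confusion matrix. A minimal sketch of that workflow in sklearn, using made-up synthetic data in place of the real two features:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

# Placeholder data: two continuous features and a continuous response
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 2 * X[:, 0] + np.sin(X[:, 1]) + rng.normal(scale=0.1, size=200)

rf = RandomForestRegressor(n_estimators=200, random_state=0)

# 5-fold CV with R^2 scoring (a confusion matrix only applies to classification)
scores = cross_val_score(rf, X, y, cv=5, scoring="r2")
print("mean CV R^2:", round(scores.mean(), 3))
```

The same pattern works with `scoring="neg_mean_squared_error"` if you care more about raw error size than explained variance.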
However, in this case I only have two given features. I don’t want my entire project to be just running an RF model on those two features. I’m thinking gradient boosting is what I should learn? I also think I should play around with the number of estimators and the tree depth of the RF. I’m using sklearn in Python if that helps.
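Tuning the number of estimators and depth, and comparing against gradient boosting, can both be done with sklearn's built-in tools. A hedged sketch on synthetic stand-in data (the parameter grid values are just illustrative starting points, not recommendations):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV, cross_val_score

# Placeholder data standing in for the real two features
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 2))
y = X[:, 0] ** 2 + X[:, 1] + rng.normal(scale=0.1, size=300)

# Grid-search over tree count and depth (illustrative values)
param_grid = {"n_estimators": [100, 300], "max_depth": [3, 6, None]}
search = GridSearchCV(RandomForestRegressor(random_state=0), param_grid, cv=5)
search.fit(X, y)
print("RF best params:", search.best_params_)
print("RF best CV R^2:", round(search.best_score_, 3))

# Gradient boosting as a comparison model, default settings
gb = GradientBoostingRegressor(random_state=0)
print("GB mean CV R^2:", round(cross_val_score(gb, X, y, cv=5).mean(), 3))
```

With only two features there is little to gain from feature selection, so most of the available accuracy comes from hyperparameter tuning and model choice like this.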
Any other suggestions? Obviously this type of problem/challenge is an unexplored area for me, so looking for best practices on how to add to my data science toolkit. Thanks!
submitted by /u/truryce