I want to build a model for Chess/Go/Shogi that is trained and tested on real players, and I want it to pass the Turing test. I don’t want my model to play the best move in a position; I want it to play the move that a person would play (of a certain strength, time control, etc.).
It’s easy to frame this as a classification problem and train a CNN on a one-hot encoded policy of the moves actually played. The only problem is that without some kind of look-ahead algorithm (MCTS, for example), the model fails to learn sequences that require multiple moves, such as tactics.
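The supervised setup described here is essentially behavioral cloning: a softmax policy over the move space, pushed toward a one-hot target at the index of the human's move via cross-entropy. A minimal numpy sketch, with a made-up tiny action space (`N_MOVES = 8`; real chess encodings are on the order of a few thousand moves):

```python
import numpy as np

N_MOVES = 8  # hypothetical action-space size for illustration only

def softmax(logits):
    z = logits - logits.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def cross_entropy_loss(logits, played_move):
    """Loss for one position: -log P(the move the human actually played)."""
    probs = softmax(logits)
    return -np.log(probs[played_move])

# Toy network output for one position (in practice: a CNN over board planes).
logits = np.array([0.1, 2.0, -1.0, 0.5, 0.0, 0.3, -0.2, 1.1])
loss = cross_entropy_loss(logits, played_move=1)  # index of the human's move
```

Minimizing this loss over many positions makes the network imitate move choices position by position, which is exactly why it never has to look ahead, and why multi-move tactics are hard for it to pick up.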
However, current MCTS/alpha-beta/minimax models require an evaluation of the leaf nodes, and I don’t have a way to shape the reward into such a leaf evaluation. So my question is: how would I incorporate a look-ahead algorithm into an imitation learning problem like this?
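To make the missing piece concrete, here is a minimal MCTS sketch over a stand-in toy game (take 1 or 2 tokens; whoever takes the last token wins). The `rollout` function is the leaf-evaluation hook the question is about: here it is a random playout, which is precisely the signal that has no obvious analogue in the pure imitation setting:

```python
import math
import random

# Toy game standing in for Chess/Go/Shogi: a pile of n tokens, players
# alternately remove 1 or 2, and the player who takes the last token wins.
def legal_moves(n):
    return [m for m in (1, 2) if m <= n]

class Node:
    def __init__(self, state):
        self.state = state     # tokens left; it is the current player's turn
        self.children = {}     # move -> Node
        self.visits = 0
        self.wins = 0.0        # from the perspective of the player who moved INTO this node

def rollout(state):
    """Leaf evaluation by random playout.
    Returns 1.0 if the player to move at `state` wins, else 0.0."""
    turns = 0
    while state > 0:
        state -= random.choice(legal_moves(state))
        turns += 1
    return 1.0 if turns % 2 == 1 else 0.0  # odd number of moves => first mover took the last token

def mcts_best_move(root_state, iters=3000, c=1.4):
    root = Node(root_state)
    for _ in range(iters):
        node, path = root, [root]
        # Selection: descend fully expanded, non-terminal nodes by UCB1.
        while node.state > 0 and len(node.children) == len(legal_moves(node.state)):
            _, node = max(
                node.children.items(),
                key=lambda kv: kv[1].wins / kv[1].visits
                + c * math.sqrt(math.log(path[-1].visits) / kv[1].visits),
            )
            path.append(node)
        # Expansion: add one untried child.
        if node.state > 0:
            move = random.choice(
                [m for m in legal_moves(node.state) if m not in node.children]
            )
            child = Node(node.state - move)
            node.children[move] = child
            node = child
            path.append(node)
        # Evaluation: THIS is the step imitation learning gives no signal for.
        result = rollout(node.state)
        # Backpropagation: flip the perspective at each level going up.
        for n in reversed(path):
            result = 1.0 - result
            n.visits += 1
            n.wins += result
    # Recommend the most-visited move at the root.
    return max(root.children.items(), key=lambda kv: kv[1].visits)[0]
```

Everything about the search works without a trained evaluator, except `rollout`: swap the toy game for chess and a random playout stops being meaningful, which is where the question's gap sits.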
submitted by /u/Pawngrubber