
[D] How does AlphaStar, a NN that plays StarCraft, encode its output?

For something like AlphaGo (which plays a simple board game), I understand that the neural network can output a "grid" vector the size of the board, and the highest-valued output that corresponds to a legal move is the move you make*. In this case, the neural network is solving the same simple question repeatedly: "Where do I move?". I know how to encode the answer to that question. There are around 400 possible moves in Go (the 19×19 board has 361 points, plus pass), and they are fixed, so a vector of length ~400 can encode every possible action.

(* Actually, AlphaGo uses the NN in a tree search. The NN does not generate moves directly.)
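To make the contrast concrete, here is a minimal sketch (my own illustration, not DeepMind's code) of the fixed-length encoding described above: one logit per board point plus pass, with illegal moves masked out before taking the argmax. The logits and legality mask are made up for demonstration.

```python
import numpy as np

# A 19x19 Go board gives 361 point moves plus "pass",
# so the policy head can be a single fixed-length vector.
BOARD = 19
N_ACTIONS = BOARD * BOARD + 1  # 361 points + pass

rng = np.random.default_rng(0)
logits = rng.normal(size=N_ACTIONS)     # stand-in for the network's raw output
legal = np.ones(N_ACTIONS, dtype=bool)  # legality mask computed from the rules
legal[42] = False                       # e.g. an occupied point is illegal

# Mask illegal moves, then pick the highest-valued legal action.
masked = np.where(legal, logits, -np.inf)
move = int(np.argmax(masked))
```

The key property is that the action space never changes shape: every position in the game maps onto the same ~400-slot vector.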

I don’t understand how a neural network like AlphaStar can output an answer to the much broader question “What should I do?”. The answers can be “build a building”, “kill one of your own buildings”, “build a unit”, “attack a unit”, “move 2 of your units to a position”, “move 3 of your units to another position”, “load your units into a transport”, “use one of your units’ special abilities”, “research a new technology”, etc.
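One way to see why this is hard to encode is that each answer above is a *structure*, not a single index: roughly a function identifier plus a variable set of arguments. A hypothetical sketch (my own illustration, not StarCraft's or AlphaStar's actual interface):

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class Action:
    """A structured game action: a verb plus variable-arity arguments."""
    function: str                                   # e.g. "build", "move", "attack"
    unit_ids: List[int] = field(default_factory=list)  # acting units (variable length!)
    target_pos: Optional[Tuple[int, int]] = None    # map coordinate, if relevant
    target_id: Optional[int] = None                 # target unit id, if relevant

# "move 2 of your units to a position" vs. "move 3 of your units to another position"
a1 = Action("move", unit_ids=[3, 7], target_pos=(120, 40))
a2 = Action("move", unit_ids=[3, 7, 9], target_pos=(15, 88))
```

A flat one-hot vector over all such structures would be astronomically large, which is exactly the encoding problem the question raises.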

How are the answers to such a broad question encoded? Do we know how AlphaStar does it?

I’m especially baffled by the changing number of units in StarCraft. Encoding the actions 2 units can take seems significantly different from encoding the actions 3 units can take. Do they use a multi-agent setup? Is each unit running its own NN and determining its own actions individually?

submitted by /u/Buttons840