[R] Badger architecture by GoodAI

Blog: https://blog.marekrosa.org/2019/12/badger.html

Paper: https://arxiv.org/abs/1912.01513

Badger = an architecture and a learning procedure where:

An agent is made up of many experts

All experts share the same communication policy (expert policy), but have different internal memory states

There are two levels of learning: an inner loop (with a communication stage) and an outer loop

Inner loop – The agent's behavior and adaptation emerge as a result of experts communicating with one another. Experts send messages (of any complexity) to each other and update their internal memories/states based on observations, incoming messages, and their internal state from the previous time step. The expert policy is fixed and does not change during the inner loop.

The inner-loop loss need not even be a proper loss function. It can be any kind of structured feedback that guides adaptation during the agent's lifetime.

Outer loop – An expert policy is discovered over generations of agents, ensuring that strategies that find solutions to problems in diverse environments can quickly emerge in the inner loop.

The agent's objective is to adapt quickly to novel tasks
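To make the two loops concrete, here is a minimal toy sketch in plain Python. It is not the GoodAI implementation. Everything in it is an assumption chosen for illustration: the names (`init_policy`, `expert_step`, `inner_loop`, `fitness`, `outer_loop`), the state size, mean-pooled messages, the "echo the current observation" task, and simple hill climbing standing in for whatever outer-loop optimizer is actually used.

```python
import math
import random

STATE_DIM = 4  # hypothetical size of each expert's internal memory


def init_policy(rng):
    """One shared weight matrix over [state, incoming message, observation, bias]."""
    n_in = 2 * STATE_DIM + 1
    return [[rng.gauss(0.0, 0.5) for _ in range(n_in + 1)] for _ in range(STATE_DIM)]


def expert_step(policy, state, message, obs):
    """Apply the shared expert policy to one expert's inputs."""
    x = state + message + [obs, 1.0]  # trailing 1.0 is the bias input
    return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in policy]


def inner_loop(policy, obs_seq, n_experts, rng):
    """One agent lifetime: the policy stays fixed, only expert states change."""
    states = [[rng.gauss(0.0, 1.0) for _ in range(STATE_DIM)] for _ in range(n_experts)]
    outputs = []
    for obs in obs_seq:
        # Communication stage (assumed): each expert receives the mean of the
        # other experts' previous states as its incoming message.
        total = [sum(s[d] for s in states) for d in range(STATE_DIM)]
        messages = [
            [(total[d] - s[d]) / max(n_experts - 1, 1) for d in range(STATE_DIM)]
            for s in states
        ]
        states = [expert_step(policy, s, m, obs) for s, m in zip(states, messages)]
        # Agent output (assumed): mean of the first state component.
        outputs.append(sum(s[0] for s in states) / n_experts)
    return outputs


def fitness(policy, rng, n_tasks=8, steps=6):
    """Negative mean squared error on a toy task: echo the current observation."""
    total = 0.0
    for _ in range(n_tasks):
        n_experts = rng.randint(2, 6)  # vary the number of experts across tasks
        obs_seq = [rng.uniform(-1.0, 1.0) for _ in range(steps)]
        outputs = inner_loop(policy, obs_seq, n_experts, rng)
        total += sum((o - x) ** 2 for o, x in zip(outputs, obs_seq)) / steps
    return -total / n_tasks


def outer_loop(generations=30, seed=0):
    """Hill climbing over the shared policy; a stand-in for the real outer loop."""
    rng = random.Random(seed)
    best = init_policy(rng)
    best_fit = fitness(best, random.Random(seed))  # same task set for every candidate
    for _ in range(generations):
        cand = [[w + rng.gauss(0.0, 0.1) for w in row] for row in best]
        cand_fit = fitness(cand, random.Random(seed))
        if cand_fit > best_fit:
            best, best_fit = cand, cand_fit
    return best, best_fit


if __name__ == "__main__":
    policy, score = outer_loop()
    print("best fitness:", score)
```

The point of the sketch is the division of labor: `inner_loop` never touches the policy weights, so any adaptation within a lifetime has to live in the expert states and messages, while `outer_loop` only ever changes the shared weights.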

Badger exhibits the following novel properties:

Roles of experts and connectivity among them are assigned dynamically at inference time

Learned communication protocol with context-dependent messages of varied complexity

Generalizes to different numbers and types of inputs/outputs

Can be trained to handle variations in architecture during both training and testing
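One way to picture connectivity that is decided at inference time and is independent of the number of experts is attention-style message routing. The sketch below is only an illustration under my own assumptions: `attention_messages` and `softmax` are hypothetical helpers, and each expert's raw state doubles as its query, key, and value, with no learned projections. Whether Badger routes messages exactly this way is not stated in the summary above.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]


def attention_messages(states):
    """Compute each expert's incoming message as an attention-weighted mix of all states.

    Returns (messages, weights); weights[i][j] is how strongly expert i listens
    to expert j, recomputed from the current states on every call.
    """
    dim = len(states[0])
    scale = math.sqrt(dim)
    messages, weights = [], []
    for query in states:
        scores = [sum(a * b for a, b in zip(query, key)) / scale for key in states]
        w = softmax(scores)
        weights.append(w)
        messages.append([sum(wj * s[d] for wj, s in zip(w, states)) for d in range(dim)])
    return messages, weights


if __name__ == "__main__":
    # The same function handles 3 experts or 5 without any change.
    three = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    five = three + [[-1.0, 0.5], [0.2, -0.3]]
    for group in (three, five):
        msgs, w = attention_messages(group)
        print(len(msgs), [round(x, 3) for x in w[0]])
```

Because the weights are a function of the current states rather than a fixed wiring diagram, who talks to whom can shift from step to step, and adding or removing experts only changes the length of each weight row.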

submitted by /u/sorrge