A Consciousness-Inspired Planning Agent for Model-Based Reinforcement Learning

We present an end-to-end, model-based deep reinforcement learning agent that dynamically attends to relevant parts of its state in order to plan and to generalize better out-of-distribution. The agent’s architecture uses a set representation and a bottleneck mechanism, forcing the number of entities to which the agent attends at each planning step to be small. In experiments with customized MiniGrid environments with different dynamics, we observe that the design allows agents to learn to plan effectively by attending to the relevant objects, leading to better out-of-distribution generalization. Read on arxiv.org/abs/2106.02097
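As a rough illustration of the bottleneck idea in the abstract above, the sketch below scores a set of entity vectors against a query and keeps only the k highest-scoring ones. It is not the paper's implementation: the function name `select_bottleneck`, the scaled dot-product scoring, the hard top-k selection, and the random data are all assumptions made for illustration.

```python
import numpy as np


def select_bottleneck(entities, query, k):
    """Keep the k entities that score highest against the query.

    Hypothetical helper, not taken from the paper.
    entities: array of shape (n, d), one row per entity in the set.
    query:    array of shape (d,).
    Returns (indices, scores, selected) where selected has shape (k, d).
    """
    d = entities.shape[1]
    # Scaled dot-product relevance score for every entity (assumed scoring rule).
    scores = entities @ query / np.sqrt(d)
    # Indices of the k largest scores; stable sort keeps ties deterministic.
    indices = np.argsort(-scores, kind="stable")[:k]
    return indices, scores, entities[indices]


# Toy data: 16 entities with 8 features each (random, for illustration only).
rng = np.random.default_rng(0)
entities = rng.normal(size=(16, 8))
query = rng.normal(size=8)

indices, scores, selected = select_bottleneck(entities, query, k=4)
```

Because the selection does not depend on the order of the rows, the same entities are chosen however the set is permuted, which is one reason a set representation pairs naturally with this kind of bottleneck. A planner would then operate only on `selected` rather than on the full state.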
