Exploration and exploitation during sequential search

Gregory Dam*, Konrad Körding

*Corresponding author for this work

Research output: Contribution to journal (Article, peer-reviewed)

14 Scopus citations

Abstract

When we learn how to throw darts, we adjust how we throw based on where the darts stick. Much of skill learning is computationally similar in that we learn using feedback obtained after the completion of individual actions. We can formalize such tasks as a search problem: among the set of all possible actions, find the action that leads to the highest reward. In such cases our actions have two objectives: we want to best utilize what we already know (exploitation), but we also want to learn to be more successful in the future (exploration). Here we tested how participants learn movement trajectories where feedback is provided as a monetary reward that depends on the chosen trajectory. We mathematically derived the optimal search policy for our experiment using decision theory. The search behavior of participants is well predicted by an ideal searcher model that optimally combines exploration and exploitation.

Original language: English (US)
Pages (from-to): 530-541
Number of pages: 12
Journal: Cognitive Science
Volume: 33
Issue number: 3
State: Published - May 2009

Keywords

  • Decision making
  • Human search behavior
  • Mathematical modeling
  • Motor control
  • Neuroeconomics
  • Skill acquisition

ASJC Scopus subject areas

  • Experimental and Cognitive Psychology
  • Cognitive Neuroscience
  • Artificial Intelligence
