Integrating vision and natural language without central models

Ian Horswill*

*Corresponding author for this work

Research output: Contribution to conference › Paper › peer-review

8 Scopus citations

Abstract

Ludwig answers natural language queries about simple scenes using a real-time vision system based on current biological theories of vision. Ludwig is unusual in that it does not use a propositional database to model the world. Instead, it simulates the interface of a traditional world model by providing plug-compatible operations that are implemented directly using real-time vision. Logic variables are bound to image regions, rather than complex data structures, while predicates, relations, and existential queries are computed on demand by the vision system. This architecture allows Ludwig to "use the world as its own best model" in the most literal sense. The resulting simplifications in the modeling, reasoning, and parsing systems allow them to be implemented as communicating finite state machines, thus giving them a weak biological plausibility. The resulting system is highly pipelined and incremental, allowing noun phrase referents to be visually determined even before the entire sentence has been parsed.
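A minimal sketch of the idea described in the abstract, not drawn from the paper itself: logic variables range over image regions produced by the vision system, and predicates are evaluated from pixel data each time a query needs them rather than looked up in a stored world model. All names here (Region, regions, red, green, left_of) are hypothetical illustrations, with a stubbed region generator standing in for the real-time vision front end.

    from dataclasses import dataclass
    from typing import Iterator

    @dataclass(frozen=True)
    class Region:
        """A connected image region reported by a (stubbed) vision front end."""
        x: int            # centroid column in pixels
        y: int            # centroid row in pixels
        mean_hue: float   # average hue in degrees; near 0 = red, near 120 = green

    def regions() -> Iterator[Region]:
        """Stand-in for the real-time vision system's region stream."""
        yield Region(x=40, y=80, mean_hue=3.0)     # a reddish region
        yield Region(x=210, y=95, mean_hue=118.0)  # a greenish region

    # Predicates are computed from image data on demand;
    # nothing is cached in a propositional database.
    def red(r: Region) -> bool:
        return r.mean_hue < 15.0 or r.mean_hue > 345.0

    def green(r: Region) -> bool:
        return 90.0 < r.mean_hue < 150.0

    def left_of(a: Region, b: Region) -> bool:
        return a.x < b.x

    def red_left_of_green() -> Iterator[tuple[Region, Region]]:
        """Existential query 'some red region left of some green region':
        the logic variables a and b are bound directly to image regions."""
        for a in regions():
            if not red(a):
                continue
            for b in regions():
                if green(b) and left_of(a, b):
                    yield a, b

    if __name__ == "__main__":
        for a, b in red_left_of_green():
            print(f"red region at ({a.x}, {a.y}) is left of green region at ({b.x}, {b.y})")

In the system the abstract describes, such queries are posed incrementally by the parser, so a noun phrase's referent can be bound to a region before the rest of the sentence has been parsed.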

Original language: English (US)
Pages: 54-61
Number of pages: 8
State: Published - 1995
Event: 1995 AAAI Fall Symposium on Embodied Language and Action - Cambridge, United States
Duration: Nov 10, 1995 - Nov 12, 1995

Conference

Conference: 1995 AAAI Fall Symposium on Embodied Language and Action
Country/Territory: United States
City: Cambridge
Period: 11/10/95 - 11/12/95

ASJC Scopus subject areas

  • General Engineering
