Through the growing popularity of voice-enabled search, multimodal applications are finally starting to get into the hands of consumers. However, these applications are principally for mobile platforms and generally involve highly-moded interaction where the user has to click or hold a button in order to speak. Significant technical challenges remain in bringing multimodal interaction to other environments such as smart living rooms and classrooms, where users speech and gesture is directed toward large displays or interactive kiosks and the microphone and other sensors are 'always on'. In this demonstration, we present a framework combining low cost hardware and open source software that lowers the barrier of entry for exploration of multimodal interaction in smart environments. Specifically, we will demonstrate the combination of infrared tracking, face detection, and open microphone speech recognition for media search (magicTV) and map navigation (magicMap).