Multimodal interactive spaces: MagicTV and magicMap

Marcelo Worsley*, Michael Johnston

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Scopus citations


Through the growing popularity of voice-enabled search, multimodal applications are finally starting to reach consumers. However, these applications are principally for mobile platforms and generally involve highly-moded interaction, where the user has to click or hold a button in order to speak. Significant technical challenges remain in bringing multimodal interaction to other environments such as smart living rooms and classrooms, where users' speech and gestures are directed toward large displays or interactive kiosks and the microphone and other sensors are 'always on'. In this demonstration, we present a framework combining low-cost hardware and open source software that lowers the barrier to entry for exploration of multimodal interaction in smart environments. Specifically, we will demonstrate the combination of infrared tracking, face detection, and open-microphone speech recognition for media search (magicTV) and map navigation (magicMap).
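The abstract's pairing of face detection with an always-on microphone can be illustrated with a minimal sketch. This is a hypothetical illustration, not the authors' implementation: it assumes a per-frame `Frame` record carrying detector outputs, and uses visual attention (a detected face oriented toward the display) to decide which recognized utterances count as commands, in place of push-to-talk.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Frame:
    face_detected: bool        # output of a face detector (e.g. a Haar cascade)
    speech_text: Optional[str] # output of an always-on speech recognizer, if any

def gate_speech(frames: List[Frame]) -> List[str]:
    """Accept recognized speech only while a face is directed at the display.

    This approximates open-microphone interaction: instead of requiring a
    button press, visual attention gates which utterances are commands.
    """
    commands = []
    for f in frames:
        if f.face_detected and f.speech_text:
            commands.append(f.speech_text)
    return commands

# Only the utterance made while facing the screen is kept.
stream = [
    Frame(face_detected=False, speech_text="just chatting"),
    Frame(face_detected=True, speech_text="show me comedies"),
    Frame(face_detected=True, speech_text=None),
]
print(gate_speech(stream))  # ['show me comedies']
```

In a real system the gating signal would come from continuous sensor streams rather than discrete frames, but the same idea applies: fuse a visual attention cue with recognizer output before acting on speech.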

Original language: English (US)
Title of host publication: 2010 IEEE Workshop on Spoken Language Technology, SLT 2010 - Proceedings
Number of pages: 2
State: Published - Dec 1 2010
Event: 2010 IEEE Workshop on Spoken Language Technology, SLT 2010 - Berkeley, CA, United States
Duration: Dec 12 2010 – Dec 15 2010

Publication series

Name: 2010 IEEE Workshop on Spoken Language Technology, SLT 2010 - Proceedings


Other: 2010 IEEE Workshop on Spoken Language Technology, SLT 2010
Country/Territory: United States
City: Berkeley, CA


Keywords

  • Gesture recognition
  • Multimodal integration
  • Open microphone
  • Speech recognition

ASJC Scopus subject areas

  • Language and Linguistics

