Abstract: This paper reports on the utility of gestures and speech for manipulating graphic objects. In the experiment described here, three different populations of subjects were asked to communicate with a computer using speech alone, gestures alone, or both. The task was to manipulate a three-dimensional cube on the screen. Subjects were asked to assume that the computer could see their hands, hear their voices, and understand their gestures and speech as well as a human could. A gesture classification scheme was developed to analyse the subjects' gestures. A primary objective of the scheme was to determine whether common features would be found among the gestures of different users and classes of users. The collected data show a surprising degree of commonality among subjects in the use of both gestures and speech. In addition to the uniformity of the observed manipulations, subjects expressed a preference for a combined gesture/speech interface, and all subjects easily completed the simulated object manipulation tasks. The results of this research, and of future experiments of this type, can be applied to develop a gesture-based or gesture/speech-based system that enables computer users to perform spatial tasks by manipulating graphic objects with intuitive, easily learned gestures. Such tasks might include editing a three-dimensional rendering, controlling the operation of vehicles, operating virtual tools in three dimensions, or assembling an object from components. Knowledge of how people intuitively use gestures to communicate with computers provides a basis for the future development of gesture-based input devices.