THE APPLICATION OF A METHOD FOR INTEGRATING
NON-SPEECH AUDIO INTO HUMAN-COMPUTER INTERFACES
Stephen A. Brewster (1), Peter C. Wright (2) and Alistair D. N. Edwards (2)
(1) VTT Information Technology, Tekniikantie 4 B, P.O. Box 1203, FIN-02044 VTT, Finland.
Tel.: +358 0 456 4311
[email protected]
(2) Department of Computer Science, University of York, Heslington, York, YO1 5DD, UK.
Tel.: +44 904 432775
[pcw, alistair]@minster.york.ac.uk
This paper describes the application of a structured method for integrating non-speech
sound into graphical interfaces. The method analyses interactions in terms of event,
status and mode information. It then categorises this information in terms of the feedback
needed to present it. This is then combined with guidelines for creating sounds to
generate the auditory feedback required. As an example, the method is applied to a
scrollbar. This sonically-enhanced scrollbar is then experimentally evaluated to see if the
auditory enhancements are effective. The results show that the new scrollbar reduced the
time taken to perform certain tasks, reduced the time taken to recover from errors and
reduced mental workload, and that participants preferred it to the standard graphical scrollbar.
Auditory interfaces, earcons, sonically-enhanced scrollbar, sonification, multimodal
The combination of graphical and auditory information at the human-computer interface is a natural step forward. In everyday life both senses combine to give us complementary information about the world. The visual system gives us detailed data about a small area of focus whereas the auditory system provides general data from all around, alerting us to things outside our peripheral vision. The advantages offered by the combination of these different senses can be brought to the human-computer interface. Dannenberg & Blattner (pp. xviii-xix) discuss some of the advantages of using this approach in multimedia/multimodal computer systems:
"People communicate more effectively through multiple channels. ... Music and other sound in film or drama can be used to communicate aspects of the plot or situation that are not verbalised by the actors. Ancient drama used a chorus and musicians to put the action into its proper setting without interfering with the plot. Similarly, non-speech audio messages can communicate to the computer user without interfering with an application".
Currently, almost all information presented by computers is visual. A multimodal interface that integrated information output to both senses could capitalise on the interdependence between them and present information in the most efficient way possible.
How then should the information be apportioned to each of the senses? Sounds can do more than simply indicate errors or supply redundant feedback for what is already available on the graphical display. They should be used to present information that is not currently displayed (give more information) or present existing information in a more effective way so that users can deal with it more efficiently.
Alty & McCartney have begun to consider this problem in process control environments. They wanted to create a multimedia process control system that would choose the appropriate modality (from those available) for presenting information to the plant operator. A resource manager was to be used for this. In such a system there is much information that must be presented and the appropriate modality may not always be available because, for example, it is being used for other output at the same time. Alty & McCartney suggest that alternative media could then be