Choosing Effective Colours for Data Visualization
Christopher G. Healey1
Department of Computer Science, University of British Columbia
In this paper we describe a technique for choosing multiple colours for use during data visualization. Our goal is a systematic method for maximizing the total number of colours available for use, while still allowing an observer to rapidly and accurately search a display for any one of the given colours. Previous research suggests that we need to consider three separate effects during colour selection: colour distance, linear separation, and colour category. We describe a simple method for measuring and controlling all of these effects. Our method was tested by performing a set of target identification studies; we analysed the ability of thirty-eight observers to find a colour target in displays that contained differently coloured background elements. Results showed our method can be used to select a group of colours that will provide good differentiation between data elements during data visualization.
CR Descriptors: H.5.2 [Information Interfaces and Presentation]: User Interfaces - ergonomics, screen design (graphics, colour); I.3.6 [Computer Graphics]: Methodology and Techniques - ergonomics, interaction techniques
Scientific visualization in computer graphics is a rapidly expanding area of research. This is due in large part to the dramatic increase in both the size and the number of datasets that need to be visualized [5, 18]. To date, many application-specific tools have been built to help analyse individual datasets. Much less work has focused on developing guidelines for the design of visualization techniques. Our work is intended to address one aspect of this more general question.
A typical method of visualizing a dataset involves mapping data attributes to visual features (e.g., shape, size, spatial location, and orientation). Colour is an important and frequently-used feature. Examples include colour temperature gradients on maps and charts, colour-coded vector fields in flow visualization, or colour icons displayed by real-time simulation systems.
If we use colour to represent our data, an important question to ask is: "How can we choose effective colours that provide good differentiation between data elements during the visualization task?" We address this problem by trying to answer three related questions:
1 Department of Computer Science, University of British Columbia, 2366 Main Mall, Vancouver, British Columbia, V6T 1Z2, Canada
• How can we allow rapid and accurate identification of individual data elements through the use of colour?
• What factors determine whether a "target" element's colour will make it easy to find, relative to differently coloured "nontarget" elements?
• How many colours can we display at once, while still allowing for rapid and accurate target identification?
Previous work has addressed the issue of choosing colours for certain types of data visualization. For example, Ware and Beatty describe a simple colour visualization technique for displaying correlation in a five-dimensional dataset. Research on the design of military systems sometimes quotes anecdotal evidence that suggests "...the general guideline for computer-generated images is no more than five to seven colours at a time...", although they offer no explanation for why this might be the case. Robertson, Ware, Rheingans and Tebbs, and Levkowitz and Herman discuss various methods for building effective colour gamuts and colourmaps [15, 21, 14, 13]. Recent work at the IBM Thomas J. Watson Research Center has focused on a rule-based visualization tool which considers how a user perceives visual features like hue, luminance, height, and so on [16, 2].
None of these techniques were intended to investigate the rapid and accurate identification of individual data elements based on colour. Also, since the colour gamut and colourmap work uses continuous colour scales to encode information, it does not address the question of how many colours we can effectively display at once, while still providing good differentiation between individual data elements.
An intuitive first step to gaining more control over colour would be to use a perceptual colour model like CIELUV, CIE Lab, or Munsell. Unfortunately, fixing the colour distance in a perceptual colour model to a constant value does not guarantee that each colour will be equally easy to detect. Other factors can affect how groups of coloured elements interact with one another.
Our technique does, in fact, use the CIELUV colour model to provide control over colour distance and isoluminance. We also exploit two specific results related to colour target detection: linear separation [7, 1] and colour category. These effects are controlled to allow for the rapid and accurate identification of colour targets. Target identification is a necessary first step towards performing other types of exploratory data analysis. If we can rapidly and accurately differentiate elements based on their colour, we can apply our results to other important visualization techniques like detection of data boundaries, the tracking of data regions in real-time, and enumeration tasks like counting and estimation [20, 19, 9].
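As a rough illustration of what "colour distance" and "isoluminance" mean in CIELUV, the following Python sketch converts CIE XYZ tristimulus values to L*u*v* using the standard CIE 1976 formulas and measures distance as the Euclidean distance between two L*u*v* triples. It is not the selection procedure used in our studies. The function names, the XYZ scaling (Y = 100 for white), and the D65 reference white are assumptions made only for this example; in practice the reference white would come from the calibrated monitor.

```python
import math

# Assumption for illustration only: CIE D65 reference white, Y scaled to 100.
WHITE_D65 = (95.047, 100.0, 108.883)


def _uv_prime(x, y, z):
    """CIE 1976 u', v' chromaticity coordinates of an XYZ triple."""
    denom = x + 15.0 * y + 3.0 * z
    if denom == 0.0:  # pure black: chromaticity is undefined, use 0
        return 0.0, 0.0
    return 4.0 * x / denom, 9.0 * y / denom


def xyz_to_luv(xyz, white=WHITE_D65):
    """Convert a CIE XYZ triple to CIELUV (L*, u*, v*)."""
    x, y, z = xyz
    y_ratio = y / white[1]
    if y_ratio > (6.0 / 29.0) ** 3:
        l_star = 116.0 * y_ratio ** (1.0 / 3.0) - 16.0
    else:
        l_star = (29.0 / 3.0) ** 3 * y_ratio
    u_p, v_p = _uv_prime(x, y, z)
    u_n, v_n = _uv_prime(*white)
    return l_star, 13.0 * l_star * (u_p - u_n), 13.0 * l_star * (v_p - v_n)


def luv_distance(a, b):
    """Euclidean distance between two (L*, u*, v*) triples."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def is_isoluminant(a, b, tol=1e-6):
    """True when two (L*, u*, v*) triples have the same L* within tol."""
    return abs(a[0] - b[0]) <= tol
```

Because L* depends only on Y, any two colours with equal Y are isoluminant in this sense; their perceived difference is then carried entirely by u* and v*. As noted above, holding `luv_distance` constant across a set of colours is not by itself sufficient to make every colour equally easy to detect.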
Before we describe our technique in detail, we provide a brief overview of the CIELUV colour model, as well as a description of the linear separation and colour category effects.