Developing a quick guide on presenting data and uncertainty

This paper describes the rationale, development and testing of a quick guide leaflet on the presentation of data and uncertainty. While tools for capturing, analysing and presenting data become more advanced, the fundamentals of presenting data with adequate context and clarity remain unchanged. The leaflet, aimed at scientists generating and presenting data, was created as part of an interdisciplinary collaboration (meteorologists, information designers, psychologists) working within the PURE Network project RACER, funded by the UK Natural Environment Research Council. A copy of the leaflet is included in this article.


Introduction
As a discipline dealing with large amounts of complex data, meteorology relies on communication through numbers, graphs and charts. While atmospheric scientists are skilled users of these communication modes, many of their readers are not trained to interpret complicated information. Difficulties in interpretation may particularly affect those who are not familiar with the conventions of data presentation, or with the use of data relating to forecasts subject to uncertainty. Spiegelhalter et al. (2011) give an excellent overview of some of the difficulties in communicating uncertainty about the future, and Gigerenzer et al. (2005) demonstrate the specific interpretation problems that can occur when percentages are used in rain forecasts. As part of the PURE (Probability, Uncertainty & Risk in the Environment) research programme, funded by the UK Natural Environment Research Council, an interdisciplinary team of meteorologists, designers and psychologists at the University of Reading carried out survey and experimental studies on user interpretations of, and decisions based on, environmental risk information (e.g. Mulder et al., 2017). This collaboration gave rise to more general consideration of environmental data communication and its role in enabling professional and public analysis and decision making. One outcome of this project was the creation of a short leaflet, Presenting Data and Uncertainty, the development and testing of which is described below. A full copy of the leaflet is shown in Figure 1.
The length of the main guides from both sources (the WMO and PBL guides discussed below) led us to question their suitability for quick reference and potential uptake by their target audience of researchers and professionals who communicate uncertainty.

Current sources of advice
The substantial research literature on visualising and communicating uncertainty, which is fragmented across a wide range of fields (medicine, physics, chemistry, computer science, business, etc.) as well as GIS (Geographic Information Science) and atmospheric science, is not amenable to quick referencing for guidance on data presentation issues. Even longer guides, such as those of the WMO and PBL, cannot fully represent the scope of relevant research. While some research review articles present a more general approach (e.g. Pang et al., 1997; MacEachren et al., 2005; Brodlie et al., 2012), others remain within a specific field of application (e.g. Bostrom et al., 2008 on 'seismic risk'). Experimental studies within the literature often deal with specific situations of communication. Cheong et al. (2016), for example, studied the effects of visualisation of risk surrounding wildfire 'burn likelihood' in a static, map-based, decision-making task. Necessarily, such studies use controlled variables, but this means it can be difficult to judge to what extent recommendations from one specific situation of communication can be carried over into different systems and contexts.
We had an opportunity to consult informally with producers and consumers of probabilistic and uncertain information at a workshop attended by academic and industry specialists to consider the forecasting and communication of volcanic ash concentrations (at the Royal Academy of Engineering, London on 22 February 2016). The consultation revealed that participants were not familiar with the WMO or PBL guideline documents. Subsequent discussion with researchers and PhD students in the Department of Meteorology at the University of Reading indicated that they would be unlikely to consult long guides on data presentation, even if aware of them.

Observation of current communications
Our experience with atmospheric science data presented in journals (and wider uncertainty literature) and conferences suggests that shortcomings in data and uncertainty presentation are not unusual. There are often fundamental communication weaknesses, including a lack of clear titling of graphs, little contextualisation for data representations, and indistinct colour use. While these and similar features are not unique to the communication of uncertainty, they can further complicate data interpretation in this already difficult area.

Presenting data and uncertainty
There is a difference between data presentation that allows viewers to draw their own conclusions (where appropriate) and lapses in communication fundamentals that leave space for, or unwittingly encourage, mistaken interpretations and invalid conclusions. For example, a 3D projection of a pie chart with perspective 'effects' applied can make it difficult for readers to accurately assess the angle of a segment or make accurate comparisons between segments.
Weaknesses in contextualisation can occur at several different levels. In the most basic examples, contextualisation might be limited by missing labels or units on axes, or titles that are either absent or lacking in detail. Mistakes at this basic level can leave readers struggling to work out what a graph is trying to show at all. On a wider scale, poor captioning and integration of the chart/graph figure into the main text can cause confusion, especially if the text does not directly support the figure. In some cases, poor communication results from inconsistencies or clear errors in presentation, but in others there may be a system of internal logic that is not transparent to the reader. That is, when taken in isolation, a specific visualisation (for example) has a reasoning and sense behind its organisation, but this internal sense is then at odds with the wider context or conventions of interpretation and understanding. Similar effects can be seen when a discipline or group has developed its own conventions (use of colour, terminology, assumptions) that appear immediately comprehensible to its own members, but opaque to other readers. Public-facing weather forecasts, for example, are a strongly conventionalised area, with approaches to presentation and public interpretation varying by region and nationality. Gigerenzer et al. (2005) demonstrated this variation, showing differences in interpretation and assumed context for a basic forecast, '30% chance of rain tomorrow', in New York compared to four European cities.
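As an illustration of the basic level of contextualisation described above (a descriptive title, axis labels with units, and a labelled uncertainty range), the following is a minimal sketch in Python using matplotlib. It is not taken from the leaflet; the variable names, the output filename and all of the values are hypothetical and chosen only for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Hypothetical illustrative values, NOT data from the article.
lead_time_h = [0, 6, 12, 18, 24]               # forecast lead time (hours)
temp_mean_c = [12.0, 13.5, 15.0, 14.0, 12.5]   # central forecast (deg C)
temp_low_c = [11.5, 12.5, 13.0, 11.5, 9.5]     # lower end of forecast range
temp_high_c = [12.5, 14.5, 17.0, 16.5, 15.5]   # upper end of forecast range

fig, ax = plt.subplots(figsize=(6, 4))

# Shaded band shows the uncertainty range; the legend says what it means.
ax.fill_between(lead_time_h, temp_low_c, temp_high_c,
                alpha=0.3, label="Forecast range (illustrative)")
ax.plot(lead_time_h, temp_mean_c, marker="o", label="Central forecast")

# Title and axis labels state what is shown and in which units.
ax.set_title("Forecast air temperature with uncertainty range")
ax.set_xlabel("Lead time (hours)")
ax.set_ylabel("Air temperature (°C)")
ax.legend(loc="upper left")

fig.tight_layout()
fig.savefig("forecast_example.png", dpi=150)  # hypothetical output filename
```

What the shaded range represents (for example, an ensemble spread or a stated probability interval) would still need to be explained in the caption and accompanying text, since the code alone cannot supply that context.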

Responding to context
Our investigation (albeit informal) suggested a need for a quick reference guide on data and uncertainty presentation, with a focus on awareness of good fundamentals and a 'light', easily retrievable approach to increase the likelihood of potential users assimilating the key messages. Guidance would need to be relatively general, as specific details vary with the context, even within atmospheric science communication. We therefore aimed to produce a short leaflet that would:
- raise awareness of the needs of some audiences to be supported in understanding data presented to them, particularly in the presence of uncertainty, and
- highlight approaches to presentation that would improve data transparency and, conversely, those that can hinder interpretation.
The small format of a leaflet, while useful for quick reference, imposed a hard limit on how much could be communicated. The WMO two-page summary guide concentrates on why it is useful to communicate uncertainty information, followed by some examples of uncertainty data/graphics. We aimed to give more direct advice.
The content of the leaflet is focused around a list of core topics. The list was based on the research and teaching experience of our meteorologist team members and insights from project researchers in psychology and design. This experience includes teaching at undergraduate and postgraduate levels in the three disciplines, and, outside of meteorology, expertise in perceptual psychology, cognition and user-centred information design. Copy and illustrations to cover the topic list were devised and compiled by our designer team members into a draft leaflet for feedback.

Survey feedback on leaflet draft
A short survey was conducted ahead of full publication to check whether the leaflet (in draft form) was relevant and useful for its intended purpose. The survey was distributed via Survey Monkey to PURE Network members (including representatives from insurance companies, government agencies, energy companies, first responders, and the aviation industry) and to academic staff and postgraduate researchers in the Department of Meteorology at the University of Reading. Twenty-nine responses were returned, from both academic- and industry-based recipients.
As the leaflet dealt with both general issues in data presentation and specific issues relating to the presentation of uncertainty, most of the survey questions had two parts, addressing 'data' and 'uncertainty' presentation as distinct points. Overall, respondents did not feel that the leaflet introduced them to new aspects of presenting data or uncertainty (only 7 and 9 respondents, respectively, agreeing that it introduced new aspects). However, a majority (21 of the 29 respondents for both data and uncertainty questions) felt that the leaflet reminded them of aspects of which they were already aware, but did not always implement in their own work. The benefits of the guide in collating reminders of basic good practice were indicated by 13 respondents, who agreed that as a result of the leaflet they would make changes to the way they presented their own data. Similarly, 13 respondents agreed that they would change the way they reviewed the data presentation of others.
In response to a question regarding the potential audience for such a leaflet, 25 of the 29 respondents thought it would be useful for trainee research scientists and undergraduates, and 20 for experienced research scientists. Recommendations for audiences outside of academia dropped to around half of respondents in other key areas (14 recommending it for civil servants, and journalists and broadcasters, 15 for industry partners). This bias towards potential for academic users might reflect the research background of many of the respondents, or that the respondents felt that academic users were more likely than others to create data presentations.
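With only 29 respondents, the counts reported above are easier to compare when expressed as proportions with an indication of their uncertainty. The sketch below is our own illustration, not part of the survey analysis: it converts some of the reported counts to percentages and attaches an approximate 95% Wilson score interval. The short labels are paraphrases, and the interval treats respondents as a simple random sample, which a volunteer survey is not, so the ranges should be read as indicative only.

```python
import math

N_RESPONDENTS = 29  # total responses reported in the text

# Counts as reported in the text; the labels are our own paraphrases.
counts = {
    "introduced new aspects of data presentation": 7,
    "introduced new aspects of uncertainty presentation": 9,
    "reminded of known but unimplemented aspects": 21,
    "would change own data presentation": 13,
    "useful for trainees and undergraduates": 25,
    "useful for experienced research scientists": 20,
}


def wilson_interval(k, n, z=1.96):
    """Approximate 95% Wilson score interval for the proportion k/n."""
    p = k / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


def summarise(k, n=N_RESPONDENTS):
    """Format a count as 'k of n (percentage, approximate interval)'."""
    low, high = wilson_interval(k, n)
    return (f"{k} of {n} ({100 * k / n:.0f}%, "
            f"approx. 95% interval {100 * low:.0f}% to {100 * high:.0f}%)")


for label, k in counts.items():
    print(f"{label}: {summarise(k)}")
```

Reporting both the raw count and the interval keeps the small sample size visible, in line with the leaflet's emphasis on giving readers enough context to judge the data.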
An open-ended question (any other comments) yielded detailed suggestions for reduction of some topics (specifically on colour use) and augmentation of others (for example on the use of captions, labels and accompanying text to provide the full context for data presentation), to which we responded in a revision of the draft leaflet. Two respondents thought that the leaflet content was too basic to be helpful. Nonetheless, although the leaflet does not engage in detailed or advanced specifics, our initial research suggested that basics are often forgotten, as indicated by responses to the survey. Across a group of volunteer respondents, we would expect a range of perspectives and approaches to data presentation.

Implementation and access
The aim of the leaflet was to reinforce good practice at a fundamental level, such as the need for clear and consistent labelling and contextualisation, in order to support some audiences in interpreting atmospheric science data. Our survey indicated that although many people have been exposed to the underlying principles of presenting data and uncertainty, there is value in providing a concise and easily digestible summary that will serve as an aide memoire for day-to-day use.
The fundamentals of clear communication and data presentation are of renewed importance at a time when digital technology offers almost unlimited options for the presentation of information. With the wealth of options available, it can be easy to overlook basic factors that still have a central impact on comprehension. In some cases, it may be necessary or beneficial to further support understanding of data presented graphically or in text with discussion and face-to-face communication. While this leaflet does not focus on spoken communication, the same principles of clarity and contextualisation are relevant. The leaflet provides quick-reference reminders of these fundamentals and some specifics for the more specialist area of uncertainty communication.
The leaflet is available to download from www.reading.ac.uk/web/files/infodesign/presenting-data-and-uncertainty-quickguide.pdf.