Dealing with inconsistent weather warnings: effects on warning quality and intended actions

In the past four decades, the private weather forecast sector has been developing next to National Meteorological and Hydrological Services, resulting in additional weather providers. This plurality has led to a critical duplication of public weather warnings. For a specific event, different providers disseminate warnings that are more or less severe, or that are visualized differently, leading to inconsistent information that could impact perceived warning quality and response. So far, past research has not studied the influence of inconsistent information from multiple providers. This knowledge gap is addressed here. An inconsistency matrix was developed and employed to categorize the level of inconsistency across multiple warnings. The matrix provides warning pairs inconsistent in visualization, text or both. A survey experiment was conducted in Switzerland (N = 1,335). The results show that half of the people who received warnings from different providers for the same event indicated that these were inconsistent. The evaluation of warning quality and intended actions in a decision scenario characterized by two severe rainfall warnings shows the negative impacts of inconsistency. For example, consistent warnings are least confusing and inconsistent visual and textual warnings are most confusing. However, there are no significant differences in the effects of inconsistent textual information compared to inconsistent visual information on warning quality and intended actions. These findings offer empirical justification to enhance co‐operation between public and private weather providers. To improve warnings, the providers should find an agreement to be consistent either in the text or in the visualization.


KEYWORDS
decision-making, warning communication, warning inconsistency, warning quality

On a rainy day in November 2017, several different weather information providers issued to the general public a variety of extreme weather warnings for the southernmost part of Switzerland around Lake Lugano (a region known as Sottoceneri). Four warnings were issued by four different Swiss weather providers: the national meteorological service (MeteoSwiss), the national broadcasting service (SRF Meteo) and two private providers (MeteoNews and Meteocentrale). Table 1 summarizes the different visualizations and interpretations (amount of rainfall per time period). What actually fell that day was 50 mm, meaning that only the MeteoSwiss forecast was correct. However, the question that this research examines is whether the inconsistency of the warnings could have made people doubt the whole information package, easily available on mobile devices, and choose not to respond to any of them. If indeed the rainfall had been higher, resulting in flooding, such a course of action could have had serious negative consequences.
Over the last decades, a private sector meteorology industry has developed next to National Meteorological and Hydrological Services (NMHS) (Pettifer, 2015). The private weather sector is becoming much more involved in nearly all elements of the weather value chain from observations to tailored weather products (Thorpe, 2016). These authors state that, even though these recent developments have led to improvements in forecast quality and weather services, they have also resulted in more negative outcomes, especially when public weather warnings are duplicated (Thorpe, 2016). In Europe, all NMHS are obliged to issue official and authoritative weather warnings on behalf of their governments to warn public authorities and the public at large of hazardous weather. In addition to these official warnings of the NMHS, private weather companies can also publish and disseminate their own weather warnings. Differences between map colour-coding, warning thresholds and weather models increase the risk of apparent (or actual) inconsistencies in the warning information disseminated by different weather providers for a specific point in time.
In southern Africa, the belief in counter-productiveness of sometimes contradictory national forecasts led to the development of the first regional Climate Outlook Forum (COF) in Zimbabwe as early as 1997 (Patt et al., 2007). The COF addressed the issue of trustworthiness by negotiating a single, multinational, consensus forecast for the region of southern Africa. Even though the COF members did not provide empirical evidence, they showed that the belief in counter-productiveness of inconsistent national information is a recognized problem and has led to changes. This study will provide some evidence that the COF members were right even though the context is different. To date, there has been little empirical evidence examining the influence of inconsistent information on warning evaluation and response, and research addressing the impact of warning information from multiple providers is completely missing. This void provided the impetus for the research reported here.
This study focuses on the effects of inconsistent warning information on the evaluation of warning quality and intended actions. It is hypothesized that the receipt of inconsistent information is likely to decrease the likelihood that people take adaptive action compared to consistent information. To test the hypotheses, a randomized control survey with 1,335 participants was conducted in Switzerland. This provides empirical evidence (lacking in the literature) and demonstrates the potential for improvements in warning design and process to help to improve understanding and appreciation of danger.

| LITERATURE BACKGROUND
Generally, the risk communication literature suggests that the information provided must be consistent and spread through multiple channels in order to lead to effective decision-making. For example, Mileti and Fitzpatrick (1992) found that risk communication during an event was effective because it was a process of multiple messages (that were consistent) delivered through multiple channels emanating from multiple sources. This confirmed findings of prior studies (Hovland and Weiss, 1951; Drabek and Boggs, 1968; Mileti and Beck, 1975). However, these studies focused on identical messages that were communicated differently, such as through a trustworthy or untrustworthy communicator (Hovland and Weiss, 1951) or through different channels (written, electronic) and sources (friends, officials) (Mileti and Fitzpatrick, 1992). Mileti and Peek (2000), who reviewed the process of public response to warnings of a nuclear power plant accident, highlighted that effective warning messages include information that is clear, specific, accurate, certain and consistent. Moreover, consistency must be ensured within messages, as well as across different messages. They summarize previous research which found that warning messages promote the formation of accurate perceptions only if they are consistent with other publicly announced advisements (Perry and Green, 1982; Quarantelli, 1984; Drabek, 1999).
However, the effect of inconsistent information has largely been understudied. Even though Lindell and Perry (2012) acknowledge that multiple sources often deliver conflicting messages that require searching for additional information to clarify the confusion, they do not study how people are affected by inconsistent information, for instance, when there is no time to consult new information to resolve the ambiguity. In the context of climate forecasts, inconsistency has been identified as a cause for concern in terms of agricultural decision-making related to climate change adaptation (Garrett et al., 2013; Bhatta and Aggarwal, 2016). To date, only Losee and Joslyn (2018) appear to have provided direct evidence of the impact of weather forecast inconsistency on public response. They addressed the influence of inconsistency in sequential forecasts on trust (when the source is not mentioned) directly and found that inconsistency leads to lower trust in the forecast, but also that inconsistency leads to a higher likelihood of taking risk mitigation action. Moreover, they found that increased forecast severity was positively correlated with increased levels of trust and action-taking. The study here did not focus on inconsistency in sequential warnings but on inconsistency in weather providers' warning information at a given point in time; to the authors' knowledge no previous study has examined this form of inconsistency.
Not surprisingly, best practices in risk and crisis communication also underline the need to spread consistent messages. For example, the National Oceanic and Atmospheric Administration (NOAA) Social Science Committee highlights that those disseminating the messages will work together to create and share consistent information, and speak with one voice. The Committee acknowledges that multiple messengers are critical to providing a consistent message that will be clear and credible, because it is invariant (NOAA, 2016). Other policy documents use similar language (econcept AG, 2011). However, in contrast to what the literature suggests, warnings very often disseminate inconsistent information. In the absence of empirical evidence, Sandman (2006) questions whether speaking with one voice is the optimal way to communicate risk because it does not "accept uncertainty and ambiguity," assumes that lay people cannot handle expert disagreement or uncertainty and risks silencing private providers who disagree with the single harmonized message, thereby limiting warning reach. These conjectures are reinforced by Losee and Joslyn (2018) who show that, when confronted with inconsistent information, people tend to be more cautious and are more likely to take risk mitigation actions.
There are similarities between the topics of forecast uncertainty communication and information inconsistency. Uncertainty communication research focuses on the analysis of information processes and decision shortcuts (such as biases and heuristics) that often lead forecast users within the general public to interpret information or behave incorrectly (Wernstedt et al., 2018). Wernstedt et al. (2018) show that the use of forecast information appears to depend critically on risk communication. Inconsistent warning messages that are the result of uncertainties in weather models and of the interpretation of model outcomes appear to play a critical role. Recent research suggests that end users of weather forecasts had well-defined uncertainty expectations, highlighting that they should be able to understand explicit uncertainty forecasts (Joslyn and Savelli, 2010) or that providing forecasts with uncertainty information enhances the chances that users take precautionary action when threatened with an extreme weather event with a low probability (LeClerc and Joslyn, 2012). Morss et al. (2008) used empirical data from a nationwide survey of the US public to investigate beliefs commonly held among meteorologists and found that a significant majority of respondents liked weather forecasts that expressed uncertainty and many preferred such forecasts to single-valued forecasts.
Graphical or visual salience studies have addressed the advantages and disadvantages that graphics provide in communicating a warning and related uncertainties. Considering the advancing of weather applications in the era of big data, this is a research topic that may deserve greater attention in the future. Graphical salience refers to the importance of visual components in a map or other visual representation that intuitively draws people's attention, indicating the most relevant and important features for cognitive processing (Fabrikant et al., 2010; Severtson, 2013). Variations of colour in terms of hue, number or value, as well as size of the map, are some of the most commonly studied visual features which can facilitate comprehension and salience (Wogalter et al., 2002; Garlandini and Fabrikant, 2009; Severtson, 2013). Pravossoudovitch et al. (2014) document an implicit association between red and danger, and suggest using red to communicate danger in systematic signal systems. Different colour combinations and types of images have also been tested in the context of severe weather. Sherman-Morris (2013) tested the effectiveness of different graphical communication forms (different colour palettes, legends and text descriptions) for hurricane storm risk threat. This study indicates that shades of blue are the most difficult to interpret. In Sherman-Morris and Lea (2016) the impact of two different types of radar images on warning recipients was tested. Respondents who viewed a reflectivity display appeared to have higher perceptions of risk and higher likelihoods of taking protective action than those who viewed a velocity display. However, they concluded that other aspects of the data suggested that these differences were due to the weathercasters' accompanying commentary (i.e. differences in spoken message) rather than the images themselves.
More specifically, findings from research that focused on improving warning messages through the inclusion of graphics or maps are ambiguous. Canham and Hegarty (2010) report that, in a first experiment, participants could apply newly acquired knowledge about pressure and wind direction to make appropriate choices about wind direction. However, in a second experiment, in which participants were given task irrelevant information, performance decreased, suggesting that graphics should not include more information than is required. Although Bean et al. (2016) found that indicating the user's location on a map leads to more personalization of risk and improved participants' perception of personal risk, this finding is not supported by other studies. Broad et al. (2007) reported that hurricane forecast graphics are misunderstood by many members of the public. Similarly, Savelli and Joslyn (2013) found that forecast messages containing only visualizations were more likely to lead to erroneous interpretation than text alone, whereas Casteel and Downing (2015) found no effect of graphics (or text) on the perceived risk, perceived severity and likelihood to contact a loved one for each message. They used a scenario in which participants were told they were driving through an unknown region of the United States to investigate wireless emergency alert weather warning message effectiveness across one of four conditions (text, text + polygon, text + radar image, text + radar image + polygon). They indicate that their findings along with those of Canham and Hegarty (2010) and Sherman-Morris and Lea (2016) highlight the equivocal influence of weather graphics on comprehension.
Based on ambiguous research results on the influence of inconsistency on adaptive/risk minimizing action, there is a need to better understand how people are affected by differing, sometimes conflicting, information coming from weather providers (also highlighted by the National Academies of Sciences, 2017). More precisely, to justify possible adjustments of the warning process or even the legislation, there is a need to find out whether consistent information is actually more effective than inconsistent information. In order to address this research gap, the following questions were analysed. Does inconsistency affect evaluation of warning quality and does it influence intended behavioural response? The possible effects of different types of forecast inconsistency are considered and warning characteristic effects (such as visual graphics) on user evaluation of quality are analysed.

| Inconsistent weather warnings
Switzerland has a single NMHS (MeteoSwiss), the national broadcasting service (SRG SSR) with its own independent weather information service (SRF Meteo) and multiple private companies of which MeteoNews and Meteocentrale issue public warnings. Inconsistency in warnings is caused by several factors including the use of different weather forecast models (e.g. COSMO, GFS, ECMWF), parameterization, assimilations and resolutions (Bassill, 2014; MeteoSwiss, 2018). Furthermore, warnings from different providers can also be influenced by observations and nowcasting (a technique for very short range forecasting).
Forecasters can interpret the outputs of these models in different ways. For instance, a more experienced forecaster may understand the data differently to a less experienced colleague, or one forecaster may have a personal preference for a specific weather model based on long-term or short-term experience. A possible reason could be that the model underestimates wind for several weeks in a row. Also, public forecasters bear a higher responsibility compared to their private colleagues. They have the legal duty to warn the regional authorities, as well as the general public, of severe weather. These warnings can lead to expensive deployments of police, fire brigade, military and civil defence personnel. Therefore, MeteoSwiss has a policy to stay below a certain false alarm ratio (MeteoSwiss, 2017). In contrast, private sector organizations are seemingly able to warn with fewer obligations or consequences and this affords them more freedom to target a higher probability of detection at the expense of a higher false alarm ratio. As revealed by the results of semi-structured interviews with Swiss stakeholders involved in the natural hazard related warning chain, different responsibilities and agendas between the public and private weather providers are thought to lead to different warning practices, so that private providers deliver and disseminate more severe warnings on a more frequent basis than the public weather agency. While confirming or rejecting such suspicions is beyond the framework of this study, monitoring warning messages issued by the four Swiss weather providers (MeteoSwiss, MeteoNews, Meteocentrale, SRF Meteo) between August and November 2017 revealed regular disparities and in most cases at least one private provider disseminated a more severe warning than the public agency. In fact, data were collected from 19 events in that period and in 13 cases at least one of the four providers was not issuing a warning whilst at least one other was.
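The trade-off between false alarm ratio and probability of detection mentioned above can be illustrated with the standard verification definitions (hits, misses and false alarms from a contingency table). The following is a minimal sketch; the two provider profiles and all counts are hypothetical and are not taken from the monitoring data described in this section.

```python
def probability_of_detection(hits: int, misses: int) -> float:
    """Share of observed events that were warned of: hits / (hits + misses)."""
    return hits / (hits + misses)


def false_alarm_ratio(hits: int, false_alarms: int) -> float:
    """Share of issued warnings not followed by an event."""
    return false_alarms / (hits + false_alarms)


# Hypothetical counts for two providers facing the same 10 observed events.
providers = {
    "cautious provider": {"hits": 6, "misses": 4, "false_alarms": 2},
    "eager provider": {"hits": 9, "misses": 1, "false_alarms": 9},
}

for name, counts in providers.items():
    pod = probability_of_detection(counts["hits"], counts["misses"])
    far = false_alarm_ratio(counts["hits"], counts["false_alarms"])
    print(f"{name}: POD = {pod:.2f}, FAR = {far:.2f}")
```

In this invented example the provider that warns more often detects more events but also issues a larger share of warnings that are not followed by an event, which is the pattern the interviewed stakeholders suspected.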
Table 2 shows that different weather providers use different threshold values to define the warning categories. Even though the climate data on which these thresholds are mainly based are the same, the interpretation of the datasets changes from one climatologist/forecaster to another. Also, only some providers take into account the current climatological and topographical situation (e.g. differences in runoff on the northern and southern side of the Alps or different threshold values for the lowlands and the mountains) and/or the needs of specific groups of stakeholders (e.g. the cantonal agencies) when developing the threshold values. All this contributes to different threshold values and to different numbers of warning (and pre-warning) categories. Among the four Swiss providers, the number of categories ranges from three to five for warnings and from zero to two for pre-warnings, which again goes hand in hand with the choice of colours that represent the (pre-)warning categories. Also, pre-warning categories are based on lead time and probabilities (e.g. MeteoSwiss) or on impact thresholds (e.g. MeteoNews). For example, SRF Meteo uses three warning categories (coloured in light orange, dark orange and dark red) and issues no pre-warnings, whereas MeteoSwiss uses five warning categories (coloured in green, yellow, orange, red and dark red) for most hazards (e.g. only two categories for thunderstorms) and also has a pre-warning category (called warning outlook). Even though Meteocentrale and MeteoNews use the same number of warning categories and associated colours (orange, red and violet), these categories are defined again by different threshold levels for a given hazard. Last, but not least, the number of warning regions also varies from one provider to another. As the regions are based on a very small scale (e.g. 159 warning regions for MeteoSwiss, 172 for SRF Meteo, 140 for MeteoNews and individual regions for Meteocentrale), there are only minor differences in geographical coverage.

| Indicator design
In response to the challenge of dealing with warning inconsistency, an "inconsistency matrix" was developed to help structure analysis. It is important to highlight that this matrix focuses on inconsistency in warning information from different weather providers at a given point in time, and not on inconsistency in sequential warnings from one provider. Table 3 presents the matrix based on two axes: the horizontal axis distinguishes between same and different visualizations; the vertical axis distinguishes between same and different texts (with a focus on degree of warning severity) in the warnings. As a result, the four types, each one characterized by two different warnings, are described by consistent warnings (type AA); inconsistent visual warnings (type AB); inconsistent textual warnings (type AC); and inconsistent visual and textual warnings (type AD). The four types differ in characteristic ways across the full set of warning characteristics, namely the number of categories, colours, warning thresholds, interpretation of data and weather model used. In type AB the warnings appear differently because providers use different categories, thresholds and colours (considering that these three elements are interlinked) even though the text (i.e. the forecasted hazard severity) is the same. In contrast, for type AC the warning message is different, yet the providers use identical categories, thresholds and colours. The typology in Table 3 has been used to conduct a large sample randomized control survey. In the following section, a description of the research methods and the operationalization of this typology is provided. Examples of the types are also reproduced in Figures S1-S8.
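The two axes of the matrix can be expressed as a small classification rule. The sketch below is illustrative only: the `WarningMessage` class, its two fields and the example labels are hypothetical simplifications, since a real warning carries categories, thresholds, colours and a text rather than two strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WarningMessage:
    # Hypothetical stand-in for the interlinked categories, thresholds and colours.
    visualization: str
    # Hypothetical stand-in for the hazard severity stated in the warning text.
    severity: str


def inconsistency_type(first: WarningMessage, second: WarningMessage) -> str:
    """Place a pair of warnings in one cell of the inconsistency matrix."""
    same_visual = first.visualization == second.visualization
    same_text = first.severity == second.severity
    if same_visual and same_text:
        return "AA"  # consistent warnings
    if same_text:
        return "AB"  # inconsistent visual warnings
    if same_visual:
        return "AC"  # inconsistent textual warnings
    return "AD"      # inconsistent visual and textual warnings


warning_a = WarningMessage("three categories", "very severe")
warning_b = WarningMessage("five categories", "very severe")
print(inconsistency_type(warning_a, warning_b))  # AB
```

Reducing each axis to a same/different comparison mirrors the binary structure of Table 3; degrees of inconsistency within a cell are not captured by this sketch.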

| Hypotheses, data and analytical methods
The method used here was a large sample randomized control survey that tested for effects of warning inconsistency on the evaluation of warning quality and intended response. A survey containing 31 questions was conducted based on a decision scenario. The questions were developed by the authors taking into account the vast literature on warning communication and examples of other questionnaire surveys (e.g. Ripberger et al., 2014; Kox and Thieken, 2017; Maidl, 2017; Potter et al., 2018; Weyrich et al., 2018). The scenario was placed at the beginning of the survey. Respondents were asked to imagine that it is 1900 UTC in the evening on a Friday and that, almost simultaneously, they saw two severe rainfall warnings, valid for the weekend. They were told that during that weekend they would be celebrating their birthday that they had planned for months. In more detail, they were told to imagine that they would be having a picnic and barbecue with friends on the shores of Lake Lugano (in the Swiss territory; see Figure 1). Other scenarios (such as going for a run in the woods, moving a picnic indoors, going to work by bicycle or protecting plants from frost) with different response scale options (binary or Likert scale) were considered and pre-tested as well, but the birthday scenario was chosen as (a) it was simple, straightforward and realistic, (b) it forced people to face a threat taking them out of their comfort zone and (c) it required people to make the decisions on their own. In total, 1,335 respondents completed the online survey from May 8 to May 22, 2018. They were recruited by Respondi, an online access panel provider. Prior to the survey, participants were informed about the ongoing research, that the data would be evaluated anonymously (names were not collected) and that their participation was voluntary. Participants received a financial incentive to complete the survey, which took them on average about 12 min.
Because of unrealistically short answering times, as well as misunderstandings of the scenario (people imagined being at home and not at Lake Lugano), answers of 140 respondents were excluded from analysis leaving 1,195 respondents. The survey (in German) was conducted with a large sample of the Swiss population in the German-speaking cantons (see Figure 1). Respondents ranged in age from 18 to 86 years (M = 47.85, SD = 16.44), and n = 607 (50.8%) were female. Most respondents had completed vocational school (45.5%, n = 542), followed by college or university education (18.5%, n = 221), with 7.5% indicating at least some compulsory education. Compared to the Swiss population, the sample was slightly older (Swiss average M = 43.14 years), had a similar gender ratio (Swiss population: 50.5% female, 49.5% male) (FSO, 2017b) and was slightly more educated (more respondents with higher vocational training or university degree and fewer respondents having only completed compulsory school) (FSO, 2017a). However, as the survey was conducted online and based on the people registered in the database of the online panel, it did not reach people who did not have internet access or who were not registered in that database.
Therefore, the results of the research should not be taken as being representative of the Swiss German population. Nevertheless, they can be an indication of the opinions of the users of online weather warnings.

FIGURE 1 Map of Switzerland with the different language regions. The circle shows Lake Lugano on the Swiss territory
Respondents were randomly assigned to one of four warning conditions described in Table 3: (a) consistent warnings, (b) inconsistent visual warnings, (c) inconsistent textual warnings and (d) inconsistent visual and textual warnings. Roughly even subgroups were achieved out of the total sample size of 1,195. Each subgroup received two warnings (respectively AA, AB, AC and AD), which each consisted of a warning map of Switzerland and a warning text. Warning A, a very severe rainfall warning with three categories, was identical across all subgroups. For type AA the two warning messages were almost identical in description/visualization and text (therefore named type "AA"). For type AB, warning B was visualized differently (i.e. five categories instead of three with different thresholds and associated colour scheme) but had identical content (i.e. identical text with numbers) compared to warning A. For type AC, the visualization was identical for warnings A and C (three categories with identical thresholds and colours), but warning C was less severe than warning A. For type AD, warning D was less severe and was visualized differently (i.e. five categories) compared to warning A. The number and size of warning regions were kept constant. For each subgroup, the different warning messages are provided in Figures S1-S8. All the warnings reflect actual inconsistencies in actual warnings. The warning messages were developed based on real warnings that weather providers disseminated in November 2017 during the event described in the introduction which experts (forecasters and other staff of MeteoSwiss) reviewed for plausibility.
With respect to the survey content, questions on two sets of dependent variables were asked: evaluation of warning quality and intended action. Participants had to evaluate the two warning messages they received separately. They rated each warning on a five-point Likert scale from "not at all" to "very" with respect to comprehension, credibility and meaningfulness. The respective items were "The warning is comprehensible," "The warning is credible" and "I find the warning meaningful." Based on the reported differences in comprehension, credibility and meaningfulness between the two warnings, a variable was computed that we labelled "confusion." Large differences (i.e. high numbers) in the evaluation indicate a high level of confusion. Participants also assessed the two warnings together: they had to indicate whether they were "in agreement" and "easy to understand." The three variables ("confusion", "in agreement" and "easy to understand") were used to assess participants' evaluation of warning quality. Moreover, data were collected on intended actions, namely a risk minimizing behaviour (i.e. "I would cancel the party and barbecue") and searching for more information (i.e. "I would look for more information").
Participants had to indicate how strongly they agreed or disagreed on a five-point Likert scale with each statement. Finally, after having made the decision to engage in a risk minimizing behaviour or not, participants were also asked whether they relied on warning A or warning B, C or D (depending on the experimental condition) or whether they relied on both warnings together for their decision-making. Respondents also had to answer questions about their collection of weather information (e.g. how often do they check the weather forecasts), risk perception, experience with and knowledge about severe weather/warnings and sociodemographic characteristics. The full questionnaire is available in Appendix S1.
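To make the "confusion" variable concrete, the sketch below computes one possible version of it. The exact formula is not given in the text, so summing the absolute differences across the three items is an assumption made purely for illustration, as are the function name and the example ratings.

```python
ITEMS = ("comprehension", "credibility", "meaningfulness")


def confusion_score(first: dict, second: dict) -> int:
    """Sum of absolute rating differences between two warnings.

    ASSUMPTION: the aggregation rule (sum of absolute differences) is
    hypothetical; the text only states that the variable is based on the
    differences and that larger differences indicate more confusion.
    Ratings are on a five-point scale (1 = "not at all", 5 = "very").
    """
    return sum(abs(first[item] - second[item]) for item in ITEMS)


# Hypothetical ratings given by one participant to the two warnings shown.
warning_a = {"comprehension": 5, "credibility": 4, "meaningfulness": 4}
warning_d = {"comprehension": 2, "credibility": 3, "meaningfulness": 4}

print(confusion_score(warning_a, warning_d))  # 4
```

Under this assumed rule the score ranges from 0 (identical evaluations of both warnings) to 12 (maximally different evaluations on all three items).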
The hypotheses listed in Table 4 were elaborated based on the literature review in Section 2 and the inconsistency matrix.
Statistical analysis was performed using IBM SPSS 23. A one-way analysis of variance (ANOVA) was conducted to study the effect of warning inconsistency on evaluation of warning quality and intended actions for which an average sample size of n = 300 respondents per experimental condition satisfied the requirements for ANOVA. One-way ANCOVAs (analysis of covariance) were also conducted to assess the covariance between risk perception, knowledge and experience of warning and severe weather and relevant demographic variables (age, gender, education, urban/rural). In a further step, a series of two-way ANOVAs was performed to study possible interactions between the effects of warning inconsistency and participants' characteristics (knowledge, experience and other socio-demographic variables) on the dependent variables mentioned above. In addition, multinomial logistic regression was performed to assess the responders' level of reliance on one warning when making an action/no action decision.

TABLE 4 Hypotheses
H1a: Warning quality will be evaluated to be highest for type AA (most consistent warning pair) and lowest for type AD (least consistent warning pair).
H1b: Warning quality will be evaluated higher for type AC compared to type AB. Thus, the (negative) effect of inconsistent visual information will be more important than the effect of inconsistent textual information alone on evaluation of warning quality.
H2a: People are more likely to engage in a risk minimizing behaviour for type AA and AC, and less likely to do so for type AB and AD. Thus, inconsistent textual information is expected to have a less important negative impact than inconsistent visual information on intended action.
H2b: People are less likely to search for more information for type AA and AC, and more likely to do so for type AB and AD. Again, a more important positive impact of inconsistent visual information is assumed.
H3: For type AA, on average, people rely more often on both warnings together when making the decision to act or not.
H4a: With respect to single warnings, on average, people rely more often on the more severe warning, which is also better evaluated than the less severe warning.
H4b: With respect to single warnings, on average, people rely more often on the warning using a flash light system with three categories, which is also better evaluated than the warning using five categories.

| Weather information behaviour
An analysis of participants' weather information sources and attitudes towards weather providers revealed that the majority (79.5%) indicated that they regularly consulted weather information. In a multiple response set, these people reported consulting information via a variety of sources, listed in Table 5. Of those people who regularly look for weather information, more than half do so on a daily basis (53.9%) and a further third at least several times a week (33.3%). Only 11.5% of people consult information very irregularly, such as only when they are planning an activity for which weather is important. Almost two-thirds of people who look for information (64.7%) consult weather information from several providers. The participants were most likely to compare weather information before undertaking outdoor activities (89.4%), and also depending on the severity of the hazard (70.0%) and its nature (62.5%) (multiple response question). With respect to warning information, almost two-thirds of those who regularly consulted information (63.2%) had already received warnings, with half of responders (49.7%) having received messages from different weather providers for the same event (52.9% of whom indicated that these were inconsistent). In total, 43.8% of respondents indicated that they would like to receive only one harmonized warning from the national authorities for the same weather event (independent of its severity), with the remainder preferring to receive several warnings that were visualized differently (19.1%) or differed in text (12%), or having no preference (25.2%).

Figure 2 shows the relationship between warning inconsistency and the evaluation of warning quality. As expected, it indicates an inverse relationship between warning inconsistency and "ease of understanding" (light grey line in Figure 2).
The effect of warning inconsistency on understandability of the information received was significant, F(3, 1,187) = 6.33, p < 0.001, r = 0.13, which was supplemented by a significant linear trend, F(1, 1,187) = 16.76, p < 0.001, r = 0.12, indicating that as the warning became more inconsistent the understandability decreased proportionately. (An ANOVA produces an F statistic that compares the amount of systematic variance to the amount of unsystematic variance; the probability value determines the significance of the results, where the smaller the value, the stronger the evidence that the measured differences were not generated by chance [if the value is larger than 0.05 then there is no significance]. r is a standardized measure of the size of the effect, with small numbers [as in our case] representing small effects and large numbers [>0.5] representing large effects.) People found the information they received most understandable for type AA (M = 4.21, SD = 0.8) and least understandable for type AD (M = 3.97, SD = 0.89) (p = 0.002). However, there were no significant differences between type AB (M = 3.76, SD = 0.69) and type AC (M = 3.63, SD = 0.71) with respect to "ease of understanding" (p = 0.263).
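To make the parenthetical explanation above concrete, the following sketch computes a one-way ANOVA F statistic by hand as the ratio of between-group (systematic) to within-group (unsystematic) variance. The effect size r is taken here as the square root of the between-group share of the total sum of squares, which is one common definition and an assumption about how r was obtained in this study. The function name and the toy ratings are hypothetical and are not data from the survey.

```python
from math import sqrt


def one_way_anova(groups):
    """Return (F, df_between, df_within, r) for a list of groups of ratings."""
    all_values = [x for group in groups for x in group]
    grand_mean = sum(all_values) / len(all_values)

    # Systematic variation: group means around the grand mean.
    ss_between = sum(
        len(group) * (sum(group) / len(group) - grand_mean) ** 2
        for group in groups
    )
    # Unsystematic variation: individual ratings around their group mean.
    ss_within = sum(
        (x - sum(group) / len(group)) ** 2 for group in groups for x in group
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)

    # Assumed effect size: r = sqrt(SS_between / SS_total).
    r = sqrt(ss_between / (ss_between + ss_within))
    return f_stat, df_between, df_within, r


# Toy ratings for three made-up conditions (illustrative only).
toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_between, df_within, r = one_way_anova(toy_groups)
```

With real data, a statistics package would also supply the p-value from the F distribution; that step is left out here to keep the sketch free of external dependencies.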

| Evaluation of warning quality
Respondents evaluated the quality of warnings containing consistent information more positively than that of warnings containing inconsistent information. When participants assessed the two warnings separately, the results showed a large difference in evaluation between warnings A and D (M = 0.49, SD = 0.98) and, as expected, no difference at all for type AA because messages were almost identical in visualization and text (M < 0.01, SD = 0.71). There was a significant effect of warning inconsistency on confusion, F(3, 1,178) = 15.8, p < 0.001, r = 0.2. As can be seen in Figure 3, there was also a significant linear trend, F(1, 1,178) = 44.19, p < 0.001, r = 0.19, indicating that as the warning became more inconsistent the confusion increased proportionately. Figure 3 also shows that there are no significant differences in evaluation of warning quality between type AB (M = 0.26, SD = 0.83) and type AC (M = 0.3, SD = 0.85), that is, between the types characterized by inconsistent visual information only and by inconsistent textual information only. However, there are significant differences between all other types: highly significant between type AA and each of the other three types (p ≤ 0.001), and significant between type AD and AB (p = 0.01) and between type AD and AC (p = 0.05), as shown by Bonferroni post hoc tests.
As the warning became more inconsistent, people rated the two warnings as less "in agreement" (dark dashed line in Figure 2). People who received type AA information perceived warnings to be more in agreement (M = 3.89, SD = 0.89) than people who received type AD information (M = 2.51, SD = 1.105). The overall effect was significant, F(3, 1,187) = 95.34, p < 0.001, r = 0.44, as was the linear trend, F(1, 1,187) = 177.61, p < 0.001, r = 0.36. All group differences were highly significant (p < 0.001), except the difference between inconsistent visual (M = 3.30, SD = 0.96) and inconsistent textual information (M = 3.18, SD = 0.98), as shown by Bonferroni post hoc testing.
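The Bonferroni post hoc tests mentioned above can be illustrated with a short sketch. With four warning conditions there are six pairwise comparisons, and the Bonferroni adjustment multiplies each unadjusted p-value by that number (capped at 1). The p-values below are hypothetical placeholders rather than results from the study, and the function name is an assumption for illustration.

```python
from itertools import combinations

CONDITIONS = ("AA", "AB", "AC", "AD")


def bonferroni_adjust(raw_p_values):
    """Multiply each raw p-value by the number of comparisons, capped at 1."""
    n_comparisons = len(raw_p_values)
    return {
        pair: min(1.0, p * n_comparisons) for pair, p in raw_p_values.items()
    }


# All pairwise comparisons between the four conditions (six pairs).
pairs = list(combinations(CONDITIONS, 2))

# Hypothetical unadjusted p-values, one per pair (illustrative only).
raw = dict(zip(pairs, [0.0001, 0.0002, 0.0001, 0.3, 0.004, 0.02]))
adjusted = bonferroni_adjust(raw)
```

The adjustment is conservative: a comparison counts as significant only if its adjusted p-value still falls below the chosen threshold.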

| Intended actions
In the next step, the set of relationships between warning inconsistency and intended actions is examined. When faced with increasingly inconsistent information, respondents search for more information but are less likely to change behaviour (solid lines in Figure 2). There was a significant effect of warning inconsistency on engaging in risk minimizing behaviour, F(3, 1,194) = 5.28, p = 0.001, r = 0.11, and on searching for more information, F(3, 1,194) = 7.07, p < 0.001, r = 0.13. As expected for both variables, receiving type AA (consistent information) resulted in a significantly higher likelihood to change behaviour and in a lower likelihood to search for additional information compared to receiving type AB (inconsistent visual information) or type AD (inconsistent visual and textual information). However, as hypothesized, there were no significant differences between type AA and type AC (inconsistent textual information) on intended actions. The means and Dunnett post hoc testing are provided in Tables 6 and 7. People receiving type AA and type AC information (consistent visualization) are more likely to take the intended action and less likely to look for more information than people receiving type AB and type AD information (inconsistent visualization). This trend can be identified in Figure 2. Additional testing showed significant differences in engaging in risk minimizing behaviour (t[1193] = 3.39, p = 0.001) and looking for more information (t[1193] = −3.36, p < 0.001) between warnings with and without consistent visualization. Even though the difference is not significant, the findings show that for type AC the likelihood of a change of behaviour increases whereas the likelihood of looking for more information decreases compared to type AB.

| Warning reliance
In the fourth step, the relationship between reliance on warning and textual inconsistency is analysed. For type AA, people tend to rely on both messages when making the decision to act or not. As documented in Table 8, almost 80% of people relied on both warnings for type AA, whereas only 50% of people relied on both warnings for type AD. There was a significant effect of warning inconsistency on participants' choice to rely more on one warning or the other, F(3, 1,176) = 22.21, p < 0.001, r = 0.23. The results of a subsequent multinomial logistic regression are presented in Table 9. It shows that for type AA, compared to type AD, people are significantly more likely to rely on both messages together than to rely only on the first warning message. The same pattern can be observed when comparing type AC to type AD. For types AB, AC or AD, people relied, on average, less on the second warning. As documented in Table 8, roughly one-third of people in type AB (33.4%), AC (29.7%) and AD (41.8%) relied on warning A compared to about 10% or less who relied on warning B, C or D. This pattern is confirmed by people's evaluations of warning quality. As shown in Table 10, people who received type AB, AC or AD information evaluated warning A significantly more positively than warnings B, C or D, respectively.

FIGURE 2 Intended actions and evaluation of warning quality. The error bars are 95% confidence intervals

FIGURE 3 Confusion in the four warning conditions. The error bars are 95% confidence intervals. Confusion was computed based on differences in evaluating the warning quality (in terms of comprehension, credibility and meaningfulness) of the two warnings separately

| Socio-demographics
Finally, this section focuses on the influence of participants' individual characteristics on warning evaluation and response. Including knowledge (and also other characteristics) as a covariate did not change any of the documented results. Also, for the evaluation of warning quality, there was no significant effect of knowledge, experience or sociodemographic characteristics. However, the data show that for people with no experience with warning messages the effect of inconsistent visual warnings is more important than the effect of inconsistent textual warnings on evaluation of warning quality. A series of two-way ANOVAs showed significant interaction effects between warning inconsistency and whether people have adapted their behaviour in response to warnings they received in the past, F(3, 1,178) = 3.61, p = 0.013, r = 0.09, as well as warning experience, F(6, 1,178) = 3.02, p = 0.006, r = 0.12. This means that people with no general warning experience are more likely to be confused by type AB compared to type AC. For the intended actions, risk perception significantly influenced taking the risk minimizing behaviour: people with higher risk perceptions were more likely to take action compared to people with lower risk perceptions, t[973] = 4.70, p < 0.001, r = 0.15. Warning experience, risk perception, age and whether people consult more than one weather provider significantly influenced looking for more information. People with positive warning experience, people with higher risk perceptions, people who consult more than one weather provider and younger people were more likely to look for more information compared to people with negative experience (t[973] = 1.99, p = 0.047, r = 0.06), lower risk perceptions (t[973] = 2.52, p = 0.012, r = 0.08), who consult only information from one provider (t[973] = 2.03, p = 0.042, r = 0.07) and older people (t[973] = 3.63, p < 0.001, r = 0.12).
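The text does not state how the effect sizes r were derived from the test statistics. As a hedged illustration, the sketch below uses two common conversions, r = sqrt(F · df_effect / (F · df_effect + df_error)) for F tests and r = sqrt(t² / (t² + df)) for t tests. These reproduce several of the values reported above to two decimals, but treating them as the method actually used is an assumption, and the function names are hypothetical.

```python
from math import sqrt


def r_from_f(f_stat, df_effect, df_error):
    """Effect size r from an F statistic (assumed conversion)."""
    return sqrt(f_stat * df_effect / (f_stat * df_effect + df_error))


def r_from_t(t_stat, df):
    """Effect size r from a t statistic (assumed conversion)."""
    return sqrt(t_stat ** 2 / (t_stat ** 2 + df))


# Interaction with warning experience: F(6, 1,178) = 3.02.
r_experience = round(r_from_f(3.02, 6, 1178), 2)

# Risk perception and risk minimizing behaviour: t[973] = 4.70.
r_risk = round(r_from_t(4.70, 973), 2)
```

Both rounded values match the effect sizes reported for these two tests (0.12 and 0.15), which is why these conversions are used for illustration.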
A series of two-way ANOVAs showed no significant interaction effects between warning inconsistency and participants' characteristics on taking intended actions. Including participants' individual characteristics in the models used did not change the significance of the results for taking intended action, nor for warning reliance.
Almost two-thirds of respondents who regularly look for weather information consult information from several providers, which indicates that they are aware of their existence and actively compare the available information. However, half of the people who have received warning information from different providers indicate that these warnings were inconsistent. This highlights the importance of the research questions addressed here. This study is a first attempt to analyse people's weather warning information attitudes in Switzerland. With respect to general weather information sources, there is very limited research available in the Swiss context. In her thesis on risk awareness and preparedness in Swiss natural hazard risk management, Maidl (2017) found that the most common means to get informed about natural hazards is to follow weather forecasts in popular media such as radio, television and newspapers, followed by websites and mobile apps. More proactive forms such as attending information events or public discussions are less frequently used and social media appeared to be the least useful media (data on the availability and usage of 20 different means was collected) (Maidl, 2017). In Germany, Kox and Thieken (2017) found that more people use weather forecasts daily (69.4%) compared to this study and that traditional mass media are still the primary source of weather information (in order of decreasing usage: TV, radio, websites, apps, newspapers, SMS/emails). Thus, the results reported in Table 5 more or less confirm the findings of both studies, even though the use of smartphone apps seems to have increased in the last few years. Surveys conducted by the Federal Office for Civil Protection, which did not explicitly focus on weather warnings but on information behaviours with respect to natural and human catastrophes, found that most people use the internet to keep updated and consult websites from cantonal offices (econcept AG, 2011; BABS, 2014).
The results also show that, overall, inconsistency negatively impacts the evaluation of warning quality (see H1a in Table 4): evaluated warning quality was significantly higher for consistent warnings and lower for inconsistent warnings (inconsistent in either visual or textual information or both). These results support earlier findings that, for instance, official warning messages promote the formation of accurate perceptions only if they are consistent with other publicly announced advisements (Perry and Green, 1982; Quarantelli, 1984; Drabek, 1999). The research also affirms the results of Mileti and Peek (2000), who highlighted that effective warning messages must include consistent information, also to guarantee greater trust in the forecast (Losee and Joslyn, 2018). Moreover, results show that for people with no experience with warning messages the effect of inconsistent visual information is more important than the effect of inconsistent textual information. This indicates that people who are less familiar with warnings may have more difficulty in understanding inconsistent visual information (compared to those with warning experience), supporting Parker et al. (2007). Hazard experience may influence warning evaluation: less experienced people may have more difficulty in understanding warnings that provide identical numbers and text but which are visualized differently.
Moreover, the type of inconsistent information (visual or textual) did not have an effect on evaluation of warning quality. In the following, based on respondents' comments, the reasons why this research could not find a more negative effect of inconsistent visual (rather than textual) information, as was expected (see H1b in Table 4), can be conjectured. Some respondents noted for type AB (similar textual information) that the numbers in both texts were identical, resulting in a clear message to them. Thus, it could be that AB warning quality was evaluated as high due to similar warning texts, which (over-)compensated for the confusion arising from the different maps. This ultimately resulted in no difference compared to type AC (similar visual information). This highlights the importance of including a warning text with hard numbers, which help people to compare warnings that have the same content but may be visualized differently. This speculation is also supported by Sherman-Morris and Lea (2016), who found that the weathercasters' accompanying commentary, and not the graphics, increased the likelihood of taking action. In addition, the results could depend on the choice of the selected natural hazard. Several respondents' comments pointed in that direction, saying that rainfall is not dangerous. This suggests that for a different hazard, which people perceive to be more threatening to their safety, there could be a negative effect of inconsistent visual information, compared to textual, on the evaluated quality of the warning. Even though the research context is slightly different, the findings support past studies that found no effect of graphics (Broad et al., 2007; Casteel and Downing, 2015) or that found that visualizations led to even more errors in interpreting forecasts than text alone (Savelli and Joslyn, 2013).
More generally, the existing literature on warning uncertainty highlights the need to present both numerical and verbal uncertainty information to make sure that the user has the right information regardless of their needs or preferences (Wallsten et al., 1986). Pardowitz et al. (2015) found that textual uncertainty was increased by a large variety of verbal expressions that were used by forecasters in the German Meteorological Office to describe similarly forecasted numerical probabilities. These findings could further explain why no differences in evaluation of warning quality were found between inconsistent textual and visual information. Both numerical and verbal uncertainty information, which are inherent in each warning pair, may be equally important.
Moving beyond assessed quality, the research reported here also shows that inconsistency impacts decision-making. On the one hand, receiving consistent information resulted in a higher likelihood of engaging in risk minimizing behaviour (see H2a in Table 4). These findings reinforce earlier research that involved workshops with practitioners who underlined the need to agree on a single harmonized message (econcept AG, 2011). In general, they also support best practices in risk communication, such as "spread consistent messages and speak with one voice in order to lead to effective decision-making" (e.g. NOAA, 2016). They also support climate forecast research showing that consistency in forecasts leads to improved agricultural decision-making in response to climate change (Garrett et al., 2013; Bhatta and Aggarwal, 2016). However, the results contradict Sandman (2006), who argues that consistency leads to worse decision-making, and Losee and Joslyn (2018), who showed that forecast consistency leads to a lower likelihood of taking risk mitigation action (than forecast inconsistency). Yet, the results may have been influenced by the fact that the focus here was on information from different weather providers at a given point in time and not on sequential warnings from one provider (as in Losee and Joslyn, 2018). On the other hand, receiving consistent information resulted in a lower likelihood of searching for more information compared to receiving inconsistent information (see H2b in Table 4). The findings also reinforce Lindell and Perry (2012), who highlighted that consistent messages delivered by multiple sources do not require searching for additional information to resolve the confusion. Moreover, even though the textual component in the warning messages was kept rather neutral, the messages could have induced fear in the participants. A recent study by Morss et al. (2018) found that fear-based messages not only increased protective action intentions and risk perceptions compared to neutral messages but also increased perceptions that the information was overblown. Their findings are supported by the fear literature (Witte, 1992, 1994). Thus, fear could have had an influence on intended actions and/or on discrediting the message source if the message was perceived as overblown or misleading. Moreover, it was hypothesized that the effect of inconsistent visual warnings would be more important than the effect of inconsistent textual warnings on intended actions (see H2a and H2b in Table 4; Sandman, 2006). Unlike for the evaluation of warning quality, the data confirm this trend and respondents receiving inconsistent textual information were more likely to adopt a risk minimizing behaviour. Moreover, these respondents were less likely to resolve the ambiguity by searching for more information than those who received inconsistent visual information. As was explained earlier, inconsistent textual information may be a consequence of uncertainty in forecasting (weather model used and interpretation of data). So, it may be that subconsciously people understand the uncertainty in the provided information, pushing them to adapt their behaviour more often when warnings are inconsistent in text compared to when they are not. This argumentation is also supported by other research suggesting that people prefer weather forecasts providing uncertainty information (Morss et al., 2008) or that forecasts with uncertainty can enhance the preparation of mitigation measures (LeClerc and Joslyn, 2012). However, these conclusions have to be treated with caution, as the findings show no such pattern when participants evaluated the quality of warning.
Finally, these findings are relevant also in relation to the evaluation of single warnings. On the one hand, they indicate that when messages are consistent in visualization people are more likely to rely on both messages together compared to inconsistent visual messages (see H3 in Table 4). This indicates an implicit association between graphics and reliance and highlights the importance of visual salience (Wogalter et al., 2002; Fabrikant et al., 2010; Severtson, 2013). On the other hand, the results show that for decision-making people relied (a) more often on the warning using three categories compared to five categories and (b) on the more severe warning (see H4a and H4b in Table 4). This could indicate that they prefer warnings with three categories over five categories (in type AB and AD) and prefer to stay with the worst-case scenario (in type AC and AD). Losee and Joslyn (2018) showed that greater forecast severity leads to greater trust in the forecast, because people "need them to be trustworthy" or because they just "want to be on the safe side." However, it must be borne in mind that small language differences in the textual information may have had an influence as well. Patt and Schrag (2003) highlight that people interpret uncertainties and probabilities as information to evaluate the impact of an event. Even though the language was similar in the different warning pairs, textual variations (e.g. "whereas levels of rainfall of up to 120 L/m² can occur in the southern part" compared to "whereas the highest precipitation rates (120 L/m²) are expected in the southern part") could have made a difference as people may have tended to choose the more certain sounding probability information.

| CONCLUSION
Several recommendations can be drawn from these results. At least in Switzerland, most people regularly consult weather information, primarily using smartphone applications. A lot of people indicated that they have received inconsistent warning information from different weather providers. This highlights the importance of the problem addressed in this study, which needs to be taken seriously into account by public and private weather providers. Moreover, inconsistent visual and textual messages have to be avoided as they are very confusing. Thus, public and private weather providers need to work more closely together and start a process of harmonization of warning messages. As Thorpe (2016) highlights, public-private partnerships could enable the whole weather enterprise to grow and produce more accurate and reliable weather forecasts. This study indicates that the existing misunderstanding and mistrust about the respective roles of the public and private sector (see, for example, the successful lawsuit of private providers against the public provider DWD in Germany in 2017) need to be overcome and indeed things are slowly changing. At a European level, the 2016-2025 Strategy of the European National Meteorological and Hydrological Services (NMHS) highlights that "In response to the anticipated growth of the private meteorology sector, the distinct roles of the European NMHSs with respect to data collection, model development, research, warnings and alerts need to be established, while at the same time collaboration with the private sector is stimulated" (Thorpe, 2016, p. 19). In Switzerland, for instance, public and private weather providers are in regular contact to improve the warning process and there are ongoing discussions on how warning information could become more consistent, especially with regard to visualization. 
Even though the breakthrough has not yet been reached, this study highlights the need to disseminate more consistent warning information than is currently the case. To do so, agreeing on a similar visualization of warning messages (i.e. identical number of categories, thresholds and associated colours) seems to be doable and could work for the private and public sector: MeteoSwiss would remain the official warning authority, while the private providers would continue to disseminate their own warnings based on their data, models and interpretations. As this study shows that being consistent in either text or visualization would significantly improve outcomes compared to inconsistent warnings in both (as is currently the case), such an agreement would improve the warning process and, ultimately, the general public would strongly benefit from warning messages that are always consistent in visualization but could show some variety in text.
One shortcoming of this study (and also some previous studies) is that the research relies on self-reported intended responses to a hypothetical situation, rather than a field observation of actual behaviour in response to an actual warning situation, even though some scholars argue that behavioural intentions are a good proxy for actual behavioural response (Ripberger et al., 2014). In addition, an imagined situation suffers from a lack of real consequences for decisions and people might seek fewer risks in real-life decisions, as was pointed out also by Kox and Thieken (2017). Furthermore, the sampling technique may limit the generalizability of the findings reported in this study. Moreover, evaluating different warning messages is somewhat subjective. As Casteel (2016) highlighted, a legitimate question to ask is what actually constitutes an effective response to warnings. Thus, it is important to remember that, given the inherent uncertainty in severe weather events as well as the local conditions that can influence the weather, the most effective response may not always be the same action. Finally, as in Losee and Joslyn (2018), the research reported here focused on evaluation of quality of information when the sources are not mentioned and thus did not include the effect of source. However, source attribution was found to be an important determinant of perceived informational quality (Frewer and Shepherd, 1994) and thus public reaction to warning messages may be determined by an interaction between source and inconsistency of information.
Future research is needed to address some of the limitations of this study. It would be interesting to investigate reactions to real inconsistent, and consistent, weather warnings, rather than using a simulated event. Recent research has shown that, in a real-world crisis, feelings and/or deliberative evaluation influence evacuation behaviours and intentions (McCaughey, 2018; Gutteling et al., 2017). Although the results of such a study would be of high validity, the study would also pose complicated methodological and potential ethical challenges, such as requiring public or private weather forecasters and providers to disseminate different messages during a dangerous situation. The methodological challenge could be addressed by using existing weather applications of one or more providers or by developing a new application that could be used to disseminate the warnings from different providers available. Through both applications, the subscribers could be given links to complete short, mobile-friendly surveys to evaluate warning response. Moreover, this message duplication may have the unintended side effect of reducing credibility and respondents' trust in the weather provider. Future research should address the link between warning evaluation and response. Last, but not least, as the results also seem to undermine the power of visuals, more research is needed to resolve the ambiguity. Subsequent research could also focus on the effects of the source of information and inconsistent visual and textual information on the warning evaluation for different hazards.