Sugar, gravel, fish and flowers: Mesoscale cloud patterns in the trade winds

An activity designed to characterise patterns of mesoscale (20 to 2,000 km) organisation of shallow clouds in the downstream trades is described. Patterns of mesoscale organisation observed from space were subjectively defined and learned by 12 trained scientists. The ability of individuals to communicate, learn and replicate the classification was evaluated. Nine‐hundred satellite images spanning the area from 48°W to 58°W, 10°N to 20°N for the boreal winter months (December–February) over 10 years (2007/2008 to 2016/2017) were classified. Each scene was independently labelled by six scientists as being dominated by one of six patterns (one of which was “no‐pattern”). Four patterns of mesoscale organisation could be labelled in a reproducible manner, and were labelled Sugar, Gravel, Fish and Flowers. Sugar consists of small, low clouds of low reflectivity, Gravel clouds form along apparent gust fronts, Fish are skeletal networks (often fishbone‐like) of clouds, while Flowers are circular clumped features defined more by their stratiform cloud elements. Both Fish and Flowers are surrounded by large areas of clear air. These four named patterns were identified 40% of the time, with the most common pattern being Gravel. Sugar was identified the least and suggests that unorganised and very shallow convection is unlikely to dominate large areas of the downstream trade winds. Some of the patterns show signs of seasonal and interannual variability, and some degree of scale selectivity. Comparison of typical patterns with radar imagery suggests that even this subjective and qualitative visual inspection of imagery appears to capture several important physical differences between shallow cloud regimes, such as precipitation and radiative effects.


INTRODUCTION
A German proverb reads: Wenn Schäfchenwolken am Himmel steh'n, kann man ohne Schirm spazieren geh'n. Loosely, and lyrically, it translates as "when cute woolly clouds make the weather, no need to walk with your umbrella." It demonstrates how particular cloud forms have long been taken as indicative of weather and its pending changes.
Ground-based observers have historically classified individual cloud types (Schäfchenwolken in the above proverb) to help anticipate the weather. With the advent of the telegraph, systematic surface-based observations of clouds could be communicated over great distances, thereby giving a sense of the synoptic situation. This idea motivated the first "Wolken-Atlas" or cloud atlas, developed around Abercromby and Hildebrandsson's ten basic cloud types. The Atlas was published in Hamburg in 1890, and served as a template for a standardised activity implemented through an international accord adopted in Paris in 1896 (Hildebrandsson and Teisserenc de Bort, 1910), a year later celebrated as "the international year of the cloud" (Stephens, 2003). This Paris accord, and subsequent coordination by the World Meteorological Organisation (WMO), has helped establish cloud classification as a basic element of more than a century of systematic human weather observations. These days, instruments are replacing human cloud observers, and climate change, not changing weather, motivates efforts to read order into clouds.
Modern satellite-derived cloud classification schemes emphasise radiative properties derived from a field of clouds, rather than the impression left by an individual cloud. A famous example is the International Satellite Cloud Climatology Project (Rossow and Schiffer, 1999). In this scheme, fields of clouds are classified by their net effect on solar and infrared radiation, which are respectively indicative of their average thickness, and the average height of their tops. But even such properties-driven classifications of fields of clouds are guided by, and interpreted in terms of, basic cloud types which are not too different from the ten types defined by Abercromby and Hildebrandsson nearly 150 years ago.
Satellite imagery shows that the form of clouds is not just expressed in an individual cloud, but often through the very different spatial patterns built up from individually quite similar clouds, or in some cases regular sequences of changing cloud forms. Examples of the former encompass familiar patterns, e.g. elongated wind-parallel cloud streets, networks of open/closed cells, or even cross-wind bands (Agee, 1987; Atkinson and Zhang, 1996; Young et al., 2002; Stevens et al., 2005; Wood and Hartmann, 2006). Examples of the latter include the progressive changes associated with large-scale midlatitude frontal features. Many of these patterns are clear and unambiguous, lending themselves well to objective identification techniques, and have motivated research over decades to understand the parameters that order them (e.g. Painemal et al., 2010; Muhlbauer et al., 2014). But other patterns are less clear, so much so that whether or not one can rightfully identify regularity in cloud-patterning is a question unto itself.
To answer this question, the present authors¹ formed a study group to explore whether they could identify common patterns in the satellite presentation of shallow convection in the trades. A motivation for this activity was an appreciation of the important role the organisation of deep convection plays in regulating radiative heat-loss to space (Tobin et al., 2012), and hence an emerging curiosity as to whether similar processes were occurring in shallow convection. A further motivation was to help prepare for a forthcoming field programme (Bony et al., 2017): if patterns of organisation exist for shallow convection, then their appearance might influence how one organises the measurement campaign. These general motivations gave rise to more specific questions. First, do different people recognise common patterns in satellite images of clouds in the trades? And if so, can these patterns be described, classified and communicated to train other individuals to identify them? Secondly, if humans can consistently identify patterns of cloudiness, can objective measures of these patterns be discerned, and/or can machines be taught to recognise such patterns as well? Ultimately this exercise aims to answer the question as to whether information about differences in patterns of shallow convection in the trade winds can be used to advance understanding of cloud-controlling processes and their role in climate. This article addresses the first question and sets the foundation for ongoing work, and future articles, on the subsequent questions.

GETTING STARTED - THE CLASSIFICATION PROCEDURE
Interest in the mesoscale organisation of clouds focused on the downstream North Atlantic trades. We chose a region windward of Barbados (i.e. east of 13°N, 59°W), during the months of boreal winter. The region and season were of interest because they are characterised by regimes of "small" clouds associated with "big" questions, questions like: what sets Earth's equilibrium climate sensitivity, and how sensitive are clouds to aerosol perturbations (Albrecht, 1989; Vial et al., 2013; Brient et al., 2015; Seifert et al., 2015)? It is for these same reasons that the region has become a focal point for long-term ground-based observations through the Barbados Cloud Observatory (Stevens et al., 2016) as well as more elaborate past (Stevens et al., 2019) and planned (Bony et al., 2017) field studies windward of this observatory. The emphasis on the mesoscale, to be precise on what Orlanski (1975) calls the meso-β (20 to 200 km) and to a lesser extent the meso-α (200 to 2,000 km) scale, is because patterns of organisation on these scales are often not a part of the discourse on the "big" questions.

FIGURE 1 Sugar: MODIS-Aqua scenes from Worldview. The images cover the area from 60°W to 48°W and 10°N to 20°N. For these images the scenes have been extended to the west to include Barbados, coloured in artificial green, on the far left. For a sense of scale, Barbados fits in a rectangle of east-west dimension of 25 km and north-south dimension of 30 km. Depending on the quality of the reproduction, some features distinguishing these from other patterns may be difficult to discern from printed (rather than electronic) renditions of this article. From left to right the images correspond to 31 December 2014, 5 December 2015 and 20 January 2016.
The idea to investigate patterns of mesoscale organisation was made possible by NASA's Worldview.² The ability to easily browse very high-resolution images made it conceivable to look for patterns in the pictures. With this in mind a subgroup, consisting of a few of the authors, spent part of a morning independently browsing the Worldview visible images from the MODerate-resolution Imaging Spectroradiometer (MODIS) near and upwind (roughly east-northeast) of Barbados, the idea being to see if a dominant spatial pattern could be identified among images, and if so how often it occurred. The images spanned 10° in latitude and 20° in longitude. After spending time individually looking for patterns, the members of the subgroup met to discuss their individual impressions. The discussion quickly led to the conclusion that different people often identified the recurrence of similar patterns. After some further discussion the subgroup concluded that when recurrent patterns could be identified, they took on one of at least four forms. To these we gave the names Sugar, Gravel, Fish and Flowers. A fifth pattern, Bands, was also identified by the subgroup. It was associated with large-scale bands of completely overcast sky. An example is given in Figure 10 in appendix S1. But because it ended up being infrequently and inconsistently classified by the broader group, our analysis focuses only on the four robustly identifiable patterns: Sugar, Gravel, Fish and Flowers.

Definitions
² https://worldview.earthdata.nasa.gov

A concise description of the four patterns that we felt confident in our ability to classify is as follows:

Sugar: Dusting of very fine-scale clouds with small vertical extension and little evidence of self-organisation (by cold pools or gust fronts).
Gravel: Cloud fields patterned along meso-β (20 to 100 km) lines or arcs defining cells with intermediate granularity, and brighter cloud elements (as compared to Sugar), but with little evidence of accompanying stratiform cloud veils.
Fish: Meso-α scale (200 to 2,000 km) skeletal networks (often fishbone-like) of clouds separated from each other, or from other cloud forms, by well-defined cloud-free areas and sometimes accompanied by a stratiform cloud shield.
Flowers: Irregularly shaped meso-β scale (20 to 200 km) stratiform cloud features, often with higher reflectivity cores, and appearing in quasi-regular spaced bunches (hence the plural) with individual features well separated from one another by regions devoid of clouds.

These are illustrated by images (Figures 1-4) from scenes that, through the broader classification activity described below (section 2.2), were unanimously identified with a particular pattern.³

Sugar was so named because when it occurred the clouds looked like a sprinkling of powdered sugar. In Figure 1 this is exemplified by the cloud patterns in the upper-left quadrant (partly masked by the gap in satellite coverage) of the left panel (31 December 2014), and in the right half of the right panel (20 January 2016). The granulation in the reflectivity field of Sugar is quite fine, with relatively little clumping, other than what one might expect to occur randomly. The clouds were also not very reflective (or bright), which was interpreted as indicating a lack of vertical extent. Another notable feature of Sugar was the absence of large-scale areas completely devoid of clouds. Ideally Sugar had no organisation, but often what we would call Sugar might be patterned by the large-scale flow into streets or even feather-like forms.

Gravel differed from Sugar through a larger granularity of the patterns defined by the clouds as well as a greater brightness contrast (Figure 2). More notably, Gravel clouds organised along lines or arcs thought to be associated with gust fronts accompanying cold pools (i.e. precipitation-sourced density currents; Zuidema et al., 2012). New cells often could be seen to form at the points where gust fronts collided, with brighter, presumably deeper, clouds demarcating these regions. In some cases, Gravel exhibited structures reminiscent of open mesoscale cellular convection, for instance in the lower third of the image from 14 January 2009 (central panel, Figure 2).
Gravel and Sugar are identified with some degree of preconception: Gravel with cold pools (Zuidema et al., 2012); Sugar with non-precipitating shallow convection. Past modelling studies (e.g. Siebesma et al., 2003) and observational campaigns (Barbados Oceanographic and Meteorological Experiment, BOMEX: Nitta and Esbensen, 1974) have helped establish Sugar as the canonical trade-wind cloud in the mind of many researchers.
Fish also appears to be built up from open cells or convective cells organised around apparent gust fronts in ways that outline a skeletal structure similar to that of a fish. But compared to Gravel the clouds are yet brighter, and encapsulated in a larger meso-α (200 to 2,000 km) scale envelope, often with some amount of associated stratiform cloud cover. In Figure 3 one such structure stretches across the 12° of longitude on the bottom of the left panel; another stretches across the full image of the right panel, from the northwest to the southeast corner. This meso-scale patterning of the cell complexes is brought into relief by the degree to which the areas between the "Fish" are devoid of clouds, in marked contrast to Gravel.
Flowers were the most surprising and most distinct pattern of organisation. They are composed of meso-scale patches of stratiform clouds, often with evidence of central clusters embedded in and supporting the stratiform cloud patches (Figure 4). The scale of an individual Flower (or stratiform patch) in the pattern "Flowers" varies from a few tens to a few hundreds of kilometres. Our classification focused on situations where they appeared in bunches, i.e. with a quasi-regular distribution wherein individual Flowers were separated from one another by similarly scaled regions devoid of clouds.

Assigning labels
Based on these perceived patterns, the subgroup developed a labelling protocol which was used to train the rest of the group of 12 labellers. Here we define labelling as the act of an individual, a labeller, attaching a label to an image. Classification is what emerges out of the labelling activity, for instance as a result of independent labellers attaching the same label to an image. Because of the way the images were set up, it was only possible to label an image as a whole, and having a large (20° × 10°) domain increased the chances that different patterns of shallow-cloud organisation would appear in different parts of the domain. This is already evident, for instance in Figure 2b, where in the western portion of the image, near and north of Barbados, clouds have a more Sugar-like texture, or in Figure 1a where a Fish is visible in the bottom right quadrant. In the group classification that followed, it was therefore decided to work with smaller 10° × 10° images. For these the southwestern corner of the domain was placed at 58°W and 10°N, upwind of Barbados. In adjusting the size of the scene, we may have inadvertently made it less likely for Bands to be identified. The five perceived patterns (including "Bands") were presented to the full group of 12 labellers (the authors) by the subgroup. Each pattern was described and presented in the form of a few examples, similar to those shown in Figures 1-4. Then, together, the group scrolled through a season (December, January, February; DJF) of Worldview images. As if learning how to play a card game with an open hand, individuals were asked in turn to label an image and when the other participants did not agree, reasons for differences were discussed. After the training each person was asked to label 5 years of images, for the specified study region, during the months of December, January and February, within a period of 10 seasons starting in 2007/2008 and concluding in 2016/2017.
These years were chosen as they were the only ones available on Worldview at the time of the labelling activity. Each season ran from 1 December until 28 February, thus excluding 29 February in 2008, 2012 and 2016, and totalling 10 seasons (900 days). Each person assigned labels to five seasons of images, so that each image was independently assigned a label by six different people. The classification was performed only on daytime MODIS-Aqua images (corresponding to roughly 1330 local time at the centre of the image) using the "Corrected reflectance" product, which corresponds to the MODIS Level 1B data (a combination of data at different wavelengths, derived from sensors having a 250 or 500 m resolution), corrected for gross atmospheric effects. When any one of Sugar, Gravel, Fish or Flowers covered half or more of the image, the image was labelled as such.
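The bookkeeping behind this labelling design can be checked with a short sketch. It assumes, as the text implies but does not state outright, that the five seasons assigned to each of the 12 labellers were spread evenly over the 10 seasons; the variable names are illustrative only.

```python
# Illustrative check of the labelling design described above.
# Assumption: seasons were distributed evenly among the labellers.
n_seasons = 10
days_per_season = 31 + 31 + 28        # 1 December to 28 February (29 February excluded)
n_labellers = 12
seasons_per_labeller = 5

n_images = n_seasons * days_per_season                      # one image per day
labels_per_image = n_labellers * seasons_per_labeller // n_seasons
total_labels = n_images * labels_per_image

print(f"images: {n_images}")
print(f"independent labels per image: {labels_per_image}")
print(f"labels assigned in total: {total_labels}")
```

Under these assumptions the sketch reproduces the 900 images and six labels per image quoted in the text.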

Label statistics
Of the 900 images, 815 were classified by at least one person as being dominated by one of the four patterns: Sugar, Gravel, Fish or Flowers. We therefore consider these 815 days as classifiable days. Of the 85 images that were not classified by any person, many were the result of conditions overcast by high clouds, or simply missing images. For instance, at the time the labels were assigned, images were not available for the period between 25 January and 13 February 2008, nor for the 15-17 and 25 February 2008, a total of 24 days.

Given the probability p that a particular label will be assigned, the probability that this label will be assigned exactly k times in n trials is

$$P_k = \binom{n}{k}\, p^k (1-p)^{n-k}. \qquad (1)$$

From this it follows that the probability for one of the four (Sugar, Gravel, Fish, Flowers) labels to appear k or more times given n assignments is

$$P_{k+} = 4 \sum_{j=k}^{n} P_j, \qquad (2)$$

where the factor of four counts the four named labels, whose majorities are mutually exclusive for k > n/2. In our case, n = 6 denotes the number of labellers. Table 1 shows how often images were labelled identically by k or more people, and compares this to the fraction of images one would expect to be classified consistently from Equation 2 for given values of p. The actual frequency of agreement greatly exceeds what is expected from randomly guessing one of the six labels (p = 1/6). Because the Band pattern was identified rarely, and further assuming that there was a bias towards choosing some labels (i.e. a tendency to choose one of the four named labels as opposed to "No Pattern"), a more stringent measure of chance agreement would be to assume that any given pattern is chosen with a probability of p = 1/4. Even for this scenario, patterns were robustly classified, with four (the smallest number denoting a majority) labellers agreeing nearly three times as often as would be expected by chance.

TABLE 1 Fraction of 815 "classifiable" images for which k or more labellers were in agreement, and the probability, p, of this happening if labels were randomly assigned with equal likelihood. Two limiting cases are considered: when a classifier randomly assigns one of six (p = 1/6) or one of four (p = 1/4) possible labels.

Not all patterns were equally likely to dominate the 10° × 10° classification area (see Figure 5, and Table 2). The Gravel label was assigned to images three times more often than the other labels. It dominated even more if unanimity was required for a pattern to qualify as classified. Surprisingly, Sugar, which was interpreted as shallow convective clouds, with little signature of self-organisation, occurred the least (Figure 5). If we consider the two labels "No Pattern" and Sugar as the labels corresponding to an absence of internal organisation, then for more than a third of the scenes (35%) a 10° × 10° scene was classified (four or more labels in agreement) as being dominated by some form of mesoscale self-organisation, i.e. Gravel, Flowers or Fish.
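The chance-agreement estimate discussed above can be sketched numerically. The sketch assumes that the probability of exactly k identical labels follows the binomial distribution, and that the probability for any one of the four named labels to reach k or more assignments is four times the single-label tail probability (exact only when k > n/2, since two labels cannot then both reach k). The function names are illustrative, not taken from the study.

```python
from math import comb

def prob_exactly(k, n, p):
    """Binomial probability that a given label is assigned exactly k times in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def prob_k_or_more(k, n, p):
    """Probability that a given label is assigned k or more times in n trials."""
    return sum(prob_exactly(j, n, p) for j in range(k, n + 1))

def chance_agreement(k, n=6, p=0.25, n_named=4):
    """Probability that any one of n_named labels is assigned k or more times.

    Assumes the events are mutually exclusive, which holds for k > n/2.
    """
    return n_named * prob_k_or_more(k, n, p)

# Chance agreement among six labellers for the two limiting cases in the text.
for k in (4, 5, 6):
    print(k,
          round(chance_agreement(k, p=1/6), 4),
          round(chance_agreement(k, p=1/4), 4))
```

Comparing these values with the observed fractions of agreement is what allows the text to conclude that agreement was far more frequent than random labelling would produce.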

Pattern variability
Considerable interannual variability was apparent among the patterns. This is illustrated, for instance, by the variability in Flowers among years in Figure 5. To better quantify this variability, we consider a classified image as one where four or more out of six people agreed on its label. The number of classifications, and their breakdown by year and category, is presented in Figure 6. Because not everyone labelled images from every year, biases of individual labellers could bias the degree of interannual variation of the classification. In Table 3 the statistics for each person (labeller) involved in the classification are presented, along with the years they labelled. Despite considerable differences among individuals and the fact that they classified different years, Gravel was the most classified type for each individual, and for nine of the twelve, Flowers was the least frequent. There is the temptation to see a professional bias: P. Zuidema, who has written extensively about marine cold pools (Zuidema et al., 2012), was relatively more successful in identifying Gravel, while P. Siebesma who re-introduced BOMEX to the community (Siebesma et al., 2003), in the form of their randomly distributed non-precipitating cumulus humilis, appeared to have willed away the precipitation, often seeing Sugar where others saw Gravel. On the other hand, S. Bony, who first brought Flowers to our attention, did not seem predisposed to see a disproportionate number of bouquets. These differences among labellers could be partly responsible for the apparent interannual variability in classifications. Then again, real interannual variability would also lead to apparent differences among the labellers. Which explanation is correct is difficult to establish from the available data. For most of the patterns there is not a strong signature of intraseasonal variability. Flowers are the exception that proves the rule.
Of the 49 scenes classified as Flowers (by virtue of the agreement of at least two-thirds, four or more, of the labellers) only one occurred in December, and that near the end of the month on 21 December 2015 (Figure 5). Moreover, Flowers was twice as likely to be identified in February as compared to January. This concentration is also consistent with the sense that Flowers were persistent: when they formed, they stayed. Only in six instances were Flowers separated from other Flowers or from Fish (the most closely related of the other patterns) by more than 2 days. And in one twelve-day period starting at the end of January 2017 (i.e. in the 2016 season), Flowers were identified on eight out of 12 days, whereas in mid-February 2011 (in the 2010 season) Flowers were identified on four consecutive days. This suggests that the patterning, particularly that which leads to Flowers, is more influenced by the large-scale synoptic situations than by the random internal dynamics of cloud-scale circulations, as the latter would be expected to have less day-to-day coherence.
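The classification rule used in this section (an image counts as classified when four or more of its six labels agree on one of the four named patterns) can be written down as a minimal sketch. The function name, the label strings and the example labels are hypothetical and serve only to illustrate the rule.

```python
from collections import Counter

NAMED_PATTERNS = {"Sugar", "Gravel", "Fish", "Flowers"}

def classify(labels, min_agree=4):
    """Return the named pattern shared by at least min_agree labels, else None.

    With six labels and min_agree=4, at most one label can reach the
    threshold, so inspecting the single most common label is sufficient.
    """
    label, count = Counter(labels).most_common(1)[0]
    if label in NAMED_PATTERNS and count >= min_agree:
        return label
    return None

# Hypothetical sets of six labels for three images.
print(classify(["Gravel", "Gravel", "Gravel", "Gravel", "Sugar", "No Pattern"]))
print(classify(["Gravel", "Gravel", "Gravel", "Sugar", "Sugar", "Sugar"]))
print(classify(["Flowers", "Flowers", "Fish", "Flowers", "Flowers", "Flowers"]))
```

Raising `min_agree` to 6 gives the unanimity criterion used to select the example scenes in Figures 1-4.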

Pattern similarity
Ideally one would like to know to what extent one pattern is clearly distinguishable from another. Looking at the patterns in Figures 1-4, and the pattern succession in Figure 5, suggests that some quantitative measure of the similarity between one pattern and another may differ depending on which patterns one compares. At a glance, Flowers appears more closely related to Fish than Sugar, and Gravel more closely related to Sugar than Flowers. To further address this question, we investigated all instances when an image was given the same label by all but one individual, as in this case the label that is in disagreement is unambiguous. Sixty-two instances were identified when a scene was given five Gravel labels. Of these, in more than half the instances (33) the sixth label was Sugar. On only eight occasions was the sixth label a Fish, and not once was it a Flowers. The rest of the time "No Pattern" was assigned. Likewise, of the 19 instances where five Flowers labels were assigned, the sixth label was Fish on seven instances, Gravel on three instances, and "No Pattern" on 10 instances. There seemed to be no tendency of people finding Sugar among Flowers. Conversely, and consistently, of the 18 instances when a scene was given five Sugar labels, on only one occasion was the sixth label Flowers. The connectivity, or similarity among patterns, is summarised by Figure 7, which shows how likely a discordant label, for an image where five labels agree, is to be another label. Hence this analysis fails to refute the hypothesis that Sugar and Gravel are in some sense closer (or more similar) to one another than Flowers and Sugar.

FIGURE 7 Similarity among patterns (Sugar, Gravel, Fish, Flowers) as measured by the likelihood of a discordant label, for the subset of images with only one different label. The arrow points from the classified pattern to the discordant pattern, and its width indicates the frequency with which the discordant pattern arose. Only the two most likely choices for the discordant label are shown, i.e. if the sixth labeller disagreed with the other five people who labelled an image as Gravel, then this person was most likely to have chosen Sugar, less likely to indicate Fish and not at all likely to have chosen Flowers.
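The discordant-label analysis described above can be sketched as follows. This is a minimal illustration rather than the procedure actually used in the study: the function name and the small set of example images are hypothetical, and the counts they produce are not the study's results.

```python
from collections import Counter, defaultdict

def discordant_counts(images):
    """Tabulate the odd-one-out label for images where five of six labels agree.

    Returns a mapping: majority label -> Counter of discordant labels.
    """
    table = defaultdict(Counter)
    for labels in images:
        counts = Counter(labels)
        # Exactly two distinct labels among six, split 5-to-1.
        if len(labels) == 6 and len(counts) == 2:
            (major, n_major), (minor, _) = counts.most_common(2)
            if n_major == 5:
                table[major][minor] += 1
    return table

# Hypothetical label sets (not data from the study).
images = [
    ["Gravel"] * 5 + ["Sugar"],
    ["Gravel"] * 5 + ["Sugar"],
    ["Gravel"] * 5 + ["Fish"],
    ["Flowers"] * 5 + ["Fish"],
    ["Gravel"] * 4 + ["Sugar", "Fish"],   # ignored: only four labels agree
    ["Sugar"] * 6,                        # ignored: unanimous
]

table = discordant_counts(images)
for major, minors in table.items():
    print(major, dict(minors))
```

Normalising each row of such a table by its total would give relative frequencies of the kind represented by the arrow widths in Figure 7.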

Structure
Riehl, a pioneer in studies of tropical meteorology, pointed out long ago that there are many different types of clouds in the trades. Figure 8 reproduces an illustration from his book (Riehl, 1954), which identifies these different forms.
The schematic gives the impression that trade-wind clouds consisted mostly of different forms of cumulus clouds, which differed principally in their vertical extent. The figure gives no hint of the spatial patterning of the clouds, and how this might be related to cloud vertical extent. Precipitation is hardly shown, and even where it does occur it does not reach the surface. There is no sign of cold pools or gust fronts, nor that stratiform layers can sometimes develop at the top of shallow clouds. The articulation of this spatial patterning needed to wait for the advent of spatial overviews made possible by high-flying aircraft, and later satellites and radars. But even so these ideas have been slow to develop. The idea that cold pools play an important role in the organisation of clouds in the trade winds only really came into focus as a result of the relatively recent Rain in Cumulus over the Ocean field study (RICO: Rauber et al., 2007). The importance of precipitation, the way in which clouds deepen, and the nature of the stratiform cloud layers in scenes classified as Sugar, Gravel, Fish and Flowers can be evaluated by examining radar time-height cross-sections associated with these patterns. For this purpose, we use measurements from a high-sensitivity (Ka band) cloud radar at the Barbados Cloud Observatory. Satellite images (MODIS) were used to identify times representative of the different patterns and a radar cross-section for that time is provided to accompany the image (Figure 9). As anticipated, Sugar is constituted of clouds of little vertical development, mostly cumulus humilis, but also the odd Chimney cloud (even if one is not apparent on this radar cross-section). Gravel has substantially more vertical development, but how much depends on the day. Gravel, or at least the clouds associated with its gust fronts and cold pools, is also clearly associated with precipitation.
Fish also precipitate, but are additionally associated with more organisation, and often deeper (with tops from 3 to 4 km) clouds, and may be variants on what Garay et al. (2004) called Actinoform clouds. Flowers in contrast is composed of cumulus, some of which precipitates, not unlike Gravel, but with a stratiform veil. Similar clouds were frequently observed during Next-generation Advanced Remote sensing for VALidation (NARVAL1) and during the Cloud System Evolution in the Trades (CSET) field study (Albrecht et al., 2019), and may be the downstream evolution of closed cell convection as observed in regions of stratocumulus. The stratiform layers are often quite thin, but have a strong signature on the satellite images, and contribute considerably to the variability in cloudiness in the region (Nuijens et al., 2014).
The radar imagery, combined with the frequency with which Gravel, Fish and Flowers are identified in the satellite imagery, suggests that precipitation is common in the trades, and is closely associated with the emergence of organisation. The precipitation associated with the organised patterns (Figure 9) also appears more substantial than what is identified with the cumulus congestus in Riehl's figure. Although Riehl does not provide a vertical scale for his figure, a cloud base at 700 m implies that his cumulus congestus have tops near the freezing level, at 4.5 km. The strongly precipitating clouds in Figure 9 top out somewhat lower, between 3 and 4 km, whereas airborne measurements suggest that precipitation begins to become evident already when cloud tops reach 1.5 km (Stevens et al., 2019), and becomes frequent as clouds begin to penetrate above 2 km. Together it appears that precipitation and mesoscale organisation of cloud fields in the trade winds are common, and to understand either might require understanding both.
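The radar cross-sections of Figure 9 are time series at a fixed site, and their caption equates a 3 h window with a spatial scale of about 75 km for a typical wind speed of 7 m s⁻¹. A minimal sketch of that conversion follows, assuming the cloud field is simply advected past the radar at a constant wind speed; the function name is illustrative.

```python
def advected_distance_km(duration_h, wind_speed_ms):
    """Distance (km) swept past a fixed site in duration_h hours at wind_speed_ms m/s.

    Assumes the cloud field is advected unchanged at a constant speed.
    """
    return duration_h * 3600.0 * wind_speed_ms / 1000.0

# A 3 h radar cross-section at a typical trade-wind speed of 7 m/s.
print(f"{advected_distance_km(3, 7):.1f} km")
```

On this basis a single radar cross-section samples a length comparable to one meso-β-scale feature, which is much smaller than the 10° × 10° satellite scenes.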

Scale sensitivity
Our experience, both in initially identifying the pattern prototypes, and through the course of the classification activity, was that there is some scale dependence to the frequency with which a pattern emerges and can be identified. At the very beginning the initial study group first attempted to classify areas even larger than 20° × 10° but quickly reduced the area to 20° × 10°, and then, in the final iteration, to 10° × 10°. This progressive refinement is either indicative of a scale dependence in the patterns, or in the ability to recognise a dominant pattern over a fixed area.

FIGURE 8 Different cloud types in the trades, taken from Riehl (1954).

FIGURE 9 Radar presentation of Sugar, Gravel, Fish and Flowers. Radar cross-sections each span 3 h, which for typical wind speeds of 7 m s⁻¹ corresponds to a spatial scale of about 75 km; the radar reflectivity (dBZ) is contoured. Precipitation is indicated by a reflectivity signature extending to the surface. Each MODIS image is over the same geographic region, with Barbados in the left quartile below the centreline, in green. The grey vertical line in the time series indicates the time of the satellite overpass.
The images in Figures 1-4 illustrate some of these issues. Each image was chosen from days where there was unanimity among those labelling the image. They hint at some of the challenges in the classification. For instance, if one looks closely, Sugar can be found in every image, but often Sugar is confined to rather small areas of a few tens of kilometres. The relative scarcity of Sugar classifications may indicate that very shallow convection, with no evidence of self-organisation, is unlikely to dominate a large area, at least in conditions characteristic of the study area and time period. Even when a scene is classified as Sugar, within the 10° × 10° study area other patterns are almost always evident: Fish-like structures are apparent in both the middle and left panels of Figure 1, and in the lower left corner of the right panel of Figure 1 some labellers might identify Flowers.
Based on this experience, and with an eye to some of the other questions raised in the introduction, we believe that the ability to identify patterns in larger images, by drawing bounding boxes (or polygons) of an arbitrary size around clearly defined patterns, would facilitate a more consistent and robust pattern detection. Such a procedure, which was not possible in the framework of the labelling platform we used, would have the added benefit of identifying whether different patterns occurred on different scales. Does, as we hypothesise, Sugar occur more frequently, but rarely on scales that allow it to dominate a 10° × 10° area? Furthermore, by allowing multiple labels for one image, an association among patterns might be detectable. For instance, does Sugar have an affinity for Gravel rather than Fish or Flowers, as the analysis presented above suggests?
Based on these insights, a platform has been developed to allow both a more flexible and rapid labelling, thereby facilitating crowd-sourced labelling activities. The design of this platform, the results from the classification and the ability of machines to learn the classes are reported in a separate manuscript.

SUMMARY AND CONCLUSIONS
Twelve trained atmospheric scientists (a subset of the authors), all with a background and interest in oceanic shallow convection, gathered to explore to what extent patterns of mesoscale variability could be visually (subjectively) identified in satellite imagery of clouds in the winter trades of the North Atlantic. This region of the atmosphere, unlike areas where stratocumulus predominates, or where cold air flows from land over warmer water, is less strongly associated with mesoscale variability in the cloud field, even if different forms of organisation had been noted in past field studies (e.g. Rauber et al., 2007). Visual inspection of one season of satellite imagery did, however, suggest that clouds exhibit different modes of organisation. These were given names reflecting what were believed to be the characteristic features of each pattern. Despite common characteristics, there was a degree of randomness to the patterns, which did not encourage the use of objective classification techniques. Instead a subjective procedure was developed whereby the patterns were described, and other scientists (labellers) were trained to identify and label these patterns. This procedure involved determining whether a particular pattern dominated a 10° × 10° area upwind of the Barbados Cloud Observatory (48°W to 58°W, 10°N to 20°N) in the season where the trade winds predominate (1 December to 28 February). Nine-hundred days of satellite imagery (encompassing the 2007/2008 to 2016/2017 Northern Hemisphere winter seasons) were classified, each image being classified by six different individuals. The aim of the study was to evaluate to what extent mesoscale patterns of shallow cumulus could be defined, communicated, learned, and eventually identified by other scientists.
The ability to identify patterns of mesoscale variability in turn raised further questions: to what extent these patterns exhibited interannual or intraseasonal variability, whether or not individual patterns had an affinity for one another, and to what extent patterns persisted from day to day.
We found that four distinct cloud patterns emerge, which we name Sugar, Gravel, Fish and Flowers, and characterise as follows:
Sugar: Dusting of very fine-scale clouds with small vertical extension and little evidence of self-organisation (by cold pools or gust fronts).
Gravel: Cloud fields patterned along mesoscale (20 to 100 km) lines or arcs defining cells with intermediate granularity, and brighter cloud elements (as compared to Sugar), but with little evidence of accompanying stratiform cloud veils.
Fish: Mesoscale (200 to 2,000 km) skeletal networks (often fishbone-like) of clouds separated from each other, or from other cloud forms, by well-defined cloud-free areas and sometimes accompanied by a stratiform cloud shield.
Flowers: Irregularly shaped mesoscale (20 to 200 km) stratiform cloud features, often with higher reflectivity cores, and appearing in quasi-regularly spaced bunches (hence the plural) with individual features well separated from one another by regions devoid of clouds.
From these we find that:
• A majority (4 of 6) of the labellers agreed on one of these four labels with a probability (p = .4) much larger than would be expected by randomly assigning six (p = .052), or even just four (p = .015), labels.
• Recognisable patterns, to the extent one associates this with the emergence of one of the four patterns, are very common in the downstream trades. Almost all of the images (more than 90%) exhibited features sufficient for at least one person to say that a particular pattern dominated the image.
• The pattern found most likely to dominate the 10° × 10° study area was Gravel. Surprisingly, Sugar, which our preconception most strongly associated with the downstream trades, and which is the one pattern with little signature of self-organisation, dominated the study area the least.
• Unorganised, very shallow convection (associated with cumulus humilis) appears frequently, but not over very large 10° × 10° (lon-lat) areas, as manifest in the lack of Sugar labels.
• Flowers evinced the most seasonality, appearing mostly in February, and often persisting for days.
• Differences in patterns are associated with differences in the structure of the cloud field as also visualised by its radar presentation, with Fish being most associated with deeper clouds and precipitation.
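The chance level quoted for six labels in the first finding can be illustrated with a short binomial calculation. The sketch below rests on our own reading of "randomly assigning" labels, namely that each of the six labellers independently picks one of the available labels with equal probability; the text does not spell out this null model, and the function name is hypothetical. Under that assumption the result is about 0.052. The four-label figure depends on assumptions the text does not state, so it is not reproduced here.

```python
from math import comb


def p_majority_by_chance(n_labellers: int = 6, n_labels: int = 6, threshold: int = 4) -> float:
    """Probability that at least `threshold` labellers agree on some label when
    every labeller picks uniformly at random among `n_labels` labels (assumed null model)."""
    if 2 * threshold <= n_labellers:
        # Two labels could otherwise both reach the threshold, and the simple sum would overcount.
        raise ValueError("threshold must exceed half the number of labellers")
    p = 1.0 / n_labels
    # Binomial tail: a given label is chosen by `threshold` or more labellers.
    p_one_label = sum(
        comb(n_labellers, k) * p**k * (1.0 - p) ** (n_labellers - k)
        for k in range(threshold, n_labellers + 1)
    )
    # At most one label can hold a strict majority, so the events are disjoint and add.
    return n_labels * p_one_label


print(round(p_majority_by_chance(), 3))  # 0.052
```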
Based on these findings we conjecture that the relative scarcity of Sugar is related to a tendency of different patterns to predominate on different spatial scales, and that a labelling protocol that allowed Sugar to be identified over subregions would find more Sugar, but over smaller regions.
These findings also encouraged and guided a variety of follow-up activities. One has been designed to see if the different patterns can be measured by objective methods and if the patterns distinguish themselves in terms of their radiative effects, or the environment in which they form. Another aims to generate a great many more labels, and allow the labelling of smaller subdomains, which would then provide the basis for asking to what extent machines could learn the labels assigned by humans. Based on this it is hoped that the factors influencing the emergence of the different patterns can be identified. Finally, this might help us to understand factors influencing cloudiness in the trade winds, and how they might change as the climate warms.