Using seasonal rainfall clusters to explain the interannual variability of the rain belt over the Greater Horn of Africa

The seasonal cycle of rainfall over the Greater Horn of Africa (GHA) is dominated by the latitudinal migration and activity of the tropical rain belt (TRB). The TRB exhibits high interannual variability in the GHA and the reasons for the recent dry period in the Long Rains (March–May) are poorly understood. In addition, few studies have addressed the rainfall fluctuations during the Msimu Rains (Dec.–Mar.) in the southern GHA region. Interannual variations of the seasonal cycle of the TRB between 1981 and 2018 were analysed using two statistical indices. The Rainfall Cluster Index (RCI) describes the seasonal cycle as a succession of six characteristic rainfall patterns, while the Seasonal Location Index (SLI) captures the latitudinal location of the TRB. The SLI and RCI depict the full seasonal cycle of the TRB supporting interpretations of the interannual variations and trends. The Msimu Rains are dominated by two clusters with opposite rainfall characteristics between the Congo Basin and Tanzania. The associated anomalies in moisture flux and divergence indicate variations in the location of the TRB originating from an interplay between low‐level air flows from the Atlantic and Indian Oceans and tropical and subtropical teleconnections. The peak period of the Long Rains shows a complex composition of five clusters, which is tightly connected to intraseasonal and interannual variability of latitudinal locations of the TRB. A persistent location of the TRB near the equator, evidenced in a frequent occurrence of a cluster related to an anomalously weak Walker circulation, is associated with wet conditions over East Africa. Dry Long Rains are associated with strong and frequent latitudinal variations of the TRB position with a late onset and intermittent rainfall. These results offer new opportunities to understand recent variability and trends in the GHA region.


1 | INTRODUCTION
The seasonal cycle of rainfall over the Greater Horn of Africa (GHA, cf., Figure 1) is dominated by the latitudinal migration of the tropical rain belt (TRB), which follows the seasonal variations of solar insolation and is strongly modulated by topography (Figure 1). In addition, various influences of large-scale teleconnections lead to complex spatio-temporal patterns of rainfall (Nicholson, 1996, 2017). To a first approximation, the migration stages of the TRB can be separated into four phases: two positions during boreal summer (June-September, JJAS) and winter (December-March, DJFM), and two transition periods in boreal spring (March-May, MAM) and autumn (October-December, OND). According to these phases, the GHA can be subdivided into two hemispheric regions with one rainy season covering boreal summer and winter, respectively, and one equatorial region with two rainy seasons during the transition periods (cf., e.g., Yang et al., 2015; Dunning et al., 2016; Seregina et al., 2019).
The boreal spring rainy season over the GHA is known as the 'Long Rains'. The Long Rains account for the higher proportion of the total annual rainfall, but show a weak spatio-temporal coherence, which also varies from month to month (Camberlin and Philippon, 2002; Zorita and Tilya, 2002; Nicholson, 2017). This may result from multiple jumps that are characteristic of the northward migration of the TRB over eastern Africa (Riddle and Cook, 2008; Riddle and Wilks, 2013, hereafter RW13). Both studies highlight that these jumps are connected with different stages of the development of the Somali Jet. Riddle and Cook (2008) showed that early stages of the Somali Jet (referred to as 'nascent') coincide with heavy rainfall over Ethiopia in April. Particularly in March and April, several studies agree on a relationship between wet conditions and equatorial westerly wind anomalies in the middle and lower troposphere over the Congo Basin (CB, e.g., Zorita and Tilya, 2002; Finney et al., 2019). In the northern part of the GHA, tropical-extratropical interactions modify rainfall (Camberlin and Philippon, 2002; Bekele-Biratu et al., 2018).
Due to a recent abrupt decline of the Long Rains causing devastating droughts, they have been the focus of many previous studies (e.g., Funk, 2011; Williams and Funk, 2011; Lyon and DeWitt, 2012; Liebmann et al., 2014; Funk et al., 2018; Funk et al., 2019; Harrison et al., 2019). Even though details vary, the decline has been attributed to sea surface temperature (SST) patterns in the Indo-Pacific warm pool and related changes in the Walker cells. While some studies have claimed that there are weak seasonal relationships with the El Niño-Southern Oscillation (ENSO) (e.g., Lyon, 2014), Funk et al. (2018, 2019) have demonstrated that human-induced warming in the Warm Pool has increased the sensitivity to La Niñas. Whereas declines in the Long Rains are well simulated by SST-driven atmospheric simulations (Lyon and DeWitt, 2012), there is ongoing debate regarding the importance of different regions; Williams and Funk (2011) emphasized the warming in the Indian Ocean, while Funk et al. (2018), Harrison et al. (2019) and Funk et al. (2019) highlighted the role of the gradient between the equatorial eastern Pacific and the Warm Pool, whereas Wainwright et al. (2019) recently pointed to the Arabian Sea and Arabia.
FIGURE 1 Topography of the study region (magenta rectangle) and its vicinity. Dark red lines mark the countries of the Greater Horn of Africa (GHA). Yellow polygons delineate the regions used for the boreal winter Msimu rains and boreal spring long rains. The stippled area encircles the Congo Basin. The blue dashed lines indicate the location of the tropical rain belts (TRB) in January and April, respectively (adapted according to Laing and Evans, 2008). AL depicts the location of the Angola low during the Msimu rains. Black arrows denote major moisture transport pathways in January (adapted from Munday and Washington, 2017). Topography was interpolated from the ETOPO1 1 Arc-Minute Global Relief Model (Amante and Eakins, 2009).
The link between ENSO and East Africa is well understood for the boreal fall rainy season ('Short Rains'), and new work has recently shown a trend towards more rainfall in equatorial latitudes (Seregina et al., 2019). However, some studies suggest a primary link to the Indian Ocean Dipole (IOD) that co-varies with ENSO (cf., Bahaga et al., 2019 and references therein). In boreal summer, the TRB is located over Sudan and Ethiopia and the rainy season is termed Sahel/Kiremt Rains in the present study. Rainfall during this rainy season increased recently (Seregina et al., 2019). Unlike for the Short Rains, El Niño events show associations with drought occurrence during the Kiremt season (Funk et al., 2018), while La Niña events tend to enhance rainfall (Seleshi and Zanke, 2004; Korecha and Barnston, 2007; Segele et al., 2009).
Less attention has been devoted to the boreal winter rainy season in Tanzania, locally known as 'Msimu Rains' (Hermann and Mohr, 2011). The Msimu Rains are part of the southern African solstitial rainy season. A main feature of the latter is the Angola Low (AL). The approximate locations of the TRB and the AL are given in Figure 1. On the synoptic scale, the AL is associated with increased westerly moisture fluxes from the southeast Atlantic and penetration of humid north-easterlies from the Indian Ocean deep into the outer tropics (Figure 1), enhancing convection in the AL area. The above-mentioned enhanced westerly flow extends northward to the CB and can push the meridional branch of the TRB eastward into the GHA region (cf., Figure 1). In connection with a passing upper-level wave from the mid-latitudes, the AL can intensify and trigger the formation of northwest-southeast orientated cloud bands, a visible sign of so-called tropical-temperate troughs (TTTs, Todd and Washington, 1999). TTTs act as an important source of rainfall for southern Africa (Todd and Washington, 1999; Faucherau et al., 2009). At the same time, convection over Tanzania can be suppressed (Manhique et al., 2011).
The Msimu rains appear to terminate earlier and are less abundant in recent years (Seregina et al., 2019). Similar to the Short Rains, El Niño conditions increase rainfall, while La Niña is associated with droughts. Harrison et al. (2019) found strong correlations between drought conditions in southern Tanzania during the December-February season and La Niña-like conditions in the equatorial Pacific at interannual and decadal time scales. The authors explain this teleconnection by both shifts in the tropical Walker circulations and modulations of the Rossby wave train in the southern hemisphere extratropics.
The intraseasonal to decadal variability of the above-described four GHA rainy seasons can largely be understood as anomalies related to the location, intensity and width of the TRB. RW13 introduced two statistical indices that describe the northward migration of the TRB, including its relevant meridional jumps, and allow for a relation to the development of the Somali Jet. In this study, we use the methodology of RW13 for the full seasonal cycle to reassess the interannual variations and trends of the seasonal cycle over the GHA region. Another focus is put on the circulation anomalies related to interannual variations of the Msimu and Long Rains. It will be demonstrated that the applied method is suitable to describe anomalous TRB locations and related circulations at intraseasonal to decadal time scales. By considering various synoptic- to large-scale atmospheric and SST fields, potential explanations for atypical Msimu and Long Rains are proposed. In Section 2, the data utilized in this study are described. In Section 3, the applied method after RW13 is explained. Section 4 discusses the seasonal mean evolution of the TRB and related circulations, while Section 5 focuses on interannual variability for the entire season, as well as related anomalies of characteristic features of the Msimu and Long Rains. Section 6 provides a summary and discussion.

2 | DATA
The rainfall dataset used for this study is the infrared-based, gauge-calibrated Climate Hazards Group InfraRed Precipitation with Stations (CHIRPS) version 2.0 product at 0.25° resolution (Funk et al., 2015). This product provides daily rainfall estimates between 50°S and 50°N from 1981 to present. Several studies confirm that CHIRPS performs better than many other products over East Africa (Kimani et al., 2017; Cattani et al., 2018). The main disadvantage is the lack of coverage over the oceans. Large-scale atmospheric fields, including wind components and geopotential at different pressure levels, SSTs and vertically integrated water vapour flux and its divergence, were obtained from the fifth-generation European Centre for Medium-Range Weather Forecasts (ECMWF) reanalysis (ERA5). Compared to its predecessor ERA-Interim (Dee et al., 2011), ERA5 offers a higher spatio-temporal resolution and an improvement of the atmospheric model and data assimilation processes (Hersbach et al., 2020). For this study, the time period from 1981 to 2018 is used.

3 | THE INDEX APPROACH
The spatio-temporal characteristics of the TRB vary from year to year. RW13 introduced two statistical indices to describe the rainfall distribution of the TRB during its northward migration. Both indices are based on a principal component analysis of the rainfall time series to extract the most important patterns of variability. RW13 used daily rainfall totals instead of anomalies and thus explicitly included the seasonal cycle. Since rainy seasons over the GHA are dominated by the migration of the TRB, the leading principal components capture large-scale variations associated with the seasonal modality of rainfall and TRB characteristics like location or intensity (RW13). In the present study, the first seven eigenvectors, which account for 34% of the total variance of rainfall (cf., Figures S1 and S2), were taken according to the Rule N significance test (Overland and Preisendorfer, 1982). Rule N assumes that only a leading subset of the estimated eigenvalues reflects statistically significant patterns, while the remaining eigenvalues cannot be distinguished from uncorrelated random noise. This subset is determined by comparing test metrics of the estimated eigenvalues to test metrics of eigenvalues of randomly generated datasets of the same dimension. With this statistical approach, the TRB can be conceived as a smooth envelope encompassing the rainfall signatures of the bulk activity of convective systems as a function of time in an area.
The Seasonal Location Index (SLI) is defined as a daily angle in radians between the first two principal components, measured in the clockwise direction to avoid a discontinuity at 2π. The index is constructed analogously to the phase component of the MJO index introduced by Wheeler and Hendon (2004). From an illustrative point of view, the evolution of the SLI can be depicted as a seasonal clock, where the vector resulting from the leading two principal components shows the seasonal 'time'. Low SLI values indicate that the TRB is located in a southern position. The SLI increases as the TRB progresses northward and decreases as the TRB migrates southward. By analysing the annual distributions of SLI values and anomalies, impacts of anomalously fast or slow progressions of the TRB on GHA rainfall can be studied.
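The phase-angle construction can be illustrated with a minimal sketch. It assumes that the daily scores of the two leading principal components are available as plain numbers (the names `pc1` and `pc2` are hypothetical), and that the angle is measured clockwise from the positive PC1 axis; the reference axis and sign convention are assumptions for illustration and are not specified in the text.

```python
import math


def seasonal_location_index(pc1, pc2):
    """Clockwise phase angle (radians, wrapped to 0..2*pi) of the point (pc1, pc2).

    Assumption: the angle is measured from the positive PC1 axis.
    """
    # atan2 returns the counter-clockwise angle in (-pi, pi];
    # negating it gives a clockwise angle, and the modulo wraps it to 0..2*pi.
    return (-math.atan2(pc2, pc1)) % (2.0 * math.pi)


# Hypothetical daily PC scores for three days
daily_scores = [(1.0, 0.0), (0.0, -1.0), (0.0, 1.0)]
sli = [seasonal_location_index(p1, p2) for p1, p2 in daily_scores]
```

With this convention, a point moving clockwise around the origin of the PC1-PC2 plane yields a steadily increasing index, which is the 'seasonal clock' picture described above.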
The Rainfall Cluster Index (RCI) is an integer-valued index that consists of clusters resulting from a cluster analysis. The input of the cluster analysis is the rainfall field reconstructed from the leading seven eigenvectors. Thus, each RCI value represents a typical spatial pattern of the daily precipitation over the GHA. The seasonal cycle can be conceived as a sequence of typical precipitation patterns identified by RCI values. RW13 used six clusters, which were assigned by the k-means clustering algorithm according to the cosine metric. The advantage of this metric is that it reflects the (dis)similarity of two vectors, but it neglects the amplitude of the respective vectors.
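The amplitude-insensitivity of the cosine metric can be made concrete with a short sketch. The vectors below are synthetic stand-ins for flattened rainfall fields, and the function name `cosine_dissimilarity` is hypothetical; this is an illustration of the metric only, not the clustering code used in the study.

```python
import math


def cosine_dissimilarity(a, b):
    """Return 1 - cos(angle between a and b); 0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine dissimilarity is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


# Synthetic 'rainfall patterns' at four grid points (mm per day)
pattern = [2.0, 0.5, 0.0, 4.0]
scaled = [3.0 * v for v in pattern]   # same spatial pattern, three times wetter
shifted = [0.0, 4.0, 2.0, 0.5]        # rainfall maximum in a different place

same_shape = cosine_dissimilarity(pattern, scaled)
different_shape = cosine_dissimilarity(pattern, shifted)
```

Here `same_shape` is zero up to rounding although the second field is three times wetter, whereas `different_shape` is large because the rainfall maximum sits elsewhere. A wet and a dry day with the same spatial pattern would therefore fall into the same cluster.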
The k-means algorithm requires predefined cluster centres as seeds or a predefined number of clusters, which is often subjectively determined. To minimize this aspect, we used a two-step approach to obtain the best-differentiated partition of clusters. First, we applied the agglomerative-hierarchical (AHC) approach, which does not require a predefined number of clusters and always results in a single and reproducible partition (Wilks, 2011). This approach considers each observation as a 'micro'-cluster and produces a multilevel hierarchy, where clusters at one level are joined pairwise as clusters at the next level. The number of clusters is obtained by determining the highest distance (dissimilarity) between two levels (difference between two clusters). A weakness of this approach is that the cluster members are fixed once they are included into a particular cluster even if a different cluster is more suitable in an advanced stage of the clustering procedure. To alleviate this disadvantage, a k-means clustering was applied subsequently with the number of clusters k 'proposed' by the AHC algorithm as seeds. The k-means approach assigns each observation to one of k groups according to a distance measure (Wilks, 2011). Following several sensitivity tests, both regarding the separation between clusters and their physical interpretation, six clusters were chosen, which agrees with the choice of RW13.
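The two-step idea can be sketched on toy data as follows. Several choices here are assumptions made only for illustration and are not taken from the study: centroid linkage for the hierarchical step, Euclidean instead of cosine distance, the largest jump between successive merge distances as the stopping rule, and all function names.

```python
import math


def euclid(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(points):
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def ahc_partition(points):
    """Centroid-linkage AHC, cut where the merge distance jumps the most."""
    clusters = [[p] for p in points]
    history = []  # (merge distance, partition just before that merge)
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = euclid(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        history.append((d, [list(c) for c in clusters]))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    jumps = [history[k + 1][0] - history[k][0] for k in range(len(history) - 1)]
    cut = jumps.index(max(jumps)) + 1  # partition before the 'expensive' merge
    return history[cut][1]


def kmeans(points, seeds, n_iter=20):
    """Plain k-means started from the given seed centres."""
    centres = [list(s) for s in seeds]
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [min(range(len(centres)), key=lambda c: euclid(p, centres[c]))
                  for p in points]
        for c in range(len(centres)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centre if a cluster becomes empty
                centres[c] = centroid(members)
    return labels, centres


# Toy data: three well-separated groups of three points each
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10),
          (20, 0), (20, 1), (21, 0)]
partition = ahc_partition(points)             # step 1: AHC proposes k and seeds
seeds = [centroid(c) for c in partition]
labels, centres = kmeans(points, seeds)       # step 2: k-means refines membership
```

The second step allows observations to change cluster, which the purely hierarchical step cannot do once a merge has been made.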
In the context of trends and interannual variability, this approach offers several advantages. By construction, neither the value of SLI nor the presence of a particular cluster is restricted to fixed calendar months. The composition of clusters during a specific month can indicate anomalous stages of the progression of the TRB. As clusters also reflect the spatial distribution of rainfall, changes in the temporal persistence of a cluster can provide hints towards trends of rainfall in particular regions.

4 | CHARACTERIZATION OF THE ANNUAL CYCLE THROUGH SLI AND RCI
In this section, we examine the mean seasonal variability of the two indices SLI and RCI, and relate it to the different stages of the migration of the TRB. The daily climatological SLI values and percentages of the occurrences of RCI values are presented in Figure 2a. All series were smoothed by an 11-day running mean. Figure 2b-g shows the rainfall patterns associated with the six clusters. The rainfall patterns were calculated by averaging rainfall for all days belonging to a given RCI value. To highlight areas for which a cluster is associated with wet conditions, statistically significant positive differences between the rainfall distribution of a given cluster and the distribution of all rainy days were estimated using the Wilcoxon-Mann-Whitney-Test (Helsel and Hirsch, 2002).
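The smoothing step can be sketched as a centred running mean over a daily climatology. The treatment of the year boundary is not stated in the text; the sketch assumes a periodic (wrap-around) calendar, and the function name is hypothetical.

```python
def running_mean_circular(values, window=11):
    """Centred running mean that wraps around the ends of the series.

    Assumption: the series is a daily climatology, so the last day is
    followed by the first day again.
    """
    n = len(values)
    if window % 2 == 0 or window > n:
        raise ValueError("window must be odd and no longer than the series")
    half = window // 2
    return [
        sum(values[(i + k) % n] for k in range(-half, half + 1)) / window
        for i in range(n)
    ]


# Tiny example with a 3-day window on a 5-value series
smoothed = running_mean_circular([0.0, 1.0, 2.0, 3.0, 4.0], window=3)
```

For a 365-value climatology, `running_mean_circular(clim)` would apply the 11-day window mentioned above while keeping the series length unchanged.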
Note that instead of numbering the clusters from 1 to 6 as in RW13, we opted to refer to them by two suggestive characters that reflect their region of maximum rainfall. For example, the first cluster is subsequently named 'SE' since rainfall is concentrated in the southeastern part of the GHA. For this and the denomination of the remaining clusters, the reader is referred to the headings of Panels (b)-(g) in Figure 2.
FIGURE 2 (a) Daily climatology of the SLI (black line) and relative climatological frequencies of the RCI (coloured lines). The daily values are smoothed by 11-day running averages. Line colours in (a) refer to the colours of the clusters, which are denoted by the frames and squares in (b)-(g). (b)-(g) Mean daily rainfall in mm day⁻¹ during each cluster (colours) for the period 1981-2018. The stippling marks areas where the rainfall distributions differ significantly from climatology at the 5%-level according to the Wilcoxon-Mann-Whitney-Test.
At the beginning of the year, the SLI values are close to 1, which corresponds to the Msimu rainy season. The northward migration of the TRB, characterized by an increase of the SLI, starts between the Julian Day (JD) 80 and 100 (i.e., between end of March and beginning of April), which can be attributed to the onset of the Long Rains. This northward movement goes along with a steady increase of the SLI until it reaches its maximum value of 5 in June. Until September, the SLI remains in the range between 4.5 and 5. The period June-September agrees with the timing of the Sahel/Kiremt rainy season. The following southward migration of the TRB and the corresponding Short Rains are captured by decreasing values of the SLI during JD 260-320 (i.e., mid-September to mid-November).
The frequent occurrence of the South-East (SE) and South-West (SW) RCI clusters dominates the Msimu Rains (turquoise and magenta lines in Figure 2a) between November and the beginning of April (JD 300 to JD 100 of the following year). Rainfall in cluster SE is concentrated at the southern and eastern edge of Tanzania (Figure 2b). Figure 3 shows cluster-specific moisture flux vectors and moisture divergence. The former also represent the mean low-level flow because the bulk of the moisture is concentrated in the lower troposphere. For cluster SE, northeasterly winds from the Arabian Sea and southeasterly flow southeast of Madagascar lead to moisture convergence between 5° and 15°S, longitudinally stretching across the Tanzania-Mozambique border into the southernmost CB. For cluster SW, rainfall is concentrated over the southern hemispheric CB (Figure 2c). Cluster SW shows weaker north-easterlies over the Arabian Sea and stronger easterlies around the northern tip of Madagascar and Tanzania, thus leading to a much reduced moisture convergence over the southern GHA (Figure 3b). Rather, moisture convergence is concentrated over the southern CB. The Equatorial East (EE) cluster occurs most frequently during both the increase of the SLI in April/May and the decrease in October/November (Figure 2a, red line). As indicated by the statistically significant region, this cluster reflects rainfall in equatorial East Africa (Figure 2d), and thus it can be attributed to the Long and Short Rains, respectively. During these periods, the northeasterlies over the Arabian Sea are absent (Figure 3c), but the southern-hemispheric branch of the Somali Jet is clearly discernible. It leads to a northwestward moisture flux directed towards and converging over equatorial eastern Africa including the Lake Victoria area.
The Equatorial West (EW) cluster shows increased frequencies in May/June and in September/October, and represents the equinoctial rainy seasons over the CB (Figure 2a, grey line and Figure 2e). Figure 3d shows the interhemispheric moisture flux associated with the Somali Jet and moisture divergence over the GHA region, and convergence over the CB and Lake Victoria. The Indian-Ocean (IO) cluster shows a seasonal cycle similar to that of cluster EW (Figure 2a, yellow line), but the frequencies differ. Cluster IO occurs more frequently during May/June than in September/October; the opposite is true for cluster EW, with lower percentages than IO in May/June and greater percentages in September/October. The GHA region is overall dry (Figure 2f) and there is also moisture divergence over Uganda and the CB (Figure 3e). Cluster IO is peculiar in that it can occur in any month of the year (cf., Figure 4) and is not associated with wet anomalies anywhere in the study region (Figure 2f). The rainy season in the northern GHA starts with increasing frequency of cluster North (NO) in late May and June and is fully established in July and August (Figure 2g, blue line). The retreat starts in late September and ends at the beginning of October. Cluster NO represents the Sahel/Kiremt rainy season. At this time, the Somali Jet is fully established and transports moisture away from the GHA region over the Arabian Sea towards India (Figure 3f). Apparent moisture sources for precipitation in northern Ethiopia and Sudan are the CB, but also the Red Sea area, consistent with Viste and Sorteberg (2013).
In summary, the six RCI patterns reflect different stages of the seasonal cycle for both single-wet- and dual-wet-season regions. The results agree with RW13 during the northward migration of the TRB within the first half-year. The southward migration can be described with the same set of clusters. Note, however, that a day with the same cluster value during the Long and Short Rains just signifies that the large-scale rainfall patterns are similar, yet spatial details, amplitudes, and the underlying dynamic forcings will usually differ. Nonetheless, the approach allows assigning rainfall clusters to the four major GHA rainy seasons. As will be shown below, anomalous dry or wet rainy seasons can be related to anomalous frequencies in daily cluster occurrence and also be interpreted physically through the assignment of the large-scale circulation to seasonal clusters.
FIGURE 4 Relative frequency of clusters per year for each calendar month (a-l). The colour of each cluster is indicated in the bar and corresponds to the colours in Figure 2.

5 | INTERANNUAL ANOMALIES OF SEASONAL PROGRESSION AND CLUSTER-RELATED CIRCULATION ANOMALIES
In the following section, we first examine the year-to-year variations of cluster frequencies. In Sections 5.2 and 5.3, we analyse composite anomalies of the large-scale circulation associated with the occurrence of each cluster during the Msimu and Long Rains, respectively.

5.1 | Year-to-year variations in cluster frequency
The relative frequencies of the individual clusters per calendar month are shown in Figure 4 for all years between 1981 and 2018. A similar composition of clusters in consecutive months points to coherent seasons. This is, for example, the case for the Msimu Rains from December to March (Figure 4a-c,l). This rainy season can largely be explained by three clusters only; clusters SE and SW account for up to 90%, while around 6% can be attributed to cluster IO. July and August as part of the Sahel/Kiremt rainy season are also largely composed of three clusters (Figure 4g,h). Here, cluster NO is very dominant, more so than the prevailing SE cluster in the Msimu rainy season. June and September, usually also assigned to the Sahel/Kiremt Rains, have significantly different relative frequencies of the same clusters. These solstitial rains are characterized by a relatively stationary location of the TRB in zones of unimodal rainfall distribution (e.g., Seregina et al., 2019). During the migratory phases of the TRB in boreal fall and spring, cluster composition changes progressively from month to month. Consequently, the Long Rains (MAM) and Short Rains (ON) are less coherent. Thus, while discussing the Msimu Rains for DJFM, we restrict the analysis of the Long Rains to the wettest month April (Nicholson, 1996) in Section 5.3. Figure 4 also displays striking features of interannual variability for certain clusters in particular years. For example, the El Niño event in 1997 can be recognized by anomalous cluster composition in the months October to December (Figure 4j-l). In particular, this El Niño event reveals an anomalously frequent occurrence of cluster EE in October and November and of cluster SE in December (Figure 5a). Both clusters have in common that they favour rainfall in the eastern parts of GHA in the above-mentioned months (Figure 2). In contrast, clusters leading to rainfall in the CB, that is, clusters SW and EW, are suppressed in OND 1997. However, other El Niño events in the time series do not exhibit pronounced deviations from the mean cluster composition. Only in October 1982 can an anomalously high proportion of cluster EE be observed (Figure 5a).
Another example is the exceptional drought period during the Short Rains in 2010 and the following Long Rains in 2011 (Dutra et al., 2013;Lott and Stott, 2013). In October 2010, clusters EW, IO and NO dominate the rainfall distribution, while clusters SW and EE occur infrequently compared to the average October (Figure 5b). In November and December 2010, clusters SE and SW dominate the monthly rainfall, while cluster EE is suppressed (Figure 5b). In April 2011, the frequency of the wet cluster EE remains low, while clusters SE, SW, EW and IO prevail instead (Figure 5b). Overall, for both rainy seasons, clusters typical of earlier and later stages of the seasonal progression are anomalously frequent, while the occurrence of cluster EE is reduced. This would be consistent with a delayed, but fast passage of the TRB from the northern into the southern hemisphere and vice versa during the 2010 Short and 2011 Long Rains, respectively. This pattern, which appears to be associated with recent La Niña events, represents a dangerous sequence of back-to-back East African droughts (Funk et al., 2018). For example, note that the sequence of small EE coverage in October and April was repeated in 1999/2000 and 2016/2017 (Figure 4).
For each month, the time series of cluster frequencies were tested for significant trends. In general, only a few trends in the cluster composition were found to be statistically significant (5%-level) according to the Mann-Kendall-Test (not shown). The only consistent trends occur for the Short Rains: Cluster EE occurs significantly more frequently in October and November in accordance with an observed wettening of the Short Rains, while cluster SW days are significantly reduced.
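For readers unfamiliar with the trend test, a minimal sketch of a two-sided Mann-Kendall test is given here. It uses the normal approximation with tie and continuity corrections; whether these corrections match the implementation used in the study is an assumption, and the input series are synthetic.

```python
import math


def mann_kendall(series):
    """Return (S, z, two-sided p-value) of the Mann-Kendall trend test."""
    n = len(series)
    # S counts concordant minus discordant pairs (later value vs. earlier value)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S under the null hypothesis, corrected for tied values
    counts = {}
    for v in series:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t * (t - 1) * (2 * t + 5) for t in counts.values())
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18.0
    if var_s == 0.0:  # all values identical: no trend can be detected
        return s, 0.0, 1.0
    # Continuity correction
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return s, z, p


# Synthetic example: a strictly increasing 10-value series
s_up, z_up, p_up = mann_kendall(list(range(10)))
```

Applied to, for example, the 38 yearly October frequencies of one cluster, a p-value below 0.05 would correspond to significance at the 5%-level used above.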

5.2 | The Msimu rains
In this section, we will investigate the role of the frequency of occurrence of the clusters on rainfall anomalies during the Msimu Rains and the cluster-related circulation anomalies. We will consider the period DJFM during which mostly the three clusters SE, SW and IO occur. Figure 6 shows DJFM rainfall anomalies for southern GHA (see yellow rectangle in Figure 1), the anomalies of cluster frequencies, and three maps displaying differences in rainfall anomalies with respect to seasonal climatology between seasons exhibiting above- and below-average cluster occurrence. Clusters SE and SW reveal an anticorrelation to each other, which is strongest before 2010 (Figure 6b). The late 1990s show the strongest interannual variability of these two clusters. Msimu Rains with frequently occurring SE cluster lead to wetter conditions in Tanzania, while dryness prevails over the CB (Figure 6c). The Pearson linear correlation between anomalies of SE cluster occurrence and rainfall is 0.8 (Figure 6b). Vice versa, Msimu seasons with positive anomalies of SW cluster occurrences favour dry conditions in Tanzania (r = −0.8, cf., Figure 6b) and enhanced rainfall over the CB (Figure 6d). Cluster IO occurs less frequently than the previous two clusters (Figure 4a-c,l) but with pronounced positive anomalies in particular years (Figure 6b). Highest anomalies prevail in the 1980s and 1990s, while the magnitudes of the peaks are generally reduced after 2005. Frequent days in this cluster lead to dryness in most parts of Tanzania (r = −0.5, cf., Figure 6b), southern Kenya and Uganda (Figure 6e). In contrast to cluster SW, this cluster does not lead to extensive wetness over the CB.
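The correlation between seasonal cluster-frequency anomalies and rainfall anomalies can be reproduced in principle with a few lines. The sketch below uses synthetic numbers, not values from the study, and the function names are hypothetical.

```python
import math


def standardize(values):
    """Standardized anomalies: subtract the mean, divide by the sample std."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / std for v in values]


def pearson_r(x, y):
    """Pearson linear correlation coefficient of two equally long series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Synthetic seasonal values: days per season in one cluster, and seasonal rainfall
cluster_days = [40.0, 55.0, 35.0, 60.0, 50.0]
rainfall_mm = [410.0, 520.0, 380.0, 600.0, 470.0]

r = pearson_r(standardize(cluster_days), rainfall_mm)
```

Because the Pearson coefficient is invariant under shifting and scaling of either series, standardizing the cluster occurrences first does not change `r`.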
To understand how anomalous frequencies of clusters are related to regional circulation anomalies, composite differences of vertically integrated moisture flux, flux divergence, and geopotential at the 700-hPa level between days in a particular cluster and seasonal climatology were calculated (Figure 7). Cluster SE is characterized by enhanced moisture convergence stretching south-eastwards from eastern Zambia and Tanzania to the Mozambique Channel (Figure 7a). This pattern is accompanied by a cyclonic anomaly of geopotential centred over the southern Mozambique Channel. Overall, the result suggests an anomalously zonal orientation of the climatological meridional TRB and an active zonal portion over the Mozambique Channel and adjacent land areas (cf., Figure 1 and Figure S3a). Consequently, an increased occurrence of this cluster leads to above-average Msimu rainfall. In contrast, the CB experiences a drier season due to increased moisture flux divergence. Cluster SW shows an enhanced easterly divergent moisture flow from northern Madagascar crossing the Mozambique Channel, turning to more easterly anomalies, and affecting Tanzania and northeastern Zambia (Figure 7b). This flux extends westward, converges over the CB, and is related to a near-climatological position of the meridional TRB (Figure S3b). An anticyclonic anomaly of geopotential is located over Madagascar. This suggests an anomalously northward position of the zonal TRB in the Madagascan region and a westerly position of the meridional TRB (cf., Figure 1). As a consequence, drier conditions are observed over southern GHA and wetter conditions over the CB. Stronger moisture flux divergence over large parts of eastern Africa is a striking feature of cluster IO (Figure 7c). This cluster reveals similarities to cluster SW, but divergence extends westward, causing drier conditions over the CB. An anticyclonic pattern of geopotential extends from the southern hemisphere extratropics towards Angola (Figure 7c). This anticyclone is consistent with moisture flux anomalies around it. At its southwestern flank it appears to be associated with northwest-southeast oriented moisture flux convergence over southern Africa that is reminiscent of TTT locations (Faucherau et al., 2009).
To better understand large-scale atmospheric and oceanic conditions associated with the cluster-specific rainfall anomalies, corresponding fields of SSTs and 200-hPa velocity potential are now discussed. Anomalous SSTs are known to impact tropical convection and upper-level divergence at larger spatial scales not instantaneously, but over a time span of a few weeks. To account for this, we looked into non-overlapping 14-day windows of the Msimu Rains and determined if the occurrence of a cluster had positive or negative anomalies with respect to the climatology. Two samples of SST anomalies of the previous fortnight were created and averaged; one for fortnights with subsequent 14 days with positive cluster occurrence frequencies, and one with negative cluster F I G U R E 6 Temporal and spatial patterns of rainfall anomalies for the Msimu rainy season: (a) averaged seasonal rainfall anomalies over Tanzania (cf., yellow rectangle in Figure 1), (b) standardized anomalies of cluster occurrence, (c)-(e) composite maps of difference between seasonal rainfall anomalies across the region associated with (high-low) cluster occurrence. The Pearson correlation coefficients r between the time series in (a) and (b) are given on the right side of (b). Years in (a)-(b) refer to the major part of the rainy season, for example, for the year of 1982, the Msimu rainy season comprises Dec. 1981 andJan.-Mar. 1982. Statistically significant values compared to the climatology are stippled in (c)-(e) occurrence frequencies. Finally, the SST samples were subtracted. The SST anomalies associated with cluster SE show a gradient around Madagascar, which consists of a cold anomaly in the southwestern subtropical Indian Ocean and a warm anomaly that extends from the east African coast to the western coast of Australia ( Figure 8a). The region of warm SSTs co-locates with negative anomalies of velocity potential in 200 hPa. 
The latter is equivalent to upper-level divergence and indicates anomalous large-scale ascent over the Indian Ocean. Anomalous subsidence is observed over western Africa and the Atlantic Ocean. The SST and velocity potential anomalies during cluster SW largely mirror those in cluster SE. SSTs south of Madagascar show positive anomalies, while the north-western tropical Indian Ocean is colder than the climatological average (Figure 8b). In addition, large parts of the tropical Atlantic reveal positive, partly significant SST anomalies and a pattern in the northern hemisphere Atlantic Ocean that mimics a positive phase of the Atlantic Multidecadal Oscillation (Kerr, 2000). The anomalies of velocity potential reveal a consistent pattern with an ascending branch over the eastern Atlantic and a descending branch over the western Indian Ocean. The SST pattern of cluster IO reveals strong, largely significant negative anomalies over the western Indian Ocean (Figure 8c). Similar to cluster SW, the anomalies of velocity potential reveal a descending branch over the western Indian Ocean and an ascending counterpart over the equatorial Atlantic, which, however, is colder than average. Overall, the analysis in this section suggests that a complex interplay between changes in the tropical Walker circulation and interactions with the subtropics, as indicated by the significant geopotential anomalies in the Madagascan region, is responsible for cluster-specific changes in moisture transports and rainfall.
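The lagged fortnight compositing described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the function names, the representation of the data as plain nested lists (one list of daily values per season), the use of a single SST anomaly series (one grid point or an area average) and the definition of the climatology as the multi-year mean frequency of each window position are all assumptions made for this example.

```python
from statistics import mean


def window_means(series, window=14):
    """Average a daily series over consecutive, non-overlapping windows."""
    n_win = len(series) // window
    return [mean(series[i * window:(i + 1) * window]) for i in range(n_win)]


def lagged_composite_difference(occurrence, sst_anom, window=14):
    """Composite difference of SST anomalies in the preceding fortnight.

    occurrence: per-season lists of daily booleans (True on cluster days).
    sst_anom:   per-season lists of daily SST anomalies at one location.
    Returns mean SST anomaly before fortnights with above-average cluster
    frequency minus the mean before fortnights with below-average frequency.
    """
    freq = [window_means([float(d) for d in season], window) for season in occurrence]
    sst = [window_means(season, window) for season in sst_anom]
    n_win = len(freq[0])
    # Assumed climatology: mean frequency of each window position over all seasons.
    clim = [mean(season[w] for season in freq) for w in range(n_win)]

    before_positive, before_negative = [], []
    for f_season, s_season in zip(freq, sst):
        for w in range(1, n_win):  # the first window has no preceding fortnight
            anomaly = f_season[w] - clim[w]
            if anomaly > 0:
                before_positive.append(s_season[w - 1])
            elif anomaly < 0:
                before_negative.append(s_season[w - 1])
    # statistics.mean raises StatisticsError if either sample is empty.
    return mean(before_positive) - mean(before_negative)
```

For gridded SST fields, the same function would be applied to each grid point separately. A significance test of the resulting difference is not part of this sketch.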

| The April month of the Long Rains
FIGURE 7 Anomalies of vertically integrated moisture flux divergence (colours), fluxes (vectors) and geopotential at 700 hPa (contour lines) for the Msimu rainy season for the clusters (a) SE, (b) SW and (c) IO. Only statistically significant anomalies of moisture divergence and fluxes compared to the climatology at the 5% significance level are shown. Fluxes additionally require a persistence of at least 90%. Significant positive (negative) anomalies of geopotential are highlighted by red solid (dashed) contours.

The transition month April is characterized by the northward progression of the TRB (Figure 2a). Thus, the latitudinal position of the TRB in this month plays an important role for the onset and persistence of the Long Rains over equatorial East Africa, and hence for rainfall anomalies. Figure 9 shows the interannual variations of rainfall anomalies averaged over equatorial East Africa (see yellow polygon in Figure 1) and the mean SLI value and its daily distribution for April 1981-2018. The bold horizontal line marks the multi-year averaged SLI, while the solid curve and diamonds display the average monthly SLI value for each year (Figure 9b). In addition, colours depict the frequency of the daily SLI value in bins of 0.5. Circles denote the monthly variance of SLI in each year. Overall, the average monthly SLI shows low interannual variability. The striking feature in Figure 9b is the intra-monthly variability of the SLI for given years. Low variance and a long period of TRB near the average latitudinal position occurred, for example, in 1981, 1988 and 2018. These years are mostly associated with wet conditions during the Long Rains over equatorial East Africa (Figure 9a, cf., Camberlin and Philippon, 2002; Kilavi et al., 2018). On the other hand, strong variations can be observed, for example, in 1989, 1999, 2011 and 2014.
In several of these Aprils, the maximum frequency of occurrence of SLI values is not observed at its long-term mean, and in combination with a high latitudinal variability, these months are drier than normal (e.g., 1998-2001, 2011, 2014 in Figure 9a).

As revealed in Figure 2a, SLI values and the frequency of the six clusters have a characteristic seasonal dependence. In Figure 10, we focus on the anomalies of cluster occurrence and the associated rainfall anomalies in April to shed light on this interdependence and on rainfall anomaly patterns. Positive anomalies in the occurrence of cluster EE match rather wet years in equatorial East Africa, while negative anomalies coincide with drought years (cf., Figures 9a and 10a). This cluster is related to positive rainfall anomalies over the Horn of Africa, most of Uganda and northern Tanzania, while negative rainfall anomalies can be observed in western Ethiopia and smaller parts of northern DR Congo (Figure 10d).
Although cluster EE dominates in April, its relative frequency shows high interannual variability (Figure 10a). Almost all positive rainfall anomalies, including the most extreme events in 1981, 1997 and 2018, coincide with above-average occurrence of cluster EE. Vice versa, negative rainfall anomalies coincide with years of below-average frequencies of cluster EE, particularly in 1998-2000, 2011, 2014 and 2017. The deficient occurrence of cluster EE is mostly balanced by higher frequencies of clusters SE and SW (Figure 10b,c). This observation is underlined by increased frequency of low SLI values during the respective years (e.g., 1998, 1999 and 2011, cf., Figure 9). Apart from 1998 and 1999, only single years show positive anomalies of SE prior to 2010, while most of the years reveal negative anomalies. Positive anomalies of these clusters in consecutive years occur predominantly after 2010.
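The relation between anomalous cluster occurrence and rainfall anomalies discussed here can be illustrated with a small sketch of the three quantities involved: standardized anomalies of cluster occurrence, the Pearson correlation with a rainfall series, and a high-minus-low composite difference. The function names and all numbers are hypothetical, and splitting years into "high" and "low" by the sign of the standardized anomaly is an assumption; the threshold used for the published composites is not stated in this section.

```python
from math import sqrt
from statistics import mean, stdev


def standardize(values):
    """Standardized anomalies: subtract the mean, divide by the sample std."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long series."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def high_minus_low(occ_anom, rain_anom):
    """Mean rainfall anomaly in years of high occurrence minus years of low occurrence.

    Assumption: 'high' and 'low' are years with positive and negative
    standardized occurrence anomalies, respectively.
    """
    high = [r for a, r in zip(occ_anom, rain_anom) if a > 0]
    low = [r for a, r in zip(occ_anom, rain_anom) if a < 0]
    return mean(high) - mean(low)


# Made-up example: April days assigned to one cluster and rainfall anomalies (mm).
cluster_days = [20, 10, 15, 5, 25]
rain_anom = [30.0, -25.0, 5.0, -60.0, 50.0]

occ_anom = standardize(cluster_days)
r = pearson_r(occ_anom, rain_anom)
difference = high_minus_low(occ_anom, rain_anom)
```

Applied per grid point, `high_minus_low` would yield a composite difference map of the kind shown in Figure 10b-f.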
Cluster EW leads to positive rainfall anomalies over the northern CB and western Ethiopia, while dryness prevails in most parts of equatorial East Africa (Figure 10e). This cluster normally appears in May and June (cf., Figure 2a, Figure 4e,f), when SLI values are higher than 3.5. It is associated with a more north- and westward location of the TRB, mostly outside equatorial East Africa. This cluster frequently occurred in April during the early 1990s, but also in dry years, for example, 1987, 2000, 2009, 2014 and 2017. The last cluster IO predominantly occurs during May and June. As it leads to rainfall outside of the GHA, most land areas remain dry (Figure 10f). Particularly in April, cluster IO leads to drier than normal conditions over equatorial East Africa. High positive anomalies of this cluster occurred concurrently with almost all of the strongest drought years. Interestingly, this cluster reveals a decadal variability including shifts around the mid-1990s and 2010, which corresponds to the time period of an abrupt decline of the Long Rains (Williams and Funk, 2011; Lyon and DeWitt, 2012). At the same time, the SLI shows particularly high variance values (Figure 9b). Our results suggest that abundant rainfall in April is characterized by frequent EE clusters and related low SLI variability, whereas poor rainfall is associated with high SLI variability. The latter manifests itself in some years by frequent SE and SW cluster occurrence with the TRB predominantly south of the region or by frequent changes of cluster regimes and a wide latitudinal intra-monthly variability of the TRB causing intermittent wet spells in equatorial East Africa.

FIGURE 9 Interannual variability of rainfall over equatorial East Africa and the SLI as a proxy for the location of the TRB in April. (a) Averaged rainfall anomalies over equatorial East Africa (cf., yellow polygon in Figure 1). (b) Frequency of SLI values in April per year. The bold reference line marks the multi-year average of the SLI. Diamonds show the average value for each year, while circles highlight the respective monthly variance. The coloured boxes denote the frequency of the SLI-values for the particular year.
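The statistics displayed in Figure 9b (monthly mean, monthly variance and the frequency of daily SLI values in bins of 0.5) can be computed for one April as sketched below. The function name and the synthetic SLI values are hypothetical, and the use of the sample variance is an assumption; the section does not state which variance estimator was used.

```python
from math import floor
from statistics import mean, variance


def sli_month_summary(daily_sli, bin_width=0.5):
    """Summarize the daily SLI values of one month.

    Returns the monthly mean, the sample variance and a dict that maps the
    left edge of each occupied bin to the relative frequency of days in it.
    """
    counts = {}
    for value in daily_sli:
        left_edge = floor(value / bin_width) * bin_width
        counts[left_edge] = counts.get(left_edge, 0) + 1
    n_days = len(daily_sli)
    frequency = {edge: n / n_days for edge, n in sorted(counts.items())}
    return mean(daily_sli), variance(daily_sli), frequency


# Two synthetic Aprils with the same monthly mean but different behaviour:
persistent_april = [3.0] * 30       # TRB stays near one latitude
meandering_april = [2.0, 4.0] * 15  # TRB alternates between two positions

mean_p, var_p, freq_p = sli_month_summary(persistent_april)
mean_m, var_m, freq_m = sli_month_summary(meandering_april)
```

The two synthetic months share the same mean SLI, so only the variance and the binned frequencies distinguish a persistent TRB location from a meandering one, which is the contrast emphasized in the text.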
In the following, anomalies of moisture divergence, vertically integrated moisture fluxes and geopotential at 700 hPa are discussed in order to identify possible causes for anomalous cluster occurrences. Cluster EE can be regarded as the climatologically expected circulation type for the Long Rains (cf., Figure 2a,d). The climatological feature of this circulation is the 'nascent' Somali Jet, that is, a strong jet from the southeast but still without the zonal branch across the Arabian Sea (cf., Figure 3c). Over the latter, the wintertime northeasterlies have already disappeared. The composite differences in Figure 11c show a weakened easterly component of the nascent Somali Jet and a weaker Turkana Jet, leading to anomalous convergence of moisture and enhanced rainfall in equatorial East Africa. The geopotential does not exhibit significant deviations except for a negative anomaly over the Arabian Peninsula, a region recently highlighted by Wainwright et al. (2019). Unlike most previous studies, Figure 11c emphasizes the potentially important role played by westerly moisture transports from the CB (cf., Finney et al., 2019).

FIGURE 10 Temporal and spatial patterns of rainfall anomalies for April: (a) standardized anomalies of cluster occurrence, (b-f) composite maps of difference between seasonal rainfall anomalies across the region associated with (high-low) cluster occurrence. The Pearson correlation coefficients r between the time series in Figure 9a and Figure 10a are given on the right side of (a). Statistically significant values compared to the climatology are stippled in (b-f).
In case of clusters SE and SW, strong northeasterly anomalies of moisture fluxes prevail and deflect the nascent Somali Jet towards the southern hemisphere. Cluster SE is characterized by northeasterlies over Somalia and the adjacent Indian Ocean that cross the equator while turning into northwesterlies (Figure 11a). As a consequence, the nascent Somali Jet does not reach the equatorial region, but converges in central-southern Tanzania. The geopotential shows positive anomalies over the northern GHA and southern Africa. The northwesterly component in the southern hemisphere is replaced by northeasterlies in cluster SW, while changes in water vapour fluxes at the Somali coast are less pronounced (Figure 11b). As a consequence of wind changes south of the equator, maximum moisture convergence anomalies are found over the southern CB. Anomalies of geopotential indicate a cyclonic anomaly over the southwestern coast of Africa. For clusters EW and IO, the nascent Somali Jet is enhanced by an anomalous easterly to southeasterly component leading to anomalous moisture flux divergence and dryness over equatorial East Africa. In case of cluster EW, a strong convergence anomaly is observed over the CB (Figure 11d). During cluster IO, the western part of equatorial East Africa and the eastern CB are affected by much drier conditions when compared to EW. This is related to strong divergent moisture flow towards western equatorial Africa (Figure 11e) in accordance with lower geopotential in this region. Partly significant positive anomalies of geopotential are found in the southern hemisphere subtropics and the Arabian Peninsula.

FIGURE 11 Same as Figure 7, but for clusters that are important in April.
The tropical large-scale divergent circulation as reflected in the velocity potential anomalies indicates anomalous divergent outflow over the GHA and adjacent Indian Ocean for cluster EE (Figure 12c). The significantly positive SSTs over the latter appear to be consistent with the enhanced upper-level outflow. The SST anomalies in the equatorial central Pacific resemble the El Niño Modoki pattern (Ashok et al., 2007). Overall, the Indo-Pacific SST pattern appears to be associated with a complex Walker Circulation response that decreases divergence over the Warm Pool and south-western Pacific, while increasing divergence over the Arabian Sea. For cluster SE, the velocity potential map shows anomalous divergence from east Africa extending to the Maritime Continent (Figure 12a). SSTs over the Indian Ocean are above average. The SST pattern over the Pacific Ocean resembles a late stage of an El Niño. The descending counterpart is located over the Gulf of Guinea. In case of cluster SW, the Indian Ocean is dominated by upper-level convergent flow, while divergence is observed over western equatorial Africa (Figure 12b). The tropical and subtropical Atlantic Ocean exhibits strongly positive SST anomalies. The southern hemisphere midlatitude ocean is cooler than normal; thus the SST anomalies bear some resemblance to the South Atlantic Meridional Overturning Circulation pattern as discussed in Lopez et al. (2017). Cluster EW is characterized by a dipole of upper-level convergence (divergence) over the GHA region/western Indian Ocean (central equatorial Africa) consistent with the observed rainfall anomalies (Figure 10d,e). The former region is characterized by colder SSTs, which is one potential cause of the overlying anomalous convergence. Another potential cause is a wave train from the Mediterranean over the Arabian Peninsula into the Arabian Sea (Figure 13).
Such a wave train has been discussed in Bekele-Biratu et al. (2018). [...] Figure 11e, this suggests an anomalous Walker type circulation with an ascending (descending) branch over the Gulf of Guinea (East Africa). This circulation may be related to warmer (colder) SSTs in the Atlantic (Indian) Ocean (Figure 12e). Overall, the discussion of the composited anomalies allows meaningful physical interpretations that are complex due to concurrent tropical, subtropical and extratropical circulation anomalies.

FIGURE 12 Same as Figure 8, but for clusters that are important in April.

| SUMMARY AND CONCLUSIONS
In the present study, interannual variations of the seasonal cycle of rainfall over the GHA between 1981 and 2018 were analysed using two statistical indices introduced by RW13. The first index, the SLI, describes the north-south meandering of the TRB. The second index, the RCI, refers to six related rainfall clusters at daily time scales. Composite anomalies of moisture flux and flux convergence, geopotential heights and SSTs for clusters prevailing during the Msimu Rains in Tanzania, and during April, the core month of the Long Rains, were created to link the anomalous cluster occurrence to atypical atmospheric and oceanic conditions. The following main conclusions can be drawn:

1. Confirming and extending the results of RW13, the SLI and RCI are powerful indices to capture the full seasonal cycle of the TRB. Mean monthly cluster compositions indicate coherent parts of the rainy season and anomalous cluster occurrence helps to interpret interannual variations and trends, including some well-known ENSO extremes.
2. The Msimu Rains were dominated by two clusters with opposite rainfall characteristics between the CB and Tanzania. The associated anomalies in moisture flux (divergence) indicate variations in the location of the TRB originating from an interplay between low-level air flows from the Atlantic and Indian Oceans, and tropical and subtropical teleconnections.
3. Abundant rainfall during the Msimu Rains resulted from a persistent 'zonalisation' of the climatological meridional branch of the TRB. This constellation is favoured through westerly anomalies of moisture fluxes, a cyclonic anomaly of geopotential over the Mozambique Channel and large-scale anomalous ascent covering the east African coast and the Indian Ocean. Positive anomalies of SSTs over the equatorial Indian Ocean preceded these rain-favouring atmospheric conditions.
4. The peak period of the Long Rains shows a complex cluster composition, which is connected to intraseasonal and interannual variability of latitudinal locations of the TRB. In general, a persistent location of the TRB near the equator, evidenced in a frequent occurrence of cluster EE, is associated with wet conditions over East Africa, while dry Long Rains often go along with strong and frequent latitudinal variations of the TRB position with a late onset and/or intermittent rainfall.
5. Rainfall anomalies at the height of the Long Rains were found to be related to anomalous stages of the Somali Jet and low-level, zonal wind anomalies over the CB.

Our results suggest that the SLI is particularly useful to analyse intraseasonal and interannual variations of the TRB during the periods of latitudinal migration, for example, the Long Rains. It was shown that during wetter than normal Long Rains, SLI values show lower variability and a high percentage of days in the range of the seasonal mean, indicating a persistent location or a slow migration of the TRB. In years of drier than normal Long Rains, SLI values show higher variance, sometimes even a bimodal frequency distribution around the long-term mean. Strong variations of the SLI thus point towards frequent changes in circulation patterns, a strong intramonthly north-south meandering of the TRB and resulting intermittent rainfall.
The two dominating clusters of the Msimu Rains reflect a dipole pattern between Tanzania and the southern CB. The third cluster shows strongly reduced rainfall over both poles. The patterns describe an interplay between northeasterlies from the Arabian Sea, the southeasterlies from the southwestern Indian Ocean and westerlies from the Atlantic. The cluster SE associated with substantial rainfall over Tanzania is characterized by westerly moisture flux anomalies and warmer SSTs over the northwestern Indian Ocean, consistent with, for example, Zorita and Tilya (2002) and Mapande and Reason (2005). The circulation anomalies during the Msimu Rains agree with the latitudinal positions of the semipermanent AL (cf., Crétat et al., 2019). In this context, the occurrence of cluster SE and rainfall over Tanzania is tightly connected to a northerly position of the AL. Cluster IO and widespread dry conditions predominantly occur during a southerly displacement of the AL. Howard and Washington (2018) have shown that the characteristics of the AL can be separated into heat and tropical low phases. Howard et al. (2019) further investigated the relationship between the tropical low activity and precipitation over southern Africa. Their two circulation patterns of tropical low activity across the southern African continent agree with the two dominating clusters during the Msimu Rains in our work. While our results indicate a relationship between a meridional SST dipole in the western Indian Ocean and the occurrence of rainfall clusters, no obvious link to ENSO variability was found, unlike in Faucherau et al. (2009) and Crétat et al. (2019). The circulation during the Msimu Rains with frequent co-occurrence of clusters SW and IO may be connected to the occurrence of TTTs over southwestern Africa (Todd and Washington, 1999; Faucherau et al., 2009; Macron et al., 2014).
A warm southwestern Indian Ocean, which is associated with cluster SW, is related to the formation of TTTs over southern Africa (Faucherau et al., 2009). In addition, an interaction with extratropical Rossby wave activity, as described in Hart et al. (2010), Manhique et al. (2011), and Macron et al. (2014), seems plausible for cluster IO.
In agreement with Vellinga and Milton (2018), Funk et al. (2018) and MacLeod (2019), the rainfall-favouring cluster EE during the Long Rains positively correlates with SST anomalies in the Arabian Sea and central Pacific. Camberlin and Philippon (2002) found relationships of the Long Rains with lower sea level pressure, weakened easterlies and reduced divergence over the equatorial Indian Ocean, which agrees with the circulation of cluster EE. Finally, Wainwright et al. (2019) found that an earlier onset of the Long Rains is related to a warm Arabian Sea. In addition, a positive correlation is found with the SST anomalies in the central Pacific, suggesting an El Niño Modoki pattern (cf., Ashok et al., 2007; Preethi et al., 2015).
Dry conditions in April are characterized by higher relative frequencies of clusters related to moisture divergence and easterly anomalies of moisture fluxes, confirming the findings of Camberlin and Philippon (2002) and Zorita and Tilya (2002). The affected clusters can be separated into southern clusters SE and SW, and northern clusters EW and IO. The former clusters indicate a delayed onset of the Long Rains, while prolonging the Msimu Rains. The occurrence of cluster SE in April is positively correlated with the SSTs in the southern Indian Ocean. This agrees with the results of Wainwright et al. (2019), who found a correlation between SSTs in the latter region and late onset dates of the Long Rains.
The northern clusters are associated with negative SST anomalies and subsidence over the northwestern Indian Ocean. Bekele-Biratu et al. (2018) emphasized that a tripolar pattern consisting of two anomalous mid- to upper-level cyclonic troughs and one anticyclone tends to enhance rainfall during the Ethiopian Belg Rains. Frequent occurrence of cluster EW goes along with a similar wave train from the northern subtropics. Cluster IO reveals a Walker-type circulation with cold SSTs and subsidence over East Africa and ascent over the Gulf of Guinea. A similar SST pattern correlating with cessation dates of the Long Rains was found by Wainwright et al. (2019) in May. In addition, a connection to SST anomalies in the central Pacific, suggesting a La Niña Modoki pattern (cf., Preethi et al., 2015), could be shown.
Our study highlights the role of the latitudinal position of the TRB and potential drivers related to the preponderance of five rainfall clusters during the Long Rains. Yet, the convective activity within the seasonal envelope of the TRB as well as its width also contribute to the cluster statistics, even if latitudinal anomalies were small. The employed method does not allow a quantitative assessment of the relative roles of position, intensity, and width of the TRB. As a quick check, we used the formula given in, for example, Clark et al. (2018) and Hauser et al. (2020), which has three terms: the first term describes the influence of the cluster frequency, the second term the influence of the rainfall intensity, and the third term is a combination of both metrics. For grid points in the yellow polygon in Figure 1, the first term was twice as large as the second, whereas the third term was negligible for April for cluster EE days. This provided room for the interpretation that a 'subsidence cap' as a consequence of, for example, Walker cell anomalies or dry phases of the MJO (Vellinga and Milton, 2018), plays the dominant role. However, it can also be inferred that more dry days occur due to the TRB being outside the region. Anomalies of the seasonal progression of the TRB, as proposed in the present study, are consistent with recent results by Wainwright et al. (2019), who pointed out that the Long Rains decline was mostly associated with a shorter rainy season with late onset and early cessation dates rather than a decline in seasonal rainfall amounts.
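The three-term quick check mentioned above can be illustrated with a generic frequency-intensity decomposition. The sketch below assumes that the rainfall contribution of a cluster is the product of its occurrence frequency and its mean rainfall intensity on cluster days, so that an anomaly of this product splits exactly into a frequency term, an intensity term and a combined term. Whether this matches the exact form used in Clark et al. (2018) and Hauser et al. (2020) is an assumption, and the function name and all numbers are hypothetical.

```python
def decompose_cluster_contribution(freq_clim, int_clim, freq_year, int_year):
    """Split the anomaly of (frequency x intensity) into three terms.

    freq_*: fraction of days assigned to the cluster (climatology / given year)
    int_*:  mean rainfall on cluster days in mm/day (climatology / given year)

    Identity used (assumed form of the decomposition):
        f*i - F*I = (f - F)*I + F*(i - I) + (f - F)*(i - I)
    """
    d_freq = freq_year - freq_clim
    d_int = int_year - int_clim
    frequency_term = d_freq * int_clim   # change in how often the cluster occurs
    intensity_term = freq_clim * d_int   # change in how much it rains per cluster day
    combined_term = d_freq * d_int       # interaction of both changes
    return frequency_term, intensity_term, combined_term


# Made-up values for one grid point and one April:
freq_term, int_term, comb_term = decompose_cluster_contribution(
    freq_clim=0.5, int_clim=6.0, freq_year=0.6, int_year=6.6
)
total_anomaly = 0.6 * 6.6 - 0.5 * 6.0
```

With these illustrative values the frequency term is about twice the intensity term and the combined term is small. The numbers were chosen to produce that ratio and are not results from the study.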
The results of this study indicate that the employed methods not only shed light on the seasonal and interannual anomalies of the TRB, but can also contribute to a better description of rainfall trends and their potential causes. For the Short Rains, which show a wetting trend, significant changes in two clusters were found. For the Long Rains, the dry period 1995-2010 was marked by an anomalous occurrence of cluster IO, which is overall dry over the GHA region and was related to a Walker-cell-type anomaly encompassing the Indian and Atlantic Oceans. In summary, this work suggests strengthening research into the role of SST anomalies in the oceans adjacent to Africa as well as of tropical-extratropical interactions in explaining interannual rainfall variations and trends.