Global wave hindcast with Australian and Pacific Island Focus: From past to present

Wind‐wave hindcast data have many applications, including climatological assessments for renewable energy projects, maritime engineering design, event‐based impact assessments and the generation of boundary conditions for further downscaling. Here, we present a global wave hindcast with nested high‐resolution grids for the Exclusive Economic Zones of Australia and south west Pacific Island countries, which is extended monthly. The model employs a subgrid blocking scheme to incorporate the effects of subgrid-sized features such as small islands and islets. Various bulk wave parameters are available hourly from January 1979 to present, along with the full wave spectra at a set of 3,683 predetermined points distributed globally.


| INTRODUCTION
Historical wind-wave data have useful applications in the fields of coastal engineering (Kamphuis, 2010), maritime engineering for both operations and design specification for structural loads (Bales and Cummins, 1979; Gouldby et al., 2014; Weisse et al., 2015), coastal hazard assessment (McInnes et al., 2016), renewable energy projects (Cornett, 2008; Portilla et al., 2013) and marine habitat distribution studies (Smith et al., 2015). Long term datasets can be used to create climatologies and underpin return period analysis for engineering design (Ewans and Jonathan, 2020).
Observations from in situ instrumentation such as wave buoys and radars can provide accurate data at high temporal resolution; however, such instruments are sparsely located in the southern hemisphere. Satellite altimeter wave data have been available since 1985 (Ribal and Young, 2019), but do not provide directional, spectral or wave period information. Altimeter data also have limited application in near-coastal regions, and limited spatial and temporal coverage, with long intervals between passes at any given location of interest. Other satellite remotely sensed wave data, including those from synthetic aperture radar (Khan et al., 2020) and SWIM (Surface Waves Investigation and Monitoring) sensors (Hauser et al., 2017), offer value in providing directional and spectral information, but also have limited applicability given that the full wave spectrum is not resolved.
SMITH ET AL.
Wave information from hindcasts is often the preferred data source for retrieving long-term historical data (Ewans and Jonathan, 2020). Numerical wave models are an efficient and economically viable method for generating data across large geographic and temporal scales, provided they are driven by accurate wind inputs of sufficient spatial and temporal resolution to model wave extrema (Rogers et al., 2005; Hemer et al., 2011) and validated against available observations. Hindcasts also have the advantage of providing a greater number of wave variables than can be observed.
The Bureau of Meteorology and CSIRO have developed a hindcast model, known as the CAWCR (Centre for Australian Weather and Climate Research) Wave Hindcast, with the primary objective of providing global data with higher resolution in the Australian and central and south west Pacific regions. This was made possible by the release of the Climate Forecast System Reanalysis (CFSR) for the years 1979 to 2011 (Saha et al., 2010) and the CFSv2 Reanalysis from 2011 to present (Saha et al., 2011), which feature output at 0.3 and 0.2 degree spatial resolution, respectively, at hourly intervals. These reanalyses use 64 vertical levels in the atmosphere and are coupled with ocean circulation and sea ice models, providing quality wind data with adequate resolution for a global wave hindcast model.
The hindcast was originally developed as part of the Australian Aid-funded Pacific-Australia Climate Change Science and Adaptation Planning Program (PACCSAP; https://www.pacificclimatechangescience.org/) and the Australian Climate Change Science Programme. The initial dataset covered the period from 1979 to 2010. It was subsequently extended twice, covering January 2011 to May 2013 and June 2013 to June 2014. Since 2014, the dataset has been updated monthly.
Several global wave hindcasts have been developed as static datasets with similar methodologies using CFSR, with global resolutions of 0.5° (Chawla et al., 2013; Rascle and Ardhuin, 2013; Perez et al., 2017). The CAWCR Wave Hindcast has a global resolution of 0.4°, with a resolution of 4 arcminutes (up to 7 km) in the Australasian and central and south west Pacific regions, covering Australia, Timor, Timor-Leste (East Timor), Papua New Guinea and the main islands of Fiji, Vanuatu, Samoa, Tonga, Niue, Solomon Islands, Cook Islands, Kiribati (Gilbert, Phoenix and Line Islands), Tuvalu, Tokelau, Nauru, Marshall Islands, Federated States of Micronesia, Guam, Palau and also Howland Island and Baker Island (Figure 1). The increased resolution near the land masses of Australia and the south west Pacific provides a better representation of coastal geometry, an important consideration for the sheltering effect through the island archipelagos of the south west Pacific. Furthermore, high spatial resolution enables improved representation of bathymetry near coastlines, which in turn results in more accurate computation of the influence of bottom friction and depth-induced wave breaking. It also improves the modelled intensity of tropical storm and cyclone systems, whose wave heights can be significantly underestimated at coarser resolutions (Cavaleri, 2009).

METHODS
The numerical model used for the CAWCR Wave Hindcast is WAVEWATCH III (WWIII; Tolman, 1991), a third generation wave model that is widely used in forecasting centres. From January 1979 to May 2013, version 4.08 (Tolman, 2009) was used. This was upgraded to version 4.18b from June 2013 onwards (Tolman and WAVEWATCH Development Group, 2014). The physics selected for each stage were identical and are shown in Table 1. These physics were demonstrated to provide the best results for the Pacific-Australasian region. The spectral dissipation source term parametrization follows Ardhuin et al. (2010), which provides the best estimate of wave parameters for both ocean and coastal settings. The directional wave spectra were binned over 29 frequencies exponentially spaced from 0.038 to 0.5 Hz and discretised over 24 directions with a constant 15° spacing.
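The spectral discretisation described above can be illustrated with a minimal sketch. It assumes a constant ratio between neighbouring frequencies, chosen so that 29 bins span exactly 0.038 to 0.5 Hz; the increment factor actually configured in the model is not stated here and may differ slightly.

```python
N_FREQ = 29                 # number of frequency bins (from the text)
F_MIN, F_MAX = 0.038, 0.5   # lowest and highest frequency in Hz (from the text)
N_DIR = 24                  # number of direction bins (from the text)

# Assumption: geometric spacing whose ratio is derived from the end points,
# not taken from the model configuration.
ratio = (F_MAX / F_MIN) ** (1.0 / (N_FREQ - 1))
frequencies = [F_MIN * ratio ** i for i in range(N_FREQ)]

# Directions are equally spaced around the full circle.
dir_step = 360.0 / N_DIR
directions = [i * dir_step for i in range(N_DIR)]

print(f"frequency ratio ~ {ratio:.4f}")
print(f"{len(frequencies)} frequencies, {frequencies[0]:.3f} to {frequencies[-1]:.3f} Hz")
print(f"{len(directions)} directions every {dir_step:.0f} degrees")
```

Under this assumption the spacing ratio comes out a little below 1.1, so each bin is roughly 10% higher in frequency than the one before it.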
The global grid has a resolution of 0.4°, with two-way nested grids (Tolman, 2008) for Australia and the south west Pacific Island countries progressing down to 10 arcminutes (~18 km) and then 4 arcminutes (~7 km), as shown in Figure 1. Bathymetry for all grids was extracted from the DBDB2v3 digital bathymetric dataset (NRL, 2006). Two-way nesting allows wave energy to propagate both from the coarse grid into the nested grids and from the nested grids back into the coarse grid. Many Pacific Island countries are archipelagos comprising many islands and islets, some of which are much smaller than the finest grid resolution. To account for the reduction of wave energy across these smaller features, a subgrid island blocking scheme was implemented, which suppresses the energy flux between adjacent grid cells based on a transparency value in both the x and y directions (Tolman, 2003). Neglecting subgrid features results in a large positive bias, particularly in the Pacific. Obstruction grid generation relies on shoreline information (including small atolls) from the Global Self-Consistent Hierarchical High-Resolution Shoreline database, which is part of the WWIII grid preprocessor developed by Tolman (2007, 2008) that creates consistent grids across different resolutions.
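The idea of transparency-based blocking can be illustrated with a one-dimensional toy example. This is only a sketch under simplifying assumptions (a first-order upwind step, energy travelling in the +x direction only, blocked energy simply removed); it is not the WWIII implementation, and the function and variable names are hypothetical.

```python
def upwind_step(energy, transparency, courant):
    """Advance wave energy one step in the +x direction (first-order upwind).

    energy[i]       -- energy in cell i
    transparency[i] -- fraction of flux passing the face between cell i and
                       cell i + 1 (1.0 = open water, 0.0 = fully blocked)
    courant         -- fraction of a cell travelled per step (0 < courant <= 1)
    """
    flux_out = [courant * e for e in energy]
    new_energy = []
    for i in range(len(energy)):
        # Flux arriving from the upwind neighbour, reduced by unresolved islands.
        flux_in = transparency[i - 1] * flux_out[i - 1] if i > 0 else 0.0
        new_energy.append(energy[i] - flux_out[i] + flux_in)
    return new_energy


# Same initial pulse, with and without subgrid islets on the first cell face.
open_water = upwind_step([1.0, 0.0, 0.0], [1.0, 1.0], courant=1.0)
with_islets = upwind_step([1.0, 0.0, 0.0], [0.5, 1.0], courant=1.0)
print(open_water)   # all of the pulse reaches the second cell
print(with_islets)  # only half of the pulse reaches the second cell
```

With a transparency of 0.5 on the first face, half of the energy is lost between the two cells, which is the behaviour that an unobstructed grid would miss.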
The hindcast requires global wind and sea ice concentration data (to define the ice edge) to be downloaded and preprocessed into the WWIII format prior to the model run. The surface winds at 10 m and sea ice required by the ongoing monthly updates are generated by the National Centers for Environmental Prediction (NCEP) Climate Forecast System Version 2 (CFSv2) reanalysis (since 2011) and obtained from the research data archive operated by the National Center for Atmospheric Research (NCAR). Surface winds have a spatial resolution of 0.2° at hourly intervals, while sea ice is six-hourly. These files are in GRIB edition 2 format and are converted to netCDF for ingestion into the WWIII preprocessors. Including temporally varying sea ice concentration allows the ice mask over the ocean surface in the polar regions to change through time (Tolman, 2003). The sea ice does not have a damping effect on the wave energy; rather, it is treated similarly to subgrid island blocking of the energy flux, with blocking dependent on the ice concentration. By default, a concentration below 25% is assumed to have no effect on propagation and a concentration above 75% is treated as land, with a varying degree of blocking between these threshold values. The global grid extends in latitude from 78°S to 78°N, therefore excluding the poles.
The hindcast is updated monthly. A restart file is saved at the conclusion of each monthly run, providing the initial conditions for the subsequent monthly model run. No restart file is available for the very first month, January 1979, so the model is spun up from a flat state. For this reason, it is recommended that the data for January 1979 not be used. From June 2013 onwards, the model configuration has remained unchanged to limit inhomogeneity issues through the time series.

TABLE 1 Physics selected for the model run
Identifier    Description
UQ            Third order (UQ) propagation scheme, Leonard (1979, 1991)
PR3           Higher order schemes with Tolman (2002)

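The ice concentration thresholds described in the Methods (no effect below 25%, treated as land above 75%) can be sketched as a transparency function. The linear ramp between the two thresholds is an assumption made for illustration; the text only states that the degree of blocking varies between them. The function name is hypothetical.

```python
def ice_transparency(concentration, c_open=0.25, c_land=0.75):
    """Return the fraction of wave energy flux allowed through a grid cell.

    concentration -- sea ice concentration as a fraction between 0 and 1
    c_open        -- below this, ice has no effect on propagation
    c_land        -- above this, the cell is treated as land
    """
    if concentration <= c_open:
        return 1.0
    if concentration >= c_land:
        return 0.0
    # Assumed linear transition between the two thresholds.
    return (c_land - concentration) / (c_land - c_open)


for c in (0.10, 0.50, 0.90):
    print(f"ice concentration {c:.0%}: transparency {ice_transparency(c):.2f}")
```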
Model outputs include global bulk wave parameters for each of the three grid resolutions, as well as the full wave energy density spectra at 3,683 predetermined global points, which describe the distribution of wave energy over frequency and direction. Bulk wave parameters available from within the output netCDF files are listed in Table 2. The spectral points are shown in Figure 2. The 3,683 locations comprise a global regular 10° grid, in addition to locations at 32 arcminute intervals (i.e. every 8 grid points) over the high-resolution 4 arcminute Australian and south west Pacific regions. Spectra are also provided at known wave buoy locations, at tide gauge stations to evaluate wind inputs, and at additional project-specific points.
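To show how a bulk parameter relates to the two-dimensional spectrum, the sketch below integrates a spectrum over direction and frequency and applies the standard relation Hs = 4·sqrt(m0). It assumes the spectrum is stored in m²/Hz/rad on equal-width direction bins; the units and layout of the hindcast spectral files should be checked before applying it to real data. The flat test spectrum is made up for the example.

```python
import math


def significant_wave_height(spectrum, frequencies, n_dir):
    """Hs = 4 * sqrt(m0), with m0 the integral of E(f, theta) over f and theta.

    spectrum[i][j] -- energy density at frequencies[i] and direction bin j,
                      assumed here to be in m^2 / Hz / rad
    """
    d_theta = 2.0 * math.pi / n_dir
    # Integrate over direction first (equal-width bins) ...
    e_freq = [sum(row) * d_theta for row in spectrum]
    # ... then over frequency with the trapezoidal rule (unequal spacing).
    m0 = 0.0
    for i in range(len(frequencies) - 1):
        df = frequencies[i + 1] - frequencies[i]
        m0 += 0.5 * (e_freq[i] + e_freq[i + 1]) * df
    return 4.0 * math.sqrt(m0)


# Made-up flat spectrum, scaled so that the total variance m0 is 0.25 m^2.
n_freq, n_dir = 29, 24
ratio = (0.5 / 0.038) ** (1.0 / (n_freq - 1))
freqs = [0.038 * ratio ** i for i in range(n_freq)]
level = 0.25 / (2.0 * math.pi * (freqs[-1] - freqs[0]))
flat = [[level] * n_dir for _ in range(n_freq)]
hs = significant_wave_height(flat, freqs, n_dir)
print(f"Hs = {hs:.2f} m")
```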
The hindcast model is run routinely on approximately the third day of each month via a Rose/Cylc suite that manages job scheduling, module order and dependencies, and computational resources (Oliver et al., 2019). The module linkages and dependencies are shown in Figure 3. The suite commences by creating all the required directories if they do not already exist (initialize). Wind and ice GRIB (.grb) files are downloaded, converted to netCDF and then into WWIII format (ww3_prnc). The pregenerated bathymetry grids are copied to the working directory, and the restart files are processed through the initial conditions preprocessor (ww3_strt). The model is run using two-way nesting (ww3_multi). Postprocessing produces netCDF files of bulk wave parameters (ww3_ounf) and point-based spectral output (ww3_ounp). The model takes approximately 2.5 hours to run using 150 CPUs and 512 GB of RAM on the National Computational Infrastructure (NCI), and a further 2 hours to complete postprocessing. A final check is provided by the production of plots, shown in Figure 4, which allow a visual inspection to confirm that the restart file was ingested correctly and that the wind was ingested correctly through to the end of the month. Finally, model data are archived, and restart files are produced ready for the following month's model run.
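The ordering of the monthly workflow can be sketched as a small dependency graph. This is not a Rose/Cylc configuration: the WWIII program names (ww3_prnc, ww3_strt, ww3_multi, ww3_ounf, ww3_ounp) come from the text, while the other task names and the exact dependencies are assumptions inferred from the description, since Figure 3 is not reproduced here.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks it must wait for (assumed dependencies).
suite = {
    "initialize": set(),
    "download_wind_ice": {"initialize"},
    "convert_to_netcdf": {"download_wind_ice"},
    "ww3_prnc": {"convert_to_netcdf"},
    "copy_grids": {"initialize"},
    "ww3_strt": {"copy_grids"},
    "ww3_multi": {"ww3_prnc", "ww3_strt"},
    "ww3_ounf": {"ww3_multi"},
    "ww3_ounp": {"ww3_multi"},
    "check_plots": {"ww3_ounf"},
    "archive": {"check_plots", "ww3_ounp"},
}

# Any order produced here respects every dependency listed above.
run_order = list(TopologicalSorter(suite).static_order())
print(" -> ".join(run_order))
```

Tasks without a mutual dependency, such as the two postprocessing steps, may appear in either order, which is what allows a scheduler to run them in parallel.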

| DATASET LOCATION AND FORMAT
The wave hindcast can currently be viewed as hourly image snapshots in the Pacific Ocean Portal under 'Ocean Monitoring' (Powers et al., 2019; http://oceanportal.spc.int/portal/ocean.html) and is used by Pacific National Meteorological Services for investigations into past events (Figure 5) and to generate wave climatological information (Figure 6).
There are some slight differences in the parameter variable names between data from January 1979 to May 2013 and from June 2013 to present due to different versions of WWIII. Differences between available data are shown in Table 2.

| DATA VALIDATION
For the first phase of the hindcast, validation was undertaken using all available satellite altimeter data, a selection of wave buoys operated by the United States National Data Buoy Center, and seven shorter term buoy deployments in the south Pacific. Hindcast accuracy is detailed in Durrant et al. (2014), who demonstrate good agreement with satellite altimetry and wave buoy observations in terms of bias and root mean squared error, concentrating mainly on significant wave height and peak period. An over-prediction of wave heights was found in the mid-latitudes in the earlier part of the hindcast, due to issues with CFSR during this time period. Model performance in shallow water, especially near coastlines, may be variable. This has been attributed to wind errors associated with the transition from land to sea (Xie et al., 2001; Chelton et al., 2004), a known bias in Ardhuin et al. (2010) for short fetches, and bathymetry and resolution limitations. Caution is advised when using wave hindcast data in water shallower than 40 m. The selected model physics and resolution, however, ensure the highest possible quality for a wave hindcast at these scales.
A more comprehensive validation of the hindcast in the Australian region was carried out by Hemer et al. (2017), who used the CAWCR wave hindcast to underpin an assessment of the renewable wave energy resource in the Australian region. They compared archived wave parameters from the Australian regional grid(s) with available satellite altimeter measured wave heights, and with wave heights, wave periods and inferred wave power from a national network of 37 in situ wave buoys. Although overall errors between hindcast and observed wave fields were low across the Australian grid, regional variations were evident. The notably lower agreement in the Great Barrier Reef region was attributed to poor resolution of the complex bathymetry there.
TABLE 2 Description of all the wave parameters saved into netCDF for each of the three grid resolutions
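The two error statistics named in the validation, bias and root mean squared error, can be computed as in the sketch below. The wave heights are made-up numbers for illustration only and are not taken from the hindcast or from any buoy record.

```python
import math


def bias(model, observed):
    """Mean of (model - observed); positive values indicate over-prediction."""
    return sum(m - o for m, o in zip(model, observed)) / len(model)


def rmse(model, observed):
    """Root mean squared difference between model and observations."""
    squared = sum((m - o) ** 2 for m, o in zip(model, observed))
    return math.sqrt(squared / len(model))


# Hypothetical co-located significant wave heights in metres.
hs_model = [2.1, 3.4, 1.8, 2.9]
hs_buoy = [2.0, 3.0, 2.0, 3.0]

print(f"bias = {bias(hs_model, hs_buoy):+.3f} m")
print(f"RMSE = {rmse(hs_model, hs_buoy):.3f} m")
```

In practice the model output would first be matched to each observation in time and space before these statistics are computed.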

| DATASET USE AND REUSE
The CAWCR wave hindcast has been used extensively to explore wave climate, utilizing both the bulk wave parameters (Marshall et al., 2015, 2018; Godoi et al., 2020) and spectral information (Echevarria et al., 2019). The derivation of wave climatological information has applications in determining the efficacy of wave energy resources (Amiruddin et al., 2019) and leads to the calculation of energy potential from wave data (Morim et al., 2016; Hemer et al., 2017). A wave climatology derived from the July 1979 to June 2014 hindcast record provides the basis for the Australian Wave Energy Atlas, which is available through Australia's Renewable Energy Mapping Infrastructure (http://www.nationalmap.gov.au/renewables; Hemer, Zieger, et al., 2018). The high-resolution nested grid around Australia was used to evaluate the effect of offshore reef systems on waves in the Great Barrier Reef (Gallop et al., 2014). The hindcast has also been used in network analysis studies (Greenslade et al., 2018). Furthermore, an ongoing task of the Coordinated Ocean Wave Climate Project (Hemer, Erikson, et al., 2018) is assessing the variance in wave climate across an ensemble of historical wave hindcasts and reanalyses, to which the CAWCR hindcast is a contribution.
FIGURE 2 Wave spectra output points. The top panel shows points that align with observations such as wave buoys and tide gauges, as well as points around Australia used in wave validation projects. The bottom panel shows the regularly spaced grid for both global and high-resolution coverage around Australia and the south west Pacific
The availability of spectral data has significant value in the development of regional wave models. By using the wave spectra as the boundary condition for a downscaled model, all wave frequencies and directions are included. The Secretariat of the Pacific Community uses the spectral data from the hindcast as boundary conditions for its downscaled wave models, which better resolve small island chains and archipelagos (e.g. Damlamian et al., 2015). The WACOP project has produced an extensive series of wave climatology reports using the hindcast for specific locations in the south west Pacific. Bulk wave statistical properties over the south west Pacific have also been derived and mapped in Trenham et al. (2013).
The hourly temporal resolution provides data for studies interested in particular historical events, for example, swell-driven inundation events in the Pacific (Hoeke et al., 2013; Smith and Juria, 2019).

| FURTHER WORK
The CAWCR hindcast has provided an excellent open dataset capable of underpinning many wind-wave focused studies in the Australian national interest. However, it does have shortcomings. New global wind datasets are now available that appear to outperform CFS products in their representation of marine wind extremes. It is now well established that the CFS products suffer from homogeneity issues in the Southern Hemisphere, which limits any application of the hindcast to trend analysis. Wave physics source term and model developments (e.g. Zieger et al., 2015; Stopa et al., 2016; Liu et al., 2019) are also available and could lead to improved results. Given the broad application of the wave hindcast in support of national interests, it is important to look towards the necessary developments of the hindcast, ensuring that an open wave dataset remains available, state-of-the-art and fit-for-purpose for the required applications.