Non‐hydrostatic atmospheric cut cell model on a block‐structured mesh
Abstract
A block‐structured Cartesian mesh approach based on the Building‐Cube Method is implemented in a 2D non‐hydrostatic atmospheric cut cell model to obtain high near‐ground resolution using Cartesian coordinates. A simple flux‐matching algorithm is introduced that ensures mass conservation across varying grid resolutions in a subcycling time integration. Simple diffusion and advection tests show that the method produces sufficiently accurate solutions with high computational efficiency. The developed model successfully reproduces a flow over a semicircular mountain on a mesh locally refined around the mountain, and the result agrees well with that obtained on a uniformly fine mesh. Copyright © 2011 Royal Meteorological Society
1. Introduction
Cartesian coordinates have recently drawn attention as an attractive choice for high‐resolution atmospheric models that must handle steep slopes in mountainous areas. They avoid the errors caused by the slantwise orientation of grid lines in models using conventional terrain‐following coordinates (Thompson et al., 1985). A disadvantage of Cartesian coordinates is that high vertical resolution near the ground is expensive to attain over a wide range of topographic heights (Walko and Avissar, 2008). Horizontal grid intervals must also be closely spaced at steep slopes to achieve high near‐ground resolution.
A block‐structured Cartesian mesh approach named the Building‐Cube Method (BCM), developed by Nakahashi (2003), provides an efficient way of locally refining a Cartesian grid in regions requiring higher resolution. In this method, the flow field is divided into a number of subdomains called ‘cubes’, and each cube contains the same number of cells (Figure 1). The local computational resolution is therefore determined by the cube size, so locally refined Cartesian grids at arbitrary boundaries can be obtained by reducing the size of the cubes near those boundaries.

Schematic of (a) cubes and (b) cells around a cosine‐shaped mountain for 2D computation. In this case, all cubes have 8 × 8 cells
The two‐tiered data structure of cubes and cells gives the BCM several attractive features. The use of a uniform Cartesian mesh in each cube allows the direct reuse of any existing code written for Cartesian coordinates. Moreover, because all cubes contain the same number of cells, the computational load per cube is identical, which is advantageous for parallel computing. For example, Kim et al. (2007) investigated the parallel efficiency of the BCM using OpenMP and reported performance very close to ideal with 32 central processing units (CPUs). The BCM has been applied to several problems in computational fluid dynamics, including flow around an airfoil (e.g. Nakahashi et al., 2006; Kim et al., 2007) and around a circular cylinder (e.g. Takahashi et al., 2009). A similar approach designed for spherical coordinates was proposed by Oehmke and Stout (2001) and Oehmke (2004), and applied to advection problems on the sphere by Jablonowski et al. (2006, 2009).
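As a concrete illustration of this two‐tiered structure, the sketch below models a cube as a uniform Cartesian patch whose local resolution is set entirely by its physical size. All names and fields (`Cube`, `n_cells`, `level`) are hypothetical; this Python sketch only approximates the data layout described in the text, not the BCM code itself.

```python
import numpy as np

class Cube:
    """One block of a BCM-style mesh: a uniform n x n Cartesian patch.

    Every cube holds the same number of cells, so the local grid spacing
    is determined purely by the cube's physical size (illustrative only).
    """
    def __init__(self, x0, y0, size, n_cells, level):
        self.x0, self.y0 = x0, y0      # lower-left corner of the cube
        self.size = size               # physical edge length of the cube
        self.n = n_cells               # cells per side (same for all cubes)
        self.level = level             # refinement level (0 = coarsest)
        self.dx = size / n_cells       # local grid spacing
        self.q = np.zeros((n_cells, n_cells))  # a cell-centred field

    def cell_center(self, i, j):
        """Physical coordinates of the centre of cell (i, j)."""
        return (self.x0 + (i + 0.5) * self.dx,
                self.y0 + (j + 0.5) * self.dx)

# two cubes with equal cell counts but different physical sizes:
coarse = Cube(0.0, 0.0, size=1.0, n_cells=8, level=0)
fine = Cube(0.0, 0.0, size=0.5, n_cells=8, level=1)
```

Halving the cube size halves the local grid spacing, which is exactly how the BCM controls resolution.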
In this study, the BCM is modified and introduced into a 2D non‐hydrostatic atmospheric model based on a Cartesian cut cell grid, named ‘Sayaca‐2D’ (Yamazaki and Satomura, 2010). The main modifications lie in the time‐stepping procedure and the flux calculation at fine‐coarse cube boundaries. Because the original BCM (Nakahashi, 2003) uses the same time step in all cubes, the step must be small enough to stabilize the numerical integration on the finest cubes, and applying this small step everywhere adds unnecessary overhead on the coarse cubes. Our method instead uses larger time steps on the coarse cubes and subcycles the solvers on the fine cubes, similar to the method reported by Berger and Colella (1989), for example. Furthermore, a simple flux‐matching procedure for the BCM data structure is derived and applied at fine‐coarse cube boundaries to ensure global mass conservation. This Conserved Building‐Cube Method (CBCM) enables Sayaca‐2D to perform computationally efficient Cartesian‐grid simulations with both high near‐ground resolution and good conservation characteristics.
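The subcycled time stepping can be sketched as a short recursion: with a 2:1 size ratio between levels, each finer level takes two steps of half the length for every coarse step. In the sketch below, `step` is a hypothetical single‐cube solver and the flat `levels` map is an illustrative stand‐in for the cube hierarchy, not the model's actual structures.

```python
def advance(step, levels, level, dt):
    """Advance all cubes at `level` by dt, then subcycle the finer levels.

    `step(cube, dt)` is a hypothetical per-cube solver; `levels` maps
    refinement level -> list of cubes. With a 2:1 cube-size ratio,
    level l+1 takes two steps of dt/2 for every step taken at level l.
    """
    for cube in levels.get(level, []):
        step(cube, dt)
    if level + 1 in levels:
        advance(step, levels, level + 1, dt / 2)
        advance(step, levels, level + 1, dt / 2)

# three levels: the finest cube is stepped 4 times with dt/4
log = []
advance(lambda c, dt: log.append((c, dt)),
        {0: ["c"], 1: ["m"], 2: ["f"]}, level=0, dt=1.0)
```

Every level advances by the same total time, but the coarse cubes avoid the many small steps that stability on the finest cubes would otherwise force everywhere.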
Another difference between this study and Nakahashi (2003) lies in the surface boundary representation. Whereas Nakahashi (2003) uses a staircase representation of boundaries, here we use a cut cell method recently developed for and implemented in Sayaca‐2D. The cut cell method is based on a finite‐volume discretization and uses a cell‐merging approach (Yamazaki and Satomura, 2010). Note that it differs from the immersed boundary method based on a finite‐difference discretization that was used with the BCM in some earlier studies (e.g. Kamatsuchi, 2007).
Section 2 describes the basic CBCM design, including mesh generation, the time‐stepping procedure, parallelization and the flux‐matching algorithm. In Section 3, we demonstrate the advantages of CBCM through numerical examples of simple diffusion and advection problems. The performance of Sayaca‐2D with CBCM is then examined by comparing its simulated flow over a mountain with that of Sayaca‐2D using a uniformly fine mesh over the entire domain.
2. Numerical method
2.1. Overall procedure
Following the BCM (Nakahashi, 2003), the computational mesh of CBCM is generated in two steps: cube generation and cell generation within each cube. First, cubes are generated in a manner similar to the quadtree method (Berger, 1986), with cube size decreasing toward the terrain surface (Figure 1(a)). The size differences among cubes are adjusted to guarantee a 2:1 resolution ratio at fine‐coarse cube boundaries. Cubes located completely inside the topography are removed and excluded from the computation.
After cube generation, a Cartesian mesh with equal spacing and an equal number of cells is generated in each cube (Figure 1(b)). For cells in cubes that cross the topography, intersections with the terrain surface are checked. Whereas the BCM simply determines whether each cell is inside or outside a staircase boundary, here we identify cells cut by the topography and determine the intersections of the terrain surface with the cell faces. Small cut cells are then merged with neighboring cells to avoid severe time‐step restrictions from the Courant–Friedrichs–Lewy condition. Details of this cell‐merging procedure are given in Yamazaki and Satomura (2010). To prevent cell merging between cells belonging to cubes of different sizes, the size of the cubes near the terrain surface is readjusted, if necessary, so that both merged and non‐merged cut cells lie inside the finest region.
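The 2:1 size adjustment among cubes can be illustrated with a one‐dimensional sketch that refines (raises the level of) any cube whose neighbour is more than one level finer. The paper's quadtree version works analogously in 2D; the routine below is an assumed simplification of the procedure, not its actual implementation.

```python
def balance_2to1(levels):
    """Enforce a 2:1 size (level) ratio between adjacent cubes.

    1D sketch: `levels[i]` is the refinement level of the i-th cube along
    a line of cubes. Wherever neighbours differ by more than one level,
    the coarser cube is refined until the constraint holds everywhere.
    Illustrative only; the paper adjusts a 2D quadtree of cubes.
    """
    changed = True
    while changed:
        changed = False
        for i in range(len(levels) - 1):
            if levels[i + 1] - levels[i] > 1:
                levels[i] = levels[i + 1] - 1   # refine the left neighbour
                changed = True
            elif levels[i] - levels[i + 1] > 1:
                levels[i + 1] = levels[i] - 1   # refine the right neighbour
                changed = True
    return levels
```

Refinement near a fine cube can cascade outward, which is why the loop repeats until no further adjustment is needed.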
All flow calculations are performed in each cube independently. Information is transferred between adjacent cubes through ghost cells added beyond the boundary of each cube, which also allows merging of cells belonging to different finest cubes. Note that all ghost regions have the same resolution as the inner domain of their cube. Although first‐order interpolation is used, as in the BCM, for exchanging information between cubes of different sizes, the flux‐matching algorithm introduced in the following subsection ensures conservation across the varying grid resolutions.
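A minimal sketch of this exchange between cubes of different sizes: with a 2:1 ratio, first‐order (piecewise‐constant) prolongation from a coarse cube into a fine cube's ghost cells simply repeats each coarse value for the two fine cells it overlaps, and restriction in the opposite direction averages each pair of fine values. The function names and interfaces are hypothetical, not taken from the model.

```python
import numpy as np

def fill_fine_ghosts(coarse_edge):
    """First-order prolongation of a coarse cube's boundary column into a
    neighbouring fine cube's ghost cells: each coarse value is repeated
    for the two fine cells it overlaps (2:1 ratio assumed)."""
    return np.repeat(np.asarray(coarse_edge), 2)

def fill_coarse_ghosts(fine_edge):
    """Restriction the other way: each pair of fine boundary values is
    averaged onto the overlapping coarse ghost cell."""
    f = np.asarray(fine_edge, dtype=float)
    return 0.5 * (f[0::2] + f[1::2])
```

Prolongation and restriction of this kind are only first‐order accurate at the interface, which is why the flux‐matching correction of the next subsection is needed for conservation.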
The parallelization of CBCM is straightforward. Load balancing is achieved by distributing an equal number of cubes at each refinement level to each processor. To keep the load balanced in the presence of topography at the maximum refinement level, we compute solutions for all cells in the finest cubes and then set the values at cells located below the terrain surface to zero. The overhead of calculating values for these underground cells is sufficiently low because cells in completely underground cubes are excluded from the computation (Figure 1). For efficient parallelization, a sufficient number of cubes is needed to avoid an unequal distribution of cubes among processors, as demonstrated by Takahashi et al. (2009).
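The level‐wise load balancing can be sketched as a round‐robin assignment: distributing the cubes of each refinement level separately gives every processor the same per‐level cube count, up to rounding. The routine below is an illustrative reading of the text, not the model's actual scheduler.

```python
def assign_cubes(cubes_by_level, n_procs):
    """Round-robin the cubes of each refinement level over the processors.

    Because all cubes hold the same number of cells, cubes at the same
    level have equal cost, so an equal per-level count balances the load.
    `cubes_by_level` maps level -> list of cube identifiers (assumed form).
    """
    assignment = {p: [] for p in range(n_procs)}
    for level in sorted(cubes_by_level):
        for k, cube in enumerate(cubes_by_level[level]):
            assignment[k % n_procs].append((level, cube))
    return assignment
```

With many more cubes than processors, the rounding imbalance becomes negligible, consistent with the observation of Takahashi et al. (2009).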
2.2. Flux‐matching algorithm
To achieve conservation on a hierarchically refined grid, flux matching at fine‐coarse grid interfaces is imperative (e.g. Berger and Colella, 1989; Jablonowski et al., 2006). With a subcycling time‐stepping scheme, at a fine‐coarse grid interface the numerical flux on the coarse grid must equal the fluxes on the fine grid accumulated during the subcycling steps.
This process can be easily incorporated into the BCM framework by introducing the cube boundary flux as shown in Figure 2(a). Here we assume that the advected quantity is defined at the cell centers. If a cube has N × N cells, N/2 cube boundary fluxes are defined on each side of the cube boundary. These fluxes are used to correct the coarse cell flux to match the accumulated fine cell fluxes at fine‐coarse cube boundaries.

Cube boundary flux and flux‐matching algorithm. The advected quantity is defined at the cell centers. Solid and thick lines describe the boundaries of cells and cubes, respectively. Solid and thick arrows indicate the fluxes at cell boundaries and cube boundaries, respectively
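In spirit, the correction replaces the coarse‐face flux by the area‐ and time‐averaged fine‐face fluxes: with a 2:1 ratio, two fine faces of half the coarse‐face length contribute at each of the two subcycle steps. The sketch below is a schematic of this averaging under those assumptions, not the paper's exact discrete formulation.

```python
def matched_coarse_flux(fine_face_fluxes):
    """Area- and time-average the fine-face fluxes belonging to one
    coarse face, so that the mass crossing the interface is identical on
    both sides. `fine_face_fluxes[s]` holds the two fine-face flux
    densities at subcycle step s; each fine face spans half the coarse
    face and each substep spans 1/n_sub of the coarse time step.
    Schematic sketch only.
    """
    n_sub = len(fine_face_fluxes)
    total = 0.0
    for fluxes in fine_face_fluxes:
        total += 0.5 * sum(fluxes) / n_sub   # space average / time average
    return total
```

For a flux density that is uniform in space and time the matched value reduces to that density, while time‐varying fine fluxes are averaged over the subcycle, which is what guarantees conservation across the interface.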
3. Results
Settings and results of the heat diffusion experiment. (a) 2D computational domain. Solid lines are the cube boundaries. Thick line indicates the boundary of the initially heated region. (b) Simulated temperature field with the flux‐matching algorithm and (c) that without the flux‐matching algorithm. (d) Temperature field calculated from the analytical solution. The calculation time and contour interval in (b), (c) and (d) are 100 h and 0.05 K, respectively. (e) Time change of the mean temperature in the results in (b) and (c) normalized by that in the initial state. (f) Comparison of speed‐up ratio with increasing the number of threads
Figure 3(b) and (c) shows the simulated temperature fields with and without the flux‐matching algorithm described in Section 2.2, respectively. Compared with the analytical solution shown in Figure 3(d), the experiment with flux matching produces a sufficiently accurate solution, while the simulated amplitude without flux matching is much smaller than the analytical one. Figure 3(e) shows the time change of the mean temperature normalized by its initial value. The mean temperature remains constant to machine precision with the flux‐matching algorithm, while it decreases to less than 85% of the initial value after 100 h of integration without it. Finally, the speed‐up ratio is estimated to verify the scalability (Figure 3(f)). Here OpenMP is used to parallelize the code, and the computations are performed on a Linux PC with two 6‐core AMD Opteron CPUs. The parallel efficiency is about 80% at eight processors both with and without the flux‐matching algorithm, indicating that the matching procedure implemented here does not alter the parallel efficiency of the BCM algorithm. Although the current parallel performance is not as good as that described by Kim et al. (2007), this is a first attempt at parallelization, and the code still has room for performance improvement.
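The conservation monitor of Figure 3(e) amounts to an area‐weighted domain mean, which must account for the different cell sizes of the cubes. A possible form of such a diagnostic (the interface is an assumption, not taken from the model):

```python
import numpy as np

def domain_mean(field_cubes):
    """Area-weighted mean of a cell-centred field over cubes of different
    sizes: sum of (value * cell area) divided by the total area.
    Dividing by the value at t = 0 gives a normalized trace like
    Figure 3(e). `field_cubes` is a list of (values, dx) pairs."""
    total = sum(v.sum() * dx * dx for v, dx in field_cubes)
    area = sum(v.size * dx * dx for v, dx in field_cubes)
    return total / area
```

With exact flux matching this mean is constant in time to machine precision; without it, mass leaks at fine‐coarse interfaces and the trace drifts.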
Figure 4(a)–(d) shows four snapshots of the cosine bell at t = 0, L/2, L and 3L/2, from the bottom left to the top right of each figure, with refinement levels 0, 1, 2 and 4, respectively. Every test case uses 16 × 16 cells in each cube. Although the shape of the cosine bell is largely distorted in Figure 4(a) because of the relatively coarse resolution, the distortion is successfully reduced when refined regions are introduced. Moreover, no visible distortion of the cosine bell is observed as it passes over fine‐coarse cube boundaries. To quantify the errors caused by the abrupt resolution changes, time traces of the l2 and l∞ norms of the errors are shown in Figure 4(e) and (f), respectively. Even though the time trace of the l∞ norm shows small spikes when the cosine bell is transported over fine‐coarse cube boundaries, these results confirm that the benefit of the higher resolution within the finer cubes exceeds the errors caused by the abrupt resolution changes. Figure 4(g) shows the variation of the l1, l2 and l∞ norms of the errors with the number of cells on a side of each cube for the cases of Figure 4(a) and (d). These errors are calculated after one revolution. While the convergence rate of the l∞ norm in the case with four refinement levels drops slightly below that with no refinement, approximately second‐order convergence rates are found for the l1 and l2 norms in both cases. This result demonstrates that the first‐ and second‐order interpolations at the interfaces between cubes of different sizes preserve the global second‐order spatial accuracy of CBCM. In addition, the run time of the experiment in the case of Figure 4(d) is as low as about 72% of that required by the non‐subcycling calculation.
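The diagnostics used here are standard: discrete l1, l2 and l∞ error norms, and an observed convergence order computed from the errors at two resolutions. A generic implementation (not the paper's code) might look like:

```python
import numpy as np

def error_norms(q, q_exact):
    """Discrete l1, l2 and l-infinity norms of the error on a uniform
    grid, returned in that order."""
    e = np.abs(np.asarray(q) - np.asarray(q_exact))
    return e.mean(), np.sqrt((e ** 2).mean()), e.max()

def observed_order(err_coarse, err_fine, refinement=2.0):
    """Observed order of accuracy from errors at two resolutions that
    differ by `refinement`; a value near 2 indicates second-order
    convergence, as reported for the l1 and l2 norms."""
    return np.log(err_coarse / err_fine) / np.log(refinement)
```

Halving the grid spacing should reduce the error of a second‐order scheme by about a factor of four, which is exactly what the slope in a plot like Figure 4(g) measures.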

Results of the advection experiment. (a)–(d) Simulated cosine bell transported over cubes with refinement levels 0, 1, 2 and 4, respectively. Solid lines are the cube boundaries and each cube has 16 × 16 cells. Snapshots are taken at t = 0, L/2, L and 3L/2 from the bottom left to the top right of each figure, respectively. Thick lines and thick‐dashed lines indicate positive and negative values of the cosine bell, respectively. The contour interval is 0.2. (e) Time traces of the l2 and (f) l∞ norms of the errors of the cosine bell for different refinement levels. (g) Variations of l1, l2 and l∞ norms of the errors with the number of cells on a side of each cube. Solid lines and dashed lines indicate the variation of errors for cases (a) and (d), respectively. Thick line corresponds to the second‐order accurate convergence
Finally, a mountain wave experiment is performed with Sayaca‐2D and CBCM to demonstrate the performance of the method with cut cells in atmospheric applications. A semicircular mountain of radius r = 1 km is located 15 km from the left end of the lower boundary. A constant horizontal velocity, U = 10 m s−1, and a constant Brunt–Väisälä frequency, N = 0.01 s−1, are initially imposed over the entire domain. The lower and lateral boundary conditions are free‐slip and cyclic, respectively. The width of the domain is set to 200 km, large enough to prevent the cyclic lateral boundaries from contaminating the simulated results. The height of the domain is 30 km, and a sponge layer (Klemp and Lilly, 1978) is placed above 22.5 km to prevent gravity waves from reflecting off the rigid top boundary.
The cube configuration used in this experiment is shown in Figure 5(a). The total number of cubes is 144, with 20 × 20 cells in each cube. The minimum cell length near the mountain is 62.5 m and the maximum cell length in the far field is 500 m. The rest of the domain, not shown in Figure 5(a), is filled with the largest cubes. Figure 5(b) shows an enlarged view of the Cartesian mesh near the mountain, showing the cut cell representation of the terrain surface with cell merging.

Settings and results of the mountain wave experiment. (a) Cube boundaries and (b) the cell boundaries in a cube that intersects with the mountain surface. Thick lines in (b) indicate the boundaries of the merged cells. (c) Vertical velocity reproduced by Sayaca‐2D with CBCM using four different cell lengths: 500, 250, 125 and 62.5 m. (d) Vertical velocity reproduced by Sayaca‐2D with a uniformly fine mesh of 62.5 m. The integration time and contour interval in (c) and (d) are 1 h and 1 m s−1, respectively. Solid and dashed lines in (c) and (d) indicate positive and negative values, respectively
Figure 5(c) and (d) shows the vertical velocity fields over the semicircular mountain calculated by Sayaca‐2D with CBCM and with a uniform 62.5 m mesh over the entire domain, respectively. No visible distortion of the mountain wave pattern associated with the change of mesh resolution appears in Figure 5(c). Although the simulated amplitude in the coarse‐resolution region of Figure 5(c) is slightly smaller than that in Figure 5(d), Sayaca‐2D with CBCM reproduces practically the same result as the uniformly fine mesh while using only 3.75% as many cells. These results lead to the conclusion that Sayaca‐2D with CBCM reproduces a reasonably accurate flow over the mountain at a substantially lower computational cost, demonstrating the advantage of the CBCM combined with cut cells for flows over complex topography, including steep slopes.
4. Conclusion
To obtain high near‐ground resolution with high computational efficiency in Cartesian‐coordinate models, a block‐structured Cartesian mesh approach, the CBCM, was developed and implemented in the non‐hydrostatic atmospheric cut cell model Sayaca‐2D.
CBCM employs subcycled time steps to minimize the computational overhead of the time integration. In addition, the method ensures mass conservation at fine‐coarse grid interfaces through a simple flux‐matching algorithm. The results of simple diffusion and advection problems showed that the method has satisfactory conservation properties, high computational efficiency and global second‐order accuracy.
To demonstrate the performance of Sayaca‐2D with CBCM, a flow over a semicircular mountain was simulated using a mesh locally refined around the mountain. The model reproduced a smooth and accurate mountain wave comparable to that of the uniformly fine‐grid computation.
Acknowledgements
The authors thank Prof Tetsuya Takemi and Mr Ryuji Yoshida at the Disaster Prevention Research Institute, Kyoto University, for their support and helpful comments. The first author also thanks Dr Phillip Blakely and Mr Leif Denby at the Cavendish Laboratory, University of Cambridge, as well as two anonymous reviewers for their suggestions and discussions. This study was supported in part by a Grant‐in‐Aid for JSPS Fellows, for JSPS Bilateral Joint Research Project, for Scientific Research B‐22340137 of MEXT, Japan and for the Global COE Program GCOE‐ARS of Kyoto University. Some figures were drawn by the GFD‐DENNOU Library.




