Advanced tools for guiding data-led research processes of Upper-Atmospheric phenomena
Link to the data service, IUGONET Type-A: http://search.iugonet.org/
This data service is based on a metadata database. It was developed by an inter-university project in Japan; thus, the data sets registered to this data service are distributed across many universities and institutes in Japan. Examples of the repositories where the data sets are located are as follows:
- ERG Science Center, Nagoya University: https://ergsc.isee.nagoya-u.ac.jp/data_info/ground.shtml.en
- World Data Center for Geomagnetism, Kyoto University: https://wdc.kugi.kyoto-u.ac.jp/wdc/Sec3.html
- RISH Data Server, Kyoto University: http://database.rish.kyoto-u.ac.jp/arch/
- Astronomical Observatory, Kyoto University: https://www.hida.kyoto-u.ac.jp/
- i-SPES, Kyushu University: http://data.i-spes.kyushu-u.ac.jp/
- National Institute of Polar Research: http://iugonet0.nipr.ac.jp/data/
More than 1,200 data sets have been registered to this data service. Some of the data sets have DOIs, while others do not.
This paper presents tools that help researchers implement the processes of data-led studies of upper-atmospheric phenomena. These tools were developed as a part of the activities of the Inter-university Upper atmosphere Global Observation NETwork (IUGONET) of Japan, which is a project to develop infrastructure for upper-atmospheric research data. This paper focuses on the data service named IUGONET Type-A, which was launched in October 2016 and has since evolved. In addition to being a conventional metadata catalogue, it has many other useful functions: an easy cross-searching system, a quick-look data-plotting procedure, an interactive data visualization system named UDAS web, and strong linkage with analysis software. Users can pick up relevant data from a huge number of data sets using either lists categorized by instruments/projects, observed regions and special campaigns or a world map of observatories. Users can quickly find the time, location and nature of phenomena that occurred by comparing the quick-look plots of various data displayed by the browser. UDAS web allows researchers to interactively create stacked plots of various data types that can facilitate the understanding of the relationships among phenomena observed in different regions. Furthermore, it presents a command list for software dedicated to data analysis that can smoothly lead users to perform detailed analyses. IUGONET Type-A provides a one-stop data service that can assist users in searching, examining and comprehending data for advanced analysis. It is also capable of handling old data, including analogue data and written paper documents. Thus, it will provide useful support for innovative interdisciplinary scientific research on solar–terrestrial phenomena.
1 INTRODUCTION
Earth's upper atmosphere, which encompasses a spacious volume at altitudes of approximately 50–500 km, is a complex system comprising multiple layers, namely the mesosphere, thermosphere and ionosphere. These layers have different atmospheric characteristics (e.g. density, temperature and pressure) and different forms of interaction. The various phenomena observed in these layers are generated by energy inputs both from higher regions (e.g. solar radiation, solar wind and particle precipitation from the magnetosphere) and from the lower atmosphere (e.g. atmospheric waves). To understand the various upper-atmospheric phenomena and predict the long-term variation of this region, it is necessary to conduct a comprehensive analysis of data obtained from multiple regions between the Earth's surface and the solar surface.
Data sets used in upper-atmospheric research generally have the following four characteristics in common. (1) Multi-instrument: Data are obtained using various types of measurement instruments, and they cover various physical phenomena, such as plasma, neutral gas and electric and magnetic fields (see Figure 1 of Yatagai et al., 2014). (2) Multi-region: Data are obtained from multiple regions, such as the solar surface; interplanetary space; and Earth's magnetosphere, ionosphere, thermosphere, mesosphere, stratosphere, troposphere and ground. As information on horizontal circulation is also essential for the upper atmosphere, data from the polar regions; auroral zones; and mid-latitude, low-latitude and equatorial regions are also analysed. They are obtained from both ground-based networks and satellite (in situ) observations. (3) Multi-institution and multi-mission: Data are obtained by various institutions and missions. Thus, they are distributed across various universities and institutes, highlighting the importance of international collaboration in upper-atmospheric research. (4) Long-term: Analysis of long-term monitoring data is often required for understanding and predicting the upper atmosphere. For example, the characteristic length of one solar cycle is approximately 11 years. Therefore, data from a time span longer than 11 years are needed to investigate the solar cycle dependence of upper-atmospheric phenomena. As of December 2021, the oldest data registered to our data service are the geomagnetic field data recorded at the Saint Maur Geomagnetic Observatory, France, in 1883.
To accelerate the sharing of such upper-atmospheric data, which have been archived by many Japanese universities and institutes, and to promote interdisciplinary studies, the Inter-university Upper atmosphere Global Observation NETwork (IUGONET) project was established in 2009 as a Japanese inter-university project (Hayashi et al., 2013). This project has developed certain products, including data analysis software and a metadata database, to assist researchers in the upper-atmospheric research field. The data analysis software of IUGONET was developed based on the Space Physics Environment Data Analysis Software (SPEDAS) (Angelopoulos et al., 2019; http://spedas.org/). The IUGONET metadata database was developed for the purpose of cross-searching data distributed across many Japanese institutions and for providing users with information about the data (i.e. metadata). The first system of the metadata database was based on an open-source repository, DSpace (https://duraspace.org/dspace/), which is used mainly by university libraries to archive and promote scientific output, and the web-based service began operation in 2010 (Abe et al., 2014; Hayashi et al., 2013). It had a simple user interface that provided users with textual metadata, including a description of the data, location of the data, contact persons and data use policy.
The first version of the data service was useful for researchers in terms of locating various data sets and obtaining information about the data; however, it did not have tools to help conduct research. Researchers could use the original IUGONET system to reach the data sets that they were seeking, but it was difficult for them to proceed to performing data analysis using only the metadata catalogue. Moreover, using the dedicated SPEDAS was difficult because this system had no direct connection with the data service. To meet the users' requirement for the ability to analyse data as soon as a data set is collected, we developed a new metadata database and released a data service called ‘IUGONET Type-A’ in October 2016. The top page of the IUGONET Type-A data service website is shown in Figure 1 (http://search.iugonet.org). It provides not only the conventional services included in the first system (i.e. the metadata catalogue) but also new services, namely a quick-look (QL) plot display of various data types, an interactive data visualization tool named ‘UDAS web’ and a smooth linkage with the dedicated SPEDAS.
2 STRATEGY TO SUPPORT SCIENTIFIC RESEARCH
Figure 2 shows the typical workflow for research on upper-atmospheric phenomena. (1) Search: If researchers have already found an event of interest in a specific data set, they usually search for other data sets related to the event to enhance their understanding. If they have no specific event to study, they generally search for interesting events from the QL data plots. (2) Know: Users obtain information about the data to be analysed (i.e. metadata), for example, what is represented by the data, who owns the data, from where the data file can be downloaded, when the data were recorded, how the data were obtained, and which data policy applies. A QL data plot is also useful for determining the characteristics of the data. (3) Examine: Researchers often create a stacked plot of various data sets for comparison purposes and for finding events having physical connections. (4) Advance: Researchers who wish to perform in-depth analysis can learn how to analyse the data using the dedicated software (in this case, SPEDAS). (5) Analyse: Users analyse the data with the dedicated software using various analysis methods that depend on the characteristics of each data set. (6) Create: Users can then create high-quality figures for publication in a journal.
Our strategy to help researchers perform complicated multiparameter data analysis is to seamlessly connect all six procedures listed above. As illustrated in Figure 2, the new IUGONET Type-A data service is responsible for processes (1)–(4), while the dedicated SPEDAS is used for processes (5) and (6). To help with research processes (1)–(4), IUGONET Type-A has the following properties: (A) The system enables researchers in various research fields to quickly find data sets specific to their analysis from among many data sets. (B) The service provides QL plots together with metadata to help researchers understand the characteristics of each data set. (C) Multiple data sets can be interactively visualized in arbitrary combinations to compare them. (D) Researchers are given information explaining how to plot and analyse data using the dedicated analysis software. We implemented these features by incorporating some new functions into IUGONET Type-A, such as the user interfaces for easy searching, QL display of data plots, interactive data visualization and illustration of how to visualize and analyse data.
3 NEW DATA SERVICE (IUGONET TYPE-A)
3.1 Example of research workflow using IUGONET Type-A
Figure 3 shows the page progression of the IUGONET Type-A website. The website consists of the search page (top page) (a), search result page (b), metadata display page (c and d) and UDAS web page (e). There are two options for the search page (Figures 1 and 3a): one allows users to select data sets from a list compiled by IUGONET, and the other allows the selection of data sets from a world map. These user interfaces were newly added to IUGONET Type-A. The first data service based on DSpace required users to input free-form keywords to narrow down their search results; however, it was often difficult for researchers in various research fields to specify adequate keywords because there were too many types of data sets (more than 1,200 as of December 2021). The list mode enables easy selection of keywords from lists categorized by instruments/projects, observed regions and special observation campaigns. The categorized lists are determined in advance; however, they can be changed by the administrator. The world map mode shows observatory locations on a global map, and it leads users to data sets obtained at specific observation sites. Figure 4 illustrates the user interface for the world map mode. In this figure, the observatories of MAGDAS/CPMN (Yumoto et al., 1996; Yumoto et al., 2001; Yumoto et al., 2006; Yumoto et al., 2007) are marked on the map, and some simplified metadata relating to geomagnetic field data obtained at the Davao observatory in the Philippines are displayed. When the displayed link is clicked, the metadata display page opens (Figure 3c,d) to reveal additional detailed information. It should be noted that in the world map mode, users cannot specify the date and can select only one station at a time to show the simplified metadata.
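The list-mode search described above can be illustrated with a minimal sketch of category-based filtering. The record layout, the function name and the sample entries are hypothetical simplifications introduced here for illustration; they do not reproduce the IUGONET metadata schema or its implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSetEntry:
    """Hypothetical, simplified catalogue entry (not the real IUGONET schema)."""
    name: str
    instrument: str
    region: str
    campaigns: frozenset = frozenset()


def search_by_category(entries, instrument=None, region=None, campaign=None):
    """Return the entries that match every selected category.

    A category left as None is treated as "no restriction", which mimics
    picking keywords from predefined lists instead of typing free text.
    """
    results = []
    for entry in entries:
        if instrument is not None and entry.instrument != instrument:
            continue
        if region is not None and entry.region != region:
            continue
        if campaign is not None and campaign not in entry.campaigns:
            continue
        results.append(entry)
    return results


# Illustrative entries only; the category labels are assumptions.
CATALOGUE = [
    DataSetEntry("Davao geomagnetic field", "magnetometer", "low latitude"),
    DataSetEntry("Shigaraki MU radar wind", "radar", "mid latitude"),
    DataSetEntry("EAR troposphere wind", "radar", "equatorial"),
]

radar_sets = search_by_category(CATALOGUE, instrument="radar")
print([entry.name for entry in radar_sets])
```

Because the category values come from fixed lists, a user cannot enter a keyword that matches nothing by accident, which is the difficulty the free-form search of the first system posed.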
The search result page has two display modes (Figure 3b and Figure 5): a text-list mode and a QL-plot mode. The text-list mode shows the search results as a list of data sets presented as text. The QL-plot mode exhibits the search results as QL plots of time-series data (created in advance by SPEDAS) and/or raw images acquired using observational instruments at the observatory (e.g. imagers and telescopes). The QL-plot mode of the search result page is a unique function of IUGONET Type-A. The display of multiple QL plots for various data sets can help researchers in various research fields to find data sets suited for their analysis. With the text-list mode only, it is difficult for users to distinguish data sets of interest from search results with many entries. In addition, the QL-plot mode can be used to quickly find phenomena of interest by visual inspection. Figure 5 shows an example of the QL-plot display of the search results returned by specifying the time interval of 26 September to 2 October 2012. These plots show various data sets acquired during a geomagnetic storm. From the top to the bottom, this figure presents the following data sets: the zonal mean air temperature observed by the COSMIC satellite (Tsuda et al., 2011), H-alpha and continuum solar images obtained using the SMART/T1 and SMART/T3 telescopes (Ishii et al., 2013; Ueno et al., 2004), geomagnetic indices from the World Data Center for Geomagnetism in Kyoto and geomagnetic field data. The time interval is set to 7 days by default for all QL plots. The time interval can be changed to 1 or 3 days to find upper-atmospheric phenomena with various time scales. For the raw images, the images acquired on the final day of the selected interval are displayed. The common time interval for all the QL plots allows users to compare data and find the occurrence time, the location and the nature of the event to be analysed.
The geomagnetic storm and strong auroral activity during the period from 30 September to 1 October can be found in the plots of the Dst, SYM/ASY and AE indices (middle, right and left plots in the third row in Figure 5, respectively). The Dst and SYM/ASY indices are indicators of the intensity of the geomagnetic storm, and the AE index is an indicator of auroral activity (Sugiura, 1964; Iyemori et al., 2010; Davis & Sugiura, 1966; World Data Center for Geomagnetism, Kyoto et al., 2015a; World Data Center for Geomagnetism, Kyoto et al., 2015b). Users can quickly see the presence of several sunspots on the solar disk (second row in Figure 5) and the effect of the geomagnetic storm on the geomagnetic field observed all over the world (bottom row in Figure 5).
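The time-interval rule for the QL plots (7 days by default, optionally 1 or 3 days, with raw images taken from the final day) can be sketched as follows. The sketch assumes that the interval is anchored at its first day and counted in whole, inclusive days; the function names are hypothetical and are not part of IUGONET Type-A.

```python
from datetime import date, timedelta

ALLOWED_SPANS_DAYS = (1, 3, 7)  # interval lengths mentioned in the text
DEFAULT_SPAN_DAYS = 7


def ql_window(first_day, span_days=DEFAULT_SPAN_DAYS):
    """Return (first_day, last_day), both inclusive, for a QL-plot interval."""
    if span_days not in ALLOWED_SPANS_DAYS:
        raise ValueError(f"span_days must be one of {ALLOWED_SPANS_DAYS}")
    last_day = first_day + timedelta(days=span_days - 1)
    return first_day, last_day


def raw_image_day(first_day, span_days=DEFAULT_SPAN_DAYS):
    """Raw images are shown for the final day of the selected interval."""
    return ql_window(first_day, span_days)[1]


# The interval of Figure 5: a 7-day window starting on 26 September 2012.
window = ql_window(date(2012, 9, 26))
print(window[0].isoformat(), "to", window[1].isoformat())
print("raw images from", raw_image_day(date(2012, 9, 26)).isoformat())
```

Applying one window to every data set is what makes the plots directly comparable: an event seen in one panel can be looked up at the same horizontal position in all the others.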
If users wish to obtain information about the data, they can move to the metadata display page (Figure 3c and d) by clicking on a QL plot on the search result page. Figure 6 shows the metadata display page of lower stratosphere and troposphere data taken by the Equatorial Atmosphere Radar (EAR) (Fukao et al., 2003). The metadata display page exhibits detailed information about the data, such as a description of the data (Description), the data use policy (Acknowledgement), the contact persons (Contact) and the location of data files (Access Information). Since these metadata are based on the Space Physics Archive Search and Extract (SPASE) data model (King et al., 2010), which is a standard metadata format in the space and solar physics community, they are interoperable with other data services using the SPASE format. The SPASE format supports old records, including analogue data, written paper documents and photographic papers; thus, the IUGONET Type-A can handle the long-term monitoring data of the upper atmosphere. In addition to such metadata, the QL-plot display on the metadata display page provides researchers with information on the type of data set, for example, scalars, vectors or two-dimensional data. Users can understand what the QL-plot represents with the help of the metadata. From this figure, it is inferred that the geomagnetic storm did not generate any significant influence on the lower stratosphere and troposphere in the altitude range 1–20 km. Additional detailed information on the data can be obtained from the dedicated website, as shown in the ‘Access Information’ section.
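The four items shown on the metadata display page can be pictured as a small record. The following sketch is only a simplified stand-in: the class, its field names and the sample values are assumptions made for illustration and do not reproduce the SPASE data model or the IUGONET database.

```python
from dataclasses import dataclass


@dataclass
class MetadataRecord:
    """Hypothetical record mirroring the four items named in the text."""
    description: str
    acknowledgement: str
    contact: str
    access_information: str


def render(record):
    """Format the record using the section titles of the metadata display page."""
    sections = [
        ("Description", record.description),
        ("Acknowledgement", record.acknowledgement),
        ("Contact", record.contact),
        ("Access Information", record.access_information),
    ]
    return "\n".join(f"{title}: {text}" for title, text in sections)


# Placeholder values only; they are not taken from any real metadata file.
example = MetadataRecord(
    description="Radar observations of the troposphere and lower stratosphere",
    acknowledgement="Data use policy text goes here",
    contact="Name of the contact person goes here",
    access_information="https://example.org/data/",
)
print(render(example))
```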
3.2 Linkage to advanced data analysis
For the purpose of obtaining prompt scientific output, IUGONET Type-A has a reinforced linkage with the data analysis software. The dedicated SPEDAS for upper-atmospheric data is grassroots open-source data analysis software for the space physics community, developed by scientists and programmers of the Space Sciences Laboratory at the University of California, Berkeley; Institute of Geophysics and Planetary Physics at the University of California, Los Angeles; and other institutions (Angelopoulos et al., 2019). This software is written in the Interactive Data Language (IDL), which is a programming language used widely within the solar–terrestrial physics (STP) community. It enables users to easily download, visualize and analyse data and to create high-quality figures for academic papers. In addition, SPEDAS supports comprehensive analysis of multi-instrument, multi-regional and multi-mission data by creating stacked plots of various time-series data, overlaying plots of images and two-dimensional data on a world map and performing inter-correlation analysis. We have provided plug-in software for SPEDAS called iUgonet Data Analysis Software (UDAS), which enables IUGONET data to be loaded onto the SPEDAS platform. The details of SPEDAS and UDAS are described by Angelopoulos et al. (2019) and Tanaka et al. (2013), respectively.
For the ‘examine’ research process (no. 3 in Figure 2), an interactive data visualization function, named UDAS web, was added to IUGONET Type-A. Many researchers create stacked plots of various data sets to examine the physical relationships among them. UDAS web (Figures 3e and 7) allows researchers to interactively create a stacked plot of multiple data sets in the web browser by freely selecting data sets, parameters and time intervals. It executes SPEDAS commands on the backend IDL with several input parameters and returns the plot image. With UDAS web, users do not need to set up an analysis environment on their own computer; instead, they can plot data by following simple prompts in the web browser, that is, by selecting the input time interval and parameters and clicking on the ‘Plot’ button. The data files are not downloaded onto the user's computer but remain on the IUGONET web server, which means that UDAS web is also available on smartphones and tablets with an internet connection. Figure 7 shows an example of a stacked plot created with UDAS web, in which the horizontal wind velocity in the mesosphere and the lower thermosphere observed by the Middle and Upper Atmosphere Radar at Shigaraki (Kato et al., 1984) and the Dst index are plotted for the period from 00 UTC on 30 September to 00 UTC on 3 October 2012. With a figure like this, users can examine the relationships among data obtained from different regions, such as the mesosphere, lower thermosphere and magnetosphere. Furthermore, UDAS web provides options to export a PostScript file of the plot or an ASCII file of the data. At present, the ASCII converter works only when the original file format is Common Data Format (CDF, https://cdf.gsfc.nasa.gov/).
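The core of a stacked plot is that every panel is restricted to the same time interval. The sketch below illustrates this step only; it is not the UDAS web implementation, which runs SPEDAS on a backend IDL. The function names are hypothetical and the data values are synthetic.

```python
from datetime import datetime, timedelta, timezone


def clip_to_interval(series, start, end):
    """Keep the (time, value) samples with start <= time < end."""
    return [(t, v) for t, v in series if start <= t < end]


def build_stack(panels, start, end):
    """Clip every panel to one shared interval so the time axes line up."""
    return {
        "time_range": (start, end),
        "panels": {name: clip_to_interval(s, start, end) for name, s in panels.items()},
    }


def synthetic_hourly(first_time, n_hours):
    """Synthetic hourly samples used only to exercise the sketch."""
    return [(first_time + timedelta(hours=h), float(h % 24)) for h in range(n_hours)]


data_start = datetime(2012, 9, 29, tzinfo=timezone.utc)
panels = {
    "wind (synthetic)": synthetic_hourly(data_start, 144),
    "Dst (synthetic)": synthetic_hourly(data_start, 144),
}

# The interval of Figure 7: 00 UTC on 30 September to 00 UTC on 3 October 2012.
start = datetime(2012, 9, 30, tzinfo=timezone.utc)
end = datetime(2012, 10, 3, tzinfo=timezone.utc)
stack = build_stack(panels, start, end)
for name, samples in stack["panels"].items():
    print(name, len(samples), "samples")
```

Treating the interval as half-open (the end time is excluded) avoids counting the boundary sample twice when consecutive intervals are plotted.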
For the ‘advance’ process (no. 4 in Figure 2), IUGONET Type-A provides researchers with data visualization and analysis procedures for the dedicated SPEDAS. The item ‘How to Plot’ in the metadata display page (right-hand side of Figure 6) shows the commands used for loading and visualizing data with SPEDAS. Users interested in a data set can learn the SPEDAS commands and proceed to more advanced data analysis. Users can create the same type of plot as the QL plot shown in Figure 6 by copying the commands shown in ‘How to Plot (SPEDAS-CUI #Advanced)’ and pasting and running them in SPEDAS, after which they can move on promptly to scientific discussion. This function allows step-by-step creation of stacked plots of various data types and thus promotes interdisciplinary study, which is the goal of the IUGONET project. The item ‘How to Plot (SPEDAS-GUI)’ indicates the procedure used to visualize the data with the graphical user interface (GUI) of SPEDAS, which helps beginners in the use of SPEDAS and IDL to analyse data via the GUI.
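To give an impression of what such a command list might look like, the following sketch assembles a short block of IDL command text. It is an illustration under stated assumptions, not the command list actually displayed by IUGONET Type-A: `timespan` and `tplot` are standard SPEDAS procedures, whereas the load step is left as a commented placeholder and the tplot variable names `wind_var` and `dst_var` are hypothetical.

```python
from datetime import datetime


def spedas_plot_commands(start, n_days, load_lines, tplot_names):
    """Assemble IDL command text: set the time span, load data, then plot.

    load_lines is supplied by the caller because the load routine differs
    for each data set; this sketch does not guess any routine names.
    """
    t0 = start.strftime("%Y-%m-%d/%H:%M:%S")
    names = ", ".join(f"'{name}'" for name in tplot_names)
    lines = [f"timespan, '{t0}', {n_days}, /days"]
    lines.extend(load_lines)
    lines.append(f"tplot, [{names}]")
    return "\n".join(lines)


commands = spedas_plot_commands(
    start=datetime(2012, 9, 30),
    n_days=3,
    # Placeholder written as an IDL comment; replace with the real load call.
    load_lines=["; <load routine for each data set goes here>"],
    tplot_names=["wind_var", "dst_var"],  # hypothetical variable names
)
print(commands)
```

Keeping the time span, the load step and the plot call on separate lines reflects the step-by-step character of the procedure: a user can add one data set at a time and replot.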
Although some other websites in the STP field, such as the Coordinated Data Analysis Web (CDAWeb, https://cdaweb.gsfc.nasa.gov/), enable similar interactive data visualizations without any installation of special software, IUGONET Type-A uniquely provides a smooth, step-by-step linkage between searching for data, obtaining detailed information about the data and examining the data using UDAS web.
4 SUMMARY AND FUTURE PERSPECTIVES
This paper introduced some useful tools for upper-atmospheric research that were developed as part of the IUGONET project, namely the IUGONET Type-A data service. The most remarkable advantage of IUGONET Type-A is its smooth connection between typical STP research processes and the production of effective research outcomes. This ‘one-stop’ data analysis system represents a convenient tool with which researchers can search for data sets of interest, find information on the data (via metadata and QL plots), identify events of interest, create multiparameter data plots interactively and conduct detailed analyses using SPEDAS. Some existing web services are specialized for certain research processes in the STP field; however, IUGONET Type-A can uniquely provide all such services.
One of the main goals of the IUGONET project is to clarify the mechanisms of long-term variations in the upper atmosphere (Hayashi et al., 2013). Therefore, it is necessary for our data service to be able to handle old data, including analogue data and written paper documents. The SPASE metadata format used for IUGONET Type-A supports a variety of data types including such old records. By searching, visualizing and comparing these valuable old records with our data service, we expect to gain new knowledge about the long-term variations in the upper atmosphere.
A practical data service such as IUGONET Type-A can assist not only research efforts but also educational activities. Moreover, international collaboration is essential in STP research. For example, some IUGONET members have been collaborating with universities and institutes in many countries to construct ground-based networks of magnetometers, imagers and radars. In this context, IUGONET Type-A and SPEDAS can facilitate the sharing and analysis of various data sets and contribute to the education of young researchers and the development of increased capacity in developing countries. During 2015–2019, we held several data analysis workshops for students and young scientists, particularly those from Asian and African countries, to demonstrate how IUGONET Type-A and SPEDAS can assist in the analysis of upper-atmospheric data. We believe that the development of IUGONET Type-A is an important step towards the production of future scientific output.
We believe that the combination of the data services and dedicated data analysis software will be useful to researchers in the STP field because this field is typified by various types of data, and the choice of data analysis method depends strongly on the characteristics of individual data sets. At this stage, it is difficult to incorporate many analysis procedures into one web tool, and independent, dedicated analysis software is often more suitable in terms of its flexibility. Nevertheless, we recognize that it is necessary to develop an integrated web tool that can support all STP research processes in the future.
AUTHOR CONTRIBUTIONS
Yoshimasa Tanaka: Data curation (equal); project administration (lead); software (equal); writing – original draft (lead). Norio Umemura: Data curation (equal); software (lead); writing – review and editing (supporting). Shuji Abe: Data curation (equal); software (lead); writing – review and editing (supporting). Atsuki Shinbori: Data curation (equal); software (equal); writing – review and editing (supporting). Satoru UeNo: Data curation (equal); software (supporting); writing – review and editing (supporting).
ACKNOWLEDGEMENTS
The production of this paper was supported by a subsidy from the NIPR. The IUGONET project was supported by the Special Educational Research Budget (Research Promotion) for FY2009 and by the Special Budget (Project) for FY2010-2014 from the Ministry of Education, Culture, Sports, Science and Technology, Japan. We appreciate the cooperation and generosity of the THEMIS Science Support Team in allowing us to use SPEDAS as our data analysis software, and for aiding in our development of the IUGONET plug-in software (UDAS). UDAS was developed in collaboration with the Energization and Radiation in Geospace Science Center (Miyoshi et al., 2012).
CONFLICTS OF INTEREST
The authors declare that they have no conflict of interest.
OPEN PRACTICES STATEMENT
This article has earned an Open Data badge for making publicly available the digitally shareable data necessary to reproduce the reported results. The data is available at https://doi.pangaea.de/10.1594/PANGAEA.936921. Learn more about the Open Practices badges from the Center for Open Science: https://osf.io/tvyxz/wiki