Description
Equatorial and low-latitude Spread-F (SF) can severely affect radio communication and navigation systems, making timely identification and prediction important for space weather monitoring and operations. Although artificial intelligence (AI) has shown strong potential for automating ionogram analysis, broader progress has been limited by the lack of large-scale, well-annotated, and publicly available datasets. Traditional manual ionogram scaling and interpretation are labor-intensive and subject to inconsistencies, especially for long-term observations and borderline cases.
Here we present a long-term ionogram dataset from the Hainan station (19.5°N, 109.1°E), covering 2002–2016 and spanning an entire solar cycle. The archive includes more than 517,000 expert-labeled raw ionograms. Building on this observational foundation, we established a standardized human-in-the-loop active-learning workflow to improve labeling consistency and scalability. The result is a curated public dataset of 150,000 high-quality images, balanced across five classes (frequency SF, range SF, mixed SF, strong range SF, and non-SF) with 30,000 samples per class. The released dataset provides a practical benchmark for AI-based ionogram analysis and supports the development of scalable methods for future large-volume observations.
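The abstract does not spell out the selection logic inside the human-in-the-loop workflow; a minimal sketch of one common active-learning step (uncertainty sampling routed to expert review, plus a per-class cap matching the 30,000-sample balance) might look like the following. All function and class names here are hypothetical illustrations, not the authors' actual pipeline.

```python
import numpy as np

# Hypothetical class names mirroring the five released categories.
CLASSES = ["frequency_SF", "range_SF", "mixed_SF", "strong_range_SF", "non_SF"]
PER_CLASS_CAP = 30_000  # target size of each class in the released dataset

def select_for_review(probs: np.ndarray, batch_size: int) -> np.ndarray:
    """Pick the ionograms the current model is least confident about.

    probs: (n_samples, n_classes) softmax outputs of the current model.
    Returns indices of the `batch_size` most uncertain samples, which
    would then be routed to human experts (the human-in-the-loop step).
    """
    confidence = probs.max(axis=1)              # top-1 probability per ionogram
    return np.argsort(confidence)[:batch_size]  # least confident first

def cap_per_class(labels: list[str], cap: int = PER_CLASS_CAP) -> list[int]:
    """Keep at most `cap` samples of each class to build a balanced subset."""
    kept, counts = [], {c: 0 for c in CLASSES}
    for i, lab in enumerate(labels):
        if counts[lab] < cap:
            counts[lab] += 1
            kept.append(i)
    return kept
```

After each expert-labeling round, the model is retrained and the cycle repeats, which is what makes the curation scalable to archives of this size.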
We further illustrate the value of this dataset through three published AI studies based on the Hainan observations: (1) real-time automated detection and subtype classification of SF using image-based deep learning; (2) nowcasting of ionogram sequence evolution using a spatio-temporal ConvGRU framework; and (3) physically constrained generative modeling of quiet and disturbed ionospheric features using IonoGAN. Together, these studies demonstrate that the Hainan dataset can support diverse downstream tasks, from automated event identification to short-term prediction, and can serve as a data foundation for AI-enabled space weather research and operations.
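The ConvGRU nowcasting study treats an ionogram sequence as a spatio-temporal field. The paper's architecture is not reproduced here, but the core recurrence it builds on can be sketched for a single-channel 2-D field with hypothetical 3×3 kernels (a naive convolution standing in for a learned convolutional layer):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def conv2d_same(x, kernel):
    """Naive 'same'-padded 2-D correlation. x: (H, W), kernel: (kh, kw)."""
    kh, kw = kernel.shape
    xp = np.pad(x, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.empty(x.shape, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * kernel)
    return out

def convgru_step(x, h, k):
    """One ConvGRU update: GRU gate equations with convolutions in
    place of dense matrix products.

    x: input frame (H, W); h: previous hidden state (H, W);
    k: dict of 3x3 kernels for the update gate (z), reset gate (r),
    and candidate state (n) -- all hypothetical, untrained parameters.
    """
    z = sigmoid(conv2d_same(x, k["zx"]) + conv2d_same(h, k["zh"]))   # update gate
    r = sigmoid(conv2d_same(x, k["rx"]) + conv2d_same(h, k["rh"]))   # reset gate
    n = np.tanh(conv2d_same(x, k["nx"]) + conv2d_same(r * h, k["nh"]))  # candidate
    return (1 - z) * h + z * n  # blend old state and candidate
```

Stacking such cells over time and feeding each output back as the next hidden state is what lets the model extrapolate ionogram evolution a few frames ahead.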