24 April 2025
Stara Kotłownia
Europe/Warsaw timezone

Analysis and comparison of missing value imputation methods for atmospheric pollution data

24 Apr 2025, 10:45
30m
SK 04/05 (Stara Kotłownia)

Warsaw University of Technology, Main Campus

Speaker

Jakub Jasiński (Warsaw University of Technology)

Description

Missing values are common in real-world time series datasets and can significantly reduce the precision and reliability of data analysis and machine learning models. This research project discusses the ways in which missing data occur and tests and compares different methods for imputing them. The methods considered range from the simplest statistical approaches, through regression models and neural networks, to large language models (LLMs).
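As a rough illustration of the simplest end of that range, the sketch below fills gaps in a short series with the mean of the observed values and with linear interpolation. It is not taken from the talk: the function names and the sample values are hypothetical, and gaps are assumed to be marked with None.

```python
from statistics import mean
from typing import List, Optional

Series = List[Optional[float]]


def impute_mean(series: Series) -> List[float]:
    """Replace every gap with the mean of the observed values."""
    observed = [v for v in series if v is not None]
    if not observed:
        raise ValueError("series has no observed values")
    fill = mean(observed)
    return [fill if v is None else v for v in series]


def impute_linear(series: Series) -> List[float]:
    """Interpolate linearly between the nearest observed neighbours.

    Gaps at the start or end of the series copy the nearest observed value.
    """
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    out = list(series)
    # Leading and trailing gaps have only one neighbour, so copy it.
    for i in range(known[0]):
        out[i] = series[known[0]]
    for i in range(known[-1] + 1, len(series)):
        out[i] = series[known[-1]]
    # Interior gaps: straight line between the two surrounding observations.
    for left, right in zip(known, known[1:]):
        step = (series[right] - series[left]) / (right - left)
        for i in range(left + 1, right):
            out[i] = series[left] + step * (i - left)
    return out


# Made-up hourly readings for illustration only, not real measurements.
pm10 = [20.0, None, None, 26.0, 30.0, None]
print(impute_mean(pm10))
print(impute_linear(pm10))
```

Mean imputation ignores the time ordering entirely, while interpolation uses it but cannot reproduce peaks inside a gap; the more complex model families listed above are candidates for capturing such structure.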

The effectiveness of these imputation techniques will be measured on atmospheric pollution data, focusing primarily on PM10, PM2.5, SO2, and NO2 levels. Each method will be evaluated on accuracy, consistency, and its impact on subsequent predictive models.
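One common way to measure imputation accuracy, sketched here under assumptions, is to hide values that are actually known, impute them, and compare the results with the hidden truth. The abstract does not name its metrics or masking scheme; the mean absolute error (MAE), root mean squared error (RMSE), uniformly random masking, and all function names below are illustrative choices.

```python
import math
import random
from statistics import mean
from typing import Callable, List, Optional, Tuple

Series = List[Optional[float]]


def mask_at_random(series: Series, fraction: float, seed: int = 0) -> Tuple[Series, List[int]]:
    """Hide a fraction of the observed values; return the masked copy and hidden indices."""
    rng = random.Random(seed)
    observed = [i for i, v in enumerate(series) if v is not None]
    count = max(1, int(len(observed) * fraction))
    if count >= len(observed):
        raise ValueError("masking would leave no observed values")
    hidden = sorted(rng.sample(observed, count))
    masked = list(series)
    for i in hidden:
        masked[i] = None
    return masked, hidden


def mean_imputer(series: Series) -> List[float]:
    """Stand-in method under test: fill gaps with the mean of observed values."""
    fill = mean(v for v in series if v is not None)
    return [fill if v is None else v for v in series]


def score(truth: Series, imputed: List[float], hidden: List[int]) -> Tuple[float, float]:
    """Return (MAE, RMSE) computed only on the positions that were hidden."""
    errors = [imputed[i] - truth[i] for i in hidden]
    mae = mean(abs(e) for e in errors)
    rmse = math.sqrt(mean(e * e for e in errors))
    return mae, rmse


def evaluate(truth: Series, imputer: Callable[[Series], List[float]], fraction: float) -> Tuple[float, float]:
    """Mask, impute, and score in one step."""
    masked, hidden = mask_at_random(truth, fraction)
    return score(truth, imputer(masked), hidden)


# Synthetic series for illustration only, not real measurements.
truth = [float(v) for v in range(10, 30, 2)]
mae, rmse = evaluate(truth, mean_imputer, fraction=0.3)
print(f"MAE={mae:.2f} RMSE={rmse:.2f}")
```

Any other imputation method can be passed to evaluate in place of mean_imputer, so all candidates are scored on the same hidden positions. Uniformly random masking mimics only one kind of missingness; contiguous gaps, such as a sensor outage, would need a different masking function.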

Author

Jakub Jasiński (Warsaw University of Technology)
