PM2.5 study : explore PM2.5 in Beijing using data mining methods and social media data







Air pollution is one of the worst consequences of industrialization. Among air pollutants, PM2.5 is believed to pose the greatest risk to human health because it can lodge deep in the lungs. This study explores predicting airborne PM2.5 values from traditional pollutants and wind information using data mining and statistical models, including K-means, Markov chain, support vector regression (SVR), and ordinary least squares (OLS) models. Additionally, trending topics on social media are analyzed to examine how PM2.5 influences people's daily lives. Because Sina Weibo is the most popular social media platform in China, OLS and SVR models were also fitted to a Weibo dataset. Predictions based on this study are expected to help government and concerned organizations improve environmental protection.