Research | 21 min read | Prime Logic Research | Apr 09, 2026

Urban Air Quality Digital Twins: CFD-ML Hybrid Architectures for Street-Level Pollution Dispersion Modelling

A comparative analysis of Computational Fluid Dynamics models, ML surrogate approaches, and CFD-ML hybrid architectures for real-time urban air quality digital twins, evaluated across three European city domains.

Urban air quality modelling at street-level spatial resolution — the scale required for pedestrian exposure assessment, school siting decisions, and cycle route health impact analysis — demands a fundamentally different approach from the regional-scale Gaussian plume and Eulerian chemistry transport models used for regulatory air quality assessment. Street-level pollution concentrations are dominated by urban morphology effects: building wake zones create extreme local accumulation, street canyons channel traffic emissions into ventilation-inhibited corridors, and intersection geometry determines whether dispersion is rapid or stagnant.

Computational Fluid Dynamics (CFD) models — particularly Large Eddy Simulation (LES) and Reynolds-Averaged Navier-Stokes (RANS) approaches — can accurately represent the building-scale fluid dynamics driving street-level dispersion, but their computational cost renders them unsuitable for real-time operational digital twins. A single RANS simulation of a 500m × 500m urban domain at 2m resolution requires approximately 4–8 hours on a modern HPC cluster — far too slow for the 15-minute update cycles required for operational air quality intelligence.
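The gap between simulation runtime and update cadence can be made concrete with a few lines of arithmetic. The sketch below uses only the figures quoted above; the function names are illustrative, and the cell count assumes a uniform horizontal grid, which real CFD meshes (usually refined near buildings) will not match exactly.

```python
# Figures quoted in the text above.
RANS_RUNTIME_HOURS = (4.0, 8.0)  # one RANS run, 500 m x 500 m domain at 2 m resolution
UPDATE_CYCLE_MINUTES = 15.0      # operational update interval


def cycles_per_simulation(runtime_hours: float, cycle_minutes: float) -> float:
    """Number of update cycles that elapse while one simulation runs."""
    return runtime_hours * 60.0 / cycle_minutes


def horizontal_cells(domain_m: float, resolution_m: float) -> int:
    """Cells in one horizontal layer of a square domain (uniform grid assumed)."""
    per_side = int(round(domain_m / resolution_m))
    return per_side * per_side


low, high = (cycles_per_simulation(h, UPDATE_CYCLE_MINUTES) for h in RANS_RUNTIME_HOURS)
cells = horizontal_cells(500.0, 2.0)

print(f"One RANS run spans {low:.0f} to {high:.0f} update cycles")
print(f"Cells per horizontal layer: {cells}")
```

On these numbers a single run would have to be one to two orders of magnitude faster before it could fit inside one update window.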

ML surrogate models (convolutional neural networks or graph neural networks trained on pre-computed CFD simulation databases) offer rapid inference, on the order of milliseconds per prediction, at the expense of generalisation accuracy when meteorological or emission conditions fall outside the training distribution. This study evaluated three architectures for an urban air quality digital twin application: (1) standalone CFD (RANS k-ε) as the accuracy benchmark; (2) a CNN surrogate trained on 10,000 CFD simulations spanning the wind speed, wind direction, and emission rate parameter space; and (3) a CFD-ML hybrid in which a lightweight GNN serves as a physics-residual network that learns to correct RANS model errors.
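The residual-correction idea behind the third architecture can be sketched in a few lines: the learned component predicts only the error of the physics baseline, and the final prediction is the baseline plus that correction. The example below is a minimal illustration under loud assumptions, not the study's implementation. A least-squares linear model stands in for the GNN, and all data, feature choices, and coefficients are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data; none of these values come from the study.
n_cells = 500
features = rng.normal(size=(n_cells, 3))    # hypothetical per-cell inputs
baseline = 40.0 + 5.0 * features[:, 2]      # placeholder physics-model NO2 field (ug/m3)
true_w = np.array([1.5, -2.0, 0.5])         # invented systematic error of the baseline
reference = baseline + features @ true_w + 3.0 + rng.normal(scale=0.5, size=n_cells)


def with_bias(x: np.ndarray) -> np.ndarray:
    """Append a constant column so the fit can learn an offset."""
    return np.column_stack([x, np.ones(len(x))])


def fit_residual(x: np.ndarray, base: np.ndarray, ref: np.ndarray) -> np.ndarray:
    """Fit the residual (reference minus baseline), not the concentration itself."""
    coef, *_ = np.linalg.lstsq(with_bias(x), ref - base, rcond=None)
    return coef


def predict_hybrid(x: np.ndarray, base: np.ndarray, coef: np.ndarray) -> np.ndarray:
    """Hybrid prediction = physics baseline + learned correction."""
    return base + with_bias(x) @ coef


def rmse(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.sqrt(np.mean((a - b) ** 2)))


train, test = slice(0, 400), slice(400, None)
coef = fit_residual(features[train], baseline[train], reference[train])
hybrid = predict_hybrid(features[test], baseline[test], coef)

rmse_baseline = rmse(baseline[test], reference[test])
rmse_hybrid = rmse(hybrid, reference[test])
print(f"baseline RMSE: {rmse_baseline:.2f}  hybrid RMSE: {rmse_hybrid:.2f}")
```

Because the corrector only has to model the baseline's error, the physics model still carries the bulk of the prediction when inputs drift away from the training data, which is the usual motivation for residual designs of this kind.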

The CFD-ML hybrid architecture achieved the best balance of accuracy and computational efficiency. Its mean RMSE for street-level NO2 concentration prediction was 4.2 μg/m³, versus 3.1 μg/m³ for standalone CFD and 8.7 μg/m³ for the CNN surrogate alone. Inference took 340 ms per 15-minute update cycle on a 4-GPU server, which meets the real-time operational requirement while retaining 86% of full CFD accuracy. The standalone CNN surrogate degraded most severely during wind direction transitions: its RMSE rose to 14.2 μg/m³ for directions not well represented in the training data, confirming the distribution shift vulnerability that limits pure ML approaches for operational deployment.
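One way to make "not well represented in training data" measurable is to count training samples per wind-direction sector and flag sparse sectors at inference time. The sketch below is illustrative only: the 30° sector width, the minimum-sample threshold, and the synthetic south-westerly training set are all assumptions, not values from the study.

```python
import numpy as np

SECTOR_WIDTH_DEG = 30.0                     # assumed sector width
N_SECTORS = int(360 / SECTOR_WIDTH_DEG)


def sector_index(direction_deg: float) -> int:
    """Map a wind direction in degrees to a sector, wrapping around 360."""
    return int((direction_deg % 360.0) // SECTOR_WIDTH_DEG) % N_SECTORS


def sector_counts(train_directions_deg) -> np.ndarray:
    """Count training samples falling in each direction sector."""
    counts = np.zeros(N_SECTORS, dtype=int)
    for d in train_directions_deg:
        counts[sector_index(d)] += 1
    return counts


def is_well_covered(direction_deg: float, counts: np.ndarray, min_samples: int) -> bool:
    """True if the sector containing this direction has enough training samples."""
    return bool(counts[sector_index(direction_deg)] >= min_samples)


# Hypothetical training set dominated by south-westerly winds.
rng = np.random.default_rng(1)
train_dirs = rng.normal(loc=225.0, scale=25.0, size=2000)
counts = sector_counts(train_dirs)

for wind in (225.0, 45.0):
    status = "covered" if is_well_covered(wind, counts, min_samples=50) else "sparse"
    print(f"{wind:5.1f} deg -> {status}")
```

A check like this could flag the update cycles in which a pure surrogate is most likely to be extrapolating.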

The architecture has been deployed as a pilot urban air quality digital twin in three European cities (one each in the UK, Germany, and the Netherlands). The deployment integrates: CAMS regional forecast as meteorological boundary conditions; live traffic count data from ANPR camera networks as emission inputs; Sensirion SEN5x sensor networks as real-time bias correction data sources; and the CFD-ML hybrid as the core dispersion engine producing 100 m resolution street-level NO2, PM2.5, and PM10 concentration maps every 15 minutes. Population exposure analysis against WHO 2021 Air Quality Guidelines is computed continuously and exported to public health dashboard APIs.
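The exposure step can be illustrated with a small population-weighting calculation. The sketch below assumes the 15-minute maps are first averaged to a 24-hour mean and then compared with the WHO 2021 24-hour guideline levels (NO2 25, PM2.5 15, PM10 45 μg/m³). The grid, population counts, and function names are hypothetical, and the pilot's actual exposure method is not described in enough detail to reproduce.

```python
import numpy as np

# WHO 2021 Air Quality Guidelines, 24-hour mean levels in ug/m3.
WHO_2021_AQG_24H = {"NO2": 25.0, "PM2.5": 15.0, "PM10": 45.0}


def daily_mean(maps_15min: np.ndarray) -> np.ndarray:
    """Average a stack of 15-minute maps, shape (n_steps, ny, nx), over time."""
    return maps_15min.mean(axis=0)


def exposure_summary(conc_24h: np.ndarray, population: np.ndarray, pollutant: str):
    """Return (population-weighted mean, share of population above the guideline)."""
    limit = WHO_2021_AQG_24H[pollutant]
    total = population.sum()
    weighted_mean = float((conc_24h * population).sum() / total)
    exceed_share = float(population[conc_24h > limit].sum() / total)
    return weighted_mean, exceed_share


# Hypothetical 2 x 2 grid: a constant NO2 field repeated over 96 time steps.
no2_field = np.array([[10.0, 30.0], [20.0, 40.0]])   # ug/m3
maps = np.repeat(no2_field[None, :, :], 96, axis=0)  # 96 x 15 min = 24 h
residents = np.array([[100.0, 300.0], [400.0, 200.0]])

mean_no2, share_above = exposure_summary(daily_mean(maps), residents, "NO2")
print(f"population-weighted NO2: {mean_no2:.1f} ug/m3, share above guideline: {share_above:.0%}")
```

Weighting by population rather than by area means that a hotspot over a residential block counts for more than the same concentration over an empty car park.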