30th April 2019

The openair project

openair started about 10 years ago to provide open-source, dedicated data analysis tools for the air quality community

  • Widely used internationally by academia, governmental bodies and the private sector
  • Downloaded more than 150,000 times
  • Coded in R — a programming language developed for data analysis and statistics
  • Lots of functions for air quality analysis, including directional analyses, robust trends, clustering of polar plots, back trajectories, easy access to AURN, SAQN, WAQN, KCL networks, …

Approach to analysis

  • In some ways it is useful to think in terms of a crime scene investigation
    • Perhaps it is a crime scene — air quality limits set in law exceeded, naughty vehicle manufacturers and lots of early deaths!
  • Thinking more in terms of forensic science
    • Forensic scientists collect, preserve, and analyse scientific evidence during the course of an investigation
    • A whodunnit for air quality scientists — Mrs White in the library … or VW on the Old Kent Road
  • Tricky for air quality because the ‘suspects’ do not have unique fingerprints or DNA
    • Although particle composition might be a close analogy …

Asking questions of your data

General questions:

  • What is it you really want to know?
  • Are the measurements dominated by specific source(s)?
  • What changes occur over time (trends)?
  • How does site ‘X’ compare with other sites?
  • Is there evidence that the measurements are affected by street canyons?

Harder questions:

  • What is the contribution from road traffic emissions?
  • How much are concentrations affected by non-local sources?
  • When will site ‘X’ meet an air quality limit?

Conditional analysis is useful

  • Don’t need to analyse the full data set
  • Filter data for conditions of interest
    • Focus on weekdays and certain hours, e.g. 7 am to 7 pm for traffic data
    • Certain months of the year
    • Other data e.g. vehicle flows
    • Geographic areas
    • By wind direction and wind speed
    • Trends: no need to consider all data if the question is about recent changes
  • Trade-off: more filtering means less data
    • Considering uncertainties can be useful
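
The filtering ideas above can be sketched in base R. This is a minimal illustration, not the openair workflow itself: the data frame `aq` is synthetic (hypothetical values, with openair-style column names `date`, `ws`, `wd`, `nox`), the 07:00 to 18:59 window is one assumed reading of "7 am to 7 pm", and the winter and south-westerly conditions are arbitrary choices. The standard error of the mean is included only as a rough way to see the trade-off between filtering and data volume; it assumes independent observations, which real hourly data rarely satisfy.

```r
# Synthetic hourly data for one year (hypothetical values, openair-style columns)
set.seed(1)
dates <- seq(as.POSIXct("2018-01-01 00:00", tz = "UTC"),
             as.POSIXct("2018-12-31 23:00", tz = "UTC"), by = "hour")
aq <- data.frame(
  date = dates,
  ws   = runif(length(dates), 0, 10),            # wind speed, m/s
  wd   = runif(length(dates), 0, 360),           # wind direction, degrees
  nox  = rlnorm(length(dates), meanlog = 4, sdlog = 0.5)
)

# Build one logical condition per filter of interest
lt <- as.POSIXlt(aq$date, tz = "UTC")
is_weekday <- lt$wday %in% 1:5                   # wday: 0 = Sunday, 1 = Monday, ...
is_daytime <- lt$hour >= 7 & lt$hour <= 18       # hours starting 07:00 to 18:00
is_winter  <- (lt$mon + 1) %in% c(12, 1, 2)      # mon is 0-based
is_sw      <- aq$wd >= 180 & aq$wd <= 270        # south-westerly sector

subset_aq <- aq[is_weekday & is_daytime & is_winter & is_sw, ]

# Simple standard error of the mean (assumes independent observations)
se <- function(x) sd(x) / sqrt(length(x))
se_full   <- se(aq$nox)
se_subset <- se(subset_aq$nox)

cat("rows: full =", nrow(aq), " filtered =", nrow(subset_aq), "\n")
cat("SE of mean NOx: full =", se_full, " filtered =", se_subset, "\n")
```

Each added condition shrinks the subset, and the standard error of the mean grows accordingly. For date-based conditions on real data, openair's `selectByDate()` offers a more convenient route than hand-built logical vectors.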

Example of Oxford roadside site

library(openair)

# Hourly AURN data for the Oxford roadside site, then a NOx polar plot
oxford <- importAURN(site = "ox", year = 2010:2018)

polarPlot(oxford, pollutant = "nox", cols = "inferno")