1 Introduction
Learning R can be daunting for the novice but this does not mean that openair cannot be used in a basic way. When openair was first developed it was designed to be used by non-experts in R. In this document we will cover some basic usage to see how air quality data can be imported easily and some plots produced.
Even if you do not want to learn much R, you can still do a lot with a minimum of knowledge.
Later in the course we will go through more sophisticated analyses.
2 Load the openair library
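The code for this step does not appear above. A minimal sketch, assuming openair has already been installed from CRAN:

```r
# Load openair for the current R session.
# A one-off install.packages("openair") is assumed to have been run beforehand.
library(openair)
```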
3 Import one year of data for the London Marylebone roadside site
Importing data in this way also provides hourly estimates of wind speed and wind direction from the CMAQ regional air quality model. These data should be available from part way through 2010 to the present day.
It is easy to do this in openair with one short line of code:
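A sketch of what that one line might look like, using importAURN. The site code "my1" is the AURN code for London Marylebone Road; the year 2019 and the object name my1 are assumptions made for illustration, not taken from the text.

```r
library(openair)

# Import one year of hourly data for London Marylebone Road.
# year = 2019 is an arbitrary choice for this sketch.
my1 <- importAURN(site = "my1", year = 2019)

# Look at the first few rows
head(my1)
```

This needs an internet connection, because the data are downloaded when the function is called.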
4 Quick summary of the data
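The commands behind the summary are not shown. One plausible way to get a listing of column classes like the one that follows is sketched here; the object name my1 and the year are assumptions, and column classes may differ between openair versions.

```r
library(openair)

# Re-create the data so this block runs on its own (year is an assumption)
my1 <- importAURN(site = "my1", year = 2019)

# Basic statistical summary of every column
summary(my1)

# Class of each column; the two classes of 'date' show up as date1 and date2
unlist(lapply(my1, class))
```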
##     date1     date2      code      site        o3       no2        co
## "POSIXct"  "POSIXt"  "factor"  "factor" "numeric" "numeric" "numeric"
##       so2      pm10       nox        no     pm2.5      nv10       v10
## "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric"
##     nv2.5      v2.5        ws        wd
## "numeric" "numeric" "numeric" "numeric"
5 Produce a time series of NO2 concentrations
openair will try to format common pollutant names and units properly.
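A minimal sketch of the hourly time series using timePlot. The data object my1 and the year are assumptions carried over for illustration.

```r
library(openair)

# Re-create the data so this block runs on its own (year is an assumption)
my1 <- importAURN(site = "my1", year = 2019)

# Hourly NO2 time series; "no2" is formatted as NO2 with a subscript in the plot
timePlot(my1, pollutant = "no2")
```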
5.1 Produce a time series of daily average NO2 concentrations
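For daily averages, timePlot has an avg.time argument. A sketch, again assuming a data frame called my1 for an arbitrarily chosen year:

```r
library(openair)

# Re-create the data so this block runs on its own (year is an assumption)
my1 <- importAURN(site = "my1", year = 2019)

# Average the hourly values to daily means before plotting
timePlot(my1, pollutant = "no2", avg.time = "day")
```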
6 Plot data in calendar format
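A basic calendar plot could be produced along these lines. The object my1 and the year are assumptions for this sketch.

```r
library(openair)

# Re-create the data so this block runs on its own (year is an assumption)
my1 <- importAURN(site = "my1", year = 2019)

# One panel per month, each day coloured by its mean NO2 concentration
calendarPlot(my1, pollutant = "no2")
```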
Choose some nice colours also.
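Colours are set with the cols argument. The palette used in the original example is not known; "viridis" below is just one of the predefined openair colour schemes, chosen for illustration.

```r
library(openair)

# Re-create the data so this block runs on its own (year is an assumption)
my1 <- importAURN(site = "my1", year = 2019)

# Same calendar plot with a different predefined colour scheme
calendarPlot(my1, pollutant = "no2", cols = "viridis")
```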
Which days had concentrations > 100 µg m\(^{-3}\) … a more complicated example with additional options included.
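One way to highlight such days is to turn the colour scale into categories with the breaks and labels arguments. This is a sketch under assumptions: the break values, labels, colours and the use of the daily maximum (statistic = "max") are illustrative choices, not the settings of the original example.

```r
library(openair)

# Re-create the data so this block runs on its own (year is an assumption)
my1 <- importAURN(site = "my1", year = 2019)

# Two categories: days whose maximum hourly NO2 is at most 100, and days above 100.
# The upper break (1000) is simply a value larger than any expected concentration.
calendarPlot(
  my1,
  pollutant = "no2",
  statistic = "max",
  breaks = c(0, 100, 1000),
  labels = c("100 or below", "above 100"),
  cols = c("grey85", "firebrick")
)
```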
8 Plot a wind rose
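A minimal sketch using windRose, which looks for columns named ws and wd. Here these are the modelled values that come with the imported data; the object my1 and the year are assumptions.

```r
library(openair)

# Re-create the data so this block runs on its own (year is an assumption)
my1 <- importAURN(site = "my1", year = 2019)

# Wind rose from the ws (wind speed) and wd (wind direction) columns
windRose(my1)
```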
9 Plot a polar plot
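A sketch of a bivariate polar plot with polarPlot. The pollutant, the object my1 and the year are assumptions for illustration.

```r
library(openair)

# Re-create the data so this block runs on its own (year is an assumption)
my1 <- importAURN(site = "my1", year = 2019)

# Mean NO2 concentration as a smooth surface over wind speed and wind direction
polarPlot(my1, pollutant = "no2")
```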
10 Proportion plot
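The openair function for this kind of plot is timeProp, which shows a time series as stacked bars split by the contribution of another variable. The settings below (NOx, 3-day averages, split by wind direction) are assumptions chosen for illustration.

```r
library(openair)

# Re-create the data so this block runs on its own (year is an assumption)
my1 <- importAURN(site = "my1", year = 2019)

# 3-day mean NOx, with each bar divided according to wind sector
timeProp(my1, pollutant = "nox", avg.time = "3 day", proportion = "wd")
```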
11 The openair type option
Being able to look at the dependencies of pollutant concentrations on other factors is immensely useful. It can be very illuminating to see how a pollutant varies by season, hour of the day, day of the week, cloud cover … other pollutants etc. Being able to consider these dependencies quickly and efficiently greatly helps analysis and also leads to a more question-led and interactive approach.
However, we don’t want to spend ages processing data! Here’s a quick example:
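The original example is not shown. As a stand-in, the sketch below splits a scatter plot of NO2 against NOx by season simply by adding type = "season"; no manual data processing is needed. The choice of function, variables, object name and year are all assumptions.

```r
library(openair)

# Re-create the data so this block runs on its own (year is an assumption)
my1 <- importAURN(site = "my1", year = 2019)

# One panel per season; openair derives the season from the date column
scatterPlot(my1, x = "nox", y = "no2", type = "season")
```

The same type argument is accepted by most openair plotting functions.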
And a brief summary of in-built types:
- “year” splits data by year
- “month” splits variables by month of the year
- “monthyear” splits data by year and month
- “season” splits variables by season. Note in this case the user can also supply a hemisphere option that can be either “northern” (default) or “southern”
- “weekday” splits variables by day of the week
- “weekend” splits variables by Saturday, Sunday, weekday
- “daylight” splits variables by nighttime/daytime. Note the user must supply a longitude and latitude
- “dst” splits variables by daylight saving time and non-daylight saving time (see manual for more details)
- “wd” if wind direction (wd) is available, type = "wd" will split the data up into 8 sectors: N, NE, E, SE, S, SW, W, NW.
- “seasonyear” (or “yearseason”) will split the data into year-season intervals, keeping the months of a season together. For example, December 2010 is considered as part of winter 2011 (with January and February 2011). This makes it easier to consider contiguous seasons. In contrast, type = "season" will just split the data into four seasons regardless of the year.
If a categorical variable is present in a data frame, e.g. site, then that variable can be used directly, e.g. type = "site".
type can also be a numeric variable. In this case the numeric variable is split into four quantile intervals, i.e. four partitions containing equal numbers of points. Note the user can supply the option n.levels to indicate how many quantile intervals to use.
Pollution rose for O3 for different intervals of NOx concentrations (quantiles):
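A sketch of such a plot with pollutionRose, passing the numeric variable nox as the type so that the data are split into quantile intervals of NOx. The object my1 and the year are assumptions.

```r
library(openair)

# Re-create the data so this block runs on its own (year is an assumption)
my1 <- importAURN(site = "my1", year = 2019)

# O3 pollution rose, one panel per quantile interval of NOx concentration
pollutionRose(my1, pollutant = "o3", type = "nox")
```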
What’s missing to make this more useful? The availability of surface meteorological data massively increases the types of analysis that can be carried out. We can also easily access surface measurements which will probably be more accurate than modelled data. This is something we will come back to.
Also, what about site meta data such as site classification, pollutants measured etc? That is also something that can be easily accessed in openair.
12 Trends
12.1 Theil-Sen trend estimates
In this case we will use the longer time series that comes with openair called mydata.
A basic trend plot
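A minimal sketch with the TheilSen function and the bundled mydata set. The pollutant is an assumption, since the original call is not shown.

```r
library(openair)

# Theil-Sen trend of monthly mean NOx, with bootstrap uncertainty intervals.
# mydata ships with openair, so no download is needed.
trend <- TheilSen(mydata, pollutant = "nox")
```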
## [1] "Taking bootstrap samples. Please wait."
But we can easily do more. For example, it can be very useful to look at trends split by season, with the data also averaged by season:
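A sketch of a seasonal version: type = "season" gives one panel per season and avg.time = "season" averages the data to seasonal means before the trend is estimated. The pollutant is an assumption.

```r
library(openair)

# One Theil-Sen trend per season, each based on seasonal mean concentrations
trend_season <- TheilSen(
  mydata,
  pollutant = "nox",
  type = "season",
  avg.time = "season"
)
```

The bootstrap message is printed once per panel, hence four times here.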
## [1] "Taking bootstrap samples. Please wait."
## [1] "Taking bootstrap samples. Please wait."
## [1] "Taking bootstrap samples. Please wait."
## [1] "Taking bootstrap samples. Please wait."
12.2 Non-parametric smooth trends
Often we do not want to fit a straight line through a trend but want to reveal the nature of the variation over time. The smoothTrend function is useful in this situation.
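A minimal sketch with the bundled mydata set; the pollutant is an assumption.

```r
library(openair)

# Smooth (non-parametric) trend fitted to monthly mean NOx concentrations
smooth_nox <- smoothTrend(mydata, pollutant = "nox")
```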
Trends by wind sector:
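Combining smoothTrend with type = "wd" gives one panel per wind sector. The pollutant below is an assumption.

```r
library(openair)

# Smooth trend in O3 for each of the eight wind sectors
smooth_wd <- smoothTrend(mydata, pollutant = "o3", type = "wd")
```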
13 Access to meteorological data
13.1 The worldmet package
worldmet provides an easy way to access surface meteorological data from >30,000 sites across the world. The package accesses the NOAA webservers to download hourly data. Access to surface meteorological data is very useful in general but is especially useful when using openair and the polarPlot function. To install the package, type:
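The install command is not shown above. Assuming installation from CRAN (the original may have used a different source, such as a development version):

```r
# One-off installation of worldmet from CRAN
install.packages("worldmet")
```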
There are two main functions in the package: getMeta and importNOAA. The former helps the user find meteorological sites by name, country and proximity to a location based on the latitude and longitude. getMeta will also return a code that can be supplied to importNOAA, which then imports the data.
Probably the most common use of getMeta is to search around a location of interest based on its latitude and longitude. First we will load the worldmet package:
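A minimal sketch, assuming worldmet is already installed:

```r
# Load worldmet for the current R session
library(worldmet)
```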
Now we can do a search for the 10 nearest sites to Huelva (latitude = 37, longitude = -7):
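A sketch of the search using getMeta. The argument names (lat, lon, n) are those of the worldmet versions that export getMeta; the object name sites is an assumption.

```r
library(worldmet)

# Find the 10 sites closest to the given location.
# By default a map of the sites is also produced.
sites <- getMeta(lat = 37, lon = -7, n = 10)

# The 'code' column holds the identifiers that importNOAA expects
sites
```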
We can use the map that is produced to select a site of interest and import the data. For example, to import data for Huelva:
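The site code used in the original is not shown, so the sketch below takes the first row returned by getMeta purely as a placeholder; in practice the code would be read off the map. The year and the object names are also assumptions.

```r
library(worldmet)

# Search again so this block runs on its own
sites <- getMeta(lat = 37, lon = -7, n = 10)

# Placeholder: first site in the search result, not necessarily the one
# used in the original example. Replace with the code chosen from the map.
site_code <- sites$code[1]

# Import one year of hourly meteorological data (year is an assumption)
met <- importNOAA(code = site_code, year = 2019)
head(met)
```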
Plot a wind rose!
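Data imported with importNOAA include ws and wd columns, so they can be passed straight to the openair windRose function. As before, the site (first search result) and the year are placeholders for this sketch.

```r
library(openair)
library(worldmet)

# Re-create the meteorological data so this block runs on its own.
# Site choice and year are assumptions, not the original settings.
sites <- getMeta(lat = 37, lon = -7, n = 10)
met <- importNOAA(code = sites$code[1], year = 2019)

# Wind rose from the measured surface wind speed and direction
windRose(met)
```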