Searching and plotting

In this short tutorial we’ll show how to retrieve some data and create a simple plot using one of our plotting functions.

Using the tutorial object store

As in the previous tutorial, we will use the tutorial object store to avoid cluttering your personal object store.

from openghg.tutorial import use_tutorial_store

use_tutorial_store()

Now we’ll add some data to the tutorial store.

from openghg.tutorial import populate_surface_data

populate_surface_data()

1. Searching

Let’s search for all the methane data from Tacolneston. To do this we need to know the site code (“TAC”).

If we didn’t know the site code, we could find it using the summary_site_codes() function:

from openghg.standardise import summary_site_codes

# Optional: uncomment to display every row instead of a truncated table
# import pandas as pd; pd.set_option('display.max_rows', None)

summary = summary_site_codes()
summary

The output of this function is a pandas DataFrame, so we can filter to find sites containing the name “Tacolneston”:

site_long_name = summary["Long name"]
find_tacolneston = site_long_name.str.contains("Tacolneston")

summary[find_tacolneston]

This shows us that the site code for Tacolneston is “TAC”, and also that there are two entries for Tacolneston, since it is included under multiple networks.
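
The same filtering pattern can be tried on a small stand-in table. This is only a sketch: the DataFrame below is hypothetical (its "Code" and "Network" columns and all of its values are invented for illustration, and only the "Long name" column is taken from the text above), so the real output of summary_site_codes() will look different. It also uses two optional arguments of pandas' str.contains that can make the filter more forgiving.

```python
import pandas as pd

# Hypothetical stand-in for the table returned by summary_site_codes();
# the column layout and values are assumptions made for this sketch.
summary = pd.DataFrame(
    {
        "Code": ["TAC", "TAC", "BSD", "XXX"],
        "Network": ["network_a", "network_b", "network_a", "network_c"],
        "Long name": ["Tacolneston Tower, UK", "Tacolneston Tower, UK", "Bilsdale, UK", None],
    }
)

# case=False ignores capitalisation in the search term.
# na=False treats rows with a missing name as non-matches, so the mask
# contains only True/False values and can be used to select rows.
mask = summary["Long name"].str.contains("tacolneston", case=False, na=False)
tacolneston_rows = summary[mask]

print(tacolneston_rows)

# Collect the distinct site codes among the matching rows
site_codes = sorted(set(tacolneston_rows["Code"]))
print(site_codes)
```

Depending on the pandas version and the column's dtype, leaving out na=False can put missing values into the mask, which pandas may then refuse to use for row selection.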

To see all available data associated with Tacolneston, we can search using the site code “TAC”.

from openghg.retrieve import search

tac_data_search = search(site="tac")

We can inspect the outcome of the search through its results property, which is a pandas DataFrame.

tac_data_search.results

To look only for surface observations, we can use the search_surface function instead. We can also pass several keywords at once to narrow the search, for example to just the methane data:

from openghg.retrieve import search_surface

tac_surface_search = search_surface(site="TAC", species="ch4")

Keyword options when searching

When searching, it is also possible to specify multiple options for a keyword. If this is done using a list, then data sources that have any of the specified values will be found. For example, if we wanted to search for methane at two specific inlets, we could write:

from openghg.retrieve import search_surface

tac_surface_search = search_surface(site="TAC", species="ch4", inlet=["100m", "185m"])

This will return results from both the 100m and 185m inlets (but not the 54m inlet).
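
The "any of" rule for a list-valued keyword can be pictured with a small stand-alone sketch. This is not how OpenGHG implements its search: the matches helper and the metadata records below are hypothetical and exist only to illustrate the matching rule described above.

```python
def matches(metadata, **criteria):
    """Return True if a metadata record satisfies every keyword.

    A list-valued keyword matches when the record holds any one of the
    listed values; a single value must match exactly. The comparison is
    case-insensitive, which is an assumption made for this sketch.
    """
    for key, wanted in criteria.items():
        options = wanted if isinstance(wanted, list) else [wanted]
        options = {str(option).lower() for option in options}
        if str(metadata.get(key, "")).lower() not in options:
            return False
    return True


# Hypothetical metadata records standing in for data sources in a store
records = [
    {"site": "tac", "species": "ch4", "inlet": "54m"},
    {"site": "tac", "species": "ch4", "inlet": "100m"},
    {"site": "tac", "species": "ch4", "inlet": "185m"},
    {"site": "tac", "species": "co2", "inlet": "185m"},
]

# Keywords are combined with "and"; values inside a list are combined with "or"
found = [
    record["inlet"]
    for record in records
    if matches(record, site="TAC", species="ch4", inlet=["100m", "185m"])
]
print(found)
```

In this sketch the 54m record is rejected because its inlet is not in the list, and the CO2 record is rejected because every keyword has to match.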

Note: it is also possible to pass a dictionary to choose between different keywords. This is mostly useful for backwards compatibility (e.g. when a new keyword has been introduced and an older one retired but still present for some data sources), so it is not demonstrated in this tutorial.

There are also equivalent search functions for other data types including search_footprints, search_flux and search_bc.

2. Plotting

If we want to take a look at the data from the 185m inlet we can first retrieve the data from the object store and then create a quick timeseries plot. See the SearchResults object documentation for more information.

data_185m = tac_surface_search.retrieve(inlet="185m")


The plots created below may not show up on the online documentation version of this notebook.

We can visualise this data using the built-in plotting commands from the plotting submodule. We can also modify the inputs to improve how the plot is displayed:

from openghg.plotting import plot_timeseries

plot_timeseries(data_185m, title="Methane at Tacolneston", xlabel="Time", ylabel="Conc.", units="ppm")

Plotting multiple timeseries

If there are multiple results for a given search, we can also retrieve all the data and receive a list of ObsData objects.

all_ch4_tac = tac_surface_search.retrieve()
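
Because a retrieval may give back either a single object or a list, code that loops over the result can first normalise it to a list. The sketch below shows that idiom with a hypothetical stand-in class; StandInObsData and the as_list helper are not part of OpenGHG, and the metadata attribute is an assumption used here purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class StandInObsData:
    """Hypothetical placeholder for a retrieved object; holds metadata only."""

    metadata: dict


def as_list(retrieved):
    """Wrap a single retrieved object in a list; pass lists through unchanged."""
    return retrieved if isinstance(retrieved, list) else [retrieved]


# One result versus several results, using made-up inlet heights
single = StandInObsData(metadata={"inlet": "185m"})
several = [StandInObsData(metadata={"inlet": inlet}) for inlet in ("54m", "100m", "185m")]

# The same loop now works in both cases
inlets_single = [obs.metadata["inlet"] for obs in as_list(single)]
inlets_several = [obs.metadata["inlet"] for obs in as_list(several)]
print(inlets_single, inlets_several)
```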

Then we can use the plot_timeseries function from the plotting submodule to compare measurements from different inlets. This creates a Plotly plot that should be interactive and responsive, even with relatively large amounts of data.

plot_timeseries(data=all_ch4_tac, units="ppb")

3. Comparing different sites

We can compare data for the same species from different sites by first doing a quick search to see what’s available:

ch4_data = search_surface(species="ch4")

ch4_data.results

Then we refine our search to retrieve only the sites (and inlets) that we want to compare, and make a plot:

bsd_data = ch4_data.retrieve(site="BSD")
tac_data = ch4_data.retrieve(site="TAC", inlet="54m")
plot_timeseries(data=[bsd_data, tac_data], title="Comparing CH4 measurements at Tacolneston and Bilsdale")