Searching and plotting

In this short tutorial we’ll show how to retrieve some data and create a simple plot using one of our plotting functions.

Using the tutorial object store

As in the previous tutorial, we will use the tutorial object store to avoid cluttering your personal object store.

from openghg.tutorial import use_tutorial_store

use_tutorial_store()

Now we’ll add some data to the tutorial store.

from openghg.tutorial import populate_surface_data
populate_surface_data()

1. Searching

Let’s search for all the methane data from Tacolneston. To do this we need to know the site code (“TAC”).

If we didn’t know the site code, we could find it using the summary_site_codes() function:

In [1]: from openghg.standardise import summary_site_codes

## UNCOMMENT THIS CODE TO SHOW ALL ENTRIES
# import pandas as pd; pd.set_option('display.max_rows', None)
In [2]: summary = summary_site_codes()

In [3]: summary
Out[3]: 
          Network  ... Inlet heights
Site Code          ...              
AAO          NOAA  ...              
ABP          NOAA  ...              
ABT          NOAA  ...       ['33m']
ACG          NOAA  ...              
ADR           ALE  ...       ['10m']
...           ...  ...           ...
YON          NOAA  ...       ['20m']
ZEP         AGAGE  ...       ['12m']
ZEP          ICOS  ...       ['15m']
ZEP          NOAA  ...              
ZSF          ICOS  ...        ['3m']

[390 rows x 6 columns]

The output of this function is a pandas DataFrame, so we can filter to find sites containing the name “Tacolneston”:

In [4]: site_long_name = summary["Long name"]

In [5]: find_tacolneston = site_long_name.str.contains("Tacolneston")

In [6]: summary[find_tacolneston]
Out[6]: 
          Network  ...            Inlet heights
Site Code          ...                         
TAC          DECC  ...  ['185m', '54m', '100m']
TAC          NOAA  ...                         

[2 rows x 6 columns]

This shows us that the site code for Tacolneston is “TAC”, and also that there are two entries for Tacolneston, because the site is listed under two networks (DECC and NOAA).
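
The filter above relies on pandas string matching, which is case-sensitive by default. The sketch below shows the same idea with a case-insensitive match and an extra filter on the network. It uses a small hand-made DataFrame as a stand-in for the output of summary_site_codes(); the rows and long names are illustrative assumptions, not real entries.

```python
import pandas as pd

# Toy stand-in for the summary_site_codes() output.
# The long names below are made up for illustration only.
summary = pd.DataFrame(
    {
        "Network": ["DECC", "NOAA", "DECC", "NOAA"],
        "Long name": ["Tacolneston Tower", "Tacolneston", "Bilsdale", None],
    },
    index=pd.Index(["TAC", "TAC", "BSD", "ABP"], name="Site Code"),
)

# case=False ignores capitalisation; na=False treats missing names as non-matches
mask = summary["Long name"].str.contains("tacolneston", case=False, na=False)
matches = summary[mask]
print(matches)

# Narrow the matches further to a single network
decc_only = matches[matches["Network"] == "DECC"]
print(decc_only)
```

Passing na=False keeps the mask purely boolean, so it can be used for indexing even when some rows have no long name.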

To see all available data associated with Tacolneston, we can search using the site code “TAC”.

from openghg.retrieve import search

tac_data_search = search(site="TAC")

To inspect what was found, we can look at the results property of the search, which is a pandas DataFrame.

tac_data_search.results
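
Because results is a pandas DataFrame, the usual pandas tools can be used to explore it. The sketch below is a minimal illustration using a hand-made DataFrame as a stand-in for tac_data_search.results; the column names ("site", "species", "inlet") and the rows are assumptions and may differ from what your object store returns.

```python
import pandas as pd

# Stand-in for tac_data_search.results.
# Column names and rows are assumed for illustration only.
results = pd.DataFrame(
    {
        "site": ["tac", "tac", "tac", "tac"],
        "species": ["ch4", "ch4", "co2", "ch4"],
        "inlet": ["54m", "100m", "185m", "185m"],
    }
)

# Keep only the methane rows
ch4_rows = results[results["species"] == "ch4"]

# List the distinct inlets, sorted by height rather than alphabetically
ch4_inlets = sorted(ch4_rows["inlet"].unique(), key=lambda s: int(s.rstrip("m")))
print(ch4_inlets)
```

Checking which inlets and species are present in this way can help when choosing the arguments for a narrower search.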

To search only the surface observations, we can use the search_surface function. We can also pass several keyword arguments to narrow the search, for example to return just the methane data:

from openghg.retrieve import search_surface

tac_surface_search = search_surface(site="TAC", species="ch4")
tac_surface_search.results

There are also equivalent search functions for other data types including search_footprints, search_flux and search_bc.

2. Plotting

If we want to take a look at the data from the 185m inlet we can first retrieve the data from the object store and then create a quick timeseries plot. See the SearchResults object documentation for more information.

data_185m = tac_surface_search.retrieve(inlet="185m")

Note

The plots created below may not show up on the online documentation version of this notebook.

We can visualise this data using the built-in plot_timeseries function from the plotting submodule. We can also pass extra arguments to control the title, axis labels and units:

from openghg.plotting import plot_timeseries

plot_timeseries(data_185m, title="Methane at Tacolneston", xlabel="Time", ylabel="Conc.", units="ppm")
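
The units argument names the mole-fraction unit used for the y-axis. Parts per million (ppm) and parts per billion (ppb) differ by a factor of 1000, so it helps to know which one a plot is using. The sketch below shows the arithmetic on its own, independent of OpenGHG; the helper convert_mole_fraction is hypothetical and the methane value is illustrative.

```python
# Mole-fraction units expressed as dimensionless fractions
UNIT_FACTORS = {"ppm": 1e-6, "ppb": 1e-9, "ppt": 1e-12}


def convert_mole_fraction(value: float, from_unit: str, to_unit: str) -> float:
    """Convert a mole fraction between ppm, ppb and ppt (hypothetical helper)."""
    return value * UNIT_FACTORS[from_unit] / UNIT_FACTORS[to_unit]


# An illustrative methane mole fraction of 1900 ppb, expressed in ppm
ch4_ppm = convert_mole_fraction(1900.0, "ppb", "ppm")
print(round(ch4_ppm, 6))
```

A value of 1900 ppb corresponds to about 1.9 ppm, so the same measurements look very different on the axis depending on the unit chosen.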

Plotting multiple timeseries

If there are multiple results for a given search, we can retrieve all of the data at once, which returns a list of ObsData objects.

all_ch4_tac = tac_surface_search.retrieve()
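
Each ObsData object carries its own metadata, which can be used to tell the items in the list apart. The sketch below is a rough illustration that does not require an object store: FakeObsData is a hypothetical stand-in that mimics only the metadata attribute, and the metadata keys ("site", "species", "inlet") are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class FakeObsData:
    """Hypothetical stand-in for ObsData; only the metadata attribute is mimicked."""

    metadata: dict = field(default_factory=dict)


# Stand-in for the list returned by tac_surface_search.retrieve()
all_ch4_tac = [
    FakeObsData(metadata={"site": "tac", "species": "ch4", "inlet": "54m"}),
    FakeObsData(metadata={"site": "tac", "species": "ch4", "inlet": "100m"}),
    FakeObsData(metadata={"site": "tac", "species": "ch4", "inlet": "185m"}),
]

# Index the list by inlet so a single timeseries can be picked out by name
by_inlet = {obs.metadata["inlet"]: obs for obs in all_ch4_tac}
print(sorted(by_inlet))
```

With the real objects, the same dictionary comprehension should work as long as each item's metadata contains an inlet entry.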

Then we can use the plot_timeseries function from the plotting submodule to compare measurements from different inlets. This creates a Plotly plot that should be interactive and responsive, even with relatively large amounts of data.

plot_timeseries(data=all_ch4_tac, units="ppb")

3. Comparing different sites

We can easily compare data for the same species from different sites by doing a quick search to see what’s available:

ch4_data = search_surface(species="ch4")

ch4_data.results

Then we refine our search to retrieve only the sites (and inlets) that we want to compare, and make a plot:

bsd_data = ch4_data.retrieve(site="BSD")
tac_data = ch4_data.retrieve(site="TAC", inlet="54m")
plot_timeseries(data=[bsd_data, tac_data], title="Comparing CH4 measurements at Tacolneston and Bilsdale")