Searching and plotting#

In this short tutorial we’ll show how to retrieve some data and create a simple plot using one of our plotting functions.

As in the previous tutorial, we will start by setting up our temporary object store for our data. If you’ve already create your own local object store you can skip the next few steps and move onto the Searching section.

from openghg.tutorial import populate_surface_data
populate_surface_data()

Searching#

Let’s search for all the methane data from Tacolneston to do this we need to know the site code. We can see a summary of known site codes using the summary_site_codes() function

from openghg.standardise import summary_site_codes

## UNCOMMENT THIS CODE TO SHOW ALL ENTRIES
# import pandas as pd; pd.set_option('display.max_rows', None)

summary = summary_site_codes()
summary

The output of this function is a pandas DataFrame. If we wanted to filter this to include sites containing the name “Tacolneston” we could do so as follows:

site_long_name = summary["Long name"]
find_tacolneston = site_long_name.str.contains("Tacolneston")

summary[find_tacolneston]

As you can see, there will sometimes be multiple entries for a site if this is included under multiple networks.

If we wanted to see all available data associated with Tacolneston we can search for this using the site code of “TAC”.

from openghg.retrieve import search

tac_data_search = search(site="tac")

For our search we can take a look at the results property (which is a pandas DataFrame).

tac_data_search.results

To just look for the surface observations we can use the search_surface function specifically. We can also pass multiple keys to extract, for example, just the methane data:

from openghg.retrieve import search_surface

tac_surface_search = search_surface(site="TAC", species="ch4")
tac_surface_search.results

There are also equivalent search functions for other data types including search_footprints, search_emissions and search_bc.

If we want to take a look at the data from the 185m inlet we can first retrieve the data from the object store and then create a quick timeseries plot. See the `SearchResults <https://docs.openghg.org/api/api_dataobjects.html#openghg.dataobjects.SearchResults>`__ object documentation for more information.

 data_185m = tac_surface_search.retrieve(inlet="185m")

**NOTE:** the plots created below may not show up on the online
documentation version of this notebook.

We can visualise this data using the in-built plotting commands from the plotting sub-module. We can also modify the inputs to improve how this is displayed:

from openghg.plotting import plot_timeseries

plot_timeseries(data_185m, title="Methane at Tacolneston", xlabel="Time", ylabel="Conc.", units="ppm")

Plot all the data#

If there are multiple results for a given search, we can also retrieve all the data and receive a list of `ObsData <https://docs.openghg.org/api/api_dataobjects.html#openghg.dataobjects.ObsData>`__ objects.

all_ch4_tac = tac_surface_search.retrieve()

Then we can use the plot_timeseries function from the plotting submodule to compare measurements from different inlets. This creates a Plotly plot that should be interactive and and responsive, even with relatively large amounts of data.

plot_timeseries(data=all_ch4_tac, units="ppb")

Compare different sites#

We can easily compare data for the same species from different sites by doing a quick search to see what’s available

ch4_data = search_surface(species="ch4")
ch4_data

Then we refine our search to only retrieve the sites (and inlets) that we want:

ch4_data.results

We can retrieve the data we want to compare and make a plot

bsd_data = ch4_data.retrieve(site="BSD")
tac_data = ch4_data.retrieve(site="TAC", inlet="54m")
plot_timeseries(data=[bsd_data, tac_data], title="Comparing CH4 measurements at Tacolneston and Bilsdale")