Developer API

The functions and methods documented in this section make up the internals of the OpenGHG library. Because the project is at an early stage of development, they may change without warning.

Warning

Normal users should not use any of the functions shown here directly as they may be removed or their functionality may change.

modules

Base class

This provides the functionality required by all data storage and processing classes, namely the saving, retrieval and loading of data from the object store.

BaseStore

Base class from which the other core processing modules inherit

Data processing

These classes are used for the processing of data by the ObsSurface processing class.

CRANFIELD

For processing data from Cranfield

CRDS

For processing CRDS (cavity ring-down spectroscopy) data from the DECC network

EUROCOM

For processing data from the EUROCOM network

GCWERKS

For processing data in the form expected by the GCWERKS package

ICOS

For processing data from the ICOS network

NOAA

For processing data from the NOAA network

THAMESBARRIER

For processing data from the Thames Barrier measurement sites

Datasource

The Datasource is the smallest data provider within the OpenGHG topology. A Datasource represents a data provider such as an instrument measuring a specific gas at a specific height at a specific site. For an instrument measuring three gas species at an inlet height of 100m at a site we would have three Datasources.
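As a rough illustration of the one-Datasource-per-species idea described above, the sketch below enumerates the Datasources for a single instrument. The `DatasourceKey` class and `datasource_keys` function are hypothetical names used only for this example and are not part of OpenGHG; the site label and species list are placeholders.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DatasourceKey:
    """Hypothetical identifier: one per (site, inlet, species) combination."""
    site: str
    inlet: str
    species: str


def datasource_keys(site: str, inlet: str, species_list: List[str]) -> List[DatasourceKey]:
    # One Datasource for each species measured at this site and inlet height.
    return [DatasourceKey(site, inlet, species) for species in species_list]


# An instrument measuring three species at a 100m inlet gives three Datasources.
keys = datasource_keys("site_a", "100m", ["co2", "ch4", "co"])
```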

Datasource

Handles the storage of data, metadata and version information for measurements

objectstore

These functions handle the storage of data in the object store, in JSON or binary format. Each object and piece of data in the object store is stored at a specific key, which can be thought of as the address of the data. The data is stored in a bucket, which in the cloud is a section of the OpenGHG object store. Locally, a bucket is just a normal directory in the user’s filesystem, specified by the OPENGHG_PATH environment variable.
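The key-and-bucket model can be sketched for the local case as follows. This is a minimal illustration, not OpenGHG's implementation: the helper names (`local_bucket`, `put_json`, `get_json`, `key_exists`) are hypothetical, and mapping a key directly onto a relative file path inside the bucket is an assumption. Only the OPENGHG_PATH variable comes from the description above.

```python
import json
import os
from pathlib import Path
from typing import Any


def local_bucket() -> Path:
    """Return the directory acting as the local bucket (hypothetical helper)."""
    root = os.environ.get("OPENGHG_PATH")
    if root is None:
        raise RuntimeError("OPENGHG_PATH is not set")
    bucket = Path(root)
    bucket.mkdir(parents=True, exist_ok=True)
    return bucket


def put_json(bucket: Path, key: str, data: Any) -> None:
    # Assumption: the key is used as a relative path inside the bucket.
    path = bucket / key
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(data), encoding="utf-8")


def get_json(bucket: Path, key: str) -> Any:
    # Read back and decode whatever was stored at this key.
    return json.loads((bucket / key).read_text(encoding="utf-8"))


def key_exists(bucket: Path, key: str) -> bool:
    return (bucket / key).is_file()
```

With OPENGHG_PATH pointing at a writable directory, `put_json(local_bucket(), "some/key", {"a": 1})` followed by `get_json(local_bucket(), "some/key")` round-trips the dictionary.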

delete_object()

Delete an object in the store

exists()

Check if an object exists at that key

get_bucket()

Get path to bucket

get_local_bucket()

Get path to local bucket

get_object()

Get object at given key

get_object_from_json()

Get object from JSON

set_object_from_file()

Set data at a key from a given filepath

set_object_from_json()

Set data at a key from JSON

util

This module contains all the helper functions used throughout OpenGHG.

Exporting

These are used to export data to a format readable by the OpenGHG data dashboard.

to_dashboard()

Export timeseries data to JSON

to_dashboard_mobile()

Export mobile observations data to JSON

Hashing

These handle hashing of data (usually with SHA1)

hash_file()

Calculate the SHA1 hash of a file

hash_string()

Calculate the SHA1 hash of a UTF-8 encoded string
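For orientation, SHA1 hashing of a string and of a file can be done with Python's standard `hashlib` module as sketched below. The function names `sha1_of_string` and `sha1_of_file` are hypothetical stand-ins; the signatures and return types of the real `hash_string()` and `hash_file()` are not given here and may differ.

```python
import hashlib


def sha1_of_string(text: str) -> str:
    """Return the hex SHA1 digest of a string, encoded as UTF-8."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def sha1_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA1 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```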

String manipulation

String cleaning and formatting functions

clean_string()

Return a lowercase cleaned string

to_lowercase()

Converts a string to lowercase

Time

Helpers to deal with all things datetime.

timestamp_tzaware()

Create a Timestamp with a UTC timezone

timestamp_now()

Create a timezone aware timestamp for now

timestamp_epoch()

Create a timezone aware timestamp for the UNIX epoch (1970-01-01)
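The three timestamp helpers above all produce timezone-aware values in UTC. The sketch below shows the same idea using only the standard library's `datetime`, which is a substitution made to keep the example self-contained; the library's own Timestamp type may differ. The names `to_utc`, `utc_now` and `utc_epoch` are hypothetical, and treating naive datetimes as already being in UTC is an assumption.

```python
from datetime import datetime, timezone


def to_utc(value: datetime) -> datetime:
    """Return a timezone-aware datetime in UTC."""
    if value.tzinfo is None:
        # Assumption: a naive datetime is interpreted as already being in UTC.
        return value.replace(tzinfo=timezone.utc)
    # An aware datetime is converted from its own timezone to UTC.
    return value.astimezone(timezone.utc)


def utc_now() -> datetime:
    """Return the current time as a timezone-aware UTC datetime."""
    return datetime.now(timezone.utc)


def utc_epoch() -> datetime:
    """Return the UNIX epoch (1970-01-01) as a timezone-aware UTC datetime."""
    return datetime(1970, 1, 1, tzinfo=timezone.utc)
```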

daterange_from_str()

Create a daterange from two timestamp strings

daterange_to_str()

Convert a daterange to string

create_daterange_str()

Create a daterange string from two timestamps or strings

create_daterange()

Create a pandas DatetimeIndex from two timestamps

daterange_overlap()

Check if two dateranges overlap
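An overlap check between two date ranges reduces to a single comparison, sketched below with ranges represented as `(start, end)` tuples. The `ranges_overlap` name and the tuple representation are hypothetical, and treating both endpoints as inclusive is an assumption; the real `daterange_overlap()` may accept a different input form.

```python
from datetime import date
from typing import Tuple

DateRange = Tuple[date, date]


def ranges_overlap(range_a: DateRange, range_b: DateRange) -> bool:
    """Return True if the two (start, end) ranges share at least one point."""
    start_a, end_a = range_a
    start_b, end_b = range_b
    # Two ranges overlap when each one starts before (or as) the other ends.
    return start_a <= end_b and start_b <= end_a
```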

combine_dateranges()

Combine a list of dateranges
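One common way to combine date ranges is to sort them and merge any that overlap, as in the sketch below. This shows the general interval-merging technique under the assumption that ranges are `(start, end)` tuples; `merge_ranges` is a hypothetical name and the behaviour of the real `combine_dateranges()` may differ.

```python
from datetime import date
from typing import List, Tuple

DateRange = Tuple[date, date]


def merge_ranges(ranges: List[DateRange]) -> List[DateRange]:
    """Merge overlapping or touching (start, end) ranges into a sorted list."""
    merged: List[DateRange] = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous range: extend its end if needed.
            previous_start, previous_end = merged[-1]
            merged[-1] = (previous_start, max(previous_end, end))
        else:
            merged.append((start, end))
    return merged
```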

split_daterange_str()

Split a daterange string to the component start and end Timestamps

closest_daterange()

Finds the closest daterange in a list of dateranges

valid_daterange()

Check if the passed daterange is valid

find_daterange_gaps()

Find the gaps in a list of dateranges
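Finding gaps can be sketched by walking the sorted ranges and recording any stretch between the latest end seen so far and the next start. The `find_gaps` name and the `(start, end)` tuple representation are assumptions made for this illustration and do not describe the actual `find_daterange_gaps()` signature.

```python
from datetime import date
from typing import List, Tuple

DateRange = Tuple[date, date]


def find_gaps(ranges: List[DateRange]) -> List[DateRange]:
    """Return the uncovered (start, end) stretches between the given ranges."""
    ordered = sorted(ranges)
    if not ordered:
        return []
    gaps: List[DateRange] = []
    latest_end = ordered[0][1]
    for start, end in ordered[1:]:
        if start > latest_end:
            # Nothing covers the period between latest_end and this start.
            gaps.append((latest_end, start))
        latest_end = max(latest_end, end)
    return gaps
```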

trim_daterange()

Removes overlapping dates from to_trim

split_encompassed_daterange()

Checks whether one of the passed dateranges contains the other and, if so, splits the larger daterange into three sections.
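The three-section split can be pictured as the part before the inner range, the inner range itself, and the part after it. The sketch below assumes `(start, end)` tuples and shared boundary points between sections; how the real `split_encompassed_daterange()` handles boundaries and return types is not stated here, and `split_encompassing` is a hypothetical name.

```python
from datetime import date
from typing import List, Tuple

DateRange = Tuple[date, date]


def split_encompassing(outer: DateRange, inner: DateRange) -> List[DateRange]:
    """Split outer into three sections: before inner, inner, and after inner."""
    outer_start, outer_end = outer
    inner_start, inner_end = inner
    if not (outer_start <= inner_start and inner_end <= outer_end):
        raise ValueError("outer range does not contain inner range")
    # Assumption: adjacent sections share their boundary point.
    return [
        (outer_start, inner_start),
        (inner_start, inner_end),
        (inner_end, outer_end),
    ]
```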

daterange_contains()

Checks if one daterange contains another

sanitise_daterange()

Make sure the daterange is correct and return tzaware daterange.

check_nan()

Check if the given value is NaN; if so, return an NA string

check_date()

Check if the passed string is a valid date; if not, return NA
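The two checks above follow the same pattern: return the value if it is usable, otherwise return an "NA" string. The sketch below illustrates that pattern with hypothetical names (`nan_to_na`, `date_or_na`). The ISO-style `%Y-%m-%d` date format is an assumption made for the example; the formats accepted by the real `check_date()` are not specified here.

```python
import math
from datetime import datetime
from typing import Any


def nan_to_na(value: Any) -> Any:
    """Return "NA" for a float NaN, otherwise return the value unchanged."""
    if isinstance(value, float) and math.isnan(value):
        return "NA"
    return value


def date_or_na(text: Any, fmt: str = "%Y-%m-%d") -> Any:
    """Return text if it parses as a date in the given format, otherwise "NA"."""
    try:
        datetime.strptime(text, fmt)
    except (TypeError, ValueError):
        # TypeError covers non-string input, ValueError covers unparseable text.
        return "NA"
    return text
```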

Iteration

Our own personal itertools

pairwise()

Return a zip of (a, b) pairs, where a is the iterable and b is the same iterable advanced one step.

unanimous()

Checks that all values in an iterable object are the same
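Both iteration helpers can be sketched with the standard library. The `pairwise` version below follows the well-known `itertools.tee` recipe; the `unanimous` version accepts any iterable and treats an empty iterable as unanimous, both of which are assumptions, since the input type expected by the real function is not stated here.

```python
from itertools import tee
from typing import Any, Iterable, Iterator, Tuple


def pairwise(iterable: Iterable[Any]) -> Iterator[Tuple[Any, Any]]:
    """Yield consecutive pairs: (s0, s1), (s1, s2), ..."""
    a, b = tee(iterable)
    # Advance the second iterator one step so it runs ahead of the first.
    next(b, None)
    return zip(a, b)


def unanimous(values: Iterable[Any]) -> bool:
    """Return True if every value in the iterable equals the first one."""
    iterator = iter(values)
    try:
        first = next(iterator)
    except StopIteration:
        # Assumption: an empty iterable counts as unanimous.
        return True
    return all(value == first for value in iterator)
```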

Site Checks

These perform checks to ensure data processed for each site is correct

valid_site()

Check if the passed site is a valid one

multiple_inlets()

Check if the passed site has more than one inlet