Data objects#

DataManager#

This dataclass is used to modify metadata stored in Datasource objects and the metadata store. DataManager instances are created by the data_manager function.

class openghg.dataobjects.DataManager(metadata, store)[source]#
__init__(metadata, store)[source]#
__str__()[source]#

Return str(self).

Return type:

str

delete_datasource(uuid)[source]#

Delete Datasource(s) in the object store. At the moment we only support deleting the complete Datasource.

NOTE: Make sure you really want to delete the Datasource(s)

Parameters:

uuid (Union[List, str]) – UUID(s) of objects to delete

Return type:

None

Returns:

None

refresh()[source]#

Force refresh the internal metadata store with data from the object store.

Return type:

None

Returns:

None

restore(uuid, version='latest')[source]#

Restore a backed-up version of a Datasource’s metadata.

Parameters:
  • uuid (str) – UUID of Datasource to retrieve

  • version (Union[str, int]) – Version of metadata to restore

Return type:

None

Returns:

None
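The `version` parameter accepts either the string "latest" or a version number. The sketch below illustrates one way such a lookup could work. It is not the OpenGHG implementation: the `backups` layout (version number as a string mapped to a metadata snapshot), the `select_version` helper, and the rule that the highest number is the latest are all assumptions made for illustration.

```python
from typing import Any, Dict, Union

# Hypothetical backup layout: version number (as a string) -> metadata snapshot.
backups: Dict[str, Dict[str, Any]] = {
    "1": {"site": "tac", "species": "co2"},
    "2": {"site": "tac", "species": "ch4"},
    "10": {"site": "tac", "species": "n2o"},
}


def select_version(
    versions: Dict[str, Dict[str, Any]], version: Union[str, int] = "latest"
) -> Dict[str, Any]:
    """Hypothetical helper: pick one snapshot from versioned metadata."""
    if version == "latest":
        # Assumption: the numerically highest version is the most recent backup.
        key = max(versions, key=int)
    else:
        # Accept both 2 and "2" by normalising to the string key.
        key = str(version)
    # Return a copy so the stored backup is not modified by the caller.
    return dict(versions[key])


print(select_version(backups))
print(select_version(backups, 1))
```

Comparing the keys as integers rather than strings matters here: a plain string comparison would rank "2" above "10".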

update_metadata(uuid, to_update=None, to_delete=None)[source]#

Update the metadata associated with data. This takes UUIDs of Datasources and updates the associated metadata. To update metadata pass in a dictionary of key/value pairs to update. To delete metadata pass in a list of keys to delete.

Parameters:
  • uuid (Union[List, str]) – UUID(s) of Datasources to be updated.

  • to_update (Optional[Dict]) – Dictionary of metadata to add/update. New key/value pairs will be added. If the key already exists in the metadata the value will be updated.

  • to_delete (Union[str, List, None]) – Key(s) to delete from the metadata

Return type:

None

Returns:

None
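The add/update/delete behaviour described above can be sketched on a plain dictionary. This is a minimal illustration, not the OpenGHG implementation: `apply_metadata_changes` is a hypothetical helper, the example keys and values are made up, and ignoring a key that is not present is an assumption.

```python
from typing import Any, Dict, List, Optional, Union


def apply_metadata_changes(
    metadata: Dict[str, Any],
    to_update: Optional[Dict[str, Any]] = None,
    to_delete: Union[str, List[str], None] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: return a copy of `metadata` with changes applied."""
    updated = dict(metadata)  # work on a copy so the input is left untouched

    if to_update is not None:
        # New keys are added; existing keys have their values overwritten.
        updated.update(to_update)

    if to_delete is not None:
        # Accept a single key or a list of keys, as the parameter type suggests.
        keys = [to_delete] if isinstance(to_delete, str) else list(to_delete)
        for key in keys:
            # Assumption: a key that is not present is ignored.
            updated.pop(key, None)

    return updated


original = {"site": "tac", "species": "co2", "inlet": "100m"}
changed = apply_metadata_changes(
    original,
    to_update={"species": "ch4", "network": "decc"},
    to_delete="inlet",
)
print(changed)
```

Here `species` is overwritten, `network` is added and `inlet` is removed, while `original` is left unchanged.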

view_backup(uuid=None, version=None)[source]#

View backed-up metadata for all Datasources or a single Datasource if a UUID is passed in.

Parameters:

uuid (Optional[str]) – UUID of Datasource

Returns:

Dictionary of versioned metadata

Return type:

dict

SearchResults#

This dataclass is returned by the OpenGHG search functions and allows easy retrieval and querying of metadata retrieved by the search function.

class openghg.dataobjects.SearchResults(metadata=None, start_result=None, start_date=None, end_date=None)[source]#

This class is used to return data from the search function. It has member functions to retrieve data from the object store.

Parameters:
  • keys – Dictionary of keys keyed by Datasource UUID

  • metadata (Optional[Dict]) – Dictionary of metadata keyed by Datasource UUID

  • start_result (Optional[str]) –

    ?

__init__(metadata=None, start_result=None, start_date=None, end_date=None)[source]#
__repr__()[source]#

Return repr(self).

Return type:

str

__str__()[source]#

Return str(self).

Return type:

str

static df_to_table_console_output(df)[source]#

Process the DataFrame and display it as a formatted table in the console.

Parameters:

df (DataFrame) – The DataFrame to be processed and displayed.

Return type:

None

Returns:

None

classmethod from_json(data)[source]#

Create a SearchResults object from serialised JSON (str or bytes)

Parameters:

data (Union[bytes, str]) – Serialised object

Returns:

SearchResults object

Return type:

SearchResults

retrieve(dataframe=None, version='latest', sort=True, **kwargs)[source]#

Retrieve data from object store using a filtered pandas DataFrame

Parameters:
  • dataframe (Optional[DataFrame]) – pandas DataFrame

  • version (str) – Version of data requested from Datasource. Default = “latest”.

  • sort (bool) – Sort data by time in retrieved Dataset

  • **kwargs (Any) – Metadata values to search for

Returns:

ObsData object(s)

Return type:

ObsData / List[ObsData]
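Selecting results by keyword amounts to matching metadata values for each Datasource UUID. The sketch below shows that idea on plain dictionaries. It is not how OpenGHG performs the lookup: `matching_uuids` is a hypothetical helper, the UUIDs and metadata values are made up, and exact equality on every keyword is an assumption.

```python
from typing import Any, Dict, List


def matching_uuids(metadata: Dict[str, Dict[str, Any]], **kwargs: Any) -> List[str]:
    """Hypothetical helper: UUIDs whose metadata matches every keyword value."""
    return [
        uuid
        for uuid, meta in metadata.items()
        # A record matches only if all requested key/value pairs are equal.
        if all(meta.get(key) == value for key, value in kwargs.items())
    ]


# Made-up metadata keyed by Datasource UUID, as in the class description.
search_metadata = {
    "uuid-1": {"site": "tac", "species": "co2", "inlet": "100m"},
    "uuid-2": {"site": "tac", "species": "ch4", "inlet": "100m"},
    "uuid-3": {"site": "bsd", "species": "co2", "inlet": "248m"},
}

print(matching_uuids(search_metadata, site="tac"))
print(matching_uuids(search_metadata, species="co2", inlet="248m"))
```

A filter of this kind can match one Datasource or several, which is consistent with the documented return type being either a single object or a list.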

retrieve_all(version='latest', sort=True)[source]#

Retrieves all data found during the search

Parameters:
  • version (str) – Version of data requested from Datasource. Default = “latest”.

  • sort (bool) – Sort by time. Note that this may be very memory hungry for large Datasets.

Returns:

ObsData object(s)

Return type:

ObsData / List[ObsData]

to_data()[source]#

Convert this object to a dictionary for JSON serialisation

Returns:

Dictionary of data

Return type:

dict

to_json()[source]#

Serialises the object to JSON

Returns:

JSON str

Return type:

str

uuids()[source]#

Return the UUIDs of the found data

Returns:

List of UUIDs

Return type:

list
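The serialisation methods above form a round trip: `to_data` builds a dictionary, `to_json` serialises it, and `from_json` rebuilds the object. The sketch below shows that pattern with a stand-in class. `ResultsSketch` is hypothetical and holds only a `metadata` dictionary; the fields that the real class serialises are not shown in this reference and are not assumed here.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List, Union


@dataclass
class ResultsSketch:
    """Hypothetical stand-in for illustration; not the OpenGHG class."""

    metadata: Dict[str, Dict[str, Any]] = field(default_factory=dict)

    def to_data(self) -> Dict[str, Any]:
        # Plain dictionary that the json module can serialise directly.
        return {"metadata": self.metadata}

    def to_json(self) -> str:
        return json.dumps(self.to_data())

    @classmethod
    def from_json(cls, data: Union[bytes, str]) -> "ResultsSketch":
        # json.loads accepts both str and UTF-8 encoded bytes.
        loaded = json.loads(data)
        return cls(metadata=loaded["metadata"])

    def uuids(self) -> List[str]:
        # The metadata is keyed by UUID, so the keys are the UUIDs.
        return list(self.metadata.keys())


results = ResultsSketch(metadata={"uuid-1": {"site": "tac", "species": "co2"}})
serialised = results.to_json()
restored = ResultsSketch.from_json(serialised)
print(restored.uuids())
```

Because the stand-in is a dataclass, `restored == results` compares field values, which gives a simple check that nothing was lost in the round trip.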

ObsData#

This dataclass is returned by data retrieval functions such as get_obs_surface and the SearchResults retrieve function.

class openghg.dataobjects.ObsData(metadata, data=None, uuid=None, version=None, start_date=None, end_date=None, sort=True, elevate_inlet=False, attrs_to_check=None)[source]#

This class is used to return observations data. It can be created with a preloaded xarray Dataset or with a UUID and version number to retrieve data from the Datasource's zarr store.

__eq__(other)[source]#

Return self==value.

Return type:

bool

__getitem__(key)[source]#

Returns the data attribute (xarray Dataset) when the site name is specified. Included as a compatibility layer for the legacy format, which was a dictionary containing a Dataset for each site code.

Parameters:

key (str) – Site code

Return type:

Any

__hash__ = None#
__iter__()[source]#

Returns site code as the key for the dictionary as would be expected.

Return type:

Iterator

__len__()[source]#

Returns number of key values (fixed at 1 at present)

Return type:

int
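Together, `__getitem__`, `__iter__` and `__len__` let a single-site object be used like the legacy dictionary keyed by site code. The sketch below shows the pattern with a stand-in class. `SingleSiteSketch` is hypothetical, and storing the site code under `metadata["site"]`, requiring an exact match, and raising `KeyError` for any other key are assumptions, not confirmed OpenGHG behaviour.

```python
from typing import Any, Dict, Iterator


class SingleSiteSketch:
    """Hypothetical stand-in for a single-site data object; not the OpenGHG class."""

    def __init__(self, metadata: Dict[str, Any], data: Any) -> None:
        self.metadata = metadata
        self.data = data

    def __getitem__(self, key: str) -> Any:
        # Assumption: the site code is stored under metadata["site"].
        if key == self.metadata["site"]:
            return self.data
        raise KeyError(key)

    def __iter__(self) -> Iterator[str]:
        # Yield the single site code, mimicking iteration over dictionary keys.
        yield self.metadata["site"]

    def __len__(self) -> int:
        # Only one site is held, so the length is fixed at 1.
        return 1


# A list stands in for the xarray Dataset to keep the sketch dependency-free.
obs = SingleSiteSketch(metadata={"site": "tac"}, data=[410.2, 411.0])

for site in obs:
    print(site, obs[site])
```

Code written for the legacy format, which loops over site codes and indexes by each one, keeps working against an object of this shape.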

plot_timeseries(title=None, xlabel=None, ylabel=None, units=None, logo=True)[source]#

Plot a timeseries

Return type:

Figure

FluxData#

This dataclass is used to return flux/emissions data from the get_flux function

class openghg.dataobjects.FluxData(metadata, data=None, uuid=None, version=None, start_date=None, end_date=None, sort=True, elevate_inlet=False, attrs_to_check=None)[source]#

This class is used to return flux/emissions data from the get_flux function

Parameters:
  • data (Optional[Dataset]) – xarray Dataset

  • metadata (Dict) – Dictionary of metadata including model run parameters

__str__()[source]#

Return str(self).

Return type:

str

ObsColumnData#

This dataclass is used to return observations data from the get_obs_column function

class openghg.dataobjects.ObsColumnData(metadata, data=None, uuid=None, version=None, start_date=None, end_date=None, sort=True, elevate_inlet=False, attrs_to_check=None)[source]#

This class is used to return observations data from the get_obs_column function

Parameters:
  • data (Optional[Dataset]) – xarray Dataset

  • metadata (Dict) – Dictionary of metadata including model run parameters

__str__()[source]#

Return str(self).

Return type:

str

FootprintData#

This dataclass is used to return footprint data from the get_footprint function

class openghg.dataobjects.FootprintData(metadata, data=None, uuid=None, version=None, start_date=None, end_date=None, sort=True, elevate_inlet=False, attrs_to_check=None)[source]#

This class is used to return footprint data from the get_footprint function

Parameters:
  • data (Optional[Dataset]) – xarray Dataset

  • metadata (Dict) – Dictionary of metadata including model run parameters

__str__()[source]#

Return str(self).

Return type:

str

BoundaryConditionsData#

This dataclass is used to return boundary conditions data from the get_bc function

class openghg.dataobjects.BoundaryConditionsData(metadata, data=None, uuid=None, version=None, start_date=None, end_date=None, sort=True, elevate_inlet=False, attrs_to_check=None)[source]#

This class is used to return boundary conditions data from the get_bc function

Parameters:
  • data (Optional[Dataset]) – xarray Dataset

  • metadata (Dict) – Dictionary of metadata including model run parameters

__str__()[source]#

Return str(self).

Return type:

str