Data objects#
_BaseData#
The base dataclass inherited by (most of) the dataclasses below.
DataManager#
This dataclass is used to modify metadata stored in Datasource objects and the metadata store. DataManager instances are created by the data_manager function.
- class openghg.dataobjects.DataManager(metadata, store)[source]#
- delete_datasource(uuid)[source]#
Delete Datasource(s) in the object store. At the moment we only support deleting the complete Datasource.
NOTE: Make sure you really want to delete the Datasource(s)
- Parameters:
uuid (list | str) – UUID(s) of objects to delete
- Return type:
None
- Returns:
None
- refresh()[source]#
Force refresh the internal metadata store with data from the object store.
- Return type:
None
- Returns:
None
- restore(uuid, version='latest')[source]#
Restore a backed-up version of a Datasource’s metadata.
- Parameters:
uuid (str) – UUID of Datasource to retrieve
version (str | int) – Version of metadata to restore
- Return type:
None
- Returns:
None
- update_metadata(uuid, to_update=None, to_delete=None)[source]#
Update the metadata associated with data. This takes UUIDs of Datasources and updates the associated metadata. To update metadata pass in a dictionary of key/value pairs to update. To delete metadata pass in a list of keys to delete.
- Parameters:
uuid (list | str) – UUID(s) of Datasources to be updated.
to_update (Optional[dict]) – Dictionary of metadata to add/update. New key/value pairs will be added. If the key already exists in the metadata, the value will be updated.
to_delete (Union[str, list, None]) – Key(s) to delete from the metadata
- Return type:
None
- Returns:
None
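The add/update/delete behaviour described for update_metadata can be sketched on a plain dictionary. This is a minimal illustration only, not OpenGHG's implementation: the helper name apply_metadata_changes is hypothetical, and the choices to apply updates before deletions and to ignore missing keys silently are assumptions made for the sketch.

```python
from typing import Optional, Union


def apply_metadata_changes(
    metadata: dict,
    to_update: Optional[dict] = None,
    to_delete: Union[str, list, None] = None,
) -> dict:
    """Hypothetical helper: return a copy of `metadata` with the changes applied."""
    updated = dict(metadata)
    if to_update:
        # New keys are added; existing keys have their values overwritten.
        updated.update(to_update)
    if to_delete is not None:
        # Accept either a single key or a list of keys.
        keys = [to_delete] if isinstance(to_delete, str) else list(to_delete)
        for key in keys:
            # Assumption: keys that are not present are ignored silently.
            updated.pop(key, None)
    return updated


record = {"site": "tac", "species": "co2", "inlet": "100m"}
changed = apply_metadata_changes(
    record,
    to_update={"species": "ch4", "project": "demo"},
    to_delete="inlet",
)
print(changed)
```

The sketch works on a copy so the original record is left untouched; the real method returns None and acts on the stored metadata instead.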
SearchResults#
This dataclass is returned by the OpenGHG search functions and allows easy retrieval and querying of metadata retrieved by the search function.
- class openghg.dataobjects.SearchResults(metadata=None, start_result=None, start_date=None, end_date=None)[source]#
This class is used to return data from the search function. It has member functions to retrieve data from the object store.
- Parameters:
keys – Dictionary of keys keyed by Datasource UUID
metadata (Optional[dict]) – Dictionary of metadata keyed by Datasource UUID
start_result (Optional[str]) – ?
- static df_to_table_console_output(df)[source]#
Process the DataFrame and display it as a formatted table in the console.
- Parameters:
df (DataFrame) – The DataFrame to be processed and displayed.
- Return type:
None
- Returns:
None
- retrieve(dataframe=None, version='latest', sort=True, **kwargs)[source]#
Retrieve data from the object store using a filtered pandas DataFrame.
- Parameters:
dataframe (Optional[DataFrame]) – pandas DataFrame
version (str) – Version of data requested from Datasource. Default = “latest”.
sort (bool) – Sort data by time in retrieved Dataset
**kwargs (Any) – Metadata values to search for
- Returns:
ObsData object(s)
- Return type:
ObsData / List[ObsData]
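To illustrate how keyword arguments can narrow a set of search results, the sketch below filters a metadata dictionary keyed by Datasource UUID. It is a simplified stand-in, not the library's code: the helper matching_uuids, the example UUIDs and the exact-match comparison are all assumptions, and plain dictionaries are used in place of the pandas DataFrame that retrieve works with.

```python
from typing import Any


def matching_uuids(metadata: dict, **kwargs: Any) -> list:
    """Hypothetical helper: UUIDs whose metadata matches every keyword given."""
    return [
        uuid
        for uuid, meta in metadata.items()
        # Assumption: a record matches only if every key/value pair is equal.
        if all(meta.get(key) == value for key, value in kwargs.items())
    ]


# Made-up metadata keyed by (made-up) Datasource UUIDs.
search_metadata = {
    "uuid-1": {"site": "tac", "species": "co2", "inlet": "100m"},
    "uuid-2": {"site": "tac", "species": "ch4", "inlet": "100m"},
    "uuid-3": {"site": "bsd", "species": "co2", "inlet": "248m"},
}

selected = matching_uuids(search_metadata, site="tac", species="co2")
print(selected)
```

With no keyword arguments every UUID is returned, which mirrors the idea of retrieving everything found by the search.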
- retrieve_all(version='latest', sort=True)[source]#
Retrieves all data found during the search
- Parameters:
version (str) – Version of data requested from Datasource. Default = “latest”.
sort (bool) – Sort by time. Note that this may be very memory hungry for large Datasets.
- Returns:
ObsData object(s)
- Return type:
ObsData / List[ObsData]
ObsData#
This dataclass is returned by data retrieval functions such as get_obs_surface and the SearchResults retrieve function.
- class openghg.dataobjects.ObsData(metadata, data=None, uuid=None, version=None, start_date=None, end_date=None, sort=True, elevate_inlet=False, attrs_to_check=None)[source]#
This class is used to return observations data. It can be created with a preloaded xarray Dataset or with a UUID and version number to retrieve data from the Datasource zarr store.
- __getitem__(key)[source]#
Returns the data attribute (xarray Dataset) when the site name is specified. Included as a compatibility layer for the legacy format, a dictionary containing a Dataset for each site code.
- Parameters:
key (str) – Site code
- Return type:
Any
- __hash__ = None#
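The two members above can be illustrated with a small stand-in dataclass. This is a sketch under stated assumptions rather than the ObsData source: the class name SiteData is hypothetical, a list replaces the xarray Dataset, and the case-insensitive comparison against metadata["site"] with a KeyError on mismatch is assumed behaviour. The unhashable part is standard Python: a dataclass that defines equality and is not frozen has its __hash__ set to None.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class SiteData:
    """Hypothetical stand-in for a data object holding one dataset."""

    metadata: dict
    data: Any = None

    def __getitem__(self, key: str) -> Any:
        # Legacy callers indexed a dictionary by site code. Here the key is
        # checked against the stored site (assumed behaviour for this sketch).
        site = str(self.metadata["site"]).lower()
        if key.lower() != site:
            raise KeyError(f"Site '{key}' does not match expected site '{site}'")
        return self.data


obs = SiteData(metadata={"site": "tac"}, data=[1.0, 2.0, 3.0])
legacy_view = obs["TAC"]  # same object as obs.data
is_hashable = SiteData.__hash__ is not None
print(legacy_view, is_hashable)
```

Because instances are unhashable, they cannot be used as dictionary keys or set members.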
FluxData#
This dataclass is used to return flux/emissions data from the get_flux function.
- class openghg.dataobjects.FluxData(metadata, data=None, uuid=None, version=None, start_date=None, end_date=None, sort=True, elevate_inlet=False, attrs_to_check=None)[source]#
This class is used to return flux/emissions data from the get_flux function
- Parameters:
data (Optional[Dataset]) – xarray Dataset
metadata (dict) – Dictionary of metadata including model run parameters
ObsColumnData#
This dataclass is used to return observations data from the get_obs_column function.
- class openghg.dataobjects.ObsColumnData(metadata, data=None, uuid=None, version=None, start_date=None, end_date=None, sort=True, elevate_inlet=False, attrs_to_check=None)[source]#
This class is used to return observations data from the get_obs_column function
- Parameters:
data (Optional[Dataset]) – xarray Dataset
metadata (dict) – Dictionary of metadata including model run parameters
FootprintData#
This dataclass is used to return footprint data from the get_footprint function.
- class openghg.dataobjects.FootprintData(metadata, data=None, uuid=None, version=None, start_date=None, end_date=None, sort=True, elevate_inlet=False, attrs_to_check=None)[source]#
This class is used to return footprint data from the get_footprint function
- Parameters:
data (Optional[Dataset]) – xarray Dataset
metadata (dict) – Dictionary of metadata including model run parameters