Transform

Functions that convert data from underlying databases or model outputs into the standardised OpenGHG format. For example, this could include creating a Flux file for a limited domain based on data from the EDGAR database. In contrast to standardisation functions, these usually involve some amount of transformation, such as selection and/or regridding.

Regridding

openghg.transform.regrid_uniform_cc(data, lat_out, lon_out, lat_in=None, lon_in=None, latlon=None, method='conservative')

Regrid data between two uniform, cell-centred grids. All coordinates (lat_out, lon_out, lat_in, lon_in) should give the centre of each cell, in degrees.

Adapted from code written by @DTHoare

Parameters:
  • data (TypeVar(ArrayLikeMatch, ndarray, DataArray)) – Data to be regridded. Data must have dimensions (lat, lon) if 2D or (time, lat, lon) if 3D.

  • lat_out (Union[ndarray, DataArray]) – 1D array for output latitude grid

  • lon_out (Union[ndarray, DataArray]) – 1D array for output longitude grid

  • lat_in (Union[ndarray, DataArray, None]) – 1D array for input latitude grid. Only used if data is a numpy array and not a DataArray

  • lon_in (Union[ndarray, DataArray, None]) – 1D array for input longitude grid. Only used if data is a numpy array and not a DataArray

  • latlon (Optional[list]) – Names for latitude and longitude coordinates within data. Default = [“lat”, “lon”]

  • method (str) – Method to use for regridding. The main options are “conservative” and “conservative_normed” (ignores NaN values). See the xesmf documentation for the full list of options.

Returns:

Regridded data using specified method

Return type:

ndarray / DataArray
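
To illustrate what a conservative method does on uniform, cell-centred grids, the following is a minimal 1D sketch in plain Python. It is not the xesmf algorithm or the OpenGHG implementation: the helper names (cell_edges, overlap_weights, regrid_1d) are hypothetical, and the sketch ignores the latitude dependence of cell area that a regridder on a sphere has to account for. Each output value is the overlap-weighted mean of the input cells it covers, so the integral of the field is preserved when both grids span the same extent.

```python
def cell_edges(centres):
    """Return cell edges for a uniform, cell-centred 1D grid (needs >= 2 centres)."""
    step = centres[1] - centres[0]
    return [c - step / 2 for c in centres] + [centres[-1] + step / 2]


def overlap_weights(centres_in, centres_out):
    """Fraction of each output cell covered by each input cell.

    Rows correspond to output cells, columns to input cells.
    """
    edges_in = cell_edges(centres_in)
    edges_out = cell_edges(centres_out)
    weights = []
    for j in range(len(centres_out)):
        lo_out, hi_out = edges_out[j], edges_out[j + 1]
        row = []
        for i in range(len(centres_in)):
            lo_in, hi_in = edges_in[i], edges_in[i + 1]
            # Length of the intersection of the two cells (zero if disjoint).
            overlap = max(0.0, min(hi_out, hi_in) - max(lo_out, lo_in))
            row.append(overlap / (hi_out - lo_out))
        weights.append(row)
    return weights


def regrid_1d(values, centres_in, centres_out):
    """Overlap-weighted mean of the input values for each output cell."""
    weights = overlap_weights(centres_in, centres_out)
    return [sum(w * v for w, v in zip(row, values)) for row in weights]


# Four 1-degree cells regridded onto two 2-degree cells over the same extent.
fine_centres = [0.5, 1.5, 2.5, 3.5]
coarse_centres = [1.0, 3.0]
fine_values = [1.0, 3.0, 5.0, 7.0]

coarse_values = regrid_1d(fine_values, fine_centres, coarse_centres)
print(coarse_values)  # [2.0, 6.0]
```

In this example the input integral (values times the 1-degree cell width) is 16, and the output integral (values times the 2-degree cell width) is also 16.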

Emissions

Transform emissions data

openghg.transform.emissions.parse_edgar(datapath, date, species=None, domain=None, lat_out=None, lon_out=None, edgar_version=None)

Read and parse input EDGAR data.

Notes: Currently only accepts annual 2D grid maps in netCDF (.nc) format. Monthly data is not yet supported.

EDGAR data is global on a 0.1° x 0.1° grid. This function allows products to be created for a given year which cover specific regions and match the OpenGHG data schema, including units and coordinate names.

Region information can be specified as follows:
  • To use a pre-defined domain, use the domain keyword only.

  • To define a new domain, use the domain, lat_out and lon_out keywords.

  • If no domain or lat_out, lon_out data is supplied, the global EDGAR data will be added, labelled with the “globaledgar” domain.

Pre-existing domains are defined within the openghg_defs “domain_info.json” file.
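
The three region cases can be summarised in a short sketch. The helper resolve_region below is hypothetical and not part of OpenGHG; it only classifies which case a given combination of keywords falls into. The documentation does not say what happens when lat_out or lon_out is given without the other, or without a domain name, so raising a ValueError in those cases is an assumption.

```python
def resolve_region(domain=None, lat_out=None, lon_out=None):
    """Classify a region request into one of the three documented cases.

    Hypothetical helper for illustration only. Returns (domain_label, kind),
    where kind is "predefined", "new" or "global".
    """
    grid_given = lat_out is not None or lon_out is not None

    # Assumption: a new grid needs both coordinate arrays.
    if grid_given and (lat_out is None or lon_out is None):
        raise ValueError("lat_out and lon_out must be supplied together")

    if domain is None:
        # Assumption: a new grid without a domain name is rejected.
        if grid_given:
            raise ValueError("a new grid needs a domain name")
        # No region information at all: the global EDGAR data is used.
        return ("globaledgar", "global")

    # Domain name only: pre-defined domain. Domain plus grid: new domain.
    return (domain, "new" if grid_given else "predefined")


print(resolve_region())
print(resolve_region(domain="europe"))
# "mydomain" and its coordinates are made-up values for illustration.
print(resolve_region(domain="mydomain", lat_out=[50.0, 51.0], lon_out=[0.0, 1.0]))
```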

Metadata will also be added to the stored data including:
  • “domain”: domain (e.g. “europe”) OR “globaledgar”

  • “source”: “anthro” (for “TOTAL”), source name from file otherwise

  • “database”: “EDGAR”

  • “database_version”: edgar_version (e.g. “v60”, “v50”, “v432”)
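
As a rough picture of the metadata listed above, the sketch below builds an equivalent dictionary. The function build_metadata is hypothetical and not part of OpenGHG; passing non-“TOTAL” source names through unchanged is an assumption, and the stored data may carry further keys beyond these four.

```python
def build_metadata(domain, source, edgar_version):
    """Assemble the four documented metadata keys (illustrative sketch only)."""
    return {
        # Fall back to the global label when no domain was requested.
        "domain": domain if domain is not None else "globaledgar",
        # "TOTAL" is stored as "anthro"; other names are kept as given (assumption).
        "source": "anthro" if source == "TOTAL" else source,
        "database": "EDGAR",
        "database_version": edgar_version,
    }


metadata = build_metadata(domain="europe", source="TOTAL", edgar_version="v60")
print(metadata)
```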

Parameters:
  • datapath (Path) – Path to data folder or zip archive for EDGAR data

  • date (str) – Year to extract. Expect a string of the form “YYYY”

  • species (Optional[str]) – Species name being extracted

  • domain (Optional[str]) – Domain name for new or pre-existing domain

  • lat_out (Union[ndarray, DataArray, None]) – Latitude values for new domain

  • lon_out (Union[ndarray, DataArray, None]) – Longitude values for new domain

  • edgar_version (Optional[str]) – EDGAR version of the file. Will be inferred if not supplied.

Returns:

Dictionary of data

Return type:

dict

TODO: Allow date range to be extracted rather than year?

TODO: Add monthly parsing and sector stacking options