Analyse

The ModelScenario class allows users to collate related data sources and calculate modelled output based on this data. The types of data currently included are:

  • Timeseries observation data (ObsData)

  • Fixed domain sensitivity maps known as footprints (FootprintData)

  • Fixed domain flux maps (FluxData); multiple maps can be included and referenced by source name

  • Fixed domain vertical curtains at the boundaries, referred to as boundary conditions (BoundaryConditionsData)

class openghg.analyse.ModelScenario(site=None, species=None, inlet=None, height=None, network=None, domain=None, model=None, met_model=None, fp_inlet=None, source=None, sources=None, bc_input=None, start_date=None, end_date=None, obs=None, footprint=None, flux=None, bc=None, store=None)

This class stores observation data together with ancillary data and allows operations that combine these inputs.

__init__(site=None, species=None, inlet=None, height=None, network=None, domain=None, model=None, met_model=None, fp_inlet=None, source=None, sources=None, bc_input=None, start_date=None, end_date=None, obs=None, footprint=None, flux=None, bc=None, store=None)

Create a ModelScenario instance based on a set of keywords or on directly supplied objects. The instance can also be created empty and populated later.

The keywords are related to observation, footprint and flux data which may be available within the object store. The combination of these supplied will be used to extract the relevant data. Related keywords are as follows:

  • Observation data: site, species, inlet, network, start_date, end_date

  • Footprint data: site, inlet, domain, model, met_model, species, start_date, end_date

  • Flux data: species, sources, domain, start_date, end_date
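As a rough illustration of how the keyword groups above relate to each data type, the sketch below sorts a set of supplied keywords into the three groups. The helper split_keywords and the source name "waste" are hypothetical and not part of OpenGHG; only the keyword names are taken from the list above.

```python
# Keyword groups copied from the list above. The helper is a hypothetical
# illustration and is not part of OpenGHG.
RELATED_KEYWORDS = {
    "obs": ("site", "species", "inlet", "network", "start_date", "end_date"),
    "footprint": ("site", "inlet", "domain", "model", "met_model",
                  "species", "start_date", "end_date"),
    "flux": ("species", "sources", "domain", "start_date", "end_date"),
}


def split_keywords(**keywords):
    """Return, per data type, the supplied keywords that relate to it."""
    supplied = {key: value for key, value in keywords.items() if value is not None}
    return {
        data_type: {key: supplied[key] for key in names if key in supplied}
        for data_type, names in RELATED_KEYWORDS.items()
    }


# "waste" is an assumed source name used only for this example.
groups = split_keywords(site="TAC", species="ch4", inlet="10m",
                        domain="EUROPE", sources="waste")
print(groups["flux"])
```

The same keywords would be passed directly to ModelScenario(...) when OpenGHG and a populated object store are available.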

Parameters:
  • site (Optional[str]) – Site code e.g. “TAC”

  • species (Optional[str]) – Species code e.g. “ch4”

  • inlet (Optional[str]) – Inlet value e.g. “10m”

  • height (Optional[str]) – Alias for inlet.

  • network (Optional[str]) – Network name e.g. “AGAGE”

  • domain (Optional[str]) – Domain name e.g. “EUROPE”

  • model (Optional[str]) – Model name used in creation of footprint e.g. “NAME”

  • met_model (Optional[str]) – Name of met model used in creation of footprint e.g. “UKV”

  • fp_inlet (Union[str, list, None]) – Footprint release height option(s) to use if these do not match the site inlet value.

  • sources (Union[str, Sequence, None]) – Emissions sources

  • bc_input (Optional[str]) – Input keyword for boundary conditions e.g. “mozart” or “cams”

  • start_date (Union[str, Timestamp, None]) – Start of date range to use. Note that this may not be applied to flux data.

  • end_date (Union[str, Timestamp, None]) – End of date range to use. Note that this may not be applied to flux data.

  • obs (Optional[ObsData]) – Supply ObsData object directly (e.g. from get_obs…() functions)

  • footprint (Optional[FootprintData]) – Supply FootprintData object directly (e.g. from get_footprint() function)

  • flux (Union[FluxData, dict[str, FluxData], None]) – Supply FluxData object directly (e.g. from get_flux() function)

  • store (Optional[str]) – Name of object store to retrieve data from.

Returns:

None

Sets up instance of class with associated values.

TODO: For obs, footprint, flux should we also allow Dataset input and turn these into the appropriate class?

add_bc(species=None, bc_input=None, domain=None, start_date=None, end_date=None, bc=None, store=None)

Add boundary conditions data based on keywords or direct BoundaryConditionsData object.

Return type:

None

add_flux(species=None, domain=None, source=None, sources=None, start_date=None, end_date=None, flux=None, store=None)

Add flux data based on keywords or direct FluxData object. Can add flux datasets for multiple sources.

Return type:

None

add_footprint(site=None, inlet=None, height=None, domain=None, model=None, met_model=None, start_date=None, end_date=None, species=None, fp_inlet=None, network=None, footprint=None, store=None)

Add footprint data based on keywords or direct FootprintData object.

Return type:

None

add_obs(site=None, species=None, inlet=None, height=None, network=None, start_date=None, end_date=None, obs=None, store=None)

Add observation data based on keywords or direct ObsData object.

Return type:

None
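The add_* methods above can be used to populate an instance that was created empty. The sketch below shows one possible sequence of calls using keyword names from the signatures. RecordingScenario is a hypothetical stand-in so that the example runs without OpenGHG or an object store, and the dates and the source name "waste" are assumed for illustration.

```python
class RecordingScenario:
    """Hypothetical stand-in for ModelScenario: records calls instead of fetching data."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Only reached for attributes that do not exist, i.e. the add_* methods.
        def record(**kwargs):
            self.calls.append((name, kwargs))
        return record


def populate(scenario, start_date="2020-01-01", end_date="2020-02-01"):
    """Attach each data type in turn, using keyword names from the signatures above."""
    scenario.add_obs(site="TAC", species="ch4", inlet="10m",
                     start_date=start_date, end_date=end_date)
    scenario.add_footprint(site="TAC", inlet="10m", domain="EUROPE",
                           start_date=start_date, end_date=end_date)
    # "waste" is an assumed source name.
    scenario.add_flux(species="ch4", domain="EUROPE", sources="waste")
    scenario.add_bc(species="ch4", bc_input="cams", domain="EUROPE")
    return scenario


scenario = populate(RecordingScenario())
call_names = [name for name, _ in scenario.calls]
print(call_names)
```

With OpenGHG installed, passing ModelScenario() in place of RecordingScenario() would issue the same calls against the object store.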

calc_modelled_baseline(resample_to='coarsest', platform=None, output_units=1e-09, cache=True, recalculate=False)

Calculate the modelled baseline points based on site footprint and boundary conditions. Boundary conditions are multiplied by any loss (exp(-t/lifetime)) for the species.

The time points returned are dependent on the resample_to option chosen. If obs data is also linked to the ModelScenario instance, this will be used to derive the time points where appropriate.

Parameters:
  • resample_to (str) –

    Resample option to use for averaging:
    • either one of [“coarsest”, “obs”, “footprint”] to match to the datasets

    • or using a valid pandas resample period e.g. “2H”.

    Default = “coarsest”.

  • platform (Optional[str]) – Observation platform used to decide whether to resample e.g. “site”, “satellite”.

  • cache (bool) – Cache this data after calculation. Default = True.

  • recalculate (bool) – Make sure to recalculate this data rather than return from cache. Default = False.

Returns:

Modelled baseline values along the time axis

If cache is True:

This data will also be cached as the ModelScenario.modelled_baseline attribute. The associated scenario data will be cached as the ModelScenario.scenario attribute.

Return type:

xarray.DataArray
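To make the loss term concrete, the following sketch applies exp(-t/lifetime) to a few boundary values and sums them using sensitivity weights. The weighting scheme, the function names, and the numbers are assumptions made for illustration; the description above only states that boundary conditions are multiplied by the loss factor.

```python
import math


def loss_factor(transit_time, lifetime):
    """Fraction of the species remaining after transit_time (same units as lifetime)."""
    return math.exp(-transit_time / lifetime)


def modelled_baseline(sensitivities, boundary_values, transit_times, lifetime=None):
    """Weighted sum of boundary values; the loss is skipped when no lifetime is given."""
    total = 0.0
    for weight, value, transit_time in zip(sensitivities, boundary_values, transit_times):
        loss = 1.0 if lifetime is None else loss_factor(transit_time, lifetime)
        total += weight * value * loss
    return total


# Illustrative numbers only: two boundary cells with equal weight.
weights = [0.5, 0.5]
boundary = [1900.0, 1900.0]
hours_in_transit = [24.0, 48.0]

no_loss = modelled_baseline(weights, boundary, hours_in_transit)
with_loss = modelled_baseline(weights, boundary, hours_in_transit, lifetime=240.0)
print(no_loss, with_loss)
```

A species with no loss keeps the full weighted boundary value, while a finite lifetime lowers the baseline according to how long the air has been in transit.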

calc_modelled_obs(sources=None, resample_to='coarsest', platform=None, cache=True, recalculate=False)

Calculate the modelled observation points based on site footprint and fluxes.

The time points returned are dependent on the resample_to option chosen. If obs data is also linked to the ModelScenario instance, this will be used to derive the time points where appropriate.

Parameters:
  • sources (Union[str, list, None]) – Sources to use for flux. All will be used and stacked if not specified.

  • resample_to (str) –

    Resample option to use for averaging:
    • either one of [“coarsest”, “obs”, “footprint”] to match to the datasets

    • or using a valid pandas resample period e.g. “2H”.

    Default = “coarsest”.

  • platform (Optional[str]) – Observation platform used to decide whether to resample e.g. “site”, “satellite”.

  • cache (bool) – Cache this data after calculation. Default = True.

  • recalculate (bool) – Make sure to recalculate this data rather than return from cache. Default = False.

Returns:

Modelled observation values along the time axis

If cache is True:

This data will also be cached as the ModelScenario.modelled_obs attribute. The associated scenario data will be cached as the ModelScenario.scenario attribute.

Return type:

xarray.DataArray
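A common way to turn a footprint and a flux map into a modelled value is to multiply them cell by cell and sum over the domain. The sketch below does this for one time step on a 2×2 grid. It assumes that approach, and assumes that "stacked" sources are summed; it does not reproduce the library's implementation, and the numbers and source names are made up.

```python
def modelled_value(footprint, flux):
    """Sum of footprint * flux over every grid cell (both are lists of rows)."""
    same_shape = len(footprint) == len(flux) and all(
        len(fp_row) == len(flux_row) for fp_row, flux_row in zip(footprint, flux)
    )
    if not same_shape:
        raise ValueError("footprint and flux must share the same grid")
    return sum(
        fp * fl
        for fp_row, flux_row in zip(footprint, flux)
        for fp, fl in zip(fp_row, flux_row)
    )


def modelled_total(footprint, flux_by_source, sources=None):
    """Add the contribution of each selected source; all sources if none are named."""
    names = list(flux_by_source) if sources is None else list(sources)
    return sum(modelled_value(footprint, flux_by_source[name]) for name in names)


footprint = [[0.5, 0.25], [0.0, 1.0]]
flux_by_source = {
    "waste": [[2.0, 4.0], [8.0, 1.0]],   # hypothetical source names
    "energy": [[1.0, 0.0], [0.0, 2.0]],
}

waste_only = modelled_total(footprint, flux_by_source, sources=["waste"])
all_sources = modelled_total(footprint, flux_by_source)
print(waste_only, all_sources)
```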

combine_flux_sources(sources=None, cache=True, recalculate=False)

Combine together flux sources on the time dimension. This will align to the time of the highest frequency flux source both for time range and frequency.

Parameters:
  • sources (Union[str, list, None]) – Names of sources to combine. Should already be attached to ModelScenario.

  • cache (bool) – Cache this data after calculation. Default = True

Returns:

All flux sources stacked on the time dimension.

Return type:

Dataset
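The alignment described above can be pictured with a small standard-library sketch: the source with the smallest time step supplies the time axis, and the other sources are placed onto it. Forward filling, the function names, and the example data are assumptions for illustration and may differ from what the library does.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def smallest_step(times):
    """Smallest gap between consecutive time points; a single point counts as coarsest."""
    if len(times) < 2:
        return timedelta.max
    return min(later - earlier for earlier, later in zip(times, times[1:]))


def align_sources(flux_by_source):
    """Put every source on the time axis of the highest frequency source.

    Each source is a time-sorted list of (time, value) pairs. Values are
    forward filled, which is an assumption made for this sketch.
    """
    reference = min(
        flux_by_source,
        key=lambda name: smallest_step([t for t, _ in flux_by_source[name]]),
    )
    target_times = [t for t, _ in flux_by_source[reference]]
    aligned = {}
    for name, series in flux_by_source.items():
        times = [t for t, _ in series]
        values = []
        for t in target_times:
            index = bisect_right(times, t) - 1  # last point at or before t
            values.append(series[index][1] if index >= 0 else None)
        aligned[name] = values
    return target_times, aligned


monthly = [(datetime(2020, month, 1), float(month)) for month in (1, 2, 3)]
annual = [(datetime(2020, 1, 1), 10.0)]
target_times, aligned = align_sources({"monthly": monthly, "annual": annual})
print(aligned)
```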

combine_obs_footprint(resample_to='coarsest', platform=None, cache=True, recalculate=False)

Combine observation and footprint data so these are on the same time axis. This will both slice and resample the data to align this axis.

  • Data is sliced to the smallest timeframe spanned by both footprint and obs

  • Data is resampled according to resample_to input and using the mean

  • Data is combined into one dataset

Parameters:
  • resample_to (str) –

    Resample option to use for averaging:
    • either one of [“coarsest”, “obs”, “footprint”] to match to the datasets

    • or using a valid pandas resample period e.g. “2H”.

    Default = “coarsest”.

  • platform (Optional[str]) – Observation platform used to decide whether to resample

  • cache (bool) – Cache this data after calculation. Default = True.

Returns:

Combined dataset aligned along the time dimension

If cache is True:

This data will also be cached as the ModelScenario.scenario attribute.

Return type:

xarray.Dataset
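The three steps above (slice, resample using the mean, combine) can be sketched with the standard library alone. This is a simplified picture of the "coarsest" option, assuming regularly spaced series with known periods; the function names and data are hypothetical and the library itself works on xarray objects.

```python
from datetime import datetime, timedelta
from statistics import mean


def slice_between(series, start, end):
    """Keep the (time, value) pairs that fall inside [start, end]."""
    return [(t, v) for t, v in series if start <= t <= end]


def resample_mean(series, period, origin):
    """Average values into bins of length `period`, counted from `origin`."""
    bins = {}
    for t, value in series:
        bin_start = origin + ((t - origin) // period) * period
        bins.setdefault(bin_start, []).append(value)
    return {t: mean(values) for t, values in bins.items()}


def combine(obs, footprint, obs_period, fp_period):
    """Slice to the shared timeframe, resample to the coarsest period, then merge."""
    start = max(obs[0][0], footprint[0][0])
    end = min(obs[-1][0], footprint[-1][0])
    period = max(obs_period, fp_period)  # the "coarsest" of the two
    obs_mean = resample_mean(slice_between(obs, start, end), period, start)
    fp_mean = resample_mean(slice_between(footprint, start, end), period, start)
    return {
        t: {"obs": obs_mean[t], "footprint": fp_mean[t]}
        for t in sorted(obs_mean)
        if t in fp_mean
    }


t0 = datetime(2020, 1, 1)
half_hour, hour = timedelta(minutes=30), timedelta(hours=1)
obs = [(t0 + i * half_hour, float(i)) for i in range(8)]     # 00:00 to 03:30
footprint = [(t0 + h * hour, float(h)) for h in range(1, 6)]  # 01:00 to 05:00
combined = combine(obs, footprint, half_hour, hour)
print(combined)
```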

footprints_data_merge(resample_to='coarsest', platform=None, calc_timeseries=True, sources=None, calc_bc=True, cache=True, recalculate=False)

Produce combined object containing aligned footprint and observation data. Can also include modelled timeseries data derived from flux.

Parameters:
  • resample_to (str) –

    Resample option to use for averaging:
    • either one of [“coarsest”, “obs”, “footprint”] to match to the datasets

    • or using a valid pandas resample period e.g. “2H”.

    Default = “coarsest”.

  • platform (Optional[str]) – Observation platform used to decide whether to resample.

  • calc_timeseries (bool) – Calculate modelled timeseries based on flux sources.

  • sources (Union[str, list, None]) – Sources to use for flux if calc_timeseries is True. All will be used and stacked if not specified.

  • calc_bc (bool) – Calculate modelled baseline.

  • cache (bool) – Cache this data after calculation. Default = True.

  • recalculate (bool) – Make sure to recalculate this data rather than return from cache. Default = False.

Returns:

Combined dataset containing footprint and observation data

Return type:

xarray.Dataset
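The cache and recalculate parameters that appear throughout these methods can be read as the following pattern. This is one plausible interpretation written as a hypothetical helper class, not the library's code: a stored result is reused unless recalculate is True, and a fresh result is stored only when cache is True.

```python
class CachedCalculation:
    """Hypothetical helper showing only the cache / recalculate logic."""

    def __init__(self, compute):
        self._compute = compute
        self.cached = None
        self.times_computed = 0

    def get(self, cache=True, recalculate=False):
        # Reuse the stored result unless a fresh calculation is requested.
        if self.cached is not None and not recalculate:
            return self.cached
        result = self._compute()
        self.times_computed += 1
        if cache:
            self.cached = result
        return result


calc = CachedCalculation(lambda: [1.0, 2.0, 3.0])
calc.get()                  # computed and stored
calc.get()                  # returned from the cache
calc.get(recalculate=True)  # computed again

uncached = CachedCalculation(lambda: [1.0, 2.0, 3.0])
uncached.get(cache=False)   # computed, not stored
uncached.get(cache=False)   # computed again because nothing was stored
print(calc.times_computed, uncached.times_computed)
```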

plot_comparison(baseline='boundary_conditions', sources=None, resample_to='coarsest', platform=None, cache=True, recalculate=False)

Plot comparison between observation and modelled timeseries data.

Parameters:
  • baseline (str | None) –

    Add baseline to data. One of:
    • “boundary_conditions” – uses added boundary conditions to calculate modelled baseline

    • “percentile” – calculates the 1% value across the whole time period

    • None – don’t add a baseline and only plot the modelled observations

  • sources (Union[str, list, None]) – Sources to use for flux. All will be used and stacked if not specified.

  • resample_to (str) –

    Resample option to use for averaging:
    • either one of [“coarsest”, “obs”, “footprint”] to match to the datasets

    • or using a valid pandas resample period e.g. “2H”.

    Default = “coarsest”.

  • platform (Optional[str]) – Observation platform used to decide whether to resample e.g. “site”, “satellite”.

  • cache (bool) – Cache this data after calculation. Default = True.

  • recalculate (bool) – Make sure to recalculate this data rather than return from cache. Default = False.

Return type:

Any

Returns:

Plotly Figure

Interactive plotly graph created with observation and modelled observation data.
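For the “percentile” baseline option, the sketch below computes a 1% value over a whole series and adds it to a modelled enhancement. It assumes the percentile is taken from the observations and uses linear interpolation between sorted points; neither detail is stated above, and the function name and numbers are illustrative.

```python
def percentile(values, q):
    """q-th percentile (0 to 100) using linear interpolation between sorted points."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    position = (len(ordered) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


# Made-up observations: 101 values from 1900.0 to 2000.0 in steps of 1.0.
observations = [1900.0 + step for step in range(101)]
baseline = percentile(observations, 1)

# Made-up modelled enhancements above the baseline.
enhancement = [5.0, 10.0]
modelled_with_baseline = [value + baseline for value in enhancement]
print(baseline, modelled_with_baseline)
```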

plot_timeseries()

Plot the observation timeseries data.

Return type:

Any

Returns:

Plotly Figure

Interactive plotly graph created with observations