Util#
Exporting#
These are used to export data to a format readable by the OpenGHG data dashboard.
- openghg.util.to_dashboard(data, selected_vars, downsample_n=3, filename=None)
Takes a Dataset produced by OpenGHG and outputs it into a JSON format readable by the OpenGHG dashboard or a related project.
This also exports a separate file with the locations of the sites, for use with the map selector component.
Note - this function does not currently support export of data from multiple inlets.
Hashing#
These handle hashing of data (usually with SHA1)
- openghg.util.hash_file(filepath)
Opens the file at filepath and calculates its SHA1 hash
Taken from https://stackoverflow.com/a/22058673
- Parameters:
filepath (pathlib.Path) – Path to file
- Returns:
SHA1 hash
- Return type:
str
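To illustrate the approach described above, here is a minimal sketch of hashing a file with SHA1 in fixed-size chunks using Python's standard `hashlib` module. The name `sha1_of_file` and the 64 KiB chunk size are assumptions made for this example; this is not necessarily how OpenGHG implements `hash_file`.

```python
import hashlib
import tempfile
from pathlib import Path


def sha1_of_file(filepath: Path, chunk_size: int = 65536) -> str:
    """Return the SHA1 hex digest of the file at filepath (illustrative helper)."""
    sha1 = hashlib.sha1()
    with open(filepath, "rb") as f:
        # Read in chunks so large files never need to fit in memory.
        while chunk := f.read(chunk_size):
            sha1.update(chunk)
    return sha1.hexdigest()


# Hash a small temporary file and compare with hashing the same bytes directly.
with tempfile.TemporaryDirectory() as tmpdir:
    sample = Path(tmpdir) / "sample.txt"
    sample.write_bytes(b"abc")
    digest = sha1_of_file(sample)

print(digest == hashlib.sha1(b"abc").hexdigest())  # True
```

Reading in binary mode matters here: text mode could translate line endings and change the digest between platforms.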
String manipulation#
String cleaning and formatting functions
- openghg.util.clean_string(to_clean)
Returns a lowercase string with only alphanumeric characters and underscores.
- Parameters:
to_clean (Optional[str]) – String to clean
- Returns:
Clean string
- Return type:
str or None
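The behaviour described above can be sketched with a single regular expression. This is only an illustration: the name `clean_string_sketch` is hypothetical, and the sketch assumes that disallowed characters are removed (not replaced) and that only ASCII letters and digits count as alphanumeric.

```python
import re
from typing import Optional


def clean_string_sketch(to_clean: Optional[str]) -> Optional[str]:
    """Lowercase the input and keep only ASCII letters, digits and underscores."""
    if to_clean is None:
        return None
    # Assumption: disallowed characters are removed rather than replaced.
    return re.sub(r"[^a-z0-9_]", "", to_clean.lower())


print(clean_string_sketch("Mace Head_2019!"))  # macehead_2019
print(clean_string_sketch(None))  # None
```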
- openghg.util.to_lowercase(d, skip_keys=None)
Convert an object to lowercase. All keys and values in a dictionary will be converted to lowercase as will all objects in a list, tuple or set. You can optionally pass in a list of keys to skip when lowercasing a dictionary.
Based on the answer https://stackoverflow.com/a/40789531/1303032
- Parameters:
d (Union[Dict, List, Tuple, Set, str]) – Object to lower case
skip_keys (Optional[List]) – List of keys to skip when lowercasing.
- Returns:
Dictionary of lower case keys and values
- Return type:
dict
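A recursive lowercasing of nested containers, as described above, might look like the following sketch. The name `lower_all` is hypothetical, and the sketch assumes that a skipped key keeps both its key and its value unchanged; the library's exact handling of `skip_keys` may differ.

```python
def lower_all(obj, skip_keys=None):
    """Recursively lowercase strings inside dicts, lists, tuples and sets."""
    skip = set(skip_keys or [])
    if isinstance(obj, dict):
        result = {}
        for key, value in obj.items():
            if key in skip:
                # Assumption: skipped entries are copied over untouched.
                result[key] = value
            else:
                new_key = key.lower() if isinstance(key, str) else key
                result[new_key] = lower_all(value, skip_keys)
        return result
    if isinstance(obj, (list, tuple, set)):
        # Rebuild the same container type from the lowercased items.
        return type(obj)(lower_all(item, skip_keys) for item in obj)
    if isinstance(obj, str):
        return obj.lower()
    return obj  # numbers, None, etc. are returned as-is


print(lower_all({"Site": "BSD", "Species": ["CO2", "CH4"]}))
# {'site': 'bsd', 'species': ['co2', 'ch4']}
print(lower_all({"Site": "BSD", "Path": "/Data"}, skip_keys=["Path"]))
# {'site': 'bsd', 'Path': '/Data'}
```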
Time#
Helpers to deal with all things datetime.
- openghg.util.timestamp_tzaware(timestamp)
Returns the pandas Timestamp passed as a timezone (UTC) aware Timestamp.
- Parameters:
timestamp (pandas.Timestamp) – Timezone naive Timestamp
- Returns:
Timezone aware
- Return type:
pandas.Timestamp
- openghg.util.timestamp_now()
Returns a pandas timezone (UTC) aware Timestamp for the current time.
- Returns:
Timestamp at current time
- Return type:
pandas.Timestamp
- openghg.util.timestamp_epoch()
Returns the UNIX epoch time (1 January 1970) as a Timestamp
- Returns:
Timestamp object at epoch
- Return type:
pandas.Timestamp
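The three timestamp helpers above correspond to operations that can be done directly with pandas. The sketch below uses only standard pandas calls and does not call OpenGHG; it shows what "timezone naive" and "timezone (UTC) aware" mean in practice, and it is an assumption that the helpers behave equivalently.

```python
import pandas as pd

naive = pd.Timestamp("2019-01-01 12:00")
# A naive Timestamp has no tzinfo; tz_localize attaches UTC without shifting the clock time.
aware = naive.tz_localize("UTC")

# Current time and the UNIX epoch, both as UTC-aware Timestamps.
now_utc = pd.Timestamp.now(tz="UTC")
epoch = pd.Timestamp("1970-01-01", tz="UTC")

print(naive.tzinfo is None)  # True
print(aware)                 # 2019-01-01 12:00:00+00:00
print(epoch.timestamp())     # 0.0
```

Note that `tz_localize` raises a `TypeError` on a Timestamp that is already timezone aware; `tz_convert("UTC")` is the pandas call for that case.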
- openghg.util.daterange_from_str(daterange_str, freq='D')
Get a Pandas DatetimeIndex from a string. The created DatetimeIndex has minute frequency.
- Parameters:
daterange_str (str) – Daterange string of the form 2019-01-01T00:00:00_2019-12-31T00:00:00
- Returns:
DatetimeIndex covering daterange
- Return type:
pandas.DatetimeIndex
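As a rough sketch of how such a daterange string can be turned into a DatetimeIndex, the example below splits on the underscore and passes both halves to `pandas.date_range`. The name `daterange_from_str_sketch` is hypothetical, and the sketch assumes exactly one underscore separates the start and end timestamps; `freq` is forwarded to pandas unchanged.

```python
import pandas as pd


def daterange_from_str_sketch(daterange_str: str, freq: str = "D") -> pd.DatetimeIndex:
    """Build a DatetimeIndex from a 'start_end' daterange string."""
    # Assumption: exactly one underscore separates the start and end timestamps.
    start_str, end_str = daterange_str.split("_")
    start = pd.Timestamp(start_str)
    end = pd.Timestamp(end_str)
    # date_range includes both endpoints when they fall on the frequency grid.
    return pd.date_range(start=start, end=end, freq=freq)


index = daterange_from_str_sketch("2019-01-01T00:00:00_2019-01-05T00:00:00")
print(len(index))  # 5
```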
- openghg.util.daterange_to_str(daterange)
Takes a pandas DatetimeIndex created by pandas date_range and converts it to a string of the form 2019-01-01-00:00:00_2019-03-16-00:00:00
- Parameters:
daterange (pandas.DatetimeIndex)
- Returns:
Daterange in string format
- Return type:
str
- openghg.util.create_daterange_str(start, end)
Convert the passed datetimes into a daterange string for use in searches and Datasource interactions
- Parameters:
start – Start date
end – End date
- Returns:
Daterange string
- Return type:
str
- openghg.util.create_daterange(start, end, freq='D')
Create a minute aligned daterange
- Parameters:
start (Timestamp) – Start date
end (Timestamp) – End date
- Return type:
DatetimeIndex
- Returns:
pandas.DatetimeIndex
Site Checks#
These perform checks to ensure data processed for each site is correct
- openghg.util.verify_site(site)
Checks whether the passed site is valid and returns the three letter site code if found. Otherwise, fuzzy text matching is used to suggest sites with similar names.
- Parameters:
site (str) – Three letter site code or site name
- Returns:
Verified three letter site code if valid site
- Return type:
str
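The lookup-then-suggest behaviour described above can be sketched with the standard library's `difflib`. Everything here is illustrative: the `SITES` table is a tiny sample, `suggest_sites` and `verify_site_sketch` are hypothetical names, and raising a `ValueError` for an unknown site is an assumption about how a failed check might be reported.

```python
import difflib

# Illustrative sample only; the real site list is maintained by OpenGHG.
SITES = {"BSD": "Bilsdale", "MHD": "Mace Head", "TAC": "Tacolneston"}


def suggest_sites(name: str, n: int = 3) -> list:
    """Return known site names that are close to the given name."""
    lookup = {long_name.lower(): long_name for long_name in SITES.values()}
    close = difflib.get_close_matches(name.lower(), list(lookup), n=n, cutoff=0.6)
    return [lookup[match] for match in close]


def verify_site_sketch(site: str) -> str:
    """Return the site code for a code or full name, else raise with suggestions."""
    if site.upper() in SITES:
        return site.upper()
    for code, long_name in SITES.items():
        if site.lower() == long_name.lower():
            return code
    # Assumption: an unknown site raises an error that lists close matches.
    raise ValueError(f"Unknown site {site!r}. Did you mean: {suggest_sites(site)}?")


print(verify_site_sketch("bsd"))        # BSD
print(verify_site_sketch("Mace Head"))  # MHD
print(suggest_sites("Tacolnestn"))      # ['Tacolneston']
```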
Domain#
- openghg.util.find_domain(domain, domain_filepath=None)
Finds the latitude and longitude values in degrees associated with a given domain name.
- Parameters:
domain (str) – Pre-defined domain name
domain_filepath (Union[str, Path, None]) – Alternative domain info file. Defaults to openghg_defs input.
- Returns:
Latitude and longitude values for the domain in degrees.
- Return type:
array, array
Inlet#
- openghg.util.format_inlet(inlet, units='m', key_name=None, special_keywords=None)
Make sure inlet / height name conforms to standard. The standard imposed can depend on the associated key_name itself (can be supplied as an option to check).
- This standard is as follows:
number followed by unit
number alone if unit / derivative is specified at the end of key_name (e.g. station_height_masl)
unchanged if this is one of the special keywords (by default “multiple” or “various”)
- Other considerations:
For units of “m”, we will also look for “magl” and “masl” (metres above ground and sea level)
If the input string just contains numbers, it is assumed this is already within the correct unit.
- Parameters:
inlet (Union[str, slice, None, list[Union[str, slice, None]]]) – Inlet / Height value in the specified units
units (str) – Units for the inlet value (“m” by default)
key_name (Optional[str]) – Name of the associated key. This is optional but will be used to determine whether the unit value should be added to the output string.
special_keywords (Optional[list]) – Special keywords that inlet could be set to. If inlet matches one of these, no formatting is applied. If this is not set, the special keywords “multiple” and “column” will still be allowed.
- Return type:
Union[str, slice, None, list[Union[str, slice, None]]]
- Returns:
same type as input, with all strings formatted
- Usage:
>>> format_inlet("10")
"10m"
>>> format_inlet("10m")
"10m"
>>> format_inlet("10magl")
"10m"
>>> format_inlet("10.111")
"10.1m"
>>> format_inlet(["10", 100])
["10m", "100m"]
>>> format_inlet("multiple")
"multiple"
>>> format_inlet("10m", key_name="inlet")
"10m"
>>> format_inlet("10m", key_name="inlet_magl")
"10"
>>> format_inlet("10m", key_name="station_height_masl")
"10"
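To make the rules above concrete, the following is a simplified sketch that gives the same results as the usage examples. It is not the OpenGHG implementation: `format_inlet_sketch` is a hypothetical name, the default keyword list combines every default keyword named in this documentation, and leaving unrecognised strings untouched is an assumption.

```python
import re


def format_inlet_sketch(inlet, units="m", key_name=None, special_keywords=None):
    """Apply the naming rules above to a single value or a list of values."""
    if special_keywords is None:
        # Assumption: accept every default keyword named in the documentation.
        special_keywords = ["multiple", "various", "column"]
    if isinstance(inlet, list):
        return [format_inlet_sketch(v, units, key_name, special_keywords) for v in inlet]
    if inlet is None or isinstance(inlet, slice):
        return inlet  # None and slice values are passed through unchanged

    text = str(inlet)
    if text in special_keywords:
        return text

    # For metres, also recognise "magl" and "masl"; try longer suffixes first.
    suffixes = [units, "magl", "masl"] if units == "m" else [units]
    alternatives = "|".join(re.escape(s) for s in sorted(suffixes, key=len, reverse=True))
    match = re.fullmatch(rf"(-?\d+(?:\.\d+)?)\s*({alternatives})?", text)
    if match is None:
        return text  # assumption: unrecognised strings are left untouched

    # Round to one decimal place and drop a trailing ".0" (10.111 -> 10.1, 10.0 -> 10).
    number = f"{float(match.group(1)):.1f}".rstrip("0").rstrip(".")

    # Omit the unit when the key name already ends with it, e.g. "inlet_magl".
    if key_name is not None and any(key_name.endswith("_" + s) for s in suffixes):
        return number
    return f"{number}{units}"


print(format_inlet_sketch("10magl"))                        # 10m
print(format_inlet_sketch(["10", 100]))                     # ['10m', '100m']
print(format_inlet_sketch("10m", key_name="inlet_magl"))    # 10
```

Matching the longer suffixes first keeps "10magl" from being read as "10m" followed by stray characters.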