Util#
Exporting#
These are used to export data to a format readable by the OpenGHG data dashboard.
- openghg.util.to_dashboard(data, selected_vars, downsample_n=3, filename=None)[source]#
- Takes a Dataset produced by OpenGHG and outputs it in a JSON format readable by the OpenGHG dashboard or a related project. This also exports a separate file with the site locations for use with the map selector component. Note: this function does not currently support export of data from multiple inlets. 
Hashing#
These handle hashing of data (usually with SHA1).
- openghg.util.hash_file(filepath)[source]#
- Opens the file at filepath and calculates its SHA1 hash - Taken from https://stackoverflow.com/a/22058673 - Parameters:
- filepath (pathlib.Path) – Path to file 
- Returns:
- SHA1 hash 
- Return type:
- str 
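
As a rough illustration of what calculating a file's SHA1 hash involves, here is a minimal sketch using only the standard library. The name `sha1_of_file` and the `chunk_size` parameter are hypothetical and not part of OpenGHG; the library's own implementation may differ in detail.

```python
import hashlib
from pathlib import Path


def sha1_of_file(filepath: Path, chunk_size: int = 65536) -> str:
    """Return the hexadecimal SHA1 digest of the file at filepath.

    Hypothetical helper for illustration; not the OpenGHG implementation.
    """
    sha1 = hashlib.sha1()
    with open(filepath, "rb") as f:
        # Read fixed-size chunks so large files are never fully held in memory
        for chunk in iter(lambda: f.read(chunk_size), b""):
            sha1.update(chunk)
    return sha1.hexdigest()
```

Reading in chunks keeps memory use flat regardless of file size, which matters for large observation files.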
 
String manipulation#
String cleaning and formatting functions
- openghg.util.clean_string(to_clean)[source]#
- Returns a lowercase string with only alphanumeric characters and underscores. - Parameters:
- to_clean (str | None) – String to clean
- Returns:
- Clean string 
- Return type:
- str or None 
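
The behaviour described above (lowercase output containing only alphanumeric characters and underscores) can be sketched with a regular expression. This is an assumption-laden sketch, not the library's code: `clean_string_sketch` is a hypothetical name, and the real function may treat whitespace or other punctuation differently.

```python
import re
from typing import Optional


def clean_string_sketch(to_clean: Optional[str]) -> Optional[str]:
    """Lowercase a string and keep only a-z, 0-9 and underscores.

    Hypothetical helper for illustration; not the OpenGHG implementation.
    """
    if to_clean is None:
        # Mirror the documented "str or None" return type
        return None
    # Lowercase first, then drop every character outside the allowed set
    return re.sub(r"[^a-z0-9_]", "", to_clean.lower())
```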
 
- openghg.util.to_lowercase(d, skip_keys=None)[source]#
- Convert an object to lowercase. All keys and values in a dictionary will be converted to lowercase as will all objects in a list, tuple or set. You can optionally pass in a list of keys to skip when lowercasing a dictionary. - Based on the answer https://stackoverflow.com/a/40789531/1303032 - Parameters:
- d (dict | list | tuple | set | str) – Object to lower case
- skip_keys (list | None) – List of keys to skip when lowercasing.
 
- Returns:
- Dictionary of lower case keys and values 
- Return type:
- dict 
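
The recursive lowercasing described above might look roughly like the following sketch. The name `lowercase_sketch` is hypothetical, and the handling of skipped keys (leaving both key and value untouched) is an assumption rather than confirmed library behaviour.

```python
from typing import Any, Iterable, Optional


def lowercase_sketch(obj: Any, skip_keys: Optional[Iterable[str]] = None) -> Any:
    """Recursively lowercase strings in dicts, lists, tuples and sets.

    Hypothetical helper for illustration; not the OpenGHG implementation.
    """
    skip = set(skip_keys or [])
    if isinstance(obj, dict):
        out = {}
        for key, value in obj.items():
            if key in skip:
                # Assumption: skipped entries keep key and value unchanged
                out[key] = value
            else:
                new_key = key.lower() if isinstance(key, str) else key
                out[new_key] = lowercase_sketch(value, skip)
        return out
    if isinstance(obj, (list, tuple, set)):
        # Rebuild the same container type from the lowercased members
        return type(obj)(lowercase_sketch(item, skip) for item in obj)
    if isinstance(obj, str):
        return obj.lower()
    # Anything else (numbers, None, ...) is returned as-is
    return obj
```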
 
Time#
Helpers to deal with all things datetime.
- openghg.util.timestamp_tzaware(timestamp)[source]#
- Returns the pandas Timestamp passed as a timezone (UTC) aware Timestamp. - Parameters:
- timestamp (pandas.Timestamp) – Timezone naive Timestamp 
- Returns:
- Timezone aware 
- Return type:
- pandas.Timestamp 
 
- openghg.util.timestamp_now()[source]#
- Returns a pandas timezone (UTC) aware Timestamp for the current time. - Returns:
- Timestamp at current time 
- Return type:
- pandas.Timestamp 
 
- openghg.util.timestamp_epoch()[source]#
- Returns the UNIX epoch time 1st of January 1970 - Returns:
- Timestamp object at epoch 
- Return type:
- pandas.Timestamp 
 
- openghg.util.daterange_from_str(daterange_str, freq='D')[source]#
- Get a pandas DatetimeIndex from a string. The frequency of the created DatetimeIndex is set by freq (daily, 'D', by default). - Parameters:
- daterange_str (str) – Daterange string of the form 2019-01-01T00:00:00_2019-12-31T00:00:00 
 
- Returns:
- DatetimeIndex covering daterange 
- Return type:
- pandas.DatetimeIndex 
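
To make the daterange string format concrete (two ISO-style timestamps joined by an underscore, e.g. 2019-01-01T00:00:00_2019-12-31T00:00:00), the sketch below parses only the two endpoints. It deliberately uses the standard library `datetime` module in place of pandas, and `split_daterange_str` is a hypothetical name. It assumes both timestamps are timezone naive and should be treated as UTC.

```python
from datetime import datetime, timezone
from typing import Tuple


def split_daterange_str(daterange_str: str) -> Tuple[datetime, datetime]:
    """Split a "start_end" daterange string into two UTC-aware datetimes.

    Hypothetical helper for illustration; not the OpenGHG implementation.
    """
    start_str, end_str = daterange_str.split("_")
    # Assumption: inputs are timezone naive, so UTC is attached explicitly
    start = datetime.fromisoformat(start_str).replace(tzinfo=timezone.utc)
    end = datetime.fromisoformat(end_str).replace(tzinfo=timezone.utc)
    return start, end
```

With pandas available, the two endpoints could then be passed to `pandas.date_range` to build the index itself.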
 
- openghg.util.daterange_to_str(daterange)[source]#
- Takes a pandas DatetimeIndex created by pandas date_range and converts it to a string of the form 2019-01-01-00:00:00_2019-03-16-00:00:00 - Parameters:
- daterange (pandas.DatetimeIndex) 
- Returns:
- Daterange in string format 
- Return type:
- str 
 
- openghg.util.create_daterange_str(start, end)[source]#
- Convert the passed datetimes into a daterange string for use in searches and Datasource interactions - Parameters:
- start – Start date 
- end – End date 
 
- Returns:
- Daterange string 
- Return type:
- str 
 
- openghg.util.create_daterange(start, end, freq='D')[source]#
- Create a minute aligned daterange - Parameters:
- start (Timestamp) – Start date
- end (Timestamp) – End date
 
- Return type:
- DatetimeIndex
- Returns:
- pandas.DatetimeIndex 
 
Site Checks#
These perform checks to ensure data processed for each site is correct
- openghg.util.verify_site(site, site_filepath=None)[source]#
- Checks whether the passed site is valid and returns its three letter site code if found. Otherwise, fuzzy text matching is used to suggest sites with similar names. - Parameters:
- site (str) – Three letter site code or site name
- site_filepath (Union[str, Path, None]) – Alternative site info file. Defaults to openghg_defs input.
 
- Returns:
- Verified three letter site code if valid site 
- Return type:
- str 
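
The "exact match first, fuzzy suggestions otherwise" flow described above can be sketched with the standard library's `difflib`. Everything here is an assumption for illustration: the `SITES` table is a tiny sample, `verify_site_sketch` is a hypothetical name, the matching library and thresholds used by OpenGHG may differ, and the case of the returned code is an arbitrary choice.

```python
import difflib
from typing import Dict, List, Optional, Tuple

# Illustrative sample only; the real data comes from the site info file
SITES: Dict[str, str] = {
    "MHD": "Mace Head",
    "TAC": "Tacolneston",
    "BSD": "Bilsdale",
}


def verify_site_sketch(
    site: str, sites: Dict[str, str] = SITES
) -> Tuple[Optional[str], List[str]]:
    """Return (site_code, []) on a match, else (None, suggested_names).

    Hypothetical helper for illustration; not the OpenGHG implementation.
    """
    code = site.upper()
    if code in sites:
        return code, []
    # Try an exact, case-insensitive match on the long site name
    by_name = {name.lower(): c for c, name in sites.items()}
    if site.lower() in by_name:
        return by_name[site.lower()], []
    # No match: suggest long names that are textually close to the input
    close = difflib.get_close_matches(site.lower(), list(by_name), n=3, cutoff=0.6)
    return None, [sites[by_name[name]] for name in close]
```

A caller would typically report the suggestions back to the user, for example when "Tacolnestan" is entered instead of "Tacolneston".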
 
- openghg.util.multiple_inlets(site, site_filepath=None)[source]#
- Check if the passed site has more than one inlet - Parameters:
- site (str) – Three letter site code
- site_filepath (Union[str, Path, None]) – Alternative site info file. Defaults to openghg_defs input.
 
- Returns:
- True if multiple inlets 
- Return type:
- bool 
 
Domain#
- openghg.util.find_domain(domain, domain_filepath=None)[source]#
- Finds the latitude and longitude values in degrees associated with a given domain name. - Parameters:
- domain (str) – Pre-defined domain name
- domain_filepath (Union[str, Path, None]) – Alternative domain info file. Defaults to openghg_defs input.
 
- Returns:
- Latitude and longitude values for the domain in degrees. 
- Return type:
- array, array 
 
Inlet#
- openghg.util.format_inlet(inlet, units='m', key_name=None, special_keywords=None)[source]#
- Make sure inlet / height name conforms to standard. The standard imposed can depend on the associated key_name itself (can be supplied as an option to check). - This standard is as follows:
- number followed by unit 
- number alone if unit / derivative is specified at the end of key_name (e.g. station_height_masl) 
- unchanged if this is one of the special keywords (by default “multiple” or “various”) 
 
- Other considerations:
- For units of “m”, we will also look for “magl” and “masl” (metres above ground and sea level) 
- If the input string just contains numbers, it is assumed this is already within the correct unit. 
 
 - Parameters:
- inlet (str | slice | None | list[str | slice | None]) – Inlet / Height value in the specified units
- units (str) – Units for the inlet value (“m” by default)
- key_name (str | None) – Name of the associated key. This is optional but will be used to determine whether the unit value should be added to the output string.
- special_keywords (list | None) – Special keywords that inlet could be set to. If inlet matches one of these, no formatting is applied. If this is not set, the special keywords “multiple” and “column” will still be allowed.
 
- Return type:
- str | slice | None | list[str | slice | None]
- Returns:
- same type as input, with all strings formatted 
 - Usage:
- >>> format_inlet("10")
  "10m"
  >>> format_inlet("10m")
  "10m"
  >>> format_inlet("10magl")
  "10m"
  >>> format_inlet("10.111")
  "10.1m"
  >>> format_inlet(["10", 100])
  ["10m", "100m"]
  >>> format_inlet("multiple")
  "multiple"
  >>> format_inlet("10m", key_name="inlet")
  "10m"
  >>> format_inlet("10m", key_name="inlet_magl")
  "10"
  >>> format_inlet("10m", key_name="station_height_masl")
  "10"
 
