The source code for OpenGHG is available on GitHub.

Setting up your computer

OpenGHG requires Python >= 3.7, so please install this before continuing further.
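A quick way to confirm that the interpreter on your path meets this requirement is to inspect sys.version_info. The sketch below is only an illustration; the helper name meets_minimum and the constant MINIMUM_VERSION are ours and are not part of OpenGHG.

```python
import sys
from typing import Tuple

# Minimum version taken from the requirement stated above.
MINIMUM_VERSION: Tuple[int, int] = (3, 7)


def meets_minimum(version: Tuple[int, ...] = tuple(sys.version_info[:3])) -> bool:
    """Return True if the given version is at least MINIMUM_VERSION."""
    # Only the major and minor components matter for this comparison.
    return tuple(version[:2]) >= MINIMUM_VERSION


if __name__ == "__main__":
    status = "OK" if meets_minimum() else "too old"
    print(sys.version.split()[0], status)
```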

Virtual environment

It is recommended that you develop OpenGHG in a Python virtual environment. Here we’ll create a new folder called envs in our home directory and create a new openghg_devel environment in it.

mkdir -p ~/envs/openghg_devel
python -m venv ~/envs/openghg_devel

Virtual environments provide sandboxes which make it easier to develop and test code. They also allow you to install Python modules without interfering with other Python installations.
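Whether a given interpreter is running inside a virtual environment can be checked from Python itself. This is a minimal sketch using only the standard library; the function name in_virtualenv is ours, and the check assumes an environment created with the venv module as shown above.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a venv-style environment."""
    # Inside an environment created by `python -m venv`, sys.prefix points at
    # the environment, while sys.base_prefix points at the base installation.
    return sys.prefix != sys.base_prefix


print("virtual environment:", in_virtualenv())
print("interpreter prefix:", sys.prefix)
```

Once the environment is activated, the reported prefix should be the ~/envs/openghg_devel folder.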

We activate our new environment using

source ~/envs/openghg_devel/bin/activate

This will update your shell so that all Python commands (such as python and pip) use the virtual environment. You can deactivate the environment and return to your system Python by running deactivate.


Clone OpenGHG

As OpenGHG is currently in its very early stages and is not yet available via pip, we need to clone the OpenGHG repository, move into it and install the required dependencies.

git clone
cd openghg
pip install -r requirements.txt
pip install -r requirements-dev.txt

OpenGHG should now be installed within your virtual environment.

Run tests

It is a good idea to run the tests to make sure everything is working on your system. To do this, run

pytest -v tests

Coding Style

OpenGHG is written in Python 3 (>= 3.7). We aim as much as possible to follow the PEP 8 Python coding style and recommend that you use a linter such as flake8.

This code has to run on a wide variety of architectures, operating systems and machines - some of which don’t have any graphic libraries, so please be careful when adding a dependency.

With this in mind, we use the following coding conventions:


We follow a Python style naming convention.

  • Packages: lowercase, singleword

  • Classes: CamelCase

  • Methods: snake_case

  • Functions: snake_case

  • Variables: snake_case

  • Source Files: snake_case with a leading underscore

Functions or variables that are private should be named with a leading underscore. This prevents them from being prominently visible in Python’s help and tab completion.


OpenGHG consists of the main module, openghg, plus an openghg.submodule module.

In addition, there is an openghg.scripts module which contains the code for the various command-line applications.

To make OpenGHG easy for new developers to understand, we have a set of rules that will ensure that only necessary public functions, classes and implementation details are exposed to the Python help system.

  • Module files containing implementation details are prefixed with an underscore, i.e.

  • Each module file contains an __all__ variable that lists the specific items that should be imported.

  • The package can be used to safely expose the required functionality to the user with:

from module import *

This results in a clean API and documentation, with all extraneous information, e.g. external modules, hidden from the user. This is important when working interactively, since IPython and Jupyter do not respect the __all__ variable when auto-completing, meaning that the user will see a full list of the available names when hitting tab. When following the conventions above, the user will only be able to access the exposed names. This greatly improves the clarity of the package, allowing a new user to quickly determine the available functionality. Any user wishing to expose further implementation detail can, of course, type an underscore to show the hidden names when searching.
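The effect of __all__ on a star import can be seen in a small self-contained experiment. The sketch below builds two throwaway modules in memory (the module names, the SOURCE string and the star_import_names helper are all invented for this illustration) and lists the names that a star import exposes from each.

```python
import sys
import types
from typing import List

# Source for a throwaway module: one external import, one public
# function and one private helper.
SOURCE = (
    "import os\n"
    "def process():\n"
    "    return 'processed'\n"
    "def _helper():\n"
    "    return 'hidden'\n"
)


def star_import_names(module_name: str, source: str) -> List[str]:
    """Build a module from source and list what `import *` exposes from it."""
    module = types.ModuleType(module_name)
    exec(source, module.__dict__)
    sys.modules[module_name] = module  # make the module importable by name
    namespace: dict = {}
    exec(f"from {module_name} import *", namespace)
    # Drop dunder entries such as __builtins__ that exec adds itself.
    return sorted(name for name in namespace if not name.startswith("__"))


without_all = star_import_names("_demo_without_all", SOURCE)
with_all = star_import_names("_demo_with_all", "__all__ = ['process']\n" + SOURCE)

print(without_all)  # ['os', 'process']
print(with_all)     # ['process']
```

Without __all__, the external os module leaks into the importing namespace; with it, only the listed name is exposed.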

Type hinting

Throughout the OpenGHG project we use type hinting which allows us to declare the type of the objects that are going to be passed to and returned from functions. This helps improve user understanding of the code and when used in conjunction with tools like mypy can help catch bugs.

If we are writing a function that takes a string and returns a string we can add the types like so

def greeter(name: str) -> str:
    """Greets the user

    Args:
        name: Name of user
    Returns:
        str: Greeting string
    """
    return 'Hello ' + name

For a function that takes either a string or a list as its argument and returns a list we can write it as

from typing import List, Union

def search(search_terms: Union[str, List]) -> List:
    """A function that searches

    Args:
        search_terms: Search terms
    Returns:
        list: List of data found
    """
    # some excellent code
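A function that accepts a Union usually narrows the argument to one type before using it. The following is a sketch only: normalise_terms is a hypothetical helper, not OpenGHG code, and it shows one common way to handle the str-or-list case with an isinstance check, which type checkers such as mypy can follow.

```python
from typing import List, Union


def normalise_terms(search_terms: Union[str, List[str]]) -> List[str]:
    """Return the search terms as a list, whichever form was passed in.

    Args:
        search_terms: A single term or a list of terms
    Returns:
        list: List of search terms
    """
    # After this check the type is known to be str.
    if isinstance(search_terms, str):
        return [search_terms]
    # Otherwise it is a list; copy it so the caller's list is not modified later.
    return list(search_terms)


print(normalise_terms("co2"))
print(normalise_terms(["co2", "ch4"]))
```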


Feature branches

First make sure that you are on the development branch of OpenGHG:

git checkout devel

Now create and switch to a feature branch. This should be prefixed with feature, e.g.

git checkout -b feature-process


When working on your feature it is important to write tests to ensure that it does what is expected and doesn’t break any existing functionality. All code added to the project must be covered by tests and tests should be placed inside the tests directory, creating an appropriately named sub-directory for any new submodules.

The test suite is intended to be run using pytest. When run, pytest searches for tests in all directories and files below the current directory, collects the tests together, then runs them. Pytest uses name matching to locate the tests. With its default settings, test files are named test_*.py or *_test.py and test functions start with test, e.g.:

# Files: test_*.py or *_test.py
# Functions:
def test_func():
    # code to perform tests...
    ...

def func_test():
    # not collected by default: the name does not start with test
    ...

We use the convention of test_* when naming files and functions.
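As an illustration of this convention, a test file might look like the sketch below. The function under test, clean_species, is hypothetical and is defined inline only to keep the example self-contained; in a real test file it would be imported from the package being tested.

```python
def clean_species(name: str) -> str:
    """Hypothetical function under test: normalise a species name."""
    return name.strip().lower()


def test_clean_species_strips_whitespace():
    # pytest reports a failure if a plain assert statement is false.
    assert clean_species("  CO2 ") == "co2"


def test_clean_species_is_idempotent():
    # Cleaning an already clean name should leave it unchanged.
    once = clean_species("CH4")
    assert clean_species(once) == once
```

Saved in a file whose name starts with test_, both functions would be picked up by pytest's name matching.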

Running tests

To run the full test suite, simply type:

pytest tests

To get more detailed information about each test, run pytest with the verbose flag, e.g.:

pytest -v

For more information on the capabilities of pytest please see the pytest documentation.

Continuous integration and delivery

We use GitHub Actions to run a full continuous integration (CI) on all pull requests to devel and master, and all pushes to devel and master. We will not merge a pull request until all tests pass. We only accept pull requests to devel. We only allow pull requests from devel to master. In addition to CI,


OpenGHG is fully documented using a combination of hand-written files (in the doc folder) and auto-generated API documentation created from Google style docstrings. The documentation is automatically built using Sphinx. Whenever a commit is pushed to devel the documentation is automatically rebuilt and updated.

To build the documentation locally you will first need to install some additional packages. If you haven’t yet installed the developer requirements please do so by running

pip install -r requirements-dev.txt

Next, ensure you have pandoc installed. Installation instructions can be found on the pandoc website.


If you haven’t installed openghg to your virtual environment you can add the folder path to your PYTHONPATH. This allows the library to be used easily without the need for reinstallation after changes.

export PYTHONPATH="${PYTHONPATH}:/path/to/cloned/repo"
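To confirm that the export took effect in the current shell, the entries can be listed from Python. This is a small sketch using only the standard library; the helper name pythonpath_entries is ours.

```python
import os
from typing import List, Optional


def pythonpath_entries(value: Optional[str] = None) -> List[str]:
    """Split a PYTHONPATH-style string into its non-empty entries."""
    if value is None:
        # Fall back to the variable set in the current environment, if any.
        value = os.environ.get("PYTHONPATH", "")
    # os.pathsep is ":" on Linux and macOS and ";" on Windows.
    return [entry for entry in value.split(os.pathsep) if entry]


print(pythonpath_entries())
```

The path to the cloned repository should appear in the printed list.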

Then move to the doc directory and run:


When finished, point your browser to build/html/index.html.


If you create new tests, please make sure that they pass locally before committing. When happy, commit your changes, e.g.

git commit openghg/ tests/test_feature \
    -m "Implementation and test for new feature."

If your edits don’t change the OpenGHG source code, e.g. fixing typos in the documentation, then please add [skip ci] to your commit message.

git commit -a -m "Updating docs [skip ci]"

This will avoid unnecessarily running the GitHub Actions workflows, e.g. running all the tests and rebuilding the documentation of the OpenGHG package. GitHub Actions are configured in the file .github/workflows/main.yaml.

Next, push your changes to the remote server:

# Push to the feature branch on the main OpenGHG repo, if you have access.
git push origin feature

# Push to the feature branch on your own fork.
git push fork feature

When the feature is complete, create a pull request on GitHub so that the changes can be merged back into the development branch. For more information, see the GitHub documentation on pull requests.