GitHub Best Practices

General principles of change management

Some principles of software management and review hold across all workflows. These are presented first; for more details on possible workflows in GitHub, see the “More GitHub workflow options” section.

  1. The main branch is always stable. To ensure stability, a combination of automated testing and manual review needs to be undertaken every time a change is merged into main. At minimum, complete the following checklist:
    • A series of unit tests is run. Realistically, this means using a continuous integration tool (we recommend GitHub Actions). Running test suites manually at the frequency we expect changes to be merged becomes too cumbersome. The ghactions4r package provides setup functions to use common GitHub Actions workflows for R packages.
    • All documentation is updated. This includes automatically regenerating code reference documentation produced by doxygen, sphinx, etc., and manually checking that any example code, vignettes, or tutorials broken by the changes are updated in the respective materials. For R packages, it also means ensuring your DESCRIPTION file is updated with any new package dependencies.
    • Manual code review. At least one package collaborator should review changes, suggest alternative approaches, and approve as necessary.
  2. Changes in main are pulled to the working location (this could be a development or feature branch, or a fork, depending on the workflow) every time the new code is tested. This ensures the working location stays up to date with main.

  3. Changes that are the subject of pull requests do not exceed 500 lines of code. Changes of larger magnitude are difficult to review and test in one go. Similarly, changes intended for main should be merged on a weekly basis.
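
The 500-line guideline can be checked automatically before a pull request is opened. The following is a minimal sketch in Python, not a prescribed tool: it assumes the text output of `git diff --numstat` (tab-separated added count, deleted count, and path per file) has already been captured as a string. The function names `count_changed_lines` and `within_review_limit` and the sample file paths are hypothetical and used only for illustration.

```python
def count_changed_lines(numstat: str) -> int:
    """Sum added and deleted lines from `git diff --numstat` text output."""
    total = 0
    for line in numstat.splitlines():
        parts = line.split("\t")
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        added, deleted = parts[0], parts[1]
        # Binary files are reported as "-<TAB>-<TAB>path" and carry no line counts.
        if added == "-" or deleted == "-":
            continue
        total += int(added) + int(deleted)
    return total


def within_review_limit(numstat: str, limit: int = 500) -> bool:
    """Return True when the change is small enough to review in one go."""
    return count_changed_lines(numstat) <= limit


# Hypothetical numstat output for a small change to an R package.
sample = (
    "120\t30\tR/model.R\n"
    "-\t-\tman/figures/logo.png\n"
    "15\t5\ttests/testthat/test-model.R\n"
)
print(count_changed_lines(sample))   # 170
print(within_review_limit(sample))   # True
```

Counting both added and deleted lines is one possible reading of "lines of code"; a team could just as reasonably count additions only.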

More GitHub workflow options

Git forking and Git branching are possible workflows. Some pros and cons of each workflow are listed below.

Pros of forking:

  1. The only option if you intend to keep code divergent forever.
  2. Does not require contributors to be added as collaborators to the project.

Cons of forking:

  1. Harder to stay up-to-date with main.
  2. Harder to make “feature forks” than “feature branches.”
  3. Doesn’t integrate quite as well with releases.

Pros of branching:

  1. Allows for a multi-branch workflow, e.g., main, development, and feature branches. Feature branches are one way to keep changes more modular and improve testability.
  2. Makes managing GitHub releases seamless.
  3. Branch protection rules can enforce protocols on everyone, including administrators.

Cons of branching:

  1. Authors need to be collaborators.
  2. Not ideal for permanently diverging codebases.

For the NOAA FIT, NOAA Git policy dictates that non-NOAA affiliates cannot have push access to the repository. For this reason, you can only use the branching workflow if your repository exists under an organization, which allows you to adjust the permissions of collaborators. A repository outside an organization gives all collaborators push access. We recommend creating an organization for your repository if you have more than one repository and/or more than two or three collaborators, and using the branching workflow for changes you expect to be merged back into the main branch. If you expect changes to diverge and not rejoin main, or you have one repository with non-NOAA collaborators, the forking workflow may suit your needs better.

Software package management and review with version control

Some principles of software management and review are true across all version control workflows.

  1. The main branch is always stable. To ensure stability, a combination of automated testing and manual review needs to be undertaken every time a change is merged into main. At minimum, complete the following checklist:
    • A series of unit tests is run. Realistically, this means using a continuous integration tool (such as GitHub Actions). Running test suites manually for each change becomes cumbersome and is easily forgotten.
    • All documentation is updated. This includes automatically regenerating code reference documentation produced by doxygen, roxygen, sphinx, etc., and manually checking that any example code, vignettes, or tutorials broken by the changes are updated in the respective materials. For R packages, it also means ensuring your DESCRIPTION file is updated with any new package dependencies.
    • Manual code review. At least one package collaborator should review changes, suggest alternative approaches, and approve as necessary.
  2. Changes in main are pulled to the working location (this could be a development or feature branch, or a fork, depending on the workflow) every time the new code is tested. This ensures the working location stays up to date with main.

  3. Changes that are the subject of pull requests do not exceed 500 lines of code. Changes of larger magnitude are difficult to review and test in one go. Similarly, changes intended for main should be merged on a weekly basis.

Software versioning and GitHub Releases

The standard for software versioning is semantic versioning, in which changes that break the application programming interface (API) constitute major versions, backwards-compatible changes constitute minor versions, and patches are backwards-compatible bug fixes. We recommend not trying to always maintain backwards compatibility, which can lead to testing nightmares, but instead being clear about versioning and maintaining access to legacy software binaries for users who are unable to migrate to later software versions. Patches may be applied to legacy versions to port bug fixes when necessary. The post on Dependency management provides more guidance.
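
As an illustration of the rules above, here is a minimal sketch in Python of how a version number changes for each kind of release. It assumes plain MAJOR.MINOR.PATCH strings with no pre-release or build suffixes; the function name `bump` and the change labels are hypothetical and not part of any particular tool.

```python
def bump(version: str, change: str) -> str:
    """Return the next semantic version for a 'major', 'minor', or 'patch' change."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "major":
        # API-breaking change: increment major, reset minor and patch.
        return f"{major + 1}.0.0"
    if change == "minor":
        # Backwards-compatible change: increment minor, reset patch.
        return f"{major}.{minor + 1}.0"
    if change == "patch":
        # Backwards-compatible bug fix: increment patch only.
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change!r}")


print(bump("1.4.2", "major"))  # 2.0.0
print(bump("1.4.2", "minor"))  # 1.5.0
print(bump("1.4.2", "patch"))  # 1.4.3
```

Porting a bug fix to a legacy version, as described above, corresponds to a patch bump on the older line, for example `bump("1.4.2", "patch")` while 2.x development continues separately.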

A good way to manage this is using GitHub releases. GitHub releases are designed to keep a consistent log of software versions, with the most recent one clearly marked. You can create a release by pushing a commit with a tag that corresponds to a semantic version (for example, tag 1.0.0 to release version 1.0), or by selecting “Draft a new release” in your GitHub repository. Drafting a release comes with the benefit of letting you mark it as a draft or preliminary release. In a GitHub release, compiled binaries of up to 2 GB each can be provided for download, and others watching your repository will be notified when a new release is published.

Free online Git and GitHub resources