github project structure best practices

1 Introduction

These are my notes on the structural best practices I’ve been following in my github repositories. We first go through the global practices that apply to all kinds of projects, and then we cover the language-dependent ones. This is definitely not supposed to be a complete guide - rather, I’m documenting what I do once I have at least 2 repositories of a specific kind.

2 Global practices

These are the practices that apply to almost all kinds of repositories.

2.1 Workflow

Leave main as the default HEAD of the repository - that’s the branch that third parties get when they clone or check out, and it’s the default base of pull requests.

Use a devel branch as a staging ground for commits. We can then freely amend commits and push to this branch to make the complete CI suite analyse it. Once a commit is good to go, just move the main branch over to the latest commit with git push origin origin/devel:main.

We can use other upstream branches to keep drafts of future work; just be aware that they might start looking like feature branches, which are not entirely ideal.

2.2 Files

A quick rundown of files that are expected in all repositories:

2.3 CI

2.3.1 Provider: github actions

Use a single github actions workflow with as many jobs as necessary. That makes job execution and dependencies easier to visualize and maintain than using multiple workflows.
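As a rough sketch of what that looks like, the workflow below keeps linting, testing and deployment in one file and uses needs to express the dependencies between jobs. The job names, the echo placeholders and the deploy condition are assumptions made for this illustration, not taken from any of the repositories mentioned here.

```yaml
# .github/workflows/ci.yml - one workflow holding every job
name: CI
on: [push, pull_request]
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: echo "static checks go here"   # placeholder command
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: echo "build and tests go here" # placeholder command
  deploy:
    # Runs only after both jobs above succeed; lint and test run in parallel.
    needs: [lint, test]
    # Assumption: deploy only for pushes that land on main.
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: echo "deployment goes here"    # placeholder command
```

With everything in one file, the workflow page shows the three jobs and the edges between them as a single graph.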

I used travis-ci for almost 10 years, until they changed their pricing model and added limits to public repositories. They might have gone back on this, but they lost my trust.

2.3.2 Test coverage

Getting 100% of the tests to pass is only meaningful if the tests are good. One easy way to get a sense of how good the tests are is to measure test code coverage.

Use coveralls for code coverage reports. It has its quirks, but it can be made to work fine with just about any language - and it supports build matrices natively.

I’ve also used codecov for a while, but it didn’t work very well with my workflow - where a branch is constantly forwarded to commits that exist in another branch.

Add a test result and a coverage shield to the top of README.md. That allows visitors to see at a glance that your project cares about test coverage.

2.3.3 Formatter

Use a tool to check that files are well formatted - the stricter the tool is, the better. This may sound a bit radical, but the truth is that strict formatters end up making the code easier to work with for everybody, and they prevent silly conflicts.

Try to integrate the same tool in the code editor so that the formatting doesn’t have to be fixed after the commit is pushed. Keeping the format can otherwise become a chore.
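As a sketch of the CI side of this, the job below fails whenever a file would be reformatted. It assumes a Python project that uses black (the formatter listed for Python in the summary table at the end); the workflow name, the job name and the use of actions/setup-python are choices made for the illustration.

```yaml
name: CI
on: [push, pull_request]
jobs:
  format:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.x'
      # Assumption: black is the project's formatter.
      - run: python -m pip install black
      # --check exits non-zero if any file would change; --diff prints the changes.
      - run: black --check --diff .
```

Running the same black command from the editor on save keeps this job green without extra effort.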

2.3.4 omnilint

omnilint (disclaimer) is a github action that performs static analysis on standalone files. Add it as the first job in all repositories:

jobs:
  omnilint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: docker://lpenz/omnilint:v0.2

This makes sure that a minimum standard is kept even before the build and tests are developed. It checks, for instance, that python files can be loaded by the interpreter, without actually executing them (which would require installing dependencies, etc.)

2.3.5 Build/test matrices

github actions supports build matrices, which we can use to test the project in different environments - architectures and compilers.

Use matrices to test the environments where the project can be actually deployed - there’s no need to go crazy and add all possible combinations.

Use them also to test in a future or nightly environment, so that we can get a heads up about upcoming language changes.
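A minimal sketch of such a matrix, assuming a Rust project and a runner that already has rustup installed. The toolchain names and the experimental flag are illustrative; the nightly entry is allowed to fail so that it works as an early warning rather than a blocker.

```yaml
name: CI
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    # A failure in an experimental entry does not fail the workflow.
    continue-on-error: ${{ matrix.experimental }}
    strategy:
      fail-fast: false
      matrix:
        toolchain: [stable]
        experimental: [false]
        include:
          # Extra combination for the "future" environment.
          - toolchain: nightly
            experimental: true
    steps:
      - uses: actions/checkout@v4
      # Assumption: rustup is available on the runner image.
      - run: rustup toolchain install ${{ matrix.toolchain }} --profile minimal
      - run: cargo +${{ matrix.toolchain }} test
```

Environments where the project is really deployed go in the toolchain list; only the forward-looking ones get experimental set to true.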

2.4 CD

2.4.1 Deploy on main or on tagging

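With regards to deployment, there are two options:

  • deploy on main: every commit that lands on the main branch is deployed (branch-based deployment);
  • deploy on tagging: only commits that receive a version tag are deployed (tag-based deployment).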

These are not mutually exclusive - we can actually use both in the same project if it makes sense, separating them in different artifact repositories. That allows users to choose what they want with regards to stability versus frequency of updates.
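The sketch below shows one way both triggers can live in the same workflow, using only built-in expressions. The job names, the v* tag pattern and the echo placeholders are assumptions for the illustration; the real deploy steps depend on the artifact repository being used.

```yaml
name: CD sketch
on:
  push:
    branches: [main, devel]
    tags: ['v*']
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: echo "run the test suite here"        # placeholder command
  deploy-main:
    needs: test
    # Branch-based: every commit that reaches main.
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    steps:
      - run: echo "deploy to the frequent-updates repository"  # placeholder
  deploy-tag:
    needs: test
    # Tag-based: only pushes of tags that start with "v".
    if: startsWith(github.ref, 'refs/tags/v')
    runs-on: ubuntu-latest
    steps:
      - run: echo "deploy to the stable repository"            # placeholder
```

Pushes to devel still run the tests, but neither deploy job fires for them.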

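Use semver, with versions in the format MAJOR.MINOR.PATCH. The git tag is usually the version prefixed by a v (vMAJOR.MINOR.PATCH). The meaning of the numbers, according to semver:

  • MAJOR is incremented for incompatible API changes;
  • MINOR is incremented when functionality is added in a backward-compatible way;
  • PATCH is incremented for backward-compatible bug fixes.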

This is the whole versioning scheme in tag-based deployments.

Branch-based deployments also use semver tags as checkpoints, but they deploy using a version in the format MAJOR.MINOR.PATCH-DISTANCE, where DISTANCE is the number of commits since the tag MAJOR.MINOR.PATCH.
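To make the format concrete, here is a sketch of how such a version could be derived by hand from git describe. This is only an illustration of the MAJOR.MINOR.PATCH-DISTANCE idea and is not necessarily how any existing action computes it; the step id, the output name and the tag pattern are assumptions. The step fails if the repository has no matching tag yet.

```yaml
name: version sketch
on: [push]
jobs:
  version:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # full history and tags, needed by git describe
      - id: version
        run: |
          # --long always prints TAG-DISTANCE-gHASH, e.g. v1.2.3-4-g0123abc
          desc="$(git describe --tags --long --match 'v[0-9]*')"
          # Drop the leading "v" and the trailing "-gHASH", leaving e.g. 1.2.3-4
          version="$(printf '%s\n' "$desc" | sed -E 's/^v(.*)-([0-9]+)-g[0-9a-f]+$/\1-\2/')"
          echo "version=$version" >> "$GITHUB_OUTPUT"
      - run: echo "would deploy version ${{ steps.version.outputs.version }}"
```

On a commit that is itself tagged, this yields a distance of 0 (for instance 1.2.3-0).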

To generate the versions, use the version-generator (disclaimer) github action. It allows us to simplify the deploy condition with something similar to:

(...)
      - id: version
        uses: docker://lpenz/ghaction-version-gen:0.3
      - name: deploy
        uses: <deploy action>
        if: steps.version.outputs.version_commit != ''
        with:
          version: ${{ steps.version.outputs.version_commit }}

That deploys on commits. To deploy on tags, use version_tagged instead of version_commit. The action is also able to check if the tag matches the version in the language-specific project file, for some languages.

2.4.2 Debian on packagecloud

Package your projects in a format supported by distributions, such as .deb or .rpm - that makes cross-language dependencies manageable. Deploy the packages to a repository and use it as an upstream source. packagecloud is a very convenient option for that, and it takes both formats mentioned.

Use the Deploy to packagecloud.io (disclaimer) github action, and add a corresponding shield to README.md.

2.4.3 Nix

TODO

2.4.4 Project-specific

Also deploy to a specific location for the kind of project you created:

This is especially important if the project is not a standalone executable.

Most language-specific repositories provide a shield that should be added to README.md.

3 Rust

Example: https://github.com/lpenz/ogle/

The rust ecosystem is at just the right point - it has most of the problems already solved, and few solutions are already obsolete, making the state-of-the-art easy to find.

We have proper github actions to run tests with coverage, to run the formatter and the static analyser (clippy), and to create .deb and language-specific packages (crates). Use this github action workflow as a base: https://github.com/lpenz/ogle/blob/main/.github/workflows/ci.yml

Deploy to the language-specific repository at https://crates.io/, and add the corresponding shield to README.md.

Write a manpage if the project is a command line utility.

Some issues pending:

4 C/C++

Example: https://github.com/lpenz/execpermfix/

C/C++ is on the other end of the spectrum: it’s a very old language, older than even the concept of ecosystem. Every old problem has been solved multiple times, and some new problems might not even be solved yet - package management is the most important one.

That said, all the basics can be covered:

We can use cmake to generate the install target and the .deb from that.

Use this workflow as the base: https://github.com/lpenz/execpermfix/blob/main/.github/workflows/ci.yml

5 Python

Example: https://github.com/lpenz/ftpsmartsync/

Python is known for having “batteries included”, but as another language that’s been around for a while, those batteries had to be changed several times. We have all the basics covered, including packaging, but sometimes it’s not easy to find the current best practice - not even on stackoverflow - because many answers are simply outdated.

Use python 3 - python 2 reached its end-of-life on 2020-01-01, and its last release (2.7.18) came out on 2020-04-20.

Use pyproject.toml and setup.cfg instead of setup.py. It’s not hard to make the conversion, as long as there’s not a lot of dynamic generation. This is the current (very superficial) official doc: https://packaging.python.org/tutorials/packaging-projects/

Use stdeb to create debian packages. It’s not perfect, as there’s some friction with the way share/doc files are deployed, but does the job.

Use the workflow at https://github.com/lpenz/ftpsmartsync/blob/main/.github/workflows/ci.yml as a base.

6 github actions

Example: https://github.com/lpenz/ghaction-cmake/

We don’t have that many resources for github actions, to be honest.

Create docker github actions if possible, as they are self-contained, easier to test and tend to be faster.

Deploy the containers to https://hub.docker.com/ using the docker/build-push-action. Docker Hub can automatically build containers, but it has a quota, and using actions allows test jobs to run before deployment.
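A sketch of that arrangement, with a test job gating the push to Docker Hub. The image name example-user/example-action, the secret names DOCKERHUB_USERNAME and DOCKERHUB_TOKEN, and the test placeholder are all hypothetical; the push is restricted to version tags.

```yaml
name: CI
on:
  push:
    branches: [main, devel]
    tags: ['v*']
  pull_request:
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Build the container locally to make sure the Dockerfile works.
      - run: docker build -t action-under-test .
      # Placeholder: run the container against a test fixture here.
  dockerhub:
    needs: test
    # Only deploy when a version tag is pushed.
    if: startsWith(github.ref, 'refs/tags/v')
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-buildx-action@v3
      - uses: docker/login-action@v3
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}  # hypothetical secret name
          password: ${{ secrets.DOCKERHUB_TOKEN }}     # hypothetical secret name
      - uses: docker/build-push-action@v5
        with:
          context: .
          push: true
          # Hypothetical image name; the image tag is the git tag, e.g. v1.2.3.
          tags: example-user/example-action:${{ github.ref_name }}
```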

To create a new version of a github action, tag the repository as usual, let the container be deployed to docker hub and then use github’s Create a new release link. This last step updates github’s marketplace information, and can’t be automated.

Add the following shields to README.md:

  • marketplace
  • github release
  • dockerhub release

That simplifies checking if the latest releases are in sync.

Note: don’t CD github actions from the main branch, as we have the extra steps mentioned above.

7 Final remarks

This is probably the document that will get updated the most as we figure out better ways to do things.

In short, we should strive to get the following whenever possible:

The table below summarizes the language-specific tools that support our scheme:

| Category | Formatter    | Debian    | Deploy          | Shields                                      | Example        |
|----------|--------------|-----------|-----------------|----------------------------------------------|----------------|
| Rust     | rustfmt      | cargo-deb | packagecloud.io | coverage, crates.io release, packagecloud.io | ogle           |
| C/C++    | clang-format | cpack     | packagecloud.io | coverage, packagecloud.io                    | execpermfix    |
| Python   | black        | stdeb     | packagecloud.io | coverage, packagecloud.io                    | ftpsmartsync   |
| ghaction | No           | N/A       | Docker Hub      | marketplace, github release, dockerhub release | ghaction-cmake |

7.1 Disclaimer

Be aware that I’ve tagged github actions that I created with a disclaimer tag that links to this section.

I had created some github actions before starting this document, and I ended up creating a few more while writing it. I tried to avoid doing that, but sometimes the shortcomings were too severe and/or very easy to overcome.