Sometimes you’d like a copy of a git repository, but don’t need the entire git history. For large repositories, the history can be slow to download and may consume a lot of disk space. The history is unnecessary when:

  • Exploring or building at a specific branch or tag
  • Building a git repository in a Dockerfile
  • Scaffolding a project from a template

In this post I show how to use the GitHub API to download a snapshot of a repository, and how this might look in a Dockerfile. I also describe degit, a project scaffolding tool that copies repositories from GitHub and other providers.

Using the GitHub API

The GitHub API allows downloading a tar archive of a repository by calling:

GET /repos/{owner}/{repo}/tarball/{ref}

ref is a commit, branch, or tag. If omitted, the default branch is used.

Replacing tarball with zipball gives a zip archive.
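As a small illustration of how the pieces of that endpoint fit together, here is a sketch of a shell function that assembles the URL. The name github_archive_url and its argument order are my own choices for this example, not part of any GitHub tooling.

```shell
#!/bin/sh
# Build the archive URL for a repository snapshot.
# github_archive_url is a hypothetical helper name used for illustration.
#   $1 owner, $2 repo, $3 ref (optional), $4 format: tarball (default) or zipball
github_archive_url() {
  owner=$1
  repo=$2
  ref=${3:-}
  format=${4:-tarball}
  case $format in
    tarball|zipball) ;;
    *) echo "unknown archive format: $format" >&2; return 1 ;;
  esac
  # Append "/ref" only when a ref was given; without it the default branch is used.
  printf 'https://api.github.com/repos/%s/%s/%s%s\n' \
    "$owner" "$repo" "$format" "${ref:+/$ref}"
}

# Tarball at a tag, then a zip archive of the default branch.
github_archive_url msmolens treetop v1.0.0
github_archive_url msmolens treetop "" zipball
```

The function only builds a string, so it is safe to run without network access.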

The server responds with a redirect: the status code is 302 Found, and the Location header contains the URL from which to download the archive.
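If you want to see the redirect for yourself, one option is to request the endpoint without following redirects and pull the Location header out of the response. The sketch below assumes curl and awk are available; location_header is a name I made up for this example.

```shell
#!/bin/sh
# Print the value of the Location header from HTTP response headers on stdin.
# location_header is a hypothetical helper name used for illustration.
location_header() {
  # Header names are case-insensitive and header lines end in CRLF,
  # so compare in lowercase and strip the trailing carriage return.
  awk 'tolower($1) == "location:" { sub(/\r$/, "", $2); print $2 }'
}

# Usage (needs network access): request the endpoint without -L, discard the
# body, dump the response headers to stdout, and extract the redirect target.
#   curl -s -o /dev/null -D - \
#     https://api.github.com/repos/msmolens/treetop/tarball/v1.0.0 | location_header
```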

For example, you can download an archive from the command line with curl. The -L flag tells curl to follow the redirect:

curl \
  -L \
  -o ./treetop-v1.0.0.tar.gz \
  https://api.github.com/repos/msmolens/treetop/tarball/v1.0.0
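GitHub's archives wrap the repository contents in a single top-level directory, so you will usually want to strip it when unpacking. A minimal sketch, assuming a tar that supports --strip-components (GNU tar and bsdtar both do); extract_snapshot is a hypothetical helper name:

```shell
#!/bin/sh
# Unpack a downloaded snapshot into a directory of your choosing.
# extract_snapshot is a hypothetical helper name used for illustration.
#   $1 path to the .tar.gz archive, $2 destination directory
extract_snapshot() {
  archive=$1
  dest=$2
  # --strip-components=1 drops the archive's single top-level directory,
  # so the repository files land directly in $dest.
  mkdir -p "$dest" &&
    tar -xzf "$archive" -C "$dest" --strip-components=1
}

# Usage, after the curl command above:
#   extract_snapshot ./treetop-v1.0.0.tar.gz ./treetop
```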

In a Dockerfile

When building a git repository in a Dockerfile there’s no reason to download the git history. Here is a snippet that downloads the libsndfile source at a specific commit and verifies the archive’s SHA-256 hash:

ENV LIBSNDFILE_GIT_REF=808fb07864727e64218fe0910abd7b900e575695
ENV LIBSNDFILE_TARBALL_SHA256=d61864f10290dd60adb3c245ae80bcb0a8ee5a7e5d85c8732def0cbc3f39873a
WORKDIR /opt/libsndfile
RUN \
  curl -L https://api.github.com/repos/erikd/libsndfile/tarball/${LIBSNDFILE_GIT_REF} > libsndfile.tar.gz && \
  echo "$LIBSNDFILE_TARBALL_SHA256  libsndfile.tar.gz" | sha256sum -c -w --status - && \
  mkdir libsndfile && \
  tar zxf libsndfile.tar.gz --strip-components=1 && \
  mkdir build && \
  ...

See the compile-for-aws-lambda repository for the complete Dockerfile.
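The verification step in that snippet can be tried outside Docker too. Here is a sketch of the same check as a standalone shell function, assuming a sha256sum that supports -c and --status, such as the one in GNU coreutils; verify_sha256 is a name I chose for illustration.

```shell
#!/bin/sh
# Check a file against an expected SHA-256 hash, as the Dockerfile step does.
# verify_sha256 is a hypothetical helper name used for illustration.
#   $1 expected hash (hex), $2 file to check
verify_sha256() {
  # sha256sum -c reads "<hash>  <file>" lines; note the two spaces.
  # --status suppresses output, so only the exit code reports the result.
  printf '%s  %s\n' "$1" "$2" | sha256sum -c --status -
}

# Usage: compute the hash once from a download you trust, pin it, then verify.
#   sha256sum libsndfile.tar.gz
#   verify_sha256 "$LIBSNDFILE_TARBALL_SHA256" libsndfile.tar.gz || exit 1
```

Because the function reports the result only through its exit code, it slots into an && chain the same way the original command does.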

degit

Rich Harris’ degit tool abstracts downloading copies of git repositories from GitHub, GitLab, and other providers. This is especially useful for scaffolding projects from template repositories.

After installing Node.js, install degit with:

npm install -g degit

To download a GitHub repository at a specific ref into a subdirectory, run:

degit user/repo#ref my-project

For example, Svelte suggests scaffolding a new project with:

degit sveltejs/template my-svelte-project

Here, degit finds the latest commit on the default branch of https://github.com/sveltejs/template and downloads a tar archive of the repository at that commit. It then extracts the archive to my-svelte-project.
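To make the idea concrete, here is a rough, GitHub-only approximation of that command built on the tarball endpoint described earlier. It is a sketch under my own assumptions, not degit's implementation: degit does more than this, including supporting other providers. The function names snapshot_spec_to_url and snapshot are hypothetical.

```shell
#!/bin/sh
# Rough, GitHub-only approximation of `degit user/repo#ref dest`.
# snapshot_spec_to_url and snapshot are hypothetical names, not part of degit.

# Turn "user/repo" or "user/repo#ref" into a GitHub API tarball URL.
snapshot_spec_to_url() {
  spec=$1
  case $spec in
    *'#'*) repo=${spec%%'#'*}; ref=${spec#*'#'} ;;
    *)     repo=$spec;         ref= ;;
  esac
  # Without a ref, the API serves the default branch.
  printf 'https://api.github.com/repos/%s/tarball%s\n' "$repo" "${ref:+/$ref}"
}

# Download the snapshot and unpack it into the destination directory.
snapshot() {
  url=$(snapshot_spec_to_url "$1") || return 1
  dest=$2
  mkdir -p "$dest" &&
    curl -fsSL "$url" | tar -xzf - -C "$dest" --strip-components=1
}

# Usage (needs network access):
#   snapshot sveltejs/template my-svelte-project
```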