Sometimes you’d like a copy of a git repository, but don’t need the entire git history. For large repositories, the history can be slow to download and may consume a lot of disk space. The history is unnecessary when:
- Exploring or building at a specific branch or tag
- Building a git repository in a Dockerfile
- Scaffolding a project from a template
In this post, I show how to use the GitHub API to download a snapshot of a repository and how this might look in a Dockerfile. I also describe degit, a project scaffolding tool that copies repositories from GitHub and other providers.
Using the GitHub API
The GitHub API allows downloading a tar archive of a repository by calling:
GET /repos/{owner}/{repo}/tarball/{ref}
Here, ref is a commit, branch, or tag. If omitted, the default branch is used.
Replacing tarball with zipball gives a zip archive.
The server responds with a redirect: the status code is 302 Found, and the Location header contains the URL to download the archive.
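To look at the redirect instead of following it, one option is to ask curl for the response headers only. This is a minimal sketch: extract_location is a hypothetical helper name, and the repository in the usage comment is only an example.

```shell
# Hypothetical helper: read HTTP response headers on stdin and print the
# value of the Location header (header names are case-insensitive).
extract_location() {
  tr -d '\r' | awk 'tolower($0) ~ /^location:/ { sub(/^[^:]*:[ \t]*/, ""); print; exit }'
}

# Usage (needs network access). Without -L, curl stops at the 302 response;
# -D - writes the headers to stdout and -o /dev/null discards the body.
# curl -s -D - -o /dev/null \
#   https://api.github.com/repos/msmolens/treetop/tarball/v1.0.0 | extract_location
```

The printed URL is where the archive itself is served from, which is why the download command needs to follow redirects.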
For example, you can download an archive from the command line with:
curl \
-L \
-o ./treetop-v1.0.0.tar.gz \
https://api.github.com/repos/msmolens/treetop/tarball/v1.0.0
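GitHub's archives wrap the repository files in a single top-level directory, so it is often convenient to drop that directory when unpacking. A minimal sketch, assuming a tar that supports --strip-components (GNU tar and bsdtar both do); unpack_snapshot is a hypothetical helper name.

```shell
# Hypothetical helper: unpack a snapshot archive into a target directory,
# dropping the archive's single top-level directory.
unpack_snapshot() {
  local archive=$1 dest=$2
  mkdir -p "$dest"
  tar -xzf "$archive" --strip-components=1 -C "$dest"
}

# Usage, with the archive downloaded by the curl command above:
# unpack_snapshot ./treetop-v1.0.0.tar.gz ./treetop
```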
In a Dockerfile
When building a git repository in a Dockerfile there’s no reason to download the git history. Here is a snippet that downloads the libsndfile source at a specific commit and verifies the archive’s SHA-256 hash:
ENV LIBSNDFILE_GIT_REF=808fb07864727e64218fe0910abd7b900e575695
ENV LIBSNDFILE_TARBALL_SHA256=d61864f10290dd60adb3c245ae80bcb0a8ee5a7e5d85c8732def0cbc3f39873a
WORKDIR /opt/libsndfile
RUN \
curl -L https://api.github.com/repos/erikd/libsndfile/tarball/${LIBSNDFILE_GIT_REF} > libsndfile.tar.gz && \
echo "$LIBSNDFILE_TARBALL_SHA256  libsndfile.tar.gz" | sha256sum -c -w --status - && \
mkdir libsndfile && \
tar zxf libsndfile.tar.gz --strip-components=1 && \
mkdir build && \
...
See the compile-for-aws-lambda repository for the complete Dockerfile.
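The digest pinned in the Dockerfile has to be computed once from a download you trust. A minimal sketch of that workflow, assuming GNU coreutils sha256sum is available; sha256_of and verify_sha256 are hypothetical helper names.

```shell
# Hypothetical helper: print only the SHA-256 hex digest of a file.
sha256_of() {
  sha256sum "$1" | cut -d' ' -f1
}

# Hypothetical helper: succeed only if the file matches the expected digest.
verify_sha256() {
  local file=$1 expected=$2
  # sha256sum -c reads lines of the form "<digest>  <file>" (two spaces).
  printf '%s  %s\n' "$expected" "$file" | sha256sum -c --status -
}

# Usage:
# sha256_of libsndfile.tar.gz    # run once, then pin the printed digest
# verify_sha256 libsndfile.tar.gz "$LIBSNDFILE_TARBALL_SHA256" || exit 1
```

Failing the build on a mismatch means a changed or corrupted archive is caught before anything is compiled from it.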
degit
Rich Harris’ degit tool abstracts downloading copies of git repositories from GitHub, GitLab, and other providers. This is especially useful for scaffolding projects from template repositories.
After installing Node.js, install degit with:
npm install -g degit
To download a GitHub repository at a specific ref into a subdirectory, run:
degit user/repo#ref my-project
For example, Svelte suggests scaffolding a new project with:
degit sveltejs/template my-svelte-project
Here, degit downloads a tar archive of https://github.com/sveltejs/template at the latest commit on the default branch, then extracts the archive to my-svelte-project.
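For GitHub repositories, a similar result can be approximated with the tarball endpoint described earlier. This is a rough sketch, not degit's actual implementation, and it covers none of degit's support for other providers. spec_to_tarball_url and snapshot are hypothetical helper names.

```shell
# Hypothetical helper: turn "user/repo" or "user/repo#ref" into a
# GitHub API tarball URL. Without a ref, the default branch is used.
spec_to_tarball_url() {
  local spec=$1
  local repo=${spec%%#*}
  local url="https://api.github.com/repos/$repo/tarball"
  case $spec in
    *#*) url="$url/${spec#*#}" ;;
  esac
  printf '%s\n' "$url"
}

# Hypothetical helper: download the snapshot and unpack it into a directory,
# dropping the archive's top-level directory.
snapshot() {
  local url dest=$2
  url=$(spec_to_tarball_url "$1")
  mkdir -p "$dest"
  curl -fsSL "$url" | tar -xzf - --strip-components=1 -C "$dest"
}

# Usage (needs network access):
# snapshot sveltejs/template my-svelte-project
```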