Automated License Checking for Conda Projects
Third-party software is a cornerstone of many modern software projects. These third-party packages come with their own license files.
Instead of forcing all our developers to be familiar with their details, we automate the process of checking our projects for license compliance.
This post is inspired by our recent open-sourcing of conda-deny, a simple CLI to check packages for such compliance.
Why care about licenses?
Almost every bit of software you can find on GitHub or other platforms contains an associated license file. Licenses can seem intimidating at first. Their sheer number makes many people uncomfortable.
We want to introduce you to some ways of thinking about licenses and some tools that might make them less frightening.
When you create a piece of software, you hold its copyright. This means that you can control what others can do with your code. To exercise this control, you create a license text. It serves as legal instruction regarding what permissions you grant to others.
When providing software to third parties or selling it as a product, the license compliance of your package becomes important. Not only do you need to choose an appropriate license for your own code, you also need to make sure that the licenses of all the packages your project depends on are compatible with the way you intend to distribute it.
Consider the following example:
You include a package with the Unlicense license in your project. This leaves you free to modify, sell, and distribute your software however you wish.
If, however, you had included a package with the GPL-3.0 license, distributing your package would generally oblige you to make its source code available under the same license. For software you sell, this can be a problem.
Thus, when relying on open-source software, it becomes crucial to check whether your dependencies include licenses like GPL-3.0 (so-called "strong copyleft" licenses) or other problematic terms.
Nice to know: Licenses in the GPL family are sometimes referred to as "viral" licenses. They "infect" all downstream packages with their own license.
The SPDX License Format
The overwhelm many people feel when first reading about software licenses grows once you consider that, in theory, anyone can create their own license with its own name. This means that a license called "My cool new license" is, in theory, just as valid as "GPL-3.0".
In addition to the name, everybody can write their own legal instructions in the license text. If you suspect that this will result in absolute chaos, you are right.
Luckily, the SPDX initiative has created a standardized way to represent licenses. Its license list covers the vast majority of existing licenses and gives each one a standardized identifier.
Nice to know: If you ever see a license identifier that includes whitespace, e.g., BSD 3-Clause, you can be sure that it is not in SPDX format.
This layer of abstraction makes it easier to check for license compliance. Instead of manually reading the license text and checking whether it complies with your policies, you can now just check the SPDX identifier against a whitelist. This whitelist contains the SPDX licenses you have already reviewed and are comfortable with.
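As a minimal sketch of that idea, the check reduces to a set lookup. The whitelist contents, package names, and the `find_violations` helper below are hypothetical and only illustrate the principle; compound SPDX expressions such as `MIT OR Apache-2.0` are deliberately not handled here.

```python
# Hypothetical whitelist of SPDX identifiers that have already been reviewed.
SAFE_LICENSES = {"MIT", "BSD-3-Clause", "Apache-2.0"}


def find_violations(package_licenses, safe_licenses=SAFE_LICENSES):
    """Return {package: license} for every package whose license is not whitelisted."""
    return {
        name: license_id
        for name, license_id in package_licenses.items()
        if license_id not in safe_licenses
    }


# Illustrative input: package name -> license identifier.
packages = {
    "package-a": "MIT",
    "package-b": "GPL-3.0-only",
    "package-c": "BSD 3-Clause",  # contains whitespace, so not an SPDX identifier
}

print(find_violations(packages))
```

Anything that is not an exact match for a reviewed identifier is reported, which is why a non-SPDX spelling such as "BSD 3-Clause" shows up as a violation too.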
conda-forge
QuantCo relies on the conda ecosystem, especially packages from the conda-forge distribution, for most of its dependencies. conda-forge helps to enforce the SPDX license format by requiring all packages to include a license_file field in their recipe file.
It also encourages feedstock maintainers to ensure that the license file is packaged and that the license is given in SPDX format.
conda-deny
conda-deny builds on the SPDX specification to provide a simple CLI for checking packages for license compliance. It leverages the local availability of license specifiers in pixi.lock files to check their corresponding licenses against a user-provided whitelist.
This is inspired by cargo-deny, which offers similar functionality for Rust projects.
Checking pixi projects
The simplest way to use conda-deny is to run it with the --osi flag. This lets you skip the whitelist and just check the licenses for OSI approval. For details about the whitelist format, check out the section on Configuration.
# Assuming you have a pixi.lock file in the current directory
conda-deny check --osi
The metadata in pixi's project configuration makes the retrieval of the associated SPDX identifiers straightforward.
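To illustrate what an OSI-approval check amounts to, here is a small sketch. It assumes license metadata in the shape of the SPDX license list's `licenses.json`, where each entry has a `licenseId` and an `isOsiApproved` flag. The three entries are an illustrative excerpt, `is_osi_approved` is a hypothetical helper, and none of this is necessarily how conda-deny implements `--osi`.

```python
# Illustrative excerpt in the shape of the SPDX license list (licenses.json),
# where each entry carries a "licenseId" and an "isOsiApproved" flag.
SPDX_LICENSES = [
    {"licenseId": "MIT", "isOsiApproved": True},
    {"licenseId": "BSD-3-Clause", "isOsiApproved": True},
    {"licenseId": "CC-BY-NC-4.0", "isOsiApproved": False},
]

# Collect the identifiers that are flagged as OSI-approved.
OSI_APPROVED = {
    entry["licenseId"] for entry in SPDX_LICENSES if entry["isOsiApproved"]
}


def is_osi_approved(license_id):
    """True if the identifier is flagged as OSI-approved in the excerpt above."""
    return license_id in OSI_APPROVED
```

Identifiers missing from the list are treated as not approved, which is the conservative choice for a compliance check.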
Checking non-pixi projects
If your setup doesn't specify its environment locations, e.g. because you use micromamba or conda as a package manager, we provide the --prefix flag. You can use it to specify the path to the environment you want to check.
micromamba env list
> test-env /Users/user/micromamba/envs/test-env
conda-deny check --prefix /Users/user/micromamba/envs/test-env
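If you are curious what license metadata such an environment carries, the following sketch reads it directly. It relies on the fact that conda-style environments keep one JSON record per installed package in the `conda-meta` directory. The `list_environment_licenses` helper is hypothetical, and this is not necessarily how conda-deny inspects a prefix.

```python
import json
from pathlib import Path


def list_environment_licenses(prefix):
    """Map package name -> license string for every record in <prefix>/conda-meta."""
    licenses = {}
    for record_path in sorted(Path(prefix, "conda-meta").glob("*.json")):
        record = json.loads(record_path.read_text(encoding="utf-8"))
        # Not every package declares a license, so fall back to None.
        licenses[record["name"]] = record.get("license")
    return licenses


# Example call, using the environment path from above:
# list_environment_licenses("/Users/user/micromamba/envs/test-env")
```

Packages that map to `None` declare no license at all and are worth a manual look.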
Checking recipes
When building a conda package, you need to define the environment that the package will be installed in. It can be helpful to check whether this environment complies with your license constraints. A setup along the following lines should work in most cases:
# recipe.yml
tests:
  - script:
      - if: unix
        then: conda-deny check --prefix $CONDA_PREFIX
        else: conda-deny check --prefix %CONDA_PREFIX%
    files:
      source: pixi.toml
    requirements:
      run: [conda-deny]
rattler-build allows you to run arbitrary scripts at test time in the build process. This example uses the conda-deny configuration in pixi.toml to check the environment created at $CONDA_PREFIX.
Configuration
You can configure conda-deny in your pixi.toml or pyproject.toml. Alternatively, you can provide a custom configuration path with the --config flag.
The following configuration options are available:
[tool.conda-deny]
#--------------------------------------------------------
# General setup options:
#--------------------------------------------------------
license-whitelist = "https://raw.githubusercontent.com/QuantCo/conda-deny/main/tests/test_remote_base_configs/conda-deny-license_whitelist.toml" # or ["license_whitelist.toml", "other_license_whitelist.toml"]
platform = "linux-64" # or ["linux-64", "osx-arm64"]
environment = "default" # or ["default", "py39", "py310", "prod"]
lockfile = "environment/pixi.lock" # or ["environment1/pixi.lock", "environment2/pixi.lock"]
#--------------------------------------------------------
# License whitelist directly in configuration file:
#--------------------------------------------------------
safe-licenses = ["MIT", "BSD-3-Clause"]
ignore-packages = [
    { package = "make", version = "0.1.0" },
]
Conclusion
Licenses are an ever-present topic in the world of software engineering. We hope this short post has given you some idea of how you can deal with licenses in your projects and how conda-deny can help you with this task.