Debugging Automated Conda-Forge Feedstock Updates

At QuantCo, we rely heavily on the conda-forge package distribution for most of our external software dependencies. Packages from conda-forge are built transparently from public feedstocks. These are community-maintained GitHub repositories (see here for an example) describing how a package is built from its source code.

The conda-forge community has developed a sophisticated system that automates common maintenance tasks for these repositories. This automation is what makes operating at scale possible: just over 25,000 feedstocks are currently part of conda-forge.

One of the most important tools in this system is the autotick-bot. The autotick-bot applies so-called migrations to feedstocks by creating pull requests (PRs) against their repositories. Most notably, the Version migration keeps the version of the package up to date by checking for new releases on the package's source repository. Because versioning schemes, packages, and the sources they are built from vary widely, even the version migration can be quite complex.

Therefore, a way to locally test and debug the behaviour of the bot is crucial for both feedstock and bot maintainers. As part of QuantCo's long-standing commitment to the open source community, we have recently contributed to the autotick-bot to allow for local debugging of some of the bot's features. While we are still working on some aspects, we are sharing our current progress in this blog post. Find out how you can already use the new debugging features to test version updates on your feedstock!

Understanding the Bot's Pipeline

In its production setting, the autotick-bot is split into several pipeline steps, each of which runs in a separate GitHub Actions job. Between pipeline steps, it uses another GitHub repository, cf-graph-countyfair, to store the internal state of the bot. The repository is named after a JSON file that contains a dependency graph of all conda-forge feedstocks, but it also holds other data. The architecture of the bot dates back to its original development in 2018 and often hits the limits of what can be done with a free CI service like GitHub Actions.

The different pipeline steps are shown in the following diagram:

A flowchart with 5 steps. The steps are: 1. Gather List of Feedstocks, 2. Fetch Recipe Data & Build Dependency Graph, 3. Find Version Updates (Upstream), 4. Prepare Migrators, 5. Run Migrations.
Details about corresponding CLI entrypoints

The different pipeline steps manifest as different CLI entrypoints. For reference, the corresponding entrypoints are (at the time of writing):

  1. gather-all-feedstocks
  2. two separate entrypoints named make-graph and make-graph --update-nodes-and-edges that have to run in sequence
  3. update-upstream-versions
  4. make-migrators
  5. auto-tick
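As a rough sketch of how these entrypoints line up, the following Python snippet assembles one command line per step. It only uses the subcommand names listed above; the executable name `conda-forge-tick` is taken from the debugging command later in this post, and the helper itself is hypothetical and not part of the bot.

```python
# Ordered subcommands of the bot's CLI, as listed above.
# make-graph appears twice because its two invocations must run in sequence.
PIPELINE_STEPS = [
    ["gather-all-feedstocks"],
    ["make-graph"],
    ["make-graph", "--update-nodes-and-edges"],
    ["update-upstream-versions"],
    ["make-migrators"],
    ["auto-tick"],
]


def pipeline_commands(executable="conda-forge-tick"):
    """Return the argument list for each pipeline step, in execution order."""
    return [[executable, *step] for step in PIPELINE_STEPS]


# Print the command lines without running them.
for argv in pipeline_commands():
    print(" ".join(argv))
```

The snippet only prints the commands. Running the full pipeline locally is not what this post covers, so treat the list as an overview rather than a recipe.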

The first step (1) updates the internal list of feedstocks by querying the conda-forge GitHub organization. Thereafter (2), the bot fetches the recipe, which is the main configuration file of a feedstock, and updates the global dependency graph in the cf-graph-countyfair (short form: cf-graph) repository. This graph is used to determine the order in which feedstocks are updated.

In the third step (3), the bot checks for new upstream versions of the package source by using certain update strategies (see below). Then, in the fourth step (4), the bot prepares the migrations that need to be applied to the feedstocks. Most importantly, it calculates a reduced dependency graph ("effective graph") that only contains the feedstocks that need to be updated, for each migration. Finally, in the fifth step (5), the bot applies the migrations to the feedstocks by creating PRs.
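To make the idea of an "effective graph" more concrete, here is a minimal sketch using Python's standard library. It is not the bot's implementation: the function names and the toy graph are hypothetical, and the reduction simply drops every feedstock that does not need the migration, which ignores indirect dependencies through dropped feedstocks.

```python
from graphlib import TopologicalSorter


def effective_graph(dependencies, needs_update):
    """Restrict a dependency mapping to the feedstocks that need a migration.

    dependencies: maps each feedstock to the set of feedstocks it depends on.
    needs_update: set of feedstocks the migration applies to.
    """
    return {
        node: {dep for dep in deps if dep in needs_update}
        for node, deps in dependencies.items()
        if node in needs_update
    }


def migration_order(dependencies, needs_update):
    """Order the affected feedstocks so that dependencies come first."""
    reduced = effective_graph(dependencies, needs_update)
    return list(TopologicalSorter(reduced).static_order())


# Hypothetical toy graph; the feedstock names are for illustration only.
deps = {
    "libfoo": set(),
    "foo-python": {"libfoo"},
    "bar": {"foo-python"},
    "unrelated": set(),
}
print(migration_order(deps, {"libfoo", "foo-python", "bar"}))
# ['libfoo', 'foo-python', 'bar']
```

Ordering the reduced graph this way reflects why the dependency graph matters: a feedstock is migrated only after the feedstocks it depends on.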

Investigating Version Updates

If everything goes well, the bot will issue an automated PR to the feedstock repository. It will look like this one. Note that because the bot currently runs its pipeline steps in batches, there might be a delay of a few hours between a new upstream release and the bot creating a PR. Other migrations (like the addition of Apple Silicon builds) might take even longer because they have lower priority. The version update migration of the bot is enabled by default for all feedstocks.

If a version update PR is not issued, the first step is always to check the conda-forge status page to see if any errors are reported. The autotick-bot has a built-in mechanism to report runtime exceptions to this page by periodically copying exception information to status files in the cf-graph repository, which are read by the status page. However, the bot can also fail to find a new version of the package without raising an exception. In this case, no PR is created and nothing is reported on the status page.

To debug the search for new versions (the third step in the pipeline above), you can run the following command on your local machine (install pixi first and replace pydantic with the name of your feedstock). Run the command in a temporary directory.

pixi exec conda-forge-tick --debug --online --no-containers update-upstream-versions pydantic

Let's break it down part by part:

  • pixi exec conda-forge-tick downloads and installs the bot code in a temporary environment. If you have run this command previously, use pixi exec --force-reinstall conda-forge-tick to update the environment.
  • The --debug flag enables debug logging.
  • The --online flag allows you to run the command without a local clone of the cf-graph repository. It downloads individual files from the repository instead.
  • The --no-containers flag disables the use of Docker containers for some of the bot's operations. In production, containerization is in place for security reasons because the v0 format of the recipe file (which is in the process of being replaced by the v1 format) allows arbitrary code execution.
  • update-upstream-versions is the entrypoint for step 3 of the pipeline.
  • pydantic is the name of the feedstock you want to debug.
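If you want to check several feedstocks, it can be convenient to wrap the command in a small script. The following sketch assembles exactly the command shown above and runs it in a fresh temporary directory. The function names are hypothetical, and the script assumes that pixi is installed and on your PATH.

```python
import subprocess
import tempfile


def build_command(feedstock):
    """Assemble the debugging command from this post for one feedstock."""
    return [
        "pixi", "exec", "conda-forge-tick",
        "--debug", "--online", "--no-containers",
        "update-upstream-versions", feedstock,
    ]


def run_update_check(feedstock):
    """Run the command in a temporary directory and return its combined output."""
    with tempfile.TemporaryDirectory() as workdir:
        result = subprocess.run(
            build_command(feedstock),
            cwd=workdir,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # capture stderr together with stdout
            text=True,
            check=False,  # inspect the output even if the command fails
        )
    return result.stdout


# Example call (requires pixi and network access):
# print(run_update_check("pydantic"))
```

Note that the temporary directory is deleted when the function returns, so any files the bot writes there are lost. Use a regular directory instead if you want to inspect them afterwards.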

In the case of the pydantic feedstock (which does not have any issues), the output will look like this:

2025-01-15 11:29:13,575 INFO     conda_forge_tick.cli || Running in online mode
2025-01-15 11:29:13,575 INFO     conda_forge_tick.cli || Running without containers
2025-01-15 11:29:13,626 INFO     conda_forge_tick.update_upstream_versions || Reading graph
2025-01-15 11:29:15,735 INFO     conda_forge_tick.update_upstream_versions || Updating upstream versions
2025-01-15 11:29:15,736 DEBUG    conda_forge_tick.update_upstream_versions || Fetching latest version for pydantic from PyPI...
2025-01-15 11:29:15,736 DEBUG    conda_forge_tick.update_upstream_versions || Using URL https://pypi.org/pypi/pydantic/json
2025-01-15 11:29:15,842 DEBUG    conda_forge_tick.update_upstream_versions || Found version 2.10.5 on PyPI
2025-01-15 11:29:15,842 INFO     conda_forge_tick.update_upstream_versions || # 0     - pydantic - 2.10.5 -> 2.10.5
2025-01-15 11:29:15,842 DEBUG    conda_forge_tick.update_upstream_versions || writing out file
FINISHED STAGE update-upstream-versions IN 2.925801992416382 SECONDS

As you can see, the bot used the PyPI API to check for new versions of the package. It found version 2.10.5 and determined that the feedstock already has this version. The last step "writing out file" means that the bot wrote the new version to the cf-graph repository (here). If you debug on your machine, the file is created and modified in the versions directory of your current working directory. The bot will use this information in the next pipeline steps to create a PR.
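When you check many feedstocks, reading the logs by eye becomes tedious. The sketch below extracts the summary line from the output shown above. The regular expression is derived only from that sample output, and it assumes (in line with the explanation above) that the value left of the arrow is the feedstock's current version and the value right of it is the version the bot found. The log format may change between bot releases.

```python
import re

# Matches the summary part of a log line such as:
#   ... || # 0     - pydantic - 2.10.5 -> 2.10.5
SUMMARY = re.compile(
    r"\|\| #\s*\d+\s+- (?P<name>\S+) - (?P<current>\S+) -> (?P<found>\S+)\s*$"
)


def parse_summary(line):
    """Return (name, current, found) for a summary line, or None otherwise."""
    match = SUMMARY.search(line)
    if match is None:
        return None
    return match.group("name"), match.group("current"), match.group("found")


line = (
    "2025-01-15 11:29:15,842 INFO     "
    "conda_forge_tick.update_upstream_versions || "
    "# 0     - pydantic - 2.10.5 -> 2.10.5"
)
name, current, found = parse_summary(line)
print(name, "is up to date" if current == found else f"can be updated to {found}")
# pydantic is up to date
```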

Version Sources

We have seen that the bot uses the PyPI API to check for new versions of a package. But there are also other sources that the bot can use. They are implemented in the source code. Besides API logic for CRAN, CratesIO, NPM, NVIDIA and ROSDistro packages, the following sources are often interesting:

  • The Github source uses the GitHub releases feed (https://github.com/OWNER/REPO/releases.atom) to check for new versions.
  • The GithubReleases source uses the https://github.com/OWNER/REPO/releases/latest endpoint to check for new versions. This is helpful because it always resolves to the latest stable release version. The releases feed, by contrast, does not indicate whether a release is a pre-release.
  • The BaseRawURL source can be used for any package. It works by guessing new version strings and checking if the resulting upstream URL exists. For example, if the current version is 2.1.4, the bot will check if 2.1.5, 2.2.0, 3.0.0, or similar versions exist.
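The guessing strategy of the BaseRawURL source can be illustrated with a few lines of Python. This is a simplified sketch and not the bot's actual code: it bumps each numeric component once and resets the components after it to zero. The function names and the URL template are hypothetical.

```python
def next_version_candidates(version):
    """Bump each numeric component in turn, resetting later components to zero."""
    parts = [int(part) for part in version.split(".")]
    candidates = []
    # Start with the last component so that the smallest bump comes first.
    for i in reversed(range(len(parts))):
        bumped = parts[:i] + [parts[i] + 1] + [0] * (len(parts) - i - 1)
        candidates.append(".".join(str(part) for part in bumped))
    return candidates


def candidate_urls(url_template, version):
    """Fill each candidate version into a URL template with a {version} field."""
    return [
        url_template.format(version=candidate)
        for candidate in next_version_candidates(version)
    ]


print(next_version_candidates("2.1.4"))
# ['2.1.5', '2.2.0', '3.0.0']

# Hypothetical URL template, for illustration only.
print(candidate_urls("https://example.org/pkg-{version}.tar.gz", "2.1.4"))
```

The sketch stops at building the candidate URLs. Checking whether each one exists, for example with an HTTP request, is left out.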

Feedstocks can configure the version sources the bot uses in their conda-forge.yml file, overriding the default behaviour (see the documentation).

Example: Debugging minio-server

To illustrate how you can use the debugging capabilities of the bot, let's look at the recipe for minio-server.

At the time of writing, its first lines look like this:

{% set name = "minio-server" %}
{% set version = "RELEASE.2024-12-18T13-15-44Z" %}
{% set sha256 = "d7d044f11a343389b1952c48a3e15feef6efa63c67b9f025c212a04016f05037" %}
{% set time = version|replace("RELEASE.", "") %}

package:
  name: {{ name|lower }}
  version: {{ time|replace("-", ".")|replace("T", ".")|replace("Z", "") }}

source:
  url: https://github.com/minio/minio/archive/{{ version }}.tar.gz
  sha256: {{ sha256 }}

The timestamped version string is quite unusual and does not follow the common semantic versioning scheme. Also, the SHA-256 hash of the source archive is stored in a variable instead of being written directly in the source section. This is usually not recommended because it adds complexity to the recipe, making it less accessible for automated tools like the autotick-bot.
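To see what the Jinja filters in the recipe do, it helps to replay them outside of the recipe. The following sketch mirrors the replace filters with Python's str.replace. The function names are hypothetical, and the snippet only illustrates the string transformations, not how the recipe is actually rendered.

```python
def package_version(upstream_tag):
    """Mirror the recipe's filters: turn the upstream tag into the package version."""
    time = upstream_tag.replace("RELEASE.", "")
    return time.replace("-", ".").replace("T", ".").replace("Z", "")


def source_url(upstream_tag):
    """Mirror the recipe's source URL, which uses the unmodified upstream tag."""
    return f"https://github.com/minio/minio/archive/{upstream_tag}.tar.gz"


tag = "RELEASE.2024-12-18T13-15-44Z"
print(package_version(tag))  # 2024.12.18.13.15.44
print(source_url(tag))
```

The package version and the upstream tag therefore differ in more than a prefix, which is one reason why a tool that works on the rendered version has trouble reconstructing the source URL.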

Nonetheless, the bot should be able to find version updates for this package. Investigating the output of the update-upstream-versions command, we see the following (omitting irrelevant lines):

2025-01-16 10:34:13,477 INFO     conda_forge_tick.cli || Running in online mode
2025-01-16 10:34:13,477 INFO     conda_forge_tick.cli || Running without containers
2025-01-16 10:34:13,519 INFO     conda_forge_tick.update_upstream_versions || Reading graph
2025-01-16 10:34:14,330 INFO     conda_forge_tick.update_upstream_versions || Updating upstream versions
2025-01-16 10:34:16,611 DEBUG    conda_forge_tick.update_upstream_versions || Fetching latest version for minio-server from Github...
2025-01-16 10:34:16,611 DEBUG    conda_forge_tick.update_sources || Found version prefix from url: release.
2025-01-16 10:34:16,611 DEBUG    conda_forge_tick.update_upstream_versions || Using URL https://github.com/minio/minio/releases.atom
2025-01-16 10:34:17,256 DEBUG    conda_forge_tick.update_upstream_versions || Upstream: Could not find version on Github
2025-01-16 10:34:17,257 DEBUG    conda_forge_tick.update_upstream_versions || Fetching latest version for minio-server from GithubReleases...
2025-01-16 10:34:17,257 DEBUG    conda_forge_tick.update_upstream_versions || Using URL https://github.com/minio/minio/releases/latest
2025-01-16 10:34:18,675 ERROR    conda_forge_tick.update_upstream_versions || An exception occurred while fetching minio-server from GithubReleases: Invalid version: 'RELEASE.2024-12-18T13-15-44Z'
2025-01-16 10:34:18,676 DEBUG    conda_forge_tick.update_upstream_versions || Fetching latest version for minio-server from NVIDIA...
2025-01-16 10:34:18,676 DEBUG    conda_forge_tick.update_upstream_versions || Fetching latest version for minio-server from ROSDistro...
2025-01-16 10:34:18,676 DEBUG    conda_forge_tick.update_upstream_versions || Fetching latest version for minio-server from RawURL...
2025-01-16 10:34:18,676 DEBUG    conda_forge_tick.update_sources || orig urls: {'https://github.com/minio/minio/archive/RELEASE.2024-12-18T13-15-44Z.tar.gz'}
2025-01-16 10:34:18,677 DEBUG    conda_forge_tick.update_sources || trying version: 2024.12.18.13.15.45
2025-01-16 10:34:18,799 DEBUG    conda_forge_tick.update_sources || parsed new version: 2024.12.18.13.15.45
2025-01-16 10:34:18,799 DEBUG    conda_forge_tick.update_sources || trying url: https://github.com/minio/minio/archive/2024.12.18.13.15.45.tar.gz
2025-01-16 10:34:19,051 DEBUG    conda_forge_tick.update_sources || version 2024.12.18.13.15.45 does not exist for url https://github.com/minio/minio/archive/2024.12.18.13.15.45.tar.gz
                                 [...other guessed version strings failing...]
2025-01-16 10:34:30,408 ERROR    conda_forge_tick.update_upstream_versions || Cannot find version on any source, exceptions occurred. Raising the first exception.
2025-01-16 10:34:30,408 WARNING  conda_forge_tick.update_upstream_versions || Warning: Error getting upstream version of minio-server: InvalidVersion("Invalid version: 'RELEASE.2024-12-18T13-15-44Z'")
2025-01-16 10:34:30,409 DEBUG    conda_forge_tick.update_upstream_versions || writing out file
FINISHED STAGE update-upstream-versions IN 16.932467222213745 SECONDS

We see that several things go wrong here:

  1. The Github source, which reads the GitHub releases feed, does not find a version, although it should. We can already guess here that it rejects the format of the version string.
  2. The GithubReleases source correctly finds the latest release RELEASE.2024-12-18T13-15-44Z but rejects it because it is not a valid version string.
  3. The BaseRawURL source (output in large part omitted) has no chance of guessing the latest version because the version string is a timestamp. This is expected.
  4. The BaseRawURL source incorrectly assembles URLs to guess new versions by omitting the RELEASE. prefix.

One way to fix such issues is to update the recipe so that it better matches the features and assumptions of the bot. In this PR, for example, the upstream maintainers had added a v prefix to the version string, which had to be reflected manually in the recipe.

In this case, however, the bot itself must be fixed to accept the timestamped version string, because the Github and GithubReleases sources are the only ones that could find the version, and we cannot influence the version string format on the upstream repository. As a feedstock contributor, this should be the point where you open an issue in the cf-scripts repository. In the case of the example above, we have done that here. After the autotick-bot was fixed, the recipe for minio-server had to be updated slightly to make the version variable match exactly what the version source returns. Automatic version updates now work for this feedstock.

Conclusion and Outlook

We have shown how you can debug the step of finding new versions of a package in the conda-forge autotick-bot. In most cases, this is a helpful first step to understand why the bot does not create a PR for your feedstock.

To make investigating issues more efficient, we are working on further improvements to the debugging capabilities of the bot:

  • Testing proposed changes: Sometimes, you want to test a change to your recipe before you merge it into your main branch. We are working on a feature that allows you to test a recipe change locally or in a PR and see how the bot would react to it.
  • Debugging other pipeline steps: The bot has many more pipeline steps than just finding new versions. Most of them already allow for local execution, but we are working on making them more accessible.

We hope that these improvements will make it easier for you to maintain your feedstock and contribute to the conda-forge ecosystem. This blog post will be updated as soon as new features are available.