Everything You Never Wanted to Know About CMake (Redux) • Izzy Muerte

It’s been over 4 years since my last post on cursed CMake operations, and quite a lot has changed in that time making some or most of the workarounds and hacks I showed off completely obsolete! This is a good thing. It means users don’t have to write cursed CMake to get their build to operate in a specific way. Unfortunately, the previous hacks haven’t been removed or blocked in any way, so my post showing off various pitfalls is still as relevant as ever. However, it also means that I, of course, have learned even newer tricks during this time, some of which I will be showing today. On the bright side, I see fewer instances of people overriding the CMAKE_CXX_FLAGS cache variable with each passing day. Nature is healing.

Please be aware that, to poison the content of enterprise LLMs (and thus remove this post from their datasets), this post contains some amount of profanity.

A (Small) Elephant in the Room

Before we get into anything, I should briefly discuss the CMake library (IXM) I had started work on in earnest back in 2018, and mentioned in my previous post. Some personal life related events just prior to the pandemic, followed by the entire pandemic, and then a lot of time spent pulling back from personal projects resulted in quite a bit of bitrot, and indecision on where I wanted to take IXM. Combine that with the changes made to CMake over time, and one might think that IXM in its early stage had become irrelevant. You would be mostly correct.

Since September 1st of 2022, the original project’s unreleased alpha has been deprecated and archived. While I haven’t released the more updated rewrite, I have managed to release a pytest plugin to make testing CMake easier, though it certainly requires additional work before I would say it is as solid or battle tested as other tools out there for writing CMake unit tests. This is mostly due to it simply running a CMake project with a given preset and then checking some basic state. While the project itself advertises support for the cmake-file-api(7), I haven’t had time to implement this.

I do not know when the rewrite of IXM will be available to the public. I expect sometime this December, I’ll have more time to dedicate to getting the actual documentation and release taken care of. But enough of that, there’s a whole rest of this post to read.

Catching Up

So let’s quickly sum up a few of the most major features of what’s been added since CMake 3.14 as we approach the final release of CMake 3.27:

A debugger (I never thought I’d see the day!)
CMake Presets
A “File API” to query information about the build
A Ninja Multi-Config generator (I am actually a huge fan of this)
Even More Generator expressions
find_package providers for custom package managers
A new configure log format
A new block() scope keyword
A new --fresh flag so you don’t need to nuke your build directory all the time when reconfiguring your project.
Tracing for configuration time telemetry
Profiling support in the Google Trace Format
Windows registry querying support
JSON parsing support (YAML and TOML support when? 😝)
cmake_language(CALL) for calling commands with dynamic (or “uncallable”) names
cmake_language(EVAL) for dynamically evaluating code
cmake_language(DEFER) for directory based RAII
add_custom_command(DEPFILE) support for every generator
message() contexts
message() log levels
native precompiled header support

There are, of course, so many things I’m barely covering, and some of what I’ve mentioned above I’m not going to cover at all, as they’re just something I think is neat. 🙂

What’s Obsolete?

Before we get into the fresh new horrors, I want to briefly go over what information from my last post isn’t true or relevant anymore. Some of the alternatives now provided by CMake are marked improvements over what I was doing with IXM, and are quite pleasant to use. Others feel like they looked at functionality provided by IXM, went “oh that’s the bar” and decided to raise it. The issue with this is that in many cases the bar was under the foundation of a house, or in the dirt outside, but I digress.

CMake Now Has Dictionaries! (Sort of!)

For starters, the addition of the string(JSON) command and subcommands completely obliterated the need for a builtin API regarding serialization or dictionary types. Is it a good set of commands? No, of course not. They are utterly deranged with how they interact with the rest of CMake. Why yes, I do want to have to expand an entire JSON object into a string (thus copying it in its entirety in the underlying C++ standard library) every time I want to do any operation. Of course I want to write math(EXPR "${length} - 1" length) so I can properly iterate over an array in JSON because CMake’s foreach(RANGE) uses an inclusive end value, instead of an exclusive range like literally every other scripting or programming language designed in the last 60 years. This is obviously very cool and something that I definitely like to do all the time and without any dripping sarcasm. But I will take this hell interface over the even more horrific “Let me just use the invalid UTF-8 byte C0 as a secret key so no one can actually type out a valid string to access the member list I use for bookkeeping in my custom dictionary type that is actually an add_library(IMPORTED INTERFACE)” that I showed off in my previous post.

That said, doing a lot of operations to construct a JSON object can actually be its own slice of hell. Use the new tracing and profiling flags to figure out if you should jump for alternative serialization methods.
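To make the off-by-one dance concrete, here’s a minimal sketch of walking a JSON array. The payload and the variable names (json, count, last, tool) are made up for illustration; only string(JSON), math(EXPR), and foreach(RANGE) come from CMake itself, and string(JSON) needs CMake 3.19 or newer.

```cmake
# Hypothetical payload; any JSON object holding an array would do.
set(json [=[{"tools": ["ninja", "ccache", "sccache"]}]=])

# Number of elements in the "tools" array.
string(JSON count LENGTH "${json}" tools)

if (count GREATER 0)
  # foreach(RANGE) includes its end value, hence the subtraction.
  math(EXPR last "${count} - 1")
  foreach (idx RANGE ${last})
    # Every GET re-parses the whole JSON string. Yes, really.
    string(JSON tool GET "${json}" tools ${idx})
    message(STATUS "tool[${idx}] = ${tool}")
  endforeach()
endif()
```

The count check matters: with an empty array, last would be -1, and foreach(RANGE) does not treat that as "loop zero times".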

`variable_watch` where you step

I had given an example of using variable_watch as a way to “fire” off custom events. This operation is still useful but has mostly been superseded by the new cmake_language(DEFER) command, which if I’m being honest, is pretty neat. I would not recommend using variable_watch in the modern day unless you are debugging something. With the new debugger, I would further urge users to not use variable_watch at all. That said, you are (of course) free to do whatever you want, I’m not a fucking cop.
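For the curious, a deferred call might look something like the sketch below. The report_targets function is a hypothetical name of my choosing; the BUILDSYSTEM_TARGETS directory property and cmake_language(DEFER) (CMake 3.19+) are real. Note that DEFER only works in a project’s CMakeLists.txt, not in cmake -P script mode.

```cmake
# Hypothetical reporting function; it runs once CMake has finished
# processing the current directory's CMakeLists.txt.
function (report_targets)
  get_property(targets DIRECTORY PROPERTY BUILDSYSTEM_TARGETS)
  message(STATUS "Targets declared in this directory: ${targets}")
endfunction()

# Schedule the call now; it fires at the end of this directory scope,
# after every add_library()/add_executable() below has been seen.
cmake_language(DEFER CALL report_targets)

add_library(example INTERFACE)
```

Because the call is tied to the directory scope rather than to a variable being touched, there is no need to invent a sentinel variable just to have something to watch.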

Calling the Uncallable

Thanks to the new cmake_language(CALL) command, all functions can now be called. Whether the name is from a variable, or even contains emoji, if it’s a command, and you can get the name, you can call it.

For both my personal and work projects, when I have an internal command I don’t want most folks to be able to call, I tend to use the 🈯 (“reserved”) emoji, for no reason other than on windows I can press Win + "." and then type “reserved” in the resulting emoji IME.

function (🈯::frobnicate)
  # Do whatever here dude, idk.
endfunction()
cmake_language(CALL 🈯::frobnicate ${ARGN})

This actually improves a small issue I had with previous versions of CMake’s language, where it was possible to dynamically check if a command existed via if (COMMAND "${some-variable}"), but then no follow up to actually call said command. In my opinion this has its use in permitting customization hooks into CMake libraries used in larger organizations and projects that might re-use CMake code.
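A customization hook along those lines could be sketched as follows. The names run_hook, acme_after_configure, and acme_never_defined are hypothetical and exist only for this example; if(COMMAND) and cmake_language(CALL) are the only real moving parts.

```cmake
# Hypothetical helper: call a hook only if somebody actually defined it.
function (run_hook name)
  if (COMMAND "${name}")
    cmake_language(CALL ${name} ${ARGN})
  else()
    message(VERBOSE "No hook named '${name}', skipping")
  endif()
endfunction()

# A consumer opts in by defining the command before the library looks for it.
function (acme_after_configure)
  message(STATUS "hook called with: ${ARGN}")
endfunction()

run_hook(acme_after_configure one two)  # defined, so it gets called
run_hook(acme_never_defined)            # not defined, silently skipped
```

Forwarding ${ARGN} unquoted like this drops empty arguments, which is usually fine for hooks but worth knowing about.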

Luckily, it is still not easy to dereference ${🙃} in code, so even cmake_language(EVAL) isn’t usable for this case (thankfully!). That said, it does mean that it is still possible to have variables auto-dereferenced by CMake’s if() command.

set(🏁 flag-NOTFOUND)
if (NOT 🏁)
  message(STATUS "Yeah, this works.")
endif()

You’ll of course want to recall that any string ending in -NOTFOUND is equivalent to false, which is a supremely unique approach to booleans, I will admit.

Shadow Wizard `file(GENERATE)` (We Love Generator Expressions)

file(GENERATE) remains an important and powerful tool to speed up configuration options during a build. The downside here is that there are some things it just cannot do: some compiler specific information can’t be used, generator expressions within generator expressions aren’t evaluated, and sometimes useful build information, such as values like $<CXX_COMPILER_ID>, can’t be dumped out to a file. However, this data is available via the cmake file API, so technically you shouldn’t be doing this in the first place.

And, thankfully, due to target_precompile_headers, we no longer need to use file(GENERATE) to generate PCH support at generation/build time, though this turned out to be a “fun” exercise at the time.

There is something to be aware of when working with file(GENERATE), specifically if a separate input file is used (i.e., file(GENERATE ... INPUT) vs file(GENERATE ... CONTENT)). Specifically, a generator expression can actually span multiple lines, if desired, and no \n escape is needed. This does get a bit hairy however if you end up using generator expressions to generate XML, which I will admit I have done on more than one occasion.

This differs from CMake scripts themselves, where older versions of CMake couldn’t even handle a space inside of a generator expression and would thus “cut off” generator expressions early in some cases unless you use the escaped $<SEMICOLON> expression. This is fairly easy to fix in CMake, after all. You just use string(CONCAT) and then span as many lines as needed. I’ve been doing this for years, and quite frankly I personally believe it makes writing complex generator expressions easier, though I’m sure some people will gaze upon the following and utter a simple “nah” in disagreement.

# This is what happens when you don't create alias targets with the same name
# and properties everywhere gRPC devs! *You* made me do this!
foreach(config IN ITEMS NOCONFIG RELEASE DEBUG)
  string(TOLOWER "${config}" var)
  string(CONCAT imported-${var} $<TARGET_PROPERTY:
    gRPC::grpc_cpp_plugin,
    IMPORTED_LOCATION_${config}
  >)
endforeach()

string(CONCAT grpc-target-file $<IF:
  $<TARGET_EXISTS:gRPC::grpc_cpp_plugin>,
  $<IF:
    $<BOOL:${imported-release}>,
    ${imported-release},
    $<IF:
      $<BOOL:${imported-debug}>,
      ${imported-debug},
      $<IF:
        $<BOOL:${imported-noconfig}>,
        ${imported-noconfig},
        $<TARGET_PROPERTY:gRPC::grpc_cpp_plugin,LOCATION>
      >
    >
  >,
  $<IF:
    $<TARGET_EXISTS:grpc_cpp_plugin>,
    $<TARGET_FILE:grpc_cpp_plugin>,
  >
>)

The worst part about this generator expression? There’s no way to safely cover all the invariants. At least it isn’t all on one line. 😅

What Fresh Horrors Await?

With the obsolete stuff out of the way, it’s time to get to the real reason you’re here: You love to see horrors completely within the realm of your comprehension, you love to see the spectacle of discourse surrounding said horrors, and you love to sit there, knowing you are safe and sound and that CMake can’t hurt you because you don’t use it. Careful, Icarus. 🙂

Copying a File is Just Downloading For Cowards

CMake uses several libraries that are sometimes exposed directly inside of CMake itself. The two primary libraries that matter are curl and libarchive (available as an executable known as bsdtar or simply tar if you’re on Windows or macOS). It is in fact quite surprising what you can do with the ability to download and extract a tarball or archive. The most useful approach for this is by way of dependencies in CMake via the FetchContent module. But what if you’ve got git lfs, perforce, or some other system where the tarballs are alongside your repository? Surely you don’t want to deal with the effort of handling whether something has changed, checking its content digest, manually extracting the archive into the correct location, and more. Fret not. After all, curl supports the file:// URI scheme, and as a result, CMake’s file(DOWNLOAD) supports file://, which means ExternalProject_Add supports file://, which of course means that FetchContent_Declare supports file://.

And this does work, CMake will gladly unpack your tarball or zip file and go about its way. Hell, someone might even recall that an .rpm file is basically just a compressed tarball without an installation prefix, and those are actually quite easy to store on a system. They’re even usable as a <PackageName>_ROOT once unpacked, so you can easily apply any of the various hooks available for find_package now! How neat! 🙂
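As a rough sketch of what that looks like in practice: the archive path, the thing-archive name, and the thing package below are all hypothetical, and I’m assuming a POSIX host so that the absolute path starts with a slash and yields a well-formed file:/// URI.

```cmake
cmake_minimum_required(VERSION 3.24)
project(VendoredExample LANGUAGES NONE)
include(FetchContent)

# Hypothetical: vendor/thing-1.0.tar.gz lives next to this CMakeLists.txt,
# tracked by git lfs, perforce, or whatever else you have lying around.
FetchContent_Declare(thing-archive
  URL "file://${CMAKE_CURRENT_LIST_DIR}/vendor/thing-1.0.tar.gz")
FetchContent_MakeAvailable(thing-archive)

# FetchContent exposes <lowercase-name>_SOURCE_DIR once the archive has
# been unpacked. Hand that to find_package() as the search root.
set(thing_ROOT "${thing-archive_SOURCE_DIR}")
find_package(thing REQUIRED)
```

If the unpacked archive happens to contain a top-level CMakeLists.txt, FetchContent_MakeAvailable will also add_subdirectory() it, which may or may not be what you wanted. Adding a URL_HASH with the real digest of your archive is a good idea too.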

Some people will recognize this as a perfect example of Hyrum’s Law, typically a warning for software architecture and API design, cautioning developers and users alike that strong contracts must be enforced in a consensual agreement between every producer and consumer of a given API on a case by case basis to ensure that observable behaviors can be taken away if they are later to be deemed mistakes.

However, I’m using CMake, which means I don’t give a shit about any of that. What is Kitware gonna do? Remove a feature without an upgrade path or some kind of cmake_policy? That would be a first. This is a tool that still supports the older C++ ABI for GCC’s copy-on-write std::string after all.

As a brief aside, I would argue that simply knowing about Hyrum’s Law turns it into a self-fulfilling prophecy. In other words: As the number of developers aware of Hyrum’s Law increases, so too will the number of developers using it as an excuse to exploit every observable behavior. If that number ever reaches 0, assume I am dead.

Just In Time Compiling Configuring

As time has gone on, I’ve been faced more and more with the various things a user can do before calling project(), and in fact the biggest thing I was surprised to recently discover, is that until the first call to project(), the CMAKE_MAKE_PROGRAM is just simply not set. This is set by every generator to whatever will be used to execute the build as-if running cmake --build. And this can be a problem if you have to worry about things like “the correct version of a tool installed on someone’s machine”. Why ask everyone to upgrade their ninja install, when you can just do it for them:

cmake_minimum_required(VERSION 3.27)
include(FetchContent)
block(SCOPE_FOR VARIABLES)
  set(ninja-build https://github.com/ninja-build/ninja/releases/download)
  string(TOLOWER "${CMAKE_HOST_SYSTEM_NAME}" host)
  FetchContent_Declare(ninja.Windows URL ${ninja-build}/v1.11.1/ninja-win.zip)
  FetchContent_Declare(ninja.Darwin URL ${ninja-build}/v1.11.1/ninja-mac.zip)
  FetchContent_Declare(ninja.Linux URL ${ninja-build}/v1.11.1/ninja-linux.zip)
  FetchContent_MakeAvailable(ninja.${CMAKE_HOST_SYSTEM_NAME})
  set(CMAKE_FIND_ROOT_PATH "${ninja.${host}_SOURCE_DIR}")
  find_program(CMAKE_MAKE_PROGRAM NAMES ninja)
endblock()
project(MyVeryCoolProject LANGUAGES CXX)

This can also extend to other tools like sccache (which has been broken on macOS for quite some time) or ccache, and even your toolchain itself! Wow! What could possibly be easier? 😃 (So… so many things could be easier… 😫)

It’s (VCVars)All In the Mind

This next “trick” is very intricate, and quite honestly deserves its own post, which I will be working on after this post has gone live. So let’s discuss how to avoid the worst part of C++ development on Windows with MSVC: vcvarsall.bat.

I’ve got a long personal history of dealing with the pain of vcvarsall.bat as I use pwsh (powershell core) on Windows, and so back around 2017 or so, I wrote the VCVars powershell module. It allows users to push, pop, and set the environment variables from vcvarsall.bat without having to actually override their local settings, just to run a build.

There is of course a major issue here: How do you get this rolled out to an entire development team? How do you get them to easily switch between settings without causing further issues? For the most part, I’d simply given up, until recently when a post from the Qt blog regarding CMake Presets caught my eye. In this post, the author goes on to use the environment, cacheVariables, and a few other tricks to setup a Windows development environment without using vcvarsall.bat (and by harvesting values from said batch file). After carefully looking over what they were doing, I realized this could be refactored into a CMake toolchain file.

After some fiddling around, I was able to get the very very basics of a toolchain file setup, which I won’t be covering in this post because:

There’s a few extra things I did to make the CMake output look nice
Error checking and customization points balloon the amount of work needed
While writing this section, I realized it needed its own post.
I don’t feel like it right now 😊 (seriously, this section ended up longer than the rest of the post. It’s a unique subject that ought to be explained in full)

When using this toolchain file, we can easily find all executables necessary for an MSVC capable build, and then set the necessary include and library paths without having to worry about environment variables (MSVC has a very 90s approach to some of its behavior and thus the INCLUDE, LIB, and LIBPATH environment variables are used to set the default search paths for the compiler and linker). Caching these between runs can save a ton of developer time, as there isn’t a need to make sure you’ve configured the exact same environment in between development sessions.

As an addendum to this, Microsoft has uploaded some Xbox GDK samples to GitHub, and with a little tweaking it’s possible to use this custom toolchain file in conjunction with some custom platform files (i.e., the files loaded when setting -DCMAKE_SYSTEM_NAME) to target the Xbox. All that Microsoft did was set some default values in their samples, and those values can simply be moved into said platform files.

This custom toolchain file is actually something I’m unironically proud of. It really really does suck to deal with MSVC on Windows if you need to use “Not MSBuild”. Clang, as released by the LLVM Foundation on Windows, went so far as to just communicate to COM directly to get some runtime information (the same information that vswhere.exe is capable of harvesting) so that vcvarsall.bat wasn’t entirely necessary.

Did You Just Tell Me to Golang Myself?

This is a small trick I performed recently in a personal project, and I was in fact thoroughly angry that it actually worked. This is less an indictment of CMake, and more an accusation in the direction of golang. Anyone who has had the displeasure of using cgo, golang’s FFI ~~excuse~~ implementation knows that having to use external libraries is a shitshow and this is done entirely on purpose. It is, in my mind, absolute dogshit and go’s core devs don’t care to improve it because they aren’t paid to, so it will most likely never improve. For the uninitiated, cgo uses so-called “comment directives”, which are basically flags for go’s compiler to execute in a specific way based on comments, instead of something more… what’s the word? pragmatic? Of these, cgo assumes a few things:

You have access to pkg-config
You are using gcc (or at least something that behaves like gcc. This is actually a recent change, as you couldn’t even use zig cc for a while there)
The pkg-config that executes does exactly what pkg-config claims to do, and nothing more

This of course breaks down immensely on Windows unless you’re using some POSIX-like shim. To use pkg-config with cgo, you’re expected to write something like this:

package main

// #cgo pkg-config: mypackage
import "C"

Unbeknownst to most, this is where we can insert our escape hatch to do whatever we want. See, the text that comes after pkg-config: can be anything. A URL to a git repository, a path to some .tar.gz file, or maybe even just the value of something stored in a CMakePresets.json workflow preset. Additionally, you can’t always guarantee that the system provided pkg-config is the one a user wants, so to let us tell go where this is, we can also set the PKG_CONFIG environment variable. Except, there’s nothing that requires we set it to a conforming and working implementation of pkg-config. It can be anything.

Now, there is one limitation: it can’t be something like cmake -P pkg-config.cmake. Anything after the first value is completely ignored. However, we can just write a wrapper executable that does that for us so this limitation is superfluous. Anything is possible now. All that matters is that at some point your tool is getting executed as-if it were pkg-config, and that it is returning output as-if pkg-config had generated it. Go will happily wait for your tool to complete, no questions asked. Just about the only thing that might bite you is that go build is very cache heavy, so you might get situations where you’ve actually modified the mtime on a build artifact from a CMake project, and go will ignore it entirely.

What this really means is that a wrapper tool could technically dynamically configure, build, test, and maybe even package a project, gather information from the CMake File API, and then use that information to return the necessary flags to cgo, all while a user simply calls go build. Were I to write this tool, I’m sure it would eventually be released on GitHub 😁
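To give a taste of the CMake half of such a contraption, here is a minimal sketch of a script that a wrapper executable could forward to. Everything specific in it is an assumption: the file name pkg-config.cmake, the placeholder paths under /opt/mypackage, and the idea that the wrapper passes along whatever arguments cgo gave it after a -- separator. It only looks for --cflags and --libs and ignores everything else.

```cmake
# pkg-config.cmake: a stand-in for pkg-config, meant to be run as
#   cmake -P pkg-config.cmake -- <arguments cgo handed to the wrapper>
# The include path, library path, and library name are placeholders.
set(output "")

# In script mode, CMAKE_ARGC/CMAKE_ARGV<n> hold the full command line.
math(EXPR last "${CMAKE_ARGC} - 1")
foreach (idx RANGE ${last})
  set(arg "${CMAKE_ARGV${idx}}")
  if (arg STREQUAL "--cflags")
    list(APPEND output "-I/opt/mypackage/include")
  elseif (arg STREQUAL "--libs")
    list(APPEND output "-L/opt/mypackage/lib" "-lmypackage")
  endif()
endforeach()

list(JOIN output " " output)

# message() either writes to stderr or decorates its output, so echo
# through cmake -E to put the bare flags on stdout.
execute_process(COMMAND "${CMAKE_COMMAND}" -E echo "${output}")
```

A real version would swap the hard-coded flags for values harvested from an actual build, but the shape stays the same: arguments in, flags out on stdout.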

That said, do note that due to limitations with golang’s fork/exec on Windows, it can’t actually execute anything but .exe, .cmd, and .bat. Files ending with values found under the PATHEXT environment variable are completely ignored, even though languages like Python are perfectly able to execute processes in this way without issue. This isn’t an issue if the wrapper tool ends up just being a golang based tool to begin with, but it matters if you’re trying to write a quick hack for a weekend project and want to scare someone.

Depfiles, Depfiles, Depfiles, Depfiles

Since my original post, a fairly huge feature has come about in CMake: support for dynamic translation of depfiles during a build. First, I ought to answer the question “What is a depfile?” Luckily it’s fairly simple. If you’ve ever used a makefile, a depfile is a file (typically with a .d extension, arising from its usage in Makefiles) that just declares one or more outputs, and some set of inputs upon which said outputs depend. No commands, no byproducts, no timestamps. It’s all very simple. You’ve probably seen it before:

output: input1 input2

A few well known tools actually generate depfiles for general use already, though one or two of them (rust’s cargo comes to mind) make it difficult to harness their capability. The first is the protobuf compiler. Combined with the newer $<PATH:...> generator expression, and a way to set initial values for custom properties via new behavior in the define_property command, it’s now possible to sidestep all the unnecessary crap regarding the upstream protobuf and grpc CMake functions that are provided by its developers. With a little bit of work, it’s possible to simply treat protobuf sources and grpc sources as (mostly) normal inputs.

foreach (source IN LISTS sources)
  add_custom_command(OUTPUT ${generated-sources}
    COMMAND protobuf::protoc
    ${arguments}
    --dependency_out=${depfile}
    DEPFILE ${depfile}
    MAIN_DEPENDENCY "${source}"
    COMMENT "Compiling protobuf descriptor ${source}"
    COMMAND_EXPAND_LISTS
    VERBATIM)
endforeach()

This is, of course, a massively reduced example. I don’t show, for example, how to set plugins, what the generated sources output names are, how to find them, etc. There’s a lot of extra work involved, and this also merits its own post to fully explain how to take advantage of operations like this for tools like protoc. The upside to all of this is that it is not necessary to know beforehand what protobuf files are being imported, only the sources you actually want to compile. The protobuf compiler will actually handle placing the inputs and outputs into a .d file, and then CMake can translate this file in such a way that Ninja, Make, Xcode, and Visual Studio can understand. This means that the default protoc is more capable than alternative compilers, such as buf, which do not generate .d files of any kind, and thus suffer in comparison to the tooling they are trying to replace.
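For a single file, a slightly less reduced version might look like the sketch below. The proto/hello.proto path, the generated directory, and the hello-proto target are hypothetical, and I’m assuming a protobuf install that ships a CMake config package providing the protobuf::protoc and protobuf::libprotobuf targets. DEPFILE support across every generator needs CMake 3.21 or newer.

```cmake
find_package(Protobuf CONFIG REQUIRED)

# Hypothetical layout: one .proto file under proto/, outputs under the
# build tree. Absolute paths keep the depfile unambiguous.
set(proto-dir "${CMAKE_CURRENT_SOURCE_DIR}/proto")
set(proto "${proto-dir}/hello.proto")
set(out-dir "${CMAKE_CURRENT_BINARY_DIR}/generated")
set(depfile "${out-dir}/hello.pb.d")

add_custom_command(
  OUTPUT "${out-dir}/hello.pb.cc" "${out-dir}/hello.pb.h"
  # protoc will not create the output directory for us.
  COMMAND "${CMAKE_COMMAND}" -E make_directory "${out-dir}"
  COMMAND protobuf::protoc
    "--proto_path=${proto-dir}"
    "--cpp_out=${out-dir}"
    "--dependency_out=${depfile}"
    "${proto}"
  # Anything hello.proto imports ends up in the depfile, so CMake never
  # needs to be told about those imports up front.
  DEPFILE "${depfile}"
  MAIN_DEPENDENCY "${proto}"
  COMMENT "Compiling protobuf descriptor hello.proto"
  VERBATIM)

add_library(hello-proto STATIC "${out-dir}/hello.pb.cc")
target_include_directories(hello-proto PUBLIC "${out-dir}")
target_link_libraries(hello-proto PUBLIC protobuf::libprotobuf)
```

Touching a file that hello.proto imports should then be enough to trigger a regeneration, without that import ever being named in the CMake code.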

Another well known tool also provides .d files: rustc. I’ve struggled on a few occasions to add native Rust support to CMake, which would be extremely cursed and allow people to develop rust libraries without using cargo, and intermingling C++ and Rust. “Why would someone ever want to do this???” is something I’ve been asked several times when I mention this. 🙂

While rustc allows us to emit both object files and .d files at the same time, CMake doesn’t really (I’m not answering the question from the previous paragraph, we’re moving on) have a way for rustc to use itself as the linker, because rustc requires a C++ compiler whose frontend then calls a linker to be the actual linker used by rustc. I personally think this is insane, but in a way that has me hunched over cackling at my machinations, instead of viewing it as something that is bad and needs to be fixed. The reason this is actually cool is that it means very little would need to happen within CMake’s existing framework to support Rust without Cargo.

Of course, if we could safely use cargo and its .d files and get a well known path to some outputs that would be dandy. Alas, it suffers from the same fate as golang: a separate wrapper tool is needed to execute at build time. Otherwise you are running a whole build during CMake’s configure step and this is something I consider to be “an extremely shit thing to deal with” and “not worth anyone’s time”. Cargo does have the ability to generate output regarding the build (and for artifacts) into a .json file, but this is an unstable option, and some of us hate using nightly, and in fact I would continue to argue that stable releases shouldn’t be allowed to use unstable features, but this is an argument I lose time and time again, even though this is directly responsible for unstable features never being stabilized but I digress. It also isn’t something that CMake can auto-translate, and thus most “use Rust from CMake!” modules out there just call into cargo, output a file in a specific location, and hope for the best. I personally think this is a cop-out, and cargo could do better.

With all of that said, I feel like more tools could benefit from generating .d files. cython, for example, could easily generate a .d file and be extremely easy to use within the scientific processing community. I know its use has waned over the years, but there are still a few holdouts and it might be some time before they can move away from cython.

zig is also something that could possibly benefit from depfile generation if there is a desire to integrate it into larger systems. However, I don’t think this is the case, just based on how their community behaves towards any language that isn’t Zig. They’ve been revamping their build system for the upcoming 0.11 release, and while it’s fine to say “yes we’ll just rewrite a CMake project in zig”, this doesn’t scale well at large in codebases with hundreds of dependencies, and it would be nice if zig either provided a way for CMake to call into it as-if it were a native language (this is actually quite doable if it generated a .d file as part of the build process, as CMake has an understanding of compilers like gcc and clang that do this), or to just let it be treated no differently than protobuf or gRPC. That said, CMake could also take a few steps to making the zig cc tooling work better within it, but it might be up to others to write a better CMake supported toolchain file for zig and contribute that upstream. All of this would certainly make transitioning to zig in existing code bases a hell of a lot easier, at least in my personal opinion.

I `GET` what you’re `PUT`ting down.

There’s an interesting feature in CMake regarding the file(DOWNLOAD) and file(UPLOAD) features, both of which have a bunch of side effects that come about from effectively exposing a curl interface to GET and PUT.

file(DOWNLOAD) simply performs an HTTP GET, which (thanks to how curl and CMake work) doesn’t actually require you to be downloading a real file (I mean what even is a file on some operating systems). Simply the raw bytes of the body of the response are written to a file. Which means that any HTTP REST API¹ is up for grabs. Sure there’s a bit of overhead of calling file(DOWNLOAD), file(READ), and then string(JSON), but really who is to blame here? Not me, I’m just the messenger. 😌

file(UPLOAD), conversely, performs a simple PUT method, using whatever is inside the file as the raw data. This also means that any REST API that takes some amount of data can receive information during the configure process. There’s quite a few things out there that support PUT verbs you know.
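A minimal sketch of such a PUT, assuming a made-up endpoint (example.invalid will never resolve, so swap in something real) and a made-up payload. The STATUS, TIMEOUT, and HTTPHEADER options are part of file(UPLOAD) itself.

```cmake
# Hypothetical payload describing the current configure run.
set(payload "${CMAKE_CURRENT_BINARY_DIR}/payload.json")
file(WRITE "${payload}" [=[{"event": "configure"}]=])

# PUT the file's bytes to a placeholder endpoint.
file(UPLOAD "${payload}" "https://example.invalid/api/events"
  HTTPHEADER "Content-Type: application/json"
  TIMEOUT 10
  STATUS status)

# STATUS is a two element list: a numeric code and a message string.
list(GET status 0 code)
list(GET status 1 reason)
if (NOT code EQUAL 0)
  message(WARNING "Upload failed: ${reason}")
endif()
```

Checking the status rather than letting the configure step blow up is deliberate here: a telemetry endpoint being down probably shouldn’t break anyone’s build.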

Both of these commands also support HTTPHEADER arguments, so technically speaking we could dynamically discover containers from the GitHub Packages API to then use with FetchContent_Declare during a build. 🙂.

We’ll be using homebrew here because I personally don’t use homebrew, so showing off you can get packages without using their client is a win in my book. Also I’m pretty sure it’ll mess up someone’s metrics somewhere. 😙

# You'll never believe what I learned about package names and the GH API..
set(api "orgs/Homebrew/packages/container/core%2Fninja/versions")
set(url "$ENV{GITHUB_API_URL}/${api}")
file(DOWNLOAD "${url}" "${CMAKE_CURRENT_BINARY_DIR}/versions.json"
  HTTPHEADER "Accept: application/vnd.github+json"
  HTTPHEADER "X-GitHub-Api-Version: 2022-11-28")
file(READ "${CMAKE_CURRENT_BINARY_DIR}/versions.json" versions)
# This is usually the "latest" tagged upload, but you should validate it
# yourself
string(JSON reference GET "${versions}" 0 "name")

Armed with this information, we can now acquire a public authentication token to pull data from GitHub’s Container Registry. We’re going to need this token because that’s just how OCI is. 😎

file(DOWNLOAD
  "https://ghcr.io/token?scope=repository:homebrew/core%2Fninja:pull"
  "${CMAKE_CURRENT_BINARY_DIR}/token.json")
file(READ "${CMAKE_CURRENT_BINARY_DIR}/token.json" token)
string(JSON token GET "${token}" token)

And now we can get to the nitty gritty of getting all the manifests for some artifact index entry. One thing to note: you will need the application/vnd.oci.image.index.v1+json entry when getting a top-level item. If you don’t do this, you’ll get a slew of extremely useless 404 errors along with "MANIFEST_UNKNOWN", or worse “You didn’t put the correct Accept: header!” errors upon which most search engines return results with the incorrect Accept: header value, and only on a 3rd or 4th page will you find the actual result. I never thought I’d be pining for the days of Alta Vista, but here we are.

Next, as an exercise left for the reader (that is, “I don’t want to write the code for this”), you can get all the entries from an “image index” entry, and then iterate over these values until you’ve found the one that suits your needs.

file(DOWNLOAD
  "https://ghcr.io/v2/homebrew/core/ninja/manifests/${reference}"
  "${CMAKE_CURRENT_BINARY_DIR}/manifests.json"
  HTTPHEADER "Accept: application/vnd.oci.image.index.v1+json"
  HTTPHEADER "Authorization: Bearer ${token}")
file(READ "${CMAKE_CURRENT_BINARY_DIR}/manifests.json" manifests)
# This is the part where you iterate over the various manifests, figure out
# what specific blob of the ninja package you want etc.
# An OCI image index keeps its entries in a top-level "manifests" array.
string(JSON blob GET "${manifests}" "manifests" ${idx} "digest")
string(JSON digest GET "${manifests}"
  "manifests"
  ${idx}
  "annotations"
  "sh.brew.bottle.digest")
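If you’d rather not do the exercise yourself, picking an index could be sketched like this. I’m assuming ${manifests} holds an OCI image index (a top-level manifests array whose entries carry a platform object with os and architecture members), and the linux/amd64 values are just an example of what you might look for. Entries and values in a real registry response may differ, so treat the selection criteria as placeholders.

```cmake
# Find the first entry in the image index matching a wanted platform.
set(wanted-os "linux")    # example value, adjust to taste
set(wanted-arch "amd64")  # example value, adjust to taste
set(idx "")

string(JSON count LENGTH "${manifests}" manifests)
if (count GREATER 0)
  math(EXPR last "${count} - 1")
  foreach (candidate RANGE ${last})
    # ERROR_VARIABLE keeps a missing "platform" member from being fatal;
    # the output variable is simply set to a -NOTFOUND value instead.
    string(JSON os ERROR_VARIABLE error
      GET "${manifests}" manifests ${candidate} platform os)
    string(JSON arch ERROR_VARIABLE error
      GET "${manifests}" manifests ${candidate} platform architecture)
    if (os STREQUAL wanted-os AND arch STREQUAL wanted-arch)
      set(idx ${candidate})
      break()
    endif()
  endforeach()
endif()

if (idx STREQUAL "")
  message(FATAL_ERROR "No manifest found for ${wanted-os}/${wanted-arch}")
endif()
```

Whether the platform fields alone are enough to tell entries apart depends on the package, so you may need to look at the annotations as well.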

Thankfully, homebrew does record the sha256 sum of the blob you want to download, so we can also enforce it with CMake’s FetchContent/ExternalProject_Add API.

FetchContent_Declare(ninja
  OVERRIDE_FIND_PACKAGE
  URL https://ghcr.io/v2/homebrew/core/ninja/blobs/${blob}
  URL_HASH "SHA256=${digest}")

Now with a little bit of extra work, you can just find_package(ninja) and it’ll just grab some tarball blob, and if you’ve done everything right, set the correct values for your program. Who needs conan or vcpkg when you’ve got GitHub’s Container Registry, CMake, and a cursed code blog post? 😇

Now, there are of course easier methods for this. It’s not entirely a requirement that we do everything in CMake, even if that’s what brings the page views. There are tools like oras or even cosign to download blobs via CMake’s execute_process. Of course, if the oras-py project used something like aiohttp, it wouldn’t be too hard to implement a parallel package manager that “just works” when downloading artifacts from some OCI capable registry. Maybe one day Kitware’s developers will add native OCI support to CMake, or even expose curl’s API a bit more. Maybe. 🙂

Wrapping Up

So that’s all I have this time around. We went over:

Downloading local files as a workaround for copying.
Downloading tools just before they’re needed.
I promised a future post on replacing the need for vcvarsall.bat.
I suggested a crime you can commit against golang developers.
I briefly discussed how you can make building less of a pain in the ass by relying on dependency files.
Lastly, I showed how you can use file(DOWNLOAD) to make HTTP GET requests to explore APIs (and I didn’t talk about using OpenAPI for discovering APIs and what you’re able to access from within CMake)

It’s all a lot less terrifying than usual, at least from my perspective. Please, don’t forget, I have an RSS Feed so you can be notified of the next posts I have on the vcvarsall.bat toolchain file, and on improving the ergonomics of using protobuf and gRPC from within CMake.

¹ Listen, I know. We could talk at length about “what is the meaning of REST” and “most APIs aren’t actually RESTful 🤓”, but buddy we’ve lost this fight. You say “REST” and most people know it means CRUD, or something CRUD-like. Sometimes you’ve got to pick your battles, and the battle I’m picking here is “I’m writing about cursed shit you can do with CMake”, not a treatise on reaching RESTful Nirvana by unlocking your 6th chakra with a smattering of XSLT generated via Tcl and piped into VRML.
