โ—index ๐Ÿฆape-rust-intro.md ๐Ÿท๏ธtags ๐Ÿ‘คabout

๐Ÿฆ One Rust Binary, Six Operating Systems: Meet the Actually Portable Executable

First post in the one-bin-to-rule-them-all series. An investigation into whether a single cargo build can produce a binary that runs natively on Linux x86_64, Linux aarch64, macOS arm64, FreeBSD, OpenBSD, and Windows, with no emulation, no containers, no build matrix.


I wanted to know if I could build a Rust binary once, copy it to any mainstream OS, and have it run natively 🦀. Not through QEMU, not through Wine, not through a Docker container with a compat layer. The same machine code talking to whichever kernel ABI the host happens to speak.

The short answer? Yes for sync Rust, no for async Rust (yet), and the cost of getting there is one forked libc, a handful of tiny downstream crate forks, a stack of std patches, and a lot of respect for the people who made this possible in C before me 💗.

This post sets the stage: what an APE is, what the toolchain looks like, how Rust plugs into it, and the shape of the experiment that runs through the rest of the series.

Credit where it's due

This project sits on top of two pieces of work without which none of it would exist.

Justine Tunney's Cosmopolitan is the foundation. Cosmopolitan is a C toolchain that produces Actually Portable Executables: single binary files that are simultaneously valid ELF, PE, Mach-O, and shell scripts, and that run natively on Linux, macOS, FreeBSD, OpenBSD, and Windows with zero per-OS recompilation. It is one of the most quietly radical projects in systems programming 🔥. The trick, a single polyglot file that every OS loader is willing to accept, sounds like it really shouldn't be possible, and then you read the blog post and it is, and you spend the next week in a daze 🤩. If you haven't seen it before, go read that first. I'll wait.

ahgamut/rust-ape-example is the other pillar. Back in 2022–2023, @ahgamut figured out how to get rustc to emit object code that cosmocc's linker would accept, wrote a custom Rust target spec (x86_64-unknown-linux-cosmo.json), a GCC linker wrapper, and a "hello world" that actually works. That repo had gone quiet (the last commit is from November 2023), but the bones are entirely correct. My starting point was to resurrect it, not to reinvent it. Anywhere in this series where I talk about "the cosmo target spec" or "the linker wrapper," I am talking about ahgamut's code.

The rest of the work, the probe matrix, the forks, the std patches, the cosmo-sysconf crate, is mine. The scaffolding it rests on is not.

What is an APE, actually

An Actually Portable Executable is one file that multiple operating systems will all load and run, natively, without any translation layer.

Let me describe the physical file. The first bytes of an APE are MZqFpD\r\n, an MS-DOS stub header (MZ is the DOS executable magic that every Windows PE file also starts with), followed by a two-byte jump instruction, followed by a shebang-looking string that makes the file recognisable to Bourne shells as a self-extracting script. Later in the same file are an ELF header (for Linux/BSD), a Mach-O header (for macOS), and a ZIP central directory (for embedded assets). Every header starts at an offset that is meaningful to at least one loader and skipped by the others. The jumps and NOPs are arranged so that whichever loader picks up the file, its control flow ends up at the same _start.
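To make the "first bytes" idea concrete, here is a minimal sketch of a magic-prefix check. It assumes only the two magic strings that appear in this post (MZqFpD and jartsr); the function names are mine, not part of any cosmopolitan tooling, and the check says nothing about the ELF, Mach-O, PE, or ZIP structures further into the file.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// The two prolog magics mentioned in this post.
static APE_MAGICS: [&[u8]; 2] = [b"MZqFpD", b"jartsr"];

/// Returns the matching magic if `bytes` starts like an APE.
/// Only the six-byte prefix is inspected; nothing later in the file is validated.
fn ape_magic(bytes: &[u8]) -> Option<&'static [u8]> {
    APE_MAGICS
        .iter()
        .copied()
        .find(|magic| bytes.starts_with(magic))
}

/// Hypothetical helper: read at most six bytes from a file and classify it.
fn is_ape_file(path: &Path) -> io::Result<bool> {
    let mut head = Vec::with_capacity(6);
    File::open(path)?.take(6).read_to_end(&mut head)?;
    Ok(ape_magic(&head).is_some())
}
```

Calling is_ape_file on hello.com should report true, while an ordinary ELF binary (which starts with \x7fELF) should not match.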

When you run an APE on Linux, one of two things happens. Either binfmt_misc has been told about the MZqFpD magic and hands the file straight to an installed APE loader, or the kernel doesn't recognise the magic, execve fails with ENOEXEC, and the calling shell falls back to interpreting the file as a script, at which point the /bin/sh prolog at the top of the file takes over, self-extracts the right payload, and re-execs it. Both paths land in the same place. On macOS, the kernel sees the Mach-O header and loads the appropriate payload. On Windows, the PE loader does its thing. On BSDs, there's a chain of prologs that detect the OS and jump to the right code path.

The detection happens at load time, not compile time. The binary itself has no #ifdef branches; it carries all the OS-specific syscall layers and picks one at boot.
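A rough Rust sketch of the difference between the two shapes follows. The HostOs enum and both function names are hypothetical, and how the host gets detected at startup is deliberately left out. The numbers are the x86_64 syscall numbers for write(2), used here only to illustrate that one binary has to carry every host's answer.

```rust
/// Hosts from the target matrix in this series (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum HostOs {
    Linux,
    MacOs,
    FreeBsd,
    OpenBsd,
    Windows,
}

/// Compile-time answer: fixed when the crate is built, regardless of
/// which machine the binary later runs on.
fn compiled_for() -> &'static str {
    std::env::consts::OS
}

/// Run-time answer: every host's value ships in the binary and one is
/// picked once the host is known. x86_64 numbers for write(2).
fn write_syscall_number(host: HostOs) -> Option<u32> {
    match host {
        HostOs::Linux => Some(1),
        HostOs::FreeBsd | HostOs::OpenBsd => Some(4),
        // macOS x86_64 puts the BSD syscall class (2) in the top byte.
        HostOs::MacOs => Some(0x0200_0004),
        // Windows has no stable syscall numbers; that layer goes
        // through Win32 calls instead, modelled here as None.
        HostOs::Windows => None,
    }
}
```

A normal Rust target only ever needs the first function. An APE needs the second kind of table for every OS-dependent value it touches.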

This is so close to impossible that the first time you produce one it feels like a magic trick 🪄. Here is what file makes of hello.com:

โฏ_bashโ€บ2 lines
  1โฏโฏโฏ file hello.com
  2hello.com: DOS/MBR boot sector; partition 1 : ID=0x7f, active, ...

That's the MS-DOS stub showing through the file heuristic, and the same physical bytes run on six operating systems 🤯.

Toolchain bring-up

The toolchain I used for this investigation is cosmocc 4.0.2, a cosmopolitan release archive that packages:

  • x86_64-linux-cosmo and aarch64-linux-cosmo GCC cross-toolchains (based on GCC 14.1.0)
  • apelink, the linker wrapper that fuses multiple per-arch ELF binaries into one fat APE
  • APE loader payloads for each host OS (ape-x86_64.elf, ape-aarch64.elf, ape-x86_64.macho, and a small C source ape-m1.c for macOS arm64)

Installing it is "download the zip, unzip it, put bin/ on your PATH." No package manager, no build-from-source, no dependencies.

A trivial C program to confirm the toolchain works:

โฏ_bashโ€บ5 lines
  1โฏโฏโฏ cosmocc -o hello.com hello.c
  2โฏโฏโฏ file hello.com
  3hello.com: DOS/MBR boot sector; partition 1 : ID=0x7f, active, ...
  4โฏโฏโฏ ./hello.com
  5hello from APE

The binary is 597 KB; that's the full cosmopolitan runtime linked statically. It's not tiny, but it is a single file with no ldd output, no dynamic linker requirement, and no platform-specific package. For a CLI tool that currently ships as six binaries plus a platform detection script, that is a real trade.

binfmt_misc is optional

APEs just run. The shell-script prolog at the top of the file is valid /bin/sh, and on any Linux system where execve falls through to shell execution, the prolog self-extracts the right payload and re-execs it. You don't have to configure anything. I ran the whole investigation's aarch64 targets without ever touching binfmt and they worked fine.
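The "falls through to shell execution" step is something shells do on your behalf: when exec fails with ENOEXEC, they retry the file as a script. If you launch an APE from Rust rather than from a shell, you may have to do that retry yourself. A minimal sketch, assuming std::process::Command surfaces the errno from the failed exec, with a hypothetical function name:

```rust
use std::io;
use std::process::{Command, ExitStatus};

/// ENOEXEC ("Exec format error") is errno 8 on Linux, macOS, and the BSDs.
const ENOEXEC: i32 = 8;

fn is_exec_format_error(err: &io::Error) -> bool {
    err.raw_os_error() == Some(ENOEXEC)
}

/// Try to run `path` directly. If the kernel rejects the file format,
/// hand the same file to /bin/sh as a script, which lets the prolog at
/// the top of the file take over.
fn run_with_sh_fallback(path: &str, args: &[&str]) -> io::Result<ExitStatus> {
    match Command::new(path).args(args).status() {
        Err(err) if is_exec_format_error(&err) => {
            Command::new("/bin/sh").arg(path).args(args).status()
        }
        other => other,
    }
}
```

Any other failure (missing file, no execute permission) is passed through unchanged, so the fallback only triggers for the unrecognised-format case.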

If you want APEs to launch directly without the prolog round-trip, you can register the magic bytes with binfmt_misc so the kernel hands APEs straight to a dedicated loader:

โฏ_bashโ€บ7 lines
  1โฏโฏโฏ sudo install -m 0755 /path/to/cosmocc/bin/ape-x86_64.elf /usr/bin/ape
  2โฏโฏโฏ cat /etc/binfmt.d/APE.conf
  3:APE:M::MZqFpD::/usr/bin/ape:
  4:APE-jart:M::jartsr::/usr/bin/ape:
  5โฏโฏโฏ sudo systemctl restart systemd-binfmt
  6โฏโฏโฏ cat /proc/sys/fs/binfmt_misc/APE
  7enabled

Two magic strings because cosmopolitan has produced two prolog variants over the years (MZqFpD and jartsr). Both resolve to the same loader binary. This is persistent across reboots and a tiny bit faster than the prolog path, but it is an optimisation, not a requirement. If you don't care, don't bother.
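Those two rule lines are terse, so here is a small sketch that splits one into its parts. It follows the kernel's documented registration format, :name:type:offset:magic:mask:interpreter:flags, where the first character is the delimiter. The struct and function names are mine, and the parser does not decode \x escapes in the magic field.

```rust
/// One binfmt_misc rule, field by field.
#[derive(Debug, PartialEq, Eq)]
struct BinfmtRule<'a> {
    name: &'a str,
    kind: &'a str,        // "M" matches magic bytes, "E" matches an extension
    offset: &'a str,      // empty means offset 0
    magic: &'a str,
    mask: &'a str,        // empty means compare every bit
    interpreter: &'a str, // program the kernel hands the file to
    flags: &'a str,
}

fn parse_binfmt_rule(line: &str) -> Option<BinfmtRule<'_>> {
    let line = line.trim();
    // The first character of the rule defines the field delimiter.
    let delim = line.chars().next()?;
    let fields: Vec<&str> = line[delim.len_utf8()..].split(delim).collect();
    if fields.len() != 7 {
        return None;
    }
    Some(BinfmtRule {
        name: fields[0],
        kind: fields[1],
        offset: fields[2],
        magic: fields[3],
        mask: fields[4],
        interpreter: fields[5],
        flags: fields[6],
    })
}
```

Run against :APE:M::MZqFpD::/usr/bin/ape:, this yields name APE, type M, an empty offset and mask, magic MZqFpD, and /usr/bin/ape as the interpreter.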

Rust's turn

The Rust side is where things get interesting. rustc speaks a target-triple language (x86_64-unknown-linux-musl, aarch64-apple-darwin, and so on), and cosmopolitan is not a triple any rustc knows about. The trick ahgamut worked out is to give rustc a custom target spec that describes "x86_64, linux-flavoured, musl-shaped" but wires the linker to cosmocc instead of a native linker.

The target-spec file is a JSON blob. Here's the core of x86_64-unknown-linux-cosmo.json after the changes I had to make to catch up with current rustc:

```json
{
  "llvm-target": "x86_64-unknown-linux-musl",
  "data-layout": "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128",
  "arch": "x86_64",
  "target-pointer-width": 64,
  "target-c-int-width": "32",
  "os": "linux",
  "env": "musl",
  "linker-flavor": "gcc",
  "linker": "./gcc-linker-wrapper.bash",
  "executables": true,
  "has-rpath": false,
  "position-independent-executables": false,
  "crt-static-default": true,
  "crt-static-respected": true,
  "relocation-model": "static",
  "code-model": "large",
  "disable-redzone": true,
  "panic-strategy": "abort"
}
```

The linker field points at a bash wrapper that intercepts rustc's call to cc and redirects it to x86_64-unknown-cosmo-cc from the cosmocc archive. That wrapper is ahgamut's work. It adds a few missing flags and strips a few that cosmocc doesn't understand, and it is only about fifty lines of bash.

To compile something:

โฏ_bashโ€บ5 lines
  1COSMO=$(pwd)/toolchain/cosmocc-4.0.2 \
  2  cargo +nightly build \
  3    --target=./x86_64-unknown-linux-cosmo.json \
  4    -Z build-std \
  5    -Z json-target-spec

-Z build-std rebuilds the Rust standard library against the cosmo target (because no pre-built std ships for a nonexistent triple). -Z json-target-spec is the incantation that tells cargo it's allowed to load a .json target spec from disk. Both are nightly-only flags, so this whole investigation runs on a nightly rustc.

Produce a fat APE by building once per arch and fusing with apelink:

โฏ_bashโ€บ7 lines
  1apelink \
  2  -o hello.com \
  3  -l ape-x86_64.elf \
  4  -l ape-aarch64.elf \
  5  -M ape-m1.c \
  6  hello-x86_64.com.dbg \
  7  hello-aarch64.com.dbg

That's the clean build. The ugly build, the one I actually did first, was to run the command that the rust-ape-example README told me to run, watch it fail, and then fix one error at a time 🤦.

I had to do something!

Target-spec drift: the resurrection

ahgamut wrote the rust-ape-example target spec against a 2023-era rustc. By early 2026, rustc had moved forward, the target-spec JSON schema had tightened, and a fresh clone of the repo no longer compiled. It took seven attempts to get a Rust "hello world" through the toolchain, and each of the first six failed with a different complaint:

| Attempt | Error (truncated) | Fix |
|---|---|---|
| 1 | .json target specs require -Zjson-target-spec | Add the flag |
| 2 | target-pointer-width: invalid type: string "64" | Change "64" to 64 (unquoted) |
| 3 | is-builtin: unknown field | Remove the field |
| 4 | linker flavor args must not be empty | Remove empty late-link-args: {gcc: []} |
| 5 | static CRT can be enabled but crt_static_respected is not set | Add "crt-static-respected": true |
| 6 | data-layout ... differs from LLVM target's default layout | Align the data-layout string to current LLVM/musl |
| 7 | exit 0 | done |

Each of those is a minor version drift in rustc's target-spec validator. None of them is a deep change. The total diff against ahgamut's last commit is two target-spec files edited (−11 / +3 lines each) plus a fresh build-fat.sh wrapper script. No changes to Cargo.toml, no changes to the linker wrapper, nothing in the Rust source.

Which is the right lesson: the hard part is not how you ask rustc to talk to cosmocc; the hard part is that the target-spec format is a moving target, and a repo that worked two years ago will need a trivial polish to work today. ahgamut got the architecture right; maintenance is the price of living on a custom target.

After the resurrection, ./rust-ape-example/hello.com, 1.26 MB of fat APE, printed Hello World! on the host 🚀.

The experimental design

The question this series tries to answer is not "can I build Rust code against cosmocc" (ahgamut answered that one yes in 2022), but "does Rust code then actually run correctly on OSes other than the one you built on?"

The experiment is a probe crate with twelve categories of test (filesystem, networking, time, processes, threads, synchronisation, randomness, environment, panic/unwind, FFI, stdio, signals) plus three real workloads ported on top: ripgrep (sync filesystem search), dog (sync DNS), and xh (async HTTP via reqwest / hyper / tokio / mio). Everything gets built as a single fat APE and run on six targets:

| Target | Arch |
|---|---|
| Linux | x86_64 |
| Linux | aarch64 |
| FreeBSD 14 | x86_64 |
| OpenBSD 7.x | x86_64 |
| Windows Server 2022 | x86_64 |
| macOS | arm64 |

Same binary every time. One command per target (./probe.com, ./rg.com, ./dog.com, ./xh.com). Whatever breaks is a portability finding 🔬.
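To give a feel for what a probe looks like, here is a minimal std-only sketch of the harness shape: a table of named checks, each returning pass or a reason for failure. This is an illustration under my own assumptions, not the actual probe crate; the Probe struct, the three probe functions, and run_probes are hypothetical, and only three of the twelve categories are represented.

```rust
use std::time::Instant;

/// A probe is a named check that either passes or explains why it failed.
struct Probe {
    category: &'static str,
    run: fn() -> Result<(), String>,
}

fn probe_time() -> Result<(), String> {
    let a = Instant::now();
    let b = Instant::now();
    if b >= a {
        Ok(())
    } else {
        Err("monotonic clock went backwards".into())
    }
}

fn probe_environment() -> Result<(), String> {
    std::env::current_dir().map(|_| ()).map_err(|e| e.to_string())
}

fn probe_filesystem() -> Result<(), String> {
    // Write a small file to the temp dir, read it back, and clean up.
    let path = std::env::temp_dir().join(format!("probe-{}.tmp", std::process::id()));
    std::fs::write(&path, b"probe").map_err(|e| e.to_string())?;
    let back = std::fs::read(&path).map_err(|e| e.to_string())?;
    std::fs::remove_file(&path).map_err(|e| e.to_string())?;
    if back == b"probe" {
        Ok(())
    } else {
        Err("read back different bytes".into())
    }
}

const PROBES: &[Probe] = &[
    Probe { category: "time", run: probe_time },
    Probe { category: "environment", run: probe_environment },
    Probe { category: "filesystem", run: probe_filesystem },
];

/// Run every probe and collect (category, outcome) pairs.
fn run_probes() -> Vec<(&'static str, Result<(), String>)> {
    PROBES.iter().map(|p| (p.category, (p.run)())).collect()
}
```

Printing one line per pair gives a per-host results column, and comparing those columns across the six targets is the whole experiment.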

Previewing the answer

Skipping to the end: over the next five posts I'll walk through all fourteen portability findings and the fixes that closed (almost) all of them.

The short version:

  • Synchronous Rust works on all six targets. Probe, ripgrep, and dog all run correctly on Linux x64/aarch64, macOS arm64, FreeBSD, OpenBSD, and Windows. That's three real workloads, 6/6 each. As far as I can tell, this is the first time a non-trivial async-free Rust program has been shown to do that from a single binary.
  • Asynchronous Rust works on Linux only. xh runs correctly on Linux x86_64 and Linux aarch64, but breaks on every non-Linux target. The failure is structural: mio picks its reactor (epoll / kqueue / IOCP) at compile time via cfg(target_os), and there is no current way to do that selection at runtime. Post 5 is about exactly that wall.
  • The key insight is that const is the wrong shape here. The Rust ecosystem assumes libc::EINVAL, libc::CLOCK_MONOTONIC, libc::AF_INET, and a hundred other constants are compile-time integers (very reasonable assumption for a normal Rust target). On a cosmopolitan-linked binary, those numbers happen to be different on every host. The pattern is to turn them into extern "C" { static ... } symbols and let the cosmo loader resolve them at load time. One pattern, applied consistently, closes ten of the fourteen findings.
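The const-versus-runtime point can be sketched on an ordinary target. The example below is a stand-in, not the real mechanism: a OnceLock plays the role of the extern "C" static that the cosmo runtime would resolve, and Host, init_for, and both would_block functions are hypothetical names. The values themselves are real: EAGAIN is 11 on Linux and 35 on the BSDs and macOS.

```rust
use std::sync::OnceLock;

/// Hypothetical host families, enough to show two different values.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Host {
    Linux,
    Bsd,
}

/// Compile-time shape: the Linux value, baked in when the crate is built.
const EAGAIN_CONST: i32 = 11;

/// Run-time shape: a slot filled once at startup. A OnceLock stands in
/// here for a symbol the runtime would resolve at load time.
static EAGAIN: OnceLock<i32> = OnceLock::new();

fn init_for(host: Host) -> i32 {
    *EAGAIN.get_or_init(|| match host {
        Host::Linux => 11,
        Host::Bsd => 35,
    })
}

/// A const can be used as a match pattern.
fn would_block_const(errno: i32) -> bool {
    matches!(errno, EAGAIN_CONST)
}

/// A static cannot be a pattern, so the comparison becomes `==`.
fn would_block_runtime(errno: i32) -> bool {
    EAGAIN.get().copied() == Some(errno)
}
```

After init_for(Host::Bsd), the const version still treats 11 as "would block" and misses 35, while the runtime version gets it right. It also shows a knock-on cost: any downstream code that matches on these constants has to be rewritten as comparisons.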

The remaining posts walk through the evidence and the fixes in order.

Should you bother?

Honestly, as a thing you can drop into your own project today, no. The underlying idea is beautiful, and the proof that sync Rust crosses six OSes from one binary is real, but making it a comfortable path for anyone else would involve a lot of moving parts lining up across several upstream projects (a real rustc target, changes in libc, std, and the async foundational crates, plus something about async reactors that I'll come back to in post 5), and none of those are things I'm in any position to promise will happen. As it stands this is fork-the-world-and-rebase-it-forever, which is a reasonable thing to do once as a learning exercise and a lot to ask of anyone else.

So: is this series a tech demo or a recipe? It's a tech demo. I think it's a very cool tech demo in systems Rust right now, and the honest read is that it would need a lot of careful upstream work before it became a casual cargo install story. What I can show is the shape of the gap, finding by finding.

Try the binaries yourself

If you want to see the matrix with your own eyes instead of reading my results tables, here are the three sync-workload APEs built during the investigation. Each one is a single fat binary containing x86_64 and aarch64 payloads, the APE loader stubs for every supported OS, and the cosmopolitan runtime.

  • probe.com (4.5 MB), the 12-category probe crate from the next post
  • rg.com (27 MB), ripgrep ported onto the cosmo target
  • dog.com (6.6 MB), the dog DNS CLI ported onto the cosmo target

โš ๏ธ Don't trust me. Run them in a VM. These are arbitrary binaries from a stranger on the internet. I believe they are exactly the three programs above, built from the sources I'll walk through over the rest of the series, with no surprises. You have no reason to take my word for that.. If you actually want to try them, drop them into a throwaway VM or container, chmod +x, run them there, and nowhere else. Do not give them access to a filesystem you care about ๐Ÿค.

Next post: the probe matrix, and the eleven ways Rust breaks when it leaves Linux 🔥.
