Dogfooding on `ripgrep` and `dog`: Real Rust Tools, One Binary, Six Operating Systems
Fourth post in the one-bin-to-rule-them-all series. Previously: intro, the probe matrix, extern-static and the fork cascade.
The previous post was abstract: a pattern, a fork, a batch of std patches. This one is the opposite: two specific Rust tools picked up from crates.io and asked to work on every OS in the matrix without pretending they're on Linux when they aren't. The tools are ripgrep and dog, ogham's DNS CLI, and they were picked deliberately for what they do and don't exercise. ripgrep is pure sync filesystem work: no sockets, no system configuration, the cleanest possible test of whether Strategy 3 from post 3 holds up on a real codebase. dog is the opposite: all its interesting code is std::net and per-OS config discovery. Between them they cover most of the ground a normal CLI tool actually touches.
Starting with ripgrep
Forking BurntSushi/ripgrep and pointing its Cargo.toml at libc-cosmo via [patch.crates-io] was the literal entire Rust-source change. No additions to src/, no new config, nothing. The things that needed doing were all at the build-wiring layer, and they were the same things I'd done for the probe:
- copy the target JSONs and gcc-linker-wrapper.bash from the probe directory;
- add a .cargo/config.toml with build-std = ["std", "panic_abort", "panic_unwind"] and rustflags = ["--cfg=cosmo"];
- carry the two #[no_mangle] shims for waitid and __xpg_strerror_r that the probe needed, since ripgrep spawns processes (via Command for the pager fallback) and formats io::Error messages;
- gate jemalloc off under cfg(not(cosmo)) (its build script wants autotools cross-compilation, which cosmocc doesn't agree to);
- add the same build-fat.sh wrapper that runs two cargo builds and apelinks the outputs.
rg.com fell out of that. rg --version and a real search (rg "fn main" main.rs across a checkout of the Rust standard library) ran exit-zero on every target in the matrix. No new findings, no panics, no wrong results. The point of running ripgrep wasn't to discover new bugs; it was to confirm that the pattern from post 3 actually does what it says on the tin for a non-trivial real codebase, and that a tool the size of ripgrep doesn't need per-crate patching beyond the [patch.crates-io] libc = ../libc-cosmo line. That's the thing this port established, and with the matrix still green I moved on to the next tool.
The reason I'm spending only a few paragraphs on ripgrep is that there's not much more to say. A boring port is the win condition. If every future Rust-on-cosmo port looked like this, Strategy 3 would be done and I'd have nothing left to write about.
Why dog comes next
Ripgrep does filesystem, processes, and string matching. It does not open a socket, and it does not read any OS-specific system configuration. Both of those are where divergence tends to hide in Rust, and neither of them got exercised by the ripgrep port. So the next tool needed to make a network syscall and read something from whichever /etc-ish config lives on the host, which is exactly what a DNS client does.
dog was small enough to audit end to end, speaks plain UDP/TCP (so I could drop its native-tls feature until I pick a pure-Rust TLS provider that compiles through cosmocc, which is a follow-up for another post), and has the clean command-line shape that makes cross-host comparison easy. I forked it, dropped the TLS features, pointed Cargo.toml at libc-cosmo and getrandom-cosmo, and ran the same fat-APE build.
dog github.com on the Linux host returned an A record from the real nameserver. First try. That was encouraging. Then I copied dog.com to the other five targets and the encouragement ran out.
First matrix run, first finding
| Target | Result |
|---|---|
| Linux x86_64 | A record returned |
| Linux aarch64 | A record returned |
| macOS arm64 | panic: could not initialize thread_rng |
| FreeBSD 14 | panic: could not initialize thread_rng |
| OpenBSD 7.4 | "No nameserver found" |
| Windows Server 2022 | "No nameserver found" |
Two of the failures were the same shape, with a stack trace that dropped us into rand, which dropped us into getrandom. DNS uses randomness to pick query IDs, and getrandom is where all that comes from. The panic said it couldn't initialise the RNG, which is a strange thing for something to say on a host with a working /dev/urandom.
The root cause, once I traced it, is a probe inside upstream getrandom. On Linux it wants to use the getrandom(2) syscall directly, so at startup it makes a dummy call and looks at errno to decide whether the syscall is available:
[Code listing: the availability probe from upstream getrandom/src/linux_android.rs]
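A minimal model of that kind of probe, to make the fall-through concrete. This is not the upstream code: it is a sketch over plain integers so it runs without the libc crate, the function and constant names are hypothetical, and it assumes the host reports "no such syscall" under its own numbering (ENOSYS is 78 on macOS and the BSDs, 38 on Linux).

```rust
// Hypothetical model of an errno-based availability probe.

/// Linux errno values, fixed when the binary is compiled for a Linux target.
const LINUX_ENOSYS: i32 = 38; // "function not implemented" on Linux
const LINUX_EPERM: i32 = 1; // e.g. the syscall is blocked by a sandbox

/// ENOSYS as numbered by macOS and the BSDs.
const BSD_ENOSYS: i32 = 78;

/// Decide whether the syscall looks usable, given the return value of a
/// dummy call and the errno observed afterwards.
fn probe_says_available(ret: isize, errno: i32) -> bool {
    if ret >= 0 {
        return true;
    }
    match errno {
        LINUX_ENOSYS => false, // no kernel support
        LINUX_EPERM => false,  // blocked
        _ => true,             // any other error: assume the syscall exists
    }
}

fn main() {
    // A Linux kernel without the syscall: the probe reports "absent".
    println!("linux numbering: {}", probe_says_available(-1, LINUX_ENOSYS));
    // The same failure under BSD/macOS numbering misses both arms.
    println!("bsd numbering:   {}", probe_says_available(-1, BSD_ENOSYS));
}
```

The second call is the failure mode: the error is real, but because its number is not one the match was compiled to recognise, the probe reports the syscall as available.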
Exactly the pattern post 2 walked through: libc::ENOSYS is a compile-time constant, baked at 38 for Linux. On cosmo-Mac and cosmo-BSD the getrandom(2) syscall doesn't exist, so the call does fail, but the errno that comes back is the host's native "no such syscall" number, not 38. The match falls through to _ => true, the probe concludes "getrandom works", the next real call panics.
The fix follows the exact shape of libc-cosmo, but in a different crate. Fork rust-random/getrandom, branch off, add a cfg(cosmo) gate that skips the probe entirely and goes straight through to /dev/urandom:
[Code listing: the cfg(cosmo) gate]
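A sketch of what a gate of this shape could look like, assuming the build passes --cfg=cosmo as in the ripgrep setup. The function names are hypothetical and the syscall path is a stub; the actual getrandom-cosmo patch is not reproduced here.

```rust
use std::fs::File;
use std::io::{self, Read};

/// Fill `dest` from a random-device file such as /dev/urandom.
fn fill_from_device(path: &str, dest: &mut [u8]) -> io::Result<()> {
    File::open(path)?.read_exact(dest)
}

/// Stand-in for the probe-then-syscall path. A real implementation would
/// call getrandom(2) here; this stub always fails so the sketch stays
/// self-contained.
fn fill_via_syscall(_dest: &mut [u8]) -> io::Result<()> {
    Err(io::Error::new(io::ErrorKind::Other, "not modelled in this sketch"))
}

/// Entry point. Under --cfg=cosmo the probe and the syscall path are
/// skipped and the device file is used directly.
#[allow(unexpected_cfgs)]
fn fill_random(dest: &mut [u8]) -> io::Result<()> {
    if cfg!(cosmo) {
        fill_from_device("/dev/urandom", dest)
    } else {
        fill_via_syscall(dest).or_else(|_| fill_from_device("/dev/urandom", dest))
    }
}

fn main() {
    let mut buf = [0u8; 16];
    match fill_random(&mut buf) {
        Ok(()) => println!("filled {} random bytes", buf.len()),
        Err(e) => println!("no random source available: {e}"),
    }
}
```

The point of gating at compile time rather than fixing the probe is that the probe's answer cannot be trusted on a non-Linux host, so the cosmo build never asks the question.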
Two gates (one in linux_android.rs, one in use_file.rs to suppress a Linux-specific /dev/random readiness poll that's a no-op or error on the non-Linux hosts), wire it into dog via another [patch.crates-io], rebuild.
Second matrix run
| Target | Result |
|---|---|
| Linux x86_64 | A record returned |
| Linux aarch64 | A record returned |
| macOS arm64 | A record returned |
| FreeBSD 14 | A record returned |
| OpenBSD 7.4 | A record returned |
| Windows Server 2022 | still "No nameserver found" |
Five out of six from a single dog.com. The only miss was Windows, and the Windows miss wasn't really about cosmo at all.
The Windows gap isn't about cosmo
Upstream dog has two DNS-configuration paths. On cfg(unix) it parses /etc/resolv.conf. On cfg(windows) it uses the ipconfig crate, which wraps the GetAdaptersAddresses Win32 API to ask Windows for the system's DNS servers per-adapter. That's because Windows doesn't have /etc/resolv.conf; it has a list of network adapters, each with its own DNS configuration, discovered by iterating a linked list in a chunk of memory the OS fills out for you.
We compile with target_os = "linux", so the cfg(unix) path wins, the cfg(windows) code is dead, and at runtime on the Windows host dog reaches for /etc/resolv.conf, doesn't find it, and gives up. That failure would happen to any Linux-musl Rust binary you dropped onto Windows through cosmo. It isn't the pattern of "Linux value baked at compile time doesn't match host," it's "the whole source branch for this OS was compiled out, and the runtime fallback doesn't know what to do."
You can paper over this with a hardcoded 1.1.1.1 fallback. I didn't want to, for two reasons. First, it's cheating; the port is meant to be doing what it says on the tin, which is using whatever DNS the host actually has configured. Second, every other Rust networked tool I might one day want to port is going to have the exact same problem, and solving it once as a reusable crate is more useful than solving it for dog with magic constants.
cosmo-sysconf: one crate, two backends, runtime dispatch
The shape of the fix is straightforward once you name it: a crate that exposes a tiny API (give me the system's DNS servers) and picks its backend at runtime based on what cosmo says the host actually is.
Cosmopolitan's libc exposes a single symbol, __hostos, which is a c_int bitmask populated during startup before main() runs. The mask value 1 means Linux, 4 means Windows, 8 means XNU (macOS), and so on. Read it once, dispatch accordingly:
[Code listing: reading __hostos and dispatching to a backend]
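A minimal sketch of the dispatch step, under stated assumptions. The mask is passed in as a parameter so the example runs anywhere; in a cosmo-linked build it would come from the C symbol, declared along the lines of `extern "C" { static __hostos: c_int; }`. The enum and function names are hypothetical, and only the three host values named above are modelled.

```rust
/// Host mask values as given in the text.
const HOST_LINUX: i32 = 1;
const HOST_WINDOWS: i32 = 4;
const HOST_XNU: i32 = 8;

/// Which implementation should answer "what are the system's DNS servers?".
#[derive(Debug, PartialEq, Clone, Copy)]
enum Backend {
    /// Parse /etc/resolv.conf.
    ResolvConf,
    /// Walk the adapter list returned by GetAdaptersAddresses.
    WindowsAdapters,
}

/// Pick a backend from the host mask. Everything that is not Windows is
/// treated as a resolv.conf host in this sketch.
fn pick_backend(hostos: i32) -> Backend {
    if (hostos & HOST_WINDOWS) != 0 {
        Backend::WindowsAdapters
    } else {
        Backend::ResolvConf
    }
}

fn main() {
    for (name, mask) in [("linux", HOST_LINUX), ("windows", HOST_WINDOWS), ("xnu", HOST_XNU)] {
        println!("{name}: {:?}", pick_backend(mask));
    }
}
```

The decision happens at runtime, once, on a value the loader filled in, which is what lets a single binary carry both backends.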
The Unix path is the boring one, a BufReader over /etc/resolv.conf that pulls out nameserver lines and parses each address. I could have added BSD-specific paths or macOS's SystemConfiguration framework, but /etc/resolv.conf exists on all of them and that turned out to be enough for dog's needs.
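The Unix side can be sketched in a few lines. This is an illustration of the approach described, not the crate's source; `parse_nameservers` is a hypothetical name, and it takes any reader so it can be exercised without touching the real /etc/resolv.conf.

```rust
use std::io::{BufRead, BufReader, Read};
use std::net::IpAddr;

/// Collect the addresses from `nameserver <addr>` lines, ignoring
/// comments, other directives, and addresses that fail to parse.
fn parse_nameservers<R: Read>(reader: R) -> Vec<IpAddr> {
    BufReader::new(reader)
        .lines()
        .map_while(Result::ok) // stop at the first I/O error
        .filter_map(|line| {
            let mut words = line.split_whitespace();
            match (words.next(), words.next()) {
                (Some("nameserver"), Some(addr)) => addr.parse().ok(),
                _ => None,
            }
        })
        .collect()
}

fn main() {
    let sample = "# generated\nnameserver 192.0.2.53\nsearch example.org\nnameserver 2001:db8::53\n";
    for ns in parse_nameservers(sample.as_bytes()) {
        println!("{ns}");
    }
}
```

On a real host the reader would be `File::open("/etc/resolv.conf")`. Addresses with a zone suffix (such as `fe80::1%eth0`) do not parse as `IpAddr` and are skipped by this sketch.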
The Windows path is where the real work is.
The Windows excursion
This is the part I genuinely enjoyed. Cosmopolitan ships a lot of Windows API bindings in its C headers that I didn't realise were there until I went looking. iphlpapi.h has GetAdaptersAddresses, and struct/ipadapteraddresses.h has the full IP_ADAPTER_ADDRESSES struct plus all the nested types (SOCKET_ADDRESS, IP_ADAPTER_DNS_SERVER_ADDRESS). Meaning: if I can mirror the right piece of that layout in Rust, I can call GetAdaptersAddresses from a cosmo-linked Linux-target binary running on a Windows kernel and get back the real per-adapter DNS server list that Windows uses itself.
The layout I needed to mirror, trimmed to the fields actually read:
[Code listing: the trimmed Rust mirror of IP_ADAPTER_ADDRESSES]
The trick with IP_ADAPTER_ADDRESSES is that it's genuinely enormous (hundreds of bytes of fields I don't care about) but we don't need to know the full shape, only the offsets up to the pointer we're chasing. The OS fills a buffer we hand it, and the tail of each struct beyond first_dns_server_address is opaque bytes as far as we're concerned. That keeps the Rust mirror tiny.
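A sketch of a prefix-only mirror, to show how little needs declaring. The field order follows the documented Win32 declarations; apart from first_dns_server_address, the Rust names are illustrative, and the layout assumes a 64-bit host where pointers are eight bytes.

```rust
use std::mem::{size_of, MaybeUninit};
use std::os::raw::{c_char, c_int, c_void};
use std::ptr::addr_of;

/// Mirror of SOCKET_ADDRESS: a pointer to a sockaddr plus its length.
#[allow(dead_code)]
#[repr(C)]
struct SocketAddress {
    lp_sockaddr: *const c_void,
    i_sockaddr_length: c_int,
}

/// Mirror of one IP_ADAPTER_DNS_SERVER_ADDRESS list node.
#[allow(dead_code)]
#[repr(C)]
struct DnsServerAddress {
    length_and_reserved: u64, // 8-byte union: alignment / { length, reserved }
    next: *const DnsServerAddress,
    address: SocketAddress,
}

/// Leading fields of IP_ADAPTER_ADDRESSES, stopping at the DNS list head.
/// Everything after the last field is left undeclared on purpose.
#[allow(dead_code)]
#[repr(C)]
struct AdapterAddressesPrefix {
    length_and_if_index: u64, // 8-byte union: alignment / { length, if_index }
    next: *const AdapterAddressesPrefix,
    adapter_name: *const c_char,
    first_unicast_address: *const c_void,
    first_anycast_address: *const c_void,
    first_multicast_address: *const c_void,
    first_dns_server_address: *const DnsServerAddress,
}

/// Byte offset of the DNS list head inside the prefix.
fn dns_head_offset() -> usize {
    let probe = MaybeUninit::<AdapterAddressesPrefix>::uninit();
    let base = probe.as_ptr();
    // SAFETY: addr_of! computes a field address without reading the memory.
    let field = unsafe { addr_of!((*base).first_dns_server_address) };
    field as usize - base as usize
}

fn main() {
    println!("DNS list head at byte offset {}", dns_head_offset());
    println!("DNS node size: {} bytes", size_of::<DnsServerAddress>());
}
```

Because the structs are only ever read through pointers into a buffer the OS filled, a truncated declaration is enough as long as every field before the one being chased has the right size and order.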
Calling GetAdaptersAddresses is a size-probing dance: call it with a buffer, if it returns ERROR_BUFFER_OVERFLOW the OS has written the required size into your out-param, so grow and retry. Once it succeeds, you get a linked list of adapters, each with its own linked list of DNS server addresses, each of which holds a SOCKET_ADDRESS that points at a sockaddr_in or sockaddr_in6.
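The grow-and-retry loop can be sketched independently of the Win32 call itself. Here a closure stands in for GetAdaptersAddresses so the example runs on any host; the helper name, the starting size, and the retry limit are assumptions, while ERROR_BUFFER_OVERFLOW is the documented Win32 value 111.

```rust
const ERROR_SUCCESS: u32 = 0;
const ERROR_BUFFER_OVERFLOW: u32 = 111;

/// Call `api` with a buffer, growing it to the size the callee reports
/// whenever it answers ERROR_BUFFER_OVERFLOW. Returns the filled buffer,
/// or the first other error code.
fn call_with_growing_buffer<F>(mut api: F) -> Result<Vec<u8>, u32>
where
    F: FnMut(&mut [u8], &mut u32) -> u32,
{
    let mut size: u32 = 16 * 1024; // starting guess
    for _ in 0..4 {
        let mut buf = vec![0u8; size as usize];
        match api(&mut buf, &mut size) {
            ERROR_SUCCESS => return Ok(buf),
            // `size` now holds the length the callee asked for; retry.
            ERROR_BUFFER_OVERFLOW => continue,
            other => return Err(other),
        }
    }
    Err(ERROR_BUFFER_OVERFLOW)
}

fn main() {
    // Fake API that needs 40_000 bytes and reports so on a short buffer.
    let needed: u32 = 40_000;
    let result = call_with_growing_buffer(|buf, size| {
        if (buf.len() as u32) < needed {
            *size = needed;
            ERROR_BUFFER_OVERFLOW
        } else {
            ERROR_SUCCESS
        }
    });
    println!("buffer length: {:?}", result.map(|b| b.len()));
}
```

The retry limit guards against a list that keeps growing between calls. A `Vec<u8>` carries no alignment guarantee beyond one byte, which is one reason to read the result with unaligned loads.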
The one gotcha that cost me ten minutes was the address family constants. POSIX says AF_INET=2 and AF_INET6=10. Windows agrees on AF_INET=2 but uses AF_INET6=23, because Winsock has its own family numbers and they diverged long before anyone tried to run Linux binaries on Windows. The sockaddr here was filled by Windows, so I match against Winsock's values:
    const AF_INET: u16 = 2;
    const AF_INET6: u16 = 23; // Winsock, not 10

    // sockaddr_in:  [family u16][port u16][addr 4 bytes]...
    // sockaddr_in6: [family u16][port u16][flowinfo u32][addr 16 bytes]...
Read the family from offset 0, branch on it, read the address bytes from the right offset, build an Ipv4Addr or Ipv6Addr. Unaligned reads because there's no guarantee of alignment on a buffer Windows filled. The whole Windows module comes out at around two hundred lines of Rust, most of it comments explaining what the OS is handing us.
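A safe-Rust sketch of that decoding step, assuming the sockaddr bytes have already been copied out of the OS buffer into a slice (the text above reads them in place through a raw pointer instead). The function name is hypothetical; the offsets are the ones given in the layout comments.

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

const AF_INET: u16 = 2;
const AF_INET6: u16 = 23; // Winsock numbering

/// Decode the IP address from a Winsock-filled sockaddr.
/// Returns None for unknown families or truncated input.
fn parse_winsock_sockaddr(bytes: &[u8]) -> Option<IpAddr> {
    // The family field is stored in the host's native byte order.
    let family = u16::from_ne_bytes([*bytes.first()?, *bytes.get(1)?]);
    match family {
        AF_INET => {
            // sockaddr_in: address at bytes 4..8
            let mut a = [0u8; 4];
            a.copy_from_slice(bytes.get(4..8)?);
            Some(IpAddr::V4(Ipv4Addr::from(a)))
        }
        AF_INET6 => {
            // sockaddr_in6: address at bytes 8..24, after flowinfo
            let mut a = [0u8; 16];
            a.copy_from_slice(bytes.get(8..24)?);
            Some(IpAddr::V6(Ipv6Addr::from(a)))
        }
        _ => None,
    }
}

fn main() {
    let mut v4 = vec![0u8; 16];
    v4[..2].copy_from_slice(&AF_INET.to_ne_bytes());
    v4[4..8].copy_from_slice(&[192, 0, 2, 53]);
    println!("{:?}", parse_winsock_sockaddr(&v4));
}
```

Copying into fixed arrays sidesteps the alignment question entirely, at the cost of a few bytes of copying per address.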
Wire it into dog via a one-line cfg(cosmo) branch in resolve.rs that calls cosmo_sysconf::system_nameservers() and takes the first entry. Rebuild.
Third matrix run, and one last finding
Almost there. Windows now got past the resolver config; dog found a real adapter DNS server, constructed a query packet, and tried to send it. Which failed, with win err 87, ERROR_INVALID_PARAMETER.
A bit of tracing narrowed it down. The raw syscall path was fine (libc::sendto with flag 0 sent the packet without complaint), but std's own UdpSocket::send_to wrapper failed. The difference was a flag. On target_os = "linux" std's send wrappers pass MSG_NOSIGNAL = 0x4000 to suppress SIGPIPE on writes to closed connections. Winsock doesn't know about MSG_NOSIGNAL, cosmo passes it through, Windows rejects the whole call.
Which is exactly the pattern post 3 was about: a Linux constant baked at compile time, wrong on another host. Fix is the same: add MSG_NOSIGNAL to the libc-cosmo extern-static gates (cosmo already exposes it as extern const int MSG_NOSIGNAL in libc/sysv/consts/msg.h, 0x4000 on Linux, 0 on Windows), and a one-line unsafe { } wrap in the one std send path that wasn't already inside one.
Rebuild. Re-run:
| Target | Result |
|---|---|
| Linux x86_64 | A github.com → 20.26.156.215 |
| Linux aarch64 | A github.com → 140.82.121.3 |
| macOS arm64 | A github.com → 140.82.121.3 |
| FreeBSD 14 | A github.com → 20.26.156.215 |
| OpenBSD 7.4 | A github.com → 20.26.156.215 |
| Windows Server 2022 | A github.com → 20.26.156.215 |
Six out of six. One dog.com, copied onto every target, each run reading the host's actual DNS configuration (including Windows's real per-adapter servers, no fallbacks) and returning a real answer. The output was the same shape on every host:
[Terminal output listing]
Counting ripgrep from the top of the post, that's two non-trivial real Rust tools crossing the sync-Rust cosmo boundary. The pattern from post 3 did most of the work; the new piece here was teaching cosmo-sysconf to look up Windows DNS at runtime, because some things genuinely can't be normalised via a libc constant and need a real platform-specific implementation.
So what did this port teach
A couple of things worth pulling out, because they're going to matter for future Rust-on-cosmo ports beyond this series.
The first is that the Strategy 3 pattern really does scale to other crates without getting dramatically bigger. getrandom-cosmo is a two-gate fork that took an afternoon. If you've already built the libc fork, you've already done the hard thinking; picking up each new dependency in your graph is the same move, once.
The second is that there's a whole class of problems the extern-static trick can't solve, and they cluster around "this OS uses a fundamentally different API for this thing." DNS config on Windows is the canonical example. You can't turn /etc/resolv.conf into an extern static. You have to call GetAdaptersAddresses, and that requires real runtime dispatch, which is what cosmo-sysconf is. That's a different tier of effort than "add a cfg gate to a libc constant," and it's where the patching surface grows beyond the libc fork.
The third is that cosmo is doing a lot of work behind the scenes that makes these ports feel quieter than they should. The fact that GetAdaptersAddresses is just there in a Linux-target Rust binary because cosmo ships the Win32 bindings in its C headers is, genuinely, wild. I keep forgetting and then being surprised again.
If your tool reads anything from /etc/resolv.conf, /etc/hosts, /etc/nsswitch.conf, or other Unix-only config files that Windows does not have, cosmo-sysconf is the thing to point at. If your tool only does filesystem, processes, and string matching (the ripgrep shape), the forked libc plus the std patches from post 3 will carry you through without any of the runtime-dispatch work I did here.
Next post: xh, the full async HTTP stack. That's where the pattern stops working quite so cleanly, and the series stops promising one-binary-anywhere.