โ—index ๐Ÿฆ€mrwolf-rust.md ๐Ÿท๏ธtags ๐Ÿ‘คabout

🦀 MrWolf: Rust, Macros, and Zero Boilerplate

Second post in the MrWolf series. Previously: I Gave an AI Tools to Run My Homelab.

This is the Rust nerd post 🦀. If you're here for the "what can it do" story, go read the first post. This one is about macros, middleware, and the patterns that keep the codebase manageable by one person on evenings and weekends.

Before you think I'm insane: I didn't hand-write the whole thing from scratch. I wrote most of MrWolf with Claude's help. I designed the architecture, defined the tool interfaces, and did all the QA, testing, and code review. Claude wrote most of the implementation. It's a Rust MCP server that was largely built by the same AI that uses it. There's something beautifully recursive about that 🔄.

The architecture

MrWolf is a single binary that spawns two HTTP servers:

🦀 rust
#[tokio::main]
async fn main() -> Result<()> {
    let settings = Settings::from_env()?;
    let client = http::build_client(15)?;

    // Metrics server on :8080 (port fields on Settings assumed here)
    let metrics_port = settings.metrics_port;
    tokio::spawn(async move {
        metrics::serve_metrics(metrics_port).await
    });

    // MCP server on :8081
    if settings.mcp_enabled {
        let mcp_port = settings.mcp_port;
        let composite = build_composite_server(&settings, &client).await;
        tokio::spawn(async move {
            serve_mcp(composite, mcp_port).await
        });
    }

    std::future::pending::<()>().await; // block forever
    Ok(())
}

That's the whole main. Metrics, MCP, and a pending() that blocks the main task while the spawned servers run. All config comes from environment variables via envy::prefixed("MRWOLF_"), which makes the binary a natural fit for a Kubernetes pod where env vars are the native config mechanism anyway.
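For a sense of how little glue that needs, here is a minimal sketch of the Settings plumbing, assuming illustrative field names rather than MrWolf's actual struct:

🦀 rust
use serde::Deserialize;

// Hypothetical fields; envy maps MRWOLF_SONARR_API_KEY -> sonarr_api_key, and so on.
#[derive(Deserialize)]
struct Settings {
    #[serde(default)]
    mcp_enabled: bool,
    #[serde(default)]
    sonarr_api_key: String,
    #[serde(default)]
    pangolin_token: String,
    // ...
}

impl Settings {
    fn from_env() -> color_eyre::Result<Self> {
        // Reads only env vars with the MRWOLF_ prefix and deserializes them into the struct.
        Ok(envy::prefixed("MRWOLF_").from_env::<Settings>()?)
    }
}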

The interesting bit is how the optional servers auto-enable:

🦀 rust
let media = (!settings.sonarr_api_key.is_empty())
    .then(|| MediaServer::new(client.clone(), media_config));

let pangolin = (!settings.pangolin_token.is_empty())
    .then(|| PangolinServer::new(client.clone(), ...));

let crowdsec = (!settings.crowdsec_api_key.is_empty())
    .then(|| CrowdSecServer::new(client.clone(), ...));

let geoblock = Some(GeoBlockServer::new(client.clone(), ...)); // always on

No feature flags. No MRWOLF_MEDIA_ENABLED=true. If the API key exists, the server starts. If it doesn't, that server is None and its tools don't appear in the MCP tool list. One less thing to configure, one less thing to forget.

The macro that killed boilerplate

Every MCP tool needs the same ceremony: create a tracing span, wrap the async body in instrumentation, catch errors, convert them to user-friendly messages. Without a macro, every tool would have more scaffolding than actual logic.

This is tool_body!:

🦀 rust
macro_rules! tool_body {
    ($name:literal => $body:block) => {{
        let span = ::tracing::info_span!($name);
        let result = ::tracing::Instrument::instrument(
            async {
                let __r: color_eyre::Result<CallToolResult> = { $body };
                __r
            },
            span,
        )
        .await;
        match result {
            Ok(v) => Ok(v),
            Err(err) => {
                ::tracing::warn!(tool = $name, error = %err, "tool call failed");
                Ok(text_result(format!(
                    "Error: {err:#}\n\nThe upstream service may be \
                     restarting or unavailable. Try again in a moment."
                )))
            }
        }
    }};
}

What it does:

  1. Creates a tracing span named after the tool (shows up in structured logs)
  2. Wraps the body in an async block with a typed Result
  3. If the body returns Ok, pass it through
  4. If the body returns Err, log the error and return a friendly message instead of propagating it

That last point is critical. An LLM seeing a Rust stack trace is worse than useless. Every error becomes "Error: CrowdSec API returned HTTP 503. The upstream service may be restarting. Try again in a moment." Claude reads that and either retries or tells the user what happened. No panics, no cryptic errors.

Here's what a tool looks like with the macro:

🦀 rust
#[tool(name = "get_allowed_countries")]
async fn get_allowed_countries(&self) -> Result<CallToolResult, ErrorData> {
    tool_body!("get_allowed_countries" => {
        let data = self.api_get("/api/countries").await?;
        let countries: Vec<&str> = data.as_array()
            .map(|a| a.iter().filter_map(|v| v.as_str()).collect())
            .unwrap_or_default();
        Ok(text_result(format!(
            "Allowed Countries ({}):\n  {}",
            countries.len(),
            countries.join(", ")
        )))
    })
}

The actual logic is a few lines; the macro handles tracing, error conversion, and type annotations. Once you multiply that across every tool in the codebase, the savings add up fast.

There's also a variant with span fields for richer traces:

🦀 rust
tool_body!("update_allowed_countries", [count = p.countries.len(), confirmed = p.confirmed] => {
    // ...
})

Now I get structured log lines like update_allowed_countries count=26 confirmed=true without any manual span construction 💗.

The composite server

All the sub-servers have to merge into a single MCP endpoint. The composite server builds a HashMap<String, SubServer> at startup:

🦀 rust
#[derive(Clone, Copy)]
enum SubServer {
    Prometheus,
    Alertmanager,
    Loki,
    Kubernetes,
    Gotify,
    Media,
    Pangolin,
    CrowdSec,
    AdGuard,
    ArgoCD,
    GeoBlock,
}

struct MrWolfServer {
    prometheus: PrometheusServer,
    alertmanager: AlertmanagerServer,
    // ... 9 more (some Option<T>)
    dispatch: HashMap<String, SubServer>,
}

At construction, it iterates every server's tool list and builds the dispatch table:

🦀 rust
let mut dispatch = HashMap::new();
for tool in prometheus.tool_router.list_all() {
    dispatch.insert(tool.name.to_string(), SubServer::Prometheus);
}
// ... repeat for each server
if let Some(ref media) = media {
    for tool in media.tool_router.list_all() {
        dispatch.insert(tool.name.to_string(), SubServer::Media);
    }
}

Dispatch at call time is just a HashMap lookup and a match, so it's O(1):

🦀 rust
async fn call_tool(&self, request: CallToolRequestParams, context: RequestContext<RoleServer>)
    -> Result<CallToolResult, ErrorData>
{
    let tool_name = request.name.to_string();
    let start = Instant::now();

    let result = match self.dispatch.get(request.name.as_ref()) {
        Some(SubServer::Prometheus) => self.prometheus.call_tool(request, context).await,
        Some(SubServer::Alertmanager) => self.alertmanager.call_tool(request, context).await,
        // ... all 11 arms
        None => Err(ErrorData { message: format!("Unknown tool: {}", request.name).into(), .. }),
    };

    // Metrics for every tool call: automatic, no per-tool code needed
    counter!("mrwolf_tool_calls_total", "tool" => tool_name.clone(), "status" => status).increment(1);
    histogram!("mrwolf_tool_duration_seconds", "tool" => tool_name.clone()).record(duration);
    histogram!("mrwolf_tool_response_size_bytes", "tool" => tool_name).record(size as f64);

    result
}

Every tool call gets three Prometheus metrics automatically: call count, duration, and response size. I have a Grafana dashboard that shows which tools Claude uses most, how long they take, and how large the responses are. It's fascinating to watch 📊.

The HTTP middleware sandwich 🥪

MrWolf talks to ~15 upstream services. Every request needs retries with backoff and metrics:

🦀 rust
pub(crate) fn build_client(timeout_secs: u64) -> Result<ClientWithMiddleware> {
    let retry_policy = ExponentialBackoff::builder().build_with_max_retries(3);

    let client = reqwest_middleware::ClientBuilder::new(
        reqwest::Client::builder()
            .timeout(Duration::from_secs(timeout_secs))
            .build()?,
    )
    .with(HttpMetricsMiddleware::outer())           // 1. Track logical requests
    .with(RetryTransientMiddleware::new_with_policy(retry_policy))  // 2. Retry on 5xx/timeout
    .with(HttpMetricsMiddleware::inner())            // 3. Track individual attempts
    .build();

    Ok(client)
}

The same metrics middleware is used twice with different configs:

  • Outer (before retry): sees one event per logical request. Records final status, duration, and sets the mrwolf_upstream_up gauge. This is what I alert on.
  • Inner (after retry): sees every attempt including retries. Records mrwolf_http_attempts_total. This tells me if a service is flaky (lots of retries) vs. down (outer failures).

The middleware extracts the service name from the URL hostname automatically:

🦀 rust
fn service_from_host(host: &str) -> &str {
    match host.split('.').next().unwrap_or(host) {
        "prometheus-kube-prometheus-prometheus" => "prometheus",
        "prometheus-kube-prometheus-alertmanager" => "alertmanager",
        "loki-gateway" => "loki",
        other => other, // sonarr, radarr, jellyfin, gotify...
    }
}

So every HTTP request gets labeled with the service it's talking to. Zero manual instrumentation in the tool code.

Dealing with MCP client quirks

MCP clients sometimes send numbers as strings: {"limit": "10"} instead of {"limit": 10}. Strict serde deserialization rejects those calls outright, so I have lenient deserializers:

🦀 rust
pub(crate) mod serde_usize_lenient {
    use serde::{de, Deserialize, Deserializer};

    pub(crate) fn deserialize<'de, D: Deserializer<'de>>(
        deserializer: D,
    ) -> Result<Option<usize>, D::Error> {
        #[derive(Deserialize)]
        #[serde(untagged)]
        enum StringOrNum {
            Num(usize),
            Str(String),
        }
        Option::<StringOrNum>::deserialize(deserializer)?
            .map(|v| match v {
                StringOrNum::Num(n) => Ok(n),
                StringOrNum::Str(s) => s.parse::<usize>().map_err(de::Error::custom),
            })
            .transpose()
    }
}

Small thing, but without it half the tools would randomly fail. Defensive coding for AI clients 🤷.
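For illustration, wiring it into a params struct looks something like this (the struct and its fields are hypothetical):

🦀 rust
#[derive(Debug, serde::Deserialize)]
struct QueryLogsParams {
    query: String,
    /// Accepts both 10 and "10" from the client.
    #[serde(default, deserialize_with = "serde_usize_lenient::deserialize")]
    limit: Option<usize>,
}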

The ServiceClient abstraction

The media stack talks to a handful of services with slightly different APIs: different auth headers, different URL prefixes. Instead of duplicating the HTTP scaffolding for each one:

🦀 rust
#[derive(Clone)]
pub(super) struct ServiceClient {
    pub(super) client: ClientWithMiddleware,
    pub(super) base_url: String,
    pub(super) api_prefix: &'static str,   // "/api/v3", "/api/v1", etc.
    pub(super) auth_header: &'static str,  // "X-API-Key", "Authorization", etc.
    pub(super) api_key: String,
    pub(super) name: &'static str,         // for tracing spans
}

impl ServiceClient {
    #[instrument(skip(self), fields(service = self.name))]
    pub(super) async fn get(&self, path: &str) -> color_eyre::Result<serde_json::Value> {
        let url = format!("{}{}/{path}", self.base_url, self.api_prefix);
        self.client
            .get(&url)
            .header(self.auth_header, &self.api_key)
            .send().await?
            .error_for_status()?
            .json().await
            .wrap_err_with(|| format!("{} request failed", self.name))
    }
}

All the media-stack tools share one HTTP client struct, so every call picks up tracing, retries, and metrics without any per-tool plumbing.
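Constructing one per service is then just a handful of constants. A hypothetical example for Sonarr (the settings field names are my assumption):

🦀 rust
// Radarr, Jellyfin, etc. differ only in the constants.
let sonarr = ServiceClient {
    client: client.clone(),
    base_url: settings.sonarr_url.clone(),
    api_prefix: "/api/v3",
    auth_header: "X-API-Key",
    api_key: settings.sonarr_api_key.clone(),
    name: "sonarr",
};

let queue = sonarr.get("queue").await?;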

JWT caching with Arc<RwLock<>>

CrowdSec uses machine-login authentication: POST credentials, get a JWT, use it until it expires. Here's how MrWolf handles it:

🦀 rust
#[derive(Clone)]
pub(crate) struct CrowdSecServer {
    jwt_token: Arc<RwLock<Option<String>>>,  // shared across clones
    // ...
}

async fn get_jwt(&self) -> color_eyre::Result<String> {
    // Try read lock first (fast path)
    {
        let cached = self.jwt_token.read().await;
        if let Some(ref token) = *cached {
            return Ok(token.clone());
        }
    } // drop read lock before acquiring write lock

    // Slow path: login and cache
    let token = self.machine_login().await?;
    *self.jwt_token.write().await = Some(token.clone());
    Ok(token)
}

And retry-on-401:

🦀 rust
async fn machine_get(&self, path: &str) -> color_eyre::Result<serde_json::Value> {
    let url = format!("{}{path}", self.base_url); // base_url field assumed
    for _attempt in 0..2 {
        let token = self.get_jwt().await?;
        let resp = self.client.get(&url).bearer_auth(&token).send().await?;
        if resp.status() != 401 {
            return Ok(resp.json().await?);
        }
        self.invalidate_jwt().await; // force re-login on retry
    }
    bail!("CrowdSec auth failed after 2 attempts")
}

Drop read lock before write lock. Invalidate on 401 and retry. The Arc<RwLock<>> makes it safe to clone the server struct (the MCP handler needs Clone) while sharing the token cache.
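invalidate_jwt isn't shown above, but given that structure it's presumably a one-liner along these lines:

🦀 rust
async fn invalidate_jwt(&self) {
    // Drop the cached token so the next get_jwt() takes the login slow path.
    *self.jwt_token.write().await = None;
}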

Pre-built PromQL queries

The Prometheus server has ~30 queries as constants, a curated catalog that Claude can browse:

🦀 rust
mod queries {
    pub(super) const CPU_BY_NODE: &str =
        r#"100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)"#;

    pub(super) const MEMORY_BY_NODE: &str =
        "(1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100";

    pub(super) const POD_RESTARTS_1H: &str =
        "sum by(namespace, pod) (increase(kube_pod_container_status_restarts_total[1h])) > 0";

    pub(super) const RAID_DISKS: &str =
        r#"node_md_disks{instance="corellia",device="md0"}"#;

    // Dynamic queries that accept parameters
    pub(super) fn network_receive_rate(period: &str) -> String {
        format!(r#"rate(node_network_receive_bytes_total{{device!~"lo|veth.*|cali.*"}}[{period}]) * 8"#)
    }
}

Claude doesn't need to know PromQL. It calls get_cluster_health and MrWolf fires the right queries. But if Claude wants a custom query, query_prometheus accepts raw PromQL too. Best of both worlds.
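A curated tool on top of that catalog might look roughly like this; a sketch, with instant_query standing in for whatever the real query helper is called:

🦀 rust
#[tool(name = "get_cluster_health")]
async fn get_cluster_health(&self) -> Result<CallToolResult, ErrorData> {
    tool_body!("get_cluster_health" => {
        // instant_query is a hypothetical helper that runs one PromQL query.
        let cpu = self.instant_query(queries::CPU_BY_NODE).await?;
        let mem = self.instant_query(queries::MEMORY_BY_NODE).await?;
        let restarts = self.instant_query(queries::POD_RESTARTS_1H).await?;
        Ok(text_result(format!(
            "CPU by node:\n{cpu}\n\nMemory by node:\n{mem}\n\nPod restarts (1h):\n{restarts}"
        )))
    })
}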

Node names across 3 subnets

My nodes are reachable over LAN, Tailscale, and WiFi, and Prometheus's instance labels carry whichever address it happened to scrape. This function turns any of them into a Star Wars planet name:

🦀 rust
pub(crate) fn resolve_node_name(instance: &str) -> &str {
    let host = instance.split(':').next().unwrap_or(instance);
    match host {
        "10.0.1.252" | "100.64.0.1" | "192.168.1.100" | "corellia" => "corellia",
        "10.0.1.251" | "100.64.0.3" | "192.168.1.102" | "tatooine" => "tatooine",
        "10.0.1.253" | "100.64.0.13" | "192.168.1.101" | "mandalore" => "mandalore",
        "100.64.0.6" | "192.168.1.103" | "scarif" => "scarif",
        "10.0.1.249" | "100.64.0.8" | "kamino" => "kamino",
        "10.0.1.248" | "100.64.0.9" | "jakku" => "jakku",
        _ => instance,
    }
}

Beautiful? No. Works perfectly? Yes. Sometimes the right code is the boring code.

Confirmation gates for destructive ops

Every write tool follows the same pattern: preview first, then execute:

🦀 rust
#[derive(Debug, Deserialize, JsonSchema)]
struct UpdateAllowedCountriesParams {
    countries: Vec<String>,
    /// Must be true to execute. If false, shows a preview.
    confirmed: bool,
}

tool_body!("update_allowed_countries", [count = p.countries.len(), confirmed = p.confirmed] => {
    if !p.confirmed {
        return Ok(text_result(format!(
            "CONFIRMATION REQUIRED: Replace allowed countries with {} entries: [{}]. \
             All other countries will be BLOCKED. \
             Call again with confirmed=true to execute.",
            p.countries.len(), p.countries.join(", ")
        )));
    }
    self.api_put("/api/countries", serde_json::json!(p.countries)).await?;
    Ok(text_result(format!("Updated: {} countries allowed.", p.countries.len())))
})

Simple pattern, but it's the difference between "useful tool" and "unsupervised chaos" 😅.

Testing with WireMock

Every server has tests that mock the upstream APIs. No real services needed:

🦀 rust
#[tokio::test]
async fn tool_get_allowed_countries() {
    let mock = MockServer::start().await;
    Mock::given(method("GET"))
        .and(path("/api/countries"))
        .respond_with(
            ResponseTemplate::new(200)
                .set_body_json(serde_json::json!(["CN", "RU", "KP"]))
        )
        .mount(&mock)
        .await;

    let server = GeoBlockServer::new(build_client(5).unwrap(), mock.uri());
    let result = server.get_allowed_countries().await.unwrap();
    let text: &str = result.content[0].raw.as_text().unwrap().text.as_ref();
    assert!(text.contains("CN"));
    assert!(text.contains("3"));
}

Fast, deterministic, runs in CI. cargo nextest run finishes the whole suite in seconds.
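The error path is testable the same way. A sketch that stubs a 503 and asserts on the friendly message from tool_body! (the retry middleware means this one takes a few seconds to give up):

🦀 rust
#[tokio::test]
async fn tool_reports_friendly_error_when_upstream_is_down() {
    let mock = MockServer::start().await;
    Mock::given(method("GET"))
        .and(path("/api/countries"))
        .respond_with(ResponseTemplate::new(503))
        .mount(&mock)
        .await;

    let server = GeoBlockServer::new(build_client(5).unwrap(), mock.uri());
    // The call still returns Ok: the macro converts the failure into text for the LLM.
    let result = server.get_allowed_countries().await.unwrap();
    let text: &str = result.content[0].raw.as_text().unwrap().text.as_ref();
    assert!(text.contains("Try again in a moment"));
}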

The dependency stack

⚙ toml
[dependencies]
rmcp = { version = "0.16", features = ["server", "macros", "transport-streamable-http-server"] }
kube = { version = "3.0", features = ["runtime", "client"] }
reqwest = { version = "0.13", features = ["json", "rustls"] }
reqwest-middleware = "0.5"
reqwest-retry = "0.9"
tokio = { version = "1", features = ["full"] }
axum = "0.8"
metrics = "0.24"
metrics-exporter-prometheus = "0.18"
serde = { version = "1", features = ["derive"] }
schemars = "1.2"
color-eyre = "0.6"
envy = "0.4"
tracing = "0.1"

[dev-dependencies]
wiremock = "0.6"
pretty_assertions = "1"

rmcp does the MCP protocol. kube does the Kubernetes API. Everything else is the standard Rust ecosystem. No frameworks, no code generation.

โฏ_shโ€บ6 lines
  1โฏโฏโฏ tokei src/
  2โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”
  3 Language        Files     Lines      Code   Comments     Blanks
  4โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”
  5 Rust               28     14026     11963        456       1607
  6โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”

All of it fairly straightforward async Rust. A lot written by Claude, all of it reviewed by me 🦀.

Done

MrWolf isn't clever 🧰. It's a pile of small, tested functions that each do one thing, wrapped in macros that handle the plumbing. The Rust compiler does the rest: if it compiles, the dispatch table is correct, the error handling is complete, and the metrics are wired up.

That's how Mr. Wolf solves problems. Not with cleverness, but with a system 🐺.

Next up: what I've learned about giving an AI the keys to infrastructure, what works, what breaks, and what's next.

Share / comment on Mastodon →