Why is ingress working but tools fail?

Inbound webhooks only prove that nginx and the listeners work; outbound calls need DNS, SNI, and unfiltered egress to different ASNs. Trace both paths.

Do I need an HTTP/HTTPS proxy?

Leased data-center Macs usually have open egress; corporate proxies are rare. If you do set one, pass the proxy environment variables in launchd and in Node, not only in the shell that ran npm once.


2026-04-24 OpenClaw Egress Path: DNS, TLS SNI & Upstream Resilience on Headless Leased Cloud Mac

MacXCode Engineering Team April 24, 2026 ~18 min read

Most runbooks for OpenClaw on a leased Apple Silicon Mac mini or equivalent stop once nginx terminates TLS for inbound webhooks. That covers only half of the system: the gateway, tool invocations, and LLM provider HTTP clients all egress to a different set of anycast fronts, each with its own DNS A/AAAA sets, TLS 1.2+ SNI requirements, and HTTP/2 settings. This 2026-04-24 guide is intentionally not a repeat of the reverse-proxy deep dive. It describes the outbound contract (DNS freshness, certificate validation, clock discipline) and how to correlate failures in structured logs when a host in one region (HK / JP / KR / SG / US) suddenly cannot reach an API while the web UI of the same product still works from a laptop. Pair it with launchd-stored API keys so you never confuse a dead secret with a dead network path to the host that mints the OAuth token.

Egress Is Not a Mirror of Ingress

Successful inbound webhook delivery proves three things: (a) a listener is bound, (b) the client trusts your certificate, and (c) the TCP path from the provider to your edge works. Outbound calls prove different facts: which trust store the Node (or other runtime) gateway process uses, whether DNS resolves on the same host (not through your laptop’s resolver), and whether the default route prefers IPv4 or IPv6 when a provider’s DNS advertises both. A green readiness status that only tests 127.0.0.1:18789 does not validate the outbound stack. Extend probes with a synthetic GET to a known static endpoint, or with an authorization server metadata fetch in the style of RFC 8414 if you operate multi-tenant OAuth flows, and log RTT p95 in the same line format as the rest of your JSON logs so the dashboard does not need new parsers.
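A synthetic outbound probe of this kind can be sketched in a few lines of Python. This is a minimal illustration, not OpenClaw's own readiness check: the field names (`event`, `duration_ms`, `err`) and the injectable `fetch` parameter are assumptions chosen so the record can share a line format with other JSON logs.

```python
import json
import time
import urllib.error
import urllib.request
from typing import Callable


def http_get_status(url: str, timeout_s: float = 5.0) -> int:
    """Perform one GET and return the HTTP status code."""
    with urllib.request.urlopen(url, timeout=timeout_s) as resp:
        return resp.status


def probe_egress(url: str, fetch: Callable[[str], int] = http_get_status) -> dict:
    """Time one outbound request and return a flat, JSON-serializable record."""
    record = {"event": "egress_probe", "url": url, "status": None, "err": None}
    started = time.monotonic()
    try:
        record["status"] = fetch(url)
    except urllib.error.HTTPError as exc:
        # The remote edge answered, so DNS, TCP, and TLS all worked.
        record["status"] = exc.code
    except Exception as exc:
        # DNS failures, TLS failures, and timeouts all land here.
        record["err"] = type(exc).__name__
    record["duration_ms"] = round((time.monotonic() - started) * 1000, 1)
    record["ok"] = record["status"] is not None and 200 <= record["status"] < 400
    return record


def to_log_line(record: dict) -> str:
    """Serialize the record as a single compact JSON line."""
    return json.dumps(record, separators=(",", ":"))
```

Calling `to_log_line(probe_egress(url))` from the same launchd job that runs the gateway, against an endpoint you choose, yields one line per probe; the p95 can then be computed downstream from `duration_ms`.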

DNS Resolution, TTL, and “Stuck” Resolvers on macOS

Large API and inference networks shift edge IPs on short TTLs. On long-lived, SSH-only production hosts, watch for two patterns: (1) a burst of ENOTFOUND after the host returns from maintenance while scutil --dns still lists a split-horizon resolver or search domain left over from a test week; (2) regional GeoDNS that sends your Singapore box to a different anycast front than your US East canary, which creates asymmetric latency without packet loss. Add a scripted dscacheutil -q host -a name api.target.example plus host -a api.target.example to your post-deploy checklist as a baseline, and record the response time of each configured resolver in your ticket template so that “DNS is fine in the browser” (which may use a different resolver or VPN) cannot derail a midnight bridge call.
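The baseline idea can also be kept as data rather than as terminal output. The sketch below resolves a name through the operating system resolver and diffs the A/AAAA sets against a stored baseline. The function names and the baseline shape are illustrative assumptions; with short TTLs and anycast, some churn between runs is expected, so treat the diff as ticket evidence rather than as an alarm by itself.

```python
import socket
from typing import Callable, Dict, Set


def resolve_sets(host: str, getaddrinfo: Callable = socket.getaddrinfo) -> Dict[str, Set[str]]:
    """Resolve host on this machine and split the answers into A and AAAA sets."""
    result: Dict[str, Set[str]] = {"A": set(), "AAAA": set()}
    for family, _type, _proto, _canon, sockaddr in getaddrinfo(
        host, 443, type=socket.SOCK_STREAM
    ):
        if family == socket.AF_INET:
            result["A"].add(sockaddr[0])
        elif family == socket.AF_INET6:
            result["AAAA"].add(sockaddr[0])
    return result


def diff_baseline(baseline: Dict[str, Set[str]], current: Dict[str, Set[str]]) -> dict:
    """Report which addresses appeared or disappeared for each record type."""
    return {
        rtype: {
            "added": sorted(current[rtype] - baseline[rtype]),
            "removed": sorted(baseline[rtype] - current[rtype]),
        }
        for rtype in ("A", "AAAA")
    }
```

A name that cannot be resolved raises `socket.gaierror`, which is the Python-side counterpart of the ENOTFOUND burst described above.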

Corporate or Mesh Split Paths

Some teams mesh a gateway to an internal API via Tailscale while public chat notifications still go direct. Document which hostnames are 100.x only and which are public—mixing the two in one OpenClaw profile without a policy block causes intermittent 403/timeout patterns that generic gateway troubleshooting alone will not disambiguate, because the failure is routing, not queue depth. When in doubt, capture a single curl -v from the same launchd EnvironmentVariables the daemon inherits, not an interactive ssh -t shell (which can pick up different proxy variables).
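One way to document which hostnames are mesh-only is to classify their resolved addresses. The sketch below assumes the mesh hands out IPv4 addresses from the 100.64.0.0/10 carrier-grade NAT range (the "100.x" addresses mentioned above); the labels `mesh`, `internal`, and `public` and both function names are illustrative, and mesh IPv6 addressing is deliberately left out.

```python
import ipaddress
from typing import Dict, List

# Assumption: mesh peers receive IPv4 addresses from the CGNAT block.
MESH_V4 = ipaddress.ip_network("100.64.0.0/10")


def classify_address(addr: str) -> str:
    """Label one IP address as mesh, internal, or public."""
    ip = ipaddress.ip_address(addr)
    if ip.version == 4 and ip in MESH_V4:
        return "mesh"
    if ip.is_private or ip.is_loopback or ip.is_link_local:
        return "internal"
    return "public"


def mixed_hosts(resolved: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return only the hostnames whose addresses span more than one path class."""
    mixed = {}
    for host, addrs in resolved.items():
        classes = sorted({classify_address(a) for a in addrs})
        if len(classes) > 1:
            mixed[host] = classes
    return mixed
```

Feeding it a map of hostname to resolved addresses, collected on the gateway host itself, highlights any name that would take a mesh path on one attempt and a public path on the next.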

TLS, SNI, and Certificate Chain Validation

Every HTTPS client in the tool stack must send SNI for the correct hostname; legacy scripts that target raw IPs will fail in front of CDN edges. The trust store the Node runtime uses on macOS is separate from the system keychain your browser uses during VNC: when you add a custom internal CA for a lab, install it in the trust bundle or environment that the daemon reads, following the env & secrets on launchd runbook, not through a one-time export in a developer shell. Clock skew beyond a few minutes breaks certificate validity checks and signed JWT windows, so sntp -sS time.apple.com remains an on-box invariant that pairs with the NTP guidance you already use for inbound HMAC time buckets.
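The SNI and clock points can be checked together from the host. The following sketch performs a handshake with SNI set to the target hostname, then tests whether the local clock sits safely inside the certificate's validity window. It validates against Python's trust configuration, which may differ from what the Node daemon reads, so a pass here confirms the network path and the chain but not the daemon's trust bundle. The function names and the 300-second margin are assumptions.

```python
import socket
import ssl
from typing import Optional


def fetch_peer_cert(host: str, port: int = 443, timeout_s: float = 5.0) -> dict:
    """Handshake with SNI set to host and return the validated peer certificate."""
    ctx = ssl.create_default_context()  # verifies both the chain and the hostname
    with socket.create_connection((host, port), timeout=timeout_s) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


def cert_window_problem(cert: dict, now: float, skew_s: float = 300.0) -> Optional[str]:
    """Return None if now is inside the validity window with margin, else a reason."""
    not_before = ssl.cert_time_to_seconds(cert["notBefore"])
    not_after = ssl.cert_time_to_seconds(cert["notAfter"])
    if now < not_before + skew_s:
        return "clock_before_or_near_notBefore"
    if now > not_after - skew_s:
        return "clock_after_or_near_notAfter"
    return None
```

A typical call is `cert_window_problem(fetch_peer_cert("api.target.example"), time.time())`. If the handshake fails with `ssl.SSLCertVerificationError`, the problem is in the chain or hostname match rather than in the clock.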

IPv4 vs IPv6, Happy Eyeballs, and Explicit Proxies

When a provider’s DNS returns both AAAA and A records, the client library’s connection strategy (often called “Happy Eyeballs”) can choose a path that a datacenter firewall or provider-side allow-list has not been updated to accept. A pragmatic split for HK / JP / KR / SG / US fleets: document which regions keep IPv6 disabled at the edge, add curl -4 vs -6 to triage, and, if a corporate HTTP proxy is mandatory, set HTTPS_PROXY in the LaunchAgent you already version alongside API key env, and verify the proxy allows WebSocket upgrades if a bridge feature depends on it. Docker vs native npm also changes how proxy variables propagate: bridge-mode containers may ignore the host launchd file entirely unless you plumb a compose-level env block that mirrors the production plist.
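Proxy drift between the daemon and an interactive shell is easy to check mechanically. The sketch below reads the `EnvironmentVariables` dictionary from a LaunchAgent plist and compares the proxy keys with another environment, such as `os.environ` from an SSH session or the values from a compose file. The list of keys and the function names are assumptions; extend them to whatever your plist actually sets.

```python
import plistlib
from typing import Dict, Mapping, Optional, Tuple

PROXY_KEYS = ("HTTPS_PROXY", "HTTP_PROXY", "NO_PROXY")


def launchd_proxy_env(plist_bytes: bytes) -> Dict[str, str]:
    """Extract proxy variables from a launchd job definition."""
    job = plistlib.loads(plist_bytes)
    env = job.get("EnvironmentVariables", {})
    return {key: env[key] for key in PROXY_KEYS if key in env}


def proxy_drift(
    plist_bytes: bytes, other_env: Mapping[str, str]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return {key: (launchd_value, other_value)} for each proxy key that differs."""
    daemon = launchd_proxy_env(plist_bytes)
    drift = {}
    for key in PROXY_KEYS:
        if daemon.get(key) != other_env.get(key):
            drift[key] = (daemon.get(key), other_env.get(key))
    return drift
```

An empty result means the two environments agree on proxy settings, so a curl run in that shell is a fair stand-in for the daemon as far as proxying goes.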

Symptom / Layer Triage (Outbound)

Symptom	Layer	Stabilize
TLS alert unknown CA, curl exit 60	Chain / trust / MITM	Align system trust, avoid IP-based TLS; review proxy MITM on port 443
getaddrinfo ENOTFOUND (intermittent)	DNS / search domain	Check search domains in scutil, flush the cache, compare again with the laptop
HTTP 403 from provider edge only on SG node	Geo / WAF + egress IP	Map leased egress to allow-list, consider regional second host
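The table can be encoded as a small lookup so that log processing applies the same triage every time. This is a sketch under assumptions: the Node TLS error codes listed are common certificate verification failures chosen for illustration, curl exit codes 6 and 60 are used for DNS and certificate failures, and the function name is hypothetical.

```python
from typing import Optional

TLS_TRUST_CODES = {
    "UNABLE_TO_VERIFY_LEAF_SIGNATURE",
    "UNABLE_TO_GET_ISSUER_CERT_LOCALLY",
    "SELF_SIGNED_CERT_IN_CHAIN",
    "CERT_HAS_EXPIRED",
}
DNS_CODES = {"ENOTFOUND", "EAI_AGAIN"}


def triage_layer(
    err_code: Optional[str] = None,
    http_status: Optional[int] = None,
    curl_exit: Optional[int] = None,
) -> str:
    """Map one outbound failure to the layer named in the triage table."""
    if err_code in TLS_TRUST_CODES or curl_exit == 60:
        return "Chain / trust / MITM"
    if err_code in DNS_CODES or curl_exit == 6:
        return "DNS / search domain"
    if http_status == 403:
        return "Geo / WAF + egress IP"
    return "unclassified"
```

The function cannot see regions, so the "only on SG node" part of the 403 row still requires grouping the classified events by host or lease label.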

Resilience SLOs: for multi-region teams, set an alert when the p95 outbound TLS handshake time to a reference endpoint exceeds 250 ms for 5 minutes, or when 5xx responses from the provider exceed 1% of attempts after a Node or OS update. Treat either as a networking incident, not a model-quality incident.
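The two thresholds translate directly into a check over one window of samples. The sketch below uses a nearest-rank p95 and assumes the caller passes the handshake timings and counters collected over the last 5 minutes; the function names and reason strings are illustrative.

```python
from typing import List, Sequence


def p95(samples: Sequence[float]) -> float:
    """Nearest-rank 95th percentile of a non-empty sample set."""
    if not samples:
        raise ValueError("p95 needs at least one sample")
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


def slo_breaches(
    handshake_ms: Sequence[float],
    attempts: int,
    errors_5xx: int,
    p95_limit_ms: float = 250.0,
    error_ratio_limit: float = 0.01,
) -> List[str]:
    """Return the names of the SLO conditions breached in this window."""
    reasons = []
    if handshake_ms and p95(handshake_ms) > p95_limit_ms:
        reasons.append("tls_handshake_p95")
    if attempts > 0 and errors_5xx / attempts > error_ratio_limit:
        reasons.append("provider_5xx_ratio")
    return reasons
```

For example, a window with 1000 attempts and 20 provider 5xx responses breaches the 1% ratio even if every handshake completed in under 250 ms.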

Correlating with Structured Gateway Logs

Adopt a single request or trace identifier in OpenClaw logs and forward it in outbound HTTP headers where a provider’s API supports idempotency keys or trace headers. If you can’t inject headers, log date, duration_ms, and err.code in one JSON line so log shipping can pivot from “mysterious 500” to a DNS spike without opening VNC. VNC remains a break-glass path for Certificate Assistant or interactive trust store fixes you refuse to script. After any npm upgrade and double gateway restart, re-run a minimal outbound curl -sS -o /dev/null -w "%{time_connect} %{time_appconnect}\n" suite (TCP connect time, then time to TLS handshake completion) against the same list you used in the pre-upgrade runbook, so a regression is visible in numbers, not in vibes.
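The correlation pattern amounts to generating one identifier, attaching it to the outbound request, and writing it into the log line. The sketch below illustrates that shape; the `X-Request-Id` header name is an assumption (use whichever trace or idempotency header your provider documents), and the helper names are hypothetical rather than OpenClaw APIs.

```python
import json
import uuid
from datetime import datetime, timezone
from typing import Dict, Optional


def new_trace_id() -> str:
    """Generate a 32-character hexadecimal identifier for one outbound call."""
    return uuid.uuid4().hex


def outbound_headers(trace_id: str, base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Copy the base headers and attach the trace identifier."""
    headers = dict(base or {})
    headers["X-Request-Id"] = trace_id  # assumed header name
    return headers


def log_line(trace_id: str, host: str, duration_ms: float, err_code: Optional[str] = None) -> str:
    """Emit date, duration_ms, and err.code together in one JSON line."""
    return json.dumps(
        {
            "date": datetime.now(timezone.utc).isoformat(),
            "trace_id": trace_id,
            "host": host,
            "duration_ms": duration_ms,
            "err": {"code": err_code},
        },
        separators=(",", ":"),
    )
```

Because the same `trace_id` appears in the request header and in the log line, a provider-side support ticket and a local DNS spike can be joined on one key.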

Related Runbooks

Compare with the launchd + cron runbook for time-based synthetic egress checks, the subagent runbook for channel issues that are not really networking, and the onboard + daemon runbook when you are unsure whether a bad PATH is what breaks curl under the service account. After a messy provider or TLS incident, the OpenClaw doctor, model allowlist & post-upgrade triage (2026-04-25) runbook is the right place to normalize openclaw doctor output, gateway PID, and launchd env parity on the same headless node. If charts show 429/503 on the outbound LLM path, switch to the LLM API rate limits, retry budgets & 429/503 triage (2026-04-27) runbook so you do not over-tune nginx while the model vendor throttles you.

FAQ: Outbound Connectivity in Production

Question	Practical answer
Is ping enough?	ICMP may pass while TCP 443 to the same anycast is blocked; prefer TLS-layer checks.
Do I need an outbound firewall rule dump?	On leased data-center Macs, rare—but keep a port allow-list doc in your ticket template.
When to add a second node?	When GeoDNS and egress IP policy force one region to carry more risk; scale out through the pricing plans.

Why Mac mini M4 Still Fits Egress-Heavy OpenClaw

Outbound-heavy automation rewards low-jitter clocks, predictable TCP stacks, and enough NVMe to retain verbose JSON logs for forensics. The same Mac mini M4 fleet that powers your CI hosts can also host 24/7 SSH-first HK · JP · KR · SG · US gateways with 1–2 TB for retained structured logs, without a hypervisor lying between the NIC and the kernel. If a region needs its own egress reputation, colocate a dedicated node via MacXCode plans and teach your dashboard to split latency charts by lease label—a cheap guardrail for 2026 multi-provider stacks.

