openclaw doctor, Duplicate LaunchAgents, and Gateway Recovery on a Leased Cloud Mac (2026)
In 2026, upstream OpenClaw installs on macOS may register more than one launchd job with similar labels, especially when teams mix the desktop app path with the CLI gateway install. On a 24/7 leased Apple Silicon cloud Mac in HK / JP / KR / SG / US, that shows up as flaky dashboards, “wrong” configs after an edit, or port 18789 owned by a stale binary. openclaw doctor is the supported first pass to surface drift; this article turns doctor output into a repeatable reconciliation runbook. Use it after upgrades (see upgrade & rollback), alongside gateway troubleshooting, and together with env & API keys in launchd.
Symptoms That Mean “Duplicate Services”
- openclaw gateway status disagrees with what launchctl list shows.
- Editing ~/.openclaw/openclaw.json does not change runtime behavior.
- Two plist files under ~/Library/LaunchAgents/ reference OpenClaw with different ProgramArguments.
Avoid keeping ~/.openclaw on iCloud Drive or Dropbox: file locking breaks agent state. Keep it local on the leased NVMe.
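A quick way to test for the first symptom is to count the OpenClaw labels that launchd currently knows about. The sketch below is a minimal example, not an official tool: the function names are hypothetical, and it assumes every relevant label contains the string "openclaw" (as ai.openclaw.gateway does).

```shell
# Hypothetical helper: print launchd labels that mention "openclaw".
# Reads `launchctl list` output (columns: PID, Status, Label) on stdin.
openclaw_labels() {
  awk 'tolower($3) ~ /openclaw/ { print $3 }'
}

# Prints how many such labels are loaded.
# More than one is a hint that duplicates exist, not proof.
count_openclaw_labels() {
  openclaw_labels | wc -l | tr -d ' '
}
```

On the host, run launchctl list | count_openclaw_labels. Any result above 1 is a reason to inspect the plists before trusting gateway status output.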
Root Causes of Duplicate LaunchAgents
Duplicates rarely appear “randomly” — they trace to installation path changes over time. Typical sequences we see on long-lived leased Macs:
- Upgrade without uninstall: npm -g bump while an older app bundle still registers a gateway plist.
- Multi-user experiments: one engineer bootstraps under admin, another under builder; both leave LaunchAgents behind.
- Automation replay: Ansible / shell provisioner appends a new plist on every run instead of declaring idempotent state.
- Manual copy-paste: a “fix” blog post suggests dropping a plist into LaunchAgents without unloading the prior label.
| Signal | Likely duplicate pattern | Risk if ignored |
|---|---|---|
| Port 18789 flaps between PIDs | Two gateway binaries racing on boot | Broken webhooks; half-written config |
| CPU idle but launchctl list shows two OpenClaw labels | One agent crashed; second never cleaned up | Silent drift from expected openclaw.json |
| Disk growth under ~/.openclaw/logs | Both agents logging at debug | NVMe pressure on 512 GB slices |
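The automation replay cause listed above usually goes away once the provisioner writes to one fixed destination and copies only when the content differs. Here is a minimal sketch of that idea; ensure_plist is a hypothetical name, and the source and destination paths are whatever your provisioner already manages.

```shell
# Hypothetical provisioner step: install a plist only when it is missing or changed.
# $1 = source plist from your repo, $2 = fixed destination under ~/Library/LaunchAgents.
# Prints "unchanged" or "updated" so the caller can decide whether a reload is needed.
ensure_plist() {
  src=$1
  dest=$2
  if [ -f "$dest" ] && cmp -s "$src" "$dest"; then
    echo "unchanged"
  else
    cp "$src" "$dest" && echo "updated"
  fi
}
```

Because the destination name never varies, re-running the provisioner cannot leave a second plist behind. Reload the job only when the function prints updated.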
Run openclaw doctor as Baseline
openclaw doctor
Capture stdout to your CI log or ticket. Doctor typically flags path issues, version skew, and launchd registration problems. Treat warnings as blockers on production agent hosts.
Run doctor before and after any gateway install or major version bump: the delta between outputs tells you whether launchd state actually converged. Store both text files under a dated directory (example: ~/ops/openclaw/2026-04-09-pre.txt) so you can bisect regressions across months of 24/7 uptime.
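The before/after capture can be scripted so that file names stay consistent across operators. This is a sketch under assumptions: doctor_snapshot is a hypothetical helper, the default directory follows the example path above, and nothing is assumed about doctor's exit code.

```shell
# Hypothetical helper: save `openclaw doctor` output under a dated file name.
# $1 = phase tag ("pre" or "post"), $2 = output directory (default: ~/ops/openclaw).
doctor_snapshot() {
  phase=$1
  dir=${2:-$HOME/ops/openclaw}
  mkdir -p "$dir" || return 1
  out="$dir/$(date +%Y-%m-%d)-$phase.txt"
  # Capture stderr as well; whatever doctor's exit code is, keep the file.
  openclaw doctor > "$out" 2>&1 || true
  # Print the path so the caller can diff the two snapshots.
  echo "$out"
}
```

Call doctor_snapshot pre before the change and doctor_snapshot post after it, then compare the two files with diff -u. An empty diff after an install means nothing about the reported state changed.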
If doctor suggests a repair flag or subcommand you have not seen before, pin the OpenClaw version in your ticket and cross-check upstream release notes — cloud Macs often lag local laptops by one release cycle, which is useful for repro but painful if semantics changed.
Inspect LaunchAgents
ls -la ~/Library/LaunchAgents/ | grep -i openclaw
Open each plist; note Label, ProgramArguments, and EnvironmentVariables. If both ai.openclaw.gateway and ai.openclaw.mac style labels exist, decide which install mode is canonical for this host (native npm vs app bundle).
Pay attention to WorkingDirectory and any injected PATH: a duplicate often differs only by Node path (/usr/local/bin vs ~/.nvm/versions/...), which is enough to fork runtime behavior. Highlight differences in a three-column diff table in your runbook so the next on-call engineer does not reintroduce the second plist.
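To make the plist comparison less error-prone, the files can be dumped in readable form with plutil -p, which ships with macOS. The helper below is a hypothetical sketch and assumes the plist file names contain "openclaw"; adjust the glob if yours are named differently.

```shell
# Hypothetical helper: print a readable dump of every OpenClaw plist in a directory,
# so Label, ProgramArguments, EnvironmentVariables, and WorkingDirectory can be compared.
# $1 = directory to scan (default: ~/Library/LaunchAgents).
dump_openclaw_plists() {
  dir=${1:-$HOME/Library/LaunchAgents}
  for f in "$dir"/*[Oo]pen[Cc]law*.plist; do
    [ -e "$f" ] || continue   # the glob matched nothing
    echo "== $f"
    plutil -p "$f"            # macOS tool: prints the plist in human-readable form
  done
}
```

Redirect the output to a file in your ticket. For a pairwise comparison, dump each plist to its own text file and run diff -u on the two.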
Reconcile Order (Low Downtime)
- Announce maintenance if external automations hit this gateway; expect 30–120 seconds of listener downtime during bootout.
- Run launchctl bootout gui/$(id -u) ~/Library/LaunchAgents/<duplicate>.plist for every non-canonical label (adjust domain if your setup uses different bootstrap namespaces).
- Confirm with launchctl print gui/$(id -u)/<label> that only the intended job remains scheduled.
- Move retired plists to ~/Archive/LaunchAgents-2026-04-09/ instead of deleting blindly; you may need to diff them if rollback is required.
- Run openclaw gateway install (or the doctor-suggested repair) once for the chosen mode; avoid double-running install scripts in the same minute.
- Re-bootstrap if needed: launchctl bootstrap gui/$(id -u) ~/Library/LaunchAgents/<canonical>.plist, then re-run doctor until clean.
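The bootout and archive steps above can be wrapped in one function so they always happen together. This is a minimal sketch: retire_launchagent is a hypothetical name, and it assumes the job lives in the per-user gui domain used in the steps above.

```shell
# Hypothetical helper: unload one non-canonical LaunchAgent and archive its plist.
# $1 = path to the duplicate plist, $2 = archive directory (created if missing).
retire_launchagent() {
  plist=$1
  archive=$2
  domain="gui/$(id -u)"
  # bootout can fail when the job is not loaded; report it, then archive anyway.
  launchctl bootout "$domain" "$plist" || echo "bootout failed or job not loaded: $plist" >&2
  mkdir -p "$archive" && mv "$plist" "$archive/"
}
```

Example call: retire_launchagent ~/Library/LaunchAgents/<duplicate>.plist ~/Archive/LaunchAgents-2026-04-09. Moving the file after bootout keeps launchd from loading it again at next login while preserving it for a rollback diff.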
Verify Gateway Health
- Run lsof -nP -iTCP:18789 | grep LISTEN and confirm that exactly one PID owns the port.
- Run openclaw gateway status and compare it against your golden output from the last green deploy.
- Run openclaw logs --follow (sample 200 lines) and pair it with install guide baselines.
- Hit a lightweight health route (if exposed) from another MacXCode node in JP or US to confirm cross-region reachability matches your firewall rules.
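The single-owner check on port 18789 is easy to misread by eye, because one process often shows two LISTEN rows (IPv4 and IPv6). The sketch below counts distinct PIDs instead of rows; single_listener is a hypothetical helper that reads lsof output on stdin.

```shell
# Hypothetical helper: succeed only when exactly one PID listens on the port.
# Reads `lsof -nP -iTCP:<port>` output on stdin; column 2 is the PID.
single_listener() {
  # Keep LISTEN rows only and de-duplicate PIDs (IPv4 and IPv6 rows share one PID).
  pids=$(awk '/\(LISTEN\)/ { print $2 }' | sort -u)
  count=$(printf '%s\n' "$pids" | grep -c . || true)
  echo "listeners: $count"
  [ "$count" -eq 1 ]
}
```

Use it as lsof -nP -iTCP:18789 | single_listener in a post-deploy script. A non-zero exit status means zero or several owners, and both cases call for another look at launchd state.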
Network, Firewall, and Port 18789
Leased cloud Macs often sit behind provider-level ACLs or your own security group rules. OpenClaw’s gateway listener on TCP 18789 must be explicitly allowed for whichever subnets need access — whether that is CI runners in KR, operator laptops, or a reverse proxy VM.
If you terminate TLS on nginx or Caddy in front of the gateway, document the hop count: mis-matched X-Forwarded-* headers can masquerade as “duplicate gateway” symptoms when health checks hit the wrong upstream. Keep a single source of truth for “public URL → backend port” in your internal help page and mirror it on the host’s README.txt in ~/ops/.
nc -vz host 18789 should complete in < 5 ms on a quiet LAN path; multi-second delays usually indicate filtering or asymmetric routing, not OpenClaw itself.
Decision Matrix: One Mode per Host
| Goal | Prefer | Avoid |
|---|---|---|
| Scriptable CI-style ops | npm / CLI gateway + single LaunchAgent | App + CLI both managing gateway |
| GUI-first experimentation | Official app path; disable CLI duplicate | Parallel gateway installs |
| Docker isolation | Compose-only gateway; no host duplicate | Host LaunchAgent + container both binding 18789 |
Why Bare-Metal Mac mini M4 Helps Here
Duplicate LaunchAgent debugging is inherently stateful: you read plists, diff environments, and recycle daemons until doctor is clean. A dedicated Apple Silicon Mac mini M4 with local NVMe avoids the snapshot/reboot ambiguity of oversubscribed VMs and gives you the same launchd semantics Apple ships on desk hardware. MacXCode nodes in Hong Kong, Japan, Korea, Singapore, and the United States let you park the gateway near your API consumers or compliance boundary while keeping SSH for automation and VNC for occasional GUI verification.
Renting also makes it cheap to operate a canary host: clone plist policies from production, run the next OpenClaw version there first, and only then promote. When you outgrow a slice, scale disk or add a second node from pricing instead of procuring metal weeks in advance.
Bottom line: treat openclaw doctor as the source of truth prompt, then mechanically dedupe LaunchAgents and re-verify the listener. Stable bare-metal nodes make that rehearsal cheap — see pricing and help.