DevOps / CI·CD May 6, 2026

2026-05-06 iOS Simulator runtime disk budget, selective install, and CI cleanup on a leased Apple Silicon cloud Mac (HK / JP / KR / SG / US)

MacXCode Engineering Team May 6, 2026 ~23 min read

Teams that lease a Mac mini M4 in Hong Kong, Tokyo, Seoul, Singapore, or the United States for xcodebuild test quickly discover a second bill: Simulator runtimes are not free. They occupy multi-GB slices under Library/Developer/CoreSimulator, compete with DerivedData for NVMe bandwidth, and multiply when watchOS companions join the matrix. This 2026-05-06 guide answers three operational questions: how to inventory what is installed, how to select only the OS families your destinations require, and how to prune without deleting the runtime your nightly UI tests still boot. It extends the headless simulator testing guide, pairs with the disk cleanup janitor guide, and references the DerivedData isolation guide for parallel lanes on the same host.

Why 2026 CI still confuses “Xcode installed” with “Simulator ready”

Xcode.app bundles the toolchain, but runtimes are versioned payloads you download on demand. Three recurring mistakes:

  • Destination drift: the YAML still lists iPhone 15 while the host provisioned only iPhone 16 images after an Xcode bump.
  • Silent watch pairing: the iOS scheme passes locally because Xcode auto-fetched a watch runtime; CI never did.
  • Janitor overreach: a cron job deletes “old” runtimes that QA still needs to reproduce App Store crashes on iOS 17.6.

Anchor numbers: keep ≥ 180 GB free on a 512 GB SKU before starting a parallel test wave; on 2 TB nodes, still cap the runtime count so Spotlight and backup daemons do not contend for I/O during Archives.
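
The 180 GB floor above can be enforced as a gate before a parallel wave is released. The following is a minimal sketch, not a definitive implementation: the function names `has_headroom` and `can_start_test_wave` are hypothetical, and the choice of decimal gigabytes is an assumption.

```python
import shutil

GB = 1000 ** 3  # assumption: decimal gigabytes, as disk SKUs are marketed


def has_headroom(free_bytes: int, min_free_gb: float = 180.0) -> bool:
    """Return True when free space meets the pre-wave floor."""
    return free_bytes >= min_free_gb * GB


def can_start_test_wave(path: str = "/", min_free_gb: float = 180.0) -> bool:
    """Check the volume holding `path` before releasing a parallel wave."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return has_headroom(usage.free, min_free_gb)
```

A queue hook could call `can_start_test_wave("/")` and hold the wave when it returns False; on hosts with a separate build volume, pass that mount point instead.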

Runtime footprint: where gigabytes hide

Apple Silicon hosts store device data, dyld shared caches, and per-runtime slices. Expect 7–14 GB per major iOS runtime pair (phone plus paired watch; image sizes vary by year). Add 3–6 GB if you installed localization packs for UI screenshot lanes. The exact byte count matters less than the growth rate when five engineers each “just install the newest beta runtime” on the same shared builder.

Correlate runtime growth with queue latency: when free space drops under 12%, the first symptom is often a slower xctest fixture copy rather than an immediate failure, because APFS has to work harder to find contiguous extents.

When you adopt a new Xcode major, treat runtime downloads as a capacity migration, not a single checkbox: Xcode may prompt for multiple platform packs, CoreSimulator caches rebuild, and your first green build often performs an implicit “warm” that inflates disk usage for 24–48 hours. Budget an explicit maintenance window on leased hosts so PR traffic does not overlap with the first bulk download. Teams that ignore this see flaky red builds that disappear after the second day with no code change. Log df output hourly during that window and compare it against the steady-state curve you captured last quarter; if the slope is steeper than the historical one, you probably installed overlapping beta and GA slices on the same lane.
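
The slope comparison described above can be sketched with an ordinary least-squares fit over the hourly samples. The function names and the 1.25 tolerance factor are assumptions chosen for illustration, not values from this guide.

```python
def growth_gb_per_hour(samples):
    """Least-squares slope of used GB over time.

    `samples` is a list of (hour, used_gb) pairs taken from hourly df logs.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_u = sum(u for _, u in samples) / n
    num = sum((t - mean_t) * (u - mean_u) for t, u in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    if den == 0:
        raise ValueError("samples need distinct timestamps")
    return num / den


def is_steeper_than_baseline(samples, baseline_gb_per_hour, tolerance=1.25):
    """Flag the window when growth exceeds the baseline by `tolerance` (assumed)."""
    return growth_gb_per_hour(samples) > baseline_gb_per_hour * tolerance
```

For example, three samples of 100, 102, and 104 GB at hours 0, 1, and 2 give a slope of 2 GB per hour, which would be flagged against a historical baseline of 1 GB per hour.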

Finally, document who may install runtimes: ad-hoc sudo installs by individual engineers are how “mystery six gigabytes” appear on hosts accessed with shared keys. Centralize changes through infra tickets tied to CI image tags so ssh sessions remain auditable, especially when the Mac sits in a regulated geography where data residency already constrained your choice of Singapore vs US East for signing material.

Lane matrix: which host keeps which runtimes

Split responsibilities explicitly; do not pretend every runner is interchangeable.

Lane label | Runtime policy | Cleanup policy
ios-current | Latest GA iOS + one N-1 for App Store parity | Weekly prune with ticket
watch-heavy | watchOS images + paired phones only | Monthly; never delete N-1 without QA sign-off
archive-only | Minimal simulators; prefers device Archives | Disk janitor aggressive on simulators, gentle on keys

Inventory commands operators should paste into runbooks

Start with non-destructive probes, then escalate.

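    # List the Simulator runtimes that CoreSimulator knows about
    xcrun simctl list runtimes
    # Show free space on the boot volume
    df -h /
    # Show the 20 largest CoreSimulator entries for the current user
    du -sh ~/Library/Developer/CoreSimulator/* 2>/dev/null | sort -h | tail -n 20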

When the numbers disagree with Finder, trust du run as the CI user rather than admin snapshots, because launchd jobs run as the builder account. If you use separate volumes, repeat for /Volumes/builds.
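
For a machine-readable inventory, simctl can emit JSON via its --json flag. The sketch below assumes the output contains a top-level "runtimes" array whose entries carry "name", "buildversion", and "isAvailable" keys; verify those keys against your Xcode version. The helper names are hypothetical.

```python
import json
import subprocess


def parse_runtimes(simctl_json: str):
    """Return sorted (name, buildversion) pairs for usable runtimes."""
    data = json.loads(simctl_json)
    return sorted(
        (rt["name"], rt.get("buildversion", "unknown"))
        for rt in data.get("runtimes", [])
        if rt.get("isAvailable", False)
    )


def installed_runtimes():
    """Run simctl as the current (builder) user and parse its JSON output."""
    result = subprocess.run(
        ["xcrun", "simctl", "list", "--json", "runtimes"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_runtimes(result.stdout)
```

Keeping the parser separate from the subprocess call lets you test it against a saved JSON export on any machine, then run `installed_runtimes()` on the builder itself.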

Selective install runbook (happy path)

  1. Freeze the queue or drain the labels tied to the host.
  2. Download only the runtime bundles required by the next sprint’s destination matrix.
  3. Smoke-boot one simulator per runtime with xcrun simctl boot, and verify that sysctl hw.model still reports an Apple Silicon model.
  4. Re-open the queue with a bumped image version tag so the CI YAML matches reality.
  5. Document the installed set in the infra repo, not in an orphaned wiki page.

Between steps two and three, capture checksums or Apple version strings in your ticket so rollback is obvious if a bad mirror delivered a truncated runtime.
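
Step 2 of the runbook reduces to a set difference between what the destination matrix requires and what the host already has. A minimal sketch, assuming runtimes are identified by display names such as "iOS 18.4" (the function name and the result keys are hypothetical):

```python
def plan_runtime_changes(required, installed):
    """Compare the sprint's required runtimes with the installed set."""
    required, installed = set(required), set(installed)
    return {
        # Needed by the destination matrix but missing on the host.
        "install": sorted(required - installed),
        # Already present and still needed.
        "keep": sorted(required & installed),
        # Present but unreferenced; review with QA before any removal.
        "review": sorted(installed - required),
    }
```

The "review" bucket is deliberately not called "delete": anything in it still has to pass through the prune policy before removal.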

Prune policy: what is allowed to die

Good janitors delete derived artifacts aggressively but treat runtimes as semi-static infrastructure. Use a two-phase policy:

Artifact | Safe cadence | Risk if over-pruned
Unbooted simulator devices | Daily | Low: recreate from templates
Old .xcresult bundles | After upload to object store | Medium: legal retention may require 30–90 day copies off-host
Runtime bundles | Quarterly with QA list | High: breaks crash reproducibility
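
The daily tier of the table can be sketched as a filter over the JSON that `xcrun simctl list --json devices` emits. This assumes the output has a top-level "devices" object keyed by runtime identifier, with each device carrying "udid" and "state" fields; the function name is hypothetical, and the sketch only selects candidates rather than deleting anything.

```python
import json


def shutdown_device_udids(devices_json: str):
    """Return UDIDs of devices that are not booted, grouped across runtimes."""
    data = json.loads(devices_json)
    udids = []
    for runtime_id, devices in data.get("devices", {}).items():
        for device in devices:
            # Only devices reported as Shutdown are safe daily candidates.
            if device.get("state") == "Shutdown":
                udids.append(device["udid"])
    return sorted(udids)
```

Each returned UDID could then be passed to `xcrun simctl delete`; runtime bundles are intentionally out of scope here because they belong to the quarterly tier.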

Parallel lanes and unified memory pressure

Parallel xcodebuild jobs multiply simulator boot churn. Cap concurrent simulators per host using queue labels, as described in the parallel job guidance. When memory pressure spikes, prefer fewer concurrent destinations over swapping; on unified memory, swap is especially painful for XCTest.
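
Queue labels are the mechanism this guide recommends. As a complementary illustration only, the sketch below shows the same cap enforced inside a single orchestrating process with a bounded semaphore. `SimulatorSlots` and `run_all` are hypothetical names, and the default of four slots is an assumed value.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SimulatorSlots:
    """Hypothetical in-process gate limiting how many jobs hold a simulator."""

    def __init__(self, max_booted: int = 4):
        self._slots = threading.BoundedSemaphore(max_booted)
        self._lock = threading.Lock()
        self._active = 0
        self.peak = 0  # highest concurrency observed so far

    def run(self, job):
        """Block until a slot is free, run job(), then release the slot."""
        with self._slots:
            with self._lock:
                self._active += 1
                self.peak = max(self.peak, self._active)
            try:
                return job()
            finally:
                with self._lock:
                    self._active -= 1


def run_all(slots: SimulatorSlots, jobs, workers: int = 8):
    """Submit every job to a thread pool; the gate enforces the cap."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(slots.run, jobs))
```

In practice each job would wrap one xcodebuild test invocation. The gate only covers jobs launched by this process, so it does not replace label-based limits across independent runners.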

Headless tip: if UI tests flake only after pruning, capture a sysdiagnose slice from the builder account before you blame application code; disk pressure often shows up first as SpringBoard watchdog timeouts.

Xcode upgrade windows: sequencing runtime pulls with Archives

Never schedule a runtime download on the same night as a TestFlight submission freeze unless you have a second warm node ready. The safest pattern on MacXCode hosts is blue/green builders: promote a refreshed image tag only after xcodebuild test and a trivial xcodebuild archive -archivePath /tmp/Smoke.xcarchive succeed on the candidate. If you cannot afford dual nodes, shrink scope: install one additional runtime per maintenance hour, not five at once. Record wall-clock minutes for each download so finance can compare leasing another 1 TB builder versus burning a release weekend.

After upgrades, re-verify the destination strings in your YAML, because Apple occasionally renames simulator hardware profiles. A mismatch surfaces as an “Unable to find a destination matching the provided destination specifier” error even though simctl list looks populated, usually because the job targets a hardware string you deleted during pruning. Keep a machine-readable export of simctl list devices in your infra repo for diff review.
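
The diff review can be automated by checking each destination string against a saved device export. This sketch assumes destinations use the comma-separated key=value form (for example "platform=iOS Simulator,name=iPhone 16") and that the export has the "devices" object produced by `xcrun simctl list --json devices`; all function names are hypothetical.

```python
import json


def parse_destination(spec: str) -> dict:
    """Split 'key=value,key=value' into a dict, ignoring malformed parts."""
    fields = {}
    for part in spec.split(","):
        key, sep, value = part.partition("=")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def available_device_names(devices_json: str) -> set:
    """Collect names of devices the export marks as available."""
    data = json.loads(devices_json)
    return {
        device["name"]
        for devices in data.get("devices", {}).values()
        for device in devices
        if device.get("isAvailable", False)
    }


def missing_destinations(specs, devices_json: str):
    """Return destination specs whose name= device is absent from the export."""
    names = available_device_names(devices_json)
    missing = []
    for spec in specs:
        name = parse_destination(spec).get("name")
        # Specs that address a device by id rather than name are skipped.
        if name is not None and name not in names:
            missing.append(spec)
    return missing
```

Running this in a pre-merge check against the committed export catches a renamed or pruned device before a job ever reaches the host.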

Numeric targets you can put in Grafana

  • 85% max sustained disk utilization before paging ops.
  • 4 concurrent booted simulators max on 16 GB unless profiling proves headroom.
  • 22 minutes upper bound for cold “install runtime + boot + single XCTest” acceptance; alert if exceeded post-upgrade.
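
Before these targets reach a dashboard, they can be expressed as a plain threshold check. The sketch below evaluates a single sample; the `HostSample` type and `breached_targets` function are hypothetical, and a real alert on "sustained" utilization would need to evaluate a window of samples rather than one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostSample:
    """One observation of a builder host (field names are illustrative)."""
    disk_used_pct: float
    booted_simulators: int
    cold_acceptance_minutes: float


def breached_targets(sample: HostSample,
                     max_disk_pct: float = 85.0,
                     max_booted: int = 4,
                     max_cold_minutes: float = 22.0):
    """Return the names of the targets that this sample exceeds."""
    alerts = []
    if sample.disk_used_pct > max_disk_pct:
        alerts.append("disk_utilization")
    if sample.booted_simulators > max_booted:
        alerts.append("booted_simulators")
    if sample.cold_acceptance_minutes > max_cold_minutes:
        alerts.append("cold_acceptance")
    return alerts
```

The defaults mirror the three numbers listed above; the simulator cap applies to 16 GB hosts and should be raised only where profiling shows headroom.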

Nine-step checklist before you delete “the big folder”

  1. Confirm that no Archive job is mid-flight.
  2. Commit a snapshot of the current simctl list output to the GitOps repo.
  3. Identify runtimes with zero jobs in the last 30 days.
  4. Notify QA with explicit removal dates.
  5. Drain one lane at a time.
  6. Delete runtimes via supported UI/CLI paths only.
  7. Re-run smoke tests for the remaining destinations.
  8. Compare df before and after; attach the deltas to the ticket.
  9. Roll the tag forward on healthy hosts only.
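
Step 3 of the checklist can be sketched as a cutoff comparison, assuming you can export a mapping from runtime name to the timestamp of the last CI job that used it. That mapping and the function name are hypothetical; this guide does not prescribe where the data comes from.

```python
from datetime import datetime, timedelta


def idle_runtimes(installed, last_used, now, idle_days=30):
    """Return installed runtimes with no recorded job inside the window.

    `last_used` maps runtime name -> datetime of the most recent job.
    Runtimes missing from the mapping count as idle.
    """
    cutoff = now - timedelta(days=idle_days)
    return sorted(
        runtime
        for runtime in installed
        if last_used.get(runtime) is None or last_used[runtime] < cutoff
    )
```

The result is a candidate list for the QA notification in step 4, not a deletion list.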

FAQ: betas, Apple Silicon, and cross-region hosts

Question | Practical answer (2026-05-06)
Should beta runtimes live on production CI? | Isolate to labeled canary hosts; never mix with App Store submission lanes.
Do Singapore and US hosts need identical sets? | Align on minimum shared set; region-specific extras are fine if YAML encodes them.
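
The "minimum shared set" in the second answer is a set intersection across regions, with whatever remains per region being the extras your YAML has to encode. A minimal sketch with a hypothetical function name and illustrative region keys:

```python
def split_shared_and_extras(region_sets):
    """Return (shared runtimes, per-region extras) from a region -> runtimes map."""
    sets = {region: set(items) for region, items in region_sets.items()}
    # Runtimes present on every region's hosts form the shared minimum.
    shared = set.intersection(*sets.values()) if sets else set()
    extras = {region: sorted(items - shared) for region, items in sets.items()}
    return sorted(shared), extras
```

For instance, if SG hosts carry two iOS runtimes and US hosts carry the same two plus a watchOS runtime, the shared set is the two iOS runtimes and the watchOS runtime is a US-only extra.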

Why Mac mini M4 with 1–2 TB still wins for simulator-heavy CI

Simulator workloads are random-read heavy; the bare-metal NVMe in a Mac mini M4 on MacXCode nodes keeps boot times predictable when four lanes boot different OS generations for the same pull request matrix. That predictability is what lets you enforce numeric budgets instead of buying mystery “large cloud disks” that sit behind noisy neighbors. Pair this hardware story with the regional pricing page when capacity planners ask for a second JP canary, and with the SSH/VNC access guides when security teams want evidence of who deleted a runtime.
