
2026-05-29 Hermes Agent serverless: Modal & Daytona hibernate + Telegram wake (HK / JP / KR / SG / US)

A 24/7 cloud VM for an AI agent can cost $5–20/month before you spend a single LLM token. Hermes Agent (MIT, Nous Research) targets that pain with serverless terminal backends: Modal and Daytona sandboxes that hibernate when idle and spin up when Hermes needs to run shell tools—while you chat from Telegram on your phone.

One caveat so budgets do not break: upstream separates where the gateway listens (often a small always-on host for Telegram long polling) from where heavy commands run (Modal/Daytona). This guide shows the lowest-cost honest architecture, not “literally $0 for everything including model API bills.”
Disclosure: MacXCode leases Apple Silicon Mac mini M4 hosts for teams that need always-on Xcode CI or OpenClaw gateways—we mention leases only as a contrast in the cost matrix, not as the cheapest Telegram+Hermes path.

The operating-cost problem

Budget-conscious founders and automation hobbyists hit three bills:

Cost line | Typical range | What drives it
Compute VM | $5–50/mo | VPS/RDS always on for gateway + tools
LLM API | $5–500+/mo | Model choice × tool loops
Egress / storage | $0–20/mo | Logs, artifacts, Modal disk

Hermes does not eliminate LLM spend. It can collapse idle compute between bursts by routing terminal execution to Modal/Daytona instead of a fat always-on box running bash 24/7.
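
To make the three bills concrete, here is a minimal Python sketch that sums them for an always-on VM and for a thin gateway with serverless sandboxes. Every dollar figure is an illustrative placeholder picked from the ranges in the table above, not a vendor quote, and the MonthlyBill class is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass
class MonthlyBill:
    """One month of spend in USD, split into the three cost lines above."""
    compute: float   # VM, gateway host, or sandbox runtime
    llm_api: float   # model tokens; unaffected by where tools run
    storage: float   # logs, artifacts, sandbox disk

    def total(self) -> float:
        return self.compute + self.llm_api + self.storage


# Illustrative numbers only (assumptions, not measurements).
fat_vm = MonthlyBill(compute=20.0, llm_api=30.0, storage=2.0)
# $5 gateway host plus an assumed $1.50 of sandbox runtime.
thin_gateway = MonthlyBill(compute=5.0 + 1.5, llm_api=30.0, storage=2.0)

savings = fat_vm.total() - thin_gateway.total()
print(f"always-on VM: ${fat_vm.total():.2f}")
print(f"thin gateway + sandboxes: ${thin_gateway.total():.2f}")
print(f"difference: ${savings:.2f} (LLM spend is identical in both)")
```

The only line that moves is compute. If LLM spend dominates your bill, changing the terminal backend will not change the total much.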

If you already run Hermes on a leased Mac mini M4 for iOS builds, see our Telegram gateway on M4 guide—this article is for minimizing cloud compute, not maximizing Xcode colocation.

Architecture: gateway vs terminal backend

Technical summary: the Hermes gateway (hermes gateway) handles messaging; the terminal backend (terminal.backend in ~/.hermes/config.yaml) decides where bash, file tools, and scripts execute.

Layer | Hibernate when idle? | Typical host
Telegram gateway (long polling) | No (outbound polls need a running process) | $5 VPS, home Mac, or leased Mac
Terminal backend: Modal | Yes (sandbox sleeps between tool bursts) | Modal cloud
Terminal backend: Daytona | Yes (sandbox sleeps between tool bursts) | Daytona cloud
LLM provider | N/A (pay per token) | OpenRouter, Nous Portal, etc.

Wake path you want:

  1. You send a Telegram message to your bot.
  2. Gateway (on a thin host) receives the update.
  3. Hermes agent loop calls tools → Modal/Daytona wakes, runs commands, returns output.
  4. Sandbox hibernates again; gateway may stay up (small RAM).
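
The four steps above can be modeled as a toy state machine. This is a sketch for reasoning about the flow, not Hermes code: ToySandbox and handle_message are hypothetical names, the "run " prefix stands in for the agent loop deciding to call a tool, and a real sandbox hibernates after an idle timeout rather than immediately.

```python
import enum


class SandboxState(enum.Enum):
    HIBERNATED = "hibernated"
    RUNNING = "running"


class ToySandbox:
    """Stand-in for a Modal/Daytona sandbox; records state transitions."""

    def __init__(self) -> None:
        self.state = SandboxState.HIBERNATED
        self.transitions: list[str] = []

    def run(self, command: str) -> str:
        if self.state is SandboxState.HIBERNATED:
            self.state = SandboxState.RUNNING      # step 3: wake on demand
            self.transitions.append("wake")
        return f"(output of: {command})"

    def hibernate(self) -> None:
        if self.state is SandboxState.RUNNING:
            self.state = SandboxState.HIBERNATED   # step 4: sleep again
            self.transitions.append("hibernate")


def handle_message(text: str, sandbox: ToySandbox) -> str:
    """Steps 2-4: the always-on gateway receives text, tools run remotely."""
    if text.startswith("run "):
        output = sandbox.run(text[len("run "):])
        sandbox.hibernate()
        return output
    return "no tool call needed"  # chat-only replies never wake the sandbox


sandbox = ToySandbox()
print(handle_message("hello", sandbox))
print(handle_message("run uname -a", sandbox))
print(sandbox.transitions)
```

In this model a plain chat message never touches the sandbox, which is why idle compute can stay near zero while the gateway keeps listening.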

For true sleep-everything deployments, upstream Telegram docs describe webhook mode (HTTPS ingress) on Fly.io/Railway, which is harder to set up on SSH-only Mac leases. Default long polling is simpler but needs an always-on gateway process.
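
To see why long polling needs a running process, here is a minimal sketch of a polling loop against the Telegram Bot API getUpdates method. It is not the Hermes gateway implementation; fetch_updates, poll, and fake_fetch are hypothetical names, and the demo uses a fake fetcher so it runs without a bot token.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Iterator

API = "https://api.telegram.org/bot{token}/getUpdates"


def fetch_updates(token: str, offset: int, timeout: int = 30) -> list[dict]:
    """One long-poll request; Telegram holds it open up to `timeout` seconds."""
    query = urllib.parse.urlencode({"offset": offset, "timeout": timeout})
    url = API.format(token=token) + "?" + query
    with urllib.request.urlopen(url, timeout=timeout + 10) as resp:
        return json.load(resp)["result"]


def poll(fetch: Callable[[int], list[dict]], max_rounds: int) -> Iterator[dict]:
    """Yield updates, advancing the offset so each one is delivered once."""
    offset = 0
    for _ in range(max_rounds):
        for update in fetch(offset):
            offset = update["update_id"] + 1
            yield update


def fake_fetch(offset: int) -> list[dict]:
    """Offline stand-in for fetch_updates, used only for this demo."""
    updates = [
        {"update_id": 7, "message": {"text": "hi"}},
        {"update_id": 8, "message": {"text": "run uname -a"}},
    ]
    return [u for u in updates if u["update_id"] >= offset]


for update in poll(fake_fetch, max_rounds=2):
    print(update["update_id"], update["message"]["text"])
```

Against the real API you would pass something like `lambda offset: fetch_updates(token, offset)` and loop indefinitely. The client initiates every request, so something must stay awake to make them. Webhook mode inverts this: Telegram calls your HTTPS endpoint, which lets the host sleep between messages.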

Official references: Terminal backends, Telegram setup, GitHub README.

Cost decision matrix (4 rows)

Pattern | Monthly idle compute | Best for | Trade-off
Fat VPS 24/7 | ~$5–12 (1 vCPU) | Simplest mental model | Pays while you sleep
Thin VPS gateway + Modal/Daytona tools | ~$5 gateway + near-$0 idle sandboxes | Telegram + bursty automation | Two providers to monitor
Home Mac / laptop gateway | Electricity only | Solo dev testing | Must stay online
Leased Mac mini M4 (MacXCode class) | Lease fee (regional monthly) | Xcode + agent on one Apple Silicon host | Not the cheapest chat-only bot

Modal bills per CPU-second and GB-second when sandboxes run; idle hibernated periods avoid those charges. Daytona markets similar sleep-when-idle behavior—verify current pricing on each vendor’s dashboard before production.
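
As a rough illustration of per-second billing, the sketch below estimates sandbox runtime cost from CPU-seconds and GB-seconds. The two rates are hypothetical placeholders, not Modal's or Daytona's actual prices, and sandbox_cost is an illustrative helper rather than a vendor API.

```python
def sandbox_cost(active_seconds: float, cpus: float, memory_gb: float,
                 cpu_rate: float, mem_rate: float) -> float:
    """Usage-based cost: pay per CPU-second and per GB-second while running."""
    cpu_seconds = active_seconds * cpus
    gb_seconds = active_seconds * memory_gb
    return cpu_seconds * cpu_rate + gb_seconds * mem_rate


# Hypothetical rates in USD; look up real ones on the vendor dashboard.
CPU_RATE = 0.00004   # per CPU-second (assumption)
MEM_RATE = 0.000005  # per GB-second (assumption)

# 40 tool bursts a day, 30 seconds each, for 30 days, on 1 CPU / 5 GB.
active = 40 * 30 * 30
bursty = sandbox_cost(active, cpus=1, memory_gb=5,
                      cpu_rate=CPU_RATE, mem_rate=MEM_RATE)

# The same sandbox if it never hibernated during the month.
always_on = sandbox_cost(30 * 24 * 3600, cpus=1, memory_gb=5,
                         cpu_rate=CPU_RATE, mem_rate=MEM_RATE)

print(f"bursty: ${bursty:.2f}/month, never idle: ${always_on:.2f}/month")
```

The ratio between the two numbers depends only on the fraction of the month the sandbox is awake, so the estimate is worth redoing with real rates and your own burst pattern.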

Apple Mac mini specs matter only if you colocate gateway with builds—not required for serverless terminal routing.

When Modal fits

  • Bursty shell work (scripts, pip install, data pulls) with minutes between Telegram messages.
  • Optional GPU classes for ML-side tasks (pay only during runs).
  • Filesystem persistence via Modal snapshots when container_persistent: true.

Configure ~/.hermes/config.yaml

    terminal:
      backend: modal
      modal_image: "nikolaik/python-nodejs:python3.11-nodejs20"
      container_cpu: 1
      container_memory: 5120
      container_disk: 51200
      container_persistent: true

Prerequisites (upstream):

    pip install modal
    modal token new
    hermes doctor

When Modal is wrong: sub-second local file edits on a huge monorepo checkout—cold start + image pull latency hurts. Use local or SSH backend on a machine that already has the repo.

Daytona terminal backend

Daytona routes tool execution to cloud sandboxes that hibernate when idle (per Hermes docs). Set:

    terminal:
      backend: daytona

Export API key before starting gateway:

export DAYTONA_API_KEY="your_key" # persist in ~/.hermes/.env for launchd

When Daytona fits: you want serverless persistence without managing Docker on a VPS—Hermes README positions Daytona alongside Modal for “cost nearly nothing between sessions.”

When Daytona is wrong: strict data residency that requires on-prem execution only, because cloud sandboxes fall outside that compliance scope.

Telegram: wake the agent without a fat server

Wire Telegram once (full steps in our Telegram gateway guide):

hermes gateway setup

Low-cost gateway hosts:

  • $5/month VPS (1 GB RAM) running only hermes gateway + ~/.hermes/.env
  • Oracle Cloud free tier (if available in your region—verify account limits)
  • Home always-on Mac for experiments

Point terminal.backend to modal or daytona on that same machine—the gateway stays lightweight; heavy work wakes serverless sandboxes.

Security: set TELEGRAM_ALLOWED_USERS to numeric user IDs only, and never expose the bot without an allowlist (message @userinfobot to find your ID).
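
The allowlist idea can be sketched as a deny-by-default check. Hermes enforces its own allowlist, so this is illustrative only: parse_allowed_users and is_allowed are hypothetical helpers, and the comma-separated format for TELEGRAM_ALLOWED_USERS is an assumption to confirm against upstream docs.

```python
import os


def parse_allowed_users(raw: str) -> set[int]:
    """Parse a comma-separated list of numeric Telegram user IDs.

    Non-numeric entries (such as @usernames) raise ValueError, so a typo
    fails loudly instead of being silently ignored.
    """
    ids = set()
    for part in raw.split(","):
        part = part.strip()
        if not part:
            continue
        if not part.isdigit():
            raise ValueError(f"not a numeric Telegram user ID: {part!r}")
        ids.add(int(part))
    return ids


def is_allowed(user_id: int, allowed: set[int]) -> bool:
    """Deny by default: an empty allowlist admits no one."""
    return user_id in allowed


# Assumed format: TELEGRAM_ALLOWED_USERS="123456789,987654321"
allowed = parse_allowed_users(os.environ.get("TELEGRAM_ALLOWED_USERS", ""))
print(f"{len(allowed)} user(s) allowlisted")
```

Rejecting usernames at parse time matters because Telegram usernames can change hands, while numeric IDs stay bound to one account.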

Compare agent frameworks in our Hermes vs OpenClaw vs OpenHuman matrix—OpenClaw excels on headless launchd leases; Hermes excels at learning loop + Modal/Daytona offload.

Eight-step runbook: near-idle compute stack

  1. Install Hermes: curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
  2. Model auth: hermes setup (BYO keys; budget models on OpenRouter for cost control).
  3. Pick serverless backend: Modal (modal token new) or Daytona (DAYTONA_API_KEY).
  4. Write ~/.hermes/config.yaml: set terminal.backend to modal or daytona with the resource limits above.
  5. Telegram: hermes gateway setup; confirm ~/.hermes/.env.
  6. Thin gateway host: run hermes gateway install && hermes gateway start on the VPS (not on an expensive GPU box).
  7. Smoke test: message the bot “run uname -a and report” and watch the Modal/Daytona dashboard for sandbox start/stop.
  8. Cost guardrails: set provider spend caps; schedule hermes gateway stop during vacations if the gateway is not needed.
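
Before the smoke test, a small preflight script can catch the most common misconfigurations from the runbook. This is a hypothetical helper, not a replacement for hermes doctor: read_backend uses a regex as a stand-in for a real YAML parser, and the checks reflect only the settings named in this guide.

```python
import os
import re
from pathlib import Path
from typing import Optional

SERVERLESS = {"modal", "daytona"}


def read_backend(config_text: str) -> Optional[str]:
    """Return the value of the first `backend:` key, or None if absent.

    A regex stand-in for a YAML parser; good enough for a smoke check only.
    """
    match = re.search(r"^\s*backend:\s*['\"]?([\w-]+)", config_text, re.MULTILINE)
    return match.group(1) if match else None


def preflight(config_text: str, env: dict[str, str]) -> list[str]:
    """Collect human-readable problems; an empty list means ready to test."""
    problems = []
    backend = read_backend(config_text)
    if backend not in SERVERLESS:
        problems.append(f"terminal.backend is {backend!r}, expected modal or daytona")
    if backend == "daytona" and not env.get("DAYTONA_API_KEY"):
        problems.append("DAYTONA_API_KEY is not set")
    if not env.get("TELEGRAM_ALLOWED_USERS"):
        problems.append("TELEGRAM_ALLOWED_USERS is empty; set an allowlist first")
    return problems


if __name__ == "__main__":
    path = Path.home() / ".hermes" / "config.yaml"
    text = path.read_text() if path.exists() else ""
    for problem in preflight(text, dict(os.environ)):
        print("FIX:", problem)
```

Run it on the gateway host as the same user that starts hermes gateway, since both the config path and the environment are per-user. It reads only the process environment, so values kept solely in ~/.hermes/.env must be exported first or they will be reported as missing.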

Troubleshooting

Bot replies “hello” but tools fail with Modal auth errors

Symptom | Fix
modal token missing | Run modal token new as the gateway host user
Wrong Python env | Run hermes doctor; install modal in the Hermes venv
Stale sandbox | Toggle container_persistent or clear Modal app logs

Sandbox runs but Telegram silent

  • Gateway not running: hermes gateway status
  • Check ~/.hermes/logs/gateway.log for Telegram token errors
  • Only one process may poll a bot token (Conflict: terminated by other getUpdates)

Costs higher than expected

  • LLM tokens dominate—switch model, shorten tool loops, use /compress in chat per upstream CLI docs
  • Modal persistent disk snapshots still bill storage—trim container_disk
  • Gateway VPS left at 4 GB RAM tier—downsize to 1 GB if only polling

MEDIA: file attachments fail from Modal backend

Gateway sends files from host paths—inside Modal, write to a host-mounted volume path documented in the Telegram + Docker section of upstream docs.

FAQ

Can monthly spend be literally $0?
Unlikely end-to-end. Idle Modal/Daytona compute can approach $0, but a gateway host (~$5 VPS), Telegram, and LLM API usage usually still cost money. Free tiers change—verify vendor pages monthly.
Does the entire Hermes stack hibernate on Telegram idle?
Terminal sandboxes do. The gateway for default long polling stays awake unless you move to webhook hosting or stop the service.
Modal or Daytona?
Modal if you want GPU/CPU classes and snapshot persistence in one ecosystem. Daytona if you prefer their sandbox UX—run hermes doctor after configuring either.
Is this a replacement for a leased Mac mini M4?
No for Xcode CI. Yes for personal Telegram automation where Apple Silicon and 24/7 disk are unnecessary.
Where is official serverless documentation?
Hermes terminal backends and NousResearch/hermes-agent README (six backends: local, Docker, SSH, Singularity, Modal, Daytona).

Budget automation without a fat VM

When Telegram + bursty tools are enough, a thin gateway plus Modal/Daytona hibernate beats paying for 24/7 shell hosts—leases matter only when Xcode colocation does.