2026-05-29 Hermes Agent serverless: Modal & Daytona hibernate + Telegram wake (HK / JP / KR / SG / US)
A 24/7 cloud VM for an AI agent can cost $5–20/month before you spend a single LLM token. Hermes Agent (MIT, Nous Research) targets that pain with serverless terminal backends: Modal and Daytona sandboxes that hibernate when idle and spin up when Hermes needs to run shell tools—while you chat from Telegram on your phone.
The operating-cost problem
Budget-conscious founders and automation hobbyists hit three bills:
| Cost line | Typical range | What drives it |
|---|---|---|
| Compute VM | $5–50/mo | VPS/RDS always on for gateway + tools |
| LLM API | $5–500+/mo | Model choice × tool loops |
| Egress / storage | $0–20/mo | Logs, artifacts, Modal disk |
Hermes does not eliminate LLM spend. It can collapse idle compute between bursts by routing terminal execution to Modal/Daytona instead of a fat always-on box running bash 24/7.
If you already run Hermes on a leased Mac mini M4 for iOS builds, see our Telegram gateway on M4 guide—this article is for minimizing cloud compute, not maximizing Xcode colocation.
Architecture: gateway vs terminal backend
Quotable model (technical summary): The Hermes gateway (hermes gateway) handles messaging; the terminal backend (terminal.backend in ~/.hermes/config.yaml) decides where bash, file tools, and scripts execute.
| Layer | Hibernate when idle? | Typical host |
|---|---|---|
| Telegram gateway (long polling) | No: outbound polling needs a running process | $5 VPS, home Mac, or leased Mac |
| Terminal backend: Modal | Yes — sandbox sleeps between tool bursts | Modal cloud |
| Terminal backend: Daytona | Yes — sandbox sleeps between tool bursts | Daytona cloud |
| LLM provider | N/A (pay per token) | OpenRouter, Nous Portal, etc. |
Wake path you want:
- You send a Telegram message to your bot.
- Gateway (on a thin host) receives the update.
- Hermes agent loop calls tools → Modal/Daytona wakes, runs commands, returns output.
- Sandbox hibernates again; gateway may stay up (small RAM).
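The wake path above can be sketched as a toy timeline. This is a minimal simulation, not Hermes code: `IDLE_TIMEOUT_S`, `active_seconds`, and the burst timestamps are hypothetical, and the real hibernate timing is decided by Modal or Daytona.

```python
# Toy model: how long a sandbox stays awake for a series of tool bursts.
# All names and numbers are illustrative assumptions, not Hermes or vendor APIs.

IDLE_TIMEOUT_S = 300  # assumed idle window before the sandbox hibernates


def active_seconds(bursts, idle_timeout_s=IDLE_TIMEOUT_S):
    """Return total awake seconds for (start, duration) tool bursts.

    The sandbox wakes at each burst start, runs for `duration`, then stays
    awake for `idle_timeout_s` before hibernating. Overlapping awake windows
    are merged so no second is counted twice.
    """
    windows = sorted((start, start + duration + idle_timeout_s)
                     for start, duration in bursts)
    total = 0
    current_start, current_end = None, None
    for start, end in windows:
        if current_end is None or start > current_end:
            # Sandbox had hibernated: close the previous window, open a new one.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            # Sandbox was still awake: extend the current window.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total


if __name__ == "__main__":
    # Three Telegram-triggered bursts: two close together, one an hour later.
    bursts = [(0, 20), (60, 30), (3600, 10)]
    awake = active_seconds(bursts)
    print(f"awake {awake}s out of 86400s/day")
```

The point of the model is that awake time tracks how messages cluster, not how long the gateway has been up: messages sent close together share one wake window, while widely spaced ones each pay the idle timeout.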
For true sleep-everything deployments, the upstream Telegram docs describe webhook mode (HTTPS ingress) on Fly.io/Railway, which is harder to set up on SSH-only Mac leases. Default long polling is simpler but needs an always-on gateway process.
Official references: Terminal backends, Telegram setup, GitHub README.
Cost decision matrix (4 rows)
| Pattern | Monthly idle compute | Best for | Trade-off |
|---|---|---|---|
| Fat VPS 24/7 | ~$5–12 (1 vCPU) | Simplest mental model | Pays while you sleep |
| Thin VPS gateway + Modal/Daytona tools | ~$5 gateway + near-$0 idle sandboxes | Telegram + bursty automation | Two providers to monitor |
| Home Mac / laptop gateway | Electricity only | Solo dev testing | Must stay online |
| Leased Mac mini M4 (MacXCode class) | Lease fee (regional monthly) | Xcode + agent on one Apple Silicon host | Not the cheapest chat-only bot |
Modal bills per CPU-second and GB-second when sandboxes run; idle hibernated periods avoid those charges. Daytona markets similar sleep-when-idle behavior—verify current pricing on each vendor’s dashboard before production.
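A back-of-envelope estimate makes the per-second billing model concrete. The sketch below assumes cost is awake seconds times the sum of a per-CPU-second and a per-GB-second charge; the function name and both rates are placeholders, not current Modal or Daytona prices.

```python
# Back-of-envelope sandbox cost: CPU-seconds plus GB-seconds of memory.
# The rates below are placeholders, NOT current Modal or Daytona prices.

def sandbox_cost(active_s, cpus, memory_gb, cpu_rate, mem_rate):
    """Cost of `active_s` awake seconds at per-CPU-second and per-GB-second rates."""
    return active_s * (cpus * cpu_rate + memory_gb * mem_rate)


if __name__ == "__main__":
    cpu_rate = 0.00004   # hypothetical $ per CPU-second
    mem_rate = 0.000005  # hypothetical $ per GB-second
    # 20 minutes of awake time per day for 30 days, 1 CPU and 5 GB of memory.
    monthly_active_s = 20 * 60 * 30
    cost = sandbox_cost(monthly_active_s, cpus=1, memory_gb=5,
                        cpu_rate=cpu_rate, mem_rate=mem_rate)
    print(f"estimated sandbox compute: ${cost:.2f}/month")
```

Substitute the rates from the vendor's pricing page and your own awake time before comparing the result with a flat VPS fee.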
Apple Mac mini specs matter only if you colocate gateway with builds—not required for serverless terminal routing.
Modal terminal backend
When Modal fits
- Bursty shell work (scripts, pip install, data pulls) with minutes between Telegram messages.
- Optional GPU classes for ML-side tasks (pay only during runs).
- Filesystem persistence via Modal snapshots when container_persistent: true.
Configure ~/.hermes/config.yaml
terminal:
  backend: modal
  modal_image: "nikolaik/python-nodejs:python3.11-nodejs20"
  container_cpu: 1
  container_memory: 5120
  container_disk: 51200
  container_persistent: true
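Before starting the gateway, it can help to sanity-check the terminal settings above. The sketch below is a hypothetical helper, not part of Hermes: it takes the `terminal` mapping as a plain dict (load it with whatever YAML parser you use) and assumes `container_memory` and `container_disk` are megabytes, which should be confirmed against the upstream Terminal backends docs.

```python
# Sanity-check the terminal settings before starting the gateway.
# Assumption: container_memory and container_disk are megabytes.
# check_terminal_config is a hypothetical helper, not a Hermes API.

def check_terminal_config(terminal):
    """Return a list of human-readable problems; an empty list means it looks sane."""
    problems = []
    if terminal.get("backend") not in {"modal", "daytona"}:
        problems.append("backend should be 'modal' or 'daytona' for serverless routing")
    for key in ("container_cpu", "container_memory", "container_disk"):
        value = terminal.get(key)
        if value is None:
            continue  # unset keys fall back to upstream defaults
        if isinstance(value, bool) or not isinstance(value, (int, float)) or value <= 0:
            problems.append(f"{key} must be a positive number, got {value!r}")
    if not isinstance(terminal.get("container_persistent", False), bool):
        problems.append("container_persistent must be true or false")
    return problems


if __name__ == "__main__":
    terminal = {
        "backend": "modal",
        "modal_image": "nikolaik/python-nodejs:python3.11-nodejs20",
        "container_cpu": 1,
        "container_memory": 5120,
        "container_disk": 51200,
        "container_persistent": True,
    }
    for problem in check_terminal_config(terminal):
        print("config problem:", problem)
    # Under the megabyte assumption, show the sizes in GB for a quick eyeball check.
    print("memory GB:", terminal["container_memory"] / 1024)
    print("disk GB:", terminal["container_disk"] / 1024)
```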
Prerequisites (upstream):
pip install modal
modal token new
hermes doctor
When Modal is wrong: sub-second local file edits on a huge monorepo checkout—cold start + image pull latency hurts. Use local or SSH backend on a machine that already has the repo.
Daytona terminal backend
Daytona routes tool execution to cloud sandboxes that hibernate when idle (per Hermes docs). Set:
terminal:
  backend: daytona
Export API key before starting gateway:
export DAYTONA_API_KEY="your_key"
# persist in ~/.hermes/.env for launchd
When Daytona fits: you want serverless persistence without managing Docker on a VPS—Hermes README positions Daytona alongside Modal for “cost nearly nothing between sessions.”
When Daytona is wrong: strict data residency that requires on-prem execution only, because cloud sandboxes fall outside that compliance scope.
Telegram: wake the agent without a fat server
Wire Telegram once (full steps in our Telegram gateway guide):
hermes gateway setup
Low-cost gateway hosts:
- $5/month VPS (1 GB RAM) running only hermes gateway + ~/.hermes/.env
- Oracle Cloud free tier (if available in your region; verify account limits)
- Home always-on Mac for experiments
Point terminal.backend to modal or daytona on that same machine—the gateway stays lightweight; heavy work wakes serverless sandboxes.
Security: set TELEGRAM_ALLOWED_USERS to numeric user IDs only, and never expose the bot without an allowlist (message @userinfobot to get your ID).
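The allowlist idea can be illustrated with a deny-by-default check. This is a sketch of the concept, not how Hermes implements it: `parse_allowed_users` and `is_allowed` are hypothetical helpers, and the comma-separated format is an assumption to verify against the upstream Telegram setup docs.

```python
# Parse a numeric allowlist and gate incoming senders on it.
# Assumption: the variable holds comma-separated numeric Telegram user IDs.
# Both functions are hypothetical helpers, not Hermes APIs.

def parse_allowed_users(raw):
    """Return a set of int user IDs; raise ValueError on non-numeric entries."""
    ids = set()
    for part in raw.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate trailing commas and blank entries
        if not (part.isascii() and part.isdigit()):
            raise ValueError(f"not a numeric Telegram user ID: {part!r}")
        ids.add(int(part))
    return ids


def is_allowed(user_id, allowed):
    """Deny by default: an empty allowlist admits nobody."""
    return user_id in allowed


if __name__ == "__main__":
    allowed = parse_allowed_users("123456789, 987654321")
    print(is_allowed(123456789, allowed))
    print(is_allowed(555, allowed))
```

Rejecting usernames such as `@alice` outright is deliberate: usernames can change hands, while numeric IDs do not.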
Compare agent frameworks in our Hermes vs OpenClaw vs OpenHuman matrix—OpenClaw excels on headless launchd leases; Hermes excels at learning loop + Modal/Daytona offload.
Eight-step runbook: near-idle compute stack
- Install Hermes: curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
- Model auth: hermes setup (BYO keys; budget models on OpenRouter for cost control).
- Pick serverless backend: Modal (modal token new) or Daytona (DAYTONA_API_KEY).
- Write ~/.hermes/config.yaml: set terminal.backend: modal or daytona with resource limits above.
- Telegram: hermes gateway setup; confirm ~/.hermes/.env.
- Thin gateway host: deploy hermes gateway install && hermes gateway start on VPS (not on expensive GPU box).
- Smoke test: message bot: “run uname -a and report”; watch Modal/Daytona dashboard for sandbox start/stop.
- Cost guardrails: set provider spend caps; schedule hermes gateway stop during vacations if gateway not needed.
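Several runbook steps leave files or environment variables behind that can be checked before the smoke test. The sketch below is a hypothetical preflight helper, not a Hermes command (`hermes doctor` remains the upstream check): it only looks for the paths and names mentioned in this article.

```python
# Preflight check for the prerequisites named in the runbook.
# `preflight` is a hypothetical helper; it does not call Hermes, Modal, or Daytona.
import os
import shutil
from pathlib import Path


def preflight(home, backend, env):
    """Return a list of missing prerequisites for the chosen backend."""
    missing = []
    hermes_dir = Path(home) / ".hermes"
    if not (hermes_dir / "config.yaml").is_file():
        missing.append("~/.hermes/config.yaml")
    if not (hermes_dir / ".env").is_file():
        missing.append("~/.hermes/.env")
    if backend == "daytona" and not env.get("DAYTONA_API_KEY"):
        missing.append("DAYTONA_API_KEY")
    if backend == "modal" and shutil.which("modal") is None:
        missing.append("modal CLI on PATH (pip install modal)")
    return missing


if __name__ == "__main__":
    for item in preflight(Path.home(), "daytona", os.environ):
        print("missing:", item)
```

Passing `home` and `env` as arguments keeps the check easy to run against a scratch directory before touching the real gateway host.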
Troubleshooting
Bot replies “hello” but tools fail with Modal auth errors
| Symptom | Fix |
|---|---|
| modal token missing | Run modal token new on gateway host user |
| Wrong Python env | hermes doctor; install modal in Hermes venv |
| Stale sandbox | Toggle container_persistent or clear Modal app logs |
Sandbox runs but Telegram silent
- Gateway not running: hermes gateway status
- Check ~/.hermes/logs/gateway.log for Telegram token errors
- Only one process may poll a bot token (Conflict: terminated by other getUpdates)
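The polling conflict in the last item can be spotted by scanning the gateway log for the error text. This is a minimal sketch: `find_poll_conflicts` is a hypothetical helper, it matches only the substring quoted above, and it makes no assumption about the rest of the log line format.

```python
# Scan gateway log lines for the Telegram polling-conflict message.
# find_poll_conflicts is a hypothetical helper; the log layout is not assumed.
from pathlib import Path

CONFLICT_MARKER = "terminated by other getUpdates"


def find_poll_conflicts(log_lines):
    """Return 1-based line numbers that contain the conflict message."""
    return [number for number, line in enumerate(log_lines, start=1)
            if CONFLICT_MARKER in line]


if __name__ == "__main__":
    log_path = Path.home() / ".hermes" / "logs" / "gateway.log"
    if log_path.is_file():
        lines = log_path.read_text(errors="replace").splitlines()
        hits = find_poll_conflicts(lines)
        print(f"{len(hits)} conflict line(s):", hits)
    else:
        print("no gateway log found at", log_path)
```

Any hits suggest a second process is polling the same bot token, for example a gateway left running on a laptop after moving to the VPS.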
Costs higher than expected
- LLM tokens dominate: switch model, shorten tool loops, use /compress in chat per upstream CLI docs
- Modal persistent disk snapshots still bill storage: trim container_disk
- Gateway VPS left at 4 GB RAM tier: downsize to 1 GB if only polling
MEDIA: file attachments fail from Modal backend
Gateway sends files from host paths—inside Modal, write to a host-mounted volume path documented in the Telegram + Docker section of upstream docs.
FAQ
hermes doctor after configuring either.
Budget automation without a fat VM
When Telegram + bursty tools are enough, a thin gateway plus Modal/Daytona hibernate beats paying for 24/7 shell hosts—leases matter only when Xcode colocation does.