Hermes Agent v0.8.0 · Source Code Analysis

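Nous Research's open-source, self-evolving AI agent. This article is based on a local git clone of the v0.8.0 source (commit d6785dc, 2026-04-12), combined with a structured reading of AGENTS.md, RELEASE_v0.8.0.md, and the 842 Python files. Every technical conclusion carries a file:line reference.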

version 0.8.0 · commit d6785dc · python files 842 · markdown files 542 · repo size 162 MB · test count ~3000 · license MIT

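01 One-sentence summary

Hermes Agent is a Python agent framework with multi-platform entry points, a single agent loop, pluggable LLM providers, pluggable terminal backends, and pluggable skills + MCP + plugins. Its key design decision is not about LLM capability. It is that "talking to the agent", "the agent running code", "the agent using tools", and "the agent's memory" are fully orthogonal: you can send a message from any entry point (CLI / Telegram / Slack / Discord / WhatsApp / Signal / email / Matrix / DingTalk / Feishu / WeChat), the same AIAgent instance handles it, code execution can be forwarded to any sandbox (local / Docker / remote SSH / Daytona / Modal / Singularity), and the model provider can be any of OpenRouter / Nous / Anthropic / OpenAI / z.ai / Kimi / MiniMax / Google AI Studio / HuggingFace / any OpenAI-compatible custom endpoint.

The model and provider can be switched at runtime with /model, with no restart and no loss of context.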

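02 How the evidence was gathered · not AI guesswork

Every technical claim in this article comes from reading the actual source and is cited as file:line. Sources of evidence:

git clone https://github.com/NousResearch/hermes-agent.git into ~/Projects/research/20260413-hermes-source-analysis/source/
Top-level repository files: AGENTS.md (20 KB developer guide), README.md, RELEASE_v0.8.0.md (merge log covering 209 PRs / 82 issues), Dockerfile, pyproject.toml, docker/entrypoint.sh
Core code files: run_agent.py (10 578 lines), cli.py (9 920 lines), gateway/run.py (8 844 lines), hermes_state.py (1 238 lines), tools/registry.py, model_tools.py

Why spell this out: the most common failure of AI-assisted source analysis is text that sounds right but is made up. This article was written read-first, write-second, never the other way round. If you find a wrong reference, treat the v0.8.0 tag on GitHub as authoritative and report it in the project's issues.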

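03 Three entry-point scripts

pyproject.toml:112-115 defines three executable entry points:

Command	Target	Purpose
hermes	hermes_cli.main:main	Interactive CLI + all `hermes *` subcommands
hermes-agent	run_agent:main	Single agent invocation (script mode, batch processing, data generation)
hermes-acp	acp_adapter.entry:main	ACP protocol server (for VS Code / Zed / JetBrains)

The core class AIAgent is defined at run_agent.py:492. The interactive CLI's HermesCLI is defined at cli.py:1574. The message gateway's GatewayRunner is defined at gateway/run.py:512. These three classes are the three keys to understanding Hermes.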

04 Overall architecture

┌─────────────────────────────────────────────┐
│              User entry points              │
│  CLI · Telegram · Slack · Discord           │
│  WhatsApp · Signal · Email · Matrix         │
│  DingTalk · Feishu · WeCom · WeChat · 22+   │
└──────────────────┬──────────────────────────┘
                   │
 [gateway/run.py + gateway/platforms/*]
                   │
                   ▼
┌──────────────────────────────────────────────────────────────────┐
│  AIAgent core (run_agent.py:492)                                 │
│  ┌──────────────┬──────────────┬──────────────┬──────────────┐   │
│  │ System prompt│ Session store│ Memory mgmt  │ Tool dispatch│   │
│  │ agent/       │ hermes_state │ agent/       │ model_tools  │   │
│  │ prompt_      │ .py (FTS5)   │ memory_      │ + tools/     │   │
│  │ builder.py   │              │ manager      │ registry.py  │   │
│  └──────────────┴──────────────┴──────────────┴──────────────┘   │
│                                                                  │
│  Main loop: run_conversation() :7528                             │
│    while api_call_count < max_iterations:                        │
│      call LLM with tool schemas                                  │
│      if response.tool_calls:                                     │
│        handle_function_call(...)                                 │
│        append tool results                                       │
│      else: return response.content                               │
└──────────────────┬──────────────────┬────────────────────────────┘
                   │                  │
               LLM layer      Tool execution layer
                   │                  │
                   ▼                  ▼
┌─────────────────────┐  ┌──────────────────────────────────────────┐
│ OpenRouter          │  │ tools/*.py (40+ tools)                   │
│ Nous Portal         │  │ ┌────────────────────────────────────┐   │
│ Anthropic           │  │ │ Terminal (terminal_tool.py)        │   │
│ OpenAI + Codex      │  │ │   ↓ 6 optional backends            │   │
│ Google AI Studio    │  │ │ local · docker · ssh ·             │   │
│ z.ai / GLM          │  │ │ modal · daytona · singularity      │   │
│ Kimi / Moonshot     │  │ └────────────────────────────────────┘   │
│ MiniMax             │  │  · Files (file_tools.py)                 │
│ HuggingFace         │  │  · Web (web_tools, browser_tool)         │
│ xAI / Grok          │  │  · Code execution (code_execution_tool)  │
│ custom (any OAI)    │  │  · Subagents (delegate_tool)             │
└─────────────────────┘  │  · MCP client (mcp_tool.py, 2195 lines)  │
                         └──────────────────────────────────────────┘

05 Repository structure map

Source: AGENTS.md lines 12-64, verified against an actual ls.

Path	Lines	Responsibility
run_agent.py	10 578	AIAgent class + main loop + retry / fallback / budget management
cli.py	9 920	HermesCLI class + interactive TUI (prompt_toolkit + Rich)
gateway/run.py	8 844	GatewayRunner + message dispatch + agent session cache
tools/mcp_tool.py	2 195	MCP client (OAuth 2.1 PKCE + dynamic tool discovery)
tools/terminal_tool.py	1 777	Terminal execution orchestration (background processes, approvals, environment switching)
hermes_state.py	1 238	SessionDB (SQLite + FTS5 full-text search)
tools/delegate_tool.py	1 103	Subagent dispatch (parallel workflows)
model_tools.py	577	Tool orchestration layer + sync→async bridge
toolsets.py	655	Toolset definitions + per-platform enablement policy
tools/registry.py	335	Singleton tool registry + dispatch
agent/	-	prompt_builder, context_compressor, prompt_caching, memory_manager, trajectory, display
hermes_cli/	-	All `hermes *` subcommands + config + skins + setup wizard
tools/	-	40+ tool files (each self-registers with the registry)
tools/environments/	3 285	6 terminal backends (base + local/docker/ssh/modal/daytona/singularity)
gateway/platforms/	-	22 messaging platform adapters
skills/	-	26 top-level categories / 78 bundled skills
acp_adapter/	-	ACP protocol service (VS Code / Zed / JetBrains integration)
cron/	-	Scheduler (jobs.py + scheduler.py)
tinker-atropos/	-	RL training submodule (Tinker + Atropos)
tests/	-	~3000 pytest tests

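06 Core: the AIAgent class

class AIAgent is defined at run_agent.py:492. The constructor accepts 50+ parameters, but a handful of fields matter most:

Field	Default	Description
model	"" (injected at runtime)	Model name, e.g. `anthropic/claude-opus-4.6`
max_iterations	90 run_agent.py:527	Upper bound on the tool-calling loop
iteration_budget	IterationBudget(90) run_agent.py:619	Budget shared by the main agent and all subagents
platform	None	"cli" / "telegram" / "discord" / ..., used to inject platform-specific formatting hints
session_id	auto-generated	Foreign key into SessionDB
enabled_toolsets / disabled_toolsets	None	Allow-list / deny-list controlling which toolsets are available
fallback_model	None	Backup used when the primary provider fails
credential_pool	None	Automatic rotation across multiple keys
skip_context_files / skip_memory	False	Skip injecting the user persona in batch / RL scenarios

Note _context_pressure_last_warned at run_agent.py:504-505: it is a class-level dict used to deduplicate "context pressure warnings" across instances. The comment gives the reason: the gateway creates a new AIAgent instance for each message, so an instance-level flag would reset every time and the state has to be shared at class level. This detail exposes the gateway's real calling pattern: logically one instance per message, but the instance is taken from a cache (see _agent_cache at gateway/run.py:577-584).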

07 Agent main loop

The main loop lives in run_conversation(), entered at run_agent.py:7528, with the core while loop at run_agent.py:7850:

while (api_call_count < self.max_iterations and self.iteration_budget.remaining > 0) or self._budget_grace_call:
    self._checkpoint_mgr.new_turn()
    if self._interrupt_requested:
        interrupted = True
        break
    api_call_count += 1
    ...
    # Call the LLM
    response = client.chat.completions.create(model=model, messages=messages, tools=tool_schemas)
    if response.tool_calls:
        for tool_call in response.tool_calls:
            result = handle_function_call(tool_call.name, tool_call.args, task_id)
            messages.append(tool_result_message(result))
    else:
        return response.content

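This simplified version follows the sketch in AGENTS.md lines 112-122. The real code spans roughly 2,000 lines between 7850 and 10213 and handles:

Interrupts (run_agent.py:7855): a new user message interrupts the current loop
Iteration budget (run_agent.py:7871): consume() spends one call from the shared budget
_budget_grace_call (run_agent.py:7869): when the budget runs out, the model gets one more call to summarize
step_callback, sent to the gateway for the agent:step hook (run_agent.py:7878-7902)
Skill usage counter _iters_since_skill (run_agent.py:7906-7908): after N turns without using a skill, the model is nudged to consider skill_manage
Message sanitization: _sanitize_messages_surrogates() (run_agent.py:356) and _sanitize_messages_non_ascii() (run_agent.py:413) handle odd Unicode that makes providers return errors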
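To make the interplay between the iteration cap, the shared budget, and the grace call concrete, here is a minimal sketch. It is not the code from run_agent.py: the IterationBudget class below is a hypothetical stand-in, the llm callable and its return shape are invented for illustration, and whether the real grace call consumes budget is an assumption.

```python
from dataclasses import dataclass


@dataclass
class IterationBudget:
    """Hypothetical stand-in for the shared budget object."""
    remaining: int

    def consume(self) -> None:
        self.remaining = max(0, self.remaining - 1)


def run_loop(llm, tools, budget, max_iterations=90):
    """Run the tool loop.

    llm(messages, allow_tools) must return a dict with "content" (str)
    and "tool_calls" (a list of (name, kwargs) pairs). This shape is an
    assumption made for the sketch.
    """
    messages, api_calls, grace = [], 0, False
    while (api_calls < max_iterations and budget.remaining > 0) or grace:
        allow_tools = not grace   # the grace call may only summarize
        grace = False
        api_calls += 1
        if allow_tools:
            budget.consume()      # assumed: the grace call is free
        reply = llm(messages, allow_tools)
        if not allow_tools or not reply["tool_calls"]:
            return reply["content"], api_calls
        for name, kwargs in reply["tool_calls"]:
            messages.append({"role": "tool", "name": name,
                             "content": tools[name](**kwargs)})
        if budget.remaining == 0 or api_calls >= max_iterations:
            grace = True          # limit reached: allow one final call
    return None, api_calls
```

With a budget of 2 and a model that always asks for tools, the loop makes two tool-using calls and then one final summarizing call.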

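Design principle: AGENTS.md lines 339-347 state explicitly that Prompt Caching Must Not Break. One iron rule of the main loop is that past context, the toolset, and the system prompt must never change mid-conversation. Breaking the cache means roughly 10x cost. The only moment context may change is automatic context compression (agent/context_compressor.py).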

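08 Tool registry (the core of the plugin design)

The Hermes tool system follows a "tool files self-register" model. Each tools/*.py calls registry.register(...) at module import time to attach itself. Here is the class definition in tools/registry.py: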

tools/registry.py:48
class ToolRegistry:
    """Singleton registry that collects tool schemas + handlers from tool files."""

    def __init__(self):
        self._tools: Dict[str, ToolEntry] = {}
        self._toolset_checks: Dict[str, Callable] = {}

    def register(
        self,
        name: str,
        toolset: str,
        schema: dict,          # OpenAI function schema
        handler: Callable,     # function that does the actual work
        check_fn: Callable = None,   # availability check (env vars, etc.)
        requires_env: list = None,
        is_async: bool = False,
        description: str = "",
        emoji: str = "",
        max_result_size_chars: int | float | None = None,
    ): ...

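Key design points:

Singleton: tools/registry.py:290 has registry = ToolRegistry(), a module-level object; any file doing from tools.registry import registry gets the same one
check_fn: get_definitions() at tools/registry.py:117-143 runs the check first, and tools that fail it never appear in the schema list sent to the LLM. This is how Hermes makes tools disappear when their API key is missing
MCP dynamic injection: tools/registry.py:95-110 provides deregister(), used for nuke-and-repave when an MCP server sends notifications/tools/list_changed
dispatch() supports both sync and async handlers: tools/registry.py:149-166; if entry.is_async, the call is bridged through model_tools._run_async()
tool_error / tool_result helpers: tools/registry.py:309-335 force every tool handler to return a JSON string, eliminating hundreds of boilerplate json.dumps() calls

Import dependency chain (to prevent cycles), tools/registry.py:7-14: tools/registry.py depends on no other file → tools/*.py depend on registry → model_tools.py depends on registry + all tool files → the upper layers run_agent.py / cli.py / batch_runner.py depend on model_tools.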
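The self-registration and check_fn ideas can be sketched in a few lines. This is an illustration under stated assumptions, not the real ToolRegistry: MiniRegistry, the echo and search tools, and the DEMO_SEARCH_KEY variable are all hypothetical, and the JSON envelope is invented for the example.

```python
import json
import os
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class ToolEntry:
    name: str
    schema: dict
    handler: Callable
    check_fn: Optional[Callable[[], bool]] = None


class MiniRegistry:
    """Toy registry: collects schemas + handlers, hides unavailable tools."""

    def __init__(self) -> None:
        self._tools: Dict[str, ToolEntry] = {}

    def register(self, name, schema, handler, check_fn=None) -> None:
        self._tools[name] = ToolEntry(name, schema, handler, check_fn)

    def get_definitions(self) -> list:
        # Only tools whose availability check passes are shown to the LLM.
        return [e.schema for e in self._tools.values()
                if e.check_fn is None or e.check_fn()]

    def dispatch(self, name: str, args: dict) -> str:
        # Always return a JSON string, including for errors.
        entry = self._tools.get(name)
        if entry is None:
            return json.dumps({"error": f"unknown tool: {name}"})
        try:
            return json.dumps({"result": entry.handler(**args)})
        except Exception as exc:
            return json.dumps({"error": str(exc)})


registry = MiniRegistry()  # module-level singleton, as in the text

# A tool file would run these calls at import time.
registry.register("echo", {"name": "echo"}, lambda text: text)
registry.register("search", {"name": "search"}, lambda q: f"results for {q}",
                  check_fn=lambda: "DEMO_SEARCH_KEY" in os.environ)
```

When DEMO_SEARCH_KEY is unset, get_definitions() returns only the echo schema, so the model never sees a tool it cannot actually use.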

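09 Tool orchestration layer (sync/async bridge)

The top of model_tools.py (model_tools.py:1-22) describes itself as a "thin orchestration layer over the tool registry". It does only two things:

Trigger tool discovery: import every tools/*.py so that each one self-registers
sync→async bridge: _run_async(), located at model_tools.py:81-120+

_run_async() rests on a carefully considered design with three event-loop strategies:

Scenario	Strategy	Why
CLI main thread (no running loop)	Persistent shared loop `_tool_loop` model_tools.py:39-56	Cached httpx/AsyncOpenAI clients need to stay bound to a live loop; `asyncio.run()` creates and closes a loop each time, which triggers "Event loop is closed" errors during GC
Gateway already inside an async stack	Spin up a one-off thread pool and call `asyncio.run()` in the new thread model_tools.py:108-113	Avoids conflicting with the existing running loop
Worker threads for parallel tool execution	Persistent per-thread loop (ThreadLocal) model_tools.py:59-78	Prevents several workers from fighting over one loop while keeping client caches valid

Reading this code makes it clear that Hermes has been hardened by real-world use: the docstring of _get_worker_loop() says outright that it exists to avoid an "Event loop is closed" bug introduced by a particular PR.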
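The first two strategies in the table can be sketched as follows. This is a simplified illustration, not the contents of model_tools.py: the function names are hypothetical, the per-thread worker loop is omitted, and the persistent loop is assumed to be used from one thread at a time.

```python
import asyncio
import concurrent.futures
import threading

_tool_loop = None              # persistent loop shared across calls
_tool_loop_lock = threading.Lock()


def get_tool_loop() -> asyncio.AbstractEventLoop:
    """Create the persistent loop once and reuse it afterwards."""
    global _tool_loop
    with _tool_loop_lock:
        if _tool_loop is None or _tool_loop.is_closed():
            _tool_loop = asyncio.new_event_loop()
        return _tool_loop


def run_async(coro):
    """Run a coroutine to completion from synchronous code."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running in this thread: reuse the persistent loop so
        # clients bound to it stay valid between calls.
        return get_tool_loop().run_until_complete(coro)
    # A loop is already running here (e.g. inside a gateway handler):
    # run the coroutine on a fresh loop in a separate thread instead.
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(asyncio.run, coro).result()
```

The key property is that repeated synchronous calls reuse one loop object instead of creating and closing a loop per call, which is the failure mode the table attributes to asyncio.run().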

10 Session storage + FTS5 search

The SessionDB class in hermes_state.py is the memory backbone of the whole agent. Locations:

hermes_state.py:32  DEFAULT_DB_PATH = get_hermes_home() / "state.db"
hermes_state.py:34  SCHEMA_VERSION = 6
hermes_state.py:115 class SessionDB

Table structure

The sessions table (hermes_state.py:41-69) stores per-session metadata:

id, source: session ID + origin ("cli" / "telegram" / "discord" / ...)
parent_session_id: parent session chain; context compression slices off a new session and keeps the parent-child link
input_tokens, output_tokens, cache_read_tokens, cache_write_tokens, reasoning_tokens: five kinds of token statistics
estimated_cost_usd, actual_cost_usd, cost_status, cost_source, pricing_version: cost tracking
billing_provider, billing_base_url, billing_mode: the billing provider is recorded separately (because the OAuth pool rotates)

The messages table (hermes_state.py:71-85) stores each message: session_id, role, content, tool_call_id, tool_calls, tool_name, timestamp, token_count, finish_reason, reasoning, reasoning_details, codex_reasoning_items.

Note that reasoning_details and codex_reasoning_items are the thinking formats of two different providers (Anthropic thinking blocks and OpenAI Codex reasoning items), each persisted separately.

FTS5 full-text search

hermes_state.py:93-112 defines the FTS5 virtual table + 3 triggers:

CREATE VIRTUAL TABLE IF NOT EXISTS messages_fts USING fts5(
    content,
    content=messages,
    content_rowid=id
);

CREATE TRIGGER messages_fts_insert AFTER INSERT ON messages BEGIN
    INSERT INTO messages_fts(rowid, content) VALUES (new.id, new.content);
END;
CREATE TRIGGER messages_fts_delete AFTER DELETE ON messages BEGIN
    INSERT INTO messages_fts(messages_fts, rowid, content) VALUES('delete', old.id, old.content);
END;
CREATE TRIGGER messages_fts_update AFTER UPDATE ON messages BEGIN
    INSERT INTO messages_fts(messages_fts, rowid, content) VALUES('delete', old.id, old.content);
    INSERT INTO messages_fts(rowid, content) VALUES (new.id, new.content);
END;

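content=messages, content_rowid=id is FTS5's external content table mode: the index does not store a second copy of the text and instead references rows of the messages table by id. This avoids storing the message text twice, roughly halving disk use for the text itself.

_sanitize_fts5_query() at hermes_state.py:938 sanitizes FTS5 queries. The reason: FTS5 has its own query syntax, so characters such as -, ", (, and ) in user input must be escaped or the query raises an error. The function's comment says explicitly: "wrap unquoted hyphenated and dotted terms in quotes so FTS5's tokenizer doesn't split them" (FTS5 tokenizes on dots and hyphens).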
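The quoting rule from that comment can be illustrated with a short sketch. This is not the body of _sanitize_fts5_query(): the function name and regex below are hypothetical, and the sketch covers only the hyphen/dot case, not parentheses or stray quote characters.

```python
import re

# A token is either an already-quoted phrase or a run of non-space chars.
_TOKEN = re.compile(r'"[^"]*"|\S+')


def quote_fts5_terms(query: str) -> str:
    """Wrap unquoted hyphenated or dotted terms in double quotes."""
    out = []
    for tok in _TOKEN.findall(query):
        already_quoted = len(tok) >= 2 and tok[0] == '"' and tok[-1] == '"'
        if already_quoted:
            out.append(tok)
        elif "-" in tok or "." in tok:
            # Inside an FTS5 string, a double quote is escaped by doubling.
            out.append('"' + tok.replace('"', '""') + '"')
        else:
            out.append(tok)
    return " ".join(out)
```

Without the quotes, FTS5 would parse the hyphen in a term like gpt-4 as query syntax rather than as part of the word the user typed.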

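High-concurrency writes

Class constants at hermes_state.py:123-136:

_WRITE_MAX_RETRIES = 15
_WRITE_RETRY_MIN_S = 0.020 (20 ms)
_WRITE_RETRY_MAX_S = 0.150 (150 ms)
_CHECKPOINT_EVERY_N_WRITES = 50 (a PASSIVE WAL checkpoint every 50 writes)

The comment explains: SQLite's built-in busy handler uses deterministic sleeps, which causes a "convoy effect" (under high concurrency, writers queue up behind one another), so Hermes retries with jitter at the application layer and lets multiple writers spread out naturally. This is careful engineering at the detail level.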
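A jittered retry of this kind might look like the sketch below. It reuses the three retry constants quoted above as default arguments, but the function itself is hypothetical: the name, the error-message matching, and the injectable sleep parameter are assumptions made for illustration.

```python
import random
import sqlite3
import time


def write_with_jitter(conn, sql, params=(), max_retries=15,
                      min_s=0.020, max_s=0.150, sleep=time.sleep):
    """Execute one write, retrying on lock contention with random delays.

    Returns the number of failed attempts before success (for the demo).
    """
    for attempt in range(max_retries):
        try:
            with conn:                     # commit on success, rollback on error
                conn.execute(sql, params)
            return attempt
        except sqlite3.OperationalError as exc:
            message = str(exc).lower()
            retryable = "locked" in message or "busy" in message
            if not retryable or attempt == max_retries - 1:
                raise
            # A random delay keeps competing writers from retrying in
            # lockstep, which is what produces the convoy effect.
            sleep(random.uniform(min_s, max_s))
```

Because each writer draws its own delay from the 20-150 ms window, two writers that collide once are unlikely to collide again on the next attempt.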

11 Message gateway

class GatewayRunner is at gateway/run.py:512. The startup entry point async def start_gateway() is at gateway/run.py:8634.

Key instance fields

adapters: Dict[Platform, BasePlatformAdapter] gateway/run.py:536: one adapter is loaded per enabled platform (telegram / discord / slack / ...)
session_store = SessionStore(...) gateway/run.py:553: session persistence, wrapping SessionDB
_running_agents: Dict[str, AIAgent] gateway/run.py:573: the live agent for each session, used for interrupts and queries
_pending_messages gateway/run.py:575: new messages sent by the user during an interrupt are queued here first
_agent_cache gateway/run.py:583: this is what makes prompt caching work. Each session caches (AIAgent, config_signature), and the same session reuses the same instance, which avoids rebuilding the system prompt and memory and keeps the Anthropic prefix cache hitting
_session_model_overrides gateway/run.py:588: model switches made with /model, one entry per session
_pending_approvals gateway/run.py:591: pending state for dangerous-command approvals
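The (AIAgent, config_signature) idea behind _agent_cache can be sketched like this. The AgentCache class, the hashing scheme, and the factory callback are hypothetical; the source only tells us that an instance and a config signature are cached per session.

```python
import hashlib
import json


class AgentCache:
    """Per-session cache that rebuilds an agent only when its config changes."""

    def __init__(self, factory):
        self._factory = factory   # callable: config dict -> agent object
        self._entries = {}        # session_id -> (agent, signature)

    @staticmethod
    def signature(config: dict) -> str:
        # Stable hash of the config; key order must not matter.
        payload = json.dumps(config, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def get(self, session_id: str, config: dict):
        sig = self.signature(config)
        cached = self._entries.get(session_id)
        if cached is not None and cached[1] == sig:
            return cached[0]      # same session, same config: reuse
        agent = self._factory(config)
        self._entries[session_id] = (agent, sig)
        return agent
```

Reusing the instance means the system prompt and message prefix stay byte-identical between turns, which is the precondition for a provider-side prefix cache hit.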

Ephemeral configuration injection

gateway/run.py:540-549:

self._prefill_messages = self._load_prefill_messages()
self._ephemeral_system_prompt = self._load_ephemeral_system_prompt()
self._reasoning_config = self._load_reasoning_config()
self._service_tier = self._load_service_tier()
self._show_reasoning = self._load_show_reasoning()
self._busy_input_mode = self._load_busy_input_mode()
self._restart_drain_timeout = self._load_restart_drain_timeout()
self._provider_routing = self._load_provider_routing()
self._fallback_model = self._load_fallback_model()
self._smart_model_routing = self._load_smart_model_routing()

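These are all marked "ephemeral, injected at API-call time only and never persisted". The pattern: keep the feature, but keep it out of the persisted session, so that changing configuration does not invalidate the cached history.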

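12 Twenty-two platform adapters

Actual file list of the gateway/platforms/ directory (checked with ls). Note that the count of 22 is a count of files and includes base.py, helpers.py, and supporting modules, not only one file per platform: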

api_server.py         base.py             bluebubbles.py    dingtalk.py
discord.py            email.py            feishu.py         helpers.py
homeassistant.py      matrix.py           mattermost.py     signal.py
slack.py              sms.py              telegram.py       telegram_network.py
webhook.py            wecom.py            wecom_callback.py wecom_crypto.py
weixin.py             whatsapp.py

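base.py defines the BasePlatformAdapter abstract base class, which the other files subclass. ADDING_A_PLATFORM.md in the same directory is the developer documentation.

A few special ones:

api_server.py: REST API entry point exposing the OpenAI-compatible /v1/* routes (this is the one you are running)
webhook.py: generic webhook (for Home Assistant and custom integrations)
wecom.py + wecom_callback.py + wecom_crypto.py: WeCom (enterprise WeChat), split into 3 files because it needs XML encryption
telegram_network.py: dedicated handling of retry / dedup / fallback IPs at the Telegram network layer (a common problem when connecting from China)
bluebubbles.py: iMessage bridge (via the BlueBubbles project)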

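13 Six terminal backends

The tools/environments/ directory (3285 lines of code in total):

Backend	File / lines	Purpose
base	base.py / 579	Abstract base class + shared utility methods
local	local.py / 314	Runs directly on the machine hosting the Hermes process (default, not safe)
docker	docker.py / 560	Starts a separate container per task and destroys it afterwards
ssh	ssh.py / 258	Remote execution (Hermes stays local, commands are forwarded to a sandbox VM)
modal	modal.py / 434 + managed_modal.py / 282	Modal.com serverless sandbox
daytona	daytona.py / 229	Daytona workspace
singularity	singularity.py / 262	HPC / academic Singularity containers (for GPU clusters)
file_sync	file_sync.py / 168	Local ↔ remote file sync layer (shared by ssh/modal)

Why this is a Hermes highlight: traditional agents tie code execution to the agent itself, so isolating one means isolating the whole process. Hermes decouples them: the agent runs on a cheap VPS, while code execution can be forwarded to an expensive Modal GPU or a remote SSH sandbox, paid for on demand. Compared with the Manus architecture discussed yesterday, Hermes is more flexible (you choose) and Manus is more hands-off (a microVM is mandatory).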

14 Skills system

Skills are Hermes's "procedural memory": markdown files that describe the steps for a class of tasks. The skills/ directory:

Top-level category	Contains
apple	apple-reminders / imessage / findmy / apple-notes / ...
research	blogwatcher / polymarket / llm-wiki / arxiv / research-paper-writing
gaming	minecraft-modpack-server / pokemon-player / ...
software-development	(several)
devops, mlops, data-science	(several)
creative, diagramming, email, feeds, gifs, github, leisure, media, mcp, note-taking, productivity, red-teaming, smart-home, social-media, autonomous-ai-agents, domain, dogfood, index-cache, inference-sh	-

26 top-level categories + 78 bundled skill files (matching the "78 total bundled" line in the entrypoint log). Each skill is a SKILL.md whose YAML frontmatter carries metadata. Example:

---
name: dogfood
description: Systematic exploratory QA testing of web applications ...
version: 1.0.0
metadata:
  hermes:
    tags: [qa, testing, browser, web, dogfood]
    related_skills: []
---

# Dogfood: Systematic Web Application QA Testing
...

Source: skills/dogfood/SKILL.md:1-11

Skill injection strategy: agent/skill_commands.py scans ~/.hermes/skills/; when the user types /skillname, the skill is injected as a user message (not into the system prompt), which keeps prompt caching intact. See AGENTS.md:135 (original text: "Skill slash commands: scans skills dir, injects as user message (not system prompt) to preserve prompt caching").

In addition, skills support "self-evolution": the separate repository hermes-agent-self-evolution uses DSPy + GEPA to do trajectory-based prompt optimization on skills.

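15 Centralized slash-command registry

Every /command is defined in COMMAND_REGISTRY in hermes_cli/commands.py (AGENTS.md:137-148). Defined once, consumed in many places:

CLI dispatch: HermesCLI.process_command() at cli.py:5192
Gateway dispatch: gateway/run.py + the GATEWAY_KNOWN_COMMANDS frozen set
Gateway help: gateway_help_lines() generates /help automatically
Telegram menu: telegram_bot_commands() generates the BotCommand array
Slack subcommands: slack_subcommand_map() generates the /hermes sub routing
Autocomplete: the flat COMMANDS dict feeds SlashCommandCompleter
CLI help: COMMANDS_BY_CATEGORY feeds show_help()

Adding a slash command only takes one new line in the CommandDef array plus an elif canonical == "mycommand": branch in the relevant process_command(). Adding an alias is even simpler: touch only the aliases tuple, and every downstream consumer picks it up automatically (dispatch, help, Telegram menu, Slack mapping, autocomplete).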
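The define-once, derive-everywhere pattern can be sketched as follows. CommandDef, COMMAND_REGISTRY, and aliases are names from the source, but the field layout, the two example commands, and the helper functions here are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class CommandDef:
    name: str
    description: str              # assumed field
    aliases: Tuple[str, ...] = ()


# Single source of truth; the entries are made-up examples.
COMMAND_REGISTRY = (
    CommandDef("model", "Switch the active model", aliases=("m",)),
    CommandDef("help", "Show available commands", aliases=("h", "?")),
)


def build_lookup(registry) -> dict:
    """Map every name and alias to its canonical command name."""
    lookup = {}
    for cmd in registry:
        for key in (cmd.name, *cmd.aliases):
            lookup[key] = cmd.name
    return lookup


def canonical(text: str, lookup: dict) -> Optional[str]:
    """Resolve '/m gpt-x' style input to a canonical name, or None."""
    parts = text.strip().lstrip("/").split()
    return lookup.get(parts[0].lower()) if parts else None


def help_lines(registry) -> list:
    """One derived consumer: help text generated from the same data."""
    return [f"/{cmd.name} - {cmd.description}" for cmd in registry]
```

Because dispatch and help are both derived from COMMAND_REGISTRY, adding an alias to one tuple updates every consumer at once.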

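16 Profiles: multi-instance support

Hermes supports multiple fully isolated instances, each with its own HERMES_HOME. The core mechanism:

_apply_profile_override() in hermes_cli/main.py
    sets the HERMES_HOME environment variable before any module is imported
    → all 119+ get_hermes_home() calls automatically scope to the active profile

Safety rules (AGENTS.md:376-420):

Never hard-code Path.home() / ".hermes" (it breaks profiles)
Use get_hermes_home() to read paths and display_hermes_home() to display them
Platform adapters call acquire_scoped_lock() when connecting, to stop two profiles from sharing the same bot token
The profile list itself is not stored under HERMES_HOME but under Path.home() / ".hermes" / "profiles", so that hermes -p coder profile list can see every profile (including inactive ones)

Five real bugs were fixed here: AGENTS.md line 427 mentions "This was the source of 5 bugs fixed in PR #3575", so Nous Research hit this pitfall themselves. Hard-coding ~/.hermes breaks profile isolation, and tests also need to mock Path.home() and set HERMES_HOME to run profile tests (the _isolate_hermes_home autouse fixture in tests/conftest.py).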
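The override-before-import mechanism can be sketched as below. get_hermes_home() and the -p flag are named in the source; the function bodies, the apply_profile_override name, and the assumption that each profile's home lives under ~/.hermes/profiles/<name> are illustrative guesses, not the real implementation.

```python
import os
from pathlib import Path
from typing import Optional


def get_hermes_home() -> Path:
    """Resolve the active home directory at call time, never at import."""
    override = os.environ.get("HERMES_HOME")
    return Path(override) if override else Path.home() / ".hermes"


def apply_profile_override(argv, env=os.environ) -> Optional[str]:
    """Set HERMES_HOME from a '-p NAME' argument, if one is present.

    Assumption: a profile's home is ~/.hermes/profiles/NAME.
    """
    for i, arg in enumerate(argv):
        if arg == "-p" and i + 1 < len(argv):
            name = argv[i + 1]
            env["HERMES_HOME"] = str(Path.home() / ".hermes" / "profiles" / name)
            return name
    return None
```

The important design point is that every path lookup goes through one function that reads the environment on each call, so setting the variable early is enough to re-scope the whole process.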

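17 Important changes in v0.8.0

Source: RELEASE_v0.8.0.md (released 2026-04-08, 209 PRs / 82 issues merged). The most important items:

A. Evidence of self-evolution

#6120: "Self-Optimized GPT/Codex Tool-Use Guidance". Original text: "The agent diagnosed and patched 5 failure modes in GPT and Codex tool calling through automated behavioral benchmarking". That sentence carries weight: the agent used its own behavioral benchmarks to find its own problems, then wrote the fixes itself. This is not the team adding prompt text; it is the agent iterating in a closed loop.

B. Automatic notification for background tasks

#5779: notify_on_complete. Start a long task (training, tests, deployment) and the agent no longer needs to poll; it is notified automatically on completion. This is the core optimization for long-running task scenarios.

C. Idle timeout replaces wall-clock timeout

#5389, original text: "Gateway and cron timeouts now track actual tool activity instead of wall-clock time". With this change a 20-minute build is no longer killed by the timeout, provided it keeps showing tool activity the whole time.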

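D. Live model switching

#5181 / #5742: /model can be used from the CLI, Telegram, Discord, and Slack. Telegram and Discord also get an inline button picker.

E. Bundled security hardening

#5944 + #5613 + #5629, the "Security Hardening Pass":

Unified SSRF protection
Timing attack mitigation
Tar traversal protection (when extracting malicious archives)
Credential leak protection
Cron path traversal hardening
Cross-session isolation
Workdir sanitization for all terminal backends

F. MCP OAuth 2.1 PKCE

#5420: full OAuth 2.1 (with PKCE) for connecting to any MCP server. #5305: automatic OSV vulnerability database scanning of MCP extension packages (against supply-chain poisoning).

G. Native Google AI Studio provider

#5577: direct Gemini connection, no longer required to go through OpenRouter. It also integrates the models.dev registry to auto-detect the context length of any provider.

H. Centralized logging + config validation

#5430: ~/.hermes/logs/agent.log + errors.log, with the hermes logs command for tailing and filtering. #5426: config.yaml structure is validated at startup.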

18 Docker runtime

Dockerfile

Source: Dockerfile:1-46 (46 lines, short). Three-stage build:

ghcr.io/astral-sh/uv:0.11.6-python3.13-trixie as the uv source Dockerfile:1
tianon/gosu:1.19-trixie as the gosu source Dockerfile:2 (gosu is a lightweight su replacement for dropping root privileges)
debian:13.4 as the final base Dockerfile:3

Key environment variables Dockerfile:7-10:

PYTHONUNBUFFERED=1: flush logs in real time
PLAYWRIGHT_BROWSERS_PATH=/opt/hermes/.playwright: "Store Playwright browsers outside the volume mount so the build-time install survives the /opt/data volume overlay at runtime". This is a classic Docker volume overlay detail: if the browsers were installed under /opt/data, the runtime mount would hide the install

APT package list Dockerfile:14-15: build-essential nodejs npm python3 ripgrep ffmpeg gcc python3-dev libffi-dev procps. Note: there is no git here. This is the pitfall we hit during deployment yesterday: npm install needs git but the Dockerfile does not install it, so the build fails and needs a sed patch.

entrypoint.sh

docker/entrypoint.sh:1-64:

root → hermes privilege drop (docker/entrypoint.sh:11-30): when started as root, it first checks and remaps the hermes user's UID/GID (via the HERMES_UID/HERMES_GID environment variables), fixes ownership of the data volume, then runs exec gosu hermes "$0" "$@"
Directory structure initialization (docker/entrypoint.sh:42): mkdir -p $HERMES_HOME/{cron,sessions,logs,hooks,memories,skills,skins,plans,workspace,home}. The final home entry is a neat touch; the comment explains: "Without it those tools [git, ssh, gh, npm] write to /root which is ephemeral and shared across profiles"
First-start seeding of default files: if .env / config.yaml / SOUL.md do not exist, they are copied from the example files in INSTALL_DIR
Sync bundled skills (docker/entrypoint.sh:60-62): python3 $INSTALL_DIR/tools/skills_sync.py works from a manifest and preserves user edits
Finally exec hermes "$@"

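19 Design insights (subjective)

A few subjective conclusions after reading the code. These are not written in the source; they are my own synthesis:

1. Modular discipline matters more than any single technique

Hermes can support 22 messaging platforms + 6 terminal backends + 11 LLM providers + 78 skills not because each one is especially well written, but because the tool registry + abstract base classes + centralized slash commands make the cost of adding one acceptably low. Look at the four abstractions BasePlatformAdapter, BaseEnvironment, ToolRegistry, and COMMAND_REGISTRY and it becomes clear how v0.8 could merge 209 PRs in one cycle.

2. The real performance wins are all about "not rebuilding"

The gateway's _agent_cache, the persistent _tool_loop, the protection of Anthropic's prompt cache: every optimization is "do not rebuild if you can avoid it". This makes Hermes roughly 10x cheaper on LLMs that are friendly to prompt caching (Anthropic) than on ones that are not. This is not theory: AGENTS.md:339-347 writes the principle down as an iron rule.

3. It decouples "where the agent runs" from "which model the agent uses"

Traditional agents tie the two together. In Hermes the terminal backend and the LLM provider are two independent abstractions: you can put the agent on a $5 VPS, forward code execution to a GPU instance on Modal, and call Anthropic Opus for the model, and all three can be swapped independently. This decoupling is a real product differentiator.

4. It admits it is "an LLM that can use tools", not "an application that has an LLM"

At heart the main loop is just while: LLM call → handle tool calls → repeat. All of Hermes's complexity sits in the tool ecosystem, platform adapters, state persistence, and security boundaries. Most of the agent loop's 2,000 lines are error handling, retries, approvals, interrupts, budgets, and streaming callbacks, not "AI logic". This fits the design philosophy that "an agent is a pattern, not an algorithm".

5. Self-evolution is not magic; it is DSPy + a benchmark loop

v0.8's "Self-Optimized GPT/Codex Tool-Use Guidance" (#6120) reveals how self-evolution is implemented: run benchmarks → identify failure modes → optimize prompts with DSPy/GEPA → feed the result back. The separate repository hermes-agent-self-evolution corroborates this. In other words, self-evolution is engineering, not mysticism.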

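20 The truth about the missing Web UI · the v0.7 → v0.8 split

By now you may have noticed something: this article talks a lot about "platform adapters" but never about a web chat interface. That is not an omission, it is a fact about v0.8: upstream Hermes v0.8 removed the Web UI entirely.

Evidence, side by side

Version	web/ subproject	hermes-web service in docker-compose.deploy.yml?	Docker images
v0.7.x (local snapshot, 2026-04-06)	✅ present	✅ yes (ports 4410:80)	hermes-agent-hermes-agent + hermes-agent-hermes-web (two containers)
v0.8.0 (the version analyzed here)	❌ removed entirely	❌ the file itself is gone	only hermes-agent (single container)

v0.8.0 release date: 2026-04-08 (see RELEASE_v0.8.0.md:3).

So what can a browser actually reach?

gateway/platforms/api_server.py:1736-1754 lists all 17 routes of the v0.8 API Server (obtained by grepping for router\.add_):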

GET    /health
GET    /v1/health
GET    /v1/models               ← OpenAI-compatible model list
POST   /v1/chat/completions     ← OpenAI-compatible chat (supports stream)
POST   /v1/responses            ← OpenAI Responses API compatible
GET    /v1/responses/{id}
DELETE /v1/responses/{id}

GET    /api/jobs                ← cron job management
POST   /api/jobs
GET    /api/jobs/{id}
PATCH  /api/jobs/{id}
DELETE /api/jobs/{id}
POST   /api/jobs/{id}/pause
POST   /api/jobs/{id}/resume
POST   /api/jobs/{id}/run

POST   /v1/runs
GET    /v1/runs/{run_id}/events ← structured event SSE stream

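0 HTML routes, 0 static routes, 0 template rendering. Visiting the root path / in a browser returns 404 directly, because the API Server has no root handler configured.

The repository still contains static web pages, but Hermes does not serve them

The v0.8 repository does still contain two HTML-related directories, but neither is loaded by the Hermes runtime:

Directory	Contents	Purpose
`landingpage/`	665 lines of HTML + 521 JS + 1178 CSS + banner/icons	Nous marketing page (hermes-agent.nousresearch.com), meant to drive downloads, not a chat UI
`website/`	Docusaurus project (docusaurus.config.ts + docs/)	Official documentation site (hermes-agent.nousresearch.com/docs), statically generated and deployed separately

In other words, these two directories are source code for upstream's own website. Hermes does not serve them, and they do not come up with docker compose.

So how do people chat with Hermes?

v0.8 hands the "conversation interface" entirely to three kinds of external entry points:

CLI terminal (the hermes command + prompt_toolkit TUI): the most direct option locally
22 messaging platforms (Telegram / Discord / Slack / WhatsApp / Signal / Email / Matrix / DingTalk / Feishu / WeCom / WeChat / ...): forwarded through the gateway
REST API (OpenAI-compatible /v1/*): works with any third-party frontend (Open WebUI / LobeChat / a self-built frontend)

Why v0.8 removed the Web UI (speculation)

The Web UI ecosystem is already mature (Open WebUI / LobeChat / AnythingLLM / ...), and any of them beats a self-built one
Removing it shrinks the maintenance surface, leaving room to focus on agent core + platform adapters + skills, the three things unique to Hermes
Since Hermes already exposes an OpenAI-compatible API, third-party frontends plug straight in, so there is no need to build a second-rate frontend
It fits the UNIX philosophy of "do one thing and do it well"; UI is not a core agent capability

So if you open hermes.milejoy.com (the instance we deployed) directly in a browser, you will not see a chat interface by default. What you see is the intro page we wrote ourselves plus an isolating Basic Auth layer. If you want a chat UI, the options are to build your own chat frontend that calls /v1/*, or to run an Open WebUI container pointed at the Hermes API.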

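21 Deployment in practice · LXC + Docker double isolation on a company VPS

This article is not only a source-code analysis but also the record of a real deployment. The goal was to run Hermes v0.8 on a company VPS that already hosts other services (LobeChat + PostgreSQL + RustFS and others), without letting Hermes accidentally touch its neighbors. Below are the actual setup and the pitfalls we hit. Every item comes from a real debugging session.

Final architecture

Internet
   │
Nginx + Let's Encrypt (443)
   │   Basic Auth (boss/mile)
   │   proxy_set_header Authorization "Bearer …"
   ▼
┌──────────────────────────────────────────────────────┐
│ Incus LXC container hermes-box (Debian 13)           │
│   security.privileged = true                         │
│   security.nesting = true                            │
│   cpu=4 · mem=6GB · snapshot fresh                   │
│   IP: 10.146.223.10 (incusbr0, NAT)                  │
│                                                      │
│   ┌───────────────────────────────────────┐          │
│   │ Docker container hermes-agent         │          │
│   │ image: hermes-agent:latest (2.5GB)    │          │
│   │ privileged + seccomp:unconfined       │          │
│   │ apparmor:unconfined                   │          │
│   │ 0.0.0.0:8642 (API Server)             │          │
│   │ Hermes v0.8 + Poe (custom provider)   │          │
│   │ 78 bundled skills                     │          │
│   └───────────────────────────────────────┘          │
└──────────────────────────────────────────────────────┘
LobeChat (same host, different bridge network, fully isolated)

Why LXC (and not just Docker)

The problem with a plain docker run: Docker's default bridge has some visibility into the host network, so a wrong command can reach neighboring containers (such as LobeChat's PostgreSQL port). What we wanted:

Separate namespaces: Hermes cannot see host processes, the host filesystem, or neighboring containers
Separate network: Hermes's docker0 and the host's docker0 are two different bridges
One-command rollback: if something goes wrong, incus restore hermes-box fresh returns the whole environment to baseline in seconds
One-command teardown: incus delete hermes-box --force leaves nothing behind
Consistent ops stack: same stack as the same operator's earlier HiClaw LXC deployment, so one less technology to learn

Incus uses Debian 13 as the rootfs, and security.nesting=true allows Docker to run inside the container, forming three layers: LXC → Docker → Hermes runtime.

Quick reference: 9 pitfalls

This is not a list of complaints. It is meant to save four hours for the next person attempting the same thing.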

Pitfall 1 · The upstream Dockerfile does not install git

The apt package list at Dockerfile:14-15:

build-essential nodejs npm python3 ripgrep ffmpeg gcc python3-dev libffi-dev procps

No git. But the npm install stage needs git to fetch git-type dependencies, and the build fails at step 17: npm ERR! syscall spawn git / errno ENOENT.

Fix: patch one line with sed

sed -i 's/procps/procps git/' Dockerfile

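Pitfall 2 · Inline comments in a docker-compose env_file

The .env file contained:

TELEGRAM_BOT_TOKEN=          # fill in once you get it from @BotFather

As a result docker compose treated the whole "# fill in once you get it from @BotFather" text as the value of the token. Hermes then tried to connect to Telegram with it, got telegram.error.InvalidToken, and the container went into a crash loop.

Fix: move every inline comment onto its own line. In our setup the env_file parser only recognized # at the start of a line as a comment.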

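Pitfall 3 · Binding API_SERVER to 0.0.0.0 requires API_SERVER_KEY

Log: Refusing to start: binding to 0.0.0.0 requires API_SERVER_KEY. This is a hard constraint in Hermes (gateway/platforms/api_server.py) that prevents a keyless API server from being exposed externally. Fix: generate a random key and put it in .env: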

API_SERVER_KEY=$(openssl rand -hex 32)

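Pitfall 4 · docker compose restart does not re-read env_file

After editing .env, docker compose restart still uses the old environment variables. You need down && up -d for the file to be re-read. This is by design in Docker Compose, not a bug, but it can easily cost you 10 minutes of debugging.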

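Pitfall 5 · The Incus container gets no IPv4 address over DHCP

The IPv4 column in incus list stays empty and the container's eth0 only has IPv6. In theory dnsmasq on incusbr0 should hand out an IPv4 address; in practice DHCPDISCOVER goes out but no OFFER comes back. This may be related to the host's existing Docker FORWARD DROP policy + UFW rules; iptables -I FORWARD did not fix it.

Fix: skip DHCP and write a static IP into /etc/systemd/network/eth0.network inside the container: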

[Match]
Name=eth0

[Network]
Address=10.146.223.10/24
Gateway=10.146.223.1
DNS=8.8.8.8

Pitfall 6 · spawn sh EACCES from BuildKit under Docker-in-LXC

When running docker compose build inside the LXC container, the postinstall step of some native npm packages (better-sqlite3) fails with:

npm ERR! code EACCES
npm ERR! syscall spawn sh
npm ERR! path /opt/hermes/node_modules/better-sqlite3

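Fixes that were tried and did not work: security.privileged=true, security.nesting=true, security.syscalls.intercept.*, raw.lxc: lxc.apparmor.profile=unconfined, DOCKER_BUILDKIT=0 (the legacy builder then trips over --chmod in the Dockerfile).

The fix that finally worked: do not build inside LXC. Instead:

On the host, run docker build -t hermes-agent:latest ./source directly (known to succeed)
docker save hermes-agent:latest | gzip -1 > /tmp/hermes.tar.gz (~2.5GB)
incus file push /tmp/hermes.tar.gz hermes-box/tmp/
incus exec hermes-box -- docker load -i /tmp/hermes.tar.gz
Change the compose file inside LXC from build: to image: hermes-agent:latest
Once the image is loaded, remove it from the host with docker rmi hermes-agent:latest (keeps the host clean; only the LXC container has it)

The build runs on the host and the runtime stays in LXC, so the isolation boundary is not weakened at all. The host only holds a short-lived tar file.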

Pitfall 7 · Python socket.socketpair() fails with EACCES at Hermes startup

The container crashes right after starting, with this Python traceback:

File "/usr/lib/python3.13/asyncio/selector_events.py", line 120, in _make_self_pipe
  self._ssock, self._csock = socket.socketpair()
PermissionError: [Errno 13] Permission denied

Cause: Docker's default seccomp profile blocks socketpair() when nested inside LXC.

Fix: add the following to the Hermes container in docker-compose.yml:

    privileged: true
    security_opt:
      - seccomp:unconfined
      - apparmor:unconfined

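Pitfall 8 · Path changes around the official landing/website in v0.8

The v0.7 docker-compose.deploy.yml references ./web as a separate service; in v0.8 that directory no longer exists (see section 20). If you still build with the old compose file, the very first step fails with: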

unable to prepare context: path "./source/web" not found

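Fix: keep only the hermes-agent service in the compose file and expose port 8642.

Pitfall 9 · Hermes v0.8 rejects browser CORS requests by default

This was the key debugging moment. After the chat UI was written and deployed, curl against /v1/chat/completions returned 200, but opening the UI in a browser and pressing send returned HTTP 403.

Root cause: the CORS middleware at gateway/platforms/api_server.py:183-201 rejects requests carrying an Origin header by default (curl sends no Origin and passes; the browser sends Origin on these requests and gets blocked). _origin_allowed() at api_server.py:393-401 checks self._cors_origins: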

def _origin_allowed(self, origin: str) -> bool:
    if not origin:
        return True                       # ← non-browser clients such as curl: allow
    if not self._cors_origins:
        return False                      # ← browser request, no allow-list configured: block
    return "*" in self._cors_origins or origin in self._cors_origins

Fix: add one line to .env

API_SERVER_CORS_ORIGINS=https://hermes.milejoy.com

Separate multiple origins with commas. The code reads the env variable at api_server.py:322-323. After the change, run docker compose down && docker compose up -d so the env file is re-read.
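How a comma-separated allow-list interacts with the origin check can be sketched as follows. The check mirrors the logic of _origin_allowed() quoted above; the parsing function is an assumption about how the env value is split, and both function names are hypothetical.

```python
def parse_cors_origins(raw):
    """Split a comma-separated env value into a set of origins.

    Assumption: surrounding whitespace and empty items are ignored.
    """
    return {item.strip() for item in (raw or "").split(",") if item.strip()}


def origin_allowed(origin, allowed) -> bool:
    """Same decision order as the quoted _origin_allowed()."""
    if not origin:
        return True       # no Origin header: non-browser client
    if not allowed:
        return False      # browser request but nothing configured
    return "*" in allowed or origin in allowed
```

With an empty allow-list every browser request is refused, which matches the 403 seen before API_SERVER_CORS_ORIGINS was set.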

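Security impact analysis

Dimension	Effective isolation	Protection lost
Filesystem	✅ Hermes cannot see /opt on the host or in neighboring containers	root inside the container can mount / chroot
Processes	✅ Fully separate PID namespace	-
Network	✅ Separate bridge 10.146.223.0/24, not routable to LobeChat's docker0; the host Nginx reverse proxy is the only way in	-
seccomp / AppArmor	-	❌ Both unconfined (so that socket.socketpair works)
Kernel vulnerabilities	-	❌ Shares the host kernel; a kernel exploit can break through to the host
Teardown	✅ `incus delete --force` wipes the whole container + docker layer + volumes in seconds	-

This is sufficient for a single-user internal scenario. If you need what Manus does ("a separate sandbox per user to defend against malicious prompts"), the shared kernel in this setup becomes the pain point. At that stage you should use Firecracker / Kata Containers / gVisor, not LXC.

Comparison with Manus (Firecracker)

Dimension	Manus public service	This Hermes deployment
Isolation technology	One Firecracker microVM per session	One shared, long-lived LXC container
Startup time	~1-2s (microVM cold start)	~0.5s (the LXC container is already running)
Kernel isolation	✅ (separate kernel)	❌ (shares the host kernel)
User model	Multi-tenant, strangers	Single user / internal team
Cost model	Pay per session	Fixed resource reservation
Operational complexity	Needs an orchestrator + session lifecycle management	`incus` commands + snapshots

Conclusion: "the threat model determines the isolation technology". Do not assume you need Firecracker just because Manus uses it. Ask yourself these three questions first:

Will strangers use your agent? (No → LXC is enough)
Will the agent run user-supplied code? (No → LXC is enough)
Is a kernel exploit a realistic threat for you? (No → LXC is enough)

Final deliverable

The company instance after deployment:

URL: hermes.milejoy.com
Access: HTTP Basic Auth (two accounts: boss / mile)
UI: self-built glassmorphism chat frontend (Cormorant Garamond in antique gold + streaming SSE + Markdown + tool progress chips + localStorage history)
API: /v1/chat/completions, OpenAI-compatible; Nginx injects the Bearer token automatically, so the browser holds no secret
Backend: Poe GPT-5 (primary) + OpenRouter (backup)
Skills: 78 bundled + Hermes's own agent memory / learning loop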

· Hermes Agent v0.8.0 Source Code Analysis · 2026-04-13 ·
Source evidence: ~/Projects/research/20260413-hermes-source-analysis/source/ (git commit d6785dc) ·
Upstream repository: github.com/NousResearch/hermes-agent · Official docs: hermes-agent.nousresearch.com/docs · Running instance: hermes.milejoy.com