Merge pull request 'feat(cli/dev/envdev): add Docker installation and configuration tasks for Linux environments' (#1) from develop into main

Reviewed-on: #1
feat(cli/dev/envdev): add Docker installation and configuration tasks for Linux environments
2026-07-02 05:26:08 +00:00 · 2026-07-02 10:58:12 +08:00 · 2026-06-28 21:38:37 +08:00 · 2026-06-28 21:38:18 +08:00 · 2026-06-28 20:30:54 +08:00 · 2026-06-28 20:30:17 +08:00
49 changed files with 4954 additions and 1076 deletions
@@ -10,3 +10,4 @@ wheels/
 .venv
 .coverage
 .idea
+*_profile.html
@@ -0,0 +1,15 @@
+# PYTHON
+.coverage
+.pytest_cache/
+.ruff_cache/
+.tox/
+.venv/
+__pycache__/
+
+# NODEJS
+node_modules/
+
+# IDE
+.idea
+.trae
+.vscode
@@ -0,0 +1,11 @@
+---
+alwaysApply: true
+scene: git_message
+---
+
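+Write rules here to customize the style of AI-generated commit messages.
+
+## Commit message format
+- Commit messages must be written in Chinese.
+- Commit messages must state the type of change (for example "fix", "feat", "refactor").
+- Commit messages must be as concise and clear as possible, no longer than one paragraph.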
@@ -0,0 +1,157 @@
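+# Python Development Standards
+
+These standards draw on Python best practices and serve as the single reference for writing and reviewing Python code.
+Detailed how-to guides live in the matching skills under `.agents/skills/`.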
+
+## Toolchain (pyproject.toml is authoritative)
+
+| Tool | Purpose | Key settings |
+|------|------|---------|
+| **ruff** | lint + format | `line-length=120`, `target-version="py38"` |
+| **pyrefly** | type checking | `preset="strict"`, `python-version="3.8"` |
+| **pytest** | testing | `asyncio_default_fixture_loop_scope="function"`, marker `slow` |
+| **coverage** | coverage | `branch=true`, `fail_under=95`, `concurrency=["thread"]` |
+| **pre-commit** | pre-commit checks | ruff `--fix` + trailing-whitespace + end-of-file-fixer |
+
+Verification (required after every change):
+
+```bash
+uvx --from pyflowx pymake tc
+uvx --from pyflowx pymake cov
+```
+
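+## Compatibility
+
+- **Minimum Python 3.8**: use `from __future__ import annotations` to defer annotation evaluation;
+  by version, use `typing.List` (3.8) → built-in generics (3.9) → `X | Y` (3.10) → `typing.override` (3.12).
+- **Version guards**: gate newer APIs behind `if sys.version_info >= (3, X):`; mark the fallback branch for older versions with `# pragma: no cover`.
+- **Zero runtime dependencies**: standard library only (3.8 needs `graphlib_backport` and `typing-extensions`).
+  Add new dependencies with caution and prefer the standard library.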
+
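+## Type annotations
+
+- **Public APIs must be fully annotated**, including return types; private functions should be annotated too.
+- Use `TypeVar` for generics; PEP 696 `default=` is supported by the standard library only on 3.13+, so use `typing_extensions.TypeVar` on 3.8–3.12.
+- Use `Mapping`/`Sequence` for read-only parameters and `dict`/`list` for mutable return values.
+- Use `Any` only for genuinely dynamic cases (such as the heterogeneous cross-task mapping in `Context`); types inside a task must be fully static.
+- Bare `# type: ignore` is forbidden; when one is truly needed, add the specific rule code (for example `# type: ignore[union-attr]`).
+- **`TYPE_CHECKING` guard**: put imports needed only for type checking inside an `if TYPE_CHECKING:` block to avoid circular dependencies.
+- **Type narrowing**: use `assert isinstance(x, Y)` to help pyrefly infer types; reserve `cast()` for cases the type system cannot express.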
+
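+## Data structures
+
+- **Prefer immutability**: use `@dataclass(frozen=True)` for configuration and descriptor classes; mark mutable class attributes with a `RUF012` exemption.
+- **Caching**: use `functools.cached_property` per instance and `functools.lru_cache` when keyed by arguments;
+  unhashable arguments need a try/except fallback. After modifying a cached data source, clear the cache manually.
+- **Abstract base classes**: define interfaces with `abc.ABC` + `@abstractmethod` (for example `StateBackend`).
+- **Enums**: use `enum.Enum` for status and flag values (for example `TaskStatus`); bare strings and magic numbers are forbidden; name enum members in `UPPER_SNAKE`.
+- **`__repr__`**: mutable classes implement `__repr__` (including key fields); `frozen=True` dataclasses generate it automatically.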
+
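+## Modules and imports
+
+- **Single responsibility**: each module does one thing (`task.py` data structures, `executors.py` execution, `command.py` commands, `compose.py` composition). Do not cross responsibility boundaries.
+- **Import order** (ruff isort): `__future__` → standard library → third party → local, with a blank line between groups.
+- **Lazy imports**: use only to break circular dependencies, import inside the function body, and explain why in a comment; top-level imports are the default.
+- **`__all__`**: define `__all__` to declare exported symbols explicitly, placed right after the `__future__` import.
+- **No star imports**: `from x import *` pollutes the namespace and breaks type checking (aggregation in `__init__.py` controlled through `__all__` is the exception).
+- **Avoid `utils.py`/`helpers.py`**: file code into the module that owns the responsibility.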
+
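+## Function design
+
+- **Module-level functions over mixins**: put shared logic in module-level functions; classes hold only state and thin methods.
+- **Use static methods sparingly**: put pure functions at module level.
+- **At most 5 parameters** is a good target; beyond that, wrap them in a dataclass parameter object.
+- **Single responsibility**: a function does one thing; consider splitting functions that grow too long.
+- **Keep exception scope narrow**: catch only expected exceptions (for example `(TypeError, ValueError, KeyError, AttributeError)`);
+  an `except Exception` that masks bugs is **forbidden**; after catching, log at least with `logger.warning`.
+- **Mutable default arguments**: `def f(x=[])` is a classic pitfall; use a `None` sentinel or `field(default_factory=list)`.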
+
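+## Exception handling
+
+- **Custom exception family**: inherit from a common base class (for example `PyFlowXError`) and classify by error scenario.
+- **Exception wrapping**: `raise NewError(...) from exc` preserves the causal chain.
+- **Do not swallow exceptions**: handle whatever you catch (log, wrap, or re-raise); an empty `except: pass` is forbidden.
+- **Hook/callback exceptions**: exceptions from third-party callbacks are only logged and do not affect the main flow.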
+
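+## Concurrency and thread safety
+
+- **Process-global state** (`os.environ`/`os.chdir`) must be serialized with a global lock (`threading.RLock`) under concurrency.
+- **Condition evaluation must be free of mutable state**: combined conditions (NOT/AND/OR) must not modify the shared `_reason`, to avoid races.
+- **Batch I/O**: replace repeated disk writes inside a loop with a single batched write (wrap in a `contextmanager` that defers the flush).
+- **Semaphore throttling**: `concurrency_key` + `Semaphore` limits concurrency per group.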
+
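+## Testing
+
+See the `.agents/skills/pyflowx-testing` skill for the detailed guide. Hard constraints:
+
+- **Coverage ≥ 95%** (branch coverage); it must not drop.
+- **Test through the public API first**: use public interfaces (`has`/`get`) and do not access private methods;
+  scenarios such as fault injection may temporarily access private attributes, with the reason stated in the docstring.
+- **Naming**: `test_<unit_under_test>_<scenario>`.
+- **Assertions**: plain `assert x == 1`; `self.assertEqual` is forbidden; `pytest.raises` must pass `match=`.
+- **Mock priority**: `monkeypatch` > inline stub > `unittest.mock` > `pytest-mock`.
+  The `@patch` decorator, the `mock.patch.object` context manager, and the `mocker` fixture from `pytest-mock` are forbidden.
+- **Fixtures**: prefer `tmp_path`/`monkeypatch`/`capsys`; use autouse only when globally required.
+- **slow marker**: mark slow tests with `@pytest.mark.slow`; CI can skip them with `-m "not slow"`.
+- **Test code is linted by ruff too**: `tests/**` ignores `ARG001`/`ARG002`.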
+
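+## Code style
+
+- **Line width 120** (handled by the ruff formatter).
+- **Docstrings**: required on public APIs; Chinese prose and Chinese comments are this project's established style.
+- **Printing and logging**: write printed output and log messages in Chinese, not English.
+- **Naming**: `snake_case` for functions/variables, `PascalCase` for classes, `UPPER_SNAKE` for constants, `_leading_underscore` for private names.
+- **String quotes**: double quotes, the ruff default.
+- **A single trailing `\n`** and **no trailing whitespace** (enforced by pre-commit).
+- **No emoji**: unless the user explicitly asks for them.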
+
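+## Pythonic style
+
+- **Compare `None`/`True`/`False` with `is`**: use `is` for singletons and `==` for values (pycodestyle E711/E712).
+- **EAFP over LBYL**: try first and handle the exception, rather than checking before acting (this avoids a race window).
+- **Truthiness**: `if items:` over `if len(items) > 0:`.
+- **String formatting**: prefer f-strings; use `%` only for lazy formatting in `logging`.
+- **Comprehensions** over `map`+`filter`; split anything nested more than 2 levels into explicit loops.
+- **`enumerate`** instead of `range(len())`; **`zip`** for parallel iteration (pass `strict=True` on 3.10+).
+- **Unpacking** `a, b = pair` over index access; use `_` for ignored values.
+- **Walrus operator `:=`** (3.8+): combines assignment and test, but do not overuse it.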
+
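+## Logging
+
+- **`logging.getLogger(__name__)`**: one logger per module; leftover debugging `print` calls are forbidden.
+- **Structured context**: pass fields with `extra={...}`; `logger.warning("task %r failed: %s", name, exc)` beats an f-string (lazy formatting).
+- **Log levels**: `DEBUG` diagnostics / `INFO` key milestones / `WARNING` recoverable errors / `ERROR` needs human intervention.
+- **Never log passwords or keys**: redact before logging.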
+
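+## Paths and resources
+
+- **Prefer `pathlib.Path`**: `Path("a") / "b"` rather than `os.path.join` (enforced by ruff `PTH`);
+  building paths by string concatenation is forbidden. Annotate with `Path` and wrap any `str` at the boundary immediately.
+- **`with` statements**: files, locks, connections, and temporary directories always go through `with` or `contextlib.contextmanager`;
+  use `contextlib.ExitStack` for multiple resources.
+- **Explicit close**: long-lived objects (connection pools, thread pools) implement `close()`, but prefer `with`.
+- **Batch operations**: replace repeated acquire/release inside a loop with a single batched operation.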
+
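+## Security
+
+- **No `eval`/`exec`**: never use them on untrusted input; use `ast.literal_eval` or a dedicated parser.
+- **`subprocess`**: `shell=True` is forbidden unless the command is fully trusted; prefer the `list[str]` form.
+- **Keep credentials out of the repository**: put keys/tokens/passwords in `.env` or environment variables; `.gitignore` must include `.env`.
+- **Log redaction**: strip fields such as `Authorization` and `password` when logging requests/responses.
+- **Dependency audit**: review newly added dependencies after `uv lock` to avoid packages with known CVEs.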
+
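+## Performance notes
+
+- **Avoid repeated computation**: cache lookups done inside loops or prebuild a mapping (for example `{name: spec}`).
+- **Avoid double lookups**: replace `has(k)` + `get(k)` with a single `get(k)` plus a `KeyError` fallback.
+- **Validate once**: validate at the entry point and do not repeat it downstream (for example `run()` calls `validate()` once and `layers()` does not repeat it).
+- **Event emission**: the task lifecycle must emit `RUNNING` → `SUCCESS`/`FAILED`/`SKIPPED`;
+  do not leave dead branches (`# pragma: no cover` is a cleanup signal: activate the branch or delete it).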
+
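+## Git and commits
+
+- **No automatic commit/push**: unless the user explicitly asks.
+- **Do not modify git config**.
+- **Do not run destructive commands** (`push --force`/`reset --hard`/`clean -f`) unless the user explicitly asks.
+- **Staging**: add files by name; do not use `git add -A`/`git add .`, to avoid staging sensitive files by accident.
+- **Commit message**: concise, focused on the "why" rather than the "what"; follow the repository's existing style.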
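
Review note: the function-design and exception-handling rules in this file (narrow `except`, log before re-raising, `raise ... from exc`, `None` sentinel) can be illustrated with a small sketch. `ConfigError` and `parse_port` are hypothetical names invented for this illustration; they are not part of PyFlowX.

```python
from __future__ import annotations

import logging

logger = logging.getLogger(__name__)


class ConfigError(Exception):
    """Hypothetical domain error, used only for this illustration."""


def parse_port(raw: str | None, default: int = 8080) -> int:
    """Parse a port number, wrapping low-level failures in ConfigError."""
    if raw is None:  # None sentinel instead of a mutable or magic default
        return default
    try:
        port = int(raw)
    except (TypeError, ValueError) as exc:  # only the failures we expect
        logger.warning("invalid port %r: %s", raw, exc)
        raise ConfigError(f"invalid port: {raw!r}") from exc  # keep the causal chain
    if not 0 < port < 65536:
        raise ConfigError(f"port out of range: {port}")
    return port
```

The caller sees a single domain exception, while `__cause__` still points at the original `ValueError` for debugging.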
@@ -0,0 +1,135 @@
+---
+name: "pyflowx-testing"
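+description: "Test-writing conventions and mock usage guide for the PyFlowX project. Invoke when writing or reviewing tests, choosing a mock tool, designing fixtures, or handling asyncio tests."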
+---
+
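+# PyFlowX Testing Standards
+
+This skill expands the testing section of `.trae/rules/python-standards.md` in detail.
+The rules file keeps only pointers to the hard constraints; this file provides the full how-to guide.
+
+## General rules
+
+- **Coverage ≥ 95%** (branch coverage); it must not drop.
+- **Test through the public API first**: tests use public interfaces (`has`/`get`) and do not access private methods
+  (such as `_expired`). Private methods kept only for legacy tests should be deleted and their tests migrated.
+  Exception: when internal state such as `_store`/`_flush` cannot be triggered through the public API (for example simulated expiry or
+  fault injection), a test may temporarily access private attributes and must state the reason in its docstring.
+- **Naming**: `test_<unit_under_test>_<scenario>`, for example `test_storage_key_cache_key_exception_returns_name`.
+- **One assertion focus per test**; multiple assertions must be semantically related.
+- **slow marker**: mark slow tests with `@pytest.mark.slow`; CI can skip them with `-m "not slow"`.
+- **Test code is linted by ruff too**: `tests/**` ignores `ARG001`/`ARG002` (unused fixture arguments).
+- **Assertion style**: use plain `assert` with comparison operators (`assert x == 1`),
+  not `self.assertEqual`; pytest produces a clearer diff.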
+
+## Choosing a mock tool (mandatory)
+
+**Priority**: `monkeypatch` > inline stub > `unittest.mock` > `pytest-mock`.
+
+| Scenario | Tool | Example |
+|------|------|------|
+| Replace a module attribute / environment variable / working directory | `monkeypatch` | `monkeypatch.setattr(subprocess, "run", fake_run)` |
+| Temporarily set `os.environ["KEY"]` | `monkeypatch.setenv` | `monkeypatch.setenv("LOCALAPPDATA", "C:\\...")` |
+| Switch cwd | `monkeypatch.chdir` | `monkeypatch.chdir(tmp_path)` |
+| One-off stub function | inline lambda / closure | `ran = []; monkeypatch.setattr(subprocess, "run", lambda *c, **__: ran.append(c))` |
+| Complex spy (records call count/arguments/return sequence) | `unittest.mock.MagicMock` | only when a lambda cannot express it |
+| `with patch(...)` context | **forbidden** (use monkeypatch) | monkeypatch's automatic teardown is safer |
+
+**Forbidden**:
+- Do not use the `mocker` fixture from `pytest-mock` (the project declares it as a dev dependency, but the
+  test code does not actually use it; to keep the style uniform, new code continues to use `monkeypatch`).
+- Do not use the `unittest.mock.patch` decorator (`@patch("x.y")`); it hides dependencies and
+  fits poorly with the pytest fixture pattern; use `monkeypatch.setattr` instead.
+- Do not use `mock.patch.object` as a context manager, unless the code under test is itself a
+  contextmanager (even then `monkeypatch.setattr` is still simpler).
+
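+## monkeypatch usage rules
+
+- **Type annotation**: annotate the fixture parameter as `monkeypatch: pytest.MonkeyPatch`.
+- **Scope**: monkeypatch undoes its changes automatically when the test ends; restoring by hand with
+  `monkeypatch.setattr(x, "y", original)` is **forbidden** (redundant and easy to forget).
+  Exception: to restore midway through a single test, call `monkeypatch.undo()`, which reverts everything.
+- **Patch target**: replace "the object the code under test sees", not the global object itself.
+  - Wrong: `monkeypatch.setattr("os.path.exists", fake)` replaces the global and affects other modules.
+  - Right: `monkeypatch.setattr(pyflowx.command.shutil, "which", fake)` replaces
+    the `shutil.which` referenced by the module under test.
+- **Attribute vs string path**: prefer the attribute form `monkeypatch.setattr(obj, "attr", val)`
+  over the string-path form `monkeypatch.setattr("pkg.mod.obj.attr", val)`;
+  the former supports IDE navigation and refactoring.
+- **Recording calls**: use a closure, `ran: list[tuple] = []` + `lambda *a, **k: ran.append((a, k))`,
+  instead of `MagicMock`; it reads better and needs no import.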
+
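+## Stub and spy patterns
+
+- **Lightweight stub**: define `class MockResult: returncode = 0; stdout = ""` inline
+  instead of `MagicMock(return_value=...)`; the type is explicit and no mock dependency is introduced.
+- **State collection**: a closure + list is easier to assert on than `mock.call_args_list`: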
+  ```python
+  calls: list[list[str]] = []
+
+
+  def fake_run(cmd: list[str], **_: Any) -> MockResult:
+      calls.append(cmd)
+      return MockResult()
+
+
+  monkeypatch.setattr(subprocess, "run", fake_run)
+  assert calls == [["clear"]]
+  ```
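+- **Side-effect sequences**: to return a different value on each call, use `itertools.cycle` or
+  a manual counter, rather than `side_effect=[...]` (a mock-only API).
+- **Exception injection**: `def raise_oserror(*a, **k): raise OSError("...")`,
+  verified with `pytest.raises(OSError, match="...")`, rather than `side_effect=OSError`.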
+
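+## Exception assertions
+
+- **`pytest.raises`**: `match=` with a regex is required (unless the exception message is entirely unpredictable),
+  to avoid accidentally catching a different exception of the same type: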
+  ```python
+  with pytest.raises(StorageError, match="cannot write"):
+      b.save("a", 1)
+  ```
+- **Exception chain**: to verify `__cause__`, inspect `exc_info.value.__cause__`
+  and confirm that the `raise X from Y` chain is intact.
+- **Forbidden**: `try/except + assert False`; use `pytest.raises` instead.
+
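+## Fixture rules
+
+- **`tmp_path`**: for temporary files, cleaned up automatically; managing directories by hand with `tempfile.mkdtemp()` is forbidden.
+- **`monkeypatch`**: mocks environment variables, cwd, and module attributes (see above).
+- **`capsys`/`capfd`**: capture stdout/stderr to verify logs or command output.
+- **autouse fixtures**: use only when globally required (for example `packtool_tmp_workdir` in `conftest.py`,
+  which switches to tmp_path automatically); otherwise declare the parameter explicitly.
+- **Fixture naming**: `snake_case`, describing "what it provides" rather than "what it tests"
+  (`sample_graph` over `test_data`).
+- **Fixture scope**: `function` by default; use `module`/`session` only when construction is expensive and
+  the fixture is read-only, and add a comment stating that it has no side effects.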
+
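+## asyncio tests
+
+- **Fixture `loop_scope="function"`** (the default is already configured in pyproject).
+- **Async tests**: `async def test_x():`, driven automatically by pytest-asyncio.
+- **Await checks**: tests of async functions must `await` the result; merely verifying that a coroutine object is returned is forbidden.
+- **Async mocks**: use `AsyncMock` (in `unittest.mock` since 3.8) or
+  `async def fake(): return value`; `MagicMock(return_value=coro)` is forbidden.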
+
+## Parametrization
+
+- **`@pytest.mark.parametrize`**: pass `ids` to provide readable identifiers:
+  ```python
+  @pytest.mark.parametrize(
+      ("strategy", "expected_workers"),
+      [("sequential", 1), ("thread", 8), ("async", 1)],
+      ids=["seq", "thread-8", "async"],
+  )
+  ```
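+- **Parameter naming**: give the parameter tuple meaningful names rather than `("a", "b")`.
+- **Combinatorial explosion**: split the test when there are more than 20 parameter combinations, so no single test function becomes bloated.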
+
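+## Test organization
+
+- **File naming**: `test_<module_under_test>.py` (`test_storage.py` corresponds to `storage.py`).
+- **Class grouping**: group with `class TestXxx:` only when the test logic is strongly related; default to module-level functions.
+- **Docstrings**: one sentence per test function stating "which scenario is tested"; add the "why" for complex scenarios.
+- **setup/teardown**: prefer fixtures; use `setup_method`/`teardown_method` only when
+  a fixture cannot express the need (rare).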
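
Review note: the stub and spy guidance in this skill (closure + list for call recording, an ordered sequence of return values without `side_effect=[...]`) might look like the sketch below. `FakeResult` and `make_fake_run` are hypothetical helpers written for this note. The sketch uses `iter` + `next`, which yields each value once and raises `StopIteration` when exhausted, whereas `itertools.cycle` would repeat the sequence indefinitely.

```python
from __future__ import annotations

from typing import Any, Iterator


class FakeResult:
    """Lightweight stand-in for subprocess.CompletedProcess."""

    def __init__(self, returncode: int) -> None:
        self.returncode = returncode


def make_fake_run(returncodes: list[int]):
    """Build a stub that returns the given codes in order and records each call."""
    recorded: list[list[str]] = []
    remaining: Iterator[int] = iter(returncodes)

    def fake_run(cmd: list[str], **_: Any) -> FakeResult:
        recorded.append(cmd)
        return FakeResult(next(remaining))

    return fake_run, recorded


fake_run, calls = make_fake_run([1, 0])
first = fake_run(["git", "push"])  # first call fails
second = fake_run(["git", "push"], check=False)  # second call succeeds
```

In a real test the stub would be installed with `monkeypatch.setattr(subprocess, "run", fake_run)` and the assertions would run against `calls` afterwards.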
@@ -14,18 +14,25 @@ PyFlowX 把"任务依赖"这件事做到极致简单：**参数名就是依赖
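 ## Features

 - **Zero boilerplate**: parameter names are the dependencies; the framework injects upstream results automatically
-- **Three execution strategies**: `sequential` (debugging) / `thread` (I/O-bound sync) / `async` (I/O-bound async)
+- **Four execution strategies**: `sequential` (debugging) / `thread` (I/O-bound sync) / `async` (I/O-bound async) / `dependency` (dependency-driven, maximizes parallelism)
 - **Type safe**: `TaskSpec[T]` carries the return type through to `RunReport`; passes mypy strict
 - **DAG validation**: duplicate names, missing dependencies, and cycles are checked at build time
 - **Automatic layering**: Kahn's algorithm groups tasks into layers; tasks in the same layer can run in parallel
-- **Retries and timeouts**: each task configures `retries` and `timeout` independently
-- **Checkpoint resume**: `MemoryBackend` / `JSONBackend`; successful results can be cached and reused
+- **Retries and timeouts**: each task configures its own `RetryPolicy` (max_attempts/delay/backoff/jitter/retry_on) and `timeout`
+- **Soft dependencies**: `soft_depends_on` is used only for context injection and does not take part in topological layering
+- **Concurrency limits**: `concurrency_key` + `concurrency_limits` throttle tasks per group
+- **Task hooks**: `TaskHooks` (pre_run/post_run/on_failure) lifecycle callbacks
+- **Checkpoint resume**: `MemoryBackend` / `JSONBackend`; successful results can be cached and reused; `batch()` flushes to disk in one batch
+- **Cache keys**: a `cache_key` function computes a stable key from the inputs, so different inputs get separate cache entries
 - **Command tasks**: the `cmd` parameter runs external commands directly and accepts a list, a shell string, or a callable
 - **Conditional execution**: the `conditions` parameter skips tasks based on platform, environment variables, installed applications, and similar checks
+- **Graph composition**: `compose` / `GraphComposer` programmatically expand string references across multiple graphs
+- **Task templates**: the `task_template` factory generates similar TaskSpecs in bulk
+- **Graph-level defaults**: `GraphDefaults` configures retry/timeout/concurrency and more in one place
 - **CLI runner**: `CliRunner` maps multiple graphs to command-line subcommands, replacing a Makefile
-- **Observability**: `on_event` callback, `dry_run` preview, `verbose` lifecycle logging, Mermaid visualization
+- **Observability**: `on_event` callback (RUNNING/SUCCESS/FAILED/SKIPPED), `dry_run` preview, `verbose` lifecycle logging, Mermaid visualization
 - **Zero runtime dependencies**: standard library only (3.8 needs `graphlib_backport`)
-- **95% test coverage**: branch coverage >= 95%
+- **97% test coverage**: branch coverage >= 95%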
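
Review note: the "automatic layering" feature above names Kahn's algorithm. The sketch below shows the general technique only; it is not PyFlowX's implementation, and `kahn_layers` is a hypothetical function. It assumes every dependency also appears as a key of the mapping (a missing one would be reported as a cycle here).

```python
from __future__ import annotations


def kahn_layers(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group nodes into layers so that each node's dependencies sit in earlier layers."""
    pending = {name: set(d) for name, d in deps.items()}  # copy, do not mutate the input
    layers: list[list[str]] = []
    while pending:
        # Nodes with no unresolved dependencies can run in parallel in this layer.
        ready = sorted(name for name, d in pending.items() if not d)
        if not ready:
            raise ValueError(f"cycle detected among: {sorted(pending)}")
        layers.append(ready)
        for name in ready:
            del pending[name]
        for d in pending.values():
            d.difference_update(ready)
    return layers


diamond = {"a": set(), "b": {"a"}, "c": {"a"}, "d": {"b", "c"}}
layers = kahn_layers(diamond)
```

For the diamond-shaped graph, `b` and `c` land in the same layer because both depend only on `a`.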

 ## Installation

@@ -67,23 +74,31 @@ print(report["double"])  # [2, 4, 6]

 ### TaskSpec —— 任务描述

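-`TaskSpec` is an immutable task descriptor and the only thing you need to configure:
+`TaskSpec` is an immutable task descriptor (`Generic[T]`, with the return type carried through to `RunReport`) and the only thing you need to configure: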

 ```python
 px.TaskSpec(
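    name="fetch_user",  # unique identifier
    fn=fetch_user,  # sync or async function
    cmd=["curl", "..."],  # or: run a command (overrides fn)
-    depends_on=("auth",),  # names of the tasks this one depends on
+    depends_on=("auth",),  # hard dependencies (take part in topological layering)
+    soft_depends_on=("cache",),  # soft dependencies (injection only, not layered)
    args=(uid,),  # static positional arguments (appended after injected ones)
    kwargs={"timeout": 30},  # static keyword arguments
-    retries=3,  # retry count on failure (0 = run once)
+    retry=px.RetryPolicy(max_attempts=3, delay=1.0, backoff=2.0),  # retry policy
    timeout=30.0,  # timeout in seconds (None = unlimited)
    tags=("api", "user"),  # free-form tags for subgraph filtering
    conditions=(is_prod,),  # condition functions (runs only if all return True)
+    priority=10,  # priority within a layer (higher runs first, default 0)
+    concurrency_key="db",  # concurrency group key (throttled via concurrency_limits)
+    cache_key=lambda ctx: str(ctx.get("uid")),  # cache key function (separate cache per input)
+    hooks=px.TaskHooks(pre_run=..., post_run=..., on_failure=...),  # lifecycle hooks
    cwd=Path("/tmp"),  # working directory for the command (cmd mode only)
+    env={"DEBUG": "1"},  # environment overrides (apply in both fn and cmd modes)
    verbose=True,  # print command output (cmd mode only)
    skip_if_missing=True,  # skip automatically if the command is missing (list[str] cmd only)
+    allow_upstream_skip=False,  # whether to still run when an upstream task is SKIPPED/FAILED
+    continue_on_error=False,  # whether a failure of this task leaves the overall run going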
 )
 ```
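
Review note: the diff does not show how `RetryPolicy` interprets `delay`, `backoff`, and `jitter`. The sketch below assumes the common reading (wait `delay * backoff**n` before retry `n`, plus a random extra of at most `jitter` seconds). `backoff_delays` is a hypothetical helper, not a PyFlowX API, and the real semantics may differ.

```python
from __future__ import annotations

import random


def backoff_delays(
    max_attempts: int,
    delay: float,
    backoff: float,
    jitter: float = 0.0,
    rng: random.Random | None = None,
) -> list[float]:
    """Return the waits between attempts under the assumed exponential-backoff reading."""
    rng = rng or random.Random()
    waits: list[float] = []
    for n in range(max_attempts - 1):  # no wait after the final attempt
        base = delay * backoff**n
        waits.append(base + rng.uniform(0.0, jitter))
    return waits


plain = backoff_delays(max_attempts=3, delay=1.0, backoff=2.0)
jittered = backoff_delays(max_attempts=4, delay=1.0, backoff=2.0, jitter=0.5, rng=random.Random(0))
```

Under this reading, `max_attempts=3` means one initial attempt plus two retries, so only two waits are produced.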

@@ -97,18 +112,54 @@ px.TaskSpec(
 ### Graph —— DAG 构建

 ```python
-graph = px.Graph.from_specs([...])  # validate the whole batch (recommended)
+# Graph-level defaults: used as the fallback when a TaskSpec field is None
+defaults = px.GraphDefaults(retry=px.RetryPolicy(max_attempts=2), timeout=60.0)
+
+graph = px.Graph.from_specs([...], defaults=defaults)  # validate the whole batch (recommended)
 # or build incrementally
-graph = px.Graph()
+graph = px.Graph(defaults=defaults)
 graph.add(px.TaskSpec("a", fn_a))
 graph.add(px.TaskSpec("b", fn_b, ("a",)))

 graph.validate()  # explicit validation (cycle detection)
-graph.layers()  # topological layering
+graph.layers()  # topological layering (run() validates at entry; call validate() first when calling this directly)
 graph.to_mermaid()  # Mermaid visualization
 graph.describe()  # human-readable summary
 graph.subgraph(("api",))  # slice by tag
 graph.subgraph_by_names(("a", "b"))  # slice by name
+graph.map("fetch", [1, 2, 3], lambda i: px.TaskSpec(f"fetch_{i}", ...))  # bulk fan-out
+```
+
+### Graph composition: compose
+
+`compose` / `GraphComposer` expand multiple graphs that contain string references into plain `Graph` objects:
+
+```python
+graphs = {
+    "build": px.Graph.from_specs([px.TaskSpec("b", cmd=["echo", "b"])]),
+    "all": px.Graph.from_specs(["build", px.TaskSpec("t", cmd=["echo", "t"])]),
+}
+resolved = px.compose(graphs)  # the "build" reference inside the "all" graph is expanded
+```
+
+Reference formats: `"command_name"` (the whole graph) or `"command_name.task_name"` (a specific task).
+`CliRunner` calls `compose` internally.
+
+### Task templates: task_template
+
+The `task_template` factory generates similar TaskSpecs in bulk:
+
+```python
+fetch = px.task_template(
+    fn=fetch_url,
+    retry=px.RetryPolicy(max_attempts=5),
+    timeout=30.0,
+    tags=("api",),
+)
+graph = px.Graph.from_specs([
+    fetch("users", url="https://api.example.com/users"),
+    fetch("posts", url="https://api.example.com/posts"),
+])
 ```

 ### run —— 执行
@@ -116,12 +167,14 @@ graph.subgraph_by_names(("a", "b"))  # 按名称切片
 ```python
 report = px.run(
    graph,
-    strategy="async",  # sequential | thread | async
+    strategy="async",  # sequential | thread | async | dependency
    max_workers=8,  # thread pool size for the thread strategy
+    concurrency_limits={"db": 2},  # throttle by concurrency_key
    dry_run=False,  # True = only print the plan
    verbose=False,  # True = print task lifecycle logs
-    on_event=callback,  # state transition callback
+    on_event=callback,  # state transition callback (RUNNING/SUCCESS/FAILED/SKIPPED)
    state=px.JSONBackend("state.json"),  # checkpoint-resume backend
+    continue_on_error=False,  # True = a single task failure does not abort the run
 )
 ```

@@ -141,7 +194,7 @@ report.describe()  # 人类可读报告
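 Evaluated in order:

 1. Parameters **annotated as `Context`** → receive the full mapping of upstream results
-2. Parameters whose **name matches a dependency** → receive that dependency's result
+2. Parameters whose **name matches a dependency** → receive that dependency's result (soft dependencies included; a default is injected when one is missing)
 3. **`**kwargs`** parameter → receives all dependency results (dict)
 4. **`TaskSpec.args` / `kwargs`** → provide static values for non-dependency parameters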

@@ -170,8 +223,11 @@ def fetch_user(uid: int) -> dict:  # uid 来自 TaskSpec.args
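 | `sequential` | serial | debugging, CPU-bound | called directly | event loop |
 | `thread` | thread pool | I/O-bound sync | thread pool | not supported |
 | `async` | event loop | I/O-bound async | offloaded to thread pool | event loop |
+| `dependency` | dependency-driven | maximum parallelism | offloaded to thread pool | event loop |

-All strategies honor `retries`, `timeout`, context injection, and the state backend, and emit `TaskEvent`.
+All strategies honor `RetryPolicy`, `timeout`, context injection, the state backend, and `concurrency_limits`,
+and emit `TaskEvent` (RUNNING/SUCCESS/FAILED/SKIPPED). The `dependency` strategy has no layer barrier:
+a task starts as soon as all of its hard dependencies have completed.

 ## Command tasks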

@@ -275,12 +331,25 @@ python examples/async_aggregation.py
 from pyflowx import JSONBackend

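 # First run: successful results are written to state.json
-backend = JSONBackend("state.json")
+backend = JSONBackend("state.json", ttl=3600)  # ttl in seconds; expired entries are ignored automatically
 report = px.run(graph, strategy="sequential", state=backend)

-# Second run: cached tasks are skipped automatically
+# Second run: cached tasks are skipped automatically (status SKIPPED)
 report = px.run(graph, strategy="sequential", state=backend)
-# In report.results, cached tasks have status SKIPPED
+```
+
+`run()` wraps the whole execution in `backend.batch()` internally: every `save` is deferred and flushed to disk once when the run ends
+(for `JSONBackend` this cuts disk writes from O(N²) to O(N); for `MemoryBackend` it is a no-op).
+
+**Cache keys**: the default storage key is the task name. With a `cache_key` function configured, the key becomes `"name:cache_key_value"`,
+so different inputs get separate cache entries: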
+
+```python
+px.TaskSpec(
+    "fetch_user",
+    fn=fetch_user,
+    cache_key=lambda ctx: str(ctx.get("uid")),  # separate cache per uid
+)
 ```

 ## Error handling
@@ -321,14 +390,52 @@ except px.PyFlowXError:

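 PyFlowX focuses on keeping **single-machine DAG scheduling** as simple as possible; it suits ETL, data processing, CI pipelines, and similar scenarios.

+## Advanced features
+
+### Concurrency limits
+
+Throttle tasks per `concurrency_key` group to avoid overwhelming downstream resources:
+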
+```python
+graph = px.Graph.from_specs([
+    px.TaskSpec("q1", fn=query_db, concurrency_key="db"),
+    px.TaskSpec("q2", fn=query_db, concurrency_key="db"),
+    px.TaskSpec("q3", fn=query_db, concurrency_key="db"),
+])
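+# at most 2 tasks from the "db" group run at any one time
+px.run(graph, strategy="async", concurrency_limits={"db": 2})
+```
+
+### Task hooks
+
+`TaskHooks` fire during the task lifecycle (hook exceptions are only logged and do not affect task status):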
+
+```python
+hooks = px.TaskHooks(
+    pre_run=lambda spec: print(f"start {spec.name}"),
+    post_run=lambda spec, value: print(f"done {spec.name}"),
+    on_failure=lambda spec, exc: alert(spec.name, exc),
+)
+px.TaskSpec("task", fn=work, hooks=hooks)
+```
+
+### Priority
+
+Tasks within a layer run in descending `priority` order (stable sort):
+
+```python
+px.TaskSpec("low", fn=work, priority=0)
+px.TaskSpec("high", fn=work, priority=10)  # runs first within the layer
+```
+
 ## Development

 ```bash
 # install development dependencies
 uv sync --extra dev

-# run tests (with coverage)
-uv run pytest --cov=pyflowx --cov-fail-under=100
+# run tests (with coverage, threshold 95%)
+uv run pytest --cov=pyflowx --cov-fail-under=95

 # type check
 uv run mypy
@@ -338,6 +445,22 @@ uv run ruff check src tests examples
 uv run ruff format --check src tests examples
 ```

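+## Module structure
+
+| Module | Responsibility |
+|------|------|
+| `task.py` | Pure data structures: `TaskSpec`, `RetryPolicy`, `TaskHooks`, `TaskStatus` |
+| `graph.py` | DAG construction, validation, layering, visualization |
+| `compose.py` | Multi-graph composition: `GraphComposer` / `compose` |
+| `context.py` | Context injection: parameter name → dependency resolution |
+| `command.py` | Command execution: `run_command` (list/shell/Callable) |
+| `conditions.py` | Conditional execution: built-in conditions and combinators |
+| `executors.py` | Executors and the `run` entry point: the four strategies share module-level helpers |
+| `storage.py` | State backends: `MemoryBackend` / `JSONBackend` (batch flush) |
+| `runner.py` | CLI runner: `CliRunner` |
+| `report.py` | Run results: `RunReport` / `TaskResult` |
+| `errors.py` | Error family: `PyFlowXError` subclasses |
+
 ## License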

 MIT
@@ -21,7 +21,7 @@ license = { text = "MIT" }
 name = "pyflowx"
 readme = "README.md"
 requires-python = ">=3.8"
-version = "0.2.11"
+version = "0.3.0"

 [project.scripts]
 autofmt     = "pyflowx.cli.autofmt:main"
@@ -38,6 +38,7 @@ packtool    = "pyflowx.cli.packtool:main"
 pdftool     = "pyflowx.cli.pdftool:main"
 piptool     = "pyflowx.cli.piptool:main"
 pymake      = "pyflowx.cli.pymake:main"
+pxp         = "pyflowx.cli.profiler:main"
 reseticon   = "pyflowx.cli.reseticoncache:main"
 scrcap      = "pyflowx.cli.screenshot:main"
 sglang      = "pyflowx.cli.llm.sglang:main"
@@ -58,6 +58,8 @@

 from __future__ import annotations

+from .command import run_command
+from .compose import GraphComposer, compose
 from .conditions import (
    IS_LINUX,
    IS_MACOS,
@@ -79,7 +81,8 @@ from .errors import (
    TaskTimeoutError,
 )
 from .executors import Strategy, run
-from .graph import Graph, GraphComposer, GraphDefaults, compose
+from .graph import Graph, GraphDefaults
+from .profiling import ProfileReport, TaskProfile
 from .report import RunReport
 from .runner import CliExitCode, CliRunner
 from .storage import JSONBackend, MemoryBackend, StateBackend
@@ -92,10 +95,12 @@ from .task import (
    TaskResult,
    TaskSpec,
    TaskStatus,
+    cmd,
+    task,
    task_template,
 )

-__version__ = "0.3.5"
+__version__ = "0.4.0"

 __all__ = [
    "IS_LINUX",
@@ -118,6 +123,7 @@ __all__ = [
    "JSONBackend",
    "MemoryBackend",
    "MissingDependencyError",
+    "ProfileReport",
    "PyFlowXError",
    "RetryPolicy",
    "RunReport",
@@ -128,13 +134,17 @@ __all__ = [
    "TaskEvent",
    "TaskFailedError",
    "TaskHooks",
+    "TaskProfile",
    "TaskResult",
    "TaskSpec",
    "TaskStatus",
    "TaskTimeoutError",
    "build_call_args",
+    "cmd",
    "compose",
    "describe_injection",
    "run",
+    "run_command",
+    "task",
    "task_template",
 ]
@@ -1,6 +1,7 @@
 from __future__ import annotations

 import argparse
+import getpass
 from pathlib import Path
 from typing import Literal, get_args

@@ -254,6 +255,31 @@ def main() -> None:
            allow_upstream_skip=True,
            verbose=True,
        ),
+        # Install Docker
+        px.TaskSpec(
+            "install_docker",
+            cmd=["sudo", "apt", "install", "-y", "docker-compose-v2"],
+            conditions=(BuiltinConditions.IS_LINUX(),),
+            depends_on=("install_mirror",),
+            allow_upstream_skip=True,
+            verbose=True,
+        ),
+        px.TaskSpec(
+            "add_docker_group",
+            cmd=["sudo", "usermod", "-aG", "docker", getpass.getuser()],
+            conditions=(BuiltinConditions.IS_LINUX(),),
+            depends_on=("install_docker",),
+            allow_upstream_skip=True,
+            verbose=True,
+        ),
+        px.TaskSpec(
+            "refresh_docker_group",
+            cmd=["newgrp", "docker"],  # caveat: newgrp spawns a subshell; a re-login may still be needed
+            conditions=(BuiltinConditions.IS_LINUX(),),
+            depends_on=("add_docker_group",),
+            allow_upstream_skip=True,
+            verbose=True,
+        ),
         # Set Python environment variables
        *setenv_group({
            "PIP_INDEX_URL": PIP_INDEX_URLS[python_mirror],
@@ -240,7 +240,7 @@ def _parse_email_date(date_str: str) -> str:
    try:
        dt = parsedate_to_datetime(date_str)
        return dt.isoformat()
-    except Exception:
+    except (ValueError, TypeError, OverflowError):
        return date_str


@@ -277,11 +277,11 @@ def _extract_email_body_part(part: Any) -> str:
            decoded_text = payload.decode(charset, errors="replace")
        except (UnicodeDecodeError, LookupError) as decode_error:
             # If the declared charset fails, try common encodings
-            logger.warning(f"字符编码 {charset} 解码失败: {decode_error}")
+            logger.warning("字符编码 %s 解码失败: %s", charset, decode_error)
            for fallback_charset in ["utf-8", "gbk", "gb2312", "latin-1"]:
                try:
                    decoded_text = payload.decode(fallback_charset, errors="replace")
-                    logger.info(f"成功使用备用编码 {fallback_charset} 解码")
+                    logger.info("成功使用备用编码 %s 解码", fallback_charset)
                    break
                except (UnicodeDecodeError, LookupError):
                    continue
@@ -293,15 +293,15 @@ def _extract_email_body_part(part: Any) -> str:
         # Truncate to the length limit and return
        result = decoded_text[:MAX_BODY_LENGTH]
        if len(decoded_text) > MAX_BODY_LENGTH:
-            logger.debug(f"正文内容过长，截取前{MAX_BODY_LENGTH}字符")
+            logger.debug("正文内容过长，截取前%d字符", MAX_BODY_LENGTH)

        return result

    except AttributeError as attr_error:
-        logger.error(f"邮件部分对象属性错误: {attr_error}")
+        logger.error("邮件部分对象属性错误: %s", attr_error)
        return ""
    except Exception as unexpected_error:
-        logger.error(f"提取邮件正文时发生未知错误: {unexpected_error}")
+        logger.error("提取邮件正文时发生未知错误: %s", unexpected_error)
        return ""
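
Review note: the switch from f-strings to `%`-style logging arguments in this hunk follows the lazy-formatting rule. A minimal, self-contained sketch of why it matters is below; `Expensive` and the logger name are hypothetical and exist only for this demonstration.

```python
import logging


class Expensive:
    """Counts how often it is rendered to a string."""

    def __init__(self) -> None:
        self.renders = 0

    def __str__(self) -> str:
        self.renders += 1
        return "expensive"


logger = logging.getLogger("lazy_logging_demo")
logger.setLevel(logging.WARNING)  # DEBUG records are discarded

lazy = Expensive()
logger.debug("value: %s", lazy)  # the argument is never formatted

eager = Expensive()
logger.debug(f"value: {eager}")  # the f-string is built before the call
```

With the level set to `WARNING`, the `%s` form skips string conversion entirely, while the f-string pays for it even though the record is dropped.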


@@ -66,19 +66,10 @@ def backup_folder(src: str, dst: str, max_zip: int = 5) -> None:
    zip_target(src_path, dst_path, max_zip)


-# ============================================================================
-# TaskSpec 定义
-# ============================================================================
-
-folderback_default: px.TaskSpec = px.TaskSpec(
-    "folderback_default",
-    fn=lambda: backup_folder(".", "./backup", 5),
-)
-
-
-# ============================================================================
-# CLI Runner
-# ============================================================================
+@px.task
+def folderback_default() -> None:
+    """Back up the current directory to ./backup."""
+    backup_folder(".", "./backup", 5)


 def main() -> None:
@@ -86,9 +77,9 @@ def main() -> None:
    runner = px.CliRunner(
        strategy="thread",
        description="FolderBack - 文件夹备份工具",
-        graphs={
+        aliases={
             # Back up the current directory to ./backup
-            "b": px.Graph.from_specs([folderback_default]),
+            "b": folderback_default,
        },
    )
    runner.run_cli()
@@ -57,16 +57,10 @@ def zip_folders(cwd: str = ".") -> None:
        archive_folder(dir_path)


-# ============================================================================
-# TaskSpec 定义
-# ============================================================================
-
-folderzip_default: px.TaskSpec = px.TaskSpec("folderzip_default", fn=lambda: zip_folders("."))
-
-
-# ============================================================================
-# CLI Runner
-# ============================================================================
+@px.task
+def folderzip_default() -> None:
+    """Zip every folder in the current directory."""
+    zip_folders(".")


 def main() -> None:
@@ -74,9 +68,9 @@ def main() -> None:
    runner = px.CliRunner(
        strategy="thread",
        description="FolderZip - 文件夹压缩工具",
-        graphs={
+        aliases={
             # Zip every folder in the current directory
-            "z": px.Graph.from_specs([folderzip_default]),
+            "z": folderzip_default,
        },
    )
    runner.run_cli()
@@ -46,7 +46,12 @@ def init_sub_dirs() -> None:
        )


-isub: px.TaskSpec = px.TaskSpec("isub", fn=init_sub_dirs)
+@px.task(name="isub")
+def isub() -> None:
+    """Initialize Git repositories in the subdirectories."""
+    init_sub_dirs()
+
+
 push: px.TaskSpec = px.TaskSpec("push", cmd=["git", "push"])
 pull: px.TaskSpec = px.TaskSpec("pull", cmd=["git", "pull"])
 kill_tgit: px.TaskSpec = px.TaskSpec("task_kill", cmd=["taskkill", "/f", "/t", "/im", "tgitcache.exe"])
@@ -67,17 +72,17 @@ def main() -> None:
    runner = px.CliRunner(
        strategy="thread",
        description="Gittool - Git 执行工具.",
-        graphs={
+        aliases={
             # Add and commit
            "a": px.Graph.from_specs([
                px.TaskSpec("add", cmd=["git", "add", "."], conditions=(lambda _: has_files(),)),
                px.TaskSpec("commit", cmd=["git", "commit", "-m", "chore: update"], depends_on=("add",)),
            ]),
-            # Clean
-            "c": px.Graph.from_specs([
+            # Clean (chain: clean → status)
+            "c": px.Graph().chain(
                px.TaskSpec("clean", cmd=["git", "clean", "-xfd", *EXCLUDE_CMDS]),
-                px.TaskSpec("status", cmd=["git", "status", "--porcelain"], depends_on=("clean",)),
-            ]),
+                px.TaskSpec("status", cmd=["git", "status", "--porcelain"]),
+            ),
             # Initialize, add, and commit
            "i": px.Graph.from_specs([
                px.TaskSpec("init", cmd=["git", "init"], conditions=(lambda _: not_has_git_repo(),)),
@@ -90,13 +95,13 @@ def main() -> None:
                ),
            ]),
             # Initialize subdirectories
-            "isub": px.Graph.from_specs([isub]),
+            "isub": isub,
             # Push
-            "p": px.Graph.from_specs([push]),
+            "p": push,
             # Pull
-            "pl": px.Graph.from_specs([pull]),
+            "pl": pull,
             # Restart the TGit cache
-            "r": px.Graph.from_specs([kill_tgit]),
+            "r": kill_tgit,
        },
    )
    runner.run_cli()
@@ -0,0 +1,272 @@
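+"""pxp: PyFlowX profiler.
+
+Profiles a Python script that contains ``px`` calls and produces a performance profile report of the workflow execution.
+
+How it works
+------------
+1. Inject hooks: monkey-patch ``pyflowx.run`` / ``pyflowx.executors.run`` /
+   ``pyflowx.runner.run`` to capture the ``Graph`` and ``RunReport`` of the most recent execution.
+2. Run the target script: execute it as ``__main__`` with ``runpy.run_path``,
+   catching ``SystemExit`` (the script may call ``sys.exit``).
+3. Build the report: construct a :class:`ProfileReport` from the captured report + graph;
+   by default, write HTML and open it in the browser automatically.
+
+Usage
+-----
+    # profile pymake.py, generate an HTML report, and open the browser
+    pxp pymake.py
+
+    # pass arguments to the profiled script (separated by --)
+    pxp pymake.py -- t
+
+    # choose the output file
+    pxp pymake.py -o report.html
+
+    # do not open the browser
+    pxp pymake.py --no-browser
+
+    # write a plain-text report
+    pxp pymake.py -E text
+"""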
+
+from __future__ import annotations
+
+__all__ = ["main"]
+
+import argparse
+import runpy
+import sys
+import webbrowser
+from pathlib import Path
+from typing import Any
+
+from .. import executors as _executors
+from .. import runner as _runner
+from ..profiling import ProfileReport
+from ..report import RunReport
+
+
+def _build_parser() -> argparse.ArgumentParser:
+    """Build the argument parser."""
+    parser = argparse.ArgumentParser(
+        prog="pxp",
+        description="PyFlowX 性能分析器：分析包含 px 调用的脚本，生成性能剖面报告。",
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+        epilog=(
+            "示例:\n"
+            "  pxp pymake.py              # 分析并打开 HTML 报告\n"
+            "  pxp pymake.py -- t         # 传递参数 t 给脚本\n"
+            "  pxp pymake.py -E text      # 输出纯文本报告\n"
+            "  pxp pymake.py -o out.html  # 指定输出文件\n"
+        ),
+    )
+    _ = parser.add_argument(
+        "--export",
+        "-E",
+        choices=["html", "text"],
+        default="html",
+        help="导出格式（默认: html）",
+    )
+    _ = parser.add_argument(
+        "--no-browser",
+        action="store_true",
+        help="不自动打开浏览器（仅 HTML 格式有效）",
+    )
+    _ = parser.add_argument(
+        "-o",
+        "--output",
+        help="输出文件路径（默认: <script>_profile.html）",
+    )
+    return parser
+
+
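+def _capture_px_run() -> dict[str, Any]:
+    """Inject hooks that capture px.run() calls.
+
+    Returns a dict that is filled with ``graph`` and ``report`` once ``run()`` has executed.
+    The dict also holds the restore function under ``_restore``, for use in a finally block.
+
+    Note
+    -----
+    Three references must be patched together:
+    * ``pyflowx.executors.run``: the actual implementation
+    * ``pyflowx.runner.run``: the reference that ``CliRunner`` imports directly
+    * ``pyflowx.run``: the reference exported by the top-level package (user scripts commonly call ``px.run()``)
+
+    ``RunReport.__init__`` is patched as well, to capture the report instance created inside ``run()``.
+    This matters when ``run()`` raises ``TaskFailedError``: in that case ``run()``
+    does not return the report normally, but the report object has already been created internally and filled with the results of the tasks that ran.
+    The ``capture_enabled`` flag ensures capture happens only during a ``patched_run`` call.
+    """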
+    captured: dict[str, Any] = {}
+    original_exec_run = _executors.run
+    original_runner_run = _runner.run
+    # Lazily fetch the top-level pyflowx.run reference (avoids a circular import)
+    import pyflowx as px_mod
+
+    original_px_run = px_mod.run
+    original_report_init = RunReport.__init__
+    capture_enabled = [False]
+
+    def patched_report_init(self: RunReport, *args: Any, **kwargs: Any) -> None:
+        original_report_init(self, *args, **kwargs)
+        if capture_enabled[0]:
+            captured["report"] = self
+
+    RunReport.__init__ = patched_report_init  # type: ignore[assignment]
+
+    def patched_run(graph: Any, *args: Any, **kwargs: Any) -> RunReport:
+        captured["graph"] = graph
+        capture_enabled[0] = True
+        try:
+            report = original_exec_run(graph, *args, **kwargs)
+            # On a normal return, make sure captured["report"] is the returned report
+            captured["report"] = report
+            return report
+        finally:
+            capture_enabled[0] = False
+
+    # Patch every entry point that references run
+    _executors.run = patched_run  # type: ignore[assignment]
+    _runner.run = patched_run  # type: ignore[assignment]
+    px_mod.run = patched_run  # type: ignore[assignment]
+
+    def _restore() -> None:
+        _executors.run = original_exec_run  # type: ignore[assignment]
+        _runner.run = original_runner_run  # type: ignore[assignment]
+        px_mod.run = original_px_run  # type: ignore[assignment]
+        RunReport.__init__ = original_report_init  # type: ignore[assignment]
+
+    captured["_restore"] = _restore
+    return captured
+
+
+def _run_target_script(script: Path, script_args: list[str]) -> dict[str, Any]:
+    """Execute the target script.
+
+    Sets ``sys.argv``, adds the script's directory to ``sys.path``, then runs the
+    script as ``__main__`` via ``runpy.run_path``. ``SystemExit`` propagates to the caller.
+
+    Returns
+    -------
+    dict[str, Any]
+        The script module's globals (including definitions such as ``main``).
+    """
+    sys.argv = [str(script), *script_args]
+    script_dir = str(script.parent.resolve())
+    if script_dir not in sys.path:
+        sys.path.insert(0, script_dir)
+    return runpy.run_path(str(script), run_name="__main__")
+
+
+def _try_call_main(module_globals: dict[str, Any]) -> None:
+    """Call ``main`` if the module defines it as a callable.
+
+    Covers scripts without an ``if __name__ == "__main__"`` block (such as CLI
+    scripts registered through entry points). ``main`` usually calls
+    ``CliRunner.run_cli()``, which reads ``sys.argv[1:]`` to pick the command.
+    """
+    main_fn = module_globals.get("main")
+    if callable(main_fn):
+        main_fn()
+
+
+def _output_report(
+    profile: ProfileReport,
+    export: str,
+    output: str | None,
+    script_stem: str,
+    no_browser: bool,
+) -> None:
+    """Write out the profile report."""
+    if export == "text":
+        print(profile.describe())
+        return
+
+    # HTML format
+    html = profile.to_html()
+    if output:
+        out_path = Path(output)
+    else:
+        out_path = Path.cwd() / f"{script_stem}_profile.html"
+    out_path.write_text(html, encoding="utf-8")
+    print(f"HTML 报告已生成: {out_path}")
+
+    if not no_browser:
+        try:
+            webbrowser.open(out_path.resolve().as_uri())
+        except Exception as e:
+            print(f"警告：无法打开浏览器: {e}", file=sys.stderr)
+
+
+def main() -> None:
+    """Entry point of the pxp CLI."""
+    parser = _build_parser()
+    pxp_args, remaining = parser.parse_known_args()
+
+    if not remaining:
+        parser.print_help()
+        sys.exit(2)
+
+    script_str = remaining[0]
+    script_args = remaining[2:] if remaining[1:2] == ["--"] else remaining[1:]  # drop a leading "--" separator
+    script_path = Path(script_str).resolve()
+
+    if not script_path.is_file():
+        print(f"错误：脚本不存在: {script_path}", file=sys.stderr)
+        sys.exit(2)
+
+    # Install the hooks
+    captured = _capture_px_run()
+    restore = captured.pop("_restore")
+
+    # Run the target script
+    print(f"正在分析: {script_path}")
+    if script_args:
+        print(f"脚本参数: {script_args}")
+    print("-" * 60)
+
+    module_globals: dict[str, Any] = {}
+    try:
+        try:
+            module_globals = _run_target_script(script_path, script_args)
+        except SystemExit:
+            # The script called sys.exit(); this is expected
+            pass
+        except Exception as e:
+            print(f"警告：脚本执行抛出异常: {e}", file=sys.stderr)
+
+        # If running the script captured no run(), fall back to the module's main()
+        # (for CLI scripts without an ``if __name__ == "__main__"`` block)
+        if captured.get("report") is None and module_globals:
+            try:
+                _try_call_main(module_globals)
+            except SystemExit:
+                pass
+            except Exception as e:
+                print(f"警告：调用 main() 抛出异常: {e}", file=sys.stderr)
+    finally:
+        # Restore the hooks even if the script raises KeyboardInterrupt or similar
+        restore()
+
+    # Check whether a run() call was captured
+    report = captured.get("report")
+    graph = captured.get("graph")
+    if report is None or graph is None:
+        print("错误：未捕获到 px.run() 调用，无法生成性能报告", file=sys.stderr)
+        print("请确保脚本通过 px.run() 或 CliRunner 执行任务流图。", file=sys.stderr)
+        sys.exit(1)
+
+    # Build the report
+    profile = ProfileReport.from_report(report, graph)
+    _output_report(
+        profile,
+        export=pxp_args.export,
+        output=pxp_args.output,
+        script_stem=script_path.stem,
+        no_browser=pxp_args.no_browser,
+    )
+
+
+if __name__ == "__main__":
+    main()
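
Review note: the capture hook above relies on rebinding `run` on every module that holds a reference to it, then undoing the rebinding afterwards. The sketch below shows that pattern in isolation. The `impl` and `facade` modules and the `install_capture` helper are hypothetical stand-ins built for the illustration, not part of pyflowx.

```python
import types

# Hypothetical stand-ins for an implementation module and a package that re-exports it.
impl = types.ModuleType("impl")
facade = types.ModuleType("facade")


def run(graph):
    return f"ran {graph}"


impl.run = run
facade.run = run  # a second reference, as created by `from .impl import run`


def install_capture(modules):
    """Rebind `run` on every module; return the capture dict and a restore callable."""
    captured = {}
    originals = [(module, module.run) for module in modules]
    real_run = originals[0][1]

    def patched_run(graph, *args, **kwargs):
        captured["graph"] = graph
        result = real_run(graph, *args, **kwargs)
        captured["result"] = result
        return result

    for module, _ in originals:
        module.run = patched_run

    def restore():
        for module, original in originals:
            module.run = original

    return captured, restore


captured, restore = install_capture([impl, facade])
try:
    facade.run("g1")  # resolved through the patched reference
finally:
    restore()  # always undo the patch, even if the call raised
```

Patching only `impl.run` would miss callers that already imported the name into their own namespace, which is why each holder of the reference is rebound separately.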
@@ -6,51 +6,77 @@

 from __future__ import annotations

+from pathlib import Path
+
 import pyflowx as px
 from pyflowx.conditions import Constants

+# Project root (pymake.py lives in src/pyflowx/cli, so the root is four levels up)
+ROOT_DIR = Path(__file__).parent.parent.parent.parent

-def maturin_build_cmd() -> list[str]:
-    """获取 maturin 构建命令（根据平台自动添加参数）.
+MATURIN_BUILD_COMMAND = ["maturin", "build", "-r"]
+if Constants.IS_WINDOWS:
+    MATURIN_BUILD_COMMAND.extend(["--target", "x86_64-win7-windows-msvc", "-Zbuild-std", "-i", "python3.8"])

-    Returns
-    -------
-    list[str]
-        完整的 maturin 构建命令列表.
-    """
-    command = ["maturin", "build", "-r"].copy()
-    if Constants.IS_WINDOWS:
-        command.extend(["--target", "x86_64-win7-windows-msvc", "-Zbuild-std", "-i", "python3.8"])
-    return command
+# Register all tasks flat (px.cmd derives the name from the first two command segments)
+# Every task sets cwd=ROOT_DIR so that it runs from the project root
+tasks: list[px.TaskSpec] = [
+    px.cmd(["uv", "build"], cwd=ROOT_DIR),
+    px.cmd(MATURIN_BUILD_COMMAND, cwd=ROOT_DIR),
+    px.cmd(["uv", "sync"], cwd=ROOT_DIR),
+    px.cmd(["gitt", "c"], name="git_clean", cwd=ROOT_DIR),
+    px.cmd(
+        ["pytest", "-m", "not slow", "-n", "8", "--dist", "loadfile", "--color=yes", "--durations=10"],
+        name="test",
+        cwd=ROOT_DIR,
+    ),
+    px.cmd(
+        ["pytest", "-m", "not slow", "--dist", "loadfile", "--color=yes", "--durations=10"],
+        name="test_fast",
+        cwd=ROOT_DIR,
+    ),
+    px.cmd(
+        ["pytest", "--cov", "-n", "8", "--dist", "loadfile", "--tb=short", "-v", "--color=yes", "--durations=10"],
+        name="test_coverage",
+        cwd=ROOT_DIR,
+    ),
+    px.cmd(["pyrefly", "check", "."], cwd=ROOT_DIR),
+    px.cmd(["git", "add", "-A"], name="git_add_all", cwd=ROOT_DIR),
+    px.cmd(["bumpversion"], cwd=ROOT_DIR),
+    px.cmd(["bumpversion", "minor"], cwd=ROOT_DIR),
+    px.cmd(["git", "push"], cwd=ROOT_DIR),
+    px.cmd(["git", "push", "--tags"], name="git_push_tags", cwd=ROOT_DIR),
+    px.cmd(["hatch", "publish"], name="publish_python", cwd=ROOT_DIR),
+    px.cmd(["twine", "upload", "--disable-progress-bar"], name="twine_publish", cwd=ROOT_DIR),
+]
+
+# Single-task aliases (alias name equals task name): inline the TaskSpec to avoid a str self-reference
+aliases: dict[str, str | list[str | px.TaskSpec] | px.TaskSpec | px.Graph] = {
+    # Build commands
+    "b": "uv_build",
+    "bc": "maturin_build",
+    "ba": ["b", "bc"],
+    # Install commands
+    "sync": "uv_sync",
+    # Cleanup commands
+    "c": "git_clean",
+    # Development tools
+    "bump": ["c", "tc", "git_add_all", "bumpversion"],
+    "bumpmi": "bumpversion_minor",
+    "cov": ["git_clean", "test_coverage"],
+    "doc": px.cmd(["sphinx-build", "-b", "html", "docs", "docs/_build"], name="doc", cwd=ROOT_DIR),
+    "lint": px.cmd(["ruff", "check", "--fix", "--unsafe-fixes"], name="lint", cwd=ROOT_DIR),
+    "pb": ["twine_publish", "publish_python"],
+    "t": "test",
+    "tf": "test_fast",
+    "tc": ["pyrefly_check", "lint"],
+    "tox": px.cmd(["tox", "-p", "auto"], name="tox", cwd=ROOT_DIR),
+    # Release commands
+    "p": ["git_clean", "git_push", "git_push_tags"],
+}


-uv_build: px.TaskSpec = px.TaskSpec("uv_build", cmd=["uv", "build"])
-maturin_build: px.TaskSpec = px.TaskSpec("maturin_build", cmd=maturin_build_cmd())
-uv_sync: px.TaskSpec = px.TaskSpec("uv_sync", cmd=["uv", "sync"])
-git_clean: px.TaskSpec = px.TaskSpec("git_clean", cmd=["gitt", "c"])
-test: px.TaskSpec = px.TaskSpec(
-    "test", cmd=["pytest", "-m", "not slow", "-n", "8", "--dist", "loadfile", "--color=yes", "--durations=10"]
-)
-test_fast: px.TaskSpec = px.TaskSpec(
-    "test_fast", cmd=["pytest", "-m", "not slow", "--dist", "loadfile", "--color=yes", "--durations=10"]
-)
-test_coverage: px.TaskSpec = px.TaskSpec(
-    "test_coverage",
-    cmd=["pytest", "--cov", "-n", "8", "--dist", "loadfile", "--tb=short", "-v", "--color=yes", "--durations=10"],
-)
-ruff_lint: px.TaskSpec = px.TaskSpec("lint", cmd=["ruff", "check", "--fix", "--unsafe-fixes"])
-typecheck: px.TaskSpec = px.TaskSpec("pyrefly_check", cmd=["pyrefly", "check", "."])
-git_add_all: px.TaskSpec = px.TaskSpec("git_add_all", cmd=["git", "add", "-A"])
-bump: px.TaskSpec = px.TaskSpec("bumpversion", cmd=["bumpversion"])
-doc: px.TaskSpec = px.TaskSpec("doc", cmd=["sphinx-build", "-b", "html", "docs", "docs/_build"])
-git_push: px.TaskSpec = px.TaskSpec("git_push", cmd=["git", "push"])
-git_push_tags: px.TaskSpec = px.TaskSpec("git_push_tags", cmd=["git", "push", "--tags"])
-hatch_publish: px.TaskSpec = px.TaskSpec("publish_python", cmd=["hatch", "publish"])
-twine_publish: px.TaskSpec = px.TaskSpec("twine_publish", cmd=["twine", "upload", "--disable-progress-bar"])
-tox: px.TaskSpec = px.TaskSpec("tox", cmd=["tox", "-p", "auto"])
-
-
-def main():
+def main() -> None:
    """pymake 构建工具.

    🔨 构建命令:
@@ -78,10 +104,10 @@ def main():
    📦 发布命令:
      pymake pb   - 发布到 PyPI (twine + hatch)

-    � 版本管理:
+    🔖 版本管理:
      pymake bump  - 自动升级版本号并提交修改 (清理 + 检查 + 格式化 + git add + bumpversion)

-    �💡 常用工作流:
+    💡 常用工作流:
      1. 日常开发: pymake lint && pymake t
      2. 构建发布包: pymake ba
      3. 多版本兼容性测试: pymake tox
@@ -95,31 +121,5 @@ def main():
      pymake lint        # 格式化代码
      pymake type        # 类型检查
    """
-    runner = px.CliRunner(
-        strategy="sequential",
-        description="PyMake - Python 构建工具",
-        graphs={
-            # 构建命令
-            "b": px.Graph.from_specs([uv_build]),
-            "bc": px.Graph.from_specs([maturin_build]),
-            "ba": px.Graph.from_specs(["b", "bc"]),
-            # 安装命令
-            "sync": px.Graph.from_specs([uv_sync]),
-            # 清理命令
-            "c": px.Graph.from_specs([git_clean]),
-            # 开发工具
-            "bump": px.Graph.from_specs(["c", "tc", git_add_all, bump]),
-            "bumpmi": px.Graph.from_specs([px.TaskSpec("bumpversion_minor", cmd=["bumpversion", "minor"])]),
-            "cov": px.Graph.from_specs([git_clean, test_coverage]),
-            "doc": px.Graph.from_specs([doc]),
-            "lint": px.Graph.from_specs([ruff_lint]),
-            "pb": px.Graph.from_specs([twine_publish, hatch_publish]),
-            "t": px.Graph.from_specs([test]),
-            "tf": px.Graph.from_specs([test_fast]),
-            "tc": px.Graph.from_specs([typecheck, "lint"]),
-            "tox": px.Graph.from_specs([tox]),
-            # 发布命令
-            "p": px.Graph.from_specs([git_clean, git_push, git_push_tags]),
-        },
-    )
+    runner = px.CliRunner(strategy="sequential", description="PyMake - Python 构建工具", tasks=tasks, aliases=aliases)
    runner.run_cli()
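@@ -0,0 +1,98 @@
+"""Command executor: turns the ``cmd`` field of a :class:`~pyflowx.task.TaskSpec`
+(list / shell string / callable) into a single execution entry point.
+
+Background: the original ``task.py`` docstring declared it a "pure data structure", yet
+``_run_command`` is command-execution logic, which violates single responsibility. It is extracted
+here: ``TaskSpec`` only holds configuration and execution lives in this module, easing testing.
+"""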
+
+from __future__ import annotations
+
+import os
+import subprocess
+from typing import Any, List, Union, cast
+
+from .task import TaskSpec
+
+__all__ = ["run_command"]
+
+
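+def run_command(spec: TaskSpec[Any]) -> Any:  # noqa: PLR0912
+    """Run the command given by ``spec.cmd`` (list / shell string / callable).
+
+    Behaves the same as the original ``TaskSpec._run_command``:
+
+    - Callable: invoked directly; exceptions are wrapped in :class:`RuntimeError`.
+    - list / str: executed via :func:`subprocess.run`; a non-zero return code raises
+      :class:`RuntimeError` (with stderr attached when ``verbose=False``).
+    - With ``verbose=True``, execution details and the return code are printed to stdout.
+    - ``cwd`` / ``env`` are isolated through subprocess arguments (process-level state is only
+      used on the fn task path; the cmd path never calls ``os.chdir`` or mutates ``os.environ``).
+    """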
+    cmd = spec.cmd
+    verbose = spec.verbose
+    cwd = spec.cwd
+    timeout = spec.timeout
+    env_override = spec.env
+
+    # Callable: invoke it directly and return its result.
+    if callable(cmd) and not isinstance(cmd, (list, str)):
+        name = getattr(cmd, "__name__", "callable")
+        if verbose:
+            print(f"[verbose] 执行可调用命令: {name}", flush=True)
+            if cwd is not None:
+                print(f"[verbose] 工作目录: {cwd}", flush=True)
+        try:
+            return cmd()
+        except Exception as e:
+            raise RuntimeError(f"可调用命令执行异常: {name}: {e}") from e
+
+    is_list = isinstance(cmd, list)
+    if is_list:
+        cmd_str = " ".join(str(arg) for arg in cmd)  # type: ignore[union-attr]
+        verb = "执行命令"
+        label = "命令"
+    else:
+        cmd_str = cast(str, cmd)
+        verb = "执行 Shell"
+        label = "Shell 命令"
+
+    if verbose:
+        print(f"[verbose] {verb}: {cmd_str}", flush=True)
+        if cwd is not None:
+            print(f"[verbose] 工作目录: {cwd}", flush=True)
+
+    # Merge environment variables
+    run_env: dict[str, str] | None = None
+    if env_override:
+        run_env = dict(os.environ)
+        run_env.update(env_override)
+
+    try:
+        result = subprocess.run(
+            cast(Union[str, List[str]], cmd),
+            shell=not is_list,
+            cwd=cwd,
+            env=run_env,
+            timeout=timeout,
+            capture_output=not verbose,
+            text=True,
+            check=False,
+        )
+    except FileNotFoundError:
+        raise RuntimeError(f"{label}未找到: {cmd_str}") from None
+    except subprocess.TimeoutExpired:
+        raise RuntimeError(f"{label}执行超时: {cmd_str} ({timeout}s)") from None
+    except OSError as e:
+        raise RuntimeError(f"{label}执行异常: {cmd_str}: {e}") from e
+
+    if verbose:
+        print(f"[verbose] 返回码: {result.returncode}", flush=True)
+
+    if result.returncode == 0:
+        return None
+
+    err_msg = f"{label}执行失败: `{cmd_str}`, 返回码: {result.returncode}"
+    if not verbose and result.stderr.strip():
+        err_msg += f"\n{result.stderr.strip()}"
+    raise RuntimeError(err_msg)
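
Review note: the environment handling in `run_command` layers the overrides on top of a copy of `os.environ` and hands the result to `subprocess.run`, leaving the parent process untouched. A minimal sketch of that pattern follows. The `run_with_env` helper and the `PXP_DEMO` variable are hypothetical names chosen for the illustration.

```python
import os
import subprocess
import sys


def run_with_env(cmd, env_override=None, cwd=None, timeout=None):
    """Run `cmd` (a list of arguments) with overrides layered on top of os.environ."""
    run_env = None  # None makes the child inherit the parent environment unchanged
    if env_override:
        run_env = {**os.environ, **env_override}
    return subprocess.run(
        cmd,
        cwd=cwd,
        env=run_env,
        timeout=timeout,
        capture_output=True,
        text=True,
        check=False,
    )


# The child reads the overridden variable; the parent's os.environ is not modified.
result = run_with_env(
    [sys.executable, "-c", "import os; print(os.environ['PXP_DEMO'])"],
    env_override={"PXP_DEMO": "hello"},
)
demo_output = result.stdout.strip()
```

Passing `env=None` when there is nothing to override avoids copying the environment for every command.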
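@@ -0,0 +1,115 @@
+"""Graph composition: expands graphs with string references into plain :class:`~pyflowx.graph.Graph` objects.
+
+Background: the original ``graph.py`` carried DAG construction/validation/layering as well as multi-graph
+composition, which overloaded it. Composition (:class:`GraphComposer` / :func:`compose`) is orthogonal to
+the single-graph DAG model, so it now lives in its own module for on-demand import and independent evolution.
+"""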
+
+from __future__ import annotations
+
+from dataclasses import replace
+from typing import Any
+
+from .graph import Graph
+from .task import TaskSpec
+
+__all__ = ["GraphComposer", "compose"]
+
+
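+class GraphComposer:
+    """Expand graphs with string references into graphs of plain :class:`TaskSpec`.
+
+    Reference formats:
+    * ``"command_name"``: refers to a whole command graph.
+    * ``"command_name.task_name"``: refers to one specific task.
+
+    References expand in order; tasks of a later reference depend on the last task
+    of the previous one. Plain ``TaskSpec`` entries are likewise chained in order of appearance.
+    """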
+
+    def __init__(self, graphs: dict[str, Graph]) -> None:
+        self.graphs = graphs
+
+    def resolve_all(self) -> dict[str, Graph]:
+        """Resolve the string references of every graph and return the expanded mapping."""
+        resolved: dict[str, Graph] = {}
+        for cmd_name, graph in self.graphs.items():
+            resolved[cmd_name] = self.expand_refs(graph, cmd_name)
+        return resolved
+
+    def expand_refs(self, graph: Graph, current_cmd: str) -> Graph:
+        """Expand the string references in a graph; return it unchanged if ``_pending_refs`` is empty."""
+        pending_refs = graph._pending_refs
+        if not pending_refs:
+            return graph
+
+        all_specs: list[TaskSpec[Any]] = []
+        previous_ref_last_task: str | None = None
+
+        for ref in pending_refs:
+            expanded_specs = self.parse_ref(ref, current_cmd)
+            if previous_ref_last_task and expanded_specs:
+                for i, task in enumerate(expanded_specs):
+                    if i == 0 or not task.depends_on:
+                        expanded_specs[i] = replace(task, depends_on=tuple({*task.depends_on, previous_ref_last_task}))
+            if expanded_specs:
+                previous_ref_last_task = expanded_specs[-1].name
+            all_specs.extend(expanded_specs)
+
+        original_specs = list(graph.all_specs().values())
+        if original_specs:
+            if previous_ref_last_task:
+                first = original_specs[0]
+                all_specs.append(replace(first, depends_on=tuple({*first.depends_on, previous_ref_last_task})))
+            else:
+                all_specs.append(original_specs[0])
+            for i in range(1, len(original_specs)):
+                current_task = original_specs[i]
+                previous_task_name = original_specs[i - 1].name
+                all_specs.append(
+                    replace(current_task, depends_on=tuple({*current_task.depends_on, previous_task_name}))
+                )
+
+        return Graph.from_specs(all_specs, defaults=graph.defaults)
+
+    def parse_ref(self, ref: str, current_cmd: str) -> list[TaskSpec[Any]]:
+        """Parse one string reference and return the matching list of TaskSpec."""
+        if ref == current_cmd:
+            raise ValueError(f"循环引用: 命令 '{current_cmd}' 引用了自己")
+
+        if "." in ref:
+            cmd_name, task_name = ref.split(".", 1)
+            if cmd_name not in self.graphs:
+                raise ValueError(f"引用的命令 '{cmd_name}' 不存在")
+            ref_graph = self.graphs[cmd_name]
+            if task_name not in ref_graph.all_specs():
+                raise ValueError(f"任务 '{task_name}' 不存在于命令 '{cmd_name}' 中")
+            return [ref_graph.all_specs()[task_name]]
+        else:
+            cmd_name = ref
+            if cmd_name not in self.graphs:
+                raise ValueError(f"引用的命令 '{cmd_name}' 不存在")
+            ref_graph = self.graphs[cmd_name]
+            ref_graph = self.expand_refs(ref_graph, cmd_name)
+            return list(ref_graph.all_specs().values())
+
+
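+def compose(
+    graphs: dict[str, Graph],
+) -> dict[str, Graph]:
+    """Programmatically resolve string references across graphs and return the expanded mapping.
+
+    Equivalent to :class:`GraphComposer`, but exposed as a standalone function for programmatic
+    users who do not go through :class:`~pyflowx.runner.CliRunner`.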
+
+    Examples
+    --------
+    >>> graphs = {
+    ...     "build": px.Graph.from_specs([px.TaskSpec("b", cmd=["echo", "b"])]),
+    ...     "all": px.Graph.from_specs(["build", px.TaskSpec("t", cmd=["echo", "t"])]),
+    ... }
+    >>> resolved = px.compose(graphs)
+    >>> "b" in resolved["all"].all_specs()
+    True
+    """
+    return GraphComposer(graphs).resolve_all()
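
Review note: `expand_refs` chains plain specs serially by rebuilding each frozen spec with `dataclasses.replace` and an extended `depends_on`. The sketch below isolates that step. `Spec` is a hypothetical stand-in for `TaskSpec`, and the sketch swaps the set literal used in the diff for `dict.fromkeys`, which de-duplicates while keeping a deterministic order.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Spec:  # hypothetical stand-in for TaskSpec
    name: str
    depends_on: Tuple[str, ...] = ()


def chain(specs: List[Spec]) -> List[Spec]:
    """Make each spec depend on its predecessor while keeping its existing dependencies."""
    chained: List[Spec] = []
    for spec in specs:
        if chained:
            prev = chained[-1].name
            # dict.fromkeys removes duplicates and preserves insertion order
            deps = tuple(dict.fromkeys((*spec.depends_on, prev)))
            spec = replace(spec, depends_on=deps)
        chained.append(spec)
    return chained


chained = chain([Spec("clean"), Spec("lint"), Spec("test", depends_on=("lint",))])
```

Because the specs are frozen, `replace` returns new objects and the inputs stay usable by other command graphs that reference them.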
@@ -11,6 +11,7 @@

 from __future__ import annotations

+import logging
 import os
 import shutil
 import subprocess
@@ -20,6 +21,8 @@ from typing import Any, Callable

 from .task import Condition, Context

+logger = logging.getLogger(__name__)
+
 __all__ = ["BuiltinConditions", "Condition", "Constants"]


@@ -42,14 +45,6 @@ def _static(predicate: Callable[[], bool], name: str) -> Condition:
    return _cond


-def _cond_reason(cond: Condition) -> str | list[str] | None:
-    """获取条件的失败原因：优先返回 ``_reason``，否则返回 ``__name__``。"""
-    reason = getattr(cond, "_reason", None)
-    if reason is not None:
-        return reason
-    return getattr(cond, "__name__", repr(cond))
-
-
 def _cond_name(cond: Condition) -> str:
    """获取条件的可读名称。"""
    return getattr(cond, "__name__", repr(cond))
@@ -161,7 +156,7 @@ class BuiltinConditions:
                return False
            try:
                return content in p.read_text(encoding="utf-8")
-            except Exception:
+            except (OSError, UnicodeDecodeError):
                return False

        return _static(_check, f"FILE_CONTENT_EXISTS({path!r},{content!r})")
@@ -194,7 +189,8 @@ class BuiltinConditions:
                return False
            try:
                return predicate(ctx[dep_name])
-            except Exception:
+            except Exception as exc:
+                logger.warning("DEP_MATCHES predicate for dependency %r raised: %r", dep_name, exc)
                return False

        _cond.__name__ = f"DEP_MATCHES({dep_name!r},{getattr(predicate, '__name__', 'pred')})"
@@ -228,13 +224,7 @@ class BuiltinConditions:
        """对条件取反."""

        def _cond(ctx: Context) -> bool:
-            result = condition(ctx)
-            if result:
-                # inner 为 True 时 NOT 会失败，记录 inner 的具体原因
-                inner_reason = _cond_reason(condition)
-                if inner_reason is not None:
-                    _cond._reason = inner_reason  # type: ignore[attr-defined]
-            return not result
+            return not condition(ctx)

        _cond.__name__ = f"NOT({_cond_name(condition)})"
        return _cond
@@ -254,15 +244,7 @@ class BuiltinConditions:
        """多个条件的逻辑或."""

        def _cond(ctx: Context) -> bool:
-            matched: list[str] = []
-            for c in conditions:
-                if c(ctx):
-                    reason = _cond_reason(c)
-                    matched.append(reason if isinstance(reason, str) else str(reason))
-            if matched:
-                _cond._reason = matched  # type: ignore[attr-defined]
-                return True
-            return False
+            return any(c(ctx) for c in conditions)

        _cond.__name__ = f"OR({', '.join(_cond_name(c) for c in conditions)})"
        return _cond
@@ -16,6 +16,7 @@ DAG 库中泛滥的样板包装器。
 from __future__ import annotations

 import inspect
+from functools import lru_cache
 from typing import Any, Mapping

 from .errors import InjectionError
@@ -24,6 +25,24 @@ from .task import Context, TaskSpec
 __all__ = ["Context", "_is_context_annotation", "build_call_args", "describe_injection"]


+@lru_cache(maxsize=1024)
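+def _cached_signature(fn: Any) -> inspect.Signature:
+    """Cache the result of ``inspect.signature`` (keyed by the ``fn`` object).
+
+    The ``fn`` object is stable once :meth:`TaskSpec.effective_fn` has cached it, so repeated
+    introspection is pure overhead. Unhashable callables make the caller fall back to direct introspection.
+    """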
+    return inspect.signature(fn)
+
+
+def _signature(fn: Any) -> inspect.Signature:
+    """Return the signature, preferring the cache; fall back to direct introspection if ``fn`` is unhashable."""
+    try:
+        return _cached_signature(fn)
+    except TypeError:
+        return inspect.signature(fn)
+
+
 def _is_context_annotation(annotation: Any) -> bool:
    """判断参数标注是否为（或指向）``Context``。"""
    if annotation is Context:
@@ -44,7 +63,7 @@ def build_call_args(
    执行器填入 :attr:`TaskSpec.defaults` 中的默认值）。
    """
    fn = spec.effective_fn
-    sig = inspect.signature(fn)
+    sig = _signature(fn)
    params = sig.parameters

    var_keyword = next(
@@ -115,7 +134,7 @@ def build_call_args(
 def describe_injection(spec: TaskSpec[Any]) -> str:
    """生成任务参数注入方式的人类可读描述。供 ``dry_run`` 使用。"""
    fn = spec.effective_fn
-    sig = inspect.signature(fn)
+    sig = _signature(fn)
    positional_params = [
        p
        for p, param in sig.parameters.items()
@@ -12,14 +12,18 @@

 Architecture
 ------------
-本模块通过 **Mixin** 组合消除同步/异步与各层执行器之间的重复代码：
+This module uses **module-level functions** to remove the code duplicated between sync and async task runners:

-* :class:`_TaskSkipMixin`  —— 上游跳过 / 条件跳过的预检逻辑。
-* :class:`_TaskRetryMixin` —— 重试决策、成功/失败后处理、finalize。
-* :class:`_LayerMixin`     —— 缓存过滤、优先级排序、信号量构建、结果存储。
-* :class:`SyncTaskRunner` / :class:`AsyncTaskRunner` —— 任务级执行器，组合上述 Mixin。
+* Module-level skip/retry functions (:func:`_prepare_for_execution` / :func:`_should_retry`
+  / :func:`_mark_success` / :func:`_handle_failure` / :func:`_finalize_failure`):
+  pre-checks for upstream and condition skips, retry decisions, success/failure post-processing.
+* :class:`SyncTaskRunner` / :class:`AsyncTaskRunner`: task-level runners that call the functions above.
+* Module-level shared helpers (:func:`_filter_and_sort` / :func:`_store_result` /
+  :func:`_build_semaphores` / :func:`_get_sem`): cache filtering, priority sorting,
+  semaphore construction, result storage.
 * :class:`SequentialLayerRunner` / :class:`ThreadedLayerRunner` /
-  :class:`AsyncLayerRunner` / :class:`DependencyRunner` —— 层级执行器，组合 :class:`_LayerMixin`。
+  :class:`AsyncLayerRunner`: layer-level runners that call the module-level helpers above.
+* :class:`DependencyRunner`: dependency-driven scheduling (not layer-based); it calls the same helpers.

 All strategies share one async core, which supports:
 * :class:`RetryPolicy`（max_attempts/delay/backoff/jitter/retry_on）
@@ -37,7 +41,9 @@
 from __future__ import annotations

 import asyncio
+import atexit
 import concurrent.futures
+import contextlib
 import inspect
 import logging
 import threading
@@ -52,7 +58,60 @@ from .report import RunReport
 from .storage import StateBackend, resolve_backend
 from .task import TaskEvent, TaskHooks, TaskResult, TaskSpec, TaskStatus

-logger = logging.getLogger("pyflowx")
+logger = logging.getLogger(__name__)
+
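+# Process-pool reuse: process tasks within one run() share a single ProcessPoolExecutor.
+# The module-level cache avoids creating and destroying a pool for every task.
+# After run() finishes, _shutdown_process_pool() closes it (shutdown(wait=False) plus
+# killing the workers), so interpreter exit does not block for seconds while
+# threading._shutdown waits for the management thread to join the worker processes.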
+_process_pool: concurrent.futures.ProcessPoolExecutor | None = None
+_process_pool_lock = threading.Lock()
+
+
+def _get_process_pool() -> concurrent.futures.ProcessPoolExecutor:
+    """Return the shared process pool (created lazily)."""
+    global _process_pool  # noqa: PLW0603
+    if _process_pool is None:
+        with _process_pool_lock:
+            if _process_pool is None:
+                _process_pool = concurrent.futures.ProcessPoolExecutor()
+    return _process_pool
+
+
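+def _shutdown_process_pool() -> None:
+    """Shut down the shared process pool.
+
+    ``shutdown(wait=False)`` tells the management thread to exit (it is non-daemon, so
+    ``threading._shutdown`` waits for it); the workers are killed as well, so the thread
+    does not block for seconds joining them one by one before it exits.
+    """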
+    global _process_pool  # noqa: PLW0603
+    if _process_pool is not None:
+        pool = _process_pool
+        _process_pool = None
+        # Grab the process list before shutdown (the management thread clears _processes on exit).
+        # _processes is a private attribute of ProcessPoolExecutor; there is no public alternative.
+        procs = list((getattr(pool, "_processes", None) or {}).values())
+        pool.shutdown(wait=False)
+        # Force-kill the workers (SIGKILL on POSIX) so the management thread's join does not block for ~7s.
+        for proc in procs:
+            with contextlib.suppress(ProcessLookupError, AttributeError):
+                proc.kill()  # type: ignore[attr-defined]
+
+
+# Safety net: prevents a leaked pool when runners are used directly, without going through run().
+atexit.register(_shutdown_process_pool)
+
+
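+def _run_in_process(fn: Any, args: tuple[Any, ...], kwargs: dict[str, Any]) -> Any:
+    """Module-level function that runs a task in the process pool (must be picklable).
+
+    Context managers such as env_context cannot cross process boundaries, so ``env``/``cwd``
+    have no effect on process-pool tasks; set up the environment inside ``fn`` if needed.
+    """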
+    return fn(*args, **kwargs)
+

 # 观察者回调类型。
 EventCallback = Callable[[TaskEvent], None]
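
Review note: the shared pool above is created lazily with a double-checked lock and dropped by swapping the module-level reference to `None`. The sketch below shows the same lifecycle. It uses a `ThreadPoolExecutor` as a stand-in so that it runs anywhere without pickling concerns, and `get_pool` / `shutdown_pool` are hypothetical names. Unlike the diff, this sketch also takes the lock while swapping the reference out.

```python
import concurrent.futures
import threading

_pool = None
_pool_lock = threading.Lock()


def get_pool():
    """Return the shared executor, creating it on first use."""
    global _pool
    if _pool is None:  # fast path without the lock
        with _pool_lock:
            if _pool is None:  # re-check once the lock is held
                _pool = concurrent.futures.ThreadPoolExecutor(max_workers=2)
    return _pool


def shutdown_pool():
    """Detach the shared executor and shut it down without waiting."""
    global _pool
    with _pool_lock:
        pool, _pool = _pool, None
    if pool is not None:
        pool.shutdown(wait=False)


first = get_pool()
same = get_pool()
squares = list(first.map(lambda n: n * n, [1, 2, 3]))
shutdown_pool()
fresh = get_pool()  # a new executor is created after shutdown
shutdown_pool()
```

Detaching the reference before calling `shutdown` means a concurrent `get_pool` call never receives an executor that is already shutting down.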
@@ -83,6 +142,22 @@ def _emit(on_event: EventCallback | None, result: TaskResult[Any]) -> None:
    )


+def _emit_running(on_event: EventCallback | None, spec: TaskSpec[Any]) -> None:
+    """Emit a RUNNING event (when a task starts executing)."""
+    if on_event is None:
+        return
+    on_event(
+        TaskEvent(
+            task=spec.name,
+            status=TaskStatus.RUNNING,
+            attempts=0,
+            error=None,
+            duration=None,
+            reason=None,
+        )
+    )
+
+
 def _run_hooks(hooks: TaskHooks, fn_name: str, *args: Any) -> None:
    """安全调用钩子（异常仅记录，不影响任务状态）。"""
    hook: Callable[..., None] | None = getattr(hooks, fn_name, None)
@@ -126,11 +201,16 @@ def _apply_cached(
    backend: StateBackend,
    on_event: EventCallback | None,
 ) -> bool:
-    """若 ``name`` 命中缓存，写入 context/report 并返回 True。"""
+    """If ``name`` hits the cache, write it into context/report and return True.
+
+    Uses a single ``backend.get`` with a ``KeyError`` fallback, avoiding the double hash
+    lookup and double TTL check of ``has`` followed by ``get``.
+    """
    storage_key = spec.storage_key(context)
-    if not backend.has(storage_key):
+    try:
+        cached = backend.get(storage_key)
+    except KeyError:
        return False
-    cached = backend.get(storage_key)
    context[name] = cached
    result = TaskResult(spec=spec, status=TaskStatus.SKIPPED, value=cached, reason="缓存命中")
    report.results[name] = result
@@ -139,154 +219,146 @@ def _apply_cached(
    return True


-def _sort_by_priority(layer: list[str], graph: Graph) -> list[str]:
-    """按优先级降序排序（稳定排序）。"""
-    return sorted(layer, key=lambda n: -graph.resolved_spec(n).priority)
+def _sort_by_priority(layer: list[str], specs: Mapping[str, TaskSpec[Any]]) -> list[str]:
+    """Sort by descending priority (stable sort).

-
-# ---------------------------------------------------------------------- #
-# Mixin：任务级跳过 / 重试 / 成功处理
-# ---------------------------------------------------------------------- #
-class _TaskSkipMixin:
-    """任务级跳过预检共享逻辑。
-
-    将"上游被跳过/失败"与"条件不满足"两类跳过判断统一为单一入口，
-    被 :class:`SyncTaskRunner` 与 :class:`AsyncTaskRunner` 复用。
+    Accepts a prebuilt ``{name: spec}`` mapping so the sort key does not call
+    ``graph.resolved_spec`` repeatedly (even when cached, this saves N dict lookups).
    """
+    return sorted(layer, key=lambda n: -specs[n].priority)

-    @staticmethod
-    def _upstream_skip_reason(spec: TaskSpec[Any], report: RunReport | None) -> str | None:
-        """硬依赖被 SKIPPED/FAILED 时返回原因字符串，否则 ``None``。

-        软依赖不影响本检查——软依赖被跳过时注入默认值。
-        """
-        if report is None or spec.allow_upstream_skip:
-            return None
-        for dep in spec.depends_on:
-            if dep not in report.results:
-                continue
-            dep_status = report.results[dep].status
-            if dep_status in (TaskStatus.SKIPPED, TaskStatus.FAILED):
-                return f"上游任务 '{dep}' 状态为 {dep_status.value}"
+# ---------------------------------------------------------------------- #
+# Task-level skip / retry / success handling: module-level functions
+# ---------------------------------------------------------------------- #
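+def _upstream_skip_reason(spec: TaskSpec[Any], report: RunReport | None) -> str | None:
+    """Return a reason string when a hard dependency is SKIPPED/FAILED, else ``None``.
+
+    Soft dependencies do not affect this check: when one is skipped, its default value is injected.
+    """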
+    if report is None or spec.allow_upstream_skip:
        return None
+    for dep in spec.depends_on:
+        if dep not in report.results:
+            continue
+        dep_status = report.results[dep].status
+        if dep_status in (TaskStatus.SKIPPED, TaskStatus.FAILED):
+            return f"上游任务 '{dep}' 状态为 {dep_status.value}"
+    return None

-    @staticmethod
-    def _prepare_for_execution(
-        spec: TaskSpec[Any],
-        context: Mapping[str, Any],
-        report: RunReport | None,
-        on_event: EventCallback | None,
-    ) -> TaskResult[Any] | None:
-        """执行前预检：上游跳过 / 条件跳过。

-        返回 SKIPPED TaskResult 或 ``None``（继续执行）。
-        条件判断委托给 :meth:`TaskSpec.should_execute`，避免重复实现。
-        """
-        # 1. 上游被跳过/失败
-        skip_reason = _TaskSkipMixin._upstream_skip_reason(spec, report)
-        # 2. 条件 / skip_if_missing（单一来源：TaskSpec.should_execute）
-        if skip_reason is None:
-            should_run, cond_reason = spec.should_execute(context)
-            if not should_run:
-                skip_reason = cond_reason or "条件不满足"
-        if skip_reason is None:
-            return None
-        # 构造 SKIPPED 结果
-        result: TaskResult[Any] = TaskResult(
-            spec=spec,
-            status=TaskStatus.SKIPPED,
-            finished_at=datetime.now(),
-            reason=skip_reason,
+def _prepare_for_execution(
+    spec: TaskSpec[Any],
+    context: Mapping[str, Any],
+    report: RunReport | None,
+    on_event: EventCallback | None,
+) -> TaskResult[Any] | None:
+    """Pre-execution check: upstream skip / condition skip.
+
+    Returns a SKIPPED TaskResult, or ``None`` to proceed with execution.
+    Condition evaluation is delegated to :meth:`TaskSpec.should_execute` to avoid duplicating it.
+    """
+    # 1. Upstream was skipped or failed
+    skip_reason = _upstream_skip_reason(spec, report)
+    # 2. Condition / skip_if_missing (single source of truth: TaskSpec.should_execute)
+    if skip_reason is None:
+        should_run, cond_reason = spec.should_execute(context)
+        if not should_run:
+            skip_reason = cond_reason or "条件不满足"
+    if skip_reason is None:
+        return None
+    # Build the SKIPPED result
+    result: TaskResult[Any] = TaskResult(
+        spec=spec,
+        status=TaskStatus.SKIPPED,
+        finished_at=datetime.now(),
+        reason=skip_reason,
+    )
+    _emit(on_event, result)
+    logger.info("task %r skipped (%s)", spec.name, skip_reason)
+    return result
+
+
+def _should_retry(spec: TaskSpec[Any], attempts: int, exc: BaseException) -> bool:
+    """Return whether another retry should be attempted."""
+    return attempts < spec.retry.max_attempts and spec.retry.should_retry(exc)
+
+
+def _mark_success(spec: TaskSpec[Any], result: TaskResult[Any], value: Any) -> None:
+    """Mark the task as successful and fire the post_run hook."""
+    result.value = value
+    result.status = TaskStatus.SUCCESS
+    result.finished_at = datetime.now()
+    _run_hooks(spec.hooks, "post_run", spec, value)
+
+
+def _finalize_failure(
+    result: TaskResult[Any],
+    layer_idx: int | None,
+    on_event: EventCallback | None,
+    continue_on_error: bool,
+) -> None:
+    """Mark the task as FAILED. Does not raise when ``continue_on_error`` is true."""
+    result.status = TaskStatus.FAILED
+    result.finished_at = datetime.now()
+    _emit(on_event, result)
+    if continue_on_error:
+        logger.warning(
+            "task %r failed but continue_on_error=True; continuing.",
+            result.spec.name,
        )
-        _emit(on_event, result)
-        logger.info("task %r skipped (%s)", spec.name, skip_reason)
-        return result
+        return
+    raise TaskFailedError(
+        task=result.spec.name,
+        cause=result.error if result.error is not None else RuntimeError("unknown"),
+        attempts=result.attempts,
+        layer=layer_idx,
+    )


-class _TaskRetryMixin:
-    """Shared logic for task-level retry decisions and success/failure post-processing."""
+def _handle_failure(
+    spec: TaskSpec[Any],
+    result: TaskResult[Any],
+    exc: BaseException,
+    layer_idx: int | None,
+    on_event: EventCallback | None,
+) -> bool:
+    """Unified failure handling: timeout conversion, retry decision, finalize.

-    @staticmethod
-    def _should_retry(spec: TaskSpec[Any], attempts: int, exc: BaseException) -> bool:
-        """Return whether the task should be retried."""
-        return attempts < spec.retry.max_attempts and spec.retry.should_retry(exc)
-
-    @staticmethod
-    def _mark_success(spec: TaskSpec[Any], result: TaskResult[Any], value: Any) -> None:
-        """Mark the task as successful and run the post_run hooks."""
-        result.value = value
-        result.status = TaskStatus.SUCCESS
-        result.finished_at = datetime.now()
-        _run_hooks(spec.hooks, "post_run", spec, value)
-
-    @staticmethod
-    def _finalize_failure(
-        result: TaskResult[Any],
-        layer_idx: int | None,
-        on_event: EventCallback | None,
-        continue_on_error: bool,
-    ) -> None:
-        """Mark the task as FAILED. Does not raise when ``continue_on_error`` is true."""
-        result.status = TaskStatus.FAILED
-        result.finished_at = datetime.now()
-        _emit(on_event, result)
-        if continue_on_error:
-            logger.warning(
-                "task %r failed but continue_on_error=True; continuing.",
-                result.spec.name,
-            )
-            return
-        raise TaskFailedError(
-            task=result.spec.name,
-            cause=result.error if result.error is not None else RuntimeError("unknown"),
-            attempts=result.attempts,
-            layer=layer_idx,
+    Returns
+    -------
+    bool
+        ``True`` means the result was finalized (no more retries); ``False`` means the task should be retried.
+    """
+    # asyncio.TimeoutError → TaskTimeoutError (unified exception type)
+    if isinstance(exc, asyncio.TimeoutError):
+        exc = TaskTimeoutError(spec.name, spec.timeout or 0.0)
+        logger.warning(
+            "task %r timed out (attempt %d/%d); retrying",
+            spec.name,
+            result.attempts,
+            spec.retry.max_attempts,
        )
-
-    @staticmethod
-    def _handle_failure(
-        spec: TaskSpec[Any],
-        result: TaskResult[Any],
-        exc: BaseException,
-        layer_idx: int | None,
-        on_event: EventCallback | None,
-    ) -> bool:
-        """Unified failure handling: timeout conversion, retry decision, finalize.
-
-        Returns
-        -------
-        bool
-            ``True`` means the result was finalized (no more retries); ``False`` means the task should be retried.
-        """
-        # asyncio.TimeoutError → TaskTimeoutError (unified exception type)
-        if isinstance(exc, asyncio.TimeoutError):
-            exc = TaskTimeoutError(spec.name, spec.timeout or 0.0)
-            logger.warning(
-                "task %r timed out (attempt %d/%d); retrying",
-                spec.name,
-                result.attempts,
-                spec.retry.max_attempts,
-            )
-        else:
-            logger.warning(
-                "task %r failed (attempt %d/%d): %r; retrying",
-                spec.name,
-                result.attempts,
-                spec.retry.max_attempts,
-                exc,
-            )
-        result.error = exc
-        if _TaskRetryMixin._should_retry(spec, result.attempts, exc):
-            return False
-        _run_hooks(spec.hooks, "on_failure", spec, exc)
-        _TaskRetryMixin._finalize_failure(result, layer_idx, on_event, spec.continue_on_error)
-        return True
+    else:
+        logger.warning(
+            "task %r failed (attempt %d/%d): %r; retrying",
+            spec.name,
+            result.attempts,
+            spec.retry.max_attempts,
+            exc,
+        )
+    result.error = exc
+    if _should_retry(spec, result.attempts, exc):
+        return False
+    _run_hooks(spec.hooks, "on_failure", spec, exc)
+    _finalize_failure(result, layer_idx, on_event, spec.continue_on_error)
+    return True
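
The retry decision made by `_should_retry` and `_handle_failure` above can be illustrated with a minimal, self-contained sketch. `RetryPolicy` and `run_with_retry` below are hypothetical stand-ins written for this example, not the project's actual classes; only the shape of the decision (attempts remaining and a retryable exception) is taken from the diff.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Tuple, Type


@dataclass(frozen=True)
class RetryPolicy:
    """Hypothetical stand-in for the project's retry policy (assumption)."""

    max_attempts: int = 3
    delay: float = 0.0
    retry_on: Tuple[Type[BaseException], ...] = (Exception,)

    def should_retry(self, exc: BaseException) -> bool:
        # Retry only for exception types listed in retry_on.
        return isinstance(exc, self.retry_on)

    def wait_seconds(self, attempt: int) -> float:
        # Linear backoff, chosen only to keep the sketch short.
        return self.delay * attempt


def run_with_retry(fn: Callable[[], Any], policy: RetryPolicy) -> Tuple[Any, int]:
    """Call fn until it succeeds or the policy gives up; return (value, attempts)."""
    attempts = 0
    while True:
        attempts += 1
        try:
            return fn(), attempts
        except Exception as exc:
            # Same shape as _should_retry: attempts left AND the error is retryable.
            if not (attempts < policy.max_attempts and policy.should_retry(exc)):
                raise
            wait = policy.wait_seconds(attempts)
            if wait > 0:
                time.sleep(wait)
```

Unlike the runner in the diff, this sketch re-raises the last exception instead of wrapping it in `TaskFailedError`, and it has no `continue_on_error` branch.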


 # ---------------------------------------------------------------------- #
-# Task runners: sync / async (reusing _TaskSkipMixin + _TaskRetryMixin)
+# Task runners: sync / async (calling the module-level skip/retry functions)
 # ---------------------------------------------------------------------- #
-class SyncTaskRunner(_TaskSkipMixin, _TaskRetryMixin):
+class SyncTaskRunner:
    """Synchronous task runner with retries and a pre-run skip check."""

    @staticmethod
@@ -297,7 +369,7 @@ class SyncTaskRunner(_TaskSkipMixin, _TaskRetryMixin):
        on_event: EventCallback | None = None,
        report: RunReport | None = None,
    ) -> TaskResult[Any]:
-        skipped = _TaskSkipMixin._prepare_for_execution(spec, context, report, on_event)
+        skipped = _prepare_for_execution(spec, context, report, on_event)
        if skipped is not None:
            return skipped

@@ -306,23 +378,24 @@ class SyncTaskRunner(_TaskSkipMixin, _TaskRetryMixin):
        args, kwargs = build_call_args(spec, context)

        _run_hooks(spec.hooks, "pre_run", spec)
+        _emit_running(on_event, spec)

        while True:
            result.attempts += 1
            try:
                with spec.env_context():
                    value = spec.effective_fn(*args, **kwargs)
-                _TaskRetryMixin._mark_success(spec, result, value)
+                _mark_success(spec, result, value)
                return result
            except Exception as exc:
-                if _TaskRetryMixin._handle_failure(spec, result, exc, layer_idx, on_event):
+                if _handle_failure(spec, result, exc, layer_idx, on_event):
                    return result
                wait = spec.retry.wait_seconds(result.attempts)
                if wait > 0:
                    time.sleep(wait)


-class AsyncTaskRunner(_TaskSkipMixin, _TaskRetryMixin):
+class AsyncTaskRunner:
    """Asynchronous task runner: runs sync or async tasks on the event loop, with retries and a skip check."""

    @staticmethod
@@ -334,7 +407,7 @@ class AsyncTaskRunner(_TaskSkipMixin, _TaskRetryMixin):
        report: RunReport | None = None,
        semaphore: asyncio.Semaphore | None = None,
    ) -> TaskResult[Any]:
-        skipped = _TaskSkipMixin._prepare_for_execution(spec, context, report, on_event)
+        skipped = _prepare_for_execution(spec, context, report, on_event)
        if skipped is not None:
            return skipped

@@ -345,15 +418,16 @@ class AsyncTaskRunner(_TaskSkipMixin, _TaskRetryMixin):
            loop = asyncio.get_event_loop()

            _run_hooks(spec.hooks, "pre_run", spec)
+            _emit_running(on_event, spec)

            while True:
                result.attempts += 1
                try:
                    value = await _execute_async_task(spec, args, kwargs, loop)
-                    _TaskRetryMixin._mark_success(spec, result, value)
+                    _mark_success(spec, result, value)
                    return result
                except Exception as exc:
-                    if _TaskRetryMixin._handle_failure(spec, result, exc, layer_idx, on_event):
+                    if _handle_failure(spec, result, exc, layer_idx, on_event):
                        return result
                    wait = spec.retry.wait_seconds(result.attempts)
                    if wait > 0:
@@ -372,97 +446,128 @@ async def _execute_async_task(
    loop: asyncio.AbstractEventLoop,
 ) -> Any:
    """Run an async or sync task, applying the timeout if one is set."""
+    # Async tasks are awaited directly
    if _is_async_fn(spec):
        coro = cast(Awaitable[Any], spec.effective_fn(*args, **kwargs))
-        if spec.timeout is not None:
-            return await asyncio.wait_for(coro, timeout=spec.timeout)
-        return await coro
+        return await asyncio.wait_for(coro, timeout=spec.timeout) if spec.timeout is not None else await coro
+
+    # Sync tasks: choose the executor based on spec.executor
+    fut = _submit_sync_task(spec, args, kwargs, loop)
+    return await asyncio.wait_for(fut, timeout=spec.timeout) if spec.timeout is not None else await fut
+
+
+def _submit_sync_task(
+    spec: TaskSpec[Any],
+    args: tuple[Any, ...],
+    kwargs: dict[str, Any],
+    loop: asyncio.AbstractEventLoop,
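+) -> asyncio.Future[Any]:
+    """Submit a sync task to the matching executor and return a Future.
+
+    * ``inline``: call directly on the event-loop thread (blocks the loop, fastest).
+    * ``process``: run in a process pool (bypasses the GIL; fn must be picklable).
+    * ``thread`` (default): run in a thread pool.
+    """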

    def fn_call() -> Any:
        with spec.env_context():
            return spec.effective_fn(*args, **kwargs)

-    if spec.timeout is not None:
-        return await asyncio.wait_for(loop.run_in_executor(None, fn_call), timeout=spec.timeout)
-    return await loop.run_in_executor(None, fn_call)
+    # inline: call directly on the event-loop thread; no thread-pool overhead, but it blocks the loop.
+    if spec.executor == "inline":
+        result = fn_call()
+        fut: asyncio.Future[Any] = loop.create_future()
+        fut.set_result(result)
+        return fut
+
+    # process: run in a process pool, bypassing the GIL; suited to CPU-bound tasks (fn must be picklable).
+    if spec.executor == "process":
+        from functools import partial
+
+        pool = _get_process_pool()
+        proc_fn = partial(_run_in_process, spec.effective_fn, args, kwargs)
+        return loop.run_in_executor(pool, proc_fn)
+
+    # thread (default): run in a thread pool.
+    return loop.run_in_executor(None, fn_call)
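
The three-way dispatch in `_submit_sync_task` can be sketched in isolation as follows. The names `submit_sync`, `call`, and `_get_pool` are assumptions made for this example; the project's `_get_process_pool` and `_run_in_process` helpers are not shown in this hunk and may behave differently.

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor
from functools import partial
from typing import Any, Callable, Optional, Tuple

_pool: Optional[ProcessPoolExecutor] = None  # created lazily, only for "process"


def _get_pool() -> ProcessPoolExecutor:
    """Hypothetical lazy process-pool accessor (assumption)."""
    global _pool
    if _pool is None:
        _pool = ProcessPoolExecutor()
    return _pool


def submit_sync(
    fn: Callable[..., Any],
    args: Tuple[Any, ...],
    executor: str,
    loop: asyncio.AbstractEventLoop,
) -> "asyncio.Future[Any]":
    """Return a future for fn(*args) on the chosen executor."""
    if executor == "inline":
        # Runs right now on the loop thread; the future is already resolved.
        fut = loop.create_future()
        fut.set_result(fn(*args))
        return fut
    if executor == "process":
        # fn and args must be picklable to cross the process boundary.
        return loop.run_in_executor(_get_pool(), partial(fn, *args))
    # Default: the loop's thread pool.
    return loop.run_in_executor(None, partial(fn, *args))


async def call(
    fn: Callable[..., Any],
    *args: Any,
    executor: str = "thread",
    timeout: Optional[float] = None,
) -> Any:
    """Await fn(*args) on the chosen executor, with an optional timeout."""
    fut = submit_sync(fn, args, executor, asyncio.get_running_loop())
    if timeout is not None:
        return await asyncio.wait_for(fut, timeout=timeout)
    return await fut
```

One consequence visible in both the sketch and the diff: with `inline`, the function has already returned before `asyncio.wait_for` is reached, so the timeout cannot interrupt it.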


 # ---------------------------------------------------------------------- #
-# Mixin: shared layer-execution logic
+# Shared helpers: cache filtering, priority sorting, semaphore construction, result storage
 # ---------------------------------------------------------------------- #
-class _LayerMixin:
-    """Shared layer-execution logic: cache filtering, priority sorting, semaphore construction, result storage.
+def _filter_and_sort(
+    layer: list[str],
+    graph: Graph,
+    context: dict[str, Any],
+    report: RunReport,
+    backend: StateBackend,
+    on_event: EventCallback | None,
+) -> list[str]:
+    """Filter out tasks that hit the cache and return the remaining ones sorted by priority.

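-    The four layer runners (sequential/threaded/async/dependency) compose this mixin
-    to remove the "filter cache → sort → run → store result" boilerplate.
+    Pre-builds a ``{name: spec}`` mapping so filtering and sorting share the same resolved specs,
+    avoiding repeated ``graph.resolved_spec`` calls inside ``_sort_by_priority``.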
    """
+    specs: dict[str, TaskSpec[Any]] = {}
+    to_run: list[str] = []
+    for name in layer:
+        spec = graph.resolved_spec(name)
+        specs[name] = spec
+        if not _apply_cached(name, spec, context, report, backend, on_event):
+            to_run.append(name)
+    return _sort_by_priority(to_run, specs)

-    @staticmethod
-    def _filter_and_sort(
-        layer: list[str],
-        graph: Graph,
-        context: dict[str, Any],
-        report: RunReport,
-        backend: StateBackend,
-        on_event: EventCallback | None,
-    ) -> list[str]:
-        """Filter out tasks that hit the cache and return the remaining ones sorted by priority."""
-        to_run: list[str] = []
-        for name in layer:
-            spec = graph.resolved_spec(name)
-            if not _apply_cached(name, spec, context, report, backend, on_event):
-                to_run.append(name)
-        return _sort_by_priority(to_run, graph)

-    @staticmethod
-    def _store_result(
-        name: str,
-        result: TaskResult[Any],
-        graph: Graph,
-        context: dict[str, Any],
-        report: RunReport,
-        backend: StateBackend,
-        on_event: EventCallback | None,
-        context_snapshot: Mapping[str, Any] | None = None,
-    ) -> None:
-        """Store the task result in context/report/backend and emit the event."""
-        context[name] = result.value
-        if result.status == TaskStatus.SUCCESS:
-            spec = graph.resolved_spec(name)
-            task_ctx = _build_context(spec, context_snapshot if context_snapshot is not None else context, report)
-            backend.save(spec.storage_key(task_ctx), result.value)
-        report.results[name] = result
-        _emit(on_event, result)
+def _store_result(
+    name: str,
+    result: TaskResult[Any],
+    spec: TaskSpec[Any],
+    task_ctx: dict[str, Any],
+    context: dict[str, Any],
+    report: RunReport,
+    backend: StateBackend,
+    on_event: EventCallback | None,
+) -> None:
+    """Store the task result in context/report/backend and emit the event.

-    @staticmethod
-    def _build_semaphores(
-        to_run: list[str],
-        graph: Graph,
-        sem_factory: Callable[[int], Any],
-        concurrency_limits: Mapping[str, int],
-    ) -> dict[str, Any]:
-        """Create one semaphore per ``concurrency_key``."""
-        semaphores: dict[str, Any] = {}
-        for name in to_run:
-            spec = graph.resolved_spec(name)
-            key = spec.concurrency_key
-            if key is not None and key not in semaphores:
-                limit = concurrency_limits.get(key, 1)
-                semaphores[key] = sem_factory(limit)
-        return semaphores
+    ``spec`` and ``task_ctx`` are built by the caller before execution and reused here,
+    avoiding repeated ``resolved_spec`` / ``_build_context`` calls.
+    """
+    context[name] = result.value
+    if result.status == TaskStatus.SUCCESS:
+        backend.save(spec.storage_key(task_ctx), result.value)
+    report.results[name] = result
+    _emit(on_event, result)

-    @staticmethod
-    def _get_sem(semaphores: Mapping[str, Any], spec: TaskSpec[Any]) -> Any | None:
-        """Return the semaphore for the task (``None`` when it has no concurrency_key)."""
-        if spec.concurrency_key is None:
-            return None
-        return semaphores.get(spec.concurrency_key)
+
+def _build_semaphores(
+    to_run: list[str],
+    graph: Graph,
+    sem_factory: Callable[[int], Any],
+    concurrency_limits: Mapping[str, int],
+) -> dict[str, Any]:
+    """Create one semaphore per ``concurrency_key``."""
+    semaphores: dict[str, Any] = {}
+    for name in to_run:
+        spec = graph.resolved_spec(name)
+        key = spec.concurrency_key
+        if key is not None and key not in semaphores:
+            limit = concurrency_limits.get(key, 1)
+            semaphores[key] = sem_factory(limit)
+    return semaphores
+
+
+def _get_sem(semaphores: Mapping[str, Any], spec: TaskSpec[Any]) -> Any | None:
+    """Return the semaphore for the task (``None`` when it has no concurrency_key)."""
+    if spec.concurrency_key is None:
+        return None
+    return semaphores.get(spec.concurrency_key)


 # ---------------------------------------------------------------------- #
 # Layer runners
 # ---------------------------------------------------------------------- #
-class SequentialLayerRunner(_LayerMixin):
+class SequentialLayerRunner:
    """Run a layer's tasks one by one (sorted by priority)."""

    @staticmethod
@@ -475,14 +580,14 @@ class SequentialLayerRunner(_LayerMixin):
        layer_idx: int,
        on_event: EventCallback | None,
    ) -> None:
-        for name in SequentialLayerRunner._filter_and_sort(layer, graph, context, report, backend, on_event):
+        for name in _filter_and_sort(layer, graph, context, report, backend, on_event):
            spec = graph.resolved_spec(name)
            task_ctx = _build_context(spec, context, report)
            result = SyncTaskRunner.run(spec, task_ctx, layer_idx, on_event, report)
-            SequentialLayerRunner._store_result(name, result, graph, context, report, backend, on_event)
+            _store_result(name, result, spec, task_ctx, context, report, backend, on_event)


-class ThreadedLayerRunner(_LayerMixin):
+class ThreadedLayerRunner:
    """Run a layer's tasks concurrently in a thread pool."""

    @staticmethod
@@ -497,43 +602,43 @@ class ThreadedLayerRunner(_LayerMixin):
        max_workers: int,
        concurrency_limits: Mapping[str, int],
    ) -> None:
-        to_run = ThreadedLayerRunner._filter_and_sort(layer, graph, context, report, backend, on_event)
+        to_run = _filter_and_sort(layer, graph, context, report, backend, on_event)
        if not to_run:
            return
-        semaphores = ThreadedLayerRunner._build_semaphores(to_run, graph, threading.Semaphore, concurrency_limits)
+        semaphores = _build_semaphores(to_run, graph, threading.Semaphore, concurrency_limits)
        context_snapshot = dict(context)
        lock = threading.Lock()

-        def _run_threaded_task(name: str) -> TaskResult[Any]:
+        def _run_threaded_task(name: str) -> tuple[dict[str, Any], TaskResult[Any]]:
            spec = graph.resolved_spec(name)
            task_ctx = _build_context(spec, context_snapshot, report)
-            sem = ThreadedLayerRunner._get_sem(semaphores, spec)
+            sem = _get_sem(semaphores, spec)
            if sem is not None:
                sem.acquire()
            try:
-                return SyncTaskRunner.run(spec, task_ctx, layer_idx, on_event, report)
+                return task_ctx, SyncTaskRunner.run(spec, task_ctx, layer_idx, on_event, report)
            finally:
                if sem is not None:
                    sem.release()

        with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as pool:
-            future_to_name: dict[concurrent.futures.Future[TaskResult[Any]], str] = {
+            future_to_name: dict[concurrent.futures.Future[tuple[dict[str, Any], TaskResult[Any]]], str] = {
                pool.submit(_run_threaded_task, name): name for name in to_run
            }
-            completed: dict[str, TaskResult[Any]] = {}
+            completed: dict[str, tuple[dict[str, Any], TaskResult[Any]]] = {}
            try:
                for fut in concurrent.futures.as_completed(future_to_name):
                    name = future_to_name[fut]
                    completed[name] = fut.result()
            finally:
                with lock:
-                    for name, result in completed.items():
-                        ThreadedLayerRunner._store_result(
-                            name, result, graph, context, report, backend, on_event, context_snapshot
+                    for name, (task_ctx, result) in completed.items():
+                        _store_result(
+                            name, result, graph.resolved_spec(name), task_ctx, context, report, backend, on_event
                        )


-class AsyncLayerRunner(_LayerMixin):
+class AsyncLayerRunner:
    """Run a layer's tasks concurrently on the event loop."""

    @staticmethod
@@ -547,27 +652,32 @@ class AsyncLayerRunner(_LayerMixin):
        on_event: EventCallback | None,
        concurrency_limits: Mapping[str, int],
    ) -> None:
-        to_run = AsyncLayerRunner._filter_and_sort(layer, graph, context, report, backend, on_event)
+        to_run = _filter_and_sort(layer, graph, context, report, backend, on_event)
        if not to_run:
            return
-        semaphores = AsyncLayerRunner._build_semaphores(to_run, graph, asyncio.Semaphore, concurrency_limits)
+        semaphores = _build_semaphores(to_run, graph, asyncio.Semaphore, concurrency_limits)
        context_snapshot = dict(context)

-        async def _run_async_task(name: str) -> TaskResult[Any]:
+        async def _run_async_task(name: str) -> tuple[dict[str, Any], TaskResult[Any]]:
            spec = graph.resolved_spec(name)
            task_ctx = _build_context(spec, context_snapshot, report)
-            sem = AsyncLayerRunner._get_sem(semaphores, spec)
-            return await AsyncTaskRunner.run(spec, task_ctx, layer_idx, on_event, report, sem)
+            sem = _get_sem(semaphores, spec)
+            result = await AsyncTaskRunner.run(spec, task_ctx, layer_idx, on_event, report, sem)
+            return task_ctx, result

        results = await asyncio.gather(*[_run_async_task(name) for name in to_run])
-        for name, result in zip(to_run, results):
-            AsyncLayerRunner._store_result(name, result, graph, context, report, backend, on_event, context_snapshot)
+        for name, (task_ctx, result) in zip(to_run, results):
+            _store_result(name, result, graph.resolved_spec(name), task_ctx, context, report, backend, on_event)


-class DependencyRunner(_LayerMixin):
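+class DependencyRunner:
    """Dependency-driven scheduling: tasks start as soon as their hard/soft dependencies finish; no layer barrier.

    All tasks are scheduled concurrently via asyncio. Sync tasks are offloaded to a thread pool.
+
+    This class does not inherit the layer mixin: dependency-driven scheduling is not a layer model, so it
+    calls the module-level shared helpers (:func:`_build_semaphores` / :func:`_get_sem` /
+    :func:`_store_result`) directly, which keeps responsibilities clearer.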
    """

    @staticmethod
@@ -580,7 +690,7 @@ class DependencyRunner(_LayerMixin):
        concurrency_limits: Mapping[str, int],
    ) -> None:
        all_names = list(graph.all_specs().keys())
-        semaphores = DependencyRunner._build_semaphores(all_names, graph, asyncio.Semaphore, concurrency_limits)
+        semaphores = _build_semaphores(all_names, graph, asyncio.Semaphore, concurrency_limits)
        futures: dict[str, asyncio.Future[TaskResult[Any]]] = {}

        async def _run_task(name: str) -> TaskResult[Any]:
@@ -598,9 +708,9 @@ class DependencyRunner(_LayerMixin):
            if _apply_cached(name, spec, context, report, backend, on_event):
                return report.results[name]

-            sem = DependencyRunner._get_sem(semaphores, spec)
+            sem = _get_sem(semaphores, spec)
            result = await AsyncTaskRunner.run(spec, task_ctx, None, on_event, report, sem)
-            DependencyRunner._store_result(name, result, graph, context, report, backend, on_event)
+            _store_result(name, result, spec, task_ctx, context, report, backend, on_event)
            return result

        loop = asyncio.get_event_loop()
@@ -617,7 +727,7 @@ def _make_verbose_callback(on_event: EventCallback | None) -> EventCallback:

    def _verbose_callback(event: TaskEvent) -> None:
        dur = f" ({event.duration:.3f}s)" if event.duration is not None else ""
-        if event.status == TaskStatus.RUNNING:  # pragma: no cover
+        if event.status == TaskStatus.RUNNING:
            print(f"[verbose] 任务 {event.task!r} 开始执行...", flush=True)
        elif event.status == TaskStatus.SUCCESS:
            print(f"[verbose] 任务 {event.task!r} 成功{dur}", flush=True)
@@ -638,7 +748,7 @@ def _make_verbose_callback(on_event: EventCallback | None) -> EventCallback:

 def run(
    graph: Graph,
-    strategy: Strategy = "sequential",
+    strategy: Strategy = "dependency",
    *,
    max_workers: int | None = None,
    dry_run: bool = False,
@@ -654,8 +764,8 @@ def run(
    graph:
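        The validated :class:`Graph` to execute.
    strategy:
-        Execution strategy: ``"sequential"`` / ``"thread"`` / ``"async"`` /
-        ``"dependency"``. ``"dependency"`` is dependency-driven scheduling with no layer barrier.
+        Execution strategy: ``"dependency"`` (default; dependency-driven, no layer barrier, maximum parallelism) /
+        ``"sequential"`` / ``"thread"`` / ``"async"`` (layer-barrier model).
    max_workers:
        Thread-pool size for ``"thread"``. Defaults to ``min(32, len(layer))``.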
    dry_run:
@@ -677,33 +787,46 @@ def run(
    TaskFailedError
        Raised when any task still fails after exhausting its retries (unless ``continue_on_error=True``).
    """
-    graph.validate()
-    layers = graph.layers()
-
    if dry_run:
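+        # layers() no longer validates on its own, so validate explicitly before the dry run.
+        graph.validate()
+        layers = graph.layers()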
        _print_dry_run(graph, layers)
        return RunReport(success=True)

+    # Validate once at the entry point: all strategies share it, so the layers() and dependency
+    # paths do not each call validate() again.
+    graph.validate()
+
    effective_callback: EventCallback | None = _make_verbose_callback(on_event) if verbose else on_event
    backend = resolve_backend(state)
    report = RunReport()
    context: dict[str, Any] = {}
    limits = concurrency_limits or {}

-    try:
-        if strategy == "sequential":
-            _drive_sequential(graph, layers, context, report, backend, effective_callback)
-        elif strategy == "thread":
-            _drive_threaded(graph, layers, context, report, backend, effective_callback, max_workers, limits)
-        elif strategy == "async":
-            asyncio.run(_async_drive(graph, layers, context, report, backend, effective_callback, limits))
-        elif strategy == "dependency":
-            asyncio.run(DependencyRunner.execute(graph, context, report, backend, effective_callback, limits))
-        else:
-            raise ValueError(f"Unknown strategy: {strategy!r}")
-    except TaskFailedError:
-        report.success = False
-        raise
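+    # backend.batch(): reduces one disk write per task to one per run (JSONBackend);
+    # a no-op for MemoryBackend. Even if TaskFailedError is raised midway, the batch still
+    # flushes once on exit, keeping the results of tasks that already succeeded so the run can resume.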
+    with backend.batch():
+        try:
+            if strategy == "sequential":
+                layers = graph.layers()
+                _drive_sequential(graph, layers, context, report, backend, effective_callback)
+            elif strategy == "thread":
+                layers = graph.layers()
+                _drive_threaded(graph, layers, context, report, backend, effective_callback, max_workers, limits)
+            elif strategy == "async":
+                layers = graph.layers()
+                asyncio.run(_async_drive(graph, layers, context, report, backend, effective_callback, limits))
+            elif strategy == "dependency":
+                asyncio.run(DependencyRunner.execute(graph, context, report, backend, effective_callback, limits))
+            else:
+                raise ValueError(f"Unknown strategy: {strategy!r}")
+        except TaskFailedError:
+            report.success = False
+            raise
+        finally:
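+            # Shut down the process pool: tell the manager thread to exit and kill the workers, so that
+            # threading._shutdown does not block for ~7s waiting for the manager thread to join them.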
+            _shutdown_process_pool()

    return report
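
The `with backend.batch():` wrapper in `run()` relies on a backend that defers writes while a batch is open and flushes once on exit, even when an exception escapes. A minimal sketch of that pattern is below; `BatchingStore` is a hypothetical class for illustration and is not the project's `JSONBackend`, whose implementation is not part of this hunk.

```python
import json
from contextlib import contextmanager
from typing import Any, Dict, Iterator, List


class BatchingStore:
    """Hypothetical key-value store that can defer writes while batching."""

    def __init__(self) -> None:
        self.data: Dict[str, Any] = {}
        # Each flush appends one serialized snapshot; a real backend
        # would write the snapshot to disk instead.
        self.flushed: List[str] = []
        self._batch_depth = 0

    def _flush(self) -> None:
        self.flushed.append(json.dumps(self.data, sort_keys=True))

    def save(self, key: str, value: Any) -> None:
        self.data[key] = value
        if self._batch_depth == 0:
            # Outside a batch every save is written immediately.
            self._flush()

    @contextmanager
    def batch(self) -> Iterator[None]:
        self._batch_depth += 1
        try:
            yield
        finally:
            # The finally clause runs on success and on error alike,
            # so results saved before a failure are still persisted.
            self._batch_depth -= 1
            if self._batch_depth == 0:
                self._flush()
```

Tracking a depth counter rather than a boolean lets nested `batch()` calls collapse into a single flush when the outermost block exits.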

@@ -17,12 +17,13 @@ __all__ = [
    "GraphDefaults",
 ]

+import inspect
 import sys
 from dataclasses import dataclass, field, replace
 from typing import Any, Callable, Iterable, Mapping, Sequence

 from .errors import CycleError, DuplicateTaskError, MissingDependencyError
-from .task import RetryPolicy, TaskSpec
+from .task import Context, RetryPolicy, TaskSpec

 if sys.version_info >= (3, 9):  # pragma: no cover
    import graphlib  # pyright: ignore[reportUnreachable]
@@ -63,6 +64,74 @@ def _prune_deps(spec: TaskSpec[Any], keep: Callable[[str], bool]) -> TaskSpec[An
    )


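+def _make_namespaced_fn(orig_fn: Any, ns: str, dep_names: set[str]) -> Any:
+    """Wrap fn so it accepts ``ns:``-prefixed dependency names and maps them back to the original parameters.
+
+    After a namespaced merge, dependency names carry a prefix (e.g. ``build:extract``), but Python
+    parameter names cannot contain ``:``. The wrapper takes all dependencies via ``**kwargs`` and maps
+    the prefixed names back to the original parameter names before calling the original fn.
+
+    Returns the original fn unchanged when there are no dependency parameters.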
+    """
+    if not dep_names or orig_fn is None:
+        return orig_fn
+    try:
+        orig_sig = inspect.signature(orig_fn)
+    except (TypeError, ValueError):
+        return orig_fn
+
+    # Prefixed dependency name -> original parameter name
+    name_map: dict[str, str] = {f"{ns}:{orig}": orig for orig in dep_names}
+    prefix = f"{ns}:"
+
+    # Check whether the original fn has a parameter annotated with Context
+    context_param_name: str | None = None
+    for p in orig_sig.parameters.values():
+        ann = p.annotation
+        if ann is not Context and not (isinstance(ann, str) and ann.endswith("Context")):
+            continue
+        context_param_name = p.name
+        break
+
+    if context_param_name is not None:
+
+        def wrapper(ctx: Any = None, **kwargs: Any) -> Any:
+            # ctx is the dep_context keyed by prefixed dependency names; map the keys back to the originals
+            orig_ctx: dict[str, Any] = {}
+            for k, v in (ctx or {}).items():
+                orig_ctx[name_map.get(k, k)] = v
+            # Prefixed dependencies in kwargs are also mapped back to the original parameter names
+            for k, v in kwargs.items():
+                if k in name_map:
+                    orig_ctx[name_map[k]] = v
+            return orig_fn(**{context_param_name: orig_ctx})
+
+        ctx_param = inspect.Parameter("ctx", inspect.Parameter.POSITIONAL_OR_KEYWORD, annotation=Context)
+        kw_param = inspect.Parameter("kwargs", inspect.Parameter.VAR_KEYWORD)
+        wrapper.__signature__ = inspect.Signature(  # type: ignore[attr-defined]
+            parameters=[ctx_param, kw_param],
+            return_annotation=orig_sig.return_annotation,
+        )
+    else:
+
+        def wrapper(**kwargs: Any) -> Any:  # type: ignore[no-redef]
+            orig_kwargs: dict[str, Any] = {}
+            for k, v in kwargs.items():
+                if k.startswith(prefix):
+                    orig_kwargs[k[len(prefix) :]] = v
+            return orig_fn(**orig_kwargs)
+
+        kw_param = inspect.Parameter("kwargs", inspect.Parameter.VAR_KEYWORD)
+        wrapper.__signature__ = inspect.Signature(  # type: ignore[attr-defined]
+            parameters=[kw_param],
+            return_annotation=orig_sig.return_annotation,
+        )
+
+    wrapper.__name__ = f"{ns}_{getattr(orig_fn, '__name__', 'fn')}"
+    wrapper.__doc__ = getattr(orig_fn, "__doc__", None)
+    return wrapper
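
The core idea of `_make_namespaced_fn`, renaming prefixed keyword arguments and advertising a `**kwargs` signature, can be shown with a reduced sketch. The helper name `namespaced` is an assumption, and Context-annotated parameters are not handled here. Note one deliberate difference: this sketch passes unprefixed keys through unchanged, whereas the non-Context wrapper in the diff forwards only keys that start with the prefix.

```python
import inspect
from typing import Any, Callable, Set


def namespaced(fn: Callable[..., Any], ns: str, dep_names: Set[str]) -> Callable[..., Any]:
    """Wrap fn so that keys like 'ns:dep' are passed to it as 'dep'."""
    if not dep_names:
        return fn
    # Prefixed dependency name -> original parameter name.
    name_map = {f"{ns}:{name}": name for name in dep_names}

    def wrapper(**kwargs: Any) -> Any:
        # Prefixed keys are renamed; every other key is forwarded as-is.
        return fn(**{name_map.get(k, k): v for k, v in kwargs.items()})

    # Advertise a bare **kwargs signature so a caller that inspects the
    # signature will pass dependencies by (prefixed) keyword.
    wrapper.__signature__ = inspect.Signature(  # type: ignore[attr-defined]
        parameters=[inspect.Parameter("kwargs", inspect.Parameter.VAR_KEYWORD)],
        return_annotation=inspect.signature(fn).return_annotation,
    )
    wrapper.__name__ = f"{ns}_{getattr(fn, '__name__', 'fn')}"
    return wrapper
```

The wrapper must be called with dict unpacking, for example `wrapped(**{"build:extract": 21})`, because `build:extract` is not a valid Python identifier.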
+
+
 @dataclass
 class Graph:
    """A validated directed acyclic task graph.
@@ -78,10 +147,15 @@ class Graph:
    specs: dict[str, TaskSpec[Any]] = field(default_factory=dict)
    deps: dict[str, tuple[str, ...]] = field(default_factory=dict)
    defaults: GraphDefaults = field(default_factory=GraphDefaults)
+    namespace: str | None = None

    # Pending string references (consumed by GraphComposer); empty means there are none.
    _pending_refs: list[str] = field(default_factory=list)

+    # resolved_spec cache: avoids repeating the dataclasses.replace check several times per task at run time.
+    # Invalidated when specs / defaults change.
+    _resolved_cache: dict[str, TaskSpec[Any]] = field(default_factory=dict)
+
    # ------------------------------------------------------------------ #
    # Construction
    # ------------------------------------------------------------------ #
@@ -91,18 +165,43 @@ class Graph:
        self._validate_references()
        return self

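+    def chain(self, *specs: TaskSpec[Any]) -> Graph:
+        """Register tasks as a chain: each spec automatically depends on the previous one.
+
+        ``chain(a, b, c)`` makes ``b`` depend on ``a`` and ``c`` depend on ``b``.
+        If a spec already has ``depends_on``, the predecessor's name is prepended to the existing dependencies.
+        Returns ``self`` to support method chaining.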
+
+        Examples
+        --------
+        >>> graph = px.Graph().chain(extract, transform, load)
+        """
+        prev_name: str | None = None
+        for s in specs:
+            current = s
+            if prev_name is not None:
+                # Prepend the predecessor to depends_on; explicitly declared dependencies are kept after it
+                new_deps = (prev_name, *s.depends_on) if prev_name not in s.depends_on else s.depends_on
+                current = replace(s, depends_on=new_deps)
+            self.add(current)
+            prev_name = current.name
+        return self
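
The dependency rewriting performed by `chain` can be reproduced with a small stand-alone sketch. The `Spec` dataclass and the free function `chain` below are hypothetical simplifications (the real method registers specs on a `Graph` and returns `self`); only the "prepend the predecessor unless already present" rule is taken from the diff.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Spec:
    """Hypothetical minimal task spec: a name plus hard dependencies."""

    name: str
    depends_on: Tuple[str, ...] = ()


def chain(*specs: Spec) -> Dict[str, Spec]:
    """Link specs so that each one depends on the one before it."""
    registered: Dict[str, Spec] = {}
    prev: Optional[str] = None
    for spec in specs:
        if prev is not None and prev not in spec.depends_on:
            # Prepend the predecessor; existing dependencies are kept.
            spec = replace(spec, depends_on=(prev, *spec.depends_on))
        registered[spec.name] = spec
        prev = spec.name
    return registered
```

Because `replace` returns a copy, the specs passed in are left untouched and can be reused in another graph.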
+
    def _register(self, spec: TaskSpec[Any]) -> None:
        if spec.name in self.specs:
            raise DuplicateTaskError(spec.name)
        self.specs[spec.name] = spec
        # Topological deps hold hard dependencies only; soft dependencies are for injection and do not affect layering.
        self.deps[spec.name] = spec.depends_on
+        self._resolved_cache.clear()

    @classmethod
    def from_specs(
        cls,
        specs: Iterable[TaskSpec[Any] | str],
        defaults: GraphDefaults | None = None,
+        *,
+        namespace: str | None = None,
    ) -> Graph:
        """Build a graph from an iterable of task specs.

@@ -115,8 +214,10 @@ class Graph:
            A list of TaskSpec objects or string references.
        defaults:
            Graph-level defaults. ``None`` uses an empty :class:`GraphDefaults`.
+        namespace:
+            Optional namespace, used as the prefix when merging via :meth:`add_subgraph`.
        """
-        graph = cls(defaults=defaults or GraphDefaults())
+        graph = cls(defaults=defaults or GraphDefaults(), namespace=namespace)
        pending_refs: list[str] = []

        for spec in specs:
@@ -134,6 +235,46 @@ class Graph:
        graph.validate()
        return graph

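+    def add_subgraph(self, sub: Graph, *, namespace: str | None = None) -> Graph:
+        """Merge a subgraph into this graph, prefixing task names with a namespace to avoid collisions.
+
+        Parameters
+        ----------
+        sub:
+            The subgraph to merge.
+        namespace:
+            Namespace prefix. When ``None``, ``sub.namespace`` is used; if the subgraph has no namespace
+            either, ``ValueError`` is raised. The final task name is ``f"{ns}:{original_name}"``.
+
+        After merging, dependency names that refer to tasks inside the subgraph are prefixed as well;
+        dependencies on tasks outside the subgraph are left unchanged.
+
+        Returns ``self`` to support method chaining.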
+        """
+        ns = namespace or sub.namespace
+        if not ns:
+            raise ValueError("add_subgraph 需要 namespace 或子图自带 namespace")
+
+        def _rename(name: str) -> str:
+            # Only prefix task names internal to the subgraph; external dependencies are left unchanged
+            return f"{ns}:{name}" if name in sub.specs else name
+
+        sub_names = set(sub.specs.keys())
+        for spec in sub.specs.values():
+            # Dependency names internal to the subgraph need the prefix, and the matching fn parameters need wrapping
+            internal_deps = (set(spec.depends_on) | set(spec.soft_depends_on)) & sub_names
+            new_fn = _make_namespaced_fn(spec.fn, ns, internal_deps) if spec.fn else spec.fn
+            new_spec = replace(
+                spec,
+                name=_rename(spec.name),
+                fn=new_fn,
+                depends_on=tuple(_rename(d) for d in spec.depends_on),
+                soft_depends_on=tuple(_rename(d) for d in spec.soft_depends_on),
+            )
+            self._register(new_spec)
+        self._validate_references()
+        self.validate()
+        return self
+
    # ------------------------------------------------------------------ #
    # 校验
    # ------------------------------------------------------------------ #
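The renaming rule used by `add_subgraph` can be looked at in isolation. The sketch below is a minimal stand-in, not PyFlowX code: `MiniSpec` and `merge_with_namespace` are hypothetical names, and plain dicts take the place of `Graph`. It only shows how dependency names internal to the subgraph gain the `ns:` prefix while external ones are left alone; the wrapping of `fn` that the diff delegates to `_make_namespaced_fn` is deliberately omitted.

```python
from __future__ import annotations

from dataclasses import dataclass, replace
from typing import Dict, Tuple


@dataclass(frozen=True)
class MiniSpec:
    """Hypothetical stand-in for TaskSpec: a name plus hard dependencies."""

    name: str
    depends_on: Tuple[str, ...] = ()


def merge_with_namespace(
    parent: Dict[str, MiniSpec], sub: Dict[str, MiniSpec], ns: str
) -> Dict[str, MiniSpec]:
    """Return a new mapping with `sub` merged into `parent` under the `ns:` prefix."""
    if not ns:
        raise ValueError("a non-empty namespace is required")

    def rename(name: str) -> str:
        # Only names defined inside the subgraph are prefixed;
        # references to tasks outside it are kept as they are.
        return f"{ns}:{name}" if name in sub else name

    merged = dict(parent)
    for spec in sub.values():
        new_spec = replace(
            spec,
            name=rename(spec.name),
            depends_on=tuple(rename(d) for d in spec.depends_on),
        )
        if new_spec.name in merged:
            raise KeyError(f"duplicate task name: {new_spec.name}")
        merged[new_spec.name] = new_spec
    return merged


parent = {"setup": MiniSpec("setup")}
sub = {
    "build": MiniSpec("build", depends_on=("setup",)),  # external dependency
    "test": MiniSpec("test", depends_on=("build",)),  # internal dependency
}
merged = merge_with_namespace(parent, sub, "ci")
print(sorted(merged))
print(merged["ci:test"].depends_on)
```

Because the prefix is applied only to names found in the subgraph, a subgraph can still depend on tasks that already live in the parent graph without those references being rewritten.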
@@ -175,7 +316,12 @@ class Graph:
        对于 ``retry``/``timeout``/``strategy``/``env``/``cwd`` 等可空
        字段，若 spec 字段为默认空值且图级默认值非空，则用
        :func:`dataclasses.replace` 生成带默认值的副本。
+
+        Results are cached by ``name``; the cache is invalidated when specs or defaults change.
        """
+        cached = self._resolved_cache.get(name)
+        if cached is not None:
+            return cached
        spec = self.specs[name]
        d = self.defaults
        overrides: dict[str, Any] = {}
@@ -199,9 +345,9 @@ class Graph:
            overrides["verbose"] = True
        if not spec.tags and d.tags:
            overrides["tags"] = d.tags
-        if not overrides:
-            return spec
-        return replace(spec, **overrides)
+        resolved = spec if not overrides else replace(spec, **overrides)
+        self._resolved_cache[name] = resolved
+        return resolved

    def dependencies(self, name: str) -> tuple[str, ...]:
        """``name`` 的直接硬依赖前驱。"""
@@ -221,8 +367,11 @@ class Graph:

        同层任务无相互硬依赖，可并发执行。软依赖不参与分层。
        层按执行顺序返回。图有环时抛出 :class:`CycleError`。
+
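+        .. note::
+            This method assumes the graph has already passed :meth:`validate`
+            (:func:`pyflowx.run` does this once at its entry point). Callers that
+            invoke this method directly must validate the graph themselves first.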
        """
-        self.validate()
        sorter = _TopologicalSorter(self.deps)
        result: list[list[str]] = []
        sorter.prepare()
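The layering that `layers()` performs can be sketched with the standard library alone. This is an illustration under assumptions, not the project's implementation: it uses `graphlib.TopologicalSorter` (Python 3.9+, whereas the diff goes through a `_TopologicalSorter` alias for 3.8 compatibility), and `layers_of` is a hypothetical name. Each round collects every node whose predecessors are done, which is why tasks in the same layer have no hard dependency on each other.

```python
from graphlib import CycleError, TopologicalSorter


def layers_of(deps):
    """Group nodes into layers; `deps` maps each node to its hard predecessors."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises CycleError if the graph contains a cycle
    result = []
    while sorter.is_active():
        # Every node returned here has all of its predecessors marked done.
        ready = sorted(sorter.get_ready())
        result.append(ready)
        sorter.done(*ready)
    return result


diamond = {"a": (), "b": ("a",), "c": ("a",), "d": ("b", "c")}
diamond_layers = layers_of(diamond)
print(diamond_layers)

try:
    layers_of({"x": ("y",), "y": ("x",)})
    cycle_detected = False
except CycleError:
    cycle_detected = True
print(cycle_detected)
```

Since the diff removes the `validate()` call from `layers()`, a cycle now surfaces only through the sorter's own `prepare()` step, as in the second half of the sketch.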
@@ -355,102 +504,3 @@ class Graph:

    def __contains__(self, name: Any) -> bool:
        return name in self.specs
-
-
-class GraphComposer:
-    """将带字符串引用的图展开为纯 :class:`TaskSpec` 图。
-
-    引用格式：
-    * ``"command_name"`` —— 引用整个命令图。
-    * ``"command_name.task_name"`` —— 引用特定任务。
-
-    引用按顺序展开，后续引用的任务依赖前面引用的最后一个任务；
-    原始 ``TaskSpec`` 之间也按出现顺序串行依赖。
-    """
-
-    def __init__(self, graphs: dict[str, Graph]) -> None:
-        self.graphs = graphs
-
-    def resolve_all(self) -> dict[str, Graph]:
-        """解析所有图的字符串引用，返回展开后的新图映射。"""
-        resolved: dict[str, Graph] = {}
-        for cmd_name, graph in self.graphs.items():
-            resolved[cmd_name] = self.expand_refs(graph, cmd_name)
-        return resolved
-
-    def expand_refs(self, graph: Graph, current_cmd: str) -> Graph:
-        """展开图中的字符串引用。若无 ``_pending_refs``，原样返回。"""
-        pending_refs = graph._pending_refs
-        if not pending_refs:
-            return graph
-
-        all_specs: list[TaskSpec[Any]] = []
-        previous_ref_last_task: str | None = None
-
-        for ref in pending_refs:
-            expanded_specs = self.parse_ref(ref, current_cmd)
-            if previous_ref_last_task and expanded_specs:
-                for i, task in enumerate(expanded_specs):
-                    if i == 0 or not task.depends_on:
-                        expanded_specs[i] = replace(task, depends_on=tuple({*task.depends_on, previous_ref_last_task}))
-            if expanded_specs:
-                previous_ref_last_task = expanded_specs[-1].name
-            all_specs.extend(expanded_specs)
-
-        original_specs = list(graph.all_specs().values())
-        if original_specs:
-            if previous_ref_last_task:
-                first = original_specs[0]
-                all_specs.append(replace(first, depends_on=tuple({*first.depends_on, previous_ref_last_task})))
-            else:
-                all_specs.append(original_specs[0])
-            for i in range(1, len(original_specs)):
-                current_task = original_specs[i]
-                previous_task_name = original_specs[i - 1].name
-                all_specs.append(
-                    replace(current_task, depends_on=tuple({*current_task.depends_on, previous_task_name}))
-                )
-
-        return Graph.from_specs(all_specs, defaults=graph.defaults)
-
-    def parse_ref(self, ref: str, current_cmd: str) -> list[TaskSpec[Any]]:
-        """解析单个字符串引用，返回对应的 TaskSpec 列表。"""
-        if ref == current_cmd:
-            raise ValueError(f"循环引用: 命令 '{current_cmd}' 引用了自己")
-
-        if "." in ref:
-            cmd_name, task_name = ref.split(".", 1)
-            if cmd_name not in self.graphs:
-                raise ValueError(f"引用的命令 '{cmd_name}' 不存在")
-            ref_graph = self.graphs[cmd_name]
-            if task_name not in ref_graph.all_specs():
-                raise ValueError(f"任务 '{task_name}' 不存在于命令 '{cmd_name}' 中")
-            return [ref_graph.all_specs()[task_name]]
-        else:
-            cmd_name = ref
-            if cmd_name not in self.graphs:
-                raise ValueError(f"引用的命令 '{cmd_name}' 不存在")
-            ref_graph = self.graphs[cmd_name]
-            ref_graph = self.expand_refs(ref_graph, cmd_name)
-            return list(ref_graph.all_specs().values())
-
-
-def compose(
-    graphs: dict[str, Graph],
-) -> dict[str, Graph]:
-    """编程式解析多图的字符串引用，返回展开后的新图映射。
-
-    与 :class:`GraphComposer` 等价，但作为独立函数暴露，供不使用
-    :class:`~pyflowx.runner.CliRunner` 的编程式用户调用。
-
-    Examples
-    --------
-    >>> graphs = {
-    ...     "build": px.Graph.from_specs([px.TaskSpec("b", cmd=["echo", "b"])]),
-    ...     "all": px.Graph.from_specs(["build", px.TaskSpec("t", cmd=["echo", "t"])]),
-    ... }
-    >>> resolved = px.compose(graphs)
-    >>> "b" in resolved["all"].all_specs()
-    True
-    """
-    return GraphComposer(graphs).resolve_all()
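@@ -0,0 +1,705 @@
+"""Workflow execution profiling.
+
+Offline analysis based on the ``started_at`` / ``finished_at`` timestamps already
+recorded in :class:`~pyflowx.report.RunReport`. It adds **no overhead to the run
+itself**: the execution flow is not modified, no callbacks are registered, and no
+extra timers are introduced.
+
+Core metrics
+------------
+* **Task level**: each task's wall-clock duration, status, attempt count, and wait
+  time (from the moment its last hard dependency finished until the task started).
+* **Graph level**: total wall-clock duration, critical-path duration (the
+  theoretical minimum duration), and parallelism efficiency (critical-path
+  duration / total duration).
+* **Critical path**: the longest dependency path from a source to a sink, which
+  identifies the real serial bottleneck.
+* **Parallelism**: instantaneous parallelism computed from timeline overlap,
+  reported as average and peak parallelism.
+* **Bottlenecks**: the Top-N tasks ranked by duration.
+
+Design principles
+-----------------
+* Data comes from ``RunReport`` + ``Graph`` only; there are no side effects.
+* Critical-path analysis is O(V+E) (topological order plus a single relaxation
+  pass); the parallelism sweep sorts 2V events, so it is O(V log V).
+* All timestamps are ``datetime`` objects, consistent with :class:`TaskResult`.
+
+Quick start
+-----------
+::
+
+    import pyflowx as px
+
+    report = px.run(graph)
+    profile = px.ProfileReport.from_report(report, graph)
+    print(profile.describe())
+    bottlenecks = profile.top_bottlenecks(3)
+"""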
+
+from __future__ import annotations
+
+__all__ = [
+    "ProfileReport",
+    "TaskProfile",
+]
+
+from dataclasses import dataclass
+from datetime import datetime
+from typing import Any
+
+from .graph import Graph
+from .report import RunReport
+from .task import TaskResult, TaskStatus
+
+
+@dataclass(frozen=True)
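+class TaskProfile:
+    """Performance profile of a single task.
+
+    Attributes
+    ----------
+    name:
+        Task name.
+    status:
+        Final status (SUCCESS/FAILED/SKIPPED).
+    duration:
+        Wall-clock execution time in seconds. ``0.0`` for SKIPPED tasks.
+    attempts:
+        Number of attempts, including the first.
+    wait_time:
+        Seconds between the finish of the task's last-finishing hard dependency
+        and the start of this task. ``0.0`` when the task has no hard
+        dependencies or was SKIPPED.
+    is_on_critical_path:
+        Whether the task lies on the critical path.
+    deps:
+        Names of the task's hard dependencies.
+    """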
+
+    name: str
+    status: TaskStatus
+    duration: float
+    attempts: int
+    wait_time: float
+    is_on_critical_path: bool
+    deps: tuple[str, ...]
+
+    def to_dict(self) -> dict[str, Any]:
+        """Convert to a JSON-friendly dictionary."""
+        return {
+            "name": self.name,
+            "status": self.status.value,
+            "duration_seconds": round(self.duration, 6),
+            "attempts": self.attempts,
+            "wait_time_seconds": round(self.wait_time, 6),
+            "is_on_critical_path": self.is_on_critical_path,
+            "deps": list(self.deps),
+        }
+
+
+@dataclass(frozen=True)
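+class ProfileReport:
+    """Performance profile report for one workflow run.
+
+    Built from a :class:`RunReport` plus a :class:`Graph` via :meth:`from_report`.
+    All fields are computed once at construction time, so later field access is O(1).
+    """
+
+    tasks: tuple[TaskProfile, ...]
+    """Profiles of all tasks, in the iteration order of ``report.results`` (assumed to be topological)."""
+
+    total_duration: float
+    """Wall-clock duration of the whole run, in seconds."""
+
+    critical_path_duration: float
+    """Critical-path duration in seconds: the sum of task durations along the longest dependency path."""
+
+    critical_path: tuple[str, ...]
+    """Task names on the critical path, in execution order."""
+
+    avg_parallelism: float
+    """Average parallelism = summed task run time / wall-clock duration."""
+
+    peak_parallelism: int
+    """Peak parallelism: the maximum number of tasks running at the same instant."""
+
+    parallelism_efficiency: float
+    """Parallelism efficiency = critical-path duration / wall-clock duration.
+
+    A value near ``1.0`` means the run took about as long as its critical path, so the
+    critical path is the bottleneck and extra parallelism would bring little gain (a
+    fully serial chain is one such case). Lower values mean time was spent outside the
+    critical path, for example waiting for a free worker."""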
+
+    # ------------------------------------------------------------------ #
+    # 构建
+    # ------------------------------------------------------------------ #
+    @classmethod
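+    def from_report(cls, report: RunReport, graph: Graph) -> ProfileReport:
+        """Build a performance profile from a run report and its graph.
+
+        Parameters
+        ----------
+        report:
+            A finished :class:`RunReport`; its results must carry ``started_at``/``finished_at``.
+        graph:
+            The corresponding :class:`Graph`, used for dependency and critical-path analysis.
+
+        Note
+        ----
+        This method does not modify ``report`` or ``graph``; it is a pure computation.
+        """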
+        task_profiles = cls._build_task_profiles(report, graph)
+        total_duration = cls._calc_total_duration(report)
+        critical_path, critical_duration = cls._calc_critical_path(graph, report)
+        avg_par, peak_par = cls._calc_parallelism(report)
+        efficiency = critical_duration / total_duration if total_duration > 0 else 0.0
+
+        # Mark the tasks that lie on the critical path.
+        critical_set = set(critical_path)
+        marked = tuple(
+            TaskProfile(
+                name=t.name,
+                status=t.status,
+                duration=t.duration,
+                attempts=t.attempts,
+                wait_time=t.wait_time,
+                is_on_critical_path=t.name in critical_set,
+                deps=t.deps,
+            )
+            for t in task_profiles
+        )
+
+        return cls(
+            tasks=marked,
+            total_duration=total_duration,
+            critical_path_duration=critical_duration,
+            critical_path=critical_path,
+            avg_parallelism=avg_par,
+            peak_parallelism=peak_par,
+            parallelism_efficiency=efficiency,
+        )
+
+    @staticmethod
+    def _build_task_profiles(report: RunReport, graph: Graph) -> tuple[TaskProfile, ...]:
+        """Build the performance profile of every task."""
+        profiles: list[TaskProfile] = []
+        for name, result in report.results.items():
+            spec = graph.specs.get(name)
+            deps = tuple(spec.depends_on) if spec is not None else ()
+            duration = result.duration or 0.0
+            wait_time = ProfileReport._calc_wait_time(result, deps, report)
+            profiles.append(
+                TaskProfile(
+                    name=name,
+                    status=result.status,
+                    duration=duration,
+                    attempts=result.attempts,
+                    wait_time=wait_time,
+                    is_on_critical_path=False,  # marked later in from_report
+                    deps=deps,
+                )
+            )
+        return tuple(profiles)
+
+    @staticmethod
+    def _calc_wait_time(
+        result: TaskResult[Any],
+        deps: tuple[str, ...],
+        report: RunReport,
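+    ) -> float:
+        """Compute the wait time: from the finish of the last-finishing dependency to this task's start.
+
+        Returns 0.0 when there are no hard dependencies, the task was SKIPPED, or timestamps are missing.
+        """
+        if not deps or result.started_at is None or result.status == TaskStatus.SKIPPED:
+            return 0.0
+        # Collect the finish times of dependencies that have one; the latest is used below.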
+        dep_end_times: list[datetime] = []
+        for dep in deps:
+            dep_result = report.results.get(dep)
+            if dep_result is not None and dep_result.finished_at is not None:
+                dep_end_times.append(dep_result.finished_at)
+        if not dep_end_times:
+            return 0.0
+        latest_dep_end = max(dep_end_times)
+        delta = (result.started_at - latest_dep_end).total_seconds()
+        return max(0.0, delta)
+
+    @staticmethod
+    def _calc_total_duration(report: RunReport) -> float:
+        """Compute the total wall-clock duration: earliest start to latest finish."""
+        starts: list[datetime] = []
+        ends: list[datetime] = []
+        for r in report.results.values():
+            if r.started_at is not None:
+                starts.append(r.started_at)
+            if r.finished_at is not None:
+                ends.append(r.finished_at)
+        if not starts or not ends:
+            return 0.0
+        return (max(ends) - min(starts)).total_seconds()
+
+    @staticmethod
+    def _calc_critical_path(graph: Graph, report: RunReport) -> tuple[tuple[str, ...], float]:
+        """Compute the critical path: the longest path through the DAG, weighted by actual run time.
+
+        Uses topological order plus dynamic programming, O(V+E). SKIPPED tasks count as 0.
+        """
+        # Build the duration map.
+        durations: dict[str, float] = {}
+        for name, result in report.results.items():
+            durations[name] = result.duration or 0.0
+
+        # Topological order (graph.layers keeps it consistent with the layering).
+        try:
+            layers = graph.layers()
+        except Exception:
+            # Fall back to an empty result if the graph cannot be layered (for example, it contains a cycle).
+            return (), 0.0
+
+        # earliest_finish[name] = duration[name] + max(earliest_finish[dep] for dep in deps)
+        earliest_finish: dict[str, float] = {}
+        predecessor: dict[str, str | None] = {}
+
+        for layer in layers:
+            for name in layer:
+                spec = graph.specs.get(name)
+                deps = spec.depends_on if spec is not None else ()
+                if not deps:
+                    earliest_finish[name] = durations.get(name, 0.0)
+                    predecessor[name] = None
+                else:
+                    best_dep: str | None = None
+                    best_ef = 0.0
+                    for dep in deps:
+                        ef = earliest_finish.get(dep, 0.0)
+                        if ef >= best_ef:
+                            best_ef = ef
+                            best_dep = dep
+                    earliest_finish[name] = best_ef + durations.get(name, 0.0)
+                    predecessor[name] = best_dep
+
+        if not earliest_finish:
+            return (), 0.0
+
+        # Use the node with the largest earliest_finish as the end of the path.
+        end_node = max(earliest_finish, key=lambda n: earliest_finish[n])
+        total = earliest_finish[end_node]
+
+        # Walk back along predecessors to recover the critical path.
+        path: list[str] = []
+        node: str | None = end_node
+        while node is not None:
+            path.append(node)
+            node = predecessor.get(node)
+        path.reverse()
+
+        return tuple(path), total
+
+    @staticmethod
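+    def _calc_parallelism(report: RunReport) -> tuple[float, int]:
+        """Compute the average and peak parallelism.
+
+        Timeline sweep: each task's [started_at, finished_at] interval is turned into
+        two events (+1/-1); the events are sorted and scanned to obtain the
+        instantaneous parallelism over time.
+
+        Returns (avg_parallelism, peak_parallelism), or (0.0, 0) when no task has valid timestamps.
+        """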
+        events: list[tuple[float, int]] = []  # (timestamp, delta)
+        for r in report.results.values():
+            if r.started_at is None or r.finished_at is None:
+                continue
+            if r.status == TaskStatus.SKIPPED:
+                continue
+            start_ts = r.started_at.timestamp()
+            end_ts = r.finished_at.timestamp()
+            if end_ts <= start_ts:
+                continue
+            events.append((start_ts, 1))
+            events.append((end_ts, -1))
+
+        if not events:
+            return 0.0, 0
+
+        # Sort so that at equal timestamps ends (-1) are processed before starts (+1), avoiding a spurious peak.
+        events.sort(key=lambda e: (e[0], e[1]))
+
+        current = 0
+        peak = 0
+        # Time-weighted area under the concurrency curve, used for the average parallelism.
+        area = 0.0
+        prev_ts = events[0][0]
+        for ts, delta in events:
+            if ts > prev_ts:
+                area += current * (ts - prev_ts)
+            current += delta
+            peak = max(peak, current)
+            prev_ts = ts
+
+        total_span = events[-1][0] - events[0][0]
+        avg = area / total_span if total_span > 0 else 0.0
+        return avg, peak
+
+    # ------------------------------------------------------------------ #
+    # Queries
+    # ------------------------------------------------------------------ #
+    def task(self, name: str) -> TaskProfile:
+        """Return the profile of the named task. Raises ``KeyError`` if it does not exist."""
+        for t in self.tasks:
+            if t.name == name:
+                return t
+        raise KeyError(name)
+
+    def top_bottlenecks(self, n: int = 5) -> tuple[TaskProfile, ...]:
+        """Return the Top-N longest-running tasks, sorted by duration in descending order.
+
+        Parameters
+        ----------
+        n:
+            Number of tasks to return. ``n <= 0`` returns an empty tuple.
+        """
+        if n <= 0:
+            return ()
+        return tuple(sorted(self.tasks, key=lambda t: t.duration, reverse=True)[:n])
+
+    def critical_tasks(self) -> tuple[TaskProfile, ...]:
+        """Return all tasks on the critical path, in path order."""
+        critical_set = set(self.critical_path)
+        # Preserve the critical-path order.
+        order = {name: i for i, name in enumerate(self.critical_path)}
+        return tuple(sorted((t for t in self.tasks if t.name in critical_set), key=lambda t: order[t.name]))
+
+    def failed_tasks(self) -> tuple[TaskProfile, ...]:
+        """Return the tasks whose status is FAILED."""
+        return tuple(t for t in self.tasks if t.status == TaskStatus.FAILED)
+
+    def skipped_tasks(self) -> tuple[TaskProfile, ...]:
+        """Return the tasks whose status is SKIPPED."""
+        return tuple(t for t in self.tasks if t.status == TaskStatus.SKIPPED)
+
+    # ------------------------------------------------------------------ #
+    # Output
+    # ------------------------------------------------------------------ #
+    def to_dict(self) -> dict[str, Any]:
+        """Convert to a JSON-friendly dictionary."""
+        return {
+            "tasks": [t.to_dict() for t in self.tasks],
+            "total_duration_seconds": round(self.total_duration, 6),
+            "critical_path_duration_seconds": round(self.critical_path_duration, 6),
+            "critical_path": list(self.critical_path),
+            "avg_parallelism": round(self.avg_parallelism, 4),
+            "peak_parallelism": self.peak_parallelism,
+            "parallelism_efficiency": round(self.parallelism_efficiency, 4),
+            "bottlenecks": [t.to_dict() for t in self.top_bottlenecks(5)],
+        }
+
+    def to_html(self) -> str:
+        """Generate a self-contained HTML report (CSS included, no external dependencies).
+
+        The report contains graph-level metric cards, the critical path, a timeline
+        Gantt chart, a Top-bottlenecks table, and a table of all tasks. It can be
+        opened directly in a browser.
+        """
+        return _render_html(self)
+
+    def describe(self) -> str:
+        """Return a human-readable multi-line text report of the profile."""
+        lines: list[str] = []
+        lines.append("=" * 70)
+        lines.append("PyFlowX 性能剖面报告")
+        lines.append("=" * 70)
+        lines.append("")
+        lines.append("【图级指标】")
+        lines.append(f"  总耗时 (wall-clock):     {self.total_duration:.3f}s")
+        lines.append(f"  关键路径耗时:            {self.critical_path_duration:.3f}s")
+        lines.append(f"  平均并行度:              {self.avg_parallelism:.2f}")
+        lines.append(f"  峰值并行度:              {self.peak_parallelism}")
+        lines.append(f"  并行度效率:              {self.parallelism_efficiency:.2%}")
+        lines.append(f"  任务总数:                {len(self.tasks)}")
+        lines.append("")
+
+        # 关键路径
+        lines.append("【关键路径】")
+        if self.critical_path:
+            lines.append(f"  {' -> '.join(self.critical_path)}")
+        else:
+            lines.append("  (无)")
+        lines.append("")
+
+        # Top 瓶颈
+        bottlenecks = self.top_bottlenecks(5)
+        lines.append(f"【Top {len(bottlenecks)} 瓶颈任务】")
+        if bottlenecks:
+            lines.append(f"  {'任务':<30} {'耗时':>10} {'等待':>10} {'尝试':>6} {'关键路径':>8} {'状态':>8}")
+            lines.append(f"  {'-' * 30} {'-' * 10} {'-' * 10} {'-' * 6} {'-' * 8} {'-' * 8}")
+            for t in bottlenecks:
+                critical_flag = "✓" if t.is_on_critical_path else ""
+                lines.append(
+                    f"  {t.name:<30} {t.duration:>9.3f}s {t.wait_time:>9.3f}s {t.attempts:>6} "
+                    f"{critical_flag:>8} {t.status.value:>8}",
+                )
+        else:
+            lines.append("  (无)")
+        lines.append("")
+
+        # 全部任务详情
+        lines.append("【全部任务】")
+        if self.tasks:
+            lines.append(f"  {'任务':<30} {'耗时':>10} {'等待':>10} {'尝试':>6} {'关键路径':>8} {'状态':>8}")
+            lines.append(f"  {'-' * 30} {'-' * 10} {'-' * 10} {'-' * 6} {'-' * 8} {'-' * 8}")
+            for t in self.tasks:
+                critical_flag = "✓" if t.is_on_critical_path else ""
+                lines.append(
+                    f"  {t.name:<30} {t.duration:>9.3f}s {t.wait_time:>9.3f}s {t.attempts:>6} "
+                    f"{critical_flag:>8} {t.status.value:>8}",
+                )
+        else:
+            lines.append("  (无)")
+        lines.append("")
+        lines.append("=" * 70)
+        return "\n".join(lines)
+
+    def __repr__(self) -> str:
+        return (
+            f"ProfileReport(tasks={len(self.tasks)}, "
+            f"total={self.total_duration:.3f}s, "
+            f"critical={self.critical_path_duration:.3f}s, "
+            f"avg_par={self.avg_parallelism:.2f}, "
+            f"peak_par={self.peak_parallelism})"
+        )
+
+
+# ---------------------------------------------------------------------- #
+# HTML rendering (private, zero dependencies)
+# ---------------------------------------------------------------------- #
+_HTML_TEMPLATE = """<!DOCTYPE html>
+<html lang="zh-CN">
+<head>
+<meta charset="utf-8">
+<meta name="viewport" content="width=device-width, initial-scale=1">
+<title>PyFlowX 性能剖面报告</title>
+<style>
+  :root {{
+    --bg: #f5f5f7;
+    --card: #ffffff;
+    --border: #d2d2d7;
+    --text: #1d1d1f;
+    --muted: #6e6e73;
+    --accent: #0071e3;
+    --success: #34c759;
+    --warning: #ff9f0a;
+    --danger: #ff3b30;
+    --critical: #af52de;
+  }}
+  * {{ box-sizing: border-box; }}
+  body {{
+    font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, sans-serif;
+    margin: 0;
+    padding: 24px;
+    background: var(--bg);
+    color: var(--text);
+    line-height: 1.5;
+  }}
+  h1 {{ margin: 0 0 8px; font-size: 28px; }}
+  h2 {{ margin: 32px 0 12px; font-size: 20px; border-bottom: 1px solid var(--border); padding-bottom: 6px; }}
+  .subtitle {{ color: var(--muted); margin: 0 0 24px; font-size: 14px; }}
+  .cards {{ display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 12px; margin-bottom: 8px; }}
+  .card {{
+    background: var(--card);
+    border: 1px solid var(--border);
+    border-radius: 10px;
+    padding: 16px;
+  }}
+  .card .label {{ font-size: 12px; color: var(--muted); margin-bottom: 4px; text-transform: uppercase; letter-spacing: 0.5px; }}
+  .card .value {{ font-size: 22px; font-weight: 600; }}
+  .card .unit {{ font-size: 13px; color: var(--muted); margin-left: 2px; }}
+  .critical-path {{
+    background: var(--card);
+    border: 1px solid var(--border);
+    border-left: 4px solid var(--critical);
+    border-radius: 10px;
+    padding: 16px;
+    margin-bottom: 8px;
+  }}
+  .critical-path .label {{ font-size: 12px; color: var(--muted); margin-bottom: 8px; text-transform: uppercase; letter-spacing: 0.5px; }}
+  .critical-path .chain {{ font-family: ui-monospace, "SF Mono", Menlo, monospace; font-size: 13px; word-break: break-all; }}
+  .critical-path .arrow {{ color: var(--critical); margin: 0 6px; font-weight: 600; }}
+  /* 甘特图 */
+  .gantt {{
+    background: var(--card);
+    border: 1px solid var(--border);
+    border-radius: 10px;
+    padding: 16px;
+    overflow-x: auto;
+  }}
+  .gantt-row {{ display: flex; align-items: center; margin-bottom: 6px; min-width: 600px; }}
+  .gantt-label {{ width: 200px; flex-shrink: 0; font-size: 13px; font-family: ui-monospace, monospace; overflow: hidden; text-overflow: ellipsis; white-space: nowrap; }}
+  .gantt-track {{ flex: 1; height: 22px; background: #f0f0f3; border-radius: 4px; position: relative; }}
+  .gantt-bar {{ position: absolute; height: 100%; border-radius: 4px; min-width: 2px; }}
+  .gantt-bar.success {{ background: var(--success); }}
+  .gantt-bar.failed {{ background: var(--danger); }}
+  .gantt-bar.skipped {{ background: var(--muted); }}
+  .gantt-bar.critical {{ box-shadow: 0 0 0 2px var(--critical) inset; }}
+  .gantt-bar:hover {{ opacity: 0.85; }}
+  .gantt-tooltip {{ position: absolute; bottom: 100%; left: 50%; transform: translateX(-50%); background: #1d1d1f; color: #fff; padding: 4px 8px; border-radius: 4px; font-size: 11px; white-space: nowrap; opacity: 0; pointer-events: none; transition: opacity 0.15s; }}
+  .gantt-bar:hover .gantt-tooltip {{ opacity: 1; }}
+  /* 表格 */
+  table {{ width: 100%; border-collapse: collapse; background: var(--card); border-radius: 10px; overflow: hidden; border: 1px solid var(--border); }}
+  th, td {{ padding: 10px 12px; text-align: left; font-size: 13px; }}
+  th {{ background: #fafafa; font-weight: 600; color: var(--muted); text-transform: uppercase; font-size: 11px; letter-spacing: 0.5px; }}
+  tbody tr {{ border-top: 1px solid var(--border); }}
+  tbody tr:hover {{ background: #fafafa; }}
+  td.num {{ font-family: ui-monospace, monospace; text-align: right; }}
+  .badge {{ display: inline-block; padding: 2px 8px; border-radius: 10px; font-size: 11px; font-weight: 500; }}
+  .badge.success {{ background: rgba(52,199,89,0.15); color: var(--success); }}
+  .badge.failed {{ background: rgba(255,59,48,0.15); color: var(--danger); }}
+  .badge.skipped {{ background: rgba(110,110,115,0.15); color: var(--muted); }}
+  .star {{ color: var(--critical); font-weight: 700; }}
+  .footer {{ margin-top: 32px; color: var(--muted); font-size: 12px; text-align: center; }}
+</style>
+</head>
+<body>
+  <h1>PyFlowX 性能剖面报告</h1>
+  <p class="subtitle">由 <code>pxp</code> 生成 · {generated_at}</p>
+
+  <h2>图级指标</h2>
+  <div class="cards">
+    <div class="card"><div class="label">总耗时</div><div class="value">{total_duration:.3f}<span class="unit">s</span></div></div>
+    <div class="card"><div class="label">关键路径耗时</div><div class="value">{critical_duration:.3f}<span class="unit">s</span></div></div>
+    <div class="card"><div class="label">平均并行度</div><div class="value">{avg_par:.2f}</div></div>
+    <div class="card"><div class="label">峰值并行度</div><div class="value">{peak_par}</div></div>
+    <div class="card"><div class="label">并行度效率</div><div class="value">{efficiency:.1f}<span class="unit">%</span></div></div>
+    <div class="card"><div class="label">任务总数</div><div class="value">{task_count}</div></div>
+  </div>
+
+  <h2>关键路径</h2>
+  <div class="critical-path">
+    <div class="label">最长依赖路径（串行瓶颈）</div>
+    <div class="chain">{critical_chain}</div>
+  </div>
+
+  <h2>任务时间线</h2>
+  <div class="gantt">
+    {gantt_rows}
+  </div>
+
+  <h2>Top 瓶颈任务</h2>
+  <table>
+    <thead><tr><th>任务</th><th class="num">耗时</th><th class="num">等待</th><th class="num">尝试</th><th>关键路径</th><th>状态</th></tr></thead>
+    <tbody>
+{bottleneck_rows}
+    </tbody>
+  </table>
+
+  <h2>全部任务</h2>
+  <table>
+    <thead><tr><th>任务</th><th class="num">耗时</th><th class="num">等待</th><th class="num">尝试</th><th>关键路径</th><th>状态</th><th>依赖</th></tr></thead>
+    <tbody>
+{all_task_rows}
+    </tbody>
+  </table>
+
+  <div class="footer">由 PyFlowX · pxp 生成</div>
+</body>
+</html>"""
+
+
+def _status_badge(status: TaskStatus) -> str:
+    """Build the HTML for a status badge."""
+    cls = status.value
+    return f'<span class="badge {cls}">{cls}</span>'
+
+
+def _format_critical_chain(path: tuple[str, ...]) -> str:
+    """Format the critical path as an HTML chain."""
+    from html import escape  # task names are user-defined, so escape them before embedding
+
+    if not path:
+        return '<em style="color:var(--muted)">(无)</em>'
+    arrow = '<span class="arrow">→</span>'
+    return arrow.join(f"<strong>{escape(name)}</strong>" for name in path)
+
+
+def _render_gantt(profile: ProfileReport) -> str:
+    """Render the Gantt chart rows as HTML.
+
+    One row per task: a label plus a time bar. Bar positions are rebuilt from
+    wait_time and the dependency relations as start offsets relative to the
+    earliest task, then normalised to a 0-100% width. Because wait_time is 0.0 for
+    tasks without hard dependencies, such tasks are always drawn at offset 0.
+    SKIPPED tasks are not shown (they have no timestamps).
+    """
+    from html import escape  # task names are user-defined, so escape them before embedding
+
+    visible = [t for t in profile.tasks if t.status != TaskStatus.SKIPPED and t.duration > 0]
+    if not visible:
+        return '<div style="color:var(--muted);padding:12px;">(无时间线数据)</div>'
+
+    # Rebuild relative start times: start[name] = max(end[dep]) + wait_time.
+    # This relies on profile.tasks being in topological order, so one pass suffices.
+    start: dict[str, float] = {}
+    end: dict[str, float] = {}
+    for t in profile.tasks:
+        if t.status == TaskStatus.SKIPPED:
+            continue
+        dep_end = 0.0
+        for dep in t.deps:
+            dep_end = max(dep_end, end.get(dep, 0.0))
+        s = dep_end + t.wait_time
+        start[t.name] = s
+        end[t.name] = s + t.duration
+
+    # Normalise: the earliest start maps to 0 and the latest end to 100%.
+    min_start = min(start.get(t.name, 0.0) for t in visible)
+    max_end = max(end.get(t.name, 0.0) for t in visible)
+    span = max_end - min_start
+    if span <= 0:
+        span = 1.0
+
+    rows: list[str] = []
+    for t in visible:
+        s = start.get(t.name, 0.0) - min_start
+        left_pct = (s / span) * 100
+        width_pct = (t.duration / span) * 100
+        cls = t.status.value
+        critical_cls = " critical" if t.is_on_critical_path else ""
+        name = escape(t.name)
+        tooltip = escape(f"{t.name}: {t.duration:.3f}s @ +{s:.3f}s ({t.status.value})")
+        rows.append(
+            f'      <div class="gantt-row">'
+            f'<div class="gantt-label" title="{name}">{name}</div>'
+            f'<div class="gantt-track">'
+            f'<div class="gantt-bar {cls}{critical_cls}" style="left:{left_pct:.2f}%;width:{width_pct:.2f}%">'
+            f'<span class="gantt-tooltip">{tooltip}</span>'
+            f"</div></div></div>"
+        )
+    return "\n".join(rows)
+
+
+def _render_task_row(t: TaskProfile, show_deps: bool = False) -> str:
+    """Render one task table row as HTML."""
+    from html import escape  # task names are user-defined, so escape them before embedding
+
+    star = '<span class="star">★</span>' if t.is_on_critical_path else ""
+    deps = escape(", ".join(t.deps)) if show_deps and t.deps else ""
+    deps_cell = f"<td>{deps}</td>" if show_deps else ""
+    return (
+        f"      <tr>"
+        f"<td><code>{escape(t.name)}</code></td>"
+        f'<td class="num">{t.duration:.3f}s</td>'
+        f'<td class="num">{t.wait_time:.3f}s</td>'
+        f'<td class="num">{t.attempts}</td>'
+        f"<td>{star}</td>"
+        f"<td>{_status_badge(t.status)}</td>"
+        f"{deps_cell}"
+        f"</tr>"
+    )
+
+
+def _render_html(profile: ProfileReport) -> str:
+    """Render the complete HTML report."""
+    from datetime import datetime as _dt
+
+    bottlenecks = profile.top_bottlenecks(5)
+    bottleneck_rows = (
+        "\n".join(_render_task_row(t) for t in bottlenecks)
+        or '      <tr><td colspan="6" style="color:var(--muted);">(无)</td></tr>'
+    )
+    all_task_rows = (
+        "\n".join(_render_task_row(t, show_deps=True) for t in profile.tasks)
+        or '      <tr><td colspan="7" style="color:var(--muted);">(无)</td></tr>'
+    )
+
+    return _HTML_TEMPLATE.format(
+        generated_at=_dt.now().strftime("%Y-%m-%d %H:%M:%S"),
+        total_duration=profile.total_duration,
+        critical_duration=profile.critical_path_duration,
+        avg_par=profile.avg_parallelism,
+        peak_par=profile.peak_parallelism,
+        efficiency=profile.parallelism_efficiency * 100,
+        task_count=len(profile.tasks),
+        critical_chain=_format_critical_chain(profile.critical_path),
+        gantt_rows=_render_gantt(profile),
+        bottleneck_rows=bottleneck_rows,
+        all_task_rows=all_task_rows,
+    )
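The two core computations of the profiling module, the longest-path dynamic programme and the timeline sweep, can be reproduced on toy data. The sketch below is a simplified illustration, not the code from the diff: `critical_path` and `parallelism` are hypothetical free functions, time is given as plain floats instead of `datetime`, and `graphlib.TopologicalSorter` (Python 3.9+) stands in for `graph.layers()`.

```python
from graphlib import TopologicalSorter


def critical_path(deps, durations):
    """Longest path through a DAG, weighted by per-task duration."""
    finish = {}  # earliest finish time of each task
    pred = {}  # predecessor chosen on the longest path
    # static_order() yields every node after all of its predecessors.
    for name in TopologicalSorter(deps).static_order():
        best = max(deps.get(name, ()), key=lambda d: finish[d], default=None)
        pred[name] = best
        base = finish[best] if best is not None else 0.0
        finish[name] = base + durations.get(name, 0.0)
    if not finish:
        return (), 0.0
    node = max(finish, key=finish.__getitem__)
    total = finish[node]
    path = []
    while node is not None:  # walk back to the source
        path.append(node)
        node = pred[node]
    return tuple(reversed(path)), total


def parallelism(intervals):
    """Return (average, peak) concurrency for (start, end) intervals."""
    events = []
    for start, end in intervals:
        if end > start:
            events.append((start, 1))
            events.append((end, -1))
    if not events:
        return 0.0, 0
    events.sort()  # at equal timestamps, -1 sorts before +1
    current = peak = 0
    area = 0.0
    prev = events[0][0]
    for ts, delta in events:
        area += current * (ts - prev)  # time-weighted concurrency
        current += delta
        peak = max(peak, current)
        prev = ts
    span = events[-1][0] - events[0][0]
    return (area / span if span > 0 else 0.0), peak


deps = {"a": (), "b": ("a",), "c": ("a",), "d": ("b", "c")}
durations = {"a": 1.0, "b": 3.0, "c": 1.0, "d": 2.0}
path, total = critical_path(deps, durations)
avg, peak = parallelism([(0, 1), (1, 4), (1, 2), (4, 6)])
print(path, total, round(avg, 4), peak)
```

In this toy run the wall-clock span equals the critical-path duration (6 seconds), so the efficiency ratio is 1.0 even though two tasks overlapped for one second; shortening `b` is the only way to finish sooner.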
@@ -69,6 +69,22 @@ class RunReport:
        """以 FAILED 状态结束的任务名列表。"""
        return [name for name, r in self.results.items() if r.status == TaskStatus.FAILED]

+    def succeeded_tasks(self) -> list[str]:
+        """以 SUCCESS 状态结束的任务名列表。"""
+        return [name for name, r in self.results.items() if r.status == TaskStatus.SUCCESS]
+
+    def skipped_tasks(self) -> list[str]:
+        """以 SKIPPED 状态结束的任务名列表。"""
+        return [name for name, r in self.results.items() if r.status == TaskStatus.SKIPPED]
+
+    def tasks_by_status(self, status: TaskStatus) -> list[str]:
+        """返回指定状态的任务名列表。"""
+        return [name for name, r in self.results.items() if r.status == status]
+
+    def durations(self) -> dict[str, float]:
+        """Map task name -> execution time in seconds; tasks with no recorded duration map to 0.0."""
+        return {name: (r.duration or 0.0) for name, r in self.results.items()}
+
    def describe(self) -> str:
        """Human-readable multi-line report for debugging."""
        lines: list[str] = [f"RunReport(success={self.success})"]
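Review note: the new status helpers are all the same filter over `results`, so they can be read as thin wrappers around `tasks_by_status`. The sketch below illustrates that relationship with stand-in classes; `Status`, `Result`, and `Report` are hypothetical names used only for this example and are not the project's `TaskStatus`, `TaskResult`, or `RunReport`.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class Status(Enum):
    """Hypothetical stand-in for the project's TaskStatus."""

    SUCCESS = "success"
    FAILED = "failed"
    SKIPPED = "skipped"


@dataclass
class Result:
    """Hypothetical stand-in for TaskResult: only the fields the helpers read."""

    status: Status
    duration: Optional[float] = None


@dataclass
class Report:
    """Hypothetical stand-in for RunReport."""

    results: Dict[str, Result] = field(default_factory=dict)

    def tasks_by_status(self, status: Status) -> List[str]:
        # Dicts preserve insertion order, so names come back in run order.
        return [name for name, r in self.results.items() if r.status == status]

    def succeeded_tasks(self) -> List[str]:
        return self.tasks_by_status(Status.SUCCESS)

    def durations(self) -> Dict[str, float]:
        # A missing duration (None) is reported as 0.0.
        return {name: (r.duration or 0.0) for name, r in self.results.items()}


report = Report(
    {
        "extract": Result(Status.SUCCESS, 1.5),
        "load": Result(Status.SKIPPED),
        "publish": Result(Status.FAILED, 0.2),
    }
)
print(report.succeeded_tasks())
print(report.tasks_by_status(Status.SKIPPED))
print(report.durations())
```

Whether the real methods should delegate to `tasks_by_status` is a style choice; the behaviour is the same either way.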
@@ -15,11 +15,13 @@ import argparse
 import enum
 import sys
 from dataclasses import dataclass, field, replace
+from pathlib import Path
 from typing import Any, Sequence, get_args

+from .compose import GraphComposer
 from .errors import PyFlowXError
 from .executors import Strategy, run
-from .graph import Graph, GraphComposer
+from .graph import Graph
 from .task import TaskSpec

 __all__ = ["CliExitCode", "CliRunner"]
@@ -70,76 +72,133 @@ def _apply_verbose_to_graph(graph: Graph, verbose: bool) -> Graph:
 class CliRunner:
    """Command-line runner: executes the task graph selected by user input.

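-    Maps command names to Graph instances.
-    Parses the user's command from ``sys.argv`` and runs the matching graph.
+    Maps command aliases to Graph instances. Parses the user's command from
+    ``sys.argv`` and runs the matching graph.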

    Parameters
    ----------
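+    aliases : dict[str, str | list[str | TaskSpec] | TaskSpec | Graph]
+        Mapping from command alias to a task reference. Each value may be:
+        * ``str``: a single name (a task registered in ``tasks`` or another
+          alias), yielding that target's graph.
+        * :class:`TaskSpec`: a single inline task, yielding a one-task graph.
+        * ``list[str | TaskSpec]``: a list of names and/or inline tasks, chained
+          automatically (see :meth:`Graph.chain`) so that each entry depends on
+          the previous one.
+        * :class:`~pyflowx.graph.Graph`: used as-is (for complex cases such as
+          custom ``conditions`` or parallel branches).
+    tasks : list[TaskSpec]
+        Flat list of registered tasks. Strings in ``aliases`` refer to these
+        task names. Tasks not referenced by any alias are never executed.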
    strategy : str | Strategy
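-        Default execution strategy (``Strategy.SEQUENTIAL`` / ``Strategy.THREAD`` /
-        ``Strategy.ASYNC`` or the corresponding string). Can be overridden with ``--strategy``.
+        Default execution strategy. Can be overridden with ``--strategy`` on the command line.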
+    description : str
+        Help text for the CLI.
    verbose : bool
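-        Whether to show detailed execution output. When ``True``, prints the task lifecycle and subprocess output.
-        Defaults to ``True``. Can be turned off with ``--quiet``.
-    **graphs : Graph
-        Mapping from command name to graph. Each key is a command name and each value is the
-        corresponding :class:`~pyflowx.graph.Graph`.
+        Whether to show detailed execution output. Defaults to ``True``; can be turned off with ``--quiet``.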

    Examples
    --------
-    Basic usage::
+    Simple case (tasks + aliases)::

        runner = px.CliRunner(
-            clean=px.Graph.from_specs(
-                [
-                    px.TaskSpec("cargo_clean", cmd=["cargo", "clean"]),
-                ]
-            ),
-            build=px.Graph.from_specs(
-                [
-                    px.TaskSpec("uv_build", cmd=["uv", "build"]),
-                ]
-            ),
+            tasks=[
+                px.cmd(["uv", "build"]),                      # name="uv_build"
+                px.cmd(["maturin", "build"], name="maturin_build"),
+                px.cmd(["ruff", "check", "--fix"], name="lint"),
+            ],
+            aliases={
+                "b": "uv_build",
+                "ba": ["uv_build", "maturin_build"],   # chain: maturin depends on uv
+                "lint": "lint",
+            },
        )
-        runner.run()  # parses sys.argv
+        runner.run()

-    Specifying a strategy and description::
+    Complex case (using a Graph directly)::

        runner = px.CliRunner(
-            strategy=px.Strategy.THREAD,
+            aliases={
+                "a": px.Graph.from_specs([
+                    px.TaskSpec("add", cmd=["git", "add", "."], conditions=(...)),
+                    px.TaskSpec("commit", cmd=["git", "commit"], depends_on=("add",)),
+                ]),
+            },
        )
-        runner.run(["test", "--strategy", "sequential"])
    """

-    graphs: dict[str, Graph] = field(default_factory=dict)
-    strategy: Strategy = field(default="sequential")
+    aliases: dict[str, str | list[str | TaskSpec[Any]] | TaskSpec[Any] | Graph] = field(default_factory=dict)
+    tasks: list[TaskSpec[Any]] = field(default_factory=list)
+    strategy: Strategy = field(default="dependency")
    description: str = field(default_factory=str)
    verbose: bool = field(default_factory=lambda: True)
+    # Resolved command -> graph mapping, filled in by __post_init__
+    graphs: dict[str, Graph] = field(default_factory=dict, init=False)

    def __post_init__(self) -> None:
-        if not self.graphs:
-            raise ValueError("CliRunner 至少需要一个命令 (通过关键字参数提供)")
+        if not self.aliases:
+            raise ValueError("CliRunner 至少需要一个别名 (通过 aliases= 提供)")

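-        # Resolve and expand string references by delegating to GraphComposer.
-        # Graph is no longer frozen, so it can be assigned directly without object.__setattr__.
-        self.graphs = GraphComposer(self.graphs).resolve_all()
+        # 1. Register each task as a virtual command graph (one graph per task) in raw_graphs
+        #    so that GraphComposer can resolve string references to them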
+        raw_graphs: dict[str, Graph] = {}
+        for spec in self.tasks:
+            if spec.name in raw_graphs:
+                raise ValueError(f"任务名重复: {spec.name!r}")
+            raw_graphs[spec.name] = Graph.from_specs([spec])
+
+        # 2. Convert each alias to a Graph (an alias may share a task's name, overriding the task registration)
+        for alias, value in self.aliases.items():
+            raw_graphs[alias] = self._alias_to_graph(alias, value)
+
+        # 3. Resolve string references between graphs (str / list[str] pointing at other aliases or tasks)
+        self.graphs = GraphComposer(raw_graphs).resolve_all()
+
+    @staticmethod
+    def _alias_to_graph(
+        alias: str,
+        value: str | list[str | TaskSpec[Any]] | TaskSpec[Any] | Graph,
+    ) -> Graph:
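+        """Convert an alias value into a Graph.
+
+        * ``str``: a reference to another alias or a registered task name, expanded by GraphComposer.
+        * ``TaskSpec``: a single inline task, yielding a one-task graph.
+        * ``list[str | TaskSpec]``: a mixed list of references and tasks. When GraphComposer
+          expands it, each entry depends on the previous one (chain semantics). Elements are
+          alias names, task names, or :class:`TaskSpec` objects (inline tasks).
+        * ``Graph``: returned unchanged (for complex cases: conditions, parallel branches, etc.).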
+        """
+        if isinstance(value, Graph):
+            return value
+        if isinstance(value, TaskSpec):
+            return Graph.from_specs([value])
+        if isinstance(value, str):
+            # String reference: kept as a _pending_refs placeholder for GraphComposer to expand later
+            return Graph.from_specs([value])  # type: ignore[arg-type]
+        if isinstance(value, list):
+            if not value:
+                raise ValueError(f"别名 {alias!r} 的任务列表为空")
+            for item in value:
+                if not isinstance(item, (str, TaskSpec)):
+                    raise TypeError(f"别名 {alias!r} 的列表元素类型无效: {type(item).__name__}, 预期 str 或 TaskSpec")
+            # Mixed str/TaskSpec list, expanded by GraphComposer (chain dependencies are added automatically)
+            return Graph.from_specs(value)
+        raise TypeError(
+            f"别名 {alias!r} 的值类型无效: {type(value).__name__}, 预期 str/TaskSpec/list[str|TaskSpec]/Graph"
+        )

    # ------------------------------------------------------------------ #
    # Introspection
    # ------------------------------------------------------------------ #
    @property
    def commands(self) -> list[str]:
-        """List of available commands (in insertion order)."""
-        return list(self.graphs.keys())
+        """Available commands in ``aliases`` definition order (tasks not referenced by any alias are excluded)."""
+        return list(self.aliases.keys())

    # ------------------------------------------------------------------ #
    # Argument parsing
    # ------------------------------------------------------------------ #
    def _prog_name(self) -> str:
        """Derive the program name from sys.argv[0]."""
-        import os
-
-        return os.path.basename(sys.argv[0]) if sys.argv else "pyflowx"
+        return Path(sys.argv[0]).name if sys.argv else "pyflowx"

    def create_parser(self) -> argparse.ArgumentParser:
        """Create the argument parser.
@@ -225,9 +284,9 @@ class CliRunner:
            parser.print_help()
            return CliExitCode.FAILURE.value

-        # Validate the command
-        if parsed.command not in self.graphs:
-            available = ", ".join(self.graphs.keys())
+        # Validate the command (must be a registered alias; bare task names are not accepted)
+        if parsed.command not in self.aliases:
+            available = ", ".join(self.commands)
            print(
                f"错误: 未知命令 {parsed.command!r} (可用命令: {available})",
                file=sys.stderr,
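Review note: to make the alias semantics concrete, here is a minimal sketch of how string and list references could expand into an ordered chain of task names. It is not the `GraphComposer` implementation, which this diff does not show; `expand` is a hypothetical helper, and two behaviours are assumptions of the sketch: an alias that names itself (such as `"lint": "lint"`) falls through to the task table, and reference cycles raise `ValueError`.

```python
from typing import Dict, List, Set, Tuple, Union

AliasValue = Union[str, List[str]]


def expand(
    ref: str,
    aliases: Dict[str, AliasValue],
    tasks: Set[str],
    stack: Tuple[str, ...] = (),
) -> List[str]:
    """Return the ordered, de-duplicated task names that `ref` stands for."""
    value = aliases.get(ref)
    # Not an alias, or an alias that simply names itself: treat it as a task.
    if value is None or value == ref:
        if ref not in tasks:
            raise KeyError(f"unknown task or alias: {ref!r}")
        return [ref]
    if ref in stack:
        raise ValueError("alias cycle: " + " -> ".join(stack + (ref,)))
    items = [value] if isinstance(value, str) else value
    order: List[str] = []
    for item in items:
        for name in expand(item, aliases, tasks, stack + (ref,)):
            if name not in order:  # keep the first occurrence only
                order.append(name)
    return order


tasks = {"uv_build", "maturin_build", "lint"}
aliases: Dict[str, AliasValue] = {
    "b": "uv_build",
    "ba": ["uv_build", "maturin_build"],
    "lint": "lint",
    "all": ["ba", "lint"],
}
print(expand("all", aliases, tasks))
```

In a chain, each name in the returned list would depend on the one before it; only names present in `aliases` are valid commands, which matches the validation in `run()`.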
@@ -18,8 +18,9 @@ import sys
 import time
 from abc import ABC, abstractmethod
 from collections.abc import Iterator
+from contextlib import contextmanager, nullcontext
 from pathlib import Path
-from typing import Any, Mapping
+from typing import Any, ContextManager, Mapping

 if sys.version_info >= (3, 12):
    from typing import override
@@ -55,6 +56,22 @@ class StateBackend(ABC):
    def clear(self) -> None:
        """Clear all stored state."""

+    def flush(self) -> None:  # noqa: B027
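+        """Persist state buffered in memory to the external medium.
+
+        No-op by default (for example, :class:`MemoryBackend` never writes to disk).
+        :class:`JSONBackend` defers disk writes while inside :meth:`batch`, so this must be called on exit.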
+        """
+
+    def batch(self) -> ContextManager[None]:
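+        """Return a context manager within which :meth:`save` may defer :meth:`flush`.
+
+        The default implementation is a no-op (as for :class:`MemoryBackend`). :class:`JSONBackend`
+        overrides it to mark writes as deferred on entry and flush once on exit, reducing one
+        write per task (N writes) to one write per run (O(N) instead of O(N²)).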
+        """
+        return nullcontext()
+

 class _TTLStateBackendMixin(StateBackend):
    """Shared logic for TTL state backends.
@@ -158,13 +175,6 @@ class MemoryBackend(_TTLStateBackendMixin):
    def _clear_raw(self) -> None:
        self._store.clear()

-    def _expired(self, key: str) -> bool:
-        """Whether the key has expired (kept for compatibility with the old test API)."""
-        entry = self._get_raw(key)
-        if entry is None:
-            return False
-        return self._is_expired(entry[1])
-

 class JSONBackend(_TTLStateBackendMixin):
    """File-based JSON storage for resuming runs across processes.
@@ -184,6 +194,7 @@ class JSONBackend(_TTLStateBackendMixin):
        self._path: str = path
        self._ttl = ttl
        self._store: dict[str, dict[str, Any]] = {}
+        self._defer_flush: bool = False
        self._load()

    def _load(self) -> None:
@@ -244,11 +255,26 @@ class JSONBackend(_TTLStateBackendMixin):
        except (TypeError, ValueError) as exc:
            raise StorageError(f"result of key {key!r} is not JSON-serialisable", exc) from exc
        super().save(key, value)
+        if not self._defer_flush:
+            self._flush()
+
+    @override
+    def flush(self) -> None:
        self._flush()

-    def _expired(self, entry: Mapping[str, Any]) -> bool:
-        """Whether an entry with metadata has expired (kept for compatibility with the old test API)."""
-        return self._is_expired(float(entry.get("ts", 0)))
+    @override
+    @contextmanager
+    def batch(self) -> Iterator[None]:
+        """Enter batch mode: ``save`` skips the disk write, and a single flush runs on exit.
+
+        Reduces the N full-store writes for a run of N tasks to one.
+        """
+        self._defer_flush = True
+        try:
+            yield
+        finally:
+            self._defer_flush = False
+            self._flush()


 def resolve_backend(backend: StateBackend | None) -> StateBackend:
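Review note: the deferred-flush pattern can be sketched in isolation as below. `TinyJSONStore` is a hypothetical class written for this note, not `JSONBackend`; the `writes` counter exists only to make the effect observable. One deliberate difference from the diff, flagged as an assumption: the sketch restores the previous deferral flag instead of resetting it to `False`, so a nested `batch()` would not flush early.

```python
import json
import os
import tempfile
from contextlib import contextmanager
from typing import Any, Dict, Iterator


class TinyJSONStore:
    """Hypothetical store that rewrites the whole JSON file on every flush."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.store: Dict[str, Any] = {}
        self.defer = False
        self.writes = 0  # number of full-file writes, for illustration only

    def _flush(self) -> None:
        with open(self.path, "w", encoding="utf-8") as fh:
            json.dump(self.store, fh)
        self.writes += 1

    def save(self, key: str, value: Any) -> None:
        self.store[key] = value
        if not self.defer:
            self._flush()

    @contextmanager
    def batch(self) -> Iterator[None]:
        previous = self.defer
        self.defer = True
        try:
            yield
        finally:
            self.defer = previous
            if not previous:  # only the outermost batch writes to disk
                self._flush()


with tempfile.TemporaryDirectory() as tmp:
    store = TinyJSONStore(os.path.join(tmp, "state.json"))
    with store.batch():
        for i in range(3):
            store.save(f"task{i}", i)
    writes_after_batch = store.writes  # one write for three saves
    store.save("late", 99)  # outside a batch: written immediately
    writes_total = store.writes
    with open(store.path, encoding="utf-8") as fh:
        on_disk = json.load(fh)

print(writes_after_batch, writes_total, on_disk)
```

Because the flush sits in `finally`, state saved before an exception inside the batch still reaches disk.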
@@ -17,14 +17,16 @@

 from __future__ import annotations

+import logging
 import os
 import shutil
-import subprocess
 import sys
+import threading
 from contextlib import contextmanager
 from dataclasses import dataclass, field
 from datetime import datetime
 from enum import Enum
+from functools import cached_property
 from pathlib import Path
 from typing import (
    Any,
@@ -67,6 +69,8 @@ TaskCmd = Union[
 Strategy = Union[str, "StrategyKind"]
 StrategyKind = Any  # Placeholder to avoid a circular import; the executors module constrains it with Literal

+logger = logging.getLogger(__name__)
+
 # Condition function type: receives the dependency context (possibly an empty mapping), returns whether to run.
 Condition = Callable[[Context], bool]

@@ -250,6 +254,10 @@ class TaskSpec(Generic[T]):
        state-backend reads and writes, so different inputs get separate cache entries. ``None`` means the task name is used.
    hooks:
        :class:`TaskHooks` lifecycle hooks.
+    executor:
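+        Executor for synchronous tasks: ``"thread"`` (default, thread pool) / ``"process"``
+        (process pool; bypasses the GIL and suits CPU-bound work; ``fn`` must be picklable) /
+        ``"inline"`` (called directly on the event-loop thread; fastest, but blocks the loop).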
    """

    name: str
@@ -275,6 +283,7 @@ class TaskSpec(Generic[T]):
    continue_on_error: bool = False
    cache_key: CacheKeyFn | None = None
    hooks: TaskHooks = field(default_factory=TaskHooks)
+    executor: str = "thread"  # "thread" | "process" | "inline"

    def __post_init__(self) -> None:
        if not self.name:
@@ -291,13 +300,16 @@ class TaskSpec(Generic[T]):
        if self.fn is None and self.cmd is None:
            raise ValueError(f"TaskSpec '{self.name}': 必须提供 fn 或 cmd 参数。")

-    @property
+    @cached_property
    def effective_fn(self) -> TaskFn[T]:
        """Return the effective callable to execute.

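        If ``cmd`` is provided, returns the wrapped command runner; otherwise returns ``fn``.
        The wrapper reads ``verbose``/``cwd``/``env``/``timeout`` from ``self`` on every call
        instead of capturing them in a closure, so changing those fields needs no new spec.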
+
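+        The result is cached per instance (:func:`functools.cached_property`): the fields of a frozen
+        dataclass are immutable and the closure built by ``_wrap_cmd`` is stable, so no rebuild is needed per access.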
        """
        if self.cmd is not None:
            return self._wrap_cmd()
@@ -306,11 +318,17 @@ class TaskSpec(Generic[T]):
        raise ValueError(f"TaskSpec '{self.name}': 没有可执行的函数或命令。")  # pragma: no cover

    def _wrap_cmd(self) -> TaskFn[Any]:
-        """Wrap cmd as a callable."""
+        """Wrap cmd as a callable.
+
+        The actual execution logic lives in :mod:`pyflowx.command`, which keeps command execution
+        out of :class:`TaskSpec`, a plain data structure.
+        """
+        from .command import run_command
+
        spec = self

        def _run() -> T:
-            return cast(T, _run_command(spec))
+            return cast(T, run_command(spec))

        _run.__name__ = spec.name
        return _run  # type: ignore[return-value]
@@ -368,12 +386,27 @@ class TaskSpec(Generic[T]):

    def storage_key(self, context: Context) -> str:
        """Compute the storage key for the state backend."""
-        if self.cache_key is not None:
-            try:
-                return f"{self.name}:{self.cache_key(context)}"
-            except Exception:
-                return self.name
-        return self.name
+        if self.cache_key is None:
+            return self.name
+        try:
+            return f"{self.name}:{self.cache_key(context)}"
+        except (TypeError, ValueError, KeyError, AttributeError) as exc:
+            # If cache_key raises an expected data/type error, fall back to name but still log a
+            # warning so that users notice bugs in their cache_key implementation.
+            logger.warning(
+                "task %r: cache_key 回退到 name（%s: %s）",
+                self.name,
+                type(exc).__name__,
+                exc,
+            )
+            return self.name
+
+
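+# Global lock: serializes temporary changes to process-level state (os.environ / os.chdir).
+# When ``fn`` tasks run concurrently under the thread/async strategies with different
+# ``cwd``/``env`` settings, they overwrite one another (os.chdir and os.environ are process-global).
+# The lock wraps only the "switch -> run -> restore" window; tasks without cwd/env are unaffected.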
+_env_cwd_lock = threading.RLock()


@contextmanager
@@ -381,105 +414,159 @@ def _env_and_cwd(
    env: Mapping[str, str] | None,
    cwd: Path | None,
 ) -> Generator[None, None, None]:
-    """Temporarily set environment variables and the working directory."""
-    saved_env: dict[str, str] = {}
-    saved_cwd: str | None = None
-    if env:
-        for k, v in env.items():
-            if k in os.environ:
-                saved_env[k] = os.environ[k]
-            os.environ[k] = v
-    if cwd is not None:
-        saved_cwd = str(Path.cwd())
-        os.chdir(cwd)
-    try:
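+    """Temporarily set environment variables and the working directory.
+
+    ``os.environ`` and ``os.chdir`` are process-wide global state; under the thread/async
+    strategies, concurrent ``fn`` tasks that set ``env``/``cwd`` would overwrite one another.
+    This function serializes the "switch -> run -> restore" window with the module-level
+    :data:`_env_cwd_lock`. With neither ``env`` nor ``cwd`` it yields directly without taking the lock.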
+    """
+    if not env and cwd is None:
        yield
-    finally:
-        if saved_cwd is not None:
-            os.chdir(saved_cwd)
-        # Restore environment variables
+        return
+    with _env_cwd_lock:
+        saved_env: dict[str, str] = {}
+        saved_cwd: str | None = None
        if env:
-            for k in env:
-                if k in saved_env:
-                    os.environ[k] = saved_env[k]
-                else:
-                    os.environ.pop(k, None)
-
-
-def _run_command(spec: TaskSpec[Any]) -> Any:  # noqa: PLR0912
-    """Run the command given by ``spec.cmd`` (list / shell string / callable)."""
-    cmd = spec.cmd
-    verbose = spec.verbose
-    cwd = spec.cwd
-    timeout = spec.timeout
-    env_override = spec.env
-
-    # Callable: call it directly and return its result.
-    if callable(cmd) and not isinstance(cmd, (list, str)):
-        name = getattr(cmd, "__name__", "callable")
-        if verbose:
-            print(f"[verbose] 执行可调用命令: {name}", flush=True)
-            if cwd is not None:
-                print(f"[verbose] 工作目录: {cwd}", flush=True)
-        try:
-            return cmd()
-        except Exception as e:
-            raise RuntimeError(f"可调用命令执行异常: {name}: {e}") from e
-
-    is_list = isinstance(cmd, list)
-    if is_list:
-        cmd_str = " ".join(arg for arg in cmd)  # type: ignore[union-attr]
-        verb = "执行命令"
-        label = "命令"
-    else:
-        cmd_str = cast(str, cmd)
-        verb = "执行 Shell"
-        label = "Shell 命令"
-
-    if verbose:
-        print(f"[verbose] {verb}: {cmd_str}", flush=True)
+            for k, v in env.items():
+                if k in os.environ:
+                    saved_env[k] = os.environ[k]
+                os.environ[k] = v
        if cwd is not None:
-            print(f"[verbose] 工作目录: {cwd}", flush=True)
-
-    # Merge environment variables
-    run_env: dict[str, str] | None = None
-    if env_override:
-        run_env = dict(os.environ)
-        run_env.update(env_override)
-
-    try:
-        result = subprocess.run(
-            cast(Union[str, List[str]], cmd),
-            shell=not is_list,
-            cwd=cwd,
-            env=run_env,
-            timeout=timeout,
-            capture_output=not verbose,
-            text=True,
-            check=False,
-        )
-    except FileNotFoundError:
-        raise RuntimeError(f"{label}未找到: {cmd_str}") from None
-    except subprocess.TimeoutExpired:
-        raise RuntimeError(f"{label}执行超时: {cmd_str} ({timeout}s)") from None
-    except OSError as e:
-        raise RuntimeError(f"{label}执行异常: {cmd_str}: {e}") from e
-
-    if verbose:
-        print(f"[verbose] 返回码: {result.returncode}", flush=True)
-
-    if result.returncode == 0:
-        return None
-
-    err_msg = f"{label}执行失败: `{cmd_str}`, 返回码: {result.returncode}"
-    if not verbose and result.stderr.strip():
-        err_msg += f"\n{result.stderr.strip()}"
-    raise RuntimeError(err_msg)
+            saved_cwd = str(Path.cwd())
+            os.chdir(cwd)
+        try:
+            yield
+        finally:
+            if saved_cwd is not None:
+                os.chdir(saved_cwd)
+            # Restore environment variables
+            if env:
+                for k in env:
+                    if k in saved_env:
+                        os.environ[k] = saved_env[k]
+                    else:
+                        os.environ.pop(k, None)


 # ---------------------------------------------------------------------- #
 # Task templates: factories for generating similar TaskSpecs in bulk
 # ---------------------------------------------------------------------- #
+def _task_noop() -> None:
+    """Placeholder fn for the task(cmd=...) form (a cmd task never calls fn at run time)."""
+    return None
+
+
+def task(
+    fn: TaskFn[Any] | None = None,
+    *,
+    cmd: TaskCmd | None = None,
+    depends_on: tuple[str, ...] = (),
+    soft_depends_on: tuple[str, ...] = (),
+    defaults: Mapping[str, Any] | None = None,
+    args: tuple[Any, ...] = (),
+    kwargs: Mapping[str, Any] | None = None,
+    retry: RetryPolicy | None = None,
+    timeout: float | None = None,
+    tags: tuple[str, ...] = (),
+    conditions: tuple[Condition, ...] = (),
+    cwd: str | Path | None = None,
+    env: Mapping[str, str] | None = None,
+    verbose: bool = False,
+    skip_if_missing: bool = False,
+    allow_upstream_skip: bool = False,
+    strategy: str | None = None,
+    priority: int = 0,
+    concurrency_key: str | None = None,
+    continue_on_error: bool = False,
+    cache_key: CacheKeyFn | None = None,
+    hooks: TaskHooks | None = None,
+    name: str | None = None,
+) -> Any:
+    """Decorator that turns a function into a :class:`TaskSpec`.
+
+    ``name`` defaults to ``fn.__name__``. It can decorate a function directly or be called with arguments.
+
+    Examples
+    --------
+    >>> @px.task
+    ... def extract(): return [1, 2, 3]
+    >>> @px.task(depends_on=("extract",))
+    ... def double(extract): return [x * 2 for x in extract]
+    >>> graph = px.Graph.from_specs([extract, double])
+    """
+
+    def _decorate(func: TaskFn[Any]) -> TaskSpec[Any]:
+        spec_name = name or func.__name__
+        return TaskSpec(
+            name=spec_name,
+            fn=func,
+            cmd=cmd,
+            depends_on=depends_on,
+            soft_depends_on=soft_depends_on,
+            defaults=dict(defaults) if defaults else {},
+            args=args,
+            kwargs=dict(kwargs) if kwargs else {},
+            retry=retry if retry is not None else RetryPolicy(),
+            timeout=timeout,
+            tags=tags,
+            conditions=conditions,
+            cwd=Path(cwd) if isinstance(cwd, str) else cwd,
+            env=dict(env) if env else None,
+            verbose=verbose,
+            skip_if_missing=skip_if_missing,
+            allow_upstream_skip=allow_upstream_skip,
+            strategy=strategy,
+            priority=priority,
+            concurrency_key=concurrency_key,
+            continue_on_error=continue_on_error,
+            cache_key=cache_key,
+            hooks=hooks if hooks is not None else TaskHooks(),
+        )
+
+    if fn is None and cmd is None:
+        # Called with arguments: @task(depends_on=...); wait for the function to decorate
+        return _decorate
+    if fn is None:
+        # task(cmd=..., name=...) builds the spec directly; there is no decorated function
+        if name is None:
+            raise ValueError("task(cmd=...) 需要显式提供 name")
+        return _decorate(_task_noop)
+    return _decorate(fn)
+
+
+def cmd(
+    command: list[str],
+    *,
+    name: str | None = None,
+    depends_on: tuple[str, ...] = (),
+    **kwargs: Any,
+) -> TaskSpec[Any]:
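+    """Create a :class:`TaskSpec` quickly from a command list.
+
+    ``name`` defaults to ``"_".join(command[:2])`` (for example, ``["uv", "build"]`` -> ``"uv_build"``).
+    If the command has fewer than two elements, ``"_".join(command)`` is used.
+
+    The remaining keyword arguments are passed through to :class:`TaskSpec` (such as ``depends_on`` and ``tags``).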
+
+    Examples
+    --------
+    >>> uv_build = px.cmd(["uv", "build"])
+    >>> uv_build.name
+    'uv_build'
+    >>> lint = px.cmd(["ruff", "check", "--fix"], name="lint")
+    >>> lint.name
+    'lint'
+    """
+    spec_name = name or "_".join(command[:2])
+    return TaskSpec(
+        name=spec_name,
+        cmd=command,
+        depends_on=depends_on,
+        **kwargs,
+    )
+
+
 def task_template(
    fn: TaskFn[Any] | None = None,
    cmd: TaskCmd | None = None,
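Review note: the default-name rule in the `cmd()` docstring can be sketched as a one-liner, since slicing with `command[:2]` already handles one-element commands. `default_name` and `pitfall_name` are hypothetical helpers written for this note. The second one shows a precedence trap worth keeping in mind: a conditional expression binds more loosely than `or`, so `name or A if cond else B` parses as `(name or A) if cond else B` and drops an explicit `name` whenever `cond` is false.

```python
from typing import List, Optional


def default_name(command: List[str], name: Optional[str] = None) -> str:
    """An explicit name wins; otherwise join at most the first two arguments."""
    # Slicing never raises, so command[:2] also covers one-element commands.
    return name or "_".join(command[:2])


def pitfall_name(command: List[str], name: Optional[str] = None) -> str:
    """Same intent, but the conditional expression swallows the `or`."""
    # Parsed as: (name or "_".join(command[:2])) if len(command) >= 2 else "_".join(command)
    return name or "_".join(command[:2]) if len(command) >= 2 else "_".join(command)


print(default_name(["uv", "build"]))
print(default_name(["ruff", "check", "--fix"], name="lint"))
print(default_name(["pytest"], name="t"))
print(pitfall_name(["pytest"], name="t"))  # the explicit name is ignored here
```

Parenthesising the fallback, or relying on the slice as `default_name` does, keeps an explicit name in force for commands of any length.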
@@ -113,10 +113,7 @@ def write_file(path: str, content: str, encoding: str = "utf-8") -> px.TaskSpec:
    """Task that writes a file."""

    def write():
-        try:
-            with open(path, "w", encoding=encoding) as f:
-                f.write(content)
-        except Exception as e:
-            print(f"写入文件 {path} 失败: {e}")
+        p = Path(path)
+        p.write_text(content, encoding=encoding)

    return px.TaskSpec(f"write_file_{path}", fn=write, verbose=True)
@@ -1,107 +0,0 @@
-"""Common utility functions."""
-
-from __future__ import annotations
-
-__all__ = ["perf_timer"]
-
-import functools
-import logging
-import time
-from collections import defaultdict
-from typing import Callable, TypedDict
-
-try:
-    from typing_extensions import ParamSpec, TypeVar
-except ImportError:
-    from typing import ParamSpec, TypeVar
-
-P = ParamSpec("P")
-R = TypeVar("R")
-
-
-class _PerformanceMetrics(TypedDict):
-    """Performance metrics."""
-
-    count: int
-    total_time: float
-
-
-_perf_metrics: defaultdict[str, _PerformanceMetrics] = defaultdict(
-    lambda: _PerformanceMetrics(
-        count=0,
-        total_time=0.0,
-    )
-)
-
-
-def _generate_report(unit: str, precision: int) -> str:
-    """Generate the performance metrics report and return it as a string."""
-    if not _perf_metrics:
-        return ""
-
-    lines: list[str] = []
-    lines.append("=" * 50)
-    lines.append("性能指标报告 (Performance Metrics Report)")
-    lines.append("-" * 50)
-
-    # Sort by total time so the slowest functions come first
-    sorted_metrics = sorted(_perf_metrics.items(), key=lambda x: x[1]["total_time"], reverse=True)
-
-    for name, metrics in sorted_metrics:
-        avg_time = metrics["total_time"] / metrics["count"] if metrics["count"] > 0 else 0
-        lines.append(
-            f"{name}: "
-            f"调用次数={metrics['count']}, "
-            f"总耗时={metrics['total_time']:.{precision}f}{unit}, "
-            f"平均耗时={avg_time:.{precision}f}{unit}"
-        )
-
-    lines.append("=" * 50)
-    report_str = "\n".join(lines)
-
-    # Also write to the log
-    logging.info("\n".join(lines))
-
-    return report_str
-
-
-def perf_timer(unit: str = "ms", precision: int = 4, report: bool = False):
-    """Performance timer decorator."""
-    scale: dict[str, float] = {
-        "s": 1.0,
-        "ms": 1000.0,
-        "us": 1000000.0,
-    }
-
-    def decorator(func: Callable[P, R]) -> Callable[P, R]:
-        @functools.wraps(func)
-        def wrapper(*args: P.args, **kwargs: P.kwargs) -> R:
-            start_time = time.time()
-            result = func(*args, **kwargs)
-            end_time = time.time()
-
-            _perf_metrics[func.__name__]["count"] += 1
-            _perf_metrics[func.__name__]["total_time"] += (end_time - start_time) * scale[unit]
-
-            if not report:
-                logging.info(
-                    f"{func.__name__} {unit}: {_perf_metrics[func.__name__]['total_time']:.{precision}f}{unit}"
-                )
-            return result
-
-        return wrapper
-
-    if report:
-        import atexit
-
-        logging.basicConfig(level=logging.INFO)
-        logging.info(f"Performance metrics report enabled with unit {unit} and precision {precision}")
-
-        @atexit.register
-        def _report_at_exit() -> None:
-            """Report performance metrics at program exit."""
-            _generate_report(unit, precision)
-
-        # Report generation is factored into a separate function for testability
-
-    return decorator
@@ -0,0 +1,26 @@
+"""Process-pool test helpers: module-level functions (must be picklable)."""
+
+from __future__ import annotations
+
+import time
+
+
+def cpu_heavy(n: int) -> int:
+    """CPU-bound computation (sum of squares)."""
+    return sum(i * i for i in range(n))
+
+
+def add(a: int, b: int) -> int:
+    """Simple addition."""
+    return a + b
+
+
+def sub(a: int, b: int) -> int:
+    """Simple subtraction."""
+    return a - b
+
+
+def slow_sleep(seconds: float) -> int:
+    """Sleep for the given number of seconds; used to test timeouts."""
+    time.sleep(seconds)
+    return int(seconds)
@@ -0,0 +1,545 @@
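+"""Tests for the pxp profiler.
+
+Coverage strategy:
+* HTML rendering: to_html() output is well structured and contains the key sections.
+* pxp CLI: argument parsing, script execution, report generation, browser invocation, error handling.
+* Hook injection: capture px.run() calls and restore the original functions.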
+"""
+
+from __future__ import annotations
+
+import sys
+from datetime import datetime, timedelta
+from pathlib import Path
+from typing import Any
+
+import pytest
+
+import pyflowx as px
+from pyflowx.cli import profiler
+from pyflowx.profiling import ProfileReport
+from pyflowx.report import RunReport
+from pyflowx.task import TaskResult, TaskSpec, TaskStatus
+
+
+def _fn() -> int:
+    return 1
+
+
+def _spec(name: str, deps: tuple[str, ...] = ()) -> TaskSpec[Any]:
+    return TaskSpec[Any](name, _fn, depends_on=deps)
+
+
+def _result(
+    name: str,
+    start: datetime,
+    duration: float,
+    *,
+    status: TaskStatus = TaskStatus.SUCCESS,
+    attempts: int = 1,
+) -> TaskResult[Any]:
+    """Build a TaskResult with timestamps."""
+    end = start + timedelta(seconds=duration) if duration > 0 else start
+    return TaskResult[Any](
+        spec=_spec(name),
+        status=status,
+        value=None,
+        attempts=attempts,
+        started_at=start if duration > 0 or status != TaskStatus.SKIPPED else None,
+        finished_at=end if duration > 0 or status != TaskStatus.SKIPPED else None,
+    )
+
+
+def _build_simple_profile() -> ProfileReport:
+    """Build a simple ProfileReport for testing the HTML output."""
+    start = datetime(2024, 1, 1, 0, 0, 0)
+    report = px.RunReport()
+    report.results["a"] = _result("a", start, 1.0)
+    report.results["b"] = _result("b", start + timedelta(seconds=1), 2.0)
+    graph = px.Graph.from_specs([
+        _spec("a"),
+        _spec("b", deps=("a",)),
+    ])
+    return ProfileReport.from_report(report, graph)
+
+
+class TestToHtml:
+    """Tests for ProfileReport.to_html()."""
+
+    def test_to_html_contains_key_sections(self) -> None:
+        """The HTML should contain every key section heading."""
+        profile = _build_simple_profile()
+        html = profile.to_html()
+
+        assert "<!DOCTYPE html>" in html
+        assert "PyFlowX 性能剖面报告" in html
+        assert "图级指标" in html
+        assert "关键路径" in html
+        assert "任务时间线" in html
+        assert "Top 瓶颈任务" in html
+        assert "全部任务" in html
+
+    def test_to_html_contains_metrics(self) -> None:
+        """The HTML should contain the graph-level metric values."""
+        profile = _build_simple_profile()
+        html = profile.to_html()
+
+        # Total duration 3.0s (a=1 + b=2)
+        assert "3.000" in html
+        # Task names
+        assert "a" in html
+        assert "b" in html
+
+    def test_to_html_contains_critical_path(self) -> None:
+        """The HTML should contain the critical-path task chain."""
+        profile = _build_simple_profile()
+        html = profile.to_html()
+
+        # The critical path is a -> b
+        assert "<strong>a</strong>" in html
+        assert "<strong>b</strong>" in html
+
+    def test_to_html_contains_gantt_bars(self) -> None:
+        """The HTML should contain Gantt bars."""
+        profile = _build_simple_profile()
+        html = profile.to_html()
+
+        assert "gantt-row" in html
+        assert "gantt-bar" in html
+        # One bar per non-SKIPPED task
+        assert html.count("gantt-bar") >= 2
+
+    def test_to_html_empty_profile(self) -> None:
+        """The HTML for an empty report should render without crashing."""
+        report = px.RunReport()
+        graph = px.Graph()
+        profile = ProfileReport.from_report(report, graph)
+        html = profile.to_html()
+
+        assert "PyFlowX 性能剖面报告" in html
+        assert "(无)" in html
+
+    def test_to_html_with_failed_task(self) -> None:
+        """The HTML for a run with a FAILED task should contain the failed status badge."""
+        start = datetime(2024, 1, 1, 0, 0, 0)
+        report = px.RunReport()
+        report.results["a"] = _result("a", start, 1.0, status=TaskStatus.FAILED)
+        graph = px.Graph.from_specs([_spec("a")])
+
+        profile = ProfileReport.from_report(report, graph)
+        html = profile.to_html()
+
+        assert "failed" in html
+        assert "badge" in html
+
+    def test_to_html_with_skipped_task(self) -> None:
+        """A SKIPPED task should still get its status badge (its absence from the Gantt chart is not asserted here)."""
+        start = datetime(2024, 1, 1, 0, 0, 0)
+        report = px.RunReport()
+        report.results["a"] = _result("a", start, 1.0)
+        report.results["b"] = TaskResult[Any](
+            spec=_spec("b"),
+            status=TaskStatus.SKIPPED,
+            reason="skip",
+        )
+        graph = px.Graph.from_specs([_spec("a"), _spec("b")])
+
+        profile = ProfileReport.from_report(report, graph)
+        html = profile.to_html()
+
+        # The SKIPPED task's badge should appear
+        assert "skipped" in html
+
+    def test_to_html_self_contained(self) -> None:
+        """The HTML should be self-contained (no external dependencies)."""
+        profile = _build_simple_profile()
+        html = profile.to_html()
+
+        # No references to external resources
+        assert "<link" not in html
+        assert "<script src" not in html
+
+
+class TestProfilerArgumentParsing:
+    """Tests for pxp CLI argument parsing."""
+
+    def test_default_export_is_html(self) -> None:
+        """The default export format is html."""
+        parser = profiler._build_parser()
+        args, remaining = parser.parse_known_args(["pymake.py"])
+        assert args.export == "html"
+        assert args.no_browser is False
+        assert args.output is None
+        assert remaining == ["pymake.py"]
+
+    def test_export_text(self) -> None:
+        """-E text should set the export format to text."""
+        parser = profiler._build_parser()
+        args, _ = parser.parse_known_args(["-E", "text", "pymake.py"])
+        assert args.export == "text"
+
+    def test_no_browser_flag(self) -> None:
+        """--no-browser should set the flag."""
+        parser = profiler._build_parser()
+        args, _ = parser.parse_known_args(["--no-browser", "pymake.py"])
+        assert args.no_browser is True
+
+    def test_output_option(self) -> None:
+        """-o should set the output path."""
+        parser = profiler._build_parser()
+        args, _ = parser.parse_known_args(["-o", "report.html", "pymake.py"])
+        assert args.output == "report.html"
+
+    def test_script_args_separated(self) -> None:
+        """Script arguments should be separated out via remaining."""
+        parser = profiler._build_parser()
+        _, remaining = parser.parse_known_args(["pymake.py", "t", "--quiet"])
+        assert remaining == ["pymake.py", "t", "--quiet"]
+
+    def test_no_args_prints_help(
+        self,
+        capsys: pytest.CaptureFixture[str],
+        monkeypatch: pytest.MonkeyPatch,
+    ) -> None:
+        """With no arguments, main() should print help and exit with code 2."""
+        monkeypatch.setattr(sys, "argv", ["pxp"])
+        with pytest.raises(SystemExit) as exc_info:
+            profiler.main()
+        assert exc_info.value.code == 2
+        captured = capsys.readouterr()
+        assert "usage" in captured.out.lower() or "usage" in captured.err.lower()
+
+
+class TestCapturePxRun:
+    """Test _capture_px_run hook injection."""
+
+    def test_capture_captures_run_call(self) -> None:
+        """The hook should capture the graph and report of a px.run() call."""
+        captured = profiler._capture_px_run()
+        try:
+            graph = px.Graph.from_specs([px.TaskSpec("a", lambda: 1)])
+            px.run(graph, strategy="sequential")
+            assert "graph" in captured
+            assert "report" in captured
+            assert captured["graph"] is graph
+        finally:
+            captured["_restore"]()
+
+    def test_capture_restores_original(self) -> None:
+        """After restoring, px.run and RunReport.__init__ should be the original functions again."""
+        original_run = px.run
+        original_init = RunReport.__init__
+        captured = profiler._capture_px_run()
+        # While the hook is installed, px.run and RunReport.__init__ are replaced
+        assert px.run is not original_run
+        assert RunReport.__init__ is not original_init
+        captured["_restore"]()
+        # Both are restored afterwards
+        assert px.run is original_run
+        assert RunReport.__init__ is original_init
+
+    def test_capture_via_runner_run(self) -> None:
+        """The hook should capture run() calls made through CliRunner."""
+        from pyflowx import runner as runner_mod
+
+        captured = profiler._capture_px_run()
+        try:
+            # Verify that runner.run is patched too (points at patched_run)
+            assert runner_mod.run is px.executors.run
+            graph = px.Graph.from_specs([px.TaskSpec("a", lambda: 1)])
+            runner_mod.run(graph, strategy="sequential")
+            assert "report" in captured
+        finally:
+            captured["_restore"]()
+
+    def test_capture_captures_report_on_failure(self) -> None:
+        """The report instance should still be captured when run() raises TaskFailedError."""
+        from pyflowx.executors import TaskFailedError
+
+        def failing() -> None:
+            raise RuntimeError("boom")
+
+        graph = px.Graph.from_specs([px.TaskSpec("a", failing)])
+        captured = profiler._capture_px_run()
+        try:
+            with pytest.raises(TaskFailedError):
+                px.run(graph, strategy="sequential")
+            # Even when run() raises, the report should be captured (with results of tasks already executed)
+            assert "report" in captured
+            assert "graph" in captured
+            assert captured["graph"] is graph
+        finally:
+            captured["_restore"]()
+
+
+class TestRunTargetScript:
+    """Test _run_target_script."""
+
+    def test_run_simple_script(self, tmp_path: Path) -> None:
+        """Should execute a simple script and return the module dict."""
+        script = tmp_path / "simple.py"
+        script.write_text("x = 42\n", encoding="utf-8")
+
+        result = profiler._run_target_script(script, [])
+        assert result["x"] == 42
+
+    def test_run_script_with_sys_exit(self, tmp_path: Path) -> None:
+        """A script that calls sys.exit should raise SystemExit."""
+        script = tmp_path / "exit.py"
+        script.write_text("import sys; sys.exit(0)\n", encoding="utf-8")
+
+        with pytest.raises(SystemExit):
+            profiler._run_target_script(script, [])
+
+    def test_run_script_sets_argv(self, tmp_path: Path, monkeypatch: pytest.MonkeyPatch) -> None:
+        """sys.argv should be set correctly."""
+        script = tmp_path / "argv.py"
+        script.write_text(
+            "import sys\nassert sys.argv[0] == __file__\nassert sys.argv[1:] == ['arg1', 'arg2']\n",
+            encoding="utf-8",
+        )
+        profiler._run_target_script(script, ["arg1", "arg2"])
+
+    def test_run_script_adds_dir_to_path(self, tmp_path: Path) -> None:
+        """The script's directory should be added to sys.path."""
+        script = tmp_path / "pathcheck.py"
+        script.write_text(
+            "import sys, os\nassert os.path.dirname(__file__) in sys.path\n",
+            encoding="utf-8",
+        )
+        profiler._run_target_script(script, [])
+
+
+class TestOutputReport:
+    """Test _output_report."""
+
+    def test_output_text_format(
+        self,
+        capsys: pytest.CaptureFixture[str],
+    ) -> None:
+        """The text format should print describe() to stdout."""
+        profile = _build_simple_profile()
+        profiler._output_report(profile, export="text", output=None, script_stem="test", no_browser=True)
+        captured = capsys.readouterr()
+        assert "PyFlowX 性能剖面报告" in captured.out
+        assert "图级指标" in captured.out
+
+    def test_output_html_default_filename(self, tmp_path: Path, monkeypatch: pytest.MonkeyPatch) -> None:
+        """HTML is written to <script>_profile.html by default."""
+        monkeypatch.chdir(tmp_path)
+        profile = _build_simple_profile()
+        profiler._output_report(profile, export="html", output=None, script_stem="mymake", no_browser=True)
+
+        out_file = tmp_path / "mymake_profile.html"
+        assert out_file.exists()
+        content = out_file.read_text(encoding="utf-8")
+        assert "PyFlowX 性能剖面报告" in content
+
+    def test_output_html_custom_path(self, tmp_path: Path) -> None:
+        """HTML should be written to the given path."""
+        out_file = tmp_path / "custom.html"
+        profile = _build_simple_profile()
+        profiler._output_report(profile, export="html", output=str(out_file), script_stem="test", no_browser=True)
+        assert out_file.exists()
+        assert "PyFlowX" in out_file.read_text(encoding="utf-8")
+
+    def test_output_html_opens_browser(self, tmp_path: Path, monkeypatch: pytest.MonkeyPatch) -> None:
+        """no_browser=False should call webbrowser.open."""
+        monkeypatch.chdir(tmp_path)
+        opened: list[str] = []
+        monkeypatch.setattr(profiler.webbrowser, "open", opened.append)
+
+        profile = _build_simple_profile()
+        profiler._output_report(profile, export="html", output=None, script_stem="test", no_browser=False)
+
+        assert len(opened) == 1
+        assert opened[0].startswith("file://")
+
+    def test_output_html_no_browser_flag(self, tmp_path: Path, monkeypatch: pytest.MonkeyPatch) -> None:
+        """no_browser=True should not call webbrowser.open."""
+        monkeypatch.chdir(tmp_path)
+        opened: list[str] = []
+        monkeypatch.setattr(profiler.webbrowser, "open", opened.append)
+
+        profile = _build_simple_profile()
+        profiler._output_report(profile, export="html", output=None, script_stem="test", no_browser=True)
+
+        assert len(opened) == 0
+
+
+class TestProfilerMainIntegration:
+    """Integration tests for main()."""
+
+    def test_main_analyses_script_with_px_run(self, tmp_path: Path, monkeypatch: pytest.MonkeyPatch) -> None:
+        """main() should profile a script that calls px.run() and produce HTML."""
+        script = tmp_path / "mytool.py"
+        script.write_text(
+            "import pyflowx as px\n"
+            "graph = px.Graph.from_specs([\n"
+            "    px.TaskSpec('a', lambda: 1),\n"
+            "    px.TaskSpec('b', lambda: 2, depends_on=('a',)),\n"
+            "])\n"
+            "px.run(graph, strategy='sequential')\n",
+            encoding="utf-8",
+        )
+        out_file = tmp_path / "report.html"
+        monkeypatch.setattr(sys, "argv", ["pxp", "--no-browser", "-o", str(out_file), str(script)])
+
+        profiler.main()
+
+        assert out_file.exists()
+        content = out_file.read_text(encoding="utf-8")
+        assert "PyFlowX 性能剖面报告" in content
+        assert "任务时间线" in content
+
+    def test_main_analyses_script_with_clirunner(self, tmp_path: Path, monkeypatch: pytest.MonkeyPatch) -> None:
+        """main() should profile a script that uses CliRunner."""
+        script = tmp_path / "clirunner_tool.py"
+        script.write_text(
+            "import pyflowx as px\n"
+            "runner = px.CliRunner(\n"
+            "    aliases={'t': px.TaskSpec('t', lambda: 1)},\n"
+            ")\n"
+            "runner.run_cli(['t'])\n",
+            encoding="utf-8",
+        )
+        out_file = tmp_path / "report.html"
+        monkeypatch.setattr(sys, "argv", ["pxp", "--no-browser", "-o", str(out_file), str(script)])
+
+        profiler.main()
+
+        assert out_file.exists()
+        content = out_file.read_text(encoding="utf-8")
+        assert "PyFlowX 性能剖面报告" in content
+
+    def test_main_text_export(
+        self, tmp_path: Path, monkeypatch: pytest.MonkeyPatch, capsys: pytest.CaptureFixture[str]
+    ) -> None:
+        """main() -E text should print text to stdout."""
+        script = tmp_path / "simple.py"
+        script.write_text(
+            "import pyflowx as px\n"
+            "graph = px.Graph.from_specs([px.TaskSpec('a', lambda: 1)])\n"
+            "px.run(graph, strategy='sequential')\n",
+            encoding="utf-8",
+        )
+        monkeypatch.setattr(sys, "argv", ["pxp", "-E", "text", "--no-browser", str(script)])
+
+        profiler.main()
+        captured = capsys.readouterr()
+        assert "PyFlowX 性能剖面报告" in captured.out
+
+    def test_main_script_not_exist(
+        self, tmp_path: Path, monkeypatch: pytest.MonkeyPatch, capsys: pytest.CaptureFixture[str]
+    ) -> None:
+        """A missing script should exit with code 2."""
+        monkeypatch.setattr(sys, "argv", ["pxp", "--no-browser", str(tmp_path / "nonexistent.py")])
+        with pytest.raises(SystemExit) as exc_info:
+            profiler.main()
+        assert exc_info.value.code == 2
+
+    def test_main_no_px_run_captured(self, tmp_path: Path, monkeypatch: pytest.MonkeyPatch) -> None:
+        """A script that never calls px.run() should exit with code 1."""
+        script = tmp_path / "no_run.py"
+        script.write_text("print('just printing')\n", encoding="utf-8")
+        monkeypatch.setattr(sys, "argv", ["pxp", "--no-browser", str(script)])
+        with pytest.raises(SystemExit) as exc_info:
+            profiler.main()
+        assert exc_info.value.code == 1
+
+    def test_main_passes_script_args(self, tmp_path: Path, monkeypatch: pytest.MonkeyPatch) -> None:
+        """Script arguments should be passed to the target script."""
+        script = tmp_path / "argcheck.py"
+        script.write_text(
+            "import sys\n"
+            "assert sys.argv[1:] == ['myarg'], f'got {sys.argv[1:]}'\n"
+            "import pyflowx as px\n"
+            "px.run(px.Graph.from_specs([px.TaskSpec('a', lambda: 1)]), strategy='sequential')\n",
+            encoding="utf-8",
+        )
+        out_file = tmp_path / "report.html"
+        monkeypatch.setattr(sys, "argv", ["pxp", "--no-browser", "-o", str(out_file), str(script), "myarg"])
+
+        profiler.main()  # passes if no exception is raised
+
+    def test_main_handles_script_exception(self, tmp_path: Path, monkeypatch: pytest.MonkeyPatch) -> None:
+        """When the script raises, main() should catch it and still generate the report (if one was captured)."""
+        script = tmp_path / "raise.py"
+        script.write_text(
+            "import pyflowx as px\n"
+            "px.run(px.Graph.from_specs([px.TaskSpec('a', lambda: 1)]), strategy='sequential')\n"
+            "raise RuntimeError('after run')\n",
+            encoding="utf-8",
+        )
+        out_file = tmp_path / "report.html"
+        monkeypatch.setattr(sys, "argv", ["pxp", "--no-browser", "-o", str(out_file), str(script)])
+
+        profiler.main()  # passes if no exception is raised
+        assert out_file.exists()
+
+    def test_main_auto_calls_main_when_no_main_block(self, tmp_path: Path, monkeypatch: pytest.MonkeyPatch) -> None:
+        """A script with no __main__ block that defines main() should have it called automatically."""
+        script = tmp_path / "no_main_block.py"
+        script.write_text(
+            "import pyflowx as px\n"
+            "def main():\n"
+            "    px.run(px.Graph.from_specs([px.TaskSpec('a', lambda: 1)]), strategy='sequential')\n"
+            "# no if __name__ == '__main__' block\n",
+            encoding="utf-8",
+        )
+        out_file = tmp_path / "report.html"
+        monkeypatch.setattr(sys, "argv", ["pxp", "--no-browser", "-o", str(out_file), str(script)])
+
+        profiler.main()
+        assert out_file.exists()
+
+    def test_main_auto_calls_main_with_clirunner(self, tmp_path: Path, monkeypatch: pytest.MonkeyPatch) -> None:
+        """A script with no __main__ block whose main() calls CliRunner should have it called automatically."""
+        script = tmp_path / "cli_tool.py"
+        script.write_text(
+            "import pyflowx as px\n"
+            "def main():\n"
+            "    runner = px.CliRunner(\n"
+            "        aliases={'t': px.TaskSpec('t', lambda: 1)},\n"
+            "    )\n"
+            "    runner.run_cli(['t'])\n",
+            encoding="utf-8",
+        )
+        out_file = tmp_path / "report.html"
+        monkeypatch.setattr(sys, "argv", ["pxp", "--no-browser", "-o", str(out_file), str(script), "t"])
+
+        profiler.main()
+        assert out_file.exists()
+        content = out_file.read_text(encoding="utf-8")
+        assert "PyFlowX 性能剖面报告" in content
+
+    def test_main_no_main_function_exits_with_1(self, tmp_path: Path, monkeypatch: pytest.MonkeyPatch) -> None:
+        """A script with no main() and no px.run() call should exit with code 1."""
+        script = tmp_path / "no_main.py"
+        script.write_text("x = 1\n", encoding="utf-8")
+        monkeypatch.setattr(sys, "argv", ["pxp", "--no-browser", str(script)])
+        with pytest.raises(SystemExit) as exc_info:
+            profiler.main()
+        assert exc_info.value.code == 1
+
+
+class TestTryCallMain:
+    """Test _try_call_main."""
+
+    def test_calls_main_when_present(self) -> None:
+        """Should call main when the module dict contains a callable main."""
+        called: list[bool] = []
+
+        def fake_main() -> None:
+            called.append(True)
+
+        profiler._try_call_main({"main": fake_main})
+        assert called == [True]
+
+    def test_no_main_does_nothing(self) -> None:
+        """Should not raise when the module dict has no main."""
+        profiler._try_call_main({})  # passes if no exception is raised
+
+    def test_non_callable_main_does_nothing(self) -> None:
+        """Should not raise when main is not callable."""
+        profiler._try_call_main({"main": "not a function"})  # passes if no exception is raised
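The `TestProfilerArgumentParsing` cases in this file rely on `argparse`'s `parse_known_args` handing back whatever it does not recognize, so the target script and its arguments survive in `remaining`. Below is a minimal sketch of a parser with that behavior. It is not the actual `profiler._build_parser`: the short options mirror the tests, while the long spellings `--export` and `--output`, the `choices` list, and the helper names `build_parser` and `split_args` are assumptions made for illustration.

```python
import argparse
from typing import List, Tuple


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical stand-in for profiler._build_parser (assumed shape)."""
    parser = argparse.ArgumentParser(prog="pxp")
    parser.add_argument("-E", "--export", choices=["html", "text"], default="html")
    parser.add_argument("-o", "--output", default=None)
    parser.add_argument("--no-browser", action="store_true")
    return parser


def split_args(argv: List[str]) -> Tuple[argparse.Namespace, List[str]]:
    """Parse profiler options; anything unrecognized is left for the script."""
    return build_parser().parse_known_args(argv)


# The script name and its own arguments come back untouched in `remaining`.
args, remaining = split_args(["-E", "text", "pymake.py", "t", "--quiet"])
print(args.export, remaining)
```

One caveat of this approach: an option placed after the script name that happens to match a profiler option (for example `-o`) is consumed by the profiler's parser instead of being passed through, so profiler options are safest placed before the script path.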
@@ -2,160 +2,143 @@

 from __future__ import annotations

+from pathlib import Path
 from unittest.mock import patch

 import pytest

 from pyflowx.cli import pymake
-from pyflowx.conditions import Constants
-
-
-# ---------------------------------------------------------------------- #
-# maturin_build_cmd
-# ---------------------------------------------------------------------- #
-class TestMaturinBuildCmd:
-    """Test maturin_build_cmd function."""
-
-    def test_returns_list(self) -> None:
-        """Should return a list."""
-        cmd = pymake.maturin_build_cmd()
-        assert isinstance(cmd, list)
-
-    def test_contains_maturin_build(self) -> None:
-        """Should contain 'maturin' and 'build'."""
-        cmd = pymake.maturin_build_cmd()
-        assert "maturin" in cmd
-        assert "build" in cmd
-
-    def test_contains_release_flag(self) -> None:
-        """Should contain release flag '-r'."""
-        cmd = pymake.maturin_build_cmd()
-        assert "-r" in cmd
-
-    def test_windows_includes_target(self) -> None:
-        """On Windows, should include target-specific flags."""
-        cmd = pymake.maturin_build_cmd()
-        if Constants.IS_WINDOWS:
-            assert "--target" in cmd
-            assert "x86_64-win7-windows-msvc" in cmd
-            assert "-Zbuild-std" in cmd
-            assert "-i" in cmd
-            assert "python3.8" in cmd
-        else:
-            # On non-Windows, should not include Windows-specific flags
-            assert "--target" not in cmd
-
-    def test_does_not_mutate_on_multiple_calls(self) -> None:
-        """Multiple calls should return independent lists."""
-        cmd1 = pymake.maturin_build_cmd()
-        cmd2 = pymake.maturin_build_cmd()
-        assert cmd1 == cmd2
-        # Mutating one should not affect the other
-        cmd1.append("extra")
-        assert "extra" not in cmd2
-
-    def test_non_windows_excludes_target_flags(self) -> None:
-        """On non-Windows, should not include Windows-specific flags (covers branch 22->32)."""
-        from unittest.mock import patch
-
-        with patch.object(pymake.Constants, "IS_WINDOWS", False):
-            cmd = pymake.maturin_build_cmd()
-        assert "maturin" in cmd
-        assert "build" in cmd
-        assert "-r" in cmd
-        assert "--target" not in cmd
-        assert "-Zbuild-std" not in cmd


 # ---------------------------------------------------------------------- #
 # TaskSpec definitions
 # ---------------------------------------------------------------------- #
+def _find_task(name: str) -> pymake.px.TaskSpec:
+    """Look up the TaskSpec with the given name in pymake.tasks or aliases."""
+    for spec in pymake.tasks:
+        if spec.name == name:
+            return spec
+    # Single-task aliases (doc/lint/tox) are defined inline in the aliases dict
+    value = pymake.aliases.get(name)
+    if isinstance(value, pymake.px.TaskSpec):
+        return value
+    raise KeyError(f"task {name!r} not found")
+
+
 class TestTaskSpecDefinitions:
    """Test that all TaskSpec definitions are valid."""

    def test_uv_build_spec(self) -> None:
        """uv_build spec should be properly defined."""
-        assert pymake.uv_build.name == "uv_build"
-        assert pymake.uv_build.cmd == ["uv", "build"]
-        assert pymake.uv_build.skip_if_missing is False
+        spec = _find_task("uv_build")
+        assert spec.name == "uv_build"
+        assert spec.cmd == ["uv", "build"]
+        assert spec.skip_if_missing is False

    def test_maturin_build_spec(self) -> None:
        """maturin_build spec should be properly defined."""
-        assert pymake.maturin_build.name == "maturin_build"
-        assert isinstance(pymake.maturin_build.cmd, list)
-        assert pymake.maturin_build.skip_if_missing is False
+        spec = _find_task("maturin_build")
+        assert spec.name == "maturin_build"
+        assert isinstance(spec.cmd, list)
+        assert spec.skip_if_missing is False

    def test_uv_sync_spec(self) -> None:
        """uv_sync spec should be properly defined."""
-        assert pymake.uv_sync.name == "uv_sync"
-        assert pymake.uv_sync.cmd == ["uv", "sync"]
-        assert pymake.uv_sync.skip_if_missing is False
+        spec = _find_task("uv_sync")
+        assert spec.name == "uv_sync"
+        assert spec.cmd == ["uv", "sync"]
+        assert spec.skip_if_missing is False

    def test_git_clean_spec(self) -> None:
        """git_clean spec should be properly defined."""
-        assert pymake.git_clean.name == "git_clean"
-        assert pymake.git_clean.cmd == ["gitt", "c"]
-        assert pymake.git_clean.skip_if_missing is False
+        spec = _find_task("git_clean")
+        assert spec.name == "git_clean"
+        assert spec.cmd == ["gitt", "c"]
+        assert spec.skip_if_missing is False

    def test_test_spec(self) -> None:
        """test spec should be properly defined."""
-        assert pymake.test.name == "test"
-        assert isinstance(pymake.test.cmd, list)
-        assert "pytest" in pymake.test.cmd
-        assert "-m" in pymake.test.cmd
-        assert "not slow" in pymake.test.cmd
-        assert pymake.test.skip_if_missing is False
+        spec = _find_task("test")
+        assert spec.name == "test"
+        assert isinstance(spec.cmd, list)
+        assert "pytest" in spec.cmd
+        assert "-m" in spec.cmd
+        assert "not slow" in spec.cmd
+        assert spec.skip_if_missing is False

    def test_test_fast_spec(self) -> None:
        """test_fast spec should be properly defined."""
-        assert pymake.test_fast.name == "test_fast"
-        assert isinstance(pymake.test_fast.cmd, list)
-        assert "pytest" in pymake.test_fast.cmd
-        assert "-n" not in pymake.test_fast.cmd  # test_fast doesn't use parallel
-        assert pymake.test_fast.skip_if_missing is False
+        spec = _find_task("test_fast")
+        assert spec.name == "test_fast"
+        assert isinstance(spec.cmd, list)
+        assert "pytest" in spec.cmd
+        assert "-n" not in spec.cmd  # test_fast doesn't use parallel
+        assert spec.skip_if_missing is False

    def test_test_coverage_spec(self) -> None:
        """test_coverage spec should be properly defined."""
-        assert pymake.test_coverage.name == "test_coverage"
-        assert isinstance(pymake.test_coverage.cmd, list)
-        assert "pytest" in pymake.test_coverage.cmd
-        assert "--cov" in pymake.test_coverage.cmd
-        assert pymake.test_coverage.skip_if_missing is False
+        spec = _find_task("test_coverage")
+        assert spec.name == "test_coverage"
+        assert isinstance(spec.cmd, list)
+        assert "pytest" in spec.cmd
+        assert "--cov" in spec.cmd
+        assert spec.skip_if_missing is False

    def test_ruff_lint_spec(self) -> None:
-        """ruff_lint spec should be properly defined."""
-        assert pymake.ruff_lint.name == "lint"
-        assert isinstance(pymake.ruff_lint.cmd, list)
-        assert "ruff" in pymake.ruff_lint.cmd
-        assert "check" in pymake.ruff_lint.cmd
-        assert pymake.ruff_lint.skip_if_missing is False
+        """lint spec should be properly defined."""
+        spec = _find_task("lint")
+        assert spec.name == "lint"
+        assert isinstance(spec.cmd, list)
+        assert "ruff" in spec.cmd
+        assert "check" in spec.cmd
+        assert spec.skip_if_missing is False

    def test_doc_spec(self) -> None:
        """doc spec should be properly defined."""
-        assert pymake.doc.name == "doc"
-        assert isinstance(pymake.doc.cmd, list)
-        assert "sphinx-build" in pymake.doc.cmd
-        assert pymake.doc.skip_if_missing is False
+        spec = _find_task("doc")
+        assert spec.name == "doc"
+        assert isinstance(spec.cmd, list)
+        assert "sphinx-build" in spec.cmd
+        assert spec.skip_if_missing is False

    def test_hatch_publish_spec(self) -> None:
-        """hatch_publish spec should be properly defined."""
-        assert pymake.hatch_publish.name == "publish_python"
-        assert pymake.hatch_publish.cmd == ["hatch", "publish"]
-        assert pymake.hatch_publish.skip_if_missing is False
+        """publish_python spec should be properly defined."""
+        spec = _find_task("publish_python")
+        assert spec.name == "publish_python"
+        assert spec.cmd == ["hatch", "publish"]
+        assert spec.skip_if_missing is False

    def test_twine_publish_spec(self) -> None:
        """twine_publish spec should be properly defined."""
-        assert pymake.twine_publish.name == "twine_publish"
-        assert isinstance(pymake.twine_publish.cmd, list)
-        assert "twine" in pymake.twine_publish.cmd
-        assert "upload" in pymake.twine_publish.cmd
-        assert pymake.twine_publish.skip_if_missing is False
+        spec = _find_task("twine_publish")
+        assert spec.name == "twine_publish"
+        assert isinstance(spec.cmd, list)
+        assert "twine" in spec.cmd
+        assert "upload" in spec.cmd
+        assert spec.skip_if_missing is False

    def test_tox_spec(self) -> None:
        """tox spec should be properly defined."""
-        assert pymake.tox.name == "tox"
-        assert pymake.tox.cmd == ["tox", "-p", "auto"]
-        assert pymake.tox.skip_if_missing is False
+        spec = _find_task("tox")
+        assert spec.name == "tox"
+        assert spec.cmd == ["tox", "-p", "auto"]
+        assert spec.skip_if_missing is False
+
+    def test_all_tasks_have_correct_cwd(self) -> None:
+        """All tasks should have the correct cwd (the project root)."""
+        # Check that ROOT_DIR is defined correctly (three levels up reaches the project root)
+        expected_root = Path(__file__).parent.parent.parent
+        assert expected_root == pymake.ROOT_DIR
+
+        # Every command task in tasks should have the correct cwd
+        for spec in pymake.tasks:
+            if spec.cmd is not None:
+                assert spec.cwd == pymake.ROOT_DIR, f"cwd of task {spec.name} should be {pymake.ROOT_DIR}"
+
+        # Inline tasks in aliases (doc/lint/tox) should have the correct cwd as well
+        for name in ("doc", "lint", "tox"):
+            spec = _find_task(name)
+            assert spec.cwd == pymake.ROOT_DIR, f"cwd of task {name} should be {pymake.ROOT_DIR}"


 # ---------------------------------------------------------------------- #
@@ -1,9 +1,16 @@
 from __future__ import annotations

+import sys
 from pathlib import Path

 import pytest

+# Add the tests directory to sys.path so process-pool tests can import the module-level helpers in _proc_helper.
+# Process-pool pickling requires module-level callables; conftest.py also runs in xdist workers.
+_TESTS_DIR = str(Path(__file__).resolve().parent)
+if _TESTS_DIR not in sys.path:
+    sys.path.insert(0, _TESTS_DIR)
+

@pytest.fixture(autouse=True)
 def packtool_tmp_workdir(tmp_path: Path, monkeypatch: pytest.MonkeyPatch) -> None:
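The conftest.py comment in this hunk says that functions sent to a process pool must be module-level because they are pickled. The sketch below illustrates that constraint with the standard library only; `make_local` and `can_pickle` are hypothetical helpers written for this illustration and are not part of pyflowx or its test suite.

```python
import math
import pickle


def make_local():
    """Return a function defined inside another function (not importable by name)."""

    def local_fn(x: float) -> float:
        return x * 2

    return local_fn


def can_pickle(obj: object) -> bool:
    """Report whether pickle can serialize obj."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


# Functions are pickled by reference (module plus qualified name), so a
# callable reachable at module level works while a nested one does not.
print(can_pickle(math.sqrt))
print(can_pickle(make_local()))
```

This is why keeping helpers in an importable module such as `_proc_helper`, and making the tests directory importable in every worker, is a reasonable way to satisfy the process pool.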
@@ -0,0 +1,101 @@
+"""Tests for Graph.chain DSL."""
+
+from __future__ import annotations
+
+import pyflowx as px
+from pyflowx.task import TaskSpec
+
+
+def _fn() -> None:
+    return None
+
+
+def test_chain_basic_linkage() -> None:
+    """chain(a, b, c) should establish the a->b->c dependencies."""
+    a = TaskSpec("a", _fn)
+    b = TaskSpec("b", _fn)
+    c = TaskSpec("c", _fn)
+
+    graph = px.Graph().chain(a, b, c)
+
+    assert graph.all_specs()["b"].depends_on == ("a",)
+    assert graph.all_specs()["c"].depends_on == ("b",)
+    assert graph.all_specs()["a"].depends_on == ()
+
+
+def test_chain_single_spec() -> None:
+    """chain(a) should register only a, with no dependencies."""
+    a = TaskSpec("a", _fn)
+    graph = px.Graph().chain(a)
+    assert "a" in graph
+    assert graph.all_specs()["a"].depends_on == ()
+
+
+def test_chain_preserves_existing_deps() -> None:
+    """chain should preserve a spec's existing depends_on."""
+    a = TaskSpec("a", _fn)
+    b = TaskSpec("b", _fn)
+    c = TaskSpec("c", _fn, depends_on=("b",))
+
+    graph = px.Graph().chain(a, b, c)
+    # c already has depends_on=('b',); its predecessor b is already a dependency, so it is not added again
+    assert graph.all_specs()["c"].depends_on == ("b",)
+
+
+def test_chain_merges_existing_deps() -> None:
+    """chain should prepend the predecessor to existing dependencies (if absent)."""
+    a = TaskSpec("a", _fn)
+    x = TaskSpec("x", _fn)
+    c = TaskSpec("c", _fn, depends_on=("x",))
+
+    graph = px.Graph().chain(a, x, c)
+    # c's predecessor is x, but c already depends on x, so it is not duplicated
+    assert graph.all_specs()["c"].depends_on == ("x",)
+
+
+def test_chain_returns_self() -> None:
+    """chain returns self to support method chaining."""
+    a = TaskSpec("a", _fn)
+    graph = px.Graph()
+    assert graph.chain(a) is graph
+
+
+def test_chain_execution_order() -> None:
+    """chain should guarantee execution order."""
+    order: list[str] = []
+
+    def make(name: str):
+        def fn() -> str:
+            order.append(name)
+            return name
+        return fn
+
+    a = TaskSpec("a", make("a"))
+    b = TaskSpec("b", make("b"))
+    c = TaskSpec("c", make("c"))
+
+    graph = px.Graph().chain(a, b, c)
+    report = px.run(graph)
+    assert report.success
+    assert order == ["a", "b", "c"]
+
+
+def test_chain_with_decorator_specs() -> None:
+    """chain should work with the @task decorator."""
+
+    @px.task
+    def extract() -> int:
+        return 1
+
+    @px.task
+    def transform(extract: int) -> int:
+        return extract + 10
+
+    @px.task
+    def load(transform: int) -> int:
+        return transform + 100
+
+    graph = px.Graph().chain(extract, transform, load)
+    report = px.run(graph)
+    assert report.success
+    assert report["load"] == 111
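The chain tests above describe a simple linking rule: each spec gains its left-hand neighbor as a dependency unless it already lists it. The following sketch shows one way that rule could work on a simplified data model. `Spec` and the free function `chain` are hypothetical stand-ins, not pyflowx's `TaskSpec` or `Graph.chain`, and placing the predecessor before the existing dependencies is an assumption taken from the docstring of `test_chain_merges_existing_deps`.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Spec:
    """Hypothetical, simplified stand-in for TaskSpec."""

    name: str
    depends_on: Tuple[str, ...] = ()


def chain(*specs: Spec) -> Dict[str, Spec]:
    """Link specs left to right and return them keyed by name."""
    linked: Dict[str, Spec] = {}
    previous: Optional[str] = None
    for spec in specs:
        if previous is not None and previous not in spec.depends_on:
            # Assumed ordering: the predecessor goes before existing dependencies.
            spec = replace(spec, depends_on=(previous, *spec.depends_on))
        linked[spec.name] = spec
        previous = spec.name
    return linked


graph = chain(Spec("a"), Spec("b"), Spec("c", depends_on=("y",)))
print(graph["c"].depends_on)
```

Using an immutable `Spec` and `dataclasses.replace` keeps the caller's original objects unchanged, which is one plausible reason a chain helper would return new specs rather than mutate its inputs.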
@@ -17,7 +17,7 @@ class TestCommandReferences:

        runner = px.CliRunner(
            strategy="sequential",
-            graphs={
+            aliases={
                "build": px.Graph.from_specs([build_task]),
                "test": px.Graph.from_specs([test_task]),
                "all": px.Graph.from_specs([build_task, "test"]),
@@ -38,7 +38,7 @@ class TestCommandReferences:

        runner = px.CliRunner(
            strategy="sequential",
-            graphs={
+            aliases={
                "cmd1": px.Graph.from_specs([task1]),
                "cmd2": px.Graph.from_specs([task2]),
                "cmd3": px.Graph.from_specs([task3]),
@@ -57,7 +57,7 @@ class TestCommandReferences:

        runner = px.CliRunner(
            strategy="sequential",
-            graphs={
+            aliases={
                "lint": px.Graph.from_specs([lint_task, format_task]),
                "quick": px.Graph.from_specs(["lint.lint"]),
            },
@@ -75,7 +75,7 @@ class TestCommandReferences:

        runner = px.CliRunner(
            strategy="sequential",
-            graphs={
+            aliases={
                "cmd1": px.Graph.from_specs([task1]),
                "cmd2": px.Graph.from_specs(["cmd1", task2]),
                "cmd3": px.Graph.from_specs(["cmd2", task3]),
@@ -93,7 +93,7 @@ class TestCommandReferences:
        with pytest.raises(ValueError, match="循环引用"):
            px.CliRunner(
                strategy="sequential",
-                graphs={
+                aliases={
                    "cmd1": px.Graph.from_specs(["cmd1", task1]),
                },
            )
@@ -105,7 +105,7 @@ class TestCommandReferences:
        with pytest.raises(ValueError, match="引用的命令 'invalid' 不存在"):
            px.CliRunner(
                strategy="sequential",
-                graphs={
+                aliases={
                    "cmd1": px.Graph.from_specs(["invalid", task1]),
                },
            )
@@ -117,7 +117,7 @@ class TestCommandReferences:
        with pytest.raises(ValueError, match="任务 'invalid' 不存在于命令 'cmd1' 中"):
            px.CliRunner(
                strategy="sequential",
-                graphs={
+                aliases={
                    "cmd1": px.Graph.from_specs([task1]),
                    "cmd2": px.Graph.from_specs(["cmd1.invalid"]),
                },
@@ -130,7 +130,7 @@ class TestCommandReferences:

        runner = px.CliRunner(
            strategy="sequential",
-            graphs={
+            aliases={
                "cmd1": px.Graph.from_specs([task1, task2]),
                "cmd2": px.Graph.from_specs(["cmd1"]),
            },
@@ -148,7 +148,7 @@ class TestCommandReferences:

        runner = px.CliRunner(
            strategy="sequential",
-            graphs={
+            aliases={
                "cmd1": px.Graph.from_specs([task1, task2]),
                "cmd2": px.Graph.from_specs(["cmd1", task3]),
            },
@@ -168,7 +168,7 @@ class TestCommandReferences:

        runner = px.CliRunner(
            strategy="sequential",
-            graphs={
+            aliases={
                "cmd1": px.Graph.from_specs([task1]),
                "cmd2": px.Graph.from_specs([task2, task3]),
                "cmd3": px.Graph.from_specs([task4]),
@@ -205,7 +205,7 @@ class TestCommandReferences:

        runner = px.CliRunner(
            strategy="sequential",
-            graphs={
+            aliases={
                "cmd1": px.Graph.from_specs([task1]),
                "cmd2": px.Graph.from_specs([task2]),
                "all": px.Graph.from_specs(["cmd1", "cmd2", task3, task4, task5]),
@@ -242,7 +242,7 @@ class TestCommandReferences:

        runner = px.CliRunner(
            strategy="sequential",
-            graphs={
+            aliases={
                "cmd1": px.Graph.from_specs([task1, task2]),
                "cmd2": px.Graph.from_specs([task3]),
                "all": px.Graph.from_specs(["cmd1", "cmd2", task4]),
@@ -279,7 +279,7 @@ class TestCommandReferences:

        runner = px.CliRunner(
            strategy="sequential",
-            graphs={
+            aliases={
                "c": px.Graph.from_specs([git_clean]),
                "tc": px.Graph.from_specs([typecheck, "lint"]),
                "lint": px.Graph.from_specs([lint, format_task]),
@@ -319,7 +319,7 @@ class TestCommandReferences:

        runner = px.CliRunner(
            strategy="sequential",
-            graphs={
+            aliases={
                "cmd1": px.Graph.from_specs([task1]),
                "cmd2": px.Graph.from_specs([task2]),
                "cmd3": px.Graph.from_specs([task3]),
@@ -350,7 +350,7 @@ class TestCommandReferences:

        runner = px.CliRunner(
            strategy="sequential",
-            graphs={
+            aliases={
                "all": px.Graph.from_specs([task1, task2, task3]),
            },
        )
@@ -373,7 +373,7 @@ class TestCommandReferences:

        runner = px.CliRunner(
            strategy="sequential",
-            graphs={
+            aliases={
                "cmd1": px.Graph.from_specs([task1, task2]),
                "all": px.Graph.from_specs(["cmd1"]),
            },
@@ -399,7 +399,7 @@ class TestCommandReferences:

        runner = px.CliRunner(
            strategy="sequential",
-            graphs={
+            aliases={
                "cmd1": px.Graph.from_specs([task1]),
                "cmd2": px.Graph.from_specs(["cmd1", task2]),
                "cmd3": px.Graph.from_specs(["cmd2", task3]),
@@ -430,7 +430,7 @@ class TestCommandReferences:

        runner = px.CliRunner(
            strategy="sequential",
-            graphs={
+            aliases={
                "cmd1": px.Graph.from_specs([task1, task2]),  # Parallel tasks
                "cmd2": px.Graph.from_specs([task3, task4]),  # Parallel tasks
                "all": px.Graph.from_specs(["cmd1", "cmd2"]),
@@ -465,7 +465,7 @@ class TestCommandReferences:

        runner = px.CliRunner(
            strategy="sequential",
-            graphs={
+            aliases={
                "clean": px.Graph.from_specs([clean]),
                "build": px.Graph.from_specs([build1, build2]),
                "test": px.Graph.from_specs([test1, test2]),
@@ -301,3 +301,59 @@ def test_dir_exists_false(tmp_path: Path):
    missing = tmp_path / "nonexistent"
    cond = BuiltinConditions.DIR_EXISTS(missing)
    assert cond({}) is False
+
+
+def test_builtin_is_windows_returns_module_condition():
+    """BuiltinConditions.IS_WINDOWS() should return the module-level IS_WINDOWS."""
+    assert BuiltinConditions.IS_WINDOWS() is IS_WINDOWS
+
+
+def test_builtin_is_linux_returns_module_condition():
+    """BuiltinConditions.IS_LINUX() should return the module-level IS_LINUX."""
+    assert BuiltinConditions.IS_LINUX() is IS_LINUX
+
+
+def test_builtin_is_macos_returns_module_condition():
+    """BuiltinConditions.IS_MACOS() should return the module-level IS_MACOS."""
+    assert BuiltinConditions.IS_MACOS() is IS_MACOS
+
+
+def test_builtin_is_posix_returns_module_condition():
+    """BuiltinConditions.IS_POSIX() should return the module-level IS_POSIX."""
+    assert BuiltinConditions.IS_POSIX() is IS_POSIX
+
+
+def test_file_content_exists_missing_file(tmp_path: Path):
+    """FILE_CONTENT_EXISTS returns False when the file does not exist."""
+    cond = BuiltinConditions.FILE_CONTENT_EXISTS(tmp_path / "missing.txt", "x")
+    assert cond({}) is False
+
+
+def test_file_content_exists_contains_content(tmp_path: Path):
+    """FILE_CONTENT_EXISTS returns True when the file contains the content."""
+    f = tmp_path / "f.txt"
+    f.write_text("hello world", encoding="utf-8")
+    cond = BuiltinConditions.FILE_CONTENT_EXISTS(f, "world")
+    assert cond({}) is True
+
+
+def test_file_content_exists_not_contains_content(tmp_path: Path):
+    """FILE_CONTENT_EXISTS returns False when the file does not contain the content."""
+    f = tmp_path / "f.txt"
+    f.write_text("hello", encoding="utf-8")
+    cond = BuiltinConditions.FILE_CONTENT_EXISTS(f, "missing")
+    assert cond({}) is False
+
+
+def test_file_content_exists_decode_error_returns_false(tmp_path: Path):
+    """FILE_CONTENT_EXISTS should return False for a non-UTF-8 file (the decode error is swallowed)."""
+    f = tmp_path / "bin.dat"
+    f.write_bytes(b"\xff\xfe\x00bad")
+    cond = BuiltinConditions.FILE_CONTENT_EXISTS(f, "x")
+    assert cond({}) is False
+
+
+def test_dep_matches_missing_dep_returns_false():
+    """DEP_MATCHES should return False when the dependency is missing (covers the `if not in ctx` branch)."""
+    cond = BuiltinConditions.DEP_MATCHES("missing", lambda _v: True)
+    assert cond({}) is False
@@ -0,0 +1,62 @@
+"""Tests for process executor (spec.executor='process')."""
+
+from __future__ import annotations
+
+import pytest
+
+# pyrefly: ignore[missing-import]
+from _proc_helper import add, cpu_heavy, slow_sleep, sub
+
+import pyflowx as px
+from pyflowx.errors import TaskFailedError
+
+
+def test_process_executor_runs_cpu_task() -> None:
+    """executor='process' should run a CPU-bound task in the process pool."""
+    spec = px.TaskSpec("cpu", fn=cpu_heavy, args=(1000,), executor="process")
+    graph = px.Graph.from_specs([spec])
+    report = px.run(graph)
+    assert report.success
+    assert report["cpu"] == sum(i * i for i in range(1000))
+
+
+def test_process_executor_with_dependency() -> None:
+    """Process-pool tasks should support dependency injection."""
+    spec1 = px.TaskSpec("a", fn=cpu_heavy, args=(100,), executor="process")
+    spec2 = px.TaskSpec("b", fn=add, args=(3, 4), executor="process", depends_on=("a",))
+    graph = px.Graph.from_specs([spec1, spec2])
+    report = px.run(graph)
+    assert report.success
+    assert report["b"] == 7
+
+
+def test_process_executor_default_is_thread() -> None:
+    """TaskSpec.executor should default to 'thread'."""
+    spec = px.TaskSpec("x", fn=lambda: None)
+    assert spec.executor == "thread"
+
+
+def test_inline_executor_runs_in_event_loop() -> None:
+    """executor='inline' should call the function directly on the event-loop thread."""
+    spec = px.TaskSpec("inline", fn=add, args=(10, 20), executor="inline")
+    graph = px.Graph.from_specs([spec])
+    report = px.run(graph)
+    assert report.success
+    assert report["inline"] == 30
+
+
+def test_process_executor_with_kwargs() -> None:
+    """Process-pool tasks should support kwargs injection."""
+    spec = px.TaskSpec("kw", fn=sub, args=(10,), kwargs={"b": 3}, executor="process")
+    graph = px.Graph.from_specs([spec])
+    report = px.run(graph)
+    assert report.success
+    assert report["kw"] == 7
+
+
+def test_process_executor_timeout() -> None:
+    """A process-pool task that times out should raise TaskFailedError."""
+    spec = px.TaskSpec("slow", fn=slow_sleep, args=(10.0,), executor="process", timeout=0.1)
+    graph = px.Graph.from_specs([spec])
+    with pytest.raises(TaskFailedError):
+        px.run(graph)
@@ -99,7 +99,10 @@ def test_verbose_run_with_skipped_lifecycle(capsys: pytest.CaptureFixture[str]):


 def test_verbose_run_with_user_callback():
-    """Test px.run with verbose=True and user callback both called."""
+    """Test px.run with verbose=True and user callback both called.
+
+    Expected event sequence: RUNNING (started) → SUCCESS (finished).
+    """
    events = []

    def on_event(event: px.TaskEvent):
@@ -109,8 +112,9 @@ def test_verbose_run_with_user_callback():
    graph = px.Graph.from_specs([spec])
    report = px.run(graph, strategy="sequential", verbose=True, on_event=on_event)
    assert report.success
-    assert len(events) == 1
-    assert events[0].status == px.TaskStatus.SUCCESS
+    assert len(events) == 2
+    assert events[0].status == px.TaskStatus.RUNNING
+    assert events[1].status == px.TaskStatus.SUCCESS


 def test_verbose_event_callback_success():
@@ -5,8 +5,8 @@ from __future__ import annotations
 import pytest

 import pyflowx as px
+from pyflowx.compose import GraphComposer, compose
 from pyflowx.errors import CycleError, DuplicateTaskError, MissingDependencyError
-from pyflowx.graph import GraphComposer, compose


 def _fn() -> None:
@@ -319,6 +319,79 @@ def test_compose_function() -> None:
    assert "a1" in resolved["cmd_b"]


+def test_graph_composer_expand_refs_multiple_refs_chain() -> None:
+    """expand_refs with multiple refs should chain dependencies: the first task of a later ref depends on the last task of the previous ref."""
+    graph_a = px.Graph.from_specs([px.TaskSpec("a1", _fn)])
+    graph_c = px.Graph.from_specs([px.TaskSpec("c1", _fn)])
+    graph_b = px.Graph.from_specs([px.TaskSpec("b1", _fn)])
+    graph_b._pending_refs = ["cmd_a", "cmd_c"]
+
+    composer = GraphComposer({"cmd_a": graph_a, "cmd_c": graph_c, "cmd_b": graph_b})
+    resolved = composer.resolve_all()
+
+    # c1 should depend on a1 (first task of the later ref depends on the last task of the earlier ref)
+    assert "a1" in resolved["cmd_b"]
+    assert "c1" in resolved["cmd_b"]
+    assert "b1" in resolved["cmd_b"]
+    c1_spec = resolved["cmd_b"].all_specs()["c1"]
+    assert "a1" in c1_spec.depends_on
+
+
+def test_graph_composer_expand_refs_ref_returns_empty() -> None:
+    """When expand_refs references an empty graph, previous_ref_last_task stays None and original_specs takes the else branch."""
+    graph_empty = px.Graph.from_specs([])
+    graph_b = px.Graph.from_specs([px.TaskSpec("b1", _fn)])
+    graph_b._pending_refs = ["empty_cmd"]
+
+    composer = GraphComposer({"empty_cmd": graph_empty, "cmd_b": graph_b})
+    resolved = composer.resolve_all()
+
+    # b1 is kept, with no extra dependencies
+    assert "b1" in resolved["cmd_b"]
+    b1_spec = resolved["cmd_b"].all_specs()["b1"]
+    assert b1_spec.depends_on == ()
+
+
+def test_graph_composer_expand_refs_multiple_original_specs_serialized() -> None:
+    """expand_refs should serialize multiple original_specs, with the first one depending on the last task of the ref."""
+    graph_a = px.Graph.from_specs([px.TaskSpec("a1", _fn)])
+    graph_b = px.Graph.from_specs([
+        px.TaskSpec("b1", _fn),
+        px.TaskSpec("b2", _fn),
+        px.TaskSpec("b3", _fn),
+    ])
+    graph_b._pending_refs = ["cmd_a"]
+
+    composer = GraphComposer({"cmd_a": graph_a, "cmd_b": graph_b})
+    resolved = composer.resolve_all()
+
+    specs = resolved["cmd_b"].all_specs()
+    # b1 depends on a1 (last task of the ref)
+    assert "a1" in specs["b1"].depends_on
+    # b2 depends on b1, b3 depends on b2 (serial)
+    assert "b1" in specs["b2"].depends_on
+    assert "b2" in specs["b3"].depends_on
+
+
+def test_graph_composer_parse_ref_dot_notation_success() -> None:
+    """parse_ref with the 'cmd.task' form should return the single matching TaskSpec."""
+    graph_a = px.Graph.from_specs([px.TaskSpec("a1", _fn), px.TaskSpec("a2", _fn)])
+    composer = GraphComposer({"cmd_a": graph_a})
+
+    result = composer.parse_ref("cmd_a.a2", "cmd_b")
+    assert len(result) == 1
+    assert result[0].name == "a2"
+
+
+def test_graph_composer_parse_ref_dot_notation_cmd_not_found() -> None:
+    """parse_ref with the 'missing.task' form should detect that the command does not exist."""
+    graph_a = px.Graph.from_specs([px.TaskSpec("a1", _fn)])
+    composer = GraphComposer({"cmd_a": graph_a})
+
+    with pytest.raises(ValueError, match="引用的命令 'missing' 不存在"):
+        _ = composer.parse_ref("missing.task", "cmd_b")
+
+
 # ---------------------------------------------------------------------- #
 # resolved_spec defaults tests
 # ---------------------------------------------------------------------- #
@@ -0,0 +1,152 @@
+"""Tests for Graph namespace and add_subgraph."""
+
+from __future__ import annotations
+
+import pytest
+
+import pyflowx as px
+
+
+def _fn() -> None:
+    return None
+
+
+def test_graph_namespace_field_default_none() -> None:
+    """Graph's namespace defaults to None."""
+    graph = px.Graph()
+    assert graph.namespace is None
+
+
+def test_graph_from_specs_with_namespace() -> None:
+    """from_specs(namespace=...) should set graph.namespace."""
+    graph = px.Graph.from_specs([px.TaskSpec("a", _fn)], namespace="ns1")
+    assert graph.namespace == "ns1"
+
+
+def test_add_subgraph_prefixes_task_names() -> None:
+    """add_subgraph should prefix subgraph task names with the namespace."""
+    sub = px.Graph.from_specs(
+        [px.TaskSpec("extract", _fn), px.TaskSpec("build", _fn, depends_on=("extract",))],
+        namespace="build",
+    )
+    main = px.Graph.from_specs([px.TaskSpec("start", _fn)])
+    main.add_subgraph(sub)
+
+    assert "start" in main
+    assert "build:extract" in main
+    assert "build:build" in main
+
+
+def test_add_subgraph_renames_internal_deps() -> None:
+    """add_subgraph should prefix the subgraph's internal dependency names."""
+    sub = px.Graph.from_specs(
+        [px.TaskSpec("a", _fn), px.TaskSpec("b", _fn, depends_on=("a",))],
+        namespace="ns",
+    )
+    main = px.Graph()
+    main.add_subgraph(sub)
+
+    b_spec = main.all_specs()["ns:b"]
+    assert b_spec.depends_on == ("ns:a",)
+
+
+def test_add_subgraph_all_internal_deps_prefixed() -> None:
+    """add_subgraph prefixes every task in the subgraph, including depended-on ones."""
+    sub = px.Graph.from_specs(
+        [px.TaskSpec("ext", _fn), px.TaskSpec("b", _fn, depends_on=("ext",))],
+        namespace="ns",
+    )
+    main = px.Graph()
+    main.add_subgraph(sub)
+
+    b_spec = main.all_specs()["ns:b"]
+    assert b_spec.depends_on == ("ns:ext",)
+    assert "ns:ext" in main
+
+
+def test_add_subgraph_requires_namespace() -> None:
+    """add_subgraph should raise ValueError when there is no namespace."""
+    sub = px.Graph.from_specs([px.TaskSpec("a", _fn)])  # no namespace
+    main = px.Graph()
+    with pytest.raises(ValueError, match="namespace"):
+        main.add_subgraph(sub)
+
+
+def test_add_subgraph_explicit_namespace_overrides() -> None:
+    """add_subgraph(namespace=...) should override the subgraph's own namespace."""
+    sub = px.Graph.from_specs([px.TaskSpec("a", _fn)], namespace="original")
+    main = px.Graph()
+    main.add_subgraph(sub, namespace="override")
+
+    assert "override:a" in main
+    assert "original:a" not in main
+
+
+def test_add_subgraph_internal_injection_works() -> None:
+    """Dependency injection inside the subgraph should work through the wrapper."""
+    sub = px.Graph.from_specs(
+        [
+            px.TaskSpec("extract", lambda: [1, 2, 3]),
+            px.TaskSpec("build", lambda extract: [x * 2 for x in extract], depends_on=("extract",)),
+        ],
+        namespace="build",
+    )
+    main = px.Graph()
+    main.add_subgraph(sub)
+
+    report = px.run(main)
+    assert report.success
+    assert report["build:build"] == [2, 4, 6]
+
+
+def test_add_subgraph_cross_namespace_ref_via_context() -> None:
+    """Cross-namespace references should be received via a Context annotation."""
+
+    def consumer(ctx: px.Context) -> str:
+        return f"got {ctx['ns:data']}"
+
+    sub = px.Graph.from_specs(
+        [px.TaskSpec("data", lambda: "data_value")],
+        namespace="ns",
+    )
+    main = px.Graph()
+    main.add_subgraph(sub)
+
+    main.add(px.TaskSpec("consumer", consumer, depends_on=("ns:data",)))
+
+    report = px.run(main)
+    assert report.success
+    assert report["consumer"] == "got data_value"
+
+
+def test_add_subgraph_context_annotation_in_subgraph() -> None:
+    """When a task inside the subgraph uses a Context annotation, the wrapper should pass it through correctly."""
+
+    def sink(ctx: px.Context) -> int:
+        return ctx["src"]
+
+    sub = px.Graph.from_specs(
+        [
+            px.TaskSpec("src", lambda: 42),
+            px.TaskSpec("sink", sink, depends_on=("src",)),
+        ],
+        namespace="ns",
+    )
+    main = px.Graph()
+    main.add_subgraph(sub)
+
+    report = px.run(main)
+    assert report.success
+    assert report["ns:sink"] == 42
+
+
+def test_add_subgraph_chained() -> None:
+    """Multiple subgraphs can be merged into the main graph by chaining."""
+    sub_a = px.Graph.from_specs([px.TaskSpec("a", _fn)], namespace="nsA")
+    sub_b = px.Graph.from_specs([px.TaskSpec("b", _fn)], namespace="nsB")
+
+    main = px.Graph()
+    main.add_subgraph(sub_a).add_subgraph(sub_b)
+
+    assert "nsA:a" in main
+    assert "nsB:b" in main
@@ -0,0 +1,574 @@
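+"""Tests for the performance profile (ProfileReport).
+
+Coverage strategy:
+* Build a RunReport with timestamps plus a Graph, then verify the critical path, parallelism, and bottleneck ordering.
+* Edge cases: empty report, single task, no timestamps, SKIPPED tasks, graph validation failure.
+* Output formats: to_dict / describe / top_bottlenecks / critical_tasks.
+"""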
+
+from __future__ import annotations
+
+from datetime import datetime, timedelta
+from typing import Any
+
+import pyflowx as px
+from pyflowx.profiling import ProfileReport, TaskProfile
+from pyflowx.task import TaskResult, TaskSpec, TaskStatus
+
+
+def _fn() -> int:
+    return 1
+
+
+def _spec(name: str, deps: tuple[str, ...] = ()) -> TaskSpec[Any]:
+    return TaskSpec[Any](name, _fn, depends_on=deps)
+
+
+def _result(
+    name: str,
+    start: datetime,
+    duration: float,
+    *,
+    status: TaskStatus = TaskStatus.SUCCESS,
+    attempts: int = 1,
+) -> TaskResult[Any]:
+    """Build a TaskResult with timestamps."""
+    end = start + timedelta(seconds=duration) if duration > 0 else start
+    return TaskResult[Any](
+        spec=_spec(name),
+        status=status,
+        value=None,
+        attempts=attempts,
+        started_at=start if duration > 0 or status != TaskStatus.SKIPPED else None,
+        finished_at=end if duration > 0 or status != TaskStatus.SKIPPED else None,
+    )
+
+
+def _skipped_result(name: str, reason: str = "skip") -> TaskResult[Any]:
+    """Build a SKIPPED result (no timestamps)."""
+    return TaskResult[Any](
+        spec=_spec(name),
+        status=TaskStatus.SKIPPED,
+        reason=reason,
+    )
+
+
+class TestProfileReportConstruction:
+    """Tests for ProfileReport construction."""
+
+    def test_empty_report(self) -> None:
+        """An empty report should yield an empty profile."""
+        report = px.RunReport()
+        graph = px.Graph()
+        profile = ProfileReport.from_report(report, graph)
+        assert len(profile.tasks) == 0
+        assert profile.total_duration == 0.0
+        assert profile.critical_path == ()
+        assert profile.critical_path_duration == 0.0
+        assert profile.avg_parallelism == 0.0
+        assert profile.peak_parallelism == 0
+
+    def test_single_task(self) -> None:
+        """Single task: the critical path is the task itself and parallelism is 1."""
+        start = datetime(2024, 1, 1, 0, 0, 0)
+        report = px.RunReport()
+        report.results["a"] = _result("a", start, 1.5)
+        graph = px.Graph.from_specs([_spec("a")])
+
+        profile = ProfileReport.from_report(report, graph)
+
+        assert len(profile.tasks) == 1
+        assert profile.tasks[0].name == "a"
+        assert profile.tasks[0].duration == 1.5
+        assert profile.tasks[0].is_on_critical_path
+        assert profile.total_duration == 1.5
+        assert profile.critical_path == ("a",)
+        assert profile.critical_path_duration == 1.5
+        assert profile.avg_parallelism == 1.0
+        assert profile.peak_parallelism == 1
+        assert profile.parallelism_efficiency == 1.0
+
+    def test_serial_chain(self) -> None:
+        """Serial chain a -> b -> c: the critical path is the whole chain, efficiency 100%."""
+        start = datetime(2024, 1, 1, 0, 0, 0)
+        report = px.RunReport()
+        report.results["a"] = _result("a", start, 1.0)
+        report.results["b"] = _result("b", start + timedelta(seconds=1), 2.0)
+        report.results["c"] = _result("c", start + timedelta(seconds=3), 1.5)
+        graph = px.Graph.from_specs([
+            _spec("a"),
+            _spec("b", deps=("a",)),
+            _spec("c", deps=("b",)),
+        ])
+
+        profile = ProfileReport.from_report(report, graph)
+
+        assert profile.total_duration == 4.5
+        assert profile.critical_path_duration == 4.5
+        assert profile.critical_path == ("a", "b", "c")
+        assert profile.parallelism_efficiency == 1.0
+        assert profile.peak_parallelism == 1
+        assert profile.avg_parallelism == 1.0
+
+    def test_parallel_tasks(self) -> None:
+        """Parallel tasks a and b run concurrently: the critical path is the longer of the two (b)."""
+        start = datetime(2024, 1, 1, 0, 0, 0)
+        report = px.RunReport()
+        report.results["a"] = _result("a", start, 1.0)
+        report.results["b"] = _result("b", start, 2.0)
+        graph = px.Graph.from_specs([_spec("a"), _spec("b")])
+
+        profile = ProfileReport.from_report(report, graph)
+
+        # wall-clock = 2.0, critical path = 2.0 (b), efficiency = 1.0,
+        # because the critical path is by definition the longest path and here equals wall-clock
+        assert profile.total_duration == 2.0
+        assert profile.critical_path_duration == 2.0
+        assert profile.critical_path == ("b",)
+        assert profile.peak_parallelism == 2
+        # average parallelism = (1.0 + 2.0) / 2.0 = 1.5
+        assert profile.avg_parallelism == 1.5
+
+    def test_parallel_with_join(self) -> None:
+        """a and b run in parallel, then join into c: the critical path is the longer of a->c and b->c."""
+        start = datetime(2024, 1, 1, 0, 0, 0)
+        report = px.RunReport()
+        report.results["a"] = _result("a", start, 1.0)
+        report.results["b"] = _result("b", start, 3.0)
+        report.results["c"] = _result("c", start + timedelta(seconds=3), 1.0)
+        graph = px.Graph.from_specs([
+            _spec("a"),
+            _spec("b"),
+            _spec("c", deps=("a", "b")),
+        ])
+
+        profile = ProfileReport.from_report(report, graph)
+
+        # critical path = b -> c (3 + 1 = 4)
+        assert profile.critical_path_duration == 4.0
+        assert profile.critical_path == ("b", "c")
+        assert profile.tasks[0].is_on_critical_path is False  # a is not on the critical path
+        # b and c are on the critical path
+        assert profile.task("b").is_on_critical_path
+        assert profile.task("c").is_on_critical_path
+
+    def test_skipped_task_no_timestamp(self) -> None:
+        """A SKIPPED task has no timestamps and does not affect the parallelism calculation."""
+        start = datetime(2024, 1, 1, 0, 0, 0)
+        report = px.RunReport()
+        report.results["a"] = _result("a", start, 1.0)
+        report.results["b"] = _skipped_result("b")
+        graph = px.Graph.from_specs([_spec("a"), _spec("b")])
+
+        profile = ProfileReport.from_report(report, graph)
+
+        # b is SKIPPED, duration=0
+        assert profile.task("b").status == TaskStatus.SKIPPED
+        assert profile.task("b").duration == 0.0
+        assert profile.peak_parallelism == 1  # only a is running
+
+
+class TestWaitTime:
+    """Tests for wait-time calculation."""
+
+    def test_no_deps_zero_wait(self) -> None:
+        """A task with no dependencies has a wait time of 0."""
+        start = datetime(2024, 1, 1, 0, 0, 0)
+        report = px.RunReport()
+        report.results["a"] = _result("a", start, 1.0)
+        graph = px.Graph.from_specs([_spec("a")])
+
+        profile = ProfileReport.from_report(report, graph)
+
+        assert profile.task("a").wait_time == 0.0
+
+    def test_wait_after_dep_completes(self) -> None:
+        """b starts 0.5s after a finishes."""
+        start = datetime(2024, 1, 1, 0, 0, 0)
+        report = px.RunReport()
+        report.results["a"] = _result("a", start, 1.0)
+        report.results["b"] = _result("b", start + timedelta(seconds=1.5), 1.0)
+        graph = px.Graph.from_specs([
+            _spec("a"),
+            _spec("b", deps=("a",)),
+        ])
+
+        profile = ProfileReport.from_report(report, graph)
+
+        assert profile.task("b").wait_time == 0.5
+
+    def test_wait_negative_clamped_to_zero(self) -> None:
+        """If b starts before a finishes (abnormal case), the wait time should be clamped to 0."""
+        start = datetime(2024, 1, 1, 0, 0, 0)
+        report = px.RunReport()
+        report.results["a"] = _result("a", start, 2.0)
+        # b starts before a has finished (should not happen, but can)
+        report.results["b"] = _result("b", start + timedelta(seconds=1), 1.0)
+        graph = px.Graph.from_specs([
+            _spec("a"),
+            _spec("b", deps=("a",)),
+        ])
+
+        profile = ProfileReport.from_report(report, graph)
+
+        # a ends at t=2, b starts at t=1, delta = -1, clamped to 0
+        assert profile.task("b").wait_time == 0.0
+
+    def test_skipped_task_zero_wait(self) -> None:
+        """A SKIPPED task has a wait time of 0."""
+        start = datetime(2024, 1, 1, 0, 0, 0)
+        report = px.RunReport()
+        report.results["a"] = _result("a", start, 1.0)
+        report.results["b"] = _skipped_result("b")
+        graph = px.Graph.from_specs([
+            _spec("a"),
+            _spec("b", deps=("a",)),
+        ])
+
+        profile = ProfileReport.from_report(report, graph)
+
+        assert profile.task("b").wait_time == 0.0
+
+
+class TestCriticalPath:
+    """Tests for critical-path analysis."""
+
+    def test_diamond_dependency(self) -> None:
+        """Diamond dependency a -> b -> d, a -> c -> d: the critical path follows the longer branch."""
+        start = datetime(2024, 1, 1, 0, 0, 0)
+        report = px.RunReport()
+        report.results["a"] = _result("a", start, 1.0)
+        report.results["b"] = _result("b", start + timedelta(seconds=1), 3.0)
+        report.results["c"] = _result("c", start + timedelta(seconds=1), 1.0)
+        report.results["d"] = _result("d", start + timedelta(seconds=4), 1.0)
+        graph = px.Graph.from_specs([
+            _spec("a"),
+            _spec("b", deps=("a",)),
+            _spec("c", deps=("a",)),
+            _spec("d", deps=("b", "c")),
+        ])
+
+        profile = ProfileReport.from_report(report, graph)
+
+        # critical path: a -> b -> d = 1 + 3 + 1 = 5
+        assert profile.critical_path_duration == 5.0
+        assert profile.critical_path == ("a", "b", "d")
+
+    def test_graph_validation_failure_returns_empty(self) -> None:
+        """A graph that fails validation (has a cycle) should fall back to an empty critical path."""
+        start = datetime(2024, 1, 1, 0, 0, 0)
+        report = px.RunReport()
+        report.results["a"] = _result("a", start, 1.0)
+        # manually build a graph with a cycle (bypassing validation)
+        graph = px.Graph()
+        graph.specs["a"] = _spec("a", deps=("b",))
+        graph.specs["b"] = _spec("b", deps=("a",))
+        graph.deps["a"] = ("b",)
+        graph.deps["b"] = ("a",)
+
+        profile = ProfileReport.from_report(report, graph)
+
+        # layers() raises CycleError, so the result falls back to empty
+        assert profile.critical_path == ()
+        assert profile.critical_path_duration == 0.0
+
+
+class TestParallelism:
+    """Tests for parallelism calculation."""
+
+    def test_no_timestamps_zero_parallelism(self) -> None:
+        """No task has timestamps: parallelism is 0."""
+        report = px.RunReport()
+        report.results["a"] = TaskResult[Any](spec=_spec("a"), status=TaskStatus.SUCCESS)
+        graph = px.Graph.from_specs([_spec("a")])
+
+        profile = ProfileReport.from_report(report, graph)
+
+        assert profile.avg_parallelism == 0.0
+        assert profile.peak_parallelism == 0
+
+    def test_zero_duration_excluded(self) -> None:
+        """Zero-duration tasks (end <= start) are excluded from the parallelism calculation."""
+        start = datetime(2024, 1, 1, 0, 0, 0)
+        report = px.RunReport()
+        report.results["a"] = _result("a", start, 0.0)  # zero duration
+        report.results["b"] = _result("b", start, 1.0)
+        graph = px.Graph.from_specs([_spec("a"), _spec("b")])
+
+        profile = ProfileReport.from_report(report, graph)
+
+        # only b counts, peak = 1
+        assert profile.peak_parallelism == 1
+
+    def test_skipped_with_timestamps_excluded(self) -> None:
+        """SKIPPED tasks are excluded from the parallelism calculation even when they carry timestamps."""
+        start = datetime(2024, 1, 1, 0, 0, 0)
+        report = px.RunReport()
+        # SKIPPED but with timestamps (abnormal, but can happen)
+        report.results["a"] = _result("a", start, 1.0, status=TaskStatus.SKIPPED)
+        report.results["b"] = _result("b", start, 1.0)
+        graph = px.Graph.from_specs([_spec("a"), _spec("b")])
+
+        profile = ProfileReport.from_report(report, graph)
+
+        # a is SKIPPED and excluded; only b counts
+        assert profile.peak_parallelism == 1
+
+    def test_peak_parallelism_three_tasks(self) -> None:
+        """Three fully overlapping tasks: peak parallelism = 3."""
+        start = datetime(2024, 1, 1, 0, 0, 0)
+        report = px.RunReport()
+        report.results["a"] = _result("a", start, 3.0)
+        report.results["b"] = _result("b", start, 3.0)
+        report.results["c"] = _result("c", start, 3.0)
+        graph = px.Graph.from_specs([_spec("a"), _spec("b"), _spec("c")])
+
+        profile = ProfileReport.from_report(report, graph)
+
+        assert profile.peak_parallelism == 3
+        assert profile.avg_parallelism == 3.0
+
+
+class TestQueries:
+    """Tests for query methods."""
+
+    def test_task_lookup(self) -> None:
+        """task(name) should return the matching profile."""
+        start = datetime(2024, 1, 1, 0, 0, 0)
+        report = px.RunReport()
+        report.results["a"] = _result("a", start, 1.0)
+        report.results["b"] = _result("b", start, 2.0)
+        graph = px.Graph.from_specs([_spec("a"), _spec("b")])
+
+        profile = ProfileReport.from_report(report, graph)
+
+        assert profile.task("a").name == "a"
+        assert profile.task("b").duration == 2.0
+
+    def test_task_lookup_not_found(self) -> None:
+        """task(name) should raise KeyError when the task does not exist."""
+        report = px.RunReport()
+        graph = px.Graph()
+        profile = ProfileReport.from_report(report, graph)
+        try:
+            profile.task("missing")
+        except KeyError:
+            pass
+        else:
+            raise AssertionError("expected KeyError to be raised")
+
+    def test_top_bottlenecks(self) -> None:
+        """top_bottlenecks should return tasks in descending order of duration."""
+        start = datetime(2024, 1, 1, 0, 0, 0)
+        report = px.RunReport()
+        report.results["a"] = _result("a", start, 1.0)
+        report.results["b"] = _result("b", start, 3.0)
+        report.results["c"] = _result("c", start, 2.0)
+        graph = px.Graph.from_specs([_spec("a"), _spec("b"), _spec("c")])
+
+        profile = ProfileReport.from_report(report, graph)
+
+        top3 = profile.top_bottlenecks(3)
+        assert len(top3) == 3
+        assert top3[0].name == "b"
+        assert top3[1].name == "c"
+        assert top3[2].name == "a"
+
+    def test_top_bottlenecks_zero_or_negative(self) -> None:
+        """n <= 0 should return an empty tuple."""
+        start = datetime(2024, 1, 1, 0, 0, 0)
+        report = px.RunReport()
+        report.results["a"] = _result("a", start, 1.0)
+        graph = px.Graph.from_specs([_spec("a")])
+        profile = ProfileReport.from_report(report, graph)
+
+        assert profile.top_bottlenecks(0) == ()
+        assert profile.top_bottlenecks(-1) == ()
+
+    def test_critical_tasks(self) -> None:
+        """critical_tasks should return the tasks on the critical path, in path order."""
+        start = datetime(2024, 1, 1, 0, 0, 0)
+        report = px.RunReport()
+        report.results["a"] = _result("a", start, 1.0)
+        report.results["b"] = _result("b", start + timedelta(seconds=1), 3.0)
+        report.results["c"] = _result("c", start + timedelta(seconds=1), 1.0)
+        report.results["d"] = _result("d", start + timedelta(seconds=4), 1.0)
+        graph = px.Graph.from_specs([
+            _spec("a"),
+            _spec("b", deps=("a",)),
+            _spec("c", deps=("a",)),
+            _spec("d", deps=("b", "c")),
+        ])
+
+        profile = ProfileReport.from_report(report, graph)
+
+        # critical path a -> b -> d
+        critical = profile.critical_tasks()
+        assert len(critical) == 3
+        assert [t.name for t in critical] == ["a", "b", "d"]
+
+    def test_failed_tasks(self) -> None:
+        """failed_tasks should return tasks in the FAILED state."""
+        start = datetime(2024, 1, 1, 0, 0, 0)
+        report = px.RunReport()
+        report.results["a"] = _result("a", start, 1.0, status=TaskStatus.FAILED)
+        report.results["b"] = _result("b", start, 1.0)
+        graph = px.Graph.from_specs([_spec("a"), _spec("b")])
+
+        profile = ProfileReport.from_report(report, graph)
+
+        failed = profile.failed_tasks()
+        assert len(failed) == 1
+        assert failed[0].name == "a"
+
+    def test_skipped_tasks(self) -> None:
+        """skipped_tasks should return tasks in the SKIPPED state."""
+        start = datetime(2024, 1, 1, 0, 0, 0)
+        report = px.RunReport()
+        report.results["a"] = _result("a", start, 1.0)
+        report.results["b"] = _skipped_result("b")
+        graph = px.Graph.from_specs([_spec("a"), _spec("b")])
+
+        profile = ProfileReport.from_report(report, graph)
+
+        skipped = profile.skipped_tasks()
+        assert len(skipped) == 1
+        assert skipped[0].name == "b"
+
+
+class TestOutputFormats:
+    """Tests for output formats."""
+
+    def test_to_dict_structure(self) -> None:
+        """to_dict should return a dictionary containing all fields."""
+        start = datetime(2024, 1, 1, 0, 0, 0)
+        report = px.RunReport()
+        report.results["a"] = _result("a", start, 1.5)
+        graph = px.Graph.from_specs([_spec("a")])
+
+        profile = ProfileReport.from_report(report, graph)
+        d = profile.to_dict()
+
+        assert "tasks" in d
+        assert "total_duration_seconds" in d
+        assert "critical_path_duration_seconds" in d
+        assert "critical_path" in d
+        assert "avg_parallelism" in d
+        assert "peak_parallelism" in d
+        assert "parallelism_efficiency" in d
+        assert "bottlenecks" in d
+        assert len(d["tasks"]) == 1
+        assert d["tasks"][0]["name"] == "a"
+        assert d["tasks"][0]["status"] == "success"
+        assert d["tasks"][0]["duration_seconds"] == 1.5
+        assert d["tasks"][0]["is_on_critical_path"] is True
+
+    def test_describe_contains_key_sections(self) -> None:
+        """describe should contain the key section headings."""
+        start = datetime(2024, 1, 1, 0, 0, 0)
+        report = px.RunReport()
+        report.results["a"] = _result("a", start, 1.0)
+        graph = px.Graph.from_specs([_spec("a")])
+
+        profile = ProfileReport.from_report(report, graph)
+        text = profile.describe()
+
+        assert "PyFlowX 性能剖面报告" in text
+        assert "【图级指标】" in text
+        assert "【关键路径】" in text
+        assert "【Top" in text
+        assert "【全部任务】" in text
+        assert "a" in text
+
+    def test_describe_empty_report(self) -> None:
+        """describe on an empty report should not crash and should still contain the section headings."""
+        report = px.RunReport()
+        graph = px.Graph()
+        profile = ProfileReport.from_report(report, graph)
+        text = profile.describe()
+
+        assert "【图级指标】" in text
+        assert "(无)" in text
+
+    def test_repr(self) -> None:
+        """__repr__ should include the key metrics."""
+        start = datetime(2024, 1, 1, 0, 0, 0)
+        report = px.RunReport()
+        report.results["a"] = _result("a", start, 1.0)
+        graph = px.Graph.from_specs([_spec("a")])
+
+        profile = ProfileReport.from_report(report, graph)
+        r = repr(profile)
+
+        assert "ProfileReport" in r
+        assert "tasks=1" in r
+        assert "total=1.000s" in r
+
+    def test_task_profile_to_dict(self) -> None:
+        """TaskProfile.to_dict should return the correct fields."""
+        tp = TaskProfile(
+            name="x",
+            status=TaskStatus.SUCCESS,
+            duration=1.5,
+            attempts=2,
+            wait_time=0.3,
+            is_on_critical_path=True,
+            deps=("a", "b"),
+        )
+        d = tp.to_dict()
+
+        assert d["name"] == "x"
+        assert d["status"] == "success"
+        assert d["duration_seconds"] == 1.5
+        assert d["attempts"] == 2
+        assert d["wait_time_seconds"] == 0.3
+        assert d["is_on_critical_path"] is True
+        assert d["deps"] == ["a", "b"]
+
+
+class TestIntegrationWithRun:
+    """Integration tests against a real run()."""
+
+    def test_profile_from_real_run(self) -> None:
+        """Build a profile from the result of a real run()."""
+        import time
+
+        def slow() -> int:
+            time.sleep(0.01)  # Ensure the task takes measurable time; a near-zero duration would make the computed parallelism 0
+            return 1
+
+        graph = px.Graph.from_specs([
+            px.TaskSpec("a", slow),
+            px.TaskSpec("b", slow, depends_on=("a",)),
+            px.TaskSpec("c", slow, depends_on=("a",)),
+        ])
+        report = px.run(graph, strategy="sequential")
+
+        profile = ProfileReport.from_report(report, graph)
+
+        assert len(profile.tasks) == 3
+        # Under the sequential strategy tasks run serially, so duration > 0
+        assert profile.critical_path_duration > 0
+        # Under the sequential strategy peak parallelism should be 1
+        assert profile.peak_parallelism == 1
+
+    def test_profile_from_thread_run(self) -> None:
+        """Build a profile from a thread-strategy run() and verify parallelism > 1."""
+        import time
+
+        def slow() -> int:
+            time.sleep(0.05)
+            return 1
+
+        graph = px.Graph.from_specs([
+            px.TaskSpec("a", slow),
+            px.TaskSpec("b", slow),
+            px.TaskSpec("c", slow),
+        ])
+        report = px.run(graph, strategy="thread", max_workers=3)
+
+        profile = ProfileReport.from_report(report, graph)
+
+        # Three tasks run in parallel; the peak should be >= 2 (scheduling timing may keep it below 3)
+        assert profile.peak_parallelism >= 2
+        assert profile.critical_path_duration > 0
@@ -126,3 +126,50 @@ class TestRunReportDescribe:
        report.results["a"] = TaskResult[Any](spec=spec, status=TaskStatus.PENDING)
        desc = report.describe()
        assert "-" in desc  # duration is shown as "-"
+
+
+class TestRunReportQueries:
+    """Tests for the new RunReport query API."""
+
+    def test_succeeded_tasks(self) -> None:
+        """succeeded_tasks returns the names of tasks with SUCCESS status."""
+        report = px.RunReport()
+        report.results["a"] = _make_result("a", status=TaskStatus.SUCCESS)
+        report.results["b"] = _make_result("b", status=TaskStatus.FAILED)
+        report.results["c"] = _make_result("c", status=TaskStatus.SUCCESS)
+        assert report.succeeded_tasks() == ["a", "c"]
+
+    def test_skipped_tasks(self) -> None:
+        """skipped_tasks returns the names of tasks with SKIPPED status."""
+        report = px.RunReport()
+        report.results["a"] = _make_result("a", status=TaskStatus.SKIPPED)
+        report.results["b"] = _make_result("b", status=TaskStatus.SUCCESS)
+        assert report.skipped_tasks() == ["a"]
+
+    def test_tasks_by_status(self) -> None:
+        """tasks_by_status filters by the given status."""
+        report = px.RunReport()
+        report.results["a"] = _make_result("a", status=TaskStatus.FAILED)
+        report.results["b"] = _make_result("b", status=TaskStatus.FAILED)
+        report.results["c"] = _make_result("c", status=TaskStatus.SUCCESS)
+        assert report.tasks_by_status(TaskStatus.FAILED) == ["a", "b"]
+        assert report.tasks_by_status(TaskStatus.SUCCESS) == ["c"]
+        assert report.tasks_by_status(TaskStatus.SKIPPED) == []
+
+    def test_durations(self) -> None:
+        """durations returns a mapping of task name -> duration."""
+        report = px.RunReport()
+        report.results["a"] = _make_result("a", duration=1.5)
+        report.results["b"] = _make_result("b", duration=2.0)
+        durs = report.durations()
+        assert durs["a"] == 1.5
+        assert durs["b"] == 2.0
+
+    def test_durations_no_duration(self) -> None:
+        """A task with no duration should report 0.0."""
+        report = px.RunReport()
+        spec: TaskSpec[Any] = TaskSpec[Any]("a", _fn)  # type: ignore[arg-type]
+        report.results["a"] = TaskResult[Any](spec=spec, status=TaskStatus.PENDING)
+        durs = report.durations()
+        assert durs["a"] == 0.0
+
@@ -53,18 +53,18 @@ class TestCliRunnerConstruction:

    def test_requires_at_least_one_command(self) -> None:
        """Should raise ValueError when no command is given."""
-        with pytest.raises(ValueError, match="至少需要一个命令"):
+        with pytest.raises(ValueError, match="至少需要一个别名"):
            _ = px.CliRunner()

    def test_accepts_single_graph(self) -> None:
        """A single command should construct normally."""
-        runner = px.CliRunner(graphs={"clean": _echo_graph()})
+        runner = px.CliRunner(aliases={"clean": _echo_graph()})
        assert runner.commands == ["clean"]

    def test_accepts_multiple_graphs(self) -> None:
        """Multiple commands should be kept in insertion order."""
        runner = px.CliRunner(
-            graphs={
+            aliases={
                "clean": _echo_graph("c", "clean"),
                "build": _echo_graph("b", "build"),
                "test": _echo_graph("t", "test"),
@@ -72,39 +72,39 @@ class TestCliRunnerConstruction:
        )
        assert runner.commands == ["clean", "build", "test"]

-    def test_default_strategy_is_sequential(self) -> None:
-        """The default strategy should be Strategy.SEQUENTIAL."""
-        runner = px.CliRunner({"clean": _echo_graph()})
-        assert runner.strategy == "sequential"
+    def test_default_strategy_is_dependency(self) -> None:
+        """The default strategy should be dependency (dependency-driven, maximum parallelism)."""
+        runner = px.CliRunner(aliases={"clean": _echo_graph()})
+        assert runner.strategy == "dependency"

    def test_custom_strategy_string(self) -> None:
        """Should support specifying the strategy as a string."""
-        runner = px.CliRunner({"clean": _echo_graph()}, strategy="thread")
+        runner = px.CliRunner(aliases={"clean": _echo_graph()}, strategy="thread")
        assert runner.strategy == "thread"

    def test_custom_strategy_enum(self) -> None:
        """Should support specifying the strategy via the Strategy enum."""
-        runner = px.CliRunner({"clean": _echo_graph()}, strategy="async")
+        runner = px.CliRunner(aliases={"clean": _echo_graph()}, strategy="async")
        assert runner.strategy == "async"

    def test_default_verbose_is_true(self) -> None:
        """verbose should default to True."""
-        runner = px.CliRunner({"clean": _echo_graph()})
+        runner = px.CliRunner(aliases={"clean": _echo_graph()})
        assert runner.verbose is True

    def test_custom_verbose_false(self) -> None:
        """Should support turning verbose off."""
-        runner = px.CliRunner({"clean": _echo_graph()}, verbose=False)
+        runner = px.CliRunner(aliases={"clean": _echo_graph()}, verbose=False)
        assert runner.verbose is False

    def test_default_description_is_empty(self) -> None:
        """The default description should be an empty string."""
-        runner = px.CliRunner({"clean": _echo_graph()})
+        runner = px.CliRunner(aliases={"clean": _echo_graph()})
        assert runner.description == ""

    def test_custom_description(self) -> None:
        """Should support a custom description."""
-        runner = px.CliRunner({"clean": _echo_graph()}, description="My CLI")
+        runner = px.CliRunner(aliases={"clean": _echo_graph()}, description="My CLI")
        assert runner.description == "My CLI"


@@ -116,13 +116,13 @@ class TestCliRunnerProperties:

    def test_commands_returns_list(self) -> None:
        """commands should return a list."""
-        runner = px.CliRunner({"a": _echo_graph(), "b": _echo_graph()})
+        runner = px.CliRunner(aliases={"a": _echo_graph(), "b": _echo_graph()})
        assert isinstance(runner.commands, list)

    def test_graphs_contains_original_graphs(self) -> None:
        """graphs should contain the original Graph instances."""
        g = _echo_graph()
-        runner = px.CliRunner({"cmd": g})
+        runner = px.CliRunner(aliases={"cmd": g})
        assert runner.graphs["cmd"] is g


@@ -136,69 +136,69 @@ class TestCliRunnerParser:
        """create_parser should return an ArgumentParser."""
        from argparse import ArgumentParser

-        runner = px.CliRunner({"clean": _echo_graph()})
+        runner = px.CliRunner(aliases={"clean": _echo_graph()})
        parser = runner.create_parser()
        assert isinstance(parser, ArgumentParser)

    def test_parser_has_command_argument(self) -> None:
        """The parser should have a positional command argument."""
-        runner = px.CliRunner({"clean": _echo_graph()})
+        runner = px.CliRunner(aliases={"clean": _echo_graph()})
        parser = runner.create_parser()
        parsed = parser.parse_args(["clean"])
        assert parsed.command == "clean"

    def test_parser_command_is_optional(self) -> None:
        """command should be an optional argument."""
-        runner = px.CliRunner({"clean": _echo_graph()})
+        runner = px.CliRunner(aliases={"clean": _echo_graph()})
        parser = runner.create_parser()
        parsed = parser.parse_args([])
        assert parsed.command is None

    def test_parser_has_strategy_option(self) -> None:
        """The parser should have a --strategy option."""
-        runner = px.CliRunner({"clean": _echo_graph()})
+        runner = px.CliRunner(aliases={"clean": _echo_graph()})
        parser = runner.create_parser()
        parsed = parser.parse_args(["clean", "--strategy", "thread"])
        assert parsed.strategy == "thread"

    def test_parser_strategy_default(self) -> None:
        """The --strategy default should match the value given at construction."""
-        runner = px.CliRunner({"clean": _echo_graph()}, strategy="async")
+        runner = px.CliRunner(aliases={"clean": _echo_graph()}, strategy="async")
        parser = runner.create_parser()
        parsed = parser.parse_args(["clean"])
        assert parsed.strategy == "async"

    def test_parser_has_dry_run_flag(self) -> None:
        """The parser should have a --dry-run flag."""
-        runner = px.CliRunner({"clean": _echo_graph()})
+        runner = px.CliRunner(aliases={"clean": _echo_graph()})
        parser = runner.create_parser()
        parsed = parser.parse_args(["clean", "--dry-run"])
        assert parsed.dry_run is True

    def test_parser_dry_run_default_false(self) -> None:
        """--dry-run should default to False."""
-        runner = px.CliRunner({"clean": _echo_graph()})
+        runner = px.CliRunner(aliases={"clean": _echo_graph()})
        parser = runner.create_parser()
        parsed = parser.parse_args(["clean"])
        assert parsed.dry_run is False

    def test_parser_has_list_flag(self) -> None:
        """The parser should have a --list flag."""
-        runner = px.CliRunner({"clean": _echo_graph()})
+        runner = px.CliRunner(aliases={"clean": _echo_graph()})
        parser = runner.create_parser()
        parsed = parser.parse_args(["--list"])
        assert parsed.list is True

    def test_parser_has_quiet_flag(self) -> None:
        """The parser should have a --quiet flag."""
-        runner = px.CliRunner({"clean": _echo_graph()})
+        runner = px.CliRunner(aliases={"clean": _echo_graph()})
        parser = runner.create_parser()
        parsed = parser.parse_args(["clean", "--quiet"])
        assert parsed.quiet is True

    def test_parser_quiet_default_false(self) -> None:
        """--quiet should default to False."""
-        runner = px.CliRunner({"clean": _echo_graph()})
+        runner = px.CliRunner(aliases={"clean": _echo_graph()})
        parser = runner.create_parser()
        parsed = parser.parse_args(["clean"])
        assert parsed.quiet is False
@@ -222,7 +222,7 @@ class TestCliRunnerRunSuccess:

    def test_run_valid_command_returns_zero(self) -> None:
        """A valid command that succeeds should return 0."""
-        runner = px.CliRunner({"clean": _echo_graph()})
+        runner = px.CliRunner(aliases={"clean": _echo_graph()})
        exit_code = runner.run(["clean"])
        assert exit_code == CliExitCode.SUCCESS.value

@@ -236,28 +236,30 @@ class TestCliRunnerRunSuccess:
        def track_b() -> None:
            executed.append("b")

-        runner = px.CliRunner({
-            "a": px.Graph.from_specs([px.TaskSpec("a", track_a)]),
-            "b": px.Graph.from_specs([px.TaskSpec("b", track_b)]),
-        })
+        runner = px.CliRunner(
+            aliases={
+                "a": px.Graph.from_specs([px.TaskSpec("a", track_a)]),
+                "b": px.Graph.from_specs([px.TaskSpec("b", track_b)]),
+            }
+        )
        _ = runner.run(["b"])
        assert executed == ["b"]

    def test_run_multi_task_graph(self) -> None:
        """Should be able to run a multi-task graph with dependencies."""
-        runner = px.CliRunner({"multi": _multi_task_graph()})
+        runner = px.CliRunner(aliases={"multi": _multi_task_graph()})
        exit_code = runner.run(["multi"])
        assert exit_code == CliExitCode.SUCCESS.value

    def test_run_with_strategy_override(self) -> None:
        """Should support overriding the default strategy via --strategy."""
-        runner = px.CliRunner({"echo": _echo_graph()})
+        runner = px.CliRunner(aliases={"echo": _echo_graph()})
        exit_code = runner.run(["echo", "--strategy", "thread"])
        assert exit_code == CliExitCode.SUCCESS.value

    def test_run_with_dry_run(self, capsys: pytest.CaptureFixture[str]) -> None:
        """--dry-run should only print the plan without executing it."""
-        runner = px.CliRunner({"echo": _echo_graph()})
+        runner = px.CliRunner(aliases={"echo": _echo_graph()})
        exit_code = runner.run(["echo", "--dry-run"])
        assert exit_code == CliExitCode.SUCCESS.value
        captured = capsys.readouterr()
@@ -272,7 +274,7 @@ class TestCliRunnerVerbose:

    def test_verbose_default_prints_lifecycle(self, capsys: pytest.CaptureFixture[str]) -> None:
        """The default verbose=True should print the task lifecycle."""
-        runner = px.CliRunner({"echo": _echo_graph()})
+        runner = px.CliRunner(aliases={"echo": _echo_graph()})
        _ = runner.run(["echo"])
        captured = capsys.readouterr()
        # verbose mode should print the task lifecycle
@@ -280,7 +282,7 @@ class TestCliRunnerVerbose:

    def test_quiet_flag_disables_verbose(self, capsys: pytest.CaptureFixture[str]) -> None:
        """--quiet should turn off verbose output."""
-        runner = px.CliRunner({"echo": _echo_graph()})
+        runner = px.CliRunner(aliases={"echo": _echo_graph()})
        _ = runner.run(["echo", "--quiet"])
        captured = capsys.readouterr()
        # quiet mode should produce no output prefixed with [verbose]
@@ -288,14 +290,14 @@ class TestCliRunnerVerbose:

    def test_verbose_false_constructor_disables_verbose(self, capsys: pytest.CaptureFixture[str]) -> None:
        """verbose=False at construction should turn off verbose output."""
-        runner = px.CliRunner({"echo": _echo_graph()}, verbose=False)
+        runner = px.CliRunner(aliases={"echo": _echo_graph()}, verbose=False)
        _ = runner.run(["echo"])
        captured = capsys.readouterr()
        assert "[verbose]" not in captured.out

    def test_verbose_prints_command_for_cmd_task(self, capsys: pytest.CaptureFixture[str]) -> None:
        """In verbose mode a cmd task should print the command it runs."""
-        runner = px.CliRunner({"echo": _echo_graph(msg="verbose-test")})
+        runner = px.CliRunner(aliases={"echo": _echo_graph(msg="verbose-test")})
        _ = runner.run(["echo"])
        captured = capsys.readouterr()
        # Should print the executed command
@@ -305,7 +307,7 @@ class TestCliRunnerVerbose:

    def test_verbose_prints_success_lifecycle(self, capsys: pytest.CaptureFixture[str]) -> None:
        """In verbose mode a successful task should print a success message."""
-        runner = px.CliRunner({"echo": _echo_graph()})
+        runner = px.CliRunner(aliases={"echo": _echo_graph()})
        _ = runner.run(["echo"])
        captured = capsys.readouterr()
        assert "成功" in captured.out
@@ -319,14 +321,14 @@ class TestCliRunnerVerbose:
                conditions=(lambda _ctx: False,),
            ),
        ])
-        runner = px.CliRunner({"skip": graph})
+        runner = px.CliRunner(aliases={"skip": graph})
        _ = runner.run(["skip"])
        captured = capsys.readouterr()
        assert "跳过" in captured.out

    def test_verbose_prints_failure_lifecycle(self, capsys: pytest.CaptureFixture[str]) -> None:
        """In verbose mode a failed task should print a failure message."""
-        runner = px.CliRunner({"fail": _failing_graph()})
+        runner = px.CliRunner(aliases={"fail": _failing_graph()})
        _ = runner.run(["fail"])
        captured = capsys.readouterr()
        # The failure message may appear on stdout (verbose) or stderr (PyFlowXError)
@@ -342,7 +344,7 @@ class TestCliRunnerRunFailure:

    def test_run_unknown_command_returns_failure(self, capsys: pytest.CaptureFixture[str]) -> None:
        """An unknown command should return 1 and print an error."""
-        runner = px.CliRunner({"clean": _echo_graph()})
+        runner = px.CliRunner(aliases={"clean": _echo_graph()})
        exit_code = runner.run(["unknown"])
        assert exit_code == CliExitCode.FAILURE.value
        captured = capsys.readouterr()
@@ -351,7 +353,7 @@ class TestCliRunnerRunFailure:

    def test_run_no_command_returns_failure(self, capsys: pytest.CaptureFixture[str]) -> None:
        """With no command, should return 1 and print help."""
-        runner = px.CliRunner({"clean": _echo_graph()})
+        runner = px.CliRunner(aliases={"clean": _echo_graph()})
        exit_code = runner.run([])
        assert exit_code == CliExitCode.FAILURE.value
        captured = capsys.readouterr()
@@ -359,13 +361,13 @@ class TestCliRunnerRunFailure:

    def test_run_failing_task_returns_failure(self) -> None:
        """Should return 1 when a task fails."""
-        runner = px.CliRunner({"fail": _failing_graph()})
+        runner = px.CliRunner(aliases={"fail": _failing_graph()})
        exit_code = runner.run(["fail"])
        assert exit_code == CliExitCode.FAILURE.value

    def test_run_failing_task_prints_error(self, capsys: pytest.CaptureFixture[str]) -> None:
        """Should print an error message when a task fails."""
-        runner = px.CliRunner({"fail": _failing_graph()})
+        runner = px.CliRunner(aliases={"fail": _failing_graph()})
        _ = runner.run(["fail"])
        captured = capsys.readouterr()
        # The PyFlowXError message should be written to stderr
@@ -380,17 +382,19 @@ class TestCliRunnerList:

    def test_list_returns_success(self) -> None:
        """--list should return 0."""
-        runner = px.CliRunner({"clean": _echo_graph(), "build": _echo_graph()})
+        runner = px.CliRunner(aliases={"clean": _echo_graph(), "build": _echo_graph()})
        exit_code = runner.run(["--list"])
        assert exit_code == CliExitCode.SUCCESS.value

    def test_list_prints_all_commands(self, capsys: pytest.CaptureFixture[str]) -> None:
        """--list should print all commands."""
-        runner = px.CliRunner({
-            "clean": _echo_graph("c", "clean"),
-            "build": _echo_graph("b", "build"),
-            "test": _echo_graph("t", "test"),
-        })
+        runner = px.CliRunner(
+            aliases={
+                "clean": _echo_graph("c", "clean"),
+                "build": _echo_graph("b", "build"),
+                "test": _echo_graph("t", "test"),
+            }
+        )
        _ = runner.run(["--list"])
        captured = capsys.readouterr()
        assert "clean" in captured.out
@@ -404,7 +408,7 @@ class TestCliRunnerList:
        def track() -> None:
            executed.append("ran")

-        runner = px.CliRunner({"a": px.Graph.from_specs([px.TaskSpec("a", track)])})
+        runner = px.CliRunner(aliases={"a": px.Graph.from_specs([px.TaskSpec("a", track)])})
        _ = runner.run(["--list"])
        assert executed == []

@@ -417,7 +421,7 @@ class TestCliRunnerErrorHandling:

    def test_keyboard_interrupt_returns_130(self, capsys: pytest.CaptureFixture[str]) -> None:
        """KeyboardInterrupt should return 130."""
-        runner = px.CliRunner({"echo": _echo_graph()})
+        runner = px.CliRunner(aliases={"echo": _echo_graph()})

        def raise_interrupt(*_args: Any, **_kwargs: Any) -> None:
            raise KeyboardInterrupt
@@ -430,7 +434,7 @@ class TestCliRunnerErrorHandling:

    def test_pyflowx_error_returns_failure(self, capsys: pytest.CaptureFixture[str]) -> None:
        """PyFlowXError should return 1."""
-        runner = px.CliRunner({"echo": _echo_graph()})
+        runner = px.CliRunner(aliases={"echo": _echo_graph()})

        def raise_error(*_args: Any, **_kwargs: Any) -> None:
            raise TaskFailedError("echo", RuntimeError("boom"), 1)
@@ -447,7 +451,7 @@ class TestCliRunnerErrorHandling:
        class CustomError(Exception):
            pass

-        runner = px.CliRunner({"echo": _echo_graph()})
+        runner = px.CliRunner(aliases={"echo": _echo_graph()})

        def raise_custom(*_args: Any, **_kwargs: Any) -> None:
            raise CustomError("unexpected")
@@ -464,14 +468,14 @@ class TestCliRunnerRunCli:

    def test_run_cli_calls_sys_exit(self) -> None:
        """run_cli should call sys.exit."""
-        runner = px.CliRunner({"echo": _echo_graph()})
+        runner = px.CliRunner(aliases={"echo": _echo_graph()})
        with pytest.raises(SystemExit) as exc_info:
            runner.run_cli(["echo"])
        assert exc_info.value.code == CliExitCode.SUCCESS.value

    def test_run_cli_exit_code_on_failure(self) -> None:
        """run_cli should exit with a non-zero code on failure."""
-        runner = px.CliRunner({"fail": _failing_graph()})
+        runner = px.CliRunner(aliases={"fail": _failing_graph()})
        with pytest.raises(SystemExit) as exc_info:
            runner.run_cli(["fail"])
        assert exc_info.value.code == CliExitCode.FAILURE.value
@@ -479,7 +483,7 @@ class TestCliRunnerRunCli:
    def test_run_cli_no_args_uses_sys_argv(self, monkeypatch: pytest.MonkeyPatch) -> None:
        """run_cli should use sys.argv when called without arguments."""
        monkeypatch.setattr(sys, "argv", ["pymake", "echo"])
-        runner = px.CliRunner({"echo": _echo_graph()})
+        runner = px.CliRunner(aliases={"echo": _echo_graph()})
        with pytest.raises(SystemExit) as exc_info:
            runner.run_cli()
        assert exc_info.value.code == CliExitCode.SUCCESS.value
@@ -520,7 +524,7 @@ class TestCliRunnerIntegration:
                conditions=(lambda _ctx: False,),
            ),
        ])
-        runner = px.CliRunner({"skip": graph})
+        runner = px.CliRunner(aliases={"skip": graph})
        exit_code = runner.run(["skip"])
        assert exit_code == CliExitCode.SUCCESS.value

@@ -533,7 +537,7 @@ class TestCliRunnerIntegration:
                conditions=(lambda _ctx: True,),
            ),
        ])
-        runner = px.CliRunner({"run": graph})
+        runner = px.CliRunner(aliases={"run": graph})
        exit_code = runner.run(["run"])
        assert exit_code == CliExitCode.SUCCESS.value

@@ -554,17 +558,19 @@ class TestCliRunnerIntegration:
            px.TaskSpec("c", make("c"), depends_on=("a",)),
            px.TaskSpec("d", make("d"), depends_on=("b", "c")),
        ])
-        runner = px.CliRunner({"diamond": graph})
+        runner = px.CliRunner(aliases={"diamond": graph})
        exit_code = runner.run(["diamond"])
        assert exit_code == CliExitCode.SUCCESS.value
        assert order == ["a", "b", "c", "d"]

    def test_mixed_fn_and_cmd_commands(self) -> None:
        """Commands mixing fn and cmd tasks should all be runnable."""
-        runner = px.CliRunner({
-            "fn_cmd": px.Graph.from_specs([px.TaskSpec("fn", fn=lambda: "fn-result")]),
-            "cmd_cmd": px.Graph.from_specs([px.TaskSpec("cmd", cmd=[*ECHO_CMD, "cmd-result"])]),
-        })
+        runner = px.CliRunner(
+            aliases={
+                "fn_cmd": px.Graph.from_specs([px.TaskSpec("fn", fn=lambda: "fn-result")]),
+                "cmd_cmd": px.Graph.from_specs([px.TaskSpec("cmd", cmd=[*ECHO_CMD, "cmd-result"])]),
+            }
+        )
        assert runner.run(["fn_cmd"]) == CliExitCode.SUCCESS.value
        assert runner.run(["cmd_cmd"]) == CliExitCode.SUCCESS.value

@@ -580,7 +586,7 @@ class TestCliRunnerIntegration:
                ls_cmd = ["ls"]

            graph = px.Graph.from_specs([px.TaskSpec("ls", cmd=ls_cmd, cwd=Path(tmpdir))])
-            runner = px.CliRunner({"ls": graph})
+            runner = px.CliRunner(aliases={"ls": graph})
            exit_code = runner.run(["ls"])
            assert exit_code == CliExitCode.SUCCESS.value

@@ -612,3 +618,109 @@ class TestApplyVerboseToGraph:
        new_graph = _apply_verbose_to_graph(graph, verbose=True)
        new_spec = new_graph.spec("a")
        assert new_spec.verbose is True
+
+
+# ---------------------------------------------------------------------- #
+# New API: tasks + aliases
+# ---------------------------------------------------------------------- #
+class TestCliRunnerNewApi:
+    """Tests for the new tasks + aliases API of CliRunner."""
+
+    def test_tasks_plus_aliases_single_str(self) -> None:
+        """Register through tasks, with an aliases str referencing a single task."""
+        runner = px.CliRunner(
+            tasks=[px.cmd([*ECHO_CMD, "a"], name="task_a")],
+            aliases={"a": "task_a"},
+        )
+        assert runner.commands == ["a"]
+        assert runner.run(["a"]) == CliExitCode.SUCCESS.value
+
+    def test_aliases_list_str_builds_chain(self) -> None:
+        """An aliases list[str] should build chained dependencies (each task depends on the previous one)."""
+        runner = px.CliRunner(
+            tasks=[
+                px.cmd([*ECHO_CMD, "a"], name="task_a"),
+                px.cmd([*ECHO_CMD, "b"], name="task_b"),
+            ],
+            aliases={"ab": ["task_a", "task_b"]},
+        )
+        graph = runner.graphs["ab"]
+        specs = graph.all_specs()
+        assert specs["task_b"].depends_on == ("task_a",)
+
+    def test_aliases_taskspec_value(self) -> None:
+        """When an aliases value is a TaskSpec, a single-task graph is built directly."""
+        spec = px.cmd([*ECHO_CMD, "x"], name="inline_x")
+        runner = px.CliRunner(aliases={"x": spec})
+        assert runner.run(["x"]) == CliExitCode.SUCCESS.value
+
+    def test_aliases_graph_value(self) -> None:
+        """When an aliases value is a Graph, it is used as-is (for complex cases such as conditions)."""
+        graph = px.Graph.from_specs([
+            px.TaskSpec("a", cmd=[*ECHO_CMD, "a"]),
+            px.TaskSpec("b", cmd=[*ECHO_CMD, "b"], depends_on=("a",)),
+        ])
+        runner = px.CliRunner(aliases={"g": graph})
+        assert set(runner.graphs["g"].all_specs().keys()) == {"a", "b"}
+
+    def test_alias_name_same_as_task_name_via_taskspec(self) -> None:
+        """When the alias name equals the task name, use a TaskSpec to avoid a self-referential cycle."""
+        spec = px.cmd([*ECHO_CMD, "same"], name="same")
+        runner = px.CliRunner(aliases={"same": spec})
+        assert runner.run(["same"]) == CliExitCode.SUCCESS.value
+
+    def test_alias_str_reference_to_other_alias(self) -> None:
+        """An alias value may be a str that references another alias."""
+        runner = px.CliRunner(
+            aliases={
+                "base": px.cmd([*ECHO_CMD, "base"], name="base"),
+                "wrapper": "base",
+            },
+        )
+        assert runner.run(["wrapper"]) == CliExitCode.SUCCESS.value
+
+    def test_empty_aliases_raises(self) -> None:
+        """Empty aliases should raise ValueError."""
+        with pytest.raises(ValueError, match="至少需要一个别名"):
+            _ = px.CliRunner()
+
+    def test_empty_list_value_raises(self) -> None:
+        """An empty list as an alias value should raise ValueError."""
+        with pytest.raises(ValueError, match="任务列表为空"):
+            _ = px.CliRunner(aliases={"x": []})
+
+    def test_invalid_value_type_raises(self) -> None:
+        """An invalid type (int) as an alias value should raise TypeError."""
+        with pytest.raises(TypeError, match="值类型无效"):
+            _ = px.CliRunner(aliases={"x": 123})  # type: ignore[dict-item]
+
+    def test_invalid_list_element_type_raises(self) -> None:
+        """A list element that is not a str or TaskSpec should raise TypeError."""
+        with pytest.raises(TypeError, match="列表元素类型无效"):
+            _ = px.CliRunner(aliases={"x": [123]})  # type: ignore[list-item]
+
+    def test_duplicate_task_name_raises(self) -> None:
+        """Duplicate task names in tasks should raise ValueError."""
+        spec = px.cmd([*ECHO_CMD, "a"], name="dup")
+        with pytest.raises(ValueError, match="任务名重复"):
+            _ = px.CliRunner(tasks=[spec, spec], aliases={"a": "dup"})
+
+    def test_commands_excludes_unreferenced_tasks(self) -> None:
+        """commands contains only aliases, not tasks that no alias references."""
+        runner = px.CliRunner(
+            tasks=[
+                px.cmd([*ECHO_CMD, "a"], name="used"),
+                px.cmd([*ECHO_CMD, "b"], name="unused"),
+            ],
+            aliases={"a": "used"},
+        )
+        assert runner.commands == ["a"]
+
+    def test_unknown_command_rejected(self) -> None:
+        """An unregistered alias name should be rejected (bare task names are not accepted)."""
+        runner = px.CliRunner(
+            tasks=[px.cmd([*ECHO_CMD, "a"], name="task_a")],
+            aliases={"a": "task_a"},
+        )
+        # task_a is a task name, not an alias, so it should be rejected
+        assert runner.run(["task_a"]) == CliExitCode.FAILURE.value
@@ -70,9 +70,9 @@ def test_memory_backend_ttl_load_filters_expired() -> None:


 def test_memory_backend_expired_key_not_in_store() -> None:
-    """_expired returns False for a key that does not exist."""
+    """has returns False for a key that does not exist."""
    b = MemoryBackend(ttl=1.0)
-    assert b._expired("nonexistent") is False
+    assert b.has("nonexistent") is False


 def test_memory_backend_no_ttl_never_expired() -> None:
@@ -244,35 +244,35 @@ def test_json_backend_ttl_load_filters_expired() -> None:


 def test_json_backend_expired_no_ttl() -> None:
-    """_expired returns False when no TTL is set."""
+    """Entries never expire when no TTL is set."""
    with tempfile.TemporaryDirectory() as tmp:
        path = str(Path(tmp) / "state.json")
        b = JSONBackend(path)
        b.save("a", 1)
        # Manually set ts to a long time ago
        b._store["a"]["ts"] = time.time() - 1000
-        assert b._expired(b._store["a"]) is False  # No TTL, never expires
+        assert b.has("a") is True  # No TTL, never expires


 def test_json_backend_expired_with_ttl() -> None:
-    """With a TTL, _expired checks whether the entry has expired."""
+    """With a TTL, has returns False for an expired key."""
    with tempfile.TemporaryDirectory() as tmp:
        path = str(Path(tmp) / "state.json")
        b = JSONBackend(path, ttl=1.0)
        b.save("a", 1)
        # Manually set ts to a long time ago
        b._store["a"]["ts"] = time.time() - 10  # 10 seconds ago, past the TTL
-        assert b._expired(b._store["a"]) is True
+        assert b.has("a") is False


 def test_json_backend_expired_missing_ts() -> None:
-    """When an entry lacks ts, the default value 0 is used."""
+    """An entry that lacks ts is treated as expired."""
    with tempfile.TemporaryDirectory() as tmp:
        path = str(Path(tmp) / "state.json")
        b = JSONBackend(path, ttl=1.0)
        b._store["a"] = {"value": 1}  # 缺少 ts
        # ts defaults to 0, which is far in the past
-        assert b._expired(b._store["a"]) is True
+        assert b.has("a") is False


 def test_json_backend_save_value_error(monkeypatch: pytest.MonkeyPatch) -> None:
@@ -0,0 +1,63 @@
+"""Tests for streaming result passing (iterators between tasks)."""
+
+from __future__ import annotations
+
+from typing import Iterator
+
+import pyflowx as px
+
+
+def test_generator_passed_as_iterator() -> None:
+    """When upstream returns a generator, downstream should be able to consume it lazily."""
+
+    @px.task
+    def source() -> Iterator[int]:
+        yield from range(5)
+
+    @px.task(depends_on=("source",))
+    def consume(source: Iterator[int]) -> int:
+        return sum(source)
+
+    graph = px.Graph.from_specs([source, consume])
+    report = px.run(graph)
+    assert report.success
+    assert report["consume"] == 10
+
+
+def test_large_range_streaming() -> None:
+    """Stream a large-range iterator between tasks, avoiding an intermediate list."""
+
+    @px.task
+    def numbers() -> Iterator[int]:
+        yield from range(1000)
+
+    @px.task(depends_on=("numbers",))
+    def total(numbers: Iterator[int]) -> int:
+        return sum(numbers)
+
+    graph = px.Graph.from_specs([numbers, total])
+    report = px.run(graph)
+    assert report.success
+    assert report["total"] == sum(range(1000))
+
+
+def test_chain_multiple_streams() -> None:
+    """Chain multiple streaming tasks together."""
+
+    @px.task
+    def gen() -> Iterator[int]:
+        yield from range(10)
+
+    @px.task(depends_on=("gen",))
+    def doubled(gen: Iterator[int]) -> Iterator[int]:
+        for x in gen:
+            yield x * 2
+
+    @px.task(depends_on=("doubled",))
+    def collect(doubled: Iterator[int]) -> list[int]:
+        return list(doubled)
+
+    graph = px.Graph.from_specs([gen, doubled, collect])
+    report = px.run(graph)
+    assert report.success
+    assert report["collect"] == [x * 2 for x in range(10)]
@@ -2,11 +2,12 @@

 import os
 import subprocess
+from pathlib import Path

 import pytest

 from pyflowx.conditions import Constants
-from pyflowx.tasks.system import clr, reset_icon_cache, setenv, which
+from pyflowx.tasks.system import clr, reset_icon_cache, setenv, setenv_group, which, write_file


 def test_clr_creates_task_spec() -> None:
@@ -189,3 +190,57 @@ def test_which_not_found(monkeypatch: pytest.MonkeyPatch, capsys: pytest.Capture
    spec.fn()
    captured = capsys.readouterr()
    assert "nonexistent_cmd -> 未找到" in captured.out
+
+
+def test_write_file_creates_task_spec() -> None:
+    """write_file() should create a TaskSpec with verbose enabled."""
+    spec = write_file("/tmp/unused", "x")
+    assert spec.name == "write_file_/tmp/unused"
+    assert spec.verbose is True
+
+
+def test_write_file_writes_content(tmp_path: Path) -> None:
+    """write_file() should write the content to the given file."""
+    f = tmp_path / "out.txt"
+    spec = write_file(str(f), "hello world")
+    assert spec.fn is not None
+    spec.fn()
+    assert f.read_text(encoding="utf-8") == "hello world"
+
+
+def test_write_file_with_encoding(tmp_path: Path) -> None:
+    """write_file() should support an explicit encoding."""
+    f = tmp_path / "out.txt"
+    spec = write_file(str(f), "中文", encoding="utf-8")
+    assert spec.fn is not None
+    spec.fn()
+    assert f.read_text(encoding="utf-8") == "中文"
+
+
+def test_write_file_failure_propagates(tmp_path: Path) -> None:
+    """write_file() should raise on write failure (not swallow the exception)."""
+    # Writing when the parent directory is missing should raise FileNotFoundError
+    missing = tmp_path / "no_such_dir" / "out.txt"
+    spec = write_file(str(missing), "x")
+    assert spec.fn is not None
+    with pytest.raises(FileNotFoundError):
+        spec.fn()
+
+
+def test_setenv_group_creates_specs() -> None:
+    """setenv_group() should create one TaskSpec per environment variable."""
+    envs = {"VAR_A": "1", "VAR_B": "2"}
+    specs = setenv_group(envs)
+    assert len(specs) == 2
+    assert specs[0].name == "setenv_var_a"
+    assert specs[1].name == "setenv_var_b"
+
+
+def test_setenv_group_default_mode(monkeypatch: pytest.MonkeyPatch) -> None:
+    """setenv_group(default=True) should not overwrite an existing environment variable."""
+    monkeypatch.setenv("PYFLOWX_GROUP_EXISTS", "original")
+    specs = setenv_group({"PYFLOWX_GROUP_EXISTS": "new"}, default=True)
+    for spec in specs:
+        assert spec.fn is not None
+        spec.fn()
+    assert os.environ["PYFLOWX_GROUP_EXISTS"] == "original"
@@ -14,6 +14,7 @@ from pyflowx.task import (
    TaskSpec,
    TaskStatus,
    _env_and_cwd,
+    cmd,
    task_template,
 )

@@ -78,6 +79,41 @@ def test_retry_policy_negative_jitter_rejected() -> None:
        RetryPolicy(jitter=-1)


+# ---------------------------------------------------------------------- #
+# cmd() factory
+# ---------------------------------------------------------------------- #
+def test_cmd_factory_default_name_from_two_elements() -> None:
+    """cmd() defaults name to '_'.join(command[:2])."""
+    spec = cmd(["uv", "build"])
+    assert spec.name == "uv_build"
+    assert spec.cmd == ["uv", "build"]
+
+
+def test_cmd_factory_default_name_single_element() -> None:
+    """cmd() with a single-element command uses name = command[0]."""
+    spec = cmd(["ls"])
+    assert spec.name == "ls"
+
+
+def test_cmd_factory_explicit_name() -> None:
+    """An explicit name passed to cmd() overrides the derived default."""
+    spec = cmd(["ruff", "check", "--fix"], name="lint")
+    assert spec.name == "lint"
+
+
+def test_cmd_factory_passes_depends_on() -> None:
+    """cmd() passes depends_on through to TaskSpec."""
+    spec = cmd(["echo", "b"], name="b", depends_on=("a",))
+    assert spec.depends_on == ("a",)
+
+
+def test_cmd_factory_passes_extra_kwargs() -> None:
+    """cmd() passes the remaining kwargs through to TaskSpec."""
+    spec = cmd(["echo", "x"], name="x", timeout=10.0, tags=("t1",))
+    assert spec.timeout == 10.0
+    assert spec.tags == ("t1",)
+
+
 def test_retry_policy_retries_property() -> None:
    policy = RetryPolicy(max_attempts=3)
    assert policy.retries == 2
@@ -157,8 +193,8 @@ def test_should_execute_skip_if_missing_cmd_not_found() -> None:

 def test_should_execute_skip_if_missing_cmd_found() -> None:
    """With skip_if_missing set, the task should still run when the command exists."""
-    # Use Python as a command known to be installed
-    spec = TaskSpec("a", cmd=["echo"], skip_if_missing=True)  # echo should exist
+    # Use Python as a command known to be installed (on Windows, echo is a shell builtin that shutil.which cannot find)
+    spec = TaskSpec("a", cmd=["python"], skip_if_missing=True)  # python should exist
    should_run, reason = spec.should_execute({})
    assert should_run is True
    assert reason is None
@@ -203,10 +239,10 @@ def test_is_cmd_available_callable_returns_true() -> None:
 # storage_key exception handling
 # ---------------------------------------------------------------------- #
 def test_storage_key_cache_key_exception_returns_name() -> None:
-    """Return the task name if cache_key raises."""
+    """Return the task name if cache_key raises an expected error (TypeError/ValueError/KeyError/AttributeError)."""

    def bad_cache_key(_ctx):
-        raise RuntimeError("cache key error")
+        raise ValueError("cache key error")

    spec = TaskSpec("a", _fn, cache_key=bad_cache_key)
    key = spec.storage_key({})
@@ -345,14 +381,14 @@ def test_task_result_default_status() -> None:


 # ---------------------------------------------------------------------- #
-# _run_command callable command tests
+# run_command callable command tests
 # ---------------------------------------------------------------------- #
 def test_run_command_callable_verbose_with_cwd(capsys: pytest.CaptureFixture[str], tmp_path: Path) -> None:
    """A callable command should print information in verbose mode."""
-    spec = TaskSpec("a", cmd=lambda: "result", verbose=True, cwd=tmp_path)
-    import pyflowx.task as task_module
+    from pyflowx.command import run_command

-    result = task_module._run_command(spec)
+    spec = TaskSpec("a", cmd=lambda: "result", verbose=True, cwd=tmp_path)
+    result = run_command(spec)
    assert result == "result"
    captured = capsys.readouterr()
    assert "执行可调用命令" in captured.out
@@ -361,8 +397,8 @@ def test_run_command_callable_verbose_with_cwd(capsys: pytest.CaptureFixture[str

 def test_run_command_callable_exception() -> None:
    """An exception raised by a callable command should be converted to RuntimeError."""
-    spec = TaskSpec("a", cmd=lambda: (_ for _ in ()).throw(RuntimeError("callable error")))
-    import pyflowx.task as task_module
+    from pyflowx.command import run_command

+    spec = TaskSpec("a", cmd=lambda: (_ for _ in ()).throw(RuntimeError("callable error")))
    with pytest.raises(RuntimeError, match="可调用命令执行异常"):
-        task_module._run_command(spec)
+        run_command(spec)
@@ -0,0 +1,136 @@
+"""Tests for the @task decorator API."""
+
+from __future__ import annotations
+
+from pathlib import Path
+from typing import Any, Mapping
+
+import pyflowx as px
+from pyflowx.task import RetryPolicy, TaskHooks, TaskSpec
+
+
+def test_task_decorator_plain() -> None:
+    """Bare @task decoration: name is taken from the function name and a TaskSpec is returned."""
+
+    @px.task
+    def extract() -> list[int]:
+        return [1, 2, 3]
+
+    assert isinstance(extract, TaskSpec)
+    assert extract.name == "extract"
+    assert extract.fn is not None
+    assert extract.depends_on == ()
+
+
+def test_task_decorator_with_params() -> None:
+    """@task(...) with arguments: passes through dependencies and the retry policy."""
+
+    @px.task(depends_on=("extract",), retry=RetryPolicy(max_attempts=3))
+    def double(extract: list[int]) -> list[int]:
+        return [x * 2 for x in extract]
+
+    assert isinstance(double, TaskSpec)
+    assert double.name == "double"
+    assert double.depends_on == ("extract",)
+    assert double.retry.max_attempts == 3
+
+
+def test_task_decorator_explicit_name() -> None:
+    """@task(name=...) should use the explicit name rather than the function name."""
+
+    @px.task(name="custom_name")
+    def my_func() -> None:
+        return None
+
+    assert my_func.name == "custom_name"
+
+
+def test_task_decorator_cmd_form() -> None:
+    """@task(cmd=...) should support the command form."""
+
+    spec = px.task(cmd=["ls", "-la"], name="list_files")
+    assert isinstance(spec, TaskSpec)
+    assert spec.name == "list_files"
+    assert spec.cmd == ["ls", "-la"]
+
+
+def test_task_decorator_full_options() -> None:
+    """@task should support every TaskSpec field."""
+
+    @px.task(
+        depends_on=("a",),
+        soft_depends_on=("b",),
+        defaults={"b": 0},
+        args=(1,),
+        kwargs={"x": 2},
+        retry=RetryPolicy(max_attempts=5),
+        timeout=10.0,
+        tags=("t1",),
+        conditions=(px.BuiltinConditions.IS_WINDOWS,),  # type: ignore[arg-type]
+        cwd="/tmp",
+        env={"K": "v"},
+        verbose=True,
+        skip_if_missing=True,
+        allow_upstream_skip=True,
+        strategy="thread",
+        priority=3,
+        concurrency_key="db",
+        continue_on_error=True,
+    )
+    def f(a: int) -> int:
+        return a
+
+    assert f.depends_on == ("a",)
+    assert f.soft_depends_on == ("b",)
+    assert f.defaults == {"b": 0}
+    assert f.args == (1,)
+    assert f.kwargs == {"x": 2}
+    assert f.retry.max_attempts == 5
+    assert f.timeout == 10.0
+    assert f.tags == ("t1",)
+    assert len(f.conditions) == 1
+    assert isinstance(f.cwd, Path)
+    assert f.cwd == Path("/tmp")
+    assert f.env == {"K": "v"}
+    assert f.verbose is True
+    assert f.skip_if_missing is True
+    assert f.allow_upstream_skip is True
+    assert f.strategy == "thread"
+    assert f.priority == 3
+    assert f.concurrency_key == "db"
+    assert f.continue_on_error is True
+
+
+def test_task_decorator_runs_in_graph() -> None:
+    """A TaskSpec produced by the decorator should build a graph and run directly."""
+
+    @px.task
+    def extract() -> list[int]:
+        return [1, 2, 3]
+
+    @px.task(depends_on=("extract",))
+    def double(extract: list[int]) -> list[int]:
+        return [x * 2 for x in extract]
+
+    graph = px.Graph.from_specs([extract, double])
+    report = px.run(graph)
+    assert report.success
+    assert report["double"] == [2, 4, 6]
+
+
+def test_task_decorator_hooks_passthrough() -> None:
+    """@task(hooks=...) should pass the TaskHooks instance through."""
+
+    hooks = TaskHooks(pre_run=lambda _spec: None)
+    spec = px.task(fn=lambda: None, hooks=hooks, name="h")
+    assert spec.hooks is hooks
+
+
+def test_task_decorator_cache_key_passthrough() -> None:
+    """@task(cache_key=...) should pass the cache-key function through."""
+
+    def ck(ctx: Mapping[str, Any]) -> str:
+        return "k"
+
+    spec = px.task(fn=lambda: None, cache_key=ck, name="c")
+    assert spec.cache_key is ck
@@ -1,65 +0,0 @@
-import time
-
-import pytest
-from pytest_mock import MockerFixture
-
-from pyflowx.utils import _perf_metrics, perf_timer
-
-
-@pytest.fixture(autouse=True)
-def reset_perf_metrics():
-    """Reset the performance metrics."""
-    _perf_metrics.clear()
-
-
-class TestPerformanceTimer:
-    def test_perf_timer(self):
-
-        @perf_timer()
-        def test_func():
-            time.sleep(0.1)
-
-        test_func()
-
-        assert _perf_metrics["test_func"] is not None
-        assert _perf_metrics["test_func"]["count"] == 1
-        assert _perf_metrics["test_func"]["total_time"] >= 0.1
-
-    def test_perf_timer_report(self, mocker: MockerFixture):
-        mock_log = mocker.patch("logging.info")
-
-        @perf_timer(report=True, unit="ms", precision=3)
-        def test_func():
-            time.sleep(0.1)
-
-        test_func()
-
-        assert _perf_metrics["test_func"] is not None
-        assert _perf_metrics["test_func"]["count"] == 1
-        assert _perf_metrics["test_func"]["total_time"] >= 0.1
-
-        assert mock_log.call_count == 1
-
-    def test_generate_report(self, mocker: MockerFixture, caplog: pytest.LogCaptureFixture):
-        mock_log = mocker.patch("logging.info")
-
-        from pyflowx.utils import _generate_report
-
-        @perf_timer(report=True, unit="ms", precision=3)
-        def test_func():
-            time.sleep(0.1)
-
-        @perf_timer(report=True, unit="ms", precision=3)
-        def test_func2():
-            time.sleep(0.2)
-
-        test_func()
-        test_func2()
-
-        _generate_report("ms", 3)
-
-        assert mock_log.call_count == 3
-        assert _perf_metrics["test_func"]["count"] == 1
-        assert _perf_metrics["test_func"]["total_time"] >= 0.1
-        assert _perf_metrics["test_func2"]["count"] == 1
-        assert _perf_metrics["test_func2"]["total_time"] >= 0.2
@@ -5603,7 +5603,7 @@ pycountry = [

 [[package]]
 name = "pyflowx"
-version = "0.2.10"
+version = "0.2.13"
 source = { editable = "." }
 dependencies = [
    { name = "graphlib-backport", marker = "python_full_version < '3.9'" },
Author	SHA1	Message	Date
zhou	5293831165	Merge pull request 'feat(cli/dev/envdev): add Docker installation and configuration tasks for Linux' (#1) from develop into main. Reviewed-on: #1	2026-07-02 05:26:08 +00:00
zhou	87606d152a	feat(cli/dev/envdev): add Docker installation and configuration tasks for Linux. Adds a Linux task flow that installs docker-compose-v2, adds the user to the docker group, and refreshes the docker group membership, completing the development-environment setup steps.	2026-07-02 10:58:12 +08:00
zhou	6f93e6eb6d	bump version to 0.3.0	2026-06-28 21:38:37 +08:00
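zhou	43e1aad1fe	chore: release version 0.2.13 and improve task execution environment configuration. This commit bumps the version to 0.2.13 and makes several improvements: 1. Ignore performance profile files (*_profile.html) in .gitignore 2. Fix test cases where the echo command could not be detected on Windows by switching to the python command 3. Improve test cases so that performance statistics are valid, adding a helper that simulates elapsed time 4. Set the project root as the working directory for all CLI tasks, fixing cross-platform execution path issues 5. Add tests verifying the cwd configuration of every task	2026-06-28 21:38:18 +08:00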
zhou	467634f8c7	bump version to 0.2.13	2026-06-28 20:30:54 +08:00
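zhou	ce31f60441	feat(cli): add pxp performance profiler command 1. Add the pxp CLI tool, which profiles PyFlowX scripts and generates performance reports 2. Add the ProfileReport.to_html method to generate self-contained HTML reports 3. Add complete test cases for the profiler 4. Update pyproject.toml to add the pxp entry point 5. Bump version to 0.2.12	2026-06-28 20:30:17 +08:00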
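zhou	3d6d769685	feat(profiling): add workflow performance profiling module and test cases. Adds performance profiling that derives task-level and graph-level metrics from run reports, including critical path, parallelism analysis, and bottleneck identification, with full unit-test coverage.	2026-06-28 19:59:25 +08:00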
zhou	3f9c52e6f1	bump version to 0.2.12	2026-06-28 18:56:42 +08:00
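zhou	8fadf6edd8	fix(executors): fix process pool blocking on exit 1. Add a _shutdown_process_pool function that proactively shuts down the process pool when run() finishes 2. Register fallback cleanup via atexit to prevent process pool leaks 3. Call shutdown(wait=False) first to tell the management thread to exit, then force-kill the worker processes, so that threading._shutdown does not block for several seconds waiting on join at Python exit 4. Add a testing standards document describing the testing rules	2026-06-28 18:56:27 +08:00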
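zhou	abc1152538	refactor(cli): define tasks uniformly with the @px.task decorator and rework task registration and alias management 1. Replace the old TaskSpec definitions in folderzip/folderback/gittool with the @px.task decorator 2. Refactor the pymake module, turning maturin_build_cmd into a constant and merging the alias configuration 3. Trim redundant test cases from the test files	2026-06-28 18:12:30 +08:00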
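zhou	5e561b4b3a	refactor: refactor CliRunner and add a cmd factory function to simplify task definitions 1. Add the cmd factory function, which simplifies TaskSpec creation and derives the name automatically 2. Refactor CliRunner, replacing the graphs parameter with tasks + aliases to support flat task registration and alias mapping 3. Replace the old task definitions in all CLI tools with the new API 4. Add matching test cases adapted to the new runner API	2026-06-28 17:52:52 +08:00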
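zhou	40f641611b	feat: add several core features and change the default execution strategy 1. Change CliRunner's default execution strategy from sequential to dependency 2. Add task status query and duration statistics methods to RunReport 3. Implement the task decorator and document the executor parameter 4. Add a process pool executor for CPU-bound tasks 5. Add Graph.chain for chained construction and add_subgraph for merging subgraphs 6. Add test cases for streaming task passing, process pool execution, namespaces, and more 7. Add path import configuration for the tests directory	2026-06-28 15:10:15 +08:00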
zhou	232e7293d9	refactor(system): simplify the write_file implementation, using pathlib instead of manual file handling.	2026-06-28 11:20:58 +08:00
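zhou	a1bae58e56	refactor: improve logging configuration and code details 1. Use __name__ consistently instead of hard-coded logger names 2. Use pathlib instead of os.path to handle the program name 3. Narrow exception handling and improve log output formatting 4. Tighten the exception handling scope of the file content check	2026-06-28 10:57:51 +08:00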
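zhou	cbc7cc0a75	docs: split the testing standards into a standalone skill document and update the main standards. Moves the testing section of python-standards.md into the new pyflowx-testing/SKILL.md, points the main standards at the new document, and tidies the overall document structure and content.	2026-06-28 10:19:26 +08:00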
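zhou	d0ff7d7b4d	docs: update the README and add a Python development standards document. This commit substantially expands the PyFlowX README, documenting the four execution strategies, soft dependencies, concurrency limits, task hooks, and other features; adds usage examples for new features such as task templates, graph composition, and cache keys; and updates the execution parameters, the execution strategy comparison table, and the module structure documentation. It also adds the .trae/rules/python-standards.md standards document, which unifies the project's code style, type checking, and test-writing standards.	2026-06-28 09:34:45 +08:00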
zhou	d154f67ce0	+trae ignore	2026-06-28 08:44:23 +08:00
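zhou	9999071119	refactor(executors): refactor executor logic, remove duplicated mixins, and optimize layered ordering. Main changes: 1. Move task skip/retry logic from class mixins to module-level functions, reducing code duplication 2. Optimize the pre-validation logic of _graph.layers(), running it once at the run entry point 3. Refactor the storage expiry check API, removing the deprecated _expired method 4. Improve TaskSpec.cache_key exception handling, catching specific exceptions and logging a warning 5. Fix the event callback logic in verbose mode so the RUNNING event fires correctly 6. Adjust test cases for the new API and behavior changes	2026-06-28 08:25:15 +08:00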
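zhou	bdd70e9c43	refactor: restructure the project code and split modules by responsibility 1. Extract graph composition logic into pyflowx.compose, leaving only single-graph DAG logic in graph.py 2. Extract command execution logic into pyflowx.command, removing _run_command from task.py 3. Refactor the context signature cache for better performance 4. Remove the deprecated utils.perf_timer code 5. Add a batched flush-to-disk optimization to JSONBackend 6. Adjust import paths and the public API, and update test cases 7. Simplify conditional logic and remove redundant code	2026-06-28 02:28:38 +08:00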