Commit 16bbfb3

Refreshes agent handbook (#492)
* Updates contributor guide to reflect the new repository layout and directory ownership. Highlights the current pre-commit tooling so future patches follow the standardized lint and formatting flow.
1 parent 5c0ffc6 commit 16bbfb3

1 file changed: +87 -38 lines changed

AGENTS.md

Lines changed: 87 additions & 38 deletions
@@ -10,57 +10,92 @@
 docs—reference them instead of hand-rolling virtualenvs.
 - **Testing**: Integration and performance tests require multi-node hardware. Explain
 skips explicitly when you cannot access the cluster.
-- **Formatting**: Project uses Black/isort/autoflake (see `pyproject.toml`). Surface any
-formatting gaps if you cannot run the tools yourself.
+- **Tooling**: `.pre-commit-config.yaml` runs Ruff (lint+format), mdformat,
+clang-format, nbstripout, and CLI doc generation; install with `pre-commit install`
+before submitting patches.
+- **Formatting**: Ruff + Ruff-format replace Black/isort; autoflake settings remain in
+`pyproject.toml`. Surface any formatting gaps you cannot auto-fix.
 - **Docs**: Source lives under `docs/` (Jupyter Book). Coordinate doc edits with the
 docs build pipeline.
+- **Legacy code**: `realhf/` is deprecated—do not modify or import from it; migrate uses
+into `areal/` equivalents instead.

 When unsure, leave a `TODO(agent)` comment and note the constraint in your response.

 ## Repository map

-| Path | Purpose |
-| ------------------------ | ------------------------------------------------------------------------------ |
-| `areal/api/` | Core contracts: workflows, engines, CLI configs, IO structs, scheduler APIs. |
-| `areal/workflow/` | Rollout/agent implementations (`multi_turn`, `rlvr`, `vision_rlvr`). |
-| `areal/engine/` | Training backends (FSDP2, Megatron, PPO actors) and inference adapters. |
-| `areal/dataset/` | Dataset loaders & utilities that feed rollouts. |
-| `areal/reward/` | Built-in reward functions plus helpers (math parsing, CLEVR counting). |
-| `areal/utils/` | Logging (`stats_tracker`), tensor helpers, recovery, evaluation, device utils. |
-| `examples/` | Runnable entrypoints for math, multi-turn, RLHF, VLM, search agents. |
-| `areal/launcher/` | Entry scripts for local, Ray, and Slurm orchestration. |
-| `docs/` | Published docs (https://inclusionai.github.io/AReaL/). |
-| `realhf/` | Legacy integrations retained for reference (read-only). |
-| `functioncall/` | Tool-calling utilities reused in workflows. |
-| `areal/platforms/` | Cluster abstractions used by advanced agents. |
-| `tests/` | Pytest suites (many require GPUs or mocked engines). |
-| `Dockerfile`, `Makefile` | Container recipe and helper tasks (`make docs`, `make lint`). |
+| Path | Purpose |
+| ------------------------- | ------------------------------------------------------------------------------- |
+| `areal/api/` | Core contracts: workflows, engines, controllers, schedulers, IO structs. |
+| `areal/controller/` | Distributed batching utilities and controller-side dataset packing. |
+| `areal/core/` | Async orchestration primitives (task runners, remote inference, workflow exec). |
+| `areal/dataset/` | Dataset loaders & utilities that feed rollouts. |
+| `areal/engine/` | Training backends (FSDP2, Megatron, PPO actors) and inference adapters. |
+| `areal/experimental/` | Prototype engines/workflows; expect churn and breaking changes. |
+| `areal/launcher/` | Orchestration entrypoints (local, Ray, Slurm) plus container specs. |
+| `areal/models/` | Model-specific adapters (Megatron-Core, Transformers wrappers). |
+| `areal/platforms/` | Hardware/platform abstractions (CPU/GPU/NPU backends, runtime adapters). |
+| `areal/reward/` | Built-in reward functions plus helpers (math parsing, CLEVR counting). |
+| `areal/scheduler/` | Scheduler implementations and allocation logic. |
+| `areal/tests/` | Targeted tests; many require GPUs or mocked distributed backends. |
+| `areal/thirdparty/` | Vendored integrations (e.g., vLLM shims). |
+| `areal/utils/` | Logging (`stats_tracker`), tensor helpers, recovery, evaluation, device utils. |
+| `areal/workflow/` | Rollout/agent implementations (`multi_turn`, `rlvr`, `vision_rlvr`). |
+| `examples/` | Runnable entrypoints for math, multi-turn, RLHF, VLM, search agents. |
+| `evaluation/` | Offline evaluation scripts (math/code/Elo) and utilities. |
+| `functioncall/` | Tool-calling utilities reused in workflows. |
+| `docs/` | Jupyter Book source published to https://inclusionai.github.io/AReaL/. |
+| `assets/` `benchmark/` | Figures, regression baselines, and benchmark snapshots. |
+| `blog/` | Release and update write-ups. |
+| `csrc/` | CUDA/C++ extensions that need `build_ext --inplace` after edits. |
+| `notebook/` | Reference notebooks (outputs stripped by pre-commit). |
+| `patch/` | Local patches for third-party deps (e.g., SGLang fixes). |
+| `recipe/` | Deployment recipes and higher-level orchestration configs. |
+| `.pre-commit-config.yaml` | Hooks: Ruff lint/format, mdformat, clang-format, nbstripout, CLI docs. |
+| `Dockerfile` | Container recipe for the standard runtime environment. |
+| `realhf/` | Legacy integrations (read-only, do **not** modify or import). |

 ### Where to find things

 - **`areal/api/`** – Contracts for engines, schedulers, dataloaders, and CLI configs.
 Start here when adding new dataclasses or API surfaces.
+- **`areal/controller/`** – Distributed batching utilities and controller-side dataset
+packing.
+- **`areal/core/`** – Async orchestration primitives (task runners, remote inference,
+workflow execution).
+- **`areal/launcher/`** – Reference launchers for local, Ray, and Slurm targets plus
+container specs; reuse these instead of ad-hoc scripts.
+- **`areal/engine/`** – Training and inference engines: FSDP2, Megatron, PPO actors, and
+SGLang/vLLM adapters. Keep weight versioning logic consistent across edits.
+- **`areal/models/`** – Model-specific adapters (Megatron-Core layers, Transformers
+wrappers, custom heads).
 - **`areal/workflow/`** – Concrete rollout agents (`multi_turn`, `rlvr`, `vision_rlvr`).
 Each illustrates how `RolloutWorkflow.arun_episode` should orchestrate inference and
 rewards.
-- **`areal/engine/`** – Training and inference engines: FSDP2, Megatron, PPO actors, and
-SGLang/vLLM adapters. Keep weight versioning logic consistent across edits.
 - **`areal/dataset/`** – Stateful data pipeline utilities. New datasets should plug into
 these loaders for replay-safe iteration.
 - **`areal/reward/`** – Reward functions and math parsers. Wrap slow logic with
 `AsyncRewardWrapper` in `areal/api/reward_api.py`.
 - **`areal/utils/`** – Cross-cutting helpers (logging, stats, tensor containers,
 recovery, evaluation). Prefer reusing these utilities over duplicating logic.
+- **`areal/scheduler/`** – Placement and allocation policies for launchers; align with
+`examples/**` configs.
+- **`areal/tests/`** – Unit and integration tests colocated with code; many require GPU
+or mocked distributed backends.
+- **`areal/platforms/`** – Hardware/platform abstractions for CPU/GPU/NPU targets and
+runtime adapters.
+- **`areal/experimental/`** – Prototype engines/workflows; expect churn and breaking
+changes.
 - **`examples/`** – End-to-end wiring scripts for math, multi-turn, RLHF, VLM, and
 search agents. Use them as references for config wiring and launcher usage.
+- **`evaluation/`** – Offline scoring pipelines that consume logged trajectories.
 - **`docs/`** – Jupyter Book source; mirrors the high-level architecture and
 customization guides published at https://inclusionai.github.io/AReaL/.
-- **`areal/launcher/`** – Orchestration entrypoints (local, Ray, Slurm) plus container
-specs; essential for understanding deployment expectations.
-- **`realhf/`** – Legacy integrations retained for reference. Treat this directory as
-read-only unless explicitly extending backward compatibility.
-- **`functioncall/` & `areal/platforms/`** – Tool-calling scaffolding and cluster
-abstractions used by advanced agents.
+- **`functioncall/`** – Tool-calling scaffolding reused by workflows.
+- **`patch/`** – Maintains in-tree diffs applied to upstream dependencies; keep changes
+minimal and well-documented.
+- **`realhf/`** – Legacy integrations retained for reference. Do **not** modify or
+import; port call sites into `areal/` instead.

 ## Distributed operations & tooling

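Reader's note on the hunk above: the `areal/reward/` entry recommends wrapping slow reward logic with `AsyncRewardWrapper`. That wrapper's signature is not shown in this diff, so the sketch below does not use it. It relies only on the standard library (`asyncio.to_thread`) to illustrate the general idea of keeping a blocking reward function off the event loop. The names `slow_math_reward` and `score_batch` are hypothetical and are not AReaL APIs.

```python
import asyncio
import time


def slow_math_reward(completion: str, answer: str) -> float:
    """Hypothetical blocking reward; stands in for slow parsing or verification."""
    time.sleep(0.05)  # simulate blocking work
    return 1.0 if completion.strip() == answer.strip() else 0.0


async def score_batch(pairs: list[tuple[str, str]]) -> list[float]:
    """Score (completion, answer) pairs without blocking the event loop."""
    # asyncio.to_thread runs each blocking call in a worker thread, so the
    # coroutine driving rollouts stays responsive while rewards are computed.
    tasks = [asyncio.to_thread(slow_math_reward, c, a) for c, a in pairs]
    return await asyncio.gather(*tasks)


if __name__ == "__main__":
    print(asyncio.run(score_batch([("42", "42"), ("41", "42")])))
```

In the repository itself, the handbook points to `AsyncRewardWrapper` in `areal/api/reward_api.py` for this purpose; the sketch only shows why offloading matters for non-blocking rollouts.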
@@ -75,10 +110,21 @@ When unsure, leave a `TODO(agent)` comment and note the constraint in your respo
 - **Testing limitations**: End-to-end tests (FSDP, Megatron, distributed RPC) require
 multi-node NCCL clusters. If you cannot execute them, state that your validation is
 limited to static analysis/doc updates.
-- **Formatting & docs**: CI enforces Black/isort/autoflake and `mdformat`. Mention when
-you cannot run the hooks; keep doc edits aligned with the Jupyter Book structure in
+- **Formatting & docs**: Pre-commit runs Ruff (lint+format), mdformat, clang-format,
+nbstripout, and CLI doc generation. Run `pre-commit run --all-files` (or install the
+hook) before submitting; keep doc edits aligned with the Jupyter Book structure in
 `docs/`.

+## Legacy `realhf/` (read-only)
+
+- `realhf/` remains only for archival context. The package build explicitly excludes it
+via `pyproject.toml`.
+- Do **not** modify files under `realhf/`, and avoid importing them in new code. Treat
+any dependency on these modules as tech debt.
+- When you encounter a `realhf` call site, prefer migrating the logic to the matching
+`areal/` module or partner with maintainers to port it.
+- Flag lingering `realhf` usage in reviews/issues so we can track and eliminate it.
+
 ### Code style & patterns

 - **Typing & dataclasses**: Prefer explicit type hints and reuse existing dataclasses in
@@ -87,9 +133,9 @@ When unsure, leave a `TODO(agent)` comment and note the constraint in your respo
 is a strict superset of an existing one. Create a new dataclass if the config is
 conceptually distinct or would introduce breaking changes. Keep new configs
 dataclass-based so Hydra/CLI integration stays consistent.
-- **Imports**: Avoid wildcard imports; keep third-party vs internal imports separate
-(`isort` handles ordering). Place heavy optional deps inside functions to prevent
-import-time side effects.
+- **Imports**: Avoid wildcard imports; keep third-party vs internal groups consistent.
+Ruff enforces import ordering (isort rules) when hooks run. Place heavy optional deps
+inside functions to prevent import-time side effects.
 - **Logging**: Use `areal.utils.logging.getLogger(__name__)` rather than `print`. Emit
 structured metrics through `stats_tracker`/`StatsLogger` instead of ad-hoc counters.
 - **Async code**: Rollout workflows must stay non-blocking—prefer `await` with
@@ -134,7 +180,7 @@ Reference docs:

 1. Create/modify a class in `areal/workflow/` that subclasses `RolloutWorkflow`.
 1. Maintain async behavior (`async def arun_episode`); gather trajectories per prompt
-and return padded tensors or `CompletionWithTokenLogpReward` maps.
+and return padded tensors (typically via `concat_padded_tensors`).
 1. Expose knobs via `__init__` (tokenizer, `GenerationHyperparameters`, reward fn,
 dump_dir).
 1. Update references in entry scripts or configs (e.g.,
@@ -180,8 +226,9 @@ Reference docs:
 acknowledge skipped coverage explicitly.
 - **Distributed/FSDP suites**: `test_fsdp_*`, `test_sglang_engine.py`, and RPC suites
 demand multi-node NCCL clusters. Mention the dependency when deferring.
-- **Static checks**: Black/isort/autoflake and `mdformat` are enforced in CI. Call out
-when formatting could not be verified locally.
+- **Static checks**: Pre-commit enforces Ruff lint/format, mdformat, clang-format,
+nbstripout, CLI doc regeneration, and autoflake. Call out when hooks cannot be run
+locally.

 Always mention resource requirements in PRs and in agent responses when tests are
 skipped.
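Reader's note on the hunk above: the handbook asks that skipped coverage and its resource requirements be stated explicitly. One way to make that habit mechanical is a small helper that turns a resource comparison into a human-readable skip message. The sketch below is an illustration under assumptions: `Resources` and `skip_reason` are hypothetical names, and the node and GPU counts in the usage note are invented, not requirements documented by the project.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Resources:
    """Hardware available or required (hypothetical helper type)."""

    nodes: int
    gpus_per_node: int
    has_nccl: bool


def skip_reason(available: Resources, required: Resources) -> Optional[str]:
    """Return an explicit skip message, or None when the suite can run."""
    missing = []
    if available.nodes < required.nodes:
        missing.append(f"{required.nodes} nodes (have {available.nodes})")
    if available.gpus_per_node < required.gpus_per_node:
        missing.append(
            f"{required.gpus_per_node} GPUs per node (have {available.gpus_per_node})"
        )
    if required.has_nccl and not available.has_nccl:
        missing.append("NCCL")
    if not missing:
        return None
    return "Skipped: requires " + ", ".join(missing)
```

The returned string could be passed as the `reason` of `pytest.mark.skipif(...)` or pasted into a PR description, so the same wording appears in test output and in the agent's response.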
@@ -238,8 +285,9 @@ skipped.
 - **Reviews**: Be explicit about follow-ups with `TODO(agent)` comments and track them
 in the PR discussion. Address review feedback with additional commits (no force-push
 once review starts unless requested).
-- **Pre-merge**: Ensure formatting hooks pass (`black`, `isort`, `mdformat`,
-`autoflake`). For doc-only changes, run `mdformat --check` on touched files.
+- **Pre-merge**: Ensure pre-commit hooks pass (Ruff lint+format, mdformat, clang-format,
+nbstripout, CLI docs, autoflake). For doc-only changes, run `mdformat --check` on
+touched files.

 ## Reviewer checklist

@@ -253,8 +301,9 @@ skipped.
 `update_weights`) consistent.
 - **Resource awareness**: Ensure configs note memory/GPU expectations, and new
 datasets/models document storage paths or cache requirements.
-- **Code style compliance**: Watch for Black/isort/autoflake alignment, import grouping,
-logging via `areal.utils.logging`, and consistent type hints/dataclass usage.
+- **Code style compliance**: Watch for Ruff lint/format alignment, autoflake cleanup,
+clang-format on CUDA/C++, mdformat for docs, logging via `areal.utils.logging`, and
+consistent type hints/dataclass usage.
 - **Config & docs**: Validate new knobs land in the right dataclasses/YAMLs with
 defaults explained in docs or README snippets. Cross-check hyperlinks and CLI
 references.
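Reader's note on the hunk above: the checklist asks reviewers to confirm that new knobs live in dataclasses with their defaults explained. A minimal sketch of what that check could look like follows, assuming help text is stored in each field's `metadata` under a `"help"` key. That convention, the class `ExampleRolloutConfig`, and the helper `undocumented_fields` are assumptions for illustration and are not taken from the AReaL codebase.

```python
from dataclasses import dataclass, field, fields


@dataclass
class ExampleRolloutConfig:
    """Hypothetical config; field names are illustrative only."""

    max_turns: int = field(
        default=4,
        metadata={"help": "Maximum dialogue turns per episode."},
    )
    dump_dir: str = field(
        default="",
        metadata={"help": "Where to write trajectories; empty disables dumping."},
    )


def undocumented_fields(config_cls: type) -> list[str]:
    """List dataclass fields that lack a non-empty 'help' entry in metadata."""
    return [f.name for f in fields(config_cls) if not f.metadata.get("help")]
```

A reviewer or a unit test could assert that `undocumented_fields(SomeConfig)` is empty, which turns the "defaults explained" item into something checkable rather than a matter of reading every field by eye.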
