Open
Description
Testing rollout generation currently requires the full RolloutWorker infrastructure: JAX/mesh setup, background InferenceServer threads, weight transfer via Arrow Flight, Ray curriculum actors, and rollout storage backends. As a result, tests are slow, carry heavy setup overhead (100+ lines even for simple cases), and are hard to debug because the relevant logic is buried in async workers. We need a lightweight RolloutManager class that generates rollouts synchronously for unit tests and debugging while keeping its output compatible with the production pipeline's formats.
Parent Issue: #1738
Relevant Code
- `src/marin/rl/rollout_worker.py` - Production worker to extract core logic from
- `src/marin/rl/rollout_worker.py#L203-L269` - `_sample_batch()` method with core rollout generation
- `src/marin/rl/environments/base.py#L30-L60` - Environment protocol and `LevanterInferenceContext`
- `tests/rl/integration/config.py#L443-L683` - Current test workarounds to replace
- `tests/rl/integration/tasks.py#L41-L138` - Tests manually creating batches
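
Rough sketch

To make the request in the description concrete, here is one possible shape for a synchronous manager. This is only an illustration under stated assumptions: `Rollout`, `Environment`, `sample_batch`, and the `generate` callable are hypothetical stand-ins invented for this sketch, not the existing types in `rollout_worker.py` or `environments/base.py`, and the real class would need to emit the production rollout format.

```python
from dataclasses import dataclass
from typing import Callable, Protocol, Sequence


@dataclass(frozen=True)
class Rollout:
    """Hypothetical stand-in for the production rollout record."""
    prompt: str
    response: str
    reward: float


class Environment(Protocol):
    """Hypothetical minimal interface, NOT the real protocol in base.py."""

    def prompts(self, n: int) -> Sequence[str]: ...

    def score(self, prompt: str, response: str) -> float: ...


class RolloutManager:
    """Generates rollouts synchronously: no threads, no Ray, no weight transfer."""

    def __init__(self, generate: Callable[[str], str]):
        # `generate` replaces the inference server; tests can pass any function.
        self._generate = generate

    def sample_batch(self, env: Environment, n: int) -> list[Rollout]:
        rollouts = []
        for prompt in env.prompts(n):
            response = self._generate(prompt)
            rollouts.append(Rollout(prompt, response, env.score(prompt, response)))
        return rollouts


class EchoEnv:
    """Toy environment: rewards responses that upper-case the prompt."""

    def prompts(self, n: int) -> list[str]:
        return [f"q{i}" for i in range(n)]

    def score(self, prompt: str, response: str) -> float:
        return 1.0 if response == prompt.upper() else 0.0


# A complete test setup in a few lines, using a fake "model".
batch = RolloutManager(generate=str.upper).sample_batch(EchoEnv(), n=3)
```

Injecting the generation function keeps the manager independent of mesh setup and background servers, so a unit test can step through `sample_batch` directly in a debugger.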