docs/sphinx_doc/source/tutorial/develop_workflow.md
+1 -1 (1 addition, 1 deletion)
@@ -513,7 +513,7 @@ Here, `<config_file_path>` is the path to a YAML configuration file, which shoul
 Once started, the model will keep running and wait for debug instructions; it will not exit automatically. You can then run the following command in another terminal to debug your workflow:
docs/sphinx_doc/source/tutorial/faq.md
+2 -2 (2 additions, 2 deletions)
@@ -94,7 +94,7 @@ ray start --head
 **A:** The following parameters may be helpful:

-- For trainer, adjust `actor_rollout_ref.actor.ppo_micro_batch_size_per_gpu` when `actor_rollout_ref.actor.use_dynamic_bsz=false`; adjust `actor_rollout_ref.actor.ppo_max_token_len_per_gpu` and `actor_rollout_ref.actor.ulysses_sequence_parallel_size` when `actor_rollout_ref.actor.use_dynamic_bsz=true`. Setting `actor_rollout_ref.actor.entropy_from_logits_with_chunking=true` may also help.
+- For trainer, adjust `trainer.max_token_len_per_gpu` when `trainer.use_dynamic_bsz=false`; adjust `trainer.ppo_max_token_len_per_gpu` and `trainer.ulysses_sequence_parallel_size` when `trainer.use_dynamic_bsz=true`. Setting `trainer.trainer_config.actor_rollout_ref.actor.entropy_from_logits_with_chunking=true` may also help.
 - For explorer, adjust `explorer.rollout_model.tensor_parallel_size`.
@@ -113,7 +113,7 @@ To debug a new workflow, use Trinity-RFT's debug mode with the following steps:
 1. Launch the inference model via `trinity debug --config <config_file_path> --module inference_model`

-2. Debug the workflow in another terminal via `trinity debug --config <config_file_path> --module workflow --output_file <output_file_path> --plugin_dir <plugin_dir>`
+2. Debug the workflow in another terminal via `trinity debug --config <config_file_path> --module workflow --output-file <output_file_path> --plugin-dir <plugin_dir>`

 Please refer to {ref}`Workflow Development Guide <Workflows>` section for details.
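The two debug steps above can be scripted. The following is a minimal sketch that only assembles the command lines with the hyphenated flags from the updated text; the helper name `debug_commands` and the paths `config.yaml`, `debug.out`, and `plugins` are hypothetical placeholders, not part of Trinity-RFT.

```python
import shlex


def debug_commands(config_path, output_file, plugin_dir):
    """Build the two `trinity debug` invocations as shell-ready strings."""
    base = ["trinity", "debug", "--config", config_path]
    # Step 1: launch the inference model (keeps running until stopped).
    inference = base + ["--module", "inference_model"]
    # Step 2: run the workflow in another terminal, using the hyphenated flags.
    workflow = base + [
        "--module", "workflow",
        "--output-file", output_file,
        "--plugin-dir", plugin_dir,
    ]
    return shlex.join(inference), shlex.join(workflow)


# Hypothetical paths, for illustration only.
inference_cmd, workflow_cmd = debug_commands("config.yaml", "debug.out", "plugins")
print(inference_cmd)
print(workflow_cmd)
```

Building the argument lists first and quoting them with `shlex.join` keeps paths that contain spaces safe when the strings are pasted into a terminal.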
docs/sphinx_doc/source/tutorial/trinity_configs.md
+16 -0 (16 additions, 0 deletions)
@@ -371,6 +371,12 @@ explorer:
 tensor_parallel_size: 1
 eval_interval: 100
 eval_on_startup: True
+over_rollout:
+  ratio: 0.0
+  wait_after_min: 30.0
+dynamic_timeout:
+  enable: false
+  ratio: 3.0
 ```

 - `name`: Name of the explorer. This name will be used as the Ray actor's name, so it must be unique.
@@ -385,6 +391,12 @@ explorer:
 - `auxiliary_models`: Additional models used for custom workflows.
 - `eval_interval`: Interval (in steps) for evaluating the model.
 - `eval_on_startup`: Whether to evaluate the model on startup. More precisely, at step 0 with the original model, so it will not be triggered when restarting.
+- `over_rollout`: [Experimental] Configurations for over-rollout mechanism, which allows the explorer to proceed with fewer tasks than the full batch size. It effectively increases throughput in scenarios where some tasks take significantly longer to complete than others. Only applicable when dynamic synchronization (`synchronizer.sync_style` is not `fixed`) is used.
+  - `ratio`: Explorer will only wait for `(1 - ratio) * batch_size` of tasks at each step. Default is `0.0`, meaning waiting for all tasks.
+  - `wait_after_min`: After reaching the minimum task threshold, wait for this many seconds before proceeding. Default is `30.0` seconds.
+- `dynamic_timeout`: [Experimental] Configurations for dynamic timeout mechanism, which adjusts the timeout for each task based on the average time taken for successful tasks.
+  - `enable`: Whether to enable dynamic timeout. Default is `false`.
+  - `ratio`: The timeout for each task is dynamically set to `average_time_per_success_task * ratio`. Default is `3.0`.

 ---
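The two formulas in the `over_rollout` and `dynamic_timeout` descriptions can be worked through numerically. The sketch below is illustrative only and is not Trinity-RFT's implementation: the function names are hypothetical, rounding the task threshold up is an assumption (the documentation gives only `(1 - ratio) * batch_size`), and returning a fallback when no task has succeeded yet is also an assumption.

```python
import math


def min_tasks_to_wait(batch_size, ratio=0.0):
    """Number of tasks to wait for per step: (1 - ratio) * batch_size."""
    if not 0.0 <= ratio < 1.0:
        raise ValueError("ratio is assumed to lie in [0, 1)")
    # Rounding up (after trimming float noise) is an assumption;
    # the documentation only states the formula.
    return math.ceil(round((1.0 - ratio) * batch_size, 9))


def dynamic_task_timeout(success_durations, ratio=3.0, fallback=None):
    """Per-task timeout: average_time_per_success_task * ratio."""
    if not success_durations:
        # No successful task yet; returning a fallback is an assumption.
        return fallback
    average = sum(success_durations) / len(success_durations)
    return average * ratio


# With batch_size=32 and ratio=0.25 the explorer waits for 24 tasks.
threshold = min_tasks_to_wait(batch_size=32, ratio=0.25)
# Successful tasks averaging 20 s with the default ratio give a 60 s timeout.
timeout = dynamic_task_timeout([10.0, 20.0, 30.0])
print(threshold, timeout)
```

With the default `ratio: 0.0`, `min_tasks_to_wait` returns the full batch size, which matches the documented meaning of waiting for all tasks.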
@@ -398,6 +410,7 @@ synchronizer:
 sync_interval: 10
 sync_offset: 0
 sync_timeout: 1200
+sync_style: 'fixed'
 ```

 - `sync_method`: Method of synchronization. Options:
@@ -406,6 +419,9 @@ synchronizer:
 - `sync_interval`: Interval (in steps) of model weight synchronization between trainer and explorer.
 - `sync_offset`: Offset (in steps) of model weight synchronization between trainer and explorer. The explorer can run `sync_offset` steps before the trainer starts training.
 - `sync_timeout`: Timeout duration for synchronization.
+- `sync_style`: Style of synchronization. Options:
+  - `fixed`: The explorer and trainer synchronize weights every `sync_interval` steps.
+  - `dynamic_by_explorer`: The explorer notifies the trainer to synchronize weights after completing `sync_interval` steps, regardless of how many steps the trainer has completed at this point.
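Because the `over_rollout` description states that it is only applicable when `synchronizer.sync_style` is not `fixed`, a configuration can be checked for that combination before launching. The following is a rough sketch over a plain dictionary that mirrors the YAML layout shown in this diff; the function `over_rollout_warnings` is hypothetical, and treating a missing `sync_style` as `fixed` is an assumption based on the example config rather than a documented default.

```python
def over_rollout_warnings(config):
    """Return warnings for over-rollout settings that are not applicable."""
    explorer = config.get("explorer", {})
    synchronizer = config.get("synchronizer", {})
    ratio = explorer.get("over_rollout", {}).get("ratio", 0.0)
    # Assumption: a missing sync_style is treated as 'fixed'.
    sync_style = synchronizer.get("sync_style", "fixed")

    warnings = []
    if ratio > 0.0 and sync_style == "fixed":
        warnings.append(
            "explorer.over_rollout is only applicable when "
            "synchronizer.sync_style is not 'fixed'"
        )
    return warnings


fixed_cfg = {
    "explorer": {"over_rollout": {"ratio": 0.2, "wait_after_min": 30.0}},
    "synchronizer": {"sync_interval": 10, "sync_style": "fixed"},
}
dynamic_cfg = {
    "explorer": {"over_rollout": {"ratio": 0.2, "wait_after_min": 30.0}},
    "synchronizer": {"sync_interval": 10, "sync_style": "dynamic_by_explorer"},
}
print(over_rollout_warnings(fixed_cfg))
print(over_rollout_warnings(dynamic_cfg))
```

The first configuration yields one warning and the second yields none, since `dynamic_by_explorer` is a dynamic synchronization style.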