PPO Batch/Buffer Size Handling is confusing #93

@TheEimer

Description

PPO batch size handling is a bit hacky. The main problem is the interaction between batch size and buffer size. The buffer currently counts the number of added steps towards its size (len() is correct, but we compare against self.pos), which means there is a hidden dimension of n_parallel_envs. What happens:

buffer_size: 128
num_envs: 1

We add 1 step 128 times. That is fewer than the default batch size, but we get a buffer overflow error with no indication that it is related to batch size.

buffer_size: 128
num_envs: 64

We actually add too many steps, since each added step contains 64 and the buffer/batch size is optimized for 16. Better errors, and maybe dynamic settings or warnings, would be nice here.
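As a rough sketch of what a clearer error could look like, the check below makes the hidden environment dimension explicit before any data is collected. The function name `check_rollout_config` and the batch size of 256 are hypothetical, and the sketch assumes that each added step contributes `num_envs` transitions, as described above. It is not the project's actual code.

```python
import warnings


def check_rollout_config(buffer_size: int, num_envs: int, batch_size: int) -> int:
    """Hypothetical helper: return the transitions collected per rollout.

    Assumes buffer_size counts added steps and every added step holds
    num_envs transitions (the hidden dimension described in this issue).
    """
    if min(buffer_size, num_envs, batch_size) < 1:
        raise ValueError("buffer_size, num_envs and batch_size must all be positive")

    total_transitions = buffer_size * num_envs

    # Fail early with a message that names batch_size explicitly,
    # instead of surfacing later as a generic buffer overflow.
    if batch_size > total_transitions:
        raise ValueError(
            f"batch_size={batch_size} exceeds the {total_transitions} transitions "
            f"collected per rollout (buffer_size={buffer_size} x num_envs={num_envs}); "
            "lower batch_size or raise buffer_size/num_envs."
        )

    # An uneven split is not fatal, so only warn about it.
    if total_transitions % batch_size != 0:
        warnings.warn(
            f"{total_transitions} transitions do not divide evenly into batches "
            f"of {batch_size}; the last minibatch will be smaller.",
            stacklevel=2,
        )

    return total_transitions
```

With the two configurations from this issue and the assumed batch size of 256, `check_rollout_config(128, 1, 256)` raises a `ValueError` that mentions `batch_size`, while `check_rollout_config(128, 64, 256)` returns 8192, which makes the real amount of collected data visible.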
