We recently had a regression where the reproducer command stopped providing the output from the container run, which meant that reproducer commands just stopped silently. We should improve such that we have tests that are a bit more rigid in terms of enforcing the CLI interaction we expect from helper.py.
Refs: