Conversation

@jadechoghari
Member

@jadechoghari jadechoghari commented Dec 1, 2025

What this does

feat(dataset): add tool to convert images to video datasets
Useful for encoding image datasets into videos.
Works out of the box.

Copilot AI review requested due to automatic review settings December 1, 2025 12:48
@jadechoghari jadechoghari added the dataset Issues regarding data inputs, processing, or datasets label Dec 1, 2025
@jadechoghari jadechoghari changed the title feat(dataset): editing tools - add conversion images to videos dataset feat(dataset): add tool to convert images to video datasets Dec 1, 2025
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Copilot finished reviewing on behalf of jadechoghari December 1, 2025 12:50
Contributor

Copilot AI left a comment

Pull request overview

This PR adds a new dataset editing tool to convert image-based datasets to video format, providing storage efficiency and potentially improved data loading performance. The implementation integrates with the existing lerobot_edit_dataset script and reuses the existing encode_video_frames utility from the video_utils module.

Key Changes:

  • Added convert_to_video operation type with configurable video encoding parameters (codec, quality, GOP size, etc.)
  • Implemented parallel processing at both episode and image levels using ThreadPoolExecutor
  • Added comprehensive documentation with usage examples in the command-line tool help text and documentation
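The two-level parallelism described above (episodes in parallel, and images within each episode in parallel) can be sketched roughly as follows. This is a minimal illustration under assumptions, not the PR's actual code: `save_image`, `process_episode`, `process_dataset`, and the worker-count parameters are hypothetical names, and the image-saving step is replaced by a stand-in that only returns the path it would write.

```python
from concurrent.futures import ThreadPoolExecutor


def save_image(episode_index: int, frame_index: int) -> str:
    # Hypothetical stand-in for writing one frame to disk;
    # it only returns the relative path it would have used.
    return f"episode_{episode_index:03d}/frame_{frame_index:06d}.png"


def process_episode(episode_index: int, num_frames: int, image_workers: int = 4) -> list[str]:
    # Inner pool: frames of a single episode are handled concurrently.
    with ThreadPoolExecutor(max_workers=image_workers) as pool:
        # Executor.map preserves input order, so frames stay in sequence.
        return list(pool.map(lambda i: save_image(episode_index, i), range(num_frames)))


def process_dataset(episode_lengths: list[int], episode_workers: int = 2) -> list[list[str]]:
    # Outer pool: whole episodes are handled concurrently.
    # Each outer task creates its own inner pool, so the two levels
    # do not compete for the same worker slots.
    with ThreadPoolExecutor(max_workers=episode_workers) as pool:
        return list(pool.map(process_episode, range(len(episode_lengths)), episode_lengths))


if __name__ == "__main__":
    for paths in process_dataset([3, 2]):
        print(paths)
```

Threads suit this kind of workload when most of the time is spent in I/O or in native encoders that release the GIL; whether that holds for the real implementation is not stated in the review.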

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| src/lerobot/scripts/lerobot_edit_dataset.py | Adds core implementation: ConvertToVideoConfig dataclass, image extraction/saving functions, video encoding logic, and integration with the main edit_dataset command |
| docs/source/using_dataset_tools.mdx | Documents the new convert_to_video operation with usage examples and parameter descriptions |
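
For readers unfamiliar with the pattern, a configuration dataclass for video encoding options might look something like the sketch below. It is not the PR's `ConvertToVideoConfig`: the class name `ConvertToVideoConfigSketch`, the field names, and the default values are all assumptions chosen for illustration, based only on the review's mention of codec, quality, and GOP size.

```python
from dataclasses import asdict, dataclass


@dataclass
class ConvertToVideoConfigSketch:
    # All names and defaults here are hypothetical placeholders.
    vcodec: str = "h264"   # video codec identifier
    crf: int = 23          # quality setting (lower usually means higher quality)
    gop_size: int = 12     # distance between keyframes, in frames
    num_workers: int = 4   # number of parallel workers

    def __post_init__(self) -> None:
        # Fail early on values that cannot be meaningful.
        if self.crf < 0:
            raise ValueError("crf must be non-negative")
        if self.gop_size < 1:
            raise ValueError("gop_size must be at least 1")
        if self.num_workers < 1:
            raise ValueError("num_workers must be at least 1")


if __name__ == "__main__":
    cfg = ConvertToVideoConfigSketch(crf=20)
    # asdict gives a plain dict that could be forwarded as keyword arguments.
    print(asdict(cfg))
```

Validating in `__post_init__` keeps bad values from reaching a long-running encode job, which is one common reason to group such options in a dataclass rather than pass them as loose arguments.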

3 participants