
Conversation

@Menkib64 (Contributor) commented Nov 6, 2025

No description provided.

@borg323 requested a review from Copilot November 6, 2025 21:35

Copilot AI left a comment


Pull Request Overview

This PR introduces a new GetPreferredBatchStep() method to the Network interface to better control batch size granularity across different backend implementations, addressing issues with batch size management in the demux backend.

  • Adds a GetPreferredBatchStep() virtual method to the base Network class with a default return value of 1 (see the sketch after this list)
  • Implements backend-specific preferred batch step logic for XLA, ONNX, CUDA, and demux backends
  • Refactors the demux backend to use proper batch step-based work distribution instead of the previous simple splitting approach
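
As a minimal sketch of the interface addition (assuming only what the overview states: a virtual method on the base Network class whose default step is 1; the actual declaration in src/neural/network.h may differ in signature and surrounding members):

  // Sketch only; everything except GetPreferredBatchStep() is illustrative.
  class Network {
   public:
    virtual ~Network() = default;

    // Granularity at which batches should be grown or split. Backends with
    // no particular preference inherit the default step of 1.
    virtual int GetPreferredBatchStep() const { return 1; }
  };

A caller such as the demux backend can then round each backend's share of a batch up to a multiple of this step before enqueueing work.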

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 6 comments.

Summary per file:

  • src/neural/network.h: Adds the new virtual method GetPreferredBatchStep() to the Network interface
  • src/neural/backends/xla/xla_runner.h: Declares GetPreferredBatchStep() for the XLA runner
  • src/neural/backends/xla/xla_runner.cc: Implements the preferred batch step as the smallest executable batch size
  • src/neural/backends/xla/network_xla.cc: Removes the batch-size workaround and delegates to the new preferred batch step method
  • src/neural/backends/network_onnx.cc: Implements the preferred batch step based on the batch configuration
  • src/neural/backends/network_demux.cc: Major refactor with a new work-distribution algorithm using batch steps and improved synchronization
  • src/neural/backends/cuda/network_cuda.cc: Implements a CUDA-specific preferred batch step based on the streaming-multiprocessor count (see the sketch after this list)
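
As a hypothetical illustration of the CUDA entry above (the heuristic actually implemented in network_cuda.cc is not reproduced in this summary), a backend could derive its preferred step from the device's streaming-multiprocessor count roughly like this:

  #include <cuda_runtime.h>

  // Hypothetical sketch: one batch position per streaming multiprocessor, so
  // a full step keeps every SM occupied. Falls back to a step of 1 on error.
  int PreferredBatchStepForDevice(int device) {
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, device) != cudaSuccess) return 1;
    return prop.multiProcessorCount;
  }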


#include <mutex>
#include <queue>
#include <thread>


Copilot AI Nov 6, 2025


Missing include for SpinloopPause(). The function SpinloopPause() is used in line 76 but the required header "utils/mutex.h" is not included. This will cause a compilation error.

Suggested change
#include "utils/mutex.h"

Comment on lines +306 to +312
    for (; i != end_index; i = (i + 1) % network_->backends_.size()) {
      assert(work_start != GetBatchSize());
      int work_end = work_start + split_size * network_->batch_step_;
      work_end = std::min(work_end, GetBatchSize());
      parents_.emplace_back(this, work_start, work_end);
      network_->backends_[i].Enqueue(&parents_.back());
      work_start = work_end;

Copilot AI Nov 6, 2025


Potential bug when all backends need extra work. If extra_split_backends == network_->backends_.size(), then end_index will equal start_index (line 296), causing the loop at line 306 to not execute even though all backends should receive extra work. Consider handling this case explicitly, or using a counter-based loop instead of checking i != end_index.

Suggested change
-    for (; i != end_index; i = (i + 1) % network_->backends_.size()) {
-      assert(work_start != GetBatchSize());
-      int work_end = work_start + split_size * network_->batch_step_;
-      work_end = std::min(work_end, GetBatchSize());
-      parents_.emplace_back(this, work_start, work_end);
-      network_->backends_[i].Enqueue(&parents_.back());
-      work_start = work_end;
+    for (int count = 0; count < extra_split_backends; ++count) {
+      assert(work_start != GetBatchSize());
+      int work_end = work_start + split_size * network_->batch_step_;
+      work_end = std::min(work_end, GetBatchSize());
+      parents_.emplace_back(this, work_start, work_end);
+      network_->backends_[i].Enqueue(&parents_.back());
+      work_start = work_end;
+      i = (i + 1) % network_->backends_.size();
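
To make the edge case concrete (a worked illustration, not part of the original review): with 4 backends that all need an extra batch step, end_index wraps around to start_index, so the condition i != end_index is already false on the first check and no backend receives its extra step. The counter-based loop above runs exactly extra_split_backends times regardless of wrap-around.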
