[Feature Request] Integrate FlashInfer for Prefill and Decode #64

@lausannel

Prerequisites

  • I have searched existing issues and reviewed documentation.

Problem Description

We'd like to integrate FlashInfer into the project to make both the prefill and decode stages more efficient.

The goal is to use FlashInfer's optimized kernels for faster inference.

Proposed Solution

Alternatives Considered

No response

Additional Context

No response

Importance

Important

Usage Statistics (Optional)

No response

Metadata

Labels

enhancement: New feature or request