Skip to content

Conversation

@hypdeb
Copy link

@hypdeb hypdeb commented Nov 20, 2025

Following the developer instructions on arm CPUs results in a bunch of:

error: instruction requires: fullfp16
    --> ~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/gemm-common-0.18.2/src/simd.rs:1978:18
     |
1978 |                 "fmla {0:v}.8h, {1:v}.8h, {2:v}.8h",
     |                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     |

It seems this is a simple case of telling Rust to enable this feature, which this PR accomplishes.

Summary by CodeRabbit

  • Chores
    • Enhanced build configuration for ARM64 Linux systems to optimize floating-point operation support.

✏️ Tip: You can customize this high-level summary in your review settings.

Signed-off-by: Julien Debache <[email protected]>
@copy-pr-bot
Copy link

copy-pr-bot bot commented Nov 20, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions
Copy link

👋 Hi hypdeb! Thank you for contributing to ai-dynamo/dynamo.

Just a reminder: The NVIDIA Test Github Validation CI runs an essential subset of the testing framework to quickly catch errors.Your PR reviewers may elect to test the changes comprehensively before approving your changes.

🚀

@github-actions github-actions bot added fix external-contribution Pull request is from an external contributor labels Nov 20, 2025
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 20, 2025

Walkthrough

A configuration file modification that adds target-specific rustflags for aarch64-unknown-linux-gnu architecture to enable FP16 support in the Cargo build configuration.

Changes

Cohort / File(s) Change Summary
Cargo Build Configuration
\.cargo/config\.toml
Introduces a new [target.aarch64-unknown-linux-gnu] section with rustflags to enable FP16 support for the aarch64 Linux target.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

Poem

🐰 A whisker-twitch of FP16 delight,
On aarch64 we soar through the night,
Half-precision floats, swift and lean,
The finest flags our Cargo's seen! ✨

Pre-merge checks

❌ Failed checks (1 inconclusive)
Check name Status Explanation Resolution
Description check ❓ Inconclusive The description explains the problem (fullfp16 error on ARM), the solution (enable the feature), and provides context, but lacks structured sections matching the template. Add explicit sections: Overview, Details of rustflags changes, which files to review (.cargo/config.toml), and any related issues to better match the template format.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'fix: allow build on arm64' directly addresses the main change, which is enabling ARM64 builds by adding FP16 support flags.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

Tip

📝 Customizable high-level summaries are now available in beta!

You can now customize how CodeRabbit generates the high-level summary in your pull requests — including its content, structure, tone, and formatting.

  • Provide your own instructions using the high_level_summary_instructions setting.
  • Format the summary however you like (bullet lists, tables, multi-section layouts, contributor stats, etc.).
  • Use high_level_summary_in_walkthrough to move the summary from the description to the walkthrough section.

Example instruction:

"Divide the high-level summary into five sections:

  1. 📝 Description — Summarize the main change in 50–60 words, explaining what was done.
  2. 📓 References — List relevant issues, discussions, documentation, or related PRs.
  3. 📦 Dependencies & Requirements — Mention any new/updated dependencies, environment variable changes, or configuration updates.
  4. 📊 Contributor Summary — Include a Markdown table showing contributions:
    | Contributor | Lines Added | Lines Removed | Files Changed |
  5. ✔️ Additional Notes — Add any extra reviewer context.
    Keep each section concise (under 200 words) and use bullet or numbered lists for clarity."

Note: This feature is currently in beta for Pro-tier users, and pricing will be announced later.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0219979 and 876dbe7.

📒 Files selected for processing (1)
  • .cargo/config.toml (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: dmitry-tokarev-nv
Repo: ai-dynamo/dynamo PR: 2300
File: pyproject.toml:64-66
Timestamp: 2025-08-05T22:51:59.230Z
Learning: The ai-dynamo/dynamo project does not ship ARM64 wheels, so platform markers to restrict dependencies to x86_64 are not needed in pyproject.toml dependencies.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build and Test - dynamo
🔇 Additional comments (1)
.cargo/config.toml (1)

4-9: Configuration approach is sound.

Using target-specific rustflags under [target.aarch64-unknown-linux-gnu] is the correct and idiomatic Cargo pattern for architecture-specific compilation flags. This avoids affecting x86_64 or other builds and keeps the fix narrowly scoped.

Comment on lines +8 to +9
[target.aarch64-unknown-linux-gnu]
rustflags = ["-C", "target-feature=+fp16"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify the correct rustflag for FP16 support on ARM64.

The PR description mentions the error requires "fullfp16", but the config uses target-feature=+fp16. Confirm that this flag is correct and sufficient to resolve the compilation error in gemm-common.

Additionally, consider whether other ARM64 target triples (e.g., aarch64-apple-darwin, aarch64-pc-windows-msvc) require the same configuration, or if this is specific to aarch64-unknown-linux-gnu.


🌐 Web query:

What is the correct Rust/Cargo rustflag to enable fullfp16 or fp16 CPU feature on aarch64?

💡 Result:

Use rustc's target-feature flag with the AArch64 feature name "fullfp16". Example:

  • Environment / Cargo: RUSTFLAGS="-C target-feature=+fullfp16"
  • .cargo/config.toml: [target.] rustflags = ["-C", "target-feature=+fullfp16"]

References: LLVM/Clang show the AArch64 feature name is fullfp16, and rustc accepts target-feature via -C. [1][2]

Sources:
[1] Clang/LLVM aarch64 tests showing "+fullfp16" feature name.
[2] rustc docs: use -C target-feature to enable/disable CPU features.


Correct the rustflag: use +fullfp16 instead of +fp16.

The correct Rust/Cargo rustflag to enable fullfp16 support on aarch64 is target-feature=+fullfp16. The PR currently uses target-feature=+fp16, which is incorrect and will not resolve the gemm-common compilation error.

Lines 8-9 should be:

[target.aarch64-unknown-linux-gnu]
    rustflags = ["-C", "target-feature=+fullfp16"]
🤖 Prompt for AI Agents
In .cargo/config.toml around lines 8 to 9, the rustflags entry incorrectly uses
"target-feature=+fp16"; replace it with the correct feature
"target-feature=+fullfp16" so the aarch64 build enables fullfp16 support (i.e.,
update the rustflags array value to use +fullfp16).

Copy link
Contributor

@nnshah1 nnshah1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nv-anants can you look as well? What does the arm build do on our end?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

external-contribution Pull request is from an external contributor fix size/XS

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants