Commit 730bd35
[perf][cpu] Accelerate paged attention GEMMs (QK, PV) on Arm CPUs with NEON (#29193)
Signed-off-by: Fadi Arafeh <[email protected]>
1 parent f55c76c commit 730bd35
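For readers unfamiliar with the two GEMMs named in the title, the following is a minimal sketch of where QK and PV sit in single-query paged attention, and how a NEON fused multiply-add could speed up each inner loop. It is not the code from this commit: the helper names (`dot`, `axpy`, `paged_attention`), the `[physical_block][block_size][head_dim]` cache layout, and the use of fp32 throughout are assumptions made for illustration. The NEON path compiles only on AArch64; other targets use the scalar fallback.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>
#if defined(__ARM_NEON) && defined(__aarch64__)
#include <arm_neon.h>
#endif

// Hypothetical helper: dot product of two float vectors of length n.
// This is the inner loop of the QK step (query . key).
inline float dot(const float* a, const float* b, std::size_t n) {
  std::size_t i = 0;
  float sum = 0.0f;
#if defined(__ARM_NEON) && defined(__aarch64__)
  float32x4_t acc = vdupq_n_f32(0.0f);
  for (; i + 4 <= n; i += 4) {
    // acc += a[i..i+3] * b[i..i+3], four lanes at a time
    acc = vfmaq_f32(acc, vld1q_f32(a + i), vld1q_f32(b + i));
  }
  sum = vaddvq_f32(acc);  // horizontal add of the four lanes
#endif
  for (; i < n; ++i) sum += a[i] * b[i];  // scalar path and tail
  return sum;
}

// Hypothetical helper: out[i] += w * v[i].
// This is the inner loop of the PV step (probability * value).
inline void axpy(float w, const float* v, float* out, std::size_t n) {
  std::size_t i = 0;
#if defined(__ARM_NEON) && defined(__aarch64__)
  for (; i + 4 <= n; i += 4) {
    vst1q_f32(out + i, vfmaq_n_f32(vld1q_f32(out + i), vld1q_f32(v + i), w));
  }
#endif
  for (; i < n; ++i) out[i] += w * v[i];  // scalar path and tail
}

// Single-query, single-head attention over a paged KV cache.
// Assumed layout: k_cache and v_cache are row-major
// [num_physical_blocks][block_size][head_dim]; block_table maps each
// logical block of the sequence to a physical block index.
std::vector<float> paged_attention(const std::vector<float>& q,
                                   const std::vector<float>& k_cache,
                                   const std::vector<float>& v_cache,
                                   const std::vector<int>& block_table,
                                   std::size_t seq_len,
                                   std::size_t block_size,
                                   std::size_t head_dim) {
  // Offset of token t's row inside the paged cache.
  auto row = [&](std::size_t t) {
    std::size_t phys = static_cast<std::size_t>(block_table[t / block_size]);
    return (phys * block_size + t % block_size) * head_dim;
  };

  // QK: scaled scores of the query against every cached key.
  const float scale = 1.0f / std::sqrt(static_cast<float>(head_dim));
  std::vector<float> p(seq_len);
  float max_score = std::numeric_limits<float>::lowest();
  for (std::size_t t = 0; t < seq_len; ++t) {
    p[t] = scale * dot(q.data(), k_cache.data() + row(t), head_dim);
    max_score = std::max(max_score, p[t]);
  }

  // Numerically stable softmax over the scores.
  float denom = 0.0f;
  for (float& s : p) {
    s = std::exp(s - max_score);
    denom += s;
  }

  // PV: probability-weighted sum of the cached values.
  std::vector<float> out(head_dim, 0.0f);
  for (std::size_t t = 0; t < seq_len; ++t) {
    axpy(p[t] / denom, v_cache.data() + row(t), out.data(), head_dim);
  }
  return out;
}
```

Calling `paged_attention(q, k_cache, v_cache, block_table, seq_len, block_size, head_dim)` returns one `head_dim`-long output vector. Production kernels typically process several queries and keys per iteration and fuse the softmax with both GEMMs; the sketch keeps the three stages separate so each one is easy to identify.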
File tree
5 files changed: +416 -5 lines changed
- csrc/cpu
- vllm
  - engine
  - v1/attention/backends
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 13-18 | 13-30 | + 12 lines (diff lines 16-27) |
| 41-46 | 53-59 | + 1 line (diff line 56) |
| 73-78 | 86-93 | + 2 lines (diff lines 89-90) |
| 158-163 | 173-180 | + 2 lines (diff lines 176-177) |

| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 14-20 | 14-20 | - 1 line (original line 17), + 1 line (diff line 17) |
| 143-148 | 143-154 | + 6 lines (diff lines 146-151) |