Request for Complete Test Script for Qwen2-Audio on AIR Bench

Hi,

I'm currently trying to replicate the performance of Qwen2-Audio on AIR-Bench. However, I noticed that the repository at [AIR-Bench](https://github.com/OFA-Sys/AIR-Bench/blob/main/score_chat.py) doesn't provide a complete test script. It only includes the inference script and the script that generates the GPT-4 evaluations.

Could you please clarify how the scores for the Speech, Sound, Music, and Mixed Audio metrics are obtained? It would be very helpful if you could provide the complete test script for these metrics.

Thank you for your assistance!
