Error in calculating the Foundation score

The score calculation for the foundation benchmark has two errors:
1. For non-generation, the total count is not updated.
2. For predictions that do not yield one of the letters [A, B, C, D] in either predict[0] or predict[-2], the total count is not updated.

Therefore, the denominator used to compute the % accuracy is much smaller than the number of samples evaluated. This inflates the score and makes it unrepresentative of the actual model performance.
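A minimal sketch of the counting I would expect, not the repository's actual code: the names `extract_choice`, `accuracy`, and `VALID_CHOICES` are hypothetical, and I am assuming each prediction is a string (or empty/None when nothing was generated). The point is only that the total is incremented for every sample, whether or not a letter could be parsed.

```python
VALID_CHOICES = {"A", "B", "C", "D"}  # assumed answer letters, as in the issue


def extract_choice(predict):
    """Return the letter found at predict[0] or predict[-2], else None."""
    if not predict:  # empty string or None: nothing was generated
        return None
    candidates = [predict[0]]
    if len(predict) >= 2:  # guard so predict[-2] cannot raise IndexError
        candidates.append(predict[-2])
    for ch in candidates:
        if ch in VALID_CHOICES:
            return ch
    return None


def accuracy(predictions, answers):
    """Percent accuracy over ALL samples, including unparseable ones."""
    total = 0
    correct = 0
    for predict, answer in zip(predictions, answers):
        total += 1  # always count the sample in the denominator
        choice = extract_choice(predict)
        if choice is not None and choice == answer:
            correct += 1
    return 100.0 * correct / total if total else 0.0


if __name__ == "__main__":
    preds = ["A", "(B)", "", "I am not sure"]
    golds = ["A", "B", "C", "D"]
    print(accuracy(preds, golds))
```

With these four made-up samples, two are correct and two are unparseable, so counting every sample gives 50%. If the two unparseable samples were dropped from the total, as described above, the same predictions would score 100%.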

Error in calculating the Foundation score #8
