Different Caikit server instances return different protos  #237

@kpouget

Describe the bug

As part of my automated testing, I observed that some requests were returning very quickly.

Investigations highlighted that min_new_tokens: 0 was the reason for the quick return:

  • max_new_tokens: 25, min_new_tokens: 0
  • Request generated 5 tokens before EosToken

even though the query was sent with max_new_tokens == min_new_tokens.

Further investigation led to this reproducer:

HOST_1=flan-t5-small-cpu-1-predictor-watsonx-e2e-0.apps.20231017-06h53-watsonx-ci-kpouget.psap.aws.rhperfscale.org:443
HOST_2=flan-t5-small-cpu-2-predictor-watsonx-e2e-1.apps.20231017-06h53-watsonx-ci-kpouget.psap.aws.rhperfscale.org:443

# Ask the server at "$1" to describe the request message
get_proto() {
    grpcurl -insecure "$1" describe caikit.runtime.Nlp.TextGenerationTaskRequest
}

# Save one description per endpoint
get_proto "$HOST_1" > 1.log
get_proto "$HOST_2" > 2.log

echo diff
diff --side-by-side <(get_proto "$HOST_1") <(get_proto "$HOST_2")

[Screenshot: side-by-side diff, 1.log | 2.log]

which highlights that the protos returned by two endpoints running the same image are different.

Image is quay.io/opendatahub/caikit-tgis-serving@sha256:794adc22d52cb3ac4b5aadfb286e8431cca829acdc4909719329cf8c4fabb4ec

Platform

Caikit packages in this image have this version:

caikit                  0.19.3
caikit-nlp              0.0.1                   /caikit/src/caikit-nlp
caikit-tgis-backend     0.1.18

Python 3.9

Sample Code

See above.

The invalid launch happens ~50% of the time, from what I observed.

Expected behavior

The prototypes are always the same.

Observed behavior

The prototypes do not have the same field ordering or field numbering.
No error is printed anywhere.

Additional info

The location of this block (+ the field numbering) is the key difference between the "different versions" of the protos:

  oneof _preserve_input_text {
    bool preserve_input_text = 15;
  }
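
Since the field numbering is what differs, it may help to compare only the name/number pairs rather than the full text. The following is a minimal sketch, assuming the grpcurl output declares each field on its own line as "type name = N;" (as in the block above). field_numbers is a hypothetical helper name, not part of grpcurl or Caikit, and it would not handle declarations that carry trailing options.

```shell
#!/usr/bin/env bash
# Sketch: reduce a proto description to sorted "field_name field_number" pairs,
# so a diff shows renumbered fields regardless of where they appear in the text.
# field_numbers is a hypothetical helper, not part of grpcurl or Caikit.
field_numbers() {
    # Match declarations such as "bool preserve_input_text = 15;".
    # In that layout the last word is "15;" and the name sits two words before it.
    awk '/= *[0-9]+;/ {
        number = $NF
        sub(/;/, "", number)
        print $(NF - 2), number
    }' | sort
}

# Possible usage with the reproducer's get_proto:
#   diff <(get_proto "$HOST_1" | field_numbers) <(get_proto "$HOST_2" | field_numbers)
```

With this, a field that only moved within the message produces no diff, while a field whose number changed shows up as a changed line.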
