Different Caikit server instances return different protos #237

## Describe the bug

As part of my automated testing, I observed that some requests were returning very quickly.

Investigation showed that the server was seeing `min_new_tokens: 0`, which explains the quick return:
* `max_new_tokens: 25, min_new_tokens: 0` 
* `Request generated 5 tokens before EosToken`

even though the query was sent with `max_new_tokens == min_new_tokens`.

Further investigation led to this reproducer:
```
HOST_1=flan-t5-small-cpu-1-predictor-watsonx-e2e-0.apps.20231017-06h53-watsonx-ci-kpouget.psap.aws.rhperfscale.org:443
HOST_2=flan-t5-small-cpu-2-predictor-watsonx-e2e-1.apps.20231017-06h53-watsonx-ci-kpouget.psap.aws.rhperfscale.org:443

# Print the TextGenerationTaskRequest description served by the given host.
get_proto() {
    grpcurl -insecure "$1" describe caikit.runtime.Nlp.TextGenerationTaskRequest
}

# Save each description to a file for later reference.
get_proto "$HOST_1" > 1
get_proto "$HOST_2" > 2

echo diff
diff --side-by-side <(get_proto "$HOST_1") <(get_proto "$HOST_2")
```
[1.log](https://github.com/caikit/caikit-nlp/files/12932321/1.log) | [2.log](https://github.com/caikit/caikit-nlp/files/12932336/2.log)

![image](https://github.com/caikit/caikit-nlp/assets/7559202/92c25b6c-8fad-4434-8055-6e3a3f25546f)

which highlights that the protos returned by two endpoints *running the same image* are different.
 
Image is `quay.io/opendatahub/caikit-tgis-serving@sha256:794adc22d52cb3ac4b5aadfb286e8431cca829acdc4909719329cf8c4fabb4ec`
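
To check whether the two descriptions differ only in field order and numbering, the `describe` output can be normalized before diffing. This is a minimal sketch: `normalize_proto` is a hypothetical helper (not part of grpcurl or Caikit), and it assumes every field is printed on its own line in the form `type name = N;`.

```shell
# Hypothetical helper: drop field numbers and indentation, then sort the lines,
# so that descriptions differing only in order/numbering become identical.
normalize_proto() {
    sed -E -e 's/ = [0-9]+;/;/' -e 's/^[[:space:]]+//' | LC_ALL=C sort
}

# Usage (bash), with get_proto and HOST_* from the reproducer above:
#   diff <(get_proto "$HOST_1" | normalize_proto) \
#        <(get_proto "$HOST_2" | normalize_proto)
```

An empty diff after normalization would suggest that both instances expose the same set of fields and only the layout changes.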



## Platform

The Caikit packages in this image have these versions:
```
caikit                  0.19.3
caikit-nlp              0.0.1                   /caikit/src/caikit-nlp
caikit-tgis-backend     0.1.18
```

Python 3.9

## Sample Code

See above.

The invalid launch happens ~50% of the time, from what I observed.

## Expected behavior

The proto definitions are always the same.

## Observed behavior

The proto definitions do not have the same field ordering.
No error is printed anywhere.

## Additional info

The location of this block (+ the field numbering) is the key difference between the "different versions" of the protos:
```
  oneof _preserve_input_text {
    bool preserve_input_text = 15;
  }
```
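
To see which field numbers move between the two instances, the name/number pairs can be extracted from each description and compared side by side. Again a sketch under assumptions: `field_numbers` is a hypothetical helper, and it relies on fields being printed as `type name = N;` in the `describe` output.

```shell
# Hypothetical helper: print "<field name> <field number>" pairs, sorted by name.
field_numbers() {
    sed -nE 's/^.*[[:space:]]([A-Za-z_][A-Za-z0-9_]*) = ([0-9]+);.*$/\1 \2/p' \
        | LC_ALL=C sort -k1,1
}

# Usage (bash), with get_proto and HOST_* from the reproducer above.
# Prints "<name> <number on host 1> <number on host 2>" for each mismatch:
#   LC_ALL=C join <(get_proto "$HOST_1" | field_numbers) \
#                 <(get_proto "$HOST_2" | field_numbers) | awk '$2 != $3'
```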



