Commit bb7afe7
committed
Caikit embeddings examples + local run documentation
Signed-off-by: Flavia Beo <[email protected]>
1 parent c12cb82 commit bb7afe7

File tree

5 files changed: +517 −0 lines changed


examples/embeddings/README.md

Lines changed: 261 additions & 0 deletions
@@ -0,0 +1,261 @@
# Set up and run the caikit embeddings server locally

#### Setting up a virtual environment using Python venv

For [venv](https://docs.python.org/3/library/venv.html), make sure you are in an activated `venv` when running `python` in the example commands that follow. Use `deactivate` if you want to exit the `venv`.

```shell
python3 -m venv venv
source venv/bin/activate
```

### Models

To create a model configuration and artifacts, the best practice is to run the module's bootstrap() and save() methods. This will:

* Load the model by name (from the Hugging Face Hub or a repository) or from a local directory. The model is loaded using the sentence-transformers library.
* Save a config.yml which:
  * Ties the model to the module (with a module_id GUID)
  * Sets the artifacts_path to the default "artifacts" subdirectory
* Save the model in the artifacts subdirectory

This can be done by running the `bootstrap_model.py` script in your virtual environment.

> For the reranker service, the supported models are bi-encoders; they are the same models used by the other embeddings tasks.

```shell
source venv/bin/activate
./demo/server/bootstrap_model.py -m <MODEL_NAME_OR_PATH> -o <OUTPUT_DIR>
```

To avoid overwriting your files, save() will return an error if the output directory already exists. You may want to use a temporary name. After success, move the output directory to a `<model-id>` directory under your local models directory.
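The overwrite protection described above can be illustrated with a short sketch. The helper name is hypothetical; the real save() lives in the caikit module, and this only mirrors the documented behavior:

```python
import os


def save_model(output_dir: str) -> None:
    # Mirrors the documented save() behavior: refuse to overwrite
    # an existing output directory.
    if os.path.exists(output_dir):
        raise FileExistsError(f"Output directory already exists: {output_dir}")
    # The default layout puts model artifacts in an "artifacts" subdirectory.
    os.makedirs(os.path.join(output_dir, "artifacts"))
    # config.yml and the model artifacts would be written here.
```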

### Environment variables

These variables control the environment in which the embeddings will be run:

```bash
# use IPEX optimization
IPEX_OPTIMIZE: 'true'

# use "xpu" for IPEX on GPU instead of IPEX on CPU
USE_XPU: 'false'

# IPEX performs best with autocast using bfloat16
BFLOAT16: '1'

# use MPS on Apple silicon
USE_MPS: 'false'

# use PyTorch compile
PT2_COMPILE: 'false'
```

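These flags are plain strings, so any code consuming them has to interpret values like `'true'` and `'1'`. A generic sketch of that parsing (illustrative only, not the runtime's actual implementation):

```python
import os


def env_flag(name: str, default: str = "false") -> bool:
    # Treat 'true'/'1'/'yes' (case-insensitive) as enabled.
    return os.getenv(name, default).strip().lower() in ("true", "1", "yes")


ipex_optimize = env_flag("IPEX_OPTIMIZE")
use_xpu = env_flag("USE_XPU")
```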
### Starting the Caikit Runtime

Run caikit-runtime configured to use the caikit-nlp library. Set the following environment variables:

```bash
export RUNTIME_HTTP_ENABLED=true
export RUNTIME_LOCAL_MODELS_DIR=/models
export RUNTIME_LAZY_LOAD_LOCAL_MODELS=true
```

In one terminal, start the runtime server:

```bash
source venv/bin/activate
pip install -r requirements.txt
caikit-runtime
```

### Embedding retrieval example Python client

In another terminal, run the example client code to retrieve embeddings.

```shell
source venv/bin/activate
cd demo/client
MODEL=<model-id> python embeddings.py
```

The client code calls the model and queries for embeddings using 2 example sentences.

You should see output similar to the following:

```ShellSession
$ python embeddings.py
INPUT TEXTS: ['test first sentence', 'another test sentence']
OUTPUT: {
{
    "results": [
        [
            -0.17895537614822388,
            0.03200146183371544,
            -0.030327674001455307,
            ...
        ],
        [
            -0.17895537614822388,
            0.03200146183371544,
            -0.030327674001455307,
            ...
        ]
    ],
    "producerId": {
        "name": "EmbeddingModule",
        "version": "0.0.1"
    },
    "inputTokenCount": "9"
}
}
LENGTH: 2 x 384
```

### Sentence similarity example Python client

In another terminal, run the client code to infer sentence similarity.

```shell
source venv/bin/activate
cd demo/client
MODEL=<model-id> python sentence_similarity.py
```

The client code calls the model and queries sentence similarity using 1 source sentence and 2 other sentences (hardcoded in sentence_similarity.py). The result is the cosine similarity score obtained by comparing the source sentence with each of the other sentences.

You should see output similar to the following:

```ShellSession
$ python sentence_similarity.py
SOURCE SENTENCE: first sentence
SENTENCES: ['test first sentence', 'another test sentence']
OUTPUT: {
    "result": {
        "scores": [
            1.0000001192092896
        ]
    },
    "producerId": {
        "name": "EmbeddingModule",
        "version": "0.0.1"
    },
    "inputTokenCount": "9"
}
```

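The score returned here is the cosine similarity between two embedding vectors. As a plain-Python illustration of the metric itself (the server computes it from the model's embeddings; this sketch only shows the formula):

```python
import math


def cosine_similarity(a, b):
    # cos(theta) = (a . b) / (||a|| * ||b||)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```

A vector compared with itself scores 1.0 up to float rounding, which matches the near-1.0 score in the sample output above.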
### Reranker example Python client

In another terminal, run the client code to execute the reranker task using both gRPC and REST.

```shell
source venv/bin/activate
cd demo/client
MODEL=<model-id> python reranker.py
```

You should see output similar to the following:

```ShellSession
$ python reranker.py
======================
TOP N: 3
QUERIES: ['first sentence', 'any sentence']
DOCUMENTS: [{'text': 'first sentence', 'title': 'first title'}, {'_text': 'another sentence', 'more': 'more attributes here'}, {'text': 'a doc with a nested metadata', 'meta': {'foo': 'bar', 'i': 999, 'f': 12.34}}]
======================
RESPONSE from gRPC:
===
QUERY: first sentence
score: 0.9999997019767761 index: 0 text: first sentence
score: 0.7350112199783325 index: 1 text: another sentence
score: 0.10398174077272415 index: 2 text: a doc with a nested metadata
===
QUERY: any sentence
score: 0.6631797552108765 index: 0 text: first sentence
score: 0.6505964398384094 index: 1 text: another sentence
score: 0.11903437972068787 index: 2 text: a doc with a nested metadata
===================
RESPONSE from HTTP:
{
    "results": [
        {
            "query": "first sentence",
            "scores": [
                {
                    "document": {
                        "text": "first sentence",
                        "title": "first title"
                    },
                    "index": 0,
                    "score": 0.9999997019767761,
                    "text": "first sentence"
                },
                {
                    "document": {
                        "_text": "another sentence",
                        "more": "more attributes here"
                    },
                    "index": 1,
                    "score": 0.7350112199783325,
                    "text": "another sentence"
                },
                {
                    "document": {
                        "text": "a doc with a nested metadata",
                        "meta": {
                            "foo": "bar",
                            "i": 999,
                            "f": 12.34
                        }
                    },
                    "index": 2,
                    "score": 0.10398174077272415,
                    "text": "a doc with a nested metadata"
                }
            ]
        },
        {
            "query": "any sentence",
            "scores": [
                {
                    "document": {
                        "text": "first sentence",
                        "title": "first title"
                    },
                    "index": 0,
                    "score": 0.6631797552108765,
                    "text": "first sentence"
                },
                {
                    "document": {
                        "_text": "another sentence",
                        "more": "more attributes here"
                    },
                    "index": 1,
                    "score": 0.6505964398384094,
                    "text": "another sentence"
                },
                {
                    "document": {
                        "text": "a doc with a nested metadata",
                        "meta": {
                            "foo": "bar",
                            "i": 999,
                            "f": 12.34
                        }
                    },
                    "index": 2,
                    "score": 0.11903437972068787,
                    "text": "a doc with a nested metadata"
                }
            ]
        }
    ],
    "producerId": {
        "name": "EmbeddingModule",
        "version": "0.0.1"
    },
    "inputTokenCount": "9"
}
```
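Conceptually, the reranker scores each query against every document, sorts the documents by score in descending order, and keeps the top n. A toy sketch of that ranking step (the real service derives scores from the embedding model; the scorer here is a stand-in):

```python
def rerank(query_score, documents, top_n):
    # query_score: a callable mapping a document to a relevance score
    # (standing in for the model's query-document scoring).
    scored = [
        {"index": i, "score": query_score(doc), "document": doc}
        for i, doc in enumerate(documents)
    ]
    # Highest-scoring documents first, truncated to top n.
    scored.sort(key=lambda item: item["score"], reverse=True)
    return scored[:top_n]
```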

examples/embeddings/embeddings.py

Lines changed: 67 additions & 0 deletions
@@ -0,0 +1,67 @@
# Copyright The Caikit Authors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Standard
import os
import sys
from os import path

# Third Party
import grpc

# Local
import caikit
from caikit.runtime.service_factory import ServicePackageFactory

# Add the runtime/library to the path
sys.path.append(path.abspath(path.join(path.dirname(__file__), "../../")))

# Load configuration for Caikit runtime
CONFIG_PATH = path.realpath(path.join(path.dirname(__file__), "config.yml"))
caikit.configure(CONFIG_PATH)

# NOTE: The model id needs to be a path to a folder,
# given relative to the models directory.
MODEL_ID = os.getenv("MODEL", "mini")

inference_service = ServicePackageFactory().get_service_package(
    ServicePackageFactory.ServiceType.INFERENCE,
)

port = os.getenv("CAIKIT_EMBEDDINGS_PORT", 8085)
host = os.getenv("CAIKIT_EMBEDDINGS_HOST", "localhost")
channel = grpc.insecure_channel(f"{host}:{port}")
client_stub = inference_service.stub_class(channel)

# Create request object
texts = ["test first sentence", "another test sentence"]
request = inference_service.messages.EmbeddingTasksRequest(texts=texts)

# Fetch predictions from server (infer)
response = client_stub.EmbeddingTasksPredict(
    request, metadata=[("mm-model-id", MODEL_ID)]
)

# Print response
print("INPUT TEXTS: ", texts)
print("RESULTS: [")
for d in response.results.vectors:
    woo = d.WhichOneof("data")  # which one of data_<float_type>s did we get?
    print(getattr(d, woo).values)
print("]")
print(
    "LENGTH: ",
    len(response.results.vectors),
    " x ",
    len(getattr(response.results.vectors[0], woo).values),
)
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
caikit[runtime-grpc,runtime-http]
caikit-nlp

0 commit comments
