Description
What happened?
fastembed.sparse.sparse_text_embedding.py has this:
for EMBEDDING_MODEL_TYPE in self.EMBEDDINGS_REGISTRY:
    supported_models = EMBEDDING_MODEL_TYPE._list_supported_models()
    if any(model_name.lower() == model.model.lower() for model in supported_models):
        self.model = EMBEDDING_MODEL_TYPE(
            model_name,
            cache_dir,
            threads=threads,
            providers=providers,
            cuda=cuda,
            device_ids=device_ids,
            lazy_load=lazy_load,
            **kwargs,
        )
        return
which eventually gets here in BM25's constructor:
self._model_dir = self.download_model(
    model_description,
    self.cache_dir,
    local_files_only=self._local_files_only,
    specific_model_path=self._specific_model_path,
)
which eventually gets here:
return Path(
    cls.download_files_from_huggingface(
        hf_source,
        cache_dir=cache_dir,
        extra_patterns=extra_patterns,
        **kwargs,
    )
)
I am passing cache_dir=/tmp and local_files_only=False. The cached files already exist from a previous run. Unfortunately, the code looks like this:
if local_files_only:
    disable_progress_bars()
    if metadata_file.exists():
        metadata = json.loads(metadata_file.read_text())
        verified = _verify_files_from_metadata(snapshot_dir, metadata, repo_files=[])
        if not verified:
            logger.warning(
                "Local file sizes do not match the metadata."
            )  # do not raise, still make an attempt to load the model
    else:
        logger.warning(
            "Metadata file not found. Proceeding without checking local files."
        )  # if users have downloaded models from hf manually, or they're updating from previous versions of
        # fastembed
    result = snapshot_download(
        repo_id=hf_source_repo,
        allow_patterns=allow_patterns,
        cache_dir=cache_dir,
        local_files_only=local_files_only,
        **kwargs,
    )
    return result

repo_revision = model_info(hf_source_repo).sha
repo_tree = list(list_repo_tree(hf_source_repo, revision=repo_revision, repo_type="model"))
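With local_files_only=False the whole if block is skipped, so execution goes straight to model_info(...) and list_repo_tree(...), which need network access. It ends up doing the full dance of downloading files even though they are already cached.
I don't want local_files_only; I am expecting "local files first" behaviour. After all, this is what a cache means: if the file is in the cache, use it.
"local files only" should only determine what happens on a cache miss: fail/return right away, or keep going and download from HF.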
What is the expected behaviour?
Load from local cached files if present.
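To make the expected behaviour concrete, here is a minimal sketch of the "local files first" control flow. It is not fastembed's implementation: download_local_first, the download callable, and miss_errors are hypothetical names for illustration, and which exception signals a cache miss is an assumption that should be checked against the installed huggingface_hub version.

```python
from typing import Callable, Tuple, Type


def download_local_first(
    download: Callable[..., str],
    repo_id: str,
    cache_dir: str,
    local_files_only: bool = False,
    # Assumption: a cache miss surfaces as one of these exception types.
    miss_errors: Tuple[Type[BaseException], ...] = (FileNotFoundError,),
) -> str:
    """Return a snapshot path, consulting the local cache before the network."""
    try:
        # The first attempt never touches the network.
        return download(repo_id=repo_id, cache_dir=cache_dir, local_files_only=True)
    except miss_errors:
        if local_files_only:
            # The caller forbade network access, so a cache miss is final.
            raise
    # Cache miss and network allowed: fall back to a normal download.
    return download(repo_id=repo_id, cache_dir=cache_dir, local_files_only=False)
```

Here download could be huggingface_hub's snapshot_download, which accepts the repo_id, cache_dir and local_files_only keyword arguments shown in the code above. The point of the design is that local_files_only only changes what happens after a cache miss, not whether the cache is consulted.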
A minimal reproducible example
I have this in my Dockerfile, so I was expecting it to load the file from the local cache:
RUN echo "from fastembed.sparse.sparse_text_embedding import SparseTextEmbedding \n" > ${HF_HOME}/docker.py && \
    echo "embeddings = SparseTextEmbedding('Qdrant/bm25') \n" >> ${HF_HOME}/docker.py && \
    echo "embedded_junk_vector = embeddings.embed('what is the arity of this vector?') \n" >> ${HF_HOME}/docker.py && \
    echo "print('Model cached successfully at:', embeddings.model_cache_dir) \n" >> ${HF_HOME}/docker.py && \
    python3 ${HF_HOME}/docker.py
So, in my code later on I was expecting it to load from local files on a subsequent run.
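To confirm that the build step really populated the cache before blaming the loader, a small check like the following can be run at container start. It is only a sketch: find_cached_snapshot is a hypothetical helper, and it assumes the standard Hugging Face hub cache layout (models--ORG--NAME/snapshots/REVISION under the cache directory), which may not hold if the files were placed there some other way.

```python
from pathlib import Path
from typing import Optional


def find_cached_snapshot(cache_dir: str, repo_id: str) -> Optional[Path]:
    """Return a non-empty snapshot directory for repo_id, or None if absent."""
    # Assumed layout: <cache_dir>/models--<org>--<name>/snapshots/<revision>/
    repo_dir = Path(cache_dir) / ("models--" + repo_id.replace("/", "--"))
    snapshots = repo_dir / "snapshots"
    if not snapshots.is_dir():
        return None
    # Keep only revision folders that actually contain files or links.
    candidates = sorted(
        p for p in snapshots.iterdir() if p.is_dir() and any(p.iterdir())
    )
    return candidates[0] if candidates else None
```

For example, find_cached_snapshot("/tmp", "Qdrant/bm25") returns a path if the earlier run left a snapshot behind and None otherwise.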
What Python version are you on? e.g. python --version
Python 3.12.11
FastEmbed version
fastembed==0.7.3
What os are you seeing the problem on?
MacOS
Relevant stack traces and/or logs
If there are any connectivity issues with HF at runtime and I use BM25, I get this:
  File "/usr/local/lib/python3.11/site-packages/fastembed/sparse/sparse_text_embedding.py", line 77, in __init__
    self.model = EMBEDDING_MODEL_TYPE(
                 ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/fastembed/sparse/bm25.py", line 119, in __init__
    self._model_dir = self.download_model(
                      ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/fastembed/common/model_management.py", line 458, in download_model
    raise ValueError(f"Could not load model {model.model} from any source.")
ValueError: Could not load model Qdrant/bm25 from any source.