## Summary

This PR fixes a bug in the HuggingFace backend of `init_chat_model`, where all kwargs were passed directly to `HuggingFacePipeline.from_model_id()`.

This caused validation errors such as:

```text
ValidationError: 1 validation error for ChatHuggingFace
llm
  Field required [type=missing]
```

## Root Cause

The previous implementation passed parameters such as `temperature`, `max_tokens`, and `timeout` to `HuggingFacePipeline.from_model_id()`, which does not accept them. As a result, Pydantic validation inside `ChatHuggingFace` failed.
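For context, the reported message is standard Pydantic behavior: `Field required [type=missing]` appears whenever a required field is never populated. A minimal illustrative sketch (not the library's actual model):

```python
from pydantic import BaseModel, ValidationError


class ChatModelSketch(BaseModel):
    """Stand-in with one required field, analogous to ChatHuggingFace.llm."""

    llm: object


try:
    ChatModelSketch()  # no llm supplied
except ValidationError as err:
    print(err)  # 1 validation error for ChatModelSketch ... llm Field required
```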

## What this PR fixes

1. **Adds a default task for decoder-only models.** If no `task` is provided, we set:

   ```python
   kwargs["task"] = "text-generation"
   ```

2. **Filters arguments to only those supported by `HuggingFacePipeline`.** The allowed params are:

   ```python
   {"task", "model_kwargs", "device"}
   ```

Other parameters (`temperature`, `max_tokens`, etc.) are not passed to the pipeline and are handled on the LangChain side; the sketch below illustrates the split.
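To make the split concrete, here is a standalone sketch of the filtering idiom (plain Python with hypothetical input values; the `other_kwargs` side is illustrative and not part of this PR's diff):

```python
# Hypothetical input: what a user might pass through init_chat_model.
kwargs = {"task": "text-generation", "device": 0, "temperature": 0, "max_tokens": 1024}

# The allowlist this PR uses for HuggingFacePipeline.from_model_id.
pipeline_allowed_params = {"task", "model_kwargs", "device"}

# Split: pipeline-safe parameters vs. everything else (LangChain-side settings).
pipeline_kwargs = {k: v for k, v in kwargs.items() if k in pipeline_allowed_params}
other_kwargs = {k: v for k, v in kwargs.items() if k not in pipeline_allowed_params}

print(pipeline_kwargs)  # {'task': 'text-generation', 'device': 0}
print(other_kwargs)     # {'temperature': 0, 'max_tokens': 1024}
```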

## Before

```python
if model_provider == "huggingface":
    _check_pkg("langchain_huggingface")
    from langchain_huggingface import ChatHuggingFace, HuggingFacePipeline

    llm = HuggingFacePipeline.from_model_id(model_id=model, **kwargs)
    return ChatHuggingFace(llm=llm)
```

This forwarded every kwarg to the pipeline, leading to the `ValidationError` shown above.

## After (Fixed)

```python
if model_provider == "huggingface":
    _check_pkg("langchain_huggingface")
    from langchain_huggingface import ChatHuggingFace, HuggingFacePipeline

    # Add default task for decoder-only models
    if "task" not in kwargs:
        kwargs["task"] = "text-generation"

    # Filter only parameters allowed by HuggingFacePipeline
    pipeline_allowed_params = {"task", "model_kwargs", "device"}
    pipeline_kwargs = {k: v for k, v in kwargs.items() if k in pipeline_allowed_params}

    llm = HuggingFacePipeline.from_model_id(
        model_id=model,
        **pipeline_kwargs,
    )

    return ChatHuggingFace(llm=llm)
```
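As a possible follow-up (a sketch, not part of this PR): the allowlist could be derived rather than hard-coded, assuming `from_model_id` declares its accepted parameters explicitly instead of swallowing them via `**kwargs`:

```python
import inspect


def declared_params(func) -> set[str]:
    """Names of the explicitly declared parameters of a callable,
    excluding *args/**kwargs catch-alls and self/cls."""
    return {
        name
        for name, p in inspect.signature(func).parameters.items()
        if p.kind in (p.POSITIONAL_OR_KEYWORD, p.KEYWORD_ONLY)
        and name not in ("self", "cls")
    }


# Hypothetical usage against the real classmethod:
# pipeline_allowed_params = declared_params(HuggingFacePipeline.from_model_id)
```

The hard-coded set is the more conservative choice here, since signature inspection would silently widen the allowlist if the upstream signature ever changes.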

## Example Code That Now Works

```python
from langchain.chat_models import init_chat_model

llm = init_chat_model(
    model="microsoft/Phi-3-mini-4k-instruct",
    model_provider="huggingface",
    temperature=0,
    max_tokens=1024,
    timeout=None,
    max_retries=2,
)
```
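Once initialized, the returned chat model should behave like any other LangChain chat model. A quick smoke test (note: this downloads the model weights on first use, which can take several GB):

```python
# Assumes `llm` from the snippet above and sufficient disk/RAM for the model.
response = llm.invoke("What is the capital of France?")
print(response.content)
```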

## Test Case

```python
from langchain.chat_models import init_chat_model


def test_hf_init_chat_model_initializes_successfully():
    llm = init_chat_model(
        model="microsoft/Phi-3-mini-4k-instruct",
        model_provider="huggingface",
        temperature=0,
        max_tokens=1024,
        timeout=None,
        max_retries=2,
    )

    assert llm is not None
```
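The integration test above requires downloading the model. A lighter companion test could exercise just the filtering idiom from this PR, without touching the HuggingFace integration at all (a sketch, duplicating the allowlist for illustration):

```python
def test_pipeline_kwargs_are_filtered():
    kwargs = {
        "task": "text-generation",
        "device": 0,
        "temperature": 0,
        "max_tokens": 1024,
        "timeout": None,
        "max_retries": 2,
    }
    pipeline_allowed_params = {"task", "model_kwargs", "device"}
    pipeline_kwargs = {k: v for k, v in kwargs.items() if k in pipeline_allowed_params}

    # Only pipeline-safe parameters survive the filter.
    assert pipeline_kwargs == {"task": "text-generation", "device": 0}
```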

## Result

The HuggingFace initialization now correctly separates pipeline parameters from LangChain LLM settings, preventing validation errors.
