* Add basic search to overview
* Add info on input documents DataFrame
* Add info on factories to docs
* Add consumption warning and switch to "christmas" for folder name
* Add logger to factories list
* Add litellm docs. (#2058)
* Fix version for input docs
* Spelling
---------
Co-authored-by: Derek Worthen <[email protected]>
docs/config/models.md (32 additions & 3 deletions)
@@ -8,9 +8,38 @@ GraphRAG was built and tested using OpenAI models, so this is the default model
GraphRAG also utilizes a language model wrapper library used by several projects within our team, called fnllm. fnllm provides two important functions for GraphRAG: rate limiting configuration to help us maximize throughput for large indexing jobs, and robust caching of API calls to minimize consumption on repeated indexes for testing, experimentation, or incremental ingest. fnllm uses the OpenAI Python SDK under the covers, so OpenAI-compliant endpoints are a base requirement out-of-the-box.
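The throttling behavior is tuned per model in your `settings.yaml`. Below is a minimal sketch (values are illustrative; see [Detailed Configuration](yaml.md) for the full list of throttling options):

```yaml
models:
  default_chat_model:
    # rate limiting knobs used to maximize throughput on large indexing jobs
    requests_per_minute: 500
    tokens_per_minute: 150000
    concurrent_requests: 25
```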
Starting with version 2.6.0, GraphRAG supports using [LiteLLM](https://docs.litellm.ai/) instead of fnllm for calling language models. LiteLLM provides support for 100+ models; note, however, that the model you choose must support returning [structured outputs](https://openai.com/index/introducing-structured-outputs-in-the-api/) that adhere to a [JSON schema](https://docs.litellm.ai/docs/completion/json_mode).
Example using LiteLLM as the language model provider for GraphRAG:
```yaml
models:
  default_chat_model:
    type: chat
    auth_type: api_key
    api_key: ${GEMINI_API_KEY}
    model_provider: gemini
    model: gemini-2.5-flash-lite
  default_embedding_model:
    type: embedding
    auth_type: api_key
    api_key: ${GEMINI_API_KEY}
    model_provider: gemini
    model: gemini-embedding-001
```
To use LiteLLM, one must:
- Set `type` to either `chat` or `embedding`.
- Provide a `model_provider`, e.g., `openai`, `azure`, `gemini`, etc.
- Set the `model` to one supported by the `model_provider`'s API.
- Provide a `deployment_name` if using `azure` as the `model_provider`.
See [Detailed Configuration](yaml.md) for more details on configuration. [View LiteLLM basic usage](https://docs.litellm.ai/docs/#basic-usage) for details on how models are called (the `model_provider` is the portion before the `/`, while the `model` is the portion after it).
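For illustration only, here is roughly how LiteLLM itself composes those two values into a single model string (a minimal sketch of LiteLLM's basic usage outside of GraphRAG; it assumes a `GEMINI_API_KEY` environment variable is set):

```python
import litellm

# LiteLLM model strings take the form "<model_provider>/<model>".
response = litellm.completion(
    model="gemini/gemini-2.5-flash-lite",
    messages=[{"role": "user", "content": "Say hello."}],
)
print(response.choices[0].message.content)
```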
## Model Selection Considerations
GraphRAG has been most thoroughly tested with the gpt-4 series of models from OpenAI, including gpt-4, gpt-4-turbo, gpt-4o, and gpt-4o-mini. Our [arXiv paper](https://arxiv.org/abs/2404.16130), for example, performed quality evaluation using gpt-4-turbo. As stated above, non-OpenAI models are supported from GraphRAG 2.6.0 onwards through the use of LiteLLM, but the gpt-4 series from OpenAI remains the most tested and supported set of models for GraphRAG.
Versions of GraphRAG before 2.2.0 made extensive use of `max_tokens` and `logit_bias` to control generated response length or content. The introduction of the o-series of models added new, non-compatible parameters because these models include a reasoning component that has different consumption patterns and response generation attributes than non-reasoning models. GraphRAG 2.2.0 now supports these models, but there are important differences that need to be understood before you switch.
@@ -58,11 +87,11 @@ Another option would be to avoid using a language model at all for the graph ext
## Using Non-OpenAI Models
As shown above, non-OpenAI models can be used via LiteLLM starting with GraphRAG 2.6.0, but some users may still wish to use models that LiteLLM does not support. There are two approaches you can use to connect to unsupported models:
### Proxy APIs
Many users have used platforms such as [ollama](https://ollama.com/) and [LiteLLM Proxy Server](https://docs.litellm.ai/docs/simple_proxy) to proxy the underlying model HTTP calls to a different model provider. This seems to work reasonably well, but we frequently see issues with malformed responses (especially JSON), so if you do this please understand that your model needs to reliably return the specific response formats that GraphRAG expects. If you're having trouble with a model, you may need to try prompting to coax the format, or intercepting the response within your proxy to try and handle malformed responses.
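As a rough sketch, pointing GraphRAG at such a proxy usually only requires overriding the API base URL in your model configuration (the endpoint and model name below are illustrative; ollama's OpenAI-compatible endpoint is typically served at `http://localhost:11434/v1`):

```yaml
models:
  default_chat_model:
    type: openai_chat
    api_base: http://localhost:11434/v1  # the proxy's OpenAI-compatible endpoint
    api_key: not-used                    # placeholder; many proxies ignore the key
    model: llama3.1
```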
docs/config/yaml.md (2 additions & 1 deletion)
@@ -41,7 +41,8 @@ models:
- `api_key` **str** - The OpenAI API key to use.
- `auth_type` **api_key|azure_managed_identity** - Indicate how you want to authenticate requests.
- `type` **chat**|**embedding**|**openai_chat|azure_openai_chat|openai_embedding|azure_openai_embedding|mock_chat|mock_embeddings** - The type of LLM to use.
- `model_provider` **str|None** - The model provider to use, e.g., openai, azure, anthropic, etc. Required when `type == chat|embedding`. When `type == chat|embedding`, [LiteLLM](https://docs.litellm.ai/) is used under the hood, which supports calling 100+ models. [View LiteLLM basic usage](https://docs.litellm.ai/docs/#basic-usage) for details on how models are called (the `model_provider` is the portion before the `/`, while the `model` is the portion after it). [View Language Model Selection](models.md) for more details and examples on using LiteLLM.
- `model` **str** - The model name.
- `encoding_model` **str** - The text encoding model to use. Default is to use the encoding model aligned with the language model (i.e., it is retrieved from tiktoken if unset).
docs/get_started.md (11 additions & 9 deletions)
@@ -1,5 +1,7 @@
# Getting Started
⚠️ GraphRAG can consume a lot of LLM resources! We strongly recommend starting with the tutorial dataset here until you understand how the system works, and consider experimenting with fast/inexpensive models first before committing to a big indexing job.
To initialize your workspace, first run the `graphrag init` command.
Since we have already configured a directory named `./christmas` in the previous step, run the following command:
```sh
graphrag init --root ./christmas
```
This will create two files: `.env` and `settings.yaml` in the `./christmas` directory.
- `.env` contains the environment variables required to run the GraphRAG pipeline. If you inspect the file, you'll see a single environment variable defined,
`GRAPHRAG_API_KEY=<API_KEY>`. Replace `<API_KEY>` with your own OpenAI or Azure API key.
@@ -78,13 +80,13 @@ You will also need to login with [az login](https://learn.microsoft.com/en-us/cl
Finally, we'll run the pipeline!
```sh
graphrag index --root ./christmas
```
This process will take some time to run, depending on the size of your input data, the model you're using, and the text chunk size (all of which can be configured in your `settings.yaml` file).
Once the pipeline is complete, you should see a new folder called `./christmas/output` with a series of parquet files.
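If you want to inspect the results, the output tables are plain parquet files that load directly into pandas (the filenames below are illustrative; list the folder to see exactly what your run produced):

```python
import pandas as pd

# Illustrative filenames; check ./christmas/output for the actual set of tables.
entities = pd.read_parquet("./christmas/output/entities.parquet")
relationships = pd.read_parquet("./christmas/output/relationships.parquet")
print(entities.head())
```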
# Using the Query Engine
@@ -94,7 +96,7 @@ Here is an example using Global search to ask a high-level question:
```sh
graphrag query \
--root ./christmas \
--method global \
--query "What are the top themes in this story?"
```
@@ -103,7 +105,7 @@ Here is an example using Local search to ask a more specific question about a pa
```sh
graphrag query \
--root ./christmas \
--method local \
--query "Who is Scrooge and what are his main relationships?"
```
docs/index.md (1 addition & 0 deletions)
@@ -47,6 +47,7 @@ At query time, these structures are used to provide materials for the LLM contex
- [_Global Search_](query/global_search.md) for reasoning about holistic questions about the corpus by leveraging the community summaries.
- [_Local Search_](query/local_search.md) for reasoning about specific entities by fanning-out to their neighbors and associated concepts.
- [_DRIFT Search_](query/drift_search.md) for reasoning about specific entities by fanning-out to their neighbors and associated concepts, but with the added context of community information.
- _Basic Search_ for those times when your query is best answered by baseline RAG (standard top _k_ vector search).
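For example, basic search can be run from the CLI in the same way as the other methods (this assumes a workspace at `./christmas` as used in the Getting Started guide, with an illustrative query):

```sh
graphrag query \
--root ./christmas \
--method basic \
--query "What does Scrooge say about Christmas?"
```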
docs/index/architecture.md (17 additions & 0 deletions)
@@ -32,3 +32,20 @@ The GraphRAG library was designed with LLM interactions in mind, and a common se
Because of these potential error cases, we've added a cache layer around LLM interactions.
When completion requests are made using the same input set (prompt and tuning parameters), we return a cached result if one exists.
This allows our indexer to be more resilient to network issues, to act idempotently, and to provide a more efficient end-user experience.
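The cache backend and location are configurable in `settings.yaml`; a minimal sketch (the file backend shown here is the default, and blob and CosmosDB backends are also built in):

```yaml
cache:
  type: file       # storage backend for cached LLM responses
  base_dir: cache  # resolved relative to the project root
```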
### Providers & Factories
Several subsystems within GraphRAG use a factory pattern to register and retrieve provider implementations. This allows deep customization to support models, storage, and other components you may use that aren't built directly into GraphRAG.
The following subsystems use a factory pattern that allows you to register your own implementations:
- [language model](https://github.com/microsoft/graphrag/blob/main/graphrag/language_model/factory.py) - implement your own `chat` and `embed` methods to use a model provider of choice beyond the built-in OpenAI/Azure support
- [cache](https://github.com/microsoft/graphrag/blob/main/graphrag/cache/factory.py) - create your own cache storage location in addition to the file, blob, and CosmosDB ones we provide
- [logger](https://github.com/microsoft/graphrag/blob/main/graphrag/logger/factory.py) - create your own log writing location in addition to the built-in file and blob storage
- [storage](https://github.com/microsoft/graphrag/blob/main/graphrag/storage/factory.py) - create your own storage provider (database, etc.) beyond the file, blob, and CosmosDB ones built in
- [vector store](https://github.com/microsoft/graphrag/blob/main/graphrag/vector_stores/factory.py) - implement your own vector store beyond the built-in lancedb, Azure AI Search, and CosmosDB options
- [pipeline + workflows](https://github.com/microsoft/graphrag/blob/main/graphrag/index/workflows/factory.py) - implement your own workflow steps with a custom `run_workflow` function, or register an entire pipeline (list of named workflows)
The links for each of these subsystems point to the source code of the factory, which includes registration of the default built-in implementations. In addition, we have a detailed discussion of [language models](../config/models.md), which includes an example of a custom provider, and a [sample notebook](../examples_notebooks/custom_vector_store.ipynb) that demonstrates a custom vector store.
All of these factories allow you to register an implementation using any string name you would like, even overriding built-in ones directly.
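As a rough sketch of the registration pattern, registering a custom vector store might look like the following (the class and method names are assumptions based on the linked factory module; check the source for the exact signatures):

```python
# Sketch only: the registration API shown is an assumption; see
# graphrag/vector_stores/factory.py for the actual interface.
from graphrag.vector_stores.base import BaseVectorStore
from graphrag.vector_stores.factory import VectorStoreFactory


class MyVectorStore(BaseVectorStore):
    """A custom vector store backend (implement the abstract methods here)."""


# Register under any string name, then reference that name as the vector store
# type in settings.yaml -- this can even override a built-in implementation.
VectorStoreFactory.register("my_vector_store", MyVectorStore)
```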
docs/index/inputs.md (4 additions & 0 deletions)
@@ -16,6 +16,10 @@ All input formats are loaded within GraphRAG and passed to the indexing pipeline
Also see the [outputs](outputs.md) documentation for the final documents table schema saved to parquet after pipeline completion.
## Bring-your-own DataFrame
As of version 2.6.0, GraphRAG's [indexing API method](https://github.com/microsoft/graphrag/blob/main/graphrag/api/index.py) allows you to pass in your own pandas DataFrame and bypass all of the input loading/parsing described in the next section. This is convenient if you have content in a format or storage location we don't support out-of-the-box. __You must ensure that your input DataFrame conforms to the schema described above.__ All of the chunking behavior described later will proceed exactly the same.
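A minimal sketch of what this can look like (the keyword argument name and DataFrame columns are assumptions; check the indexing API method linked above for the exact signature and required schema):

```python
# Sketch only: argument names are assumptions; see graphrag/api/index.py.
import asyncio
from pathlib import Path

import pandas as pd

import graphrag.api as api
from graphrag.config.load_config import load_config

# A documents DataFrame conforming to the input schema (illustrative columns).
docs = pd.DataFrame([
    {"title": "a-christmas-carol.txt", "text": "Marley was dead: to begin with..."},
])

config = load_config(Path("./christmas"))

# Hypothetical keyword argument for the bring-your-own DataFrame path.
asyncio.run(api.build_index(config=config, input_documents=docs))
```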
## Formats
We support three file formats out-of-the-box. This covers the overwhelming majority of use cases we have encountered. If you have a different format, we recommend writing a script to convert to one of these, which are widely used and supported by many tools and libraries.