Commit b102f58

Merge branch 'main' into jinja2-set-vars-incorrect

2 parents: c78de27 + 4ce5b68

File tree: 663 files changed (+68649, −1648 lines)


docs-website/docs/concepts/components_overview.mdx renamed to docs-website/docs/concepts/components-overview.mdx

Lines changed: 4 additions & 4 deletions

@@ -30,7 +30,7 @@ Read more about various Generators in our [guides](../pipeline-components/genera

[Retrievers](../pipeline-components/retrievers.mdx) go through all the documents in a Document Store, select the ones that match the user query, and pass them on to the next component. There are various Retrievers customized for specific Document Stores, which means they can handle each database's specific requirements using customized parameters.

For example, for the Elasticsearch Document Store, you will find both the Document Store and Retriever packages in its GitHub [repo](https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/elasticsearch).
### Document Stores
@@ -42,9 +42,9 @@ If you are working with more complex pipelines in Haystack, you can use a [`Docu

You can use different [data classes](data-classes.mdx) in Haystack to carry data through the system. The data classes most often appear as inputs or outputs of your pipelines.

The `Document` class contains information to be carried through the pipeline: text, metadata, tables, or binary data. Documents can be written into Document Stores, but also written and read by other components.

The `Answer` class holds not only the answer generated in a pipeline but also the originating query and metadata.
### Pipelines
@@ -53,4 +53,4 @@ Finally, you can combine various components, Document Stores, and integrations i

If you want to reuse pipelines, you can save them in a convenient format (YAML, TOML, and more) on disk or share them using the [serialization](pipelines/serialization.mdx) process.

Here is a short Haystack pipeline, illustrated:

<ClickableImage src="/img/00f5fe8-Pipeline_Illustrations_2.png" alt="" />

docs-website/docs/concepts/concepts-overview.mdx

Lines changed: 2 additions & 2 deletions

@@ -1,6 +1,6 @@

---
title: "Haystack Concepts Overview"
- id: "concepts-overview"
+ id: concepts-overview
description: "Haystack provides all the tools you need to build custom agents and RAG pipelines with LLMs that work for you. This includes everything from prototyping to deployment. This page discusses the most important concepts Haystack operates on."
---

@@ -51,4 +51,4 @@ If you want to reuse pipelines, you can save them into a convenient format (YAML

Here is a short Haystack pipeline, illustrated:

![Short Haystack pipeline](/img/pipeline-illustration-overview.png)

docs-website/docs/overview/intro.mdx renamed to docs-website/docs/intro.mdx

Lines changed: 2 additions & 3 deletions

@@ -1,7 +1,6 @@

---
title: "Introduction to Haystack"
id: intro
- slug: "/intro"
description: "Haystack is an **open-source AI framework** for building production-ready **AI Agents**, **retrieval-augmented generative pipelines** and **state-of-the-art multimodal search systems**. Learn more about Haystack and how it works."
---

@@ -12,7 +11,7 @@ Haystack is an **open-source AI framework** for building production-ready **AI A

:::tip
Welcome to Haystack

- To skip the introductions and go directly to installing and creating a search app, see [Get Started](get_started.mdx).
+ To skip the introductions and go directly to installing and creating a search app, see [Get Started](overview/get-started.mdx).
:::

Haystack is an open-source AI orchestration framework that you can use to build powerful, production-ready applications with Large Language Models (LLMs) for various use cases. Whether you're creating autonomous agents, multimodal apps, or scalable RAG systems, Haystack provides the tools to move from idea to production easily.

@@ -31,4 +30,4 @@ If your team needs **enterprise-grade support, best practices, and deployment gu

📜 [Learn more about Haystack Enterprise](https://haystack.deepset.ai/blog/announcing-haystack-enterprise)

👉 [Get in touch with our team](https://www.deepset.ai/products-and-services/haystack-enterprise)
:::

docs-website/docs/overview/get_started.mdx renamed to docs-website/docs/overview/get-started.mdx

Lines changed: 1 addition & 1 deletion

@@ -102,4 +102,4 @@ RECIPE MISSING

### Adding Your Data

Instead of running the RAG pipeline on example data, learn how you can add your own custom data using [Document Stores](../concepts/document-store.mdx).

docs-website/docs/overview/migration.mdx

Lines changed: 11 additions & 11 deletions

@@ -9,7 +9,7 @@ description: "Learn how to make the move to Haystack 2.x from Haystack 1.x."

Learn how to make the move to Haystack 2.x from Haystack 1.x.

- This guide is designed for those with previous experience with Haystack who are interested in understanding the differences between Haystack 1.x and Haystack 2.x. If you're new to Haystack, skip this page and proceed directly to the Haystack 2.x [documentation](get_started.mdx).
+ This guide is designed for those with previous experience with Haystack who are interested in understanding the differences between Haystack 1.x and Haystack 2.x. If you're new to Haystack, skip this page and proceed directly to the Haystack 2.x [documentation](get-started.mdx).

## Major Changes

@@ -38,11 +38,11 @@ While Haystack 2.x continues to rely on the `Pipeline` abstraction, the elements

Pipelines continue to serve as the fundamental structure of all Haystack applications. While the `Pipeline` abstraction remains consistent, Haystack 2.x introduces significant enhancements that address various limitations of its predecessor. For instance, pipelines now support loops, and pipeline input is no longer restricted to queries. A pipeline can also route the output of a component to multiple recipients. This increases flexibility but comes with notable differences in how pipelines are defined in Haystack 2.x compared to the previous version.

In Haystack 1.x, a pipeline was built by adding one node after the other. In the resulting pipeline graph, edges are automatically added to connect the nodes in the order they were added.

Building a pipeline in Haystack 2.x is a two-step process:

1. First, components are added to the pipeline in no particular order by calling the `add_component` method.
2. Then, the components must be explicitly connected by calling the `connect` method to define the final graph.

To migrate an existing pipeline, the first step is to go through the nodes and identify their counterparts in Haystack 2.x (see the following section, [_Migrating Components_](#migrating-components), for guidance). If all the nodes can be replaced by corresponding components, add them to the pipeline with `add_component` and explicitly connect them with the appropriate calls to `connect`. Here is an example:
@@ -121,7 +121,7 @@ The [_Migration examples_](#migration-examples) section below shows how to port

The agentic approach facilitates answering questions that are significantly more complex than those typically addressed by extractive or generative question answering techniques.

Haystack 1.x provided Agents, enabling the use of LLMs in a loop.

Currently in Haystack 2.x, you can build Agents using three main elements in a pipeline: Chat Generators, the ToolInvoker component, and Tools. A standalone Agent abstraction in Haystack 2.x is in an experimental phase.

@@ -297,11 +297,11 @@ cleaner = DocumentCleaner(
    remove_extra_whitespaces=True,
)
indexing_pipeline.add_component("cleaner", cleaner)

## Pre-processes the text by performing splits and adding metadata to the text (DocumentSplitter component)
preprocessor = DocumentSplitter(
    split_by="passage",
    split_length=100,
    split_overlap=50
)
indexing_pipeline.add_component("preprocessor", preprocessor)

@@ -348,9 +348,9 @@ extractive_qa_pipeline = ExtractiveQAPipeline(reader, retriever)

query = "What is the capital of France?"
result = extractive_qa_pipeline.run(
    query=query,
    params={
        "Retriever": {"top_k": 10},
        "Reader": {"top_k": 5}
    }
)

@@ -385,7 +385,7 @@ extractive_qa_pipeline.connect("retriever", "reader")

query = "What is the capital of France?"
result = extractive_qa_pipeline.run(data={
    "retriever": {"query": query, "top_k": 3},
    "reader": {"query": query, "top_k": 2}
})
```
@@ -491,4 +491,4 @@ You can access old tutorials in the [GitHub history](https://github.com/deepset-

The ZIP file contains documentation for all minor releases from version 1.0 to 1.26.

To download documentation for a specific release, replace the version number in the following URL: `https://core-engineering.s3.eu-central-1.amazonaws.com/public/docs/v1.26.zip`.

docs-website/docs/pipeline-components/converters.mdx

Lines changed: 2 additions & 1 deletion

@@ -20,6 +20,7 @@ Use various Converters to extract data from files in different formats and cast

| [ImageFileToImageContent](converters/imagefiletoimagecontent.mdx) | Reads local image files and converts them into `ImageContent` objects. |
| [JSONConverter](converters/jsonconverter.mdx) | Converts JSON files to text documents. |
| [MarkdownToDocument](converters/markdowntodocument.mdx) | Converts markdown files to documents. |
+ | [MistralOCRDocumentConverter](converters/mistralocrdocumentconverter.mdx) | Extracts text from documents using Mistral's OCR API, with optional structured annotations. |
| [MSGToDocument](converters/msgtodocument.mdx) | Converts Microsoft Outlook .msg files to documents. |
| [MultiFileConverter](converters/multifileconverter.mdx) | Converts CSV, DOCX, HTML, JSON, MD, PPTX, PDF, TXT, and XLSX files to documents. |
| [OpenAPIServiceToFunctions](converters/openapiservicetofunctions.mdx) | Transforms OpenAPI service specifications into a format compatible with OpenAI's function calling mechanism. |

@@ -31,4 +32,4 @@ Use various Converters to extract data from files in different formats and cast

| [TikaDocumentConverter](converters/tikadocumentconverter.mdx) | Converts various file types to documents using Apache Tika. |
| [TextFileToDocument](converters/textfiletodocument.mdx) | Converts text files to documents. |
| [UnstructuredFileConverter](converters/unstructuredfileconverter.mdx) | Converts text files and directories to a document. |
| [XLSXToDocument](converters/xlsxtodocument.mdx) | Converts Excel files into documents. |
