Commit 3b2154d

Merge pull request #16 from amoffat/dev
Release 1.0.1
2 parents 8bd523a + 31a5580 commit 3b2154d

37 files changed (+552, -180 lines)

CHANGELOG.md

Lines changed: 6 additions & 0 deletions
@@ -1,5 +1,11 @@
 # Changelog

+## 1.0.1 - 9/12/23
+
+- Simplify docs
+- Add missing Postgres api docs
+- Generalizing `validation_only` to the `Bifrost` superclass
+
 ## 1.0.0 - 9/11/23

 - Subquery support

README.md

Lines changed: 48 additions & 58 deletions
@@ -1,7 +1,11 @@
 # HeimdaLLM

-> Heimdall, the watchman of the gods, dwelt at its entrance, where he guarded Bifrost,
-> the shimmering path connecting the realms.
+Pronounced `[ˈhaɪm.dɔl.əm]` or _HEIM-dall-EM_
+
+HeimdaLLM is a robust static analysis framework for validating that LLM-generated
+structured output is safe. It currently supports SQL.
+
+In simple terms, it helps makes sure that AI won't wreck your systems.

 [![Heimdall](https://raw.githubusercontent.com/amoffat/HeimdaLLM/main/docs/source/images/heimdall.png)](https://heimdallm.ai)
 [![Build status](https://github.com/amoffat/HeimdaLLM/actions/workflows/main.yml/badge.svg?branch=main)](https://github.com/amoffat/HeimdaLLM/actions)
@@ -12,77 +16,63 @@
 [![License: AGPL v3](https://img.shields.io/badge/License-AGPL_v3-blue.svg)](https://www.gnu.org/licenses/agpl-3.0)
 [![Coverage Status](https://coveralls.io/repos/github/amoffat/HeimdaLLM/badge.svg?branch=dev)](https://coveralls.io/github/amoffat/HeimdaLLM?branch=dev)

-HeimdaLLM safely bridges the gap between untrusted human input and trusted
-machine-readable output by augmenting LLMs with a robust validation framework. This
-enables you externalize LLM technology to your users, so that you can do things like
-execute trusted SQL queries from their untrusted input.
+Consider the following natural-language database query:

-To accomplish this, HeimdaLLM introduces a new technology, the 🌈✨
-[Bifrost](https://docs.heimdallm.ai/en/latest/bifrost.html), composed of 4 parts: an LLM
-prompt envelope, an LLM integration, a grammar, and a constraint validator. These 4
-components operate as a single unit—a Bifrost—which is capable of translating untrusted
-human input into trusted machine output.
+```
+how much have i spent renting movies, broken down by month?
+```

-**This allows you to perform magic**
+From this query (and a little bit of context), an LLM can produce the following SQL
+query:
+
+```sql
+SELECT
+strftime('%Y-%m', payment.payment_date) AS month,
+SUM(payment.amount) AS total_amount
+FROM payment
+JOIN rental ON payment.rental_id=rental.rental_id
+JOIN customer ON payment.customer_id=customer.customer_id
+WHERE customer.customer_id=:customer_id
+GROUP BY month
+LIMIT 10;
+```

-Imagine giving your users natural language access to their data in your database,
-without having to worry about dangerous queries. This is an actual query on the [Sakila
-Sample
-Database](https://www.kaggle.com/datasets/atanaskanev/sqlite-sakila-sample-database):
+But how can you ensure the LLM-generated query is safe and that it only accesses
+authorized data?

-```python
-traverse("Show me the movies I rented the longest, and the number of days I had them for.")
-```
+HeimdaLLM performs static analysis on the generated SQL to ensure that only certain
+columns, tables, and functions are used. It also automatically edits the query to add a
+`LIMIT` and to remove forbidden columns. Lastly, it ensures that there is a column
+constraint that would restrict the results to only the user's data.
+
+It does all of this locally, without AI, using good ol' fashioned grammars and parsers:

 ```
 ✅ Ensuring SELECT statement...
 ✅ Resolving column and table aliases...
 ✅ Allowlisting selectable columns...
-✅ Removing 4 forbidden columns...
+✅ Removing 2 forbidden columns...
 ✅ Ensuring correct row LIMIT exists...
-✅ Lowering row LIMIT to 5...
+✅ Lowering row LIMIT to 10...
 ✅ Checking JOINed tables and conditions...
 ✅ Checking required WHERE conditions...
 ✅ Ensuring query is constrained to requester's identity...
 ✅ Allowlisting SQL functions...
+✅ strftime
+✅ SUM
 ```

-| Title | Rental Date | Return Date | Rental Days |
-| --------------- | ----------------------- | ----------------------- | ----------- |
-| OUTLAW HANKY | 2005-08-19 05:48:12.000 | 2005-08-28 10:10:12.000 | 9.181944 |
-| BOULEVARD MOB | 2005-08-19 07:06:51.000 | 2005-08-28 10:35:51.000 | 9.145139 |
-| MINDS TRUMAN | 2005-08-02 17:42:49.000 | 2005-08-11 18:14:49.000 | 9.022222 |
-| AMERICAN CIRCUS | 2005-07-12 16:37:55.000 | 2005-07-21 16:04:55.000 | 8.977083 |
-| LADY STAGE | 2005-07-28 10:07:04.000 | 2005-08-06 08:16:04.000 | 8.922917 |
-
-You can safely run this example here:
-
-[![Open in GitHub Codespaces](https://img.shields.io/badge/Open%20in-Codespaces-purple.svg)](https://github.com/codespaces/new?hide_repo_select=true&ref=main&repo=656570421)
-
-or [view the read-only notebook](./notebooks/demo.ipynb)
-
-# 📋 Explanation
-
-So, what is actually happening above?
-
-1. Unsafe free-form input is provided, presumably from some front end user interface.
-1. That unsafe input is wrapped in a prompt envelope, producing a prompt with additional
-context to help an LLM produce a correct query.
-1. The unsafe prompt is sent to an LLM of your choice, which then produces an unsafe
-SQL query.
-1. The LLM response is parsed by a strict grammar which defines only the SQL features
-that are allowed.
-1. If parsing succeeds, we know at the very least we're dealing with a valid SQL query
-albeit an untrusted one.
-1. Different features of the parsed query are extracted for validation.
-1. A soft validation pass is performed on the extracted features, and we potentially
-modify the query to be compliant, for example, to add a `LIMIT` clause, or to remove
-disallowed columns.
-1. A hard validation pass is performed with your custom constraints to ensure that the
-query is only accessing allowed tables, columns, and functions, while containing
-required conditions.
-1. If validation succeeds, the resulting SQL query can then be sent to the database.
-1. If validation fails, you'll see a helpful exception explaining exactly why.
+The validated query can then be executed:
+
+| month | total_amount |
+| ------- | ------------ |
+| 2005-05 | 4.99 |
+| 2005-06 | 22.95 |
+| 2005-07 | 100.78 |
+| 2005-08 | 87.82 |
+
+Want to get started quickly? Go
+[here](https://docs.heimdallm.ai/en/latest/quickstart/index.html).

 # 🥽 Safety

@@ -94,7 +84,7 @@ me](https://github.com/sponsors/amoffat) or [inquire about interest in a commerc
 license](https://forms.gle/frEPeeJx81Cmwva78).

 To understand some of the potential vulnerabilities, take a look at the [attack
-surface](https://docs.heimdallm.ai/en/latest/attack_surface.html) to see the risks and
+surface](https://docs.heimdallm.ai/en/latest/attack-surface.html) to see the risks and
 the mitigations.

 # 📚 Database support
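
For illustration only (this is not code from the commit, and it does not use HeimdaLLM's API): the new README text describes allowlisting the columns and functions that a generated query may touch. A minimal sketch of that idea, assuming SQLite and Python's standard `sqlite3` module, can lean on SQLite's compile-time authorizer callback rather than the grammar-based static analysis that HeimdaLLM performs. The schema, the allowlists, and the `check_query` helper below are hypothetical names introduced for this sketch.

```python
import sqlite3

# Hypothetical schema, loosely modeled on the tables named in the README query.
SCHEMA = """
CREATE TABLE customer (customer_id INTEGER, email TEXT);
CREATE TABLE rental (rental_id INTEGER, customer_id INTEGER);
CREATE TABLE payment (
    customer_id INTEGER,
    rental_id INTEGER,
    amount REAL,
    payment_date TEXT
);
"""

# Hypothetical allowlists; these are not HeimdaLLM configuration.
ALLOWED_COLUMNS = {
    ("payment", "payment_date"),
    ("payment", "amount"),
    ("payment", "rental_id"),
    ("payment", "customer_id"),
    ("rental", "rental_id"),
    ("customer", "customer_id"),
}
ALLOWED_FUNCTIONS = {"strftime", "sum"}

# The generated query shown in the README.
README_QUERY = """
SELECT
    strftime('%Y-%m', payment.payment_date) AS month,
    SUM(payment.amount) AS total_amount
FROM payment
JOIN rental ON payment.rental_id=rental.rental_id
JOIN customer ON payment.customer_id=customer.customer_id
WHERE customer.customer_id=:customer_id
GROUP BY month
LIMIT 10;
"""


def check_query(sql, params=()):
    """Return the allowlist violations SQLite reports while compiling `sql`."""
    violations = []

    def authorizer(action, arg1, arg2, db_name, source):
        # SQLite calls this hook for each access while preparing a statement.
        if action == sqlite3.SQLITE_SELECT:
            return sqlite3.SQLITE_OK
        if action == sqlite3.SQLITE_READ:
            # arg1 is the table name, arg2 is the column name.
            if (arg1, arg2) in ALLOWED_COLUMNS:
                return sqlite3.SQLITE_OK
            violations.append(f"column {arg1}.{arg2}")
        elif action == sqlite3.SQLITE_FUNCTION:
            # arg2 is the SQL function name.
            if arg2.lower() in ALLOWED_FUNCTIONS:
                return sqlite3.SQLITE_OK
            violations.append(f"function {arg2}")
        else:
            # Anything that is not a plain read is refused.
            violations.append(f"action code {action}")
        return sqlite3.SQLITE_DENY

    # An empty in-memory database: only the schema exists, so no data is exposed.
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SCHEMA)
        conn.set_authorizer(authorizer)
        conn.execute(sql, params)
    except sqlite3.Error as exc:
        if not violations:
            violations.append(f"rejected: {exc}")
    finally:
        conn.close()
    return violations


if __name__ == "__main__":
    print(check_query(README_QUERY, {"customer_id": 1}))
    print(check_query("SELECT email FROM customer"))
```

An empty list means every column and function reference was on the allowlist. Unlike the approach the README describes, this sketch cannot rewrite a query (for example to add a `LIMIT` or drop forbidden columns) and does not check that the `WHERE` clause constrains results to the requester.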

docs/source/api/abc/index.rst

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@ intended for direct use.
 bifrost
 envelope
 validator
-llm_integration
+llm-integration
 context

 sql/index
File renamed without changes.

docs/source/api/bifrosts/index.rst

Lines changed: 2 additions & 2 deletions
@@ -1,8 +1,8 @@
 Bifrosts
 ========

-:doc:`Bifrosts </bifrost>` are the fundamental unit of translating untrusted input into
-trusted output. This document will expand as we add more Bifrosts.
+:doc:`Bifrosts </architecture/bifrost>` are the fundamental unit of translating
+untrusted input into trusted output. This document will expand as we add more Bifrosts.

 .. toctree::

docs/source/api/bifrosts/sql/index.rst

Lines changed: 2 additions & 0 deletions
@@ -6,9 +6,11 @@ database, please participate in `this poll.
 <https://github.com/amoffat/HeimdaLLM/discussions/2>`_

 .. toctree::
+:maxdepth: 3

 sqlite/index
 mysql/index
+postgres/index
 exceptions
 common

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+Postgres
+========
+
+.. toctree::
+
+select/index
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+SQL Select Bifrost
+==================
+
+The SQL Select Bifrost produces a trusted SQL Select statement. It uses the following
+components:
+
+* :class:`SQLPromptEnvelope <heimdallm.bifrosts.sql.postgres.select.envelope.PromptEnvelope>`
+* :class:`SQLConstraintValidator <heimdallm.bifrosts.sql.postgres.select.validator.ConstraintValidator>`
+* `Grammar <https://github.com/amoffat/HeimdaLLM/blob/dev/heimdallm/bifrosts/sql/postgres/select/grammar.lark>`_
+
+.. autoclass:: heimdallm.bifrosts.sql.postgres.select.bifrost.Bifrost
+:members:
+:inherited-members:
Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+SQL Select Envelope
+===================
+
+.. CAUTION::
+
+The ``db_schema`` argument of the constructor is passed to the LLM. This is how the
+LLM knows how to construct the query. If this concerns you, limit the information
+that you include in the schema.
+
+.. autoclass:: heimdallm.bifrosts.sql.postgres.select.envelope.PromptEnvelope
+:members:
+:inherited-members:
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+Select
+======
+
+.. toctree::
+
+bifrost
+envelope
+validator
