Commit 497eb36

Doogfood docs for benchmark run
1 parent 597ed9d commit 497eb36

2 files changed: +9 -9 lines changed

klaudbiusz/.env.example

Lines changed: 3 additions & 0 deletions
@@ -2,6 +2,7 @@
 # Required for app generation and MLflow tracking
 DATABRICKS_HOST=https://your-workspace.databricks.com
 DATABRICKS_TOKEN=dapi...
+DATABRICKS_WAREHOUSE_ID=
 
 # Anthropic API
 # Required for Claude Agent SDK
@@ -10,5 +11,7 @@ ANTHROPIC_API_KEY=sk-ant-...
 # Optional: Database for logging
 # DATABASE_URL=postgresql://user:password@localhost:5432/dbname
 
+GEMINI_API_KEY=
+
 # MLFlow
 MLFLOW_EXPERIMENT_NAME=/Shared/klaudbiusz-evaluations
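
Review note: the two variables added to `.env.example` here have empty values, so a benchmark run can start with them unset. A minimal sketch of a pre-flight check follows. It assumes the variables are read from the process environment. The `missing_env_vars` helper and the choice of which names count as required are hypothetical and not part of the repository. `GEMINI_API_KEY` is left out on the assumption that it only matters for a non-default backend.

```python
import os

# Names taken from .env.example as shown in this commit. Treating all of
# them as required is an assumption made for illustration only.
REQUIRED_VARS = (
    "DATABRICKS_HOST",
    "DATABRICKS_TOKEN",
    "DATABRICKS_WAREHOUSE_ID",
    "ANTHROPIC_API_KEY",
)


def missing_env_vars(names=REQUIRED_VARS, env=None):
    """Return the names that are unset or blank in the given environment."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name, "").strip()]
```

Calling `missing_env_vars()` before a run and aborting when the returned list is non-empty would surface a blank `DATABRICKS_WAREHOUSE_ID` early, before any app generation starts.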

klaudbiusz/README.md

Lines changed: 6 additions & 9 deletions
@@ -20,18 +20,15 @@ cp .env.example .env
 # Edit .env with your credentials
 ```
 
-`.env` file contents:
-```bash
-DATABRICKS_HOST=https://your-workspace.databricks.com
-DATABRICKS_TOKEN=dapi...
-ANTHROPIC_API_KEY=sk-ant-...
-```
-
 ### Generate Applications
-
 ```bash
 cd klaudbiusz
 
+
+# make sure app folder is empty
+cli/archive_evaluation.sh
+cli/cleanup_evaluation.sh
+
 # Generate a single app (Claude backend, default)
 uv run cli/single_run.py "Create a customer churn analysis dashboard"
 
@@ -67,7 +64,7 @@ uv run cli/evaluate_all.py --skip 10 --limit 5 # Skip first 10, evaluat
 uv run cli/evaluate_app.py ../app/customer-churn-analysis
 ```
 
-**Results are automatically logged to MLflow:** Navigate to `ML → Experiments → /Shared/klaudbiusz-evaluations` in Databricks UI.
+**Results are automatically logged to MLflow:** Navigate to `ML → Experiments → /Shared/klaudbiusz-evaluations` in Databricks UI / Googfooding.
 
 ## Evaluation Framework
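
Review note: the README now tells the reader to make sure the app folder is empty and runs `cli/archive_evaluation.sh` and `cli/cleanup_evaluation.sh` to get there. What those scripts do is not shown in this commit. As a rough sketch, the precondition itself could be checked as below. The `is_empty_dir` helper is hypothetical, and the `../app` location is an assumption inferred from the `../app/customer-churn-analysis` path used in the evaluation command.

```python
from pathlib import Path


def is_empty_dir(path):
    """Return True if path is missing or is a directory with no entries."""
    p = Path(path)
    if not p.exists():
        return True
    # iterdir() raises NotADirectoryError if path is a regular file.
    return not any(p.iterdir())
```

For example, `is_empty_dir("../app")` run from inside `klaudbiusz` would report whether an earlier run left generated apps behind.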
