Open
Commits
56 commits
569cdac
Create .gitkeep
bergio13 Jul 10, 2025
438d861
Add files via upload
bergio13 Jul 10, 2025
d024f37
Add files via upload
bergio13 Jul 10, 2025
906a6c7
Delete examples/rl/.gitkeep
bergio13 Jul 10, 2025
6f5ab24
Implement ReinforcementLearningABM and integrate with POMDPs/Crux in…
bergio13 Jul 22, 2025
6463646
ignore log
bergio13 Jul 22, 2025
ece4017
fix wolfsheep
bergio13 Jul 24, 2025
fd5d309
fix indexing, stepping with policies and training config
bergio13 Aug 7, 2025
c1b69d9
fix wolfsheep
bergio13 Aug 12, 2025
92d8bc6
add scheduler
bergio13 Aug 12, 2025
af6bdce
add scheduler
bergio13 Aug 12, 2025
0687f2c
add discount rates to rl config
bergio13 Aug 14, 2025
9d9c9ee
add plot + start refactoring as extension
bergio13 Aug 16, 2025
b839b77
refactor in extension and fix plotting
bergio13 Aug 16, 2025
93678f3
Use less restrictive versions of libs
Tortar Aug 16, 2025
c318539
Merge branch 'main' into gsoc
Tortar Aug 16, 2025
2da0515
remove piracy
Tortar Aug 17, 2025
5f05a06
add documentation for rl functions + refactor rl code
bergio13 Aug 17, 2025
5327055
fix
bergio13 Aug 17, 2025
5fc5755
fix
bergio13 Aug 17, 2025
864302e
fix docstring
bergio13 Aug 17, 2025
d43fd04
fix exports
bergio13 Aug 17, 2025
186c9c7
fix interface observation function
bergio13 Aug 18, 2025
1062488
delete old files and rename examples
bergio13 Aug 18, 2025
b1e37e8
delete old files
bergio13 Aug 18, 2025
e600392
add tests for rlabm
bergio13 Aug 18, 2025
5fc8d06
create tutorial + fix training wolfsheep + add deps for tests
bergio13 Aug 20, 2025
bfa27a7
fix tests
bergio13 Aug 20, 2025
7625bd2
fix makie version
bergio13 Aug 20, 2025
b918301
Update Project.toml
Tortar Aug 20, 2025
b8a9a4f
Update Project.toml
Tortar Aug 20, 2025
1fde568
remove debug prints
bergio13 Aug 21, 2025
0ee83ea
edit api docs
bergio13 Aug 21, 2025
30d5e6d
edit docs
bergio13 Aug 21, 2025
3eeec19
Update Agents.jl
Tortar Aug 21, 2025
2fb36da
Update rl_boltzmann.jl
Tortar Aug 21, 2025
f7df9ce
fix
Tortar Aug 21, 2025
9657ec4
improve boltzmann tutorial + fixes
bergio13 Aug 21, 2025
33bb8b2
refactor observation_radius + fixes
bergio13 Aug 21, 2025
3a638d2
fix example
bergio13 Aug 21, 2025
ac20b28
fix tutorial
bergio13 Aug 21, 2025
bf86bd0
add tests for extension + improve docs for RLABM
bergio13 Aug 22, 2025
c3c984b
fix tests
bergio13 Aug 22, 2025
f1bf7f6
Update examples/rl_boltzmann.jl
Tortar Aug 22, 2025
1b367d0
Update examples/rl_boltzmann.jl
Tortar Aug 22, 2025
de7a081
Update examples/rl_boltzmann.jl
Tortar Aug 22, 2025
73a7c49
Update examples/rl_boltzmann.jl
Tortar Aug 22, 2025
d336fd0
Update examples/rl_boltzmann.jl
Tortar Aug 22, 2025
c37b554
Update examples/rl_boltzmann.jl
Tortar Aug 22, 2025
aa0222a
update example
Tortar Aug 22, 2025
47a3240
Update Project.toml
Tortar Aug 22, 2025
988821f
fix stepping
Tortar Aug 23, 2025
2e21eca
Update rl_boltzmann.jl
Tortar Aug 23, 2025
4b9edd1
Update rl_boltzmann.jl
Tortar Aug 23, 2025
19b85bb
Update rl_boltzmann.jl
Tortar Aug 23, 2025
4eb1985
Update ext/AgentsVisualizations/src/interaction.jl
bergio13 Aug 26, 2025
4 changes: 3 additions & 1 deletion .gitignore
@@ -17,4 +17,6 @@ test/adata.arrow
test/mdata.arrow
*.csv
*.arrow
tutorial.md
tutorial.md
log
examples/rl/log
6 changes: 6 additions & 0 deletions Project.toml
@@ -30,9 +30,12 @@ StructArrays = "09ab397b-f2b6-538f-b94a-2f83cf4a842a"

[weakdeps]
Arrow = "69666777-d1a9-59fb-9406-91d4454c9d45"
Crux = "e51cc422-768a-4345-bb8e-2246287ae729"
GraphMakie = "1ecd5474-83a3-4783-bb4f-06765db800d2"
Makie = "ee78f7c6-11fb-53f2-987a-cfe4a2b5a57a"
OSMMakie = "76b6901f-8821-46bb-9129-841bc9cfe677"
POMDPs = "a93abf59-7444-517b-a68a-c42f96afdd7d"
Flux = "587475ba-b771-5e3f-ad9e-33799f191a9c"

[extensions]
AgentsArrow = "Arrow"
@@ -44,11 +47,13 @@ AgentsVisualizations = "Makie"
Arrow = "2"
CSV = "0.9.7, 0.10"
CommonSolve = "0.2.4"
Crux = "0.1.2"
DataFrames = "0.21, 0.22, 1"
DataStructures = "0.18"
Distributed = "1"
Distributions = "0.25"
Downloads = "1"
Flux = "0.14.25"
GraphMakie = "0.5"
Graphs = "1.4"
JLD2 = "0.4, 0.5"
@@ -59,6 +64,7 @@ LinearAlgebra = "1"
MacroTools = "0.5"
Makie = "0.20, 0.21, 0.22"
OSMMakie = "0.0, 0.1"
POMDPs = "0.9.0, 1.0.0"
PrecompileTools = "1"
ProgressMeter = "1.5"
Random = "1"
50 changes: 50 additions & 0 deletions docs/ReinforcementLearningABM_Guide.md
@@ -0,0 +1,50 @@
# ReinforcementLearningABM: A New Agent-Based Model Type

## Overview

`ReinforcementLearningABM` is a new model type that extends `StandardABM` with reinforcement learning (RL) functionality. Agents can be trained with RL algorithms inside the model itself, and the model remains compatible with the rest of the Agents.jl ecosystem.

## Key Features

### 1. **Integrated RL Training**

- Built-in support for training agents using various RL algorithms (PPO, DQN, A2C)
- Automatic integration with POMDPs.jl and Crux.jl

### 2. **Multi-Agent Learning Support**

- Train multiple agent types simultaneously or sequentially
- Support for heterogeneous agents with different action and observation spaces
- Automatic policy management for trained agents

### 3. **Flexible Architecture**

- Inherits all functionality from `StandardABM`
- RL functionality is optional: the model can be used as a regular ABM when RL is not needed

### 4. **Easy Configuration**

- Simple configuration system for RL components
- Customizable observation functions, reward functions, and termination conditions
- Support for custom neural network architectures
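The configuration items listed above can be pictured as a set of plain callbacks bundled together. The following is a minimal sketch only: the field names `observation_fn`, `reward_fn`, `terminal_fn`, and `max_steps`, the helper functions, and the `wealth`/`step` properties are all hypothetical and are not taken from this guide. Plain `NamedTuple`s stand in for an agent and a model so the snippet runs without any packages.

```julia
# Hypothetical configuration sketch; every field and function name here is an
# assumption made for illustration, not the confirmed Agents.jl API.

# Observation: a normalized view of the agent's state as a Float32 vector.
observe(agent, model) = Float32[agent.wealth / model.max_wealth]

# Reward: positive while the agent still holds wealth, negative otherwise.
reward(agent, model) = agent.wealth > 0 ? 1.0f0 : -1.0f0

# Termination: stop the episode once the step budget is used up.
is_terminal(model) = model.step >= model.max_steps

rl_config = (
    observation_fn = observe,
    reward_fn = reward,
    terminal_fn = is_terminal,
    max_steps = 50,
)

# Stand-ins for a real agent and model, used only to exercise the callbacks.
agent = (wealth = 5,)
model = (max_wealth = 10, step = 50, max_steps = 50)

println(rl_config.observation_fn(agent, model))
println(rl_config.reward_fn(agent, model))
println(rl_config.terminal_fn(model))
```

Keeping the callbacks as ordinary functions means each one can be tested in isolation before any training run.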

## Architecture

```
ReinforcementLearningABM
├── StandardABM components
│   └── agents, space, scheduler, properties, rng, etc.
└── RL-specific components
    ├── rl_config: Configuration for RL training
    ├── trained_policies: Storage for trained policies
    ├── training_history: Record of training progress
    └── is_training: Training mode flag
```
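
To make the RL-specific branch of the tree concrete, here is a minimal sketch of how those four components could be held together. The struct `RLComponentsSketch`, its field types, and the helper `has_rl` are hypothetical stand-ins written for this example. They are not the definition used in Agents.jl, and keying policies by agent type is an assumption.

```julia
# Hypothetical stand-in for illustration only; NOT the Agents.jl definition.
mutable struct RLComponentsSketch
    rl_config::Union{Nothing,NamedTuple}  # configuration for RL training
    trained_policies::Dict{Type,Any}      # storage for trained policies (keyed by agent type: an assumption)
    training_history::Dict{Type,Any}      # record of training progress
    is_training::Bool                     # training mode flag
end

# Start with no configuration, no policies, no history, and training switched off.
RLComponentsSketch() = RLComponentsSketch(nothing, Dict{Type,Any}(), Dict{Type,Any}(), false)

# RL is optional: without a configuration the model would act as a regular ABM.
has_rl(c::RLComponentsSketch) = c.rl_config !== nothing

components = RLComponentsSketch()
println(has_rl(components))

components.rl_config = (max_steps = 50,)
println(has_rl(components))
```

Allowing `rl_config` to be `nothing` is one way to express the "optional RL functionality" point from the feature list.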

## Dependencies

The RL functionality requires the following packages, which this change lists under `[weakdeps]` in `Project.toml`:

- `POMDPs.jl`: For the POMDP interface
- `Crux.jl`: For RL algorithms and neural networks
- `Flux.jl`: For neural network components
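
As a quick check before using the RL features, the sketch below reports which of the three packages are missing from the active environment. The helper `missing_packages` and the constant `RL_DEPS` are hypothetical names introduced here. The install step is left commented out because it needs network access.

```julia
using Pkg

# Package names taken from the list above.
const RL_DEPS = ("POMDPs", "Crux", "Flux")

# Return the names that cannot be located in the active environment.
missing_packages(names) = String[n for n in names if Base.find_package(n) === nothing]

to_install = missing_packages(RL_DEPS)
println(isempty(to_install) ? "All RL dependencies found." : "Missing: " * join(to_install, ", "))

# Uncomment to install whatever is missing:
# isempty(to_install) || Pkg.add(to_install)
```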