Designing and building a Large Language Model from scratch specifically for code generation.
- Python 3.13
- uv
After installing the uv package and project manager, run the command below to create a virtual environment and install all dependencies.
uv sync
This project uses Nox to define and run development automation tasks such as linting, formatting, and running tests. The current tasks are defined in noxfile.py.
Task that runs the unit tests:
uv run nox -s run_tests
The LLM is trained on an open-source dataset from Hugging Face. For now, the training script loads only the Python code from the dataset.
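As a rough illustration of what keeping only the Python code from a mixed-language dataset can look like, here is a minimal sketch in plain Python. It is not the project's training script: the record layout (a `language` field and a `content` field), the helper name `keep_python_only`, and the sample rows are all assumptions made for this example, and the real dataset may use different field names.

```python
from typing import Iterable, Iterator


def keep_python_only(
    records: Iterable[dict], language_field: str = "language"
) -> Iterator[dict]:
    """Yield only records whose language field is Python (case-insensitive).

    The field name is an assumption; adjust it to the actual dataset schema.
    """
    for record in records:
        # Records with no language tag are skipped rather than guessed at.
        if str(record.get(language_field, "")).lower() == "python":
            yield record


# Hypothetical sample rows standing in for dataset records.
sample = [
    {"language": "Python", "content": "print('hi')"},
    {"language": "Go", "content": "package main"},
    {"content": "no language tag"},
]
python_rows = list(keep_python_only(sample))
```

Writing the filter as a generator means records can be processed one at a time, which suits large datasets that are streamed rather than loaded fully into memory.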
Model configs are available at src/llm/configs. To start a new training run, use:
uv run train-llm