This repo contains the setup scripts from the paper "A Scalable AI Training Platform for Remote Sensing Data" presented at Agile 2023.

- `/ansible`: Ansible scripts to install the platform on classic servers or VMs. Intended for cases where no internal cloud connection is available.
- `/conf`: A basic configuration for Steep. It contains a CPU and a GPU training service. Both are based on the public TensorFlow images, so they can be used without building a custom Docker image.
- `/data`: An example script for testing workflow execution.
- `check-for-updates.sh`: A simple script that restarts the platform if the config in this repo changes. It should be executed as a cron job.
- `example-workflow-cpu.yaml`: Example workflow that can be submitted to the platform to test the CPU capability. See the steps below.
- `example-workflow-gpu.yaml`: Example workflow that can be submitted to the platform to test the GPU capability. See the steps below.

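
Since `check-for-updates.sh` is meant to run as a cron job, a crontab entry for it might be built as in the following sketch. The checkout path `/opt/platform`, the five-minute schedule, and the helper name `cron_line` are assumptions made for illustration, not values from this repo. The sketch also assumes the script takes no arguments.

```shell
#!/bin/sh
# Hypothetical defaults: adjust to where this repo is checked out
# and how often the config should be checked.
REPO_DIR="${REPO_DIR:-/opt/platform}"
SCHEDULE="${SCHEDULE:-*/5 * * * *}"

# Print a crontab line that runs check-for-updates.sh on the schedule.
cron_line() {
    printf '%s %s/check-for-updates.sh\n' "$SCHEDULE" "$REPO_DIR"
}

# Show the entry. To install it, append it to the current crontab:
#   (crontab -l 2>/dev/null; cron_line) | crontab -
cron_line
```

Printing the line first, instead of installing it directly, makes it easy to review the entry before it changes the crontab of the main instance.
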
To try out everything, we recommend installing the platform on classic servers. To do so, follow these steps:
- Start three servers: a main instance with a large disk for data, a CPU instance for normal calculations, and a GPU instance for AI training.
- Run the Ansible script to prepare the instances and install everything. See the readme file in the `ansible` subdirectory.
- Submit your first workflow. To do so, copy the content of the `data` directory to `/data` on the main instance. Submit `example-workflow-cpu.yaml` to the main instance to test a CPU instance. It saves its output at `/data/test_out.txt`. `example-workflow-gpu.yaml` works similarly for GPU instances. You can submit a workflow via the UI (available on the main instance at port 8080) or via the command line: `curl -X POST http://<MAIN_IP>:8080/workflows --data-binary @example-workflow-gpu.yaml`
- Read the Steep documentation at https://steep-wms.github.io/ to learn more about the workflow management system used by the platform.
- Add custom services. Set up a GitLab repo, put your Dockerfile in it, specify a `.gitlab-ci.yml` to build your image after changes, and generate a deployment key. This key has to be added to the platform. See `ansible/auth.yml` for the Docker credentials. The `/my-custom-service-template` directory contains a template for a new service repo.
- Write fancy workflows with loops etc. See https://steep-wms.github.io/#feeding-results-back-into-the-workflow-cyclesloops for an example.
- Write custom plugins for Steep. They can be placed in the `conf/plugins` directory.
- Enable cloud integration. If you specify OpenStack credentials when starting the main instance, it can launch VMs and destroy them after usage.
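
The command-line submission from the workflow step above can be wrapped in a small helper, sketched here. Port 8080 and the `/workflows` endpoint come from the `curl` command in that step. The function names `workflow_url` and `submit_workflow`, the `MAIN_IP` default, and the use of `--fail` are assumptions added for illustration.

```shell
#!/bin/sh
# MAIN_IP is a placeholder for the address of the main instance.
MAIN_IP="${MAIN_IP:-127.0.0.1}"

# Build the submission URL for a given host.
workflow_url() {
    printf 'http://%s:8080/workflows\n' "$1"
}

# POST a workflow file to the platform. Returns non-zero if the file
# is missing or the server answers with an HTTP error status (--fail).
submit_workflow() {
    file="$1"
    if [ ! -f "$file" ]; then
        echo "workflow file not found: $file" >&2
        return 1
    fi
    curl --fail -X POST "$(workflow_url "$MAIN_IP")" --data-binary "@$file"
}
```

After sourcing the script and setting `MAIN_IP`, `submit_workflow example-workflow-cpu.yaml` followed by `submit_workflow example-workflow-gpu.yaml` would submit both example workflows in turn. Checking for the file before calling `curl` gives a clearer error than a failed upload.
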