Bring your own tool
Virtual Research Environments integrate tools and pipelines to support a research community. We offer the possibility to integrate your application in one of these analytical platforms. Please read through this documentation and contact us with any questions or suggestions you may have.
The open Virtual Research Environment offers a number of benefits for developers willing to integrate their tools in the platform:
- Open access platform publicly available
- A full web-based workbench with user support utilities, OIDC authentication service, and data handlers for local and remote datasets or repositories.
- Visibility for your tool and organization, with ownership recognition, tailored web forms, help pages and customized viewers.
- The possibility to add extra value to your tool by complementing it with other related tools already in the platform.
- Complete control of your tool through the administration panel, for monitoring, logging and management.
The application or pipeline to be integrated should:
- Be free and open source code
- Run in non-interactive mode on a Linux-based operating system:
- Ubuntu 18.04
- Ubuntu 20.04
- CentOS 8 Stream
- Contact us about other operating systems
Integrating your application as a new VRE tool takes a few steps. As a result, the VRE is able to control the whole tool execution cycle:
- Preparation of the run web form where the user will specify arguments and input files for each run.
- Validation of input file and argument requirements
- Stage-in input files to the run working directory (if required)
- Execution of the tool in the cloud in a scalable manner
- Monitoring and interactively reporting tool progress during the execution
- Stage-out output files from the run working directory (if required)
- Registration at the VRE of the resulting output files to display them at the VRE
Essentially, the VRE needs two elements: (1) your application or workflow wrapped within a VRE RUNNER, and (2) metadata annotating it (e.g. input file requirements, descriptions). The following steps describe how to provide them.
- Step 1 Define your tool's input and output files. Build job execution files for testing the RUNNER.
- Step 2 Prepare a new VRE RUNNER wrapping your application.
- Step 3 Annotate and submit the new VRE tool
- Step 4 Test and debug the new tool from the VRE user interface
- Step 5 (optional) Prepare a web page to display a summary report on each execution
- Step 6 Provide documentation for the new tool
VRE job execution files are two JSON files. In production, these will be generated by the VRE server on each execution initiated by the user at the web interface. This data should be consumed by the RUNNER wrapping your application.
| VRE job execution files | Description |
|---|---|
| Run configuration file, i.e. `config.json` | Contains the list of input files selected by the user for a particular run, the values of the arguments, and the list of expected output files. |
| Input files' metadata file, i.e. `in_metadata.json` | Contains the metadata of the input files listed in `config.json`, including information such as the absolute file path. |
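To make the relationship between the two files more concrete, the sketch below builds a minimal pair of job execution documents and resolves each input file listed in the run configuration to its absolute path. All key names used here (`input_files`, `arguments`, `output_files`, `name`, `value`, `_id`, `file_path`) and all values are illustrative assumptions, not the actual VRE schema; the published JSON schemas remain the authoritative reference.

```python
import json

# Hypothetical run configuration: selected inputs, argument values, expected outputs.
CONFIG_JSON = """
{
  "input_files": [{"name": "samples", "value": "file_001"}],
  "arguments": [{"name": "threshold", "value": "0.5"}],
  "output_files": [{"name": "report", "file": {"path": "report.txt"}}]
}
"""

# Hypothetical input metadata: one entry per input file referenced above.
IN_METADATA_JSON = """
[{"_id": "file_001", "file_path": "/data/run001/samples.csv"}]
"""


def resolve_inputs(config, in_metadata):
    """Map each input name in the configuration to its absolute path."""
    paths_by_id = {entry["_id"]: entry["file_path"] for entry in in_metadata}
    return {item["name"]: paths_by_id[item["value"]] for item in config["input_files"]}


def read_arguments(config):
    """Collect argument values into a plain name -> value dictionary."""
    return {item["name"]: item["value"] for item in config["arguments"]}


config = json.loads(CONFIG_JSON)
in_metadata = json.loads(IN_METADATA_JSON)
inputs = resolve_inputs(config, in_metadata)
arguments = read_arguments(config)
print(inputs)
print(arguments)
```

Keeping the lookup in small functions like these makes it easy to swap in the real key names once the schemas for your VRE project are known.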
Additionally, it is handy to have a shell script containing the command line that executes the RUNNER (e.g. test.sh). The two files above are passed in as arguments.
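As a rough illustration of how a RUNNER might receive the two files as arguments, the following sketch parses two command-line options and loads both JSON documents. The flag names `--config` and `--in_metadata` are assumptions made for this example; check the RUNNER template for the options it actually accepts.

```python
import argparse
import json
import os
import tempfile


def parse_args(argv=None):
    """Parse the paths of the two job execution files (flag names are assumptions)."""
    parser = argparse.ArgumentParser(description="Hypothetical RUNNER entry point")
    parser.add_argument("--config", required=True, help="run configuration file")
    parser.add_argument("--in_metadata", required=True, help="input files' metadata file")
    return parser.parse_args(argv)


def load_job_files(args):
    """Load both JSON documents into Python objects."""
    with open(args.config) as handle:
        config = json.load(handle)
    with open(args.in_metadata) as handle:
        in_metadata = json.load(handle)
    return config, in_metadata


# Demonstration with throwaway files; a test script would pass real paths instead.
workdir = tempfile.mkdtemp()
config_path = os.path.join(workdir, "config.json")
metadata_path = os.path.join(workdir, "in_metadata.json")
with open(config_path, "w") as handle:
    json.dump({"arguments": []}, handle)
with open(metadata_path, "w") as handle:
    json.dump([], handle)

args = parse_args(["--config", config_path, "--in_metadata", metadata_path])
config, in_metadata = load_job_files(args)
print(config, in_metadata)
```

With an entry point of this shape, the shell script reduces to a single line that calls the RUNNER with the paths of the two test files.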
Defining the input files and arguments that your tool will consume is essential for building the VRE job execution files. There are two ways of creating them:
-
Manual approach:
Manually generate the two files following the corresponding JSON schemas, taking some examples as reference.
- Examples:
- template RUNNER: config.json & in_metadata.json
- dpfrep RUNNER (example of an R-based tool): config.json & in_metadata.json
- JSON schemas:
- euCanSHare VRE tool schemas: tool schemas
-
VRE web interface approach:
Use the tool developer's admin panel to create these files. The user interface includes web forms that allow you to edit and validate a JSON document describing the input files and arguments. If you provide data about your local development environment (e.g. working directories or the location of test input files), the VRE will generate a `config.json` and an `in_metadata.json` for download.
- Where: in the left navigation menu, Admin → My Tools → Development → (+) Add new tool
- Requirements: user account with "tool developer" rights
Note:
The schemas are being adapted to each VRE project. If the list of accepted values for `data-type` or `file-type` does not cover your use case, just contact us and we will extend the supported metadata.
VRE RUNNERs are pieces of code that work as adapters between the VRE server and each of the integrated applications or pipelines. Eventually, the RUNNER should:
- Consume the VRE job execution files that will be generated when a user submits a new job from the web interface,
- Run the wrapped application or pipeline locally, and
- Generate a list of output files, which the VRE server will use to register and display the files in the user's workspace.
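The three responsibilities listed above can be sketched as three small functions. This is a minimal illustration under stated assumptions, not the template's implementation: the key names in the job documents are hypothetical, and the wrapped application is simulated by a one-line Python program that counts the lines of its input.

```python
import os
import subprocess
import sys
import tempfile


def consume_job_files(config, in_metadata):
    """Step 1: extract the input path and arguments (key names are assumptions)."""
    paths_by_id = {entry["_id"]: entry["file_path"] for entry in in_metadata}
    input_path = paths_by_id[config["input_files"][0]["value"]]
    arguments = {item["name"]: item["value"] for item in config["arguments"]}
    return input_path, arguments


def run_application(input_path, arguments, output_path):
    """Step 2: run the wrapped application locally, in non-interactive mode."""
    # Stand-in application: counts input lines and writes "<label> <count>".
    program = (
        "import sys; n = sum(1 for _ in open(sys.argv[1])); "
        "open(sys.argv[2], 'w').write(sys.argv[3] + ' ' + str(n))"
    )
    subprocess.run(
        [sys.executable, "-c", program, input_path, output_path, arguments["label"]],
        check=True,  # raise if the wrapped application exits with an error
    )


def list_outputs(output_path):
    """Step 3: report generated files so the server can register them."""
    return [{"name": "report", "file": {"path": output_path}}]


# Demonstration in a temporary working directory.
workdir = tempfile.mkdtemp()
input_path = os.path.join(workdir, "input.txt")
with open(input_path, "w") as handle:
    handle.write("a\nb\nc\n")

config = {
    "input_files": [{"name": "text", "value": "file_001"}],
    "arguments": [{"name": "label", "value": "lines"}],
}
in_metadata = [{"_id": "file_001", "file_path": input_path}]

resolved_input, arguments = consume_job_files(config, in_metadata)
output_path = os.path.join(workdir, "report.txt")
run_application(resolved_input, arguments, output_path)
outputs = list_outputs(output_path)
print(outputs)
```

Calling the application through `subprocess.run` with `check=True` keeps the RUNNER language-agnostic: the wrapped tool can be an R script, a binary or a pipeline, as long as it runs without user interaction.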
To prepare the RUNNER, the easiest option is to take a RUNNER template as reference and use it as a skeleton to wrap your own application. The template includes a couple of Python classes that parse and load the VRE job execution files into Python dictionaries. It also includes a method that you can customize to call the application, module or pipeline to be integrated.
Step-by-step
-
Fork or clone the repository of the RUNNER template in your local development environment.
RUNNER template: repository (https://github.com/inab/vre_template_tool) and documentation
-
(optional) Run the `hello_word` example. The RUNNER template is initially configured to "wrap" an application called `hello.py`. It demonstrates the overall flow of a VRE RUNNER.
-
Include your own job execution files in the repository. You can copy the JSON files generated in STEP 1 into the `test/` folder of the repository to replace the basic `hello_word` example. They should contain the input files and arguments for a test execution of your tool. If you run the RUNNER again as above, it will now fail, because the RUNNER still expects the arguments and input files of the `hello_word` example.
Make sure that the absolute paths of the working directory and the input files defined in these JSON files are accessible.
-
Implement the `run` method of the `VRE_Tool` so that it executes the application, module or pipeline to be integrated. The input file locations and argument values defined in the job execution files are passed in as the parameters of the `run` method.
-
The RUNNER is ready when the wrapped application executes properly and the output files are generated at the location specified in `output_files[].file.path`. These paths are usually defined in the `config.json` file. Alternatively, if the name and number of the output files cannot be known before the execution, you should extend the `VRE_Tool.run` method to set the `file.path` attribute in the `output_files` dictionary. The RUNNER will write it into the output files' metadata file (i.e. `out_metadata.json`).
Make sure your output files are generated in the root of the working directory.
-
Save your RUNNER in a publicly available Git repository. As with the template RUNNER, document the installation and include some test datasets, covering also the installation of the wrapped application itself: extra modules, dependencies, libraries, etc. VRE administrators will eventually install this repository in the VRE cloud.
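For the case where the output files are not known before execution, the sketch below shows one possible shape for a `run` method that discovers the generated files afterwards and records their paths. `ExampleTool`, its `run` signature, the `write_out_metadata` helper and the layout of the metadata document are all hypothetical stand-ins chosen for this illustration; the real `VRE_Tool` class in the template may differ.

```python
import glob
import json
import os
import tempfile


class ExampleTool:
    """Hypothetical stand-in for the template's VRE_Tool class."""

    def __init__(self, working_dir):
        self.working_dir = working_dir

    def run(self, input_files, arguments, output_files):
        """Run the application and fill in file.path for every output found."""
        # Stand-in application: write one chunk file per input file,
        # directly in the root of the working directory.
        for index, path in enumerate(sorted(input_files.values())):
            chunk = os.path.join(self.working_dir, "chunk_%d.txt" % index)
            with open(chunk, "w") as handle:
                handle.write(os.path.basename(path))
        # Outputs were not known beforehand, so discover them after execution.
        pattern = os.path.join(self.working_dir, "chunk_*.txt")
        for path in sorted(glob.glob(pattern)):
            output_files.append({"name": "chunks", "file": {"path": path}})
        return output_files


def write_out_metadata(output_files, working_dir):
    """Write the list of outputs to out_metadata.json in the working directory."""
    target = os.path.join(working_dir, "out_metadata.json")
    with open(target, "w") as handle:
        json.dump({"output_files": output_files}, handle, indent=2)
    return target


working_dir = tempfile.mkdtemp()
tool = ExampleTool(working_dir)
outputs = tool.run({"a": "/data/a.csv", "b": "/data/b.csv"}, {}, [])
metadata_path = write_out_metadata(outputs, working_dir)
print(metadata_path)
```

Discovering outputs with a glob pattern after the run keeps the method independent of how many files the application decides to produce.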
Once the RUNNER successfully executes the application in your local development environment, it is time to request registration of the new tool on the corresponding VRE server. To do so, some descriptive metadata on the new application is required, e.g. tool descriptions and titles, ownership, references, keywords, etc.
Again, two approaches are supported:
-
Manual approach:
Generate the tool specification file, taking some examples as reference, to fully annotate the new tool.
- JSON schemas:
- Examples:
- dpfrep RUNNER (example of an R-based tool): tool_specification.json
Save your tool specification file in your repository and send it all together to the VRE administrators. They will validate the data and register the tool in the VRE.
-
VRE web interface approach:
Go to the tool developer's administration panel and fill in the missing information for the tool entry generated in STEP 1 when preparing the job execution files.
- Where: in the left navigation menu, Admin → My Tools → Development. Find the row corresponding to your tool in preparation.
- Fill in the last two columns:
- bring us your code: URL of your RUNNER's repository created in STEP 2
- Define Tool: edit the template JSON document online. Provide a title for your tool, a description, etc. All this information will be displayed to the user on the web application.
- Click the Submit button. It will send an email to the VRE administrators, who will validate the data and register the tool in the VRE.
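Before sending the descriptive metadata to the administrators, a quick local check for missing fields can save a round trip. The sketch below is only an illustration: the field names (`title`, `description`, `owner`, `keywords`, `references`) are assumptions derived from the kinds of metadata mentioned above, and the example values are invented. The real required fields are defined by the VRE tool schemas.

```python
import json

# Hypothetical descriptive fields; the real set is defined by the VRE schemas.
REQUIRED_FIELDS = ("title", "description", "owner", "keywords", "references")


def missing_fields(specification):
    """Return the required fields that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not specification.get(name)]


# Invented example values for a tool that has no references filled in yet.
tool_specification = {
    "title": "My tool",
    "description": "Counts the lines of a text file.",
    "owner": {"institution": "Example Institute", "contact": "dev@example.org"},
    "keywords": ["text", "statistics"],
    "references": [],
}

print(missing_fields(tool_specification))
print(json.dumps(tool_specification, indent=2))
```

A check like this does not replace validation against the official schemas; it only catches fields left empty by mistake.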
After approval, the tool will be accessible on the web application in test mode, i.e., only tool developers and administrators will be able to find and run the tool at the VRE.
- Tool logo: minimum resolution of 400px x 400px
- Sample datasets
- Help pages