
Bring your own tool

Laia Codó edited this page Jun 15, 2021 · 14 revisions

Virtual Research Environments integrate tools and pipelines to support a research community. We offer the possibility to integrate your application into one of these analytical platforms. Please read through this documentation and contact us with any questions or suggestions you may have.

Why?

The open Virtual Research Environment offers a number of benefits to developers willing to integrate their tools into the platform:

  • A publicly available, open-access platform
  • A full web-based workbench with user support utilities, OIDC authentication service, and data handlers for local and remote datasets or repositories.
  • Visibility for your tool and organization, with ownership recognition, tailored web forms, help pages and customized viewers.
  • The possibility to add extra value to your tool by complementing it with other related tools already in the platform.
  • Complete control of your tool through the administration panel, for monitoring, logging and management.

Requirements

The application or pipeline to be integrated should:

  • Be free and open-source software
  • Run in non-interactive mode on a Linux-based operating system:
    • Ubuntu 18.04
    • Ubuntu 20.04
    • CentOS 8 Stream
    • Contact us about other operating systems

How does it work?

Integrating your application as a new VRE tool takes a few steps. Once they are completed, the VRE is able to control the whole tool execution cycle:

  1. Preparation of the run web form, where the user specifies the arguments and input files for each run
  2. Validation of the input file and argument requirements
  3. Stage-in of the input files to the run working directory (if required)
  4. Execution of the tool in the cloud in a scalable manner
  5. Monitoring and interactive reporting of the tool's progress during the execution
  6. Stage-out of the output files from the run working directory (if required)
  7. Registration of the resulting output files at the VRE, so that they can be displayed there

How to bring in a new tool?

Essentially, the VRE needs two elements: (1) your application or workflow wrapped within a VRE RUNNER, and (2) metadata annotating it (e.g. input file requirements, descriptions). The following steps describe how to achieve this.

  • Step 1: Define your tool's input and output files. Build job execution files for testing the RUNNER.
  • Step 2: Prepare a new VRE RUNNER wrapping your application.
  • Step 3: Annotate and submit the new VRE tool.
  • Step 4: Test and debug the new tool from the VRE user interface.
  • Step 5: (optional) Prepare a web page to display a summary report on each execution.
  • Step 6: Provide documentation for the new tool.

STEP 1: Pre-define your tool to build a test set of VRE job execution files

VRE job execution files are two JSON files. In production, the VRE server generates them for each execution that a user starts from the web interface. The RUNNER wrapping your application consumes this data.

| VRE job execution file | Description |
|---|---|
| Run configuration file (e.g. config.json) | Contains the list of input files selected by the user for a particular run, the values of the arguments, and the list of expected output files. |
| Input files' metadata file (e.g. in_metadata.json) | Contains the metadata of the input files listed in config.json, including information such as the absolute file path. |
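
As a rough illustration of the two files described above, the following Python sketch builds a pair of toy documents and writes them to disk. Apart from the output_files[].file.path layout and the file names config.json and in_metadata.json, every key and value here (input_files, arguments, _id, file_path, the "uppercase" argument) is an assumption made for illustration. The authoritative field names are those of the JSON schemas of your VRE project.

```python
import json
import tempfile
from pathlib import Path

# All field names below are illustrative assumptions, not the official schema;
# check them against the JSON schemas of your VRE project.


def build_job_files(work_dir, input_path):
    """Return (config, in_metadata) dictionaries for one hypothetical test run."""
    config = {
        # Input files selected by the user, referenced by an identifier.
        "input_files": [{"name": "input_text", "value": "file_001"}],
        # Argument values chosen in the run web form.
        "arguments": [{"name": "uppercase", "value": True}],
        # Output files the run is expected to produce.
        "output_files": [
            {"name": "result", "file": {"path": str(Path(work_dir) / "result.txt")}}
        ],
    }
    in_metadata = [
        # Metadata of every input listed in config, including its absolute path.
        {"_id": "file_001", "file_path": str(Path(input_path).resolve())}
    ]
    return config, in_metadata


def write_job_files(target_dir, config, in_metadata):
    """Write both documents as config.json and in_metadata.json."""
    target = Path(target_dir)
    (target / "config.json").write_text(json.dumps(config, indent=2))
    (target / "in_metadata.json").write_text(json.dumps(in_metadata, indent=2))
    return target / "config.json", target / "in_metadata.json"


# Demo: generate both files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    cfg, meta = build_job_files(tmp, Path(tmp) / "sample.txt")
    cfg_file, meta_file = write_job_files(tmp, cfg, meta)
    print(cfg_file.name, meta_file.name)
```

The point of the sketch is the cross-reference: each input listed in the run configuration is resolved through the metadata file, which is where the absolute path lives.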

Additionally, it is handy to have a shell script containing the command line that executes the RUNNER (e.g. test.sh). The two previous files are passed in as arguments.
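
To make the idea of passing both files as arguments concrete, here is a minimal Python sketch of a RUNNER entry point. The flag names --config and --in_metadata are assumptions chosen for this example, as are the function names. Check the RUNNER template for the options it actually accepts.

```python
import argparse
import json
import tempfile
from pathlib import Path


def parse_args(argv):
    """Parse the RUNNER command line; the flag names are assumptions."""
    parser = argparse.ArgumentParser(description="Hypothetical VRE RUNNER entry point")
    parser.add_argument("--config", required=True, help="run configuration file")
    parser.add_argument("--in_metadata", required=True, help="input files' metadata file")
    return parser.parse_args(argv)


def load_job_files(args):
    """Load both job execution files into Python objects."""
    config = json.loads(Path(args.config).read_text())
    in_metadata = json.loads(Path(args.in_metadata).read_text())
    return config, in_metadata


# Demo: write two tiny placeholder files and pass them the way a test script would.
with tempfile.TemporaryDirectory() as tmp:
    cfg_path = Path(tmp) / "config.json"
    meta_path = Path(tmp) / "in_metadata.json"
    cfg_path.write_text(json.dumps({"arguments": []}))
    meta_path.write_text(json.dumps([]))
    args = parse_args(["--config", str(cfg_path), "--in_metadata", str(meta_path)])
    config, in_metadata = load_job_files(args)
    print(config, in_metadata)
```

A test script would then reduce to a single line that calls the RUNNER with the two file paths, which makes each test run easy to repeat.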

Defining the input files and arguments that your tool will consume is essential for building the VRE job execution files. There are two ways of creating them:

  • Manual approach:

    Manually generate the two files following the corresponding JSON schemas, taking some examples as reference.

  • VRE web interface approach:

    Use the tool developer's admin panel to create these files. The user interface includes web forms for editing and validating a JSON document that gathers data about the input files and arguments. If you provide data about your local development environment (e.g. working directories or the location of test input files), the VRE will generate a config.json and an in_metadata.json for download.

    • Where: in the left navigation menu, Admin → My Tools → Development → (+) Add new tool
    • Requirements: user account with "tool developer" rights

Note:
The schemas are adapted to each VRE project. If the list of accepted values for data-type or file-type does not cover your use case, just contact us and we will extend the supported metadata.


STEP 2: Prepare a VRE RUNNER wrapping your application

VRE RUNNERs are pieces of code that work as adapters between the VRE server and each of the integrated applications or pipelines. Ultimately, the RUNNER should:

  1. Consume the VRE job execution files that are generated when a user submits a new job from the web interface,
  2. Run the wrapped application or pipeline locally, and
  3. Generate a list of output files, which the VRE server uses to register and display the files in the user's workspace.

The easiest way to prepare the RUNNER is to take a RUNNER template as reference and use it as a skeleton to wrap your own application. The template includes a couple of Python classes that parse the VRE job execution files and load them into Python dictionaries. It also includes a method that you can customize to call the application, module or pipeline to be integrated.
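
The following Python sketch shows the general shape of such a customizable method. It is not the template's code: the class SketchTool, its constructor, and the build_command helper are hypothetical names, and the "application" is a Python one-liner that simply prints its arguments so that the example runs anywhere.

```python
import subprocess
import sys


class SketchTool:
    """Hypothetical stand-in for the template's tool class (not its real API)."""

    def __init__(self, arguments):
        # Argument values as they would come from the run configuration file.
        self.arguments = arguments

    def build_command(self, input_files):
        """Translate inputs and arguments into the wrapped application's command line."""
        # The "application" here only echoes its arguments;
        # replace it with your own executable.
        script = "import sys; print(' '.join(sys.argv[1:]))"
        command = [sys.executable, "-c", script, input_files["input_text"]]
        if self.arguments.get("uppercase"):
            command.append("--uppercase")
        return command

    def run(self, input_files, output_files):
        """Execute the wrapped application non-interactively and report its outputs."""
        result = subprocess.run(
            self.build_command(input_files),
            capture_output=True, text=True, check=True,
        )
        # Return the output files so the caller can register them.
        return output_files, result.stdout.strip()


tool = SketchTool({"uppercase": True})
outputs, stdout = tool.run({"input_text": "/data/sample.txt"}, {"result": "result.txt"})
print(stdout)
```

Keeping the command construction in its own method makes it possible to test the mapping from job parameters to command-line options without launching the application.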

Step-by-step

  1. Fork or clone the repository of the RUNNER template in your local development environment.

    RUNNER template repository
    https://github.com/inab/vre_template_tool documentation
  2. (optional) Run the hello_word example. The RUNNER template is initially configured to "wrap" an application called hello.py. It demonstrates the overall flow of a VRE RUNNER.

  3. Include your own job execution files in the repository. You can copy the JSON files generated in STEP 1 into the test/ folder of the repository to replace the basic hello_word example. They should contain the input files and arguments for a test execution of your tool. You can try again to run the RUNNER as above, but now it's going to fail, as the RUNNER is still expecting the arguments and the input files of the hello_word example.

    Make sure that the absolute paths of the working directory and the input files defined in these JSON files are accessible.

  4. Implement the run method of VRE_Tool so that it executes the application, module or pipeline to be integrated. The method receives as parameters the input file locations and argument values defined in the job execution files.

  5. The RUNNER is ready when the wrapped application executes properly and the output files are generated at the locations specified in output_files[].file.path. These paths are usually defined in the config.json file. Alternatively, if the names and number of the output files cannot be known before the execution, you should extend the VRE_Tool.run method to set the file.path attribute in the output_files dictionary. The RUNNER will write it to the output files' metadata file (e.g. out_metadata.json).

    Make sure your output files are generated in the root of the working directory

  6. Save your RUNNER in a publicly available Git repository. As in the template RUNNER, document the installation and include some test datasets. The documentation should also cover the installation of the wrapped application itself: extra modules, dependencies, libraries, etc. VRE administrators will eventually install this repository in the VRE cloud.
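
For the case in step 5 where the output files are only known after the execution, the sketch below shows one possible way to collect them from the root of the working directory and record their paths. Only the output_files[].file.path layout and the file name out_metadata.json come from this guide. The function names, the "name" key and the *.txt pattern are assumptions for illustration.

```python
import json
import tempfile
from pathlib import Path


def collect_outputs(work_dir, pattern="*.txt"):
    """List files produced in the root of the working directory."""
    return sorted(Path(work_dir).glob(pattern))


def write_out_metadata(work_dir, produced, out_name="out_metadata.json"):
    """Record each produced file under output_files[].file.path (assumed layout)."""
    document = {
        "output_files": [
            {"name": path.stem, "file": {"path": str(path)}} for path in produced
        ]
    }
    out_file = Path(work_dir) / out_name
    out_file.write_text(json.dumps(document, indent=2))
    return out_file


with tempfile.TemporaryDirectory() as tmp:
    # Pretend the wrapped application produced files whose names were not known in advance.
    for name in ("part_a.txt", "part_b.txt"):
        (Path(tmp) / name).write_text("demo\n")
    produced = collect_outputs(tmp)
    metadata = json.loads(write_out_metadata(tmp, produced).read_text())
    print([entry["name"] for entry in metadata["output_files"]])
```

The glob is deliberately non-recursive, which matches the advice to generate output files in the root of the working directory.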


STEP 3: Annotate and submit the new VRE tool

Once the RUNNER successfully executes the application in your local development environment, it is time to request registration of the new tool on the corresponding VRE server. To do so, some descriptive metadata on the new application is required, e.g. tool descriptions and titles, ownership, references, keywords, etc.
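
Before submitting, it can help to check that a draft annotation covers the descriptive fields listed above. The Python sketch below does this for a toy document. The key names (title, description, owner, references, keywords) and the sample values are assumptions derived from that list, not the actual tool specification format.

```python
# Descriptive fields mentioned in this guide; the key names are assumptions.
REQUIRED_FIELDS = ("title", "description", "owner", "references", "keywords")


def missing_fields(tool_spec):
    """Return the descriptive fields that are absent or empty in a draft spec."""
    return [field for field in REQUIRED_FIELDS if not tool_spec.get(field)]


# A hypothetical draft that still lacks its references.
draft = {
    "title": "My tool",
    "description": "Counts words in a text file.",
    "owner": {"institution": "Example Lab", "contact": "dev@example.org"},
    "keywords": ["text"],
}
print(missing_fields(draft))
```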

Again, two approaches are supported:

  • Manual approach:

    Generate the tool specification file, taking some examples as reference, to fully annotate the new tool.

    Save your tool specification file in your repository and send everything together to the VRE administrators. They will validate the data and register the tool in the VRE.

  • VRE web interface approach:

    Go to the tool developer's administration panel and fill in the missing information for the tool entry generated in STEP 1 when preparing the job execution files.

    • Where:
      1. In the left navigation menu, go to Admin → My Tools → Development. Find the row corresponding to your tool in preparation.
      2. Fill in the last two columns:
        • bring us your code: URL of your RUNNER's repository created in STEP 2
        • Define Tool: edit the template JSON document online. Provide a title for your tool, a description, etc. All this information will be displayed to the user in the web application.
      3. Click the Submit button. It will send an email to the VRE administrators, who will validate the data and register the tool in the VRE.

After approval, the tool will be accessible on the web application in test mode, i.e., only tool developers and administrators will be able to find and run the tool at the VRE.


STEP 4: Test and debug the new tool from the VRE


STEP 5: Prepare a custom report viewer for each execution


STEP 6: Provide documentation for the new tool

  • Tool logo: minimum resolution of 400 px × 400 px
  • Sample datasets
  • Help pages
