Port 2D layout algorithm to FlatGFA

This project will port the SGD-based graph visualization strategy used in odgi to FlatGFA. This is a computationally interesting approach to visualizing _large_ pangenome graphs (unlike #222, which is a simpler strategy for visualizing small ones). It produces bird's-eye pictures that help in understanding the structure of variation graphs, like this one from [the odgi docs](https://pangenome.github.io/odgi.github.io/rst/tutorials/sorting_layouting.html#drawing-the-2d-layout-of-the-drb1-3123-graph):

<img src="https://pangenome.github.io/odgi.github.io/_images/DRB1-3123_unsorted.og.lay.png" width="400">

The SGD-based approach to graph layout has been fairly well studied in the Panorama group, so there are several things to build on:

* The original `odgi layout` [source code](https://github.com/pangenome/odgi/blob/master/src/algorithms/sgd_layout.cpp).
* Jiajie's [GPU implementation](https://github.com/tonyjie/gpu_pangenome_layout) of the algorithm and its associated [PyTorch reference implementation](https://github.com/tonyjie/gpu_pangenome_layout/blob/master/torch_baseline/torch_graph_layout.py).
* The [extracted PGSGD kernel](https://github.com/UM-mbit/pangenomicsBench/tree/master/Pgsgd) from the PangenomicsBench benchmark suite.

Our goal in this project is to be as lazy as possible, i.e., to reuse as much as possible. Wherever feasible, we will reuse the actual implementation of SGD-based graph layout and just change the _data source_ to FlatGFA. We will produce the same layout file format that odgi uses, meaning that we can reuse `odgi draw` to actually produce the final images.

To that end, we will need to pick one of several approaches:

* Use the PyTorch reference implementation. This means figuring out exactly where that tool's data comes from, and then exposing enough of the graph data through the FlatGFA Python API that we can load it from there instead.
* Use Jiajie's high-performance GPU implementation. This will require building a new C API to FlatGFA. Then we can link it as a library into a project that uses CUDA to do the hard work.
* Use odgi's original CPU implementation. This similarly requires creating that C API.
* As a last resort, reimplement the algorithm in Rust. Maybe we can use an off-the-shelf crate for the core SGD loop, or maybe we just implement that ourselves too.
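
For a sense of how small the core loop is (relevant to the last option), here is a rough sketch of a path-guided SGD layout in plain Python. It is a simplification under stated assumptions and not odgi's actual implementation: it keeps one point per segment instead of two node endpoints, samples step pairs uniformly, and uses a plain exponential learning-rate decay. The function name and input shapes (`paths` as lists of segment ids, `seg_len` as a dict) are made up for illustration and are not part of any FlatGFA API.

```python
import math
import random


def sgd_layout_sketch(paths, seg_len, iters=50, samples_per_iter=200, seed=0):
    """Toy path-guided SGD layout (hypothetical helper, not odgi's code).

    paths:   list of paths, each a list of segment ids
    seg_len: dict mapping segment id -> sequence length in base pairs
    Returns a dict mapping segment id -> [x, y].
    """
    rng = random.Random(seed)
    # Simplification: one 2D point per segment, randomly initialized.
    pos = {s: [rng.random(), rng.random()] for s in sorted(seg_len)}

    # Nucleotide offset of each step along its path.
    offsets = []
    for path in paths:
        acc, off = 0, []
        for seg in path:
            off.append(acc)
            acc += seg_len[seg]
        offsets.append(off)

    usable = [k for k, p in enumerate(paths) if len(p) >= 2]
    if not usable:
        return pos

    # Learning rate decays exponentially from d_max^2 down to a small constant.
    d_max = max(1, max(offsets[k][-1] for k in usable))
    eta_max, eta_min = float(d_max) ** 2, 0.01

    for t in range(iters):
        eta = eta_max * (eta_min / eta_max) ** (t / max(iters - 1, 1))
        for _ in range(samples_per_iter):
            # Sample two distinct steps on the same path.
            k = rng.choice(usable)
            i, j = rng.sample(range(len(paths[k])), 2)
            a, b = paths[k][i], paths[k][j]
            d = abs(offsets[k][i] - offsets[k][j])  # target distance
            if a == b or d == 0:
                continue
            # Weight each pair by 1/d^2 and cap the step size at 1.
            mu = min(eta / (d * d), 1.0)
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            mag = math.hypot(dx, dy) or 1e-9
            # Move both points half the (scaled) error along their connecting line.
            r = mu * (mag - d) / 2.0 / mag
            pos[a][0] -= r * dx
            pos[a][1] -= r * dy
            pos[b][0] += r * dx
            pos[b][1] += r * dy
    return pos
```

Calling `sgd_layout_sketch([[1, 2, 3]], {1: 10, 2: 10, 3: 10})` should place the three segments roughly on a line with about 10 units between neighbors. The fixed `seed` makes runs repeatable, which matters for testing.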

Here's a step-by-step plan:

* [ ] Evaluate the above options and pick one.
* [ ] Do that, i.e., implement FlatGFA-based graph layout.
* [ ] Add some tests. Differential testing with odgi is probably impractical (because this is a randomized algorithm), but we can at least take some snapshots.
* [ ] Compare performance.
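
For the snapshot tests, one possible shape is to fix the random seed, dump the coordinates to a stable text form, and diff that against a saved file. The sketch below is only an illustration: the tab-separated format and the `layout_snapshot` helper are hypothetical, not odgi's layout format or an existing FlatGFA utility.

```python
def layout_snapshot(pos, digits=3):
    """Serialize a layout (dict of id -> (x, y)) to deterministic text.

    Hypothetical helper: sorts by id and rounds coordinates so that tiny
    floating-point differences do not invalidate the snapshot.
    """
    lines = []
    for seg in sorted(pos):
        x, y = pos[seg]
        lines.append(f"{seg}\t{x:.{digits}f}\t{y:.{digits}f}")
    return "\n".join(lines) + "\n"


# Example: insertion order does not affect the output.
example = layout_snapshot({2: (1.23456, -0.5), 1: (0.0, 10.0)})
```

Rounding trades sensitivity for stability: it only helps if the layout itself is run with a fixed seed and a deterministic (single-threaded) update order.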
