arborstats

A small command-line tool to run a flattening pipeline (e.g., flatone) and/or compute arbor statistics for a list of neuron segment IDs
flatone automatically downloads the mesh of an EyeWire II neuron as .obj with CaveClient/CloudVolume, skeletonizes it as an .swc with skeliner and flattens it with pywarper
Segment IDs can come from the CLI, a Google Sheet (CSV export), or a local CSV or a list of Segment IDs

Features

Two-stage pipeline:
1. Run external flattener (e.g., flatone) per segment
2. Compute arbor statistics from emitted skeletons (SWC)
Flexible input: --segids, Google Sheet (--google-sheet-id), or CSV (--csv)
Schema from CLI: choose which columns to read and their dtypes (no hard-coding)
Smart recompute: --overwrite-all or --new-only
Mode control: run both, only flatone, or only arbor stats
Parallelism: -j/--jobs for multiprocessing
Robust output layout with per-segment folders and error markers

NOTE

flatone relies on SuiteSparse, which does NOT run on native Windows. Use it on Unix-like enviroment or Windows Subsystem for Linux (WSL 2) instead.

Instructions to install prerequisites for flatone
# prerequisites
## mac
brew update
brew install suite-sparse

## debian/ubuntu/WSL
sudo apt-get update
sudo apt-get install build-essential # if not already installed
sudo apt-get install libsuitesparse-dev

Installation Instructions for ArborStats

System prerequisites for flatone (e.g., SuiteSparse) still apply; follow the steps mentioned above.

# clone this repo
git clone [email protected]:Schwartz-Lab-NU/arborStats.git
cd arborStats

# install conda environment with Python=3.13
conda create -n arborstats python=3.13

# activate conda environment
conda activate arborstats

# installing dependencies using the following command
pip install -e .

# check if the installation worked
arborstats -h

Quickstart

Compute for specific segment IDs, writing outputs under ./out:

arborstats \
  --segids 720575940550176551 720575940550176552 \
  --output-dir ./out -j 2

Read IDs from a Google Sheet (public or you have access), with explicit schema:

arborstats \
  --google-sheet-id 1o4i53h92oyzsBc8jEWKmF8ZnfyXKXtFCTaYSecs8tBk \
  --read-columns "Status" "Final SegID" "Cell Requires Review" \
  --dtypes "Final SegID=Int64" "Status=string" "Cell Requires Review=string" \
  --segid-col "Final SegID" \
  --status-col "Status" --status-filter Complete "Complete (cut off)" \
  --cell-review-col "Cell Requires Review" --cell-review-filter FALSE \
  --output-dir ./out -j 8

Read IDs from a CSV:

arborstats \
  --csv data/segids.csv \
  --read-columns "SegID,Status" \
  --dtypes "SegID=Int64,Status=string" \
  --segid-col "SegID" \
  --status-col "Status" --status-filter Complete \
  --output-dir ./out

CLI

usage: arborstats (--segids ... | --google-sheet-id ... | --csv CSV) --output-dir PATH [options]

Input source (choose exactly one) — mutually exclusive
  --segids SEGID [SEGID ...]   One or more segment IDs
  --google-sheet-id ID         Google Sheet ID to read (CSV export URL is inferred)
  --csv PATH                   CSV path containing segment IDs

Schema controls
  --read-columns COL [COL ...]   Columns to read (space or comma separated)
  --dtypes COL=DTYPE [...]       Per-column dtypes (e.g., Final SegID=Int64, Status=string)

Column names & filters
  --segid-col NAME               Column containing segment IDs (default: "Final SegID")
  --status-col NAME              Status column name (default: "Status")
  --cell-review-col NAME         Cell-review column name (default: "Cell Requires Review")
  --status-filter ...            Values to include from the status column (default: Complete, "Complete (cut off)")
  --cell-review-filter ...       Values to include from the cell-review column (default: FALSE)
  --csv-col NAME                 (deprecated) overrides --segid-col when using --csv

Common
  --output-dir PATH              Root output directory (per-seg subfolders are created)
  -j, --jobs N                   Parallel workers (default: 1)

Overwrite policy — mutually exclusive
  --overwrite-all                Force recompute even if outputs exist
  --new-only                     Only compute when expected outputs are missing

Which tasks to run — mutually exclusive
  --flatone-arbor-stats-both     Run flatone and compute arbor stats (default)
  --arbor-stats-only             Compute only arbor stats (expects existing SWC)
  --flatone-only                 Run flatone only (skip arbor stats)

Output layout

out/
├─ 720575940550176551/
│  ├─ mesh.obj                      # required: mesh of an EyeWire II neuron
│  ├─ skeleton.swc                  # required: output from flatone
│  ├─ skeleton_warped.swc           # required: output from flatone
│  ├─ arbor_stats.pkl               # required: computed statistics
│  ├─ arbor_stats_error.txt         # present only if an error occurred
│  ├─ *                             # other files from flatone outputs (not so important)
├─ arbor_stats_error_seg_ids.txt    # segIDs which errored during arbor stats computation
├─ not_processed_seg_ids.txt        # segIDs skipped (e.g., no meshes found)

Name		Name	Last commit message	Last commit date
Latest commit History 22 Commits
.vscode		.vscode
arborstats		arborstats
.gitattributes		.gitattributes
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

arborstats

Features

Installation Instructions for ArborStats

Quickstart

CLI

Output layout

About

Uh oh!

Releases

Packages

Languages

License

Schwartz-Lab-NU/arborStats

Folders and files

Latest commit

History

Repository files navigation

arborstats

Features

Installation Instructions for ArborStats

Quickstart

CLI

Output layout

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages