-
Notifications
You must be signed in to change notification settings - Fork 33
Add ndtypes from ndx-pose, ndx-photometry, ndx-fiber-photometry #1665
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
@satra pointed me to: I did not find a generic controlled vocabulary for techniques. Are you using https://github.com/BICCN/TMN/blob/main/templates/SPARC%20Modalities.csv ? |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #1665 +/- ##
==========================================
- Coverage 74.82% 74.81% -0.01%
==========================================
Files 84 85 +1
Lines 11693 11694 +1
==========================================
Hits 8749 8749
- Misses 2944 2945 +1
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
@rly - this is a lot less formal at this point. if you see the module list you will see how we were approaching the classification for that. the main thing this is being used for is the summary info in the dandiset and for the CLI to generate meaningful suffices when disambiguating. if you have suggestions for changes to the others, please make them. for example ndx-spectrum seems off, and incorrect. the modality field should be on alphabetical only without any dashes. i believe we did this by taking all the dandisets we had 4+ years ago and extending this list. |
| "PoseTraining": { | ||
| "module": "ndx-pose", | ||
| "neurodata_type": "PoseTraining", | ||
| "technique": "pose estimation technique", | ||
| "approach": "behavioral approach", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
on a more general note, i'm a bit confused about why pose training is a data type, as that doesn't seem to reflect behavior per se, and also why it is a pose estimation technique.
some of these are direct measurement techniques or estimation techniques, but this one seems odd.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PoseTraining is a group consisting of training frames and source videos. A training frame refers to a frame from a source video and an the locations of each node of 1 or more skeletons, often from human labeling. So PoseTraining isn't exactly a data type but indicative of the NWB file containing training data for pose estimation methods. Should I pick out a different data type to detect instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should consider data type as a broad umbrella of what data is generated from or used for (for example there is fourier analysis above). it could be associated with a "pose estimation training technique"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
got it. that makes sense
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rly here we have 2 extensions listed to assist with importing https://github.com/dandi/dandi-cli/blob/HEAD/dandi/metadata/nwb.py#L99 -- should we also add extension information into this 'registry' and use it there too? or may be it is no longer even needed?
yarikoptic
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreeing with @satra on avoiding - and also overall we should aim to make them shorter
Co-authored-by: Yaroslav Halchenko <[email protected]>
Co-authored-by: Yaroslav Halchenko <[email protected]>
Co-authored-by: Yaroslav Halchenko <[email protected]>
Co-authored-by: Yaroslav Halchenko <[email protected]>
|
It's probably worthwhile also to reindex all the neurodata types observed on DANDI and add those to the map, but this PR is a start and addresses the issue raised over Slack. |
|
there are errors from docker compose but also smth which is relevant to this PR to be addressed re @rly's
do we have enough extracted in @magland's lindi's to just grep through them? |
|
From https://neurosift.app/dandi > "Neurodata Types Search", you can see a dropdown menu of all of the parsed neurodata types. That is populated from parsing https://lindi.neurosift.org/dandi/neurodata_types_index.json.gz Generated by https://github.com/magland/neurosift-kerchunker/blob/main/workflow_scripts/create_neurodata_types_index.py I believe these come from the first 100 assets of each dandiset, only from dandisets where at least one of the first 20 assets is NWB, and only if LINDI files have been generated for the NWB files (and other edge case handling). |
|
I'll try having an AI generate the name {
"module": x,
"neurodata_type": x,
"technique": x,
"approach" x
}mappings tomorrow and see how it goes. |
|
NWB files with sorted single units in the NWB This confused a user when looking at https://dandiarchive.org/dandiset/001533/draft/files?location=sub-CSH-ZAD-001&page=1 which has NWB files with suffix "behavior" and has a Units table. They thought the data has only behavior and no ecephys data. |
|
I think it would be great to add annotation on having Also I would prefer to be more specific with them, since they are "quite derived" in a sense... what if we dedicate for them something like In BIDS it will likely be an analogous dedicated suffix (e.g. |
|
If an NWB file has both raw ecephys data and derived units, would the suffix then be |
I would lean towards I think |
|
minor note: we should avoid |
| "approach": "optogenetic approach", | ||
| }, | ||
| "OptogeneticSeries": { | ||
| "module": "ogen", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just a note: related issue by @TheChymera in BIDS:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are iterating on a better structure for standardizing optogenetic implant/site details, including all those mentioned by @TheChymera in https://github.com/rly/ndx-optogenetics
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's great. Attn @CodyCBakerPhD , ideally we should follow up on that issue in bids-standard and align to the efforts in ndx-optogenetics, so we have metadata appropriately exposed at BIDS level too.
| }, | ||
| "Spectrum": { | ||
| "module": "ndx-spectrum", | ||
| "module": "spectrum", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: I wonder if this is more to be described in _desc-
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: apparently this "module" is not used as far as I can see within dandi-cli :-/ so the destiny of such changes is not even clear. For the participation in suffix, we solely rely on get_neurodata_types_to_modalities_map which bases its map on path within .nwb file
here is currently produced map
print(json.dumps(get_neurodata_types_to_modalities_map(), indent=2))
{
"NWBMixin": "core",
"NWBContainer": "core",
"NWBDataInterface": "core",
"NWBData": "core",
"ScratchData": "core",
"MultiContainerInterface": "core",
"ProcessingModule": "base",
"TimeSeries": "base",
"Image": "base",
"ImageReferences": "base",
"Images": "base",
"Device": "device",
"ElectrodeGroup": "ecephys",
"ElectricalSeries": "ecephys",
"SpikeEventSeries": "ecephys",
"EventDetection": "ecephys",
"EventWaveform": "ecephys",
"Clustering": "ecephys",
"ClusterWaveforms": "ecephys",
"LFP": "ecephys",
"FilteredEphys": "ecephys",
"FeatureExtraction": "ecephys",
"IntracellularElectrode": "icephys",
"PatchClampSeries": "icephys",
"CurrentClampSeries": "icephys",
"IZeroClampSeries": "icephys",
"CurrentClampStimulusSeries": "icephys",
"VoltageClampSeries": "icephys",
"VoltageClampStimulusSeries": "icephys",
"ImageSeries": "image",
"IndexSeries": "image",
"ImageMaskSeries": "image",
"OpticalSeries": "image",
"GrayscaleImage": "image",
"RGBImage": "image",
"RGBAImage": "image",
"OpticalChannel": "ophys",
"ImagingPlane": "ophys",
"OnePhotonSeries": "ophys",
"TwoPhotonSeries": "ophys",
"CorrectedImageStack": "ophys",
"MotionCorrection": "ophys",
"ImageSegmentation": "ophys",
"RoiResponseSeries": "ophys",
"DfOverF": "ophys",
"Fluorescence": "ophys",
"OptogeneticStimulusSite": "ogen",
"OptogeneticSeries": "ogen",
"AnnotationSeries": "misc",
"AbstractFeatureSeries": "misc",
"IntervalSeries": "misc",
"DecompositionSeries": "misc",
"LabMetaData": "file",
"Subject": "file",
"NWBFile": "file",
"SpatialSeries": "behavior",
"BehavioralEpochs": "behavior",
"BehavioralEvents": "behavior",
"BehavioralTimeSeries": "behavior",
"PupilTracking": "behavior",
"EyeTracking": "behavior",
"CompassDirection": "behavior",
"Position": "behavior"
}
so we really need to reapproach this map here and see how to avoid duplication and unused specifications
|
I added a new neurodata_type_map.py file with a more complete mapping of neurodata types (as detected by neurosift, as described above). Before I integrate it, update tests, etc., please take a look and let me know what you think. It should probably be its own file because it's so big. Should it be a python dictionary? JSON? Also, what is the difference between the key and the value of "neurodata_type"? |
Added new neurodata types for image series and eye tracking metadata while removing deprecated types.
yarikoptic
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dandi/metadata/neurodata_typemap.py seems ot be note used, then what is it for?
type checking seems to fail.
I will move it to draft for now, but I would love for us to finalize and merge this notable improvement!!
|
@rly with an interest in physio recordings and thus potentially in https://github.com/BCM-Neurosurgery/ndx-wearables I would like to bring this over the finish line. Worse comes to worse we should do that while on @bendichter's couch in the booth at SfN! |
In response to a Slack thread, this PR adds mappings for some neurodata types from ndx-pose, ndx-photometry (deprecated), and ndx-fiber-photometry. I wasn't sure whether there would be issues with the same neurodata type name from two different modules being mapped. I also made up the approaches and techniques...
It also fixes the casing of "OptogeneticStimulusSIte" -> OptogeneticStimulusSite"