Use Case: Record and Discover Derived Products #2

@mbjones

Use Case: Discover Derived Products

Goals and Summary

Through data processing, analysis, modeling, and visualization, researchers create derived products, including derived data sets, figures, tables, animations, and other artifacts. By recording provenance relationships among these derived products and their sources, we can preserve the dependency relationships needed to reproduce the science and enable discovery of data and products through those relationships. For example, with appropriate relationships (prov:wasGeneratedBy, prov:used), one can determine whether one product was derived from another and, by following the graph of such linkages, discover other analyses and products that were derived from the same source data sets.
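The graph-following idea above can be sketched in a few lines of Python. This is only an illustration, not a DataONE implementation: the identifiers (`data:raw`, `run:clean`, and so on), the in-memory triple list, and the `derived_products` helper are all hypothetical. It assumes the usual PROV reading in which an activity `prov:used` an input entity and an output entity `prov:wasGeneratedBy` that activity.

```python
from collections import defaultdict, deque

# Hypothetical provenance triples: (subject, predicate, object).
TRIPLES = [
    ("run:clean", "prov:used", "data:raw"),
    ("data:clean", "prov:wasGeneratedBy", "run:clean"),
    ("run:model", "prov:used", "data:clean"),
    ("fig:trend", "prov:wasGeneratedBy", "run:model"),
    ("run:summary", "prov:used", "data:raw"),
    ("table:summary", "prov:wasGeneratedBy", "run:summary"),
]


def build_index(triples):
    """Index the triples for downstream (source -> product) traversal."""
    used_by = defaultdict(set)    # entity -> activities that used it
    generated = defaultdict(set)  # activity -> entities it generated
    for subject, predicate, obj in triples:
        if predicate == "prov:used":
            used_by[obj].add(subject)
        elif predicate == "prov:wasGeneratedBy":
            generated[obj].add(subject)
    return used_by, generated


def derived_products(source, triples):
    """Return every product derived, directly or transitively, from source."""
    used_by, generated = build_index(triples)
    seen = set()
    queue = deque([source])
    while queue:
        entity = queue.popleft()
        # Hop entity -> activity that used it -> entity that activity generated.
        for activity in used_by[entity]:
            for product in generated[activity]:
                if product not in seen:
                    seen.add(product)
                    queue.append(product)
    return seen


if __name__ == "__main__":
    print(sorted(derived_products("data:raw", TRIPLES)))
```

Starting from the hypothetical `data:raw`, the traversal reaches the cleaned data set, the figure built from it, and the summary table produced by a separate analysis of the same source, which is the kind of discovery this use case describes.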

Why is it important and to whom?

  • To reproduce science, researchers need the ability to follow data derivation chains
  • Because researchers tend to cite only the proximate data used in a study, these provenance relationships allow researchers to get credit for the impact of upstream source data on downstream synthetic analyses
  • In a complex workflow, an error may be introduced in a raw product that was later used to create derived products. Data source citations allow one to trace from the source to its derived products and notify the appropriate researchers of the error.

Why hasn’t it been solved yet?

  • Provenance modeling languages have been in flux (e.g., PROV, OPM)
  • Few tools support capture of provenance information in a standard format
  • Data repositories usually lack provenance information, or record it only in natural language

An example diagram showing provenance relationships as envisioned by DataONE:

Provenance trace in ProvONE

Additional Information and Links
