Ingestion of VRS-annotated VCF is producing conflicting allele when state == "" #221

@jsstevenson

Describe the bug

I noticed this while ingesting a VCF that had previously been annotated with VRS IDs and attributes. Part of the ingestion process attempts to recreate the original allele from the annotated parameters. For VCF variants where the annotated VRS ALT state is empty (".", i.e. a pure deletion), the ID of the recreated VRS allele does not match the annotated ID.

Steps to reproduce

For example, take the ALT from this record:

chrY    2840370 .   C   CGAGAGAGA   .   . AC=12;AC_Hemi=0;AC_Het=0;AC_Hom=12;AF=0.0165975;AN=723;VRS_Allele_IDs=ga4gh:VA.DjClXadsTSlpvZwfdqm-kwO78vasYuy4,ga4gh:VA.lMn5xCVH1wH_6CFGngryPHVkZq0IluGH;VRS_Starts=2840369,2840370;VRS_Ends=2840370,2840416;VRS_States=C,.
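To make it easier to see which annotated values belong to the ALT, here is a minimal sketch that splits the INFO column of the record above into per-allele entries. The helper names (`parse_info`, `vrs_annotations`) are hypothetical and not part of any library. It assumes the `VRS_*` lists are ordered REF first, then ALTs, and that "." stands for an empty state; both assumptions are inferred from this record rather than confirmed.

```python
# INFO column copied verbatim from the record above.
RECORD_INFO = (
    "AC=12;AC_Hemi=0;AC_Het=0;AC_Hom=12;AF=0.0165975;AN=723;"
    "VRS_Allele_IDs=ga4gh:VA.DjClXadsTSlpvZwfdqm-kwO78vasYuy4,"
    "ga4gh:VA.lMn5xCVH1wH_6CFGngryPHVkZq0IluGH;"
    "VRS_Starts=2840369,2840370;VRS_Ends=2840370,2840416;VRS_States=C,."
)


def parse_info(info: str) -> dict[str, str]:
    """Split a VCF INFO string into a key -> raw value mapping."""
    fields = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        fields[key] = value
    return fields


def vrs_annotations(info: str) -> list[dict]:
    """Zip the VRS_* lists into one dict per annotated allele."""
    fields = parse_info(info)
    ids = fields["VRS_Allele_IDs"].split(",")
    starts = [int(v) for v in fields["VRS_Starts"].split(",")]
    ends = [int(v) for v in fields["VRS_Ends"].split(",")]
    # Assumption: "." encodes an empty state (state == "").
    states = ["" if s == "." else s for s in fields["VRS_States"].split(",")]
    return [
        {"id": i, "start": s, "end": e, "state": st}
        for i, s, e, st in zip(ids, starts, ends, states)
    ]


annotations = vrs_annotations(RECORD_INFO)
# Assumption: index 0 describes REF, index 1 the first (and only) ALT.
alt_annotation = annotations[1]
print(alt_annotation)
```

Under those assumptions, the ALT entry carries the ID quoted under "Expected behavior", a start of 2840370, an end of 2840416, and an empty state.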

Expected behavior

The ID should be ga4gh:VA.lMn5xCVH1wH_6CFGngryPHVkZq0IluGH

Current behavior

Our code produces this allele (note the differing ID):

{'digest': 'xWSEycpHNOuAM23ZO9Fg_69uH_XAAyoC',
 'id': 'ga4gh:VA.xWSEycpHNOuAM23ZO9Fg_69uH_XAAyoC',
 'location': {'digest': 'SJwjlpriSSV9CRgpOnVtQzZfJkZ0Jsdu',
              'end': 2977218,
              'sequenceReference': {'refgetAccession': 'SQ.8_liLu1aycC0tPQPFmUaGXJLDs5SbPZ5',
                                    'type': 'SequenceReference'},
              'start': 2977180,
              'type': 'SequenceLocation'},
 'state': {'length': 0,
           'repeatSubunitLength': 38,
           'sequence': '',
           'type': 'ReferenceLengthExpression'},
 'type': 'Allele'}

Acceptance Criteria

The ID of the recreated allele matches the annotated ID in VRS_Allele_IDs.

Possible reason(s)

No response

Suggested fix

No response

Branch, commit, and/or version

main

Screenshots

No response

Environment details

main

Additional details

At first glance, I think the provided VRS coordinates/state look pretty weird given the ALT.
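As a rough sanity check of that impression, the sketch below compares the length change implied by the VCF REF/ALT pair with the one implied by the annotated VRS values, using only numbers copied from this issue. It treats the "." state as an empty sequence replacing the annotated span, which is an assumption, and it does not attempt to explain the mismatch.

```python
# Values copied from the VCF record and the recreated allele in this issue.
ref, alt = "C", "CGAGAGAGA"
annotated_start, annotated_end, annotated_state = 2840370, 2840416, ""
recreated_start, recreated_end = 2977180, 2977218

# Net length change implied by the VCF REF/ALT pair (positive = insertion).
vcf_net_change = len(alt) - len(ref)

# Net length change implied by the annotation, assuming the annotated span
# is replaced by the (empty) state sequence (negative = deletion).
annotated_span = annotated_end - annotated_start
annotated_net_change = len(annotated_state) - annotated_span

# Span of the location on the allele our code recreated.
recreated_span = recreated_end - recreated_start

print(vcf_net_change, annotated_net_change, annotated_span, recreated_span)
```

Read this way, the VCF alleles describe an 8-base insertion, while the annotation describes an empty state over a 46-base span. The recreated location covers 38 bases at different coordinates, which equals the `repeatSubunitLength` in the output above.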

Contribution

None

Labels: bug (Something isn't working)