Deletion-Insertion#
Deletion-Insertion (delins): a sequence change where, compared to a reference sequence, one or more nucleotides are replaced by one or more other nucleotides and which is not a substitution, inversion or conversion.
Syntax#
| single position | |
|---|---|
| Syntax | sequence_identifier ":" coordinate_type "." position "delins" sequence |
| Examples | |
| position range | |
| Syntax | sequence_identifier ":" coordinate_type "." range "delins" sequence |
| Examples | |
| Explanation of Symbols | |
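The two syntax rows above can be illustrated with a small parser. This is a minimal sketch, not an official HGVS grammar: the regular expression, the name `parse_delins`, and the returned dictionary keys are all assumptions made for illustration, and the pattern deliberately covers only plain integer positions with a literal DNA insertion.

```python
import re

# Assumption: simplified pattern for illustration only. It accepts plain
# integer positions and a literal DNA insertion; it does not cover offsets
# (c.123+4), pter/qter, nested references, or N[12]-style sequences.
DELINS_RE = re.compile(
    r"^(?P<seq_id>[^:]+):(?P<coord>[gcmno])\."
    r"(?P<start>\d+)(?:_(?P<end>\d+))?"
    r"delins(?P<inserted>[ACGT]+)$"
)


def parse_delins(description):
    """Return the parts of a simple delins description, or None if no match."""
    match = DELINS_RE.match(description)
    if match is None:
        return None
    start = int(match.group("start"))
    # A single position is treated as a range of length one.
    end = int(match.group("end")) if match.group("end") else start
    return {
        "seq_id": match.group("seq_id"),
        "coordinate_type": match.group("coord"),
        "start": start,
        "end": end,
        "inserted": match.group("inserted"),
    }


print(parse_delins("NM_004006.2:c.6775_6777delinsC"))
```

Because the pattern requires "delins" directly after the position or range, a description that spells out the deleted nucleotides (such as `delTinsGA`) is rejected by this sketch.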
Notes#
- by definition, when one nucleotide is replaced by one other nucleotide the change is a substitution.
- changes involving two or more consecutive nucleotides are described as deletion/insertion (delins) variants
- two variants separated by one or more nucleotides should be described individually and not as a "delins"
- exception: two variants separated by one nucleotide, together affecting one amino acid, should be described as a "delins"
    - NOTE: this prevents tools that predict the consequences of a variant from making conflicting and incorrect predictions of two different substitutions at one position (e.g. `c.235_237delinsTAT` (`p.Lys79Tyr`) versus `c.[235A>T;237G>T]` (`p.[Lys79*;Lys79Asn]`)).
    - NOTE: the SVD-WG has prepared a proposal to modify this recommendation (see SVD-WG010). The new proposal is: two variants that are separated by fewer than two intervening nucleotides (that is, not including the variants themselves) should be described as a single "delins" variant
- conversions, sequence changes where a range of nucleotides is replaced by a sequence from elsewhere in the genome, are described as a "delins". The previous format "con" is no longer used (see Community Consultation SVD-WG009)
- for all descriptions the most 3' position possible of the reference sequence is arbitrarily assigned to have been changed (3' rule)
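The rule for combining two nearby single-nucleotide variants can be sketched as a small decision function. This is an illustrative sketch only: the function names and the `proposed_rule` flag are hypothetical, and it assumes both positions are positive coding DNA positions (`c.1` and later), so that the codon number can be derived by integer division. Intronic and UTR positions are not handled.

```python
def codon_number(cds_position):
    """Codon (amino acid) number for a 1-based coding DNA position."""
    return (cds_position - 1) // 3 + 1


def describe_as_single_delins(pos_a, pos_b, proposed_rule=False):
    """Decide whether two single-nucleotide variants form one "delins".

    Current recommendation: consecutive nucleotides, or one intervening
    nucleotide when both variants affect the same amino acid.
    Proposed rule (SVD-WG010): fewer than two intervening nucleotides.
    """
    first, second = sorted((pos_a, pos_b))
    if first < 1 or first == second:
        raise ValueError("expected two distinct positive coding positions")
    intervening = second - first - 1
    if proposed_rule:
        return intervening < 2
    if intervening == 0:
        return True
    return intervening == 1 and codon_number(first) == codon_number(second)


# c.235 and c.237 both fall in codon 79, so they are combined.
print(describe_as_single_delins(235, 237))
```

Under the current recommendation, positions 236 and 238 stay separate because they fall in different codons; with `proposed_rule=True` the same pair would be combined.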
Examples#
- `NC_000023.11:g.32386323delinsGA`: a deletion of nucleotide `g.32386323` (a T, not described), replaced by nucleotides GA, changing `..CAGCTCTTT..` to `..CAGCGACTTT..`. The variant corresponds to `LRG_199t1:c.4661delinsTC` based on a coding DNA reference sequence.
    - NOTE: the recommendation is not to describe the variant as `NC_000023.11:g.32386323delTinsGA`, i.e. with the deleted nucleotide sequence included. That description is longer, contains redundant information, and increases the chance of making an error (e.g. `NC_000023.11:g.32386323delCinsGA`).
- `NM_004006.2:c.6775_6777delinsC`: a deletion of nucleotides `c.6775` to `c.6777` (GAG, not described), replaced by a C nucleotide, changing `..GGAAGAGTTGC..` to `..GGAACTTGC..`
    - NOTE: the recommendation is not to describe the variant as `NM_004006.2:c.6775_6777delGAGinsC`, i.e. with the deleted nucleotide sequence included. That description is longer, contains redundant information, and increases the chance of making an error (e.g. `NM_004006.2:c.6775_6777delGTGinsC`).
- `LRG_199t1:c.145_147delinsTGG` (`p.Arg49Trp`): a deletion replacing nucleotides `c.145` to `c.147` (CGC, not described) with TGG
    - NOTE: two variants separated by one nucleotide, together affecting one amino acid, should be described as a "delins", so the description `c.[145C>T;147C>G]` is not correct
- `LRG_199t1:c.9002_9009delinsTTT`: a deletion of nucleotides `c.9002` to `c.9009`, replaced by nucleotides TTT
- `LRG_199t1:c.850_901delinsTTCCTCGATGCCTG`: a deletion of nucleotides `c.850` to `c.901`, replaced by TTCCTCGATGCCTG
    - NOTE: parts of the inserted sequence "align" with the reference sequence, giving an alternative description like `c.[850_869del;874_881del;887_897del;901_902insG]`. The "delins" format is recommended: it is simpler and prevents software tools making incorrect predictions for the consequences at protein level.
- `NC_000002.12:g.pter_8247756delins[NC_000011.10:g.pter_15825266]`: nucleotides `g.pter` to `g.8247756` of chromosome 2 are deleted and replaced by nucleotides `g.pter` to `g.15825266` of chromosome 11: the derivative chromosome 2 from an unbalanced translocation between the short arms of chromosomes 2 and 11 (ISCN der(2)t(2;11)(p25.1;p15.2)). Example copied from Complex (HGVS/ISCN).
    - NOTE: balanced translocations (see Complex (HGVS/ISCN)) are described as two complementary "delins" variants.
- `NC_000022.10:g.42522624_42522669delins42536337_42536382`: conversion in exon 9 of the CYP2D6 gene replacing exon 9 nucleotides `g.42522624` to `g.42522669` with those of the 3' flanking CYP2D7P1 gene, nucleotides `g.42536337` to `g.42536382` from the same genomic reference sequence (NC_000022.10)
- `NC_000012.11:g.6128892_6128954delins[NC_000022.10:g.17179029_17179091]`: conversion replacing nucleotides `g.6128892` to `g.6128954` of the VWF gene (`NM_000552.3:c.3675-45_3692`) on chromosome 12 with nucleotides `g.17179029` to `g.17179091` of the VWFP1 pseudogene on chromosome 22
- `NM_000797.3:c.812_829delins908_925`: conversion replacing nucleotides `c.812` to `c.829` of the DRD4 gene with nucleotides `c.908` to `c.925` from the same reference sequence
- `NM_004006.2:c.812_829delinsN[12]`: nucleotides `c.812` to `c.829` have been deleted and replaced by 12 unknown nucleotides (`N[12]`)
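To make the effect of a "delins" on a sequence concrete, the sketch below applies one to a short string. It is an illustration under stated assumptions: `apply_delins` is a hypothetical helper, positions are 1-based and inclusive, and the inputs are the short sequence fragments quoted in the first two examples, numbered locally from 1 rather than with their real genomic or coding coordinates.

```python
def apply_delins(reference, start, end, inserted):
    """Replace reference[start..end] (1-based, inclusive) with `inserted`."""
    if not 1 <= start <= end <= len(reference):
        raise ValueError("range must lie within the reference sequence")
    if not inserted:
        raise ValueError("a delins needs at least one inserted nucleotide")
    # Python slices are 0-based and end-exclusive, hence start - 1 and end.
    return reference[:start - 1] + inserted + reference[end:]


# Fragment around g.32386323: the T at local position 5 is replaced by GA.
print(apply_delins("CAGCTCTTT", 5, 5, "GA"))

# Fragment around c.6775_6777: GAG at local positions 5-7 is replaced by C.
print(apply_delins("GGAAGAGTTGC", 5, 7, "C"))
```

The deleted nucleotides are never passed in, which mirrors the recommendation not to spell them out in the description.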
Discussion#
What is an "indel"?
The term "indel" is not used in HGVS nomenclature (see Glossary). The term is confusing, having different meanings in different disciplines.
Can I describe a GC to TG variant as a dinucleotide substitution (g.4GC>TG)?
No, this is not allowed. By definition a substitution changes one nucleotide into one other nucleotide (see Substitution). The change TGTGCCA to TGTTGCA should be described as g.4_5delinsTG, i.e. a deletion/insertion (delins).
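The distinction between a substitution and a "delins" can be sketched by comparing a reference and an observed sequence. This is a simplified illustration, not a full variant describer: `describe_change` is a hypothetical helper, positions are numbered locally from 1 with an assumed `g.` prefix, and the sketch does not check for inversions or conversions, apply the 3' rule, or handle pure deletions, insertions, or duplications.

```python
def describe_change(ref, alt):
    """Describe a single change as a substitution or a delins (local g. numbering)."""
    # Trim the shared prefix.
    p = 0
    while p < len(ref) and p < len(alt) and ref[p] == alt[p]:
        p += 1
    # Trim the shared suffix without running back into the prefix.
    s = 0
    while s < len(ref) - p and s < len(alt) - p and ref[-1 - s] == alt[-1 - s]:
        s += 1
    deleted = ref[p:len(ref) - s]
    inserted = alt[p:len(alt) - s]
    if not deleted or not inserted:
        raise ValueError("not a substitution or delins (outside this sketch)")
    start, end = p + 1, p + len(deleted)
    if len(deleted) == 1 and len(inserted) == 1:
        # One nucleotide replaced by one other nucleotide: a substitution.
        return f"g.{start}{deleted}>{inserted}"
    position = str(start) if start == end else f"{start}_{end}"
    return f"g.{position}delins{inserted}"


print(describe_change("TGTGCCA", "TGTTGCA"))
```

With these inputs the GC at local positions 4 and 5 is reported as a "delins" rather than as a dinucleotide substitution.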
Are there specific recommendations regarding the maximum number of unchanged nucleotides between two single nucleotide variants and whether the change is described as a "delins" or as two separate changes?
Yes, two variants separated by one or more nucleotides should preferably be described individually and not as a "delins" (unless they together affect one amino acid). Why? First, the two variants may have been reported (or might occur) individually. Second, sequence analysis pipelines will describe such variants individually, giving the problem that an overlap with the description of the combined variant ("delins" description) might be missed in the annotation step (database queries).
The BRCA1 coding DNA reference sequence from position c.2074 to c.2080 is ..CATGACA.. A variant frequently found in the population is ..CATAACA.. (c.2077G>A). In a patient I found the sequence ..CATATAACA.. Can I describe this variant as c.[2077G>A;2077_2078insTA]?
The correct description of this variant is NM_007294.3:c.2077delinsATA.
NOTE: the answer was modified, i.e. the addition "However, since the variant is likely a combination of two other variants it is acceptable to describe it as NM_007294.3:c.[2077G>A;2077_2078insTA]" was removed.