Substitution

Substitution: a sequence change where, compared to a reference sequence, one nucleotide is replaced by one other nucleotide.

Syntax

Syntax sequence_identifier ":r." position reference_nucleotide ">" new_nucleotide
Examples
  • NM_004006.3:r.123c>g
Explanation of Symbols
  • coordinate_type: the coordinate type, indicating the type of numbering used; r
  • new_nucleotide: the nucleotide substituting the existing one; g
  • position: the position of the nucleotide substituted; 123
  • reference_nucleotide: the nucleotide at this position in the reference sequence; c
  • sequence_identifier: the sequence identifier used; NM_004006.3
See also explanation of grammar used in HGVS Nomenclature.
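
As a rough illustration of the syntax above, the following Python sketch splits a description into the symbols listed under Explanation of Symbols. The pattern and the name parse_rna_substitution are hypothetical and not part of the nomenclature; the accession pattern is deliberately loose, and nucleotides are assumed to be lowercase a, c, g or u, as in the examples on this page.

```python
import re

# Hypothetical pattern for the RNA substitution syntax; an assumption of
# this sketch, not an official HGVS grammar.
RNA_SUBSTITUTION = re.compile(
    r"^(?P<sequence_identifier>[A-Za-z0-9_]+\.\d+)"  # e.g. NM_004006.3
    r":(?P<coordinate_type>r)\."                      # RNA coordinate type
    r"(?P<position>[-*]?\d+)"                         # 123, -14 or *41
    r"(?P<reference_nucleotide>[acgu])"
    r">"
    r"(?P<new_nucleotide>[acgu])$"
)


def parse_rna_substitution(description):
    """Return the named parts of an RNA substitution, or None if no match."""
    match = RNA_SUBSTITUTION.match(description)
    if match is None:
        return None
    parts = match.groupdict()
    # A substitution replaces one nucleotide by one OTHER nucleotide.
    if parts["reference_nucleotide"] == parts["new_nucleotide"]:
        return None
    return parts


print(parse_rna_substitution("NM_004006.3:r.123c>g"))
```

Descriptions that are not substitutions under this definition, such as r.76_77aa>ug or r.76a/g, are rejected (the function returns None).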

Notes

  • all variants should be described on the DNA level; descriptions on the RNA and/or protein level may be given in addition.
  • substitutions involving two or more consecutive nucleotides are described as deletion/insertions (delins) (see Deletion/insertion).
  • two substitutions separated by one or more nucleotides should be described individually and not as a "delins".
    • exception: two variants separated by one nucleotide, together affecting one amino acid, should be described as a "delins" (e.g., r.142_144delinsugg (p.Arg48Trp)).
      NOTE: this prevents tools that predict the consequences of a variant from making conflicting and incorrect predictions of two different substitutions at one position.
  • nucleotides that have been tested and found not changed are described as r.109u=, r.4567_4569= (see SVD-WG001 (no change)).
  • it is not correct to describe "polymorphisms" as r.76a/g (see Discussion).
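
The rules above on when two substituted nucleotides are described individually or as a "delins" can be sketched as follows. This is a minimal illustration under stated assumptions: both positions are positive positions inside the coding region (so r.1 is the first nucleotide of the initiation codon), and "together affecting one amino acid" is approximated as "both positions fall in the same codon". The function names are hypothetical.

```python
def codon_number(position):
    """1-based codon index for a 1-based position in the coding region."""
    return (position - 1) // 3 + 1


def describe_pair(pos_a, pos_b):
    """Return 'delins' or 'individual' for two substituted positions."""
    if pos_a == pos_b:
        raise ValueError("expected two different positions")
    first, second = sorted((pos_a, pos_b))
    gap = second - first - 1  # unchanged nucleotides between the two
    if gap == 0:
        return "delins"  # consecutive nucleotides
    if gap == 1 and codon_number(first) == codon_number(second):
        return "delins"  # exception: one nucleotide apart, same codon
    return "individual"


print(describe_pair(142, 144))  # positions changed in r.142_144delinsugg
```

Positions 142 and 144 both lie in codon 48, matching the p.Arg48Trp example, so the pair is reported as a "delins"; positions 143 and 145 are also one nucleotide apart but lie in codons 48 and 49, so they would be described individually.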

Examples

  • NM_004006.3:r.76a>c
    a substitution of the a nucleotide at r.76 with a c.

  • NM_004006.3:r.76_77delinsug
    NOTE: based on the definition of a substitution, i.e. one nucleotide replaced by one other nucleotide, this change cannot be described as a substitution like r.76_77aa>ug or r.76aa>ug.

  • NM_004006.3:r.(1388g>a)
    the predicted consequence on the RNA level is a substitution of the g nucleotide at r.1388 with an a.

  • NM_004006.3:r.123=
    a screen was performed showing that nucleotide r.123 was a c, as in the coding DNA reference sequence (the nucleotide was not changed).

  • NM_004006.1:r.-14a>c
    an a to c substitution 14 nucleotides 5' of the AUG translation initiation codon.

  • NM_004006.3:r.*41u>a
    a u to a substitution 41 nucleotides 3' of the translation termination codon.

  • NM_004006.3:r.[897u>g,832_960del]
    two different transcripts, r.897u>g and r.832_960del, derive from one variant (NM_004006.3:c.897T>G on the DNA level).
    NOTE: for more examples of variants affecting splicing, see RNA splicing.

  • NM_004006.1:r.0
    no RNA from the variant allele could be detected.

  • NM_004006.3:r.spl
    RNA has not been analysed, but it is very likely that splicing is affected.

  • NM_004006.3:r.?
    an effect on the RNA level is expected, but it is not possible to give a reliable prediction of the consequences (RNA not analysed).

  • NM_004006.3:r.85=/u>c
    a mosaic case where at position 85, besides the normal sequence (a u, described as =), transcripts containing a c (r.85u>c) are also found.
    NOTE: irrespective of the frequency in which each nucleotide was found, the reference is always described first.

  • NM_004006.3:r.85=//u>c
    a chimeric case, i.e. the sample is a mix of cells containing r.85= and r.85u>c.
    NOTE: irrespective of the frequency in which each nucleotide was found, the reference is always described first.
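
The notations used in the examples above can be told apart mechanically. The sketch below is a hypothetical helper, not part of the nomenclature: it only inspects the characters after ":r." and returns a short label, and the labels themselves are this sketch's own wording. It does not validate positions or nucleotides.

```python
def classify_rna_description(description):
    """Label an RNA-level description by the notation it uses."""
    _, separator, body = description.partition(":r.")
    if not separator:
        raise ValueError(f"no ':r.' prefix found in {description!r}")
    special = {
        "0": "no RNA detected",
        "spl": "splicing likely affected",
        "?": "effect expected, consequence unknown",
    }
    if body in special:
        return special[body]
    if "//" in body:  # must be tested before the single slash
        return "chimeric"
    if "/" in body:
        return "mosaic"
    if body.startswith("[") and body.endswith("]"):
        return "several transcripts from one variant"
    if body.startswith("(") and body.endswith(")"):
        return "predicted"
    if body.endswith("="):
        return "no change"
    if "delins" in body:
        return "deletion/insertion"
    if ">" in body:
        return "substitution"
    return "other"


for example in ("NM_004006.3:r.76a>c", "NM_004006.3:r.85=/u>c", "NM_004006.1:r.0"):
    print(example, "->", classify_rna_description(example))
```

The order of the tests matters: r.85=/u>c contains both "=" and ">", so the mosaic and chimeric checks come before the "no change" and substitution checks.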

Discussion

When I only sequenced RNA (cDNA) and not genomic DNA, should I then give the description of a variant on DNA level in parentheses?

Yes. While the variant can be described as r.76a>g on the RNA level, on the DNA level, based on a coding DNA reference sequence, it should be described as c.(76A>G).
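
The conversion in this answer can be sketched in Python. The helper below is hypothetical and covers only simple substitutions: it assumes the RNA and coding DNA descriptions share the same position numbering (as in the r.897u>g and c.897T>G pair in the examples), uppercases the nucleotides, replaces u with T, and wraps the result in parentheses to mark it as inferred rather than observed.

```python
import re

# Hypothetical pattern: an RNA substitution without its sequence identifier.
_RNA_SUBSTITUTION = re.compile(r"^r\.([-*]?\d+)([acgu])>([acgu])$")


def rna_to_predicted_dna(rna_description):
    """Turn e.g. 'r.76a>g' into the inferred DNA-level 'c.(76A>G)'."""
    match = _RNA_SUBSTITUTION.match(rna_description)
    if match is None:
        raise ValueError(f"not a simple RNA substitution: {rna_description!r}")
    position, reference, new = match.groups()

    def to_dna(nucleotide):
        # RNA uses lowercase with u; DNA uses uppercase with T.
        return nucleotide.upper().replace("U", "T")

    return f"c.({position}{to_dna(reference)}>{to_dna(new)})"


print(rna_to_predicted_dna("r.76a>g"))  # c.(76A>G)
```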

Are polymorphisms described like r.76a/g?

No, all substitutions are described as r.76a>g. In the past, the format r.76a/g has been used to describe "polymorphic" sequence variants. Note that a description should be neutral, simply describe the change, and not include any other information like predicted or known functional consequences.

I found a variant on DNA level which is a well-characterised splice variant. Is it correct to describe the variant as concluded from literature?

No, you should report what you have found. You can, however, use the published data to give the predicted consequences on RNA/protein level, e.g., NM_004006.3:c.3430C>T   r.(3277_3432del)   p.(Leu1093_Gln1144del).