Substitution#
Substitution: a sequence change where, compared to a reference sequence, one nucleotide is replaced by one other nucleotide.
Syntax#
| Syntax | sequence_identifier ":r." position reference_nucleotide ">" new_nucleotide |
|---|---|
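The syntax above can be checked mechanically. The following is a minimal sketch, not an official validator: the pattern and the function name `parse_rna_substitution` are hypothetical, and it assumes positions are plain integers optionally prefixed with "-" (5' of the initiation codon) or "*" (3' of the termination codon), with lowercase RNA nucleotides only.

```python
import re

# Hypothetical pattern for the substitution syntax:
#   sequence_identifier ":r." position reference_nucleotide ">" new_nucleotide
# Assumption: only simple positions (76, -14, *41) and the letters a, c, g, u.
SUBSTITUTION_RE = re.compile(
    r"^(?P<ref_seq>[^:\s]+):r\."
    r"(?P<position>[-*]?\d+)"
    r"(?P<ref>[acgu])>(?P<new>[acgu])$"
)


def parse_rna_substitution(description):
    """Return the parts of an RNA substitution, or None if it does not match."""
    match = SUBSTITUTION_RE.match(description)
    if match is None:
        return None
    return match.groupdict()


print(parse_rna_substitution("NM_004006.3:r.76a>c"))
print(parse_rna_substitution("NM_004006.3:r.76aa>ug"))  # not a substitution
```

Because the pattern allows exactly one reference and one new nucleotide, multi-nucleotide changes and the "a/g" format are rejected rather than silently accepted.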
Notes#
- all variants should be described at the DNA level; descriptions at the RNA and/or protein level may be given in addition
- substitutions involving two or more consecutive nucleotides are described as deletion/insertions (indels) (see Deletion/insertion (delins)).
- two substitutions separated by one or more nucleotides should be described individually and not as a "delins"
- exception: two variants separated by one nucleotide, together affecting one amino acid, should be described as a "delins" (e.g. `r.142_144delinsugg` (p.Arg48Trp)). NOTE: this prevents tools that predict the consequences of a variant from making conflicting and incorrect predictions of two different substitutions at one position
- nucleotides that have been tested and found not changed are described as `r.109u=`, `r.4567_4569=` (see SVD-WG001 (no change)).
- it is not correct to describe "polymorphisms" as `r.76a/g` (see Discussion).
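The rules above for two nearby substitutions can be sketched as a small decision function. This is an illustration only: the function names are hypothetical, and it assumes both positions are positive coding-region positions, so that the codon number can be derived as `(position - 1) // 3 + 1`.

```python
def codon_number(position):
    """Codon containing a coding-region position (position 1 starts codon 1)."""
    return (position - 1) // 3 + 1


def describe_together(pos_a, pos_b):
    """Decide how two single-nucleotide changes should be described.

    Returns "delins" or "individual", following the notes above.
    Assumption: both positions are positive integers in the coding region.
    """
    if pos_a == pos_b:
        raise ValueError("the two changes must affect different positions")
    first, second = sorted((pos_a, pos_b))
    gap = second - first - 1  # number of unchanged nucleotides in between
    if gap == 0:
        # Consecutive nucleotides are described as a deletion/insertion.
        return "delins"
    if gap == 1 and codon_number(first) == codon_number(second):
        # Exception: one nucleotide apart and affecting the same amino acid.
        return "delins"
    return "individual"


print(describe_together(142, 144))  # both in codon 48
print(describe_together(143, 145))  # codons 48 and 49
```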
Examples#
- `NM_004006.3:r.76a>c`: a substitution of the "a" nucleotide at r.76 with a "c"
- `NM_004006.3:r.76_77delinsug`: NOTE: based on the definition of a substitution, i.e. one nucleotide replaced by one other nucleotide, this change cannot be described as a substitution like `r.76_77aa>ug` or `r.76aa>ug`
- `NM_004006.3:r.(1388g>a)`: the predicted consequence at the RNA level is a substitution of the "g" nucleotide at r.1388 with an "a"
- `NM_004006.3:r.123=`: a screen was performed showing that nucleotide r.123 was a "c" as in the coding DNA reference sequence (the nucleotide was not changed).
- `NM_004006.1:r.-14a>c`: an "a" to "c" substitution 14 nucleotides 5' of the ATG translation initiation codon
- `NM_004006.3:r.*41u>a`: a "u" to "a" substitution 41 nucleotides 3' of the translation termination codon
- `NM_004006.3:r.[897u>g,832_960del]`: two different transcripts, `r.897u>g` and `r.832_960del`, derive from one variant (`NM_004006.3:c.897T>G` at the DNA level). NOTE: for more examples of variants affecting splicing see RNA splicing.
- `NM_004006.1:r.0`: no RNA from the variant allele could be detected
- `NM_004006.3:r.spl`: RNA has not been analysed but it is very likely that splicing is affected
- `NM_004006.3:r.?`: an effect on the RNA level is expected but it is not possible to give a reliable prediction of the consequences (RNA not analysed)
- `NM_004006.3:r.85=/u>c`: a mosaic case where at position 85, besides the normal sequence (a U, described as "="), transcripts containing a C (`r.85u>c`) are also found. NOTE: irrespective of the frequency in which each nucleotide was found, the reference is always described first.
- `NM_004006.3:r.85=//u>c`: a chimeric case, i.e. the sample is a mix of cells containing `r.85=` and `r.85u>c`. NOTE: irrespective of the frequency in which each nucleotide was found, the reference is always described first.
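The forms used in these examples can be told apart with a few string checks. The sketch below is a rough illustration under stated assumptions: the function name and the returned labels are hypothetical, the input is a full description containing ":r.", and only the forms listed in this section are recognised.

```python
def classify_rna_description(description):
    """Label the kind of RNA-level description (labels are illustrative)."""
    _, separator, body = description.partition(":r.")
    if not separator or not body:
        raise ValueError("expected a description like 'NM_004006.3:r.76a>c'")
    if body == "0":
        return "no RNA detected"
    if body == "spl":
        return "splicing likely affected"
    if body == "?":
        return "consequence unknown"
    if body.startswith("[") and body.endswith("]"):
        return "multiple transcripts"
    if body.startswith("(") and body.endswith(")"):
        return "predicted, not observed"
    # "=//" must be tested before "=/" because the latter is a prefix of it.
    if "=//" in body:
        return "chimeric"
    if "=/" in body:
        return "mosaic"
    if body.endswith("="):
        return "no change"
    return "observed change"


for example in ("NM_004006.3:r.85=/u>c", "NM_004006.3:r.85=//u>c", "NM_004006.1:r.0"):
    print(example, "->", classify_rna_description(example))
```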
Discussion#
When I only sequenced RNA (cDNA) and not genomic DNA should I then give the description of a variant at DNA level in parenthesis?
Yes. While the variant can be described as `r.76a>g` at the RNA level, at the DNA level, based on a coding DNA reference sequence, it should be described as `c.(76A>G)`.
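As an illustration of this answer, the conversion from an observed RNA substitution to the parenthesised DNA-level description can be sketched as follows. The function name is hypothetical, and the sketch assumes a simple substitution whose RNA position carries over unchanged to the coding DNA reference sequence.

```python
import re

# RNA nucleotides (lowercase) mapped to their DNA equivalents (uppercase).
RNA_TO_DNA = {"a": "A", "c": "C", "g": "G", "u": "T"}


def predicted_dna_description(rna_change):
    """Turn e.g. 'r.76a>g' into the predicted DNA description 'c.(76A>G)'.

    Assumption: the input is a single substitution without a sequence
    identifier, and position numbering is shared between r. and c.
    """
    match = re.fullmatch(r"r\.([-*]?\d+)([acgu])>([acgu])", rna_change)
    if match is None:
        raise ValueError(f"not a simple RNA substitution: {rna_change!r}")
    position, ref, new = match.groups()
    # Parentheses mark the DNA change as inferred, not directly observed.
    return f"c.({position}{RNA_TO_DNA[ref]}>{RNA_TO_DNA[new]})"


print(predicted_dna_description("r.76a>g"))  # c.(76A>G)
```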
Are polymorphisms described like `r.76a/g`?
No, all substitutions are described as `r.76a>g`. In the past, the format <code class="invalid">r.76a/g</code> has been used to describe "polymorphic" sequence variants. Note that a description should be neutral, simply describe the change, and not include any other information like predicted or known functional consequences.
I found a variant at the DNA level which is a well-characterised splice variant. Is it correct to describe the variant at the RNA level as concluded from the literature?
No, you should report what **you** have found. You can however use the published data to give the predicted consequences on RNA/protein level, e.g. `NM_004006.3:c.3430C>T` `r.(3277_3432del)` `p.(Leu1093_Gln1144del)`.