Substitution
Substitution: a sequence change where, compared to a reference sequence, one amino acid is replaced by one other amino acid.
Syntax
| Experimentally ascertained protein consequence | |
|---|---|
| Syntax | sequence_identifier ":p." aa_position alternate_base |
| Examples | |
| Predicted protein consequence | |
| Syntax | sequence_identifier ":p.(" aa_position alternate_base ")" |
| Examples | |
| Explanation of Symbols | |
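
The two syntax patterns above can be illustrated with a small parser. This is a minimal sketch, not a reference implementation: the function name `parse_substitution`, the returned dictionary keys, and the choice to accept only three-letter amino acid codes (plus `Ter` or `*` as the new residue) are assumptions made for this example.

```python
import re

# Three-letter codes for the 20 standard amino acids (assumption: no other
# symbols are accepted as the reference residue in this sketch).
AMINO_ACIDS = (
    "Ala Arg Asn Asp Cys Gln Glu Gly His Ile "
    "Leu Lys Met Phe Pro Ser Thr Trp Tyr Val"
).split()
REF_AA = "|".join(AMINO_ACIDS)
ALT_AA = REF_AA + r"|Ter|\*"  # a stop codon may be written as Ter or *

# sequence_identifier ":p." aa_position alternate, optionally wrapped in "( )"
SUBSTITUTION = re.compile(
    rf"(?P<ref_seq>[^:\s]+):p\."
    rf"(?P<open>\()?"
    rf"(?P<ref_aa>{REF_AA})(?P<pos>[1-9]\d*)(?P<alt_aa>{ALT_AA})"
    rf"(?P<close>\))?"
)


def parse_substitution(description):
    """Return the parts of a protein substitution, or None if it does not match."""
    m = SUBSTITUTION.fullmatch(description)
    # Parentheses must be balanced: either both present or both absent.
    if m is None or bool(m["open"]) != bool(m["close"]):
        return None
    return {
        "reference": m["ref_seq"],
        "ref_aa": m["ref_aa"],
        "position": int(m["pos"]),
        "alt_aa": "Ter" if m["alt_aa"] == "*" else m["alt_aa"],
        "predicted": m["open"] is not None,  # parentheses mark a prediction
    }
```

Calling `parse_substitution("NP_003997.1:p.(Trp24Cys)")` returns a dictionary with `"predicted": True`, while a description of another variant type, such as a delins, returns `None`.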
Notes
- all variants should be described on the DNA level; descriptions on the RNA and/or protein level may be given in addition.
- predicted consequences, i.e. without experimental evidence (no RNA or protein sequence analysed), should be given in parentheses, e.g., p.(Arg727Ser).
- a nonsense variant, a variant changing an amino acid to a translation termination (stop) codon, is described as a substitution. A nonsense variant is not described as a deletion of the C-terminal end of the protein (e.g., p.Trp26_Arg1623del).
- variants which introduce an immediate translation termination (stop) codon are described as nonsense variants.
    - NOTE: not p.Tyr4TerfsTer1, but p.Tyr4Ter (or p.Tyr4*); not p.Tyr4_Cys5insTerGluAsp, but p.Cys5Ter (or p.Cys5*); not p.Cys5_Ser6delinsTerGluAsp, but p.Cys5Ter (or p.Cys5*).
- a no-stop variant, a variant changing the translation termination codon into an amino acid codon, is described as an extension (see Extension).
- changes involving two or more consecutive amino acids are described as a deletion/insertion variant (delins) (see Deletion/insertion (delins)).
    - the description p.Arg76_Cys77delinsSerTrp is correct, the description p.[Arg76Ser;Cys77Trp] is not correct.
- amino acids that have been tested and found not changed (silent) are described as p.Cys123= (see SVD-WG001 (no change)).
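
The notes on stop codons lend themselves to a simple automated check. The sketch below flags descriptions that encode an immediate stop codon as a frameshift, insertion, or delins instead of a nonsense substitution. The function name `lint_stop_description` and the warning texts are hypothetical, and the check is purely textual: it does not verify positions against a reference sequence.

```python
import re

STOP = r"(?:Ter|\*)"  # a stop codon may be written as Ter or *

# Each rule pairs a pattern with advice paraphrased from the notes above.
RULES = [
    (
        re.compile(rf"{STOP}fs{STOP}1$"),
        "immediate stop codon: describe as a nonsense substitution, not a frameshift",
    ),
    (
        re.compile(rf"ins{STOP}"),  # matches both 'insTer' and 'delinsTer'
        "inserted sequence starts with a stop codon: describe as a nonsense substitution",
    ),
]


def lint_stop_description(p_description):
    """Return advice strings for a bare 'p.' description (no reference sequence)."""
    body = p_description[2:] if p_description.startswith("p.") else p_description
    body = body.strip("()")  # ignore the parentheses of a predicted consequence
    return [advice for pattern, advice in RULES if pattern.search(body)]
```

For example, `lint_stop_description("p.Tyr4TerfsTer1")` returns one warning, while `lint_stop_description("p.Tyr4Ter")` returns an empty list.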
Examples
- missense
    - LRG_199p1:p.Trp24Cys
      amino acid Trp24 is changed to a Cys.
    - NP_003997.1:p.(Trp24Cys)
      amino acid Trp24 is predicted to change to a Cys (no experimental proof, e.g., based on DNA level data).
- nonsense
  LRG_199p1:p.Trp24Ter (p.Trp24*)
  amino acid Trp24 is changed to a stop codon (Ter, *).
  NOTE: this change is not described as a deletion of the C-terminal end of the protein (i.e. p.Trp24_Met3685del).
- silent (no change)
  NP_003997.1:p.Cys188=
  amino acid Cys188 is not changed (DNA level change ..TGC.. to ..TGT..).
  NOTE: the description p.= means the entire protein coding region was analysed and no variant was found that changes (or is predicted to change) the protein sequence.
- translation initiation codon
    - no protein:
      LRG_199p1:p.0
      as a consequence of a variant in the translation initiation codon, no protein is produced.
      NOTE: LRG_199p1:p.0? can be used when you predict that no protein is produced. Do not use descriptions like p.Met1Thr; this is certainly not the actual effect on protein translation.
    - unknown:
      LRG_199p1:p.(Met1?)
      the consequence, on the protein level, of a variant affecting the translation initiation codon cannot be predicted (i.e. is unknown).
    - new translation initiation site
        - downstream:
          NP_003997.1:p.Leu2_Met124del (deletion)
          a variant in the translation initiation codon causes the activation of a downstream translation initiation site (Met), resulting in deletion of the first 123 amino acids (Met1 to Val123) of the protein.
          NOTE: the 3' rule applies.
        - upstream:
          p.Met1_Leu2insArgSerThrVal (insertion)
          a variant in the translation initiation codon (Met1) changes it to a valine (Val) and activates an upstream translation initiation site at position -4, replacing amino acid Met1 with MetArgSerThrVal. Applying the 3' rule, the variant is described as an insertion.
          NOTE: this variant is not described as an extension.
        - new:
          p.Met1ext-5 (extension)
          a variant in the 5' UTR activates a new in-frame upstream translation initiation site starting with amino acid Met-5 (see Extension).
- translation termination codon (stop codon, no-stop change)
  see Extension.
- splicing
  NP_003997.1:p.?
  the predicted consequence of variant NM_004006.2:c.2622G>C is a silent change (p.(Lys874=)). Since it affects the last nucleotide of the exon, it cannot be excluded that the variant affects splicing, with unknown consequences.
  NOTE: when others have reported the same variant and were able to analyse RNA, you could consider giving the consequences they observed as the predicted consequences for the variant, e.g., r.[(2603_2622del,2622g>c)] p.[(Ser868Argfs*2,Ser868=)].
- uncertain
  NP_003997.1:p.(Gly56Ala^Ser^Cys)
  amino acid Gly56 is changed to an Ala, Ser, or Cys (see Uncertain).
- mosaic
  LRG_199p1:p.Trp24=/Cys
  a mosaic case where at amino acid position 24, besides the normal amino acid (a Trp, described as =), protein containing a Cys (p.Trp24Cys) is also found.
  NOTE: irrespective of the frequency in which each amino acid was found, the reference is always described first.
  NOTE: for the predicted consequences of a variant, the description is LRG_199p1:p.(Trp24=/Cys).
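
The example categories above can be told apart from the description text alone. The following is a rough sketch under stated assumptions: the function name `classify`, the category labels, and the order of the checks are choices made for this illustration, and anything that is not a single-position description (deletions, insertions, extensions) is simply reported as "other".

```python
import re

AA = r"[A-Z][a-z]{2}"  # any three-letter symbol; real codes are not validated here


def classify(description):
    """Return (category, predicted) for a protein-level description."""
    p_desc = description.split(":", 1)[-1]  # drop the reference sequence, if any
    predicted = p_desc.startswith("p.(") and p_desc.endswith(")")
    body = p_desc[3:-1] if predicted else p_desc[2:]

    if body in ("0", "0?"):
        kind = "no protein"
    elif body.endswith("?"):
        kind = "unknown"
    elif "/" in body:
        kind = "mosaic"
    elif "^" in body:
        kind = "uncertain"
    elif re.fullmatch(rf"{AA}\d+=", body):
        kind = "silent"
    elif re.fullmatch(rf"{AA}\d+(?:Ter|\*)", body):
        kind = "nonsense"  # checked before missense because Ter also matches AA
    elif re.fullmatch(rf"{AA}\d+{AA}", body):
        kind = "missense"
    else:
        kind = "other"
    return kind, predicted
```

With this sketch, `classify("NP_003997.1:p.(Trp24Cys)")` yields `("missense", True)` and `classify("LRG_199p1:p.Trp24=/Cys")` yields `("mosaic", False)`.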
Discussion
Are polymorphisms described like p.2366Gln/Lys?
No, all substitutions are described as NP_003997.1:p.Gln2366Lys.
In the past, the format p.2366Gln/Lys (p.2366Q/K) has been used to describe "polymorphic" sequence variants.
Note that a description should be neutral, simply describe the change, and not include any other information like predicted or known functional consequences.
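
Converting the older format into a substitution description is mechanical, as the sketch below suggests. The function name `legacy_to_substitution` is hypothetical, and the code assumes that the first amino acid in the legacy notation is the reference residue, which has to be confirmed against the reference sequence before the result is used.

```python
import re

# One-letter to three-letter codes for the 20 standard amino acids.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}
THREE_LETTER = set(ONE_TO_THREE.values())

# Legacy form: "p." position first_aa "/" second_aa, e.g. p.2366Gln/Lys
LEGACY = re.compile(r"p\.(\d+)([A-Za-z]{1,3})/([A-Za-z]{1,3})")


def legacy_to_substitution(description):
    """Rewrite p.2366Gln/Lys or p.2366Q/K as p.Gln2366Lys."""
    m = LEGACY.fullmatch(description)
    if m is None:
        raise ValueError(f"not a legacy description: {description!r}")
    position, first, second = m.groups()
    # Assumption: the first residue listed is the reference amino acid.
    ref = ONE_TO_THREE.get(first, first)
    alt = ONE_TO_THREE.get(second, second)
    if ref not in THREE_LETTER or alt not in THREE_LETTER:
        raise ValueError(f"unknown amino acid code in {description!r}")
    return f"p.{ref}{position}{alt}"
```

The result still needs a reference sequence prefix, such as NP_003997.1, to form a complete description.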
Can I describe a TrpVal to CysArg variant as an amino acid substitution (p.TrpVal24CysArg)?
No, this is not allowed.
By definition, a substitution changes one amino acid into one other amino acid.
The change TrpVal to CysArg should be described as NP_003997.1:p.Trp24_Val25delinsCysArg, i.e. a deletion/insertion (delins) (see Deletion-Insertion).
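
The rule for consecutive amino acids can be sketched as a small helper that combines per-residue changes into one delins description. The function name `to_delins` and its input format (a list of strings such as "Trp24Cys") are assumptions for this example. The sketch also assumes all changes lie on the same allele and does not treat stop codons specially.

```python
import re

# ref_aa position alt_aa, three-letter codes only (codes are not validated here)
SUB = re.compile(r"([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})")


def to_delins(substitutions):
    """Combine changes at consecutive positions into one delins description."""
    if not substitutions:
        raise ValueError("no substitutions given")
    parsed = []
    for text in substitutions:
        m = SUB.fullmatch(text)
        if m is None:
            raise ValueError(f"not a simple substitution: {text!r}")
        parsed.append((int(m[2]), m[1], m[3]))  # (position, ref, alt)
    parsed.sort()

    positions = [pos for pos, _, _ in parsed]
    expected = list(range(positions[0], positions[0] + len(positions)))
    if positions != expected:
        raise ValueError("positions are not consecutive")

    if len(parsed) == 1:  # a single change stays a substitution
        pos, ref, alt = parsed[0]
        return f"p.{ref}{pos}{alt}"

    (first_pos, first_ref, _), (last_pos, last_ref, _) = parsed[0], parsed[-1]
    inserted = "".join(alt for _, _, alt in parsed)
    return f"p.{first_ref}{first_pos}_{last_ref}{last_pos}delins{inserted}"
```

For the case discussed above, `to_delins(["Trp24Cys", "Val25Arg"])` gives `p.Trp24_Val25delinsCysArg`.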
How should you describe an amino acid substitution to any other amino acid?
HGVS uses IUPAC symbols (see Standards).
The symbol for 'any' amino acid is X / Xaa.
Since X has been used to indicate a translation stop codon (nonsense variant), we suggest using only the three-letter code Xaa (e.g., p.Arg782Xaa).