Substitution#
Substitution: a sequence change where, compared to a reference sequence, one amino acid is replaced by one other amino acid.
Syntax#
Experimentally ascertained protein consequence | |
---|---|
Syntax | sequence_identifier ":p." aa_position alternate_base |
Examples | |
Predicted protein consequence | |
Syntax | sequence_identifier ":p.(" aa_position alternate_base ")" |
Examples | |
Explanation of Symbols | |
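The two syntax forms above differ only in the parentheses that mark a predicted consequence. A minimal sketch of how a description could be split into its parts is given below. The pattern `SUBSTITUTION` and the function `parse_substitution` are hypothetical names introduced for illustration, not part of any official grammar, and the amino acid list is assumed to be the 20 standard three-letter codes plus `Xaa`.

```python
import re

# Assumption: the 20 standard three-letter amino acid codes plus "Xaa" (any).
AA3 = ("Ala|Arg|Asn|Asp|Cys|Gln|Glu|Gly|His|Ile|Leu|Lys|Met|"
       "Phe|Pro|Ser|Thr|Trp|Tyr|Val|Xaa")

# Hypothetical pattern for a single amino acid substitution.
# The alternate may also be a stop codon ("Ter" or "*") or "=" (no change).
SUBSTITUTION = re.compile(
    rf"^(?P<seq>[A-Za-z0-9_.]+):p\."
    rf"(?P<open>\()?"
    rf"(?P<ref>{AA3})(?P<pos>[1-9]\d*)(?P<alt>{AA3}|Ter|\*|=)"
    rf"(?P<close>\))?$"
)


def parse_substitution(description):
    """Return the parts of a protein substitution, or None if it does not match."""
    match = SUBSTITUTION.match(description)
    if match is None:
        return None
    # Parentheses must be either both present (predicted) or both absent.
    if bool(match.group("open")) != bool(match.group("close")):
        return None
    return {
        "sequence": match.group("seq"),
        "ref": match.group("ref"),
        "position": int(match.group("pos")),
        "alt": match.group("alt"),
        "predicted": match.group("open") is not None,
    }
```

For example, `parse_substitution("NP_003997.1:p.(Trp24Cys)")` reports `predicted` as `True`, while the same description without parentheses reports `False`.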
Notes#
- all variants should be described on the DNA level; descriptions on the RNA and/or protein level may be given in addition.
- predicted consequences, i.e. without experimental evidence (no RNA or protein sequence analysed), should be given in parentheses, e.g., `p.(Arg727Ser)`.
- a nonsense variant, a variant changing an amino acid to a translation termination (stop) codon, is described as a substitution. A nonsense variant is not described as a deletion of the C-terminal end of the protein (e.g., `p.Trp26_Arg1623del`).
- variants which introduce an immediate translation termination (stop) codon are described as nonsense variants.
    - NOTE: not `p.Tyr4TerfsTer1`, but `p.Tyr4Ter` (or `p.Tyr4*`); not `p.Tyr4_Cys5insTerGluAsp`, but `p.Cys5Ter` (or `p.Cys5*`); not `p.Cys5_Ser6delinsTerGluAsp`, but `p.Cys5Ter` (or `p.Cys5*`).
- a no-stop variant, a variant changing the translation termination codon into an amino acid codon, is described as an extension (Extension).
- changes involving two or more consecutive amino acids are described as a deletion/insertion variant (delins) (see Deletion/insertion (delins)).
    - the description `p.Arg76_Cys77delinsSerTrp` is correct; the description `p.[Arg76Ser;Cys77Trp]` is not correct.
- amino acids that have been tested and found not changed (silent) are described as `p.Cys123=` (see SVD-WG001 (no change)).
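The rule that changes to consecutive amino acids are combined into one delins can be sketched as follows. The functions `describe_changes` and `_format_run` are hypothetical helpers, and the sketch assumes that all changes lie on the same allele, use three-letter codes, and do not involve stop codons.

```python
def _format_run(run):
    """Format a run of adjacent changes as one description."""
    first_ref, first_pos, first_alt = run[0]
    if len(run) == 1:
        # A single changed amino acid is a plain substitution.
        return f"p.{first_ref}{first_pos}{first_alt}"
    last_ref, last_pos, _ = run[-1]
    inserted = "".join(alt for _, _, alt in run)
    # Two or more consecutive amino acids are described as a delins.
    return f"p.{first_ref}{first_pos}_{last_ref}{last_pos}delins{inserted}"


def describe_changes(changes):
    """Group (ref, position, alt) tuples into protein-level descriptions.

    Assumption: every tuple is a single amino acid change on the same allele.
    """
    descriptions = []
    run = []
    for change in sorted(changes, key=lambda c: c[1]):
        # Start a new run when the position is not adjacent to the previous one.
        if run and change[1] != run[-1][1] + 1:
            descriptions.append(_format_run(run))
            run = []
        run.append(change)
    if run:
        descriptions.append(_format_run(run))
    return descriptions
```

Called with `[("Arg", 76, "Ser"), ("Cys", 77, "Trp")]`, the sketch returns the single description `p.Arg76_Cys77delinsSerTrp` rather than two separate substitutions.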
Examples#
- missense
    - `LRG_199p1:p.Trp24Cys`: amino acid `Trp24` is changed to a `Cys`.
    - `NP_003997.1:p.(Trp24Cys)`: amino acid `Trp24` is predicted to change to a `Cys` (no experimental proof, e.g., based on DNA level data).
- nonsense
    - `LRG_199p1:p.Trp24Ter` (`p.Trp24*`): amino acid `Trp24` is changed to a stop codon (`Ter`, `*`).
      NOTE: this change is not described as a deletion of the C-terminal end of the protein (i.e. `p.Trp24_Met3685del`).
- silent (no change)
    - `NP_003997.1:p.Cys188=`: amino acid `Cys188` is not changed (DNA level change `..TGC..` to `..TGT..`).
      NOTE: the description `p.=` means the entire protein coding region was analysed and no variant was found that changes (or is predicted to change) the protein sequence.
- translation initiation codon
    - no protein: `LRG_199p1:p.0`; as a consequence of a variant in the translation initiation codon, no protein is produced.
      NOTE: `LRG_199p1:p.0?` can be used when you predict that no protein is produced. Do not use descriptions like `p.Met1Thr`; this is certainly not the consequence of the variant for protein translation.
    - unknown: `LRG_199p1:p.(Met1?)`; the consequence, on the protein level, of a variant affecting the translation initiation codon cannot be predicted (i.e. is unknown).
    - new translation initiation site
        - downstream: `NP_003997.1:p.Leu2_Met124del` (deletion); a variant in the translation initiation codon causes the activation of a downstream translation initiation site (`Met`), resulting in deletion of the first 123 amino acids (`Met1` to `Val123`) of the protein.
          NOTE: the 3' rule applies.
        - upstream: `p.Met1_Leu2insArgSerThrVal` (insertion); a variant in the translation initiation codon (`Met1`) changes it to a valine (`Val`) and activates an upstream translation initiation site at position -4, replacing amino acid `Met1` with `MetArgSerThrVal`. Applying the 3' rule, the variant is described as an insertion.
          NOTE: this variant is not described as an extension.
        - new: `p.Met1ext-5` (extension); a variant in the 5' UTR activates a new in-frame upstream translation initiation site starting with amino acid `Met-5` (see Extension).
- translation termination codon (stop codon, no-stop change): see Extension.
- splicing
    - `NP_003997.1:p.?`: the predicted consequence of variant `NM_004006.2:c.2622G>C` is a silent change (`p.(Lys874=)`). Since it affects the last nucleotide of the exon, it cannot be excluded that the variant affects splicing, with unknown consequences.
      NOTE: when others have reported the same variant and were able to analyse RNA, you could consider giving the consequences they observed as the predicted consequences for the variant, e.g., `r.[(2603_2622del,2622g>c)]` `p.[(Ser868Argfs*2,Ser868=)]`.
- uncertain
    - `NP_003997.1:p.(Gly56Ala^Ser^Cys)`: amino acid `Gly56` is changed to an `Ala`, `Ser`, or `Cys` (see Uncertain).
- mosaic
    - `LRG_199p1:p.Trp24=/Cys`: a mosaic case where at amino acid position 24, besides the normal amino acid (a `Trp`, described as `=`), protein is also found containing a `Cys` (`p.Trp24Cys`).
      NOTE: irrespective of the frequency in which each amino acid was found, the reference is always described first.
      NOTE: for the predicted consequences of a variant, the description is `LRG_199p1:p.(Trp24=/Cys)`.
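The simpler example categories above can be told apart from the part of the description that follows the amino acid position. The sketch below illustrates this. The function `classify` is a hypothetical helper, it only covers descriptions of the form reference amino acid, position, change, and it does not check that parentheses are balanced.

```python
import re


def classify(description):
    """Classify a protein-level description into one of the example categories."""
    # Drop the sequence identifier, if present.
    protein = description.split(":")[-1]
    match = re.fullmatch(r"p\.\(?([A-Z][a-z]{2})(\d+)([^()]+)\)?", protein)
    if match is None:
        return "other"
    change = match.group(3)
    if change in ("Ter", "*"):
        return "nonsense"
    if change == "=":
        return "silent"
    if change.startswith("=/"):
        return "mosaic"
    if "^" in change:
        return "uncertain"
    if re.fullmatch(r"[A-Z][a-z]{2}", change):
        return "missense"
    # Deletions, insertions, extensions, p.0, p.? and so on are not handled here.
    return "other"
```

Descriptions such as `LRG_199p1:p.0` or `NP_003997.1:p.Leu2_Met124del` fall through to `"other"`, since they are not substitutions.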
Discussion#
Are polymorphisms described like `p.2366Gln/Lys`?
No, all substitutions are described as `NP_003997.1:p.Gln2366Lys`.
In the past, the format `p.2366Gln/Lys` (`p.2366Q/K`) was used to describe "polymorphic" sequence variants.
Note that a description should be neutral: it should simply describe the change and not include any other information, such as predicted or known functional consequences.
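Converting the older format into the recommended one is a matter of reordering its parts. A minimal sketch is shown below. The function `convert_legacy` is a hypothetical helper, and it assumes that the first amino acid in the old format is the reference and the second is the variant, which should be verified against the reference sequence.

```python
import re

# Hypothetical pattern for the old three-letter format, e.g. p.2366Gln/Lys.
LEGACY = re.compile(r"^p\.(\d+)([A-Z][a-z]{2})/([A-Z][a-z]{2})$")


def convert_legacy(description, sequence_identifier=None):
    """Rewrite p.<pos><Ref>/<Alt> as p.<Ref><pos><Alt>.

    Assumption: the first amino acid listed is the reference.
    """
    match = LEGACY.match(description)
    if match is None:
        raise ValueError(f"not in the old polymorphism format: {description!r}")
    position, ref, alt = match.groups()
    converted = f"p.{ref}{position}{alt}"
    if sequence_identifier:
        converted = f"{sequence_identifier}:{converted}"
    return converted
```

For instance, `convert_legacy("p.2366Gln/Lys", "NP_003997.1")` gives `NP_003997.1:p.Gln2366Lys`.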
Can I describe a `TrpVal` to `CysArg` variant as an amino acid substitution (`p.TrpVal24CysArg`)?
No, this is not allowed.
By definition, a substitution changes one amino acid into one other amino acid.
The change `TrpVal` to `CysArg` should be described as `NP_003997.1:p.Trp24_Val25delinsCysArg`, i.e. a deletion/insertion (delins) (see Deletion-Insertion).
How should you describe an amino acid substitution to any other amino acid?
HGVS uses IUPAC symbols (see Standards).
The symbol for 'any' amino acid is `X` / `Xaa`.
Since `X` has also been used to indicate a translation stop codon (nonsense variant), we suggest using only the three-letter amino acid code `Xaa` (e.g., `p.Arg782Xaa`).
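One way to avoid the ambiguity of `X` is to reject it when building a description from one-letter codes. The sketch below illustrates this. The table `ONE_TO_THREE` and the function `three_letter_substitution` are hypothetical, and the sketch assumes one-letter input for the 20 standard amino acids, `*` for a stop codon, and the literal string `Xaa` for any amino acid.

```python
# Assumption: standard one-letter codes; "*" is mapped to "Ter" (stop codon).
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def three_letter_substitution(ref, position, alt):
    """Build a three-letter substitution description from one-letter symbols."""

    def expand(symbol):
        if symbol == "X":
            # "X" could mean any amino acid or, in older usage, a stop codon.
            raise ValueError("'X' is ambiguous; use 'Xaa' (any) or '*' (stop)")
        if symbol == "Xaa":
            return "Xaa"
        return ONE_TO_THREE[symbol]

    return f"p.{expand(ref)}{position}{expand(alt)}"
```

With this sketch, `three_letter_substitution("R", 782, "Xaa")` yields `p.Arg782Xaa`, while passing `"X"` raises an error instead of guessing the intended meaning.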