The recommendations for the description of sequence variants are designed to be stable, meaningful, memorable and unequivocal. Still, every now and then modifications will be required to remove inconsistencies and/or to clarify confusing conventions. In addition, the recommendations may be extended to resolve cases that were hitherto not covered. HGVS nomenclature has version numbers to allow users to specify up to what point they follow the HGVS recommendations.
The version number is based on the date of the change and has the format "HGVS nomenclature Version 15.11", for the version accepted in 2015 ("15"), November ("11"). The current HGVS version number is shown in the top right corner of this web site ("Version xx.xx"). Note that the version does not change when a typing error is corrected, an example is added, an explanation is clarified or a question is answered. Outside the core HGVS recommendations, which are covered by the version number, the recommendations have "named extensions", i.e. optional extensions for a specific use. Supporting named extensions is optional. A proper reference to the version of the HGVS nomenclature should mention the version number and the named extensions supported.
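As an illustration of the date-based format described above, the following minimal Python sketch splits a version string such as 15.11 into year and month and builds a reference string that lists the named extensions supported. The names `HgvsVersion`, `parse_version` and `format_reference`, as well as the layout of the reference string, are assumptions made for this example and are not part of the HGVS recommendations. The sketch also assumes that the two-digit year means 20YY and it only handles the YY.MM form, not older numbers such as 2.121101.

```python
import re
from typing import NamedTuple, Sequence


class HgvsVersion(NamedTuple):
    """Date-based version number (hypothetical helper type)."""
    year: int   # four-digit year, assuming the two digits mean 20YY
    month: int  # 1..12


_VERSION_RE = re.compile(r"(\d{2})\.(\d{2})")


def parse_version(text: str) -> HgvsVersion:
    """Split a 'YY.MM' version string into year and month."""
    match = _VERSION_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a YY.MM version number: {text!r}")
    year, month = int(match.group(1)), int(match.group(2))
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range in version: {text!r}")
    return HgvsVersion(2000 + year, month)


def format_reference(version: str, extensions: Sequence[str] = ()) -> str:
    """Build a reference naming the version and any named extensions."""
    parse_version(version)  # validate only; raises ValueError if malformed
    reference = f"HGVS nomenclature version {version}"
    if extensions:
        reference += " (named extensions: " + ", ".join(extensions) + ")"
    return reference


print(parse_version("15.11"))
print(format_reference("20.05", ["ISCN"]))
```

Validating the version before formatting keeps malformed numbers out of the reference string; how named extensions are actually cited is left to the user.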
The current version is HGVS nomenclature v20.05.
NOTE: following the acceptance of proposals SVD-WG007 and SVD-WG008, a new version of the HGVS nomenclature was released on May 1, 2020.
For issues currently discussed see Open for Community Consultation or Open Issues.
Version 20.05: Accepted proposals include SVD-WG007 and SVD-WG008:
- SVD-WG008 (Reference Sequences): specifies requirements for acceptable Reference Sequences
- SVD-WG007 (RNA fusion): specifies how to describe RNA fusion transcripts
Version 19.01: Accepted proposals include SVD-WG005 and SVD-WG006:
- SVD-WG006 (circular DNA): allows descriptions like o.16000_100del
- SVD-WG005 (gom/lom): allows descriptions of changes in general methylation status like g.123_456|lom
Named extension ISCN: Proposal SVD-WG004 (ISCN<>HGVS) has been accepted as a "named extension ISCN".
Version 15.11: Accepted proposals include SVD-WG001 and SVD-WG002:
- SVD-WG001 (No change): allows descriptions like g.11890634G=, c.123G=, r.123g= and p.(Arg41=).
- SVD-WG002 (n. reference sequence): allows descriptions like NR_028379.1:n.345A>G.
HGVS nomenclature version 15.11 is described in Den Dunnen et al. (2016) HGVS recommendations for the description of sequence variants: 2016 update. Hum.Mutat. 37: 564-569. The most significant changes between version 15.11 and version 1.0 are described below.
Version 2.121101: Variants affecting translation termination - variants that replace the translation termination codon but do not encounter a new stop in the new reading frame are described as "p.*321Argext*?". Frame shift variants with the same effect are described as "p.Ile321Argfs*?" (see Protein descriptions).
Version 2.120831: Protein description in parentheses - parentheses in protein variant descriptions can be omitted when there is sufficient experimental evidence. Variants affecting translation initiation - at protein level, variants that generate a new upstream translation initiation codon are described using the format "p.Met1ext-5" (see Protein extensions).
Version 1: The 2000 publication of Den Dunnen JT and Antonarakis SE (Mutation nomenclature extensions and suggestions to describe complex mutations: a discussion. Hum.Mutat. 15: 7-12) contains a more formal set of recommendations and is considered version 1.
Version 0: On the page "History regarding the description of sequence variants" we give an overview of all publications on the description of sequence variants. These papers can be considered as pre-versions of the first recommendations, a version 0.
Changes/additions going from the 2000 to 2016 recommendations
- Reference sequence: for diagnostic applications, the recommendation is to use a Locus Reference Genomic sequence (LRG, Dalgleish et al. 2010) as the reference sequence for variant descriptions. Prefixes for new reference sequence types have been added (e.g. m. and n.), as well as indicators to specify different transcript variants (t1) and protein isoforms (p1) annotated in the reference sequence (see Reference Sequences).
- Definitions: the basic types of variants were defined more strictly. In addition, variant types have been prioritized (see General recommendations).
- Pre-existing standards: pre-existing standards from the IUPAC and IUBMB for the description of nucleotides and amino acids are now used throughout the recommendations. These include letter codes to describe incompletely specified residues at both DNA and protein level (see Standards). Description of the translation termination (stop) codon at the protein/amino acid level changed from "X" to "Ter" / "*" since "X" in the IUPAC-IUB nomenclature means an "unspecified" or "unknown" amino acid.
- Incorporate ISCN standards: recommendations were made to describe changes with uncertain break points (i.e. not sequenced), obtained using technologies like FISH, arrays and MLPA. Furthermore, where possible, HGVS incorporated established ISCN standards in the recommendations, including the use of "/" (forward slash) to describe somatic variants and "//" for chimerism (see General recommendations).
- Simplification: in HGVS version 1.0 some symbols were used for more than one purpose, leading to undesired confusion. These inconsistencies were removed.
- Prediction / experimental proof: to clarify that a variant described at the protein level is a prediction, without experimental evidence, the recommendation was added to describe the predicted consequence in parentheses, like p.(Arg12Gly).
- Repeated sequences: recommendations were made to describe variability in repeated sequences (mono-, di-, tri-residue stretches, etc.; see Repeated sequences).
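
Two of the changes listed above, the switch from "X" to "Ter" for the stop codon and the parentheses around predicted protein consequences, are simple enough to sketch in code. The Python below is a minimal illustration only: the function names `legacy_stop_to_ter` and `mark_predicted` are hypothetical, and the sketch assumes bare protein descriptions that start with "p." (no reference sequence prefix) and use three-letter amino acid codes, so that a capital "X" outside "Xaa" can only be a legacy stop codon.

```python
import re


def legacy_stop_to_ter(description: str) -> str:
    """Rewrite a legacy stop codon 'X' as 'Ter'.

    Assumes three-letter amino acid codes; 'Xaa' (unspecified amino
    acid) is left untouched by the negative lookahead.
    """
    return re.sub(r"X(?!aa)", "Ter", description)


def mark_predicted(description: str) -> str:
    """Wrap a protein-level change in parentheses to flag it as predicted."""
    prefix = "p."
    if not description.startswith(prefix):
        raise ValueError(f"not a protein-level description: {description!r}")
    body = description[len(prefix):]
    if body.startswith("(") and body.endswith(")"):
        return description  # already marked as a prediction
    return f"{prefix}({body})"


print(legacy_stop_to_ter("p.Trp26X"))
print(mark_predicted("p.Arg12Gly"))
```

A real converter would need to handle one-letter codes, reference sequence prefixes and more complex descriptions; this sketch covers only the simplest substitutions.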