Repeated Sequences
Repeated sequence: a sequence between the translation initiation (start) and termination (stop) codon where, compared to a reference sequence, a segment of one or more amino acids (the repeat unit) is present several times, one after the other.
Syntax
| Syntax | sequence_identifier ":p." position sequence "[" total_copy_number "]" |
|---|---|
| Examples | p.Ala2[10], p.Ala2[10];[11], p.Gln18[23], p.(Gln18)[(70_80)] |
| Explanation of Symbols | |
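The syntax above can be illustrated with a small parser. This is a minimal sketch, not an official validator: the names `Repeat` and `parse_repeat` and both regular expressions are hypothetical, and the pattern only covers the forms that appear on this page (a single three-letter amino acid as repeat unit, exact copy numbers, one or more alleles separated by `;`, and a predicted size range in parentheses).

```python
import re
from typing import List, NamedTuple, Optional, Tuple

# Hypothetical patterns covering only the forms shown on this page:
#   p.Ala2[10]    p.Ala2[10];[11]    p.(Gln18)[(70_80)]
_UNIT = re.compile(r"^p\.(\(?)([A-Z][a-z]{2})(\d+)(\)?)")
_COPIES = re.compile(r"\[(?:(\d+)|\((\d+)_(\d+)\))\]")


class Repeat(NamedTuple):
    identifier: Optional[str]       # reference sequence, if one was given
    residue: str                    # three-letter amino acid code, e.g. "Ala"
    position: int                   # position of the first repeated residue
    predicted: bool                 # True when written as p.(Gln18)...
    alleles: List[Tuple[int, int]]  # (min_copies, max_copies) per allele


def parse_repeat(description: str) -> Repeat:
    """Parse a protein-level repeat description into its components."""
    # Everything before the last ":" is treated as the sequence identifier.
    identifier, _, body = description.rpartition(":")
    head = _UNIT.match(body)
    if head is None or (head.group(1) == "(") != (head.group(4) == ")"):
        raise ValueError(f"not a repeat description: {description!r}")

    alleles = []
    for part in body[head.end():].split(";"):
        copies = _COPIES.fullmatch(part)
        if copies is None:
            raise ValueError(f"bad copy number {part!r} in {description!r}")
        if copies.group(1) is not None:
            # Exact copy number: minimum and maximum are the same.
            n = int(copies.group(1))
            alleles.append((n, n))
        else:
            # Estimated range such as (70_80).
            alleles.append((int(copies.group(2)), int(copies.group(3))))

    return Repeat(
        identifier or None,
        head.group(2),
        int(head.group(3)),
        head.group(1) == "(",
        alleles,
    )
```

For instance, `parse_repeat("p.Ala2[10];[11]")` yields two alleles, `(10, 10)` and `(11, 11)`, while `parse_repeat("p.(Gln18)[(70_80)]")` is flagged as predicted with a single `(70, 80)` range.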
Notes
- all variants should be described on the DNA level; descriptions on the RNA and/or protein level may be given in addition.
- repeated sequences include both small (mono-, di-, tri-, etc., amino acid) and larger repeats.
Examples
- p.Ala2[10]
  a repeated amino acid sequence, with the first Ala-residue located at position 2, is present in 10 copies.
  NOTE: when the repeat is variable in the population and the reference sequence has 10 units, the description p.Ala2[9] is preferred over p.Ala11del.
  NOTE: when the repeat is variable in the population and the reference sequence has 10 units, the description p.Ala2[12] is preferred over p.Ala10_Ala11dup.
- p.Ala2[10];[11]
  a repeated amino acid sequence, with the first Ala-residue located at position 2, is present in 10 copies on one allele and 11 copies on the other allele.
- p.Gln18[23]
  a repeated amino acid sequence, with the first Gln-residue located at position 18, is present in 23 copies (HD Gln-repeat based on the HTT (huntingtin) protein reference sequence (GenBank NP_002102.4)).
  NOTE: the protein reference sequence (GenBank NP_002102.4) contains an allele of 23 Gln copies (encoded by 22 cag and 1 caa codons).
- p.(Gln18)[(70_80)]
  the predicted Gln amino acid repeat, starting at position 18, has an estimated size of between 70 and 80 copies.
  NOTE: the repeat can be encoded by a mix of different coding triplets (cag, caa).
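The two NOTEs on p.Ala2[10] rest on simple position arithmetic: a reference repeat of 10 units starting at position 2 ends at position 11, so losing one unit corresponds to p.Ala11del and gaining two corresponds to p.Ala10_Ala11dup. The sketch below works this out for a single-residue repeat unit. The function name `indel_equivalent` is hypothetical, and the choice to place the change at the C-terminal end of the repeat is an assumption taken from the examples above.

```python
from typing import Optional


def indel_equivalent(
    residue: str, start: int, ref_copies: int, copies: int
) -> Optional[str]:
    """Return the del/dup description matching a change in copy number.

    residue     three-letter code of the single repeated amino acid
    start       position of the first repeated residue in the reference
    ref_copies  number of units in the reference sequence
    copies      number of units observed
    Returns None when the copy number is unchanged.
    """
    if ref_copies < 1 or copies < 0:
        raise ValueError("copy numbers must be non-negative, reference >= 1")

    end = start + ref_copies - 1  # last residue of the reference repeat
    diff = copies - ref_copies
    if diff == 0:
        return None

    size = abs(diff)
    if diff > 0 and size > ref_copies:
        # A duplication can only copy residues that exist in the reference.
        raise ValueError("gain is larger than the reference repeat")

    # Assumption: the affected residues are the last ones in the repeat.
    first = end - size + 1
    kind = "dup" if diff > 0 else "del"
    if size == 1:
        return f"p.{residue}{end}{kind}"
    return f"p.{residue}{first}_{residue}{end}{kind}"


# The two cases from the NOTEs above (reference: 10 Ala units from position 2).
print(indel_equivalent("Ala", 2, 10, 9))   # p.Ala11del
print(indel_equivalent("Ala", 2, 10, 12))  # p.Ala10_Ala11dup
```

The copy-number form stays the preferred description for a repeat that is variable in the population. The conversion is only useful for recognizing that two descriptions refer to the same change.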