Class Allele
- java.lang.Object
  - htsjdk.variant.variantcontext.Allele
- All Implemented Interfaces: Serializable, Comparable<Allele>
public class Allele extends Object implements Comparable<Allele>, Serializable
Immutable representation of an allele.
Types of alleles:
Ref: a t C g a // C is the reference base
   : a t G g a // C base is a G in some individuals
   : a t - g a // C base is deleted w.r.t. the reference
   : a t CAg a // A base is inserted w.r.t. the reference sequence
In these cases, what are the alleles?
- SNP polymorphism of C/G -> { C , G } -> C is the reference allele
- 1 base deletion of C -> { tC , t } -> tC is the reference allele; the preceding reference base t is included because null alleles are not allowed
- 1 base insertion of A -> { C , CA } -> C is the reference allele (because null alleles are not allowed)
Suppose I see the following in the population:
Ref: a t C g a // C is the reference base
   : a t G g a // C base is a G in some individuals
   : a t - g a // C base is deleted w.r.t. the reference
How do I represent this? There are three segregating alleles:
{ C , G , - }
and these are represented as:
{ tC, tG, t }
Now suppose I have this more complex example:
Ref: a t C g a // C is the reference base
   : a t - g a
   : a t - - a
   : a t CAg a
There are actually four segregating alleles:
{ Cg , -g, --, and CAg } over bases 2-4
represented as:
{ tCg, tg, t, tCAg }
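The padding step described above can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions, not htsjdk code: the class `PaddingSketch` and its `pad` method are hypothetical, and the gapped strings with '-' are simply the notation used in this description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of htsjdk: turns gapped alleles into padded ones.
final class PaddingSketch {

    /**
     * @param precedingBase the reference base immediately before the variant site
     * @param gapped        alleles written with '-' for deleted bases, e.g. "-g"
     * @return each allele with gaps removed and the preceding base prepended
     */
    static List<String> pad(char precedingBase, List<String> gapped) {
        List<String> padded = new ArrayList<>();
        for (String allele : gapped) {
            // Drop the gap characters, then anchor the allele on the preceding base
            padded.add(precedingBase + allele.replace("-", ""));
        }
        return padded;
    }

    public static void main(String[] args) {
        // The four segregating alleles from the complex example above
        System.out.println(pad('t', Arrays.asList("Cg", "-g", "--", "CAg")));
    }
}
```

Because every allele shares the same leading base, no allele in the padded set is ever empty.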
Critically, it should be possible to apply an allele to a reference sequence to create the correct haplotype sequence:
Allele + reference => haplotype
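As a rough illustration of "Allele + reference => haplotype", the sketch below substitutes a padded allele into a reference string. It is a simplified stand-in and makes several assumptions: `HaplotypeSketch`, `apply`, and the 0-based `start` offset are hypothetical names and conventions, not part of htsjdk, and bases are upper-cased to mirror how Alleles store them.

```java
// Hypothetical helper, not part of htsjdk: applies one allele to a reference string.
final class HaplotypeSketch {

    /**
     * @param reference full reference sequence, e.g. "atCga"
     * @param start     0-based offset of the first base of refAllele (assumed convention)
     * @param refAllele reference allele including any padding base, e.g. "tC"
     * @param altAllele allele to apply, e.g. "t" for the deletion of C
     * @return the haplotype sequence in upper case
     */
    static String apply(String reference, int start, String refAllele, String altAllele) {
        String ref = reference.toUpperCase();
        String refBases = refAllele.toUpperCase();
        // The reference allele must match the reference at the given position
        if (!ref.startsWith(refBases, start)) {
            throw new IllegalArgumentException(
                    "reference allele does not match the reference at offset " + start);
        }
        // Replace the reference allele span with the alternate allele
        return ref.substring(0, start)
                + altAllele.toUpperCase()
                + ref.substring(start + refBases.length());
    }

    public static void main(String[] args) {
        System.out.println(apply("atCga", 2, "C", "G"));      // SNP
        System.out.println(apply("atCga", 1, "tC", "t"));     // 1 base deletion
        System.out.println(apply("atCga", 1, "tCg", "tCAg")); // 1 base insertion
    }
}
```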
For convenience, we are going to create Alleles where the GenomeLoc of the allele is stored outside of the Allele object itself. So there's an idea of an A/C polymorphism independent of its surrounding context. Given a list of alleles, it's possible to determine the "type" of the variation:
A / C @ loc => SNP
- / A => INDEL
If you know which allele is the reference, you can determine whether the variant is an insertion or deletion.
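A minimal sketch of that classification, assuming the padded representation described earlier (so a deletion looks like tC -> t rather than C -> -). The class `VariantTypeSketch` and its `Kind` enum are hypothetical and are not the types htsjdk itself uses.

```java
// Hypothetical classifier, not part of htsjdk: compares a reference and an alternate allele.
final class VariantTypeSketch {

    enum Kind { SNP, INSERTION, DELETION, OTHER }

    static Kind classify(String ref, String alt) {
        // Single differing base on both sides: a SNP
        if (ref.length() == 1 && alt.length() == 1 && !ref.equalsIgnoreCase(alt)) {
            return Kind.SNP;
        }
        // Knowing which allele is the reference tells insertion from deletion
        if (alt.length() > ref.length()) {
            return Kind.INSERTION;
        }
        if (alt.length() < ref.length()) {
            return Kind.DELETION;
        }
        // Equal-length, multi-base, or identical alleles are not covered by this sketch
        return Kind.OTHER;
    }

    public static void main(String[] args) {
        System.out.println(classify("C", "G"));
        System.out.println(classify("tC", "t"));
        System.out.println(classify("C", "CA"));
    }
}
```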
Allele also supports the concept of a NO_CALL allele. This Allele represents a haplotype that couldn't be determined. This is usually represented by a '.' allele.
Note that Alleles store all bases as bytes, in **UPPER CASE**. So 'atc' == 'ATC' from the perspective of an Allele.
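The effect of upper-case storage can be illustrated with a small stand-alone sketch. The names `UpperCaseBasesSketch`, `normalize`, and `sameBases` are hypothetical; this only shows the idea of normalizing bytes before comparison and is not how htsjdk is necessarily implemented.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of htsjdk: compares bases after upper-casing them.
final class UpperCaseBasesSketch {

    /** Converts the bases to ASCII bytes and upper-cases any lower-case letters. */
    static byte[] normalize(String bases) {
        byte[] out = bases.getBytes(StandardCharsets.US_ASCII);
        for (int i = 0; i < out.length; i++) {
            if (out[i] >= 'a' && out[i] <= 'z') {
                out[i] = (byte) (out[i] - 'a' + 'A');
            }
        }
        return out;
    }

    /** True if both strings hold the same bases once case is ignored. */
    static boolean sameBases(String a, String b) {
        return Arrays.equals(normalize(a), normalize(b));
    }

    public static void main(String[] args) {
        System.out.println(sameBases("atc", "ATC"));
    }
}
```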
- See Also:
- Serialized Form
-
-
Field Summary
Fields
Modifier and Type    Field
    Description
static Allele    NO_CALL
static String    NO_CALL_STRING
    A generic static NO_CALL allele for use
static Allele    NON_REF_ALLELE
static String    NON_REF_STRING
    Non ref allele representations
static long    serialVersionUID
static Allele    SPAN_DEL
static String    SPAN_DEL_STRING
    A generic static SPAN_DEL allele for use
static Allele    SV_SIMPLE_CNV
static Allele    SV_SIMPLE_DEL
static Allele    SV_SIMPLE_DUP
static Allele    SV_SIMPLE_INS
static Allele    SV_SIMPLE_INV
static Allele    UNSPECIFIED_ALTERNATE_ALLELE
static String    UNSPECIFIED_ALTERNATE_ALLELE_STRING
-
Method Summary
Modifier and Type    Method
    Description
static boolean    acceptableAlleleBases(byte[] bases)
static boolean    acceptableAlleleBases(byte[] bases, boolean isReferenceAllele)
static boolean    acceptableAlleleBases(String bases)
static boolean    acceptableAlleleBases(String bases, boolean isReferenceAllele)
boolean    basesMatch(byte[] test)
boolean    basesMatch(Allele test)
boolean    basesMatch(String test)
int    compareTo(Allele other)
static Allele    create(byte base)
static Allele    create(byte[] bases)
    Creates a non-Ref allele.
static Allele    create(byte[] bases, boolean isRef)
    Create a new Allele that includes bases and is tagged as the reference allele if isRef == true.
static Allele    create(byte base, boolean isRef)
static Allele    create(Allele allele, boolean ignoreRefState)
    Creates a new allele based on the provided one.
static Allele    create(String bases)
    Creates a non-Ref allele.
static Allele    create(String bases, boolean isRef)
boolean    equals(Allele other, boolean ignoreRefState)
    Returns true if this and other are equal.
boolean    equals(Object other)
static Allele    extend(Allele left, byte[] right)
byte[]    getBases()
    Return the DNA bases segregating in this allele.
String    getBaseString()
    Return the DNA bases segregating in this allele in String format.
byte[]    getDisplayBases()
    Same as #getDisplayString() but returns the result as byte[].
String    getDisplayString()
    Return the printed representation of this allele.
static Allele    getMatchingAllele(Collection<Allele> allAlleles, byte[] alleleBases)
int    hashCode()
boolean    isCalled()
boolean    isNoCall()
boolean    isNonRefAllele()
boolean    isNonReference()
boolean    isReference()
boolean    isSymbolic()
int    length()
static boolean    oneIsPrefixOfOther(Allele a1, Allele a2)
String    toString()
static boolean    wouldBeNoCallAllele(byte[] bases)
static boolean    wouldBeNullAllele(byte[] bases)
static boolean    wouldBeStarAllele(byte[] bases)
static boolean    wouldBeSymbolicAllele(byte[] bases)
-
-
-
Field Detail
-
serialVersionUID
public static final long serialVersionUID
- See Also:
- Constant Field Values
-
NO_CALL_STRING
public static final String NO_CALL_STRING
A generic static NO_CALL allele for use
- See Also:
- Constant Field Values
-
SPAN_DEL_STRING
public static final String SPAN_DEL_STRING
A generic static SPAN_DEL allele for use
- See Also:
- Constant Field Values
-
NON_REF_STRING
public static final String NON_REF_STRING
Non ref allele representations
- See Also:
- Constant Field Values
-
UNSPECIFIED_ALTERNATE_ALLELE_STRING
public static final String UNSPECIFIED_ALTERNATE_ALLELE_STRING
- See Also:
- Constant Field Values
-
SPAN_DEL
public static final Allele SPAN_DEL
-
NO_CALL
public static final Allele NO_CALL
-
NON_REF_ALLELE
public static final Allele NON_REF_ALLELE
-
UNSPECIFIED_ALTERNATE_ALLELE
public static final Allele UNSPECIFIED_ALTERNATE_ALLELE
-
SV_SIMPLE_DEL
public static final Allele SV_SIMPLE_DEL
-
SV_SIMPLE_INS
public static final Allele SV_SIMPLE_INS
-
SV_SIMPLE_INV
public static final Allele SV_SIMPLE_INV
-
SV_SIMPLE_CNV
public static final Allele SV_SIMPLE_CNV
-
SV_SIMPLE_DUP
public static final Allele SV_SIMPLE_DUP
-
-
Constructor Detail
-
Allele
protected Allele(byte[] bases, boolean isRef)
-
Allele
protected Allele(String bases, boolean isRef)
-
Allele
protected Allele(Allele allele, boolean ignoreRefState)
Creates a new allele based on the provided one. Ref state will be copied unless ignoreRefState is true (in which case the returned allele will be non-Ref). This method is efficient because it can skip the validation of the bases (since the original allele was already validated).
- Parameters:
allele - the allele from which to copy the bases
ignoreRefState - should we ignore the reference state of the input allele and use the default ref state?
-
-
Method Detail
-
create
public static Allele create(byte[] bases, boolean isRef)
Create a new Allele that includes bases and is tagged as the reference allele if isRef == true. If bases == '-', a Null allele is created. If bases == '.', a no call Allele is created. If bases == '*', a spanning deletion Allele is created.
- Parameters:
bases - the DNA sequence of this variation, '-', '.', or '*'
isRef - should we make this a reference allele?
- Throws:
IllegalArgumentException - if bases contains illegal characters or is otherwise malformed
-
create
public static Allele create(byte base, boolean isRef)
-
create
public static Allele create(byte base)
-
wouldBeNullAllele
public static boolean wouldBeNullAllele(byte[] bases)
- Parameters:
bases - bases representing an allele
- Returns:
- true if the bases represent the null allele
-
wouldBeStarAllele
public static boolean wouldBeStarAllele(byte[] bases)
- Parameters:
bases - bases representing an allele
- Returns:
- true if the bases represent the SPAN_DEL allele
-
wouldBeNoCallAllele
public static boolean wouldBeNoCallAllele(byte[] bases)
- Parameters:
bases - bases representing an allele
- Returns:
- true if the bases represent the NO_CALL allele
-
wouldBeSymbolicAllele
public static boolean wouldBeSymbolicAllele(byte[] bases)
- Parameters:
bases - bases representing an allele
- Returns:
- true if the bases represent a symbolic allele
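To make the special single-character alleles concrete, here is a deliberately simplified sketch of the kind of checks the wouldBe* methods perform. `SpecialAlleleSketch` and `describe` are hypothetical, and the rules below ('.', '*', '-', and a `<TAG>` form) are only the cases named in this documentation; the real htsjdk checks may accept additional forms.

```java
// Hypothetical classifier, not part of htsjdk: labels the special allele strings
// mentioned in this documentation. Real htsjdk checks may be broader.
final class SpecialAlleleSketch {

    static String describe(String bases) {
        if (bases.equals(".")) {
            return "NO_CALL";   // haplotype could not be determined
        }
        if (bases.equals("*")) {
            return "SPAN_DEL";  // spanning deletion
        }
        if (bases.equals("-")) {
            return "NULL";      // null allele
        }
        if (bases.length() > 1 && bases.startsWith("<") && bases.endsWith(">")) {
            return "SYMBOLIC";  // e.g. <TAG>
        }
        return "BASES";         // ordinary base sequence (not validated here)
    }

    public static void main(String[] args) {
        for (String s : new String[] {".", "*", "-", "<TAG>", "ACG"}) {
            System.out.println(s + " -> " + describe(s));
        }
    }
}
```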
-
acceptableAlleleBases
public static boolean acceptableAlleleBases(String bases)
- Parameters:
bases - bases representing a reference allele
- Returns:
- true if the bases represent a well-formatted allele
-
acceptableAlleleBases
public static boolean acceptableAlleleBases(String bases, boolean isReferenceAllele)
- Parameters:
bases - bases representing an allele
isReferenceAllele - true if a reference allele
- Returns:
- true if the bases represent a well-formatted allele
-
acceptableAlleleBases
public static boolean acceptableAlleleBases(byte[] bases)
- Parameters:
bases - bases representing a reference allele
- Returns:
- true if the bases represent a well-formatted allele
-
acceptableAlleleBases
public static boolean acceptableAlleleBases(byte[] bases, boolean isReferenceAllele)
- Parameters:
bases - bases representing an allele
isReferenceAllele - true if a reference allele
- Returns:
- true if the bases represent a well-formatted allele
-
create
public static Allele create(String bases, boolean isRef)
- Parameters:
bases - bases representing an allele
isRef - is this the reference allele?
- See Also:
Allele(byte[], boolean)
-
create
public static Allele create(String bases)
Creates a non-Ref allele. See Allele(byte[], boolean) for full information.
- Parameters:
bases - bases representing an allele
-
create
public static Allele create(byte[] bases)
Creates a non-Ref allele. See Allele(byte[], boolean) for full information.
- Parameters:
bases - bases representing an allele
-
create
public static Allele create(Allele allele, boolean ignoreRefState)
Creates a new allele based on the provided one. Ref state will be copied unless ignoreRefState is true (in which case the returned allele will be non-Ref). This method is efficient because it can skip the validation of the bases (since the original allele was already validated).
- Parameters:
allele - the allele from which to copy the bases
ignoreRefState - should we ignore the reference state of the input allele and use the default ref state?
-
isNoCall
public boolean isNoCall()
-
isCalled
public boolean isCalled()
-
isReference
public boolean isReference()
-
isNonReference
public boolean isNonReference()
-
isSymbolic
public boolean isSymbolic()
-
getBases
public byte[] getBases()
Return the DNA bases segregating in this allele. Note this isn't reference polarized, so the Null allele is represented by a vector of length 0.
- Returns:
- the segregating bases
-
getBaseString
public String getBaseString()
Return the DNA bases segregating in this allele in String format. This is useful because toString() adds a '*' to reference alleles, and calling toString() on the byte[] returned by getBases() does not print the bases.
- Returns:
- the segregating bases
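The byte[] pitfall mentioned above is a general Java behavior and can be shown without htsjdk. In this sketch, `BaseStringSketch` and `asString` are hypothetical names; the ASCII charset is an assumption that fits DNA bases.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo, not part of htsjdk: why a String accessor for bases is convenient.
final class BaseStringSketch {

    /** Decodes the bases into a readable String. */
    static String asString(byte[] bases) {
        return new String(bases, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] bases = {'A', 'C', 'G'};
        // Arrays inherit Object.toString(), which yields a type tag and hash
        // of the form [B@<hex>, not the bases themselves
        System.out.println(bases.toString());
        // Decoding the bytes gives the readable sequence
        System.out.println(asString(bases));
    }
}
```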
-
getDisplayString
public String getDisplayString()
Return the printed representation of this allele. Same as getBaseString(), except for symbolic alleles. For symbolic alleles, the base string is empty while the display string contains <TAG>.
- Returns:
- the allele string representation
-
getDisplayBases
public byte[] getDisplayBases()
Same as #getDisplayString() but returns the result as byte[]. Slightly faster than getDisplayString().
- Returns:
- the allele string representation
-
equals
public boolean equals(Object other)
-
equals
public boolean equals(Allele other, boolean ignoreRefState)
Returns true if this and other are equal. If ignoreRefState is true, then both alleles are not required to have the same ref tag.
- Parameters:
other - allele to compare to
ignoreRefState - if true, ignore ref state in comparison
- Returns:
- true if this and other are equal
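The role of ignoreRefState can be sketched with a tiny stand-in class. `SimpleAlleleSketch` is hypothetical and models an allele only as upper-cased bases plus a reference flag; it illustrates the comparison described above and is not htsjdk's implementation.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical stand-in for an allele, not part of htsjdk: bases plus a reference flag.
final class SimpleAlleleSketch {
    private final byte[] bases;
    private final boolean isRef;

    SimpleAlleleSketch(String bases, boolean isRef) {
        // Bases are upper-cased so that 'atc' and 'ATC' compare equal
        this.bases = bases.toUpperCase().getBytes(StandardCharsets.US_ASCII);
        this.isRef = isRef;
    }

    /** Compares bases, and the ref flag too unless ignoreRefState is true. */
    boolean equals(SimpleAlleleSketch other, boolean ignoreRefState) {
        if (other == null) {
            return false;
        }
        boolean refStateOk = ignoreRefState || this.isRef == other.isRef;
        return refStateOk && Arrays.equals(this.bases, other.bases);
    }

    public static void main(String[] args) {
        SimpleAlleleSketch ref = new SimpleAlleleSketch("C", true);
        SimpleAlleleSketch alt = new SimpleAlleleSketch("C", false);
        System.out.println(ref.equals(alt, false)); // ref flags differ
        System.out.println(ref.equals(alt, true));  // ref flags ignored
    }
}
```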
-
basesMatch
public boolean basesMatch(byte[] test)
- Parameters:
test - bases to test against
- Returns:
- true if this Allele contains the same bases as test, regardless of its reference status; handles Null and NO_CALL alleles
-
basesMatch
public boolean basesMatch(String test)
- Parameters:
test - bases to test against
- Returns:
- true if this Allele contains the same bases as test, regardless of its reference status; handles Null and NO_CALL alleles
-
basesMatch
public boolean basesMatch(Allele test)
- Parameters:
test - allele to test against
- Returns:
- true if this Allele contains the same bases as test, regardless of its reference status; handles Null and NO_CALL alleles
-
length
public int length()
- Returns:
- the length of this allele. Null and NO_CALL alleles have 0 length.
-
getMatchingAllele
public static Allele getMatchingAllele(Collection<Allele> allAlleles, byte[] alleleBases)
-
compareTo
public int compareTo(Allele other)
- Specified by:
compareTo in interface Comparable<Allele>
-
isNonRefAllele
public boolean isNonRefAllele()
- Returns:
- true if Allele is either <NON_REF> or <*>
-
-