Package org.biojava.bio.alignment
Class SubstitutionMatrix
- java.lang.Object
-
- org.biojava.bio.alignment.SubstitutionMatrix
-
public class SubstitutionMatrix extends java.lang.Object
This object reads a substitution matrix file and constructs a short matrix in memory. Every element of the matrix can be accessed with the method getValueAt, whose parameters are two BioJava symbols, so it is not necessary to access the matrix directly. If there is no value for the two specified Symbols, an Exception is thrown. Substitution matrix files are available at the NCBI FTP directory.
- Author:
- Andreas Dräger
-
-
Field Summary
Fields (Modifier and Type, Field):
protected FiniteAlphabet alphabet
protected java.util.Map<Symbol,java.lang.Integer> colSymbols
protected java.lang.String description
protected short[][] matrix
protected short max
protected short min
protected java.lang.String name
protected java.util.Map<Symbol,java.lang.Integer> rowSymbols
-
Constructor Summary
Constructors (Constructor, Description):
SubstitutionMatrix(java.io.File file)
This constructor can be used to guess the alphabet of this substitution matrix.
SubstitutionMatrix(FiniteAlphabet alpha, short match, short replace)
Constructs a SubstitutionMatrix in which every match and every replacement has the same cost, given by the parameters.
SubstitutionMatrix(FiniteAlphabet alpha, java.io.File matrixFile)
This constructs a SubstitutionMatrix object that contains two Map data structures having BioJava symbols as keys and, as values, the index in the matrix containing the substitution score.
SubstitutionMatrix(FiniteAlphabet alpha, java.lang.String matrixString, java.lang.String name)
With this constructor it is possible to construct a SubstitutionMatrix object from a String holding the contents of a substitution matrix file.
-
Method Summary
Methods (Modifier and Type, Method, Description):
FiniteAlphabet getAlphabet()
Gives the alphabet used by this matrix.
java.lang.String getDescription()
This gives you the description of this matrix if there is one.
short getMax()
The maximum score in this matrix.
short getMin()
The minimum score of this matrix.
java.lang.String getName()
Every substitution matrix has a name like "BLOSUM30" or "PAM160".
static SubstitutionMatrix getSubstitutionMatrix(java.io.BufferedReader reader)
This static method can be used to guess the alphabet of this substitution matrix.
short getValueAt(Symbol row, Symbol col)
There are some substitution matrices containing more columns than rows.
SubstitutionMatrix normalizeMatrix()
With this method you can get a “normalized” SubstitutionMatrix object; however, since this implementation uses a short matrix, the normalized matrix will be scaled by ten.
void printMatrix()
For testing only.
void setDescription(java.lang.String desc)
Sets the description to the given value.
java.lang.String stringnifyDescription()
Converts the description of the matrix to a String.
java.lang.String stringnifyMatrix()
Creates a String representation of this matrix.
java.lang.String toString()
Overrides the inherited method.
-
-
-
Field Detail
-
rowSymbols
protected java.util.Map<Symbol,java.lang.Integer> rowSymbols
-
colSymbols
protected java.util.Map<Symbol,java.lang.Integer> colSymbols
-
matrix
protected short[][] matrix
-
min
protected short min
-
max
protected short max
-
alphabet
protected FiniteAlphabet alphabet
-
description
protected java.lang.String description
-
name
protected java.lang.String name
-
-
Constructor Detail
-
SubstitutionMatrix
public SubstitutionMatrix(FiniteAlphabet alpha, java.io.File matrixFile) throws BioException, java.lang.NumberFormatException, java.io.IOException
This constructs a SubstitutionMatrix object that contains two Map data structures having BioJava symbols as keys and, as values, the index in the matrix containing the substitution score.
- Parameters:
alpha - the alphabet of the matrix (e.g., DNA, RNA, PROTEIN, or PROTEIN-TERM)
matrixFile - the file containing the substitution matrix. Lines starting with '#' are comments. The line starting with a white space is the table head. Every other line has to start with the one-letter representation of the Symbol, followed by the values for the exchange.
- Throws:
java.io.IOException
BioException
java.lang.NumberFormatException
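The file layout described for matrixFile can be illustrated with a minimal, self-contained sketch. It does not use BioJava and is not the library's implementation: the class MatrixTextSketch, its fields, and the method valueAt are hypothetical names, plain characters stand in for BioJava Symbols, and an IllegalArgumentException stands in for the BioException thrown by the real class.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the parsing step; characters replace BioJava Symbols. */
final class MatrixTextSketch {
    final Map<Character, Integer> rowIndex = new LinkedHashMap<>();
    final Map<Character, Integer> colIndex = new LinkedHashMap<>();
    final short[][] scores;

    MatrixTextSketch(String text) {
        List<short[]> rows = new ArrayList<>();
        for (String line : text.split("\\R")) {
            if (line.trim().isEmpty() || line.startsWith("#")) {
                continue; // blank lines and '#' comment lines carry no scores
            }
            String[] tokens = line.trim().split("\\s+");
            if (Character.isWhitespace(line.charAt(0))) {
                // Table head: one column symbol per token.
                for (String token : tokens) {
                    colIndex.put(token.charAt(0), colIndex.size());
                }
            } else {
                // Data row: one-letter symbol first, then one value per column.
                rowIndex.put(tokens[0].charAt(0), rows.size());
                short[] values = new short[tokens.length - 1];
                for (int i = 1; i < tokens.length; i++) {
                    // Short.parseShort throws NumberFormatException on malformed values.
                    values[i - 1] = Short.parseShort(tokens[i]);
                }
                rows.add(values);
            }
        }
        scores = rows.toArray(new short[0][]);
    }

    /** Returns the score for (row, col) or fails if either symbol is unknown. */
    short valueAt(char row, char col) {
        Integer r = rowIndex.get(row);
        Integer c = colIndex.get(col);
        if (r == null || c == null) {
            throw new IllegalArgumentException("No value for " + row + "/" + col);
        }
        return scores[r][c];
    }
}
```

The two maps play the role of rowSymbols and colSymbols: each maps a symbol to its index in the short matrix, so callers never index the array directly.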
-
SubstitutionMatrix
public SubstitutionMatrix(FiniteAlphabet alpha, java.lang.String matrixString, java.lang.String name) throws BioException, java.lang.NumberFormatException, java.io.IOException
With this constructor it is possible to construct a SubstitutionMatrix object from a String holding the contents of a substitution matrix file. The given String contains a number of lines separated by System.getProperty("line.separator"). Everything else is the same as for the constructor above.
- Parameters:
alpha - The FiniteAlphabet to use
matrixString - the String containing the substitution matrix lines
name - of the matrix.
- Throws:
BioException
java.io.IOException
java.lang.NumberFormatException
-
SubstitutionMatrix
public SubstitutionMatrix(FiniteAlphabet alpha, short match, short replace)
Constructs a SubstitutionMatrix in which every match and every replacement has the same cost, given by the parameters. Ambiguous symbols are not considered because there might be too many of them (for proteins).
- Parameters:
alpha -
match -
replace -
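As a rough illustration of what a uniform matrix looks like, the following sketch fills a square short array with one value on the diagonal and another everywhere else. It is an assumption-laden stand-in rather than BioJava code: UniformMatrixSketch and build are hypothetical names, and an integer alphabet size replaces the FiniteAlphabet parameter.

```java
/** Hypothetical sketch: uniform scores over the unambiguous symbols of an alphabet. */
final class UniformMatrixSketch {
    /** Builds a square matrix with match on the diagonal and replace elsewhere. */
    static short[][] build(int alphabetSize, short match, short replace) {
        short[][] scores = new short[alphabetSize][alphabetSize];
        for (int i = 0; i < alphabetSize; i++) {
            for (int j = 0; j < alphabetSize; j++) {
                scores[i][j] = (i == j) ? match : replace;
            }
        }
        return scores;
    }
}
```

For a four-letter alphabet such as DNA, build(4, (short) 1, (short) -1) would give 1 for identical symbols and -1 for every substitution.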
-
SubstitutionMatrix
public SubstitutionMatrix(java.io.File file) throws java.lang.NumberFormatException, java.util.NoSuchElementException, BioException, java.io.IOException
This constructor can be used to guess the alphabet of this substitution matrix. However, it is recommended to apply another constructor if the alphabet is known.
- Parameters:
file - A file containing a substitution matrix.
- Throws:
java.lang.NumberFormatException
java.util.NoSuchElementException
BioException
java.io.IOException
-
-
Method Detail
-
getSubstitutionMatrix
public static SubstitutionMatrix getSubstitutionMatrix(java.io.BufferedReader reader) throws java.lang.NumberFormatException, BioException, java.io.IOException
This static method can be used to guess the alphabet of this substitution matrix. However, it is recommended to apply a constructor that takes the alphabet if the alphabet is known.
- Parameters:
reader -
- Throws:
java.lang.NumberFormatException
BioException
java.io.IOException
-
getValueAt
public short getValueAt(Symbol row, Symbol col) throws BioException
There are some substitution matrices containing more columns than rows. This has to do with the ambiguous symbols. Rows are always complete; columns might not contain the whole information. The matrix is supposed to be symmetric anyway, so you can always pass the ambiguous symbol as the first argument.
- Parameters:
row - Symbol of the row
col - Symbol of the column
- Returns:
- the substitution score for exchanging symbol row with symbol col.
- Throws:
BioException
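Because the matrix is supposed to be symmetric, a caller who does not want to depend on argument order could try both orders before giving up. The helper below is a hypothetical sketch of that idea, not part of SubstitutionMatrix: the class and method names are invented, characters stand in for Symbols, and IllegalArgumentException stands in for BioException.

```java
import java.util.Map;

/** Hypothetical helper: looks up a score and falls back to the swapped order. */
final class SymmetricLookupSketch {
    static short valueAt(short[][] scores,
                         Map<Character, Integer> rowIndex,
                         Map<Character, Integer> colIndex,
                         char row, char col) {
        Integer r = rowIndex.get(row);
        Integer c = colIndex.get(col);
        if (r == null || c == null) {
            // Rely on symmetry: score(row, col) is assumed to equal score(col, row).
            r = rowIndex.get(col);
            c = colIndex.get(row);
        }
        if (r == null || c == null) {
            throw new IllegalArgumentException("No value for " + row + "/" + col);
        }
        return scores[r][c];
    }
}
```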
-
getDescription
public java.lang.String getDescription()
This gives you the description of this matrix if there is one. Normally substitution matrix files like BLOSUM contain some lines of description.
- Returns:
- the comment of the matrix
-
getName
public java.lang.String getName()
Every substitution matrix has a name like "BLOSUM30" or "PAM160". This will be returned by this method.
- Returns:
- the name of the matrix.
-
getMin
public short getMin()
The minimum score of this matrix.
- Returns:
- minimum of the matrix.
-
getMax
public short getMax()
The maximum score in this matrix.
- Returns:
- maximum of the matrix.
-
setDescription
public void setDescription(java.lang.String desc)
Sets the description to the given value.
- Parameters:
desc - a description. This doesn't have to start with '#'.
-
getAlphabet
public FiniteAlphabet getAlphabet()
Gives the alphabet used by this matrix.
- Returns:
- the alphabet of this matrix.
-
stringnifyMatrix
public java.lang.String stringnifyMatrix()
Creates a String representation of this matrix.
- Returns:
- a string representation of this matrix without the description.
-
stringnifyDescription
public java.lang.String stringnifyDescription()
Converts the description of the matrix to a String.
- Returns:
- Gives a description with approximately 60 letters on every line, separated by System.getProperty("line.separator"). Every line starts with #.
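The wrapping behaviour described for the return value can be sketched as a greedy word wrap. This is only an illustration under assumptions: DescriptionWrapSketch and wrap are hypothetical names, the width is passed in explicitly, and the single space after '#' is a choice made here rather than something the documentation specifies.

```java
/** Hypothetical sketch of a '#'-prefixed, width-limited description. */
final class DescriptionWrapSketch {
    static String wrap(String description, int width) {
        String newLine = System.getProperty("line.separator");
        StringBuilder out = new StringBuilder();
        StringBuilder line = new StringBuilder("#");
        for (String word : description.trim().split("\\s+")) {
            // Start a new line when adding the next word would exceed the width.
            if (line.length() > 1 && line.length() + 1 + word.length() > width) {
                out.append(line).append(newLine);
                line = new StringBuilder("#");
            }
            line.append(' ').append(word);
        }
        return out.append(line).append(newLine).toString();
    }
}
```

A single word longer than the width is kept whole on its own line, which is one way to read "approximately 60 letters".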
-
toString
public java.lang.String toString()
Overrides the inherited method.
- Overrides:
toString in class java.lang.Object
- Returns:
- Gives a string representation of the SubstitutionMatrix. This is a valid input for the constructor which needs a matrix string. This String also contains the description of the matrix if there is one.
-
printMatrix
public void printMatrix()
Intended for testing only. It prints the matrix on the screen.
-
normalizeMatrix
public SubstitutionMatrix normalizeMatrix() throws BioException, java.lang.NumberFormatException, java.io.IOException
With this method you can get a “normalized” SubstitutionMatrix object; however, since this implementation uses a short matrix, the normalized matrix will be scaled by ten. If you need values between zero and one, you have to divide every value returned by getValueAt by ten.
- Returns:
- a new and normalized SubstitutionMatrix object given by this substitution matrix. Because this uses a short matrix, all values are scaled by 10.
- Throws:
BioException
java.io.IOException
java.lang.NumberFormatException
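The scaling by ten can be illustrated with a small sketch. It assumes min-max normalization with integer truncation, which the documentation above does not state explicitly; NormalizeSketch and normalize are hypothetical names and the code works on a bare short array rather than on a SubstitutionMatrix.

```java
/** Hypothetical sketch: min-max normalization stored as shorts in the range 0..10. */
final class NormalizeSketch {
    static short[][] normalize(short[][] scores) {
        short min = Short.MAX_VALUE;
        short max = Short.MIN_VALUE;
        for (short[] row : scores) {
            for (short value : row) {
                if (value < min) min = value;
                if (value > max) max = value;
            }
        }
        int range = max - min;
        short[][] out = new short[scores.length][];
        for (int i = 0; i < scores.length; i++) {
            out[i] = new short[scores[i].length];
            for (int j = 0; j < scores[i].length; j++) {
                // Shift to zero, scale to 0..10, and truncate to fit a short.
                out[i][j] = (short) (range == 0 ? 0 : (scores[i][j] - min) * 10 / range);
            }
        }
        return out;
    }
}
```

Dividing a normalized entry by 10.0 then yields a value between zero and one; for example, an entry of 5 corresponds to 0.5.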
-
-