Class SubstitutionMatrix


  • public class SubstitutionMatrix
    extends java.lang.Object

    This class reads a substitution matrix file and builds a short matrix in memory. Each element of the matrix can be retrieved with the method getValueAt, whose parameters are two BioJava symbols, so it is not necessary to access the matrix directly. If there is no value for the two specified Symbols, an Exception is thrown.

    Substitution matrix files are available at the NCBI FTP directory.

    Author:
    Andreas Dräger
    • Constructor Summary

      Constructors 
      Constructor Description
      SubstitutionMatrix​(java.io.File file)
      This constructor can be used to guess the alphabet of this substitution matrix.
      SubstitutionMatrix​(FiniteAlphabet alpha, short match, short replace)
      Constructs a SubstitutionMatrix in which every match and every replacement has the same score, as given by the parameters.
      SubstitutionMatrix​(FiniteAlphabet alpha, java.io.File matrixFile)
      This constructs a SubstitutionMatrix object that contains two Map data structures having BioJava symbols as keys and the value being the index of the matrix containing the substitution score.
      SubstitutionMatrix​(FiniteAlphabet alpha, java.lang.String matrixString, java.lang.String name)
      With this constructor it is possible to construct a SubstitutionMatrix object from a String that holds the contents of a substitution matrix file.
    • Method Summary

      All Methods Static Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      FiniteAlphabet getAlphabet()
      Gives the alphabet used by this matrix.
      java.lang.String getDescription()
      This gives you the description of this matrix if there is one.
      short getMax()
      The maximum score in this matrix.
      short getMin()
      The minimum score of this matrix.
      java.lang.String getName()
      Every substitution matrix has a name like "BLOSUM30" or "PAM160".
      static SubstitutionMatrix getSubstitutionMatrix​(java.io.BufferedReader reader)
      This static factory method can be used to guess the alphabet of the substitution matrix it reads.
      short getValueAt​(Symbol row, Symbol col)
      Some substitution matrices contain more columns than rows.
      SubstitutionMatrix normalizeMatrix()
      With this method you can get a “normalized” SubstitutionMatrix object; however, since this implementation uses a short matrix, the normalized matrix will be scaled by ten.
      void printMatrix()
      Intended for testing only.
      void setDescription​(java.lang.String desc)
      Sets the description to the given value.
      java.lang.String stringnifyDescription()
      Converts the description of the matrix to a String.
      java.lang.String stringnifyMatrix()
      Creates a String representation of this matrix.
      java.lang.String toString()
      Overrides the inherited method.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
    • Field Detail

      • rowSymbols

        protected java.util.Map<Symbol,​java.lang.Integer> rowSymbols
      • colSymbols

        protected java.util.Map<Symbol,​java.lang.Integer> colSymbols
      • matrix

        protected short[][] matrix
      • min

        protected short min
      • max

        protected short max
      • description

        protected java.lang.String description
      • name

        protected java.lang.String name
    • Constructor Detail

      • SubstitutionMatrix

        public SubstitutionMatrix​(FiniteAlphabet alpha,
                                  java.io.File matrixFile)
                           throws BioException,
                                  java.lang.NumberFormatException,
                                  java.io.IOException
        This constructs a SubstitutionMatrix object that contains two Map data structures having BioJava symbols as keys and the value being the index of the matrix containing the substitution score.
        Parameters:
        alpha - the alphabet of the matrix (e.g., DNA, RNA or PROTEIN, or PROTEIN-TERM)
        matrixFile - the file containing the substitution matrix. Lines starting with '#' are comments. The line starting with white space is the table head. Every other line has to start with the one-letter representation of the Symbol, followed by the values for the exchange.
        Throws:
        java.io.IOException
        BioException
        java.lang.NumberFormatException
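
The file layout described above (comment lines, one table head, then one row per symbol) can be illustrated with a small parser. This is only a sketch in plain Java, not the BioJava implementation: the class MatrixFileSketch and its members are hypothetical, and it uses single characters in place of BioJava Symbol objects. Like the constructor, it keeps two maps from symbol to matrix index.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the parsing step; uses chars instead of BioJava Symbols. */
class MatrixFileSketch {
    final Map<Character, Integer> rowIndex = new LinkedHashMap<>();
    final Map<Character, Integer> colIndex = new LinkedHashMap<>();
    short[][] scores;

    /** Parses matrix text: '#' comments, a table head starting with white space, then rows. */
    static MatrixFileSketch parse(String text) {
        MatrixFileSketch m = new MatrixFileSketch();
        List<short[]> rows = new ArrayList<>();
        for (String line : text.split("\\R")) {
            if (line.trim().isEmpty() || line.startsWith("#")) {
                continue; // blank line or comment
            }
            String[] tokens = line.trim().split("\\s+");
            if (Character.isWhitespace(line.charAt(0))) {
                // Table head: one column symbol per token.
                for (String token : tokens) {
                    m.colIndex.put(token.charAt(0), m.colIndex.size());
                }
            } else {
                // Data row: the symbol first, then one score per column.
                m.rowIndex.put(tokens[0].charAt(0), rows.size());
                short[] row = new short[tokens.length - 1];
                for (int i = 1; i < tokens.length; i++) {
                    row[i - 1] = Short.parseShort(tokens[i]); // NumberFormatException on bad input
                }
                rows.add(row);
            }
        }
        m.scores = rows.toArray(new short[0][]);
        return m;
    }

    public static void main(String[] args) {
        String text = "# toy DNA matrix\n   A  C\nA  5 -4\nC -4  5\n";
        MatrixFileSketch m = parse(text);
        int r = m.rowIndex.get('A');
        int c = m.colIndex.get('C');
        System.out.println(m.scores[r][c]); // prints -4
    }
}
```

The toy input in main is made up for illustration; real matrix files from NCBI carry more symbols and longer comment headers.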
      • SubstitutionMatrix

        public SubstitutionMatrix​(FiniteAlphabet alpha,
                                  java.lang.String matrixString,
                                  java.lang.String name)
                           throws BioException,
                                  java.lang.NumberFormatException,
                                  java.io.IOException
        With this constructor it is possible to construct a SubstitutionMatrix object from a String that holds the contents of a substitution matrix file. The given String contains a number of lines separated by System.getProperty("line.separator"). Everything else is the same as for the constructor above.
        Parameters:
        alpha - The FiniteAlphabet to use
        matrixString - a String containing the substitution matrix in the same format as a matrix file
        name - the name of the matrix.
        Throws:
        BioException
        java.io.IOException
        java.lang.NumberFormatException
      • SubstitutionMatrix

        public SubstitutionMatrix​(FiniteAlphabet alpha,
                                  short match,
                                  short replace)
        Constructs a SubstitutionMatrix in which every match and every replacement has the same score, as given by the parameters. Ambiguous symbols are not considered because there might be too many of them (for proteins).
        Parameters:
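        alpha - the alphabet of the matrix
        match - the score assigned to every match
        replace - the score assigned to every replacement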
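
As an illustration of what such a uniform matrix looks like, the sketch below fills a square short array with one score on the diagonal and another everywhere else. The class UniformMatrixSketch and its method are hypothetical and written in plain Java; the size parameter stands for the number of non-ambiguous symbols in the alphabet.

```java
import java.util.Arrays;

/** Hypothetical illustration of a uniform match/replace matrix; not the BioJava code. */
class UniformMatrixSketch {
    /** Returns a size x size matrix with match on the diagonal and replace elsewhere. */
    static short[][] uniform(int size, short match, short replace) {
        short[][] m = new short[size][size];
        for (int i = 0; i < size; i++) {
            for (int j = 0; j < size; j++) {
                m[i][j] = (i == j) ? match : replace;
            }
        }
        return m;
    }

    public static void main(String[] args) {
        // Four symbols, as for an unambiguous DNA alphabet.
        for (short[] row : uniform(4, (short) 1, (short) -1)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```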
      • SubstitutionMatrix

        public SubstitutionMatrix​(java.io.File file)
                           throws java.lang.NumberFormatException,
                                  java.util.NoSuchElementException,
                                  BioException,
                                  java.io.IOException
        This constructor can be used to guess the alphabet of this substitution matrix. However, it is recommended to use another constructor if the alphabet is known.
        Parameters:
        file - A file containing a substitution matrix.
        Throws:
        java.lang.NumberFormatException
        java.util.NoSuchElementException
        BioException
        java.io.IOException
    • Method Detail

      • getSubstitutionMatrix

        public static SubstitutionMatrix getSubstitutionMatrix​(java.io.BufferedReader reader)
                                                        throws java.lang.NumberFormatException,
                                                               BioException,
                                                               java.io.IOException
        This static factory method can be used to guess the alphabet of the substitution matrix it reads. However, it is recommended to use a constructor that takes the alphabet if the alphabet is known.
        Parameters:
        reader - a BufferedReader from which the substitution matrix is read
        Throws:
        java.lang.NumberFormatException
        BioException
        java.io.IOException
      • getValueAt

        public short getValueAt​(Symbol row,
                                Symbol col)
                         throws BioException
        Some substitution matrices contain more columns than rows. This has to do with the ambiguous symbols. Rows are always complete, whereas columns might not contain the whole information. The matrix is supposed to be symmetric anyway, so you can always pass the ambiguous symbol as the first argument.
        Parameters:
        row - Symbol of the row
        col - Symbol of the column
        Returns:
        the score for exchanging symbol row with symbol col.
        Throws:
        BioException
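
Because rows and columns are indexed by separate maps, a symbol can be present on one side and missing on the other. The sketch below shows this with plain Java: LookupSketch is a hypothetical class that uses characters instead of Symbols and an unchecked exception instead of BioException. The helper valueAtEitherOrder is also an assumption; it relies on the symmetry mentioned above and retries with swapped arguments. The toy data places the extra symbol N in the columns, which is only one possible layout.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical lookup with separate row and column indices; not the BioJava code. */
class LookupSketch {
    private final Map<Character, Integer> rowIndex = new HashMap<>();
    private final Map<Character, Integer> colIndex = new HashMap<>();
    private final short[][] scores;

    LookupSketch(String rowSymbols, String colSymbols, short[][] scores) {
        for (int i = 0; i < rowSymbols.length(); i++) {
            rowIndex.put(rowSymbols.charAt(i), i);
        }
        for (int j = 0; j < colSymbols.length(); j++) {
            colIndex.put(colSymbols.charAt(j), j);
        }
        this.scores = scores;
    }

    /** Returns the score at (row, col) or throws if either symbol is unknown. */
    short valueAt(char row, char col) {
        Integer r = rowIndex.get(row);
        Integer c = colIndex.get(col);
        if (r == null || c == null) {
            throw new IllegalArgumentException("no score for " + row + " and " + col);
        }
        return scores[r][c];
    }

    /** Assumes a symmetric matrix: tries (a, b) first and falls back to (b, a). */
    short valueAtEitherOrder(char a, char b) {
        if (rowIndex.containsKey(a) && colIndex.containsKey(b)) {
            return valueAt(a, b);
        }
        return valueAt(b, a);
    }

    public static void main(String[] args) {
        short[][] toy = {{5, -4, -1}, {-4, 5, -1}};
        LookupSketch m = new LookupSketch("AC", "ACN", toy);
        System.out.println(m.valueAt('A', 'N'));            // prints -1
        System.out.println(m.valueAtEitherOrder('N', 'A')); // prints -1
    }
}
```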
      • getDescription

        public java.lang.String getDescription()
        This gives you the description of this matrix if there is one. Normally substitution matrix files like BLOSUM contain some lines of description.
        Returns:
        the comment of the matrix
      • getName

        public java.lang.String getName()
        Every substitution matrix has a name like "BLOSUM30" or "PAM160". This will be returned by this method.
        Returns:
        the name of the matrix.
      • getMin

        public short getMin()
        The minimum score of this matrix.
        Returns:
        minimum of the matrix.
      • getMax

        public short getMax()
        The maximum score in this matrix.
        Returns:
        maximum of the matrix.
      • setDescription

        public void setDescription​(java.lang.String desc)
        Sets the description to the given value.
        Parameters:
        desc - a description. This doesn't have to start with '#'.
      • getAlphabet

        public FiniteAlphabet getAlphabet()
        Gives the alphabet used by this matrix.
        Returns:
        the alphabet of this matrix.
      • stringnifyMatrix

        public java.lang.String stringnifyMatrix()
        Creates a String representation of this matrix.
        Returns:
        a string representation of this matrix without the description.
      • stringnifyDescription

        public java.lang.String stringnifyDescription()
        Converts the description of the matrix to a String.
        Returns:
        a description with approximately 60 characters on every line, with lines separated by System.getProperty("line.separator"). Every line starts with #.
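
The wrapping described above can be approximated with a simple word wrap. The following plain Java sketch is an assumption about how such output could be produced, not the library's code: DescriptionWrapSketch and wrap are hypothetical names, and the line width is passed in as a parameter.

```java
/** Hypothetical word wrap that prefixes every line with '#'; not the BioJava code. */
class DescriptionWrapSketch {
    /** Breaks the description at word boundaries so that lines stay within width. */
    static String wrap(String description, int width) {
        String newline = System.getProperty("line.separator");
        StringBuilder out = new StringBuilder();
        StringBuilder line = new StringBuilder("#");
        for (String word : description.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // happens only for an empty description
            }
            // Start a new line when the next word would exceed the width.
            if (line.length() > 1 && line.length() + 1 + word.length() > width) {
                out.append(line).append(newline);
                line = new StringBuilder("#");
            }
            line.append(' ').append(word);
        }
        if (line.length() > 1) {
            out.append(line).append(newline);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "BLOSUM style matrices usually carry several lines of comments "
                + "that describe how the scores were derived";
        System.out.print(wrap(text, 60));
    }
}
```

A single word longer than the width is kept on its own line rather than split, which is why the result is only approximately bounded.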
      • toString

        public java.lang.String toString()
        Overrides the inherited method.
        Overrides:
        toString in class java.lang.Object
        Returns:
        a string representation of the SubstitutionMatrix. It is valid input for the constructor that takes a matrix string, and it also contains the description of the matrix if there is one.
      • printMatrix

        public void printMatrix()
        Intended for testing only. It prints the matrix on the screen.
      • normalizeMatrix

        public SubstitutionMatrix normalizeMatrix()
                                           throws BioException,
                                                  java.lang.NumberFormatException,
                                                  java.io.IOException
        With this method you can get a “normalized” SubstitutionMatrix object; however, since this implementation uses a short matrix, the normalized matrix will be scaled by ten. If you need values between zero and one, you have to divide every value returned by getValueAt by ten.
        Returns:
        a new, normalized SubstitutionMatrix object derived from this substitution matrix. Because this uses a short matrix, all values are scaled by 10.
        Throws:
        BioException
        java.io.IOException
        java.lang.NumberFormatException
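
The scaling by ten can be illustrated with a small calculation. The sketch below assumes min-max normalization, which this documentation does not confirm; it only states that the result is scaled by ten. NormalizeSketch and normalizeTimesTen are hypothetical names in plain Java.

```java
/** Hypothetical min-max normalization scaled by ten; the actual formula may differ. */
class NormalizeSketch {
    /** Maps the smallest score to 0 and the largest to 10, rounding to short. */
    static short[][] normalizeTimesTen(short[][] scores) {
        int min = Short.MAX_VALUE;
        int max = Short.MIN_VALUE;
        for (short[] row : scores) {
            for (short v : row) {
                min = Math.min(min, v);
                max = Math.max(max, v);
            }
        }
        int range = max - min;
        short[][] out = new short[scores.length][];
        for (int i = 0; i < scores.length; i++) {
            out[i] = new short[scores[i].length];
            for (int j = 0; j < scores[i].length; j++) {
                if (range != 0) { // a constant matrix stays all zero
                    out[i][j] = (short) Math.round(10.0 * (scores[i][j] - min) / range);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        short[][] toy = {{5, -4}, {-4, 5}};
        short[][] n = normalizeTimesTen(toy);
        // Divide by ten to obtain values between zero and one.
        System.out.println(n[0][0] / 10.0); // prints 1.0
        System.out.println(n[0][1] / 10.0); // prints 0.0
    }
}
```

Storing the scaled values as short keeps the integer matrix type but limits the resolution to steps of 0.1 after division.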