Package htsjdk.samtools.util
Class Histogram<K extends Comparable>
- java.lang.Object
-
- htsjdk.samtools.util.Histogram<K>
-
- All Implemented Interfaces:
Serializable
public final class Histogram<K extends Comparable> extends Object implements Serializable
Class for computing and accessing histogram-type data. Stored internally in a sorted Map so that keys can be iterated in order.
- See Also:
- Serialized Form
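The effect of sorted-map storage can be illustrated with a small stand-in built on java.util.TreeMap. This is only a sketch of the idea; the class name SortedBinsSketch and its methods are hypothetical and are not part of htsjdk.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical stand-in for illustration; not htsjdk's Histogram implementation.
final class SortedBinsSketch {
    // A TreeMap keeps its keys sorted, so iteration visits bins in key order.
    private final TreeMap<Integer, Double> bins = new TreeMap<>();

    // Adds 1 to the bin for the given key, creating the bin if needed.
    void increment(int key) {
        bins.merge(key, 1.0, Double::sum);
    }

    // Returns the keys in ascending order, regardless of insertion order.
    List<Integer> keysInOrder() {
        return new ArrayList<>(bins.keySet());
    }

    // Returns the value stored for the key, or 0 if the bin does not exist.
    double count(int key) {
        return bins.getOrDefault(key, 0.0);
    }

    public static void main(String[] args) {
        SortedBinsSketch h = new SortedBinsSketch();
        for (int k : new int[] {30, 10, 20, 10}) {
            h.increment(k);
        }
        System.out.println(h.keysInOrder());
    }
}
```

Keys come back in ascending order even though they were inserted out of order.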
-
-
Nested Class Summary
Nested Classes
static class Histogram.Bin<K extends Comparable>
Represents a bin in the Histogram.
-
Constructor Summary
Constructors
Histogram()
Constructs a new Histogram with default bin and value labels.
Histogram(Histogram<K> in)
Copy constructor for a histogram.
Histogram(String binLabel, String valueLabel)
Constructs a new Histogram with supplied bin and value labels.
Histogram(String binLabel, String valueLabel, Comparator<? super K> comparator)
Constructor that takes labels for the bin and values and a comparator to sort the bins.
Histogram(Comparator<? super K> comparator)
Constructs a new Histogram that'll use the supplied comparator to sort keys.
-
Method Summary
All Methods
void addHistogram(Histogram<K> addHistogram)
Mutable method that allows the addition of a Histogram into the current one.
Comparator<? super K> comparator()
Returns the comparator used to order the keys in this histogram, or null if this histogram uses the natural ordering of its keys.
boolean containsKey(K key)
Returns whether this histogram contains the given key.
Histogram<K> divideByHistogram(Histogram<K> divisorHistogram)
Immutable method that divides the current Histogram by an input Histogram and generates a new one. Throws an exception if the bins don't match up exactly.
boolean equals(Object o)
Checks that the labels and values in the two histograms are identical.
double estimateSdViaMad()
Returns a value that is intended to estimate the standard deviation of the distribution, if the distribution is essentially normal, by using the median absolute deviation to remove the effect of erroneous massive outliers.
Histogram.Bin<K> get(K key)
Retrieves the bin associated with the given key.
String getBinLabel()
double getCount()
double getCumulativeProbability(double v)
Returns the cumulative probability of observing a value <= v when sampling the distribution represented by this histogram.
double getGeometricMean()
Gets the geometric mean of the distribution.
double getMax()
Returns the key with the highest count.
double getMean()
Assuming that the key type for the histogram is a Number type, returns the mean of all the items added to the histogram.
double getMeanBinSize()
Calculates the mean bin size.
double getMedian()
double getMedianAbsoluteDeviation()
Gets the median absolute deviation of the distribution.
double getMedianBinSize()
Calculates the median bin size.
double getMin()
Returns the key with the lowest count.
double getMode()
Returns id of the Bin that's the mode of the distribution (i.e. the largest bin).
double getPercentile(double percentile)
Gets the bin in which the given percentile falls.
double getStandardDeviation()
double getStandardDeviationBinSize(double mean)
Calculates the standard deviation of the bin size.
double getSum()
Returns the sum of the products of the histogram bin ids and the number of entries in each bin.
double getSumOfValues()
Returns the sum of the number of entries in each bin.
String getValueLabel()
int hashCode()
void increment(K id)
Increments the value in the designated bin by 1.
void increment(K id, double increment)
Increments the value in the designated bin by the supplied increment.
boolean isEmpty()
Returns true if this histogram has no data in it, false otherwise.
Set<K> keySet()
Returns the set of keys for this histogram.
void prefillBins(K... ids)
Prefill the histogram with the supplied set of bins.
void setBinLabel(String binLabel)
void setValueLabel(String valueLabel)
int size()
Returns the size of this histogram.
String toString()
void trimByTailLimit(int tailLimit)
Trims the histogram when the bins in the tail of the distribution contain fewer than mode/tailLimit items.
void trimByWidth(int width)
Trims the histogram so that only bins <= width are kept.
Collection<Histogram.Bin<K>> values()
Returns a Collection view of the values contained in this histogram.
-
-
-
Constructor Detail
-
Histogram
public Histogram()
Constructs a new Histogram with default bin and value labels.
-
Histogram
public Histogram(String binLabel, String valueLabel)
Constructs a new Histogram with supplied bin and value labels.
-
Histogram
public Histogram(Comparator<? super K> comparator)
Constructs a new Histogram that'll use the supplied comparator to sort keys.
-
Histogram
public Histogram(String binLabel, String valueLabel, Comparator<? super K> comparator)
Constructor that takes labels for the bin and values and a comparator to sort the bins.
-
-
Method Detail
-
prefillBins
public void prefillBins(K... ids)
Prefill the histogram with the supplied set of bins.
-
increment
public void increment(K id)
Increments the value in the designated bin by 1.
-
increment
public void increment(K id, double increment)
Increments the value in the designated bin by the supplied increment.
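The prefill and increment operations above can be sketched with a plain TreeMap. The class IncrementSketch is hypothetical, and starting prefilled bins at zero is an assumption: this page does not state what value a prefilled bin holds.

```java
import java.util.TreeMap;

// Hypothetical illustration of prefill and increment semantics; not htsjdk code.
final class IncrementSketch {
    private final TreeMap<String, Double> bins = new TreeMap<>();

    // Assumption: prefilled bins start at 0 and existing bins are left untouched.
    void prefill(String... ids) {
        for (String id : ids) {
            bins.putIfAbsent(id, 0.0);
        }
    }

    // Increment by 1 is the weighted increment with a weight of 1.
    void increment(String id) {
        increment(id, 1.0);
    }

    // Adds the supplied amount to the bin, creating the bin if it is missing.
    void increment(String id, double by) {
        bins.merge(id, by, Double::sum);
    }

    double value(String id) {
        return bins.getOrDefault(id, 0.0);
    }

    int size() {
        return bins.size();
    }

    public static void main(String[] args) {
        IncrementSketch h = new IncrementSketch();
        h.prefill("A", "C", "G", "T");
        h.increment("A");
        h.increment("C", 2.5);
        System.out.println(h.value("A") + " " + h.value("C") + " " + h.size());
    }
}
```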
-
getBinLabel
public String getBinLabel()
-
setBinLabel
public void setBinLabel(String binLabel)
-
getValueLabel
public String getValueLabel()
-
setValueLabel
public void setValueLabel(String valueLabel)
-
equals
public boolean equals(Object o)
Checks that the labels and values in the two histograms are identical.
-
getMean
public double getMean()
Assuming that the key type for the histogram is a Number type, returns the mean of all the items added to the histogram.
-
getSum
public double getSum()
Returns the sum of the products of the histogram bin ids and the number of entries in each bin. Note: This is only supported if this histogram stores instances of Number.
-
getSumOfValues
public double getSumOfValues()
Returns the sum of the number of entries in each bin.
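The relationship between getSum, getSumOfValues, and getMean can be sketched on a plain map of numeric keys to counts. The helper names below are hypothetical, and computing the mean as the first sum divided by the second is the standard weighted mean, assumed here rather than taken from the htsjdk source.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helpers illustrating the two sums and the mean; not htsjdk code.
final class SumAndMeanSketch {
    // Sum of (bin id * number of entries in that bin).
    static double sum(Map<Integer, Double> bins) {
        double total = 0;
        for (Map.Entry<Integer, Double> e : bins.entrySet()) {
            total += e.getKey() * e.getValue();
        }
        return total;
    }

    // Sum of the number of entries across all bins.
    static double sumOfValues(Map<Integer, Double> bins) {
        double total = 0;
        for (double count : bins.values()) {
            total += count;
        }
        return total;
    }

    // Assumption: the mean is the weighted mean of the keys.
    static double mean(Map<Integer, Double> bins) {
        return sum(bins) / sumOfValues(bins);
    }

    public static void main(String[] args) {
        Map<Integer, Double> bins = new TreeMap<>();
        bins.put(1, 2.0);
        bins.put(2, 1.0);
        bins.put(5, 1.0);
        System.out.println(sum(bins) + " " + sumOfValues(bins) + " " + mean(bins));
    }
}
```

For bins {1: 2, 2: 1, 5: 1} the sum of products is 9, the sum of values is 4, and the mean is 2.25.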
-
getStandardDeviation
public double getStandardDeviation()
-
getMeanBinSize
public double getMeanBinSize()
Calculates the mean bin size
-
size
public int size()
Returns the size of this histogram.
-
comparator
public Comparator<? super K> comparator()
Returns the comparator used to order the keys in this histogram, or null if this histogram uses the natural ordering of its keys.
- Returns:
- the comparator used to order the keys in this histogram, or null if this histogram uses the natural ordering of its keys
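The null-for-natural-ordering convention is the same one java.util.TreeMap.comparator() follows. The sketch below shows that behaviour on TreeMap directly; the class ComparatorSketch is hypothetical and does not call htsjdk.

```java
import java.util.Comparator;
import java.util.TreeMap;

// Demonstrates the comparator() convention using java.util.TreeMap.
final class ComparatorSketch {
    // No comparator supplied: keys use their natural ordering.
    static TreeMap<Integer, Double> naturalOrder() {
        return new TreeMap<>();
    }

    // Explicit comparator supplied: keys are ordered from largest to smallest.
    static TreeMap<Integer, Double> reversed() {
        return new TreeMap<>(Comparator.<Integer>reverseOrder());
    }

    public static void main(String[] args) {
        System.out.println(naturalOrder().comparator() == null);
        System.out.println(reversed().comparator() != null);
    }
}
```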
-
getMedianBinSize
public double getMedianBinSize()
Calculates the median bin size
-
values
public Collection<Histogram.Bin<K>> values()
Returns a Collection view of the values contained in this histogram. The collection's iterator returns the values in ascending order of the corresponding keys.
-
getStandardDeviationBinSize
public double getStandardDeviationBinSize(double mean)
Calculates the standard deviation of the bin size
-
getPercentile
public double getPercentile(double percentile)
Gets the bin in which the given percentile falls. Should only be called on histograms with non-negative values and a positive sum of values.
- Parameters:
percentile - a value between 0 and 1
- Returns:
- the bin value in which the percentile falls
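One common way to locate the bin for a percentile is to walk the bins in key order and stop once the running share of the total reaches the requested fraction. The sketch below does that on a TreeMap; the class PercentileSketch is hypothetical, and the boundary rule (>=) is an assumption, since this page does not state how htsjdk treats exact ties.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical cumulative-walk percentile lookup; not htsjdk's implementation.
final class PercentileSketch {
    static double percentile(TreeMap<Integer, Double> bins, double percentile) {
        double total = 0;
        for (double count : bins.values()) {
            total += count;
        }
        if (total <= 0) {
            throw new IllegalArgumentException("sum of values must be positive");
        }
        double cumulative = 0;
        for (Map.Entry<Integer, Double> e : bins.entrySet()) {
            cumulative += e.getValue();
            // Assumption: the first bin whose cumulative share reaches the
            // requested fraction is the one the percentile falls in.
            if (cumulative / total >= percentile) {
                return e.getKey();
            }
        }
        return bins.lastKey();
    }

    public static void main(String[] args) {
        TreeMap<Integer, Double> bins = new TreeMap<>();
        bins.put(1, 1.0);
        bins.put(2, 2.0);
        bins.put(3, 1.0);
        System.out.println(percentile(bins, 0.5));
    }
}
```

The guard on the total mirrors the documented precondition that the sum of values be positive.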
-
getCumulativeProbability
public double getCumulativeProbability(double v)
Returns the cumulative probability of observing a value <= v when sampling the distribution represented by this histogram.
- Throws:
UnsupportedOperationException - if this histogram does not store instances of Number
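The cumulative probability of a value <= v is the share of all entries that sit in bins with keys at or below v. A minimal sketch on a TreeMap follows; the class CumulativeSketch is hypothetical and is not the htsjdk implementation.

```java
import java.util.TreeMap;

// Hypothetical cumulative-probability helper; not htsjdk code.
final class CumulativeSketch {
    static double cumulativeProbability(TreeMap<Double, Double> bins, double v) {
        double total = 0;
        for (double count : bins.values()) {
            total += count;
        }
        double atOrBelow = 0;
        // headMap(v, true) is the view of all bins whose key is <= v.
        for (double count : bins.headMap(v, true).values()) {
            atOrBelow += count;
        }
        return atOrBelow / total;
    }

    public static void main(String[] args) {
        TreeMap<Double, Double> bins = new TreeMap<>();
        bins.put(1.0, 1.0);
        bins.put(2.0, 2.0);
        bins.put(3.0, 1.0);
        System.out.println(cumulativeProbability(bins, 2.0));
    }
}
```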
-
getMedian
public double getMedian()
-
getMedianAbsoluteDeviation
public double getMedianAbsoluteDeviation()
Gets the median absolute deviation of the distribution.
-
estimateSdViaMad
public double estimateSdViaMad()
Returns a value that is intended to estimate the standard deviation of the distribution, if the distribution is essentially normal, by using the median absolute deviation to remove the effect of erroneous massive outliers.
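As the method name suggests, the median absolute deviation (MAD) can be scaled into a standard deviation estimate that is robust to outliers. The sketch below works on raw samples rather than on bins, to keep it short. The class MadSketch is hypothetical, and the scale factor 1.4826 is the usual consistency constant for normally distributed data, assumed here rather than confirmed from this page.

```java
import java.util.Arrays;

// Hypothetical MAD-based standard deviation estimate; not htsjdk code.
final class MadSketch {
    // Median of a sample; averages the two middle values for even sizes.
    static double median(double[] values) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        return n % 2 == 1 ? sorted[n / 2] : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
    }

    // Median of the absolute deviations from the median.
    static double mad(double[] values) {
        double m = median(values);
        double[] deviations = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            deviations[i] = Math.abs(values[i] - m);
        }
        return median(deviations);
    }

    // Assumption: 1.4826 converts MAD to a standard deviation for normal data.
    static double sdViaMad(double[] values) {
        return 1.4826 * mad(values);
    }

    public static void main(String[] args) {
        double[] sample = {1, 2, 3, 4, 100};
        System.out.println(median(sample) + " " + mad(sample) + " " + sdViaMad(sample));
    }
}
```

For the sample {1, 2, 3, 4, 100} the median is 3 and the MAD is 1, so the single outlier at 100 barely affects the estimate.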
-
getMode
public double getMode()
Returns id of the Bin that's the mode of the distribution (i.e. the largest bin).
- Throws:
UnsupportedOperationException - if this histogram does not store instances of Number
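Finding the mode amounts to scanning the bins for the largest value and returning its key. A minimal sketch follows; the class ModeSketch is hypothetical, and resolving ties in favour of the smallest key is an assumption this page does not confirm.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical mode lookup; not htsjdk code.
final class ModeSketch {
    static int mode(TreeMap<Integer, Double> bins) {
        Integer best = null;
        double bestCount = Double.NEGATIVE_INFINITY;
        for (Map.Entry<Integer, Double> e : bins.entrySet()) {
            // Strict comparison keeps the first (smallest) key on ties.
            if (e.getValue() > bestCount) {
                best = e.getKey();
                bestCount = e.getValue();
            }
        }
        if (best == null) {
            throw new IllegalStateException("histogram is empty");
        }
        return best;
    }

    public static void main(String[] args) {
        TreeMap<Integer, Double> bins = new TreeMap<>();
        bins.put(1, 2.0);
        bins.put(5, 7.0);
        bins.put(9, 3.0);
        System.out.println(mode(bins));
    }
}
```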
-
getMin
public double getMin()
Returns the key with the lowest count.
- Throws:
UnsupportedOperationException - if this histogram does not store instances of Number
-
getMax
public double getMax()
Returns the key with the highest count.
- Throws:
UnsupportedOperationException - if this histogram does not store instances of Number
-
getCount
public double getCount()
-
getGeometricMean
public double getGeometricMean()
Gets the geometric mean of the distribution.
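For binned data the geometric mean can be computed in log space: take the count-weighted average of ln(key) and exponentiate. The sketch below is hypothetical (GeometricMeanSketch is not an htsjdk class) and assumes all keys are strictly positive.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical geometric-mean helper for binned data; not htsjdk code.
final class GeometricMeanSketch {
    static double geometricMean(Map<Integer, Double> bins) {
        double total = 0;
        double logSum = 0;
        for (Map.Entry<Integer, Double> e : bins.entrySet()) {
            // Each key contributes ln(key) once per entry in its bin.
            logSum += e.getValue() * Math.log(e.getKey());
            total += e.getValue();
        }
        return Math.exp(logSum / total);
    }

    public static void main(String[] args) {
        Map<Integer, Double> bins = new TreeMap<>();
        bins.put(1, 1.0);
        bins.put(4, 1.0);
        System.out.println(geometricMean(bins));
    }
}
```

With one entry at 1 and one at 4, the result is the square root of 4, up to floating-point rounding.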
-
trimByTailLimit
public void trimByTailLimit(int tailLimit)
Trims the histogram when the bins in the tail of the distribution contain fewer than mode/tailLimit items
-
isEmpty
public boolean isEmpty()
Returns true if this histogram has no data in it, false otherwise.
-
trimByWidth
public void trimByWidth(int width)
Trims the histogram so that only bins <= width are kept.
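Read literally, trimming by width drops every bin whose key is greater than width. On a TreeMap that can be sketched with a tail view, as below. The class TrimSketch is hypothetical, and the interpretation of "bins <= width" as a comparison on integer keys is an assumption drawn only from the one-line description above.

```java
import java.util.TreeMap;

// Hypothetical width trim on a sorted map; not htsjdk code.
final class TrimSketch {
    static void trimByWidth(TreeMap<Integer, Double> bins, int width) {
        // tailMap(width, false) is a live view of keys strictly greater than
        // width; clearing the view removes those bins from the backing map.
        bins.tailMap(width, false).clear();
    }

    public static void main(String[] args) {
        TreeMap<Integer, Double> bins = new TreeMap<>();
        for (int k : new int[] {1, 5, 10, 11, 20}) {
            bins.put(k, 1.0);
        }
        trimByWidth(bins, 10);
        System.out.println(bins.keySet());
    }
}
```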
-
divideByHistogram
public Histogram<K> divideByHistogram(Histogram<K> divisorHistogram)
Immutable method that divides the current Histogram by an input Histogram and generates a new one. Throws an exception if the bins don't match up exactly.
- Parameters:
divisorHistogram - the histogram to divide by
- Returns:
- a new Histogram holding the result of the division
- Throws:
IllegalArgumentException - if the keySet of this histogram is not equal to the keySet of the given divisorHistogram
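The contract above (matching key sets, a new result, inputs left untouched) can be sketched as a bin-wise division over two TreeMaps. The class DivideSketch is hypothetical and is not the htsjdk implementation.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical bin-wise division that returns a new map; not htsjdk code.
final class DivideSketch {
    static TreeMap<Integer, Double> divide(TreeMap<Integer, Double> numerator,
                                           TreeMap<Integer, Double> divisor) {
        // The bins must match exactly, mirroring the documented exception.
        if (!numerator.keySet().equals(divisor.keySet())) {
            throw new IllegalArgumentException("key sets differ");
        }
        TreeMap<Integer, Double> result = new TreeMap<>();
        for (Map.Entry<Integer, Double> e : numerator.entrySet()) {
            result.put(e.getKey(), e.getValue() / divisor.get(e.getKey()));
        }
        // Neither input is modified; the caller receives a fresh map.
        return result;
    }

    public static void main(String[] args) {
        TreeMap<Integer, Double> a = new TreeMap<>();
        a.put(1, 4.0);
        a.put(2, 9.0);
        TreeMap<Integer, Double> b = new TreeMap<>();
        b.put(1, 2.0);
        b.put(2, 3.0);
        System.out.println(divide(a, b));
    }
}
```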
-
addHistogram
public void addHistogram(Histogram<K> addHistogram)
Mutable method that allows the addition of a Histogram into the current one.
- Parameters:
addHistogram - the histogram to add into this one
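Adding one histogram into another in place can be sketched as a bin-wise merge on a TreeMap. The class AddSketch is hypothetical, and creating bins for keys that exist only in the added histogram is an assumption not stated on this page.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-place merge of one set of bins into another; not htsjdk code.
final class AddSketch {
    static void addInto(TreeMap<Integer, Double> target, Map<Integer, Double> other) {
        for (Map.Entry<Integer, Double> e : other.entrySet()) {
            // Sums values for shared keys; assumption: missing keys are created.
            target.merge(e.getKey(), e.getValue(), Double::sum);
        }
    }

    public static void main(String[] args) {
        TreeMap<Integer, Double> target = new TreeMap<>();
        target.put(1, 1.0);
        target.put(2, 2.0);
        TreeMap<Integer, Double> other = new TreeMap<>();
        other.put(2, 3.0);
        other.put(3, 4.0);
        addInto(target, other);
        System.out.println(target);
    }
}
```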
-
get
public Histogram.Bin<K> get(K key)
Retrieves the bin associated with the given key.
-
containsKey
public boolean containsKey(K key)
Return whether this histogram contains the given key.
-
-