Object

org.apache.spark.util.StatCounter

All Implemented Interfaces:
Serializable

A class for tracking the statistics of a set of numbers (count, mean and variance) in a numerically robust way. Includes support for merging two StatCounters. Based on Welford and Chan's algorithms for running variance.

  • Constructor Summary

    Constructors

    StatCounter()

    Initialize the StatCounter with no values.

    StatCounter(scala.collection.IterableOnce<Object> values)

  • Method Summary

    static StatCounter

    apply(scala.collection.immutable.Seq<Object> values)

    Build a StatCounter from a list of values passed as variable-length arguments.

    static StatCounter

    apply(scala.collection.IterableOnce<Object> values)

    Build a StatCounter from a list of values.

    StatCounter

    copy()

    Clone this StatCounter.

    long

    count()

    double

    max()

    double

    mean()

    StatCounter

    merge(double value)

    Add a value into this StatCounter, updating the internal statistics.

    StatCounter

    merge(StatCounter other)

    Merge another StatCounter into this one, adding up the internal statistics.

    StatCounter

    merge(scala.collection.IterableOnce<Object> values)

    Add multiple values into this StatCounter, updating the internal statistics.

    double

    min()

    double

    popStdev()

    Return the population standard deviation of the values.

    double

    popVariance()

    Return the population variance of the values.

    double

    sampleStdev()

    Return the sample standard deviation of the values, which corrects for bias in estimating the variance by dividing by N-1 instead of N.

    double

    sampleVariance()

    Return the sample variance, which corrects for bias in estimating the variance by dividing by N-1 instead of N.

    double

    stdev()

    Return the population standard deviation of the values.

    double

    sum()

    String

    toString()

    double

    variance()

    Return the population variance of the values.

  • Constructor Details

    • StatCounter

      public StatCounter(scala.collection.IterableOnce<Object> values)

    • StatCounter

      public StatCounter()

      Initialize the StatCounter with no values.

  • Method Details

    • apply

      public static StatCounter apply(scala.collection.IterableOnce<Object> values)

      Build a StatCounter from a list of values.

    • apply

      public static StatCounter apply(scala.collection.immutable.Seq<Object> values)

      Build a StatCounter from a list of values passed as variable-length arguments.

    • merge

      public StatCounter merge(double value)

      Add a value into this StatCounter, updating the internal statistics.
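
      The single-value update can be illustrated with a Welford-style running accumulator. The following is a minimal sketch under stated assumptions, not Spark's source: the class name WelfordSketch and the fields n, mu and m2 are hypothetical, and returning NaN for an empty accumulator is a choice made for this sketch.

```java
/** Hypothetical sketch of a Welford-style running accumulator (not Spark's source). */
final class WelfordSketch {
    private long n = 0;       // number of values seen so far
    private double mu = 0.0;  // running mean
    private double m2 = 0.0;  // running sum of squared deviations from the mean

    /** Fold one value into the running statistics and return this for chaining. */
    WelfordSketch merge(double value) {
        double delta = value - mu;   // deviation from the old mean
        n += 1;
        mu += delta / n;             // move the mean toward the new value
        m2 += delta * (value - mu);  // second factor uses the updated mean
        return this;
    }

    long count() { return n; }

    double mean() { return mu; }

    /** Population variance (divide by N); NaN when empty is an assumption of this sketch. */
    double variance() { return n == 0 ? Double.NaN : m2 / n; }

    public static void main(String[] args) {
        WelfordSketch s = new WelfordSketch().merge(1.0).merge(2.0).merge(3.0).merge(4.0);
        System.out.println(s.count() + " " + s.mean() + " " + s.variance()); // prints "4 2.5 1.25"
    }
}
```

      Because each update touches only three fields, the values never need to be stored, and the variance is accumulated from deviations rather than from a difference of large sums.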

    • merge

      public StatCounter merge(scala.collection.IterableOnce<Object> values)

      Add multiple values into this StatCounter, updating the internal statistics.

    • merge

      public StatCounter merge(StatCounter other)

      Merge another StatCounter into this one, adding up the internal statistics.
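
      Combining two summaries without revisiting the raw values can be sketched with a Chan-style pairwise merge. This is an illustration under stated assumptions, not Spark's source: the class ChanMergeSketch, its fields n, mu and m2, and the helper of(...) are hypothetical names introduced here.

```java
/** Hypothetical sketch of a Chan-style merge of two partial summaries (not Spark's source). */
final class ChanMergeSketch {
    long n;     // number of values summarized
    double mu;  // mean of those values
    double m2;  // sum of squared deviations from the mean

    /** Summarize one chunk of values with a plain two-pass computation. */
    static ChanMergeSketch of(double... values) {
        ChanMergeSketch s = new ChanMergeSketch();
        s.n = values.length;
        double sum = 0.0;
        for (double v : values) sum += v;
        s.mu = s.n == 0 ? 0.0 : sum / s.n;
        for (double v : values) s.m2 += (v - s.mu) * (v - s.mu);
        return s;
    }

    /** Fold another summary into this one using only the two sets of statistics. */
    ChanMergeSketch merge(ChanMergeSketch other) {
        if (other.n == 0) return this;  // nothing to add
        if (n == 0) {                   // adopt the other summary wholesale
            n = other.n; mu = other.mu; m2 = other.m2;
            return this;
        }
        double delta = other.mu - mu;
        long total = n + other.n;
        // The cross term accounts for the gap between the two means.
        m2 += other.m2 + delta * delta * n * other.n / total;
        mu += delta * other.n / total;
        n = total;
        return this;
    }

    /** Population variance (divide by N); NaN when empty is an assumption of this sketch. */
    double variance() { return n == 0 ? Double.NaN : m2 / n; }

    public static void main(String[] args) {
        ChanMergeSketch a = ChanMergeSketch.of(1.0, 2.0).merge(ChanMergeSketch.of(3.0, 4.0));
        System.out.println(a.n + " " + a.mu + " " + a.variance()); // prints "4 2.5 1.25"
    }
}
```

      A merge of this shape is what lets per-partition summaries be computed independently and then combined pairwise.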

    • copy

      public StatCounter copy()

      Clone this StatCounter.

    • count

      public long count()

    • mean

      public double mean()

    • sum

      public double sum()

    • max

      public double max()

    • min

      public double min()

    • variance

      public double variance()

      Return the population variance of the values.

    • popVariance

      public double popVariance()

      Return the population variance of the values.

    • sampleVariance

      public double sampleVariance()

      Return the sample variance, which corrects for bias in estimating the variance by dividing by N-1 instead of N.

    • stdev

      public double stdev()

      Return the population standard deviation of the values.

    • popStdev

      public double popStdev()

      Return the population standard deviation of the values.

    • sampleStdev

      public double sampleStdev()

      Return the sample standard deviation of the values, which corrects for bias in estimating the variance by dividing by N-1 instead of N.
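
      The difference between the population and sample statistics comes down to the divisor (N versus N-1). The following is a small standalone sketch, not Spark's source: the class VarianceKindsSketch and its helper names are hypothetical, and returning NaN when there are too few values is an assumption of this sketch.

```java
/** Hypothetical sketch contrasting population and sample statistics (not Spark's source). */
final class VarianceKindsSketch {
    /** Sum of squared deviations of the values from their mean. */
    static double sumSquaredDeviations(double[] xs) {
        double sum = 0.0;
        for (double x : xs) sum += x;
        double mean = sum / xs.length;
        double ssd = 0.0;
        for (double x : xs) ssd += (x - mean) * (x - mean);
        return ssd;
    }

    /** Population variance: divide by N. */
    static double popVariance(double[] xs) {
        return xs.length == 0 ? Double.NaN : sumSquaredDeviations(xs) / xs.length;
    }

    /** Sample variance: divide by N-1 to correct for bias. */
    static double sampleVariance(double[] xs) {
        return xs.length <= 1 ? Double.NaN : sumSquaredDeviations(xs) / (xs.length - 1);
    }

    static double popStdev(double[] xs) { return Math.sqrt(popVariance(xs)); }

    static double sampleStdev(double[] xs) { return Math.sqrt(sampleVariance(xs)); }

    public static void main(String[] args) {
        double[] xs = {2, 4, 4, 4, 5, 5, 7, 9};  // mean 5, squared deviations sum to 32
        System.out.println("population: " + popVariance(xs) + " / " + popStdev(xs));
        System.out.println("sample:     " + sampleVariance(xs) + " / " + sampleStdev(xs));
    }
}
```

      For these eight values the population variance is 32/8 and the sample variance is 32/7, so the sample figures are always slightly larger, and the gap shrinks as N grows.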

    • toString

      public String toString()

      Overrides:
      toString in class Object