Type Parameters:
FeaturesType - Type of features. E.g., VectorUDT for vector features.
Learner - Specialization of this class. If you subclass this type, use this type parameter to specify the concrete type.
M - Specialization of PredictionModel. If you subclass this type, use this type parameter to specify the concrete type for the corresponding model.
All Implemented Interfaces:
Serializable, org.apache.spark.internal.Logging, Params, HasFeaturesCol, HasLabelCol, HasPredictionCol, PredictorParams, Identifiable
Direct Known Subclasses:
Classifier, Regressor

public abstract class Predictor<FeaturesType,Learner extends Predictor<FeaturesType,Learner,M>,M extends PredictionModel<FeaturesType,M>> extends Estimator<M> implements PredictorParams

Abstraction for prediction problems (regression and classification). It accepts labels of any NumericType and automatically casts them to DoubleType in fit(). If this predictor supports weights, it likewise accepts weights of any NumericType, which are automatically cast to DoubleType in fit().
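The self-referential type parameters (Learner, M) and the label cast in fit() can be illustrated with a small plain-Java sketch. Everything here is a hypothetical stand-in (ToyPredictor, ToyModel, MeanRegressor, MeanModel are not Spark classes), the FeaturesType parameter is omitted for brevity, and a list of numbers takes the place of a dataset column. It only shows why a subclass passes itself as Learner: inherited setters then return the concrete type, so chained calls keep access to subclass-specific methods.

```java
import java.util.List;

// Hypothetical stand-in for PredictionModel; not a Spark class.
abstract class ToyModel<M extends ToyModel<M>> {
    abstract double predict(double x);
}

// Hypothetical stand-in for Predictor, with the same recursive bounds on L and M.
abstract class ToyPredictor<L extends ToyPredictor<L, M>, M extends ToyModel<M>> {
    private String labelCol = "label";

    // Returns the concrete learner type so that chained calls keep subclass methods.
    @SuppressWarnings("unchecked")
    L setLabelCol(String value) {
        this.labelCol = value;
        return (L) this;
    }

    String getLabelCol() { return labelCol; }

    // Mirrors the description above: any numeric label is converted to double before training.
    M fit(List<? extends Number> labels) {
        double[] asDouble = labels.stream().mapToDouble(Number::doubleValue).toArray();
        return train(asDouble);
    }

    protected abstract M train(double[] labels);
}

// A model that always predicts the mean of the training labels.
final class MeanModel extends ToyModel<MeanModel> {
    final double mean;
    MeanModel(double mean) { this.mean = mean; }
    @Override double predict(double x) { return mean; }
}

// The subclass names itself as L and its model as M.
final class MeanRegressor extends ToyPredictor<MeanRegressor, MeanModel> {
    private boolean verbose = false;

    MeanRegressor setVerbose(boolean value) { this.verbose = value; return this; }
    boolean isVerbose() { return verbose; }

    @Override protected MeanModel train(double[] labels) {
        double sum = 0.0;
        for (double y : labels) sum += y;
        return new MeanModel(labels.length == 0 ? 0.0 : sum / labels.length);
    }
}

final class PredictorSketchDemo {
    public static void main(String[] args) {
        // setLabelCol returns MeanRegressor, so setVerbose is still reachable in the chain.
        MeanModel model = new MeanRegressor()
                .setLabelCol("y")
                .setVerbose(true)
                .fit(List.of(1, 2, 3L)); // mixed Integer and Long labels
        System.out.println(model.predict(0.0));
    }
}
```

Without the recursive bound, setLabelCol would have to return the base type, and the call to setVerbose in the chain above would not compile.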

  • Nested Class Summary

    Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging

    org.apache.spark.internal.Logging.LogStringContext, org.apache.spark.internal.Logging.SparkShellLoggingFilter

  • Constructor Summary

    Constructors

    Predictor()

  • Method Summary

    copy(ParamMap extra)

    Creates a copy of this instance with the same UID and some extra params.

    featuresCol()

    Param for features column name.

    fit(Dataset<?> dataset)

    Fits a model to the input data.

    labelCol()

    Param for label column name.

    predictionCol()

    Param for prediction column name.

    transformSchema(StructType schema)

    Check transform validity and derive the output schema from the input schema.

    Methods inherited from interface org.apache.spark.internal.Logging

    initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logBasedOnLevel, logDebug, logDebug, logDebug, logDebug, logError, logError, logError, logError, logInfo, logInfo, logInfo, logInfo, logName, LogStringContext, logTrace, logTrace, logTrace, logTrace, logWarning, logWarning, logWarning, logWarning, MDC, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq, withLogContext

  • Constructor Details

    • Predictor

      public Predictor()

  • Method Details

    • copy

      Description copied from interface: Params

      Creates a copy of this instance with the same UID and some extra params. Subclasses should implement this method and set the return type properly. See defaultCopy().
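The copy contract described above (same UID, current params carried over, extra params applied on top) can be sketched in plain Java. ToyStage and its string-keyed map are hypothetical stand-ins for a pipeline stage and a ParamMap; this is not how Spark stores params, only an illustration of the expected behavior.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical stand-in for an Identifiable stage with params; not a Spark class.
final class ToyStage {
    private final String uid;
    private final Map<String, Object> params = new HashMap<>();

    // A fresh instance gets a new random UID.
    ToyStage() { this("toy_" + UUID.randomUUID().toString().substring(0, 12)); }

    // A copy reuses an existing UID.
    ToyStage(String uid) { this.uid = uid; }

    String uid() { return uid; }

    ToyStage set(String name, Object value) { params.put(name, value); return this; }

    Object get(String name) { return params.get(name); }

    // New instance, same UID; current values first, then the extra values override them.
    ToyStage copy(Map<String, Object> extra) {
        ToyStage that = new ToyStage(uid);
        that.params.putAll(this.params);
        that.params.putAll(extra);
        return that;
    }
}

final class CopySketchDemo {
    public static void main(String[] args) {
        ToyStage original = new ToyStage().set("labelCol", "label").set("maxIter", 10);
        ToyStage copied = original.copy(Map.of("maxIter", 50));
        System.out.println(copied.uid().equals(original.uid())); // same UID
        System.out.println(original.get("maxIter"));             // original is left unchanged
        System.out.println(copied.get("maxIter"));               // extra value wins in the copy
    }
}
```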

      Specified by:
      copy in interface Params
      Specified by:
      copy in class Estimator<M extends PredictionModel<FeaturesType,M>>
      Parameters:
      extra - (undocumented)
      Returns:
      (undocumented)
    • featuresCol

      Param for features column name.

      Specified by:
      featuresCol in interface HasFeaturesCol
      Returns:
      (undocumented)
    • fit

      Description copied from class: Estimator

      Fits a model to the input data.

      Specified by:
      fit in class Estimator<M extends PredictionModel<FeaturesType,M>>
      Parameters:
      dataset - (undocumented)
      Returns:
      (undocumented)
    • labelCol

      Description copied from interface: HasLabelCol

      Param for label column name.

      Specified by:
      labelCol in interface HasLabelCol
      Returns:
      (undocumented)
    • predictionCol

      Param for prediction column name.

      Specified by:
      predictionCol in interface HasPredictionCol
      Returns:
      (undocumented)
    • setFeaturesCol

    • setLabelCol

    • setPredictionCol

    • transformSchema

      Check transform validity and derive the output schema from the input schema.

      We check validity for interactions between parameters during transformSchema and raise an exception if any parameter value is invalid. Parameter value checks which do not depend on other parameters are handled by Param.validate().

      A typical implementation should first verify the schema change and parameter validity, including complex parameter interaction checks.
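As a rough illustration of the validate-then-derive pattern described above, the following plain-Java sketch treats a schema as an ordered map from column name to a type name. ToySchemaCheck, the type-name strings, and the specific checks are assumptions made for this example and are not Spark's implementation: it verifies the input columns, rejects a clashing output column, and returns a new schema with the prediction column appended.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical schema check; a schema is modeled as column name -> type name.
final class ToySchemaCheck {
    private static final Set<String> NUMERIC =
            Set.of("byte", "short", "int", "long", "float", "double");

    static Map<String, String> transformSchema(Map<String, String> schema,
                                               String featuresCol,
                                               String labelCol,
                                               String predictionCol) {
        // 1. Verify the input columns before deriving anything.
        if (!"vector".equals(schema.get(featuresCol))) {
            throw new IllegalArgumentException(
                    "Column " + featuresCol + " must exist and have type vector");
        }
        String labelType = schema.get(labelCol);
        if (labelType == null || !NUMERIC.contains(labelType)) {
            throw new IllegalArgumentException(
                    "Column " + labelCol + " must exist and be numeric");
        }
        // 2. Check an interaction between parameters and the schema:
        //    the output column must not clash with an existing column.
        if (schema.containsKey(predictionCol)) {
            throw new IllegalArgumentException(
                    "Column " + predictionCol + " already exists");
        }
        // 3. Derive the output schema without mutating the input.
        Map<String, String> out = new LinkedHashMap<>(schema);
        out.put(predictionCol, "double");
        return out;
    }
}

final class SchemaSketchDemo {
    public static void main(String[] args) {
        Map<String, String> input = new LinkedHashMap<>();
        input.put("features", "vector");
        input.put("label", "int");
        Map<String, String> output =
                ToySchemaCheck.transformSchema(input, "features", "label", "prediction");
        System.out.println(output.keySet());
    }
}
```

Failing early in this step means an invalid configuration is reported before any data is read.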

      Specified by:
      transformSchema in class PipelineStage
      Parameters:
      schema - (undocumented)
      Returns:
      (undocumented)