OneHotEncoderModel (Spark 4.2.0 JavaDoc)
- All Implemented Interfaces:
Serializable, org.apache.spark.internal.Logging, OneHotEncoderBase, Params, HasHandleInvalid, HasInputCol, HasInputCols, HasOutputCol, HasOutputCols, Identifiable, MLWritable
param: categorySizes Original number of categories for each feature being encoded. The array contains one value for each input column, in order.
Nested Class Summary
Nested Classes
Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging
org.apache.spark.internal.Logging.LogStringContext, org.apache.spark.internal.Logging.SparkShellLoggingFilter
-
Method Summary
int[] categorySizes()
copy: Creates a copy of this instance with the same UID and some extra params.
dropLast(): Whether to drop the last category in the encoded vector (default: true)
handleInvalid(): Param for how to handle invalid data during transform().
inputCol(): Param for input column name.
inputCols(): Param for input column names.
load
outputCol(): Param for output column name.
outputCols(): Param for output column names.
read()
setDropLast(boolean value)
setHandleInvalid
setInputCol
setInputCols
setOutputCol
setOutputCols
toString()
transform: Transforms the input dataset.
transformSchema: Check transform validity and derive the output schema from the input schema.
uid(): An immutable unique ID for the object and its derivatives.
write(): Returns an MLWriter instance for this ML instance.
Methods inherited from interface org.apache.spark.internal.Logging
initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logBasedOnLevel, logDebug, logDebug, logDebug, logDebug, logError, logError, logError, logError, logInfo, logInfo, logInfo, logInfo, logName, LogStringContext, logTrace, logTrace, logTrace, logTrace, logWarning, logWarning, logWarning, logWarning, MDC, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq, withLogContext
Methods inherited from interface org.apache.spark.ml.util.MLWritable
Methods inherited from interface org.apache.spark.ml.param.Params
clear, copyValues, defaultCopy, defaultParamMap, estimateMatadataSize, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, onParamChange, paramMap, params, set, set, set, setDefault, setDefault, shouldOwn
-
Method Details
-
read
-
load
-
handleInvalid
Param for how to handle invalid data during transform(). Options are 'keep' (invalid data presented as an extra categorical feature) or 'error' (throw an error). Note that this Param is only used during transform; during fitting, invalid data will result in an error. Default: "error"
- Specified by:
handleInvalid in interface HasHandleInvalid
- Specified by:
handleInvalid in interface OneHotEncoderBase
- Returns:
(undocumented)
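The two handleInvalid options described above can be sketched in plain Java, with no Spark dependency. This is an illustration of the documented behavior, not Spark's implementation: the class name HandleInvalidSketch, the method resolveIndex, and the parameter numCategories are hypothetical, and the exception type is an assumption.

```java
// Illustrative sketch only; names and exception type are assumptions, not Spark API.
final class HandleInvalidSketch {

    /**
     * Returns the position that would hold the single 1.0 entry.
     * numCategories stands for the number of categories known after fitting.
     */
    static int resolveIndex(double label, int numCategories, String handleInvalid) {
        boolean valid = label >= 0 && label < numCategories;
        if (valid) {
            return (int) label;
        }
        if ("keep".equals(handleInvalid)) {
            // Invalid data is presented as one extra category after the known ones.
            return numCategories;
        }
        // "error": refuse to encode a value outside the known categories.
        throw new IllegalArgumentException("Invalid category index: " + label);
    }

    public static void main(String[] args) {
        System.out.println(resolveIndex(1.0, 3, "keep")); // a known category
        System.out.println(resolveIndex(5.0, 3, "keep")); // routed to the extra category
        try {
            resolveIndex(5.0, 3, "error");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With 'keep', every unknown value shares the same extra position, so downstream stages cannot tell distinct unknown values apart.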
-
dropLast
Whether to drop the last category in the encoded vector (default: true)
- Specified by:
dropLast in interface OneHotEncoderBase
- Returns:
(undocumented)
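To make the effect of dropLast concrete, here is a minimal plain-Java sketch. It uses dense double[] arrays for readability, which is an assumption of the sketch. The class DropLastSketch and the method encode are hypothetical names, not part of the Spark API.

```java
// Illustrative sketch only; dense arrays and all names are assumptions.
final class DropLastSketch {

    /** Encodes a valid category index, optionally dropping the last category. */
    static double[] encode(int index, int numCategories, boolean dropLast) {
        if (index < 0 || index >= numCategories) {
            throw new IllegalArgumentException("Invalid category index: " + index);
        }
        int size = dropLast ? numCategories - 1 : numCategories;
        double[] vector = new double[size];
        if (index < size) {
            vector[index] = 1.0;
        }
        // When dropLast is true, the last category is left as the all-zeros vector.
        return vector;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(encode(0, 3, true)));
        System.out.println(java.util.Arrays.toString(encode(2, 3, true)));
        System.out.println(java.util.Arrays.toString(encode(2, 3, false)));
    }
}
```

Dropping the last category means the entries of an encoded vector no longer always sum to one.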
-
outputCols
Param for output column names.
- Specified by:
outputCols in interface HasOutputCols
- Returns:
(undocumented)
-
outputCol
Param for output column name.
- Specified by:
outputCol in interface HasOutputCol
- Returns:
(undocumented)
-
inputCols
Param for input column names.
- Specified by:
inputCols in interface HasInputCols
- Returns:
(undocumented)
-
inputCol
Description copied from interface: HasInputCol
Param for input column name.
- Specified by:
inputCol in interface HasInputCol
- Returns:
(undocumented)
-
uid
An immutable unique ID for the object and its derivatives.
- Specified by:
uid in interface Identifiable
- Returns:
(undocumented)
-
categorySizes
public int[] categorySizes()
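Since categorySizes holds one value per input column, the length of each encoded vector can be derived from it together with dropLast and handleInvalid. The plain-Java sketch below shows one reading of how the two settings combine: 'keep' adds a category and dropLast removes one. That combination rule is an assumption to check against your Spark version. OutputSizeSketch and outputSizes are hypothetical names.

```java
// Illustrative sketch only; the combination rule is an assumption, not confirmed by this page.
final class OutputSizeSketch {

    /** Derives the encoded vector length for each input column. */
    static int[] outputSizes(int[] categorySizes, boolean dropLast, boolean keepInvalid) {
        int[] sizes = new int[categorySizes.length];
        for (int i = 0; i < categorySizes.length; i++) {
            int size = categorySizes[i];
            if (keepInvalid) {
                size += 1; // extra category for invalid data
            }
            if (dropLast) {
                size -= 1; // last category is represented by all zeros
            }
            sizes[i] = size;
        }
        return sizes;
    }

    public static void main(String[] args) {
        int[] categorySizes = {3, 5}; // two input columns
        System.out.println(java.util.Arrays.toString(outputSizes(categorySizes, true, false)));
        System.out.println(java.util.Arrays.toString(outputSizes(categorySizes, false, true)));
        System.out.println(java.util.Arrays.toString(outputSizes(categorySizes, true, true)));
    }
}
```

Under this reading, enabling both settings leaves the vector length equal to the original category count, because the extra invalid category is the one that gets dropped.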
-
setInputCol
-
setOutputCol
-
setInputCols
-
setOutputCols
-
setDropLast
-
setHandleInvalid
-
transformSchema
Check transform validity and derive the output schema from the input schema.
We check validity for interactions between parameters during transformSchema and raise an exception if any parameter value is invalid. Parameter value checks which do not depend on other parameters are handled by Param.validate().
A typical implementation should first conduct verification on schema change and parameter validity, including complex parameter interaction checks.
- Specified by:
transformSchema in class PipelineStage
- Parameters:
schema - (undocumented)
- Returns:
(undocumented)
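A parameter interaction check of the kind described above might look like the following plain-Java sketch. It assumes that the input and output column lists must have equal length and that categorySizes must have one entry per input column. The second condition follows from the categorySizes description, while the first is an assumption. ParamInteractionSketch and checkColumns are hypothetical names.

```java
// Illustrative sketch only; the specific checks and names are assumptions.
final class ParamInteractionSketch {

    /** Throws if the column parameters are inconsistent with each other. */
    static void checkColumns(String[] inputCols, String[] outputCols, int[] categorySizes) {
        if (inputCols.length != outputCols.length) {
            throw new IllegalArgumentException(
                "Got " + inputCols.length + " input columns but "
                    + outputCols.length + " output columns.");
        }
        if (categorySizes.length != inputCols.length) {
            throw new IllegalArgumentException(
                "Got " + categorySizes.length + " category sizes for "
                    + inputCols.length + " input columns.");
        }
    }

    public static void main(String[] args) {
        checkColumns(new String[] {"a", "b"}, new String[] {"aVec", "bVec"}, new int[] {3, 5});
        System.out.println("parameters are consistent");
        try {
            checkColumns(new String[] {"a", "b"}, new String[] {"aVec"}, new int[] {3, 5});
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Checks like these depend on more than one parameter at a time, which is why they cannot be expressed as a single-parameter validator.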
-
transform
Transforms the input dataset.
- Specified by:
transform in class Transformer
- Parameters:
dataset - (undocumented)
- Returns:
(undocumented)
-
copy
Description copied from interface: Params
Creates a copy of this instance with the same UID and some extra params. Subclasses should implement this method and set the return type properly. See defaultCopy().
- Specified by:
copy in interface Params
- Specified by:
copy in class Model<OneHotEncoderModel>
- Parameters:
extra - (undocumented)
- Returns:
(undocumented)
-
write
Description copied from interface: MLWritable
Returns an MLWriter instance for this ML instance.
- Specified by:
write in interface MLWritable
- Returns:
(undocumented)
-
toString
- Specified by:
toString in interface Identifiable
- Overrides:
toString in class Object
-