Class SparkConf

java.lang.Object
    org.apache.spark.SparkConf

All Implemented Interfaces:
Serializable, Cloneable, org.apache.spark.internal.Logging, ReadOnlySparkConf

Configuration for a Spark application. Used to set various Spark parameters as key-value pairs.

Most of the time, you would create a SparkConf object with new SparkConf(), which will load values from any spark.* Java system properties set in your application as well. In this case, parameters you set directly on the SparkConf object take priority over system properties.

For unit tests, you can also call new SparkConf(false) to skip loading external settings and get the same configuration no matter what the system properties are.

All setter methods in this class support chaining. For example, you can write new SparkConf().setMaster("local").setAppName("My app").

Parameters:
loadDefaults - whether to also load values from Java system properties

Note:
Once a SparkConf object is passed to Spark, it is cloned and can no longer be modified by the user. Spark does not support modifying the configuration at runtime.
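
The construction patterns above can be sketched as follows (a minimal Scala sketch; assumes Spark is on the classpath):

```scala
import org.apache.spark.SparkConf

// Typical construction: loads any spark.* Java system properties,
// with values set directly on the SparkConf taking priority.
val conf = new SparkConf()
  .setMaster("local[4]")
  .setAppName("My app")

// Unit-test construction: skip external settings so the resulting
// configuration is the same regardless of system properties.
val testConf = new SparkConf(false)
  .set("spark.executor.memory", "1g")
```

Every setter returns the SparkConf itself, which is what makes the chaining shown here possible.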
  • Nested Class Summary

    Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging

    org.apache.spark.internal.Logging.LogStringContext, org.apache.spark.internal.Logging.SparkShellLoggingFilter

  • Constructor Summary

    Constructors

    SparkConf()

    Create a SparkConf that loads defaults from system properties and the classpath

    SparkConf(boolean loadDefaults)

  • Method Summary

    SparkConf clone()

    Copy this object

    boolean contains(String key)

    Does the configuration contain a given parameter?

    scala.Tuple2<String,String>[] getAll()

    Get all parameters as a list of pairs

    scala.Tuple2<String,String>[] getAllWithPrefix(String prefix)

    Get all parameters that start with prefix

    <K> scala.Tuple2<K,String>[] getAllWithPrefix(String prefix, scala.Function1<String,K> f)

    Get all parameters that start with prefix and apply f.

    String getAppId()

    Returns the Spark application id, valid in the Driver after TaskScheduler registration and from the start in the Executor.

    scala.collection.immutable.Map<Object,String> getAvroSchema()

    Gets all the Avro schemas in the configuration used in the generic Avro record serializer

    static scala.Option<String> getDeprecatedConfig(String key, java.util.Map<String,String> conf)

    Looks for available deprecated keys for the given config option, and returns the first value available.

    scala.collection.immutable.Seq<scala.Tuple2<String,String>> getExecutorEnv()

    Get all executor environment variables set on this SparkConf

    scala.Option<String> getOption(String key)

    Get a parameter as an Option

    static boolean isExecutorStartupConf(String name)

    Return whether the given config should be passed to an executor on start-up.

    static boolean isSparkPortConf(String name)

    Return true if the given config matches either spark.*.port or spark.port.*.

    static void logDeprecationWarning(String key)

    Logs a warning message if the given config key is deprecated.

    static org.apache.spark.internal.Logging.LogStringContext LogStringContext(scala.StringContext sc)

    static org.slf4j.Logger org$apache$spark$internal$Logging$$log_()

    static void org$apache$spark$internal$Logging$$log__$eq(org.slf4j.Logger x$1)

    SparkConf registerAvroSchemas(scala.collection.immutable.Seq<org.apache.avro.Schema> schemas)

    Use Kryo serialization and register the given set of Avro schemas so that the generic record serializer can decrease network IO

    SparkConf registerKryoClasses(Class<?>[] classes)

    Use Kryo serialization and register the given set of classes with Kryo.

    SparkConf remove(String key)

    Remove a parameter from the configuration

    SparkConf set(String key, String value)

    Set a configuration variable.

    SparkConf setAll(scala.collection.Iterable<scala.Tuple2<String,String>> settings)

    Set multiple parameters together

    SparkConf setAppName(String name)

    Set a name for your application.

    SparkConf setExecutorEnv(String variable, String value)

    Set an environment variable to be used when launching executors for this application.

    SparkConf setExecutorEnv(scala.collection.immutable.Seq<scala.Tuple2<String,String>> variables)

    Set multiple environment variables to be used when launching executors.

    SparkConf setExecutorEnv(scala.Tuple2<String,String>[] variables)

    Set multiple environment variables to be used when launching executors. (Java-friendly version.)

    SparkConf setIfMissing(String key, String value)

    Set a parameter if it isn't already configured

    SparkConf setJars(scala.collection.immutable.Seq<String> jars)

    Set JAR files to distribute to the cluster.

    SparkConf setJars(String[] jars)

    Set JAR files to distribute to the cluster. (Java-friendly version.)

    SparkConf setMaster(String master)

    The master URL to connect to, such as "local" to run locally with one thread, "local[4]" to run locally with 4 cores, or "spark://master:7077" to run on a Spark standalone cluster.

    SparkConf setSparkHome(String home)

    Set the location where Spark is installed on worker nodes.

    String toDebugString()

    Return a string listing all keys and values, one per line.

    Methods inherited from interface org.apache.spark.internal.Logging

    initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logBasedOnLevel, logDebug, logDebug, logDebug, logDebug, logError, logError, logError, logError, logInfo, logInfo, logInfo, logInfo, logName, LogStringContext, logTrace, logTrace, logTrace, logTrace, logWarning, logWarning, logWarning, logWarning, MDC, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq, withLogContext

  • Constructor Details

    • SparkConf

      public SparkConf(boolean loadDefaults)

    • SparkConf

      public SparkConf()

      Create a SparkConf that loads defaults from system properties and the classpath

  • Method Details

    • isExecutorStartupConf

      public static boolean isExecutorStartupConf(String name)

      Return whether the given config should be passed to an executor on start-up.

      Certain authentication configs are required from the executor when it connects to the scheduler, while the rest of the spark configs can be inherited from the driver later.

      Parameters:
      name - (undocumented)
      Returns:
      (undocumented)
    • isSparkPortConf

      public static boolean isSparkPortConf(String name)

      Return true if the given config matches either spark.*.port or spark.port.*.

      Parameters:
      name - (undocumented)
      Returns:
      (undocumented)
    • getDeprecatedConfig

      public static scala.Option<String> getDeprecatedConfig(String key, java.util.Map<String,String> conf)

      Looks for available deprecated keys for the given config option, and returns the first value available.

      Parameters:
      key - (undocumented)
      conf - (undocumented)
      Returns:
      (undocumented)
    • logDeprecationWarning

      public static void logDeprecationWarning(String key)

      Logs a warning message if the given config key is deprecated.

      Parameters:
      key - (undocumented)
    • org$apache$spark$internal$Logging$$log_

      public static org.slf4j.Logger org$apache$spark$internal$Logging$$log_()

    • org$apache$spark$internal$Logging$$log__$eq

      public static void org$apache$spark$internal$Logging$$log__$eq(org.slf4j.Logger x$1)

    • LogStringContext

      public static org.apache.spark.internal.Logging.LogStringContext LogStringContext(scala.StringContext sc)

    • set

      public SparkConf set(String key, String value)

      Set a configuration variable.

    • setMaster

      public SparkConf setMaster(String master)

      The master URL to connect to, such as "local" to run locally with one thread, "local[4]" to run locally with 4 cores, or "spark://master:7077" to run on a Spark standalone cluster.

      Parameters:
      master - (undocumented)
      Returns:
      (undocumented)
    • setAppName

      public SparkConf setAppName(String name)

      Set a name for your application. Shown in the Spark web UI.

    • setJars

      public SparkConf setJars(scala.collection.immutable.Seq<String> jars)

      Set JAR files to distribute to the cluster.

    • setJars

      public SparkConf setJars(String[] jars)

      Set JAR files to distribute to the cluster. (Java-friendly version.)

    • setExecutorEnv

      public SparkConf setExecutorEnv(String variable, String value)

      Set an environment variable to be used when launching executors for this application. These variables are stored as properties of the form spark.executorEnv.VAR_NAME (for example spark.executorEnv.PATH) but this method makes them easier to set.

      Parameters:
      variable - (undocumented)
      value - (undocumented)
      Returns:
      (undocumented)
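
      As a sketch of the spark.executorEnv.* mapping described above (assumes Spark is on the classpath):

      ```scala
      import org.apache.spark.SparkConf

      val conf = new SparkConf(false)
        .setExecutorEnv("PATH", "/opt/custom/bin")

      // The variable is stored as an ordinary config entry under the
      // spark.executorEnv. prefix:
      conf.get("spark.executorEnv.PATH")  // "/opt/custom/bin"
      ```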
    • setExecutorEnv

      public SparkConf setExecutorEnv(scala.collection.immutable.Seq<scala.Tuple2<String,String>> variables)

      Set multiple environment variables to be used when launching executors. These variables are stored as properties of the form spark.executorEnv.VAR_NAME (for example spark.executorEnv.PATH) but this method makes them easier to set.

      Parameters:
      variables - (undocumented)
      Returns:
      (undocumented)
    • setExecutorEnv

      public SparkConf setExecutorEnv(scala.Tuple2<String,String>[] variables)

      Set multiple environment variables to be used when launching executors. (Java-friendly version.)

      Parameters:
      variables - (undocumented)
      Returns:
      (undocumented)
    • setSparkHome

      public SparkConf setSparkHome(String home)

      Set the location where Spark is installed on worker nodes.

      Parameters:
      home - (undocumented)
      Returns:
      (undocumented)
    • setAll

      public SparkConf setAll(scala.collection.Iterable<scala.Tuple2<String,String>> settings)

      Set multiple parameters together

    • setIfMissing

      public SparkConf setIfMissing(String key, String value)

      Set a parameter if it isn't already configured
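
      A short sketch contrasting setAll with setIfMissing (hypothetical keys; Spark assumed on the classpath):

      ```scala
      import org.apache.spark.SparkConf

      val conf = new SparkConf(false).setAll(Seq(
        "spark.app.name" -> "demo",
        "spark.master"   -> "local[2]"
      ))

      // setIfMissing leaves an already-configured value untouched:
      conf.setIfMissing("spark.app.name", "fallback")
      conf.get("spark.app.name")  // still "demo"
      ```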

    • registerKryoClasses

      public SparkConf registerKryoClasses(Class<?>[] classes)

      Use Kryo serialization and register the given set of classes with Kryo. If called multiple times, this will append the classes from all calls together.

      Parameters:
      classes - (undocumented)
      Returns:
      (undocumented)
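
      A minimal sketch of registering application classes with Kryo; MyRecord and MyKey are hypothetical application classes:

      ```scala
      import org.apache.spark.SparkConf

      case class MyRecord(id: Long, payload: String)  // hypothetical
      case class MyKey(id: Long)                      // hypothetical

      val conf = new SparkConf()
        .registerKryoClasses(Array(classOf[MyRecord], classOf[MyKey]))

      // A second call appends to, rather than replaces, the registered set:
      conf.registerKryoClasses(Array(classOf[Array[Byte]]))
      ```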
    • registerAvroSchemas

      public SparkConf registerAvroSchemas(scala.collection.immutable.Seq<org.apache.avro.Schema> schemas)

      Use Kryo serialization and register the given set of Avro schemas so that the generic record serializer can decrease network IO

      Parameters:
      schemas - (undocumented)
      Returns:
      (undocumented)
    • getAvroSchema

      public scala.collection.immutable.Map<Object,String> getAvroSchema()

      Gets all the avro schemas in the configuration used in the generic Avro record serializer

    • remove

      public SparkConf remove(String key)

      Remove a parameter from the configuration

    • getOption

      public scala.Option<String> getOption(String key)

      Get a parameter as an Option

      Specified by:
      getOption in interface ReadOnlySparkConf
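
      getOption composes with the usual scala.Option combinators; a sketch with hypothetical keys (Spark assumed on the classpath):

      ```scala
      import org.apache.spark.SparkConf

      val conf = new SparkConf(false).set("spark.app.name", "demo")

      conf.getOption("spark.app.name")     // Some("demo")
      conf.getOption("spark.missing.key")  // None

      // Unlike get(key), getOption never throws on a missing key:
      val name = conf.getOption("spark.app.name").getOrElse("default")
      ```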
    • getAll

      public scala.Tuple2<String,String>[] getAll()

      Get all parameters as a list of pairs

      Specified by:
      getAll in interface ReadOnlySparkConf
    • getAllWithPrefix

      public scala.Tuple2<String,String>[] getAllWithPrefix(String prefix)

      Get all parameters that start with prefix

      Parameters:
      prefix - (undocumented)
      Returns:
      (undocumented)
    • getAllWithPrefix

      public <K> scala.Tuple2<K,String>[] getAllWithPrefix(String prefix, scala.Function1<String,K> f)

      Get all parameters that start with prefix and apply f.

      Parameters:
      prefix - (undocumented)
      f - (undocumented)
      Returns:
      (undocumented)
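
      A sketch of the prefix lookup with hypothetical keys; note that matching keys are returned with the prefix stripped:

      ```scala
      import org.apache.spark.SparkConf

      val conf = new SparkConf(false)
        .set("spark.executorEnv.PATH", "/bin")
        .set("spark.executorEnv.HOME", "/home/spark")

      // Matching keys come back without the prefix, in no guaranteed order:
      conf.getAllWithPrefix("spark.executorEnv.")
      // e.g. Array(("PATH", "/bin"), ("HOME", "/home/spark"))
      ```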
    • getExecutorEnv

      public scala.collection.immutable.Seq<scala.Tuple2<String,String>> getExecutorEnv()

      Get all executor environment variables set on this SparkConf

    • getAppId

      public String getAppId()

      Returns the Spark application id, valid in the Driver after TaskScheduler registration and from the start in the Executor.

      Returns:
      (undocumented)
    • contains

      public boolean contains(String key)

      Does the configuration contain a given parameter?

      Specified by:
      contains in interface ReadOnlySparkConf
    • clone

      public SparkConf clone()

      Copy this object

    • toDebugString

      public String toDebugString()

      Return a string listing all keys and values, one per line. This is useful to print the configuration out for debugging.

      Returns:
      (undocumented)
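
      toDebugString is handy when logging the effective configuration at startup; a minimal sketch (Spark assumed on the classpath):

      ```scala
      import org.apache.spark.SparkConf

      val conf = new SparkConf(false)
        .setMaster("local")
        .setAppName("demo")

      // Prints one key=value pair per line:
      println(conf.toDebugString)
      ```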