SparkContext Class
Definition
Important: Some information relates to prerelease product that may be substantially modified before it's released. Microsoft makes no warranties, express or implied, with respect to the information provided here.
Main entry point for Spark functionality. A SparkContext represents the connection to a Spark cluster, and can be used to create RDDs, accumulators, and broadcast variables on that cluster. Only one SparkContext should be active per JVM. You must Stop() the active SparkContext before creating a new one.
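As a minimal sketch of the lifecycle described above (assuming the Microsoft.Spark NuGet package and a local Spark installation; all names are illustrative), a SparkContext is created from a SparkConf and stopped before another one may be created:

```csharp
using Microsoft.Spark;

class Program
{
    static void Main()
    {
        // Configure the application name and master URL;
        // "local[*]" runs Spark locally using all available cores.
        var conf = new SparkConf()
            .SetAppName("SparkContextDemo")
            .SetMaster("local[*]");

        // Only one SparkContext may be active per JVM.
        var sc = new SparkContext(conf);

        // ... create RDDs, accumulators, and broadcast variables here ...

        // Stop the active context before creating a new one.
        sc.Stop();
    }
}
```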
C#: public sealed class SparkContext
F#: type SparkContext = class
VB: Public NotInheritable Class SparkContext

Inheritance: Object → SparkContext
Constructors
| Constructor | Description |
| --- | --- |
| SparkContext() | Creates a SparkContext that loads settings from system properties (for instance, when launching with spark-submit). |
| SparkContext(SparkConf) | Creates a SparkContext object with the given config. |
| SparkContext(String, String, SparkConf) | Alternative constructor that allows setting common Spark properties directly. |
| SparkContext(String, String, String) | Alternative constructor that allows setting common Spark properties directly. |
| SparkContext(String, String) | Initializes a SparkContext instance with a specific master and application name. |
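For quick experiments, the two-argument constructor sets the master URL and application name directly, which is equivalent to building a SparkConf with SetMaster and SetAppName and passing it to SparkContext(SparkConf). A brief sketch (assuming Microsoft.Spark; the names are illustrative):

```csharp
using Microsoft.Spark;

// "local[2]" runs Spark locally with two worker threads.
var sc = new SparkContext("local[2]", "ConstructorDemo");

// ... use the context ...

sc.Stop();
```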
Properties
| Property | Description |
| --- | --- |
| DefaultParallelism | Default level of parallelism to use when not given by the user (e.g., Parallelize()). |
Methods
| Method | Description |
| --- | --- |
| AddFile(String, Boolean) | Adds a file to be downloaded with this Spark job on every node. |
| Broadcast&lt;T&gt;(T) | Broadcasts a read-only variable to the cluster, returning a Microsoft.Spark.Broadcast object for reading it in distributed functions. The variable is sent to each executor only once. |
| ClearJobGroup() | Clears the current thread's job group ID and its description. |
| GetConf() | Returns the SparkConf object associated with this SparkContext. Note that modifying the returned SparkConf has no effect on the context. |
| GetOrCreate(SparkConf) | Gets an existing SparkContext or, if none exists, creates one and registers it as a singleton object. Because only one SparkContext can be active per JVM, this is useful when applications wish to share a SparkContext. |
| SetCheckpointDir(String) | Sets the directory under which RDDs are going to be checkpointed. |
| SetJobDescription(String) | Sets a human-readable description of the current job. |
| SetJobGroup(String, String, Boolean) | Assigns a group ID to all jobs started by this thread until the group ID is set to a different value or cleared. |
| SetLogLevel(String) | Controls the log level. This overrides any user-defined log settings. |
| Stop() | Shuts down the SparkContext. |
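Several of these methods combine naturally in a typical setup. The sketch below (assuming Microsoft.Spark and a reachable local Spark installation; application name and values are illustrative) uses GetOrCreate to obtain the singleton context, adjusts logging with SetLogLevel, and ships a read-only value to executors with Broadcast:

```csharp
using Microsoft.Spark;

class MethodsDemo
{
    static void Main()
    {
        var conf = new SparkConf()
            .SetAppName("MethodsDemo")
            .SetMaster("local[*]");

        // Reuses the active context if one exists; otherwise creates one
        // and registers it as the JVM-wide singleton.
        SparkContext sc = SparkContext.GetOrCreate(conf);

        // Override log verbosity for this application.
        sc.SetLogLevel("WARN");

        // Send a read-only lookup table to each executor exactly once.
        Broadcast<int[]> lookup = sc.Broadcast(new[] { 10, 20, 30 });

        // Distributed functions read the broadcast value via Value().
        int first = lookup.Value()[0];

        sc.Stop();
    }
}
```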