StreamingQuery (Spark 4.2.0 JavaDoc)
public interface StreamingQuery
A handle to a query that is executing continuously in the background as new data arrives. All these methods are thread-safe.
- Since:
- 2.0.0
-
Method Summary
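void awaitTermination() Waits for the termination of this query, either by query.stop() or by an exception.
boolean awaitTermination(long timeoutMs) Waits for the termination of this query, either by query.stop() or by an exception.
scala.Option<StreamingQueryException> exception() Returns the StreamingQueryException if the query was terminated by an exception.
void explain() Prints the physical plan to the console for debugging purposes.
void explain(boolean extended) Prints the physical plan to the console for debugging purposes.
UUID id() Returns the unique id of this query that persists across restarts from checkpoint data.
boolean isActive() Returns true if this query is actively running.
StreamingQueryProgress lastProgress() Returns the most recent StreamingQueryProgress update of this streaming query.
String name() Returns the user-specified name of the query, or null if not specified.
void processAllAvailable() Blocks until all available data in the source has been processed and committed to the sink.
StreamingQueryProgress[] recentProgress() Returns an array of the most recent StreamingQueryProgress updates for this query.
UUID runId() Returns the unique id of this run of the query.
SparkSession sparkSession() Returns the SparkSession associated with this query.
StreamingQueryStatus status() Returns the current status of the query.
void stop() Stops the execution of this query if it is running.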
-
Method Details
-
awaitTermination
void awaitTermination() throws StreamingQueryException
Waits for the termination of this query, either by query.stop() or by an exception. If the query has terminated with an exception, then the exception will be thrown.
If the query has terminated, then all subsequent calls to this method will either return immediately (if the query was terminated by stop()), or throw the exception immediately (if the query has terminated with an exception).
- Throws:
- StreamingQueryException - if the query has terminated with an exception.
- Since:
- 2.0.0
-
awaitTermination
boolean awaitTermination(long timeoutMs) throws StreamingQueryException
Waits for the termination of this query, either by query.stop() or by an exception. If the query has terminated with an exception, then the exception will be thrown. Otherwise, it returns whether the query has terminated or not within the timeoutMs milliseconds.
If the query has terminated, then all subsequent calls to this method will either return true immediately (if the query was terminated by stop()), or throw the exception immediately (if the query has terminated with an exception).
- Parameters:
- timeoutMs - the maximum time to wait, in milliseconds
- Returns:
- true if the query has terminated within timeoutMs milliseconds, false otherwise
- Throws:
- StreamingQueryException - if the query has terminated with an exception
- Since:
- 2.0.0
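The timed variant lends itself to waiting in bounded slices rather than blocking indefinitely. The following is a minimal sketch of that pattern, not part of the Spark API: `AwaitInSlices`, `TimedAwait`, `sliceMs`, and `maxSlices` are hypothetical names, and `TimedAwait` is a stand-in that mirrors the signature of `awaitTermination(long timeoutMs)` so the sketch compiles without Spark on the classpath.

```java
/** Sketch: wait for a query in bounded slices instead of blocking forever. */
final class AwaitInSlices {

    /** Hypothetical stand-in mirroring StreamingQuery.awaitTermination(long timeoutMs). */
    @FunctionalInterface
    interface TimedAwait {
        boolean awaitTermination(long timeoutMs) throws Exception;
    }

    /**
     * Calls awaitTermination(sliceMs) up to maxSlices times.
     * Returns true as soon as the query reports termination, or false if it
     * is still running after all slices. An exception thrown by the query
     * propagates to the caller.
     */
    static boolean awaitInSlices(TimedAwait query, long sliceMs, int maxSlices) throws Exception {
        for (int i = 0; i < maxSlices; i++) {
            if (query.awaitTermination(sliceMs)) {
                return true;
            }
            // Still running: an application might log or check a shutdown flag here.
        }
        return false;
    }
}
```

Assuming a real `StreamingQuery` named `query`, the method reference `query::awaitTermination` should fit the stand-in interface, for example `AwaitInSlices.awaitInSlices(query::awaitTermination, 5000L, 12)`.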
-
exception
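scala.Option<StreamingQueryException> exception()
Returns the StreamingQueryException if the query was terminated by an exception.
- Returns:
- the StreamingQueryException that terminated the query, if any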
- Since:
- 2.0.0
-
explain
void explain()
Prints the physical plan to the console for debugging purposes.
- Since:
- 2.0.0
-
explain
void explain(boolean extended)
Prints the physical plan to the console for debugging purposes.
- Parameters:
- extended - whether to do extended explain or not
- Since:
- 2.0.0
-
id
UUID id()
Returns the unique id of this query that persists across restarts from checkpoint data. That is, this id is generated when a query is started for the first time, and will be the same every time it is restarted from checkpoint data. Also see runId().
- Returns:
- the unique id of this query, stable across restarts from checkpoint data
- Since:
- 2.1.0
-
isActive
boolean isActive()
Returns true if this query is actively running.
- Returns:
- true if this query is actively running, false otherwise
- Since:
- 2.0.0
-
lastProgress
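StreamingQueryProgress lastProgress()
Returns the most recent StreamingQueryProgress update of this streaming query.
- Returns:
- the most recent StreamingQueryProgress update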
- Since:
- 2.1.0
-
name
String name()
Returns the user-specified name of the query, or null if not specified. This name can be specified in the org.apache.spark.sql.streaming.DataStreamWriter as dataframe.writeStream.queryName("query").start(). This name, if set, must be unique across all active queries.
- Returns:
- the user-specified name of the query, or null if not specified
- Since:
- 2.0.0
-
processAllAvailable
void processAllAvailable()
Blocks until all available data in the source has been processed and committed to the sink. This method is intended for testing. Note that in the case of continually arriving data, this method may block forever. Additionally, this method is only guaranteed to block until data that has been synchronously appended to an org.apache.spark.sql.execution.streaming.Source prior to invocation has been processed (i.e. getOffset must immediately reflect the addition).
- Since:
- 2.0.0
-
recentProgress
StreamingQueryProgress[] recentProgress()
Returns an array of the most recent StreamingQueryProgress updates for this query. The number of progress updates retained for each stream is configured by the Spark session configuration spark.sql.streaming.numRecentProgressUpdates.
- Returns:
- an array of the most recent StreamingQueryProgress updates
- Since:
- 2.1.0
-
runId
UUID runId()
Returns the unique id of this run of the query. That is, every start/restart of a query will generate a unique runId. Therefore, every time a query is restarted from checkpoint, it will have the same id() but different runId()s.
- Returns:
- the unique id of this run of the query
-
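The relationship between id() and runId() can be used to recognise a restart from checkpoint: the id stays the same while the runId changes. The sketch below illustrates that comparison using only `java.util.UUID` values; `RunTracker` and `recordAndCheckRestart` are hypothetical names introduced for illustration and are not part of Spark.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

/** Sketch: detect restarts by comparing runId values observed for the same id. */
final class RunTracker {

    // Maps a query id() to the most recently observed runId().
    private final Map<UUID, UUID> lastRunById = new HashMap<>();

    /**
     * Records an (id, runId) pair, e.g. taken from query.id() and query.runId().
     * Returns true if this id was seen before with a different runId, which
     * per the description above indicates the query was restarted.
     */
    boolean recordAndCheckRestart(UUID id, UUID runId) {
        UUID previous = lastRunById.put(id, runId);
        return previous != null && !previous.equals(runId);
    }
}
```

The tracker keeps only the latest runId per id, so it reports a restart once per new run and stays silent when the same run is recorded again.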
sparkSession
SparkSession sparkSession()
Returns the SparkSession associated with this query.
- Returns:
- the SparkSession associated with this query
- Since:
- 2.0.0
-
status
StreamingQueryStatus status()
Returns the current status of the query.
- Returns:
- the current status of the query
- Since:
- 2.0.2
-
stop
void stop() throws TimeoutException
Stops the execution of this query if it is running. This waits until the termination of the query execution threads or until a timeout is hit.
By default stop will block indefinitely. You can configure a timeout by the configuration spark.sql.streaming.stopTimeout. A timeout of 0 (or negative) milliseconds will block indefinitely. If a TimeoutException is thrown, users can retry stopping the stream. If the issue persists, it is advisable to kill the Spark application.
- Throws:
- TimeoutException
- Since:
- 2.0.0
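The retry advice for stop() can be sketched as a bounded loop. This is one possible shape under stated assumptions, not a prescribed implementation: `StopWithRetry`, `Stoppable`, and `maxAttempts` are hypothetical names, and `Stoppable` is a stand-in that mirrors the signature of stop() so the sketch compiles without Spark on the classpath.

```java
import java.util.concurrent.TimeoutException;

/** Sketch: retry stop() a bounded number of times when it times out. */
final class StopWithRetry {

    /** Hypothetical stand-in mirroring StreamingQuery.stop(). */
    @FunctionalInterface
    interface Stoppable {
        void stop() throws TimeoutException;
    }

    /**
     * Calls stop() up to maxAttempts times.
     * Returns true if one attempt completed, or false if every attempt
     * threw TimeoutException.
     */
    static boolean stopWithRetry(Stoppable query, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                query.stop();
                return true;
            } catch (TimeoutException e) {
                // The stop timeout elapsed on this attempt; fall through and retry.
            }
        }
        // All attempts timed out; the caller decides whether to kill the application.
        return false;
    }
}
```

Assuming a real `StreamingQuery` named `query` and a positive spark.sql.streaming.stopTimeout, the method reference `query::stop` should fit the stand-in interface, for example `StopWithRetry.stopWithRetry(query::stop, 3)`. With the default timeout, stop() blocks indefinitely and the retry path is never taken.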
-