StreamingQueryProgress (Spark 4.2.0 JavaDoc)
org.apache.spark.sql.streaming.StreamingQueryProgress
- All Implemented Interfaces:
Serializable
public class StreamingQueryProgress extends Object implements Serializable
Information about progress made in the execution of a StreamingQuery during a trigger. Each
event relates to processing done for a single trigger of the streaming query. Events are
emitted even when no new data is available to be processed.
param: id
A unique query id that persists across restarts. See StreamingQuery.id().
param: runId
A query id that is unique for every start/restart. See StreamingQuery.runId().
param: name
User-specified name of the query, null if not specified.
param: timestamp
Beginning time of the trigger in ISO8601 format, i.e. UTC timestamps.
param: batchId
A unique id for the current batch of data being processed. Note that in the case of retries
after a failure a given batchId my be executed more than once. Similarly, when there is no
data to be processed, the batchId will not be incremented.
param: batchDuration
The process duration of each batch.
param: durationMs
The amount of time taken to perform various operations in milliseconds.
param: eventTime
Statistics of event time seen in this batch. It may contain the following keys:
"max" -> "2016-12-05T20:54:20.827Z" // maximum event time seen in this trigger
"min" -> "2016-12-05T20:54:20.827Z" // minimum event time seen in this trigger
"avg" -> "2016-12-05T20:54:20.827Z" // average event time seen in this trigger
"watermark" -> "2016-12-05T20:54:20.827Z" // watermark used in this trigger
All timestamps are in ISO8601 format, i.e. UTC timestamps. param: stateOperators Information about operators in the query that store state. param: sources detailed statistics on data being read from each of the streaming sources.
- Since:
- 2.1.0
- See Also:
-
Method Summary
longlongbatchId()id()doubleThe aggregate (across all sources) rate of data arriving.
json()The compact JSON representation of this progress.
name()longThe aggregate (across all sources) number of records processed in a trigger.
doubleThe aggregate (across all sources) rate at which Spark is processing data.
runId()sink()sources()toString()
-
Method Details
-
id
-
runId
public UUID runId()
-
name
-
timestamp
public String timestamp()
-
batchId
public long batchId()
-
batchDuration
public long batchDuration()
-
durationMs
-
eventTime
-
stateOperators
-
sources
-
sink
-
observedMetrics
-
numInputRows
public long numInputRows()
The aggregate (across all sources) number of records processed in a trigger.
-
inputRowsPerSecond
public double inputRowsPerSecond()
The aggregate (across all sources) rate of data arriving.
-
processedRowsPerSecond
public double processedRowsPerSecond()
The aggregate (across all sources) rate at which Spark is processing data.
-
json
The compact JSON representation of this progress.
-
prettyJson
public String prettyJson()
The pretty (i.e. indented) JSON representation of this progress.
-
toString
-