Spark Connect gRPC protobuf definitions, translated to Rust using tonic.
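Every generated type implements prost's `Message` trait, so a quick way to sanity-check the bindings is to round-trip a message through its protobuf wire form. A minimal sketch, assuming the module is imported as `spark_connect` (the actual path depends on how your crate re-exports this module):

```rust
use prost::Message;
use spark_connect::KeyValue; // import path is an assumption; adjust to your crate layout

fn main() {
    // KeyValue carries one config entry; `value` is an optional proto
    // field, so prost maps it to Option<String>.
    let kv = KeyValue {
        key: "spark.app.name".to_string(),
        value: Some("example".to_string()),
    };

    // Every generated struct implements prost::Message and can be
    // serialized to and parsed back from protobuf bytes.
    let bytes = kv.encode_to_vec();
    let decoded = KeyValue::decode(bytes.as_slice()).expect("valid protobuf bytes");
    assert_eq!(kv, decoded);
}
```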
Modules

- add_artifacts_request - Nested message and enum types in `AddArtifactsRequest`.
- add_artifacts_response - Nested message and enum types in `AddArtifactsResponse`.
- aggregate - Nested message and enum types in `Aggregate`.
- analyze_plan_request - Nested message and enum types in `AnalyzePlanRequest`.
- analyze_plan_response - Nested message and enum types in `AnalyzePlanResponse`.
- artifact_statuses_response - Nested message and enum types in `ArtifactStatusesResponse`.
- catalog - Nested message and enum types in `Catalog`.
- command - Nested message and enum types in `Command`.
- common_inline_user_defined_function - Nested message and enum types in `CommonInlineUserDefinedFunction`.
- common_inline_user_defined_table_function - Nested message and enum types in `CommonInlineUserDefinedTableFunction`.
- config_request - Nested message and enum types in `ConfigRequest`.
- data_type - Nested message and enum types in `DataType`.
- execute_plan_request - Nested message and enum types in `ExecutePlanRequest`.
- execute_plan_response - Nested message and enum types in `ExecutePlanResponse`.
- expression - Nested message and enum types in `Expression`.
- interrupt_request - Nested message and enum types in `InterruptRequest`.
- join - Nested message and enum types in `Join`.
- na_replace - Nested message and enum types in `NaReplace`.
- parse - Nested message and enum types in `Parse`.
- plan - Nested message and enum types in `Plan`.
- read - Nested message and enum types in `Read`.
- relation - Nested message and enum types in `Relation`.
- release_execute_request - Nested message and enum types in `ReleaseExecuteRequest`.
- set_operation - Nested message and enum types in `SetOperation`.
- spark_connect_service_client - Generated client implementations (see the client sketch after this list).
- stat_sample_by - Nested message and enum types in `StatSampleBy`.
- streaming_foreach_function - Nested message and enum types in `StreamingForeachFunction`.
- streaming_query_command - Nested message and enum types in `StreamingQueryCommand`.
- streaming_query_command_result - Nested message and enum types in `StreamingQueryCommandResult`.
- streaming_query_manager_command - Nested message and enum types in `StreamingQueryManagerCommand`.
- streaming_query_manager_command_result - Nested message and enum types in `StreamingQueryManagerCommandResult`.
- unpivot - Nested message and enum types in `Unpivot`.
- write_operation - Nested message and enum types in `WriteOperation`.
- write_operation_v2 - Nested message and enum types in `WriteOperationV2`.
- write_stream_operation_start - Nested message and enum types in `WriteStreamOperationStart`.
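The `spark_connect_service_client` module holds the tonic-generated client. The sketch below wires a SQL query into a `Plan` and drains the stream of `ExecutePlanResponse` messages. It is a minimal sketch, not a reference implementation: it assumes a Spark Connect server on `localhost:15002` (the default port), a tokio runtime with tonic's `transport` feature, an import path of `spark_connect`, and field/variant names matching the Spark Connect proto at the time of writing (these can drift between versions).

```rust
use spark_connect::spark_connect_service_client::SparkConnectServiceClient;
use spark_connect::{plan, relation};
use spark_connect::{ExecutePlanRequest, Plan, Relation, Sql, UserContext};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Connect over the default Spark Connect port.
    let mut client = SparkConnectServiceClient::connect("http://localhost:15002").await?;

    // A Plan is either a Relation (a query) or a Command; here the root
    // relation is a plain SQL string. prost implements Default for every
    // generated message, so unset fields can be left at their defaults.
    let plan = Plan {
        op_type: Some(plan::OpType::Root(Relation {
            rel_type: Some(relation::RelType::Sql(Sql {
                query: "SELECT 1 AS id".to_string(),
                ..Default::default()
            })),
            ..Default::default()
        })),
    };

    let request = ExecutePlanRequest {
        // The protocol expects a client-generated UUID string here;
        // this literal is a placeholder.
        session_id: "0dca7750-373b-4bd2-a038-0ea0546bc28f".to_string(),
        user_context: Some(UserContext {
            user_id: "example_user".to_string(),
            ..Default::default()
        }),
        plan: Some(plan),
        ..Default::default()
    };

    // ExecutePlan is a server-streaming RPC: one request can yield
    // several ExecutePlanResponse messages for the same session_id.
    let mut stream = client.execute_plan(request).await?.into_inner();
    while let Some(response) = stream.message().await? {
        println!("got a response for session {}", response.session_id);
    }
    Ok(())
}
```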
Structs

- AddArtifactsRequest - Request to transfer client-local artifacts.
- AddArtifactsResponse - Response to adding an artifact. Contains relevant metadata to verify successful transfer of artifact(s).
- Aggregate - Relation of type [Aggregate].
- AnalyzePlanRequest - Request to perform plan analysis, optionally explaining the plan.
- AnalyzePlanResponse - Response to performing analysis of the query. Contains relevant metadata to be able to reason about the performance.
- ApplyInPandasWithState
- ArtifactStatusesRequest - Request to get current statuses of artifacts at the server side.
- ArtifactStatusesResponse - Response to checking artifact statuses.
- CacheTable - See `spark.catalog.cacheTable`
- CachedLocalRelation - A local relation that has already been cached.
- CachedRemoteRelation - Represents a remote relation that has been cached on the server.
- CallFunction
- Catalog - Catalog messages are marked as unstable.
- ClearCache - See `spark.catalog.clearCache`
- CoGroupMap
- CollectMetrics - Collect arbitrary (named) metrics from a dataset.
- Command - A [Command] is an operation executed by the server that does not directly consume or produce a relational result.
- CommonInlineUserDefinedFunction
- CommonInlineUserDefinedTableFunction
- ConfigRequest - Request to update or fetch the configurations (a construction sketch follows this list).
- ConfigResponse - Response to the config request.
- CreateDataFrameViewCommand - A command that can create a DataFrame global temp view or local temp view.
- CreateExternalTable - See `spark.catalog.createExternalTable`
- CreateTable - See `spark.catalog.createTable`
- CurrentCatalog - See `spark.catalog.currentCatalog`
- CurrentDatabase - See `spark.catalog.currentDatabase`
- DataType - This message describes the logical [DataType] of something. It does not carry the value itself but only describes it.
- DatabaseExists - See `spark.catalog.databaseExists`
- Deduplicate - Relation of type [Deduplicate] with duplicate rows removed; deduplication can consider either a subset of the columns or all of them.
- Drop - Drop specified columns.
- DropGlobalTempView - See `spark.catalog.dropGlobalTempView`
- DropTempView - See `spark.catalog.dropTempView`
- ExamplePluginCommand
- ExamplePluginExpression
- ExamplePluginRelation
- ExecutePlanRequest - A request to be executed by the service.
- ExecutePlanResponse - The response to a query; a single request can produce one or more responses. Responses belonging to the same input query carry the same `session_id`.
- Expression - Expression used to refer to fields, functions and similar. This can be used everywhere expressions in SQL appear.
- Filter - Relation that applies a boolean expression `condition` to each row of `input` to produce the output result.
- FunctionExists - See `spark.catalog.functionExists`
- GetDatabase - See `spark.catalog.getDatabase`
- GetFunction - See `spark.catalog.getFunction`
- GetResourcesCommand - Command to get the output of `SparkContext.resources`.
- GetResourcesCommandResult - Response for the command `GetResourcesCommand`.
- GetTable - See `spark.catalog.getTable`
- GroupMap
- Hint - Specify a hint over a relation. A hint should have a name and optional parameters.
- HtmlString - Composes the string representing rows for output. It will invoke `Dataset.htmlString` to compute the results.
- InterruptRequest
- InterruptResponse
- IsCached - See `spark.catalog.isCached`
- JavaUdf
- Join - Relation of type [Join].
- KeyValue - The key-value pair for the config request and response.
- Limit - Relation of type [Limit] that is used to `limit` rows from the input relation.
- ListCatalogs - See `spark.catalog.listCatalogs`
- ListColumns - See `spark.catalog.listColumns`
- ListDatabases - See `spark.catalog.listDatabases`
- ListFunctions - See `spark.catalog.listFunctions`
- ListTables - See `spark.catalog.listTables`
- LocalRelation - A relation that does not need to be qualified by name.
- MapPartitions
- NaDrop - Drop rows containing null values. It will invoke `Dataset.na.drop` (same as `DataFrameNaFunctions.drop`) to compute the results.
- NaFill - Replaces null values. It will invoke `Dataset.na.fill` (same as `DataFrameNaFunctions.fill`) to compute the results. Three parameter combinations are supported: (1) `values` contains a single item and `cols` is empty: replaces null values in all type-compatible columns; (2) `values` contains a single item and `cols` is not empty: replaces null values in the specified columns; (3) `values` contains more than one item, in which case `cols` must have the same length: replaces each specified column with the corresponding value.
- NaReplace - Replaces old values with the corresponding new values. It will invoke `Dataset.na.replace` (same as `DataFrameNaFunctions.replace`) to compute the results.
- Offset - Relation of type [Offset] that is used to read rows starting from the `offset` on the input relation.
- Parse
- Plan - A [Plan] is the structure that carries the runtime information for the execution from the client to the server. A [Plan] can either be of the type [Relation], which is a reference to the underlying logical plan, or of the [Command] type that is used to execute commands on the server.
- Project - Projection of a bag of expressions for a given input relation.
- PythonUdf
- PythonUdtf
- Range - Relation of type [Range] that generates a sequence of integers.
- Read - Relation that reads from a file / table or other data source. Does not have additional inputs.
- ReattachExecuteRequest
- ReattachOptions
- RecoverPartitions - See `spark.catalog.recoverPartitions`
- RefreshByPath - See `spark.catalog.refreshByPath`
- RefreshTable - See `spark.catalog.refreshTable`
- Relation - The main [Relation] type. Fundamentally, a relation is a typed container that has exactly one explicit relation type set.
- RelationCommon - Common metadata of all relations.
- ReleaseExecuteRequest
- ReleaseExecuteResponse
- Repartition - Relation repartition.
- RepartitionByExpression
- ResourceInformation - ResourceInformation holds information about a type of resource. The corresponding class is `org.apache.spark.resource.ResourceInformation`.
- Sample - Relation of type [Sample] that samples a fraction of the dataset.
- ScalarScalaUdf
- SetCurrentCatalog - See `spark.catalog.setCurrentCatalog`
- SetCurrentDatabase - See `spark.catalog.setCurrentDatabase`
- SetOperation - Relation of type [SetOperation].
- ShowString - Composes the string representing rows for output. It will invoke `Dataset.showString` to compute the results.
- Sort - Relation of type [Sort].
- Sql - Relation that uses a SQL query to generate the output.
- SqlCommand - A SQL command is used to trigger the eager evaluation of SQL commands in Spark.
- StatApproxQuantile - Calculates the approximate quantiles of numerical columns of a DataFrame. It will invoke `Dataset.stat.approxQuantile` (same as `StatFunctions.approxQuantile`) to compute the results.
- StatCorr - Calculates the correlation of two columns of a DataFrame. Currently only supports the Pearson correlation coefficient. It will invoke `Dataset.stat.corr` (same as `StatFunctions.pearsonCorrelation`) to compute the results.
- StatCov - Calculates the sample covariance of two numerical columns of a DataFrame. It will invoke `Dataset.stat.cov` (same as `StatFunctions.calculateCov`) to compute the results.
- StatCrosstab - Computes a pair-wise frequency table (also known as a contingency table) of the given columns. It will invoke `Dataset.stat.crosstab` (same as `StatFunctions.crossTabulate`) to compute the results.
- StatDescribe - Computes basic statistics for numeric and string columns, including count, mean, stddev, min, and max. If no columns are given, this function computes statistics for all numerical or string columns.
- StatFreqItems - Finds frequent items for columns, possibly with false positives. It will invoke `Dataset.stat.freqItems` (same as `StatFunctions.freqItems`) to compute the results.
- StatSampleBy - Returns a stratified sample without replacement based on the fraction given for each stratum. It will invoke `Dataset.stat.sampleBy` to compute the results.
- StatSummary - Computes specified statistics for numeric and string columns. It will invoke `Dataset.summary` (same as `StatFunctions.summary`) to compute the results.
- StorageLevel - StorageLevel for persisting Datasets/Tables.
- StreamingForeachFunction
- StreamingQueryCommand - Commands for a streaming query.
- StreamingQueryCommandResult - Response for commands on a streaming query.
- StreamingQueryInstanceId - A tuple that uniquely identifies an instance of a streaming query run. It consists of an `id` that persists across streaming runs and a `run_id` that changes between each run of the streaming query that resumes from the checkpoint.
- StreamingQueryManagerCommand - Commands for the streaming query manager.
- StreamingQueryManagerCommandResult - Response for commands on the streaming query manager.
- SubqueryAlias - Relation alias.
- TableExists - See `spark.catalog.tableExists`
- Tail - Relation of type [Tail] that is used to fetch the last `limit` rows of the input relation.
- ToDf - Rename the columns of the input relation with a list of names of the same length.
- ToSchema
- UncacheTable - See `spark.catalog.uncacheTable`
- Unknown - Used for testing purposes only.
- Unpivot - Unpivot a DataFrame from wide format to long format, optionally leaving identifier columns set.
- UserContext - User context is used to refer to one particular user session that is executing queries in the backend.
- WithColumns - Adds columns or replaces the existing columns that have the same names.
- WithColumnsRenamed - Rename columns on the input relation using a name-to-name mapping.
- WithWatermark
- WriteOperation - As writes are not directly handled during analysis and planning, they are modeled as commands.
- WriteOperationV2 - As writes are not directly handled during analysis and planning, they are modeled as commands.
- WriteStreamOperationStart - Starts a write stream operation as a streaming query. The Query ID and Run ID of the streaming query are returned.
- WriteStreamOperationStartResult
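`ConfigRequest` and `KeyValue` above compose into configuration calls. A minimal construction sketch, assuming the nested module layout that prost generates for `ConfigRequest` and the same `spark_connect` import path as before; the helper name `set_shuffle_partitions` is hypothetical, and the conf key is an ordinary Spark setting used purely as an example:

```rust
use spark_connect::config_request::{operation, Operation, Set};
use spark_connect::{ConfigRequest, KeyValue};

/// Hypothetical helper: build a ConfigRequest that sets one Spark conf
/// entry for the given session.
fn set_shuffle_partitions(session_id: &str, partitions: u32) -> ConfigRequest {
    ConfigRequest {
        session_id: session_id.to_string(),
        operation: Some(Operation {
            // The operation is a oneof: Set, Get, GetWithDefault, Unset, ...
            op_type: Some(operation::OpType::Set(Set {
                pairs: vec![KeyValue {
                    key: "spark.sql.shuffle.partitions".to_string(),
                    // `value` is optional in the proto, hence Option<String>.
                    value: Some(partitions.to_string()),
                }],
            })),
        }),
        // Leave user_context, client_type, etc. at their defaults.
        ..Default::default()
    }
}
```

The built request would then go through the client's Config RPC, which answers with a `ConfigResponse` echoing the effective values.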