Module spark_connect_rs::spark
Spark Connect gRPC protobuf definitions, translated using tonic
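The generated types follow prost conventions: each proto `oneof` becomes a Rust enum wrapped in an `Option`, and every message derives `Default`. A minimal sketch, assuming the message and field names from the upstream Spark Connect proto (the `sql_plan` helper is hypothetical), of building a `Plan` that wraps a SQL `Relation`:

```rust
use spark_connect_rs::spark::{plan, relation, Plan, Relation, Sql};

// Build `SELECT 1` as a Spark Connect plan. prost models each proto
// `oneof` as a Rust enum wrapped in `Option`, so the nesting below
// mirrors the .proto structure one-to-one.
fn sql_plan(query: &str) -> Plan {
    let relation = Relation {
        common: None, // optional `RelationCommon` metadata (plan_id etc.)
        rel_type: Some(relation::RelType::Sql(Sql {
            query: query.to_string(),
            ..Default::default() // remaining fields (e.g. args) left empty
        })),
    };
    Plan {
        op_type: Some(plan::OpType::Root(relation)),
    }
}
```

Because every message derives `Default`, struct-update syntax keeps sketches like this resilient to fields added in newer proto revisions.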
Modules
- Nested message and enum types in `AddArtifactsRequest`.
- Nested message and enum types in `AddArtifactsResponse`.
- Nested message and enum types in `Aggregate`.
- Nested message and enum types in `AnalyzePlanRequest`.
- Nested message and enum types in `AnalyzePlanResponse`.
- Nested message and enum types in `ArtifactStatusesResponse`.
- Nested message and enum types in `Catalog`.
- Nested message and enum types in `Command`.
- Nested message and enum types in `CommonInlineUserDefinedFunction`.
- Nested message and enum types in `CommonInlineUserDefinedTableFunction`.
- Nested message and enum types in `ConfigRequest`.
- Nested message and enum types in `DataType`.
- Nested message and enum types in `ExecutePlanRequest`.
- Nested message and enum types in `ExecutePlanResponse`.
- Nested message and enum types in `Expression`.
- Nested message and enum types in `InterruptRequest`.
- Nested message and enum types in `Join`.
- Nested message and enum types in `NAReplace`.
- Nested message and enum types in `Parse`.
- Nested message and enum types in `Plan`.
- Nested message and enum types in `Read`.
- Nested message and enum types in `Relation`.
- Nested message and enum types in `ReleaseExecuteRequest`.
- Nested message and enum types in `SetOperation`.
- Generated client implementations (see the client sketch after this list).
- Nested message and enum types in `StatSampleBy`.
- Nested message and enum types in `StreamingForeachFunction`.
- Nested message and enum types in `StreamingQueryCommand`.
- Nested message and enum types in `StreamingQueryCommandResult`.
- Nested message and enum types in `StreamingQueryManagerCommand`.
- Nested message and enum types in `StreamingQueryManagerCommandResult`.
- Nested message and enum types in `Unpivot`.
- Nested message and enum types in `WriteOperation`.
- Nested message and enum types in `WriteOperationV2`.
- Nested message and enum types in `WriteStreamOperationStart`.
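The generated client module exposes a tonic client for `SparkConnectService`. A sketch of connecting and executing a plan, assuming tonic's `transport` feature, the default Spark Connect port 15002, a client-chosen session UUID, and the hypothetical `sql_plan` helper from the sketch above; `..Default::default()` hedges fields that vary across proto versions:

```rust
use spark_connect_rs::spark::{
    spark_connect_service_client::SparkConnectServiceClient, ExecutePlanRequest, UserContext,
};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Spark Connect servers listen on 15002 by default.
    let mut client =
        SparkConnectServiceClient::connect("http://localhost:15002").await?;

    let request = ExecutePlanRequest {
        // Any client-generated UUID string; a dummy value for this sketch.
        session_id: "00000000-0000-0000-0000-000000000000".to_string(),
        user_context: Some(UserContext {
            user_id: "me".to_string(),
            ..Default::default()
        }),
        plan: Some(sql_plan("SELECT 1")), // helper from the module-level sketch
        ..Default::default()
    };

    // `execute_plan` is server-streaming: one request, many responses.
    // Every response for the same input query carries the same session_id.
    let mut stream = client.execute_plan(request).await?.into_inner();
    while let Some(response) = stream.message().await? {
        println!("response for session {}", response.session_id);
    }
    Ok(())
}
```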
Structs
- Request to transfer client-local artifacts.
- Response to adding an artifact. Contains relevant metadata to verify successful transfer of artifact(s).
- Relation of type [Aggregate].
- Request to perform plan analyze, optionally to explain the plan.
- Response to performing analysis of the query. Contains relevant metadata to be able to reason about the performance.
- Request to get current statuses of artifacts at the server side.
- Response to checking artifact statuses.
- See `spark.catalog.cacheTable`
- A local relation that has been cached already.
- Represents a remote relation that has been cached on server.
- Catalog messages are marked as unstable.
- See `spark.catalog.clearCache`
- Collect arbitrary (named) metrics from a dataset.
- A [Command] is an operation that is executed by the server that does not directly consume or produce a relational result.
- Request to update or fetch the configurations.
- Response to the config request.
- A command that can create DataFrame global temp view or local temp view.
- See `spark.catalog.createExternalTable`
- See `spark.catalog.createTable`
- See `spark.catalog.currentCatalog`
- See `spark.catalog.currentDatabase`
- This message describes the logical [DataType] of something. It does not carry the value itself but only describes it.
- See `spark.catalog.databaseExists`
- Relation of type [Deduplicate] which has duplicate rows removed, considering either only a subset of the columns or all of the columns.
- Drop specified columns.
- See `spark.catalog.dropGlobalTempView`
- See `spark.catalog.dropTempView`
- A request to be executed by the service.
- The response of a query; there can be one or more responses for each request. Responses belonging to the same input query carry the same `session_id`.
- Expression used to refer to fields, functions and similar. This can be used everywhere expressions in SQL appear.
- Relation that applies a boolean expression `condition` on each row of `input` to produce the output result.
- See `spark.catalog.functionExists`
- See `spark.catalog.getDatabase`
- See `spark.catalog.getFunction`
- Command to get the output of ‘SparkContext.resources’
- Response for command ‘GetResourcesCommand’.
- See `spark.catalog.getTable`
- Specify a hint over a relation. Hint should have a name and optional parameters.
- Compose the string representing rows for output. It will invoke ‘Dataset.htmlString’ to compute the results.
- See `spark.catalog.isCached`
- Relation of type [Join].
- The key-value pair for the config request and response.
- Relation of type [Limit] that is used to `limit` rows from the input relation.
- See `spark.catalog.listCatalogs`
- See `spark.catalog.listColumns`
- See `spark.catalog.listDatabases`
- See `spark.catalog.listFunctions`
- See `spark.catalog.listTables`
- A relation that does not need to be qualified by name.
- Drop rows containing null values. It will invoke ‘Dataset.na.drop’ (same as ‘DataFrameNaFunctions.drop’) to compute the results.
- Replaces null values. It will invoke ‘Dataset.na.fill’ (same as ‘DataFrameNaFunctions.fill’) to compute the results. The following three parameter combinations are supported: (1) ‘values’ contains a single item and ‘cols’ is empty: null values are replaced in all type-compatible columns; (2) ‘values’ contains a single item and ‘cols’ is not empty: null values are replaced in the specified columns; (3) ‘values’ contains more than one item: ‘cols’ is required to have the same length, and each specified column is replaced with the corresponding value.
- Replaces old values with the corresponding values. It will invoke ‘Dataset.na.replace’ (same as ‘DataFrameNaFunctions.replace’) to compute the results.
- Relation of type [Offset] that is used to read rows starting from the `offset` on the input relation.
- Projection of a bag of expressions for a given input relation.
- Relation of type [Range] that generates a sequence of integers.
- Relation that reads from a file / table or other data source. Does not have additional inputs.
- See `spark.catalog.recoverPartitions`
- See `spark.catalog.refreshByPath`
- See `spark.catalog.refreshTable`
- The main [Relation] type. Fundamentally, a relation is a typed container that has exactly one explicit relation type set (see the composition sketch after this list).
- Common metadata of all relations.
- Relation repartition.
- ResourceInformation to hold information about a type of Resource. The corresponding class is ‘org.apache.spark.resource.ResourceInformation’
- Relation of type [Sample] that samples a fraction of the dataset.
- See `spark.catalog.setCurrentCatalog`
- See `spark.catalog.setCurrentDatabase`
- Relation of type [SetOperation].
- Compose the string representing rows for output. It will invoke ‘Dataset.showString’ to compute the results.
- Relation of type [Sort].
- Relation that uses a SQL query to generate the output.
- A SQL Command is used to trigger the eager evaluation of SQL commands in Spark.
- Calculates the approximate quantiles of numerical columns of a DataFrame. It will invoke ‘Dataset.stat.approxQuantile’ (same as ‘StatFunctions.approxQuantile’) to compute the results.
- Calculates the correlation of two columns of a DataFrame. Currently only supports the Pearson Correlation Coefficient. It will invoke ‘Dataset.stat.corr’ (same as ‘StatFunctions.pearsonCorrelation’) to compute the results.
- Calculate the sample covariance of two numerical columns of a DataFrame. It will invoke ‘Dataset.stat.cov’ (same as ‘StatFunctions.calculateCov’) to compute the results.
- Computes a pair-wise frequency table of the given columns. Also known as a contingency table. It will invoke ‘Dataset.stat.crosstab’ (same as ‘StatFunctions.crossTabulate’) to compute the results.
- Computes basic statistics for numeric and string columns, including count, mean, stddev, min, and max. If no columns are given, this function computes statistics for all numerical or string columns.
- Finding frequent items for columns, possibly with false positives. It will invoke ‘Dataset.stat.freqItems’ (same as ‘StatFunctions.freqItems’) to compute the results.
- Returns a stratified sample without replacement based on the fraction given on each stratum. It will invoke ‘Dataset.stat.sampleBy’ (same as ‘DataFrameStatFunctions.sampleBy’) to compute the results.
- Computes specified statistics for numeric and string columns. It will invoke ‘Dataset.summary’ (same as ‘StatFunctions.summary’) to compute the results.
- StorageLevel for persisting Datasets/Tables.
- Commands for a streaming query.
- Response for commands on a streaming query.
- A tuple that uniquely identifies an instance of a streaming query run. It consists of `id`, which persists across the streaming runs, and `run_id`, which changes between each run of the streaming query that resumes from the checkpoint.
- Commands for the streaming query manager.
- Response for commands on the streaming query manager.
- Relation alias.
- See `spark.catalog.tableExists`
- Relation of type [Tail] that is used to fetch the last `limit` rows from the input relation.
- Rename columns on the input relation using a list of names of the same length.
- See `spark.catalog.uncacheTable`
- Used for testing purposes only.
- Unpivot a DataFrame from wide format to long format, optionally leaving identifier columns set.
- User Context is used to refer to one particular user session that is executing queries in the backend.
- Add columns or replace the existing columns that have the same names.
- Rename columns on the input relation by a map with name to name mapping.
- As writes are not directly handled during analysis and planning, they are modeled as commands.
- As writes are not directly handled during analysis and planning, they are modeled as commands.
- Starts write stream operation as streaming query. Query ID and Run ID of the streaming query are returned.
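Relations nest: each `Relation` holds exactly one `RelType`, and prost boxes the inner relation of recursive variants such as [Limit]. A sketch composing a [Read] data source with a [Limit], assuming the field names from the upstream proto (the `limited_parquet_scan` helper is hypothetical, and `..Default::default()` covers version-dependent fields):

```rust
use spark_connect_rs::spark::{read, relation, Limit, Read, Relation};

// A `Relation` tree: LIMIT 10 over a parquet scan. Inner relations are
// boxed because the generated types are recursive.
fn limited_parquet_scan(path: &str) -> Relation {
    let scan = Relation {
        common: None,
        rel_type: Some(relation::RelType::Read(Read {
            read_type: Some(read::ReadType::DataSource(read::DataSource {
                format: Some("parquet".to_string()),
                paths: vec![path.to_string()],
                ..Default::default() // schema, options, predicates left unset
            })),
            ..Default::default()
        })),
    };
    Relation {
        common: None,
        rel_type: Some(relation::RelType::Limit(Box::new(Limit {
            input: Some(Box::new(scan)), // the inner relation, boxed
            limit: 10,
        }))),
    }
}
```

Wrapping the outermost `Relation` in `plan::OpType::Root` (as in the module-level sketch) yields a `Plan` ready to send via `execute_plan`.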