Structs

Specifies the type and number of accelerator cards attached to the instances of an instance group. See GPUs on Compute Engine (https://cloud.google.com/compute/docs/gpus/).

Autoscaling Policy config associated with the cluster.

Describes an autoscaling policy for Dataproc cluster autoscaler.

Basic algorithm for autoscaling.

Basic autoscaling configurations for YARN.

A representation of a batch workload in the service.

Associates members, or principals, with a role.

A request to cancel a job.

Describes the identifying information, config, and status of a Dataproc cluster.

The cluster config.

Contains cluster daemon metrics, such as HDFS and YARN stats. Beta Feature: This report is available for testing purposes only. It may be changed before final release.

A selector that chooses target cluster for jobs based on metadata.

The status of a cluster and its instances.

Confidential Instance Config for clusters using Confidential VMs (https://cloud.google.com/compute/confidential-vm/docs).

Central instance to access all Dataproc-related resource activities.
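
A minimal usage sketch of the hub, assuming the generated google_dataproc1 crate (google-apis-rs) with its hyper/oauth2 setup; the constructor details, the regions_clusters_list method name, and the project/region values are assumptions that vary by crate version:

    // A minimal sketch, not a verbatim crate example: the connector and
    // authenticator setup follows the google-apis-rs pattern but may differ
    // across google_dataproc1 versions.
    use google_dataproc1::{hyper, hyper_rustls, oauth2, Dataproc};

    #[tokio::main]
    async fn main() {
        // Hypothetical credentials; replace with a real ApplicationSecret.
        let secret: oauth2::ApplicationSecret = Default::default();
        let auth = oauth2::InstalledFlowAuthenticator::builder(
            secret,
            oauth2::InstalledFlowReturnMethod::HTTPRedirect,
        )
        .build()
        .await
        .expect("authenticator");

        // The hub is the central instance; every call goes through it.
        let hub = Dataproc::new(
            hyper::Client::builder().build(
                hyper_rustls::HttpsConnectorBuilder::new()
                    .with_native_roots()
                    .https_or_http()
                    .enable_http1()
                    .build(),
            ),
            auth,
        );

        // Method name assumed from the REST path projects.regions.clusters.list;
        // "my-project" and "us-central1" are placeholders.
        let result = hub
            .projects()
            .regions_clusters_list("my-project", "us-central1")
            .doit()
            .await;
        println!("{:?}", result.map(|(_, clusters)| clusters));
    }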

Contains Dataproc metric config.

A request to collect cluster diagnostic information.

Specifies the config of disk options for a group of VM instances.

A generic empty message that you can re-use to avoid defining duplicated empty messages in your APIs. A typical example is to use it as the request or the response type of an API method. For instance:

    service Foo {
      rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty);
    }

The JSON representation for Empty is an empty JSON object {}.

Encryption settings for the cluster.

Endpoint config for this cluster.

Environment configuration for a workload.

Execution configuration for a workload.

Represents a textual expression in the Common Expression Language (CEL) syntax. CEL is a C-like expression language. The syntax and semantics of CEL are documented at https://github.com/google/cel-spec.

Example (Comparison):

    title: "Summary size limit"
    description: "Determines if a summary is less than 100 chars"
    expression: "document.summary.size() < 100"

Example (Equality):

    title: "Requestor is owner"
    description: "Determines if requestor is the document owner"
    expression: "document.owner == request.auth.claims.email"

Example (Logic):

    title: "Public documents"
    description: "Determine whether the document should be publicly visible"
    expression: "document.type != 'private' && document.type != 'internal'"

Example (Data Manipulation):

    title: "Notification string"
    description: "Create a notification string with a timestamp."
    expression: "'New message received at ' + string(document.create_time)"

The exact variables and functions that may be referenced within an expression are determined by the service that evaluates it. See the service documentation for additional information.

Common config settings for resources of Compute Engine cluster instances, applicable to all instances in the cluster.

Request message for GetIamPolicy method.

Encapsulates settings provided to GetIamPolicy.

The cluster’s GKE config.

A Dataproc job for running Apache Hadoop MapReduce (https://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html) jobs on Apache Hadoop YARN (https://hadoop.apache.org/docs/r2.7.1/hadoop-yarn/hadoop-yarn-site/YARN.html).

A Dataproc job for running Apache Hive (https://hive.apache.org/) queries on YARN.

Identity-related configuration, including service-account-based secure multi-tenancy user mappings.

A request to inject credentials into a cluster.

Configuration for the size bounds of an instance group, including its proportional size to other groups.

The config settings for Compute Engine resources in an instance group, such as a master or worker group.

A reference to a Compute Engine instance.

A request to instantiate a workflow template.

A Dataproc job resource.

Dataproc job config.

Encapsulates the full scoping used to reference a job.

Job scheduling options.

Dataproc job status.

Specifies Kerberos-related configuration.

Specifies the cluster auto-delete schedule configuration.

A response to a request to list autoscaling policies in a project.

A list of batch workloads.

The list of all clusters in a project.

A list of jobs in a project.

The response message for Operations.ListOperations.

A response to a request to list workflow templates in a project.

The runtime logging config of the job.

Cluster that is managed by the workflow.

Specifies the resources used to actively manage an instance group.

Specifies a Metastore configuration.

A metric source to enable, along with any optional metrics for this source that override the Dataproc defaults.

A full, namespace-isolated deployment target for an existing GKE cluster.

Node Group Affinity for clusters using sole-tenant node groups.

Specifies an executable to run on a fully configured node and a timeout period for executable completion.

This resource represents a long-running operation that is the result of a network API call.

A job executed by the workflow.

Configuration for parameter validation.

Auxiliary services configuration for a workload.

A Dataproc job for running Apache Pig (https://pig.apache.org/) queries on YARN.

An Identity and Access Management (IAM) policy, which specifies access controls for Google Cloud resources.

A Policy is a collection of bindings. A binding binds one or more members, or principals, to a single role. Principals can be user accounts, service accounts, Google groups, and domains (such as G Suite). A role is a named list of permissions; each role can be an IAM predefined role or a user-created custom role.

For some types of Google Cloud resources, a binding can also specify a condition, which is a logical expression that allows access to a resource only if the expression evaluates to true. A condition can add constraints based on attributes of the request, the resource, or both. To learn which resources support conditions in their IAM policies, see the IAM documentation (https://cloud.google.com/iam/help/conditions/resource-policies).

JSON example:

    {
      "bindings": [
        {
          "role": "roles/resourcemanager.organizationAdmin",
          "members": [
            "user:mike@example.com",
            "group:admins@example.com",
            "domain:google.com",
            "serviceAccount:my-project-id@appspot.gserviceaccount.com"
          ]
        },
        {
          "role": "roles/resourcemanager.organizationViewer",
          "members": ["user:eve@example.com"],
          "condition": {
            "title": "expirable access",
            "description": "Does not grant access after Sep 2020",
            "expression": "request.time < timestamp('2020-10-01T00:00:00.000Z')"
          }
        }
      ],
      "etag": "BwWWja0YfJA=",
      "version": 3
    }

YAML example:

    bindings:
    - members:
      - user:mike@example.com
      - group:admins@example.com
      - domain:google.com
      - serviceAccount:my-project-id@appspot.gserviceaccount.com
      role: roles/resourcemanager.organizationAdmin
    - members:
      - user:eve@example.com
      role: roles/resourcemanager.organizationViewer
      condition:
        title: expirable access
        description: Does not grant access after Sep 2020
        expression: request.time < timestamp('2020-10-01T00:00:00.000Z')
    etag: BwWWja0YfJA=
    version: 3

For a description of IAM and its features, see the IAM documentation (https://cloud.google.com/iam/docs/).

A Dataproc job for running Presto (https://prestosql.io/) queries. IMPORTANT: The Dataproc Presto Optional Component (https://cloud.google.com/dataproc/docs/concepts/components/presto) must be enabled when the cluster is created to submit a Presto job to the cluster.

Creates a new autoscaling policy.

Deletes an autoscaling policy. It is an error to delete an autoscaling policy that is in use by one or more clusters.

Retrieves an autoscaling policy.

Gets the access control policy for a resource. Returns an empty policy if the resource exists and does not have a policy set.

Lists autoscaling policies in the project.

Sets the access control policy on the specified resource. Replaces any existing policy. Can return NOT_FOUND, INVALID_ARGUMENT, and PERMISSION_DENIED errors.

Returns permissions that a caller has on the specified resource. If the resource does not exist, this will return an empty set of permissions, not a NOT_FOUND error. Note: This operation is designed to be used for building permission-aware UIs and command-line tools, not for authorization checking. This operation may "fail open" without warning.

Updates (replaces) an autoscaling policy. The update_mask check is disabled, because all updates are full replacements.

Creates a batch workload that executes asynchronously.

Deletes the batch workload resource. If the batch is not in a terminal state, the delete fails and the response returns FAILED_PRECONDITION.

Gets the batch workload resource representation.

Lists batch workloads.

Creates a new workflow template.

Deletes a workflow template. It does not cancel in-progress workflows.

Retrieves the latest workflow template. Can retrieve a previously instantiated template by specifying the optional version parameter.

Gets the access control policy for a resource. Returns an empty policy if the resource exists and does not have a policy set.

Instantiates a template and begins execution. The returned Operation can be used to track execution of the workflow by polling operations.get. The Operation will complete when the entire workflow is finished. The running workflow can be aborted via operations.cancel. This will cause any inflight jobs to be cancelled and workflow-owned clusters to be deleted. The Operation.metadata will be WorkflowMetadata (https://cloud.google.com/dataproc/docs/reference/rpc/google.cloud.dataproc.v1#workflowmetadata). Also see Using WorkflowMetadata (https://cloud.google.com/dataproc/docs/concepts/workflows/debugging#using_workflowmetadata). On successful completion, Operation.response will be Empty.

Instantiates a template and begins execution. This method is equivalent to executing the sequence CreateWorkflowTemplate, InstantiateWorkflowTemplate, DeleteWorkflowTemplate. The returned Operation can be used to track execution of the workflow by polling operations.get. The Operation will complete when the entire workflow is finished. The running workflow can be aborted via operations.cancel. This will cause any inflight jobs to be cancelled and workflow-owned clusters to be deleted. The Operation.metadata will be WorkflowMetadata (https://cloud.google.com/dataproc/docs/reference/rpc/google.cloud.dataproc.v1#workflowmetadata). Also see Using WorkflowMetadata (https://cloud.google.com/dataproc/docs/concepts/workflows/debugging#using_workflowmetadata). On successful completion, Operation.response will be Empty.
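
Both instantiate methods return a long-running Operation that the caller polls until it is done. A minimal polling sketch, assuming the generated google_dataproc1 crate; the hub type parameter and the regions_operations_get method name are assumptions taken from the REST path projects.regions.operations.get and may differ by crate version:

    use std::time::Duration;

    use google_dataproc1::{hyper, hyper_rustls, Dataproc};

    // Polls operations.get until the workflow Operation reports done, per the
    // description above. Hub type and method name are assumptions.
    async fn wait_for_workflow(
        hub: &Dataproc<hyper_rustls::HttpsConnector<hyper::client::HttpConnector>>,
        operation_name: &str, // e.g. "projects/{p}/regions/{r}/operations/{id}"
    ) {
        loop {
            let (_, op) = hub
                .projects()
                .regions_operations_get(operation_name)
                .doit()
                .await
                .expect("operations.get");
            if op.done.unwrap_or(false) {
                // On success Operation.response is Empty; on failure
                // Operation.error carries a google.rpc.Status.
                break;
            }
            tokio::time::sleep(Duration::from_secs(10)).await;
        }
    }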

Lists workflows that match the specified filter in the request.

Sets the access control policy on the specified resource. Replaces any existing policy. Can return NOT_FOUND, INVALID_ARGUMENT, and PERMISSION_DENIED errors.

Returns permissions that a caller has on the specified resource. If the resource does not exist, this will return an empty set of permissions, not a NOT_FOUND error. Note: This operation is designed to be used for building permission-aware UIs and command-line tools, not for authorization checking. This operation may "fail open" without warning.

Updates (replaces) a workflow template. The updated template must contain a version that matches the current server version.

A builder providing access to all methods supported on project resources. It is not used directly, but through the Dataproc hub.

Creates a new autoscaling policy.

Deletes an autoscaling policy. It is an error to delete an autoscaling policy that is in use by one or more clusters.

Retrieves an autoscaling policy.

Gets the access control policy for a resource. Returns an empty policy if the resource exists and does not have a policy set.

Lists autoscaling policies in the project.

Sets the access control policy on the specified resource. Replaces any existing policy. Can return NOT_FOUND, INVALID_ARGUMENT, and PERMISSION_DENIED errors.

Returns permissions that a caller has on the specified resource. If the resource does not exist, this will return an empty set of permissions, not a NOT_FOUND error. Note: This operation is designed to be used for building permission-aware UIs and command-line tools, not for authorization checking. This operation may "fail open" without warning.

Updates (replaces) an autoscaling policy. The update_mask check is disabled, because all updates are full replacements.

Creates a cluster in a project. The returned Operation.metadata will be ClusterOperationMetadata (https://cloud.google.com/dataproc/docs/reference/rpc/google.cloud.dataproc.v1#clusteroperationmetadata).

Deletes a cluster in a project. The returned Operation.metadata will be ClusterOperationMetadata (https://cloud.google.com/dataproc/docs/reference/rpc/google.cloud.dataproc.v1#clusteroperationmetadata).

Gets cluster diagnostic information. The returned Operation.metadata will be ClusterOperationMetadata (https://cloud.google.com/dataproc/docs/reference/rpc/google.cloud.dataproc.v1#clusteroperationmetadata). After the operation completes, Operation.response contains DiagnoseClusterResults (https://cloud.google.com/dataproc/docs/reference/rpc/google.cloud.dataproc.v1#diagnoseclusterresults).

Gets the resource representation for a cluster in a project.

Gets the access control policy for a resource. Returns an empty policy if the resource exists and does not have a policy set.

Injects encrypted credentials into all of the VMs in a cluster. The target cluster must be a personal auth cluster assigned to the user who is issuing the RPC.

Lists all regions/{region}/clusters in a project alphabetically.

Updates a cluster in a project. The returned Operation.metadata will be ClusterOperationMetadata (https://cloud.google.com/dataproc/docs/reference/rpc/google.cloud.dataproc.v1#clusteroperationmetadata). The cluster must be in a RUNNING state or an error is returned.

Sets the access control policy on the specified resource. Replaces any existing policy. Can return NOT_FOUND, INVALID_ARGUMENT, and PERMISSION_DENIED errors.

Starts a cluster in a project.

Stops a cluster in a project.

Returns permissions that a caller has on the specified resource. If the resource does not exist, this will return an empty set of permissions, not a NOT_FOUND error. Note: This operation is designed to be used for building permission-aware UIs and command-line tools, not for authorization checking. This operation may "fail open" without warning.

Starts a job cancellation request. To access the job resource after cancellation, call regions/{region}/jobs.list (https://cloud.google.com/dataproc/docs/reference/rest/v1/projects.regions.jobs/list) or regions/{region}/jobs.get (https://cloud.google.com/dataproc/docs/reference/rest/v1/projects.regions.jobs/get).

Deletes the job from the project. If the job is active, the delete fails, and the response returns FAILED_PRECONDITION.

Gets the resource representation for a job in a project.

Gets the access control policy for a resource. Returns an empty policy if the resource exists and does not have a policy set.

Lists regions/{region}/jobs in a project.

Updates a job in a project.

Sets the access control policy on the specified resource. Replaces any existing policy. Can return NOT_FOUND, INVALID_ARGUMENT, and PERMISSION_DENIED errors.

Submits a job to a cluster.

Submits a job to a cluster.
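
As a rough illustration of the request shape, a hedged sketch of building a SubmitJobRequest for a Spark job, assuming the generated google_dataproc1::api types; the struct and field names mirror the REST API (projects.regions.jobs.submit), and the cluster name, main class, and jar URI are hypothetical:

    use google_dataproc1::api::{Job, JobPlacement, SparkJob, SubmitJobRequest};

    // Builds the request body; all names are placeholders, and every field is
    // Option-wrapped in the generated API, hence the Some(...) wrapping.
    fn spark_submit_request() -> SubmitJobRequest {
        SubmitJobRequest {
            job: Some(Job {
                placement: Some(JobPlacement {
                    cluster_name: Some("my-cluster".to_string()), // hypothetical
                    ..Default::default()
                }),
                spark_job: Some(SparkJob {
                    main_class: Some("org.example.SparkApp".to_string()), // hypothetical
                    jar_file_uris: Some(vec!["gs://my-bucket/app.jar".to_string()]),
                    ..Default::default()
                }),
                ..Default::default()
            }),
            ..Default::default()
        }
    }

The request would then be passed to the jobs submit builder on the hub, e.g. hub.projects().regions_jobs_submit(request, project_id, region).doit() (method name assumed from the REST path).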

Returns permissions that a caller has on the specified resource. If the resource does not exist, this will return an empty set of permissions, not a NOT_FOUND error. Note: This operation is designed to be used for building permission-aware UIs and command-line tools, not for authorization checking. This operation may "fail open" without warning.

Starts asynchronous cancellation on a long-running operation. The server makes a best effort to cancel the operation, but success is not guaranteed. If the server doesn’t support this method, it returns google.rpc.Code.UNIMPLEMENTED. Clients can use Operations.GetOperation or other methods to check whether the cancellation succeeded or whether the operation completed despite cancellation. On successful cancellation, the operation is not deleted; instead, it becomes an operation with an Operation.error value with a google.rpc.Status.code of 1, corresponding to Code.CANCELLED.

Deletes a long-running operation. This method indicates that the client is no longer interested in the operation result. It does not cancel the operation. If the server doesn’t support this method, it returns google.rpc.Code.UNIMPLEMENTED.

Gets the latest state of a long-running operation. Clients can use this method to poll the operation result at intervals as recommended by the API service.

Gets the access control policy for a resource. Returns an empty policy if the resource exists and does not have a policy set.

Lists operations that match the specified filter in the request. If the server doesn't support this method, it returns UNIMPLEMENTED. NOTE: the name binding allows API services to override the binding to use different resource name schemes, such as users/*/operations. To override the binding, API services can add a binding such as "/v1/{name=users/*}/operations" to their service configuration. For backwards compatibility, the default name includes the operations collection id; however, overriding users must ensure the name binding is the parent resource, without the operations collection id.

Sets the access control policy on the specified resource. Replaces any existing policy. Can return NOT_FOUND, INVALID_ARGUMENT, and PERMISSION_DENIED errors.

Returns permissions that a caller has on the specified resource. If the resource does not exist, this will return an empty set of permissions, not a NOT_FOUND error. Note: This operation is designed to be used for building permission-aware UIs and command-line tools, not for authorization checking. This operation may "fail open" without warning.

Creates a new workflow template.

Deletes a workflow template. It does not cancel in-progress workflows.

Retrieves the latest workflow template. Can retrieve a previously instantiated template by specifying the optional version parameter.

Gets the access control policy for a resource. Returns an empty policy if the resource exists and does not have a policy set.

Instantiates a template and begins execution. The returned Operation can be used to track execution of the workflow by polling operations.get. The Operation will complete when the entire workflow is finished. The running workflow can be aborted via operations.cancel. This will cause any inflight jobs to be cancelled and workflow-owned clusters to be deleted. The Operation.metadata will be WorkflowMetadata (https://cloud.google.com/dataproc/docs/reference/rpc/google.cloud.dataproc.v1#workflowmetadata). Also see Using WorkflowMetadata (https://cloud.google.com/dataproc/docs/concepts/workflows/debugging#using_workflowmetadata). On successful completion, Operation.response will be Empty.

Instantiates a template and begins execution. This method is equivalent to executing the sequence CreateWorkflowTemplate, InstantiateWorkflowTemplate, DeleteWorkflowTemplate. The returned Operation can be used to track execution of the workflow by polling operations.get. The Operation will complete when the entire workflow is finished. The running workflow can be aborted via operations.cancel. This will cause any inflight jobs to be cancelled and workflow-owned clusters to be deleted. The Operation.metadata will be WorkflowMetadata (https://cloud.google.com/dataproc/docs/reference/rpc/google.cloud.dataproc.v1#workflowmetadata). Also see Using WorkflowMetadata (https://cloud.google.com/dataproc/docs/concepts/workflows/debugging#using_workflowmetadata). On successful completion, Operation.response will be Empty.

Lists workflows that match the specified filter in the request.

Sets the access control policy on the specified resource. Replaces any existing policy. Can return NOT_FOUND, INVALID_ARGUMENT, and PERMISSION_DENIED errors.

Returns permissions that a caller has on the specified resource. If the resource does not exist, this will return an empty set of permissions, not a NOT_FOUND error. Note: This operation is designed to be used for building permission-aware UIs and command-line tools, not for authorization checking. This operation may "fail open" without warning.

Updates (replaces) a workflow template. The updated template must contain a version that matches the current server version.

A configuration for running an Apache PySpark (https://spark.apache.org/docs/latest/api/python/getting_started/quickstart.html) batch workload.

A Dataproc job for running Apache PySpark (https://spark.apache.org/docs/0.9.0/python-programming-guide.html) applications on YARN.

A list of queries to run on a cluster.

Validation based on regular expressions.

A request to repair a cluster.

Reservation Affinity for consuming a zonal reservation.

Runtime configuration for a workload.

Runtime information about workload execution.

Security-related configuration, including encryption, Kerberos, etc.

Request message for SetIamPolicy method.

Shielded Instance Config for clusters using Compute Engine Shielded VMs (https://cloud.google.com/security/shielded-cloud/shielded-vm).

Specifies the selection and config of software inside the cluster.

A configuration for running an Apache Spark (https://spark.apache.org/) batch workload.

Spark History Server configuration for the workload.

A Dataproc job for running Apache Spark (https://spark.apache.org/) applications on YARN.

A configuration for running an Apache SparkR (https://spark.apache.org/docs/latest/sparkr.html) batch workload.

A Dataproc job for running Apache SparkR (https://spark.apache.org/docs/latest/sparkr.html) applications on YARN.

A configuration for running Apache Spark SQL (https://spark.apache.org/sql/) queries as a batch workload.

A Dataproc job for running Apache Spark SQL (https://spark.apache.org/sql/) queries.

Basic autoscaling configurations for Spark Standalone.

A request to start a cluster.

Historical state information.

The Status type defines a logical error model that is suitable for different programming environments, including REST APIs and RPC APIs. It is used by gRPC (https://github.com/grpc). Each Status message contains three pieces of data: error code, error message, and error details. You can find out more about this error model and how to work with it in the API Design Guide (https://cloud.google.com/apis/design/errors).

A request to stop a cluster.

A request to submit a job.

A configurable parameter that replaces one or more fields in the template. Parameterizable fields:

- Labels
- File URIs
- Job properties
- Job arguments
- Script variables
- Main class (in HadoopJob and SparkJob)
- Zone (in ClusterSelector)
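
For example, a hedged sketch of a parameter that substitutes the zone in a ClusterSelector, assuming the generated google_dataproc1::api types; the field names mirror the REST API and the values are hypothetical:

    use google_dataproc1::api::{ParameterValidation, RegexValidation, TemplateParameter};

    // A parameter named ZONE that substitutes the cluster selector's zone and
    // restricts values to US zones via a regex validation.
    fn zone_parameter() -> TemplateParameter {
        TemplateParameter {
            name: Some("ZONE".to_string()),
            description: Some("Zone for the workflow's cluster selector".to_string()),
            // Field paths identify the template fields this parameter replaces.
            fields: Some(vec!["placement.clusterSelector.zone".to_string()]),
            validation: Some(ParameterValidation {
                regex: Some(RegexValidation {
                    regexes: Some(vec!["us-.*".to_string()]),
                    ..Default::default()
                }),
                ..Default::default()
            }),
            ..Default::default()
        }
    }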

Request message for TestIamPermissions method.

Response message for TestIamPermissions method.

Validation based on a list of allowed values.

A Dataproc workflow template resource.

Specifies workflow execution target. Either managed_cluster or cluster_selector is required.

A YARN application created by a job. Application information is a subset of org.apache.hadoop.yarn.proto.YarnProtos.ApplicationReportProto. Beta Feature: This report is available for testing purposes only. It may be changed before final release.

Enums

Identifies an OAuth2 authorization scope. A scope is needed when requesting an authorization token.