Module google_dataproc1::api

Structs

  • Specifies the type and number of accelerator cards attached to the instances of an instance group. See GPUs on Compute Engine (https://cloud.google.com/compute/docs/gpus/).
  • Autoscaling Policy config associated with the cluster.
  • Describes an autoscaling policy for Dataproc cluster autoscaler.
  • Node group identification and configuration information.
  • Auxiliary services configuration for a Cluster.
  • Basic algorithm for autoscaling.
  • Basic autoscaling configurations for YARN.
  • A representation of a batch workload in the service.
  • Associates members, or principals, with a role.
  • A request to cancel a job.
  • Describes the identifying information, config, and status of a Dataproc cluster.
  • The cluster config.
  • Contains cluster daemon metrics, such as HDFS and YARN stats. Beta Feature: This report is available for testing purposes only. It may be changed before final release.
  • A selector that chooses target cluster for jobs based on metadata.
  • The status of a cluster and its instances.
  • Confidential Instance Config for clusters using Confidential VMs (https://cloud.google.com/compute/confidential-vm/docs).
  • Central instance to access all Dataproc related resource activities
  • Dataproc metric config.
  • A request to collect cluster diagnostic information.
  • Specifies the config of disk options for a group of VM instances.
  • Driver scheduling configuration.
  • A generic empty message that you can re-use to avoid defining duplicated empty messages in your APIs. A typical example is to use it as the request or the response type of an API method. For instance: service Foo { rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty); }
  • Encryption settings for the cluster.
  • Endpoint config for this cluster.
  • Environment configuration for a workload.
  • Execution configuration for a workload.
  • Represents a textual expression in the Common Expression Language (CEL) syntax. CEL is a C-like expression language. The syntax and semantics of CEL are documented at https://github.com/google/cel-spec.
    Example (Comparison):
      title: "Summary size limit"
      description: "Determines if a summary is less than 100 chars"
      expression: "document.summary.size() < 100"
    Example (Equality):
      title: "Requestor is owner"
      description: "Determines if requestor is the document owner"
      expression: "document.owner == request.auth.claims.email"
    Example (Logic):
      title: "Public documents"
      description: "Determine whether the document should be publicly visible"
      expression: "document.type != 'private' && document.type != 'internal'"
    Example (Data Manipulation):
      title: "Notification string"
      description: "Create a notification string with a timestamp."
      expression: "'New message received at ' + string(document.create_time)"
    The exact variables and functions that may be referenced within an expression are determined by the service that evaluates it. See the service documentation for additional information.
  • A Dataproc job for running Apache Flink applications on YARN.
  • Common config settings for resources of Compute Engine cluster instances, applicable to all instances in the cluster.
  • Request message for GetIamPolicy method.
  • Encapsulates settings provided to GetIamPolicy.
  • The cluster’s GKE config.
  • Parameters that describe cluster nodes.
  • A GkeNodeConfigAcceleratorConfig represents a Hardware Accelerator request for a node pool.
  • GkeNodePoolAutoscaling contains information the cluster autoscaler needs to adjust the size of the node pool to the current cluster usage.
  • The configuration of a GKE node pool used by a Dataproc-on-GKE cluster (https://cloud.google.com/dataproc/docs/concepts/jobs/dataproc-gke#create-a-dataproc-on-gke-cluster).
  • GKE node pools that Dataproc workloads run on.
  • Encryption settings for encrypting workflow template job arguments.
  • A Dataproc job for running Apache Hadoop MapReduce (https://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html) jobs on Apache Hadoop YARN (https://hadoop.apache.org/docs/r2.7.1/hadoop-yarn/hadoop-yarn-site/YARN.html).
  • A Dataproc job for running Apache Hive (https://hive.apache.org/) queries on YARN.
  • Identity related configuration, including service account based secure multi-tenancy user mappings.
  • A request to inject credentials into a cluster.
  • Instance flexibility Policy allowing a mixture of VM shapes and provisioning models.
  • Configuration for the size bounds of an instance group, including its proportional size to other groups.
  • The config settings for Compute Engine resources in an instance group, such as a master or worker group.
  • A reference to a Compute Engine instance.
  • Defines machine types and the rank to which those machine types belong.
  • Defines a mapping from machine types to the number of VMs that are created with each machine type.
  • A request to instantiate a workflow template.
  • Represents a time interval, encoded as a Timestamp start (inclusive) and a Timestamp end (exclusive). The start must be less than or equal to the end. When the start equals the end, the interval is empty (matches no time). When both start and end are unspecified, the interval matches any time.
  • A Dataproc job resource.
  • Dataproc job config.
  • Encapsulates the full scoping used to reference a job.
  • Job scheduling options.
  • Dataproc job status.
  • Jupyter configuration for an interactive session.
  • Specifies Kerberos related configuration.
  • The configuration for running the Dataproc cluster on Kubernetes.
  • The software configuration for this Dataproc cluster running on Kubernetes.
  • Specifies the cluster auto-delete schedule configuration.
  • A response to a request to list autoscaling policies in a project.
  • A list of batch workloads.
  • The list of all clusters in a project.
  • A list of jobs in a project.
  • The response message for Operations.ListOperations.
  • A list of session templates.
  • A list of interactive sessions.
  • A response to a request to list workflow templates in a project.
  • The runtime logging config of the job.
  • Cluster that is managed by the workflow.
  • Specifies the resources used to actively manage an instance group.
  • Specifies a Metastore configuration.
  • A Dataproc custom metric.
  • Deprecated. Used only for the deprecated beta. A full, namespace-isolated deployment target for an existing GKE cluster.
  • Dataproc Node Group. The Dataproc NodeGroup resource is not related to the Dataproc NodeGroupAffinity resource.
  • Node Group Affinity for clusters using sole-tenant node groups. The Dataproc NodeGroupAffinity resource is not related to the Dataproc NodeGroup resource.
  • Specifies an executable to run on a fully configured node and a timeout period for executable completion.
  • Indicates a list of workers of the same type.
  • This resource represents a long-running operation that is the result of a network API call.
  • A job executed by the workflow.
  • Configuration for parameter validation.
  • Auxiliary services configuration for a workload.
  • A Dataproc job for running Apache Pig (https://pig.apache.org/) queries on YARN.
  • An Identity and Access Management (IAM) policy, which specifies access controls for Google Cloud resources.
    A Policy is a collection of bindings. A binding binds one or more members, or principals, to a single role. Principals can be user accounts, service accounts, Google groups, and domains (such as G Suite). A role is a named list of permissions; each role can be an IAM predefined role or a user-created custom role.
    For some types of Google Cloud resources, a binding can also specify a condition, which is a logical expression that allows access to a resource only if the expression evaluates to true. A condition can add constraints based on attributes of the request, the resource, or both. To learn which resources support conditions in their IAM policies, see the IAM documentation (https://cloud.google.com/iam/help/conditions/resource-policies).
    JSON example:
      {
        "bindings": [
          {
            "role": "roles/resourcemanager.organizationAdmin",
            "members": [
              "user:mike@example.com",
              "group:admins@example.com",
              "domain:google.com",
              "serviceAccount:my-project-id@appspot.gserviceaccount.com"
            ]
          },
          {
            "role": "roles/resourcemanager.organizationViewer",
            "members": [
              "user:eve@example.com"
            ],
            "condition": {
              "title": "expirable access",
              "description": "Does not grant access after Sep 2020",
              "expression": "request.time < timestamp('2020-10-01T00:00:00.000Z')"
            }
          }
        ],
        "etag": "BwWWja0YfJA=",
        "version": 3
      }
    YAML example:
      bindings:
      - members:
        - user:mike@example.com
        - group:admins@example.com
        - domain:google.com
        - serviceAccount:my-project-id@appspot.gserviceaccount.com
        role: roles/resourcemanager.organizationAdmin
      - members:
        - user:eve@example.com
        role: roles/resourcemanager.organizationViewer
        condition:
          title: expirable access
          description: Does not grant access after Sep 2020
          expression: request.time < timestamp('2020-10-01T00:00:00.000Z')
      etag: BwWWja0YfJA=
      version: 3
    For a description of IAM and its features, see the IAM documentation (https://cloud.google.com/iam/docs/).
  • A Dataproc job for running Presto (https://prestosql.io/) queries. IMPORTANT: The Dataproc Presto Optional Component (https://cloud.google.com/dataproc/docs/concepts/components/presto) must be enabled when the cluster is created to submit a Presto job to the cluster.
  • Creates new autoscaling policy.
  • Deletes an autoscaling policy. It is an error to delete an autoscaling policy that is in use by one or more clusters.
  • Retrieves autoscaling policy.
  • Gets the access control policy for a resource. Returns an empty policy if the resource exists and does not have a policy set.
  • Lists autoscaling policies in the project.
  • Sets the access control policy on the specified resource. Replaces any existing policy. Can return NOT_FOUND, INVALID_ARGUMENT, and PERMISSION_DENIED errors.
  • Returns permissions that a caller has on the specified resource. If the resource does not exist, this will return an empty set of permissions, not a NOT_FOUND error. Note: This operation is designed to be used for building permission-aware UIs and command-line tools, not for authorization checking. This operation may “fail open” without warning.
  • Updates (replaces) autoscaling policy. Disabled check for update_mask, because all updates will be full replacements.
  • Creates a batch workload that executes asynchronously.
  • Deletes the batch workload resource. If the batch is not in a CANCELLED, SUCCEEDED or FAILED State, the delete operation fails and the response returns FAILED_PRECONDITION.
  • Gets the batch workload resource representation.
  • Lists batch workloads.
  • Starts asynchronous cancellation on a long-running operation. The server makes a best effort to cancel the operation, but success is not guaranteed. If the server doesn’t support this method, it returns google.rpc.Code.UNIMPLEMENTED. Clients can use Operations.GetOperation or other methods to check whether the cancellation succeeded or whether the operation completed despite cancellation. On successful cancellation, the operation is not deleted; instead, it becomes an operation with an Operation.error value with a google.rpc.Status.code of 1, corresponding to Code.CANCELLED.
  • Deletes a long-running operation. This method indicates that the client is no longer interested in the operation result. It does not cancel the operation. If the server doesn’t support this method, it returns google.rpc.Code.UNIMPLEMENTED.
  • Gets the latest state of a long-running operation. Clients can use this method to poll the operation result at intervals as recommended by the API service.
  • Lists operations that match the specified filter in the request. If the server doesn’t support this method, it returns UNIMPLEMENTED.
  • Creates an interactive session asynchronously.
  • Deletes the interactive session resource. If the session is not in terminal state, it is terminated, and then deleted.
  • Gets the resource representation for an interactive session.
  • Lists interactive sessions.
  • Creates a session template synchronously.
  • Deletes a session template.
  • Gets the resource representation for a session template.
  • Lists session templates.
  • Updates the session template synchronously.
  • Terminates the interactive session.
  • Creates new workflow template.
  • Deletes a workflow template. It does not cancel in-progress workflows.
  • Retrieves the latest workflow template. Can retrieve previously instantiated template by specifying optional version parameter.
  • Gets the access control policy for a resource. Returns an empty policy if the resource exists and does not have a policy set.
  • Instantiates a template and begins execution. The returned Operation can be used to track execution of workflow by polling operations.get. The Operation will complete when entire workflow is finished. The running workflow can be aborted via operations.cancel. This will cause any inflight jobs to be cancelled and workflow-owned clusters to be deleted. The Operation.metadata will be WorkflowMetadata (https://cloud.google.com/dataproc/docs/reference/rpc/google.cloud.dataproc.v1#workflowmetadata). Also see Using WorkflowMetadata (https://cloud.google.com/dataproc/docs/concepts/workflows/debugging#using_workflowmetadata). On successful completion, Operation.response will be Empty.
  • Instantiates a template and begins execution. This method is equivalent to executing the sequence CreateWorkflowTemplate, InstantiateWorkflowTemplate, DeleteWorkflowTemplate. The returned Operation can be used to track execution of workflow by polling operations.get. The Operation will complete when entire workflow is finished. The running workflow can be aborted via operations.cancel. This will cause any inflight jobs to be cancelled and workflow-owned clusters to be deleted. The Operation.metadata will be WorkflowMetadata (https://cloud.google.com/dataproc/docs/reference/rpc/google.cloud.dataproc.v1#workflowmetadata). Also see Using WorkflowMetadata (https://cloud.google.com/dataproc/docs/concepts/workflows/debugging#using_workflowmetadata). On successful completion, Operation.response will be Empty.
  • Lists workflows that match the specified filter in the request.
  • Sets the access control policy on the specified resource. Replaces any existing policy. Can return NOT_FOUND, INVALID_ARGUMENT, and PERMISSION_DENIED errors.
  • Returns permissions that a caller has on the specified resource. If the resource does not exist, this will return an empty set of permissions, not a NOT_FOUND error. Note: This operation is designed to be used for building permission-aware UIs and command-line tools, not for authorization checking. This operation may “fail open” without warning.
  • Updates (replaces) workflow template. The updated template must contain a version that matches the current server version.
  • A builder providing access to all methods supported on project resources. It is not used directly, but through the Dataproc hub.
  • Creates new autoscaling policy.
  • Deletes an autoscaling policy. It is an error to delete an autoscaling policy that is in use by one or more clusters.
  • Retrieves autoscaling policy.
  • Gets the access control policy for a resource. Returns an empty policy if the resource exists and does not have a policy set.
  • Lists autoscaling policies in the project.
  • Sets the access control policy on the specified resource. Replaces any existing policy. Can return NOT_FOUND, INVALID_ARGUMENT, and PERMISSION_DENIED errors.
  • Returns permissions that a caller has on the specified resource. If the resource does not exist, this will return an empty set of permissions, not a NOT_FOUND error. Note: This operation is designed to be used for building permission-aware UIs and command-line tools, not for authorization checking. This operation may “fail open” without warning.
  • Updates (replaces) autoscaling policy. Disabled check for update_mask, because all updates will be full replacements.
  • Creates a cluster in a project. The returned Operation.metadata will be ClusterOperationMetadata (https://cloud.google.com/dataproc/docs/reference/rpc/google.cloud.dataproc.v1#clusteroperationmetadata).
  • Deletes a cluster in a project. The returned Operation.metadata will be ClusterOperationMetadata (https://cloud.google.com/dataproc/docs/reference/rpc/google.cloud.dataproc.v1#clusteroperationmetadata).
  • Gets cluster diagnostic information. The returned Operation.metadata will be ClusterOperationMetadata (https://cloud.google.com/dataproc/docs/reference/rpc/google.cloud.dataproc.v1#clusteroperationmetadata). After the operation completes, Operation.response contains DiagnoseClusterResults (https://cloud.google.com/dataproc/docs/reference/rpc/google.cloud.dataproc.v1#diagnoseclusterresults).
  • Gets the resource representation for a cluster in a project.
  • Gets the access control policy for a resource. Returns an empty policy if the resource exists and does not have a policy set.
  • Injects encrypted credentials into all of the VMs in a cluster. The target cluster must be a personal auth cluster assigned to the user who is issuing the RPC.
  • Lists all regions/{region}/clusters in a project alphabetically.
  • Creates a node group in a cluster. The returned Operation.metadata is NodeGroupOperationMetadata (https://cloud.google.com/dataproc/docs/reference/rpc/google.cloud.dataproc.v1#nodegroupoperationmetadata).
  • Gets the resource representation for a node group in a cluster.
  • Repairs nodes in a node group.
  • Resizes a node group in a cluster. The returned Operation.metadata is NodeGroupOperationMetadata (https://cloud.google.com/dataproc/docs/reference/rpc/google.cloud.dataproc.v1#nodegroupoperationmetadata).
  • Updates a cluster in a project. The returned Operation.metadata will be ClusterOperationMetadata (https://cloud.google.com/dataproc/docs/reference/rpc/google.cloud.dataproc.v1#clusteroperationmetadata). The cluster must be in a RUNNING state or an error is returned.
  • Repairs a cluster.
  • Sets the access control policy on the specified resource. Replaces any existing policy. Can return NOT_FOUND, INVALID_ARGUMENT, and PERMISSION_DENIED errors.
  • Starts a cluster in a project.
  • Stops a cluster in a project.
  • Returns permissions that a caller has on the specified resource. If the resource does not exist, this will return an empty set of permissions, not a NOT_FOUND error. Note: This operation is designed to be used for building permission-aware UIs and command-line tools, not for authorization checking. This operation may “fail open” without warning.
  • Starts a job cancellation request. To access the job resource after cancellation, call regions/{region}/jobs.list (https://cloud.google.com/dataproc/docs/reference/rest/v1/projects.regions.jobs/list) or regions/{region}/jobs.get (https://cloud.google.com/dataproc/docs/reference/rest/v1/projects.regions.jobs/get).
  • Deletes the job from the project. If the job is active, the delete fails, and the response returns FAILED_PRECONDITION.
  • Gets the resource representation for a job in a project.
  • Gets the access control policy for a resource. Returns an empty policy if the resource exists and does not have a policy set.
  • Lists regions/{region}/jobs in a project.
  • Updates a job in a project.
  • Sets the access control policy on the specified resource. Replaces any existing policy. Can return NOT_FOUND, INVALID_ARGUMENT, and PERMISSION_DENIED errors.
  • Submits a job to a cluster.
  • Submits a job to a cluster.
  • Returns permissions that a caller has on the specified resource. If the resource does not exist, this will return an empty set of permissions, not a NOT_FOUND error. Note: This operation is designed to be used for building permission-aware UIs and command-line tools, not for authorization checking. This operation may “fail open” without warning.
  • Starts asynchronous cancellation on a long-running operation. The server makes a best effort to cancel the operation, but success is not guaranteed. If the server doesn’t support this method, it returns google.rpc.Code.UNIMPLEMENTED. Clients can use Operations.GetOperation or other methods to check whether the cancellation succeeded or whether the operation completed despite cancellation. On successful cancellation, the operation is not deleted; instead, it becomes an operation with an Operation.error value with a google.rpc.Status.code of 1, corresponding to Code.CANCELLED.
  • Deletes a long-running operation. This method indicates that the client is no longer interested in the operation result. It does not cancel the operation. If the server doesn’t support this method, it returns google.rpc.Code.UNIMPLEMENTED.
  • Gets the latest state of a long-running operation. Clients can use this method to poll the operation result at intervals as recommended by the API service.
  • Gets the access control policy for a resource. Returns an empty policy if the resource exists and does not have a policy set.
  • Lists operations that match the specified filter in the request. If the server doesn’t support this method, it returns UNIMPLEMENTED.
  • Sets the access control policy on the specified resource. Replaces any existing policy. Can return NOT_FOUND, INVALID_ARGUMENT, and PERMISSION_DENIED errors.
  • Returns permissions that a caller has on the specified resource. If the resource does not exist, this will return an empty set of permissions, not a NOT_FOUND error. Note: This operation is designed to be used for building permission-aware UIs and command-line tools, not for authorization checking. This operation may “fail open” without warning.
  • Creates new workflow template.
  • Deletes a workflow template. It does not cancel in-progress workflows.
  • Retrieves the latest workflow template. Can retrieve previously instantiated template by specifying optional version parameter.
  • Gets the access control policy for a resource. Returns an empty policy if the resource exists and does not have a policy set.
  • Instantiates a template and begins execution. The returned Operation can be used to track execution of workflow by polling operations.get. The Operation will complete when entire workflow is finished. The running workflow can be aborted via operations.cancel. This will cause any inflight jobs to be cancelled and workflow-owned clusters to be deleted. The Operation.metadata will be WorkflowMetadata (https://cloud.google.com/dataproc/docs/reference/rpc/google.cloud.dataproc.v1#workflowmetadata). Also see Using WorkflowMetadata (https://cloud.google.com/dataproc/docs/concepts/workflows/debugging#using_workflowmetadata). On successful completion, Operation.response will be Empty.
  • Instantiates a template and begins execution. This method is equivalent to executing the sequence CreateWorkflowTemplate, InstantiateWorkflowTemplate, DeleteWorkflowTemplate. The returned Operation can be used to track execution of workflow by polling operations.get. The Operation will complete when entire workflow is finished. The running workflow can be aborted via operations.cancel. This will cause any inflight jobs to be cancelled and workflow-owned clusters to be deleted. The Operation.metadata will be WorkflowMetadata (https://cloud.google.com/dataproc/docs/reference/rpc/google.cloud.dataproc.v1#workflowmetadata). Also see Using WorkflowMetadata (https://cloud.google.com/dataproc/docs/concepts/workflows/debugging#using_workflowmetadata). On successful completion, Operation.response will be Empty.
  • Lists workflows that match the specified filter in the request.
  • Sets the access control policy on the specified resource. Replaces any existing policy. Can return NOT_FOUND, INVALID_ARGUMENT, and PERMISSION_DENIED errors.
  • Returns permissions that a caller has on the specified resource. If the resource does not exist, this will return an empty set of permissions, not a NOT_FOUND error. Note: This operation is designed to be used for building permission-aware UIs and command-line tools, not for authorization checking. This operation may “fail open” without warning.
  • Updates (replaces) workflow template. The updated template must contain a version that matches the current server version.
  • Configuration for PyPi repository.
  • A configuration for running an Apache PySpark (https://spark.apache.org/docs/latest/api/python/getting_started/quickstart.html) batch workload.
  • A Dataproc job for running Apache PySpark (https://spark.apache.org/docs/0.9.0/python-programming-guide.html) applications on YARN.
  • A list of queries to run on a cluster.
  • Validation based on regular expressions.
  • A request to repair a cluster.
  • There is no detailed description.
  • Configuration for dependency repositories.
  • Reservation Affinity for consuming Zonal reservation.
  • A request to resize a node group.
  • Runtime configuration for a workload.
  • Runtime information about workload execution.
  • Security related configuration, including encryption, Kerberos, etc.
  • A representation of a session.
  • Historical state information.
  • A representation of a session template.
  • Request message for SetIamPolicy method.
  • Shielded Instance Config for clusters using Compute Engine Shielded VMs (https://cloud.google.com/security/shielded-cloud/shielded-vm).
  • Specifies the selection and config of software inside the cluster.
  • A configuration for running an Apache Spark (https://spark.apache.org/) batch workload.
  • Spark History Server configuration for the workload.
  • A Dataproc job for running Apache Spark (https://spark.apache.org/) applications on YARN.
  • A configuration for running an Apache SparkR (https://spark.apache.org/docs/latest/sparkr.html) batch workload.
  • A Dataproc job for running Apache SparkR (https://spark.apache.org/docs/latest/sparkr.html) applications on YARN.
  • A configuration for running Apache Spark SQL (https://spark.apache.org/sql/) queries as a batch workload.
  • A Dataproc job for running Apache Spark SQL (https://spark.apache.org/sql/) queries.
  • Basic autoscaling configurations for Spark Standalone.
  • A request to start a cluster.
  • Configuration to handle the startup of instances during cluster create and update process.
  • Historical state information.
  • The Status type defines a logical error model that is suitable for different programming environments, including REST APIs and RPC APIs. It is used by gRPC (https://github.com/grpc). Each Status message contains three pieces of data: error code, error message, and error details. You can find out more about this error model and how to work with it in the API Design Guide (https://cloud.google.com/apis/design/errors).
  • A request to stop a cluster.
  • A request to submit a job.
  • A configurable parameter that replaces one or more fields in the template. Parameterizable fields:
    - Labels
    - File uris
    - Job properties
    - Job arguments
    - Script variables
    - Main class (in HadoopJob and SparkJob)
    - Zone (in ClusterSelector)
  • A request to terminate an interactive session.
  • Request message for TestIamPermissions method.
  • Response message for TestIamPermissions method.
  • A Dataproc job for running Trino (https://trino.io/) queries. IMPORTANT: The Dataproc Trino Optional Component (https://cloud.google.com/dataproc/docs/concepts/components/trino) must be enabled when the cluster is created to submit a Trino job to the cluster.
  • Usage metrics represent approximate total resources consumed by a workload.
  • The usage snapshot represents the resources consumed by a workload at a specified time.
  • Validation based on a list of allowed values.
  • The Dataproc cluster config for a cluster that does not directly control the underlying compute resources, such as a Dataproc-on-GKE cluster (https://cloud.google.com/dataproc/docs/guides/dpgke/dataproc-gke-overview).
  • A Dataproc workflow template resource.
  • Specifies workflow execution target. Either managed_cluster or cluster_selector is required.
  • A YARN application created by a job. Application information is a subset of org.apache.hadoop.yarn.proto.YarnProtos.ApplicationReportProto. Beta Feature: This report is available for testing purposes only. It may be changed before final release.
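
The time-interval entry in the list above states its matching rules only in prose: the start is inclusive, the end is exclusive, equal bounds give an empty interval, and an interval with both bounds unspecified matches any time. The following is a minimal sketch of those rules, not the crate's own type. `SimpleInterval`, its methods, and the use of plain `i64` seconds in place of a Timestamp are all hypothetical, and treating a single missing bound as open on that side is an assumption the description does not spell out.

```rust
/// Hypothetical stand-in for the interval described in the list above.
/// Timestamps are simplified to whole seconds; `None` means "unspecified".
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SimpleInterval {
    pub start: Option<i64>,
    pub end: Option<i64>,
}

impl SimpleInterval {
    /// The start must be less than or equal to the end.
    /// This can only be checked when both bounds are set.
    pub fn is_valid(&self) -> bool {
        match (self.start, self.end) {
            (Some(s), Some(e)) => s <= e,
            _ => true,
        }
    }

    /// When the start equals the end, the interval matches no time.
    pub fn is_empty(&self) -> bool {
        matches!((self.start, self.end), (Some(s), Some(e)) if s == e)
    }

    /// Start is inclusive, end is exclusive.
    /// Assumption: a missing bound leaves that side open, so an interval
    /// with both bounds unspecified matches any time.
    pub fn contains(&self, t: i64) -> bool {
        let after_start = self.start.map_or(true, |s| t >= s);
        let before_end = self.end.map_or(true, |e| t < e);
        after_start && before_end
    }
}
```

With `start: Some(10)` and `end: Some(20)`, `contains(10)` is true and `contains(20)` is false, which is the half-open behaviour the description calls for.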

Enums