Struct GoogleCloudAiplatformV1InputDataConfig

pub struct GoogleCloudAiplatformV1InputDataConfig {
    pub timestamp_split: Option<GoogleCloudAiplatformV1TimestampSplit>,
    pub annotation_schema_uri: Option<String>,
    pub bigquery_destination: Option<GoogleCloudAiplatformV1BigQueryDestination>,
    pub fraction_split: Option<GoogleCloudAiplatformV1FractionSplit>,
    pub gcs_destination: Option<GoogleCloudAiplatformV1GcsDestination>,
    pub predefined_split: Option<GoogleCloudAiplatformV1PredefinedSplit>,
    pub saved_query_id: Option<String>,
    pub persist_ml_use_assignment: Option<bool>,
    pub stratified_split: Option<GoogleCloudAiplatformV1StratifiedSplit>,
    pub annotations_filter: Option<String>,
    pub filter_split: Option<GoogleCloudAiplatformV1FilterSplit>,
    pub dataset_id: Option<String>,
}

Specifies Vertex AI owned input data to be used for training, and possibly evaluating, the Model.

This type is not used in any activity; it is only used as part of another schema.

Fields§

§timestamp_split: Option<GoogleCloudAiplatformV1TimestampSplit>

Supported only for tabular Datasets. Split based on the timestamp of the input data pieces.

§annotation_schema_uri: Option<String>

Applicable only to custom training with Datasets that have DataItems and Annotations. Cloud Storage URI that points to a YAML file describing the annotation schema. The schema is defined as an OpenAPI 3.0.2 Schema Object. The schema files that can be used here are found in gs://google-cloud-aiplatform/schema/dataset/annotation/. Note that the chosen schema must be consistent with the metadata of the Dataset specified by dataset_id. Only Annotations that both match this schema and belong to DataItems not ignored by the split method are used, in the training, validation, or test role respectively, depending on the role of the DataItem they are on. When used in conjunction with annotations_filter, the Annotations used for training are filtered by both annotations_filter and annotation_schema_uri.

§bigquery_destination: Option<GoogleCloudAiplatformV1BigQueryDestination>

Only applicable to custom training with a tabular Dataset with a BigQuery source. The BigQuery project location where the training data is to be written. In the given project a new dataset is created with name dataset_<dataset-id>_<annotation-type>_<timestamp-of-training-call>, where the timestamp is in YYYY_MM_DDThh_mm_ss_sssZ format. All training input data is written into that dataset. In the dataset three tables are created: training, validation and test.

* AIP_DATA_FORMAT = “bigquery”
* AIP_TRAINING_DATA_URI = “bigquery_destination.dataset_<dataset-id>_<annotation-type>_<timestamp-of-training-call>.training”
* AIP_VALIDATION_DATA_URI = “bigquery_destination.dataset_<dataset-id>_<annotation-type>_<timestamp-of-training-call>.validation”
* AIP_TEST_DATA_URI = “bigquery_destination.dataset_<dataset-id>_<annotation-type>_<timestamp-of-training-call>.test”
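The naming rules above can be sketched as two small helpers. This is an illustration only, not part of the crate: the function names are hypothetical, the name is assumed to join the Dataset ID, an annotation type, and the training-call timestamp with underscores, and the sample values (including the bq://my-project destination) are placeholders.

```rust
/// Name of the dataset created in the destination project.
/// Assumption: a `dataset_` prefix followed by the Dataset ID, an annotation
/// type, and the training-call timestamp, joined with underscores.
fn bigquery_dataset_name(dataset_id: &str, annotation_type: &str, timestamp: &str) -> String {
    format!("dataset_{dataset_id}_{annotation_type}_{timestamp}")
}

/// Values of the AIP_* environment variables for a BigQuery destination.
/// `destination` stands for the configured bigquery_destination value.
fn bigquery_env_vars(destination: &str, dataset_name: &str) -> Vec<(&'static str, String)> {
    // One table per role is created inside the new dataset.
    let table = |role: &str| format!("{destination}.{dataset_name}.{role}");
    vec![
        ("AIP_DATA_FORMAT", "bigquery".to_string()),
        ("AIP_TRAINING_DATA_URI", table("training")),
        ("AIP_VALIDATION_DATA_URI", table("validation")),
        ("AIP_TEST_DATA_URI", table("test")),
    ]
}

fn main() {
    // Placeholder values; the timestamp follows YYYY_MM_DDThh_mm_ss_sssZ.
    let name = bigquery_dataset_name("1234", "tabular", "2024_01_31T12_00_00_000Z");
    for (key, value) in bigquery_env_vars("bq://my-project", &name) {
        println!("{key} = \"{value}\"");
    }
}
```

Training code would normally read these variables from its environment rather than compute them; the sketch only shows how the pieces fit together.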

§fraction_split: Option<GoogleCloudAiplatformV1FractionSplit>

Split based on fractions defining the size of each set.

§gcs_destination: Option<GoogleCloudAiplatformV1GcsDestination>

The Cloud Storage location where the training data is to be written. In the given directory a new directory is created with name dataset-<dataset-id>-<annotation-type>-<timestamp-of-training-call>, where the timestamp is in YYYY-MM-DDThh:mm:ss.sssZ ISO-8601 format. All training input data is written into that directory. The Vertex AI environment variables representing Cloud Storage data URIs use the Cloud Storage wildcard format to support sharded data, e.g. “gs://…/training-*.jsonl”.

* AIP_DATA_FORMAT = “jsonl” for non-tabular data, “csv” for tabular data
* AIP_TRAINING_DATA_URI = “gcs_destination/dataset-<dataset-id>-<annotation-type>-<timestamp-of-training-call>/training-*.${AIP_DATA_FORMAT}”
* AIP_VALIDATION_DATA_URI = “gcs_destination/dataset-<dataset-id>-<annotation-type>-<timestamp-of-training-call>/validation-*.${AIP_DATA_FORMAT}”
* AIP_TEST_DATA_URI = “gcs_destination/dataset-<dataset-id>-<annotation-type>-<timestamp-of-training-call>/test-*.${AIP_DATA_FORMAT}”
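As a rough sketch of the Cloud Storage layout described above, the helper below picks the data format and builds the wildcard URIs. The function names and sample values are hypothetical, and the `*` shard wildcard and the directory name passed in are assumptions based on the description rather than output captured from Vertex AI.

```rust
/// "csv" for tabular data, "jsonl" otherwise, as stated for AIP_DATA_FORMAT.
fn data_format(is_tabular: bool) -> &'static str {
    if is_tabular { "csv" } else { "jsonl" }
}

/// Values of the AIP_* environment variables for a Cloud Storage destination.
/// `destination` stands for the configured gcs_destination prefix and
/// `directory` for the timestamped directory created under it.
fn gcs_env_vars(destination: &str, directory: &str, is_tabular: bool) -> Vec<(&'static str, String)> {
    let ext = data_format(is_tabular);
    // Avoid a doubled slash when the prefix already ends with '/'.
    let base = destination.trim_end_matches('/');
    // Assumption: shards are matched with a `*` wildcard after the role name.
    let uri = |role: &str| format!("{base}/{directory}/{role}-*.{ext}");
    vec![
        ("AIP_DATA_FORMAT", ext.to_string()),
        ("AIP_TRAINING_DATA_URI", uri("training")),
        ("AIP_VALIDATION_DATA_URI", uri("validation")),
        ("AIP_TEST_DATA_URI", uri("test")),
    ]
}

fn main() {
    // Placeholder bucket and directory name.
    let dir = "dataset-1234-classification-2024-01-31T12:00:00.000Z";
    for (key, value) in gcs_env_vars("gs://my-bucket/output/", dir, false) {
        println!("{key} = \"{value}\"");
    }
}
```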

§predefined_split: Option<GoogleCloudAiplatformV1PredefinedSplit>

Supported only for tabular Datasets. Split based on a predefined key.

§saved_query_id: Option<String>

Only applicable to Datasets that have SavedQueries. The ID of a SavedQuery (annotation set) under the Dataset specified by dataset_id, used for filtering Annotations for training. Only Annotations that are associated with this SavedQuery are used in training. When used in conjunction with annotations_filter, the Annotations used for training are filtered by both saved_query_id and annotations_filter. Only one of saved_query_id and annotation_schema_uri should be specified, as both of them represent the same thing: the problem type.

§persist_ml_use_assignment: Option<bool>

Whether to persist the ML use assignment to data item system labels.

§stratified_split: Option<GoogleCloudAiplatformV1StratifiedSplit>

Supported only for tabular Datasets. Split based on the distribution of the specified column.

§annotations_filter: Option<String>

Applicable only to Datasets that have DataItems and Annotations. A filter on Annotations of the Dataset. Only Annotations that both match this filter and belong to DataItems not ignored by the split method are used, in the training, validation, or test role respectively, depending on the role of the DataItem they are on (for auto-assigned DataItems, that role is decided by Vertex AI). A filter with the same syntax as the one used in ListAnnotations may be used, but note that here it filters across all Annotations of the Dataset, not just within a single DataItem.

§filter_split: Option<GoogleCloudAiplatformV1FilterSplit>

Split based on the provided filters for each set.

§dataset_id: Option<String>

Required. The ID of the Dataset in the same Project and Location whose data will be used to train the Model. The Dataset must use a schema compatible with the Model being trained; what is compatible should be described in the used TrainingPipeline’s training_task_definition. For tabular Datasets, all their data is exported to training, to pick and choose from.
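Two constraints from the field descriptions (dataset_id is required, and only one of saved_query_id and annotation_schema_uri should be set) can be checked before a request is sent. The sketch below is a minimal illustration: InputDataConfig is a hypothetical stand-in that mirrors three fields of the real struct so the example compiles on its own, and validate and ConfigError are not part of the crate.

```rust
/// Hypothetical stand-in mirroring three fields of
/// GoogleCloudAiplatformV1InputDataConfig; like the real struct,
/// every field is optional and the type implements Default.
#[derive(Clone, Debug, Default)]
struct InputDataConfig {
    dataset_id: Option<String>,
    annotation_schema_uri: Option<String>,
    saved_query_id: Option<String>,
}

/// Hypothetical error type for the two checks below.
#[derive(Debug, PartialEq)]
enum ConfigError {
    MissingDatasetId,
    AmbiguousProblemType,
}

/// Client-side check of the documented constraints (illustrative only).
fn validate(cfg: &InputDataConfig) -> Result<(), ConfigError> {
    // dataset_id is documented as required; treat an empty string as missing.
    if cfg.dataset_id.as_deref().map_or(true, str::is_empty) {
        return Err(ConfigError::MissingDatasetId);
    }
    // Both fields express the problem type, so only one should be set.
    if cfg.saved_query_id.is_some() && cfg.annotation_schema_uri.is_some() {
        return Err(ConfigError::AmbiguousProblemType);
    }
    Ok(())
}

fn main() {
    // Struct update syntax leaves every other field as None.
    let cfg = InputDataConfig {
        dataset_id: Some("1234".to_string()),
        saved_query_id: Some("5678".to_string()),
        ..Default::default()
    };
    println!("{:?}", validate(&cfg));
}
```

Because the real struct also implements Default, the same struct update syntax should apply to it; the server remains the authority on which combinations are accepted.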

Trait Implementations§

impl Clone for GoogleCloudAiplatformV1InputDataConfig

fn clone(&self) -> GoogleCloudAiplatformV1InputDataConfig

fn clone_from(&mut self, source: &Self)

impl Debug for GoogleCloudAiplatformV1InputDataConfig

fn fmt(&self, f: &mut Formatter<'_>) -> Result

impl Default for GoogleCloudAiplatformV1InputDataConfig

fn default() -> GoogleCloudAiplatformV1InputDataConfig

impl<'de> Deserialize<'de> for GoogleCloudAiplatformV1InputDataConfig

fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where __D: Deserializer<'de>,

impl Serialize for GoogleCloudAiplatformV1InputDataConfig

fn serialize<__S>(&self, __serializer: __S) -> Result<__S::Ok, __S::Error>
where __S: Serializer,

impl Part for GoogleCloudAiplatformV1InputDataConfig

Auto Trait Implementations§

impl Freeze for GoogleCloudAiplatformV1InputDataConfig

impl RefUnwindSafe for GoogleCloudAiplatformV1InputDataConfig

impl Send for GoogleCloudAiplatformV1InputDataConfig

impl Sync for GoogleCloudAiplatformV1InputDataConfig

impl Unpin for GoogleCloudAiplatformV1InputDataConfig

impl UnwindSafe for GoogleCloudAiplatformV1InputDataConfig

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

impl<T> From<T> for T

fn from(t: T) -> T

impl<T> Instrument for T

fn instrument(self, span: Span) -> Instrumented<Self>

fn in_current_span(self) -> Instrumented<Self>

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

impl<T> ToOwned for T
where T: Clone,

type Owned = T

fn to_owned(&self) -> T

fn clone_into(&self, target: &mut T)

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

impl<T> WithSubscriber for T

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

fn with_current_subscriber(self) -> WithDispatch<Self>

impl<T> DeserializeOwned for T
where T: for<'de> Deserialize<'de>,

impl<T> ErasedDestructor for T
where T: 'static,
