De-identifies potentially sensitive info from a list of strings.
This method has limits on input size and output size.
Finds potentially sensitive info in a list of strings.
This method has limits on input size, processing time, and output size.
A builder providing access to all methods supported on content resources.
It is not used directly, but through the DLP hub.
Redacts potentially sensitive info from a list of strings.
This method has limits on input size, processing time, and output size.
Central instance to access all DLP-related resource activities.
Schedules a job to compute risk analysis metrics over content in a Google
Cloud Platform repository.
A builder providing access to all methods supported on dataSource resources.
It is not used directly, but through the DLP hub.
The request message for Operations.CancelOperation.
The response message for Operations.ListOperations.
This resource represents a long-running operation that is the result of a
network API call.
Request for creating a risk analysis operation.
An auxiliary table contains statistical information on the relative
frequency of different quasi-identifiers values. It has one or several
quasi-identifiers columns, and one column that indicates the relative
frequency of each quasi-identifier tuple.
If a tuple is present in the data but not in the auxiliary table, the
corresponding relative frequency is assumed to be zero (and thus, the
tuple is highly reidentifiable).
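The lookup rule described above can be sketched as follows. This is only an illustration: the dictionary layout, the function name relative_frequency, and the sample values are assumptions, not part of the API.

```python
from typing import Dict, Tuple

# Hypothetical in-memory auxiliary table:
# quasi-identifier tuple -> relative frequency.
AuxTable = Dict[Tuple[str, ...], float]


def relative_frequency(aux_table: AuxTable, quasi_ids: Tuple[str, ...]) -> float:
    """Return the tuple's relative frequency, or 0.0 when it is absent."""
    # An absent tuple is treated as having zero frequency, which marks it
    # as highly reidentifiable.
    return aux_table.get(quasi_ids, 0.0)


# Made-up sample data for illustration only.
aux = {("94110", "F"): 0.02, ("94110", "M"): 0.03}
```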
Options defining BigQuery table and row identifiers.
Message defining the location of a BigQuery table. A table is uniquely
identified by its project_id, dataset_id, and table_name. Within a query
a table is often referenced with a string in the format of
<project_id>:<dataset_id>.<table_id> or <project_id>.<dataset_id>.<table_id>.
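As an illustration of the two reference formats, the sketch below splits such a string into its three parts. The names TableRef and parse_table_ref are hypothetical, and the pattern is simplified: it assumes none of the three parts contains a colon or a dot.

```python
import re
from typing import NamedTuple


class TableRef(NamedTuple):
    project_id: str
    dataset_id: str
    table_id: str


# Accepts "<project_id>:<dataset_id>.<table_id>" and
# "<project_id>.<dataset_id>.<table_id>" (simplified assumption: no part
# contains ':' or '.').
_TABLE_REF = re.compile(
    r"^(?P<project>[^:.]+)[:.](?P<dataset>[^:.]+)\.(?P<table>[^:.]+)$"
)


def parse_table_ref(text: str) -> TableRef:
    """Split a table reference string into project, dataset, and table."""
    match = _TABLE_REF.match(text)
    if match is None:
        raise ValueError(f"not a table reference: {text!r}")
    return TableRef(match["project"], match["dataset"], match["table"])
```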
Buckets represented as ranges, along with replacement values. Ranges must
be non-overlapping.
Generalization function that buckets values based on ranges. The ranges and
replacement values are dynamically provided by the user for custom behavior,
such as 1-30 -> LOW, 31-65 -> MEDIUM, 66-100 -> HIGH.
This can be used on data of type: number, long, string, timestamp.
If the bound Value type differs from the type of data being transformed, we
will first attempt converting the type of the data to be transformed to match
the type of the bound before comparing.
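A minimal sketch of range bucketing, using the LOW/MEDIUM/HIGH example above. The function name bucketize and the tuple layout are hypothetical, and treating both bounds as inclusive is an assumption drawn from the example's notation, not a statement about the service.

```python
from typing import List, Optional, Tuple

# (lower bound, upper bound, replacement). Bounds are treated as inclusive
# here, which is an assumption based on the 1-30 / 31-65 / 66-100 example.
Bucket = Tuple[float, float, str]

BUCKETS: List[Bucket] = [(1, 30, "LOW"), (31, 65, "MEDIUM"), (66, 100, "HIGH")]


def bucketize(value, buckets: List[Bucket] = BUCKETS) -> Optional[str]:
    """Return the replacement for the bucket holding value, or None."""
    # Convert the data to the type of the bounds before comparing.
    number = float(value)
    for low, high, replacement in buckets:
        if low <= number <= high:
            return replacement
    return None  # The value falls outside every configured range.
```

Because the data is converted before comparison, bucketize("42") and bucketize(42) land in the same bucket.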
Compute numerical stats over an individual column, including
number of distinct values and value count distribution.
Info Type Category description.
Partially mask a string by replacing a given number of characters with a
fixed character. Masking can start from the beginning or end of the string.
This can be used on data of any type (numbers, longs, and so on) and when
de-identifying structured data we’ll attempt to preserve the original data’s
type. (This allows you to take a long like 123 and modify it to a string like
**3.)
Characters to skip when doing deidentification of a value. These will be left
alone and skipped.
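The masking and skipped-character behavior described above can be sketched as follows. The function and parameter names are hypothetical, and the choice that skipped characters do not count toward the number of masked characters is an assumption.

```python
from typing import Optional


def mask(value, masking_char: str = "*", number_to_mask: Optional[int] = None,
         reverse: bool = False, chars_to_skip: str = "") -> str:
    """Mask up to number_to_mask characters of value (all if None)."""
    chars = list(str(value))  # Non-string input such as 123 becomes "123".
    # Walk from the end of the string when reverse is set.
    indices = range(len(chars) - 1, -1, -1) if reverse else range(len(chars))
    remaining = len(chars) if number_to_mask is None else number_to_mask
    for i in indices:
        if remaining <= 0:
            break
        if chars[i] in chars_to_skip:
            continue  # Left alone; assumed not to count toward the total.
        chars[i] = masking_char
        remaining -= 1
    return "".join(chars)


print(mask(123, number_to_mask=2))  # prints **3
```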
Record key for a finding in a Cloud Storage file.
Options defining a file or a set of files (path ending with *) within
a Google Cloud Storage bucket.
A location in Cloud Storage.
Represents a color in the RGB color space.
The field type of value and field do not need to match to be
considered equal, but not all comparisons are possible.
There is no detailed description.
Container structure for the content to inspect.
Request for scheduling a scan of a data subset from a Google Cloud Platform
data repository.
Pseudonymization method that generates surrogates via cryptographic hashing.
Uses SHA-256.
Outputs a 32 byte digest as an uppercase hex string
(for example, 41D1567F7F99F1DC2A5FAB886DEE5BEE).
Currently, only string and integer values can be hashed.
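A rough sketch of the output format only: it hashes the value with plain SHA-256 from the Python standard library and prints the digest as uppercase hex. How the service combines the crypto key with the input is not described here, so the key is not modeled; the function name is hypothetical. Note that a 32 byte digest renders as 64 hex characters, which is longer than the 32-character sample shown above.

```python
import hashlib


def sha256_surrogate(value) -> str:
    """Return the SHA-256 digest of value as an uppercase hex string."""
    # Only string and integer values are accepted, mirroring the text above.
    if not isinstance(value, (str, int)):
        raise TypeError("only string and integer values can be hashed")
    data = str(value).encode("utf-8")  # Encoding choice is an assumption.
    return hashlib.sha256(data).hexdigest().upper()
```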
This is a data encryption key (DEK) (as opposed to
a key encryption key (KEK) stored by KMS).
When using KMS to wrap/unwrap DEKs, be sure to set an appropriate
IAM policy on the KMS CryptoKey (KEK) to ensure an attacker cannot
unwrap the data crypto key.
Replaces an identifier with a surrogate using FPE with the FFX
mode of operation.
The identifier must be representable by the US-ASCII character set.
For a given crypto key and context, the same identifier will be
replaced with the same surrogate.
Identifiers must be at least two characters long.
In the case that the identifier is the empty string, it will be skipped.
Custom information type provided by the user. Used to find domain-specific
sensitive information configurable to the data in question.
Record key for a finding in Cloud Datastore.
Options defining a data set within Google Cloud Datastore.
High level summary of deidentification.
The configuration that controls how the data will change.
Request to de-identify a list of items.
Results of de-identifying a list of items.
Custom information type based on a dictionary of words or phrases. This can
be used to match sensitive information specific to the data, such as a list
of employee IDs or job titles.
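Dictionary matching of this kind can be approximated locally as below. This is a sketch, not the service's matching logic: whole-word, case-insensitive matching and the function name find_dictionary_matches are assumptions.

```python
import re
from typing import List, Sequence, Tuple


def find_dictionary_matches(text: str,
                            words: Sequence[str]) -> List[Tuple[int, int, str]]:
    """Return sorted (start, end, matched_text) for each dictionary hit."""
    findings = []
    for word in words:
        # Whole-word, case-insensitive matching is an assumption.
        pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
        for match in pattern.finditer(text):
            findings.append((match.start(), match.end(), match.group(0)))
    return sorted(findings)
```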
An entity in a dataset is a field or set of fields that correspond to a
single person. For example, in medical records the EntityId might be
a patient identifier, or for financial records it might be an account
identifier. This message is used when generalizations or analysis must be
consistent across multiple rows pertaining to the same entity.
A collection of expressions
General identifier of a data field in a storage service.
The transformation to apply to the field.
Set of files to scan.
Container structure describing a single finding within a string or image.
Buckets values based on fixed size ranges. The Bucketing transformation can
provide all of this functionality, but requires more configuration. This
message is provided as a convenience to the user for simple bucketing
strategies.
The resulting value will be a hyphenated string of lower_bound-upper_bound.
This can be used on data of type: double, long.
If the bound Value type differs from the type of data being transformed, we
will first attempt converting the type of the data to be transformed to match
the type of the bound before comparing.
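A minimal sketch of fixed-size bucketing that produces the hyphenated lower-upper string. The parameter names, the half-open bucket boundaries, and returning None for out-of-range values are all assumptions made for illustration.

```python
import math
from typing import Optional


def fixed_size_bucket(value, lower_bound: float, upper_bound: float,
                      bucket_size: float) -> Optional[str]:
    """Return "start-end" for the bucket holding value, or None if outside."""
    number = float(value)  # Convert the data to the type of the bounds.
    if not lower_bound <= number < upper_bound:
        return None  # Out-of-range handling is an assumption.
    offset = math.floor((number - lower_bound) / bucket_size)
    start = lower_bound + offset * bucket_size
    end = min(start + bucket_size, upper_bound)
    # The "g" format drops a trailing ".0" so whole numbers print cleanly.
    return f"{start:g}-{end:g}"


print(fixed_size_bucket(23, 0, 100, 10))  # prints 20-30
```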
Bounding box encompassing detected text within an image.
Configuration for determining how redaction of images should occur.
Type of information detected by the API.
Description of the information type (infoType).
Max findings configuration per info type, per content item or long running
operation.
A transformation to apply to text that is identified as a specific
info_type.
A type of transformation that will scan unstructured text and
apply various PrimitiveTransformations to each finding, where the
transformation is applied to only values that were identified as a specific
info_type.
Configuration description of the scanning process.
When used with redactContent only info_types and min_likelihood are currently
used.
Request to search for potentially sensitive info in a list of items.
Results of inspecting a list of items.
All the findings for a single scanned item.
k-anonymity metric, used for analysis of reidentification risk.
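For orientation, the k-anonymity value of a small table can be computed locally as the size of its smallest group of rows sharing the same quasi-identifier values. The sketch below assumes rows are plain dictionaries; the function and column names are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Sequence


def k_anonymity(rows: List[Dict[str, str]], quasi_ids: Sequence[str]) -> int:
    """Return the size of the smallest equivalence class (0 if no rows)."""
    # Rows with identical quasi-identifier values form one equivalence class.
    classes = Counter(tuple(row[col] for col in quasi_ids) for row in rows)
    return min(classes.values()) if classes else 0
```

A result of k means every row is indistinguishable from at least k - 1 other rows on the chosen columns.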
Reidentifiability metric. This corresponds to a risk model similar to what
is called “journalist risk” in the literature, except the attack dataset is
statistically modeled instead of being perfectly known. This can be done
using publicly available data (like the US Census), or using a custom
statistical model (indicated as one or several BigQuery tables), or by
extrapolating from the distribution of values in the input dataset.
A unique identifier for a Datastore entity.
If a key’s partition ID or any of its path kinds or names are
reserved/read-only, the key is reserved/read-only.
A reserved/read-only key is forbidden in certain documented contexts.
A representation of a Datastore kind.
Include to use an existing data crypto key wrapped by KMS.
Authorization requires the following IAM permissions when sending a request
to perform a crypto transformation using a kms-wrapped crypto key:
dlp.kms.encrypt
l-diversity metric, used for analysis of reidentification risk.
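One common reading of l-diversity (distinct l-diversity) is the smallest number of distinct sensitive values found in any group of rows that share the same quasi-identifier values. The sketch below computes that reading locally; it is an assumption that this matches the variant the service reports, and all names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def l_diversity(rows: List[Dict[str, str]], quasi_ids: Sequence[str],
                sensitive: str) -> int:
    """Return the minimum count of distinct sensitive values per class."""
    values_by_class = defaultdict(set)
    for row in rows:
        key = tuple(row[col] for col in quasi_ids)
        values_by_class[key].add(row[sensitive])
    if not values_by_class:
        return 0
    return min(len(values) for values in values_by_class.values())
```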
Response to the ListInfoTypes request.
Response to the ListInspectFindings request.
Response for ListRootCategories request.
Specifies the location of a finding within its source item.
Compute numerical stats over an individual column, including
min, max, and quantiles.
Additional configuration for inspect long running operations.
Cloud repository for storing output.
Datastore partition ID.
A partition ID identifies a grouping of entities. The grouping is always
by project and namespace; however, the namespace ID may be empty.
A (kind, ID/name) pair used to construct a key path.
A rule for transforming a value.
Privacy metric to compute for reidentification risk analysis.
A representation of a Datastore property in a projection.
A reference to a property relative to the Datastore kind expressions.
A quasi-identifier column has a custom_tag, used to know which column
in the data corresponds to which column in the statistical model.
Generic half-open interval [start, end).
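A half-open interval includes its start and excludes its end. A small sketch, with a hypothetical class name and integer bounds assumed:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HalfOpenRange:
    start: int
    end: int

    def __contains__(self, index: int) -> bool:
        # start is included, end is excluded.
        return self.start <= index < self.end

    def __len__(self) -> int:
        return max(0, self.end - self.start)
```

With this convention, adjacent ranges such as [0, 5) and [5, 10) share no element, and the length is simply end - start.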
A condition for determining whether a transformation should be applied to
a field.
Message for a unique key indicating a record that contains a finding.
Configuration to suppress records whose suppression conditions evaluate to
true.
A type of transformation that is applied over structured data such as a
table.
Redact a given value. For example, if used with an InfoTypeTransformation
transforming PHONE_NUMBER, and input ‘My phone number is 206-555-0123’, the
output would be ‘My phone number is ’.
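The phone number example above can be reproduced locally with a regular expression standing in for the PHONE_NUMBER detector. The pattern and the function name are assumptions for illustration; the service's detection is not a simple regex.

```python
import re

# Stand-in for a PHONE_NUMBER detector (assumption): NNN-NNN-NNNN only.
PHONE_NUMBER = re.compile(r"\b\d{3}-\d{3}-\d{4}\b")


def redact(text: str, pattern: "re.Pattern[str]" = PHONE_NUMBER) -> str:
    """Remove every match of pattern from text, leaving the rest intact."""
    return pattern.sub("", text)


print(repr(redact("My phone number is 206-555-0123")))
# prints 'My phone number is '
```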
Request to search for potentially sensitive info in a list of items
and replace it with a default or provided content.
Results of redacting a list of items.
There is no detailed description.
Replace each input value with a given Value.
Replace each matching finding with the name of the info_type.
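A sketch of this replacement, again with a regular expression standing in for the detector. The DETECTORS table, the function name, and inserting the bare info type name without any surrounding delimiters are assumptions.

```python
import re

# Hypothetical detector table: info type name -> stand-in pattern.
DETECTORS = {
    "PHONE_NUMBER": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}


def replace_with_info_type(text: str) -> str:
    """Replace each finding with the name of its info type."""
    for name, pattern in DETECTORS.items():
        text = pattern.sub(name, text)
    return text


print(replace_with_info_type("My phone number is 206-555-0123"))
# prints My phone number is PHONE_NUMBER
```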
There is no detailed description.
Shared message indicating Cloud storage type.
A collection that informs the user the number of times a particular
TransformationResultCode and error details occurred.
Message for detecting output from deidentification transformations
such as CryptoReplaceFfxFpeConfig. These types of transformations are
those that perform pseudonymization, thereby producing a “surrogate” as
output. This should be used in conjunction with a field on the
transformation such as surrogate_info_type.
Structured content to inspect. Up to 50,000 Values per request allowed.
Location of a finding within a ContentItem.Table.
A column with a semantic tag attached.
For use with Date, Timestamp, and TimeOfDay, extract or preserve a
portion of the value.
Summary of a single transformation.
Use this to have a random data crypto key generated.
It will be discarded after the operation/request finishes.
Using raw keys is prone to security risks due to accidentally
leaking the key. Choose another type of key if possible.
Set of primitive values supported by the system.
Message defining a list of words or phrases to search for in the data.
A generic empty message that you can re-use to avoid defining duplicated
empty messages in your APIs. A typical example is to use it as the request
or the response type of an API method.
The Status type defines a logical error model that is suitable for different
programming environments, including REST APIs and RPC APIs. It is used by
gRPC.
Represents a whole calendar date, e.g. date of birth. The time of day and
time zone are either specified elsewhere or are not significant. The date
is relative to the Proleptic Gregorian Calendar. The day may be 0 to
represent a year and month where the day is not significant, e.g. credit card
expiration date. The year may be 0 to represent a month and day independent
of year, e.g. anniversary date. Related types are google.type.TimeOfDay
and google.protobuf.Timestamp.
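The zero-valued day and year conventions can be sketched as a small validating type. The class name PartialDate, the 1-12 month requirement, and the upper year limit of 9999 are assumptions; only the meaning of day 0 and year 0 comes from the description above.

```python
import calendar
from dataclasses import dataclass


@dataclass(frozen=True)
class PartialDate:
    year: int   # 0 means the year is not significant (e.g. an anniversary).
    month: int  # Assumed to be required here (1-12).
    day: int    # 0 means the day is not significant (e.g. card expiration).

    def __post_init__(self) -> None:
        if not 0 <= self.year <= 9999:
            raise ValueError("year must be 0 or in 1..9999")
        if not 1 <= self.month <= 12:
            raise ValueError("month must be in 1..12")
        # Without a year, validate against a leap year so that
        # February 29 remains a valid month/day pair.
        max_day = calendar.monthrange(self.year or 2000, self.month)[1]
        if not 0 <= self.day <= max_day:
            raise ValueError("day is out of range for the month")


expiration = PartialDate(year=2030, month=7, day=0)   # year and month only
anniversary = PartialDate(year=0, month=2, day=29)    # month and day only
```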
Represents a time of day. The date and time zone are either not significant
or are specified elsewhere. An API may choose to allow leap seconds. Related
types are google.type.Date and google.protobuf.Timestamp.
A builder providing access to all methods supported on inspect resources.
It is not used directly, but through the DLP hub.
Cancels an operation. Use inspect.operations.get to check whether the
cancellation succeeded or the operation completed despite cancellation.
Schedules a job scanning content in a Google Cloud Platform data
repository.
This method is not supported and the server returns UNIMPLEMENTED.
Gets the latest state of a long-running operation. Clients can use this
method to poll the operation result at intervals as recommended by the API
service.
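Polling of this kind can be sketched generically. Nothing below is the client library's API: get_operation is a hypothetical callable that returns the operation as a dictionary, and the "done" flag, interval, and attempt limit are assumptions.

```python
import time
from typing import Callable, Dict


def wait_for_operation(get_operation: Callable[[str], Dict],
                       name: str,
                       interval_seconds: float = 1.0,
                       max_attempts: int = 10,
                       sleep: Callable[[float], None] = time.sleep) -> Dict:
    """Poll get_operation(name) until the operation reports it is done."""
    for attempt in range(max_attempts):
        operation = get_operation(name)
        if operation.get("done"):  # Assumed completion flag.
            return operation
        if attempt + 1 < max_attempts:
            sleep(interval_seconds)  # Wait before the next poll.
    raise TimeoutError(f"{name} not done after {max_attempts} attempts")
```

Passing sleep as a parameter keeps the loop testable without real delays; a caller would normally leave the default in place.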
Fetches the list of long running operations.
Returns list of results for given inspect operation result set id.
A builder providing access to all methods supported on riskAnalysis resources.
It is not used directly, but through the DLP hub.
Cancels an operation. Use inspect.operations.get to check whether the
cancellation succeeded or the operation completed despite cancellation.
This method is not supported and the server returns UNIMPLEMENTED.
Gets the latest state of a long-running operation. Clients can use this
method to poll the operation result at intervals as recommended by the API
service.
Fetches the list of long running operations.
Returns sensitive information types for given category.
Returns the list of root categories of sensitive information.
A builder providing access to all methods supported on rootCategory resources.
It is not used directly, but through the DLP hub.