Skip to main content

S3SourceConfig

Struct S3SourceConfig 

Source
pub struct S3SourceConfig {
    pub bucket: String,
    pub prefix: Option<String>,
    pub region: Option<String>,
    pub endpoint_url: Option<String>,
    pub file_format: S3FileFormat,
    pub max_objects: Option<usize>,
    pub concurrency: usize,
    pub batch_size: usize,
    pub compression: CompressionConfig,
}
Expand description

Configuration for the S3 source connector.

Fields§

§bucket: String

S3 bucket name.

§prefix: Option<String>

Object key prefix filter.

§region: Option<String>

AWS region. None uses the SDK default.

§endpoint_url: Option<String>

Custom endpoint URL for S3-compatible services (e.g. MinIO).

§file_format: S3FileFormat

Format of the files to read.

§max_objects: Option<usize>

Maximum number of objects to read.

§concurrency: usize

Maximum number of concurrent object reads (default: 10).

§batch_size: usize

Records per emitted StreamPage. For JsonLines and RawText formats, the object body is decoded line-by-line via tokio::io::AsyncBufReadExt and a page is yielded whenever the buffer reaches this size; multi-object scans flatten so a single page may contain lines from any object. For JsonArray, each object is buffered fully before its records are chunked into pages of this size (see the README “Streaming and batching” section for the caveat). Defaults to DEFAULT_BATCH_SIZE.

batch_size = 0 is the “no batching” sentinel: every page is one complete object — no within-object chunking. Useful for small lookup files, or for sinks (e.g. SQL COPY, BigQuery load jobs) that prefer one large request per file to many small ones.

§compression: CompressionConfig
Available on crate feature compression only.

Compression codec applied to each downloaded object. Defaults to CompressionConfig::Auto — the codec is resolved per-object-key, so a single source can read a mix of compressed and uncompressed objects. Requires the crate-local compression feature.

Implementations§

Source§

impl S3SourceConfig

Source

pub fn new(bucket: impl Into<String>) -> Self

Create a new config with the required bucket name and sensible defaults.

Source

pub fn prefix(self, prefix: impl Into<String>) -> Self

Set the object key prefix filter.

Source

pub fn region(self, region: impl Into<String>) -> Self

Set the AWS region.

Source

pub fn endpoint_url(self, url: impl Into<String>) -> Self

Set a custom endpoint URL for S3-compatible services.

Source

pub fn file_format(self, format: S3FileFormat) -> Self

Set the file format.

Source

pub fn max_objects(self, max: usize) -> Self

Set the maximum number of objects to read.

Source

pub fn concurrency(self, concurrency: usize) -> Self

Set the maximum number of concurrent object reads.

Source

pub fn with_batch_size(self, batch_size: usize) -> Self

Set the per-page record count for Source::stream_pages.

Pass 0 to opt out of within-object chunking — every emitted StreamPage corresponds to exactly one S3 object.

Source

pub fn compression(self, c: CompressionConfig) -> Self

Available on crate feature compression only.

Set the compression codec. Available only with the compression feature.

Trait Implementations§

Source§

impl Clone for S3SourceConfig

Source§

fn clone(&self) -> S3SourceConfig

Returns a duplicate of the value. Read more
1.0.0 (const: unstable) · Source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source. Read more
Source§

impl Debug for S3SourceConfig

Source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter. Read more
Source§

impl<'de> Deserialize<'de> for S3SourceConfig

Source§

fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where __D: Deserializer<'de>,

Deserialize this value from the given Serde deserializer. Read more
Source§

impl JsonSchema for S3SourceConfig

Source§

fn schema_name() -> Cow<'static, str>

The name of the generated JSON Schema. Read more
Source§

fn schema_id() -> Cow<'static, str>

Returns a string that uniquely identifies the schema produced by this type. Read more
Source§

fn json_schema(generator: &mut SchemaGenerator) -> Schema

Generates a JSON Schema for this type. Read more
Source§

fn inline_schema() -> bool

Whether JSON Schemas generated for this type should be included directly in parent schemas, rather than being re-used where possible using the $ref keyword. Read more
Source§

impl Serialize for S3SourceConfig

Source§

fn serialize<__S>(&self, __serializer: __S) -> Result<__S::Ok, __S::Error>
where __S: Serializer,

Serialize this value into the given Serde serializer. Read more

Auto Trait Implementations§

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more
Source§

impl<T> CloneToUninit for T
where T: Clone,

Source§

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest. Read more
Source§

impl<T> DynClone for T
where T: Clone,

Source§

fn __clone_box(&self, _: Private) -> *mut ()

Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T> Instrument for T

Source§

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper. Read more
Source§

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper. Read more
Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> IntoEither for T

Source§

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise. Read more
Source§

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise. Read more
Source§

impl<Unshared, Shared> IntoShared<Shared> for Unshared
where Shared: FromUnshared<Unshared>,

Source§

fn into_shared(self) -> Shared

Creates a shared type from an unshared type.
Source§

impl<T> PolicyExt for T
where T: ?Sized,

Source§

fn and<P, B, E>(self, other: P) -> And<T, P>
where T: Policy<B, E>, P: Policy<B, E>,

Create a new Policy that returns Action::Follow only if self and other return Action::Follow. Read more
Source§

fn or<P, B, E>(self, other: P) -> Or<T, P>
where T: Policy<B, E>, P: Policy<B, E>,

Create a new Policy that returns Action::Follow if either self or other returns Action::Follow. Read more
Source§

impl<T> Same for T

Source§

type Output = T

Should always be Self
Source§

impl<T> ToOwned for T
where T: Clone,

Source§

type Owned = T

The resulting type after obtaining ownership.
Source§

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning. Read more
Source§

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning. Read more
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
Source§

impl<T> WithSubscriber for T

Source§

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper. Read more
Source§

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper. Read more
Source§

impl<T> DeserializeOwned for T
where T: for<'de> Deserialize<'de>,