pub struct S3SourceConfig {
    pub bucket: String,
    pub prefix: Option<String>,
    pub region: Option<String>,
    pub endpoint_url: Option<String>,
    pub file_format: S3FileFormat,
    pub max_objects: Option<usize>,
    pub concurrency: usize,
    pub batch_size: usize,
    pub compression: CompressionConfig,
}
Configuration for the S3 source connector.
Fields
bucket: String
S3 bucket name.
prefix: Option<String>
Object key prefix filter.
region: Option<String>
AWS region. None uses the SDK default.
endpoint_url: Option<String>
Custom endpoint URL for S3-compatible services (e.g. MinIO).
file_format: S3FileFormat
Format of the files to read.
max_objects: Option<usize>
Maximum number of objects to read.
concurrency: usize
Maximum number of concurrent object reads (default: 10).
batch_size: usize
Records per emitted StreamPage. For
JsonLines and RawText formats, the object body is decoded
line-by-line via tokio::io::AsyncBufReadExt and a page is yielded
whenever the buffer reaches this size; multi-object scans are flattened, so
a single page may contain lines from any object. For JsonArray,
each object is buffered fully before its records are chunked into
pages of this size (see the README “Streaming and batching” section
for the caveat). Defaults to DEFAULT_BATCH_SIZE.
batch_size = 0 is the “no batching” sentinel: every page is one
complete object, with no within-object chunking. This is useful for small
lookup files, or for sinks (e.g. SQL COPY, BigQuery load jobs)
that prefer one large request per file to many small ones.
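The chunking rule for a single, fully buffered object (the JsonArray path described above) can be sketched as follows. This is a minimal illustration, not the connector's implementation: `paginate` is a hypothetical helper, pages are modelled as plain `Vec`s rather than `StreamPage`, and returning no page for an empty object is an assumption.

```rust
/// Hypothetical helper: split one object's decoded records into pages.
/// `batch_size == 0` is treated as the "no batching" sentinel.
fn paginate<T: Clone>(records: &[T], batch_size: usize) -> Vec<Vec<T>> {
    if records.is_empty() {
        // Assumption: an empty object produces no page at all.
        return Vec::new();
    }
    if batch_size == 0 {
        // Sentinel: one page holding the complete object.
        return vec![records.to_vec()];
    }
    // Otherwise, cut the object into pages of at most `batch_size` records;
    // the last page may be shorter.
    records
        .chunks(batch_size)
        .map(|chunk| chunk.to_vec())
        .collect()
}
```

With five records, a batch size of 2 yields three pages (2, 2 and 1 records), while a batch size of 0 yields a single page of five. The line-oriented formats differ in that a page may mix lines from several objects, which this per-object sketch does not model.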
compression: CompressionConfig
Available on crate feature compression only.
Compression codec applied to each downloaded object. Defaults to
CompressionConfig::Auto:
the codec is resolved per object key, so a single source can read a
mix of compressed and uncompressed objects. Requires the
crate-local compression feature.
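To illustrate what per-key resolution could look like, here is a small sketch that picks a codec from the object key's suffix. Everything in it is an assumption for illustration: the `Codec` enum, the `codec_for_key` function, and the choice of `.gz` and `.zst` suffixes are hypothetical, and this page does not state how `CompressionConfig::Auto` actually inspects the key.

```rust
/// Hypothetical codec set; not the crate's `CompressionConfig`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Codec {
    None,
    Gzip,
    Zstd,
}

/// Hypothetical resolver: choose a codec from the object key's suffix.
/// Matching is case-insensitive; unknown suffixes mean "uncompressed".
fn codec_for_key(key: &str) -> Codec {
    let lower = key.to_ascii_lowercase();
    if lower.ends_with(".gz") {
        Codec::Gzip
    } else if lower.ends_with(".zst") {
        Codec::Zstd
    } else {
        Codec::None
    }
}
```

Because the decision is made per key, a prefix holding both `events.jsonl.gz` and `events.jsonl` can be read by one source without separate configurations.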
Implementations
impl S3SourceConfig
pub fn new(bucket: impl Into<String>) -> Self
Create a new config with the required bucket name and sensible defaults.
pub fn endpoint_url(self, url: impl Into<String>) -> Self
Set a custom endpoint URL for S3-compatible services.
pub fn file_format(self, format: S3FileFormat) -> Self
Set the file format.
pub fn max_objects(self, max: usize) -> Self
Set the maximum number of objects to read.
pub fn concurrency(self, concurrency: usize) -> Self
Set the maximum number of concurrent object reads.
pub fn with_batch_size(self, batch_size: usize) -> Self
Set the per-page record count for Source::stream_pages.
Pass 0 to opt out of within-object chunking — every emitted
StreamPage corresponds to exactly one
S3 object.
pub fn compression(self, c: CompressionConfig) -> Self
Available on crate feature compression only.
Set the compression codec.
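The methods above all take `self` by value and return `Self`, so they chain. The sketch below shows that pattern with stand-in types so that it compiles on its own; the struct, enum, and constant here are simplified copies written for illustration, not the crate's definitions. In particular, the `DEFAULT_BATCH_SIZE` value and the default `file_format` are assumed placeholders, and the bucket name, endpoint, and prefix are made up.

```rust
// Stand-in types for illustration only; not the crate's definitions.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum S3FileFormat {
    JsonLines,
    JsonArray,
    RawText,
}

/// Assumed placeholder; the real value is not stated on this page.
const DEFAULT_BATCH_SIZE: usize = 1_000;

#[derive(Debug, Clone)]
struct S3SourceConfig {
    bucket: String,
    prefix: Option<String>,
    endpoint_url: Option<String>,
    file_format: S3FileFormat,
    max_objects: Option<usize>,
    concurrency: usize,
    batch_size: usize,
}

impl S3SourceConfig {
    fn new(bucket: impl Into<String>) -> Self {
        Self {
            bucket: bucket.into(),
            prefix: None,
            endpoint_url: None,
            file_format: S3FileFormat::JsonLines, // assumed default
            max_objects: None,
            concurrency: 10, // documented default
            batch_size: DEFAULT_BATCH_SIZE,
        }
    }
    fn endpoint_url(mut self, url: impl Into<String>) -> Self {
        self.endpoint_url = Some(url.into());
        self
    }
    fn file_format(mut self, format: S3FileFormat) -> Self {
        self.file_format = format;
        self
    }
    fn max_objects(mut self, max: usize) -> Self {
        self.max_objects = Some(max);
        self
    }
    fn concurrency(mut self, concurrency: usize) -> Self {
        self.concurrency = concurrency;
        self
    }
    fn with_batch_size(mut self, batch_size: usize) -> Self {
        self.batch_size = batch_size;
        self
    }
}

/// Build a config for a MinIO-style endpoint, one page per object.
fn example_config() -> S3SourceConfig {
    let mut cfg = S3SourceConfig::new("my-bucket")
        .endpoint_url("http://localhost:9000")
        .file_format(S3FileFormat::JsonArray)
        .max_objects(100)
        .concurrency(4)
        .with_batch_size(0);
    // No builder method for `prefix` is listed, so set the field directly.
    cfg.prefix = Some("logs/2024/".to_string());
    cfg
}
```

Since this page lists no builder methods for `prefix` or `region`, the sketch assigns `prefix` through the field, which the real struct declares as `pub`.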
Trait Implementations
impl Clone for S3SourceConfig
fn clone(&self) -> S3SourceConfig
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for S3SourceConfig
impl<'de> Deserialize<'de> for S3SourceConfig
fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where
    __D: Deserializer<'de>,
impl JsonSchema for S3SourceConfig
fn schema_id() -> Cow<'static, str>
fn json_schema(generator: &mut SchemaGenerator) -> Schema
fn inline_schema() -> bool
Auto Trait Implementations
impl Freeze for S3SourceConfig
impl RefUnwindSafe for S3SourceConfig
impl Send for S3SourceConfig
impl Sync for S3SourceConfig
impl Unpin for S3SourceConfig
impl UnsafeUnpin for S3SourceConfig
impl UnwindSafe for S3SourceConfig
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left is true.
Converts self into a Right variant of Either<Self, Self>
otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self>
otherwise.