Struct ListingTableUrl 

pub struct ListingTableUrl { /* private fields */ }

A parsed URL identifying files for a listing table. See ListingTableUrl::parse for more information on the supported expressions.

Implementations§

impl ListingTableUrl

pub fn parse(s: impl AsRef<str>) -> Result<Self>

Parse a provided string as a ListingTableUrl

A URL can either refer to a single object, or a collection of objects with a common prefix, with the presence of a trailing / indicating a collection.

For example, file:///foo.txt refers to the file at /foo.txt, whereas file:///foo/ refers to all the files under the directory /foo and its subdirectories.

Similarly, s3://BUCKET/blob.csv refers to blob.csv in the S3 bucket BUCKET, whereas s3://BUCKET/foo/ refers to all objects with the prefix foo/ in the S3 bucket BUCKET.
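
The trailing-slash rule can be illustrated with a small, self-contained sketch. The Target enum and the classify function are hypothetical names used only for illustration; they are not part of DataFusion, and the real parser does considerably more (scheme handling, local path resolution, glob detection).

```rust
/// Hypothetical illustration type, not part of DataFusion.
#[derive(Debug, PartialEq)]
enum Target<'a> {
    /// A single object, e.g. `file:///foo.txt`.
    Object(&'a str),
    /// Every object sharing a prefix, e.g. `file:///foo/`.
    Collection(&'a str),
}

/// Classifies a URL string using only the trailing `/` rule.
fn classify(url: &str) -> Target<'_> {
    if url.ends_with('/') {
        Target::Collection(url)
    } else {
        Target::Object(url)
    }
}
```

Under this rule, classify("file:///foo.txt") yields an Object and classify("s3://BUCKET/foo/") yields a Collection.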

§URL Encoding

URL paths are expected to be URL-encoded. That is, the URL for a file named bar%2Efoo would be file:///bar%252Efoo, as per the URL specification.

Note that some tools, such as the AWS CLI, take a different approach and interpret the URL path verbatim. For example, the object bar%2Efoo would be addressed as s3://BUCKET/bar%252Efoo using ListingTableUrl, but as s3://BUCKET/bar%2Efoo when using the aws-cli.
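
To make the encoding requirement concrete, the following sketch percent-encodes a single path segment using only the standard library. The encode_segment helper is a hypothetical name, and its escape set (everything except ASCII letters, digits and - . _ ~) is an assumption that is stricter than what URL parsers usually require; it is not the encoding routine DataFusion uses.

```rust
/// Percent-encodes one path segment (hypothetical helper, std only).
/// Every byte except ASCII letters, digits and `-`, `.`, `_`, `~` is escaped.
fn encode_segment(segment: &str) -> String {
    let mut out = String::with_capacity(segment.len());
    for byte in segment.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char)
            }
            // `%` itself becomes `%25`, so a literal `%2E` in a file name
            // is not mistaken for an already-encoded `.`.
            other => out.push_str(&format!("%{:02X}", other)),
        }
    }
    out
}
```

With this helper, the file name bar%2Efoo encodes to bar%252Efoo, matching the form that ListingTableUrl expects in the example above.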

§Paths without a Scheme

If no scheme is provided, or the string is an absolute filesystem path as determined by std::path::Path::is_absolute, the string will be interpreted as a path on the local filesystem using the operating system’s standard path delimiter, i.e. \ on Windows, / on Unix.

If the path contains any of '?', '*', '[', it will be considered a glob expression and resolved as described in the section below.

Otherwise, the path will be resolved to an absolute path based on the current working directory, and converted to a file URI.

If the path already exists in the local filesystem, this will be used to determine whether this ListingTableUrl refers to a collection or a single object; otherwise, the presence of a trailing path delimiter indicates a directory. To avoid ambiguity, users should always include a trailing / when referring to a directory.
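
The directory-detection rule for scheme-less paths can be sketched with the standard library alone. The refers_to_directory function is a hypothetical helper written for illustration under the rule stated above; it is not DataFusion's implementation and skips glob handling and the conversion to a file URI.

```rust
use std::path::{is_separator, Path};

/// Hypothetical helper: decides whether a scheme-less path names a directory.
fn refers_to_directory(s: &str) -> bool {
    let path = Path::new(s);
    if path.exists() {
        // An existing path is classified by what is actually on disk.
        path.is_dir()
    } else {
        // Otherwise fall back to the presence of a trailing path delimiter.
        s.chars().last().map_or(false, is_separator)
    }
}
```

Because the result for an existing path depends on the filesystem, the same string can be classified differently on different machines, which is why an explicit trailing delimiter is the safer choice.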

§Glob File Paths

If no scheme is provided, and the path contains a glob expression, it will be resolved as follows.

The string up to the first path segment containing a glob expression will be extracted, and resolved in the same manner as a normal scheme-less path above.

The remaining string will be interpreted as a glob::Pattern and used as a filter when listing files from object storage.
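
The split described above can be sketched as follows. The split_glob function is a hypothetical, std-only illustration: it assumes / as the only delimiter and returns the glob part as a plain string rather than a glob::Pattern, so it approximates the rule rather than reproducing DataFusion's code.

```rust
/// Splits a path at the first segment containing `?`, `*` or `[`.
/// Returns the leading base path and, if present, the glob remainder.
/// Hypothetical helper; assumes `/` is the path delimiter.
fn split_glob(path: &str) -> (&str, Option<&str>) {
    let mut offset = 0;
    // `split_inclusive` keeps each trailing `/`, so segment lengths add up
    // to byte offsets into the original string.
    for segment in path.split_inclusive('/') {
        if segment.chars().any(|c| matches!(c, '?' | '*' | '[')) {
            return (&path[..offset], Some(&path[offset..]));
        }
        offset += segment.len();
    }
    (path, None)
}
```

For example, /data/year=*/part-*.parquet splits into the base /data/ and the glob year=*/part-*.parquet, while /data/file.csv has no glob part.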

pub fn try_new(url: Url, glob: Option<Pattern>) -> Result<Self>

Creates a new ListingTableUrl from a url and optional glob expression

Self::parse supports glob expressions only for filesystem paths. However, some applications may want to support glob expressions for URLs with a scheme. Such an application can split the URL into a base URL and a glob expression and use this method to create a ListingTableUrl.

pub fn scheme(&self) -> &str

Returns the URL scheme

pub fn prefix(&self) -> &Path

Returns the URL path, not including any glob expression

If Self::is_collection, this is the listing prefix. Otherwise, this is the path to the object.

pub fn contains(&self, path: &Path, ignore_subdirectory: bool) -> bool

Returns true if path matches this ListingTableUrl

pub fn is_collection(&self) -> bool

Returns true if this ListingTableUrl refers to a collection of objects

pub fn file_extension(&self) -> Option<&str>

Returns the file extension of the last path segment if it exists

Examples:

use datafusion_datasource::ListingTableUrl;
let url = ListingTableUrl::parse("file:///foo/bar.csv").unwrap();
assert_eq!(url.file_extension(), Some("csv"));
let url = ListingTableUrl::parse("file:///foo/bar").unwrap();
assert_eq!(url.file_extension(), None);
let url = ListingTableUrl::parse("file:///foo/bar.").unwrap();
assert_eq!(url.file_extension(), None);
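
The behavior shown in these examples is consistent with the following std-only sketch. The extension_of function is a hypothetical stand-in written to match the three documented cases; the actual method may differ in edge cases not covered here.

```rust
/// Returns the extension of the last `/`-separated segment, if any.
/// Hypothetical helper that reproduces the documented examples only.
fn extension_of(url: &str) -> Option<&str> {
    // Take the text after the final `/` (or the whole string if none).
    let last_segment = url.rsplit('/').next()?;
    // Split at the final `.`; no `.` means no extension.
    let (_, ext) = last_segment.rsplit_once('.')?;
    // A trailing `.` leaves an empty extension, which counts as none.
    if ext.is_empty() {
        None
    } else {
        Some(ext)
    }
}
```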

pub fn strip_prefix<'a, 'b: 'a>(&'a self, path: &'b Path) -> Option<impl Iterator<Item = &'b str> + 'a>

Strips the prefix of this ListingTableUrl from the provided path, returning an iterator of the remaining path segments
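
As a rough illustration of segment-wise prefix stripping, the sketch below works on plain strings. The strip_prefix_segments function is a hypothetical helper; unlike the real method it takes the prefix as an argument and collects the remaining segments into a Vec instead of returning a lazy iterator.

```rust
/// Strips `prefix` from `path` segment by segment.
/// Returns the remaining segments, or `None` if `path` is not under `prefix`.
/// Hypothetical helper; assumes `/` is the delimiter.
fn strip_prefix_segments<'a>(prefix: &str, path: &'a str) -> Option<Vec<&'a str>> {
    let mut remaining = path.split('/').filter(|s| !s.is_empty());
    for expected in prefix.split('/').filter(|s| !s.is_empty()) {
        // Whole segments must match, so `data` is not a prefix of `database`.
        if remaining.next() != Some(expected) {
            return None;
        }
    }
    Some(remaining.collect())
}
```

The remaining segments are what a caller would typically inspect for partition directories such as year=2024.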

pub async fn list_prefixed_files<'a>(&'a self, ctx: &'a dyn Session, store: &'a dyn ObjectStore, prefix: Option<Path>, file_extension: &'a str) -> Result<BoxStream<'a, Result<ObjectMeta>>>

List all files identified by this ListingTableUrl for the provided file_extension, optionally filtering by a path prefix

pub async fn list_all_files<'a>(&'a self, ctx: &'a dyn Session, store: &'a dyn ObjectStore, file_extension: &'a str) -> Result<BoxStream<'a, Result<ObjectMeta>>>

List all files identified by this ListingTableUrl for the provided file_extension

pub fn as_str(&self) -> &str

Returns this ListingTableUrl as a string

pub fn object_store(&self) -> ObjectStoreUrl

Return the ObjectStoreUrl for this ListingTableUrl

pub fn is_folder(&self) -> bool

Returns true if the ListingTableUrl points to a folder

pub fn get_url(&self) -> &Url

Returns the Url for this ListingTableUrl

pub fn get_glob(&self) -> &Option<Pattern>

Returns the glob pattern for this ListingTableUrl, if any

pub fn with_glob(self, glob: &str) -> Result<Self>

Returns a copy of the current ListingTableUrl with the specified glob

pub fn with_table_ref(self, table_ref: TableReference) -> Self

Set the table reference for this ListingTableUrl

pub fn get_table_ref(&self) -> &Option<TableReference>

Return the table reference for this ListingTableUrl

Trait Implementations§

impl AsRef<Url> for ListingTableUrl

fn as_ref(&self) -> &Url

Converts this type into a shared reference of the (usually inferred) input type.

impl AsRef<str> for ListingTableUrl

fn as_ref(&self) -> &str

Converts this type into a shared reference of the (usually inferred) input type.

impl Clone for ListingTableUrl

fn clone(&self) -> ListingTableUrl

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for ListingTableUrl

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Display for ListingTableUrl

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Eq for ListingTableUrl

impl Hash for ListingTableUrl

fn hash<__H: Hasher>(&self, state: &mut __H)

Feeds this value into the given Hasher.

fn hash_slice<H>(data: &[Self], state: &mut H)
where H: Hasher, Self: Sized,

Feeds a slice of this type into the given Hasher.

impl PartialEq for ListingTableUrl

fn eq(&self, other: &ListingTableUrl) -> bool

Tests for self and other values to be equal, and is used by ==.

fn ne(&self, other: &Rhs) -> bool

Tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.

impl StructuralPartialEq for ListingTableUrl

Auto Trait Implementations§

Blanket Implementations§

impl<T> Allocation for T
where T: RefUnwindSafe + Send + Sync,

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬 This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.

impl<T> DynEq for T
where T: Eq + Any,

fn dyn_eq(&self, other: &(dyn Any + 'static)) -> bool

impl<T> DynHash for T
where T: Hash + Any,

fn dyn_hash(&self, state: &mut dyn Hasher)

impl<Q, K> Equivalent<K> for Q
where Q: Eq + ?Sized, K: Borrow<Q> + ?Sized,

fn equivalent(&self, key: &K) -> bool

Compare self to key and return true if they are equal.

impl<Q, K> Equivalent<K> for Q
where Q: Eq + ?Sized, K: Borrow<Q> + ?Sized,

fn equivalent(&self, key: &K) -> bool

Checks if this value is equivalent to the given key.

impl<Q, K> Equivalent<K> for Q
where Q: Eq + ?Sized, K: Borrow<Q> + ?Sized,

fn equivalent(&self, key: &K) -> bool

Checks if this value is equivalent to the given key.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T> ToString for T
where T: Display + ?Sized,

fn to_string(&self) -> String

Converts the given value to a String.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V