Enum dowser::Extension

pub enum Extension {
    Ext2(u16),
    Ext3(u32),
    Ext4(u32),
}
Available on Unix only.

Extension.

This enum can be used to efficiently check a file path’s extension case-insensitively against a hard-coded reference extension. It is likely overkill in most situations, but if you’re looking to optimize the filtering of large path lists, this can turn those painful nanosecond operations into pleasant picosecond ones!

The magic is largely down to storing values as u16 or u32 integers and comparing those (rather than byte slices or OsStr), and not messing around with the path Components iterator. (Note, this is done using the safe u*::from_le_bytes() methods rather than casting chicanery.)
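To make the integer trick concrete, here is a minimal standard-library sketch; it is not the crate’s actual code. The helper names pack2, pack3, pack4, and tail_matches are hypothetical, and the three-byte layout (leading period plus three letters in one u32) is an assumption inferred from the codegen values shown later on this page.

```rust
/// Pack a lowercase two-byte extension into a `u16` (hypothetical helper).
const fn pack2(ext: [u8; 2]) -> u16 {
    u16::from_le_bytes(ext)
}

/// Pack a lowercase three-byte extension into a `u32`.
/// Assumption: the leading period fills the spare byte, which is
/// consistent with the documented codegen value for "jpg".
const fn pack3(ext: [u8; 3]) -> u32 {
    u32::from_le_bytes([b'.', ext[0], ext[1], ext[2]])
}

/// Pack a lowercase four-byte extension into a `u32` (hypothetical helper).
const fn pack4(ext: [u8; 4]) -> u32 {
    u32::from_le_bytes(ext)
}

/// Lowercase the last four bytes of a path, pack them, and compare the
/// result to a packed reference with a single integer comparison.
/// This simplified check skips the separator validation a real
/// constructor would need.
fn tail_matches(path: &[u8], reference: u32) -> bool {
    match path {
        [.., a, b, c, d] => {
            let tail = [
                a.to_ascii_lowercase(),
                b.to_ascii_lowercase(),
                c.to_ascii_lowercase(),
                d.to_ascii_lowercase(),
            ];
            u32::from_le_bytes(tail) == reference
        }
        // Fewer than four bytes can never match.
        _ => false,
    }
}
```

With these helpers, tail_matches(b"/path/to/IMAGE.JPG", pack3(*b"jpg")) compares one u32 against another rather than walking two byte slices.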

At the moment, only extensions sized between 2-4 bytes are supported as those sizes are the most common and also translate perfectly to primitives, but larger values may be added in the future.

Reference Constructors.

A “reference” extension is one known to you ahead of time, i.e. what you’re looking for. These can be constructed using the constant Extension::new2, Extension::new3, and Extension::new4 methods.

Because these are “known” values, no logical validation is performed. If you do something silly like mix case or type them incorrectly, equality tests will fail. You’d only be hurting yourself!
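The pitfall can be illustrated with a small standard-library sketch. Both functions are hypothetical stand-ins rather than the crate’s API: one lowercases before packing, as a runtime constructor would, and the other packs the bytes exactly as given, as an unchecked reference constructor would.

```rust
/// Hypothetical stand-in for a runtime constructor: lowercase, then pack.
fn runtime_pack(ext: [u8; 2]) -> u16 {
    u16::from_le_bytes([
        ext[0].to_ascii_lowercase(),
        ext[1].to_ascii_lowercase(),
    ])
}

/// Hypothetical stand-in for an unchecked reference constructor:
/// the bytes are packed as-is, with no case fixing or validation.
const fn reference_pack(ext: [u8; 2]) -> u16 {
    u16::from_le_bytes(ext)
}
```

A runtime value built from "GZ" equals reference_pack(*b"gz"), but it can never equal reference_pack(*b"Gz"), because the uppercase byte survives in the reference.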

use dowser::Extension;

const EXT2: Extension = Extension::new2(*b"gz");
const EXT3: Extension = Extension::new3(*b"png");
const EXT4: Extension = Extension::new4(*b"html");

The main idea is you’ll pre-compute these values and compare unknown runtime values against them later.

Runtime Constructors.

A “runtime” extension, for lack of a better adjective, is a value you don’t know ahead of time, e.g. from a user-supplied path. These can be constructed using the Extension::try_from2, Extension::try_from3, and Extension::try_from4 methods, which accept any AsRef<Path> argument.

The method you choose should match the length you’re looking for. For example, if you’re hoping for a PNG, use Extension::try_from3.

use dowser::Extension;

const EXT3: Extension = Extension::new3(*b"png");
assert_eq!(Extension::try_from3("/path/to/IMAGE.PNG"), Some(EXT3));
assert_eq!(Extension::try_from3("/path/to/doc.html"), None);

Examples

To filter a list of image paths with the standard library — say, matching PNGs — you would do something like:

use std::os::unix::ffi::OsStrExt;
use std::path::PathBuf;

// Imagine this is much longer…
let paths = vec![PathBuf::from("/path/to/image.png")];

paths.iter()
    .filter(|p| p.extension()
        .map_or(false, |e| e.as_bytes().eq_ignore_ascii_case(b"png"))
    )
    .for_each(|p| todo!());

Using Extension instead, the same operation looks like:

use dowser::Extension;
use std::path::PathBuf;

// Imagine this is much longer…
let paths = vec![PathBuf::from("/path/to/image.png")];

// The reference extension.
const EXT: Extension = Extension::new3(*b"png");

paths.iter()
    .filter(|p| Extension::try_from3(p).map_or(false, |e| e == EXT))
    .for_each(|p| todo!());

Variants

Ext2(u16)

2-char Extension.

Like .gz.

Ext3(u32)

3-char Extension.

Like .png.

Ext4(u32)

4-char Extension.

Like .jpeg.

Implementations

New Unchecked (2).

Create a new Extension, unchecked, from two bytes, e.g. *b"gz". This should be lowercase and not include a period.

This method is intended for known values that you want to check unknown values against. Sanity-checking is traded for performance, but you’re only hurting yourself if you misuse it.

For compile-time generation, see Extension::codegen.

Examples
use dowser::Extension;
const MY_EXT: Extension = Extension::new2(*b"gz");

New Unchecked (3).

Create a new Extension, unchecked, from three bytes, e.g. *b"gif". This should be lowercase and not include a period.

This method is intended for known values that you want to check unknown values against. Sanity-checking is traded for performance, but you’re only hurting yourself if you misuse it.

For compile-time generation, see Extension::codegen.

Examples
use dowser::Extension;
const MY_EXT: Extension = Extension::new3(*b"gif");

New Unchecked (4).

Create a new Extension, unchecked, from four bytes, e.g. *b"html". This should be lowercase and not include a period.

This method is intended for known values that you want to check unknown values against. Sanity-checking is traded for performance, but you’re only hurting yourself if you misuse it.

For compile-time generation, see Extension::codegen.

Examples
use dowser::Extension;
const MY_EXT: Extension = Extension::new4(*b"html");

Try From Path (2).

This method is used to (try to) pull a 2-byte extension from a file path. This requires that the path be at least 4 bytes, with anything but a forward/backward slash at [len - 4] and a dot at [len - 3].

If successful, it will return an Extension::Ext2 that can be compared against your reference Extension. Casing will be fixed automatically.
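The positional checks described above can be sketched with the standard library alone. The function try_ext2 is hypothetical and returns a bare u16 rather than an Extension; it follows only the rules stated here and is not the crate’s implementation.

```rust
/// Hypothetical sketch of the two-byte checks: the path must be at least
/// four bytes, with a non-separator at [len - 4] and a period at [len - 3].
fn try_ext2(path: &[u8]) -> Option<u16> {
    match path {
        [.., before, b'.', a, b] if *before != b'/' && *before != b'\\' => {
            // Fix the casing, then pack both bytes into one integer.
            Some(u16::from_le_bytes([
                a.to_ascii_lowercase(),
                b.to_ascii_lowercase(),
            ]))
        }
        // Too short, no period in the right place, or a dotfile such as "/.gz".
        _ => None,
    }
}
```

Because the slice pattern only matches when all four trailing positions exist, the length check comes for free.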

Examples
use dowser::Extension;

const MY_EXT: Extension = Extension::new2(*b"gz");
assert_eq!(Extension::try_from2("/path/to/file.gz"), Some(MY_EXT));
assert_eq!(Extension::try_from2("/path/to/file.GZ"), Some(MY_EXT));

assert_eq!(Extension::try_from2("/path/to/file.png"), None);
assert_ne!(Extension::try_from2("/path/to/file.br"), Some(MY_EXT));

Try From Path (3).

This method is used to (try to) pull a 3-byte extension from a file path. This requires that the path be at least 5 bytes, with anything but a forward/backward slash at [len - 5] and a dot at [len - 4].

If successful, it will return an Extension::Ext3 that can be compared against your reference Extension. Casing will be fixed automatically.

Examples
use dowser::Extension;

const MY_EXT: Extension = Extension::new3(*b"png");
assert_eq!(Extension::try_from3("/path/to/file.png"), Some(MY_EXT));
assert_eq!(Extension::try_from3("/path/to/FILE.PNG"), Some(MY_EXT));

assert_eq!(Extension::try_from3("/path/to/file.html"), None);
assert_ne!(Extension::try_from3("/path/to/file.jpg"), Some(MY_EXT));

Try From Path (4).

This method is used to (try to) pull a 4-byte extension from a file path. This requires that the path be at least 6 bytes, with anything but a forward/backward slash at [len - 6] and a dot at [len - 5].

If successful, it will return an Extension::Ext4 that can be compared against your reference Extension. Casing will be fixed automatically.

Examples
use dowser::Extension;

const MY_EXT: Extension = Extension::new4(*b"html");
assert_eq!(Extension::try_from4("/path/to/file.html"), Some(MY_EXT));
assert_eq!(Extension::try_from4("/path/to/FILE.HTML"), Some(MY_EXT));

assert_eq!(Extension::try_from4("/path/to/file.png"), None);
assert_ne!(Extension::try_from4("/path/to/file.xhtm"), Some(MY_EXT));

Slice Extension.

This returns the file extension portion of a path as a byte slice, similar to std::path::Path::extension, but faster since it is dealing with straight bytes.

The extension is found by jumping to the last period, ensuring the byte before that period is not a path separator, and that there are one or more bytes after that period (none of which are path separators).

If the above are all good, a slice containing everything after that last period is returned.
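As a rough illustration of those steps, the following standard-library sketch re-creates the described logic. The names slice_ext_sketch and is_separator are hypothetical, and treating a period at the very start of the input as "no extension" is an assumption.

```rust
/// Forward and backward slashes both count as separators in this sketch.
fn is_separator(b: u8) -> bool {
    matches!(b, b'/' | b'\\')
}

/// Hypothetical re-creation of the described steps; not the crate's code.
fn slice_ext_sketch(path: &[u8]) -> Option<&[u8]> {
    // Jump to the last period.
    let dot = path.iter().rposition(|&b| b == b'.')?;

    // The byte before the period must exist (assumption) and must not be
    // a path separator, which rules out dotfiles like ".htaccess".
    let before = *path.get(dot.checked_sub(1)?)?;
    if is_separator(before) {
        return None;
    }

    // Everything after the period is the extension, provided it is
    // non-empty and contains no separators.
    let ext = &path[dot + 1..];
    if ext.is_empty() || ext.iter().copied().any(is_separator) {
        None
    } else {
        Some(ext)
    }
}
```

The separator scan over the candidate extension is what rejects a path like "/path.d/file", where the last period belongs to a directory name.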

Examples
use dowser::Extension;

// Uppercase in, uppercase out.
assert_eq!(
    Extension::slice_ext(b"/path/to/IMAGE.JPEG"),
    Some(&b"JPEG"[..])
);

// Lowercase in, lowercase out.
assert_eq!(
    Extension::slice_ext(b"/path/to/file.docx"),
    Some(&b"docx"[..])
);

// These are all bad, though:
assert_eq!(
    Extension::slice_ext(b"/path/to/.htaccess"),
    None
);
assert_eq!(
    Extension::slice_ext(b"/path/to/"),
    None
);
assert_eq!(
    Extension::slice_ext(b"/path/to/file."),
    None
);

Codegen Helper.

This compile-time method can be used in a build.rs script to generate a pre-computed Extension value of any supported length (2-4 bytes).

Unlike the unchecked Extension::new2, Extension::new3, and Extension::new4 constructors, this will automatically fix case and period inconsistencies, but ideally you should still pass it just the letters, in lowercase, because you have the power to do so. Haha.

Examples
use dowser::Extension;

// This is what it looks like.
assert_eq!(
    Extension::codegen(b"js"),
    "Extension::Ext2(29_546_u16)"
);
assert_eq!(
    Extension::codegen(b"jpg"),
    "Extension::Ext3(1_735_420_462_u32)"
);
assert_eq!(
    Extension::codegen(b"html"),
    "Extension::Ext4(1_819_112_552_u32)"
);

In a typical build.rs workflow, you’d be building up a string of other code around it, and saving it all to a file, like:

use dowser::Extension;
use std::fs::File;
use std::io::Write;
use std::path::PathBuf;

fn main() {
    let out = format!(
        "const MY_EXT: Extension = {};",
        Extension::codegen(b"jpg")
    );

    let out_path = PathBuf::from(std::env::var("OUT_DIR").unwrap())
        .join("compile-time-vars.rs");
    let mut f = File::create(out_path).unwrap();
    f.write_all(out.as_bytes()).unwrap();
    f.flush().unwrap();
}

Then in your main program, say lib.rs, you’d toss an include!() to that file to import the code as code, like:

use dowser::Extension;

include!(concat!(env!("OUT_DIR"), "/compile-time-vars.rs"));

Et voilà, you’ve saved yourself a nanosecond of runtime effort! Haha.

Panics

This will panic if the extension (minus punctuation) is not 2-4 bytes or contains whitespace or path separators.
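For readers curious how such a helper might normalize and validate its input, here is a standard-library sketch. Everything in it is hypothetical: codegen_sketch returns None where the real method is documented to panic, "period inconsistencies" is assumed to mean stray leading periods, and the three-byte layout (period plus letters) is inferred from the example values above.

```rust
/// Format an integer with underscores between groups of three digits.
fn group_digits(n: u32) -> String {
    let digits = n.to_string();
    let mut out = String::with_capacity(digits.len() + digits.len() / 3);
    for (i, c) in digits.chars().enumerate() {
        if i != 0 && (digits.len() - i) % 3 == 0 {
            out.push('_');
        }
        out.push(c);
    }
    out
}

/// Hypothetical codegen-style helper; returns `None` for invalid input
/// instead of panicking.
fn codegen_sketch(ext: &[u8]) -> Option<String> {
    // Assumption: strip any leading periods.
    let mut ext = ext;
    while let [b'.', rest @ ..] = ext {
        ext = rest;
    }

    // Reject whitespace and path separators.
    if ext.iter().any(|&b| b.is_ascii_whitespace() || b == b'/' || b == b'\\') {
        return None;
    }

    // Fix the casing, then pack according to length.
    let lower: Vec<u8> = ext.iter().map(u8::to_ascii_lowercase).collect();
    match lower.as_slice() {
        &[a, b] => Some(format!(
            "Extension::Ext2({}_u16)",
            group_digits(u32::from(u16::from_le_bytes([a, b])))
        )),
        &[a, b, c] => Some(format!(
            "Extension::Ext3({}_u32)",
            group_digits(u32::from_le_bytes([b'.', a, b, c]))
        )),
        &[a, b, c, d] => Some(format!(
            "Extension::Ext4({}_u32)",
            group_digits(u32::from_le_bytes([a, b, c, d]))
        )),
        // Anything outside 2-4 bytes is unsupported.
        _ => None,
    }
}
```

Returning an Option keeps the sketch easy to test; in a build.rs script, panicking on bad input is arguably the friendlier choice because it fails the build immediately.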

Trait Implementations

Path Equality.

When there’s just one extension and one path to check, you can compare them directly (extension first).

Examples
use dowser::Extension;

const MY_EXT: Extension = Extension::new4(*b"html");

assert_eq!(MY_EXT, "/path/to/index.html");
assert_ne!(MY_EXT, "/path/to/image.jpeg");