Crate ciff

Library for converting between CIFF (Common Index File Format) and PISA's uncompressed collection format. Refer to osirrc/ciff on GitHub for more detailed information about the CIFF format.

For more information about PISA’s internal storage formats, see the PISA documentation.

Examples

Use the PisaToCiff and CiffToPisa builders to convert from one format to the other.

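use ciff::{CiffToPisa, PisaToCiff};

// `ciff_file`, `pisa_base_path`, and `output` are paths supplied by the caller.

// CIFF -> PISA: read the CIFF file and write the PISA files under the given base path.
CiffToPisa::default()
    .input_path(ciff_file)
    .output_paths(&pisa_base_path)
    .convert()?;

// PISA -> CIFF: read the PISA files back and write a CIFF file with a description.
PisaToCiff::default()
    .description("Hello, CIFF!")
    .pisa_paths(&pisa_base_path)
    .output_path(output)
    .convert()?;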

Structs

BinaryCollection
Represents a single binary collection.
BinarySequence
A single binary sequence.
CiffToPisa
CIFF to PISA converter.
DocRecord
InvalidFormat
Error raised when the bytes cannot be properly parsed into the collection format.
PayloadIter
Iterator over PayloadSlice.
PayloadSlice
Payload slice is a slice of variable-sized elements (payloads) encoded in a single block of memory. This way, sequences of, say, strings, can be indexed into without loading all the elements in memory, but rather using a memory mapped buffer.
PayloadVector
Owning variant of PayloadSlice, in which the underlying bytes are held fully in memory within the struct. This is useful mainly for building the structure before writing it to a file, but also when one decides to load all the bytes into memory and access elements without parsing the whole vector into a Vec.
PisaToCiff
PISA to CIFF converter.
Posting
PostingsList
RandomAccessBinaryCollection
A version of BinaryCollection with random access to sequences.
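
The idea behind PayloadSlice and PayloadVector (variable-sized elements packed into one block of memory and indexed without decoding all of them) can be sketched as follows. This is an illustration only: the byte layout (a little-endian u64 count, then count + 1 offsets, then the payload bytes) and the names build_block and SketchSlice are assumptions made for this example, not the layout or API used by this crate. The sketch also assumes a well-formed block and does not guard against arithmetic overflow on corrupted input.

```rust
/// Builds one contiguous block: element count, `count + 1` offsets, then the
/// payload bytes. All integers are little-endian `u64` (an assumption of this
/// sketch, not the crate's actual format).
pub fn build_block(items: &[&str]) -> Vec<u8> {
    let mut block = Vec::new();
    block.extend_from_slice(&(items.len() as u64).to_le_bytes());
    // Offsets are relative to the start of the payload area.
    let mut offset = 0u64;
    block.extend_from_slice(&offset.to_le_bytes());
    for item in items {
        offset += item.len() as u64;
        block.extend_from_slice(&offset.to_le_bytes());
    }
    for item in items {
        block.extend_from_slice(item.as_bytes());
    }
    block
}

/// Borrowed view over a block. It never copies or decodes payloads up front,
/// so `bytes` could just as well come from a memory-mapped file.
pub struct SketchSlice<'a> {
    bytes: &'a [u8],
}

impl<'a> SketchSlice<'a> {
    pub fn new(bytes: &'a [u8]) -> Self {
        Self { bytes }
    }

    /// Reads the little-endian `u64` stored at byte position `pos`.
    fn read_u64(&self, pos: usize) -> Option<u64> {
        let raw = self.bytes.get(pos..pos.checked_add(8)?)?;
        let mut buf = [0u8; 8];
        buf.copy_from_slice(raw);
        Some(u64::from_le_bytes(buf))
    }

    /// Number of payloads in the block (0 if the header is missing).
    pub fn len(&self) -> usize {
        self.read_u64(0).unwrap_or(0) as usize
    }

    /// Returns the bytes of the payload at `index` by reading two offsets only.
    pub fn get(&self, index: usize) -> Option<&'a [u8]> {
        let len = self.len();
        if index >= len {
            return None;
        }
        let start = self.read_u64(8 + 8 * index)? as usize;
        let end = self.read_u64(8 + 8 * (index + 1))? as usize;
        // The payload area begins after the count and the `len + 1` offsets.
        let payload_base = 8 + 8 * (len + 1);
        self.bytes.get(payload_base + start..payload_base + end)
    }
}
```

In this sketch, an owning variant would simply hold a Vec<u8> instead of a borrowed slice and hand out a SketchSlice over its contents.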

Functions

build_lexicon
Builds a lexicon using the text file at input and writes it to output.
ciff_to_pisa (Deprecated)
Converts a CIFF index stored in path to a PISA “binary collection” (uncompressed inverted index) with a basename output.
concat
Concatenate two OsStrings.
encode_u32_sequence
Encodes a sequence of 4-byte unsigned integers into writer in native-endianness.
pisa_to_ciff (Deprecated)
Converts a PISA “binary collection” (uncompressed inverted index) with a basename input to a CIFF index stored in output.
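
To make the description of encode_u32_sequence more concrete, here is a minimal standalone sketch of writing 4-byte unsigned integers to a writer in native endianness, together with a matching reader. It uses only the standard library. The function names write_u32_sequence and read_u32_sequence are hypothetical, and the length prefix written before the values is an assumption of this sketch; neither is taken from this crate's actual signatures.

```rust
use std::io::{self, Read, Write};

/// Writes the number of values as a `u32`, followed by each value, all in
/// native byte order. The length prefix is an assumption of this sketch.
pub fn write_u32_sequence<W: Write>(writer: &mut W, values: &[u32]) -> io::Result<()> {
    if values.len() > u32::MAX as usize {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "sequence too long for a u32 length prefix",
        ));
    }
    writer.write_all(&(values.len() as u32).to_ne_bytes())?;
    for value in values {
        writer.write_all(&value.to_ne_bytes())?;
    }
    Ok(())
}

/// Reads back one sequence produced by `write_u32_sequence`.
pub fn read_u32_sequence<R: Read>(reader: &mut R) -> io::Result<Vec<u32>> {
    let mut buf = [0u8; 4];
    reader.read_exact(&mut buf)?;
    let len = u32::from_ne_bytes(buf) as usize;
    let mut values = Vec::new();
    for _ in 0..len {
        reader.read_exact(&mut buf)?;
        values.push(u32::from_ne_bytes(buf));
    }
    Ok(values)
}
```

Because the bytes are written in native order, a file produced this way is only portable between machines that share the same endianness.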