Transfer data between the Arrow memory format and JSON line-delimited records.
See the module-level documentation for the reader and writer for usage examples.
§Binary Data
As per RFC 7159, JSON cannot encode arbitrary binary data. A common way to work around this is to use a binary-to-text encoding scheme, such as base64, to encode the data before writing it and then decode it after reading.
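use std::io::Cursor;
use std::sync::Arc;
use arrow_array::{BinaryArray, RecordBatch, StringArray};
use arrow_array::cast::AsArray;
use arrow_cast::base64::{b64_decode, b64_encode, BASE64_STANDARD};
use arrow_json::{LineDelimitedWriter, ReaderBuilder};

// The data we want to write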
let input = BinaryArray::from(vec![b"\xDE\x00\xFF".as_ref()]);
// Base64 encode it to a string
let encoded: StringArray = b64_encode(&BASE64_STANDARD, &input);
// Write the StringArray to JSON
let batch = RecordBatch::try_from_iter([("col", Arc::new(encoded) as _)]).unwrap();
let mut buf = Vec::with_capacity(1024);
let mut writer = LineDelimitedWriter::new(&mut buf);
writer.write(&batch).unwrap();
writer.finish().unwrap();
// Read the JSON data
let cursor = Cursor::new(buf);
let mut reader = ReaderBuilder::new(batch.schema()).build(cursor).unwrap();
let batch = reader.next().unwrap().unwrap();
// Reverse the base64 encoding
let col: BinaryArray = batch.column(0).as_string::<i32>().clone().into();
let output = b64_decode(&BASE64_STANDARD, &col).unwrap();
assert_eq!(input, output);
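To make the round trip above more concrete, here is a minimal std-only sketch of what a base64 encode and decode does to the same three bytes. The helpers `b64_encode_bytes` and `b64_decode_str` are hypothetical names written for this illustration; they are not part of arrow_cast or arrow_json, and the decoder is not hardened against every malformed input.

```rust
// Hypothetical std-only helpers for illustration; not part of any arrow crate.
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Encode bytes using the standard base64 alphabet with `=` padding.
fn b64_encode_bytes(input: &[u8]) -> String {
    let mut out = String::with_capacity((input.len() + 2) / 3 * 4);
    for chunk in input.chunks(3) {
        // Pack up to three bytes into a 24-bit group, zero-filling the tail.
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;
        out.push(ALPHABET[(n >> 18) as usize & 63] as char);
        out.push(ALPHABET[(n >> 12) as usize & 63] as char);
        out.push(if chunk.len() > 1 { ALPHABET[(n >> 6) as usize & 63] as char } else { '=' });
        out.push(if chunk.len() > 2 { ALPHABET[n as usize & 63] as char } else { '=' });
    }
    out
}

/// Decode padded standard base64; returns `None` on input this sketch rejects.
fn b64_decode_str(input: &str) -> Option<Vec<u8>> {
    let bytes = input.as_bytes();
    if bytes.len() % 4 != 0 {
        return None;
    }
    let mut out = Vec::with_capacity(bytes.len() / 4 * 3);
    for chunk in bytes.chunks(4) {
        let mut n: u32 = 0;
        let mut pad = 0;
        for &c in chunk {
            n <<= 6;
            if c == b'=' {
                pad += 1;
            } else {
                if pad > 0 {
                    return None; // data after padding within a group
                }
                n |= ALPHABET.iter().position(|&a| a == c)? as u32;
            }
        }
        if pad > 2 {
            return None;
        }
        out.push((n >> 16) as u8);
        if pad < 2 {
            out.push((n >> 8) as u8);
        }
        if pad < 1 {
            out.push(n as u8);
        }
    }
    Some(out)
}
```

The bytes `\xDE\x00\xFF` are not valid UTF-8, so they cannot be placed in a JSON string directly, whereas their base64 form is plain ASCII and survives the write and read unchanged.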
Modules§
Structs§
- EncoderOptions - Configuration options for the JSON encoder.
- Reader - Reads JSON data with a known schema directly into arrow RecordBatch
- ReaderBuilder - A builder for Reader and Decoder
- Writer - A JSON writer which serializes RecordBatches to a stream of u8 encoded JSON objects.
- WriterBuilder - JSON writer builder.
Enums§
- StructMode - Specifies what is considered valid JSON when reading or writing RecordBatches or StructArrays.
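As a rough illustration of what "valid JSON" for a struct can mean, the std-only sketch below renders one row either as a JSON object keyed by field name or as a positional JSON list. It assumes these are the two shapes StructMode distinguishes; check the StructMode documentation for the actual variants. `RowShape` and `render_row` are hypothetical names, the values are assumed to be pre-serialized JSON, and field names are not escaped.

```rust
/// Hypothetical stand-in for the two row shapes; not the arrow_json enum.
#[derive(Clone, Copy, Debug, PartialEq)]
enum RowShape {
    /// Each row is a JSON object: {"a":1,"b":"x"}
    Object,
    /// Each row is a positional JSON list: [1,"x"]
    List,
}

/// Render one row from (field name, pre-serialized JSON value) pairs.
fn render_row(fields: &[(&str, &str)], shape: RowShape) -> String {
    match shape {
        RowShape::Object => {
            let body: Vec<String> = fields
                .iter()
                .map(|(k, v)| format!("\"{}\":{}", k, v))
                .collect();
            format!("{{{}}}", body.join(","))
        }
        RowShape::List => {
            // Field names are dropped; position alone identifies the column.
            let body: Vec<&str> = fields.iter().map(|(_, v)| *v).collect();
            format!("[{}]", body.join(","))
        }
    }
}
```

In the list shape the schema must supply the field names, since they no longer appear in the data itself.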
Traits§
- Encoder - A trait to format array values as JSON values
- EncoderFactory - A trait to create custom encoders for specific data types.
- JsonSerializable - Trait declaring any type that is serializable to JSON. This includes all primitive types (bool, i32, etc.).
Type Aliases§
- ArrayWriter - A JSON writer which serializes RecordBatches to JSON arrays.
- LineDelimitedWriter - A JSON writer which serializes RecordBatches to newline delimited JSON objects.
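The difference between the two writer aliases is only in how rows are framed. The std-only sketch below shows that framing on rows that are already serialized as JSON objects. `frame_line_delimited` and `frame_array` are hypothetical helpers, not arrow_json APIs, and the exact whitespace the real writers emit may differ.

```rust
/// Frame pre-serialized JSON objects as newline-delimited JSON:
/// one object per line, each terminated by '\n'.
fn frame_line_delimited(rows: &[&str]) -> String {
    let mut out = String::new();
    for row in rows {
        out.push_str(row);
        out.push('\n');
    }
    out
}

/// Frame pre-serialized JSON objects as a single JSON array.
fn frame_array(rows: &[&str]) -> String {
    format!("[{}]", rows.join(","))
}
```

Line-delimited output can be appended to and consumed one line at a time, while the array form is a single JSON document that a generic JSON parser can load in one call.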