Crate jaded

Java Deserializer for Rust

Java has a much-maligned but still widely used serialization mechanism. The serial stream it produces follows the specification available from Oracle here (the link is to the Java 11 version, the latest LTS release at the time of writing, but the protocol has not changed since 1.7).

This library enables that serial stream to be read in Rust applications.

Example

Taking the example from the bottom of the linked protocol, the Java class in the listing below can be used to write a byte stream:

// example code from java 11 serialization protocol (@2020/01/20)
import java.io.ByteArrayOutputStream;
import java.io.ObjectOutputStream;

class List implements java.io.Serializable {
    int value;
    List next;
    public static void main(String[] args) {
        try {
            List list1 = new List();
            List list2 = new List();
            list1.value = 17;
            list1.next = list2;
            list2.value = 19;
            list2.next = null;

            ByteArrayOutputStream o = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(o);
            out.writeObject(list1);
            out.writeObject(list2);
            out.flush();
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}

As &[u8] implements std::io::Read, we can take the bytes listed in the protocol above as the output of the class and use them directly in our parser.

use jaded::{Parser, PrimitiveType};
let bytes: &[u8] = &[
    0xAC,0xED,0x00,0x05,0x73,0x72,0x00,0x04,0x4C,0x69,0x73,0x74,0x69,0xC8,0x8A,0x15,
    0x40,0x16,0xAE,0x68,0x02,0x00,0x02,0x49,0x00,0x05,0x76,0x61,0x6C,0x75,0x65,0x4C,
    0x00,0x04,0x6E,0x65,0x78,0x74,0x74,0x00,0x06,0x4C,0x4C,0x69,0x73,0x74,0x3B,0x78,
    0x70,0x00,0x00,0x00,0x11,0x73,0x71,0x00,0x7E,0x00,0x00,0x00,0x00,0x00,0x13,0x70,
    0x71,0x00,0x7E,0x00,0x03
];

// We can create the parser from anything that implements Read. If the stream
// is not a valid stream of serialized data, the `new` method will give an
// error.
let mut parser = Parser::new(bytes).expect("Bytes stream was not valid");

// read will return the next object in the stream. If the stream is not
// valid for any reason, the error will include the problem.
let content = parser.read().expect("Couldn't read object from stream");

// As we know what we're expecting, we can unwrap it directly. These
// methods would panic if the content read was not an object.
let list1 = content
        .value() // This is a value not raw block data
        .object_data(); // This value is an instance of an object rather
                        // than a class or an enum etc
assert_eq!(list1.class_name(), "List");
assert_eq!(list1.get_field("value").unwrap().primitive(), &PrimitiveType::Int(17));

let list1_next = list1.get_field("next").unwrap().object_data();
assert_eq!(list1_next.class_name(), "List");
assert_eq!(list1_next.get_field("value").unwrap().primitive(), &PrimitiveType::Int(19));
assert!(list1_next.get_field("next").unwrap().is_null());

let content = parser.read().expect("Couldn't read second object from stream");
// This is list2 again written in its own right rather than as a field of
// list1.
let list2 = content.value().object_data();
assert_eq!(list2, list1_next);
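
The first four bytes of the array above are the stream header defined by the specification: the magic number 0xACED followed by the protocol version 0x0005. As a minimal sketch of the kind of validation a parser can perform before reading any objects, the standalone function below checks that header using only the standard library. The name `has_valid_header` is hypothetical and this is not jaded's actual implementation.

```rust
use std::io::{self, Read};

/// Magic number that opens every Java serialization stream (STREAM_MAGIC).
const STREAM_MAGIC: u16 = 0xACED;
/// Protocol version written directly after the magic number (STREAM_VERSION).
const STREAM_VERSION: u16 = 5;

/// Reads the four-byte stream header and reports whether it matches the
/// values given in the specification.
/// Hypothetical helper for illustration only; not part of jaded's API.
fn has_valid_header<R: Read>(mut source: R) -> io::Result<bool> {
    let mut header = [0u8; 4];
    // Fails with UnexpectedEof if fewer than four bytes are available.
    source.read_exact(&mut header)?;
    // Both values are written in big-endian (network) byte order.
    let magic = u16::from_be_bytes([header[0], header[1]]);
    let version = u16::from_be_bytes([header[2], header[3]]);
    Ok(magic == STREAM_MAGIC && version == STREAM_VERSION)
}
```

Because the function is generic over Read, it accepts a byte slice such as the one in the example just as readily as a file or a network socket.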

Limitations

Ambiguous serialization

Unfortunately, there are limits to what we can do without the original code that created the serial byte stream. The protocol above lists four types of object. One of these, classes that implement java.io.Externalizable and use PROTOCOL_VERSION_1 (which has not been the default since v1.2), cannot be read by anything other than the class that wrote them, as their data is nothing more than a stream of bytes.

Of the remaining three types we can only reliably deserialize two.

  • 'Normal' classes that implement java.io.Serializable without having a writeObject method

    These can be read as shown above.

  • Classes that implement Externalizable and use the newer PROTOCOL_VERSION_2

    These can be read, although their data is held entirely in the annotations fields of the ObjectData struct and the get_field method always returns None.

  • Serializable classes that implement writeObject

    These objects are ambiguous. The spec above suggests that they have their fields written as 'normal' classes and then have optional annotations written afterwards. In practice this is not the case: the fields are only written if the class calls defaultWriteObject as the first call in its writeObject method. Many classes in the standard library do this (e.g. java.util.ArrayList) but, as others do not, we can't reliably determine how to interpret the rest of the stream. The downside is that once we have found a class that we can't read, it is difficult to get back on track, as doing so requires picking out the marker signifying the start of the next object from the sea of custom data.
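
The four types of object can be told apart by the flags byte that the protocol writes in each class descriptor, directly after the serialVersionUID. In the example stream above that byte is 0x02. The sketch below classifies a flags byte using the SC_* constants from the specification. The `ClassKind` enum and `classify` function are hypothetical names chosen for illustration and are not part of jaded's API.

```rust
// Class descriptor flag bits as defined by the serialization specification.
const SC_WRITE_METHOD: u8 = 0x01;
const SC_SERIALIZABLE: u8 = 0x02;
const SC_EXTERNALIZABLE: u8 = 0x04;
const SC_BLOCK_DATA: u8 = 0x08;
const SC_ENUM: u8 = 0x10;

/// The kinds of class discussed above (hypothetical type for illustration).
#[derive(Debug, PartialEq)]
enum ClassKind {
    /// Serializable without writeObject: fields can be read directly.
    PlainSerializable,
    /// Serializable with writeObject: the ambiguous case.
    CustomSerializable,
    /// Externalizable using PROTOCOL_VERSION_1: opaque bytes.
    ExternalizableV1,
    /// Externalizable using PROTOCOL_VERSION_2: data held in block data.
    ExternalizableV2,
    /// Enums and any other flag combination not covered here.
    Other,
}

/// Maps the flags byte of a class descriptor to one of the kinds above.
fn classify(flags: u8) -> ClassKind {
    let has = |bit: u8| (flags & bit) != 0;
    if has(SC_ENUM) {
        ClassKind::Other
    } else if has(SC_EXTERNALIZABLE) {
        if has(SC_BLOCK_DATA) {
            ClassKind::ExternalizableV2
        } else {
            ClassKind::ExternalizableV1
        }
    } else if has(SC_SERIALIZABLE) {
        if has(SC_WRITE_METHOD) {
            ClassKind::CustomSerializable
        } else {
            ClassKind::PlainSerializable
        }
    } else {
        ClassKind::Other
    }
}
```

Note that the flags only reveal that a writeObject method exists. They do not say whether it called defaultWriteObject, which is why the third case remains ambiguous.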

In the future, there will hopefully be a method to define how customised classes should be read so that, at least within an application where the expected class types are known beforehand, all classes can be read.

It may also be possible to 'guess' how classes were written by making some assumptions and hoping that custom data doesn't look like stream markers. This method would be unreliable, though, and as such will only ever be an opt-in process.
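
To illustrate why such guessing is unreliable, the sketch below scans a byte slice for positions that look like the start of an object: the TC_OBJECT marker (0x73) followed by either TC_CLASSDESC (0x72) or TC_REFERENCE (0x71). These marker values come from the specification, but the function name and the heuristic itself are assumptions made for this example, not something jaded implements. The markers are the ASCII codes for 's', 'r' and 'q', so ordinary text inside custom data can produce false positives.

```rust
/// TC_OBJECT: marks the start of a new object in the stream.
const TC_OBJECT: u8 = 0x73;
/// TC_CLASSDESC: a new class descriptor follows.
const TC_CLASSDESC: u8 = 0x72;
/// TC_REFERENCE: a back-reference to something written earlier.
const TC_REFERENCE: u8 = 0x71;

/// Returns every offset at which the data looks like the start of an
/// object, i.e. TC_OBJECT followed by a class descriptor or a reference.
/// Hypothetical heuristic for illustration; it cannot tell real markers
/// from custom data that happens to contain the same bytes.
fn candidate_object_starts(data: &[u8]) -> Vec<usize> {
    data.windows(2)
        .enumerate()
        .filter(|(_, pair)| {
            pair[0] == TC_OBJECT && (pair[1] == TC_CLASSDESC || pair[1] == TC_REFERENCE)
        })
        .map(|(offset, _)| offset)
        .collect()
}
```

Applied to the bytes of the string "/usr/lib", this reports a candidate at the 's' of "usr" even though no object starts there.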

Future plans

  • Deserialize to custom structs. At the moment the process of getting useful data out of a deserialized stream is awkward, and in most situations the data types would be known beforehand. Having something along the lines of a FromJava trait that would allow a readObject<T: FromJava>() method would make the process more straightforward.
  • Possible tie-in with Serde. I've not yet looked into how the Serde data model works but this seems like it would be a useful way of accessing Java data.
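
As a rough sketch of what the FromJava idea could look like, the code below converts a generic object representation into a typed Rust struct. Everything here is hypothetical: `JavaValue` and `JavaObject` are simplified stand-ins defined for this example rather than jaded's real types, and the trait signature is only one possible design.

```rust
use std::collections::HashMap;

/// Simplified stand-in for a deserialized field value (not jaded's type).
#[derive(Debug, Clone, PartialEq)]
enum JavaValue {
    Int(i32),
    Object(Box<JavaObject>),
    Null,
}

/// Simplified stand-in for a deserialized object (not jaded's type).
#[derive(Debug, Clone, PartialEq)]
struct JavaObject {
    class_name: String,
    fields: HashMap<String, JavaValue>,
}

/// One possible shape for the proposed trait.
trait FromJava: Sized {
    fn from_java(object: &JavaObject) -> Result<Self, String>;
}

/// Rust counterpart of the Java `List` class from the example.
#[derive(Debug, PartialEq)]
struct ListNode {
    value: i32,
    next: Option<Box<ListNode>>,
}

impl FromJava for ListNode {
    fn from_java(object: &JavaObject) -> Result<Self, String> {
        if object.class_name != "List" {
            return Err(format!("expected List, found {}", object.class_name));
        }
        let value = match object.fields.get("value") {
            Some(JavaValue::Int(v)) => *v,
            other => return Err(format!("bad 'value' field: {:?}", other)),
        };
        // A null reference maps to None; a nested object is converted
        // recursively.
        let next = match object.fields.get("next") {
            Some(JavaValue::Object(inner)) => Some(Box::new(ListNode::from_java(inner)?)),
            Some(JavaValue::Null) => None,
            other => return Err(format!("bad 'next' field: {:?}", other)),
        };
        Ok(ListNode { value, next })
    }
}
```

With a trait like this, the checks on class name and field types live in one place, and calling code receives either a fully typed value or a descriptive error.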

State of development

Very much a work in progress at the moment. I am writing this for another application I am working on so I imagine there will be many changes in the functionality and API at least in the short term as the requirements become apparent. As things settle down I hope things will become more stable.

Structs

ObjectData

Object data representing a serialized Java object

Parser

The main parser class

Enums

Content

The content read from a stream.

JavaError

Error for things that can go wrong with deserialization

PrimitiveType

Java's primitive value types in Rust form

Value

The possible values written by Java's serialization

Type Definitions

Result

Result type for all deserialization operations