
Serialization mechanisms in Java and Kotlin

A little about the definitions of “serialization” and “deserialization”

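Serialization converts an object into a form that can be stored or transmitted, usually a sequence of bytes; deserialization restores an object from that form. These mechanisms are used in several typical situations: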

  • Storing objects in some storage. In this case, we serialize the object into a byte array, write it to storage, and then, after some time, when we need this object, we deserialize it from the byte array received from storage.

  • Passing an object between applications. In this case, one of our applications serializes the object and transfers the resulting byte array to another of our applications, which then deserializes it.

  • Obtaining an object representation of a request or generating a response. In this case, our application is only on one side and, accordingly, we need to either deserialize the request or serialize the response.
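To make the idea concrete, here is a minimal sketch of the round trip described above, written by hand with the standard DataOutputStream and DataInputStream classes. The Point class and the method names are hypothetical and exist only for this illustration; real mechanisms automate exactly this kind of work.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class ManualRoundTrip {
    // Hypothetical value class used only for this illustration.
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Serialization: turn the object's state into a byte array.
    static byte[] serialize(Point p) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(p.x);
            out.writeInt(p.y);
        }
        return bytes.toByteArray();
    }

    // Deserialization: rebuild an equivalent object from those bytes.
    static Point deserialize(byte[] data) throws IOException {
        try (DataInputStream in =
                     new DataInputStream(new ByteArrayInputStream(data))) {
            int x = in.readInt();
            int y = in.readInt();
            return new Point(x, y);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = serialize(new Point(3, 4));
        Point copy = deserialize(data);
        // prints "8 bytes, x=3, y=4"
        System.out.println(data.length + " bytes, x=" + copy.x + ", y=" + copy.y);
    }
}
```

The byte array could just as well be written to a file or sent over the network between the two calls.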

Serialized data can take different formats:

  • Internal format. This format is understood only by the implementation that produced it. Java's Serializable mechanism is a good example.

  • XML. A very broad format. Many “sub-formats” are built on top of it and implemented by various libraries.

  • JSON. The most popular format: it is supported by a wide range of programming languages, and objects map onto it almost unambiguously.

  • Avro. A binary format that is supported by many programming languages.

  • Protobuf. Another binary format that is supported by many programming languages.

An important property of a serialization mechanism is how well it tolerates the evolution of an object. Sometimes we need to deserialize an object that was serialized by a previous version of our application (we wrote the object to a file, updated the application, and then read the object back); this is backward compatibility. Or vice versa: an old version of our application must be able to deserialize data produced by a new version (only one part of our system has been updated, and it now sends data in the new format to a part that still runs the old version); this is forward compatibility.
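One generic way to get forward compatibility is to write every field as a (tag, length, payload) triple, so that a reader can skip fields it does not know. The sketch below is a toy format invented for this illustration, not the wire format of any real library; the tag numbers and method names are assumptions.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class TaggedFormat {
    static final int TAG_CITY = 1;
    static final int TAG_STREET = 2;
    static final int TAG_HOUSE = 3; // introduced by the "new" writer

    // New writer: emits three fields, each as (tag, length, payload).
    static byte[] writeNew(String city, String street, String house)
            throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            writeField(out, TAG_CITY, city);
            writeField(out, TAG_STREET, street);
            writeField(out, TAG_HOUSE, house);
        }
        return bytes.toByteArray();
    }

    private static void writeField(DataOutputStream out, int tag, String value)
            throws IOException {
        byte[] payload = value.getBytes(StandardCharsets.UTF_8);
        out.writeInt(tag);
        out.writeInt(payload.length);
        out.write(payload);
    }

    // Old reader: understands only city and street and skips everything else.
    static String readOld(byte[] data) throws IOException {
        String city = null;
        String street = null;
        try (DataInputStream in =
                     new DataInputStream(new ByteArrayInputStream(data))) {
            while (in.available() > 0) {
                int tag = in.readInt();
                byte[] payload = new byte[in.readInt()];
                in.readFully(payload);
                String value = new String(payload, StandardCharsets.UTF_8);
                if (tag == TAG_CITY) {
                    city = value;
                } else if (tag == TAG_STREET) {
                    street = value;
                }
                // Any other tag is unknown to this version and is ignored.
            }
        }
        return city + ", " + street;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = writeNew("N", "Basseynaya", "12");
        System.out.println(readOld(data)); // prints "N, Basseynaya"
    }
}
```

Because the length travels with every field, the old reader never needs to understand the payload of a field it skips.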

Serialization mechanisms provide this property in different ways. Let's look at a few options.

Standard

Java has a standard serialization mechanism. Its disadvantages are that the data can only be read from Java, and that the serialized classes must be on the reader's classpath.

import java.io.Serializable;

public class Address implements Serializable {
    private final int countryCode;
    private final String city;
    private final String street;

    public Address(int countryCode, String city, String street) {
        this.countryCode = countryCode;
        this.city = city;
        this.street = street;
    }

    @Override
    public String toString() {
        return "[Address " +
                "countryCode=" + countryCode +
                ", city='" + city + '\'' +
                ", street='" + street + '\'' +
                ']';
    }
}
import java.io.Serializable;

public class Person implements Serializable {
    private final String firstName;
    private final String lastName;
    private final Address address;

    public Person(String firstName, String lastName, Address address) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.address = address;
    }

    @Override
    public String toString() {
        return "[Person " +
                "firstName='" + firstName + '\'' +
                ", lastName='" + lastName + '\'' +
                ", address=" + address +
                ']';
    }
}
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class Main {
    public static void main(String[] args) throws Throwable {
        Path path = Paths.get("vasya.dat");
        try (ObjectOutputStream oos = new ObjectOutputStream(
                Files.newOutputStream(path))) {
            Person person = new Person("Вася", "Пупкин",
                    new Address(7, "Н", "Бассейная"));
            oos.writeObject(person);
        }

        try (ObjectInputStream ois = new ObjectInputStream(
                Files.newInputStream(path))) {
            Person read = (Person) ois.readObject();
            System.out.printf("Read person: %s\n", read);
        }
    }
}

Notice how convenient this is: we didn't have to do anything extra. The JVM wrote all the fields of the objects for us and then read them back.

If we change the classes, for example by adding a house number to the address, reading the old file will throw java.io.InvalidClassException. Let's try to avoid this.
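The exception comes from the serialVersionUID check: the stream stores the UID of the class that was written, and when a class does not declare one, the JVM derives it from the class structure, so adding a field changes it. The small sketch below uses the standard ObjectStreamClass API to show both cases; the Implicit and Explicit classes are hypothetical.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

class SerialVersionDemo {
    // No explicit serialVersionUID: the JVM derives one from the class
    // structure, so it changes when the fields change.
    static class Implicit implements Serializable {
        int countryCode;
    }

    // Explicit serialVersionUID: stays the same even after fields are added.
    static class Explicit implements Serializable {
        private static final long serialVersionUID = 1L;
        int countryCode;
        int houseNumber;
    }

    // Returns the UID that serialization would write for this class.
    static long uidOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("Implicit: " + uidOf(Implicit.class));
        System.out.println("Explicit: " + uidOf(Explicit.class));
    }
}
```

Declaring the UID ourselves removes the automatic mismatch, but then it is up to us to read old data correctly.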

 

Let's write our own methods for writing and reading. We will write the version of the class to the stream and, when reading, use it to determine which fields are present. This way we know about all past versions and can read each of them appropriately, which gives us backward compatibility. We do not implement forward compatibility here: with this mechanism it is not a trivial task.

import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class Address implements Serializable {
    // we set the value ourselves so that the JVM does not generate it
    private static final long serialVersionUID = -4554333115192365232L;
    private static final int VER = 2;

    private int countryCode;
    private String city;
    private String street;
    private int houseNumber;

    public Address(int countryCode, String city, String street,
            int houseNumber) {
        this.countryCode = countryCode;
        this.city = city;
        this.street = street;
        this.houseNumber = houseNumber;
    }

    private void writeObject(ObjectOutputStream oos) throws IOException {
        oos.writeInt(VER);
        oos.writeInt(countryCode);
        oos.writeUTF(city);
        oos.writeUTF(street);
        oos.writeInt(houseNumber);
    }

    private void readObject(ObjectInputStream ois) throws IOException {
        int ver = ois.readInt();
        if (ver == 1) {
            countryCode = ois.readInt();
            city = ois.readUTF();
            street = ois.readUTF();
            houseNumber = 0;
        } else if (ver == 2) {
            countryCode = ois.readInt();
            city = ois.readUTF();
            street = ois.readUTF();
            houseNumber = ois.readInt();
        } else {
            throw new IOException("Unknown version: " + ver);
        }
    }

    @Override
    public String toString() {
        return "[Address " +
                "countryCode=" + countryCode +
                ", city='" + city + '\'' +
                ", street='" + street + '\'' +
                ", houseNumber=" + houseNumber +
                ']';
    }
}
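To check that the versioned reader really accepts the old layout, we can run a round trip in memory. The sketch below is a condensed variant of the class above, not the article's exact code: the writeVersion switch and the roundTrip helper are demo-only assumptions that let a single class emit both layouts.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class VersionedRoundTrip {
    static class Address implements Serializable {
        private static final long serialVersionUID = 1L;
        // Demo-only switch: lets this one class emit the old (v1) layout too.
        static int writeVersion = 2;

        int countryCode;
        String city;
        String street;
        int houseNumber;

        Address(int countryCode, String city, String street, int houseNumber) {
            this.countryCode = countryCode;
            this.city = city;
            this.street = street;
            this.houseNumber = houseNumber;
        }

        private void writeObject(ObjectOutputStream oos) throws IOException {
            oos.writeInt(writeVersion);
            oos.writeInt(countryCode);
            oos.writeUTF(city);
            oos.writeUTF(street);
            if (writeVersion >= 2) {
                oos.writeInt(houseNumber); // field added in version 2
            }
        }

        private void readObject(ObjectInputStream ois) throws IOException {
            int ver = ois.readInt();
            if (ver != 1 && ver != 2) {
                throw new IOException("Unknown version: " + ver);
            }
            countryCode = ois.readInt();
            city = ois.readUTF();
            street = ois.readUTF();
            // Version 1 data has no house number, so fall back to 0.
            houseNumber = (ver >= 2) ? ois.readInt() : 0;
        }
    }

    // Writes the address using the given layout version and reads it back.
    static Address roundTrip(Address a, int version)
            throws IOException, ClassNotFoundException {
        Address.writeVersion = version;
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(a);
        }
        try (ObjectInputStream ois = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Address) ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Address a = new Address(7, "N", "Basseynaya", 12);
        System.out.println(roundTrip(a, 1).houseNumber); // prints 0
        System.out.println(roundTrip(a, 2).houseNumber); // prints 12
    }
}
```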

External Libraries

Now let's talk about a few libraries that allow you to serialize objects in a more flexible way than the standard mechanism.

FasterXML Jackson

This library was originally created for serialization to JSON, but support for other formats was added later. The developers maintain extensions for many popular formats.

Jackson JSON

Let's add default constructors and getters to our classes as required by the library.

public class Address {
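    // not final: the no-argument constructor below leaves the fields
    // unset, and Jackson fills them in during deserialization
    private int countryCode;
    private String city;
    private String street;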

    public Address(int countryCode, String city, String street) {
        this.countryCode = countryCode;
        this.city = city;
        this.street = street;
    }

    public Address() {
    }

    public int getCountryCode() {
        return countryCode;
    }

    public String getCity() {
        return city;
    }

    public String getStreet() {
        return street;
    }

    @Override
    public String toString() {
        return "[Address " +
                "countryCode=" + countryCode +
                ", city='" + city + '\'' +
                ", street='" + street + '\'' +
                ']';
    }
}
public class Person {
    // not final, for the same reason as in Address
    private String firstName;
    private String lastName;
    private Address address;

    public Person(String firstName, String lastName, Address address) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.address = address;
    }

    public Person() {
    }

    public String getFirstName() {
        return firstName;
    }

    public String getLastName() {
        return lastName;
    }

    public Address getAddress() {
        return address;
    }

    @Override
    public String toString() {
        return "[Person " +
                "firstName='" + firstName + '\'' +
                ", lastName='" + lastName + '\'' +
                ", address=" + address +
                ']';
    }
}
import com.fasterxml.jackson.databind.ObjectMapper;

public class Main {
    public static void main(String[] args) throws Throwable {
        ObjectMapper om = new ObjectMapper();
        Person person = new Person("Вася", "Пупкин",
                new Address(7, "Н", "Бассейная"));

        String json = om.writeValueAsString(person);

        Person read = om.readValue(json, Person.class);
        System.out.printf("Read person: %s\n", read);
    }
}

We get this string:

{"firstName":"Вася","lastName":"Пупкин","address":{"countryCode":7,"city":"Н","street":"Бассейная"}}

What happens if we add a house number to the Address class and then read the old JSON? Nothing bad: the new field simply keeps the value it had after the default constructor ran.

And what if, on the contrary, houseNumber is present in the JSON but we read it with the old code? We get com.fasterxml.jackson.databind.exc.UnrecognizedPropertyException. To avoid it, add an annotation to the Address class:

@JsonIgnoreProperties(ignoreUnknown = true)
public class Address {

In fact, the library has a lot of different settings with which you can do almost anything you want.

Jackson XML

Without changing anything in the Person and Address classes, and with a minimal change to main (creating a different ObjectMapper), we can serialize our object to XML:

ObjectMapper om = new XmlMapper();

We will get the following string:

<Person><firstName>Вася</firstName><lastName>Пупкин</lastName><address><countryCode>7</countryCode><city>Н</city><street>Бассейная</street></address></Person>

Jackson Avro

The Avro format is built around a data schema. We must supply the schema when serializing an object and again when deserializing it (the schema can also be embedded in the serialized data). The reader then has two schemas, the writer's and its own, and uses both to decide how to read the data.
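The idea of reading with two schemas can be sketched in a few lines of plain Java. This is a toy model and not the Avro API: a "schema" here is just a list of field names (writer) or a map from field name to default value (reader), and the resolve method is a hypothetical helper.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SchemaResolutionSketch {
    // Resolves a record written with writerFields into the reader's view.
    // readerDefaults maps every field the reader expects to its default.
    static Map<String, Object> resolve(List<String> writerFields,
                                       List<Object> values,
                                       Map<String, Object> readerDefaults) {
        Map<String, Object> written = new LinkedHashMap<>();
        for (int i = 0; i < writerFields.size(); i++) {
            written.put(writerFields.get(i), values.get(i));
        }
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> field : readerDefaults.entrySet()) {
            // A field the writer did not know gets the reader's default;
            // a field the reader does not know is simply dropped.
            result.put(field.getKey(),
                    written.getOrDefault(field.getKey(), field.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> writerFields =
                Arrays.asList("countryCode", "city", "street", "zip");
        List<Object> values = Arrays.asList(7, "N", "Basseynaya", "000000");

        Map<String, Object> readerDefaults = new LinkedHashMap<>();
        readerDefaults.put("countryCode", 0);
        readerDefaults.put("city", "");
        readerDefaults.put("street", "");
        readerDefaults.put("houseNumber", 0);

        // prints {countryCode=7, city=N, street=Basseynaya, houseNumber=0}
        System.out.println(resolve(writerFields, values, readerDefaults));
    }
}
```

The writer's extra zip field disappears and the reader's houseNumber takes its default, which is roughly what a schema-resolving reader does for us.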

The library can derive the schema directly from our POJO, but we won't use that feature here, so that we can see what an Avro schema looks like.

Let's describe the schema manually. This is done in JSON format:

{
  "type": "record",
  "name": "Person",
  "fields": [
    {
      "name": "firstName", "type": "string"
    },
    {
      "name": "lastName", "type": "string"
    },
    {
      "name": "address",
      "type": {
        "type": "record",
        "name": "Address",
        "fields": [
          {
            "name": "countryCode", "type": "int"
          },
          {
            "name": "city", "type": "string"
          },
          {
            "name": "street", "type": "string"
          }
        ]
      }
    }
  ]
}

Our main will now look like this:

import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.dataformat.avro.AvroMapper;
import com.fasterxml.jackson.dataformat.avro.AvroSchema;
import org.apache.avro.Schema;

import java.io.File;

public class Main {
    public static void main(String[] args) throws Throwable {
        Schema raw = new Schema.Parser()
                .setValidate(true)
                .parse(new File("avro-schema.json"));
        AvroSchema schema = new AvroSchema(raw);

        ObjectMapper om = new AvroMapper();

        Person person = new Person("Вася", "Пупкин",
                new Address(7, "Н", "Бассейная"));
        byte[] bytes = om.writer(schema).writeValueAsBytes(person);

        Person read = om.readerFor(Person.class)
                .with(schema)
                .readValue(bytes);
        System.out.printf("Read person: %s\n", read);
    }
}

Jackson Protobuf

Protobuf is a format that also requires the data schema to be described in advance. This time we will generate the schema from the POJO:

import com.fasterxml.jackson.dataformat.protobuf.ProtobufMapper;
import com.fasterxml.jackson.dataformat.protobuf.schema.ProtobufSchema;

public class Main {
    public static void main(String[] args) throws Throwable {
        ProtobufMapper om = new ProtobufMapper();
        ProtobufSchema schema = om.generateSchemaFor(Person.class);

        Person person = new Person("Вася", "Пупкин",
                new Address(7, "Н", "Бассейная"));
        byte[] bytes = om.writer(schema).writeValueAsBytes(person);

        Person read = om.readerFor(Person.class)
                .with(schema)
                .readValue(bytes);
        System.out.printf("Read person: %s\n", read);
    }
}

Jackson Smile

Smile is simply a binary representation of JSON. We just need to create the corresponding ObjectMapper:

ObjectMapper om = new SmileMapper();

Kryo

Kryo is a serialization library that focuses on speed and efficiency.

import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.io.Output;

import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class Main {
    public static void main(String[] args) throws Throwable {
        Kryo kryo = new Kryo();

        // we must either register every class we use,
        kryo.register(Person.class);
        kryo.register(Address.class);

        // or declare that we trust the source and that any class
        // may be instantiated
        kryo.setRegistrationRequired(false);

        Path path = Paths.get("vasya.dat");
        try (Output output = new Output(Files.newOutputStream(path))) {
            Person person = new Person("Вася", "Пупкин",
                    new Address(7, "Н", "Бассейная"));
            kryo.writeObject(output, person);
        }

        try (Input input = new Input(Files.newInputStream(path))) {
            Person read = kryo.readObject(input, Person.class);
            System.out.printf("Read person: %s\n", read);
        }
    }
}

For forward and backward compatibility, you can specify:

kryo.setDefaultSerializer(CompatibleFieldSerializer.class);

Kotlin

Now let's see what Kotlin has to offer for serialization. Since Kotlin is a JVM-based language, we can use all of the libraries above. But Kotlin also has the very useful kotlinx.serialization library, which generates the serialization code at compile time rather than using the Reflection API at runtime. This avoids reflection overhead and makes serialization faster.
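The difference between the two approaches can be illustrated in Java, the language of the libraries discussed so far. The sketch below is not how Jackson or kotlinx.serialization are implemented; the Address class and both describe methods are hypothetical. The first method discovers fields at runtime through reflection, the second spells out the field access ahead of time, as generated code would.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Map;
import java.util.TreeMap;

class ReflectionVsExplicit {
    static class Address {
        private final int countryCode;
        private final String city;

        Address(int countryCode, String city) {
            this.countryCode = countryCode;
            this.city = city;
        }
    }

    // Runtime approach: discover the fields of any object via reflection.
    static Map<String, Object> describeReflectively(Object obj)
            throws IllegalAccessException {
        Map<String, Object> out = new TreeMap<>();
        for (Field f : obj.getClass().getDeclaredFields()) {
            if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) {
                continue; // skip anything that is not instance state
            }
            f.setAccessible(true);
            out.put(f.getName(), f.get(obj));
        }
        return out;
    }

    // Compile-time approach: the field access is written out in advance,
    // so no lookup is needed while the program runs.
    static Map<String, Object> describeExplicitly(Address a) {
        Map<String, Object> out = new TreeMap<>();
        out.put("countryCode", a.countryCode);
        out.put("city", a.city);
        return out;
    }

    public static void main(String[] args) throws IllegalAccessException {
        Address address = new Address(7, "N");
        System.out.println(describeReflectively(address)); // prints {city=N, countryCode=7}
        System.out.println(describeExplicitly(address));   // prints {city=N, countryCode=7}
    }
}
```

Both methods produce the same result; the explicit one simply does its field discovery before the program starts.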

Let's first rewrite our classes in Kotlin:

import kotlinx.serialization.Serializable

@Serializable
data class Address(
    val countryCode: Int,
    val city: String,
    val street: String,
)

@Serializable
data class Person(
    val firstName: String,
    val lastName: String,
    val address: Address,
)

JSON

Now let's serialize to JSON:

import kotlinx.serialization.decodeFromString
import kotlinx.serialization.encodeToString
import kotlinx.serialization.json.Json

fun main() {
    val json = Json

    val person = Person("Вася", "Пупкин", Address(7, "Н", "Бассейная"))
    val str = json.encodeToString(person)

    val read = json.decodeFromString<Person>(str)
    println("Read person: $read")
}

If we add a field to Address and then read JSON that was written without it, we get kotlinx.serialization.MissingFieldException. To avoid this, specify a default value for the field:

@Serializable
data class Address(
    val countryCode: Int,
    val city: String,
    val street: String,
    val houseNumber: Int = 0,
)

Protobuf

Serializing to Protobuf is no harder:

import kotlinx.serialization.Serializable
import kotlinx.serialization.decodeFromByteArray
import kotlinx.serialization.encodeToByteArray
import kotlinx.serialization.protobuf.ProtoBuf

fun main() {
    val protobuf = ProtoBuf

    val person = Person("Вася", "Пупкин", Address(7, "Н", "Бассейная"))
    val bytes = protobuf.encodeToByteArray(person)

    val read = protobuf.decodeFromByteArray<Person>(bytes)
    println("Read person: $read")
}

What can be said at the end?

We've covered only a small number of the options for serializing objects on the JVM. The choice depends on many factors. Decide what you need: cross-platform support, backward and/or forward compatibility, serialization and deserialization speed, and whether the size of the resulting data matters.

In any case, the standard serialization mechanism is hardly worth using; we reviewed it for educational purposes only. It is slow, its output is bulky, and compatibility has to be implemented by hand.
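The claim about bulky output is easy to check for a small object. In the sketch below the Point class and the helper names are hypothetical; it compares the number of bytes produced by standard serialization with a hand-written encoding of the same two int fields. The exact size of the standard output is not stated here, only that it is larger.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SizeComparison {
    static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Standard serialization: the stream also carries a class descriptor
    // (class name, field names and types) in addition to the values.
    static int javaSerializedSize(Point p) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(p);
        }
        return bytes.size();
    }

    // Hand-written encoding: only the two int values, 4 bytes each.
    static int manualSize(Point p) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(p.x);
            out.writeInt(p.y);
        }
        return bytes.size();
    }

    public static void main(String[] args) throws IOException {
        Point p = new Point(3, 4);
        System.out.println("standard: " + javaSerializedSize(p) + " bytes");
        System.out.println("manual:   " + manualSize(p) + " bytes");
    }
}
```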

If you can use Kotlin, then I would recommend using its library - it's a convenient and efficient solution.

Well, if you are limited to pure Java, then, in my opinion, the Jackson library is a great option. It's pretty fast, has a lot of settings, and you can easily change the format without rewriting your code. The format can be selected for your specific task:

  • JSON - for all occasions, as it is supported by virtually all languages and frameworks and is human-readable;

  • Protobuf or Avro - if you need speed and a minimum size (they have differences, but their discussion is a matter for a separate article);

  • XML - for example, if you need to validate data against XSD, or for some other reason;

  • Any other format.
