Serialisation: Protocol Buffers

General

Google Protocol Buffers

One of the most popular multi-language serialization techniques is Protocol Buffers from Google. Google supports C++, C#, Dart, Go, Java and Python, and there is community support for many other languages, including the ones in this book.

It uses a binary format, and the format is defined by a specification language. A data structure is defined in this language, and compilers then generate language-specific versions which can be included in programs written in those languages.

We will define a Person data type in this section and then, for each of the languages dealt with, show a client and a server that can exchange messages using that data type.

The specification language is described in the Protocol Buffers Language Guide (proto3).

Suppose we have data about a person and their email addresses. Informally, it could look like this:


Name {
    string family
    string personal
}

Email {
    string kind
    string address
}

Person {
    Name name
    Email[] emails
}
      

An example could be:


Person {
    Name: {
              family: "Newmarch"
              personal: "Jan"
          }
    Email[]: {
              Email: {
                        kind: "home"
                        address: "jan@newmarch.name"
                     }
              Email: {
                        kind: "work"
                        address: "j.newmarch@boxhill.edu.au"
                     }
             }
}
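
Before moving to the specification language, it may help to see the same informal structure as ordinary program data. The following is a minimal sketch using Python dataclasses; the class names simply mirror the informal description above and are not generated by Protocol Buffers.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Name:
    family: str
    personal: str


@dataclass
class Email:
    kind: str
    address: str


@dataclass
class Person:
    name: Name
    # A person may have any number of email addresses, including none.
    emails: List[Email] = field(default_factory=list)


# The example person from the text.
person = Person(
    name=Name(family="Newmarch", personal="Jan"),
    emails=[
        Email(kind="home", address="jan@newmarch.name"),
        Email(kind="work", address="j.newmarch@boxhill.edu.au"),
    ],
)
```

These hand-written classes only illustrate the shape of the data. With Protocol Buffers, the equivalent classes are generated by the compiler from the specification.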
      

The specification in the Protocol Buffers specification language would be as in personv3.proto:


syntax = "proto3";
package person;

message Person {

    message Name {
        string family = 1;
        string personal = 2;
    }

    message Email {
        string kind = 1;
        string address = 2;
    }

    Name name = 1;
    repeated Email emails = 2;

}
      

Serialized, it could look something like the following (this is a simplification; the real format is binary):


1 1 Newmarch 2 Jan 2 1 home 2 jan@newmarch.name 1 work 2 j.newmarch@boxhill.edu.au
      
This could be stored in a file, sent across the network, attached to a web page, etc. We will just use it for network data transmission.
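
To give a feel for how the real binary format differs from the simplified text above, here is a hand-rolled sketch of the proto3 wire encoding for the Person message, using only the Python standard library. It assumes the field numbers from personv3.proto; the function names are made up for this illustration and are not part of any Protocol Buffers API. In practice you would use the code generated by the compiler rather than encoding by hand.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_len_field(field_number: int, payload: bytes) -> bytes:
    """Encode a length-delimited field: tag, payload length, payload.

    The tag is (field_number << 3) | wire_type, and wire type 2 is
    used for strings and for nested messages.
    """
    tag = (field_number << 3) | 2
    return encode_varint(tag) + encode_varint(len(payload)) + payload


def encode_string(field_number: int, text: str) -> bytes:
    """Strings are sent as UTF-8 bytes in a length-delimited field."""
    return encode_len_field(field_number, text.encode("utf-8"))


def encode_person(family: str, personal: str, emails) -> bytes:
    """Encode a Person; emails is a list of (kind, address) pairs."""
    name = encode_string(1, family) + encode_string(2, personal)
    data = encode_len_field(1, name)  # Person.name is field 1
    for kind, address in emails:
        email = encode_string(1, kind) + encode_string(2, address)
        data += encode_len_field(2, email)  # each repeated item is field 2
    return data


wire = encode_person(
    "Newmarch",
    "Jan",
    [("home", "jan@newmarch.name"), ("work", "j.newmarch@boxhill.edu.au")],
)
print(wire[:4].hex())  # 0a0f0a08, followed by the bytes of "Newmarch"
```

Each field is preceded by a tag byte combining its field number and wire type, and each string or nested message carries its length, so no separators are needed between values.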

The protobuf compiler

The specification can be compiled into appropriate code for a number of languages using the protoc compiler. On my system, the packaged version is only at version 3.0.0 and is a couple of years old, while the current version (as of June 1, 2020) is 3.12.2.

The specification language and the wire protocol have evolved in an upward-compatible way, but the evolution has not been so clean for some of the language APIs. So it is best to use the latest version rather than the one in a distro's repositories. The latest version is at GitHub. The compiler itself comes in compiled versions for various platforms, such as protoc-3.12.2-linux-x86_64.zip for 64-bit Linux systems. This should be downloaded and unzipped into a suitable directory such as /usr/local/ for Linux.
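
As a rough sketch of how you might check which protoc is installed and build the command to compile the specification, the following Python helpers may be useful. The helper names are hypothetical; the only things assumed of protoc itself are that `protoc --version` prints a line such as "libprotoc 3.12.2" and that `--python_out` names the output directory for generated Python code.

```python
import re
import shutil
import subprocess


def parse_protoc_version(output: str):
    """Extract (major, minor, patch) from text such as 'libprotoc 3.12.2'."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", output)
    if match is None:
        raise ValueError(f"no version number in {output!r}")
    # A missing patch component is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def installed_protoc_version():
    """Return the version of protoc on the PATH, or None if not installed."""
    if shutil.which("protoc") is None:
        return None
    result = subprocess.run(
        ["protoc", "--version"], capture_output=True, text=True, check=True
    )
    return parse_protoc_version(result.stdout)


def protoc_command(proto_file: str, out_dir: str = "."):
    """Build the command line that generates Python code from a .proto file."""
    return ["protoc", f"--python_out={out_dir}", proto_file]


print(installed_protoc_version())
print(protoc_command("personv3.proto"))
```

Passing the list from `protoc_command` to `subprocess.run` in a directory containing personv3.proto should produce a file personv3_pb2.py. Other languages use their own output options, such as `--java_out` for Java.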

In addition to this, you also need the language-specific support library for each target programming language, such as protobuf-java-3.12.2.zip for Java. This gives you source code, which has to be built. It is easier to get a ready-built version from a site such as JAR Download. For example, for Java it is on the page "Download com.google.protobuf protobuf-java JAR files with dependency".

Resources


Copyright © Jan Newmarch, jan@newmarch.name
Creative Commons License
"Network Programming using Java, Go, Python, Rust, JavaScript and Julia" by Jan Newmarch is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Based on a work at https://jan.newmarch.name/NetworkProgramming/ .
