I promised I would follow up on my previous post, “The ActiveMQ Protobuf Implementation Rocks!”.
So you might be asking yourself, what’s the secret sauce? Well, before I get into that, let me first explain the class model that our proto compiler generates.
For every message definition in the ‘.proto’ file, the compiler will generate 3 classes:
- the message interface: is implemented by the bean and buffer classes. It has all the ‘getters’.
- the bean class: has all the ‘setters’ and ‘merge’ methods.
- the buffer class: has all the encoding and decoding methods. It does not allow mutation.
The message interface also defines the freeze(), frozen(), and copy() methods, which allow you to make an instance immutable, check whether an instance is immutable, and create a mutable copy. Buffer classes are always immutable. A bean class can transition to being immutable via freeze(). freeze() naturally returns a buffer object. copy() naturally returns a bean object.
This bean model gives substantial flexibility. Besides making it easy to transition from immutable to mutable and back, the message interface lets you implement business methods that operate against either type of instance. You could use the bean class purely in a builder style to always generate a buffer instance, or you could just use them like traditional java bean objects.
Once a bean instance is frozen, any attempt to modify the instance will throw an assertion error if assertions are enabled in your JVM. So the CPU cost of validating program correctness can be disabled at run time.
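To make the class model a bit more concrete, here is a minimal hand-written sketch of the three classes for a made-up `Person` message with a single `name` field. This is not the code the proto compiler generates: the names `Person`, `PersonBean`, `PersonBuffer`, and `Greeter` are hypothetical, and the buffer here simply wraps the frozen bean and skips encoding entirely.

```java
// Hypothetical message interface: getters plus freeze(), frozen() and copy().
interface Person {
    String getName();
    boolean frozen();
    PersonBuffer freeze();
    PersonBean copy();
}

// Hypothetical bean class: adds the setters. Once frozen, setters trip an
// assertion, which only fires when the JVM runs with assertions enabled (-ea).
final class PersonBean implements Person {
    private String name;
    private PersonBuffer frozenView; // non-null once freeze() has been called

    public String getName() { return name; }

    public PersonBean setName(String name) {
        assert !frozen() : "modification not allowed after freeze()";
        this.name = name;
        return this;
    }

    public boolean frozen() { return frozenView != null; }

    public PersonBuffer freeze() {
        if (frozenView == null) {
            frozenView = new PersonBuffer(this);
        }
        return frozenView;
    }

    // copy() always hands back a fresh, mutable bean.
    public PersonBean copy() { return new PersonBean().setName(name); }
}

// Hypothetical buffer class: no setters, always frozen. A real buffer would
// also carry the encoding and decoding methods, which are left out here.
final class PersonBuffer implements Person {
    private final PersonBean bean;

    PersonBuffer(PersonBean bean) { this.bean = bean; }

    public String getName() { return bean.getName(); }
    public boolean frozen() { return true; }
    public PersonBuffer freeze() { return this; }
    public PersonBean copy() { return bean.copy(); }
}

// A business method written against the interface works with either type.
final class Greeter {
    static String greet(Person p) { return "Hello, " + p.getName(); }
}
```

Used builder style, that would look something like `new PersonBean().setName("Ada").freeze()`, and calling `copy()` on the result gets you back to something mutable without touching the frozen instance.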
Finally, the most important feature of the buffer class is that it holds on to either the byte array that it was created from or the frozen bean that created it, and sometimes both, after it builds one from the other. This has several implications. Firstly, once a buffer is encoded to a byte[], subsequent encoding passes are free. This is also true when a buffer is decoded, as the next encoding is free since it still retains the original encoding of the message. The other benefit, which the benchmark highlighted, is that deferred decoding is possible. A newly created buffer class will not decode the data until a field is accessed. This is also true of the nested messages that are encoded in a buffer. While the outer message may get decoded, the nested message will not be decoded until its fields are accessed.
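The caching idea can be sketched in a few lines. The class below is purely illustrative and is not the ActiveMQ implementation: `LazyTextBuffer` is a made-up name, it holds a single string field, and plain UTF-8 stands in for the real protobuf wire format. The two counters exist only so you can see when work actually happens.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical buffer that keeps the encoded bytes, the decoded value, or both.
final class LazyTextBuffer {
    private byte[] encoded;  // wire form, if already known
    private String decoded;  // field value, if already known
    int decodeCount = 0;     // demo instrumentation only
    int encodeCount = 0;     // demo instrumentation only

    private LazyTextBuffer(byte[] encoded, String decoded) {
        this.encoded = encoded;
        this.decoded = decoded;
    }

    // Created from bytes: nothing is decoded yet.
    static LazyTextBuffer fromBytes(byte[] data) {
        return new LazyTextBuffer(data.clone(), null);
    }

    // Created from a value: nothing is encoded yet.
    static LazyTextBuffer fromValue(String value) {
        return new LazyTextBuffer(null, value);
    }

    // Deferred decoding: the bytes are only parsed on first field access.
    String getText() {
        if (decoded == null) {
            decoded = new String(encoded, StandardCharsets.UTF_8);
            decodeCount++;
        }
        return decoded;
    }

    // Cached encoding: the bytes are built at most once, then reused.
    byte[] toByteArray() {
        if (encoded == null) {
            encoded = decoded.getBytes(StandardCharsets.UTF_8);
            encodeCount++;
        }
        return encoded.clone();
    }
}
```

A buffer built with `fromBytes` can be passed along and re-encoded without ever being decoded, which is roughly the situation a message broker is in when it only forwards a message.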
While reading Comparing the Java Serialization Options, I ran across a cool Google Code project which has done an excellent job benchmarking a wide variety of serialization options for Java.
I had been researching the protobuf encoding format for a while and really liked it. But I did not really like the Java implementation that Google had published. It was kinda clunky to use, and I saw several missing optimizations that could have been applied. Optimizations that could create huge performance wins when applied to the usage patterns of an enterprise messaging system like Apache ActiveMQ. So I created a new protobuf implementation in the ActiveMQ project.
Naturally, I was curious to see how the ActiveMQ protobuf implementation stacked up against similar technologies. So I grabbed the V1 benchmark source code and added our implementation to it. If you want to do the same, apply this patch.
Once I ran the benchmark, I was very pleased with the results. I’m including the performance graphs of our impl and standard protobuf and thrift for comparison.




As it turns out, our implementation looks awesome in the benchmark! How about that decoding speed!
It’s getting late here, so I’ll have to do a follow-up post explaining how come we did so much better.
Wow, I can’t believe I missed it. Python lovers rejoice! Seems some good folks have created a Python client for ActiveMQ which uses the very robust ActiveMQ C++ client.
And for those of you on Ubuntu, Dejan Bosanac has put together an excellent guide on how to build it on Ubuntu.
Last weekend I got a little spare time and threw together a small library which should help with the problem of boring Java console applications on Windows.
It’s called Jansi and it provides support for using ANSI escape sequences in your Java console applications on Windows.
With ANSI escape sequences, you can fully control the cursor positioning and the foreground and background color of the console text output. Here is a quick example of what’s possible:
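For a rough idea of what these sequences look like on the wire, here is a small sketch that builds them by hand. It deliberately does not use Jansi's own API: the class `AnsiSketch` and its helpers are made up for illustration, and the raw codes are the standard ANSI ones (30 to 37 for foreground colors, 40 to 47 for background colors, `H` for cursor positioning). On a terminal that doesn't interpret ANSI sequences, this will just print the raw escape characters.

```java
// Hypothetical helper that writes raw ANSI escape sequences to stdout.
final class AnsiSketch {
    static final String ESC = "\u001B[";      // Control Sequence Introducer
    static final String RESET = ESC + "0m";   // reset all text attributes

    // Wraps text in a foreground/background color pair and resets afterwards.
    static String color(int fg, int bg, String text) {
        return ESC + fg + ";" + bg + "m" + text + RESET;
    }

    // Moves the cursor to a 1-based row and column.
    static String moveTo(int row, int col) {
        return ESC + row + ";" + col + "H";
    }

    public static void main(String[] args) {
        System.out.print(moveTo(1, 1));
        System.out.println(color(31, 47, "red text on a white background"));
        System.out.println(color(32, 40, "green text on a black background"));
    }
}
```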

So in my last post I was suggesting making it easier to include dependency checksums as part of a maven build. I decided that it should be simple enough to implement this as a Maven Plugin. For those of you interested, you can get the source to the new Checksum Plugin here.
The basic problem the plugin is trying to solve is that it is possible that central repositories get hacked and the artifacts/dependencies of our builds get replaced with malicious versions. Right now we have no way to easily detect that, and we could potentially create a release build of a project which bundles one of those malicious dependencies. In practice this rarely occurs, but it’s not out of the realm of possibility.
Basically the plugin supports generating a checksum.txt file that is included as part of the project build. This file holds all the checksums for the dependencies (including the dependencies’ pom checksum). Generating/updating is induced via the use of a maven profile. This is only done when dependencies get updated.
In a normal build the plugin just validates the checksums of the downloaded dependencies against those stored in the checksum.txt file.
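The validation step boils down to something like the sketch below. This is not the plugin's source: `ChecksumSketch` is a made-up class, the choice of SHA-1 is an assumption, and so is the file layout, which here is one `fileName=hexDigest` entry per line. The plugin's real checksum.txt format may well differ.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical validator; the "name=hex" line format is assumed, not the plugin's.
final class ChecksumSketch {

    // Computes the SHA-1 digest of a file as a lowercase hex string.
    static String sha1Hex(Path file) throws IOException, NoSuchAlgorithmException {
        byte[] digest = MessageDigest.getInstance("SHA-1").digest(Files.readAllBytes(file));
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Returns the names of artifacts that are missing or whose checksum differs.
    static List<String> validate(Path checksumFile, Path artifactDir)
            throws IOException, NoSuchAlgorithmException {
        List<String> failures = new ArrayList<>();
        for (String line : Files.readAllLines(checksumFile)) {
            line = line.trim();
            if (line.isEmpty()) {
                continue;
            }
            int eq = line.indexOf('=');
            if (eq < 0) {
                throw new IllegalArgumentException("Malformed checksum line: " + line);
            }
            String name = line.substring(0, eq);
            String expected = line.substring(eq + 1);
            Path artifact = artifactDir.resolve(name);
            if (!Files.exists(artifact) || !sha1Hex(artifact).equalsIgnoreCase(expected)) {
                failures.add(name);
            }
        }
        return failures;
    }
}
```

A build would then fail whenever `validate` returns a non-empty list, since that means at least one downloaded dependency no longer matches what was recorded.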
I wish I could move the validation of the dependencies earlier than its current place in the Maven life cycle, but it seems you can’t get the list of dependencies if it gets moved up any further. Any Maven mojo hackers have any workarounds for that?