Thoughts of Hiram Chirino

Saturday, August 02, 2008

New Checksum Plugin

So in my last post I was suggesting making it easier to include dependency checksums as part of a Maven build. I decided that it should be simple enough to implement this as a Maven plugin. For those of you interested, you can get the source to the new Checksum Plugin here.

The basic problem the plugin is trying to solve is that central repositories could get hacked and the artifacts/dependencies of our builds replaced with malicious versions. Right now we have no easy way to detect that, and we could potentially create a release build of a project which bundles one of those malicious dependencies. In practice this rarely occurs, but it's not out of the realm of possibility.

Basically the plugin supports generating a checksum.txt file that is included as part of the project build. This file holds all the checksums for the dependencies (including the checksums of the dependencies' poms). Generating or updating the file is triggered by enabling a Maven profile, which only needs to be done when dependencies get updated.

In a normal build the plugin just validates the checksums of the downloaded dependencies against those stored in the checksum.txt file.
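
As a rough illustration of the generate/validate split described above, here is a minimal Python sketch. It is not the plugin's code: the one-entry-per-line name=digest layout of checksum.txt, the choice of SHA-1, and the function names are all assumptions made for the example.

```python
import hashlib
from pathlib import Path


def sha1_of(path):
    """Return the hex SHA-1 digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def generate(artifacts, checksum_file):
    """Write one 'name=digest' line per artifact (assumed file layout)."""
    lines = [f"{Path(p).name}={sha1_of(p)}" for p in sorted(artifacts)]
    Path(checksum_file).write_text("\n".join(lines) + "\n")


def validate(artifacts, checksum_file):
    """Return the names of artifacts whose digest is missing or different."""
    expected = dict(
        line.split("=", 1)
        for line in Path(checksum_file).read_text().splitlines()
        if line.strip()
    )
    return [
        Path(p).name
        for p in sorted(artifacts)
        if expected.get(Path(p).name) != sha1_of(p)
    ]
```

In this sketch generate() plays the role of the profile-driven update, and validate() is what a normal build would run: an empty result means every downloaded file matched, anything else should fail the build.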

I wish I could move the validation of the dependencies earlier than its current place in the Maven life cycle, but it seems you can't get the list of dependencies if it gets moved up any further. Do any Maven mojo hackers have a workaround for that?

Monday, July 28, 2008

Comments on the Maven Repository Security Proposal

For those of you who don't know, Maven is an awesome build tool. It uses centralized repositories to share build artifacts. Right now there is a problem: if a repository is hacked, malicious code could be injected into those artifacts and distributed by other builds. Lots of folks object to using Maven solely due to this possibility. It's a good thing that the Maven team seems to be working on fixing those problems.

First off, I love the Maven Repository Security Proposal. I think that the 'Specified Checksums' idea is awesome. I think it needs to be made so easy to use that folks always use it. Right now it's a little ugly because it makes the dependency declaration much more verbose. Plus it does not seem to cover transitive dependencies that are being used during the build, and I think that those checksums NEED to be included too.

I think that what would be better is if maven provided the tools to update the checksum information in the pom.

Let's say that a build for a module is set up in some strict mode where only artifacts with known checksums are allowed. If the pom is updated to add a new dependency, I think there should be some Maven command which automatically adds the checksum for the new dependency (and its transitive dependencies). Artifacts that are signed with a trusted key get added without prompting, and a confirmation prompt would be given for artifacts that are not GPG trusted.
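
To make that workflow concrete, here is a small Python sketch of what such an update command might do. Nothing here exists in Maven: update_checksums, the is_trusted callback (standing in for a GPG signature check) and the confirm callback (standing in for the prompt) are hypothetical names for illustration only.

```python
def update_checksums(known, resolved, is_trusted, confirm):
    """Return a new checksum table that also covers newly resolved artifacts.

    known      -- dict of artifact id -> checksum already recorded
    resolved   -- dict of artifact id -> checksum of what was just downloaded
    is_trusted -- callable(artifact_id) -> bool, stands in for a GPG check
    confirm    -- callable(artifact_id) -> bool, stands in for a user prompt
    """
    updated = dict(known)
    for artifact_id, checksum in sorted(resolved.items()):
        if artifact_id in known:
            # Strict mode: an already recorded artifact must never change.
            if known[artifact_id] != checksum:
                raise ValueError(f"checksum mismatch for {artifact_id}")
            continue
        # New artifact: trusted ones are added silently, others need a yes.
        if is_trusted(artifact_id) or confirm(artifact_id):
            updated[artifact_id] = checksum
        else:
            raise ValueError(f"{artifact_id} was not accepted")
    return updated
```

The prompt is only reached for artifacts that are both new and not trusted, which is the behaviour described above.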

So the question is why go through all that trouble? So that folks who get a trusted source distribution (out of SCM or a signed tarball) can do a build and have a high level of guarantee that the dependencies being used in the source build match what was intended by the developers of the source distribution. Furthermore, it will not matter if the transitive dependencies are signed and have keys in the end user's keyring, since all the checksums are included in the build.

Now, since there could be lots of dependencies in a build, due to the use of build plugins and transitive dependencies, it might be worth storing the checksum data in a file external to pom.xml, or at least in a different xml section from the dependencies declaration.

Things to think about: Having SNAPSHOT dependencies in the build could complicate things, as the build would be tied to a particular SNAPSHOT/checksum, but maybe that's a good thing.

Thursday, July 17, 2008

Keep an eye out for ZooKeeper

Wow, I love the simplicity that ZooKeeper brings to a really hard set of distributed problems. Check out this Introductory Video that explains it more in depth. Basically, group leadership/coordination and cluster-wide configuration issues are taken care of if you use ZooKeeper.

Oh, and it's an Apache project now. Yay! It seems like the project website is still not fully set up since they are migrating from SourceForge to Apache, but here's a link to the source tree.

TODO: Double Write Buffers

Note to self: investigate implementing the Double Write Buffers idea in ActiveMQ. ActiveMQ keeps several indexes into the persistent messages that it's holding, and when ActiveMQ is shut down ungracefully, we rebuild the indexes from the data logs because they are left in an inconsistent state. If you're queueing up millions of messages, rebuilding those indexes can take a long time.

Double buffering may allow us to fix inconsistencies in those indexes and get us running faster.
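
The note does not spell the technique out, so the following Python sketch assumes it means the usual double write scheme known from database storage engines: each page is written to a separate area before it is written in place, so a torn in-place write can be repaired from the copy instead of rebuilding everything. The class, the CRC32 page header, and the crash_after hook are all made up for the example and have nothing to do with ActiveMQ's actual index code.

```python
import zlib


def seal(payload):
    """Prefix a page payload with its CRC32 so torn writes can be detected."""
    return zlib.crc32(payload).to_bytes(4, "big") + payload


def is_intact(page):
    """True if the stored CRC32 matches the payload that follows it."""
    return len(page) >= 4 and zlib.crc32(page[4:]).to_bytes(4, "big") == page[:4]


class DoubleWriteStore:
    """In-memory stand-in for an index file plus a double write area."""

    def __init__(self):
        self.pages = {}         # page id -> sealed page (the "real" index)
        self.double_write = {}  # page id -> sealed page (always written first)

    def write(self, page_id, payload, crash_after=None):
        """Write to the double write area first, then to the real location.

        crash_after is a test hook (an assumption of this sketch): it cuts the
        in-place write short after that many bytes to simulate a torn page.
        """
        page = seal(payload)
        self.double_write[page_id] = page             # step 1: safe copy
        if crash_after is not None:
            self.pages[page_id] = page[:crash_after]  # simulated torn write
            return
        self.pages[page_id] = page                    # step 2: real location

    def recover(self):
        """Restore torn pages from the double write area; return repaired ids."""
        repaired = []
        for page_id, copy in sorted(self.double_write.items()):
            if is_intact(copy) and not is_intact(self.pages.get(page_id, b"")):
                self.pages[page_id] = copy
                repaired.append(page_id)
        return repaired

    def read(self, page_id):
        """Return the payload of a page, refusing to return a torn one."""
        page = self.pages[page_id]
        if not is_intact(page):
            raise IOError(f"page {page_id} is torn")
        return page[4:]
```

On restart, recover() only touches the pages that fail their checksum, which is the part that could save the long index rebuild.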

Monday, June 02, 2008

ActiveMQ/SpecJMS/Camel Webinar

Whoa, time flies by, and I forgot to post about the upcoming webinar that I will be co-hosting with Rob Davies on June 10th. We will be covering some messaging basics, introducing Apache ActiveMQ and Apache Camel to the audience, but most interesting I think will be the section where Rob will be covering the results that IONA has been seeing benchmarking ActiveMQ against the SpecJMS2007 test suite. I totally agree with Rob's comment that "An independent benchmark is important, because it negates the chance to skew home-grown tests to a vendor's strengths."

Thursday, May 29, 2008

InfoQ Covers ActiveMQ 5.1 Release

InfoQ has posted a nice article on the new features in the ActiveMQ 5.1 release versus the last 4.1 release:

Apache ActiveMQ, an open source provider of enterprise messaging services, recently released version 5.1 which includes improvements in stability and performance of the message broker product. This version also includes support for priority message ordering and a Microsoft Message Queue (MSMQ) to ActiveMQ Bridge with the new msmq transport component.

There are also improvements in the monitoring module of ActiveMQ container. A new DestinationSource class was added to access the available Queues or Topics or listen to Queues/Topics being created or deleted in the container. There is a new API to help end users view available destinations and query them to find JMS statistics such as active queue count, queue depth, number of messages etc.

Read More...

Wednesday, May 07, 2008

ActiveMQ 5.1.0 Release

For all of you who ran into issues with ActiveMQ 5.0.0 when running it in anger, I highly recommend you give the just-released ActiveMQ 5.1.0 a try. This release focused on making the broker rock solid. It resolved several bugs which only reared their heads in high-load situations. Memory leaks have been squashed and performance has even improved in several areas.

Even if you have not seen any issues with your 5.0.0 installation, I'd highly recommend you upgrade to 5.1.0 to avoid running into some of the bugs that have been addressed in the release.

Thursday, April 10, 2008

Multicast not working on a Linux box?

Just ran into a problem where some multicast tests were failing on a Linux box and I could not figure out why. Did a little bit of research and found out that you may need to add a route for multicast traffic first. So if you have this problem, try running:
route add 224.0.0.0 netmask 240.0.0.0 dev eth0

or if you have an older version of Linux like me:
route add -net 224.0.0.0 netmask 240.0.0.0 dev eth0
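
In case the numbers in that command look arbitrary: 224.0.0.0 with netmask 240.0.0.0 is the whole IPv4 multicast range (224.0.0.0/4), so the route covers every multicast group. A quick way to check that with Python's standard ipaddress module (the sample addresses below are just illustrative picks, not taken from the failing tests):

```python
import ipaddress

# The destination and netmask from the route command, written as one network.
route = ipaddress.ip_network("224.0.0.0/240.0.0.0")


def covered_by_route(address):
    """True if traffic to this address would match the multicast route."""
    return ipaddress.ip_address(address) in route


print(route)                             # the same network in prefix notation
print(covered_by_route("239.255.2.3"))   # a multicast group address
print(covered_by_route("192.168.1.10"))  # an ordinary unicast address
```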