History
Helmsman's legacy is a collection of SystemV-style init scripts that were copied to different machines and manually maintained or symlinked all over the place. Needless to say, that didn't scale very well, and the scripts started to drift between machines. We also ran into permission issues: we didn't want the entire development or QA team to have root access, but the scripts still needed to be maintained and linked to the appropriate run level, and the services still needed to be controlled.
This led to a rewrite in Python, which was chosen because it was quick to put together and has pretty good subprocess execution and monitoring capabilities. Unfortunately, the implementation tried to be a little too fancy and drew ASCII art to show service status, which would cause the interpreter to crash in various terminal configurations. We also ran into issues keeping the Python install (and supporting modules) consistent across the various Solaris SPARC, Solaris x86, Red Hat x86, OS X, and Windows machines in use throughout our different environments (a discussion for another time).
I then spent some time looking for alternatives. Unfortunately, I didn't find anything that was simple to install, cross-platform, and maintainable by our (mostly Java) development team. I looked at monit, Chef, Puppet, RunDeck, Jenkins, Upstart, etc., but they felt far too heavyweight or got us back into the issue of needing another runtime across all of our machines. We're not a huge shop, so building out Puppet scripts to consistently install a runtime just to start and stop services didn't seem like time well spent.
Given that our main applications are written in Java, we already maintain JVM installs on all machines, and our developers know Java well, Java seemed like an obvious choice. After a few hours of playing around with commons-exec and working out how to format the output and debugging information so that it is readable and works on all terminals, I was able to rewrite the Python scripts in a day. Helmsman was born.
I deploy Helmsman with our application. Our deployment scripts (automated via Jenkins) push a copy of Helmsman out with each deployment, stop all the services using the existing version, move everything out of the way, install the new deployment, and then use the new Helmsman to start everything back up. This makes it simple to ensure that the same version is on every machine and that all changes get pushed out reliably, just like the rest of our build.
In test/stage environments, I have versions set up for QA to start and stop key services during testing, so they can exercise failover, redundancy, etc. We also use the groups feature to define services that need to stay up even in maintenance mode, or services that should run at the warm standby site.
Features
Some of the features include:
- Simple Java properties configuration format
- One jar deployment
- Base configuration shared across all environments/machines
- Per machine configuration overrides
- Simple service start/stop ordering (no dependency model)
- Parallel or serial service execution
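To make the last two items concrete, here is a minimal sketch of the difference between serial and parallel execution over an ordered list of service actions. It is not Helmsman's actual implementation: the class name `ExecutionModeSketch`, both method names, and the service names are hypothetical, and each action is a stand-in that returns a string rather than running a real script.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; not taken from the Helmsman source.
public class ExecutionModeSketch {

    /** Runs each action in list order, waiting for one to finish before starting the next. */
    static List<String> runSerial(List<Callable<String>> actions) {
        List<String> results = new ArrayList<>();
        for (Callable<String> action : actions) {
            try {
                results.add(action.call());
            } catch (Exception e) {
                results.add("failed: " + e.getMessage());
            }
        }
        return results;
    }

    /** Submits all actions to a thread pool at once; results are still collected in list order. */
    static List<String> runParallel(List<Callable<String>> actions) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, actions.size()));
        try {
            List<String> results = new ArrayList<>();
            // invokeAll blocks until every action has completed and returns
            // the futures in the same order as the input list.
            for (Future<String> future : pool.invokeAll(actions)) {
                try {
                    results.add(future.get());
                } catch (ExecutionException e) {
                    results.add("failed: " + e.getCause().getMessage());
                }
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for services", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Callable<String>> actions = new ArrayList<>();
        for (String name : new String[] {"db", "queue", "web"}) {
            actions.add(() -> name + " started"); // stand-in for running a start script
        }
        System.out.println(runSerial(actions));
        System.out.println(runParallel(actions));
    }
}
```

Serial mode is the safer choice when later services rely on earlier ones being up, since there is no dependency model to enforce that; parallel mode trades that guarantee for a faster overall start or stop.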
Configuration is done via Java properties files, which list the names of the services and then a few basic properties for each service. Each "service" is simply a command to be executed that follows the SystemV init script convention of taking an argument of start, stop, or status. These scripts can be custom written, but in most cases they will be provided by frameworks like Java Service Wrapper (JSW) or Yet Another Java Service Wrapper (YAJSW), or by your container.
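As a rough sketch of how such a configuration could be read, the example below layers a per-machine properties file over a shared base using the defaults mechanism of `java.util.Properties`, then builds the command line for a service action. The property keys (`services`, `<name>.command`), the script paths, and the class and method names are all assumptions made for illustration; they are not Helmsman's actual configuration format, which is documented on the project page.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

// Hypothetical illustration only; key names and paths are assumptions.
public class ServiceConfigSketch {

    /** Parses properties text; keys not set here fall back to the given defaults (may be null). */
    static Properties load(String text, Properties defaults) {
        Properties props = new Properties(defaults);
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Returns the service names in the order listed under the assumed "services" key. */
    static List<String> serviceNames(Properties props) {
        List<String> names = new ArrayList<>();
        for (String name : props.getProperty("services", "").split(",")) {
            if (!name.trim().isEmpty()) {
                names.add(name.trim());
            }
        }
        return names;
    }

    /** Builds the command line for one service: the configured script plus start, stop, or status. */
    static List<String> commandFor(Properties props, String service, String action) {
        String command = props.getProperty(service + ".command");
        if (command == null) {
            throw new IllegalArgumentException("No command configured for " + service);
        }
        return Arrays.asList(command, action);
    }

    public static void main(String[] args) {
        // Base configuration shared across all machines.
        String base = "services=db,web\n"
                + "db.command=/opt/app/bin/db.sh\n"
                + "web.command=/opt/app/bin/web.sh\n";
        // Per-machine override: only the keys that differ.
        String machine = "web.command=/opt/app/bin/web-standby.sh\n";

        Properties config = load(machine, load(base, null));
        for (String service : serviceNames(config)) {
            System.out.println(commandFor(config, service, "status"));
        }
    }
}
```

The resulting list could be handed to `ProcessBuilder` or to a library such as commons-exec to actually run the script. Using the `Properties` defaults chain keeps the per-machine file small, since it only needs to contain the values that differ from the base.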
You can grab the source from my GitHub repo or grab a precompiled version from my GitHub-hosted Maven repo. Check out the GitHub page for more details on how to use the tool.
Let me know if you find a use for Helmsman in your process. Hopefully it makes your life a bit easier.