Building an Enterprise Build and Deployment System
July 05, 2008
In my experience with build and deployment systems, I've noticed some things that make them successful. My work with build systems has been mainly with Java web apps using Ant, CruiseControl or AnthillPro, and SVN, CVS, StarTeam, or Perforce. Deployment arenas ranged from one or two environments with one or two servers, to 7 environments (2 of them with 7 streams, one with 14) and over 100 servers.
To begin, how do you know if you have a successful build system? Quite simply, your consumers are happy. Developers have environments that they can mess around in without bothering the QA Dept. The QA Dept. has environments where they can perform functional tests, load tests, integration tests, and regression tests without colliding with the developers or other QA test teams. QA can consistently get the builds they need, when and where they need them. The production support team has a maintenance environment that looks like the production environment from a network, hardware, and software perspective. Finally, successful build and deployment systems have deploys and patches to production that either need zero downtime or have a consistently sized (small) outage window.
Before I get into my 'good ideas', note that this article assumes a basic knowledge of project automation and build/deployment systems. If you're not familiar with terms like 'continuous integration' and 'versioned code repository', I'd suggest you take a look at the book Pragmatic Project Automation by Mike Clark. It outlines the ideas behind these terms as well as how to put together a basic build and deployment system.
Now, into the good stuff... these aren't necessarily in order of importance.
1. Verify hardware/software configurations and, where you can, move them into your repository. It is crucial that your hardware looks the same from box to box (number of CPUs, memory size, JDK/JRE installations, etc.), as do your software (http servers and configs, web servers and configs, databases) and your network (ports, firewalls), and that none of it changes unless you permit it. If your foundation (what you're deploying to) isn't consistent, how do you expect your application to function consistently the way you want it to? For more details of some of the commands I used to write a simple verification tool, click here.
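To make the verification idea in item 1 a little more concrete, here is a minimal Python sketch. It is not the tool mentioned above, just an illustration. It assumes you have already collected a few facts from each box into a dictionary; the host names, keys, and values are all made up. The function compares every box against one agreed baseline and reports whatever has drifted.

```python
def find_drift(baseline, hosts):
    """Return {host: {setting: (expected, actual)}} for every setting that differs."""
    drift = {}
    for host, facts in hosts.items():
        diffs = {
            key: (expected, facts.get(key))
            for key, expected in baseline.items()
            if facts.get(key) != expected
        }
        if diffs:
            drift[host] = diffs
    return drift


# Hypothetical baseline and per-host facts, for illustration only.
baseline = {"cpus": 4, "memory_mb": 8192, "java_version": "1.5.0_15"}
hosts = {
    "app01": {"cpus": 4, "memory_mb": 8192, "java_version": "1.5.0_15"},
    "app02": {"cpus": 4, "memory_mb": 4096, "java_version": "1.5.0_15"},
}

report = find_drift(baseline, hosts)
for host, diffs in sorted(report.items()):
    for key, (expected, actual) in sorted(diffs.items()):
        print(f"{host}: {key} expected {expected!r}, found {actual!r}")
```

Keeping the baseline itself in the repository means a change to the 'foundation' shows up as a reviewed commit rather than a surprise.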
2. Have managerial support from infrastructure, networking, and development Depts. If you don't have the means of getting a proper environment or set of environments in place, you're up the creek without a paddle. Knowing you have a consistent 'foundation' of hardware, network, and software to deploy to is key.
3. Deploy into folders that are named with the label of the build you're deploying and use a symlink to denote the 'active' build. Assuming you're using continuous integration and a code repository, pushing your code onto boxes into folders named with the build label (like 1.5.24) and then using a symlink ('ln -nsf' to create one in Linux) to point to the 'active' build helps immensely. Using this technique makes it easy to roll back a deploy to the previous version - just change the symlink. In your Ant file, have all your server start targets point to the symlink, pass in the build label as a property and presto! - you can stop/start any build deployed in that environment from Ant.
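The symlink switch in item 3 can be sketched in a few lines of Python, assuming a POSIX file system. The function name activate_build, the link name 'current', and the folder layout are hypothetical. The idea is the same as 'ln -nsf', except that the new link is created under a temporary name and then renamed over the old one, so the 'active' link never goes missing in the middle of a switch.

```python
import os
import tempfile


def activate_build(releases_dir, label, link_name="current"):
    """Point releases_dir/link_name at the folder for the given build label."""
    target = os.path.join(releases_dir, label)
    if not os.path.isdir(target):
        raise FileNotFoundError(f"build {label} is not deployed in {releases_dir}")
    link_path = os.path.join(releases_dir, link_name)
    tmp_link = link_path + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)  # clear a leftover from an interrupted switch
    os.symlink(label, tmp_link)  # relative link, resolved inside releases_dir
    os.replace(tmp_link, link_path)  # rename over the old link in one step
    return link_path


# Demonstration in a throwaway directory with two labelled build folders.
with tempfile.TemporaryDirectory() as root:
    for build_label in ("1.5.23", "1.5.24"):
        os.mkdir(os.path.join(root, build_label))
    activate_build(root, "1.5.24")  # deploy the new build
    activate_build(root, "1.5.23")  # roll back to the previous one
    print(os.readlink(os.path.join(root, "current")))
```

Rolling back is just another call with the previous label, which is the whole appeal of the technique.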
4. Release more often with less functionality. When considering how often to do a production release (patches are a different matter), try to work towards a higher frequency of releases per year with less change in each, rather than one big release with all the changes for the year. There are many advantages to this, not the least of which is that the release team gets more familiar with the release/deployment process into production. Also, development and QA teams can (hopefully) do a better job if they have less functionality to build/test, even if they have less time to do it.
5. Put everything in the repository you can. Within reason, of course. However, managing code (obviously), db changes, web server configurations and installations, and even CruiseControl configurations in the repository can be uber helpful. You can include entire Tomcat, JBoss, Ant, even JRun installations under your deployed, labelled folder and run them automatically using Ant - I've done it.
6. Use Ant macros to keep your Ant build files organized. Using Ant macros, you can get rid of lots of duplicated Ant targets and scripting - just be careful with naming your targets and your macros. When you get into hundreds of them, it can really make a difference if they are named sensibly. Also (some developers may disagree with me here), using the ant-contrib library with its conditional logic (if statements, etc.) is a major help when dealing with numerous environments, server types and applications in one build system.
7. Use free tools to ease brain strain and your work load. If you have to use Ant to deploy to Windows boxes, try installing Cygwin with sshd on the Windows box. That way you can use scp/ssh (with keys and certs so you don't hardcode passwords) against both your Windows and *nix boxes. If you have numerous boxes that you want to validate a configuration on, consider using Tentakel with a modified hosts file. It will give you visibility into all the boxes at once (if you set it up correctly). Or perhaps you need to verify the metadata in your databases - check out SchemaCrawler. You can use a Unix 'diff' command on its results to verify the schema's integrity.
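As a rough illustration of the diff idea at the end of item 7, the Python sketch below compares two text dumps of a schema using the standard difflib module, which produces the same kind of output as 'diff -u'. The sample text is invented for the example and is not real SchemaCrawler output; in practice you would read the two dumps from files.

```python
import difflib


def schema_diff(expected_text, actual_text):
    """Return unified-diff lines between two schema dumps (empty list if identical)."""
    return list(
        difflib.unified_diff(
            expected_text.splitlines(),
            actual_text.splitlines(),
            fromfile="expected",
            tofile="actual",
            lineterm="",
        )
    )


# Invented sample dumps: one column has been widened in the 'actual' database.
expected_schema = "table CUSTOMER\n  column ID INTEGER\n  column NAME VARCHAR(50)\n"
actual_schema = "table CUSTOMER\n  column ID INTEGER\n  column NAME VARCHAR(100)\n"

changes = schema_diff(expected_schema, actual_schema)
if changes:
    print("\n".join(changes))
else:
    print("schema matches")
```

An empty result means the environment's schema matches the one you expected to deploy against.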
8. Consider empowering your QA Dept. by giving them the ability to push builds when and where they want them at the push of a button. I've seen this be very successful. Have a small web app that hooks into your build system with a simple GUI whereby a member of the QA team can select a build label and an environment, and push the build there by clicking a button!
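Behind the button in item 8 there has to be something that turns 'this label, that environment' into a build system call. Here is a minimal Python sketch of that step, under several assumptions: the environment names, the label format, the property names build.label and env, and the target name deploy are all hypothetical. Only the 'ant -Dname=value target' command-line form is standard Ant. The function validates the request and returns the command to run, without running it.

```python
import re

# Hypothetical environments QA is allowed to push to.
KNOWN_ENVIRONMENTS = {"dev", "qa1", "qa2", "staging"}

# Assumed label format, like 1.5.24.
LABEL_PATTERN = re.compile(r"\d+\.\d+\.\d+")


def build_push_command(label, environment):
    """Validate a push request and return the Ant command line as a list."""
    if not LABEL_PATTERN.fullmatch(label):
        raise ValueError(f"not a valid build label: {label!r}")
    if environment not in KNOWN_ENVIRONMENTS:
        raise ValueError(f"unknown environment: {environment!r}")
    # Property and target names below are placeholders for your own build file.
    return ["ant", f"-Dbuild.label={label}", f"-Denv={environment}", "deploy"]


print(" ".join(build_push_command("1.5.24", "qa1")))
```

Returning the command as a list rather than a single string makes it safe to hand to something like subprocess.run without a shell, so a label typed into the GUI can't smuggle in extra commands.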
9. Get rid of people manually pushing files/folders entirely! One of my big 'Aha!' moments in working with software in general was when I was told (and realized) that for any kind of truly automated system, humans have to be removed from the process. We tend to be fallible... and we prove it repeatedly. Machines tend to do what they're told. :-)
I hope you enjoyed this little overview. If you have questions or comments, please send them to me @ perry.mckenzie@netfocusconsulting.com.