Apologies in advance if you’re not interested in a post about the guts of Opscode Chef.
I recently started to adopt Bryan Berry’s application & library cookbook model, as outlined in his excellent and funny blog post, "How to Write Reusable Chef Cookbooks, Gangnam Style". But I quickly ran into a blocker: community cookbooks sometimes do work in Chef’s compile phase that really belongs in the execute phase. Perhaps this calls into question the entire viability of compile-phase providers like chef_gem.
Let’s recap Bryan’s post quickly. He argues for vastly reducing the number and complexity of bespoke cookbooks that sysadmins have to maintain by leveraging existing, high-quality community cookbooks as “libraries”. (The most common source of these library cookbooks is, of course, the Opscode Cookbooks GitHub repo.) Then, by creating a small bespoke cookbook (the “application” cookbook) that overrides certain attributes & behaviors, you can achieve the desired customizations for your own environment. As a trivial example, suppose you had a popular web application called “instachef” that just got bought by MyFace, and you need to handle more traffic in the Apache VirtualHost that’s fronting it. Well, you might create an instachef::server recipe that does nothing more than
    node.set['apache']['prefork']['maxclients'] = 31337
    include_recipe "apache2"
thereby overriding the default attributes in the apache2 cookbook but re-using all the logic that Opscode and other community members have put in there. Seems logical, right?
I wanted to do the same thing with the PostgreSQL cookbook: that is, I wanted to wrap the community’s cookbook with one that sets up Yum repos from the PostgreSQL Global Development Group (PGDG) and then installs PostgreSQL 9.2 instead of PostgreSQL 8.x on CentOS servers. The PGDG recipe looks something like this:
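(Roughly speaking; the PGDG release RPM URL and the postgresql cookbook attribute names below are illustrative and may need adjusting for your platform and cookbook versions.)

    # Sketch of an smpostgresql::pgdg recipe: add the PGDG yum repo, then point
    # the community postgresql cookbook at the 9.2 packages via attributes.
    remote_file "#{Chef::Config[:file_cache_path]}/pgdg-centos92.rpm" do
      source "http://yum.postgresql.org/9.2/redhat/rhel-6-x86_64/pgdg-centos92-9.2-6.noarch.rpm"
      action :create_if_missing
    end

    rpm_package "pgdg-centos92" do
      source "#{Chef::Config[:file_cache_path]}/pgdg-centos92.rpm"
      action :install
    end

    node.set['postgresql']['version'] = "9.2"
    node.set['postgresql']['client']['packages'] = %w(postgresql92 postgresql92-devel)
    node.set['postgresql']['server']['packages'] = %w(postgresql92-server)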
and then the run_list in my Berkshelf-managed Vagrant VM is just “recipe[smpostgresql::pgdg], recipe[postgresql::server]”. This works great: I get PostgreSQL 9.2 installed on the VM. So far, so good.
Things break down, however, when I want to use the database and database_user LWRPs to manage a set of databases and users. In order to use the LWRPs, which run in the execute phase, I need to have the “pg” Rubygem installed during the compile phase. But the “pg” gem has native extensions which must be compiled against the headers of the PostgreSQL version I want, and I can’t retrieve those for PostgreSQL 9.2 until the execute phase, when the pgdg recipe sets up a Yum repo for me to retrieve them from! Argh, a chicken-and-egg problem.
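For concreteness, the kind of thing I want to write looks roughly like this (the connection details and names are made up for the example):

    # The database cookbook's LWRPs need the "pg" gem, which chef_gem installs
    # at compile time -- and that is exactly what breaks here.
    postgresql_database "instachef" do
      connection :host => "127.0.0.1", :port => 5432,
                 :username => "postgres", :password => node['postgresql']['password']['postgres']
      action :create
    end

    postgresql_database_user "instachef" do
      connection :host => "127.0.0.1", :port => 5432,
                 :username => "postgres", :password => node['postgresql']['password']['postgres']
      password "s3kr1t"
      action :create
    end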
I think the root cause of this problem is that people are abusing the compile phase of the Chef run to do things that would normally be done in the execute phase. Just look at the source code of postgresql::ruby: it’s almost an entire recipe, forcibly run in the compile phase. Whenever I see code that breaks the boundaries between compile and execute, I think something’s seriously wrong.
I don’t know the internals of Chef and I’m no Ruby expert, so I don’t know how viable my solution is. But conceptually, this would all be solved if chef_gem were an execute-time resource only. Then other work that actually belongs in the execute phase, like installing postgresql92-devel and a C compiler, could happen before the gem is installed. It’s almost like we need lazy evaluation of the require "pg" call: defer it until the execute phase, at which point the LoadError could be rescued, the Gem installation could proceed, and the require retried.
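To sketch what I mean (this is a workaround pattern, not how chef_gem actually behaves, and the pg_config path is an assumption based on the PGDG 9.2 layout):

    # Defer the gem install and the require to converge time with a ruby_block.
    ruby_block "lazily install and require the pg gem" do
      block do
        begin
          require "pg"
        rescue LoadError
          # By this point the pgdg recipe has converged, so the 9.2 headers exist.
          system("gem install pg -- --with-pg-config=/usr/pgsql-9.2/bin/pg_config")
          Gem.clear_paths
          require "pg"
        end
      end
    end

(On an omnibus install of Chef you’d want the embedded gem binary rather than the system one; again, this is only an illustration of the idea.)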
I’m interested to know what other Chef practitioners think. In the meantime, I’m working around the issue by simply avoiding using the LWRPs.
hey Julian, thanks for giving my post such careful attention.
I agree that sticking stuff in compile time is a bad idea. I have a commit to the postgresql::ruby recipe that moves everything into execute time rather than compile time. https://github.com/bryanwb/postgresql/commit/52c4aa005ac78a9f10651f77c4f2034347ba26fb
I tend to agree w/ you that chef_gem running at compile time may cause more problems than it solves.
Perhaps a patch to the chef_gem resource could be added so that you could choose to run it at run-time rather than compile-time. That said, running resources at compile time really screws up your perception of how resources are processed, at least it does for me.
Great post! We’ll have to get you on the foodfightshow to talk more about this topic.
I should also mention that executing resources at compile time breaks cookbooks more frequently on RHEL-ish distros than on Debian. This is because virtually every cookbook relies on having the EPEL yum repository in place, which is of course installed by a yum repo resource at run-time, not compile-time.
At first we used lots of Ruby gems to drive logic in our cookbooks, but over time I came to see this as a huge problem, and now we almost always try to shell out to command-line programs to do the work. For your specific problem (Postgres database interaction) we wrote a cookbook [1] that uses the psql CLI to implement the same behaviour.
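The shape of it, much simplified (these aren’t the cookbook’s actual resource names, just an illustration of the shell-out approach):

    # Create a database with the CLI tools, guarded by a psql query,
    # instead of talking to Postgres through the "pg" gem.
    execute "create instachef database" do
      user "postgres"
      command "createdb instachef"
      not_if "psql -ltA | cut -d '|' -f 1 | grep -qx instachef", :user => "postgres"
    end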
Multi-stage builds (where one stage builds the tool that drives the next stage of the build, which builds the tool for the next stage, and so on) are a problem across a range of build tools. There has been some interesting academic work on the subject, but I have yet to see a practical build tool that solves this nicely. A poor man’s version would just be the ability to have an array of run lists: the Phase 1 run list would build and converge before the Phase 2 run list, which would build and converge before the Phase 3 run list, and so on. This would also make it quite easy to cut down builds on nodes; e.g. the “deploy apps into the app server” phase could be Phase 3, and you could choose to run just that phase for quick deploys.
[1] https://github.com/realityforge/chef-psql
I second the mighty Peter Donald here; support for phases would be very helpful, though it would add yet more complexity to Chef.
Ok Bryan, here’s your reply.
You made some great points. I realized I’d never been bitten by the precondition on EPEL because we handle that at server provisioning time with a custom Knife bootstrap, but obviously not everyone’s going to do that.
I’m not entirely sold on arbitrary-length multi-stage run lists yet because of the complexity, but adding a third phase between the existing compile & execute phases, and locking it down to that, might be appropriate for Chef.
Peter, I do love your psql cookbook and doing things that way. Modules that require compilation against exact versions of devel libraries are fine for serious software developers where subtle ABI breakage is a problem, but are a poor fit for doing routine system administration. Nobody is going to change “CREATE TABLE” significantly between versions. Perhaps another solution would be to convert the database LWRPs to use the command line tools rather than Gems?