A DevOps Workflow, Part 1: Local Development

This series is a longform version of an internal talk I gave at a former company. It wasn't recorded. It has been mirrored here for posterity.

How many times have you heard: "That's weird - it works on my machine?"

How often has a new employee's first task turned into a days-long effort, roping in several developers and revealing a surprising number of undocumented requirements, broken links and nondeterministic operations?

How often has a release gone south due to stovepiped knowledge, missing dependencies, and poor documentation?

In my experience, if you put a dollar into a swear jar whenever one of the above happened, plenty of people would be retiring early to spend time on their private islands. The fact that this situation exists is a huge problem.

What would an ideal solution look like? It should ensure consistency of environments, capture external dependencies, manage configuration, be self-documenting, allow for rapid iteration, and be as automated as possible. These features - the intersection of development and operations - make up the practice of DevOps. The solution shouldn't suck for your team - you need to maximize buy-in, and that can't be done when people need to fight container daemons and provisioning scripts every time they rebase to master.

In this series, I'll be walking through how we do DevOps at HumanGeo. Our strategy consists of three phases - local development, continuous integration, and deployment.

Please note that, while I mention specific technologies, I'm not stating that this is The One True Way™. We encourage our teams to experiment with new tools and methods, so this series presents a model that several teams have implemented with success, not official developer guidelines.

Development Environment: Vagrant

In order to best capture external dependencies, one should start with a blank slate. Thankfully, this doesn't mean a developer has to format her computer each time she takes on a new project. Depending on the project, it may be as simple as putting code into a new directory or creating a new virtual environment. However, given the scale of the problems we tackle at HumanGeo, we need to push even further and assemble specific combinations of databases, Elasticsearch nodes, Hadoop clusters, and other bespoke installations. To do so, we need to create sandboxed instances of the aforementioned tools; it's the only sane way to juggle multiple versions of a product when developing locally. There are plenty of fine solutions to this problem, Docker and Vagrant being two of the major players. There's not a perfect overlap between the two, but as they fit in our stack, they're near-equivalent. Since it provides a gentler learning curve, this series will cover Vagrant.

Vagrant provides a means for creating and managing portable development environments. Typically, these reside in VirtualBox virtual machines, although Vagrant supports many other backend providers. What's neat is that, with a single Vagrantfile, you can provision and connect multiple VMs, while automatically syncing code changes made on the host machine (i.e., your computer) to the guest instance (i.e., the Vagrant box).

To get started with Vagrant, you must define your configuration in a Vagrantfile. Here's a sample:

Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/trusty64"
  config.vm.hostname = "webserver"
  config.vm.network :private_network, ip: "192.168.0.42"

  config.vm.provider :virtualbox do |vb|
    vb.customize [
      "modifyvm", :id,
      "--memory", "256",
    ]
  end

  config.vm.provision :shell, path: "bootstrap.sh"
end

This defines an Ubuntu 14.04 (Trusty Tahr) machine with a fixed private IP, 256 MB of RAM, and a bootstrap shell script, which will install needed dependencies and apply software-level configuration. The Vagrantfile can be committed to version control alongside the bootstrap script and your application code, so the entire environment is captured in a single snapshot.

Launching the machine is done with a single command: vagrant up. Vagrant will download the base image from a central repository, launch a new instance of it with the hardware and networking settings we've defined, and then run the bootstrap script. The download only occurs once per image; future machine initializations will use the cached copy. Machines can be stopped with vagrant halt and later re-launched with vagrant up. If you decide that you need to nuke your entire environment from orbit and start over (an immensely useful option), you can do so with vagrant destroy.

To manage these machines, one can connect via SSH just as one would to a remote server. The vagrant ssh command automatically logs the user in using public key authentication. From there, a developer can experiment with configuration and other aspects of application development. Because the box sits on a private network with its own IP, every port it listens on is reachable from the host: if a webserver is bound to port 5000, it can be reached from your browser at http://192.168.0.42:5000 (the IP address we assigned to our instance in the Vagrantfile).

Unlike when working with a remote server, you don't need to run a terminal-based editor over SSH, or rsync your changes to the virtual machine every time you save a file. Instead, the directory that contains your Vagrantfile is automatically mounted as /vagrant/ on the guest, with changes synced back and forth. So, you can use whatever editor you want on the host, while executing code on the VM. Easy.

Provisioning: Ansible

Vagrant itself is only really focused on the orchestration of virtual machines; the configuration of the machines is outside of its purview. As such, it relies on a provisioner - an external tool or script that runs against newly created virtual machines in order to build upon the base image. For example, a provisioner would be responsible for taking a blank Ubuntu installation and installing PostgreSQL, initializing a database, and seeding the database with data.

The example Vagrantfile uses a simple shell script (bootstrap.sh) to handle provisioning. For simple cases, this may well be sufficient. However, if you're doing any serious development, you'll want to move to a more robust configuration management tool. Vagrant ships with support for several different ones, including our preferred tool - Ansible.

Ansible is great in many ways: its YAML-based configuration language is clean and logical, it operates over SSH, it has a great community, it emphasizes modularity, and it doesn't require any custom software on your target machines (other than Python 2, with Python 3 support in technical preview). With a little elbow grease, you can even make your playbooks idempotent, so there's nothing to fear if you reprovision an instance. And since these provisioning scripts live alongside your code, they can be included in your merge review process, improving the validation of your infrastructure changes.
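
To make the idempotence point concrete, here's a minimal sketch of tasks built from Ansible's state-checking modules. Each task describes a desired end state rather than an action, so re-running it is a no-op (the path and user name below are illustrative, not from our actual playbooks):

```yaml
# Each module checks current state before acting, so this file can be
# applied repeatedly without side effects.
- name: Ensure the application directory exists
  file: path=/opt/myapp state=directory mode=0755

- name: Ensure the deploy user exists
  user: name=deploy shell=/bin/bash
```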

Swapping out Vagrant's shell provisioner is extremely straightforward. Just change your provisioner to "ansible", point it at the Ansible configuration script (called a playbook), and you're set! The resulting provisioning block looks like this:

config.vm.provision "ansible" do |ansible|
  ansible.playbook = "playbook.yml"
end

Tasks

Ansible's basic building block is a task. Conceptually, a task is an atomic operation. These operations run the gamut from the basic (e.g., set the permissions on a file) to the complex (e.g., create a database table). Here's a sample task:

- name: Install database
  apt: name=mysql-server state=present

The equivalent shell command would be sudo apt-get install mysql-server. Nothing fancy, right?

- name: Deploy DB config
  copy: src=mysql.{{env_name}}.conf dest=/etc/mysql.conf mode=644

There are several things going on here. First, surprise! Ansible is awesome and speaks Jinja2. As such, it will interpolate the variable env_name into the string value for src, resulting in mysql.dev.conf if we were targeting a dev environment (env_name is a convention we use internally for this very purpose). Next, we're invoking the copy module. This doesn't actually copy a file from one remote location to another; instead, it copies a local file to a remote destination. That saves you from having to scp the file to your target machine and then remote in to set permissions. It's also far easier to understand at a glance.
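
As for where env_name comes from: one natural home is a group variables file, which Ansible loads automatically for hosts in the matching inventory group (the filename below assumes an inventory group named dev, purely for illustration):

```yaml
# group_vars/dev.yml -- applies to all hosts in the "dev" inventory group
env_name: dev
```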

- name: Start mysqld
  service: name=mysql state=started enabled=yes

Finally, we ensure that the MySQL service is not only running, but also set to start automatically when the system does. This highlights one of the benefits of Ansible's module system - it masks (and handles) underlying implementation complexities. Whether the target machine uses SysV-style init scripts, Upstart, or systemd, the service module takes care of it for you.

Roles

Tasks can either reside in your playbook, or they can be organized into functional units called roles. Roles not only allow you to group tasks, but also bundle files, templates and other resources, providing for a clean separation of concerns. The tasks above can be placed in a file called tasks/main.yml, resulting in the following directory structure:

roles
└── mysql                   # Tasks to be carried out on DB machines
    ├── files
    │   ├── mysql.dev.conf
    │   └── mysql.prod.conf
    └── tasks
        └── main.yml
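
Under that layout, tasks/main.yml simply gathers the three tasks shown earlier (the copy module looks in the role's files/ directory automatically, so the src paths need no prefix):

```yaml
# roles/mysql/tasks/main.yml
- name: Install database
  apt: name=mysql-server state=present

- name: Deploy DB config
  copy: src=mysql.{{env_name}}.conf dest=/etc/mysql.conf mode=644

- name: Start mysqld
  service: name=mysql state=started enabled=yes
```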

Then, all you need to do is reference the role from within your playbook.

Playbooks

These are the entry points for Ansible. A playbook is composed of one or more plays, each of which specifies several parameters: one or more instances to target, variables to bundle, and a series of tasks (or roles) to execute.

- name: Configure the test environment app server
  hosts: 192.168.1.1
  vars:
    env_name: dev
    es_version: 2.1.0
  roles:
    - common
    - elasticsearch
    - mysql

The example above shows how Ansible roles help improve modularity and reusability. If I have to install MySQL on several different hosts (e.g., the test app server and the production app server), all I need to do is include the role. Ansible also maintains a central repository of community roles, Ansible Galaxy, for developers to customize; most of the time you don't need to write any novel provisioning code.
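
Pulling in one of those community roles is typically done through a requirements file fed to the ansible-galaxy command. A minimal sketch (the role name is a real, popular community role, used here purely as an example):

```yaml
# requirements.yml -- install with: ansible-galaxy install -r requirements.yml
- src: geerlingguy.mysql
```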

To invoke the playbook, run ansible-playbook name-of-playbook.yml. If you're using Ansible with Vagrant, you should instead use vagrant provision, as Vagrant handles the mapping of hosts and authentication for you. And, provided your tasks are idempotent, the machine state will remain the same no matter how many times you provision.

This concludes the local development portion of our dive into DevOps. In the next installment, we'll cover continuous integration!