In the last post I covered how I structure projects with Django, virtualenv and Buildout. Now I’m going to talk about how I deploy code via Fabric. If you’re not familiar with Fabric, it’s a Python package and set of command line tools you use to deploy or run tasks on systems accessible via SSH. Fabric is great because you can script it in Python and it has an active community. Note: I started using Fabric in 2010 and newer versions of Fabric might have made what I’m about to share easier. If that’s the case, definitely let me know in the comments.
One more note. I have to give a lot of credit to my friend Elias Torres, who was my CTO at Performable and is now making amazing things happen at Hubspot. A lot of the ideas in this post are from working together managing deployments at Performable and he gets all the credit for introducing me to Fabric.
Alright, so let’s dive in.
What is our goal?
The reason I want to use Fabric is simple. I want to take the setup I have in my development environment and move it to staging, production, wherever I want my code to be live. Ideally, the production setup will look very similar to my development setup so I don’t have to worry about issues arising from differences in environments.
So, you can imagine, if I had some command I could run from Terminal that would simply deploy code to all my servers, life would be great. So let’s build that.
Getting Fabric
The first thing you’ll need is Fabric. Easy enough, just go edit the setup.py file we created in the last blog post and add Fabric to the install_requires section:
install_requires=[
    ...
    'Fabric == 1.4.2',
    ...
]
While you have your setup.py file open, we’re going to do one more thing. We want to create a way to run Fabric easily for our project. To do that, we’ll add an entry point in our setup.py file. Add the following as an argument to the setup() call in your setup.py file:
entry_points="""
[console_scripts]
fab=fabric.main:main
"""
What’s going on here? Without going into detail, when Buildout runs, this will create a python script in the bin directory that will run the specified function when you execute it. So in this case, we’ll end up with a file that contains:
import fabric.main
if __name__ == '__main__':
    fabric.main.main()
So when you run bin/fab, you’ll be able to run Fabric. Cool? Okay, run bin/buildout to get the Fabric package and create the fab script in the bin directory.
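If the name=module:function syntax looks opaque, here is a rough sketch of how a string like that can be turned into a callable. This is only an illustration of the idea, not how setuptools or Buildout actually implement it; resolve_entry_point is a name I made up, and I’m pointing it at json:dumps from the standard library so it runs without Fabric installed.

```python
import importlib


def resolve_entry_point(spec):
    """Split 'name=module:function' and return (name, callable).

    Hypothetical helper for illustration only.
    """
    name, target = [part.strip() for part in spec.split('=', 1)]
    module_name, func_name = target.split(':', 1)
    module = importlib.import_module(module_name)
    return name, getattr(module, func_name)


# Same shape as 'fab=fabric.main:main', but using a stdlib target.
script_name, func = resolve_entry_point('dumps=json:dumps')
encoded = func([1, 2])
```

The generated bin/fab script does the equivalent for fabric.main:main and then calls the function it found.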
The fabfile.py
By default, Fabric automatically looks for a file called fabfile.py to find tasks it can run. You could just stick all of the tasks you’re going to use in that one file, but we’re going to do it a bit differently. Instead, we’re going to split our tasks into four files. The first is fabfile.py and the other three are going in a deploy directory. First, we need an __init__.py file and then two more files, app.py and servers.py.
Here’s how I’ll organize the code. app.py will contain tasks with logic specific to our project (so things like how to deploy). The servers.py will contain tasks we can reuse that deal with figuring out which servers we want to deploy to. Finally, the __init__.py and fabfile.py will tie it all together. Why do it this way? To run tasks, Fabric needs to know which servers to target and then what to do. With Fabric, you can specify the target servers via the command line or you can hardcode them with your tasks, but I prefer to keep them in a separate file and also autogenerate some nice server groupings.
Okay, so let’s look at the fabfile.py:
from fabric.api import *
from deploy import *
from deploy import app
env.user = 'ubuntu'
env.hosts = []
setup_hosts(globals())
for m in [app]:
    load_module(m, globals())
Okay, I know I’m doing a few things here that’ll probably upset a lot of people. I’m definitely being a little “magical.” The import * statements could list out what they’re importing, but because I rarely edit this file and I might want to add additional tasks in other files, this makes that process simpler. You’ll also notice I’m referencing globals(). I need to do this because Fabric expects all the tasks you’ll run to be in the fabfile.py. I’m going to autogenerate some tasks and I’ve found the best way to do that is to pass a reference to globals() and add the autogenerated ones to that dictionary.
There are a few other things to note here:
- env.user = 'ubuntu' is my hardcoded user because I’m deploying to Ubuntu servers where I’m using the default user. If you need to parameterize the user, you can look at the Fabric docs to see how to do that.
- If you’re wondering where setup_hosts and load_module come from, they’re being imported from the servers.py and __init__.py files, respectively.
Before I dive into what’s in the other files, the general flow is that setup_hosts will create Fabric tasks to assign servers to the env.hosts variable and load_module will load tasks from app.py and namespace them. The namespacing is there in case I want to add more tasks in a separate file later.
The __init__.py file
This is fairly simple, so I’ll cover what’s in here first. Remember, this file is located at deploy/__init__.py. Here’s what’s in there:
I define one function which will take the names of functions listed in the __all__ property of a module and create a Fabric task for each, prefixed with the module name. So for example, in our app.py we’ll have a task called deploy. When we run load_module() on the app module, we’ll end up with a Fabric task called app_deploy.
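A minimal sketch of a function matching that description could look like the following. Treat it as an illustration under assumptions rather than the exact code: I’m assuming the prefix is the last part of the module’s dotted name, and fake_app is a hypothetical stand-in for deploy/app.py so the example runs without Fabric installed.

```python
import types


def load_module(module, namespace):
    """Copy each name in module.__all__ into namespace as '<module>_<name>'."""
    # Assumption: 'deploy.app' should yield the prefix 'app'.
    prefix = module.__name__.split('.')[-1]
    for name in getattr(module, '__all__', []):
        namespace['%s_%s' % (prefix, name)] = getattr(module, name)


# Hypothetical stand-in for deploy/app.py.
fake_app = types.ModuleType('deploy.app')
fake_app.__all__ = ['deploy']
fake_app.deploy = lambda: 'deployed'
fake_app.git_pull = lambda: 'pulled'  # not in __all__, so it is not exported

tasks = {}
load_module(fake_app, tasks)
task_names = sorted(tasks)
```

In the fabfile, the namespace argument is globals(), so the prefixed names become tasks Fabric can see. For from deploy import * to also expose setup_hosts, the __init__.py would presumably need to pull in the names from servers.py as well.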
The servers.py file
The servers.py file holds the configuration of our servers (this should be split out, I just haven’t gotten to it yet) and the logic to autogenerate tasks that will set up env.hosts. Here’s what it looks like:
from fabric.api import *

__all__ = ['setup_hosts', 'db', 'print_hosts',]


class Host(object):
    def __init__(self, host, name, instance_id, elbs):
        self.host = host
        self.name = name
        self.instance_id = instance_id
        self.elbs = elbs

    def __str__(self):
        return self.__repr__()

    def __repr__(self):
        return '<fabfile.Host host="%s", name="%s", instance_id="%s", elbs="%s">' % (self.host, self.name, self.instance_id, self.elbs)


class HostManager(object):
    def __init__(self, hosts=None):
        self.hosts = set()
        self.host_lookup = dict()
        if hosts:
            for h in hosts:
                self.add_host(h)

    def add_host(self, host):
        if isinstance(host, dict):
            host = Host(**host)
        self.hosts.add(host)
        self.host_lookup['host:' + host.host] = host
        self.host_lookup['name:' + host.name] = host
        self.host_lookup['instance_id:' + host.instance_id] = host
        for elb in host.elbs:
            key = 'elb:' + elb
            if key not in self.host_lookup.keys():
                self.host_lookup[key] = set()
            self.host_lookup[key].add(host)

    def get_all_hosts(self):
        return self.hosts

    def get_hosts_by(self, method, key):
        return self.host_lookup['%s:%s' % (method, key)]


db = HostManager([
    # app
    Host(host='ec2-12-34-56-78.compute-1.amazonaws.com', name='production-1', instance_id='i-abcdefgh', elbs=['production']),
    Host(host='ec2-12-34-56-79.compute-1.amazonaws.com', name='production-2', instance_id='i-ijklmnop', elbs=['production']),
    Host(host='ec2-12-34-56-80.compute-1.amazonaws.com', name='staging-1', instance_id='i-qrstuvwx', elbs=['staging']),
    Host(host='ec2-12-34-56-81.compute-1.amazonaws.com', name='staging-2', instance_id='i-yz123456', elbs=['staging']),
])

########################################################
# M E T H O D S   T O   S E T U P   H O S T S
########################################################

def create_host_setter(_filter):
    def wrapper():
        env.hosts = list(set([h.host for h in db.get_all_hosts() if _filter(h)] + env.hosts))
    return wrapper

# By ELB
def _filter_by_elb(elb):
    def _filter(host): return elb in host.elbs
    return _filter

# By instance
def _filter_by_name(name):
    def _filter(host): return host.name == name
    return _filter

def setup_hosts(g):
    for elb in ['staging', 'production']:
        g[elb] = create_host_setter(_filter_by_elb(elb))
    for host in db.get_all_hosts():
        g[host.name] = create_host_setter(_filter_by_name(host.name))

def print_hosts():
    print env.host
At the top of the file, I define a wrapper Host object that represents a server/host I’m deploying to and a HostManager which holds all of the Host instances and has lookups to find servers by instance ID, name and ELB. Instance ID and ELB are both logical EC2 attributes if you’re deploying to EC2s on AWS. If you’re not deploying to AWS, you can remove those attributes, but the same logic still applies. The idea is that I want to use Fabric to deploy to a specific machine or set of machines based on attributes that are convenient. If you have other metadata you want to incorporate, it’s straightforward to do so.
After the HostManager class, you’ll see my server configuration hardcoded. Whether it’s in this file or elsewhere, this is super convenient because it’s easy to modify what servers you want managed and it’s okay to check this file into source control because it doesn’t contain any sensitive information. (Remember Fabric deploys via SSH, so if your key isn’t in the authorized_keys for that server, you can’t access it).
Finally, the block below the comment is what it claims to be: methods to set up hosts. Specifically, methods to generate tasks that set the env.hosts environment variable Fabric uses. I’m using closures here: create_host_setter takes a filtering function and returns a task that runs the filter over all the Host objects and assigns the matches to env.hosts. This is really great because now I can deploy to a specific set of servers by a canonical name.
At this point, if you have this file set up, you can go to the command line and run:
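To see why chaining host tasks accumulates servers instead of replacing them, here is a stripped-down sketch of the same closure pattern that runs without Fabric. The Env class, the SERVERS list and the example.com hostnames are hypothetical stand-ins for fabric.api.env and the HostManager above, and I sort the result only to make the output predictable.

```python
class Env(object):
    """Hypothetical stand-in for fabric.api.env, holding only a hosts list."""
    def __init__(self):
        self.hosts = []


env = Env()

# Hypothetical inventory: (hostname, ELB name) pairs.
SERVERS = [
    ('app1.example.com', 'production'),
    ('app2.example.com', 'production'),
    ('stage1.example.com', 'staging'),
]


def create_host_setter(_filter):
    def wrapper():
        matched = [host for host, elb in SERVERS if _filter(elb)]
        # Union with the current value so chained host tasks accumulate.
        env.hosts = sorted(set(matched + env.hosts))
    return wrapper


def _filter_by_elb(name):
    def _filter(elb):
        return elb == name
    return _filter


staging = create_host_setter(_filter_by_elb('staging'))
production = create_host_setter(_filter_by_elb('production'))

staging()
after_staging = list(env.hosts)
production()
after_both = list(env.hosts)
```

Each generated function remembers its own filter, and because it merges with whatever is already in env.hosts, running one after another builds up the combined list.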
bin/fab staging print_hosts
and it’ll output which servers are in that group. Also, because of how we’re setting env.hosts, you can use multiple hosting groups at once. For instance:
bin/fab staging production print_hosts
will print all four servers in our configuration.
One more thing to point out. The pattern I use for deploying is: bin/fab [task to setup hosts] [task to run]. Just as the tasks to set hosts can be chained, you can chain the tasks to run, but it’s important that all the host setting tasks precede the tasks that actually “do stuff.”
The app.py file
Okay, so now I have Fabric set up to do everything except actually deploy my code. Here’s what the app.py file looks like:
from fabric.api import *

__all__ = [ 'deploy', ]

def git_pull():
    with cd('app'):
        run('git reset --hard')
        run('git pull')

def buildout(fetch=True):
    if fetch:
        git_pull()
    with cd('app'):
        run('python bootstrap.py')
        run('bin/buildout')

def restart():
    sudo('service app restart', pty=False)

def deploy():
    git_pull()
    buildout(False)
    restart()
One of my favorite parts of Fabric is how readable it is. If you look at the deploy function and you’re wondering what it does, well, it’s pretty easy to see. Our deployment process involves doing a git pull, running Buildout and then restarting the server. There are many more tasks I have in my actual app.py file that do all sorts of things like set up a server from scratch, install apt-get packages, restart other services and much more (let me know if there are specific use cases you’d like to see and I’ll cover them in a future post). For this post, I’m just going to focus on deploying code and assume you’ve already manually SSHed to each server and git clone‘d your app to a directory called “app.”
So let’s do a quick walkthrough of what this does. First, git_pull changes directory to app and does a git reset followed by a git pull. I do the reset in case I happen to have changes floating around on a server from debugging or something like that. Notice how Fabric uses the incredibly pythonic with cd('app') to execute commands in a specific directory. Love that.
The buildout function runs bootstrap.py and then bin/buildout, just like we do in our development environment. Then the restart function restarts the service running our app. The restart command I use in production has more logic to wait for the restart to complete and do a few other checks to make sure life is good before we say we’re done. I’ve simplified it here just to illustrate the idea. (Side note: I run my apps in production with gunicorn. If anyone’s interested, I can do a post on how I do it.)
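To give a flavor of what “wait for the restart to complete” can mean, here is a small polling helper. It is a sketch under assumptions, not my production code: wait_until and fake_health_check are hypothetical names, and the health check is faked locally so the example runs without a server. In a real fabfile, the check would wrap a remote command that tells you whether the app is answering.

```python
import time


def wait_until(check, timeout=30.0, interval=1.0):
    """Call check() until it returns True or timeout seconds have passed."""
    deadline = time.time() + timeout
    while True:
        if check():
            return True
        if time.time() >= deadline:
            return False
        time.sleep(interval)


# Stand-in health check that only succeeds on the third attempt.
attempts = []


def fake_health_check():
    attempts.append(1)
    return len(attempts) >= 3


came_up = wait_until(fake_health_check, timeout=5.0, interval=0.0)
never_up = wait_until(lambda: False, timeout=0.0, interval=0.0)
```

A restart task could call something like this after the service restart and abort the deploy if it returns False.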
And that’s it! If you have everything setup with a configuration pointing to your hosts, all you need to deploy is:
My development process boils down to building features/fixing bugs/running tests, git add ., git ci -m "Helpful message", git push and then bin/fab staging deploy. This makes deployment a single command and allows deployments to occur as soon as the code is ready.
If you followed this far, thanks for reading. If you have any questions, you can post in the comments and I’ll respond. If you liked this, I’d appreciate it if you voted for this over on Hacker News. Thanks!