Skinny Models – the DAO pattern for Ruby persistence

For those used to ActiveRecord-style development the following might not seem all that bad:

class User < ActiveRecord::Base
  belongs_to :account
  has_many   :posts

  validates_format_of :email, :with => /@/

  after_create :deliver_welcome_email
  after_create :mark_pending

  private

  def mark_pending
    update_attributes! :state => 'pending'
  end

  def deliver_welcome_email
    ApplicationMailer.deliver :welcome_email, email
  end
end

If you’ve read Sandi Metz’ book (or have an OO background) you’ll notice something about this conventional Rails code.

Not only is it a huge mess of dependencies, it violates damn near every part of SOLID object-oriented design that we practice as a means of keeping our code nimble.

So what’s the alternative? The Data Access Object pattern uses an object to manage persistence and doesn’t let that object do anything else. Effectively, it makes the above class look like this:

class User < ActiveRecord::Base
  belongs_to :account
  has_many :posts
end

Or, if you want to get religious:

class User < ActiveRecord::Base
end

(Personally, I think associations are the thing ActiveRecord does right so I’m fine leaving those in.)

We just took a lot of code out of the User class and it has to go somewhere else. Where? Well, that depends on what it does. Consider this reorganization:

class User < ActiveRecord::Base
  belongs_to :account
  has_many :posts
end

module Validator
  class Invalid < StandardError; end
                                   
  def self.validate!(record)
    unless record.email =~ /@/   
      raise Invalid
    end
  end
end

module Email
  def self.welcome(email)
    ## Handle email delivery
  end
end

class StateMachine

  attr_reader :record
  def initialize(record)  # I'm making the StateMachine instantiable because
    @record = record      # the 'record' object feels like internal state and
  end                     # there may be actions we want to perform on that
                          # object during a state transition
  def pending!
    record.state = 'pending'
    record.save!
  end
end

module Signup
                                   # You now have to use this to tell Signup
  def self.persistence=(db_model)  # what object is in charge of database
    @persistence = db_model        # persistence. This is where you pass in
  end                              # your DAO or, in tests, your stub object
                                 
  def self.complete! data              # I'm not bothering to make Signup an instantiable
    record = @persistence.new(data)    # class at this point because we don't need it yet.
    Validator.validate! record
    record.save!
    StateMachine.new(record).pending!
    Email.welcome data[:email]
  end
end

We now have many more lines of code and many more objects, so it might look like we just added complexity, but in fact we’ve reduced it. If complexity is the entanglement of concepts, then we’ve removed nearly all complexity from this application. Let’s walk through the changes bit by bit.

Removing callbacks

We turned a couple of after_create callbacks into direct code execution. You now know exactly when the email address will be validated and when the welcome message will be sent. You also know how to stub out the objects that perform those functions in your tests.

Introducing a state machine

This is my favorite. There’s a Rails application at work that moves a lot of money around and needs to handle many intermediate states with very precise rules about when things are purchased, paid for, approved, rejected, returned, refunded, etc. If we were to try to implement this logic inside an ActiveRecord class we’d be fighting against the weight of Rails every time we extended the state graph.

Most Rails applications I’ve worked on have eventually had some kind of state machine in them. I’ve tried acts_as_state_machine and two gems both called state_machine, and I’ve built several terrible and bug-prone versions myself.

Nothing has worked as well as pulling all the logic out into a new class and being very explicit about what kind of state transitions your application allows. You’ll notice the interface to the persistence object is only two methods large. As long as the object you pass in responds to state= and save! the StateMachine class will work. Which means you do NOT need to pass in an ActiveRecord instance in your tests – any simple mock will do.
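To make "very explicit" concrete, here is a minimal sketch of how the StateMachine above might grow an explicit transition table. The TRANSITIONS constant, the InvalidTransition error, the approve! and reject! methods, and the FakeRecord stand-in are all hypothetical names invented for illustration. Note that this version also reads record.state, so the stand-in needs three methods rather than two.

```ruby
# Hypothetical extension of the StateMachine shown earlier. The states and
# method names below are illustrative, not taken from a real application.
class StateMachine
  class InvalidTransition < StandardError; end

  # Each key is a current state; the value lists the states it may move to.
  TRANSITIONS = {
    nil        => ['pending'],
    'pending'  => ['approved', 'rejected'],
    'approved' => [],
    'rejected' => []
  }

  attr_reader :record

  def initialize(record)
    @record = record
  end

  def pending!
    transition_to 'pending'
  end

  def approve!
    transition_to 'approved'
  end

  def reject!
    transition_to 'rejected'
  end

  private

  # Refuse any move that the table does not list, then persist the change.
  def transition_to(new_state)
    allowed = TRANSITIONS.fetch(record.state, [])
    unless allowed.include?(new_state)
      raise InvalidTransition, "#{record.state.inspect} -> #{new_state.inspect}"
    end
    record.state = new_state
    record.save!
  end
end

# A stand-in record: it only needs state, state= and save!
FakeRecord = Struct.new(:state) do
  def save!
    true
  end
end

record = FakeRecord.new
StateMachine.new(record).pending!
StateMachine.new(record).approve!
record.state  # "approved"
```

Because the table is plain data, adding a state means adding a row, and the stand-in record never touches a database.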

Adding reasonable validations

Two months ago I was in the middle of a Rails2 → Rails3 upgrade and I found myself in a pry console staring at the insides of a validates_each block deep inside Rails. For trivial validations and/or trivial applications the Rails way of adding data validation is just fine. But when you write the validations yourself you get all the following benefits:

test your validations without simultaneously testing that ActiveRecord#save still works
test the edge cases of your validations because you actually control how they’re defined and when they fire
easily debug your validations because you implemented them in a well-factored class
write hundreds of validation tests if you want because they’ll all run in less than a second.
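As a rough illustration of those benefits, the sketch below exercises the Validator module from earlier against a few edge cases without loading ActiveRecord. FakeUser and valid_email? are hypothetical helpers made up for this example. In a real suite these checks would live in RSpec examples.

```ruby
# The Validator from the reorganization above, repeated so this runs alone.
module Validator
  class Invalid < StandardError; end

  def self.validate!(record)
    raise Invalid unless record.email =~ /@/
  end
end

# Hypothetical stand-in for a User record; only #email is needed.
FakeUser = Struct.new(:email)

# Returns true when the validator accepts the email, false when it raises.
def valid_email?(email)
  Validator.validate!(FakeUser.new(email))
  true
rescue Validator::Invalid
  false
end

valid_email?('m@rvelo.us')  # accepted
valid_email?('whoops')      # rejected
valid_email?(nil)           # rejected, because nil =~ /@/ is nil
valid_email?('@')           # accepted, which shows how loose /@/ is
```

The last case is the kind of edge you only tend to notice once validations are cheap enough to poke at directly.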

Separating application-level features from database objects

The conventional path would have you type User.create! params[:user] in your controller. This is only good if you’re certain that creating a new User record is the action you’re trying to complete. But more likely what you want is to complete a user signup. So Signup.complete! params[:user] is much more descriptive of what you want. Maybe you want to create a User record. Maybe you want to create a Business record. Maybe it’s such an important action that you’ll be generating Account, Business, User, and Product records all at once. By keeping the controller-level semantics high-level you won’t have to change them to keep up with an evolving implementation.
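Here is one way that evolution could play out, sketched under assumptions: Signup now creates an account as well as a user, yet the call a controller makes stays the same. The users= and accounts= setters and the InMemoryUser and InMemoryAccount stand-ins are hypothetical names, and validation, state changes and email are left out to keep the sketch short.

```ruby
# Hypothetical sketch: Signup's internals grow, its public call does not.
module Signup
  class << self
    # Injected persistence objects. Anything that responds to .new(attrs)
    # and returns an object with #save! will do.
    attr_writer :users, :accounts

    def complete!(data)
      account = @accounts.new(name: data[:name])
      account.save!
      user = @users.new(email: data[:email], account: account)
      user.save!
      user
    end
  end
end

# In-memory stand-ins for the DAOs (assumed names, not from the post).
InMemoryAccount = Struct.new(:name, keyword_init: true) do
  def save!
    true
  end
end

InMemoryUser = Struct.new(:email, :account, keyword_init: true) do
  def save!
    true
  end
end

Signup.users    = InMemoryUser
Signup.accounts = InMemoryAccount

# What a controller action would call. This line does not change
# when Signup starts creating more kinds of records.
user = Signup.complete!(email: 'm@rvelo.us', name: 'Marvelous')
user.account.name  # "Marvelous"
```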

And then your tests go zoom

This post isn’t called ‘how to make your Rails tests faster’ but it might as well be. The reason your tests are slow is ActiveRecord and ActionController. Your code (and Ruby) is actually blazing fast but it’s organized in such a way that you have to test each object’s dependencies if you want to test the object itself.

The tests

Let’s take a look at what these tests would look like before and after the above reorganization. First, the conventional Rails way:

## file:spec/models/user_spec.rb
require 'config/environment'  # This loads all of Rails and autoloads
describe User do              # every object automatically
  describe 'creating' do
    subject { User.create! params }
    let(:email) { 'm@rvelo.us' }
    let(:name)  { 'Marvelous'  }
    let(:params) { {
      email: email,
      name:  name
    } }
    context "with invalid email" do
      let(:email) { 'whoops' }
      it "doesn't create a new record" do
        expect { subject }.to_not change { User.count }
      end
    end
    context "with valid params" do
      it "creates a new record" do
        expect { subject }.to change { User.count }.by(1)
      end
      it "sends a welcome email" do
        ApplicationMailer.should_receive(:deliver).with(:welcome_email, email)
        subject
      end
      it "sets the 'state' column to 'pending'" do
        subject
        User.last.state.should == 'pending'
      end
    end
  end
end

That’s a respectable spec. It’s reasonably clean, tests most of the important stuff, and doesn’t go off track testing unnecessary edge cases or the implementation of the code (other than the ActionMailer bit).

However, every example in that snippet booted all of ActiveRecord and tested the full forest of callbacks, validations, database connection mechanisms, and relational associations inside ActiveRecord. Rails has its own tests so there’s no point in me or you re-testing the Rails internals. That just turns our app into a Seti@Home that’s guaranteed not to find anything.

Let’s rewrite that to test only what we care about.

## file:spec/unit/services/signup_spec.rb
require 'app/services/signup'  # We only require the thing we care about
                               # and we trust that _it_ will require any
                               # dependencies that it needs.
module FakeDAO
  require 'ostruct'               # We don't want to go saving to the actual
  def self.instance               # database all the time, we'll just assume that
    @instance ||= OpenStruct.new  # ActiveRecord still works.
  end

  def self.new(attrs)
    attrs.each do |k,v|
      instance.send "#{k}=", v
    end
    instance.send "save!=", true  # sets a field named save!, so instance.save! returns true
    instance
  end
end

describe Signup do

  before { Signup.persistence = FakeDAO } # This is where we inject the dependency
  subject { Signup.complete! data }

  let(:email) { 'm@rvelo.us' }
  let(:name)  { 'Marvelous'  }
  let(:data) { {
    email: email,
    name:  name
  } }

  context "with invalid email" do
    let(:email) { 'whoops' }
    it "doesn't create a new record" do
      expect { subject }.to raise_error(Validator::Invalid)
    end
  end
  context "with valid data" do
    it "creates a new record" do
      FakeDAO.instance.should_receive(:save!).at_least(1).times
      subject
    end
    it "sends a welcome email" do
      Email.should_receive(:welcome).with(email)
      subject
    end
    it "sets the 'state' column to 'pending'" do
      subject
      FakeDAO.instance.state.should == 'pending'
    end
  end
end

That’s about the same size code as the previous example but it runs in less than a hundredth of a millisecond. It also explicitly tests only Signup – the object under test.

We then get to add individual tests for the other components we’ve extracted. Each of those will be focused and obvious.

You’re stubbing things. What if a bug slips through the cracks?

It’s only okay to stub the dependencies of an object if two conditions hold: you’re stubbing in such a way that you’re unlikely to change the core behavior of the object under test, and you still have high-level integration tests that exercise the whole stack.

So if you’re reasonable about what you choose to stub and you still write a basic high-level test that only tests for the primary cases (not edge cases) you can get away with this approach.

We’ve had enough success with this pattern at work that Xavier added the following to our spec/unit_helper.rb:

RSpec.configure do |config|
  config.before(:suite) do
    if Object.const_defined?("Rails") || Object.const_defined?("ActiveRecord")
      raise "The Rails environment should not be loaded for unit tests."
    end
  end
end

That’s right, we throw an error if you’ve required any part of Rails in your code. This enforces that you’ve written loosely-coupled, dependency-injected, DAO-style persistence-oriented classes whether you are familiar with those terms or not. I can’t recommend this snippet of code highly enough.
