Factory Girl Best Practices

Written by: Clemens Helm


This is the seventh Testing Tuesday episode. Every week we will share our insights and opinions on the software testing space. Drop by every Tuesday to learn more! Last week we talked about the top 5 Cucumber best practices.

Generating and maintaining test data

When you write software tests, you usually need to get your application into a certain state by creating test data. This test data is the basis your tests run on. One way to create it is to write an SQL script. A better one is to write fixtures. But all static test data has a downside: it becomes hard to maintain as your project grows.


Generating test data with factory girl

There are tools that make generating and maintaining test data easy. In this screencast I show you why using a test data generation tool makes sense. I introduce my favorite tool named factory_girl. It is written in Ruby, but there are libraries inspired by factory_girl for Python, PHP, Scala and JavaScript as well. You can find them below in the "Further readings" section.

Up next week: Behavior-Driven Integration and Unit Testing

In next week's Testing Tuesday #8 we'll talk about integration and unit testing and how to use them in behavior-driven development. We will meet our old friend Cucumber again and also introduce our new friend RSpec.

Managing test data

Ahoi and welcome! My name is still Clemens Helm and you're watching Codeship Testing Tuesday #7. As I promised you last week, today we'll take a look at managing test data. By test data I mean data used in automated tests. Integration tests in particular usually require a specific configuration of test data to run on. By the way, what's an integration test? Integration testing means that you test multiple components of your application together. For example, you test the user interface, the whole underlying web application and the database. Unit tests, in contrast, test only single components like models or controllers. We will cover this difference in next week's episode.

For now let's focus on how to create test data for our tests. One possibility is to simply insert it into the database before running the tests using SQL like this:

…
INSERT INTO users (name, age, female, city) VALUES ('Maggie', 42, true, 'Vienna');
INSERT INTO users (name, age, female, city) VALUES ('Kurt', 12, false, 'Seattle');
…

What's wrong with this? Most of all: It's not readable. You need to read a lot of unnecessary SQL syntax to figure out what this data even means.

A better option is to use a structured data file format like YAML:


Bob:
  name: Bob Dylan
  secret_question: How many roads must a man walk down before he can call him a man?
  secret_answer: Seven.

Janis:
  name: Janis Joplin
  secret_question: Oh Lord, won't you buy me a Mercedes Benz?
  secret_answer: Nope.

Much better. These structured datasets are called test fixtures. They are more readable, and you can easily parse this data and insert it into the database. This way you also don't depend on a specific database type and can migrate your tests easily to something like MongoDB later on.
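To make "you can easily parse this data" concrete, here is a minimal sketch using only Ruby's standard library. The constant names are made up for this illustration, and the fixture text is inlined so the example runs on its own; in a real project it would live in a YAML file.

```ruby
require "yaml"

# Fixture text inlined as a heredoc so the sketch is self-contained.
# FIXTURE_TEXT and FIXTURES are hypothetical names chosen for this example.
FIXTURE_TEXT = <<~FIXTURE_YAML
  Bob:
    name: Bob Dylan
    secret_question: How many roads must a man walk down before he can call him a man?
    secret_answer: Seven.
  Janis:
    name: Janis Joplin
    secret_question: Oh Lord, won't you buy me a Mercedes Benz?
    secret_answer: Nope.
FIXTURE_YAML

# YAML.safe_load turns the text into a plain Hash keyed by fixture name.
FIXTURES = YAML.safe_load(FIXTURE_TEXT)

FIXTURES.each do |key, attributes|
  # A real loader would insert each record into the database at this point.
  puts "#{key}: #{attributes['name']}"
end
```

Because the result is an ordinary Hash, the same data could be written to any database, which is the portability argument made above.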

So what we can do now is insert all our test data into the database first and then run our tests on it. Right?

Wrong. Most of the time we modify test data during the tests. That means, the next test has to deal with modified test data. That's not what we want. We want each test to run on fresh, unmodified data.

So we can simply re-generate all test data before each test. Right?

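Well, you could do that, but then every single test regenerates the data needed by all tests. As your test suite grows, both the number of tests and the amount of data loaded per test grow with it, so the total setup time increases roughly quadratically rather than linearly. With 100 tests that each need one record of their own, you would insert 100 × 100 = 10,000 records per run. That's definitely not what you want.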

Instead, we actually only want test data that's relevant in a test. So in your tests you could do something like

bob = load_fixture("Bob")
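The load_fixture call above is pseudocode. One way such a helper might look, assuming fixtures are stored as YAML, is sketched below. The helper and the constant are hypothetical, and the sketch returns plain hashes instead of saving anything to a database.

```ruby
require "yaml"

# Hypothetical fixture source; a real project would read this from a file.
USER_FIXTURE_TEXT = <<~FIXTURE_YAML
  Bob:
    name: Bob Dylan
    secret_answer: Seven.
FIXTURE_YAML

# Hypothetical helper: returns a fresh copy of one named fixture.
# Parsing on every call keeps the sketch simple and guarantees that
# changes made by one test cannot leak into the next one.
def load_fixture(name)
  fixtures = YAML.safe_load(USER_FIXTURE_TEXT)
  fixtures.fetch(name) # raises KeyError for an unknown fixture name
end

bob = load_fixture("Bob")
bob["name"] = "Somebody Else"      # one test modifies its copy ...
puts load_fixture("Bob")["name"]   # ... the next load starts from the original data
```

Each test only pays for the fixtures it actually asks for, and it always starts from unmodified data.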

But then, why do we need fixtures anyway? We could just create test data in each test! The reason is that fixtures are much more maintainable. Let's say you've got 200 tests using users and then you add a required attribute "secret_wish" to the user. Then you need to update your test data in all 200 places. If you use a small number of fixtures everywhere, you only need to update the fixtures.

Unfortunately, as your project grows, you usually need a large number of fixtures. You may need old and young, female and male, dead and alive users and combinations thereof. Also maybe you have to define 20 attributes for each user so it is a valid record, but you only need one per test.

This way you will end up with a huge amount of – mostly duplicate – fixture data. This will of course lead to the same problem: Test data becomes hard to maintain.

There are a number of tools that solve this problem. My favorite one is factory_girl. factory_girl is a fixture replacement written in Ruby, but there are also similar implementations in Python, PHP, Scala and JavaScript. Check out the further readings section to find out more.

In factory girl we can define fixtures as "factories" like this:

FactoryGirl.define do
  factory :user do
    name "Bob Dylan"
    secret_question "How many roads must a man walk down before he can call him a man?"
    secret_answer "Seven."
  end
end

You create a new user in the database like this:

user = FactoryGirl.create(:user)

But you can also decide to build a user in your application without saving it:

user = FactoryGirl.build(:user)

You can override the attributes defined in a factory when you build a user:

user = FactoryGirl.build(:user, name: "Janis Joplin")
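The idea behind overriding is simply "defaults first, then whatever the test specifies". The plain-Ruby sketch below illustrates that idea. It is not how factory_girl is implemented, and User, USER_DEFAULTS and build_user are names invented for this example.

```ruby
# Stand-in for a model class; a real application would use its own User model.
User = Struct.new(:name, :secret_question, :secret_answer, keyword_init: true)

# The attributes a "factory" would define once.
USER_DEFAULTS = {
  name: "Bob Dylan",
  secret_question: "How many roads must a man walk down before he can call him a man?",
  secret_answer: "Seven."
}.freeze

# Hypothetical helper: builds an unsaved user, letting overrides win over defaults.
def build_user(**overrides)
  User.new(**USER_DEFAULTS.merge(overrides))
end

janis = build_user(name: "Janis Joplin")
puts janis.name           # the overridden attribute
puts janis.secret_answer  # still the default
```

A test only spells out the attribute it cares about, and everything else comes from the shared definition.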

We can also tell a factory to generate a different attribute value every time. Let's say we only allow users to sign up once per email address, so every user needs a unique email:

FactoryGirl.define do
  factory :user do
    name "Bob Dylan"
    secret_question "How many roads must a man walk down before he can call him a man?"
    secret_answer "Seven."
    sequence(:email) { |number| "user#{number}@example.com" }
  end
end
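Conceptually, a sequence is a counter whose current value is passed to a block each time a new value is needed. The following plain-Ruby sketch shows that mechanism under that assumption. It is an illustration only, not factory_girl's implementation, and the class name is made up.

```ruby
# Hypothetical class: hands out 1, 2, 3, ... and passes each number to a block.
class SimpleSequence
  def initialize(&formatter)
    @counter = 0
    @formatter = formatter
  end

  # Advances the counter and returns the formatted value.
  def next
    @counter += 1
    @formatter.call(@counter)
  end
end

EMAILS = SimpleSequence.new { |number| "user#{number}@example.com" }

puts EMAILS.next
puts EMAILS.next
```

Every call yields a different address, so a uniqueness rule on the email column is never violated by the test data.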



One of the most useful features of factory girl is that factories can inherit from other factories. A good strategy is to keep all required attributes in one base factory and let all other factories inherit from it. That way you define the required attributes only once instead of in every single factory:

FactoryGirl.define do
  # Base factory: holds the attributes every user needs.
  factory :user do
    name "Bob Dylan"
    sequence(:email) { |number| "user#{number}@example.com" }
    # Nested factories inherit all attributes of :user.
    factory :user_with_secret do
      secret_question "How many roads must a man walk down before he can call him a man?"
      secret_answer "Seven."
    end
    factory :old_user do
      year_of_birth 1931
    end
  end
end

And then we can just say

bob = FactoryGirl.build(:user_with_secret)

This helps clean up our test data massively.
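To see why inheritance removes duplication, it can help to think of each factory as "the parent's attributes plus my own". The plain-Ruby sketch below models that idea with hashes. It is an assumption-laden illustration, not factory_girl's internals; FACTORY_DEFINITIONS and attributes_for_factory are hypothetical names, and the email sequence is left out to keep it short.

```ruby
# Each entry lists only its own attributes plus an optional parent.
FACTORY_DEFINITIONS = {
  user: {
    attributes: { name: "Bob Dylan" }
  },
  user_with_secret: {
    parent: :user,
    attributes: {
      secret_question: "How many roads must a man walk down before he can call him a man?",
      secret_answer: "Seven."
    }
  },
  old_user: {
    parent: :user,
    attributes: { year_of_birth: 1931 }
  }
}.freeze

# Hypothetical helper: collects the parent's attributes first, then lets the
# child's own attributes and any per-call overrides take precedence.
def attributes_for_factory(name, **overrides)
  definition = FACTORY_DEFINITIONS.fetch(name)
  inherited = definition[:parent] ? attributes_for_factory(definition[:parent]) : {}
  inherited.merge(definition[:attributes]).merge(overrides)
end

p attributes_for_factory(:old_user)
p attributes_for_factory(:user_with_secret, name: "Janis Joplin")
```

If a new required attribute is added to the base entry, every derived factory picks it up without any further changes.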

There are many other neat features factory girl has to offer, and there are many other excellent tools for managing your test data as well. I highly encourage you to check them out!

I hope you had a good time watching this episode! Next week we're gonna talk about integration and unit testing and how to use them in behavior-driven development.

Thanks for watching, see you next Testing Tuesday … and don't forget: Always stay shipping!

Further readings:

factory-girl-php (PHP) – https://github.com/breerly/factory-girl-php
rosie (JavaScript) – https://github.com/bkeepers/rosie
