Writing a Rake Task to Export a Model to CSV

CSV (comma separated values) files are a great way to export a collection of records for importing into other systems or for analysis of data. Here we’ll be exporting the records for a table that has a single has_many relation.

Our Tables and Data

class Foo < ApplicationRecord
  has_many :bars 
end

class Bar < ApplicationRecord
  belongs_to :foo
end

In our database are two Foo records:
1. Named ‘East’, with two Bar records named ‘Tom’ and ‘Barb’.
2. Named ‘Mountain’, with three Bar records named ‘Kurt’, ‘Marg’, and ‘Andy’

From this we’ll want to export a CSV file that looks like this:

ID,Name,Bars
1,East, Tom Barb
2,Mountain, Kurt Marg Andy

Setting Up Rake Task

First off we’ll want to set up the rake task. We’ll want to create a file at lib/tasks/export.rb.

Inside that file we’ll want to add the following code:

namespace :export do
  task foo: :environment do
    puts Foo.to_csv
  end
end

The above code will create a rake task that can be called via rake export:foo. This will call Foo.to_csv and then output it to the console.

We’ll want to construct #to_csv to generate a CSV file that we can use to export or save.

Foo#to_csv

Foo#to_csv is our class function that’ll generate a CSV file from all of the records in the given collection. This could be called on a scope to only export a subset of the records, or just as Foo.to_csv as in our rake task to export them all.

class Foo
  ...
  def self.to_csv(options = {})
    CSV.generate(options) do |csv_file|
      csv_file << csv_header_row

      all.each do |foo|
        csv_file << foo.to_csv
      end
    end
  end

  def self.csv_header_row
    %w(ID Name Bars)
  end
end 

This creates two functions self.to_csv, and self.csv_header_row. Note that these are each created as class functions using self.

The above code starts by calling CSV.generate, this takes a block with a parameter (called csv_file) containing what will be the end file, within the block we then add to csv_file to build up the whole file.

The first thing we do is add our header row to csv_file. This contains an array of column names that we want to have in our final spreadsheet. In this case this is contained in Foo#csv_header_row and is an array containing “ID”, “Name”, and “Bars”.

Next we iterate over all and insert the result of .to_csv from each record into csv_file.

The function will then return the result of CSV.generate, which is a CSV file generated from the 2D array we created in the block.

Foo.to_csv

Foo.to_csv is the instance function available on each Foo record that generates a single row for that record to insert into a CSV file.

In this case we want to return an array containing the record primary key, its name, and a list of names of each of its bars.

Our code should look like this:

class Foo < ApplicationRecord
  def to_csv
    [id, name, bars.map(&:name).join(' ')]
  end
end

This returns an array containing the id, name, and a string containing the name of each of the record’s has_many :bars relations.

bars.map(&:name) iterates over each bar and creates an array from the result of calling name on each record. .join(' ') takes those names and joins them together into a single string separating each element by a space.

Pitfalls: We make sure to join the bar names and make sure they are separated by spaces to make sure that they appear in the same cell in the spreadsheet as each other. If we inserted them as an array without joining they would be inserted as comma separated values and be in individual cells. This is the same reason we join with a space instead of a comma.

Outputting to a File

To output the data to a file we will want to modify our rake task to write the output of Foo.to_csv to a file.

namespace :export do
  task foo: :environment do
    file_name = 'exported_foo.csv'
    csv_data = Foo.to_csv

    File.write(file_name, csv_data)
  end
end

This will output your csv file to exported_foo.csv.

Adding a timestamp to your exported file

To add a timestamp to your exported file’s name adjust the file_name variable to look like this:

time_stamp = Time.now.strftime('%Y%m%dT%H%M')
file_name = "#{exported_foo_#{time_stamp}.csv"

This will add a timestamp to your exported files that’ll allow you to sort them more easily.

Heroku Notes

Keep in mind that on Heroku you don’t have a file system. Exporting to a file won’t work and you won’t be able to retrieve the file. You’ll either want to just output the file to the console and then manually copy it out and save it to a CSV file or use a Mailer to email it as an attachment to a designated email address.

Thanks for subscribing!
Like what I'm doing? Subscribe and I'll let you know when I write new stuff.
Want to unsubscribe?
Subscribe