Using database views for increased performance in Rails
tl;dr
I used the Scenic gem from Thoughtbot to create database views for a page that needed data from a large number of tables in a Rails application, reducing the total database queries from 190 to to 12 and reducing the local page load times from 27.7s to 1.3s.
The Setup
Working with Rails provides developers with a lot of niceities. There is a lot of magic and abstraction at play that can make tasks significantly easier. The flip side of that is sometimes those abstractions are so easy that you develop yourself into a corner and the next thing you know you have a page that takes 30 seconds to load on your local server.
That’s where I found myself in not too long ago. Like a lot of projects I work on I approached this particular feature with a “make it work and then make it better” attitude. By the time I got to the “make it better” part I realized that I had created a problem for myself that really had to be rethought from the beginning.
The feature in question was relatively simple from a design perspective (yay CSS Grid!). In order to meet some of the design requirements, however, I needed different pieces of data from slightly more than ten different tables in the database. This is where my slow down woes come to a culmination. Preloading wasn’t quite helping. Restructuring the existing application to support this new design seemed like overkill. Further abstractions would only make it harder on the next developer to make tweaks.
I found myself thinking aloud: I wish there was a magic table that had all of the informtion in it already preformatted. Fortunately a little birdie replied saying “You’re describing database views”. Having heard of a new term for the first time I went to googling! Not only was this a magical feature that databases have supported for a long time but it even has first class support in Rails.
What is a database view?
I think the best way to explain this would be by pulling some information from Postgres’ documentation:
Making liberal use of views is a key aspect of good SQL database design. Views allow you to encapsulate the details of the structure of your tables, which might change as your application evolves, behind consistent interfaces.
Views can be used in almost any place a real table can be used. Building views upon other views is not uncommon.
So, let’s say you ran the backend for a retail chain and you often referred to each store’s sale numbers. You could create a new view that encapsulates that data like this:
CREATE VIEW store_sales AS
SELECT
stores.id,
SUM(orders.total)
FROM
stores,
orders
WHERE
stores.id = orders.store_id
GROUP BY
stores.id;
Now instead of duplicating a query in every place you need to reference that data you can treat the view like it’s a table itself:
SELECT * from store_sales;
I like to think of database views as a way to encapsulate (or save) complicated queries so that you can easily refer to them later. It’s important to remember that a database view isn’t duplicating any data, though; nothing new is being saved or persisted in the database.
Enter: Scenic
Like most things Rails-related I was sure there would be a gem that could help me on this mission. I was happy to find one that not only specifically worked with Postgres (the database choice for this application) but that it was made by the folks at Thoughtbot who I’ve always admired.
Scenic provides a wrapper around Rails’ own support for database views. I don’t like adding gems to applications willy-nilly but Scenic sold me on a few great features:
- Ability to version views
- Ability to write the views in SQL files (instead of in the migrations themselves)
- Command lind view generator
After adding gem scenic
to your Gemfile
and running $ bundle install
you can start creating views! Using our retail store example above we can create our first view:
$ rails generate scenic:view store_sales
create db/views/store_sales_v01.sql
create db/migrate/[TIMESTAMP]_create_store_sales.rb
Next we need to edit the SQL file that’s been created at db/views/store_sales_v01.sql
:
-- db/views/store_sales_v01.sql
SELECT
stores.id,
SUM(orders.total) AS sales_total
FROM
stores,
orders
WHERE
stores.id = orders.store_id
GROUP BY
stores.id;
Now we can run $ rake db:migrate
and, as my friends over at Thoughtbot say: baby, you got a view goin’.
Our view will return data in a format that looks like the table below. We get all of the store IDs in one column and then the sum of all of the sales for that store in the other:
+----+-------------+
| id | sales_total |
+----+-------------+
| 1 | 18200.00 |
| 2 | 1700.00 |
+----+-------------+
Treating our view like a table
Now that we have a view created we can query it like we would any other table:
SELECT * FROM store_sales;
But our Rails integration is about to get a heck of a lot cooler. We’re able to create a model for this table and treat it like an ActiveRecord
object! This was my big “ah-ha!” moment of digging into this. So instead of querying this with straight SQL we can create a new model:
class StoreSale < ActiveRecord::Base
def readonly?
true
end
end
Note: the
readyonly?
isn’t necessary as any saves will fail at the database level but this method will prevent Rails itself from even trying.
Now we can get the sales number for a store in a much more Rails-friendly way:
# Find a single store's sales total
StoreSale.find(STORE_ID).sales_total
# Query for multiple stores
store_sales = StoreSale.where(id: [STORE_IDS])
This approach becomes even more powerful with the addition of associations. If our view also included an id of another model like this:
+----+-------------+--------+
| id | sales_total | foo_id |
+----+-------------+--------+
| 1 | 18200.00 | 101 |
| 2 | 1700.00 | 451 |
+----+-------------+--------+
We can modify our model:
class StoreSale < ActiveRecord::Base
belongs_to :foo
def readonly?
true
end
end
And now we can more efficiently preload Foo
if we need to with ActiveRecord
:
store_sales = StoreSale.where(id: [STORE_IDS]).includes(:foo)
This was extremely important for the project that I was working on because I needed to deal with Paperclip attachments that I could not generate a URL for in the SQL that generated the view. Creating an association allowed me to preload all of the required paperclip attachments, limiting a number of N+1 queries, and simplifying the overall implimentation.
The Results
Is it fair to compare something super optimized against something created while ignoring performance? Not really but the numbers are so rewarding to compare!
Before: 27.7s, 190 SQL queries
After: 1.3s, 12 SQL queries
This was a page that was already being fairly heavily cached in the production environment so reporting on performance improvements there is a little tougher. What I do know is that everytime those caches were cleared users would experience a significant slow down until the cache was rebuilt. Implementing this change allows that interim-cache time to be just as speedy.
Using this approach may even allow us to reduce the heavy caching strategy previously implemented now that we can provide speedier page loads. Pulling back on our use of caching might allow us to deliver personalized pages that weren’t possible before.