Implementing Full-Text Search in Rails with Postgres
Ryan Stenberg, Former Developer
Need full-text search in your Rails app? Running Postgres as your DB? PgSearch is just the thing for you!
With very little effort, you can tap into Postgres' native full-text search functionality with PgSearch. PgSearch also exposes a number of options/configurations that allow you to tweak how full-text search happens inside Postgres. Alright, let's do it!
Step 1: Gemfile #
Add it for great justice!
gem 'pg_search'
Step 2: Single-Model Search or Multi-Model #
PgSearch offers two distinct search strategies, depending on whether you need to search against a single model or several. Most of the configuration and options are specific to one strategy or the other.
Let's start out with Single-Model.
Step 3 (Single-Model): Configure the Model #
PgSearch gives you the class-level pg_search_scope method for configuration. Feed it a name for the search scope, the columns to search against, and any additional fine-tuning via the using option. Given a basic BlogPost model, you'd probably have something like this:
class BlogPost < ActiveRecord::Base
  # Newer pg_search releases expect `include PgSearch::Model` here instead.
  include PgSearch
  pg_search_scope :search_for, against: %i(title body)
end
When I set it up on a recent project, I was searching against a single JSON-type column where I wanted to match against multiple words. Here's what that looked like:
pg_search_scope :search_content_for, against: :content, using: { tsearch: { any_word: true } }
That's pretty much all you need. You're sporting Postgres-powered full-text search in two lines of code.
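To make the any_word option a bit more concrete, here's a minimal plain-Ruby sketch of the idea: by default every term has to match (AND), while any_word relaxes that to any term (OR). The build_tsquery helper is hypothetical and only mimics tsquery syntax; the SQL PgSearch actually generates is more involved (escaping, dictionaries, and so on).

```ruby
# Hypothetical helper (not part of PgSearch): combine search terms into a
# tsquery-style string. "&" means AND and "|" means OR in tsquery syntax.
def build_tsquery(query, any_word: false)
  operator = any_word ? ' | ' : ' & '
  query.split.join(operator)
end

puts build_tsquery('postgres search')                 # every term must match
puts build_tsquery('postgres search', any_word: true) # any term may match
```

With any_word off, a record needs both "postgres" and "search" to match; with it on, either word is enough.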
Digging a Little Deeper: pg_search_scope #
PgSearch allows you to customize a handful of things through the pg_search_scope method and the options it takes. I wanted to briefly touch on some of these to give an idea of what you can do with PgSearch.
The against option, as we saw, takes a single column or an array of columns. It also supports weighting! To weight the columns, pass either a hash whose values are the weights or a two-dimensional array whose second elements are the weights. Each weight is A, B, C, or D:
pg_search_scope :search_full_text, against: {
  title: 'A',
  content: 'B'
}

# Equivalent form using a two-dimensional array:
pg_search_scope :search_full_text, against: [
  [:title, 'A'],
  [:content, 'B']
]
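To get a feel for what those letters do, here's a toy sketch in plain Ruby. The weight values are PostgreSQL's default ts_rank weights for D, C, B, and A (0.1, 0.2, 0.4, 1.0). Everything else is an assumption for illustration: toy_rank is a hypothetical helper, and real ranking also accounts for things like term frequency, so treat this as intuition rather than the actual formula.

```ruby
# PostgreSQL's default ts_rank weights for the A, B, C, and D labels.
WEIGHTS = { 'A' => 1.0, 'B' => 0.4, 'C' => 0.2, 'D' => 0.1 }.freeze

# Hypothetical toy scoring: add a column's weight when that column
# contains the term. This is NOT how ts_rank is really computed.
def toy_rank(record, term, against)
  against.sum do |column, label|
    record.fetch(column).downcase.include?(term.downcase) ? WEIGHTS.fetch(label) : 0.0
  end
end

against     = { title: 'A', content: 'B' }
title_hit   = { title: 'Postgres tips', content: 'Indexes and more.' }
content_hit = { title: 'Weekly notes',  content: 'Some postgres tips.' }

puts toy_rank(title_hit, 'postgres', against)   # match in the A-weighted column
puts toy_rank(content_hit, 'postgres', against) # match in the B-weighted column
```

The takeaway: a hit in an A-weighted column counts for more than a hit in a B-weighted one, so title matches float to the top.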
The using: option is the thing that lets you tap into Postgres full-text search features:
- tsearch: PostgreSQL's built-in full-text search supports weighting, prefix searches, and stemming in multiple languages.
- dmetaphone: Double Metaphone is an algorithm for matching words that sound alike even if they are spelled very differently. For example, "Geoff" and "Jeff" sound identical and thus match. Currently, this is not a true double-metaphone, as only the first metaphone is used for searching.
- trigram: Trigram search works by counting how many three-letter substrings (or "trigrams") match between the query and the text.
PostgreSQL ships with everything you need for tsearch, but the other two types depend on PostgreSQL extensions that have to be installed and enabled separately: fuzzystrmatch for dmetaphone and pg_trgm for trigram.
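If trigram matching feels abstract, here's a rough plain-Ruby sketch of the idea. It follows the way pg_trgm is documented to work (lowercase each word, pad it with two leading spaces and one trailing space, then compare the sets of three-character substrings), but the trigrams and similarity helpers here are my own illustration, not PgSearch or Postgres code.

```ruby
require 'set'

# Split text into words, pad each word ("  word "), and collect every
# three-character substring into a set.
def trigrams(text)
  text.downcase.scan(/[[:alnum:]]+/).flat_map do |word|
    padded = "  #{word} "
    (0..padded.length - 3).map { |i| padded[i, 3] }
  end.to_set
end

# Similarity = shared trigrams / total distinct trigrams across both strings.
def similarity(a, b)
  ta = trigrams(a)
  tb = trigrams(b)
  return 0.0 if ta.empty? || tb.empty?

  (ta & tb).size.to_f / (ta | tb).size
end

puts trigrams('cat').to_a.inspect
puts similarity('word', 'two words').round(4)
```

Identical strings score 1.0, unrelated ones score close to 0, and a search counts as a match once the score clears a configurable threshold.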
In the example above from my own project, I used tsearch to tap into the any_word option. tsearch accepts other options as well, such as prefix, dictionary, and normalization.
Step 3 (Multi-Model): Configuration! #
Now that we've touched on how to set up PgSearch for Single-Model, let's take a look at Multi-Model.
Step 3.1: Run PgSearch's Multi-Model Generator #
To support multi-model search, PgSearch sets up a PgSearch::Document model with its own database table. To generate the migration for that table and apply it, run the following from your Rails app's project root:
$ rails g pg_search:migration:multisearch
$ bundle exec rake db:migrate
Step 3.2: Specify the Models to Include in Multi-Search #
Here's our BlogPost model from before (demonstrating conditional inclusion in multi-search results based on a published flag):
class BlogPost < ActiveRecord::Base
  # Newer pg_search releases expect `include PgSearch::Model` here instead.
  include PgSearch
  multisearchable against: %i(title body), if: :published?
end
Step 3.3: Optional Initializer #
We can optionally configure multi-search in an initializer. In my case, I still wanted to return results where any word matched:
# config/initializers/pg_search.rb
PgSearch.multisearch_options = {
  using: { tsearch: { any_word: true } }
}
Digging a Little Deeper: PgSearch::Document #
Going back to the PgSearch::Document model: it contains a polymorphic association that points to an instance of one of the models being searched, as well as a text column that aggregates the string contents of each searched column from that multi-searchable model. When you search against multiple models, you're really just searching against PgSearch::Document, which serves as the aggregation of all text across your models.
If we had a Comment model alongside our BlogPost model, and we wanted to search against a comment's body along with the title and body of any blog posts, PgSearch would build a PgSearch::Document record for each BlogPost and Comment. Let's look at some mock data to demonstrate how it works.
Given the following records:
post1 = BlogPost.create(title: 'Single-Model Search', body: 'So easy.')
post2 = BlogPost.create(title: 'Multi-Model Search', body: 'Surprisingly easy.')
comment = Comment.create(body: 'PgSearch makes search easy!')
We'd end up with PgSearch::Document records like this:
[
  #<PgSearch::Document:0x007fab39232af8
    id: 1,
    content: "Single-Model Search So easy.",
    searchable_id: 1,
    searchable_type: "BlogPost"
  >,
  #<PgSearch::Document:0x007fab39232af8
    id: 2,
    content: "Multi-Model Search Surprisingly easy.",
    searchable_id: 2,
    searchable_type: "BlogPost"
  >,
  #<PgSearch::Document:0x007fab39232af8
    id: 3,
    content: "PgSearch makes search easy!",
    searchable_id: 1,
    searchable_type: "Comment"
  >
]
The actual full-text search functions the same as it did in the single-model strategy now that everything's contained in our PgSearch::Document records.
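Here's a small plain-Ruby sketch of that aggregation, in case it helps to see it spelled out. Nothing here touches ActiveRecord or PgSearch: Post and Comment are Struct stand-ins, and build_document is a hypothetical helper that mimics what ends up in a document row (the searched columns joined with spaces, plus the polymorphic id and type). The condition argument plays the role of the if: option.

```ruby
# Plain-Ruby stand-ins for the models; these are not ActiveRecord classes.
Post    = Struct.new(:id, :title, :body, :published)
Comment = Struct.new(:id, :body)

# Hypothetical helper mimicking the data held by a PgSearch::Document row.
# Returns nil when the record fails its condition (like `if: :published?`).
def build_document(record, against:, condition: nil)
  return nil if condition && !record.public_send(condition)

  {
    content: against.map { |column| record.public_send(column) }.join(' '),
    searchable_id: record.id,
    searchable_type: record.class.name
  }
end

records = [
  [Post.new(1, 'Single-Model Search', 'So easy.', true), %i[title body], :published],
  [Post.new(2, 'Draft Post', 'Not ready.', false),       %i[title body], :published],
  [Comment.new(1, 'PgSearch makes search easy!'),        %i[body],       nil]
]

# Build one document per record, dropping records that fail their condition.
documents = records.map do |record, columns, condition|
  build_document(record, against: columns, condition: condition)
end.compact

documents.each { |doc| puts doc[:content] }
```

The unpublished post never gets a document, which is why it can't show up in multi-search results.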
Step 4: Use It! #
With our single-model example, search is as simple as:
BlogPost.search_for('postgres 5ever')
With multi-model:
PgSearch.multisearch('easy')
And for extra fun, tack .with_pg_search_rank onto either of those search scopes to expose pg_search_rank on the returned records. It shows the numeric relevance ranking from Postgres. I found pg_search_rank helpful when validating search in my tests and also when used in a multi-condition sort.
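As a rough illustration of the multi-condition sort idea, here's an in-memory sketch in plain Ruby. The Result struct and its comment_count field are made up for the example; in a real app you'd more likely express the same ordering in the database query rather than sorting in Ruby.

```ruby
# Hypothetical in-memory stand-in for records returned with .with_pg_search_rank.
Result = Struct.new(:title, :pg_search_rank, :comment_count)

results = [
  Result.new('Multi-Model Search', 0.06, 3),
  Result.new('Single-Model Search', 0.09, 1),
  Result.new('Search Gotchas', 0.06, 8)
]

# Sort by rank (highest first), then break ties by comment_count (highest first).
sorted = results.sort_by { |r| [-r.pg_search_rank, -r.comment_count] }

sorted.each { |r| puts "#{r.title} (#{r.pg_search_rank})" }
```

The two records tied on rank get ordered by the secondary condition, which is the kind of tie-breaking the rank value makes possible.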
Parting Words #
I have two "gotchas" to share before we part ways.
- I couldn't use distinct with PgSearch (issue), so I had to fall back to calling .to_a.uniq on the final result set. This was necessary because you'll get multiple instances of the same record if it matches against multiple keywords.
- Results using full-text search are automatically ordered by relevance (pg_search_rank). To override the ordering, you have to apply the .reorder scope.
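For the first gotcha, here's what the .to_a.uniq fallback does, sketched with plain Ruby objects. Row is a hypothetical stand-in for a model instance. The useful property is that Array#uniq keeps the first occurrence of each element, so the relevance ordering of the remaining results is preserved.

```ruby
# Hypothetical stand-in for model instances; Structs with equal members
# are treated as duplicates by Array#uniq.
Row = Struct.new(:id, :title)

# The same record shows up twice because it matched two keywords.
rows = [
  Row.new(2, 'Multi-Model Search'),
  Row.new(1, 'Single-Model Search'),
  Row.new(2, 'Multi-Model Search')
]

# uniq drops later duplicates and keeps the original order.
unique_rows = rows.uniq

unique_rows.each { |row| puts "#{row.id}: #{row.title}" }
```

Keep in mind that calling .to_a loads the whole result set into memory, so this works best when the result set is reasonably small.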
Overall, PgSearch was a really pleasant surprise that made me love Postgres and the Ruby/Rails community even more. It's powerful, simple, and will likely cover the majority of search use cases.