Abstraction: use scope
to encapsulate database changes
Active record makes it easy to write simple queries anywhere in your application. The key phrase here is anywhere in your application: When database queries are scattered throughout your application, simple database changes may require modifications to your controllers, your views, helpers, mailers, etc.
ActiveRecord provides you with scope
which allows you to hide the details of your schema so that database changes don’t affect the entire application.
Here is an example of using a scope to look up a user by their email address:
ruby
class User < ActiveRecord::Base
has_many :emails
scope :by_email, -> (email) {
joins(:emails).where(emails: {address: email})
}
end
User.by_email('john@example.org')
# SELECT "users".* FROM "users"
# INNER JOIN "emails"
# ON "emails"."user_id" = "users"."id"
# WHERE "emails"."address" = 'john@example.org'
When you use a scope, if you change the schema, you can update the User model while everything else remains the same.
Find matching records using LIKE
and ILIKE
Let’s say we have been getting spam comments coming from the same email domain and we want to get rid of all comments from all users who have that particular domain. Here our scope needs to join to the emails table:
ruby
scope :by_email_domain, -> (domain) {
joins(:emails).where("emails.address like ?", "%@#{domain}")
}
# Find all users who have ever had an email address at example.org
User.by_email_domain('example.org')
# SELECT "users".* FROM "users"
# INNER JOIN "emails"
# ON "emails"."user_id" = "users"."id"
# WHERE (emails.address like '%@example.org')
like will retrieve all records where the email address includes the value returned by
#{domain}. You can also use
ilike for a case-insensitive version of
like.
## Eliminate duplicates by replacing a join with a sub-select
There is one gotcha with the above code: if there are multiple users at the same domain, the scope will return multiple records. To eliminate duplicates, you can replace the join with a sub-select. This has no impact on performance (at least if you are using Postgres).
__Performance tip #1:__ The sub-select is built by calling
#to_sql` on ActiveRecord::Relation to generate the sub-select. This avoids loading the records.
Performance tip #2: It also avoids name collision issues that arise when joining to the same table more than once.
ruby
scope :by_email_domain, -> (domain) {
user_ids = Email.by_domain(domain).select(:user_id)
where "users.id in (#{user_ids.to_sql})"
}
User.by_email_domain('example.org')
# SELECT "users".* FROM "users"
# WHERE (users.id in (
# SELECT "emails"."user_id" FROM "emails" WHERE (address like '%@example.org')
# ))
In the above example, we first execute a query that generates all the user ids (Email.by_domain(domain).select(:user_id)
). Until we actually try to use that result (by iterating through it, for example), it is just an ActiveRecord relation. It hasn’t been fetched from the database yet.
This is important because what we do on the next line is call to_sql
on it. This will return the SQL query that would have been run to fetch those records, but the user ids are not actually fetched.
Doing it this way maintains the performance advantages of having a single query but, because there is no join, you don’t have any duplicate records.
## Filter multiple values by passing in an array
To filter by multiple domains at once, you can pass in an array:
ruby
class Email
scope :by_domain, -> (domain) {
where <<-SQL, Array(domain).map {|e| "%#{e}"}
emails.address LIKE ANY (select unnest(ARRAY[?]))
SQL
}
end
Email.by_domain(['example.com', 'example.org'])
# SELECT "emails".* FROM "emails"
# WHERE (
# emails.address LIKE ANY (
# select unnest(ARRAY['%example.com','%example.org'])
# )
# )
Note: Array(domain)
converts the domain
parameter to an array if one item if it isn’t an array already.
When you are comparing a column value with values in an array, you use LIKE ANY
.
*******************
**Ask Daniel to write about it if he wants it included
here documents in ruby
<<-SQL
is a multi-line string
*********************
Combine conditions from multiple queries using Merge
ActiveRecord has a merge method on active relation that allows you to combine conditions from multiple queries. Let’s say you have a product model and you want to search either by the single search term “mazda”, or on multiple search terms such as “mazda, door, and handle”.
You would do this in both cases by using Product.matching_term
. When there are multiple terms, you can call Product.matching_term
on each of the terms one by one and then merging the results.
This can be useful in a number of scenarios, such as when you are creating an admin interface and you need to build up complex filtering.
ruby
# All products matching 'mazda'
Product.matching_term("mazda")
# Products matching 'mazda', 'door' and 'handle'
terms = %w( mazda door handle )
products = terms.map {|t| Product.matching_term(t) }.reduce(&:merge)
# select * from products
# where name ilike '%mazda%'
# and name ilike '%door%'
# and name ilike '%handle%'
Generate an OR query in ActiveRecord using Union
Union is great because it allows you to generate an OR
query in ActiveRecord, which is something that is surprisingly difficult to do.
UNION
will find everything in each of the component queries and will remove duplicates. If you do not want to remove duplicates, use UNION ALL
instead.
ruby
mazdas = Product.matching_search('Mazda')
mitsubishis = Product.matching_search("Mitsubishi")
mazdas_and_mitsubishis = Product.from(mazdas.to_sql + ' UNION ' + mitsubishis.to_sql)
# select * from products
# where name ilike '%Mazda%'
# UNION
# select * from products
# where name ilike '%Mitsubishi%'
Build up complex queries using common table expressions
Common table expressions allow you to progressively build up complex queries. For example, here is a scope that adds a calculated field to Campaign. It ranks campaigns into 1..5 according to how much money they have raised.
ruby
class CampaignInstanceUser < ApplicationRecord
scope :with_quintile, -> {
from <<-SQL
( with "quintile_table" as (
select
*,
ntile(5) over (order by total_raised_cents desc) as quintile
from "campaign_instance_users"
) select * from "quintile_table"
) as "campaign_instance_users"
SQL
}
end
CampaignInstanceUser.with_quintile.where(quintile: 5)
This code uses ntile
, which is a Postgres function that tells you where your record appears when the record set is sorted a certain way. ntile(5)
will give every record a number from 1 to 5 according to whether it is in the top 20%, the next 20%, and so on.
From the code example above, where quintile = 1
will give you the top 20% of campaigns according to how much money they have raised.
You Might Also Be Interested In
;