Plenty of articles and Stack Overflow answers echo the general opinion that `pluck` is the superior method to use over `map` and `select`. I agree on that for `map`, but I beg to differ for `select` in some cases, and for good reason.
What are the cases?
Usually, `pluck` is used when a complicated application needs to chain several queries together. A sample of this call that I usually see is:
    discounted_product_ids = Product.with_discounts.where(country: 'AU').pluck(:id)
    promo_product_ids = Product.with_promos.where(country: 'AU').pluck(:id)
    products = Product.where(id: discounted_product_ids + promo_product_ids)
This is a far more common practice than I expected, and it’s not totally bad. Where it becomes bad, though, is that it takes 3 round-trip queries to the database: one for `discounted_product_ids`, another for `promo_product_ids`, and a third that fetches the products by the combined ids.
Benchmarking
I’ve set up a simple application that has a `Project` model with 1 million records. It simply has a `name` string field and an `active` boolean field. Then I benchmarked `map`, `pluck`, and `select` examples. I’ve also added a `pluck + or` example just to make sure that `select`, rather than `or`, is what makes the impact. The app is kept deliberately simple since we want to observe the raw performance of the methods.
    require 'benchmark'

    class Benchmarking
      def call
        Benchmark.bm do |x|
          x.report('map') { use_map }
          x.report('pluck') { use_pluck }
          x.report('pluck + or') { use_pluck_with_or }
          x.report('select') { use_select }
        end
      end

      private

      def use_map
        active_ids = Project.where(active: true).map(&:id)
        inactive_ids = Project.where(active: false).map(&:id)
        Project.where(id: active_ids + inactive_ids)
      end

      def use_pluck
        active_ids = Project.where(active: true).pluck(:id)
        inactive_ids = Project.where(active: false).pluck(:id)
        Project.where(id: active_ids + inactive_ids)
      end

      def use_pluck_with_or
        active = Project.where(active: true)
        inactive = Project.where(active: false)
        Project.where(id: active.or(inactive).pluck(:id))
      end

      def use_select
        active_ids = Project.where(active: true).select(:id)
        inactive_ids = Project.where(active: false).select(:id)
        Project.where(id: active_ids.or(inactive_ids))
      end
    end
Results
                    user     system      total         real
    map         4.227708   0.058591   4.286299 (  4.532398)
    pluck       0.410575   0.015285   0.425860 (  0.562810)
    pluck + or  0.391582   0.011911   0.403493 (  0.492739)
    select      0.000305   0.000015   0.000320 (  0.000320)

`pluck` is 8.1x faster than `map`
`pluck + or` is 1.1x faster than `pluck`
`select` is 1758.8x faster than `pluck`
`select` is 1539.8x faster than `pluck + or`
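For anyone who wants to check the arithmetic, the multipliers above appear to be ratios of the wall-clock ("real") column. The short sketch below recomputes them from the numbers in the table; the constant and method names are mine, not part of the benchmark class.

```ruby
# Wall-clock ("real") seconds copied from the results table.
REAL_SECONDS = {
  "map"        => 4.532398,
  "pluck"      => 0.562810,
  "pluck + or" => 0.492739,
  "select"     => 0.000320
}.freeze

# How many times faster `faster` is than `slower`, to one decimal place.
def speedup(slower, faster)
  (REAL_SECONDS.fetch(slower) / REAL_SECONDS.fetch(faster)).round(1)
end

puts "pluck vs map:         #{speedup('map', 'pluck')}x"
puts "pluck + or vs pluck:  #{speedup('pluck', 'pluck + or')}x"
puts "select vs pluck:      #{speedup('pluck', 'select')}x"
puts "select vs pluck + or: #{speedup('pluck + or', 'select')}x"
```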
We can see why `pluck` is the go-to solution for most Rails developers. It’s 8.1x faster than `map`, since it reads the ids straight from the result set instead of instantiating a model object for every row.
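However, the speed of `select`, when used properly, is just a wonder. It’s 1758.8x faster than `pluck` in this benchmark. In real use, the `pluck` version goes to the database 3 times and carries every id into Ruby and back, while the `select` version folds everything into 1 query with the `.or` method. Keep in mind that `select` returns a lazy relation and the benchmark never loads the final result, so the `select` timing mostly reflects building the query rather than running it; expect the gap to narrow once the records are actually loaded.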
Caveats
In most cases where we need to build complex database queries, we need the `or` method to combine `select` relations. However, `or` was only added in Rails 5.0, so on older Rails projects it cannot be used. There, you’ll probably keep using `select` for as long as you can and fall back to `pluck` when you need to combine ids.
Conclusion
For us Rails developers, performance is an important topic. We’d like to maximise what Rails can do for our websites. So it’s worth considering where we can use `select` instead of `pluck`, as the advantage is tremendous.