When to Use Select Instead of Pluck

Miko Dagatan
June 22, 2023

There are articles and Stack Overflow answers that reflect the general opinion that pluck is the superior method to use over map and select. I agree when it comes to map, but I beg to differ for select in some cases, and for good reason.

What are the cases?

Usually, pluck shows up when a complicated application needs to chain several queries together. A call of this kind that I often see is:


discounted_product_ids = Product.with_discounts.where(country: 'AU').pluck(:id)
promo_product_ids = Product.with_promos.where(country: 'AU').pluck(:id)

products = Product.where(id: discounted_product_ids + promo_product_ids)

This practice is far more common than I expected, and it’s not entirely bad. Where it becomes bad, though, is that it takes 3 round trips to the database: one for discounted_product_ids, another for promo_product_ids, and a third that fetches the products by the combined ids.

Benchmarking

I’ve set up a simple application with a Project model that has 1 million records. It has just two fields: a name string and an active boolean. I then benchmarked map, pluck, and select examples. I also added a pluck + or example to make sure the gain comes from select rather than from or. The app is kept deliberately simple because we want to observe the raw performance of the methods.


require 'benchmark'

class Benchmarking
  def call
    Benchmark.bm do |x|
      x.report('map') { use_map }
      x.report('pluck') { use_pluck }
      x.report('pluck + or') { use_pluck_with_or }
      x.report('select') { use_select }
    end
  end

  private

  def use_map
    active_ids = Project.where(active: true).map(&:id)
    inactive_ids = Project.where(active: false).map(&:id)

    Project.where(id: active_ids + inactive_ids)
  end

  def use_pluck
    active_ids = Project.where(active: true).pluck(:id)
    inactive_ids = Project.where(active: false).pluck(:id)

    Project.where(id: active_ids + inactive_ids)
  end

  def use_pluck_with_or
    active = Project.where(active: true)
    inactive = Project.where(active: false)

    Project.where(id: active.or(inactive).pluck(:id))
  end

  def use_select
    active_ids = Project.where(active: true).select(:id)
    inactive_ids = Project.where(active: false).select(:id)

    Project.where(id: active_ids.or(inactive_ids))
  end
end

Results


               user     system       total        real
map        4.227708   0.058591    4.286299 (  4.532398)
pluck      0.410575   0.015285    0.425860 (  0.562810)
pluck + or 0.391582   0.011911    0.403493 (  0.492739)
select     0.000305   0.000015    0.000320 (  0.000320)

`pluck` is 8.1x faster than `map`
`pluck + or` is 1.1x faster than `pluck`
`select` is 1758.8x faster than `pluck`
`select` is 1539.8x faster than `pluck + or`
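
The ratios above come from dividing the wall-clock ("real") column of the benchmark output. Here is a small sketch of that arithmetic, with the timings copied from the table. The REAL_SECONDS constant and the speedup helper are names made up for this example.

```ruby
# Wall-clock ("real") seconds copied from the benchmark output above.
REAL_SECONDS = {
  'map' => 4.532398,
  'pluck' => 0.562810,
  'pluck + or' => 0.492739,
  'select' => 0.000320
}.freeze

# How many times faster `fast` ran than `slow`, rounded to one decimal place.
def speedup(fast, slow)
  (REAL_SECONDS.fetch(slow) / REAL_SECONDS.fetch(fast)).round(1)
end

puts "pluck vs map:         #{speedup('pluck', 'map')}x"
puts "pluck + or vs pluck:  #{speedup('pluck + or', 'pluck')}x"
puts "select vs pluck:      #{speedup('select', 'pluck')}x"
puts "select vs pluck + or: #{speedup('select', 'pluck + or')}x"
```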

We can see why pluck is the go-to solution for most Rails developers. It’s 8.1x faster than map, so it’s a clear win whenever the alternative is map.

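However, the speed of select, when used properly, is just a wonder. It comes out 1758.8x faster than pluck because select(:id) only builds a lazy relation, which Rails embeds as a subquery: combined with .or, the whole lookup becomes 1 query instead of pluck’s 3. Keep in mind that the benchmark never loads the final relation, so the select figure measures building the query rather than running it; expect a smaller gap once the records are actually fetched.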
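
To make the round-trip count concrete, here is a minimal pure-Ruby sketch. ToyDB and ToyRelation are hypothetical stand-ins invented for illustration, not ActiveRecord classes. They only mimic the idea that pluck runs a query immediately, while a relation stays lazy until something loads it.

```ruby
# Toy model only: ToyDB and ToyRelation are hypothetical, not ActiveRecord.

# Pretend database connection that records every query sent to it.
class ToyDB
  attr_reader :queries

  def initialize
    @queries = []
  end

  def execute(sql)
    @queries << sql
    [1, 2, 3] # canned ids standing in for real rows
  end
end

# Lazy query object: building or nesting it never touches the database.
class ToyRelation
  attr_reader :sql

  def initialize(db, sql)
    @db = db
    @sql = sql
  end

  # Lazy, like where(id: ...): an Array is inlined, a relation becomes a subquery.
  def self.where_id_in(db, source)
    list = source.is_a?(ToyRelation) ? source.sql : source.join(', ')
    new(db, "SELECT * FROM projects WHERE id IN (#{list})")
  end

  # Eager, like pluck: one round trip right away, returns a plain Array.
  def pluck
    @db.execute(@sql)
  end

  # Runs the query; the only point where a lazy chain costs a round trip.
  def load
    @db.execute(@sql)
  end
end

# Two eager id lookups plus the final fetch.
def pluck_round_trips
  db = ToyDB.new
  active = ToyRelation.new(db, 'SELECT id FROM projects WHERE active').pluck
  inactive = ToyRelation.new(db, 'SELECT id FROM projects WHERE NOT active').pluck
  ToyRelation.where_id_in(db, active + inactive).load
  db.queries.size
end

# The id lookup stays lazy and is nested into the final fetch as a subquery.
def select_round_trips
  db = ToyDB.new
  ids = ToyRelation.new(db, 'SELECT id FROM projects WHERE active OR NOT active')
  ToyRelation.where_id_in(db, ids).load
  db.queries.size
end

puts "pluck style:  #{pluck_round_trips} round trips"
puts "select style: #{select_round_trips} round trip"
```

In this toy model, a relation that is built but never loaded issues no query at all, which is worth remembering when timing lazy code.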

Caveats

In most cases where we need to build complex queries, we’ll need the or method to combine select statements. However, if your Rails project is older than v5.0, or is not available, so you’ll probably keep using select for as long as you can and fall back to pluck when you need the combined ids.
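
If you are unsure whether your project has or, one option is to check for the method instead of comparing version numbers. The sketch below is only an illustration: LegacyRelation, ModernRelation and supports_or? are hypothetical names, and in a real application you would presumably pass ActiveRecord::Relation instead of the toy classes.

```ruby
# Hypothetical stand-in for a relation class that predates `or`.
class LegacyRelation
end

# Hypothetical stand-in for a relation class that offers `or`.
class ModernRelation
  def or(_other)
    self
  end
end

# True when instances of the given class respond to `or`.
def supports_or?(relation_class)
  relation_class.method_defined?(:or)
end

# Pick an approach by name; the symbols are labels for this example only.
def id_strategy(relation_class)
  supports_or?(relation_class) ? :select_with_or : :pluck_and_merge
end

puts id_strategy(ModernRelation)
puts id_strategy(LegacyRelation)
```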

Conclusion

For us Rails developers, performance is an important topic. We’d like to maximise what Rails can do on our websites. So it’s worth looking for places where select can replace pluck, as the advantage can be tremendous.