Optimize ActiveRecord Queries: Master .exists?, .any?, and .size

In a lot of Ruby on Rails projects, you would often need to do a presence checking of an ActiveRecord Object or need the actual count of a specific ActiveRecord Model. And that happens multiple times in your application. And in some cases that could result in making your application slow due to inefficient or excessive queries on the database.

Presence Checking

Often times we do a lot of presence checking in our application and this would be the most common queries we do in our database. Rails out of the box offers a lot of ways to check the existence of a record in the database and the most common one is using .present?. We often use the presence checking for conditionals in our application like we want to do a certain action or render something only if a record exists.

In these certain scenarios, We should most of the time, utilise the .exists? method. For example we have a User Table and we want to check if a User that has signed in 5 times already exists and the User Table already has 10,000 records on it

# .present?
User.where(sign_in_count: 5).present?
User Load (5.5ms)  SELECT "users".* FROM "users" WHERE "users"."sign_in_count" = $1  [["sign_in_count", 5]]

# .exists?
User.where(sign_in_count: 5).exists?
User Exists? (0.7ms)  SELECT 1 AS one FROM "users" WHERE "users"."sign_in_count" = $1 LIMIT $2  [["sign_in_count", 5], ["LIMIT", 1]]

One thing to notice is that the SQL queries are quite similar but the thing with .exists? is it is only Selecting 1 record and limiting the result to 1. This could make a huge difference if a certain Table or ActiveRecord Model has already a lot of records in it. Unlike with .present?, It would load all of the database records and from there, check if the result is empty or not. So in most cases it would be recommended to use .exists?. But what if the records are already loaded? e.g in a controller action you have an instance variable of @users, Then it would be best to opt in using .any?. This method would not do an additional SQL call

@users = User.where(sign_in_count: 5)

@users.exists?
User Exists? (1.7ms)  SELECT 1 AS one FROM "users" WHERE "users"."sign_in_count" = $1 LIMIT $2  [["sign_in_count", 5], ["LIMIT", 1]]
=> true

@users.any?
=> true

In real world usage, It would be recommended to use exists? or any? depending on the situation.

Fetching Count Data

By the phrase itself, If we want to fetch the number of records for a specific Table or ActiveRecord Model, We would do .count. In real world scenarios, We would often have a variable loading all the records for a specific table e.g @users = User.all. But the thing would using .count, It would then perform another SQL Query.

@users = User.all
User Load (58.6ms)  SELECT "users".* FROM "users"

@users.count
(8.8ms)  SELECT COUNT(*) FROM "users"

Thus causing another database hit. In these scenarios we would then opt to use .size. For context, .size will only perform a SQL Query only if the records is not loaded yet. If the records are already loaded, it would then return the total number of items in that collection

@users = User.all
@users = User.all
User Load (58.6ms)  SELECT "users".* FROM "users"
=> 10000

@users.size
=> 10000

By using .size, This would eliminate the second SQL call and would not have to do a database call.

That’s it! Applying these approaches can cause significant improvements given in a lot of projects, presence checking and fetching data count happens in a lot of our application’s pages and requests and with large scale applications, reducing SQL calls will help a lot in terms of performance.

ActiveRecord Optimisation: Utilising .exists?, .any? and .size

Presence Checking

Fetching Count Data