A few earlier articles argued that, in terms of performance, Structs are powerful and could replace Classes for some of our code. Two of these are this one and this one.
Let's revisit those claims with the latest Ruby version, 3.4.1, to see whether they still hold.
require 'benchmark'
require 'faker'

class BenchmarkHashStruct
  class << self
    NUM = 1_000_000

    # Runs every scenario in turn and prints one timing line per scenario.
    def measure
      array
      hash_str
      hash_sym
      klass
      struct
      data
    end

    # Anonymous class, memoized so that it is defined only once.
    def new_class
      @class ||= Class.new do
        attr_reader :name

        def initialize(name:)
          @name = name
        end
      end
    end

    def array
      time = Benchmark.measure do
        NUM.times do
          array = [Faker.name]
          array[0]
        end
      end
      puts "array: #{time}"
    end

    def hash_str
      time = Benchmark.measure do
        NUM.times do
          hash = { 'name' => Faker.name }
          hash['name']
        end
      end
      puts "hash_str: #{time}"
    end

    def hash_sym
      time = Benchmark.measure do
        NUM.times do
          hash = { name: Faker.name }
          hash[:name]
        end
      end
      puts "hash_sym: #{time}"
    end

    def struct
      time = Benchmark.measure do
        # Define the Struct class once, outside the loop; only instances are created per iteration.
        struct = Struct.new(:name)
        NUM.times do
          init = struct.new(name: Faker.name)
          init.name
        end
      end
      puts "struct: #{time}"
    end

    def klass
      time = Benchmark.measure do
        klass = new_class
        NUM.times do
          a = klass.new(name: Faker.name)
          a.name
        end
      end
      puts "class: #{time}"
    end

    def data
      time = Benchmark.measure do
        # Like the Struct above, the Data class is defined once, outside the loop.
        name_data = Data.define(:name)
        NUM.times do
          a = name_data.new(name: Faker.name)
          a.name
        end
      end
      puts "data: #{time}"
    end
  end
end
In this file, we benchmark six ways of holding a single value: arrays, hashes with string keys, hashes with symbol keys, classes, structs, and Data objects. Over the lifetime of such an object, we typically instantiate it and then read the data we stored in it, so that is all each test simulates. Each scenario runs 1 million times, and the measure method prints all of the timings together.
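To make the "create, then read" pattern concrete without needing the Faker gem, here is a minimal sketch of the same idea. The `time_scenario` helper, the fixed `'Alice'` string, and the smaller iteration count are assumptions made for illustration; they are not part of the benchmark class above.

```ruby
require 'benchmark'

# Hypothetical helper: runs the given block `iterations` times and
# returns the elapsed wall-clock time in seconds.
def time_scenario(iterations)
  Benchmark.realtime do
    iterations.times { yield }
  end
end

iterations = 100_000        # smaller than the article's 1_000_000 to keep the sketch quick
point = Struct.new(:name)   # defined once, outside the timed loop

timings = {
  # Each block creates the object and then reads the stored value back.
  hash_sym: time_scenario(iterations) { h = { name: 'Alice' }; h[:name] },
  struct:   time_scenario(iterations) { s = point.new('Alice'); s.name }
}

timings.each { |label, seconds| puts format('%-9s %.6f s', label, seconds) }
```

Absolute numbers from a sketch like this will differ from the console output in this article, since the iteration count and the stored value are different.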
performance(dev)> BenchmarkHashStruct.measure
array: 0.124267 0.000000 0.124267 ( 0.129573)
hash_str: 0.264137 0.000000 0.264137 ( 0.275421)
hash_sym: 0.174082 0.000000 0.174082 ( 0.181514)
class: 0.308020 0.000000 0.308020 ( 0.321165)
struct: 0.336229 0.000000 0.336229 ( 0.350576)
data: 0.345480 0.000000 0.345480 ( 0.360232)
=> nil
performance(dev)> BenchmarkHashStruct.measure
array: 0.090669 0.000378 0.091047 ( 0.094786)
hash_str: 0.264261 0.000000 0.264261 ( 0.275104)
hash_sym: 0.172333 0.000000 0.172333 ( 0.179407)
class: 0.311545 0.000060 0.311605 ( 0.324390)
struct: 0.335436 0.000000 0.335436 ( 0.349203)
data: 0.346124 0.000071 0.346195 ( 0.360396)
=> nil
performance(dev)> BenchmarkHashStruct.measure
array: 0.088372 0.003872 0.092244 ( 0.096181)
hash_str: 0.265748 0.000464 0.266212 ( 0.277565)
hash_sym: 0.174393 0.000000 0.174393 ( 0.181831)
class: 0.309411 0.000000 0.309411 ( 0.322613)
struct: 0.346008 0.000000 0.346008 ( 0.360760)
data: 0.344666 0.000000 0.344666 ( 0.359361)
=> nil
performance(dev)> BenchmarkHashStruct.measure
array: 0.077396 0.000038 0.077434 ( 0.080771)
hash_str: 0.242372 0.000140 0.242512 ( 0.252853)
hash_sym: 0.159206 0.000000 0.159206 ( 0.166007)
class: 0.273878 0.009250 0.283128 ( 0.295201)
struct: 0.322791 0.000323 0.323114 ( 0.336889)
data: 0.346099 0.000038 0.346137 ( 0.360901)
=> nil
I ran measure 4 times to account for run-to-run variation and to make sure the ordering is consistent. As expected, arrays come out on top, with symbol-keyed hashes a consistent second. String-keyed hashes come third, with a large gap behind symbol-keyed hashes. Then, comparing classes and structs, structs have fallen slightly behind classes. We could surmise that recent Ruby releases improved the performance of plain classes, although this test alone does not confirm the cause.
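Instead of re-running the whole suite by hand, the repetition can be scripted. The sketch below is one possible approach, assuming a hypothetical `repeat_measure` helper that times a block several times and summarises the spread; it uses only `Benchmark.realtime` from the standard library.

```ruby
require 'benchmark'

# Hypothetical helper: times the block `runs` times and reports the
# fastest, median, and slowest run in seconds.
def repeat_measure(runs:, iterations:)
  samples = Array.new(runs) { Benchmark.realtime { iterations.times { yield } } }.sort
  { min: samples.first, median: samples[samples.size / 2], max: samples.last }
end

# An odd number of runs keeps the median a single sample.
summary = repeat_measure(runs: 5, iterations: 50_000) do
  h = { 'name' => 'Alice' }
  h['name']
end

puts format('min %.6f  median %.6f  max %.6f',
            summary[:min], summary[:median], summary[:max])
```

A wide gap between the minimum and the maximum suggests the machine was busy during some runs, in which case the minimum or median is usually the more representative figure.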
We can also see that Data, which was introduced in Ruby 3.2, trails Struct in three of the four runs and is roughly on par in the other. This may be problematic: Data is essentially an immutable Struct, so it already offers less flexibility, and here it brings no speed advantage either. We may still prefer Struct over Data, given that Struct is slightly faster in most runs.
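The practical difference between the two can be seen in a short sketch. The names `person_struct` and `person_data` are illustrative and do not appear in the benchmark; the example assumes Ruby 3.2 or newer, since it relies on `Data.define` and on initializing a Struct with keyword arguments without `keyword_init:`.

```ruby
person_struct = Struct.new(:name)
person_data   = Data.define(:name)

s = person_struct.new(name: 'Alice')
s.name = 'Bob'                        # Struct generates a setter, so this mutates s

d = person_data.new(name: 'Alice')
has_setter = d.respond_to?(:name=)    # Data defines no setters
d2 = d.with(name: 'Bob')              # with returns a new instance; d is left unchanged

puts s.name      # Bob
puts has_setter  # false
puts d.name      # Alice
puts d2.name     # Bob
```

If the value never needs to change after creation, Data expresses that intent directly; if it does, Struct avoids allocating a fresh object for every update.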
There are 2 takeaways from this test. First, prefer symbol-keyed hashes over string-keyed hashes, as the former are about 1.5x faster in this benchmark. Second, when not using hashes, it's better to use Classes over Structs, unlike what was previously encouraged. Classes are now roughly 1.07x to 1.14x faster than Structs, so it's reasonable to keep using them.
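These ratios can be reproduced from the console output. The sketch below copies the total CPU time (the third column) of each of the four runs and divides the slower scenario by the faster one; the `ratios` helper is a hypothetical name used only for this illustration.

```ruby
# Total CPU times (third column) copied from the four runs shown earlier.
hash_str = [0.264137, 0.264261, 0.266212, 0.242512]
hash_sym = [0.174082, 0.172333, 0.174393, 0.159206]
klass    = [0.308020, 0.311605, 0.309411, 0.283128]
struct   = [0.336229, 0.335436, 0.346008, 0.323114]

# Hypothetical helper: per-run ratio of the slower timing to the faster one.
def ratios(slower, faster)
  slower.zip(faster).map { |s, f| (s / f).round(2) }
end

str_vs_sym      = ratios(hash_str, hash_sym)
struct_vs_class = ratios(struct, klass)

puts "string keys vs symbol keys: #{str_vs_sym.minmax.join('x to ')}x"
puts "struct vs class:            #{struct_vs_class.minmax.join('x to ')}x"
```

Because the ratios are computed per run, they compare timings taken under the same conditions rather than mixing a fast run of one scenario with a slow run of another.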