"This article was originally published on The Codeminer 42's Engineering Blog by Thiago Araujos. With their kind permission, we’re sharing it here for Codeship readers.
I try to embrace a particular way of working with code: it should be minimal, idiomatic, and performant by default. Sometimes it is necessary to trade performance for readability, or readability for performance, or even break the deal if the trade is not worth it. It is possible to attain an optimal balance among all of these factors, as long as the goal remains ingrained in small decisions that we make while coding.
We will go over some specific examples that deserve a straight “no” by default, as they are rarely a worthwhile trade. If you do not already have this mindset, consider adopting it.
Minimal: Defining unnecessary methods
Let’s stick to “attribute readers”, which are also known as “getters” in other programming languages. Take this code for example:
    class Person
      attr_reader :first_name, :last_name

      def initialize(first_name, last_name)
        @first_name = first_name
        @last_name = last_name
      end

      def name
        "#{first_name} #{last_name}"
      end
    end
Assuming that all clients of this code just use the name method, the first_name and last_name readers should not really exist. They are nothing but slop unnecessarily polluting the public interface of our object.
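One way to see the difference is to ask Ruby which public instance methods a class defines itself. The sketch below is illustrative only: the class names PersonWithReaders and PersonMinimal are made up for the comparison and are not part of the original code.

```ruby
# Hypothetical classes used only to compare public interfaces.
class PersonWithReaders
  attr_reader :first_name, :last_name

  def initialize(first_name, last_name)
    @first_name = first_name
    @last_name = last_name
  end

  def name
    "#{first_name} #{last_name}"
  end
end

class PersonMinimal
  def initialize(first_name, last_name)
    @first_name = first_name
    @last_name = last_name
  end

  def name
    "#{@first_name} #{@last_name}"
  end
end

# Passing false excludes inherited methods, leaving only what each class defines.
p PersonWithReaders.public_instance_methods(false).sort # three methods exposed
p PersonMinimal.public_instance_methods(false).sort     # only :name exposed
```

Both classes answer name the same way; the second one simply promises less to its callers.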
As an application interface designer, you should definitely care about which methods are publicly exposed, and if you have control over the call sites, it is best to publish only methods that consumers actually use. This kind of minimalism pays off: it tends to make a codebase easier to understand and to evolve over time. I have frequently stumbled across such sloppiness while refactoring: I search the codebase for occurrences of these methods, only to find out they are used nowhere but inside the class itself. As you can see, this is also about communication. Let us go further:
All code should exist with a good reason
And how about making the readers private? It is a popular technique, but is it any good? Let’s see:
    class Person
      def initialize(first_name, last_name)
        @first_name = first_name
        @last_name = last_name
      end

      def name
        "#{first_name} #{last_name}"
      end

      private

      attr_reader :first_name, :last_name
    end
Well, this change introduces a subtle problem: more code. That may be a bad idea if it does not help with readability. Let’s run our file in the Ruby interpreter with the -W flag:
    $ ruby -W person.rb
    person.rb:10: warning: private attribute?
    person.rb:10: warning: private attribute?
That’s right, we asked for Ruby’s opinion and it warned us against our redundancy. We are adding extra lines of code for no substantial benefit, as we could have just referenced the instance variables directly:
    class Person
      def initialize(first_name, last_name)
        @first_name = first_name
        @last_name = last_name
      end

      def name
        "#{@first_name} #{@last_name}"
      end
    end
As you may already know, attr_reader dynamically defines a method which returns an instance variable of the same name:
    def first_name
      @first_name
    end
What if there is a typo in the ivar?
Let’s make that assumption with some code:
    # Instead of "Thiago Araújo" this would return " Araújo"
    def name
      "#{@firs_name} #{@last_name}"
    end
This version is semantically incorrect due to a typo in @firs_name, but to the surprise of many it does not fail with a NameError. Why? Because undefined ivars return nil in Ruby, and nil.to_s results in an empty string.
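The behavior is easy to confirm in isolation. The following is a minimal sketch; the class name TypoPerson and the typo_value helper are hypothetical and exist only to expose the misspelled ivar.

```ruby
# Hypothetical class with a deliberate typo in the ivar name.
class TypoPerson
  def initialize(first_name, last_name)
    @first_name = first_name
    @last_name = last_name
  end

  def name
    "#{@firs_name} #{@last_name}" # @firs_name was never assigned
  end

  # Helper added for illustration: returns the misspelled ivar as-is.
  def typo_value
    @firs_name
  end
end

person = TypoPerson.new("Thiago", "Araújo")
p person.typo_value                              # an unset ivar reads as nil
p person.instance_variable_defined?(:@firs_name) # reading it does not define it
p person.name                                    # leading space, no error raised
```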
To be honest, to this date I have only seen this carelessness in sloppy code with no tests, and even then only when ivars were interpolated into strings.
The following example, in turn, provides a more useful error message:
    # person.rb:8:in `name': undefined method `capitalize' for nil:NilClass (NoMethodError)
    def name
      "#{@firs_name.capitalize} #{@last_name.capitalize}"
    end
Compare that with the NameError raised in the case of a private reader, which is even more precise:
    # person.rb:8:in `name': undefined local variable or method `firs_name' for #<Person:0x007fd0f30719c8> (NameError)
    def name
      "#{firs_name.capitalize} #{last_name.capitalize}"
    end
Fortunately, both errors indicate the most relevant information to help us solve the problem: the line number. So having a slightly improved error message in case a typo ever sneaks in does not sound like a good trade to me.
The error may explode somewhere else within the class if we pass a nil ivar to another method, but it is still easy to fix.
But I want to provide an extension point
A private getter may provide an extension point in case we require custom logic for the attribute in the future. Consider the following hypothetical change:
    def first_name
      "#{salutation} #{@first_name}"
    end
Heads up if you spot “in case…” or “in the future…” amidst a justification for code to exist. Why create such an extension point for a private attribute? Remember: the internals are under your control, hence I advise you to postpone this kind of “preventive” behavior until really necessary.
Truth be told, using a private reader won’t make your code DRY if you are concerned about referencing the ivar more than once. Changing it to a method is just one search and replace away.
DRY is more about the big picture
There is a beautiful, minimal simplicity in using ivars directly, not to mention that they are easier to recognize than barewords. Just try to keep ivar assignments in the initializer, and your class will become easier to maintain.
Idiomatic: Converting blocks to procs
The following each method leverages a very common Ruby idiom:
    class CustomCollection
      include Enumerable

      def initialize(collection)
        @collection = collection
      end

      def each(&block)
        @collection.each(&block)
      end

      # Picture more methods...
    end

    CustomCollection.new([1, 2, 3, 4, 5]).each do |i|
      puts i
    end
It seems innocuous, right? Well, not so much. There is a performance hit there which should not be ignored. It turns out the each method converts a block to a proc object behind the scenes:
    collection = CustomCollection.new([1, 2, 3, 4, 5])

    # This block is transformed into a proc
    collection.each { |item| do_something(item) }
And what is the problem? Well, a proc is a full-blown object, whereas a block is one of the few Ruby constructs that is not an object — actually, it is tuned to be performant and it does not have a callable interface nor does it respond to any methods.
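The distinction can be observed directly. This is a small sketch with hypothetical helper names (capture and invoke), not code from the benchmark that follows.

```ruby
# Hypothetical helpers to contrast the two calling conventions.

# Declaring &block asks Ruby to hand the block over as a Proc object.
def capture(&block)
  block
end

# With yield the block is invoked directly; the method never gets a reference to it.
def invoke
  yield 21
end

captured = capture { |n| n * 2 }
p captured.class        # a full-blown object
p captured.call(21)     # callable later, like any other object
p invoke { |n| n * 2 }  # same result without naming the block
```

Interpreter versions differ in how eagerly they create the proc, so it is worth re-running any benchmark on your own Ruby before drawing conclusions.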
There is a cost involved in this conversion, but we can easily avoid it by using an old’n’trusty block:
    def each
      @collection.each { |item| yield item }
    end
Let’s run a benchmark to compare both alternatives:
    require 'benchmark/ips'

    class CustomCollection
      def initialize(collection)
        @collection = collection
      end

      def each_block
        @collection.each { |item| yield item }
      end

      def each_block_to_proc(&block)
        @collection.each(&block)
      end
    end

    Benchmark.ips do |x|
      collection = CustomCollection.new([1, 2, 3, 4, 5])

      x.report 'block' do
        collection.each_block { |item| }
      end

      x.report 'block to proc' do
        collection.each_block_to_proc { |item| }
      end

      x.compare!
    end
These results show that converting a block to a proc is about 1.44 times slower:
    Warming up --------------------------------------
                   block    79.756k i/100ms
           block to proc    59.273k i/100ms
    Calculating -------------------------------------
                   block      1.268M (± 5.4%) i/s
           block to proc    878.862k (± 6.9%) i/s

    Comparison:
                   block:  1268320.3 i/s
           block to proc:   878862.3 i/s - 1.44x slower
The numbers are interesting in their own right, but the thing to keep in mind regardless is that the code works harder. If there were a substantial benefit to this, I would just say “OK, no big deal”. But is there? Is the other way around problematic?
To be fair, there is still a syntactic advantage to using a &block argument: the delegation is transparent. For instance, the following example breaks if we do not pass a block:
    def each
      @collection.each { |item| yield item }
    end

    custom_collection.rb:9:in `block in each': no block given (yield) (LocalJumpError)
But it should have returned an enumerator, right? If not given a block, any idiomatic Ruby iterator is meant to return an enumerator.
So, why does the block-to-proc alternative return an enumerator? Because it delegates away the input block, and since the delegatee happens to be Array#each we get this benefit for free. It works like this:
    class CustomCollection
      def initialize(collection)
        @collection = collection
      end

      # We pass no block. There is nothing to convert,
      # so it comes in nil.
      def each(&block)
        # Here &nil has the effect of discarding the block.
        @collection.each(&block)
      end
    end

    #<Enumerator:0x007fd1f1d61738>
Fortunately, improving the second example takes just one more line of code:
    def each
      return to_enum(__callee__) unless block_given?

      @collection.each { |item| yield item }
    end
Now our method has the same behavior as a block-to-proc delegation and it returns an enumerator if we pass no block to it. And here is an equivalent alternative:
    def each
      if block_given?
        @collection.each { |item| yield item }
      else
        @collection.each
      end
    end
It has a few more lines, but it is still readable, elegant, maintainable, and faster than our &block option. Now we can chain iterators to perform complex transformations:
    collection.each.with_object([]).with_index do |(item, memo), index|
      # Do something useful...
    end
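To check the whole thing end to end, here is a self-contained sketch that reuses the article's CustomCollection with the to_enum guard. The sample data and the products variable are assumptions made for the demonstration.

```ruby
# Sketch reusing the article's CustomCollection with the to_enum guard.
class CustomCollection
  include Enumerable

  def initialize(collection)
    @collection = collection
  end

  def each
    return to_enum(__callee__) unless block_given?

    @collection.each { |item| yield item }
  end
end

collection = CustomCollection.new([1, 2, 3])

# Without a block we get an Enumerator back instead of a LocalJumpError.
enumerator = collection.each
p enumerator.class

# The enumerator chains like the one returned by Array#each.
products = collection.each.with_index.map { |item, index| item * index }
p products
```

Because each still yields when it is given a block, the methods mixed in by Enumerable (map, select, and so on) keep working unchanged.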
When possible, make your code behave like a native citizen
And when is it good to use a &block argument? When the code within the block needs to be stored for later use, since a block falls out of scope once the method exits:
    def store_callback(&proc)
      @i_will_be_used_later = proc
    end
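As a rough illustration of that legitimate use, the sketch below stores a handler and runs it after the registering method has returned. The names Button, on_click and click are hypothetical.

```ruby
# Hypothetical class: the names Button, on_click and click are illustrative.
class Button
  def on_click(&handler)
    # The block has to become a Proc here, because it must outlive this call.
    @handler = handler
    self
  end

  def click
    # Invoke the stored proc long after on_click has returned.
    @handler.call if @handler
  end
end

clicks = []
button = Button.new.on_click { clicks << :clicked }
button.click
button.click
p clicks
```

Here the conversion cost is paid once at registration time, and it buys something a bare yield cannot provide.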
Performant: Method objects emulating first class functions
Ruby does not have first class functions, so how do we compensate for that deficiency? That’s right, by extracting method objects. Here is a silly example specifically tailored to illustrate this point:
    class NumberListDoubler
      def initialize(numbers)
        @numbers = numbers
      end

      def call
        @numbers.map(&method(:multiply_by_two))
      end

      private

      def multiply_by_two(n)
        n * 2
      end
    end
This object takes an array of numeric values and returns a new one with each number multiplied by two. Notice how call converts the multiply_by_two method into an object that can be passed over to map.
When I see code like this written in Ruby, I automatically assume it is trying to be concise and reduce verbosity. However, does it read well? Does it meet that goal? I don’t think so. The &method call is noisy and does not look like idiomatic Ruby: it goes against the nature of the language and how it wants to be used in such cases.
Let’s face it: Ruby is not JavaScript, so you should not pass functions around indiscriminately. We can improve this code with “blocks”, an idiomatic Ruby feature that is ideal for this situation:
    class NumberListDoubler
      def initialize(numbers)
        @numbers = numbers
      end

      def call
        @numbers.map { |n| multiply_by_two(n) }
      end

      def multiply_by_two(n)
        n * 2
      end
    end
This is considerably more expressive in my opinion. It is not point-free like the &method alternative, but you can use a splat if you need more resilience.
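The snippet that belonged here is missing from this copy of the article. A splat-based version might look like the following; this is an assumed reconstruction, not the author's original code.

```ruby
class NumberListDoubler
  def initialize(numbers)
    @numbers = numbers
  end

  def call
    # Assumed form: collect whatever map yields and forward it untouched.
    @numbers.map { |*args| multiply_by_two(*args) }
  end

  def multiply_by_two(n)
    n * 2
  end
end

p NumberListDoubler.new([1, 2, 3]).call
```

The block no longer names its argument, so it keeps working if the target method's parameter is renamed, at the cost of a slightly noisier block.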
But we are not at the bottom line yet. There is something more important than a stylistic issue: extracting a method comes at a cost, and as a wary programmer you should be aware of that.
It turns out this code is extracting a method object and converting it to a block afterwards (hence the & character). Let’s run a benchmark comparing “block” versus “method object”:
    require 'benchmark/ips'

    class NumberDoubler
      def initialize(numbers)
        @numbers = numbers
      end

      def call_with_block
        @numbers.map { |n| multiply_by_two(n) }
      end

      def call_with_method
        @numbers.map(&method(:multiply_by_two))
      end

      def multiply_by_two(n)
        n * 2
      end
    end

    Benchmark.ips do |x|
      list = NumberDoubler.new([1, 2, 3, 4, 5])

      x.report 'with block' do
        list.call_with_block
      end

      x.report 'with method' do
        list.call_with_method
      end

      # Needed to print the "Comparison" section shown in the results
      x.compare!
    end
Now onto the results:
    Calculating -------------------------------------
              with block    46.214k i/100ms
             with method    21.740k i/100ms
    -------------------------------------------------
              with block    806.459k (± 7.0%) i/s -      4.021M
             with method    285.775k (± 7.4%) i/s -      1.435M

    Comparison:
              with block:   806459.0 i/s
             with method:   285775.4 i/s - 2.82x slower
As you can see, the method object is 2.82 times slower than the block. This may not seem like a big deal if the code runs just a few times, but it may add up over the course of a real-world program. If we can avoid it, why not?
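A plausible contributor to that gap is the extra objects involved: every lookup through Object#method produces a new Method object, which & then turns into a proc. The sketch below only inspects those objects and does not measure anything; the Doubler class is hypothetical.

```ruby
# Hypothetical class used only to inspect what &method(...) involves.
class Doubler
  def multiply_by_two(n)
    n * 2
  end
end

doubler = Doubler.new
first = doubler.method(:multiply_by_two)
second = doubler.method(:multiply_by_two)

p first.class            # a Method object
p first.equal?(second)   # not the same object: each lookup builds a new one
p first == second        # still equal: same method bound to the same receiver
p first.to_proc.class    # the kind of object & converts the method into
p [1, 2, 3].map(&first)  # works, at the price of the extra objects
```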
Ruby is slow, why should I care?
It is not that slow and I would dare to say it has acceptable performance for a dynamic language. This mindset is dangerous and may potentially turn a fast program into a slow one over time. As wary programmers, we should find a sweet spot between performance and readability, and avoid doing unnecessary work whenever we can.
Avoid doing more work if the trade is not worth it
There are still nice use cases for Object#method, and most of them refer to reflection and metaprogramming. We can ask a method for its source location:
    # ["number_doubler.rb", 22]
    p list.method(:call).source_location
Or for its arity:
    # 0
    puts list.method(:call).arity
Conclusion
I could have written this post as abstract, general prose, but I chose instead to present concrete examples relevant to the theme. That said, there is a lot more ground to cover!
I hope this post challenges you to think deeper about how your code looks and works under the hood, and to stop thinking “Ruby is slow, I don’t care”. If I have the opportunity, I hope to go over more examples in the near future.