About that monkey-patching business...

04·11·2021

A few days ago, a comment was made on the internet about Polyphony, an open source project of mine, mentioning the fact that Polyphony patches some core Ruby APIs. The context was the difference between Polyphony and Async (another fiber-based concurrency gem for Ruby):

Last time I checked, polyphony monkey patched Ruby core methods to do its work :-/. This has deterred from learning more about polyphony… On the other hand, async gem has always had incredibly clean code.

I’m sure Bruno Sutic, the author of the above comment, was writing in good faith, but his comment implies that Polyphony’s code is “dirty”. It also implies that monkey-patching is somehow illegitimate. While normally I don’t get too excited about people calling my code names, I do take pride in my work, and I feel a rebuttal is in order.

Moreover, the mere fact that Polyphony employs a (somewhat) controversial technique to do its thing should not deter people like Bruno from examining it. I’m sure all of us would benefit from approaching other people’s code with an open and inquisitive mind.

In this article I’ll explain in detail the strategy I chose for developing Polyphony and the role monkey-patching plays within it. I’ll also discuss the potential problems raised by the practice of monkey-patching, and how they can be minimized.

What is monkey-patching?

But first, for those who are confused about what monkey-patching actually is, here’s the Wikipedia entry (go ahead, read it!). For the sake of the present discussion, I’ll define monkey-patching as the practice of changing or extending the behavior of pre-existing Ruby classes or modules by overriding their instance or class methods. This can be done in a variety of ways, depending on what you want to achieve. The most obvious way would be to open the class, then redefine some methods:

class String
  def to_i
    42
  end
end

'123'.to_i #=> 42

You can also put your patched methods in a separate module, then prepend it to the target class (that way it will take precedence over existing method definitions):

module StringPatches
  def to_i
    42
  end
end

String.prepend StringPatches

'123'.to_i #=> 42
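
Because a prepended module sits in front of the class in the method lookup chain, the patch can also wrap the original method instead of replacing it, by calling super. Here is a minimal sketch of that idea; the module name LoggedToI and its LOG constant are made up for this illustration:

```ruby
# Hypothetical patch: records every String#to_i call, then defers to the
# original implementation through super.
module LoggedToI
  LOG = []

  def to_i(*args)
    result = super          # invokes the original String#to_i with the same args
    LOG << [self, result]   # remember which string was converted to what
    result
  end
end

String.prepend LoggedToI

decimal = '123'.to_i   # original behaviour is preserved
hex = 'ff'.to_i(16)    # optional arguments are passed through by super
```

The return values are unchanged, and LoggedToI::LOG now holds a record of each call.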

If you need to target specific object instances, you can patch their singleton class:

s = '123'
class << s
  def to_i
    42
  end
end

s.to_i #=> 42

You can also limit the monkey-patching to a single file, class, or module, by using refinements:

module StringPatches
  refine String do
    def to_i
      42
    end
  end
end

using StringPatches # activate refinement

'123'.to_i #=> 42

So monkey-patching can be done in a variety of ways in Ruby, depending on how specific you want the patched behaviour to be: from the level of a single object instance, through specific scopes, all the way to patching classes globally.
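
To make the “specific scopes” case a bit more concrete, here is a small sketch in which a refinement is activated inside a single module body only. All the names (Shouting, Announcer, shout) are invented for the example:

```ruby
module Shouting
  refine String do
    # Hypothetical extra method, visible only where the refinement is active
    def shout
      upcase + '!'
    end
  end
end

module Announcer
  using Shouting # active from here to the end of this module body

  def self.announce(text)
    text.shout
  end
end

inside = Announcer.announce('hi')

# Outside of Announcer the refinement is not active, so the call fails
outside = begin
  'hi'.shout
rescue NoMethodError
  :not_defined
end
```

Code inside Announcer sees the patched String, while the rest of the program keeps working with the unpatched class.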

It’s also worth noting that there are other techniques that could be used instead of monkey-patching: subclassing is ubiquitous in Ruby, and can even work for extending core Ruby classes. Rails’s HashWithIndifferentAccess is a case in point. I could probably come up with a bunch of other alternatives, but I’ll leave it at that. The point is, it really depends on the circumstances.
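
As a rough sketch of the subclassing alternative, here is a toy Hash subclass that normalizes keys to strings. It is only loosely inspired by HashWithIndifferentAccess and is not how Rails implements it; the class name StringKeyedHash is made up:

```ruby
# Toy subclass: every key is converted to a String on the way in and out.
class StringKeyedHash < Hash
  def [](key)
    super(key.to_s)
  end

  def []=(key, value)
    super(key.to_s, value)
  end
end

h = StringKeyedHash.new
h[:name] = 'polyphony'
by_string = h['name']
by_symbol = h[:name]

# Plain hashes are left untouched, since nothing was patched globally
plain = {}
plain[:name] = 'polyphony'
plain_lookup = plain['name']
```

The trade-off is that only code that explicitly instantiates StringKeyedHash gets the new behaviour.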

Is monkey-patching inherently bad?

I’m sure many people have written before about monkey-patching and whether it’s good or bad for you, but in my most humble opinion, there’s no right or wrong when it comes to programming. Monkey-patching is just a technique that has its place like everything else under the heavens.

Of course, monkey-patching can lead to problems - it can cause compatibility issues and strange bugs, for example when your monkey-patching gem is combined with other gems. It can break behaviour across different versions of Ruby, or in conjunction with specific versions of specific dependencies. It can cause all kinds of havoc. But it can also provide a very elegant solution in specific circumstances, and can be amazingly effective.
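
Here is a minimal sketch of how such a clash between gems can happen. It assumes two hypothetical gems that both reopen String and define a method with the same name (shorten is invented for this illustration):

```ruby
# "Gem A" (hypothetical): truncates silently
class String
  def shorten(max = 5)
    self[0, max]
  end
end

# "Gem B" (hypothetical), loaded later: same name, different semantics.
# Its definition silently replaces the one from "Gem A".
class String
  def shorten(max = 5)
    length > max ? "#{self[0, max]}..." : self
  end
end

# Code inside "Gem A" that relied on plain truncation now gets an ellipsis
result = 'monkey-patching'.shorten
```

Whichever definition is loaded last wins, so the behaviour depends on load order. Running Ruby with warnings enabled (-w) reports such redefinitions.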

When is monkey-patching useful?

Monkey-patching is useful when you need to alter or extend the way pre-existing classes behave. Ruby’s open nature lets you change almost everything about Ruby. Even core classes such as Array or String can be modified (as shown in the examples above). Why would we want to do this? Here are some cases where monkey-patching can be useful:

Ensuring compatibility between different versions of Ruby. This is especially useful when you need to backport some new method introduced in a later version of Ruby to an earlier version of Ruby. This is commonly called a “polyfill” (there’s a whole bunch of them on rubygems.org.)
“Fixing” some gem to work with your code. Suppose you have encountered a bug in some gem your project depends on. Despite everybody’s best intentions, fixes to such problems can sometimes take months to find their way into a new version. In those cases, a monkey-patch can solve the problem immediately, even if only temporarily, until the new version containing the fix is put out by the gem’s author.

Debugging an application’s behaviour by overriding methods and adding tracing, for example:

require 'socket'

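# Trace outgoing connections by wrapping TCPSocket#initialize
class TCPSocket
  alias_method :orig_initialize, :initialize
  def initialize(hostname, port, *args)
    puts "Connecting to #{hostname}:#{port}"
    orig_initialize(hostname, port, *args) # call the original constructor
    puts "Connected to #{hostname}:#{port}"
  end
end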

Otherwise extending or replacing behaviours provided by the Ruby core or stdlib, or by Ruby gems. For example, the oj gem, which provides fast JSON processing, has a compatibility mode that effectively patches the json gem provided as part of Ruby’s stdlib. This feature lets Ruby apps take advantage of faster JSON processing without any change to their code.

It’s important to note that the advantage monkey-patching provides over other techniques, such as subclassing, is that those patches are in fact going to impact all the other dependencies of your app. In the case of the oj gem, any other dependencies that make use of the JSON API are also going to show improved performance!
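
To illustrate the first case in the list above, here is a minimal polyfill sketch. Hash#except was added to the Ruby core in version 3.0, so the patch is only defined when the method is missing. The implementation shown is a simple approximation written for this example, not the code of any particular polyfill gem:

```ruby
# Define Hash#except only on Rubies that do not already provide it
unless Hash.method_defined?(:except)
  class Hash
    # Returns a new hash without the given keys
    def except(*keys)
      reject { |key, _value| keys.include?(key) }
    end
  end
end

trimmed = { a: 1, b: 2, c: 3 }.except(:a, :c)
```

On Ruby 3.0 and later the guard skips the patch entirely, and on older versions any code in the app (including its dependencies) can call except.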

Designing Polyphony

When I first started working on Polyphony, I didn’t know where it would take me. Polyphony began as an experiment in designing an API for writing concurrent Ruby programs. My starting point was the nio4r gem, which implements an event loop based on libev. I really liked what nio4r was able to do, and wanted to experiment with different concurrency models, so I took its C-extension code and started fiddling with it. I went through a whole bunch of different designs: callbacks, promises, futures, async/await, and finally fibers.

As Polyphony slowly took form, the following principles manifested themselves:

Polyphony should extend the Ruby runtime and feel like an integral part of it.
Polyphony’s API should allow expressing concurrent operations in a concise manner, with a minimum of abstractions or boilerplate.
Polyphony should allow developers to continue working with core and stdlib classes and APIs such as IO, Socket and Net::HTTP.

In order to be able to apply the above principles to Polyphony’s design, I needed a way to make Ruby’s core classes, especially those having to do with I/O, usable under Polyphony’s concurrency model. The only solution that would have allowed me to do that was monkey-patching all those classes, including the IO class, the different Socket classes, even the OpenSSL classes dealing with I/O. Without monkey-patching, Polyphony as it currently is would have been impossible to implement!
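
To give a feel for what “fiber-aware” I/O means, here is a deliberately tiny sketch. It is not Polyphony’s implementation (which relies on its own backend and scheduler); the module name CooperativeRead is hypothetical, and the “scheduler” is just the main program resuming the fiber by hand. The patched readpartial never blocks the thread: when no data is available it yields control to whoever resumed the fiber.

```ruby
# Toy illustration only: a readpartial that yields instead of blocking.
module CooperativeRead
  def readpartial(maxlen)
    loop do
      result = read_nonblock(maxlen, exception: false)
      return result unless result == :wait_readable

      Fiber.yield :waiting # nothing to read yet: hand control back
    end
  end
end

reader, writer = IO.pipe
reader.extend(CooperativeRead) # patch just this one IO instance

log = []
consumer = Fiber.new do
  log << reader.readpartial(16)
end

log << consumer.resume # no data yet, so the fiber yields :waiting
writer.write('hello')
consumer.resume        # data is available now, so the read completes
```

The calling code still uses the familiar readpartial API, while other fibers get a chance to run whenever the read would otherwise have blocked.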

Polyphony and Ruby’s new fiber scheduler interface

At this point people might ask: what about using the new Fiber::SchedulerInterface introduced in Ruby 3.0? Presumably, with the Fiber::SchedulerInterface I would be able to keep the same design based on the same principles, without resorting to monkey-patching Ruby core classes. That’s because the fiber scheduler is baked right into Ruby’s core.

I have long thought about this problem, and have always come to the same conclusion: if I were to base Polyphony on the Fiber::SchedulerInterface, it would have limited what Polyphony could do. In fact, some of the features Polyphony currently offers would have been impossible to achieve:

I want Polyphony to work on older versions of Ruby. In fact, one of my original constraints for developing Polyphony was to have it work on Ruby >= 2.6 (I started work on Polyphony in August 2018.)
Polyphony’s design is highly-integrated - from the io_uring- or libev-based backend through the fiber-scheduling code, all the way to the developer-facing APIs for spinning up fibers and controlling them. Polyphony’s io_uring backend in particular offers unique capabilities, such as chaining of I/O operations, which might have been much more difficult to achieve had it been based on the fiber scheduler interface.
The Fiber::SchedulerInterface itself is still in a state of flux, and is still missing hooks for socket operations (according to Samuel Williams, the developer behind the fiber scheduler interface, the read and write hooks are considered experimental at the moment.)
The Fiber::SchedulerInterface will not magically bring fiber-awareness to all Ruby gems, especially not those implemented as C-extensions. Take for example the pg gem, which has recently added support for fiber schedulers. Compare that with Polyphony’s monkey-patching approach, which is much more minimal. Another example is Polyphony’s patch for the redis gem.

Even gems that do not rely on C-extensions might be problematic. Such is the case with ActiveRecord, which does connection pooling per thread and is thus apparently incompatible with both Async and Polyphony. Here again, it seems to me that monkey-patching might be the more effective solution, and perhaps also simpler to implement, at least in the short term. That’s how Polyphony implements fiber-aware connection pooling for Sequel (thanks wjordan!)
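
The general idea can be sketched with a toy pool. Everything here is hypothetical: ThreadKeyedPool stands in for a gem’s per-thread pooling, and FiberKeyedPool is the patch. Neither reflects the actual code of ActiveRecord, Sequel or Polyphony.

```ruby
require 'fiber' # needed for Fiber.current on older Rubies

# Stand-in for a gem that hands out one connection per thread
class ThreadKeyedPool
  def initialize
    @owners = {}
    @counter = 0
  end

  def owner_key
    Thread.current
  end

  def checkout
    @owners[owner_key] ||= "conn-#{@counter += 1}"
  end
end

# The patch: hand out one connection per fiber instead
module FiberKeyedPool
  def owner_key
    Fiber.current
  end
end

# Before patching, two fibers on the same thread share one connection
pool = ThreadKeyedPool.new
before = Array.new(2) { Fiber.new { pool.checkout }.resume }

ThreadKeyedPool.prepend(FiberKeyedPool)

# After patching, each fiber gets its own connection
pool = ThreadKeyedPool.new
after = Array.new(2) { Fiber.new { pool.checkout }.resume }
```

A real pool would also need to release connections when fibers terminate, which this sketch ignores.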

Integrating fiber scheduling into Ruby’s core is not a trivial undertaking (and I applaud Samuel for his resolve and determination in the face of substantial pushback.) The problem is not only technological - making fiber scheduling work with the complex code of Ruby’s IO core - but also getting other Ruby core developers and gem authors to understand the merits of this effort, and finally to put out new fiber-aware versions of their code.

As the fiber scheduler interface matures, I guess I will have to reconsider my position regarding Polyphony. One interesting suggestion was to implement Polyphony as a fiber scheduler for Ruby >= 3.0, and as a “polyfill” for earlier Ruby versions.

What about compatibility?

Monkey-patching does introduce the problem of compatibility, and this should be taken seriously. Polyphony aims to reduce compatibility issues in two ways. Firstly, Polyphony aims to mimic the original behaviour as closely as possible across all monkey-patched APIs, from the point of view of the application. Secondly, Polyphony aims to monkey-patch mostly stable APIs that have little chance of changing between versions.

This approach is not without problems. For example, the changes to irb introduced in Ruby 2.7 have broken Polyphony’s patch, and there’s an outstanding issue for it (I’ll get to it eventually.)

Polyphony also provides, as described above, monkey-patches for third-party gems, such as pg, redis and others. Those are bundled as part of Polyphony, but in the future might be extracted to separate gems, in order to be able to respond more quickly to local issues that arise in integrating those gems with Polyphony.

I’d also like to note that I do not expect people to just add Polyphony to their Gemfile and start spinning up fibers all over the place. In fact, using Polyphony is to me such a radical shift from previous approaches to Ruby concurrency that I find it improbable that one day it will simply work with any Ruby on Rails app. Using Polyphony to its full potential will require much more careful consideration on the part of developers using it.

I’d also like to add that my goal is not for Polyphony to become the solution for fiber-based concurrency in Ruby. It’s just a project that I find useful for my own work and I feel could be useful for others as well. There’s nothing wrong with having multiple solutions to the same problem. On the contrary, I find it beneficial and stimulating to have competing projects based on different approaches.

So what does Polyphony patch?

Polyphony replaces whole parts of the Ruby core API with fiber-aware code that provides the same functionality, but integrated with Polyphony’s code. I took great care to make sure that method signatures are the same, and that the patched methods behave identically as much as possible.
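
One way to guard against signature drift is to compare a method’s parameters before and after patching. The sketch below is only an illustration of that idea, not something taken from Polyphony’s code base; Channel and ChannelPatch are hypothetical stand-ins for a patched class and its patch. Note that for core methods implemented in C, parameters typically reports much less detail, so a check like this is most useful for methods written in Ruby.

```ruby
# Stand-in for a class about to be patched
class Channel
  def read(maxlen = nil, buffer = nil)
    'original'
  end
end

# The patch keeps exactly the same parameter list
module ChannelPatch
  def read(maxlen = nil, buffer = nil)
    'patched'
  end
end

original_params = Channel.instance_method(:read).parameters
Channel.prepend(ChannelPatch)
patched_params = Channel.instance_method(:read).parameters

# Fail loudly if the patch changed the method's signature
raise 'signature drift detected' unless original_params == patched_params

result = Channel.new.read
```

A check like this can live in a test suite, so that a change to either side of the patch is caught early.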

It’s worth noting that running Ruby programs with multiple fibers presents challenges that go beyond merely reading and writing to IO instances: there are all kinds of subtleties around forking, signal handling, waiting for child processes and thread control. Much of the monkey-patching that Polyphony performs is around that.

Here’s a (probably incomplete) list of APIs monkey-patched by Polyphony:

IO - all read/write instance methods and all read/write class methods
Socket/TCPSocket/TCPServer et al - all I/O functionality including accept and connect
OpenSSL::SSL::SSLSocket/OpenSSL::SSL::SSLServer - all read/write/accept/connect methods
Kernel - methods such as sleep, ` (backtick) , system, trap
Process.detach
Timeout.timeout
Thread.new and Thread#join

Polyphony also provides monkey-patches for gems such as pg, redis, mysql2 and sequel.

Conclusion

Polyphony uses monkey-patching extensively because it’s the best way to achieve the goals I set for myself in developing it. Yes, monkey-patching has its disadvantages, but it also has advantages (as I showed above). Finally, I believe Polyphony should rather be judged by what it can do, and by the value it provides to developers.