noteflakes

About that monkey-patching business...

04·11·2021

A few days ago, a comment was made on the internet about Polyphony, an open source project of mine, mentioning the fact that Polyphony patches some core Ruby APIs. The context was the difference between Polyphony and Async (another fiber-based concurrency gem for Ruby):

Last time I checked, polyphony monkey patched Ruby core methods to do its work :-/. This has deterred from learning more about polyphony… On the other hand, async gem has always had incredibly clean code.

I’m sure Bruno Sutic, the author of the above comment, was writing in good faith, but his comment implies that Polyphony’s code is “dirty”. It also implies that monkey-patching is somehow illegitimate. While normally I don’t get too excited about people calling my code names, I do take pride in my work, and I feel a rebuttal is in order.

Moreover, the mere fact that Polyphony employs a (somewhat) controversial technique to do its thing should not deter people like Bruno from examining it. I’m sure all of us would benefit from approaching other people’s code with an open and inquisitive mind.

In this article I’ll explain in detail the strategy I chose for developing Polyphony and the role monkey-patching plays within it. I’ll also discuss the potential problems raised by the practice of monkey-patching, and how they can be minimized.

What is monkey-patching?

But first, for those who are confused about what monkey-patching actually is, here’s the Wikipedia entry (go ahead, read it!). For the sake of the present discussion, I’ll define monkey-patching as the practice of changing or extending the behavior of pre-existing Ruby classes or modules by overriding their instance or class methods. This can be done in a variety of ways, depending on what you want to achieve. The most obvious way would be to open the class, then redefine some methods:

class String
  def to_i
    42
  end
end

'123'.to_i #=> 42

You can also put your patched methods in a separate module, then prepend it to the target class (that way it will take precedence over existing method definitions):

module StringPatches
  def to_i
    42
  end
end

String.prepend StringPatches

'123'.to_i #=> 42
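One practical consequence of prepending is that the original method stays reachable through super, so a patch can wrap existing behaviour instead of replacing it outright. Here’s a minimal sketch of that idea. The Greeter class and the LoudGreeting module are made-up names used only for illustration:

```ruby
# Hypothetical class standing in for "some pre-existing class".
class Greeter
  def greet(name)
    "Hello, #{name}"
  end
end

# Hypothetical patch module. Because it is prepended, it sits in front
# of Greeter in the ancestor chain, and super reaches the original method.
module LoudGreeting
  def greet(name)
    super.upcase + '!'
  end
end

Greeter.prepend(LoudGreeting)

greeting = Greeter.new.greet('world')   #=> "HELLO, WORLD!"
lookup_order = Greeter.ancestors.first(2) #=> [LoudGreeting, Greeter]
```

Reopening the class and redefining greet would instead discard the original implementation, unless you had saved it beforehand with alias_method.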

If you need to target specific object instances, you can patch their singleton class:

s = '123'
class << s
  def to_i
    42
  end
end

s.to_i #=> 42

You can also limit the monkey-patching to a single file, class, or module, by using refinements:

module StringPatches
  refine String do
    def to_i
      42
    end
  end
end

using StringPatches # activate refinement

'123'.to_i #=> 42
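To see how a refinement stays contained, here’s a small sketch that activates one inside a single module only. All the names (FortyTwoStrings, WithRefinement, WithoutRefinement) are hypothetical and exist just for this example:

```ruby
# Hypothetical refinement, equivalent to the one shown above.
module FortyTwoStrings
  refine String do
    def to_i
      42
    end
  end
end

# The refinement is active only in the lexical scope that calls `using`.
module WithRefinement
  using FortyTwoStrings

  def self.parse(str)
    str.to_i
  end
end

# No `using` here, so String#to_i keeps its normal behaviour.
module WithoutRefinement
  def self.parse(str)
    str.to_i
  end
end

refined = WithRefinement.parse('123')    #=> 42
plain   = WithoutRefinement.parse('123') #=> 123
```

Code outside WithRefinement never sees the patched to_i, which is what makes refinements the most conservative of the techniques listed here.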

So monkey-patching can be done in a variety of ways in Ruby, depending on how specific you want the patched behaviour to be: from the level of a single object instance, through specific scopes, all the way to patching classes globally.

It’s also worth noting that there are other techniques that could be used instead of monkey-patching: subclassing is ubiquitous in Ruby, and can even work for extending core Ruby classes. Rails’s HashWithIndifferentAccess is a case in point. I could probably come up with a bunch of other alternatives, but I’ll leave it at that. The point is, it really depends on the circumstances.
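As a rough illustration of the subclassing route, here’s a toy Hash subclass that normalizes keys to strings. It is only loosely in the spirit of HashWithIndifferentAccess: the class name StringKeyedHash is invented, and the real Rails class overrides many more methods than the two shown here.

```ruby
# Hypothetical subclass: every key is converted to a String on the way
# in and on the way out, so :a and 'a' refer to the same entry.
class StringKeyedHash < Hash
  def [](key)
    super(key.to_s)
  end

  def []=(key, value)
    super(key.to_s, value)
  end
end

h = StringKeyedHash.new
h[:a] = 1

by_symbol = h[:a]  #=> 1
by_string = h['a'] #=> 1

# Plain hashes are untouched, since nothing was patched globally.
plain = { 'a' => 1 }
plain_by_symbol = plain[:a] #=> nil
```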

Is monkey-patching inherently bad?

I’m sure many people have written before about monkey-patching and whether it’s good or bad for you, but in my most humble opinion, there’s no right or wrong when it comes to programming. Monkey-patching is just a technique that has its place like everything else under the heavens.

Of course, monkey-patching can lead to problems - it can cause compatibility issues and strange bugs, for example when your monkey-patching gem is combined with other gems. It can break behaviour across different versions of Ruby, or in conjunction with specific versions of specific dependencies. It can cause all kinds of havoc. But it can also provide a very elegant solution in specific circumstances, and can be amazingly effective.

When is monkey-patching useful?

Monkey-patching is useful when you need to alter or extend the way pre-existing classes behave. Ruby’s open nature lets you change almost everything about Ruby: even core classes such as Array or String can be modified (as shown in the examples above). Why would we want to do this? Here are some cases where monkey-patching can be useful:

Designing Polyphony

When I first started working on Polyphony, I didn’t know where it would take me. Polyphony began as an experiment in designing an API for writing concurrent Ruby programs. My starting point was the nio4r gem, which implements an event loop based on libev. I really liked what nio4r was able to do, and wanted to experiment with different concurrency models, so I took its C-extension code and started fiddling with it. I went through a whole bunch of different designs: callbacks, promises, futures, async/await, and finally fibers.

As Polyphony slowly took form, the following principles manifested themselves:

In order to be able to apply the above principles to Polyphony’s design, I needed a way to make Ruby’s core classes, especially those having to do with I/O, usable under Polyphony’s concurrency model. The only solution that would have allowed me to do that was monkey-patching all those classes, including the IO class, the different Socket classes, even the OpenSSL classes dealing with I/O. Without monkey-patching, Polyphony as it currently is would have been impossible to implement!
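To give a feel for what “fiber-aware” I/O means, here’s a deliberately tiny sketch. It is not Polyphony’s actual implementation (which lives largely in a C extension and has a real scheduler behind it). The FiberAwareRead module is a made-up name, and the patch is applied to a single pipe instance rather than to the IO class. The idea is simply that a read which would block hands control back to whoever resumed the fiber, instead of blocking the whole thread:

```ruby
# Hypothetical patch: a readpartial that yields the current fiber
# whenever no data is available, then retries once it is resumed.
module FiberAwareRead
  def readpartial(maxlen)
    loop do
      result = read_nonblock(maxlen, exception: false)
      case result
      when :wait_readable then Fiber.yield(:waiting) # nothing to read yet
      when nil            then raise EOFError        # other end was closed
      else return result
      end
    end
  end
end

reader_io, writer_io = IO.pipe
reader_io.extend(FiberAwareRead) # patch just this one instance

reader = Fiber.new { reader_io.readpartial(1024) }

first = reader.resume   #=> :waiting (the pipe is still empty)
writer_io.write('hello')
second = reader.resume  #=> "hello"

reader_io.close
writer_io.close
```

In this toy version the caller plays the role of the scheduler by resuming the fiber by hand. A real implementation would park the fiber, watch the file descriptor in an event loop, and resume the fiber when the descriptor becomes readable.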

Polyphony and Ruby’s new fiber scheduler interface

At this point people might ask: what about using the new Fiber::SchedulerInterface introduced in Ruby 3.0? Presumably, with the Fiber::SchedulerInterface I would be able to keep the same design based on the same principles, without resorting to monkey-patching Ruby core classes. That’s because the fiber scheduler is baked right into Ruby’s core.

I have long thought about this problem, and have always come to the same conclusion: if I were to base Polyphony on the Fiber::SchedulerInterface, it would have limited what Polyphony could do. In fact, some of the features Polyphony currently offers would have been impossible to achieve:

Integrating fiber scheduling into Ruby’s core is not a trivial undertaking (and I applaud Samuel Williams, the author of Async, for his resolve and determination in the face of substantial pushback). The problem is not only technological - making fiber scheduling work with the complex code of Ruby’s IO core - but also getting other Ruby core developers and gem authors to understand the merits of this effort, and finally to put out new fiber-aware versions of their code.

As the fiber scheduler interface matures, I guess I will have to reconsider my position regarding Polyphony. One interesting suggestion was to implement Polyphony as a fiber scheduler for Ruby >= 3.0, and as a “polyfill” for earlier Ruby versions.

What about compatibility?

Monkey-patching does introduce the problem of compatibility, and this should be taken seriously. Polyphony aims to reduce compatibility issues in two ways. Firstly, Polyphony aims to mimic the original behaviour of every monkey-patched API as closely as possible, from the point of view of the application. Secondly, Polyphony aims to monkey-patch mostly stable APIs that have little chance of changing between versions.

This approach is not without problems. For example, the changes to irb introduced in Ruby 2.7 have broken Polyphony’s patch, and there’s an outstanding issue for it (I’ll get to it eventually.)

Polyphony also provides, as described above, monkey-patches for third-party gems, such as pg, redis and others. Those are bundled as part of Polyphony, but in the future might be extracted to separate gems, in order to be able to respond more quickly to local issues that arise in integrating those gems with Polyphony.

I’d also like to note that I do not expect people to just add Polyphony to their Gemfile and start spinning up fibers all over the place. In fact, using Polyphony is to me such a radical shift from previous approaches to Ruby concurrency that I find it improbable that one day it will simply work with any Ruby on Rails app. Using Polyphony to its full potential will require much more careful consideration on the part of developers using it.

I’d also like to add that my goal is not for Polyphony to become the solution for fiber-based concurrency in Ruby. It’s just a project that I find useful for my own work and I feel could be useful for others as well. There’s nothing wrong with having multiple solutions to the same problem. On the contrary, I find it beneficial and stimulating to have competing projects based on different approaches.

So what does Polyphony patch?

Polyphony replaces whole parts of the Ruby core API with fiber-aware code that provides the same functionality, but integrated with Polyphony’s code. I took great care to keep the method signatures the same, and to make the patched methods behave identically to the originals as much as possible.
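One cheap way to guard against signature drift is to compare the arity of each patched method with the method it is about to override, before applying the patch. The helper below is a hypothetical sketch of such a check, not something Polyphony ships. The names signature_mismatches, CompatibleToI and IncompatibleToI are all invented for the example, and neither module is actually prepended to String:

```ruby
# Hypothetical helper: returns the names of methods in +patch+ that are
# missing from +target+ or whose arity differs from the original's.
# It must be called before the patch is prepended.
def signature_mismatches(target, patch)
  patch.instance_methods(false).reject do |name|
    target.method_defined?(name) &&
      target.instance_method(name).arity == patch.instance_method(name).arity
  end
end

# Accepts the same optional base argument as String#to_i.
module CompatibleToI
  def to_i(base = 10)
    super
  end
end

# Drops the optional base argument, so callers passing a base would break.
module IncompatibleToI
  def to_i
    42
  end
end

ok  = signature_mismatches(String, CompatibleToI)   #=> []
bad = signature_mismatches(String, IncompatibleToI) #=> [:to_i]
```

Arity is a coarse measure (it says nothing about keyword arguments or return values), so a check like this complements a behavioural test suite rather than replacing it.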

It’s worth noting that running Ruby programs with multiple fibers presents challenges that go beyond merely reading and writing to IO instances: there are all kinds of subtleties around forking, signal handling, waiting for child processes and thread control. Much of the monkey-patching that Polyphony performs is around that.

Here’s a (probably incomplete) list of APIs monkey-patched by Polyphony:

Polyphony also provides monkey-patches for gems such as pg, redis, mysql2 and sequel.

Conclusion

Polyphony uses monkey-patching extensively because it’s the best way to achieve the goals I set for myself in developing it. Yes, monkey-patching has its disadvantages, but it also has advantages (as I showed above). Finally, I believe Polyphony should rather be judged by what it can do, and by the value it provides to developers.