Understanding and Testing I/O in Ruby
Let’s talk about I/O streams in Ruby and how to test them.
Most programs will communicate with the outside world. This is what we commonly refer to as I/O, or input and output. In general terms, with input we mean things like reading from a file, from another program over a port/socket (e.g. a database), receiving user input or an HTTP request. With output we refer to printing some messages on the command line shell, writing to a file, or sending some data somewhere. Yes, some programs have graphical user interfaces, but those are just a few more layers of indirection and complexity over the same basic concepts.
If we focus on the most elementary building blocks of these interactions, we can say that a program can receive or produce information. In Ruby, the basic input and output channels are represented with the `IO` class: simply stated, `IO` objects can be written to and read from.
It’s also interesting to point out that Ruby’s `File` class is a direct subclass of `IO`, and in fact we can think of files as particular `IO` streams backed by some sort of persistent thing on the other end of the pipe.
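A quick way to check that relationship is to ask Ruby directly. This is only a small sketch; it writes to the null device (`File::NULL`) so that it does not touch any real file.

```ruby
File.superclass                   # => IO
File.instance_method(:puts).owner # => IO (File inherits #puts, it does not redefine it)

# A File object is an IO object, so the usual IO methods work on it.
File.open(File::NULL, "w") do |file|
  file.is_a?(IO) # => true
  file.puts "discarded"
end
```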
A program can have as many sources of input and output as it requires, but three standard streams will be preconnected automatically: standard input (stdin), standard output (stdout) and standard error (stderr). These are represented in Ruby as `IO` instances, and are preassigned when execution starts to a few global variables: `$stdin`, `$stdout` and `$stderr` (also available as the constants `STDIN`, `STDOUT` and `STDERR`).
```ruby
STDIN
# => #<IO:<STDIN>>
STDIN.equal? $stdin
# => true
```
Of course each stream is opened only for the operations that make sense for it: `$stdin` for reading, `$stdout` and `$stderr` for writing. Trying to, let’s say, read from `$stdout` or write to `$stdin` will raise an `IOError`.
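The same behaviour can be reproduced without a terminal. The sketch below uses the two ends of `IO.pipe` as stand-ins for a read-only and a write-only stream; `io_error_from` is a hypothetical helper written just for this example.

```ruby
# Hypothetical helper: runs the block and returns the IOError it raises, if any.
def io_error_from
  yield
  nil
rescue IOError => error
  error
end

reader, writer = IO.pipe # reader is read-only, writer is write-only

wrong_read  = io_error_from { writer.gets }       # reading from the write end
wrong_write = io_error_from { reader.puts("hi") } # writing to the read end

puts wrong_read.class  # prints IOError
puts wrong_write.class # prints IOError

reader.close
writer.close
```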
Since they are `IO` objects, we can interact with them through any of the instance methods of `IO`. For example, we can read with `IO#gets`, `IO#readline` or `IO#each_line`, and we can write with `IO#puts` or `IO#print`.
Ruby also gives us some utility methods, like `Kernel#gets`, that can be used directly.
Command line I/O in practice
With that in mind, here are a few practical examples.
We can read manually entered user input, here simplistically in IRB:
```ruby
puts "I am from $stdin! '#{gets.inspect}'"
# ...types "hello"
# => I am from $stdin! '"hello\n"'

$stdin.each_line { |input| puts "I am from $stdin! '#{input.inspect}'" }
# ...types "hello"
# => I am from $stdin! '"hello\n"'
# ...types "world"
# => I am from $stdin! '"world\n"'
```
Our program can also implement its own REPL:
```ruby
class Repl
  def start
    loop do
      print "> "
      command = gets.chomp

      if command == "exit"
        puts "end of games..."
        break
      else
        puts "echo: \"#{command}\""
      end
    end
  end
end

Repl.new.start
```
Or we can use a Unix pipe to send messages to a Ruby process:
```ruby
# my_listener.rb
loop do
  puts "$stdin: #{$stdin.gets}"
  sleep 2
end
```
```shell
# Either pipe some output into it directly
$ other_program_that_emits_output | ruby my_listener.rb

# Or we can start the ruby script and make it read from a pipe
$ mkfifo my_pipe
$ ruby my_listener.rb < my_pipe

# then, in another shell
$ echo "Hello, World!" > my_pipe
```
How to test I/O
It is helpful to start by imagining how our hypothetical program is handling I/O. Ideally there is a specific object responsible for input and output, and the rest of the program simply talks to it through its interface.
With such a design we only have to deal with I/O in the unit tests for this object and in the integration tests for the program, where we want to check how the system behaves as a whole and ensure that the output for some input is correct.
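To make that design concrete, here is a minimal sketch of what such an object could look like. The `Console` class and its `ask` and `say` methods are hypothetical names chosen for illustration; the only assumption is that the streams are passed in, defaulting to `$stdin` and `$stdout`.

```ruby
require "stringio"

# Hypothetical wrapper: the only object in the program that touches the streams.
class Console
  def initialize(input: $stdin, output: $stdout)
    @input = input
    @output = output
  end

  # Prints a prompt and returns the answer without the trailing newline
  # (or nil when the input is exhausted).
  def ask(prompt)
    @output.print(prompt)
    @input.gets&.chomp
  end

  def say(message)
    @output.puts(message)
  end
end

# In a unit test the streams can be in-memory objects.
input = StringIO.new("Ada\n")
output = StringIO.new
console = Console.new(input: input, output: output)

name = console.ask("Name? ")
console.say("Hello, #{name}!")

output.string # => "Name? Hello, Ada!\n"
```

With this shape, the rest of the program can be tested against a double of `Console`, and only `Console` itself needs tests that deal with streams.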
With this in mind, let’s look at a few techniques to test output and handle input.
Testing the program’s output
Testing the output of a program is easier than working with the input.
Producing output is such a common task that all the popular testing frameworks provide built-in helpers to record a program’s output and set expectations on it.
Minitest, which ships with Ruby, has the `capture_io` and `assert_output` methods (the latter builds on top of the former):
```ruby
def test_prints_to_stdout_the_right_message_1
  out, err = capture_io do
    @subject.do_something
  end
  assert_match(%r{I'm the output!}, out)
end

def test_prints_to_stdout_the_right_message_2
  assert_output(%r{I'm the output!}) do
    @subject.do_something
  end
end
```
while RSpec has the `output` matcher:
```ruby
it "prints to stdout the right message" do
  expect {
    subject.do_something
  }.to output(
    %r{I'm the output!}
  ).to_stdout
end
```
Testing how the program reacts to input
Testing the input of a program is not as straightforward.
The concept of asserting or expecting a result only really applies to the output, which is produced directly from the program.
The input is something that is fed into the program from the outside, and the real question is not how to test it, but rather how to produce it (or simulate it) in our tests. Once that is done, testing how the program reacts is simple.
We can approach the problem by considering what the program does to receive input: it uses a special `IO` object (`$stdin`) and sends it messages (calls its methods).
This means that all we have to do is act on `$stdin`, either stubbing its methods or replacing it altogether with a test double or another object with the same interface.
Let’s see how to use this piece of information in practice with a non-trivial problem: simulating a sequence of input values.
With RSpec we can easily stub a method with a series of return values:
```ruby
allow($stdin).to receive(:gets).and_return("one", "two", "three")

$stdin.gets # => "one"
$stdin.gets # => "two"
$stdin.gets # => "three"
```
And with Minitest we can stub a method and use a callable object to get the same result:
```ruby
require 'minitest/mock' # provides Object#stub

list = %w(one two three)

$stdin.stub(:gets, proc { list.shift }) do
  puts $stdin.gets # => "one"
  puts $stdin.gets # => "two"
  puts $stdin.gets # => "three"
end
```
We can also do the exact same thing in plain Ruby:
```ruby
def $stdin.gets
  @stubbed_input ||= %w(one two three)
  @stubbed_input.shift
end

$stdin.gets # => "one"
$stdin.gets # => "two"
$stdin.gets # => "three"
```
The previous examples are useful to quickly stub the program input and are a simple way to get things moving. Unfortunately, though, they will only work if the program calls `$stdin.gets` directly, so they fail to simulate input properly. They are brittle solutions that make a lot of assumptions about the inner design of the program.
For example, the techniques above will not work for the globally available method `gets` (from the `Kernel` module), which is possibly the most idiomatic way to read input. This is because stubbing a specific method on `$stdin` has no effect on code that accesses `$stdin` in other ways.
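The limitation is easy to see on any `IO` object. The sketch below uses the read end of `IO.pipe` as a stand-in for `$stdin`, so that it does not need a terminal: only `gets` is stubbed, and `readline` still goes to the real stream.

```ruby
reader, writer = IO.pipe
writer.puts "real line"
writer.close

# Stub only #gets on this one object, as in the plain Ruby example above.
def reader.gets
  "stubbed"
end

stubbed = reader.gets     # => "stubbed"
real    = reader.readline # => "real line\n"

reader.close
```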
Before looking at a more solid solution, let’s see a quick way to set the proper stubs for `gets`.
Consider that `gets` is available globally because it is defined as an instance method of the `Kernel` module, and `Kernel` is included everywhere through `Object`. When the program uses `gets`, the receiver of the message is whatever `self` is in that context.
Thus, the previous examples can be adapted to also work with `Kernel#gets`. The following snippet is a simple proof of concept, and in a real test `self` is probably going to be the object that will contain the calls to `gets`:
```ruby
# rspec
allow(self).to receive(:gets).and_return("one", "two", "three")

# minitest
list = %w(one two three)
self.stub(:gets, proc { list.shift }) do
  puts gets # => "one"
  puts gets # => "two"
  puts gets # => "three"
end
```
I must admit that I don’t particularly like this solution, for the same reasons I don’t like the previous examples that stub methods on `$stdin`. It might even be a bit worse, because stubs should be set as close to the original source of input as possible, and not on an object that just happens to expose the utility method `gets`.
A better input simulation technique
In order to simulate user input in a reliable way, without any knowledge of how the program will access the input, we must write into `$stdin` in advance. And since `$stdin` is not open for writing (it will raise an `IOError`), we have to replace `$stdin` with something else that we can control.
Doing so is not complicated: we just need to use an object with the right interface and then restore the original value after we’re done (remember that `$stdin == STDIN`):
```ruby
class FakeInput
  def gets
    @stubbed_input ||= %w(one two three)
    @stubbed_input.shift
  end
end

$stdin = FakeInput.new

gets # => "one"
gets # => "two"
gets # => "three"

$stdin = STDIN # restore the original value
```
The code snippet above proves a point, but it’s not what we want because `FakeInput` is too bare-bones to be actually useful.
Using another `IO`-like object is a better solution because it already implements the expected interface. For example, we can use the `StringIO` class from the standard library:
```ruby
require 'stringio'

io = StringIO.new
io.puts "one"
io.puts "two"
io.puts "three"
io.puts "four"
io.rewind

$stdin = io

gets # => "one\n"
readline # => "two\n"
readlines # => ["three\n", "four\n"]

$stdin = STDIN
```
This is what we want to use, as it allows us to pre-program any kind of user input with no need to know how the program will read the information.
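As a side note, `StringIO.new` also accepts the initial contents as an argument, which removes the need to write and rewind. The following is a small sketch of the same idea; it reads through explicit `$stdin` calls, using a different reading method each time, and restores the original stream in an `ensure` block.

```ruby
require "stringio"

original_stdin = $stdin
# The stream starts at position 0, ready to be read.
$stdin = StringIO.new("one\ntwo\nthree\nfour\n")

begin
  first  = $stdin.gets                    # => "one\n"
  second = $stdin.readline                # => "two\n"
  rest   = $stdin.each_line.map(&:chomp)  # => ["three", "four"]
ensure
  $stdin = original_stdin # always restore the real stream
end
```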
We can do better, though, and use what we’ve learned to write a more usable helper method:
```ruby
require 'stringio'

module IoTestHelpers
  def simulate_stdin(*inputs, &block)
    io = StringIO.new
    inputs.flatten.each { |str| io.puts(str) }
    io.rewind

    actual_stdin, $stdin = $stdin, io
    yield
  ensure
    $stdin = actual_stdin
  end
end
```
This helper can be used to provide sequences of input values to be consumed inside a block.
For example it can be used with the simple REPL we saw at the beginning of the post:
```ruby
simulate_stdin("command1 arg", "otherCommand 1 2 3", "exit") do
  Repl.new.start
end

# > echo: "command1 arg"
# > echo: "otherCommand 1 2 3"
# > end of games...
# => nil
```
Or it can be combined with the output test helpers, with Minitest:
```ruby
def test_it_prints_an_exit_message_on_exit_command
  # A String expectation is compared exactly: prompt, message and newline.
  assert_output "> end of games...\n" do
    simulate_stdin("exit") { Repl.new.start }
  end
end
```
And with RSpec:
```ruby
it "prints an exit message on the exit command" do
  # A String expectation is compared exactly: prompt, message and newline.
  expect {
    simulate_stdin("exit") { Repl.new.start }
  }.to output("> end of games...\n").to_stdout
end
```
I’ve found this solution to not only be helpful in writing expressive tests, but to also make integration tests very resilient: the input stubs in the tests should keep working even if the code to access `$stdin` in the program is refactored.
As always, YMMV.