Understanding and Testing I/O in Ruby

8 May 2016

Let’s talk about I/O streams in Ruby and how to test them.

Most programs will communicate with the outside world. This is what we commonly refer to as I/O, or input and output. In general terms, with input we mean things like reading from a file, from another program over a port/socket (e.g. a database), receiving user input or an HTTP request. With output we refer to printing some messages on the command line shell, writing to a file, or sending some data somewhere. Yes, some programs have graphical user interfaces, but those are just a few more layers of indirection and complexity over the same basic concepts.

If we focus on the most elementary building blocks of these interactions, we can say that a program can receive or produce information. In Ruby, the basic input and output channels are represented with the IO class: simply stated, IO objects can be written to and read from.
It’s also interesting to point out that Ruby’s File class is a direct subclass of IO, and in fact we can think of files as particular IO streams backed by some sort of persistent thing on the other end of the pipe.
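
A quick way to check this relationship, as a minimal sketch that only relies on core classes (the variable names are just for illustration):

```ruby
# File inherits directly from IO, so every File object is also an IO object.
file_parent = File.superclass               # => IO
file_is_io  = File.ancestors.include?(IO)   # => true

# The standard streams are plain IO instances, not Files.
stdin_class = STDIN.class                   # => IO
```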

A program can have as many sources of input and output as it requires, but three standard streams will be preconnected automatically: standard input (stdin), standard output (stdout) and standard error (stderr). These are represented in Ruby as IO instances, and are preassigned when execution starts to a few global variables: $stdin, $stdout and $stderr (also available with the constants STDIN, STDOUT and STDERR).

STDIN
# => #<IO:<STDIN>>
STDIN.equal? $stdin
# => true

Of course these are properly open or closed for reading and writing operations, and trying to, let’s say, read from $stdout or write to $stdin will raise an IOError.
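
Here is a small sketch of that failure mode. The helper io_error_from is a made-up name for this example, and the constants are used instead of the globals so the result does not depend on anything having reassigned $stdin or $stdout:

```ruby
# Hypothetical helper: returns the class name of the IOError raised
# by the block, or nil if the block did not raise one.
def io_error_from
  yield
  nil
rescue IOError => e
  e.class.name
end

# STDOUT is only open for writing, STDIN only for reading.
read_from_stdout = io_error_from { STDOUT.gets }          # => "IOError"
write_to_stdin   = io_error_from { STDIN.puts "hello" }   # => "IOError"
```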

Since they are IO objects, we know that we can interact with them with any of the instance methods of IO. For example, we can read with IO#gets, IO#readline or IO#each_line, and we can write with IO#puts or IO#print.
Ruby also gives us some utility methods, like Kernel#gets that can be used directly.
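
To try these methods without depending on a terminal, we can use a pair of connected IO objects. This is only a sketch: IO.pipe is my choice of stand-in stream here, any readable and writable IO would behave the same way.

```ruby
# IO.pipe returns two connected IO objects: what is written to `writer`
# can be read back from `reader`.
reader, writer = IO.pipe

writer.puts "first"             # puts appends a newline
writer.print "second\n"         # print writes the string as-is
writer.puts "third", "fourth"   # one line per argument
writer.close                    # signals end of input to the reader

first  = reader.gets            # => "first\n"
second = reader.readline        # => "second\n"
rest   = reader.each_line.to_a  # => ["third\n", "fourth\n"]
at_eof = reader.gets            # => nil (readline would raise EOFError instead)
reader.close
```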

Command line I/O in practice

With that in mind, here are a few practical examples.

We can read manually entered user input, here simplistically in IRB:

puts "I am from $stdin! '#{gets.inspect}'"
# ...types "hello"
# => I am from $stdin! '"hello\n"'

$stdin.each_line { |input| puts "I am from $stdin! '#{input.inspect}'" }
# ...types "hello"
# => I am from $stdin! '"hello\n"'
# ...types "world"
# => I am from $stdin! '"world\n"'

Our program can also implement its own REPL:

class Repl
  def start
    loop do
      print "> "
      command = gets.chomp

      if command == "exit"
        puts "end of games..."
        break
      else
        puts "echo: \"#{command}\""
      end
    end
  end
end

Repl.new.start

Or we can use a Unix pipe to send messages to a Ruby process:

# my_listener.rb
loop do
  puts "$stdin: #{$stdin.gets}"
  sleep 2
end

# Either pipe some output into it directly
$ other_program_that_emits_output | ruby my_listener.rb

# Or we can start the ruby script and make it read from a pipe
$ mkfifo my_pipe
$ ruby my_listener.rb < my_pipe

# then, in another shell
$ echo "Hello, World!" > my_pipe

How to test I/O

It is helpful to start by imagining how our hypothetical program is handling I/O. Ideally there is a specific object responsible for input and output, and the rest of the program will simply talk with its interface.

With such a design we only have to deal with I/O in the unit tests for this object and in the integration tests for the program, where we want to check how the system behaves as a whole and ensure that the output for some input is correct.
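
To make the idea more concrete, here is a rough sketch of such a design. Console and Greeter are hypothetical names invented for this example, and injecting the streams through the constructor is just one possible way to structure it:

```ruby
require 'stringio'

# Hypothetical object that owns all the I/O of the program.
class Console
  def initialize(input: $stdin, output: $stdout)
    @input = input
    @output = output
  end

  # Prints a prompt and returns the answer without the trailing newline
  # (or nil at end of input).
  def ask(prompt)
    @output.print prompt
    line = @input.gets
    line && line.chomp
  end

  def say(message)
    @output.puts message
  end
end

# The rest of the program only talks to the Console interface.
class Greeter
  def initialize(console)
    @console = console
  end

  def run
    name = @console.ask("Name? ")
    @console.say("Hello, #{name}!")
  end
end

# In a test we can hand the console two in-memory streams.
input  = StringIO.new("Ada\n")
output = StringIO.new
Greeter.new(Console.new(input: input, output: output)).run

printed = output.string # => "Name? Hello, Ada!\n"
```

Only Console needs tests that deal with streams; Greeter could just as well be tested with a test double in place of the console.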

With this in mind, let’s look at a few techniques to test output and handle input.

Testing the program’s output

Testing the output of a program is easier than working with the input.

Producing output is such a common task that all the popular testing frameworks provide builtin helpers to record a program’s output and set expectations on it.

Minitest, which ships with Ruby, has the capture_io and the assert_output methods (the latter builds on top of the former):

def test_prints_to_stdout_the_right_message_1
  out, err = capture_io do
    @subject.do_something
  end
  assert_match %r{I'm the output!}, out
end

def test_prints_to_stdout_the_right_message_2
  assert_output %r{I'm the output!} do
    @subject.do_something
  end
end

while RSpec has the output matcher:

it "prints to stdout the right message" do
  expect {
    subject.do_something
  }.to output(
    %r{I'm the output!}
  ).to_stdout
end
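
To get a feeling for what helpers like these do, here is a simplified sketch in plain Ruby. It is not the actual source of either framework, and capture_stdout is a made-up name: the idea is simply to swap $stdout with an in-memory stream while the block runs.

```ruby
require 'stringio'

# Hypothetical helper: runs the block with $stdout temporarily replaced
# by a StringIO and returns everything that was printed.
def capture_stdout
  original = $stdout
  $stdout = StringIO.new
  yield
  $stdout.string
ensure
  # Always restore the real stream, even if the block raises.
  $stdout = original
end

out = capture_stdout { puts "I'm the output!" }
# out => "I'm the output!\n"
```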

Testing how the program reacts to input

Testing the input of a program is not as straightforward.

The concept of asserting or expecting a result only really applies to the output, which is produced directly from the program.
The input is something that is fed into the program from the outside, and the real question is not how to test it, but rather how to produce it (or simulate it) in our tests; once that is done, testing how the program reacts is simple.

We can approach the problem by considering what the program does to receive input: it uses a special IO object ($stdin) and sends it messages (calls its methods).
This means that all we have to do is act on $stdin, either stubbing its methods or replacing it altogether with a test double or another object with the same interface.

Let’s see how to use this piece of information in practice with a non-trivial problem: simulating a sequence of input values.

With RSpec we can easily stub a method with a series of return values:

allow($stdin).to receive(:gets).and_return("one", "two", "three")

$stdin.gets # => "one"
$stdin.gets # => "two"
$stdin.gets # => "three"

And with Minitest we can stub a method and use a callable object to get the same result (the stub method is added to every object by minitest/mock):

require 'minitest/mock'

list = %w(one two three)

$stdin.stub(:gets, proc { list.shift }) do
  puts $stdin.gets # => "one"
  puts $stdin.gets # => "two"
  puts $stdin.gets # => "three"
end

Also, we can do the exact same thing in plain Ruby:

def $stdin.gets
  @stubbed_input ||= %w(one two three)
  @stubbed_input.shift
end

$stdin.gets # => "one"
$stdin.gets # => "two"
$stdin.gets # => "three"

The previous examples are useful to quickly stub the program input and are a simple way to get things moving. Unfortunately, though, they only work if the program calls $stdin.gets directly, so they don’t really simulate input. They are brittle solutions that make a lot of assumptions about the inner design of the program.

For example, the techniques above will not work for the globally available method gets (from the Kernel module), which is possibly the most idiomatic way to read input. This is because stubbing a specific method on $stdin will have no effect on accessing $stdin in other ways.
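
The limitation is easy to reproduce. In this sketch a pipe stands in for $stdin (my choice, so that the example does not touch the real standard input): stubbing gets on the stream has no effect on readline, which still reads the real content.

```ruby
# A pipe plays the role of $stdin, with one real line waiting to be read.
reader, writer = IO.pipe
writer.puts "real line"
writer.close

# Stub a single method on this specific object, as in the examples above.
def reader.gets
  "stubbed line\n"
end

from_gets     = reader.gets      # => "stubbed line\n"
from_readline = reader.readline  # => "real line\n", the stub is bypassed
reader.close
```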

Before looking at a more solid solution, let’s see a quick way to set the proper stubs for gets.
Consider that gets is available globally because it is defined as an instance method of the Kernel module, and Kernel is included everywhere through Object. When the program uses gets, the receiver of the message is whatever is self in that context.
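
A short sketch to verify this. The Prompt class is a hypothetical example, and the stub is defined by hand in plain Ruby:

```ruby
# gets is a private instance method that every object inherits from Kernel.
kernel_in_object = Object.include?(Kernel)                # => true
gets_owner       = Object.new.method(:gets).owner         # => Kernel
gets_is_private  = Object.private_method_defined?(:gets)  # => true

# Hypothetical class that reads input with the bare `gets`.
class Prompt
  def ask
    gets # implicit receiver: self, i.e. the Prompt instance
  end
end

prompt = Prompt.new

# A method defined on the instance is found before Kernel#gets.
def prompt.gets
  "stubbed\n"
end

answer = prompt.ask # => "stubbed\n"
```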

Thus, the previous examples can be adapted to also work with Kernel#gets. The following snippet is a simple proof of concept, and in a real test self is probably going to be the object that will contain the calls to gets:

# rspec
allow(self).to receive(:gets).and_return("one", "two", "three")

# minitest (needs require 'minitest/mock' for stub)
list = %w(one two three)
self.stub(:gets, proc { list.shift }) do
  puts gets # => "one"
  puts gets # => "two"
  puts gets # => "three"
end

I must admit that I don’t particularly like this solution for the same reasons I don’t like the previous examples that show how to stub methods on $stdin. This might be a bit worse, also, because the stubs should be set as close to the original source of input as possible, and not on an object that just happens to expose the utility method gets.

A better input simulation technique

In order to simulate user input in a reliable way, without any knowledge of how the program will access the input, we must write into $stdin in advance. And since $stdin is not open for writing (it will raise an IOError), we have to replace $stdin with something else that we can control.

Doing so is not complicated: we just need to use an object with the right interface and then restore the original value after we’re done (remember that $stdin == STDIN):

class FakeInput
  def gets
    @stubbed_input ||= %w(one two three)
    @stubbed_input.shift
  end
end

$stdin = FakeInput.new

gets # => "one"
gets # => "two"
gets # => "three"

$stdin = STDIN # restore the original value

The code snippet above proves a point, but it’s not what we want because FakeInput is too bare-bones to be actually useful.
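
One way to see how much is missing is to compare the public interface of FakeInput with the one of IO. The list of methods below is an arbitrary selection of common reading methods, picked for this sketch:

```ruby
# Same class as above, repeated to keep the example self-contained.
class FakeInput
  def gets
    @stubbed_input ||= %w(one two three)
    @stubbed_input.shift
  end
end

# A few of the methods a program might use to read its input.
reading_methods = [:gets, :readline, :readlines, :each_line, :read]

missing_in_fake = reading_methods.reject { |m| FakeInput.public_method_defined?(m) }
# => [:readline, :readlines, :each_line, :read]

missing_in_io = reading_methods.reject { |m| IO.public_method_defined?(m) }
# => []
```

If the program switched from gets to any of the missing methods, the fake would stop working.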

Using another type of IO object is a better solution because they already implement the expected interface. For example, we can use the StringIO class from the Standard Library:

require 'stringio'

io = StringIO.new
io.puts "one"
io.puts "two"
io.puts "three"
io.puts "four"
io.rewind

$stdin = io

gets      # => "one\n"
readline  # => "two\n"
readlines # => ["three\n", "four\n"]

$stdin = STDIN

This is what we want to use, as it allows us to pre-program any kind of user input with no need to know how the program will read the information.
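
As a side note, StringIO.new also accepts an initial string, which saves the puts and rewind steps. A minimal sketch comparing the two approaches:

```ruby
require 'stringio'

# Built step by step, as in the example above.
written = StringIO.new
written.puts "one"
written.puts "two"
written.rewind

# Built from an initial string: the read position already starts at 0.
preloaded = StringIO.new("one\ntwo\n")

first_line = preloaded.gets   # => "one\n"
preloaded.rewind
same_content = written.read == preloaded.read # => true
```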

We can do better, though: for instance, we can use what we’ve learned to write a more reusable helper:

require 'stringio'

module IoTestHelpers
  def simulate_stdin(*inputs, &block)
    io = StringIO.new
    inputs.flatten.each { |str| io.puts(str) }
    io.rewind

    actual_stdin, $stdin = $stdin, io
    yield
  ensure
    $stdin = actual_stdin
  end
end

It can be used to provide a sequence of input values to be consumed inside a block.

For example, it can be used with the simple REPL we saw at the beginning of the post:

simulate_stdin("command1 arg", "otherCommand 1 2 3", "exit") do
  Repl.new.start
end

# > echo: "command1 arg"
# > echo: "otherCommand 1 2 3"
# > end of games...
# => nil

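Or it can be combined with the output test helpers, with Minitest (assuming the test class includes IoTestHelpers):

def test_it_prints_an_exit_message_on_exit_command
  # A String expectation must match exactly: the prompt and the message
  # share one line, and puts adds the trailing newline.
  assert_output "> end of games...\n" do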
    simulate_stdin("exit") { Repl.new.start }
  end
end

And with RSpec:

it "prints an exit message on the exit command" do
  expect {
    simulate_stdin("exit") { Repl.new.start }
  }.to output("> end of games...\n").to_stdout
end

I’ve found this solution to not only be helpful in writing expressive tests, but to also make integration tests very resilient: the input stubs in the tests should keep working even if the code to access $stdin in the program is refactored.
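
Here is a small sketch of that resilience. The two shout_* methods are hypothetical, and stand for the same feature before and after a refactoring that changes how the input is read; the helper is repeated (without the module) so that the example can run on its own:

```ruby
require 'stringio'

# Same helper as in the post, repeated to keep the sketch self-contained.
def simulate_stdin(*inputs)
  io = StringIO.new
  inputs.flatten.each { |str| io.puts(str) }
  io.rewind

  actual_stdin, $stdin = $stdin, io
  yield
ensure
  $stdin = actual_stdin
end

# Hypothetical implementation "before": reads line by line with gets.
def shout_with_gets
  lines = []
  while (line = $stdin.gets)
    lines << line.chomp.upcase
  end
  lines
end

# Hypothetical implementation "after": iterates with each_line.
def shout_with_each_line
  $stdin.each_line.map { |line| line.chomp.upcase }
end

# The same simulated input drives both versions.
before = simulate_stdin("one", "two") { shout_with_gets }       # => ["ONE", "TWO"]
after  = simulate_stdin("one", "two") { shout_with_each_line }  # => ["ONE", "TWO"]
```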

As always, YMMV.

tags: code ruby testing cli