A few weeks ago I gave a presentation that included an overview of what we mean by web concurrency on the back end.
Also, since I had been wanting to try Tumult Hype for a while, I took the opportunity to make animations! (And hat tip to Tumult: Hype is really great.)
I’m going to write longer posts to look at the three concurrency models more closely, for example to explain how we achieve concurrency in Rails despite its sequential approach to request handling.
In the meantime, here are the animations:
Hopefully they will make sense to you if you’re already familiar with how those libraries and frameworks work. If they don’t, read on.
And if you find them helpful, feel free to use them!
The request journey
The animations might require a bit of explanation. Let’s have a look at the first one, not because it’s titled “Ruby”, but because it’s the simplest of the three.
There are coloured loops, lines and a moving circle. The loops and lines represent different types of activities that happen on the back end when a request is processed. The lengths and sizes are a rough approximation of the time that something would take (so that a DB query takes longer than a cache read, but less than a request to an external API). The moving circle that travels along the lines and loops is our incoming HTTP request, but also an indicator of progress and what is being executed.
The request comes in and reaches the server that starts processing it immediately. That’s the inner red loop, where the CPU work happens. At this stage, the server is probably parsing the request headers and body, or analyzing the request method and path to route the request to the right handler.
Responding to this request is not simply CPU-bound, however, but also I/O-bound. And, in fact, soon enough the server needs to communicate with an external resource. The yellow loop represents the time spent reading from (or writing to) a cache, and the white lines represent I/O activity. When the request is travelling on one of the external loops, no CPU work happens on the server.
In our case the server has started handling the request and, as soon as it completes the preliminary parsing and routing steps, it checks if there is a cached response available (there isn’t). Once a reply has been returned from the cache, the server resumes the CPU work and executes the code in the request handler until it has to query a database.
The blue loops represent the time spent querying a DB, and let’s say that the first one is a read operation and the second one, later, an update. Once the DB query has completed, the server goes back to doing some more CPU work. This time it’s probably filtering and analyzing the data obtained from the DB.
In the middle of this process, the server decides that it needs data over the network from somewhere else (which could be another service in the data center or a third-party REST API, it doesn’t matter: the DB and the cache are attached as network resources too). So, the server sends an HTTP request and pauses again waiting for the response. Travelling through the green loop takes a bit longer to complete (maybe the service is slow or there is network latency), and the server waits until an API response is returned and it can resume its local execution.
With the API response in hand the server can do some more CPU work, for example it can use the newly obtained information to traverse the data structure built from the DB results and set some new values.
A final I/O operation is required when some record must be updated in the DB, and again the server waits idle until it completes. Finally, the last bit of CPU work is executed to serialize the assembled data as JSON and return an HTTP response.
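To make this sequential journey concrete, here is a toy Ruby sketch of such a handler. All the names are invented, and the “cache”, “DB” and API are just in-memory stand-ins, but real synchronous clients would block the thread at exactly the same points:

```ruby
require "json"

# Toy stand-ins for the external resources. Each lookup here is where a
# real handler would block on I/O while the CPU sits idle.
CACHE = {}                                       # empty, so the lookup misses
DB    = { 42 => { "id" => 42, "name" => "widget" } }

def fetch_api_data
  # Placeholder for a blocking HTTP call to another service.
  { "price" => 9.99 }
end

def handle_request(id)
  cached = CACHE[id]                        # cache read (I/O): a miss
  return cached if cached

  record = DB[id].dup                       # DB read (I/O)
  record["name"] = record["name"].upcase    # CPU work on the result

  api = fetch_api_data                      # HTTP call to an external API (I/O)
  record["price"] = api["price"]            # more CPU work

  DB[id] = record                           # DB update (I/O)
  JSON.generate(record)                     # final CPU work: serialize the response
end

puts handle_request(42)   # => {"id":42,"name":"WIDGET","price":9.99}
```

Every step runs to completion before the next one starts, which is exactly what the single moving circle in the animation depicts.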
Once the response is returned, a new HTTP request is accepted and the loop starts over.
Same journey, different ways of travelling
What I just described applies one-to-one to how a Rack framework like Ruby on Rails processes requests: it’s sequential and quite simple.
Let’s imagine, however, that the same request could be served by different kinds of systems. What would Node.js (evented) or Phoenix (actors) do when responding to requests just like this one?
That’s what the other two animations try to show. In each of the three models, the individual requests go through the exact same steps, in the exact same order. The evented and actor-based models, however, try to process them concurrently using different strategies.
Node.js, specifically, will use an event loop to schedule I/O tasks as asynchronous callbacks: requests can run concurrently, but only one thing at a time can happen in the inner event loop where the CPU work is done.
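To illustrate the idea (in Ruby rather than JavaScript, with invented names), here is a toy single-threaded scheduler where each “request” is a fiber that yields at its I/O points, letting the loop interleave another request’s CPU work in the meantime:

```ruby
# Two "requests" as fibers. Each Fiber.yield marks an I/O wait, where the
# single thread is free to run another request's CPU work instead.
log = []

requests = %w[A B].map do |name|
  Fiber.new do
    log << "#{name}: parse"    # CPU work
    Fiber.yield                # waiting on the cache
    log << "#{name}: query"    # CPU work
    Fiber.yield                # waiting on the DB
    log << "#{name}: respond"  # CPU work
  end
end

# A minimal event loop: keep resuming whichever fiber is still runnable.
until requests.none?(&:alive?)
  requests.each { |f| f.resume if f.alive? }
end

p log
# => ["A: parse", "B: parse", "A: query", "B: query", "A: respond", "B: respond"]
```

The two requests make progress concurrently, but the log shows that only one piece of CPU work ever runs at a time, which is the essential constraint of the evented model.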
Phoenix, on the other hand, uses actors to isolate the execution of the concurrent requests. And in addition to efficient concurrency for I/O tasks, the VM’s ability to scale on multi-core systems gives us true parallelism for CPU work.
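A rough sketch of the actor shape, again in Ruby with invented names: each actor owns a private mailbox and processes its messages sequentially, sharing no state with anyone else. (Plain Ruby threads don’t give you the BEAM’s multi-core parallelism, so this only shows the message-passing structure, not the performance characteristics.)

```ruby
# Each "actor" is a thread with a private mailbox; requests arrive as
# messages, and the actor handles them one at a time in isolation.
mailbox = Queue.new
replies = Queue.new

actor = Thread.new do
  loop do
    msg = mailbox.pop            # block until a message arrives
    break if msg == :stop
    replies << "processed #{msg}"  # the actor's own sequential work
  end
end

mailbox << "req-1"
mailbox << "req-2"
mailbox << :stop
actor.join

first  = replies.pop
second = replies.pop
p [first, second]   # ["processed req-1", "processed req-2"]
```

On the BEAM, spawning one such process per request is cheap, and the scheduler spreads them across all cores, which is where the true parallelism for CPU work comes from.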
More on this subject in another post.