As we talked about in part one, Conspire makes heavy use of Akka for our backend. Akka itself provides terrific features for scalable, resilient processing but the heart of all of this is a concept known as “actors.”
(Shameless plug: To join Conspire and get a personalized weekly update about your email relationships, sign up at goconspire.com.)
What is an actor?
Akka’s benefits stem from a simple concurrency model known as the “actor model.” An actor is the most basic unit of computing within this model. An actor is very limited in its capabilities, but these limitations provide Akka its power: an actor can receive messages, send messages to other actors, create actors and monitor those actors it creates. You can only interact with an actor by passing a message to an actor’s reference—the actual actor is never accessible. The decoupling of an actor from its reference is key: a reference can point to an actor on the same machine or a different machine and in the event of failure, the actor behind the reference can be replaced without impacting the reference. An actor processes only one message at a time, and its state cannot be accessed or modified except by its receive function.
An actor consists of state, behavior, a mailbox (i.e., a queue of messages), a supervisor and, potentially, children. State is not accessible from outside the actor—client code only interfaces with an actor via message passing to its reference. Messages are sent to an actor’s reference and placed in its mailbox. Its behavior (that is, receive function) processes one message at a time, potentially changing the actor’s state. An actor can create other actors, for which it serves as a supervisor, handling failures within the child actors.
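To make that anatomy concrete, here is a minimal sketch of an actor in Scala using Akka’s classic actor API (assuming Akka 2.4 or later). The Counter actor and its Increment and GetCount messages are hypothetical names invented for this illustration; they are not part of Conspire’s code.

```scala
import akka.actor.{Actor, ActorRef, ActorSystem, Props}

// Hypothetical messages, used only for this illustration.
case object Increment
case object GetCount

class Counter extends Actor {
  // State: private to the actor and only touched from inside receive.
  private var count = 0

  // Behavior: handles one message from the mailbox at a time.
  def receive: Receive = {
    case Increment => count += 1
    case GetCount  => sender() ! count
  }
}

object CounterExample extends App {
  val system = ActorSystem("example")
  // actorOf returns an ActorRef; the Counter instance itself is never exposed.
  val counter: ActorRef = system.actorOf(Props[Counter](), "counter")
  counter ! Increment
  counter ! Increment
  system.terminate()
}
```

Client code only ever holds the ActorRef, so there is no way to read or change count except by sending a message.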
To the code!
Let’s take a look at an example. In this example, we’ll show a common pattern: request comes in, create an actor for it, stop that actor when complete. We’ll also show how actors provide resilience by specifying a supervisor strategy for child actors and how to manage state safely (hint: nothing special required for this one!).
First of all, let’s get some definitions out of the way. Assume that we define a NewUserQueue actor elsewhere in the system which continuously streams users for processing to actors which register themselves with this queue. Also, assume that UserIndexer is an actor which indexes relevant documents for a user: this process could take quite a long time. Finally, assume that UserPersister just handles writing users to a database. To begin, we’ll stub out some actors and define our messages.
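The stubs and messages might look something like the sketch below. Only the actor names NewUserQueue, UserIndexer and UserPersister come from the text; the User type and every message name here are assumptions made for illustration.

```scala
import akka.actor.{Actor, ActorRef}

// Hypothetical domain type; the real User is not shown in the text.
case class User(id: Long, email: String)

// Hypothetical messages sent to NewUserQueue to start or stop receiving users.
case class Register(listener: ActorRef)
case class Unregister(listener: ActorRef)

// Hypothetical request to a UserIndexer, and its two possible replies.
case class IndexUser(user: User)
case class IndexingComplete(user: User)
case class IndexingFailed(user: User, cause: Throwable)

// Hypothetical request to the UserPersister.
case class PersistUser(user: User)

class UserIndexer extends Actor {
  def receive: Receive = {
    case IndexUser(user) =>
      // Real indexing would happen here; the stub reports success right away.
      sender() ! IndexingComplete(user)
  }
}

class UserPersister extends Actor {
  def receive: Receive = {
    case PersistUser(_) =>
      // A real implementation would write the user to a database.
      ()
  }
}
```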
Our pipeline will marshal users from the NewUserQueue to a new UserIndexer.
We’ll create our own UserPersister:
And a map to track the UserIndexer actors by user:
We’ll set up the supervisor strategy for the actors we create. This lets us handle restart logic on a per-exception basis for exceptions that can’t be cleanly handled internally by the child actors.
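As a rough sketch of what a per-exception strategy can look like with Akka’s classic API: the two exception types, the retry limits and the UserPipeline name below are all hypothetical, chosen only to show how different failures can map to different directives.

```scala
import akka.actor.{Actor, OneForOneStrategy, SupervisorStrategy}
import akka.actor.SupervisorStrategy.{Escalate, Restart, Stop}
import scala.concurrent.duration._

// Hypothetical exception types, used only to illustrate per-exception decisions.
class TransientIndexException(msg: String) extends RuntimeException(msg)
class CorruptUserException(msg: String) extends RuntimeException(msg)

object UserPipeline {
  // One-for-one: the decision applies only to the child that failed.
  val strategy: SupervisorStrategy =
    OneForOneStrategy(maxNrOfRetries = 3, withinTimeRange = 1.minute) {
      case _: TransientIndexException => Restart  // fresh instance, same reference
      case _: CorruptUserException    => Stop     // give up on this child
      case _: Exception               => Escalate // let our own supervisor decide
    }
}

class UserPipeline extends Actor {
  override val supervisorStrategy: SupervisorStrategy = UserPipeline.strategy

  // Message handling is omitted in this sketch.
  def receive: Receive = Actor.emptyBehavior
}
```

Keeping the decider in one place means the children can simply throw and leave the recovery policy to their parent.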
Then, on start we’ll ask the NewUserQueue to start sending us users. On stop, we’ll tell it we’re no longer interested.
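In Akka’s classic API these two moments correspond to the preStart and postStop lifecycle hooks. Here is a small sketch, assuming hypothetical Register and Unregister messages and assuming the queue’s reference is handed to the actor’s constructor (the text does not say how the pipeline finds the queue).

```scala
import akka.actor.{Actor, ActorRef}

// Hypothetical registration messages understood by NewUserQueue.
case class Register(listener: ActorRef)
case class Unregister(listener: ActorRef)

// Assumption: the queue's reference is passed in by whoever creates this actor.
class UserPipeline(newUserQueue: ActorRef) extends Actor {
  // Called when the actor starts; by default also called again after a restart.
  override def preStart(): Unit = newUserQueue ! Register(self)

  // Called once the actor has stopped; by default also called before a restart.
  override def postStop(): Unit = newUserQueue ! Unregister(self)

  // Message handling is omitted in this sketch.
  def receive: Receive = Actor.emptyBehavior
}
```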
Finally, we’ll define this actor’s behavior, known as the receive function.
For each user, we’re going to create a UserIndexer to do the actual work for us. We’ll watch this actor (context.watch) so that we’re notified in case of failures. If it succeeds, we’ll write the updated user back to the database.
If the UserIndexer succeeds, we’ll send that user on to the persister and stop the worker actor.
If the UserIndexer fails, we’ll handle that too.
In either case, we want to stop watching that actor and remove it from our map of current indexers.
We also want to know if one of our indexers dies unexpectedly. For that, we’ll handle the Terminated message that Akka delivers for actors we watch.
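Putting those steps together, the receive function could be sketched roughly as follows. This is an illustration under assumptions, not Conspire’s actual code: the User type, all message names and the UserPipeline name are hypothetical, the child actors are built from Props passed to the constructor, and stopping the indexer on failure as well as on success is a choice made for this sketch.

```scala
import akka.actor.{Actor, ActorRef, Props, Terminated}

// Hypothetical types and messages; the originals are not shown in the text.
case class User(id: Long, email: String)
case class IndexUser(user: User)
case class IndexingComplete(user: User)
case class IndexingFailed(user: User, cause: Throwable)
case class PersistUser(user: User)

class UserPipeline(indexerProps: Props, persisterProps: Props) extends Actor {
  // Our own persister, created as a child of this actor.
  private val persister: ActorRef = context.actorOf(persisterProps, "persister")

  // Actor state: safe to mutate because receive handles one message at a time.
  private var indexers = Map.empty[User, ActorRef]

  def receive: Receive = {
    case user: User =>
      // One fresh worker per user, watched so we hear about its termination.
      val indexer = context.watch(context.actorOf(indexerProps))
      indexers += (user -> indexer)
      indexer ! IndexUser(user)

    case IndexingComplete(user) =>
      persister ! PersistUser(user)
      cleanUp(user)

    case IndexingFailed(user, _) =>
      cleanUp(user)

    case Terminated(ref) =>
      // Finished indexers are unwatched in cleanUp, so this is an unexpected death.
      indexers = indexers.filter { case (_, indexer) => indexer != ref }
  }

  // Stop watching the worker, stop it, and forget about it.
  private def cleanUp(user: User): Unit =
    indexers.get(user).foreach { indexer =>
      context.unwatch(indexer)
      context.stop(indexer)
      indexers -= user
    }
}
```

Note that the indexers map is a plain var with no locks around it; the one-message-at-a-time guarantee is what makes that safe.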
What did we just do?
In this example, you’ve seen actors modify their internal state (thread-safe!), interact with other actors and supervise their children. Moreover, you’ve seen a very common pattern: create a new actor for every request, clean it up when finished. This pattern allows each request to start from a clean slate. Moreover, actors are very lightweight so the cost of creating many actors is negligible (depending, of course, on how much state those actors contain).
Why is this better?
Replace your mental model of a thread with an actor; the difference is that state is now contained and must be communicated between actors solely through message passing. The rules imposed by the actor model are the very thing that allow Akka to keep complex systems conceptually simple. The actor model cleanly compartmentalizes state and, in turn, the way we think about it. This is how a small team of three was able to build an elastically scalable, fault-tolerant backend in so little time. Akka isn’t all roses though: later in this series we’ll cover pitfalls we ran into over the last few months.