As I wrote about previously, I’ve been using Elixir for the past several months to build out our new marketing video automation platform. In that post, I talked about some of my initial reactions to Elixir & Phoenix and tried to give an objective breakdown of how the technologies work, particularly in comparison to Ruby & Rails, which I’ve used quite a bit in my career.

That story was not intended to be an Elixir cheerleader post. This story, on the other hand, is intended to be an Elixir cheerleader post. Here, I want to document some of the things I really love about working with Elixir and some of the reasons why I don’t think I’ll be headed back to Ruby any time soon. In particular, I want to describe a couple of neat language and ecosystem features that are really making Elixir a joy to work with.

I know that some people reading through these features will shake their fists, “What about Haskell/Lisp/Rust/whatever-my-favorite-language-is?! They’ve had that feature for years!” And they’re right: none of these features by themselves are unique to Elixir, and that’s fine — but that’s not the point of the post. The point is to describe a particular collection of neat features, which, along with many, many others, make Elixir a pleasure to use.

Control flow via pattern matching

The first feature I’d like to discuss is pattern matching, and in particular, using pattern matching for control flow in Elixir. Let’s say we need to handle some incoming stream that can provide messages with a variety of message types. In other languages, I’d usually think about writing something like this in a case-statement:

def process_message(message) do
  case message do 
    %{type: :start, info: info}  -> consume_start(info)
    %{type: :data, data: data}   -> consume_data(data)
    %{type: :finish, info: info} -> consume_finish(info)
    %{type: type} -> Logger.warning("Unknown message type: #{type}")
  end
end

Elixir also has a case statement, so we could do the same thing. But there’s another language feature we can use as well: pattern matching. We can pattern match against the incoming data to provide a variety of implementations, each matching a different structure of the incoming message:

def process_message(%{type: :start, info: info}) do
  consume_start(info)
end

def process_message(%{type: :data, data: data}) do
  consume_data(data)
end

def process_message(%{type: :finish, info: info}) do
  consume_finish(info)
end

def process_message(%{type: type}) do
  Logger.warning("Unknown message type: #{type}")
end

With this pattern matching in place, there’s no explicit flow-control statement — we’ve eliminated the case call completely. Whenever we call process_message, it will just invoke the correct implementation based on our pattern matching:

def receive_message(msg) do
  ...
  process_message(msg)
  ...
end

This is conceptually similar to the multiple method dispatch found in many languages, in which multiple function implementations can be defined with different classes or types, but pattern matching goes much further. Here, we can actually dispatch on the internal structure or values of the data, not just the type of the data.
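For example, a handler can dispatch on the specific value of a field, not just on the field’s presence (the module and function names here are hypothetical, just for illustration):

```elixir
defmodule StatusHandler do
  # Matches only when :code is exactly 404 — dispatch on a value
  def handle(%{code: 404, path: path}), do: "not found: #{path}"

  # Matches any other map that has a :code field — dispatch on structure
  def handle(%{code: code}), do: "status: #{code}"
end

StatusHandler.handle(%{code: 404, path: "/users"})  # => "not found: /users"
StatusHandler.handle(%{code: 200})                  # => "status: 200"
```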

The use of pattern matching for flow control can vastly improve the readability of code. Consider the following pseudo-code with a variety of outcomes based on nested conditionals:

def run(msg, ctx) do
  if msg.state == "ready" do
    if length(msg.data) > 0 do
      ...
    else
      ...
    end
  else
    if msg.state == "pending" do
      ...
    else
      if msg.state == "start" and not ctx.initialized do
        ...
      else
        ...
      end
    end
  end
end

Code like this can be busy and difficult to parse. With pattern matching, we can rewrite the conditional logic by dispatching to different functions based on the input:

def run(%{state: "ready", data: data}, ctx) when length(data) > 0 do
  ...
end

def run(%{state: "ready"}, ctx) do
  ...
end

def run(%{state: "pending"}, ctx) do
  ...
end

def run(%{state: "start"}, %{initialized: false} = ctx) do
  ...
end

def run(_, _) do 
  ...
end

Note that in the first implementation, in addition to the pattern match, we have a guard (“when length(data) > 0”) which lets us further refine the match. The last implementation, written as “run(_, _)” matches any value for the two arguments, so it behaves like our final “else” clause in the original example.
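One caveat worth knowing: guards are limited to a small, fixed set of expressions (type checks like is_binary/1, comparisons, length/1, byte_size/1, and so on), so arbitrary function calls can’t appear in them. A small illustrative sketch, with invented names:

```elixir
defmodule Sizer do
  # Only a restricted set of functions is allowed in guards,
  # e.g. is_binary/1, byte_size/1, is_list/1.
  def describe(s) when is_binary(s) and byte_size(s) > 10, do: :long_string
  def describe(s) when is_binary(s), do: :short_string
  def describe(l) when is_list(l), do: :list
  def describe(_), do: :other
end

Sizer.describe("hi")       # => :short_string
Sizer.describe([1, 2, 3])  # => :list
```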

Elixir of course supports more common flow control constructs like “if/else” and “case”, so pattern matching is not used for all flow control. But in many cases, especially when the flow control logic becomes complex, this simple pattern-matching feature can make code simpler both to read and to write.

The Pipe Operator

I love functional programming, but one of its more annoying aspects is “parentheses hell,” in which long chains of nested function calls are used to transform data. The result is a deeply nested, difficult-to-parse expression of functions, like:

f(g(h(i(j(k(l(m(n))))))))

Is that right? I don’t even know — I got lost in those parentheses and didn’t even bother to check if they’re balanced. Even though the compiler would kindly tell me if I got it wrong, there’s no denying the ugliness and difficulty both in writing and reading code like that.

Sadly, through the years, many would-be virtuosos have been driven away from the field of software development by the wounds inflicted by parentheses like these in introductory computer science classes.

The worst part about expressions like these is that the starting point of the expression is the innermost value. That is to say, in the above example, the expression “n” is the starting point for the data, and one must then work backwards to see the chain of transformations applied. And this is just a simple example with each function taking a single argument — add in some other arguments and variable names and the whole thing becomes completely unreadable.

Though this pattern is common in functional programming languages, it is not exclusive to them — this type of ugly expression can rear its head in just about any language.

Fortunately, Elixir presents a solution to parentheses hell: the pipe operator. The pipe operator lets us invert this expression and build a pipeline for the data, like a UNIX pipe:

n |> m |> l |> k |> j |> i |> h |> g |> f

Behind the scenes, all the pipe operator does is take the result of the expression on its left and pass it as the first argument to the function on its right:

x |> f() |> g() == g(f(x))
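When a function takes more than one argument, the piped value fills only the first slot, and the remaining arguments are written as usual:

```elixir
# String.replace(subject, pattern, replacement) — the subject is piped in
"hello world"
|> String.replace("world", "there")
|> String.upcase()
# => "HELLO THERE"
```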

For a real world example, let’s say — just for illustrative purposes — we’re trying to do our own homemade parsing of a CSV configuration file (which we should totally never do, by the way, but it’s a good example).

To load the file, split the string into lines and then fields, and then process it, we might end up with an expression like this:

Enum.map(
  Enum.map(
    String.split(File.read!(SomeModule.get_path), "\n"),     
      &String.split(&1, ",")
  ), &process_line/1
)

Not the end of the world, but not pretty by any stretch of the imagination. It requires a bit of mental overhead just to find the innermost expression that is the starting point of our computation (“SomeModule.get_path” in this case).

Using the pipe operator, we can rewrite this expression as:

SomeModule.get_path
 |> File.read!
 |> String.split("\n")
 |> Enum.map(&String.split(&1, ","))
 |> Enum.map(&process_line/1)

The updated version reads cleanly top-to-bottom and is easy to understand. And no nested parentheses!

In addition to simply eliminating parentheses hell, the pipe operator also lets us more easily make modifications to the pipeline. One particularly useful example of this is in debugging. We can easily put a call to “IO.inspect” in the middle of the pipeline in those delicate situations in which we have no idea what the hell is going on:

SomeModule.load_data
  |> clean_data
  |> parse_fields
  |> complicated_and_error_prone_machine_learning_model_number1
  |> IO.inspect
  |> complicated_and_error_prone_machine_learning_model_number2
  |> save_processed_data

We can put whatever we want into the middle of a pipeline — even an anonymous function for a more nuanced debugging experience. It might be considered bad form to do any real computation this way, but it works quite well for debugging and logging purposes:

SomeModule.load_data
  |> clean_data
  |> parse_fields
  |> complicated_and_error_prone_machine_learning_model_number1
  |> (fn d -> if is_bad_data(d), do: IO.inspect(d), else: d end).()
  |> complicated_and_error_prone_machine_learning_model_number2
  |> save_processed_data

Maybe don’t do that in shipping code, but: cool!

Natural service boundaries via the OTP

One of Elixir’s biggest assets is that it runs on the Erlang VM and uses the Open Telecom Platform (OTP), a runtime library and toolset which manages processes in the Erlang VM. Not to be confused with OS-level processes, Erlang VM processes are extremely lightweight, allowing for many thousands (or hundreds of thousands!) of processes to be run in a single application. Any non-trivial application written in Elixir makes extensive use of the process model to manage computation.

Processes can pass messages to one another, but they cannot operate on any shared state. Moreover, processes are fully isolated from one another in terms of error handling and fault tolerance. A failure or exception in one process does not cause the entire application to fail. In the sense that processes are fully isolated and share no state, interacting with processes within an Elixir application is a bit more like making calls to an external API than like calling a function on an object or struct in other languages.
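As a minimal sketch of that message-passing model, here is a round trip between two processes using the bare spawn/send/receive primitives (real applications typically reach for OTP abstractions like GenServer instead):

```elixir
parent = self()

# Spawn a lightweight process that waits for a single message
child =
  spawn(fn ->
    receive do
      {:ping, from} -> send(from, :pong)
    end
  end)

# Processes share no state — they communicate only via messages
send(child, {:ping, parent})

receive do
  :pong -> IO.puts("got pong from #{inspect(child)}")
end
```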

The OTP is extremely powerful in and of itself, but one of its more nuanced benefits is that it influences how you structure your code. Because there is full isolation and no shared state, the language itself encourages natural service boundaries and pushes developers towards well-isolated, composable, reusable components within an application. In a sense, Elixir/OTP applications get many of the benefits of a microservices architecture (isolation, flexibility, simplicity of components) without many of the downsides (developer experience, architectural and runtime complexity, latency, etc.). It also makes an eventual transition to actual microservices simple: an isolated process running inside of an Elixir app can easily be extracted and run in a separate application on a separate node.

To see these patterns in action, one can look at apps written with the Phoenix web framework. There are completely separate processes for things like individual web requests, websockets, database connections, connection pools, loggers and much more. Any of these individual processes can fail, even catastrophically, without affecting the application as a whole. And importantly, that guarantee is automatically made through the OTP, not through clever application design.
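To sketch how that fault tolerance is wired up (Counter here is a hypothetical GenServer, not Phoenix code): a supervisor declares its children and a restart strategy, and the runtime restarts any child that crashes:

```elixir
defmodule Counter do
  use GenServer

  # Client API
  def start_link(_opts), do: GenServer.start_link(__MODULE__, 0, name: __MODULE__)
  def increment, do: GenServer.call(__MODULE__, :increment)

  # Server callbacks
  def init(count), do: {:ok, count}
  def handle_call(:increment, _from, count), do: {:reply, count + 1, count + 1}
end

# :one_for_one — if Counter crashes, only Counter is restarted
{:ok, _sup} = Supervisor.start_link([Counter], strategy: :one_for_one)

Counter.increment()  # => 1
Counter.increment()  # => 2
```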

Bonus — One thing that is not a pleasure in Elixir: string keys vs atom keys vs keyword lists

Okay, even though this is a cheerleading post, I don’t love everything about Elixir — there are some things I still get tripped up on. One of my biggest annoyances is one that I commonly ran into in the Ruby world, that unfortunately rears its head with Elixir as well.

Elixir has a built-in map data structure, similar to a hash in Ruby. And similar to Ruby’s symbols, Elixir has atoms, which are frequently used as map keys:

%{:name => "Bob", :age => 22}

Atoms are so commonly used as keys that there’s a shorthand form of the above as well:

%{name: "Bob", age: 22}

Of course, maps can take any values as keys, so it’s also possible to use strings:

%{"name" => "Bob", "age" => 22}

The problem is that some interfaces and APIs expect string keys, some expect atom keys, and some are fine with either — and it’s not always entirely clear by convention or context which is the case. Invariably there are times when one usage pattern collides with the other.

As an example, the Ecto library for data mapping happily works with either string keys or atom keys, and atoms do seem like the better match for setting database attributes. So you might happily write the code that interfaces with Ecto using atoms. But then you find that the Phoenix web framework passes in request parameters with string keys. If you use a map of string-keyed parameters to create or update a model, it works just fine — but if you mix in code that manipulates the input expecting atom keys before it’s inserted into the database, you run into trouble and will need to reconcile the mismatch.
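The failure mode is sneaky because looking up an atom key in a string-keyed map doesn’t raise — it just returns nil:

```elixir
# Shaped like the params Phoenix hands to a controller: string keys
params = %{"name" => "Bob", "age" => 22}

params["name"]  # => "Bob"
params[:name]   # => nil — no error, the value is just silently missing
```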

To add another wrinkle, Elixir has another data structure, the keyword list, which looks like a map and is typically used to specify optional or named parameters to a function call. Under the hood, a keyword list is really a list of two-element tuples (which is confusing in its own right):

[{:x, 1}, {:y, 2}] == [x: 1, y: 2]
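In practice, keyword lists mostly show up as trailing options, and when a keyword list is the last argument to a function, its brackets can be dropped entirely:

```elixir
# These two calls are equivalent — trailing keyword-list brackets are optional
String.split("a,b,,c", ",", [trim: true])
String.split("a,b,,c", ",", trim: true)
# => ["a", "b", "c"]
```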

The solution to all of this is, of course, to know what data the various APIs expect and to be explicit in how it’s used, but it does occasionally trip me up. Having multiple data structures capable of representing similar kinds of data is not a unique problem, but this particular domain of overlapping functionality, combined with Elixir’s dynamic typing, sometimes makes for unexpected surprises.

A Pleasure

In spite of my occasional misuse of strings & atoms, I still find Elixir a pleasure to use. And these are just some of the simple things I like — there are lots of more profound reasons to love Elixir as well, like the Phoenix web framework, the Elixir concurrency model and the rock solid foundation of the Erlang VM. But sometimes it’s the simple ergonomics of a language or technology that make the biggest differences for the developer experience, and Elixir definitely delivers.