Erlang/OTP 20.3 is released


Erlang/OTP 20.3 is the third service release for the OTP 20 major release.
The service release contains mostly bug fixes and characteristics
improvements, but also some new features.
Some highlights for 20.3:
  • Application(s): ssl - Added new API functions to facilitate cipher suite handling.
  • Application(s): erts, observer - More crash dump info, such as process binary virtual heap stats, full info for the process causing out-of-memory during GC, more port-related info, and dirty scheduler information.
  • Application(s): inets - Added support for unix domain sockets in the http client.
You can find the README and the full listing of changes for this
service release on erlang.org.
The source distribution and binary distributions for Windows can be
downloaded from the official download page on the same site.
Note: To unpack the TAR archive you need a GNU TAR compatible program.
For installation instructions, please consult the README file that is
part of the distribution.
The Erlang/OTP source can also be found on GitHub in the official
repository, with tag OTP-20.3.
The online documentation can be found at erlang.org/doc.
You can also download the complete HTML documentation or the Unix
manual files.
Please report any new issues via Erlang/OTP's public issue tracker.
We want to thank all of those who sent us patches, suggestions and bug reports.
Thank you!
The Erlang/OTP Team at Ericsson


Querying an Embedded Map in PostgreSQL with Ecto

PostgreSQL has great support for objects stored as JSON. This is useful for those moments when you need to store data that could be variably structured, such as responses from other services’ APIs, or data that frequently travels together within your relational tables.

A common trade-off for mixing scalar column data types (like varchar or integer) with column data types that handle more-complicated objects (like JSON) is that ORMs or data mappers sometimes can’t introspect on them for you, which means it becomes much harder to query that data.

Using Ecto's embedded_schema helps introspect on those known values, but it doesn't really assist you with querying those fields in SQL. This is where I became extremely grateful for Ecto's escape hatch: fragment().

Define the Struct or Map in Ecto

Let’s dive into some code as an example:

I have a Vehicle.Photo schema that has several versions of the photo:

  • craigslist_ad
  • facebook_ad
  • facebook_carousel_ad
  • extra_large
  • extra_small
  • large
  • medium
  • original
  • small

We decided to store the versions’ URLs inside a map in the database, because we’re going to use a set of the URLs at the same time inside of an HTML <img srcset />. You can read more about srcset from MDN and how it helps with responsive images.

The Ecto migration looks like this:

def up do
  alter table(:vehicle_photos) do
    add :standard_urls, :map
    add :facebook_urls, :map
    add :craigslist_urls, :map
  end
end

The Ecto schema looks like this:

schema "vehicle_photos" do
  field(:file, PhotoUploader.Type)

  embeds_one :standard_urls, StandardUrls, on_replace: :update do
    field(:extra_large, :string)
    field(:extra_small, :string)
    field(:large, :string)
    field(:medium, :string)
    field(:original, :string)
    field(:small, :string)
  end

  embeds_one :facebook_urls, FacebookUrls, on_replace: :update do
    field(:hero_ad, :string)
    field(:carousel_ad, :string)
  end

  embeds_one :craigslist_urls, CraigslistUrls, on_replace: :update do
    field(:ad, :string)
  end
end

Since this is a known structure, Ecto can introspect on the JSON values and cast and dump them to the appropriate Elixir data types, which is immensely helpful. Here I am achieving that by using embeds_one and specifying the struct. Once pulled from the database, Ecto will decode them.
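On the write side, the embedded struct flows through a changeset like any other field. Here is a minimal sketch of what that might look like, assuming a changeset/2 function on the schema module (the function and the validation details are mine, not from the original code):

def changeset(photo, attrs) do
  photo
  |> Ecto.Changeset.cast(attrs, [])
  |> Ecto.Changeset.cast_embed(:standard_urls, with: &standard_urls_changeset/2)
end

defp standard_urls_changeset(urls, attrs) do
  # Only the known keys are cast; on insert/update Ecto dumps the struct into the :map column as JSON.
  Ecto.Changeset.cast(urls, attrs, [:extra_large, :extra_small, :large, :medium, :original, :small])
end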

Other times, you may not be able to do this ahead of time, so the schema might look like this (the api_response field):

schema "vehicle_photos" do
  field(:file, PhotoUploader.Type)
  field(:api_response, :map)
end

Query the JSON

Continuing with the struct example schema, we found out that some of our URLs weren’t being populated like we expected, so I had to find those photos and fix them. How do I query for them since they’re stored in PostgreSQL as JSON? We need to drop down into raw SQL:

def where_photo_urls_have_a_null(query) do
  query
  |> where([_q], fragment(
    "(facebook_urls IS NULL) OR
     (facebook_urls->>'ad_version' IS NULL) OR
     (facebook_urls->>'hero_version' IS NULL) OR
     (craigslist_urls->>'ad' IS NULL)"))
end

The SQL operator ->> leverages PostgreSQL's JSON functions to retrieve the value stored at a key as text. You access it with this syntax: column->>key. In my case, I needed to find rows where the column itself was null, or where the column was present but one of its JSON keys was null. This works regardless of whether you use an embedded struct or a map, because PostgreSQL sees them as the same thing: JSON.
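The same operator works outside of where clauses, anywhere a fragment does. As a hedged sketch (schema, field and repo names assumed from the examples above), here is how you might select a single URL out of the embedded JSON:

import Ecto.Query

# Pull just the "small" URL straight out of the JSON column, as text.
Vehicle.Photo
|> select([p], %{id: p.id, small_url: fragment("standard_urls->>'small'")})
|> Repo.all()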

Here’s an example that checks for substrings:

def where_photo_url_wrong(query) do
  pattern = "%example-cdn%"  # placeholder for the substring the URLs should contain
  query
  |> where([_q], fragment(
    "(facebook_urls->>'hero_ad' NOT ILIKE ?) OR
     (facebook_urls->>'carousel_ad' NOT ILIKE ?) OR
     (craigslist_urls->>'ad' NOT ILIKE ?)",
    ^pattern, ^pattern, ^pattern))
end

Make the Query Composable

Above is all I needed for my use case, but I wondered how I could continue querying those fields in a reusable way. For example, how do I chain these together in an OR statement that uses both of these fragments?

To do that, I’ll need to extract the fragment expressions and put them into a macro so they can be used within Ecto’s functions.

defmodule MyProject.SampleQuery.Fragments do
  import Ecto.Query.API, only: [fragment: 1]

  defmacro photo_urls_have_a_null do
    quote do
      fragment(
        "(facebook_urls IS NULL) OR
         (facebook_urls->>'ad_version' IS NULL) OR
         (facebook_urls->>'hero_version' IS NULL) OR
         (craigslist_urls->>'ad' IS NULL)"
      )
    end
  end

  defmacro photo_urls_not_contain([hero_ad_value, carousel_ad_value, ad_value]) do
    quote do
      fragment(
        "(facebook_urls->>'hero_ad' NOT ILIKE ?) OR
         (facebook_urls->>'carousel_ad' NOT ILIKE ?) OR
         (craigslist_urls->>'ad' NOT ILIKE ?)",
        ^unquote(hero_ad_value),
        ^unquote(carousel_ad_value),
        ^unquote(ad_value)
      )
    end
  end
end

Now that those fragments are extracted, let’s use them:

defmodule MyProject.SampleQuery do
  import Ecto.Query
  import MyProject.SampleQuery.Fragments

  alias MyProject.{Photo, Repo}

  def find_bad_photos(query \\ Photo) do
    # The ILIKE patterns are placeholders for the substrings the URLs should contain.
    query
    |> where([_p], photo_urls_have_a_null())
    |> or_where([_p], photo_urls_not_contain(["%hero%", "%carousel%", "%craigslist%"]))
    |> Repo.all()
  end
end


If you’d like to check out the code a little more, you can see this sample Ecto and Phoenix repo with tests.

This article only explains how to query a JSON object in the database and how it works with Ecto querying. If you need to store an array of maps or structs, check out Jon's post Why Ecto's Way of Storing Embedded Lists of Maps Makes Querying Hard.


The Missing Testing Tip

Maybe the most important one

I recently wrote an article about Speeding up your Erlang Tests for AdRoll. I included 4 useful tips for everyone writing Common Test suites, but as Fred and Juan quickly pointed out, I missed the main one…

Fred pointed it out on the Erlang Slack.
Juan, in our private Slack: "You didn't include the tip about not using timer:sleep/1. I guess it's too evident…"

I agree with them 100%, so let me try and fix that situation.

Don’t let your tests sleep when they should wait

The Terminal (2004)

The Symptom

So, your tests are taking a long time to run. You go in and try to figure out why, and then you find many timer:sleep/1 calls spread around different test cases and suites (true story: I've seen timer:sleep(10000) more than once, and I even wrote some of them 🙈). Of course that's why the tests are slow: they're just stalling there for seconds at a time… just sleeping.

Why do we do that?

There might be multiple reasons leading us to introduce timer:sleep/1 calls in our test cases, but the general structure of those tests is something like this:

test(_Config) ->
    trigger_the_background_task(), %% illustrative: returns immediately, the real work happens in the background
    timer:sleep(6000),
    ok = verify_the_expected_result().

We evaluate some function that will return immediately while also triggering some background task. Then we wait enough time for that background task to complete and check that it, in fact, was completed correctly.

We can't just not wait (i.e. remove the call to timer:sleep/1 entirely) because, given the concurrent nature of Erlang, it's more than likely that the expected side effect will not have happened by the time the test evaluates the next line.

But timer:sleep/1 is not the best way to write this kind of test…

What’s wrong with timer:sleep/1?

To use timer:sleep/1 you need to choose a number (i.e. how many milliseconds you want the process to sleep for). Choosing the right value for that parameter is generally hard, if not downright impossible.

Let’s assume that you know that your background task will never last more than 5 seconds (there is a hard timeout somewhere within it).

Now you need to decide how long your test should sleep for. If you choose a number below 5000, your test may report an error while your system is actually working as expected. Even if you choose to wait exactly 5000ms, you can't be sure that a failing test means the system is broken; it might be a scheduling-related delay, a message that got queued for a bit, etc.

So, you choose a number larger than 5000. But… how much larger? 5100? 5200?… Let's choose 6000, just to be sure. And now each test run is 6 seconds longer just because of that, when, in reality, 5000 was just the upper bound. Most likely, your background task (particularly in test mode) takes only 100ms or so.

In other words, using timer:sleep/1 forces you to make a trade-off between unpredictable test results and wasting lots of time.

But there is a better way…

What should we do instead?

The basic idea for the solution proposed by Fred above and also implemented in ktn_task:wait_for/2,4 is to periodically check to see if we get the expected result and only fail if we don’t get it after a long time.

The implementation of ktn_task on GitHub is a bit more complex since it's more generic, but a simplified version looks like this:

wait_for(_Task, ExpectedResult, _SleepTime, 0) ->
    {error, {timeout, ExpectedResult}};
wait_for(Task, ExpectedResult, SleepTime, Retries) ->
    case Task() of
        ExpectedResult -> ok;
        _SomethingElse ->
            timer:sleep(SleepTime),
            wait_for(Task, ExpectedResult, SleepTime, Retries - 1)
    end.

You would use it like this:

test(_Config) ->
ktn_task:wait_for(fun verify_the_expected_result/0, ok, 100, 60).

The key is in the last 2 parameters. Instead of simply waiting 6000ms and then checking once, you wait at most 60 rounds of 100ms and check each time. If, as expected, the correct result is obtained before the 6th second, you just move on with your test… like a boss 😎

Lesson Learned. What now?

My plan is to add this as a guideline, and then get Elvis to validate it for us. Those PRs are in public repos, so if you feel like +1'ing the guideline or submitting a PR to the elvis repo for this, I'd highly appreciate it 🙏.

Conferences, Talks and other Off-Topics

Many things are about to happen soon and, if you are like me, you’ll not want to miss any of them…

#OpenErlang / CodeBEAM SF 2018

Next week, thanks to AdRoll, I’m gonna give a talk at my favorite yearly conference, now called CodeBEAM (personal note: I ❤️ this new name). Meet me there to talk about Opaque Types, SpawnFest, Erlang Battleground and anything else!!

The same week, we will celebrate Erlang’s 20th open source birthday at #OpenErlang. We have a lot to be thankful for, so let’s meet there and have a party!

And we have yet another option to meet: Erlang Elixir Meetup SF at TigerText offices on Tuesday.

So… busy week, next week. Probably one of the best weeks of the year :)

BeamBA Meetup — April 6th

Back in BA, we're organizing the first gathering of 2018 for our local Meetup group, BeamBA. Join us on April 6th at the LambdaClass offices to listen to Facundo Olano talk about Riak Core and Pablo Brud talk about GenStage. You can still submit a talk to the CFP if you have something to share with the community.

BuzzConf — April 27th

Finally, I want to invite you to an amazing conference that unbalancedparentheses is organizing.

It’s called BuzzConf and TL;DR:

If you live near Buenos Aires, you must be there!

Check the website, the speakers, the ticket prices. It will definitely be awesome!

The Missing Testing Tip was originally published in Erlang Battleground on Medium.


State-Machine Hate

“Quick, how many FSMs are there in your code?”
Zero, right?
(Yeah, I didn’t need to say “Quick” — you already knew the answer)
A better question might have been “How many places in your code could benefit from using an FSM?”.
This one is a lot more debatable, with pro-FSM and anti-FSM camps getting into the usual thing (if you're hankering for entertainment, grab some popcorn, and then ask a group of erlang devs about gen_fsm).
Rephrasing the question yet again into “How come there are no FSMs in your code?”, you usually get responses like “I didn’t need them when I started”, and “They’re too complicated”.
And now we get to the heart of the matter. Let's parse these two responses:
"I didn't need them when I started": We're all trained to incrementally add behavior to our code, and, when you're starting out, it just doesn't seem like the system is worth the time, energy, and effort associated with using an FSM.
"They're too complicated": Remember FSMs in your CS class? Worse, in your Control Systems class, for you EE types? They were chock full of automata, Moore/Mealy machines, acceptors, math, and whatnot. And yes, it was f**king complicated.
We have mostly been trained into perceiving them as too complicated, which ensures that when we are starting out with a new project, we do not use an FSM. After all, we want to start simple, right? And FSMs are complicated, right? And by the time the project warrants one, well, it’s way huge, and who wants to refactor all that stuff?
The thing is, they are not complicated. All that stuff in CS class? You barely use most of it; you're basically looking at states and transitions (and, given that most languages have some level of first-class tooling supporting FSMs, it is really easy these days (•)). On top of this, you reap huge benefits in your code, in terms of clarity, auditability, logging, upgrading, isolation, and a whole bunch more.
And yes, you did need them when you started; you just didn't realize it. Or maybe you did, but had a little bit of "who wants to think through all the states of the system up front?" going on, which is far more likely. After all, thinking through stuff before coding? Ugh.
So yeah, learn to love your FSMs…
(•) Check out Fred's excellent chapter on FSMs for how erlang does it.


World, meet Code Sync Conferences

I attended my first Erlang User Conference in 1995. It was my first conference ever. I was an intern at the Computer Science Lab, working on my Master's thesis with Joe Armstrong. The conference was opened by Erlang Systems' manager Roy Bengtson, my future boss. In his opening talk, he announced two new libraries, the Erlang Term Storage and the Generic Server Module, as well as the tools which were eventually merged to give us the Observer. When attendees complained about the lack of documentation for these tools, Klacke at the CS Lab suggested they write it themselves.

The two-day conference had doubled in numbers from its first edition the previous year, with presentations from the Computer Science Laboratory, Erlang Systems, Ericsson and universities around the world. It was the beginning of something you do not get to experience often.

Opening slide from the proceedings of the Second Erlang User Conference 1995

The journey to launching Code Sync

By 2009, the conference had outgrown the Ericsson conference center in Älvsjö, the conference chair Bjarne Däcker (the former head of the Computer Science Lab) had retired and the OTP team did not have the infrastructure and flexibility needed to expand the event. We at Erlang Solutions had gained experience in events by running the Erlang eXchange in 2008 followed by the first Erlang Factory in Palo Alto in early 2009. Ericsson asked us to help, so we took over the logistics and worked with them to put together the program.

Can you spot me at Erlang User Conference 2006?

From these humble beginnings, a conference focused on Erlang expanded to include OTP: use cases of trade-offs in distributed systems; talks on cloud infrastructure, orchestration and microservices before the terms were invented; and attempts to make Erlang OO (not the way Alan Kay intended it) that were described and forgotten. The discussions in the hallway track were on the unsuitability of C++ for certain types of problems and around an emerging language called Java.

Fast forward to 2017, and the focus has moved from Java to the JVM and its ecosystem: Scala, Akka, Groovy, Grails, Clojure and Spring. The same happened with .NET, giving it an ecosystem for C#, F# and Visual Basic to thrive. Erlang's natural progression was no different. As time progressed, the BEAM came along, and new languages were created to run on it. Reia, by Tony Arcieri, was the first (who ever said that a Ruby-flavoured Erlang was a bad idea?), and Efene, a C-flavoured language by Mariano Guerra first presented at the Erlang eXchange in 2008, is still used in production today!

The conferences evolved from language conferences into conferences on the Erlang Ecosystem, where the BEAM and OTP were used to build scalable and resilient systems; conferences where communities were exchanging experiences, inspiring and learning from one another. And as we started looking outside of the Erlang ecosystem, our events expanded to include talks on functional programming, concurrency, multi-core and distributed systems.

As a result, the Erlang User Conference, Erlang Factory, and Code Mesh have grown into a roster of global Erlang, Elixir and alternative tech conferences which have gone from strength to strength. Who can forget Mike, Joe and Robert bickering together on stage, Martin Odersky joking about how Scala influenced Erlang, Simon Peyton Jones talking about Erlang and Haskell as two childhood friends who grew up together, or Joe Armstrong interviewing Alan Kay! As of today, we organise five tentpole conferences every year, as well as numerous satellite conferences and a thriving partnership with ElixirConf and Lambda Days.

Joe Armstrong and Alan Kay in conversation at Code Mesh 2016

Last month we took Erlang Factory Lite to the Indian Subcontinent for the first time! This was on the back of a successful event in Buenos Aires this March and a sold out Factory Lite in Rome. This happened alongside some of the best conferences we’ve ever put on, from Erlang and Elixir Factories in San Francisco, Erlang User Conferences in Stockholm, Code Mesh in London to co-organising ElixirConf EU in Barcelona.

Introducing Code Sync

On the eve of 2018, the tenth anniversary of our first event, we’re ready for the next phase. I’m excited to announce that all of our conferences are joining the newly launched family of global conferences called Code Sync. Each conference will retain its own personality and stay true to one vision of creating the space for developers and innovators to come together as a community to share their ideas & experiences, learn from one another and invent the future. New name and brand, new colleagues and speakers joining our existing roster of contributors, speakers, and attendees.

Scheduled for next year, we have:

Code BEAM - Discovering the Future of the Erlang Ecosystem

Previously Erlang and Elixir Factory
Code BEAM SF, San Francisco - 15 - 16 March 2018
Code BEAM STO, Stockholm - 31 May - 1 June 2018

Code BEAM Lites - Satellite conferences of Code BEAM

Previously Erlang and Elixir Factory Lite
Various dates & locations
Milan - 6 April 2018
Berlin - 12 October 2018

Code Elixir - Connecting the Elixir Community

Previously ElixirLDN
London - 16 August 2018

Code Mesh - Exploring Alternative Tech

Name unchanged
London - 8-9 November 2018

We are in the early stages of planning Code BEAM Lite events in New York, Budapest, Bangalore, and Bogota. If interested, join our mailing list and stay tuned.

The creation of the Code Sync family of tech conferences is part of the commitment we have made to open our conferences to a wider audience and to spread the culture to Learn, Share & Inspire globally. The Code Sync team has grown from a single person, to a group of five full time employees and an ever growing number of local partners, programme committee members and volunteers. All this wouldn’t have happened without your continuous support - so we hope you will join our Code Sync conferences and become a member of one global community!

The very first Code Sync conference is Code BEAM SF taking place in San Francisco on 15-16 March. Call For Talks and Very Early Bird tickets are already open so we hope to see you there!

- Francesco


Elixir Deployment Tools Update - February 2018

In the time since I was hired by DockYard to start working on deployment tools, I have been pretty quiet, and I can finally remedy that today by giving you a look at what I have planned for this year, and what you can expect from your tools in the future. This plan is based in part on my own experience with Distillery, discussions with the Elixir core team, as well as feedback from the community in general. I’m really excited about where things are headed, and by the time you’ve finished reading this update, it is my hope that you will be too!

A Quick Recap

To better understand where we are headed, it is useful to know where we have been. When it comes to deployment of Elixir projects, there has always been some degree of confusion, because there were really two different ways you could choose to deploy your application: either as an OTP release, or in the form of a Mix project, source code and all. I don’t believe the latter is an acceptable option; not only is it not reproducible, but it opens up your production hosts to additional attack vectors, and worse, it throws out a huge part of OTP’s design. We want deployments to be reproducible, require the minimum amount of resources and dependencies, and take full advantage of the features provided to us by Erlang/OTP.

Releases are a fundamental piece in the design of OTP, and if you ever take the time to read through the Erlang source code, you will see just how pervasive it is. In short, they are the means by which you are intended to package together OTP applications which are then run as a single unit. Releases both build on, and extend, the low-level tooling by which the runtime boots and manages OTP applications. For example, there is a script which defines how to load and boot applications at startup, used both within and without releases; but only the release handler provides the script which describes how to upgrade from one version of an application to another, and vice versa. While you can manually reload modules without the release handler, that is a blunt instrument in comparison to the carefully structured appup process, which not only loads the new code, but also ensures that running processes are transitioned into the new code in an organized fashion. These low-level tools work in concert with one another, so that the release handler can even change the version of the runtime being used in the middle of the upgrade/downgrade process. It is important to understand how fundamental releases are, because to dismiss them is to dismiss all of the experience and effort that went into the design of OTP, which has stood the test of time.

While deploying OTP releases is generally preferred by the community, it isn't always an easy choice for those who are new to the language: they are told to use Distillery to build a release when they want to deploy, encounter some issues, struggle to understand what went wrong and how to fix it, and end up asking themselves why they shouldn't just deploy source code and use mix run --no-halt or mix phx.server - particularly when coming from scripting languages where deploying source code is the norm. That's an absolutely valid reaction in that situation: after all, we all want our tools to just work. When the only tool to build a release is some third-party library written by someone they don't know, and on top of that one that doesn't integrate seamlessly with some of the features of the language, well, it's no surprise that people choose the easiest path.

The core team realizes that this is untenable - we can't afford to have this kind of fragmentation and pain, or it will spread and infect other aspects of the language. In fact, this has already happened to some degree: you only have to look at how many variations there are on configuring libraries based on environment variables to see the impact this has had. Let's take a look at what I see as the major issues facing us, and at my plan for solving them.

Development vs Production Tooling

In my opinion, the biggest issue with consolidating the community around OTP releases has been with the discrepancy between how things are done in development, and how things are done when you are ready to go to production. This is due in part to the fact that the two main development tasks we rely on, iex -S mix and mix run (or mix phx.server for Phoenix applications), are not founded on OTP releases. Instead, Mix starts the runtime with only :kernel, :stdlib, and itself started, and then dynamically loads and starts applications based on the project definition found in mix.exs. It chooses to do this because it unified the Mix task infrastructure, and provided the means to handle configuration via config.exs, since Mix could inject configuration into the application environment before starting other applications. At the time, I suspect that José wasn’t concerned with whether releases were supported or not, and was more concerned with making sure the tooling was powerful and intuitive, and it wasn’t until later that the friction became apparent.


Mix tasks don’t work within releases for a simple reason: Mix is designed to work in the context of the typical Mix project directory structure, as well as the required Mix project definition in mix.exs. Some aspects of Mix’s API, such as Mix.env, are also meaningless in the context of a release. Instead, Distillery provides a facility for executing custom commands, for example, a bin/myapp migrate command might be a thin shell script like so:

#!/usr/bin/env bash

$RELEASE_ROOT_DIR/bin/myapp command Elixir.MyApp.ReleaseTasks migrate

This results in MyApp.ReleaseTasks.migrate/0 being invoked in a context where all of the code is loaded, but none of the applications are started, effectively equivalent to a Mix task. The problem is that you can’t reuse, for example, Ecto’s migration task. Instead, you have to use Ecto’s migrator API to implement similar functionality. You have to know which applications to start, how to discover the migrations, and a variety of other small differences, which Ecto would have already solved for you if you could use the Mix task. You also have to read up on Distillery’s shell context and API so you are aware of things like the RELEASE_ROOT_DIR environment variable.
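For reference, such a custom command usually ends up delegating to a module along these lines. This is only a hedged sketch (the application, repo and module names are assumed), not Distillery's API or the code any particular project ships:

defmodule MyApp.ReleaseTasks do
  # Applications that must be running before Ecto's migrator can work inside a release.
  @start_apps [:crypto, :ssl, :postgrex, :ecto]

  def migrate do
    Enum.each(@start_apps, &Application.ensure_all_started/1)
    {:ok, _} = MyApp.Repo.start_link(pool_size: 1)

    migrations = Application.app_dir(:my_app, "priv/repo/migrations")
    Ecto.Migrator.run(MyApp.Repo, migrations, :up, all: true)

    # Shut the node down once migrations have run; the command started it just for this.
    :init.stop()
  end
end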

Other than the question of migrations, this hasn’t been a huge issue in practice, but not being able to use Mix in production has certainly presented friction, particularly if you want to expose a variety of custom commands for interacting with the running application both in dev and prod. You ultimately end up hacking something together which wraps a Mix task around the module actually implementing the task, so that you can use the task in dev, and invoke the module via custom command in prod - not ideal.


Erlang projects traditionally rely on the system configuration file, sys.config, only for configuring the runtime, and more rarely for configuring applications. This config file is static: you can't call functions in it, and you can't fetch the value of an environment variable; that type of dynamic configuration was expected to be performed in application code.

Elixir came on the scene, and introduced its own configuration file, config.exs, which unlike sys.config is dynamic - after all, it is effectively an Elixir script, you are limited only by what code you are willing to write. Because Mix was taking care of starting your application and its dependencies, it could evaluate the config file before doing so, and ensure everything was pushed into the application environment (which is what you access via Application.get_env/3 and Application.put_env/3). The fundamental issue here is that this didn’t take into account how Elixir applications would work in OTP releases; there is no Mix project, and Mix is no longer in charge of choosing how and when to boot your application, instead that is the job of :init and the boot script which instructs the runtime how to load and start applications and in what order.

When I wrote the initial tooling for releases in Elixir, I saw little choice but to translate the Mix config file to sys.config, by reading it at build time and writing out the resulting data structure in Erlang term format. This decision had several implications: it implicitly changes the semantics of config.exs from a runtime config file to a build-time config file; it requires that usages of System.get_env/1 be changed to use some other mechanism, of which there are numerous fragmented variants; it prevents you from fetching something from the environment and transforming it prior to setting the config; and worst of all, it creates a rift between development and production that is only surfaced when it is time to deploy to production, which is confusing to many, and for good reason.
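To make the rift concrete, consider a config file like the following (a hypothetical app and repo). In development, Mix evaluates it at boot, so the environment variable is read on the machine running the code; once translated to sys.config at release build time, the value is frozen to whatever DATABASE_URL happened to be set on the build machine:

use Mix.Config

config :my_app, MyApp.Repo,
  url: System.get_env("DATABASE_URL")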


Speaking of confusion, another area in which I see room for significant improvement is that of errors which occur early in the boot process. When the Erlang VM starts, it does a number of things before Elixir itself is even loaded, and even once it is, Elixir isn’t in charge of the boot process, the runtime is. To give you an idea of what I mean, let’s take a very quick look at how the runtime boots, regardless of whether it’s a release or a Mix project:

  1. The ERTS (Erlang Runtime System) emulator (written in C) gets to a point where it calls the very first piece of Erlang code in the system, this code is found in the :otp_ring0 module, and its start/2 function. :otp_ring0 and a handful of other modules are preloaded into the emulator, which is how the system bootstraps itself.
  2. :otp_ring0.start/2 simply calls :init.boot/1 with the command-line arguments passed to the runtime.
  3. :init loads some core NIFs, such as zlib, erl_tracer, and others; parses arguments and sets boot options; and then begins to evaluate the instructions found in the boot script. By default this boot script is found in the Erlang distribution (as start.boot or start_clean.boot), but releases provide their own, with instructions for the applications they contain. This file is simply a binary containing an Erlang term, which is a list of tuples containing instructions like load, apply and others (see the sketch after this list). :init remains running during the entire life of the VM, until a shutdown is initiated (e.g. via :init.stop/0) or a crashing application requires that the node itself crash. This module is responsible for handling errors which are unrecoverable. Such errors result in a crash dump being written, and then termination of the process.
  4. Early in the boot script, there are instructions to load and start the :application_controller module, which is then sent messages by :init to load and start applications in the order they are specified in the boot script. This module handles any errors with loading or starting applications, as well as errors when applications crash. It will generally print information about those errors, and then crash itself, which will bubble up to :init.
  5. Once all applications are started (or more specifically, all boot instructions have been processed), the boot process is complete
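
If you are curious what those boot instructions actually look like, a .boot file is just an Erlang term written to disk, so it is easy to inspect from IEx. A small exploratory sketch (it reads the default start_clean.boot from the local Erlang installation):

boot_file = Path.join([to_string(:code.root_dir()), "bin", "start_clean.boot"])

{:script, {_name, _vsn}, instructions} =
  boot_file |> File.read!() |> :erlang.binary_to_term()

# Each instruction is a tuple such as {:progress, _}, {:path, _}, {:primLoad, _} or {:apply, {m, f, a}}.
instructions |> Enum.take(5) |> IO.inspect()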

The reason why we sometimes get really ugly or undecipherable errors when crashes occur is that those errors are handled so early in the boot process that Elixir can't intercept them and print them in a nicer format. Furthermore, they are handled at a level where it is not guaranteed that even the Erlang standard library is fully available. Because hardly any modules can be relied on at this phase of the program, no effort is made to pretty-print errors; instead they are dumped to standard output in raw form. If you've ever seen an {"init terminating in do_boot", ...} error, that is something which failed in :init, or which produced an unrecoverable error. As an example, if an application in the release is missing, that error will be hit inside :application_controller when :init sends the message to load that application. The application controller will crash because it could not find the application it needed to load, and :init will receive the EXIT message for the application controller and determine that an unrecoverable error has occurred. This is an error which you would expect to be able to print a friendly message for, but we can't currently. There are a variety of such errors, and it has been a pain point when they occur.

A More Perfect Union

Elixir has evolved to a point where we need to close the gap on some, if not all, of these issues. Friction early in the development of a language is acceptable, but we’re now to the stage where core tooling like this really needs to be rock solid and well integrated. This means we need to ensure that Mix embraces releases fully, and provides the necessary API to write tools which work both in the context of a Mix project, or in the context of a release - but such things should mostly be transparent to the author of libraries, Mix tasks, etc. Now that you have a better understanding of the problems, let’s take a look at what we’re planning to do to fix them!

Releases in Elixir Core

To remove the awkward transition between development and production, we really need our development tooling to be built on releases under the covers. In short, if you run iex -S mix or mix run, these should effectively be the equivalents of bin/myapp console or bin/myapp foreground. If you are always running releases, then there is no transition to be made between development and production.

Just making those commands generate and run a release isn’t enough though; we want our Mix config files to work the same as they do now in development, but in production too. If we just took the current version of how releases work and made you work that way all the time, Mix config files would lose all their usefulness.

Luckily, I’ve come up with a solution for this which is beautiful in its simplicity, but also frustrating because it has been right in front of me for so long and I missed it. By leveraging instructions in the boot script, we can have Mix evaluate the config file before any further actions are taken, which gives us the best of all worlds - we no longer need to translate to sys.config, and config.exs retains its runtime semantics across development and production. The only caveat to this is that some configuration options will still need to be done via vm.args, for example any :kernel config settings, as it is loaded and started before Elixir. That is a very small price to pay for being able to fully support Mix configuration.

Supporting Mix tasks is a more complicated issue, and will likely not be solved in the first pass we make, but if Mix itself understands releases, I don’t think it’s a huge leap to get to a point where we can define Mix tasks which work in a release. Fundamentally, there will always be some subset of Mix tasks which are designed for working within a Mix project directory and alongside source code, so it is important to provide a clear delineation between those which need a Mix project, and those which don’t, so that proper errors can be produced if one tries to invoke a task in the wrong context. It is something I will continue to have conversations with the core team about as we get closer to release.

Better Errors

The question of how to better deal with errors is still an open one. I am exploring a few approaches that may work, which involve making it possible to inject custom behaviour into :init and :application_controller, or potentially taking over both roles entirely by supporting custom :init or :application_controller modules. The former is definitely more desirable, because we want a model by which we can easily extend or customize behaviour in these early phases, without having to reimplement all of the really critical work they do. The bottom line is that I think we can get to a point where errors are either more readable, or even better, presented in a format which is native to the project (e.g. Elixir projects get Elixir-formatted error data, rather than Erlang-formatted). This work will require coordination and cooperation with the OTP team to realize, but I’ve spoken with a number of people who see that there is room for improvement here, so I have high hopes that we can find a solution that will work for them as well as the community at large.

Better Documentation

A major issue for beginners has been the quality and style of documentation around releases. While the current docs do cover the vast majority of questions one may have, they are poorly organized, contain duplication, and are not broken up into smaller topics which are easier to find. I’ve come up with a new structure for the docs, which will likely be carried forward into the Elixir guides once releases are in the core tooling, and which I’ve already started working on. It is my hope that by the end of March, the docs will be completely reworked from the ground up in a way that makes them far friendlier to beginners and veterans alike.

Looking Forward

The initial version of release tooling in Mix will be relatively simple in comparison to Distillery; it will support console and foreground modes, but not daemon mode, it will not support hot upgrades/downgrades, and will likely have limited extension points (i.e. plugins/hooks). That said, we plan to bring these things back in some form or another as quickly as possible.

The question of how to deal with cross-compilation, for example building on your Macbook and deploying to a Digital Ocean VM running Debian, is very much still open. My intent is to provide much more detailed docs on how to get started with properly preparing for deployments like this, and rely on those as the primary solution for the near future; but I am always looking for ways by which we can make that process smoother. Nerves projects have the benefit of their toolchain, but I’m not yet convinced that it carries over well to non-embedded projects. I am keeping my mind open about this though, and always welcome feedback on the topic!

What About Distillery?

Distillery will continue to be the primary release tooling for the near future. I am planning to port some of these changes to Distillery itself, namely the Mix configuration support. That said, once releases are part of Mix, Distillery will be rewritten to build on top of that tooling and extend it where needed. If it turns out that it is not needed, then all the better! Ideally, once we get releases in Mix itself, then Distillery can be deprecated, and we can all focus our attention on a single path forward.


If you didn't care about all of that stuff and just want the broad strokes:

  • Mix will be extended with the ability to generate OTP releases
  • The mix run and iex -S mix commands will be modified to use OTP releases
  • Mix configuration (i.e. config.exs) will be fully supported
  • Mix tasks may be supported in the future, but that is likely something for a later phase
  • You may see better errors for node crashes, it depends on some external factors
  • Expect much better and richer documentation in the next month or so

While no commitment has been made, we’re expecting that the initial release of this tooling will be made available as part of Elixir 1.7. I will continue to provide updates moving forward on a monthly basis. Feel free to reach out with any feedback you may have!

