Building Erlang on Devuan/Debian/Ubuntu

Erlang R24 is out! …but there are, as usual with a version X.0 release, a few rough edges (enough of them around the important-but-annoying WX and OpenGL updates that I’ll be writing another post about that shortly…).

In the meantime, R23.3.4.3 is excellent and quite reliable. The steps for building with kerl are nearly identical to those for R22, but it is worth re-posting them with the relevant version updates (or if you like the video version, Dr. Kumar made a few demonstrating his home Erlang + ZX build). Also, if you’re in a part of the world where the official docs are occasionally really really sloooooooowwww, don’t forget the R23 docs mirror (or my Erlang page with linky links to everything).

sudo apt update
sudo apt upgrade
sudo apt install \
    gcc curl g++ dpkg-dev build-essential automake autoconf \
    libncurses5-dev libssl-dev flex xsltproc libwxgtk3.0-dev \
    wget vim git
mkdir vcs bin
cd vcs
git clone
cd ..
ln -s ~/vcs/kerl/kerl bin/kerl
kerl update releases
kerl build
kerl install ~/.erts/
echo '. "$HOME"/.erts/' >> .bashrc
. ~/.erts/
wget -q && bash get_zx

As usual, the ~/vcs/ directory is just my convention for version-controlled code that my $HOME sync scripts know to ignore, and you might want to install Erlang to some place global on your system like /opt/erts/ or whatever. The steps above work without root privileges with the exception of the apt commands. Remember if you are on Devuan or Debian that you need to perform the sudo commands actually as root unless you configure sudo on your system, then the rest as your normal local user account.


Social virtual spaces with Elixir at Mozilla

Welcome to our series of case studies about companies using Elixir in production. See all cases we have published so far.

Hubs is Mozilla’s take on virtual social experiences. You build your own private spaces and share them with your friends, co-workers, and community. Avatars in this space can move freely in a 3D social environment and watch videos, exchange messages, and talk to other people nearby. All you need is a browser and a microphone!

Hubs is fully open source and you can host it on your infrastructure via Hubs Cloud. Community managers, educators, and event organizers have been using Hubs Cloud to run virtual events and online activities tailored to their specific brandings and needs. Running your own version of Hubs is just one click away, which perhaps makes Hubs the most deployed Phoenix application ever!

From VR to Elixir

The Hubs team started at Mozilla as the Mixed Reality team about 3.5 years ago. Their main goal was to explore ways for online social interaction via avatars and mixed reality.

They quickly focused on building their first proof of concept, where avatars could communicate, move around, and join different rooms, everything running directly in the browser. This was a significant departure from the state of the art of Virtual Reality everywhere, as the getting started experience up to this point was cumbersome and often required steep investment in the form of VR headsets.

The initial prototype was a success and it pushed the team to build a product. However, all communication in the proof of concept was peer-to-peer, which limited the features and experiences they could provide. Therefore the Hubs team knew they needed a capable backend technology to provide fan-out communication and coordinate how all different avatars interact within the virtual spaces. John Shaughnessy, Staff Software Engineer at Mozilla, comments: “When you get a lot of people in the same space, be it virtual or in the real world, there is never only a single conversation going on. Once you get ten or twenty different people in a room, conference calls don’t work well. In Hubs, people transition between multiple simultaneous conversations simply by moving around”.

With this bold vision in hand, they assessed their whole stack. They settled on using JavaScript with Three.js in the front-end and chose the Phoenix web framework for the backend. Greg Fodor, who was an Engineering Manager at Mozilla at the time, explains the choice: “We first listed all of the features we had to implement, from trivial things like REST endpoints, to more complex use cases, such as chat messaging and tracking where avatars are in the virtual world. Once I started to learn Phoenix, I saw all of those features were already there! The application we were building has to manage a large number of connections with real-time low latencies, something we knew the Erlang VM was an excellent fit for”.

In production

Hubs went live in January 2018. Almost everything in Hubs goes through Phoenix. The only exception is the WebRTC voice channels, which are handled by designated voice servers, initially implemented with Janus and later ported to MediaSoup. However, the Phoenix app still manages the voice servers and how connections are assigned to them.

The deployment is orchestrated by Habitat and running on Amazon EC2. Habitat provides packaging and orchestration. When a voice server joins the Habitat ring, the Phoenix services receive a message and start assigning voices to voice servers. Overall they run on 4 Phoenix and 4 voice servers.

The Elixir experience in production has been quite smooth. Dominick D’Aniello, Staff Software Engineer at Mozilla, points out some areas they discussed improving: “the Phoenix application works mostly as a proxy, so we avoid decoding and reencoding the data unless we really need to. But sometimes we have to peek at the payloads and JSON is not the most efficient format to do so.” They have also considered relying more on Elixir processes and the Erlang distribution. Dominick continues: “when a new client joins, it needs to ask all other clients what is their state in the world, what they own, and what they care about. One option is to use Elixir processes in a cluster to hold the state of the different entities and objects in the virtual world”.

Beyond Hubs

With many large companies investing in online communication, the Mozilla team saw the possibility of virtual spaces becoming walled-gardens inside existing social platforms. This led the Hubs team to work on Hubs Cloud, with the mission to commoditize virtual spaces by allowing anyone to run their own version of Hubs with a single click.

Hubs Cloud launched in February 2020 and it has been a hit. New York University did its graduation ceremony on a Hubs Cloud instance. The IEEE Virtual Reality Conference embraced Hubs for a more accessible and sustainable event with talks and poster sessions all happening in virtual rooms, while the Minnesota Twins baseball team launched a Virtual Hall of Fame on top of the platform.

Their cloud version uses Amazon CloudFormation to instantiate Hubs inside the user’s account. This approach brought different challenges to the Hubs team: “we want Hubs Cloud to be as affordable and straightforward as possible. The Phoenix app has already been a massive help on this front. We have also moved some features to Amazon Lambda and made them optional, such as image resizing and video conversion” - details John.

Since Hubs is also open source, developers can run their own Hubs instance in whatever platform they choose or change it however they want. That’s the path Greg Fodor recently took when he announced Jel: “Jel is the video game for work. It is a mashup of Minecraft and Discord, where everything is 3D. My goal is to spark new directions and ideas to get people excited about VR”.

Summing up

Today, the Hubs team has 10 contributors, half of whom are developers. Their engineering team is quite general and learning Elixir happens organically: “you are motivated by the feature you are working on. If it requires changing the backend, you learn Elixir with the help of the team and then make your contribution”.

Overall, the bet on Phoenix was a successful one. Greg Fodor highlights: “The most significant benefit of Phoenix is in using a stack that excels at solving a large variety of problems. Once onboarded to Phoenix, there is a huge surface area our engineers can touch. Any feature they come up with, they can run with it. And because Hubs is open source, our contributors will also have the same experience. Overall, Elixir and Phoenix reduce the effort needed to cause the largest impact possible across our whole product”.

Lately, they have leaned even further into the ecosystem, as they have started exposing Hubs APIs over GraphQL with the help of Absinthe. They have also migrated to Phoenix v1.5 and are using the Phoenix LiveDashboard to provide metrics and instrumentation to Hubs Cloud users.


You've got to upgrade Rebar3



Bad news. You have to upgrade Rebar3. Like right now. We just noticed that SSL validation had been partially disabled for years.

This post covers:

  • what's the problem
  • how to fix it
  • what happened
  • why we think it hasn't been exploited

You should definitely read the first two sections to remediate this; the rest is intended to be informative.

What's the Problem

We accidentally disabled all TLS validation when communicating with Hex in Rebar3 itself, meaning Hex packages you download may have seen only partial validation and could have been subjected to attack. While we do not think any such exploit has happened in the wild, we still treat this as urgent. Git or mercurial dependencies, and any other communications (such as rebar3 local upgrade commands) are unaffected.

All versions starting with Rebar3 3.7.0 released in November 2018 are affected. The specific versions, given OTP compatibility schedules, are:

  • 3.16.0 (Erlang/OTP 22-24)
  • 3.14.0 through 3.15.1 (Erlang/OTP 19-23)
  • 3.11.0 through 3.13.2 (Erlang/OTP 18-22)
  • 3.7.0 through 3.10.0 (Erlang/OTP 17-21)

You can call rebar3 version if you want to know which one you're running for sure. If you are a mix user (with Elixir), you are not at risk: Rebar3 is used by mix only to build code, not to fetch dependencies.

How To Fix It

Upgrade and Update.

The following versions have been tagged, to provide the quickest path to safety for any project on any supported versions in that time period:

You will want to install the newest one your project can tolerate (possibly with rebar3 local upgrade), and then call rebar3 update to bring your local registry up to date.

I'm hoping to soon be able to patch 3.13.x and 3.10.x to work, but the patch that's been tested to work for newer versions won't port cleanly to them, and the reality of OSS maintainership on a skeleton crew is that I can't currently commit the time to patch these up in a hurry (and get 8 year old copies of Erlang to build cleanly for it all).

The patch is not that complex on new versions and should be relatively easy to port to older versions if anyone ends up wanting to help.

Why We Think it Hasn't Been Exploited

Even though a build tool like Rebar3 is fundamentally about running code from random places online on your machine and can never be considered safe software, we still try to defend in depth where we can. The maintainers do the same and have taken previous reports seriously.

If you're on a moderately up-to-date Rebar3, the following things should all take place:

  • The package index is validated against a bundled signature in hex_core
  • Each package definition fetched from the index comes with 2 hashes, one external to the tarball, and one internal to it
  • Each of these hashes is saved in the lock file everyone is expected to check into their repositories
  • Each subsequent package download is validated by checking that the hashes match
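
The hash check in the list above can be sketched in a few lines of Elixir. This is only an illustration under stated assumptions, not Rebar3's or hex_core's actual code: the module name LockCheck, the function verify/2, and the sample bytes are hypothetical, and it assumes the lock file stores a hex-encoded SHA-256 digest per package.

```elixir
defmodule LockCheck do
  # Hypothetical helper: compare the SHA-256 of the downloaded bytes
  # with the hex-encoded digest recorded in a lock file.
  def verify(tarball, locked_hash) when is_binary(tarball) do
    actual = :crypto.hash(:sha256, tarball) |> Base.encode16(case: :upper)

    if actual == String.upcase(locked_hash) do
      :ok
    else
      {:error, :hash_mismatch}
    end
  end
end

# Stand-in for a package tarball and the digest locked on first fetch.
tarball = "pretend these are tarball bytes"
locked = :crypto.hash(:sha256, tarball) |> Base.encode16(case: :upper)

IO.inspect(LockCheck.verify(tarball, locked))
IO.inspect(LockCheck.verify(tarball <> "tampered", locked))
```

A mismatch on a locked version is the kind of event that would surface as a warning to whoever fetches the package next.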

These measures exist so that our layered package index mechanism can safely be used with partial mirrors. For example, someone in a corporate setting could have a "blessed" package index with only valid repositories that have been audited in there, and fetch from there as a top priority. Packages that can't be seen could go to a public mirror, and if the mirror is out of date, then they could go to the root hex package.

The locking mechanism we use and the way fetching and validating are done are such that if someone were to compromise the public mirror and try to change a package's definition, any locked package version would allow us to detect that and warn about it. In practice, people see this warning pop up if they fetched the new version of a package within the first hour of its publication when the maintainer is still allowed to mutate it briefly (for bug fixing and correcting metadata).

Additionally, Rebar3 does not fetch updated copies of the hex index unless it is specifically told to (rebar3 update), and otherwise only does partial index fetches (per-package) when it is asked to download a version of a library it hasn't seen locally before.

So for someone to successfully exploit this without anyone noticing, they likely would need to:

  1. Man-in-the-middle (MITM) attack the connection between your dev machine and Hex
  2. notice which package you're getting and live-inject a bad one that also matches the hex signatures bundled into the library
  3. MITM attack the connection of some if not all of your contributors in a similar way so they never get a warning about broken packages
  4. MITM attack the connection with your continuous integration or build servers in a similar way so there's never a warning either
  5. Keep it going for as long as you're using the bad package on new devices that might fetch it.

Now there are arguably weaker points in the supply chain: if you 'bootstrap' Rebar3 rather than using a pre-built copy, where the first fetch is done without validation (we need to download the CA Certs bundle package without one), and if this overall chain attack happens when someone happens to publish a package.

Package publishing is the riskiest one here, but it requires an interesting amount of sophistication:

  1. the attacker needs to MITM the connection between you and Hex
  2. it needs to intercept the update you're making
  3. malicious code must be injected; this can either be done live (hard) or at a later point in the hour re-using credentials they snooped
  4. You need to not be aware of or ignore the email hex sends you every time a new package you maintain is updated on the registry.

In short, since we've never seen any sign of this, and since it would be quite a convoluted attack, we're not very concerned that it's been seen in practice. If you have seen any of the above symptoms and just didn't know what to make of them, then you might be at risk.

What Happened

The issue was first introduced in Rebar3 3.7.0, a big release with work funded by the IEUG, the precursor of the EEF, which aimed to do major re-work around the tool in order to support using Elixir dependencies in Erlang. This came with an overhaul of the plugin system, how the compiler works, and also bundled changes to how we fetched packages from hex, deferring work to the hex_core library. This latter change came in with extra features around how we built indexes and fetched updates, which was made as lazy as possible (only hit the network when you must).

Erlang's SSL/TLS libraries were never safe and never checked certificates by default. Our advice as OSS contributors has always been to use hackney as an HTTP client since it handles validation of certificate chains out of the box. Unfortunately, Rebar3 cannot rely on Hackney because Rebar3 needs to exist to fetch the hackney package, and early on we wanted to use as few dependencies as possible to prevent unlucky clashes with plugins and dependencies (which incidentally was made less annoying in 3.7.0). The Erlang libraries that come out of the box just don't perform validation, have no way of getting the OS's root CA bundle, and before last week's release of OTP-24, wouldn't warn when unsafe calls would be made. Instead, we bundle the certifi package along with TLS validation functions that we pass in each of our calls.

In 3.7.0, the switch to the new hex_core library subtly replaced a call from:

request(Url, ETag) ->
    HttpOptions = [{ssl, ssl_opts(Url)},
                   {relaxed, true} | rebar_utils:get_proxy_auth()],
    case httpc:request(get,
                       {Url, [{"if-none-match", "\"" ++ ETag ++ "\""} || ETag =/= false]
                             ++ [{"User-Agent", rebar_utils:user_agent()}]},
                       HttpOptions, [{body_format, binary}], rebar) of
        % ...

to instead be:

request(Config, Name, Version, ETag) ->
    Config1 = Config#{http_etag => ETag},
    try hex_repo:get_tarball(Config1, Name, Version) of
        % ...

At this point in time, Rebar3 already supported PROXY environment variables, which were used to set the rebar profile in the built-in httpc client library. As long as that profile is used, the basic set-up around auth and redirection is in place.

This whole thing is a bit far behind, but as a reviewer on these commits I was still operating under the impression that this is where TLS validation was taking place. I was absolutely confident that we were passing the right profile and everything was good, and last week when someone asked how to get rid of the OTP-24 TLS warnings in an unrelated project, I pointed them to our SSL Options code as a way we worked around things and never got the warning.

Now that bit is really interesting because:

  1. The warning appeared only last week on OTP-24 (a very good thing)
  2. I had only seen the warnings in the bootstrap script, when we have to first generate the client to go fetch the CA bundle (which is validated against previously-signed hex index caches) when testing new rebar3 builds from scratch on OTP-24
  3. I had never seen it otherwise on all my computers on OTP-24

But here's where the impression breaks down:

  • I had only built rebar3 on my VPS using linux because of woes with OTP-24 compilation on OSX (which persist to this day)
  • My dev work on my VPS is limited to mostly Rebar3, a toy project, or other small incidental patches that mostly all happen on stable libraries
  • Rebar3 will happily avoid the network and keep fetching source packages from a local cache

As such, I thought we got none of the warnings because we had solid code in place, but I never got them because Rebar3 is good at not hitting the network when it knows packages and I was mostly working with known packages and getting the warnings in places where I expected them.

At some point yesterday I just decided to go take a look at the code to see how it was wired in, just because something felt funny (had I only seen the warning where I thought I did?):

<MononcQc> what the hell where are ssl opts configured in our stack
<MononcQc> fuck I think we might have broken ssl but I'll need to double-check
<MononcQc> shit we do. God fucking damn it

One of the core things of the validation we do is that it requires the current hostname of a TLS query. When I looked at that, it became apparent that it could not be set on the profile since it needs to be set by each call. And hex_core does not come with any of the required dependencies to do it. So for a few years we had just silently been using unvalidated TLS.
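
To make the per-call requirement concrete, here is a minimal sketch in Elixir of ssl options that depend on the hostname. It is not the Rebar3 patch: the module name TlsOpts, the function for_host/2, and the CA bundle path are hypothetical, while verify, cacertfile, server_name_indication, and customize_hostname_check are standard OTP ssl options.

```elixir
defmodule TlsOpts do
  # Hypothetical helper: build per-request ssl options for `host`.
  # `cacertfile` is an assumed path to a PEM bundle of trusted roots.
  def for_host(host, cacertfile) when is_binary(host) do
    [
      verify: :verify_peer,
      cacertfile: String.to_charlist(cacertfile),
      # The hostname is part of the options, so they cannot be fixed
      # once on a shared profile; they are rebuilt for every request.
      server_name_indication: String.to_charlist(host),
      customize_hostname_check: [
        match_fun: :public_key.pkix_verify_hostname_match_fun(:https)
      ]
    ]
  end
end

opts = TlsOpts.for_host("example.com", "/path/to/ca-bundle.pem")
IO.inspect(Keyword.take(opts, [:verify, :server_name_indication]))

# The options would then travel with each call, for example:
#   :httpc.request(:get, {~c"https://example.com/", []}, [ssl: opts], [])
```

Because the hostname changes from request to request, code that only configures a shared httpc profile never gets the chance to supply these options.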

The patch turned out to be quite simple. I made it sort of subtle and non-panicky, in order to let it be reviewed and merged without causing a panic before we had time to rebuild all sorts of packages, get these instructions ready and ship it out. Now you can use it.

Obviously, it feels like we failed people here. Sorry about that one.


Elixir v1.12 released

Elixir v1.12 is out with improvements to scripting, tighter Erlang/OTP 24 integration, stepped ranges, and dozens of new functions across the standard library. Overall this is a small release, which continues our tradition of bringing Elixir developers quality of life improvements every 6 months. Some of these improvements directly relate to the recent efforts of bringing Numerical Computing to Elixir.

Elixir v1.12 requires Erlang/OTP 22+. We also recommend running mix local.rebar after installation to upgrade to the latest Rebar versions, which include support for Erlang/OTP 24+.

Note: this announcement contains asciinema snippets. You may need to enable 3rd-party JavaScript on this site in order to see them. If JavaScript is disabled, noscript tags with the proper links will be shown.

Scripting improvements: Mix.install/2 and System.trap_signal/3

Elixir v1.12 brings new conveniences for those using Elixir for scripting (via .exs files). Elixir has been capable of managing dependencies for quite a long time, but it could only be done within Mix projects. In particular, the Elixir team is wary of global dependencies as any scripts that rely on system packages are brittle and hard to reproduce whenever your system changes.

Mix.install/2 is meant to be a sweet spot between single-file scripts and full-blown Mix projects. With Mix.install/2, you can list your dependencies at the top of your scripts. When you execute the script for the first time, Elixir will download, compile, and cache your dependencies before running your script. Future invocations of the script will simply read the compiled artifacts from the cache:

Mix.install([:jason])
IO.puts(Jason.encode!(%{hello: :world}))

Mix.install/2 also performs protocol consolidation, which gives script developers an option to execute their code in the most performant format possible. Note Mix.install/2 is currently experimental and it may change in future releases.

Furthermore, Mix.install/2 pairs nicely with Livebook, a newly announced project that brings interactive and collaborative notebook projects to Elixir. With Livebook and Mix.install/2, you can bring dependencies into your code notebooks and ensure they are fully reproducible. Watch the Livebook announcement to learn more.

Another improvement to scripting is the ability to trap exit signals via System.trap_signal/3. All you need is the signal name and a callback that will be invoked when the signal triggers. For example, ExUnit leverages this functionality to print all currently running tests when you abort the test suite via SIGQUIT (Ctrl+\). You can see this in action when running tests in the Plug project below:

<noscript><p><a href="">See the example in asciinema</a></p></noscript>

This is particularly useful when your tests get stuck and you want to know which one is the culprit.

Important: Trapping signals may have strong implications on how a system shuts down and behaves in production and therefore it is extremely discouraged for libraries to set their own traps. Instead, they should redirect users to configure them themselves. The only cases where it is acceptable for libraries to set their own traps is when using Elixir in script mode, such as in .exs files and via Mix tasks.
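
For a script, registering a trap could look like the sketch below. The choice of SIGUSR1 and the id :demo_trap are illustrative assumptions; the callback takes no arguments and returns :ok.

```elixir
# Register a handler for SIGUSR1 under an explicit id.
result =
  System.trap_signal(:sigusr1, :demo_trap, fn ->
    IO.puts("received SIGUSR1")
    :ok
  end)

case result do
  {:ok, id} ->
    IO.puts("trap registered under #{inspect(id)}")
    # Remove the trap again once it is no longer needed.
    :ok = System.untrap_signal(:sigusr1, id)

  {:error, reason} ->
    # For example :not_sup on platforms without signal support.
    IO.puts("could not register trap: #{inspect(reason)}")
end
```

Passing an explicit id makes it possible to remove exactly this trap later with System.untrap_signal/2.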

Tighter Erlang/OTP 24 integration

Erlang/OTP 24 ships with JIT compilation and Elixir developers don’t have to do anything to reap its benefits. There are many other features in Erlang/OTP 24 to look forward to and Elixir v1.12 provides integration with many of them, such as support for 16-bit floats in bitstrings as well as performance improvements in the compiler and during code evaluation.

Another excellent feature in Erlang/OTP 24 is the implementation of EEP 54, which provides extended error information for many functions in Erlang’s stdlib. Elixir v1.12 fully leverages this feature to improve reporting for errors coming from Erlang. For example, in earlier OTP versions, inserting an invalid argument into an ETS table that no longer exists would simply error with ArgumentError:

<noscript><p><a href="">See the example in asciinema</a></p></noscript>

However, in Elixir v1.12 with Erlang/OTP 24:

<noscript><p><a href="">See the example in asciinema</a></p></noscript>
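
The difference can be reproduced with a short script. This is a sketch with a made-up table name; the exact wording of the message depends on the Elixir and Erlang/OTP versions in use, so no output is shown here.

```elixir
# Insert into an ETS table that does not exist and capture the error.
error =
  try do
    :ets.insert(:no_such_table, {:key, :value})
  rescue
    e in ArgumentError -> e
  end

# On Elixir v1.12+ with Erlang/OTP 24+ the message carries the extra
# EEP 54 details; on older versions it is a generic argument error.
IO.puts(Exception.message(error))
```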

Finally, note Rebar v2 no longer works on Erlang/OTP 24+. Mix defaults to Rebar v3 since Elixir v1.4, so no changes should be necessary for the vast majority of developers. However, if you are explicitly setting manager: :rebar in your dependency, you want to move to Rebar v3 by removing the :manager option. Compatibility with unsupported Rebar versions will be removed from Mix in the future.

Stepped ranges

Elixir has had support for ranges from before its v1.0 release. Ranges support only integers and are inclusive, using the mathematical notation a..b. Ranges in Elixir are either increasing 1..10 or decreasing 10..1 and the direction of the range was always inferred from the first and last positions. Ranges are always lazy as their values are emitted as they are enumerated rather than being computed upfront.

Unfortunately, due to this inference, it is not possible to have empty ranges. For example, if you want to create a list of n elements, you cannot express it with a range from 1..n, as 1..0 (for n=0) is a decreasing range with two elements.

Elixir v1.12 supports stepped ranges via the first..last//step notation. For example: 1..10//2 will emit the numbers 1, 3, 5, 7, and 9. You can consider the // operator as an equivalent to “range division”, as it effectively divides the number of elements in the range by step, rounding up on inexact scenarios. Steps can be either positive (increasing ranges) or negative (decreasing ranges). Stepped ranges bring more expressive power to Elixir ranges and they elegantly solve the empty range problem, as they allow the direction of the steps to be explicitly declared instead of inferred.

As of Elixir v1.12, implicitly decreasing ranges are soft-deprecated and warnings will be emitted in future Elixir versions based on our deprecation policy.
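
A few stepped ranges, as a quick sketch of the behaviour described above (requires Elixir v1.12 or later):

```elixir
# Every second number from 1 to 10.
IO.inspect(Enum.to_list(1..10//2))
# => [1, 3, 5, 7, 9]

# An explicit positive step makes 1..n an empty range when n is 0.
n = 0
IO.inspect(Enum.to_list(1..n//1))
# => []

# Decreasing ranges take a negative step.
IO.inspect(Enum.to_list(10..1//-3))
# => [10, 7, 4, 1]

# "Range division": 10 elements with step 3, rounded up, gives 4.
IO.inspect(Enum.count(1..10//3))
# => 4
```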

then/2 and tap/2

Two new functions have been added to the Kernel module, in order to ease working with pipelines. tap/2 passes the given argument to an anonymous function, returning the argument itself. then/2 passes the given argument to an anonymous function, returning the result. The following:

"hello world"
|> tap(&IO.puts/1)
|> then(&Regex.scan(~r/\w+/, &1))

Is equivalent to this:

"hello world"
|> (fn x ->
      IO.puts(x)
      x
    end).()
|> (&Regex.scan(~r/\w+/, &1)).()

Both tap/2 and then/2 are implemented as macros, and compiler improvements available on Erlang/OTP 24 ensure the intermediate anonymous function is optimized away, which guarantees the idioms above do not have any performance impact on your code.

IEx improvements

IEx got two important quality of life improvements in this release. Hitting tab after a function invocation will show all of the arguments for said function and it is now possible to paste code with pipelines in the shell. See both features in action below:

<noscript><p><a href="">See the example in asciinema</a></p></noscript>

Additional functions

Elixir v1.12 has also added many functions across the standard library. The Enum module received additions such as Enum.count_until/2, Enum.product/1, Enum.zip_with/2, and more. The Integer module now includes Integer.pow/2 and Integer.extended_gcd/2.
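
A short tour of the functions named above, with illustrative inputs (requires Elixir v1.12 or later):

```elixir
# Stops counting as soon as the limit is reached.
IO.inspect(Enum.count_until(1..100, 5))
# => 5

IO.inspect(Enum.product([2, 3, 4]))
# => 24

# Zips a list of enumerables, combining each group with the function.
IO.inspect(Enum.zip_with([[1, 2], [3, 4]], fn [a, b] -> a + b end))
# => [4, 6]

IO.inspect(Integer.pow(2, 10))
# => 1024

# Returns {gcd, m, n} such that 240 * m + 46 * n == gcd.
IO.inspect(Integer.extended_gcd(240, 46))
# => {2, -9, 47}
```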

The Code module got a cursor_context/2 function, which is now used to power IEx autocompletion and it is used by projects such as Livebook to provide intellisense.

The EEx application has also been extended to provide metadata on text segments. This has enabled the Surface and Phoenix LiveView teams to implement a new template language called HEEx, which validates both HTML and EEx. Finally, the Registry module supports the :compressed option, which is useful for GraphQL applications managing hundreds of thousands of subscriptions via Absinthe.

For a complete list of all changes, see the full release notes. Check the Install section to get Elixir installed and read our Getting Started guide to learn more.

Have fun!


OTP 24.0 Release

OTP 24

Erlang/OTP 24 is a new major release with new features, improvements as well as a few incompatibilities.

Below are some of the highlights of the release:



  • The compiler will now inline funs that are used only once immediately after their definition.
  • Compiler warnings and errors now include column numbers in addition to line numbers.
  • Variables bound between the keywords 'try' and 'of' can now be used in the clauses following the 'of' keyword
    (that is, in the success case when no exception was raised).
  • Generators in list and binary comprehensions will now
    raise a {bad_generator,Generator} exception if the
    generator has an incorrect type.
    Similarly, when a filter does not evaluate to a boolean, a
    {bad_filter,Filter} exception will be raised.
  • Warnings for expressions whose result was ignored that could be suppressed by
    using the anonymous variable '_' can now be suppressed with a variable beginning with '_'.
  • Selective receive optimization will now be applied much
    more often.
    The new recv_opt_info compile flag can be used to print
    diagnostics relating to this optimization.
    You can read more about the selective receive
    optimization in the Efficiency Guide.

erts, kernel, stdlib

  • Hex encoding and decoding functions have been added to the binary module.

  • The BeamAsm JIT-compiler has been added to Erlang/OTP and will give a significant performance boost for many applications.
    The JIT-compiler is enabled by default on most x86 64-bit platforms that have a C++ compiler that can compile C++17.
    To verify that a JIT enabled emulator is running you can use erlang:system_info(emu_flavor).

  • A compatibility adaptor for gen_tcp to use the new socket API has been implemented (gen_tcp_socket).

  • Extended error information for failing BIF calls as proposed in EEP 54 has been implemented.

  • Process aliases as outlined by EEP 53 have been introduced.

  • Implementation of EEP 56 in supervisor. It adds the concept of significant children as well as the auto_shutdown supervisor flag. See the supervisor manual page for more information.
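
Two of the additions above are easy to try from a shell. The sketch below calls them from Elixir to match the other examples in this feed and assumes it runs on Erlang/OTP 24 or later; the sample string is arbitrary.

```elixir
# Hex helpers added to the binary module in OTP 24.
hex = :binary.encode_hex("hello")
IO.inspect(hex)
IO.inspect(:binary.decode_hex(hex))

# Reports :jit when the BeamAsm JIT is active, :emu for the interpreter.
IO.inspect(:erlang.system_info(:emu_flavor))
```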


  • Add support for FTPES (explicit FTP over TLS).


  • Make TLS handshakes in Erlang distribution concurrent.
  • TLS connections now support EdDSA certificates.


  • The wx application has been completely rewritten in order
    to use wxWidgets version 3 as its base.
  • Added support for wxWebView.


  • EDoc is now capable of emitting EEP-48 doc chunks. This means that, with some
    configuration, community projects can now provide documentation for shell_docs
    the same way that OTP libraries did since OTP 23.0.

For more details about new features and potential incompatibilities see

Pre built versions for Windows can be fetched here:

Online documentation can be browsed here:

The Erlang/OTP source can also be found at GitHub on the official Erlang repository,

Many thanks to all the contributors.


Erlang/OTP 24 Highlights

Finally Erlang/OTP 24 is here! A release that for me has been about 10 years in the making. As is tradition by now, this blog post will go through the additions to Erlang/OTP that I am most excited about!

Erlang/OTP 24 includes contributions from 60+ external contributors totalling 1400+ commits, 300+ PRs and changing 0.5 million(!) lines of code. Though I’m not sure the line number should count as we vendored all of AsmJit and re-generated the wxWidgets support. If we ignore AsmJit and wx, there are still 260k lines of code added and 320k lines removed, which is about 100k more than what our releases normally contain.

You can download the readme describing the changes here: Erlang/OTP 24 Readme. Or, as always, look at the release notes of the application you are interested in. For instance here: Erlang/OTP 24 - Erts Release Notes - Version 12.0.

This year's highlights are:

BeamAsm - the JIT compiler for Erlang

The most anticipated feature of Erlang/OTP 24 has to be the JIT compiler. A lot has already been said about it:

and even before the release, the WhatsApp team has shown what it is capable of.

However, besides the performance gains that the JIT brings, what I am the most excited about is the benefits that come with running native code instead of interpreting. What I’m talking about is the native code tooling that now becomes available to all Erlang programmers, such as integration with perf.

As an example, when building a dialyzer plt of a small core of Erlang, the previous way to profile would be via something like eprof.

> eprof:profile(fun() ->

This increases the time to build the PLT from about 1.2 seconds to 15 seconds on my system. In the end, you get something like the below that will guide you to what you need to optimize. Maybe take a look at erl_types:t_has_var*/1 and check if you really need to call it 13-15 million times!

> eprof:analyze(total).
FUNCTION                      CALLS        %     TIME [uS / CALLS]
--------                      -----  -------     ---- [----------]
erl_types:t_sup1/2          2744805     1.68   752795 [      0.27]
erl_types:t_subst/2         2803211     1.92   858180 [      0.31]
erl_types:t_limit_k/2       3783173     2.04   913217 [      0.24]
maps:find/2                 4798032     2.14   957223 [      0.20]
erl_types:t_has_var/1      15943238     5.89  2634428 [      0.17]
erl_types:t_has_var_list/1 13736485     7.51  3360309 [      0.24]
------------------------  ---------  ------- -------- [----------]
Total:                    174708211  100.00% 44719837 [      0.26]

In Erlang/OTP 24 we can get the same result without having to pay the pretty steep cost of profiling with eprof. When running the same analysis as above using perf it takes roughly 1.3 seconds to run.

$ ERL_FLAGS="+JPperf true" perf record dialyzer --build_plt \
    --apps erts

Then we can use tools such as perf report, hotspot or speedscope to analyze the results.

$ hotspot

[Image: hotspot view of the perf recording]

In the above, we can see that we get roughly the same result as when using eprof, though interestingly not exactly the same. I’ll leave the whys of this up to the reader to find out :)

With this little overhead when profiling, we can run scenarios that previously would take too long to run when profiling. For those brave enough it might even be possible to run always-on profiling in production!

The journey of what can be done with perf has only started. In PR-4676 we will be adding frame pointer support, which will give much more accurate call frames when profiling. In the end, the goal is to have mappings to Erlang source code lines instead of only functions when using perf report and hotspot to analyze a perf recording.

Improved error messages

Erlang’s error messages tend to get a lot of (valid) criticism for being hard to understand. Two great new features have been added to help the user understand why something has failed.

Column number in warnings and errors

Thanks to the work of Richard Carlsson and Hans Bolinder, when you compile Erlang code you now get the line and column of errors and warnings printed in the shell together with a ^-sign showing exactly where the error actually was. For example, if you compile the below:

foo(A, B) ->
  #{ a => A, b := B }.

you would in Erlang/OTP 23 and earlier get:

$ erlc t.erl
t.erl:6: only association operators '=>' are allowed in map construction

but in Erlang/OTP 24 you now also get the following printout:

$ erlc t.erl
t.erl:6:16: only association operators '=>' are allowed in map construction
%    6|   #{ a => A, b := B }.
%     |                ^

This behavior also extends into most of the Erlang code editors so that when you use VSCode or Emacs through Erlang LS or flycheck you also get a narrower warning/error indicator, for example in Emacs using Erlang LS.

[Image: Emacs using Erlang LS, with the error indicator narrowed to the offending column]

EEP-54: Improved BIF error information

One of the other big changes when it comes to error information is the introduction of EEP-54. In the past many of the BIFs (built-in functions) would give very cryptic error messages:

1> element({a,b,c}, 1).
** exception error: bad argument
     in function  element/2
        called as element({a,b,c},1)

In the example above, the only thing we know is that one or more of the arguments are invalid, but without checking the documentation there is no way of knowing which one and why. This is especially a problem for BIFs where the arguments may fail for different reasons depending on factors not visible in the arguments. For example in the ets:update_counter call below:

> ets:update_counter(table, k, 1).
** exception error: bad argument
     in function  ets:update_counter/3
        called as ets:update_counter(table,k,1)

We don’t know if the call failed because the table did not exist at all or if the key k that we wanted to update did not exist in the table.

In Erlang/OTP 24 both of the examples above will have much clearer error messages.

1> element({a,b,c}, 1).
** exception error: bad argument
     in function  element/2
        called as element({a,b,c},1)
        *** argument 1: not an integer
        *** argument 2: not a tuple
2> ets:new(table,[named_table]).
3> ets:update_counter(table, k, 1).
** exception error: bad argument
     in function  ets:update_counter/3
        called as ets:update_counter(table,k,1)
        *** argument 2: not a key that exists in the table

That looks much better and now we can see what the problem was! The standard logging formatters also include the additional information so that if this type of error happens in a production environment you will get the extra error information:

1> proc_lib:spawn(fun() -> ets:update_counter(table, k, 1) end).
=CRASH REPORT==== 10-May-2021::11:20:35.367023 ===
    initial call: erl_eval:'-expr/5-fun-3-'/0
    pid: <0.94.0>
    registered_name: []
    exception error: bad argument
      in function  ets:update_counter/3
         called as ets:update_counter(table,k,1)
         *** argument 1: the table identifier does
                         not refer to an existing ETS table
    ancestors: [<0.92.0>]

EEP-54 is not only useful for error messages coming from BIFs but can be used by any application that wants to provide extra information about their exceptions. For example, we have been working on providing better error information around io:format in PR-4757.
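
As a rough sketch of how an application could hook into this, the module below raises an error with erlang:error/3 and an error_info map, and exports a format_error/2 callback that maps argument positions to explanations. The module name eep54_demo and the function half/1 are made up for illustration; only erlang:error/3, the error_info option and the format_error/2 callback convention come from EEP-54.

```erlang
-module(eep54_demo).
-export([half/1, format_error/2]).

%% half/1 is a hypothetical function that only accepts even integers.
half(N) when is_integer(N), N rem 2 =:= 0 ->
    N div 2;
half(N) ->
    %% erlang:error/3 (new in OTP 24) stores the error_info map in the
    %% top stack frame. It names the module whose format_error/2 should
    %% be asked for a more detailed explanation.
    erlang:error(badarg, [N], [{error_info, #{module => ?MODULE}}]).

%% Called by the shell and logger formatters when the error is printed.
%% Returns a map from argument position to a description of the problem.
format_error(badarg, [{?MODULE, half, [N], _Info} | _]) when is_integer(N) ->
    #{1 => <<"not an even integer">>};
format_error(badarg, [{?MODULE, half, _Args, _Info} | _]) ->
    #{1 => <<"not an integer">>};
format_error(_Reason, _StackTrace) ->
    #{}.
```

Calling eep54_demo:half(3) in an OTP 24 shell should then print the usual bad argument report followed by an extra line for argument 1.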

Improved receive optimizations

Since Erlang/OTP R14 (released in 2010), the Erlang compiler and run-time system have co-operated to optimize the code pattern used by gen_server:call-like functionality, so that it avoids scanning a potentially huge mailbox. The basic pattern looks like this:

call(To, Msg) ->
  Ref = make_ref(),
  To ! {call, Ref, self(), Msg},
  receive
    {reply, Ref, Reply} -> Reply
  end.

From this, the compiler can figure out that at the moment Ref is created, no message already in the process’s mailbox can contain Ref, and therefore the receive can skip all of those messages when waiting for the Reply.
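
To try the pattern end to end, here is a minimal self-contained sketch that pairs call/2 with the other side of the protocol. The module name call_demo and the echo server are assumptions made for illustration; they are not part of OTP.

```erlang
-module(call_demo).
-export([start/0, call/2]).

%% Hypothetical server: answers every call with {echo, Msg}.
start() ->
    spawn(fun loop/0).

loop() ->
    receive
        {call, Ref, From, Msg} ->
            %% The reply carries the caller's fresh reference back.
            From ! {reply, Ref, {echo, Msg}},
            loop()
    end.

%% Client side: the same shape as the basic pattern above.
call(To, Msg) ->
    Ref = make_ref(),
    To ! {call, Ref, self(), Msg},
    receive
        %% Only a message containing Ref can match, so messages that were
        %% queued before make_ref/0 was called never need to be inspected.
        {reply, Ref, Reply} -> Reply
    end.
```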

This has always worked great in simple scenarios like this, but as soon as you had to make the scenarios a little more complex it tended to break the compiler’s analysis and you would end up scanning the entire mailbox. For example, in the code below Erlang/OTP 23 will not optimize the receive.

call(To, Msg, Async) ->
  Ref = make_ref(),
  To ! {call, Ref, self(), Msg},
  if
    Async ->
      {ok, Ref};
    not Async ->
      receive
        {reply, Ref, Reply} -> Reply
      end
  end.

That all changes with Erlang/OTP 24! Many more complex scenarios are now covered by the optimization and a new compiler flag has been added to tell the user if an optimization is done.

$ erlc +recv_opt_info test.erl
test.erl:6: Warning: OPTIMIZED: reference used to mark
                                a message queue position
%    6|   Ref = make_ref(),
test.erl:12: Warning: OPTIMIZED: all clauses match reference
                                 created by make_ref/0
                                 at test.erl:6
%   12|       receive

Even patterns such as multi_call are now optimized to not scan the mailbox of the process.

multi_call(ToList, Msg) ->
  %% OPTIMIZED: reference used to mark a message queue position
  Ref = make_ref(),
  %% INFO: passing reference created by make_ref/0 at test.erl:18
  [To ! {call, Ref, self(), Msg} || To <- ToList],
  %% INFO: passing reference created by make_ref/0 at test.erl:18
  %% OPTIMIZED: all clauses match reference
  %%            in function parameter 2
  [receive {reply, Ref, Reply} -> Reply end || _ <- ToList].

There are still a lot of places where this optimization does not trigger. For instance, as soon as any of the make_ref/send/receive are in different modules it will not work. However, the improvements in Erlang/OTP 24 leave far fewer such scenarios, and now we also have the tools to check whether the optimization is triggered!

You can read more about this optimization and others in the Efficiency Guide.

EEP-53: Process aliases

When doing a call to another Erlang process, the pattern used by gen_server:call, gen_statem:call and others normally looks something like this:

call(To, Msg, Tmo) ->
  MonRef = erlang:monitor(process, To),
  To ! {call, MonRef, self(), Msg},
  receive
    {'DOWN',MonRef,_,_,Reason} ->
      {error, Reason};
    {reply, MonRef, Reply} ->
      erlang:demonitor(MonRef, [flush]),
      {ok, Reply}
  after Tmo ->
      erlang:demonitor(MonRef, [flush]),
      {error, timeout}
  end.

This normally works well except for when a timeout happens. When a timeout happens the process on the other end has no way to know that the reply is no longer needed and so will send it anyway when it is done with it. This causes all kinds of problems as the user of a third-party library would never know what messages to expect to be present in the mailbox.

There have been numerous attempts to solve this problem using the primitives that Erlang gives you, but in the end, most ended up just adding a handle_info in their gen_servers that ignored any unknown messages.

In Erlang/OTP 24, EEP-53 has introduced the alias functionality to solve this problem. An alias is a temporary reference to a process that messages can be sent to. In most respects, it works just like a PID, except that the lifetime of an alias is not tied to the lifetime of the process it represents. So when you try to send a late reply to an alias that has been deactivated, the message will just be dropped.

The code changes needed to make this happen are very small and are already used behind the scenes in all the standard behaviors of Erlang/OTP. The only thing needed to be changed in the example code above is that a new option must be given to erlang:monitor and the reply reference should now be the alias instead of the calling PID. That is, like this:

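call(To, Msg, Tmo) ->
  MonAlias = erlang:monitor(process, To, [{alias, demonitor}]),
  To ! {call, MonAlias, MonAlias, Msg},
  receive
    {'DOWN', MonAlias, _ , _, Reason} ->
      {error, Reason};
    {reply, MonAlias, Reply} ->
      %% Removing the monitor also deactivates the alias.
      erlang:demonitor(MonAlias, [flush]),
      {ok, Reply}
  after Tmo ->
      %% Any reply sent after this point is dropped.
      erlang:demonitor(MonAlias, [flush]),
      {error, timeout}
  end.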

You can read more about this functionality in the alias documentation.
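
The dropping behaviour can also be observed in isolation with erlang:alias/0 and erlang:unalias/1, without any monitor involved. The following is only a small sketch; the module name alias_demo and the 100 ms wait are arbitrary choices made for the example.

```erlang
-module(alias_demo).
-export([late_reply_is_dropped/0]).

late_reply_is_dropped() ->
    %% Create an alias for the calling process.
    Alias = erlang:alias(),

    %% While the alias is active, it behaves like a PID when sending.
    Alias ! first,
    receive first -> ok end,

    %% Deactivate the alias; unalias/1 returns true if it was active.
    true = erlang:unalias(Alias),

    %% A message sent to a deactivated alias is silently dropped.
    Alias ! second,
    receive
        second -> delivered
    after 100 ->
        dropped
    end.
```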

EEP-48: Documentation chunks for edoc

In Erlang/OTP 23 erl_docgen was extended to be able to emit EEP-48 style documentation. This allowed the documentation to be used by h(lists) in the Erlang shell and external tools such as Erlang LS. However, there are very few applications outside Erlang/OTP that use erl_docgen to create documentation, so EEP-48 style documentation was unavailable to those applications. Until now!

Radek Szymczyszyn has added support for EEP-48 into edoc which means that from Erlang/OTP 24 you can view both the documentation of lists:foldl/3 and recon:info/1.

$ rebar3 as docs shell
Erlang/OTP 24 [erts-12.0] [source] [jit]

Eshell V11.2.1  (abort with ^G)
1> h(recon,info,1).
 -spec info(PidTerm) ->
   [{info_type(), [{info_key(), Value}]}, ...]
     when PidTerm :: pid_term().

  Allows to be similar to erlang:process_info/1, but excludes
  fields such as the mailbox, which tend to grow
  and be unsafe when called in production systems. Also includes
  a few more fields than what is usually given (monitors,
  monitored_by, etc.), and separates the fields in a more
  readable format based on the type of information contained.

For more information about how to enable this in your project see the Doc chunks section in the Edoc User’s Guide.
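
As a hedged sketch of what such a configuration might look like with rebar3, the fragment below selects the chunk doclet and layout that ship with edoc in Erlang/OTP 24. The output directory is an assumption (recon is used as the example application and the path depends on your project layout), so treat the User’s Guide as the authority here.

```erlang
%% rebar.config (fragment)
%% edoc_doclet_chunks and edoc_layout_chunks make edoc write EEP-48
%% doc chunks instead of HTML.
{edoc_opts, [{doclet, edoc_doclet_chunks},
             {layout, edoc_layout_chunks},
             {preprocess, true},
             %% Assumed output location; adjust to your application.
             {dir, "_build/default/lib/recon/doc"}]}.
```

After running rebar3 edoc with a configuration along these lines, the chunks should be found by h/3 when the shell is started from the same project.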

socket support in gen_tcp

The gen_tcp module has gotten support for optionally using the new socket NIF API instead of the previous inet driver. The new interface can be configured either on a system level, by setting the application configuration parameter like this: -kernel inet_backend socket, or on a per-connection basis like this: gen_tcp:connect(localhost,8080,[{inet_backend,socket}]).

If you do this you will notice that the Socket returned by gen_tcp is no longer a port but instead a tuple containing (among other things) a PID and a reference.

1> gen_tcp:connect(localhost,8080,[{inet_backend,socket}]).

This data structure is and always has been opaque, and therefore should not be inspected directly but instead only used as an argument to other gen_tcp and inet functions.

You can then use inet:i/0 to get a listing of all open sockets in the system:

2> inet:i().
Port      Module         Recv Sent Owner    Local Address   Foreign Address    State Type   
esock[19] gen_tcp_socket 0    0    <0.98.0> localhost:44082 localhost:http-alt CD:SD STREAM 

The gen_tcp API should be completely backward compatible with the old implementation, so if you can, please test it and report any bugs that you find back to us.

Why should you want to test this? Because in some of our benchmarks, we get up to 4 times the throughput vs the old implementation. In others, there is no difference or even a loss of throughput. So, as always, you need to measure and check for yourself!
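
If you want a quick smoke test before benchmarking, the sketch below sends one message over the loopback interface with the socket backend selected on both ends. The module name socket_backend_demo is made up, and the example assumes a platform where the socket module is available. Note that the inet_backend option has to come first in the option list.

```erlang
-module(socket_backend_demo).
-export([echo_once/0]).

echo_once() ->
    %% {inet_backend, socket} must be the first option in the list.
    Opts = [{inet_backend, socket}, binary, {active, false}],

    %% Listen on an ephemeral port and find out which one we got.
    {ok, Listen} = gen_tcp:listen(0, Opts),
    {ok, Port} = inet:port(Listen),

    %% Connect over loopback and accept the pending connection.
    {ok, Client} = gen_tcp:connect({127,0,0,1}, Port, Opts),
    {ok, Server} = gen_tcp:accept(Listen),

    %% The rest of the gen_tcp API is used exactly as before.
    ok = gen_tcp:send(Client, <<"ping">>),
    {ok, Data} = gen_tcp:recv(Server, 4, 1000),

    ok = gen_tcp:close(Client),
    ok = gen_tcp:close(Server),
    ok = gen_tcp:close(Listen),
    Data.
```

Swapping the first option for {inet_backend, inet} runs the same code on the old driver, which makes it easy to compare the two in your own measurements.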

EEP-56: Supervisor automatic shutdown

When creating supervisor hierarchies for applications that manage connections such as ssl or ssh, there are times when there is a need for terminating that supervisor hierarchy from within. Some event happens on the socket that should trigger a graceful shutdown of the processes associated with the connection.

Normally this would be done by using supervisor:terminate_child/2. However, this has two problems.

  1. It requires the child to know the ID of the child that needs to be terminated and the PID of the supervisor to talk to. This is simple when there is just one process in the supervisor, but when there are supervisors under supervisors, this becomes harder and harder to figure out.
  2. Calling supervisor:terminate_child/2 is a synchronous operation. This means that if you do the call in the child, you may end up in a deadlock as the top supervisor wants to terminate the child while the child is blocking in the call to terminate itself.

To solve this problem EEP-56 has added a mechanism in which a child can be marked as significant and if such a child terminates, it can trigger an automatic shutdown of the supervisor that it is part of.

This way a child process can trigger the shutdown of a supervisor hierarchy from within, without the child having to know anything about the supervisor hierarchy or risking deadlocking itself during termination.

You can read more about automatic shutdown in the supervisor documentation.
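
A minimal sketch of the mechanism, under the assumption of a single connection process per supervisor, could look like this. The module name conn_sup, the child id connection and the close message are all invented for the example; the auto_shutdown supervisor flag and the significant child spec key are the parts that come from EEP-56.

```erlang
-module(conn_sup).
-behaviour(supervisor).
-export([start_link/0, init/1, start_connection/0]).

start_link() ->
    supervisor:start_link(?MODULE, []).

init([]) ->
    %% any_significant: shut the supervisor down as soon as any
    %% significant child terminates without being restarted.
    SupFlags = #{strategy => one_for_all,
                 auto_shutdown => any_significant},
    %% Significant children must be transient or temporary.
    Child = #{id => connection,
              start => {?MODULE, start_connection, []},
              restart => transient,
              significant => true},
    {ok, {SupFlags, [Child]}}.

%% Hypothetical connection process: exits normally once it is told
%% that the connection has been closed.
start_connection() ->
    Pid = proc_lib:spawn_link(fun() -> receive close -> ok end end),
    {ok, Pid}.
```

When the connection process receives close and exits normally, it is not restarted (it is transient), and because it is significant the supervisor shuts itself down with reason shutdown, without the child ever calling supervisor:terminate_child/2.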

Edwards-curve Digital Signature Algorithm

With Erlang/OTP 24 comes support for Edwards-curve Digital Signature Algorithm (EdDSA). EdDSA can be used when connecting to or acting as a TLS 1.3 client/server.

EdDSA is an elliptic curve signature algorithm, like ECDSA, that can be used for secure communication. The security of ECDSA relies on a strong cryptographically secure random number, which can cause issues when the random number is by mistake not secure enough, as has been the case in several uses of ECDSA (none of them in Erlang as far as we know :).

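EdDSA does not rely on a strong random number when signing: the per-signature nonce is derived deterministically from the private key and the message. This means that when you are using EdDSA, the signatures remain secure even if your random number generator is weak at signing time.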

Despite the added security, EdDSA is claimed to be faster than other elliptic curve signature algorithms. If you have OpenSSL 1.1.1 or later, then as of Erlang/OTP 24 you will have access to this algorithm!

> crypto:supports(curves).
  c2tnb359v1, c2tnb431r1, ed25519, ed448, ipsec3, ipsec4
 ...]                     ^        ^
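
Outside of TLS, the same curves can be used directly through the crypto module. The following is a small sketch, assuming OpenSSL 1.1.1 or later and a non-FIPS build; the module name eddsa_demo is made up, and none is passed as the digest type since EdDSA does its own hashing.

```erlang
-module(eddsa_demo).
-export([sign_and_verify/1]).

%% Generates a fresh Ed25519 key pair, signs Msg and verifies the result.
sign_and_verify(Msg) when is_binary(Msg) ->
    {Pub, Priv} = crypto:generate_key(eddsa, ed25519),
    Signature = crypto:sign(eddsa, none, Msg, [Priv, ed25519]),
    crypto:verify(eddsa, none, Msg, Signature, [Pub, ed25519]).
```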


Copyright © 2016, Planet Erlang. No rights reserved.
Planet Erlang is maintained by Proctor.