Erlang/OTP 28.0-rc1 is the first release candidate of three before the OTP 28.0 release.
The intention with this release is to get feedback from our users. All feedback is welcome, even if it is only to say that it works for you. We encourage users to try it out and give us feedback either by creating an issue at https://github.com/erlang/otp/issues or by posting to Erlang Forums.
All artifacts for the release can be downloaded from the Erlang/OTP Github release and you can view the new documentation at https://erlang.org/documentation/doc-16.0-rc1/doc. You can also install the latest release using kerl like this:
kerl build 28.0-rc1 28.0-rc1
Starting with this release, a source Software Bill of Materials (SBOM) will describe the release on the Github Releases page. We welcome feedback on the SBOM.
Erlang/OTP 28 is a new major release with new features and improvements, as well as a few incompatibilities. Some of the new features are highlighted below.
Many thanks to all contributors!
Comprehensions have been extended with “zip generators” allowing multiple generators to be run in parallel. For example, [A+B || A <- [1,2,3] && B <- [4,5,6]] will produce [5,7,9].
Generators in comprehensions can now be strict, meaning that if the generator pattern does not match, an exception will be raised instead of silently ignoring the value that didn’t match. Strict generators use the <:- and <:= operators; for example, [B || {ok, B} <:- Results] raises on any element of Results that is not an {ok, _} tuple.
It is now possible to use any base for floating point numbers as per EEP 75: Based Floating Point Literals. For example, 2#0.111 is the base-2 literal for 0.875.
For certain types of errors, the compiler can now suggest corrections. For example, when attempting to use variable A that is not defined but A0 is, the compiler could emit the following message: variable 'A' is unbound, did you mean 'A0'?
The size of an atom in the Erlang source code was limited to 255 bytes in previous releases, meaning that an atom containing only emojis could contain only 63 emojis. While atoms are still only allowed to contain 255 characters, the number of bytes is no longer limited.
The warn_deprecated_catch option enables warnings for use of old-style catch expressions of the form catch Expr instead of the modern try … catch … end.
Provided that the map argument for a maps:put/3 call is known to the compiler to be a map, the compiler will replace such calls with the corresponding update using the map syntax, rewriting maps:put(Key, Value, Map) into Map#{Key => Value}.
Some BIFs with side effects (such as binary_to_atom/1) are optimized in try … catch in the same way as guard BIFs in order to gain performance.
The compiler’s alias analysis pass is now both faster and less conservative, allowing optimizations of records and binary construction to be applied in more cases.
The trace:system/3 function has been added. It has an interface similar to erlang:system_monitor/2, but it also supports trace sessions.
os:set_signal/2 now supports setting handlers for the SIGWINCH, SIGCONT, and SIGINFO signals.
The two new BIFs erlang:processes_iterator/0 and erlang:process_next/1 make it possible to iterate over the process table in a way that scales better than erlang:processes/0.
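Called from Elixir, a minimal sketch of such an iteration might look as follows. The exact return shape of erlang:process_next/1 ({pid, iterator} until :none, as assumed here) should be verified against the OTP documentation:
defmodule ProcWalk do
  # Walk the process table one pid at a time, applying fun to each.
  # Assumes process_next/1 returns {pid, next_iterator} or :none.
  def each(fun) do
    walk(:erlang.processes_iterator(), fun)
  end

  defp walk(iter, fun) do
    case :erlang.process_next(iter) do
      :none ->
        :ok

      {pid, next_iter} ->
        fun.(pid)
        walk(next_iter, fun)
    end
  end
end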
The erl -noshell mode has been updated to have two sub-modes called raw and cooked, where cooked is the old default behaviour and raw can be used to bypass the line-editing support of the native terminal. Using raw mode it is possible to read keystrokes as they occur without the user having to press Enter. Also, the raw mode does not echo the typed characters to stdout.
The shell now prints a help message explaining how to interrupt a running command when it has been executing for longer than 5 seconds.
The join(Binaries, Separator) function, which joins a list of binaries, has been added to the binary module.
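Given that signature, usage (shown here from Elixir, which prints printable binaries as strings) looks like this:
iex> :binary.join([<<"red">>, <<"green">>, <<"blue">>], <<", ">>)
"red, green, blue"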
By default, sets created by the sets module will now be represented as maps.
The re module has been updated to use the newer PCRE2 library instead of the PCRE library.
There is a new zstd module that provides Zstandard compression.
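As a rough sketch of a round trip from Elixir (the one-shot compress/1 and decompress/1 function names are assumptions based on the announcement; the module also has more granular APIs, so consult the docs):
iex> data = :binary.copy(<<"hello ">>, 1000)
iex> compressed = :zstd.compress(data)
iex> IO.iodata_to_binary(:zstd.decompress(compressed)) == data
true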
The indent-region command in Emacs will now handle multiline strings better.
For more details about new features and potential incompatibilities see the README.
Welcome to our series of case studies about companies using Elixir in production.
Remote is the everywhere employment platform enabling companies to find, hire, manage, and pay people anywhere across the world.
Founded in 2019, they reached unicorn status in just over two years and have continued their rapid growth trajectory since.
Since day zero, Elixir has been their primary technology. Currently, their engineering organization as a whole consists of nearly 300 individuals.
This case study focuses on their experience using Elixir in a high-growth environment.
Marcelo Lebre, co-founder and president of Remote, had worked with many languages and frameworks throughout his career, often encountering the same trade-off: easy-to-code versus easy-to-scale.
In 2015, while searching for alternatives, he discovered Elixir. Intrigued, Marcelo decided to give it a try and immediately saw its potential. At the time, Elixir was still in its early days, but he noticed how fast the community was growing, with packages and frameworks appearing at a rapid pace.
In December 2018, when Marcelo and his co-founder decided to start the company, they had to make a decision about the technology that would support their vision. Marcelo wanted to prioritize building a great product quickly without worrying about scalability issues from the start. He found Elixir to be the perfect match:
I wanted to focus on building a great product fast and not really worry about its scalability. Elixir was the perfect match—reliable performance, easy-to-read syntax, strong community, and a learning curve that made it accessible to new hires.
- Marcelo Lebre, Co-founder and President
The biggest trade-off Marcelo identified was the smaller pool of Elixir developers compared to languages like Ruby or Python. However, he quickly realized that the quality of candidates more than made up for it:
The signal-to-noise ratio in the quality of Elixir candidates was much higher, which made the trade-off worthwhile.
- Marcelo Lebre, Co-founder and President
Remote operates primarily with a monolith, with Elixir on the backend and React on the frontend.
The monolith enabled speed and simplicity, allowing the team to iterate quickly and focus on building features. However, as the company grew, they needed to invest in tools and practices to manage almost 180 engineers working in the same codebase.
One practice was boundary enforcement. They used the Boundary library to maintain strict boundaries between modules and domains inside the codebase.
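For illustration (hypothetical module names, not Remote's actual code), a Boundary declaration lets each top-level module state which boundaries it depends on and what it exports; calls that cross boundaries outside those declarations produce compilation warnings:
defmodule MyApp.Payroll do
  # This boundary may call into MyApp.Accounts, and only Payslip is
  # visible to other boundaries; anything else is flagged at compile time.
  use Boundary, deps: [MyApp.Accounts], exports: [Payslip]
end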
Another key investment was optimizing their compilation time in the CI pipeline. Since their project has around 15,000 files, compiling it in every build would take too long. So, they implemented incremental builds in their CI pipeline, recompiling only the files affected by changes instead of the entire codebase.
I feel confident making significant changes in the codebase. The combination of using a functional language and our robust test suite allows us to keep moving forward without too much worry.
- André Albuquerque, Staff Engineer
Additionally, as their codebase grew, the Elixir language continued to evolve, introducing better abstractions for developers working with large codebases. For example, with the release of Elixir v1.11, the introduction of config/runtime.exs provided the Remote team with a better foundation for managing configuration. This enabled them to move many configurations from compile-time to runtime, significantly reducing unnecessary recompilations caused by configuration updates.
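As an illustration (hypothetical application and environment variable names), a value read in config/runtime.exs is resolved when the system boots rather than when it compiles:
# config/runtime.exs — evaluated at boot, so changing this configuration
# does not trigger recompilation of the modules that use it.
import Config

config :my_app, MyApp.Mailer,
  api_key: System.fetch_env!("MAILER_API_KEY")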
One might expect Remote’s infrastructure to be highly complex, given their global scale and the size of their engineering team. Surprisingly, their setup remains relatively simple, reflecting a thoughtful balance between scalability and operational efficiency.
Remote runs on AWS, using EKS (Elastic Kubernetes Service). The main backend (the monolith) operates in only five pods, each with 10 GB of memory. They use Distributed Erlang to connect the nodes in their cluster, enabling seamless communication between processes running on different pods.
For job processing, they rely on Oban, which runs alongside the monolith in the same pods.
Remote also offers a public API for partners. While this API server runs separately from the monolith, it is the same application, configured to start a different part of its supervision tree. The separation was deliberate, as the team anticipated different load patterns for the API and wanted the flexibility to scale it independently.
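A minimal sketch of that pattern (hypothetical names, not Remote's actual code) is to pick the supervision children at boot based on a runtime role setting:
defmodule MyApp.Application do
  use Application

  def start(_type, _args) do
    # :role would be set in config/runtime.exs, letting the same codebase
    # boot either as the monolith or as the public API server.
    children =
      case Application.fetch_env!(:my_app, :role) do
        :api -> [MyAppWeb.ApiEndpoint]
        :monolith -> [MyApp.Repo, MyAppWeb.Endpoint]
      end

    Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
  end
end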
The database setup includes a primary PostgreSQL instance on AWS RDS, complemented by a read-replica for enhanced performance and scalability. Additionally, a separate Aurora PostgreSQL instance is dedicated to storing Oban jobs. Over time, the team has leveraged tools like PG Analyze to optimize performance, addressing bottlenecks such as long queries and missing indexes.
This streamlined setup has proven resilient, even during unexpected spikes in workload. The team shared an episode where a worker’s job count unexpectedly grew by two orders of magnitude. Remarkably, the system handled the increase seamlessly, continuing to run as usual without requiring any design changes or manual intervention.
We once noticed two weeks later that a worker’s load had skyrocketed. But the scheduler worked fine, and everything kept running smoothly. That was fun.
- Alex Naser, Staff Engineer
Around 90% of their backend team works in the monolith, while the rest work in a few satellite services, also written in Elixir.
Within the monolith, teams are organized around domains such as onboarding, payroll, and billing. Each team owns one or multiple domains.
To streamline accountability in a huge monolith architecture, Remote invested heavily in team assignment mechanisms.
They implemented a tagging system that assigns ownership down to the function level. This means any trace—whether sent to tools like Sentry or Datadog—carries a tag identifying the responsible team. This tagging also extends to endpoints, allowing teams to monitor their areas effectively and even set up dashboards for alerts, such as query times specific to their domain.
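The article does not show the implementation, but a minimal version of function-level ownership tagging (all names hypothetical) could attach the team to logger metadata, which tools such as Sentry and Datadog pick up through their logger integrations:
defmodule MyApp.Billing.Invoices do
  require Logger

  @owner_team "billing"

  def generate(customer_id) do
    # Everything logged by this process from here on carries the team tag.
    Logger.metadata(team: @owner_team)
    Logger.info("generating invoice for #{customer_id}")
    # ... actual work ...
  end
end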
The tagging system also simplifies CI workflows. When a test breaks, it’s automatically linked to the responsible team based on the Git commit. This ensures fast issue identification and resolution, removing the need for manual triaging.
Remote’s hiring approach prioritizes senior engineers, regardless of their experience with Elixir.
During the hiring process, all candidates are required to complete a coding exercise in Elixir. For those unfamiliar with the language, a tailored version of the exercise is provided, designed to introduce them to Elixir while reflecting the challenges they would face if hired.
Once hired, new engineers are assigned an engineering buddy to guide them through the onboarding process.
For hires without prior Elixir experience, Remote developed an internal Elixir training camp, a curated collection of best practices, tutorials, and other resources to introduce new hires to the language and ecosystem. This training typically spans two to four weeks.
After completing the training, engineers are assigned their first tasks—carefully selected tickets designed to build confidence and familiarity with the codebase.
Remote’s journey highlights how thoughtful technology, infrastructure, and team organization decisions can support rapid growth.
By leveraging Elixir’s strengths, they built a monolithic architecture that balanced simplicity with scalability. This approach allowed their engineers to iterate quickly in the early stages while effectively managing the complexities of a growing codebase.
Investments in tools like the Boundary library and incremental builds ensured their monolith remained efficient and maintainable even as the team and codebase scaled dramatically.
Remote’s relatively simple infrastructure demonstrates that scaling doesn’t always require complexity. Their ability to easily handle unexpected workload spikes reflects the robustness of their architecture and operational practices.
Finally, their focus on team accountability and streamlined onboarding allowed them to maintain high productivity while integrating engineers from diverse technical backgrounds, regardless of their prior experience with Elixir.
Elixir v1.18 is an impressive release with improvements across the two main efforts happening within the Elixir ecosystem right now: set-theoretic types and language servers. It also comes with built-in JSON support and adds new capabilities to its unit testing library. Let’s go over each of those in detail.
There are several updates in the typing department, so let’s break them down.
There is an ongoing research and development effort to bring static types to Elixir. Elixir’s type system is:
sound - the types inferred and assigned by the type system align with the behaviour of the program
gradual - Elixir’s type system includes the dynamic() type, which can be used when the type of a variable or expression is checked at runtime. In the absence of dynamic(), Elixir’s type system behaves as a static one
developer friendly - the types are described, implemented, and composed using basic set operations: unions, intersections, and negation (hence it is a set-theoretic type system)
More interestingly, you can compose dynamic() with any type. For example, dynamic(integer() or float()) means the type is either integer() or float() at runtime. This allows the type system to emit warnings if none of the types are satisfied, even in the presence of dynamism.
Elixir v1.17 was the first release to incorporate the type system in the compiler. In particular, we have added support for primitive types (integer, float, binary, pids, references, ports), atoms, and maps. We also added type checking to a handful of operations related to those types, such as accessing fields in maps, as in user.adress (mind the typo), performing structural comparisons between structs, as in my_date < ~D[2010-04-17], etc.
The most exciting change in Elixir v1.18 is type checking of function calls, alongside gradual inference of patterns and return types. To understand how this will impact your programs, consider the following code defined in lib/user.ex:
defmodule User do
defstruct [:age, :car_choice]
def drive(%User{age: age, car_choice: car}, car_choices) when age >= 18 do
if car in car_choices do
{:ok, car}
else
{:error, :no_choice}
end
end
def drive(%User{}, _car_choices) do
{:error, :not_allowed}
end
end
Elixir’s type system will infer that the drive function expects a User struct as input and returns either {:ok, dynamic()}, {:error, :no_choice}, or {:error, :not_allowed}. Therefore, the following code
User.drive({:ok, %User{}}, car_choices)
will emit a warning stating that we are passing an invalid argument.
Now consider the expression below. We are expecting the User.drive/2 call to return :error, which cannot possibly be true:
case User.drive(user, car_choices) do
{:ok, car} -> car
:error -> Logger.error("User cannot drive")
end
Therefore, the code above would also emit a warning.
Our goal is for the warnings to provide enough contextual information to lead to clear reports, and that’s an area where we are actively looking for feedback. If you receive a warning that is unclear, please open up a bug report.
Elixir v1.18 also augments the type system with support for tuples and lists, plus type checking of almost all Elixir language constructs, except for-comprehensions, with, and closures. Here is a non-exhaustive list of the new violations that can be detected by the type system:
if you define a pattern that will never match any argument, such as def function(x = y, x = :foo, y = :bar)
matching or accessing tuples at an invalid index, such as elem(two_element_tuple, 2)
if you have a branch in a try that will never match the given expression
if you have a branch in a cond that always passes (except the last one) or always fails
if you attempt to use the return value of a call to raise/2 (which by definition returns no value)
In summary, this release takes us further in our journey of providing type checking and type inference of existing Elixir programs, without requiring Elixir developers to explicitly add type annotations.
For existing codebases with reasonable code coverage, most type system reports will come from uncovering dead code - code which won’t ever be executed - as seen in a few distinct projects. A notable example is the type system’s ability to track how private functions are used throughout a module and then point out which clauses are unused:
defmodule Example do
  def public(x) do
    # Integer.parse/1 returns either {integer, rest} or :error, so only
    # the clauses matching those shapes below can ever be invoked
    private(Integer.parse(x))
  end

  defp private(nil), do: nil
  defp private("foo"), do: "foo"
  defp private({int, _rest}), do: int
  defp private(:error), do: 0
  defp private("bar"), do: "bar"
end
Keep in mind the current implementation does not perform type inference of guards yet, which is an important source of typing information in programs. There is a lot the type system could still learn from our codebases that it does not yet. This brings us to the next topic.
The next Elixir release should improve the typing of maps, tuples, and closures, allowing us to type even more constructs. We also plan to fully type the with construct and for-comprehensions, as well as protocols.
But more importantly, we want to focus on complete type inference of guards, which in turn will allow us to explore ideas such as redundant pattern matching clauses and exhaustiveness checks. Our goal with inference is to strike the right balance between developer experience, compilation times, and the ability to find provable errors in existing codebases. You can learn more about the trade-offs we made for inference in our documentation.
Future Elixir versions will introduce user-supplied type signatures, which should bring the benefits of a static type system without relying on inference. Check our previous article on the overall milestones for more information.
The type system was made possible thanks to a partnership between CNRS and Remote. The development work is currently sponsored by Fresha (they are hiring!), Starfish*, and Dashbit.
Three months ago, we welcomed the Official Language Server team, with the goal of unifying the efforts behind code intelligence, tools, and editors in Elixir. Elixir v1.18 brings new features on this front by introducing locks and listeners to its compilation. Let’s understand what it means.
At the moment, all language server implementations have their own compilation environment. This means that your project and dependencies during development are compiled once, for your own use, and then again for the language server. This duplicate effort could cause the language server experience to lag, when it could be relying on the already compiled artifacts of your project.
This release addresses the issue by introducing a compiler lock, ensuring that only a single operating system process running Elixir compiles your project at a given moment, and by providing the ability for one operating system process to listen to the compilation results of others. In other words, different Elixir instances can now communicate over the same compilation build, instead of racing each other.
These enhancements do not only improve editor tooling; they also directly benefit projects like IEx and Phoenix. Here is a quick snippet showing how to enable auto-reloading inside IEx; running mix compile in one shell then automatically reloads the module inside the IEx session:
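(The original post demonstrates this as a recording; the option below is our reading of the feature, and the exact option name is an assumption to verify against the IEx docs.)
# In an IEx session: pick up modules recompiled by other OS processes,
# e.g. `mix compile` running in another shell (option name assumed).
iex> IEx.configure(auto_reload: true)
:ok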
Erlang/OTP 27 added built-in support for JSON and we are now bringing it to Elixir. A new module, called JSON, has been added with functions to encode and decode JSON. Its most basic APIs reflect the ones from the Jason project (the de-facto JSON library in the Elixir community up to this point).
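In its simplest form, encoding and decoding look as follows (a quick sketch; note that key order in encoded maps is not guaranteed):
iex> JSON.encode!(%{name: "Elixir"})
"{\"name\":\"Elixir\"}"
iex> JSON.decode!("{\"name\":\"Elixir\"}")
%{"name" => "Elixir"}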
A new protocol, called JSON.Encoder, is also provided for those who want to customize how their own data types are encoded to JSON. You can also derive protocols for structs with a single line of code:
@derive {JSON.Encoder, only: [:id, :name]}
defstruct [:id, :name, :email]
The deriving API mirrors the one from Jason, helping those who want to migrate to the new JSON module.
ExUnit now supports parameterized tests. This allows your test modules to run multiple times under different parameters.
For example, Elixir ships with a local, decentralized and scalable key-value process storage called Registry. The registry can be partitioned, and its implementation differs depending on whether partitioning is enabled or not. Therefore, during tests, we want to ensure both modes are exercised. With Elixir v1.18, we can achieve this by writing:
defmodule Registry.Test do
use ExUnit.Case,
async: true,
parameterize: [
%{partitions: 1},
%{partitions: 8}
]
# ... the actual tests ...
end
Once specified, the number of partitions is available as part of the test configuration. For example, to start one registry per test with the correct number of partitions, you can write:
setup config do
partitions = config.partitions
name = :"#{config.test}_#{partitions}"
opts = [keys: :unique, name: name, partitions: partitions]
start_supervised!({Registry, opts})
opts
end
Prior to parameterized tests, Elixir resorted to code generation, which increased compilation times. Furthermore, ExUnit parameterizes whole test modules, which also allows the different parameters to run concurrently if the async: true option is given. Overall, this feature allows you to compile and run multiple scenarios more efficiently.
Finally, ExUnit also comes with the ability to specify test groups. While ExUnit supports running tests concurrently, those tests must not have shared state between them. However, in large applications, it may be common for some tests to depend on some shared state, and other tests to depend on a completely separate state. For example, part of your tests may depend on Cassandra, while others depend on Redis. Prior to Elixir v1.18, these tests could not run concurrently, but in v1.18 they can, as long as they are assigned to different groups:
defmodule MyApp.PGTest do
use ExUnit.Case, async: true, group: :pg
# ...
end
Test modules within the same group do not run concurrently, but across groups, they might.
With features like async tests, suite partitioning, and now grouping, Elixir developers have plenty of flexibility to make the most use of their machine resources, both in development and in CI.
mix format --migrate
The mix format command now supports an explicit --migrate flag, which will convert constructs that have been deprecated in Elixir to their latest version. Because this flag rewrites the AST, it is not guaranteed the migrated format will always be valid when used in combination with macros that also perform AST rewriting.
As of this release, the following migrations are executed:
Normalize parens in bitstring modifiers - it removes unnecessary parentheses in known bitstring modifiers, for example <<foo::binary()>> becomes <<foo::binary>>, or adds parentheses for custom modifiers, where <<foo::custom_type>> becomes <<foo::custom_type()>>.
Charlists as sigils - formats charlists as ~c sigils, for example 'foo' becomes ~c"foo".
unless as negated ifs - rewrites unless expressions using if with a negated condition, for example unless foo do becomes if !foo do. We plan to deprecate unless in future releases.
More migrations will be added in future releases to help us push towards more consistent codebases.
Other notable changes include PartitionSupervisor.resize!/2, for resizing the number of partitions (aka processes) of a supervisor at runtime, Registry.lock/3 for simple in-process key locks, PowerShell versions of the elixir and elixirc scripts for better DX on Windows, and more. See the CHANGELOG for the complete release notes.
Happy coding!
The Erlang Port Mapper Daemon (EPMD) is a built-in component that helps Erlang-based applications (including RabbitMQ) discover each other’s distribution ports for clustering. Although EPMD itself isn’t directly dangerous, its exposure on the public internet often signals that Erlang Distribution ports are also exposed. This creates a serious security risk: if attackers find these distribution ports, they can potentially join your cluster, run arbitrary code, and compromise your systems. Recent scans have revealed over 85,000 instances of publicly accessible EPMD, with roughly half associated with RabbitMQ servers.
If left unsecured, exposed Erlang Distribution ports let attackers gain a foothold in your system. Fortunately, mitigation steps are straightforward: disable Erlang Distribution if you’re not clustering, or restrict it behind a firewall and proper network configuration—and ensure Erlang Distribution is never exposed to untrusted networks.
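If you do need clustering, one common hardening step, alongside firewalling epmd's TCP port 4369, is to pin the distribution port range so that a firewall can restrict it. For example (the port numbers here are arbitrary):
erl -kernel inet_dist_listen_min 9100 -kernel inet_dist_listen_max 9105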
Read the full article on the EEF blog.
Erlang/OTP 27.2 is the second maintenance patch package for OTP 27, with mostly bug fixes as well as improvements.
Among the highlighted changes: handling of the full_result request option when returning an asynchronous request.
For details about bug fixes and potential incompatibilities see the Erlang/OTP 27.2 README.
The Erlang/OTP source can also be found at GitHub on the official Erlang repository, https://github.com/erlang/otp. Download links for this and previous versions are found here.
I like to think that I write code deliberately. I’m an admittedly slow developer, and I want to believe I do so on purpose. I want to know as much as I can about the context of what it is that I'm automating. I also use a limited set of tools. I used old computers for a long time, partly out of an environmental mindset, but also because a slower computer quickly makes it obvious when something scales poorly.1
The idea is to seek friction, and harness it as an early signal that whatever I’m doing may need to be tweaked, readjusted. I find this friction, and even frustration in general, to also be useful around learning approaches.2
In opposition to the way I'd like to do things, everything about the tech industry is oriented towards elevated productivity, accelerated growth, and "easy" solutions to whole families of problems.
I feel that maybe we should teach people to program the way they teach martial arts, like only in the most desperate situations when all else failed should you resort to automating something. I don’t quite know if I’m just old and grumpy, seeing industry trends fly by me at a pace I don’t follow, or whether there’s really something to it, but I thought I’d take a walk through a set of ideas and concepts that motivate my stance.
This blog post has a lot of ground to cover. I'll first start with some fundamental properties of systems and how overload propagates through various bottlenecks. Then I'll go over some high-level pressures that are shared by most organizations and force trade-offs down their structure. These two aspects—load propagation and pervasive trade-offs—create the need for compensatory actions, of which we'll discuss some limits. This, finally, will be tied back to friction and ways to listen to it, because it's one of the things that underpins adaptation and keeps systems running.
Optimizing a frictional path without revising the system’s conditions and pressures tends to not actually improve the system. Instead, what you’re likely to do is surface brittleness in all the areas that are now exposed to the new system demands. Whether a bottleneck was invisible or well monitored, and regardless of scale, it offered an implicit form of protection that was likely taken for granted.
For a small scale example, imagine you run a small bit of software on a server, talking to a database. If you suddenly get a lot of visits, simply autoscaling the web front-end will likely leave the database unprotected and sensitive to tipping over (well, usually after having grown the connection pool, raised the connection limit, vertically scaled the servers, and so on). None of this will let you serve heavy traffic at a reasonable price until you rework your caching and data distribution strategy. Building for orders of magnitude more traffic than usual requires changing some fundamental aspects of your solution.
Similar patterns can be seen at a larger scale. An interesting case was the Clarkesworld magazine; as LLMs made it possible to produce slop at a faster rate than previously normal, an inherent bottleneck in authorship ("writing a book takes significant time and effort") was removed, leading to so much garbage that the magazine had to stop taking in submissions. They eventually ended up bearing the cost of creating a sort of imperfect queuing "spam filter" for submissions in order to accept them again. They don't necessarily publish more stories than before, they still aim to publish the good human-written stuff, there's just more costly garbage flowing through the system.3
A similar case to look for is how doctors in the US started using generative AI to fight insurance claim denials. Of course, insurers are now expected to adopt the same technology to counteract this effect. A general issue at play here is that the private insurance system's objectives and priorities are in conflict with those of the doctors and patients. Without realigning them, most of what we can expect is an increase in costs and technological means to get the same results out of it. People who don’t or can’t use the new tools are going to be left behind.
The optimization's benefit is temporary, limited, and ultimately lost in the overall system, which has grown more complex and possibly less accessible.4
I think LLMs are top of mind for people because they feel like a shift in how you automate. The common perspective is that machines are good at repetitive, predictable, mechanical tasks, and that solutions always suffered when it came to the fuzzy, unpredictable, and changing human-adjacent elements. LLMs look exactly the opposite of that: the computers can't do math very well anymore, but they seem to hold conversations and read intent much better. They therefore look like a huge opportunity to automate more of the human element and optimize it away, following well-established pressures and patterns. Alternatively, they seemingly increase the potential for new tools that could be created and support people in areas where none existed before.
The issues I'm discussing here clearly apply to AI, Machine Learning, and particularly LLMs. But they are also not specific to them. People who love the solution more than they appreciate the problem risk delivering clumsy integrations that aren’t really fit for purpose. This is why it feels like companies are wedging more AI in our face; that's what the investors wanted in order to signal innovativeness, or because the engineers really wanted to build cool shit, rather than solving the problems the users wanted or needed solved. The challenges around automation were there from its earliest days and are still in play now. They remain similar regardless of the type of automation or optimization being put in place, particularly if the system around them does not reorganize itself.
The canonical example here is what happens when an organization looms so large that people can't understand what is going on. The standard playbook around this is to start driving purely by metrics, which end up compressing away rich phenomena. Doing so faster, whether it is by gathering more data (even if we already had too much) or by summarizing harder via a LLM likely won't help run things better. Summaries, like metrics, are lossy compression. They're also not that different from management by PowerPoint slides, which we've seen cause problems in the space program, as highlighted by the Columbia report:
As information gets passed up an organization hierarchy, from people who do analysis to mid-level managers to high-level leadership, key explanations and supporting information is filtered out. In this context, it is easy to understand how a senior manager might read this PowerPoint slide and not realize that it addresses a life-threatening situation.
At many points during its investigation, the Board was surprised to receive similar presentation slides from NASA officials in place of technical reports. The Board views the endemic use of PowerPoint briefing slides instead of technical papers as an illustration of the problematic methods of technical communication at NASA.
There is no reason to think that overly aggressive summarization via PowerPoint, LLM, or metrics would not all end similarly. If your decision-making layer cannot deal with the amount of information required to centrally make informed decisions, there may be a point where the solution is to change the system's structure (and decentralize, which has its own pitfalls) rather than to optimize the existing paths without question.5
Every actor, component, or communication channel in a system has inherent limits. Any part that suddenly becomes faster or more productive without feedback shifts greater burdens onto other parts. These other parts must adapt, adjust, pass on the cost, or stop meeting expectations. Eliminating friction from one part of the system sometimes just shifts it around. System problems tend to remain system problems regardless of how much you optimize isolated portions of them.
How can we know what is worth optimizing, and what is changing at a more structural level?6 It helps to have an idea of where the pressures that create goal conflicts might come from, since they eventually lead to adaptations. Systems tend to continually be stretched to the limit of their capacity, and any improvement is instantly leveraged to accelerate the pace of existing activities.
This is usually where online people say things like "the root cause is capitalism"7—you shouldn't expect local solutions to fix systemic problems in the long term. The moment other players dynamically reduce their margins of maneuver to gain efficiency, you become relatively less competitive. You can think of how we could all formally prove software to be safe before shipping it, but instead we’ll compromise by using less formal methods like type analysis, tests, or feature flags to deliver acceptable products at much lower costs—both financial and cognitive. Be late to the market and you suffer, so there's a constant drive to ship faster and course-correct often.
People more hopeful or trusting of a system try to create and apply counteracting forces to maintain safe operating margins. This tends to be done through changing incentives, creating regulatory bodies, and implementing better control and reporting mechanisms. This is often the approach you'll see taken around the nuclear industry, the FAA and the aviation industry, and so on. However, there are also known patterns (such as regulatory capture) that tend to erode these mechanisms, and even within each of these industries, surprises and adaptations are still a regular occurrence.
Ultimately, the effects of any technological change are rather unpredictable. Designing for systems where experts operate demands constantly revisiting and iterating. The concepts we define to govern systems create their own indifference to other important perspectives, and data-driven approaches carry the risk of "bias laundering" mechanisms that repeat and amplify existing flaws in the system.
Other less predictable effects can happen. Adopting objectively more accurate algorithms can create monocultures in decision-making, which can interact such that the overall system efficiency can go down compared to more diverse environments—even in the absence of disruption.
Basically, the need for increased automation isn't likely to "normalize" a system and make it more predictable. It tends to just create new types of surprises in a way that does not remove the need for adaptation nor shift pressures; it only transforms them and makes them dynamic.
Embedded deeply in our view of systems is an assumption that things are stable until they are disrupted. It’s possibly where ideas like “root cause” gain their charisma: identify the one triggering disruptor (or its underlying mechanism) and then the system will be stable again. It’s conceptually a bit Newtonian in that if no force is applied, nothing will change.
A more ecological stance would instead assume that any perceived stability (while maintaining function) requires ongoing dynamic adjustments. The system is always decaying, transforming, interacting, changing. Stop interfering with it and it will eventually reach stability (without maintaining function) by breaking down or failing. If the pressures are constant and shifting as well as the counteracting mechanisms, we can assume that evolution and adaptation are required to deal with this dynamism. Over time, we should expect that the system instead evolves into a shape that fits its burdens while driven by scarcity and efficiency.
A risk in play here is that an ecosystem's pressures make it rational and necessary for all actors to optimize when they’re each other’s focal point—rather than some environmental condition. The more aggressively it is done, the more aggressively it is needed by others to stay in the game.
Robust yet fragile is the nature of systems that are well optimized for their main use cases and competitive within their environment, but which become easily upended by pressures applied from unexpected angles (that are therefore unprotected, since resources were used elsewhere instead).
Good examples of this are Just-In-Time supply chains being far more efficient than traditional ones, but being far easier to disrupt in times of disasters or pandemics. Most buffers in the supply chain (such as stock held in warehouses) had been replaced by more agile and effective production and delivery mechanisms. Particularly, the economic benefits (in stable times) and the need for competitiveness have made it tricky for many businesses not to rely on them.
The issue with optimizations driven from systemic pressures is that as you look at trimming the costs of keeping a subsystem going in times of stability, you may notice decent amounts of slack capacity that you could get rid of or drive harder in order to be more competitive in your ecosystem. Those are often the resources that resilience efforts draw on to keep adapting and evolving.
Another form of rationalization in systems is one where rather than cutting "excess", the adoption and expansion of (software) platforms are used to drive economies of scale. Standardization and uniformization of patterns, methods, and processes is a good way to get more bang for your buck on an investment, to do more with less. Any such platform is going to have some things it gives its users for cheap, and some things that become otherwise challenging to do.8 Friction felt here can both be caused by going against the platform's optimal use cases or by the platform not properly supporting some use cases—it's a signal worth listening to.
In fact, we can more or less assume that friction is coming from everywhere because it's connected to these pressures. They just happen to be pervasive, at every layer of abstraction. If we had infinite time, infinite resources, or infinite capacity, we'd never need to optimize a thing.
Successfully navigating these pressures is essentially drawing from concepts such as graceful extensibility and sustained adaptability. In a nutshell, we're looking to know how systems stretch themselves to deal with disruptions and surprises in a context of finite resources, and also how a system manages and regulates its own abilities to do that on an ongoing basis. Remember that every actor or component of a system has inherent limits. This is also true of our ability to know what is going on, something known as local rationality.
This means that even if we're really hoping we could intervene from the system level first and avoid the (sometimes deceptively ineffective) local optimizations, it will regardless be attempted through local efforts. Knowing and detecting the friction behind it is useful for whoever wants the broader systematic view to act earlier, but large portions of the system are going to remain dynamic and co-evolving from locally felt pains and friction. Local rationality impacts everyone, even the most confident of system thinkers.
Friction shifts are unavoidable, so it's useful to also know of the ways in which they show up. Unfortunately, these shifts generally remain unseen from afar, because compensatory mechanisms and adaptation patterns hide them.9 So instead, it's more practical to find how to spot the compensatory patterns themselves.
One of the well-known mechanisms is the Efficiency–thoroughness trade-off (ETTO) principle, which states that since time and resources are limited, one has to trade-off efficiency and thoroughness to accomplish a task. Basically, if there's more work to do than there's capacity to do it, either you maintain thoroughness and the work accumulates or gets dropped, or you do work less thoroughly, possibly cut corners, accuracy, or you have to be less careful and keep going as fast as required.
This is also one of the patterns feeding concepts such as "deviance" (often used in normalization of deviance, although the term alone points to any variation relative to norms), where procedures and rules defining safe work start being modified or bent unofficially, until covert work patterns grow a gap between the work as it is specified and how it is practiced.10
Of course, another path is one of innovation, which can mean some reorganization or restructuring. We happen to be in tech, so we tend to prefer to increase capacity by using new technology. New technology is rarely neutral and never isolated. It disturbs established patterns—often on purpose, but sometimes in unexpected ways—can require a complex support system, and for everyone to adjust around it to maintain the proper operational context. Adding to this, if automation is clumsy enough, it won’t be used to its full potential to avoid distracting or burdening practitioners using it to do their work. The ongoing adaptations and trade-offs create potential risks and needs for reciprocity to anticipate and respond to new contingencies.
You basically need people who know the system, how it works, understand what is normal or abnormal, and how to work around its flaws. They are usually those who have the capacity to detect any sort of "creaking" in local parts of the system, who harness the friction and can then do some adjusting, mustering and creating slack to provide the margin to absorb surprises. They are compensating for weaknesses as they appear by providing adaptive capacity.
Some organizations may enjoy these benefits without fixing anything else by burning out employees and churning through workers, using them as a kind of human buffer for systemic stressors. This can sustain them for a while, but may eventually reach its limits.
Even without any sort of willful abuse, pressures lead a system to try to fully use or optimize away the spare capacity within. This can eventually exhaust the compensatory mechanisms it needs to function, leading to something called "decompensation".
Compensatory mechanisms are often called on so gradually that your average observer wouldn't even know it's taking place. Systems (or organisms) that appear absolutely healthy one day collapse, and we discover they were overextended for a long while. Let's look at congestive heart failure as an example.11
Effects of heart damage accumulate gradually over the years—partly just by aging—and can be offset by compensatory mechanisms in the human body. As the heart becomes weaker and pumps less blood with each beat, adjustments manage to keep the overall flow constant over time. This can be done by increasing the heart rate using complex neural and hormonal signaling.
Other processes can be added to this: kidneys faced with lower blood pressure and flow can reduce how much urine they create to keep more fluid in the circulatory system, which increases cardiac filling pressure, which stretches the heart further before each beat, which adds to the stroke volume. Multiple pathways of this kind exist through the body, and they can maintain or optimize cardiac performance.
However, each of these compensatory mechanisms has less desirable consequences. The heart remains damaged and they offset it, but the organism remains unable to generate greater cardiac output such as would be required during exercise. You would therefore see "normal" cardiac performance at rest, with little ability to deal with increased demand. If the damage is gradual enough, the organism will adjust its behavior to maintain compensation: you will walk slower, take breaks while climbing stairs, and will just generally avoid situations that strain your body. This may be done without even awareness of the decreased capacity of the system, and we may even resist acknowledging that we ever slowed down.
Decompensation happens when all the compensatory mechanisms no longer prevent a downward spiral. If the heart can't maintain its output anymore, other organs (most often the kidneys) start failing. A failing organ can't overextend itself to help the heart; what was a stable negative feedback loop becomes a positive feedback loop, which quickly leads to collapse and death.
Someone with a compensated congestive heart failure appears well and stable. They have gradually adjusted their habits to cope with their limited capacity as their heart weakened through life. However, looking well and healthy can hide how precarious of a position the organism is in. Someone in their late sixties skipping their heart medication for a few days or adopting a saltier diet could be enough to tip the scales into decompensation.
Decompensation usually doesn’t happen because compensation mechanisms fail, but because their range is exhausted. A system that is compensating looks fine until it doesn’t. That's when failures may cascade and major breakdowns occur. This applies to all sorts of systems, biological as well as sociotechnical.
A common example seen in the tech industry is one where overburdened teams continuously pull small miracles and fight fires, keeping things working through major efforts. The teams are stretched thin, nobody's been on vacation for a while, and hiring is difficult because nobody wants to jump into that sort of place. All you need is one extra incident, one person falling ill or quitting, needing to add one extra feature (which nobody has bandwidth to work on), and the whole thing falls apart.
But even within purely technical subsystems, automation reaching its limits often shows up a bit like decompensation when it hands control back to a human operator who doesn't have the capacity to deal with what is going on (one of the many things pointed out by the classic text on the Ironies of Automation). Think of an autopilot that disengages once it has reached the limit of what it can do to stabilize a plane in hazardous conditions. Or of a cluster autoscaler that can no longer schedule more containers or hosts and starts crowding them until performance collapses, queues fill up, and the whole application becomes unresponsive.
Eventually, things spin out into a much bigger emergency than you'd have expected as everything appeared fine. There might have been subtle clues—too subtle to be picked up without knowing where to look—which shouldn't distract from their importance. Friction usually involves some of these indicators.
Going back to friction being useful feedback, the question I want to ask is: how can we keep listening? The most effective actions are systemic, but the friction patterns are often local. If we detect the friction, papering over it via optimization or brute-force necessarily keeps it local, and potentially ineffective. We need to do the more complex work of turning friction into a system-level feedback signal for it to have better chances of success and sustainability. We can't cover all the clues, but surfacing key ones can be critical for the system to anticipate surprises and foster broader adaptive responses.
When we see inappropriate outcomes of a system, we should be led to wonder what about its structure makes it a normal output. What are the externalities others suffer as a consequence of the system's strengths and weaknesses? This is a big question that feels out of reach for most, and not necessarily practical for everyday life. But it’s an important one as we repeatedly make daily decisions around trading off “working a bit faster” against the impacts of the tools we adopt, whether they are environmental, philosophical, or sociopolitical.
Closer to our daily work as developers, when we see code that’s a bit messy and hard to understand, we either slow down to create and repair that understanding, or patch it up with local information and move on. When we do this with a tool that manages the information for us, are we in a situation where we accelerate ourselves by providing better framing and structure, or one where we just get where we want without acknowledging the friction?12
If it's the latter, what are the effects of ignoring the friction? Are we creating technical debt that can’t be managed without the tools? Are we risking increasingly not reorganizing the system when it creaks, and only waiting to see obvious breaks to know it needs attention? In fact, how would you even become good at knowing what creaking sounds like if you just always slam through the hurdles?
Recognizing these patterns is a skill, and it tends to require knowing what “normal” feels like such that you can detect what is not there when you start deviating.13
If you use a bot for code reviews, ask yourself whether it is replacing people reviewing and eroding the process. Is it providing a backstop? Are there things it can't know about that you think are important? Is it palliating already missing support? Are the additional code changes dictated by review comments worth more than the acts of reviewing and discussing the code? Do you get a different result if the bot only reviews code that someone else already reviewed to add more coverage, rather than implicitly making it easier to ignore reviews and go fast?
Work that takes time is a form of friction, and it's therefore tempting to seek ways to make it go faster. Before optimizing it away, ask yourself whether it might have outputs other than its main outputs. Maybe you’re fixing a broken process for an overextended team. Maybe you’re eroding annoying but surprisingly important opportunities for teams to learn, synchronize, share, or reflect on their practices without making room for a replacement.
When you're reworking a portion of a system to make it more automatable, ask whether any of the facilitating and structuring steps you're putting in place could also benefit people directly. I recall hearing a customer who said “We are now documenting things in human-readable text so AI can make use of it”—an investment that clearly could have been worth it for people too. Use the change of perspective as an opportunity to surface elements hidden in the broader context and ecosystem, and on which people rely implicitly.
I've been disappointed by proposals of turning LLMs into incident reviewers; I'd rather see them become analysis second-guessers: maybe they can point out agentive language leading to bias, flag elements that sound counterfactual, or highlight elements that appear blameful to create blame awareness?
If you make the decision to automate, still ask the questions and seek the friction. Systems adjust themselves and activate their adaptive capacity based on the type of challenges they face. Highlight friction. It’s useful, and it would be a waste to ignore it.
Thanks to Jordan Goodnough, Alan Kraft, and Laura Nolan for reviewing this text.
1: I’m forced to refresh my work equipment more often now because new software appears to hunger for newer hardware at an accelerating pace.
2: As a side note, I'd like to call out the difference between friction, where you feel resistance and that your progression is not as expected based on experience, and one of pain, where you're just making no progress at all and having a plain old bad time. I'd put "pain" in a category where you might feel more helpless, or do useless work just because that's how people first gained the experience without any good reason for it to still be learned the same today. Under this casual definition, friction is the unfamiliar feeling when getting used to your tools and seeking better ways of wielding them, and pain is injuring yourself because the tools have poor ergonomic properties.
3: the same problem can be felt in online book retail, where spammers started hijacking the names of established authors with fake books. The cost of managing this is left to authors—and even myself, having published mostly about Erlang stuff, have had at least two fake books published under my name in the last couple years.
4: In Energy and Equity, Ivan Illich proposes that societies built on high-speed motorized transportation create a "radical monopoly," basically stating that as the society grows around cars and scales its distances proportionally to time spent traveling, living without affording a car and its upkeep becomes harder and harder. This raises the bar of participation in such environments, and it's easy to imagine a parallel within other sociotechnical systems.
5: AI is charismatic technology. It is tempting to think of it as the one optimization that can make decisions such that the overall system remains unchanged while its outputs improve. Its role as fantasized by science fiction is one of an industrial supply chain built to produce constantly good decisions. This does not reduce its potential for surprise or risk. Machine-as-human-replacement is most often misguided. I don't believe we're anywhere near that point, and I don't think it's quite necessary to make an argument about it.
6: Because structural changes often require a lot more time and effort than local optimizations, you sometimes need to carry both types of interventions at the same time: a piecemeal local optimization to "extend the runway", and broader interventions to change the conditions of the system. A common problem for sustainability is to assume that extending the runway forever is both possible and sufficient, and never follow up with broader acts.
7: While capitalism has a keen ability to drive constraints of this kind, scarcity constraints are fairly universal. For example, Sonja D. Schmid, in Producing Power illustrates that some of the contributing factors that encouraged the widespread use of the RBMK reactor design in the USSR—the same design used in Chernobyl—were that its manufacturing was more easily distributed over broad geographic areas and sourced from local materials which could avoid the planned system's inefficiencies, and therefore meet electrification objectives in ways that couldn't be done with competing (and safer) reactor designs. Additionally, competing designs often needed centralized manufacturing of parts that could then not be shipped through communist USSR without having to increase the dimensions of some existing train tunnels, forcing upgrades to its rail network to open power plants.
An entirely unrelated example is that a beehive's honeycomb structure optimizes for using the least material to create a lattice of cells within a given volume.
8: AWS or Kubernetes or your favorite framework all come with some real cool capabilities and also some real trade-offs. What they're built to do makes some things much easier, and some things much harder. Do note that when you’re building something for the first time on a schedule, prioritizing to deliver a minimal first set of features also acts as an inherent optimization phase: what you choose to build and leave for later fits that same trade-off pattern.
9: This is similar to something called the Law of Fluency, which states that well-adapted cognitive work occurs with a facility that belies the difficulty of resolving demands and balancing dilemmas. While the law of fluency works at the individual cognitive level, I tend to assume it also shows up at larger organizational or system levels as well.
10: Rule- and Role-retreat may also be seen when people get overloaded, but won't deviate or adjust their plans to new circumstances. This "failure to adapt" can also contribute to incidents, and is one of the reasons why some forms of deviations have to be considered positive for the system.
11: Most of the information in this section came from Dr. Richard I. Cook, explaining the concept in a group discussion, a few years before his passing.
12: This isn't purely a tooling decision; you also make this type of call every time you choose to refactor code to create an abstraction instead of copy/pasting bits of it around.
13: I believe, but can't prove, that there's also a tenuous but real path between the small-scale frictions, annoyances, and injustices we let slip and the way they can propagate and grow to greater systemic scales. Tremendously important work is always being done at the local level, where people bridge the gap between what the system orders and what the world needs. If there are paths leading that feedback up from the local level, they are critical to keeping things aligned. I'm unsure what the links between them are, but I like to think that small adjustments made by people with agency form part of a negative feedback loop that partially keeps things in check.
This post originally appeared on the LFI blog, but I decided to post it on my own site as well.
Every organization has to contend with limits: scarcity of resources, people, attention, or funding; friction from scaling; inertia from previous codebases; or a quickly shifting ecosystem. And of course there are more, like time, quality, effort, or how much can fit in anyone's mind. There are so many ways for things to go wrong; your ongoing success comes in no small part from the people within your system constantly navigating that space, making sacrifice decisions and trading off some things to buy runway elsewhere. From time to time, these trade-offs come to a head in what we call a goal conflict, where two important attributes clash with each other.
These conflicts are not avoidable; in fact, in many cases we just take them as a given, as in "cheap, fast, and good; pick two." But somehow, when it comes to the more specific details of our work, that clarity hides itself or gets obscured by the veil of normative judgments. It is easy after an incident to think of what people could have done differently, of signals they should have listened to, or of consequences they would have foreseen had they just been a little bit more careful.
From this point of view, the idea of reinforcing desired behaviors through incentives, both positive (bonuses, public praise, promotions) and negative (demerits, recertification, disciplinary reviews), can feel attractive. (Do note here that I am specifically talking about incentives around specific decision-making or performance, rather than broader ones such as wages, perks, overtime or hazard pay, or employment benefits, even though the effects may sometimes overlap.)
But this perspective itself is a trap. Hindsight bias—where we overestimate how predictable outcomes were after the fact—and its close relative outcome bias—where knowing the results after the fact tints how we judge the decision made—both serve as good reminders that we should ideally look at decisions as they were being made, with the information known and pressures present then.
This is generally made easier by assuming people were trying to do a good job and get good results; a decision that seems to make no sense asks of us that we figure out how it seemed reasonable at the time.
Events were likely challenging, resources were limited (including cognitive bandwidth), and context was probably uncertain. If you were looking for goal conflicts and difficult trade-offs, this is certainly a promising area in which they can be found.
Taking people's desire for good outcomes for granted forces you to shift your perspective. It demands that you move away from thinking that somehow more pressure toward succeeding would help. It makes you ask what aid could be given to navigate the situation better, and how the context could be changed so that the trade-offs get negotiated differently next time around. It lets you move away from wondering how to prevent mistakes and toward asking how to better support the participants.
Hell, the idea of rewarding desired behavior feels enticing even in cases where your review process avoids the traps mentioned here and takes a more just approach.
But the core idea here is that you can't really expect different outcomes if the pressures and goals that gave rise to them don't change either.
During incidents, the priorities already in play are things like "I've got to fix this to keep the business alive," stabilizing the system to prevent large cascades, or trying to prevent harm to users or customers. They come with stress, adrenaline, and sometimes a sense of panic or shock. These are likely to rank higher in people's minds than "what's my bonus gonna be?" or "am I losing a gift card or some plaque if I fail?"
Adding incentives, whether positive or negative, does not clarify the situation. It does not address the goal conflicts. It adds more variables to the equation and complicates the situation, likely making it more challenging.
Chances are that people will keep making the same decisions they have been making all along, the ones that were obtaining the desired outcomes. What they'll change instead is what they report later, in subtle ways: tweaking or hiding information to protect themselves, or gradually losing trust in the process you've put in place. These effects can be amplified when teams are given hard-to-meet abstract targets, such as lowering incident counts, which can actively interfere with incident response by creating new decision points in people's mental flows. If responders have to discuss and classify the nature of an incident to fit an accounting system unrelated to solving it right now, their response is likely to be slower and more challenging.
This is not to say that all attempts at structure and classification hinder proper response, though. Clarifying which critical elements to salvage first, creating cues and language for patterns that will be encountered, and agreeing on strategies that support effective coordination across participants can all be really productive. But this needs to be done with a deeper understanding of how your incident response actually works, and that sometimes means hearing unpleasant feedback about how people perceive your priorities.
I've been in reviews where people stated things like "we know that we get yelled at more for delivering features late than for shipping broken code, so we just shipped broken code since we were out of time," or admitted to ignoring execs who made a habit of coming down from above to scold employees into fixing things they were being pressured into doing anyway. These can be hurtful for an organization to consider, but they are nevertheless a real part of how people deal with exceptional situations.
By trying to properly understand the challenges, by clarifying the goal conflicts that arise in systems and result in sometimes frustrating trade-offs, and by making learning from these experiences an objective of its own, we can hopefully make things a bit better. Grounding our interventions in a richer, more naturalistic understanding of incident response and all its challenges is a small, albeit critical, part of it all.