An Unprecedented Subtraction

Let’s go back to the origins of this blog with a bit of unexpected code behavior. This time, let’s try removing elements from a list…


Let’s Subtract!

The beauty of this trick is that the code examples work in both Erlang and Elixir, with barely any changes at all… I'll do it in Elixir; feel free to try them out in Erlang or efene if you like.

Now, this is a list…

iex(1)> [1, 2, 3]
[1, 2, 3]

And now, we can subtract a few elements from it…

iex(2)> [1, 2, 3] -- [1, 2]
[3]

Let’s now subtract that last one, too…

iex(3)> [1, 2, 3] -- [1, 2] -- [3]
[3]

What’s going on here?

As usual, it’s better if you first try to decipher the mystery on your own. So, go ahead, check the docs…

Have you found it? Do you know what’s happening?

To be fair, if you’re an Elixir dev you’re in a slightly better situation than your Erlang counterparts: The documentation is, I believe, much clearer.

But, let’s go step by step. First, let’s see what happens if we add parentheses to our expression…

iex(6)> ([1, 2, 3] -- [1, 2]) -- [3]
[]
iex(7)> [1, 2, 3] -- ([1, 2] -- [3])
[3]

That actually makes sense! It looks like -- associates from right to left. Let’s see if we can find documentation that proves it.

In Elixir

What we're looking for is operator associativity, and Elixir actually has a whole page in the docs dedicated to it. There you can see that all the binary (as in "with two arguments") list operations (++, --, .. and <>) associate from right to left.

That was easy.
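
Just to double-check the right-to-left rule with another expression, notice how subtracting the same element twice in a row turns into a no-op:

iex> [1, 2] -- [2] -- [2]
[1, 2]

The right-hand [2] -- [2] is evaluated first and yields [], so nothing ends up being removed from [1, 2].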

As an aside, let's verify if .. is actually right associative…

iex(7)> 1 .. 10
1..10
iex(8)> 1 .. 10 .. 20
** (ArgumentError) ranges (first..last) expect both sides to be integers, got: 1..10..20
(elixir) lib/range.ex:55: Range.new/2

In Erlang

Now let’s try to find that same thing in Erlang… 🕵️‍♀️

Well… There is no page for operator associativity, but if you google carefully you'll be redirected to the docs for Erlang expressions. There, deep down, in the last part of that page, you'll find a table dedicated to operator precedence.

Table 8.6: Operator Precedence

If you squint a bit, you’ll find ++ and -- there and then you’ll see that they’re both Right associative. Just like =! …

Wait… what?

What’s =!? I’ve never seen such an operator in Erlang…

That’s because there is no =! in Erlang. What that row is actually stating is that both = and ! are right associative.

Which is actually cool, since it lets you do stuff like…

1> self() ! self() ! self() ! something.
something
2> flush().
Shell got something
Shell got something
Shell got something
ok
3> Pid1 ! Pid2 ! Pid3 ! something. % "broadcast" a message :P

Is that it?

Yeah, I know… this article is not as insightful, deep, or revealing as others, but this topic was sitting in my To-Write list for almost 2 years. Now it's out of the way.

In Other News…

SpawnFest

As I mentioned in my previous article, SpawnFest is coming!
This year it will happen on September 21st & 22nd.
Register your team and join us for FREE to win several amazing prizes provided by our great sponsors!

ElixirConfLA

ElixirConf is coming to Latin America for the first time!
Thanks to our friends from PlayUS Media, we'll meet in Medellín, Colombia for ElixirConfLA on October 24th & 25th.
The schedule was already announced, Early Bird Tickets are on sale and the CFT is open until July 19th.

Erlang Battleground

Finally, the usual reminder: We're looking for writers. If you want to write on Erlang Battleground, just let me (Brujo Benavides) know :)



MongooseIM 3.4: Designed with privacy in mind

Let's face it. We are living in an age where all technology players gather and process huge piles of user data, starting with our behavioural patterns and finishing with our location data. Hence, we receive personalized emails from online retail stores we have visited for just a second, or personalized ads for stores in our vicinity displayed in our social media streams.

Consequently, more and more people are becoming aware of how their privacy could be at risk with all of that data collection. In turn, the European Union strived to fight for consumer rights by implementing privacy guidelines in the form of the General Data Protection Regulation (GDPR), which governs how consumer data can be handled by third parties. In fact, over 200,000 privacy violation cases have been filed during the course of the last year, followed by over €56m in fines for data breaches. Therefore, the stakes are high for all messaging service providers out there.

You might wonder: "Why should this matter to me? After all, my company is not in Europe." Well, if any of the users of your messaging service are located in the EU, you are affected by GDPR as if you hosted your service right there. Feeling uneasy? Don't worry, the MongooseIM team has got you covered. Please welcome MongooseIM release 3.4, which brings us full GDPR compliance.

Privacy by Design

A new concept has been defined with the dawn of GDPR - privacy by design. It is assumed that the software solution being used follows the principles of minimising and limiting, hiding and protecting, separating and aggregating, as well as providing privacy by default.

Minimise and limit

The minimise and limit principle regards the amount of personal data gathered by a service. The general principle here is to take only the bare minimum required for a service to run instead of saving unnecessary data just in case. If more data than that is collected, the unnecessary part should be deleted. Luckily, MongooseIM uses only the bare minimum of personal data provided by the users and relies on the users themselves to provide more if they wish to - e.g. by filling out the roster information. Moreover, since it implements XMPP and is open source, everybody has insight into how the data is processed.

Hide and protect

The hide and protect principle refers to the fact that user data should not be made public and should be hidden from plain view, preventing third parties from identifying users through personal data or its interrelation. We have tackled that by handling the creation of JIDs and having recommendations regarding log collection and archiving.

What is this all about? See, JIDs are the central and focal point of MongooseIM operation, as they are the unique user identifiers in the system. As long as the JID does not contain any personally identifiable information like a name or a telephone number, the JID is more than just pseudo-anonymous and cannot be linked to the individual it represents. This is why one should refrain from putting personally identifiable information in JIDs. For that reason, our 3.4 release includes a mechanism that allows automatic user creation with random JIDs, which you can invoke by typing 'register' in the console. Specific JIDs are created by intentionally invoking a different command (register_identified).

Still, it is possible that MongooseIM logs contain personally identifiable information such as IP addresses that could correlate to JIDs. Even though the JID is anonymous, an IP address next to a JID might lead to the person behind it through correlation. That is why we recommend that installations with privacy in mind have their log level set to at least 'warning' in order to avoid breaches of privacy while still maintaining log usability.

Separate and aggregate

The separate principle boils down to partitioning user data into chunks rather than keeping them in a monolithic DB. Each chunk should contain only the private data necessary for its own functioning. Such a separation creates issues when trying to identify a person through correlation as the data is scattered and isolated - hence the popularity of microservices. Since MongooseIM is an XMPP server written in Erlang, it is naturally partitioned into modules that have their own storage backends. In this way, private data is separated by default in MongooseIM and can also be handled individually - e.g. by deleting all the private data relating to one function.

The aggregation principle refers to the fact that all data should be processed in an aggregated manner and not in one focused on detailed personal cases. For instance, behavioural patterns should be representative of a concrete but not identifiable cohort rather than of a certain Rick Sanchez or Morty Smith. All the usage data being processed by MongooseIM is devoid of any personally identifiable traits and instead tracks metrics relevant to the health of the server. The same can be said for WombatOAM if you pair it with MongooseIM. Therefore, aggregation is supported by default.

Privacy by default

It is assumed that the user should be offered the highest degree of privacy by default. This is highly dependent on your own implementation of the service running on top of MongooseIM. However, if you follow our recommendations laid out in this post, you can be sure you implement it well on the backend side, as we do not differentiate between the levels of privacy being offered.

The Right of Access

According to GDPR, each user has the right of access to their own data that's being kept by a service provider. That data includes not only personal data provided by the user but also all the derivative data generated by MongooseIM on its basis. That includes data held in mod_vcard, mod_roster, mod_mam, mod_offline, mod_pubsub, mod_private, mod_inbox, and logs. If we add a range of PubSub backends and MAM backends to the fray, one can see it gets complicated.

With MongooseIM 3.4 we have put a lot of effort into making the retrieval as painless as possible for the system administrators who oversee day-to-day operations. That is why we have developed a mechanism, started by executing the retrieve_personal_data command, that collects all the personal and derivative data belonging to the user behind a specific JID. The command will execute for all the modules, whether they are enabled or disabled. Then, all the relevant data is extracted per module and returned to the user in the form of an archive.

In order to facilitate the data collection, we have changed the schemas for all of our MAM backends. This has been done to allow swift data extraction, since until now it was very inefficient and resource-hungry to run such a query. Of course, we have prepared migration strategies for the affected backends.

The Right to be Forgotten

The right to be forgotten goes alongside the right of access. Each user has the right to remove their footprint from the service. Since retrieval from all the modules listed above is already problematic, removal is even harder.

We have implemented a mechanism that removes the user account, leaving behind only the JID. You can run it by executing the "unregister" command. All of the private data not shared with other users is deleted from the system. In contrast, all of the private data that is shared with other users - e.g. group chat messages or PubSub flat nodes - is left intact, as the content is not owned by only one party.

Logs are not a part of this action. If the log level is set to at least 'warning', there is no personal data that can be tied to the JIDs in the first place, so there is no need for removal.

Final Words on GDPR

The elements above make MongooseIM fully compliant with the current GDPR. However, you have to remember that this is only a piece of the puzzle. Since MongooseIM is a backend to a service, there are other considerations that have to be fulfilled in order for the entire service to be GDPR compliant. Some of these are process-oriented requirements of informing, enforcing, controlling, and demonstrating that have to be taken into account during service design.

Changelog

Please feel free to read the detailed changelog. Here, you can find a full list of source code changes and useful links.

Test our work on MongooseIM 3.4 and share your feedback

Help us improve MongooseIM:

  1. Star our repo: esl/MongooseIM
  2. Report issues: esl/MongooseIM/issues
  3. Share your thoughts via Twitter
  4. Download Docker image with new release.
  5. Sign up to our dedicated mailing list to stay up to date about MongooseIM, messaging innovations and industry news.
  6. Check out our MongooseIM product page for more information on the MongooseIM platform.


Blockchain 2018: Myths vs Reality

After a successful year of involvement in a variety of blockchain projects, I felt it healthy to take a step back and observe the emerging trends in this nascent (but still multi-billion-dollar) industry and question whether some of the directions taken so far are sensible or whether any corrections are necessary. I will provide a 'snapshot' of the state of advancement of the blockchain technology, describing the strengths and weaknesses of the solutions that have emerged so far, without proposing innovations at this stage (those ideas will form the subject of future blog posts). If you are interested in hearing what I and Erlang Solutions, where I work as Scalability Architect and Technical Lead, have to say about blockchain, let me take you through it:

The context

I do not propose in this blogpost to get into the details of the blockchain data structure itself, nor will I discuss which Merkle tree solution is best to adopt. I will also avoid hot topics such as 'Transactions Per Second' (TPS) and the mechanisms for achieving substantial transaction volumes, which is de facto the ultimate benchmark widely adopted as a measurement of how competitive a solution is against the major players in the market: Bitcoin and Ethereum.

What I would like to examine instead is the state of maturity of the technology, and its alignment with the core principles that underpin the distributed ledger ecosystem. I hope that presenting a clear picture of these principles and how they are evolving may be helpful.

The principles

As the primary drive for innovation emerges from public, open source blockchains, these will be the ones on which we will focus our attention.

The blockchain technology mainly aims at embracing the following high level principles:

  1. Immutability of the history
  2. Decentralisation of the control
  3. ‘Workable’ consensus mechanism
  4. Distribution and resilience
  5. Transactional automation (including ‘smart contracts’)
  6. Transparency and Trust
  7. Link to the external world

Let us look at them one by one:

1. Immutability of the history

In an ideal world it would be desirable to preserve an accurate historical trace of events, and make sure this trace does not deteriorate over time, whether through natural events, or by human error or by the intervention of fraudulent actors. The artefacts produced in the analogue world face alterations over time, although there is often the intent to make sure they can withstand forces that threaten to alter and eventually destroy them. In the digital world the quantized / binary nature of the stored information provides the possibility of continuous corrections to prevent any deterioration that might occur over time.

Writing an immutable blockchain aims to retain a digital history that cannot be altered over time and on top of which one can verify that a trace of an event, known as a transaction, is recorded in it. This is particularly attractive when it comes to assessing the ownership or the authenticity of an asset or to validate one or more transactions.

We should note that, on top of the inherent immutability of a well-designed and implemented blockchain, hashing algorithms also provide a means to encode the information that gets written in the history so that verification of a trace/transaction can only be performed by actors possessing sufficient data to compute the one-way[1] cascaded encoding/encryption. This is typically implemented on top of Merkle trees where hashes of concatenated hashes are computed.
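
To make the "hashes of concatenated hashes" idea a bit more concrete, here is a minimal Elixir sketch of a Merkle-root computation over a list of transaction payloads. It is purely illustrative; real chains differ in leaf encoding, padding rules and hash choice:

defmodule MerkleSketch do
  # SHA-256 via Erlang's :crypto application.
  defp hash(data), do: :crypto.hash(:sha256, data)

  # Hash every leaf, then repeatedly hash concatenated pairs until one digest remains.
  def root(leaves) when is_list(leaves) and leaves != [] do
    leaves |> Enum.map(&hash/1) |> reduce()
  end

  defp reduce([single]), do: single

  defp reduce(hashes) do
    hashes
    # Duplicate the last hash when a level has an odd number of nodes.
    |> Enum.chunk_every(2, 2, [List.last(hashes)])
    |> Enum.map(fn [a, b] -> hash(a <> b) end)
    |> reduce()
  end
end

Changing any leaf (or its position) changes the resulting root, which is what makes the structure useful for verifying that a transaction belongs to a given history.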

Legitimate questions can be raised about the guarantees for indefinitely storing an immutable data structure:

  1. If this is an indefinitely growing history, where can it be stored once it grows beyond capacity of the ledgers?
  2. As the history size grows (and/or the computing power needed to validate further transactions increases) this reduces the number of potential participants in the ecosystem, leading to a de facto loss of decentralisation. At what point does this concentration of ‘power’ create concerns?
  3. How does the verification performance deteriorate as the history grows?
  4. How does it deteriorate when a lot of data gets written on it concurrently by the users?
  5. How long is the segment of data that you replicate on each ledger node?
  6. How much network traffic would such replication generate?
  7. How much history is needed to be able to compute a new transaction?
  8. What compromises need to be made on linearisation of the history, replication of the information, capacity to recover from anomalies and TPS throughput?

Further to the above questions we would also like to understand how many replicas converging to a specific history (i.e. consensus) would be needed for it to carry on existing, and in particular:

  1. Can a fragmented network carry on writing to its known history[2]?
  2. Is an approach designed to 'heal' any discrepancies in the immutable history of transactions by rewarding the longest fork actually fair and efficient?
  3. Are the deterrents strong enough to prevent a group of ledgers forming their own fork[3] that eventually reaches a wider adoption?

Furthermore, a new requirement to comply with the General Data Protection Regulation (GDPR) in Europe and 'the right to be forgotten' introduced new challenges to the perspective of keeping permanent and immutable traces indefinitely. This is important because fines for breach of GDPR are potentially very significant. The solutions introduced so far effectively aim at anonymising the information that enters the immutable on-chain storage process, while sensitive information is stored separately in support databases where this information can be deleted if required. None of these approaches has yet been tested by the courts, nor has a definition emerged of what the GDPR 'right to be forgotten' means in practice.

The challenging aspect here is to decide upfront what is considered sensitive and what can safely be placed on the immutable history. A wrong upfront choice can backfire at a later stage in the event that any involved actor manages to extract or trace sensitive information through the immutable history.

Immutability represents one of the fundamental principles that motivates the research into blockchain technology, both private and public. The solutions explored so far have managed to provide a satisfactory response to the market needs via the introduction of history linearisation techniques, one-way hashing encryptions, Merkle trees and off-chain storage, although the linearity of the immutable history[4] comes at a cost (notably transaction volume).

2. Decentralisation of the control

During the aftermath of the 2008 global financial crisis (one interpretation of which is that it highlighted the global financial disasters that could occur from over-centralisation and the misalignment of economic incentives) there arose a deep mistrust of 'traditional', centralised institutions, political and commercial. One reaction against such centralisation was the exploration of various decentralised mechanisms which could replace those traditional, centralised structures. The proposition that individuals operating in a social context ideally would like to enjoy the freedom to be independent from a central authority gained in popularity. Self-determination, democratic fairness and heterogeneity as a form of wealth are among the dominant values broadly recognised in Western (and, increasingly, non-Western) society. These values added weight to the idea that introducing decentralisation into a system is positive. I have rarely seen this very idea being challenged at all (in the same way one rarely hears criticism of the proposition that 'information wants to be free'), although in some circumstances the unwanted consequences of using this approach can clearly be seen.

For instance, one might argue that it is only due to our habits that we normally resolve anomalies in a system by contacting a central authority, which, according to our implicit or explicit contractual terms, bears the responsibility for what happens to the system. Therefore, in the event of damage incurred through a miscarried transaction that might be caused by a system failure or a fraudulent actor, we are typically inclined to contact the central bearer of the responsibility to intervene and try to resolve the damage sustained. Human history is characterised by the evolution of hierarchical power structures, even in the most 'democratic' of societies, and these hierarchies naturally create centralisation, independent of the dominant political structure in any particular society. This characteristic continues into the early 21st century.

With decentralisation, however, there is no such central authority that could resolve those issues for us. Traditional, centralised systems have well developed anti-fraud and asset recovery mechanisms which people have become used to. Using new, decentralised technology places a far greater responsibility on the user if they are to receive all of the benefits of the technology, forcing them to take additional precautions when it comes to handling and storing their digital assets. In particular they need to keep the access to their digital wallets protected and make sure they don’t lose it. Similarly when performing transactions, such as giving away a digital asset to a friend or a relative, they have to make sure it is sent to the right address/wallet, otherwise it will be effectively lost or mistakenly handed over to someone else.

Also, there’s no point having an ultra-secure blockchain if one then hands over one’s wallet private key to an intermediary (more ‘centralisation’ again) whose security is lax: it’s like having the most secure safe in the world then writing the combination on a whiteboard in the same room.

Is the increased level of personal responsibility that goes with the proper implementation of a secure blockchain a price that users are willing to pay; or will they trade off some security in exchange for ease of use (and, by definition, more centralisation)? It’s too early to see how this might pan out. If people are willing to make compromises here, what other compromises re, say, security or centralisation would they be prepared to accept in exchange for lower cost/ease of use? We don’t know, as there’s no secure blockchain ecosystem yet operating at scale.

Another threat to the broad endorsement and success of the decentralisation principle is posed by governmental regulatory/legal pressures on digital assets and ecosystems. This is to ensure that individuals do not use the blockchain for tax evasion and that the ownership of their digital assets is somehow protected. However any attempt to regulate this market from a central point is undermining the effort to promote the adoption of a decentralised form of authority.

3. Consensus

The consistent push towards decentralised forms of control and responsibility has brought to light the fundamental requirement to validate transactions without the need for (or intervention of) a central authority; this is known as the ‘consensus’ problem and a number of approaches have grown out of the blockchain industry, some competing and some complementary.

There has also been a significant focus around the concept of governance within a blockchain ecosystem. This concerns the need to regulate the rates at which new blocks are added to the chain and the associated rewards for miners (in the case of blockchains using proof of work (POW) consensus methodologies). More generally, it is important to create incentives and deterrent mechanisms whereby interested actors contribute positively to the healthy continuation of the chain growth.

Besides serving as an economic deterrent against denial-of-service and spam attacks, POW approaches are amongst the first attempts to automatically work out, via the use of computational power, which ledgers/actors have the authority to create/mine new blocks[5]. Other similar approaches (proof of space, proof of bandwidth, etc.) followed; however, they all suffered from exposure to deviations from the intended fair distribution of control. Wealthy participants can in fact exploit these approaches to gain an advantage via purchasing high-performance (CPU / memory / network bandwidth) dedicated hardware in large quantities and operating it in jurisdictions where electricity is relatively cheap. This results in overtaking the competition to obtain the reward, and the authority to mine new blocks, which has the inherent effect of centralising the control. Also, the huge energy consumption that comes with the inefficient nature of the competitive race to mine new blocks in POW consensus mechanisms has raised concerns about its environmental impact and economic sustainability. The most recent report on the energy usage of Bitcoin can be seen here on digiconomist.

Proof of Stake (POS) and Proof of Importance (POI) are among the ideas introduced to drive consensus via the use of more social parameters, rather than computing resources. These two approaches link the authority to the accumulated digital asset/currency wealth or the measured productivity of the involved participants. Implementing POS and POI mechanisms, whilst guarding against the concentration of power/wealth, poses not insubstantial challenges for their architects and developers.

More recently, semi-automatic approaches, driven by a human-curated group of ledgers, are putting in place solutions to overcome the limitations and arguable fairness of the above strategies. The Delegated Proof of Stake (DPOS) and Proof of Authority (POA) methods promise higher throughput and lower energy consumptions, while the human element can ensure a more adaptive and flexible response to potential deviations caused by malicious actors attempting to exploit a vulnerability of the system.

Whether these solutions actually manage to fulfill the inherent requirements, set by the principle of control distribution, is debatable. Similarly, when it comes to the idea of bringing in another layer of human driven consensus for the curation, it is clear that we are abandoning a degree of automation. Valuing trust and reputation in the way authority gets exercised on the governance of a particular blockchain appears to bring back a form of centralised control, which clearly goes against the original intention and the blockchain ethos.

This appears to be one of the areas where, despite the initial enthusiasm, there is not yet a clear solution that drives consensus in a fair, sustainable and automated manner. Moving forward I expect this to be the main research focus for the players in the industry, as current solutions leave many observers unimpressed. There will likely emerge a number of differing approaches, each suitable for particular classes of use case.

4. Distribution and resilience

Apart from decentralising the authority, control and governance, blockchain solutions typically embrace a distributed Peer to Peer (P2P) design paradigm. This preference is motivated by the inherent resilience and flexibility that these types of networks have introduced and demonstrated, particularly in the context of file and data sharing (see a brief p2p history here). The diagram below is frequently used to explain the difference among three network topologies: centralised, decentralised and distributed.

A centralised network, typical of mainframes and centralised services[6], is clearly exposed to a 'single point of failure' vulnerability as the operations are always routed towards a central node. In the event that the central node breaks down or is congested, all the other nodes will be affected by disruptions.

Decentralised and distributed networks attempt to reduce the detrimental effects that issues occurring on a node might trigger on other nodes. In a decentralised network, the failure of a node can still affect several neighbouring nodes that rely on it to carry out their operations. In a distributed network the idea is that failure of a single node should not impact significantly any other node. In fact, even when one preferential/optimal route in the network becomes congested or breaks down entirely, a message can reach the destination via an alternative route. This greatly increases the chances to keep a service available in the event of failure or malicious attacks such as a denial of service (DOS) attack.

Blockchain networks where a distributed topology is combined with a high redundancy of ledgers backing a history have occasionally been declared “unhackable” by enthusiasts or, as some more prudent debaters say, “difficult to hack”. There is truth in this, especially when it comes to very large networks such as Bitcoin (see an additional explanation here). In such a highly distributed network, the resources needed to generate a significant disruption are very high, which not only delivers on the resilience requirement, but also works as a deterrent against malicious attacks (principally because the cost of conducting a successful malicious attack becomes prohibitive).

Although a distributed topology can provide an effective response to failures or traffic spikes, we need to be aware that delivering resilience against prolonged over-capacity demands or malicious attacks requires adequate adaptation mechanisms. While the Bitcoin network is well positioned, as it currently benefits from a high capacity condition (due to the historically high incentive for third-party miners to purchase hardware[7]), this is not the case for other emerging networks as they grow in popularity. This is where novel instruments, capable of delivering preemptive adaptation combined with back-pressure throttling applied at the P2P level, can be of great value.

Distributed systems are not new and, whilst they provide highly robust solutions to many enterprises and governmental problems, they are subject to the laws of physics and require their architects to consider the trade-offs that need to be made in their design and implementation (e.g. consistency vs availability). This remains the case for blockchain systems.

5. Automation

In order to sustain a coherent, fair and consistent blockchain and its surrounding ecosystem a high degree of automation is required. Existing areas with a high demand of automation include those common to most distributed systems. For instance; deployment, elastic topologies, monitoring, recovery from anomalies, testing, continuous integration, and continuous delivery. In the context of blockchains, these represent well-established IT engineering practices. Additionally, there is a creative R&D effort to automate the interactions required to handle assets, computational resources and users across a range of new problem spaces (e.g. logistics, digital asset creation and trading etc).

The trend of social interactions has seen a significant shift towards scripting for transactional operations. This is where ‘Smart Contracts’ and constrained virtual machines (VM) interpreters have emerged - an effort pioneered by the Ethereum project.

The possibility to define through scripting how to operate an asset exchange, under what conditions and actioned by which triggers, has attracted many blockchain enthusiasts. Some of the most common applications of Smart Contracts involve lotteries, trade of digital assets and derivative trading. While there is clearly an exciting potential unleashed by the introduction of Smart Contracts, it is also true that it is still an area with a high entry barrier. Only skilled developers who are willing to invest time in learning Domain Specific Languages (DSLs)[8] have access to the actual creation and modification of these contracts.

Besides, developers can create contracts that contain errors or are incapable of operating under unexpected conditions. This can happen, for instance, when the implementation of a contract is commissioned to a developer who does not have sufficient domain knowledge. Although the industry is taking steps in the right direction, there is still a long way to go in order to automatically adapt to unforeseen conditions and create effective Smart Contracts in non-trivial use cases. The challenge is to respond to safety and security concerns when Smart Contracts are applied to edge-case scenarios that deviate from the 'happy path'. If badly-designed contracts cannot properly rollback/undo a miscarried transaction, their execution might lead to assets being lost or erroneously handed over to unwanted receivers. A number of organisations are conducting R&D efforts to respond to these known issues and introduce VMs operating under more restrictive constraints to deliver a higher level of safety and security. However, from a conceptual perspective, this is a restriction that reduces the flexibility to implement specific needs.

Another area in high need of automation is governance. Any blockchain ecosystem of users and computing resources requires periodic configuration of its parameters to carry on operating coherently and consensually. This results in a complex exercise of tuning incentives and deterrents to guarantee the fulfilment of ambitious collaborative and decentralised goals. The newly emerging field of 'blockchain economics' (combining economics, game theory, social science and other disciplines) remains in its infancy.

Clearly the removal of a central ruling authority produces a vacuum that needs to be filled by an adequate decision-making body, which is typically supplied with an automation that maintains a combination of static and dynamic configuration settings. Those consensus solutions referred to earlier which use computational resources or social stackable assets to assign the authority, not only to produce blocks but also to steer the variable part of governance[9], initially succeeded in filling the decision-making gap in a fair and automated way. Subsequently, the exploitation of flaws in the static element of governance has hindered the success of these models. This has contributed to the rise in popularity of curated approaches such as POA or DPOS, which not only bring back a centralised control, but also reduce the automation of governance.

I expect this to be one of the major areas where blockchain has to evolve in order to achieve widespread market adoption.

6. Transparency and Trust

In order to produce the desired audience engagement for a blockchain and eventually determine its mass adoption and success, its consensus and governance mechanisms need to operate transparently. Users need to know who has access to what data, so that they can decide what can be stored and possibly shared on-chain. These are the contractual terms by which users agree to share their data. As previously discussed, users might want to exercise the right to have their data deleted, which is typically a feature delivered via auxiliary, 'off-chain' databases. In contrast, only hashed information, effectively devoid of its meaning, is preserved permanently on-chain.

Given the immutable nature of the chain history, it is important to decide upfront what data should be permanently written on-chain and what gets written off-chain. The users should be made aware of what data gets stored on-chain and with whom it could potentially be shared. Changing access to on-chain data or deleting it goes against the fundamentals of the immutability and therefore is almost impossible. Getting that decision wrong at the outset can significantly affect the cost and usability (and therefore likely adoption) of the blockchain in question.

Besides transparency, trust is another critical feature that users legitimately seek. This is one of the reasons why blockchain manages to attract customers who have developed distrust in traditional centrally-ruled networks (most notably banks and rating organisations after the mishandling of the subprime mortgages that led to the 2008 financial crisis). It is vital, therefore, that central operating bodies in the shape of curated POA/DPOS consensus act in a transparent and trustworthy manner, so that decision makers are not perceived as an elite that could pursue their own goals, instead of operating in the interest of the collective.

Another example of trust towards the people involved becomes relevant when the blockchain information is linked to the real world. As will be clarified in the context of the next principle (see heading 7), this link involves people and technology dedicated to guaranteeing the accurate preservation of these links against environmental deterioration or fraudulent misuse.

Trust also has to go beyond the scope of the people involved, as systems need to be trusted as well. Every static element, such as an encryption algorithm, a dependency on a library, or a fixed configuration, is potentially exposed to vulnerabilities. Concerns here are understandable given the increasing number of well-publicised hacks and security breaches that have occurred over the last couple of years. In the cryptocurrency space, these attacks have frequently resulted in the perpetrators successfully walking away with large sums of money, without leaving traces that could be realistically used to track down their identity, or roll back to a previous healthy state. In the non-crypto space they have resulted in widespread hacks of data and the disabling of corporate and civil IT systems.

In some circumstances digital wallets were targeted, as the users might not have stored the access keys in a sufficiently secure place[10], and in other circumstances the blockchain itself. This is the case with the 'DAO attack' on Ethereum or the so-called '51% attacks' on Bitcoin.

Blockchain enthusiasts tend to forget that immutability is only preserved by having a sufficient number of ledgers backing a history. Theoretically, in the event that a genuine history gets overruled by a large group of ledgers interested in backing a different history, we could fall into a lack-of-consensus situation, which leads to a fork of the chain. If supported by a large enough group of ledgers[11], the most popular fork could eclipse the genuine minor fork, making it effectively irrelevant. This is not a reason to be excessively alarmed; it should just be a consideration to be aware of when we put our trust in a system. The Bitcoin network, for instance, is currently backed by such a vast number of ledgers that it is impractical for anyone to hack.

Similar considerations need to be made for state-of-the-art encryption, whose deterrent against brute-force attacks is based on the current cost and availability of computational power. Should new technologies such as quantum computing emerge, it is expected that even that encryption would need an upgrade.

7. Link to the external world

The attractive features that blockchain has brought to the internet market would be limited to handling digital assets, unless there was a way to link information to the real world. For some reason this brings back to my mind the popular movie "The Matrix" and the philosophical question from René Descartes about what is real and whether there is another reality behind the perceived one. Without indulging excessively in arguable analogies, it is safe to say that there would be less interest if we were to accept that a blockchain can only operate under the restrictive boundaries of the digital world, without connecting to the analog real world in which we live.

Technologies used to overcome these limitations include cyber-physical devices such as sensors for input and robotic activators for output, and in most circumstances, people and organisations. As we read through most blockchain whitepapers we might occasionally come across the notion of the oracle, which in short, is a way to name an input coming from a trusted external source that could potentially trigger/activate a sequence of transactions in a Smart Contract or which can otherwise be used to validate some information that cannot be validated within the blockchain itself.

Under the Transparency and Trust section we discussed how these peripheral areas are as susceptible to errors and malicious interference as the core blockchain and Smart Contract automations. For instance, suppose we are tracking the history of a precious collectible item: unless we have the capacity to reliably identify the physical object in question, we risk its history being lost or attributed to another object.

A remarkable example of an object identification approach based on physical characteristics is the one used by Everledger to track diamonds. As briefly explained, 40 metadata attributes can be extracted from the object, including the cut, the clarity, the colour, etc. Given the quasi-immutable nature of the objects in question, this has proved a particularly successful use case. It is more challenging, instead, to accurately identify objects that degrade over time (e.g. fine art). In this case, typically trusted people are involved, or a combination of people and sensor metadata. Relying on expert actors to provide external validation depends upon a properly aligned set of incentives. 'Traditional' industries have a long history of dealing with these issues and well-developed mechanisms for dealing with them: blockchain-based solutions are exploring many ways in which they can interface (and adapt) these existing mechanisms.

From a different perspective, even wealth itself only makes sense if it can be exercised in the 'real world'. This is as valid for blockchain as for traditional centrally-controlled systems, and in fact the idea of virtualising wealth is not new to us, as we shifted from trading time and goods directly to stacking wealth into account deposits. What's new in this respect is the introduction of alternate currencies known as cryptocurrencies.

Bitcoin and Ethereum, the two dominant projects (in September 2018) in the blockchain space, are seen by many investors as an opportunity to diversify a portfolio or speculate on the value of their respective cryptocurrency. The same applies to a wide range of other cryptocurrencies, with the exception of fiat-pegged currencies, most notably Tether, where the value is effectively bound to the US dollar. Conversions from one cryptocurrency to another and to/from fiat currencies are typically operated by exchanges on behalf of an investor. These are, again, peripheral services that serve as a link to the external world.

From a blockchain perspective, some argue that the risk control a cryptocurrency investor demands - so that there is the possibility to step back and convert back to fiat currencies - is the wrong mindset. Evangelising trust in cryptocurrency diversification, however, understandably requires an implementation period during which investors might legitimately want to exercise the option to pull out.

Besides oracles and cyber-physical links, an interest is emerging in linking Smart Contracts together to deliver a comprehensive solution. Contracts could indeed operate in a cross-chain scenario to offer interoperability among a variety of digital assets and protocols. Although attempts to combine different protocols and approaches have emerged (e.g. see the EEA and the Accord Project), this is still an area where further R&D is necessary in order to provide enough instruments and guarantees to developers and entrepreneurs. The challenge is to deliver cross-chain functionalities without the support of a central governing agency/body.

Conclusion

The financial boost that the blockchain market has benefited from, through the large fundraising operations conducted through the 2017-2018 initial coin offerings (ICOs), has given the opportunity to conduct a substantial amount of R&D aimed at finding solutions to the challenges described in this blogpost. Although step changes have been introduced by the work of visionary individuals with mathematical backgrounds such as S. Nakamoto and V. Buterin, it seems clear that contributions need to come from a variety of areas of expertise to ensure the successful development of the technology and the surrounding ecosystem. It is an exercise that involves evangelisation, social psychology, automation, regulatory compliance and ethical direction.

The challenges the market is facing can be interpreted as an opportunity for ambitious entrepreneurs to step in and resolve the pending fundamental issues of consensus, governance, automation and link to the real world. It is important in this sense to ensure that the people you work with have the competence, the technology and a good understanding of the issues to be resolved, in order to successfully drive this research forward (see Erlang Solutions offering here).

As discussed in the introduction, this blogpost only focuses on assessing the state of advancement of the technology in response to the motivating principles, and analyses known issues without proposing solutions. That said, I and the people I work with are actively working on solutions that will be presented and discussed in a separate forum. Want to learn more? Check out our blog post on the way blockchain impacts ownership in the digital era.

Footnotes

1. One-way in this context means that you cannot retrieve the original information from a hash value; hashing also frequently involves the destruction of information, potentially leading to rare collisions (read more on perfect hash functions).

2. In the event of a split brain, a typical consensus policy is to allow fragmented networks to carry on writing blocks on their known chain and once the connection is re-established the longest chain part (also known as fork) is preferred while the shortest is deleted. Note this could lead to a temporary double spending scenario, a fundamental ‘no-go’ in the blockchain (and ‘traditional’) world.

3. Note that in this case the very concept of ownership of an asset can be threatened. E.g. if I remain the only retainer of a history that is valid while the rest of the world has moved on in disagreement with it, my valid history gets effectively invalidated. History is indeed written by the victors!

4. Blockchain stores its blocks in a linear form although there have been attempts to introduce graph fork tolerant approaches such as the IOTA tangle.

5. Mining is the process of adding a block of transactions to the chain (see here for instructions and here for more context).

6. Historically a central server was the only economically viable option given the high cost of performant hardware and cheaper cost of terminals.

7. Bitcoin’s POW rewards a successful miner with a prize in bitcoin cryptocurrency. This is no longer an incentive in some countries. See here a report released in May about the cost to mine 1 bitcoin per country.

8. For example Solidity is a popular DSL inspired by JavaScript.

9. Once discovered and understood, malicious users can take advantage of any static / immutable configuration. For instance, rich/resourceful individuals could gain an advantage via the purchase of a vast amount of computational resources to bend the POW fairness.

10. A common guideline is to split the access key into parts and store each in a different cryptovault.

11. The Bitcoin network's estimated computing power at the time of writing (September 2018) amounts to 80704290 petaflops, while the world's most powerful supercomputer reaches 200 petaflops. This obviously only allows us to infer, by analogy, the magnitude of the number of ledgers on the network.


Using CircleCI for Continuous Integration of Elixir projects.

Continuous integration is vital to ensure the quality of any software you ship. Circle CI is a great tool that provides effective continuous integration. In this blog, we will guide you through how to set up CircleCI for an Elixir project. We’ve set up the project on GitHub so you can follow along. In the project we will:

  • Build the project
  • Run tests and check code coverage
  • Generate API documentation
  • Check formatting
  • Run Dialyzer check.

You can follow the actual example here. Some inspiration for this blog post comes from this blog and this post. Prerequisites: For this project you will need an account on CircleCI and GitHub. You also need to connect the two accounts. Lastly, you will need Erlang/OTP and Elixir installed.

Create an Elixir project

To begin with, we need a project. For this demonstration we wanted to use the most trivial example possible, so we generated a classic 'Hello World' program.
To create the project, simply type the following into the shell:

mix new hello_world

We also added a printout, because the generated constant-returning function can be optimised away and that will confuse the code coverage checking.
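
The tweaked function ends up looking something like this (a sketch; the exact wording in the example repository may differ):

defmodule HelloWorld do
  def hello do
    # The side effect keeps the compiler from reducing the call to a constant,
    # so the coverage tool actually sees this line being executed by the test.
    IO.puts("hello world")
    :world
  end
end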

Add code coverage metric

Code coverage allows us to identify how much of the code is executed during testing. Once we know that, we can also see which lines are not executed; those lines could be hiding undetected bugs, or they could be lines we have forgotten to write tests for. If those lines of code are unreachable or not used, they should obviously be removed. To get this metric, I add the excoveralls package to our project (see here for details). Beware that even if you have 100% code coverage, it does not mean the code is bug-free. Here's an example (inspired by this):

defmodule Fact do
  def fact(1), do: 1
  def fact(n), do: n * fact(n - 1)
end

If the test executes Fact.fact(10), we get 100% test coverage, but the code is still faulty: it won't terminate if the input is, e.g., -1. For a new project, it should be easy to keep the code coverage near 100%, especially if the developers follow test-driven principles. However, for a "legacy" or already established project without adequate test coverage, reaching 100% code coverage might be unreasonably expensive. In this case, we should still strive to increase (or at least not decrease) the code coverage.
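
For instance, a test along these lines (a hypothetical FactTest module, not necessarily the one in the example repository) exercises both clauses and reports full coverage while never noticing the non-termination bug:

defmodule FactTest do
  use ExUnit.Case

  test "factorial of 10" do
    # Both clauses of Fact.fact/1 run, so coverage reports 100%,
    # yet Fact.fact(-1) would still loop forever.
    assert Fact.fact(10) == 3_628_800
  end
end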

To run the tests with code coverage, execute:

mix coveralls.html

In the project root directory (hello_world). Apart from the printout, an HTML report will be generated in the cover directory too.
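
For reference, wiring excoveralls into a project usually looks roughly like this in mix.exs (a sketch; the exact version and options may differ from the example repository):

# mix.exs (excerpt)
def project do
  [
    app: :hello_world,
    version: "0.1.0",
    elixir: "~> 1.8",
    start_permanent: Mix.env() == :prod,
    deps: deps(),
    # Route coverage through ExCoveralls and run the coveralls tasks in :test.
    test_coverage: [tool: ExCoveralls],
    preferred_cli_env: [coveralls: :test, "coveralls.html": :test]
  ]
end

defp deps do
  [
    {:excoveralls, "~> 0.11", only: :test}
  ]
end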

Add Dialyzer checking

Dialyzer is a static analysis tool for Erlang and Elixir code. It can detect type errors (e.g. when a function expects a list as an argument but is called with a tuple), unreachable code and other kinds of bugs. Although Elixir is a dynamically typed language, it is possible to add type specifications (specs) for function arguments, return values, structs, etc. The Elixir compiler does not check them, but the generated API documentation uses the information, and it is extremely useful for users of your code. As a post-compilation step, you can run Dialyzer to check the type specifications; in addition to writing unit tests, you can regard this type checking as an early warning that detects incorrect input before you start debugging a unit test or API client. To enable a Dialyzer check for this code, I'll use the Dialyzer mix task (see this commit for more details). Dialyzer needs PLT (persistent lookup table) files to speed up its analysis. Generating these files for the Elixir and Erlang standard libraries takes time, even on fast hardware, so it is vital to reuse the PLT files between CI jobs.
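
As a small illustration (a hypothetical module, not part of the example repository), a spec looks like this, and Dialyzer should flag code that calls the function with the wrong type:

defmodule Greeter do
  @spec hello(String.t()) :: String.t()
  def hello(name), do: "Hello, " <> name
end

# A compiled call such as Greeter.hello(:world) should be reported by Dialyzer,
# because an atom can never satisfy the binary concatenation inside hello/1.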

To run the dialyzer check, execute ‘mix dialyzer’ in the project root directory (hello_world). The output will be printed on the console. The first run (which generates the PLT file for the system and dependencies) might take a long time!

Add documentation generation

Elixir provides support for written documentation. By default this documentation will be included in the generated beam files and accessible from the interactive Elixir shell (iex). However, if the project provides an API for other projects, it is vital to generate more accessible documentation. We’re going to use the mix docs task to generate HTML documentation (see this commit for more details).
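
Documentation is written with @moduledoc and @doc attributes; for the hypothetical HelloWorld module sketched earlier it could look like this:

defmodule HelloWorld do
  @moduledoc """
  A minimal module used to demonstrate documentation generation.
  """

  @doc """
  Prints a greeting and returns `:world`.
  """
  def hello do
    IO.puts("hello world")
    :world
  end
end

The same attributes back what iex shows for h HelloWorld.hello and what mix docs turns into HTML.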

To generate documentation, execute:

‘mix docs’

In the project root directory (hello_world). The documentation is generated (by default) in the doc directory.

Ensure code formatting guidelines

Using the same code style consistently throughout a project helps readers understand the code and could prevent some bugs from being introduced. The mix tool in Elixir provides a task to check that the code is in compliance with the specified guidelines. The command to do this is:

mix format --check-formatted

In the project root directory (hello_world).
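
Which files the formatter covers is controlled by .formatter.exs; the file generated by mix new looks roughly like this:

# Used by "mix format"
[
  inputs: ["{mix,.formatter}.exs", "{config,lib,test}/**/*.{ex,exs}"]
]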

Push the project to GitHub

Now that the project is created, it needs to be published to allow CircleCI to access it. Create a new repository on GitHub by clicking New on the Repository page, then follow the instructions. When the new repository is created, initialize the git repository and push it to GitHub:

cd hello_world
git init
git add config/ .formatter.exs .gitignore lib/ mix.exs README.md test/
git commit -m "Initial version" -a
git remote add origin <repo-you-created>
git push -u origin master

Integrate the GitHub repository to CircleCI

  1. Log in to circleci.com
  2. Click on "Add projects"
  3. Click on "Set Up Project" for hello_world
  4. For now, skip the instructions to create the .circleci/config.yml file (we'll get back to this file later); just click on "Start Building"

The build will fail, because we didn’t add the configuration file, that’s the next step.

Configuring the CircleCI workflow

As our requirements above state, we'll need five jobs. These are:

  • Build: Compile the project and create the PLT file for Dialyzer analysis.
  • Test: Run the tests and compute code coverage.
  • Generate documentation: Generate the HTML and associated files that document the project.
  • Check code format: The mix tool can be used to ensure that the project follows the specified code formatting guidelines.
  • Execute Dialyzer check: run the Dialyzer mix task using the previously generated PLT file

The syntax of the CircleCI configuration file is described here.

The next sections describe the configuration required to set up the above five jobs, so create a .circleci/config.yml file and add the following to it:

Common preamble

version: 2.1
jobs:

The above specifies the CircleCI configuration version. The jobs (described in the next sections) are listed under the jobs: key.

The build step

The configuration for the build step:

  build:
    docker:
      - image: circleci/elixir:1.8.2
        environment:
          MIX_ENV: test

    steps:
    - checkout

    - run: mix local.hex --force
    - run: mix local.rebar --force

    - restore_cache:
        key: deps-cache-{{ checksum "mix.lock" }}
    - run: mix do deps.get, deps.compile
    - save_cache:
        key: deps-cache-{{ checksum "mix.lock" }}
        paths:
            - deps
            - ~/.mix
            - _build

    - run: mix compile

    - run: echo "$OTP_VERSION $ELIXIR_VERSION" > .version_file
    - restore_cache:
        keys:
            - plt-cache-{{ checksum ".version_file" }}-{{ checksum "mix.lock" }}
    - run: mix dialyzer --plt
    - save_cache:
        key: plt-cache-{{ checksum ".version_file"  }}-{{ checksum "mix.lock" }}
        paths:
            - _build
            - deps
            - ~/.mix

In this example, I'm using Elixir 1.8.2. The list of available images can be found here. The MIX_ENV variable is set to test, so the test code is also built and, more importantly, the PLT file for Dialyzer will include the test-only dependencies too.

Further build steps:

The actual build process checks out the code, then fetches and installs hex and rebar locally.

Restore the dependencies from cache. The cache key depends on the checksum of the mix.lock file which contains the exact versions of the dependencies, so this key changes only when actual dependencies are changed. The dependencies and built files are saved to the cache, to be reused in the test and documentation generating steps, and later when CI runs.

Build the actual project. Unfortunately, this result cannot be reused, because the checkout steps produce source files with the current timestamp; in later steps, the source files will have newer timestamps than the beam files generated in this step, which will lead to mix compiling the project anyway.

The dialyzer PLT file depends on the Erlang/OTP and Elixir versions. Even though I’ve pinned the Elixir version, it is possible that the Erlang/OTP in the Docker image gets updated, and in that case the PLT file would be out of date. As CircleCI caches are immutable, there is no way to update the PLT file in place; new cache contents need a new cache name. Unfortunately, not all environment variable names can be used in CircleCI cache names, so I needed a workaround: create a temporary .version_file which contains the Erlang/OTP and Elixir versions and use its checksum in the cache name along with the checksum of the mix.lock file (which contains the exact versions of all dependencies). As long as we have the exact same tool and dependency versions, we can reuse the PLT file safely, but as soon as anything changes, we get a new PLT file.

The test step

The configuration for the test step:

  test:
    docker:
    - image: circleci/elixir:1.8.2

    steps:
    - checkout
    - restore_cache:
        key: deps-cache-{{ checksum "mix.lock" }}
    - run: mix coveralls.html

    - store_artifacts:
        path: cover
        destination: coverage_results

Obviously, you need to use the same docker image as in the build step. There’s no need to explicitly configure the MIX_ENV environment variable, because the coveralls mix task runs in the test environment. The test job is fairly straightforward: check out the code from the repository, fetch the dependencies from the cache, then run the coverage check task. The coverage report is generated in the cover directory. It is stored as an artifact, so you can check the results via a browser on the CircleCI page. If the tests themselves fail, the output of the step will show the error message, and the job will fail.
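
The coveralls.html task is provided by the excoveralls package, so this job assumes the project already lists it as a test dependency and wires it up as the coverage tool. A minimal sketch of the relevant mix.exs (version numbers are illustrative):

defmodule HelloWorld.MixProject do
  use Mix.Project

  def project do
    [
      app: :hello_world,
      version: "0.1.0",
      elixir: "~> 1.8",
      start_permanent: Mix.env() == :prod,
      deps: deps(),
      # use ExCoveralls for coverage and run coveralls.html in the :test env
      test_coverage: [tool: ExCoveralls],
      preferred_cli_env: ["coveralls.html": :test]
    ]
  end

  defp deps do
    [
      {:excoveralls, "~> 0.11", only: :test}
    ]
  end
end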

The documentation generation step

The configuration for the documentation generation step:

  generate_documentation:
    docker:
    - image: circleci/elixir:1.8.2
      environment:
        MIX_ENV: test

    steps:
    - checkout
    - restore_cache:
        key: deps-cache-{{ checksum "mix.lock" }}
    - run: mix docs

    - store_artifacts:
        path: doc
        destination: documentation

Once again, the same docker image from the build step needs to be used, and setting the MIX_ENV variable is important; otherwise the dependencies might differ between the dev and test environments. The documentation generation is fairly straightforward: check out the code from the repository, fetch the dependencies from the cache (which contain the documentation generation task), then run the documentation generation job. The documentation is generated in the doc directory. It is stored as an artifact, so you can check the results via a browser on the CircleCI page.
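
The mix docs task comes from the ex_doc package, so the project is assumed to declare it as a dependency that is also available in the test environment (since that is the MIX_ENV this job runs with). A sketch, with an illustrative version number:

# in mix.exs, the `mix docs` task is provided by ex_doc
defp deps do
  [
    {:ex_doc, "~> 0.20", only: [:dev, :test], runtime: false}
  ]
end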

The dialyzer step

The configuration for the dialyzer step:

  dialyzer:
    docker:
    - image: circleci/elixir:1.8.2
      environment:
        MIX_ENV: test

    steps:
    - checkout
    - run: echo "$OTP_VERSION $ELIXIR_VERSION" > .version_file
    - restore_cache:
        keys:
            - plt-cache-{{ checksum ".version_file" }}-{{ checksum "mix.lock" }}
    - run: mix dialyzer --halt-exit-status

Much like the previous steps, the docker image needs to match the one used for the build, and the MIX_ENV variable must be set correctly. The workaround mentioned above is needed to find the cache with the right PLT file; after that, executing Dialyzer is a single command. If Dialyzer finds an error, it returns a non-zero exit code, so the step will fail.
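
The dialyzer mix task (and its --plt and --halt-exit-status options) is provided by the dialyxir package, so a dependency entry roughly like this is assumed in mix.exs (the version is illustrative):

# in mix.exs, `mix dialyzer` is provided by dialyxir
defp deps do
  [
    {:dialyxir, "~> 0.5", only: [:dev, :test], runtime: false}
  ]
end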

The format checking step

The configuration for the format checking step:

  format_check:
    docker:
    - image: circleci/elixir:1.8.2
      environment:
        MIX_ENV: test

    steps:
    - checkout

    - run: mix format --check-formatted

This one is really simple: we don’t even need any cached files; we just run the check.
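
The check relies on the project’s .formatter.exs (added to git in the initial commit above), which tells mix format which files to verify. The default file generated by mix new looks like this:

# .formatter.exs: lists the files that `mix format` formats and checks
[
  inputs: ["{mix,.formatter}.exs", "{config,lib,test}/**/*.{ex,exs}"]
]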

Piecing it all together

CircleCI executes a workflow which contains the above steps. This is the configuration for the workflow:

workflows:
  version: 2
  build_and_test:
    jobs:
    - build
    - format_check:
        requires:
            - build
    - generate_documentation:
        requires:
            - build
    - dialyzer:
        requires:
            - build
    - test:
        requires:
            - build

The workflow specifies that the four later jobs depend on the build step but can run simultaneously with each other. The whole configuration file can be seen on GitHub. When the configuration file is ready, add it to git:

git add .circleci/config.yml

And push it to GitHub:

git push origin master

CircleCI will automatically detect that a new version was committed and will execute the configured jobs.

Conclusion

Continuous integration is essential in a fast-moving project. Developers need feedback as soon as possible, because the earlier a bug is found, the earlier and cheaper it is to fix. This post presents a simple but effective setup that can run this vital part of an Elixir project. Want more fantastic Elixir inspiration? Don’t miss out on Code Elixir, a day to learn, share, connect with and be inspired by the Elixir community. Want to learn about live tracing in Elixir? Head to our easy-to-follow guide.


Elixir v1.9 released

Elixir v1.9 is out with releases support, improved configuration, and more.

We are also glad to announce Fernando Tapia Rico has joined the Elixir Core Team. Fernando has been extremely helpful in keeping the issues tracker tidy, by fixing bugs and improving Elixir in many different areas, such as the code formatter, IEx, the compiler, and others.

Now let’s take a look at what’s new in this new version.

Releases

The main feature in Elixir v1.9 is the addition of releases. A release is a self-contained directory that consists of your application code, all of its dependencies, plus the whole Erlang Virtual Machine (VM) and runtime. Once a release is assembled, it can be packaged and deployed to a target as long as the target runs on the same operating system (OS) distribution and version as the machine running the mix release command.

Releases have always been part of the Elixir community thanks to Paul Schoenfelder’s work on Distillery (and EXRM before that). Distillery was announced in July 2016. Then in 2017, DockYard hired Paul to work on improving deployments, an effort that would lead to Distillery 2.0. Distillery 2.0 provided important answers in areas where the community was struggling to establish conventions and best practices, such as configuration.

At the beginning of this year, thanks to Plataformatec, I was able to prioritize the work on bringing releases directly into Elixir. Paul was aware that we wanted to have releases in Elixir itself and during ElixirConf 2018 I announced that releases was the last planned feature for Elixir.

The goal of Elixir releases was to double down on the most important concepts provided by Distillery and provide extension points for the other bits the community may find important. Paul and Tristan (who maintains Erlang’s relx) provided excellent feedback on Elixir’s implementation, which we are very thankful for. The Hex package manager is already using releases in production and we also got feedback from other companies doing the same.

Enough background, let’s see why you would want to use releases and how to assemble one.

Why releases?

Releases allow developers to precompile and package all of their code and the runtime into a single unit. The benefits of releases are:

  • Code preloading. The VM has two mechanisms for loading code: interactive and embedded. By default, it runs in the interactive mode which dynamically loads modules when they are used for the first time. The first time your application calls Enum.map/2, the VM will find the Enum module and load it. There’s a downside. When you start a new server in production, it may need to load many other modules, causing the first requests to have an unusual spike in response time. Releases run in embedded mode, which loads all available modules upfront, guaranteeing your system is ready to handle requests after booting (see the snippet after this list).

  • Configuration and customization. Releases give developers fine grained control over system configuration and the VM flags used to start the system.

  • Self-contained. A release does not require the source code to be included in your production artifacts. All of the code is precompiled and packaged. Releases do not even require Erlang or Elixir in your servers, as they include the Erlang VM and its runtime by default. Furthermore, both Erlang and Elixir standard libraries are stripped to bring only the parts you are actually using.

  • Multiple releases. You can assemble different releases with different configuration per application or even with different applications altogether.

  • Management scripts. Releases come with scripts to start, restart, connect to the running system remotely, execute RPC calls, run as daemon, run as a Windows service, and more.
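
As a quick way to see the code preloading point above in action, you can ask the VM which mode the code server is running in. In a plain IEx shell it reports interactive, while inside a booted release it reports embedded:

iex> :code.get_mode()
:interactive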

1, 2, 3: released assembled!

You can start a new project and assemble a release for it in three easy steps:

$ mix new my_app
$ cd my_app
$ MIX_ENV=prod mix release

A release will be assembled in _build/prod/rel/my_app. Inside the release, there will be a bin/my_app file which is the entry point to your system. It supports multiple commands, such as:

  • bin/my_app start, bin/my_app start_iex, bin/my_app restart, and bin/my_app stop - for general management of the release

  • bin/my_app rpc COMMAND and bin/my_app remote - for running commands on the running system or to connect to the running system

  • bin/my_app eval COMMAND - to start a fresh system that runs a single command and then shuts down

  • bin/my_app daemon and bin/my_app daemon_iex - to start the system as a daemon on Unix-like systems

  • bin/my_app install - to install the system as a service on Windows machines

Hooks and Configuration

Releases also provide built-in hooks for configuring almost every need of the production system:

  • config/config.exs (and config/prod.exs) - provides build-time application configuration, which is executed when the release is assembled

  • config/releases.exs - provides runtime application configuration. It is executed every time the release boots and is further extensible via config providers (a minimal sketch follows this list)

  • rel/vm.args.eex - a template file that is copied into every release and provides static configuration of the Erlang Virtual Machine and other runtime flags

  • rel/env.sh.eex and rel/env.bat.eex - template files that are copied into every release and executed on every command to set up environment variables, including ones specific to the VM, and the general environment
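
For instance, a minimal config/releases.exs that reads a value from the system environment when the release boots might look like this (the application and variable names are made up for illustration):

# config/releases.exs: evaluated at runtime, every time the release boots
import Config

config :my_app,
  port: String.to_integer(System.fetch_env!("PORT"))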

We have written extensive documentation on releases, so we recommend checking it out for more information.

Configuration

We also use the work on releases to streamline Elixir’s configuration API. A new Config module has been added to Elixir. The previous configuration API, Mix.Config, was part of the Mix build tool. However, since releases provide runtime configuration and Mix is not included in releases, we ported the Mix.Config API to Elixir. In other words, use Mix.Config has been soft-deprecated in favor of import Config.
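
In an existing project, the change is typically a one-line swap at the top of each config file; for example:

# config/config.exs
# use Mix.Config   # soft-deprecated as of v1.9
import Config      # the Config module that ships with Elixir

config :logger, level: :info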

Another important change related to configuration is that mix new will no longer generate a config/config.exs file. Relying on configuration is undesired for most libraries and the generated config files pushed library authors in the wrong direction. Furthermore, mix new --umbrella will no longer generate a configuration for each child app, instead all configuration should be declared in the umbrella root. That’s how it has always behaved, we are now making it explicit.

Other improvements

There are many other enhancements in Elixir v1.9. The Elixir CLI got a handful of new options in order to best support releases. Logger now computes its sync/async/discard thresholds in a decentralized fashion, reducing contention. EEx (Embedded Elixir) templates support more complex expressions than before. Finally, there is a new ~U sigil for working with UTC DateTimes as well as new functions in the File, Registry, and System modules.
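
For example, the new ~U sigil builds a DateTime pinned to the Etc/UTC time zone:

iex> dt = ~U[2019-06-28 09:30:00Z]
~U[2019-06-28 09:30:00Z]
iex> dt.time_zone
"Etc/UTC"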

What’s next?

As mentioned earlier, releases was the last planned feature for Elixir. We don’t have any major user-facing feature in the works nor planned. I know for certain some will consider this fact the most exciting part of this announcement!

Of course, it does not mean that v1.9 is the last Elixir version. We will continue shipping new releases every 6 months with enhancements, bug fixes and improvements. You can see the Issues Tracker for more details.

We are also working on some structural changes. One of them is to move the mix xref pass straight into the compiler, which would allow us to emit undefined function and deprecation warnings in more places. We are also considering a move to Cirrus-CI, so we can test Elixir on Windows, Unix, and FreeBSD through a single service.

It is also important to highlight that there are two main reasons why we can afford to have an empty backlog.

First of all, Elixir is built on top of Erlang/OTP and we simply leverage all of the work done by Ericsson and the OTP team on the runtime and Virtual Machine. The Elixir team has always aimed to contribute back as much as possible and those contributions have increased in recent years.

Second, Elixir was designed to be an extensible language. The same tools and abstractions we used to create and enhance the language are also available to libraries and frameworks. This means the community can continue to improve the ecosystem without a need to change the language itself, which would effectively become a bottleneck for progress.

Check the Install section to get Elixir installed and read our Getting Started guide to learn more. We have also updated our advanced Mix & OTP to talk about releases. If you are looking for a more fast paced introduction to the language, see the How I Start: Elixir tutorial, which has also been brought to the latest and greatest.

Have fun!


Copyright © 2016, Planet Erlang. No rights reserved.
Planet Erlang is maintained by Proctor.