20 years of open source Erlang: OpenErlang Interview with Simon Phipps

This is now our fourth #OpenErlang Interview, having already unleashed videos featuring Robert Virding and Joe Armstrong, Chris Price, and Jane Walerud; each has their own views and stories to share when it comes to Erlang and open source. 2018 also marks another huge anniversary! The “open source” label was created at a strategy session held on February 3rd, 1998 in Palo Alto, California. That same month, the Open Source Initiative (OSI) was founded. So this year we’ve partnered with OSI to celebrate our anniversaries together. Viva open source, viva #OpenErlang!

Without further ado, we are very happy to introduce our next #OpenErlang friend - President of the Open Source Initiative, Simon Phipps.

We have the transcript listed at the bottom of this blog post.

About Simon

Simon Phipps has his fingers in a lot of pies, and is a passionate open source advocate. He is a programmer and computer scientist extraordinaire with a wealth of experience, and we didn’t have to ask him twice to work with us. In true open source spirit, Simon is always very excited to support any project or campaign promoting open source!

Simon has had experience at some of the world’s leading tech companies. He was a key figure at IBM in relation to Java, having founded their IBM Java Technology Center. He then joined Sun Microsystems in 2000 - most of Sun’s core software then became open source under Simon’s leadership (Solaris, XML and Java being the main ones).

As mentioned, Simon is the President of the Open Source Initiative, a role he has held twice: he first served until stepping down at the end of his term in 2015, and was then re-elected in 2017.

Simon is also involved with the Open Rights Group and The Document Foundation, and sits on the board of Open Source for America. Past roles have included OpenSolaris, OpenJDK, OpenSPARC and the MariaDB Foundation. He was also involved in the creation of LibreOffice and the founding of the Open Mobile Alliance.

About The Open Source Initiative

Do any of us remember a life without Internet? If so, those were bleak times indeed. Using paper maps and needing to entertain ourselves…

For starters, none of us would be in our position now if the decision by Ericsson to open source Erlang hadn’t been made 20 years ago. In the same year, the coinage of the term “open source” came about in Palo Alto, California and the Open Source Initiative (OSI) was created. Like Erlang, the Open Source Initiative had no idea just how popular it would become.

OSI is a global non-profit organisation that promotes open source software and the communities that love it. This includes education and infrastructure, as well as building communities around the various languages.

Along with broadcasting the popularity of open source technologies and projects, including BEAM languages, the OSI aims at preserving the communities and continuing legacies that are now 20 years old.

Interview Transcript

At work with the boss breathing down your neck? Or don’t want to be one of those playing videos out loud on public transport? Here’s the transcript, although not as exciting as the real thing.

Simon Phipps: I’ve been with the Open Source Initiative now since 2008. For the last few years, I’ve been the President of the Open Source Initiative.

The Open Source Initiative is at its heart a marketing program for free software. It takes the practical aspects of software freedom, the development methodology, the community openness, and makes them accessible to businesses and to those who don’t want to become activists for the ethics of free software.

In the early years, it was very much an upstart movement. It focused on attacking its opponents like Microsoft. It focused on defining a standard for licenses, called the Open Source Definition.

During the mid part of that first decade, it became obvious that Open Source was going to be adopted by many businesses. I think our first highlight was being roundly attacked by the established corporations and then seeing those same corporations adopt Open Source as their core methodology. When, for example, a few years ago Microsoft said it loved Linux, that was a complete reversal of the position it took in 2002. That was a delicious moment, I have to say.

Software freedom is the essential ingredient for the profitability of companies using Open Source, for the career mobility of developers who are using Open Source and for the deployment of Open Source into new technologies like Cloud and IoT.

Open Source has become completely mainstream now. I think that’s a great thing. I think it vindicates the visions that many of us had 20 years ago. It’s also a challenge because it’s become so widely adopted now that people want to use the term “Open Source” to describe things that aren’t good for software freedom. I think that in the third decade that we’re just entering, one of our focuses will be on making sure that software freedom becomes formats.

[00:02:04] [END OF AUDIO]

OpenErlang; 20 Years of Open Sourced Erlang

Erlang was originally built for Ericsson and Ericsson only, as a proprietary language, to improve telephony applications. It can also be referred to as “Erlang/OTP” and was designed to be a fault-tolerant, distributed, real-time system that offered pattern matching and functional programming in one handy package.

Robert Virding, Joe Armstrong and Mike Williams were using this programming language at Ericsson for approximately 12 years before it went open source to the public in 1998. Since then, it has powered a huge number of businesses, big and small, offering massively reliable systems and ease of use.

OpenErlang Interview Series

As mentioned, this isn’t the first in the #OpenErlang Interview series. We have three earlier videos for you to enjoy.

Robert Virding and Joe Armstrong

It only seems fitting to have launched with the creators of Erlang; Robert Virding and Joe Armstrong (minus Mike Williams). Robert and Joe talk about their journey with Erlang including the early days at Ericsson and how the Erlang community has developed.

Christopher Price

Last week was the launch of our second #OpenErlang Interview from Ericsson’s Chris Price. Currently the President of Ericsson’s Software Technology, Chris has been championing open source technologies for a number of years.

Chris chats to us about how Erlang has evolved, 5G standardization technology, and his predictions for the future.

Jane Walerud

Jane is a serial entrepreneur of the tech persuasion. She was instrumental in promoting and open sourcing Erlang back in the 90s. Since then, she has continued her entrepreneurial activities, helping launch countless startups within the technology sector from 1999 to the present day. Her work has spanned many influential companies that use the language, including Klarna, Tobii Technology, Teclo Networks and Bluetail, which she co-founded.

Other roles have included Member of the Board at Racefox, Creades AB and Royal Swedish Academy of Engineering Sciences, and a key role in the Swedish Government Innovation Council.

Other Erlang Solutions Activities…

Strategies for Successfully Adopting Elixir

Missed it? Continuing the #OpenErlang buzz, Ben Marx from Bleacher Report shared the many ways he adopted Elixir in our latest webinar. Don’t worry, the recording is now available, and don’t forget you can sign up to receive the slide deck, whether or not you made the live webinar.

The #OpenErlang London Party

We have announced our #OpenErlang London Party! Sign up for free at eventbrite to celebrate 20 years of open source Erlang! We’ll have great food, free-flowing drinks, and entertainment, and the best part? It’s all free!

If you’re interested in contributing and collaborating with us at Erlang Solutions, you can contact us at general@erlang-solutions.com.

Permalink

Blockchain 2018: Myths vs Reality

As you may know, we offer FinTech services at Erlang Solutions and this has been a successful year of involvement in a variety of blockchain projects. I felt it healthy to take a step back and observe the emerging trends in this nascent (but still multi-billion-dollar) industry and question whether some of the directions taken so far are sensible or whether any corrections are necessary. I will provide a ‘snapshot’ of the state of advancement of blockchain technology, describing the strengths and weaknesses of the solutions that have emerged so far, without proposing innovations at this stage (those ideas will form the subject of future blog posts). If you are interested in hearing what I and Erlang Solutions, where I work as Scalability Architect and Technical Lead, have to say about blockchain, let me take you through it:

The context

I do not propose in this blogpost to get into the details of the blockchain data structure itself, nor will I discuss which is the best Merkle tree solution to adopt. I will also avoid hot topics such as ‘Transactions Per Second’ (TPS) and the mechanisms for achieving substantial transaction volumes, which is de facto the ultimate benchmark widely adopted as a measurement of how competitive a solution is against the major players in the market: Bitcoin and Ethereum.

What I would like to examine instead is the state of maturity of the technology, and its alignment with the core principles that underpin the distributed ledger ecosystem. I hope that presenting a clear picture of these principles and how they are evolving may be helpful.

The principles

As the primary drive for innovation comes from public, open source blockchains, these will be the ones on which we focus our attention.

Blockchain technology mainly aims to embrace the following high-level principles:

  1. Immutability of the history
  2. Decentralisation of the control
  3. ‘Workable’ consensus mechanism
  4. Distribution and resilience
  5. Transactional automation (including ‘smart contracts’)
  6. Transparency and Trust
  7. Link to the external world

Let us look at them one by one:

1. Immutability of the history

In an ideal world it would be desirable to preserve an accurate historical trace of events, and make sure this trace does not deteriorate over time, whether through natural events, or by human error or by the intervention of fraudulent actors. The artefacts produced in the analogue world face alterations over time, although there is often the intent to make sure they can withstand forces that threaten to alter and eventually destroy them. In the digital world the quantized / binary nature of the stored information provides the possibility of continuous corrections to prevent any deterioration that might occur over time.

Writing an immutable blockchain aims to retain a digital history that cannot be altered over time and on top of which one can verify that a trace of an event, known as a transaction, is recorded in it. This is particularly attractive when it comes to assessing the ownership or the authenticity of an asset or to validate one or more transactions.

We should note that, on top of the inherent immutability of a well-designed and implemented blockchain, hashing algorithms also provide a means to encode the information that gets written in the history, so that a trace/transaction can only be verified by actors possessing sufficient data to compute the one-way1 cascaded encoding/encryption. This is typically implemented on top of Merkle trees, where hashes of concatenated hashes are computed.
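To make that idea concrete, here is a minimal sketch (my own illustration, not from the original post) of the ‘hash of concatenated hashes’ step that produces a parent node in a Merkle tree, using Erlang’s crypto module:

-module(merkle_sketch).
-export([parent_hash/2]).

%% The parent node in a Merkle tree is the hash of its children's hashes
%% concatenated together, so tampering with either child changes every
%% hash above it.
-spec parent_hash(iodata(), iodata()) -> binary().
parent_hash(LeftData, RightData) ->
    LeftHash = crypto:hash(sha256, LeftData),
    RightHash = crypto:hash(sha256, RightData),
    crypto:hash(sha256, <<LeftHash/binary, RightHash/binary>>).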

Legitimate questions can be raised about the guarantees for indefinitely storing an immutable data structure:

  1. If this is an indefinitely growing history, where can it be stored once it grows beyond the capacity of the ledgers?
  2. As the history size grows (and/or the computing power needed to validate further transactions increases) this reduces the number of potential participants in the ecosystem, leading to a de facto loss of decentralisation. At what point does this concentration of ‘power’ create concerns?
  3. How does the verification performance deteriorate as the history grows?
  4. How does it deteriorate when a lot of data gets written on it concurrently by the users?
  5. How long is the segment of data that you replicate on each ledger node?
  6. How much network traffic would such replication generate?
  7. How much history is needed to be able to compute a new transaction?
  8. What compromises need to be made on linearisation of the history, replication of the information, capacity to recover from anomalies and TPS throughput?

Further to the above questions we would also like to understand how many replicas converging to a specific history (i.e. consensus) would be needed for it to carry on existing, and in particular:

  1. Can a fragmented network carry on writing to its known history2?
  2. Is an approach designed to ‘heal’ any discrepancies in the immutable history of transactions by rewarding the longest fork both fair and efficient?
  3. Are the deterrents strong enough to prevent a group of ledgers from forming their own fork3 that eventually reaches a wider adoption?

Furthermore, a new requirement to comply with the General Data Protection Regulation (GDPR) in Europe and ‘the right to be forgotten’ introduced new challenges to the prospect of keeping permanent and immutable traces indefinitely. This is important because fines for breach of GDPR are potentially very significant. The solutions introduced so far effectively aim at anonymising the information that enters the immutable on-chain storage process, while sensitive information is stored separately in support databases where this information can be deleted if required. None of these approaches has yet been tested by the courts, nor has it been established what the GDPR ‘right to be forgotten’ means in practice.

The challenging aspect here is to decide upfront what is considered sensitive and what can safely be placed on the immutable history. A wrong upfront choice can backfire at a later stage in the event that any involved actor manages to extract or trace sensitive information through the immutable history.

Immutability represents one of the fundamental principles that motivates the research into blockchain technology, both private and public. The solutions explored so far have managed to provide a satisfactory response to the market needs via the introduction of history linearisation techniques, one-way hashing encryptions, Merkle trees and off-chain storage, although the linearity of the immutable history4 comes at a cost (notably transaction volume).

2. Decentralisation of the control

During the aftermath of the 2008 global financial crisis (one interpretation of which is that it highlighted the global financial disasters that could occur from over centralisation and the misalignment of economic incentives) there arose a deep mistrust of ‘traditional’, centralised institutions, political and commercial. One reaction against such centralisation was the exploration of various decentralised mechanisms which could replace those traditional, centralised structures. The proposition that individuals operating in a social context ideally would like to enjoy the freedom to be independent from a central authority gained in popularity. Self determination, democratic fairness and heterogeneity as a form of wealth are among the dominant values broadly recognised in Western (and, increasingly, non-Western) society. These values added weight to the movement that introducing decentralisation in a system is positive. I have rarely seen this very idea being challenged at all (in the same way one rarely hears criticism of the proposition that ‘information wants to be free’), although in some circumstances the unwanted consequences of using this approach can clearly be seen.

For instance, one might argue that it is only due to our habits that we normally resolve anomalies in a system by contacting a central authority, which, according to our implicit or explicit contractual terms, bears the responsibility for what happens to the system. Therefore, in the event of damage incurred through a miscarried transaction, which might be caused by a system failure or a fraudulent actor, we are typically inclined to contact the central bearer of the responsibility to intervene and try to resolve the damage sustained. Human history is characterised by the evolution of hierarchical power structures, even in the most ‘democratic’ of societies, and these hierarchies naturally create centralisation, independent of the dominant political structure in any particular society. This characteristic continues into the early 21st century.

With decentralisation, however, there is no such central authority that could resolve those issues for us. Traditional, centralised systems have well developed anti-fraud and asset recovery mechanisms which people have become used to. Using new, decentralised technology places a far greater responsibility on the user if they are to receive all of the benefits of the technology, forcing them to take additional precautions when it comes to handling and storing their digital assets. In particular they need to keep the access to their digital wallets protected and make sure they don’t lose it. Similarly when performing transactions, such as giving away a digital asset to a friend or a relative, they have to make sure it is sent to the right address/wallet, otherwise it will be effectively lost or mistakenly handed over to someone else.

Also, there’s no point having an ultra-secure blockchain if one then hands over one’s wallet private key to an intermediary (more ‘centralisation’ again) whose security is lax: it’s like having the most secure safe in the world then writing the combination on a whiteboard in the same room.

Is the increased level of personal responsibility that goes with the proper implementation of a secure blockchain a price that users are willing to pay; or will they trade off some security in exchange for ease of use (and, by definition, more centralisation)? It’s too early to see how this might pan out. If people are willing to make compromises here, what other compromises re, say, security or centralisation would they be prepared to accept in exchange for lower cost/ease of use? We don’t know, as there’s no secure blockchain ecosystem yet operating at scale.

Another threat to the broad endorsement and success of the decentralisation principle is posed by governmental regulatory/legal pressures on digital assets and ecosystems. This is to ensure that individuals do not use the blockchain for tax evasion and that the ownership of their digital assets is somehow protected. However any attempt to regulate this market from a central point is undermining the effort to promote the adoption of a decentralised form of authority.

3. Consensus

The consistent push towards decentralised forms of control and responsibility has brought to light the fundamental requirement to validate transactions without the need for (or intervention of) a central authority; this is known as the ‘consensus’ problem and a number of approaches have grown out of the blockchain industry, some competing and some complementary.

There has also been a significant focus around the concept of governance within a blockchain ecosystem. This concerns the need to regulate the rates at which new blocks are added to the chain and the associated rewards for miners (in the case of blockchains using proof of work (POW) consensus methodologies). More generally, it is important to create incentives and deterrent mechanisms whereby interested actors contribute positively to the healthy continuation of the chain growth.

Besides serving as economic deterrent against denial of service and spam attacks, POW approaches are amongst the first attempts to automatically work out, via the use of computational power, which ledgers/actors have the authority to create/mine new blocks5. Other similar approaches (proof of space, proof of bandwidth etc) followed, however, they all suffered from exposure to deviations from the intended fair distribution of control. Wealthy participants can in fact exploit these approaches to gain an advantage via purchasing high performance (CPU / memory / network bandwidth) dedicated hardware in large quantity and operating it in jurisdictions where electricity is relatively cheap. This results in overtaking the competition to obtain the reward, and the authority to mine new blocks, which has the inherent effect of centralising the control. Also the huge energy consumption that comes with the inefficient nature of the competitive race to mine new blocks in POW consensus mechanisms has raised concerns about its environmental impact and economic sustainability. The most recent report on the energy usage of Bitcoin can be seen here on digiconomist.

Proof of Stake (POS) and Proof of Importance (POI) are among the ideas introduced to drive consensus via the use of more social parameters, rather than computing resources. These two approaches link the authority to the accumulated digital asset/currency wealth or the measured productivity of the involved participants. Implementing POS and POI mechanisms, whilst guarding against the concentration of power/wealth, poses not insubstantial challenges for their architects and developers.

More recently, semi-automatic approaches, driven by a human-curated group of ledgers, are putting in place solutions to overcome the limitations and arguable fairness of the above strategies. The Delegated Proof of Stake (DPOS) and Proof of Authority (POA) methods promise higher throughput and lower energy consumption, while the human element can ensure a more adaptive and flexible response to potential deviations caused by malicious actors attempting to exploit a vulnerability of the system.

Whether these solutions actually manage to fulfill the inherent requirements, set by the principle of control distribution, is debatable. Similarly, when it comes to the idea of bringing in another layer of human driven consensus for the curation, it is clear that we are abandoning a degree of automation. Valuing trust and reputation in the way authority gets exercised on the governance of a particular blockchain appears to bring back a form of centralised control, which clearly goes against the original intention and the blockchain ethos.

This appears to be one of the areas where, despite the initial enthusiasm, there is not yet a clear solution that drives consensus in a fair, sustainable and automated manner. Moving forward I expect this to be the main research focus for the players in the industry, as current solutions leave many observers unimpressed. There will likely emerge a number of differing approaches, each suitable for particular classes of use case.

4. Distribution and resilience

Apart from decentralising the authority, control and governance, blockchain solutions typically embrace a distributed Peer to Peer (P2P) design paradigm. This preference is motivated by the inherent resilience and flexibility that these types of networks have introduced and demonstrated, particularly in the context of file and data sharing (see a brief p2p history here). The diagram below is frequently used to explain the difference among three network topologies: centralised, decentralised and distributed.

A centralised network, typical of mainframes and centralised services6, is clearly exposed to a ‘single point of failure’ vulnerability as the operations are always routed towards a central node. In the event that the central node breaks down or is congested, all the other nodes will be affected by disruptions.

Decentralised and distributed networks attempt to reduce the detrimental effects that issues occurring on a node might trigger on other nodes. In a decentralised network, the failure of a node can still affect several neighbouring nodes that rely on it to carry out their operations. In a distributed network the idea is that failure of a single node should not impact significantly any other node. In fact, even when one preferential/optimal route in the network becomes congested or breaks down entirely, a message can reach the destination via an alternative route. This greatly increases the chances to keep a service available in the event of failure or malicious attacks such as a denial of service (DOS) attack.

Blockchain networks where a distributed topology is combined with a high redundancy of ledgers backing a history have occasionally been declared “unhackable” by enthusiasts or, as some more prudent debaters say, “difficult to hack”. There is truth in this, especially when it comes to very large networks such as Bitcoin (see an additional explanation here). In such a highly distributed network, the resources needed to generate a significant disruption are very high, which not only delivers on the resilience requirement, but also works as a deterrent against malicious attacks (principally because the cost of conducting a successful malicious attack becomes prohibitive).

Although a distributed topology can provide an effective response to failures or traffic spikes, we need to be aware that delivering resilience against prolonged over-capacity demands or malicious attacks requires adequate adaptation mechanisms. While the Bitcoin network is well positioned, as it currently benefits from a high-capacity condition (due to the historically high incentive for third-party miners7 to purchase hardware), this is not the case for other emerging networks as they grow in popularity. This is where novel instruments, capable of delivering preemptive adaptation combined with back-pressure throttling applied at the P2P level, can be of great value.

Distributed systems are not new and, whilst they provide highly robust solutions to many enterprises and governmental problems, they are subject to the laws of physics and require their architects to consider the trade-offs that need to be made in their design and implementation (e.g. consistency vs availability). This remains the case for blockchain systems.

5. Automation

In order to sustain a coherent, fair and consistent blockchain and its surrounding ecosystem a high degree of automation is required. Existing areas with a high demand of automation include those common to most distributed systems. For instance; deployment, elastic topologies, monitoring, recovery from anomalies, testing, continuous integration, and continuous delivery. In the context of blockchains, these represent well-established IT engineering practices. Additionally, there is a creative R&D effort to automate the interactions required to handle assets, computational resources and users across a range of new problem spaces (e.g. logistics, digital asset creation and trading etc).

The way parties interact has seen a significant shift towards scripted transactional operations. This is where ‘Smart Contracts’ and constrained virtual machine (VM) interpreters have emerged - an effort pioneered by the Ethereum project.

The possibility of defining through scripting how to operate an asset exchange, under what conditions and actioned by which triggers, has attracted many blockchain enthusiasts. Some of the most common applications of Smart Contracts involve lotteries, the trading of digital assets and derivative trading. While there is clearly an exciting potential unleashed by the introduction of Smart Contracts, it is also true that this is still an area with a high entry barrier. Only skilled developers who are willing to invest time in learning Domain Specific Languages (DSLs)8 have access to the actual creation and modification of these contracts.

Besides, developers can create contracts that contain errors or are incapable of operating under unexpected conditions. This can happen, for instance, when the implementation of a contract is commissioned to a developer who does not have sufficient domain knowledge. Although the industry is taking steps in the right direction, there is still a long way to go in order to automatically adapt to unforeseen conditions and create effective Smart Contracts in non-trivial use cases. The challenge is to respond to safety and security concerns when Smart Contracts are applied to edge case scenarios that deviate from the ‘happy path’. If badly-designed contracts cannot properly roll back/undo a miscarried transaction, their execution might lead to assets being lost or erroneously handed over to unwanted receivers. A number of organisations are conducting R&D efforts to respond to these known issues and introduce VMs operating under more restrictive constraints to deliver a higher level of safety and security. However, from a conceptual perspective, this is a restriction that reduces the flexibility to implement specific needs.

Another area in high need for automation is governance. Any blockchain ecosystem of users and computing resources requires periodic configurations of the parameters to carry on operating coherently and consensually. This results in a complex exercise of tuning for incentives and deterrents to guarantee the fulfilment of ambitious collaborative and decentralised goals. The newly emerging field of ‘blockchain economics’ (combining economics; game theory; social science and other disciplines) remains in its infancy.

Clearly the removal of a central ruling authority produces a vacuum that needs to be filled by an adequate decision-making body, which is typically supplied with an automation that maintains a combination of static and dynamic configuration settings. Those consensus solutions referred to earlier, which use computational resources or stakeable social assets to assign the authority not only to produce blocks but also to steer the variable part of governance9, initially succeeded in filling the decision-making gap in a fair and automated way. Subsequently, the exploitation of flaws in the static element of governance has hindered the success of these models. This has contributed to the rise in popularity of curated approaches such as POA or DPOS, which not only bring back centralised control, but also reduce the automation of governance.

I expect this to be one of the major areas where blockchain has to evolve in order to achieve widespread market adoption.

6. Transparency and Trust

In order to produce the desired audience engagement for a blockchain and eventually determine its mass adoption and success, its consensus and governance mechanisms need to operate transparently. Users need to know who has access to what data, so that they can decide what can be stored and possibly shared on-chain. These are the contractual terms by which users agree to share their data. As previously discussed, users might need to exercise the right for their data to be deleted, which is typically a feature delivered via auxiliary, ‘off-chain’ databases. In contrast, only hashed information, effectively devoid of its meaning, is preserved permanently on-chain.

Given the immutable nature of the chain history, it is important to decide upfront what data should be permanently written on-chain and what gets written off-chain. The users should be made aware of what data gets stored on-chain and with whom it could potentially be shared. Changing access to on-chain data or deleting it goes against the fundamentals of the immutability and therefore is almost impossible. Getting that decision wrong at the outset can significantly affect the cost and usability (and therefore likely adoption) of the blockchain in question.

Besides transparency, trust is another critical feature that users legitimately seek. This is one of the reasons why blockchain manages to attract customers who have developed distrust in traditional centrally-ruled networks (most notably banks and rating organisations after the mishandling of the subprime mortgages that led to the 2008 financial crisis). It is vital, therefore, that central operating bodies in the shape of curated POA/DPOS consensus act in a transparent and trustworthy manner, so that decision makers are not perceived as an elite that could pursue their own goals, instead of operating in the interest of the collective.

Trust in the people involved also becomes relevant when the blockchain information is linked to the real world. As will be clarified in the context of the next principle (see heading 7), this link involves people and technology dedicated to guaranteeing the accurate preservation of these links against environmental deterioration or fraudulent misuse.

Trust also has to go beyond the scope of the people involved, as systems need to be trusted as well. Every static element, such as an encryption algorithm, the dependency on a library, or a fixed configuration, is potentially exposed to vulnerabilities. Concerns here are understandable given the increasing number of well-publicised hacks and security breaches that have occurred over the last couple of years. In the cryptocurrency space, these attacks have frequently resulted in the perpetrators successfully walking away with large sums of money, without leaving traces that could realistically be used to track down their identity, or to roll back to a previous healthy state. In the non-crypto space they have resulted in widespread hacks of data and the disabling of corporate and civil IT systems.

In some circumstances digital wallets were targeted, as the users might not have stored the access keys in a sufficiently secure place10, and in other circumstances the blockchain itself. This is the case with the ‘DAO attack’ on Ethereum or the so-called ‘51% attacks’ on Bitcoin.

Blockchain enthusiasts tend to forget that immutability is only preserved by having a sufficient number of ledgers backing a history. Theoretically, in the event that a genuine history gets overruled by a large group of ledgers interested in backing a different history, we could fall into a lack-of-consensus situation, which leads to a fork of the chain. If supported by a large enough group of ledgers11, the most popular fork could eclipse the genuine minor fork, making it effectively irrelevant. This is not a reason to be excessively alarmed; it should just be a consideration to be aware of when we put our trust in a system. The Bitcoin network, for instance, is currently backed by such a vast number of ledgers that it is impractical for anyone to hack.

Similar considerations need to be made for state-of-the-art encryption, which deters brute force attacks based on the current cost and availability of computational power. Should new technologies such as quantum computing emerge, it is expected that even that encryption would need an upgrade.

7. Link to the external world

The attractive features that blockchain has brought to the internet market would be limited to handling digital assets, unless there were a way to link information to the real world. For some reason this brings back to my mind the popular movie “The Matrix” and the philosophical question from René Descartes about what is real and whether there is another reality behind the perceived one. Without indulging excessively in arguable analogies, it is safe to say that there would be less interest if we were to accept that a blockchain can only operate within the restrictive boundaries of the digital world, without connecting to the analogue real world in which we live.

Technologies used to overcome these limitations include cyber-physical devices such as sensors for input and robotic actuators for output, and in most circumstances, people and organisations. As we read through most blockchain whitepapers we might occasionally come across the notion of the oracle, which, in short, is a name for an input coming from a trusted external source that could potentially trigger/activate a sequence of transactions in a Smart Contract, or which can otherwise be used to validate some information that cannot be validated within the blockchain itself.

Under the Transparency and Trust section we discussed how these peripheral areas are as susceptible to errors and malicious interference as the core blockchain and Smart Contract automations. For instance, suppose we are tracking the history of a precious collectible item: unless we have the capacity to reliably identify the physical object in question, we risk its history being lost or attributed to another object.

A remarkable example of an object identification approach via physical characteristics is the one used by Everledger to track diamonds. As briefly explained, 40 metadata attributes can be extracted from the object, including the cut, the clarity, the colour, etc. Given the quasi-immutable nature of the objects in question, this has proved a particularly successful use case. It is more challenging, instead, to accurately identify objects that degrade over time (e.g. fine art). In this case typically trusted people are involved, or a combination of people and sensor metadata. Relying on expert actors to provide external validation depends upon a properly aligned set of incentives. ‘Traditional’ industries have a long history of dealing with these issues and well-developed mechanisms for dealing with them: blockchain-based solutions are exploring many ways in which they can interface with (and adapt) these existing mechanisms.

From a different perspective, even wealth itself only makes sense if it can be exercised in the ‘real world’. This is as valid for blockchain as for traditional centrally-controlled systems, and in fact the idea of virtualising wealth is not new to us: we long ago shifted from trading time and goods directly to storing wealth in account deposits. What’s new in this respect is the introduction of alternative currencies known as cryptocurrencies.

Bitcoin and Ethereum, the two dominant projects (in September 2018) in the blockchain space, are seen by many investors as an opportunity to diversify a portfolio or speculate on the value of their respective cryptocurrency. The same applies to a wide range of other cryptocurrencies, with the exception of fiat-pegged currencies, most notably Tether, where the value is effectively bound to the US dollar. Conversion from one cryptocurrency to another and to/from fiat currencies is typically operated by exchanges on behalf of an investor. These are again peripheral services that serve as a link to the external world.

From a blockchain perspective, some argue that the risk control a cryptocurrency investor demands - the possibility of stepping back and converting back to fiat currencies - is the wrong mindset. Understandably, however, evangelising trust in cryptocurrency diversification requires an implementation period during which investors might legitimately want to exercise the option to pull out.

Besides oracles and cyber-physical links, interest is emerging in linking Smart Contracts together to deliver a comprehensive solution. Contracts could indeed operate in a cross-chain scenario to offer interoperability among a variety of digital assets and protocols. Although attempts to combine different protocols and approaches have emerged (e.g. see the EEA and the Accord Project), this is still an area where further R&D is necessary in order to provide enough instruments and guarantees to developers and entrepreneurs. The challenge is to deliver cross-chain functionalities without the support of a central governing agency/body.

Conclusion

The financial boost that the blockchain market has benefited from, through the large fundraising operations conducted via the 2017-2018 initial coin offerings (ICOs), has given the industry the opportunity to conduct a substantial amount of R&D aimed at finding solutions to the challenges described in this blogpost. Although step changes have been introduced by the work of visionary individuals with mathematical backgrounds such as S. Nakamoto and V. Buterin, it seems clear that contributions need to come from a variety of areas of expertise to ensure a successful development of the technology and the surrounding ecosystem. It is an exercise that involves evangelisation, social psychology, automation, regulatory compliance and ethical direction.

The challenges the market is facing can be interpreted as an opportunity for ambitious entrepreneurs to step in and resolve the pending fundamental issues of consensus, governance, automation and link to the real world. It is important in this sense to ensure that the people you work with have the competence, the technology and a good understanding of the issues to be resolved, in order to successfully drive this research forward (see Erlang Solutions offering here).

As discussed in the introduction, this blogpost only focuses on assessing the state of advancement of the technology in response to the motivating principles, and analyses known issues without proposing solutions. That said, I and the people I work with are actively working on solutions that will be presented and discussed in a separate forum. If you are interested in learning more, stay tuned for the next chapter :-)

Footnotes

1. One-way in this context means that you cannot retrieve the original information from a hash value; hashing also frequently involves the destruction of information, potentially leading to rare collisions (read more on perfect hash functions).

2. In the event of a split brain, a typical consensus policy is to allow fragmented networks to carry on writing blocks on their known chain and once the connection is re-established the longest chain part (also known as fork) is preferred while the shortest is deleted. Note this could lead to a temporary double spending scenario, a fundamental ‘no-go’ in the blockchain (and ‘traditional’) world.

3. Note that in this case the very concept of ownership of an asset can be threatened. E.g. if I remain the only retainer of a history that is valid, while the rest of the world has moved on in disagreement with it, my valid history gets effectively invalidated. History is indeed written by the victors!

4. Blockchain stores its blocks in a linear form although there have been attempts to introduce graph fork tolerant approaches such as the IOTA tangle.

5. Mining is the process of adding a block of transactions to the chain (see here for instructions and here for more context).

6. Historically a central server was the only economically viable option given the high cost of performant hardware and cheaper cost of terminals.

7. Bitcoin’s POW rewards a successful miner with a prize in bitcoin cryptocurrency. This is no longer an incentive in some countries. See here a report released in May about the cost to mine 1 bitcoin per country.

8. For example Solidity is a popular DSL inspired by JavaScript.

9. Once discovered and understood, malicious users can take advantage of any static / immutable configuration. For instance, rich/resourceful individuals could gain an advantage via the purchase of a vast amount of computational resources to bend the POW fairness.

10. A common guideline is to split the access key into parts and store each in a different cryptovault.

11. The Bitcoin network’s estimated computing power at the time of writing (September 2018) amounts to 80704290 petaflops, while the world’s most powerful supercomputer reaches 200 petaflops. This obviously only allows us to infer, by analogy, the magnitude of the number of ledgers on the network.

Permalink

Is Elixir a scripting language?

Elixir is known for being a language made for building distributed applications that scale, are massively concurrent, and have self-healing properties. All of these adjectives paint Elixir in a grandiose light. And for good reasons!

But is Elixir also a language that can be used for the more mundane tasks of this world like scripting? I think the answer is a definite yes.

To see this, let’s take a look at all the ways we can write scripts using Elixir. We’ll build using the same example, going from simple to more complex solutions.

Defining the script

Our script will simply create a new markdown file with today’s date so we can write our To Do list. The structure of the directories will be YYYY/MM/DD.md, so we will nest each day under a month and each month under a year.

# Get today's date
date = Date.utc_today()
year = Integer.to_string(date.year)
month = Integer.to_string(date.month)
day = Integer.to_string(date.day)

# Generate the month's full path with YYYY/MM format
month_path =
  File.cwd!()
  |> Path.join(year)
  |> Path.join(month)

# Create the month and year directories
File.mkdir_p(month_path)

# Generate the filename with today's date
filename = Path.join(month_path, day) <> ".md"

# Check existence so we don't override a file
unless File.exists?(filename) do
  # include sample header
  header = """
  ---
  date: #{Date.to_string(date)}
  ---

  To Do
  =====

  - What do you need to accomplish today?
  """

  # write to the file
  File.open(filename, [:write], fn file ->
    IO.write(file, header)
  end)
end

# Print out confirmation message
final_message = """

> Created #{Path.relative_to_cwd(filename)}
"""

IO.puts(final_message)

Excellent! That is the full extent of our script. Let’s now see how we can run it.

elixir todo.exs

The first and perhaps simplest way to run an Elixir script is just to run elixir name_of_file.exs from the shell. So let’s do that.

  1. Save the above code in a file named todo.exs
  2. Run elixir todo.exs
  3. You should see the following message (with the date you’re running this script instead of the date I ran it):
$ elixir todo.exs

> Created 2018/9/28.md

If we inspect the file that it created, we can confirm that the template is there,

---
date: 2018-09-28
---

To Do
=====

- What do you need to accomplish today?

bin/todo

A second way to run the script is to make it into an executable. This is perhaps an extension of the one above, but I include it as a separate step because I think it could prove useful in some projects.

Follow these steps:

  1. Move todo.exs under a bin/ directory (not required but nice)
  2. Mark it as executable with chmod: chmod +x bin/todo.exs
  3. Add a shebang #! /usr/bin/env elixir at the top of your file
  4. Rename it to bin/todo

And run bin/todo!
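
For reference, the top of bin/todo might look roughly like this (a sketch; only the first lines of the script are repeated here):

#! /usr/bin/env elixir

# bin/todo - the shebang must be the very first line; below it goes the
# same code as todo.exs (only the first lines are shown here)
date = Date.utc_today()
year = Integer.to_string(date.year)
# ...rest of the script unchanged...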

If you’re unfamiliar with using chmod and shebang to turn a file into an executable, take a look at the notes in this let’s build a CLI video.

When would I use these?

Our todo.exs and bin/todo scripts are examples of standalone scripts. They are useful when you want to run a task that is self-contained. bin/todo has the benefit that the user of the script need not know that the script is running Elixir (though it still needs to have Elixir installed). It’s just a script like any other one you may find in your bin/ folder.

mix run todo.exs

Now if you’re working with Elixir, it is very likely you are already working in a mix project. If that is the case, mix allows you to run arbitrary scripts via mix run [name of file].

To test this, let’s go ahead and create a new project called tasker and run our script there.

Follow these steps:

  1. mix new tasker,
  2. cd into the tasker directory,
  3. make a scripts/ directory,
  4. copy the todo.exs file from the first section into scripts/todo.exs

Now run mix run scripts/todo.exs!

When would I use this?

The benefit of running this script via mix (as opposed to the previous two options) is that the script is part of your project, so it has access to all the code you have defined in the project.

Let’s bring that point home by extracting most of the logic to a TodoBuilder module:

# lib/todo_builder.ex
defmodule TodoBuilder do
  def run(date) do
    year = Integer.to_string(date.year)
    month = Integer.to_string(date.month)
    day = Integer.to_string(date.day)

    month_path =
      File.cwd!()
      |> Path.join(year)
      |> Path.join(month)

    File.mkdir_p(month_path)

    filename = Path.join(month_path, day) <> ".md"

    unless File.exists?(filename) do
      header = """
      ---
      date: #{Date.to_string(date)}
      ---

      To Do
      =====

      - What do you need to accomplish today?
      """

      File.open(filename, [:write], fn file ->
        IO.write(file, header)
      end)
    end

    {:ok, filename}
  end
end

# scripts/todo.exs

{:ok, filename} =
  Date.utc_today()
  |> TodoBuilder.run()

final_message = """

> Created #{Path.relative_to_cwd(filename)}
"""

IO.puts(final_message)

In the wild

If you’re interested in seeing this “in the wild”, Phoenix seeds data in their applications by running mix run priv/repo/seeds.exs.

mix tasker.todo

Having it as a script that can run with our project code is nice. But sometimes we want to make it a more explicit part of our project. Turning our script into a mix task can do just that. This is especially true if we expect external parties to use our project since mix tasks have the extra benefit of documentation!

Let’s change our script into a mix task. In the tasker project,

  1. Create a lib/mix/tasks/ directory
  2. Create a todo.ex task
  3. Move the code we have in scripts/todo.exs into that file
  4. Add use Mix.Task at the top of the module
  5. Add a @shortdoc and @moduledoc with descriptions of what the task does
# lib/mix/tasks/todo.ex
defmodule Mix.Tasks.Todo do
  use Mix.Task

  @shortdoc "Creates a new todo file with today's date"

  @moduledoc """
  Creates a new todo file with today's date

  ## Example

  mix todo
  """

  def run(_args) do
    {:ok, filename} =
      Date.utc_today()
      |> TodoBuilder.run()

    final_message = """

    > Created #{Path.relative_to_cwd(filename)}
    """

    Mix.shell().info(final_message)
  end
end

Now run mix todo and voila!

But that’s not all. Note the use of @shortdoc and @moduledoc. The documentation that we added in those two module attributes makes our script especially friendly to other users.

If you check mix help, you’ll see that our task is listed right after mix test, and it uses @shortdoc for its documentation.

[Image: output of mix help, listing the todo task with its @shortdoc]

Now check out mix help todo. It uses your @moduledoc for documentation!

[Image: output of mix help todo, showing the task’s @moduledoc]

In the wild

If you’re interested on how this gets used in the wild, take a look at how Phoenix and Ecto use mix tasks for commonly performed actions and for their generators. Things such as mix phx.new, mix phx.server, and mix ecto.migrate are all mix tasks!

./tasker escript

Wow, it’s been a long road but we’re finally down to the last way we can script in Elixir (that I know of).

One of the built-in mix tasks that comes with mix is mix escript.build. It packages your project and dependencies into a binary that is executable. It even embeds Elixir as part of the script, so it can be used so long as a machine has Erlang/OTP without requiring Elixir to be present.

Let’s get to it. We have to first update our mix.exs file to define an entry point for the escript.build task,

# mix.exs
def project do
  [
    app: :tasker,
    # other options
    #
    # add this line below
    escript: escript()
  ]
end

# add the entry point
defp escript do
  [main_module: Tasker.TodoCLI]
end

Now let’s create Tasker.TodoCLI and put our code there,

# lib/tasker/todo_cli.ex
defmodule Tasker.TodoCLI do
  def main(_args) do
    {:ok, filename} =
      Date.utc_today()
      |> TodoBuilder.run()

    final_message = """

    > Created #{Path.relative_to_cwd(filename)}
    """

    IO.puts(final_message)
  end
end

Now we can simply run mix escript.build to build the executable. And you should see that you have a new executable called tasker in the main directory. Run it with ./tasker and see the magic happen!

When would I use this?

Much like a mix task, an escript has the ability to use the rest of your project’s codebase. But it has that ability because the project and its dependencies are compiled and packaged in. That means you can use it outside of your mix project. And since Elixir is embedded, all you need is to have Erlang/OTP installed. So it makes our little script into an executable that is easily shareable with others!

To test that, feel free to move the ./tasker script out of your mix project into another directory and run it. You’ll see it does its work just fine!

Anything else?

We covered a lot of ground, and we saw that Elixir has great tooling for creating scripts. But I would be remiss if I didn’t include a mention of the OptionParser module. It’s a little module that helps parse command line options from something like this,

--flag true --option arg1

into a keyword list like this,

[flag: true, option: "arg1"]

So if you’re writing Elixir scripts, it’s sure to come in handy!
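
As a small illustration (the switch names here are just examples), parsing that command line could look like this:

argv = ["--flag", "true", "--option", "arg1"]

# :boolean switches also pick up a literal "true"/"false" that follows them
{opts, args, invalid} =
  OptionParser.parse(argv, strict: [flag: :boolean, option: :string])

IO.inspect(opts)    # => [flag: true, option: "arg1"]
IO.inspect(args)    # => []
IO.inspect(invalid) # => []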

Permalink

10 Lessons from a Decade with Erlang

The year was 2008. I had a steady job as a .NET developer. Then I read an ad from a company that was looking for developers with knowledge of Erlang… or functional programming in general, and I applied.

I had learned a bit of Haskell in college and I loved it but I was not even remotely close to having experience in functional programming. Nevertheless, something told me that it was the right path.

Totally unprepared but ready to just improvise and see, I arrived at Novamens for an interview. I met Juanjo there but, more importantly, I met Erlang!

And that moment changed my life.

Erlang was celebrating just 10 years of open source back in 2008 :’)

Ten years later…

10 years after those events, I have learned a few lessons that I want to share with you.

Notice, though, that these lessons are things that helped me to write better code; they should not be taken as strict rules to follow. As my mentor Hernán likes to call them, these are heuristics. They’re likely to help you in your quest to build the best Erlang system ever, but more as a reference or guideline. In some scenarios, you’ll certainly need to bend and even break them entirely…

Higher-order Constructs

Erlang as a language is pretty simple, with few types, few keywords and a set of very basic operations. Those are the building blocks of huge systems and you should totally learn and understand them well.

But you should also learn to build abstractions on top of that. Think of things like higher-order functions, list comprehensions, OTP behaviors, libraries like sumo_rest and others. They all encapsulate shared knowledge that makes your life as a developer easier by removing the repetitive parts and letting you focus only on the specific stuff you need just for your system.

Things like message passing with ! and receive, recursion over lists, parsing XML manually, etc. should be used sparingly in large systems. You should instead use the proper libraries, frameworks or syntax (e.g. OTP, list comprehensions, xmerl, etc.). And if you find yourself writing similar things over and over again, you should consider abstracting those generic pieces into a library.

Use higher-order constructs (libraries, frameworks, tools) instead of building everything from scratch. If there is no higher-order construct yet, build one.
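
As a rough illustration (my example, not from the article), compare hand-rolled recursion with the higher-order constructs the standard library already gives you:

-module(doubling).
-export([by_hand/1, with_map/1, with_lc/1]).

%% Low-level building block: explicit recursion over the list.
by_hand([]) -> [];
by_hand([H | T]) -> [H * 2 | by_hand(T)].

%% Higher-order function from the standard library.
with_map(List) -> lists:map(fun(X) -> X * 2 end, List).

%% List comprehension: the same idea, even more compact.
with_lc(List) -> [X * 2 || X <- List].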

Find out more about this in this article I wrote on the Erlang Solutions blog.

Opaque Data Structures

Software development (as Hernán describes it in Spanish) can be seen as the process of building computable models of reality, particularly in the early stages of development when you’re designing your system.

Those models include representations of the entities that exist in the real world. In OOP one would use objects for that. In other functional languages, like Haskell, we would use types. But Erlang has a pretty narrow set of types and, in principle, you are not allowed to define your own.

So, what to do? You have to combine those types to represent your entities (for instance using tagged tuples, records, etc.). But that gets messy pretty quickly.

That’s why I recommend using Opaque Data Structures, instead. ODSs are modules with opaque exported types and all the logic needed to manage them. They expose a functional interface for others to consume without worrying about the internal representation of the types.

Use Opaque Data Structures to represent your entities.
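
A minimal sketch of the idea (the module and names are mine, not from the talks): the internal representation is hidden behind an exported -opaque type and a small functional interface.

-module(temperature).
-export([celsius/1, to_fahrenheit/1]).
-export_type([t/0]).

%% Consumers only ever see the opaque type t(), never the map underneath.
-opaque t() :: #{unit := celsius, value := float()}.

%% Constructor: callers never build the internal representation themselves.
-spec celsius(number()) -> t().
celsius(Value) -> #{unit => celsius, value => Value * 1.0}.

%% Accessor: consumers stay independent of how the value is stored.
-spec to_fahrenheit(t()) -> float().
to_fahrenheit(#{unit := celsius, value := Value}) -> Value * 9 / 5 + 32.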

Learn more on this topic in these two talks I gave (one at EFLBA2017 and the other at CodeBEAM SF 2018)…


Test Driven Development

This is not particular to Erlang, TDD is a great methodology to create software in general. I will not go over its virtues here, but I will say that Erlang makes working with TDD very very easy.

For instance, here is an example of an assignment I provide my students when teaching them recursion:

-module my_lists.
-export [test/0].
test() ->
    [] = my_lists:run_length([]),
    [{a,1}] = my_lists:run_length([a]),
    [{a,1}, {b,1}] = my_lists:run_length([a, b]),
    [{a,2}] = my_lists:run_length([a, a]),
    [{a,2}, {b,1}, {a,1}] = my_lists:run_length([a, a, b, a]),
    ok.

That module compiles (thanks to Erlang’s dynamic nature, the compiler won’t blame me for not having a run_length/1 function defined there), but when you try to run it…

1> c(my_lists).
{ok,my_lists}
2> my_lists:test().
** exception error: undefined function my_lists:run_length/1
in function my_lists:test/0 (my_lists.erl, line 4)
3>

There you go, in pure TDD fashion, I’m prompted to define run_length/1 now.

See how easy it is? And for more complex systems you have tools like Common Test that work exactly like the test/0 function above, using pattern matching to determine whether a test passes or fails.

In my mind, there is no excuse not to work this way when building systems in Erlang.

Develop your systems incrementally using Test Driven Development.

Meta-Testing

Meta-Testing, as the Inakos d̶e̶f̶i̶n̶e̶d̶ borrowed from Hernán, is the practice of writing tests to validate particular properties of your code instead of its behavior. In other words, the idea is to check your code with tools like dialyzer, xref, elvis, etc. as part of your tests or continuous integration processes.

If you start using dialyzer, xref or elvis once your project is mature… you’ll have to spend a lot of time trying to detangle the cryptic meaning of dialyzer warnings. And don’t forget…

Dialyzer is never wrong. A warning emitted by dialyzer means there is a bug somewhere.

Dialyzer may not warn you about some problems, but if it emits a warning (confusing as it may be) that means you have an issue, somewhere. Maybe it’s not where the warning is reported, but you do have something to fix.

Now, deciphering what dialyzer found when you run it for the first time on a codebase with tens of thousands of lines of code can be challenging. But, if you run dialyzer on your code from the very first day, and you keep your code warning-free, whenever you get a warning it can only be something you just changed and that’s far easier to debug.

Use dialyzer, xref, and elvis in your projects constantly and consistently.
Start using those tools as soon as you start developing your system.

Katana Test (as explained in the link above) will make that extremely easy for you if you use common test. You just need to add a suite like the one below and that’s it. In fact, this one is usually the first suite I add to all my projects.

-module(your_meta_SUITE).
-include_lib("mixer/include/mixer.hrl").
-mixin([ktn_meta_SUITE]).
-export([init_per_suite/1, end_per_suite/1]).
init_per_suite(Config) -> [{application, your_app} | Config].
end_per_suite(_) -> ok.

Test Speed

Large systems tend to have even larger batteries of tests. As good practices go, you generally run all those tests at least once for every pull request and/or before every deploy.

That’s all good, but if left unattended, those large batteries of tests will start making your everyday development cycle longer and longer. The goal of test completeness (i.e. covering as much functionality of your system as possible with tests) should be balanced against test speed. And that is not an easy thing to do.

Keep your tests running smoothly and fast.

In this article you’ll find some useful techniques to achieve that balance.
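
For instance, one simple way to claw back wall-clock time is to let Common Test run independent cases concurrently in a parallel group; a minimal sketch with made-up case names:

-module(users_SUITE).
-export([all/0, groups/0]).
-export([creates_user/1, lists_users/1]).

all() -> [{group, independent}].

%% Cases in a parallel group run concurrently, so the group takes roughly
%% as long as its slowest case instead of the sum of all of them.
groups() -> [{independent, [parallel], [creates_user, lists_users]}].

creates_user(_Config) -> ok.
lists_users(_Config) -> ok.

This only pays off, of course, for cases that don’t share mutable state or external resources.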

Behaviors

Behaviors live at the core of OTP, yet time and again people struggle with them. If you come from OOP land, somebody has probably already told you that behaviors are like interfaces.

While that’s generally true, it hides a lot of complexity and it sometimes leads to some false beliefs that will needlessly complicate your code.

Invest time in understanding how the behaviors you use work, and how to define and use your own.

Behaviors are great, they’re very powerful and yet extremely simple. You can learn more about them and unlock their whole potential with these articles I wrote a while back: Erlang Behaviors… and how to behave around them.
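
To give a sense of how simple they are, here is a made-up behavior definition next to the skeleton of a module implementing it:

%% my_worker.erl: the behavior module only declares the contract its callback modules must fulfil.
-module(my_worker).
-callback init(Args :: term()) -> {ok, State :: term()}.
-callback handle_task(Task :: term(), State :: term()) -> {ok, NewState :: term()}.

%% email_worker.erl: a callback module declares the behavior and exports the callbacks;
%% the compiler will warn if any of them is missing.
-module(email_worker).
-behaviour(my_worker).
-export([init/1, handle_task/2]).

init(_Args) -> {ok, #{sent => 0}}.
handle_task(_Email, State = #{sent := Sent}) -> {ok, State#{sent := Sent + 1}}.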

Tools

Erlang/OTP comes with many useful but somewhat hidden gems that will help you in your everyday life as an Erlang developer and boost your productivity. You should learn how to use them.

From .erlang and user_default to dbg, xref, and observer. Have you checked the sys module? What about erlang:system_info/1 or et? There are many hidden gems that can make your life easier.

Learn about all the tools that Erlang/OTP already provides to work better and avoid reinventing the wheel.
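
A few of them take a single line to try out in any shell; a hedged sample (outputs omitted, and they will obviously vary per node):

1> erlang:system_info(schedulers).     %% how many scheduler threads this node runs
2> erlang:system_info(otp_release).    %% which OTP release the node is on
3> observer:start().                   %% GUI with process, application and ETS views
4> sys:get_state(whereis(kernel_sup)). %% peek into the state of a running OTP process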

You can find 10 of these things in this article I wrote for Pluralsight.

No Debugging

Coming from the OOP world, one of the first things I tried to use when I started working with Erlang was the debugger. Turns out, that’s not a great idea when you’re dealing with a concurrent and distributed programming language.

Don’t get me wrong, the debugger works really well and it’s very powerful, but it will mostly just help you debug sequential code. Debugging big systems and applications with it is… cumbersome at best.

Do not debug, inspect and trace instead.

On the other hand, Erlang/OTP comes with tons of tools to make tracing and inspecting systems easier. You can find several of them in this article by Dimitris Zorbas. And on top of the ones provided by Erlang/OTP itself, you have amazing libraries like redbug and recon.
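
As a tiny sketch of the plain-OTP route, dbg alone can show every call to a function you suspect, complete with arguments and return values (lists:sort/1 below is just a stand-in for your own code):

1> dbg:tracer().              %% start a tracer that prints trace messages to the shell
2> dbg:p(all, call).          %% enable call tracing in every process
3> dbg:tp(lists, sort, 1, x). %% trace lists:sort/1; x also shows return values and exceptions
4> lists:sort([3,1,2]).       %% from now on, every matching call is printed as it happens
5> dbg:stop_clear().          %% always remove the trace patterns when you are done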

Engage with the Community

This is easily the most common piece of advice I give to anybody who asks me about Erlang: Engage with the Community.

The community is not huge (we know each other by first name, as Monika Coles pointed out not so long ago), but it’s proactive and always helpful.

When you start working in Erlang it’s not uncommon to feel lost; many things are new, and others are simply unfamiliar to someone who has never used the language before. Since the main benefits of Erlang are experienced when you build large systems, the initial steps can be challenging. Everyone in the community knows that (we’ve all been through it) and we’re here to help.

Join the community and don’t hesitate to ask for help.

Join the mailing lists. Join the Erlang Slack. Find us on IRC. Find a meetup near you.

Have fun!

The most important lesson of all: the fact that you might end up building massive, highly reliable backend systems with Erlang (and those seem really serious to me) doesn’t mean you can’t just have fun with it!

Erlang, being designed with fault tolerance and concurrency in mind from day 0, allows you to worry less about those things and more about the interesting aspects of what you’re building. That’s The Zen of Erlang.

Enjoy your time working with this amazing language!

Besides that, if you just want to play with Erlang, you can test your knowledge of the language with BeamOlympics or challenge your friends to a game of Serpents.

Or… you can build a team and win some awesome prizes at SpawnFest, too!

Thank you!

Finally, I want to use this chance to thank everybody who joined me, pushed me, and guided me along this path. In no particular order…

📝

Do you want to contribute to Erlang-Battleground? We’re accepting writers! Sign up for a Medium account and get in touch with me (Brujo Benavides) or join the Inaka Community.

☕️

As usual, you can buy me a coffee.

You are also invited to join the Inaka Community.


10 Lessons from a Decade with Erlang was originally published in Erlang Battleground on Medium, where people are continuing the conversation by highlighting and responding to this story.

Permalink

TLS logging improvements in OTP 22

Erlang/OTP 22 will be an important release for the ssl application. We are working on several new features and improvements, such as support for TLS 1.3; some of those are already on the master branch. This blog post presents the new ssl debug logging built on the new logger API.

Usage

As the ssl application is undergoing a lot of changes, the release of the new logger API presented the opportunity to level up its debug logging capabilities to be on par with OpenSSL.

We have introduced a new option, log_level, that specifies the log level for the ssl application. It can take the following values (ordered by increasing verbosity): emergency, alert, critical, error, warning, notice, info and debug. At verbosity level notice and above, TLS error reports are displayed. The level debug triggers verbose logging of TLS protocol messages in a similar style to OpenSSL.

Verbose debug logging can be turned on in two simple steps: set log_level to debug and configure logger to enable debug logging for the ssl application. The following code snippet is a sample module with a simple TLS server and client:

-module(ssltest).

-compile(export_all).

-define(PORT, 11000).

server() ->
    application:load(ssl),
    logger:set_application_level(ssl, debug),
    {ok, _} = application:ensure_all_started(ssl),
    Port = ?PORT,
    LOpts = [{certfile, "server.pem"},
             {keyfile, "server.key"},
             {versions, ['tlsv1.2']},
             {log_level, debug}
            ],
    {ok, LSock} = ssl:listen(Port, LOpts),
    {ok, CSock} = ssl:transport_accept(LSock),
    {ok, _} = ssl:handshake(CSock).

client() ->
    application:load(ssl),
    logger:set_application_level(ssl, debug),
    {ok, _} = application:ensure_all_started(ssl),
    Port = ?PORT,
    COpts = [{verify, verify_peer},
             {cacertfile, "ca.pem"},
             {versions, ['tlsv1.2']},
             {log_level, debug}
            ],
    {ok, Sock} = ssl:connect("localhost", Port, COpts).

Starting the server and the client in their respective Erlang shells produces the following verbose logging of TLS protocol messages:

1> ssltest:server().
reading (238 bytes) TLS 1.2 Record Protocol, handshake
0000 - 16 03 03 00 e9 01 00 00  e5 03 03 5b ab 42 7a ee    ...........[.Bz.
0010 - 91 23 df 70 30 fb 41 b9  c5 14 79 d7 02 48 74 c9    .#.p0.A...y..Ht.
0020 - b9 a9 8f e0 e9 04 1a f9  a8 21 49 00 00 4a 00 ff    .........!I..J..
0030 - c0 2c c0 30 c0 24 c0 28  c0 2e c0 32 c0 26 c0 2a    .,.0.$.(...2.&.*
0040 - 00 9f 00 a3 00 6b 00 6a  c0 2b c0 2f c0 23 c0 27    .....k.j.+./.#.'
0050 - c0 2d c0 31 c0 25 c0 29  00 9e 00 a2 00 67 00 40    .-.1.%.).....g.@
0060 - c0 0a c0 14 00 39 00 38  c0 05 c0 0f c0 09 c0 13    .....9.8........
0070 - 00 33 00 32 c0 04 c0 0e  01 00 00 72 00 00 00 0e    .3.2.......r....
0080 - 00 0c 00 00 09 6c 6f 63  61 6c 68 6f 73 74 00 0a    .....localhost..
0090 - 00 3a 00 38 00 0e 00 0d  00 19 00 1c 00 0b 00 0c    .:.8............
00a0 - 00 1b 00 18 00 09 00 0a  00 1a 00 16 00 17 00 08    ................
00b0 - 00 06 00 07 00 14 00 15  00 04 00 05 00 12 00 13    ................
00c0 - 00 01 00 02 00 03 00 0f  00 10 00 11 00 0b 00 02    ................
00d0 - 01 00 00 0d 00 18 00 16  06 03 06 01 05 03 05 01    ................
00e0 - 04 03 04 01 03 03 03 01  02 03 02 01 02 02          ..............
<<< TLS 1.2 Handshake, ClientHello
[{client_version,{3,3}},
 {random,
     <<91,171,66,122,238,145,35,223,112,48,251,65,185,197,20,121,215,2,72,116,
       201,185,169,143,224,233,4,26,249,168,33,73>>},
 {session_id,<<>>},
 {cipher_suites,
     [<<0,255>>,
      <<"À,">>,<<"À0">>,<<"À$">>,<<"À(">>,<<"À.">>,<<"À2">>,<<"À&">>,<<"À*">>,
      <<0,159>>,
      <<0,163>>,
      <<0,107>>,
      <<0,106>>,
      <<"À+">>,<<"À/">>,<<"À#">>,<<"À'">>,<<"À-">>,<<"À1">>,<<"À%">>,<<"À)">>,
      <<0,158>>,
      <<0,162>>,
      <<0,103>>,
      <<0,64>>,
      <<"À\n">>,
      <<192,20>>,
      <<0,57>>,
      <<0,56>>,
      <<192,5>>,
      <<192,15>>,
      <<"À\t">>,
      <<192,19>>,
      <<0,51>>,
      <<0,50>>,
      <<192,4>>,
      <<192,14>>]},
 {compression_methods,[0]},
...
[Truncated for brevity]

This is not the final format, as there are many ways to further improve the representation of the handshake protocol messages, such as converting the cipher suites to a human-readable Erlang representation.

For comparison, this is the debug output from an OpenSSL server when the same Erlang client connects to it:

$ /usr/bin/openssl s_server -accept 11000 -tls1_2 -cert server.pem -key server.key -msg -debug
Using default temp DH parameters
ACCEPT
read from 0x16f0040 [0x16f56b3] (5 bytes => 5 (0x5))
0000 - 16 03 03 00 a1                                    .....
<<< ??? [length 0005]
    16 03 03 00 a1
read from 0x16f0040 [0x16f56b8] (161 bytes => 161 (0xA1))
0000 - 01 00 00 9d 03 03 5b ac-a1 cc 20 4c 4d 52 d0 d4   ......[... LMR..
0010 - c8 fc dd 95 b0 fa 65 97-57 9e 44 aa dd 0e 46 10   ......e.W.D...F.
0020 - 6c 14 57 9c ce a0 00 00-04 00 ff c0 14 01 00 00   l.W.............
0030 - 70 00 2b 00 06 00 04 03-04 03 03 00 00 00 0e 00   p.+.............
0040 - 0c 00 00 09 6c 6f 63 61-6c 68 6f 73 74 00 0a 00   ....localhost...
0050 - 3a 00 38 00 0e 00 0d 00-19 00 1c 00 0b 00 0c 00   :.8.............
0060 - 1b 00 18 00 09 00 0a 00-1a 00 16 00 17 00 08 00   ................
0070 - 06 00 07 00 14 00 15 00-04 00 05 00 12 00 13 00   ................
0080 - 01 00 02 00 03 00 0f 00-10 00 11 00 0b 00 02 01   ................
0090 - 00 00 32 00 04 00 02 02-03 00 0d 00 04 00 02 02   ..2.............
00a0 - 01                                                .
<<< TLS 1.2 Handshake [length 00a1], ClientHello
    01 00 00 9d 03 03 5b ac a1 cc 20 4c 4d 52 d0 d4
    c8 fc dd 95 b0 fa 65 97 57 9e 44 aa dd 0e 46 10
    6c 14 57 9c ce a0 00 00 04 00 ff c0 14 01 00 00
    70 00 2b 00 06 00 04 03 04 03 03 00 00 00 0e 00
    0c 00 00 09 6c 6f 63 61 6c 68 6f 73 74 00 0a 00
    3a 00 38 00 0e 00 0d 00 19 00 1c 00 0b 00 0c 00
    1b 00 18 00 09 00 0a 00 1a 00 16 00 17 00 08 00
    06 00 07 00 14 00 15 00 04 00 05 00 12 00 13 00
    01 00 02 00 03 00 0f 00 10 00 11 00 0b 00 02 01
    00 00 32 00 04 00 02 02 03 00 0d 00 04 00 02 02
    01
...
[Truncated for brevity]

The verbose debug logging proved to be especially useful during the development of new extensions, as previously we had to use Wireshark captures to validate TLS protocol messages.

Implementation

In the ssl application, we needed a way to handle two types of protocol messages, tls_record and handshake, each with a custom formatter.

The most straightforward solution was to add a new handler instance to the logger with a special formatter function that filters out all the “noise” coming from other modules of the system.

The handler itself can reuse the standard logger handler, logger_std_h, as it can print logs to standard_io. You can add multiple standard handler instances to logger if your application requires it.

logger:add_handler(ssl_handler, logger_std_h, Config),

The new ssl_handler is configured with a formatter that is implemented by the ssl_logger module.

Config = #{level => debug,
           filter_default => stop,
           formatter => {ssl_logger, #{}}},

The handler filter level is set to debug, with stop as the default filter action. We also need a filter that lets log events pass to the formatter if the source of the event is the ssl application. In other words, we need a domain filter with the action log on all sub-domains matching [otp,ssl].

Filter = {fun logger_filters:domain/2,{log,sub,[otp,ssl]}},

Putting it all together we get the following function.

start_logger() ->
    Config = #{level => debug,
               filter_default => stop,
               formatter => {ssl_logger, #{}}},
    Filter = {fun logger_filters:domain/2,{log,sub,[otp,ssl]}},
    logger:add_handler(ssl_handler, logger_std_h, Config),
    logger:add_handler_filter(ssl_handler, filter_non_ssl, Filter).

The function format is called in ssl_logger when an event gets through all the filters:

format(#{level:= _Level, msg:= {report, Msg}, meta:= _Meta},
       _Config0) ->
     #{direction := Direction,
       protocol := Protocol,
       message := BinMsg0} = Msg,
    case Protocol of
        'tls_record' ->
            BinMsg = lists:flatten(BinMsg0),
            format_tls_record(Direction, BinMsg);
        'handshake' ->
            format_handshake(Direction, BinMsg0);
        _Other ->
            []
    end.

There are two more helper functions that wrap the logging macros. They were added to make it possible to set the logging level per TLS session.

debug(Level, Report, Meta) ->
    case logger:compare_levels(Level, debug) of
        lt ->
            ?LOG_DEBUG(Report, Meta);
        eq ->
            ?LOG_DEBUG(Report, Meta);
        _ ->
            ok
    end.

notice(Level, Report) ->
    case logger:compare_levels(Level, notice) of
        lt ->
            ?LOG_NOTICE(Report);
        eq ->
            ?LOG_NOTICE(Report);
        _ ->
            ok
    end.

To print a log event, the above functions are called with the configured ssl log level and the domain parameter.

ssl_logger:debug(Opts#ssl_options.log_level,
                 Report,
                 #{domain => [otp,ssl,handshake]}),

Those who are interested in the current state of development can already play with the 'tlsv1.3' atom in the versions option.

Permalink

Cowboy 2.5

Cowboy 2.5.0 has been released! Cowboy 2.5 focused on making the test suites pass. It is now possible to get all the Cowboy tests to pass successfully, at least on Linux and on the more recent Erlang/OTP versions. HTTP/1.1 has been improved with a fix for the TCP reset problem and the ability to stream a response body without using chunked transfer-encoding. Two functions have been added: cowboy_req:stream_events/3 encodes and streams one or more text/event-stream events, and cowboy_req:read_and_match_urlencoded_body/2,3 reads, parses and matches application/x-www-form-urlencoded request bodies.
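
As a rough, hedged sketch of how the new body-matching helper might be used in a handler (the username field and handler shape are made up; field specifications follow the same format as cowboy_req:match_qs/2):

init(Req0, State) ->
    %% Reads an application/x-www-form-urlencoded body and matches the
    %% fields we care about in a single call (hypothetical field name).
    {ok, #{username := Username}, Req1} =
        cowboy_req:read_and_match_urlencoded_body([username], Req0),
    Req = cowboy_req:reply(200, #{<<"content-type">> => <<"text/plain">>},
                           Username, Req1),
    {ok, Req, State}.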

Permalink

Copyright © 2016, Planet Erlang. No rights reserved.
Planet Erlang is maintained by Proctor.