Over a year ago, I left the role of software engineer* behind to become a site reliability engineer* at Honeycomb.io. Since then, I've been writing a bunch of blog posts over there rather than over here, including the following:
There are also a couple of incident reviews, including one on a Kafka migration and another on a spate of scaling-related incidents.
Either way, I only have so many things to rant about to fill two blogs, so this place has been a bit calmer. However, I recently gave a talk at IRConf (video).
I am reproducing this talk here because, well, I stand behind the content, but it would also not fit the work blog's format. I'm also taking this opportunity because I don't know how many talks I'll give in the next few years. I've decided to limit how much traveling I do for conferences due to environmental concerns—if you see me at a conference, it either overlapped with other opportunistic trips for work or vacation, or it was close enough for me to attend via less polluting means—and so I'd like to still post some of the interesting talks I give when I can.
This talk is first of all about the idea that errors are not objective truths. Even if we look at objective facts with a lot of care, errors are arbitrary interpretations that we come up with, constructions that depend on the perspective we have. Think of them the same way constellations in the night sky are made up of real stars, but their shape and meaning are made up based on our point of view and what their shapes remind us of.
The other thing this talk will be about is ideas about what we can do once we accept this idea, to figure out the sort of changes and benefits we can get from our post-incident process when we adjust to it.
I tend to enjoy incidents a lot. Most of the things in this talk aren't original ideas, they're things I've read and learned from smarter, more experienced people, and that I've put back together after digesting them for a long time. In fact, I thought my title for this talk was clever, but as I found out by accident a few days ago, it's an almost pure paraphrasing of a quote in a book I've read over 3 years ago. So I can't properly give attribution for all these ideas because I don't know where they're from anymore, and I'm sorry about that.
This is a quote from "'Those found responsible have been sacked': some observations on the usefulness of error" that I'm using because even if errors are arbitrary constructions, they carry meaning, and they are useful to organizations. The paper defines four types I'll be paraphrasing:
So generally, error is useful as a concept, but as an investigator it is particularly useful as a signal to tell you when things get interesting, not as an explanation on their own.
And so this sort of makes me think about how a lot of incident reviews tend to go. We use the incident as an opportunity because the disruption is large enough to let us think about it all. But the natural framing that comes easily is to lay blame on the operational area.
Here I don't mean blame as in "people fucked up" nearly as much as "where do we think the organisation needs to improve"—where do we think that as a group we need to improve as a result of this. The incident and the operations are the surface; they often need improvement, for sure, because operations are really tricky work done in special circumstances and worth constantly adjusting, but stopping there misses out on a lot of possible content that could be useful.
People doing the operations are more or less thrown into a context where a lot of big decisions have been made already. Whatever was tested, who was hired, what the budgets are: all these sorts of pressures are in large part defined by the rest of the organization, and they matter as well. They set the context in which operations take place.
So one question then, is how do we go from that surface-level vision, and start figuring out what happens below that organisational waterline.
The first step of almost any incident investigation is to start with a timeline. Something that lets us go back from the incident or its resolution, and that we use as breadcrumbs to lead us towards ways to prevent this sort of thing from happening again. So we start at the bottom where things go boom, and walk backwards from there.
The usual frameworks we're familiar with apply labels to these common patterns. We'll call something a failure or an error. The thing that happened right before it tends to be called a proximate cause, a term frequently used in insurance situations: it's the last event in the whole chain that could have prevented the failure from happening. Then we walk back, either five times because we're doing the five whys, or until we land at a convenient place. If there is a mechanical or software component we don't like, we're likely to highlight its flaws there. If it's people or teams we don't trust as much, we may find human error there.
Even the concept of steady state is a shaky one. Large systems are always in some weird degraded state. In short, you find what you're looking for. The labels we use, the lens through which we look at the incident, influence the way we build our explanations.
The overall system is not any of the specific lenses, though, it's a whole set of interactions. To get a fuller, richer picture, we have to account for what things looked like at the time, not just our hindsight-fuelled vision when looking back. There are a lot of things happening concurrently, a lot of decisions made to avoid bad situations that never took place, and some that did.
Hindsight bias is somewhat similar to outcome bias, which essentially says that because we know there was a failure, every decision we look at that took place before the incident will seem to us like it should obviously have appeared risky and wrong. That's because we know the result, and it affects our judgment. But when people were going down that path and deciding what to do, they were trying to do a good job; they were making the best calls they could to the extent of their abilities and the information available at the time.
We can't really avoid hindsight bias, but we can be aware of it. One tip there is to look at what was available at the time, and consider the signals that were available to people. If they made a decision that looks weird, then look for what made it look better than the alternatives back then.
Counterfactuals are another thing to avoid, and they're one of the trickiest ones to eliminate from incident reviews. They essentially are suppositions about things that have never happened and will never happen: whenever we say "oh, if we had done this at this point in time instead of that, then the incident would have been prevented," we're talking about a fictional universe that never existed, and that's not productive.
I find it useful to always cast these comments into the future: "next time this happens, we should try that to prevent an issue." This orients the discussion towards more realistic means: how can we make this option more likely? the bad one less interesting? In many cases, a suggestion will even become useless: by changing something else in the system, a given scenario may no longer be a concern for the future, or it may highlight how a possible fix would in fact create more confusion.
Finally, normative judgments. Those are often close to counterfactuals, but you can spot them because they tend to be about what people should or shouldn't have done, often around procedures or questions of competence. "The engineer should have checked the output more carefully, and they shouldn't have run the command without checking with someone else, as stated in the procedure." Well they did because it arguably looked reasonable at the time!
The risk with a normative judgment is that it assumes that the established procedure is correct and realistically applicable to the situation at hand. It assumes that deviations and adjustments made by the responder are bound to fail, while conveniently ignoring all the times they work. We can't properly correct procedures if we think they're already perfect and it's wrong not to obey them, and we can't improve tooling if we believe the problem is always the person holding it wrong.
A key factor is to understand that in high-pressure incident response, failures and successes use the same mechanisms. We're often tired, distracted, or possibly thrown in there without adequate preparation. The same things we do to try and make things go right, and through which we often succeed, are also in play when things go wrong.
People look for signals, and have a background and training that influence the tough calls, and these usually will be shared across situations. We tend to want things to go right. The outcome tends to define whether the decision was a good one or not, but the decision-making mechanism is shared both by decisions that go well and those that do not. And so we need to look at how these decisions are made with the best of intentions to have any chance of improving how events unfold the next time.
This leads to the idea that you want to look at what's not visible, because that's where the real work shows.
I say this is "real work" because we come into a task with an understanding of things, a sort of mental model. That mental model is the rolled-up experience we have; it lets us frame all the events we encounter, and it is the thing we use to predict the consequences of our decisions.
When we are in an incident, there's almost always a surprise in there, which means that the world and our mental model are clashing. This mismatch between our understanding of the world and the real world was already there. That gap needs to be closed, and the big holes in an incident's timeline tend to be one of the places where this takes place.
Whenever someone reports "nothing relevant happened here", these are generally the places where active hypothesis generation happens, where a lot of the repair and gap-bridging takes place.
This is where the incident can become a very interesting window into the whole organizational iceberg below the water line.
So looking back at the iceberg, looking at how decisions are made in the moment lets you glimpse at the values below the waterline that are in play. What are people looking at, how are they making their decisions. What's their perspective. These are impacted by everything else that happens before.
If you see concurrent outages or multiple systems impacted, digging into which one gets resolved first and why that is is likely to give you insights about what responders feel confident about, the things they believe are more important to the organization and users. They can reflect values and priorities.
If you look at who they ask help from and where they look for information (or avoid looking for it), this will let you know about various dynamics, social and otherwise, that might be going on in your organization. This can be because some people are central points of knowledge, others are jerks, seen as more or less competent, or also be about what people believe the state of documentation is at that point in time.
And this is why changing how we look at and construct errors matters. If we take the straightforward causal approach, we'll tend to only skim the surface. Looking at how people do their jobs and how they make decisions is an effective way to go below that waterline, and have a much broader impact than staying above water.
To take a proper dive, it helps to ask the proper type of questions. As a facilitator, your job is to listen to what people tell you, but there are ways to prompt for more useful information. The Etsy debriefing facilitation guide is a great source, and so is Jeli's Howie guide. The slide contains some of the questions I like to ask most.
There's one story I recall from a previous job where a team had written an incident report on an outage of some service X; the report had that sort of 30-minute gap in it, and they were asking for feedback on it. I instantly asked "so what was going on during this time?" only for someone on the team to answer "oh, we were looking for the dashboard of service Y". I asked why they had been looking at the dashboard of another service, and he said that the service's own dashboard isn't trustworthy, and that this one gave a better picture of the health of the service through its effects. And just like that we opened new paths for improvement in something so normal it had become invisible.
Another one also came from a previous job where an engineer kept accidentally deleting production databases and triggering a whole disaster recovery response. They were initially trying to delete a staging database that was dynamically generated for test cases, but kept fat-fingering the removal of production instances in the AWS console. Other engineers were getting mad and felt that person was being incompetent, and were planning to remove all of their AWS console permissions because there also existed an admin tool that did the same thing safely by segmenting environments.
I ended up asking the engineer if there was anything that made them choose the AWS console over the admin tool given the difference in safety, and they said, quite simply, that the AWS console has an autocomplete and they never remembered the exact table name, so it was just much faster to delete the table there than in the admin tool. This was an interesting one because instead of blaming the engineer for being incompetent, it opened the door to questioning the gap in tooling rather than adding more blockers and procedures.
In both of these stories, a focus on how people were making their decisions and their direct work experience managed to highlight alternative views that wouldn't have come up otherwise. They can generate new, different insights and action items.
And this is the sort of map that, when I have time for it, I try to generate at Honeycomb. It's non-linear, and the main objective is to help show different patterns about the response. Rather than building a map by categorizing events within a structure, the idea is to lay the events around to see what sort of structure pops up. And then we can walk through the timeline and ask what we were thinking, feeling, or seeing.
The objective is to highlight challenging bits and look at the way we work in a new light. Are there things we trust, distrust? Procedures that don't work well, bits where we feel lost? Focusing on these can improve response in the future.
This idea of focusing on generating rather than categorizing is intended to take an approach that is closer to qualitative research than quantitative research.
The way we structure our reviews will have a large impact on how we construct errors. I tend to favour a qualitative approach to a quantitative one.
A quantitative approach will often look at ways to aggregate data, and create ways to compare one incident to the next. They'll measure things such as the Mean Time To Recovery (MTTR), the impact, the severity, and will look to assign costs and various classifications. This approach will be good to highlight trends and patterns across events, but as far as I can tell, they won't necessarily provide a solid path for practical improvements for any of the issues found.
The qualitative approach by comparison aims to do a deep dive to provide more complete understanding. It tends to be more observational and generative. Instead of cutting up the incidents and classifying its various parts, we look at what was challenging, what are the things people felt were significant during the incident, and all sorts of messy details. These will highlight tricky dynamics, both for high-pace outages and wider organizational practices, and are generally behind the insights that help change things effectively.
To put this difference in context, I have an example from a prior job, where one of my first mandates was to try and help with their reliability story. We went over 30 or so incident reports that had been written over the previous year, and a pattern that quickly came up was how many reports mentioned "lack of tests" (or lack of good tests) as causes, and had "adding tests" in action items.
By looking at the overall list, our initial diagnosis was that testing practices were challenging. We thought of improving the ergonomics around tests (making them faster) and to also provide training in better ways to test. But then we had another incident where the review reported tests as an issue, so I decided to jump in.
I reached out to the engineers in question and asked what made them feel like they had enough tests. I said that we often write tests up until the point where we feel they're not adding much anymore, and that I was wondering what they were looking at, what made them feel like they had reached that point. They just told me directly that they knew they didn't have enough tests. In fact, they knew that the code was buggy. But they felt in general that it was safer to be on time with a broken project than late with a working one. They were afraid that being late would put them in trouble and have someone yell at them for not doing a good job.
And so that revealed a much larger pattern within the organization and its culture. When I went up to upper management, they absolutely believed that engineers were empowered and should feel safe pressing a big red button that stopped feature work if they thought their code wasn't ready. The engineers on that team felt that while this is what they were being told, in practice they'd still get in trouble.
There's no amount of test training that would fix this sort of issue. The engineers knew they didn't have enough tests and they were making that tradeoff willingly.
So to conclude on this, the focus should be on understanding the mess:
Overall, the idea is that looking for understanding more than causes opens up a lot of doors and makes incidents more valuable.
* I can't legally call myself a software engineer, and technically neither can I be a site reliability engineer, because Quebec considers engineering to be a protected discipline. I, however, do not really get to tell American employers what job titles they should give to people, so I get stuck having titles I can't legally advertise but for which no real non-protected equivalents exist. So anywhere you see me referred to as any sort of "engineer", that's not a title I would officially choose. It'd be nice to know what the non-engineer-titled equivalent of SRE ought to be.
Erlang/OTP 25-rc3 is the third and final release candidate before the OTP 25.0 release.
The intention with this release is to get feedback from our users. All feedback is welcome, even if it is only to say that it works for you. We encourage users to try it out and give us feedback either by creating an issue here https://github.com/erlang/otp/issues or by posting to Erlangforums or the mailing list erlang-questions@erlang.org.
All artifacts for the release can be downloaded from the Erlang/OTP Github release and you can view the new documentation at https://erlang.org/documentation/doc-13.0-rc3/doc/.
You can also install the latest release using kerl like this: kerl build 25.0-rc3 25.0-rc3.
Erlang/OTP 25 is a new major release with new features, improvements as well as a few incompatibilities. Some of the new features are highlighted below.
Many thanks to all contributors!
Below are some highlights of the release:
- Change format of feature options and directives for better consistency. Options to erlc and the -compile(..) directive now have the format {feature, feature-name, enable | disable}. The -feature(..) directive now has the format -feature(feature-name, enable | disable).
- A new (still experimental) ssl option, {certs_keys, [cert_key_conf()]}. With this, a list of certificates with their associated keys may be used to authenticate the client or the server. The certificate-key pair that is considered best and matches the negotiated parameters for the connection will be selected.
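As a sketch of how the experimental certs_keys option might be passed (the port and file names below are placeholders, and since the option is still experimental its exact shape may change):

```erlang
%% Hypothetical sketch: offer both an RSA and an ECDSA
%% certificate/key pair; the pair best matching the negotiated
%% parameters is selected. File names are placeholders.
{ok, ListenSocket} =
    ssl:listen(8443, [
        {certs_keys, [#{certfile => "rsa_cert.pem",   keyfile => "rsa_key.pem"},
                      #{certfile => "ecdsa_cert.pem", keyfile => "ecdsa_key.pem"}]}
    ]).
```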
- filelib:ensure_path/1 will ensure that all directories for the given path exist.
- New functions groups_from_list/2 and groups_from_list/3 in the maps module.
- New functions uniq/1 and uniq/2 in the lists module.
- New, still experimental, support for features as described in EEP-60. Features can be enabled/disabled during compilation with options (ordinary and +term) to erlc as well as with directives in the file. Similar options can be used to erl for enabling/disabling features allowed at runtime. The new maybe expression (EEP-49) is fully supported as the feature maybe_expr.
- Improved support for the external tools perf and gdb, allowing them to show line numbers and even the original Erlang source code when that can be found.
- Users can now configure ETS tables with the {write_concurrency, auto} option. This option forces tables to automatically change the number of locks that are used at run-time depending on how much concurrency is detected. The {decentralized_counters, true} option is enabled by default when {write_concurrency, auto} is active. Benchmark results comparing this option with the other ETS optimization options are available here: benchmarks.
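A quick shell sketch of the stdlib additions above; the expected results follow these functions' documented behavior:

```erlang
%% Group elements by the key computed by the fun.
maps:groups_from_list(fun(X) -> X rem 2 end, [1, 2, 3]).
%% => #{0 => [2], 1 => [1, 3]}

%% Drop duplicates, keeping the first occurrence of each element.
lists:uniq([3, 3, 1, 2, 1, 2, 3]).
%% => [3, 1, 2]

%% Create any directories in the path that do not exist yet.
filelib:ensure_path("a/b/c").
%% => ok
```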
- Reception of messages by processes with message_queue_data=off_heap has been optimized to allow parallel reception of signals from multiple processes. This can improve performance when many processes are sending in parallel to one process. See benchmark.
- A new option, short, has been added to the functions erlang:float_to_list/2 and erlang:float_to_binary/2. This option creates the shortest correctly rounded string representation of the given float that can be converted back to the same float again.
- New quote/1 and unquote/1 functions in the uri_string module: a replacement for the deprecated functions http_uri:encode and http_uri:decode.
- The new module peer supersedes the slave module. The slave module is now deprecated and will be removed in OTP 27.
- global will now by default prevent overlapping partitions due to network issues. This is done by actively disconnecting from nodes that report that they have lost connections to other nodes. This will cause fully connected partitions to form instead of leaving the network in a state with overlapping partitions. It is possible to turn off the new behavior by setting the kernel configuration parameter prevent_overlapping_partitions to false. Doing this will retain the same behavior as in OTP 24 and earlier.
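The effect of the short float-formatting option described above can be seen in the shell:

```erlang
%% Default formatting round-trips, but is verbose.
float_to_list(0.1).
%% => "1.00000000000000005551e-01"

%% short produces the shortest string that still converts
%% back to exactly the same float.
float_to_list(0.1, [short]).
%% => "0.1"
```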
- The format_status/2 callback for gen_server, gen_statem and gen_event has been deprecated in favor of the new format_status/1 callback. The new callback adds the possibility to limit and change many more things than just the state.
- The timer module has been modernized and made more efficient, which makes the timer server less susceptible to being overloaded. The timer:sleep/1 function now accepts an arbitrarily large integer.
- The maybe ... end construction as proposed in EEP-49 has been implemented. It can simplify complex code where otherwise deeply nested cases would have to be used. To enable maybe, give the option {enable_feature,maybe_expr} to the compiler. The exact option to use will change in a coming release candidate, and then it will also be possible to enable it from inside the module being compiled.
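A sketch of what maybe ... end enables once the maybe_expr feature is turned on; connect/0, write/2 and close/1 are hypothetical helpers returning ok/error tuples:

```erlang
%% Each ?= match must succeed for evaluation to continue; a
%% failed match makes the whole maybe expression return the
%% unmatched value instead of raising a badmatch error.
commit(Changes) ->
    maybe
        {ok, Conn} ?= connect(),
        {ok, _Bytes} ?= write(Conn, Changes),
        ok ?= close(Conn)
    end.
```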
- When a record matching or record update fails, a {badrecord, ExpectedRecordTag} exception used to be raised. In this release, the exception has been changed to {badrecord, ActualValue}, where ActualValue is the value that was found instead of the expected record.
- A new compile attribute, -nifs(), to empower the compiler and loader with information about which functions may be overridden as NIFs by erlang:load_nif/2.
- Improved and more detailed error messages when binary construction with the binary syntax fails. This applies both for error messages in the shell and for erl_error:format_exception/3,4.
- A new function, crypto:hash_equals/2, which is a constant-time comparison of hash values.
- Dialyzer: optimized operations in the erl_types module, and parallelized the Dialyzer pass remote.
- Added the missing_return and extra_return options to raise warnings when specifications differ from inferred types. These are similar to, but not quite as verbose as, overspecs and underspecs.
- Tightened the specifications of min/2, max/2, and erlang:raise/3. Because of that, Dialyzer can potentially generate new warnings. In particular, functions that use erlang:raise/3 could now need a spec with a no_return() return type to avoid an unwanted warning.

For more details about new features and potential incompatibilities, see the release notes.
Variable syntax is one of the big differences between Erlang and Elixir that I encountered when learning Elixir. Instead of having to start each variable name with an uppercase letter, a lowercase letter must be used. This change in syntax seems like an improvement - after all, most mainstream programming languages require variables to start with a lowercase letter, and lowercase is generally easier to type. However, a deeper look at this syntax choice reveals some significant downsides that I want to present here.
The example I’m using to illustrate this problem is from a blog post on variable shadowing in Elixir by Michael Stalker.
defmodule Shadowing do
  x = 5
  def x, do: x
  def x(x = 0), do: x
  def x(x), do: x(x - 1)
end
Without running the code, tell me what the return values of these three function calls are:
Shadowing.x()
Shadowing.x(0)
Shadowing.x(2)
No, really. Think about the code for a minute.
Now…are you positive your answers are right?
This code snippet is confusing because the variable names and function names are indistinguishable from each other. This is an ambiguity in scope and also an ambiguity in identifier type. It's not clear whether the token x is the function name (an atom) or a variable (identified by the same sequence of characters). Both are identifiers, but unlike in Erlang, function identifiers and variable identifiers look the same. Despite this, the compiler doesn't get confused and handles this code according to Elixir's scoping rules.
I translated the Elixir code above to Erlang. The functions in this Erlang module behave the exact same as the functions in the Elixir module above.
-module(shadowing).
-export([x/0, x/1]).
-define(X, 5).
x() -> x().
x(X) when X == 0 -> X;
x(X) -> x(X - 1).
With Erlang, all the ambiguity is gone. We now have functions and variables that cannot be confused. All variables start with uppercase letters, and all function names start with a lowercase letter or are wrapped in single quotes. This makes it impossible to confuse the two. Granted, this is not an apples-to-apples comparison because Erlang doesn't have a module scope for variables, so I used a macro for the module-level variable. But we still have a function and a variable that can no longer be confused.
Despite it’s rough edges Erlang syntax is unambiguous. This is a key advantage Erlang has over Elixir when it comes to syntax. Variables, functions, and all other data types are easily distinguishable. Keywords can be confused with other atoms but this is seldom a problem in practice. The list of keywords is short and easy to memorize but syntax highlighters highlight them in a specific color making memorization unnecessary most of the time.
Erlang/OTP 25-rc2 is the second release candidate of three before the OTP 25.0 release.
The intention with this release is to get feedback from our users. All feedback is welcome, even if it is only to say that it works for you. We encourage users to try it out and give us feedback either by creating an issue here https://github.com/erlang/otp/issues or by posting to Erlangforums or the mailing list erlang-questions@erlang.org.
All artifacts for the release can be downloaded from the Erlang/OTP Github release and you can view the new documentation at https://erlang.org/documentation/doc-13.0-rc2/doc/.
You can also install the latest release using kerl like this: kerl build 25.0-rc2 25.0-rc2.
Erlang/OTP 25 is a new major release with new features, improvements as well as a few incompatibilities. Some of the new features are highlighted below.
Many thanks to all contributors!
Erlang/OTP 24.3 is the third and final maintenance patch package for OTP 24, with mostly bug fixes as well as a few improvements.
Below are some highlights of the release:
The crypto app in OTP can now be compiled, linked and used with the new OpenSSL 3.0 cryptolib. It has not yet been extensively tested, so it is only recommended for experiments and alpha testing in this release. There are not yet any guarantees that it works, not even together with other OTP applications such as SSL and SSH, although there are no known errors.

Support for socket:sockaddr_in() and socket:sockaddr_in6() when using gen_sctp, gen_tcp and gen_udp makes it possible to use link-local IPv6 addresses.

For more details and downloads follow this link
The Erlang/OTP source can also be found at GitHub on the official Erlang repository, https://github.com/erlang/otp
This is post number 2 in an ongoing series called Foundations of QAnal.
The goal for today is for you to understand
This is going to be brief (ish). I want to talk about cool stuff like rational geometry. But first we have to talk about this.
This is (mostly) tedium. Most of this is known to every single math major (the stuff that isn’t is the weird details that the computability constraint imposes).
This is something you go over once in a “Foundations of Analysis” course (oh wait that’s what this is…), you get it, you move on, and you never worry about it again.
However, non-mathematicians are probably unclear on some of the mathematical terminology. Like, for instance, what exactly a “rational” number is and what distinguishes it from an “irrational” number.
I’m going to selectively skip some detail. Specifically, I’m going to skip most of the detail of why our “make sure every rational number has unique representation” procedure works. There’s a very deep rabbit hole of detail that you can dive into there.
Said detail is crucially important for logical rigor, but adds almost nothing in terms of cognitive clarity. Moreover, I already explained all of that excruciating detail in my Math for Programmers playlist. So go watch that if you really want to dig in. Or if you’re a bookcel, try chapter 4 of Concrete Mathematics or chapter 1 of Numbers, Groups, and Cryptography.
A [rational number] is what you probably know as a fraction. It means a ratio of two integers. Rationals are also called “quotients”. In mathematics, the standard notation for rational numbers is a capital Q (hence QAnal = rational analysis).
I will use the terms “rational”, “fraction”, “ratio”, and “quotient” interchangeably.
So, let’s pick a starting point for some definitions, and we’ll improve it as we move on.

A rational number is a pair of integers {T, B}, where B is not 0.

- T is called the top (or the numerator) of the fraction.
- B is called the bottom (or denominator) of the fraction.
- Two fractions {T1, B1} and {T2, B2} are equal precisely when

0 =:= T1*B2 - T2*B1
I first want you to note a pattern here: to decide whether two rationals are equal, we compute the integer T1*B2 - T2*B1 and check to see if it is equal to the integer 0.

QAnal happens to have the property that everything ends up being written as finite computations with integer arithmetic. This is extremely significant. The CIA’s math does not have this property.

Anyway. In mathematics, this design pattern is known as [quotienting]. Which is some lovely terminology overloading. It’s a very important idea that will come up again and again and again, and so I want to note it when it shows up.
There’s a problem straightaway with this definition. It’s not a mathematical or logical flaw. It’s a practical flaw: we have many different pairs of integers that refer to the same rational number. For instance
1> RatEq = fun({T1, B1}, {T2, B2}) -> 0 =:= T1*B2 - T2*B1 end.
#Fun<erl_eval.43.65746770>
2> RatEq({1,2}, {2,4}).
true
3> RatEq({1,2}, {-2,-4}).
true
4> RatEq({1000,-20}, {-50,1}).
true
5> RatEq({100,-20}, {-50,1}).
false
The equality definition is reflective of the fact that we can scale the top and bottom of any fraction by a nonzero integer, and it represents the same fraction. So for instance, {1, 2}
is the same fraction as {2, 4}
, which is the same as {3, 6}
and so on.
If you stare carefully at the expression X = T1*B2 - T2*B1
, you will notice that if we take T1
and B1
and multiply both of them by a nonzero integer C
, it has the effect of scaling X
. That is:
X = T1*B2 - T2*B1
(C*T1)*B2 - T2*(C*B1)
= C*(T1*B2 - T2*B1)
= C*X
The same thing happens if you scale T2
and B2
. So the test for rational equality captures the fact that rationals have this scale-invariance to them.
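A quick sketch of that computation in Erlang (the module and function names are mine, for illustration, not from any QAnal source):

```erlang
-module(scale_check).
-export([x/2, scaled/3]).

%% The cross-difference used in the equality test.
x({T1, B1}, {T2, B2}) -> T1*B2 - T2*B1.

%% Scale the top and bottom of the left fraction by C, then
%% recompute the cross-difference against the right fraction.
scaled(C, {T1, B1}, Q2) -> x({C*T1, C*B1}, Q2).
```

For example, x({1,3}, {2,4}) is -2 and scaled(7, {1,3}, {2,4}) is -14, exactly 7 times the original. Since scaling multiplies X by a nonzero C, it never changes whether X is zero, which is why the equality test is scale-invariant.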
In mathematics, this does not represent a problem. In computing, it does.
This brings up a principle of QAnal that I should have mentioned in the original post: equal things must have equal representation. This is the QAnal Equity Principle.
The QAnal Equity Principle implies that there must exist a procedure (which we call a [reduction procedure]) that reduces all members of a given [equivalence class] to a [canonical form]. You don’t have to do some secondary check to see if two things are equal. You look at the representation, and the two things are either equal or they aren’t.
Most of this post concerns the reduction procedure for rationals.
The CIA’s mathematics is laughably incompatible with the QAnal Equity Principle. A “real number” for instance is typically defined as an equivalence class of Cauchy sequences of rational numbers (never mind what exactly that means). There is not even a way to determine in finite time if two such Cauchy sequences lie in the same equivalence class, let alone a reduction procedure.
Some programming languages, such as Haskell, don’t have a universal “equals”, and instead insist that you explicitly define equals for each new data structure.
-- File: Rat.hs
module Rat where
data Rat = Rat {top :: Integer,
bot :: Integer}
instance Eq Rat where
(==) (Rat t1 b1) (Rat t2 b2) =
(==) 0
((-) ((*) t1 b2)
((*) t2 b1))
-- |shortcut for Rat that crashes if second argument is 0
q :: Integer -> Integer -> Rat
q _ 0 = error "cannot divide by zero"
q t b = Rat t b
In GHCi:
GHCi, version 8.0.2: http://www.haskell.org/ghc/ :? for help
Loaded GHCi configuration from /home/doctorajaykumar/.ghc/ghci.conf
Prelude> :l Rat.hs
[1 of 1] Compiling Rat ( Rat.hs, interpreted )
Ok, modules loaded: Rat.
*Rat> (q 1 2) == (q 2 4)
True
*Rat> (q 1 2) == (q (-2) (-4))
True
*Rat> (q 1000 (-20)) == (q (-50) 1)
True
*Rat> (q 100 (-20)) == (q (-50) 1)
False
This seems nice but it has the effect of limiting the scope of pattern matching within the language. For instance, consider this Erlang code
-module(eq).
-export([eq/2]).
eq(X, X) -> "pattern equal";
eq(X, Y) when X == Y -> "compare equal";
eq(_, _) -> "not equal".
This is meant to illustrate the difference between “pattern equals” and “compare equals”:
1> c(eq).
{ok,eq}
2> eq:eq(0, 0.0).
"compare equal"
3> eq:eq(0, 0).
"pattern equal"
4> eq:eq(0, 1).
"not equal"
When writing this, I naively tried to do something similar for Haskell:
-- |equals demonstration
eq :: Rat -> Rat -> String
eq x x = "pattern equal"
eq x y | (==) x y = "compare equal"
| otherwise = "not equal"
But it turns out that code doesn’t compile
Prelude> :l Rat.hs
[1 of 1] Compiling Rat ( Rat.hs, interpreted )
Rat.hs:23:4: error:
• Conflicting definitions for ‘x’
Bound at: Rat.hs:23:4
Rat.hs:23:6
• In an equation for ‘eq’
Failed, modules loaded: none.
I haven’t written much Haskell in years, and it seems during that time I’ve forgotten some subtleties of the language.
The effective premise of Haskell (and similar languages) is that you can typecheck away all possible errors that might occur in your code. That’s just false.
There is a correct statement that is similar. Functional languages tend to have the property that once you sort out the shapes of your data, the procedural code writes itself. This is a theme throughout QAnal.
This is remarkable, and it’s why functional programming is fundamentally easy: you can offload all of the cognitive complexity of your program into your data structures.
However. You cannot offload uncertainty into your data structures. Crashes happen. Hardware fails. Disk reads fail. Networks fail. You can’t typecheck reality.
Note also that I used the word shape rather than type.
Lately I’ve been working primarily with two languages: Erlang and TypeScript. Neither of these languages has a real type system.
(Erlangers: TypeScript is basically dialyzer.js.)
Both languages instead have something called “success type systems”. This basically means that the type checker will only catch type errors at the point where data that violates the typespec is passed into a function call.
Dialyzer and TypeScript have pretty much identical origin stories: there’s a dynamically typed language but hoomans make type errors and want a type checker. So you invent a fake type system that is optional and can be applied gradually, then overlay it on top of the language.
Working with two similar, but slightly different success type systems has granted me a minor epiphany: type systems are fake news.
[ 5:32 AM ] Supreme Founder Ajay Kumar PHD : I was giving a naive definition of rationals as pairs of integers, and then saying this doesn't work because we have distinct pairs that correspond to equal rationals
[ 5:32 AM ] Supreme Founder Ajay Kumar PHD : And it's led to an interesting tangent about why Haskell sucks
[ 5:43 AM ] Supreme Founder Ajay Kumar PHD : @zxq9 you said something in a podcast a while ago about type systems. Something like C programmer graybeards who hate the world will tell you that there's only one type of data, and that's binary. The context of what you pretend the binary means is what gives something meaning
[ 5:44 AM ] zxq9 : Yes.
[ 5:44 AM ] zxq9 : Specifically, the only real data type in computerland is binary integers.
[ 5:44 AM ] zxq9 : You just pretend that means different things depending on what you are doing with it at the moment.
[ 5:45 AM ] zxq9 : The Quake III fast inverse square root thing is an example of that.
[ 5:45 AM ] Supreme Founder Ajay Kumar PHD : yeah ok I'll add that
[ 5:47 AM ] zxq9 : Same like when people pack a double with some specific data that has nothing to do with doubles (like a long unicode multibyte they know for certain can never be larger than 8 bytes or whatever) because they can stash it in a register temporarily to avoid a round trip out to the cache because they know nothing else can possibly be using that register just then to speed up unicode processing.
[ 5:49 AM ] zxq9 : Weird black magic tricks like this are possible because at the root there is only one data type as far as the computer is concerned: binary integers. What it does with those integers, integer operations, memory fetches, floating point operations, etc. is merely a function of the circuitry you happen to decide to wiggle with those bits attached -- the processor has no inherent meanings of its own that are attached. All of the meanings the computer "knows" and every belief it holds at that level are represented purely by the arrangement of the gates, and it mechanistically and faithfully pulses electrons through them, oblivious to whether it means anything, is in error, or is "good" or "bad".
[ 5:50 AM ] zxq9 : We give those numbers meanings at a different layer. Stated differently and at a very different level of interpretation, playing DayZ or having this little chat online is the Holographic Principle in action.
So, the reason Haskell sucks is that you end up spending most of your development time fighting with the type system. (Well, actually, you spend most of your time fighting with the packaging system, but you get my point.) All of your cognitive energy ends up being spent trying to shoehorn your problem into Haskell’s type system, where most of the time it really doesn’t fit.
In our terminology, Haskell’s type system has Angry Dragon Problems. And it ultimately stems from the false assumption that types are real, and the subsequent insistence that types should play a primary role in development.
In Erlang and TypeScript, the typechecker is your assistant. In Haskell, the typechecker is your boss.
The typechecker should be your assistant.
Data structures are shapes. They are not types. That’s what “structure” means.
/rant
Anyway, Erlang has shapes, not types. And it has a universal equals. And when you’re coding, you want to lean on the universal equals. Since we’re doing this in Erlang, we need to follow the QAnal Equity Principle.
Therefore we must do the following
So let me tell you the definitions.
We have to talk about signs in a minute, so I want to clear up some terminology.
Apparently the word “positive” is ambiguous. Half of the English-speaking world includes 0
as a positive number, and the other half does not. I am from the half of the world where 0
is not a positive number.
Everyone agrees that “strictly positive” excludes 0
, and that the word “non-negative” means numbers that are either strictly positive or zero.
The word “non-negative” sucks, because anyone who believes that 0 is positive also thinks that 0 is negative, which would make them logically believe that “non-negative” and “strictly positive” are synonyms.

Thankfully, mathematical terminology has a solid reputation of being confusing and illogical, so this is not an issue that appears in practice.

I will try to avoid this issue, but will probably fail. Here is the convention I will follow:
- X is [strictly positive] precisely when 0 < X.
- X is [strictly negative] precisely when X < 0.
- X is [HIV-positive] precisely when 0 =< X (because this terminology is AIDS).
- X is [HIV-negative] precisely when X =< 0.
A rational number is a pair of integers {T, B} with the following properties:

- B is strictly positive; that is, 0 < B.
- The GCD of T and B is 1. (BTW, pairs of integers whose GCD is 1 are called [coprime] or [relatively prime].)
- T is called the top of the fraction.
- B is called the bottom of the fraction.

Two fractions {T1, B1} and {T2, B2} are equal precisely when

{T1, B1} =:= {T2, B2}
The condition 0 < B
takes care of the sign ambiguity:
*Rat> (q 1 (-2)) == (q (-1) 2)
True
The top of the fraction can be positive, negative, or zero. The bottom of the fraction must always be strictly positive. Division by zero is never allowed, so the bottom of the fraction can never be zero.
We could just as easily store the “sign bit” in the bottom of the fraction instead of the top. However, in that case, the bottom of the fraction could be strictly positive or strictly negative, but never zero, and the top of the fraction would have to be HIV-positive. Storing the sign bit in the top is more elegant.
The GCD condition takes care of the scale ambiguity:
*Rat> (q 1 2) == (q 40 80)
True
So at last, here is our type definition (source):
-type z() :: qx_z:z().
-type zp() :: qx_z:zp().
-type q() :: {q, z(), zp()}.
Z (short for German Zahlen = “numbers”) is standard mathematical notation for “integers”. If we look inside the qx_z
module, we can see that these are just aliases for standard Erlang types.
-type z() :: integer().
-type znn() :: non_neg_integer().
-type zp() :: pos_integer().
And our reduction procedure (source):
-spec q(z(), z()) -> q().
% @doc form the quotient of two integers
% Cannot divide by zero
q(_, 0) ->
error(badarg);
q(Top, Bot) when Bot < 0 ->
q(-1 * Top, -1 * Bot);
q(Top, Bot) when 0 < Bot,
is_integer(Top),
is_integer(Bot) ->
GCD = qx_z:gcd(Top, Bot),
{q, Top div GCD,
Bot div GCD}.
The logic is as follows:

1. You cannot divide by zero, so crash:

q(_, 0) ->
    error(badarg);

2. If the bottom is negative, multiply both the top and the bottom by -1 and try again:

q(Top, Bot) when Bot < 0 ->
    q(-1 * Top, -1 * Bot);

3. Otherwise, divide both the top and the bottom by their GCD:

q(Top, Bot) when 0 < Bot,
                 is_integer(Top),
                 is_integer(Bot) ->
    GCD = qx_z:gcd(Top, Bot),
    {q, Top div GCD,
        Bot div GCD}.
Every time QAnal constructs a rational number, it uses this q
function. This ensures canonical representation throughout.
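Here is a self-contained sketch of the whole construction (I inline a gcd so it runs standalone; the real code lives in the qx modules), applied to the fractions from the earlier equality examples:

```erlang
-module(q_demo).
-export([q/2]).

%% Reduction procedure: crash on zero bottoms, push the sign bit to
%% the top, then divide top and bottom by their GCD.
q(_, 0) -> error(badarg);
q(Top, Bot) when Bot < 0 -> q(-Top, -Bot);
q(Top, Bot) ->
    GCD = gcd(abs(Top), Bot),
    {q, Top div GCD, Bot div GCD}.

%% Euclidean algorithm on HIV-positive integers.
gcd(X, 0) -> X;
gcd(X, Y) when X < Y -> gcd(Y, X);
gcd(X, Y) -> gcd(Y, X rem Y).
```

Both q(1000, -20) and q(-50, 1) reduce to {q,-50,1}, so the pattern-equality the Equity Principle demands now holds.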
So this begs two questions:

1. Why does dividing by the GCD give a canonical form?
2. How does qx_z:gcd/2 work?

I will give a fake answer to the first question, and give a real answer to the second question.
The fake answer to the first question is to look at the prime factorizations of the two integers.
Let’s switch to Scheme for a moment, and consider the fraction 105/60
.
scheme@(guile-user)> 105/60
$1 = 7/4
scheme@(guile-user)> (* 3 5 7)
$2 = 105
scheme@(guile-user)> (* 3 4 5)
$3 = 60
scheme@(guile-user)> (/ (* 3 5 7)
(* 3 4 5))
$4 = 7/4
You see that the GCD ends up being the “intersection” of the prime factorizations, and so when you divide by it, it cancels all the common factors.
The reason this is a fake answer is it begs the question “why is prime factorization unique?” That is, why is there not a different set of prime numbers whose product is 105
?
Well it turns out that the proper way to set this up logically is to start with the GCD algorithm. Then you use the properties of the GCD algorithm to prove the uniqueness of factorization.
I took some moderate care to set this up correctly in my Math for Programmers playlist. So go consult that. Or look at one of the books I suggested (links in the links section).
It’s very voluminous, only moderately interesting, and somewhat outside the scope of this post. You’ve gone your entire life without knowing why integer factorization is unique, and you’re doing just fine. Let’s keep going.
First of all, I should mention that the glowies are playing some word games with the word “greatest”. See, friend, whenever someone says that X is “best” or “greatest”, there’s an implicit ordering relation with respect to which X is the greatest.
The glowie word game here is subtly changing the ordering relation under your nose and not telling you. The CIA’s favorite activity is implicit context swaps. That’s why glowies love object-oriented programming.
This by the way is why the CIA is going woke. At the bottom, the CIA’s mathematics is based on smoke and mirrors and retarded word games. Wokeism is entirely smoke and mirrors and retarded word games. Glowie mathematicians in woke discourse are like a pig in shit. This is their whole thing.
The glowies’ implicit ordering relation with respect to which the GCD is greatest is the divisibility ordering on integers. In particular:

- 1 is the universal lower bound (1 divides everything, and nothing except 1 divides 1).
- 0 is the universal upper bound (everything divides 0, and 0 divides nothing except 0).

The divisibility ordering differs in many subtle ways from the standard ordering. For instance, the standard ordering is a total order: given two integers X and Y, at least one of the following statements is true:

- X =< Y
- Y =< X

This is not true with divisibility, which is a partial ordering.
I explored this caveat in slightly more depth in the Math for Programmers playlist.
For our purposes, this doesn’t actually matter. The first thing we do in computing gcd is make sure both arguments are HIV-positive. And on HIV-positive integers, the divisibility ordering implies the standard ordering, except for the fact that 0 is a universal upper bound. That is: if X and Y are strictly positive integers and X divides Y, then X =< Y with respect to the standard ordering.

This is a pitfall that you might have to be aware of at some point down the line. And it’s very important to be alert at all times for signs of glowie word games.
The proper definition is: the [greatest common divisor] of a set S
of integers is the greatest lower bound of S
with respect to the divisibility relation.
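To make the ordering concrete, here is a hypothetical divides/2 (my naming, not from the QAnal source) that exhibits both bounds and the partiality:

```erlang
-module(div_order).
-export([divides/2]).

%% X divides Y when Y is an integer multiple of X.
divides(0, Y) -> Y =:= 0;   % 0 divides only 0...
divides(_, 0) -> true;      % ...but everything divides 0
divides(X, Y) -> Y rem X =:= 0.
```

divides(1, N) is true for every N (1 is the universal lower bound), divides(N, 0) is true for every N (0 is the universal upper bound), and neither divides(2, 3) nor divides(3, 2) holds, which is exactly what makes the ordering partial rather than total.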
Anyway, algorithm. It’s extremely simple.
The first premise is that gcd
is HIV-positive: meaning if we have the fractions (/ -105 60)
or (/ 105 60)
, the thing we want to cancel from the top and bottom is the same.
So the first thing we do is make sure both arguments are HIV-positive (source):
% make sure both arguments are HIV-positive
gcd(X, Y) ->
gcdabs(abs(X), abs(Y)).
The next premise is that gcd
is symmetric: meaning if we have the two fractions 60/105
or 105/60
, the integer we want to cancel from top and bottom is the same. So we establish a convention that the bigger number goes on the left (source).
% gcd where the arguments can be assumed to be HIV-positive,
% but not necessarily in the correct order
%
% This function is responsible for making sure the argument on
% the left is bigger than or equal to the argument on the right.
gcdabs(X, Y) when X < Y ->
gcdplus(Y, X);
gcdabs(X, Y) when X >= Y ->
gcdplus(X, Y).
And finally we get to the heart of the matter: qx_z:gcdplus/2
.
% gcd where the arguments can be assumed to be HIV-positive and
% in the correct order
% zero is a universal upper bound in the divisibility relation.
% so when compared with anything, return the other thing
gcdplus(X, 0) ->
X;
% X might be equal to Y here, in which case, modulo(X, Y) = 0, and
% the next call will degenerate into the base case
gcdplus(X, Y) when X >= Y ->
% X = qY + R
% R = X - qY
% 0 =< R < Y
% therefore any common divisor of X and Y (such as the gcd) also
% divides R
% therefore
% gcd(X, Y) ->
% gcd(Y, R)
gcdplus(Y, modulo(X, Y)).
modulo/2 is a function that in this case matches the rem operator in Erlang.
Let’s walk through that 105 and 60 example:
gcdplus(105, 60) ->
% 105 rem 60 = 45
gcdplus(60, 45) ->
% 60 rem 45 = 15
gcdplus(45, 15) ->
% 45 rem 15 = 0
gcdplus(15, 0) ->
% base case
15
And if you remember our little Scheme thing, it’s true that 15
is the product of the common factors:
scheme@(guile-user)> 105/60
$1 = 7/4
scheme@(guile-user)> (* 3 5 7)
$2 = 105
scheme@(guile-user)> (* 3 4 5)
$3 = 60
scheme@(guile-user)> (/ (* 3 5 7)
(* 3 4 5))
$4 = 7/4
I will refer you to the code. Now that we’ve ironed out all of our shapes, the procedures are obvious. Godspeed.
In QAnal, we assume that these are true.
Below, I illustrate how the rational number arithmetic properties are consequences of the integer arithmetic properties.
It is possible, in a similar way, to rewrite the integer arithmetic properties in terms of a more “primitive” data structure (say, Peano arithmetic). I don’t think there’s anything to be gained clarity-wise from doing things like that, but it can be done.
It’s mathturbation: it’s not necessarily wrong to do it, but it’s probably wrong to do it in public. Nonetheless, there’s tons of videos on the internet if you want to learn how.
The properties of the integer operations are the [commutative ring] properties:

1. Addition is associative:

(equal? (z+ (z+ a b) c)
        (z+ a (z+ b c))
        (z+ a b c))

2. Addition is commutative:

(equal? (z+ x y)
        (z+ y x))

3. There is a special integer zero which has the property of being an additive identity:

(equal? x
        (z+ zero x)
        (z+ x zero))

4. For every integer x there exists a unique integer (minus x) which is its additive inverse:

(equal? zero
        (z+ x (minus x))
        (z+ (minus x) x))

5. Multiplication is associative:

(equal? (z* (z* a b) c)
        (z* a (z* b c))
        (z* a b c))

6. Multiplication is commutative:

(equal? (z* x y)
        (z* y x))

7. There is a special integer one which has the property of being a multiplicative identity:

(equal? x
        (z* one x)
        (z* x one))

8. Multiplication distributes over addition:

(equal? (z+ (z* a b) (z* a c))
        (z* a (z+ b c)))
The first 8 properties are identical to those of integers. There is only one additional property, and it concerns division of rationals.

The properties of the rational operations are the [field] properties:

1. Addition is associative:

(equal? (q+ (q+ a b) c)
        (q+ a (q+ b c))
        (q+ a b c))

2. Addition is commutative:

(equal? (q+ x y)
        (q+ y x))

3. There is a special rational zero which has the property of being an additive identity:

(equal? x
        (q+ zero x)
        (q+ x zero))

4. For every rational x there exists a unique rational (minus x) which is its additive inverse:

(equal? zero
        (q+ x (minus x))
        (q+ (minus x) x))

5. Multiplication is associative:

(equal? (q* (q* a b) c)
        (q* a (q* b c))
        (q* a b c))

6. Multiplication is commutative:

(equal? (q* x y)
        (q* y x))

7. There is a special rational one which has the property of being a multiplicative identity:

(equal? x
        (q* one x)
        (q* x one))

8. Multiplication distributes over addition:

(equal? (q+ (q* a b) (q* a c))
        (q* a (q+ b c)))

9. Every rational x that is different from zero has a multiplicative inverse, called (oneover x), with the property

(equal? one
        (q* x (oneover x)))
Prove the properties of rational number operations.
What do I mean by that? What I mean is to use the rewrite layer idea: everything in QAnal can be rewritten in terms of integer arithmetic.
So assume the properties of integer arithmetic are true, and also assume the scale invariance of rationals (the thing where if you multiply the top and bottom by the same nonzero integer, it’s the same rational number).
All you will need to do is some very straightforward algebra. In each case, you can show that if the properties of integer arithmetic are true, then the properties of rational arithmetic are true.
Disclaimer: I haven’t done this and I’m tired so that’s why it’s an exercise.
Let me give an example: I will prove that rational number addition is associative.
-spec plus(q(), q()) -> q().
% @doc add two rationals
plus({q, TL, BL}, {q, TR, BR}) ->
% make common denominator
q((TL * BR) + (TR * BL),
BR * BL).
I accidentally chose the hardest example so this is a bit long. But I did the hardest part of your homework for you.
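Before the algebra, a numeric sanity check helps. This sketch inlines minimal q and gcd helpers so it runs standalone (the real definitions live in the qx modules), and it settles the CIA-float embarrassment: one tenth plus two tenths is exactly three tenths.

```erlang
-module(q_plus).
-export([q/2, plus/2]).

%% Reduction procedure, as in the post.
q(_, 0) -> error(badarg);
q(T, B) when B < 0 -> q(-T, -B);
q(T, B) -> G = gcd(abs(T), B), {q, T div G, B div G}.

gcd(X, 0) -> X;
gcd(X, Y) when X < Y -> gcd(Y, X);
gcd(X, Y) -> gcd(Y, X rem Y).

%% Add two rationals by making a common denominator, then reduce.
plus({q, TL, BL}, {q, TR, BR}) ->
    q(TL*BR + TR*BL, BL*BR).
```

plus(q(1,10), q(2,10)) =:= q(3,10) evaluates to true, unlike its floating-point counterpart.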
I think this is a bit clearer in Scheme-ish notation
(equal? (q+ (/ t1 b1)
(/ t2 b2))
(/ (z+ (z* t1 b2)
(z* t2 b1))
(z* b1 b2)))
We are trying to prove that
(equal? (q+ (q+ q1 q2) q3)
(q+ q1 (q+ q2 q3)))
I’m used to the CIA’s notation, and so it’s very likely that I made an error in the S-expression manipulations below. One interesting constraint that the blog medium imposes is that I cannot use CIA notation. So far I like this constraint.
So let’s just pick this apart
; rewrite treelike
(equal? (q+ (q+ q1
q2)
q3)
(q+ q1
(q+ q2
q3)))
; expand each of q1, q2, q3
(equal? (q+ (q+ (/ t1 b1)
(/ t2 b2))
(/ t3 b3))
(q+ (/ t1 b1)
(q+ (/ t2 b2)
(/ t3 b3))))
; inside out rewrite, starting on the left:
(equal? (q+ (/ (z+ (z* t1 b2)
(z* t2 b1))
(z* b1 b2))
(/ t3 b3))
(q+ (/ t1 b1)
(q+ (/ t2 b2)
(/ t3 b3))))
; reformat so tops and bottoms are visually easier to pick out
; (no change to tree structure)
(equal? (q+ (/ (z+ (z* t1 b2) (z* t2 b1))
(z* b1 b2))
(/ t3
b3))
(q+ (/ t1 b1)
(q+ (/ t2 b2)
(/ t3 b3))))
; again rewriting inside out starting on the left
;
; q+ rewrites as
;
; - quotient of
; - sum of
; - product of
; - top left
; - bottom right
; - product of
; - top right
; - bottom left
; - product of
; - bottom left
; - bottom right
;
; jesus christ
(equal?
; quotient of
(/
; sum of
(z+
; product of
(z*
; top left
(z+ (z* t1 b2) (z* t2 b1))
; bottom right
b3
)
; product of
(z*
; top right
t3
; bottom left
(z* b1 b2)
)
)
; product of
(z*
; bottom left
(z* b1 b2)
; bottom right
b3
)
)
(q+ (/ t1 b1)
(q+ (/ t2 b2)
(/ t3 b3))))
; take binary trees with identical associative/commutative stems
; and rewrite them as a single n-ary tree where the branches are
; arranged alphabetically
;
; so for instance:
;
; (z* t3 (z* b1 b2))
;
; becomes
;
; (z* b1 b2 t3)
;
; result:
(equal?
; quotient of
(/
; sum of
(z+
; product of
(z*
; top left
(z+ (z* b1 t2) (z* b2 t1))
; bottom right
b3
)
; product of
(z* b1 b2 t3)
)
; product of
(z* b1 b2 b3)
)
(q+ (/ t1 b1)
(q+ (/ t2 b2)
(/ t3 b3))))
; apply the distributive law to collapse the tree some more
(equal?
; quotient of
(/
; sum of
(z+
(z* b1 b2 t3)
(z* b1 b3 t2)
(z* b2 b3 t1)
)
; product of
(z* b1 b2 b3)
)
(q+ (/ t1 b1)
(q+ (/ t2 b2)
(/ t3 b3))))
; reformat (no change to structure)
(equal? (/ (z+ (z* b1 b2 t3)
(z* b1 b3 t2)
(z* b2 b3 t1))
(z* b1 b2 b3))
(q+ (/ t1 b1)
(q+ (/ t2 b2)
(/ t3 b3))))
; alright now we get to do all of that again, but to the
; right-hand side.
;
; first we expand out the innermost q+ and rewrite it as integer
; operations
;
; I learned from experience on the left side that it's best to
; pre-emptively sort terms in dictionary order when the integer
; operations allow it. So I'm going to do that without saying it
; explicitly, alright?
(equal? (/ (z+ (z* b1 b2 t3)
(z* b1 b3 t2)
(z* b2 b3 t1))
(z* b1 b2 b3))
(q+ (/ t1 b1)
(/ (z+ (z* b2 t3)
(z* b3 t2))
(z* b2 b3))))
; reformat so tops and bottoms are vertically aligned in each
; quotient
(equal? (/ (z+ (z* b1 b2 t3)
(z* b1 b3 t2)
(z* b2 b3 t1))
(z* b1 b2 b3))
(q+ (/ t1
b1)
(/ (z+ (z* b2 t3) (z* b3 t2))
(z* b2 b3))))
; alright. q+ rewrites as
;
; - quotient of
; - sum of
; - product of
; - top left
; - bottom right
; - product of
; - top right
; - bottom left
; - product of
; - bottom left
; - bottom right
;
; jesus christ
(equal? (/ (z+ (z* b1 b2 t3)
(z* b1 b3 t2)
(z* b2 b3 t1))
(z* b1 b2 b3))
; quotient of
(/
; sum of
(z+
; product of
(z*
; top left
t1
; bottom right
(z* b2 b3)
)
; product of
(z*
; top right
(z+ (z* b2 t3) (z* b3 t2))
; bottom left
b1
)
)
; product of
(z*
; bottom left
b1
; bottom right
(z* b2 b3)
)
)
)
; alright. home stretch folks
; let's apply some integer magic to the right hand side and go
; home
(equal? (/ (z+ (z* b1 b2 t3)
(z* b1 b3 t2)
(z* b2 b3 t1))
(z* b1 b2 b3))
; quotient of
(/
; sum of
(z+
; associative and commutative laws:
(z* b2 b3 t1)
; distributive law
(z+
(z* b1 b2 t3)
(z* b1 b3 t2)
)
)
; associative law
(z* b1 b2 b3)
)
)
; more integer magic:
(equal? (/ (z+ (z* b1 b2 t3)
(z* b1 b3 t2)
(z* b2 b3 t1))
(z* b1 b2 b3))
(/ (z+ (z* b1 b2 t3)
(z* b1 b3 t2)
(z* b2 b3 t1))
(z* b1 b2 b3)))
And at last, we have the same expressions on both sides, so the statement is true.
Proofs in QAnal typically end up being like this: very long, often laborious, and tedious chains of algebra.
We end up trading away a bit of intuition, and in exchange we get back logic that is extremely concrete and extremely verbose.
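For the skeptical, the algebra can also be spot-checked mechanically. This sketch (inlining minimal q, gcd, and plus helpers that mirror the post’s definitions; the module name is mine) exhaustively verifies associativity over a small grid of rationals:

```erlang
-module(q_assoc).
-export([check/1]).

q(_, 0) -> error(badarg);
q(T, B) when B < 0 -> q(-T, -B);
q(T, B) -> G = gcd(abs(T), B), {q, T div G, B div G}.

gcd(X, 0) -> X;
gcd(X, Y) when X < Y -> gcd(Y, X);
gcd(X, Y) -> gcd(Y, X rem Y).

plus({q, TL, BL}, {q, TR, BR}) -> q(TL*BR + TR*BL, BL*BR).

%% Check (A + B) + C =:= A + (B + C) for all tops in -N..N and
%% all bottoms in 1..N.
check(N) ->
    Qs = [q(T, B) || T <- lists:seq(-N, N), B <- lists:seq(1, N)],
    lists:all(fun({A, B, C}) ->
                  plus(plus(A, B), C) =:= plus(A, plus(B, C))
              end,
              [{A, B, C} || A <- Qs, B <- Qs, C <- Qs]).
```

check(3) walks a few thousand triples. Because q reduces everything to canonical form, the pattern-equality =:= is the right equality to use, per the Equity Principle.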
I may or may not stick with this. I am having horrible writer’s block trying to write this all out in a LaTeX PDF, Erlang, my videos, or in my Revelations. So I’m trying the blog medium. The blog has the “fire and forget” property. So we’ll see.
QAnal is my fork of mathematics. QAnal is short for “rational analysis”.
The impetus for this was about a year or so ago. I wanted to do a video explaining Gaussian elimination and neural networks with some code (still haven’t done that hundreds of videos later). I got the sense that both ideas were intrinsically very simple, just poorly explained. Mostly because those explaining the ideas didn’t really understand them.
The first thing I realized was why nobody gives code examples of Gaussian elimination: CIA floats. Gaussian elimination is a very mathematically (and computationally) straightforward algorithm. But it requires a lot of division. Something that computers are weirdly bad at.
1> (1/10) + (2/10) =:= (3/10).
false
So I went to implement rational number arithmetic in Erlang, and then write a matrix data structure over the rational numbers.
Implementing rationals is easy in a language that has built-in bignum arithmetic like Erlang or Scheme. Scheme actually has built-in rationals
scheme@(guile-user)> (equal? (+ (/ 1 10) (/ 2 10)) (/ 3 10))
$1 = #t
(“Rational” just means fraction with integers. Rational = ratio.)
But doing this in a language that doesn’t have built-in bignum arithmetic means you have to implement bignums yourself (or face overflow/underflow issues). And that’s a task beyond the capacity of almost anyone writing a math explainer (most of whom would struggle writing “hello world”).
So, as someone explaining Gaussian elimination, you have three options:
Well, option 3 is clearly the best one of those three
Problem number 2: a lot of mathematics depends on a leaky abstraction called the “real numbers” (which are fake, and which I may or may not explain at some point). CIA floats exist as a band-aid abstraction in computing to deal with mathematics centered around CIA reals.
Interestingly, the term “real number” emerged to distinguish them from “imaginary numbers”. In hindsight, imaginary numbers are real and real numbers are fake. Summary: the CIA is retarded
Let me give you a really really simple example.
What is x, the hypotenuse of a right triangle whose two legs each have length 1? Well, if you remember high school trigonometry, then you know the answer is the square root of 2.
This is the Pythagorean Theorem. The first theorem ever, and easily the most important theorem in mathematics.
Well, there’s a problem. As even the Ancient Greeks knew, there is no rational number which has the property that its square is two.
The argument is really simple. Let me lay it out.
But take it on faith for now. A fraction is reduced precisely when the top and bottom of the fraction contain no common factor. There is a canonical procedure for reducing a fraction called the Euclidean algorithm (which we’ll talk about soonish).
In particular, the canonical (reduced) form of any fraction has the property that the top and bottom of the fraction are not both even. Examples:
scheme@(guile-user)> (/ 54 30)
$1 = 9/5
scheme@(guile-user)> (/ 2 6)
$2 = 1/3
scheme@(guile-user)> (/ 2 4)
$3 = 1/2
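The reduction the REPL is performing above can be sketched in a few lines. Here is a minimal Python version (hypothetical helper names; the proper Scheme treatment comes later in the series):

```python
def gcd(a, b):
    # Euclidean algorithm: repeatedly replace (a, b) with (b, a mod b)
    # until the remainder is zero; the survivor is the greatest common factor.
    while b != 0:
        a, b = b, a % b
    return a

def reduce_fraction(top, bottom):
    # Divide out the greatest common factor to get the canonical form.
    g = gcd(top, bottom)
    return (top // g, bottom // g)

print(reduce_fraction(54, 30))  # (9, 5)
print(reduce_fraction(2, 6))    # (1, 3)
print(reduce_fraction(2, 4))    # (1, 2)
```

Note that in the canonical form, the top and bottom are never both even: if they were, 2 would be a common factor, and the reduction would have divided it out.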
Suppose there is a fraction (/ a b) which has the following two properties:

1. the square of (/ a b) is 2
2. a and b are not both even

I will derive a contradiction.

First, a fact about squaring fractions: the square of (/ a b) is the fraction whose top is the square of a and whose bottom is the square of b (provided b is not zero, in which case it crashes). Here is a little Scheme procedure that checks this:
(define (check a b)
(equal? (expt (/ a b) 2)
(/ (expt a 2)
(expt b 2))))
Let’s try it:
scheme@(guile-user) [2]> (define (check a b) (equal? (expt (/ a b) 2) (/ (expt a 2) (expt b 2))))
scheme@(guile-user) [2]> (check 2 4)
$5 = #t
scheme@(guile-user) [2]> (check 2 5)
$6 = #t
scheme@(guile-user) [2]> (check 8482 2919)
$7 = #t
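If it helps to see this outside the Scheme REPL, here is the same check sketched with Python’s `fractions.Fraction`, squaring the fraction two ways and comparing:

```python
from fractions import Fraction

def check(a, b):
    # Squaring the fraction equals the fraction of the squares.
    return Fraction(a, b) ** 2 == Fraction(a ** 2, b ** 2)

print(check(2, 4))        # True
print(check(2, 5))        # True
print(check(8482, 2919))  # True
```

Every call returns True, for the same reason the Scheme version always returns #t.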
This is a direct consequence of the rule for multiplying fractions:
times({q, TL, BL}, {q, TR, BR}) ->
q(TL * TR,
BL * BR).
So, if it is true that

(equal? (/ (expt a 2)
(expt b 2))
2)
then it is also true that
(equal? (expt a 2)
(* 2 (expt b 2)))
This is just algebra.
This says that (expt a 2) is 2 times another integer, which means that (expt a 2) is even. But if the square of an integer is even, then the integer itself is even (the square of an odd number is always odd). So a itself is even; that is, there exists another integer x such that (equal? a (* 2 x)).

; substitution:
(equal? (expt (* 2 x) 2)
(* 2 (expt b 2)))
; commutative property of integer multiplication:
(equal? (* 4 (expt x 2))
(* 2 (expt b 2)))
; divide both sides by 2
(equal? (* 2 (expt x 2))
(expt b 2))
By the same reasoning as before, this says that (expt b 2) is even, and so b must also be even.

So: if there is a fraction (/ a b) whose square is 2, then it must be the case that both a and b are even. But then (/ a b) cannot be reduced, which is absurd, because we have an algorithm which reduces every fraction.

This argument traces back (in different verbiage) to about 600 BC, to the school of the Pythagoreans, in Ancient Greece. This fact deeply troubled the Greeks, because it upset their weird perfectionist aesthetic. Legend has it that when Hippasus leaked the terrible secret (in modern verbiage: that the square root of 2 is irrational), the Pythagoreans took him out into the ocean and drowned him.
So what is x? The CIA’s answer is that x is an “irrational” number, which is simply called the square root of 2.
QAnal’s answer is that you’re asking the wrong question. I will convince you that my answer is right.
Let’s get back to the plot. I realized almost immediately that if one backpropagates the following constraints (all of which are intrinsic to computing) back into mathematics, then a lot of stuff breaks, but what remains simplifies quite dramatically:

1. no floats
2. no real numbers
3. no infinite processes
There’s an entire barely explored world of deeply interesting mathematics here, which is simpler, very basic, and makes infinitely more sense than the CIA’s mathematics.
What we get in QAnal, crucially, is mathematics that is computable. The CIA’s version of mathematics is barely computable.
QAnal’s notation ends up being a lot better. Our logic becomes a lot more clear. Everything is just cleaner and saner. There’s no waffling. Everything is super concrete and super clear. It’s just better.
Programmers out there: how many times have you
A lot, right? Like, that can be a monthly occurrence.
Well. Mathematics has been stuck on the first bullet point for the last 23 centuries. It’s a miracle that it works even a little bit.
Enough’s enough.
I’ve spent basically the last year going down a rabbit hole of learning this entire new world of mathematics. And being a bit miffed that this new, beautiful way of thinking isn’t standard.
I owe a major major debt of gratitude to Norman Wildberger. He realized these problems about 20 years ago, and has done a massive amount of work in developing what I call QAnal. In particular:
We will learn about all of these at some point down the line.
QAnal is genuinely simpler and genuinely easier. And it’s not close. It’s a world of difference.
So. The basic premise is that the CIA has lost its way. It has started relying on leaky abstractions (chiefly, the “real numbers”).
The CIA’s approach has the net effect of introducing angry dragon problems. This is where you introduce a simplifying assumption at the beginning, and you end up paying for it with massive uncontrollable chaotic sprawling complexity down the line.
I’m going to write a post about angry dragon problems at some point. OOP versus functional programming is a great example. For now I’ll leave you with this gif.
What I’m going to do for now is stick to my original premises:
Planned-ish:
Fuzzier:
Eventually:
Let’s get going!