When we say something like 5 + 3, we aren’t usually surprised to learn that we are speaking of 8. And when we write this as a program for a machine to execute, we expect it to evaluate this expression to 8 as well. Everyone is familiar with the sign + since we all learned it in elementary school, and we know what happens when we add two numbers together.
But mathematics is less about numbers and more about structure. It’s just that counting with numbers happens to exhibit lots of structure, structure that is both useful to everyday commerce and also representative of abstractions that are reflected in all things. The structure possessed by numbers and revealed and studied by mathematics has as much to do with the numbers themselves as it does with bumblebees[1]. Numbers, to put a fine point on it, are simply the easiest thing we could work with in order to reveal and manipulate structure.
On the other hand, in programming, we aren’t interested in the simplest thing we could work with, but rather, real, tangible things that we seek to accomplish that needn’t be simple at all. Among other things, working with strings of text and lists of various items is a very common activity for a programmer. Sometimes we’d like to say something like "hello " + "there" and expect the machine to understand what we mean. After all, just because we usually do mathematics with numbers doesn’t mean we shouldn’t be able to do mathematics with other things if we wanted to. Some languages, including Python and Ruby, oblige us here, and evaluate this to "hello there". This seems reasonable enough, but what about "there" + "hello "? This gives us "therehello ", a different result. With numbers, on the other hand, 5 + 3 = 8 and 3 + 5 = 8 as well. Additionally, 5 - 3 is a shorthand for 5 + -3 which equals 2, and 3 - 5 means 3 + -5 which equals -2. What could "hello" - "there" mean? "hello" + -"there"? Could we assign some useful meaning to -"there"?
It’s not so easy. Our initial intuition that there is something “the same” about numbers and strings in this regard is challenged here, so what was it that gave us this notion? Well, when you put two strings together, you get a longer string containing both of the original ones. If you combine a string with an empty string, you get the original one, just like with numbers where if you add a number to zero, you get the original number back. But on the other hand, numbers have inverses — you can always tack on a minus sign to a number to get something that, when added to the original number, cancels it out. Strings and lists don’t have such a notion, at least, not an obvious one. Addition of numbers is commutative, meaning you get the same result no matter what order you add them in. Strings and lists give different results depending on the order in which they are combined.
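To make the comparison concrete, here is a minimal Python sketch (Python being one of the languages mentioned above that accepts + on strings). The helper name `commutes` and the variable `negation_defined` are made up for illustration; the checks only cover the handful of values shown.

```python
def commutes(a, b):
    """Return True if a + b equals b + a for these two particular values."""
    return a + b == b + a


# Numbers: order does not matter, zero is the identity,
# and negation gives an inverse that cancels the original.
assert commutes(5, 3)
assert 5 + 0 == 5
assert 5 + (-5) == 0

# Strings: the empty string acts as an identity, but order matters.
assert "hello " + "" == "hello "
assert not commutes("hello ", "there")

# Python assigns no meaning to -"there": unary minus on a str raises TypeError.
try:
    -"there"
    negation_defined = True
except TypeError:
    negation_defined = False

assert negation_defined is False
```

So the two uses of + share an identity element but differ on commutativity and inverses, which is exactly the gap discussed above.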
In other words, these operations are not the same, though they have some commonalities that make them seem similar. In mathematics, we say that numbers, strings, and lists (and many other things) each form a monoid under a relevant notion of “concatenating” them together, that is to say, under a certain operation. “Monoid” here is simply a term to designate this very common structure, which we can consider to be intuitively represented in fastening lengths of string (real string, like from a ball of yarn) to make longer lengths of string (perhaps this is why strings are called strings?). But addition on numbers, by virtue of admitting inverses (a negative number corresponding to every positive number), is a richer structure than a monoid, called a group. In fact, addition of numbers by virtue of being commutative is even more structured than a group strictly needs to be. If we consider the plus sign to mean addition, therefore, it isn’t really correct to say something like "hello " + "there" since concatenating strings doesn’t exhibit the structure that addition over numbers does. Rather, we need another operator here to correspond to this less structured (and yet more common) idea of a monoid. We could denote this operator as ~ [2], so that "hello " ~ "there" = "hello there".
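As a rough illustration of the monoid and group ideas, the following Python sketch spot-checks the laws on a few sample values. The names `Monoid`, `satisfies_monoid_laws`, and `has_inverses` are hypothetical, invented for this example, and passing on a handful of samples is of course not a proof that the laws hold in general.

```python
from dataclasses import dataclass
from itertools import product
from typing import Any, Callable


@dataclass(frozen=True)
class Monoid:
    combine: Callable[[Any, Any], Any]  # the binary operation
    identity: Any                       # the element that changes nothing


def satisfies_monoid_laws(m, samples):
    """Spot-check associativity and the identity law on the given samples."""
    associative = all(
        m.combine(m.combine(a, b), c) == m.combine(a, m.combine(b, c))
        for a, b, c in product(samples, repeat=3)
    )
    has_identity = all(
        m.combine(m.identity, a) == a == m.combine(a, m.identity)
        for a in samples
    )
    return associative and has_identity


def has_inverses(m, samples, inverse):
    """Spot-check that inverse(a) cancels a from both sides (the extra group law)."""
    return all(
        m.combine(a, inverse(a)) == m.identity == m.combine(inverse(a), a)
        for a in samples
    )


int_addition = Monoid(lambda a, b: a + b, 0)
string_concat = Monoid(lambda a, b: a + b, "")
list_concat = Monoid(lambda a, b: a + b, [])

assert satisfies_monoid_laws(int_addition, [-2, 0, 3])
assert satisfies_monoid_laws(string_concat, ["", "a", "bc"])
assert satisfies_monoid_laws(list_concat, [[], [1], [2, 3]])

# Integers under addition also have inverses, so they pass the group check.
assert has_inverses(int_addition, [-2, 0, 3], lambda n: -n)

# For strings, a plausible-looking candidate such as reversal does not cancel anything.
assert not has_inverses(string_concat, ["a", "bc"], lambda s: s[::-1])
```

All three structures pass the monoid checks, while only integer addition passes the additional inverse check that a group requires.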
So what should + mean in programming languages? One perspective, which we have been building up to here, is that + should mean addition, which must correspond to “the most common” group operation defined on the concerned type, and ideally an operation that forms an Abelian group (i.e. it is commutative, like addition over numbers). ~ should mean concatenation, like adjoining and tying lengths of string, which should correspond to “the most common” monoid operation defined on the concerned type. - should indicate the inverse for the addition operation. Some types may support many operations that exhibit the structures we’ve been talking about, such as numbers, where multiplication (designated by × or *) forms an Abelian group, too (if we are including numbers that could be fractions, and leaving out zero, which has no multiplicative inverse). In such cases, we should remember that the point of having a symbol (or any name) to use is purely convenience, and the choice of symbol is arbitrary as long as it is consistent. So we could just use whatever symbol we like, and if the idea behind the operation is generalizable beyond the specific type of object we are working with, then attempt to retain the meaning of that symbol to refer to (1) intuitively the same operation, or, as that may be hard to judge, then at least, (2) an operation possessing the same algebraic structure as the one designated by the symbol on the original type where we perceived the utility of the operation. These criteria would, for instance, rule out the use of + for strings.
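One way to picture this separation of roles is a pair of generic functions that dispatch on the type of their first argument. The Python sketch below uses `functools.singledispatch` from the standard library; the function names `add` and `concat` are assumptions made for this example and stand in for + and ~. It is only an illustration of the idea, not a description of how any particular library implements it.

```python
from functools import singledispatch
from numbers import Number


@singledispatch
def add(a, b):
    """Group-like addition. Types without a registered meaning are rejected."""
    raise TypeError(f"add is not defined for {type(a).__name__}")


@add.register(Number)
def _add_number(a, b):
    return a + b


@add.register(tuple)
def _add_tuple(a, b):
    # Treat tuples as vectors: add element by element.
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return tuple(add(x, y) for x, y in zip(a, b))


@singledispatch
def concat(a, b):
    """Monoid-like concatenation."""
    raise TypeError(f"concat is not defined for {type(a).__name__}")


@concat.register(str)
@concat.register(tuple)
@concat.register(list)
def _concat_sequence(a, b):
    # Python's built-in + already concatenates these sequence types.
    return a + b


assert add((1, 2, 3), (1, 2, 3)) == (2, 4, 6)
assert concat((1, 2), (3,)) == (1, 2, 3)
assert concat("hello ", "there") == "hello there"
```

Under these assumptions, `add("hello ", "there")` raises a `TypeError`, since strings have no registered addition, while tuples support both operations with different results.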
Some time ago I wrote a library in Racket that provides “generic” + and ~ operators that behave the way we have been talking about here, and which can be used on any type of data. For example:
(+ #(1 2 3) #(1 2 3)) ; => #(2 4 6)
(~ (list 1 2) (list 3)) ; => (list 1 2 3)
(~ "hello " "there") ; => "hello there"
This library also defines similarly mathematically grounded generic notions of other operators such as =, <, and > so that they may be used across all types. Like so:
(< "apple" "banana") ;=> #t
If you are a Racket user, or are curious about the language laboratory that is Racket, give it a try.
The above is just one perspective on a subjective matter of convenience. What do you think + should mean in programming languages? Are there any languages you use that you feel have a particularly good way of handling structured operations across diverse types? (cough, Haskell, cough?)
[Update: Discussion in the Haskell and Racket communities on Reddit.]
[1] My uncle, a mathematician, would say that mathematics is about “the logical structure of patterns.”
[2] Some languages including Haskell denote a similar operator by ++, although as I’m not too familiar with the language I’m not sure if it’s actually the same operator as this one or something much smarter. Please pontificate in the comments if you know more 🙂
John
The below is one of the finest and most accurate paragraphs that I have ever read, anywhere.
“But mathematics is less about numbers and more about structure. It’s just that counting with numbers happens to exhibit lots of structure, structure that is both useful to everyday commerce and also representative of abstractions that are reflected in all things. The structure possessed by numbers and revealed and studied by mathematics has as much to do with the numbers themselves as it does with bumblebees[1]. Numbers, to put a fine point on it, are simply the easiest thing we could work with in order to reveal and manipulate structure.”
sid
🙏
sid
Incidentally, I spent a lot of time on this paragraph. I originally had a more terse version, but my partner (who doesn’t have a mathematical background) reviewed a draft and was resolute that the paragraph wasn’t clear. So I spent time decomposing it and this is the result. I am glad (and she is glad 😄) to hear that it came across well, and thanks again!
Arthur
One reason to use a + for both monoids and groups is that one can say (in OO lingo) that a group extends a monoid. This idea of extension is all the more natural in rings and fields.
I find Haskell’s type classes very close to the mathematical notion of a structure (such as a monoid or group). I find it natural to explicitly express this form of inheritance in that language.
sid
So you could use + everywhere and a - b would fail for non-groups just because - isn’t implemented. I like that, too!
Having more operators just means that you can disambiguate cases like vector concatenation vs vector addition. E.g. (~ #(1 2) #(1 2)) => #(1 2 1 2), while (+ #(1 2) #(1 2)) => #(2 4).
Maybe even a combination of the two approaches could work, e.g. + would concatenate strings, but on functions it could compose them by yielding a new function that adds the results of applying the component functions to the same arguments, i.e. a group defined on the functions in terms of their values rather than concatenating the transformations, while ~ would simply do regular function composition. There’s also a case for 3 standard operators to capture ~ (string-like concatenation), + (number-like addition), and · (function-like composition), which came up in some of the Reddit discussions.
mgaert
I actually by the same thought process arrive at the opposite conclusion: + should be “Concatenate”, not “Add”– for numbers on a number line, concatenating and adding are the same thing (Lay an arrow of length 3 end to end against an arrow of length 4, you get an arrow of length 7), and this even generalizes to vectors of numbers.
There are some arguments against this, like the relationship between “-” and “+”, or the commutative property. But I think arguing that actually makes the case *stronger* that “+” should be “concat”: That’s how inheritance *works*. number+ has certain properties that are a superset of its “parent” generic+. When you arrange the types like this, + is open to extension but closed to modification. + can still be commutative– IF you know that you’re doing number+. But if you don’t know you’re doing number+, and maybe your types are strings, you get to keep relying on “+ as concat”.
The same logic applies to “-”. “-” is an *extension* of “+”. If you know that you’re operating on types that have “-” defined, such as number, then you can use “-” as the opposite of “+”, but if the extent of your knowledge is “this type supports +” then you can’t necessarily conclude that the type supports “-”.
Finally, what about the relationship between “*” and “+”? Of course we could keep doing it the object oriented way, and say that “*” is an extension on top of “+” that not all implementors of “+” need implement, but this is a missed opportunity. We can define generic* as a “distribution” operator, such that (generic* op a b c …) = (a op b op c). You might recognize this as a fold. Then (number* a b…) is simply (generic* number+ a b…). You could define *+ as (generic* generic+ a b…) and then anything that implements generic+ gets generic* for free.
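For readers who want to see the fold idea in code, here is a small Python sketch using `functools.reduce`. The name `generic_star` is a hypothetical stand-in for the generic* described in this comment, and `operator.add` plays the role of generic+ only because Python's built-in + already works on both numbers and strings.

```python
from functools import reduce
import operator


def generic_star(op, *args):
    """Fold a binary operation across the arguments, left to right.

    generic_star(op, a, b, c) computes op(op(a, b), c).
    Calling it with no arguments raises TypeError, since no identity is supplied.
    """
    return reduce(op, args)


# Folding + over numbers sums them.
assert generic_star(operator.add, 3, 4, 5) == 12

# Folding + over n copies of x gives the familiar n * x.
assert generic_star(operator.add, 4, 4, 4) == 3 * 4

# The same fold works unchanged for any type that supports the operation.
assert generic_star(operator.add, "ab", "cd", "ef") == "abcdef"
```

Anything that supplies the binary operation gets the folded version without further work, which is the "for free" part of the suggestion.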
sid
Love your conception of * as fold! I think this “object oriented” handling of + would probably be the most convenient, but I would also advocate for the presence of more generic operators like ~, which could be used to disambiguate cases like vectors, where there is more than one possibility. E.g. #(1 2 3) + #(1 2 3) could be either #(2 4 6) or #(1 2 3 1 2 3). With two generic operators, + could do the former while ~ would do the latter. In other cases ~ and + would most likely coincide. The possibility of a third generic operator . also came up in comments (here and on Reddit), to cover “transformation composition” (e.g. function composition), another common idea. In all of these cases, we could have these operators do a default (e.g. concat-like) operation unless a more specific definition is available, as you suggest, and maybe the kind of “extension” available to these operators would not be the same. E.g. - would be available to extend + but it would not affect ~ or the . operator. But what should * mean for matrices? Should it be (fold + ...) as you suggested for numbers, or should it be the . operator, i.e. matrix multiplication? One option is that it could be the former if it is num * matrix and the latter if it is matrix * matrix. This suggests that even if * defaults to (fold + ...), it could still be overridden by dispatching on the types of the operands, and in that respect is no different from +, ~ and the . operator. Just like those operators, * could have its own distinct options for extension, e.g. / for division.
So all told it could be something like this: (1) built-in operators come with defaults for various operand types. These defaults need not be the same though they may often coincide (e.g. ~ and + would coincide for matrices but . would be different and would coincide with *). (2) every operator can be overridden for particular combinations of operand types, (3) built-in operators come with different extension schemes, e.g. + and -, * and /, . and “inverse transformation,” e.g. matrix inverse. What do you think?