Arc Forum | Concise math matters: McCarthy's m-expressions and other syntactic sugar

11 points by dreeves 6493 days ago | 13 comments

Prescript: See section 7 of Paul Graham's "Being Popular". Especially this: "I've used Lisp my whole programming life and I still don't find prefix math expressions natural."

The coolest hackery has beautiful mathematical gems at the core or in the cleverest bits. And basic arithmetic is ubiquitous in code. Mathematics, as mathematicians write it, is beautifully elegant and concise. It is a 100-year language and then some. And it very much embodies Paul's principles for Arc. Let's not throw that away. There's a night-and-day difference in grokkability between the following equivalent expressions:

  (+ (* 2 (f (/ x 3))) y)
  2*f[x/3]+y

I absolutely appreciate the beauty of the syntaxlessness of lisp but we need the syntactic sugar for infix. Let's take as a hard constraint that we need to accept infix expressions, without distractions like setting them apart with special syntax or interspersing whitespace. I believe this means we have to give up two nice features in lisp syntax (but that it's worth it):

1. Arithmetic symbols (+ - * / = et al) in symbol names.
2. Whitespace as element delimiters in lists.

Suppose we use square brackets for s-expressions, move the first element in front, and delimit with semicolons (whitespace is even more insignificant than in lisp syntax, as it is no longer a delimiter):

  +[*[2; f[/[x;3]]]; y]

This is just as syntaxless as lisp style s-expressions, but notice how we now have total freedom to unambiguously insert infix expressions and use parentheses for grouping in any subexpression:

  +[*[2; f[x/3]]; y]

  +[2 * f[x / 3]; y]

  2*f[x/3]+y

Some details will need to be resolved, like semicolon being the comment character (or the obvious alternative, comma, being unquote in macros). Also, perhaps x * y needs to be syntactic sugar for times[x;y] rather than * [x;y] if we're to consistently forbid swearing characters in symbol names.
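To make the proposal concrete, here is a minimal Python sketch of a reader for the bracket notation above. It is an illustration under stated assumptions, not a specification: the names `tokenize`, `parse`, and `PRECEDENCE` are hypothetical, the precedence table covers only + - * /, all operators are treated as left-associative, unary minus is not handled, and a nested Python list stands in for an s-expression.

```python
import re

# Hypothetical precedence table: higher binds tighter.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}

TOKEN = re.compile(r"\s*(\d+|[A-Za-z_]\w*|[-+*/\[\]();])")


def tokenize(text):
    """Split text into numbers, names, and punctuation; whitespace is ignored."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(text):
    """Read one expression and return it as a nested prefix list."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take(expected=None):
        nonlocal pos
        tok = peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if expected is not None and tok != expected:
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        pos += 1
        return tok

    def expr(min_prec=1):
        # Precedence climbing over binary infix operators.
        left = primary()
        while peek() in PRECEDENCE and PRECEDENCE[peek()] >= min_prec:
            op = take()
            right = expr(PRECEDENCE[op] + 1)  # +1 makes it left-associative
            left = [op, left, right]
        return left

    def primary():
        tok = take()
        if tok == "(":  # parentheses only group
            inner = expr()
            take(")")
            return inner
        if tok.isdigit():
            return int(tok)
        if tok in {"[", "]", ";", ")"}:
            raise SyntaxError(f"unexpected {tok!r}")
        if peek() != "[":
            if tok in PRECEDENCE:
                raise SyntaxError(f"operator {tok!r} needs bracketed arguments here")
            return tok  # a bare symbol
        # head[arg; arg; ...], where head is a name or an operator symbol
        take("[")
        args = [expr()]
        while peek() == ";":
            take(";")
            args.append(expr())
        take("]")
        return [tok, *args]

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"trailing input starting at {peek()!r}")
    return tree
```

Under these assumptions the fully bracketed, mixed, and fully infix spellings from the post all read as the same tree, for example `parse("2*f[x/3]+y") == parse("+[*[2; f[/[x;3]]]; y]")`.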

Do others agree that lack of infix expressions is a key barrier to adoption of lisp (and not unreasonably, as my first example above shows) and that we need a way to mix and match infix and prefix effortlessly?

What other ways could we achieve this? If we give up the constraint that we shouldn't need to set infix apart with special syntax then "sweet expressions" (http://www.dwheeler.com/readable/), where anything in curly braces is infix, come close. By forbidding arithmetic characters in symbol names we can tighten it up by not needing to intersperse whitespace everywhere:

  (+ (* 2 f{x/3}) y)
  (+ {2*f(x/3)} y)
  {2*f(x/3)+y}

The need to wrap everything in curly braces is sometimes cumbersome though. If I want to double the result of (g 2 3) here,

  (f (g 2 3))

I have to do this:

  (f {2 * g(2 3)})

Whereas in the m-expression syntax I take this

  f[g[2; 3]]

and just stick in the "2 * " in the natural way:

  f[2*g[2; 3]]

5 points by vrk 6493 days ago | link

Is the prefix notation the biggest reason Lisp is not used more? No, I don't think so. Prefix, infix, or postfix is all a matter of what you are used to. It's possible to define the max operator as infix instead of prefix:

  (max 1 2) ; --> 2
  1 max 2   ; --> 2

It's associative (it's easy to prove). And after that you can do the regular stuff:

  1 max 3 max 2 max 5 ; --> 5

Does this look natural? Usually max is more of a function, or in mathematics one usually writes

  max{ <some set> }

or with set comprehension

  max{ x : <some condition for x> } .

I find infix max easier in pen-and-paper calculations.

Coming back to Lisp and Arc, is familiarity the only reason you want infix math operators? The biggest benefit you get from prefix compared to infix is its regularity and uniformity. Control structures, function calls, and (now in Arc) array and hash table access are all identical. This means you can freely add not only new functions, but new control structures, without hacking the parser and the compiler.

One reason why this happens is that you don't have precedence levels. Precedence is always explicit (bar Arc's new intrasymbol operators, which are infix), and this is why Lisp mathematical functions, such as + and *, accept multiple arguments. The distinction between (+ 1 (+ 2 3)) and (+ (+ 1 2) 3) is useless and even harmful. It's best to write it (+ 1 2 3) or 1 + 2 + 3, because the operation is associative.
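The associativity point can be illustrated with a small Python sketch. The function `plus` is a hypothetical stand-in for a variadic Lisp `+`, restricted here to numbers (the starting value 0 is an assumption that would not suit other argument types).

```python
from functools import reduce
import operator


def plus(*args):
    """Variadic + in the style of (+ 1 2 3): a left fold over the arguments."""
    return reduce(operator.add, args, 0)


# Both nestings and the flat call give the same result because + is associative.
nested_right = plus(1, plus(2, 3))
nested_left = plus(plus(1, 2), 3)
flat = plus(1, 2, 3)

# The same fold expresses the chained infix max from above.
chained_max = reduce(max, [1, 3, 2, 5])
```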

Aside from losing this, you also lose the concept that your whole program is a list. This stands at the heart of Lisp macros. How would you write macros for opaque infix blocks?

-----

5 points by almkglor 6493 days ago | link

> Aside from losing this, you also lose the concept that your whole program is a list. This stands at the heart of Lisp macros. How would you write macros for opaque infix blocks?

The solution which dwheeler on the readable-discuss list supports is to simply have the reader translate infix to prefix. Syntax like {1 + 2 + 3} is readily translated into (+ 1 2 3). What's more, dwheeler suggests that precedence should not be supported at all: {1 + 2 * 3} is a badly formed expression that the reader will reject; you must use {1 + {2 * 3}}.
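As a rough illustration of that rule, here is a Python sketch that rewrites the contents of one curly group. It is based only on the description in this comment, not on the readable project's actual reader; the function name `curly_infix` is hypothetical, and the group is assumed to be already split into a list of operands and operator symbols.

```python
def curly_infix(items):
    """Rewrite the contents of one {...} group into a prefix list.

    `items` alternates operand, operator, operand, ... and any nested
    groups are assumed to have been rewritten already. Every operator in
    the group must be the same symbol, since no precedence is defined.
    """
    if len(items) < 3 or len(items) % 2 == 0:
        raise SyntaxError(f"malformed infix group: {items!r}")
    operators = items[1::2]
    if any(op != operators[0] for op in operators):
        raise SyntaxError(f"mixed operators need explicit braces: {items!r}")
    # Operator first, then the operands in their original order.
    return [operators[0], *items[0::2]]
```

With this sketch `[1, "+", 2, "+", 3]` becomes `["+", 1, 2, 3]`, while `[1, "+", 2, "*", 3]` is rejected until the inner product is given its own group.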

-----

1 point by vrk 6493 days ago | link

That's an interesting idea, but on the other hand, if you don't have precedence defined for the common cases, how are infix expressions better than prefix? Does it save typing and cognitive load?

  {1 + {2 * 3}}
  (+ 1 (* 2 3))

Maybe it's a bit clearer, as it's more familiar. Could this be extended to arbitrary two-parameter functions?

  (def foo (x y) ...)
  {x foo y}  ; --> (foo x y)

Why do I ask? Because then we would have a way to write

  ; "Object-oriented" method call
  {object -> method}

  ; Different kinds of assignments, common in C/C++/Java/etc.
  ; (Perhaps of little worth in Arc.)
  {x += 1}
  {x -= 1}
  {x |= bits}

  ; A "binding" operator, or Pascal-like assignment.
  {x := 1}

  ; An arbitrary relation.
  {a R b}

  ; Perl 6 pipe operator [1] look-alike.
  {{data ==> [...]} ==> processed}

[1] http://www.perl.com/pub/a/2003/04/09/synopsis.html?page=2

-----

3 points by almkglor 6493 days ago | link

> Maybe it's a bit clearer, as it's more familiar. Could this be extended to arbitrary two-parameter functions?

This is, in fact, the reason why dwheeler decided to eschew precedence - I kept pestering him that in the future I might want to create a 'convoke operator or some other arbitrarily-named relation. Basically any {_ sym _ [sym _]...} pattern gets converted to (sym _ _ ...)

The problem with supporting precedence is that you have to define it before someone starts mixing up the order of the symbols. And dwheeler wanted to do it at read time, so you had to have fixed precedence at read time. If someone randomly typed {x sym1 y sym2 z}, the reader wouldn't be able to know if it's (sym1 x (sym2 y z)) or (sym2 (sym1 x y) z). Initially we were thinking of supporting precedence only for +-*/ and disallowing mixing of other operators, but this started to get slippery: where do you draw the line between operators that are allowed to have precedence and those that are not?

-----

2 points by jc 6485 days ago | link

This suggestion may be a bit un-Arc, but I'd prefer a "with-infix" macro implementing a well-defined sub-language to some kind of integration into the reader. An optional precedence list arg might be a good idea.

The problem there is, of course, that the s-exprs generated by the macro are essentially a black box, but in my thinking, if infix is being used minimally and purely for convenience's sake, it isn't necessary to make those s-exprs accessible to other macros (right?...).

-----

6 points by cadaver 6493 days ago | link

This is just a random thought of a noob: You can already use pairs, strings, hashtables in functional position, could this not be exploited likewise for numbers?

From the arc source: ... ((string? fn) (string-ref fn (car args))) ...

for numbers: ((number? fn) (apply (car args) fn (cdr args)))

You wouldn't have precedence, but it would work well with prefix notation since, if you make a mistake and use infix notation in a prefix expression, e.g. (+ 1 * 2 3), an error would be signalled.
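A toy Python evaluator can show how that dispatch rule would behave. This is only a sketch of the idea, not Arc's implementation: `ENV`, `evaluate`, and `call` are hypothetical names, forms are nested Python lists with strings as symbols, and `+` and `*` are modelled as variadic numeric functions.

```python
import math

# Hypothetical global environment for the sketch.
ENV = {
    "+": lambda *xs: sum(xs),
    "*": lambda *xs: math.prod(xs),
    "sqrt": math.sqrt,
}


def evaluate(form):
    """Evaluate a tiny Lisp-like form given as nested Python lists."""
    if isinstance(form, str):
        return ENV[form]  # symbols are looked up in the environment
    if not isinstance(form, list):
        return form  # numbers evaluate to themselves
    if not form:
        raise SyntaxError("empty form")
    fn, *args = [evaluate(part) for part in form]
    return call(fn, args)


def call(fn, args):
    """Apply fn to args, treating a number in functional position specially."""
    if isinstance(fn, (int, float)):
        # The proposed rule: (apply (car args) fn (cdr args))
        if not args:
            raise TypeError("number called without an operator")
        return call(args[0], [fn, *args[1:]])
    return fn(*args)
```

In this sketch `(2 + (3 * (sqrt 4)))` evaluates like its prefix equivalent, and the mistaken `(+ 1 * 2 3)` fails with a type error because a function ends up among the numbers being added.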

-----

4 points by eds 6491 days ago | link

I like this idea, because it doesn't involve special meaning of braces or Algol-style function calls. I just prefer

  (2 + (3 * (sqrt 4)))

to

  {2 + {3 * sqrt(4)}}

but maybe this is just me. You could even add a fancy analysis function to do precedence which would allow things like

  (2 + 3 * (sqrt 4))

Maybe I am missing something that would create difficulties for this system, but I tried the modification suggested above and it actually worked, so I am inclined to think this could be a good way to implement infix math.
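One way such an analysis function could look is a precedence-climbing pass over an already-read flat list. The Python sketch below is an assumption-laden illustration: `regroup` and the precedence tables are hypothetical, every operator is treated as binary and left-associative, and nested lists such as `["sqrt", 4]` are taken to be ordinary prefix calls that are left untouched.

```python
def regroup(items, precedence):
    """Turn a flat infix list into nested prefix lists by precedence climbing."""
    if not items:
        raise SyntaxError("empty expression")
    pos = 0

    def climb(min_prec):
        nonlocal pos
        left = items[pos]  # an operand: atom or already-prefix list
        pos += 1
        while pos < len(items):
            op = items[pos]
            if not isinstance(op, str) or op not in precedence:
                raise SyntaxError(f"expected an operator, got {op!r}")
            if precedence[op] < min_prec:
                break  # the caller handles this looser operator
            pos += 1
            if pos >= len(items):
                raise SyntaxError(f"missing operand after {op!r}")
            right = climb(precedence[op] + 1)
            left = [op, left, right]
        return left

    return climb(0)


ARITH = {"+": 1, "-": 1, "*": 2, "/": 2}  # hypothetical table
tree = regroup([2, "+", 3, "*", ["sqrt", 4]], ARITH)
```

The same pass also illustrates almkglor's point about read-time precedence: for x sym1 y sym2 z the result is (sym1 x (sym2 y z)) or (sym2 (sym1 x y) z) depending entirely on which table is supplied.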

-----

3 points by cadaver 6491 days ago | link

I thought about this a little bit and I suppose the simple way would be to determine precedence based on the particular functions the symbols are bound to, rather than on the symbols themselves. But wouldn't you really want symbol precedence?

-----

2 points by almkglor 6493 days ago | link

You might be interested in our discussions on readable-discuss@lists.sourceforge.net . We've been discussing such things for quite some time now.

http://www.mail-archive.com/readable-discuss@lists.sourcefor...
http://dir.gmane.org/gmane.lisp.readable-lisp

In particular, you might be interested in my discussion of the fictional language "Stutter":

http://www.mail-archive.com/readable-discuss@lists.sourcefor...

-----

1 point by mdemare 6493 days ago | link

How about a much simpler and less intrusive method? Add a reader method for curly brackets that switches the first and the second argument around. So:

    (a b c) = {b a c}

That means you can write a C-style expression such as this one:

    (f(x) + g(y)) * h(z)

either prefix-wise:

    (* (+ (f x) (g y)) (h z))

or infix-wise:

    {{(f x) + (g y)} * (h z)}

It doesn't give you precedence; worse, you cannot omit any brackets. But it does give you infix, and it's completely optional, and trivial to understand or implement.
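The transformation is small enough to sketch in a few lines of Python. The name `swap_reader` is hypothetical, and the contents of a curly group are assumed to arrive as a list whose nested groups were rewritten first.

```python
def swap_reader(items):
    """Rewrite {b a c ...} as (a b c ...) by swapping the first two elements."""
    if len(items) < 2:
        raise SyntaxError(f"curly group needs at least two elements: {items!r}")
    first, second, *rest = items
    return [second, first, *rest]


# {{(f x) + (g y)} * (h z)}
inner = swap_reader([["f", "x"], "+", ["g", "y"]])
outer = swap_reader([inner, "*", ["h", "z"]])
```

Here `outer` is the list form of the prefix expression `(* (+ (f x) (g y)) (h z))`.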

-----

4 points by gmlk 6493 days ago | link

I think the + in arc is not a +, it is a ∑

The + and others are special binary cases for a more generic function.

-----

1 point by dreeves 6492 days ago | link

For those saying "but! but! macros!" please read section 7 of http://www.paulgraham.com/popular.html

This is about syntactic sugar only. Everything is still s-expressions underneath.

I don't know the right way to do this yet (Mathematica's is the best way I've seen so far), but there is one thing I'd like to implore Paul to do:

Forbid all special characters in symbol names. Nice as they are, they will be more valuable as syntactic sugar.

(Parting thought:

  x*2+7 - f(5/2,y+8)

  (- (+ (* x 2) 7) (f (/ 5 2) (+ y 8)))

)

-----

2 points by stefano 6493 days ago | link

Using M-expressions would reduce the regularity of Lisp, IMO. But mathematical expressions are more readable with infix notation. I think it would be nice to be able to say (math 8+9*(sqrt 67)) instead of (+ 8 (* 9 (sqrt 67))).

-----