Arc Forum | Variables & scoping complaint

Arc Forum

	Variables & scoping complaint
	3 points by uu 2335 days ago \| 14 comments
	Kind of sucks that a lot of arc's variables are global by default (using the "=" function) so you need to wrap everything in "with" or "let" commands If I define something, only let me use it at its level or lower!!! Isn't that a sane thing to expect from a contemporary, "correct" functional programming language???? C and Python get it right!!! Global scope is NOT the proper default.

3 points by rocketnia 2329 days ago | link

It looks like the particular solution you'd like is for `=` to act as it does in Python, but I think Arc's behavior is preferable to Python's.

For a variable to "default" to "global scope" is basically the definition of how lexical scope works. To find a variable's binding occurrence, you look outward until you find it. Once you go far enough out, you get to the language definition itself, which ultimately must provide some "global" catchall case.

Python makes lexical scope much more complex to describe: Every variable is declared at a module (or REPL), class, or function boundary. To find the boundary where a variable is declared, first you look for the nearest boundary. If you've found a module boundary or a function boundary where the variable is in the parameter list, you're done. Otherwise, you search that boundary for any `=`, `global`, or `nonlocal` declarations of that variable. If you find `global`, you skip to the nearest module boundary, and you're done. Otherwise, if you find one or more `=` declarations and don't find `nonlocal`, you're done. Otherwise, you repeat this process at the next nearest boundary.

I do sympathize with a couple of the pain points you're talking about:

1. If a variable name is written with a typo, Arc's `=` will silently assign to a misspelled global binding. I think it'd be more ideal to alert the programmer to an error like C does.

2. To declare a local variable, an Arc programmer must use things like `let` and `with`, which add layers of parentheses and indentation. In C (ever since C99) and Python, local variable declarations can be performed in the middle of a block, and they don't affect the indentation of the following code.

For point 1, most of the time I like to program without using `=` at all.

For point 2, I think Lisp's prefix notation and lambda notation facilitate a style that uses more deeply nested expressions than you might be used to. When expressions can be more deeply nested, they don't have to be broken apart across quite as many local variables, so writing `let` and `with` forms here and there doesn't pose a serious problem most of the time. That said, I use what I call "weak opening parens" and a certain indentation style to do away with this extra nesting anyway (https://github.com/lathe/parendown-for-racket).

-----

3 points by i4cu 2329 days ago | link

This definitely not great, but having to use 'with' is not the real issue for me. In fact if you read through the code you'll notice that this style is often used deliberately (i.e. some people might argue that it's a feature).

I think the bigger issue is that global vars can be used without any errors or conflict warnings. So someone else can create code which sets globals which then gets loaded via libs and will pollute your program. So you could have lib#1 set var 'x' only for it to get clobbered by lib#2's version of var 'x' and arc just ignores this. That's bad. Really bad.

-----

2 points by krapp 2329 days ago | link

>So you could have lib#1 set var 'x' only for it to get clobbered by lib#2's version of var 'x' and arc just ignores this. That's bad. Really bad.

And how often is it intended behavior? How often does one want to have a macro parameter or apparently "local" variable overwrite an existing symbol globally and arbitrarily?

I can see it being useful in the case of "hot patching" code to just redefine a function, but that's still kind of crazy, since now the source code and running code are no longer the same. Might as well just edit the file and reload it, and have one source of truth.

It seems like an anti-feature. Something that seemed like a good idea in theory, but which in practice does more harm than good.

-----

2 points by akkartik 2328 days ago | link

I'm not following the two of you on precisely what this anti-feature is. Assigning to local variables using '=' does not create new global variables. Can you share a code sample showing what you're talking about?

-----

1 point by i4cu 2328 days ago | link

> Assigning to local variables using '=' does not create new global variables.

We're referring to the original posts problem where using = inside a function can redefine global variables:

  arc> (= data* '(green red)) 

  arc> (def my-fn () (= data* nil))
  #<procedure: my-fn>
  arc> (my-fn)
  nil
  arc> data*
  nil

Now my comment was referring a bigger problem that falls out of this 'feature' or 'anti-feature' (depending on your point of view).

assume you place this in html.arc

  (= data* '(blue red))

  ; (add-color 'purple)
  (def add-color (color)
    (push color data*))

Then I create apple tracker library 'lib/apples.arc' with:

  (= data* '(green red))

  ; (add-apple 'golden)
  (def add-apple (color)
    (push color data*))

Now at the top level of my progam:

  (load "lib/apples.arc")
  (load "lib/html.arc")

; now say the first apple in storage has gone brown:

  > (string (car data*) " has gone brown")

  "blue has gone brown"

say what? my blue apple has gone brown?

why is there no warning or error on loading my lib? That's pretty bad IMO.

-----

1 point by akkartik 2331 days ago | link

Sorry I just saw this. Can you elaborate? What is an example of a variable that C and Python define correctly but has a worse equivalent in Arc? Or what's a variable in Arc that bit you by being globally scoped?

In particular I'm surprised that you mentioned C. If we're worse than C in this respect I'd love to understand that better.

-----

3 points by sht 2331 days ago | link

He wants it to be a little more like scheme wherein:

   (define (f)
     (define v 10)
     ...)

creates v as a local variable. In Arc, and most lisps, you need to do:

   (defun f ()
     (let ((v 10)) 
       ...))

Additionally, Arc has the wrinkle (i think, i haven't used Arc recently) where if you do:

   (defun f ()
     (= v 10)
     ...)

you create a global variable and assign 10 to it, instead of the python default which creates v in local scope. Basically, he's trying to use imperative programming and doesn't want to declare local variables using let.

-----

2 points by rain1 2330 days ago | link

I have been thinking about a modified version of LET that takes a sequence of expressions (like begin) that are either assignments or not assignments. Then it would collect all consecutive assignments together into a letrec. For example:

    (let
      (foo)
      (= a 1)
      (= b 2)
      (= c 3)
      (bar)
      (= x 4)
      (= y 5)
      (baz))

would get collected into

    (begin
      (foo)
      (letrec ((a 1)
               (b 2)
               (c 3))
        (bar)
        (letrec ((x 4)
                 (y 5))
          (baz))))

and this "LET" form would take place of the implicit begin we have in most contexts. What do you think? Could it be useful.

-----

3 points by uu 2321 days ago | link

Thank you, that's 100% on point!

Setting local variables should be simpler (or equivalent to) setting global variables. It's so much easier to do (= v 10) than (let ((v 10) ...)) despite "let" / "with" arguably being preferable to "=" in most cases.

-----

2 points by akkartik 2330 days ago | link

I see. Yes, it can be a little sharp-edged for new-comers from other languages that `=` auto-defines globals. It's not immediately obvious that we don't use it much. Let me think about how to improve that.

Thanks for the clarification!

-----

3 points by krapp 2329 days ago | link

Yeah, as a newcomer I can attest that I did not know that. V here being global is really counter-intuitive:

  (defun f ()
     (= v 10)
     ...)

I would have expected (and would prefer) the default to be to bind to whatever the current scope is, and to have global (file level, then application level) scope be opt-in rather than opt-out. We can't assume that it's more likely new users will be familiar with lisp idioms.

-----

2 points by rocketnia 2327 days ago | link

In your example, the variables in scope inside `f` are exactly the same as the variables in scope outside of it. You're implying the "current scope" has changed, but if the set of variables in scope hasn't changed, then what other part has?

I think you were expecting a concept that Arc doesn't have. Adding an unnecessary concept to Arc would make the language more complicated for newcomers who weren't expecting it, right?

Of course, if users do consistently come in with the same intuition, it's pointless to design against it. Often we've gotta design for complex realities rather than simple principles. :) If you think it's a case like that, I can sympathize, and maybe someday I'll see it.

Sometimes I feel like general-purpose plain text programming language is such a specific topic that it leads to only one possible language design. Feeling that way is probably the only way I'll design a single language at all, rather than designing a lot of half-languages and never finishing any of them!

In this case, I actually _don't_ feel like there's _no_ potential to alternative notions of lexical scope or variable assignment, but I think Arc's at a sweet spot, and I've some extensive reasoning as to why....

---

"We can't assume that it's more likely new users will be familiar with lisp idioms."

If not some other language's idioms, where did you get the idea of there being a "current scope"?

It's true that many popular languages have features where they infer a variable declaration at some notion of "current scope" around innermost point (Python) or outermost point (CoffeeScript, Ruby, MATLAB) where a variable is assigned. Newcomers to Arc from to those languages might expect this. (I think uu must be bringing in Python experience.)

I think some languages (R, Kernel, maybe some Scheme interpreters) represent the lexical scope as a run time data structure, and variable assignments can add new variables to the local scope that were previously looked up from an outer scope. They have lexical scope, but arguably not static scope.

I also want to mention PHP, which is off doing its own thing where there's hardly any implicit inheritance between lexical scopes at all. Every variable lookup or assignment is restricted to the current function unless there's an explicit `use` or `global` declaration to imply otherwise. I kind of admire PHP's willingness to make the interaction between scopes explicit like this; it means PHP could evolve to have different parts of the code written in different languages, with explicit marshalling of values between all of them.

Newcomers coming in from any of these languages might have different expectations. And that's not to mention newcomers from JavaScript, Perl, Scheme, Common Lisp, Clojure, Erlang, Haskell, Elm, Java, C#, etc., who probably expect Arc's scoping to work exactly as it already does (or who raise completely unrelated issues, like objections to Arc's unhygienic macros :-p ).

So let's look at Arc as its own language.

Thanks to Arc's lexically scoped `fn`, it's basically an extension of the lambda calculus, and it has easy access to all known lambda calculus techniques for Turing-complete computation. This means Arc programmers basically don't have to use assignment at all unless they want to.

In Python, those lambda calculus techniques are possible to use in theory, but every nontrivial lambda must be named and pulled out onto its own line, giving us something a lot like `goto` label spaghetti.

In PHP, every nontrivial lambda must have a `use` declaration to pull in all the variables it captures. This can get to be particularly verbose, eventually to the point where it might be easier to pass around explicit context objects.

Even using lambda calculus techniques a little bit in Python or PHP means we start to have trouble with mutable variables. Lambda calculus uses functions for control flow, but using functions in Python or PHP means creating new scopes, which means we can't easily assign to outer variables from inside our conditionals and loops. Most uses of mutable variables involve some kind of conditional or loop (or variable capture for its own sake), since that's what makes them anything more than a sequence of variables that happen to share the same name. So the more we use lambda-calculus-style conditionals and loops, the less we effectively have access to mutable variables in the programs we're writing.

In both Python and PHP, it just takes a little more boilerplate to work around this: We give up on mutable variables altogether and simulate them with immutable variables that refer to mutable objects. (There's also `use (&$foo)` in PHP and `nonlocal foo` in Python, if you prefer not to give up on mutable variables, but they amount to far more boilerplate.)

The standard boilerplate for these things in Arc is pretty much less than zero, thanks to macros. An Arc programmer can write a custom conditional operator as a higher-order function, and then when they're tired of putting the conditional branches in `fn` every time they use it, they can write a macro that generates the `fn` automatically.

Since Scheme and Common Lisp were already well-worn combinations of lexically scoped `lambda`, mutation (`set!`/`setq`), and macros, all of this could pretty much be predicted when Arc was designed.

Nevertheless (or maybe out of having different goals than I'm expressing here), Paul Graham and co. tried out automatic local variables anyway. It was implemented for an early, unreleased version of Arc.[1] Then they pulled this feature out because they realized they kept introducing or removing lexical contours by mistake and breaking parts of their code.[2] I bet this is because they were implementing some of their control flow macros in terms of `fn`.

Could it be possible to follow through on their experiment without recognizing all the same mistakes and pulling the plug again? Yes, I bet it is.[3]

But I think Arc's local variable scoping rules and variable assignment behavior are exactly what they need to be:

- Implicit inheritance of lexical scope to enable lambda calculus techniques (unlike PHP).

- The easy ability to mutate variables in outer scopes so mutation can work together with lambda-calculus-style control flow (unlike Python, R, and Kernel).

This still leaves the CoffeeScript/Ruby/MATLAB approach on the table, where only the outermost assignment is treated as a declaration. I don't particularly like this approach, and that's because I prefer for the outermost level to be relatively seamless with the rest of the language. That way it's easier to break parts of the language off into optional libraries when it turns out they're not as helpful as expected. Arc's top level already isn't seamless with the rest of the language, but I think this would be a step in the wrong direction.

In summary: If users come in with incorrect ideas about Arc's variable assignment behavior based on their experience with other languages, I think that's most likely a place where other languages could learn something from Arc rather than the other way around. The higher-order techniques of lambda calculus are a sweet spot in language design, and Arc's system for local variable scope is well tailored to that. The Arc designers originally did try automatic local variables. They found them to be unnecessarily complex to work with, and I agree.

---

[1] http://www.paulgraham.com/arcll1.html "Here is a big difference between Arc and previous Lisps: local variables can be created implicitly by assigning them a value. If you do an assignment to a variable that doesn't already exist, you thereby create a lexical variable that lasts for the rest of the block. (Yes, we know this will make the code hard to compile, but we're going to try.)"

[2] http://paulgraham.com/arclessons.html "In Arc we were planning to let users declare local variables implicitly, just by assigning values to them. This turns out not to work, and the problem comes from an unforeseen quarter: macros. [...] In a language with implicit local variables and macros, you're always tripping over unexpected lexical contours. You don't want to create new lexical contours without announcing it. [...] It seemed to us a bad idea to have a feature so fragile that its own implementors couldn't use it properly. So no more implicit local variables."

[3] In Racket, the `racket/splicing` module (https://docs.racket-lang.org/reference/splicing.html) has a few rough edges, but it's a good example of how the choice of whether a macro changes the "current scope" can be controlled deliberately, even in a language with lambdas and macros. I didn't bring up Racket or Scheme's notion of "current scope" with all the other examples because it doesn't interact with variable assignment, but I think even that notion is a kind of ill-conceived complexity that I'm glad Arc doesn't have. It's handy to have local syntax that roughly resembles the top level to aid in refactoring, but on the one hand the resemblance isn't required to be perfect (and isn't perfect in Racket), and on the other hand the Scheme top level isn't very easy to make modular, so it's not even a good thing to resemble.

-----

2 points by rocketnia 2324 days ago | link

Oh, right, here's another reason I don't like the CoffeeScript/Ruby/MATLAB approach where a variable is declared automatically at the outermost point where it's assigned: I like to be able to declare local variables that shadow variables from outer scopes. When `=` declares non-shadowing variables only (since the rest of the time it acts as an assignment rather than a declaration), shadowing is more cumbersome to do.

-----

2 points by rain1 2327 days ago | link

point [2] sounds like a problem of non-hygienic macros. perhaps it can be reconsidered now that there's syntax objects.

-----