Arc Forumnew | comments | leaders | submit | rocketnia's commentslogin
3 points by rocketnia 56 days ago | link | parent | on: Using Arc in Production

I totally forgot this too. XD

reply

2 points by rocketnia 56 days ago | link | parent | on: Variables & scoping complaint

In your example, the variables in scope inside `f` are exactly the same as the variables in scope outside of it. You're implying the "current scope" has changed, but if the set of variables in scope hasn't changed, then what other part has?

I think you were expecting a concept that Arc doesn't have. Adding an unnecessary concept to Arc would make the language more complicated for newcomers who weren't expecting it, right?

Of course, if users do consistently come in with the same intuition, it's pointless to design against it. Often we've gotta design for complex realities rather than simple principles. :) If you think it's a case like that, I can sympathize, and maybe someday I'll see it.

Sometimes I feel like general-purpose plain text programming language is such a specific topic that it leads to only one possible language design. Feeling that way is probably the only way I'll design a single language at all, rather than designing a lot of half-languages and never finishing any of them!

In this case, I actually _don't_ feel like there's _no_ potential to alternative notions of lexical scope or variable assignment, but I think Arc's at a sweet spot, and I've some extensive reasoning as to why....

---

"We can't assume that it's more likely new users will be familiar with lisp idioms."

If not some other language's idioms, where did you get the idea of there being a "current scope"?

It's true that many popular languages have features where they infer a variable declaration at some notion of "current scope" around innermost point (Python) or outermost point (CoffeeScript, Ruby, MATLAB) where a variable is assigned. Newcomers to Arc from to those languages might expect this. (I think uu must be bringing in Python experience.)

I think some languages (R, Kernel, maybe some Scheme interpreters) represent the lexical scope as a run time data structure, and variable assignments can add new variables to the local scope that were previously looked up from an outer scope. They have lexical scope, but arguably not static scope.

I also want to mention PHP, which is off doing its own thing where there's hardly any implicit inheritance between lexical scopes at all. Every variable lookup or assignment is restricted to the current function unless there's an explicit `use` or `global` declaration to imply otherwise. I kind of admire PHP's willingness to make the interaction between scopes explicit like this; it means PHP could evolve to have different parts of the code written in different languages, with explicit marshalling of values between all of them.

Newcomers coming in from any of these languages might have different expectations. And that's not to mention newcomers from JavaScript, Perl, Scheme, Common Lisp, Clojure, Erlang, Haskell, Elm, Java, C#, etc., who probably expect Arc's scoping to work exactly as it already does (or who raise completely unrelated issues, like objections to Arc's unhygienic macros :-p ).

So let's look at Arc as its own language.

Thanks to Arc's lexically scoped `fn`, it's basically an extension of the lambda calculus, and it has easy access to all known lambda calculus techniques for Turing-complete computation. This means Arc programmers basically don't have to use assignment at all unless they want to.

In Python, those lambda calculus techniques are possible to use in theory, but every nontrivial lambda must be named and pulled out onto its own line, giving us something a lot like `goto` label spaghetti.

In PHP, every nontrivial lambda must have a `use` declaration to pull in all the variables it captures. This can get to be particularly verbose, eventually to the point where it might be easier to pass around explicit context objects.

Even using lambda calculus techniques a little bit in Python or PHP means we start to have trouble with mutable variables. Lambda calculus uses functions for control flow, but using functions in Python or PHP means creating new scopes, which means we can't easily assign to outer variables from inside our conditionals and loops. Most uses of mutable variables involve some kind of conditional or loop (or variable capture for its own sake), since that's what makes them anything more than a sequence of variables that happen to share the same name. So the more we use lambda-calculus-style conditionals and loops, the less we effectively have access to mutable variables in the programs we're writing.

In both Python and PHP, it just takes a little more boilerplate to work around this: We give up on mutable variables altogether and simulate them with immutable variables that refer to mutable objects. (There's also `use (&$foo)` in PHP and `nonlocal foo` in Python, if you prefer not to give up on mutable variables, but they amount to far more boilerplate.)

The standard boilerplate for these things in Arc is pretty much less than zero, thanks to macros. An Arc programmer can write a custom conditional operator as a higher-order function, and then when they're tired of putting the conditional branches in `fn` every time they use it, they can write a macro that generates the `fn` automatically.

Since Scheme and Common Lisp were already well-worn combinations of lexically scoped `lambda`, mutation (`set!`/`setq`), and macros, all of this could pretty much be predicted when Arc was designed.

Nevertheless (or maybe out of having different goals than I'm expressing here), Paul Graham and co. tried out automatic local variables anyway. It was implemented for an early, unreleased version of Arc.[1] Then they pulled this feature out because they realized they kept introducing or removing lexical contours by mistake and breaking parts of their code.[2] I bet this is because they were implementing some of their control flow macros in terms of `fn`.

Could it be possible to follow through on their experiment without recognizing all the same mistakes and pulling the plug again? Yes, I bet it is.[3]

But I think Arc's local variable scoping rules and variable assignment behavior are exactly what they need to be:

- Implicit inheritance of lexical scope to enable lambda calculus techniques (unlike PHP).

- The easy ability to mutate variables in outer scopes so mutation can work together with lambda-calculus-style control flow (unlike Python, R, and Kernel).

This still leaves the CoffeeScript/Ruby/MATLAB approach on the table, where only the outermost assignment is treated as a declaration. I don't particularly like this approach, and that's because I prefer for the outermost level to be relatively seamless with the rest of the language. That way it's easier to break parts of the language off into optional libraries when it turns out they're not as helpful as expected. Arc's top level already isn't seamless with the rest of the language, but I think this would be a step in the wrong direction.

In summary: If users come in with incorrect ideas about Arc's variable assignment behavior based on their experience with other languages, I think that's most likely a place where other languages could learn something from Arc rather than the other way around. The higher-order techniques of lambda calculus are a sweet spot in language design, and Arc's system for local variable scope is well tailored to that. The Arc designers originally did try automatic local variables. They found them to be unnecessarily complex to work with, and I agree.

---

[1] http://www.paulgraham.com/arcll1.html "Here is a big difference between Arc and previous Lisps: local variables can be created implicitly by assigning them a value. If you do an assignment to a variable that doesn't already exist, you thereby create a lexical variable that lasts for the rest of the block. (Yes, we know this will make the code hard to compile, but we're going to try.)"

[2] http://paulgraham.com/arclessons.html "In Arc we were planning to let users declare local variables implicitly, just by assigning values to them. This turns out not to work, and the problem comes from an unforeseen quarter: macros. [...] In a language with implicit local variables and macros, you're always tripping over unexpected lexical contours. You don't want to create new lexical contours without announcing it. [...] It seemed to us a bad idea to have a feature so fragile that its own implementors couldn't use it properly. So no more implicit local variables."

[3] In Racket, the `racket/splicing` module (https://docs.racket-lang.org/reference/splicing.html) has a few rough edges, but it's a good example of how the choice of whether a macro changes the "current scope" can be controlled deliberately, even in a language with lambdas and macros. I didn't bring up Racket or Scheme's notion of "current scope" with all the other examples because it doesn't interact with variable assignment, but I think even that notion is a kind of ill-conceived complexity that I'm glad Arc doesn't have. It's handy to have local syntax that roughly resembles the top level to aid in refactoring, but on the one hand the resemblance isn't required to be perfect (and isn't perfect in Racket), and on the other hand the Scheme top level isn't very easy to make modular, so it's not even a good thing to resemble.

reply

2 points by rocketnia 53 days ago | link

Oh, right, here's another reason I don't like the CoffeeScript/Ruby/MATLAB approach where a variable is declared automatically at the outermost point where it's assigned: I like to be able to declare local variables that shadow variables from outer scopes. When `=` declares non-shadowing variables only (since the rest of the time it acts as an assignment rather than a declaration), shadowing is more cumbersome to do.

reply

2 points by rain1 56 days ago | link

point [2] sounds like a problem of non-hygienic macros. perhaps it can be reconsidered now that there's syntax objects.

reply

3 points by rocketnia 58 days ago | link | parent | on: Variables & scoping complaint

It looks like the particular solution you'd like is for `=` to act as it does in Python, but I think Arc's behavior is preferable to Python's.

For a variable to "default" to "global scope" is basically the definition of how lexical scope works. To find a variable's binding occurrence, you look outward until you find it. Once you go far enough out, you get to the language definition itself, which ultimately must provide some "global" catchall case.

Python makes lexical scope much more complex to describe: Every variable is declared at a module (or REPL), class, or function boundary. To find the boundary where a variable is declared, first you look for the nearest boundary. If you've found a module boundary or a function boundary where the variable is in the parameter list, you're done. Otherwise, you search that boundary for any `=`, `global`, or `nonlocal` declarations of that variable. If you find `global`, you skip to the nearest module boundary, and you're done. Otherwise, if you find one or more `=` declarations and don't find `nonlocal`, you're done. Otherwise, you repeat this process at the next nearest boundary.

I do sympathize with a couple of the pain points you're talking about:

1. If a variable name is written with a typo, Arc's `=` will silently assign to a misspelled global binding. I think it'd be more ideal to alert the programmer to an error like C does.

2. To declare a local variable, an Arc programmer must use things like `let` and `with`, which add layers of parentheses and indentation. In C (ever since C99) and Python, local variable declarations can be performed in the middle of a block, and they don't affect the indentation of the following code.

For point 1, most of the time I like to program without using `=` at all.

For point 2, I think Lisp's prefix notation and lambda notation facilitate a style that uses more deeply nested expressions than you might be used to. When expressions can be more deeply nested, they don't have to be broken apart across quite as many local variables, so writing `let` and `with` forms here and there doesn't pose a serious problem most of the time. That said, I use what I call "weak opening parens" and a certain indentation style to do away with this extra nesting anyway (https://github.com/lathe/parendown-for-racket).

reply


I've left a code review at: https://github.com/arclanguage/anarki/pull/161#pullrequestre...

Basically, I think it's a bad idea to change `ac` into something which sometimes compiles Arc code and sometimes does something more like code-walking over s-expressions. I think Anarki's existing "$ ... unquote" syntax already serves this purpose and uses the same kind of code-walking but does so with a better separation of concerns.

Moreover, the way you're taking out the |foo| syntax so you can redefine it to be a variant of $ seems like a net loss.

(Some of my other comments on that review are less fundamental objections: Style nitpicks, observations of bugs, or wild ideas that I don't really expect anyone to act on in the short term.)

-----

3 points by krapp 68 days ago | link

>or wild ideas that I don't really expect anyone to act on in the short term

Those are the best kind of ideas!

-----


What you're describing is almost like what `or=` does already, just without causing an error if the variable starts out undefined. Maybe it can just be a change to `or=` rather than taking up the ^ character's real estate.

The way you're using it sounds a lot like `defvar` in Common Lisp. I've never used this technique, but it sounds useful for experimental programming like modifying a server, modifying an editor, modifying a music loop (live coding) without undergoing Arc's long load times each time. ;)

Any language with user-defined macros is gonna have some trouble with its compile time or load time, since it's positively designed to run arbitrarily slow user-defined code at that time. So as long as macros at all are part of a 100-year language, then some way to reload pieces of code without undergoing the process of loading all the rest of this code must play a certain part as well. So `defvar` style can be handy, but in a more pure language we might look to incremental computation research or something.

In the longer term, how do you figure it'd be good for Anarki control whether something gets overwritten or not? Seems to me like in a perfect world, you'd sometimes want to interpret the same definition as an overwriting definition or a non-overwriting definition without changing its syntax or anything. :-p Like, maybe we'd eventually want a `load-force` operation that loads a file in an "overwriting way." At that time, if we simply have `load-force` interpret all `or=` as `=`, then it might clobber too many uses of `or=`, so my advice to use `or=` for this will have turned out to be regrettable.

But the paradox is easily resolved if, for instance, the "overwriting way" doesn't make sense to you anyway. Like, maybe you'd just restart the whole (e.g.) server in that case, or go out of the way to rewrite every `defvar` into a `defparameter` (or in your case `^` into `=`) by hand.

-----

3 points by rocketnia 72 days ago | link

I also notice that the scope of this refactoring (= ...) into (^ ...) or (or= ...) is the same kind of refactoring that would help make Anarki a more modular language. Usually (= foo* (table)) is a sign that the code is really trying to create a makeshift sub-namespace. If intentions like this conveyed more explicitly with operations like (declare-sub-namespace foo*), then developing more useful module systems for Anarki would be a simpler task.

-----

2 points by krapp 72 days ago | link

When I was playing around with ns.arc, one of the things I tried was namespacing a table of functions and treating it like a class.

I think two changes would be better for modularity than changing assignment, however:

First, limit the ability of macros to overwrite existing symbols to the file in which the macro is called. This would allow modular or imported code to use macros without worrying about global namespace collisions, or needing to do anything special with assignments.

Second, and related to the first, would be to implicitly namespace based on file location. Currently, Arc uses a single namespace, and a reference to a root directory that's used to disambiguate relative paths called with (require).

We would ignore the root folder and use the remaining path and file name as the namespace.

This would allow explicitly resolving possible namespace collisions by referring to the namespace where necessary, the way it's done in other languages (at least in C++.) So if you had, say, a macro in news.arc with body as a parameter, first, it wouldn't mutate body in html.arc, and you could refer to (html!body) directly to disambiguate, using the same syntax for namespaces as tables.

An added benefit would be that imported code from a repo (adding a vendor path) would work as is namespaced as vendor/file or vendor/app/file.

It wouldn't be necessary to declare namespaces in most cases then. We could use the existing with statement to declare or extend namespaces instead of something like (declare-sub-namespace), similar to 'using' in C++

    (w/namespace foo
      (= bar (table))

    (foo!bar)

-----

3 points by shawn 72 days ago | link

Right. I started work on making anarki reloadable: https://github.com/arclanguage/anarki/pull/158

-----

3 points by shawn 72 days ago | link

In the longer term, how do you figure it'd be good for Anarki control whether something gets overwritten or not?

Definitely leave it up to the user.

Some variables (like constants) should always be overwritten, and so the user should write `(= foo* 42)` for those.

Other variables (like tables containing state) should only be set once on startup.

Like, maybe we'd eventually want a `load-force` operation that loads a file in an "overwriting way." At that time, if we simply have `load-force` interpret all `or=` as `=`, then it might clobber too many uses of `or=`, so my advice to use `or=` for this will have turned out to be regrettable.

Perhaps, though FWIW I haven't needed a force-reload type operation. That's accomplished via restarting the server.

The only drawback for `or=` is that if you have code like this:

  (= foo* (table)
     bar* (table)
     ...)
then you'd have to reindent the whole expression if you change from `=` to `or=`.

That's not a big deal though. I think I prefer `or=`.

-----

3 points by rocketnia 72 days ago | link

Thanks for answering. That makes the intentions pretty clear. :)

Another thing... Have you considered initializing the application state in one file and the hardcoded constants and functions in another, so that when you're changing the code, you can reload the constants file without the state file getting involved at all?

-----

3 points by krapp 72 days ago | link

If only there were some general purpose way to store stateful data separately from source code.

Like a... "base" for "data."

-----

3 points by akkartik 72 days ago | link

To be fair, not all the things we use global variables for would end up in a persistent store. Some of them we want to stay in sync with code.

-----

2 points by krapp 71 days ago | link

Sure, but losing data you don't want to lose because you reloaded a source code file does seem like more of an architectural than language issue. It would be a code smell in any other language.

My comment was slightly facetious but the more I think about it the more I'm wondering whether something like redis or php's apc wouldn't be a good idea - and not just as a lib file but integrated into Racket's processes for dealing with arc data directly.

It could serve both as a global data store and a basis for namespacing code in the future (see my other rambling comment about namespaces), since a "namespace" could just be a table key internally.

-----

2 points by i4cu 71 days ago | link

Are you suggesting a custom internal db?

Otherwise hitching the code to a third-party db, as a requirement, would really limit what could be done with the language and would create all kinds of problems. You would be locked into the db platform as a hardened limitation. You would inherit the complexity of external forces (i.e. what if some other app deletes or messes with the db). What about securing the access/ports on the db.. etc..

It's always possible, but I think you would have to implement something internal where you can properly isolate and fully support all platforms the language does.

Seems likes namespaces would solve these problems the right way.

-----

3 points by krapp 71 days ago | link

> Are suggesting a custom internal db?

Yes. Currently, the options we have for stateful data are file I/O, which doesn't work perfectly, or tables that can lose their state if the file they're in gets reloaded. I'm suggesting something like Redis or APC, but implemented in Arc at the language level, to separate that state from the source code.

I was also thinking (in vague, "sketch on the back of a coffee-stained napkin" detail) that it could also be used to flag variables for mutability and for namespacing. In that if you added "x" from foo.arc it would automatically be namespaced by filename and accessible elsewhere as "foo!x",so it wouldn't conflict with "x" in bar.arc.

>Otherwise hitching the code to a third-party db, as a requirement, would really limit what could be done with the language and would create all kinds of problems.

Yeah, but to be fair, Arc is already hitched to Racket, which appears to support SQL and SQLite, so maybe third party wrappers for that wouldn't be a bad idea as well... sometime in the future when things are organized enough that we can have a robust third party ecosystem.

-----

2 points by i4cu 71 days ago | link

> Arc is already hitched to Racket, which appears to support SQL and SQLite...

Well racket supports SQL and SQL lite as an option, but racket can also run on platforms that don't support them so it's not 'hitched'. i.e. compiling to run on micro-controllers, mobile devices etc.

-----

2 points by akkartik 71 days ago | link

Languages that have decent bindings to a database also have global variables that still have uses, and that can be lost when you restart the server or do other sorts of loading manipulations. There's a category of state that you want coupled to the state of the codebase.

Yes, you can definitely try to make these different categories of state less error-prone by architectural changes. But I don't think other languages do this well either. Mainstream languages, at least. I know there's research on transparent persistence where every global mutation is automatically persisted, and that's interesting. But I'm not aware of obvious and mature tooling ideas here that one can just bolt on to Arc.

All that said, database bindings would certainly be useful to bolt on to Arc.

-----

4 points by rocketnia 76 days ago | link | parent | on: Running in DrRacket?

The simple answer is no, there is no particularly good way to access the Anarki REPL in DrRacket at the moment.

There is a `#lang anarki`, but as far as I know nobody has used it yet. I only gave it a minimal amount of support to make it possible to use Anarki to write modules Racket code could (require ...), and I didn't pay any attention to DrRacket.

I recommend ignoring `#lang anarki` for now, because it might be subject to change as we figure out more of what things like DrRacket need from it.

Instead, I recommend running Anarki from the command line if you can, as described in the Anarki readme: https://github.com/arclanguage/anarki

Are you on Windows? Most of the instructions don't talk about Windows yet, but it's still possible there.

To open a REPL:

Linux and macOS:

  ./arc.sh
Windows:

  .\arc
If you've written a file of Anarki code in my-code.arc, you can run it like so:

Linux and macOS:

  ./arc.sh -n my-code.arc
Windows:

  .\arc my-code.arc
All these commands must be run from the Anarki directory (where arc.sh and arc.cmd are).

-----

3 points by nupa 75 days ago | link

Thanks! And yes I'm on windows, although that command didn't work at first -- protip, you can't run it from a powershell prompt, it has to be a cmd prompt. Weird.

-----

2 points by rocketnia 75 days ago | link

Interesting, 'cause I even tested it at a powershell prompt to make sure before I posted. What problems were you seeing?

-----

2 points by nupa 49 days ago | link

Apologies, I just saw this reply now. It looks like this:

    PS C:\git\anarki> .\arc
    Program 'arc.cmd' failed to run: Access is deniedAt line:1 char:1
    + .\arc
    + ~~~~~.
    At line:1 char:1
    + .\arc
    + ~~~~~
        + CategoryInfo          : ResourceUnavailable: (:) [], ApplicationFailedException
        + FullyQualifiedErrorId : NativeCommandFailed

reply

2 points by rocketnia 48 days ago | link

Thanks. I'm afraid I've never encountered something like that, but I can suggest a few things just in case....

Since arc.cmd is practically a one-line script, what do you get if you try to run the `racket` executable directly, like so?

  racket -t boot.rkt -e "(anarki-windows-cli)"
I've found pretty much just one thread online where people reported a similar issue (https://github.com/Microsoft/WSL/issues/2323). People attempted various fixes in the thread, and they didn't work for everyone.

Some of the approaches they took:

- There's a diagnostic tool which some people found limited success with. It would fix the location of their temp folder. Then a few days later it would break again. Here's a link to the comment that links to that tool: https://github.com/Microsoft/WSL/issues/2323#issuecomment-31...

- One person gave detailed instructions for diagnosing and fixing permission issues, in case that's what it is: https://github.com/Microsoft/WSL/issues/2323#issuecomment-41...

- It seems it might be some difference depending on whether the account you're doing this with is roaming or an administrator. At least, a lot of people in the thread thought it would make a difference. Maybe your command prompt works with Arc because it's a different user.

If you find out any more about what's going on, I hope you'll keep us in the loop. :)

Of course, since you already have this working via the command prompt, it's understandable if you prefer not to mess with it.

reply


This is what I think would be a great way to enter and print tables at the REPL:

  arc> (ob (v name "John Doe") (v age 23) (v id 73881))
  (ob (v age 23) (v id 73881) (v name "John Doe"))
And if they must be compatible with `read` and `write`, I think this would be a great way to render them for that:

  (##ob (v name "John Doe") (v age 23) (v id 73881))
This way it's just about as easy to refactor between `(##ob (v k1 ,v1)) and (ob (v k1 v1)) as it is to refactor between `(,a ,b ,c) and (list a b c).

(The v here is for "value." An alternate syntax, (kv ...), could be used for entries where the key isn't quoted.)

(Note that (##ob ...) here is a reader macro call. I'm using a design for reader macros that puts the macro name on the inside of a parenthesis, rather than the approach taken by things like Racket's #hash(...). That way, reader macro names can be descriptive without pushing the indentation far to the right.)

This approach generalizes to just about any other data structure we want, such as graphs, queues, sorted sets, etc. We don't have to pick out new parentheses for each one, and we don't have to specify idiosyncratic indentation rules either, so pretty printing at the REPL can be very nice automatically:

  (ob
    (v name "John Doe")
    (v age 23)
    (v id 73881))
  
  (ob
    (v name
      "John Doe")
    (v age
      23)
    (v id
      73881))
In terms of Racket implementation, it should be pretty easy to get most of this working using `gen:custom-write`, `make-constructor-style-printer`, and `pretty-print`. Racket's pretty-printer will probably give us results I find slightly less satisfying, but it's a start:

  (ob
   (v name "John Doe")
   (v age 23)
   (v id 73881))
  
  (ob
   (v
    name
    "John Doe")
   (v
    age
    23)
   (v
    id
    73881))
There are only a few other tricky parts:

- We may have to represent Arc tables as their own data structure, rather than directly as Racket hashes, so that they print nicely even when they're nested inside other Racket data structures like lists and vectors. This is one distinct place where, for the best possible Racket interop, we may need to avoid representing Arc values the same way as Racket ones. Then again, I think `port-print-handler` might provide the ability to print parts of Racket values using the Arc style, so it could be possible to get very nice interop here.

- In order to get (##foo to be processed as a call to an Arc reader macro called "foo", we would need to replace the Racket ( readtable entry with an entry that behaved the same as it does in non-## cases. Racket's ( syntax isn't as simple as it might seem, as I found out when I wrote a custom open parenthesis for Parendown, and I would be glad to copy out some of my Parendown code to make this work.

- Of course, it would take some design work to decide on Arc-side interfaces for defining things like reader macros, custom write behaviors, and maybe even custom REPL pretty print behaviors and custom quasiquotation behaviors (to determine where unquotes can go). In Racket, customization of the `write` or `print` behavior is usually done in a per-value-type way using `gen:custom-write`, but I think it would be better to associate them with the "current writer" or "current printer" somehow, just as the reader and macroexpander use the "current readtable" and the "current namespace." That would allow us to swap out the writer at the same time as we swap out the reader, rather than letting the `read` and `write` behavior get out of sync. Essentially, I would store all these things in the Arc namespace.

---

Would it be much trouble if I started working toward some of these things for Anarki or Amacx? If I do work on this, which things would need my help the most or would make the best milestones? Honestly, my top priorities right now are Punctaffy and Cene, so even though I can express opinions about Arc, I might not allocate the time to follow through on them myself. (My desire not to burden people with something that I think of as being in only in a half-finished state has always been one of the reasons I commit so rarely to Anarki.)

I know the reader syntax for tables bears very little resemblance to the curly brackets people have been talking about here, and I don't want to trample on that. Maybe tables can `write` with curly brackets while other things tend to use this more general-purpose style.

shawn, are you currently trying to write a full pretty-printer for Arc values from scratch just so Racket hashes can be written using curly brackets? Are you using `port-print-handler` or something? That's another thing I'd rather not trample on if you have an idea underway.

-----

2 points by i4cu 78 days ago | link

Personally, I don't think this is going to make the language more attractive. You've traded better printing for more verbose code.

  current-arc> (obj name "John Doe" age 23 id 73881)

  your-arc> (ob (v name "John Doe") (v age 23) (v id 73881))
maybe?:

  alt-arc> (ob name "John Doe" age 23 id 73881)
returns (Assuming you're attempting to have ordered tables?):

  (ob (v name "John Doe") (v age 23) (v id 73881))

-----

2 points by rocketnia 78 days ago | link

(I hope you don't mind if I change my mind and use `object` instead of `ob`. I just remembered `ob` is a pretty good local variable name for object values.)

Code could still use `obj`, even in the reader. These two things could be parsed as the same value:

  (##obj name "John Doe" age 23 id 73881)
  (##object (v name "John Doe") (v age 23) (v id 73881))
The reason I suggest interspersing extra brackets and v's, when the concise `obj` already exists, is to avoid idiosyncrasies of pretty-printing `obj` for larger examples.

Here's an example of how a nested table prints in the latest Anarki:

  arc> coerce*
  '#hash((bytes . #hash((string . #<procedure:...ne/anarki/ac.rkt:1128:21>)))
         (char
          .
          #hash((int . #<procedure:integer->char>)
                (num . #<procedure:...ne/anarki/ac.rkt:1133:21>)))
         (cons
          .
          #hash((queue . #<procedure:...t/private/kw.rkt:592:14>)
                (string . #<procedure:...ne/anarki/ac.rkt:1127:21>)
                (sym . #<procedure:...t/private/kw.rkt:592:14>)
                (table . #<procedure:...t/private/kw.rkt:592:14>)))
         (fn
  ...
As a human who can easily apply idiosyncratic rules, here's how I'd probably lay that out if I could only use (##obj ...):

  arc> coerce*
  '(##obj
     
     bytes (##obj string #<procedure:...ne/anarki/ac.rkt:1128:21>)
     
     char
     (##obj
       int #<procedure:integer->char>
       num #<procedure:...ne/anarki/ac.rkt:1133:21>)
     
     cons
     (##obj
       queue #<procedure:...t/private/kw.rkt:592:14>
       string #<procedure:...ne/anarki/ac.rkt:1127:21>
       sym #<procedure:...t/private/kw.rkt:592:14>
       table #<procedure:...t/private/kw.rkt:592:14>)
     
     fn
  ...
There are several idiosyncrasies in action there: I'm choosing not to indent values by the length of their keys, I'm choosing not to indent them further than their keys at all (or vice versa), I am grouping them on the same line when I can, and I'm putting in padding lines between every entry just because some of the keys and values are on separate lines.

Oh, and I'm not indenting things by the length of the "##obj" operation itself, just by two spaces in every case, but that's a more general rule I go by.

As far as Lisp code in general is concerned, those seem like personal preferences. I don't expect anyone to indent this quite the same way. Maybe people could take a shot at it and see if a consensus emerges here. :)

Now suppose I could only use `##object`:

  arc> coerce*
  '(##object
     (v bytes (##object (v string #<procedure:...ne/anarki/ac.rkt:1128:21>)))
     (v char
       (##object
         (v int #<procedure:integer->char>)
         (v num #<procedure:...ne/anarki/ac.rkt:1133:21>)))
     (v cons
       (##object
         (v queue #<procedure:...t/private/kw.rkt:592:14>)
         (v string #<procedure:...ne/anarki/ac.rkt:1127:21>)
         (v sym #<procedure:...t/private/kw.rkt:592:14>)
         (v table #<procedure:...t/private/kw.rkt:592:14>)))
     (v fn
  ...
This saves some lines by not needing whitespace to group keys with their objects. In even larger examples it can cost some lines since it introduces twice as much indentation at every level, so that might be a wash. What really makes a difference here is that all those pairs of parentheses can be pretty-printed just like function calls, so things that process the "##object" syntax don't need to make special considerations for pretty-printing it.

---

"Assuming you're attempting to have ordered tables"

In this thread, the original post's example used unordered tables. It doesn't matter to this design. Ordered tables and unordered tables can coexist with different ## names.

-----

2 points by aw 78 days ago | link

> Would it be much trouble if I started working toward some of these things for Anarki or Amacx?

My aspiration for Amacx is that it becomes a framework that allows you to create the language you want to create. By analogy, similar to how if you're writing a compiler, and you'd find LLVM useful, you can use LLVM as part of your toolchain to write your compiler.

Thus, if you (or someone) wanted to create a particular reader and printer syntax for tables (whether ##ob and v or something else), then you certainly should be able to do that.

I have both an Arc reader and printer written in Arc, but not yet included in Amacx because currently it's too slow. Working on the reader and printer makes the most sense, I think, after finishing my current work on source location tracking (assuming that works out), both because with a profiler it will be easier to see how to speed up the implementation, and because the reader will need to support source location tracking itself.

There's a lot of "if"s here, but in the happy scenario that everything works out, then hopefully adding ##ob and v (or whatever someone wants) will be easy: just add a few lines of Arc code :-)

-----

3 points by rocketnia 85 days ago | link | parent | on: Ask AF: Ordered Tables?

"I can't believe this never came up working with json"

I think order-preserving tables are pretty useful for JSON -- particularly to help ensure implementation details aren't leaking even in a distributed system where JSON is being passed through multiple parsers and serializers -- but it's worth noting that even JSON objects are specified to be unordered and are commonly treated as such. (RFC 8259: "An object is an unordered collection ...")

Even JavaScript's standard JSON.parse(...) doesn't preserve order the way you might like it to. Here's what I get in Firefox right now:

  > JSON.stringify(JSON.parse("{\"a\": 10, \"1\": 20}"))
  "{\"1\":20,\"a\":10}"
This is because JavaScript objects historically didn't preserve any particular iteration order, and although they have more consistent cross-browser behavior these days, they seem to have settled on some rules like iterating through numeric properties like "1" before other properties like "a". (From what I've found, I see people saying this behavior is a standard as of ES2015, but I don't see it in the ES2018 spec.)

JavaScript Maps are a different story; they're a newer addition to the language, and they're strictly specified to preserve insertion order. Other languages use dictionaries which preserve order, including Groovy (where [a: 1, b: 2] is a LinkedHashMap literal), Ruby, and... you've already found details on Python.

Even in Groovy, JavaScript (Maps), and Ruby, the only way to observe the order of their collections is to iterate from the beginning, and the only way to influence it is to insert entries at the end. This means any substantial interactions with the order of these collections will be just about as inefficient and inconvenient as if we had converted the whole collection to an association list, processed it that way, and converted it back. Essentially, it seems the order is preserved only to prevent programs from having language-implementation-dependent bugs, and maybe to help with recognizing values in debug output, not because it's a useful for programming.

Every one of these languages makes it easy to communicate the intent of an ordered collection by using lists or arrays of some sort. I'm pretty sure I've seen this lead to association lists even in JSON, particularly for things like database query results:

  [
    {"id":..., "name": ..., "occupation": ...},
    {"id":..., "name": ..., "occupation": ...}
  ]
When the order matters, I think this is the kind of approach to prefer, even if it takes a few more abstractions to make key lookup convenient.

-----

1 point by kinnard 85 days ago | link

What do you think of this: http://arclanguage.org/item?id=21066

-----

2 points by rocketnia 85 days ago | link

It could be me, but I have trouble imagining any situation where I would need this syntax.

- If I'm not doing both kinds of lookup several times in the same part of the code, the lookup styles don't both need to be concise at the same time. Different parts of the code can convert to different representations that are concise to use in that context.

- If it's the root layer of a data structure, then I can simply use two different local variables to refer to that data, one of which always treats it as a list and one of which always treats it as a table.

- Unless I'm processing some specific indices, I'll usually want to iterate through the whole data structure, so I usually wouldn't be using any indexing operations in the first place.

- If I know what specific indexes I want at development time, then I would usually use an unordered table (for easy indexing) or a fixed-length list (for easy destrucuring).

- If I'm performing any two lookups for the same reason, they can usually use the same expression in the code.

So I would have to be writing some code where I'm performing several different lookups of specific ordered indexes and specific chosen keyed indexes, all of which are for distinct reasons I know at development time but few of which are under indexes I know at development time. Moreover, each lookup must be two or more layers deep into a nested data structure.

This seems to me like a very specific situation. I suppose maybe it might come up more often in data-driven applications that have many scripted extensions which are specialized to certain configurations of data, but it seems to me even those would rarely care about specific ordered indexes.

Even what you've described about your application makes it sound like you only need to look things up by ordered indexes when the user wants to move an item from one stack to the next (or to the first). If that one part of your code is verbose, I recommend not worrying about it. If the rest of your code is verbose, how about converting your alists to tables whenever you enter that code?

-----

3 points by rocketnia 85 days ago | link

I'm sorry, you have a very clear idea in mind of the data structure you want, and it's very reasonable to want to know how to build it in a way that gives you convenient syntax for it.

I think I'm just trying to encourage you to get to know other techniques because I think that's the easiest way to start. Building a custom data structure in Arc is not something many people go to the trouble to do, so it's a bit hard to come up with a good example.

I don't think you need any new ssyntaxes for this new data structure.

Say you have an ordered dictionary value `db` and you want convenient syntaxes to look up items by index (where 0 should give you the first item) and by key (where 0 should give you the item with key 0). One thing you might be able to do is this:

  db!by-key.0
  db!by-pos.0
To get these lookups working, you only need to use Anarki's `defcall`. Here's a stub that might get you started:

  (defcall ordered-dict (self val)
    (case val
      by-key (fn (key) ...)
      by-pos (fn (pos) ...)
      (err "Expected by-key or by-pos")))
Unfortunately, this doesn't allow you to assign:

  (= db!by-key.0 "val")
  (= db!by-pos.0 "val")
To get these to work, you must know a little bit about how Arc implements assigment. The above two calls are essentially turn into this:

  (sref db!by-key "val" 0)
  (sref db!by-pos "val" 0)
This means you need to extend `sref` using Anarki's `defextend` to make these work.

But we've defined db!by-key and db!by-pos to be procedures that only know how to get the value, not how to set it. We need them to return something that knows how to do both.

So let's define another type, 'getset, which contains a getter function and a setter function.

  (def make-getset (get set)
    (annotate 'getset (list get set)))
  
  (def getset-get (gs . args)
    (apply rep.gs.0 args))
  
  (def getset-set (gs val . args)
    (apply rep.gs.1 val args))
We need it to be callable and settable:

  (defcall getset (self . args)
    (apply getset-get self args))

  (defextend sref (self val . args) (isa self 'getset)
    (apply getset-set self val args)))
Now we can refactor our (defcall ordered-dict ...) definition to put in implementations for setting:

  (defcall ordered-dict (self val)
    (case val
      
      by-key
      (make-getset
        (fn (key) ...)
        (fn (key val) ...))
      
      by-pos
      (make-getset
        (fn (pos) ...)
        (fn (pos val) ...))
      
      (err "Expected by-key or by-pos")))
You should pick some representation to use (annotate 'ordered-dict ...) with (or whatever other name you like for this type) so you can implement all four of these methods.

Note that we don't need to do (defextend sref ...) for ordered-dict values here because the `sref` calls don't ever get passed the table directly. You would only need that if you wanted (= db!by-key "val") or (= db.0 "val") to do something, but those look like buggy code to me.

So far, so good, but you probably want to do more with ordered dictionaries than getting and setting.

There's at least one more utility you might like to `defextend` in Anarki for use with ordered dictionaries: `walk`. A few Anarki utilities like `each` internally call `walk` to iterate over collections, so if you extend `walk`, you can get several Anarki utilities to work with ordered dictionaries all at once.

You may also want to make various other utilities for coercing between ordered dictionatires and other Anarki values, such as tables and alists. That way you can easily use Anarki's existing table and alist utilities when they work for your situation.

Note that this approach doesn't make it possible to use the {...} curly brace syntax you and shawn have been talking about. Since there are potentially many kinds of tables, I'm not sure whether giving them all read syntaxes really makes sense; we might start getting into some obscure punctuation. :-p

I hope that's a little bit more helpful advice for your situation. I tried to write up something like this the other day and got carried away implementing it all myself, but then I realized it might not be what you really wanted to use here. It didn't even take an approach as convenient to use as this one. Maybe this can give you enough of a starting point to reach a point you're happy with.

-----

3 points by i4cu 84 days ago | link

What about using '#' for by-pos and '?' for by-key via the reader?

  db#0
  db?0

-----


Isn't Common Lisp a language with a package system and unhygienic macros?

Common Lisp's approach is that the way a symbol is read incorporates information about the current namespace. That way usually all symbols, even quoted ones, can only have collisions if they have collisions within the same file, and this makes hygiene problems easier to debug on a per-file basis.

I don't think it's my favorite approach, but it could very well be a viable approach for Arc. I was using an approach somewhat like this in Lathe's namespace system, although instead of qualifying symbols at read time, I was qualifying each of them individually as needed, using Arc macros.

-----

2 points by rocketnia 85 days ago | link

Oh right, Clojure has unhygienic macros too. Clojure symbols also have namespaces as in Common Lisp.

-----

3 points by krapp 85 days ago | link

I guess the problem isn't the unhygienic macros, per se, but unhygienic macros and lack of namespaces.

Do you think it would be possible to get ns.arc to work with macros and feasible to add it to the core language?

-----

2 points by rocketnia 85 days ago | link

Good question, but ns.arc manipulates what Racket calls namespaces, which are data structures that carry certain state and variable bindings we might usually think of as "global," particularly the definitions of top-level variables.

What Common Lisp and Clojure call namespaces are like prefixes that get prepended to every symbol in a file, changing them from unqualified names into qualified names.

I think namespaces are a fine approach for Arc. If Anarki's going to have both, it's probably best to rename Anarki's interactions with Racket namespaces (like in ns.arc) so they're called "environments" or something, to reduce confusion. I think they will essentially fit the role of what Common Lisp calls environments.

Of course, people doing Racket interop will still need to know they're called namespaces on the Racket side. Is there another name we can use for Common Lisp style namespaces? "Qualifications" seems like it could work.

-----

3 points by i4cu 85 days ago | link

> Is there another name we can use for Common Lisp style namespaces?

realms ?

going for brevity here :)

-----


Some pretty good quotes from this thread: https://www.reddit.com/r/programming/comments/67gu9/take_the...

---

paulgraham: "Really? You've been mad at me for years for writing a new Lisp dialect? But new dialects are so common in the history of Lisp. I've probably used 20 in my life. And why be so attached to CL specifically?

In the old days, Lisp hackers always used multiple dialects, and basically tried to program as close to the platonic form of Lisp as they could modulo the flaws of whatever one they happened to be using. Don't things work that way now? Are there lots of people who are attached to CL specifically rather than Lisp generally?"

---

demoss: "What is annoying is that for 6 years now you have been building a following of people who go "Lisp is theoretically nice, but all the existing ones are SO full of onions! I'm going to wait for Arc to come out before I learn Lisp!""

---

death: "The cardinal rule of Lisp: don't reinvent, integrate."

paulgraham: "I don't know where you picked this up, but it seems the very opposite of the Lisp spirit to me. E.g. Steele and Sussman. Are you sure you didn't mean the cardinal rule of Java or something?"

-----

2 points by i4cu 85 days ago | link

Wow. I hadn't read that post before. And here I was thinking I'm too critical sometimes.

-----

More