I've always implicitly assumed that tables were just an implementation detail for templates, and assumed they were like objects. I think that might be why it never occurred to me to apply `len` on a template, and why a large program like news.arc never ran into this gap.
I have no objection to your approach of making them more like tables. I also have no objection to updating the documentation to stop referring to them as tables. One of those options might be less work than the other :)
> I've always implicitly assumed that tables were just an implementation detail for templates, and assumed they were like objects. I think that might be why it never occurred to me to apply `len` on a template, and why a large program like news.arc never ran into this gap.
Ah, interesting! Here's some background that might help understand why having them act like tables might help. `len` wasn't the first function that I found didn't work with templates; it was just the simplest. I was working on adding teardown functionality to unit-test.arc, and I wanted to look at some of the suites that were created, to see if I was adding tests properly. As the templates end up pretty big (one suite with two tests is about 25 lines), and it was late at night, I figured I'd make it simple on myself, and just look at the keys of the template.
This had nothing to do with the desired "production" code, but only with REPL-hacking introspection.
I want to make them reasonably easy to REPL-hack with; whether they're actually tables or not I don't particularly care right now. The most important table functions are probably len, keys, vals, and maybe maptable/each.
My answer to that question has always been, "it depends." The anti-encapsulation ethos (homoiconicity, using lists where other languages may use objects, the entire compiler fitting in one file and being accessible front-and-center, etc.) means that there's always the ability to peel back another layer of the onion when it becomes relevant.
I think it's a documentation issue. I think I had to search the forums to find out about it when I was playing with JSON interop. Nowhere on the actual template page in the Arc documentation does it tell you this is a thing.
Some things I've only been able to figure out by studying the compiler or arc source code itself. Granted, that's illuminating, but it's also sometimes really annoying.
Templates should just enforce a signature for table fields, but otherwise decompose to tables. I think the issue is that the tables generated by (inst) from a template are annotated as type 'tem when that should (could?) be done with a tag that doesn't actually change the type. Since the template name is passed to the function, you could just build the table with default values without annotating it. You could also delete most of the template functions that just serve as template versions of table functions.
edit: I was also assuming templates type-hinted field values but I don't think they do. Maybe they should?
Also, does anyone else find it a bit odd that a language feature like this is in a lib file rather than being part of the core language?
I disagree with your vision for templates. If you just want something that behaves like tables, why not just use tables? A helper that fills in default values would be pretty easy to write.
Think about the use case of news.arc. There's a list of 'objects' that need to be serialized to disk and unserialized from disk. What should happen if you change the default for a template in code? Should the default update transparently for existing objects? If so, you need some way to distinguish tables that were generated by templates. Which implies something that manages them throughout their life cycle.
>Small languages take it as a mark of pride to move as much as possible into libraries :)
Yeah, I've seen projects that show off the power of a language by doing "x in < 100 lines" that just don't count a remote API call with half a million LOC running on a server or something ;) But with language features like macros and templates that have become ubiquitous, I feel like it's kind of cheating not to just fold them into arc proper.
But that's just me... one thing I've learned being here is that I seem to flow against the culture more than with it, so I can just agree to disagree.
> If you just want something that behaves like tables, why not just use tables?
They are tables, that's what's frustrating. They're tables with metadata. From what I can tell reading earler posts about templates, they used to be something that behaves like tables. Interop between forum data and Racket (and any code where tables are expected) is awkward because that incompatibility has to be worked around, resulting in extra code and extra complexity. Templates need a separate API despite having the same behavior as tables.
>If so, you need some way to distinguish tables that were generated by templates. Which implies something that manages them throughout their life cycle.
Fair enough. But why is it necessary to change their type to do so? Why not make this a feature of tables as a whole, if it's useful? Or tag tables in a way that doesn't change their type, if that's possible?
I understand that tradeoffs have to be made and I'm not trying to be cantankerous or dismiss the value of anyone's work, and no, I couldn't do better myself (yet), which is why I'm commenting on it it rather than making a PR. I'm just wondering if this is the best possible implementation of the concept, given how often I and other people seem to run into issues with it.
Don't worry about sounding dismissive, I totally understand where the questions are coming from.
Tables and objects feel like separate concepts, and they have complementary strengths and weaknesses, and one doesn't subsume the other. To me it seems obvious that if we want to have both, we need them to have different types.
For example, sometimes you want the 'dynamic' ability to set arbitrary keys of metadata on a thing. Sometimes you want the same operation to be an error, by providing a schema. How would a single type do both? No language does so, to my knowledge.
Things should have the same type when they have compatible behavior. When they are incompatible, they shouldn't.
Supporting helpers like len and keys may well still make sense. And as the original story did, this is easy to do.
But in general, having incompatible types easily share functions without sharing too much is an open problem: https://en.wikipedia.org/wiki/Expression_problem A language can easily add a method to many types, or add a new type to many methods. But we don't yet know how to achieve both sides.
And honestly, I think the expression problem isn't important. It doesn't take too much code per method/type. And making it easier just encourages large, bloated codebases.
> ...given how often I and other people seem to run into issues with it.
One thing that might be useful here is a list of issues people have encountered with templates. Maybe we should create a wiki page on GitHub and add to it every time an issue comes up. Then we can have a big-picture view of them and a sense of how many are things people need to learn about Arc, and how many are bugs to be fixed.
I believe Anarki behaves exactly the same as Arc's intent when it comes to templates. The changes that I made here seemed strictly superior to the buggy implementation upstream. But if you disagree you should absolutely feel free to just revert the commits and go back to Arc behavior. I don't use Arc anymore, so my opinions are extremely weakly held, you don't have to bother persuading me. Or, if you have some other specific issue in mind, I'd be happy to be persuaded that I'm wrong.
"But in general, having incompatible types easily share functions without sharing too much is an open problem: https://en.wikipedia.org/wiki/Expression_problem A language can easily add a method to many types, or add a new type to many methods. But we don't yet know how to achieve both sides."
I'm trying to follow, but I think you and I must have different understandings of the expression problem. That article lists several known solutions to the expression problem. The solution Anarki uses is `defextend`.
What do you mean by "sharing too much"?
Is Anarki's `defextend` technique already encouraging a bloated codebase, or is there some other technique you're thinking of that would do that?
Yeah, I suppose you could say the problem is 'solved'. I think of it as a trade-off with costs. We don't know how to achieve zero cost.
For example, I absolutely agree with you that 2 lines per method to extend every table method to some new type constitutes a solution for us. But if we had a thousand such types and a thousand such methods, it may seem like less of a solution. But then `defextend` would be the victim rather than cause of bloat.
Ah, you're imagining us having to write and maintain 1000×1000 individual `defextend` forms someday? Yeah, that does seem like a problem that would not feel solved once we got to it. :-p
I don't think that aspect of the expression problem is solvable in a language design. Instead, it's an ongoing conversation in the community. Sometimes the intent of one feature and the intent of another feature interact, leading people to do a nonzero amount of work to figure out the intent of the two features put together. That work is an essential part of what the community is trying to accomplish together, so it's a cost that can't be eliminated. The intent has to be reflected in the code somewhere, so there will be a nonzero amount of code that serves feature-coordinating purposes.
Regardless, I'm optimistic that although the amount of code will be nonzero, it'll still have a manageable size. To the extent we have any kind of consistency around these feature interaction decisions, those consistent principles can develop into abstractions. The only way we'll have 1000×1000 individual intersections to maintain is if we as a community culture are already maintaining 1,000,000 compelling and distinct justifications for them. :)
I haven't read any more than a few papers on it, and maybe only one of those in depth (which I'll mention below). Mostly I'm going by forum threads, wiki articles, and the design choices certain languages make (like Inform's multimethods and Haskell's type classes).
As far as I understand the history, Philip Wadler's work basically defined the strict parameters of the expression problem and explored solutions for it. Separate compilation and the avoidance of dynamic casts were big deals for Wadler for some reason.
That work was focused on Java, where it's easy to define new classes that implement existing interfaces but impossible to implement new interfaces on existing classes.
The solution I'm most familiar with for Java-style languages is the use of object algebras, as described in Oliveira and Cook's "Extensibility for the Masses: Practical Extensibility with Object Algebras" (https://www.cs.utexas.edu/~wcook/Drafts/2012/ecoop2012.pdf). In this approach, when you extend the system with a new type, you define a new interface with a generic type parameter and a factory method for building that type, and you have that interface inherit all the existing factory methods. So you don't have to solve the unsolvable task of implementing a new interface for an existing class, because you're representing your types as type parameters and methods, not simply as classes.
So I think the main subject of research was how best to represent an extensible program's types and functions in a language like Java where the most obvious choices weren't expressive enough. I think it's more of a "how do we allow extensions to be made at all" problem than a "how do we make all the extensions maintainable" problem.
But then, I've really barely scratched the surface of the research, so I could easily be missing stuff like that.
> ... with language features like macros and templates that have become ubiquitous, I feel like it's kind of cheating not to just fold them into arc proper.
It's totally fine to move something into arc.arc if you want to do that. It's always felt like a non-existent distinction in my mind whether something is under arc.arc or libs/. Is Anarki all language or all standard library? Depends on how you look at it. Why does it matter?
> But that's just me... one thing I've learned being here is that I seem to flow against the culture more than with it, so I can just agree to disagree.
This doesn't feel like a disagreement, more like a language barrier. If I understood better I might know whether I agree or not.
arc> (withs nil 3)
Can't take car of nil
This is a minimal example from something I found in unit-test.arc. It's some macros related to setup code -- if there's no setup, I currently generate something like `(withs nil 3)`. But that errors in Anarki.
At least to me, this is expected. The commit you pointed out above switched the null value to '(). The symbol `nil` still evaluates to (). But the need for evaluation implies that it isn't available in contexts that are not evaluated, such as function arguments or in this particular slot of `withs`.
Like I said, happy to revert it if you don't like it. The whole thing came up because of this conversation: https://github.com/arclanguage/anarki/pull/145#issuecomment-.... The motivation was to simplify the Arc implementation. We already have a nil representation in the underlying Racket; it seems unnecessary to so bend over backwards to switch it to something else.
Yeah, that seems better. I'm still tracking down two test failures, but they're not because of this. I think templates now are of type 'tem, not type 'table.
I tried to make some changes to () instead of nil, and I was not a big fan of how it looked. I found it very unusual that unless quoted, parentheses mean function application. Letting () be the way to write the empty list (and I believe it worked differently quoted from unquoted, but I'm not sure offhand) completely breaks my mental model of how Lisps are parsed.
It's failing because `(type (inst 'foo))` is different in Anarki than Arc. It's a simple change to make it work; I just want to do two things before I stop looking at it:
1. Look deeper into the template inconsistencies. Thanks for the files about this in Anarki.
2. Decide if I want to cut support for Arc, or make this code work in both. This might just involve killing the test, as it's not the _most_ useful test.
Ouch, have the tests for unit-test.arc been failing for the past year? :( :( Very sorry about that. I see the failure now.
I somehow forgot that unit-test.arc has its own tests. Could you post the instructions for running the tests in the Readme? That would also have the salubrious side effect of showing people a way to run a bunch of existing tests.
Once that last one is passing (or maybe even before it's passing), should the top-level tests.arc run these tests too? That way this can be caught not only by Travis CI, but also by people running tests.arc according to the readme.
Anarki isn't really intended to avoid or minimize breaking changes. The unit tests verify only that everything is internally consistent. That boundary around 'internal' should include unit-test.arc, I think.
I _think_ it's some weirdness with the nil/empty list thing. I was getting a case where (str x) resulted in the string "nil", but whatever that object was was not treated as nil, for example in conditionals.
Interesting concept. I feel like I'd be ok with it? I want to say we should bind `nil` to `'()`, so existing code would continue to work, but I might be overindexing on compatability and what I'm used to.
I will admit to not being super sure what the real differences between nil and '() are. Presumably it's more than "what is the human-readable representation of the value that terminates a list/is the false value". But I'm not sure what. Also, is there a difference between the quoted and unquoted version? It feels odd to write () in a repl unquoted -- usually, I expect parens to mean a function or macro call.
It looks like the HTML specification defines this as a "non-void-html-element-start-tag-with-trailing-solidus parse error." The spec says that in this case, "The parser behaves as if the U+002F (/) is not present," but also that "[browsers] may abort the parser at the first parse error that they encounter for which they do not wish to apply the rules described in this specification."
I don't know of any browsers that abort the parsing altogether, so it's still reliable to write the HTML that way.
However, the similarity to XML is actively misleading in this case. When you process that document as HTML, you still get structure like this:
So if you're trying to write a polyglot HTML/XML document, self-closing <p /> tags still probably aren't a great option. Closing the paragraphs explicitly, like so, makes it clearer how the structure will end up:
I think modern HTML does have a reliable common subset with XML. Modern HTML treats <br></br> and <p /> as parse errors, but it treats <br /> and <p></p> as valid. To write HTML/XML polyglot content, you just need to pay attention to whether you're dealing with a void element like "br" or a non-void element like "p".
Incidentally, why use an HTML/XML polyglot at all? There are at least a few situations where it can make sense:
- You're serving it as HTML, but (at least someday) you might want to use an XML-processing tool on it or serve it as XHTML.
- You're trying to serve it as XHTML, but you're worried you'll mess up your server configuration and serve it as HTML by mistake.
- You're confident you can serve it as XHTML today, but you have a backup plan to serve it as HTML if needed. In particular, you're afraid someday your XHTML will be invalid due to a bug in your code, a bug in a browser, an intentional spec violation in a browser (e.g. for security or user privacy), or a backwards-incompatible change in the spec. The XHTML spec dictates that an invalid page won't be displayed at all, so if you end up with invalid XHTML for any of those reasons, your site will be rather unusable until you can implement a fix. If that happens at a time you're not ready to drop everything and look at the bug in depth, then you can make a pretty quick switch to serving it as HTML, and most of the page will display again.
Because of the brittle handling of errors, XHTML still hasn't really gotten off the ground. So it seems like the primary value of the HTML/XML polyglot is to serve a document as HTML but use XML-processing tools on it behind the scenes.
A side note...
In the very early days of XML and XHTML, when people were trying to make their HTML pages as XML-like as possible, many browsers would interpret something like <br/> as an element with the tag name "br/". That's why people got into the habit of putting in a space like <br />. That way those browsers would instead interpret the / as an attribute named "/", which was mostly harmless. Nowadays, the space is pretty much vestigial and you can just write <br/> if you want to.
I'm using a static site generator I wrote in arc. My workflow is as such:
1. Write my entries in an org file that contains all the entries.
2. Narrow to the subtree (C-x n s), and export to an HTML buffer (C-x C-e h H)
3. Manually copy the relevant part of the overly-large HTML file into a new file (blog-entry-name.html). This is only the content of a page; it does not include any headers, footers, navbar stuff, or the html wrapper around the body.
4. Insert by hand a serialized arc template, containing three keys: a url slug, the title of the page, and the publication date.
5. Update the frontpage of my site to link to the new page. This file is similarly formatted: an arc template, then html content.
6. Update a file that indicates which pages should go in the sidebar of my site.
7. Run an arc command to generate the entire static site.
8. Check it out locally, then rsync the content to my nearlyfreespeech.net server.
Obvious places for improvement are 3, 5, and 6.
I was thinking about this recently. There's something quite fun in writing extremely personal software. This is not a tool that is designed to be used by millions of people, and I'm ok with that. I'm actually quite happy with storing settings for the page and the html content in the same file! It seems like a neat hack to me.
Actually, poking through the markup of the forum in general, I'm still impressed by how simple the site really is. There's something to be said for minimalism like that. Not only does it make the initial development easier, but I imagine it's easier to do mashups and derivative works too.
> There's something to be said for minimalism like that. Not only does it make the initial development easier, but I imagine it's easier to do mashups and derivative works too.
If I may kick off a tangent, this is the part of "Worse is better" that tends to be forgotten/deemphasized in Pitman's formulation. C and Unix succeeded because they focused on keeping the implementation simple and accessible for many years. (They eventually forgot that lesson, of course, and have been coasting on the initial momentum for a very long time.)
Indeed. And Richard actually makes that point, that the "initial virus" has to be good and simple, and that having won it will have much more pressure to improve until it gets to 90% of "good". Unfortunately, in the process it conditions users to accept worse, and the patching process probably doesn't result in a simple end result.
In fact, reading the story about the "PC loser-ing problem", I realized that I was so conditioned by the Unix solution that I had never even _considered_ the former as a possibility. I do sometimes wonder how many amazingly good ideas we've lost, that would now actually be much simpler than the stack we have, but we're just used to it.
I think the concept could be better generalized by rephrasing it as "cheaper is better" though. Technically it's not "worse", it just has a different set of values. Obviously, users value it more, or they wouldn't adopt it.
I see it as closely related to ideas like "compatibility is key", "customer is king", and "money is power", each of which builds on the following.
Customers adopt products that have the best cost-benefit ratio. It doesn't matter if the fancy "good" solution is 10% better (from 90% to 100%) if it also costs 2x as much. Maintenance of the ideal solution may actually be cheaper, but it's really hard to estimate maintenance in advance, especially in design fields like software development.
Once the "cheap" solution is adopted, future adoption and upgrades are even cheaper compared to switching to the "good" solution, because the user is already invested, and has built a network of integrations that would be very hard to replicate.
The network effect and basic epidemiology probably provide good explanations for the rapid victory of "cheap" solutions—they spread faster because they are easier to "get", and that amplifies the infection rate to new nodes. Anyone can understand why to adopt something cheap. It takes a lot of effort to learn and understand the technical advantages of a superior system. Given the work involved in properly evaluating competing options to discover technically superior solutions, I think it's safe to assume that the percentage of potential customers that just pick the cheapest one that works, or that is already adopted by the largest number of other users, will always be higher than those who actually compare all the options to pick a better solution.
So "worse" solutions actually are "better", because they're cheaper to adopt. This is especially visible when you look at history and see how many times the systems focusing on backwards compatibility won out over those that merely tried to be "new." Compatibility reduces the cost of adoption. It's that simple.
Does that mean that we're doomed to a "race to the bottom"? I don't think so. In fact, I think with some care new solutions can be designed that are sufficiently better/faster/cheaper that they do disrupt the existing ecosystem. It happens all the time. We see Facebook beating Myspace, all the various chat programs killing XMPP, Slack starting to eat IRC, etc. Most of those did it by making adoption easier for new users. The secret is that a new system doesn't have to replace the existing system, just be easy to adopt. Lots of people use multiple chat programs at the same time. The Lean Startup book was written by an entrepreneur working on a chat system, who initially thought that to make adoption easy he had to integrate with existing systems. What they learned was that people didn't mind adding it to their list of chat systems, and actually liked the ability to meet new networks of friends.
I've been very intrigued recently by a lot of early internet protocols, like IRC, SMTP, NNTP, etc. which are very clean and simple. So easy to use that you can literally connect to an SMTP server via telnet and send an email by hand with just a few simple text commands. I've seen people mention gopher a few times recently (the core doesn't change very fast, but people like to implement custom clients), and even HTTP is pretty simple. I think there's a lot to be said for simple, text-based protocols, because they're easy to understand and implement something that connects to them. I almost think a good test for how complicated an interface is, is how easy it would be to implement in arc, which has very little library support for most of these things. It turns out to be quite easy to build an IRC bot with arc.
It is interesting to me that arc may not be very widely adopted, but it is probably one of the few programming languages that has almost as many implementations as it has community members. If we made it just a little bit easier to pick up and start using (particularly in production), the community would probably grow a little faster.
I think there's a lot of opportunity now and in the near future for reintroducing simple foundations, perhaps slightly extended, but mostly made more accessible for new users. Our technology stack has gotten so tall and complicated in the name of shortcuts and simplicity, that a lot of efficiency can be gained by cutting out a few layers. Once people start targeting certain abstraction boundaries, like WASM + WASI, it should be pretty easy to replace everything under that boundary with a much simpler system. A lot of the disadvantages of "good" systems, like microkernels vs monolithic ones, are now so completely outweighed by the rest of the environment that it should be pretty straightforward to build an OS with much better security much closer to the metal than what we have now with 2+ layers of VM sandboxing.
But you should elaborate on your last 2 paragraphs. I'm not sure I buy either that Arc adoption can pick up or that the mainstream tech stack will ever cut out layers.
My synthesis of "Worse is better" for myself (with Mu and SubX):
a) I don't think of evolution as "bad". Building something incompatible is indeed maladaptive. I'm clear-eyed about that.
b) Mu doesn't try to come up with the perfect architecture that doesn't need to evolve. Instead it tries to identify and eliminate every source of friction for future rewrites.
c) My goal isn't to go mainstream. I'd be happy to just have some minor Arc-level adoption. I think it's better to have a small number of people who actually understand the goal (an implementation that's friendly to outsiders) than to have a lot of adoption that causes Mu to forget its roots. My real goal is to build something that outlasts the mainstream stack (the way mammals outlasted the dinosaurs). That doesn't feel as difficult. It's clear the mainstream has a lot of baggage bogging it down. It'll eventually run out of steam. But probably not in my lifetime.
Anyways, I hope in a year or so to give Mu an Arc-like high-level language. It won't improve Arc's adoption, but hopefully it will help promulgate the spirit of this forum: to keep the implementation transparent, and to be friendly to newcomers without burning ourselves out.
>I install packages (although I think the only ones I use are ones I've created) by downloading the file, then calling `(load "/path/to/file")`.
To be fair, though... when most people say they want to "browse and install packages" for a language, they don't mean including local files through a REPL. Although that is the best we can do in Arc for now.