Just totally brainstorming here... don't know if this is a good idea or not... but
string:intersperse
returns the result as a string, so perhaps
ascons:intersperse
could return the result as a list.
And maybe like setforms with =, "ascons" could be a macro that works on the name passed to it, so it could say "oh look, I can expand into a call to the primitive intersperse that returns a list directly, instead of calling the general one that returns a string and then converting it to a list".
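Something like this, maybe, purely as a sketch (intersperse-list here is a made-up name for the hypothetical list-returning primitive):

(mac ascons (expr)
  (if (caris expr 'intersperse)
      ; recognize the call and use the hypothetical list-returning primitive
      `(intersperse-list ,@(cdr expr))
      ; otherwise do it the general way and convert the result
      `(coerce ,expr 'cons)))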
Hmm... so you're thinking of renaming testify to fun or asfn? I don't know if testify is general enough to earn one of those names. Aren't there other common utilities that turn expressions into functions, like thunk?
Have you considered testifn?
Update: Here are some other naming possibilities you made me think of, though I'm not sure if/where they fit in:
- fnify
- fnize
- fnk (sounds like "thunk")
To be sure, I do like your asfn, I just don't think testify is the right fit for it.
Aren't there other common utilities that turn expressions into functions, like thunk?
Well, thunk creates a function out of code, but I'm not aware of any other function that turns data into functions.
In wart testify is now generic. If it sees a function it returns it, if it sees a list it checks for membership, and if it sees anything else it falls back to comparing.
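Roughly this shape, if it were written with a plain type dispatch instead of wart's generic machinery:

(def testify (x)
  (case (type x)
    fn   x            ; functions pass through
    cons [mem _ x]    ; lists become membership tests
         [iso x _]))  ; everything else compares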
In Penknife I've been mulling over 'testify a lot, trying to figure out whether there's a way to stop special-casing the 'fn type or to make it an everyday coercion function by introducing a 'test type or renaming it to 'fnify. I think I have the start of some answers....
Recently I posted an example utility, 'copdate, that acted exactly like 'copy but called functions on the existing values to get the replacement ones. http://arclanguage.org/item?id=13652
It would be cool if 'copy itself did that. Essentially we'd check the replacement value, and if it was a function, we'd treat it as a conversion function, and otherwise we'd just use it directly. This check could be done by a satellite utility:
(def conversionify (x)
  (case type.x fn x [do x]))
This is just as arbitrary about the 'fn type as 'testify is, and it has just as much right (IMO) to be called 'fnify. But now we have two cousins, so we can extract their common warts into another function:
(def abehavior (x)
  (isa x 'fn))

(def testify (x)
  (check x abehavior [iso x _])) ; not using 'is here!

(def conversionify (x)
  (check x abehavior [do x]))
This 'abehavior utility is naturally extensible. Its default behavior can be nil, and each of its extensions can return t under new conditions. The question "Is this type of value supposed to be idiomatic for specifying custom behaviors for everyday Arc utilities?" isn't hard to answer. A behavior will need to have a 'defcall implementation anyway, so that's a good way to decide (but not perfect, 'cause indexable data structures use 'defcall too).
Also, 'testify and 'conversionify have a kinda nice common semantics: Each of them coerces to the behavior "type" based on its own idea of what a normal behavior is.
Is there a way to expose a design flaw in between 'abehavior and the places it's used? What if there's a type of value that's especially good at encoding tests or conversions, but not idiomatic for specifying callbacks? In fact, you've already mentioned one:
if it sees a list it checks for membership
I totally agree. I really like how Groovy's switch statements have that behavior. XD However, what's to say a list is an idiomatic way to give a list of options but a string isn't? Well, maybe I'd go with another function:
However nice (or paranoid ^^ ) this setup might be, what happens if you're looking for nil? Should nil be interpreted as an unsatisfiable condition? Well, that's probably not too bad; it's up to the utility designer to say what nil does, and we're in that realm at the moment. People will just have to say (some [iso x _] foo) instead of (some x foo) if x could be nil, even if they don't expect x to be any other kind of container or behavior.
Hopefully this rant has helped you nearly as much as it's helped me. :-p
Hmm, perhaps asfn is a bad idea because it seems too similar to (as fn ..) which has different semantics. For lists and hashes it takes an arg to index with.
Perhaps I need coercions to fn to be sensitive to the number of args. With 0 args, behave like testify, with 1 arg, index, with 2 args.. who knows?
Perhaps I need coercions to fn to be sensitive to the number of args. With 0 args, behave like testify, with 1 arg, index, with 2 args.. who knows?
I dunno.... The coercion itself would require just as much extra information (none) for each kind of coercion. (This is a great example of the one-conversion-per-type silliness that makes me give up on 'coerce in the first place.) Also, a test is a one-argument function, just like a list or table is, so the number of arguments doesn't distinguish anything there.
Hmm, I don't mind one conversion per type combination. Primitive functions need to make some assumptions, and yes you could provide coercion functions to print, some, etc., but that seems verbose almost all the time.
The key to me is to make coercions extensible. Then if I ever need a coercion to behave differently I can just wrap one of the types and write a new coercion.
Lisp historically suffers from the problem of having too many variants of basic stuff (http://arclanguage.org/item?id=12803). But the problem isn't that it's impossible to find the one true semantic for equality or coercion, it's that the language designers didn't put their foot down and pick one. Don't give me two different names with subtle variations, give me one strong name.
This is related to (in tension with, but not contradictory to) extensibility. Common lisp gives us three variations of equality. Different methods for string manipulation react differently to symbols: coercing symbols to strings works but (string 'a) dies. And neither equality nor coercion nor any primitives are extensible. This is exactly wrong-headed: give me one equality, make the string primitives all handle symbols uniformly -- and give me the empowering mechanisms to change the behavior when I need to.
The key to me is to make coercions extensible. Then if I ever need a coercion to behave differently I can just wrap one of the types and write a new coercion.
You could manually type-wrap your arguments when calling basic utilities, but that seems verbose almost all the time. ^_- If you're talking about putting the wrapper on the target type, like (rep:as special-coercion-wrapper x), that's still more verbose than special-coercion.x.
Something I've said for a while is that individual coercion functions like 'int and 'testify should be the norm. For every type 'coerce handles, there could be a global function that handled that type instead. The only thing this strategy seems to overlook in practice is the ability to convert a value back to its original type, and even that coercion-and-back can be less leaky as a single function (like the one I mention at http://arclanguage.org/item?id=13584).
---
Don't give me two different names with subtle variations, give me one strong name.
The difference between 'testify and 'conversionify (http://arclanguage.org/item?id=13678) isn't subtle, and neither is the difference between 'pr and 'write (or the more proper coercions [tostring:pr _] and [tostring:write _]). The purpose of a coercion isn't entirely explained by "I want something of type X." In these cases, it's "I want a Y of type X," and the Y is what's different.
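To make the contrast concrete:

(tostring:pr    "x")  ; => "x"        just the characters
(tostring:write "x")  ; => "\"x\""    a readable external representation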
Perhaps you can make a 'test type to turn 'testify into a coercion and an 'external-representation type so that 'write can be implemented in terms of 'pr. Maybe you can even replace the 'cons type with a 'list type to avoid the awkward (coerce "" 'cons) semantics, with (type nil) returning 'list. At that point I'll have fewer specific complaints. :)
On another note, I suspect in a land of polymorphism through extensibility, it makes more sense to test the type of something using an extensible predicate. If that's common, it'll probably make more sense to coerce to one of those predicates than to any specific type. Maybe 'coerce really should work like that? It could satisfy the contract that (as a b) always either errors out or returns something that satisfies the predicate a. This isn't to say I believe in this approach, but I hope it's food for thought for you. ^_^
---
I got rid of is from wart a few weeks ago.
Well, did you undefine 'eq too? Without a test for reference equality, there'll probably be gotchas when it comes to things like cyclic data structures, uninterned symbols, and weak references.
Me: "Don't give me two different names with subtle variations,
give me one strong name."
You: "The difference between 'testify and 'conversionify isn't subtle.."
Yeah I wasn't talking about your previous suggestions. I was talking about eq vs eql vs equal, or about string vs coerce 'string. I was talking about subtle variations in names.
Yes I still use eq. It's useful in some cases, no question, but not so often that it merits thinking about what to call it. If I were creating a runtime from scratch, I'd give it a long name, like pointer-equal or something. If I find a better use for the name I'll override it without hesitation.
Names that take up prime real estate are like defaults. Having a bunch of similar-sounding, equally memorable words that do almost the same thing is akin to having large, noisy menus of preferences. They make the language less ergonomic, they make it harder to fit into people's heads. "What does upcase do to a list again? Does it use this coerce or that one?"
If the default doesn't work for someone of course they'll make their own up. This is lisp. They're empowered. I don't need to consider that case. My job as a language designer is to be opinionated and provide strong defaults.
I'm only arguing the general case for coercions. It's possible that we need multiple kinds of coercions in specific cases, and that these need to have prime real estate as well.
If I were creating a runtime from scratch, I'd give [is] a long name, like pointer-equal or something.
Yeah, I think it's a bit unfortunate that 'iso has a longer name than 'is. I've thought about calling them '== and '=== instead, or even '== and 'is for maximum brevity, but those are more confusing in the sense you're talking about, in that their names would be interchangeable. I can't think of a good, short name for 'iso that couldn't be mistaken for the strictest kind of equality available. "Isomorphic," "equivalent," "similar," and maybe even "congruent" could work, but "iso" is about as far as they can be abbreviated.
...Well, maybe "qv" would work. XD It has no vowels though. Thinking about Inform 7 and rkts-lang makes me strive for names that are at least pronounceable; I remember Emily Short mentioning on her blog about how a near-English language is easier for English typers to type even when it's verbose, and rkts posting here about how it's nice to be able to speak about code verbally. I think it's come up a few times in discussions about the name 'cdr too. ...And I'm digressing so much. XD;;;;
---
I'm only arguing the general case for coercions.
This is the one spot I don't know I agree with, but only because I don't know what you mean. What's this "general case" again?
id could also mean the mathematical identity function, which arc calls idfn probably because it's a lisp-1 and we want to be able to create locals called id.
Hmm, how about if id was a low-level beast that converted any object into a say ptr type that contained its address. Now instead of (is a b) you'd say:
(iso id.a id.b)
That seems to me about the right level of verbosity for doing the things is does that iso can't do.
In Java, everything uses equals() where it can, but then it's not easy to get == behavior when it matters. Technically you can use a wrapper:
public final class Id
{
    private Object x;
    public Id( Object x ) { this.x = x; }
    public int hashCode() { return System.identityHashCode( x ); }
    public boolean equals( Object obj )
        { return obj instanceof Id && ((Id)obj).x == x; }
}
I'm digressing, but it would be so much nicer if every object kept a hard reference to its Id wrapper. That way the Id could be used as a WeakHashMap key.[1] Weak tables are one place where comparing keys for anything but reference identity makes no sense. XD
Back in terms of (iso id.a id.b), this would have the observable effect that (iso id.id.a id.id.a) would be true for all a.
[1] Thanks to the hard reference to the Id from its x, the Id itself wouldn't be collected until its x was unreachable too. An Id without such an anchor would potentially be garbage-collected much earlier than its x, and the WeakHashMap entry would be lost. ...Anyway, some of this could be solved by changing the design of WeakHashMap. :)
I think all we need is that it be a stable value when literal objects are interned. So id.4 would probably never change, and (id "a") would change in common lisp because it creates new strings each time.
The name "is" is just so perfect though. What we're talking about is the difference between "the same regardless of who and why you ask" and "the same if you ask the type designer." When I see something like "is" that plainly communicates sameness, I assume it's the no-matter-what version. On the other hand, saying things are "similar" doesn't entail they're identical; they might only be identical enough.
I do like "alike." It's short, and it suggests it has something to say about non-identical arguments. It's better than "like," because (like x y) could be interpreted as "x likes y" or "cause x to like y."
To stretch my analogy to breaking point, that's like saying I wish we could build a hundred yards into the ocean, it would make such fine waterfront property :) Some things are 'prime' but not real estate.
Ok that is probably not clear. I think I'd use is or iso, but not both.[1] iso is perfect because it is short and precise and conjures up the right image. I'll probably never find a use for is. That's ok. Good names minimize cognitive overhead, but there's still some overhead. The best name is the one you don't need, and there's enough good names that we don't have to microoptimize.
---
I'm not sure which variant is "the same regardless of who and why you ask". In common lisp there's 30 years of rationalizations why this is 'intuitive':
* (eq "a" "a")
nil
Yet arc chose otherwise:
arc> (is "a" "a")
t
So clearly it's not "the same regardless of who you ask".
If you created syntax for hash-table literals you may want this to be true:
(is {a: 0} {a: 0})
So the semantics of is are actually quite fungible. If pointer equality is a low level beast that requires knowing internal details let's not put it on a pedestal and give it a prime name.
(Again I'm less certain than I seem.)
[1] I think isa is fine though it's so similar in form because it's different enough from iso in behavior.
I'm not sure which variant is "the same regardless of who and why you ask".
By "the same regardless of who and why you ask," I mean something at least as picky as "the same regardless of what algorithm you try to use to distinguish them." Notice that the pickiest equality operator in a language is the only one that can satisfy that description; any pickier one would be an algorithm (well, a black-box algorithm) that acted as a counterexample.
I think 'eqv? is this operator in standard Scheme and 'eq? is this operator in any given Scheme implementation (but I dunno, maybe it's more hackish than that in practice). Meanwhile, I think 'is acts as this operator in a "standard" subset of Arc in which strings are considered immutable. (A destructive algorithm can distinguish nonempty mutable strings.) I avoid string mutation in Arc for exactly this reason.
---
So the semantics of is are actually quite fungible.
Yeah, especially once you get down to a certain point where nothing but 'is itself would allow you to compare things, then 'is is free to be however picky it wants to be. A minimally picky 'is would solve the halting problem when applied to closures, so something has to give. :)
The least arbitrary choice is to leave out 'is altogether for certain types, like Haskell does, but I dunno, for some reason I'm willing to sacrifice things like string mutation to keep 'is around. I mean, I think it's for the sake of things like cyclic data structures, uninterned symbols, and weak references, but I don't think those things need every type to have 'is.... Interesting. I might be changing my mind on this. ^_^
In general coerce is more useful than a bunch of specific coercion functions (say of the form srctype->desttype). Specific cases like testify may need multiple coercion functions, though.
"Perhaps you can make a 'test type to turn 'testify into a coercion and an 'external-representation type so that 'write can be implemented in terms of 'pr. Maybe you can even replace the 'cons type with a 'list type to avoid the awkward (coerce "" 'cons) semantics, with (type nil) returning 'list."
Yeah, perhaps I've been too conservative in designing wart, and I shouldn't be so afraid of proliferating types. Hmm, perhaps if there was a way to automatically coerce types as needed.. Say I wrote a function that can only handle lists, and pass it a string; the runtime dynamically coerces the string to a list. If there's no direct coercion to list, it finds one through lists of chars.. Hmm, I wonder if people have tried this sort of dynamic searching for paths through the coercions table. If I could do that I'd be more carefree about spawning new types. Just create a new 'test type, write a coercion to 'fn, and everything from before would continue to work.
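In case it's clearer what I mean, here's a sketch of the search I'm imagining (coercions*, defcoerce, and coerce-path are made-up names, not real wart or Arc functions):

(= coercions* (table))  ; source type -> (table of dest type -> converter)

(def defcoerce (from to f)
  (unless coercions*.from (= coercions*.from (table)))
  (= ((coercions* from) to) f))

; Breadth-first search for a chain of types from 'from to 'to.
; Returns the chain, e.g. (string cons), or nil if there's no path.
(def coerce-path (from to)
  ((afn (frontier seen)
     (whenlet (path . rest) frontier
       (let here (car path)
         (if (is here to)
             (rev path)
             (withs (nexts (rem [mem _ seen] (keys (or (coercions* here) (table))))
                     paths (map [cons _ path] nexts))
               (self (join rest paths) (join nexts seen)))))))
   (list (list from))
   (list from)))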
---
You're right of course, and I was wrong: you can't disambiguate the cases just by number of args. And even if you could that is perhaps hacky.
Proliferating types is something I don't blame you for worrying about. Since your extension system is based on one extension per type, introducing a new type means extending all the necessary utilities again. At one point I thought that was fine, that I could just put in macros that performed several extensions at once. Now I prefer a base-extensions-on-support-for-other-extensible-things approach for Penknife, but I haven't had enough time to figure out what the pros and cons are.
I've definitely considered that path-search idea before, and I think at one point I talked about why I don't have much faith in it. The short of it is, if there are multiple coercion paths from one type to another, how do you determine which path to use? Be careful; if there's already A->B->C and I define a new type D with A->D and D->C, then code that expects A->B->C might use A->D->C and break. Technically you could propagate the transitive closure each time a coercion is defined, thereby creating the kind of stability needed to solve that, but I don't know if that's intuitive enough. Maybe it is. :)
It's definitely one of the goals of wart to minimize the number of methods you have to give a new type to get all the primitives to work. sort isn't generic because the comparison operators are.[1] I think python's __names__ give us a pretty good approximation of what we need.
You could manually type-wrap your arguments when calling basic utilities, but that seems verbose almost all the time. ^_-
And in your approach you'd have to hardcode some coercion in every primitive.
If you're talking about putting the wrapper on the target type, like (rep:as special-coercion-wrapper x), that's still more verbose than special-coercion.x.
Whether you care about that verbosity depends on how often you need special-coercion. I just don't want some rare concept taking up space in the namespace.
Ok, enough generalities in this sideshow. Focus on my other responses :)
And in your approach you'd have to hardcode some coercion in every primitive.
I don't follow. In Arc I never say (coerce x 'sym); I always say sym.x instead. Neither of those is more hardcoded than the other, IMO; sym.x hardcodes the 'sym global binding, while (coerce x 'sym) hardcodes the 'coerce global binding and the 'sym type symbol, but they're both ways of looking up the same separately specified behavior.
...Oh, by "primitive" do you mean a function like 'sym? I think it's just a matter of perspective whether 'sym or 'coerce is the more basic concept. IMO, 'coerce acts primarily as a way to offload some functions into a less convenient namespace (which I could do myself with a verbose naming scheme), and secondarily as a way to add an extra dimension of meaning to type tags when they're not attached to types (which strikes me as suspicious wrt extensibility...).
Hmm, I guess the 'type function itself breaks polymorphism. I wouldn't arbitrarily limit the programmer from using 'type, and I know your 'defgeneric relies on it almost intrinsically, but now I've persuaded myself a bit more that I'd rather avoid dealing in specific type tags whenever possible.
---
Ok, enough generalities in this sideshow. Focus on my other responses :)
Apparently I don't do that. <.< Speaking of which, what was the original topic again? >< (* looks it up*)
:) Depends on how far back you go, but I choose to think it's about whether coerce is worthwhile or can be taken out.
If we can have 2 coercions to function, testify and checkify, every primitive that needs a function now must decide which one it's using. That's what I meant by hardcoding. You're trading off verbosity in setting things up for a new type (which I hope will be relatively painless) with just not being able to override certain decisions. I think I'd err toward being more configurable albeit verbose when a type needs a second coercion.
Uh, I only suggested 'checkify as a less confusing name for 'testify. If a language does include both, then yes, it's a bit arbitrary whether something uses one or the other. :) Orthogonality fail.
I believe the choice between 'testify and 'conversionify is very clear. There's no way 'all and friends would use 'conversionify, and there's no way my hypothetical update to 'copy would use 'testify.
---
You're trading off verbosity in setting things up for a new type (which I hope will be relatively painless) with just not being able to override certain decisions.
If you have a choice between (as test x) and (as check x), it's just as hard to override that decision.
I want to try really hard to have just one coercion per type combination, and let users create new types when they want a different conversion. You point out that's verbose, but it may not be overly so, and at least it's overrideable.
testify vs conversionify (ugh :) may be a case where the primitives themselves need multiple coercions. That's fine :) I'm just saying I'd try really, really hard to avoid it because people now have to keep track of two such things. And I'd definitely try to designate one of them the 'default' coercion like arc already does.
Yeah, I might have called it 'coercify if not for the confusion it would cause. ;) And so far I only have one (niche) use in mind for it, so it doesn't have to be nice and brief.
Actually, it would make more sense to call it "transformify." Both "coerce" and "convert" have the connotation of at least sometimes transforming from one kind of value to another. For my purposes, the function [+ 20 _] is a valid behavior--it certainly makes sense to 'copy a data structure but add 20 to one of its parts--and yet the inputs and outputs of that transformation are of the same kind.
---
may be a case where the primitives themselves need multiple coercions.
Nah, you can resort to (as test x) and (as conversion x), with 'test and 'conversion--or 'transform--being callable types.
This is the same thing I meant when I said "you can make a 'test type to turn 'testify into a coercion."
---
I want to try really hard to have just one coercion per type combination, and let users create new types when they want a different conversion. You point out that's verbose, but it may not be overly so, and at least it's overrideable.
Feels like I'm talking past you a bit.
When users want different conversions, they can already define new "anything -> type-foo" global functions, and those are already extensible/overrideable. There's no need to add a framework for that, even if it does manage to be a tolerably simple framework with a tolerably brief API.
Code that chooses between the super-similar testify.x and checkify.x is just as doomed to being arbitrary and hardcoded as if it were to choose between (as test x) and (as check x). The 'coerce framework doesn't help with this hardcoding issue.
---
And I'd definitely try to designate one of them the 'default' coercion like arc already does.
I don't generally agree with designating one operation as default simply based on the types it manages, like "anything -> type-foo." For a more obvious example, it would be especially silly to identify a default "number x number -> number" operation.
Still, I think we agree on what's important. If the point of operation X is to reshape things to expose their fooish quality, then we don't want an operation Y whose point is also to reshape things to expose their fooish quality.
You draw the line farther toward "reshape things to expose their _____ish quality" and embrace 'coerce as a way to abstract the _____ away. As for me, I suppose I've come to believe _____ is an intrinsically subjective, second-class concept.
At one point I wanted Penknife to have a type type, with operations like list'ish.x, list'ify.x, [inherits list seq], and [new.list a b c], which could then be abstracted into other utilities like [def list my-list a b c]. It's been a while, and now I've kinda given up looking for a great advantage; I think the one plus would be having only a single global variable to attach the list documentation to, and I'm not especially excited about that. ^_^
These days I have the vague notion that a user-defined type which extends all the right things oughta be an equally pure representation of _____ish quality; the correspondence between _____s and types isn't one-to-one.
"You draw the line farther toward "reshape things to expose their _____ish quality" and embrace 'coerce as a way to abstract the _____ away. As for me, I suppose I've come to believe _____ is an intrinsically subjective, second-class concept."
That's a pretty good summary. I really haven't considered that alternative at all, so keep me posted on how it goes :)
Lol, by "second-class concept", I mean something that isn't modeled as a value in the language. My gut reaction is that it takes zero work to pursue that alternative. :-p
But yeah, I do still need to solve the same problems 'coerce solves. My alternative, if it isn't obvious by now, is to have global functions that do (fooify x) and occasionally (transform-as-foo x (fn (fooish-x) fooish-result)).
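As a concrete instance of that second form (transform-as-list is just an illustrative name):

; Transform a value by way of its list of elements, then convert back to its
; original type.
(def transform-as-list (x f)
  (coerce (f (coerce x 'cons)) (type x)))

; (transform-as-list "abc" rev)  ; => "cba"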
coerce, ify, transform, these seem really similar to me. Why bother splitting such hairs?
I'm reminded of your earlier statement: The purpose of a coercion isn't entirely explained by "I want something of type X." In these cases, it's "I want a Y of type X," and the Y is what's different. Perhaps you didn't go far enough? If you're going to eschew coerce why not get rid of all trace of the types? Don't call it stringify. Call it serialize. Don't call it transform-to-int, call it parse-number. Don't call it aslist, call it scan. Or something like that. It isn't these names, but good name choices would bring me entirely in your camp :)
Ah, right. ^_^ Yeah, most of the times I "use" these things, I'm just making some other abstract nonsense utility. It could be a while until I have another down-to-earth application to try this stuff on.
coerce, ify, transform, these seem really similar to me.
Well, I don't distinguish "ify" from "coerce" except when it helps tell apart the subtle variants we're talking about in this conversation. :-p The subtle variants wouldn't all go into the same language, I hope.
"Transform," on the other hand, seems more general. Really, 'transform as a type wouldn't mean anything other than "unary function, preferably pure." I could very well be splitting hairs by drawing a distinction between that context and the everyday "function" context. However, I think there are some things I want to treat differently between those cases, like ints (multiplication in function position, constant functions in transform position).
If you're going to eschew coerce why not get rid of all trace of the types?
Actually, that's how I think about it. Coercion is a technique to focus output ranges or widen input ranges so utilities are easier to use. If I consistently want a certain kind of focusing--and only then--I'll put it into a utility.
I name those utilities to correspond not with specific types but with informal target sets, usually based on the extensible utilities they support. However, any given tagged type means little except the extensions associated with it, so I think my approach has a natural tendency toward corresponding names anyway.
Don't call it [this], call it [that].
I think I agree with you there. However, I don't expect to come up with witty names as fast as I change my mind about the designs of things. :) For now I'm happy resorting to cookie-cutter names like 'fooify, 'fooish, and 't-foo, just so it's easier to notice relationships between variables and between informal concepts.
"Coercion is a technique to focus output ranges or widen input ranges.."
That's a nice way to think about it.
"I don't expect to come up with witty names as fast as I change my mind about the designs of things. :) For now I'm happy resorting to cookie-cutter names like 'fooify, 'fooish, and 't-foo"
I think in terms of naming functions, it's a good idea to give short names to the functions you call often.
I also think that overloading or "punning" functions is fine, when it allows you to type a short name more often.
I think it's a good idea for the underlying, more primitive functions to remain available. For example, if I did want to call an intersperse with strings and have it return a list, it would be nice to be able to call the underlying intersperse function directly. The underlying function can be given some longer name, since you don't have to type it very often.
Eventually we start to notice some commonality among these patterns. We have a bunch of functions that could return either a list or a string. We have a bunch of functions that could search from the left or search from the right. I haven't given any thought to what the best way to express these patterns is (whether in the function name, or with prefixes, or with keyword arguments... etc.), but I don't think that some kind of declarative language ("I want to find the three rightmost characters that meet this test and return the result as a string") is necessarily a terrible idea...
So if I read you right, you're saying that having intersperse do something special for lists of strings is ok, and we could include a longer-named version that doesn't do the special case?
Yeah. I think so. For example, "map" returns a string when passed a string, which strikes me as a similar situation.
I could see having a longer-named version of "map" which is the primitive, simple case, that just returns a list. So if I know I want a list I have the option of calling the longer named version, while the shorter named versions such as "map" or "intersperse" do whatever is the most common case of what I usually want.
I think I was confused because intersperse doesn't seem like a short name to me :) But of course you mean 'prime namespace real estate', which is partly about length but also about being an english word and so easier to remember.[1]
intersperse still seems weirder than map, because map just returns the type of its args, while intersperse is checking the type of elements inside its list arg.
[1] This made me also focus on the fact that testify is actually an english word - and arc is misusing it.
intersperse still seems weirder than map, because map just returns the type of its args, while intersperse is checking the type of elements inside its list arg.
Yeah, that's one reason I'm not sure I agree with this proposal. Fundamentally, I don't like the idea that (intersperse x (list a b c)) would return (list a x b x c) sometimes and (+ a x b x c) others.
If "string:intersperse" is common enough to be a standalone utility with a shorter name, I'd probably name it after PHP's implode(). I'd probably make it use '+ too, so that it could be used to construct lists and custom types of sequence:
(def implode (first between seq)
  (apply + first (intersperse between seq)))

(implode "?" '& ...)
(implode "" "\n" ...)
Using '+ instead of hardcoding 'string makes less sense if you have '+ dispatch on the type of the last argument. But hey, I'm a fan of dispatching on the first arg anyway. ;)
This made me also focus on the fact that testify is actually an english word - and arc is misusing it.
How about 'checkify? ^_^ I don't actually mind word misuse, but I do kinda like the idea of reserving the term "test" for unit tests, and I suppose there are even contexts where 'testify could be used for its English meaning:
- assertions
- debugger interaction
- proofs
- simulations of belief, knowledge, perception, persuasion, etc.
- generally, status reports, contracts, and sanity checks registered
with some surrounding framework or compiler
Force of habit. XD I was thinking of 'pos* as a local variable for some reason. (I was taking it out of functional position so the name wouldn't risk conflict with any macros defined before loading the file, but that's pointless, since a macro named 'pos* would have been overwritten by that point anyway.)
I think it is really, really bad to have to type "(do x)" instead of "x" just to avoid macroexpansion. Additionally, I think it is unnecessary: you'd only have to do that because arc3.1 happens not to check for lexical variables when determining whether to macroexpand. This is different from the way it works in Scheme and Common Lisp, and I don't think it was a deliberate design decision, and it is really easy to change:
So just treat it as a bug in arc3.1--it probably won't even cause problems most of the time, because it's kind of a rare case--and fix it in your own ac.scm, and assume that it will be fixed in future users' Arc implementations. Please do not establish "(do x)" as good coding style. If the compiler screwed up whenever you had a variable that had three vowels in a row, the solution would be to fix the damn compiler, not to awkwardly twist up your own code; and if other people were using the old broken compiler, you'd first tell them to upgrade, and only change your own program to cater to the bad compiler if it was absolutely necessary for some reason--and you'd do so after you'd gotten your program the way you wanted it.
Edit: It is probably a hard problem to diagnose if you just leave it there and someone uses it and is like "why is this screwing up?". So if you intend for others to use it, you could put in a thing like this:
(mac achtung (x) `(+ ,x 2))
(let achtung [+ _ 5]
  (unless (is (achtung 0) 5)
    (err "Oh god you have a bad Arc compiler. Fix that crap:
http://arclanguage.org/item?id=13606")))
> So just treat it as a bug in arc3.1--it probably won't even cause problems most of the time, because it's kind of a rare case--and fix it in your own ac.scm, and assume that it will be fixed in future users' Arc implementations.
I'm all too happy to do that. ^_^ Note that it'll probably mean I've introduced bugs to Rainbow and Jarc. :-p
Part of the reason I've harped on the state of affairs that causes me to write (do x), as well as the reason the state of affairs isn't altogether bad (that it makes macros that generate macro forms a tiny bit less susceptible to variable capture), has been in the hope that people will become confident that it's really annoying. ^^;
I honestly didn't know I'd be the only one actively defending it (rather than merely leaving things the way they are or upvoting), so I continued to give it the benefit of the doubt. Let's not do that. :)
> it probably won't even cause problems most of the time, because it's kind of a rare case
Actually, I use local variable names that clash with macros all the time, and that's why I started my (do x) and (call x) patterns in the first place. :) If I remove the cruft right away, my code will almost certainly break until everyone has upgraded. Dunno if anyone but me is going to suffer from my (published) code breaking, but hey. :-p
---
By the way, in this case, I was programming to a different core language implementation--in fact hacking on something core to that implementation--and I admit I cargo culted.
I was hoping to actually try running some code last night, but no dice. Anyway, that sounds like it would in fact be a problem. In fact, in Arc 3.1 at least, 'defset things are looked up at compile time, and 'get is given special treatment. Try something like this:
(= foo (table) bar (table) baz (table) qux (table)
   quux (list nil nil))
(mac get1 (x) `(,x 1))
(let ref bar (= (ref foo) 2))
(let get idfn (= ((get baz) qux) 3))
(let cadr car (= (cadr quux) 4))
I wouldn't worry about 'get too much. It's inconsistent in just the same way as metafns are, so it's just one more in a list of names we shouldn't overwrite or use for local variables.
The setforms case is a bit more troublesome. Maybe they shouldn't be macro-like, and should instead be special-case extensions of 'sref? If the return value of 'get were its own type with 'defset and 'defcall support, that case could be eliminated too.
This may be a case where I've painted myself into a corner with wart. Since wart is built atop a lisp-2, I can't just expand the ssyntax a.b to (a b). I have to generate (call a b). Since I want to be able to say things like ++.a, I need call to handle macros as well. But now (call f x) will try to call a macro called f before it looks for the lexical binding.
It's not as big a problem as it may seem. You don't have to worry about future macros shadowing lexical bindings as long as they load afterwards.
The biggest compromise I've had to make because of this: using call-fn (which doesn't expand macros) in accumulate (https://github.com/akkartik/wart/blob/ed9a7d4da1fa017188fce2...) because I wanted to name one of the keyword args unless. So you seem to be watching over your creation after it's left the nest :)
(tangent)
I spent an embarrassingly long time trying to have lexical bindings override macros, before realizing that's impossible in wart: macros get expanded long before lexical bindings are created. So this is a case where you really need a full interpreter; any macro you write can't inspect lexical bindings up the call stack. (oh, for python's nested namespaces..)
(even bigger tangent)
Wart update: arc.arc is just about done. I'm going to start on the webserver, probably not srv.arc but palsecam's http.arc (http://www.arclanguage.org/item?id=11337).
I ended up dividing up ac.scm into 17 files, and arc.arc into 26 (the boundary is fuzzy). So far each file seems highly coherent, and most files are short, so the codebase feels the way the old-timers described forth code: "A component can usually be written in one or two screens of Forth." (http://prdownloads.sourceforge.net/thinking-forth/thinking-f..., pg 41; screens are Forth's units of code.)
> I ended up dividing up ac.scm into 17 files, and arc.arc into 26 (the boundary is fuzzy). So far each file seems highly coherent, and most files are short
Ok, I started reading wart. :)
Skimmed that Forth reading. I've only dabbled in Factor, but I'm sometimes tempted to explore concatenative programming more in depth.
Yes, "mliteral", for example, will only look ahead for as many characters as it needs to. So if you type
#\A [Enter]
then even though "#table" is more characters than are yet available in the input, it will only try as far as "#t" before it backtracks and tries a different alternative.
Please show an example of using macro-alias, if you had a working version :)
I expect you want
,',b
Imagine the variable "b" contains the symbol "foo". You want to insert (the first comma) the symbol "foo" (,foo) but that would treat foo like a variable, so you have to quote it (,'foo) to get a literal "foo".
Now you want to substitute the value of b for foo, so replace foo with (,b).
,'foo
foo => ,b
,',b
Why one quote instead of two?
,','b
That's because you don't want a literal "b", you want the value the variable "b" contains.
Why one quote instead of zero?
,,b
because you don't want the value of the variable "foo" when b contains the symbol foo, you want a literal foo.
Why do you have to use ,' at all? After all
`(one two three)
inserts literal symbols without any messing around. But now you have no way to evaluate anything.
Imagine that symbols didn't "auto quote" themselves in a quasiquotation expression. That is, suppose that `(one two three) was an error. Just wasn't implemented. So you had to write
`(,'one ,'two ,'three)
now it's easier to see where ,',b comes from.
Wow, this is pretty cool. I've never had a way to explain ,', before ^_^
I successfully applied your thinking to make args a gensym:
(mac macro-alias (a b)
  (w/uniq args
    `(mac ,a ,args
       `(,',b ,@,args))))
This works like alias at http://arclanguage.org/item?id=13097 for macros. (It would also work on functions, but macro-aliases of functions are not first-class functions.)
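For example, I'd expect a session like this to work (untested):

(mac double (x) `(* 2 ,x))
(macro-alias twice double)

(twice 21)  ; => 42, expanding through double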
These kinds of problems are exacerbated by the fact that all iterator and map/reduce type functions always return cons. Which at first glance might seem OK (i.e. you pass in a vector and end up with a list), but in your functions you never actually see the list; it's just a result or by-product that you pass into some other function. So for me, I read vector, but need to think list, and know things like conj are doing particular things under the hood.
What if the functions did behave the same on different data structures? Would that make having map and iterator functions always return cons less confusing?
The reason I ask is that sometimes I think about things like having car and cdr work on strings... and functions that take string arguments also accept lists of characters. This would simplify some code like map, because you could pass it any object that could be treated like a list, and it could cdr its way through without having to have special code to check whether it had been passed a string. However it would then always return a list; that is, mapping over a string would give you back a list of characters rather than a string.
> "Would that make having map and iterator functions always return cons less confusing?"
Yes it would/does. When I created my arc libraries, I built the normalization in. So I have a join function, which detects the type and applies the appropriate operation. I don't use conj, because I don't want to think about those things.
Interesting that you picked a string example, since that's exactly how Clojure already works. That is, using map on a string returns a list of chars. Often I need to:
(apply str (map #(upcase %) "abc"))
For whatever reason I don't mind this, I expect strings get converted to lists.
Here, seeing only the vectors, I would get caught thinking it would produce (1 2 3 4 5 6); instead you would get (6 5 4 1 2 3).
Of course I went and created two join functions with my arc functions.
join : always adds to left side list or vector
join> : always adds to right side list or vector
(let [myVector (atom [1 2 3])]
  (each x [4 5 6]
    (reset! myVector (join> x @myVector)))
  @myVector)
And I no longer have problems, I can see what's happening at the top level and it's consistent.
So in the end it's not such a big deal when you're willing to normalize all the functions.
[note: I never tried out the code, it may not work, but you get the idea]
I have a generic 'map design on my mind, and it may or may not be related to your 'deftransform approach. The idea is to have another method that takes two arguments: A value to convert into a lazy list and back (or seq, or stream, or whatever you want to call it ^^ ), and a function to transform the value while it's a lazy list. Then implementing 'map is just a matter of implementing it for lazy lists.
Technically, the intermediate value doesn't need to be of any particular type, just as long as it supports lazy list style iteration--another generic method. That generic method's what I'm calling "ifdecap". So with cons lists, for instance, you can have them implement 'ifdecap, and then the conversion-and-back method doesn't need to make any conversion to a lazy list (but it still needs to convert back).
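Here's a rough sketch of the shape I mean (the signatures of ifdecap and transform-as-seq here are guesses for illustration, not settled designs):

; map over anything that supports ifdecap-style iteration
(def map-seq (f seq)
  (ifdecap seq
    (fn (x rest) (cons (f x) (map-seq f rest)))
    (fn () nil)))

; generic map: view x as a seq, transform it, convert back to x's own type
(def gmap (f x)
  (transform-as-seq x [map-seq f _]))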
A lot of this design is based on conversations we've had here before.
This isn't one of the conversations I was thinking of, but it's relevant to many of the things we've been talking about recently: http://arclanguage.org/item?id=8249
Here's another place an extensible 'map was brought up, which wasn't one of the conversations I was thinking about either. XD http://arclanguage.org/item?id=4141
Yeah wart now follows that recommendation for what to return. But it's still not clear how map should operate on new sequence types. Your second link suggests the idea from ruby that map rely on each. Hmm, I haven't built each yet in wart, and I've felt the fleeting need several times..
FWIW, I implement 'each based on 'some. Iterating with 'some has the advantage that you can break the loop early in a natural way, rather than with some tacked-on extra feature. Although 'all allows this too, and it might appear to be an equivalent starting point, you can only break an 'all loop using nil, and so the only possible results of (all ...) are nil and t, unlike 'some, which can return any value. That makes it easier to implement 'all in terms of 'some than vice versa; the (no:some no:func xs) approach loses no information.
Furthermore, 'some lets you implement 'each and 'all without resorting to mutation. This makes the interaction with continuations a little less frustrating (IMO), as long as the types you're using have continuation-friendly implementations of 'some.
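Concretely, the derivations I mean look something like this (using fresh names so as not to clobber the built-ins):

; 'all in terms of 'some: the (no:some no:func xs) trick
(def my-all (test seq)
  (no (some (complement (testify test)) seq)))

; 'each in terms of 'some: a body that always returns nil never breaks the loop
(mac my-each (var seq . body)
  `(do (some (fn (,var) ,@body nil) ,seq)
       nil))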
---
But it's still not clear how map should operate on new sequence types.
It's clear to me, at least. :-p Under my design, it's up to the type's designer to decide what it means to convert their type to a lazy list and back. I think I'd suggest erring on the side of converting back to a picky container (e.g. a string, which can only hold characters) even at the expense of errors, 'cause someone who wants to get around the errors can just convert their picky container into a lazy list before passing it to 'map, and convert it back to whatever lenient container they need after that.
Hmm, I wonder if each is a good way to encode the lazy-list conversion. Perhaps it makes sense to make each generic, and map just use each.
---
As a reminder of how much I still have to learn, I just spent 2 days off and on trying to get lexical functions to take precedence over macros in wart before realizing that it makes no sense; macros get expanded long before the lexical scope is even created.
I vaguely remember a comment that this is possible by changing 2 lines in the arc interpreter. Now I know that they make the resulting language impossible to compile :/
Since ac crawls over the s-expression transforming everything into Scheme, whenever it sees a fn (which is what creates lexical bindings), it keeps track of what variables are being introduced in that scope with the env argument. E.g., see the last line of the function
(define (ac-complex-fn args body env)
  (let* ((ra (ar-gensym))
         (z (ac-complex-args args env ra #t)))
    `(lambda ,ra
       (let* ,z
         ,@(ac-body* body (append (ac-complex-getargs z) env))))))
So it's entirely possible to compile things as function calls (instead of macro calls) when a variable's lexically bound. But this relies on how macros can't be lexically bound in Arc (no macrolet equivalent). On the other hand, first-class macros might be defined at runtime, and the compiler can't figure out whether something that looks like a function call is actually a macro call, hence forcing interpretation: http://arclanguage.org/item?id=11517.
It is theoretically possible to do a form of JIT compilation, where you compile the normal function (presuming function calls), but keep the source around and dynamically compile and link in branches of code for when the argument is a macro.
Basically, you'd have a jump table based on the argument. If it was a function, you'd jump to the function version. If it was a macro, you'd look it up in a hash table to see if it had been used before. If not, compile the function again and add the new one to the hash table.
Obviously, this wouldn't be as fast as a normal function call, since you'd have to do runtime checking of one of the arguments, but if you were careful the common case could be quite fast, with only the macro calls being slow and only new macro calls taking more than a small amount of time.
Of course, this is only one possible implementation. It is true that any language that supports macros could not be entirely statically compiled, but then again as far as I know any higher-order functional language requires runtime compilation as well. It partly depends on how you define "compilation."
It is true that any language that supports macros could not be entirely statically compiled
Actually, in a language with no side effects, I don't think there's anything keeping run time level code from being used at compile time, as long as there's already enough information to compile it. Who would ever know? ;) This means you have the most unique benefits of macros, that of being able to program them in the same language and environment as the rest of the program.
This is the basis behind Blade, an extensible statically compiled language I was making before I realized the lack of side effects made it really hard for me to write code for. :-p
> before I realized the lack of side effects made it really hard for me to write code for. :-p
So that's why you stopped working on Blade. I was curious! ^_^ Can you speak more about your realization that side effect-free programming is impractical? I've sometimes found myself tempted by pure FP, you see.
[PRO TIP: Click "link" (or "parent") on this comment, so you can view it (mostly) by itself rather than squashed against the right side of the page.]
For me, when I was writing my spider solitaire program, I had a couple of global variables, the-deck and the-piles, which represented the state of the game. the-deck was a list of randomly permuted cards, which were integers from 0 to 51 (or 0-25 with two suits, or 0-12 with one suit). the-piles was a list of piles, which were lists of cards (the zeroth element was the top of the pile), and the cards were of the form (list card revealed?), where "card" = 0-51 and "revealed" = t/nil. (Yes, a card is an integer in the deck, but a list of an integer and t/nil on the board. This does not need to be changed.)
To flip over the top card of the nth pile, I went:
(= (cadar the-piles.n) t)
To deal a card (face up) to top of each pile from the deck, I said:
(forlen i the-piles
  (push (list pop.the-deck t) the-piles.i))
To move a chain of n cards from (the top of) one pile to another, I went:
(let (a b) (split the-piles.i n)
  (= the-piles.i b
     the-piles.j (join a the-piles.j)))
Also, I created a global variable called "prev-states"; every move the user performed would push a (deep) copy of the game state onto it, and the "undo" command went:
(let (a b) pop.prev-states
  (= the-deck a the-piles b))
Now, what would you do in a "pure functional programming" language? If there were no assignment whatsoever, not even to global variables, I'd probably effectively implement global state by making every function take an extra argument representing global state, and shifting code around in other ways. If you could assign to global variables but all data structures were immutable (I think Clojure is like that, correct me if I'm wrong), then I'd have to keep rebinding the entire "the-piles" list when all I wanted to do was alter one or two piles. Revealing the top card of the nth pile would look something like this:
(= the-piles
   (join (take (- n 1) the-piles)
         (list:cons (list (car the-piles.n) t) (cdr the-piles.n))
         (drop n the-piles)))
(Note that "push" and "pop" are legal because they just do global assignment; they don't alter any data structures). And then moving a chain of n cards from i to j would look like this:
(= the-piles
   ; ... oh god I have to know whether i>j
   ;(join (take (dec:min i j) the-piles)
   ; ... egad I may as well just say (if (< i j) ...
   ;(if (< i j)
   ;    (join (take dec.i the-piles)
   ;          (list:drop n the-piles.i)
   ;          (cut the-piles i dec.j)
   ;          (list:join (take n the-piles.i) the-piles.j)
   ;          (drop j the-piles))
   ;    (join ... egad the repetition is intolerable
   ; hmm, this'll work
   (with (new-i (list:drop n the-piles.i)
          new-j (list:join (take n the-piles.i) the-piles.j))
     (if (< i j)
         (join (take dec.i the-piles)
               new-i
               (cut the-piles i dec.j)
               new-j
               (drop j the-piles))
         (join (take dec.j the-piles)
               new-j
               (cut the-piles j dec.i)
               new-i
               (drop i the-piles)))))
On the plus side, I would no longer need to use "deepcopy" when creating a backup of the game state.
(push (list the-deck the-piles) prev-states)
Which is pretty much the only benefit I'd get from outlawing modification of data structures, which I assume is what "pure functional programming" would mean. Don't even talk to me about how this would look if I had to simulate global variables...
Functional programming is a useful tactic in some cases. For example, I use a "chain-length" function, which might, say, take the list (7♥ 8♥ 9♣) and return 2, the number of revealed cards of consecutive rank of the same suit. I first thought about making it take "i" as an argument, referring to the ith pile in the-piles, but I made it take the list "xs" instead, and that allowed me to use it for other purposes: e.g. determining whether a move of n cards from pile i to j will create a longer single-suited chain. (Like, moving that 7♥ 8♥ chain onto a 9♥. I have a couple of AI-assistance procedures that look for moves satisfying conditions like that.) The expression is:
(is (+ n chain-length:the-piles.j)
    (chain-length:join (take n the-piles.i) the-piles.j))
Had I written "chain-length" to refer just to the ith pile in the global "the-piles" variable, I'd... basically have to rewrite most of it. So factoring out things into individual functions (which, btw, is, I think, in the category of things called "functional programming", but isn't usually the same as making things side-effect-free) is frequently a good thing--it makes the program easier to write, to read, to understand, and to maintain.
But the goal, the reason you'd want to do it, is not because it's called "functional programming"--it's to make the program easier to write/read/understand/maintain. And if something called "functional programming" makes it harder to do those things, then I would advocate throwing it out. I think preventing all side effects, or all side effects except those on global variables, falls firmly into the latter category; I think my example demonstrates this.
; Move a chain of n cards from i to j.
(zap [let (taken leftovers) (split _.i n)
       (copy _ i leftovers j (join taken _.j))]
     the-piles)
; Reveal the top card of the nth pile.
(def copdate (orig . kvs)
  (apply copy orig (mappend [list _.0 (_.1:orig _.0)] pair.kvs)))

(zap [copdate _ n [cons (list _.0 t) cdr._]] the-piles)
In principle I agree, disabling the programmer's ability to use mutation without reason is just frustrating. But I don't expect an example to help show this; most examples we could come up with would probably have easy-in-hindsight solutions like this. I think the point is that mutation gives us more easy ways to do things, so that we can quickly experiment and stuff.
I see you want to push the example to its limit a bit. ^_^ Well, I assume this hypothetical language would be stocked with utilities that made up for what its pureness missed out on. How believable are these?
That's definitely what I meant, and I was going to just say so, but when I saw the ":)" I realized akkartik probably knew that and meant something like "that may not be 'oh god' complicated, but it's still not nearly as convenient as the mutating code." I don't know if that's a correct interpretation, but it made it more interesting to respond. ^_^
Yeah, I like the way you handle the-deck and the-piles. And I like this style of programming in general (i.e. a few global variables to keep state, functions corresponding to verbs in the problem space that manipulate the globals in an intuitive way). You've reminded me of how preventing all side effects could get awkward by ruling out this style of programming entirely.
To clarify, I'm not mostly interested in eliminating side effects because of some obsession with purity, but rather for performance's sake. I've been pursuing so many language design choices that are supposed to really cripple performance, like fexprs/first-class macros, that I'm starting to worry my hobby implementations aren't even going to run. I'd heard that Haskell and Clojure get a big performance boost from enforced immutability/no side effects because it guarantees that most operations can safely be run concurrently, so I've just considered it an option.
I'm a real novice when it comes to optimization, and I'm certainly interested in techniques that could help me get decent performance without having to sacrifice mutability. TCO, strategic memoization, compiler hints? Any tips on this front could be really useful to me.
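(By "strategic memoization" I mean the kind of thing arc.arc's 'defmemo already does--caching a function's results by its argument list:)
; 'defmemo is plain Arc; each distinct n is computed only once.
(defmemo slow-fib (n)
  (if (< n 2) n (+ (slow-fib (- n 1)) (slow-fib (- n 2)))))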
Aww, you know I can't resist talking about my hobby horses. XD
Blade is written in Groovy right now, and to make partial evaluation possible, everything is done in continuation-passing style. Partial calculations' continuations are held in stasis until the definitions they're waiting for become available. I guess it's related to cooperative multithreading and single-assignment variables, but I'm kinda playing it by ear.
Back when my work on Blade was petering out, the main reason was that Groovy's syntax wasn't very friendly for continuation-passing style. I really missed macros, and I wasn't in the mood for writing a complicated Groovy AST transformation. I wanted to just port it to Arc or something, but I had trouble writing Arc code itself without getting hung up on all the abstraction leaks (particularly unhygienic macros) and absent features (particularly OO-style type-based dispatch and weak references).
Meanwhile, thanks to ranting about Blade here, I realized it wouldn't be a good REPL language. Single-assignment has gotta paint people into corners at a REPL, and the alternative would be to compile the whole application over and over, using some meta-runtime to orchestrate the compiles. But what runtime?
Also, I began to suspect Blade's design itself would result in something ugly to use and possibly even unusable. A declaration's behavior would depend on definitions, and a definition's value would be what the declarations said it was, and naturally there'd be deadlocks. The deadlocks were easy to avoid in toy examples, but I didn't know what limitations and gotchas Blade would develop once it matured. (I still wonder about that.)
So a plan fell into place: I could pursue a language that had similar high hopes for extensibility as Blade, but in a REPL-friendly form. That language's evaluation model would be much more familiar to me, and it would be closer to any of the languages I might choose to implement it in, so I'd probably be able to implement it faster and more confidently. Then I could use the resulting language--and the lessons I learned from it--to experiment with different variations on CPS and finish Blade.
So I started working on that language, and I called it Penknife. ^^
Altogether, I don't think this was based on any bad experience with pure functional programming. It was more about my expectation of those bad experiences, together with the frustration of writing CPS code in longhand Groovy.
> Meanwhile, thanks to ranting about Blade here, I realized it wouldn't be a good REPL language. Single-assignment has gotta paint people into corners at a REPL,
> So a plan fell into place: I could pursue a language that had similar high hopes for extensibility as Blade, but in a REPL-friendly form.
So the REPL was a large part of it. I've heard too that single-assignment is problematic for REPLs, but I don't quite understand why. Can't you just use let everywhere you'd use = and def ordinarily?
; Scribbling at the REPL
arc> (= x 5)
5
arc> (def add1 (n) (+ n 1))
#<procedure: add1>
arc> (add1 x)
6
; Single-assignment version
arc> (let x 5)
nil
arc> (let add1 [+ _ 1])
nil
arc> (let x 5
       (let add1 [+ _ 1]
         (add1 x)))
6
REPLs should provide a convenient way to grab commands from the history anyway, so you can just keep building on your previous let. It is more cumbersome, but you also avoid some of the nasty surprises that come after you've accumulated a complex state in your REPL.
Could this style of REPL'ing address the problems posed to REPLs by single-assignment, or are there larger problems I'm ignoring?
Update: In the particular REPL example I've shown, there's no need for the style transformation since nothing gets redefined. Hopefully you can still gather what I mean from it! ^_^
Can't you just use let everywhere you'd use = and def ordinarily?
Well, single-assignment can have variables that are introduced without values and then assigned to later. It's a bit of a relaxation of pure FP that (I think) doesn't sacrifice referential transparency. Hmm... actually, a lambda that assigns to a binding outside itself is not referentially transparent, so oh well. :-p
In any case, Blade isn't directly supposed to be a single-assignment or pure sort of language. I just want to view the space of Blade declarations as a completely orderless set, letting the compilation of each one run in parallel semantically, and I still want it to have an intuitive kind of determinism. Inasmuch as I can orderlessly orchestrate side effects in ways I find bearably intuitive, I'll probably add side effects to Blade.
Blade side effects would also be limited by the fact that I want to build up to an IDE that rewinds the compile when you edit. In fact, I'm almost certain my current plan won't work well with that. I expect custom declaration syntaxes to be supported by core-syntax declarations that happen to branch based on the declarations they find by reading the project tree, but then editing any part of a file would invalidate the branching itself and cause recompilation of all the custom-syntax declarations in that file. Seems there'll need to be a more edit-resistant kind of branching. I already expect a bit of IDE cooperation with this--it doesn't have to be as hackish as diffing plain text--but that's where my ideas leave off.
...Actually, I've got a bit of an idea already. XD A declaration that observes a file's contents in order to branch can limit its observation to some method that doesn't rely on the contents of the individual declarations, and then each of the branches can observe just its own part of the file. The core syntax already works like this, essentially splitting the file based on appearances of "\n\n[".
Anyway, you've gotten me thinking and ranting. ^_^
---
Would this style of REPL'ing address the problems posed to it by single-assignment, or are there other problems I've ignored?
The kind of problem I expect to happen is that I'll define something once, I'll define something based on that, and then I'll realize the first thing had a bug, but I won't be able to redefine it as simply as I can in Arc. Once I do enter the correct version, I'll have to paste in all the things that depend on it too.
To draw a connection, it's similar to how when you want to redefine a macro in Arc, you have to re-enter all the code that used the macro in order for it to use the new one, whereas fexprs wouldn't have this problem.
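Here's a tiny hypothetical session to illustrate (definition results and "redefining" warnings elided):
arc> (mac greet () ''hi)
arc> (def f () (greet))
arc> (mac greet () ''hello)
arc> (f)
hi        ; f still has the old expansion compiled in
arc> (def f () (greet))
arc> (f)
hello     ; the new macro only kicks in once f is re-entered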
Hmm, I wonder if each is a good way to encode the lazy-list conversion. Perhaps it makes sense to make each generic, and map just use each.
Feels a bit like you missed my point. Just in case, it was:
1. Make 'some, 'fn-ifdecap, and 'as-lazy-list generic.
; 'defgeneric versions
(some (fn (x) <test>) xs)
(fn-ifdecap (fn (x xs) <split case>) (fn () <empty case>) xs)
(as-lazy-list (fn (lazy-orig) <lazy result>) orig)
; the versions I currently have in my Penknife core library draft
[fn-any xs [fn [x] <test>]]
[fn-ifdecap xs [fn [x xs] <split case>] [fn [] <empty case>]]
[fn-decap-erism orig [fn [lazy-orig] <lazy result>]]
; I'll probably rename this to "call-as-seq".
; A "decap-er" is something that has fn-ifdecap behavior.
2. Base utilities like 'ifdecap (a macro), 'all, 'each, and 'map off of these.
The version of 'map in my draft is something like this:
[fun fn-map [seq func]
  [rely:decap-erism lazy-seq seq
    [nextlet rest lazy-seq  ; aka xloop, or named let
      [make-decap-er:fn [then else]
        [ifdecap first rest rest
          [then func.first next.rest]
          else.]]]]]
No 'each is required, just 'fn and 'xloop. :) And the 'xloop is just there to keep from bothering with conversion when recurring on the tail.
Aside: The "rely" here is supposed to make the result of fn-map undefined if the result of decap-erism is undefined. That part probably isn't easy to translate to Arc; I essentially have an extra implicit parameter in every Penknife function call, specifying what to try instead if the call is rejected. It's kind of a tacked-on thing, and I'd avoid it if I had some better idea, but it seems to help immensely with writing extensible utilities.
> Now I know that they make the resulting language impossible to compile :/
Is that a bad thing?
Update: What I mean is, after the recently-shared LtU thread about fexprs [1], I've had trouble understanding why compilation is so often considered a must-have feature. If a language is difficult or impossible to compile, why not just interpret it?
Interpretation seems fitting to me for a language like arc whose focus is on exploratory programming and axiomatic language design.
I'm starting to think that if a language is compilable by default (i.e. without compiler hints from the programmer), it could be an indicator that the language isn't dynamic enough.
For example, arc's macros are fairly powerful as is, but people sometimes wish their scoping was more robust or that you could pass them around as function arguments. When it's learned that these features come at the cost of compilability, the usual reaction is to conclude that the features aren't worth it. I think the proper reaction (when expressivity is more important to you than performance) is to conclude that compilation isn't worth it.
My opinions on this stuff are fairly new and untested, so maybe one of the more experienced lispers here just needs to set me straight. ;)
I like compiled languages like Scheme because they let me be expressive. I can use whatever oddball syntaxes I want, safe in the knowledge that the parsing inefficiencies will only give my application horrible load time. :) Performance isn't nearly as important to me as expressiveness, but I do sometimes write programs for purposes other than thought experiments, and those generally need to finish what they're doing faster than the time it takes for me to give up and rewrite them. ^^ In those cases, I'm happier if I've chosen a language that gives me a fair amount of practicality and a good amount of expressiveness at the same time, rather than one that promises lots of expressiveness but forces me to give it all up in order to get things done.
Fexpr calls--I assume you're talking about those--are pretty much never what I actually want to express in syntax anyway. When I'm writing and reading code, I'm viewing it from a static standpoint, and so I choose syntaxes that make sense in static ways. If the very description of my algorithm means different things at different points in the execution of the program, that's darn confusing. XD (I suppose that would be Necker Cube code.) Not all fexpr-powered programs need to use them that way, but if they don't, what's the point?
Were I to be viewing the program in motion, it would be a different story. Then I'd want a very dynamic syntax, even one that shows me the values of variables and other dynamic information, while still presenting it all in a well-factored enough way that it's easy to digest. I'd do searches on things like "log file" and "data visualization" to find inspiration. But then I'd probably never write in that syntax. Since run time deals heavily in concretes at high speeds, I'd probably look for a way to interact using some kind of shell scripting language, or even a joystick input. :-p
I hope I haven't been too abrasive here. ^^; I actually have only a very weak conscious idea of why I like that Arc's compiled, and it took me several hours to collect my thoughts enough to post. Then this post kinda found itself all at once. ^_^
Not abrasive at all! :) Hope you don't mind if I respond in small pieces (i.e. some now, some later).
> Not all fexpr-powered programs need to use them that way, but if they don't, what's the point?
Am I incorrect in thinking that arc macros are a subset of fexprs in terms of expressiveness? You should be able to define 'mac in terms of an fexpr and eval, and then do all the macro stuff you're used to doing. A disadvantage would be the increased run-time penalty. Advantages could include:
1. Lexical scoping now applies to macros, so you no longer have to use that do.foo pattern to avoid accidental macro-expansion (http://arclanguage.org/item?id=11688)
2. Local and anonymous macros
3. Module system possibilities open up because you can now store callable macros in lists and tables
4. The language implementation can become smaller and more hackable with the elimination of the macro-expander
Do you find any of these to be particularly desirable, or do you have objections about any of them?
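To sketch what I mean by defining 'mac in terms of an fexpr and eval (purely hypothetical syntax--suppose a 'deffexpr form hands the operator its argument forms unevaluated, along with the caller's environment as 'env; none of this exists in Arc 3.1):
; A "macro" is then just an fexpr that builds its expansion and
; evals it in the environment of the call site:
(deffexpr my-when (env test . body)
  (eval `(if ,test (do ,@body)) env))
'mac itself would just be sugar for producing fexprs of that shape.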
I'm reminded again of a post of mine I brought up a while ago, http://arclanguage.org/item?id=11684. [1] The only point I raise against fexprs there--and interpretation of quoted code in general--is that it gives up static optimization and verification. ^_^ This is part of why it took me a while to form a response initially; my main objection to your argument was that the assumption "when expressivity is more important to you than performance" wasn't black and white for me, so I wasn't sure it would be a relevant response.
So I'll probably be a bit wishy-washy in the following responses. Our opinions aren't actually in conflict; they're just motivated by different premises.
Am I incorrect in thinking that arc macros are a subset of fexprs in terms of expressiveness?
I think that's right. The one semantic difference I can think of is that you can redefine an fexpr and count on all the existing uses of it to work using the new definition. But that's only a positive. :-p
---
1. Lexical scoping now applies to macros, so you no longer have to use that do.foo pattern to avoid accidental macro-expansion (http://arclanguage.org/item?id=11688)
The most natural (IMO) way to add this to Arc would be to have the compiler pass around a complete local static environment, rather than just a list of the local variables. It could be as simple as replacing that list with an alist. Of course, then there'd need to be a 'w/mac form too.
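Hypothetical usage, just to picture it (no such form exists in Arc today; the name and argument order are made up):
; 'w/mac would bind a macro lexically, the way 'let binds a value.
(w/mac swap-args (f a b) `(,f ,b ,a)
  (swap-args cons 1 2))  ; would expand to (cons 2 1) => (2 . 1)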
I admit this solution adds complexity to the language core. That's issue #4, though. ^_^
---
3. Module system possibilities open up because you can now store callable macros in lists and tables
Yep, that's totally true. You can import such structures globally in either kind of language, especially if they support replacing the global environment so you don't clobber existing things (Penknife's approach), but local scope imports make the meanings of variables in their scope ambiguous until the import is performed at run time, and at that point, compile-time macros will have been missed.
Personally, I like knowing which variables I'm importing at the time I write the code, so even if I do want to locally import a bunch of variables all at once, perhaps for some macro, I find it tolerable to build a complete (with (a _!a b _!b c _!c) ...) form for that purpose. Alternatively I'd limit local imports to statically determinable sets of names (like forcing (w/import-from var-list foo ...) to use a predefined global variable 'var-list). Of course, these aren't as general. ^_^
--
4. The language implementation can become smaller and more hackable with the elimination of the macro-expander
That's also true. :) My opinion is that a smaller language implementation doesn't imply a more convenient language to use, even if it may be a more convenient language to maintain. If for some reason we want to compile some of our code (probably for performance, but also potentially for code generation), and the language already gives us a suitable framework to do that in, we don't have to build that framework ourselves. In fact, it's more convenient to simulate a simple framework (e.g. Arc) within a complicated one (e.g. Racket) than vice versa.
I'll aggregate them in ar for now, and then eventually we can put them in their own project where they'll be easily accessible to other runtime implementors and to people using them as examples.
I don't know how to cooperate with the GC from Racket (instead of from C, as described in the manual) while using pointers to modify Racket data structures. If it's possible, you might be able to get help on how to do it on the Racket mailing list. But it's not a Racket bug.
The issue here with GC isn't that we're modifying immutable pairs. It's that we're modifying a Racket data structure ourselves, without letting Racket do it for us, and without cooperating with the garbage collector. (If we modify a Racket data structure ourselves, it doesn't matter whether we do it from Racket or from C; it's the same issue.) We'd have the same GC issue if we were using pointers to modify any Racket data structure, whether mutable pairs or immutable pairs or vectors or whatever.
(That's what performing "unsafe" operations means: we can cause seg faults or mess up the garbage collector. "Safe" operations are ones where we let Racket do the data structure manipulation for us, and so it's a Racket bug if there's a seg fault or GC problem).
That we're modifying Racket's immutable pairs could potentially give us another bug, though. When lists are immutable, take an expression like
(let ((x (list 1 2 3)))
  (foo x)
  (car x))
If the compiler knows that the built-in Racket list and car are being called by "list" and "car" through the module system, it could figure out that it can go ahead and return 1 for this expression without actually having to perform the car operation: the compiler can perform optimizations with immutable pairs that it can't do with mutable pairs. I haven't seen any evidence that this issue has bitten us yet, but it remains a possibility that some future implementation of Racket might perform new optimizations that would mess us up when we're modifying immutable pairs.
Ok, makes sense. I hadn't followed that by 'data structure' you meant 'internal data structure'.
Arc's not just a userland racket program; ptr-set! and ptr-ref are low-level creatures. I hadn't focused on this fact, even though I'd seen the comments at set-ca/dr!
Exactly. Back in the MzScheme 3xx days Arc was, to MzScheme, just an ordinary (if rather large) MzScheme program. And, with my runtime project (if successful), Arc will once again be, to Racket, just a big Racket program, so any bugs with Racket will be legitimate Racket bugs that we can go ahead and file a Racket bug report on ^_^
This is actually a pretty big realization for me. Between the documentation thread 2 months ago (http://arclanguage.org/item?id=12860) and this realization that the queue bug is all the fault of our arc implementation, my opinion of the racket runtime is entirely rehabilitated. (http://arclanguage.org/item?id=12302 was the closest thing to a flame war I've been involved in here)
Now my only complaint with mzscheme in general is that it isn't dynamic enough, and forces us to use its module system :) But even that's just because we're using racket in this weird 'backwards compatibility' mode. I'm looking forward to ar because it'll let arc use all of racket's modern goodies (keyword args, extensible equality, ...)
Edit: I suppose we're still stuck with scheme's phase-separated macros.
P.S. I'm really, really, really happy that you've narrowed the problem down to a garbage collection issue. Implementing Arc lists with mpair's seemed to fix the queue bug, but I had no proof. It was still possible that I had merely moved data structures around in memory or whatever enough so that the bug simply wasn't manifesting by running the queue test. Now that we know that it's a GC problem, I have much more confidence that the mpair implementation is in fact one way to fix the bug.
Just wondering, I haven't looked into ar that deeply yet:
How invasive is the change to mpairs really? Is it drastic enough you needed to create most of a new runtime, or is that just because you're like the rest of us and want to experiment ^^
http://awwx.ws/mpair0 was my attempt to simply modify Arc 3.1's ac.scm to use mpair's. I got enough working to see that using mpair's appeared to fix the queue bug, but my implementation was otherwise quite buggy. (Lots of places where Racket lists appeared where Arc lists should have been, and vice versa; and weird mixes of the two, etc.) I have no doubt that someone cleverer than me could get it to work, but I just kept getting lost trying to figure out my bugs.
So I decided to implement the compiler incrementally, where I could test each step as I went. It would take a lot more work (a bit like getting my jeep stuck in the swamp, and so hauling myself out ten feet at a time with a winch), but it was an approach that I knew I would be successful at.
Then, since I was rolling through a compiler rewrite anyway, I decided to also go ahead and reflect the compiler into Arc to make it more hackable ^_^.
One of the 3 goals listed in the readme is "to make Arc more hackable," so I would guess that mpairs isn't most of it. Reorganizing ac.scm and adding unit tests so that the internals are easier to modify is probably a significant motivation.