2 points by hjek 1 day ago | link | parent | on: Ask: Why is there no "save-list"?

Yep. Or change `save-table` to be an extension of `writefile` using `defextend`:

   (defextend writefile (val name) (is (type val) 'table)
     [the body of the save-table function here])

2 points by rocketnia 23 hours ago | link

Even if you merge `writefile` and `save-table` like this, but not `readfile1` and `read-table`, then people still need to know, at development time, what type of data is in the file in order to read it, so they might as well use a type-specific way to write it as well. Unfortunately, merging `readfile1` and `read-table` isn't really possible, since their serialized representations overlap; they can't reconstitute information that was never written to the file to begin with.
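
To make the overlap concrete, here is a minimal Python sketch (not Arc). The names `write_value` and `read_value` are hypothetical stand-ins for `writefile` and `readfile1`, and it assumes a table is written out as a list of key/value pairs. Under that assumption a table and a plain list of pairs produce identical text, so no single reader can know which one to rebuild:

```python
import ast
from functools import singledispatch


@singledispatch
def write_value(val) -> str:
    """Default case: write the value's literal representation."""
    return repr(val)


@write_value.register
def _(val: dict) -> str:
    # Assumption for illustration: tables are written as a list of
    # [key, value] pairs, so the "table-ness" is not recorded.
    return repr([[k, v] for k, v in val.items()])


def read_value(text: str):
    """Read back whatever literal was written; no type tag survives."""
    return ast.literal_eval(text)


table = {"a": 1, "b": 2}
pairs = [["a", 1], ["b", 2]]

# Both values serialize to the same text.
same_text = write_value(table) == write_value(pairs)
# Reading the table back yields a list of pairs, not a table.
round_tripped = read_value(write_value(table))
```

Merging the writers is easy because dispatch happens on the value's type; merging the readers would need a type tag in the file itself.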

From a bigger-picture point of view, this seems like it would become a non-issue once Arc had its own reader. I assume the problem with reading tables using `read` is that Racket's reader constructs immutable hashes. An Arc-specific reader would naturally construct Arc's mutable tables instead.

Doesn't Racket's reader give us similar problems in that it reads immutable strings and cons cells, too? So these problems could all be approached as a single project.

In the short term, it's not a project that would need a whole new reader. It could just be an adaptation of Racket's existing reader... something like this:

  (require racket/match)

  (define (correcting-arc-read in)
    (let loop ([result (read in)])
      (match result

        ; TODO: See if this should construct a Racket mutable cons cell
        ; instead (`mcons`). Right now this just creates an immutable
        ; one, which should be fine since Arc uses an unsafe technique
        ; to mutate those.
        [(cons a b) (cons (loop a) (loop b))]

        ; We rebuild immutable hashes as mutable ones.
        [(? hash?)
         (make-hash
           (map
             (match-lambda [(cons k v) (cons (loop k) (loop v))])
             (hash->list result)))]

        [(? string?)
         ; We construct a new mutable string with the same content as
         ; `result`.
         (substring result 0)]

        ; We handle tagged values, which are represented as mutable
        ; Racket vectors.
        [(? vector?) (list->vector (map loop (vector->list result)))]

        ; Everything else (numbers, symbols, booleans, characters, the
        ; empty list, eof) is treated as atomic and returned unchanged.
        ; Without this catch-all, `match` would raise an error on the
        ; '() that terminates every proper list.
        [_ result])))

Writing Arc's `queue` type might be tricky, since that representation relies on sharing. It's possible queues (and other tagged values in general) should have a customized read and write behavior.

3 points by hjek 22 hours ago | link

> I assume the problem with reading tables using `read` is that Racket's reader constructs immutable hashes.

Racket also has mutable hashes created using `make-hash`[0] rather than `hash`. It could just be that the tables are not serialised as something Racket reads as mutable hashes when deserialising it back again?

[0]: https://docs.racket-lang.org/reference/hashtables.html#%28de...

2 points by rocketnia 15 hours ago | link

"It could just be that the tables are not serialised as something Racket reads as mutable hashes when deserialising it back again?"

I'm pretty sure the Racket reader never reads a mutable hash, but that it's possible for a custom reader extension to do it.

Some of Racket's approach to module encapsulation depends on syntax objects being deeply immutable. In particular, a module can export a macro that expands to (set! my-private-module-binding 20) but which uses `syntax-protect` so that the client of that macro can't use the `my-private-module-binding` identifier for any other purpose. If the lists constituting a program's syntax were usually mutable, then it would be hard to stop the client from just mutating that expansion to make something like (list my-private-module-binding 20), giving it access to bindings that were meant to be private.

I think this is why Racket's `read-syntax` creates immutable data. As for why `read` does it too, I think it's just a case of `read` being a relatively unused feature in Racket. They don't have many use cases for `read` that they wouldn't rather use `read-syntax` for, so they don't usually have reasons for the behavior of `read` to diverge from the behavior of `read-syntax`.

All this being said, they could pretty easily add built-in syntaxes for mutable hashes, but I think it just hasn't come up. Who ever really wants to read a mutable value? In Racket, where immutable values are well-supported by the core libraries, you aren't gonna need it. In the rare case you want it, it's easy enough to do a deep traversal to build the mutable structure you're looking for (like my example `correcting-arc-read` does).

It only comes up as a particular problem in Arc. Arc's language design doesn't account for the existence of immutable values at all, so working around them when they appear can be a bit quirky.
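
As a rough analogy only, the same "read immutable, then deep-copy into mutable" idea can be sketched in Python, where `ast.literal_eval` hands back immutable tuples and bytes. The function name `to_mutable` is made up for this example:

```python
import ast


def to_mutable(value):
    """Recursively rebuild `value` using mutable containers."""
    if isinstance(value, (tuple, list)):
        # Tuples (immutable) become lists; lists are copied deeply.
        return [to_mutable(v) for v in value]
    if isinstance(value, dict):
        return {k: to_mutable(v) for k, v in value.items()}
    if isinstance(value, bytes):
        # Immutable byte strings become mutable bytearrays.
        return bytearray(value)
    # Numbers, strings, None, and so on are returned unchanged.
    return value


data = ast.literal_eval('(1, ("a", b"xy"), {"k": (2, 3)})')
result = to_mutable(data)
result[2]["k"].append(4)  # now allowed: the inner tuple became a list
```

The traversal is the whole trick: the reader stays untouched and one extra pass produces the kind of values the rest of the language expects.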

3 points by hjek 8 hours ago | link

> I'm pretty sure the Racket reader never reads a mutable hash, but that it's possible for a custom reader extension to do it.

No..?

    > (require racket/serialize)
    > (define basket (make-hash))
    > (hash-set! basket 'fruit 'apple)
    > (write-to-file (serialize basket) "basket.txt")
    > (define bucket (deserialize (file->value "basket.txt")))
    > (hash-set! bucket 'fruit 'banana)
    > bucket
    '#hash((fruit . banana))

3 points by rocketnia 4 hours ago | link

The Racket reader reads a seven-element list there, not a mutable hash.

Looks like you're proposing to use a two-stage serialization format. One stage is `read` and `write`, and the other is `serialize` and `deserialize`. At the level of language design, what's the point of designing it this way? (Why does Racket design it this way, anyway?)

I can see not wanting to have cyclic syntax or syntax-with-sharing in the core language just because it's a pain in the neck to parse and another pain in the neck to interpret. Maybe that's reason enough to have a separate `racket/serialize` library.

But isn't the main issue here that Arc's `read` creates immutable tables when the rest of the language only deals with mutable ones? The mutable tables go through `write` just fine, but `read` doesn't read the same kind of value that was written out. If and when this situation is improved, I don't see where `racket/serialize` would come into play.
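
A small Python sketch of what a two-stage format looks like, with `json` standing in for `read`/`write` and two hypothetical functions `serialize`/`deserialize` standing in for the `racket/serialize` layer. The tag names are invented for the example:

```python
import json


def serialize(value):
    """Stage 1 (write side): describe the value using only plain lists and atoms."""
    if isinstance(value, dict):
        return ["table", [[serialize(k), serialize(v)] for k, v in value.items()]]
    if isinstance(value, list):
        return ["list", [serialize(v) for v in value]]
    return ["atom", value]


def deserialize(desc):
    """Stage 1 (read side): rebuild a fresh mutable value from the description."""
    tag, payload = desc
    if tag == "table":
        return {deserialize(k): deserialize(v) for k, v in payload}
    if tag == "list":
        return [deserialize(v) for v in payload]
    return payload


basket = {"fruit": "apple"}
text = json.dumps(serialize(basket))  # stage 2: plain writer
plain = json.loads(text)              # stage 2: plain reader returns nested lists
bucket = deserialize(plain)           # only this step produces a table again
bucket["fruit"] = "banana"
```

The plain reader never produces a table here; it only produces the nested-list description, which matches the point about the reader seeing a list rather than a mutable hash.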

2 points by akkartik 19 hours ago | link

I have an old fork (https://github.com/akkartik/arc) that has an extensible generic pair of functions called serialize and unserialize which emit not just the value but also tagged with its type. read and write are built atop them.


Just a guess: Maybe the Racket version shipped with Ubuntu Trusty Tahr is out of date? Perhaps try adding the Racket PPA[0].

[0]: https://launchpad.net/~plt/+archive/ubuntu/racket

2 points by hjek 8 days ago | link | parent | on: Knark - rewrite in plain Racket?

Have you done any web development in Clojure?

I like Clojure as a language, but I'm a bit overwhelmed with choice when it comes to web frameworks and the package managers you need to use those web frameworks (somewhat similar to Common Lisp), whereas with Racket and Arc there's just one place to start. Do you have any experience with Clojure web frameworks?

I'm always a bit scared of adding external dependencies to any project, but Rich Hickey has this great rant about semantic versioning where he argues that a new "major version" is essentially just a bad excuse for breaking existing stuff. I wonder if Clojure web libraries take those principles to heart or whether it's another left-pad incident waiting to happen?

3 points by i4cu 8 days ago | link

This is a surprisingly difficult question to answer, but here I go...

FYI cljs = clojurescript.

When Clojure first came out there was a core set of libraries that everyone flocked to:

1. Compojure (a routing library)

  - similar to arc's defop
https://github.com/weavejester/compojure

example:

  (defroutes app
    (GET "/" [] "<h1>Hello World</h1>")
    (route/not-found "<h1>Page not found</h1>"))

1a. Ring (a middleware library)

  - parses the web request and converts it into a hash-map of meaningful values.
  - similar to srv.arc
https://github.com/ring-clojure/ring

2. Hiccup (HTML Soup)

  - similar to html.arc, but uses a data structure to provide flexibility.
https://github.com/weavejester/hiccup

example:

  [:head
    [:meta {:http-equiv "Content-type"
            :content "text/html; charset=utf-8"}]
    [:title "adder"]
    [:link {:href "/adder.css" :rel "stylesheet" :type "text/css"}]]

Much of this came about when people read example blogs like this:

https://mmcgrana.github.io/2010/07/develop-deploy-clojure-we...

note: it's a 2010 article so some of it's outdated, but the idea would be the same.

At that time everyone was racing to make more robust web frameworks. Many of them came from people doing the above and simply adding features. However, shortly afterwards cljs was released and another slew of web frameworks came out as people embraced writing web apps client side. Then, shortly after that, Facebook's React became the new thing and advanced the idea of further separating the data from the UI code that renders it. At this point data models (i.e. big hash-maps) and syncing that data to the UI became the new norm. Since then even more advanced frameworks have come out, such as Fulcro, that further extend the data modelling and syncing features (https://github.com/fulcrologic/fulcro).

Throughout all this many web frameworks became abandonware, and now it's really hard for a newbie to make sense of which one to use. In my opinion:

1. If you want to do what Arc does (server side page generation) then use Compojure + Hiccup.

2. If you want to write basic client side cljs code there are dom libraries like:

- Dommy https://github.com/plumatic/dommy

- Domina https://github.com/levand/domina

These are fairly simple to use.

2a. If you want to write client side cljs code that takes advantage of React then use Reagent. https://github.com/reagent-project/reagent (much better and more interesting than Domina/Dommy)

If you want to enter the more advanced data-model-UI-syncing arena, I would probably use Fulcro (but haven't). Note that advanced frameworks like Fulcro expect you to know much more about state management / data modelling, and it could be a steep learning curve for some.

I've been developing cljs web apps for over 4 years. Over these years I've tried some of the frameworks, but I ended up writing my own as none of them could do what I needed.

Is that helpful?

All of that may seem like too much, but remember you really only need Compojure + Hiccup to be where Arc is at.

Edit: I just noted that you know Datalog, so I think Fulcro is a good fit for you (see http://book.fulcrologic.com/#GraphDB).

2 points by hjek 7 days ago | link

> Is that helpful?

Yes! Thanks a lot for the write-up.

That Compojure example does look quite familiar and Arc-like, and looks like it can handle multipart post requests too[0].

I'd never heard of Fulcro before. Given what is often emphasised about Clojure, at first glance at the Fulcro docs I'm a bit surprised how often they mention state and mutations:

> The other very common case is this: You’ve loaded something from the server, and you’d like to use it as the basis for form fields. In this case the data is already normalized in your state database, and you’ll need to work on it via a mutation.[1]

Also, I'm too much into graceful degradation to ever go all out Cljs, unless it was for a phone app. But I find that it's often interesting to see how people do things in Clojure, even when not using that language, so I'll be giving those Fulcro videos a look.

[0]: https://github.com/whostolebenfrog/compojure-multipart

[1]: http://book.fulcrologic.com/#_initializing_in_a_mutation

3 points by i4cu 7 days ago | link

> I'd never heard of Fulcro before. Given what is often emphasised about Clojure, at first glance at the Fulcro docs I'm a bit surprised how often they mention state and mutations:

Well, things on the client side can sometimes be mutable. No one gets around the fact that the DOM is a mutable-only object. But besides that, the Fulcro library has labelled one of its features a 'Mutation', which was probably a bad choice, but it has nothing to do with the immutability of the underlying cljs object that it uses for that "Mutation". You'll notice the example is using 'swap!'. That means it's modifying an atom, where an atom is an interface for making changes to the immutable object it holds. So really 'swap!' takes the change request, constructs a new version of the original thing held in the atom, with changes, then 'swap's it with the original item inside the atom. The original thing was never changed (no changes to existing slots in memory). Hence Clojure's things are immutable, and they are in Fulcro too, except when changing the DOM tree.

As for state that's mentioned all the time in Clojure :)
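
A minimal, single-threaded Python sketch of that idea. The `Atom` class and `assoc` helper are hypothetical names for illustration, and Clojure's real `swap!` additionally retries under concurrent updates, which is left out here:

```python
from types import MappingProxyType


class Atom:
    """A mutable reference that holds an immutable value."""

    def __init__(self, value):
        self._value = value

    def deref(self):
        return self._value

    def swap(self, fn, *args):
        # Build a new value from the old one, then replace the reference.
        self._value = fn(self._value, *args)
        return self._value


def assoc(mapping, key, val):
    """Return a new read-only mapping with `key` set; the input is untouched."""
    return MappingProxyType({**mapping, key: val})


state = Atom(MappingProxyType({"count": 0}))
before = state.deref()
state.swap(assoc, "count", 1)
after = state.deref()
```

Anyone still holding `before` sees the old value unchanged; only the reference inside the atom moved.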

2 points by akkartik 8 days ago | link

This is the Rich Hickey talk from 2016: https://www.youtube.com/watch?v=oyLBGkS5ICk. If you listen to it, it was made in an effort to adjust the community's trajectory. I too would like to hear if adjustments did happen.

Basically the way I interpreted it[1] is that "major version" is a meaningless concept. If you're making incompatible changes, rename the package. If that's a disincentive to making incompatible changes -- great!

[1] http://akkartik.name/post/versioning

2 points by hjek 8 days ago | link

Nice blog post.

> Rich Hickey pointed out last year that the convention of bumping the major version of a library to indicate incompatibility conveys no more actionable information than just changing the name of the library.

Yes, I guess Hickey is applying this idea of immutability not only to data structures but also to APIs and even databases[0] with the proprietary database service Datomic. Interestingly Datomic uses Datalog as its query language, so it's straightforward to apply at least some of those ideas with the Racket Datalog package[1].

As of now I have a basic web forum working with all data storage done in Datalog (except for file uploads!). I'll post some code once the design is a bit more settled. It's still in the breaking-things-all-the-time phase.

What I find a bit tricky about Datalog is that relations are stored together with other facts, which in my mind feels a bit like storing code in a database, but maybe I just haven't wrapped my head around it yet.

Does anyone here have favourite articles or talks about logic programming?

[0]: https://www.youtube.com/watch?v=EKdV1IgAaFc

[1]: https://docs.racket-lang.org/datalog/interop.html

3 points by i4cu 8 days ago | link

> What I find a bit tricky about Datalog is that relations are stored together with other facts, which in my mind feels a bit like storing code in a database, but maybe I just haven't wrapped my head around it yet.

I can't speak for Datalog, but I've used Datomic.

If it's the same, then a 'fact' is comprised of an entity (the id), + an attribute, + a value.

The relationships are made by storing an entity id into the value slot of another fact. Thus the model is both flat (being a list of facts) and hierarchical (they can point to each other). It pretty much becomes a graph database. Is that what you mean?
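
Here is a small Python sketch of that flat-but-hierarchical shape, using plain tuples rather than any real Datomic API. The entity ids, attribute names, and helper functions are made up for the example:

```python
# Each fact is (entity, attribute, value). A value may be another entity's id.
facts = [
    ("story-1", "title", "How to peel onions"),
    ("story-1", "author", "person-1"),  # reference to another entity
    ("person-1", "name", "Alice"),
]


def value_of(facts, entity, attribute):
    """Return the first value stored for (entity, attribute), or None."""
    for e, a, v in facts:
        if e == entity and a == attribute:
            return v
    return None


def author_name(facts, story_id):
    """Follow the story -> author reference, then read the author's name."""
    author_id = value_of(facts, story_id, "author")
    return value_of(facts, author_id, "name")


name = author_name(facts, "story-1")
```

The list itself is flat, and the hierarchy only appears when a query follows an id stored in a value slot.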

2 points by hjek 7 days ago | link

> I can't speak for Datalog, but I've used Datomic.

Cool!

> The relationships are made by storing an entity id into the value slot of another fact. Thus the model is both flat (being a list of facts) and hierarchical (they can point to each other). It pretty much becomes a graph database. Is that what you mean?

Yes, exactly! What I worry about (because I'm fairly new to logic programming) is whether it could potentially be difficult to update a program where storage of business logic and storage of data aren't separated?

If we assume we have a Hacker News web app where we have a fact: One day Alice submits a story with the title "How to peel onions"; and we are thinking "Why on earth did she post that here?!?" So we add this relation to our code: A story is `irrelevant` if it has the word "onion" in the title. Then, the next day we get another fact: Now Bob has submitted a story called "How onion routing works". This new story by Bob then makes us reconsider our definition of `irrelevant`.

In a typical imperative program we'd just edit the code and redefine the `irrelevant` predicate, and it would take effect next time we run the program (or instantly if we enter it at a repl). But here in our logic program we store this `irrelevant` relation in our graph database, so even though we have removed it from our code, it is still sitting there in the database along with all the facts, outside the reach and responsibility of git, or whichever VCS we're using.

Yes, so my question is: How do you practically deal with changes to business logic in logic programming where data storage and relation storage is one and the same? Perhaps Datomic just avoids this issue somehow? I may also be missing or misunderstanding something.

3 points by i4cu 7 days ago | link

Can't say I know what the options are since I don't know Datalog or the DB you're using, but is this reasonable?:

  -----------------------------------------------------------
  Entity              | Attribute | Value
  -----------------------------------------------------------
  person-id-001       | name      | Alice
  person-id-001       | stories   | [story-id-001, story-id-002...]   

  story-id-001        | headline  | "How to peel onions"
  irrelevant-word-001 | stories   | [story-id-001, ...]
  irrelevant-word-001 | word      | onion
  -----------------------------------------------------------

So if you decide that onion is no longer irrelevant then delete the entity 'irrelevant-word-001'. Which seems, at least to me, better than making code pushes.

So all of this assumes a few things:

- Your DB supports a cardinality of 'many' items in the value slot.

- Your query language can perform joins.

Of course none of this helps when someone changes a headline, but only full-text search DBs will help you do that.

Edit: made edits.

3 points by i4cu 7 days ago | link

To be complete (and somehow I edited this out):

  -----------------------------------------------------------
  Entity        | Attribute          | Value
  -----------------------------------------------------------
  globals       | irrelevant-words   | [irrelevant-word-001, ...]   
  -----------------------------------------------------------

So you would also need to remove the value 'irrelevant-word-001' from the above. At least this is how I would do it in Datomic anyway.

What's interesting (at least to me) is that Datomic has a function called 'retractEntity' [1] which auto-magically removes all references of an entity in any value slot when you retract the entity. Man I love Datomic :)

[1] https://docs.datomic.com/on-prem/transactions.html#dbfn-retr...
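
The behaviour described above (drop the entity's own facts and every reference to it) can be sketched in Python over the example tables. This is not Datomic's API; `retract_entity` is a hypothetical helper working on plain tuples:

```python
facts = [
    ("person-id-001", "name", "Alice"),
    ("person-id-001", "stories", ["story-id-001", "story-id-002"]),
    ("story-id-001", "headline", "How to peel onions"),
    ("irrelevant-word-001", "stories", ["story-id-001"]),
    ("irrelevant-word-001", "word", "onion"),
    ("globals", "irrelevant-words", ["irrelevant-word-001"]),
]


def retract_entity(facts, entity):
    """Return a new fact list without `entity` or any reference to it."""
    kept = []
    for e, a, v in facts:
        if e == entity:
            continue  # drop the entity's own facts
        if isinstance(v, list):
            # Cardinality-many slot: remove the id from the list.
            v = [item for item in v if item != entity]
        elif v == entity:
            continue  # drop single-valued references to the entity
        kept.append((e, a, v))
    return kept


after = retract_entity(facts, "irrelevant-word-001")
```

Without such a helper, the `globals` row has to be cleaned up by hand, which is the extra step mentioned above.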

2 points by hjek 7 days ago | link

> So if you decide that onion is no longer irrelevant then delete the entity 'irrelevant-word-001'. Which seems, at least to me, better than making code pushes.

I can make sense of that when there's just one instance of this app running. Yet imagine the scenario where the web app has been published, and suddenly other people are running this web app. If the business logic is then changed, somehow I'd have to tell those people: "Oh btw, when you're running `git pull` next time, then you just also gotta run this query to retract some of the old relations from the database."

Definitely not a problem for me yet, but I can just smell it coming. I could add those retractions to the code, but they would have to stay there indefinitely, because it's not possible to tell if those retractions have taken place on everyone's databases yet.

Maybe I'm over-thinking this.

> What's interesting (at least to me) is that Datomic has a function called 'retractEntity'

Looks like the one called `~` in Racket's Datalog[0].

[0]: https://docs.racket-lang.org/datalog/interop.html#%28form._%...

2 points by i4cu 7 days ago | link

Quick questions before going further.

Is this irrelevant-word example a real feature you're building into the app or a contrived example to understand Racket DataLog DB use?

If it's a real feature, and I'm assuming it is, then I'm also assuming that when a story is submitted you're parsing the title and adding the relationship to the current set of irrelevant-words that are stored.

So the questions are:

1. How are you going to remove the past relationships between stories and an irrelevant word that's getting removed? (looks like we've answered this).

2. How are you going to make sure past stories gain the relationship to newly added irrelevant words?

3. How are you going to handle title changes?

After you get a handle on these, what you really need to do is provide an interface, from within the app's admin tools, to trigger the noted functionality. This way the business logic is in the app and it's modifying the data.

2 points by hjek 7 days ago | link

> Is this irrelevant-word example a real feature you're building into the app or a contrived example to understand Racket DataLog DB use?

It's a simplified and slightly contrived illustration of an issue in this pre-alpha code I haven't published yet, perhaps just because I haven't thought of a name for the project yet. But yes, let's assume it's a real feature for now.

> 3. How are you going to handle title changes?

I like Hickey's idea of accretion of data - with a timestamp! - and not forgetting previous facts. I think he's talking about it in The Database as a Value[0]. So, a story could have a few different titles in the database, and the newest one is the one you get to see.

The thing is, it's easy to add a timestamp to facts as a way of not considering old facts without forgetting them, but not to relations. For example in Racket Datalog[1]:

    (! (voted "i4cu" 'up 134))

can easily get a timestamp:

    (! (voted "i4cu" 'up 134 1542151773))

but that would not really make sense in a relation like

    (! (:- (root A B) (ancestor A B) (parent null A)))

So, I don't feel there's a need for ever retracting facts, because timestamps solve that. (Even when deleting something, you could just add the fact that it has been deleted.) But with relations (a.k.a. business logic) I think I will need to retract things, which is tricky because this logic is not only present in the code but also in the database. This was the problem I was asking about.
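
A minimal Python sketch of the timestamp approach for facts, assuming a hypothetical `(story_id, title, unix_timestamp)` fact shape. Nothing is ever removed; the query just picks the newest matching fact, optionally as of some earlier time:

```python
# Facts are only ever appended: (story_id, title, unix_timestamp).
facts = [
    (134, "How to peel onions", 1542151773),
    (134, "How to peel onions quickly", 1542155000),
    (135, "How onion routing works", 1542160000),
]


def current_title(facts, story_id, as_of=None):
    """Return the newest title for `story_id`, ignoring facts later than `as_of`."""
    candidates = [
        (ts, title)
        for sid, title, ts in facts
        if sid == story_id and (as_of is None or ts <= as_of)
    ]
    # Tuples compare by timestamp first, so max() picks the latest fact.
    return max(candidates)[1] if candidates else None


latest = current_title(facts, 134)
earlier = current_title(facts, 134, as_of=1542152000)
```

The old title is still in the list, so history queries come for free, at the cost of the data growing without bound.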

> 2. How are you going to make sure past stories gain the relationship to newly added irrelevant words?

Datalog queries reflect the current set of facts and relations, so the possible irrelevance of a story would not be stored anywhere, so it wouldn't need to be updated.

> 1. How are you going to remove the past relationships between stories and an irrelevant word that's getting removed?

Same as above. For example, in a place-oriented program a story object could have a boolean attribute `irrelevant?`, whereas I'm sending a query every time this value is needed, so no stale `irrelevance` attributes are stored anywhere.
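
Roughly, in Python terms (all names here are invented for the example), the difference is between storing a flag on each story and computing the answer from the current rule whenever it is asked for:

```python
titles = {1: "How to peel onions", 2: "How onion routing works"}
irrelevant_words = {"onion"}  # the current "rule"


def is_irrelevant(story_id):
    """Derive irrelevance at query time; nothing is stored on the story."""
    title = titles[story_id].lower()
    return any(word in title for word in irrelevant_words)


flagged_before = [sid for sid in titles if is_irrelevant(sid)]

# Changing the rule immediately changes every answer; there are no
# stale per-story flags to go back and fix.
irrelevant_words.discard("onion")
flagged_after = [sid for sid in titles if is_irrelevant(sid)]
```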

> After you get a handle on these, what you really need to do is provide an interface, from within the app's admin tools, to trigger the noted functionality. This way the business logic is in the app and it's modifying the data.

Yes, that is kind of there already, as in having functionality for changing titles. The `irrelevant` functionality is not there now.

Ok, I think I just need to work on getting this code publishable, because it might be easier to discuss tangible examples.

[0]: https://www.infoq.com/presentations/Datomic-Database-Value

[1]: https://docs.racket-lang.org/datalog/interop.html

2 points by i4cu 6 days ago | link

Yeah, Datomic doesn't expect the relationships to be stored in the DB. It stores a bunch of indexes for you and it has a great query language, but that's it.

So where will your data be? In a local data structure? I read your Racket Datalog link, but it doesn't show any details for the database side (i.e. durability etc.) even though it's labelled a database.

Also, I'm curious what made you choose a graph db. It seems like you're inheriting a lot of complexity and I'm wondering what the benefit is over a more traditional sql or nosql db.

2 points by hjek 6 days ago | link

> So where will your data be? In a local data structure?

Yes, I think. The database is just stored in memory, but it can be serialized and saved to disk using `write-theory`[0] and loaded with `read-theory`. That is what I'm doing for now, and it's very naive and inefficient to do a full database dump rather than just appending new data. I presume it's particularly in this area that Datomic is way more optimised and well thought out.

> Also, I'm curious what made you choose a graph db. It seems like you're inheriting a lot of complexity and I'm wondering what the benefit is over a more traditional sql or nosql db.

Well, I did the initial work on the web app: creating user accounts, adding posts and replies, and then I got to data storage. Initially I did a News-style flat-file database, just saving data as lists in files, that are then loaded into memory when the program starts. It mostly worked but also felt a bit complicated, and I thought that perhaps I should just use a proper database?

What I like about news.arc is that you can just launch it without any configuration, so MySQL and PostgreSQL were out of the question, and I started reading a bit about SQLite. But I've also had this fascination with logic programming, from what people are posting here[1][2], and from reading a bit of The Reasoned Schemer, and I watched some of those Rich Hickey talks again, where he talks about Datalog, which happens to be available for Racket.

There are just some things that are incredibly simple in declarative/logic programming. For example, if you have facts about stories being `parent` of their replies, then it's simple to just define the `ancestor` relation, and when you have the `ancestor` relation, you automagically get `descendants` without having to write any code, because it's just the inverse of `ancestor`:

    (! (:- (ancestor A B)
           (parent A B)))
    (! (:- (ancestor A B)
           (parent A C)
           (ancestor C B)))
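
For comparison, the two rules above can be evaluated bottom-up by hand. This Python sketch is a naive fixpoint loop, not how the Racket Datalog package works internally, and the function names and sample ids are made up:

```python
def ancestor_relation(parent_facts):
    """Compute all (ancestor, descendant) pairs implied by the parent facts."""
    ancestor = set(parent_facts)  # rule 1: ancestor(A, B) :- parent(A, B)
    changed = True
    while changed:
        changed = False
        # rule 2: ancestor(A, B) :- parent(A, C), ancestor(C, B)
        for a, c in parent_facts:
            for c2, b in list(ancestor):
                if c == c2 and (a, b) not in ancestor:
                    ancestor.add((a, b))
                    changed = True
    return ancestor


def descendants_of(ancestor, a):
    """Query the relation in one direction."""
    return {b for x, b in ancestor if x == a}


def ancestors_of(ancestor, b):
    """Query the same relation in the other direction."""
    return {a for a, x in ancestor if x == b}


# Story 1 has reply 2; reply 2 has replies 3 and 4.
relation = ancestor_relation([(1, 2), (2, 3), (2, 4)])
```

Both directions are queries over the same set of pairs, which is why the inverse comes for free in the Datalog version.
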

But I've also bumped into some questions, more practical than theoretical, and that is why it's interesting to hear about your experience with Datomic, and why I'm asking here.

So, SQLite is still on the table. I'm not too familiar with NoSQL, but my impression is that they are all about speed and scalability of data storage. I haven't used MongoDB but isn't it essentially just like storing JSON in a file, except faster? It would be interesting if any of those could be used in conjunction with Datalog though, if they don't add too much complexity for the sake of increased speed.

[0]: https://docs.racket-lang.org/datalog/interop.html#%28def._%2...

[1]: http://arclanguage.org/item?id=20650

[2]: http://arclanguage.org/item?id=20519

2 points by i4cu 6 days ago | link

So a few things I wanted to point out:

Datomic vs. DataLog

Datomic uses DataLog as part of its query language, but that's pretty much where the comparison should end. Things like "treating the database as a value", and features such as data accretion that Rich talks about have nothing to do with DataLog. They're features of Datomic. So for example when you mention never retracting data, well your data size is going to continuously grow unless you write your own data management layer on top. Datomic, on the other hand, does this for you. When you want to query the database over time, then you're going to need to store time intervals for all of your data and incorporate that into each query. Whereas in Datomic (which has a time log) you can pass in the DB itself as a value (with an associated time interval) and Datomic will make sure your queries are working against the dataset that accounts for the time interval.

I'm pointing this out because it seems to me that you're doing (or are going to be doing) a lot of work that may not be worth it for what you're trying to accomplish.

Nosql

> I'm not too familiar with NoSQL, but my impression is that they are all about speed and scalability of data storage.

Yes and no. Often speed can be a feature NoSQL DBs advertise, but really, for me anyway, it's about flexibility and ease of use. Traditional RDBMSs, for example, require creating schemas. Many NoSQL databases don't require a schema at all, which makes them easier to use and more flexible to change. NoSQL databases are often key-value stores, so it can be really easy to take a hash-map or table of data from your code and just dump it into a NoSQL datastore and be able to query it.

My personal favourite is Redis and it might be worth considering for your app.

You can:

- store a value under a key [1]

- store table data [2]

- store values in a set [3] (which allows intersection/difference queries)

- store values in a sorted-set [4] (which allows you query by some numerical value like timestamp)

- use it to manage relationships [5]

The reason I mention Redis is that the HN app is very well suited to it. HN only keeps 'x' amount of data in memory. And in Redis the data lives in memory. Also Redis allows you to set expiry times on data for auto eviction [6]. And Redis also supports ordered lists [7] which can make it useful for lisp based languages.
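
To show the expiry idea without needing a Redis server, here is a tiny in-memory Python sketch. It is not a Redis client and does not mirror Redis internals; `ExpiringStore` and the `fnid:` key are hypothetical, and the clock is injectable so the behaviour can be checked without sleeping:

```python
import time


class ExpiringStore:
    """In-memory key-value store where keys can carry a time-to-live."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._items = {}  # key -> (value, expires_at or None)

    def set(self, key, value, ttl=None):
        expires_at = None if ttl is None else self._clock() + ttl
        self._items[key] = (value, expires_at)

    def get(self, key, default=None):
        item = self._items.get(key)
        if item is None:
            return default
        value, expires_at = item
        if expires_at is not None and self._clock() >= expires_at:
            del self._items[key]  # evict lazily, when the key is next read
            return default
        return value


now = [0.0]  # fake clock, advanced by hand
store = ExpiringStore(clock=lambda: now[0])
store.set("fnid:abc", "session-data", ttl=60)

now[0] = 30.0
alive = store.get("fnid:abc")

now[0] = 61.0
expired = store.get("fnid:abc")
```

Session-like data such as fnids fits this pattern well, since each entry simply disappears once its time is up.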

However it's not embedded. And if that's a requirement I'd almost suggest you move away from Racket and adopt a language that has more options for embeddable databases. I guess if you're willing to roll your own (and it looks like you may be) then that's awesome too.

But in case you decide otherwise... The library I use is Redis Carmine [8], but there are Racket clients [9].

1. https://redis.io/commands/set

2. https://redis.io/commands/mset

2a. https://redis.io/commands/mget

3. https://redis.io/commands/sadd

4. https://redis.io/commands/zadd

5. search: "Representing and querying graphs using an hexastore" https://redis.io/topics/indexes

6. https://redis.io/commands/expire

7. https://redis.io/commands/lset

8. https://github.com/ptaoussanis/carmine

9. https://redis.io/clients#racket

2 points by hjek 5 days ago | link

> Datomic uses DataLog as part of its query language, but that's pretty much where the comparison should end. Things like "treating the database as a value", and features such as data accretion that Rich talks about have nothing to do with DataLog.

I'm not sure I totally agree with this. I think that apart from talking about the design of Datomic, he also has a more general point against what he calls PLOP (PLace Oriented Programming), which Datalog does address.

For example in plain Racket a value is lost if something else is put in its place:

    > (define foo 'bar)
    > (define foo 'baz)
    > foo
    'baz

In Datalog you just accrete facts:

    > (! (is foo bar))
    > (! (is foo baz))
    > (? (is foo X))
    is(foo, bar).
    is(foo, baz).

Hickey is also mentioning how git doesn't do PLOP in that it doesn't throw out your commit history (without you asking it to do so).

> The reason I mention Redis is that the HN app is very well suited to it. HN only keeps 'x' amount of data in memory. And in Redis the data lives in memory. Also Redis allows you to set expiry times on data for auto eviction [6].

Interesting. Just checked news.arc, and yes `initload*` is set to 15000. Expiry times are an interesting idea from Redis; I'll check it out. I hadn't considered the scenario of storing enough text to max out memory, because it would probably be premature optimisation, but it's good to keep in mind. I'd like to give Redis/Rackdis a try; thanks for the suggestion. I've been hosting an Etherpad Lite instance, and Redis was painless to set up.

> I'm pointing this out because it seems to me that you're doing (or are going to be doing) a lot of work that may not be worth it for what you're trying to accomplish.

Yes, my priorities here are definitely to make the code as brief and simple as possible, and to not have to do too much work. With plain Datalog it's very little work to timestamp a fact, and it's also kind of necessary, e.g. to figure out which fact is most recent, when previous facts are not removed. I'm just trying to get the gist of Hickey's ideas here.
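
To make the timestamping idea concrete, here's a minimal sketch in Python rather than Datalog. The names `FactLog`, `assert_fact`, `history` and `latest` are made up for illustration and don't come from any Datalog library, and the timestamps are just a logical counter. Facts are only ever appended; "the current value" is simply the most recent fact for a given subject and attribute.

```python
import itertools


class FactLog:
    """Append-only store of (subject, attribute, value) facts (hypothetical)."""

    def __init__(self):
        self._facts = []                 # only ever appended to, never overwritten
        self._clock = itertools.count()  # logical timestamps; a wall clock would also do

    def assert_fact(self, subject, attribute, value):
        """Record a new fact and return its timestamp."""
        stamp = next(self._clock)
        self._facts.append((stamp, subject, attribute, value))
        return stamp

    def history(self, subject, attribute):
        """All values ever asserted for this subject/attribute, oldest first."""
        return [v for (_, s, a, v) in self._facts
                if s == subject and a == attribute]

    def latest(self, subject, attribute):
        """Most recent value, or None if nothing was ever asserted."""
        values = self.history(subject, attribute)
        return values[-1] if values else None


log = FactLog()
log.assert_fact("foo", "is", "bar")
log.assert_fact("foo", "is", "baz")
print(log.history("foo", "is"))  # both facts are still there
print(log.latest("foo", "is"))   # the newest one wins
```

Nothing is lost when `foo` "changes"; the older fact just stops being the latest one.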

reply

2 points by i4cu 5 days ago | link

> PLOP (PLace Oriented Programming), which Datalog does address.

Yeah, I was thinking more along the lines that Datomic has built-in functionality to address the caching, cache eviction, and indexing that goes along with all that data accumulation. But you're correct, DataLog does accumulate facts.

> Interesting. Just checked news.arc, and yes `initload*` is set to 15000.

I did the same thing, about 6 or 7 years ago, that you're doing now. I ported HN to Clojure (which is actually how I learned Clojure). If memory serves me correctly, when I was doing the work I realized I needed a real DB if I wanted to support load balancing, i.e. I needed to centralize the data for the authentication and fnid session info. I think Arc calls them fnids... You probably know better than I do now, but Arc has all this code to expire these session fnids and so, for me, Redis was just a good fit for that task.

Anyways, I'll be sure to take a look at the final result of your work.

Cheers.

reply

2 points by hjek 5 days ago | link

> I did the same thing, about 6 or 7 years ago, that you're doing now. I ported HN to Clojure (which is actually how I learned Clojure).

Cool!

> I needed to centralize the data for the authentication and fnid session info. I think Arc calls them fnids... You probably know better than I do now, but Arc has all this code to expire these session fnids and so, for me, Redis was just a good fit for that task.

The Racket web server is quite "batteries included" and comes with these different managers[0] for dealing with expiration of sessions/continuations, such as the LRU manager:

> The memory limit is set to `memory-threshold` bytes. Continuations start with 24 life points. Life points are deducted at the rate of one every 10 minutes, or one every 5 seconds when the memory limit is exceeded. Hence the maximum life time for a continuation is 4 hours, and the minimum is 2 minutes.

> If the load on the server spikes—as indicated by memory usage—the server will quickly expire continuations, until the memory is back under control. If the load stays low, it will still efficiently expire old continuations.
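
The arithmetic behind those two figures, as a quick sanity check. This is only a sketch in Python; the constant and function names are mine, not the Racket web server's, and the numbers are taken straight from the quoted docs.

```python
LIFE_POINTS = 24              # every continuation starts with this many points
SLOW_DECAY_SECONDS = 10 * 60  # one point lost per 10 minutes under normal load
FAST_DECAY_SECONDS = 5        # one point lost per 5 seconds over the memory limit


def lifetime_seconds(points, seconds_per_point):
    """Seconds until a continuation with `points` life points runs out."""
    return points * seconds_per_point


max_lifetime = lifetime_seconds(LIFE_POINTS, SLOW_DECAY_SECONDS)
min_lifetime = lifetime_seconds(LIFE_POINTS, FAST_DECAY_SECONDS)

print(max_lifetime / 3600, "hours")   # 24 * 10 min = 240 min
print(min_lifetime / 60, "minutes")   # 24 * 5 s = 120 s
```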

[0]: https://docs.racket-lang.org/web-server/servlet.html?q=respo...

reply

2 points by i4cu 5 days ago | link

> If the load on the server spikes...

When I was referring to load balancing and centralizing the data I was referring to many web servers sharing a centralized/external source for auth/session data.

I'm unfamiliar with Racket's web server 'servlets'. The docs are a little unclear (at least to me). Can these servlets live on a separate server so that the data can be shared between web servers? I'm guessing that was/is not a requirement for you, but I'm just interested in knowing if that's how it can work.

Uh oh, you're getting me interested in Racket now. I can't have that... I have too many projects :)

edit: I guess at the end of the day these servlets are web servers, right? So you can, even if you have to do it over HTTP and build an API.

reply

2 points by hjek 4 days ago | link

> Can these servlets live on a separate server so that the data can be shared between web servers?

Probably. I assume that serializable continuations[0] from stateless servlets can just be stored wherever, like in Redis or something, instead of in the memory of one server.

> I ported HN to Clojure

If that is something you have published, it'd be fun to see, whether it's finished or not.

> Uh oh, you're getting me interested in Racket now.

My impression is that Clojure is faster, less verbose (partly due to clever syntax), and provides more immutable data structures than Racket. But when it comes to documentation and error messages, I find Racket more coherent and comprehensible.

Say, if I wanted to connect to a SQL database, with Racket I'd use the DB module[1], end of discussion. But with Clojure there's Korma, ClojureQL, Persist, HoneySQL, Yesql, a JDBC wrapper from Clojure contrib, SQLingvo, oj, Suricatta, aggregate, Hyperion, HugSQL, and probably a few more[2][3]. That multitude of libraries with similar purpose may be useful in some cases, sure, but also potentially a bit overwhelming for beginners, so I guess that's why I found it easier to get started with Racket.

[0]: https://docs.racket-lang.org/web-server/stateless.html#%28pa...

[1]: https://docs.racket-lang.org/db/

[2]: https://stackoverflow.com/questions/294802/use-a-database-wi...

[3]: https://adambard.com/blog/clojure-sql-libs-compared/

reply

2 points by i4cu 4 days ago | link

> If that is something you have published, it'd be fun to see, whether it's finished or not.

I actually tried to dig it out the other day during this conversation, but it's buried somewhere unavailable right now. If I find/get to it, I'll post it.

> That multitude of libraries with similar purpose may be useful in some cases, sure, but also potentially a bit overwhelming for beginners, so I guess that's why I found it easier to get started with Racket.

Agreed. Navigating the volume of libraries and the options available is a real pain in the beginning, but once you get past that, it's not bad at all. At the same time, take a look at the quality of Clojure's Redis Carmine library vs. Racket's Redis libraries. Miles apart.

To each their own, right :)

reply

2 points by hjek 8 days ago | link | parent | on: Knark - rewrite in plain Racket?

Personally I think the way Arc deals with Racket interop is pretty solid (except for passing lists to Racket functions, as can be seen in app.arc). There's a lot to like about Arc, such as its terseness compared to Racket and the anaphoric macros, but here's what pushed me to try out plain Racket:

- Arc is too slow to handle file uploads. Arc is not suitable for web apps that handle image and video upload.

- This is not a problem with Arc but with News: It relies a lot on state. I'm not a purist, but to an extent that makes it a bit difficult to hack on sometimes. For example, the `unmarkdown` function is needed because the original input is "forgotten", and the score of an item is just a numerical value rather than something that can be derived from voting data, which makes `unvote` difficult to implement correctly.

- Racket has a nice way of just representing html as s-expressions, where with Arc it's functions and macros some of which return a value and some of which print to stdout. Also, a lot of the html in News is a bit hacky.
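
To illustrate the point about scores and `unvote`: a minimal sketch in Python, not taken from news.arc. The function names and the shape of the vote data (a mapping from user to +1 or -1) are assumptions for illustration. If the score is derived from the votes rather than stored as a number, undoing a vote is just removing one entry.

```python
def score(votes):
    """Derive the score from the votes; it is never stored separately."""
    return sum(votes.values())


def vote(votes, user, direction):
    """Return a new vote mapping with `user` voting +1 or -1 (hypothetical API)."""
    updated = dict(votes)
    updated[user] = direction
    return updated


def unvote(votes, user):
    """Return a new vote mapping without `user`'s vote, if there was one."""
    return {u: d for u, d in votes.items() if u != user}


votes = {}
votes = vote(votes, "alice", +1)
votes = vote(votes, "bob", +1)
votes = vote(votes, "carol", -1)
print(score(votes))

votes = unvote(votes, "carol")
print(score(votes))
```

With a bare numeric score there's no way to know what to subtract on `unvote`; here that information was never thrown away.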

reply

3 points by hjek 21 days ago | link | parent | on: Tell Arc: Arc 3.2

Thanks for fixing the install instructions, too!

reply

2 points by hjek 48 days ago | link | parent | on: Recursive anonymous functions?

Never mind, found out. It's to calculate the indentation level of a comment `c` in News:

    ((fn (f i) (f f i)) (fn (f i) (aif i!parent (+ 1 (f f (item it))) 0)) c)

reply

4 points by akkartik 47 days ago | link

Is that the Y combinator? I don't think I've seen it ever used "for real". It's not in either Arc 3.1 or Anarki.

You aren't using bracket notation as you originally asked. Might as well just use afn. It can call itself recursively as self. http://arclanguage.github.io/ref/anaphoric.html

reply

3 points by waterhouse 43 days ago | link

The Y combinator itself is more cumbersome, having an extra currying step or two. I prefer the form hjek is using—which is a function that expects to take "itself" as an extra parameter, like this:

  (fn (f i)
    (aif i!parent
         (+ 1 (f f (item it)))
         0))

So the recursive call, "(<self> (item it))", is implemented as "(f f (item it))". And then usage is very simple: actually give it itself as an extra argument.

The Y combinator works with a different function signature:

  (fn (f)
    (fn (i)
      (aif i!parent
           (+ 1 (f (item it)))
           0)))

That is, the function takes "something that's not quite itself" as an argument, and returns a function which does one step of computation and may do a "recursive" call using the thing that was passed into it. The implementation would therefore like to be:

  (def fix (f) ;aka Y
    (fn (i)
      ((f (fix f)) i)))

But, if we're doing the entire thing with anonymous recursion, we can (laboriously) implement fix like this:

  (= fix ;aka Y
     (fn (f)
       ((fn (g) (g g))
        (fn (g)
          (fn (i)
            ((f (g g)) i))))))

Every recursion step involves creating multiple lambdas. Eek. (It's even worse if you use the general, n-argument Y combinator, in which case you must use "apply" and create lists.) Whereas with hjek's non-curried approach, only a constant number of lambdas have to be created at runtime. (Optimizing compilers might be able to cut it down to 0.)
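
For anyone who wants to poke at the two styles outside Arc, here is a rough Python transliteration. It's only a sketch: the `parents` table is a made-up stand-in for `i!parent` / `(item it)`, and the names `self_apply`, `depth_step`, `depth_curried` and `fix` are mine.

```python
# Hypothetical comment tree: each id maps to its parent id (None for a top-level comment).
parents = {"c1": None, "c2": "c1", "c3": "c2"}

# Style 1: the function takes "itself" as an extra parameter.
self_apply = lambda f, i: f(f, i)
depth_step = lambda f, i: (1 + f(f, parents[i])) if parents[i] is not None else 0

# Style 2: curried signature, as the Y combinator expects.
depth_curried = lambda f: lambda i: (1 + f(parents[i])) if parents[i] is not None else 0

# Anonymous fixed-point combinator; the inner `lambda i` delays (g g) so that
# evaluation terminates in a strict language.
fix = lambda f: (lambda g: g(g))(lambda g: lambda i: f(g(g))(i))

print(self_apply(depth_step, "c3"))
print(fix(depth_curried)("c3"))
```

Both calls compute the same indentation depth; the second one allocates fresh closures on every recursive step, which is the overhead described above.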

If you want to create a macro like afn or rfn, and want the user to be able to act like the function is named F and accepts just the parameter i, you can put a wrapper into the macroexpansion, like this:

  (rfn F (i)
    (aif i!parent
         (+ 1 (F (item it)))
         0))
  ->
  (fn (i)
    ((fn (f)
       (f f i))
     (fn (f i)
       (let F (fn (i) (f f i))
         (aif i!parent
              (+ 1 (F (item it)))
              0)))))

And in this case, while the code does call for creating an F-lambda on every recursive call, I think it's easier for the compiler to eliminate it; I don't remember whether I'd gotten Racket to do it. (I think it probably did eliminate it when working with Racket code, but Arc, which generates all the ar-funcall expressions, might not have allowed that.)

The actual code for rfn will create a variable and then modify it, creating a lexical environment with a cycle in it. That's certainly a more straightforward approach. I figure the above is useful only if you're working in a context where you really want to avoid mutation or true cycles. (For example, I am considering a system that detects macros whose expansion is completely side-effect-free. It might be easier to use the above approach to defining iteration than to teach the system that rfn is "close enough" to being side-effect-free.)

reply

2 points by hjek 44 days ago | link

I've been using `aif` and `awhen` a lot but didn't know about `afn`. Thanks!

Is it the Y combinator? I didn't get that far in The Little Schemer yet, but I'll have to check.

reply

2 points by akkartik 44 days ago | link

Doesn't look quite like it, but close.

reply

2 points by hjek 50 days ago | link | parent | on: Writing a sane list macro

Never mind, found a less hacky solution:

    (mac li item
      `(tag li ,@item))

    (mac ol items
      `(tag ol ,@(map [list 'li _] items)))

reply

2 points by akkartik 50 days ago | link

That seems pretty much unhacky :)

reply

2 points by hjek 60 days ago | link | parent | on: Algolia HN Search source

Has anyone used this with Anarki, ever?

-----

3 points by hjek 60 days ago | link | parent | on: Inline JavaScript

> 4. All inline style attributes need to be removed and changes to news.css or news.js will need to be made in order to compensate.

Wat. Wow, browsers today! Is CSS vuln by default? Is that really necessary?

-----

3 points by i4cu 60 days ago | link

Strict CSP settings are a form of whitelisting what JS, CSS, etc. is valid, thus protecting against injection. Inline code for both JS and CSS can't be whitelisted like header items can be, so it will fail (unless you use the hash code hack mentioned for JS).

CSS is vulnerable too (since at least 2009):

https://scarybeastsecurity.blogspot.com/2009/12/generic-cros...

"By controlling a little bit of text in the victim domain, the attacker can inject what appears to be a valid CSS string. It does not matter what proceeds this CSS string: HTML, binary data, JSON, XML. The CSS parser will ruthlessly hunt down any CSS constructs within whatever blob is pulled from the victim's domain...."

Furthermore:

https://developer.mozilla.org/en-US/docs/Web/HTTP/CSP

"A policy needs to include a default-src or script-src directive to prevent inline scripts from running, as well as blocking the use of eval() . A policy needs to include a default-src or style-src directive to restrict inline styles from being applied from a <style> element or a style attribute."

So it's just the 'style' attribute people worry about and strict CSP manages.
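
For reference, the hash approach for inline JS boils down to putting a base64-encoded SHA-256 digest of the exact script text into the `script-src` directive. A minimal sketch in Python; the helper name `csp_script_hash`, the sample script and the rest of the policy are made up for illustration.

```python
import base64
import hashlib


def csp_script_hash(inline_js):
    """Return a CSP source expression such as 'sha256-...' for an inline script.

    The digest must cover the exact text between <script> and </script>,
    whitespace included, or the browser will refuse to run it.
    """
    digest = hashlib.sha256(inline_js.encode("utf-8")).digest()
    return "'sha256-" + base64.b64encode(digest).decode("ascii") + "'"


inline = "document.title = 'hello';"  # made-up inline script body
header_value = "default-src 'self'; script-src 'self' " + csp_script_hash(inline)
print("Content-Security-Policy: " + header_value)
```

Any change to the inline script means a new hash, which is why it gets tedious to maintain by hand.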

-----

2 points by hjek 57 days ago | link

Thanks for all those links. Not sure I "get" the browsers of today. Not sure I can be bothered manually adding hash-codes for inline JS (Maybe doable from Arc, but sounds hacky).

It will be difficult to deal with some styling functions from Arc, like `grayrange` that's greying out comments with negative score. Perhaps JS is more suitable?

I wonder in which file the CSP would need to be implemented in Arc, or whether it's easier to set them in an Nginx config.

reply

2 points by i4cu 57 days ago | link

> ... but sounds hacky

That's because it is a hack (as mentioned in my original comment edit#1).

My comments are only intended to provide whatever help I can towards the original posting context, which suggested a strict CSP criteria.

None of these things have to be done. It's up to you to decide, so really the question becomes what are you doing it for? Are you building a news site for a community of a few thousand people in a niche group? Or are you making a news app that others can buy into for their own product/uses? The latter would make me want to ensure it's CSP capable, while the former - not so much.

> It will be difficult to deal with some styling functions from Arc, like 'grayrange'...

I would just create 10 or 20 or whatever number of CSS entries that act as a segmented gradient (call them .color-reduct1 to .color-reduct10), then create a server-side function that takes the output value of grayrange and picks one of the CSS entries. Then add that class to the HTML element and you're good to go. It's not a perfect gradient, but it would be enough that I doubt it would make any noticeable difference.

JS is also an option, but then you have to store and pass the score into the JS calculation, which requires much more work than the above solution. Plus it forces you to expose the score (which HN no longer does).
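
Something along these lines, sketched in Python rather than Arc. The function names are hypothetical, and I'm assuming the gray level is an integer from 0 to 230; adjust `max_level` to whatever grayrange actually produces.

```python
def gray_class(level, max_level=230, buckets=10):
    """Map a gray level to one of the classes color-reduct1..color-reductN."""
    level = max(0, min(level, max_level))           # clamp out-of-range input
    index = 1 + level * buckets // (max_level + 1)  # 1..buckets
    return "color-reduct%d" % index


def gray_rules(max_level=230, buckets=10):
    """Generate the matching CSS rules, one evenly spaced gray per bucket."""
    rules = []
    for k in range(1, buckets + 1):
        v = (k - 1) * max_level // (buckets - 1)    # representative gray level
        rules.append(".color-reduct%d { color: #%02x%02x%02x; }" % (k, v, v, v))
    return rules


print(gray_class(0), gray_class(115), gray_class(230))
print("\n".join(gray_rules()))
```

The rules go into news.css once, and the page only ever emits a class name, so no inline style attribute is needed.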

> I wonder in which file the CSP would need to be implemented in Arc, or whether it's easier to set them in an Nginx config.

If you want to make code that's generic and usable by others then it needs to be in Arc (not everyone will use Nginx). I suggested using Arc templates [1] already and I still think this is the right way to go. Establish the base template definition in srv.arc and then each app can modify that base template from their app file. Additionally, allowing defop to optionally pass in overrides will make it dynamic if you need that variance.

I'm sure there are a dozen ways to do it, but that's my suggestion anyway.

1. http://arclanguage.com/item?id=20730

reply

2 points by krapp 57 days ago | link

> Not sure I can be bothered manually adding hash-codes for inline JS (Maybe doable from Arc, but sounds hacky).

No one bothers; everyone moves all of their JS to an external file (which is what I'm working on now) or they just don't bother with CSP headers at all.

> It will be difficult to deal with some styling functions from Arc, like `grayrange` that's greying out comments with negative score. Perhaps JS is more suitable?

The score for each comment could be added as a data attribute and JS could apply the style based on that. Offloading that to JS might make the forum more responsive.

...as well as, maybe, having markdown done entirely in JS, but that's for the future.

[edit] ... as well as maybe thread folding with JS and localstorage.

reply

2 points by hjek 60 days ago | link | parent | on: Self-hosting the Anarki community

Looks like banned IPs are written to the disk even:

    (def set-ip-ban (user ip yesno (o info))
      (= (banned-ips* ip) (and yesno (list user (seconds) info)))
      (todisk banned-ips*))

-----
