OK, I think that in order to give advice I'll need to understand your goals better. (After all, I could say "you should do X", but if X doesn't lead you to accomplishing something you want, then my advice wouldn't be very useful.)
When you say you want to make an Arc compiler, do you mean that you'd like to create a compiler that implements all of Arc, so that it could be used, for example, to run a news forum like Hacker News?
Or are you, for example, primarily interested in learning, and if so, what would you like to learn?
Any goal you have is valid, but it's relevant for your question about gc. For example, one of the choices you have is whether to not support threads, or to support threads but only on a single core, or to support threads on multiple cores. Which one you choose has a major impact on how to approach gc: it determines both what algorithms you can use and also what algorithms you want to use.
This project starts with the purpose of making an S-expression language which can be easily embedded into another project.
Since I really like Arc's syntactic sugar, I would like to implement the whole language specification that Arc has, and make it extensible and easy to embed. For those reasons, I would rather not use any kernel-level threading techniques, but instead use user-mode threads and a background dispatcher to support async socket I/O. Currently, it has tail recursion optimization, continuations, macros and a mark & sweep garbage collector.
Overall, you're right, it's primarily a project of learning how to make a language.
Just so I understand your terminology, by "user mode thread" do you mean that you will have threads at the Arc language level, and that they will be implemented by your runtime by having your C code itself switch between threads? (And thus the Arc language level threads will all run inside of one operating system thread?)
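To make that reading of "user mode thread" concrete, here is a minimal sketch in C of cooperative threads that all run inside one operating system thread. It assumes the POSIX ucontext calls (getcontext, makecontext, swapcontext), which glibc ships on Linux even though newer POSIX editions dropped them. Every other name here (green_spawn, green_yield, green_run_all, demo_worker, the stack size) is hypothetical and not taken from your project.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <ucontext.h>

#define MAX_GREEN   8              /* assumed limit, for illustration only */
#define STACK_BYTES (64 * 1024)    /* assumed per-thread stack size */

typedef void (*green_fn)(int id);

typedef struct {
    ucontext_t ctx;    /* saved registers and stack pointer */
    char      *stack;  /* heap-allocated stack for this thread */
    green_fn   fn;     /* body of the thread */
    int        live;   /* 1 until fn returns */
} green_thread;

green_thread g_threads[MAX_GREEN];
ucontext_t   g_scheduler;          /* context of the scheduler loop */
int          g_current = -1;       /* id of the running thread, -1 if none */
int          g_count   = 0;

/* Entry point of every user-mode thread. Returning from it resumes
   uc_link, which is the scheduler. */
static void green_entry(int id) {
    g_threads[id].fn(id);
    g_threads[id].live = 0;
}

/* Create a thread; returns its id, or -1 on failure. */
int green_spawn(green_fn fn) {
    if (g_count == MAX_GREEN) return -1;
    int id = g_count;
    green_thread *t = &g_threads[id];
    t->stack = malloc(STACK_BYTES);
    if (t->stack == NULL) return -1;
    if (getcontext(&t->ctx) != 0) { free(t->stack); return -1; }
    t->ctx.uc_stack.ss_sp   = t->stack;
    t->ctx.uc_stack.ss_size = STACK_BYTES;
    t->ctx.uc_link          = &g_scheduler;
    t->fn   = fn;
    t->live = 1;
    makecontext(&t->ctx, (void (*)(void))green_entry, 1, id);
    g_count++;
    return id;
}

/* Called by a thread to hand control back to the scheduler. */
void green_yield(void) {
    assert(g_current >= 0);
    swapcontext(&g_threads[g_current].ctx, &g_scheduler);
}

/* Round-robin scheduler: runs until every thread has finished. */
void green_run_all(void) {
    int remaining = 1;
    while (remaining) {
        remaining = 0;
        for (int i = 0; i < g_count; i++) {
            if (!g_threads[i].live) continue;
            g_current = i;
            swapcontext(&g_scheduler, &g_threads[i].ctx);
            if (g_threads[i].live) {
                remaining = 1;
            } else {
                /* Safe: we are back on the scheduler's stack here. */
                free(g_threads[i].stack);
                g_threads[i].stack = NULL;
            }
        }
    }
    g_current = -1;
    g_count   = 0;
}

/* Demo worker: records its id twice, yielding after each record. */
int g_trace[16];
int g_trace_len = 0;

void demo_worker(int id) {
    for (int step = 0; step < 2; step++) {
        g_trace[g_trace_len++] = id;
        green_yield();
    }
}
```

Spawning demo_worker twice and then calling green_run_all() interleaves the two workers on a single OS thread. With this design only one Arc-level thread ever runs at a time, so a collector can assume nothing else is mutating the heap while it works, as long as it never yields in the middle of a collection.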
One thought that occurs to me is that the next most useful thing might be a way to easily see the performance of the garbage collector in a program you're running. Then when you notice "oh look, Arc is allocating a large number of short-lived small objects" or whatever, you can tune your garbage collector (or look for a gc algorithm) to work well with the pattern you're seeing.
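As a rough sketch of what that visibility could look like, the C fragment below keeps a handful of counters that a mark & sweep collector could update from its allocation and sweep paths. All of the names (gc_stats, gc_stats_on_alloc, and so on) and the 64-byte "small object" threshold are assumptions made for illustration; nothing here is taken from your runtime. It also assumes each object header has room to store the cycle number returned at allocation time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define SMALL_OBJECT_BYTES 64   /* assumed threshold for a "small" object */

typedef struct {
    size_t collections;    /* completed gc cycles */
    size_t objects_alloc;  /* objects allocated since the last reset */
    size_t bytes_alloc;    /* bytes allocated since the last reset */
    size_t small_alloc;    /* allocations of at most SMALL_OBJECT_BYTES */
    size_t objects_freed;  /* objects reclaimed by sweep */
    size_t bytes_freed;    /* bytes reclaimed by sweep */
    size_t freed_young;    /* objects that died in their first gc cycle */
} gc_stats;

gc_stats g_stats;

void gc_stats_reset(void) {
    memset(&g_stats, 0, sizeof g_stats);
}

/* Call from the allocator. Returns the current cycle number, which the
   caller is assumed to store in the object header as its birth cycle. */
size_t gc_stats_on_alloc(size_t nbytes) {
    assert(nbytes > 0);
    g_stats.objects_alloc++;
    g_stats.bytes_alloc += nbytes;
    if (nbytes <= SMALL_OBJECT_BYTES) g_stats.small_alloc++;
    return g_stats.collections;
}

/* Call from the sweep phase for every object that is reclaimed. */
void gc_stats_on_free(size_t nbytes, size_t birth_cycle) {
    g_stats.objects_freed++;
    g_stats.bytes_freed += nbytes;
    /* Born since the previous collection finished: short lived. */
    if (birth_cycle == g_stats.collections) g_stats.freed_young++;
}

/* Call once when a mark & sweep cycle has finished. */
void gc_stats_on_cycle_end(void) {
    g_stats.collections++;
}

/* Print a summary, for example to stderr when the program exits. */
void gc_stats_report(FILE *out) {
    fprintf(out, "gc cycles:        %zu\n", g_stats.collections);
    fprintf(out, "allocated:        %zu objects, %zu bytes\n",
            g_stats.objects_alloc, g_stats.bytes_alloc);
    fprintf(out, "small allocs:     %zu\n", g_stats.small_alloc);
    fprintf(out, "freed:            %zu objects, %zu bytes\n",
            g_stats.objects_freed, g_stats.bytes_freed);
    fprintf(out, "died in 1st gc:   %zu\n", g_stats.freed_young);
}
```

If small_alloc and freed_young turn out to be close to objects_alloc, that would be the "lots of short-lived small objects" pattern, which is usually taken as a hint to look at generational or copying collectors. Timing each cycle would be a natural next counter to add.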