Why build Sasquach lang
I'm writing a new functional programming language called Sasquach that targets the Java Virtual Machine (JVM). This post aims to answer why I'm building it, why I'm not just using another language, the goals and non-goals of the language, and why it's targeting the JVM.
Why
For the most part I'm building this language for myself. I've had a few side projects over the years, usually related to distributed systems or developer tools. The problem is that these pieces of software are most useful for teams of developers and are usually on the critical path. I'm a strong believer in dogfooding. The lack of a large-scale system to test with or a team of developers to get feedback from drains my motivation to work on these projects. The fact that these systems are often business critical makes them difficult to build as a solo side project while holding down a full-time job.
I wanted to work on something that is both technically challenging and that I could actually use myself. I've been interested in programming language design for a while now. There was a 3-month period where I would read a post from Oleg Kiselyov's site at least once a day. I avoided actually writing a language because I was intimidated by the amount of upfront work required. After setting aside my latest project for the reasons mentioned above, I finally said fuck it and just started writing a language. I owe a big thank you to Jakub Dziworski, whose series on building a JVM language was a huge help when getting started.
Why not just use language X?
Besides just wanting a new project, I've had a general disappointment in the languages I've used. I have tried out several languages and they fall short of exactly what I'm looking for.
- Elixir is great at expressing high level logic in a concise manner, thanks to pattern matching, macros, and a pragmatic approach to functional programming. It also has first class concurrency and a unique approach to error handling. However I prefer static typing, and type specs are bolted on. Macros can also make code difficult to follow without diving into the source code.
- Scala has great facilities for functional programming and a powerful type system. Scala 3 is a big improvement by taking common patterns and building them in as language constructs, e.g. typeclasses. It can also take advantage of the Java ecosystem of libraries, which greatly expands its usefulness. However, the mix of OO, FP, and a complicated type system comes at the cost of a lot of syntax and having several different ways to express the same program. It also leads to slow compile times and IDE integration that is difficult to build. The community also has a tendency to write libraries as if you were writing Haskell.
- OCaml has a pragmatic view when it comes to functional programming and compiles fast. However, multithreading, though close, is a work in progress. Library support is not great and tooling is kind of a mixed bag. There were times when I wasn't sure when I was supposed to use dune and when I was supposed to use opam. I also strongly dislike the split in the ecosystem when it comes to a standard library, which leaks into choices around other libraries, e.g. Lwt vs Async.
- Haskell has never caught my interest, though I understand why some people like it. As a disclaimer, Haskell is the language in this list that I've used the least by far. I prefer strict functional languages instead of dealing with laziness and monads. I also strongly dislike the idea of ~~compiler plugins~~ language extensions, which in essence forks the syntax and means you have to figure out what "flavor" of the language is being used. Not sure how this works out in practice.
- Rust is definitely my favorite language. It has a nice mix of functional programming using iterators and pattern matching, along with ad-hoc polymorphism via traits. Knowing which functions can mutate a struct makes understanding code a lot easier. It also has a great library ecosystem and community. It falls short for web projects though. Wrestling with the borrow checker, slow compile times, and dealing with async/await do not lend themselves well to a server-side rendered web project with quick iteration times.
- Kotlin is another language that I like a lot. In some ways it's similar to Rust in that idiomatic code uses a mix of functional programming with imperative/object-oriented code. Extension methods provide a good way to change the way external code is used and nullability in the type system is amazing. The lack of ad-hoc polymorphism is my main gripe with the language. I would also prefer having either entirely functional code or some way to indicate mutability like Rust.
Goals
Here are the goals of the project in no particular order. I expect them to evolve somewhat over time as I continue to build out the language.
- Easily express high level, type-safe logic - Writing Sasquach code should feel like writing a dynamic language, with type annotations only required on function declarations. Support for ad-hoc polymorphism is a must.
- Small base language - The language should have relatively few base constructs that easily compose, e.g. modules and tuples are just structs. There should almost always be just one way to express a construct syntactically, e.g. functions and lambdas share one syntax, as do module and struct construction. The standard library, however, will be extensive.
- Compiler is your friend - The compiler should have clear error messages and suggestions. Compilation should run fast, giving the developer a quick edit-compile-run loop. The compiler should also be designed in such a way that it can provide a good IDE experience.
- Server-side web framework - The main use case in mind is to build out a batteries-included server-side web framework like Elixir's Phoenix. I really like the way that Phoenix was built out of several independent, well-designed libraries with some glue code and conventions that make everything work together. I want to build a type-safe framework in a similar fashion, e.g. with a type-safe version of Ecto for validation and transformation of unstructured data.
- Cutting edge of the JVM - The language should work on new JVM versions as they are released. It should take advantage of the latest features to ensure that the language is using everything that the JVM has to offer. I plan to make structs primitive classes and for concurrency to be built with Project Loom in mind.
- Easy to reason about state - This one I'm a bit fuzzy about. My current thinking is to keep the language ~~purely functional~~ mutation-free, with mutability only allowed with Java interop. Usually state in functional languages is maintained through recursion, so I'd need to implement tail call elimination. I haven't figured out how I'd want to manage mutability, whether it's mutability modifiers on variables or mandating that Java interop occurs in special blocks.
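
The tail call elimination mentioned in the last goal can be sketched in plain Java. This is only an illustration of the general technique for a self-recursive function, not a description of how Sasquach implements it. The class and method names are made up for the example. The assumption is that the compiler has to do the rewrite itself, since the JVM does not reuse the stack frame for a tail call on its own.

```java
public final class TailCallSketch {

    // Tail-recursive form: the recursive call is the very last thing the
    // method does, and the running total travels in the accumulator.
    // On the JVM each call still pushes a new stack frame, so a large
    // enough n can end in a StackOverflowError.
    public static long sumTo(long n, long acc) {
        if (n == 0) {
            return acc;
        }
        return sumTo(n - 1, acc + n);
    }

    // What eliminating the tail call amounts to: instead of calling itself,
    // the method overwrites its own parameters and jumps back to the top.
    // Stack depth stays constant no matter how large n is.
    public static long sumToLoop(long n, long acc) {
        while (true) {
            if (n == 0) {
                return acc;
            }
            acc = acc + n;
            n = n - 1;
        }
    }

    public static void main(String[] args) {
        // Small input, so the recursive version is safe here.
        System.out.println(sumTo(1_000, 0));
        // Ten million iterations with no stack growth.
        System.out.println(sumToLoop(10_000_000, 0));
    }
}
```

Both methods compute the same result for the same input. The loop version is roughly what a compiler could emit as bytecode for the recursive one, which is what makes recursion a workable replacement for mutable loop state.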
Non-goals
- ~~Compiler plugins~~ Language extensions - There should be one language that everyone uses. Users should not have to determine what syntax is valid based on the libraries being used. Macros or a form of type-safe reflection should fill this niche if needed.
- Runtime performance - Obviously I would like to make the language as fast as possible and will strive to do so. However if there is a choice between language expressiveness and runtime performance, expressiveness will win. The use of the JVM should make the language plenty fast.
- Calling Sasquach from Java - There will be no support for making Sasquach code easy to call from Java. The coding style and type system are different enough that making this possible would require a lot of effort and careful design. I prefer not to be constrained by Java the language.
- Subtype polymorphism - The language will use row polymorphism instead. Supporting both would make the typechecker much more complicated. I also don't think it makes much sense in a functional language with ad-hoc polymorphism.
Why the JVM?
I'm much more interested in building a language than building a virtual machine, so compiling to an existing VM lets me avoid writing a bunch of code. Compiling to an existing platform also lets me take advantage of the existing library ecosystem instead of needing to build everything from scratch or linking to C libraries. I chose the JVM specifically because I have a lot of experience with Java from my career and because it has an amazing ecosystem of libraries. I also wanted to gain a better understanding of how the JVM works, and messing with bytecode seems like a great way to go deeper. I've already learned a lot about what's in a classfile and how method invocation works. invokedynamic alone will be the topic of a future post.
Wrapping up
I'm excited to say that this is my first ever blog post! I'm going to continue blogging about the process of building Sasquach with a mix of short and long form posts. If you have any questions, feedback, or comments, please send them to sasquach at pentlander.com. Shout out to @feoh and @andyc on lobste.rs for encouraging me to blog in the first place!
Update 1: I used some terms that don't accurately reflect my intentions. I've crossed them out and replaced them with the correct terms. I've also updated the Haskell section with a disclaimer.