Building Blocks of the Sasquach Type System

10 Jul, 2021

This post I will describe my current vision for the Sasquach type system. Some content mentioned here is already implemented, though much of it is not. I indicate what's currently working¹ at the end of each section . I expect some aspects of the type system to change as I continue to implement it.

Structs

Sasquach is a high level functional language that aims to feel like you're (mostly) writing dynamically typed code. As a result, the type system needs to be quite flexible. The core data abstraction is a struct. A struct consists of an unordered, immutable set of fields and functions. Structs are not nominal, so two struct types are equivalent if they contain the same public field and function types. Note that this does not mean if you compare two structs at runtime that they will be equal, it only means that their types are the same.

{ foo: (n: Int) -> boolean, bar: String, baz: { foobar: Double } } == { bar: String, baz: { foobar: Double }, foo: (n: Int) -> Boolean }

Sasquach also has first-class modules, which are just singleton structs.

FooModule {
    count: Int = 10,
    add = (a: Int, b: Int): Int -> a + b,
}

let fooStructLiteral = { count = 20, add: (a: Int, b: Int): Int -> a + b }

typeof FooModule == typeof fooStructLiteral

The one difference between modules and struct literals is that modules require type annotations on all of their public fields and functions. There are two reasons for this:

It provides API documentation to those that wish to use the module.
It simplifies and accelerates the type resolution process.

The second point has a massive effect on the compilation speed and IDE experience. When a module imports another module, it only has to look at the defined type signature. This makes type checking embarrassingly parallel at the module level. With full type inference, the type system needs to recursively resolve the parameter types and return types of any functions being used. Requiring type signatures also greatly helps with IDE support, as it means that making changes to a function only affects the type checking of other modules if it changes the signature. Here's an example of the issues full type inference can cause for IDEs:

Greeting {
  // Inferred type is (greeting: String) -> String
  improveGreeting = (greeting: String) -> {
    "${greeting}!"
  }
}

// Now we want to change it, so we start making changes in the IDE
Greeting {
  // Inferred type is now (greeting: String) -> Void
  improveGreeting = (greeting: String) -> {
    // We haven't finished writing the code yet, we only have the if true part right now. An if 
    // expression without and else branch returns nothing, so the function now returns type Void.
    
    // This means that any code that uses "Greet.improveGreeting" no longer typechecks correctly 
    // and you get a bunch of type errors in the IDE until you finish the else branch. 
    // If you include the type signature, you only get type errors within this function.
    if String.length(greeting) < 10 {
        "${greeting}!"
    }
  }
}

Everything in this section is implemented except for the typeof builtin and string interpolation. Oh and comments 🙃.

Tuples

Tuples are like a structs with unnamed fields. In fact, at runtime they are actually structs with fields named 1, 2, 3, etc. however they are treated differently by the type system. This is to avoid creating some weird types with features enabled by [row polymorphism](#Row Polymorphism).

let foo = (10, "hi", 145D)

typeof foo == (Int, String, Double)

Tuples are not implemented and neither are double literals.

Type Aliases

It is possible to name a type using a type alias. Type alias simply provide a more concise way to refer to a type, they have no bearing actual type checking.

type User = { userId: Int, name: String },

User == { name: String, userId: Int }

Modules can also declare type aliases that are exported to other modules. There is a special Self type alias that exports the module name as a type. I'm currently unsure about the Self type because it seems like it can lead to confusion. On the other hand, I dislike the look of Ocaml's convention of a T type when writing out type signatures.

Product {
    type Self = { name: String, sku: String, variants: List[Variant] },
    // Yes I know prices should be big decimal. This is for simplicity.
    type Variant = { variantName: String, price: Double },
}

// Other module
use foo/bar/Product,

// Return type refers to the Product "Self" type alias
showMeTheMorty = (): Product -> {
    // Types must be imported or qualified
    let variant: Product.Variant = { variantName = "special edition", price = 10D }
    { name = "Morty Action Figure", sku = "rckndmrty", variants = List.of(variant) }
}

Type aliases are not implemented.

Row Polymorphism

The behavior described so far provides a very limited set of structural typing. It only allows a function to take a struct that has the exact type provided in the signature. If a struct has any extra fields, the function will not accept it.

foo = (user: { userId: Int, name: String }): Void -> print(user.name)

// This does not compile due to the "occupation" and "age" fields
foo({ userId = 10, name = "Dave", occupation = "Rapper", age = 33 })

To allow this, Sasquach uses something called row polymorphism. This essentially means that you can declare a special "row variable" in the struct type that contains all fields not declared in the struct type.

// R is the row type variable in this case
foo = [R](user: { userId: Int, name: String, ..R }): Void -> print(user.name)

// This does compile since the given struct contains "userId" and "name". R is now bound to the extra
// fields, which in this case makes R == { occupation: String, age: Int }
foo({ userId = 10, name = "Dave", occupation = "Rapper", age = 33 })

I decided to use ..R rather than | r like some other languages use for two reasons:

The use of | would clash with union type declarations (which I haven't talked about yet).
The syntax matches the spread operator in Typescript and Python, which has similar semantics.

Having row polymorphism also enables some powerful abstractions when you can also refer to those extra fields in the function. This allows functions to add and remove fields from an input struct to form a new struct type. Note that syntax here is still pending, it could use some DRY.

// Rows can be used for type-safe routing of an http request. This function takes the last section 
// of the path and adds a new field to the body.

// The type of r here is R, which contains all of the fields in the struct argument except for "body"
userRoute = [R](requestPath: List[String], { body, ..r }: { body: String, ..R }): { body: String, userName: String, ..R} -> {
    let userName = List.last(requestPath)
    // Return a struct that includes the fields of the input struct, as well as a new userName field
    { body, userName, ..r }
}

This is somewhat implemented, struct arguments currently always behave like row types but with no way to refer to the row type or row variable.

Parametric Polymorphism (aka Generics)

Parametric polymorphism in Sasquach is relatively simple compared to other languages. Since Sasquach doesn't have nominal types, it is impossible to express that a struct inherits from another struct type. This mean that you don't have to deal with covariance and contravariance of types. In addition, Sasquach does not have typeclasses², so you can't say that type T must implement Eq or Serializable. It just represents a type variable that gets replaced by another type.

Box {
    of = [T](value: T): { value: T } -> { value },
    map = [T, U](box: { value: T }, mapper: { map: (input: T) -> U }): { value: U } -> { value = mapper.map(box.value) }
}

let intBox = Box.of(10)
Box.map(intBox, { map: (input) -> String.ofInt(input) })
=> { value: "10" }

This was implemented a couple of days ago, minus the field punning.

Atoms

Atoms are singleton values whose type evaluates to itself, e.g. typeof :foo == :foo. The name "atom" is based on Erlang/Elixir, though the same concept exists in Clojure as a keyword, in Ruby as a symbol, and in Typescript as a string literal type. Atoms on their own aren't terribly useful, however they can be used for pattern matching.

Intersection and Union Types

Intersection types allow combining the fields to multiple struct types into a single struct type.

type PeanutButter = { peanutButterAmount: Int },
type Jelly = { jellyAmount: Int },
type PBandJ = PeanutButter & Jelly,

PBandJ == { peanutButterAmount: Int, jellyAmount: Int },

Union types express that a type can be one type or another. They can be combined with atoms to create an ad-hoc enum or tagged unions. They are especially useful because Sasquach does not allow overloading functions.

type Toasted = ToastSetting | Int | Double,
// A union of atoms can be matched across exhaustively
type ToastSetting = :barely | :golden | :burnt,

let toastedSetting: Toasted = :burnt
let toastedLevel: Toasted = 10

let toast = (toasted: Toasted): Double -> match toasted {
    :burnt => 1D,
    :golden => 5D,
    :burnt => 10D,
    // Cannot resolve if level is Int or Double, so type is Int | Double
    level => match level {
        // Can specific the type in a match to disambiguate the types. This match could occur at the
        // parent match, the nested match is just for demonstration purposes. 
        lvl: Int => Double.fromInt(lvl)
        lvl: Double => lvl
    },
}

toast(:burnt)
=> 10D

toast(4)
=> 4D

Intersection and union type combine in interesting ways. They can reduce a lot of boilerplate in structurally typed languages.

type ToastedPBandJ = PBandJ & { toasted: Toasted },

let goldenPBandJ: ToastedPBandJ = { peanutButterAmount = 10, jellyAmount = 2, toasted = :golden }
let burntPBandJ: ToastedPBandJ = { peanutButterAmount = 3, jellyAmount = 7, toasted = 10 }

type JellyLevel = { jellyAmount: :none | :some | :lots },
JellyLevel & Jelly == { jellyAmount: Int | :none | :some | :lots }

You can find plenty more interesting examples for how to use union and intersection types via Typescript docs and articles. A future post will talk about the tradeoffs between tagged and untagged union types. Unions and intersection types are not implemented.

Nominal Types

I lied earlier when I said there aren't any nominal types. There are actually two forms of nominal types:

Foreign types (aka Java classes) - Java class types are nominal because, well, Java is a nominally typed language. Sasquaches' type system makes no attempts to resolve classes against their inheritance hierarchy. This means you cannot assign a HashMap to a Map-typed variable. However, foreign function calls will happily accept concrete objects where interfaces are required.
Opaque types - This is the one form of nominal type that can be created within Sasquach. It exists to hide the actual implementation details of the inner type so users can't access the fields within or try to pattern match on it. I'm not sure if this will need a separate keyword, it might be sufficient to expose a struct with only private fields. Opaque types could also be useful to hide the Java types that underlie a modules actual implementation.

Foreign types are implemented, but opaque types are not.

Conclusion

This was a bit of a whirlwind tour of the Sasquach type system. Future posts will explore how these types interact in various ways and how they make a coherent language. I think it'll be easier to write about these interactions once I implement more of the above. In short, the type system is like a more sane version of Typescript's, with more powerful pattern matching due to the ability to disambiguate types at runtime. Feel free to email feedback and comments to sasquach at pentlander.com

Working means likely full of bugs and lacking error handling :)↩
Ad hoc polymorphism will be implemented using modular implicits and will be the subject of a future post.↩

#pl #sasquach #types