Haskell

Media type: wiki · created 2024-05-28 05:29 · modified 2025-10-11 00:39

Foreword: this page was written with the content in Typed lambda calculus fresh on my mind. Implicit references are made to more formal type theory concepts throughout, and this page may not be a great standalone accounting of the covered Haskell topics (I've confused myself re-reading this page without the lambda calculus terminology so top-of-mind, for instance).

Syntax groups

Expressions are the symbolic representations of values. This takes place at both the term and type level.

```haskell
-- expressions refer and evaluate to values
-- here I refer to a value X as "value<X>" to separate it from a symbolic
-- representation of that value, including the symbol "X" itself

-- this is at the *term* level
9+1 --refers to--> value<10>
10  --refers to--> value<10>

-- this is at the *type* level
Integer            --refers to--> <space of integers>
Integer -> Integer --refers to--> <space of functions from/to integers>
```

Here we're simply saying that expressions are the things we write down (a choice of symbols), while values are the objects to which those expressions refer (the more abstract thing "under the hood," independent of representation). And this applies to any syntactical representation of categories of objects, so we make a distinction between this happening at a term and type level (but the expression-value "pattern" is present in either case).

Equations associate patterns with expressions, specifically in term space.
The expression here is therefore analogous to the usual notion of a term, and the pattern is more a reusable, functional alias to parametrize that term structure in various contexts. This is somewhat muddy: patterns are more than just names, but they aren't fundamentally part of a function term. They are perhaps best thought of as syntactic sugar for case switching different behavior that we bundle up under the same alias, such that the alias behaves as "one function" (but endowed with case matching over its arguments in the intuitive sense).

To hit on this point more (since it initially confused me): equations are not responsible for creating function terms in that they do not fundamentally facilitate abstraction. `=` effectively just attaches a term (RHS) to a name (LHS), but with the pattern matching over various signatures. A definition like

```haskell
-- equations are of the form <pattern> = <expression>
factorial 0 = 1
factorial n = n * factorial (n - 1)
```

simply breaks up the definition of a single function, factorial, into two cases based on the input. The original confusion arose due to terms appearing on the LHS, and so it feels as if `=` is the thing doing the "bridging" here to connect them. But `=` is not a replacement for λ in the usual lambda calculus sense; the latter is occurring implicitly on the RHS, and `=` merely attaches the resulting function to the name. To be clear:

```haskell
factorial n = n * factorial (n - 1)

-- is semantically equivalent to
factorial = \n -> n * factorial (n - 1)

-- which might be (lazily) represented in lambda calculus as
-- factorial := λn. n * factorial (n - 1)
```

So: while `=` helps assign names to terms in this fairly primitive sense, and is separate from the actual creation of function terms, it is more than just a `:=` in that the LHS pattern can be matched against later on. It is more than a name in that way, and really a name for a function term that case matches its input before determining its behavior.
If we wanted to be very explicit, even across several Haskell equations, factorial can still be truly thought of as a single term with this case switching embedded:

```
factorial := λn. case n of (0 -> 1 ∣ m -> m × factorial(m−1))
```

Mind you, everything is a function when we draw analogies to lambda calculus. The most general characterization of this is written as follows, where `case ... of` is a valid Haskell expression:

```haskell
-- a set of equations like
f p11 ... p1k = e1
...
f pn1 ... pnk = en

-- is precisely equivalent to
f x1 x2 ... xk = case (x1, ..., xk) of
  (p11, ..., p1k) -> e1
  ...
  (pn1, ..., pnk) -> en

-- and can be more explicitly written
f = \x1 x2 ... xk -> case (x1, ..., xk) of
  (p11, ..., p1k) -> e1
  ...
  (pn1, ..., pnk) -> en
```

This syntactically separates each of the involved components here:

- `=` attaches an expression to an alias
- `\` initializes an abstraction (function)
- `case ... of` facilitates pattern matching

This story is far less clear with just the first version (the set of equations), where our convenient use of `=` masks these mechanisms.

Declarations are what we generally call statements that serve as definitions. This includes equations (i.e., pattern bindings), as well as type synonyms, data statements, etc. Any statement that associates some expression with a name, "ascribing meaning" to scoped aliases, is a declaration. The notion of a "declaration" is more general than what we called an equation above. One can think of the hierarchy of terms involved loosely as

```
Declaration (introduces objects)
├─ Binding‑declaration (pattern-expression associations, aka equations)
│   ├─ Function‑binding -- "f x y = …" (multiple clauses allowed)
│   └─ Pattern‑binding  -- "(x,y) = …", "Just n = …", etc.
└─ Other declarations   -- type, data, class, import, …
```

Note that function bindings can fail to match (refutable), while pattern bindings are lazy (and are irrefutable).
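To tie the equation/lambda/case story together, here is factorial spelled all three ways; a minimal sketch, where the names factorialA/B/C are my own, used only for side-by-side comparison:

```haskell
-- 1. A set of equations (pattern matching spread across clauses)
factorialA :: Integer -> Integer
factorialA 0 = 1
factorialA n = n * factorialA (n - 1)

-- 2. One equation whose RHS is an explicit case expression
factorialB :: Integer -> Integer
factorialB n = case n of
  0 -> 1
  m -> m * factorialB (m - 1)

-- 3. A bare lambda attached to a name with '='
factorialC :: Integer -> Integer
factorialC = \n -> case n of
  0 -> 1
  m -> m * factorialC (m - 1)
```

All three names denote the same function term; only the surface syntax differs.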
Equation resolution: matching, binding, evaluation

Syntax definitions

Expressions (syntactic terms) refer to values (the actual abstract objects). `5` is the character (expression) we use to refer to the "concept of 5." Simply put, we say the values are the real things; expressions are how we write them down (and not strictly in a canonical form, e.g., both `5` and `4+1` are expressions that yield the same value). Type expressions are how we write type values (or just "types"), just like the above distinction. For instance, `Integer -> Integer` is the type expression that represents the type of functions to/from integers. A value together with its type is called a "typing," and we use `::` to declare a type relation (read as "has type"):

```haskell
5 :: Integer
inc :: Integer -> Integer
[1,2,3] :: [Integer]
```

Polymorphic types

In λ2 we introduced generics and higher-order quantification of type variables. Haskell only supports universal quantification (i.e., the specification of generics), and we do this just using "bare" (free) type variables. So something like

```haskell
length :: [a] -> Integer
length [] = 0
length (x:xs) = 1 + length xs
```

does the following:

First it declares length to be a generic function operating on lists of any type, mapping them to an integer. In our more formal type theory context we might instead write `length : ∀a. List[a] -> Int`, but the point is that we don't need to explicitly bind `a` with ∀.

Note also that `[a]` is a type constructor. Recall our formal definition of a list type, like

```
rec type List[Item] = [nil: Unit, cons: {head: Item, tail: List[Item]}]
```

Although it's sort of hard to justify putting `a` in the middle of some brackets in the functional sense (so `[a]` versus something like `[](a)`, where in the latter case it's clear we're explicitly "calling" something we've decided to label as `[]`), we simply define `[a] = List[a]`.

With `length [] = 0`, we're declaring that length applied to `[]`, the empty list (or nil above), evaluates to 0.
With `length (x:xs) = 1 + length xs`, we define the rest of the function's behavior recursively. Note that `x:xs` basically unpacks cons of the above definition, assigning x to the head of the list and xs to the remaining sublist. To be clear, our list type is a variant/sum type, and has values that are either nil or cons, or in Haskell, either `[]` or `(x:xs)`. Our eliminator here is pattern matching, which is being done implicitly as we define our "overloaded" function variants, which will match a sum type `[a]` as either its nil or cons "internal" type.

User-defined types

`data` can be used to declare new types:

```haskell
data Bool = False | True
data Color = Red | Green | Blue | Indigo | Violet
data Point a = Pt a a
```

Bool and Color are nullary type constructors (we can define all types as type constructors, where those not parametrized by a type variable can just be seen as a constant, nullary function) that are sum types. The enumerated "values" here are called data constructors (also nullary in this case). To be clear: data constructors produce values, type constructors produce types.

Point provides an example of a "proper" type constructor, ranging over a type variable a. `data Point a` suggests Point is a type constructor generic in any type a, and has terms that are constructed via a data constructor `Pt a a`. The latter can perhaps be more formally written `Pt t1:a t2:a`, which is to say it "operates on" two terms t1 and t2, both of type a as bound by the type `Point a`. So on the left, a is a type variable (ranging over all types), but on the right, a is a term variable (ranging over terms/values of type a).

In any case, from the usual programming perspective, this notation slightly confuses me because it seems to do so little, and the RHS feels so redundant: we end up with two super generic things that define basically no specific functionality. `Point a` and `Pt a a` seem to do nearly exactly the same thing.
But from the type theory perspective, I suppose it's clear enough, where the RHS is serving to tell us the signature we'll use to introduce terms of the type. In Python, for instance, this is like the difference between a class declaration and the constructor signature:

```python
class Point[a]:
    def __init__(self, x: a, y: a): ...
```

The confusing part of the association here is that, in Python, we don't introduce some new alias for the constructor; we just use the class name and have `Point[int](1, 2)`, with the type variable and terms all in one place. This raises a key point, and a common practice in Haskell: we can simply let these two names be the same to capture this arguably more intuitive syntax (they are in separate namespaces, so there's no concern of clashing), and could have

```haskell
data Point a = Point a a
```

So to be clear on the usage:

```haskell
-- in the former case
Pt 2.0 3.0 :: Point Float
Pt True False :: Point Bool

-- in the latter
Point 2.0 3.0 :: Point Float
Point True False :: Point Bool
```

While I get we might want to make a distinction between the names of constructors in different spaces (term vs type), for now it seems there's really no reason to name them differently.

Pt a a is complete

This perhaps jumps the gun a bit with respect to the pattern matching details provided below, but I want to expand a bit on a particularly sticky concept as I warm up to Haskell conventions. For some time, I've treated the declaration `data Point a = Pt a a` as incomplete, in the sense that I had always implicitly expected some more concrete definition for Pt as a constructor to be provided later. From above, I believe my intuition rather directly led me to think the Python analog was

```python
class Point[a]:
    def __init__(self, x: a, y: a): ...
```

without the constructor body, as both appear to leave out the final details for what Pt should in fact do with the two values provided to it.
Note that it is perfectly acceptable to think of Point's `__init__` as positionally equivalent to Pt: they both serve as constructors for `Point[a]`/`Point a`. In any case, `Pt a a` appears merely to set up the constructor signature just like the `__init__` shell above. But there is nothing more to say: it is already complete. It's true that `Pt :: a -> a -> Point a`, but this evokes thoughts that Pt is something that will take two values `x: a` and `y: a` and produce some new form that will have type `Point a`. However, `Pt a a` is that thing, literally; there's not a more involved object with attached variables or methods like we expect in OOP. The mechanics for acting upon such an object are built-in: pattern matching. Pt may be used, for instance, as follows:

```haskell
p :: Point Int
p = Pt 3 4

norm1 :: Num a => Point a -> a
norm1 (Pt x y) = abs x + abs y -- eliminate (pattern match)
```

An object like `Pt 3 4` is directly matched against in the pattern `(Pt x y)`: we're able to bring our component values into the RHS scope as x and y directly, "destructuring" our object upfront. The larger point here: data declarations don't yield something strictly more concrete than what they appear to be syntactically. As vague as something like `Pt a a` looks, this form alone shapes our type: two terms of type a in a box called Pt. That's about as specific as we need to get, and as specific as data declarations allow.

Type synonyms

The `type` keyword lets us define type "synonyms" (or type aliases, as I'd prefer to call them), assigning new names to compositions of previously defined types.
For example,

```haskell
type String = [Char]
type Person = (Name,Address)
type Name = String
data Address = None | Addr String

-- can also do this for polymorphic types
data Tree a = Leaf a | Branch (Tree a) (Tree a)
type AssocTree a b = (Tree a, Tree b)
```

Functions

`->` notation for types, as usual:

```haskell
add :: Integer -> Integer -> Integer
add x y = x + y
```

Functions are curried in the usual way, i.e., we can think of add as a function with one arg that returns another function with an arg:

```haskell
add :: Integer -> (Integer -> Integer)
```

Curried functions make partial application particularly clear, e.g.,

```haskell
inc = add 1
```

There's still another operand to "finish" the add operation, but the 1 is tucked away as the first operand, such that inc is effectively now just `f(x) = add 1 x`. Functions can also be taken as an argument (no surprises here):

```haskell
map :: (a->b) -> [a] -> [b]
map f [] = []
map f (x:xs) = f x : map f xs

-- example application
map (add 1) [1,2,3] -- => [2,3,4]
```

And to be clear about function terms and their names:

```haskell
inc x = x+1
add x y = x+y

-- are really shorthand for:
inc = \x -> x+1
add = \x y -> x+y
```

where we use `\x -> x+1` to define an anonymous function, in the usual λ-expression or abstraction sense from lambda calculus.

Functions are non-strict

Haskell processes definitions rather than the assignments of other languages. If we have something like

```haskell
v = 1/0
```

Haskell treats this like "define v as 1/0" rather than "compute 1/0 and store the result in v." The declaration alone does not imply any computation occurs, and in general, evaluation is lazy in nature. Such an evaluation will occur only when the value is needed in some context, in which case only then will we encounter the zero division error. For a function like

```haskell
const1 x = 1
```

evaluations of the function never even look at the value of its argument. It knows it doesn't need it, and again, something like `const1 (1/0)` will evaluate to 1.
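This can be checked directly; a minimal sketch, where I pass undefined (a value that errors the moment it is forced) instead of 1/0:

```haskell
-- const1 never inspects its argument, so even a diverging value is fine.
const1 :: a -> Integer
const1 x = 1
```

Calling `const1 undefined` returns 1 without ever touching the argument.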
It's as if 1/0 simply gets to stay in syntax form, and any issues will go completely undiscovered until we need to coerce that syntax into a value.

Recursive functions

Haskell permits "informal" self-reference when defining recursive functions, as is commonplace in most modern programming languages. We discuss the issues here at length in Typed lambda calculus§Recursion and the fix combinator. I think there's an important new perspective to really internalize here when thinking about the object we get from recursive definitions, certainly at a deeper level than I've been accustomed to. Take for instance the following:

```haskell
ones = 1 : ones
```

The usual surface-level interpretation here, at least as far as I think about it, is that ones is some arbitrarily large, "expanding" self-referential object/tree. Now this is perfectly fine, and even practically accurate when we try to follow the recursive stack traces in some real program. But this is less valid when we really force ourselves to make ones concrete at definition, rather than a dynamically growing thing. Note what we're actually saying with the definition: ones is some object that, together with some extra structure (the prefixed 1), is still itself. In other words, prefixing with a 1 can't change what ones actually is here; it must already "internalize" that action such that the object already includes it, in a sense¹. With any finite structure, this of course feels incredibly paradoxical: we are by definition adding something on top of ourselves, which must give us something new, something bigger. But when ones is simply the infinite sequence of 1s, we break out of this limitation. Such a sequence does not change when we add another 1 to it. We can also see this object as the fixed point of the function adding our structure, e.g., `f(x) = 1:x`. This harkens back to our definition of the Y-combinator, albeit where x in f(x) is a nullary function and the f(x) we just provided looks less like a functional.
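The definition runs exactly as written; laziness means we can only ever observe finite prefixes of the infinite object, e.g. with take:

```haskell
-- The infinite list of ones: the fixed point of (1 :).
ones :: [Integer]
ones = 1 : ones
```

`take 5 ones` forces just five cons cells and leaves the rest as an unevaluated tail.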
In any case, `f(ones) = ones`, which is effectively what our original declaration requested of us: assign to ones the object that doesn't change when adding some structure to it (in this case, prepending a 1). The Fibonacci sequence is a slightly more involved definition:

```haskell
fib = 1 : 1 : [ a+b | (a,b) <- zip fib (tail fib) ]
```

Again, I find there are two ways to start understanding a definition like this:

1. As a "generating" sequence, where you can unravel it one call at a time. We take what we know to be concrete to start (the base case): the sequence `1 1`. We then zip up the sequence with its offset to produce the next chunk by adding the pairs:
   - `[1 1]` `[1 null]` -> `[2]`, add this to produce `1 : 1 : [2]`
   - `[1 1 2]` `[1 2 null]` -> `[2 3]`, add this to produce `1 : 1 : [2 3]`
   - `[1 1 2 3]` `[1 2 3 null]` -> `[2 3 5]`, add this to produce `1 : 1 : [2 3 5]`

   Note how with each step we're effectively "peeking" at what the next item will be, and then we perform the full computation again, taking that new sequence as if it were the value of fib to begin with. This is like packing in more and more recursive calls, approaching the true object.

2. As the full fixed point: rather than an iteratively applied function, which is perhaps the only practical way to build up intuition and get some object, you "snap" straight to the full, global, infinite object.

Note again how we're not even really defining functions in the usual way. For instance, in other languages we might canonically construct a `fib(n)` function that computes the sequence for the first n items. Here it's as if we're defining an infinite sequence outright; at least in this example, fib is not a function that accepts arguments (although a concrete value could be lazily evaluated if some piece of code attempted to index into the fib list).

Case expressions and pattern matching

Patterns are the symbols used to refer to variables in parameterized contexts, like function definitions as we've seen. Haskell makes use of pattern matching to verify appropriate cases.
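Before moving on, the fib definition from the previous section is runnable exactly as written; a sketch with a type signature added so a prefix can be observed:

```haskell
-- Corecursive Fibonacci list: each element past the first two is the sum
-- of the list zipped with its own tail.
fib :: [Integer]
fib = 1 : 1 : [ a + b | (a, b) <- zip fib (tail fib) ]
```

Indexing or taking a prefix forces only as much of the sequence as is needed.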
Attaching values to variables or arguments can be generally thought of as first undergoing matching (determining an appropriate definition to fill), and thereafter binding to the variable in that context. Patterns can either fail, succeed, or diverge. They diverge when an error is present (i.e., ⊥), fail if no pattern in an equation is matched, and succeed when at least one is (and the first is taken). Matching occurs left to right in an "equation," i.e., a sequence of patterns on a line of a function declaration, and top to bottom across lines.

As-patterns: you can attach aliases to patterns on the left side that can be reused on the right side:

```haskell
f (x:xs) = x:x:xs
f s@(x:xs) = x:s
```

Wild-cards: can use `_` to match against any input value.

Boolean guards: effectively define boolean cases for input values. The "usual" case can be thought of more as direct matches, whereas here we have more control over the exact conditioning being tested for a match:

```haskell
sign x | x > 0  = 1
       | x == 0 = 0
       | x < 0  = -1
```

Case expressions

All pattern matching thus far has been seen in the context of function definitions (which is general enough), but we don't always want to define a function to do this. Consider a set of equations like

```haskell
f p11 ... p1k = e1
...
f pn1 ... pnk = en
```

This is basically a case-switch kind of expression over k constraints (the usual case being k=1); we match one value against n cases and produce a value according to the matched case. We can move this functionality into a case expression, like

```haskell
f x1 x2 ... xk = case (x1, ..., xk) of
  (p11, ..., p1k) -> e1
  ...
  (pn1, ..., pnk) -> en
```

For instance, we can re-define take (although I didn't define it above; the before and after are both below):

```haskell
-- equation matching in func def
take 0 _      = []
take _ []     = []
take n (x:xs) = x : take (n-1) xs

-- same definition but with case expression
take m ys = case (m,ys) of
  (0,_)    -> []
  (_,[])   -> []
  (n,x:xs) -> x : take (n-1) xs
```

In this case I don't feel it really demonstrates the utility (we just get the same thing, and the first takes fewer lines), but the point is that the case matching can take place on the RHS, in an arbitrary scope, as part of the actual "functional code" we might use to actually define a function. That is to say: we can use pattern matching beyond just inside function definitions. Note that if statements can also be reduced to pattern matching via case expressions:

```haskell
case e1 of
  True  -> e2
  False -> e3

-- can be expressed with
if e1 then e2 else e3

-- which can still be seen as a function
-- if-then-else :: Bool -> a -> a -> a
```

Lazy patterns

Lazy patterns are of the form `~pat`, and are irrefutable, in that matching against a value will always succeed. (This has been a particularly slippery concept to wrap my head around, but I think I've got it down now.) Below I follow both the tutorial and an example from the wiki. There are a few lines of confusion to battle against here.

First, we said functions were non-strict earlier, meaning if I pass in a problematic value to a function that doesn't use it, it won't "break" the function and it'll evaluate without a hitch. This already feels like "laziness" in a sense: we only evaluate expressions/definitions at the time they're needed. But it's not lazy at the time the argument is matched, if it needs to be. For example, if I have `f (a,b) = g a b`, where f accepts a pair as input and splits up the items to apply g, Haskell will check for a pair constructor in the input before g can be applied.
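The pair-constructor check just described can be observed directly; a sketch where g is a hypothetical function of my own that ignores both of its arguments:

```haskell
-- g ignores its arguments entirely...
g :: a -> b -> Int
g _ _ = 7

-- ...but f still forces its argument to WHNF to expose the pair constructor.
f :: (a, b) -> Int
f (a, b) = g a b
```

`f (undefined, undefined)` succeeds (the pair constructor is present, and the components are never forced), while `f undefined` would diverge on the constructor check itself.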
That is, the elements of the input appear to be needed, so at the time of f's evaluation, we'll check that our input is a pair, and otherwise fail to match. This very check is what we're considering "not lazy enough." A lazy pattern like `f ~(a,b) = g a b`, however, includes an irrefutable pattern, and will match successfully on any input passed to f. That input may very well not be a pair, as desired, but we're explicitly saying we don't care…yet. You can basically treat the lazy definition like

```haskell
f p = g (fst p) (snd p)
```

where this is a legitimate non-lazy analog (p is not specific, yet we will treat it like it's a pair inside the function body). We just see some input p, and while we know this should be a pair, we don't care at the time of the function evaluation. You can think of `f p` as simply delaying the pair evaluation until we evaluate g. This is powerful because g itself may not care about the values of its inputs, in which case it can produce an output regardless of the actual values. Alternatively, g could also have lazy patterns for its inputs and defer the evaluation even further. In any case, the big point here is that lazy patterns let us pretend input values meet the structural checks until the whole expression actually needs to be evaluated.

I find the simplest motivating example to be the case where a constant function is buried inside arbitrarily many outer evaluations:

```haskell
c x = 1
aN p = c p
...
a2 p = a3 p
a1 p = a2 p
```

If I then want to call a1 for a particular value, I'll check that the value matches the (non-lazy) pattern p from `a1 p`, and then attempt to evaluate `a2 p`. I'll then do the same for a2, eventually getting to a3, and so on. Here I actually have to maintain a recursive stack and spend time unpacking each function until I reach c. If these were all lazy patterns, however, I can go straight to c, delaying all the evaluation steps I detailed above until the very latest moment.
And in this case, we find it doesn't even matter: my output will be 1 regardless of the input. Lazy patterns here allow my program to never need to even look at p to get that result, allowing it to "be aware" of computations it can skip (by virtue of us delaying the computation, until we get to a point where we can just toss out the whole expression). This sounds sensible enough, but it opens up two questions for me:

1. If this is so helpful, why aren't all patterns evaluated lazily?
2. Strict patterns seem purposeful in that they enforce the structure we declare the input should be. Don't we mostly want our program to stop if there's a structural inconsistency, even if it doesn't have a functional impact?

Note how these questions are mostly at odds with each other, but I suppose that underscores some of the confusion I have here. The above example was fairly contrived, and only a guess as to the actual behavior. Below is a more principled analysis from this very thorough breakdown of laziness.

Non-strictness vs laziness

We just mentioned how non-strictness is not quite what we mean by laziness. Non-strictness is the general property of Haskell programs that ensures expressions won't be evaluated until needed; "we evaluate as little as possible and delay evaluation as long as possible." But that characterization is loose in the sense that what we mean by "evaluate" and "need" are loose. For instance, we might check for the presence of a constructor to verify type correctness but not actually look at any values, at least not unless we need to. As I understand it, lazy evaluation is effectively a mechanism for enforcing a particularly strong notion of non-strictness.

Thunks

A thunk is effectively an unevaluated expression, and is the mechanism through which we'll represent lazy evaluation. Like before, we'll use an example with a pair that's pattern-matched on the LHS:

```haskell
let (x, y) = (length [1..5], reverse "olleh")
in ...
```
where we assume x and y are used somewhere beyond the in clause. Now here we might take `x=5` and `y="hello"`, but this involves resolving those expressions when binding the values. But we again don't do this until we actually need those values, somewhere after the in. Until then, we can call both x and y thunks: unevaluated expressions, "lying dormant" but able to be resolved at any moment. As before, note the pattern matching happening in the pair argument. If we instead have

```haskell
let z = (length [1..5], reverse "olleh")
in ...
```

Here z is a thunk, and we don't have to deconstruct it right away like we did with x and y before (although that's all that happened there, producing two thunks). We can be very explicit by including a pattern match on z:

```haskell
let z = (length [1..5], reverse "olleh")
    (n, s) = z
in ...
```

After the first line, z is still a thunk. But in line 2, we pattern match on z, requiring us to split it into two thunks, with `(n, s)` being `(<thunk>, <thunk>)`. To quote the article:

> The compiler thinks 'I better make sure that pattern does indeed match z, and in order to do that, I need to make sure z is a pair.' Be careful, though — we're not yet doing anything with the component parts (the calls to length and reverse), so they can remain unevaluated.

We can take this a step further with something like

```haskell
let z = (length [1..5], reverse "olleh")
    (n, s) = z
    'h':ss = s
in ...
```

which is the same as before, but now pattern matches on the second component of z, checking that it is a list with a head 'h', and attaches to ss the tail of the list.
Specifically at this stage we:

1. Evaluate s at a surface level to check it's a list (or a cons object), such that `s = <thunk> : <thunk>`, basically
2. Evaluate the first newly "revealed" thunk to check it's an 'h', leaving us with `'h':ss = 'h' : <thunk>`

In total, we see that Haskell values can be partially evaluated, and any one line (or just piece) of computation may require some minimum amount of needed evaluation (like with the pair or list checks, where we're "peeling" back some of the layers only to check for the needed structure), i.e., leaving as many nested components as thunks as possible.

How the value (4, [1, 2]) is evaluated, step by step

We have formal names for the "layers of evaluation" involved here. Any of the intermediate evaluation steps of a value is said to be in weak head normal form (WHNF), while a fully evaluated value is in normal form. Here we've laid some of the groundwork of laziness and how to think about layers of evaluation via thunks. Nothing is evaluated until it is needed, generally speaking (as in we're not even onto lazy patterns at this point). Interestingly, aside from a few I/O exceptions, pattern matching is the only place where Haskell values are evaluated; in the end, everything is left as a thunk until a pattern requires peeling back a layer to check the structure…and nowhere else does this thunk resolution take place.

Lazy and strict functions

Functions can be lazy or strict in an argument (and possibly differ across each of them). A function is said to be strict in an argument if it does some evaluation with it, evaluating it to at least WHNF, while being lazy in an argument if no evaluation takes place. A function is stricter than another if it evaluates an argument to a deeper level (of the same structure, say; general comparison is between barebones `f x` and `g x`). Performing evaluation is also called "forcing" a value in some cases. If we "force" (try to evaluate whatsoever) the value of undefined, our program will halt.
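One standard way to force a value to WHNF is the Prelude's seq, which evaluates its first argument to WHNF before returning its second. A sketch showing that a constructor application already is WHNF, so hidden components survive the force:

```haskell
-- The pair constructor is already exposed, so forcing to WHNF never
-- touches the undefined component.
whnfDemo :: String
whnfDemo = seq (1 + 2 :: Int, undefined :: Int) "survived"
```

By contrast, `seq undefined "survived"` would halt, since there is no constructor to stop at.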
So the following will yield errors:

```haskell
let (x, y) = undefined in x
length undefined
head undefined
```

Each of these does some checking on the values, trying to peel back one layer of thunks. But in doing so they encounter undefined, and halt. But a thunk that's "hiding" an undefined value can go undetected if we never try to evaluate it, and lazier settings like the following don't produce any errors:

```haskell
-- we just see our value is a pair, but not the values in the positions
let (x, y) = (4, undefined) in x

-- same sorta thing, we just see that we've got a list but don't look at the
-- cell values
length [undefined, undefined, undefined]
head (4 : undefined)
```

We can call a function f strict if (and only if) `f undefined` results in an error (implying that f will peek at the structure of its argument, evaluating at least one level, which is a problem). This is at least how we can determine strictness without actually knowing f's definition. It's worth noting there's a bit of a confusing nuance to really hit on with this definition². When we say a lazy function is one that doesn't evaluate its argument, we naturally mean this when evaluating f (otherwise we just have this unevaluated thunk `f x`). So given we're forcing `f x`, do we force x as a result? That is, if I need to "fully evaluate" `f x`, will I need to fully evaluate x to get there? If not, this basically entails that f's "real value" doesn't depend on x in any way, or it hides it inside another abstraction that doesn't need to be unpacked in order for `f x` to be in normal form. So laziness means we can get there without needing to think about x.

Lazy pattern matching

Now we've got a bit more context behind delayed evaluation and intermediate forms to motivate the explicit use of lazy patterns.
We already introduced them, but here's a simple example demonstrating their use once more:

```
Prelude> let f (x,y) = 1
Prelude> f undefined
*** Exception: Prelude.undefined

Prelude> let f ~(x,y) = 1
Prelude> f undefined
1
```

The first f is strict: it doesn't refer to x or y in its body, but it accepts a structured input (a pair) and we have to evaluate whatever's passed in to check it has a pair constructor. So when we pass in undefined, we are forced to check if that's a pair, and we end up evaluating undefined, which halts our program. But the `~` in the argument pattern in the second f delays this structural check, meaning f will just "take it on the chin" for whatever value we give it (even if it's not a pair). We simply wait until the last possible moment to use whatever that value is, rather than "gate checking" it on input. This means that `f undefined` doesn't cause any issues; we wait as long as possible to evaluate what we pass in, and in this case that turns out to be forever. So undefined goes by un-evaluated.

Scoping and nesting

There are two ways to create local, "block-like" bindings: let and where. let expressions facilitate local bindings scoped to a particular expression. For example

```haskell
let y   = a*b
    f x = (x+y)/y
in f c + f d
```

Here we're defining y and f locally (and note that f uses y), and then taking them as available names to construct the final expression `f c + f d`. `let ... in` can be used anywhere a typical expression can, so we could have the above in a function binding like

```haskell
normalizePair a b c d = let y   = a*b
                            f x = (x+y)/y
                        in f c + f d
```

where clauses are similar to let in that they facilitate local bindings, but they are crucially not expressions. where clauses are only allowed at the top level of a set of equations or case expression, and must "attach" to a declaration (function or pattern binding).
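The let-based normalizePair above runs as written once given a type; a sketch where the Double signature is my own choice:

```haskell
-- Local bindings y and f, scoped to the body expression after 'in'.
normalizePair :: Double -> Double -> Double -> Double -> Double
normalizePair a b c d =
  let y   = a * b
      f x = (x + y) / y
  in f c + f d
```

For example, `normalizePair 1 2 2 2` binds y = 2 locally and evaluates to 4.0.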
So here's an analog of the above that makes it look a whole lot like `let`:

```haskell
normalizePair a b c d = f c + f d
  where y   = a*b
        f z = (z+y)/y
```

This is okay because we're operating right at the function declaration here. But if we were working in some more deeply nested layers, we would have to use a `let` expression, since `where` does not produce a sub-expression. Think of the earlier `let` variant as actually producing a proper expression strictly on the RHS of the equals sign, whereas here the `where` is effectively attaching to the LHS and isn't valid in isolation. As in, if I just took

```haskell
f c + f d
  where y   = a*b
        f z = (z+y)/y
```

this would be invalid, not an expression, etc.; the `where` can't "see" any nearby function binding. Again, it's like `where` is just convenience sugar that comes coupled with declarations:

```haskell
pattern = expression where ...
```

The use of `where` is coupled with the presence of `pattern`; take the pattern out and you can't use it, basically. This is mostly convenient when you've got a function with guards and you need your locally bound variables across the guard constraints, like

```haskell
f x y | y > z  = ...
      | y == z = ...
      | y < z  = ...
  where z = x*x
```

Making `z` available to each constraint is something that `where` handles automatically. You can't use `let` here because this whole thing isn't an expression; it's a set of expressions. You could use `let`s inside the `...` on the RHS of each constraint, but that's painful. `where` gets to "sit above" the expression level and be a helper for declarations.

These constructs were a bit confusing at first because they just feel so particular. But as I slowly get used to them, it's becoming clear they just facilitate the typical composability of most non-functional languages when it comes to working with variables in scopes. That is, the really very basic ability to define some variables within a function and reuse them to build up other variables or terms. Those are very automatic in languages like Python, but require a bit more care in Haskell.
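To make the guard pattern above concrete, here's a small runnable variant (the function name and thresholds are invented for illustration):

```haskell
-- 'avg' is bound once in the where clause and shared by every guard,
-- which is precisely the sharing that 'let' can't provide here.
grade :: Double -> Double -> String
grade exam homework
  | avg >= 90 = "A"
  | avg >= 80 = "B"
  | otherwise = "needs work"
  where avg = (exam + homework) / 2

-- e.g. grade 95 91 evaluates to "A"
```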
A very basic example of `where` (with no guards, so `let` could be used here with similar effectiveness) drove this home a bit, feeling pretty familiar:

```haskell
areaTriangleTrig a b c = c * height / 2 -- use trigonometry
  where cosa   = (b ^ 2 + c ^ 2 - a ^ 2) / (2 * b * c)
        sina   = sqrt (1 - cosa ^ 2)
        height = b * sina
```

## Type classes

Type classes allow for the categorization of types for which particular behavior is defined. For example, if we think about an equality operator, i.e., `==`, we might want this to be broadly applicable for comparing values of many different types. We face a few practical challenges right away:

- Not all types are comparable. That is, types aren't required to admit some notion of equality, so we can expect `==` to remain undefined for some types.
- For types that support it, what equality means may differ heavily across those types. Here we expect `==` to be overloaded, such that the same operator can be used in each type context with whatever specific machinery we need to compare those values.

These are standard concerns for general operators. Haskell's type classes address this by effectively allowing us to declare interfaces that types can inherit. For instance, we can define the class of types that are "equatable" as

```haskell
class Eq a where
  (==) :: a -> a -> Bool
```

This says: a type `a` is an instance of the class `Eq` if it defines an operator `==` that compares values of type `a`. The function/operator `==` is considered a method of the class. We can "attach" types to this class with instance declarations, like

```haskell
instance Eq Integer where
  x == y = x `integerEq` y
```

This can basically be taken to mean we're "enrolling" the type `Integer` in the `Eq` class by fulfilling the template/interface required by `Eq` (where `integerEq` is assumed to be a sufficient comparison function).

### Contexts

With access to a type class, we can use contexts to more tightly bind polymorphic type expressions, quantifying only over types belonging to that class.
For example, we can now generally refer to our `==` operator as having the type

```haskell
(==) :: (Eq a) => a -> a -> Bool
```

which says that `==` is a generic function only over types that are instances of the class `Eq`. The notation `(<class> <type-var>) => <type-expr>` is how we generally capture this: `(C t) => E` bounds the type variable `t`, as it appears in the type expression `E`, to belong to the type class `C`. This actually feels very natural, and closes the loop on a lot of confusing syntax I've seen up to this point. This further feels like bounded universal quantification as we've studied it in broader type theory: we can express types that are quantified over bounded type variables, and those bounds are facilitated by type classes.

Contexts can be used generally in type expressions, including in other class definitions:

```haskell
class (Eq a) => Ord a where
  (<), (<=), (>=), (>) :: a -> a -> Bool
  max, min             :: a -> a -> a
```

This defines a new class `Ord a`, where the type variable `a` is bounded to itself be an instance of the class `Eq`. `Ord` is considered a subclass of `Eq`: we're saying it is defined only over types that belong to that class, and is thus comparatively more specific. One can also express multiple inheritance, like

```haskell
class (Eq a, Show a) => C a where ...
```

One can use class constraints (i.e., contexts) within the method definitions of a class on any type variable except the one bound at the class level:

```haskell
class C a where
  m :: Show b => a -> b
```

This is quite natural: we're just saying that `a` can't be further restricted by `m`'s type expression (e.g., by doing something like `m :: Show a => a -> b`). That is, we have no power to restrict `a` at that stage; if we wanted that, we'd need to move the constraint into the class declaration (e.g., `class (Show a) => C a`).

### More on contexts

For a moment, restriction with contexts feels like we're bringing in more than just types to our quantified type variable.
With something like

```haskell
(Eq a) => a -> a -> Bool
```

I'm saying I want a type that belongs to `Eq`, which means I've got a defined method along with whatever type `a` shows up. That feels odd, like I've got a term-type bundle that I'm quantifying over. This confused me at first, since it feels like I'm magically able to enforce a behavioral constraint. That isn't to say that's a bad thing: there's lots of freedom there. But I struggled to draw up a pure type theoretic analogy.

The thing is: it's not really a behavioral constraint, and what we're doing can be rather trivially expressed as simply bringing the method types into the type expression. For example, we can alternatively write the above as

```haskell
(a -> a -> Bool) -> a -> a -> Bool
```

That is, my context `(Eq a) =>` is merely "bringing along" the requirement that we have another defined term of a certain type. A given type class only "enforces" any of its methods up to their types anyway: we don't actually have some behavioral, term-level declaration. So any time I use a type class as a context, I can just think about explicitly bringing the type signatures of each of its methods into the type expression I'm constraining.

Since our example here coincides with the type we assigned to `==` globally, we can run with that to demonstrate further:

- Writing `(==) :: (Eq a) => a -> a -> Bool` can be directly interpreted as saying `==` is a function operating on two values of an equatable type and returning a `Bool`.
- Writing `(==) :: (a -> a -> Bool) -> a -> a -> Bool` says the exact same thing as above, but it makes clear that whatever type `a` we're working with brings its own notion of equality, in the form of the `(a -> a -> Bool)` function (which is all that was initially meant by forcing an equatable type with `Eq a`, so same difference).

This makes clear a few things:

- Our type variable `a`, when constrained by contexts, is still just a type variable in the usual sense.
- The method "baggage" we're requiring can be seen as something extra, yes, but it's extra in the type expression, not "in" `a` itself.
- We're basically seeing how `==` is really just a "wrapper" generic function that does nothing but call underlying methods. It takes two values of the same type, along with a method that operates on values of that type (the type-specific equality method), and plugs those values in.

This is how we get ad hoc polymorphism (i.e., overloading) via parametric polymorphism: we define fully generic functions, but require calls on values of a particular type to supply methods that work with that type, such that our outer generic function can delegate all interaction to those methods. This is basically just giving us a way to manually define or "inject" type-specific function behavior on a type-by-type basis, which is of course what we observe when overloading operators in most other languages. The point is that, in Haskell, we have to build that up using proper generics (i.e., parametric polymorphism).

## Type constructors

Recall our parametric Point type from before: `data Point a = Pt a a`. `Point` is a type constructor: it takes a type `a` and produces another type `Point a`. So `Point a` is concrete, first-order (for some `a`), i.e., values can inhabit this type. But `Point` alone, as a "higher-order" type, is not first-order, and there aren't any values in Haskell that can inhabit it. To be clear, these are both concrete values in type space (i.e., types); they just have different kinds:

```haskell
Point Int : *
Point     : * -> *
```

We're basically saying that a type needs to have kind `*` in order to be inhabitable by a value. That is, only first-order types can be inhabited by a value; it's fairly nonsensical to suggest that `Point` could be inhabited by a value.
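We can ask GHCi to confirm these kinds directly (the session below is illustrative; recent GHC versions print `Type` where older ones print `*`):

```haskell
Prelude> data Point a = Pt a a
Prelude> :kind Point
Point :: * -> *
Prelude> :kind Point Int
Point Int :: *
Prelude> :kind Int
Int :: *
```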
Note that both `Point` and `Point Int` inhabit kinds; it's just that only `Point Int` is a construct that can then further be inhabited by a value, while `Point` is "too abstract" (we first need to plug in another type to yield a first-order type).

### Digression: Sorts and universes

…Why is that? In what way is `Point` too abstract? It inhabits a kind just like `Point a` (for any `a`); is that not enough to just call them both types? This is a good place to have a discussion regarding a critical distinction in how we categorize terms, types, and higher-order sorts. This has been a bit of a pain point throughout my type theory "journey," and the rubber seemingly meets the road now that we're discussing these items in a more concrete context like Haskell.

The key elements I want to highlight here crop up as we start thinking about where type constructors "live." Above, we just said both `Point Int` and `Point` are "types," since they both are concrete objects living in type space. They're not at the term level, and they're not kinds: they're sandwiched in between. Nevertheless, calling them both types isn't exactly how we'd word it in everyday use, and it's pretty much outright wrong in a Haskell context, given that `Point` can't be inhabited by a value in the usual sense.

Now, the fact that `Point` can't be inhabited by a term/value makes intuitive sense: it's a function that operates on types. How could you even build a term that makes sense here? Of course, you can quantify over `Point`'s type variable, binding it so that the expression becomes closed and represents something concrete, but it otherwise just includes a free type variable that's hard to make sense of.

So how do we formalize the distinction here? What I mean by that is both `Point` and `Point a` are in this "type realm" like we've said; we're not reaching for kinds or higher-order sorts.
In a type theoretic sense we can call them both just types, similar to how we can still call a lambda abstraction a term even though it abstracts over terms (and type constructors like `Point` abstract over types). Universes are what help us further distinguish these items (in the Martin-Löf sense). Universes help capture what we mean by higher- or lower-order types. A "higher-order type" doesn't mean we're referring to kinds (as in a "higher-order sort," jumping from the classification "type" to "kind"), but instead to abstraction over types. In particular, universes represent a hierarchy of types, such that each step up the ladder of universes implies abstraction over the items from the previous universe.

```
Type₀ : Type₁
Type₁ : Type₂
...

Point     : Type₁    -- because Point : Type₀ → Type₀
Point Int : Type₀
Pt 3 4    : Point Int
3         : Int
```

Here $\text{Type}_N$, or $\mathcal{U}_n$, refers to the $n$-th universe type. Note that we somewhat overload our usual `:` in that we technically have types on both sides (and not a type on the LHS, kind on the RHS). This is okay given we think of these universe types as collections of certain kinds of types, which therefore represent a higher-order construction; something like `T : Type₀` says that the type `T` is in the 0-th universe.

How do we know something like `Type₀` is itself a type, though? Part of me is tempted to simply say: whatever it is, it could be a kind. The thing is, we construct this hierarchy by saying the universe `Type₀` is an element in the universe `Type₁`, and we don't leave "type world" to do this. So while the kind `*` certainly captures all nullary type constructors (first-order types), the kind itself is beyond types altogether, and can't exactly be taken as a type in a higher type universe like `Type₁`. But we're doing that in spirit: we can just think of it more like lugging the whole collection of `Type₀` types with us such that it becomes a new primitive in our higher universe.
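A language with first-class universes, like Lean, lets us check these judgments directly; a small sketch (Lean 4 syntax, where `Type` abbreviates `Type 0`):

```lean
-- Terms inhabit first-universe types; those types inhabit Type 0.
#check (3 : Nat)
#check (Nat : Type)        -- i.e. Nat : Type 0

-- Each universe is itself a type in the next universe up.
#check (Type : Type 1)
#check (Type 1 : Type 2)

-- A unary type constructor, analogous to our Point, sits in Type 1,
-- because its type is Type → Type (and Type → Type : Type 1).
#check (List : Type → Type)
```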
(I'd like a little more on this; really getting why a universe type can still be a type in the usual sense. Probably the connection to abstraction over types, as in type constructors, is what can be convincing here: if $\text{Type}_0$ is similar enough to a type constructor, and the latter is a type, then the former can reasonably be one as well.)

To be clear: universes simply group up types, breaking up type space into "orders" of types. First-order types, i.e., those belonging to `Type₀`, are the only ones that can be inhabited by values. Other higher-order types can then simply be seen as having some "impassable" universe layers separating them from term space. A type in universe $N$ can therefore only be inhabited by objects in the $N-1$ universe, and for all but `Type₀`, those inhabiting objects are still themselves types. This gives us a meaningful basis for saying a phrase like "type constructors are inhabited by first-order types": if something is going to inhabit a type "that abstract," it can only be a "more concrete" type. Doing this for arbitrarily high-order types eventually gets us to a "most concrete" type, after which the inhabiting object simply becomes a term (although one can naturally still think of terms as values in a universe, where a concrete type is simply a set of its possible inhabiting terms).

I've pondered this for a considerable amount of time, struggling to really grok the idea of letting a collection of types, as represented by $\text{Type}_0$, itself be a type. It simply didn't track for quite some time: it just feels like a kind, and I don't like that we seem to ignore this. In fact, it seems very clear when we let universes be a partition of type space, in the sense that each universe is a collection of types bundling up a "new group" of types built on top of the last group.
This tracks with the idea of universes building up increasingly abstract type constructors, and aligns nicely with the notion of "unions of kinds." The problem with this is that each universe is a term in the next one. That is, the thing we're using to reference a collection of types is now just a primitive term in the next universe up. That bothered me for a long time. I just didn't get what the thing was supposed to now be in the next universe. $\mathcal{U}_0$ as a type is sensible: I can think of it like a set with elements inside, and if we liken $:$ to $\in$, then something like $\text{Integer} : \mathcal{U}_0$ tracks just fine. But $\mathcal{U}_0$ as a standalone term annoys me, and it no longer feels right.

The interpretation that helps me here is to allow $\mathcal{U}_0$ to be a building block for terms on this new "plane." You can even liken it to `*` as a kind, in the way we use `*` in kind expressions to abstract over types. Whatever work the symbol `*` is doing, we're basically letting the reference to $\mathcal{U}_0$ (as a term) do that same work. And you can start building other terms with it, e.g., $\mathcal{U}_0 \rightarrow \mathcal{U}_0$, which is really not any different from what we mean by $* \rightarrow *$.

Note how while $\mathcal{U}_1 \rightarrow \mathcal{U}_1$ would be in $\mathcal{U}_2$ (a new universe), it can still be related to a kind like $(* \rightarrow \cdots \rightarrow *) \rightarrow (* \rightarrow \cdots \rightarrow *)$ (i.e., a map from an $n$-arity type constructor to an $m$-arity type constructor). That is, higher universes don't correspond to higher sorts; it's not like we have to leave "kind space" to start representing universes beyond $\mathcal{U}_1$. Instead, universes are basically just convenient ways to refer to all types with a certain level of abstraction (or below), and we can canonically think of those things as still having some ascribable kinds.
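To ground the kind side of this correspondence, here are a few familiar constructors and the kinds GHCi reports for them (all inhabitants of the first "constructor universe" in the above framing; `*` may render as `Type` in recent GHC):

```haskell
Prelude> :kind Maybe
Maybe :: * -> *
Prelude> :kind Either
Either :: * -> * -> *
Prelude> :kind Either Int
Either Int :: * -> *
```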
Universes also group up types with a "broader brush": $\mathcal{U}_1$, for instance, basically groups up all first-order type constructors. We don't have a convenient way to refer to all types that meet that description with kinds; we have to write out $* \rightarrow *$, $* \rightarrow * \rightarrow *$, etc. to capture the notion of arbitrary first-order arity. All types with that structure inhabit $\mathcal{U}_1$, however, so we get a single term we can now use to refer to them. We make a jump to a new universe where that single "smaller" universe reference is a new term, and build some new higher-order terms from there. This is again just like assigning a name to some first-order kinds, and building outer higher-order kinds (like the example above) that nest the first-order ones inside.

Another thing that might help, if not pretty much implied by the above: we effectively "reuse" the notion of typing at each new universe. It's odd to initially start representing collections of types as types themselves. We never claim to leave type space, but we wrap lower universes back around to be terms that are categorized by higher universes. We just reuse the notion of type inhabitance each time to capture that relationship: each time we get some fundamentally new thing we can work with as a term to build up other terms in this new universe.

As an aside, tying back to type constructors: a unary constructor like `Point` is exactly the shape the `Functor` class abstracts over (note the data constructor is `Pt`, with two fields, per our earlier declaration):

```haskell
class Functor f where
  fmap :: (a -> b) -> f a -> f b

instance Functor Point where
  fmap f (Pt x y) = Pt (f x) (f y)
```

- www.haskell.org/tutorial/goodies.html
- wiki.haskell.org/Lazy_pattern_match
- en.wikibooks.org/wiki/Haskell/Laziness

---

1. This may already be quite clear, but to reinforce this intuition even more (because I find it important): `ones` here is not some object that we find ourselves prepending a 1 to. When looking at the RHS, I find myself thinking that `ones` could technically be some arbitrary value, and we just need to find one that makes the equation work.
In a sense that's perfectly okay, but I think there's too much mental freedom in that read on what's happening here. I think it's better to be very clear that `ones` is born entirely out of the extra structure used in that equation. A similar point can be made with the general `fix` setting we explored in Typed lambda calculus, e.g., for factorial:

```
G(fact) = λ(n: Int). if n = 0 then 1 else n * fact(n-1)
```

Here `fact` isn't some construct that we find ourselves putting into that equation, "plugging it in" and checking if it works. No; it is completely defined by that equation. It is not a separate thing in any meaningful sense. The same applies to `ones` above: it is the thing that internalizes, infinitely, the action of prepending a 1. This note is really just a reminder not to treat the equation and the term so separately, since I seem to have that tendency in my latest re-reading of this material. Recursive terms like these are nothing but sponges that must absorb whatever structure shows up around them in the definition. ↩︎

2. I say confusing because I initially took this to mean something different based on the wording in the Wikibooks article. The article says:

   > Often, we only care about WHNF, so a function that evaluates its argument to at least WHNF is called strict and one that performs no evaluation is lazy.

   This confused me because it makes laziness sound like something we can see in the function body, and whether it uses that input in a particular way. As in, whether we explicitly take an evaluation step inside the function's definition. The `id` function sounds like it's lazy under this definition: it doesn't evaluate `x` in the body, it just returns what it was given. While this definition is a little sloppy, my interpretation is what's wrong. Whatever evaluation may take place in `f`'s body, it won't take place until `f` itself is evaluated. So our quoted line is exactly the same thing as the "refined" statement: given that we're forcing `f x`, does `x` get forced as a result?
   That is to say, once we actually try to make the term `f x` a concrete value, strictness means we'll need to take `x` to a concrete value in order to get there. Another hint that the first line doesn't make sense: just above we said evaluation only takes place during pattern matching. So unless `f` does some pattern matching when checking input values (in which case it's obviously strict, regardless of how we're interpreting our definition), the only evaluation that can take place in its body is if it passes that input off to another function call which needs to do some pattern matching. Point being, there's no other canonical way for evaluation to even take place in the function body: we either evaluate when checking the input, or call another function that does. So if our function `f` doesn't pattern match on input, then our first definition is likening strictness to simply calling a function internally that does pattern matching. That would be a pretty shallow notion of strictness, i.e., strict functions are those that call functions that pattern match; `id` would not be strict under such a definition. ↩︎
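As a runnable companion to the `ones` point in the first footnote (`fix` here is `Data.Function.fix` from base; the side-by-side framing is my own illustration):

```haskell
import Data.Function (fix)

-- 'ones' is wholly determined by its defining equation: the value that
-- prepending a 1 maps back to itself.
ones :: [Integer]
ones = 1 : ones

-- The same value, written as an explicit fixed point of the
-- "prepend 1" step, mirroring the G(fact) framing above.
ones' :: [Integer]
ones' = fix (1 :)

-- Laziness lets us observe finite prefixes of either:
--   take 5 ones   -- [1,1,1,1,1]
--   take 5 ones'  -- [1,1,1,1,1]
```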