From 41007fce2a333ba78be703c35f4afccade3369e5 Mon Sep 17 00:00:00 2001 From: Thomas Letan Date: Sun, 12 Jul 2020 10:25:42 +0200 Subject: New article on Algebraic Datatypes --- site/posts/AlgebraicDatatypes.v | 678 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 678 insertions(+) create mode 100644 site/posts/AlgebraicDatatypes.v (limited to 'site/posts/AlgebraicDatatypes.v') diff --git a/site/posts/AlgebraicDatatypes.v b/site/posts/AlgebraicDatatypes.v new file mode 100644 index 0000000..b5a786f --- /dev/null +++ b/site/posts/AlgebraicDatatypes.v @@ -0,0 +1,678 @@ +(** * Proving Algebraic Datatypes are “Algebraic” *) + +(** Several programming languages allow programmers to define (potentially + recursive) custom types, by composing together existing ones. For instance, + in OCaml, one can define lists as follows: + +<< +type 'a list = +| Cons of 'a * 'a list +| Nil +>> + + This translates in Haskell as + +<< +data List a = + Cons a (List a) +| Nil +>> + + In Rust: + +<< +enum List { + Cons(A, Box), + Nil, +} +>> + + In Coq: + +<< +Inductive list a := +| cons : a -> list a -> list a +| nil +>> + + And so forth. + + Each language will have its own specific constructions, and the type systems + of OCaml, Haskell, Rust and Coq —to only cite them— are far from being + equivalent. That being said, they often share a common “base formalism,” + usually (and sometimes abusively) referred to as _algebraic datatypes_. This + expression is used because under the hood any datatype can be encoded as a + composition of types using two operators: sum ([+]) and product ([*]) for + types. + + - [a + b] is the disjoint union of types [a] and [b]. Any term of [a] + can be injected into [a + b], and the same goes for [b]. Conversely, + a term of [a + b] can be projected into either [a] or [b]. + - [a * b] is the Cartesian product of types [a] and [b]. Any term of [a * + b] is made of one term of [a] and one term of [b] (remember tuples?). + + For an algebraic datatype, one constructor allows for defining “named + tuples”, that is ad-hoc product types. Besides, constructors are mutually + exclusive: you cannot define the same term using two different constructors. + Therefore, a datatype with several constructors is reminescent of a disjoint + union. Coming back to the [list] type, under the syntactic sugar of + algebraic datatypes, the [list α] type is equivalent to [unit + α * list α], + where [unit] models the [nil] case, and [α + list α] models the [cons] case. + + The set of types which can be defined in a language together with [+] and + [*] form an “algebraic structure” in the mathematical sense, hence the + name. It means the definitions of [+] and [*] have to satisfy properties + such as commutativity or the existence of neutral elements. In this article, + we will prove some of them in Coq. More precisely, + + - [+] is commutative, that is ##\forall (x, y),\ x + y + = y + x## + - [+] is associative, that is ##\forall (x, y, z),\ (x + + y) + z = x + (y + z)## + - [+] has a neutral element, that is ##\exists e_s, + \ \forall x,\ x + e_s = x## + - [*] is commutative, that is ##\forall (x, y),\ x * y + = y * x## + - [*] is associative, that is ##\forall (x, y, z),\ (x + * y) * z = x * (y * z)## + - [*] has a neutral element, that is ##\exists e_p, + \ \forall x,\ x * e_p = x## + - The distributivity of [+] and [*], that is ##\forall + (x, y, z),\ x * (y + z) = x * y + x * z## + - [*] has an absorbing element, that is ##\exists e_a, + \ \forall x, \ x * e_a = e_a## + + For the record, the [sum] and [prod] types are defined in Coq as follows: + +<< +Inductive sum (A B : Type) : Type := +| inl : A -> sum A B +| inr : B -> sum A B + +Inductive prod (A B : Type) : Type := +| pair : A -> B -> prod A B +>> + + #
site/posts/AlgebraicDatatypes.v
# + + #