Diffstat (limited to 'site/posts')
44 files changed, 6734 insertions, 5003 deletions
diff --git a/site/posts/AlgebraicDatatypes.md b/site/posts/AlgebraicDatatypes.md
new file mode 100644
index 0000000..baeec43
--- /dev/null
+++ b/site/posts/AlgebraicDatatypes.md
@@ -0,0 +1,712 @@
---
published: 2020-07-12
modified: 2023-05-07
tags: ['coq']
abstract: |
  The set of types which can be defined in a language together with $+$ and
  $*$ form an “algebraic structure” in the mathematical sense, hence the
  name. It means the definitions of $+$ and $*$ have to satisfy properties
  such as commutativity or the existence of neutral elements. In this
  article, we prove the `sum`{.coq} and `prod`{.coq} Coq types satisfy these
  properties.
---

# Proving Algebraic Datatypes are “Algebraic”

Several programming languages allow programmers to define (potentially
recursive) custom types by composing together existing ones. For instance, in
OCaml, one can define lists as follows:

```ocaml
type 'a list =
| Cons of 'a * 'a list
| Nil
```

This translates into Haskell as

```haskell
data List a =
  Cons a (List a)
| Nil
```

into Rust as

```rust
enum List<A> {
    Cons(A, Box<List<A>>),
    Nil,
}
```

or into Coq as

```coq
Inductive list a :=
| cons : a -> list a -> list a
| nil.
```

And so forth.

Each language has its own specific constructions, and the type systems of
OCaml, Haskell, Rust and Coq (to cite only them) are far from being
equivalent. That being said, they often share a common “base formalism,”
usually (and sometimes abusively) referred to as *algebraic datatypes*. This
expression is used because, under the hood, any datatype can be encoded as a
composition of types using two operators: sum ($+$) and product ($*$) for
types.

 - $a + b$ is the disjoint union of types $a$ and $b$. Any term of $a$
   can be injected into $a + b$, and the same goes for $b$. Conversely,
   a term of $a + b$ can be projected into either $a$ or $b$.
 - $a * b$ is the Cartesian product of types $a$ and $b$.
   Any term of $a * b$ is made of one term of $a$ and one term of $b$
   (think tuples).

For an algebraic datatype, one constructor allows for defining “named
tuples,” that is, ad hoc product types. Besides, constructors are mutually
exclusive: you cannot define the same term using two different constructors.
Therefore, a datatype with several constructors is reminiscent of a disjoint
union. Coming back to the $\mathrm{list}$ type, under the syntactic sugar of
algebraic datatypes, we can define it as

$$\mathrm{list}_\alpha \equiv \mathrm{unit} + \alpha * \mathrm{list}_\alpha$$

where $\mathrm{unit}$ models the `nil`{.coq} case, and $\alpha * \mathrm{list}_\alpha$
models the `cons`{.coq} case.

The set of types which can be defined in a language, together with $+$ and
$*$, forms an “algebraic structure” in the mathematical sense, hence the
name. It means the definitions of $+$ and $*$ have to satisfy properties
such as commutativity or the existence of neutral elements. In this article,
we will prove some of them in Coq.
More precisely,

 - $+$ is commutative, that is $\forall (x, y),\ x + y = y + x$
 - $+$ is associative, that is $\forall (x, y, z),\ (x + y) + z = x + (y + z)$
 - $+$ has a neutral element, that is $\exists e_s,\ \forall x,\ x + e_s = x$
 - $*$ is commutative, that is $\forall (x, y),\ x * y = y * x$
 - $*$ is associative, that is $\forall (x, y, z),\ (x * y) * z = x * (y * z)$
 - $*$ has a neutral element, that is $\exists e_p,\ \forall x,\ x * e_p = x$
 - $*$ distributes over $+$, that is
   $\forall (x, y, z),\ x * (y + z) = x * y + x * z$
 - $*$ has an absorbing element, that is $\exists e_a,\ \forall x,\ x * e_a = e_a$

For the record, the `sum`{.coq} ($+$) and `prod`{.coq} ($*$) types are defined
in Coq as follows:

```coq
Inductive sum (A B : Type) : Type :=
| inl : A -> sum A B
| inr : B -> sum A B

Inductive prod (A B : Type) : Type :=
| pair : A -> B -> prod A B
```

## An Equivalence for `Type`{.coq}

Algebraic structures come with *equations* expected to be true. This means
there is an implicit dependency which is, in my opinion, too easily
overlooked: the definition of $=$. In Coq, `=`{.coq} is a built-in relation
that states that two terms are “equal” if they can be reduced to the same
“hierarchy” of constructors. This is too strong in the general case, and in
particular for our study of algebraic structures of `Type`{.coq}. It is clear
that, in Coq’s opinion, $\alpha + \beta$ is not structurally *equal* to
$\beta + \alpha$, yet we will have to prove they are “equivalent.”

### Introducing `type_equiv`{.coq}

Since `=`{.coq} for `Type`{.coq} is not suitable for reasoning about algebraic
datatypes, we introduce our own equivalence relation, denoted `==`{.coq}. We
say two types $\alpha$ and $\beta$ are equivalent up to an isomorphism
when for any term of type $\alpha$, there exists a counterpart term of type
$\beta$, and vice versa.
In other words, $\alpha$ and $\beta$ are equivalent if
we can exhibit two functions $f$ and $g$ such that

$$\forall (x : \alpha),\ x = g(f(x))$$

$$\forall (y : \beta),\ y = f(g(y))$$

This translates into the following inductive type.

```coq
Reserved Notation "x == y" (at level 72).

Inductive type_equiv (α β : Type) : Prop :=
| mk_type_equiv (f : α -> β) (g : β -> α)
    (equ1 : forall (x : α), x = g (f x))
    (equ2 : forall (y : β), y = f (g y))
  : α == β
where "x == y" := (type_equiv x y).
```

As mentioned earlier, we prove two types are equivalent by exhibiting
two functions, and proving these functions satisfy two properties. We
introduce an `Ltac` notation to that end.

```coq
Tactic Notation "equiv" "with" uconstr(f) "and" uconstr(g)
  := apply (mk_type_equiv f g).
```

The tactic `equiv with f and g`{.coq} turns a goal of the form `α ==
β`{.coq} into two subgoals to prove that `f` and `g` form an isomorphism.

### `type_equiv`{.coq} is an Equivalence

We can prove it by demonstrating it is

1. Reflexive,
2. Symmetric, and
3. Transitive.

#### `type_equiv`{.coq} is reflexive

This proof is straightforward. A type $\alpha$ is equivalent to itself because

$$\forall (x : \alpha),\ x = id(id(x))$$

```coq
Lemma type_equiv_refl (α : Type) : α == α.
Proof.
  now equiv with (@id α) and (@id α).
Qed.
```

#### `type_equiv`{.coq} is symmetric

If `α == β`{.coq}, then we know there exist two functions $f$ and $g$ which
satisfy the expected properties. We can “swap” them to prove that `β ==
α`{.coq}.

```coq
Lemma type_equiv_sym {α β} (equ : α == β) : β == α.
Proof.
  destruct equ as [f g equ1 equ2].
  now equiv with g and f.
Qed.
```

#### `type_equiv`{.coq} is transitive

If `α == β`{.coq}, we know there exist two functions $f_\alpha$ and
$g_\beta$ which satisfy the expected properties of `type_equiv`{.coq}.
Similarly, because `β == γ`{.coq},
we know there exist two additional functions $f_\beta$ and $g_\gamma$. We can
compose these functions together to prove `α == γ`{.coq}.

As a reminder, composing two functions $f$ and $g$ (denoted by `f >>> g`{.coq}
hereafter) consists in using the result of $f$ as the input of $g$.

```coq
Infix ">>>" := (fun f g x => g (f x)) (at level 70).

Lemma type_equiv_trans {α β γ} (equ1 : α == β) (equ2 : β == γ)
  : α == γ.
Proof.
  destruct equ1 as [fα gβ equαβ equβα],
           equ2 as [fβ gγ equβγ equγβ].
  equiv with (fα >>> fβ) and (gγ >>> gβ).
  + intros x.
    rewrite <- equβγ.
    now rewrite <- equαβ.
  + intros x.
    rewrite <- equβα.
    now rewrite <- equγβ.
Qed.
```

#### Conclusion

The Coq standard library introduces the `Equivalence`{.coq} type class. We can
provide an instance of this type class for `type_equiv`{.coq}, using the three
lemmas we have proven in this section.

```coq
#[refine]
Instance type_equiv_Equivalence : Equivalence type_equiv :=
  {}.

Proof.
  + intros x.
    apply type_equiv_refl.
  + intros x y.
    apply type_equiv_sym.
  + intros x y z.
    apply type_equiv_trans.
Qed.
```

### Examples

#### `list`{.coq}’s canonical form

We now come back to our initial example, given in the introduction of this
write-up. We can prove our assertion, that is, `list α == unit + α * list
α`{.coq}.

```coq
Lemma list_equiv (α : Type)
  : list α == unit + α * list α.
Proof.
  equiv with (fun x => match x with
                       | [] => inl tt
                       | x :: rst => inr (x, rst)
                       end)
        and (fun x => match x with
                      | inl _ => []
                      | inr (x, rst) => x :: rst
                      end).
  + now intros [| x rst].
  + now intros [[] | [x rst]].
Qed.
```

#### `list`{.coq} is a morphism

This means that if `α == β`{.coq}, then `list α == list β`{.coq}. We prove this
by defining an instance of the `Proper`{.coq} type class.

```coq
Instance list_Proper
  : Proper (type_equiv ==> type_equiv) list.
Proof.
  add_morphism_tactic.
  intros α β [f g equαβ equβα].
  equiv with (map f) and (map g).
  all: setoid_rewrite map_map; intros l.
  + replace (fun x : α => g (f x))
      with (@id α).
    ++ symmetry; apply map_id.
    ++ apply functional_extensionality.
       apply equαβ.
  + replace (fun x : β => f (g x))
      with (@id β).
    ++ symmetry; apply map_id.
    ++ apply functional_extensionality.
       apply equβα.
Qed.
```

The use of the `Proper`{.coq} type class allows for leveraging hypotheses of
the form `α == β`{.coq} with the `rewrite` tactic. I personally consider
providing instances of `Proper`{.coq} whenever possible to be a good
practice, and would encourage any Coq programmer to do so.

#### `nat`{.coq} is a special-purpose `list`{.coq}

Did you notice that a natural number is nothing more than a list of
`unit`{.coq}, where `S`{.coq} plays the role of `cons`{.coq}? Now, using
`type_equiv`{.coq}, we can prove it!

```coq
Lemma nat_and_list : nat == list unit.
Proof.
  equiv with (fix to_list n :=
                match n with
                | S m => tt :: to_list m
                | _ => []
                end)
        and (fix of_list l :=
               match l with
               | _ :: rst => S (of_list rst)
               | _ => 0
               end).
  + induction x; auto.
  + induction y; auto.
    rewrite <- IHy.
    now destruct a.
Qed.
```

#### Non-empty lists

We can introduce a variant of `list`{.coq} which contains at least one element
by modifying the `nil`{.coq} constructor so that it takes one argument instead
of none.

```coq
Inductive non_empty_list (α : Type) :=
| ne_cons : α -> non_empty_list α -> non_empty_list α
| ne_singleton : α -> non_empty_list α.
```

We can demonstrate the relation between `list`{.coq} and
`non_empty_list`{.coq}, which reveals an alternative implementation of
`non_empty_list`{.coq}. More precisely, we can prove that `forall (α : Type),
non_empty_list α == α * list α`{.coq}. It is a bit more cumbersome, but not
that much. We first define the conversion functions, then prove they satisfy
the properties expected by `type_equiv`{.coq}.
```coq
Fixpoint non_empty_list_of_list {α} (x : α) (l : list α)
  : non_empty_list α :=
  match l with
  | y :: rst => ne_cons x (non_empty_list_of_list y rst)
  | [] => ne_singleton x
  end.

#[local]
Fixpoint list_of_non_empty_list {α} (l : non_empty_list α)
  : list α :=
  match l with
  | ne_cons x rst => x :: list_of_non_empty_list rst
  | ne_singleton x => [x]
  end.

Definition prod_list_of_non_empty_list {α} (l : non_empty_list α)
  : α * list α :=
  match l with
  | ne_singleton x => (x, [])
  | ne_cons x rst => (x, list_of_non_empty_list rst)
  end.

Lemma ne_list_list_equiv (α : Type)
  : non_empty_list α == α * list α.
Proof.
  equiv with prod_list_of_non_empty_list
        and (prod_curry non_empty_list_of_list).
  + intros [x rst|x]; auto.
    cbn.
    revert x.
    induction rst; intros x; auto.
    cbn; now rewrite IHrst.
  + intros [x rst].
    cbn.
    destruct rst; auto.
    change (non_empty_list_of_list x (α0 :: rst))
      with (ne_cons x (non_empty_list_of_list α0 rst)).
    replace (α0 :: rst)
      with (list_of_non_empty_list
              (non_empty_list_of_list α0 rst)); auto.
    revert α0.
    induction rst; intros y; [ reflexivity | cbn ].
    now rewrite IHrst.
Qed.
```

## The `sum`{.coq} Operator

### `sum`{.coq} Is a Morphism

This means that if `α == α'`{.coq} and `β == β'`{.coq}, then `α + β == α' +
β'`{.coq}. To prove this, we compose together the functions whose existence is
implied by `α == α'`{.coq} and `β == β'`{.coq}. To that end, we introduce the
auxiliary function `lr_map_sum`{.coq}.

```coq
Definition lr_map_sum {α β α' β'} (f : α -> α') (g : β -> β')
    (x : α + β)
  : α' + β' :=
  match x with
  | inl x => inl (f x)
  | inr y => inr (g y)
  end.
```

Then, we prove `sum`{.coq} is a morphism by defining a `Proper`{.coq} instance.

```coq
Instance sum_Proper
  : Proper (type_equiv ==> type_equiv ==> type_equiv) sum.
Proof.
  add_morphism_tactic.
  intros α α' [fα gα' equαα' equα'α]
         β β' [fβ gβ' equββ' equβ'β].
  equiv with (lr_map_sum fα fβ)
        and (lr_map_sum gα' gβ').
  + intros [x|y]; cbn.
    ++ now rewrite <- equαα'.
    ++ now rewrite <- equββ'.
  + intros [x|y]; cbn.
    ++ now rewrite <- equα'α.
    ++ now rewrite <- equβ'β.
Qed.
```

### `sum`{.coq} Is Commutative

```coq
Definition sum_invert {α β} (x : α + β) : β + α :=
  match x with
  | inl x => inr x
  | inr x => inl x
  end.

Lemma sum_com {α β} : α + β == β + α.
Proof.
  equiv with sum_invert and sum_invert;
    now intros [x|x].
Qed.
```

### `sum`{.coq} Is Associative

The associativity of `sum`{.coq} is straightforward to prove, and should not
pose a particular challenge to prospective readers[^joke].

[^joke]: If we assume that this article is well written, that is.

```coq
Lemma sum_assoc {α β γ} : α + β + γ == α + (β + γ).
Proof.
  equiv with (fun x =>
                match x with
                | inl (inl x) => inl x
                | inl (inr x) => inr (inl x)
                | inr x => inr (inr x)
                end)
        and (fun x =>
               match x with
               | inl x => inl (inl x)
               | inr (inl x) => inl (inr x)
               | inr (inr x) => inr x
               end).
  + now intros [[x|x]|x].
  + now intros [x|[x|x]].
Qed.
```

### `sum`{.coq} Has A Neutral Element

We need to find a type $e$ such that $\alpha + e = \alpha$ for any type
$\alpha$ (similarly to $x + 0 = x$ for any natural number $x$).

Any empty type (that is, a type with no term, such as `False`{.coq}) can act
as the neutral element of `sum`{.coq}. As a reminder, empty types in Coq are
defined with the following syntax[^empty]:

```coq
Inductive empty := .
```

[^empty]: Note that `Inductive empty.`{.coq} is erroneous.

    When the `:=`{.coq} is omitted, Coq defines an inductive type with one
    constructor, making such a type equivalent to `unit`{.coq}, not
    `False`{.coq}.

From a high-level perspective, `empty`{.coq} being the neutral element of
`sum`{.coq} makes sense. Because we cannot construct a term of type
`empty`{.coq}, `α + empty`{.coq} contains exactly the same number of terms as
`α`{.coq}. This is the intuition. Now, how can we convince Coq that our
intuition is correct?
Just like before, by providing two functions of types

 - `α -> α + empty`{.coq}
 - `α + empty -> α`{.coq}

The first function is `inl`{.coq}, that is, one of the constructors of
`sum`{.coq}.

The second function is trickier to write in Coq, because it comes down to
writing a function of type `empty -> α`{.coq}.

```coq
Definition from_empty {α} : empty -> α :=
  fun x => match x with end.
```

It is the exact same trick that allows Coq to encode proofs by
contradiction.

If we combine `from_empty`{.coq} with the generic function

```coq
Definition unwrap_left_or {α β}
    (f : β -> α) (x : α + β)
  : α :=
  match x with
  | inl x => x
  | inr x => f x
  end.
```

then we have everything we need to prove that `α == α + empty`{.coq}.

```coq
Lemma sum_neutral (α : Type) : α == α + empty.
Proof.
  equiv with inl and (unwrap_left_or from_empty);
    auto.
  now intros [x|x].
Qed.
```

## The `prod`{.coq} Operator

This is very similar to what we have just proven for `sum`{.coq}, so expect
less text in this section.

### `prod`{.coq} Is A Morphism

```coq
Definition lr_map_prod {α α' β β'}
    (f : α -> α') (g : β -> β')
  : α * β -> α' * β' :=
  fun x => match x with (x, y) => (f x, g y) end.

Instance prod_Proper
  : Proper (type_equiv ==> type_equiv ==> type_equiv) prod.
Proof.
  add_morphism_tactic.
  intros α α' [fα gα' equαα' equα'α]
         β β' [fβ gβ' equββ' equβ'β].
  equiv with (lr_map_prod fα fβ)
        and (lr_map_prod gα' gβ').
  + intros [x y]; cbn.
    rewrite <- equαα'.
    now rewrite <- equββ'.
  + intros [x y]; cbn.
    rewrite <- equα'α.
    now rewrite <- equβ'β.
Qed.
```

### `prod`{.coq} Is Commutative

```coq
Definition prod_invert {α β} (x : α * β) : β * α :=
  (snd x, fst x).

Lemma prod_com {α β} : α * β == β * α.
Proof.
  equiv with prod_invert and prod_invert;
    now intros [x y].
Qed.
```

### `prod`{.coq} Is Associative

```coq
Lemma prod_assoc {α β γ}
  : α * β * γ == α * (β * γ).
Proof.
  equiv with (fun x =>
                match x with
                | ((x, y), z) => (x, (y, z))
                end)
        and (fun x =>
               match x with
               | (x, (y, z)) => ((x, y), z)
               end).
  + now intros [[x y] z].
  + now intros [x [y z]].
Qed.
```

### `prod`{.coq} Has A Neutral Element

```coq
Lemma prod_neutral (α : Type) : α * unit == α.
Proof.
  equiv with fst and ((flip pair) tt).
  + now intros [x []].
  + now intros.
Qed.
```

### `prod`{.coq} Has An Absorbing Element

And this absorbing element is `empty`{.coq}, just like the absorbing element
of the multiplication of natural numbers is $0$ (that is, the neutral element
of the addition).

```coq
Lemma prod_absord (α : Type) : α * empty == empty.
Proof.
  equiv with (snd >>> from_empty)
        and from_empty.
  + intros [_ []].
  + intros [].
Qed.
```

## `prod`{.coq} And `sum`{.coq} Distributivity

Finally, we can prove the distributivity of `prod`{.coq} over `sum`{.coq},
using an approach similar to the one used to prove the associativity of
`prod`{.coq} and `sum`{.coq}.

```coq
Lemma prod_sum_distr (α β γ : Type)
  : α * (β + γ) == α * β + α * γ.
Proof.
  equiv with (fun x => match x with
                       | (x, inr y) => inr (x, y)
                       | (x, inl y) => inl (x, y)
                       end)
        and (fun x => match x with
                      | inr (x, y) => (x, inr y)
                      | inl (x, y) => (x, inl y)
                      end).
  + now intros [x [y | y]].
  + now intros [[x y] | [x y]].
Qed.
```

## Bonus: Algebraic Datatypes and Metaprogramming

Algebraic datatypes are very suitable for generating functions, as
demonstrated by the automatic deriving of typeclasses in Haskell or traits in
Rust. Because a datatype can be expressed in terms of `sum`{.coq} and
`prod`{.coq}, you just have to know how to deal with these two constructions
to start metaprogramming.

We can take the example of `fold`{.coq} functions. A `fold`{.coq} function
takes a container as its argument, and iterates over the values of that
container in order to compute a result.
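As a concrete instance of what we would like to generate, here is a fold
function for `list`{.coq}, written by hand (a sketch; the name
`fold_list`{.coq} is ours, and this function is the standard library’s
`fold_right`{.coq} up to argument order):

```coq
Fixpoint fold_list {α β} (init : β) (f : α -> β -> β) (l : list α) : β :=
  match l with
  | [] => init                             (* the unit case, i.e., nil  *)
  | x :: rst => f x (fold_list init f rst) (* the α * list α case, i.e., cons *)
  end.
```

Its type, `β -> (α -> β -> β) -> list α -> β`{.coq}, is exactly what we want
to compute from the canonical form of `list`{.coq}.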
We introduce `fold_type INPUT CANON_FORM OUTPUT`{.coq}, a tactic which
computes the type of the fold function of the type `INPUT`, whose “canonical
form” (in terms of `prod`{.coq} and `sum`{.coq}) is `CANON_FORM` and whose
result type is `OUTPUT`. Interested readers need to be familiar with `Ltac`.

```coq
Ltac fold_args b a r :=
  lazymatch a with
  | unit =>
    exact r
  | b =>
    exact (r -> r)
  | (?c + ?d)%type =>
    exact (ltac:(fold_args b c r) * ltac:(fold_args b d r))%type
  | (b * ?c)%type =>
    exact (r -> ltac:(fold_args b c r))
  | (?c * ?d)%type =>
    exact (c -> ltac:(fold_args b d r))
  | ?a =>
    exact (a -> r)
  end.

Ltac currying a :=
  match a with
  | ?a * ?b -> ?c => exact (a -> ltac:(currying (b -> c)))
  | ?a => exact a
  end.

Ltac fold_type b a r :=
  exact (ltac:(currying (ltac:(fold_args b a r) -> b -> r))).
```

We use it to compute the type of a `fold`{.coq} function for `list`{.coq}.

```coq
Definition fold_list_type (α β : Type) : Type :=
  ltac:(fold_type (list α) (unit + α * list α)%type β).
```

```coq
fold_list_type =
  fun α β : Type => β -> (α -> β -> β) -> list α -> β
  : Type -> Type -> Type
```

It is exactly what you could have expected, as it matches the type of
`fold_right`{.coq}.

Generating the body of the function is possible in theory, but probably not
in `Ltac` without modifying `type_equiv`{.coq} a bit. This could be a nice
use case for [MetaCoq](https://github.com/MetaCoq/metacoq), though.
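As a last sanity check (an example of ours, not part of the proofs above), we
can also apply `fold_type`{.coq} to `nat`{.coq} and the canonical form
$\mathrm{unit} + \mathrm{nat}$ suggested by its two constructors:

```coq
Definition fold_nat_type (β : Type) : Type :=
  ltac:(fold_type nat (unit + nat)%type β).

(* One would expect fold_nat_type β to unfold to

     β -> (β -> β) -> nat -> β

   that is, the type of a (non-dependent) recursor on nat. *)
```

The `unit`{.coq} branch of `fold_args`{.coq} yields `β`{.coq} for `O`{.coq},
and the recursive branch yields `β -> β`{.coq} for `S`{.coq}, just as the
equivalence `nat == list unit`{.coq} proven earlier suggests.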
+ diff --git a/site/posts/AlgebraicDatatypes.v b/site/posts/AlgebraicDatatypes.v deleted file mode 100644 index f32551c..0000000 --- a/site/posts/AlgebraicDatatypes.v +++ /dev/null @@ -1,682 +0,0 @@ -(** #<nav><p class="series">../coq.html</p> - <p class="series-prev">./ClightIntroduction.html</p> - <p class="series-next">./Coqffi.html</p></nav># *) - -(** * Proving Algebraic Datatypes are “Algebraic” *) - -(** Several programming languages allow programmers to define (potentially - recursive) custom types, by composing together existing ones. For instance, - in OCaml, one can define lists as follows: - -<< -type 'a list = -| Cons of 'a * 'a list -| Nil ->> - - This translates in Haskell as - -<< -data List a = - Cons a (List a) -| Nil ->> - - In Rust: - -<< -enum List<A> { - Cons(A, Box< List<a> >), - Nil, -} ->> - - In Coq: - -<< -Inductive list a := -| cons : a -> list a -> list a -| nil ->> - - And so forth. - - Each language will have its own specific constructions, and the type systems - of OCaml, Haskell, Rust and Coq —to only cite them— are far from being - equivalent. That being said, they often share a common “base formalism,” - usually (and sometimes abusively) referred to as _algebraic datatypes_. This - expression is used because under the hood any datatype can be encoded as a - composition of types using two operators: sum ([+]) and product ([*]) for - types. - - - [a + b] is the disjoint union of types [a] and [b]. Any term of [a] - can be injected into [a + b], and the same goes for [b]. Conversely, - a term of [a + b] can be projected into either [a] or [b]. - - [a * b] is the Cartesian product of types [a] and [b]. Any term of [a * - b] is made of one term of [a] and one term of [b] (remember tuples?). - - For an algebraic datatype, one constructor allows for defining “named - tuples”, that is ad-hoc product types. Besides, constructors are mutually - exclusive: you cannot define the same term using two different constructors. 
- Therefore, a datatype with several constructors is reminescent of a disjoint - union. Coming back to the [list] type, under the syntactic sugar of - algebraic datatypes, the [list α] type is equivalent to [unit + α * list α], - where [unit] models the [nil] case, and [α * list α] models the [cons] case. - - The set of types which can be defined in a language together with [+] and - [*] form an “algebraic structure” in the mathematical sense, hence the - name. It means the definitions of [+] and [*] have to satisfy properties - such as commutativity or the existence of neutral elements. In this article, - we will prove some of them in Coq. More precisely, - - - [+] is commutative, that is #<span class="imath">#\forall (x, y),\ x + y - = y + x#</span># - - [+] is associative, that is #<span class="imath">#\forall (x, y, z),\ (x - + y) + z = x + (y + z)#</span># - - [+] has a neutral element, that is #<span class="imath">#\exists e_s, - \ \forall x,\ x + e_s = x#</span># - - [*] is commutative, that is #<span class="imath">#\forall (x, y),\ x * y - = y * x#</span># - - [*] is associative, that is #<span class="imath">#\forall (x, y, z),\ (x - * y) * z = x * (y * z)#</span># - - [*] has a neutral element, that is #<span class="imath">#\exists e_p, - \ \forall x,\ x * e_p = x#</span># - - The distributivity of [+] and [*], that is #<span class="imath">#\forall - (x, y, z),\ x * (y + z) = x * y + x * z#</span># - - [*] has an absorbing element, that is #<span class="imath">#\exists e_a, - \ \forall x, \ x * e_a = e_a#</span># - - For the record, the [sum] and [prod] types are defined in Coq as follows: - -<< -Inductive sum (A B : Type) : Type := -| inl : A -> sum A B -| inr : B -> sum A B - -Inductive prod (A B : Type) : Type := -| pair : A -> B -> prod A B ->> - - #<nav id="generate-toc"></nav># - - #<div id="history">site/posts/AlgebraicDatatypes.v</div># *) - -From Coq Require Import - Basics Setoid Equivalence Morphisms - List FunctionalExtensionality. 
-Import ListNotations. - -Set Implicit Arguments. - -(** ** An Equivalence for [Type] *) - -(** Algebraic structures come with _equations_ expected to be true. This means - there is an implicit dependency which is —to my opinion— too easily - overlooked: the definition of [=]. In Coq, [=] is a built-in relation that - states that two terms are “equal” if they can be reduced to the same - “hierarchy” of constructors. This is too strong in the general case, and in - particular for our study of algebraic structures of [Type]. It is clear - that, to Coq’s opinion, [α + β] is not structurally _equal_ to [β + α], yet - we will have to prove they are “equivalent.” *) - -(** *** Introducing [type_equiv] *) - -(** Since [=] for [Type] is not suitable for reasoning about algebraic - datatypes, we introduce our own equivalence relation, denoted [==]. We say - two types [α] and [β] are equivalent up to an isomorphism (denoted by [α == - β]) when for any term of type [α], there exists a counter-part term of type - [β] and vice versa. In other words, [α] and [β] are equivalent if we can - exhibit two functions [f] and [g] such that: - - #<span class="dmath">#\forall (x : α),\ x = g(f(x))#</span># - - #<span class="dmath">#\forall (y : β),\ y = f(g(y))#</span># - - In Coq, this translates into the following inductive types. *) - -Reserved Notation "x == y" (at level 72). - -Inductive type_equiv (α β : Type) : Prop := -| mk_type_equiv (f : α -> β) (g : β -> α) - (equ1 : forall (x : α), x = g (f x)) - (equ2 : forall (y : β), y = f (g y)) - : α == β -where "x == y" := (type_equiv x y). - -(** As mentioned earlier, we prove two types are equivalent by exhibiting - two functions, and proving these functions satisfy two properties. We - introduce a <<Ltac>> notation to that end. *) - -Tactic Notation "equiv" "with" uconstr(f) "and" uconstr(g) - := apply (mk_type_equiv f g). 
- -(** The tactic [equiv with f and g] will turn a goal of the form [α == β] into - two subgoals to prove [f] and [g] form an isomorphism. *) - -(** *** [type_equiv] is an Equivalence *) - -(** [type_equiv] is an equivalence, and we can prove it by demonstrating it is - (1) reflexive, (2) symmetric, and (3) transitive. - - [type_equiv] is reflexive. *) - -Lemma type_equiv_refl (α : Type) : α == α. - -(** This proof is straightforward. A type [α] is equivalent to itself because: - - #<span class="imath">#\forall (x : α),\ x = id(id(x))#</span># *) - -Proof. - now equiv with (@id α) and (@id α). -Qed. - -(** [type_equiv] is symmetric. *) - -Lemma type_equiv_sym {α β} (equ : α == β) : β == α. - -(** If [α == β], then we know there exists two functions [f] and [g] which - satisfy the expected properties. We can “swap” them to prove that [β == α]. - *) - -Proof. - destruct equ as [f g equ1 equ2]. - now equiv with g and f. -Qed. - -(** [type_equiv] is transitive *) - -Lemma type_equiv_trans {α β γ} (equ1 : α == β) (equ2 : β == γ) - : α == γ. - -(** If [α == β], we know there exists two functions [fα] and [gβ] which satisfy - the expected properties of [type_equiv]. Similarly, because [β == γ], we - know there exists two additional functions [fβ] and [gγ]. We can compose - these functions together to prove [α == γ]. - - As a reminder, composing two functions [f] and [g] (denoted by [f >>> g] - thereafter) consists in using the result of [f] as the input of [g]: *) - -Infix ">>>" := (fun f g x => g (f x)) (at level 70). - -(** Then comes the proof. *) - -Proof. - destruct equ1 as [fα gβ equαβ equβα], - equ2 as [fβ gγ equβγ equγβ]. - equiv with (fα >>> fβ) and (gγ >>> gβ). - + intros x. - rewrite <- equβγ. - now rewrite <- equαβ. - + intros x. - rewrite <- equβα. - now rewrite <- equγβ. -Qed. - -(** The Coq standard library introduces the [Equivalence] type class. 
We can - provide an instance of this type class for [type_equiv], using the three - lemmas we have proven in this section. *) - -#[refine] -Instance type_equiv_Equivalence : Equivalence type_equiv := - {}. - -Proof. - + intros x. - apply type_equiv_refl. - + intros x y. - apply type_equiv_sym. - + intros x y z. - apply type_equiv_trans. -Qed. - -(** *** Examples *) - -(** **** [list]’s Canonical Form *) - -(** We now come back to our initial example, given in the Introduction of this - write-up. We can prove our assertion, that is [list α == unit + α * list - α]. *) - -Lemma list_equiv (α : Type) - : list α == unit + α * list α. - -Proof. - equiv with (fun x => match x with - | [] => inl tt - | x :: rst => inr (x, rst) - end) - and (fun x => match x with - | inl _ => [] - | inr (x, rst) => x :: rst - end). - + now intros [| x rst]. - + now intros [[] | [x rst]]. -Qed. - -(** **** [list] is a Morphism *) - -(** This means that if [α == β], then [list α == list β]. We prove this by - defining an instance of the [Proper] type class. *) - -Instance list_Proper - : Proper (type_equiv ==> type_equiv) list. - -Proof. - add_morphism_tactic. - intros α β [f g equαβ equβα]. - equiv with (map f) and (map g). - all: setoid_rewrite map_map; intros l. - + replace (fun x : α => g (f x)) - with (@id α). - ++ symmetry; apply map_id. - ++ apply functional_extensionality. - apply equαβ. - + replace (fun x : β => f (g x)) - with (@id β). - ++ symmetry; apply map_id. - ++ apply functional_extensionality. - apply equβα. -Qed. - -(** The use of the [Proper] type class allows for leveraging hypotheses of the - form [α == β] with the [rewrite] tactic. I personally consider providing - instances of [Proper] whenever it is possible to be a good practice, and - would encourage any Coq programmers to do so. *) - -(** **** [nat] is a Special-Purpose [list] *) - -(** Did you notice? Now, using [type_equiv], we can prove it! *) - -Lemma nat_and_list : nat == list unit. - -Proof. 
- equiv with (fix to_list n := - match n with - | S m => tt :: to_list m - | _ => [] - end) - and (fix of_list l := - match l with - | _ :: rst => S (of_list rst) - | _ => 0 - end). - + induction x; auto. - + induction y; auto. - rewrite <- IHy. - now destruct a. -Qed. - -(** **** Non-empty Lists *) - -(** We can introduce a variant of [list] which contains at least one element by - modifying the [nil] constructor so that it takes one argument instead of - none. *) - -Inductive non_empty_list (α : Type) := -| ne_cons : α -> non_empty_list α -> non_empty_list α -| ne_singleton : α -> non_empty_list α. - -(** We can demonstrate the relation between [list] and [non_empty_list], which - reveals an alternative implementation of [non_empty_list]. More precisely, - we can prove that [forall (α : Type), non_empty_list α == α * list α]. It - is a bit more cumbersome, but not that much. We first define the conversion - functions, then prove they satisfy the properties expected by - [type_equiv]. *) - -Fixpoint non_empty_list_of_list {α} (x : α) (l : list α) - : non_empty_list α := - match l with - | y :: rst => ne_cons x (non_empty_list_of_list y rst) - | [] => ne_singleton x - end. - -#[local] -Fixpoint list_of_non_empty_list {α} (l : non_empty_list α) - : list α := - match l with - | ne_cons x rst => x :: list_of_non_empty_list rst - | ne_singleton x => [x] - end. - -Definition prod_list_of_non_empty_list {α} (l : non_empty_list α) - : α * list α := - match l with - | ne_singleton x => (x, []) - | ne_cons x rst => (x, list_of_non_empty_list rst) - end. - -Lemma ne_list_list_equiv (α : Type) - : non_empty_list α == α * list α. - -Proof. - equiv with prod_list_of_non_empty_list - and (prod_curry non_empty_list_of_list). - + intros [x rst|x]; auto. - cbn. - revert x. - induction rst; intros x; auto. - cbn; now rewrite IHrst. - + intros [x rst]. - cbn. - destruct rst; auto. - change (non_empty_list_of_list x (α0 :: rst)) - with (ne_cons x (non_empty_list_of_list α0 rst)). 
- replace (α0 :: rst) - with (list_of_non_empty_list - (non_empty_list_of_list α0 rst)); auto. - revert α0. - induction rst; intros y; [ reflexivity | cbn ]. - now rewrite IHrst. -Qed. - -(** ** The [sum] Operator *) - -(** *** [sum] is a Morphism *) - -(** This means that if [α == α'] and [β == β'], then [α + β == α' + β']. To - prove this, we compose together the functions whose existence is implied by - [α == α'] and [β == β']. To that end, we introduce the auxiliary function - [lr_map]. *) - -Definition lr_map_sum {α β α' β'} (f : α -> α') (g : β -> β') - (x : α + β) - : α' + β' := - match x with - | inl x => inl (f x) - | inr y => inr (g y) - end. - -(** Then, we prove [sum] is a morphism by defining a [Proper] instance. *) - -Instance sum_Proper - : Proper (type_equiv ==> type_equiv ==> type_equiv) sum. - -Proof. - add_morphism_tactic. - intros α α' [fα gα' equαα' equα'α] - β β' [fβ gβ' equββ' equβ'β]. - equiv with (lr_map_sum fα fβ) - and (lr_map_sum gα' gβ'). - + intros [x|y]; cbn. - ++ now rewrite <- equαα'. - ++ now rewrite <- equββ'. - + intros [x|y]; cbn. - ++ now rewrite <- equα'α. - ++ now rewrite <- equβ'β. -Qed. - -(** *** [sum] is Commutative *) - -Definition sum_invert {α β} (x : α + β) : β + α := - match x with - | inl x => inr x - | inr x => inl x - end. - -Lemma sum_com {α β} : α + β == β + α. - -Proof. - equiv with sum_invert and sum_invert; - now intros [x|x]. -Qed. - -(** *** [sum] is Associative *) - -(** The associativity of [sum] is straightforward to prove, and should not pose - a particular challenge to perspective readers; if we assume that this - article is well-written, that is! *) - -Lemma sum_assoc {α β γ} : α + β + γ == α + (β + γ). - -Proof. - equiv with (fun x => - match x with - | inl (inl x) => inl x - | inl (inr x) => inr (inl x) - | inr x => inr (inr x) - end) - and (fun x => - match x with - | inl x => inl (inl x) - | inr (inl x) => inl (inr x) - | inr (inr x) => inr x - end). - + now intros [[x|x]|x]. 
  + now intros [x|[x|x]].
Qed.

(** *** [sum] has a Neutral Element *)

(** We need to find a type [e] such that [α + e == α] for any type [α]
    (just like #<span class="imath">#x~+~0~=~x#</span># for any natural
    number #<span class="imath">#x#</span>#).

    Any empty type (that is, a type with no term, such as [False]) can act as
    the neutral element for [sum]. As a reminder, empty types in Coq are
    defined with the following syntax: *)

Inductive empty := .

(** Note that the following definition is erroneous.

<<
Inductive empty.
>>

    Using [Print] on such a type illustrates the issue.

<<
Inductive empty : Prop := Build_empty { }
>>

    That is, when the [:=] is omitted, Coq defines an inductive type with one
    constructor.

    Coming back to [empty] being the neutral element of [sum]. From a
    high-level perspective, this makes sense. Because we cannot construct a
    term of type [empty], [α + empty] contains exactly the same number of terms
    as [α]. This is the intuition. Now, how can we convince Coq that our
    intuition is correct? Just like before, by providing two functions of
    types:

      - [α -> α + empty]
      - [α + empty -> α]

    The first function is [inl], that is, one of the constructors of [sum].

    The second function is more tricky to write in Coq, because it comes down
    to writing a function of type [empty -> α]. *)

Definition from_empty {α} : empty -> α :=
  fun x => match x with end.

(** It is the exact same trick that allows Coq to encode proofs by
    contradiction.

    If we combine [from_empty] with the generic function *)

Definition unwrap_left_or {α β}
    (f : β -> α) (x : α + β)
  : α :=
  match x with
  | inl x => x
  | inr x => f x
  end.

(** then we have everything we need to prove that [α == α + empty]. *)

Lemma sum_neutral (α : Type) : α == α + empty.

Proof.
  equiv with inl and (unwrap_left_or from_empty);
    auto.
  now intros [x|x].
Qed.
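These lemmas about `sum` translate directly to any language with sum types. As a quick illustration, here is a sketch in Rust (my own transcription, not part of the original development): `Either` plays the role of Coq's `sum`, and an uninhabited `enum` plays the role of `empty`.

```rust
// Sketch only: the [sum] lemmas above, transposed to Rust. The names
// `Either` and `Empty` are mine; Coq's `sum` and `empty` have no direct
// Rust counterparts in the article.
#[derive(Debug, PartialEq)]
enum Either<A, B> {
    Left(A),
    Right(B),
}

// An empty type: no constructor, hence no term.
enum Empty {}

// [sum] is commutative: swapping the sides is an involution.
fn sum_invert<A, B>(x: Either<A, B>) -> Either<B, A> {
    match x {
        Either::Left(a) => Either::Right(a),
        Either::Right(b) => Either::Left(b),
    }
}

// `from_empty`: from an uninhabited type, we can produce anything,
// thanks to an empty `match`.
fn from_empty<A>(x: Empty) -> A {
    match x {}
}

// `unwrap_left_or`: the projection needed for the neutrality lemma.
fn unwrap_left_or<A, B>(f: impl Fn(B) -> A, x: Either<A, B>) -> A {
    match x {
        Either::Left(a) => a,
        Either::Right(b) => f(b),
    }
}

fn main() {
    // Commutativity round-trip: inverting twice is the identity.
    let x: Either<i32, &str> = Either::Left(1);
    assert_eq!(sum_invert(sum_invert(x)), Either::Left(1));

    // Neutrality: `Either<A, Empty>` carries exactly the terms of `A`.
    let y: Either<i32, Empty> = Either::Left(42);
    assert_eq!(unwrap_left_or(from_empty, y), 42);
}
```

The two `assert_eq!` calls check exactly what `type_equiv` asks for: the conversion functions compose to the identity.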
(** ** The [prod] Operator *)

(** This is very similar to what we have just proven for [sum], so expect less
    text in this section. *)

(** *** [prod] is a Morphism *)

Definition lr_map_prod {α α' β β'}
    (f : α -> α') (g : β -> β')
  : α * β -> α' * β' :=
  fun x => match x with (x, y) => (f x, g y) end.

Instance prod_Proper
  : Proper (type_equiv ==> type_equiv ==> type_equiv) prod.

Proof.
  add_morphism_tactic.
  intros α α' [fα gα' equαα' equα'α]
         β β' [fβ gβ' equββ' equβ'β].
  equiv with (lr_map_prod fα fβ)
        and (lr_map_prod gα' gβ').
  + intros [x y]; cbn.
    rewrite <- equαα'.
    now rewrite <- equββ'.
  + intros [x y]; cbn.
    rewrite <- equα'α.
    now rewrite <- equβ'β.
Qed.

(** *** [prod] is Commutative *)

Definition prod_invert {α β} (x : α * β) : β * α :=
  (snd x, fst x).

Lemma prod_com {α β} : α * β == β * α.

Proof.
  equiv with prod_invert and prod_invert;
    now intros [x y].
Qed.

(** *** [prod] is Associative *)

Lemma prod_assoc {α β γ}
  : α * β * γ == α * (β * γ).

Proof.
  equiv with (fun x =>
                match x with
                | ((x, y), z) => (x, (y, z))
                end)
        and (fun x =>
               match x with
               | (x, (y, z)) => ((x, y), z)
               end).
  + now intros [[x y] z].
  + now intros [x [y z]].
Qed.

(** *** [prod] has a Neutral Element *)

Lemma prod_neutral (α : Type) : α * unit == α.

Proof.
  equiv with fst and ((flip pair) tt).
  + now intros [x []].
  + now intros.
Qed.

(** *** [prod] has an Absorbing Element *)

(** And this absorbing element is [empty], just like the absorbing element of
    the multiplication of natural numbers is #<span class="imath">#0#</span>#
    (the neutral element of the addition). *)

Lemma prod_absord (α : Type) : α * empty == empty.

Proof.
  equiv with (snd >>> from_empty)
        and (from_empty).
  + intros [_ []].
  + intros [].
Qed.
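The `prod` lemmas admit the same kind of transcription. Here is a sketch in Rust (again mine, not part of the original development), with tuples standing in for Coq's `prod` and `()` standing in for `unit`:

```rust
// Sketch only: the [prod] lemmas above, with Rust tuples in the role of
// Coq's `prod`. Each function is one direction of a `type_equiv` witness.

// Commutativity: α * β == β * α
fn prod_invert<A, B>((a, b): (A, B)) -> (B, A) {
    (b, a)
}

// Associativity: α * β * γ == α * (β * γ)
fn prod_assoc<A, B, C>(((a, b), c): ((A, B), C)) -> (A, (B, C)) {
    (a, (b, c))
}

// Neutral element: α * unit == α, with `fst` as the forward function…
fn prod_neutral<A>((a, ()): (A, ())) -> A {
    a
}

// …and `flip pair tt` as its inverse.
fn prod_neutral_inv<A>(a: A) -> (A, ()) {
    (a, ())
}

fn main() {
    // Each conversion round-trips, which is all `type_equiv` asks for.
    assert_eq!(prod_invert(prod_invert((1, "a"))), (1, "a"));
    assert_eq!(prod_assoc(((1, 2), 3)), (1, (2, 3)));
    assert_eq!(prod_neutral(prod_neutral_inv(5)), 5);
}
```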
(** ** [prod] and [sum] Distributivity *)

(** Finally, we can prove the distributivity of [prod] over [sum], using an
    approach similar to the one used to prove the associativity of [prod] and
    [sum]. *)

Lemma prod_sum_distr (α β γ : Type)
  : α * (β + γ) == α * β + α * γ.

Proof.
  equiv with (fun x => match x with
                       | (x, inr y) => inr (x, y)
                       | (x, inl y) => inl (x, y)
                       end)
        and (fun x => match x with
                      | inr (x, y) => (x, inr y)
                      | inl (x, y) => (x, inl y)
                      end).
  + now intros [x [y | y]].
  + now intros [[x y] | [x y]].
Qed.

(** ** Bonus: Algebraic Datatypes and Metaprogramming *)

(** Algebraic datatypes are very suitable for generating functions, as
    demonstrated by the automatic deriving of typeclasses in Haskell or traits
    in Rust. Because a datatype can be expressed in terms of [sum] and [prod],
    you just have to know how to deal with these two constructions to start
    metaprogramming.

    We can take the example of the [fold] functions. A [fold] function is a
    function which takes a container as its argument, and iterates over the
    values of that container in order to compute a result.

    We introduce [fold_type INPUT CANON_FORM OUTPUT], a tactic to compute the
    type of the fold function of the type <<INPUT>>, whose “canonical form” (in
    terms of [prod] and [sum]) is <<CANON_FORM>> and whose result type is
    #<code>#OUTPUT#</code>#. Interested readers have to be familiar with
    [Ltac]. *)

Ltac fold_args b a r :=
  lazymatch a with
  | unit =>
    exact r
  | b =>
    exact (r -> r)
  | (?c + ?d)%type =>
    exact (ltac:(fold_args b c r) * ltac:(fold_args b d r))%type
  | (b * ?c)%type =>
    exact (r -> ltac:(fold_args b c r))
  | (?c * ?d)%type =>
    exact (c -> ltac:(fold_args b d r))
  | ?a =>
    exact (a -> r)
  end.

Ltac currying a :=
  match a with
  | ?a * ?b -> ?c => exact (a -> ltac:(currying (b -> c)))
  | ?a => exact a
  end.

Ltac fold_type b a r :=
  exact (ltac:(currying (ltac:(fold_args b a r) -> b -> r))).
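To see what this tactic computes in a more familiar setting: for `list α`, whose canonical form is `unit + α * list α`, it yields `β -> (α -> β -> β) -> list α -> β`, that is, the type of a right fold. A hand-written Rust counterpart (a sketch of mine, not generated by the tactic) could look like this:

```rust
// Sketch only: a fold of the shape computed by `fold_type` for lists,
// written by hand. One argument per constructor: a seed for `nil`, and
// a binary function for `cons`.
fn fold_list<A, B>(nil: B, cons: impl Fn(A, B) -> B, l: Vec<A>) -> B {
    // Folding from the right mirrors the recursive structure of the list.
    l.into_iter().rev().fold(nil, |acc, x| cons(x, acc))
}

fn main() {
    // Summing a list…
    assert_eq!(fold_list(0, |x, acc| x + acc, vec![1, 2, 3]), 6);

    // …or rebuilding it unchanged, which shows the fold follows the
    // structure of the two constructors.
    let rebuilt = fold_list(
        Vec::new(),
        |x, mut acc: Vec<i32>| {
            acc.insert(0, x);
            acc
        },
        vec![1, 2, 3],
    );
    assert_eq!(rebuilt, vec![1, 2, 3]);
}
```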
(** We use it to compute the type of a [fold] function for [list]. *)

Definition fold_list_type (α β : Type) : Type :=
  ltac:(fold_type (list α) (unit + α * list α)%type β).

(** Here is the definition of [fold_list_type], as output by [Print].

<<
fold_list_type =
  fun α β : Type => β -> (α -> β -> β) -> list α -> β
     : Type -> Type -> Type
>>

    It is exactly what you could have expected, as it matches the type of
    [fold_right].

    Generating the body of the function is possible in theory, but probably not
    in [Ltac] without modifying [type_equiv] a bit. This could be a nice use
    case for #<a href="https://github.com/MetaCoq/metacoq">#MetaCoq#</a>#,
    though. *)
diff --git a/site/posts/August2022.md b/site/posts/August2022.md
new file mode 100644
index 0000000..c40a0c0
--- /dev/null
+++ b/site/posts/August2022.md
---
published: 2022-08-15
modified: 2023-05-09
series:
  parent: series/Retrospectives.html
  next: posts/September2022.html
tags: ['emacs', 'meta']
abstract: |
  In an attempt to start a regular blogging habit, I am giving a try to the
  monthly “status update” format. This month: some Emacs config hacking, and
  some changes in how this website is generated.
---

# What happened in August 2022?

Without further ado, let’s take a look at what was achieved over the last
thirty days or so.

## Emacs

I have started tweaking and improving my Emacs
[configuration](https://src.soap.coffee/dotfiles/emacs.d.git)
again[^minimalism].

[^minimalism]: After having used Emacs for seven years now, I am nowhere close
    to considering my configuration a done project. I really envy developers
    who use their editor with little to no customization.

### Theme Selection Menu

The change I am the most excited about is that I have *finally* reduced the
boilerplate I need to write to use a new theme. I am very indecisive when
it comes to theming.
I like to have my choices, and I get tired of any color scheme pretty quickly.
As a consequence, I introduced a customizable variable to let me select a
theme dynamically, and have this choice persist across Emacs sessions.

I have a Hydra menu that allows me to select which theme I want to use for the
time being. It looks like this.

#[A Hydra menu for selecting a theme.](/img/select-theme.png)

But adding new entries to this menu was very cumbersome, and mostly
boilerplate that I knew a good macro could abstract away. And I can finally
report that I was right all along. I have my macros now, and they allow me to
generate the Hydra menu above with these simple lines of code.

```lisp
(use-theme ancientless "a" :straight nil :load-path "~/.emacs.d/lisp")
(use-theme darkless "d" :straight nil :load-path "~/.emacs.d/lisp")
(use-theme brightless "b" :straight nil :load-path "~/.emacs.d/lisp")
(use-theme monotropic "m")
(use-theme monokai "M")
(use-theme nothing "n")
(use-theme eink "e")
(use-theme dracula "D")
(use-theme chocolate "c")
(use-themes-from tao-theme
  '(("tl" . tao-yang)
    ("td" . tao-yin)))
```

### Eldoc and Flycheck Popups

I have been experimenting with several combinations of packages to have Eldoc
and Flycheck use pop-up-like mechanisms to report things to me, instead of the
echo area.

The winning setup for now is the one that uses the [`quick-peek`
package](https://github.com/cpitclaudel/quick-peek). That is,
[`flycheck-inline`](https://github.com/flycheck/flycheck-inline) (customized
to use `quick-peek`, as suggested in their README), and
[`eldoc-overlay`](https://melpa.org/#/eldoc-overlay). This works well enough,
though the pop-ups of Eldoc are maybe a bit too distracting.

#[`flycheck-inline` in action with an OCaml compilation error.](/img/flycheck-inline.png)

In my quest for pop-ups, I ran into several issues with the packages I tried
out.
For instance, [`eldoc-box`](https://github.com/casouri/eldoc-box) was very
nice, but also very slow for some reason. It turns out there was an issue
about that slowness, wherein the culprit was identified. This allowed me to
[submit a pull request that got merged rather
quickly](https://github.com/casouri/eldoc-box/pull/48).

Similarly, after a package update, I discovered
[`flycheck-ocaml`](https://github.com/flycheck/flycheck-ocaml) was no longer
working, and [submitted a patch to fix the
issue](https://github.com/flycheck/flycheck-ocaml/pull/14).

## This Website

I have not been investing a lot of time in this website for the past six
years or so. This month, things changed a bit on that side too.

### New Contents

First, I have published a (short) article on [higher-order polymorphism in
OCaml](/posts/RankNTypesInOCaml.html). The goal was for me to log somewhere
the solution to an odd problem I was confronted with at `$WORK`{.bash}, but
the resulting article was not doing a great job at conveying this. In
particular, two comments on Reddit motivated me to rework it, and I am glad I
did. I hope you enjoy the retake.

Once this was out of the way, I decided that generating this website was
taking way too much time for no good enough reason. The culprit was
**`cleopatra`**, a toolchain I had developed in 2020 to integrate the build
process of this website as additional contents that I thought might interest
people. The sad truth was: **`cleopatra`** was adding a significant overhead,
and I never took the time to actually document it properly.

### Under the Hood

Overall, the cost of using **`cleopatra`** was not worth the burden, and so I
got rid of it. Fortunately, it was not very difficult, since the job of
**`cleopatra`** was to extract the generation processes from Org files; I
just had to implement a small `makefile` to make use of these files, without
having to rely on **`cleopatra`** anymore.
This was something I had been pondering for a long time, and as often in
these circumstances, it gave me the extra motivation I needed to tackle other
ideas I had in mind for this website. This is why now, rather than starting
one Emacs process per Org file I have to process, my build toolchain starts
one Emacs server, and later uses `emacsclient`.

Now, most of the build time is spent by [soupault](https://soupault.app). I
guess I will have to spend some time on the Lua plugins I have developed for
it at some point.

## A New Mailing List

Finally, I have created [a public mailing
list](https://lists.sr.ht/~lthms/public-inbox) that is available if you want
to start a discussion on one of my articles. Don’t hesitate to use it, or to
subscribe to it!
diff --git a/site/posts/CFTSpatialShell.md b/site/posts/CFTSpatialShell.md
new file mode 100644
index 0000000..602bd21
--- /dev/null
+++ b/site/posts/CFTSpatialShell.md
---
published: 2023-04-27
tags: ['spatial-shell']
abstract: |
  In August 2022, I discovered Material Shell, and decided to implement a
  dynamic tiling manager “à la Material Shell” for sway that I called Spatial
  Shell. Spatial Shell works on my machine, which means it will definitely
  break on yours, and I would love to know how.
---

# Spatial Shell: Call For Testers

In August 2022, I discovered [Material Shell](https://material-shell.com). A
few weeks later, I had pieced together a working prototype of a dynamic
tiling manager “à la Material Shell” for [sway](https://swaywm.org). By
October, the project was basically fulfilling my needs, and I had already
started to use it on my workstation[^1]. The project sat there for a while,
until I rediscovered this thing called *holidays*.

[^1]: I tried so you do not have to: having my graphical session going crazy
    during a work meeting because of a piece of software I had written.
For a short week, I tried to address as many of the remaining issues and
missing features I was aware of as I could. Then, I started to write [man
pages](https://lthms.github.io/spatial-shell/spatial.1.html), which turned
out to be the perfect opportunity to clean up every clunkiness I could
possibly find.

I can’t help but find the result rather nice and satisfying, and I hope you
will enjoy it too! [Spatial Shell](https://github.com/lthms/spatial-shell)
works on my machine, which means it will definitely break on yours. But this
is where the fun lies, right? At this point, I definitely think the project
is ready to fall into the hands of (motivated) alpha testers.

Anyway, let me give you a tour!

## Spatial Model

At its core, Spatial Shell allows you to navigate a grid of windows.
Workspaces are rows which can be individually configured to determine how
many windows (cells) you can see at once. More precisely, workspaces in
Spatial Shell can use two layouts:

- **Maximize:** One window is displayed at a time
- **Column:** Several windows are displayed side by side, to your convenience

The reason **Maximize** is not a particular case of **Column**, but instead a
layout of its own, is to easily allow you to switch to and back from
maximizing the focused window. The following picture[^2] summarizes one
particular setup with three workspaces, each configured differently.

[^2]: Created using [Excalidraw](https://excalidraw.com/).

#[Spatial Shell allows users to configure the layout of each workspace individually.](/img/spatial-shell-example.png)

- Workspace 1 contains three windows, and uses the **Column** layout to
  display at most three windows, so each window is visible, with the focus
  being on the leftmost one.
- Workspace 2 contains four windows, and uses the **Column** layout to
  display at most two windows. As a consequence, two windows are not visible.
- Workspace 3 contains two windows, and uses the **Maximize** layout, so only
  the focused window is visible.

To help users know which window currently holds the focus, Spatial Shell
slightly reduces the opacity of unfocused windows (as poorly hinted at by the
gray backgrounds in the figure). Finally, Spatial Shell can also set a
background for empty workspaces (using `swaybg`{.bash} under the hood).

And this is basically it. There is not much more to say about Spatial Shell
features.

## Configuration

From an implementation and user experience perspective, Spatial Shell takes
inspiration from i3 and sway.

More precisely, Spatial Shell comprises two executables:

- [**spatial**(1)](https://lthms.github.io/spatial-shell/spatial.1.html), the
  daemon implementing the spatial model described in the previous section, and
- [**spatialmsg**(1)](https://lthms.github.io/spatial-shell/spatialmsg.1.html),
  a client used to control the current instance of spatial.

Assuming `$spatial`{.bash} and `$spatialmsg`{.bash} contain the paths to the
spatial and spatialmsg binaries respectively, the simplest sway configuration
to start using Spatial Shell is the following.

```bash
exec $spatial

# moving the focus
bindsym $mod+h exec $spatialmsg "focus left"
bindsym $mod+l exec $spatialmsg "focus right"
bindsym $mod+k exec $spatialmsg "focus up"
bindsym $mod+j exec $spatialmsg "focus down"

# moving the focused window
bindsym $mod+Shift+h exec $spatialmsg "move left"
bindsym $mod+Shift+l exec $spatialmsg "move right"
bindsym $mod+Shift+k exec $spatialmsg "move up"
bindsym $mod+Shift+j exec $spatialmsg "move down"

# toggle between Maximize and Column layout
bindsym $mod+space exec $spatialmsg "toggle layout"

# modify the number of windows to display in the Column layout
bindsym $mod+i exec $spatialmsg "column count decrement"
bindsym $mod+o exec $spatialmsg "column count increment"

# start waybar, spatial will take care of the rest
exec waybar
```

By default, Spatial Shell sets the initial configuration of a workspace to
the **Column** layout with two columns at most, and sets the opacity of
unfocused windows to 75%. This can be customized, either globally or per
workspace, by creating a configuration file in
`$XDG_CONFIG_HOME/spatial/config`{.bash}[^3].

[^3]: If unset, `$XDG_CONFIG_HOME`{.bash} defaults to
    `$HOME/.config`{.bash}.

For instance, my config file looks like this.

```bash
[workspace=3] default layout maximize
background "~/.config/spatial/wallpaper.jpg"
unfocus opacity 85
```

See [**spatial**(5)](https://lthms.github.io/spatial-shell/spatial.5.html)
for the list of commands supported by Spatial Shell.

As a side note, readers familiar with sway will definitely notice the
resemblance with sway and swaymsg, and it actually goes pretty deep. In a
nutshell, swaymsg connects to a UNIX socket created by sway at startup time,
to send it commands (see
[**spatial**(5)](https://lthms.github.io/spatial-shell/spatial.5.html)) using
a dedicated IPC protocol inherited from i3 (see
[**sway-ipc**(7)](https://lthms.github.io/spatial-shell/sway-ipc.7.html)).
Not only does spatial rely on sway’s IPC protocol to interact with sway and
implement its spatial model, it also creates a UNIX socket of its own, and
supports its own protocol
([**spatial-ipc**(7)](https://lthms.github.io/spatial-shell/spatial-ipc.7.html)).

## Waybar Integration

It is a common practice to use a so-called “bar” with sway, to display some
useful information about the current state of the system. In the `contrib/`
directory of the [Spatial Shell
repository](https://github.com/lthms/spatial-shell), interested readers will
find a configuration for [Waybar](https://github.com/Alexays/Waybar)[^design].
This configuration is somewhat clunky at the moment, due to the limitations
of Waybar’s custom widget, which does not allow one widget to define several
“buttons.” I am interested in investing a bit of time to see if I could write
a native widget, similar to sway’s.

[^design]: Readers familiar with Material Shell design will not be surprised
    by the general look and feel of the screenshot below.

#[Mandatory screenshot of Spatial Shell, with the Waybar configuration.](/img/spatial-shell.png)

That being said, the user experience with this integration is already pretty
neat. As long as you don’t need more than 6 workspaces and 8 windows per
workspace[^constants], you are good to go!

[^constants]: These constants are totally arbitrary and can be increased by
    modifying the Waybar config, but the issue will remain that a limit will
    exist.

## Building from Source

As of April 2023, the only way to get Spatial Shell is to build it from
source.

You will need the following runtime dependencies:

- sway (i3 might be supported at some point)
- gmp (for bad reasons; fortunately, this will be removed at some point)
- swaybg
- waybar (if you want the full experience)

You will need the following build dependencies:

- opam
- scdoc (for the man pages)

Then, building and installing Spatial Shell is as simple as using the two
following commands.

```bash
make build-deps
make install
```

The latter command will install Spatial Shell’s binaries in
`/usr/local/bin`{.bash}, and the man pages in `/usr/local/man`{.bash}. You
can remove them with `make uninstall`{.bash}.

To install the Waybar theme, copy `contrib/waybar/spatialbar.py`{.bash} to
`/usr/local/bin/spatialbar`{.bash} for instance, and the Waybar style and
config file to `$HOME/.config/waybar`{.bash}.
diff --git a/site/posts/CleopatraV1.md b/site/posts/CleopatraV1.md
new file mode 100644
index 0000000..ddefcc0
--- /dev/null
+++ b/site/posts/CleopatraV1.md
---
published: 2020-02-04
modified: 2023-05-12
tags: ['meta', 'literate-programming', 'emacs']
abstract: |
  Our literate programming toolchain cannot live solely inside Org files,
  waiting to be turned into actual code by Babel. Otherwise, we would not
  have any code to execute the first time we try to generate the website.
---

# A Literate Toolchain To Build This Website

A literate program is a particular type of software program where code is not
directly written in source files, but rather in a text document as code
snippets. In essence, literate programming allows for writing in the same
place both the software program and its technical documentation.

**`cleopatra`** is a “literate toolchain” I have implemented to build this
website, and you are currently reading it[^past]. That is, **`cleopatra`** is
both the build system and an article of this website! To achieve this,
**`cleopatra`** has been written as a collection of Org files which can be
either “tangled” using [Babel](https://orgmode.org/worg/org-contrib/babel/)
or “exported” as an HTML document. Tangling here refers to extracting marked
code blocks into files.

[^past]: This sentence was true when this article was published, but things
    have changed since then.

    What you are reading is actually the rendered version of a Markdown
    document that was manually “translated” from the original Org document,
    named `Bootstrap.org`. Interested readers can [have a look at the
    original version
    here](https://src.soap.coffee/soap.coffee/lthms.git/tree/site/cleopatra?id=9329e9883a52eb95c0803a46560c396d149ef2c6).

    Truth be told, said version is probably complete gibberish for anyone who
    isn’t me. For this reason, this version was actually heavily reworked…
    Because I have too much free time, probably.
The page you are currently reading is **`cleopatra`**’s entry point. Its
primary purpose is to define two Makefiles (`makefile` and `bootstrap.mk`),
and the necessary Emacs Lisp script to tangle this document.

On the one hand, `makefile` is the main entry point of **`cleopatra`**. It
serves two purposes:

1. It initializes a few global variables, and
2. It provides a rule to tangle this document, that is, to update itself and
   `bootstrap.mk`.

On the other hand, `bootstrap.mk` is used to declare the various “generation
processes” used to generate this website.

`makefile` and the Emacs Lisp scripts are versioned, because they are
necessary to bootstrap **`cleopatra`**; but since they are also defined in
this document, it means **`cleopatra`** can update itself, in some sense.
This is to be kept in mind when modifying this document too hastily.

## Global Constants and Variables

First, `makefile` defines several global “constants”[^constants]. In a
nutshell,

- `$ROOT`{.bash} tells Emacs where the root of your website sources is, so
  that tangled output filenames can be given relative to it rather than to
  the Org files. For instance, the `block_src` tangle parameter for
  `Makefile` looks like `:tangle Makefile`{.lisp}, instead of `:tangle
  ../../Makefile`{.lisp}.
- `$CLEODIR`{.bash} tells **`cleopatra`** where its sources live. If you
  place it inside the `site/` directory (as it is intended), and you enable
  the use of `org` files to author your contents, then **`cleopatra`**
  documents will be part of your website. If you don’t want that, just move
  the directory outside the `site/` directory, and update the
  `$CLEODIR`{.bash} variable accordingly.

[^constants]: As far as I know, `make` does not support true *constant*
    values. It is assumed generation processes will not modify them.

For this website, these constants are defined as follows[^comments].
[^comments]: I will use a comment on the first line to recall to which file a
    given code block is expected to be tangled.

```makefile
# makefile:
ROOT := $(shell pwd)
CLEODIR := site/cleopatra
```

We then introduce two variables to list the outputs of the generation
processes, with two purposes in mind: keeping the `.gitignore` up-to-date
automatically, and providing rules to remove them.

- `$ARTIFACTS`{.bash} lists the short-term artifacts which can be removed
  frequently without too much hassle. They will be removed by `make clean`.
- `$CONFIGURE`{.bash} lists the long-term artifacts whose generation can be
  time consuming. They will only be removed by `make cleanall`.

```makefile
# makefile:
ARTIFACTS := build.log
CONFIGURE :=

clean :
	@rm -rf ${ARTIFACTS}

cleanall : clean
	@rm -rf ${CONFIGURE}
```

Generation processes can declare new build outputs using the `+=` assignment
operator. Using another operator will likely lead to undesirable results.

## Tangling Org Documents

**`cleopatra`** is a literate program implemented with Org mode, an Emacs
major editing mode. We provide the necessary bits to easily tangle Org
documents.

The configuration of Babel is done using an Emacs Lisp script called
`tangle-org.el`, whose status is similar to `Makefile`’s. It is part of the
bootstrap process, and therefore lives “outside” of **`cleopatra`** (it is
not deleted by `make clean`, for instance). However, it is overwritten when
this file is tangled. If you try to modify it and find that **`cleopatra`**
does not work properly, you should restore it.

```lisp
;;; tangle-org.el:
(require 'org)
(cd (getenv "ROOT"))
(setq org-confirm-babel-evaluate nil)
(setq org-src-preserve-indentation t)
(add-to-list 'org-babel-default-header-args
             '(:mkdirp . "yes"))
(org-babel-do-load-languages
 'org-babel-load-languages
 '((shell . t)))
(org-babel-tangle)
```

We define variables to ensure that the `$ROOT`{.bash} environment variable is
set and `tangle-org.el` is loaded when using Emacs.

```makefile
# makefile:
EMACSBIN := emacs
EMACS := ROOT="${ROOT}" ${EMACSBIN}
TANGLE := --batch \
          --load="${ROOT}/scripts/tangle-org.el" \
          2>> build.log
```

Finally, we introduce a [canned
recipe](https://www.gnu.org/software/make/manual/html_node/Canned-Recipes.html#Canned-Recipes)
to seamlessly tangle a given file[^canned].

[^canned]: It was the first time I had used canned recipes, and I don’t think
    I have had the opportunity to re-use them ever again.

```makefile
# makefile:
define emacs-tangle =
echo "  tangle  $<"
${EMACS} $< ${TANGLE}
endef
```

## Updating `.gitignore` Automatically

Assuming each generation process correctly defines its `$ARTIFACTS`{.bash}
and `$CONFIGURE`{.bash} variables, we have all the information we need to
update `.gitignore` automatically.

This is done by adding markers to `.gitignore` to delimit a region under the
control of **`cleopatra`**, and writing a script to update said region after
each build.

```bash
# update-gitignore.sh:
BEGIN_MARKER="# begin generated files"
END_MARKER="# end generated files"

# remove the previous list of generated files to ignore
sed -i -e "/${BEGIN_MARKER}/,/${END_MARKER}/d" .gitignore
# remove trailing empty lines
sed -i -e :a -e '/^\n*$/{$d;N;};/\n$/ba' .gitignore

# output the list of files to ignore
echo "" >> .gitignore
echo ${BEGIN_MARKER} >> .gitignore
for f in $@; do
    echo "${f}" >> .gitignore
done
echo ${END_MARKER} >> .gitignore
```

The `ignore` rule of `makefile` is defined as follows.

```makefile
# makefile:
ignore :
	@echo "  update  gitignore"
	@scripts/update-gitignore.sh \
	    ${ARTIFACTS} \
	    ${CONFIGURE}
```

## Bootstrapping

The core purpose of `makefile` remains to bootstrap the chain of generation
processes.
This chain is divided into three stages: `prebuild`, `build`, and
`postbuild`.

This translates as follows in `makefile`.

```makefile
# makefile:
default : postbuild ignore

init :
	@rm -f build.log

prebuild : init

build : prebuild

postbuild : build

.PHONY : init prebuild build postbuild ignore
```

A *generation process* in **`cleopatra`** is a Makefile which provides rules
for these three stages, along with the utilities used by these rules. More
precisely, a generation process `proc` is defined in `proc.mk`. The rules of
`proc.mk` for each stage are expected to be prefixed with `proc-`, *e.g.*,
`proc-prebuild` for the `prebuild` stage.

Eventually, the following dependencies are expected within the chain of
generation processes, for every generation process.

```makefile
prebuild : proc-prebuild
build : proc-build
postbuild : proc-postbuild

proc-build : proc-prebuild
proc-postbuild : proc-build
```

**`cleopatra`** is a literate toolchain whose main purpose is to allow me to
turn the scripts I wrote to generate my website into blog posts of said
website. As such, it allows me to implement the generation processes using
Org mode, which means that before being able to start generating HTML files,
**`cleopatra`** has to tangle the generation processes.

To achieve this, **`cleopatra`** relies on a particular behavior of `make`
regarding the `include` directive: if there exists a rule to generate a
Makefile used as an operand of `include`, `make` will use this rule to update
(if necessary) said Makefile before actually including it.

Therefore, rules of the following form achieve our ambition of extensibility.
```makefile
include ${PROC}.mk

prebuild : ${PROC}-prebuild
build : ${PROC}-build
postbuild : ${PROC}-postbuild

${PROC}-prebuild : ${PROC}.mk ${AUX}
${PROC}-build : ${PROC}-prebuild
${PROC}-postbuild : ${PROC}-build

${PROC}.mk ${AUX} &:\
    ${CLEODIR}/${IN}
	@$(emacs-tangle)

CONFIGURE += ${PROC}.mk ${AUX}

.PHONY : ${PROC}-prebuild \
         ${PROC}-build \
         ${PROC}-postbuild
```

where

- `$IN`{.bash} is the Org document which contains the generation process code,
- `$PROC`{.bash} is the name of the generation process, and
- `$AUX`{.bash} lists the utilities of the generation process tangled from
  `$IN`{.bash} along with `$PROC.mk`{.bash}.

We use `&:` in place of `:` to separate the target from its dependencies in
the “tangle rule.”[^obscure] This tells `make` that the recipe of this rule
generates all these files.

[^obscure]: Yet another obscure Makefile trick I have never encountered
    again.

Rather than writing these rules manually for each generation process we want
to define, we rely on the [noweb feature of
Babel](https://orgmode.org/worg/org-tutorials/org-latex-export.html). We call
`extends` the primitive to generate new generation processes.

We derive the rule to tangle `bootstrap.mk` using `extends`, with the
following Org mode syntax.

```org
#+BEGIN_SRC makefile :noweb yes
# makefile:
<<extends(IN="Bootstrap.org", PROC="bootstrap", AUX="scripts/update-gitignore.sh")>>
#+END_SRC
```

which gives the following result.
+
+```makefile
+include bootstrap.mk
+
+prebuild : bootstrap-prebuild
+build : bootstrap-build
+postbuild : bootstrap-postbuild
+
+bootstrap-prebuild : bootstrap.mk scripts/update-gitignore.sh
+bootstrap-build : bootstrap-prebuild
+bootstrap-postbuild : bootstrap-build
+
+bootstrap.mk scripts/update-gitignore.sh &:\
+  ${CLEODIR}/Bootstrap.org
+	@$(emacs-tangle)
+
+CONFIGURE += bootstrap.mk scripts/update-gitignore.sh
+
+.PHONY : bootstrap-prebuild \
+         bootstrap-build \
+         bootstrap-postbuild
+```
+
+These are the last lines of `makefile`. The rest of the generation processes will be
+declared in `bootstrap.mk`.
+
+## Generation Processes
+
+In this section, we construct `bootstrap.mk` by enumerating the generation
+processes that are currently used to generate the website you are reading.
+
+We recall that each generation process shall
+
+1. Define `proc-prebuild`, `proc-build`, and `proc-postbuild`
+2. Declare dependencies between stages of generation processes
+3. Declare build outputs (see `ARTIFACTS` and `CONFIGURE`)
+
+### Theming and Templating
+
+The
+[`theme`](https://src.soap.coffee/soap.coffee/lthms.git/tree/site/cleopatra/Theme.org?id=9329e9883a52eb95c0803a46560c396d149ef2c6)
+generation process controls the general appearance of the website. More
+precisely, it introduces the main template used by `soupault`
+(`main/templates.html`), and the main SASS sheet used by this template.
+
+If a generation process produces a set of styles within a specific SASS file,
+the current approach is
+
+1. To make this file a dependency of `theme-build`
+2. To modify `style/main.sass` in `theme`
+   to import this file
+
+Eventually, the second step will be automated, but, in the meantime,
+this customization is mandatory.
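+
+As a minimal sketch of the first step (the process name `chess` and the file
+`style/chess.sass` are hypothetical, chosen only for illustration), a single
+additional rule in `bootstrap.mk` would suffice:
+
+```makefile
+# Hypothetical: a generation process `chess` tangles style/chess.sass.
+# Step 1: make the SASS sheet a dependency of theme-build, so that it
+# exists before the theme compiles its stylesheets.
+theme-build : style/chess.sass
+```
+
+The second (still manual) step is then to add something along the lines of
+`@import "chess"` to `style/main.sass`.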
+
+### Configuring Soupault
+
+The
+[`soupault`](https://src.soap.coffee/soap.coffee/lthms.git/tree/site/cleopatra/Soupault.org?id=9329e9883a52eb95c0803a46560c396d149ef2c6)
+generation process configures and runs `soupault`, in order to generate a
+static website.
+
+If a generation process `proc` produces files that will eventually be
+integrated into your website, its `proc-build` recipe needs to be executed
+*before* the `soupault-build` recipe. This can be enforced by making the
+dependency explicit to `make`, *i.e.*,
+
+```makefile
+soupault-build : proc-build
+```
+
+Eventually, generation processes shall be allowed to produce specific `soupault`
+widgets to be integrated into `soupault.conf`.
+
+### Authoring Contents
+
+The fact that **`cleopatra`** is a literate program which gradually generates
+itself was not intended: it is a consequence of my desire to be able to easily
+use whatever format I so desire for writing my contents, and Org documents in
+particular.
+
+In the present website, contents can be written in the following formats:
+
+- **HTML Files:** This requires no particular set-up, since HTML is the *lingua
+  franca* of `soupault`.
+- **Regular Coq files:** Coq is a system which allows writing machine-checked
+  proofs, and it comes with a source “prettifier” called `coqdoc`. [Learn more
+  about the generation process for Coq
+  files](https://src.soap.coffee/soap.coffee/lthms.git/tree/site/cleopatra/Contents/Coq.org?id=9329e9883a52eb95c0803a46560c396d149ef2c6).
+- **Org documents:** Emacs comes with a powerful editing mode called [Org
+  mode](https://orgmode.org/), and Org documents are really pleasant to work
+  with.
[Learn more about the generation process for Org + documents](https://src.soap.coffee/soap.coffee/lthms.git/tree/site/cleopatra/Contents/Org.org?id=9329e9883a52eb95c0803a46560c396d149ef2c6) diff --git a/site/posts/CleopatraV1.org b/site/posts/CleopatraV1.org deleted file mode 100644 index ee5789d..0000000 --- a/site/posts/CleopatraV1.org +++ /dev/null @@ -1,324 +0,0 @@ -#+BEGIN_EXPORT html -<h1><strong><code>cleopatra</code></strong>: Bootstrapping an Extensible Toolchain</h1> -#+END_EXPORT - -#+BEGIN_TODO -You are about to read the first version of *~cleopatra~*, the toolchain -initially implemented to build the website you are reading. Since then, -*~cleopatra~* has been completely rewritten as a -[[https://cleopatra.soap.coffee][independant, more generic command-line -program]]. That being said, the build process described in this write-up remains -the one implemented in *~cleopatra~* the Second. -#+END_TODO - -A literate program is a particular type of software program where code is not -directly written in source files, but rather in text document as code -snippets. In some sense, literate programming allows for writing in the same -place both the software program and its technical documentation. - -That being said, *~cleopatra~* is a toolchain to build a website before being a -literate program, and one of its objective is to be /part of this very website -it is used to generate/. To acheive this, *~cleopatra~* has been written as a -collection of org files which can be either “tangled” using -[[https://orgmode.org/worg/org-contrib/babel/][Babel]] or “exported” as a HTML -document. Tangling here refers to extracted marked code blocks into files. - -The page you are currently reading is *~cleopatra~* entry point. Its primilarly -purpose is to introduce two Makefiles: ~Makefile~ and ~bootstrap.mk~. 
- -#+TOC: headlines 2 - -#+BEGIN_EXPORT html -<div id="history">site/posts/CleopatraV1.org</div> -#+END_EXPORT - -~Makefile~ serves two purposes: it initiates a few global variables, and it -provides a rule to generate ~bootstrap.mk~. At this point, some readers may -wonder /why/ we need ~Makefile~ in this context, and the motivation behind this -choice is really reminescent of a boot sequence. The rationale is that we need a -“starting point” for *~cleopatra~*. The toolchain cannot live solely inside -org-files, otherwise there would not have any code to execute the first time we -tried to generate the website. We need an initial Makefile, one that has little -chance to change, so that we can almost consider it read-only. Contrary to the -other Makefiles that we will generate, this one will not be deleted by ~make -clean~. - -This is similar to your computer: it requires a firmware to boot, whose purpose -—in a nutshell— is to find and load an operating system. - -Modifying the content of ~Makefile~ in this document /will/ modify -~Makefile~. This means one can easily put *~cleopatra~* into an inconsistent -state, which would prevent further generation. This is why the generated -~Makefile~ should be versioned, so that you can restore it using ~git~ if you -made a mistake when you modified it. - -For readers interested in using *~cleopatra~* for their own websites, this -documents tries to highlight the potential modifications they would have to -make. - -* Global Constants and Variables - -First, ~Makefile~ defines several global “constants” (although as far as I know -~make~ does not support true constant values, it is expected further generation -process will not modify them). - -In a nutshell, - -- ~ROOT~ :: - Tell Emacs where the root of your website sources is, so that tangled output - filenames can be given relative to it rather than the org files. 
So for - instance, the ~BLOCK_SRC~ tangle parameter for ~Makefile~ looks like ~:tangle - Makefile~, instead of ~:tangle ../../Makefile~. -- ~CLEODIR~ :: - Tell *~cleopatra~* where its sources live. If you place it inside the ~site/~ - directory (as it is intended), and you enable the use of ~org~ files to author - your contents, then *~cleopatra~* documents will be part of your website. If - you don’t want that, just move the directory outside the ~site/~ directory, - and update the ~CLEODIR~ variable accordingly. - -For this website, these constants are defined as follows. - -#+BEGIN_SRC makefile :noweb no-export -ROOT := $(shell pwd) -CLEODIR := site/cleopatra -#+END_SRC - -We then introduce two variables to list the output of the generation processes, -with two purposes in mind: keeping the ~.gitignore~ up-to-date automatically, -and providing rules to remove them. - -- ~ARTIFACTS~ :: - Short-term artifacts which can be removed frequently without too much - hassle. They will be removed by ~make clean~. -- ~CONFIGURE~ :: - Long-term artifacts whose generation can be time consuming. They will only be - removed by ~make cleanall~. - -#+BEGIN_SRC makefile -ARTIFACTS := build.log -CONFIGURE := -#+END_SRC - -Generation processes shall declare new build outputs using the ~+=~ assignement -operators. Using another operator will likely provent an underisable result. - -* Easy Tangling of Org Documents - -*~cleopatra~* is a literate program implemented with Org mode, an Emacs major -editing mode. We provide the necessary bits to easily tangle Org documents. - -The configuration of Babel is done using an emacs lisp script called -~tangle-org.el~ whose status is similar to ~Makefile~. It is part of the -bootstrap process, and therefore lives “outside” of *~cleopatra~* (it is not -deleted with ~make clean~ for instance). However, it is overwritten. If you try -to modify it and find that *~cleopatra~* does not work properly, you should -restore it using ~git~. 
- -#+BEGIN_SRC emacs-lisp -(require 'org) -(cd (getenv "ROOT")) -(setq org-confirm-babel-evaluate nil) -(setq org-src-preserve-indentation t) -(add-to-list 'org-babel-default-header-args - '(:mkdirp . "yes")) -(org-babel-do-load-languages - 'org-babel-load-languages - '((shell . t))) -(org-babel-tangle) -#+END_SRC - -We define variables that ensure that the ~ROOT~ environment variable is set and -~tangle-org.el~ is loaded when using Emacs. - -#+BEGIN_SRC makefile -EMACSBIN := emacs -EMACS := ROOT="${ROOT}" ${EMACSBIN} -TANGLE := --batch \ - --load="${ROOT}/scripts/tangle-org.el" \ - 2>> build.log -#+END_SRC - -Finally, we introduce a -[[https://www.gnu.org/software/make/manual/html_node/Canned-Recipes.html#Canned-Recipes][canned -recipe]] to seamlessly tangle a given file. - -#+BEGIN_SRC makefile -define emacs-tangle = -echo " tangle $<" -${EMACS} $< ${TANGLE} -endef -#+END_SRC - -* Bootstrapping - -The core purpose of ~Makefile~ remains to bootstrap the chain of generation -processes. This chain is divided into three stages: ~prebuild~, ~build~, and -~postbuild~. - -This translates as follows in ~Makefile~. - -#+BEGIN_SRC makefile -default : postbuild ignore - -init : - @rm -f build.log - -prebuild : init - -build : prebuild - -postbuild : build - -.PHONY : init prebuild build postbuild ignore -#+END_SRC - -A *generation process* in *~cleopatra~* is a Makefile which provides rules for -these three stages, along with the utilities used by these rules. More -precisely, a generation process ~proc~ is defined in ~proc.mk~. The rules of -~proc.mk~ for each stage are expected to be prefixed by ~proc-~, /e.g./, -~proc-prebuild~ for the ~prebuild~ stage. - -Eventually, the following dependencies are expected between within the chain of -generation processes. 
- -#+BEGIN_SRC makefile -prebuild : proc-prebuild -build : proc-build -postbuild : proc-postbuild - -proc-build : proc-prebuild -proc-postbuild : proc build -#+END_SRC - -Because *~cleopatra~* is a literate program, generation processes are defined in -Org documents –which may contains additional utilities like scripts or -templates—, and therefore need to be tangled prior to be effectively -useful. *~cleopatra~ relies on a particular behavior of ~make~ regarding the -~include~ directive. If there exists a rule to generate a Makefile used as an -operand of ~include~, ~make~ will use this rule to update (if necessary) said -Makefile before actually including it. - -Therefore, rules of the following form achieve our ambition of extensibility. - -#+BEGIN_SRC makefile :noweb yes -<<extends(PROC="${PROC}", IN="${IN}", AUX="${AUX}")>> -#+END_SRC - -where - -- ~${IN}~ is the Org document which contains the generation process code -- ~${PROC}~ is the name of the generation process -- ~${AUX}~ lists the utilities of the generation process tangled from ~${IN}~ - with ~${PROC}.mk~ - -We use ~&:~ is used in place of ~:~ to separate the target from its dependencies -in the “tangle rule.” This tells ~make~ that the recipe of this rule generates -all these files. - -Writing these rules manually —has yours truly had to do in the early days of his -website— has proven to be error-prone. - -One desirable feature for *~cleopatra~* would be to generate them automatically, -by looking for relevant ~:tangle~ directives inside the input Org document. The -challenge lies in the “relevant” part: the risk exists that we have false -posivite. However and as a first steps towards a fully automated solution, we -can leverage the evaluation features of Babel here. - -Here is a bash script which, given the proper variables, would generate the -expected Makefile rule. 
- -#+NAME: extends -#+BEGIN_SRC bash :var PROC="" :var AUX="" :var IN="" :results output -cat <<EOF -include ${PROC}.mk - -prebuild : ${PROC}-prebuild -build : ${PROC}-build -postbuild : ${PROC}-postbuild - -${PROC}-prebuild : ${PROC}.mk ${AUX} -${PROC}-build : ${PROC}-prebuild -${PROC}-postbuild : ${PROC}-build - -${PROC}.mk ${AUX} &:\\ - \${CLEODIR}/${IN} - @\$(emacs-tangle) - -CONFIGURE += ${PROC}.mk ${AUX} - -.PHONY : ${PROC}-prebuild \\ - ${PROC}-build \\ - ${PROC}-postbuild -EOF -#+END_SRC - -The previous source block is given a name (=extends=), and an explicit lists of -variables (~IN~, ~PROC~, and ~AUX~). Thanks to the -[[https://orgmode.org/worg/org-tutorials/org-latex-export.html][noweb syntax of -Babel]], we can insert the result of the evaluation of =extends= inside another -source block when the latter is tangled. - -We derive the rule to tangle ~bootstrap.mk~ using =extends=, which gives us the -following Makefile snippet. - -#+BEGIN_SRC makefile :noweb yes -<<extends(IN="Bootstrap.org", PROC="bootstrap", AUX="scripts/update-gitignore.sh")>> -#+END_SRC - -Beware that, as a consequence, modifying code block of =extends= is as -“dangerous” as modifying ~Makefile~ itself. Keep that in mind if you start -hacking *~cleopatra~*! - -Additional customizations of *~cleopatra~* will be parth ~bootstrap.mk~, rather -than ~Makefile~. - -* Generation Processes - -Using the =extends= noweb reference, *~cleopatra~* is easily extensible. In -this section, we first detail the structure of a typical generation process. -Then, we construct ~bootstrap.mk~ by enumerating the generation processes that -are currently used to generate the website you are reading. - -Each generation process shall - -1. Define ~proc-prebuild~, ~proc-build~, and ~proc-postbuild~ -2. Declare dependencies between stages of generation processes -3. 
Declare build outputs (see ~ARTIFACTS~ and ~CONFIGURE~)
-
-* Wrapping-up
-
-#+BEGIN_SRC bash :shebang "#+/bin/bash"
-BEGIN_MARKER="# begin generated files"
-END_MARKER="# begin generated files"
-
-# remove the previous list of generated files to ignore
-sed -i -e "/${BEGIN_MARKER}/,/${END_MARKER}/d" .gitignore
-# remove trailing empty lines
-sed -i -e :a -e '/^\n*$/{$d;N;};/\n$/ba' .gitignore
-
-# output the list of files to ignore
-echo "" >> .gitignore
-echo ${BEGIN_MARKER} >> .gitignore
-for f in $@; do
-    echo "${f}" >> .gitignore
-done
-echo ${END_MARKER} >> .gitignore
-#+END_SRC
-
-#+BEGIN_SRC makefile
-ignore :
-	@echo "  update  gitignore"
-	@scripts/update-gitignore.sh \
-	    ${ARTIFACTS} \
-	    ${CONFIGURE}
-
-clean :
-	@rm -rf ${ARTIFACTS}
-
-cleanall : clean
-	@rm -rf ${CONFIGURE}
-#+END_SRC
-
-# Local Variables:
-# org-src-preserve-indentation: t
-# End:
diff --git a/site/posts/ClightIntroduction.md b/site/posts/ClightIntroduction.md
new file mode 100644
index 0000000..67b40db
--- /dev/null
+++ b/site/posts/ClightIntroduction.md
@@ -0,0 +1,386 @@
+---
+tags: ['coq']
+published: 2020-03-20
+modified: 2020-12-08
+abstract: |
+  Clight is a “simplified” C AST used by CompCert, the certified C compiler.
+  In this write-up, we prove a straightforward functional property of a small
+  C function, as an exercise to discover the Clight semantics.
+---
+
+# A Study of Clight and its Semantics
+
+CompCert is a certified C compiler which comes with a proof of semantics
+preservation. What this means is the following: the semantics of the C code you
+write is preserved by CompCert compilation passes up to the generated machine
+code.
+
+I had been interested in CompCert for quite some time, and ultimately
+challenged myself to study Clight and its semantics. This write-up is the
+result of this challenge, written as I was progressing.
+ +## Installing CompCert + +CompCert has been added to `opam`, and as a consequence can be very easily +used as a library for other Coq developments. A typical use case is for a +project to produce Clight (the high-level AST of CompCert), and to benefit +from CompCert proofs after that. + +Installing CompCert is as easy as + +```bash +opam install coq-compcert +``` + +More precisely, this article uses `coq-compcert.3.8`. + +Once `opam` terminates, the `compcert` namespace becomes available. In +addition, several binaries are now available if you have correctly set your +`$PATH`{.bash} environment variable. For instance, `clightgen` takes a C file +as an argument, and generates a Coq file which contains the Clight generated by +CompCert. + +## Problem Statement + +Our goal for this first write-up is to prove that the C function + +```c +int add (int x, int y) { + return x + y; +} +``` + +returns the expected result, that is `x + y`{.c}. The `clightgen` tool +generates (among other things) the following AST[^read]. + +[^read]: It has been modified in order to improve its readability. + +```coq +From compcert Require Import Clight Ctypes Clightdefs AST + Coqlib Cop. + +Definition _x : ident := 1%positive. +Definition _y : ident := 2%positive. + +Definition f_add : function := + {| fn_return := tint + ; fn_callconv := cc_default + ; fn_params := [(_x, tint); (_y, tint)] + ; fn_vars := [] + ; fn_temps := [] + ; fn_body := Sreturn + (Some (Ebinop Oadd + (Etempvar _x tint) + (Etempvar _y tint) + tint)) + |}. +``` + +The fields of the `function`{.coq} type are pretty self-explanatory (as it is +often the case in CompCert’s ASTs as far as I can tell for now). + +Identifiers in Clight are (`positive`{.coq}) indices. The `fn_body` field is of +type `statement`{.coq}, with the particular constructor `Sreturn`{.coq} whose argument +is of type `option expr`{.coq}, and `statement`{.coq} and `expr`{.coq} look like the two main +types to study. 
The predicates `step1`{.coq} and `step2`{.coq} allow for reasoning
+about the execution of a `function`{.coq}, step by step (hence the name). It
+appears that `clightgen` generates Clight terms using the function call
+convention encoded by `step2`{.coq}. To reason about a complete execution, it
+appears that we can use `star`{.coq} (from the `Smallstep`{.coq} module) which is
+basically a trace of `step`{.coq}. These semantics are defined as predicates (that
+is, they live in `Prop`{.coq}). They allow for reasoning about
+state transformation, where a state is either
+
+- A function call, with a given list of arguments and a continuation
+- A function return, with a result and a continuation
+- A `statement`{.coq} execution within a `function`{.coq}
+
+We import several CompCert modules to manipulate *values* (in our case,
+bounded integers).
+
+```coq
+From compcert Require Import Values Integers.
+Import Int.
+```
+
+Putting everything together, the lemma we want to prove about `f_add`{.coq} is
+the following.
+
+```coq
+Lemma f_add_spec (env : genv)
+                 (t : Events.trace)
+                 (m m' : Memory.Mem.mem)
+                 (v : val) (x y z : int)
+    (trace : Smallstep.star step2 env
+               (Callstate (Ctypes.Internal f_add)
+                          [Vint x; Vint y]
+                          Kstop
+                          m)
+               t
+               (Returnstate (Vint z) Kstop m'))
+  : z = add x y.
+```
+
+## Proof Walkthrough
+
+We introduce a custom `inversion`{.coq} tactic which does some clean-up in
+addition to just performing the inversion.
+
+```coq
+Ltac smart_inv H :=
+  inversion H; subst; cbn in *; clear H.
+```
+
+We can now try to prove our lemma.
+
+```coq
+Proof.
+```
+
+We first destruct `trace`{.coq}, and we rename the generated hypotheses in order
+to improve the readability of these notes.
+
+```coq
+  smart_inv trace.
+  rename H into Hstep.
+  rename H0 into Hstar.
+```
+
+This generates two hypotheses.
+ +``` +Hstep : step1 + env + (Callstate (Ctypes.Internal f_add) + [Vint x; Vint y] + Kstop + m) + t1 + s2 +Hstar : Smallstep.star + step2 + env + s2 + t2 + (Returnstate (Vint z) Kstop m') +``` + +In other words, to “go” from a `Callstate`{.coq} of `f_add`{.coq} to a +`Returnstate`{.coq}, there is a first step from a `Callstate`{.coq} to a state +`s2`{.coq}, then a succession of steps to go from `s2`{.coq} to a +`Returnstate`{.coq}. + +We consider the single `step`{.coq}, in order to determine the actual value of +`s2`{.coq} (among other things). To do that, we use `smart_inv`{.coq} on +`Hstep`{.coq}, and again perform some renaming. + +```coq + smart_inv Hstep. + rename le into tmp_env. + rename e into c_env. + rename H5 into f_entry. +``` + +This produces two effects. First, a new hypothesis is added to the context. + +``` +f_entry : function_entry1 + env + f_add + [Vint x; Vint y] + m + c_env + tmp_env + m1 +``` + +Then, the `Hstar`{.coq} hypothesis has been updated, because we now have a more +precise value of `s2`{.coq}. More precisely, `s2`{.coq} has become + +``` +State + f_add + (Sreturn + (Some (Ebinop Oadd + (Etempvar _x tint) + (Etempvar _y tint) + tint))) + Kstop + c_env + tmp_env + m1 +``` + +Using the same approach as before, we can uncover the next step. + +```coq + smart_inv Hstar. + rename H into Hstep. + rename H0 into Hstar. +``` + +The resulting hypotheses are + +``` +Hstep : step2 env + (State + f_add + (Sreturn + (Some + (Ebinop Oadd + (Etempvar _x tint) + (Etempvar _y tint) + tint))) + Kstop c_env tmp_env m1) t1 s2 +Hstar : Smallstep.star + step2 + env + s2 + t0 + (Returnstate (Vint z) Kstop m') +``` + +An inversion of `Hstep`{.coq} can be used to learn more about its resulting +state… So let’s do just that. + +```coq + smart_inv Hstep. + rename H7 into ev. + rename v0 into res. + rename H8 into res_equ. + rename H9 into mem_equ. 
+```
+
+The generated hypotheses have become
+
+```
+res : val
+ev : eval_expr env c_env tmp_env m1
+       (Ebinop Oadd
+               (Etempvar _x tint)
+               (Etempvar _y tint)
+               tint)
+       res
+res_equ : sem_cast res tint tint m1 = Some v'
+mem_equ : Memory.Mem.free_list m1
+            (blocks_of_env env c_env)
+          = Some m'0
+```
+
+Our understanding of these hypotheses is the following
+
+- The expression `_x + _y`{.coq} is evaluated using the `c_env`{.coq}
+  environment (and we know thanks to `binding`{.coq} the value of `_x`{.coq}
+  and `_y`{.coq}), and its result is stored in `res`{.coq}
+- `res`{.coq} is cast into a `tint`{.coq} value, and acts as the result of
+  `f_add`{.coq}
+
+The `Hstar`{.coq} hypothesis is now interesting
+
+```
+Hstar : Smallstep.star
+          step2 env
+          (Returnstate v' Kstop m'0) t0
+          (Returnstate (Vint z) Kstop m')
+```
+
+It is clear that we are at the end of the execution of `f_add`{.coq} (even if
+Coq generates two subgoals, the second one is not relevant and easy to
+discard).
+
+```coq
+  smart_inv Hstar; [| smart_inv H ].
+```
+
+We are making good progress here, and we can focus our attention on the `ev`{.coq}
+hypothesis, which concerns the evaluation of the `_x + _y`{.coq} expression. We can
+simplify it a bit further.
+
+```coq
+  smart_inv ev; [| smart_inv H].
+  rename H4 into fetch_x.
+  rename H5 into fetch_y.
+  rename H6 into add_op.
+```
+
+In the short term, the hypotheses `fetch_x`{.coq} and `fetch_y`{.coq} are the
+most important.
+
+```
+fetch_x : eval_expr env c_env tmp_env m1 (Etempvar _x tint) v1
+fetch_y : eval_expr env c_env tmp_env m1 (Etempvar _y tint) v2
+```
+
+The current challenge we face is to prove that we know their values. At this
+point, we can have a look at `f_entry`{.coq}. This is starting to look
+familiar: `smart_inv`{.coq}, then renaming, etc.
+
+```coq
+  smart_inv f_entry.
+  clear H.
+  clear H0.
+  clear H1.
+  smart_inv H3; subst.
+  rename H2 into allocs.
+```
+
+We are almost done.
Let’s simplify as much as possible `fetch_x`{.coq} and +`fetch_y`{.coq}. Each time, the `smart_inv`{.coq} tactic generates two subgoals, +but only the first one is relevant. The second one is not, and can be +discarded. + +```coq + smart_inv fetch_x; [| inversion H]. + smart_inv H2. + smart_inv fetch_y; [| inversion H]. + smart_inv H2. +``` + +We now know the values of the operands of `add`{.coq}. The two relevant +hypotheses that we need to consider next are `add_op`{.coq} and +`res_equ`{.coq}. They are easy to read. + +``` +add_op : sem_binarith + (fun (_ : signedness) (n1 n2 : Integers.int) + => Some (Vint (add n1 n2))) + (fun (_ : signedness) (n1 n2 : int64) + => Some (Vlong (Int64.add n1 n2))) + (fun n1 n2 : Floats.float + => Some (Vfloat (Floats.Float.add n1 n2))) + (fun n1 n2 : Floats.float32 + => Some (Vsingle (Floats.Float32.add n1 n2))) + v1 tint v2 tint m1 = Some res +``` + +- `add_op`{.coq} is the addition of `Vint x`{.coq} and `Vint y`{.coq}, and its + result is `res`{.coq}. + + ``` + res_equ : sem_cast res tint tint m1 = Some (Vint z) + ``` + +- `res_equ`{.coq} is the equation which says that the result of `f_add`{.coq} is `res`{.coq}, + after it has been cast as a `tint`{.coq} value. + +We can simplify `add_op`{.coq} and `res_equ`{.coq}, and this allows us to +conclude. + +```coq + smart_inv add_op. + smart_inv res_equ. + reflexivity. +Qed. +``` + +## Conclusion + +The definitions of Clight are straightforward, and the [CompCert +documentation](http://compcert.inria.fr/doc/index.html) is very pleasant to +read. Understanding Clight and its semantics can be very interesting if you +are working on a language that you want to translate into machine code. +However, proving some functional properties of a given C snippet using only CompCert +can quickly become cumbersome. 
From this perspective, the +[VST](https://github.com/PrincetonUniversity/VST) project is very interesting, +as its main purpose is to provide tools to reason about Clight programs more +easily. diff --git a/site/posts/ClightIntroduction.v b/site/posts/ClightIntroduction.v deleted file mode 100644 index 85d084b..0000000 --- a/site/posts/ClightIntroduction.v +++ /dev/null @@ -1,357 +0,0 @@ -(** #<nav><p class="series">../coq.html</p>
- <p class="series-prev">./RewritingInCoq.html</p>
- <p class="series-next">./AlgebraicDatatypes.html</p></nav># *)
-
-(** * A Study of Clight and its Semantics *)
-(* begin hide *)
-From Coq Require Import List.
-Import ListNotations.
-(* end hide *)
-(** CompCert is a certified C compiler which comes with a proof of semantics
- preservation. What this means is the following: the semantics of the C code
- you write is preserved by CompCert compilation passes up to the generated
- machine code.
-
- I had been interested in CompCert for quite some times, and ultimately
- challenged myself to study Clight and its semantics. This write-up is the
- result of this challenge, written as I was progressing.
-
- #<nav id="generate-toc"></nav>#
-
- #<div id="history">site/posts/ClightIntroduction.v</div># *)
-
-(** ** Installing CompCert *)
-
-(** CompCert has been added to <<opam>>, and as a consequence can be very easily
- used as a library for other Coq developments. A typical use case is for a
- project to produce Clight (the high-level AST of CompCert), and to benefit
- from CompCert proofs after that.
-
- Installing CompCert is as easy as
-
-<<
-opam install coq-compcert
->>
-
- More precisely, this article uses #<code>coq-compcert.3.8</code>#.
-
- Once <<opam>> terminates, the <<compcert>> namespace becomes available. In
- addition, several binaries are now available if you have correctly set your
- <<PATH>> environment variable. For instance, <<clightgen>> takes a C file as
- an argument, and generates a Coq file which contains the Clight generated by
- CompCert. *)
-
-(** ** Problem Statement *)
-
-(** Our goal for this first write-up is to prove that the C function
-
-<<
-int add (int x, int y) {
- return x + y;
-}
->>
-
- returns the expected result, that is <<x + y>>. The <<clightgen>> tool
- generates (among other things) the following AST (note: I have modified it
- in order to improve its readability). *)
-
-From compcert Require Import Clight Ctypes Clightdefs AST
- Coqlib Cop.
-
-Definition _x : ident := 1%positive.
-Definition _y : ident := 2%positive.
-
-Definition f_add : function :=
- {| fn_return := tint
- ; fn_callconv := cc_default
- ; fn_params := [(_x, tint); (_y, tint)]
- ; fn_vars := []
- ; fn_temps := []
- ; fn_body := Sreturn
- (Some (Ebinop Oadd
- (Etempvar _x tint)
- (Etempvar _y tint)
- tint))
- |}.
-
-(** The fields of the [function] type are pretty self-explanatory (as it is
- often the case in CompCert’s ASTs as far as I can tell for now).
-
- Identifiers in Clight are ([positive]) indices. The [fn_body] field is of
- type [statement], with the particular constructor [Sreturn] whose argument
- is of type [option expr], and [statement] and [expr] look like the two main
- types to study. The predicates [step1] and [step2] allow for reasoning
- about the execution of a [function], step by step (hence the name). It
- appears that <<clightgen>> generates Clight terms using the function call
- convention encoded by [step2]. To reason about a complete execution, it
- appears that we can use [star] (from the [Smallstep] module) which is
- basically a trace of [step]. These semantics are defined as predicates (that
- is, they live in [Prop]). They allow for reasoning about
- state-transformation, where a state is either
-
- - A function call, with a given list of arguments and a continuation
- - A function return, with a result and a continuation
- - A [statement] execution within a [function]
-
- We import several CompCert modules to manipulate _values_ (in our case,
- bounded integers). *)
-
-From compcert Require Import Values Integers.
-Import Int.
-
-(** Putting everything together, the lemma we want to prove about [f_add] is
- the following. *)
-
-Lemma f_add_spec (env : genv)
- (t : Events.trace)
- (m m' : Memory.Mem.mem)
- (v : val) (x y z : int)
- (trace : Smallstep.star step2 env
- (Callstate (Ctypes.Internal f_add)
- [Vint x; Vint y]
- Kstop
- m)
- t
- (Returnstate (Vint z) Kstop m'))
- : z = add x y.
-
-(** ** Proof Walkthrough *)
-
-(** We introduce a custom [inversion] tactic which does some clean-up in
- addition to just perform the inversion. *)
-
-Ltac smart_inv H :=
- inversion H; subst; cbn in *; clear H.
-
-(** We can now try to prove our lemma. *)
-
-Proof.
-
-(** We first destruct [trace], and we rename the generated hypothesis in order
- to improve the readability of these notes. *)
-
- smart_inv trace.
- rename H into Hstep.
- rename H0 into Hstar.
-
-(** This generates two hypotheses.
-
-<<
-Hstep : step1
- env
- (Callstate (Ctypes.Internal f_add)
- [Vint x; Vint y]
- Kstop
- m)
- t1
- s2
-Hstar : Smallstep.star
- step2
- env
- s2
- t2
- (Returnstate (Vint z) Kstop m')
->>
-
- In other words, to “go” from a [Callstate] of [f_add] to a [Returnstate],
- there is a first step from a [Callstate] to a state [s2], then a succession
- of steps to go from [s2] to a [Returnstate].
-
- We consider the single [step], in order to determine the actual value of
- [s2] (among other things). To do that, we use [smart_inv] on [Hstep], and
- again perform some renaming. *)
-
- smart_inv Hstep.
- rename le into tmp_env.
- rename e into c_env.
- rename H5 into f_entry.
-
-(** This produces two effects. First, a new hypothesis is added to the context.
-
-<<
-f_entry : function_entry1
- env
- f_add
- [Vint x; Vint y]
- m
- c_env
- tmp_env
- m1
->>
-
- Then, the [Hstar] hypothesis has been updated, because we now have a more
- precise value of [s2]. More precisely, [s2] has become
-
-<<
-State
- f_add
- (Sreturn
- (Some (Ebinop Oadd
- (Etempvar _x tint)
- (Etempvar _y tint)
- tint)))
- Kstop
- c_env
- tmp_env
- m1
->>
-
- Using the same approach as before, we can uncover the next step. *)
-
- smart_inv Hstar.
- rename H into Hstep.
- rename H0 into Hstar.
-
-(** The resulting hypotheses are
-
-<<
-Hstep : step2 env
- (State
- f_add
- (Sreturn
- (Some
- (Ebinop Oadd
- (Etempvar _x tint)
- (Etempvar _y tint)
- tint)))
- Kstop c_env tmp_env m1) t1 s2
-Hstar : Smallstep.star
- step2
- env
- s2
- t0
- (Returnstate (Vint z) Kstop m')
->>
-
- An inversion of [Hstep] can be used to learn more about its resulting
- state… So let’s do just that. *)
-
- smart_inv Hstep.
- rename H7 into ev.
- rename v0 into res.
- rename H8 into res_equ.
- rename H9 into mem_equ.
-
-(** The generated hypotheses have become
-
-<<
-res : val
-ev : eval_expr env c_env tmp_env m1
- (Ebinop Oadd
- (Etempvar _x tint)
- (Etempvar _y tint)
- tint)
- res
-res_equ : sem_cast res tint tint m1 = Some v'
-mem_equ : Memory.Mem.free_list m1
- (blocks_of_env env c_env)
- = Some m'0
->>
-
- Our understanding of these hypotheses is the following
-
-    - The expression [_x + _y] is evaluated using the [c_env] environment
-      (and we know the values of [_x] and [_y] thanks to [binding]), and its
-      result is stored in [res]
- - [res] is cast into a [tint] value, and acts as the result of [f_add]
-
- The [Hstar] hypothesis is now interesting
-
-<<
-Hstar : Smallstep.star
- step2 env
- (Returnstate v' Kstop m'0) t0
- (Returnstate (Vint z) Kstop m')
->>
-
-    It is clear that we are at the end of the execution of [f_add] (although
-    Coq generates two subgoals, the second one is irrelevant and easy to
-    discard). *)
-
- smart_inv Hstar; [| smart_inv H ].
-
-(** We are making good progress here, and we can focus our attention on the [ev]
- hypothesis, which concerns the evaluation of the [_x + _y] expression. We
- can simplify it a bit further. *)
-
- smart_inv ev; [| smart_inv H].
- rename H4 into fetch_x.
- rename H5 into fetch_y.
- rename H6 into add_op.
-
-(** In the short term, the most important hypotheses are [fetch_x] and
-    [fetch_y].
-
-<<
-fetch_x : eval_expr env c_env tmp_env m1 (Etempvar _x tint) v1
-fetch_y : eval_expr env c_env tmp_env m1 (Etempvar _y tint) v2
->>
-
-    The challenge we now face is to determine their values. At this point,
-    we can have a look at [f_entry]. This is starting to look familiar:
-    [smart_inv], then renaming, etc. *)
-
- smart_inv f_entry.
- clear H.
- clear H0.
- clear H1.
- smart_inv H3; subst.
- rename H2 into allocs.
-
-(** We are almost done. Let’s simplify [fetch_x] and [fetch_y] as much as
-    possible. Each time, the [smart_inv] tactic generates two subgoals, but
-    only the first one is relevant; the second one can be discarded. *)
-
- smart_inv fetch_x; [| inversion H].
- smart_inv H2.
- smart_inv fetch_y; [| inversion H].
- smart_inv H2.
-
-(** We now know the values of the operands of [add]. The two relevant hypotheses
- that we need to consider next are [add_op] and [res_equ]. They are easy to
- read.
-
-<<
-add_op : sem_binarith
- (fun (_ : signedness) (n1 n2 : Integers.int)
- => Some (Vint (add n1 n2)))
- (fun (_ : signedness) (n1 n2 : int64)
- => Some (Vlong (Int64.add n1 n2)))
- (fun n1 n2 : Floats.float
- => Some (Vfloat (Floats.Float.add n1 n2)))
- (fun n1 n2 : Floats.float32
- => Some (Vsingle (Floats.Float32.add n1 n2)))
- v1 tint v2 tint m1 = Some res
->>
-
- - [add_op] is the addition of [Vint x] and [Vint y], and its result is
- [res].
-
-<<
-res_equ : sem_cast res tint tint m1 = Some (Vint z)
->>
-
-    - [res_equ] is the equation which says that the result of [f_add] is
-      [res], after it has been cast into a [tint] value.
-
- We can simplify [add_op] and [res_equ], and this allows us to
- conclude. *)
-
- smart_inv add_op.
- smart_inv res_equ.
- reflexivity.
-Qed.
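-
-(** For reference, here is the complete proof script presented in this
-    walkthrough, with the interleaved prose removed.
-
-<<
-Proof.
-  smart_inv trace.
-  rename H into Hstep. rename H0 into Hstar.
-  smart_inv Hstep.
-  rename le into tmp_env. rename e into c_env. rename H5 into f_entry.
-  smart_inv Hstar.
-  rename H into Hstep. rename H0 into Hstar.
-  smart_inv Hstep.
-  rename H7 into ev. rename v0 into res.
-  rename H8 into res_equ. rename H9 into mem_equ.
-  smart_inv Hstar; [| smart_inv H ].
-  smart_inv ev; [| smart_inv H].
-  rename H4 into fetch_x. rename H5 into fetch_y. rename H6 into add_op.
-  smart_inv f_entry.
-  clear H. clear H0. clear H1.
-  smart_inv H3; subst.
-  rename H2 into allocs.
-  smart_inv fetch_x; [| inversion H]. smart_inv H2.
-  smart_inv fetch_y; [| inversion H]. smart_inv H2.
-  smart_inv add_op.
-  smart_inv res_equ.
-  reflexivity.
-Qed.
->>
-
-    A couple dozen tactics overall, for a two-line C function. *)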
-
-(** ** Conclusion *)
-
-(** The definitions of Clight are easy to understand, and the #<a
- href="http://compcert.inria.fr/doc/index.html">#CompCert
- documentation#</a># is very pleasant to read. Understanding
- Clight and its semantics can be very interesting if you are
- working on a language that you want to translate into machine
- code. However, proving functional properties of a given C snippet
- using only CompCert can quickly become cumbersome. From this
- perspective, the #<a
- href="https://github.com/PrincetonUniversity/VST">#VST#</a>#
- project is very interesting, as its main purpose is to provide
- tools to reason about Clight programs more easily. *)
diff --git a/site/posts/ColorlessThemes-0.2.md b/site/posts/ColorlessThemes-0.2.md
new file mode 100644
index 0000000..8cb32a4
--- /dev/null
+++ b/site/posts/ColorlessThemes-0.2.md
@@ -0,0 +1,29 @@
+---
+published: 2020-02-15
+tags: ['releases']
+abstract: |
+  Introducing a new macro, more friendly with themes galleries like Peach
+  Melpa.
+---
+
+# Release of `colorless-themes-0.2`
+
+[I have tagged and released a new version of
+`colorless-themes`](https://git.sr.ht/~lthms/colorless-themes.el/refs/0.2).
+The motivation behind modifying the version number is an important breaking
+change regarding how the `colorless-themes-make`{.lisp} macro shall be used.
+
+Before `0.2`, the macro was calling `deftheme`{.lisp} and
+`provide-theme`{.lisp} itself. In practice, it meant the actual themes were
+not using these two functions themselves. It was working great in isolation,
+but it turns out many tools such as `use-package`{.lisp} or [Peach
+Melpa](https://peach-melpa.org) (an auto-generated Emacs themes gallery) rely
+on the presence of these functions to decide whether a file provides a
+theme or not. As of now, `nordless-theme` and `lavenderless-theme` have been
+updated accordingly, and [they appear on Peach
+Melpa](https://peach-melpa.org/themes/lavenderless-theme/variants/lavenderless).
+Their loading can also be deferred using the `:defer`{.lisp} keyword of the
+`use-package`{.lisp} macro.
+
+If you happen to be a user of `colorless-themes`, feel free to shoot an email!
+I would love to hear about your experience using a mostly colorless theme.
diff --git a/site/posts/Coqffi-1-0-0.md b/site/posts/Coqffi-1-0-0.md
new file mode 100644
index 0000000..e90d5b5
--- /dev/null
+++ b/site/posts/Coqffi-1-0-0.md
@@ -0,0 +1,522 @@
+---
+published: 2020-12-10
+modified: 2023-04-29
+tags: ['coq', 'ocaml', 'coqffi']
+abstract: |
+  For each entry of a cmi file, coqffi tries to generate an equivalent (from
+  the extraction mechanism perspective) Coq definition.
+  In this article, we
+  walk through how coqffi works.
+---
+
+# `coqffi.1.0.0` In A Nutshell
+
+For each entry of a `cmi` file (a *compiled* `mli` file), `coqffi`
+tries to generate an equivalent (from the extraction mechanism
+perspective) Coq definition. In this article, we walk through how
+`coqffi` works.
+
+Note that we do not dive into the vernacular commands `coqffi`
+generates. They are of no concern for users of `coqffi`.
+
+## Getting Started
+
+### Requirements
+
+The latest version of `coqffi` (`1.0.0~beta8`)
+is compatible with OCaml `4.08` up to `4.14`, and Coq `8.12` up to
+`8.13`. If you want to use `coqffi`, but have incompatible
+requirements of your own, feel free to
+[submit an issue](https://github.com/coq-community/coqffi/issues).
+
+### Installing `coqffi`
+
+The recommended way to install `coqffi` is through the
+[Opam Coq Archive](https://coq.inria.fr/opam/www), in the `released`
+repository. If you haven’t activated this repository yet, you can use the
+following bash command.
+
+```bash
+opam repo add coq-released https://coq.inria.fr/opam/released
+```
+
+Then, installing `coqffi` is as simple as
+
+```bash
+opam install coq-coqffi
+```
+
+You can also get the source from [the upstream `git`
+repository](https://github.com/coq-community/coqffi). The `README` provides the
+necessary pieces of information to build it from source.
+
+### Additional Dependencies
+
+One major difference between Coq and OCaml is that the former is pure,
+while the latter is not. Impurity can be modeled in pure languages,
+and Coq does not lack frameworks in this respect. `coqffi` currently
+supports two of them:
+[`coq-simple-io`](https://github.com/Lysxia/coq-simple-io) and
+[FreeSpec](https://github.com/ANSSI-FR/FreeSpec). It is also possible to use it
+with [Interaction Trees](https://github.com/DeepSpec/InteractionTrees), albeit
+in a less direct manner.
+
+### Primitive Types
+
+`coqffi` supports a set of primitive types, *i.e.*, a set of OCaml
+types for which it knows an equivalent type in Coq. The list is the
+following (the Coq types are fully qualified in the table, but not in
+the generated Coq module, as the necessary `Import` statements are
+generated too).
+
+| OCaml type                | Coq type                      |
+| ------------------------- | ----------------------------- |
+| `bool`{.ocaml}            | `Coq.Init.Datatypes.bool`     |
+| `char`{.ocaml}            | `Coq.Strings.Ascii.ascii`     |
+| `int`{.ocaml}             | `CoqFFI.Data.Int.i63`         |
+| `'a list`{.ocaml}         | `Coq.Init.Datatypes.list a`   |
+| `'a Seq.t`{.ocaml}        | `CoqFFI.Data.Seq.t`           |
+| `'a option`{.ocaml}       | `Coq.Init.Datatypes.option a` |
+| `('a, 'e) result`{.ocaml} | `Coq.Init.Datatypes.sum`      |
+| `string`{.ocaml}          | `Coq.Strings.String.string`   |
+| `unit`{.ocaml}            | `Coq.Init.Datatypes.unit`     |
+| `exn`{.ocaml}             | `CoqFFI.Exn`                  |
+
+The `i63`{.coq} type is introduced by the `CoqFFI`{.coq} theory to provide
+signed primitive integers to Coq users. They are implemented on top of the
+(unsigned) Coq native integers introduced in Coq `8.13`. The `i63` type will be
+deprecated once the support for [signed primitive
+integers](https://github.com/coq/coq/pull/13559) is implemented[^compat].
+
+[^compat]: This is actually one of the sources of incompatibility of `coqffi`
+    with most recent versions of Coq.
+
+When processing the entries of a given interface module, `coqffi` will
+check that they only use these types, or types introduced by the
+interface module itself.
+
+Sometimes, you may encounter a situation where you have two interface
+modules `a.mli` and `b.mli`, such that `b.mli` uses a type introduced
+in `a.mli`. To deal with this scenario, you can use the `--witness`
+flag to generate `A.v`. This will tell `coqffi` to also generate
+`A.ffi`; this file can then be used when generating `B.v` thanks to
+the `-I` option.
+Furthermore, for `B.v` to compile, the `--require`
+option needs to be used to ensure the `A` Coq library (`A.v`) is
+required.
+
+To give a more concrete example, given `a.mli`
+
+```ocaml
+type t
+```
+
+and `b.mli`
+
+```ocaml
+type a = A.t
+```
+
+we can generate `A.v` with the following commands:
+
+```bash
+ocamlc a.mli
+coqffi --witness -o A.v a.cmi
+```
+
+This generates the following axiom for `t`.
+
+```coq
+Axiom t : Type.
+```
+
+Then, generating `B.v` can be achieved as follows:
+
+```bash
+ocamlc b.mli
+coqffi -I A.ffi -ftransparent-types -r A -o B.v b.cmi
+```
+
+which results in the following output for `B.v`:
+
+```coq
+Require A.
+
+Definition a : Type := A.t.
+```
+
+## Code Generation
+
+`coqffi` distinguishes six types of entries: types, pure values,
+impure primitives, asynchronous primitives, exceptions, and
+modules. We now discuss how each one of them is handled.
+
+### Types
+
+By default, `coqffi` generates axiomatized definitions for each type defined in
+a `.cmi` file. This means that `type t`{.ocaml} becomes `Axiom t : Type`{.coq}.
+Polymorphism is supported, *i.e.*, `type 'a t`{.ocaml} becomes `Axiom t :
+forall (a : Type), Type`{.coq}.
+
+It is possible to provide a “model” for a type using the `coq_model`
+annotation, for instance, for reasoning purposes. That is, we can specify
+that a type is equivalent to a `list`.
+
+```ocaml
+type 'a t [@@coq_model "list"]
+```
+
+This generates the following Coq definition.
+
+```coq
+Definition t : forall (a : Type), Type := list.
+```
+
+It is important to be careful when using the `coq_model` annotation. More
+precisely, the fact that `t` is a `list` in the “Coq universe” shall not be
+used during the implementation phase, only during the verification phase.
+
+Unnamed polymorphic type parameters are also supported. In the presence of
+such parameters, `coqffi` finds them a name that is not already
+used.
For instance, + +```ocaml +type (_, 'a) ast +``` + +becomes + +```coq +Axiom ast : forall (b : Type) (a : Type), Type. +``` + +Finally, `coqffi` has got an experimental feature called `transparent-types` +(enabled by using the `-ftransparent-types` command-line argument). If the type +definition is given in the module interface, then `coqffi` tries to generate +an equivalent definition in Coq. For instance, + +```ocaml +type 'a llist = + | LCons of 'a * (unit -> 'a llist) + | LNil +``` + +becomes + +```coq +Inductive llist (a : Type) : Type := +| LCons (x0 : a) (x1 : unit -> llist a) : llist a +| LNil : llist a. +``` + +Mutually recursive types are supported, so + +```ocaml +type even = Zero | ESucc of odd +and odd = OSucc of even +``` + +becomes + +```coq +Inductive odd : Type := +| OSucc (x0 : even) : odd +with even : Type := +| Zero : even +| ESucc (x0 : odd) : even. +``` + +Besides, `coqffi` supports alias types, as suggested in this write-up +when we discuss witness files. + +The `transparent-types` feature is **experimental**, and is currently +limited to variant types. It notably does not support records. Besides, it may +generate incorrect Coq types, because it does not check whether or not the +[positivity +condition](https://coq.inria.fr/refman/language/core/inductive.html#positivity-condition) +is satisfied. + +### Pure values + +`coqffi` decides whether or not a given OCaml value is pure or impure +with the following heuristics: + +- Constants are pure +- Functions are impure by default +- Functions with a `coq_model` annotation are pure +- Functions marked with the `pure` annotation are pure +- If the `pure-module` feature is enabled (`-fpure-module`), then synchronous + functions (which do not live inside the + [~Lwt~](https://ocsigen.org/lwt/5.3.0/manual/manual) monad) are pure + +Similarly to types, `coqffi` generates axioms (or definitions if the +`coq_model` annotation is used) for pure values. 
+Then,
+
+```ocaml
+val unpack : string -> (char * string) option [@@pure]
+```
+
+becomes
+
+```coq
+Axiom unpack : string -> option (ascii * string).
+```
+
+Polymorphic values are supported.
+
+```ocaml
+val map : ('a -> 'b) -> 'a list -> 'b list [@@pure]
+```
+
+becomes
+
+```coq
+Axiom map : forall (a : Type) (b : Type), (a -> b) -> list a -> list b.
+```
+
+Again, unnamed polymorphic types are supported, so
+
+```ocaml
+val ast_to_string : _ ast -> string [@@pure]
+```
+
+becomes
+
+```coq
+Axiom ast_to_string : forall (a : Type), string.
+```
+
+### Impure Primitives
+
+`coqffi` reserves a special treatment for *impure* OCaml functions.
+Impurity is usually handled in pure programming languages by means of
+monads, and `coqffi` is no exception to the rule.
+
+Given the set of impure primitives declared in an interface module,
+`coqffi` will (1) generate a typeclass which gathers these primitives,
+and (2) generate instances of this typeclass for supported backends.
+
+We illustrate the rest of this section with the following impure
+primitives.
+
+```ocaml
+val echo : string -> unit
+val scan : unit -> string
+```
+
+where `echo` allows writing something to the standard output, and `scan`
+allows reading the standard input.
+
+Assuming the processed module interface is named `console.mli`, the
+following Coq typeclass is generated.
+
+```coq
+Class MonadConsole (m : Type -> Type) := { echo : string -> m unit
+                                         ; scan : unit -> m string
+                                         }.
+```
+
+Using this typeclass, and with the support of an additional
+`Monad` typeclass, we can specify impure computations which interact
+with the console. For instance, with the support of `ExtLib`, one can
+write
+
+```coq
+Definition pipe `{Monad m, MonadConsole m} : m unit :=
+  let* msg := scan () in
+  echo msg.
+```
+
+There is no canonical way to model impurity in Coq, but over the years
+several frameworks have been released to tackle this challenge.
+
+`coqffi` provides three features related to impure primitives.
+
+#### `simple-io`
+
+When this feature is enabled, `coqffi` generates an instance of the
+typeclass for the `IO` monad introduced in the `coq-simple-io` package.
+
+```coq
+Axiom io_echo : string -> IO unit.
+Axiom io_scan : unit -> IO string.
+
+Instance IO_MonadConsole : MonadConsole IO := { echo := io_echo
+                                              ; scan := io_scan
+                                              }.
+```
+
+It is enabled by default, but can be disabled using the
+`-fno-simple-io` command-line argument.
+
+#### `interface`
+
+When this feature is enabled, `coqffi` generates an inductive type which
+describes the set of primitives available, to be used with frameworks like
+[FreeSpec](https://github.com/lthms/FreeSpec) or [Interaction
+Trees](https://github.com/DeepSpec/InteractionTrees).
+
+```coq
+Inductive CONSOLE : Type -> Type :=
+| Echo : string -> CONSOLE unit
+| Scan : unit -> CONSOLE string.
+
+Definition inj_echo `{Inject CONSOLE m} (x0 : string) : m unit :=
+  inject (Echo x0).
+
+Definition inj_scan `{Inject CONSOLE m} (x0 : unit) : m string :=
+  inject (Scan x0).
+
+Instance Inject_MonadConsole `{Inject CONSOLE m} : MonadConsole m :=
+  { echo := inj_echo
+  ; scan := inj_scan
+  }.
+```
+
+Providing an instance of the form `forall i, Inject i M`{.coq} is enough for
+your monad `M` to be compatible with this feature[^example].
+
+[^example]: See for instance [how FreeSpec implements
+    it](https://github.com/lthms/FreeSpec/blob/master/theories/FFI/FFI.v).
+
+#### `freespec`
+
+When this feature is enabled, `coqffi` generates a semantics for the
+inductive type generated by the `interface` feature.
+
+```coq
+Axiom unsafe_echo : string -> unit.
+Axiom unsafe_scan : unit -> string.
+
+Definition console_unsafe_semantics : semantics CONSOLE :=
+  bootstrap (fun a e =>
+    local match e in CONSOLE a return a with
+          | Echo x0 => unsafe_echo x0
+          | Scan x0 => unsafe_scan x0
+          end).
+```
+
+### Asynchronous Primitives
+
+`coqffi` also reserves a special treatment for *asynchronous*
+primitives —*i.e.*, functions which live inside the `Lwt` monad— when
+the `lwt` feature is enabled.
+
+The treatment is very analogous to the one for impure primitives: (1)
+a typeclass is generated (with the `_Async` suffix), and (2) an
+instance for the `Lwt` monad is generated. Besides, an instance for
+the “synchronous” primitives is also generated for `Lwt`. If the
+`interface` feature is enabled, an interface datatype is generated,
+which means you can potentially use Coq to reason about your
+asynchronous programs (using FreeSpec and alike, although the
+interleaving of asynchronous programs is not yet supported in
+FreeSpec).
+
+By default, the type of the `Lwt` monad is `Lwt.t`. You can override
+this setting using the `--lwt-alias` option. This can be useful when
+you are using an alias type in place of `Lwt.t`.
+
+### Exceptions
+
+OCaml features an exception mechanism. Developers can define their
+own exceptions using the `exception` keyword, whose syntax is similar
+to the constructors’ definition. For instance,
+
+```ocaml
+exception Foo of int * bool
+```
+
+introduces a new exception `Foo` which takes two parameters of type `int`{.ocaml} and
+`bool`{.ocaml}. `Foo (x, y)` constructs a value of type `exn`{.ocaml}.
+
+For each new exception introduced in an OCaml module, `coqffi`
+generates (1) a so-called “proxy type,” and (2) conversion functions
+to and from this type.
+
+Coming back to our example, the “proxy type” generated by `coqffi` is
+
+```coq
+Inductive FooExn : Type :=
+| MakeFooExn (x0 : i63) (x1 : bool) : FooExn.
+```
+
+Then, `coqffi` generates conversion functions.
+
+```coq
+Axiom exn_of_foo : FooExn -> exn.
+Axiom foo_of_exn : exn -> option FooExn.
+```
+
+Besides, `coqffi` also generates an instance of the `Exn` typeclass
+provided by the `CoqFFI` theory:
+
+```coq
+Instance FooExn_Exn : Exn FooExn :=
+  { to_exn := exn_of_foo
+  ; of_exn := foo_of_exn
+  }.
+```
+
+Under the hood, `exn`{.ocaml} is an
+[extensible
+datatype](https://caml.inria.fr/pub/docs/manual-ocaml/extensiblevariants.html),
+and how `coqffi` supports it will probably be generalized in future releases.
+
+Finally, `coqffi` has minimal support for functions which may raise
+exceptions. Since the OCaml type system does not allow identifying such
+functions, they need to be annotated explicitly, using the
+`may_raise` annotation. In such a case, `coqffi` will change the
+return type of the function to use the `sum` Coq inductive type.
+
+For instance,
+
+```ocaml
+val from_option : 'a option -> 'a [@@may_raise] [@@pure]
+```
+
+becomes
+
+```coq
+Axiom from_option : forall (a : Type), option a -> sum a exn.
+```
+
+### Modules
+
+Lastly, `coqffi` supports OCaml modules described within `mli` files,
+when they are specified as `module T : sig ... end`{.ocaml}. For instance,
+
+```ocaml
+module T : sig
+  type t
+
+  val to_string : t -> string [@@pure]
+end
+```
+
+becomes
+
+```coq
+Module T.
+  Axiom t : Type.
+
+  Axiom to_string : t -> string.
+End T.
+```
+
+As of now, the following construction is unfortunately *not*
+supported, and will be ignored by `coqffi`:
+
+```ocaml
+module type S = sig
+  type t
+
+  val to_string : t -> string [@@pure]
+end
+
+module T : S
+```
+
+## Moving Forward
+
+`coqffi` comes with a comprehensive man page. In addition, the
+interested reader can proceed to the next article of this series,
+which explains how [`coqffi` can be used to easily implement an echo
+server in Coq](/posts/CoqffiEcho.html).
diff --git a/site/posts/Coqffi.org b/site/posts/Coqffi.org deleted file mode 100644 index f8d9695..0000000 --- a/site/posts/Coqffi.org +++ /dev/null @@ -1,18 +0,0 @@ -#+TITLE: A series on ~coqffi~ - -#+SERIES: ../coq.html -#+SERIES_PREV: AlgebraicDatatypes.html - -~coqffi~ generates Coq FFI modules from compiled OCaml interface -modules (~.cmi~). In practice, it greatly reduces the hassle to mix -OCaml and Coq modules within the same codebase, especially when used -together with the ~dune~ build system. - -- [[./CoqffiIntro.org][~coqffi~ in a Nutshell]] :: - For each entry of a ~cmi~ file, ~coqffi~ tries to generate an - equivalent (from the extraction mechanism perspective) Coq - definition. In this article, we walk through how ~coqffi~ works. -- [[./CoqffiEcho.org][Implementing an Echo Server in Coq with ~coqffi~]] :: - In this tutorial, we will demonstrate how ~coqffi~ can be used to - implement an echo server, /i.e./, a TCP server which sends back - any input it receives from its clients. diff --git a/site/posts/CoqffiEcho.md b/site/posts/CoqffiEcho.md new file mode 100644 index 0000000..0494854 --- /dev/null +++ b/site/posts/CoqffiEcho.md @@ -0,0 +1,519 @@ +--- +published: 2020-12-10 +modified: 2021-08-20 +tags: ['coq', 'ocaml', 'coqffi'] +abstract: | + In this article, we will demonstrate how `coqffi` can be used to implement + an echo server, *i.e.*, a TCP server which sends back any input it receives + from its clients. +--- + +# Implementing an Echo Server in Coq with `coqffi.1.0.0` + +In this article, we will demonstrate how `coqffi` can be used to +implement an echo server, *i.e.*, a TCP server which sends back any +input it receives from its clients. In addition to `coqffi`, you will need to +install `coq-simple-io`. The latter is available in the [`released` repository +of the Opam Coq Archive](https://github.com/coq/opam-coq-archive). 
+
+```bash
+opam install coq-coqffi coq-simple-io
+```
+
+Besides, you can download [the source tree presented in this
+article](/files/coqffi-tutorial.tar.gz) if you want to read the source
+directly, or modify it to your taste.
+
+## Project Layout
+
+Before diving too much into the implementation of our echo server, we
+first give an overview of the resulting project’s layout. Since we aim
+at implementing a program, we draw our inspiration from the idiomatic
+way of organizing an OCaml project.
+
+We have three directories at the root of the project.
+
+- **`ffi/` contains the low-level OCaml code:**
+  It provides an OCaml library (`ffi`), and a Coq theory (`FFI`{.coq}) which
+  gathers the FFI modules generated by `coqffi`.
+- **`src/` contains the Coq implementation of our echo server:** It provides a
+  Coq theory (`Echo`{.coq}) which depends on the `FFI`{.coq} theory and the
+  `SimpleIO`{.coq} theory of `coq-simple-io`. This theory provides the
+  implementation of our echo server in Coq.
+- **`bin/` contains the pieces of code to get an executable program:** It
+  contains a Coq module (`echo.v`) which configures and uses the extraction
+  mechanism to generate an OCaml module (`echo.ml`). This OCaml module can be
+  compiled to get an executable program.
+
+Note that we could have decided to only have one Coq theory. We could
+also have added a fourth directory (`theories/`) for formal
+verification specific code, but this is out of the scope of this
+tutorial.
+
+Overall, we use `dune` to compile and compose the different parts of
+the echo server. `dune` has a native —yet unstable at the time of
+writing— support for building Coq projects, with very convenient
+stanzas like `coq.theory` and `coq.extraction`.
+
+The following graph summarizes the dependencies between each component
+(plain arrows symbolize software dependencies).
+
+![The echo server dependency graph. Dashed boxes are
+generated.](/img/echo-deps.svg)
+
+We enable the Coq-related stanzas with `(using coq 0.2)`{.lisp} in the
+`dune-project` file.
+
+```lisp
+(lang dune 2.7)
+(using coq 0.2)
+```
+
+The rest of this tutorial proceeds by diving into each directory.
+
+## FFI Bindings
+
+Our objective is to implement an echo server, *i.e.*, a server which
+(1) accepts incoming connections, and (2) sends back any incoming
+messages. We will consider two classes of effects. One is related to
+creating and manipulating TCP sockets. The other is dedicated to
+process management, more precisely to be able to fork when receiving
+incoming connections.
+
+Therefore, the `ffi` library will provide two modules. Likewise, the
+`FFI`{.coq} theory will provide two analogous modules generated by `coqffi`.
+
+In the `ffi/` directory, we add the following stanza to the `dune` file.
+
+```lisp
+(library
+ (name ffi)
+ (libraries unix))
+```
+
+`dune` will look for any `.ml` and `.mli` files within the directory and will
+consider they belong to the `ffi` library. We use the
+[`unix`](https://caml.inria.fr/pub/docs/manual-ocaml/libref/Unix.html) library
+to implement the features we are looking for.
+
+Then, we add the following stanza to the `dune` file of the `ffi/`
+directory.
+
+```lisp
+(coq.theory
+ (name FFI))
+```
+
+This tells `dune` to look for `.v` files within the `ffi/` directory,
+in order to build them with Coq. A nice feature of `dune` is that if we
+automatically generate Coq files, they will be automatically “attached” to this
+theory.
+
+### Sockets
+
+Sockets are boring. The following OCaml module interface provides the
+necessary type and functions to manipulate them.
+ +```ocaml +type socket_descr + +val open_socket : string -> int -> socket_descr +val listen : socket_descr -> unit +val recv : socket_descr -> string +val send : socket_descr -> string -> int +val accept_connection : socket_descr -> socket_descr +val close_socket : socket_descr -> unit +``` + +Our focus is how to write the interface modules for `coqffi`. Since the object +of this tutorial is not the implementation of an echo server in itself, the +implementation details of the `ffi` library will not be discussed, but is +provided at the end of this article. + +`dune` generates `.cmi` files for the `.mli` files of our library, and +provides the necessary bits to easily locate them. Besides, the +`action` stanza can be used here to tell to `dune` how to generate the +module `Socket.v` from `file.cmi`. We add the following entry to +`ffi/dune`. + +```lisp +(rule + (target Socket.v) + (action (run coqffi %{cmi:socket} -o %{target}))) +``` + +We call `coqffi` without any feature-related command-line argument, +which means only the `simple-io` feature is enabled. As a consequence, +the `socket_descr` type is axiomatized in Coq, and in addition to a +`MonadSocket` monad, `coqffi` will generate an instance for this monad +for the `IO` monad of `coq-simple-io`. + +The stanza generates the following Coq module. + +```coq +(* This file has been generated by coqffi. *) + +Set Implicit Arguments. +Unset Strict Implicit. +Set Contextual Implicit. +Generalizable All Variables. +Close Scope nat_scope. + +From CoqFFI Require Export Extraction. +From SimpleIO Require Import IO_Monad. + +Axiom socket_descr : Type. + +Extract Constant socket_descr => "Ffi.Socket.socket_descr". 
+ +(** * Impure Primitives *) + +(** ** Monad Definition *) + +Class MonadSocket (m : Type -> Type) : Type := + { open_socket : string -> i63 -> m socket_descr + ; listen : socket_descr -> m unit + ; recv : socket_descr -> m string + ; send : socket_descr -> string -> m i63 + ; accept_connection : socket_descr -> m socket_descr + ; close_socket : socket_descr -> m unit + }. + +(** ** [IO] Instance *) + +Axiom io_open_socket : string -> i63 -> IO socket_descr. +Axiom io_listen : socket_descr -> IO unit. +Axiom io_recv : socket_descr -> IO string. +Axiom io_send : socket_descr -> string -> IO i63. +Axiom io_accept_connection : socket_descr -> IO socket_descr. +Axiom io_close_socket : socket_descr -> IO unit. + +Extract Constant io_open_socket + => "(fun x1 x2 k__ -> k__ ((Ffi.Socket.open_socket x1 x2)))". +Extract Constant io_listen => "(fun x1 k__ -> k__ ((Ffi.Socket.listen x1)))". +Extract Constant io_recv => "(fun x1 k__ -> k__ ((Ffi.Socket.recv x1)))". +Extract Constant io_send + => "(fun x1 x2 k__ -> k__ ((Ffi.Socket.send x1 x2)))". +Extract Constant io_accept_connection + => "(fun x1 k__ -> k__ ((Ffi.Socket.accept_connection x1)))". +Extract Constant io_close_socket + => "(fun x1 k__ -> k__ ((Ffi.Socket.close_socket x1)))". + +Instance IO_MonadSocket : MonadSocket IO := + { open_socket := io_open_socket + ; listen := io_listen + ; recv := io_recv + ; send := io_send + ; accept_connection := io_accept_connection + ; close_socket := io_close_socket + }. + +(* The generated file ends here. *) +``` + +### Process Management + +In order to avoid a client to block the server by connecting to it +without sending anything, we can fork a new process for each client. + +```ocaml +type identity = Parent of int | Child + +val fork : unit -> identity +``` + +This time, the `proc.mli` module interface introduces a transparent +type, /i.e./, it also provides its definition. This is a good use case +for the `transparent-types` feature of `coqffi`. 
In the stanza for +generating `Proc.v`, we enable it with the `-ftransparent-types` +command-line argument, like this. + +```lisp +(rule + (target Proc.v) + (action (run coqffi -ftransparent-types %{cmi:proc} -o %{target}))) +``` + +which generates the following Coq module. + +```coq +(* This file has been generated by coqffi. *) + +Set Implicit Arguments. +Unset Strict Implicit. +Set Contextual Implicit. +Generalizable All Variables. +Close Scope nat_scope. + +From CoqFFI Require Export Extraction. +From SimpleIO Require Import IO_Monad. + +Inductive identity : Type := +| Parent (x0 : i63) : identity +| Child : identity. + +Extract Inductive identity => "Ffi.Proc.identity" + [ "Ffi.Proc.Parent" "Ffi.Proc.Child" ]. + +(** * Impure Primitives *) + +(** ** Monad Definition *) + +Class MonadProc (m : Type -> Type) : Type := { fork : unit -> m identity + }. + +(** ** [IO] Instance *) + +Axiom io_fork : unit -> IO identity. + +Extract Constant io_fork => "(fun x1 k__ -> k__ ((Ffi.Proc.fork x1)))". + +Instance IO_MonadProc : MonadProc IO := { fork := io_fork + }. + +(* The generated file ends here. *) +``` + +We now have everything we need to implement an echo server in Coq. + +## Implementing an Echo Server + +Our implementation will be part of a dedicated Coq theory, called `Echo`{.coq}. +This is done easily a `dune`{.coq} file in the `src/` directory, with the +following content. + +```lisp +(coq.theory + (name Echo) + (theories FFI)) +``` + +In the rest of this section, we will discuss the content of the unique +module of this theory. Hopefully, readers familiar with programming +impurity by means of monads will not find anything particularly +surprising here. + +Let us start with the inevitable sequence of import commands. We use +the `Monad`{.coq} and `MonadFix`{.coq} typeclasses of `ExtLib`{.coq}, and our +FFI modules from the `FFI`{.coq} theory we have previously defined. + +```coq +From ExtLib Require Import Monad MonadFix. +From FFI Require Import Proc Socket. 
+```
+
+Letting Coq guess the type of unintroduced variables using the `` ` ``
+annotation (*e.g.*, in the presence of `` `{Monad m}``{.coq}, Coq understands
+`m` is of type `Type -> Type`) is always nice, so we enable it.
+
+```coq
+Generalizable All Variables.
+```
+
+We enable the monad notation provided by `ExtLib`. In this article, we
+prefer the `let*` notation (as recently introduced by OCaml) over the
+`<-` notation of Haskell, but both are available.
+
+```coq
+Import MonadLetNotation.
+Open Scope monad_scope.
+```
+
+Then, we define a notation to be able to define local, monadic
+recursive functions using the `mfix` combinator of the `MonadFix`
+typeclass.
+
+```coq
+Notation "'let_rec*' f x ':=' p 'in' q" :=
+  (let f := mfix (fun f x => p) in q)
+  (at level 61, x pattern, f name, q at next level, right associativity).
+```
+
+Note that `mfix` does *not* check whether or not the defined function
+will terminate (contrary to the `fix` keyword of Coq). This is
+fortunate because in our case, we do not want our echo server to
+converge, but rather to accept an infinite number of connections.
+
+We can demonstrate how this notation can be leveraged by defining a
+generic TCP server, parameterized by a handler to deal with incoming
+connections.
+
+```coq
+Definition tcp_srv `{Monad m, MonadFix m, MonadProc m, MonadSocket m}
+  (handler : socket_descr -> m unit)
+  : m unit :=
+  let* srv := open_socket "127.0.0.1" 8888 in
+  listen srv;;
+
+  let_rec* tcp_aux _ :=
+    let* client := accept_connection srv in
+    let* res := fork tt in
+    match res with
+    | Parent _ => close_socket client >>= tcp_aux
+    | Child => handler client
+    end
+  in
+
+  tcp_aux tt.
+```
+
+The handler for the echo server is straightforward: it just reads
+incoming bytes from the socket, sends them back, and closes the socket.
+
+```coq
+Definition echo_handler `{Monad m, MonadSocket m} (sock : socket_descr)
+  : m unit :=
+  let* msg := recv sock in
+  send sock msg;;
+  close_socket sock.
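+
+(* Note that [send] returns the number of bytes actually written (an [i63],
+   since [socket.mli] declares it as [int]), and [;;] simply discards this
+   value. A more defensive handler could bind the result and check that the
+   whole message was sent. *)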
+```
+
+Composing our generic TCP server with our echo handler gives us an
+echo server.
+
+```coq
+Definition echo_server `{Monad m, MonadFix m, MonadProc m, MonadSocket m}
+  : m unit :=
+  tcp_srv echo_handler.
+```
+
+Because `coqffi` has generated typeclasses for the impure primitives
+of `proc.mli` and `socket.mli`, `echo_server` is polymorphic, and can
+be instantiated for different monads. When it comes to extracting our
+program, we will generally prefer the `IO` monad of `coq-simple-io`.
+But we could also imagine verifying the client handler with FreeSpec,
+or the generic TCP server with Interaction Trees (which support
+diverging computations). Overall, we can have different verification
+strategies for different parts of our program, by leveraging the most
+relevant framework for each part, yet being able to extract it in an
+efficient form.
+
+The next section shows how this last part is achieved using, once
+again, a convenient stanza of dune.
+
+## Extracting and Building an Executable
+
+The `0.2` version of the Coq-related stanzas of `dune` provides the
+`coq.extraction` stanza, which can be used to build a Coq module
+expected to generate `ml` files.
+
+In our case, we will write `bin/echo.v` to extract the `echo_server`
+into an `echo.ml` module, and use the `executable` stanza of `dune` to
+get an executable from this file. To achieve this, the `bin/dune`
+file simply requires these two stanzas.
+
+```lisp
+(coq.extraction
+ (prelude echo)
+ (theories Echo)
+ (extracted_modules echo))
+
+(executable
+ (name echo)
+ (libraries ffi))
+```
+
+We are almost done. We now need to write the `echo.v` module, which
+mostly consists of (1) providing a `MonadFix` instance for the `IO`
+monad, (2) using the `IO.unsafe_run` function to escape the `IO`
+monad, and (3) calling the `Extraction`{.coq} command to wrap it up.
+
+```coq
+From Coq Require Extraction.
+From ExtLib Require Import MonadFix.
+From SimpleIO Require Import SimpleIO.
+From Echo Require Import Server.
+
+Instance MonadFix_IO : MonadFix IO :=
+  { mfix := @IO.fix_io }.
+
+Definition main : io_unit :=
+  IO.unsafe_run echo_server.
+
+Extraction "echo.ml" main.
+```
+
+Since we are using the `i63`{.coq} type (signed 63-bit integers) of the
+`CoqFFI` theory, and since `i63`{.coq} is implemented under the hood with Coq
+primitive integers, we *also* need to provide a `Uint63`{.ocaml} module with an
+`of_int`{.ocaml} function. Fortunately, this module is straightforward to
+write.
+
+```ocaml
+let of_int x = x
+```
+
+And *voilà*. A call to `dune` at the root of the repository will
+build everything (Coq and OCaml alike). Starting the echo server
+is as simple as
+
+```bash
+dune exec bin/echo.exe
+```
+
+And connecting to it can be achieved with a program like `telnet`.
+
+```console
+$ telnet 127.0.0.1 8888
+Trying 127.0.0.1...
+Connected to 127.0.0.1.
+Escape character is '^]'.
+hello, echo server!
+hello, echo server!
+Connection closed by foreign host.
+```
+
+## Appendix
+
+### The `Socket` OCaml Module
+
+There is not much to say, except that (as already stated) we use the
+`Unix`{.ocaml} module to manipulate sockets, and we attach to each socket a
+buffer to store incoming bytes.
+
+```ocaml
+let buffer_size = 1024
+
+type socket_descr = {
+  fd : Unix.file_descr;
+  recv_buffer : bytes;
+}
+
+let from_fd fd =
+  let rbuff = Bytes.create buffer_size in
+  { fd = fd; recv_buffer = rbuff }
+
+let open_socket hostname port =
+  let open Unix in
+  let addr = inet_addr_of_string hostname in
+  let fd = socket PF_INET SOCK_STREAM 0 in
+  setsockopt fd SO_REUSEADDR true;
+  bind fd (ADDR_INET (addr, port));
+  from_fd fd
+
+let listen sock = Unix.listen sock.fd 1
+
+let recv sock =
+  let s = Unix.read sock.fd sock.recv_buffer 0 buffer_size in
+  Bytes.sub_string sock.recv_buffer 0 s
+
+let send sock msg =
+  Unix.write_substring sock.fd msg 0 (String.length msg)
+
+let accept_connection sock =
+  Unix.accept sock.fd |> fst |> from_fd
+
+let close_socket sock = Unix.close sock.fd
+```
+
+### The `Proc` OCaml Module
+
+Thanks to the `Unix` module, the implementation is pretty straightforward.
+
+```ocaml
+type identity = Parent of int | Child
+
+let fork x =
+  match Unix.fork x with
+  | 0 -> Child
+  | x -> Parent x
+```
diff --git a/site/posts/CoqffiEcho.org b/site/posts/CoqffiEcho.org
deleted file mode 100644
index 8d48c48..0000000
--- a/site/posts/CoqffiEcho.org
+++ /dev/null
@@ -1,472 +0,0 @@
-#+TITLE: Implementing an Echo Server in Coq with ~coqffi~
-
-#+SERIES: ./Coqffi.html
-#+SERIES_PREV: ./CoqffiIntro.html
-
-#+NAME: coqffi_output
-#+BEGIN_SRC sh :results output :exports none :var mod=""
-cat ${ROOT}/lp/coqffi-tutorial/_build/default/ffi/${mod}
-#+END_SRC
-
-In this article, we will demonstrate how ~coqffi~ can be used to
-implement an echo server, /i.e./, a TCP server which sends back any
-input it receives from its clients. In addition to ~coqffi~, you will
-need to install ~coq-simple-io~. The latter is available in the
-~released~ repository of the Opam Coq Archive.
- -#+BEGIN_SRC sh -opam install coq-coqffi coq-simple-io -#+END_SRC - -Besides, this article is a literate program, and you can download -[[/files/coqffi-tutorial.tar.gz][the resulting source tree]] if you -want to try to read the source directly, or modify it to your taste. - -#+BEGIN_EXPORT html -<nav id="generate-toc"></nav> -<div id="history">site/posts/CoqffiEcho.org</div> -#+END_EXPORT - -* Project Layout - -Before diving too much into the implementation of our echo server, we -first give an overview of the resulting project’s layout. Since we aim -at implementing a program, we draw our inspiration from the idiomatic -way of organizing a OCaml project. - -#+BEGIN_SRC sh :results output :exports results -cd ${ROOT}/lp/coqffi-tutorial/ -tree --noreport -I "_build" -#+END_SRC - -We have three directories at the root of the project. - -- ~ffi/~ contains the low-level OCaml code :: - It provides an OCaml library (~ffi~), and a Coq theory (~FFI~) which - gathers the FFI modules generated by ~coqffi~. -- ~src/~ contains the Coq implementation of our echo server :: - It provides a Coq theory (~Echo~) which depends on the ~FFI~ theory - the ~SimpleIO~ theory of ~coq-simple~io~. This theory provides the - implementation of our echo server in Coq. -- ~bin/~ contains the pieces of code to get an executable program :: - It contains a Coq module (~echo.v~) which configures and uses the - extraction mechanism to generate an OCaml module (~echo.ml~). This - OCaml module can be compiled to get an executable program. - -Note that we could have decided to only have one Coq theory. We could -also have added a fourth directory (~theories/~) for formal -verification specific code, but this is out of the scope of this -tutorial. - -Overall, we use ~dune~ to compile and compose the different parts of -the echo server. ~dune~ has a native —yet unstable at the time of -writing— support for building Coq projects, with very convenient -stanzas like =coq.theory= and =coq.extraction=. 
- -The following graph summarizes the dependencies between each component -(plain arrows symbolize software dependencies). - -#+BEGIN_SRC dot :file deps.svg :exports results -digraph dependencies { - graph [nodesep="0.4"]; - rankdir="LR"; - node [shape=box]; - subgraph { - rank=same; - FFI [label="Socket.v" style="dashed"]; - ffi [label="socket.mli"]; - } - subgraph { - Echo [label="Echo.v"]; - } - - subgraph { - rank=same; - echo_v [label="main.v"]; - echo_ml [label="main.ml" style="dashed"]; - } - - ffi -> FFI [style="dashed" label="coqffi "]; - echo_ml -> echo_v [dir=back style="dashed" label="coqc "]; - FFI -> Echo -> echo_v; - ffi -> echo_ml; -} -#+END_SRC - -We enable Coq-related stanza with ~(using coq 0.2)~ in the -~dune-project~. -file. - -#+BEGIN_SRC lisp :tangle coqffi-tutorial/dune-project -(lang dune 2.7) -(using coq 0.2) -#+END_SRC - -The rest of this tutorial proceeds by diving into each directory. - -* FFI Bindings - -Our objective is to implement an echo server, /i.e./, a server which -(1) accepts incoming connections, and (2) sends back any incoming -messages. We will consider two classes of effects. One is related to -creating and manipulating TCP sockets. The other is dedicated to -process management, more precisely to be able to fork when receiving -incoming connections. - -Therefore, the ~ffi~ library will provide two modules. Likewise, the -~FFI~ theory will provide two analogous modules generated by ~coqffi~. - -In the ~ffi/~ directory, we add the following stanza to the ~dune~ -file. - -#+BEGIN_SRC lisp :tangle coqffi-tutorial/ffi/dune -(library - (name ffi) - (libraries unix)) -#+END_SRC - -~dune~ will look for any ~.ml~ and ~.mli~ files within the directory -and will consider they belong to the ~ffi~ library. We use the -[[https://caml.inria.fr/pub/docs/manual-ocaml/libref/Unix.html][~unix~]] -library to implement the features we are looking for. - -Then, we add the following stanza to the ~dune~ file of the ~ffi/~ -directory. 
- -#+BEGIN_SRC lisp :tangle coqffi-tutorial/ffi/dune -(coq.theory - (name FFI)) -#+END_SRC - -This tells ~dune~ to look for ~.v~ file within the ~ffi/~ directory, -in order to build them with Coq. A nice feature of ~dune~ is that if -we automatically generate Coq files, they will be automatically -“attached” to this theory. - -** Sockets - -Sockets are boring. The following OCaml module interface provides the -necessary type and functions to manipulate them. - -#+BEGIN_SRC ocaml :tangle coqffi-tutorial/ffi/socket.mli -type socket_descr - -val open_socket : string -> int -> socket_descr -val listen : socket_descr -> unit -val recv : socket_descr -> string -val send : socket_descr -> string -> int -val accept_connection : socket_descr -> socket_descr -val close_socket : socket_descr -> unit -#+END_SRC - -Our focus is how to write the interface modules for ~coqffi~. Since -the object of this tutorial is not the implementation of an echo -server in itself, the implementation details of the ~ffi~ library will -not be discussed. - -#+BEGIN_details -#+HTML: <summary>Implementation for <code>socket.mli</code></summary> - -There is not much to say, except that (as already stated) we use the -~Unix~ module to manipulate sockets, and we attach to each socket a -buffer to store incoming bytes. 
- -#+BEGIN_SRC ocaml :tangle coqffi-tutorial/ffi/socket.ml -let buffer_size = 1024 - -type socket_descr = { - fd : Unix.file_descr; - recv_buffer : bytes; -} - -let from_fd fd = - let rbuff = Bytes.create buffer_size in - { fd = fd; recv_buffer = rbuff } - -let open_socket hostname port = - let open Unix in - let addr = inet_addr_of_string hostname in - let fd = socket PF_INET SOCK_STREAM 0 in - setsockopt fd SO_REUSEADDR true; - bind fd (ADDR_INET (addr, port)); - from_fd fd - -let listen sock = Unix.listen sock.fd 1 - -let recv sock = - let s = Unix.read sock.fd sock.recv_buffer 0 buffer_size in - Bytes.sub_string sock.recv_buffer 0 s - -let send sock msg = - Unix.write_substring sock.fd msg 0 (String.length msg) - -let accept_connection sock = - Unix.accept sock.fd |> fst |> from_fd - -let close_socket sock = Unix.close sock.fd -#+END_SRC -#+END_details - -~dune~ generates ~.cmi~ files for the ~.mli~ files of our library, and -provides the necessary bits to easily locate them. Besides, the -=action= stanza can be used here to tell to ~dune~ how to generate the -module ~Socket.v~ from ~file.cmi~. We add the following entry to -~ffi/dune~. - -#+BEGIN_SRC lisp :tangle coqffi-tutorial/ffi/dune -(rule - (target Socket.v) - (action (run coqffi %{cmi:socket} -o %{target}))) -#+END_SRC - -We call ~coqffi~ without any feature-related command-line argument, -which means only the ~simple-io~ feature is enabled. As a consequence, -the ~socket_descr~ type is axiomatized in Coq, and in addition to a -=MonadSocket= monad, ~coqffi~ will generate an instance for this monad -for the =IO= monad of ~coq-simple-io~. - -Interested readers can have a look at the generated Coq module below. 
- -#+BEGIN_details -#+HTML: <summary><code>Socket.v</code> as generated by <code>coqffi</code></summary> - -#+BEGIN_SRC coq :noweb yes -<<coqffi_output(mod="Socket.v")>> -#+END_SRC -#+END_details - -** Process Management - -In order to avoid a client to block the server by connecting to it -without sending anything, we can fork a new process for each client. - -#+BEGIN_SRC ocaml :tangle coqffi-tutorial/ffi/proc.mli -type identity = Parent of int | Child - -val fork : unit -> identity -#+END_SRC - -#+BEGIN_details -#+HTML: <summary>Implementation for <code>proc.mli</code></summary> - -Again, thanks to the ~Unix~ module, the implementation is pretty -straightforward. - -#+BEGIN_SRC ocaml :tangle coqffi-tutorial/ffi/proc.ml -type identity = Parent of int | Child - -let fork x = - match Unix.fork x with - | 0 -> Child - | x -> Parent x -#+END_SRC -#+END_details - -This time, the ~proc.mli~ module interface introduces a transparent -type, /i.e./, it also provides its definition. This is a good use case -for the ~transparent-types~ feature of ~coqffi~. In the stanza for -generating ~Proc.v~, we enable it with the ~-ftransparent-types~ -command-line argument, like this. - -#+BEGIN_SRC lisp :tangle coqffi-tutorial/ffi/dune -(rule - (target Proc.v) - (action (run coqffi -ftransparent-types %{cmi:proc} -o %{target}))) -#+END_SRC - -#+BEGIN_details -#+HTML: <summary><code>Proc.v</code> as generated by <code>coqffi</code></summary> -#+BEGIN_SRC coq :noweb yes -<<coqffi_output(mod="Proc.v")>> -#+END_SRC -#+END_details - -We now have everything we need to implement an echo server in Coq. - -* Implementing an Echo Server - -Our implementation will be part of a dedicated Coq theory, called -~Echo~. This is done easily a ~dune~ file in the ~src/~ directory, -with the following content. 
- -#+BEGIN_SRC lisp :tangle coqffi-tutorial/src/dune -(coq.theory - (name Echo) - (theories FFI)) -#+END_SRC - -In the rest of this section, we will discuss the content of the unique -module of this theory. Hopefully, readers familiar with programming -impurity by means of monads will not find anything particularly -surprising here. - -Let us start with the inevitable sequence of import commands. We use -the =Monad= and =MonadFix= typeclasses of =ExtLib=, and our FFI -modules from the =FFI= theory we have previously defined. - -#+BEGIN_SRC coq :tangle coqffi-tutorial/src/Server.v -From ExtLib Require Import Monad MonadFix. -From FFI Require Import Proc Socket. -#+END_SRC - -Letting Coq guess the type of unintroduced variables using the ~`~ -annotation (/e.g./, in presence of ~`{Monad m}~, Coq understands ~m~ -is of type ~Type -> Type~) is always nice, so we enable it. - -#+BEGIN_SRC coq :tangle coqffi-tutorial/src/Server.v -Generalizable All Variables. -#+END_SRC - -We enable the monad notation provided by =ExtLib=. In this article, we -prefer the ~let*~ notation (as recently introduced by OCaml) over the -~<-~ notation of Haskell, but both are available. - -#+BEGIN_SRC coq :tangle coqffi-tutorial/src/Server.v -Import MonadLetNotation. -Open Scope monad_scope. -#+END_SRC - -Then, we define a notation to be able to define local, monadic -recursive functions using the =mfix= combinator of the =MonadFix= -typeclass. - -#+BEGIN_SRC coq :tangle coqffi-tutorial/src/Server.v -Notation "'let_rec*' f x ':=' p 'in' q" := - (let f := mfix (fun f x => p) in q) - (at level 61, x pattern, f name, q at next level, right associativity). -#+END_SRC - -Note that ~mfix~ does /not/ check whether or not the defined function -will terminate (contrary to the ~fix~ keyword of Coq). This is -fortunate because in our case, we do not want our echo server to -converge, but rather to accept an infinite number of connections. 
- -We can demonstrate how this notation can be leveraged by defining a -generic TCP server, parameterized by a handler to deal with incoming -connections. - -#+BEGIN_SRC coq :tangle coqffi-tutorial/src/Server.v -Definition tcp_srv `{Monad m, MonadFix m, MonadProc m, MonadSocket m} - (handler : socket_descr -> m unit) - : m unit := - let* srv := open_socket "127.0.0.1" 8888 in - listen srv;; - - let_rec* tcp_aux _ := - let* client := accept_connection srv in - let* res := fork tt in - match res with - | Parent _ => close_socket client >>= tcp_aux - | Child => handler client - end - in - - tcp_aux tt. -#+END_SRC - -The handler for the echo server is straightforward: it just reads -incoming bytes from the socket, sends it back, and closes the socket. - -#+BEGIN_SRC coq :tangle coqffi-tutorial/src/Server.v -Definition echo_handler `{Monad m, MonadSocket m} (sock : socket_descr) - : m unit := - let* msg := recv sock in - send sock msg;; - close_socket sock. -#+END_SRC - -Composing our generic TCP server with our echo handler gives us an -echo server. - -#+BEGIN_SRC coq :tangle coqffi-tutorial/src/Server.v -Definition echo_server `{Monad m, MonadFix m, MonadProc m, MonadSocket m} - : m unit := - tcp_srv echo_handler. -#+END_SRC - -Because ~coqffi~ has generated typeclasses for the impure primitives -of ~proc.mli~ and ~socket.mli~, =echo_server= is polymorphic, and can -be instantiated for different monads. When it comes to extracting our -program, we will generally prefer the =IO= monad of ~coq-simple-io~. -But we could also imagine verifying the client handler with FreeSpec, -or the generic TCP server with Interaction Trees (which support -diverging computations). Overall, we can have different verification -strategies for different parts of our program, by leveraging the most -relevant framework for each part, yet being able to extract it in an -efficient form. - -The next section shows how this last part is achieved using, once -again, a convenient stanza of dune. 
- -* Extracting and Building an Executable - -The ~0.2~ version of the Coq-related stanzas of ~dune~ provides the -~coq.extraction~ stanza, which can be used to build a Coq module -expected to generate ~ml~ files. - -In our case, we will write ~bin/echo.v~ to extract the ~echo_server~ -in a ~echo.ml~ module, and uses the =executable= stanza of ~dune~ to -get an executable from this file. To achieve this, the ~bin/dune~ -file simply requires these two stanzas. - -#+BEGIN_SRC lisp :tangle coqffi-tutorial/bin/dune -(coq.extraction - (prelude echo) - (theories Echo) - (extracted_modules echo)) - -(executable - (name echo) - (libraries ffi)) -#+END_SRC - -We are almost done. We now need to write the ~echo.v~ module, which -mostly consists of (1) providing a =MonadFix= instance for the =IO= -monad, (2) using the =IO.unsafe_run= function to escape the =IO= -monad, (3) calling the src_coq[:exports code]{Extraction} command to -wrap it up. - -#+BEGIN_SRC coq :tangle coqffi-tutorial/bin/echo.v -From Coq Require Extraction. -From ExtLib Require Import MonadFix. -From SimpleIO Require Import SimpleIO. -From Echo Require Import Server. - -Instance MonadFix_IO : MonadFix IO := - { mfix := @IO.fix_io }. - -Definition main : io_unit := - IO.unsafe_run echo_server. - -Extraction "echo.ml" main. -#+END_SRC - -Since we are using the =i63= type (signed 63bits integers) of the -~CoqFFI~ theory, and since =i63= is implemented under the hood with -Coq primitive integers, we /also/ need to provide a =Uint63= module -with a =of_int= function. Fortunately, this module is straightforward -to write. - -#+BEGIN_SRC ocaml :tangle coqffi-tutorial/bin/uint63.ml -let of_int x = x -#+END_SRC - -And /voilà/. A call to ~dune~ at the root of the repository will -build everything (Coq and OCaml alike). Starting the echo server -is as simple as - -#+BEGIN_SRC sh -dune exec bin/echo.exe -#+END_SRC - -And connecting to it can be achieved with a program like =telnet=. 
- -#+BEGIN_SRC console -$ telnet 127.0.0.1 8888 -Trying 127.0.0.1... -Connected to 127.0.0.1. -Escape character is '^]'. -hello, echo server! -hello, echo server! -Connection closed by foreign host. -#+END_SRC diff --git a/site/posts/CoqffiIntro.org b/site/posts/CoqffiIntro.org deleted file mode 100644 index e85f4b4..0000000 --- a/site/posts/CoqffiIntro.org +++ /dev/null @@ -1,516 +0,0 @@ -#+TITLE: ~coqffi~ in a Nutshell - -#+SERIES: ./Coqffi.html -#+SERIES_NEXT: ./CoqffiEcho.html - -For each entry of a ~cmi~ file (a /compiled/ ~mli~ file), ~coqffi~ -tries to generate an equivalent (from the extraction mechanism -perspective) Coq definition. In this article, we walk through how -~coqffi~ works. - -Note that we do not dive into the vernacular commands ~coqffi~ -generates. They are of no concern for users of ~coqffi~. - -#+BEGIN_EXPORT html -<nav id="generate-toc"></nav> -<div id="history">site/posts/CoqffiIntro.org</div> -#+END_EXPORT - -* Getting Started - -** Requirements - -The latest version of ~coqffi~ (~1.0.0~beta7~ at the time of writing) -is compatible with OCaml ~4.08~ up to ~4.12~, and Coq ~8.12~ up top -~8.13~. If you want to use ~coqffi~, but have incompatible -requirements of your own, feel free to -[[https://github.com/coq-community/coqffi/issues][submit an issue]]. - -** Installing ~coqffi~ - -The recommended way to install ~coqffi~ is through the -[[https://coq.inria.fr/opam/www][Opam Coq Archive]], in the ~released~ -repository. If you haven’t activated this repository yet, you can use -the following bash command. - -#+BEGIN_SRC sh -opam repo add coq-released https://coq.inria.fr/opam/released -#+END_SRC - -Then, installing ~coqffi~ is as simple as - -#+BEGIN_SRC sh -opam install coq-coqffi -#+END_SRC - -You can also get the source from -[[https://github.com/coq-community/coqffi][the upstream ~git~ -repository]]. The ~README~ provides the necessary pieces of -information to build it from source. 
- -** Additional Dependencies - -One major difference between Coq and OCaml is that the former is pure, -while the latter is not. Impurity can be modeled in pure languages, -and Coq does not lack of frameworks in this respect. ~coqffi~ -currently supports two of them: [[https://github.com/Lysxia/coq-simple-io][~coq-simple-io~]] and [[https://github.com/ANSSI-FR/FreeSpec][FreeSpec]]. It is -also possible to use it with [[https://github.com/DeepSpec/InteractionTrees][Interaction Trees]], albeit in a less -direct manner. - -* Primitive Types - -~coqffi~ supports a set of primitive types, /i.e./, a set of OCaml -types for which it knows an equivalent type in Coq. The list is the -following (the Coq types are fully qualified in the table, but not in -the generated Coq module as the necessary ~Import~ statement are -generated too). - -| OCaml type | Coq type | -|-------------------+-------------------------------| -| =bool= | =Coq.Init.Datatypes.bool= | -| =char= | =Coq.Strings.Ascii.ascii= | -| =int= | =CoqFFI.Data.Int.i63= | -| ='a list= | =Coq.Init.Datatypes.list a= | -| ='a Seq.t= | =CoqFFI.Data.Seq.t= | -| ='a option= | =Coq.Init.Datatypes.option a= | -| =('a, 'e) result= | =Coq.Init.Datatypes.sum= | -| =string= | =Coq.Strings.String.string= | -| =unit= | =Coq.Init.Datatypes.unit= | -| =exn= | =CoqFFI.Exn= | - -The =i63= type is introduced by the =CoqFFI= theory to provide signed -primitive integers to Coq users. They are implemented on top of the -(sadly unsigned) Coq native integers introduced in Coq -~8.10~. Hopefully, the =i63= type will be deprecated once [[https://github.com/coq/coq/pull/13559][signed -primitive integers find their way to Coq upstream]]. - -When processing the entries of a given interface model, ~coqffi~ will -check that they only use these types, or types introduced by the -interface module itself. 
- -Sometimes, you may encounter a situation where you have two interface -modules ~b.mli~ and ~b.mli~, such that ~b.mli~ uses a type introduced -in ~a.mli~. To deal with this scenario, you can use the ~--witness~ -flag to generate ~A.v~. This will tell ~coqffi~ to also generate -~A.ffi~; this file can then be used when generating ~B.v~ thanks to -the ~-I~ option. Furthermore, for ~B.v~ to compile the ~--require~ -option needs to be used to ensure the ~A~ Coq library (~A.v~) is -required. - -To give a more concrete example, given ~a.mli~ - -#+BEGIN_SRC ocaml -type t -#+END_SRC - -and ~b.mli~ - -#+BEGIN_SRC ocaml -type a = A.t -#+END_SRC - -To generate ~A.v~, we can use the following commands: - -#+BEGIN_SRC bash -ocamlc a.mli -coqffi --witness -o A.v a.cmi -#+END_SRC - -Which would generate the following axiom for =t=. - -#+BEGIN_SRC coq -Axiom t : Type. -#+END_SRC - -Then, generating ~B.v~ can be achieved as follows: - -#+BEGIN_SRC bash -ocamlc b.mli -coqffi -I A.ffi -ftransparent-types -r A -o B.v b.cmi -#+END_SRC - -which results in the following output for =v=: - -#+BEGIN_SRC coq -Require A. - -Definition u : Type := A.t. -#+END_SRC - -* Code Generation - -~coqffi~ distinguishes five types of entries: types, pure values, -impure primitives, asynchronous primitives, exceptions, and -modules. We now discuss how each one of them is handled. - -** Types - -By default, ~coqffi~ generates axiomatized definitions for each type -defined in a ~.cmi~ file. This means that src_ocaml[:exports -code]{type t} becomes src_coq[:exports code]{Axiom t : Type}. -Polymorphism is supported, /i.e./, src_ocaml[:exports code]{type 'a t} -becomes src_coq[:exports code]{Axiom t : forall (a : Type), Type}. - -It is possible to provide a “model” for a type using the =coq_model= -annotation, for instance for reasoning purposes. For instance, -we can specify that a type is equivalent to a =list=. 
- -#+BEGIN_SRC ocaml -type 'a t [@@coq_model "list"] -#+END_SRC - -This generates the following Coq definition. - -#+BEGIN_SRC coq -Definition t : forall (a : Type), Type := list. -#+END_SRC - -It is important to be careful when using the =coq_model= annotation. -More precisely, the fact that =t= is a =list= in the “Coq universe” -shall not be used while the implementation phase, only the -verification phase. - -Unamed polymorphic type parameters are also supported. In presence of -such parameters, ~coqffi~ will find it a name that is not already -used. For instance, - -#+BEGIN_SRC ocaml -type (_, 'a) ast -#+END_SRC - -becomes - -#+BEGIN_SRC ocaml -Axiom ast : forall (b : Type) (a : Type), Type. -#+END_SRC - -Finally, ~coqffi~ has got an experimental feature called -~transparent-types~ (enabled by using the ~-ftransparent-types~ -command-line argument). If the type definition is given in the module -interface, then ~coqffi~ tries to generates an equivalent definition -in Coq. For instance, - -#+BEGIN_SRC ocaml -type 'a llist = - | LCons of 'a * (unit -> 'a llist) - | LNil -#+END_SRC - -becomes - -#+BEGIN_SRC coq -Inductive llist (a : Type) : Type := -| LCons (x0 : a) (x1 : unit -> llist a) : llist a -| LNil : llist a. -#+END_SRC - -Mutually recursive types are supported, so - -#+BEGIN_SRC ocaml -type even = Zero | ESucc of odd -and odd = OSucc of even -#+END_SRC - -becomes - -#+BEGIN_SRC coq -Inductive odd : Type := -| OSucc (x0 : even) : odd -with even : Type := -| Zero : even -| ESucc (x0 : odd) : even. -#+END_SRC - -Besides, ~coqffi~ supports alias types, as suggested in this write-up -when we discuss witness files. - -The ~transparent-types~ feature is *experimental*, and is currently -limited to variant types. It notably does not support -records. Besides, it may generate incorrect Coq types, because it does -not check whether or not the [[https://coq.inria.fr/refman/language/core/inductive.html#positivity-condition][positivity condition]] is -satisfied. 
- -** Pure values - -~coqffi~ decides whether or not a given OCaml values is pure or impure -with the following heuristics: - -- Constants are pure -- Functions are impure by default -- Functions with a =coq_model= annotation are pure -- Functions marked with the =pure= annotation are pure -- If the ~pure-module~ feature is enabled (~-fpure-module~), - then synchronous functions (which do not live inside the [[https://ocsigen.org/lwt/5.3.0/manual/manual][~Lwt~]] - monad) are pure - -Similarly to types, ~coqffi~ generates axioms (or definitions, if the -~coq_model~ annotation is used) for pure values. Then, - -#+BEGIN_SRC ocaml -val unpack : string -> (char * string) option [@@pure] -#+END_SRC - -becomes - -#+BEGIN_SRC coq -Axiom unpack : string -> option (ascii * string). -#+END_SRC - -Polymorphic values are supported. - -#+BEGIN_SRC ocaml -val map : ('a -> 'b) -> 'a list -> 'b list [@@pure] -#+END_SRC - -becomes - -#+BEGIN_SRC coq -Axiom map : forall (a : Type) (b : Type), (a -> b) -> list a -> list b. -#+END_SRC - -Again, unamed polymorphic type are supported, so - -#+BEGIN_SRC ocaml -val ast_to_string : _ ast -> string [@@pure] -#+END_SRC - -becomes - -#+BEGIN_SRC coq -Axiom ast_to_string : forall (a : Type), string. -#+END_SRC - -** Impure Primitives - -~coqffi~ reserves a special treatment for /impure/ OCaml functions. -Impurity is usually handled in pure programming languages by means of -monads, and ~coqffi~ is no exception to the rule. - -Given the set of impure primitives declared in an interface module, -~coqffi~ will (1) generate a typeclass which gathers these primitives, -and (2) generate instances of this typeclass for supported backends. - -We illustrate the rest of this section with the following impure -primitives. - -#+BEGIN_SRC ocaml -val echo : string -> unit -val scan : unit -> string -#+END_SRC - -where =echo= allows writing something the standard output, and =scan= -to read the standard input. 
- -Assuming the processed module interface is named ~console.mli~, the -following Coq typeclass is generated. - -#+BEGIN_SRC coq -Class MonadConsole (m : Type -> Type) := { echo : string -> m unit - ; scan : unit -> m string - }. -#+END_SRC - -Using this typeclass and with the additional support of an additional -=Monad= typeclass, we can specify impure computations which interacts -with the console. For instance, with the support of ~ExtLib~, one can -write. - -#+BEGIN_SRC coq -Definition pipe `{Monad m, MonadConsole m} : m unit := - let* msg := scan () in - echo msg. -#+END_SRC - -There is no canonical way to model impurity in Coq, but over the years -several frameworks have been released to tackle this challenge. - -~coqffi~ provides three features related to impure primitives. - -*** ~simple-io~ - -When this feature is enabled, ~coqffi~ generates an instance of the -typeclass for the =IO= monad introduced in the ~coq-simple-io~ package - -#+BEGIN_SRC coq -Axiom io_echo : string -> IO unit. -Axiom io_scan : unit -> IO string. - -Instance IO_MonadConsole : MonadConsole IO := { echo := io_echo - ; scan := io_scan - }. -#+END_SRC - -It is enabled by default, but can be disabled using the -~-fno-simple-io~ command-line argument. - -*** ~interface~ - -When this feature is enabled, ~coqffi~ generates an inductive type -which describes the set of primitives available, to be used with -frameworks like [[https://github.com/ANSSI-FR/FreeSpec][FreeSpec]] or -[[https://github.com/DeepSpec/InteractionTrees][Interactions Trees]] - -#+BEGIN_SRC coq -Inductive CONSOLE : Type -> Type := -| Echo : string -> CONSOLE unit -| Scan : unit -> CONSOLE string. - -Definition inj_echo `{Inject CONSOLE m} (x0 : string) : m unit := - inject (Echo x0). - -Definition inj_scan `{Inject CONSOLE m} (x0 : unit) : m string := - inject (Scan x0). - -Instance Inject_MonadConsole `{Inject CONSOLE m} : MonadConsole m := - { echo := inj_echo - ; scan := inj_scan - }. 
-#+END_SRC - -Providing an instance of the form src_coq[:exports code]{forall i, -Inject i M} is enough for your monad =M= to be compatible with this -feature (see for instance -[[https://github.com/ANSSI-FR/FreeSpec/blob/master/theories/FFI/FFI.v][how -FreeSpec implements it]]). - -*** ~freespec~ - -When this feature in enabled, ~coqffi~ generates a semantics for the -inductive type generated by the ~interface~ feature. - -#+BEGIN_SRC coq -Axiom unsafe_echo : string -> unit. -Axiom unsafe_scan : uint -> string. - -Definition console_unsafe_semantics : semantics CONSOLE := - bootstrap (fun a e => - local match e in CONSOLE a return a with - | Echo x0 => unsafe_echo x0 - | Scan x0 => unsafe_scan x0 - end). -#+END_SRC - -** Asynchronous Primitives - -~coqffi~ also reserves a special treatment for /asynchronous/ -primitives —/i.e./, functions which live inside the ~Lwt~ monad— when -the ~lwt~ feature is enabled. - -The treatment is very analoguous to the one for impure primitives: (1) -a typeclass is generated (with the ~_Async~ suffix), and (2) an -instance for the ~Lwt~ monad is generated. Besides, an instance for -the “synchronous” primitives is also generated for ~Lwt~. If the -~interface~ feature is enabled, an interface datatype is generated, -which means you can potentially use Coq to reason about your -asynchronous programs (using FreeSpec and alike, although the -interleaving of asynchronous programs in not yet supported in -FreeSpec). - -By default, the type of the ~Lwt~ monad is ~Lwt.t~. You can override -this setting using the ~--lwt-alias~ option. This can be useful when -you are using an alias type in place of ~Lwt.t~. - -** Exceptions - -OCaml features an exception mechanism. Developers can define their -own exceptions using the ~exception~ keyword, whose syntax is similar -to constructors definition. 
For instance, - -#+BEGIN_SRC ocaml -exception Foo of int * bool -#+END_SRC - -introduces a new exception =Foo= which takes two parameters of type -=int= and =bool=. =Foo (x, y)= constructs of value of type =exn=. - -For each new exceptions introduced in an OCaml module, ~coqffi~ -generates (1) a so-called “proxy type,” and (2) conversion functions -to and from this type. - -Coming back to our example, the “proxy type” generates by ~coqffi~ is - -#+BEGIN_SRC coq -Inductive FooExn : Type := -| MakeFooExn (x0 : i63) (x1 : bool) : FooExn. -#+END_SRC - -Then, ~coqffi~ generates conversion functions. - -#+BEGIN_SRC coq -Axiom exn_of_foo : FooExn -> exn. -Axiom foo_of_exn : exn -> option FooExn. -#+END_SRC - -Besides, ~coqffi~ also generates an instance for the =Exn= typeclass -provided by the =CoqFFI= theory: - -#+BEGIN_SRC coq -Instance FooExn_Exn : Exn FooExn := - { to_exn := exn_of_foo - ; of_exn := foo_of_exn - }. -#+END_SRC - -Under the hood, =exn= is an [[https://caml.inria.fr/pub/docs/manual-ocaml/extensiblevariants.html][extensible datatype]], and how ~coqffi~ -supports it will probably be generalized in future releases. - -Finally, ~coqffi~ has a minimal support for functions which may raise -exceptions. Since OCaml type system does not allow to identify such -functions, they need to be annotated explicitely, using the -=may_raise= annotation. In such a case, ~coqffi~ will change the -return type of the function to use the =sum= Coq inductive type. - -For instance, - -#+BEGIN_SRC ocaml -val from_option : 'a option -> 'a [@@may_raise] [@@pure] -#+END_SRC - -becomes - -#+BEGIN_SRC coq -Axiom from_option : forall (a : Type), option a -> sum a exn. -#+END_SRC - -** Modules - -Lastly, ~coqffi~ supports OCaml modules described within ~mli~ files, -when they are specify as ~module T : sig ... end~. For instance, - -#+BEGIN_SRC ocaml -module T : sig - type t - - val to_string : t -> string [@@pure] -end -#+END_SRC - -becomes - -#+BEGIN_SRC coq -Module T. 
-  Axiom t : Type.
-
-  Axiom to_string : t -> string.
-End T.
-#+END_SRC
-
-As of now, the following construction is unfortunately *not*
-supported, and will be ignored by ~coqffi~:
-
-#+BEGIN_SRC coq
-module S = sig
-  type t
-
-  val to_string : t -> string [@@pure]
-end
-
-module T : S
-#+END_SRC
-
-* Moving Forward
-
-~coqffi~ comes with a comprehensive man page. In addition, the
-interested reader can proceed to the next article of this series,
-which explains how [[./CoqffiEcho.org][~coqffi~ can be used to easily implement an echo
-server in Coq]].
diff --git a/site/posts/DiscoveringCommonLisp.md b/site/posts/DiscoveringCommonLisp.md
new file mode 100644
index 0000000..b3c1e3d
--- /dev/null
+++ b/site/posts/DiscoveringCommonLisp.md
@@ -0,0 +1,252 @@
+---
+published: 2018-06-17
+tags: ['lisp']
+abstract: |
+  Common Lisp is a venerable programming language like no other I know. From
+  the creation of a Lisp package up to the creation of a standalone
+  executable, we explore the shores of this strange beast.
+---
+
+# Discovering Common Lisp with `trivial-gamekit`
+
+I always wanted to learn some Lisp dialect. In the meantime,
+[lykan](https://github.com/lkn-org/lykan) —my Slayers Online clone— begins to
+take shape. So, of course, my brain got an idea: *why not write a client for
+lykan in some Lisp dialect?*[^why] I asked on
+[Mastodon](https://mastodon.social/@lthms/100135240390747697) if there were
+good game engines for Lisp, and someone told me about
+[`trivial-gamekit`](https://github.com/borodust/trivial-gamekit).
+
+[^why]: Spoiler alert: this wasn’t the most efficient approach for the lykan
+    project. But it was fun.
+
+I have no idea if I will manage to implement a decent client using
+trivial-gamekit, but why not try? This article is the first of a series
+about my experiments, discoveries and difficulties. The complete project
+detailed in this article is available [as a
+gist](https://gist.github.com/lthms/9833f4851843119c966917775b4c4180).
+
+## Common Lisp, Quicklisp and `trivial-gamekit`
+
+The trivial-gamekit
+[website](https://borodust.github.io/projects/trivial-gamekit/) lists several
+requirements. Two are related to Lisp:
+
+1. Quicklisp
+2. SBCL or CCL
+
+[Quicklisp](https://quicklisp.org/beta) is an experimental package manager for
+Lisp projects, while SBCL and CCL are two Lisp implementations. I had already
+installed [Clisp](https://www.archlinux.org/packages/?name=clisp), which is
+neither of the two, and it took me quite some time to understand my mistake.
+Fortunately, [SBCL](https://www.archlinux.org/packages/?name=sbcl) is also
+packaged in ArchLinux.
+
+With a compatible Lisp implementation, installing Quicklisp as a user is
+straightforward. Following the website instructions is enough. At the end of
+the process, you will have a new directory `${HOME}/quicklisp`{.bash}[^go].
+
+[^go]: The purpose of this directory is similar to the [Go
+    workspace](https://github.com/golang/go/wiki/SettingGOPATH).
+
+Quicklisp is not a native feature of SBCL, and requires a small bit of
+configuration to be made available automatically. You have to create a file
+`${HOME}/.sbclrc`{.bash}, with the following content:
+
+```lisp
+(load "~/quicklisp/setup")
+```
+
+There is one final step to be able to use `trivial-gamekit`.
+
+```bash
+sbcl --eval '(ql-dist:install-dist "http://bodge.borodust.org/dist/org.borodust.bodge.txt")' \
+     --quit
+```
+
+As of June 2018, Quicklisp [does not support
+HTTPS](https://github.com/quicklisp/quicklisp-client/issues/167).
+
+## Introducing Lysk
+
+### Packaging
+
+The first thing I search for when I learn a new language is how projects are
+organized. From this perspective, `trivial-gamekit` pointed me directly to
+Quicklisp.
+
+Creating a new Quicklisp project is straightforward. From my understanding, new
+Quicklisp projects have to be located inside
+`${HOME}/quicklisp/local-projects`{.bash}. I am not particularly happy with
+this, but it is not really important.
+
+The current code name of my Lisp game client is lysk.
+
+```bash
+mkdir ~/quicklisp/local-projects/lysk
+```
+
+Quicklisp packages (systems?) are defined through `asd` files.
+I first created `lysk.asd` as follows:
+
+```lisp
+(asdf:defsystem lysk
+  :description "Lykan Game Client"
+  :author "lthms"
+  :license "GPLv3"
+  :version "0.0.1"
+  :serial t
+  :depends-on (trivial-gamekit)
+  :components ((:file "package")
+               (:file "lysk")))
+```
+
+`:serial t`{.lisp} means that the files detailed in the `components`{.lisp}
+field depend on the previous ones. That is, `lysk.lisp` depends on
+`package.lisp` in this case. It is possible to manage file dependencies
+manually, with the following syntax:
+
+```lisp
+(:file "second" :depends-on ("first"))
+```
+
+I have declared only one dependency: `trivial-gamekit`. That way, Quicklisp
+will load it for us.
+
+The first “true” Lisp file we define in our skeleton is `package.lisp`.
+Here is its content:
+
+```lisp
+(defpackage :lysk
+  (:use :cl)
+  (:export run app))
+```
+
+Basically, this means the package exports two symbols, `run`{.lisp} and
+`app`{.lisp}.
+
+### A Game Client
+
+The `lysk.lisp` file contains the program itself. My first goal was to
+obtain the following program: at startup, it shall create a new window in
+fullscreen, and exit when users release the left button of their mouse. It is
+worth mentioning that I had to report [an issue to the `trivial-gamekit`
+upstream](https://github.com/borodust/trivial-gamekit/issues/30) in order to
+make my program work as expected.
+
+While it may sound scary —it suggests `trivial-gamekit` is a relatively young
+project— the author has implemented a fix in less than an hour! He also took
+the time to answer many questions I had when I joined the `#lispgames` Freenode
+channel.
+
+Before going any further, let’s have a look at the complete file.
+
+```lisp
+(cl:in-package :lysk)
+
+(gamekit:defgame app () ()
+  (:fullscreen-p 't))
+
+(defmethod gamekit:post-initialize ((app app))
+  (gamekit:bind-button :mouse-left :released
+                       (lambda () (gamekit:stop))))
+
+(defun run ()
+  (gamekit:start 'app))
+```
+
+The first line is some kind of header, which tells Lisp the package the rest of
+the file belongs to.
+
+The `gamekit:defgame`{.lisp} function allows for creating a new game
+application (called `app`{.lisp} in our case). I ask for a fullscreen window
+with `:fullscreen-p`{.lisp}. Then, we use the `gamekit:post-initialize`{.lisp}
+hook to bind a handler to the release of the left button of our mouse. This
+handler is a simple call to `gamekit:stop`{.lisp}. Finally, we define a new
+function `run`{.lisp} which only starts our application.
+
+Pretty straightforward!
+
+### Running our Program
+
+To “play” our game, we can start the SBCL REPL.
+
+```bash
+sbcl --eval '(ql:quickload :lysk)' --eval '(lysk:run)'
+```
+
+### A Standalone Executable
+
+This workflow empowers REPL-driven development. That being said, once the
+development is finished, I don't think I will have a lot of success if I ask my
+future players to start SBCL to enjoy my game. Fortunately, `trivial-gamekit`
+provides a dedicated function to bundle the game as a standalone executable.
+
+Following the advice of [**@borodust**](https://github.com/borodust) —the
+`trivial-gamekit` author— I created a second package to that end. First, we
+need to edit the `lysk.asd` file to detail a second package:
+
+```lisp
+(asdf:defsystem lysk/bundle
+  :description "Bundle the Lykan Game Client"
+  :author "lthms"
+  :license "GPLv3"
+  :version "0.0.1"
+  :serial t
+  :depends-on (trivial-gamekit/distribution lysk)
+  :components ((:file "bundle")))
+```
+
+This second package depends on lysk (our game client) and
+trivial-gamekit/distribution.
The latter provides the `deliver`{.lisp} +function, and we use it in the `bundle.lisp` file: + +```lisp +(cl:defpackage :lysk.bundle + (:use :cl) + (:export deliver)) + +(cl:in-package :lysk.bundle) + +(defun deliver () + (gamekit.distribution:deliver :lysk 'lysk:app)) +``` + +To bundle the game, we can use SBCL from our command line interface. + +```bash +sbcl --eval "(ql:quickload :lysk/bundle)" \ + --eval "(lysk.bundle:deliver)" \ + --quit +``` + +## Conclusion + +Objectively, there is not much in this article. However, because I am totally +new to Lisp, it took me quite some time to get these few lines of code to work +together. All being told I think this constitutes a good `trivial-gamekit` +skeleton. Do not hesitate to use it this way. + +Thanks again to [**@borodust**](https://github.com/borodust), for your time and +all your answers! + +## Appendix: a Makefile + +I like Makefile, so here is one to `run`{.lisp} the game directly, or +`bundle`{.lisp} it. + +```makefile +run: + @sbcl --eval "(ql:quickload :lysk)" \ + --eval "(lysk:run)" + +bundle: + @echo -en "[ ] Remove old build" + @rm -rf build/ + @echo -e "\r[*] Remove old build" + @echo "[ ] Building" + @sbcl --eval "(ql:quickload :lysk/bundle)" \ + --eval "(lysk.bundle:deliver)" \ + --quit + @echo "[*] Building" + +.PHONY: bundle run +``` diff --git a/site/posts/DiscoveringCommonLisp.org b/site/posts/DiscoveringCommonLisp.org deleted file mode 100644 index 479f76c..0000000 --- a/site/posts/DiscoveringCommonLisp.org +++ /dev/null @@ -1,261 +0,0 @@ -#+BEGIN_EXPORT html -<h1>Discovering Common Lisp with <code>trivial-gamekit</code></h1> - -<p>This article has originally been published on <span class="time">June 17, -2018</span>.</p> -#+END_EXPORT - -I always wanted to learn some Lisp dialect. -In the meantime, [[https://github.com/lkn-org/lykan][lykan]] —my Slayers Online clone— begins to take shape. 
-So, of course, my brain got an idea: /why not writing a client for lykan in some -Lisp dialect?/ -I asked on [[https://mastodon.social/@lthms/100135240390747697][Mastodon]] if there were good game engine for Lisp, and someone told me -about [[https://github.com/borodust/trivial-gamekit][trivial-gamekit]]. - -I have no idea if I will manage to implement a decent client using -trivial-gamekit, but why not trying? -This article is the first of a series about my experiments, discoveries and -difficulties. - -The code of my client is hosted on my server, using the pijul vcs. -If you have pijul installed, you can clone the repository: - -#+BEGIN_SRC bash -pijul clone "https://pijul.lthms.xyz/lkn/lysk" -#+END_SRC - -In addition, the complete project detailed in this article is available [[https://gist.github.com/lthms/9833f4851843119c966917775b4c4180][as a -gist]]. - -#+TOC: headlines 2 - -#+BEGIN_EXPORT html -<div id="history">site/posts/DiscoveringCommonLisp.org</div> -#+END_EXPORT - -* Common Lisp, Quicklisp and trivial-gamekit - -The trivial-gamekit [[https://borodust.github.io/projects/trivial-gamekit/][website]] lists several requirements. -Two are related to Lisp: - -1. Quicklisp -2. SBCL or CCL - -Quicklisp is an experimental package manager for Lisp project (it was easy to -guess, because there is a link to [[https://quicklisp.org/beta][quicklisp website]] in the trivial-gamekit -documentation). -As for SBCL and CCL, it turns out they are two Lisp implementations. -I had already installed [[https://www.archlinux.org/packages/?name=clisp][clisp]], and it took me quite some times to understand my -mistake. -Fortunately, [[https://www.archlinux.org/packages/?name=sbcl][sbcl]] is also packaged in ArchLinux. - -With a compatible Lisp implementation, installing Quicklisp as a user is -straightforward. -Following the website instructions is enough. 
-At the end of the process, you will have a new directory ~${HOME}/quicklisp~, -whose purpose is similar to the [[https://github.com/golang/go/wiki/SettingGOPATH][go workspace]]. - -Quicklisp is not a native feature of sbcl, and has to be loaded to be available. -To do it automatically, you have to create a file ~${HOME}/.sbclrc~, with the -following content: - -#+BEGIN_SRC common-lisp -(load "~/quicklisp/setup") -#+END_SRC - -There is one final step to be able to use trivial-gamekit. - -#+BEGIN_SRC bash -sbcl --eval '(ql-dist:install-dist "http://bodge.borodust.org/dist/org.borodust.bodge.txt")' \ - --quit -#+END_SRC - -As for now[fn::June 2018], Quicklisp [[https://github.com/quicklisp/quicklisp-client/issues/167][does not support HTTPS]]. - -* Introducing Lysk - -** Packaging - -The first thing I search for when I learn a new language is how projects are -organized. -From this perspective, trivial-gamekit pointed me directly to Quicklisp - -Creating a new Quicklisp project is very simple, and this is a very good thing. -As I said, the ~${HOME}/quicklisp~ directory acts like the go workspace. -As far as I can tell, new Quicklisp projects have to be located inside -~${HOME}/quicklisp/local-projects~. -I am not particularly happy with it, but it is not really important. - -The current code name of my Lisp game client is lysk. - -#+BEGIN_SRC bash -mkdir ~/quicklisp/local-projects/lysk -#+END_SRC - -Quicklisp packages (systems?) are defined through ~asd~ files. -I have firstly created ~lysk.asd~ as follows: - -#+BEGIN_SRC common-lisp -(asdf:defsystem lysk - :description "Lykan Game Client" - :author "lthms" - :license "GPLv3" - :version "0.0.1" - :serial t - :depends-on (trivial-gamekit) - :components ((:file "package") - (:file "lysk"))) -#+END_SRC - -~:serial t~ means that the files detailed in the ~components~ field depends on -the previous ones. -That is, ~lysk.lisp~ depends on ~package.lisp~ in this case. 
-It is possible to manage files dependencies manually, with the following syntax: - -#+BEGIN_SRC common-lisp -(:file "seconds" :depends-on "first") -#+END_SRC - -I have declared only one dependency: trivial-gamekit. -That way, Quicklisp will load it for us. - -The first “true” Lisp file we define in our skeleton is ~package.lisp~. -Here is its content: - -#+BEGIN_SRC common-lisp -(defpackage :lysk - (:use :cl) - (:export run app)) -#+END_SRC - -Basically, this means we use two symbols, ~run~ and ~app~. - -** A Game Client - -The ~lysk.lisp~ file contains the program in itself. -My first goal was to obtain the following program: at startup, it shall creates -a new window in fullscreen, and exit when users release the left button of their -mouse. -It is worth mentioning that I had to report [[https://github.com/borodust/trivial-gamekit/issues/30][an issue to the trivial-gamekit -upstream]] in order to make my program work as expected. -While it may sounds scary —it definitely shows trivial-gamekit is a relatively -young project— the author has implemented a fix in less than an hour! -He also took the time to answer many questions I had when I joined the -~#lispgames~ Freenode channel. - -Before going any further, lets have a look at the complete file. - -#+BEGIN_SRC common-lisp -(cl:in-package :lysk) - -(gamekit:defgame app () () - (:fullscreen-p 't)) - -(defmethod gamekit:post-initialize ((app app)) - (gamekit:bind-button :mouse-left :released - (lambda () (gamekit:stop)))) - -(defun run () - (gamekit:start 'app)) -#+END_SRC - -The first line is some kind of header, to tell Lisp the owner of the file. - -The ~gamekit:defgame~ function allows for creating a new game application -(called ~app~ in our case). -I ask for a fullscreen window with ~:fullscreen-p~. -Then, we use the ~gamekit:post-initialize~ hook to bind a handler to the release -of the left button of our mouse. -This handler is a simple call to ~gamekit:stop~. 
-Finally, we define a new function ~run~ which only starts our application. - -Pretty straightforward, right? - -** Running our Program - -To “play” our game, we can start the sbcl REPL. - -#+BEGIN_SRC bash -sbcl --eval '(ql:quickload :lysk)' --eval '(lysk:run)' -#+END_SRC - -And it works! - -** A Standalone Executable - -It looks like empower a REPL-driven development. -That being said, once the development is finished, I don't think I will have a -lot of success if I ask my future players to start sbcl to enjoy my game. -Fortunately, trivial-gamekit provides a dedicated function to bundle the game as -a standalone executable. - -Following the advises of the borodust —the trivial-gamekit author— I created a -second package to that end. -First, we need to edit the ~lysk.asd~ file to detail a second package: - -#+BEGIN_SRC common-lisp -(asdf:defsystem lysk/bundle - :description "Bundle the Lykan Game Client" - :author "lthms" - :license "GPLv3" - :version "0.0.1" - :serial t - :depends-on (trivial-gamekit/distribution lysk) - :components ((:file "bundle"))) -#+END_SRC - -This second package depends on lysk (our game client) and and -trivial-gamekit/distribution. -The latter provides the ~deliver~ function, and we use it in the ~bundle.lisp~ -file: - -#+BEGIN_SRC common-lisp -(cl:defpackage :lysk.bundle - (:use :cl) - (:export deliver)) - -(cl:in-package :lysk.bundle) - -(defun deliver () - (gamekit.distribution:deliver :lysk 'lysk:app)) -#+END_SRC - -To bundle the game, we can use ~sbcl~ from our command line interface. - -#+BEGIN_SRC bash -sbcl --eval "(ql:quickload :lysk/bundle)" \ - --eval "(lysk.bundle:deliver)" \ - --quit -#+END_SRC - -* Conclusion - -Objectively, there is not much in this article. -However, because I am totally new to Lisp, it took me quite some time to get -these few lines of code to work together. -All being told I think this constitutes a good trivial-gamekit skeleton. -Do not hesitate to us it this way. 
- -Thanks again to borodust, for your time and all your answers! - -* Appendix: a Makefile - -I like Makefile, so here is one to ~run~ the game directly, or ~bundle~ it. - -#+BEGIN_SRC makefile -run: - @sbcl --eval "(ql:quickload :lysk)" \ - --eval "(lysk:run)" - -bundle: - @echo -en "[ ] Remove old build" - @rm -rf build/ - @echo -e "\r[*] Remove old build" - @echo "[ ] Building" - @sbcl --eval "(ql:quickload :lysk/bundle)" \ - --eval "(lysk.bundle:deliver)" \ - --quit - @echo "[*] Building" - -.PHONY: bundle run -#+END_SRC diff --git a/site/posts/EndOfPhd.md b/site/posts/EndOfPhd.md new file mode 100644 index 0000000..7299450 --- /dev/null +++ b/site/posts/EndOfPhd.md @@ -0,0 +1,80 @@ +--- +published: 2019-01-15 +tags: ['research'] +abstract: | + It has been a long journey —4 years, 10 days— but I have completed my PhD + on October 25, 2018. + +--- + +# I am no longer a PhD. student + +It has been a long journey —4 years, 10 days— but I have completed my PhD on +October 25, 2018. The exact title of my PhD thesis is “[*Specifying and +Verifying Hardware-based Security Enforcement +Mechanisms*](https://inria.hal.science/tel-01989940v2/file/2018_LETAN_archivage.pdf)”. + +## Abstract + +In this thesis, we consider a class of security enforcement mechanisms we +called *Hardware-based Security Enforcement* (HSE). In such mechanisms, some +trusted software components rely on the underlying hardware architecture to +constrain the execution of untrusted software components with respect to +targeted security policies. For instance, an operating system which configures +page tables to isolate userland applications implements a HSE mechanism. + +For a HSE mechanism to correctly enforce a targeted security policy, it +requires both hardware and trusted software components to play their parts. +During the past decades, several vulnerability disclosures have defeated HSE +mechanisms. 
We focus on the vulnerabilities that are the result of errors at +the specification level, rather than implementation errors. In some critical +vulnerabilities, the attacker makes a legitimate use of one hardware component +to circumvent the HSE mechanism provided by another one. For instance, cache +poisoning attacks leverage inconsistencies between cache and DRAM’s access +control mechanisms. We call this class of attacks, where an attacker leverages +inconsistencies in hardware specifications, *compositional attacks*. + +Our goal is to explore approaches to specify and verify HSE mechanisms using +formal methods that would benefit both hardware designers and software +developers. Firstly, a formal specification of HSE mechanisms can be leveraged +as a foundation for a systematic approach to verify hardware specifications, in +the hope of uncovering potential compositional attacks ahead of time. Secondly, +it provides unambiguous specifications to software developers, in the form of a +list of requirements. + +Our contribution is two-fold: + +1. We propose a theory of HSE mechanisms against hardware architecture models. + This theory can be used to specify and verify such mechanisms. To evaluate + our approach, we propose a minimal model for a single core x86-based + computing platform. We use it to specify and verify the HSE mechanism + provided by Intel to isolate the code executed while the CPU is in System + Management Mode (SMM), a highly privileged execution mode of x86 + microprocessors. We have written machine-checked proofs in the Coq proof + assistant to that end. +2. We propose a novel approach inspired by algebraic effects to enable modular + verification of complex systems made of interconnected components as a first + step towards addressing the challenge posed by the scale of the x86 hardware + architecture. 
This approach is not specific to hardware models, and could
+   also be leveraged to reason about the composition of software components as
+   well. In addition, we have implemented our approach in the Coq theorem
+   prover, and the resulting framework takes advantage of Coq proof automation
+   features to provide general-purpose facilities to reason about component
+   interactions.
+
+## Publications
+
+If you are interested, you can have a look at the papers I wrote during my PhD:
+
+- [SpecCert: Specifying and Verifying Hardware-based Security Enforcement
+  Mechanisms](https://inria.hal.science/hal-01361422v1/file/speccert-fm2016.pdf),
+  with Pierre Chifflier, Guillaume Hiet and Benjamin Morin, at Formal Methods
+  2016
+- [Modular Verification of Programs with Effects and Effect Handlers in
+  Coq](https://inria.hal.science/hal-01799712v1/file/main.pdf), with Yann
+  Régis-Gianas, Pierre Chifflier and Guillaume Hiet, at Formal Methods 2018
+
+You can also have a look at the Coq frameworks I have published:
+
+- [SpecCert on Github](https://github.com/lthms/speccert) (CeCILL-B)
+- [FreeSpec on Github](https://github.com/lthms/FreeSpec) (MPL-2.0)
diff --git a/site/posts/ExtensibleTypeSafeErrorHandling.md b/site/posts/ExtensibleTypeSafeErrorHandling.md
new file mode 100644
index 0000000..216a185
--- /dev/null
+++ b/site/posts/ExtensibleTypeSafeErrorHandling.md
@@ -0,0 +1,420 @@
+---
+published: 2018-02-04
+modified: 2023-05-08
+tags: ['haskell']
+abstract: |
+  Ever heard of “extensible effects?” By applying the same principle, but for
+  error handling, the result is a nice, type-safe API for Haskell, with a lot
+  of GHC magic under the hood.
+---
+
+# Extensible Type-Safe Error Handling in Haskell
+
+A colleague of mine introduced me to the benefits of
+[`error-chain`](https://crates.io/crates/error-chain), a crate which aims to
+implement “*consistent error handling*” for Rust.
I found the overall design
+pretty convincing, and in his use case, the crate really makes error handling
+clearer and more flexible. I knew [Pijul](https://pijul.org) was also using
+`error-chain` at that time, but I never had the opportunity to dig more into it.
+
+At the same time, I have read quite a lot about *extensible effects* in
+Functional Programming, for an academic article I have submitted to [Formal
+Methods 2018](http://www.fm2018.org)[^fm2018]. In particular, the
+[freer](https://hackage.haskell.org/package/freer) package provides a very nice
+API to define monadic functions which may use well-identified effects. For
+instance, we can imagine that `Console`{.haskell} identifies the functions
+which may print to and read from the standard output. A function
+`askPassword`{.haskell} which displays a prompt and gets the user password
+would have this type signature:
+
+[^fm2018]: The odds were in my favor: the aforementioned academic article has
+    been accepted.
+
+```haskell
+askPassword :: Member Console r => Eff r String
+```
+
+Compared to `IO`{.haskell}, `Eff`{.haskell} allows for meaningful type
+signatures. It becomes easier to reason about function composition, and you
+know that a given function which lacks a given effect in its type signature
+will not be able to use it. As a predictable drawback, `Eff`{.haskell} can
+become burdensome to use.
+
+Basically, when my colleague showed me his Rust project and how he was using
+`error-chain`, the question popped up. *Can we use an approach similar to
+`Eff`{.haskell} to implement a Haskell-flavored `error-chain`?*
+
+Spoiler alert: the answer is yes. In this post, I will dive into the resulting
+API, leaving for another time the details of the underlying implementation[^api].
+Believe me, there is plenty to say. If you want to have a look already, the
+current implementation can be found on
+[GitHub](https://github.com/lthms/chain).
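To give a feel for what a `Console`{.haskell} effect buys us, without depending
on the `freer` package, here is a minimal, self-contained free-monad sketch
with a pure interpreter. All the names here (`Console`{.haskell},
`runPure`{.haskell}, the constructors) are illustrative assumptions of mine,
not freer's actual API.

```haskell
-- A tiny, hand-rolled stand-in for an effectful `Console` computation:
-- each constructor is one primitive, `Done` terminates the program.
data Console a
  = Done a
  | PrintLine String (Console a)
  | ReadLine (String -> Console a)

instance Functor Console where
  fmap f (Done a)        = Done (f a)
  fmap f (PrintLine s k) = PrintLine s (fmap f k)
  fmap f (ReadLine k)    = ReadLine (fmap f . k)

instance Applicative Console where
  pure = Done
  cf <*> ca = cf >>= \f -> fmap f ca

instance Monad Console where
  Done a        >>= f = f a
  PrintLine s k >>= f = PrintLine s (k >>= f)
  ReadLine k    >>= f = ReadLine ((>>= f) . k)

printLine :: String -> Console ()
printLine s = PrintLine s (Done ())

readLine :: Console String
readLine = ReadLine Done

askPassword :: Console String
askPassword = do
  printLine "Password:"
  readLine

-- A pure interpreter: canned input lines in, printed lines out.
runPure :: [String] -> Console a -> ([String], a)
runPure _        (Done a)        = ([], a)
runPure ins      (PrintLine s k) = let (out, a) = runPure ins k in (s : out, a)
runPure (i : is) (ReadLine k)    = runPure is (k i)
runPure []       (ReadLine k)    = runPure [] (k "")
```

A real `Eff`{.haskell} generalizes this idea to any number of effects via an
open union, but the benefit is the same: `runPure ["hunter2"]
askPassword`{.haskell} evaluates to `(["Password:"], "hunter2")`{.haskell}, so
effectful code becomes testable without touching `IO`{.haskell}.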
+
+[^api]: For once, I wanted to write about the *result* of a project, instead of
+    *how it is implemented*.
+
+In this article, I will use several “advanced” GHC pragmas. I will not explain
+each of them, but I will *try* to give some pointers for the reader who wants
+to learn more.
+
+
+## State of the Art
+
+This is not an academic publication, and my goal was primarily to explore the
+arcana of the Haskell type system, so I might have skipped the proper study of
+the state of the art. That being said, I have written programs in Rust and
+Haskell before.
+
+### Starting Point
+
+In Rust, `Result<T, E>`{.rust} is the counterpart of `Either E T`{.haskell} in
+Haskell[^either]. You can use it to wrap either the result of a
+function (`T`) or an error encountered during this computation (`E`). Both
+`Either`{.haskell} and `Result`{.rust} are used in order to achieve the same
+end, that is, writing functions which might fail.
+
+[^either]: I wonder if they deliberately chose to swap the two type arguments.
+
+On the one hand, `Either E`{.haskell} is a monad. It works exactly as
+`Maybe`{.haskell} (returning an error acts as a shortcut for the rest of the
+function), but gives you the ability to specify *why* the function has failed.
+To deal with effects, the `mtl` package provides `EitherT`{.haskell}, a
+transformer version of `Either`{.haskell} to be used in a monad stack.
+
+On the other hand, the Rust language provides the `?`{.rust} syntactic sugar,
+to achieve the same thing. That is, both languages provide you the means to
+write potentially failing functions without the need to care locally about
+failure. If your function `f` uses a function `g` which might fail, and wants
+to fail itself if `g` fails, it becomes trivial.
+
+Out of the box, neither `EitherT`{.haskell} nor `Result`{.rust} is extensible.
+The functions must use the exact same `E`, or errors must be converted
+manually.
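To make the shortcut behavior concrete, here is a small, self-contained example
over plain `Either`{.haskell}. The `ParseError`{.haskell} type and the function
names are made up for the occasion:

```haskell
import Text.Read (readMaybe)

data ParseError
  = NotANumber String
  | NegativeNumber Int
  deriving (Show, Eq)

parsePositive :: String -> Either ParseError Int
parsePositive s =
  case readMaybe s of
    Nothing -> Left (NotANumber s)
    Just n
      | n < 0     -> Left (NegativeNumber n)
      | otherwise -> Right n

-- The first Left short-circuits the whole do block, exactly like Maybe would,
-- but it also tells us *why* the computation failed.
sumInputs :: String -> String -> Either ParseError Int
sumInputs a b = do
  x <- parsePositive a
  y <- parsePositive b
  pure (x + y)
```

For instance, `sumInputs "oops" "2"`{.haskell} evaluates to `Left (NotANumber
"oops")`{.haskell} without ever parsing the second argument, while `sumInputs
"1" "2"`{.haskell} evaluates to `Right 3`{.haskell}.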
+
+### Handling Errors in Rust
+
+Rust and the `error-chain` crate provide several means to overcome this
+limitation. In particular, Rust has the `Into`{.rust} and `From`{.rust} traits
+to ease the conversion from one error to another. Among other things, the
+`error-chain` crate provides a macro to easily define a wrapper around many
+error types, basically your own and the ones defined by the crates you are
+using.
+
+I see several drawbacks to this approach. First, it is only extensible if you
+take the time to modify the wrapper type each time you want to consider a new
+error type. Second, you can either use one error type or every error
+type.
+
+However, the `error-chain` crate provides a way to solve a very annoying
+limitation of `Result`{.rust} and `Either`{.haskell}. When you “catch” an
+error, after a given function returns its result, it can be hard to determine
+where the error is coming from. Imagine you are parsing a very complicated
+source file, and the error you get is `SyntaxError`{.rust} with no additional
+context. How would you feel?
+
+`error-chain` solves this by providing an API to construct a chain of errors,
+rather than a single value.
+
+```rust
+my_function().chain_err(|| "a message with some context")?;
+```
+
+The `chain_err` function makes it easier to wrap a given error with some
+context, which allows for writing more meaningful error messages, for
+instance.
+
+## The `ResultT`{.haskell} Monad
+
+The `ResultT`{.haskell} monad is an attempt to bring together the extensible
+power of `Eff`{.haskell} and the chaining of errors of `chain_err`. I will
+admit that, for the latter, the current implementation of `ResultT`{.haskell}
+is probably less powerful, but to be honest I mostly cared about the
+“extensible” thing, so it is not very surprising.
+
+This monad is an alternative neither to monad stacks à la mtl nor to the
+`Eff`{.haskell} monad.
In its current state, it aims to be a more powerful and
+flexible version of `EitherT`{.haskell}.
+
+### Parameters
+
+As often in Haskell, the `ResultT`{.haskell} monad can be parameterized in
+several ways.
+
+```haskell
+data ResultT msg (err :: [*]) m a
+```
+
+- `msg`{.haskell} is the type of messages you can stack to provide more context
+  to error handling
+- `err`{.haskell} is a *row of errors*[^row]; it basically describes the set of
+  errors you will eventually have to handle
+- `m`{.haskell} is the underlying monad stack of your application, knowing that
+  `ResultT`{.haskell} is not intended to be stacked itself
+- `a`{.haskell} is the expected type of the computation result
+
+[^row]: You might have noticed `err`{.haskell} is of kind `[*]`{.haskell}. To write such a thing,
+    you will need the
+    [`DataKinds`{.haskell}](https://www.schoolofhaskell.com/user/konn/prove-your-haskell-for-great-safety/dependent-types-in-haskell)
+    GHC pragma.
+
+### `achieve`{.haskell} and `abort`{.haskell}
+
+The two main monadic operations which come with `ResultT`{.haskell} are
+`achieve`{.haskell} and `abort`{.haskell}. The former allows for building the
+context, by stacking so-called messages which describe what you want to do.
+The latter allows for bailing on a computation and explaining why.
+
+```haskell
+achieve :: (Monad m)
+        => msg
+        -> ResultT msg err m a
+        -> ResultT msg err m a
+```
+
+`achieve`{.haskell} should be used for `do`{.haskell} blocks. You can use
+`<?>`{.haskell} to attach a contextual message to a given computation.
+
+The type signature of `abort`{.haskell} is also interesting, because it
+introduces the `Contains`{.haskell} typeclass (*i.e.*, it is equivalent to
+`Member`{.haskell} for `Eff`{.haskell}).
+
+```haskell
+abort :: (Contains err e, Monad m)
+      => e
+      -> ResultT msg err m a
+```
+
+This reads as follows: “*you can abort with an error of type `e`{.haskell} if
+and only if the row of errors `err`{.haskell} contains the type
+`e`{.haskell}.*”
+
+For instance, imagine we have an error type `FileError`{.haskell} to describe
+filesystem-related errors. Then, we can imagine the following function:
+
+```haskell
+readContent :: (Contains err FileError, MonadIO m)
+            => FilePath
+            -> ResultT msg err m String
+```
+
+We could leverage this function in a given project, for instance, to read its
+configuration files (for the sake of the example, it has several configuration
+files). This function can use its own error type to describe ill-formed
+configuration files (`ConfigurationError`{.haskell}).
+
+```haskell
+parseConfiguration :: (Contains err ConfigurationError, MonadIO m)
+                   => String
+                   -> String
+                   -> ResultT msg err m Configuration
+```
+
+To avoid repeating `Contains`{.haskell} when the row of errors needs to
+contain several elements, we introduce `:<`{.haskell}[^top] (read *subset or
+equal*):
+
+```haskell
+getConfig :: ( '[FileError, ConfigurationError] :< err
+             , MonadIO m)
+          => ResultT String err m Configuration
+getConfig = do
+  achieve "get configuration from ~/.myapp directory" $ do
+    f1 <- readContent "~/.myapp/init.conf"
+          <?> "fetch the main configuration"
+    f2 <- readContent "~/.myapp/net.conf"
+          <?> "fetch the net-related configuration"
+
+    parseConfiguration f1 f2
+```
+
+You might now see why I say `ResultT`{.haskell} is extensible. You can use two
+functions with totally unrelated errors, as long as the caller advertises that
+with `Contains`{.haskell} or `:<`{.haskell}.
+
+[^top]: If you are confused by `:<`{.haskell}, it is probably because you were
+    not aware that the
+    [`TypeOperators`{.haskell}](https://ocharles.org.uk/blog/posts/2014-12-08-type-operators.html)
+    GHC pragma was a thing.
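I will not dig into how `Contains`{.haskell} is actually implemented in the library, but to build an intuition, here is one possible (guessed, and deliberately minimal) encoding of such a constraint: a search through a type-level list, head first. I use `[Type]`{.haskell}, the modern spelling of `[*]`{.haskell}, and the `Row`{.haskell} and `acceptsInt`{.haskell} names are mine, for illustration only.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE TypeOperators #-}

import Data.Kind (Type)

-- A sketch of a `Contains`-like constraint: `e` belongs to the
-- type-level list `err`. The real library may encode it differently.
class Contains (err :: [Type]) e

-- Either the error we look for is the head of the row...
instance {-# OVERLAPPING #-} Contains (e ': err) e

-- ...or it is somewhere in the tail.
instance Contains err e => Contains (f ': err) e

-- A phantom-typed value, used only to carry a row of errors around.
data Row (err :: [Type]) = Row

-- This function typechecks only when `Int` belongs to `err`.
acceptsInt :: Contains err Int => Row err -> Bool
acceptsInt _ = True
```

With this sketch, `acceptsInt (Row :: Row '[Bool, Int])`{.haskell} typechecks, while a row which does not mention `Int`{.haskell} is rejected at compile time, which is exactly the behavior we want from `abort`{.haskell}.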
+
+### Recovering by Handling Errors
+
+Monads are traps; you can only escape them by playing with their
+rules. `ResultT`{.haskell} comes with `runResultT`{.haskell}.
+
+```haskell
+runResultT :: Monad m => ResultT msg '[] m a -> m a
+```
+
+This might be surprising: we can only escape from `ResultT`{.haskell}
+if we do not use *any errors at all*. That is, `ResultT`{.haskell} forces us to
+handle errors before calling `runResultT`{.haskell}.
+
+`ResultT`{.haskell} provides several functions prefixed by `recover`{.haskell}.
+Their type signatures can be a little confusing, so we will dive into the
+simpler one:
+
+```haskell
+recover :: forall e m msg err a.
+           (Monad m)
+        => ResultT msg (e ': err) m a
+        -> (e -> [msg] -> ResultT msg err m a)
+        -> ResultT msg err m a
+```
+
+`recover`{.haskell} allows for *removing* an error type from the row of errors.
+To do that, it requires you to provide an error handler to determine what to do
+with the error raised during the computation and the stack of messages at that
+time. Using `recover`{.haskell}, a function may use more errors than advertised
+in its type signature, but we know by construction that in such a case, it
+handles these errors so that it is transparent for the function user. The type
+of the handler is `e -> [msg] -> ResultT msg err m a`{.haskell}, which means
+the handler *can raise errors if required*.
+
+`recoverWhile msg`{.haskell} is basically a synonym for `achieve msg $
+recover`{.haskell}. `recoverMany`{.haskell} allows for doing the same with a
+row of errors, by providing as many functions as required. Finally,
+`recoverManyWith`{.haskell} simplifies `recoverMany`{.haskell}: you can provide
+only one function tied to a given typeclass, on the condition that the errors
+being handled implement this typeclass.
+
+Using `recover`{.haskell} and its siblings often requires helping the Haskell
+type system a bit, especially if we use lambdas to define the error handlers.
+Doing that is usually achieved with the `Proxy a`{.haskell} datatype (where
+`a`{.haskell} is a phantom type). I would rather use the
+`TypeApplications`{.haskell} pragma[^tap].
+
+```haskell
+recoverManyWith @[FileError, NetworkError] @DescriptiveError
+  (do x <- readFromFile f
+      y <- readFromNetwork socket
+      printToStd x y)
+  printErrorAndStack
+```
+
+The `DescriptiveError`{.haskell} typeclass can be seen as a dedicated
+`Show`{.haskell}, to give a textual representation of errors. It is inspired by
+the macros of `error_chain`.
+
+We can start from an empty row of errors, and allow ourselves to
+use more errors thanks to the `recover*` functions.
+
+[^tap]: The
+    [TypeApplications](https://medium.com/@zyxoas/abusing-haskell-dependent-types-to-make-redis-queues-safer-cc31db943b6c)
+    pragma is probably one of my favorites.
+
+    When I use it, it almost feels as if I were writing a Coq document.
+
+## `cat` in Haskell using `ResultT`{.haskell}
+
+`ResultT`{.haskell} only cares about error handling. The rest of the work is up
+to the underlying monad `m`{.haskell}. That being said, nothing forbids us from
+providing a fine-grained API, *e.g.*, for Filesystem-related functions. From an
+error handling perspective, the functions provided by `Prelude` (the standard
+library of Haskell) are pretty poor, and the documentation is not really
+precise regarding the kind of error we can encounter while using them.
+
+In this section, I will show you how we can leverage `ResultT`{.haskell} to
+**(i)** define an error-centric API for basic file management functions and
+**(ii)** use this API to implement a `cat`-like program which reads a file and
+prints its content to the standard output.
+
+### (A Lot Of) Error Types
+
+We could have one sum type to describe in the same place all the errors we can
+find, and later use the pattern matching feature of Haskell to determine which
+one has been raised. The thing is, this is already the job done by the row of
+errors of `ResultT`{.haskell}.
Besides, this means that we could raise an error about
+not being able to write something into a file from a function which *opens* a
+file.
+
+Because `ResultT`{.haskell} is intended to be extensible, we should rather
+define several types, so we can have a fine-grained row of errors. Of course,
+too many types will become burdensome, so this is yet another time where we
+need to find the right balance.
+
+```haskell
+newtype AlreadyInUse = AlreadyInUse FilePath
+newtype DoesNotExist = DoesNotExist FilePath
+data AccessDeny = AccessDeny FilePath IO.IOMode
+data EoF = EoF
+data IllegalOperation = IllegalRead | IllegalWrite
+```
+
+To be honest, this is a bit too much for real life, but we are in a blog post
+here, so we should embrace the potential of `ResultT`{.haskell}.
+
+### Filesystem API
+
+By reading the
+[`System.IO`{.haskell}](https://hackage.haskell.org/package/base-4.9.1.0/docs/System-IO.html)
+documentation, we can infer what our functions’ type signatures should look
+like. I will not discuss their actual implementation in this article, as this
+requires me to explain how `IO`{.haskell} deals with errors itself (and this
+article is already long enough for my taste). You can have a look at [this
+gist](https://gist.github.com/lthms/c669e68e284a056dc8c0c3546b4efe56) if you
+are interested.
+
+```haskell
+openFile :: ( '[AlreadyInUse, DoesNotExist, AccessDeny] :< err
+            , MonadIO m)
+         => FilePath -> IOMode -> ResultT msg err m Handle
+
+getLine :: ('[IllegalOperation, EoF] :< err, MonadIO m)
+        => IO.Handle
+        -> ResultT msg err m Text
+
+closeFile :: (MonadIO m)
+          => IO.Handle
+          -> ResultT msg err m ()
+```
+
+### Implementing `cat`
+
+We can use the `ResultT`{.haskell} monad, its monadic operations and our
+functions to deal with the file system in order to implement a `cat`-like
+program. I tried to comment on the implementation to make it easier to follow.
+
+```haskell
+cat :: FilePath -> ResultT String err IO ()
+cat path =
+  -- We will try to open and read this file to mimic
+  -- `cat` behaviour.
+  -- We advertise what we are trying to do, in case
+  -- something goes wrong in the process.
+  achieve ("cat " ++ path) $ do
+    -- We will recover from a potential error,
+    -- but we will abstract away the error using
+    -- the `DescriptiveError` typeclass. This way,
+    -- we do not need to give one handler by error
+    -- type.
+    recoverManyWith @[Fs.AlreadyInUse, Fs.DoesNotExist, Fs.AccessDeny, Fs.IllegalOperation]
+                    @(Fs.DescriptiveError)
+      (do f <- Fs.openFile path Fs.ReadMode
+          -- `repeatUntil` works like `recover`, except
+          -- it repeats the computation until the error
+          -- actually happens.
+          -- I could not have used `getLine` without
+          -- `repeatUntil` or `recover`, as it is not
+          -- in the row of errors allowed by
+          -- `recoverManyWith`.
+          repeatUntil @(Fs.EoF)
+            (Fs.getLine f >>= liftIO . print)
+            (\_ _ -> liftIO $ putStrLn "%EOF")
+          closeFile f)
+      printErrorAndStack
+  where
+    -- Using the `DescriptiveError` typeclass, we
+    -- can print both the stack of Strings which form
+    -- the context, and the description of the generic
+    -- error.
+    printErrorAndStack e ctx = do
+      liftIO . putStrLn $ Fs.describe e
+      liftIO $ putStrLn "stack:"
+      liftIO $ print ctx
+```
+
+The type signature of `cat`{.haskell} teaches us that this function handles any
+error it might encounter. This means we can use it anywhere we want: in
+another computation inside `ResultT`{.haskell} which might raise errors
+completely unrelated to the file system, or with
+`runResultT`{.haskell}, escaping the `ResultT`{.haskell} monad (only to fall
+into the `IO`{.haskell} monad, but this is another story).
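As a closing illustration, here is a deliberately simplified, row-free miniature of this design (every name below is mine, and the real `ResultT`{.haskell} is of course richer): with a single error type instead of a row, “no error left to handle” translates into the uninhabited `Void`{.haskell} type, which is precisely what makes escaping safe.

```haskell
import Data.Functor.Identity (Identity, runIdentity)
import Data.Void (Void, absurd)

-- A row-free miniature of ResultT: a computation either succeeds, or
-- fails with an error together with the stack of context messages
-- that were active at the time.
newtype MiniResultT msg e m a =
  MiniResultT { unMini :: [msg] -> m (Either (e, [msg]) a) }

instance Functor m => Functor (MiniResultT msg e m) where
  fmap f (MiniResultT k) = MiniResultT (fmap (fmap f) . k)

instance Monad m => Applicative (MiniResultT msg e m) where
  pure x = MiniResultT (\_ -> pure (Right x))
  mf <*> mx = mf >>= \f -> fmap f mx

instance Monad m => Monad (MiniResultT msg e m) where
  MiniResultT k >>= f = MiniResultT $ \stack -> do
    r <- k stack
    case r of
      Left failure -> pure (Left failure)
      Right x      -> unMini (f x) stack

-- Push a context message for the duration of a sub-computation.
achieve :: msg -> MiniResultT msg e m a -> MiniResultT msg e m a
achieve msg (MiniResultT k) = MiniResultT (\stack -> k (msg : stack))

-- Bail out, recording the current stack of messages.
abort :: Applicative m => e -> MiniResultT msg e m a
abort err = MiniResultT (\stack -> pure (Left (err, stack)))

-- Handling the error changes the error type: afterwards, nothing
-- proves an error can still occur.
recover :: Functor m
        => MiniResultT msg e m a
        -> (e -> [msg] -> a)
        -> MiniResultT msg e' m a
recover (MiniResultT k) handler =
  MiniResultT (fmap (either (\(e, ms) -> Right (handler e ms)) Right) . k)

-- Escaping is only possible once the error type is uninhabited,
-- the moral equivalent of `ResultT msg '[] m a`.
escape :: Functor m => MiniResultT msg Void m a -> m a
escape (MiniResultT k) = fmap (either (absurd . fst) id) (k [])

demoAbort :: Either (String, [String]) Int
demoAbort =
  runIdentity (unMini (achieve "get configuration" (abort "missing file")) [])

demoRecover :: Int
demoRecover = runIdentity (escape (recover failing (\_ stack -> length stack)))
  where failing :: MiniResultT String String Identity Int
        failing = achieve "get configuration" (abort "missing file")
```

In this sketch, `demoAbort`{.haskell} surfaces both the error and its context stack, while `demoRecover`{.haskell} can call `escape`{.haskell} only because `recover`{.haskell} has discharged the one error type in play.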
diff --git a/site/posts/ExtensibleTypeSafeErrorHandling.org b/site/posts/ExtensibleTypeSafeErrorHandling.org deleted file mode 100644 index e817e31..0000000 --- a/site/posts/ExtensibleTypeSafeErrorHandling.org +++ /dev/null @@ -1,398 +0,0 @@ -#+TITLE: Extensible Type-Safe Error Handling in Haskell - -#+SERIES: ../haskell.html - -#+BEGIN_EXPORT html -<p>This article has originally been published on <span -id="original-created-at">February 04, 2018</span>.</p> -#+END_EXPORT - -#+BEGIN_EXPORT html -<nav id="generate-toc"></nav> -<div id="history">site/posts/ExtensibleTypeSafeErrorHandling.org</div> -#+END_EXPORT - -A colleague of mine introduced me to the benefits of [[https://crates.io/crates/error-chain][~error-chain~]], a crate which -aims to implement /“consistent error handling”/ for Rust. I found the overall -design pretty convincing, and in his use case, the crate really makes its error -handling clearer and flexible. I knew /pijul/ uses ~error-chain~ to, but I never -had the occasion to dig more into it. - -At the same time, I have read quite a lot about /extensible effects/ in -Functional Programming, for an academic article I have submitted to -[[http://www.fm2018.org][Formal Methods 2018]][fn:fm2018]. In particular, the [[https://hackage.haskell.org/package/freer][freer]] package provides a very -nice API to define monadic functions which may use well-identified effects. For -instance, we can imagine that ~Console~ identifies the functions which may print -to and read from the standard output. A function ~askPassword~ which displays a -prompt and get the user password would have this type signature: - -#+BEGIN_SRC haskell -askPassword :: Member Console r => Eff r () -#+END_SRC - -Compared to ~IO~, ~Eff~ allows for meaningful type signatures. It becomes easier -to reason about function composition, and you know that a given function which -lacks a given effect in its type signature will not be able to use them. 
As a -predictable drawback, ~Eff~ can become burdensome to use. - -Basically, when my colleague showed me its Rust project and how he was using -~error-chain~, the question popped out. *Can we use an approach similar to ~Eff~ -to implement a Haskell-flavoured ~error-chain~?* - -Spoiler alert: the answer is yes. In this post, I will dive into the resulting -API, leaving for another time the details of the underlying -implementation. Believe me, there is plenty to say. If you want to have a look -already, the current implementation can be found on [[https://github.com/lethom/chain][GitHub]]. - -In this article, I will use several “advanced” GHC pragmas. I will not explain -each of them, but I will /try/ to give some pointers for the reader who wants to -learn more. - -[fn:fm2018] If the odds are in my favour, I will have plenty of occasions to write -more about this topic. - -* State of the Art - -This is not an academic publication, and my goal was primarily to explore the -arcane of the Haskell type system, so I might have skipped the proper study of -the state of the art. That being said, I have written programs in Rust and -Haskell before. - -** Starting Point - -In Rust, ~Result<T, E>~ is the counterpart of ~Either E T~ in -Haskell[fn:either]. You can use it to model to wrap either the result of a -function (~T~) or an error encountered during this computation (~E~). -Both ~Either~ and ~Result~ are used in order to achieve the same end, that is -writing functions which might fail. - -On the one hand, ~Either E~ is a monad. It works exactly as ~Maybe~ (returning -an error acts as a shortcut for the rest of the function), but gives you the -ability to specify /why/ the function has failed. To deal with effects, the -~mtl~ package provides ~EitherT~, a transformer version of ~Either~ to be used -in a monad stack. - -On the other hand, the Rust language provides the ~?~ syntactic sugar, to -achieve the same thing. 
That is, both languages provide you the means to write -potentially failing functions without the need to care locally about failure. If -your function ~B~ uses a function ~A~ which might fail, and want to fail -yourself if ~A~ fails, it becomes trivial. - -Out of the box, neither ~EitherT~ nor ~Result~ is extensible. The functions must -use the exact same ~E~, or errors must be converted manually. - -[fn:either] I wonder if they deliberately choose to swap the two type arguments. - -** Handling Errors in Rust - -Rust and the ~error-chain~ crate provide several means to overcome this -limitation. In particular, it has the ~Into~ and ~From~ traits to ease the -conversion from one error to another. Among other things, the ~error-chain~ -crate provides a macro to easily define a wrapper around many errors types, -basically your own and the one defined by the crates you are using. - -I see several drawbacks to this approach. First, it is extensible if you take -the time to modify the wrapper type each time you want to consider a new error -type. Second, either you can either use one error type or every error -type. - -However, the ~error-chain~ package provides a way to solve a very annoying -limitation of ~Result~ and ~Either~. When you “catch” an error, after a given -function returns its result, it can be hard to determine from where the error is -coming from. Imagine you are parsing a very complicated source file, and the -error you get is ~SyntaxError~ with no additional context. How would you feel? - -~error-chain~ solves this by providing an API to construct a chain of errors, -rather than a single value. - -#+BEGIN_SRC rust -my_function().chain_err(|| "a message with some context")?; -#+END_SRC - -The ~chain_err~ function makes it easier to replace a given error in its -context, leading to be able to write more meaningful error messages for -instance. 
- -* The ResultT Monad - -The ~ResultT~ is an attempt to bring together the extensible power of ~Eff~ and -the chaining of errors of ~chain_err~. I will admit that, for the latter, the -current implementation of ~ResultT~ is probably less powerful, but to be honest -I mostly cared about the “extensible” thing, so it is not very surprising. - -This monad is not an alternative to neither Monad Stacks a la mtl nor to the -~Eff~ monad. In its current state, it aims to be a more powerful and flexible -version of ~EitherT~. - -** Parameters - -As often in Haskell, the ~ResultT~ monad can be parameterised in several ways. - -#+BEGIN_SRC haskell -data ResultT msg (err :: [*]) m a -#+END_SRC - -- ~msg~ is the type of messages you can stack to provide more context to error - handling -- ~err~ is a /row of errors/[fn:row], it basically describes the set of errors - you will eventually have to handle -- ~m~ is the underlying monad stack of your application, knowing that ~ResultT~ - is not intended to be stacked itself -- ~a~ is the expected type of the computation result - -[fn:row] You might have notice ~err~ is of kind ~[*]~. To write such a thing, -you will need the [[https://www.schoolofhaskell.com/user/konn/prove-your-haskell-for-great-safety/dependent-types-in-haskell][DataKinds]] GHC pragmas. - -** ~achieve~ and ~abort~ - -The two main monadic operations which comes with ~ResultT~ are ~achieve~ and -~abort~. The former allows for building the context, by stacking so-called -messages which describe what you want to do. The latter allows for bailing on a -computation and explaining why. - -#+BEGIN_SRC haskell -achieve :: (Monad m) - => msg - -> ResultT msg err m a - -> ResultT msg err m a -#+END_SRC - -~achieve~ should be used for ~do~ blocks. You can use ~<?>~ to attach a -contextual message to a given computation. - -The type signature of ~abort~ is also interesting, because it introduces the -~Contains~ typeclass (e.g., it is equivalent to ~Member~ for ~Eff~). 
- -#+BEGIN_SRC haskell -abort :: (Contains err e, Monad m) - => e - -> ResultT msg err m a -#+END_SRC - -This reads as follows: /“you can abort with an error of type ~e~ if and only if -the row of errors ~err~ contains the type ~e~.”/ - -For instance, imagine we have an error type ~FileError~ to describe -filesystem-related errors. Then, we can imagine the following function: - -#+BEGIN_SRC haskell -readContent :: (Contains err FileError, MonadIO m) - => FilePath - -> ResultT msg err m String -#+END_SRC - -We could leverage this function in a given project, for instance to read its -configuration files (for the sake of the example, it has several configuration -files). This function can use its own type to describe ill-formed description -(~ConfigurationError~). - -#+BEGIN_SRC haskell -parseConfiguration :: (Contains err ConfigurationError, MonadIO m) - => String - -> String - -> ResultT msg err m Configuration -#+END_SRC - -To avoid repeating ~Contains~ when the row of errors needs to contains several -elements, we introduce ~:<~[fn:top] (read /subset or equal/): - -#+BEGIN_SRC haskell -getConfig :: ( '[FileError, ConfigurationError] :< err - , MonadIO m) - => ResultT String err m Configuration -getConfig = do - achieve "get configuration from ~/.myapp directory" $ do - f1 <- readContent "~/.myapp/init.conf" - <?> "fetch the main configuration" - f2 <- readContent "~/.myapp/net.conf" - <?> "fetch the net-related configuration" - - parseConfiguration f1 f2 -#+END_SRC - -You might see, now, why I say ~ResultT~ is extensible. You can use two functions -with totally unrelated errors, as long as the caller advertises that with -~Contains~ or ~:<~. - -[fn:top] If you are confused by ~:<~, it is probably because you were not aware -of the [[https://ocharles.org.uk/blog/posts/2014-12-08-type-operators.html][TypeOperators]] before. Maybe it was for the best. :D - -** Recovering by Handling Errors - -Monads are traps, you can only escape them by playing with their -rules. 
~ResultT~ comes with ~runResultT~. - -#+BEGIN_SRC haskell -runResultT :: Monad m => ResultT msg '[] m a -> m a -#+END_SRC - -This might be surprising: we can only escape out from the ~ResultT~ if we do not -use /any errors at all/. In fact, ~ResultT~ forces us to handle errors before -calling ~runResultT~. - -~ResultT~ provides several functions prefixed by ~recover~. Their type -signatures can be a little confusing, so we will dive into the simpler one: - -#+BEGIN_SRC haskell -recover :: forall e m msg err a. - (Monad m) - => ResultT msg (e ': err) m a - -> (e -> [msg] -> ResultT msg err m a) - -> ResultT msg err m a -#+END_SRC - -~recover~ allows for /removing/ an error type from the row of errors, To do -that, it requires to provide an error handler to determine what to do with the -error raised during the computation and the stack of messages at that -time. Using ~recover~, a function may use more errors than advertised in its -type signature, but we know by construction that in such a case, it handles -these errors so that it is transparent for the function user. The type of the -handler is ~e -> [msg] -> ResultT msg err m a~, which means the handler /can -raise errors if required/. ~recoverWhile msg~ is basically a synonym for -~achieve msg $ recover~. ~recoverMany~ allows for doing the same with a row of -errors, by providing as many functions as required. Finally, ~recoverManyWith~ -simplifies ~recoverMany~: you can provide only one function tied to a given -typeclass, on the condition that the handling errors implement this typeclass. - -Using ~recover~ and its siblings often requires to help a bit the Haskell -type system, especially if we use lambdas to define the error handlers. Doing -that is usually achieved with the ~Proxy a~ dataype (where ~a~ is a phantom -type). I would rather use the TypeApplications[fn:tap] pragma. 
- -#+BEGIN_SRC haskell -recoverManyWith @[FileError, NetworkError] @DescriptiveError - (do x <- readFromFile f - y <- readFromNetwork socket - printToStd x y) - printErrorAndStack -#+END_SRC - -The ~DecriptiveError~ typeclass can be seen as a dedicated ~Show~, to give -textual representation of errors. It is inspired by the macros of ~error_chain~. - -We can start from an empty row of errors, and allows ourselves to -use more errors thanks to the ~recover*~ functions. - -[fn:tap] The [[https://medium.com/@zyxoas/abusing-haskell-dependent-types-to-make-redis-queues-safer-cc31db943b6c][TypeApplications]] pragmas is probably one of my favourites. When I -use it, it feels almost like if I were writing some Gallina. - -* ~cat~ in Haskell using ResultT - -~ResultT~ only cares about error handling. The rest of the work is up to the -underlying monad ~m~. That being said, nothing forbids us to provide -fine-grained API for, e.g. Filesystem-related functions. From an error handling -perspective, the functions provided by Prelude (the standard library of Haskell) -are pretty poor, and the documentation is not really precise regarding the kind -of error we can encounter while using it. - -In this section, I will show you how we can leverage ~ResultT~ to *(i)* define an -error-centric API for basic file management functions and *(ii)* use this API to -implement a ~cat~-like program which read a file and print its content in the -standard output. - -** (A Lot Of) Error Types - -We could have one sum type to describe in the same place all the errors we can -find, and later use the pattern matching feature of Haskell to determine which -one has been raised. The thing is, this is already the job done by the row of -errors of ~ResultT~. Besides, this means that we could raise an error for being -not able to write something into a file in a function which /opens/ a file. 
- -Because ~ResultT~ is intended to be extensible, we should rather define several -types, so we can have a fine-grained row of errors. Of course, too many types -will become burdensome, so this is yet another time where we need to find the -right balance. - -#+BEGIN_SRC haskell -newtype AlreadyInUse = AlreadyInUse FilePath -newtype DoesNotExist = DoesNotExist FilePath -data AccessDeny = AccessDeny FilePath IO.IOMode -data EoF = EoF -data IllegalOperation = IllegalRead | IllegalWrite -#+END_SRC - -To be honest, this is a bit too much for the real life, but we are in a blog post -here, so we should embrace the potential of ~ResultT~. - -** Filesystem API - -By reading the [[https://hackage.haskell.org/package/base-4.9.1.0/docs/System-IO.html][System.IO]] documentation, we can infer what our functions type -signatures should look like. I will not discuss their actual implementation in -this article, as this requires me to explain how `IO` deals with errors itself -(and this article is already long enough to my taste). You can have a look at -[[https://gist.github.com/lethom/c669e68e284a056dc8c0c3546b4efe56][this gist]] if you are interested. - -#+BEGIN_SRC haskell -openFile :: ( '[AlreadyInUse, DoesNotExist, AccessDeny] :< err - , MonadIO m) - => FilePath -> IOMode -> ResultT msg err m Handle -#+END_SRC - -#+BEGIN_SRC haskell -getLine :: ('[IllegalOperation, EoF] :< err, MonadIO m) - => IO.Handle - -> ResultT msg err m Text -#+END_SRC - -#+BEGIN_SRC haskell -closeFile :: (MonadIO m) - => IO.Handle - -> ResultT msg err m () -#+END_SRC - -** Implementing ~cat~ - -We can use the ~ResultT~ monad, its monadic operations and our functions to deal -with the file system in order to implement a ~cat~-like program. I tried to -comment on the implementation to make it easier to follow. - -#+BEGIN_SRC haskell -cat :: FilePath -> ResultT String err IO () -cat path = - -- We will try to open and read this file to mimic - -- `cat` behaviour. 
- -- We advertise that in case something goes wrong - -- the process. - achieve ("cat " ++ path) $ do - -- We will recover from a potential error, - -- but we will abstract away the error using - -- the `DescriptiveError` typeclass. This way, - -- we do not need to give one handler by error - -- type. - recoverManyWith @[Fs.AlreadyInUse, Fs.DoesNotExist, Fs.AccessDeny, Fs.IllegalOperation] - @(Fs.DescriptiveError) - (do f <- Fs.openFile path Fs.ReadMode - -- `repeatUntil` works like `recover`, except - -- it repeats the computation until the error - -- actually happpens. - -- I could not have used `getLine` without - -- `repeatUntil` or `recover`, as it is not - -- in the row of errors allowed by - -- `recoverManyWith`. - repeatUntil @(Fs.EoF) - (Fs.getLine f >>= liftIO . print) - (\_ _ -> liftIO $ putStrLn "%EOF") - closeFile f) - printErrorAndStack - where - -- Using the `DescriptiveError` typeclass, we - -- can print both the stack of Strings which form - -- the context, and the description of the generic - -- error. - printErrorAndStack e ctx = do - liftIO . putStrLn $ Fs.describe e - liftIO $ putStrLn "stack:" - liftIO $ print ctx -#+END_SRC - -The type system of ~cat~ teaches us that this function handles any error it -might encounter. This means we can use it anywhere we want… in another -computation inside ~ResultT~ which might raise errors completely unrelated to -the file system, for instance. Or! We can use it with ~runResultT~, escaping the -~ResultT~ monad (only to fall into the ~IO~ monad, but this is another story). - -* Conclusion - -For once, I wanted to write about the /result/ of a project, instead of /how it -is implemented/. Rest assured, I do not want to skip the latter. I need to clean -up a bit the code before bragging about it. 
diff --git a/site/posts/Ltac.org b/site/posts/Ltac.org deleted file mode 100644 index 37580cd..0000000 --- a/site/posts/Ltac.org +++ /dev/null @@ -1,34 +0,0 @@ -#+TITLE: A Series on Ltac - -#+SERIES: ../coq.html -#+SERIES_PREV: ./StronglySpecifiedFunctions.html -#+SERIES_NEXT: ./RewritingInCoq.html - -Ltac is the “tactic language” of Coq. It is commonly advertised as the common -approach to write proofs, which tends to bias how it is introduced to -new Coq users[fn::I know /I/ was introduced to Coq in a similar way in -my Master courses.]. In this series, we present Ltac as the -metaprogramming tool it is, since fundamentally it is an imperative -language which allows for constructing Coq terms interactively and -incrementally. - -- [[./LtacMetaprogramming.html][Ltac is an Imperative Metaprogramming Language]] :: - Ltac generates terms, therefore it is a metaprogramming language. It does it - incrementally, by using primitives to modifying an implicit state, therefore - it is an imperative language. Henceforth, it is an imperative metaprogramming - language. - -- [[./LtacPatternMatching.html][Pattern Matching on Types and Contexts]] :: - Ltac allows for pattern matching on types and contexts. In this article, we - give a short introduction on this feature of key importance. Ltac programs - (“proof scripts”) generate terms, and the shape of said terms can be very - different regarding the initial context. For instance, ~induction x~ will - refine the current goal using an inductive principle dedicated to the type of - ~x~. - -- [[./MixingLtacAndGallina.html][Mixing Ltac and Gallina Together for Fun and Profit]] :: - One of the most misleading introduction to Coq is to say that “Gallina is for - programs, while tactics are for proofs.” Gallina is the preferred way to - construct programs, and tactics are the preferred way to construct proofs. - The key word here is “preferred.” Coq actually allows for /mixing/ - Ltac and Gallina together. 
diff --git a/site/posts/LtacMetaprogramming.md b/site/posts/LtacMetaprogramming.md
new file mode 100644
index 0000000..2f3e6d7
--- /dev/null
+++ b/site/posts/LtacMetaprogramming.md
@@ -0,0 +1,279 @@
+---
+published: 2020-08-28
+modified: 2023-05-08
+tags: ['coq']
+series:
+  parent: series/Ltac.html
+  next: posts/LtacPatternMatching.html
+abstract: |
+  Ltac generates terms, therefore it is a metaprogramming language. It does
+  it incrementally, by using primitives to modify an implicit state,
+  therefore it is an imperative language. Hence, it is an imperative
+  metaprogramming language.
+---
+
+# Ltac is an Imperative Metaprogramming Language
+
+Coq is often depicted as an _interactive_ proof assistant, thanks to its
+proof environment. Inside the proof environment, Coq presents the user a
+goal, and said user solves said goal by means of tactics which describe a
+logical reasoning. For instance, to reason by induction, one can use the
+`induction`{.coq} tactic, while a simple case analysis can rely on the
+`destruct`{.coq} or `case_eq`{.coq} tactics, etc. It is not uncommon for new
+Coq users to be introduced to Ltac, the default tactic language of Coq, using
+this proof-centric approach. This is not surprising, since writing proofs
+remains the main use case for Ltac. In practice, though, this discourse
+remains an abstraction which hides away what actually happens under the hood
+when Coq executes a proof script.
+
+To really understand what Ltac is about, we need to remind ourselves that the
+Coq kernel is mostly a type checker. A theorem statement is expressed as a
+“type” (which lives in a dedicated sort called `Prop`{.coq}), and a proof of
+this theorem is a term of this type, just like the term `S (S O)`{.coq} ($2$)
+is of type `nat`{.coq}. Proving a theorem in Coq requires constructing a term
+of the type encoding said theorem, and Ltac allows for incrementally
+constructing this term, one step at a time.
+
+Ltac generates terms, therefore it is a metaprogramming language. It does it
+incrementally, by using primitives to modify an implicit state, therefore
+it is an imperative language. Hence, it is an imperative
+metaprogramming language.
+
+To summarize, a goal presented by Coq inside the proof environment is a hole
+within the term being constructed. It is presented to users as:
+
+- A list of “hypotheses,” which are nothing more than the variables
+  in the scope of the hole
+- A type, which is the type of the term to construct to fill the hole
+
+We illustrate what happens under the hood when Coq executes a simple proof
+script. One can use the `Show Proof`{.coq} vernacular command to exhibit
+this.
+
+Consider the following example.
+
+```coq
+Theorem add_n_O : forall (n : nat), n + O = n.
+Proof.
+```
+
+The `Proof`{.coq} vernacular command starts the proof environment. Since no
+tactic has been used, the term we are trying to construct consists solely of a
+hole, while Coq presents us with a goal that has no hypothesis, and with the
+exact type of our theorem, that is `forall (n : nat), n + O = n`{.coq}.
+
+A typical Coq course will explain to students that the `intro`{.coq} tactic
+allows for turning the premise of an implication into a hypothesis within the
+context.
+
+$$C \vdash P \rightarrow Q$$
+
+becomes
+
+$$C,\ P \vdash Q$$
+
+This is a fundamental rule of deductive reasoning, and `intro`{.coq} encodes it.
+It achieves this by refining the current hole into an anonymous function.
+When we use
+
+```coq
+  intro n.
+```
+
+it refines the term
+
+```coq
+  ?Goal1
+```
+
+into
+
+```coq
+  fun (n : nat) => ?Goal2
+```
+
+The next step of this proof is to reason by induction on `n`.
For `nat`, it means that to be able to prove
+
+$$\forall n, \mathrm{P}\ n$$
+
+we need to prove that
+
+- $\mathrm{P}\ 0$
+- $\forall n, \mathrm{P}\ n \rightarrow \mathrm{P}\ (S n)$
+
+The `induction`{.coq} tactic effectively turns our goal into two subgoals. But
+why is that? Because, under the hood, Ltac is refining the current goal using
+the `nat_ind`{.coq} function automatically generated by Coq when `nat`{.coq}
+was defined. The type of `nat_ind`{.coq} is
+
+```coq
+  forall (P : nat -> Prop),
+    P 0
+    -> (forall n : nat, P n -> P (S n))
+    -> forall n : nat, P n
+```
+
+Interestingly enough, `nat_ind`{.coq} is not an axiom. It is a regular Coq
+function, whose definition is
+
+```coq
+  fun (P : nat -> Prop) (f : P 0)
+      (f0 : forall n : nat, P n -> P (S n)) =>
+    fix F (n : nat) : P n :=
+      match n as n0 return (P n0) with
+      | 0 => f
+      | S n0 => f0 n0 (F n0)
+      end
+```
+
+So, after executing
+
+```coq
+  induction n.
+```
+
+the hidden term we are constructing becomes
+
+```coq
+  (fun n : nat =>
+    nat_ind
+      (fun n0 : nat => n0 + 0 = n0)
+      ?Goal3
+      (fun (n0 : nat) (IHn : n0 + 0 = n0) => ?Goal4)
+      n)
+```
+
+and Coq presents us with two goals.
+
+First, we need to prove $\mathrm{P}\ 0$, *i.e.*,
+$0 + 0 = 0$. Just as Coq presents a goal while what we are actually doing is
+constructing a term, the use of $=$ and $+$ (*i.e.*, the Coq notations
+mechanism) hides much here. We can ask Coq to be more explicit by using the
+vernacular command `Set Printing All`{.coq} to learn that when Coq presents us
+with a goal of the form `0 + 0 = 0`{.coq}, it is actually looking for a term
+of type `@eq nat (Nat.add O O) O`{.coq}.
+
+`Nat.add`{.coq} is a regular Coq (recursive) function.
+
+```coq
+  fix add (n m : nat) {struct n} : nat :=
+    match n with
+    | 0 => m
+    | S p => S (add p m)
+    end
+```
+
+Similarly, `eq`{.coq} is *not* an axiom. It is a regular inductive type, defined
+as follows.
+
+```coq
+Inductive eq (A : Type) (x : A) : A -> Prop :=
+| eq_refl : eq A x x
+```
+
+Coming back to our current goal, proving `@eq nat (Nat.add 0 0) 0`{.coq}[^equ1]
+requires constructing a term of a type whose only constructor is
+`eq_refl`{.coq}. `eq_refl`{.coq} accepts one argument, and encodes the proof
+that said argument is equal to itself. In practice, the Coq type checker will
+accept the term `@eq_refl _ x`{.coq} when it expects a term of type
+`@eq _ x y`{.coq} *if* it can reduce `x`{.coq} and `y`{.coq} to the same term.
+
+[^equ1]: That is, `0 + 0 = 0`{.coq}
+
+Is it the case for `@eq nat (Nat.add 0 0) 0`{.coq}? It is, since by definition
+of `Nat.add`{.coq}, `Nat.add 0 x`{.coq} reduces to `x`{.coq}. We can use the
+`reflexivity`{.coq} tactic to tell Coq to fill the current hole with
+`eq_refl`{.coq}.
+
+```coq
+  + reflexivity.
+```
+
+Suspicious readers may rely on `Show Proof`{.coq} to verify this assertion.
+
+```coq
+  (fun n : nat =>
+    nat_ind
+      (fun n0 : nat => n0 + 0 = n0)
+      eq_refl
+      (fun (n0 : nat) (IHn : n0 + 0 = n0) => ?Goal4)
+      n)
+```
+
+`?Goal3`{.coq} has indeed been replaced by `eq_refl`{.coq}.
+
+One goal remains, as we need to prove that if `n + 0 = n`{.coq}, then
+`S n + 0 = S n`{.coq}. Coq can reduce `S n + 0`{.coq} to `S (n + 0)`{.coq} by
+definition of `Nat.add`{.coq}, but it cannot reduce `S n`{.coq} more than it
+already is. We can request it to do so using tactics such as `cbn`{.coq}.
+
+```coq
+  + cbn.
+```
+
+We cannot just use `reflexivity`{.coq} here (*i.e.*, fill the hole with
+`eq_refl`{.coq}), since `S (n + 0)`{.coq} and `S n`{.coq} cannot be reduced to
+the same term.
+
+However, at this point of the proof, we have the `IHn`{.coq} hypothesis
+(*i.e.*, the `IHn`{.coq} argument of the anonymous function whose body we are
+trying to construct). The `rewrite`{.coq} tactic allows for substituting in a
+type an occurrence of `x`{.coq} by `y`{.coq} as long as we have a proof of
+`x = y`{.coq}.
+
+```coq
+    rewrite IHn.
+```
+
+Similarly to `induction`{.coq} using a dedicated function, `rewrite`{.coq}
+refines the current hole with the `eq_ind_r`{.coq} function[^noaxiom].
+Replacing `n + 0`{.coq} with `n`{.coq} transforms the goal into
+`S n = S n`{.coq}. Here, we can use `reflexivity`{.coq} (*i.e.*,
+`eq_refl`{.coq}) to conclude.
+
+[^noaxiom]: Again, not an axiom.
+
+```coq
+    reflexivity.
+```
+
+After this last tactic, the work is done. There is no more goal to fill inside
+the proof term that we have carefully constructed.
+
+```coq
+  (fun n : nat =>
+    nat_ind
+      (fun n0 : nat => n0 + 0 = n0)
+      eq_refl
+      (fun (n0 : nat) (IHn : n0 + 0 = n0) =>
+        eq_ind_r (fun n1 : nat => S n1 = S n0) eq_refl IHn)
+      n)
+```
+
+We can finally use `Qed`{.coq} or `Defined`{.coq} to tell Coq to type check
+this term. That is, Coq does not trust Ltac, but rather type checks the term
+to verify it is correct. This way, in case Ltac has a bug which makes it
+construct an ill-formed term, at the very least Coq can reject it.
+
+```coq
+Qed.
+```
+
+In conclusion, tactics are used to incrementally refine holes inside a term
+until the latter is complete. They do it in a very specific manner, to
+encode certain reasoning rules.
+
+On the other hand, the `refine`{.coq} tactic provides a generic, low-level way
+to do the same thing. Knowing how a given tactic works allows for mimicking
+its behavior using the `refine`{.coq} tactic.
+
+If we take the previous theorem as an example, we can prove it using this
+alternative proof script.
+
+```coq
+Theorem add_n_O' : forall (n : nat), n + O = n.
+Proof.
+  refine (fun n => _).
+  refine (nat_ind (fun n => n + 0 = n) _ _ n).
+  + refine eq_refl.
+  + refine (fun m IHm => _).
+    refine (eq_ind_r (fun n => S n = S m) _ IHm).
+    refine eq_refl.
+Qed.
+```
diff --git a/site/posts/LtacMetaprogramming.v b/site/posts/LtacMetaprogramming.v deleted file mode 100644 index 6b086e3..0000000 --- a/site/posts/LtacMetaprogramming.v +++ /dev/null @@ -1,254 +0,0 @@ -(** #<nav><p class="series">Ltac.html</p> - <p class="series-next">LtacPatternMatching.html</p></nav># *) - -(** * Ltac is an Imperative Metaprogramming Language *) - -(** #<div id="history">site/posts/LtacMetaprogramming.v</div># *) - -(** Coq is often depicted as an _interactive_ proof assistant, thanks to its - proof environment. Inside the proof environment, Coq presents the user a - goal, and said user solves said goal by means of tactics which describes a - logical reasoning. For instance, to reason by induction, one can use the - <<induction>> tactic, while a simple case analysis can rely on the - <<destruct>> or <<case_eq>> tactics, etc. It is not uncommon for new Coq - users to be introduced to Ltac, the Coq default tactic language, using this - proof-centric approach. This is not surprising, since writing proofs remains - the main use-case for Ltac. In practice though, this discourse remains an - abstraction which hides away what actually happens under the hood when Coq - executes a proof scripts. - - To really understand what Ltac is about, we need to recall ourselves that - Coq kernel is mostly a typechecker. A theorem statement is expressed as a - “type” (which lives in a dedicated sort called [Prop]), and a proof of this - theorem is a term of this type, just like the term [S (S O)] (#<span - class="im">#2#</span>#) is of type [nat]. Proving a theorem in Coq requires - to construct a term of the type encoding said theorem, and Ltac allows for - incrementally constructing this term, one step at a time. - - Ltac generates terms, therefore it is a metaprogramming language. It does it - incrementally, by using primitives to modifying an implicit state, therefore - it is an imperative language. Henceforth, it is an imperative - metaprogramming language. 
- - To summarize, a goal presented by Coq inside the environment proof is a hole - within the term being constructed. It is presented to users as: - - - A list of “hypotheses,” which are nothing more than the variables - in the scope of the hole - - A type, which is the type of the term to construct to fill the hole - - We illustrate what happens under the hood of Coq executes a simple proof - script. One can use the <<Show Proof>> vernacular command to exhibit - this. - - We illustrate how Ltac works with the following example. *) - -Theorem add_n_O : forall (n : nat), n + O = n. - -Proof. - -(** The <<Proof>> vernacular command starts the proof environment. Since no - tactic has been used, the term we are trying to construct consists solely in - a hole, while Coq presents us a goal with no hypothesis, and with the exact - type of our theorem, that is [forall (n : nat), n + O = n]. - - A typical Coq course will explain students the <<intro>> tactic allows for - turning the premise of an implication into an hypothesis within the context. - - #<div class="dmath">#C \vdash P \rightarrow Q#</div># - - becomes - - #<div class="dmath">#C,\ P \vdash Q#</div># - - This is a fundamental rule of deductive reasoning, and <<intro>> encodes it. - It achieves this by refining the current hole into an anymous function. - When we use *) - - intro n. - -(** it refines the term - -<< - ?Goal1 ->> - - into - -<< - fun (n : nat) => ?Goal2 ->> - - The next step of this proof is to reason about induction on [n]. For [nat], - it means that to be able to prove - - #<div class="dmath">#\forall n, \mathrm{P}\ n#</div># - - we need to prove that - - - #<div class="imath">#\mathrm{P}\ 0#</div># - - #<div class="imath">#\forall n, \mathrm{P}\ n \rightarrow \mathrm{P}\ (S n)#</div># - - The <<induction>> tactic effectively turns our goal into two subgoals. But - why is that? 
Because, under the hood, Ltac is refining the current goal - using the [nat_ind] function automatically generated by Coq when [nat] was - defined. The type of [nat_ind] is - -<< - forall (P : nat -> Prop), - P 0 - -> (forall n : nat, P n -> P (S n)) - -> forall n : nat, P n ->> - Interestingly enough, [nat_ind] is not an axiom. It is a regular Coq function, whose definition is - -<< - fun (P : nat -> Prop) (f : P 0) - (f0 : forall n : nat, P n -> P (S n)) => - fix F (n : nat) : P n := - match n as n0 return (P n0) with - | 0 => f - | S n0 => f0 n0 (F n0) - end ->> - - So, after executing *) - - induction n. - -(** The hidden term we are constructing becomes - -<< - (fun n : nat => - nat_ind - (fun n0 : nat => n0 + 0 = n0) - ?Goal3 - (fun (n0 : nat) (IHn : n0 + 0 = n0) => ?Goal4) - n) ->> - - And Coq presents us two goals. - - First, we need to prove #<span class="dmath">#\mathrm{P}\ 0#</span>#, i.e., - #<span class="dmath">#0 + 0 = 0#</span>#. Similarly to Coq presenting a goal - when what we are actually doing is constructing a term, the use of #<span - class="dmath">#+#</span># and #<span class="dmath">#+#</span># (i.e., Coq - notations mechanism) hide much here. We can ask Coq to be more explicit by - using the vernacular command <<Set Printing All>> to learn that when Coq - presents us a goal of the form [0 + 0 = 0], it is actually looking for a - term of type [@eq nat (Nat.add O O) O]. - - [Nat.add] is a regular Coq (recursive) function. - -<< - fix add (n m : nat) {struct n} : nat := - match n with - | 0 => m - | S p => S (add p m) - end ->> - - Similarly, [eq] is _not_ an axiom. It is a regular inductive type, defined - as follows. - -<< -Inductive eq (A : Type) (x : A) : A -> Prop := -| eq_refl : eq A x x ->> - - Coming back to our current goal, proving [@eq nat (Nat.add 0 0) 0] (i.e., [0 - + 0 = 0]) requires to construct a term of a type whose only constructor is - [eq_refl]. 
[eq_refl] accepts one argument, and encodes the proof that said - argument is equal to itself. In practice, Coq typechecker will accept the - term [@eq_refl _ x] when it expects a term of type [@eq _ x y] _if_ it can - reduce [x] and [y] to the same term. - - Is it the case for [0 + 0 = 0]? It is, since by definition of [Nat.add], [0 - + x] is reduced to [x]. We can use the <<reflexivity>> tactic to tell Coq to - fill the current hole with [eq_refl]. *) - - + reflexivity. - - (** Suspicious readers may rely on <<Show Proof>> to verify this assertion - assert: -<< - (fun n : nat => - nat_ind - (fun n0 : nat => n0 + 0 = n0) - eq_refl - (fun (n0 : nat) (IHn : n0 + 0 = n0) => ?Goal4) - n) ->> - - <<?Goal3>> has indeed be replaced by [eq_refl]. - - One goal remains, as we need to prove that if [n + 0 = n], then [S n + 0 = S - n]. Coq can reduce [S n + 0] to [S (n + 0)] by definition of [Nat.add], but - it cannot reduce [S n] more than it already is. We can request it to do so - using tactics such as [cbn]. *) - - + cbn. - -(** We cannot just use <<reflexivity>> here (i.e., fill the hole with - [eq_refl]), since [S (n + 0)] and [S n] cannot be reduced to the same term. - However, at this point of the proof, we have the [IHn] hypothesis (i.e., the - [IHn] argument of the anonymous function whose body we are trying to - construct). The <<rewrite>> tactic allows for substituting in a type an - occurence of [x] by [y] as long as we have a proof of [x = y]. *) - - rewrite IHn. - - (** Similarly to <<induction>> using a dedicated function , <<rewrite>> refines - the current hole with the [eq_ind_r] function (not an axiom!). Replacing [n - + 0] with [n] transforms the goal into [S n = S n]. Here, we can use - <<reflexivity>> (i.e., [eq_refl]) to conclude. *) - - reflexivity. - -(** After this last tactic, the work is done. There is no more goal to fill - inside the proof term that we have carefully constructed. 
- -<< - (fun n : nat => - nat_ind - (fun n0 : nat => n0 + 0 = n0) - eq_refl - (fun (n0 : nat) (IHn : n0 + 0 = n0) => - eq_ind_r (fun n1 : nat => S n1 = S n0) eq_refl IHn) - n) ->> - - We can finally use [Qed] or [Defined] to tell Coq to typecheck this - term. That is, Coq does not trust Ltac, but rather typecheck the term to - verify it is correct. This way, in case Ltac has a bug which makes it - construct ill-formed type, at the very least Coq can reject it. *) - -Qed. - -(** In conclusion, tactics are used to incrementally refine hole inside a term - until the latter is complete. They do it in a very specific manner, to - encode certain reasoning rule. - - On the other hand, the <<refine>> tactic provides a generic, low-level way - to do the same thing. Knowing how a given tactic works allows for mimicking - its behavior using the <<refine>> tactic. - - If we take the previous theorem as an example, we can prove it using this - alternative proof script. *) - -Theorem add_n_O' : forall (n : nat), n + O = n. - -Proof. - refine (fun n => _). - refine (nat_ind (fun n => n + 0 = n) _ _ n). - + refine eq_refl. - + refine (fun m IHm => _). - refine (eq_ind_r (fun n => S n = S m) _ IHm). - refine eq_refl. -Qed. - -(** ** Conclusion *) - -(** This concludes our introduction to Ltac as an imperative metaprogramming - language. In the #<a href="LtacPatternMatching.html">#next part of this series#</a>#, we - present Ltac powerful pattern matching capabilities. *) diff --git a/site/posts/LtacPatternMatching.md b/site/posts/LtacPatternMatching.md new file mode 100644 index 0000000..2210a61 --- /dev/null +++ b/site/posts/LtacPatternMatching.md @@ -0,0 +1,203 @@ +--- +published: 2020-08-28 +tags: ['coq'] +series: + parent: series/Ltac.html + prev: posts/LtacMetaprogramming.html + next: posts/MixingLtacAndGallina.html +abstract: | + Ltac allows for pattern matching on types and contexts. In this article, we + give a short introduction on this feature of key importance. 
Ltac programs
+  (“proof scripts”) generate terms, and the shape of said terms can be very
+  different depending on the initial context. For instance, `induction
+  x`{.coq} will refine the current goal using an inductive principle dedicated
+  to the type of `x`{.coq}.
+---
+
+# Pattern Matching on Types and Contexts
+
+In [a previous article](/posts/LtacMetaprogramming.html) of our series on
+Ltac, we have shown how tactics allow for constructing Coq terms incrementally.
+Ltac programs (“proof scripts”) generate terms, and the shape of said terms can
+be very different depending on the initial context. For instance, `induction
+x`{.coq} will refine the current goal by using an inductive principle dedicated
+to the type of `x`{.coq}.
+
+This is possible because Ltac allows for pattern matching on types and
+contexts. In this article, we give a short introduction on this feature of
+key importance.
+
+## To `lazymatch`{.coq} or to `match`{.coq}
+
+Gallina provides one pattern matching construction, whose semantics is always
+the same: for a given term, the first pattern to match will always be selected.
+On the contrary, Ltac provides several pattern matching constructions with
+different semantics. This key difference is probably motivated by the fact that
+Ltac is not a total language: a tactic may not always succeed.
+
+Ltac programmers can use `match`{.coq} or `lazymatch`{.coq}. On the one hand,
+with `match`{.coq}, if one pattern matches, but the branch body fails, Coq will
+backtrack and try the next branch. On the other hand, `lazymatch`{.coq} will
+stop with an error.
+
+So, for instance, the two following tactics will print two different messages:
+
+```coq
+Ltac match_failure :=
+  match goal with
+  | _
+    => fail "fail in first branch"
+  | _
+    => fail "fail in second branch"
+  end.
+
+Ltac lazymatch_failure :=
+  lazymatch goal with
+  | _
+    => fail "fail in first branch"
+  | _
+    => fail "fail in second branch"
+  end.
+```
+
+We can try that quite easily by starting a dumb goal (e.g.,
+`Goal (True).`{.coq}) and calling our tactic. For `match_failure`{.coq}, we
+get:
+
+```
+Ltac call to "match_failure" failed.
+Error: Tactic failure: fail in second branch.
+```
+
+On the other hand, for `lazymatch_failure`{.coq}, we get:
+
+```
+Ltac call to "lazymatch_failure" failed.
+Error: Tactic failure: fail in first branch.
+```
+
+Moreover, pattern matching allows for matching over *patterns*, not just
+constants. Here, Ltac syntax differs from Gallina’s. In Gallina, if a
+variable name is used in a pattern, Gallina creates a new variable in the
+scope of the related branch. If a variable with the same name already
+existed, it is shadowed. On the contrary, in Ltac, using a variable name
+as-is in a pattern implies an equality check.
+
+To illustrate this difference, we can take the example of the following
+tactic.
+
+```coq
+Ltac successive x y :=
+  lazymatch y with
+  | S x => idtac
+  | _ => fail
+  end.
+```
+
+`successive x y`{.coq} will fail if `y`{.coq} is not the successor of
+`x`{.coq}. On the contrary, the “syntactically equivalent” function in Gallina
+will exhibit a totally different behavior.
+
+```coq
+Definition successor (x y : nat) : bool :=
+  match y with
+  | S x => true
+  | _ => false
+  end.
+```
+
+Here, the first branch of the pattern match is systematically selected when
+`y`{.coq} is not `O`{.coq}, and in this branch, the argument `x`{.coq} is
+shadowed by the predecessor of `y`{.coq}.
+
+For Ltac to adopt a behavior similar to Gallina’s, the `?`{.coq} prefix shall
+be used. For instance, the following tactic will check whether its argument
+has a known successor, print it if it does, or fail otherwise.
+
+```coq
+Ltac print_pred_or_zero x :=
+  match x with
+  | S ?x => idtac x
+  | _ => fail
+  end.
+```
+
+On the one hand, `print_pred_or_zero 3`{.coq} will print `2`{.coq}.
On the
+other hand, if there exists a variable `x : nat`{.coq} in the context, calling
+`print_pred_or_zero x`{.coq} will fail, since the exact value of `x`{.coq} is
+not known.
+
+## Pattern Matching on Types with `type of`{.coq}
+
+The `type of`{.coq} operator can be used in conjunction with pattern matching
+to generate different terms according to the type of a variable. We could
+leverage this to reimplement `induction`{.coq} for instance.
+
+As an example, we propose the following `not_nat`{.coq} tactic which, given an
+argument `x`{.coq}, fails if `x`{.coq} is of type `nat`{.coq}.
+
+```coq
+Ltac not_nat x :=
+  lazymatch type of x with
+  | nat => fail "argument is of type nat"
+  | _ => idtac
+  end.
+```
+
+With this definition, `not_nat true`{.coq} succeeds since `true`{.coq} is of
+type `bool`{.coq}, and `not_nat O`{.coq} fails since `O`{.coq} encodes $0$ in
+`nat`{.coq}.
+
+We can also use the `?`{.coq} prefix to write actual patterns. For instance,
+the following tactic will fail if the type of its supplied argument has at
+least one parameter.
+
+```coq
+Ltac not_param_type x :=
+  lazymatch type of x with
+  | ?f ?a => fail "argument is of type with parameter"
+  | _ => idtac
+  end.
+```
+
+Both `not_param_type (@nil nat)`{.coq} of type `list nat`{.coq} and `@eq_refl
+nat 0`{.coq} of type `0 = 0`{.coq} fail, but `not_param_type 0`{.coq} of type
+`nat`{.coq} succeeds.
+
+## Pattern Matching on the Context with `goal`{.coq}
+
+Lastly, we discuss how Ltac allows for inspecting the context (*i.e.*, the
+hypotheses and the goal) using the `goal`{.coq} keyword.
+
+For instance, we propose a naive implementation of the `subst`{.coq} tactic
+as follows.
+
+```coq
+Ltac subst' :=
+  repeat
+    match goal with
+    | H : ?x = _ |- context[?x]
+      => repeat rewrite H; clear H
+    end.
+```
+
+With `goal`{.coq}, patterns are of the form `H : (pattern), ... |-
+(pattern)`{.coq}.
+
+- At the left side of `|-`{.coq}, we match on hypotheses.
Beware that
+  contrary to variable names in patterns, hypothesis names behave as in
+  Gallina (*i.e.*, fresh binding, shadowing, etc.). In the branch, we are
+  looking for equations, *i.e.*, a hypothesis of the form `?x = _`{.coq}.
+- At the right side of `|-`{.coq}, we match on the goal.
+
+In both cases, Ltac makes available an interesting operator,
+`context[(pattern)]`{.coq}, which is satisfied if `(pattern)`{.coq} appears
+somewhere in the object we are pattern matching against. So, the branch of the
+`match`{.coq} reads as follows: we are looking for an equation `H`{.coq} which
+specifies the value of an object `x`{.coq} which appears in the goal. If such
+an equation exists, `subst'`{.coq} tries to rewrite `x`{.coq} as many times as
+possible.
+
+This implementation of `subst'`{.coq} is very fragile, and will not work if the
+equation is of the form `_ = ?x`{.coq}, and it may behave poorly if we have
+“transitive equations”, such as when there exist hypotheses `?x = ?y`{.coq}
+and `?y = _`{.coq}. Motivated readers may be interested in proposing a more
+robust implementation of `subst'`{.coq}.
diff --git a/site/posts/LtacPatternMatching.v b/site/posts/LtacPatternMatching.v
deleted file mode 100644
index a0b8a4d..0000000
--- a/site/posts/LtacPatternMatching.v
+++ /dev/null
@@ -1,188 +0,0 @@
-(** #<nav><p class="series">Ltac.html</p>
-    <p class="series-prev">./LtacMetaprogramming.html</p>
-    <p class="series-next">./MixingLtacAndGallina.html</p></nav># *)
-
-(** * Pattern Matching on Types and Contexts *)
-
-(** In the #<a href="LtacMetaprogramming.html">#previous article#</a># of our
-    series on Ltac, we have shown how tactics allow for constructing Coq terms
-    incrementally. Ltac programs (“proof scripts”) generate terms, and the
-    shape of said terms can be very different regarding the initial context. For
-    instance, [induction x] will refine the current goal using an inductive
-    principle dedicated to the type of [x].
- - This is possible because Ltac allows for pattern matching on types and - contexts. In this article, we give a short introduction on this feature of - key importance. *) - -(** #<nav id="generate-toc"></nav># - - #<div id="history">site/posts/LtacPatternMatching.v</div># *) - -(** ** To [lazymatch] or to [match] *) - -(** Gallina provides one pattern matching construction, whose semantics is - always the same: for a given term, the first pattern to match will always be - selected. On the contrary, Ltac provides several pattern matching - constructions with different semantics. This key difference has probably - been motivated because Ltac is not a total language: a tactic may not always - succeed. - - Ltac programmers can use [match] or [lazymatch]. One the one hand, with - [match], if one pattern matches, but the branch body fails, Coq will - backtrack and try the next branch. On the other hand, [lazymatch] will stop - on error. - - So, for instance, the two following tactics will print two different - messages: *) - -Ltac match_failure := - match goal with - | _ - => fail "fail in first branch" - | _ - => fail "fail in second branch" - end. - -Ltac lazymatch_failure := - lazymatch goal with - | _ - => fail "fail in first branch" - | _ - => fail "fail in second branch" - end. - -(** We can try that quite easily by starting a dumb goal (eg. [Goal (True).]) - and call our tactic. For [match_failure], we get: - -<< -Ltac call to "match_failure" failed. -Error: Tactic failure: fail in second branch. ->> - - On the other hand, for [lazymatch_failure], we get: - -<< -Ltac call to "match_failure'" failed. -Error: Tactic failure: fail in first branch. ->> - - Moreover, pattern matching allows for matching over _patterns_, not just - constants. Here, Ltac syntax differs from Gallina’s. In Gallina, if a - variable name is used in a pattern, Gallina creates a new variable in the - scope of the related branch. 
If a variable with the same name already - existed, it is shadowed. On the contrary, in Ltac, using a variable name - as-is in a pattern implies an equality check. - - To illustrate this difference, we can take the example of the following - tactic. *) - -Ltac successive x y := - lazymatch y with - | S x => idtac - | _ => fail - end. - -(** [successive x y] will fails if [y] is not the successor of [x]. On the - contrary, the “syntactically equivalent” function in Gallina will exhibit - a totally different behavior. *) - -Definition successor (x y : nat) : bool := - match y with - | S x => true - | _ => false - end. - -(** Here, the first branch of the pattern match is systematically selected when - [y] is not O, and in this branch, the argument [x] is shadowed by the - predecessor of [y]. - - For Ltac to adopt a behavior similar to Gallina, the [?] prefix shall be - used. For instance, the following tactic will check whether its argument - has a known successor, prints it if it does, or fail otherwise. *) - -Ltac print_pred_or_zero x := - match x with - | S ?x => idtac x - | _ => fail - end. - -(** On the one hand, [print_pred_or_zero 3] will print [2]. On the other hand, - if there exists a variable [x : nat] in the context, calling - [print_pred_or_zero x] will fail, since the exact value of [x] is not - known. *) - -(** ** Pattern Matching on Types with [type of] *) - -(** The [type of] operator can be used in conjunction to pattern matching to - generate different terms according to the type of a variable. We could - leverage this to reimplement <<induction>> for instance. - - As an example, we propose the following <<not_nat>> tactic which, given an - argument [x], fails if [x] is of type [nat]. *) - -Ltac not_nat x := - lazymatch type of x with - | nat => fail "argument is of type nat" - | _ => idtac - end. 
- -(** With this definition, <<not_nat true>> succeeds since [true] is of type - [bool], and [not_nat O] since [O] encodes #<span class="imath">#0#</span># in - [nat]. - - We can also use the [?] prefix to write true pattern. For instance, the - following tactic will fail if the type of its supplied argument has at least - one parameter. *) - -Ltac not_param_type x := - lazymatch type of x with - | ?f ?a => fail "argument is of type with parameter" - | _ => idtac - end. - -(** Both <<not_param_type (@nil nat)>> of type [list nat] and - <<(@eq_refl nat 0)>> of type [0 = 0] fail, but <<not_param_type 0>> of type [nat] - succeeds. *) - -(** ** Pattern Matching on the Context with [goal] *) - -(** Lastly, we discuss how Ltac allows for inspecting the context (i.e., the - hypotheses and the goal) using the [goal] keyword. - - For instance, we propose a naive implementation of the [subst] tactic - as follows. *) - -Ltac subst' := - repeat - match goal with - | H : ?x = _ |- context[?x] - => repeat rewrite H; clear H - end. - -(** With [goal], patterns are of the form <<H : (pattern), ... |- (pattern)>>. - - - At the left side of [|-], we match on hypotheses. Beware that - contrary to variable name in pattern, hypothesis names behaves as in - Gallina (i.e., fresh binding, shadowing, etc.). In the branch, we are - looking for equations, i.e., an hypothesis of the form [?x = _]. - - At the right side of [|-], we match on the goal. - - In both cases, Ltac makes available an interesting operator, - [context[(pattern)]], which is satisfies if [(pattern)] appears somewhere in - the object we are pattern matching against. So, the branch of the [match] - reads as follows: we are looking for an equation [H] which specifies the - value of an object [x] which appears in the goal. If such an equation - exists, <<subst'>> tries to <<rewrite>> [x] as many time as possible. 
-
-    This implementation of [subst'] is very fragile, and will not work if the
-    equation is of the form [_ = ?x], and it may behaves poorly if we have
-    “transitive equations”, such as there exists hypotheses [?x = ?y] and [?y =
-    _]. Motivated readers may be interested in proposing more robust
-    implementation of [subst']. *)
-
-(** ** Conclusion *)
-
-(** This concludes our tour on Ltac pattern matching capabilities. In the #<a
-    href="MixingLtacAndGallina.html">#next article of this series#</a>#, we
-    explain how Ltac and Gallina can actually be used simultaneously. *)
diff --git a/site/posts/May2023.md b/site/posts/May2023.md
new file mode 100644
index 0000000..a65c469
--- /dev/null
+++ b/site/posts/May2023.md
@@ -0,0 +1,167 @@
+---
+published: 2023-05-18
+tags:
+  - emacs
+  - meta
+  - neovim
+  - releases
+  - spatial-shell
+series:
+  parent: series/Retrospectives.html
+  prev: posts/November2022.html
+abstract: |
+  “Regularity is key.” But sometimes, it is a bit hard to get right.
+  Anyway, let’s catch up.
+---
+
+# What happened since December 2022?
+
+Initially, I started this “What happened” series as an exercise to publish
+more regularly on this website. Suffice it to say, I haven’t done a
+particularly impressive job in that regard, which only means I have a lot of
+room for improvement.
+
+Anyway, if the first few months of 2023 have been mostly `$WORK`{.bash}
+focused, the same cannot be said for April and May. For one, I have started
+[running](/running.html) again. But this is only the tip of the iceberg.
+
+## Spatial Shell got its first releases
+
+[Spatial Shell](https://github.com/lthms/spatial-shell) is probably the hobby
+project I am most excited about. The [“call for testers”
+article](/posts/CFTSpatialShell.html) I published recently managed to
+catch the attention of a few folks[^fail].
The perspective to publish such a
+write-up was a very strong source of motivation for me to clean up a project I
+had been using daily for several months, and I am very satisfied with the
+result.
+
+[^fail]: You want to hear a lesson I learned the hard way just after publishing
+    it? Before calling for testers, it is better to [be sure your project can
+    actually be compiled easily by the potential
+    volunteers](https://github.com/lthms/spatial-shell/issues/2#issuecomment-1527193430).
+
+Mass adoption is still a distant horizon, but the project is now
+mainstream enough that it has already been mentioned in [a random topic on the
+OCaml Discourse by someone who isn’t
+me](https://discuss.ocaml.org/t/window-manager-xmonad-in-ocaml/12048/4). 🎉
+
+This led me to formally release a first version of Spatial Shell at the end of
+April, and a second today. For the first time, I have also published [an
+Archlinux package](https://aur.archlinux.org/packages/spatial-shell), to make
+the life of potential early adopters even easier. Do not hesitate to upvote it
+so that it can find its way to the `extra` repository some day.
+
+## Goodbye Emacs! Remember me, Neovim?
+
+In 2015, I started using Coq for my PhD thesis, and at the time, there was no
+real support for (Neo)vim[^coq]. Everyone was using [Proof
+General](https://proofgeneral.github.io/) and Emacs, so I was left with little
+choice but to follow through. With only my courage and the [good advice of a
+fellow “vimer” who had also made a similar
+journey](https://juanjoalvarez.net/posts/2014/vim-emacsevil-chaotic-migration-guide/),
+I started using Emacs.
+
+[^coq]: The situation later improved. Nowadays, you can implement your theories
+    using [Coqtail](https://github.com/whonore/Coqtail), and [Coq
+    LSP](https://github.com/ejgallego/coq-lsp) will probably become a viable
+    and interesting setup in the near future.
+
+Fast forward 8 years, and my [Emacs
+configuration](https://src.soap.coffee/dotfiles/emacs.d) has become a project
+of its own. Overall, I was pretty happy with my setup, but at the same time, I
+always remained a bit nostalgic for my Neovim days. This is probably why I
+decided to give this old friend a try when my company bought me a new laptop. I
+also used this as an opportunity to try out this LSP thing everyone was talking
+about.
+
+It has been a month now, and I do not plan to go back to my previous habits.
+There are still a few rough edges here and there, and I still need to get my
+head around Lua, but LSP is nice, and plugins like
+[telescope](https://github.com/nvim-telescope/telescope.nvim) are simply too
+beautiful.
+
+That being said, there was one aspect of moving from Emacs to Neovim I had not
+anticipated: Org mode. Which constitutes a perfect transition to the next
+section.
+
+## Website redesign, again
+
+Did you notice this website has been revamped recently? The changes are
+actually deeper than “just” a redesign, to a point where I had to port *all* my
+write-ups to a different markup language[^transition].
+
+Why, you ask? Well, it’s actually pretty simple: as time goes by, I’ve grown
+lazier.
+
+[^transition]: Are you starting to understand why “Org mode” was the perfect
+    transition to move on to this section?
+
+Let me give you some context. Until very recently, my website was built around
+the idea of having literate programming as a first-class citizen of my
+authoring tools. For instance, you can have a look at [what used to be the
+literate program responsible for generating the
+website](/posts/CleopatraV1.html). Similarly, most of [my write-ups about
+Coq](/tags/coq.html) were actually Coq documents. Literate programming is
+actually a very nice paradigm for authoring technical content, because it
+gives you the tools to keep said content accurate and up-to-date.
In a +nutshell, you cannot have a typo in one of your code snippets which would +prevent it from compiling, because you actually +compile the snippet and catch the typo when you try to generate your website. +Or at least, it is what I used to do. + +I decided to stop because, for all its benefits, this approach has one major +drawback: it is hard to maintain. I had invested quite some time and efforts to +keep my website sources under control, but it really was an everyday fight. +There are some strange things which start happening when you fully commit to +this, as I think I did. For instance, software dependencies tie your article +together. Suddenly, you cannot talk about this new fancy feature of the latest +Coq release without upgrading *all* your write-ups implemented as Coq +documents[^actually]. + +[^actually]: Well, in theory you can. Just have each Coq document specifies the + Coq version it requires, and support this level of customization in your + build toolchain. But then, your blog takes forever to build from a cold + repository. + +That being said, most of the work had already been done. This website *was* a +collection of literate programs, and I was pretty proud of the state of things. +I could deal with the annoyances[^coqdoc]. But then, as I explained in the +previous section, I decided to move away from Emacs. The first time I tried to +start a new write-up, it hit me. + +[^coqdoc]: Like using Coqdoc syntax to write my articles, for instance. I could + write about how the Coqdoc syntax irks me for ages. + +I used to write most of my contents using Org mode. Org mode, also known as +*the* Emacs markup language. + +I know of at least [one “Org plugin” for +Neovim](https://github.com/nvim-orgmode/orgmode), but instead of giving it a +try, I decided to use this opportunity to tackle my “maintenance problem” once +and for all. 
*I gave up on literate programming for this website.* As a result,
+this website is now generated from Markdown files only (using
+[markdown-it](https://github.com/markdown-it/markdown-it) with many plugins).
+As a consequence, the generated HTML is way more “predictable.” This was enough
+to motivate me to give [Soupault’s
+indexes](https://soupault.app/reference-manual/#metadata-extraction-and-rendering)
+a try; they are way more powerful than I anticipated. Now, this website has:
+
+- Tags. Each write-up can be labeled with as many tags as I want; there is [a
+  page which lists all the tags used in the website](/tags), and each tag has
+  its own page (for instance, the [`coq` tag](/tags/coq.html)).
+- An [RSS feed](/posts/index.xml). It was actually one of the main features I
+  really wanted to get with this revamp.
+- Automatically generated lists of articles on the [home page](/), one for each
+  series (see the [Ltac series](/series/Ltac.html) for instance). Before, I
+  was publishing “curated indexes,” or put in other words: I was writing these
+  indexes myself, by hand. And again, I’ve grown lazier.
+
+It took me a week to go through this rework. Manually translating every
+write-up was tedious, to say the least, as was implementing the Lua plugins
+for Soupault, since I have neither proficiency nor tooling to help me write
+Lua code. But I am very glad about the final result.
+
+Also, I have invested in an Antidote license, so hopefully, this website will
+have fewer typos and less English butchering from now on. Clean text,
+delivered with a nice and simple design, from a sane and maintainable [Git
+repository](https://src.soap.coffee/soap.coffee/lthms.git/).
diff --git a/site/posts/MixingLtacAndGallina.md b/site/posts/MixingLtacAndGallina.md
new file mode 100644
index 0000000..288f38f
--- /dev/null
+++ b/site/posts/MixingLtacAndGallina.md
@@ -0,0 +1,187 @@
+---
+published: 2020-07-26
+modified: 2020-08-28
+series:
+  parent: series/Ltac.html
+  prev: posts/LtacPatternMatching.html
+tags: ['coq']
+abstract: |
+  One of the most misleading introductions to Coq is to say that “Gallina is
+  for programs, while tactics are for proofs.” Gallina is the preferred way
+  to construct programs, and tactics are the preferred way to construct
+  proofs. The key word here is “preferred.” Coq actually allows for *mixing*
+  Ltac and Gallina together.
+---
+
+# Mixing Ltac and Gallina for Fun and Profit
+
+One of the most misleading introductions to Coq is to say that “Gallina is
+for programs, while tactics are for proofs.” Indeed, in Coq we construct
+terms of given types, always. Terms encode both programs and proofs about
+these programs. Gallina is the preferred way to construct programs, and
+tactics are the preferred way to construct proofs.
+
+The key word here is “preferred.” We do not always need to use tactics to
+construct a proof term. Conversely, there are some occasions where
+constructing a program with tactics becomes handy. Furthermore, Coq actually
+allows for *mixing together* Ltac and Gallina.
+
+In the [previous article of this series](/posts/LtacPatternMatching.html), we
+discussed how Ltac provides two very interesting features:
+
+- With `match goal with`{.coq} it can inspect its context
+- With `match type of _ with`{.coq} it can pattern match on types
+
+It turns out these features are more than handy when it comes to
+metaprogramming (that is, the generation of programs by programs).
+
+## A Tale of Two Worlds, and Some Bridges
+
+Constructing proof terms directly in Gallina often happens when one is
+writing dependently typed definitions.
For instance, we can write a type-safe +`from_option`{.coq} function (inspired by [this very nice +write-up](https://plv.csail.mit.edu/blog/unwrapping-options.html)) such that +the option to unwrap shall be accompanied by a proof that said option contains +something. This extra argument is used in the `None`{.coq} case to derive a +proof of `False`{.coq}, from which we can derive +anything. + +```coq +Definition is_some {α} (x : option α) : bool := + match x with Some _ => true | None => false end. + +Lemma is_some_None {α} (x : option α) + : x = None -> is_some x <> true. +Proof. intros H. rewrite H. discriminate. Qed. + +Definition from_option {α} + (x : option α) (some : is_some x = true) + : α := + match x as y return x = y -> α with + | Some x => fun _ => x + | None => fun equ => False_rect α (is_some_None x equ some) + end eq_refl. +``` + +In `from_option`{.coq}, we construct two proofs without using tactics: + +- `False_rect α (is_some_None x equ some)`{.coq} to exclude the absurd case +- `eq_refl`{.coq} in conjunction with a dependent pattern matching (if you are + not familiar with this trick: the main idea is to allow Coq to + “remember” that `x = None`{.coq} in the second branch) + +We can use another approach. We can decide to implement `from_option`{.coq} +with a proof script. + +```coq +Definition from_option' {α} + (x : option α) (some : is_some x = true) + : α. +Proof. + case_eq x. + + intros y _. + exact y. + + intros equ. + rewrite equ in some. + now apply is_some_None in some. +Defined. +``` + +There is a third approach we can consider: mixing Gallina terms and tactics. +This is possible thanks to the `ltac:()`{.coq} feature. + +```coq +Definition from_option'' {α} + (x : option α) (some : is_some x = true) + : α := + match x as y return x = y -> α with + | Some x => fun _ => x + | None => fun equ => ltac:(rewrite equ in some; + now apply is_some_None in some) + end eq_refl. +``` + +When Coq encounters `ltac:()`{.coq}, it treats it as a hole. 
It sets up a
+corresponding goal, and tries to solve it with the supplied tactic.
+
+Conversely, there exist ways to construct terms in Gallina when writing a proof
+script. Certain tactics take such terms as arguments. Besides, Ltac provides
+`constr:()`{.coq} and `uconstr:()`{.coq}, which work similarly to
+`ltac:()`{.coq}. The difference between `constr:()`{.coq} and
+`uconstr:()`{.coq} is that Coq will try to assign a type to the argument of
+`constr:()`{.coq}, but will leave the argument of `uconstr:()`{.coq} untyped.
+
+For instance, consider the following tactic definition.
+
+```coq
+Tactic Notation "wrap_id" uconstr(x) :=
+  let f := uconstr:(fun x => x) in
+  exact (f x).
+```
+
+Both `x`{.coq}, the argument of `wrap_id`{.coq}, and `f`{.coq}, the anonymous
+identity function, are untyped. It is only when they are composed together as
+an argument of `exact`{.coq} (which expects a typed argument, more precisely
+of the type of the goal) that Coq actually tries to type check it.
+
+As a consequence, `wrap_id`{.coq} generates a specialized identity function for
+each specific context.
+
+```coq
+Definition zero : nat := ltac:(wrap_id 0).
+```
+
+The generated anonymous identity function is `fun x : nat => x`{.coq}.
+
+```coq
+Definition empty_list α : list α := ltac:(wrap_id nil).
+```
+
+The generated anonymous identity function is `fun x : list α => x`{.coq}.
+Besides, we do not need to give more type information about `nil`{.coq}. If
+`wrap_id`{.coq} were expecting a typed term, we would have to replace
+`nil`{.coq} by `(@nil α)`{.coq}.
+
+## Beware the Automation Elephant in the Room
+
+Proofs and computational programs are encoded in Coq as terms, but there is a
+fundamental difference between them, and it is highlighted by one of the axioms
+provided by the Coq standard library: proof irrelevance.
+
+Proof irrelevance states that two proofs of the same theorem (i.e., two proof
+terms which share the same type) are essentially equivalent, and can be
+substituted without threatening the trustworthiness of the system. From a
+formal methods point of view, it makes sense. Even if we value “beautiful
+proofs,” we mostly want correct proofs.
+
+The same reasoning does _not_ apply to computational programs. Two functions of
+type `nat -> nat -> nat`{.coq} are unlikely to be equivalent. For instance,
+`add`{.coq}, `mul`{.coq} or `sub`{.coq} share the same type, but compute
+totally different results.
+
+Using tactics such as `auto`{.coq} to generate terms which do not live inside
+`Prop`{.coq} is risky, to say the least. For instance,
+
+```coq
+Definition add (x y : nat) : nat := ltac:(auto).
+```
+
+This works, but it is certainly not what you would expect:
+
+```
+add = fun _ y : nat => y
+     : nat -> nat -> nat
+```
+
+That being said, if we keep that in mind, and assert the correctness of the
+generated programs (either by providing a proof, or by extensively testing
+them), there is no particular reason not to use Ltac to generate terms when it
+makes sense.
+
+Dependently typed programming can help you here. If we decorate the return
+type of a function with the expected properties of the result with respect to
+the function’s arguments, we can ensure the function is correct, and
+conversely prevent tactics such as `auto`{.coq} from generating “incorrect”
+terms. Interested readers may refer to [the dedicated series on this very
+website](/posts/StronglySpecifiedFunctions.html).
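+
+To illustrate the idea, here is a minimal example of such a decorated
+signature (`add_spec`{.coq} is just an illustrative name): the return type
+carries the property we expect from the result, so a bogus term like the
+`fun _ y : nat => y`{.coq} above simply cannot inhabit it.
+
+```coq
+(* A strongly specified addition: the expected property of the result is
+   part of the return type. *)
+Definition add_spec (x y : nat) : { z : nat | z = x + y }.
+Proof.
+  exists (x + y). (* provide the witness *)
+  reflexivity.    (* and prove it satisfies the specification *)
+Defined.
+```
+
+With such a type, an automation tactic can only succeed if the term it
+produces actually satisfies the specification.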
+ diff --git a/site/posts/MixingLtacAndGallina.v b/site/posts/MixingLtacAndGallina.v deleted file mode 100644 index d77cd8a..0000000 --- a/site/posts/MixingLtacAndGallina.v +++ /dev/null @@ -1,165 +0,0 @@ -(** #<nav><p class="series">Ltac.html</p> - <p class="series-prev">./LtacPatternMatching.html</p></nav># *) - -(** * Mixing Ltac and Gallina for Fun and Profit *) - -(** One of the most misleading introduction to Coq is to say that “Gallina is - for programs, while tactics are for proofs.” Indeed, in Coq we construct - terms of given types, always. Terms encodes both programs and proofs about - these programs. Gallina is the preferred way to construct programs, and - tactics are the preferred way to construct proofs. - - The key word here is “preferred.” We do not always need to use tactics to - construct a proof term. Conversly, there are some occasions where - constructing a program with tactics become handy. Furthermore, Coq actually - allows for _mixing together_ Ltac and Gallina. - - In the #<a href="LtacPatternMatching.html">#previous article of this - series#</a>#, we discuss how Ltac provides two very interesting features: - - - With [match goal with] it can inspect its context - - With [match type of _ with] it can pattern matches on types - - It turns out these features are more than handy when it comes to - metaprogramming (that is, the generation of programs by programs). *) - -(** #<nav id="generate-toc"></nav># - - #<div id="history">site/posts/MixingLtacAndGallina.v</div># *) - -(** ** A Tale of Two Worlds, and Some Bridges *) - -(** Constructing terms proofs directly in Gallina often happens when one is - writing dependently-typed definition. For instance, we can write a type safe - [from_option] function (inspired by #<a - href="https://plv.csail.mit.edu/blog/unwrapping-options.html">#this very - nice write-up#</a>#) such that the option to unwrap shall be accompagnied by - a proof that said option contains something. 
This extra argument is used in - the [None] case to derive a proof of [False], from which we can derive - anything. *) - -Definition is_some {α} (x : option α) : bool := - match x with Some _ => true | None => false end. - -Lemma is_some_None {α} (x : option α) - : x = None -> is_some x <> true. -Proof. intros H. rewrite H. discriminate. Qed. - -Definition from_option {α} - (x : option α) (some : is_some x = true) - : α := - match x as y return x = y -> α with - | Some x => fun _ => x - | None => fun equ => False_rect α (is_some_None x equ some) - end eq_refl. - -(** In [from_option], we construct two proofs without using tactics: - - - [False_rect α (is_some_None x equ some)] to exclude the absurd case - - [eq_refl] in conjunction with a dependent pattern matching (if you are - not familiar with this trick: the main idea is to allow Coq to - “remember” that [x = None] in the second branch) - - We can use another approach. We can decide to implement [from_option] - with a proof script. *) - -Definition from_option' {α} - (x : option α) (some : is_some x = true) - : α. - -Proof. - case_eq x. - + intros y _. - exact y. - + intros equ. - rewrite equ in some. - now apply is_some_None in some. -Defined. - -(** There is a third approach we can consider: mixing Gallina terms, and - tactics. This is possible thanks to the [ltac:()] feature. *) - -Definition from_option'' {α} - (x : option α) (some : is_some x = true) - : α := - match x as y return x = y -> α with - | Some x => fun _ => x - | None => fun equ => ltac:(rewrite equ in some; - now apply is_some_None in some) - end eq_refl. - -(** When Coq encounters [ltac:()], it treats it as a hole. It sets up a - corresponding goal, and tries to solve it with the supplied tactic. - - Conversly, there exists ways to construct terms in Gallina when writing a - proof script. Certains tactics takes such terms as arguments. Besides, Ltac - provides [constr:()] and [uconstr:()] which work similarly to [ltac:()]. 
- The difference between [constr:()] and [uconstr:()] is that Coq will try to - assign a type to the argument of [constr:()], but will leave the argument of - [uconstr:()] untyped. - - For instance, consider the following tactic definition. *) - -Tactic Notation "wrap_id" uconstr(x) := - let f := uconstr:(fun x => x) in - exact (f x). - -(** Both [x] the argument of [wrap_id] and [f] the anonymous identity function - are not typed. It is only when they are composed together as an argument of - [exact] (which expects a typed argument, more precisely of the type of the - goal) that Coq actually tries to typecheck it. - - As a consequence, [wrap_id] generates a specialized identity function for - each specific context. *) - -Definition zero : nat := ltac:(wrap_id 0). - -(** The generated anonymous identity function is [fun x : nat => x]. *) - -Definition empty_list α : list α := ltac:(wrap_id nil). - -(** The generated anonymous identity function is [fun x : list α => x]. Besides, - we do not need to give more type information about [nil]. If [wrap_id] were - to be expecting a typed term, we would have to replace [nil] by [(@nil - α)]. *) - -(** ** Beware the Automation Elephant in the Room *) - -(** Proofs and computational programs are encoded in Coq as terms, but there is - a fundamental difference between them, and it is highlighted by one of the - axiom provided by the Coq standard library: proof irrelevance. - - Proof irrelevance states that two proofs of the same theorem (i.e., two - proof terms which share the same type) are essentially equivalent, and can - be substituted without threatening the trustworthiness of the system. From a - formal methods point of view, it makes sense. Even if we value “beautiful - proofs,” we mostly want correct proofs. - - The same reasoning does _not_ apply to computational programs. Two functions - of type [nat -> nat -> nat] are unlikely to be equivalent. 
For instance,
- [add], [mul] or [sub] share the same type, but computes totally different
- results.
-
- Using tactics such as [auto] to generate terms which do not live inside
- [Prop] is risky, to say the least. For instance, *)
-
-Definition add (x y : nat) : nat := ltac:(auto).
-
-(** This works, but it is certainly not what you would expect:
-
-<<
-add = fun _ y : nat => y
- : nat -> nat -> nat
->>
-
- That being said, if we keep that in mind, and assert the correctness of the
- generated programs (either by providing a proof, or by extensively testing
- it), there is no particular reason not to use Ltac to generate terms when it
- makes sens.
-
- Dependently-typed programming can help here. If we decorate the return type
- of a function with the expected properties of the result wrt. the function’s
- arguments, we can ensure the function is correct, and conversly prevent
- tactics such as [auto] to generate “incorrect” terms. Interested readers may
- refer to #<a href="/posts/StronglySpecifiedFunctions.html">#the dedicated
- series on this very website#</a>. *)
diff --git a/site/posts/MonadTransformers.md b/site/posts/MonadTransformers.md
new file mode 100644
index 0000000..9b9f2a8
--- /dev/null
+++ b/site/posts/MonadTransformers.md
@@ -0,0 +1,83 @@
+---
+published: 2017-07-15
+tags: ['haskell', 'opinions']
+abstract: |
+  Monads are hard to get right, monad transformers are harder. Yet, they
+  remain a very powerful abstraction.
+---
+
+# Monad Transformers are a Great Abstraction
+
+Monads are hard to get right. I think it took me around a year of Haskelling
+to feel like I understood them. The reason is, in my opinion, that there is
+no such thing as *the* Monad. It is even the contrary. When someone asks me
+how I would define Monads in only a few words, I say monads are a convenient
+formalism to chain specific computations.
Once I got that, I started noticing “monadic
+constructions” everywhere, from the Rust `?`{.rust} operator to the [Elixir
+`with`{.elixir} keyword](https://blog.drewolson.org/elixirs-secret-weapon/).
+
+Haskell often uses another concept above Monads: Monad Transformers. This
+allows you to work not with just *one* Monad, but rather with a stack of
+them. Each Monad brings its own properties, and you can mix them into your
+very own one. This is something you can’t have in Rust or Elixir, but it
+works great in Haskell. Unfortunately, it is not an easy concept and it can
+be hard to understand. This article is not an attempt to explain it, but
+rather a postmortem review of one situation where I found them extremely
+useful. If you think you have understood how they work, but don’t see the
+point yet, you might find here the beginning of an answer.
+
+Recently, I ran into a very good example of why Monad Transformers are worth
+it[^doubts]. I have been working on a project called ogma for a couple of
+years now. In a nutshell, I want to build “a tool” to visualize a story in
+time and space. We are not there just yet, but, in the meantime, I have
+written a program called `celtchar` to build a novel from a list of files.
+One of its newest features is the choice of language, and by extension, of
+the typographic rules. This information is read from a configuration file
+very early in the program flow. Unfortunately, it is used much later, after
+several function calls.
+
+[^doubts]: Time has passed since the publication of this article. Whether or
+    not I remain in sync with its conclusions is an open question. Monad
+    Transformers are a great abstraction, but nowadays I would probably try to
+    choose another approach.
+
+In Haskell, you deal with that kind of challenge by relying on the Reader
+Monad. It carries an environment in a transparent way. The only thing is, I
+was already using the State Monad to carry the computation result. But that’s
+not an issue with Monad Transformers.
+
+```diff
+-type Builder = StateT Text IO
++type Builder = StateT Text (ReaderT Language IO)
+```
+
+As you may have already understood, I wasn’t using the “raw” `State`{.haskell}
+Monad, but rather the transformer version `StateT`{.haskell}. The underlying
+Monad was `IO`{.haskell}, because I needed to be able to read some files from
+the file system. By replacing `IO`{.haskell} with `ReaderT Language
+IO`{.haskell}, I basically fixed my “carry the variable to the correct
+function call easily” problem.
+
+Retrieving the chosen language is as simple as:
+
+```haskell
+getLanguage :: Builder Language
+getLanguage = lift ask
+```
+
+And that was basically it.
+
+Now, my point is not that Monad Transformers are the ultimate beast we will
+have to tame once and then everything will be shiny and easy[^funny]. There
+are a lot of other ways to achieve what I did with my `Builder`{.haskell}
+stack. For instance, in an OO language, I would probably have added a new
+class member to my `Builder`{.haskell} class and done basically the same
+thing.
+
+[^funny]: It is amusing to see Past Me being careful here.
+
+However, there is something I really like about this approach: the
+`Builder`{.haskell} type definition already gives you a lot of useful
+information. Both the `State`{.haskell} and `Reader`{.haskell} Monads have
+well-established semantics most Haskellers will understand at a glance. A bit
+of documentation won’t hurt, but I suspect it is not as necessary as one could
+expect. Moreover, the presence of the `IO`{.haskell} Monad tells everyone that
+using the `Builder`{.haskell} Monad might cause I/O.
diff --git a/site/posts/NeovimOCamlTreeSitterAndLSP.md b/site/posts/NeovimOCamlTreeSitterAndLSP.md
new file mode 100644
index 0000000..c61ece7
--- /dev/null
+++ b/site/posts/NeovimOCamlTreeSitterAndLSP.md
@@ -0,0 +1,56 @@
+---
+published: 2023-05-01
+modified: 2023-05-02
+tags: ['ocaml', 'neovim']
+abstract: |
+  Can we all agree that witnessing syntax highlighting being absolutely off
+  is probably the most annoying thing that can happen to anybody?
+---
+
+# Neovim, OCaml Interfaces, Tree-Sitter and LSP
+
+Can we all agree that witnessing syntax highlighting being absolutely off is
+probably the most annoying thing that can happen to anybody?
+
+I mean, just look at this horror.
+
+#[Syntax highlighting being absolutely wrong.](/img/wrong-highlighting.png)
+
+What you are looking at is the result of trying to enable `tree-sitter` for
+OCaml hacking and calling it a day. In a nutshell, my OCaml `mli` files were
+quickly turned into a random mess of nonsensical colors, and I didn’t know
+why. I tried to blame
+[`tree-sitter-ocaml`](https://github.com/tree-sitter/tree-sitter-ocaml/issues/72),
+but, of course, I was wrong.
+
+The issue is subtle, and to be honest, I don’t know if I totally grasp it. But
+from my rough understanding, it breaks down as follows.
+
+- `tree-sitter-ocaml` defines two grammars: `ocaml` for the `ml` files, and
+  `ocaml_interface` (but `ocamlinterface` also works) for the `mli` files.
+- By default, Neovim uses the filetype `ocaml` for `mli` files, so the
+  incorrect parser is being used for syntax highlighting. This explains the
+  root issue.
+- Bonus: `ocamllsp` does not recognize the `ocamlinterface` filetype by
+  default (but somehow uses the `ocaml.interface` id for `mli`
+  files…[^contrib]).
+
+[^contrib]: There is probably something to be done here.
+
+So, in order to have both `tree-sitter` and `ocamllsp` working at the same
+time, I had to tweak my configuration a little bit.
+ +``` lua +lspconfig.ocamllsp.setup({ + filetypes = vim.list_extend( + require('lspconfig.server_configurations.ocamllsp') + .default_config + .filetypes, + { 'ocamlinterface' } + ), +}) + +vim.cmd([[au! BufNewFile,BufRead *.mli setfiletype ocamlinterface]]) +``` + +And now, I am blessed with a consistent syntax highlighting for my `mli` files. + +#[Syntax highlighting being absolutely right.](/img/good-highlighting.png) diff --git a/site/posts/November2022.md b/site/posts/November2022.md new file mode 100644 index 0000000..b136276 --- /dev/null +++ b/site/posts/November2022.md @@ -0,0 +1,47 @@ +--- +published: 2022-11-19 +modified: 2023-05-09 +tags: ['spatial-shell', 'nanowrimo', 'coqffi'] +series: + parent: series/Retrospectives.html + prev: posts/September2022.html + next: posts/May2023.html +abstract: | + Spatial Sway has basically reached the MVP stage, I failed to fully commit + to this year’s NaNoWriMo, and someone has worked on adding some support for + `coqffi` to `dune`. +--- + +# What happened in October and November 2022? + +It is November 19 today, and I’m one month and 4 days late for the October +Retrospective! Truth is, `$WORK`{.bash} has been intense lately, to a point +where I have not made much progress on my side projects. Anyway. + +I have implemented the last feature I was really missing in my daily +use of Spatial Sway: moving windows to adjacent workspaces. As a +result, I think I can say that Spatial Sway has really reached the +“Minimum Viable Product” stage, with a convenient UX, and a nice +enough UI. It is still lacking when it comes to configurability, +though. It is the next item of my TODO list, but I have no idea when I +will implement the support for a configuration file. + +Another highlight of the past two months was the +[NaNoWriMo](https://nanowrimo.org). I took the last week of October and the +first week of November off to plan and start writing a fiction project for it. 
+Writing again was really nice, and I even gave writing fiction in English a +shot. That made me uncover a bug in the English support of +[ogam](https://crates.io/crates/ogam), my markup language for fiction writers, +which led me to publish a fix on Crates.io. However, as soon as I came back to +`$WORK`{.bash}, my writing spree ended. That’s okay, though. It gave me plenty +of ideas for future sessions. Thanks, NaNoWriMo! Sorry to quit so soon, and see +you next year, maybe. + +Finally, a nice surprise of the past month is that [someone has started working +on adding proper support for `coqffi` to +`dune`](https://github.com/ocaml/dune/pull/6489), the build system for OCaml +and Coq! I’m thrilled by this. Thanks, +[**@Alizter**](https://github.com/Alizter)! + +This wraps up this retrospective. I hope I will have more interesting, +concrete news to share next month. diff --git a/site/posts/RankNTypesInOCaml.md b/site/posts/RankNTypesInOCaml.md new file mode 100644 index 0000000..d2021ce --- /dev/null +++ b/site/posts/RankNTypesInOCaml.md @@ -0,0 +1,57 @@ +--- +published: 2022-08-07 +modified: 2022-08-12 +tags: ['ocaml'] +abstract: | + In OCaml, it is not possible to write a function whose argument is a + polymorphic function. Trying to write such a function results in the + type-checker complaining back at you. The trick to be able to write such a + function is to use records. +--- + +# Writing a Function Whose Argument is a Polymorphic Function in OCaml + +In OCaml, it is not possible to write a function whose argument is a +polymorphic function. Trying to write such a function results in the +type-checker complaining back at you. 
+
+```ocaml
+let foo (type a b) id (x : a) (y : b) = (id x, id y)
+```
+
+```
+Line 1, characters 50-51:
+1 | let foo (type a b) id (x : a) (y : b) = (id x, id y);;
+                                                      ^
+Error: This expression has type b but an expression was expected
+of type a
+```
+
+When OCaml tries to type check `foo`{.ocaml}, it infers that `id`{.ocaml}
+expects an argument of type `a`{.ocaml} because of `id x`{.ocaml}, then fails
+when trying to type check `id y`{.ocaml}.
+
+The trick to be able to write `foo`{.ocaml} is to use records. Indeed, while
+the argument of a function cannot be polymorphic, the field of a record can.
+This effectively makes it possible to write `foo`{.ocaml}, at the cost of a
+level of indirection.
+
+```ocaml
+type id = {id : 'a. 'a -> 'a}
+
+let foo {id} x y = (id x, id y)
+```
+
+From a runtime perspective, it is possible to tell OCaml to remove the
+introduced indirection with the `[@@unboxed]`{.ocaml} annotation. There is
+nothing we can do about the indirection in the source, though. We need to
+destruct `id`{.ocaml} in `foo`{.ocaml}, and we need to construct it at its
+call site.
+
+```ocaml
+foo {id = fun x -> x} 0 true
+```
+
+As a consequence, this solution is not a silver bullet, but it is an option
+worth considering if, *e.g.*, it allows us to export a cleaner API to the
+consumers of a module. Personally, I have been considering this trick recently
+to remove the need for a library to be implemented as a functor.
diff --git a/site/posts/RewritingInCoq.md b/site/posts/RewritingInCoq.md
new file mode 100644
index 0000000..a0e65fb
--- /dev/null
+++ b/site/posts/RewritingInCoq.md
@@ -0,0 +1,384 @@
+---
+published: 2017-05-13
+tags: ['coq']
+abstract: |
+  The `rewrite`{.coq} tactics are really useful, since they are not limited
+  to the Coq built-in equality relation.
+---
+
+# Rewriting in Coq
+
+I have to confess something. In the codebase of SpecCert lies a shameful
+secret, which takes the form of a set of unnecessary axioms.
+
+I thought I couldn’t avoid them at first, but it was before I heard about
+“generalized rewriting,” setoids and morphisms. Now, I know the truth, and I
+will have to update SpecCert eventually. But, in the meantime, let me try to
+explain how it is possible to rewrite a term in a proof using an ad hoc
+equivalence relation and, when necessary, a proper morphism.
+
+## Case Study: Gate System
+
+Now, why would anyone want such a thing as “generalized rewriting” when the
+`rewrite`{.coq} tactic works just fine?
+
+The thing is: it does not in some cases. To illustrate my statement, we will
+consider the following definition of a gate in Coq:
+
+```coq
+Record Gate :=
+  { open: bool
+  ; lock: bool
+  ; lock_is_close: lock = true -> open = false
+  }.
+```
+
+According to this definition, a gate can be either open or closed. It can also
+be locked, but if it is, it cannot be open at the same time. To express this
+constraint, we embed the appropriate proposition inside the Record. By doing
+so, we *know* for sure that we will never meet an ill-formed `Gate`{.coq}
+instance. The Coq engine will prevent it, because to construct a gate, one
+will have to prove the `lock_is_close`{.coq} predicate holds.
+
+The `program`{.coq} attribute makes it easy to work with embedded proofs. For
+instance, defining the “open gate” is as easy as:
+
+```coq
+Require Import Coq.Program.Tactics.
+
+#[program]
+Definition open_gate :=
+  {| open := true
+   ; lock := false
+   |}.
+```
+
+Under the hood, `program`{.coq} proves what needs to be proven, that is the
+`lock_is_close`{.coq} proposition. Just have a look at its output:
+
+```
+open_gate has type-checked, generating 1 obligation(s)
+Solving obligations automatically...
+open_gate_obligation_1 is defined
+No more obligations remaining
+open_gate is defined
+```
+
+In this case, using `Program`{.coq} is a bit like cracking a nut with a
+sledgehammer. We can easily do it ourselves using the `refine`{.coq} tactic.
+
+```coq
+Definition open_gate': Gate.
+  refine ({| open := true
+           ; lock := false
+          |}).
+  intro Hfalse.
+  discriminate Hfalse.
+Defined.
+```
+
+## `Gate`{.coq} Equality
+
+What does it mean for two gates to be equal? Intuitively, we know they have to
+share the same states (`open`{.coq} and `lock`{.coq} in our case).
+
+### Leibniz Equality Is Too Strong
+
+When you write something like `a = b`{.coq} in Coq, the `=`{.coq} refers to
+`eq`{.coq}, a relation which captures what is called the Leibniz Equality:
+`x`{.coq} and `y`{.coq} are equal iff every property on `A`{.coq} which is
+true of `x`{.coq} is also true of `y`{.coq}.
+
+As for myself, when I first started to write some Coq code, the
+Leibniz Equality was not really something I cared about and I tried to
+prove something like this:
+
+```coq
+Lemma open_gates_are_equal (g g': Gate)
+    (equ : open g = true) (equ' : open g' = true)
+  : g = g'.
+```
+
+Basically, it means that if two doors are open, then they are equal. That made
+sense to me, because by definition of `Gate`{.coq}, a locked door is closed,
+meaning an open door cannot be locked.
+
+Here is an attempt to prove the `open_gates_are_equal`{.coq} lemma.
+
+```coq
+Proof.
+  assert (forall g, open g = true -> lock g = false). {
+    intros [o l h] equo.
+    cbn in *.
+    case_eq l; auto.
+    intros equl.
+    now rewrite (h equl) in equo.
+  }
+  assert (lock g = false) by apply (H _ equ).
+  assert (lock g' = false) by apply (H _ equ').
+  destruct g; destruct g'; cbn in *; subst.
+```
+
+The term to prove is now:
+
+```
+{| open := true; lock := false; lock_is_close := lock_is_close0 |} =
+{| open := true; lock := false; lock_is_close := lock_is_close1 |}
+```
+
+The next tactic I wanted to use was `reflexivity`{.coq}, because I'd basically
+proven `open g = open g'`{.coq} and `lock g = lock g'`{.coq}, which met my
+definition of equality at that time.
+
+Except Coq wouldn’t agree.
See how it reacts:
+
+```
+Unable to unify "{| open := true; lock := false; lock_is_close := lock_is_close1 |}"
+    with "{| open := true; lock := false; lock_is_close := lock_is_close0 |}".
+```
+
+It cannot unify the two records. More precisely, it cannot unify
+`lock_is_close1`{.coq} and `lock_is_close0`{.coq}. So we abort and try
+something else.
+
+```coq
+Abort.
+```
+
+### Ad Hoc Equivalence Relation
+
+This is a familiar pattern. Coq cannot guess what we have in mind. Giving a
+formal definition of “our equality” is fortunately straightforward.
+
+```coq
+Definition gate_eq
+  (g g': Gate)
+  : Prop :=
+  open g = open g' /\ lock g = lock g'.
+```
+
+Because “equality” means something very specific in Coq, we won't say “two
+gates are equal” anymore, but “two gates are equivalent.” That is,
+`gate_eq`{.coq} is an equivalence relation. But “equivalence relation” is also
+something very specific. For instance, such a relation needs to be symmetric
+(`R x y -> R y x`{.coq}), reflexive (`R x x`{.coq}) and transitive (`R x y ->
+R y z -> R x z`{.coq}).
+
+```coq
+Require Import Coq.Classes.Equivalence.
+
+#[program]
+Instance Gate_Equivalence
+  : Equivalence gate_eq.
+
+Next Obligation.
+  split; reflexivity.
+Defined.
+
+Next Obligation.
+  intros g g' [Hop Hlo].
+  symmetry in Hop; symmetry in Hlo.
+  split; assumption.
+Defined.
+
+Next Obligation.
+  intros g g' g'' [Hop Hlo] [Hop' Hlo'].
+  split.
+  + transitivity (open g'); [exact Hop|exact Hop'].
+  + transitivity (lock g'); [exact Hlo|exact Hlo'].
+Defined.
+```
+
+Afterwards, the `symmetry`{.coq}, `reflexivity`{.coq} and `transitivity`{.coq}
+tactics also work with `gate_eq`{.coq}, in addition to `eq`{.coq}. We can try
+again to prove the `open_gates_are_the_same`{.coq} lemma and it will
+work[^lemma].

+[^lemma]: I know I should have proven an intermediate lemma to avoid code
+    duplication. Sorry about that, it hurts my eyes too.
+ +```coq +Lemma open_gates_are_the_same: + forall (g g': Gate), + open g = true + -> open g' = true + -> gate_eq g g'. +Proof. + induction g; induction g'. + cbn. + intros H0 H2. + assert (lock0 = false). + + case_eq lock0; [ intro H; apply lock_is_close0 in H; + rewrite H0 in H; + discriminate H + | reflexivity + ]. + + assert (lock1 = false). + * case_eq lock1; [ intro H'; apply lock_is_close1 in H'; + rewrite H2 in H'; + discriminate H' + | reflexivity + ]. + * subst. + split; reflexivity. +Qed. +``` + +## Equivalence Relations and Rewriting + +So here we are, with our ad hoc definition of gate equivalence. We can use +`symmetry`{.coq}, `reflexivity`{.coq} and `transitivity`{.coq} along with +`gate_eq`{.coq} and it works fine because we have told Coq `gate_eq`{.coq} is +indeed an equivalence relation for `Gate`{.coq}. + +Can we do better? Can we actually use `rewrite`{.coq} to replace an occurrence +of `g`{.coq} by an occurrence of `g’`{.coq} as long as we can prove that +`gate_eq g g’`{.coq}? The answer is “yes”, but it will not come for free. + +Before moving forward, just consider the following function: + +```coq +Require Import Coq.Bool.Bool. + +Program Definition try_open + (g: Gate) + : Gate := + if eqb (lock g) false + then {| lock := false + ; open := true + |} + else g. +``` + +It takes a gate as an argument and returns a new gate. If the former is not +locked, the latter is open. Otherwise the argument is returned as is. + +```coq +Lemma gate_eq_try_open_eq + : forall (g g': Gate), + gate_eq g g' + -> gate_eq (try_open g) (try_open g'). +Proof. + intros g g' Heq. +Abort. +``` + +What we could have wanted to do is: `rewrite Heq`{.coq}. Indeed, `g`{.coq} and `g’`{.coq} +“are the same” (`gate_eq g g’`{.coq}), so, _of course_, the results of `try_open g`{.coq} and +`try_open g’`{.coq} have to be the same. Except... 
+
+```
+Error: Tactic failure: setoid rewrite failed: Unable to satisfy the following constraints:
+UNDEFINED EVARS:
+ ?X49==[g g' Heq |- relation Gate] (internal placeholder) {?r}
+ ?X50==[g g' Heq (do_subrelation:=Morphisms.do_subrelation)
+        |- Morphisms.Proper (gate_eq ==> ?X49@{__:=g; __:=g'; __:=Heq}) try_open] (internal placeholder) {?p}
+ ?X52==[g g' Heq |- relation Gate] (internal placeholder) {?r0}
+ ?X53==[g g' Heq (do_subrelation:=Morphisms.do_subrelation)
+        |- Morphisms.Proper (?X49@{__:=g; __:=g'; __:=Heq} ==> ?X52@{__:=g; __:=g'; __:=Heq} ==> Basics.flip Basics.impl) eq]
+        (internal placeholder) {?p0}
+ ?X54==[g g' Heq |- Morphisms.ProperProxy ?X52@{__:=g; __:=g'; __:=Heq} (try_open g')] (internal placeholder) {?p1}
+```
+
+What Coq is trying to tell us here (in a very poor manner, I’d say) is actually
+pretty simple. It cannot replace `g`{.coq} by `g’`{.coq} because it does not
+know if two equivalent gates actually give the same result when passed as the
+argument of `try_open`{.coq}. This is actually what we want to prove, so we
+cannot use `rewrite`{.coq} just yet, because it needs this result so it can do
+its magic. Chicken and egg problem.
+
+In other words, we are making the same mistake as before: not telling Coq what
+it cannot guess by itself.
+
+The `rewrite`{.coq} tactic works out of the box with the Coq equality
+(`eq`{.coq}, or most likely `=`{.coq}) because of the Leibniz Equality:
+`x`{.coq} and `y`{.coq} are equal iff every property on `A`{.coq} which is true
+of `x`{.coq} is also true of `y`{.coq}.
+
+This is a pretty strong property, and one that a lot of equivalence relations
+do not have. Want an example? Consider the relation `R`{.coq} over `A`{.coq}
+such that forall `x`{.coq} and `y`{.coq}, `R x y`{.coq} holds true. Such a
+relation is reflexive, symmetric and transitive.
Yet, given a function
+`f : A -> B`{.coq} and an equivalence relation `R’`{.coq} over `B`{.coq},
+there is very little chance that `R x y -> R' (f x) (f y)`{.coq} holds. Only
+if we have this property do we know that we can rewrite `f x`{.coq} into
+`f y`{.coq}, *e.g.*, in `R' z (f x)`{.coq}. Indeed, by transitivity of
+`R’`{.coq}, we can deduce `R' z (f y)`{.coq} from `R' z (f x)`{.coq} and
+`R' (f x) (f y)`{.coq}.
+
+If `R x y -> R' (f x) (f y)`{.coq}, then `f`{.coq} is a morphism because it
+preserves an equivalence relation. In our previous case, `A`{.coq} is
+`Gate`{.coq}, `R`{.coq} is `gate_eq`{.coq}, `f`{.coq} is `try_open`{.coq} and
+therefore `B`{.coq} is `Gate`{.coq} and `R’`{.coq} is `gate_eq`{.coq}. To make
+Coq aware that `try_open`{.coq} is a morphism, we can use the following
+syntax:
+
+```coq
+#[local]
+Open Scope signature_scope.
+
+Require Import Coq.Classes.Morphisms.
+
+#[program]
+Instance try_open_Proper
+  : Proper (gate_eq ==> gate_eq) try_open.
+```
+
+This notation is actually more generic because you can deal with functions
+that take more than one argument. Hence, given `g : A -> B -> C -> D`{.coq},
+and `R`{.coq} (respectively `R’`{.coq}, `R’’`{.coq} and `R’’’`{.coq}) an
+equivalence relation over `A`{.coq} (respectively `B`{.coq}, `C`{.coq} and
+`D`{.coq}), we can prove `g`{.coq} is a morphism as follows:
+
+```coq
+Add Parametric Morphism: (g)
+    with signature (R) ==> (R') ==> (R'') ==> (R''')
+      as <name>.
+```
+
+Back to our `try_open`{.coq} morphism. Coq needs you to prove the following
+goal:
+
+```
+1 subgoal, subgoal 1 (ID 50)
+
+  ============================
+  forall x y : Gate, gate_eq x y -> gate_eq (try_open x) (try_open y)
+```
+
+Here is a way to prove that:
+
+```coq
+Next Obligation.
+  intros g g' Heq.
+  assert (gate_eq g g') as [Hop Hlo] by (exact Heq).
+  unfold try_open.
+  rewrite <- Hlo.
+  destruct (bool_dec (lock g) false) as [Hlock|Hnlock]; subst.
+  + rewrite Hlock.
+    split; cbn; reflexivity.
+ + apply not_false_is_true in Hnlock. + rewrite Hnlock. + cbn. + exact Heq. +Defined. +``` + +Now, back to our `gate_eq_try_open_eq`{.coq}, we now can use `rewrite`{.coq} +and `reflexivity`{.coq}. + +```coq +Require Import Coq.Setoids.Setoid. + +Lemma gate_eq_try_open_eq + : forall (g g': Gate), + gate_eq g g' + -> gate_eq (try_open g) (try_open g'). +Proof. + intros g g' Heq. + rewrite Heq. + reflexivity. +Qed. +``` + +We did it! We actually rewrite `g`{.coq} as `g’`{.coq}, even if we weren’t able +to prove `g = g’`{.coq}. diff --git a/site/posts/RewritingInCoq.v b/site/posts/RewritingInCoq.v deleted file mode 100644 index a285e09..0000000 --- a/site/posts/RewritingInCoq.v +++ /dev/null @@ -1,349 +0,0 @@ -(** #<nav><p class="series">../coq.html</p> - <p class="series-prev">./Ltac.html</p> - <p class="series-next">./ClightIntroduction.html</p></nav># *) - -(** * Rewriting in Coq *) - -(** I have to confess something. In the codebase of SpecCert lies a shameful - secret. It takes the form of a set of unnecessary axioms. I thought I - couldn’t avoid them at first, but it was before I heard about “generalized - rewriting,” setoids and morphisms. Now, I know the truth, and I will have - to update SpecCert eventually. But, in the meantime, let me try to explain - in this article originally published on #<span id="original-created-at">May - 13, 2017</span> how it is possible to rewrite a term in a proof using a - ad-hoc equivalence relation and, when necessary, a proper morphism. *) - -(** #<nav id="generate-toc"></nav># - - #<div id="history">site/posts/RewritingInCoq.v</div># *) - -(** ** Gate: Our Case Study *) - -(** Now, why would anyone want such a thing as “generalized rewriting” when the - [rewrite] tactic works just fine. - - The thing is: it does not in some cases. To illustrate my statement, we will - consider the following definition of a gate in Coq: *) - -Record Gate := - { open: bool - ; lock: bool - ; lock_is_close: lock = true -> open = false - }. 
- -(** According to this definition, a gate can be either open or closed. It can - also be locked, but if it is, it cannot be open at the same time. To express - this constrain, we embed the appropriate proposition inside the Record. By - doing so, we _know_ for sure that we will never meet an ill-formed Gate - instance. The Coq engine will prevent it, because to construct a gate, one - will have to prove the [lock_is_close] predicate holds. - - The [program] attribute makes it easy to work with embedded proofs. For - instance, defining the ”open gate” is as easy as: *) - -Require Import Coq.Program.Tactics. - -#[program] -Definition open_gate := - {| open := true - ; lock := false - |}. - -(** Under the hood, [program] proves what needs to be proven, that is the - [lock_is_close] proposition. Just have a look at its output: - -<< -open_gate has type-checked, generating 1 obligation(s) -Solving obligations automatically... -open_gate_obligation_1 is defined -No more obligations remaining -open_gate is defined ->> - - In this case, using <<Program>> is a bit like cracking a nut with a - sledgehammer. We can easily do it ourselves using the [refine] tactic. *) - -Definition open_gate': Gate. - refine ({| open := true - ; lock := false - |}). - intro Hfalse. - discriminate Hfalse. -Defined. - -(** ** Gate Equality - -What does it mean for two gates to be equal? Intuitively, we know they -have to share the same states ([open] and [lock] is our case). 
- -*** Leibniz Equality Is Too Strong - -When you write something like [a = b] in Coq, the [=] refers to the -[eq] function and this function relies on what is called the Leibniz -Equality: [x] and [y] are equal iff every property on [A] which is -true of [x] is also true of [y] - -As for myself, when I first started to write some Coq code, the -Leibniz Equality was not really something I cared about and I tried to -prove something like this: *) - -Lemma open_gates_are_equal (g g': Gate) - (equ : open g = true) (equ' : open g' = true) - : g = g'. - -(** Basically, it means that if two doors are open, then they are equal. That -made sense to me, because by definition of [Gate], a locked door is closed, -meaning an open door cannot be locked. - -Here is an attempt to prove the [open_gates_are_equal] lemmas. *) - -Proof. - assert (forall g, open g = true -> lock g = false). { - intros [o l h] equo. - cbn in *. - case_eq l; auto. - intros equl. - now rewrite (h equl) in equo. - } - assert (lock g = false) by apply (H _ equ). - assert (lock g' = false) by apply (H _ equ'). - destruct g; destruct g'; cbn in *; subst. - -(** The term to prove is now: - -<< -{| open := true; lock := false; lock_is_close := lock_is_close0 |} = -{| open := true; lock := false; lock_is_close := lock_is_close1 |} ->> - -The next tactic I wanted to use [reflexivity], because I'd basically proven -[open g = open g'] and [lock g = lock g'], which meets my definition of equality -at that time. - -Except Coq wouldn’t agree. See how it reacts: - -<< -Unable to unify "{| open := true; lock := false; lock_is_close := lock_is_close1 |}" - with "{| open := true; lock := false; lock_is_close := lock_is_close0 |}". ->> - -It cannot unify the two records. More precisely, it cannot unify -[lock_is_close1] and [lock_is_close0]. So we abort and try something -else. *) - -Abort. - -(** *** Ah hoc Equivalence Relation - -This is a familiar pattern. Coq cannot guess what we have in mind. 
Giving a -formal definition of “our equality” is fortunately straightforward. *) - -Definition gate_eq - (g g': Gate) - : Prop := - open g = open g' /\ lock g = lock g'. - -(** Because “equality” means something very specific in Coq, we won't say “two -gates are equal” anymore, but “two gates are equivalent”. That is, [gate_eq] is -an equivalence relation. But “equivalence relation” is also something very -specific. For instance, such relation needs to be symmetric ([R x y -> R y x]), -reflexive ([R x x]) and transitive ([R x y -> R y z -> R x z]). *) - -Require Import Coq.Classes.Equivalence. - -#[program] -Instance Gate_Equivalence - : Equivalence gate_eq. - -Next Obligation. - split; reflexivity. -Defined. - -Next Obligation. - intros g g' [Hop Hlo]. - symmetry in Hop; symmetry in Hlo. - split; assumption. -Defined. - -Next Obligation. - intros g g' g'' [Hop Hlo] [Hop' Hlo']. - split. - + transitivity (open g'); [exact Hop|exact Hop']. - + transitivity (lock g'); [exact Hlo|exact Hlo']. -Defined. - -(** Afterwards, the [symmetry], [reflexivity] and [transitivity] tactics also -works with [gate_eq], in addition to [eq]. We can try again to prove the -[open_gate_are_the_same] lemma and it will work[fn:lemma]. *) - -Lemma open_gates_are_the_same: - forall (g g': Gate), - open g = true - -> open g' = true - -> gate_eq g g'. -Proof. - induction g; induction g'. - cbn. - intros H0 H2. - assert (lock0 = false). - + case_eq lock0; [ intro H; apply lock_is_close0 in H; - rewrite H0 in H; - discriminate H - | reflexivity - ]. - + assert (lock1 = false). - * case_eq lock1; [ intro H'; apply lock_is_close1 in H'; - rewrite H2 in H'; - discriminate H' - | reflexivity - ]. - * subst. - split; reflexivity. -Qed. - -(** [fn:lemma] I know I should have proven an intermediate lemma to avoid code -duplication. Sorry about that, it hurts my eyes too. - -** Equivalence Relations and Rewriting - -So here we are, with our ad-hoc definition of gate equivalence. 
We can use -[symmetry], [reflexivity] and [transitivity] along with [gate_eq] and it works -fine because we have told Coq [gate_eq] is indeed an equivalence relation for -[Gate]. - -Can we do better? Can we actually use [rewrite] to replace an occurrence of [g] -by an occurrence of [g’] as long as we can prove that [gate_eq g g’]? The answer -is “yes”, but it will not come for free. - -Before moving forward, just consider the following function: *) - -Require Import Coq.Bool.Bool. - -Program Definition try_open - (g: Gate) - : Gate := - if eqb (lock g) false - then {| lock := false - ; open := true - |} - else g. - -(** It takes a gate as an argument and returns a new gate. If the former is not -locked, the latter is open. Otherwise the argument is returned as is. *) - -Lemma gate_eq_try_open_eq - : forall (g g': Gate), - gate_eq g g' - -> gate_eq (try_open g) (try_open g'). -Proof. - intros g g' Heq. -Abort. - -(** What we could have wanted to do is: [rewrite Heq]. Indeed, [g] and [g’] -“are the same” ([gate_eq g g’]), so, _of course_, the results of [try_open g] and -[try_open g’] have to be the same. Except... - -<< -Error: Tactic failure: setoid rewrite failed: Unable to satisfy the following constraints: -UNDEFINED EVARS: - ?X49==[g g' Heq |- relation Gate] (internal placeholder) {?r} - ?X50==[g g' Heq (do_subrelation:=Morphisms.do_subrelation) - |- Morphisms.Proper (gate_eq ==> ?X49@{__:=g; __:=g'; __:=Heq}) try_open] (internal placeholder) {?p} - ?X52==[g g' Heq |- relation Gate] (internal placeholder) {?r0} - ?X53==[g g' Heq (do_subrelation:=Morphisms.do_subrelation) - |- Morphisms.Proper (?X49@{__:=g; __:=g'; __:=Heq} ==> ?X52@{__:=g; __:=g'; __:=Heq} ==> Basics.flip Basics.impl) eq] - (internal placeholder) {?p0} - ?X54==[g g' Heq |- Morphisms.ProperProxy ?X52@{__:=g; __:=g'; __:=Heq} (try_open g')] (internal placeholder) {?p1} ->> - -What Coq is trying to tell us here —in a very poor manner, I’d say— is actually -pretty simple. 
It cannot replace [g] by [g’] because it does not know if two -equivalent gates actually give the same result when passed as the argument of -[try_open]. This is actually what we want to prove, so we cannot use [rewrite] -just yet, because it needs this result so it can do its magic. Chicken and egg -problem. - -In other words, we are making the same mistake as before: not telling Coq what -it cannot guess by itself. - -The [rewrite] tactic works out of the box with the Coq equality ([eq], or most -likely [=]) because of the Leibniz Equality: [x] and [y] are equal iff every -property on [A] which is true of [x] is also true of [y] - -This is a pretty strong property, and one a lot of equivalence relations do not -have. Want an example? Consider the relation [R] over [A] such that forall [x] -and [y], [R x y] holds true. Such relation is reflexive, symmetric and -reflexive. Yet, there is very little chance that given a function [f : A -> B] -and [R’] an equivalence relation over [B], [R x y -> R' (f x) (f y)]. Only if we -have this property, we would know that we could rewrite [f x] by [f y], e.g. in -[R' z (f x)]. Indeed, by transitivity of [R’], we can deduce [R' z (f y)] from -[R' z (f x)] and [R (f x) (f y)]. - -If [R x y -> R' (f x) (f y)], then [f] is a morphism because it preserves an -equivalence relation. In our previous case, [A] is [Gate], [R] is [gate_eq], -[f] is [try_open] and therefore [B] is [Gate] and [R’] is [gate_eq]. To make Coq -aware that [try_open] is a morphism, we can use the following syntax: *) - -#[local] -Open Scope signature_scope. - -Require Import Coq.Classes.Morphisms. - -#[program] -Instance try_open_Proper - : Proper (gate_eq ==> gate_eq) try_open. - -(** This notation is actually more generic because you can deal with functions -that take more than one argument. 
Hence, given [g : A -> B -> C -> D], [R] -(respectively [R’], [R’’] and [R’’’]) an equivalent relation of [A] -(respectively [B], [C] and [D]), we can prove [f] is a morphism as follows: - -<< -Add Parametric Morphism: (g) - with signature (R) ==> (R') ==> (R'') ==> (R''') - as <name>. ->> - -Back to our [try_open] morphism. Coq needs you to prove the following -goal: - -<< -1 subgoal, subgoal 1 (ID 50) - - ============================ - forall x y : Gate, gate_eq x y -> gate_eq (try_open x) (try_open y) ->> - -Here is a way to prove that: *) - -Next Obligation. - intros g g' Heq. - assert (gate_eq g g') as [Hop Hlo] by (exact Heq). - unfold try_open. - rewrite <- Hlo. - destruct (bool_dec (lock g) false) as [Hlock|Hnlock]; subst. - + rewrite Hlock. - split; cbn; reflexivity. - + apply not_false_is_true in Hnlock. - rewrite Hnlock. - cbn. - exact Heq. -Defined. - -(** Now, back to our [gate_eq_try_open_eq], we now can use [rewrite] and -[reflexivity]. *) - -Require Import Coq.Setoids.Setoid. - -Lemma gate_eq_try_open_eq - : forall (g g': Gate), - gate_eq g g' - -> gate_eq (try_open g) (try_open g'). -Proof. - intros g g' Heq. - rewrite Heq. - reflexivity. -Qed. - -(** We did it! We actually rewrite [g] as [g’], even if we weren’t able to prove -[g = g’]. *) diff --git a/site/posts/September2022.md b/site/posts/September2022.md new file mode 100644 index 0000000..2a15d25 --- /dev/null +++ b/site/posts/September2022.md @@ -0,0 +1,116 @@ +--- +published: 2022-09-18 +modified: 2023-05-09 +series: + parent: series/Retrospectives.html + prev: posts/August2022.html + next: posts/November2022.html +tags: ['spatial-shell', 'meta'] +abstract: | + In a nutshell, my latest hobby project (Spatial Sway) works well enough + so that I can use it daily, and I have done some unsuccessful experiments + for this website. +--- + +# What happened in September 2022? + +It is September 18 today, and it has already been a month since I +decided to start these retrospectives. 
This means it is time to take a
+step back and reflect on what happened these past thirty days or
+so[^syntax].
+
+[^syntax]: There is the shocking news that I have started to use syntax
+    highlighting again. But let’s not linger on it just yet.
+
+## Spatial Sway
+
+A few days after publishing my August Retrospective, I learned about
+the existence of [Material Shell](https://material-shell.com), an extension for
+GNOME 3 that provides a very interesting user experience.
+
+I tried it for a few hours, but the thing kept crashing (it’s
+probably on me, I did not even remember I had Gnome installed on my
+machine, and I would not be surprised if the culprit was my dusty setup
+rather than Material Shell itself). The experience remained very
+promising, though. Their “spatial model” especially felt like a very
+good fit for me. Basically, the main idea is that you have a grid of
+windows, with your workspaces acting as the rows. You can navigate
+vertically (from one workspace to another) or horizontally, and
+you choose how many windows you want to see at once on your screen.
+
+And so for a few hours, I was a bit frustrated by the situation…
+until I learned how one can actually manage and extend Sway
+(the Wayland compositor I have been using for several years now) thanks to
+its IPC protocol. I spent about three days experimenting, first in Rust, then
+in OCaml[^ocaml], and by the end of the week, I had a first working prototype
+I called [Spatial Sway](https://github.com/lthms/spatial-shell). It works
+pretty well; well enough that I have been using it daily for several weeks
+now. It feels clunky at times, but it works, and I have been able to write a
+[Waybar](https://github.com/Alexays/Waybar) configuration heavily inspired by
+the Material Shell UI.
+
+[^ocaml]: This was actually an interesting thought process. I have been using
+    OCaml at `$WORK`{.bash} for more than a year now.
+
+    I have curated a setup that works pretty well, and I am familiar with the
+    development tools. On the contrary, I had not written a line of Rust for at
+    least a year, my Emacs configuration for this language was broken, and I
+    had lost all my fluency in it. Still, I was not expecting to
+    pick OCaml when I started this project.
+
+Overall, I am pretty satisfied with this outcome. Writing a hobbyist
+software project is always nice, but the ones you can integrate into
+your daily workflow are the best ones. The last time I did that was
+[**keyrd**](https://sr.ht/~lthms/keyrd), my little keystrokes counting
+daemon[^keyrcount].
+
+[^keyrcount]: 19,970,965 keystrokes counted since I started using it, at the
+    time of writing this article.
+
+Anyway, a lot remains to be said about Spatial Sway, but I might save
+it for a bit later. I still have some key features to implement
+(notably, moving a window to another workspace), then I will
+probably try to advertise it a bit. I am under the impression this
+project could be of interest to others, and I would love to see it
+used by folks willing to give a Material Shell-like experience a
+try, without having to deal with Gnome Shell. By the way,
+considering Sway is a drop-in replacement for i3, and that it
+implements the exact same IPC protocol, there is no reason for
+Spatial Sway to remain Sway-specific, and I will rename it Spatial
+Shell at some point.
+
+#[Mandatory screenshot of Spatial Sway.](/img/spatial-sway-preview.png)
+
+## This Website
+
+On a side note, I have started to refine the layout of this website
+a bit. I have also written a new, curated home page where I
+want to highlight the most recent things I have published on the
+Internet.
+
+I have been experimenting with
+[Alectryon](https://github.com/cpitclaudel/alectryon/) as a way to replace
+`coqdoc`, to improve the readability of my Coq-related articles.
Unfortunately,
+it looks like this tool is missing [a key feature I
+need](https://github.com/cpitclaudel/alectryon/issues/86). I might try to get
+my hands dirty and implement it myself if I find the time and the motivation
+in the following weeks.
+
+Finally, watching [Xe Iaso’s talk about how she generates her
+blog](https://xeiaso.net/talks/how-my-website-works) was very inspiring to me.
+I can only suggest that you have a look.
+
+Though not to the same extent, I also think I have spent way too much effort
+on my website. Most of my Coq-related articles are actual Coq programs, except
+the articles about `coqffi` which are Org mode literate programs. Hell, this
+website itself used to be a literate program of the sort, until I stopped
+using my homegrown literate programming toolchain **`cleopatra`** last month.
+At some point, I even spent a bit of time ensuring most of the pages of this
+website were granted a 100/100 on websites like PageSpeed
+Insights[^goodnews]. I had almost forgotten.
+
+[^goodnews]: Good news, I’ve just checked, and it still is!
+
+A lot remains to be done, but watching this talk made me reflect on
+the job done, and it opened my eyes to a new perspective, too. We will
+see how much of it translates into reality.
diff --git a/site/posts/StackedGit.md b/site/posts/StackedGit.md
new file mode 100644
index 0000000..3c0e45b
--- /dev/null
+++ b/site/posts/StackedGit.md
@@ -0,0 +1,275 @@
+---
+published: 2022-01-16
+modified: 2022-08-07
+tags: ['stacked-git', 'workflow']
+abstract: |
+  I’ve been using Stacked Git at work since early 2021, and as of January
+  2022, it has become a cornerstone of my daily workflow.
+---
+
+# How I Use Stacked Git at `$WORK`{.bash}
+
+According to [my Lobste.rs history](https://lobste.rs/s/s6quvg/stacked_git), I
+ran into [Stacked Git](https://stacked-git.github.io) in early April
+2021, and I remember that its promises hit a soft spot.
A few weeks later, I was
+submitting [a *pull request* to teach Stacked Git to sign
+commits](https://github.com/stacked-git/stgit/pull/100). It was all I needed to
+start using it at `$WORK`{.bash}, and now it has become a cornerstone of my
+development workflow.
+
+## What is Stacked Git?
+
+Before going any further, it is probably a good idea to take a moment and
+present Stacked Git. The website introduces the tool as follows:
+
+> Stacked Git, *StGit* for short, is an application for managing Git
+> commits as a stack of patches.
+
+There are a few things to unpack here. First, as its name suggests, Stacked
+Git is a tool built on top of Git[^pijul]. It is *not* a brand new VCS, and as
+a consequence you keep using all your existing tools and plugins[^magit].
+Secondly, Stacked Git helps you curate your Git history, by turning your
+commits into patches, and your branches into stacks of patches. This speaks to
+me, maybe because I have been fascinated by email-based workflows for quite
+some time.
+
+[^pijul]: My main takeaway from my Pijul adventure is connected to this. Git
+    is not limited to the `git` binary. Git comes with a collection of powerful
+    forges, nice editor plugins, and years of good practices.
+
+    To this day, it’s neither the bugs nor the breaking changes that made me
+    quit Pijul. Those were expected. What I naively did not anticipate is the
+    dry feeling that Pijul was just the `pijul` binary, which left me with a
+    lot of tasks to do manually.
+
+[^magit]: I am looking at you, Magit.
+
+To me, the two core features of Stacked Git are (1) allowing you to
+name your commits, and (2) letting you navigate among them.
+Together, they create a wonderful companion to help you keep your
+history clean.
+
+## My Subset of Stacked Git
+
+I do not want this article to be a Stacked Git tutorial.
+Fortunately, I don’t really use the tool to its full potential.
+I only care about a relatively small subset of commands I feel
+comfortable with and use daily.
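+
+To give you a taste of what follows, here is a sketch of a short session (the
+patch name is made up for the sake of the example):
+
+```bash
+stg new fix-typo   # create an empty patch named “fix-typo” on top of the stack
+# … edit some files …
+stg refresh        # record the pending changes into the topmost patch
+stg series         # display the current stack of patches
+stg goto fix-typo  # make “fix-typo” the topmost applied patch again
+```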
+
+First, to decide which commits are part of my “stack of patches,” I
+can count on these commands:
+
+- `stg new NAME` creates an empty commit, and gives it the name
+  `NAME`.
+  Having a way to identify a patch with a meaningful name that is
+  resistant to rebase and amend is very nice.
+  These are two properties commit hashes do not have.
+- `stg uncommit NAME` turns the most recent commit below my
+  stack into a patch named `NAME`, and integrates it into the stack.
+  I do this when I am tasked to work on a merge request made by a
+  colleague, for instance.
+- `stg commit` removes the last patch from my stack. I do this when
+  said patch has been merged into `master`.
+
+Once my stack of patches is ready, the fun begins.
+
+At a given time, a patch can either be (1) applied, (2) unapplied, or (3)
+hidden. On the one hand, if a patch is applied it is part of the Git history.
+On the other hand, unapplying a patch means removing it from the working branch
+(but not from the stack of patches of Stacked Git). If a patch becomes
+irrelevant, but you don’t want to remove it entirely because it can become
+handy later, you can hide it. A hidden patch sits beside the stack of patches,
+and can be reintegrated if need be.
+
+Analogous to `git log` ---which allows you to visualize your Git history---,
+`stg series` gives you a view of the state of your stack of patches. Patches
+prefixed with `+` (or `>` for the topmost one) are applied, while `-` means
+the patch is unapplied.
+
+Then,
+
+- `stg pop` unapplies the patch on top of the list of applied
+  patches.
+- `stg push` applies the patch on the bottom of the list of unapplied
+  patches.
+- `stg goto NAME` unapplies or applies the necessary patches so that
+  `NAME` becomes the top patch of the list of applied patches.
+
+Both `HEAD` and the work tree are updated accordingly.
+
+In addition, `stg sink` and `stg float` allow reorganizing your
+stack of patches, moving patches around.
+Basically, they are like `git rebase -i`, but without having to use
+`$EDITOR`.
+
+Modifying patches is done with `stg refresh`.
+It’s akin to `git commit --amend`, except it is more powerful because
+you can modify any applied patch with the `-p` option.
+I’d always encourage you to `stg goto` first, because `stg refresh
+-p` remains unfortunately error-prone (nothing prevents you from targeting
+the wrong patch).
+But when used carefully, it can be very handy.
+
+Finally, `stg rebase REF` moves your stack of patches on top of `REF`[^rebase].
+It is akin to `git rebase --onto`, but more straightforward. What happens is
+Stacked Git pops all the patches of my stack, resets the `HEAD` of the current
+branch to `REF`, and tries applying the patches one by one. In case of
+conflicts, the process stops, and I am left with an empty patch, and a dirty
+work tree with conflicts to solve. The hidden gem is that, contrary to `git
+rebase`, the repository is not “in the middle of a rebase.”
+
+Suppose there are many conflicting patches still waiting in my stack of patches,
+and an urgent task I need to take care of first. I can just leave them here. I
+can switch to another branch, and when I come back, I get my patches back. I
+call this feature “incremental rebases.”
+
+[^rebase]: Stacked Git is supposedly able to detect, during a rebase, which of
+  your patches have been applied to your target branch. I’d rather use `stg
+  uncommit`{.bash} before doing the rebase, though.
+
+And that is basically it. In a nutshell, Stacked Git equips commits with the
+same features as branches.
+
+## My Stacked Git Workflow
+
+As mentioned in the introduction of this article, Stacked Git has become a
+cornerstone of my workflow. I’ve been asked a few times what this workflow is,
+and why Magit is not enough[^magit2]. So let me try to answer those questions.
+But first, a warning.
Yes, because Stacked Git is only a wrapper on top of Git, everything I
+will explain can be achieved using Git alone, especially if you are a Magit
+wizard.
+
+[^magit2]: It’s always about Magit. ;)
+
+Stacked Git just makes everything so much more convenient to me.
+
+### Planning My Commits Ahead Of Time
+
+I’ve been introduced to Git with a pretty simple workflow: I am
+supposed to start working on a feature, and once it’s ready, I
+can commit, and move on to the next task on my to-do list.
+
+To me, this approach is backward.
+It makes you set your intent after the fact.
+With Stacked Git, I often try to plan my final history *before
+writing the very first line of code*.
+Using `stg new`, I create my patches, and take the time to write
+their description.
+It helps me visualize where I want to go.
+Then, I use `stg goto` to go back to the beginning of my stack,
+and start working.
+
+It is not, and cannot be, an exact science. I often have to refine
+my patches as my work progresses.
+Yet, I think my Git history has been cleaner, and more focused, since
+I started this exercise.
+
+### Getting My Fixup Commits Right
+
+Reviews are a fundamental aspect of a software developer’s job.
+At `$WORK`, we use Gitlab and its merge requests workflow,
+which I find very annoying, because it does not provide meaningful
+ways to compare two versions of your submission[^gitlab].
+
+[^gitlab]: There is a notion of “versions” in Gitlab, but its ergonomics fall
+  short of my expectations for such a tool.
+
+What we end up doing is creating “fixup commits,” and we push them
+to Gitlab so that reviewers can easily verify that their feedback
+has correctly been taken into account.
+
+A fixup commit is a commit that will eventually be squashed into
+another.
+You can understand it as a delayed `git commit --amend`.
+Git has some built-in features to manipulate them.
+You create them with `git commit --fixup=<HASH>`, and they are
+interpreted in a specific manner by `git rebase -i`.
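+For the record, here is a minimal sketch of that built-in flow, in a
+throwaway repository (file names and commit messages are invented for
+illustration). Setting `GIT_SEQUENCE_EDITOR` to the no-op `:` makes the
+rebase accept the generated todo list as-is, so no editor pops up:
+
```shell
#!/bin/sh
# Throwaway repository, so the demo does not touch any real project.
cd "$(mktemp -d)" || exit 1
git init -q .
git config user.name "Demo"
git config user.email "demo@example.com"

echo "parser" > parser.ml
git add parser.ml
git commit -q -m "feat: add parser"

echo "printer" > printer.ml
git add printer.ml
git commit -q -m "feat: add printer"

# Amend the *parser* commit after the fact: ":/add parser" resolves to
# the most recent commit whose message matches.
echo "handle empty input" >> parser.ml
git commit -q -a --fixup=":/add parser"

# Fold the fixup commit back into its target, non-interactively.
GIT_SEQUENCE_EDITOR=: git rebase -q -i --autosquash --root

git log --format=%s
```
+
+After the rebase, the `fixup!` commit is gone, and its changes live in the
+parser commit.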
+But they have always felt to me like a sordid hack.
+It is way too easy to create a fixup commit that targets the wrong
+commit, and you can end up with strange conflicts when you finally
+squash them.
+That being said, if used carefully, they are a powerful tool to
+keep a Git history clean.
+
+I am not sure we are using them carefully, though.
+
+Some reviews can be excruciating, with dozens of comments to
+address, and theoretically as many fixup commits to create.
+Then you push all of them to Gitlab, and days later, after the
+green light from the reviewer, you get to call `git rebase`
+and discover your history is broken, you have tons of conflicts
+to fix, and you’re good for a long afternoon of untangling.
+
+The main reason behind this mess is that you end up fixing a commit
+from the `HEAD` of your working branch, not the commit itself.
+But with Stacked Git, things are different.
+With `stg goto`, I put my working tree in the best state possible
+to fix a commit: the commit itself.
+I can use `stg new` to create a fixup commit, with a meaningful
+name.
+Then, I am forced to deal with the potential conflicts it brings
+when I call `stg push`.
+
+Once my reviewer is happy with my work, I can call `stg squash`.
+It is less automated than `git rebase -i`, but the comfort I gained
+during the development is worth this little annoyance.
+
+### Managing Stacked Merge Requests
+
+At `$WORK`, we are trying to change how we deliver new features to
+our `master` branch.
+More precisely, we want to merge smaller contributions more
+frequently.
+We have had our fair share of large and complex merge requests that
+were a nightmare to review in the past, and it’s really not a fun
+position to be put in.
+
+For a few months, I have been involved in a project wherein we
+decided *not* to fall into the same trap again.
+We agreed on a “planning of merge requests” and started working.
+The first merge request was soon opened.
+We nominated an “owner” to take care of the review, and the rest
+of the team carried on.
+Before the first merge request was merged, the second one was
+declared ready, and another owner was appointed.
+Then, the owner of the first merge request had a baby, and yours
+truly ended up having to manage two interdependent merge requests.
+
+It turns out Stacked Git is a wonderful tool to help me keep this
+under control.
+
+I only have one branch, and I use the same workflow to deal with
+feedback, even when it comes from more than one merge
+request.
+To remember the structure of everything, I just prefix the names of
+my patches with a merge request nickname.
+So my stack will look something like this:
+
+```
++ mr1-base
++ mr1-tests
++ mr1-doc
+> mr2-command
+- mr2-tests
+```
+
+A reviewer leaves a hard-truth comment that requires a significant rework of
+the oldest merge request? `stg goto` brings my work tree back to the appropriate
+state, and `stg push` allows me to deal with conflicts one patch at a time. If
+I need to spend more time on the oldest merge request at some point, I can
+continue my work, knowing the patches related to the newest one are waiting in
+my stack.
+
+The most annoying part is when the time comes to push everything. I need to
+`stg goto` the last patch of each merge request, and `git push
+HEAD:the-branch`. It’s not horrible. But I will probably try to automate it at
+some point.
+
+## Conclusion
+
+Overall, I am really thankful to Stacked Git’s authors! Thank you! You are
+making my interactions with Git fun and carefree. You provide me with some of
+the convenience of patch-based VCS like [Darcs](http://darcs.net) and
+[Pijul](https://pijul.org), but without sacrificing the power of Git.
+
+I encourage anyone to at least give it a try, and I really hope I
+will be able to contribute back to Stacked Git in the near future.
diff --git a/site/posts/StackedGit2.md b/site/posts/StackedGit2.md
new file mode 100644
index 0000000..313901f
--- /dev/null
+++ b/site/posts/StackedGit2.md
@@ -0,0 +1,138 @@
+---
+published: 2023-01-16
+tags: ['stacked-git', 'workflow']
+abstract: |
+  One year has passed, and I keep using Stacked Git almost daily. How I am
+  using it has slightly changed, though.
+---
+
+# How I Keep Using Stacked Git at `$WORK`{.bash}
+
+One year ago, I published an article summarizing [my experience using
+Stacked Git at `$WORK`{.bash}](/posts/StackedGit.html). Twelve months later,
+enough has changed to motivate a spin-off piece.
+
+## Stacked Git is *Fast*
+
+Firstly, it is important to state that my main complaint about
+Stacked Git is now a thing of the past[^edit]! Stacked Git does not feel slow
+anymore, far from it. This is because [Stacked Git 2.0 has been rewritten
+in Rust](https://github.com/stacked-git/stgit/discussions/185). While RiiR
+(*Rewrite it in Rust*) is a running meme on the Internet, in this particular
+case, the result is very exciting.
+
+[^edit]: For fairness, I have removed the related section from my previous
+  write-up.
+
+Thanks to the performance boost, my Zsh prompt no longer takes 0.1s to
+appear!
+
+Speaking of my Zsh prompt, what I ended up displaying is basically `(<TOP PATCH
+NAME> <APPLIED PATCHES COUNT>/<PATCHSET SIZE> <HIDDEN PATCHES COUNT>)`. For
+instance, `(fix-1337 1/2 3)`.
+
+In case you want to take inspiration from my somewhat working configuration,
+here is the snippet of interest.
+
+```bash
+# status_color and current_branch are defined elsewhere in my Zsh
+# configuration.
+local series_top="$(stg top 2> /dev/null)"
+local total="$(stg series 2> /dev/null | wc -l)"
+local hidden="$(stg series --hidden 2> /dev/null | wc -l)"
+
+if [[ "${total}" -gt 0 ]]; then
+  local not_applied="$(stg series | grep -E '^-' | wc -l)"
+  local applied="$(($total - $not_applied))"
+
+  if [[ -z "${series_top}" ]]; then
+    series_top="·"
+  fi
+
+  echo -n "(${status_color}${series_top} ${applied}/${total} ${hidden})"
+  echo -n " ($(current_branch))"
+fi
+```
+
+## Branchless Workflow
+
+Last year, I was using Stacked Git on top of git branches. More precisely, I
+had one branch for each (stack of) Merge Request. It worked well, because my
+typical MR counted 3 to 4 commits on average.
+
+Fast forward to today, and things have changed on this front too. In a
+nutshell, I have become a “one commit per MR” maximalist of sorts[^onecommit].
+I find this approach very effective for getting more focused reviews, and for
+reducing the time it takes for a given MR to be integrated into the main
+branch.
+
+[^onecommit]: It goes without saying that this approach comes with its set of
+  drawbacks too.
+
+  During the past year, I’ve pushed fairly large commits which could have
+  been split into several smaller ones, for the sake of keeping my “one
+  commit per MR” policy. I have also had to manage large stacks of MRs.
+
+My previous approach based on git branches did not scale well with
+this new mindset, and during the course of the year, I stopped using
+branches altogether[^branchless].
+
+[^branchless]: I have not invented the branchless workflow, of
+  course.
+
+  After it was first published, someone posted a link to my Stacked Git
+  article on Hacker News, and [*@arxanas* posted a comment about
+  `git-branchless`](https://news.ycombinator.com/item?id=29959224). I tried
+  the tool, and even if it never clicked for me, I was really compelled by
+  its core ideas.
+
+  Similarly, [Drew DeVault has published a complete article on his own
+  branchless workflow in
+  2020](https://drewdevault.com/2020/04/06/My-weird-branchless-git-workflow.html).
+
+These days, I proceed as follows.
+
+1. I name each patch after the branch to which I will push it on our
+   upstream Git remote.
+2. 99% of the time, I push my work using `git push -f upstream @:$(stg
+   top)`{.bash}.
+3. I created a small git plugin I called `git-prepare` which allows
+   me to select one of the patches of my current patchset using `fzf`,
+   and which pops all other patches that are currently applied.
+
+`git-prepare` is really straightforward:
+
+```bash
+#!/bin/sh
+# Let fzf select a patch, and abort if the selection is cancelled.
+patch=$(stg series -P | fzf)
+ret=$?
+
+if [ $ret -ne 0 ]; then
+  exit $ret
+fi
+
+# Pop every applied patch, then push only the selected one.
+if [ -n "$(stg series -A)" ]; then
+  stg pop -a
+fi
+
+stg push "${patch}"
+```
+
+The main hurdle which I still need to figure out is how to deal with
+stacked MRs. Currently, this is very manual. I need to remember
+which commits belong to the stack, the order and dependencies of
+these commits, and I need to publish each commit individually using
+`stg push; git push @:$(stg top)`{.bash}.
+
+The pragmatic answer is definitely to come back to git branches *for
+this particular use case*, but it’s not the *fun* answer. So from
+time to time, I try to experiment with alternative approaches. My current
+intuition is that, by adopting a naming convention for my patches, I
+could probably implement thin tooling on top of Stacked Git to
+deal with dependent commits.
+
+## Conclusion
+
+Putting aside stacked MRs for now, I am really satisfied with my
+workflow. It’s very lightweight and intuitive, and working without
+Stacked Git now feels backward and clunky.
+
+So I will take this opportunity to thank Stacked Git’s authors and
+contributors one more time. You all are making my professional life
+easier with your project.
diff --git a/site/posts/StackedGitPatchTheory.md b/site/posts/StackedGitPatchTheory.md
new file mode 100644
index 0000000..5cd81be
--- /dev/null
+++ b/site/posts/StackedGitPatchTheory.md
@@ -0,0 +1,119 @@
+---
+published: 2023-01-26
+tags: ['stacked-git', 'ideas']
+abstract: |
+  Could patch dependencies help reduce the friction my branchless workflow
+  suffers from when it comes to stacked MRs?
+---
+
+# Patch Dependencies for Stacked Git
+
+Every time I catch myself thinking about dependencies between the
+changesets of a software project, the fascinating field of patch
+theories comes to my mind.
+
+A “patch theory” usually refers to the mathematical foundation behind
+the data model of so-called Patch-based DVCS like Darcs and
+Pijul. More precisely, a patch theory is an encoding of the state of a
+repository, equipped with operations (gathered in so-called patches,
+not to be confused with `GNU diff` patches) one can apply to this
+state. For instance, my rough understanding of Pijul’s patch theory is
+that a repository is a directed graph of lines, and a patch is a set
+of operations on this graph.
+
+An interesting aspect of patch theory is that it requires a partial
+order for its patches, from which a Patch-based DVCS derives a
+dependency graph. In a nutshell, a patch $P$ depends on the patches
+which are responsible for the presence of the lines that $P$
+modifies.
+
+I have always found this idea of a dependency graph for the patches
+of a repository fascinating, because I thought it would be a very
+valuable tool in the context of software development.
+
+I wanted to slightly change the definition of what a patch
+dependency is, though. See, the partial orders of Darcs and Pijul
+focus on syntactic dependencies: the relations between lines in text
+files. They need them to reconstruct these text files in the file
+system of their users.
As a software developer writing these text
+files, though, I quickly realized these dependencies were of little interest
+to me. What I wanted to be able to express was that a
+feature introduced by a patch $P$ relied on a fix introduced by a
+patch $P'$.
+
+I have experimented with Darcs and Pijul quite a bit, with this idea
+stuck in the back of my mind. At the end of this journey, I
+convinced myself[^caution] that (1) this beautiful idea I
+had simply could not scale, and (2) neither I nor our industry is
+ready to give up on the extensive ecosystem that has been built on top
+of `git` just yet. As a consequence, my interest in Patch-based DVCS
+decreased noticeably.
+
+[^caution]: I am not trying to convince you, per se. This is very personal
+  and subjective feedback; it does not mean someone else couldn’t reach a
+  different conclusion.
+
+Until very recently, that is. I was reminded of the appeal of a
+dependency graph for changesets when I started adopting a Gitlab
+workflow centered around Stacked Git and smaller, sometimes
+interdependent MRs.
+
+A manually curated dependency graph for a whole repository is not
+practical, but what about my local queue of patches, patiently
+waiting to be integrated into the upstream repository I am
+contributing to? Not only does it look like a more approachable
+task, it could also make synchronizing my stacked MRs a lot easier.
+
+The workflow I have in mind would proceed as follows.
+
+- Stacked Git’s `new` and `edit` commands could be extended to let
+  developers declare dependencies between their patches. It would be
+  the commands’ responsibility to enforce the well-foundedness of the
+  dependency graph (*e.g.*, prevent the introduction of cycles in the
+  graph, and maybe diamonds too[^diamond]).
+- The `series` command could be improved to display the resulting
+  dependency graph.
+- `push` and `pop` would automatically take care of (pushing or
+  popping) the selected patches’ dependencies.
+
+- Ideally, Stacked Git would get a new command `prepare <PATCH NAME>`
+  which would pop every applied patch, then push only `<PATCH
+  NAME>` and its dependencies (dependencies first). Developers could
+  fix conflicts if need be. That is, Stacked Git would not be
+  responsible for the consistency or correctness of the dependency
+  graph.
+- Stacked Git could get commands to detect potential issues with the
+  dependency graph specified by the developer (mostly consisting of
+  dry-runs of `prepare` to check whether it would lead to conflicts).
+
+[^diamond]: At least in a first version. There is definitely value in being
+  able to work with two independent patches in conjunction with a third one
+  that needs them both. That being said, our goal here is to organize our
+  work locally, and if it is made easier by declaring an artificial
+  dependency, this is a pragmatic sacrifice I am personally willing to make.
+
+Because what we want is semantic dependencies, not syntactic dependencies
+between patches, I really think it makes a lot of sense to completely delegate
+the dependencies declaration to the developer[^future]. The very mundane
+example that convinced me is the `CHANGELOG` file any mature software project
+ends up maintaining. If the contribution guidelines require modifying the
+`CHANGELOG` file in the same commit that introduces a feature, then the
+patches of two independent features will systematically conflict. This does not
+mean, from my patch queue perspective, I should be forced to `pop` the first
+commit before starting to work on the second one. It just means that when I
+call `stg prepare`, I may have to fix a conflict, but fixing Git conflicts is
+part of the job after all[^rerere]. If for some reason solving a conflict
+proves to be too cumbersome, I can always acknowledge that, and declare a new
+dependency on the appropriate patch.
It only means my reviewers and I will be
+constrained a bit more than expected when dealing with my stack of MRs.
+
+[^future]: Further versions of Stacked Git could explore computing the
+  dependency graph automatically, similarly to what Git does. But I think
+  that if Darcs and Pijul told us anything, it’s that this computation is far
+  from being trivial.
+
+[^rerere]: And we have tools to help us. I wonder to what extent `git rerere`
+  could save the day in some cases, for instance.
+
+I am under the impression that this model extends the current way Stacked Git
+works quite nicely. At its core, it extends the data model to constrain
+`push` and `pop` a bit, and empowers developers to organize their local mess
+a bit.
diff --git a/site/posts/StronglySpecifiedFunctions.org b/site/posts/StronglySpecifiedFunctions.org
deleted file mode 100644
index 83148c6..0000000
--- a/site/posts/StronglySpecifiedFunctions.org
+++ /dev/null
@@ -1,16 +0,0 @@
-#+TITLE: A Series on Strongly-Specified Functions in Coq
-
-#+SERIES: ../coq.html
-#+SERIES_NEXT: ./Ltac.html
-
-Using dependent types and the ~Prop~ sort, it becomes possible to specify
-functions whose arguments and results are constrained by properties. Using such
-a “strongly-specified” function requires to provide a proof that the supplied
-arguments satisfy the expected properties, and allows for soundly assuming the
-results are correct too. However, implementing dependently-typed functions can
-be challenging. In this series, we explore several approaches available to Coq
-developers.
-
-- [[./StronglySpecifiedFunctionsRefine.html][Implementing Strongly-Specified Functions with the ~refine~ Tactic]] ::
-
-- [[./StronglySpecifiedFunctionsProgram.html][Implementing Strongly-Specified Functions with the ~Program~ Framework]] ::
diff --git a/site/posts/StronglySpecifiedFunctionsProgram.md b/site/posts/StronglySpecifiedFunctionsProgram.md
new file mode 100644
index 0000000..a16eca7
--- /dev/null
+++ b/site/posts/StronglySpecifiedFunctionsProgram.md
@@ -0,0 +1,553 @@
+---
+published: 2017-01-01
+tags: ['coq']
+series:
+  parent: series/StronglySpecifiedFunctions.html
+  prev: posts/StronglySpecifiedFunctionsRefine.html
+abstract: |
+  `Program`{.coq} is the heir of the `refine`{.coq} tactic. It gives you a
+  convenient way to embed proofs within functional programs that are supposed
+  to fade away during code extraction.
+---
+
+# Implementing Strongly-Specified Functions with the `Program`{.coq} Framework
+
+## The Theory
+
+If I had to explain `Program`{.coq}, I would say `Program`{.coq} is the heir of
+the `refine`{.coq} tactic. It gives you a convenient way to embed proofs within
+functional programs that are supposed to fade away during code extraction. But
+what do I mean when I say “embed proofs” within functional programs? I have
+found two ways to do it.
+
+### Invariants
+
+First, we can define a record with one or more fields of type
+`Prop`{.coq}. By doing so, we can constrain the values of other fields. Put
+another way, we can specify invariants for our type. For instance, in
+[SpecCert](https://github.com/lthms/SpecCert), I have defined the memory
+controller's SMRAMC register as follows:
+
+```coq
+Record SmramcRegister := {
+  d_open: bool;
+  d_lock: bool;
+  lock_is_closed: d_lock = true -> d_open = false;
+}.
+```
+
+So `lock_is_closed`{.coq} is an invariant I know each instance of
+`SmramcRegister` will have to comply with, because every time I
+construct a new instance, I will have to prove
+`lock_is_closed`{.coq} holds true.
For instance:
+
+```coq
+Definition lock (reg: SmramcRegister)
+  : SmramcRegister.
+  refine ({| d_open := false; d_lock := true |}).
+```
+
+Coq leaves us this goal to prove.
+
+```
+reg : SmramcRegister
+============================
+true = true -> false = false
+```
+
+This sounds reasonable enough.
+
+```coq
+Proof.
+  trivial.
+Defined.
+```
+
+We have seen in my previous article about strongly specified
+functions that mixing proofs and regular terms may lead to
+cumbersome code.
+
+From that perspective, `Program`{.coq} helps. Indeed, the `lock`{.coq} function
+can also be defined as follows:
+
+```coq
+From Coq Require Import Program.
+
+#[program]
+Definition lock' (reg: SmramcRegister)
+  : SmramcRegister :=
+  {| d_open := false
+   ; d_lock := true
+   |}.
+```
+
+### Pre and Post Conditions
+
+Another way to “embed proofs in a program” is by specifying pre-
+and post-conditions for its components. In Coq, this is done using
+sigma types.
+
+On the one hand, a precondition is a proposition a function input has to
+satisfy in order for the function to be applied. For instance, a precondition
+for `head : forall {a}, list a -> a`{.coq}, the function that returns the first
+element of a list `l`{.coq}, requires `l`{.coq} to contain at least one element.
+We can write that using a sigma type. The type of `head`{.coq} then becomes
+`forall {a} (l : list a | l <> []), a`{.coq}.
+
+On the other hand, a post condition is a proposition a function
+output has to satisfy in order for the function to be correctly
+implemented. In this case, `head`{.coq} should in fact return the first
+element of `l`{.coq} and not something else.
+
+`Program`{.coq} makes writing this specification straightforward.
+
+```coq
+#[program]
+Definition head {a} (l : list a | l <> [])
+  : { x : a | exists l', x :: l' = l }.
+```
+
+We recall that because `{ l: list a | l <> [] }`{.coq} is not the same as `list
+a`{.coq}, in theory we cannot just compare `l`{.coq} with `x :: l'`{.coq} (we need to
+use `proj1_sig`{.coq}). One advantage of `Program`{.coq} is that it deals with
+this using an implicit coercion.
+
+Note that for the type inference to work as expected, the
+unwrapped value (here, `x :: l'`{.coq}) needs to be the left operand of
+`=`{.coq}.
+
+Now that `head`{.coq} has been specified, we have to implement it.
+
+```coq
+#[program]
+Definition head {a} (l: list a | l <> [])
+  : { x : a | exists l', cons x l' = l } :=
+  match l with
+  | x :: l' => x
+  | [] => !
+  end.
+
+Next Obligation.
+  exists l'.
+  reflexivity.
+Qed.
+```
+
+I want to highlight several things here:
+
+- We return `x`{.coq} (of type `a`{.coq}) rather than a sigma type; `Program`{.coq}
+  is smart enough to wrap it. To do so, it tries to prove the post
+  condition, and because it fails, we have to do it ourselves (this is the
+  obligation we solve after the function definition.)
+- The `[]`{.coq} case is absurd given the precondition; we tell Coq that
+  using the bang (`!`{.coq}) symbol.
+
+We can have a look at the extracted code:
+
+```ocaml
+(** val head : 'a1 list -> 'a1 **)
+let head = function
+| Nil -> assert false (* absurd case *)
+| Cons (a, _) -> a
+```
+
+The implementation is pretty straightforward, but the pre- and
+post conditions have faded away. Also, the absurd case is
+discarded using an assertion. This means one thing: `head`{.coq} should
+not be used directly from the OCaml world. “Interface” functions
+have to be total.
+
+## The Practice
+
+```coq
+From Coq Require Import Lia.
+```
+
+I have challenged myself to build a strongly specified library. My goal was to
+define a type `vector : Type -> nat -> Type`{.coq} such that `vector a n`{.coq}
+is a list of `n`{.coq} instances of `a`{.coq}.
+
+```coq
+Inductive vector (a : Type) : nat -> Type :=
+| vcons {n} : a -> vector a n -> vector a (S n)
+| vnil : vector a O.
+
+Arguments vcons [a n] _ _.
+Arguments vnil {a}.
+```
+
+I had three functions in mind: `take`{.coq}, `drop`{.coq} and `extract`{.coq}.
+I learned a few lessons. My main takeaway remains: do not use sigma types,
+`Program`{.coq} and dependent types together. From my point of view, Coq is not
+yet ready for this. Maybe it is possible to make those three work together, but
+I have to admit I did not find out how. As a consequence, my preconditions are
+defined as extra arguments.
+
+To be able to specify the post conditions of my three functions and
+some others, I first defined `nth`{.coq} to get the _nth_ element of a
+vector.
+
+My first attempt to write `nth`{.coq} was a failure. This definition:
+
+```coq
+#[program]
+Fixpoint nth {a n}
+    (v : vector a n) (i : nat) {struct v}
+  : option a :=
+  match v, i with
+  | vcons x _, O => Some x
+  | vcons x r, S i => nth r i
+  | vnil, _ => None
+  end.
+```
+
+raised an anomaly. Splitting the `match`{.coq} into two nested ones fixed
+the problem:
+
+```coq
+#[program]
+Fixpoint nth {a n}
+    (v : vector a n) (i : nat) {struct v}
+  : option a :=
+  match v with
+  | vcons x r =>
+    match i with
+    | O => Some x
+    | S i => nth r i
+    end
+  | vnil => None
+  end.
+```
+
+With `nth`{.coq}, it is possible to give a very precise definition of
+`take`{.coq}:
+
+```coq
+#[program]
+Fixpoint take {a n}
+    (v : vector a n) (e : nat | e <= n)
+  : { u : vector a e | forall i : nat,
+        i < e -> nth u i = nth v i } :=
+  match e with
+  | S e' => match v with
+            | vcons x r => vcons x (take r e')
+            | vnil => !
+            end
+  | O => vnil
+  end.
+
+Next Obligation.
+  now apply le_S_n.
+Defined.
+
+Next Obligation.
+  induction i.
+  + reflexivity.
+  + apply e0.
+    now apply Lt.lt_S_n.
+Defined.
+
+Next Obligation.
+  now apply PeanoNat.Nat.nle_succ_0 in H.
+Defined.
+
+Next Obligation.
+  now apply PeanoNat.Nat.nlt_0_r in H.
+Defined.
+```
+
+As a side note, I wanted to define the post condition as follows:
+`{ v': vector A e | forall (i : nat | i < e), nth v' i = nth v i
+}`{.coq}. However, this made the goals and hypotheses become very hard
+to read and to use. Sigma types in sigma types: not a good
+idea.
+
+```ocaml
+(** val take : 'a1 vector -> nat -> 'a1 vector **)
+
+let rec take v = function
+| O -> Vnil
+| S e' ->
+  (match v with
+   | Vcons (_, x, r) -> Vcons (e', x, (take r e'))
+   | Vnil -> assert false (* absurd case *))
+```
+
+Then I could tackle `drop`{.coq} in a very similar manner:
+
+```coq
+#[program]
+Fixpoint drop {a n}
+    (v : vector a n) (b : nat | b <= n)
+  : { v': vector a (n - b) | forall i,
+        i < n - b -> nth v' i = nth v (b + i) } :=
+  match b with
+  | 0 => v
+  | S n => (match v with
+            | vcons _ r => (drop r n)
+            | vnil => !
+            end)
+  end.
+
+Next Obligation.
+  now rewrite <- Minus.minus_n_O.
+Defined.
+
+Next Obligation.
+  induction n;
+    rewrite <- eq_rect_eq;
+    reflexivity.
+Defined.
+
+Next Obligation.
+  now apply le_S_n.
+Defined.
+
+Next Obligation.
+  now apply PeanoNat.Nat.nle_succ_0 in H.
+Defined.
+```
+
+The proofs are easy to write, and the extracted code is exactly what one might
+want it to be:
+
+```ocaml
+(** val drop : 'a1 vector -> nat -> 'a1 vector **)
+let rec drop v = function
+| O -> v
+| S n ->
+  (match v with
+   | Vcons (_, _, r) -> drop r n
+   | Vnil -> assert false (* absurd case *))
+```
+
+But `Program`{.coq} really shone when it came to implementing `extract`{.coq}.
+I just had to combine `take`{.coq} and `drop`{.coq}.
+
+```coq
+#[program]
+Definition extract {a n} (v : vector a n)
+    (e : nat | e <= n) (b : nat | b <= e)
+  : { v': vector a (e - b) | forall i,
+        i < (e - b) -> nth v' i = nth v (b + i) } :=
+  take (drop v b) (e - b).
+
+Next Obligation.
+  transitivity e; auto.
+Defined.
+
+Next Obligation.
+  now apply PeanoNat.Nat.sub_le_mono_r.
+Defined.
+
+Next Obligation.
+  destruct drop; cbn in *.
+  destruct take; cbn in *.
+  rewrite e1; auto.
+  rewrite <- e0; auto.
+  lia.
+Defined.
+```
+
+The proofs are straightforward because the specifications of `drop`{.coq} and
+`take`{.coq} are precise enough, and we do not need to have a look at their
+implementations. The extracted version of `extract`{.coq} is as clean as we
+could hope for.
+
+```ocaml
+(** val extract : 'a1 vector -> nat -> nat -> 'a1 vector **)
+let extract v e b =
+  take (drop v b) (sub e b)
+```
+
+I was pretty happy, so I tried some more. Each time, using `nth`{.coq}, I managed
+to write a precise post condition and to prove it holds true. For instance,
+given `map`{.coq} to apply a function `f`{.coq} to each element of a vector `v`{.coq}:
+
+```coq
+#[program]
+Fixpoint map {a b n} (v : vector a n) (f : a -> b)
+  : { v': vector b n | forall i,
+        nth v' i = option_map f (nth v i) } :=
+  match v with
+  | vnil => vnil
+  | vcons a v => vcons (f a) (map v f)
+  end.
+
+Next Obligation.
+  induction i.
+  + reflexivity.
+  + apply e.
+Defined.
+```
+
+I also managed to specify and write `append`{.coq}:
+
+```coq
+#[program]
+Fixpoint append {a n m}
+    (v : vector a n) (u : vector a m)
+  : { w : vector a (n + m) | forall i,
+        (i < n -> nth w i = nth v i) /\
+        (n <= i -> nth w i = nth u (i - n))
+    } :=
+  match v with
+  | vnil => u
+  | vcons a v => vcons a (append v u)
+  end.
+
+Next Obligation.
+  split.
+  + now intro.
+  + intros _.
+    now rewrite PeanoNat.Nat.sub_0_r.
+Defined.
+
+Next Obligation.
+  rename wildcard' into n.
+  destruct (Compare_dec.lt_dec i (S n)); split.
+  + intros _.
+    destruct i.
+    ++ reflexivity.
+    ++ cbn.
+       specialize (a1 i).
+       destruct a1 as [a1 _].
+       apply a1.
+       auto with arith.
+  + intros false.
+    lia.
+  + now intros.
+  + intros ord.
+    destruct i.
+    ++ lia.
+    ++ cbn.
+       specialize (a1 i).
+       destruct a1 as [_ a1].
+       apply a1.
+       auto with arith.
+Defined.
+```
+
+Finally, I tried to implement `map2`{.coq} that takes a vector of `a`{.coq}, a vector of
+`b`{.coq} (both of the same size) and a function `f : a -> b -> c`{.coq} and returns a
+vector of `c`{.coq}.
+
+First, we need to provide a precise specification for `map2`{.coq}. To do that, we
+introduce `option_app`{.coq}, a function that Haskellers know all too well as being
+part of the `Applicative`{.haskell} type class.
+
+```coq
+Definition option_app {a b}
+    (opf: option (a -> b))
+    (opx: option a)
+  : option b :=
+  match opf, opx with
+  | Some f, Some x => Some (f x)
+  | _, _ => None
+  end.
+```
+
+We thereafter use `<$>`{.coq} as an infix operator for `option_map`{.coq} and `<*>`{.coq} as
+an infix operator for `option_app`{.coq}.
+
+```coq
+Infix "<$>" := option_map (at level 50).
+Infix "<*>" := option_app (at level 55).
+```
+
+Given two vectors `v`{.coq} and `u`{.coq} of the same size and a function `f`{.coq}, and given
+`w`{.coq} the result computed by `map2`{.coq}, then we can propose the following
+specification for `map2`{.coq}:
+
+`forall (i : nat), nth w i = f <$> nth v i <*> nth u i`{.coq}
+
+This reads as follows: the `i`{.coq}th element of `w`{.coq} is the result of applying
+the `i`{.coq}th elements of `v`{.coq} and `u`{.coq} to `f`{.coq}.
+
+It turns out implementing `map2`{.coq} with the `Program`{.coq} framework has
+proven to be harder than I originally expected. My initial attempt was the
+following:
+
+```coq
+#[program]
+Fixpoint map2 {a b c n}
+    (v : vector a n) (u : vector b n)
+    (f : a -> b -> c) {struct v}
+  : { w: vector c n | forall i,
+        nth w i = f <$> nth v i <*> nth u i
+    } :=
+  match v, u with
+  | vcons x rst, vcons x' rst' =>
+    vcons (f x x') (map2 rst rst' f)
+  | vnil, vnil => vnil
+  | _, _ => !
+  end.
+```
+
+Coq rejected this first attempt with the following error:
+
+```
+Illegal application:
+The term "@eq" of type "forall A : Type, A -> A -> Prop"
+cannot be applied to the terms
+ "nat" : "Set"
+ "S wildcard'" : "nat"
+ "b" : "Type"
+The 3rd term has type "Type" which should be coercible
+to "nat".
+```
+
+So I had to fall back to defining the function in pure Ltac.
+
+```coq
+#[program]
+Fixpoint map2 {a b c n}
+    (v : vector a n) (u : vector b n)
+    (f : a -> b -> c) {struct v}
+  : { w: vector c n | forall i,
+        nth w i = f <$> nth v i <*> nth u i
+    } := _.
+
+Next Obligation.
+  dependent induction v; dependent induction u.
+  + remember (IHv u f) as u'.
+    inversion u'.
+    refine (exist _ (vcons (f a0 a1) x) _).
+    intros i.
+    induction i.
+    * reflexivity.
+    * apply (H i).
+  + refine (exist _ vnil _).
+    reflexivity.
+Qed.
+```
+
+## Is It Usable?
+
+This post mostly gives the “happy endings” for each function. I think I tried
+too hard for what I got in return, and therefore I am convinced `Program`{.coq}
+is not ready (at least when dependent types are involved; I cannot tell for the
+rest). For instance, I found at least one bug in `Program`{.coq}'s logic (I
+still have to report it). Have a look at the following code:
+
+```coq
+#[program]
+Fixpoint map2 {a b c n}
+    (u : vector a n) (v : vector b n)
+    (f : a -> b -> c) {struct v}
+  : vector c n :=
+  match u with
+  | _ => vnil
+  end.
+```
+
+It gives the following error:
+
+```
+Error: Illegal application:
+The term "@eq" of type "forall A : Type, A -> A -> Prop"
+cannot be applied to the terms
+ "nat" : "Set"
+ "0" : "nat"
+ "wildcard'" : "vector A n'"
+The 3rd term has type "vector A n'" which should be
+coercible to "nat".
+``` diff --git a/site/posts/StronglySpecifiedFunctionsProgram.v b/site/posts/StronglySpecifiedFunctionsProgram.v deleted file mode 100644 index c5763ea..0000000 --- a/site/posts/StronglySpecifiedFunctionsProgram.v +++ /dev/null @@ -1,526 +0,0 @@ -(** * Implementing Strongly-Specified Functions with the <<Program>> Framework - - This is the second article (initially published on #<span - id="original-created-at">January 01, 2017</span>#) of a series of two on how - to write strongly-specified functions in Coq. You can read the previous part - #<a href="./StronglySpecifiedFunctionsRefine.html">here</a>#. # *) - -(** #<nav id="generate-toc"></nav># - - #<div id="history">site/posts/StronglySpecifiedFunctionsProgram.v</div># *) - -(** ** The Theory *) - -(** If I had to explain `Program`, I would say `Program` is the heir - of the `refine` tactic. It gives you a convenient way to embed - proofs within functional programs that are supposed to fade away - during code extraction. But what do I mean when I say "embed - proofs" within functional programs? I found two ways to do it. *) - -(** *** Invariants *) - -(** First, we can define a record with one or more fields of type - [Prop]. By doing so, we can constrain the values of other fields. Put - another way, we can specify invariant for our type. For instance, in - SpecCert, I have defined the memory controller's SMRAMC register - as follows: *) - -Record SmramcRegister := { - d_open: bool; - d_lock: bool; - lock_is_close: d_lock = true -> d_open = false; -}. - -(** So [lock_is_closed] is an invariant I know each instance of - `SmramcRegister` will have to comply with, because every time I - will construct a new instance, I will have to prove - [lock_is_closed] holds true. For instance: *) - -Definition lock - (reg: SmramcRegister) - : SmramcRegister. - refine ({| d_open := false; d_lock := true |}). - -(** Coq leaves us this goal to prove. 
- -<< -reg : SmramcRegister -============================ -true = true -> false = false ->> - - This sound reasonable enough. *) - -Proof. - trivial. -Defined. - -(** We have witness in my previous article about strongly-specified - functions that mixing proofs and regular terms may leads to - cumbersome code. - - From that perspective, [Program] helps. Indeed, the [lock] - function can also be defined as follows: *) - -From Coq Require Import Program. - -#[program] -Definition lock' - (reg: SmramcRegister) - : SmramcRegister := - {| d_open := false - ; d_lock := true - |}. - -(** *** Pre and Post Conditions *) - -(** Another way to "embed proofs in a program" is by specifying pre- - and post-conditions for its component. In Coq, this is done using - sigma-types. *) - -(** On the one hand, a precondition is a proposition a function input - has to satisfy in order for the function to be applied. For - instance, a precondition for [head : forall {a}, list a -> a] the - function that returns the first element of a list [l] requires [l] - to contain at least one element. We can write that using a - sigma-type. The type of [head] then becomes [forall {a} (l: list a - | l <> []) : a] - - On the other hand, a post condition is a proposition a function - output has to satisfy in order for the function to be correctly - implemented. In this way, `head` should in fact return the first - element of [l] and not something else. - - <<Program>> makes writing this specification straightforward. *) - -(* begin hide *) -From Coq Require Import List. -Import ListNotations. -(* end hide *) -#[program] -Definition head {a} (l : list a | l <> []) - : { x : a | exists l', x :: l' = l }. -(* begin hide *) -Abort. -(* end hide *) - -(** We recall that because `{ l: list a | l <> [] }` is not the same - as [list a], in theory we cannot just compare [l] with [x :: - l'] (we need to use [proj1_sig]). One benefit on <<Program>> is to - deal with it using an implicit coercion. 
- - Note that for the type inference to work as expected, the - unwrapped value (here, [x :: l']) needs to be the left operand of - [=]. - - Now that [head] have been specified, we have to implement it. *) - -#[program] -Definition head {a} (l: list a | l <> []) - : { x : a | exists l', cons x l' = l } := - match l with - | x :: l' => x - | [] => ! - end. - -Next Obligation. - exists l'. - reflexivity. -Qed. - -(** I want to highlight several things here: - - - We return [x] (of type [a]) rather than a gigma-type, then <<Program>> is smart - enough to wrap it. To do so, it tries to prove the post condition and because - it fails, we have to do it ourselves (this is the Obligation we solve after - the function definition.) - - The [[]] case is absurd regarding the precondition, we tell Coq that using - the bang (`!`) symbol. - - We can have a look at the extracted code: - -<< -(** val head : 'a1 list -> 'a1 **) -let head = function -| Nil -> assert false (* absurd case *) -| Cons (a, _) -> a ->> - - The implementation is pretty straightforward, but the pre- and - post conditions have faded away. Also, the absurd case is - discarded using an assertion. This means one thing: [head] should - not be used directly from the Ocaml world. "Interface" functions - have to be total. *) - -(** ** The Practice *) - -From Coq Require Import Lia. - -(** I have challenged myself to build a strongly specified library. My goal was to - define a type [vector : nat -> Type -> Type] such as [vector a n] is a list of - [n] instance of [a]. *) - -Inductive vector (a : Type) : nat -> Type := -| vcons {n} : a -> vector a n -> vector a (S n) -| vnil : vector a O. - -Arguments vcons [a n] _ _. -Arguments vnil {a}. - -(** I had three functions in mind: [take], [drop] and [extract]. I - learned few lessons. My main take-away remains: do not use - gigma-types, <<Program>> and dependent-types together. From my - point of view, Coq is not yet ready for this. 
Maybe it is possible - to make those three work together, but I have to admit I did not - find out how. As a consequence, my preconditions are defined as - extra arguments. - - To be able to specify the post conditions my three functions and - some others, I first defined [nth] to get the _nth_ element of a - vector. - - My first attempt to write [nth] was a failure. - -<< -#[program] -Fixpoint nth {a n} - (v : vector a n) (i : nat) {struct v} - : option a := - match v, i with - | vcons x _, O => Some x - | vcons x r, S i => nth r i - | vnil, _ => None - end. ->> - - raises an anomaly. *) - -#[program] -Fixpoint nth {a n} - (v : vector a n) (i : nat) {struct v} - : option a := - match v with - | vcons x r => - match i with - | O => Some x - | S i => nth r i - end - | vnil => None - end. - -(** With [nth], it is possible to give a very precise definition of [take]: *) - -#[program] -Fixpoint take {a n} - (v : vector a n) (e : nat | e <= n) - : { u : vector a e | forall i : nat, - i < e -> nth u i = nth v i } := - match e with - | S e' => match v with - | vcons x r => vcons x (take r e') - | vnil => ! - end - | O => vnil - end. - -Next Obligation. - now apply le_S_n. -Defined. - -Next Obligation. - induction i. - + reflexivity. - + apply e0. - now apply Lt.lt_S_n. -Defined. - -Next Obligation. - now apply PeanoNat.Nat.nle_succ_0 in H. -Defined. - -Next Obligation. - now apply PeanoNat.Nat.nlt_0_r in H. -Defined. - -(** As a side note, I wanted to define the post condition as follows: - [{ v': vector A e | forall (i : nat | i < e), nth v' i = nth v i - }]. However, this made the goals and hypotheses become very hard - to read and to use. Sigma-types in sigma-types: not a good - idea. 
- -<< -(** val take : 'a1 vector -> nat -> 'a1 vector **) - -let rec take v = function -| O -> Vnil -| S e' -> - (match v with - | Vcons (_, x, r) -> Vcons (e', x, (take r e')) - | Vnil -> assert false (* absurd case *)) ->> - - Then I could tackle `drop` in a very similar manner: *) - -#[program] -Fixpoint drop {a n} - (v : vector a n) (b : nat | b <= n) - : { v': vector a (n - b) | forall i, - i < n - b -> nth v' i = nth v (b + i) } := - match b with - | 0 => v - | S n => (match v with - | vcons _ r => (drop r n) - | vnil => ! - end) - end. - -Next Obligation. - now rewrite <- Minus.minus_n_O. -Defined. - -Next Obligation. - induction n; - rewrite <- eq_rect_eq; - reflexivity. -Defined. - -Next Obligation. - now apply le_S_n. -Defined. - -Next Obligation. - now apply PeanoNat.Nat.nle_succ_0 in H. -Defined. - -(** The proofs are easy to write, and the extracted code is exactly what one might - want it to be: - -<< -(** val drop : 'a1 vector -> nat -> 'a1 vector **) -let rec drop v = function -| O -> v -| S n -> - (match v with - | Vcons (_, _, r) -> drop r n - | Vnil -> assert false (* absurd case *)) ->> - - But <<Program>> really shone when it comes to implementing extract. I just - had to combine [take] and [drop]. *) - -#[program] -Definition extract {a n} (v : vector a n) - (e : nat | e <= n) (b : nat | b <= e) - : { v': vector a (e - b) | forall i, - i < (e - b) -> nth v' i = nth v (b + i) } := - take (drop v b) (e - b). - - -Next Obligation. - transitivity e; auto. -Defined. - -Next Obligation. - now apply PeanoNat.Nat.sub_le_mono_r. -Defined. - -Next Obligation. - destruct drop; cbn in *. - destruct take; cbn in *. - rewrite e1; auto. - rewrite <- e0; auto. - lia. -Defined. - -(** The proofs are straightforward because the specifications of [drop] and - [take] are precise enough, and we do not need to have a look at their - implementations. The extracted version of [extract] is as clean as we can - anticipate. 
- -<< -(** val extract : 'a1 vector -> nat -> nat -> 'a1 vector **) -let extract v e b = - take (drop v b) (sub e b) ->> - *) - -(** I was pretty happy, so I tried some more. Each time, using [nth], I managed - to write a precise post condition and to prove it holds true. For instance, - given [map] to apply a function [f] to each element of a vector [v]: *) - -#[program] -Fixpoint map {a b n} (v : vector a n) (f : a -> b) - : { v': vector b n | forall i, - nth v' i = option_map f (nth v i) } := - match v with - | vnil => vnil - | vcons a v => vcons (f a) (map v f) - end. - -Next Obligation. - induction i. - + reflexivity. - + apply e. -Defined. - -(** I also managed to specify and write [append]: *) - -Program Fixpoint append {a n m} - (v : vector a n) (u : vector a m) - : { w : vector a (n + m) | forall i, - (i < n -> nth w i = nth v i) /\ - (n <= i -> nth w i = nth u (i - n)) - } := - match v with - | vnil => u - | vcons a v => vcons a (append v u) - end. - -Next Obligation. - split. - + now intro. - + intros _. - now rewrite PeanoNat.Nat.sub_0_r. -Defined. - -Next Obligation. - rename wildcard' into n. - destruct (Compare_dec.lt_dec i (S n)); split. - + intros _. - destruct i. - ++ reflexivity. - ++ cbn. - specialize (a1 i). - destruct a1 as [a1 _]. - apply a1. - auto with arith. - + intros false. - lia. - + now intros. - + intros ord. - destruct i. - ++ lia. - ++ cbn. - specialize (a1 i). - destruct a1 as [_ a1]. - apply a1. - auto with arith. -Defined. - -(** Finally, I tried to implement [map2] that takes a vector of [a], a vector of - [b] (both of the same size) and a function [f : a -> b -> c] and returns a - vector of [c]. - - First, we need to provide a precise specification for [map2]. To do that, we - introduce [option_app], a function that Haskellers know all to well as being - part of the <<Applicative>> type class. 
*) - -Definition option_app {a b} - (opf: option (a -> b)) - (opx: option a) - : option b := - match opf, opx with - | Some f, Some x => Some (f x) - | _, _ => None -end. - -(** We thereafter use [<$>] as an infix operator for [option_map] and [<*>] as - an infix operator for [option_app]. *) - -Infix "<$>" := option_map (at level 50). -Infix "<*>" := option_app (at level 55). - -(** Given two vectors [v] and [u] of the same size and a function [f], and given - [w] the result computed by [map2], then we can propose the following - specification for [map2]: - - [forall (i : nat), nth w i = f <$> nth v i <*> nth u i] - - This reads as follows: the [i]th element of [w] is the result of applying - the [i]th elements of [v] and [u] to [f]. - - It turns out implementing [map2] with the <<Program>> framework has proven - to be harder than I originally expected. My initial attempt was the - following: - -<< -#[program] -Fixpoint map2 {a b c n} - (v : vector a n) (u : vector b n) - (f : a -> b -> c) {struct v} - : { w: vector c n | forall i, - nth w i = f <$> nth v i <*> nth u i - } := - match v, u with - | vcons x rst, vcons x' rst' => - vcons (f x x') (map2 rst rst' f) - | vnil, vnil => vnil - | _, _ => ! - end. ->> - -<< -Illegal application: -The term "@eq" of type "forall A : Type, A -> A -> Prop" -cannot be applied to the terms - "nat" : "Set" - "S wildcard'" : "nat" - "b" : "Type" -The 3rd term has type "Type" which should be coercible -to "nat". ->> - *) - -#[program] -Fixpoint map2 {a b c n} - (v : vector a n) (u : vector b n) - (f : a -> b -> c) {struct v} - : { w: vector c n | forall i, - nth w i = f <$> nth v i <*> nth u i - } := _. - -Next Obligation. - dependent induction v; dependent induction u. - + remember (IHv u f) as u'. - inversion u'. - refine (exist _ (vcons (f a0 a1) x) _). - intros i. - induction i. - * reflexivity. - * apply (H i). - + refine (exist _ vnil _). - reflexivity. -Qed. - -(** ** Is It Usable? 
*) - -(** This post mostly gives the "happy ends" for each function. I think I tried - to hard for what I got in return and therefore I am convinced <<Program>> is - not ready (at least for a dependent type, I cannot tell for the rest). For - instance, I found at least one bug in Program logic (I still have to report - it). Have a look at the following code: - -<< -#[program] -Fixpoint map2 {a b c n} - (u : vector a n) (v : vector b n) - (f : a -> b -> c) {struct v} - : vector c n := - match u with - | _ => vnil - end. ->> - - It gives the following error: - -<< -Error: Illegal application: -The term "@eq" of type "forall A : Type, A -> A -> Prop" -cannot be applied to the terms - "nat" : "Set" - "0" : "nat" - "wildcard'" : "vector A n'" -The 3rd term has type "vector A n'" which should be -coercible to "nat". ->> - *) diff --git a/site/posts/StronglySpecifiedFunctionsRefine.md b/site/posts/StronglySpecifiedFunctionsRefine.md new file mode 100644 index 0000000..eb1fff9 --- /dev/null +++ b/site/posts/StronglySpecifiedFunctionsRefine.md @@ -0,0 +1,409 @@ +--- +published: 2015-01-11 +tags: ['coq'] +series: + parent: series/StronglySpecifiedFunctions.html + next: posts/StronglySpecifiedFunctionsProgram.html +abstract: | + We see how to implement strongly-specified list manipulation functions in + Coq. Strong specifications are used to ensure some properties on functions' + arguments and return value. It makes Coq type system very expressive. +--- + +# Implementing Strongly-Specified Functions with the `refine`{.coq} Tactic + +I started to play with Coq, the interactive theorem prover +developed by Inria, a few weeks ago. It is a very powerful tool, +yet hard to master. Fortunately, there are some very good readings +if you want to learn (I recommend the Coq'Art). This article is +not one of them. + +In this article, we will see how to implement strongly specified +list manipulation functions in Coq. 
Strong specifications are used
+to ensure some properties on functions' arguments and return
+value. It makes the Coq type system very expressive. Thus, it is
+possible to specify in the type of the function `pop`{.coq} that the return
+value is the list passed as an argument in which the first element has been
+removed, for example.
+
+## Is This List Empty?
+
+It's the first question to deal with when manipulating
+lists. There are some functions that require their arguments not
+to be empty. It's the case for the `pop`{.coq} function, for instance:
+it is not possible to remove the first element of a list that does
+not have any elements in the first place.
+
+When one wants to answer such a question as “Is this list empty?”,
+one has to keep in mind that there are two ways to do it: by a
+predicate or by a boolean function. Indeed, `Prop`{.coq} and `bool`{.coq} are
+two different worlds that do not mix easily. One solution is to
+write two definitions and to prove their equivalence. That is
+`forall args, predicate args <-> bool_function args = true`{.coq}.
+
+Another solution is to use the `sumbool`{.coq} type as a middleman. The
+scheme is the following:
+
+- Defining `predicate : args → Prop`{.coq}
+- Defining `predicate_dec : args -> { predicate args } + { ~predicate args }`{.coq}
+- Defining `predicate_b`{.coq}:
+
+```coq
+Definition predicate_b (args) :=
+  if predicate_dec args then true else false.
+```
+
+### Defining the `empty`{.coq} Predicate
+
+A list is empty if it is `[]`{.coq} (`nil`{.coq}). It's as simple as that!
+
+```coq
+Definition empty {a} (l : list a) : Prop := l = [].
+```
+
+### Defining a Decidable Version of `empty`{.coq}
+
+A decidable version of `empty`{.coq} is a function which takes a list
+`l`{.coq} as its argument and returns either a proof that `l`{.coq} is empty,
+or a proof that `l`{.coq} is not empty. 
This is encoded in the Coq
+standard library with the `sumbool`{.coq} type, and is written as
+follows: `{ empty l } + { ~ empty l }`{.coq}.
+
+```coq
+Definition empty_dec {a} (l : list a)
+  : { empty l } + { ~ empty l }.
+Proof.
+  refine (match l with
+          | [] => left _ _
+          | _ => right _ _
+          end);
+    unfold empty; trivial.
+  unfold not; intro H; discriminate H.
+Defined.
+```
+
+In this example, I decided to use the `refine`{.coq} tactic, which is
+convenient when we manipulate the `Set`{.coq} and `Prop`{.coq} sorts at the
+same time.
+
+### Defining `empty_b`{.coq}
+
+With `empty_dec`{.coq}, we can define `empty_b`{.coq}.
+
+```coq
+Definition empty_b {a} (l : list a) : bool :=
+  if empty_dec l then true else false.
+```
+
+Let's try to extract `empty_b`{.coq}:
+
+```ocaml
+type bool =
+| True
+| False
+
+type sumbool =
+| Left
+| Right
+
+type 'a list =
+| Nil
+| Cons of 'a * 'a list
+
+(** val empty_dec : 'a1 list -> sumbool **)
+
+let empty_dec = function
+| Nil -> Left
+| Cons (a, l0) -> Right
+
+(** val empty_b : 'a1 list -> bool **)
+
+let empty_b l =
+  match empty_dec l with
+  | Left -> True
+  | Right -> False
+```
+
+In addition to `'a list`{.ocaml}, Coq has created the `sumbool`{.ocaml} and
+`bool`{.ocaml} types, and `empty_b`{.ocaml} is basically a translation from the
+former to the latter. We could have stopped with `empty_dec`{.ocaml}, but
+`Left`{.ocaml} and `Right`{.ocaml} are less readable than `True`{.ocaml} and
+`False`{.ocaml}. Note that it is possible to configure the Extraction mechanism
+to use primitive OCaml types instead, but this is out of the scope of this
+article.
+
+## Defining Some Utility Functions
+
+### Defining `pop`{.coq}
+
+There are several ways to write a function that removes the first
+element of a list. One is to return `nil`{.coq} if the given list was
+already empty:
+
+```coq
+Definition pop {a} (l : list a) :=
+  match l with
+  | _ :: l => l
+  | [] => []
+  end.
+```
+
+But it's not really satisfying. 
A `pop`{.coq} call over an empty list should not be
+possible. It can be done by adding an argument to `pop`{.coq}: the proof that the
+list is not empty.
+
+```coq
+Definition pop {a} (l : list a) (h : ~ empty l)
+  : list a.
+```
+
+There are, as usual when it comes to lists, two cases to
+consider.
+
+- `l = x :: rst`{.coq}, and therefore `pop (x :: rst) h`{.coq} is `rst`{.coq}
+- `l = []`{.coq}, which is not possible since we know `l`{.coq} is not empty.
+
+The challenge is to convince Coq that our reasoning is
+correct. There are, again, several approaches to achieve that. We
+can, for instance, use the `refine`{.coq} tactic again, but this time we
+need to know a small trick to succeed, as using a “regular” `match`{.coq}
+will not work.
+
+From the following goal:
+
+```
+  a : Type
+  l : list a
+  h : ~ empty l
+  ============================
+  list a
+```
+
+Using the `refine`{.coq} tactic naively, for instance this way:
+
+```coq
+  refine (match l with
+          | _ :: rst => rst
+          | [] => _
+          end).
+```
+
+leaves us the following goal to prove:
+
+```
+  a : Type
+  l : list a
+  h : ~ empty l
+  ============================
+  list a
+```
+
+Nothing has changed! Well, not exactly. See, `refine`{.coq} has taken
+our incomplete Gallina term, found a hole, done some
+type-checking, found that the type of the missing piece of our
+implementation is `list a`{.coq}, and therefore has generated a new
+goal of this type. What `refine`{.coq} has not done, however, is
+remember that we are in the case where `l = []`{.coq}.
+
+We need to generate a goal from a hole wherein this information is
+available. It is possible to use a long form of `match`{.coq}. The
+general approach is this: rather than returning a value of type
+`list a`{.coq}, our match will return a function of type `l = ?l' ->
+list a`{.coq}, where `?l'`{.coq} is a value of `l`{.coq} for a given case (that is,
+either `x :: rst`{.coq} or `[]`{.coq}). 
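
+
+As a self-contained sketch of this trick, here is the same pattern condensed
+into a single definition on `nat`{.coq} rather than lists (`pred_strict`{.coq}
+is a name made up for this illustration):
+
+```coq
+(* A sketch of the long form of match: the match returns a function
+   of type n = ?n' -> nat, so each branch receives the equality it
+   needs. pred_strict is a hypothetical name for this example. *)
+Definition pred_strict (n : nat) (h : n <> 0) : nat :=
+  match n as n' return n = n' -> nat with
+  | S m => fun _ => m
+  | O => fun equ => False_rect nat (h equ)
+  end eq_refl.
+```
+
+In the `O`{.coq} branch, the proof `equ : n = 0`{.coq} is available, so the
+absurd case can be discharged with `h`{.coq}.
+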
Of course, and as a consequence, the type
+of the `match`{.coq} is now a function which awaits a proof to return
+the expected result. Fortunately, this proof is trivial: it is
+`eq_refl`{.coq}.
+
+```coq
+  refine (match l as l'
+                return l = l' -> list a
+          with
+          | _ :: rst => fun _ => rst
+          | [] => fun equ => _
+          end eq_refl).
+```
+
+This puts us in a way better position to conclude the proof.
+
+```
+  a : Type
+  l : list a
+  h : ~ empty l
+  equ : l = []
+  ============================
+  list a
+```
+
+We conclude the proof, and therefore the definition of `pop`{.coq}.
+
+```coq
+  rewrite equ in h.
+  exfalso.
+  now apply h.
+Defined.
+```
+
+It's better, and yet it can still be improved. Indeed, according to its type,
+`pop`{.coq} returns “some list.” As a matter of fact, `pop`{.coq} returns “the
+same list without its first argument.” It is possible to write
+such a precise definition thanks to sigma types, defined as:
+
+```coq
+Inductive sig (A : Type) (P : A -> Prop) : Type :=
+  exist : forall (x : A), P x -> sig P.
+```
+
+Rather than `sig A P`{.coq}, sigma types can be written using the
+notation `{ a | P }`{.coq}. They express subsets, and can be used to constrain
+arguments and results of functions.
+
+We finally propose a strongly specified definition of `pop`{.coq}.
+
+```coq
+Definition pop {a} (l : list a | ~ empty l)
+  : { l' | exists a, proj1_sig l = cons a l' }.
+```
+
+If you think the previous use of the `match`{.coq} term was ugly, brace yourselves.
+
+```coq
+  refine (match proj1_sig l as l'
+                return proj1_sig l = l'
+                       -> { l' | exists a, proj1_sig l = cons a l' }
+          with
+          | [] => fun equ => _
+          | (_ :: rst) => fun equ => exist _ rst _
+          end eq_refl).
+```
+
+This leaves us two goals to tackle.
+
+First, we need to discard the case where `l`{.coq} is the empty list.
+
+```
+  a : Type
+  l : {l : list a | ~ empty l}
+  equ : proj1_sig l = []
+  ============================
+  {l' : list a | exists a0 : a, proj1_sig l = a0 :: l'}
+```
+
+```coq
+  + destruct l as [l nempty]; cbn in *.
+    rewrite equ in nempty.
+    exfalso.
+    now apply nempty.
+```
+
+Then, we need to prove that the result we provide (`rst`{.coq}) when the
+list is not empty is correct with respect to the specification of
+`pop`{.coq}.
+
+```
+  a : Type
+  l : {l : list a | ~ empty l}
+  a0 : a
+  rst : list a
+  equ : proj1_sig l = a0 :: rst
+  ============================
+  exists a1 : a, proj1_sig l = a1 :: rst
+```
+
+```coq
+  + destruct l as [l nempty]; cbn in *.
+    rewrite equ.
+    now exists a0.
+Defined.
+```
+
+Let's have a look at the extracted code:
+
+```ocaml
+(** val pop : 'a1 list -> 'a1 list **)
+
+let pop = function
+| Nil -> assert false (* absurd case *)
+| Cons (a, l0) -> l0
+```
+
+If one tries to call `pop nil`{.coq}, the `assert`{.ocaml} ensures the call fails. Extra
+information given by the sigma type has been stripped away. It can be
+confusing, and in practice it means that, when you rely on the extraction
+mechanism to provide a certified OCaml module, you _cannot expose
+strongly specified functions in its public interface_, because nothing in the
+OCaml type system will prevent a misuse which will, in practice, lead to an
+`assert false`{.ocaml}.
+
+### Defining `push`{.coq}
+
+It is possible to specify `push`{.coq} the same way `pop`{.coq} has been. The only
+difference is that `push`{.coq} accepts lists with no restriction at all. Thus, its
+definition is simpler, and we can write it without `refine`{.coq}.
+
+```coq
+Definition push {a} (l : list a) (x : a)
+  : { l' | l' = x :: l } :=
+  exist _ (x :: l) eq_refl.
+```
+
+And the extracted code is just as straightforward. 
+
+```ocaml
+let push l a =
+  Cons (a, l)
+```
+
+### Defining `head`{.coq}
+
+As with `pop`{.coq} and `push`{.coq}, it is possible to add extra information in the
+type of `head`{.coq}, namely that the value returned by `head`{.coq} is indeed the
+first value of `l`{.coq}.
+
+```coq
+Definition head {a} (l : list a | ~ empty l)
+  : { x | exists r, proj1_sig l = x :: r }.
+```
+
+It's not a surprise that its definition is very close to that of `pop`{.coq}.
+
+```coq
+  refine (match proj1_sig l as l'
+                return proj1_sig l = l' -> _
+          with
+          | [] => fun equ => _
+          | x :: _ => fun equ => exist _ x _
+          end eq_refl).
+```
+
+The proofs are also very similar, and are left as an exercise for
+passionate readers.
+
+```coq
+  + destruct l as [l falso]; cbn in *.
+    rewrite equ in falso.
+    exfalso.
+    now apply falso.
+  + exists l0.
+    now rewrite equ.
+Defined.
+```
+
+Finally, the extracted code is as straightforward as it can get. 
diff --git a/site/posts/StronglySpecifiedFunctionsRefine.v b/site/posts/StronglySpecifiedFunctionsRefine.v deleted file mode 100644 index 4ffb385..0000000 --- a/site/posts/StronglySpecifiedFunctionsRefine.v +++ /dev/null @@ -1,396 +0,0 @@ -(** * Implementing Strongly-Specified Functions with the <<refine>> Tactic *) - -(** This is the first article (initially published on #<span - class="original-created-at">January 11, 2015</span>#) of a series of two on - how to write strongly-specified functions in Coq. You can read the next part - #<a href="./StronglySpecifiedFunctionsProgram.html">here</a>#. - - I started to play with Coq, the interactive theorem prover - developed by Inria, a few weeks ago. It is a very powerful tool, - yet hard to master. Fortunately, there are some very good readings - if you want to learn (I recommend the Coq'Art). This article is - not one of them. - - In this article, we will see how to implement strongly-specified - list manipulation functions in Coq. Strong specifications are used - to ensure some properties on functions' arguments and return - value. It makes Coq type system very expressive. Thus, it is - possible to specify in the type of the function [pop] that the - return value is the list passed in argument in which the first - element has been removed for example. - - #<nav id="generate-toc"></nav># - - #<div id="history">site/posts/StronglySpecifiedFunctionsRefine.v</div># *) - -(** ** Is this list empty? *) - -(** Since we will manipulate lists in this article, we first - enable several notations of the standard library. *) - -From Coq Require Import List. -Import ListNotations. - -(** It's the first question to deal with when manipulating - lists. There are some functions that require their arguments not - to be empty. It's the case for the [pop] function, for instance: - it is not possible to remove the first element of a list that does - not have any elements in the first place. 
- - When one wants to answer such a question as “Is this list empty?”, - he has to keep in mind that there are two ways to do it: by a - predicate or by a boolean function. Indeed, [Prop] and [bool] are - two different worlds that do not mix easily. One solution is to - write two definitions and to prove their equivalence. That is - [forall args, predicate args <-> bool_function args = true]. - - Another solution is to use the [sumbool] type as middleman. The - scheme is the following: - - - Defining [predicate : args → Prop] - - Defining [predicate_dec : args -> { predicate args } + { ~predicate args }] - - Defining [predicate_b]: - -<< -Definition predicate_b (args) := - if predicate_dec args then true else false. ->> - *) - -(** *** Defining the <<empty>> predicate *) - -(** A list is empty if it is [[]] ([nil]). It's as simple as that! *) - -Definition empty {a} (l : list a) : Prop := l = []. - -(** *** Defining a decidable version of <<empty>> *) - -(** A decidable version of [empty] is a function which takes a list - [l] as its argument and returns either a proof that [l] is empty, - or a proof that [l] is not empty. This is encoded in the Coq - standard library with the [sumbool] type, and is written as - follows: [{ empty l } + { ~ empty l }]. *) - -Definition empty_dec {a} (l : list a) - : { empty l } + { ~ empty l }. - -Proof. - refine (match l with - | [] => left _ _ - | _ => right _ _ - end); - unfold empty; trivial. - unfold not; intro H; discriminate H. -Defined. - -(** In this example, I decided to use the [refine] tactic which is - convenient when we manipulate the [Set] and [Prop] sorts at the - same time. *) - -(** *** Defining <<empty_b>> *) - -(** With [empty_dec], we can define [empty_b]. *) - -Definition empty_b {a} (l : list a) : bool := - if empty_dec l then true else false. 
- -(** Let's try to extract [empty_b]: - -<< -type bool = -| True -| False - -type sumbool = -| Left -| Right - -type 'a list = -| Nil -| Cons of 'a * 'a list - -(** val empty_dec : 'a1 list -> sumbool **) - -let empty_dec = function -| Nil -> Left -| Cons (a, l0) -> Right - -(** val empty_b : 'a1 list -> bool **) - -let empty_b l = - match empty_dec l with - | Left -> True - | Right -> False ->> - - In addition to <<list 'a>>, Coq has created the <<sumbool>> and - <<bool>> types, and [empty_b] is basically a translation from the - former to the latter. We could have stopped with [empty_dec], but - [Left] and [Right] are less readable than [True] and [False]. Note - that it is possible to configure the extraction mechanism to use - primitive OCaml types instead, but this is out of the scope of - this article. *) - -(** ** Defining some utility functions *) - -(** *** Defining [pop] *) - -(** There are several ways to write a function that removes the first - element of a list. One is to return [nil] if the given list is - already empty: *) - -Definition pop {a} (l : list a) := - match l with - | _ :: l => l - | [] => [] - end. - -(** But it's not really satisfying. A [pop] call over an empty list - should not be possible. This can be enforced by adding an argument to - [pop]: a proof that the list is not empty. *) - -(* begin hide *) -Reset pop. -(* end hide *) -Definition pop {a} (l : list a) (h : ~ empty l) - : list a. - -(** There are, as usual when it comes to lists, two cases to - consider. - - - [l = x :: rst], and therefore [pop (x :: rst) h] is [rst] - - [l = []], which is not possible since we know [l] is not empty. - - The challenge is to convince Coq that our reasoning is - correct. There are, again, several approaches to achieve that. We - can, for instance, use the [refine] tactic again, but this time we - need to know a small trick to succeed, as using a “regular” [match] - will not work. 
- - From the following goal: - -<< - a : Type - l : list a - h : ~ empty l - ============================ - list a ->> - *) - -(** Using the [refine] tactic naively, for instance this way: *) - - refine (match l with - | _ :: rst => rst - | [] => _ - end). - -(** leaves us the following goal to prove: - -<< - a : Type - l : list a - h : ~ empty l - ============================ - list a ->> - - Nothing has changed! Well, not exactly. See, [refine] has taken - our incomplete Gallina term, found a hole, done some - type-checking, found that the type of the missing piece of our - implementation is [list a], and therefore has generated a new - goal of this type. What [refine] has not done, however, is - remember that we are in the case where [l = []]! - - We need to generate a goal from a hole wherein this information is - available. This is possible using the long form of [match]. The - general approach is this: rather than returning a value of type - [list a], our match will return a function of type [l = ?l' -> - list a], where [?l'] is the value of [l] for a given case (that is, - either [x :: rst] or [[]]). As a consequence, the type - of the [match] is now a function which awaits a proof to return - the expected result. Fortunately, this proof is trivial: it is - [eq_refl]. *) - -(* begin hide *) - Undo. -(* end hide *) - refine (match l as l' - return l = l' -> list a - with - | _ :: rst => fun _ => rst - | [] => fun equ => _ - end eq_refl). - -(** The resulting goal is way better for concluding the proof. - -<< - a : Type - l : list a - h : ~ empty l - equ : l = [] - ============================ - list a ->> - - We conclude the proof, and therefore the definition of [pop]. *) - - rewrite equ in h. - exfalso. - now apply h. -Defined. - -(** It's better, and yet it can still be improved. Indeed, according to its type, - [pop] returns “some list”. As a matter of fact, [pop] returns “the - same list without its first element”. 
It is possible to write - such a precise definition thanks to sigma-types, defined as: - -<< -Inductive sig (A : Type) (P : A -> Prop) : Type := - exist : forall (x : A), P x -> sig P. ->> - - Rather than [sig A P], sigma-types can be written using the - notation [{ x | P }]. They express subsets, and can be used to constrain - arguments and results of functions. - - We finally propose a strongly-specified definition of [pop]. *) - -(* begin hide *) -Reset pop. -(* end hide *) -Definition pop {a} (l : list a | ~ empty l) - : { l' | exists a, proj1_sig l = cons a l' }. - -(** If you think the previous use of the [match] construct was ugly, brace yourselves. *) - - refine (match proj1_sig l as l' - return proj1_sig l = l' - -> { l' | exists a, proj1_sig l = cons a l' } - with - | [] => fun equ => _ - | (_ :: rst) => fun equ => exist _ rst _ - end eq_refl). - -(** This leaves us with two goals to tackle. - - First, we need to discard the case where [l] is the empty list. - -<< - a : Type - l : {l : list a | ~ empty l} - equ : proj1_sig l = [] - ============================ - {l' : list a | exists a0 : a, proj1_sig l = a0 :: l'} ->> - *) - - + destruct l as [l nempty]; cbn in *. - rewrite equ in nempty. - exfalso. - now apply nempty. - -(** Then, we need to prove that the result we provide ([rst]) when the - list is not empty is correct with respect to the specification of - [pop]. - -<< - a : Type - l : {l : list a | ~ empty l} - a0 : a - rst : list a - equ : proj1_sig l = a0 :: rst - ============================ - exists a1 : a, proj1_sig l = a1 :: rst ->> - *) - - + destruct l as [l nempty]; cbn in *. - rewrite equ. - now exists a0. -Defined. - -(** Let's have a look at the extracted code: - -<< -(** val pop : 'a1 list -> 'a1 list **) - -let pop = function -| Nil -> assert false (* absurd case *) -| Cons (a, l0) -> l0 ->> - - If one tries to call [pop nil], the [assert] ensures the call fails. The extra - information given by the sigma-type has been stripped away. 
It can be - confusing, and in practice it means that, when you rely on the extraction - mechanism to provide a certified OCaml module, you _cannot expose - strongly-specified functions in its public interface_, because nothing in the - OCaml type system will prevent a misuse which will, in practice, lead to an - <<assert false>>. *) - -(** ** Defining [push] *) - -(** It is possible to specify [push] the same way [pop] has been. The only - difference is that [push] accepts lists with no restriction at all. Thus, its - definition is simpler, and we can write it without [refine]. *) - -Definition push {a} (l : list a) (x : a) - : { l' | l' = x :: l } := - exist _ (x :: l) eq_refl. - -(** And the extracted code is just as straightforward. - -<< -let push l a = - Cons (a, l) ->> - *) - -(** ** Defining [head] *) - -(** As with [pop] and [push], it is possible to add extra information to the - type of [head], namely that the returned value of [head] is indeed the first - element of [l]. *) - -Definition head {a} (l : list a | ~ empty l) - : { x | exists r, proj1_sig l = x :: r }. - -(** It is no surprise that its definition is very close to [pop]'s. *) - - refine (match proj1_sig l as l' - return proj1_sig l = l' -> _ - with - | [] => fun equ => _ - | x :: _ => fun equ => exist _ x _ - end eq_refl). - -(** The proofs are also very similar, and are left as an exercise for - passionate readers. *) - - + destruct l as [l falso]; cbn in *. - rewrite equ in falso. - exfalso. - now apply falso. - + exists l0. - now rewrite equ. -Defined. - -(** Finally, the extracted code is as straightforward as it can get. - -<< -let head = function -| Nil -> assert false (* absurd case *) -| Cons (a, l0) -> a ->> - *) - -(** ** Conclusion & Moving Forward *) - -(** Writing strongly-specified functions allows for reasoning about the - correctness of the result while computing it. This can help in practice. - However, writing these functions with the [refine] tactic does not lead to - very idiomatic Coq code. 
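- - To illustrate, here is a hypothetical example (the names - [not_empty_123] and [tail_123] are mine) of what calling our - strongly-specified [pop] looks like: the caller has to bundle the - list with a proof that it is not empty. - -<< -Lemma not_empty_123 : ~ empty [1; 2; 3]. -Proof. unfold empty. discriminate. Qed. - -Definition tail_123 := pop (exist _ [1; 2; 3] not_empty_123). ->> 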
- - To improve the situation, the <<Program>> framework distributed with the Coq - standard library helps, but it is better to understand what <<Program>> achieves - under the hood, which is basically what we have done in this article. *) diff --git a/site/posts/Thanks.org b/site/posts/Thanks.org deleted file mode 100644 index 22e2d57..0000000 --- a/site/posts/Thanks.org +++ /dev/null @@ -1,47 +0,0 @@ -#+TITLE: Thanks! - -#+SERIES: ../meta.html -#+SERIES_NEXT: ../cleopatra.html - -This website could not exist without many awesome free software -projects. Although I could not list them all even if I wanted to, my -goal is at least to keep an up-to-date, curated description of -the most significant ones. - -#+BEGIN_EXPORT html -<nav id="generate-toc"></nav> -<div id="history">site/posts/Thanks.org</div> -#+END_EXPORT - -* Authoring Content - -- [[https://www.gnu.org/software/emacs][Emacs]] :: - Emacs is an extensible editor which I use daily to author this website's - content, and to write the code related to this website (and any code, really). It - is part of the [[https://www.gnu.org/gnu/gnu.html][GNU project]]. -- [[https://orgmode.org/][Org mode]] :: - Org mode is a major mode of Emacs which I use to author several posts. It was - initially written by [[https://staff.science.uva.nl/~dominik/][Carsten Dominik]], and is currently maintained by - [[http://bzg.fr/][Bastien Guerry]]. -- [[https://coq.inria.fr/][Coq]] :: - Coq is a theorem prover and a proof assistant built by [[https://www.inria.fr/fr][Inria]]. Many of my posts - on Coq are regular Coq files processed by ~coqdoc~. - -* Static Website Generation - -- [[https://soupault.app][soupault]] :: - Soupault is a static website generator and HTML processor written by [[https://www.baturin.org/][Daniil - Baturin]]. -- [[https://cleopatra.soap.coffee][~cleopatra~]] :: - ~cleopatra~ is a generic, extensible toolchain with facilities for - literate programming projects using Org mode and more. 
I have - written it for this very website. - -* Frontend - -- [[https://katex.org][\im \KaTeX \mi]] :: - \im \KaTeX \mi is the “fastest” math typesetting library for the web, and is - used to render inline mathematics in my posts at build time. It has been - created by [[https://github.com/xymostech][Emily Eisenberg]] and - [[https://sophiebits.com/][Sophie Alpert]], with the help of - [[https://github.com/KaTeX/KaTeX/graphs/contributors][many contributors]]. diff --git a/site/posts/index.md b/site/posts/index.md new file mode 100644 index 0000000..c1ac8b5 --- /dev/null +++ b/site/posts/index.md @@ -0,0 +1,3 @@ +# Archives + +@[archives](all) |