cleopatra: Bootstrapping an Extensible Toolchain

#+END_EXPORT

#+BEGIN_TODO
You are about to read the first version of *~cleopatra~*, the toolchain initially implemented to build the website you are reading. Since then, *~cleopatra~* has been completely rewritten as an [[https://cleopatra.soap.coffee][independent, more generic command-line program]]. That being said, the build process described in this write-up remains the one implemented in *~cleopatra~* the Second.
#+END_TODO

A literate program is a particular type of software program where code is not directly written in source files, but rather in text documents as code snippets. In some sense, literate programming allows for writing in the same place both the software program and its technical documentation.

That being said, *~cleopatra~* is a toolchain to build a website before being a literate program, and one of its objectives is to be /part of this very website it is used to generate/. To achieve this, *~cleopatra~* has been written as a collection of Org files which can be either “tangled” using [[https://orgmode.org/worg/org-contrib/babel/][Babel]] or “exported” as an HTML document. Tangling here refers to extracting marked code blocks into files.

The page you are currently reading is the entry point of *~cleopatra~*. Its primary purpose is to introduce two Makefiles: ~Makefile~ and ~bootstrap.mk~.

#+TOC: headlines 2

#+BEGIN_EXPORT html
#+END_EXPORT

~Makefile~ serves two purposes: it initializes a few global variables, and it provides a rule to generate ~bootstrap.mk~.

At this point, some readers may wonder /why/ we need ~Makefile~ in this context. The motivation behind this choice is reminiscent of a boot sequence: we need a “starting point” for *~cleopatra~*. The toolchain cannot live solely inside Org files, otherwise there would not be any code to execute the first time we tried to generate the website. We need an initial Makefile, one that has little chance to change, so that we can almost consider it read-only. Contrary to the other Makefiles that we will generate, this one will not be deleted by ~make clean~.

This is similar to your computer: it requires a firmware to boot, whose purpose, in a nutshell, is to find and load an operating system.

Modifying the content of ~Makefile~ in this document /will/ modify ~Makefile~. This means one can easily put *~cleopatra~* into an inconsistent state, which would prevent further generation. This is why the generated ~Makefile~ should be versioned, so that you can restore it using ~git~ if you made a mistake when you modified it.

For readers interested in using *~cleopatra~* for their own websites, this document tries to highlight the potential modifications they would have to make.

* Global Constants and Variables

First, ~Makefile~ defines several global “constants” (as far as I know, ~make~ does not support true constant values, but further generation processes are expected not to modify them).

In a nutshell,

- ~ROOT~ :: Tell Emacs where the root of your website sources is, so that tangled output filenames can be given relative to it rather than to the Org files. For instance, the ~BEGIN_SRC~ tangle parameter for ~Makefile~ looks like ~:tangle Makefile~, instead of ~:tangle ../../Makefile~.
- ~CLEODIR~ :: Tell *~cleopatra~* where its sources live.
  If you place it inside the ~site/~ directory (as intended), and you enable the use of ~org~ files to author your contents, then the *~cleopatra~* documents will be part of your website. If you don’t want that, just move the directory outside the ~site/~ directory, and update the ~CLEODIR~ variable accordingly.

For this website, these constants are defined as follows.

#+BEGIN_SRC makefile :tangle Makefile :noweb no-export
ROOT := $(shell pwd)
CLEODIR := site/cleopatra
#+END_SRC

We then introduce two variables to list the output of the generation processes, with two purposes in mind: keeping the ~.gitignore~ up-to-date automatically, and providing rules to remove them.

- ~ARTIFACTS~ :: Short-term artifacts which can be removed frequently without too much hassle. They will be removed by ~make clean~.
- ~CONFIGURE~ :: Long-term artifacts whose generation can be time consuming. They will only be removed by ~make cleanall~.

#+BEGIN_SRC makefile :tangle Makefile
ARTIFACTS := build.log
CONFIGURE :=
#+END_SRC

Generation processes shall declare new build outputs using the ~+=~ assignment operator. Using another operator will likely produce an undesirable result, such as discarding the outputs declared by other generation processes.

* Easy Tangling of Org Documents

*~cleopatra~* is a literate program implemented with Org mode, an Emacs major editing mode. We provide the necessary bits to easily tangle Org documents.

The configuration of Babel is done using an Emacs Lisp script called ~tangle-org.el~ whose status is similar to ~Makefile~. It is part of the bootstrap process, and therefore lives “outside” of *~cleopatra~* (it is not deleted with ~make clean~ for instance). However, it is overwritten each time this document is tangled. If you try to modify it and find that *~cleopatra~* does not work properly, you should restore it using ~git~.

#+BEGIN_SRC emacs-lisp :tangle scripts/tangle-org.el
(require 'org)
;; Resolve tangled filenames relative to the root of the website sources
(cd (getenv "ROOT"))
;; Do not ask for confirmation before evaluating code blocks
(setq org-confirm-babel-evaluate nil)
(setq org-src-preserve-indentation t)
;; Create the missing parent directories of tangled files
(add-to-list 'org-babel-default-header-args '(:mkdirp . "yes"))
(org-babel-do-load-languages 'org-babel-load-languages '((shell . t)))
(org-babel-tangle)
#+END_SRC

We define variables that ensure that the ~ROOT~ environment variable is set and ~tangle-org.el~ is loaded when using Emacs.

#+BEGIN_SRC makefile :tangle Makefile
EMACSBIN := emacs
EMACS := ROOT="${ROOT}" ${EMACSBIN}
TANGLE := --batch \
          --load="${ROOT}/scripts/tangle-org.el" \
          2>> build.log
#+END_SRC

Finally, we introduce a [[https://www.gnu.org/software/make/manual/html_node/Canned-Recipes.html#Canned-Recipes][canned recipe]] to seamlessly tangle a given file.

#+BEGIN_SRC makefile :tangle Makefile
define emacs-tangle =
echo " tangle $<"
${EMACS} $< ${TANGLE}
endef
#+END_SRC

* Bootstrapping

The core purpose of ~Makefile~ remains to bootstrap the chain of generation processes. This chain is divided into three stages: ~prebuild~, ~build~, and ~postbuild~.

This translates as follows in ~Makefile~.

#+BEGIN_SRC makefile :tangle Makefile
default : postbuild ignore

init :
	@rm -f build.log

prebuild : init

build : prebuild

postbuild : build

.PHONY : init prebuild build postbuild ignore
#+END_SRC

A *generation process* in *~cleopatra~* is a Makefile which provides rules for these three stages, along with the utilities used by these rules. More precisely, a generation process ~proc~ is defined in ~proc.mk~. The rules of ~proc.mk~ for each stage are expected to be prefixed by ~proc-~, /e.g./, ~proc-prebuild~ for the ~prebuild~ stage.

Eventually, the following dependencies are expected within the chain of generation processes.

#+BEGIN_SRC makefile
prebuild : proc-prebuild
build : proc-build
postbuild : proc-postbuild

proc-build : proc-prebuild
proc-postbuild : proc-build
#+END_SRC

Because *~cleopatra~* is a literate program, generation processes are defined in Org documents (which may contain additional utilities like scripts or templates), and therefore need to be tangled before they can be used.
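
To make these conventions concrete, here is a minimal sketch of what the Makefile of a generation process could look like once tangled. The process name ~notes~, the ~out/notes~ directory and the recipes are hypothetical and chosen for illustration only; they are not part of *~cleopatra~*. The sketch assumes it is included from ~Makefile~, so that the three stages and ~ARTIFACTS~ are already defined.

#+BEGIN_SRC makefile
# notes.mk: hypothetical generation process, for illustration only

# Hook each stage of the process into the global chain.
prebuild : notes-prebuild
build : notes-build
postbuild : notes-postbuild

# Within the process, stages run in order.
notes-build : notes-prebuild
notes-postbuild : notes-build

notes-prebuild :
	@echo " notes prebuild"

notes-build :
	@echo " notes build"
	@mkdir -p out/notes

notes-postbuild :
	@echo " notes postbuild"

# Declare the (hypothetical) build output with += so that outputs
# declared elsewhere are kept.
ARTIFACTS += out/notes

.PHONY : notes-prebuild notes-build notes-postbuild
#+END_SRC

With such a file included, ~make build~ would run ~notes-prebuild~ before ~notes-build~, since ~build~ depends on ~notes-build~, which itself depends on ~notes-prebuild~.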
*~cleopatra~* relies on a particular behavior of ~make~ regarding the ~include~ directive. If there exists a rule to generate a Makefile used as an operand of ~include~, ~make~ will use this rule to update (if necessary) said Makefile before actually including it.

Therefore, rules of the following form achieve our ambition of extensibility.

#+BEGIN_SRC makefile :noweb yes
<>
#+END_SRC

where

- ~${IN}~ is the Org document which contains the generation process code
- ~${PROC}~ is the name of the generation process
- ~${AUX}~ lists the utilities of the generation process tangled from ~${IN}~ with ~${PROC}.mk~

We use ~&:~ in place of ~:~ to separate the targets from their dependencies in the “tangle rule.” This tells ~make~ that a single run of the recipe of this rule generates all these files (a grouped-target rule, which requires GNU ~make~ 4.3 or later).

Writing these rules manually, as yours truly had to do in the early days of his website, has proven to be error-prone.

One desirable feature for *~cleopatra~* would be to generate them automatically, by looking for relevant ~:tangle~ directives inside the input Org document. The challenge lies in the “relevant” part: there is a risk of false positives. However, as a first step towards a fully automated solution, we can leverage the evaluation features of Babel here.

Here is a bash script which, given the proper variables, would generate the expected Makefile rule.

#+NAME: extends
#+BEGIN_SRC bash :var PROC="" :var AUX="" :var IN="" :results output
cat <>
#+END_SRC

Beware that, as a consequence, modifying the =extends= code block is as “dangerous” as modifying ~Makefile~ itself. Keep that in mind if you start hacking *~cleopatra~*!

Additional customizations of *~cleopatra~* will be part of ~bootstrap.mk~, rather than ~Makefile~.

* Generation Processes

Using the =extends= noweb reference, *~cleopatra~* is easily extensible. In this section, we first detail the structure of a typical generation process.
Then, we construct ~bootstrap.mk~ by enumerating the generation processes that are currently used to generate the website you are reading.

Each generation process shall

1. Define ~proc-prebuild~, ~proc-build~, and ~proc-postbuild~
2. Declare dependencies between stages of generation processes
3. Declare build outputs (see ~ARTIFACTS~ and ~CONFIGURE~)

* Wrapping-up

#+BEGIN_SRC bash :tangle scripts/update-gitignore.sh :shebang "#!/bin/bash"
BEGIN_MARKER="# begin generated files"
END_MARKER="# end generated files"

# remove the previous list of generated files to ignore
sed -i -e "/${BEGIN_MARKER}/,/${END_MARKER}/d" .gitignore
# remove trailing empty lines
sed -i -e :a -e '/^\n*$/{$d;N;};/\n$/ba' .gitignore

# output the list of files to ignore
echo "" >> .gitignore
echo "${BEGIN_MARKER}" >> .gitignore
for f in "$@"; do
  echo "${f}" >> .gitignore
done
echo "${END_MARKER}" >> .gitignore
#+END_SRC

#+BEGIN_SRC makefile :tangle bootstrap.mk
ignore :
	@echo " update gitignore"
	@scripts/update-gitignore.sh \
	  ${ARTIFACTS} \
	  ${CONFIGURE}

clean :
	@rm -rf ${ARTIFACTS}

cleanall : clean
	@rm -rf ${CONFIGURE}
#+END_SRC

# Local Variables:
# org-src-preserve-indentation: t
# End: