#+BEGIN_EXPORT html
<h1><strong><code>cleopatra</code></strong>: Bootstrapping an Extensible Toolchain</h1>
#+END_EXPORT

#+BEGIN_TODO
You are about to read the first version of *~cleopatra~*, the toolchain
initially implemented to build the website you are reading. Since then,
*~cleopatra~* has been completely rewritten as a
[[https://cleopatra.soap.coffee][independent, more generic command-line
program]]. That being said, the build process described in this write-up remains
the one implemented in *~cleopatra~* the Second.
#+END_TODO

A literate program is a particular type of software program where code is not
directly written in source files, but rather in a text document as code
snippets. In some sense, literate programming allows for writing in the same
place both the software program and its technical documentation.

That being said, *~cleopatra~* is a toolchain to build a website before being a
literate program, and one of its objectives is to be /part of the very website
it is used to generate/. To achieve this, *~cleopatra~* has been written as a
collection of org files which can be either “tangled” using
[[https://orgmode.org/worg/org-contrib/babel/][Babel]] or “exported” as an HTML
document. Tangling here refers to extracting marked code blocks into files.

The page you are currently reading is the entry point of *~cleopatra~*. Its
primary purpose is to introduce two Makefiles: ~Makefile~ and ~bootstrap.mk~.

#+TOC: headlines 2

#+BEGIN_EXPORT html
<div id="history">site/posts/CleopatraV1.org</div>
#+END_EXPORT

~Makefile~ serves two purposes: it initializes a few global variables, and it
provides a rule to generate ~bootstrap.mk~.  At this point, some readers may
wonder /why/ we need ~Makefile~ in this context. The motivation behind this
choice is reminiscent of a boot sequence: we need a “starting point” for
*~cleopatra~*. The toolchain cannot live solely inside org files, otherwise
there would not be any code to execute the first time we tried to generate the
website. We need an initial Makefile, one that has little chance to change, so
that we can almost consider it read-only. Contrary to the other Makefiles that
we will generate, this one will not be deleted by ~make clean~.

This is similar to your computer: it requires firmware to boot, whose purpose,
in a nutshell, is to find and load an operating system.

Modifying the content of ~Makefile~ in this document /will/ modify
~Makefile~. This means one can easily put *~cleopatra~* into an inconsistent
state, which would prevent further generation. This is why the generated
~Makefile~ should be versioned, so that you can restore it using ~git~ if you
made a mistake when you modified it.

For readers interested in using *~cleopatra~* for their own websites, this
document tries to highlight the potential modifications they would have to
make.

* Global Constants and Variables

First, ~Makefile~ defines several global “constants” (although as far as I know
~make~ does not support true constant values, generation processes are expected
not to modify them).

In a nutshell,

- ~ROOT~ ::
  Tell Emacs where the root of your website sources is, so that tangled output
  filenames can be given relative to it rather than to the org files.  For
  instance, the ~:tangle~ header argument of the source blocks for ~Makefile~
  reads ~:tangle Makefile~ instead of ~:tangle ../../Makefile~.
- ~CLEODIR~ ::
  Tell *~cleopatra~* where its sources live. If you place it inside the ~site/~
  directory (as it is intended), and you enable the use of ~org~ files to author
  your contents, then *~cleopatra~* documents will be part of your website. If
  you don’t want that, just move the directory outside the ~site/~ directory,
  and update the ~CLEODIR~ variable accordingly.

For this website, these constants are defined as follows.

#+BEGIN_SRC makefile :tangle Makefile :noweb no-export
ROOT := $(shell pwd)
CLEODIR := site/cleopatra
#+END_SRC

We then introduce two variables to list the output of the generation processes,
with two purposes in mind: keeping the ~.gitignore~ up-to-date automatically,
and providing rules to remove them.

- ~ARTIFACTS~ ::
  Short-term artifacts which can be removed frequently without too much
  hassle. They will be removed by ~make clean~.
- ~CONFIGURE~ ::
  Long-term artifacts whose generation can be time consuming. They will only be
  removed by ~make cleanall~.

#+BEGIN_SRC makefile :tangle Makefile
ARTIFACTS := build.log
CONFIGURE :=
#+END_SRC

Generation processes shall declare new build outputs using the ~+=~ assignment
operator. Using another operator will likely produce an undesirable result.
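
To illustrate why the operator matters, here is a minimal sketch, assuming GNU
~make~ is installed. The ~show_artifacts~ function, the ~show~ target, and the
~out/index.html~ file are hypothetical and only serve the demonstration. The
function writes a throwaway Makefile in which a generation process declares an
extra artifact with the operator given as argument, then prints the resulting
value of ~ARTIFACTS~.

#+BEGIN_SRC bash
# Print the value of ARTIFACTS after a (hypothetical) generation process
# declares out/index.html using the assignment operator given as $1.
show_artifacts() {
    local op="$1" dir
    dir="$(mktemp -d)"
    {
        printf 'ARTIFACTS := build.log\n'
        printf 'ARTIFACTS %s out/index.html\n' "${op}"
        printf 'show :\n\t@echo $(ARTIFACTS)\n'
    } > "${dir}/Makefile"
    make --no-print-directory -C "${dir}" show
    rm -rf "${dir}"
}

if command -v make > /dev/null 2>&1; then
    show_artifacts '+='   # prints: build.log out/index.html
    show_artifacts ':='   # prints: out/index.html
fi
#+END_SRC

With ~:=~, ~build.log~ silently disappears from the list, so it would no longer
be removed by ~make clean~ nor listed in ~.gitignore~.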

* Easy Tangling of Org Documents

*~cleopatra~* is a literate program implemented with Org mode, an Emacs major
editing mode. We provide the necessary bits to easily tangle Org documents.

The configuration of Babel is done using an Emacs Lisp script called
~tangle-org.el~ whose status is similar to ~Makefile~. It is part of the
bootstrap process, and therefore lives “outside” of *~cleopatra~* (it is not
deleted with ~make clean~ for instance).  However, it is overwritten whenever
this document is tangled. If you try to modify it and find that *~cleopatra~*
does not work properly, you should restore it using ~git~.

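#+BEGIN_SRC emacs-lisp :tangle scripts/tangle-org.el
(require 'org)
;; resolve :tangle paths relative to the root of the website
(cd (getenv "ROOT"))
;; do not ask for confirmation before evaluating code blocks
(setq org-confirm-babel-evaluate nil)
;; keep the indentation of code blocks as-is (Makefile recipes need their tabs)
(setq org-src-preserve-indentation t)
;; create the directories of tangled files when they are missing
(add-to-list 'org-babel-default-header-args
             '(:mkdirp . "yes"))
;; allow the evaluation of shell code blocks
(org-babel-do-load-languages
 'org-babel-load-languages
 '((shell . t)))
;; tangle the current buffer
(org-babel-tangle)
#+END_SRC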

We define variables that ensure that the ~ROOT~ environment variable is set and
~tangle-org.el~ is loaded when using Emacs.

#+BEGIN_SRC makefile :tangle Makefile
EMACSBIN := emacs
EMACS := ROOT="${ROOT}" ${EMACSBIN}
TANGLE := --batch \
          --load="${ROOT}/scripts/tangle-org.el" \
          2>> build.log
#+END_SRC

Finally, we introduce a
[[https://www.gnu.org/software/make/manual/html_node/Canned-Recipes.html#Canned-Recipes][canned
recipe]] to seamlessly tangle a given file.

#+BEGIN_SRC makefile :tangle Makefile
define emacs-tangle =
echo "  tangle  $<"
${EMACS} $< ${TANGLE}
endef
#+END_SRC
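
To make the expansion of this canned recipe more concrete, the following sketch
prints the command line it amounts to for a given Org file, without actually
running Emacs. The ~tangle_command~ function is hypothetical, and the shell
variables only stand in for the ~make~ variables of the same name; the Org file
plays the role of ~$<~.

#+BEGIN_SRC bash
# Shell stand-ins for the make variables defined in Makefile.
ROOT="$(pwd)"
EMACSBIN="emacs"

# Print (do not run) the command emacs-tangle expands to when the first
# prerequisite of the rule is the Org file given as $1.
tangle_command() {
    printf 'ROOT="%s" %s %s --batch --load="%s/scripts/tangle-org.el" 2>> build.log\n' \
        "${ROOT}" "${EMACSBIN}" "$1" "${ROOT}"
}

tangle_command "site/cleopatra/Bootstrap.org"
#+END_SRC

Running the printed command by hand from the root of the website is a
convenient way to debug a tangling issue, since errors end up in ~build.log~.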

* Bootstrapping

The core purpose of ~Makefile~ remains to bootstrap the chain of generation
processes. This chain is divided into three stages: ~prebuild~, ~build~, and
~postbuild~.

This translates as follows in ~Makefile~.

#+BEGIN_SRC makefile :tangle Makefile
default : postbuild ignore

init :
	@rm -f build.log

prebuild : init

build : prebuild

postbuild : build

.PHONY : init prebuild build postbuild ignore
#+END_SRC

A *generation process* in *~cleopatra~* is a Makefile which provides rules for
these three stages, along with the utilities used by these rules. More
precisely, a generation process ~proc~ is defined in ~proc.mk~. The rules of
~proc.mk~ for each stage are expected to be prefixed by ~proc-~, /e.g./,
~proc-prebuild~ for the ~prebuild~ stage.

In the end, the following dependencies are expected within the chain of
generation processes.

#+BEGIN_SRC makefile
prebuild : proc-prebuild
build : proc-build
postbuild : proc-postbuild

proc-build : proc-prebuild
proc-postbuild : proc-build
#+END_SRC

Because *~cleopatra~* is a literate program, generation processes are defined in
Org documents (which may contain additional utilities like scripts or
templates), and therefore need to be tangled before they can be
used. *~cleopatra~* relies on a particular behavior of ~make~ regarding the
~include~ directive. If there exists a rule to generate a Makefile used as an
operand of ~include~, ~make~ will use this rule to update (if necessary) said
Makefile before actually including it.
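
This behavior can be observed in isolation with the following sketch, assuming
GNU ~make~ is installed. The ~demo_include_remake~ function, the ~generated.mk~
file, the ~GREETING~ variable, and the ~all~ target are hypothetical names:
~generated.mk~ plays the role of a tangled ~proc.mk~, and a simple ~echo~
replaces the call to Emacs.

#+BEGIN_SRC bash
# Create a throwaway Makefile which includes a file that does not exist yet,
# but for which a rule is provided, then build the `all` target.
demo_include_remake() {
    local dir
    dir="$(mktemp -d)"
    {
        printf 'include generated.mk\n\n'
        printf 'all :\n\t@echo "greeting is $(GREETING)"\n\n'
        printf 'generated.mk :\n\t@echo "GREETING := hello" > $@\n'
    } > "${dir}/Makefile"
    # make first generates generated.mk, starts over, and only then runs `all`
    make --no-print-directory -C "${dir}" all 2> /dev/null
    rm -rf "${dir}"
}

if command -v make > /dev/null 2>&1; then
    demo_include_remake   # prints: greeting is hello
fi
#+END_SRC

The ~all~ target sees the value of ~GREETING~ even though ~generated.mk~ did not
exist when ~make~ was invoked.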

Therefore, rules of the following form achieve our ambition of extensibility.

#+BEGIN_SRC makefile :noweb yes
<<extends(PROC="${PROC}", IN="${IN}", AUX="${AUX}")>>
#+END_SRC

where

- ~${IN}~ is the Org document which contains the generation process code
- ~${PROC}~ is the name of the generation process
- ~${AUX}~ lists the utilities of the generation process tangled from ~${IN}~
  alongside ~${PROC}.mk~

We use ~&:~ in place of ~:~ to separate the targets from their dependencies
in the “tangle rule.” This tells ~make~ that a single run of the recipe of this
rule generates all these files (a grouped target, supported since GNU ~make~
4.3).

Writing these rules manually, as yours truly had to do in the early days of his
website, has proven to be error-prone.

One desirable feature for *~cleopatra~* would be to generate them automatically,
by looking for relevant ~:tangle~ directives inside the input Org document. The
challenge lies in the “relevant” part: the risk exists that we have false
positives. However, as a first step towards a fully automated solution, we
can leverage the evaluation features of Babel here.
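
To make the risk of false positives more concrete, here is a naive sketch of
such a lookup. It is not part of *~cleopatra~*: the ~list_tangle_targets~
function and the sample Org content are hypothetical, and the heuristic is
deliberately simplistic.

#+BEGIN_SRC bash
# List the distinct :tangle values found on the #+BEGIN_SRC lines of the Org
# file given as $1. Nothing here decides whether a value is "relevant".
list_tangle_targets() {
    grep -i '^[[:space:]]*#+BEGIN_SRC' "$1" \
        | sed -n 's/.*:tangle[[:space:]][[:space:]]*\([^[:space:]]*\).*/\1/p' \
        | sort -u
}

sample="$(mktemp)"
printf '%s\n' \
    '#+BEGIN_SRC makefile :tangle proc.mk' \
    '#+END_SRC' \
    '#+BEGIN_SRC bash :tangle scripts/helper.sh :results output' \
    '#+END_SRC' \
    '#+BEGIN_SRC makefile :tangle no' \
    '#+END_SRC' > "${sample}"
list_tangle_targets "${sample}"
rm -f "${sample}"
#+END_SRC

On this sample, the output contains ~proc.mk~ and ~scripts/helper.sh~, but also
~no~, which is not a file at all. A real solution would need to understand the
semantics of header arguments, not just their syntax.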

The bash script below, named =extends=, generates the expected Makefile rule
when given the proper variables.

#+NAME: extends
#+BEGIN_SRC bash :var PROC="" :var AUX="" :var IN="" :results output
cat <<EOF
include ${PROC}.mk

prebuild : ${PROC}-prebuild
build : ${PROC}-build
postbuild : ${PROC}-postbuild

${PROC}-prebuild : ${PROC}.mk ${AUX}
${PROC}-build : ${PROC}-prebuild
${PROC}-postbuild : ${PROC}-build

${PROC}.mk ${AUX} &:\\
   \${CLEODIR}/${IN}
	@\$(emacs-tangle)

CONFIGURE += ${PROC}.mk ${AUX}

.PHONY : ${PROC}-prebuild \\
         ${PROC}-build \\
         ${PROC}-postbuild
EOF
#+END_SRC

The previous source block is given a name (=extends=), and an explicit list of
variables (~IN~, ~PROC~, and ~AUX~). Thanks to the
[[https://orgmode.org/worg/org-tutorials/org-latex-export.html][noweb syntax of
Babel]], we can insert the result of the evaluation of =extends= inside another
source block when the latter is tangled.

We derive the rule to tangle ~bootstrap.mk~ using =extends=, which gives us the
following Makefile snippet.

#+BEGIN_SRC makefile :tangle Makefile :noweb yes
<<extends(IN="Bootstrap.org", PROC="bootstrap", AUX="scripts/update-gitignore.sh")>>
#+END_SRC

Beware that, as a consequence, modifying the code block of =extends= is as
“dangerous” as modifying ~Makefile~ itself. Keep that in mind if you start
hacking *~cleopatra~*!

Additional customizations of *~cleopatra~* will be part of ~bootstrap.mk~,
rather than ~Makefile~.

* Generation Processes

Using the =extends= noweb reference, *~cleopatra~* is easily extensible. In
this section, we first detail the structure of a typical generation process.
Then, we construct ~bootstrap.mk~ by enumerating the generation processes that
are currently used to generate the website you are reading.

Each generation process shall

1. Define ~proc-prebuild~, ~proc-build~, and ~proc-postbuild~
2. Declare dependencies between stages of generation processes
3. Declare build outputs (see ~ARTIFACTS~ and ~CONFIGURE~)

* Wrapping-up

To wrap up, ~bootstrap.mk~ provides the ~ignore~, ~clean~, and ~cleanall~
rules. The ~ignore~ rule relies on the following script, which rewrites the list
of generated files (~ARTIFACTS~ and ~CONFIGURE~) at the end of ~.gitignore~.

#+BEGIN_SRC bash :tangle scripts/update-gitignore.sh :shebang "#!/bin/bash"
BEGIN_MARKER="# begin generated files"
END_MARKER="# end generated files"

# remove the previous list of generated files to ignore
sed -i -e "/${BEGIN_MARKER}/,/${END_MARKER}/d" .gitignore
# remove trailing empty lines
sed -i -e :a -e '/^\n*$/{$d;N;};/\n$/ba' .gitignore

# output the list of files to ignore
echo "" >> .gitignore
echo "${BEGIN_MARKER}" >> .gitignore
for f in "$@"; do
    echo "${f}" >> .gitignore
done
echo "${END_MARKER}" >> .gitignore
#+END_SRC

#+BEGIN_SRC makefile :tangle bootstrap.mk
ignore :
	@echo "  update  gitignore"
	@scripts/update-gitignore.sh \
	   ${ARTIFACTS} \
	   ${CONFIGURE}

clean :
	@rm -rf ${ARTIFACTS}

cleanall : clean
	@rm -rf ${CONFIGURE}
#+END_SRC
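
The marker-based approach makes the ~ignore~ rule safe to run repeatedly. The
sketch below illustrates this on a scratch file rather than on a real
~.gitignore~. It assumes GNU ~sed~ (for ~-i~) and distinct opening and closing
markers; the ~refresh_ignored~ function is a hypothetical, simplified variant of
the script which takes the file to update as its first argument and skips the
handling of empty lines.

#+BEGIN_SRC bash
# Replace the marker-delimited section of the file given as $1 with the list
# of files given as remaining arguments.
refresh_ignored() {
    local file="$1"; shift
    local begin="# begin generated files" end="# end generated files"
    # drop the previous section, if any
    sed -i -e "/${begin}/,/${end}/d" "${file}"
    # append the new section
    {
        echo "${begin}"
        printf '%s\n' "$@"
        echo "${end}"
    } >> "${file}"
}

scratch="$(mktemp)"
printf '*.swp\n' > "${scratch}"                      # hand-written entry
refresh_ignored "${scratch}" build.log bootstrap.mk
refresh_ignored "${scratch}" build.log bootstrap.mk  # replaces the section
cat "${scratch}"
rm -f "${scratch}"
#+END_SRC

After the second call, the file still contains the hand-written ~*.swp~ entry
followed by a single generated section.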

# Local Variables:
# org-src-preserve-indentation: t
# End: