#+BEGIN_EXPORT html

soupault Configuration

#+END_EXPORT

In a nutshell, the purpose of ~soupault~ is to post-process HTML files generated by the generation processes of *~cleopatra~*. It is parameterized by two settings: the ~<>~ directory where ~soupault~ generates its output, and an optional ~<>~ wherein the website contents live. The latter makes it possible to generate only a subpart of a larger website.

For the present website, these two settings are initialized as follows.

#+NAME: build-dir
#+BEGIN_SRC text
build
#+END_SRC

#+NAME: prefix
#+BEGIN_SRC text
~lthms
#+END_SRC

The rest of this document proceeds as follows. We first describe the general settings of ~soupault~. Then, we enumerate the widgets enabled for this website. Finally, we provide a proper definition of the ~soupault~ *~cleopatra~* generation process.

#+TOC: headlines 2

* ~soupault~ General Settings

The general ~settings~ section of ~soupault.conf~ is fairly basic, and there is little to say that the [[https://soupault.neocities.org/reference-manual/#getting-started][“Getting Started”]] section does not already discuss at length. We emphasize three things:

- The ~build_dir~ is set to ~<>/<>~ in place of simply ~<>~.
- The ~ignore_extensions~ list shall be updated to take into account artifacts produced by other *~cleopatra~* generation processes.
- We disable the “clean URLs” feature of ~soupault~. This option renames an HTML file ~foo/bar.html~ into ~foo/bar/index.html~, which means that, when served by an HTTP server, the ~foo/bar~ URL will work. The issue we have with this feature is that the internal links within the website need to take their /final/ URL into account, rather than their actual name. If one day ~soupault~ starts rewriting internal URLs when ~clean_urls~ is enabled, we might reconsider using it.
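To make the third point more concrete, here is a minimal sketch of the renaming performed by the “clean URLs” feature. The =clean_url_path= helper is hypothetical (it is not part of ~soupault~ nor of this website), and it assumes that pages already named ~index.html~ are left untouched.

#+BEGIN_SRC bash
# Hypothetical helper, for illustration only: compute the output path
# of a page when "clean URLs" are enabled.
clean_url_path () {
  local page="${1}"

  case "${page}" in
    # Assumption: index pages keep their name
    index.html|*/index.html) echo "${page}" ;;
    # foo/bar.html becomes foo/bar/index.html
    *.html) echo "${page%.html}/index.html" ;;
    # Other files are not renamed
    *) echo "${page}" ;;
  esac
}
#+END_SRC

With this renaming, a link written ~foo/bar.html~ in the sources no longer matches the served URL ~foo/bar~, which is the reason why the feature is disabled here.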
#+BEGIN_SRC toml :tangle soupault.conf :noweb tangle
[settings]
strict = true
verbose = false
debug = false
site_dir = "site"
build_dir = "<>/<>"

page_file_extensions = ["html"]
ignore_extensions = [
  "draft", "vo", "vok", "vos", "glob",
  "html~", "org", "aux", "sass",
]

generator_mode = true
complete_page_selector = "html"
default_template = "templates/main.html"
content_selector = "main"
doctype = ""
clean_urls = false
#+END_SRC

#+BEGIN_TODO
The list of ignored extensions should be programmatically generated with the help of *~cleopatra~*.
#+END_TODO

* Widgets

** Setting Page Title

We use the “page title” widget to set the title of the webpage based on the first (and hopefully the only) ~h1~ tag of the page.

#+BEGIN_SRC toml :tangle soupault.conf
[widgets.page-title]
widget = "title"
selector = "h1"
default = "~lthms"
prepend = "~lthms: "
#+END_SRC

** Acknowledging ~soupault~

When creating a new ~soupault~ project (using ~soupault --init~), the default configuration file suggests advertising the use of ~soupault~. Rather than hard-coding the version of ~soupault~ in use (which is error-prone), we determine it with the following script.

#+NAME: soupault-version
#+BEGIN_SRC bash :results verbatim output :exports both
soupault --version | head -n 1 | tr -d '\n'
#+END_SRC

The configuration of the widget (initially provided by ~soupault~) becomes less prone to obsolescence.

#+BEGIN_SRC toml :tangle soupault.conf :noweb tangle
[widgets.generator-meta]
widget = "insert_html"
html = """ """
selector = "head"
#+END_SRC

** Generating Table of Contents

The ~toc~ widget generates a table of contents for HTML files which contain a node matching a given ~selector~ (in the case of this document, ~#generate-toc~).

#+BEGIN_SRC toml :tangle soupault.conf
[widgets.table-of-contents]
widget = "toc"
selector = "#generate-toc"
action = "replace_element"
valid_html = true
min_level = 2
numbered_list = true
#+END_SRC

#+BEGIN_TODO
We could propose a patch to ~soupault~'s upstream to add numbering in titles.
#+END_TODO

** Fixing Org Internal Links

For some reason, Org prefixes internal links to other Org documents with ~file://~. To avoid that, we provide a simple plugin which removes ~file://~ from the beginning of a URL.

#+BEGIN_TODO
This plugin definition should be part of [[./Contents/Org.org][the ~org~ generation process]], but that would require aggregating “subconfigs” into a larger one.
#+END_TODO

The key component of this plugin is the =fix_org_urls= function.

- =fix_org_urls(LIST, ATTR)= :: Enumerate the DOM elements of =LIST=, and check their =ATTR= attribute, removing a leading ~file://~ if any.
#+BEGIN_SRC lua :tangle plugins/fix-org-urls.lua
function fix_org_urls(list, attr)
  index, link = next(list)

  while index do
    href = HTML.get_attribute(link, attr)

    if href then
      href = Regex.replace(href, "^file://", "")
      HTML.set_attribute(link, attr, href)
    end

    index, link = next(list, index)
  end
end
#+END_SRC

We use this function to fix the URLs of tags known to be subject to Org's strange behavior. For now, only ~~ has been affected.

#+BEGIN_SRC lua :tangle plugins/fix-org-urls.lua
fix_org_urls(HTML.select(page, "a"), "href")
fix_org_urls(HTML.select(page, "img"), "src")
#+END_SRC

The configuration of this plugin, and the associated widget, is straightforward.

#+BEGIN_SRC toml :tangle soupault.conf :noweb tangle
[widgets.fix-org-urls]
widget = "fix-org-urls"
#+END_SRC

** Prefixing Internal URLs

On the one hand, internal links can be absolute, meaning they start with a leading ~/~, and therefore are relative to the website root. On the other hand, websites (especially static websites) can be placed in a larger context. For instance, my personal website lives inside the ~~lthms~ directory of the ~soap.coffee~ domain.

The purpose of this plugin is to rewrite internal URLs which are relative to the root, in order to properly prefix them.

From a high-level perspective, the plugin structure is the following.

#+BEGIN_SRC lua :tangle plugins/urls-rewriting.lua :noweb no-export
prefix_url = config["prefix_url"]

<>
<>
<>
#+END_SRC

1. We validate the widget configuration.
2. We propose a generic function to enumerate and rewrite tags which can have internal URLs as attribute argument.
3. We use this generic function for relevant tags.

#+NAME: validate_prefix
#+BEGIN_SRC lua
if not prefix_url then
  Plugin.fail("Missing mandatory field: `prefix_url'")
end

-- Ensure the prefix starts with a slash
if not Regex.match(prefix_url, "^/(.*)") then
  prefix_url = "/" .. prefix_url
end

-- Ensure the prefix ends with a slash
if not Regex.match(prefix_url, "(.*)/$") then
  prefix_url = prefix_url .. "/"
end
#+END_SRC

#+NAME: prefix_func
#+BEGIN_SRC lua
function prefix_urls (links, attr, prefix_url)
  index, link = next(links)

  while index do
    href = HTML.get_attribute(link, attr)

    if href then
      -- Only URLs relative to the website root are rewritten
      if Regex.match(href, "^/") then
        href = Regex.replace(href, "^/*", "")
        href = prefix_url .. href
      end

      HTML.set_attribute(link, attr, href)
    end

    index, link = next(links, index)
  end
end
#+END_SRC

#+NAME: prefix_calls
#+BEGIN_SRC lua
prefix_urls(HTML.select(page, "a"), "href", prefix_url)
prefix_urls(HTML.select(page, "link"), "href", prefix_url)
prefix_urls(HTML.select(page, "img"), "src", prefix_url)
prefix_urls(HTML.select(page, "script"), "src", prefix_url)
#+END_SRC

Again, configuring ~soupault~ to use this plugin is relatively straightforward. The only important thing to notice is the use of the ~after~ field, to ensure this plugin is run /after/ the plugin responsible for fixing the URLs of Org documents.

#+BEGIN_SRC toml :tangle soupault.conf :noweb tangle
[widgets.urls-rewriting]
widget = "urls-rewriting"
prefix_url = "<>"
after = "fix-org-urls"
#+END_SRC

** Marking External Links

#+BEGIN_SRC lua :tangle plugins/external-urls.lua
function mark(name)
  return ''
end

links = HTML.select(page, "a")

index, link = next(links)

while index do
  href = HTML.get_attribute(link, "href")

  if href then
    if Regex.match(href, "^https?://github.com") then
      icon = HTML.parse(mark('github'))
      HTML.append_child(link, icon)
    elseif Regex.match(href, "^https?://") then
      icon = HTML.parse(mark('external-link'))
      HTML.append_child(link, icon)
    end
  end

  index, link = next(links, index)
end
#+END_SRC

#+BEGIN_SRC sass :tangle site/style/plugins.sass
.url-mark.fa
  display: inline
  font-size: 90%
  width: 1em

.url-mark.fa-github::before
  content: "\00a0\f09b"

.url-mark.fa-external-link::before
  content: "\00a0\f08e"
#+END_SRC

#+BEGIN_SRC toml :tangle soupault.conf
[widgets.mark-external-urls]
after = "generate-history"
widget = "external-urls"
#+END_SRC

** Generating Per-File Revisions Tables

*** User Instructions

This
widget generates a so-called “revisions table” for the file whose name is contained in a DOM element of id ~history~, based on its ~git~ history. Paths should be relative to the directory from which you start the build process (typically, the root of your repository). The revisions table notably provides hyperlinks to a ~git~ webview for each commit.

For instance, considering the following HTML snippet

#+BEGIN_SRC html
site/posts/FooBar.org
#+END_SRC

This plugin will replace the content of this element with the revisions table of ~site/posts/FooBar.org~.

*** Customization

The base URL of the webview for the document you are currently reading (afterwards abstracted with the ~<>~ noweb reference) is

#+NAME: repo
#+BEGIN_SRC text
https://code.soap.coffee/writing/lthms.git
#+END_SRC

#+BEGIN_SRC html :tangle templates/history.html :noweb tangle
Revisions

This revisions table has been automatically generated from the git history of this website repository, and the change descriptions may not always be as useful as they should.

You can consult the source of this file in its current version here.

{{#history}} {{/history}}
{{date}} {{subject}} {{abbr_hash}}
#+END_SRC

#+BEGIN_SRC sass :tangle site/style/plugins.sass
table
  border-top : 2px solid black
  border-bottom : 2px solid black
  border-collapse : collapse
  width : 35rem

td
  border-bottom : 1px solid black
  padding : .5em

#history .commit
  font-size : smaller
  font-family : 'Fira Code', monospace
  width : 7em
  text-align : center
#+END_SRC

*** Implementation

We use the built-in [[https://soupault.neocities.org/reference-manual/#widgets-preprocess-element][=preprocess_element=]] widget to implement it, which means we need a script which gets its input from the standard input, and echoes its output to the standard output.

#+BEGIN_SRC toml :tangle soupault.conf
[widgets.generate-history]
widget = "preprocess_element"
selector = "#history"
command = 'scripts/history.sh templates/history.html'
action = "replace_content"
#+END_SRC

#+BEGIN_TODO
This plugin should be reimplemented using ~libgit2~ or other ~git~ libraries, in a language more suitable than bash.
#+END_TODO

This plugin proceeds as follows:

1. Using an ad-hoc script, it generates a JSON document containing, for each revision,
   - the subject, date, hash, and abbreviated hash of the related commit;
   - the name of the file at the time of this commit.
2. This JSON is passed to a mustache engine (~haskell-mustache~) with a proper template.
3. The content of the selected DOM element is replaced with the output of ~haskell-mustache~.

In Bash, this translates as follows.

#+BEGIN_SRC bash :tangle scripts/history.sh :shebang "#!/usr/bin/bash"
function main () {
  local file="${1}"
  local template="${2}"
  local tmp_file

  # The JSON is stored in a temporary file for haskell-mustache to read
  tmp_file=$(mktemp)
  generate_json "${file}" > "${tmp_file}"
  haskell-mustache "${template}" "${tmp_file}"
  rm "${tmp_file}"
}
#+END_SRC

The difficult part of this script is the definition of the =generate_json= function. From a high-level perspective, this function is divided into three steps.

1. We get an initial (but partial) set of data about the ~git~ commits of ~${file}~, from the most recent to the oldest.
2. For each commit, we check whether ~${file}~ was renamed.
3. Finally, we output the result on the standard output (because we are writing a bash script).

#+BEGIN_SRC bash :tangle scripts/history.sh :noweb no-export
function generate_json () {
  local file="${1}"
  # Declared separately, because `local logs=...` would mask the exit
  # status of the command
  local logs
  logs=`<>`

  if [ ! $? -eq 0 ]; then
    exit 1
  fi

  <>
  <>
}
#+END_SRC

We will use ~git~ to get the information we need. By default, ~git~ subcommands use a pager when their output is likely to be long. This typically includes ~git-log~. To disable this behavior, ~git~ exposes the ~--no-pager~ option. We introduce =_git=, a wrapper around ~git~ with the proper option.

#+BEGIN_SRC bash :tangle scripts/history.sh
function _git () {
  git --no-pager "$@"
}
#+END_SRC

Afterwards, we use =_git= in place of ~git~.

Using the ~--pretty~ command-line argument of ~git-log~, we can generate one JSON object per commit which contains most of the information we need, using the following format string.

#+NAME: pretty-format
#+BEGIN_SRC json
{ "subject" : "%s", "abbr_hash" : "%h", "hash" : "%H", "date" : "%cs" }
#+END_SRC

Besides, we also need ~--follow~ to deal with file renaming. Without this option, ~git-log~ stops when the file first appears in the repository, even if this “creation” is actually a renaming.

Therefore, the ~git~ command line we use to collect our initial history is

#+NAME: git-log
#+BEGIN_SRC bash :noweb no-export
_git log --follow --pretty=format:'<>' "${file}"
#+END_SRC

To manipulate JSON, we rely on three operators (yet to be defined):

- =jget OBJECT FIELD= :: In an =OBJECT=, get the value of a given =FIELD=
- =jset OBJECT FIELD VALUE= :: In an =OBJECT=, set the =VALUE= of a given =FIELD=
- =jappend ARRAY VALUE= :: Append a =VALUE= at the end of an =ARRAY=

#+NAME: remane-tracking
#+BEGIN_SRC bash :noweb no-export
local name="${file}"
local revisions='[]'
local first=0

# Each line of ${logs} is the JSON object of one commit
while read -r rev; do
  rev=$(jset "${rev}" "filename" "\"${name}\"")

  # The most recent commit is flagged as the last modification
  if [ ${first} -eq 0 ]; then
    rev=$(jset "${rev}" "modified" "true")
    first=1
  fi

  revisions=$(jappend "${revisions}" "${rev}")

  local hash=$(jget "${rev}" "hash")
  local rename=$(previous_name "${name}" "${hash}")

  # If the file was renamed by this commit, older commits use the old name
  if [[ ! -z "${rename}" ]]; then
    name=${rename}
  fi
done < <(echo "${logs}")

# The oldest commit is flagged as the creation of the file
revisions=$(_jq "${revisions}" "length as \$l | .[\$l - 1].created |= true")
#+END_SRC

#+BEGIN_SRC bash :tangle scripts/history.sh
function previous_name () {
  local name=${1}
  local hash=${2}

  # Expand the compact form `prefix{old => new}` into
  # `prefixold => prefixnew`
  local unfold='s/ *\(.*\){\(.*\) => \(.*\)}/\1\2 => \1\3/'

  # Keep the line whose new name is ${name}, and extract its first
  # field (the previous name)
  _git show --stat=10000 ${hash} \
    | sed -e "${unfold}" \
    | grep "=> ${name}" \
    | xargs \
    | cut -d' ' -f1
}
#+END_SRC

#+NAME: result-echoing
#+BEGIN_SRC bash :noweb no-export
jset "$(jset "{}" "file" "\"${file}\"")" \
     "history" \
     "${revisions}"
#+END_SRC

The last missing pieces are the definitions of the three JSON operators. We use [[https://stedolan.github.io/jq/][~jq~]] to manipulate JSON data. Since ~jq~ processes JSON from its standard input, we first define a helper (similar to =_git=) to deal with JSON from variables seamlessly.

#+BEGIN_SRC bash :tangle scripts/history.sh
function _jq () {
  local input="${1}"
  local filter="${2}"

  echo "${input}" | jq -jcM "${filter}"
}
#+END_SRC

- *-j* tells ~jq~ not to print a new line at the end of its outputs
- *-c* tells ~jq~ to print JSON in a compact format (rather than prettified)
- *-M* tells ~jq~ to output monochrome outputs

Internally, =jget=, =jset=, and =jappend= are implemented with ~jq~ [[https://stedolan.github.io/jq/manual/#Basicfilters][basic filters]].

#+BEGIN_SRC bash :tangle scripts/history.sh
function jget () {
  local obj="${1}"
  local field="${2}"

  _jq "${obj}" ".${field}"
}

function jset () {
  local obj="${1}"
  local field="${2}"
  local val="${3}"

  _jq "${obj}" "setpath([\"${field}\"]; ${val})"
}

function jappend () {
  local arr="${1}"
  local val="${2}"

  _jq "${arr}" ". + [ ${val} ]"
}
#+END_SRC

Everything is defined. We can call =main= now.

#+BEGIN_SRC bash :tangle scripts/history.sh
main "$(cat)" "${1}"
#+END_SRC

** Rendering Equations Offline

*** User Instructions

Inline equations written in the DOM under the class src_css{.imath} and using the \im \LaTeX \mi syntax can be rendered once and for all by ~soupault~. For instance, ~\LaTeX~ is rendered \im \LaTeX \mi as expected.

Using this widget requires being able to inject raw HTML in input files.

*** Implementation

We will use [[https://katex.org][\im \KaTeX \mi]] to render equations offline. \im \KaTeX \mi is unlikely to be available on most systems, but it is distributed through [[https://www.npmjs.com/package/katex][npm]], so we can define a minimal ~package.json~ file to fetch it automatically.

#+BEGIN_SRC json :tangle package.json
{
  "private": true,
  "devDependencies": {
    "katex": "^0.11.1"
  }
}
#+END_SRC

We introduce a Makefile recipe to call ~npm install~. This command produces a file called ~package-lock.json~ that we add to ~GENFILES~ to ensure \im \KaTeX \mi will be available when ~soupault~ is called.

If ~Soupault.org~ has been modified since the last generation, Babel will generate ~package.json~ again. However, if the modifications of ~Soupault.org~ do not concern ~package.json~, then ~npm install~ will not modify ~package-lock.json~ and its “last modified” time will not be updated. This means that the next time ~make~ is used, it will replay this recipe again. As a consequence, we systematically ~touch~ ~package-lock.json~ to satisfy ~make~.

#+BEGIN_SRC makefile :tangle katex.mk
package-lock.json : package.json
	@echo " init npm packages"
	@npm install &>> build.log
	@touch $@

CONFIGURE += package-lock.json node_modules/
#+END_SRC

Once installed and available, \im \KaTeX \mi is really simple to use. The following script reads (synchronously!) the standard input, renders it using \im \KaTeX \mi and outputs the result to the standard output.

#+BEGIN_SRC js :tangle scripts/katex.js
var katex = require("katex");
var fs = require("fs");

// File descriptor 0 is the standard input
var input = fs.readFileSync(0);
// Display mode is enabled when the DISPLAY environment variable is set
var displayMode = process.env.DISPLAY != undefined;

var html = katex.renderToString(String.raw`${input}`, {
  throwOnError : false,
  displayMode : displayMode
});

console.log(html)
#+END_SRC

We reuse once again the =preprocess_element= widget. The selector is ~.imath~ (~i~ stands for inline in this context), and we replace the previous content with the result of our script. A second widget does the same for ~.dmath~, with ~DISPLAY~ set.

#+BEGIN_SRC toml :tangle soupault.conf
[widgets.inline-math]
widget = "preprocess_element"
selector = ".imath"
command = "node scripts/katex.js"
action = "replace_content"

[widgets.display-math]
widget = "preprocess_element"
selector = ".dmath"
command = "DISPLAY=1 node scripts/katex.js"
action = "replace_content"
#+END_SRC

The \im\KaTeX\mi font is bigger than the serif font used for this website, so we reduce it a bit with a dedicated SASS rule.

#+BEGIN_SRC sass :tangle site/style/plugins.sass
.imath, .dmath
  font-size : smaller

.dmath
  text-align : center
#+END_SRC

* *~cleopatra~* Generation Process Definition

We introduce the ~soupault~ generation process, obviously based on the [[https://soupault.neocities.org/][~soupault~ HTML processor]]. The structure of a *~cleopatra~* generation process is always the same.

#+BEGIN_SRC makefile :tangle soupault.mk :noweb no-export
<>
<>
<>
#+END_SRC

In the rest of this section, we define these three components.

** Build Stages

From the perspective of *~cleopatra~*, it is a rather simple component, since the ~build~ stage is simply a call to ~soupault~, whose outputs are located in a single (configurable) directory.

#+NAME: stages
#+BEGIN_SRC makefile :noweb no-export
soupault-build :
	@cleopatra echo Running soupault
	@soupault

ARTIFACTS += <>/
#+END_SRC

** Dependencies

Most of the generation processes (if not all of them) need to declare themselves as a prerequisite for ~soupault-build~. If they do not, they will likely be executed after ~soupault~ is called.

This file defines an auxiliary SASS sheet that needs to be declared as a dependency of the build stage of the [[./Theme.org][~theme~ generation process]].

Finally, the offline rendering of equations requires \im \KaTeX \mi to be available, so we include the ~katex.mk~ file, and make ~package-lock.json~ (the proof that ~npm install~ has been executed) a prerequisite of ~soupault-build~.

#+NAME: dependencies
#+BEGIN_SRC makefile
theme-build : site/style/plugins.sass

include katex.mk
soupault-build : package-lock.json
#+END_SRC

** Ad-hoc Commands

Finally, this generation process introduces a dedicated (~PHONY~) command to start an HTTP server in order to navigate the generated website from a browser.

#+NAME: ad-hoc-cmds
#+BEGIN_SRC makefile :noweb no-export
serve :
	@echo " start a python server"
	@cd <>; python -m http.server 2>/dev/null

.PHONY : serve
#+END_SRC

This command does not assume anything about the current state of generation of the project. In particular, it does not check whether or not the ~<>~ directory exists. The responsibility to use ~make serve~ in a suitable setting lies with end users.
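
For readers who would rather have this check performed automatically, here is a minimal sketch of a guarded variant. The =serve= shell function below is hypothetical and is not tangled into this website's build system; it only assumes that the directory to serve is given as its first argument.

#+BEGIN_SRC bash
# Hypothetical helper, for illustration only: refuse to start the HTTP
# server when the directory to serve does not exist.
serve () {
  local dir="${1}"

  if [ ! -d "${dir}" ]; then
    echo "error: '${dir}' does not exist, build the website first" >&2
    return 1
  fi

  # Run in a subshell, so that the caller's working directory is
  # left untouched
  (cd "${dir}" && python -m http.server)
}
#+END_SRC

Called on a missing directory, the function prints an error and returns a non-zero status instead of failing silently, which the ~2>/dev/null~ redirection of the recipe above would otherwise hide.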