+++
title = "[EN] Automatic Meaningful Custom IDs for Org Headings"
author = ["Lucien “Phundrak” Cartier-Tilet"]
date = 2020-06-06
tags = ["emacs", "orgmode"]
categories = ["emacs", "linux", "conlanging", "orgmode"]
draft = false
[menu.main]
  weight = 2001
  identifier = "en-automatic-meaningful-custom-ids-for-org-headings"
+++

Spoiler alert: I will only tweak a bit of code that already exists. Skip
straight to the bottom if you just want the solution, or read the whole post
if you are interested in how I got there.

<div class="ox-hugo-toc toc local">
<div></div>

- [The issue](#the-issue)
- [A first solution](#a-first-solution)
- [These headers are not meaningful](#these-headers-are-not-meaningful)

</div>
<!--endtoc-->


## The issue {#the-issue}

About two to three years ago, while working on a project meant to be
published online, I looked for a way to get stable anchor links to my
headings in HTML exports. As some of you may know, when an Org file is
exported to HTML, a random ID is generated for each heading by default and
used as its anchor. Here’s a quick example of a simple Org file:

```org
#+title: Sample org file
* First heading
Reference to a subheading
* Second heading
Some stuff written here
** First subheading
Some stuff
** Second subheading
Some other stuff
```

<div class="src-block-caption">
  <span class="src-block-number">Code Snippet 1</span>:
  Example org file
</div>

And this is the result once exported to HTML (with a lot of noise removed
from `<head>`):

```html
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">

<head>
    <title>Sample org file</title>
    <meta name="generator" content="Org mode" />
    <meta name="author" content="Lucien Cartier-Tilet" />
</head>

<body>
    <div id="content">
        <h1 class="title">Sample org file</h1>
        <div id="outline-container-orgd8e6238" class="outline-2">
            <h2 id="orgd8e6238"><span class="section-number-2">1</span> First heading</h2>
            <div class="outline-text-2" id="text-1">
                <p>
                    Reference to a subheading
                </p>
            </div>
        </div>
        <div id="outline-container-org621c39a" class="outline-2">
            <h2 id="org621c39a"><span class="section-number-2">2</span> Second heading</h2>
            <div class="outline-text-2" id="text-2">
                <p>
                    Some stuff written here
                </p>
            </div>
            <div id="outline-container-orgae45d6b" class="outline-3">
                <h3 id="orgae45d6b"><span class="section-number-3">2.1</span> First subheading</h3>
                <div class="outline-text-3" id="text-2-1">
                    <p>
                        Some stuff
                    </p>
                </div>
            </div>
            <div id="outline-container-org9301aa9" class="outline-3">
                <h3 id="org9301aa9"><span class="section-number-3">2.2</span> Second subheading</h3>
                <div class="outline-text-3" id="text-2-2">
                    <p>
                        Some other stuff
                    </p>
                </div>
            </div>
        </div>
    </div>
</body>

</html>
```

<div class="src-block-caption">
  <span class="src-block-number">Code Snippet 2</span>:
  Output HTML file
</div>

As you can see, all the anchors follow the format `org[a-f0-9]{7}`. First,
this is not really meaningful if you want to read the anchor and guess where
it will lead you. Second, these anchors change each time you export your Org
file to HTML. If I want to share a URL pointing to a specific heading on my
website… well, I can’t: it will change the next time I update the document.
And I don’t want to set a `CUSTOM_ID` property manually on every one of my
headings. So, what to do?


## A first solution {#a-first-solution}

A first solution I found came from [this blog post](https://writequit.org/articles/emacs-org-mode-generate-ids.html), where Lee Hinman
described the very same issue and wrote some Elisp code to remedy it (it’s a
great read, go take a look). It worked, and for some time I used their code
in my Emacs configuration file to generate unique custom IDs for my Org
headings. Basically, each time the file is saved, the code checks whether
`auto-id:t` is set in an `#+OPTIONS` header. If it is, it iterates over all
the Org headings which don’t already have a `CUSTOM_ID` property and inserts
one, built from a UUID generated by Emacs. And tada! Each heading gets a
`h-[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}` custom ID
that won’t change the next time we export our Org file to HTML. Wohoo!

Except…


## These headers are not meaningful {#these-headers-are-not-meaningful}

OK, that’s still a huge step forward: we don’t have to type any `CUSTOM_ID`
property manually anymore, it’s done automatically for us. But when I send
someone a link like
`https://langue.phundrak.com/eittland#h-76fc0b91-e41c-42ad-8652-bba029632333`,
the first reaction is often something along the lines of “What the fuck?”.
And they’re right: the anchor part of this URL is unreadable. How is anyone
supposed to guess that it links to the description of the vowels of the
Eittlandic language? (That’s a constructed language I’m working on, you
won’t find anything about it outside my website.)

So I went back to my Emacs configuration file and, through some trial and
error, finally found a way to get a consistent custom ID which is readable
and set automatically. With the current state of my code, what you get is
the complete path of the Org heading, with all whitespace replaced by
underscores and headings separated by dashes, followed by a short unique
identifier taken from an Emacs-generated UUID. Now, the same link as above
looks like
`https://langue.phundrak.com/eittland#Aperçu_structurel-Inventaire_phonétique_et_orthographe-Voyelles_pures-84f05c2c`.
It won’t be more readable to you if you don’t speak French, but you can
guess it is way better than what we had before. I even added a safety net by
stripping forward slashes, tildes and square brackets from the path. The
final identifier is there to ensure the anchor stays unique in case two
headings end up with identical paths in the Org file for one reason or
another.

The modifications I made to the first function, `eos/org-id-new`, are
minimal: I just split the UUID on its dashes and keep its first part. This
is basically a way to shorten it.

```emacs-lisp
(defun eos/org-id-new (&optional prefix)
  "Create a new globally unique ID.

An ID consists of two parts separated by a dash:
- a prefix
- a unique part that will be created according to
  `org-id-method'.

PREFIX can specify the prefix, the default is given by the
variable `org-id-prefix'.  However, if PREFIX is the symbol
`none', don't use any prefix even if `org-id-prefix' specifies
one.

So a typical ID could look like \"Org-4nd91V40HI\"."
  (let* ((prefix (if (eq prefix 'none)
                     ""
                   (concat (or prefix org-id-prefix)
                           "-")))
         unique)
    (if (equal prefix "-")
        (setq prefix ""))
    (cond
     ((memq org-id-method
            '(uuidgen uuid))
      (setq unique (org-trim (shell-command-to-string org-id-uuid-program)))
      (unless (org-uuidgen-p unique)
        (setq unique (org-id-uuid))))
     ((eq org-id-method 'org)
      (let* ((etime (org-reverse-string (org-id-time-to-b36)))
             (postfix (if org-id-include-domain
                          (progn
                            (require 'message)
                            (concat "@"
                                    (message-make-fqdn))))))
        (setq unique (concat etime postfix))))
     (t (error "Invalid `org-id-method'")))
    ;; Keep only the first dash-separated group of the unique part.
    (concat prefix (car (split-string unique "-")))))
```
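
To see what that final `split-string` call does in isolation, here is a minimal sketch. The helper name `my/short-id` is made up for this illustration and is not part of the configuration above; it only relies on the built-in `split-string`.

```emacs-lisp
;; Hypothetical helper, for illustration only.
(defun my/short-id (unique)
  "Return the part of UNIQUE that comes before its first dash.
If UNIQUE contains no dash, it is returned unchanged."
  (car (split-string unique "-")))

(my/short-id "76fc0b91-e41c-42ad-8652-bba029632333")
;; => "76fc0b91"
```

An ID without any dash in it, which is what the `org` method usually produces, goes through untouched. The price of the shorter suffix is that collisions are no longer practically impossible, only unlikely, which is fine here since the heading path already does most of the disambiguation.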

Next comes the actual generation of the custom ID. As you can see, the `let`
has been replaced by a `let*`, which allows me to build the ID from the
variables `orgpath` and `heading`. The former joins the titles of the parent
headings with dashes. The latter appends the title of the current heading to
`orgpath`, separated by a dash if `orgpath` is not empty, then turns the
result into a slug: every run of whitespace is replaced by an underscore,
and forward slashes, tildes and square brackets are deleted. Finally,
`heading` is passed as the prefix to the function described above, which
appends the short unique ID to it.

```emacs-lisp
(defun eos/org-custom-id-get (&optional pom create prefix)
  "Get the CUSTOM_ID property of the entry at point-or-marker POM.

If POM is nil, refer to the entry at point.  If the entry does not
have a CUSTOM_ID, the function returns nil.  However, when CREATE
is non-nil, create a CUSTOM_ID if none is present already.  PREFIX
will be passed through to `eos/org-id-new'.  In any case, the
CUSTOM_ID of the entry is returned."
  (interactive)
  (org-with-point-at pom
    (let* ((orgpath (mapconcat #'identity (org-get-outline-path) "-"))
           (heading (replace-regexp-in-string
                     "/\\|~\\|\\[\\|\\]" ""
                     (replace-regexp-in-string
                      "[[:space:]]+" "_" (if (string= orgpath "")
                                             (org-get-heading t t t t)
                                           (concat orgpath "-" (org-get-heading t t t t))))))
           (id (org-entry-get nil "CUSTOM_ID")))
      (cond
       ((and id
             (stringp id)
             (string-match "\\S-" id)) id)
       (create (setq id (eos/org-id-new (concat prefix heading)))
               (org-entry-put pom "CUSTOM_ID" id)
               (org-id-add-location id
                                    (buffer-file-name (buffer-base-buffer)))
               id)))))
```
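
The slug logic can be tried outside of an Org buffer with a small standalone sketch. The function `my/heading-slug` is a name invented for this example; it takes the outline path and the heading title as plain arguments instead of reading them from the buffer, but applies the same two `replace-regexp-in-string` calls as above.

```emacs-lisp
;; Hypothetical standalone version of the slug logic, for illustration only.
(defun my/heading-slug (outline-path heading)
  "Build a readable slug from OUTLINE-PATH and HEADING.
OUTLINE-PATH is a list of parent heading titles, HEADING is the
title of the current heading."
  (let* ((orgpath (mapconcat #'identity outline-path "-"))
         (full (if (string= orgpath "")
                   heading
                 (concat orgpath "-" heading))))
    ;; Whitespace runs become underscores first, then slashes,
    ;; tildes and square brackets are deleted.
    (replace-regexp-in-string
     "/\\|~\\|\\[\\|\\]" ""
     (replace-regexp-in-string "[[:space:]]+" "_" full))))

(my/heading-slug '("Aperçu structurel" "Inventaire phonétique et orthographe")
                 "Voyelles pures")
;; => "Aperçu_structurel-Inventaire_phonétique_et_orthographe-Voyelles_pures"
```

Note that dashes already present in a title are left alone, so the separator between path components is not strictly unambiguous. For a human-readable anchor that is rarely a problem.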

The rest of the code is unchanged, but here it is anyway:

```emacs-lisp
(defun eos/org-add-ids-to-headlines-in-file ()
  "Add CUSTOM_ID properties to all headlines in the current file
which do not already have one.

Only adds ids if the `auto-id' option is set to `t' in the file
somewhere. ie, #+OPTIONS: auto-id:t"
  (interactive)
  (save-excursion
    (widen)
    (goto-char (point-min))
    (when (re-search-forward "^#\\+OPTIONS:.*auto-id:t"
                             (point-max)
                             t)
      (org-map-entries (lambda ()
                         (eos/org-custom-id-get (point)
                                                'create))))))

(add-hook 'org-mode-hook
          (lambda ()
            (add-hook 'before-save-hook
                      (lambda ()
                        (when (and (eq major-mode 'org-mode)
                                   (eq buffer-read-only nil))
                          (eos/org-add-ids-to-headlines-in-file))))))
```
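
The hook above registers its lambda on the global `before-save-hook` and relies on the `major-mode` check to stay out of the way in other buffers. A possible variant, which is my own assumption and not part of the original code, is to make the hook buffer-local through the fourth argument of `add-hook`. The names `my/org-add-ids-before-save` and `my/org-auto-id-setup` are made up for this sketch.

```emacs-lisp
;; Hypothetical buffer-local variant of the hook, for illustration only.
(defun my/org-add-ids-before-save ()
  "Add missing CUSTOM_IDs unless the current buffer is read-only."
  (unless buffer-read-only
    (eos/org-add-ids-to-headlines-in-file)))

(defun my/org-auto-id-setup ()
  "Register the ID pass on `before-save-hook' for this buffer only."
  ;; The final t makes the hook local to the current buffer.
  (add-hook 'before-save-hook #'my/org-add-ids-before-save nil t))

(add-hook 'org-mode-hook #'my/org-auto-id-setup)
```

Since the function only ever runs in buffers where `org-mode` was enabled, the `major-mode` test is no longer needed, and using named functions makes the hook easy to remove again with `remove-hook`.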

Note that you **will need** the `org-id` library, which ships with Org, for
this code to work. Simply add the following before the code I shared above:

```emacs-lisp
(require 'org-id)
(setq org-id-link-to-org-use-id 'create-if-interactive-and-no-custom-id)
```

|
||
And that’s how my links are now way more readable **and** persistent! The only
|
||
downside I found to this is when you move headings and their path is
|
||
modified, or when you modify the heading itself, the custom ID is not
|
||
automatically updated. I could fix that by regenerating the custom ID on
|
||
each save, regardless of whether a custom ID already exists or not, but it’s
|
||
at the risk an ID manually set will get overwritten.
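
A middle ground could be a command that regenerates the ID of a single heading on demand, so nothing gets overwritten without you asking for it. This is only a sketch: `my/org-custom-id-regenerate` is a hypothetical name, and it assumes `eos/org-custom-id-get` from this post is already loaded.

```emacs-lisp
(require 'org)

;; Hypothetical command, for illustration only.
(defun my/org-custom-id-regenerate ()
  "Replace the CUSTOM_ID of the heading at point with a fresh one.
Any existing CUSTOM_ID is discarded, manually set or not."
  (interactive)
  (org-entry-delete nil "CUSTOM_ID")
  (eos/org-custom-id-get (point) 'create))
```

Calling it with `M-x my/org-custom-id-regenerate` after renaming or moving a heading brings its anchor back in sync with its new path. Any link that pointed at the old anchor will of course break.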

<div class="html">
<div></div>

<script defer src="https://commento.phundrak.com/js/commento.js"></script>
<div id="commento"></div>

</div>