+++
title = "[EN] Automatic Meaningful Custom IDs for Org Headings"
author = ["Lucien “Phundrak” Cartier-Tilet"]
date = 2020-06-06
tags = ["emacs", "orgmode"]
categories = ["emacs", "linux", "conlanging", "orgmode"]
draft = false
[menu.main]
  weight = 2001
  identifier = "en-automatic-meaningful-custom-ids-for-org-headings"
+++
Spoiler alert: I will just modify a bit of code that already exists. Go directly to the bottom if you want the solution, or read the whole post if you are interested in how I got there.
The issue
About two to three years ago, as I was working on a project that was meant to be published on the internet, I looked for a solution to get fixed anchor links to my various headings when I performed HTML exports. As some of you may know, by default when an Org file is exported to an HTML file, a random ID is generated for each heading, and this ID is used as its anchor. Here’s a quick example of a simple Org file:
#+title: Sample org file
* First heading
Reference to a subheading
* Second heading
Some stuff written here
** First subheading
Some stuff
** Second subheading
Some other stuff
And this is the result once exported to HTML (with a lot of noise removed
from <head>):
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
<title>Sample org file</title>
<meta name="generator" content="Org mode" />
<meta name="author" content="Lucien Cartier-Tilet" />
</head>
<body>
<div id="content">
<h1 class="title">Sample org file</h1>
<div id="outline-container-orgd8e6238" class="outline-2">
<h2 id="orgd8e6238"><span class="section-number-2">1</span> First heading</h2>
<div class="outline-text-2" id="text-1">
<p>
Reference to a subheading
</p>
</div>
</div>
<div id="outline-container-org621c39a" class="outline-2">
<h2 id="org621c39a"><span class="section-number-2">2</span> Second heading</h2>
<div class="outline-text-2" id="text-2">
<p>
Some stuff written here
</p>
</div>
<div id="outline-container-orgae45d6b" class="outline-3">
<h3 id="orgae45d6b"><span class="section-number-3">2.1</span> First subheading</h3>
<div class="outline-text-3" id="text-2-1">
<p>
Some stuff
</p>
</div>
</div>
<div id="outline-container-org9301aa9" class="outline-3">
<h3 id="org9301aa9"><span class="section-number-3">2.2</span> Second subheading</h3>
<div class="outline-text-3" id="text-2-2">
<p>
Some other stuff
</p>
</div>
</div>
</div>
</div>
</body>
</html>
As you can see, all the anchors are in the format org[a-f0-9]{7}. First,
this is not really meaningful if you want to read the anchor and guess where
it will lead you. Secondly, these anchors will change each time you export
your Org file to HTML. If I want to share a URL pointing to a specific
heading on my website… well, I can’t: it will change the next time I update
the document. And I don’t want to have to set a CUSTOM_ID property manually
for each one of my headings. So, what to do?
A first solution
A first solution I found came from this blog post, where Lee Hinman
described the very same issue they had and wrote some Elisp code to remedy
that (it’s a great read, go take a look). And it worked, and for some time I
used their code in my Emacs configuration file in order to generate unique
custom IDs for my Org headers. Basically, the code detects whether auto-id:t
is set in an #+OPTIONS header. If it is, it iterates over all of the Org
headers, and for each one of them it inserts a CUSTOM_ID made from a UUID
generated by Emacs. And tada! Each header gets an
h-[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12} custom ID
that won’t change the next time we export our Org file to HTML. The IDs are
added when we save the file, and only to headings which don’t already have a
CUSTOM_ID property.
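To make the shape of those IDs concrete, here is a minimal sketch of the two building blocks described above. This is not Lee Hinman’s actual code: the names my/auto-id-enabled-p and my/uuid-custom-id are hypothetical, and I am assuming org-id-uuid (from org-id) as the UUID source.

```elisp
(require 'org-id)

(defun my/auto-id-enabled-p ()
  "Return non-nil if the current buffer has `auto-id:t' in an #+OPTIONS line.
Hypothetical helper, for illustration only."
  (save-excursion
    (save-restriction
      (widen)
      (goto-char (point-min))
      (re-search-forward "^#\\+OPTIONS:.*auto-id:t" nil t))))

(defun my/uuid-custom-id ()
  "Return a fresh ID made of the prefix \"h-\" followed by a UUID.
Hypothetical helper, for illustration only."
  (concat "h-" (org-id-uuid)))
```

Calling (my/uuid-custom-id) twice returns two different strings, which is why the value has to be stored in the CUSTOM_ID property once instead of being recomputed at each export.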
Wohoo!
Except…
These headers are not meaningful
Ok, alright, that’s still a huge step forward: we don’t have to type any
CUSTOM_ID property manually anymore, it’s done automatically for us. But
when I send someone a link like
https://langue.phundrak.com/eittland#h-76fc0b91-e41c-42ad-8652-bba029632333,
the first reaction to this URL is often something along the lines of “What
the fuck?”. And they’re right, the anchor makes this URL unreadable. How am I
supposed to guess it links to the description of the vowels of the
Eittlandic language? (That’s a constructed language I’m working on, you
won’t find anything about it outside my website.)
So, I went back to my configuration file for Emacs, and through some trial
and error, I finally found a way to get a consistent custom ID which is
readable and automatically set. With the current state of my code, what you
get is the complete path of the Org heading, with all spaces replaced by
underscores and headings separated by dashes, followed by a final unique
identifier taken from an Emacs-generated UUID. Now, the same link as above
will look like
https://langue.phundrak.com/eittland#Aperçu_structurel-Inventaire_phonétique_et_orthographe-Voyelles_pures-84f05c2c.
It won’t be more readable to you if you don’t speak French, but you can
guess it is way better than what we had before. I even added a safety net by
stripping all forward slashes from the heading. The trailing ID is there to
ensure the anchor stays unique in case two headings end up with identical
paths in the Org file for one reason or another.
The modifications I made to the first function, eos/org-id-new, are minimal:
I just split the UUID and keep its first part. This is basically a way to
shorten it.
(defun eos/org-id-new (&optional prefix)
  "Create a new globally unique ID.
An ID consists of two parts separated by a colon:
- a prefix
- a unique part that will be created according to
`org-id-method'.
PREFIX can specify the prefix, the default is given by the
variable `org-id-prefix'. However, if PREFIX is the symbol
`none', don't use any prefix even if `org-id-prefix' specifies
one.
So a typical ID could look like \"Org-4nd91V40HI\"."
  (let* ((prefix (if (eq prefix 'none)
                     ""
                   (concat (or prefix org-id-prefix)
                           "-"))) unique)
    (if (equal prefix "-")
        (setq prefix ""))
    (cond
     ((memq org-id-method
            '(uuidgen uuid))
      (setq unique (org-trim (shell-command-to-string org-id-uuid-program)))
      (unless (org-uuidgen-p unique)
        (setq unique (org-id-uuid))))
     ((eq org-id-method 'org)
      (let* ((etime (org-reverse-string (org-id-time-to-b36)))
             (postfix (if org-id-include-domain
                          (progn
                            (require 'message)
                            (concat "@"
                                    (message-make-fqdn))))))
        (setq unique (concat etime postfix))))
     (t (error "Invalid `org-id-method'")))
    (concat prefix (car (split-string unique "-")))))
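The shortening happens entirely in the last line of the function. Here is a small standalone sketch of that step; the helper name my/short-unique-part is hypothetical and only exists for this illustration.

```elisp
(defun my/short-unique-part (unique)
  "Return the part of UNIQUE that comes before its first dash.
Hypothetical helper mirroring the last line of `eos/org-id-new'."
  (car (split-string unique "-")))

;; With a UUID, only the first group of eight hexadecimal digits is kept.
(my/short-unique-part "76fc0b91-e41c-42ad-8652-bba029632333")
;; => "76fc0b91"

;; A string without any dash is returned unchanged.
(my/short-unique-part "4nd91V40HI")
;; => "4nd91V40HI"
```

Eight hexadecimal digits only carry 32 bits, so collisions are more likely than with a full UUID. Since this suffix is appended to the heading path, it only has to tell apart headings that share the exact same path.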
Next, we have here the actual generation of the custom ID. As you can see,
the let has been replaced by a let*, which allowed me to build the ID from
the variables orgpath and heading. The former holds the titles of the parent
headings joined by dashes; the latter appends the name of the current
heading to orgpath, separated by a dash if orgpath is not empty. The result
is then turned into a slug: every run of whitespace is replaced by an
underscore, and forward slashes, tildes, and square brackets are deleted.
Finally, heading is passed as the prefix to the function described above,
which appends the shortened unique ID to it.
(defun eos/org-custom-id-get (&optional pom create prefix)
  "Get the CUSTOM_ID property of the entry at point-or-marker POM.
If POM is nil, refer to the entry at point. If the entry does not
have an CUSTOM_ID, the function returns nil. However, when CREATE
is non nil, create a CUSTOM_ID if none is present already. PREFIX
will be passed through to `eos/org-id-new'. In any case, the
CUSTOM_ID of the entry is returned."
  (interactive)
  (org-with-point-at pom
    (let* ((orgpath (mapconcat #'identity (org-get-outline-path) "-"))
           (heading (replace-regexp-in-string
                     "/\\|~\\|\\[\\|\\]" ""
                     (replace-regexp-in-string
                      "[[:space:]]+" "_" (if (string= orgpath "")
                                             (org-get-heading t t t t)
                                           (concat orgpath "-" (org-get-heading t t t t))))))
           (id (org-entry-get nil "CUSTOM_ID")))
      (cond
       ((and id
             (stringp id)
             (string-match "\\S-" id)) id)
       (create (setq id (eos/org-id-new (concat prefix heading)))
               (org-entry-put pom "CUSTOM_ID" id)
               (org-id-add-location id
                                    (buffer-file-name (buffer-base-buffer)))
               id)))))
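To see the slug logic in isolation, here is a sketch that pulls it out of the function above into a pure helper working on plain strings. The name my/heading-slug and its two arguments are hypothetical; the regular expressions are the ones used in eos/org-custom-id-get.

```elisp
(defun my/heading-slug (outline-path heading)
  "Build a readable slug from OUTLINE-PATH and HEADING.
OUTLINE-PATH is a list of parent heading titles, HEADING the title
of the current heading.  Hypothetical helper, for illustration only."
  (let* ((orgpath (mapconcat #'identity outline-path "-"))
         ;; Only add a separating dash when there are parent headings.
         (full (if (string= orgpath "")
                   heading
                 (concat orgpath "-" heading))))
    ;; First turn runs of whitespace into underscores, then delete
    ;; slashes, tildes and square brackets.
    (replace-regexp-in-string
     "/\\|~\\|\\[\\|\\]" ""
     (replace-regexp-in-string "[[:space:]]+" "_" full))))

(my/heading-slug '("Aperçu structurel" "Inventaire phonétique et orthographe")
                 "Voyelles pures")
;; => "Aperçu_structurel-Inventaire_phonétique_et_orthographe-Voyelles_pures"

(my/heading-slug nil "Intro [draft] a/b")
;; => "Intro_draft_ab"
```

The real function then hands this slug to eos/org-id-new as a prefix, which is where the trailing -84f05c2c style suffix comes from.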
The rest of the code is unchanged, but here it is anyway:
(defun eos/org-add-ids-to-headlines-in-file ()
  "Add CUSTOM_ID properties to all headlines in the current file
which do not already have one.
Only adds ids if the `auto-id' option is set to `t' in the file
somewhere. ie, #+OPTIONS: auto-id:t"
  (interactive)
  (save-excursion
    (widen)
    (goto-char (point-min))
    (when (re-search-forward "^#\\+OPTIONS:.*auto-id:t"
                             (point-max)
                             t)
      (org-map-entries (lambda ()
                         (eos/org-custom-id-get (point)
                                                'create))))))
(add-hook 'org-mode-hook
          (lambda ()
            (add-hook 'before-save-hook
                      (lambda ()
                        (when (and (eq major-mode 'org-mode)
                                   (eq buffer-read-only nil))
                          (eos/org-add-ids-to-headlines-in-file))))))
Note that you will need the package org-id to make this code work. You
simply need to add the following code before the code I shared above:
(require 'org-id)
(setq org-id-link-to-org-use-id 'create-if-interactive-and-no-custom-id)
And that’s how my links are now way more readable and persistent! The only downside I found is that when you move a heading and its path changes, or when you rename the heading itself, the custom ID is not automatically updated. I could fix that by regenerating the custom ID on each save, regardless of whether one already exists, but at the risk of overwriting an ID that was set manually.
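A middle ground could be to regenerate the ID on demand, one heading at a time, rather than on every save. Here is a minimal sketch of such a command. The name my/org-regenerate-custom-id is hypothetical, and the command assumes the eos/org-custom-id-get function shown earlier in this post is already loaded.

```elisp
(require 'org)
(require 'org-id)

(defun my/org-regenerate-custom-id ()
  "Replace the CUSTOM_ID of the heading at point with a freshly generated one.
Hypothetical command, for illustration only.  It relies on
`eos/org-custom-id-get' from this post being defined."
  (interactive)
  (save-excursion
    (org-back-to-heading t)
    ;; Drop the stale ID so that the generator treats the entry as new.
    (org-entry-delete (point) "CUSTOM_ID")
    (eos/org-custom-id-get (point) 'create)))
```

Keep in mind that any link already shared with the old anchor will stop working once the ID has been regenerated, so this is best kept as an explicit, manual action.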