+++
title = "[EN] Automatic Meaningful Custom IDs for Org Headings"
author = ["Lucien “Phundrak” Cartier-Tilet"]
date = 2020-06-06
tags = ["emacs", "orgmode"]
categories = ["emacs", "linux", "conlanging", "orgmode"]
draft = false
[menu.main]
  weight = 2001
  identifier = "en-automatic-meaningful-custom-ids-for-org-headings"
+++

Spoiler alert: I will just modify a bit of code that already exists. Go directly to the bottom if you want the solution, or read the whole post if you are interested in how I got there.

- [The issue](#the-issue)
- [A first solution](#a-first-solution)
- [These headers are not meaningful](#these-headers-are-not-meaningful)

## The issue {#the-issue}

About two to three years ago, as I was working on a project that was meant to be published on the internet, I looked for a solution to get fixed anchor links to my various headings when I performed HTML exports. As some of you may know, by default when an Org file is exported to an HTML file, a random ID will be generated for each header, and this ID will be used as their anchor. Here’s a quick example of a simple org file:

```org
#+title: Sample org file

* First heading
Reference to a subheading

* Second heading
Some stuff written here

** First subheading
Some stuff

** Second subheading
Some other stuff
```
Code Snippet 1: Example org file

And this is the result once exported to HTML (with a lot of noise removed from `<head>`):

```html
Sample org file

Sample org file

1 First heading

Reference to a subheading

2 Second heading

Some stuff written here

2.1 First subheading

Some stuff

2.2 Second subheading

Some other stuff

```
Code Snippet 2: Output HTML file

As you can see, all the anchors are in the format `org[a-f0-9]{7}`. First, this is not really meaningful if you want to read the anchor and guess where it will lead you. Second, these anchors change each time you export your Org file to HTML. If I want to share a URL to a specific heading on my website… well, I can’t: it will change the next time I update the document. And I don’t want to have to set a `CUSTOM_ID` property manually for each one of my headings. So, what to do?

## A first solution {#a-first-solution}

A first solution I found came from [this blog post](https://writequit.org/articles/emacs-org-mode-generate-ids.html), where Lee Hinman described the very same issue and wrote some Elisp code to remedy it (it’s a great read, go take a look). It worked, and for some time I used their code in my Emacs configuration file to generate unique custom IDs for my Org headers. Basically, the code detects whether `auto-id:t` is set in an `#+OPTIONS` header. If it is, then each time the file is saved it iterates over all of the Org headers and, for each heading that doesn’t already have a `CUSTOM_ID` property, inserts one made from a UUID generated by Emacs. And tada! Each header gets a `h-[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}` custom ID that won’t change the next time we export our Org file to HTML. Woohoo! Except…

## These headers are not meaningful {#these-headers-are-not-meaningful}

OK, alright, that’s still a huge step forward: we don’t have to type any `CUSTOM_ID` property manually anymore, it’s done automatically for us. But when I send someone a link like `https://langue.phundrak.com/eittland#h-76fc0b91-e41c-42ad-8652-bba029632333`, the first reaction to this URL is often something along the lines of “What the fuck?”. And they’re right, this URL is unreadable when it comes to the anchor.
How am I supposed to guess it links to the description of the vowels of the Eittlandic language? (That’s a constructed language I’m working on, you won’t find anything about it outside my website.)

So, I went back to my configuration file for Emacs and, through some trial and error, I finally found a way to get a consistent custom ID which is readable and automatically set. With the current state of my code, what you get is the complete path of the Org heading, with all whitespace replaced by underscores and headings separated by dashes, followed by a final unique identifier taken from an Emacs-generated UUID. Now, the same link as above will look like `https://langue.phundrak.com/eittland#Aperçu_structurel-Inventaire_phonétique_et_orthographe-Voyelles_pures-84f05c2c`. It won’t be more readable to you if you don’t speak French, but you can guess it is way better than what we had before. I even added a safety net by removing all forward slashes, tildes, and square brackets. The trailing identifier ensures the anchor stays unique in case two headings end up with identical paths in the org file for one reason or another.

The modifications I made to the first function, `eos/org-id-new`, are minimal: I just split the UUID on its dashes and keep its first part. This is basically a way to shorten it.

```emacs-lisp
(defun eos/org-id-new (&optional prefix)
  "Create a new globally unique ID.

An ID consists of two parts separated by a dash:
- a prefix
- a unique part that will be created according to `org-id-method'.

PREFIX can specify the prefix, the default is given by the variable
`org-id-prefix'.  However, if PREFIX is the symbol `none', don't use any
prefix even if `org-id-prefix' specifies one.

So a typical ID could look like \"Org-4nd91V40HI\"."
  (let* ((prefix (if (eq prefix 'none)
                     ""
                   (concat (or prefix org-id-prefix) "-")))
         unique)
    (if (equal prefix "-") (setq prefix ""))
    (cond
     ((memq org-id-method '(uuidgen uuid))
      (setq unique (org-trim (shell-command-to-string org-id-uuid-program)))
      (unless (org-uuidgen-p unique)
        (setq unique (org-id-uuid))))
     ((eq org-id-method 'org)
      (let* ((etime (org-reverse-string (org-id-time-to-b36)))
             (postfix (if org-id-include-domain
                          (progn
                            (require 'message)
                            (concat "@" (message-make-fqdn))))))
        (setq unique (concat etime postfix))))
     (t (error "Invalid `org-id-method'")))
    ;; Keep only the first dash-separated group of the unique part.
    (concat prefix (car (split-string unique "-")))))
```

Next, we have the actual generation of the custom ID. As you can see, the `let` has been replaced by a `let*`, which allowed me to build the ID from the variables `orgpath` and `heading`. The former holds the titles of the parent headings joined by dashes. The latter appends the name of the current heading to `orgpath`, separated by a dash if `orgpath` is not empty, and then turns the result into a slug: every run of whitespace is replaced by an underscore, and some characters such as forward slashes or tildes are deleted. `heading` is then passed as the prefix to the function described above, which appends the unique ID to it.

```emacs-lisp
(defun eos/org-custom-id-get (&optional pom create prefix)
  "Get the CUSTOM_ID property of the entry at point-or-marker POM.

If POM is nil, refer to the entry at point.  If the entry does not
have a CUSTOM_ID, the function returns nil.  However, when CREATE is
non-nil, create a CUSTOM_ID if none is present already.  PREFIX will
be passed through to `eos/org-id-new'.  In any case, the CUSTOM_ID of
the entry is returned."
  (interactive)
  (org-with-point-at pom
    (let* ((orgpath (mapconcat #'identity (org-get-outline-path) "-"))
           (heading (replace-regexp-in-string
                     "/\\|~\\|\\[\\|\\]" ""
                     (replace-regexp-in-string
                      "[[:space:]]+" "_"
                      (if (string= orgpath "")
                          (org-get-heading t t t t)
                        (concat orgpath "-" (org-get-heading t t t t))))))
           (id (org-entry-get nil "CUSTOM_ID")))
      (cond
       ;; An existing, non-blank CUSTOM_ID is returned untouched.
       ((and id (stringp id) (string-match "\\S-" id))
        id)
       (create
        (setq id (eos/org-id-new (concat prefix heading)))
        (org-entry-put pom "CUSTOM_ID" id)
        (org-id-add-location id (buffer-file-name (buffer-base-buffer)))
        id)))))
```

The rest of the code is unchanged, but here it is anyway:

```emacs-lisp
(defun eos/org-add-ids-to-headlines-in-file ()
  "Add CUSTOM_ID properties to all headlines in the current file
which do not already have one.

Only adds ids if the `auto-id' option is set to `t' in the file
somewhere, i.e. #+OPTIONS: auto-id:t"
  (interactive)
  (save-excursion
    (widen)
    (goto-char (point-min))
    (when (re-search-forward "^#\\+OPTIONS:.*auto-id:t" (point-max) t)
      (org-map-entries (lambda () (eos/org-custom-id-get (point) 'create))))))

(add-hook 'org-mode-hook
          (lambda ()
            (add-hook 'before-save-hook
                      (lambda ()
                        (when (and (eq major-mode 'org-mode)
                                   (eq buffer-read-only nil))
                          (eos/org-add-ids-to-headlines-in-file))))))
```

Note that you **will need** the `org-id` library, which ships with Org, to make this code work. You simply need to add the following code before the code I shared above:

```emacs-lisp
(require 'org-id)
(setq org-id-link-to-org-use-id 'create-if-interactive-and-no-custom-id)
```

And that’s how my links are now way more readable **and** persistent! The only downside I found is that when you move headings and their path changes, or when you modify the heading itself, the custom ID is not automatically updated. I could fix that by regenerating the custom ID on each save, regardless of whether one already exists, but at the risk of overwriting an ID that was set manually.
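
If you want to check what the readable part of an anchor will look like without touching an Org buffer, the slug-building step can be pulled out into a small pure function. This is only a sketch: `my/org-heading-slug` is a hypothetical helper that is not part of the code above, and it assumes the outline path is given as a plain list of heading titles, the way `org-get-outline-path` returns it.

```emacs-lisp
;; Hypothetical helper (not part of the original code): it isolates the
;; slug-building step of `eos/org-custom-id-get' so it can be evaluated
;; on plain strings.
(defun my/org-heading-slug (outline-path heading)
  "Return a readable slug for HEADING located under OUTLINE-PATH.
OUTLINE-PATH is a list of ancestor heading titles (possibly nil).
HEADING is the title of the current heading."
  (let* ((orgpath (mapconcat #'identity outline-path "-"))
         (full (if (string= orgpath "")
                   heading
                 (concat orgpath "-" heading))))
    ;; Same two passes as above: runs of whitespace become underscores,
    ;; then slashes, tildes and square brackets are dropped.
    (replace-regexp-in-string
     "/\\|~\\|\\[\\|\\]" ""
     (replace-regexp-in-string "[[:space:]]+" "_" full))))

(my/org-heading-slug '("Second heading") "First subheading")
;; => "Second_heading-First_subheading"
```

With the sample file from the beginning of this post, the anchor of “First subheading” would therefore be `Second_heading-First_subheading` followed by a dash and the first group of a UUID.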