+++
title = "[EN] Automatic Meaningful Custom IDs for Org Headings"
author = ["Lucien “Phundrak” Cartier-Tilet"]
date = 2020-06-06
tags = ["emacs", "orgmode"]
categories = ["emacs", "linux", "conlanging", "orgmode"]
draft = false
[menu.main]
  weight = 2001
  identifier = "en-automatic-meaningful-custom-ids-for-org-headings"
+++

Spoiler alert: I will just modify a bit of code that already exists. Go
directly to the bottom if you want the solution, or read the whole post if
you are interested in how I got there.

<div class="ox-hugo-toc toc local">
<div></div>

- [The issue](#the-issue)
- [A first solution](#a-first-solution)
- [These headers are not meaningful](#these-headers-are-not-meaningful)

</div>
<!--endtoc-->

## The issue {#the-issue}

About two to three years ago, as I was working on a project that was meant
to be published on the internet, I looked for a solution to get fixed anchor
links to my various headings when I performed HTML exports. As some of you
may know, by default when an Org file is exported to an HTML file, a random
ID is generated for each heading, and this ID is used as its anchor.
Here’s a quick example of a simple Org file:

```org
#+title: Sample org file
* First heading
Reference to a subheading
* Second heading
Some stuff written here
** First subheading
Some stuff
** Second subheading
Some other stuff
```

<div class="src-block-caption">
  <span class="src-block-number">Code Snippet 1</span>:
  Example org file
</div>

And this is the result once exported to HTML (with a lot of noise removed
from `<head>`):

```html
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">

<head>
<title>Sample org file</title>
<meta name="generator" content="Org mode" />
<meta name="author" content="Lucien Cartier-Tilet" />
</head>

<body>
<div id="content">
<h1 class="title">Sample org file</h1>
<div id="outline-container-orgd8e6238" class="outline-2">
<h2 id="orgd8e6238"><span class="section-number-2">1</span> First heading</h2>
<div class="outline-text-2" id="text-1">
<p>
Reference to a subheading
</p>
</div>
</div>
<div id="outline-container-org621c39a" class="outline-2">
<h2 id="org621c39a"><span class="section-number-2">2</span> Second heading</h2>
<div class="outline-text-2" id="text-2">
<p>
Some stuff written here
</p>
</div>
<div id="outline-container-orgae45d6b" class="outline-3">
<h3 id="orgae45d6b"><span class="section-number-3">2.1</span> First subheading</h3>
<div class="outline-text-3" id="text-2-1">
<p>
Some stuff
</p>
</div>
</div>
<div id="outline-container-org9301aa9" class="outline-3">
<h3 id="org9301aa9"><span class="section-number-3">2.2</span> Second subheading</h3>
<div class="outline-text-3" id="text-2-2">
<p>
Some other stuff
</p>
</div>
</div>
</div>
</div>
</body>

</html>
```

<div class="src-block-caption">
  <span class="src-block-number">Code Snippet 2</span>:
  Output HTML file
</div>

As you can see, all the anchors are in the format `org[a-f0-9]{7}`. First,
this is not really meaningful if you want to read the anchor and guess where
it will lead you. Second, these anchors will change each time you export
your Org file to HTML. If I want to share a URL to a specific heading on my
website… well, I can’t: it will change the next time I update the document.
And I don’t want to have to set a `CUSTOM_ID` property manually for each one
of my headings. So, what to do?

## A first solution {#a-first-solution}

A first solution I found came from [this blog post](https://writequit.org/articles/emacs-org-mode-generate-ids.html), where Lee Hinman
described the very same issue they had and wrote some Elisp code to remedy
it (it’s a great read, go take a look). It worked, and for some time I used
their code in my Emacs configuration file to generate unique custom IDs for
my Org headings. Basically, the code checks whether `auto-id:t` is set in an
`#+OPTIONS` line. If it is, then each time the file is saved it iterates
over all the Org headings and inserts a `CUSTOM_ID`, built from a UUID
generated by Emacs, into every heading that doesn’t already have one. And
tada! Each heading gets a
`h-[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}` custom ID
that won’t change the next time we export our Org file to HTML. Woohoo!

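As a rough illustration of the detection step described above, here is a
minimal sketch of a predicate that checks for `auto-id:t` on an `#+OPTIONS`
line. It is not Lee Hinman’s actual code, and the name
`my/auto-id-enabled-p` is made up for this example.

```emacs-lisp
;; Sketch only: `my/auto-id-enabled-p' is a hypothetical helper name,
;; not part of Org nor of the original blog post.
(defun my/auto-id-enabled-p ()
  "Return non-nil if the current buffer has `auto-id:t' on an #+OPTIONS line."
  (save-excursion
    (save-restriction
      ;; Look at the whole buffer, even if it is currently narrowed.
      (widen)
      (goto-char (point-min))
      ;; `re-search-forward' returns the end position of the match,
      ;; or nil when nothing matches (third argument t).
      (re-search-forward "^#\\+OPTIONS:.*auto-id:t" nil t))))
```

Called in an Org buffer, it returns non-nil only when the option is present,
so files without it are left untouched.
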
Except…

## These headers are not meaningful {#these-headers-are-not-meaningful}

Ok, alright, that’s still a huge step forward: we don’t have to type any
`CUSTOM_ID` property manually anymore, it’s done automatically for us. But
when I send someone a link like
`https://langue.phundrak.com/eittland#h-76fc0b91-e41c-42ad-8652-bba029632333`,
the first reaction to this URL is often something along the lines of “What
the fuck?”. And they’re right, the anchor in this URL is unreadable. How is
anyone supposed to guess it links to the description of the vowels of the
Eittlandic language? (That’s a constructed language I’m working on, you
won’t find anything about it outside my website.)

So I went back to my Emacs configuration file and, through some trial and
error, finally found a way to get a consistent custom ID that is readable
and set automatically. With the current state of my code, what you get is
the complete path of the Org heading, with all whitespace replaced by
underscores and the headings separated by dashes, followed by a final unique
identifier taken from an Emacs-generated UUID. Now, the same link as above
looks like
`https://langue.phundrak.com/eittland#Aperçu_structurel-Inventaire_phonétique_et_orthographe-Voyelles_pures-84f05c2c`.
It won’t be more readable to you if you don’t speak French, but you can
guess it is way better than what we had before. I even added a safety net
that strips forward slashes, tildes and square brackets from the result. The
final identifier is there to ensure the ID stays unique in case two headings
end up with identical paths in the Org file for one reason or another.

The modifications I made to the first function, `eos/org-id-new`, are
minimal: I just split the UUID on its dashes and keep its first part. This
is basically a way to shorten it.

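To make that shortening step concrete, here it is in isolation, applied to
the UUID from the example link above. The helper name
`my/uuid-first-segment` is only for illustration; the actual function below
does the same thing inline.

```emacs-lisp
;; Illustrative helper (hypothetical name): keep only the text before
;; the first dash of a UUID-like string.
(defun my/uuid-first-segment (uuid)
  "Return the part of UUID that precedes its first dash."
  (car (split-string uuid "-")))

(my/uuid-first-segment "76fc0b91-e41c-42ad-8652-bba029632333")
;; => "76fc0b91"
```
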
```emacs-lisp
(defun eos/org-id-new (&optional prefix)
  "Create a new globally unique ID.

An ID consists of two parts separated by a dash:
- a prefix
- a unique part that will be created according to
  `org-id-method'.

PREFIX can specify the prefix, the default is given by the
variable `org-id-prefix'.  However, if PREFIX is the symbol
`none', don't use any prefix even if `org-id-prefix' specifies
one.

So a typical ID could look like \"Org-4nd91V40HI\"."
  (let* ((prefix (if (eq prefix 'none)
                     ""
                   (concat (or prefix org-id-prefix)
                           "-")))
         unique)
    ;; A lone dash means there was no prefix at all
    (if (equal prefix "-")
        (setq prefix ""))
    (cond
     ((memq org-id-method
            '(uuidgen uuid))
      (setq unique (org-trim (shell-command-to-string org-id-uuid-program)))
      ;; Fall back to Org's own UUID generator if the program failed
      (unless (org-uuidgen-p unique)
        (setq unique (org-id-uuid))))
     ((eq org-id-method 'org)
      (let* ((etime (org-reverse-string (org-id-time-to-b36)))
             (postfix (if org-id-include-domain
                          (progn
                            (require 'message)
                            (concat "@"
                                    (message-make-fqdn))))))
        (setq unique (concat etime postfix))))
     (t (error "Invalid `org-id-method'")))
    ;; Keep only the part of the unique ID before its first dash
    (concat prefix (car (split-string unique "-")))))
```

Next comes the actual generation of the custom ID. As you can see, the `let`
has been replaced by a `let*`, which allowed me to build the ID from the
variables `orgpath` and `heading`. The former holds the outline path leading
to the heading, joined by dashes; `heading` appends the name of the current
heading to `orgpath`, separated by a dash if `orgpath` is not empty. The
result is then turned into a slug: all whitespace is replaced by
underscores, and some characters such as forward slashes, tildes and square
brackets are deleted. Finally, `heading` is passed as the prefix to the
function described above, which appends the unique ID to it.

```emacs-lisp
(defun eos/org-custom-id-get (&optional pom create prefix)
  "Get the CUSTOM_ID property of the entry at point-or-marker POM.

If POM is nil, refer to the entry at point.  If the entry does not
have a CUSTOM_ID, the function returns nil.  However, when CREATE
is non-nil, create a CUSTOM_ID if none is present already.  PREFIX
will be passed through to `eos/org-id-new'.  In any case, the
CUSTOM_ID of the entry is returned."
  (interactive)
  (org-with-point-at pom
    (let* (;; Titles of the parent headings, joined by dashes
           (orgpath (mapconcat #'identity (org-get-outline-path) "-"))
           ;; Full path including the current heading, turned into a slug:
           ;; whitespace becomes "_", then "/", "~", "[" and "]" are removed
           (heading (replace-regexp-in-string
                     "/\\|~\\|\\[\\|\\]" ""
                     (replace-regexp-in-string
                      "[[:space:]]+" "_" (if (string= orgpath "")
                                             (org-get-heading t t t t)
                                           (concat orgpath "-" (org-get-heading t t t t))))))
           (id (org-entry-get nil "CUSTOM_ID")))
      (cond
       ;; An existing, non-blank CUSTOM_ID is returned untouched
       ((and id
             (stringp id)
             (string-match "\\S-" id)) id)
       ;; Otherwise create one, using the slug as the prefix
       (create (setq id (eos/org-id-new (concat prefix heading)))
               (org-entry-put pom "CUSTOM_ID" id)
               (org-id-add-location id
                                    (buffer-file-name (buffer-base-buffer)))
               id)))))
```

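To see the slug transformation on its own, without needing an Org buffer,
here is a small sketch that applies the same two regular expressions to a
list of parent headings and a heading title. The function name
`my/heading-slug` and its arguments are assumptions made for this example;
the real code reads the path and the title from the entry at point.

```emacs-lisp
;; Sketch under assumptions: OUTLINE-PATH stands in for the result of
;; `org-get-outline-path' and HEADING for `org-get-heading'.
(defun my/heading-slug (outline-path heading)
  "Build a readable slug from OUTLINE-PATH (a list of strings) and HEADING."
  (let ((full (mapconcat #'identity
                         (append outline-path (list heading))
                         "-")))
    ;; First turn runs of whitespace into underscores, then drop
    ;; forward slashes, tildes and square brackets.
    (replace-regexp-in-string
     "/\\|~\\|\\[\\|\\]" ""
     (replace-regexp-in-string "[[:space:]]+" "_" full))))

(my/heading-slug '("Aperçu structurel" "Inventaire phonétique et orthographe")
                 "Voyelles pures")
;; => "Aperçu_structurel-Inventaire_phonétique_et_orthographe-Voyelles_pures"
```

The short unique suffix (such as `-84f05c2c` in the link above) is then
appended by `eos/org-id-new`.
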
The rest of the code is unchanged; here it is anyway:

```emacs-lisp
(defun eos/org-add-ids-to-headlines-in-file ()
  "Add CUSTOM_ID properties to all headlines in the current file
which do not already have one.

Only adds ids if the `auto-id' option is set to `t' in the file
somewhere. ie, #+OPTIONS: auto-id:t"
  (interactive)
  (save-excursion
    (widen)
    (goto-char (point-min))
    (when (re-search-forward "^#\\+OPTIONS:.*auto-id:t"
                             (point-max)
                             t)
      (org-map-entries (lambda ()
                         (eos/org-custom-id-get (point)
                                                'create))))))

(add-hook 'org-mode-hook
          (lambda ()
            (add-hook 'before-save-hook
                      (lambda ()
                        (when (and (eq major-mode 'org-mode)
                                   (eq buffer-read-only nil))
                          (eos/org-add-ids-to-headlines-in-file))))))
```

Note that you **will need** the package `org-id` to make this code work. You
simply need to add the following code before the code I shared above:

```emacs-lisp
(require 'org-id)
(setq org-id-link-to-org-use-id 'create-if-interactive-and-no-custom-id)
```

And that’s how my links are now way more readable **and** persistent! The only
downside I found is that when you move headings and their path changes, or
when you edit the heading itself, the custom ID is not automatically
updated. I could fix that by regenerating the custom ID on each save,
regardless of whether one already exists, but that would risk overwriting
an ID that was set manually.

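One possible way to limit that risk, which I have not adopted in the code
above, would be to regenerate only the IDs that look auto-generated. The
sketch below is a heuristic, not a guarantee: it assumes `org-id-method`
produces UUIDs, so generated IDs end in a dash followed by eight hexadecimal
digits, and a hand-written ID that happens to end the same way would be
misdetected. The name `my/auto-generated-custom-id-p` is hypothetical.

```emacs-lisp
;; Heuristic sketch (hypothetical helper): assumes UUID-based IDs, whose
;; shortened suffix is eight lowercase hexadecimal digits.
(defun my/auto-generated-custom-id-p (id)
  "Return non-nil if ID ends with a dash followed by eight hex digits."
  (and (stringp id)
       (let ((case-fold-search nil))
         ;; \\' anchors the match at the very end of the string
         (string-match-p "-[0-9a-f]\\{8\\}\\'" id))))
```

A regeneration pass could then skip every heading whose existing
`CUSTOM_ID` fails this test.
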
<div class="html">
<div></div>

<script defer src="https://commento.phundrak.com/js/commento.js"></script>
<div id="commento"></div>

</div>