#+title: LangEvolve-rs
#+author: Lucien Cartier-Tilet

* Introduction
LangEvolve-rs is a Rust rewrite of the original [[https://github.com/ceronyon/LangEvolve/][LangEvolve project]] written by
[[https://github.com/ceronyon/LangEvolve/][Ceronyon]]. It is a conlanging tool used to apply sound change rules to
words or text.

* Differences with the original project
The main difference from the original project lies in its settings format:
while the original project only supports the JSON format, this project
supports both the JSON and the YAML formats. The settings are also represented
differently in JSON between the original project and this one. Lastly, the
regex crate used in this project does not support certain expressions, such as
look-ahead and look-behind searches, and backreferences. To illustrate these
differences, here is the example JSON given by the original project for Latin
to Portuguese:
#+BEGIN_SRC json
{
    "version" : "1",
    "categories" : {
        "V" : "aeiou",
        "L" : "āēīōū",
        "C" : "ptcqbdgmnlrhs",
        "F" : "ie",
        "B" : "ou",
        "S" : "ptc",
        "Z" : "bdg"
    },
    "rules" : [
        { "[sm]$" : "" },
        { "i(%V)" : "j\\1" },
        { "%L" : "%V" },
        { "(%Vr)e$" : "\\1" },
        { "(%V)v(%V)" : "\\1\\2" },
        { "u$" : "o" },
        { "gn" : "nh" },
        { "(%V)p(?=%V)" : "\\1b" },
        { "(%V)t(?=%V)" : "\\1d" },
        { "(%V)c(?=%V)" : "\\1g" },
        { "(%F)ct" : "\\1it" },
        { "(%B)ct" : "\\1ut" },
        { "(%V)pt" : "\\1t" },
        { "ii" : "i" },
        { "(%C)er(%V)" : "\\1r\\2" },
        { "lj" : "lh" }
    ]
}
#+END_SRC

Compared with this format, LangEvolve-rs writes backreferences in replacements
as ~$1~ instead of ~\1~ (or ~${1}~ when the reference is directly followed by
other characters), and look-ahead and look-behind expressions must be
rewritten as regular groups incorporated into the expression, with the matched
text restored in the replacement.
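
The change in reference syntax can be sketched in Rust as follows. This is a
minimal illustration and not code taken from LangEvolve-rs: the function name
~convert_replacement~ is hypothetical, and the sketch assumes that every
backslash followed by digits in a replacement string is a group reference. It
always emits the braced ~${1}~ form, which stays unambiguous whatever follows
the reference.

#+BEGIN_SRC rust
/// Convert backslash-style group references (`\1`) found in a replacement
/// string into the braced `${1}` form.
/// Hypothetical helper: assumes any backslash followed by digits is a
/// group reference, and reads all consecutive digits as one group number.
fn convert_replacement(original: &str) -> String {
    let mut converted = String::with_capacity(original.len() + 4);
    let mut chars = original.chars().peekable();
    while let Some(c) = chars.next() {
        if c != '\\' {
            converted.push(c);
            continue;
        }
        // Collect the digits that directly follow the backslash, if any.
        let mut digits = String::new();
        while let Some(d) = chars.peek().copied().filter(|d| d.is_ascii_digit()) {
            digits.push(d);
            chars.next();
        }
        if digits.is_empty() {
            // Not a group reference: keep the backslash untouched.
            converted.push(c);
        } else {
            converted.push_str("${");
            converted.push_str(&digits);
            converted.push('}');
        }
    }
    converted
}

fn main() {
    // Replacement strings taken from the original example, once the JSON
    // escaping (`\\1`) has been decoded to a single backslash.
    for rule in ["j\\1", "\\1\\2", "\\1r\\2"] {
        println!("{} -> {}", rule, convert_replacement(rule));
    }
}
#+END_SRC

Look-ahead and look-behind expressions still have to be rewritten by hand,
since this sketch only touches replacement strings.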

And here is the JSON generated by this project (beautified; the original is on
a single line without unnecessary whitespace):
#+BEGIN_SRC json
{
    "version": "1",
    "categories": {
        "S": "ptc",
        "L": "āēīōū",
        "V": "aeiou",
        "Z": "bgd",
        "F": "ie",
        "C": "ptcqbdgmnlrhs",
        "B": "ou"
    },
    "rules": [
        ["[sm]$", ""],
        ["i(%V)", "j$1"],
        ["%L", "%V"],
        ["(%Vr)e$", "$1"],
        ["(%V)v(%V)", "${1}$2"],
        ["u$", "o"],
        ["gn", "nh"],
        ["(%V)p(%V)", "${1}b$2"],
        ["(%V)t(%V)", "${1}d$2"],
        ["(%V)c(%V)", "${1}g$2"],
        ["(%F)ct", "${1}it"],
        ["(%B)ct", "${1}ut"],
        ["(%V)pt", "${1}t"],
        ["ii", "i"],
        ["(%C)er(%V)", "${1}r$2"],
        ["lj", "lh"]
    ]
}
#+END_SRC

Here is the YAML equivalent generated by this project:
#+BEGIN_SRC yaml
---
version: "1"
categories:
  B: ou
  S: ptc
  L: āēīōū
  Z: bgd
  C: ptcqbdgmnlrhs
  F: ie
  V: aeiou
rules:
  - - "[sm]$"
    - ""
  - - i(%V)
    - j$1
  - - "%L"
    - "%V"
  - - (%Vr)e$
    - $1
  - - (%V)v(%V)
    - "${1}$2"
  - - u$
    - o
  - - gn
    - nh
  - - (%V)p(%V)
    - "${1}b$2"
  - - (%V)t(%V)
    - "${1}d$2"
  - - (%V)c(%V)
    - "${1}g$2"
  - - (%F)ct
    - "${1}it"
  - - (%B)ct
    - "${1}ut"
  - - (%V)pt
    - "${1}t"
  - - ii
    - i
  - - (%C)er(%V)
    - "${1}r$2"
  - - lj
    - lh
#+END_SRC

Although most of the generated rules are not enclosed in double quotes, it is
preferable to quote every pattern and replacement, as follows, in order to
avoid any issues with LangEvolve-rs:
#+BEGIN_SRC yaml
---
version: "1"
categories:
  B: ou
  S: ptc
  L: āēīōū
  Z: bgd
  C: ptcqbdgmnlrhs
  F: ie
  V: aeiou
rules:
  - - "[sm]$"
    - ""
  - - "i(%V)"
    - "j$1"
  - - "%L"
    - "%V"
  - - "(%Vr)e$"
    - "$1"
  - - "(%V)v(%V)"
    - "${1}$2"
  - - "u$"
    - "o"
  - - "gn"
    - "nh"
  - - "(%V)p(%V)"
    - "${1}b$2"
  - - "(%V)t(%V)"
    - "${1}d$2"
  - - "(%V)c(%V)"
    - "${1}g$2"
  - - "(%F)ct"
    - "${1}it"
  - - "(%B)ct"
    - "${1}ut"
  - - "(%V)pt"
    - "${1}t"
  - - "ii"
    - "i"
  - - "(%C)er(%V)"
    - "${1}r$2"
  - - "lj"
    - "lh"
#+END_SRC

You can find more information on how to use regular expressions with this
project in the [[https://docs.rs/regex/1.3.6/regex/][documentation of the regex crate]].

The settings schema is designed to be fully backward-compatible. This means
that if the version number of the settings format goes up, older settings
files will remain compatible with newer versions of the software. However,
forward compatibility cannot be ensured unless someone invents time travel,
hence settings files with a higher version than the running software’s will be
automatically rejected.
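
The compatibility rule above can be sketched as a simple version check. This
is only an illustration under stated assumptions, not the actual
implementation: the names ~SUPPORTED_VERSION~ and ~check_version~ are
hypothetical, and the sketch assumes that the ~version~ field always holds a
plain non-negative integer written as a string, as in the examples above.

#+BEGIN_SRC rust
/// Highest settings version this hypothetical build understands.
const SUPPORTED_VERSION: u32 = 1;

/// Decide whether a settings file declaring `file_version` can be loaded.
/// Older or equal versions are accepted (backward compatibility), newer
/// ones are rejected (no forward compatibility).
fn check_version(file_version: &str) -> Result<u32, String> {
    // Assumption: the version is a plain integer stored as a string.
    let version: u32 = file_version
        .trim()
        .parse()
        .map_err(|e| format!("invalid settings version {:?}: {}", file_version, e))?;
    if version > SUPPORTED_VERSION {
        Err(format!(
            "settings version {} is newer than supported version {}",
            version, SUPPORTED_VERSION
        ))
    } else {
        Ok(version)
    }
}

fn main() {
    for declared in ["1", "2", "abc"] {
        match check_version(declared) {
            Ok(version) => println!("version {} accepted", version),
            Err(reason) => println!("rejected: {}", reason),
        }
    }
}
#+END_SRC

Returning a ~Result~ lets the caller report why a settings file was rejected
instead of silently ignoring it.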

* License
LangEvolve-rs is licensed under the AGPLv3 license. The full license text can
be found [[file:agpl-3.0.txt][here]].