#+title: LangEvolve-rs
#+author: Lucien Cartier-Tilet
* Introduction
LangEvolve-rs is a Rust rewrite of the original [[https://github.com/ceronyon/LangEvolve/][LangEvolve project]] written by
[[https://github.com/ceronyon/LangEvolve/][Ceronyon]]. It is a conlanging tool that applies sound change rules to words
or text.
* Differences with the original project
The main difference from the original project lies in the settings format:
while the original project only supports JSON, this project supports both
JSON and YAML. The settings are also represented differently in JSON between
the original project and this one. Lastly, the regex crate used in this
project does not support look-ahead or look-behind assertions, nor
backreferences inside a pattern, and it uses a different syntax for
capture-group references in replacement strings. To illustrate, here is the
example JSON given by the original project for Latin to Portuguese:
#+BEGIN_SRC json
{
  "version" : "1",
  "categories" : {
    "V" : "aeiou",
    "L" : "āēīōū",
    "C" : "ptcqbdgmnlrhs",
    "F" : "ie",
    "B" : "ou",
    "S" : "ptc",
    "Z" : "bdg"
  },
  "rules" : [
    { "[sm]$" : "" },
    { "i(%V)" : "j\\1" },
    { "%L" : "%V" },
    { "(%Vr)e$" : "\\1" },
    { "(%V)v(%V)" : "\\1\\2" },
    { "u$" : "o" },
    { "gn" : "nh" },
    { "(%V)p(?=%V)" : "\\1b" },
    { "(%V)t(?=%V)" : "\\1d" },
    { "(%V)c(?=%V)" : "\\1g" },
    { "(%F)ct" : "\\1it" },
    { "(%B)ct" : "\\1ut" },
    { "(%V)pt" : "\\1t" },
    { "ii" : "i" },
    { "(%C)er(%V)" : "\\1r\\2" },
    { "lj" : "lh" }
  ]
}
#+END_SRC
As you can see below, capture-group references in replacement strings change
from ~\1~ to ~$1~, or to ~${1}~ when the reference is immediately followed by
a letter, digit or underscore (~$1b~ would otherwise be read as a group named
~1b~). Look-ahead and look-behind expressions must be rewritten as ordinary
capture groups that are copied back in the replacement: for instance,
~(%V)p(?=%V)~ with ~\1b~ becomes ~(%V)p(%V)~ with ~${1}b$2~.

Here is the JSON generated by this project (beautified; the actual output is a
single line without unnecessary whitespace):
#+BEGIN_SRC json
{
  "version": "1",
  "categories": {
    "S": "ptc",
    "L": "āēīōū",
    "V": "aeiou",
    "Z": "bgd",
    "F": "ie",
    "C": "ptcqbdgmnlrhs",
    "B": "ou"
  },
  "rules": [
    ["[sm]$", ""],
    ["i(%V)", "j$1"],
    ["%L", "%V"],
    ["(%Vr)e$", "$1"],
    ["(%V)v(%V)", "${1}$2"],
    ["u$", "o"],
    ["gn", "nh"],
    ["(%V)p(%V)", "${1}b$2"],
    ["(%V)t(%V)", "${1}d$2"],
    ["(%V)c(%V)", "${1}g$2"],
    ["(%F)ct", "${1}it"],
    ["(%B)ct", "${1}ut"],
    ["(%V)pt", "${1}t"],
    ["ii", "i"],
    ["(%C)er(%V)", "${1}r$2"],
    ["lj", "lh"]
  ]
}
#+END_SRC
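When porting replacement strings from the original project by hand, the
change from ~\1~ to ~${1}~ is mechanical. The following is a minimal sketch of
that rewrite using only the Rust standard library. The function name
~convert_backrefs~ is hypothetical and is not part of LangEvolve-rs. It always
emits the braced form, which is safe but slightly more verbose than the
generated output above, and it doubles any literal ~$~ because the regex crate
reads ~$$~ as a literal dollar sign. It does not rewrite look-ahead or
look-behind patterns, which still have to be converted manually.
#+BEGIN_SRC rust
/// Hypothetical helper (not part of LangEvolve-rs): rewrite `\1`-style
/// references in a replacement string into the `${1}` form understood by
/// the regex crate.
fn convert_backrefs(replacement: &str) -> String {
    let mut out = String::with_capacity(replacement.len() + 4);
    let mut chars = replacement.chars().peekable();
    while let Some(c) = chars.next() {
        match c {
            '\\' => {
                // Collect the digits that directly follow the backslash.
                let mut digits = String::new();
                while let Some(d) = chars.peek().copied().filter(|d| d.is_ascii_digit()) {
                    digits.push(d);
                    chars.next();
                }
                if digits.is_empty() {
                    // A backslash not followed by a digit is kept unchanged.
                    out.push('\\');
                } else {
                    out.push_str("${");
                    out.push_str(&digits);
                    out.push('}');
                }
            }
            // A literal dollar sign must be doubled in the new syntax.
            '$' => out.push_str("$$"),
            other => out.push(other),
        }
    }
    out
}
#+END_SRC
For example, ~convert_backrefs("\\1b")~ (the Rust literal for ~\1b~) returns
~${1}b~, and a string without references such as ~nh~ is returned unchanged.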
Here is the YAML equivalent generated by this project:
#+BEGIN_SRC yaml
---
version: "1"
categories:
  B: ou
  S: ptc
  L: āēīōū
  Z: bgd
  C: ptcqbdgmnlrhs
  F: ie
  V: aeiou
rules:
  - - "[sm]$"
    - ""
  - - i(%V)
    - j$1
  - - "%L"
    - "%V"
  - - (%Vr)e$
    - $1
  - - (%V)v(%V)
    - "${1}$2"
  - - u$
    - o
  - - gn
    - nh
  - - (%V)p(%V)
    - "${1}b$2"
  - - (%V)t(%V)
    - "${1}d$2"
  - - (%V)c(%V)
    - "${1}g$2"
  - - (%F)ct
    - "${1}it"
  - - (%B)ct
    - "${1}ut"
  - - (%V)pt
    - "${1}t"
  - - ii
    - i
  - - (%C)er(%V)
    - "${1}r$2"
  - - lj
    - lh
#+END_SRC
Although the generated YAML leaves most of the rules unquoted, it is
preferable to wrap every rule in double quotes when writing a settings file by
hand, in order to avoid any parsing issues with LangEvolve-rs:
#+BEGIN_SRC yaml
---
version: "1"
categories:
  B: ou
  S: ptc
  L: āēīōū
  Z: bgd
  C: ptcqbdgmnlrhs
  F: ie
  V: aeiou
rules:
  - - "[sm]$"
    - ""
  - - "i(%V)"
    - "j$1"
  - - "%L"
    - "%V"
  - - "(%Vr)e$"
    - "$1"
  - - "(%V)v(%V)"
    - "${1}$2"
  - - "u$"
    - "o"
  - - "gn"
    - "nh"
  - - "(%V)p(%V)"
    - "${1}b$2"
  - - "(%V)t(%V)"
    - "${1}d$2"
  - - "(%V)c(%V)"
    - "${1}g$2"
  - - "(%F)ct"
    - "${1}it"
  - - "(%B)ct"
    - "${1}ut"
  - - "(%V)pt"
    - "${1}t"
  - - "ii"
    - "i"
  - - "(%C)er(%V)"
    - "${1}r$2"
  - - "lj"
    - "lh"
#+END_SRC
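The JSON and YAML files above share the same shape: a version string, a set of
named categories whose order varies from one output to the next, and an
ordered list of pattern and replacement pairs. The sketch below models that
shape with standard-library types only. The names ~Settings~ and
~sample_settings~ are hypothetical, and this is an illustration of the file
layout, not the project's actual types or its deserialization code.
#+BEGIN_SRC rust
use std::collections::HashMap;

/// Hypothetical in-memory model of a settings file (assumption, not the
/// project's real definition).
#[derive(Debug, Clone, PartialEq)]
struct Settings {
    /// Version of the settings format, stored as a string as in the files.
    version: String,
    /// Category name to member letters; a map has no guaranteed order.
    categories: HashMap<String, String>,
    /// (pattern, replacement) pairs; a Vec keeps the order of the rules.
    rules: Vec<(String, String)>,
}

/// Build a small excerpt of the Latin to Portuguese example.
fn sample_settings() -> Settings {
    let categories: HashMap<String, String> = [("V", "aeiou"), ("S", "ptc")]
        .iter()
        .map(|(name, letters)| (name.to_string(), letters.to_string()))
        .collect();
    let rules = vec![
        ("[sm]$".to_string(), String::new()),
        ("(%V)p(%V)".to_string(), "${1}b$2".to_string()),
    ];
    Settings {
        version: "1".to_string(),
        categories,
        rules,
    }
}
#+END_SRC
Keeping the rules in an ordered sequence matters because the rule list is
ordered in both file formats, whereas the categories are only ever looked up
by name.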
You can find more information on the regular expression syntax supported by
this project in the [[https://docs.rs/regex/1.3.6/regex/][documentation of the regex crate]].

The settings schema is designed to be fully backward-compatible: if the
version number of the settings format goes up, older settings files will
remain compatible with newer versions of the software. Forward compatibility,
however, cannot be ensured unless someone invents time travel, so settings
files with a version higher than the one supported by the running software
are automatically rejected.
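The compatibility rule above boils down to a single comparison. Here is a
minimal sketch of such a check, assuming the version is an unsigned integer
stored as a string, as in the example files. The function ~check_version~ and
its error messages are hypothetical and may differ from what LangEvolve-rs
actually does.
#+BEGIN_SRC rust
/// Hypothetical version check (assumption, not the project's real code):
/// accept settings whose version is at most `supported`, reject newer ones.
fn check_version(file_version: &str, supported: u32) -> Result<u32, String> {
    // The files store the version as a string such as "1".
    let version: u32 = file_version
        .trim()
        .parse()
        .map_err(|_| format!("invalid settings version: {:?}", file_version))?;
    if version > supported {
        // Forward compatibility cannot be guaranteed, so refuse to load.
        Err(format!(
            "settings version {} is newer than the supported version {}",
            version, supported
        ))
    } else {
        // Older or equal versions remain readable.
        Ok(version)
    }
}
#+END_SRC
With ~supported~ set to 2, a file declaring version ~"1"~ or ~"2"~ is accepted,
while a file declaring version ~"3"~ is rejected.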
* License
LangEvolve-rs is licensed under the AGPLv3. The full license can be found
[[file:agpl-3.0.txt][here]].