Seb35 authored
This JavaScript flavour of DuraLex is largely inspired by the original DuraLex written in Python by promethe42:

* it produces a similar semantic tree (with slight modifications in the dialect)
* it uses similar visitors that manipulate the semantic tree to enhance it

However, it differs in the syntactic recognition: the original DuraLex in Python is a set of Python functions recognising parts of text, whereas this version is built on a grammar that is then compiled to JavaScript functions by PEG.js.

Note that the PEG.js grammar dialect is close to the Parsimonious grammar dialect in Python; the two could possibly be converted into each other through a bijective mapping if some constraints are enforced on how the grammar is written (e.g. there is no regex in PEG.js, whereas alternations such as ( "aa" / "bb" / … ) are quite slow in Parsimonious and should be rewritten to ~"aa|bb").

This version of DuraLex uses the grammar of metslesliens: essentially, the so-called references in DuraLex (the subjects of the sentences, i.e. the location of the change) are parsed by metslesliens, and the DuraLex grammar parses the sentences as a whole. DuraLex then refines the resulting raw tree with the visitors, so that the final refined result is useful for real applications such as creating a diff.

Diffs are created in a form called “exact diffs”, which means “diffs corresponding exactly to the modification described by the modifying text” (and not diffs created by an independent algorithm). These diffs are character-based, so merges are vastly easier (and give better results) than with line-based diffs. With this level of precision, it is probably safe to state that exact diffs that cannot be merged are real political alternatives (« en discussion commune », i.e. under joint discussion).
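To illustrate why character-based exact diffs merge more easily than line-based ones, here is a minimal sketch in plain JavaScript. It is not DuraLex's actual code or data format: the edit shape `{ start, end, replacement }` and the helpers `editFor`, `overlaps` and `applyExactDiff` are hypothetical names introduced only for this example. Two amendments touching the same line merge cleanly as long as their character ranges do not overlap, and overlapping ranges are reported as a conflict.

```javascript
// Hypothetical edit shape (an assumption, not DuraLex's real format):
// { start, end, replacement }, with character offsets into the original text.

// Build an edit replacing the first occurrence of `from` with `to`.
function editFor(text, from, to) {
  const start = text.indexOf(from);
  if (start === -1) {
    throw new Error(`"${from}" not found in text`);
  }
  return { start, end: start + from.length, replacement: to };
}

// Two edits conflict when their character ranges intersect.
function overlaps(a, b) {
  return a.start < b.end && b.start < a.end;
}

// Apply non-overlapping edits to the original text.
// Edits are applied from the end of the text backwards, so the
// offsets of the edits still to be applied remain valid.
function applyExactDiff(text, edits) {
  const sorted = [...edits].sort((a, b) => b.start - a.start);
  let result = text;
  let previousStart = text.length;
  for (const { start, end, replacement } of sorted) {
    if (start < 0 || end < start || end > previousStart) {
      throw new RangeError(`Invalid or overlapping edit at [${start}, ${end})`);
    }
    result = result.slice(0, start) + replacement + result.slice(end);
    previousStart = start;
  }
  return result;
}

const article = "The deadline is three months.";
const amendmentA = editFor(article, "three", "six");
const amendmentB = editFor(article, "months", "weeks");
const amendmentC = editFor(article, "three months", "one year");

// A and B change the same line but different characters: they merge.
console.log(applyExactDiff(article, [amendmentA, amendmentB]));
// prints "The deadline is six weeks."

// A and C change overlapping characters: they are alternatives.
console.log(overlaps(amendmentA, amendmentC)); // prints true
```

A line-based tool would flag amendments A and B as conflicting because both modify the same line, whereas the character ranges show that they are independent. Amendments A and C genuinely overlap, which is the situation the text above describes as real alternatives.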
dedc9827
This project is licensed under the GNU Affero General Public License v3.0.