    Initial version of DuraLex (JavaScript flavour). · dedc9827
    Seb35 authored
    This JavaScript flavour of DuraLex is largely inspired by the original
    DuraLex written in Python by promethe42:
    * it produces a similar semantic tree (with slight modifications in
      the dialect)
    * it uses similar visitors that manipulate the semantic tree to
      enhance it
    
    However, it differs in the syntactic recognition: the original DuraLex
    in Python is a set of Python functions, each recognising parts of the
    text, whereas this version is built on a grammar that is compiled to
    JavaScript functions by PEG.js.
    
    Note that the PEG.js grammar dialect is close to the Parsimonious
    grammar dialect in Python; the two could possibly be converted into
    each other through a bijective mapping if some constraints are
    enforced on how the grammar is written (e.g. PEG.js has no regex,
    whereas in Parsimonious alternations like ( "aa" / "bb" / … ) are
    quite slow and should be rewritten as ~"aa|bb").
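
    As an illustration of why such a mapping is plausible for literal
    alternations, here is a minimal sketch in plain JavaScript. It uses
    neither PEG.js nor Parsimonious; the helper names orderedChoice and
    regexChoice are hypothetical. It only shows that an ordered choice of
    string literals and an anchored regex alternation pick the same match.

```javascript
// Hypothetical helpers; this is not PEG.js or Parsimonious code.

// Emulates a PEG ordered choice of literals, ( "aa" / "bb" / … ):
// try each literal in order at `pos`, return the first one that matches.
function orderedChoice(literals, input, pos) {
  for (const literal of literals) {
    if (input.startsWith(literal, pos)) {
      return literal;
    }
  }
  return null;
}

// Emulates the regex form ~"aa|bb": one alternation, anchored at `pos`
// with the sticky flag. Alternatives are tried left to right, so the
// ordering of the choice is preserved.
function regexChoice(literals, input, pos) {
  const escaped = literals.map((s) =>
    s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&")
  );
  const re = new RegExp(escaped.join("|"), "y");
  re.lastIndex = pos;
  const match = re.exec(input);
  return match === null ? null : match[0];
}

// Both forms agree on the same input and position.
const literals = ["aa", "bb"];
console.log(orderedChoice(literals, "xbb", 1)); // "bb"
console.log(regexChoice(literals, "xbb", 1)); // "bb"
```

    The sketch covers only plain string literals; rules that reference
    other rules have no direct regex equivalent, which is one of the
    constraints such a mapping would have to enforce.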
    
    This version of DuraLex uses the grammar of metslesliens: essentially,
    the so-called references in DuraLex (the subjects of the sentences,
    i.e. the locations of the changes) are parsed by metslesliens, and the
    DuraLex grammar parses the sentences as a whole. DuraLex then refines
    the resulting raw tree with the visitors, so that the final result is
    usable by real applications such as creating a diff.
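
    The refinement step can be pictured with the following minimal
    sketch. The node types ("root", "edit", "reference", "words"), the
    property names and the visit function are all hypothetical and are
    not taken from the DuraLex code; the sketch only shows the general
    shape of a visitor enhancing a raw tree in place.

```javascript
// All node types and property names below are hypothetical.

// Depth-first, post-order traversal: children first, then the node.
// A visitor is an object mapping a node type to a handler function.
function visit(node, visitor) {
  for (const child of node.children || []) {
    visit(child, visitor);
  }
  const handler = visitor[node.type];
  if (handler) {
    handler(node);
  }
  return node;
}

// Example visitor: copy the reference found under an edit node onto the
// edit itself, so that later steps know where the change applies.
const resolveTargetVisitor = {
  edit(node) {
    const ref = (node.children || []).find((c) => c.type === "reference");
    if (ref) {
      node.target = ref.text;
    }
  },
};

// A raw tree as a parser might produce it (made-up content).
const rawTree = {
  type: "root",
  children: [
    {
      type: "edit",
      editType: "replace",
      children: [
        { type: "reference", text: "article 3" },
        { type: "words", oldText: "two", newText: "three" },
      ],
    },
  ],
};

const refined = visit(rawTree, resolveTargetVisitor);
console.log(refined.children[0].target); // "article 3"
```

    Several such visitors can be run one after another over the same
    tree, each adding one piece of information.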
    
    Diffs are created in a form called “exact diffs”, meaning diffs that
    correspond exactly to the modification described by the modifying
    text (as opposed to diffs computed by an independent algorithm). These
    diffs are character-based, so merges are much easier, and give better
    results, than with line-based diffs. With this level of precision, it
    is probably safe to state that exact diffs that do not merge are real
    political alternatives (« en discussion commune », i.e. discussed
    jointly).
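
    To make the idea concrete, here is a minimal sketch of
    character-based diffs and their merge. The diff shape
    { start, end, insert } and the functions exactDiff, overlaps and
    mergeAndApply are assumptions made for this illustration; the format
    actually produced by DuraLex may differ.

```javascript
// Hypothetical diff shape: replace text[start, end) with `insert`.

// Build a diff directly from the described modification
// ("replace the words X with the words Y").
function exactDiff(text, oldWords, newWords) {
  const start = text.indexOf(oldWords);
  if (start === -1) {
    throw new Error("words not found: " + oldWords);
  }
  return { start, end: start + oldWords.length, insert: newWords };
}

// Two diffs conflict when their character ranges intersect.
function overlaps(a, b) {
  return a.start < b.end && b.start < a.end;
}

// Merge diffs made against the same original text, then apply them.
// Throws when two diffs touch the same characters.
function mergeAndApply(text, diffs) {
  const sorted = [...diffs].sort((a, b) => a.start - b.start);
  for (let i = 1; i < sorted.length; i++) {
    if (overlaps(sorted[i - 1], sorted[i])) {
      throw new Error("conflicting diffs: they do not merge");
    }
  }
  // Apply from the end of the text so earlier offsets stay valid.
  let result = text;
  for (const d of sorted.reverse()) {
    result = result.slice(0, d.start) + d.insert + result.slice(d.end);
  }
  return result;
}

const text = "The period is two months from the notice.";
const a = exactDiff(text, "two", "three");
const b = exactDiff(text, "notice", "publication");
const c = exactDiff(text, "two months", "six weeks");

// a and b touch different characters of the same line, so they merge.
console.log(mergeAndApply(text, [a, b]));
// "The period is three months from the publication."

// a and c touch the same characters: they are alternatives.
try {
  mergeAndApply(text, [a, c]);
} catch (e) {
  console.log(e.message); // "conflicting diffs: they do not merge"
}
```

    A line-based tool would report a conflict between a and b, since
    both change the same line; at character level only a and c conflict,
    which matches the intuition that they are competing proposals.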