diff options
author | Jed Barber <jjbarber@y7mail.com> | 2021-01-23 02:38:49 +1100 |
---|---|---|
committer | Jed Barber <jjbarber@y7mail.com> | 2021-01-23 02:38:49 +1100 |
commit | c7195329c60123b2363ba13863f6951a21d0ff57 (patch) | |
tree | 35eb2388650888a4301f326812fa3e5554fff52b | |
parent | 0b0c4df3dc7b94c139c5305ea0991a34f0c43238 (diff) |
Implementation detail docs, some old notes removed
-rw-r--r-- | curtail.txt | 26 | ||||
-rw-r--r-- | doc/curtailment.html | 55 | ||||
-rw-r--r-- | doc/error_strings.html | 32 | ||||
-rw-r--r-- | doc/index.html | 11 | ||||
-rw-r--r-- | doc/memoize.html | 32 | ||||
-rw-r--r-- | doc/overview.html | 28 | ||||
-rw-r--r-- | doc/start_finish_ranges.html | 37 |
7 files changed, 194 insertions, 27 deletions
diff --git a/curtail.txt b/curtail.txt deleted file mode 100644 index b9c88b1..0000000 --- a/curtail.txt +++ /dev/null @@ -1,26 +0,0 @@ - - -all results must track curtails - - done - -when merging two results that have curtails for the same combinator, use the smaller curtail - - done - - actually, shouldn't this be the opposite, given that curtails stand for the highest level that a result is valid at? - -when updating a result, replace the curtails with the leftrec level for the current context - - done - -ignore a previous result when the current leftrec level is less than the previous result curtail level for even one of the curtails - - done - -the function to check if reusable needs to take into account a +1 for the combinator being currently memoized - - done - -when depth of a curtail is less than or equal to remaining tokens of input after the furthest finish of a result, that curtail can be deleted? - -when depth of curtail plus remaining tokens after furthest finish of a result is less than or equal to total input tokens plus one, curtail can be deleted from a result - -when depth of curtail plus remaining tokens after furthest finish of a result is less than or equal to remaining tokens at the start of a result, then curtail can be deleted from result, because in that case it is no longer possible for curtail to happen due to excessive combinator recursion - - - diff --git a/doc/curtailment.html b/doc/curtailment.html new file mode 100644 index 0000000..9999dac --- /dev/null +++ b/doc/curtailment.html @@ -0,0 +1,55 @@ + +<!DOCTYPE html> + +<html lang="en"> + <head> + <meta charset="utf-8"> + <title>Curtailment - Packrat Docs</title> + <link href="default.css" rel="stylesheet"> + </head> + + <body> + + + <h2>Curtailment</h2> + + <a href="index.html">Return to Contents</a> + + + <p><strong>CAUTION:</strong> This is implementation information and should not be + relied upon in any way when using the library.</p> + + <p>A curtailment is information attached to a result denoting one or more + combinator/depth pairs.</p> + + <p>Curtailment in general means that the result for a given combinator at a given + position is only valid when that combinator is next encountered at that position for + left recursion depth counts equal or greater than the curtailment.</p> + + <p>If the left recursion count of a combinator at a particular position exceeds the + number of remaining input tokens plus one, the combinator is curtailed at that count + depth and fails.</p> + + <p>When results are folded upwards, any curtailments are retained.</p> + + <p>When two results are merged as in <em>Sequence</em> and <em>Choice</em>, if the + results share curtailments then the curtailment of greater depth is retained.</p> + + <p>If a result has to be recalculated due to the previous result for a given + combinator at a given position not being valid due to being curtailed at a greater + left recursion depth, then the recalculated result has the previous result's + curtailments applied to it, except with the count depths replaced with the current + left recursion depths.</p> + + <p>When a result is recalculated and the depth of a curtailment plus the number of + remaining tokens after the furthest finish of a result is less than the remaining + tokens at the start of a result, then the curtail can be deleted from the result, + since then it is no longer possible for curtailment to happen due to excessive + left recursion count depth. This is algebraically equivalent to checking if the + furthest finish of a result minus the left recursion depth is greater than the + start position of the result.</p> + + + </body> +</html> + diff --git a/doc/error_strings.html b/doc/error_strings.html new file mode 100644 index 0000000..db9318d --- /dev/null +++ b/doc/error_strings.html @@ -0,0 +1,32 @@ + +<!DOCTYPE html> + +<html lang="en"> + <head> + <meta charset="utf-8"> + <title>Error Strings - Packrat Docs</title> + <link href="default.css" rel="stylesheet"> + </head> + + <body> + + + <h2>Error Strings</h2> + + <a href="index.html">Return to Contents</a> + + + <p><strong>CAUTION:</strong> This is implementation information and should not be + relied upon in any way when using the library.</p> + + <p>The format used in the exception message error strings is:<br> + <br> + s<em><SYMBOL_NAME></em>p<em><NUMBER></em><br> + <br> + There is one of these substrings for every error encountered, all concatenated + together in arbitrary order.</p> + + + </body> +</html> + diff --git a/doc/index.html b/doc/index.html index 8b26b6e..096b0e2 100644 --- a/doc/index.html +++ b/doc/index.html @@ -20,10 +20,11 @@ <ol> <li><a href="license.html">License</a></li> <li><a href="depends.html">Dependencies</a></li> + <li><a href="overview.html">Overview</a></li> <li><a href="quickstart.html">Quickstart Guide</a></li> <li><a href="features.html">Features</a></li> <li><a href="limits.html">Limitations</a></li> - <li>File Overview<br> + <li>File Overviews<br> <ol> <li><a href="src_overview.html">/src</a></li> <li><a href="example_overview.html">/example</a></li> @@ -39,6 +40,14 @@ </ol> </li> <li><a href="errors.html">Handling Errors</a></li> + <li>Implementation Details<br> + <ol> + <li><a href="start_finish_ranges.html">Start and Finish Ranges</a></li> + <li><a href="error_strings.html">Error Strings</a></li> + <li><a href="curtailment.html">Curtailment</a></li> + <li><a href="memoize.html">Memoization</a></li> + </ol> + </li> <li><a href="credits.html">Credits and Further Reading</a></li> </ol> diff --git a/doc/memoize.html b/doc/memoize.html new file mode 100644 index 0000000..780031b --- /dev/null +++ b/doc/memoize.html @@ -0,0 +1,32 @@ + +<!DOCTYPE html> + +<html lang="en"> + <head> + <meta charset="utf-8"> + <title>Memoization - Packrat Docs</title> + <link href="default.css" rel="stylesheet"> + </head> + + <body> + + + <h2>Memoization</h2> + + <a href="index.html">Return to Contents</a> + + + <p><strong>CAUTION:</strong> This is implementation information and should not be + relied upon in any way when using the library.</p> + + <p>All intermediate parsing results are memoized. This is automatic and + unavoidable, and is necessary to make piecewise parsing possible.</p> + + <p>When piecewise parsing, the combinators that return partially complete + results are used to determine what portion of the input must be passed + forwards to allow calculating the full result when parsing future pieces.</p> + + + </body> +</html> + diff --git a/doc/overview.html b/doc/overview.html new file mode 100644 index 0000000..62189f8 --- /dev/null +++ b/doc/overview.html @@ -0,0 +1,28 @@ + +<!DOCTYPE html> + +<html lang="en"> + <head> + <meta charset="utf-8"> + <title>Overview - Packrat Docs</title> + <link href="default.css" rel="stylesheet"> + </head> + + <body> + + + <h2>Overview</h2> + + <a href="index.html">Return to Contents</a> + + + <p>Packrat is a parser combinator library that can handle left recursive + ambiguous grammars, and does so in polynomial time and space. It gets its + name from packrat parsing, an algorithm which involves memoizing all + intermediate parsing results in order to avoid exponential time complexity. + The library includes both parser combinators and simpler lexer combinators.</p> + + + </body> +</html> + diff --git a/doc/start_finish_ranges.html b/doc/start_finish_ranges.html new file mode 100644 index 0000000..d69c60e --- /dev/null +++ b/doc/start_finish_ranges.html @@ -0,0 +1,37 @@ + +<!DOCTYPE html> + +<html lang="en"> + <head> + <meta charset="utf-8"> + <title>Start and Finish Ranges - Packrat Docs</title> + <link href="default.css" rel="stylesheet"> + </head> + + <body> + + + <h2>Start and Finish Ranges</h2> + + <a href="index.html">Return to Contents</a> + + + <p>When parsing input, the start position and finish position that + describe a parsed or lexed chunk are the index of the first item and + the last item of that chunk. This was done to match the idea with + how array slicing is done in Ada, but it has a few other implications.</p> + + <p>Empty chunks are described by a finish position that is less than + the start position. For consistency, this is always done as the finish + position being only one less than the start.</p> + + <p>This convention also differs from the convention used by Frost, Hafiz, + and Callaghan (2008). They used start positions corresponding to the first + item in a chunk, but finish positions corresponding to the first item after + the chunk had ended. Care should be taken to keep this in mind to avoid + off-by-one errors.</p> + + + </body> +</html> + |