summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorJed Barber <jjbarber@y7mail.com>2021-01-23 02:38:49 +1100
committerJed Barber <jjbarber@y7mail.com>2021-01-23 02:38:49 +1100
commitc7195329c60123b2363ba13863f6951a21d0ff57 (patch)
tree35eb2388650888a4301f326812fa3e5554fff52b
parent0b0c4df3dc7b94c139c5305ea0991a34f0c43238 (diff)
Implementation detail docs, some old notes removed
-rw-r--r--curtail.txt26
-rw-r--r--doc/curtailment.html55
-rw-r--r--doc/error_strings.html32
-rw-r--r--doc/index.html11
-rw-r--r--doc/memoize.html32
-rw-r--r--doc/overview.html28
-rw-r--r--doc/start_finish_ranges.html37
7 files changed, 194 insertions, 27 deletions
diff --git a/curtail.txt b/curtail.txt
deleted file mode 100644
index b9c88b1..0000000
--- a/curtail.txt
+++ /dev/null
@@ -1,26 +0,0 @@
-
-
-all results must track curtails
- - done
-
-when merging two results that have curtails for the same combinator, use the smaller curtail
- - done
- - actually, shouldn't this be the opposite, given that curtails stand for the highest level that a result is valid at?
-
-when updating a result, replace the curtails with the leftrec level for the current context
- - done
-
-ignore a previous result when the current leftrec level is less than the previous result curtail level for even one of the curtails
- - done
-
-the function to check if reusable needs to take into account a +1 for the combinator being currently memoized
- - done
-
-when depth of a curtail is less than or equal to remaining tokens of input after the furthest finish of a result, that curtail can be deleted?
-
-when depth of curtail plus remaining tokens after furthest finish of a result is less than or equal to total input tokens plus one, curtail can be deleted from a result
-
-when depth of curtail plus remaining tokens after furthest finish of a result is less than or equal to remaining tokens at the start of a result, then curtail can be deleted from result, because in that case it is no longer possible for curtail to happen due to excessive combinator recursion
-
-
-
diff --git a/doc/curtailment.html b/doc/curtailment.html
new file mode 100644
index 0000000..9999dac
--- /dev/null
+++ b/doc/curtailment.html
@@ -0,0 +1,55 @@
+
+<!DOCTYPE html>
+
+<html lang="en">
+ <head>
+ <meta charset="utf-8">
+ <title>Curtailment - Packrat Docs</title>
+ <link href="default.css" rel="stylesheet">
+ </head>
+
+ <body>
+
+
+ <h2>Curtailment</h2>
+
+ <a href="index.html">Return to Contents</a>
+
+
+ <p><strong>CAUTION:</strong> This is implementation information and should not be
+ relied upon in any way when using the library.</p>
+
+ <p>A curtailment is information attached to a result denoting one or more
+ combinator/depth pairs.</p>
+
+ <p>Curtailment in general means that the result for a given combinator at a given
+ position is only valid when that combinator is next encountered at that position for
+ left recursion depth counts equal or greater than the curtailment.</p>
+
+ <p>If the left recursion count of a combinator at a particular position exceeds the
+ number of remaining input tokens plus one, the combinator is curtailed at that count
+ depth and fails.</p>
+
+ <p>When results are folded upwards, any curtailments are retained.</p>
+
+ <p>When two results are merged as in <em>Sequence</em> and <em>Choice</em>, if the
+ results share curtailments then the curtailment of greater depth is retained.</p>
+
+ <p>If a result has to be recalculated due to the previous result for a given
+ combinator at a given position not being valid due to being curtailed at a greater
+ left recursion depth, then the recalculated result has the previous result's
+ curtailments applied to it, except with the count depths replaced with the current
+ left recursion depths.</p>
+
+ <p>When a result is recalculated and the depth of a curtailment plus the number of
+ remaining tokens after the furthest finish of a result is less than the remaining
+ tokens at the start of a result, then the curtail can be deleted from the result,
+ since then it is no longer possible for curtailment to happen due to excessive
+ left recursion count depth. This is algebraically equivalent to checking if the
+ furthest finish of a result minus the left recursion depth is greater than the
+ start position of the result.</p>
+
+
+ </body>
+</html>
+
diff --git a/doc/error_strings.html b/doc/error_strings.html
new file mode 100644
index 0000000..db9318d
--- /dev/null
+++ b/doc/error_strings.html
@@ -0,0 +1,32 @@
+
+<!DOCTYPE html>
+
+<html lang="en">
+ <head>
+ <meta charset="utf-8">
+ <title>Error Strings - Packrat Docs</title>
+ <link href="default.css" rel="stylesheet">
+ </head>
+
+ <body>
+
+
+ <h2>Error Strings</h2>
+
+ <a href="index.html">Return to Contents</a>
+
+
+ <p><strong>CAUTION:</strong> This is implementation information and should not be
+ relied upon in any way when using the library.</p>
+
+ <p>The format used in the exception message error strings is:<br>
+ <br>
+ s<em>&lt;SYMBOL_NAME&gt;</em>p<em>&lt;NUMBER&gt;</em><br>
+ <br>
+ There is one of these substrings for every error encountered, all concatenated
+ together in arbitrary order.</p>
+
+
+ </body>
+</html>
+
diff --git a/doc/index.html b/doc/index.html
index 8b26b6e..096b0e2 100644
--- a/doc/index.html
+++ b/doc/index.html
@@ -20,10 +20,11 @@
<ol>
<li><a href="license.html">License</a></li>
<li><a href="depends.html">Dependencies</a></li>
+ <li><a href="overview.html">Overview</a></li>
<li><a href="quickstart.html">Quickstart Guide</a></li>
<li><a href="features.html">Features</a></li>
<li><a href="limits.html">Limitations</a></li>
- <li>File Overview<br>
+ <li>File Overviews<br>
<ol>
<li><a href="src_overview.html">/src</a></li>
<li><a href="example_overview.html">/example</a></li>
@@ -39,6 +40,14 @@
</ol>
</li>
<li><a href="errors.html">Handling Errors</a></li>
+ <li>Implementation Details<br>
+ <ol>
+ <li><a href="start_finish_ranges.html">Start and Finish Ranges</a></li>
+ <li><a href="error_strings.html">Error Strings</a></li>
+ <li><a href="curtailment.html">Curtailment</a></li>
+ <li><a href="memoize.html">Memoization</a></li>
+ </ol>
+ </li>
<li><a href="credits.html">Credits and Further Reading</a></li>
</ol>
diff --git a/doc/memoize.html b/doc/memoize.html
new file mode 100644
index 0000000..780031b
--- /dev/null
+++ b/doc/memoize.html
@@ -0,0 +1,32 @@
+
+<!DOCTYPE html>
+
+<html lang="en">
+ <head>
+ <meta charset="utf-8">
+ <title>Memoization - Packrat Docs</title>
+ <link href="default.css" rel="stylesheet">
+ </head>
+
+ <body>
+
+
+ <h2>Memoization</h2>
+
+ <a href="index.html">Return to Contents</a>
+
+
+ <p><strong>CAUTION:</strong> This is implementation information and should not be
+ relied upon in any way when using the library.</p>
+
+ <p>All intermediate parsing results are memoized. This is automatic and
+ unavoidable, and is necessary to make piecewise parsing possible.</p>
+
+ <p>When piecewise parsing, the combinators that return partially complete
+ results are used to determine what portion of the input must be passed
+ forwards to allow calculating the full result when parsing future pieces.</p>
+
+
+ </body>
+</html>
+
diff --git a/doc/overview.html b/doc/overview.html
new file mode 100644
index 0000000..62189f8
--- /dev/null
+++ b/doc/overview.html
@@ -0,0 +1,28 @@
+
+<!DOCTYPE html>
+
+<html lang="en">
+ <head>
+ <meta charset="utf-8">
+ <title>Overview - Packrat Docs</title>
+ <link href="default.css" rel="stylesheet">
+ </head>
+
+ <body>
+
+
+ <h2>Overview</h2>
+
+ <a href="index.html">Return to Contents</a>
+
+
+ <p>Packrat is a parser combinator library that can handle left recursive
+ ambiguous grammars, and does so in polynomial time and space. It gets its
+ name from packrat parsing, an algorithm which involves memoizing all
+ intermediate parsing results in order to avoid exponential time complexity.
+ The library includes both parser combinators and simpler lexer combinators.</p>
+
+
+ </body>
+</html>
+
diff --git a/doc/start_finish_ranges.html b/doc/start_finish_ranges.html
new file mode 100644
index 0000000..d69c60e
--- /dev/null
+++ b/doc/start_finish_ranges.html
@@ -0,0 +1,37 @@
+
+<!DOCTYPE html>
+
+<html lang="en">
+ <head>
+ <meta charset="utf-8">
+ <title>Start and Finish Ranges - Packrat Docs</title>
+ <link href="default.css" rel="stylesheet">
+ </head>
+
+ <body>
+
+
+ <h2>Start and Finish Ranges</h2>
+
+ <a href="index.html">Return to Contents</a>
+
+
+ <p>When parsing input, the start position and finish position that
+ describe a parsed or lexed chunk are the index of the first item and
+ the last item of that chunk. This was done to match the idea with
+ how array slicing is done in Ada, but it has a few other implications.</p>
+
+ <p>Empty chunks are described by a finish position that is less than
+ the start position. For consistency, this is always done as the finish
+ position being only one less than the start.</p>
+
+ <p>This convention also differs from the convention used by Frost, Hafiz,
+ and Callaghan (2008). They used start positions corresponding to the first
+ item in a chunk, but finish positions corresponding to the first item after
+ the chunk had ended. Care should be taken to keep this in mind to avoid
+ off-by-one errors.</p>
+
+
+ </body>
+</html>
+