Implementation detail docs, some old notes removed

author: Jed Barber <jjbarber@y7mail.com> 2021-01-23 02:38:49 +1100
committer: Jed Barber <jjbarber@y7mail.com> 2021-01-23 02:38:49 +1100
commit: c7195329c60123b2363ba13863f6951a21d0ff57 (patch)
tree: 35eb2388650888a4301f326812fa3e5554fff52b
parent: 0b0c4df3dc7b94c139c5305ea0991a34f0c43238 (diff)
7 files changed, 194 insertions, 27 deletions
diff --git a/curtail.txt b/curtail.txt
deleted file mode 100644
index b9c88b1..0000000
--- a/curtail.txt
+++ /dev/null
@@ -1,26 +0,0 @@
-
-
-all results must track curtails
- - done
-
-when merging two results that have curtails for the same combinator, use the smaller curtail
- - done
- - actually, shouldn't this be the opposite, given that curtails stand for the highest level that a result is valid at?
-
-when updating a result, replace the curtails with the leftrec level for the current context
- - done
-
-ignore a previous result when the current leftrec level is less than the previous result curtail level for even one of the curtails
- - done
-
-the function to check if reusable needs to take into account a +1 for the combinator being currently memoized
- - done
-
-when depth of a curtail is less than or equal to remaining tokens of input after the furthest finish of a result, that curtail can be deleted?
-
-when depth of curtail plus remaining tokens after furthest finish of a result is less than or equal to total input tokens plus one, curtail can be deleted from a result
-
-when depth of curtail plus remaining tokens after furthest finish of a result is less than or equal to remaining tokens at the start of a result, then curtail can be deleted from result, because in that case it is no longer possible for curtail to happen due to excessive combinator recursion
-
-
-
diff --git a/doc/curtailment.html b/doc/curtailment.html
new file mode 100644
index 0000000..9999dac
--- /dev/null
+++ b/doc/curtailment.html
@@ -0,0 +1,55 @@
+
+<!DOCTYPE html>
+
+<html lang="en">
+  <head>
+    <meta charset="utf-8">
+    <title>Curtailment - Packrat Docs</title>
+    <link href="default.css" rel="stylesheet">
+  </head>
+
+  <body>
+
+
+  <h2>Curtailment</h2>
+
+  <a href="index.html">Return to Contents</a>
+
+
+  <p><strong>CAUTION:</strong> This is implementation information and should not be
+  relied upon in any way when using the library.</p>
+
+  <p>A curtailment is information attached to a result denoting one or more
+  combinator/depth pairs.</p>
+
+  <p>Curtailment in general means that the result for a given combinator at a given
+  position is only valid when that combinator is next encountered at that position for
+  left recursion depth counts equal or greater than the curtailment.</p>
+
+  <p>If the left recursion count of a combinator at a particular position exceeds the
+  number of remaining input tokens plus one, the combinator is curtailed at that count
+  depth and fails.</p>
+
+  <p>When results are folded upwards, any curtailments are retained.</p>
+
+  <p>When two results are merged as in <em>Sequence</em> and <em>Choice</em>, if the
+  results share curtailments then the curtailment of greater depth is retained.</p>
+
+  <p>If a result has to be recalculated due to the previous result for a given
+  combinator at a given position not being valid due to being curtailed at a greater
+  left recursion depth, then the recalculated result has the previous result's
+  curtailments applied to it, except with the count depths replaced with the current
+  left recursion depths.</p>
+
+  <p>When a result is recalculated and the depth of a curtailment plus the number of
+  remaining tokens after the furthest finish of a result is less than the remaining
+  tokens at the start of a result, then the curtail can be deleted from the result,
+  since then it is no longer possible for curtailment to happen due to excessive
+  left recursion count depth. This is algebraically equivalent to checking if the
+  furthest finish of a result minus the left recursion depth is greater than the
+  start position of the result.</p>
+
+
+  </body>
+</html>
+
diff --git a/doc/error_strings.html b/doc/error_strings.html
new file mode 100644
index 0000000..db9318d
--- /dev/null
+++ b/doc/error_strings.html
@@ -0,0 +1,32 @@
+
+<!DOCTYPE html>
+
+<html lang="en">
+  <head>
+    <meta charset="utf-8">
+    <title>Error Strings - Packrat Docs</title>
+    <link href="default.css" rel="stylesheet">
+  </head>
+
+  <body>
+
+
+  <h2>Error Strings</h2>
+
+  <a href="index.html">Return to Contents</a>
+
+
+  <p><strong>CAUTION:</strong> This is implementation information and should not be
+  relied upon in any way when using the library.</p>
+
+  <p>The format used in the exception message error strings is:<br>
+  <br>
+  s<em>&lt;SYMBOL_NAME&gt;</em>p<em>&lt;NUMBER&gt;</em><br>
+  <br>
+  There is one of these substrings for every error encountered, all concatenated
+  together in arbitrary order.</p>
+
+
+  </body>
+</html>
+
diff --git a/doc/index.html b/doc/index.html
index 8b26b6e..096b0e2 100644
--- a/doc/index.html
+++ b/doc/index.html
@@ -20,10 +20,11 @@
   <ol>
     <li><a href="license.html">License</a></li>
     <li><a href="depends.html">Dependencies</a></li>
+    <li><a href="overview.html">Overview</a></li>
     <li><a href="quickstart.html">Quickstart Guide</a></li>
     <li><a href="features.html">Features</a></li>
     <li><a href="limits.html">Limitations</a></li>
-    <li>File Overview<br>
+    <li>File Overviews<br>
       <ol>
         <li><a href="src_overview.html">/src</a></li>
         <li><a href="example_overview.html">/example</a></li>
@@ -39,6 +40,14 @@
       </ol>
     </li>
     <li><a href="errors.html">Handling Errors</a></li>
+    <li>Implementation Details<br>
+      <ol>
+        <li><a href="start_finish_ranges.html">Start and Finish Ranges</a></li>
+        <li><a href="error_strings.html">Error Strings</a></li>
+        <li><a href="curtailment.html">Curtailment</a></li>
+        <li><a href="memoize.html">Memoization</a></li>
+      </ol>
+    </li>
     <li><a href="credits.html">Credits and Further Reading</a></li>
   </ol>
 
diff --git a/doc/memoize.html b/doc/memoize.html
new file mode 100644
index 0000000..780031b
--- /dev/null
+++ b/doc/memoize.html
@@ -0,0 +1,32 @@
+
+<!DOCTYPE html>
+
+<html lang="en">
+  <head>
+    <meta charset="utf-8">
+    <title>Memoization - Packrat Docs</title>
+    <link href="default.css" rel="stylesheet">
+  </head>
+
+  <body>
+
+
+  <h2>Memoization</h2>
+
+  <a href="index.html">Return to Contents</a>
+
+
+  <p><strong>CAUTION:</strong> This is implementation information and should not be
+  relied upon in any way when using the library.</p>
+
+  <p>All intermediate parsing results are memoized. This is automatic and
+  unavoidable, and is necessary to make piecewise parsing possible.</p>
+
+  <p>When piecewise parsing, the combinators that return partially complete
+  results are used to determine what portion of the input must be passed
+  forwards to allow calculating the full result when parsing future pieces.</p>
+
+
+  </body>
+</html>
+
diff --git a/doc/overview.html b/doc/overview.html
new file mode 100644
index 0000000..62189f8
--- /dev/null
+++ b/doc/overview.html
@@ -0,0 +1,28 @@
+
+<!DOCTYPE html>
+
+<html lang="en">
+  <head>
+    <meta charset="utf-8">
+    <title>Overview - Packrat Docs</title>
+    <link href="default.css" rel="stylesheet">
+  </head>
+
+  <body>
+
+
+  <h2>Overview</h2>
+
+  <a href="index.html">Return to Contents</a>
+
+
+  <p>Packrat is a parser combinator library that can handle left recursive
+  ambiguous grammars, and does so in polynomial time and space. It gets its
+  name from packrat parsing, an algorithm which involves memoizing all
+  intermediate parsing results in order to avoid exponential time complexity.
+  The library includes both parser combinators and simpler lexer combinators.</p>
+
+
+  </body>
+</html>
+
diff --git a/doc/start_finish_ranges.html b/doc/start_finish_ranges.html
new file mode 100644
index 0000000..d69c60e
--- /dev/null
+++ b/doc/start_finish_ranges.html
@@ -0,0 +1,37 @@
+
+<!DOCTYPE html>
+
+<html lang="en">
+  <head>
+    <meta charset="utf-8">
+    <title>Start and Finish Ranges - Packrat Docs</title>
+    <link href="default.css" rel="stylesheet">
+  </head>
+
+  <body>
+
+
+  <h2>Start and Finish Ranges</h2>
+
+  <a href="index.html">Return to Contents</a>
+
+
+  <p>When parsing input, the start position and finish position that
+  describe a parsed or lexed chunk are the index of the first item and
+  the last item of that chunk. This was done to match the idea with
+  how array slicing is done in Ada, but it has a few other implications.</p>
+
+  <p>Empty chunks are described by a finish position that is less than
+  the start position. For consistency, this is always done as the finish
+  position being only one less than the start.</p>
+
+  <p>This convention also differs from the convention used by Frost, Hafiz,
+  and Callaghan (2008). They used start positions corresponding to the first
+  item in a chunk, but finish positions corresponding to the first item after
+  the chunk had ended. Care should be taken to keep this in mind to avoid
+  off-by-one errors.</p>
+
+
+  </body>
+</html>
+
author	Jed Barber <jjbarber@y7mail.com>	2021-01-23 02:38:49 +1100
committer	Jed Barber <jjbarber@y7mail.com>	2021-01-23 02:38:49 +1100
commit	c7195329c60123b2363ba13863f6951a21d0ff57 (patch)
tree	35eb2388650888a4301f326812fa3e5554fff52b
parent	0b0c4df3dc7b94c139c5305ea0991a34f0c43238 (diff)