{%- extends "base.xhtml" -%}
{%- block title -%}Thue Version 2a{%- endblock -%}
{%- block content %}
<h4>Thue Version 2a</h4>
<p>Git repository: <a href="/cgi-bin/cgit.cgi/esoteric">Link</a></p>
<h5>1/1/2017</h5>
<p><a href="http://esolangs.org/wiki/Thue" class="external">Thue</a> is an esoteric programming
language based on unrestricted grammars. A Thue program consists of a list of rules, each pairing a
sequence of symbols to replace with a sequence of symbols to substitute, followed by the initial
state of the program. Applicable rules are then applied to the state in a random order until no
more are applicable, at which point the program terminates.</p>
<p>An example Thue program that increments a binary number surrounded by '_' characters:</p>
<div class="precontain">
<code>
1_::=1++
0_::=1
01++::=10
11++::=1++0
_0::=_
_1++::=10
::=
_1111111_
</code>
</div>
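<p>To make the rewriting process concrete, here is a minimal sketch of the loop in Python. It is
not the interpreter from the repository; the names <code>find_matches</code> and <code>run</code>
are made up for illustration, and it assumes every left hand side is non-empty.</p>
<div class="precontain">
<pre>
```python
import random


def find_matches(rules, state):
    """Return every (position, lhs, rhs) where a rule could fire."""
    matches = []
    for lhs, rhs in rules:
        start = state.find(lhs)
        while start != -1:
            matches.append((start, lhs, rhs))
            start = state.find(lhs, start + 1)
    return matches


def run(rules, state, rng=random):
    """Apply randomly chosen applicable rules until none is left."""
    while True:
        matches = find_matches(rules, state)
        if not matches:
            return state
        start, lhs, rhs = rng.choice(matches)
        state = state[:start] + rhs + state[start + len(lhs):]


# The increment rules from the example above, written as (lhs, rhs) pairs.
increment = [
    ("1_", "1++"), ("0_", "1"), ("01++", "10"),
    ("11++", "1++0"), ("_0", "_"), ("_1++", "10"),
]
print(run(increment, "_1111111_"))  # 10000000
```
</pre>
</div>
<p>For this particular program exactly one rule is applicable at every step, so the random choice
never changes the result.</p>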
<p>The before and after symbols in each rule are separated by '::=' and the list of rules is
terminated by a blank rule. Note that whitespace in rules and the initial state is NOT ignored.</p>
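<p>A rough sketch of how such a file could be split into rules and an initial state, again in
Python. The function name <code>parse_thue</code> is hypothetical, and two choices here are my
assumptions rather than anything the format dictates: lines without a '::=' are skipped, and only
a rule with both sides completely empty counts as the blank rule.</p>
<div class="precontain">
<pre>
```python
def parse_thue(source):
    """Split Thue source into (rules, initial_state)."""
    lines = source.split("\n")
    rules = []
    for index, line in enumerate(lines):
        if "::=" not in line:
            continue  # assumption: lines without a separator are skipped
        lhs, rhs = line.split("::=", 1)
        if lhs == "" and rhs == "":
            # Blank rule: the rest of the file is the initial state.
            state = "\n".join(lines[index + 1:])
            return rules, state
        rules.append((lhs, rhs))  # whitespace is kept exactly as written
    raise ValueError("missing terminating '::=' rule")


program = "1_::=1++\n0_::=1\n::=\n_1111111_"
rules, state = parse_thue(program)
print(rules)  # [('1_', '1++'), ('0_', '1')]
print(state)  # _1111111_
```
</pre>
</div>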
<p>Output is handled by prefixing the right hand side with '~'. The symbols after the '~' are
written to stdout and the matched symbols in the program are replaced with the empty string.</p>
<p>The traditional Hello World program:</p>
<div class="precontain">
<code>
a::=~Hello World!
::=
a
</code>
</div>
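<p>The output rule can be sketched in Python as follows. This is an illustration only: the
function name <code>run_with_output</code> is invented, and for brevity it considers just the
first occurrence of each left hand side instead of every possible match.</p>
<div class="precontain">
<pre>
```python
import io
import random
import sys


def run_with_output(rules, state, out=sys.stdout, rng=random):
    """Rewrite until no rule applies, sending '~' right hand sides to out."""
    while True:
        matches = [(state.find(lhs), lhs, rhs)
                   for lhs, rhs in rules if lhs in state]
        if not matches:
            return state
        start, lhs, rhs = rng.choice(matches)
        if rhs.startswith("~"):
            out.write(rhs[1:])  # printed text, minus the '~' marker
            rhs = ""            # the matched symbols are deleted
        state = state[:start] + rhs + state[start + len(lhs):]


buffer = io.StringIO()
final = run_with_output([("a", "~Hello World!")], "a", out=buffer)
print(buffer.getvalue())  # Hello World!
print(repr(final))        # ''
```
</pre>
</div>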
<p>Input is handled by having the right hand side of a rule be ':::', which causes the left hand
side symbols to be replaced with a line from the standard input. Unfortunately, this immediately
causes problems.</p>
<p>The following is an innocent piece of code that accepts a single line of input and does nothing
more. Maybe.</p>
<div class="precontain">
<code>
a::=:::
::=
a
</code>
</div>
<p>If a string containing the letter 'a' is entered into the above program, the single input rule
becomes applicable again and another line of input is requested. In other words, input in Thue is
unescaped and allows direct code injection into a program.</p>
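<p>The effect is easy to reproduce with a toy model of the original input rule in Python. The
function <code>run_classic</code> is hypothetical; it always picks the first applicable rule and,
as an assumption to keep the demonstration finite, simply stops when the supplied input lines run
out.</p>
<div class="precontain">
<pre>
```python
def run_classic(rules, state, input_lines):
    """Classic Thue input: ':::' splices a raw input line into the state.

    Returns the final state and how many input lines were consumed.
    """
    pending = list(input_lines)
    consumed = 0
    while True:
        applicable = [(lhs, rhs) for lhs, rhs in rules if lhs in state]
        if not applicable:
            return state, consumed
        lhs, rhs = applicable[0]  # deterministic pick keeps the demo simple
        if rhs == ":::":
            if not pending:
                return state, consumed  # assumption: stop when input runs out
            rhs = pending.pop(0)
            consumed += 1
        state = state.replace(lhs, rhs, 1)


rules = [("a", ":::")]
print(run_classic(rules, "a", ["hello", "unused"]))  # ('hello', 1)
print(run_classic(rules, "a", ["banana", "xyz"]))    # ('bxyznana', 2)
```
</pre>
</div>
<p>Typing "hello" consumes one line as intended, while typing "banana" makes the program ask for
input again, and it would keep asking for as long as an 'a' remains in the state.</p>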
<p>To solve this problem, I've constructed a slightly modified version of Thue that I'm calling
version 2a. In this version, all symbols obtained through stdin are treated as different from
ordinary symbols. Rules can refer to and manipulate symbols obtained through stdin by surrounding
them in double quotes.</p>
<p>A rule that replaces an ordinary symbol 'a' with 'abc':</p>
<div class="precontain">
<code>
a::=abc
</code>
</div>
<p>A rule that replaces the letter 'a' that was obtained from stdin with 'abc':</p>
<div class="precontain">
<code>
"a"::=abc
</code>
</div>
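<p>One way to picture the distinction is to tag every symbol with its origin. The Python sketch
below is my own illustration, not the code in the repository: the pair representation and the
helper names <code>ordinary</code>, <code>from_stdin</code>, <code>find</code> and
<code>apply_once</code> are all assumptions.</p>
<div class="precontain">
<pre>
```python
def ordinary(text):
    """Symbols written in the program itself."""
    return [(ch, False) for ch in text]


def from_stdin(text):
    """Symbols read from stdin, tagged so ordinary rules cannot match them."""
    return [(ch, True) for ch in text]


def find(state, pattern):
    """Index of the first occurrence of pattern in state, or -1."""
    width = len(pattern)
    for start in range(len(state) - width + 1):
        if state[start:start + width] == pattern:
            return start
    return -1


def apply_once(state, lhs, rhs):
    """Replace the first match of lhs with rhs; return None if no match."""
    start = find(state, lhs)
    if start == -1:
        return None
    return state[:start] + rhs + state[start + len(lhs):]


state = from_stdin("banana")                      # what the user typed
plain_rule = (ordinary("a"), ordinary("abc"))     # a::=abc
quoted_rule = (from_stdin("a"), ordinary("abc"))  # "a"::=abc

print(apply_once(state, *plain_rule))  # None: a typed 'a' is not an ordinary 'a'
after = apply_once(state, *quoted_rule)
print("".join(ch for ch, _ in after))  # babcnana
```
</pre>
</div>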
<p>For convenience, a number of escaped characters are also available:</p>
<div class="precontain">
<pre>
\\ -> backslash
\r -> return
\n -> newline
\: -> colon
\" -> double quote
\EOT -> end of file
</pre>
</div>
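<p>Decoding these escapes could look roughly like the Python sketch below. The table and the
function <code>decode_escapes</code> are hypothetical, and representing end of file as the ASCII
EOT character (0x04) is purely my assumption for the sake of the example.</p>
<div class="precontain">
<pre>
```python
ESCAPES = {
    "\\\\": "\\",
    "\\r": "\r",
    "\\n": "\n",
    "\\:": ":",
    '\\"': '"',
    "\\EOT": "\x04",  # assumption: end of file is modelled as ASCII EOT
}


def decode_escapes(text):
    """Expand the escapes listed above; other characters pass through."""
    result = []
    index = 0
    while index != len(text):
        for escape, value in ESCAPES.items():
            if text.startswith(escape, index):
                result.append(value)
                index += len(escape)
                break
        else:
            # No escape starts here, so copy the character unchanged.
            result.append(text[index])
            index += 1
    return "".join(result)


print(repr(decode_escapes(r'a\:\:=b\n')))  # 'a::=b\n'
print(repr(decode_escapes(r'\"\\\EOT')))   # '"\\\x04'
```
</pre>
</div>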
<p>While this doesn't solve all the problems Thue has (try writing a Thue program that asks for and
greets the user by name!) it should solve this one particular issue.</p>
{% endblock -%}