{% extends "base.html" %}
{% block title %}Thue Version 2a{% endblock %}
{% block content %}
<h4>Thue Version 2a</h4>
<p><a href="http://esolangs.org/wiki/Thue" target="_blank">Thue</a> is an esoteric programming
language based on unrestricted grammars. A Thue program consists of a list of rules, each giving
a sequence of symbols to replace and the sequence of symbols to replace it with, followed by an
initial state. Applicable rules are then applied to the initial state in random order until
none remain applicable, at which point the program terminates.</p>
<p>An example Thue program that increments a binary number surrounded by '_' characters:
<code>
1_::=1++
0_::=1
01++::=10
11++::=1++0
_0::=_
_1++::=10
::=
_1111111_
</code>
</p>
<p>The left and right hand sides of each rule are separated by '::=', and the list of rules is
ended by a blank rule. Note that whitespace in rules and the initial state is NOT ignored.</p>
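<p>As a rough illustration of the format and the execution model described above, here is a minimal
Python sketch. It is not the interpreter linked from this page. The names <code>parse_thue</code>,
<code>find_all</code> and <code>run_thue</code> are hypothetical, input and output rules are left
out, and two details are assumptions: lines without '::=' in the rule list are skipped, and the
initial state is every line after the blank rule joined together.</p>

```python
import random


def parse_thue(source):
    """Split a Thue program into (rules, initial_state) without stripping whitespace."""
    lines = source.split("\n")
    rules = []
    for index, line in enumerate(lines):
        lhs, sep, rhs = line.partition("::=")
        if not sep:
            continue  # assumption: lines without '::=' in the rule list are skipped
        if lhs == "":
            # A blank rule ends the rule list; the rest is the initial state.
            return rules, "".join(lines[index + 1:])
        rules.append((lhs, rhs))
    return rules, ""


def find_all(text, pattern):
    """Return every position (overlapping included) where pattern occurs in text."""
    positions = []
    start = text.find(pattern)
    while start != -1:
        positions.append(start)
        start = text.find(pattern, start + 1)
    return positions


def run_thue(rules, state, rng=random, max_steps=10000):
    """Apply randomly chosen applicable rules until none apply."""
    for _ in range(max_steps):
        matches = [(pos, lhs, rhs)
                   for lhs, rhs in rules
                   for pos in find_all(state, lhs)]
        if not matches:
            return state  # no rule applies: the program terminates
        pos, lhs, rhs = rng.choice(matches)
        state = state[:pos] + rhs + state[pos + len(lhs):]
    raise RuntimeError("step limit reached")  # guard added for this sketch only
```

<p>Fed the increment program above, <code>run_thue</code> should turn <code>_1111111_</code> into
<code>10000000</code>. At every step of that program only one rule matches, so the random choice
does not affect the result.</p>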
<p>Output is handled by prefixing the right hand side with '~', which causes those symbols to
go to stdout and the replacement in the program to be the empty string.</p>
<p>The traditional Hello World program:
<code>
a::=~Hello World!
::=
a
</code>
</p>
<p>Input is handled by having the right hand side of a rule be ':::', which causes the left
hand side symbols to be replaced with a line from standard input. Unfortunately, this
immediately causes problems.</p>
<p>The following is an innocent piece of code that accepts a single line of input and does nothing
more. Maybe.
<code>
a::=:::
::=
a
</code>
</p>
<p>If a string containing the letter 'a' is entered into the above program, the single input rule
becomes applicable again and another line of input is read. In other words, input in Thue is
unescaped and allows direct code injection into a program.</p>
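<p>The effect can be reproduced with a small Python sketch of classic Thue input handling. This is
a simplification made for illustration: the hypothetical <code>run_classic</code> always picks the
first matching rule instead of a random one, takes its input from a list rather than stdin, and
stops when an input rule fires with no lines left.</p>

```python
def run_classic(rules, state, input_lines):
    """Run classic Thue rules; a right hand side of ':::' reads the next input line.

    Returns the final state and the number of input lines consumed.
    """
    pending = list(input_lines)
    reads = 0
    while True:
        for lhs, rhs in rules:
            pos = state.find(lhs)
            if pos != -1:
                break  # first applicable rule wins (simplification)
        else:
            return state, reads  # no rule applies
        if rhs == ":::":
            if not pending:
                return state, reads  # out of input: stop (sketch-only behaviour)
            rhs = pending.pop(0)  # the raw input becomes ordinary program symbols
            reads += 1
        state = state[:pos] + rhs + state[pos + len(lhs):]
```

<p>With the rule <code>a::=:::</code> and initial state <code>a</code>, the input line
<code>hello</code> is read once and the program stops. The input line <code>banana</code> leaves
more 'a' symbols in the state, so the same rule fires again and a second line is consumed.</p>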
<p>To solve this problem, I've constructed a slightly modified version of Thue that I'm calling
version 2a. In this version, all symbols obtained through stdin are treated as different from
ordinary symbols. Rules can refer to and manipulate symbols obtained through stdin by
surrounding them in double quotes.</p>
<p>A rule that replaces an ordinary symbol 'a' with 'abc':
<code>
a::=abc
</code>
</p>
<p>A rule that replaces the letter 'a' that was obtained from stdin with 'abc':
<code>
"a"::=abc
</code>
</p>
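<p>One way to picture the distinction is to tag every symbol with where it came from. The Python
sketch below does that, assuming a state is a list of <code>(character, from_stdin)</code> pairs.
The names <code>parse_side</code>, <code>read_input</code> and <code>find_symbols</code> are
hypothetical, escapes are ignored, and the real version 2a source may represent things quite
differently.</p>

```python
def parse_side(text):
    """Turn one side of a rule into a list of (char, from_stdin) symbols.

    Characters between double quotes are marked as stdin symbols.
    """
    symbols = []
    quoted = False
    for ch in text:
        if ch == '"':
            quoted = not quoted  # toggle between ordinary and stdin symbols
            continue
        symbols.append((ch, quoted))
    return symbols


def read_input(line):
    """Every symbol read from stdin is tagged as coming from stdin."""
    return [(ch, True) for ch in line]


def find_symbols(state, pattern):
    """Return the first position where pattern occurs in state, or -1."""
    for pos in range(len(state) - len(pattern) + 1):
        if state[pos:pos + len(pattern)] == pattern:
            return pos
    return -1
```

<p>Under this model the pattern for <code>a</code> never matches a state produced by
<code>read_input("banana")</code>, while the pattern for <code>"a"</code> matches at position 1,
so typed input can no longer trigger rules written for ordinary symbols.</p>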
<p>For convenience, a number of escaped characters are also available:
<pre>
\\ -> backslash
\r -> return
\n -> newline
\: -> colon
\" -> double quote
\EOT -> end of file
</pre>
</p>
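<p>A decoder for the table above could look roughly like the following Python sketch. It makes
two assumptions that are not stated on this page: <code>\EOT</code> is mapped to the ASCII
end-of-transmission character (code 4) as a stand-in for end of file, and a backslash that does
not start a listed escape is kept as it is. The name <code>decode_escapes</code> is hypothetical.</p>

```python
# Escape sequences from the table, paired with the text they stand for.
ESCAPES = [
    ("\\EOT", "\x04"),  # assumption: end of file modelled as ASCII code 4
    ("\\\\", "\\"),
    ("\\r", "\r"),
    ("\\n", "\n"),
    ("\\:", ":"),
    ('\\"', '"'),
]


def decode_escapes(text):
    """Replace each listed escape sequence in text with the character it denotes."""
    out = []
    i = 0
    while i != len(text):
        for escape, value in ESCAPES:
            if text.startswith(escape, i):
                out.append(value)
                i += len(escape)
                break
        else:
            out.append(text[i])  # ordinary character, copied unchanged
            i += 1
    return "".join(out)
```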
<p>While this doesn't solve all the problems Thue has (try writing a Thue program that asks
for and greets the user by name!) it should solve this one particular issue. Source code is
available <a href="/cgit/cgit.cgi/esoteric.git/" target="_blank">here</a>.</p>
{% endblock %}