{%- extends "base_plain.xhtml" -%} {%- block title -%}Thue Version 2a{%- endblock -%} {%- block footer -%}{{ plain_footer ("thue2a.xhtml") }}{%- endblock -%} {%- block content %}

Thue Version 2a

Git repository: Link

1/1/2017

Thue is an esoteric programming language based on unrestricted grammars. A Thue program consists of a list of rules, each giving a sequence of symbols to replace and the sequence of symbols to replace it with, followed by the initial state of the program. Applicable rules are then applied to the state in a random order until no more are applicable, at which point the program terminates.

An example Thue program that increments a binary number surrounded by '_' characters:

1_::=1++
0_::=1
01++::=10
11++::=1++0
_0::=_
_1++::=10
::=
_1111111_

The before and after symbols in each rule are separated by '::=', and the list of rules is ended by a blank rule. Note that whitespace in rules and the initial state is NOT ignored.
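
The parsing and rewriting steps described so far can be sketched in a few lines of Python. This is only an illustration, not the interpreter from the repository: the names parse_thue, find_matches and run are made up for this example, and it assumes that a line consisting of just '::=' is the blank rule, that the initial state is the remaining lines joined together, and that a step limit is an acceptable guard against programs that never terminate. Output and input rules are not handled here.

```python
import random

def parse_thue(source):
    """Split a Thue program into a rule list and an initial state."""
    lines = source.split("\n")
    rules = []
    for index, line in enumerate(lines):
        if line.strip() == "::=":
            # Blank rule: everything after it is the initial state
            # (joined without separators, an assumption of this sketch).
            return rules, "".join(lines[index + 1:])
        lhs, separator, rhs = line.partition("::=")
        if separator and lhs:
            rules.append((lhs, rhs))  # whitespace is kept exactly as written
    raise ValueError("program has no terminating blank rule")

def find_matches(rules, state):
    """List every (position, lhs, rhs) where some rule could fire."""
    matches = []
    for lhs, rhs in rules:
        start = state.find(lhs)
        while start != -1:
            matches.append((start, lhs, rhs))
            start = state.find(lhs, start + 1)
    return matches

def run(rules, state, max_steps=10_000):
    """Apply randomly chosen applicable rules until none remain."""
    for _ in range(max_steps):
        matches = find_matches(rules, state)
        if not matches:
            return state
        start, lhs, rhs = random.choice(matches)
        state = state[:start] + rhs + state[start + len(lhs):]
    raise RuntimeError("step limit reached before termination")

# The increment program from above, one rule per line.
INCREMENT = "\n".join([
    "1_::=1++", "0_::=1", "01++::=10", "11++::=1++0",
    "_0::=_", "_1++::=10", "::=", "_1111111_",
])

rules, state = parse_thue(INCREMENT)
print(run(rules, state))  # 10000000
```

For this particular program exactly one rule is applicable at every step, so the random choice never matters and 1111111 always becomes 10000000.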

Output is handled by prefixing the right-hand side with '~', which sends the symbols that follow to stdout and replaces the left-hand side in the program with the empty string.

The traditional Hello World program:

a::=~Hello World!
::=
a
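
As a rough sketch of how an interpreter might treat such a rule, the Python below applies a single match and handles the '~' prefix. The function name apply_rule is hypothetical, and collecting output in a list instead of writing straight to stdout is a choice made only so the example is easy to check.

```python
def apply_rule(state, start, lhs, rhs, output):
    """Rewrite one match; '~' rules emit text instead of inserting it."""
    if rhs.startswith("~"):
        output.append(rhs[1:])  # symbols after '~' go to the output
        replacement = ""        # and the match is replaced by nothing
    else:
        replacement = rhs
    return state[:start] + replacement + state[start + len(lhs):]

printed = []
state = apply_rule("a", 0, "a", "~Hello World!", printed)
print("".join(printed))  # Hello World!
print(repr(state))       # ''
```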

Input is handled by having the right-hand side of a rule be ':::', which causes the left-hand side symbols to be replaced with a line from standard input. Unfortunately, this immediately causes problems.

The following is an innocent piece of code that accepts a single line of input and does nothing more. Maybe.

a::=:::
::=
a

If a string containing the letter 'a' is entered into the above program, the single input rule becomes applicable again and another line of input is read. In other words, input in Thue is unescaped, which allows direct code injection into a program.
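
The effect is easy to reproduce with a small simulation. The Python sketch below is not a full interpreter: run_with_input is a made-up helper that always fires the first applicable rule rather than a random one, takes its "stdin" from a list, and simply stops when that list runs out. It counts how many lines the program ends up reading.

```python
def run_with_input(rules, state, input_lines):
    """Run classic Thue rules; return the final state and lines read."""
    pending = list(input_lines)
    reads = 0
    while True:
        # Pick the first rule that matches (deterministic for clarity).
        for lhs, rhs in rules:
            start = state.find(lhs)
            if start != -1:
                break
        else:
            return state, reads  # no rule applies: the program terminates
        if rhs == ":::":
            if not pending:
                return state, reads  # out of input: stop (assumption)
            rhs = pending.pop(0)     # the raw line becomes program text
            reads += 1
        state = state[:start] + rhs + state[start + len(lhs):]

rules = [("a", ":::")]
print(run_with_input(rules, "a", ["ok"]))       # ('ok', 1)
print(run_with_input(rules, "a", ["a", "ok"]))  # ('ok', 2)
```

Typing 'a' as the first line makes the program ask for a second one, because the text that was read is indistinguishable from the symbol the rule is looking for.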

To solve this problem, I've constructed a slightly modified version of Thue that I'm calling version 2a. In this version, all symbols obtained through stdin are treated as different from ordinary symbols. Rules can refer to and manipulate symbols obtained through stdin by surrounding them in double quotes.

A rule that replaces an ordinary symbol 'a' with 'abc':

a::=abc

A rule that replaces the letter 'a' that was obtained from stdin with 'abc':

"a"::=abc

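One way to picture the distinction is to store every symbol together with a flag saying whether it came from stdin. The Python below is a sketch under that assumption, not a description of how the actual implementation works: parse_pattern, tag_input and find are hypothetical names, a quoted run is assumed to mark every symbol inside it as stdin-origin, and escapes are ignored.

```python
def parse_pattern(text):
    """Turn rule text into (char, from_stdin) pairs; quotes mark stdin symbols."""
    symbols = []
    quoted = False
    for char in text:
        if char == '"':
            quoted = not quoted  # toggle quoting; escapes are not handled here
        else:
            symbols.append((char, quoted))
    return symbols

def tag_input(line):
    """Mark every symbol of a line read from stdin as input-origin."""
    return [(char, True) for char in line]

def find(state, pattern):
    """Index of the first place pattern occurs in state, or -1."""
    for start in range(len(state) - len(pattern) + 1):
        if state[start:start + len(pattern)] == pattern:
            return start
    return -1

state = tag_input("a")                     # the user typed 'a'
print(find(state, parse_pattern("a")))     # -1: the ordinary rule ignores it
print(find(state, parse_pattern('"a"')))   # 0: the quoted rule matches it
```

With this representation an input rule whose left-hand side is the ordinary symbol 'a' can no longer be retriggered by whatever the user types.
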
For convenience, a number of escaped characters are also available:

\\ -> backslash
\r -> return
\n -> newline
\: -> colon
\" -> double quote
\EOT -> end of file
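
A decoder for this table might look like the Python sketch below. The function name decode_escapes is invented for the example, representing end of file with the ASCII EOT character (0x04) is an assumption, and rejecting unknown escapes with an error is a design choice of the sketch rather than documented behaviour.

```python
ESCAPES = {
    "\\": "\\",  # \\ -> backslash
    "r": "\r",   # \r -> return
    "n": "\n",   # \n -> newline
    ":": ":",    # \: -> colon
    '"': '"',    # \" -> double quote
}
EOT = "\x04"  # assumption: ASCII EOT stands in for end of file

def decode_escapes(text):
    """Replace each escape sequence in rule text with the symbol it names."""
    result = []
    index = 0
    while index < len(text):
        char = text[index]
        if char != "\\":
            result.append(char)
            index += 1
        elif text.startswith("EOT", index + 1):
            result.append(EOT)
            index += 4  # skip the backslash and the three letters
        elif index + 1 < len(text) and text[index + 1] in ESCAPES:
            result.append(ESCAPES[text[index + 1]])
            index += 2
        else:
            raise ValueError(f"unknown escape at position {index}")
    return "".join(result)

print(repr(decode_escapes(r"a\:b")))  # 'a:b'
```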

While this doesn't solve all the problems Thue has (try writing a Thue program that asks for and greets the user by name!) it should solve this one particular issue.

{% endblock -%}