Implement unusual syntax rules in grammar

Igor V. Burago iburago at gmail.com
Sun Aug 9 09:22:41 UTC 2009


Dear members of the list,

I'm implementing a language for scientific computation similar to Matlab, and I found Parrot to be
the best solution for that.  The syntax of Matlab (which I'm trying to stay compatible with) isn't
entirely logical or consistent, so I've run into some problems expressing its grammar in Perl 6 rules.

 1. The language's control structures ('if', 'while', etc.) always end with the keyword 'end'.  This
    reserved word may not be used for any other purpose... except indexing!  Inside an array
    subscript it stands for the length of the corresponding array dimension.

    For example,
        if <condition>
            <block of code>
        end
    but
        array(1, end - 3, 5) = 1

    I didn't find any language implementation in Parrot that handles a similar "feature".  Currently
    I can see only one way to implement this: in the grammar, remove 'end' from the keyword list and
    forbid a bare 'end' statement, so that 'end' is a normal identifier but never a statement on its own.

        rule statement
        {
            # negative lookahead: a bare 'end' must not start a statement
            <!statement_end>
            [
                | <statement_if> {*} #= statement_if
                | <statement_while> {*} #= statement_while
                # other types of statements here
            ]
        }

        token statement_end
        {
            'end' <.sep>
            # <sep> is the statement terminator token
        }

    In other contexts, such as ordinary expressions, I could reject the 'end' identifier later, in
    NQP.  I don't think this is a sane solution to my problem (I also understand that this use of a
    keyword is not quite sane either).  Is there a better way?  Please let me know.
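
    To make the intended distinction concrete, here is a minimal sketch in Python rather than in
    PGE rules.  The names TOKEN and classify_end are made up for this illustration, and it only
    tracks parentheses: an 'end' inside parentheses is treated as an index, any other 'end' as a
    block terminator.  A real grammar would need the same kind of context, however it is expressed.

```python
import re

# Hypothetical token pattern for this sketch: identifiers, integers, single symbols.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")

def classify_end(source):
    """Label each 'end' token as 'index' (inside parentheses) or 'keyword' (outside)."""
    depth = 0      # current parenthesis nesting depth
    labels = []
    for tok in TOKEN.findall(source):
        if tok == "(":
            depth += 1
        elif tok == ")":
            depth = max(depth - 1, 0)
        elif tok == "end":
            labels.append("index" if depth > 0 else "keyword")
    return labels

if __name__ == "__main__":
    program = "if x\n    array(1, end - 3, 5) = 1\nend"
    print(classify_end(program))
```

    The sketch deliberately ignores strings, comments, and other bracket types, so it shows only
    the idea of deciding the meaning of 'end' from the nesting context.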

 2. For Matlab compatibility, the language also uses whitespace inconsistently.  In most cases
    whitespace is insignificant and can simply be skipped by the <ws> rule.  But there is one
    exception: the array constructor, where whitespace acts as the delimiter between elements.

    For example,
        [ 1 2 3 ]               three elements
        [ 1 -2 3 ]              still three elements
        [ 1-2 3 ]               two elements
        [ 1- 2 ]                one element
        [ 1 - 2 ]               again, one element

    How can whitespace matching behavior be changed (for example, switched from one meaning to
    another inside and outside of expressions in an array constructor) to parse such weird
    constructions?
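
    The rule I read out of the five examples above can be sketched in Python (again not PGE; the
    function name split_row is made up, and only numbers with '+' and '-' are handled).  The
    assumption is that whitespace separates two elements unless it sits next to a binary operator,
    and that a sign preceded by whitespace but not followed by it is unary and starts a new element.

```python
def split_row(row):
    """Split the text between '[' and ']' into element strings (illustrative rule only)."""
    elements = []
    current = ""               # element being accumulated, with whitespace removed
    i, n = 0, len(row)
    while i < n:
        ch = row[i]
        if not ch.isspace():
            current += ch
            i += 1
            continue
        # Skip the whole run of whitespace and look at what follows it.
        j = i
        while j < n and row[j].isspace():
            j += 1
        if j == n:
            break              # trailing whitespace: nothing more to read
        nxt = row[j]
        after = row[j + 1] if j + 1 < n else " "
        if not current:
            pass               # leading whitespace
        elif current[-1] in "+-":
            pass               # operator still waiting for its right operand
        elif nxt in "+-" and after.isspace():
            pass               # whitespace on both sides of the sign: binary operator
        else:
            elements.append(current)   # whitespace acts as the element delimiter
            current = ""
        i = j
    if current:
        elements.append(current)
    return elements

if __name__ == "__main__":
    for row in [" 1 2 3 ", " 1 -2 3 ", " 1-2 3 ", " 1- 2 ", " 1 - 2 "]:
        print(repr(row), len(split_row(row)))
```

    This is a hand-written scanner, not a grammar, so it only pins down the behavior that the
    whitespace rule would have to reproduce.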

--
Igor V. Burago


More information about the parrot-dev mailing list