Xcpp preprocessor grammar

The grammar is written in the EBNF notation defined by ISO/IEC 14977:1996(E), Information technology - Syntactic metalanguage - Extended BNF, First edition, 1996-12-15.

Lexical scanner

Let printableChar denote a printable ASCII character. The following rules describe the lexical scanner. White space and comment handling is not described here; it follows the usual C/C++ conventions.


letter =
    'A' | 'B' | 'C' | 'D' | 'E' | 'F' | 'G' |
    'H' | 'I' | 'J' | 'K' | 'L' | 'M' | 'N' |
    'O' | 'P' | 'Q' | 'R' | 'S' | 'T' | 'U' |
    'V' | 'W' | 'X' | 'Y' | 'Z' |
    'a' | 'b' | 'c' | 'd' | 'e' | 'f' | 'g' |
    'h' | 'i' | 'j' | 'k' | 'l' | 'm' | 'n' |
    'o' | 'p' | 'q' | 'r' | 's' | 't' | 'u' |
    'v' | 'w' | 'x' | 'y' | 'z';

digit =
    '0' | '1' | '2' | '3' | '4' |
    '5' | '6' | '7' | '8' | '9';

identifier =
    ('_' | letter), {'_' | letter | digit};

boolLiteral = 'false' | 'true';

digSeq = digit, {digit};

integerLiteral = digSeq;

exp = ('E'|'e'), ['+'|'-'], digSeq;

floatLiteral =
    digSeq, exp |
    (digSeq, '.' | [digSeq], '.', digSeq), [exp];

hexDigit = digit |
    'A' | 'B' | 'C' | 'D' | 'E' | 'F' |
    'a' | 'b' | 'c' | 'd' | 'e' | 'f';

hexLiteral = '0x', hexDigit, {hexDigit};

escapeChar =
    'r' | 'n' | 'a' | 'b' | 'f' | 't' |
    'v' | '0' | '"' | "'" | '\' | '?';

stringChar =
    printableChar - ('\' | '"' | "'") |
    '\', escapeChar;

stringLiteral =
    '"', {stringChar}, '"' |
    "'", {stringChar}, "'";

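As a rough illustration of the identifier and numeric literal rules above, the following Python sketch transcribes them into regular expressions and reports which rule matches a complete piece of text. The names TOKEN_PATTERNS and classify are hypothetical and not part of xcpp. The order in which the rules are tried is an assumption, and string literals, white space and comments are left out.

```python
import re

# Regular expressions transcribed from the EBNF rules above.
DIG_SEQ = r"[0-9]+"
EXP = rf"[Ee][+-]?{DIG_SEQ}"
FLOAT = (
    rf"(?:{DIG_SEQ}{EXP}"
    rf"|(?:{DIG_SEQ}\.{DIG_SEQ}|\.{DIG_SEQ}|{DIG_SEQ}\.)(?:{EXP})?)"
)

# Assumption: more specific rules are tried first, so that 'true' is
# reported as a boolLiteral rather than as an identifier.
TOKEN_PATTERNS = [
    ("hexLiteral", r"0x[0-9A-Fa-f]+"),
    ("floatLiteral", FLOAT),
    ("integerLiteral", DIG_SEQ),
    ("boolLiteral", r"false|true"),
    ("identifier", r"[_A-Za-z][_A-Za-z0-9]*"),
]

def classify(text):
    """Return the name of the first rule matching all of text, or None."""
    for name, pattern in TOKEN_PATTERNS:
        if re.fullmatch(pattern, text):
            return name
    return None
```

For example, classify("0x1F") reports hexLiteral and classify("1.") reports floatLiteral, while classify("1x") returns None because no rule matches the whole text.
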
Expressions

The xcpp preprocessor must evaluate expressions in several situations: when coercing typed macro arguments and typed macro return values, in the @() directive, and in the boolean conditions of @if-@else directives.


expression = assignmentExpr;

assignmentOp =
    '=' | '&=' | '|=' | '^=' | '<<=' | '>>=' |
    '&&=' | '||=' | '^^=' |
    '+=' | '-=' | '*=' | '/=' | '%=';

assignmentExpr =
    {varRef, assignmentOp}, conditionalExpr;

varRef = unaryExpr;

conditionalExpr =
    logicalOrExpr, { '?', expression, ':', logicalOrExpr };

logicalOrExpr =
    logicalAndExpr, { '||', logicalAndExpr };

logicalAndExpr =
    bitwiseOrExpr, { '&&', bitwiseOrExpr };

bitwiseOrExpr =
    bitwiseXorExpr, { '|', bitwiseXorExpr };

bitwiseXorExpr =
    bitwiseAndExpr, { '^', bitwiseAndExpr };

bitwiseAndExpr =
    equalityExpr, { '&', equalityExpr };

equalityExpr =
    relationalExpr, { ('==' | '!='), relationalExpr };

relationalExpr =
    shiftExpr, { ('<' | '<=' | '>' | '>='), shiftExpr };

shiftExpr =
    additiveExpr, { ('>>' | '<<'), additiveExpr };

additiveExpr =
    multExpr, { ('+' | '-'), multExpr };

multExpr =
    powExpr, { ('*' | '/' | '%'), powExpr };

powExpr =
    unaryExpr, { '^^', unaryExpr };

unaryExpr =
    {'+' | '-' | '~' | '!'}, postfixExpr;

unaryFnName =
    'len' |
    'bool' | 'int' | 'double' |
    'char' | 'string' |
    'is_bool' | 'is_int' | 'is_double' |
    'is_char' | 'is_string' |
    'sin' | 'cos' | 'tan' | 'exp' | 'log';

postfixExpr =
    primaryExpr, { '[', expression, ']' };

literal =
    boolLiteral |
    integerLiteral |
    hexLiteral |
    floatLiteral |
    stringLiteral;

primaryExpr =
    identifier - unaryFnName |
    unaryFnName, '(', expression, ')' |
    literal |
    '(', expression, ')';
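
The layering of the rules above fixes operator precedence: each rule is built from the next tighter-binding one. The following Python sketch is a minimal recursive-descent evaluator for a small subset of the grammar (integer literals, parentheses, unary '+' and '-', '^^', '*', binary '+' and '-'). It is an illustration under stated assumptions, not the xcpp implementation: the names tokenize, Parser and evaluate are hypothetical, '^^' is assumed to mean exponentiation, and every repetition is assumed to associate to the left.

```python
import re

TOKEN_RE = re.compile(r"\s*(\^\^|[0-9]+|[-+*()])")

def tokenize(text):
    """Split text into the tokens of the subset handled below."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected input at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        self.pos += 1
        return token

    def additive(self):
        # additiveExpr = multExpr, { ('+' | '-'), multExpr };
        value = self.mult()
        while self.peek() in ("+", "-"):
            if self.take() == "+":
                value += self.mult()
            else:
                value -= self.mult()
        return value

    def mult(self):
        # multExpr = powExpr, { '*', powExpr };  ('/' and '%' are omitted)
        value = self.power()
        while self.peek() == "*":
            self.take()
            value *= self.power()
        return value

    def power(self):
        # powExpr = unaryExpr, { '^^', unaryExpr };
        value = self.unary()
        while self.peek() == "^^":
            self.take()
            value **= self.unary()  # assumption: left-associative
        return value

    def unary(self):
        # unaryExpr = {'+' | '-'}, postfixExpr;  ('~' and '!' are omitted)
        if self.peek() == "-":
            self.take()
            return -self.unary()
        if self.peek() == "+":
            self.take()
            return self.unary()
        return self.primary()

    def primary(self):
        token = self.take()
        if token == "(":
            value = self.additive()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return value
        if token is not None and token.isdigit():
            return int(token)
        raise SyntaxError(f"unexpected token {token!r}")

def evaluate(text):
    """Evaluate text as an additiveExpr of the subset described above."""
    parser = Parser(tokenize(text))
    value = parser.additive()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return value
```

Because unaryExpr sits below powExpr in the grammar, a leading sign binds more tightly than '^^': under the assumptions above, evaluate("-2 ^^ 2") yields 4, and evaluate("2 * 3 ^^ 2") yields 18.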

Grammar of @def directive


type =
    'bool' | 'int' | 'double' |
    'char' | 'string';

argType = type;
argName = identifier;
arg = [argType], argName;

returnType = type;
macroName = identifier;

defDirective =
    '@def', [returnType], macroName,
    ['(', [arg, { ',', arg }], ')'],
    '=',
    value;

More informally, the syntax can be written as follows:


@def [return-type] x [([type] a1,...,[type] an)] = y

This defines a macro named x. The directive itself expands into nothing in the output. x must be a C/C++ style identifier. The macro named x is added to the local namespace, where it may replace an existing definition of x. Macros cannot be overloaded, not even on arity. The body of a macro introduces a new scope (i.e. namespace) that contains the names of the formal arguments of that macro.
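
The namespace behaviour described above can be modelled with a short Python sketch. This is only an illustration: local_ns, define and body_scope are hypothetical names, and the sketch assumes that a name not found in a macro body's scope is looked up in the enclosing namespace, which the text above does not state.

```python
from collections import ChainMap

# Hypothetical model: a namespace maps macro names to their definitions.
local_ns = ChainMap({})

def define(ns, name, definition):
    """Model @def: add name to ns, replacing any existing definition."""
    ns[name] = definition

define(local_ns, "x", "first body")
define(local_ns, "x", "second body")  # replaces the first; no overloading

# Expanding a macro body opens a new scope that holds its formal arguments.
# Assumption: names not found there are looked up in the enclosing namespace.
body_scope = local_ns.new_child({"a1": "text bound to a1"})
```

In this model local_ns["x"] is "second body", body_scope can see both a1 and x, and a1 is not visible in local_ns.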

The return type is optional. If given, it must be bool, int, double, char or string. If no return type is provided, an invocation of x expands into the result of macro expanding y. If a return type is provided, y is first macro expanded and then evaluated as an expression whose result must be implicitly convertible to the return type. The invocation of x is then replaced by the string representation of the evaluated result.

The argument list is optional. Each formal argument must be an identifier. Formal argument names cannot be repeated. A formal argument can optionally be preceded by a type, which must be bool, char, int, double or string. When a formal argument is not typed, it binds directly to the text provided in the invocation of the macro. When a formal argument is typed, the text provided for it in the invocation is first macro expanded and then evaluated as an expression whose result must be implicitly convertible to the type of the formal argument. The formal argument then binds to the string representation of the evaluated result.
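
The difference between untyped and typed formal arguments can be sketched in Python as follows. Everything here is hypothetical: bind_argument, CONVERTERS and to_text are illustrative names, ast.literal_eval merely stands in for macro expansion followed by xcpp expression evaluation, the conversions are Python's own rather than xcpp's implicit conversion rules, and the char type is omitted.

```python
import ast

# Hypothetical converters; xcpp's implicit conversion rules are not modelled.
CONVERTERS = {"bool": bool, "int": int, "double": float, "string": str}

def to_text(value):
    """Return a string representation of an evaluated value."""
    # Assumption: booleans are written with the boolLiteral spellings.
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)

def bind_argument(arg_type, text):
    """Return the text a formal argument binds to for the given actual text."""
    if arg_type is None:
        # Untyped: bind directly to the text given in the invocation.
        return text
    # Typed: evaluate (stand-in evaluator), convert, then bind to the
    # string representation of the result.
    value = ast.literal_eval(text.strip())
    return to_text(CONVERTERS[arg_type](value))
```

With this sketch, bind_argument(None, "1 + 2") returns the text "1 + 2" unchanged, whereas bind_argument("int", "0x10") evaluates the literal first and binds to "16".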