Synthesized attributes in Lark#
Attributes are values attached to the grammar symbols, and synthesized attributes are attributes of non-terminal symbols which are computed at the moment of a reduction introducing that symbol (so they are computed bottom-up).
In the grammar they are written as python expressions between {{
and }}
at the end of a rule.
For instance the classic arithmetic grammar with synthesized attributes computing the value of the expression:
from lark import Lark
synthesized_arith_grammar = """
%import common.NUMBER
%import common.WS_INLINE
%ignore WS_INLINE
?start: sum {{ syn[1] }}
?sum: product {{ syn[1] }}
| sum "+" product {{ syn[1] + syn[3] }}
| sum "-" product {{ syn[1] - syn[3] }}
?product: atom {{syn[1]}}
| product "*" atom {{syn[1] * syn[3]}}
| product "/" atom {{syn[1] / syn[3]}}
?atom: NUMBER {{int(syn[1])}}
| "-" atom {{-syn[2]}}
| "(" sum ")" {{syn[2]}}
"""
parser = Lark(synthesized_arith_grammar, parser="lalr")
interactive_parser = parser.parse_interactive(start="start")
for t in parser.lex("5 * 3 - 7"):
interactive_parser.feed_token(t)
interactive_parser.feed_eof()
interactive_parser.parser_state.attribute
8
In these attributes, syn
is a special variable referring to the synthesized attributes of the symbols of the expansion it annotates.
The indexing starts at 1 (like in yacc/bison), and terminal symbols are automatically given a synthesized attribute
of their matching value.
Inside the parser the attribute expressions are stored as python AST objects attached to the rules. A stack of attributes is maintained alongside the stack of values and the stack of states, and
these AST objects are transformed so that the syn
slices point to the correct element of the attributes stack.
Let’s look at the internals of one of the rules deriving product
for illustration:
import ast
rule = parser.rules[5]
print(f"{rule.origin} : {rule.expansion}, {{{{ {ast.unparse(rule.ast)} }}}} ")
NonTerminal(Token('RULE', 'product'),'') : [NonTerminal('product',''), Terminal('STAR'), NonTerminal('atom','')], {{ stack[-3] * stack[-1] }}
The AST expression is evaluated at the reduction step, when the top of the stack of attributes is filled with the
attributes of the symbols product
, STAR
and atom
.
The environment of its evaluation is set so that stack
points to attributes stack, effectively computing the product of the attribute of product
with the attribute of atom
, and pushing the result into the attributes stack.