[libre-riscv-dev] [Bug 280] POWER spec parser needs to support fall-through cases
bugzilla-daemon at libre-riscv.org
Sun Apr 5 22:43:12 BST 2020
http://bugs.libre-riscv.org/show_bug.cgi?id=280
--- Comment #4 from Jacob Lifshay <programmerjake at gmail.com> ---
(In reply to Luke Kenneth Casson Leighton from comment #3)
> (In reply to Jacob Lifshay from comment #2)
> > (In reply to Luke Kenneth Casson Leighton from comment #0)
> > > this is very tricky to get working at the *lexer* level and still support
> > > whitespace indentation.
> > >
> > > switch (n)
> > >     case(1): x <- 5
> > >     case(2): # here
> > >     case(3):
> > >         x <- 3
> > >     default:
> > >         x <- 9
> > > print (5)
> >
> > How about switching the grammar to parse a case-sequence instead of a single
> > case, that way multiple cases before a statement block would be correctly
> > handled.
>
> annoyingly, the more changes that are made to the grammar, the
> less "like the spec" the grammar becomes, with the implication
> that further manual intervention stages are required when it
> comes to verifying against the 3.0B spec and, in future, against
> 3.0C and other releases.
Um, I don't think the pseudocode in the Power spec was ever supposed to be
Python, so I have no issues with changing the grammar file to more accurately
match the spec pdf even if the grammar doesn't match Python as closely.
> i've made quite a few minor changes, some of them necessary (to support
> syntax such as CR0[x] rather than CR0_subscript_x), some of them, well,
> not being lazy, just "trying to get it working fast"
>
> yet-another-preprocessor-stage - even if it is line-based rather than
> ply-lexer-based, looking for 2 "case" statements (or case followed by
> default) which are lined up, and inserting the ghost word "fallthrough"
> would "do the trick"
I think having the grammar correctly reflect the syntax that is actually used
is probably a better way to go than adding preprocessing kludges that insert
`fallthrough` everywhere.
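As a rough illustration of the case-sequence idea (not the actual parser code: `group_case_sequences` and the `(label, body)` representation are hypothetical names made up for this sketch), consecutive case labels without a body are collected and attached to the next statement block:

```python
def group_case_sequences(entries):
    """entries: list of (label, body) pairs in source order, where body
    is None for a case label that has no statements of its own.
    Returns a list of (labels, body) pairs, one per case-sequence."""
    grouped = []
    pending = []  # labels seen so far that are still waiting for a body
    for label, body in entries:
        pending.append(label)
        if body is not None:
            # this body belongs to every label collected since the last body
            grouped.append((pending, body))
            pending = []
    if pending:
        raise SyntaxError("case labels %r have no statement block" % (pending,))
    return grouped


# the switch example from comment #0, already split into labels and bodies
switch_entries = [
    ("1", "x <- 5"),
    ("2", None),       # falls through to case 3
    ("3", "x <- 3"),
    ("default", "x <- 9"),
]
grouped_cases = group_case_sequences(switch_entries)
```

With this shape, case(2) and case(3) end up sharing one statement block, and no `fallthrough` keyword has to be inserted into the input.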
The space-counting done by the lexer translates the spaces into the INDENT and
DEDENT tokens.
The following algorithm should work to translate lines:
def lexer(lines):
    """ lines is an iterator that returns each line
    of text without the trailing newline """
    # INDENT, DEDENT, NL, tokenize() and IndentDepthMismatch are
    # assumed to be defined elsewhere
    indent_depth_stack = [0]
    for line in lines:
        # assume we don't have to worry about tabs in string literals
        expanded_line = line.expandtabs()
        line = expanded_line.lstrip()
        # count indent depth
        depth = len(expanded_line) - len(line)
        if line == "" or line[0] == "#":
            # empty lines don't have to match depth
            # don't yield repeated NL tokens
            continue
        if depth > indent_depth_stack[-1]:
            yield INDENT
            indent_depth_stack.append(depth)
        else:
            while depth < indent_depth_stack[-1]:
                yield DEDENT
                indent_depth_stack.pop()
            if depth > indent_depth_stack[-1]:
                # dedented to a depth that was never opened
                raise IndentDepthMismatch("indent depth doesn't match!")
        yield from tokenize(line)
        yield NL
    # close any blocks still open at end of input
    while indent_depth_stack[-1] != 0:
        yield DEDENT
        indent_depth_stack.pop()
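
To try the algorithm on the switch example from comment #0, here is a self-contained sketch. The string-valued INDENT/DEDENT/NL tokens, the `IndentDepthMismatch` class and the whitespace-splitting `tokenize()` are placeholders assumed for illustration; the real ply lexer would supply its own.

```python
INDENT, DEDENT, NL = "INDENT", "DEDENT", "NL"  # placeholder token values


class IndentDepthMismatch(Exception):
    pass


def tokenize(line):
    # stand-in tokenizer: drop a trailing comment, split on whitespace
    return line.split("#", 1)[0].split()


def lexer(lines):
    indent_depth_stack = [0]
    for line in lines:
        expanded_line = line.expandtabs()
        line = expanded_line.lstrip()
        depth = len(expanded_line) - len(line)
        if line == "" or line[0] == "#":
            continue  # blank and comment-only lines produce no tokens
        if depth > indent_depth_stack[-1]:
            yield INDENT
            indent_depth_stack.append(depth)
        else:
            while depth < indent_depth_stack[-1]:
                yield DEDENT
                indent_depth_stack.pop()
            if depth > indent_depth_stack[-1]:
                raise IndentDepthMismatch("indent depth doesn't match!")
        yield from tokenize(line)
        yield NL
    while indent_depth_stack[-1] != 0:
        yield DEDENT
        indent_depth_stack.pop()


example = [
    "switch (n)",
    "    case(1): x <- 5",
    "    case(2): # here",
    "    case(3):",
    "        x <- 3",
    "    default:",
    "        x <- 9",
    "print (5)",
]
tokens = list(lexer(example))
```

In the resulting token stream, `case(2):` and `case(3):` arrive as back-to-back label/NL pairs with no INDENT between them, which is the pattern a case-sequence grammar rule would need to match.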