[Libre-soc-bugs] [Bug 1094] insndb instruction database visitor-walker is needed

bugzilla-daemon at libre-soc.org bugzilla-daemon at libre-soc.org
Sat Jun 10 10:04:46 BST 2023


--- Comment #85 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
(In reply to Dmitry Selyutin from comment #82)
> Long story short. I share your opinion almost on everything, but I had to at
> least probe this route to understand its pros and cons.
> The only exception is generic __call__. This actually comes from your
> isinstance hints (about another section?),

this is where the misunderstanding is: i was talking at the time
about ast.py iter_child_nodes use of isinstance:

def iter_child_nodes(node):
    for name, field in iter_fields(node):
        if isinstance(field, AST): yield field
        elif isinstance(field, list):
            for item in field:
                if isinstance(item, AST):  yield item

* you solved the first use there by doing contextmanager.
* you solved the second use by doing Node.subnodes (now Nodes.walk)
* the third instance could again use another contextmanager

whereas i was talking about the use in the *visitor*, which in
ast.py is this:

  def visit(self, node):
        method = 'visit_' + node.__class__.__name__
        visitor = getattr(self, method, self.generic_visit)
        return visitor(node)

which for us would be:

  def visit(self, node):
        method = 'visit_' + node.__class__.__name__
        visitor = getattr(self, method, self.__call__)
        return visitor(node)

> but I realized that relying on
> class name is too fragile.

it's not "fragile", it's the best aspect of the entire visitor pattern
when applied to python.

plus it breaks the "Principle of Least Surprise".  *everyone* who has ever
used Visitors in python expects to have function-names in the Visitor named
after the "data type of the database object being walked".

you are proposing to break that expectation in effect.

i don't mind self.__call__ being the default for 

> isinstance refers to an object (yes, types are
> objects too in Python); name just refers to some string. Whilst it looks
> simpler, it's indeed less intuitive than explicit checks. If somebody wants
> this functionality (per-class context managers), they can do it directly in
> concrete visitors:
> class WeirdVisitor(Visitor):
>     @contextlib.contextmanager
>     def __call__(self, node):
>         method = getattr(self, node.__class__.__name__, self.Node)
>         return method(node)

the *very* first thing that i will do on creating *any* visitors is
precisely and exactly that, copying that exactly and *only* using
that!  and in all documentation i will strongly recommend using
that and not the Visitor!

>     @contextlib.contextmanager
>     def Node(self, node):
>         raise 666
>     @contextlib.contextmanager
>     def Record(self, node):
>         raise 42
> TL;DR: I'm removing paths branch. However, there's still one issue with the
> approach you suggest.
>     def __call__(self, node):
>         print("/".join(self.stack_of_stuff), node)
>         self.stack_of_stuff.append(node.name)
>         yield node
>         self.stack_of_stuff.pop(-1) # i think
> I'f we're going to drop the Node-inherited classes (Path, String and similar
> stuff), there's no way we could call node.name (or, again, some other
> generic field, because "name" is occupied). `str` or `bool` don't have such
> interface.

then the *Visitor* should do "if isinstance(node, str) self.stack.append(node)"

again, like the "depth" parameter, the walker-function should not be
compromised by what the *Visitor* does.

> As for bool inheritance, that's not a problem: whilst bool isn't
> inheritable, int is.

you see how going down that path is a danger?  using bool is clear about
the intent.

additionally i would really like this code to be useable elsewhere in the
project, and even a separate git repo created for it (and put onto pypi).

in particular at some point i would like to replace the existing nmigen
visitor-pattern (which is the usual "faulty" design) with this one.

(In reply to Dmitry Selyutin from comment #84)
> Ah, and one note about a single __call__. Even though it's an unlikely
> scenario, two classes might share the same name (e.g. one is on global level
> and other is class-nested).

that would be completely confusing and seriously inadviseable from a project
maintainability perspective!

as an outside possibility that will immediately get picked up by the user
when they find that they can't call a Visitor on that nested-class, they'll
soon change the name.

with Visitors being such a huge integral part of a good AST-walking /
Language-translator / Compiler API, "users" (us at the moment but
other people in future if we do drop this onto pypi) *will* keep a clean
global namespace.

think about HTML/DOM: imagine the total chaos if HTML's DOM
(Domain Object Model) had an object-type that was locally different
from the global type namespace, somewhere deep inside the structure.

if DOM.Window was different from DOM.Document.Window for example.

or if c had a different meaning for "struct" just because it was
inside a "switch" statement?

part of a clean AST / language / compiler is verrrry much about picking
a clean global namespace - *definitely* don't do nested classes in

again (just on nested-classes): "Principle of Least Surprise".

(i aim to propose to use insndb for work inside the ISA Working Group,
for walking the specification document which is now in latex)

You are receiving this mail because:
You are on the CC list for the bug.

More information about the libre-soc-bugs mailing list