I've found several open source visual programming tools like Blockly and friends, and other projects hosted on GitHub, but couldn't find any that work directly with the abstract syntax tree.

Why is that?

I'm asking because once I discovered that every compiler out there has a phase in the compilation process where it parses the source code to an AST, it seemed obvious to me that visual programming tools could take advantage of this to let the programmer edit the AST directly in a visual way, and also to do the round trip from source to node-graph and then back again to source when needed.

For instance, one could think that there isn’t too much of a difference between the JavaScript AST Visualizer and an actual JavaScript visual programming tool.

So, what am I missing?

psmears
ery245gs

3 Answers

Many of these tools do work directly with the abstract syntax tree (or rather, a direct one-to-one visualisation of it). That includes Blockly, which you've seen, and the other block-based languages and editors like it (Scratch, Pencil Code/Droplet, Snap!, GP, Tiled Grace, and so on).

Those systems don't show a traditional vertex-and-edge graph representation, for reasons explained elsewhere (space, and also interaction difficulty), but they are directly representing a tree. One node, or block, is a child of another if it is directly, physically inside the parent.


I built one of these systems (Tiled Grace, paper, paper). I can assure you, it is very much working with the AST directly: what you see on the screen is an exact representation of the syntax tree, as nested DOM elements (so, a tree!).

Screenshot of nested Tiled Grace code

This is the AST of some code. The root is a method call node "for ... do". That node has some children, starting with "_ .. _", which itself has two children, a "1" node and a "10" node. What comes up on screen is exactly what the compiler backend spits out in the middle of the process - that's fundamentally how the system works.
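To make the "nesting is the tree" idea concrete, here is a minimal sketch in Python. It is not Tiled Grace's actual code: the `Node` class, the `to_nested_html` function, and the `block` CSS class are hypothetical names chosen for illustration, and only the nodes named above are modelled (the remaining children of the "for ... do" node are left out).

```python
from dataclasses import dataclass, field
from html import escape


@dataclass
class Node:
    """A syntax tree node: a label plus an ordered list of children."""
    label: str
    children: list["Node"] = field(default_factory=list)


def to_nested_html(node: Node) -> str:
    """Render a node as a <div> that physically contains its children."""
    inner = "".join(to_nested_html(child) for child in node.children)
    return f'<div class="block"><span>{escape(node.label)}</span>{inner}</div>'


# The part of the tree described in the text (hypothetical encoding).
tree = Node("for ... do", [
    Node("_ .. _", [Node("1"), Node("10")]),
])

html = to_nested_html(tree)
print(html)
```

Each parent/child edge of the tree becomes a containment relationship between two elements, so no separate edge drawing is needed.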

If you like, you can think of it as a standard tree layout with the edges pointing out of the screen towards you (and occluded by the block in front of them), but nesting is as valid a way of showing a tree as a vertex diagram.

It will also "do the round trip from source to node-graph and then back again to source when needed". In fact, you can see that happen when you click "Code View" at the bottom. If you modify the text, it'll be re-parsed and the resulting tree rendered for you to edit again, and if you modify the blocks, the same thing happens with the source.
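The same round trip can be sketched with Python's standard `ast` module (Python 3.9 or later for `ast.unparse`). This is only an analogy to what the block editors do, not their implementation; the sample program and the `TwoToThree` transformer are made up for illustration.

```python
import ast

source = "for i in range(1, 11):\n    print(i * 2)\n"

# Source -> tree.
tree = ast.parse(source)


class TwoToThree(ast.NodeTransformer):
    """Stand-in for a visual edit: change every integer literal 2 to 3."""

    def visit_Constant(self, node):
        if isinstance(node.value, int) and node.value == 2:
            return ast.copy_location(ast.Constant(value=3), node)
        return node


# Edit the tree directly, then tree -> source.
edited = ast.fix_missing_locations(TwoToThree().visit(tree))
regenerated = ast.unparse(edited)
print(regenerated)
```

Note that the regenerated text comes from the tree alone, so comments and original formatting are lost; an editor that wants a faithful round trip has to keep that information somewhere.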


Pencil Code does essentially the same thing with, at this point, a better interface. The blocks it uses are a graphical view of the CoffeeScript AST. So do the other block- or tile-based systems, by and large, although some of them don't make the nesting aspect quite as clear in the visual representation. Many don't have an actual textual language behind them, so the "syntax tree" can be a bit elusive, but the principle is there.


What you're missing, then, is that these systems really are working directly with the abstract syntax tree. What you see and manipulate is a space-efficient rendering of a tree, in many cases literally the AST a compiler or parser produces.

Michael Homer

At least two reasons:

  1. Because source code is a much more concise representation. Laying out an AST as a graph would take up a lot more visual real estate.

    Programmers prize having as much context as possible -- i.e., having as much code as possible visible on the screen at once. Context helps them better manage complexity. (That's one reason why many programmers use these crazy small fonts and enormous 30" screens.)

    If we tried to display the AST as a graph or tree, then the amount of code that you could fit on a single screen would be much less than when it is represented as source code. That's a huge loss for developer productivity.

  2. ASTs are intended for compiler programming, not for easy understanding by programmers. If you took an existing AST representation and displayed it visually, it would probably be harder for developers to understand, because ASTs weren't designed to be easy for developers to learn.

    In contrast, source code usually is designed to be readable/understandable by developers; that is normally a critical design criterion for source code, but not for ASTs. ASTs only need to be understood by the compiler writers, not by everyday developers.

    And, in any case, the AST language would be a second language developers have to learn, in addition to the source language. Not a win.
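As a rough illustration of the first point, one can compare a line of source with the tree a parser builds from it. The sketch below uses Python's standard `ast` module (Python 3.9 or later for the `indent` argument); the sample statement is arbitrary, and the exact counts depend on the language and parser.

```python
import ast

# One short line of source code (an arbitrary example).
source = "total = price * quantity + tax"

tree = ast.parse(source)

# Count every node in the tree, including operator and context nodes.
node_count = sum(1 for _ in ast.walk(tree))

# An indented dump is roughly what a naive tree display would have to show.
dump = ast.dump(tree, indent=2)
dump_lines = dump.count("\n") + 1

print(f"source: 1 line, {len(source)} characters")
print(f"AST: {node_count} nodes, {dump_lines} lines when dumped as a tree")
print(dump)
```

Even for a single assignment, the dumped tree runs to many lines, which hints at how much screen space a node-per-box layout would need.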

See also https://softwareengineering.stackexchange.com/q/119463/34181 for some additional potential reasons.

D.W.

The typical AST produced by a compiler is rather complex and verbose. A directed graph representation of it would quickly become quite hard to follow. But there are two large areas of CS where ASTs are used.

  1. Lisp languages are actually written as ASTs. The program source code is written as lists and directly used by the compiler and/or interpreter (depending upon which variant is being used).
  2. Modelling languages, e.g. UML, and many visual domain specific languages use graphical notations which are effectively abstract syntax graphs (ASGs) at a higher level of abstraction than the typical general purpose language AST.
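The Lisp point can be sketched in a few lines of Python. This is a toy, not a real Lisp: the `parse` and `evaluate` functions are hypothetical helpers that only handle integers and the operators `+`, `-` and `*`, but they show that the parenthesised text maps one-to-one onto a nested list, which is the tree the evaluator walks.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def parse(text):
    """Turn S-expression text into nested Python lists (the tree)."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()

    def read(pos):
        token = tokens[pos]
        if token == "(":
            items = []
            pos += 1
            while tokens[pos] != ")":
                item, pos = read(pos)
                items.append(item)
            return items, pos + 1  # skip the closing parenthesis
        try:
            return int(token), pos + 1
        except ValueError:
            return token, pos + 1  # an operator symbol

    tree, _ = read(0)
    return tree


def evaluate(expr):
    """Evaluate the tree directly; no separate AST is ever built."""
    if isinstance(expr, int):
        return expr
    op, *args = expr
    values = [evaluate(arg) for arg in args]
    result = values[0]
    for value in values[1:]:
        result = OPS[op](result, value)
    return result


program = parse("(+ 1 (* 2 3))")
print(program)
print(evaluate(program))
```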
CyberFonic