public Object visit(ASTCompositeReference node, Object data){
SimpleNode scope = this.getScope();
if(scope instanceof ASTProgram){
if(node.jjtGetNumChildren() == 3){
SimpleNode aNode = (SimpleNode) node.jjtGetChild(0);
if((aNode instanceof ASTIdentifier) && (((ASTIdentifier) aNode).getName().equals("CXClass"))){
aNode = (SimpleNode) node.jjtGetChild(1);
if(aNode instanceof ASTPropertyIdentifierReference){
aNode = (SimpleNode) aNode.jjtGetChild(0);
if((aNode instanceof ASTIdentifier)
&& ((((ASTIdentifier) aNode).getName().equals("declare")) ||
(((ASTIdentifier) aNode).getName().equals("declareInterface")))){
aNode = (SimpleNode) node.jjtGetChild(2);
if(aNode instanceof ASTFunctionCallParameters){
aNode = (SimpleNode) aNode.jjtGetChild(0);
if(aNode instanceof ASTLiteral){
String className = (String) ((ASTLiteral) aNode).getValue();
Comment comment = this.getComment((SimpleNode) node);
JavaClass javaClass = this.getJavaClass(className);
if(comment != null){
javaClass.setComment(comment);
}
javaClass.setOriginatingFile((JSFile) data);
}
}
}
}
}
}
}
Most of the code above deals with finding out if the node currently visited is interesting for data extraction and/or further manipulation. Let's explain exactly what it means when a node "is interesting". Nodes in an AST belong to one of the AST types generated by jjtree (the tree generating tool of javacc), they are inside a given scope, have or don't have children, have or don't have a parent (of a certain type), have a given value for an attribute etc. All these properties are differentiating one node from another. Visitors usually want to operate only on nodes with properties that fullfill some defined conditions. Example:
A visitor that wants to print out all the local variables of a program. It is therefore interested in AST nodes whose properties match the following conditions:
- node type is ASTVariableDeclaration
- node scope is of type ASTBlock (that's the scope for local stuff)
- number of children is 3
- first child is of type ASTIdentifier
- etc
The if-statements in the example above test for conditions like these. But all these nested conditions look awkward and are probably hard to modify and to maintain.
There is clearly a need to simplify the task of matching nodes according to defined conditions.
The idea behind tree patterns is defining the conditions in a declarative fashion in a simple language. These declarations get transformed by a parser into tree template data structures that are used by a general tree matcher to match nodes. The result is that the if-statements get replaced by an easier to read declaration of conditions and the tree matching code of the general tree matcher gets reused in all instances where locating nodes according to conditions is needed.
Same visitor that prints out local variables of a program:
// define template
ASTLiteral literal;
TreeTemplate treeTemplate = new TreeTemplate();
treeTemplate.node = new HashMap();
literal = new ASTLiteral();
literal.setValue("ASTVariableDeclaration");
treeTemplate.node.put("type", literal);
treeTemplate.node.put("scope", "global");
TreeTemplate childTemplate;
treeTemplate.children = new ArrayList();
childTemplate = new TreeTemplate();
childTemplate.node = new HashMap();
literal = new ASTLiteral();
literal.setValue("ASTIdentifier");
childTemplate.node.put("type", literal);
literal = new ASTLiteral();
literal.setValue("theIdentifier");
childTemplate.node.put("name", literal);
childTemplate.node.put("stop", Boolean.TRUE);
treeTemplate.children.add(childTemplate);
// etc.....
// match
TreeMatcher treeMatcher = new TreeMatcher();
HashMap matchedNodes = new HashMap();
if(treeMatcher.match(simpleNode, scope, treeTemplate, matchedNodes)){
//matched
ASTIdentifier theIdentifier = (ASTIdentifier) matchedNodes.get("theIdentifier");
}
Note how the "name" key was used to mark a child node and then retrieve it from the matchedNodes map. For templates with the multi key in them the matchedNodes map contains a list of nodes under the name specified in the template.
It's not hard to see that constructing the TreeTemplate data structures by hand can get tiresome very quickly. An easier method is to define the tree templates in a simple and concise language and let a parser construct the tree template data structure from a "program" in that language. What follows is an informal description of the tree pattern language. For a more detailed and exact definition look at the javacc grammar file (TreePattern.jj) describing the language.
The lexical tokens of the language include all the javascript literals:
- strings (supported are both quoting types "foo", 'foo', just
like in javascript)
- numbers
- true, false, null
- regular expression literals
White-space is the same as in javascript, but no comments are supported.
The reserved words are value, type, name, stop , scope, parent, next, previous, local , global and multi.
The separators are {, }, [, ], (, ), : and comma itself.
The operators are =, <, >, !=, /=,<=, >=, + and *.
The grammatical productions are
Pattern | ::= | ( ( <LPAREN> Node ( <COMMA> ( Pattern )? )* <RPAREN> ) | Node ) |
Node | ::= | <LBRACE> ( FieldList )? <RBRACE> |
FieldList | ::= | Field ( <COMMA> ( Field )? )* |
Field | ::= | FieldKey Comparator Literal |
| | <MULTI> MExpr | |
| | NodeRefKey <EQUAL> ( Pattern | <NULL> ) | |
| | <STOP> | |
| | <NAME> <COLON> Literal | |
| | <SCOPE> <EQUAL> ( <LOCAL> | <GLOBAL> ) | |
FieldKey | ::= | ( <TYPE> | <VALUE> ) |
NodeRefKey | ::= | ( <PREV> | <NEXT> | <PARENT> ) |
Comparator | ::= | ( <EQUAL> | <NEQUAL> | <REGEQUAL> | <GT> | <LT> | <LE> | <GE> ) |
MExpr | ::= | ( ( <GT> | <LT> | <GE> | <LE> ) <DECIMAL_LITERAL> | <EQUAL> MultiEqualFactor ) |
MultiEqualFactor | ::= | <DECIMAL_LITERAL> |
| | <PLUS> | |
| | <STAR> | |
| | <LBRACKET> <DECIMAL_LITERAL> <COMMA> <DECIMAL_LITERAL> <RBRACKET> | |
Literal | ::= | ( <DECIMAL_LITERAL> | <HEX_LITERAL> | <FLOATING_POINT_LITERAL> | <STRING_LITERAL> | <TRUE> | <FALSE> | <NULL> | <REGEX_LITERAL> | <UNTERMINATED_STRING_LITERAL> ) |
Let's see some examples:
First, the local variable example again:
( { type = 'ASTVariableDeclaration' ,
scope = local }, { type = 'ASTIdentifier', name : 'theIdentifier', stop},
{multi = *, stop} )
The last child template basically tells the matcher "match up any remaining - zero or more - children, we're not interested in them".
Here's a more complex example: we want function declaration nodes that have between 2 and 5 formal arguments and that represent functions named "foo":
( {type='ASTFunctionDeclaration'}, {type='ASTIdentifier', value='foo'}, ( {type= 'ASTFormalArgumentList '}, {multi=[2,5], stop} ), {type='ASTBlock' , stop})