How to traverse AST tree - groovy

I'm trying to create an static analysis for Groovy. As a POC for my superiors I'm just trying to parse simple code and detect SQL injections, which are the easiest kind to spot. I did it successfully on Python, which is my main language, but my company mostly uses Grails (on Groovy).
This is what I have so far:
import org.codehaus.groovy.ast.expr.*;
import org.codehaus.groovy.ast.stmt.*;
import org.codehaus.groovy.ast.*
import org.codehaus.groovy.control.CompilePhase
import org.codehaus.groovy.ast.CodeVisitorSupport
import org.codehaus.groovy.ast.builder.AstBuilder
public class SecurityCheck extends CodeVisitorSupport {
void visitBlockStatement(BlockStatement statement) {
println "NEW BLOCK STATEMENT:"
println statement.getText();
//keep walking...
statement.getStatements().each { ASTNode child ->
println "CHILD FOUND: "
println child.getText();
child.visit(this)
}
}
}
def code = new File('groovy_source.groovy').text // get the code from the source file
def AstBuilder astBuilder = new AstBuilder() // build an instance of the ast builder
def ast = astBuilder.buildFromString(CompilePhase.CONVERSION, code) // build from string when the compiler converts from tokens to AST
def SecurityCheck securityCheck = new SecurityCheck() // create an instance of our security check class
println ast
println ast[0]
ast[0].visit(securityCheck)
The groovy_source.groovy file is very simple, containing only a minimal file with a super easy to spot vulnerability:
def post(id) {
query = "SELECT * FROM table WHERE id = " + id;
result = sql.execute query
return result;
}
It is my understanding that, as I'm inheriting from CodeVisitorSupport, this would just visit a BlockStatement and then, for each statement inside that statement, it would visit it using the method from the supper class.
Nevertheless, when I print the text from the BlockStatement, it is an empty string, and the for each method never even gets called (which I assume must mean the AST found no children for my block statement, even when the function definitively has statements inside it.
[org.codehaus.groovy.ast.stmt.BlockStatement#363a52f[]] // println ast
org.codehaus.groovy.ast.stmt.BlockStatement#363a52f[] // println ast[0]
NEW BLOCK STATEMENT:
{ } // println statement.getText()
Any help here would be tremendously appreciated. Thanks!

I found the answer. I wasn't so hard in the end, but the horrible documentation doesn't make it easy. If you one to traverse the tree, you need to give the constructor the false boolean as a second argument, like this:
def ast = astBuilder.buildFromString(CompilePhase.CONVERSION, false, code)
Then you can use the visit* methods as you expect.

Related

Is there a way to make a class castable to any object?

I am having issues coming up with a good way to cast a class into any other class.
I am using groovy and I am trying to replace a third party library with a series of no-ops in order to provide it with dry run functionality.
Lets say I have a script like this:
class NoOp {
List tree = []
NoOp(List tree=[]){
this.tree = tree
}
def invokeMethod(String name, Object args) {
return new NoOp(tree+name)
}
}
class BigList {
List getNewBigList(){
return ['hello'] // I normally return a lot of stuff
}
}
BigList.metaClass['getNewBigList']={->return new NoOp()}
List list = new BigList().getNewBigList()
// Each function should have a new instance of the NoOp class, causing the next to do nothing
list.iterator().collect{String x->
return x // Marshall Data
}.each{
// Do more stuff here
}
String a = list[0] // Should return a NoOp here
// Remove metaClass from new instances
GroovySystem.metaClassRegistry.removeMetaClass(BigList.class)
Theoretically this works fine (and does until groovy tries to cast a result), invokeMethod will return a new NoOp class to the function returns and causing any subsequent uses to also do nothing.
The problem comes down to casting. As soon List l = new NoOp() happens groovy will throw an error.
Is there a good way to allow NoOp to be cast to any type?
Note: This is just a dummy script, there could be any number of classes and packages that would be overwritten by the NoOp metaclass so doing it manually would not be feasible.

Groovy - Type check AST generated code

I have a Groovy application that can be custimized by a small Groovy DSL I wrote. On startup, the application loads several Groovy scripts, applies some AST transformations and finally executes whatever was specified in the scripts.
One of the AST transformations inserts a couple of lines of code into certain methods. That works fine and I can see the different behavior during runtime. However, sometimes the generated code is not correct. Although I load the scripts with the TypeChecked customizer in place, my generated code is never checked for soundness.
To show my problem, I constructed an extreme example. I have the following script:
int test = 10
println test // prints 10 when executed without AST
I load this script and insert a new line of code between the declaration of test and println:
public void visitBlockStatement(BlockStatement block) {
def assignment = (new AstBuilder().buildFromSpec {
binary {
variable "test"
token "="
constant 15
}
}).first()
def newStmt = new ExpressionStatement(assignment)
newStmt.setSourcePosition(block.statements[1])
block.statements.add(2, newStmt)
super.visitBlockStatement(block)
}
After applying this AST, the script prints 15. When I use AstNodeToScriptVisitor to print the Groovy code of the resulting script, I can see the new assignment added to the code.
However, if I change the value of the assignment to a String value:
// ...
def assignment = (new AstBuilder().buildFromSpec {
binary {
variable "test"
token "="
constant "some value"
}
}).first()
// ...
I get a GroovyCastExcpetion at runtime. Although the resulting script looks like this:
int test = 10
test = "some value" // no compile error but a GroovyCastException at runtime here. WHY?
println test
no error is raised by TypeChecked. I read in this mailing list, that you need to set the source position for generated code to be checked, but I'm doing that an it still doesn't work. Can anyone provide some feedback of what I am doing wrong? Thank you very much!
Update
I call the AST by attaching it to the GroovyShell like this:
def config = new CompilerConfiguration()
config.addCompilationCustomizers(
new ASTTransformationCustomizer(TypeChecked)
)
config.addCompilationCustomizers(
new ASTTransformationCustomizer(new AddAssignmentAST())
)
def shell = new GroovyShell(config)
shell.evaluate(new File("./path/to/file.groovy"))
The class for the AST itself looks like this:
#GroovyASTTransformation(phase = CompilePhase.CANONICALIZATION)
class AddAssignmentAST implements ASTTransformation {
#Override
public void visit(ASTNode[] nodes, SourceUnit source) {
def transformer = new AddAssignmentTransformer()
source.getAST().getStatementBlock().visit(transformer)
}
private class AddAssignmentTransformer extends CodeVisitorSupport {
#Override
public void visitBlockStatement(BlockStatement block) {
// already posted above
}
}
}
Since my Groovy script only consists of one block (for this small example) the visitBlockStatement method is called exactly once, adds the assignment (which I can verify since the output changes) but does not ever throw a compile-time error.

Groovy MarkupBuilder name conflict

I have this code:
String buildCatalog(Catalog catalog) {
def writer = new StringWriter()
def xml = new MarkupBuilder(writer)
xml.catalog(xmlns:'http://www.sybrium.com/XMLSchema/NodeCatalog') {
'identity'() {
groupId(catalog.groupId)
artifactId(catalog.artifactId)
version(catalog.version)
}
}
return writer.toString();
}
It produces this xml:
<catalog xmlns='http://www.sybrium.com/XMLSchema/NodeCatalog'>
<groupId>sample.group</groupId>
<artifactId>sample-artifact</artifactId>
<version>1.0.0</version>
</catalog>
Notice that the "identity" tag is missing... I've tried everything in the world to get that node to appear. I'm ripping my hair out!
Thanks in advance.
There might be a better way, but one trick is to call invokeMethod directly:
String buildCatalog(Catalog catalog) {
def writer = new StringWriter()
def xml = new MarkupBuilder(writer)
xml.catalog(xmlns:'http://www.sybrium.com/XMLSchema/NodeCatalog') {
delegate.invokeMethod('identity', [{
groupId(catalog.groupId)
artifactId(catalog.artifactId)
version(catalog.version)
}])
}
return writer.toString();
}
This is effectively what Groovy is doing behind the scenes. I couldn't get delegate.identity or owner.identity to work, which are the usual tricks.
Edit: I figured out what's going on.
Groovy adds a method with a signature of identity(Closure c) to every object.
This means that when you tried to dynamically invoke the identity element on the XML builder, while passing in a single closure argument, it was calling the identity() method, which is like calling delegate({...}) on the outer closure.
Using the invokeMethod trick forces Groovy to bypass the Meta Object Protocol and treat the method as a dynamic method, even though the identity method already exists on the MetaObject.
Knowing this, we can put together a better, more legible solution. All we have to do is change the signature of the method, like so:
String buildCatalog(Catalog catalog) {
def writer = new StringWriter()
def xml = new MarkupBuilder(writer)
xml.catalog(xmlns:'http://www.sybrium.com/XMLSchema/NodeCatalog') {
// NOTE: LEAVE the empty map here to prevent calling the identity method!
identity([:]) {
groupId(catalog.groupId)
artifactId(catalog.artifactId)
version(catalog.version)
}
}
return writer.toString();
}
This is much more readable, it's clearer the intent, and the comment should (hopefully) prevent anyone from removing the "unnecessary" empty map.

how to retrieve nested properties in groovy

I'm wondering what is the best way to retrieve nested properties in Groovy, taking a given Object and arbitrary "property" String. I would like to something like this:
someGroovyObject.getProperty("property1.property2")
I've had a hard time finding an example of others wanting to do this, so maybe I'm not understanding some basic Groovy concept. It seems like there must be some elegant way to do this.
As reference, there is a feature in Wicket that is exactly what I'm looking for, called the PropertyResolver:
http://wicket.apache.org/apidocs/1.4/org/apache/wicket/util/lang/PropertyResolver.html
Any hints would be appreciated!
I don't know if Groovy has a built-in way to do this, but here are 2 solutions. Run this code in the Groovy Console to test it.
def getProperty(object, String property) {
property.tokenize('.').inject object, {obj, prop ->
obj[prop]
}
}
// Define some classes to use in the test
class Name {
String first
String second
}
class Person {
Name name
}
// Create an object to use in the test
Person person = new Person(name: new Name(first: 'Joe', second: 'Bloggs'))
// Run the test
assert 'Joe' == getProperty(person, 'name.first')
/////////////////////////////////////////
// Alternative Implementation
/////////////////////////////////////////
def evalProperty(object, String property) {
Eval.x(object, 'x.' + property)
}
// Test the alternative implementation
assert 'Bloggs' == evalProperty(person, 'name.second')
Groovy Beans let you access fields directly. You do not have to define getter/setter methods. They get generated for you. Whenever you access a bean property the getter/setter method is called internally. You can bypass this behavior by using the .# operator. See the following example:
class Person {
String name
Address address
List<Account> accounts = []
}
class Address {
String street
Integer zip
}
class Account {
String bankName
Long balance
}
def person = new Person(name: 'Richardson Heights', address: new Address(street: 'Baker Street', zip: 22222))
person.accounts << new Account(bankName: 'BOA', balance: 450)
person.accounts << new Account(bankName: 'CitiBank', balance: 300)
If you are not dealing with collections you can simply just call the field you want to access.
assert 'Richardson Heights' == person.name
assert 'Baker Street' == person.address.street
assert 22222 == person.address.zip
If you want to access a field within a collection you have to select the element:
assert 'BOA' == person.accounts[0].bankName
assert 300 == person.accounts[1].balance​​​​​​​​​
You can also use propertyMissing. This is what you might call Groovy's built-in method.
Declare this in your class:
def propertyMissing(String name) {
if (name.contains(".")) {
def (String propertyname, String subproperty) = name.tokenize(".")
if (this.hasProperty(propertyname) && this."$propertyname".hasProperty(subproperty)) {
return this."$propertyname"."$subproperty"
}
}
}
Then refer to your properties as desired:
def properties = "property1.property2"
assert someGroovyObject."$properties" == someValue
This is automatically recursive, and you don't have to explicitly call a method. This is only a getter, but you can define a second version with parameters to make a setter as well.
The downside is that, as far as I can tell, you can only define one version of propertyMissing, so you have to decide if dynamic path navigation is what you want to use it for.
See
https://stackoverflow.com/a/15632027/2015517
It uses ${} syntax that can be used as part of GString

External Content with Groovy BuilderSupport

I've built a custom builder in Groovy by extending BuilderSupport. It works well when configured like nearly every builder code sample out there:
def builder = new MyBuilder()
builder.foo {
"Some Entry" (property1:value1, property2: value2)
}
This, of course, works perfectly. The problem is that I don't want the information I'm building to be in the code. I want to have this information in a file somewhere that is read in and built into objects by the builder. I cannot figure out how to do this.
I can't even make this work by moving the simple entry around in the code.
This works:
def textClosure = { "Some Entry" (property1:value1, property2: value2) }
builder.foo(textClosure)
because textClosure is a closure.
If I do this:
def text = '"Some Entry" (property1:value1, property2: value2)'
def textClosure = { text }
builder.foo(textClosure)
the builder only gets called for the "foo" node. I've tried many variants of this, including passing the text block directly into the builder without wrapping it in a closure. They all yield the same result.
Is there some way I take a piece of arbitrary text and pass it into my builder so that it will be able to correctly parse and build it?
Your problem is that a String is not Groovy code. The way ConfigSlurper handles this is to compile the text into an instance of Script using GroovyClassLoader#parseClass. e.g.,
// create a Binding subclass that delegates to the builder
class MyBinding extends Binding {
def builder
Object getVariable(String name) {
return { Object... args -> builder.invokeMethod(name,args) }
}
}
// parse the script and run it against the builder
new File("foo.groovy").withInputStream { input ->
Script s = new GroovyClassLoader().parseClass(input).newInstance()
s.binding = new MyBinding(builder:builder)
s.run()
}
The subclass of Binding simply returns a closure for all variables that delegates the call to the builder. So assuming foo.groovy contains:
foo {
"Some Entry" (property1:value1, property2: value2)
}
It would be equivalent to your code above.
I think the problem you described is better solved with a slurper or parser.
See:
http://groovy.codehaus.org/Reading+XML+using+Groovy%27s+XmlSlurper
http://groovy.codehaus.org/Reading+XML+using+Groovy%27s+XmlParser
for XML based examples.
In your case. Given the XML file:
<foo>
<entry name='Some Entry' property1="value1" property2="value2"/>
</foo>
You could slurp it with:
def text = new File("test.xml").text
def foo = new XmlSlurper().parseText(text)
def allEntries = foo.entry
allEntries.each {
println it.#name
println it.#property1
println it.#property2
}
Originally, I wanted to be able to specify
"Some Entry" (property1:value1, property2: value2)
in an external file. I'm specifically trying to avoid XML and XML-like syntax to make these files easier for regular users to create and modify. My current solution uses ConfigSlurper and the file now looks like:
"Some Entry"
{
property1 = value1
property2 = value2
}
ConfigSlurper gives me a map like this:
["Some Entry":[property1:value1,property2:value2]]
It's pretty simple to use these values to create my objects, especially since I can just pass the property/value map into the constructor.

Resources