I ran into a rather odd closure issue related to Spock unit testing and wondered if anyone could explain it.
If we imagine a dao, model, and service as follows:
interface CustomDao {
List<Integer> getIds();
Model getModelById(int id);
}
class Model {
int id;
}
class CustomService {
CustomDao customDao
public List<Object> createOutputSet() {
List<Model> models = new ArrayList<Model>();
List<Integer> ids = customDao.getIds();
for (Integer id in ids) {
models.add(customDao.getModelById(id));
}
return models;
}
}
I would like to unit test the CustomService.createOutputSet. I have created the following specification:
class TestSpec extends Specification {
def 'crazy closures'() {
def mockDao = Mock(CustomDao)
def idSet = [9,10]
given: 'An initialized object'
def customService = new CustomService()
customService.customDao = mockDao
when: 'createOutput is called'
def outputSet = customService.createOutputSet()
then: 'the following methods should be called'
1*mockDao.getIds() >> {
return idSet
}
for (int i=0; i<idSet.size(); i++) {
int id = idSet.get(i)
1*mockDao.getModelById(idSet.get(i)) >> {
def tmp = new Model()
tmp.id = id // instead of idSet.get(i)
return tmp
}
}
and: 'each compute package is accurate'
2 == outputSet.size()
9 == outputSet.get(0).getId()
10 == outputSet.get(1).getId()
}
}
Notice that I test two things here. First, I initialize the service with my mock DAO and verify that the DAO methods are called correctly and return the proper data; then I verify that I get the proper output (the "and:" block).
The tricky part is the for loop, in which I wanted the mock DAO to return models that depend on the method parameter. In the above example, if I use a simple for (__ in idSet), all models come back with id 10: outputSet.get(0).getId() == outputSet.get(1).getId() == 10. If I use the traditional for loop and set the model id with idSet.get(i), I get an IndexOutOfBoundsException. The only way to make this work is to copy the value into a local variable (id) and set the model id from that variable, as above.
I know this is related to Groovy closures, and I suspect that Spock captures the mock calls into a set of closures before executing them, which means that the model creation depends on the outer state of the closure. I understand why I would get the IndexOutOfBoundsException, but I don't understand why int id = idSet.get(i) is captured by the closure whereas i is not.
What is the difference?
Note: this is not the live code but rather simplified to demonstrate the crux of my challenge. I would not and do not make two subsequent dao calls on getIds() and getModelById().
When stubbing getModelById with a closure, the closure's parameters have to match the method's parameters. If you declare the parameter as below, you no longer need the local variable id inside the for loop.
for (int i=0; i<idSet.size(); i++) {
//int id = idSet.get(i)
mockDao.getModelById(idSet.get(i)) >> {int id ->
def tmp = new Model()
tmp.id = id // id is closure param which represents idSet.get(i)
return tmp
}
}
A simplified version would use each:
idSet.each {
mockDao.getModelById(it) >> {int id ->
def tmp = new Model()
tmp.id = id // id is the closure param, i.e. the argument passed to getModelById
tmp
}
}
Do we need to worry about how many times method is called if it is being stubbed?
Accessing mutable local variables from a closure whose execution is deferred is a common source of errors not specific to Spock.
I don't understand why int id = idSet.get(i) is captured by the closure whereas i is not.
The former gives rise to a separate hoisted variable per iteration whose value is constant. The latter gives rise to a single hoisted variable whose value changes over time (and before the result generator executes).
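The difference can be reproduced without Spock. The following is a minimal sketch in plain Groovy; the names ids, viaIndex and viaLocal are made up for illustration. Each closure is created inside the loop but only called after the loop has finished, which is roughly the situation of a deferred result generator.

```groovy
def ids = [9, 10]
def viaIndex = []   // closures that read the loop counter i
def viaLocal = []   // closures that read a per-iteration local

for (int i = 0; i < ids.size(); i++) {
    int id = ids.get(i)        // a fresh variable on every iteration
    viaIndex << { -> i }       // all closures share the single variable i
    viaLocal << { -> id }      // each closure sees its own id
}

// The loop is done, so i now holds ids.size()
println viaIndex*.call()
println viaLocal*.call()
```

By the time the closures run, i has already reached ids.size(), which is why a deferred idSet.get(i) fails with an IndexOutOfBoundsException, while the per-iteration local keeps the value it was given.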
Instead of solving the problem by introducing a temporary variable, a better solution (already given by @dmahapatro) is to declare an int id -> closure parameter. If it's deemed good enough to stub the calls without enforcing them, the loop can be omitted altogether. Yet another potential solution is to construct the return values eagerly:
idSet.each { id ->
def model = new Model()
model.id = id
1 * mockDao.getModelById(id) >> model
}
For example, consider the following C# code:
interface IBase { void f(int i); }
interface IDerived : IBase { /* inherits f from IBase */ }
...
void SomeFunction()
{
IDerived o = ...;
o.f(5);
}
I know how to get a MethodDefinition object corresponding to SomeFunction.
I can then loop through MethodDefinition.Instructions:
var methodDef = GetMethodDefinitionOfSomeFunction();
foreach (var instruction in methodDef.Body.Instructions)
{
switch (instruction.Operand)
{
case MethodReference mr:
...
break;
}
yield return memberRef;
}
And this way I can find out that the method SomeFunction calls the function IBase.f
Now I would like to know the declared type of the object on which the function f is called, i.e. the declared type of o.
Inspecting mr.DeclaringType does not help, because it returns IBase.
This is what I have so far:
TypeReference typeRef = null;
if (instruction.OpCode == OpCodes.Callvirt)
{
// Identify the type of the object on which the call is being made.
var objInstruction = instruction;
if (instruction.Previous.OpCode == OpCodes.Tail)
{
objInstruction = instruction.Previous;
}
for (int i = mr.Parameters.Count; i >= 0; --i)
{
objInstruction = objInstruction.Previous;
}
if (objInstruction.OpCode == OpCodes.Ldloc_0 ||
objInstruction.OpCode == OpCodes.Ldloc_1 ||
objInstruction.OpCode == OpCodes.Ldloc_2 ||
objInstruction.OpCode == OpCodes.Ldloc_3)
{
var localIndex = objInstruction.OpCode.Op2 - OpCodes.Ldloc_0.Op2;
typeRef = locals[localIndex].VariableType;
}
else
{
switch (objInstruction.Operand)
{
case FieldDefinition fd:
typeRef = fd.FieldType; // the type of the value loaded from the field, not the type that declares it
break;
case VariableDefinition vd:
typeRef = vd.VariableType;
break;
}
}
}
where locals is methodDef.Body.Variables
But this is, of course, not enough, because the arguments to a function can be calls to other functions, like in f(g("hello")). It looks like the case above where I inspect previous instructions must repeat the actions of the virtual machine when it actually executes the code. I do not execute it, of course, but I need to recognize function calls and replace them and their arguments with their respective returns (even if placeholders). It looks like a major pain.
Is there a simpler way? Maybe there is something built-in already?
I am not aware of an easy way to achieve this.
The "easiest" way I can think of is to walk the stack and find where the reference used as the target of the call is pushed.
Basically, starting from the call instruction go back one instruction at a time taking into account how each one affects the stack; this way you can find the exact instruction that pushes the reference used as the target of the call (a long time ago I wrote something like that; you can use the code at https://github.com/lytico/db4o/blob/master/db4o.net/Db4oTool/Db4oTool/Core/StackAnalyzer.cs as inspiration).
You'll need also to consider scenarios in which the pushed reference is produced through a method/property; for example, SomeFunction().f(5). In this case you may need to evaluate that method to find out the actual type returned.
Keep in mind that you'll need to handle a lot of different cases; for example, imagine the code below:
class Utils
{
public static T Instantiate<T>() where T : new() => new T();
}
class SomeType
{
public void F(int i) {}
}
class Usage
{
static void Main()
{
var o = Utils.Instantiate<SomeType>();
o.F(1);
}
}
While walking the stack you'll find that o is the target of the method call; then you'll evaluate the Instantiate<T>() method and find that it returns new T(). Knowing that T is SomeType in this case, that is the type you're looking for.
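The backward walk described above can be illustrated with a toy model. This sketch is written in Groovy and does not use Mono.Cecil: the instruction maps, the pops/pushes counts and the findTargetProducer helper are all hypothetical, and it assumes straight-line code with no branches. It models o.f(g("hello")) and looks for the instruction that pushed o.

```groovy
// Toy instruction model: each entry records how many stack values it pops and pushes.
// These maps are illustrative only; they are not Mono.Cecil types.
def body = [
    [op: 'ldloc.0',          pops: 0, pushes: 1],  // o
    [op: 'ldstr hello',      pops: 0, pushes: 1],  // "hello"
    [op: 'call g',           pops: 1, pushes: 1],  // g("hello")
    [op: 'callvirt IBase.f', pops: 2, pushes: 0],  // o.f(...)
]

// Walk backwards from the call and return the index of the instruction
// that pushed the call target. Assumes no branches between producer and call.
def findTargetProducer = { List instructions, int callIndex, int argCount ->
    int above = argCount                 // values sitting on top of the target
    for (int j = callIndex - 1; j >= 0; j--) {
        def ins = instructions[j]
        if (ins.pushes > above) return j // this instruction produced the target
        above = above - ins.pushes + ins.pops
    }
    return -1                            // not found in this method body
}

def producer = findTargetProducer(body, 3, 1)
println body[producer].op
```

A real implementation would take the pop and push counts from each opcode's stack behaviour and would still have to deal with branches, dup, and calls whose return type must be resolved, as the answer notes.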
So the answer of Vagaus helped me come up with a working implementation.
I published it on github - https://github.com/MarkKharitonov/MonoCecilExtensions
Included many unit tests, but I am sure I missed some cases.
I am doing some integration tests with Spock against 3rd party apps. Now I am struggling with a problem, and I am not sure whether I am approaching it properly or not.
In one of the tests I am connecting to a 3rd party service to get some information in an array. Then each of these items are passed to another method to process them individually.
def get3rdPartyItems = {
[item1, item2, item3]
}
def processItem = { item ->
//do something with item
}
get3rdPartyItems().each {
processItem(it)
}
Then I have a test that connects to the real 3rd party service using the method get3rdPartyItems(), in which I verify that processItem is called as many times as the number of items returned by get3rdPartyItems().
What I am trying to do is save one of the items in a @Shared variable so that I can write another test checking that the item is processed properly. I don't want to mock the content retrieved from the 3rd party service, because I want real data.
Basically, this is what I am doing:
@Shared def globalItem
MyClass.metaClass.processItem = { i ->
if (!globalItem)
globalItem = i
//And now I would need to call the original method processItem
}
Any clue how to achieve this? I am probably overcomplicating this, so I am open to changing the solution.
Not sure if this is what you want, as it's hard to see your existing structure from the code and the code isn't runnable as-is, but given this class:
class MyClass {
def get3rdPartyItems = {
['item1', 'item2', 'item3']
}
def processItem( item ) {
println item
//do something with item
}
def run() {
get3rdPartyItems().each {
processItem( it )
}
}
}
You can do this:
def globalItem
def oldProcessItem = MyClass.metaClass.getMetaMethod("processItem", Object)
MyClass.metaClass.processItem = { item ->
if (!globalItem) {
println "Setting global item to $item"
globalItem = item
}
oldProcessItem.invoke( delegate, item )
}
def mc = new MyClass()
mc.run()
For completeness, this is how to pass the arguments to the metamethod when the method takes multiple parameters:
def globalItem
def oldProcessItem = MyClass.metaClass.getMetaMethod("processItem", ["",[:]] as Object[])
MyClass.metaClass.processItem = { String p1, Map p2 ->
if (!globalItem) {
println "Setting global item to $p2"
globalItem = p2
}
oldProcessItem.invoke( delegate, [p1,p2] as Object[] )
}
def mc = new MyClass()
new MyClass().run()
My goal is to parse a large XML file and persist objects to DB based on the XML data, and to do it quickly. The operation needs to be transactional so I can rollback in case there is a problem parsing the XML or an object that gets created cannot be validated.
I am using the Grails Executor plugin to thread the operation. The problem is that each thread I create within the service has its own transaction and session. If I create 4 threads and 1 fails the session for the 3 that didn't fail may have already flushed, or they may flush in the future.
I was thinking if I could tell each thread to use the "current" Hibernate session that would probably fix my problem. Another thought I had was that I could prevent all sessions from flushing until it was known all completed without errors. Unfortunately I don't know how to do either of these things.
There is an additional catch too. There are many of these XML files to parse, and many that will be created in the future. Many of these XML files contain data that when parsed would create an object identical to one that was already created when a previous XML file was parsed. In such a case I need to make a reference to the existing object. I have added a transient isUnique variable to each class to address this. Using the Grails unique constraint does not work because it does not take hasMany relationships into account as I have outlined in my question here.
The following example is very simple compared to the real thing. The XML files I'm parsing have deeply nested elements with many attributes.
Imagine the following domain classes:
class Foo {
String ver
Set<Bar> bars
Set<Baz> bazs
static hasMany = [bars: Bar, bazs: Baz]
boolean getIsUnique() {
Util.isUnique(this)
}
static transients = [
'isUnique'
]
static constraints = {
ver(nullable: false)
isUnique(
validator: { val, obj ->
obj.isUnique
}
)
}
}
class Bar {
String name
boolean getIsUnique() {
Util.isUnique(this)
}
static transients = [
'isUnique'
]
static constraints = {
isUnique(
validator: { val, obj ->
obj.isUnique
}
)
}
}
class Baz {
String name
boolean getIsUnique() {
Util.isUnique(this)
}
static transients = [
'isUnique'
]
static constraints = {
isUnique(
validator: { val, obj ->
obj.isUnique
}
)
}
}
And here is my Util.groovy class located in my src/groovy folder. This class contains the methods I use to determine if an instance of a domain class is unique and/or retrieve the already existing equal instance:
import org.hibernate.Hibernate
class Util {
/**
* Gets the first instance of the domain class of the object provided that
* is equal to the object provided.
*
* @param obj
* @return the first instance of obj's domain class that is equal to obj
*/
static def getFirstDuplicate(def obj) {
def objClass = Hibernate.getClass(obj)
objClass.getAll().find{it == obj}
}
/**
* Determines if an object is unique in its domain class
*
* @param obj
* @return true if obj is unique, otherwise false
*/
static def isUnique(def obj) {
getFirstDuplicate(obj) == null
}
/**
* Validates all of an object's constraints except those contained in the
* provided blacklist, then saves the object if it is valid.
*
* @param obj
* @return the validated object, saved if valid
*/
static def validateWithBlacklistAndSave(def obj, def blacklist = null) {
def propertiesToValidate = obj.domainClass.constraints.keySet().collectMany{!blacklist?.contains(it)? [it] : []}
if(obj.validate(propertiesToValidate)) {
obj.save(validate: false)
}
obj
}
}
And imagine XML file "A" is similar to this:
<foo ver="1.0">
<!-- Start bar section -->
<bar name="bar_1"/>
<bar name="bar_2"/>
<bar name="bar_3"/>
...
<bar name="bar_5000"/>
<!-- Start baz section -->
<baz name="baz_1"/>
<baz name="baz_2"/>
<baz name="baz_3"/>
...
<baz name="baz_100000"/>
</foo>
And imagine XML file "B" is similar to this (identical to XML file "A" except one new bar added and one new baz added). When XML file "B" is parsed after XML file "A" three new objects should be created 1.) A Bar with name = bar_5001 2.) A Baz with name = baz_100001, 3.) A Foo with ver = 2.0 and a list of bars and bazs equal to what is shown, reusing instances of Bar and Baz that already exist from the import of XML file A:
<foo ver="2.0">
<!-- Start bar section -->
<bar name="bar_1"/>
<bar name="bar_2"/>
<bar name="bar_3"/>
...
<bar name="bar_5000"/>
<bar name="bar_5001"/>
<!-- Start baz section -->
<baz name="baz_1"/>
<baz name="baz_2"/>
<baz name="baz_3"/>
...
<baz name="baz_100000"/>
<baz name="baz_100001"/>
</foo>
And a service similar to this:
class BigXmlFileUploadService {
// Pass in a 20MB XML file
def upload(def xml) {
String rslt = null
def xsd = Util.getDefsXsd()
if(Util.validateXmlWithXsd(xml, xsd)) { // Validate the structure of the XML file
def fooXml = new XmlParser().parseText(xml.getText()) // Parse the XML
def bars = callAsync { // Make a thread for creating the Bar objects
def bars = []
for(barXml in fooXml.bar) { // Loop through each bar XML element inside the foo XML element
def bar = new Bar( // Create a new Bar object
name: barXml.attribute("name")
)
bar = retrieveExistingOrSave(bar) // If an instance of Bar that is equal to this one already exists then use it
bars.add(bar) // Add the new Bar object to the list of Bars
}
bars // Return the list of Bars
}
def bazs = callAsync { // Make a thread for creating the Baz objects
def bazs = []
for(bazXml in fooXml.baz) { // Loop through each baz XML element inside the foo XML element
def baz = new Baz( // Create a new Baz object
name: bazXml.attribute("name")
)
baz = retrieveExistingOrSave(baz) // If an instance of Baz that is equal to this one already exists then use it
bazs.add(baz) // Add the new Baz object to the list of Bazs
}
bazs // Return the list of Bazs
}
bars = bars.get() // Wait for thread then call Future.get() to get list of Bars
bazs = bazs.get() // Wait for thread then call Future.get() to get list of Bazs
def foo = new Foo( // Create a new Foo object with the list of Bars and Bazs
ver: fooXml.attribute("ver"),
bars: bars,
bazs: bazs
).save()
rslt = "Successfully uploaded ${xml.getName()}!"
} else {
rslt = "File failed XSD validation!"
}
rslt
}
private def retrieveExistingOrSave(def obj) {
def dup = Util.getFirstDuplicate(obj)
obj = dup ?: Util.validateWithBlacklistAndSave(obj, ["isUnique"])
if(obj.errors.allErrors) {
log.error "${obj} has errors ${obj.errors}"
throw new RuntimeException() // Force transaction to rollback
}
obj
}
}
So the question is: how do I get everything that happens inside my service's upload method to act as if it happened in a single session, so that EVERYTHING can be rolled back if any one part fails?
You might not be able to do what you're trying to do.
First, a Hibernate session is not thread-safe:
A Session is an inexpensive, non-threadsafe object that should be used once and then discarded for: a single request, a conversation or a single unit of work. ...
Second, I don't think executing SQL queries in parallel will provide much benefit. I looked at how PostgreSQL's JDBC driver works and all the methods that actually run the queries are synchronized.
The slowest part of what you're doing is likely the XML processing so I'd recommend parallelizing that and doing persistence on a single thread. You could create several workers to read from the XML and add the objects to some sort of queue. Then have another worker that owns the Session and saves the objects as they're parsed.
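The worker-and-queue arrangement suggested above could look roughly like this. It is a minimal sketch using only java.util.concurrent, not Grails or Hibernate: the chunks list stands in for XML fragments, toUpperCase() stands in for parsing, and appending to saved stands in for saving through the one session. All of these names are assumptions made for illustration.

```groovy
import java.util.concurrent.Callable
import java.util.concurrent.Executors
import java.util.concurrent.LinkedBlockingQueue

// Hypothetical stand-ins for XML fragments to be parsed in parallel.
def chunks = [['bar_1', 'bar_2'], ['baz_1', 'baz_2', 'baz_3']]
def poison = new Object()             // marker telling the writer to stop
def queue = new LinkedBlockingQueue()
def saved = []                        // only ever touched by the writer thread

def parsers = Executors.newFixedThreadPool(2)
def writer = Executors.newSingleThreadExecutor()

// Single consumer: the only thread that would own the session.
def done = writer.submit({
    while (true) {
        def item = queue.take()
        if (item.is(poison)) break
        saved << item                 // placeholder for saving the object
    }
    saved.size()
} as Callable)

// Producers: "parse" each chunk on its own thread and enqueue the results.
def parseJobs = chunks.collect { chunk ->
    parsers.submit({ chunk.each { queue.put(it.toUpperCase()) } } as Runnable)
}
parseJobs*.get()                      // wait for all parsers to finish
queue.put(poison)

def count = done.get()                // wait for the writer to drain the queue
parsers.shutdown()
writer.shutdown()
println "saved $count objects"
```

Because only the writer thread touches the persistence layer, a failure there can roll back one transaction in one session, which is the property the question is after.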
You may also want to take a look at Hibernate's batch processing doc page. Flushing after each insert is not the fastest way.
And finally, I don't know how your objects are mapped but you might run into problems saving Foo after all the child objects. Adding the objects to foo's collection will cause Hibernate to set the foo_id reference on each object and you'll end up with an update query for every object you inserted. You probably want to make foo first, and do baz.setFoo(foo) before each insert.
The service can be optimized to address some of the pain points:
I agree with @takteek that parsing the XML would be the time-consuming part, so plan to make that part async.
You do not need to flush on each creation of a child object. See below for the optimization.
Service class would look something like:
// Pass in a 20MB XML file
def upload(def xml) {
String rslt = null
def xsd = Util.getDefsXsd()
if (Util.validateXmlWithXsd(xml, xsd)) {
def fooXml = new XmlParser().parseText(xml.getText())
def foo = new Foo().save(flush: true)
def bars = callAsync {
saveBars(foo, fooXml)
}
def bazs = callAsync {
saveBazs(foo, fooXml)
}
//Merge the detached instances.
//Can also issue a flush, but we do not need it yet
//By default domain class is validated as well.
foo = bars.get().merge() //Future returns foo
foo = bazs.get().merge() //Future returns foo
//Merge the detached instances and check whether the child objects
//are populated or not. If children are
//absent then rollback the whole transaction
handleTransaction {
if(foo.bars && foo.bazs){
foo.save(flush: true)
} else {
//Else block will be reached if any of
//the children is not associated to parent yet
//This would happen if there was a problem in
//either of the thread, corresponding
//transaction would have rolled back
//in the respective sessions. Hence empty associations.
//Set transaction roll-back only
TransactionAspectSupport
.currentTransactionStatus()
.setRollbackOnly()
//Or throw an Exception and
//let handleTransaction handle the rollback
throw new Exception("Rolling back transaction")
}
}
rslt = "Successfully uploaded ${xml.getName()}!"
} else {
rslt = "File failed XSD validation!"
}
rslt
}
def saveBars(Foo foo, fooXml) {
handleTransaction {
for (barXml in fooXml.bar) {
def bar = new Bar(name: barXml.attribute("name"))
foo.addToBars(bar)
}
//Optional I think as session is flushed
//end of method
foo.save(flush: true)
}
foo
}
def saveBazs(Foo foo, fooXml) {
handleTransaction {
for (bazXml in fooXml.baz) {
def baz = new Baz(name: bazXml.attribute("name"))
foo.addToBazs(baz)
}
//Optional I think as session is flushed
//end of method
foo.save(flush: true)
}
foo
}
def handleTransaction(Closure clos){
try {
clos()
} catch (e) {
TransactionAspectSupport.currentTransactionStatus().setRollbackOnly()
}
if (TransactionAspectSupport.currentTransactionStatus().isRollbackOnly())
TransactionAspectSupport.currentTransactionStatus().setRollbackOnly()
}
I want to store objects in a map (called result). The objects are created or updated from SQL rows.
For each row I read I access the map as follows:
def result = [:]
sql.eachRow('SELECT something') { row ->
// check if the Entry is already existing
def theEntry = result[row.KEY]
if (theEntry == null) {
// create the entry
theEntry = new Entry(row.VALUE1, row.VALUE2)
// put the entry in the result map
result[row.KEY] = theEntry
}
// use the Entry (create or update the next hierarchy elements)
}
I want to minimize the code for checking and updating the map. How can this be done?
I know the function map.get(key, defaultValue), but I will not use it, because it is too expensive to create an instance on each iteration even if I don't need it.
What I would like to have is a get function with a closure for providing the default value. In this case I would have lazy evaluation.
Update
The solution dmahapatro provided is exactly what I want. Here is an example of the usage.
// simulate the result from the select
def select = [[a:1, b:2, c:3], [a:1, b:5, c:6], [a:2, b:2, c:4], [a:2, b:3, c:5]]
// a sample class for building an object hierarchy
class Master {
int a
List<Detail> subs = []
String toString() { "Master(a:$a, subs:$subs)" }
}
// a sample class for building an object hierarchy
class Detail {
int b
int c
String toString() { "Detail(b:$b, c:$c)" }
}
// the goal is to build a tree from the SQL result with Master and Detail entries
// and store it in this map
def result = [:]
// iterate over the select, row is visible inside the closure
select.each { row ->
// provide a wrapper with a default value in a closure and get the key
// if it is not available then the closure is executed to create the object
// and put it in the result map -> much more compact than in my question
def theResult = result.withDefault {
new Master(a: row.a)
}.get(row.a)
// process the further columns
theResult.subs.add new Detail(b: row.b, c: row.c )
}
// result should be [
// 1:Master(a:1, subs:[Detail(b:2, c:3), Detail(b:5, c:6)]),
// 2:Master(a:2, subs:[Detail(b:2, c:4), Detail(b:3, c:5)])]
println result
What I learned from this sample:
withDefault returns a wrapper, so manipulate the map through the wrapper and not the original map
the row variable is visible in the closure!
the wrapper for the map is created again in each iteration, since the row variable changes
You asked for it, Groovy has it for you. :)
def map = [:]
def decoratedMap = map.withDefault{
new Entry()
}
It works the same way you would expect it to work lazily. Have a look at withDefault API for a detailed explanation.
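To see the lazy behaviour, here is a small sketch with made-up data; the rows list and the created counter are only there for illustration. It also shows that the default closure receives the missing key as its parameter, so a single wrapper can be created once outside the loop.

```groovy
// Sample rows standing in for the SQL result (hypothetical data).
def rows = [[a: 1, b: 2], [a: 1, b: 5], [a: 2, b: 3]]

int created = 0
// The closure receives the missing key and runs only on a miss;
// its result is stored in the map under that key.
def result = [:].withDefault { key ->
    created++
    [a: key, subs: []]
}

rows.each { row ->
    result.get(row.a).subs << row.b
}

println result
println "entries created: $created"
```

Three rows are processed, but the default closure runs only once per distinct key.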
If I dynamically add a property to a class, each instance of the class is initialized with a reference to the same value (even though the properties are correctly at different addresses, I don't want them to share the same reference value):
Here's an example:
class SolarSystem {
Planets planets = new Planets()
static main(args) {
SolarSystem.metaClass.dynamicPlanets = new Planets()
// Infinite loop
// SolarSystem.metaClass.getDynamicPlanets = {
// if (!delegate.dynamicPlanets.initialized) {
// delegate.dynamicPlanets = new Planets(initialized: true)
// }
//
// delegate.dynamicPlanets
// }
// No such field: dynamicPlanets for class: my.SolarSystem
// SolarSystem.metaClass.getDynamicPlanets = {
// if (!delegate.@dynamicPlanets.initialized) {
// delegate.@dynamicPlanets = new Planets(initialized: true)
// }
//
// delegate.@dynamicPlanets
// }
SolarSystem.metaClass.getUniqueDynamicPlanets = {
if (!delegate.dynamicPlanets.initialized) {
delegate.dynamicPlanets = new Planets(initialized: true)
}
delegate.dynamicPlanets
}
// SolarSystem.metaClass.getDynamicPlanets = {
// throw new RuntimeException("direct access not allowed")
// }
def solarSystem1 = new SolarSystem()
println "a ${solarSystem1.planets}"
println "b ${solarSystem1.dynamicPlanets}"
println "c ${solarSystem1.uniqueDynamicPlanets}"
println "d ${solarSystem1.dynamicPlanets}"
println ''
def solarSystem2= new SolarSystem()
println "a ${solarSystem2.planets}"
println "b ${solarSystem2.dynamicPlanets}"
println "c ${solarSystem2.uniqueDynamicPlanets}"
println "d ${solarSystem2.dynamicPlanets}"
}
}
In a separate file:
class Planets {
boolean initialized = false
}
When this runs, you see something like this:
a my.Planets@4979935d
b my.Planets@66100363
c my.Planets@5e0feb48
d my.Planets@5e0feb48
a my.Planets@671ff436
b my.Planets@66100363
c my.Planets@651dba45
d my.Planets@651dba45
Notice how for solarSystem2, the 'normal' member variable planets has a different address when the two objects are created. However, the dynamically added dynamicPlanets points to the same object that solarSystem1 pointed to (in this case, at address 66100363).
I can reassign them in my dynamic getter (getUniqueDynamicPlanets), and that fixes the problem.
However, I cannot override the getDynamicPlanets getter, because I either get an infinite loop, or I cannot get direct access to the dynamically-added property.
Is there a way to directly access the dynamically-added property so I could handle this in the getDynamicPlanets getter? Is there a better strategy for this altogether? Sorry if I missed it, I've looked a bunch...
Thanks
I'm not 100% sure I understand your question, but if I do, did you try setting the getDynamicPlanets closure to have explicitly 0 parameters, so:
SolarSystem.metaClass.getDynamicPlanets = {-> ... }
If you don't have the -> with no args before it, the closure gets an implicit it parameter, so it is not a zero-arg method and doesn't adhere to the JavaBean getter/setter pattern.
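The difference in arity can be checked directly on the closures. A minimal sketch, with closure names chosen only for illustration:

```groovy
def implicitIt = { it }          // no arrow: one implicit parameter named "it"
def noArgs = { -> 'planets' }    // arrow with nothing before it: zero parameters

println implicitIt.maximumNumberOfParameters
println noArgs.maximumNumberOfParameters
```

The first closure accepts one optional argument, while the second has no parameters at all, which is the shape a property getter is expected to have.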