Hive GenericUDF executes twice on Spark

Hello, I am facing a problem with a Hive GenericUDF that I register as a temporary function: when I call it, it is invoked twice. See the code below.
I created a GenericUDF with the following code:
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector
import org.apache.hadoop.hive.serde2.objectinspector.primitive.{PrimitiveObjectInspectorFactory, StringObjectInspector}

class GenUDF extends GenericUDF {
    var queryOI: StringObjectInspector = null
    var argumentsOI: Array[ObjectInspector] = null

    override def initialize(arguments: Array[ObjectInspector]): ObjectInspector = {
        /*if (arguments.length == 0) {
            throw new UDFArgumentLengthException("At least one argument must be specified")
        }
        if (!(arguments(0).isInstanceOf[StringObjectInspector])) {
            throw new UDFArgumentException("First argument must be a string")
        }
        queryOI = arguments(0).asInstanceOf[StringObjectInspector]
        argumentsOI = arguments*/
        println("inside initializeweeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee")
        PrimitiveObjectInspectorFactory.javaStringObjectInspector
    }

    override def evaluate(arguments: Array[GenericUDF.DeferredObject]): Object = {
        println("inside generic UDF::::::::::::::::::::::((((((((((((((((((((((((FDDDDDDDDDDDDD:")
        4.toString
    }

    override def getDisplayString(children: Array[String]): String = {
        println("inside displayssssssssssssssssssssssssssssssss")
        "udft"
    }
}
I register it with the following statement:
hiveContext.sql("CREATE TEMPORARY FUNCTION udft AS 'functions.GenUDF'")
When I then call the function with the following query
select udft()
the print statement in the body of evaluate executes twice.

Related

PDFtron: change name of element

I'm using PDFTron's Java SDK, and I want to change the name of an element, then write the modified PDF to a new file, but I get the following output:
PDFNet is running in demo mode.
Permission: read
Exception:
Message: SetName() can't be invoked on Obj of this type.
How can I change an object's name? My code (in Scala) is as follows:
def main(args: Array[String]): Unit = {
    PDFNet.initialize()
    var doc = new PDFDoc("example.pdf")
    var fdf = doc.fdfExtract
    var iter = fdf.getFieldIterator
    while (iter.hasNext) {
        var field = iter.next
        var obj = field.findAttribute("T")
        if (obj != null && field.getName.startsWith("MyPrefix")) {
            obj.setName("NewPrefix") // `field.setName` produces the same error
        }
    }
}

The name returned by Field.getName() is an amalgamation of the names of this leaf Field and of any parent fields, delimited by a period (.).
So while Field.getName() might return name.first, the Field's T value might just be first. This is why Field.getPartialName() exists.
The better and safer way to change the T value is therefore:
var obj = field.findAttribute("T")
if (obj != null && obj.isString() && obj.getAsPDFText().startsWith("MyPrefix")) {
    obj.setString("NewPrefix")
}
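To make the difference between the full name and the partial name concrete, here is a minimal, library-free sketch in Java. The Node class and its fullName() method are hypothetical stand-ins invented for illustration; they are not PDFTron classes, and the real SDK may build names differently. The sketch only shows the idea described above: the full name is the chain of partial (T) names joined by periods.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class FieldNameSketch {
    /** Hypothetical stand-in for a form field node; NOT a PDFTron class. */
    static final class Node {
        final String partialName; // plays the role of the field's "T" entry
        final Node parent;        // null for a top-level field

        Node(String partialName, Node parent) {
            this.partialName = partialName;
            this.parent = parent;
        }

        /** Joins the partial names from the root down to this node with '.'. */
        String fullName() {
            Deque<String> parts = new ArrayDeque<>();
            for (Node n = this; n != null; n = n.parent) {
                parts.addFirst(n.partialName);
            }
            return String.join(".", parts);
        }
    }

    public static void main(String[] args) {
        Node parent = new Node("name", null);
        Node leaf = new Node("first", parent);
        System.out.println(leaf.fullName());  // name.first
        System.out.println(leaf.partialName); // first
    }
}
```

Under this model, a prefix test against the full name can match text that belongs to a parent field, which is why the answer checks the T string itself.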

How to use MockK to mock an observable

I have a data provider that has an Observable<Int> as part of its public API. My class under test maps this into an Observable<String>.
How do I create a mock so that it can send out different values on the data provider's observable?
I can do it using a Fake object, but that is a lot of work that I don't think is necessary with MockK.
Simplified code:
interface DataProvider {
    val numberData: Observable<Int>
}

class FakeDataProvider() : DataProvider {
    private val _numberData = BehaviorSubject.createDefault(0)
    override val numberData = _numberData.hide()

    // Note: the internals of this class cause the _numberData changes.
    // I can use this method to fake the changes for this fake object,
    // but the real class doesn't have this method.
    fun fakeNewNumber(newNumber: Int) {
        _numberData.onNext(newNumber)
    }
}

interface ClassUnderTest {
    val stringData: Observable<String>
}

class MyClassUnderTest(dataProvider: DataProvider) : ClassUnderTest {
    override val stringData = dataProvider.numberData.map { "string = " + it.toString() }
}

class MockKTests {
    @Test fun testUsingFakeDataProvider() {
        val fakeDataProvider = FakeDataProvider()
        val classUnderTest = MyClassUnderTest(fakeDataProvider)
        val stringDataTestObserver = TestObserver<String>()
        classUnderTest.stringData.subscribe(stringDataTestObserver)
        fakeDataProvider.fakeNewNumber(1)
        fakeDataProvider.fakeNewNumber(2)
        fakeDataProvider.fakeNewNumber(3)
        // Note we are expecting the initial value of 0 to also come through
        stringDataTestObserver.assertValuesOnly("string = 0", "string = 1", "string = 2", "string = 3")
    }

    // How do you write the mock to trigger the dataProvider observable?
    @Test fun testUsingMockDataProvider() {
        val mockDataProvider = mockk<DataProvider>()
        // every { ... what goes here ... } just Runs
        val classUnderTest = MyClassUnderTest(mockDataProvider)
        val stringDataTestObserver = TestObserver<String>()
        classUnderTest.stringData.subscribe(stringDataTestObserver)
        // Note we are expecting the initial value of 0 to also come through
        stringDataTestObserver.assertValuesOnly("string = 0", "string = 1", "string = 2", "string = 3")
    }
}

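Try the following:
every { mockDataProvider.numberData } answers { Observable.range(1, 3) }
Note that Observable.range(1, 3) emits 1, 2, 3 and then completes. To match the values expected in the question, which start at 0, you would use Observable.range(0, 4). Because assertValuesOnly also requires that the stream has not terminated, assertValues may be the better fit for a source that completes.
You may also need a different way to create the mock object, such as a spy. DataProvider is an interface and cannot be instantiated directly, so spy on a concrete implementation such as the FakeDataProvider from the question:
val mockDataProvider = spyk(FakeDataProvider())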

Alternatively, create a fake list and wrap it in an Observable:
var fakeList: List<Quiz> = listOf(
    Quiz("G1", "fromtest", "", "", 1)
)
var observableFakelist = Observable.fromArray(fakeList)
You can then return observableFakelist from the mock. Note that this emits the whole list as a single item.

Getting an error with Groovy StubFor when I try and set property value

I have a dynamic AORule class that has an embedded DynamicRule instance as a delegate, as shown below. A method on AORule invokes an action on rule, first testing whether a dynamicExecution closure has been set before calling it:
@Component
@Scope("prototype")
@ActiveObject
@Slf4j
@EqualsAndHashCode
class AORule implements org.easyrules.api.Rule {
    //setup delegate dynamicRule and make it a part of this ActiveObject
    @Delegate DynamicRule rule
    AORule (name, description=null) {
        rule = new DynamicRule (name, description ?: "Basic Active Object Rule")
    }
    AORule () {
        rule = new DynamicRule ()
    }
    ....rest of class with a method like this
    @ActiveMethod
    void active_execute () {
        //rule.execute()
        if (rule.dynamicExecution != null) {
            log.debug "activeExec : running dynamicExecution closure "
            rule.dynamicExecute() //should call closure, where this is defined
        }
        else {
            log.debug "activeExec : running std Execution action "
            rule.execute()
        }
    } ....
where the top of the DynamicRule class looks like this:
@InheritConstructors
@Component
@Slf4j //use Groovy AST to get logger
@EqualsAndHashCode
class DynamicRule extends BasicRule implements org.easyrules.api.Rule {
    Closure dynamicEvaluation = null
    Closure dynamicExecution = null
    ....
I then tried to define a simple Spock test for this and stub the DynamicRule like so:
def "set execution closure directly when doing an execute action " () {
    given:"setup of stub for testing "
    def mockres
    def stub = new StubFor (DynamicRule)
    stub.demand.dynamicExecute { mockres = "did nothing" }
    stub.demand.getDynamicExecution = {true} // pretend this has been set
    when : "execution passing closure value at same time "
    //have to run new, actions etc in scope of class stub
    stub.use {
        def rule = new AORule ()
        rule.execute()
    }
    then : "test execute closure ran as expected "
    mockres == "did nothing"
}
In this test I tried to set up a stub demand for the getDynamicExecution property accessor (the property is a closure) so that it returns true, and another demand to stub the dynamicExecute method so that it sets mockres. I then assert that mockres has that value.
However, this gives me an error like this:
groovy.lang.MissingPropertyException: No such property: getDynamicExecution for class: groovy.mock.interceptor.Demand
at org.easyrules.spring.AORuleSpecTest.set execution closure directly when doing an execute action (AORuleSpecTest.groovy:91)
How do I set the expected result of a mocked property access in a class? I thought setting the stub demand expectation with getPropertyName = {some value} would do the trick. What am I doing wrong?
Postscript
I tried to set up a dumbed-down version of the testing approach I have used, like this:
class DummyStubSpecTest extends Specification {
    def "test z stub " () {
        def res
        given :
        def zstub = new StubFor (Zthing.class)
        zstub.demand.execute {res = "hello"}
        zstub.demand.getVar {false}
        when :
        zstub.use {
            def a = new Athing()
            a.z.execute()
        }
        then :
        res == "hello"
    }
    def "test Athing stub " () {
        def res
        given :
        def astub = new StubFor (Athing.class)
        astub.demand.doit {res = "hello"}
        when :
        astub.use {
            def a = new Athing ()
            a.doit()
        }
        then :
        res == "hello"
    }
}
and a Spock test like this:
def "test z stub " () {
    def res
    given :
    def zstub = new StubFor (Zthing.class)
    zstub.demand.execute {res = "hello"}
    zstub.demand.getVar {false}
    when :
    zstub.use {
        def a = new Athing()
        a.z.execute()
    }
    then :
    res == "hello"
}
This does work, so I'm not sure what I am doing wrong in the previous example.
Second postscript
One suggestion was that I should use GroovyMock to run the test, so I tried this:
def "set execution closure using GroovyMock to Stub to execute action " () {
    given:"setup of stub for testing "
    def mockres
    GroovyMock (DynamicRule, global:true)
    DynamicRule.dynamicExecute() >> {mockres = "did nothing"}
    when : "execution passing closure value at same time "
    //have to run new, actions etc in scope of class stub
    def rule = new AORule ()
    rule.execute()
    then : "test execute closure ran as expected "
    mockres == "did nothing"
}
However when I run this the assertion fails again - mockres is not being set, when the execute() triggers the internal DynamicRule.dynamicExecute() method on my mocked object.
Third Postscript
I have tried again, first with a dummy pair of classes and a dummy test to try GroovyMock, and this works. Here are my Athing and Zthing classes:
//define subject under test
class Athing {
    def z = new Zthing()
    def res
    def doit() {
        z.greet()
        res = z.execute()
    }
}
//dependency to be stubbed
class Zthing {
    def var = true
    def execute() {if (var) println "its true" else println "its false"; var}
    def greet() { println "hello"}
}
Here is the test I defined:
class DummyStubSpecTest extends Specification {
    def "test z with GroovyMock " () {
        given:
        def a = new Athing()
        def res
        def mock = GroovyMock (Zthing) { //, global:true
            1*execute() >> {println "stub called"; res=true}
        }
        a.z = mock
        when:
        a.doit()
        then:
        res == true
    }
}
I created the mock, set the interaction expectation (1*), and stubbed the response, which sets my external res variable so I can test for it.
Because the instance of 'a' has an embedded 'z' instance, I overwrite the a.z reference to point to the mock. I then call the SUT in the when clause, and check that res has been set. This does work.
However my real class/test still refuse to work. So my AORule has a DynamicRule reference
class AORule implements org.easyrules.api.Rule {
    //setup delegate dynamicRule and make it a part of this ActiveObject
    @Delegate DynamicRule rule
    AORule (name, description=null) {
        rule = new DynamicRule (name, description ?: "Basic Active Object Rule")
    }
    AORule () {
        rule = new DynamicRule ()
    } ....
where the execute method in AORule looks like this:
void execute () {
    active_execute()
}
/**
 * Active method manages async action through object inside through hidden actor.
 * variable 'rule' is the variable we are protecting. Runs either any dynamicExecution closure
 * where defined or just runs the standard class execute method
 *
 * @return void
 */
@ActiveMethod
void active_execute () {
    //rule.execute()
    if (rule.dynamicExecution != null) {
        log.debug "activeExec : running dynamicExecution closure "
        rule.dynamicExecute() //should call closure, where this is defined
    }
    else {
        log.debug "activeExec : running std Execution action "
        rule.execute()
    }
} ....
My test for this looks like this:
def "set execution closure using GroovyMock to Stub out dynamicExecute action " () {
    given:"setup of stub for testing "
    def mockres
    def mock = GroovyMock (DynamicRule) { //, global:true
        1*dynamicExecute() >> {-> println "stub called"; mockres = "did nothing" }
        0*execute()
    }
    aorule.rule = mock
    when : "execution passing closure value at same time "
    //have to run new, actions etc in scope of class stub
    //def rule = new AORule ()
    aorule.setDynamicExecution {println "hi"}
    aorule.execute()
    then : "test execute closure ran as expected "
    mockres == "did nothing"
}
The test doesn't work correctly - the assertion shows mockres to be null (not being set).
I thought it might be because active_execute internally checks whether the dynamicExecution closure is set, so I tried changing the GroovyMock to a GroovySpy (so that my aorule.setDynamicExecution call should work), and it still fails.
I can't see where I am going wrong here. My simple example seems to work and my real test doesn't.
I also tried brute force and used metaClass to hack an execute closure to set mockres, and that didn't get called either.

Groovy DSL: How can I let two delegating classes handle different parts of a DSLScript?

Let's say I have a DSL like this
setup {name = "aDSLScript"}
println "this is common groovy code"
doStuff {println "I'm doing dsl stuff"}
Usually one would have a delegating class implementing the methods 'setup' and 'doStuff'. Besides that, one could write ordinary Groovy code to be executed (println...).
What I am looking for is a way to execute this in two steps. In the first step, only the setup method should be processed (and not the println). The second step handles the other parts.
At the moment, I have two delegating classes. One implements 'setup' the other one implements 'doStuff'. But both execute the println statement, of course.

You can create a single class that intercepts the method calls from the script and coordinates the subsequent method invocations. I did it through reflection, but you can go declarative if you want. These are the model and script classes:
class FirstDelegate {
    def setup(closure) { "firstDelegate.setup" }
}
class SecondDelegate {
    def doStuff(closure) { "secondDelegate.doStuff" }
}
class MethodInterceptor {
    def invokedMethods = []
    def methodMissing(String method, args) {
        invokedMethods << [method: method, args: args]
    }
    def delegate() {
        def lookupCalls = { instance ->
            def invokes = instance.metaClass.methods.findResults { method ->
                invokedMethods.findResult { invocation ->
                    invocation.method == method.name ?
                        [method: method, invocation: invocation] : null
                }
            }
            invokes.collect { invoked ->
                invoked.method.invoke(instance, invoked.invocation.args)
            }
        }
        return lookupCalls(new FirstDelegate()) + lookupCalls(new SecondDelegate())
    }
}
Here be scripts and assertions:
import org.codehaus.groovy.control.CompilerConfiguration
def dsl = '''
setup {name = "aDSLScript"}
println "this is common groovy code"
doStuff {println "Ima doing dsl stuff"}
'''
def compiler = new CompilerConfiguration()
compiler.scriptBaseClass = DelegatingScript.class.name
def shell = new GroovyShell(this.class.classLoader, new Binding(), compiler)
script = shell.parse dsl
interceptor = new MethodInterceptor()
script.setDelegate interceptor
script.run()
assert interceptor.invokedMethods*.method == [ 'setup', 'doStuff' ]
assert interceptor.delegate() ==
['firstDelegate.setup', 'secondDelegate.doStuff']
Notice that I didn't bother intercepting the println call, which is a DefaultGroovyMethods method and thus a little more cumbersome to handle.
Also, having the MethodInterceptor class implement the delegate() method is not a good idea, since it allows the user-defined script to call it.
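The record-then-dispatch idea in this answer can be sketched without any Groovy machinery. The following Java sketch is only an illustration under stated assumptions: Recorder and replay are hypothetical names, the recorder plays the role that methodMissing plays above, and each phase is modelled as a map from method name to handler rather than as a real delegate class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

public class RecordThenDispatch {
    /** Hypothetical recorder: stores call names in order instead of executing them. */
    static final class Recorder {
        final List<String> invoked = new ArrayList<>();

        void call(String method) {
            invoked.add(method);
        }
    }

    /** Replays recorded calls against one phase's handlers, skipping names that phase does not know. */
    static List<String> replay(Recorder recorder, Map<String, Supplier<String>> handlers) {
        List<String> results = new ArrayList<>();
        for (String name : recorder.invoked) {
            Supplier<String> handler = handlers.get(name);
            if (handler != null) {
                results.add(handler.get());
            }
        }
        return results;
    }

    public static void main(String[] args) {
        Recorder recorder = new Recorder();
        recorder.call("setup");
        recorder.call("doStuff");

        // Each map stands in for one delegate class from the answer.
        Map<String, Supplier<String>> first = Map.of("setup", () -> "firstDelegate.setup");
        Map<String, Supplier<String>> second = Map.of("doStuff", () -> "secondDelegate.doStuff");

        List<String> results = new ArrayList<>(replay(recorder, first));
        results.addAll(replay(recorder, second));
        System.out.println(results); // [firstDelegate.setup, secondDelegate.doStuff]
    }
}
```

Keeping replay outside the recorder also addresses the concern raised above: the object exposed to the script only records calls and offers no dispatch method for the script to invoke.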

I found a way to split up execution of the DSL script. I used a CompilationCustomizer to remove every statement from AST except the doFirst{}. So the first run will only execute doFirst. The second run does everything else. Here's some code:
import org.codehaus.groovy.ast.ClassNode
import org.codehaus.groovy.ast.MethodNode
import org.codehaus.groovy.ast.expr.MethodCallExpression
import org.codehaus.groovy.ast.stmt.*
import org.codehaus.groovy.classgen.GeneratorContext
import org.codehaus.groovy.control.*
import org.codehaus.groovy.control.customizers.CompilationCustomizer

class DoFirstProcessor {
    def doFirst(Closure c) {
        c()
    }
}
class TheRestProcessor {
    def doStuff(Closure c) {
        c()
    }
    def methodMissing(String name, args) {
        //nothing to do
    }
}
def dsl = """
println 'this is text that will not be printed out in first line!'
doFirst { println 'First things first: e.g. setting up environment' }
doStuff { println 'doing some stuff now' }
println 'That is it!'
"""
class HighlanderCustomizer extends CompilationCustomizer {
    def methodName
    HighlanderCustomizer(def methodName) {
        super(CompilePhase.SEMANTIC_ANALYSIS)
        this.methodName = methodName
    }
    @Override
    void call(SourceUnit sourceUnit, GeneratorContext generatorContext, ClassNode classNode) throws CompilationFailedException {
        def methods = classNode.getMethods()
        methods.each { MethodNode m ->
            m.code.each { Statement st ->
                if (!(st instanceof BlockStatement)) {
                    return
                }
                // collect every top-level statement that is not a call to methodName
                def removeStmts = []
                st.statements.each { Statement bst ->
                    if (bst instanceof ExpressionStatement) {
                        def ex = bst.expression
                        if (ex instanceof MethodCallExpression) {
                            if (!ex.methodAsString.equals(methodName)) {
                                removeStmts << bst
                            }
                        } else {
                            removeStmts << bst
                        }
                    } else {
                        removeStmts << bst
                    }
                }
                st.statements.removeAll(removeStmts)
            }
        }
    }
}
def cc = new CompilerConfiguration()
cc.addCompilationCustomizers new HighlanderCustomizer("doFirst")
cc.scriptBaseClass = DelegatingScript.class.name
// first run: only doFirst survives the customizer
def doFirstShell = new GroovyShell(new Binding(), cc)
def doFirstScript = doFirstShell.parse dsl
doFirstScript.setDelegate new DoFirstProcessor()
doFirstScript.run()
// second run: no customizer, so everything else executes
cc.compilationCustomizers.clear()
def shell = new GroovyShell(new Binding(), cc)
def script = shell.parse dsl
script.setDelegate new TheRestProcessor()
script.run()
I did another variation of this where I execute the DSL in one step. See my blog post about it: http://hackserei.metacode.de/?p=247

Groovy: Implicit call not working on instance variables inside closure

A class implements the call method so that its objects can be called like a method. This works in most cases, but not when the call is made inside a closure on an object that is an instance variable of a class.
To demonstrate the problem, I've commented the interesting lines in the code below with numbers. While most variants produce the same output, only the line with comment 5 doesn't work. It throws groovy.lang.MissingMethodException: No signature of method: Client2.instanceVar() is applicable for argument types: () values: []
Can someone help me understand the reason? Is it a bug?
class CallableObject {
    def call() { println "hello" }
}
class Client {
    def instanceVar = new CallableObject()
    def method() {
        def localVar = new CallableObject()
        def closure1 = { localVar() }
        def closure2 = { instanceVar.call() }
        def closure3 = { instanceVar() } // doesn't work
        localVar() // 1
        instanceVar() // 2
        closure1() // 3
        closure2() // 4
        closure3() // 5
    }
}
new Client().method()

I guess this will make it clear.
class CallableObject {
    def call() { println "hello" }
}
class Client {
    def instanceVar = new CallableObject()
    def getInstanceVar() {
        println "Getter Called"
        instanceVar
    }
    def method() {
        def localVar = new CallableObject()
        def closure1 = { localVar() }
        def closure2 = { instanceVar.call() }
        def closure3 = { this.@instanceVar() } //should work now
        localVar() // 1
        instanceVar() // 2
        closure1() // 3
        closure2() // 4
        closure3() // 5
    }
}
new Client().method()
You will see "Getter Called" printed when closure2() is invoked. When a property of the enclosing class is accessed from a closure inside a method, the getter is called instead of the field being read directly. To get around the error, the field instanceVar needs to be accessed directly (with the .@ operator) so that call() is used implicitly.
