How come the test case still passes even though, in my opinion, I have not provided the correct mocking - mockito

I am testing this function. The main bit for me is the call to the add method of a repository (partitionsOfATagTransactionRepository.add(transaction, infoToAdd, mutationCondition)).
def updateOrCreateTagPartitionInfo(transaction: DistributedTransaction, currentTagPartition: Option[TagPartitions], tag: String) = {
  val currentCalendar = Calendar.getInstance() //TODOM - should I use a standard Locale/Timezone (eg GMT) to keep time consistent across all instances of the server application
  val currentYear = currentCalendar.get(Calendar.YEAR).toLong
  val currentMonth = currentCalendar.get(Calendar.MONTH).toLong
  val newTagParitionInfo = TagPartitionsInfo(currentYear.toLong, currentMonth.toLong)
  val (infoToAdd, mutationCondition) = currentTagPartition match {
    case Some(tagPartitionInfo) => {
      //checktest-should add new tag partition info to existing partition info
      (TagPartitions(tagPartitionInfo.tag, tagPartitionInfo.partitionInfo + newTagParitionInfo), new PutIfExists)
    }
    case None => {
      //checktest-should add new tag partition info if existing partition doesn't exist
      (TagPartitions(tag, Set(newTagParitionInfo)), new PutIfNotExists)
    }
  }
  partitionsOfATagTransactionRepository.add(transaction, infoToAdd, mutationCondition) //calling a repository method which I suppose needs mocking
  infoToAdd
}
I wrote this test case to test the method
"should add new tag partition info if existing partition doesn't exist" in {
val servicesTestEnv = new ServicesTestEnv(components = components)
val questionTransactionDBService = new QuestionsTransactionDatabaseService(
servicesTestEnv.mockAnswersTransactionRepository,
servicesTestEnv.mockPartitionsOfATagTransactionRepository,
servicesTestEnv.mockPracticeQuestionsTagsTransactionRepository,
servicesTestEnv.mockPracticeQuestionsTransactionRepository,
servicesTestEnv.mockSupportedTagsTransactionRepository,
servicesTestEnv.mockUserProfileAndPortfolioTransactionRepository,
servicesTestEnv.mockQuestionsCreatedByUserRepo,
servicesTestEnv.mockTransactionService,
servicesTestEnv.mockPartitionsOfATagRepository,
servicesTestEnv.mockHelperMethods
)
val currentCalendar = Calendar.getInstance() //TODOM - should I use a standard Locale/Timezone (eg GMT) to keep time consistent across all instances of the server application
val currentYear = currentCalendar.get(Calendar.YEAR).toLong
val currentMonth = currentCalendar.get(Calendar.MONTH).toLong
val newTagParitionInfo = TagPartitionsInfo(currentYear.toLong, currentMonth.toLong)
val existingTag = "someExistingTag"
val existingTagPartitions = None
val result = questionTransactionDBService.updateOrCreateTagPartitionInfo(servicesTestEnv.mockDistributedTransaction,
existingTagPartitions,existingTag) //calling the funtion under test but have not provided mock for the repository's add method. The test passes! how? Shouldn't the test throw Null Pointer exception?
val expectedResult = TagPartitions(existingTag,Set(newTagParitionInfo))
verify(servicesTestEnv.mockPartitionsOfATagTransactionRepository,times(1))
.add(servicesTestEnv.mockDistributedTransaction,expectedResult,new PutIfNotExists())
result mustBe expectedResult
result mustBe TagPartitions(existingTag,Set(newTagParitionInfo))
}
The various mocks are defined as
val mockCredentialsProvider = mock(classOf[CredentialsProvider])
val mockUserTokenTransactionRepository = mock(classOf[UserTokenTransactionRepository])
val mockUserTransactionRepository = mock(classOf[UserTransactionRepository])
val mockUserProfileAndPortfolioTransactionRepository = mock(classOf[UserProfileAndPortfolioTransactionRepository])
val mockHelperMethods = mock(classOf[HelperMethods])
val mockTransactionService = mock(classOf[TransactionService])
val mockQuestionsCreatedByUserRepo = mock(classOf[QuestionsCreatedByAUserForATagTransactionRepository])
val mockQuestionsAnsweredByUserRepo = mock(classOf[QuestionsAnsweredByAUserForATagTransactionRepository])
val mockDistributedTransaction = mock(classOf[DistributedTransaction])
val mockQuestionTransactionDBService = mock(classOf[QuestionsTransactionDatabaseService])
val mockQuestionNonTransactionDBService = mock(classOf[QuestionsNonTransactionDatabaseService])
val mockAnswersTransactionRepository = mock(classOf[AnswersTransactionRepository])
val mockPartitionsOfATagTransactionRepository = mock(classOf[PartitionsOfATagTransactionRepository])
val mockPracticeQuestionsTagsTransactionRepository = mock(classOf[PracticeQuestionsTagsTransactionRepository])
val mockPracticeQuestionsTransactionRepository = mock(classOf[PracticeQuestionsTransactionRepository])
val mockSupportedTagsTransactionRepository = mock(classOf[SupportedTagsTransactionRepository])
val mockPartitionsOfATagRepository = mock(classOf[PartitionsOfATagRepository])
The test case passes even though I have not provided any mock for partitionsOfATagTransactionRepository.add. Shouldn't I get a NullPointerException when the add method is called?
I was expecting that I would need to write something like doNothing().when(servicesTestEnv.mockPartitionsOfATagTransactionRepository).add(ArgumentMatchers.any[DistributedTransaction], ArgumentMatchers.any[TagPartitions], ArgumentMatchers.any[MutationCondition]) or when(servicesTestEnv.mockPartitionsOfATagTransactionRepository.add(ArgumentMatchers.any[DistributedTransaction], ArgumentMatchers.any[TagPartitions], ArgumentMatchers.any[MutationCondition])).thenReturn(...) for the test case to pass.

The Mockito team made a decision to return a default value for a method if no stubbing is provided.
See: https://javadoc.io/doc/org.mockito/mockito-core/latest/org/mockito/Mockito.html#stubbing
By default, for all methods that return a value, a mock will return either null, a primitive/primitive wrapper value, or an empty collection, as appropriate. For example 0 for an int/Integer and false for a boolean/Boolean.
This decision was made consciously: if you are focusing on a different aspect of the behaviour of the method under test, and the default value is good enough, you don't need to specify it.
Note that other mocking frameworks have taken the opposite path - they raise an exception when an unstubbed call is detected (for example, EasyMock).
See EasyMock vs Mockito: design vs maintainability?
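That is exactly what happens in the test above: the unstubbed add call simply returns the default value for its return type, and verify only checks that the interaction took place. Here is a minimal, self-contained sketch of that behaviour, using plain Mockito with a made-up Repo trait (not the repositories from the question):

import org.mockito.Mockito.{mock, verify, when}

// A made-up repository trait, purely to illustrate Mockito's default answers;
// it is not the PartitionsOfATagTransactionRepository from the question.
trait Repo {
  def add(key: String, value: Int): Boolean
  def find(key: String): String
}

object UnstubbedCallDemo extends App {
  val repo: Repo = mock(classOf[Repo])

  // Unstubbed calls do not throw; they return the default value for the return type.
  assert(!repo.add("k", 1))      // false is the default for Boolean
  assert(repo.find("k") == null) // null is the default for a reference type

  // The interaction can still be verified without any stubbing ...
  verify(repo).add("k", 1)

  // ... and stubbing is only needed when the test depends on the return value.
  when(repo.find("k")).thenReturn("v")
  assert(repo.find("k") == "v")
}

So the verify(...) in your test is enough to assert that add was called with the expected arguments; stubbing would only become necessary if updateOrCreateTagPartitionInfo actually used the value returned by add.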

Related

Spark Aggregator on sorted Window never uses merge - is this reliable?

I am using org.apache.spark.sql.expressions.Aggregator to implement custom logic on a series of rows. I have noticed that the merge() function is never called when the Aggregator is applied to an ordered window with rows between unboundedPreceding and currentRow, i.e. the aggregation behavior is entirely determined by how new elements are added to the latest reduction, reduce().
If merge() is indeed never called in this case, UDAFs would be a great tool to integrate arbitrary custom logic on large partitions of ordered rows; see https://softwarerecs.stackexchange.com/questions/83666/foss-data-stack-to-perform-complex-custom-logic-on-billions-of-ordered-rows. However, I cannot find this being mentioned in the Spark documentation or the Spark issue tracker, and hence I am wondering if it is safe to use in this way - specifically for custom algorithms that don't allow for a merge()-like operation.
Below is some code specifically to test this behavior. I have locally checked the observation with a set of 300 million rows and partitioning based on three columns (each partition having a few million rows), and the observation holds up.
timestampdata.csv
category,eventTime
a,240
a,489
b,924
a,890
b,563
a,167
a,134
b,600
b,901
OrderedProcessing.scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.{UserDefinedFunction, Window}
import org.apache.spark.sql.functions.udaf

object OrderedProcessing {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").getOrCreate()
    import spark.implicits._
    val checkOrderingUdf: UserDefinedFunction = udaf[Int, OrderProcessingInfo, OrderProcessingInfo](CheckOrdering)
    val df_data = spark.read
      .options(Map("inferSchema" -> "true", "delimiter" -> ",", "header" -> "true"))
      .csv("./timestampdata.csv")
    val df_checked = df_data
      .withColumn("orderProcessingInfo",
        checkOrderingUdf.apply($"eventTime").over(
          Window.partitionBy("category").orderBy("eventTime")
            .rowsBetween(Window.unboundedPreceding, Window.currentRow)))
      .select($"category", $"eventTime",
        $"orderProcessingInfo".getItem("processedAllInOrder").alias("processedAllInOrder"),
        $"orderProcessingInfo".getItem("haveUsedReduce").alias("haveUsedReduce"),
        $"orderProcessingInfo".getItem("haveUsedMerge").alias("haveUsedMerge"))
    df_checked.groupBy("processedAllInOrder", "haveUsedReduce", "haveUsedMerge").count().show()
  }
}
OrderProcessingInfo.scala
case class OrderProcessingInfo(latestTime: Int, processedAllInOrder: Boolean, haveUsedReduce: Boolean, haveUsedMerge: Boolean)
CheckOrdering.scala
import org.apache.spark.sql.Encoder
import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder
import org.apache.spark.sql.expressions.Aggregator

object CheckOrdering extends Aggregator[Int, OrderProcessingInfo, OrderProcessingInfo] {
  override def zero = OrderProcessingInfo(0, true, false, false)
  override def reduce(agg: OrderProcessingInfo, e: Int) = OrderProcessingInfo(
    latestTime = e, processedAllInOrder = agg.processedAllInOrder & (e >= agg.latestTime),
    haveUsedReduce = true, haveUsedMerge = agg.haveUsedMerge
  )
  override def merge(agg1: OrderProcessingInfo, agg2: OrderProcessingInfo) = OrderProcessingInfo(
    latestTime = agg1.latestTime.max(agg2.latestTime),
    processedAllInOrder = agg1.processedAllInOrder & agg2.processedAllInOrder & (agg2.latestTime >= agg1.latestTime),
    haveUsedReduce = agg1.haveUsedReduce | agg2.haveUsedReduce,
    haveUsedMerge = true
  )
  override def finish(agg: OrderProcessingInfo) = agg
  override def bufferEncoder: Encoder[OrderProcessingInfo] = implicitly(ExpressionEncoder[OrderProcessingInfo])
  override def outputEncoder: Encoder[OrderProcessingInfo] = implicitly(ExpressionEncoder[OrderProcessingInfo])
}
output
+-------------------+--------------+-------------+-----+
|processedAllInOrder|haveUsedReduce|haveUsedMerge|count|
+-------------------+--------------+-------------+-----+
| true| true| false| 9|
+-------------------+--------------+-------------+-----+
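Whether Spark guarantees this for running window frames is not something I can confirm from the documentation either, but for an algorithm that genuinely cannot support a merge()-like operation, one defensive option is to make merge fail loudly, so that any plan that does call it surfaces immediately instead of silently producing out-of-order results. A sketch of the idea (RunningState and NoMergeAggregator are made-up names, not Spark API):

import org.apache.spark.sql.{Encoder, Encoders}
import org.apache.spark.sql.expressions.Aggregator

case class RunningState(latestTime: Int, inOrder: Boolean)

object NoMergeAggregator extends Aggregator[Int, RunningState, Boolean] {
  override def zero: RunningState = RunningState(0, inOrder = true)

  override def reduce(s: RunningState, e: Int): RunningState =
    RunningState(e, s.inOrder && e >= s.latestTime)

  // If Spark ever merges partial buffers for this aggregation, fail immediately
  // instead of returning a value that was not computed in row order.
  override def merge(a: RunningState, b: RunningState): RunningState =
    throw new UnsupportedOperationException(
      "merge() is not supported: this aggregation relies on strictly ordered reduce() calls")

  override def finish(s: RunningState): Boolean = s.inOrder

  override def bufferEncoder: Encoder[RunningState] = Encoders.product[RunningState]
  override def outputEncoder: Encoder[Boolean] = Encoders.scalaBoolean
}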

Groovy AST INSTRUCTION_SELECTION phase vs SEMANTIC_ANALYSIS

This piece of code works in my visitor in the SEMANTIC_ANALYSIS phase but not in the INSTRUCTION_SELECTION phase. It looks like I am not able to use the constructor call. If I use Arrays.asList() instead of the constructor, it seems to compile. Any help is appreciated.
val codeBlock = GeneralUtils.block()
// Declare the futures list
val arrayListNode = ClassHelper.make(ArrayList::class.java)
val variable = "myVariable"
val variableDeclaration = GeneralUtils.varX(variable, arrayListNode)
val asListExpression = GeneralUtils.ctorX(arrayListNode)
val variableStmt = GeneralUtils.declS(variableDeclaration, asListExpression)
codeBlock.addStatement(variableStmt)
//
// Set the codeBlock back to the closure
closure.code = codeBlock
In the instruction selection phase, the error I get is
java.lang.ArrayIndexOutOfBoundsException: size==0
at org.codehaus.groovy.classgen.asm.OperandStack.getTopOperand(OperandStack.java:672)
at org.codehaus.groovy.classgen.asm.BinaryExpressionHelper.evaluateEqual(BinaryExpressionHelper.java:318)
at org.codehaus.groovy.classgen.asm.sc.StaticTypesBinaryExpressionMultiTypeDispatcher.evaluateEqual(StaticTypesBinaryExpressionMultiTypeDispatcher.java:142)
at org.codehaus.groovy.classgen.AsmClassGenerator.visitDeclarationExpression(AsmClassGenerator.java:637)
at org.codehaus.groovy.ast.expr.DeclarationExpression.visit(DeclarationExpression.java:89)
This code seemed to work.
// Declare the futures list
val arrayListNode = ClassHelper.make(ArrayList::class.java)
val variable = "myVariable"
val variableDeclaration = GeneralUtils.varX(variable, arrayListNode)
val asListExpression = GeneralUtils.ctorX(arrayListNode)
// Look for the arraylist constructor in the node
val arrayNodeConstructorMethod = arrayListNode.getDeclaredConstructor(arrayOf())
asListExpression.setNodeMetaData(StaticTypesMarker.DIRECT_METHOD_CALL_TARGET, arrayNodeConstructorMethod)
val variableStmt = GeneralUtils.declS(variableDeclaration, asListExpression)
codeBlock.addStatement(variableStmt)
//
// Set the codeBlock back to the closure
closure.code = codeBlock

Parameterized FIFO in Chisel

I was going through the Chisel 2.2 Tutorial manual (I am aware that Chisel 3 is out in a beta version, but I am required to use Chisel 2.2 to extend some previously implemented modules).
I have been looking for examples of using DecoupledIO interface in Chisel and found a few in the tutorial mentioned above and also on StackOverflow.
One such example is the parameterized FIFO example in the Chisel Tutorial manual, whose implementation is:
class Fifo[T <: Data] (type: T, n: Int) extends Module {
  val io = new Bundle {
    val enq_val = Bool(INPUT)
    val enq_rdy = Bool(OUTPUT)
    val deq_val = Bool(OUTPUT)
    val deq_rdy = Bool(INPUT)
    val enq_dat = type.asInput
    val deq_dat = type.asOutput
  }
  val enq_ptr = Reg(init = UInt(0, sizeof(n)))
  val deq_ptr = Reg(init = UInt(0, sizeof(n)))
  val is_full = Reg(init = Bool(false))
  val do_enq = io.enq_rdy && io.enq_val
  val do_deq = io.deq_rdy && io.deq_val
  val is_empty = !is_full && (enq_ptr === deq_ptr)
  val deq_ptr_inc = deq_ptr + UInt(1)
  val enq_ptr_inc = enq_ptr + UInt(1)
  val is_full_next = Mux(do_enq && ~do_deq && (enq_ptr_inc === deq_ptr), Bool(true),
    Mux(do_deq && is_full, Bool(false), is_full))
  enq_ptr := Mux(do_enq, enq_ptr_inc, enq_ptr)
  deq_ptr := Mux(do_deq, deq_ptr_inc, deq_ptr)
  is_full := is_full_next
  val ram = Mem(n)
  when (do_enq) {
    ram(enq_ptr) := io.enq_dat
  }
  io.enq_rdy := !is_full
  io.deq_val := !is_empty
  ram(deq_ptr) <> io.deq_dat
}
I understand most of the implementation, but not this code snippet:
val is_full_next = Mux(do_enq && ~do_deq && (enq_ptr_inc === deq_ptr), Bool(true), Mux(do_deq && is_full, Bool(false), is_full))
If I am interpreting it correctly, we are checking whether the next operation will make the FIFO full. If so, why are we checking whether enq_ptr_inc === deq_ptr?
If someone can share their view on how this works, I would like to hear it.
Also, I am not sure if this is the simplest way to implement a parameterized FIFO. I am working on my own implementation, but if there is an easier way to implement a FIFO (not even a parameterized one), it would help me clear my doubts.
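For reference, here is a plain-Scala software model of the same pointer bookkeeping (this is not Chisel code, the names are mine, and it assumes the depth n is a power of two so that the modulo-n wrap matches a log2(n)-bit hardware pointer):

// Software model of the FIFO's full/empty bookkeeping, for reasoning only.
class FifoModel(n: Int) {
  private var enqPtr = 0
  private var deqPtr = 0
  private var isFullFlag = false

  def isFull: Boolean = isFullFlag
  def isEmpty: Boolean = !isFullFlag && enqPtr == deqPtr

  def step(doEnq: Boolean, doDeq: Boolean): Unit = {
    val enqPtrInc = (enqPtr + 1) % n
    val deqPtrInc = (deqPtr + 1) % n
    // After an enqueue with no dequeue, the FIFO becomes full exactly when the
    // advanced enqueue pointer catches up with the dequeue pointer. Pointer
    // equality alone is ambiguous (it also describes the empty state), which is
    // why the separate full flag is kept.
    val isFullNext =
      if (doEnq && !doDeq && enqPtrInc == deqPtr) true
      else if (doDeq && isFullFlag) false
      else isFullFlag
    if (doEnq) enqPtr = enqPtrInc
    if (doDeq) deqPtr = deqPtrInc
    isFullFlag = isFullNext
  }
}

Stepping a depth-2 model with two back-to-back enqueues and no dequeues leaves the two pointers equal and the full flag set, which is how the design tells "full" apart from "empty" even though enq_ptr === deq_ptr in both cases.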

spark spelling correction via udf

I need to correct some spellings using Spark.
Unfortunately, a naive approach like
val misspellings3 = misspellings1
.withColumn("A", when('A === "error1", "replacement1").otherwise('A))
.withColumn("A", when('A === "error1", "replacement1").otherwise('A))
.withColumn("B", when(('B === "conditionC") and ('D === condition3), "replacementC").otherwise('B))
does not work with Spark; see How to add new columns based on conditions (without facing JaninoRuntimeException or OutOfMemoryError)?
The simple cases (the first 2 examples) can nicely be handled via
val spellingMistakes = Map(
  "error1" -> "fix1"
)
val spellingNameCorrection: (String => String) = (t: String) => {
  spellingMistakes.get(t) match {
    case Some(tt) => tt // correct spelling
    case None => t // keep original
  }
}
val spellingUDF = udf(spellingNameCorrection)
val misspellings1 = hiddenSeasonalities
  .withColumn("A", spellingUDF('A))
But I am unsure how to handle the more complex / chained conditional replacements in a UDF in a nice and generalizable manner.
If it is only a rather small list of spellings (< 50), would you suggest hard-coding them within a UDF?
You can make the UDF receive more than one column:
val spellingCorrection2 = udf((x: String, y: String) => if (x == "conditionC" && y == "conditionD") "replacementC" else x)
val misspellings3 = misspellings1.withColumn("B", spellingCorrection2($"B", $"C"))
To make this more general, you can use a map from a tuple of the two conditions to a string, the same as you did for the first case.
If you want to generalize it even more, you can use Dataset mapping: create a case class with the relevant columns, use as to convert the DataFrame to a Dataset of that case class, then use the Dataset's map and, inside it, pattern match on the input data to generate the relevant corrections, and finally convert back to a DataFrame.
This should be easier to write but would have a performance cost.
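A rough sketch of that Dataset-mapping idea, assuming made-up column names a, b, c, d and the same placeholder conditions as above (only the as/map/toDF pattern is the point here):

import org.apache.spark.sql.{DataFrame, SparkSession}

// Made-up case class standing in for the real row shape.
case class Record(a: String, b: String, c: String, d: String)

def correctSpellings(df: DataFrame)(implicit spark: SparkSession): DataFrame = {
  import spark.implicits._
  df.as[Record]
    .map {
      case r if r.a == "error1" => r.copy(a = "replacement1") // simple one-column fix
      case r if r.b == "conditionC" && r.d == "condition3" => r.copy(b = "replacementC") // chained multi-column condition
      case r => r // keep original
    }
    .toDF()
}

Note that a pattern match like this applies only the first matching correction per row; if several fixes can apply to the same row, fold them one after another instead.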
For now I will go with the following, which seems to work just fine and is more understandable: https://gist.github.com/rchukh/84ac39310b384abedb89c299b24b9306
Assume spellingMap is the map containing the correct spellings and df is the DataFrame.
val df: DataFrame = _
val spellingMap = Map.empty[String, String] //fill it up yourself
val columnsWithSpellingMistakes = List("abc", "def")
Write a UDF like this
def spellingCorrectionUDF(spellingMap: Map[String, String]) =
  udf[(String), Row]((value: Row) => {
    val cellValue = value.getString(0)
    if (spellingMap.contains(cellValue)) spellingMap(cellValue)
    else cellValue
  })
And finally, you can put it all together as
val newColumns = df.columns.map {
  case columnName =>
    if (columnsWithSpellingMistakes.contains(columnName)) spellingCorrectionUDF(spellingMap)(Column(columnName)).as(columnName)
    else Column(columnName)
}
df.select(newColumns:_*)

Sorting a table by date in Slick 3.1.x

I have the following Slick class that includes a date:
import java.sql.Date
import java.time.LocalDate

class ReportDateDB(tag: Tag) extends Table[ReportDateVO](tag, "report_dates") {
  def reportDate = column[LocalDate]("report_date")(localDateColumnType)

  def * = (reportDate) <> (ReportDateVO.apply, ReportDateVO.unapply)

  implicit val localDateColumnType = MappedColumnType.base[LocalDate, Date](
    d => Date.valueOf(d),
    d => d.toLocalDate
  )
}
When I attempt to sort the table by date:
val query = TableQuery[ReportDateDB]
val action = query.sortBy(_.reportDate).result
I get the following compilation error
not enough arguments for method sortBy: (implicit evidence$2: slick.lifted.Rep[java.time.LocalDate] ⇒
slick.lifted.Ordered)slick.lifted.Query[fdic.ReportDateDB,fdic.ReportDateDB#TableElementType,Seq].
Unspecified value parameter evidence$2.
No implicit view available from slick.lifted.Rep[java.time.LocalDate] ⇒ slick.lifted.Ordered.
How to specify the implicit default order?
You need to make your implicit val localDateColumnType available where you run the query. For example, this will work:
implicit val localDateColumnType = MappedColumnType.base[LocalDate, Date](
  d => Date.valueOf(d),
  d => d.toLocalDate)

val query = TableQuery[ReportDateDB]
val action = query.sortBy(_.reportDate).result
I'm not sure where the best place to put this is, but I usually put all these conversions in a package object.
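For example, something along these lines (a sketch: the package name fdic is taken from the error message, and the driver import should be whichever Slick 3.1 driver the project already uses, e.g. slick.driver.MySQLDriver):

import java.sql.Date
import java.time.LocalDate

import slick.driver.MySQLDriver.api._

// Keeping the mapping in a package object makes it implicitly available both
// where the table is defined and where the query is built, so sortBy can find
// an ordering for Rep[LocalDate].
package object fdic {
  implicit val localDateColumnType: BaseColumnType[LocalDate] =
    MappedColumnType.base[LocalDate, Date](d => Date.valueOf(d), d => d.toLocalDate)
}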
It should work as described here:
implicit def localDateOrdering: Ordering[LocalDate] = Ordering.fromLessThan(_ isBefore _)
Try adding this line to your import list:
import slick.driver.MySQLDriver.api._
