Why are methods created outside the main method not able to function inside a RDD? - apache-spark

singleRDD:RDD[(String)]
val xRDD = singleRDD.map(e=>{
func1(e.dest)
})
val xIterator = singleRDD.collectAsMap().map(e => {
func1(e.dest)
})
def func1(str:String):String = {
//some logic
var str = "what I want"
str
}
In this comparison, xRDD contains nothing, while xIterator contains what I want. func1 is defined outside the main method.
Why does this happen, and how do I fix it?
Thank you for your hints!

Related

Gatling Rest API Testing - retrieve a value from json response and add it to the list, iterate through list

I am new to Gatling and am trying to performance-test a couple of REST calls. In my scenario I need to extract a value from the JSON response of the first call and, after looping a few times, add those values to a list. I then want to reuse each value in my next REST call by iterating over the values in the list. Can anyone suggest how to implement this? I tried the following:
var datasetIdList = List.empty[String]
val datasetidsFeeder = datasetIdList.map(datasetId => Map("datasetId" -> datasetId)).iterator
def createData() = {
repeat(20){
feed("").exec(http("create dataset").post("/create/data").header("content-type", "application/json")
.body(StringBody("""{"name":"name"}"""))
.asJson.check(jsonPath("$.id").saveAs("userId"))))
.exec(session => { var usrid = session("userId").as[String].trim
datasetIdList:+= usrid
session})
}}
def upload()= feed(datasetidsFeeder).exec(http("file upload").post("/compute-metaservice/datasets/${datasetId}/uploadFile")
.formUpload("File","./src/test/resources/data/File.csv")
.header("content-type","multipart/form-data")
.check(status is 200))
val scn = scenario("create data and upload").exec(createData()).exec(upload())
setUp(scn.inject(atOnceUsers(1))).protocols(httpConf)
}
I am seeing an exception saying that ListFeeder is empty when I run the above script. Can someone please help?
Updated Code:
class ParallelcallsSimulation extends Simulation{
var idNumbers = (1 to 50).iterator
val customFeeder = Iterator.continually(Map(
"name" -> ("test_gatling_"+ idNumbers.next())
))
val httpConf = http.baseUrl("http://localhost:8080")
.header("Authorization","Bearer 6a4aee03-9172-4e31-a784-39dea65e9063")
def createDatasetsAndUpload() = {
repeat(3) {
//create dataset
feed(customFeeder).exec(http("create data").post("/create/data").header("content-type", "application/json")
.body(StringBody("""{ "name": "${name}","description": "create data and upload file"}"""))
.asJson.check(jsonPath("$.id").saveAs("userId")))
.exec(session => {
val name = session("name").asOption[String]
println(name.getOrElse("COULD NOT FIND NAME"))
val userId = session("userId").as[String].trim
println("%%%%% User ID ====>"+userId)
val datasetIdList = session("datasetIdList").asOption[List[_]].getOrElse(Nil)
session.set("datasetIdList", userId :: datasetIdList)
})
}
}
// File Upload
def fileUpload() = foreach("${datasetIdList}","datasetId"){
exec(http("file upload").post("/uploadFile")
.formUpload("File","./src/test/resources/data/File.csv")
.header("content-type","multipart/form-data")
.check(status is 200))
}
def getDataSetId() = foreach("${datasetIdList}","datasetId"){
exec(http("get datasetId")
.get("/get/data/${datasetId}")
.header("content-type","application/json")
.asJson.check(jsonPath("$.dlp.dlp_job_status").optional
.saveAs("dlpJobStatus")).check(status is 200)
).exec(session => {
val datastId = session("datasetId").asOption[String]
println("request for datasetId >>>>>>>>"+datastId.getOrElse("datasetId not found"))
val jobStatus = session("dlpJobStatus").asOption[String]
println("JOB STATUS:::>>>>>>>>>>"+jobStatus.getOrElse("Dlp Job Status not Found"))
println("Time: >>>>>>"+System.currentTimeMillis())
session
}).pause(10)
}
val scn1 = scenario("create multiple datasets and upload").exec(createDatasetsAndUpload()).exec(fileUpload())
val scn2 = scenario("get datasetId").pause(100).exec(getDataSetId())
setUp(scn1.inject(atOnceUsers(1)),scn2.inject(atOnceUsers(1))).protocols(httpConf)
}
I see below error when I try to execute above script
[ERROR] i.g.c.s.LoopBlock$ - Condition evaluation crashed with message 'No attribute named 'datasetIdList' is defined', exiting loop
var datasetIdList = List.empty[String] defines a mutable variable pointing to an immutable list.
val datasetidsFeeder = datasetIdList.map(datasetId => Map("datasetId" -> datasetId)).iterator uses that immutable list as it exists at that moment, when it is still empty. Later reassignments of datasetIdList have no effect on datasetidsFeeder.
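The snapshot behaviour described above can be reproduced without Gatling. The following is a minimal sketch in plain Scala; the object name FeederSnapshotDemo and the id "abc-123" are made up for illustration.

```scala
object FeederSnapshotDemo {
  // Same shape as in the question: a mutable variable holding an immutable List.
  var datasetIdList: List[String] = List.empty[String]

  // The iterator is built from the list as it is right now, which is empty.
  val feeder: Iterator[Map[String, String]] =
    datasetIdList.map(id => Map("datasetId" -> id)).iterator

  // :+= builds a new list and reassigns the variable; the iterator above
  // still refers to the old, empty list.
  datasetIdList :+= "abc-123"

  def main(args: Array[String]): Unit = {
    println(datasetIdList)  // the variable now holds one element
    println(feeder.hasNext) // the feeder is still empty
  }
}
```

This is why the feeder reports that it is empty even though the list variable has been updated by the time the upload step runs.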
Mutating a global variable with your virtual user is usually not a good idea.
You can save the value into the user's session instead.
In the exec block, you can write:
val userId = session("userId").as[String].trim
val datasetIdList = session("datasetIdList").asOption[List[_]].getOrElse(Nil)
session.set("datasetIdList", userId :: datasetIdList)
Then you can use foreach to iterate them all without using a feeder at all.
foreach("${datasetIdList}", "datasetId") {
exec(http("file upload")
...
}
You should put more work in your question.
Your code is not syntax-highlighted, and is formatted poorly.
You said "I am seeing an exception that ListFeeder is empty" but the words "ListFeeder" are not seen anywhere.
You should post the error message so that it's easier to see what went wrong.
In the documentation linked, there is a Warning. Quoted below:
Session instances are immutable!
Why is that so? Because Sessions are messages that are dealt with in a multi-threaded concurrent way, so immutability is the best way to deal with state without relying on synchronization and blocking.
A very common pitfall is to forget that set and setAll actually return new instances.
This is why the code in the updated question doesn't update the list.
session => {
...
session.set("datasetIdList", userId :: datasetIdList)
println("%%%% List =====>>>" + datasetIdList.toString())
session
}
The updated session is simply discarded, and the anonymous function returns the original session.
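To make the pitfall concrete, here is a small sketch that uses a hypothetical MiniSession case class as a stand-in for an immutable session. It is not Gatling's Session class; it only mimics the property that set returns a new instance.

```scala
// Hypothetical stand-in for an immutable session (NOT Gatling's actual class).
final case class MiniSession(attributes: Map[String, Any] = Map.empty) {
  // Like an immutable session, set returns a NEW instance and leaves this one untouched.
  def set(key: String, value: Any): MiniSession =
    copy(attributes = attributes + (key -> value))

  def contains(key: String): Boolean = attributes.contains(key)
}

object SessionPitfallDemo {
  // Pitfall: the result of set is thrown away and the original session is returned.
  def discardsUpdate(session: MiniSession): MiniSession = {
    session.set("datasetIdList", List("id-1"))
    session
  }

  // Fix: return the instance produced by set.
  def keepsUpdate(session: MiniSession): MiniSession =
    session.set("datasetIdList", List("id-1"))

  def main(args: Array[String]): Unit = {
    println(discardsUpdate(MiniSession()).contains("datasetIdList")) // false
    println(keepsUpdate(MiniSession()).contains("datasetIdList"))    // true
  }
}
```

In a session function the same rule applies: make the call to set the last expression, or bind its result to a val and return that.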

Why is array.push() not working correctly?

I have a function which returns an array of dishes from a firestore database.
With console.log I check that the dish I want to push is correctly formatted, then push it.
Finally I console.log the array to check if everything is alright.
Here is what I got:
https://image.noelshack.com/fichiers/2019/05/5/1549048418-arraypush.png
switch (type) {
case "Plats": {
this.nourritureCollection.ref.get().then(data => {
let platArray : Plat[] = [];
data.docs.forEach(doc => {
this.plat.name = doc.data().nourritureJson.name;
this.plat.price = doc.data().nourritureJson.price;
this.plat.ingredients = doc.data().nourritureJson.ingredients;
this.plat.type = doc.data().nourritureJson.type;
this.plat.availableQuantity = doc.data().nourritureJson.availableQuantity;
this.plat.isAvailableOffMenu = doc.data().nourritureJson.isAvailableOffMenu;
this.plat.imgUrl = doc.data().nourritureJson.imgUrl;
this.plat.temp = doc.data().nourritureJson.temp;
console.log(this.plat)
platArray.push(this.plat);
});
console.log(platArray)
return platArray;
});
break;
}...
plat is instantiated within my service component, I couldn't declare a new Plat() inside my function.
The expected result is that dishes should be different in my array of dishes.
You are updating the same this.plat object in every iteration, so the array ends up holding n references to that one object. Every element therefore shows the values from the last update of this.plat.
What you need is to create a new Plat object in every iteration.
data.docs.forEach(doc => {
let plat: Plat = new Plat();
plat.name = doc.data().nourritureJson.name;
plat.price = doc.data().nourritureJson.price;
plat.ingredients = doc.data().nourritureJson.ingredients;
plat.type = doc.data().nourritureJson.type;
plat.availableQuantity = doc.data().nourritureJson.availableQuantity;
plat.isAvailableOffMenu = doc.data().nourritureJson.isAvailableOffMenu;
plat.imgUrl = doc.data().nourritureJson.imgUrl;
plat.temp = doc.data().nourritureJson.temp;
console.log(plat)
platArray.push(plat);
});
As pointed out in the comments, you can only use new Plat() if Plat is a class. If it is an interface, a bare let plat: Plat; is not enough, because it leaves plat undefined; assign an object literal containing the fields instead.

Scala - Remove all elements in a list/map of strings from a single String

Working on an internal website where the URL contains the source reference from other systems. This is a business requirement and cannot be changed.
i.e. "http://localhost:9000/source.address.com/7808/project/repo"
"http://localhost:9000/build.address.com/17808/project/repo"
I need to remove these strings from the "project/repo" strings/variables using a trait, so that this can be used natively from multiple services. I also want to be able to add more sources to this list (which already exists) without modifying the method.
"def normalizePath" is the method accessed by services. I have made two non-ideal but reasonable attempts so far. I am getting stuck on using foldLeft, and I would like some help with that or with a simpler way of doing what I described. Code samples are below.
1st attempt using an if-else (not ideal as need to add more if/else statements down the line and less readable than pattern match)
trait NormalizePath {
def normalizePath(path: String): String = {
if (path.startsWith("build.address.com/17808")) {
path.substring("build.address.com/17808".length, path.length)
} else {
path
}
}
}
and 2nd attempt (not ideal as likely more patterns will get added and it generates more bytecode than if/else)
trait NormalizePath {
val pattern = "build.address.com/17808/"
val pattern2 = "source.address.com/7808/"
def normalizePath(path: String) = path match {
case s if s.startsWith(pattern) => s.substring(pattern.length, s.length)
case s if s.startsWith(pattern2) => s.substring(pattern2.length, s.length)
case _ => path
}
}
Last attempt is to use an address list(already exists elsewhere but defined here as MWE) to remove occurrences from the path string and it doesn't work:
trait NormalizePath {
val replacements = (
"build.address.com/17808",
"source.address.com/7808/")
private def remove(path: String, string: String) = {
path-string
}
def normalizePath(path: String): String = {
replacements.foldLeft(path)(remove)
}
}
Appreciate any help on this!
If you are just stripping out those strings:
val replacements = Seq(
"build.address.com/17808",
"source.address.com/7808/")
replacements.foldLeft("http://localhost:9000/source.address.com/7808/project/repo"){
case(path, toReplace) => path.replaceAll(toReplace, "")
}
// http://localhost:9000/project/repo
If you are replacing those strings with something else:
val replacementsMap = Seq(
"build.address.com/17808" -> "one",
"source.address.com/7808/" -> "two/")
replacementsMap.foldLeft("http://localhost:9000/source.address.com/7808/project/repo"){
case(path, (toReplace, replacement)) => path.replaceAll(toReplace, replacement)
}
// http://localhost:9000/two/project/repo
The replacements collection can come from elsewhere in the code and will not need to be redeployed.
// method replacing by empty string
def normalizePath(path: String) = {
replacements.foldLeft(path){
case(startingPoint, toReplace) => startingPoint.replaceAll(toReplace, "")
}
}
normalizePath("foobar/build.address.com/17808/project/repo")
// foobar/project/repo
normalizePath("whateverPath")
// whateverPath
normalizePath("build.address.com/17808build.address.com/17808/project/repo")
// /project/repo
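One caveat about the replaceAll calls above: String.replaceAll treats its first argument as a regular expression, so each "." in the entries matches any character. If the entries are meant as literal text, String.replace (which does a literal replacement) may be the safer choice. A minimal sketch comparing the two, with a deliberately odd input made up for illustration:

```scala
object LiteralReplaceDemo {
  val replacements: Seq[String] = Seq(
    "build.address.com/17808",
    "source.address.com/7808/")

  // replaceAll interprets each entry as a regex: "." matches ANY character.
  def normalizeRegex(path: String): String =
    replacements.foldLeft(path)((acc, r) => acc.replaceAll(r, ""))

  // replace treats each entry as literal text.
  def normalizeLiteral(path: String): String =
    replacements.foldLeft(path)((acc, r) => acc.replace(r, ""))

  def main(args: Array[String]): Unit = {
    val odd = "buildXaddressYcom/17808/project/repo" // made-up input, no dots
    println(normalizeRegex(odd))   // /project/repo (regex dots matched X and Y)
    println(normalizeLiteral(odd)) // unchanged
  }
}
```

For the URLs in the question both variants give the same result; the difference only shows up when an input happens to match the pattern with other characters in place of the dots.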
A very simple replacement could be made as follows:
val replacements = Seq(
"build.address.com/17808",
"source.address.com/7808/")
def normalizePath(path: String): String = {
  replacements.find(path.startsWith(_))            // find the first occurrence
    .map(prefix => path.substring(prefix.length))  // remove the prefix
    .getOrElse(path)                               // if not found, return the original string
}
Since the expected replacements are very similar, have you tried to generalize them and use regex matching?
There are a million and one ways to extract /project/repo from a String in Scala. Here are a few I came up with:
val list = List("build.address.com/17808", "source.address.com/7808") //etc
def normalizePath(path: String) = {
path.stripPrefix(list.find(x => path.contains(x)).getOrElse(""))
}
Output:
scala> normalizePath("build.address.com/17808/project/repo")
res0: String = /project/repo
val list = List("build.address.com/17808", "source.address.com/7808") //etc
def normalizePath(path: String) = {
list.map(x => if (path.contains(x)) {
path.takeRight(path.length - x.length)
}).filter(y => y != ()).head
}
Output:
scala> normalizePath("build.address.com/17808/project/repo")
res0: Any = /project/repo
val list = List("build.address.com/17808", "source.address.com/7808") //etc
def normalizePath(path: String) = {
list.foldLeft(path)((a, b) => a.replace(b, ""))
}
Output:
scala> normalizePath("build.address.com/17808/project/repo")
res0: String = /project/repo
Depends how complicated you want your code to look (or how silly you want to be), really. Note that the second example has return type Any, which might not be ideal for your scenario. Also, these examples aren't meant to be able to just take the String out of the middle of your path... they can be fairly easily modified if you want to do that though. Let me know if you want me to add some examples just stripping things like build.address.com/17808 out of a String - I'd be happy to do so.

Mess with async function with two threads in Play app

I am building a service method that is supposed to build a CSV file. The titles of the file and the results come from different threads.
def buildCsv(template: Template) : Future[TemporaryFile] = {
val schemaFuture = dbViewSchemaRepository.findOneByTemplateId(template.id)
val resultsFuture = checklistResultRepository.findAllByTemplateId(template.id)
schemaFuture flatMap { optSchema =>
val schema = optSchema match {
case Some(sch : Schema) => sch
case _ => throw UnexpectedException("Schema not found")
}
//get the titles
val titles = buildTitles(schema)
// create temp file
val tempFile = TemporaryFile("test", ".csv")
logger.info("Absolute path: " + tempFile.file.getAbsolutePath)
// start writing results
resultsFuture map { results =>
results foreach { result =>
val resultRow = buildResultRow(result, schema)
tempFile.file.writeCsv(List(resultRow), ',', titles)
}
tempFile
}
}
}
I've built a pretty simple test:
var dbViewSchemaRepo = mock[DbViewSchemaRepository]
doReturn(Future(schema)).when(dbViewSchemaRepo).findOneByTemplateId(schema.templateId)
var checklistResultRepo = mock[ChecklistResultRepository]
doReturn(Future(List(result))).when(checklistResultRepo).findAllByTemplateId(schema.templateId)
val template = mock[Template]
template.id returns schema.templateId
var srv = new ChecklistResultsExportService(dbViewSchemaRepo, checklistResultRepo)
When I run it I get an error:
[error] services.data.model.ChecklistSchema$Schema cannot be cast
to scala.Option (ChecklistResultsExportService.scala:38) [error]
services.data.ChecklistResultsExportService$$anonfun$buildCsv$1.apply(ChecklistResultsExportService.scala:38)
Line 38 is this:
schemaFuture flatMap { optSchema
What am I missing?
Thanks.
My problem was the way I was mocking the response from the repository objects. I should have done it like this:
doReturn(Future.successful(Some(schema))).when(dbViewSchemaRepo).findOneByTemplateId(schema.templateId)
doReturn(Future.successful(List(result))).when(checklistResultRepo).findAllByTemplateId(schema.templateId)
It seems to be working now.
Thanks.
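A likely reason the original mock failed only at runtime is type erasure: doReturn is untyped, so a Future holding a bare Schema was accepted where a Future[Option[Schema]] was expected, and the mismatch surfaced only when flatMap handed the value to code expecting an Option. The sketch below imitates that situation in plain Scala with an unchecked cast; String stands in for Schema, and all names are hypothetical.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.Try

object ErasedMockDemo {
  // Stand-in for the repository's declared return type (String replaces Schema).
  type SchemaResult = Future[Option[String]]

  // Imitates the untyped mock: the cast compiles because type arguments are erased.
  val wrong: SchemaResult = Future.successful("schema").asInstanceOf[SchemaResult]

  // The corrected mock value wraps the schema in Some.
  val right: SchemaResult = Future.successful(Some("schema"))

  // Same shape as buildCsv: flatMap over the Option inside the Future.
  def describe(f: SchemaResult): Try[String] =
    Try(Await.result(f.flatMap {
      case Some(s) => Future.successful(s)
      case None    => Future.failed(new NoSuchElementException("Schema not found"))
    }, 5.seconds))
}
```

Under these assumptions, describe(right) succeeds, while describe(wrong) fails with a ClassCastException, which mirrors the "cannot be cast to scala.Option" error in the question.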

Returning a list of computations from a method that uses a sequence of Futures

I want to return the list of the computations from a method that uses a list of Futures:
def foo: List[Long] = {
val res = List(1, 2, 3) map {
x => Future { someCalculation(x) }
}
Future.sequence(res)
// what to do next?
}
def someCalculation(a: Int): Long = //....
How can I do this?
There is a key point to understand when it comes to futures: if you want to go from Future[T] to T, you need to await the result of the operation, but this blocks a thread, so you should avoid it where you can. The correct approach is to keep working with asynchronous abstractions as much as you can, and move blocking up your call stack.
The Future class has a number of methods you can use to chain other async operations, such as map, flatMap and onComplete (onSuccess was deprecated in Scala 2.12 and removed in 2.13).
If you really need to wait for the result, then there is Await.result:
// requires: import scala.concurrent.{Await, Future}, import scala.concurrent.duration._
// and an implicit ExecutionContext in scope
val listOfFutures: List[Future[Long]] = List(1, 2, 3) map {
  x => Future { someCalculation(x) }
}
// now we have a Future[List[Long]]
val futureList: Future[List[Long]] = Future.sequence(listOfFutures)
// Keep being async here and compute the results asynchronously.
// map on a Future lets you pass f: A => B to a Future[A] and obtain a Future[B].
// Here we call the sum method on the list.
val yourOperationAsync: Future[Long] = futureList.map(_.sum)
// Do this only when you need to get the result
val result: Long = Await.result(yourOperationAsync, 1.second)
Well, the whole point of using a Future is to keep the computation asynchronous, i.e.
def foo: Future[List[Long]] = {
  val res = List(1, 2, 3) map {
    x => Future { someCalculation(x) }
  }
  Future.sequence(res)
}
This would be the ideal solution. But if you do need to block, you can wait for the result and then return it:
val ans = Future.sequence(res)
Await.result(ans, Duration.Inf)
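Putting the two answers together, a self-contained sketch might look like this. The body of someCalculation is a placeholder (the question elides it), and the 5-second timeout and the name fooBlocking are arbitrary choices for illustration.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FutureListDemo {
  // Placeholder for the calculation elided in the question.
  def someCalculation(a: Int): Long = a * 10L

  // Preferred: stay asynchronous and let the caller decide when to wait.
  def foo: Future[List[Long]] =
    Future.sequence(List(1, 2, 3).map(x => Future(someCalculation(x))))

  // Only if a plain List[Long] is really required: block at the outermost edge.
  def fooBlocking: List[Long] = Await.result(foo, 5.seconds)

  def main(args: Array[String]): Unit =
    println(fooBlocking)
}
```

Future.sequence preserves the order of the input list, so the results line up with 1, 2, 3 regardless of which future completes first.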
