JMeter logs are missing - Groovy

I am not able to see log messages either in the Log Viewer or in jmeter.log. I am writing a simple JMeter Groovy script in a JSR223 Listener.
org.apache.jorphan.logging.LoggingManager;
private static final Logger log = LoggingManager.getLoggerForClass();
log.info("your log message");
log.error("your error message");
println("something");
System.Out.Println("other thing");
Logs
2021-01-19 01:23:28,999 INFO o.a.j.e.StandardJMeterEngine: Running the test!
2021-01-19 01:23:29,015 INFO o.a.j.s.SampleEvent: List of sample_variables: []
2021-01-19 01:23:29,015 INFO o.a.j.g.u.JMeterMenuBar: setRunning(true, *local*)
2021-01-19 01:23:29,050 INFO o.a.j.e.StandardJMeterEngine: Starting ThreadGroup: 1 : Thread Group
2021-01-19 01:23:29,050 INFO o.a.j.e.StandardJMeterEngine: Starting 1 threads for group Thread Group.
2021-01-19 01:23:29,050 INFO o.a.j.e.StandardJMeterEngine: Thread will continue on error
2021-01-19 01:23:29,066 INFO o.a.j.t.ThreadGroup: Starting thread group... number=1 threads=1 ramp-up=1 delayedStart=false
2021-01-19 01:23:29,066 INFO o.a.j.t.ThreadGroup: Started thread group number 1
2021-01-19 01:23:29,066 INFO o.a.j.e.StandardJMeterEngine: All thread groups have been started
2021-01-19 01:23:29,066 INFO o.a.j.t.JMeterThread: Thread started: Thread Group 1-1
2021-01-19 01:23:29,066 INFO o.a.j.t.JMeterThread: Thread is done: Thread Group 1-1
2021-01-19 01:23:29,066 INFO o.a.j.t.JMeterThread: Thread finished: Thread Group 1-1
2021-01-19 01:23:29,066 INFO o.a.j.e.StandardJMeterEngine: Notifying test listeners of end of test
2021-01-19 01:23:29,066 INFO o.a.j.g.u.JMeterMenuBar: setRunning(false, *local*)
I expect to see the log and error messages at least, but none come through. Also, I tried the JMeter 5.4.1 snapshot and got the same behavior. I am on Windows OS with JMeter 5.4. Any clue what I might be missing?

You don't need the first two lines; log is a pre-defined variable for the JSR223 Test Elements, see the Top 8 JMeter Java Classes You Should Be Using with Groovy article for more details.
The last line should look like System.out.println("other thing");
The last two lines don't print to the log file; they print to STDOUT.
You should see an error message, something like unable to resolve class Logger; if you don't see it, your listener is not being executed at all.
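For reference, a minimal sketch of what the JSR223 Listener body could look like after applying the points above (Groovy assumed as the scripting language; log and OUT are among the variables JMeter pre-defines for JSR223 elements):
log.info("your log message")        // goes to jmeter.log and the Log Viewer
log.error("your error message")     // same destinations, at ERROR level
OUT.println("something")            // goes to STDOUT, i.e. the console JMeter was started from
System.out.println("other thing")   // also goes to STDOUT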

Related

Livy session cannot get SparkContext, stuck on starting

I am trying to use PySpark on JupyterHub (which runs on Kubernetes) for interactive programming against a remote Spark cluster, also on Kubernetes. So I use sparkmagic and Livy (which runs on Kubernetes, too).
When I try to get the SparkContext and SparkSession in the notebook, the session stays stuck in 'starting' status until the Livy session dies.
My spark-driver-pod is running, and I can see this log:
53469 [pool-8-thread-1] INFO org.apache.livy.rsc.driver.SparkEntries - Spark context finished initialization in 34532ms
53625 [pool-8-thread-1] INFO org.apache.livy.rsc.driver.SparkEntries - Created Spark session.
128775 [dispatcher-CoarseGrainedScheduler] INFO org.apache.spark.scheduler.cluster.k8s.KubernetesClusterSchedulerBackend$KubernetesDriverEndpoint - Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.83.128.194:35040) with ID 1, ResourceProfileId 0
128927 [dispatcher-BlockManagerMaster] INFO org.apache.spark.storage.BlockManagerMasterEndpoint - Registering block manager 10.83.128.194:42385 with 4.6 GiB RAM, BlockManagerId(1, 10.83.128.194, 42385, None)
131902 [dispatcher-CoarseGrainedScheduler] INFO org.apache.spark.scheduler.cluster.k8s.KubernetesClusterSchedulerBackend$KubernetesDriverEndpoint - Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.83.128.130:58232) with ID 2, ResourceProfileId 0
132041 [dispatcher-BlockManagerMaster] INFO org.apache.spark.storage.BlockManagerMasterEndpoint - Registering block manager 10.83.128.130:37991 with 4.6 GiB RAM, BlockManagerId(2, 10.83.128.130, 37991, None)
My spark-executor-pod is also running.
This is my livy-server's log:
2022-05-19 08:36:54,959 DEBUG LivySession Session 0 in state starting. Sleeping 2 seconds.
2022-05-19 08:36:56,969 DEBUG LivySession Session 0 in state starting. Sleeping 2 seconds.
2022-05-19 08:36:58,979 DEBUG LivySession Session 0 in state starting. Sleeping 2 seconds.
2022-05-19 08:37:01,002 DEBUG LivySession Session 0 in state starting. Sleeping 2 seconds.
2022-05-19 08:37:03,015 ERROR LivySession Session 0 did not reach idle status in time. Current status is starting.
2022-05-19 08:37:03,016 INFO EventsHandler InstanceId: 0139a7a9-a0b5-439e-84f5-a9ca6c896360,EventName: notebookSessionCreationEnd,Timestamp: 2022-05-19 08:37:03.016038,SessionGuid: 14da96d9-8b24-4beb-a5ad-a32009c9f772,LivyKind: pyspark,SessionId: 0,Status: starting,Success: False,ExceptionType: LivyClientTimeoutException,ExceptionMessage: Session 0 did not start up in 600 seconds.
2022-05-19 08:37:03,016 INFO EventsHandler InstanceId: 0139a7a9-a0b5-439e-84f5-a9ca6c896360,EventName: notebookSessionDeletionStart,Timestamp: 2022-05-19 08:37:03.016288,SessionGuid: 14da96d9-8b24-4beb-a5ad-a32009c9f772,LivyKind: pyspark,SessionId: 0,Status: starting
2022-05-19 08:37:03,016 DEBUG LivySession Deleting session '0'
2022-05-19 08:37:03,037 INFO EventsHandler InstanceId: 0139a7a9-a0b5-439e-84f5-a9ca6c896360,EventName: notebookSessionDeletionEnd,Timestamp: 2022-05-19 08:37:03.036919,SessionGuid: 14da96d9-8b24-4beb-a5ad-a32009c9f772,LivyKind: pyspark,SessionId: 0,Status: dead,Success: True,ExceptionType: ,ExceptionMessage:
2022-05-19 08:37:03,037 ERROR SparkMagics Error creating session: Session 0 did not start up in 600 seconds.
Please tell me how I can solve this problem, thanks!
My Spark version: 3.2.1
Livy version: 0.8.0

All executors finish with state KILLED and exitStatus 1

I am trying to set up a local Spark cluster. I am using Spark 2.4.4 on a Windows 10 machine.
To start the master and one worker I do
spark-class org.apache.spark.deploy.master.Master
spark-class org.apache.spark.deploy.worker.Worker 172.17.1.230:7077
After submitting an application to the cluster, it finishes successfully but in the Spark web admin UI it says that the application is KILLED. It's also what I get from worker logs. I have tried running my own examples and examples included in the Spark installation. They all get killed with exitStatus 1.
To start the JavaSparkPi example from the Spark installation folder:
Spark> spark-submit --master spark://172.17.1.230:7077 --class org.apache.spark.examples.JavaSparkPi .\examples\jars\spark-examples_2.11-2.4.4.jar
Part of the log output after the calculation finishes:
20/01/19 18:55:11 INFO DAGScheduler: Job 0 finished: reduce at JavaSparkPi.java:54, took 4.183853 s
Pi is roughly 3.13814
20/01/19 18:55:11 INFO SparkUI: Stopped Spark web UI at http://Nikola-PC:4040
20/01/19 18:55:11 INFO StandaloneSchedulerBackend: Shutting down all executors
20/01/19 18:55:11 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Asking each executor to shut down
20/01/19 18:55:11 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
20/01/19 18:55:11 WARN TransportChannelHandler: Exception in connection from /172.17.1.230:58560
java.io.IOException: An existing connection was forcibly closed by the remote host
The stderr log of the completed application shows this at the end:
20/01/19 18:55:11 INFO Executor: Finished task 1.0 in stage 0.0 (TID 1). 910 bytes result sent to driver
20/01/19 18:55:11 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 910 bytes result sent to driver
20/01/19 18:55:11 INFO CoarseGrainedExecutorBackend: Driver commanded a shutdown
The worker log outputs
20/01/19 18:55:06 INFO ExecutorRunner: Launch command: "C:\Program Files\Java\jdk1.8.0_231\bin\java" "-cp" "C:\Users\nikol\Spark\bin\..\conf\;C:\Users\nikol\Spark\jars\*" "-Xmx1024M" "-Dspark.driver.port=58484" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "spark://CoarseGrainedScheduler#Nikola-PC:58484" "--executor-id" "0" "--hostname" "172.17.1.230" "--cores" "12" "--app-id" "app-20200119185506-0001" "--worker-url" "spark://Worker#172.17.1.230:58069"
20/01/19 18:55:11 INFO Worker: Asked to kill executor app-20200119185506-0001/0
20/01/19 18:55:11 INFO ExecutorRunner: Runner thread for executor app-20200119185506-0001/0 interrupted
20/01/19 18:55:11 INFO ExecutorRunner: Killing process!
20/01/19 18:55:11 INFO Worker: Executor app-20200119185506-0001/0 finished with state KILLED exitStatus 1
I have tried Spark 2.4.4 for Hadoop 2.6 and 2.7. The problem remains in both cases.
This problem is the same as this one.

Why am I getting "Removing worker because we got no heartbeat in 60 seconds" on Spark master

I think I might have stumbled across a bug and wanted to get other people's input. I am running a PySpark application using Spark 2.2.0 in standalone mode. I am doing a somewhat heavy transformation in Python inside a flatMap and the driver keeps killing the workers.
Here is what I am seeing:
After 60 seconds of not seeing any heartbeat message from the workers, the master prints this message to the log:
Removing worker [worker name] because we got no heartbeat in 60 seconds
Removing worker [worker name] on [IP]:[port]
Telling app of lost executor: [executor number]
I then see in the driver log the following message:
Lost executor [executor number] on [executor IP]: worker lost
The worker then terminates and I see this message in its log:
Driver commanded a shutdown
I have looked at the Spark source code and from what I can tell, as long as the executor is alive it should send a heartbeat message back as it is using a ThreadUtils.newDaemonSingleThreadScheduledExecutor to do this.
One other thing that I noticed while I was running top on one of the workers is that the executor JVM seems to be suspended throughout this process. There are as many Python processes as I specified in the SPARK_WORKER_CORES env variable and each is consuming close to 100% of the CPU.
Anyone have any thoughts on this?
I was facing this same issue; increasing the intervals worked.
Excerpt from the start-all.sh logs:
INFO Utils: Successfully started service 'sparkMaster' on port 7077.
INFO Master: Starting Spark master at spark://master:7077
INFO Master: Running Spark version 3.0.1
INFO Utils: Successfully started service 'MasterUI' on port 8080.
INFO MasterWebUI: Bound MasterWebUI to 0.0.0.0, and started at http://master:8080
INFO Master: I have been elected leader! New state: ALIVE
INFO Master: Registering worker slave01:41191 with 16 cores, 15.7 GiB RAM
INFO Master: Registering worker slave02:37853 with 16 cores, 15.7 GiB RAM
WARN Master: Removing worker-20210618205117-slave01-41191 because we got no heartbeat in 60 seconds
INFO Master: Removing worker worker-20210618205117-slave01-41191 on slave01:41191
INFO Master: Telling app of lost worker: worker-20210618205117-slave01-41191
WARN Master: Removing worker-20210618204723-slave02-37853 because we got no heartbeat in 60 seconds
INFO Master: Removing worker worker-20210618204723-slave02-37853 on slave02:37853
INFO Master: Telling app of lost worker: worker-20210618204723-slave02-37853
WARN Master: Got heartbeat from unregistered worker worker-20210618205117-slave01-41191. This worker was never registered, so ignoring the heartbeat.
WARN Master: Got heartbeat from unregistered worker worker-20210618204723-slave02-37853. This worker was never registered, so ignoring the heartbeat.
Solution: add the following configs to $SPARK_HOME/conf/spark-defaults.conf
spark.network.timeout 50000
spark.executor.heartbeatInterval 5000
spark.worker.timeout 5000
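If you prefer not to touch spark-defaults.conf, the first two properties can also be passed per application on the spark-submit command line, for example (a sketch only; the master URL matches the logs above and your_app.py is a placeholder for your application). spark.worker.timeout is read by the standalone master, so it is best left in spark-defaults.conf as shown above:
spark-submit \
  --master spark://master:7077 \
  --conf spark.network.timeout=50000 \
  --conf spark.executor.heartbeatInterval=5000 \
  your_app.py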

Spark Jobserver fail just by receiving a job request

Jobserver 0.7.0 has 4 GB of RAM available and 10 GB for the context; the system has 3 more GB free. The context had been running for a while and, at the moment it received a request, it failed without any error. The request is the same as other ones that had been processed while it was up; it is not a special one. The following log corresponds to the Jobserver log and, as you can see, the last successful job finished at 03:08:23,341 and when the next one was received the driver commanded a shutdown.
[2017-05-16 03:08:23,340] INFO output.FileOutputCommitter [] [] - Saved output of task 'attempt_201705160308_0321_m_000199_0' to file:/value_iq/spark-warehouse/spark_cube_users_v/tenant_id=7/_temporary/0/task_201705160308_0321_m_000199
[2017-05-16 03:08:23,340] INFO pred.SparkHadoopMapRedUtil [] [] - attempt_201705160308_0321_m_000199_0: Committed
[2017-05-16 03:08:23,341] INFO he.spark.executor.Executor [] [] - Finished task 199.0 in stage 321.0 (TID 49474). 2738 bytes result sent to driver
[2017-05-16 03:39:02,195] INFO arseGrainedExecutorBackend [] [] - Driver commanded a shutdown
[2017-05-16 03:39:02,239] INFO storage.memory.MemoryStore [] [] - MemoryStore cleared
[2017-05-16 03:39:02,254] INFO spark.storage.BlockManager [] [] - BlockManager stopped
[2017-05-16 03:39:02,363] ERROR arseGrainedExecutorBackend [] [] - RECEIVED SIGNAL TERM
[2017-05-16 03:39:02,404] INFO k.util.ShutdownHookManager [] [] - Shutdown hook called
[2017-05-16 03:39:02,412] INFO k.util.ShutdownHookManager [] [] - Deleting directory /tmp/spark-556033e2-c456-49d6-a43c-ef2cd3494b71/executor-b3ceaf84-e66a-45ed-acfe-1052ab1de2f8/spark-87671e4f-54da-47d7-a077-eb5f75d07e39
The Spark worker just logs the following:
17/05/15 19:25:54 INFO ExternalShuffleBlockResolver: Registered executor AppExecId{appId=app-20170515192550-0004, execId=0} with ExecutorShuffleInfo{localDirs=[/tmp/spark-556033e2-c456-49d6-a43c-ef2cd3494b71/executor-b3ceaf84-e66a-45ed-acfe-1052ab1de2f8/blockmgr-eca888c0-4e63-421c-9e61-d959ee45f8e9], subDirsPerLocalDir=64, shuffleManager=org.apache.spark.shuffle.sort.SortShuffleManager}
17/05/16 03:39:02 INFO Worker: Asked to kill executor app-20170515192550-0004/0
17/05/16 03:39:02 INFO ExecutorRunner: Runner thread for executor app-20170515192550-0004/0 interrupted
17/05/16 03:39:02 INFO ExecutorRunner: Killing process!
17/05/16 03:39:02 INFO Worker: Executor app-20170515192550-0004/0 finished with state KILLED exitStatus 0
17/05/16 03:39:02 INFO Worker: Cleaning up local directories for application app-20170515192550-0004
17/05/16 03:39:07 INFO ExternalShuffleBlockResolver: Application app-20170515192550-0004 removed, cleanupLocalDirs = true
17/05/16 03:39:07 INFO ExternalShuffleBlockResolver: Cleaning up executor AppExecId{appId=app-20170515192550-0004, execId=0}'s 1 local dirs
And the Master log:
17/05/16 03:39:02 INFO Master: Received unregister request from application app-20170515192550-0004
17/05/16 03:39:02 INFO Master: Removing app app-20170515192550-0004
17/05/16 03:39:02 INFO Master: 157.97.107.150:33928 got disassociated, removing it.
17/05/16 03:39:02 INFO Master: 157.97.107.150:55444 got disassociated, removing it.
17/05/16 03:39:02 WARN Master: Got status update for unknown executor app-20170515192550-0004/0
Before receiving this request Spark wasn't executing any other job; the context was using 5.3 GB/10 GB and the driver 1.3 GB/4 GB.
What does "Driver commanded a shutdown" mean?
Is there any log property that can be changed to see more details in the logs?
How can a simple request just break the context?

giraph.numInputThreads: execution time for the "input superstep" is the same using 1 or 8 threads, how can this be possible?

I'm doing a BFS search through the Wikipedia (Spanish edition) site. I converted the dump into a file that can be read by Giraph.
Using 1 worker, a file of 1 GB took 452 seconds. I executed Giraph with this command:
/home/hadoop/bin/yarn jar /home/hadoop/giraph/giraph.jar ar.edu.info.unlp.tesina.lectura.grafo.BusquedaDeCaminosNavegacionalesWikiquote -vif ar.edu.info.unlp.tesina.vertice.estructuras.IdTextWithComplexValueInputFormat -vip /user/hduser/input/grafo-wikipedia.txt -vof ar.edu.info.unlp.tesina.vertice.estructuras.IdTextWithComplexValueOutputFormat -op /user/hduser/output/caminosNavegacionales -w 1 -yh 120000 -ca giraph.metrics.enable=true,giraph.useOutOfCoreMessages=true
Container logs:
16/08/24 21:17:02 INFO master.BspServiceMaster: generateVertexInputSplits: Got 8 input splits for 1 input threads
16/08/24 21:17:02 INFO master.BspServiceMaster: createVertexInputSplits: Starting to write input split data to zookeeper with 1 threads
16/08/24 21:17:02 INFO master.BspServiceMaster: createVertexInputSplits: Done writing input split data to zookeeper
16/08/24 21:17:02 INFO yarn.GiraphYarnTask: [STATUS: task-0] MASTER_ZOOKEEPER_ONLY checkWorkers: Done - Found 1 responses of 1 needed to start superstep -1
16/08/24 21:17:02 INFO netty.NettyClient: Using Netty without authentication.
16/08/24 21:17:02 INFO netty.NettyClient: connectAllAddresses: Successfully added 1 connections, (1 total connected) 0 failed, 0 failures total.
16/08/24 21:17:02 INFO partition.PartitionUtils: computePartitionCount: Creating 1, default would have been 1 partitions.
...
16/08/24 21:25:40 INFO netty.NettyClient: stop: Halting netty client
16/08/24 21:25:40 INFO netty.NettyClient: stop: reached wait threshold, 1 connections closed, releasing resources now.
16/08/24 21:25:43 INFO netty.NettyClient: stop: Netty client halted
16/08/24 21:25:43 INFO netty.NettyServer: stop: Halting netty server
16/08/24 21:25:43 INFO netty.NettyServer: stop: Start releasing resources
16/08/24 21:25:44 INFO bsp.BspService: process: cleanedUpChildrenChanged signaled
16/08/24 21:25:47 INFO netty.NettyServer: stop: Netty server halted
16/08/24 21:25:47 INFO bsp.BspService: process: masterElectionChildrenChanged signaled
16/08/24 21:25:47 INFO master.MasterThread: setup: Took 0.898 seconds.
16/08/24 21:25:47 INFO master.MasterThread: input superstep: Took 452.531 seconds.
16/08/24 21:25:47 INFO master.MasterThread: superstep 0: Took 64.376 seconds.
16/08/24 21:25:47 INFO master.MasterThread: superstep 1: Took 1.591 seconds.
16/08/24 21:25:47 INFO master.MasterThread: shutdown: Took 6.609 seconds.
16/08/24 21:25:47 INFO master.MasterThread: total: Took 526.006 seconds.
As you can see, the first line tells us that the input superstep is executing with only one thread, and it took 452 seconds to finish the input superstep.
I did another test, using giraph.numInputThreads=8, trying to do the input superstep with 8 threads:
/home/hadoop/bin/yarn jar /home/hadoop/giraph/giraph.jar ar.edu.info.unlp.tesina.lectura.grafo.BusquedaDeCaminosNavegacionalesWikiquote -vif ar.edu.info.unlp.tesina.vertice.estructuras.IdTextWithComplexValueInputFormat -vip /user/hduser/input/grafo-wikipedia.txt -vof ar.edu.info.unlp.tesina.vertice.estructuras.IdTextWithComplexValueOutputFormat -op /user/hduser/output/caminosNavegacionales -w 1 -yh 120000 -ca giraph.metrics.enable=true,giraph.useOutOfCoreMessages=true,giraph.numInputThreads=8
The result was the following one:
16/08/24 21:54:00 INFO master.BspServiceMaster: generateVertexInputSplits: Got 8 input splits for 8 input threads
16/08/24 21:54:00 INFO master.BspServiceMaster: createVertexInputSplits: Starting to write input split data to zookeeper with 1 threads
16/08/24 21:54:00 INFO master.BspServiceMaster: createVertexInputSplits: Done writing input split data to zookeeper
...
16/08/24 22:10:07 INFO master.MasterThread: setup: Took 0.093 seconds.
16/08/24 22:10:07 INFO master.MasterThread: input superstep: Took 891.339 seconds.
16/08/24 22:10:07 INFO master.MasterThread: superstep 0: Took 66.635 seconds.
16/08/24 22:10:07 INFO master.MasterThread: superstep 1: Took 1.837 seconds.
16/08/24 22:10:07 INFO master.MasterThread: shutdown: Took 6.605 seconds.
16/08/24 22:10:07 INFO master.MasterThread: total: Took 966.512 seconds.
So, my question is: how can it be possible that Giraph takes 452 seconds with a single input thread and 891 seconds with 8 of them? It should be exactly the opposite, right?
The cluster used for this was 1 master and 1 slave, both r3.8xlarge EC2 instances on AWS.
The problem is related to HDFS access. There are 8 threads accessing the same resource, when that resource can only be accessed sequentially. For best performance, giraph.numInputThreads should be 2 or 1.
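In other words, only the -ca argument of the command above needs to change, for example (only the changed fragment is shown; the rest of the command stays exactly as in the question):
-ca giraph.metrics.enable=true,giraph.useOutOfCoreMessages=true,giraph.numInputThreads=1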
