Fluent-bit inside Spark Container - apache-spark

I am trying to run fluent-bit inside spark container so that spark driver container which is writing the logs in a file /var/log/sparkDriver.log controlled by spark log4j properties, can be read or consumed by fluent-bit. I know that running multiple processes in one container is an AntiParttern but right now I have no choice. What configuration I need, to read this file (/var/log/sparkDriver.log) and forward the logs to our internal splunk hec server.
I know fluent-bit can be used as a sidecar in the pod but I am using simple spark-submit to submit my spark job to K8S and spark-submit doesn't have any way to tell k8s that I want to run a sidecar (fluent-bit) as well.
I also know that fluent-bit can installed as deamonSet in the cluster which will basically run on each node in the k8s cluster and forward logs from the container via node to Splunk. But this option is also not going to work for me.
So I thought if I could bake fluent-bit or splunkforwarder or even fluentd and read the logs from a file or stdout. I know that the other 2 options will inflate my spark docker image but I don't have an option right now.
Any help or suggestion will be really appreciated
I actually tried the tail and splunk but somehow I am not able to figure out the right configuration for fluent-bit
Here is my log file which is spark logs using log4j:
I actually tried it but somehow I am not able to put the right configuration around it. Here is how my log files look:
20/03/02 19:35:47 INFO TaskSetManager: Starting task 12526.0 in stage 0.0 (TID 12526, 172.16.7.233, executor 1, partition 12526, PROCESS_LOCAL, 7885 bytes)
20/03/02 19:35:47 DEBUG KubernetesClusterSchedulerBackend$KubernetesDriverEndpoint: Launching task 12526 on executor id: 1 hostname: 172.16.7.233.
20/03/02 19:35:47 INFO TaskSetManager: Finished task 12524.0 in stage 0.0 (TID 12524) in 1 ms on 172.16.7.233 (executor 1) (12525/1000000)
20/03/02 19:35:47 TRACE MessageDecoder: Received message OneWayMessage: OneWayMessage{body=NettyManagedBuffer{buf=CompositeByteBuf(ridx: 5, widx: 1622, cap: 1622, components=2)}}
20/03/02 19:35:47 TRACE MessageDecoder: Received message OneWayMessage: OneWayMessage{body=NettyManagedBuffer{buf=PooledUnsafeDirectByteBuf(ridx: 13, widx: 1630, cap: 32768)}}
20/03/02 19:35:47 TRACE MessageDecoder: Received message OneWayMessage: OneWayMessage{body=NettyManagedBuffer{buf=PooledUnsafeDirectByteBuf(ridx: 13, widx: 2414, cap: 4096)}}
Here is my fluent-bit configuration:
[INPUT]
Name tail
Path /var/log/sparklog.log
# nest the record under the 'event' key
[FILTER]
Name nest
Match *
Operation nest
Wildcard *
Nest_under event
# add event metadata
[FILTER]
Name modify
Match *
Add index myindex
Add host ${HOSTNAME}
Add app_name ${APP_NAME}
Add namespace ${NAMESPACE}
[OUTPUT]
Name Splunk
Match *
Host splunk.example.com
Port 30000
Splunk_Token XXXX-XXXX-XXXX-XXXX
Splunk_Send_Raw On
TLS On
TLS.Verify Off

Tail https://docs.fluentbit.io/manual/input/tail and splunk plugin https://docs.fluentbit.io/manual/output/splunk should do the trick for you.
Are you facing any specific issue with configuring these two?

Related

How can I inspect per executor/node memory usage metrics of a pyspark job on Dataproc?

I'm running a PySpark job in Google Cloud Dataproc, in a cluster with half the nodes being preemptible, and seeing several errors in the job output (the driver output) such as:
...spark.scheduler.TaskSetManager: Lost task 9696.0 in stage 0.0 ... Python worker exited unexpectedly (crashed)
...
Caused by java.io.EOFException
...
...YarnSchedulerBackend$YarnSchedulerEndpoint: Requesting driver to remove executor 177 for reason Container marked as failed: ... Exit status: -100. Diagnostics: Container released on a *lost* node
...spark.storage.BlockManagerMasterEndpoint: Error try to remove broadcast 3 from block manager BlockManagerId(...)
Perhaps by coincidence, the errors mostly seem to be coming from preemptible nodes.
My suspicion is that these opaque errors are coming from the node or executors running out of memory, but there don't seem to be any granular memory related metrics exposed by Dataproc.
How can I determine why a node was considered lost? Is there a way I can inspect memory usage per node or executor to validate whether these errors are being caused by high memory usage? If YARN is the one which is killing containers / determining nodes are lost, then hopefully there's a way to introspect why?
Because you are using Preemptible VMs which are short-lived and guaranteed to last for up to 24 hours. This means that when GCE shutdowns Preemptible VMs you see errors like this:
YarnSchedulerBackend$YarnSchedulerEndpoint: Requesting driver to remove executor 177 for reason Container marked as failed: ... Exit status: -100. Diagnostics: Container released on a lost node
Open a secure shell from your machine to the cluster. You'll require gcloud sdk installed for that.
gcloud compute ssh ${HOSTNAME}-m --project=${PROJECT}
Then run the following commands in the cluster.
List all nodes in the cluster
yarn node -list
Then using ${NodeID} to get report on the node state.
yarn node -status ${NodeID}
You could also set up local port forwarding via SSH to Yarn WebUI server instead of running commands directly in the cluster.
gcloud compute ssh ${HOSTNAME}-m \
--project=${PROJECT} -- \
-L 8088:${HOSTNAME}-m:8088 -N
Then go to http://localhost:8088/cluster/apps in your browser.

How to fix this fatal error while running spark jobs on HDIinsight cluster? Session 681 unexpectedly reached final status 'dead'. See logs:

I am running pyspark code on HDIcluster and getting this error:
The code failed because of a fatal error: Session 681 unexpectedly
reached final status 'dead'. See logs:
I don't have experience in YARN or Hadoop. I tried few links provided in stack overflow. But none of them helped. One strange thing is I was able to run the same code yesterday with out that error.
I just ran this import
from pyspark.sql import SparkSession
This is the error I am getting:
19/06/21 20:35:35 INFO Client:
client token: N/A
diagnostics: [Fri Jun 21 20:35:35 +0000 2019] Application is Activated, waiting for resources to be assigned for AM. Details : AM Partition = <DEFAULT_PARTITION> ; Partition Resource = <memory:819200, vCores:240> ; Queue's Absolute capacity = 50.0 % ; Queue's Absolute used capacity = 99.1875 % ; Queue's Absolute max capacity = 100.0 % ;
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1561149335158
final status: UNDEFINED
tracking URL: https://mmsorderpredhdi.azurehdinsight.net/yarnui/hn/proxy/application_1560840076505_0062/
user: livy
19/06/21 20:35:35 INFO ShutdownHookManager: Shutdown hook called
19/06/21 20:35:35 INFO ShutdownHookManager: Deleting directory /tmp/spark-bb63c5f0-7579-4456-b32a-0e643ca97ecc
YARN Diagnostics:
Application killed by user..
Question : Is there something to deal with Queue's absolute used capacity?
Could you please check the logs to find the exact issue?
Where do I find the log file?
On Azure HDInsight cluster, You may found the livy log by connecting to one of the Head nodes with SSH and downloading a file at this path.
hdfs dfs -ls /app-logs/livy/logs-ifile
For more details, refer "Access Apache Hadoop YARN application logs on Linux-based HDInsight"
And also, you may refer “How to start sparksession in pyspark”.
Hope this helps.

SPARK YARN: cannot send job from client (org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:8032)

I'm trying to send spark job to yarn (without HDFS) in HA mode.
For submitting I'm using org.apache.spark.deploy.SparkSubmit.
When I send request from machine with active Resource Manager, it works well. But if I' trying to send from machine with standby Resource Manager, job fails with error:
DEBUG org.apache.hadoop.ipc.Client - Connecting to spark2-node-dev/10.10.10.167:8032
DEBUG org.apache.hadoop.ipc.Client - Connecting to /0.0.0.0:8032
org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep
However, when I send request via command line (spark-submit), it works well through both active and standby machine.
What can cause the problem?
P.S. Use the same parameters for both type of sending job: org.apache.spark.deploy.SparkSubmit and spark-submit command line request. And properties yarn.resourcemanager.hostname.rm_id defined for all rm hosts
The problem was with absence of yarn-site.xml within class path for spark-submitter jar. Actually spark submitter jar does not take to account YARN_CONF_DIR or HADOOP_CONF_DIR env var, so cannot see yarn-site.
One solution that I found was to put yarn-site into classpath of jar.

How to enable a Spark-Mesos job to be launched from inside a Docker container?

Summary:
Is it possible to submit a Spark job on Mesos from inside a Docker container with 1 Mesos master (no Zookeeper) and 1 Mesos agent also each running in separate Docker containers (on the same host for now)? The Mesos containerizer described at http://mesos.apache.org/documentation/latest/container-image/ seems to apply to the case where the Mesos application is simply encapsulated in a Docker container and run. My Docker application is more interactive with multiple PySpark Mesos jobs being instantiated at run-time based on user input. The driver program in the Docker container is not itself run as a Mesos app. Only the user-initiated job requests are handled as PySpark Mesos apps.
Specifics:
I have 3 Docker containers based on centos:7 linux, and running on the same host machine for now:
Container "Master" running a Mesos Master.
Container "Agent" running a Mesos Agent.
Container "Test" with Spark and Mesos installed where I run a bash shell and launch the following PySpark test program from the command line.
from pyspark import SparkContext, SparkConf
from operator import add
# Configure Spark
sp_conf = SparkConf()
sp_conf.setAppName("spark_test")
sp_conf.set("spark.scheduler.mode", "FAIR")
sp_conf.set("spark.dynamicAllocation.enabled", "false")
sp_conf.set("spark.driver.memory", "500m")
sp_conf.set("spark.executor.memory", "500m")
sp_conf.set("spark.executor.cores", 1)
sp_conf.set("spark.cores.max", 1)
sp_conf.set("spark.mesos.executor.home", "/usr/local/spark-2.1.0")
sp_conf.set("spark.executor.uri", "file://usr/local/spark-2.1.0-bin-without-hadoop.tgz")
sc = SparkContext(conf=sp_conf)
# Simple computation
x = [(1.5,100.),(1.5,200.),(1.5,300.),(2.5,150.)]
rdd = sc.parallelize(x,1)
tot = rdd.foldByKey(0,add).collect()
cnt = rdd.countByKey()
time = [t[0] for t in tot]
avg = [t[1]/cnt[t[0]] for t in tot]
print 'tot=', tot
print 'cnt=', cnt
print 't=', time
print 'avg=', avg
The relevant software versions I am using are as follows:
Hadoop: 2.7.3
Spark: 2.1.0
Mesos: 1.2.0
Docker: 17.03.1-ce, build c6d412e
The following works fine:
I can run the simple PySpark test program above from inside the Test container with Spark's MASTER=local[N] for N=1 or N=4.
I can see in the Mesos logs and in the Mesos user interface (UI) that the Mesos agent and master come up fine. The Mesos UI shows that the agent is connected with plenty of resources (cpu, memory, disk).
I can run the Mesos Python tests successfully from inside the Test container with /usr/local/mesos-1.2.0/build/src/examples/python/test-framework 127.0.0.1:5050. This seems to confirm that the Mesos containers can be accessed from within my Test container, but these tests are not using Spark.
This is the Failure:
With Spark's MASTER=mesos://127.0.0.1:5050, when I launch my PySpark test program from inside the Test container there is activity in the logs of both the Mesos Master and Agent, and in the couple seconds before failure, the Mesos UI shows resources assigned for the job that are well within what is available. However, the PySpark test program then fails with: WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources.
The steps I followed are as follows.
Start Mesos Master:
docker run -it --net=host -p 5050:5050 the_master
Relevant excerpts from the master's log shows:
I0418 01:05:08.540192 27 master.cpp:383] Master 15b354eb-6a20-4bc9-a13b-6533b1e91bd2 (localhost) started on 127.0.0.1:5050
I0418 01:05:08.540210 27 master.cpp:385] Flags at startup: --agent_ping_timeout="15secs" --agent_reregister_timeout="10mins" --allocation_interval="1secs" --allocator="HierarchicalDRF" --authenticate_agents="false" --authenticate_frameworks="false" --authenticate_http_frameworks="false" --authenticate_http_readonly="false" --authenticate_http_readwrite="false" --authenticators="crammd5" --authorizers="local" --framework_sorter="drf" --help="false" --hostname_lookup="true" --http_authenticators="basic" --initialize_driver_logging="true" --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" --max_agent_ping_timeouts="5" --max_completed_frameworks="50" --max_completed_tasks_per_framework="1000" --max_unreachable_tasks_per_framework="1000" --quiet="false" --recovery_agent_removal_limit="100%" --registry="replicated_log" --registry_fetch_timeout="1mins" --registry_gc_interval="15mins" --registry_max_agent_age="2weeks" --registry_max_agent_count="102400" --registry_store_timeout="20secs" --registry_strict="false" --root_submissions="true" --user_sorter="drf" --version="false" --webui_dir="/usr/local/mesos-1.2.0/build/../src/webui" --work_dir="/var/lib/mesos" --zk_session_timeout="10secs"
Start Mesos Agent:
docker run -it --net=host -e MESOS_AGENT_PORT=5051 the_agent
The agent's log shows:
I0418 01:42:00.234244 40 slave.cpp:212] Flags at startup: --appc_simple_discovery_uri_prefix="http://" --appc_store_dir="/tmp/mesos/store/appc" --authenticate_http_readonly="false" --authenticate_http_readwrite="false" --authenticatee="crammd5" --authentication_backoff_factor="1secs" --authorizer="local" --cgroups_cpu_enable_pids_and_tids_count="false" --cgroups_enable_cfs="false" --cgroups_hierarchy="/sys/fs/cgroup" --cgroups_limit_swap="false" --cgroups_root="mesos" --container_disk_watch_interval="15secs" --containerizers="mesos" --default_role="*" --disk_watch_interval="1mins" --docker="docker" --docker_kill_orphans="true" --docker_mesos_image="spark-mesos-agent-test" --docker_registry="https://registry-1.docker.io" --docker_remove_delay="6hrs" --docker_socket="/var/run/docker.sock" --docker_stop_timeout="0ns" --docker_store_dir="/tmp/mesos/store/docker" --docker_volume_checkpoint_dir="/var/run/mesos/isolators/docker/volume" --enforce_container_disk_quota="false" --executor_registration_timeout="1mins" --executor_shutdown_grace_period="5secs" --fetcher_cache_dir="/tmp/mesos/fetch" --fetcher_cache_size="2GB" --frameworks_home="" --gc_delay="1weeks" --gc_disk_headroom="0.1" --hadoop_home="" --help="false" --hostname_lookup="true" --http_authenticators="basic" --http_command_executor="false" --http_heartbeat_interval="30secs" --initialize_driver_logging="true" --isolation="posix/cpu,posix/mem" --launcher="posix" --launcher_dir="/usr/local/mesos-1.2.0/build/src" --logbufsecs="0" --logging_level="INFO" --max_completed_executors_per_framework="150" --oversubscribed_resources_interval="15secs" --perf_duration="10secs" --perf_interval="1mins" --qos_correction_interval_min="0ns" --quiet="false" --recover="reconnect" --recovery_timeout="15mins" --registration_backoff_factor="1secs" --revocable_cpu_low_priority="true" --runtime_dir="/var/run/mesos" --sandbox_directory="/mnt/mesos/sandbox" --strict="true" --switch_user="false" --systemd_enable_support="false" --systemd_runtime_directory="/run/systemd/system" --version="false" --work_dir="/var/lib/mesos"
I get the following warning for both the Mesos Master and Agent, but ignore it because I am running everything on the same host for now:
Master/Agent bound to loopback interface! Cannot communicate with remote schedulers or agents. You might want to set '--ip' flag to a routable IP address.
In fact, my tests with assigning a routable IP address instead of 127.0.0.1 failed to change any of the behavior I describe here.
Start Test Container (with bash shell for testing):
docker run -it --net=host the_test /bin/bash
Some relevant environment variables set inside all three container (Master, Agent, and Test):
HADOOP_HOME=/usr/local/hadoop-2.7.3
HADOOP_CONF_DIR=/usr/local/hadoop-2.7.3/etc/hadoop
SPARK_HOME=/usr/local/spark-2.1.0
SPARK_EXECUTOR_URI=file:////usr/local/spark-2.1.0-bin-without-hadoop.tgz
MASTER=mesos://127.0.0.1:5050
PYSPARK_PYTHON=/usr/local/anaconda2/bin/python
PYSPARK_DRIVER_PYTHON=/usr/local/anaconda2/bin/python
PYSPARK_SUBMIT_ARGS=--driver-memory=4g pyspark-shell
MESOS_PORT=5050
MESOS_IP=127.0.0.1
MESOS_WORKDIR=/var/lib/mesos
MESOS_HOME=/usr/local/mesos-1.2.0
MESOS_NATIVE_JAVA_LIBRARY=/usr/local/lib/libmesos.so
MESOS_MASTER=mesos://127.0.0.1:5050
PYTHONPATH=:/usr/local/spark-2.1.0/python:/usr/local/spark-2.1.0/python/lib/py4j-0.10.1-src.zip
Run Mesos (non-Spark) tests from inside the Test container:
/usr/local/mesos-1.2.0/build/src/examples/python/test-framework 127.0.0.1:5050
This produces the following log output (as expected I think):
I0417 21:28:36.912542 20 sched.cpp:232] Version: 1.2.0
I0417 21:28:36.920013 62 sched.cpp:336] New master detected at master#127.0.0.1:5050
I0417 21:28:36.920472 62 sched.cpp:352] No credentials provided. Attempting to register without authentication
I0417 21:28:36.924165 62 sched.cpp:759] Framework registered with be89e739-be8d-430e-b1e9-3fe55fa18459-0000
Registered with framework ID be89e739-be8d-430e-b1e9-3fe55fa18459-0000
Received offer be89e739-be8d-430e-b1e9-3fe55fa18459-O0 with cpus: 16.0 and mem: 119640.0
Launching task 0 using offer be89e739-be8d-430e-b1e9-3fe55fa18459-O0
Launching task 1 using offer be89e739-be8d-430e-b1e9-3fe55fa18459-O0
Launching task 2 using offer be89e739-be8d-430e-b1e9-3fe55fa18459-O0
Launching task 3 using offer be89e739-be8d-430e-b1e9-3fe55fa18459-O0
Launching task 4 using offer be89e739-be8d-430e-b1e9-3fe55fa18459-O0
Task 0 is in state TASK_RUNNING
Task 1 is in state TASK_RUNNING
Task 2 is in state TASK_RUNNING
Task 3 is in state TASK_RUNNING
Task 4 is in state TASK_RUNNING
Task 0 is in state TASK_FINISHED
Task 1 is in state TASK_FINISHED
Task 2 is in state TASK_FINISHED
Task 3 is in state TASK_FINISHED
Task 4 is in state TASK_FINISHED
All tasks done, waiting for final framework message
Received message: 'data with a \x00 byte'
Received message: 'data with a \x00 byte'
Received message: 'data with a \x00 byte'
Received message: 'data with a \x00 byte'
Received message: 'data with a \x00 byte'
All tasks done, and all messages received, exiting
Run PySpark test program from inside the Test container:
python spark_test.py
This produces the following log output:
17/04/17 21:29:18 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
I0417 21:29:19.187747 205 sched.cpp:232] Version: 1.2.0
I0417 21:29:19.196535 188 sched.cpp:336] New master detected at master#127.0.0.1:5050
I0417 21:29:19.197453 188 sched.cpp:352] No credentials provided. Attempting to register without authentication
I0417 21:29:19.201884 195 sched.cpp:759] Framework registered with be89e739-be8d-430e-b1e9-3fe55fa18459-0001
17/04/17 21:29:34 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
I searched for this error on the internet but every page I found indicates that it is a common error caused by insufficient resources being allocated to the Mesos agent. As I mentioned, the Mesos UI indicates that there are sufficient resources. Please respond if you have any idea why my Spark job is not accepting resources from Mesos or if you have any suggestions of things I could try.
Thank you for your help.
This error is now resolved. In case anybody encounters a similar problem, I wanted to post that in my case it was caused by the HADOOP CLASSPATH not being set in the Mesos Master and Agent containers. Once set, everything works as expected.

Spark job submiited from local machine to remore cluster can't see data on remote server

The post may seem a bit long but I am providing all the specific details to help readers what I am trying to achieve and what all I have already done but still running into issue.
I am trying to submit the spark job to remote cluster from eclipse running locally on windows 7 machine but running into issue with respect to finding the input path to data on cluster nodes. I followed the suggestion made in this forum to configure the sparkContext as following where I set the spark.driver.host to IP address of Windows machine.
SparkConf sparkConf = new SparkConf().setAppName("Count Lines")
.set("spark.driver.host", "9.1.194.199") //IP address of Windows 7
.set("spark.driver.port", "51910")
.set("spark.fileserver.port", "51811")
.set("spark.broadcast.port", "51812")
.set("spark.replClassServer.port", "51813")
.set("spark.blockManager.port", "51814")
.setMaster("spark://master.aa.bb.com:7077"); //mater hostname
I also had to set HADOOP_HOME to c:\winutils in eclipse, to be able to run this code on windows.
Then I set the path to data which exists on all the nodes of spark cluster as following
String topDir = "/data07/html/test";
JavaRDD<String> lines = sc.textFile(topDir+"/*");
However, I get following error.
5319 [main] INFO org.apache.spark.SparkContext - Created broadcast 0 from textFile at CountLines2.java:65
Exception in thread "main" org.apache.hadoop.mapred.InvalidInputException: Input Pattern file:/data07/html/test/* matches 0 files
at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:251)
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:270)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:201)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:205)
Now considering the fact that running the code inside eclipse needed local hadoop installation (ie., setting HADOOP_HOME to c:\winutils), I modified the code to use a data path that exists locally on Windows machine. With that modification, the progam went a bit further and launched tasks on all the nodes of the cluster but failed later for path issue with a different error.
105926 [task-result-getter-2] INFO org.apache.spark.scheduler.TaskSetManager - Lost task 15.2 in stage 0.0 (TID 162) on executor master.aa.bb.com: java.lang.IllegalArgumentException (java.net.URISyntaxException: Relative path in absolute URI: C:%5Cdata%5CMedicalSieve%5Crepositories%5Craw%5CMedscape%5Cclinical/*) [duplicate 162]
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 44 in stage 0.0 failed 4 times, most recent failure: Lost task 44.3 in stage 0.0 (TID 148, aalim03.almaden.ibm.com): java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: C:%5Cdata%5Chtml%5Ctest/*
at org.apache.hadoop.fs.Path.initialize(Path.java:206)
at org.apache.hadoop.fs.Path.<init>(Path.java:172)
at org.apache.hadoop.util.StringUtils.stringToPath(StringUtils.java:241)
As a rule of thumb every input you use should be accessible on every node (both workers and driver). These could be local file system, files on some DFS or external resource.
The only situation when data is shipped directly from the driver is when you use ParallelCollectionRDD with parallelize / makeRDD.

Resources