The below used to generate only the required amount of logging for us in Spark 2.2. However, after moving to Spark 3.3, the log4j.properties is no longer respected and a lot of Spark TRACE and DEBUG info is being printed.
I hear that this is because Spark moved from log4j to log4j2. In spite of googling for a long time, it is still not clear how to configure log4j across all the drivers and executors during spark-submit for Spark 3.3.
The command that worked beautifully in Spark 2.2:
spark-submit --conf "spark.executor.extraJavaOptions=-Dlog4j.debug=true" --conf "spark.driver.extraJavaOptions=-Dlog4j.debug=true" --files /home/hadoop/log4j.properties --name app --master yarn --deploy-mode cluster --class a.b.c.Entrypoint /home/hadoop/jars/app.jar
So the questions:
Any sample log4j2 file?
How to pass it from the master node during the spark-submit command?
How to print log4j debug information?
[Edit 1] Issue not solved yet!
Based on the comments I made the changes below, but I see a lot of Spark internal data getting logged -- not just my data alone.
spark-submit --driver-memory 1g --executor-memory 2g --conf "spark.driver.extraJavaOptions=-Dlog4j2.debug=true" --files /home/hadoop/log4j2.properties --master yarn --deploy-mode cluster --class com.a.b.ABC /home/hadoop/jars/spark-1.0-SNAPSHOT.jar
log4j2.properties
status=warn
name=campV2
appender.console.type = Console
appender.console.name = console
appender.console.layout.type = PatternLayout
appender.console.layout.pattern = %d{yy-MM-dd HH:mm:ss} %p %c: %m%n%ex
rootLogger.level = warn
rootLogger.appenderRef.stdout.ref = console
logger.app.name=com.company1
logger.app.level = debug
logger.app.additivity = false
logger.app.appenderRef.console.ref = console
logger.app2.name=com.company2
logger.app2.level = debug
logger.app2.additivity = false
logger.app2.appenderRef.console.ref = console
The logs generated, with unwanted data:
LogLastModifiedTime:Tue Dec 20 05:52:31 +0000 2022
LogLength:36546
LogContents:
ls -l:
total 20
lrwxrwxrwx 1 yarn yarn 62 Dec 20 05:52 __app__.jar -> /mnt/yarn/usercache/hadoop/filecache/23/spark-1.0-SNAPSHOT.jar
lrwxrwxrwx 1 yarn yarn 58 Dec 20 05:52 __spark_conf__ -> /mnt/yarn/usercache/hadoop/filecache/21/__spark_conf__.zip
lrwxrwxrwx 1 yarn yarn 78 Dec 20 05:52 __spark_libs__ -> /mnt1/yarn/usercache/hadoop/filecache/22/__spark_libs__7763583720024624816.zip
-rw-r--r-- 1 yarn yarn 93 Dec 20 05:52 container_tokens
-rwx------ 1 yarn yarn 646 Dec 20 05:52 default_container_executor.sh
...
...
echo "broken symlinks(find -L . -maxdepth 5 -type l -ls):" 1>>"/var/log/hadoop-yarn/containers/application_1671425963628_0204/container_1671425963628_0204_01_000003/directory.info"
find -L . -maxdepth 5 -type l -ls 1>>"/var/log/hadoop-yarn/containers/application_1671425963628_0204/container_1671425963628_0204_01_000003/directory.info"
echo "Launching container"
exec /bin/bash -c "LD_LIBRARY_PATH=\"/usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native:/docker/usr/lib/hadoop/lib/native:/docker/usr/lib/hadoop-lzo/lib/native:$LD_LIBRARY_PATH\" $JAVA_HOME/bin/java -server -Xmx2048m '-verbose:gc' '-XX:+PrintGCDetails' '-XX:+PrintGCDateStamps' '-XX:OnOutOfMemoryError=kill -9 %p' '-XX:+IgnoreUnrecognizedVMOptions' '--add-opens=java.base/java.lang=ALL-UNNAMED' '--add-opens=java.base/java.lang.invoke=ALL-UNNAMED' '--add-opens=java.base/java.lang.reflect=ALL-UNNAMED' '--add-opens=java.base/java.io=ALL-UNNAMED' '--add-opens=java.base/java.net=ALL-UNNAMED' '--add-opens=java.base/java.nio=ALL-UN
...
...
DEBUG StatusLogger PluginManager 'Lookup' found 16 plugins
DEBUG StatusLogger PluginManager 'Lookup' found 16 plugins
DEBUG StatusLogger Using configurationFactory org.apache.logging.log4j.core.config.ConfigurationFactory$Factory@6bedbc4d
TRACE StatusLogger Trying to find [log4j2-test18b4aac2.properties] using context class loader sun.misc.Launcher$AppClassLoader@18b4aac2.
TRACE StatusLogger Trying to find [log4j2-test18b4aac2.properties] using sun.misc.Launcher$AppClassLoader@18b4aac2 class loader.
Now, with so many unwanted logs getting generated, finding my logs is like finding a needle in a haystack. Is there a way to display just my logs and not the Spark internal logs?
So the question remains:
How to configure log4j2 so that I get to see only my loggers?
Any pointers/examples will be helpful.
[Edit 2] Set log4j2.debug=false and the TRACE logs are gone now. However, I still see script outputs:
--conf "spark.driver.extraJavaOptions=-Dlog4j.debug=false -Dlog4j2.debug=false
echo "Setting up job resources"
ln -sf -- "/mnt/yarn/usercache/hadoop/filecache/3758/__spark_libs__3245215202131718232.zip" "__spark_libs__"
ln -sf -- "/mnt/yarn/usercache/hadoop/filecache/3760/log4j2.properties" "log4j2.properties"
ln -sf -- "/mnt/yarn/usercache/hadoop/filecache/3759/spark-1.0-SNAPSHOT.jar" "__app__.jar"
ln -sf -- "/mnt/yarn/usercache/hadoop/filecache/3757/__spark_conf__.zip" "__spark_conf__"
ln -sf -- "/mnt/yarn/usercache/hadoop/filecache/3756/hudi-defaults.conf" "hudi-defaults.conf"
echo "Copying debugging information"
# Creating copy of launch script
Not sure how to turn this off.
Finally, after trying several options, only the approach below is working.
Log in to the Spark master node. In my case it is the EMR master. Open the log4j2.properties located at /usr/lib/spark/conf/log4j2.properties.
Take a backup of the file and change it to reflect the file below.
It is disappointing that something that worked beautifully in Spark 2.2 with the flag --files log4j.properties does not work after upgrading Spark (--files log4j2.properties), and we have to do an ugly fix by editing files on the server.
My log4j2.properties looks as follows:
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# Set everything to be logged to the console
rootLogger.level = warn
rootLogger.appenderRef.stdout.ref = STDOUT
appender.console.type = Console
appender.console.name = STDOUT
appender.console.target = SYSTEM_OUT
appender.console.layout.type = PatternLayout
appender.console.layout.pattern = %d{yy-MM-dd HH:mm:ss} %p %c{1}: %m%n%ex
logger.pp.name = com.company
logger.pp.level = debug
logger.pp.additivity = false
logger.pp.appenderRef.console.ref=STDOUT
logger.pp1.name = com.company2
logger.pp1.level = debug
logger.pp1.additivity = false
logger.pp1.appenderRef.console.ref=STDOUT
# Settings to quiet third party logs that are too verbose com.amazonaws.services.s3
logger.jetty.name = org.sparkproject.jetty
logger.jetty.level = warn
logger.jetty2.name = org.sparkproject.jetty.util.component.AbstractLifeCycle
logger.jetty2.level = error
logger.repl1.name = org.apache.spark.repl.SparkIMain$exprTyper
logger.repl1.level = info
logger.repl2.name = org.apache.spark.repl.SparkILoop$SparkILoopInterpreter
logger.repl2.level = info
# Set the default spark-shell log level to WARN. When running the spark-shell, the
# log level for this class is used to overwrite the root logger's log level, so that
# the user can have different defaults for the shell and regular Spark apps.
logger.repl.name = org.apache.spark.repl.Main
logger.repl.level = warn
# SPARK-9183: Settings to avoid annoying messages when looking up nonexistent UDFs
# in SparkSQL with Hive support
logger.metastore.name = org.apache.hadoop.hive.metastore.RetryingHMSHandler
logger.metastore.level = fatal
logger.hive_functionregistry.name = org.apache.hadoop.hive.ql.exec.FunctionRegistry
logger.hive_functionregistry.level = error
# Parquet related logging
logger.parquet.name = org.apache.parquet.CorruptStatistics
logger.parquet.level = error
logger.parquet2.name = parquet.CorruptStatistics
logger.parquet2.level = error
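For completeness, the submit-time variant commonly suggested for Spark 3.x is sketched below (I have not verified it on my cluster): ship the file with --files and point log4j2 at it explicitly through the standard log4j2.configurationFile system property. Paths and class names are only illustrative.
spark-submit \
  --files /home/hadoop/log4j2.properties \
  --conf "spark.driver.extraJavaOptions=-Dlog4j2.configurationFile=log4j2.properties" \
  --conf "spark.executor.extraJavaOptions=-Dlog4j2.configurationFile=log4j2.properties" \
  --master yarn --deploy-mode cluster \
  --class a.b.c.Entrypoint /home/hadoop/jars/app.jar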
Related
We have 10 Kafka machines with Kafka version 1.X.
This Kafka cluster is part of HDP version 2.6.5.
We noticed the following message under /var/log/kafka/server.log:
ERROR Error while accepting connection {kafka.network.Accetpr}
java.io.IOException: Too many open files
We also saw:
Broker 21 stopped fetcher for partition ...................... because they are in the failed log dir /kafka/kafka-logs {kafka.server.ReplicaManager}
and
WARN Received a PartitionLeaderEpoch assignment for an epoch < latestEpoch. this implies messages have arrived out of order. New: {epoch:0, offset:2227488}, Currnet: {epoch 2, offset:261} for Partition: cars-list-75 {kafka.server.epochLeaderEpocHFileCache}
So, regarding the issue:
ERROR Error while accepting connection {kafka.network.Accetpr}
java.io.IOException: Too many open files
How do we increase the max open files in order to avoid this issue?
Update:
In Ambari we saw the following parameter under Kafka --> Configs.
Is this the parameter that we should increase?
It can be done like this:
echo "* hard nofile 100000
* soft nofile 100000" | sudo tee --append /etc/security/limits.conf
Then you should reboot.
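A minimal way to verify the new limit afterwards (the pgrep pattern is just an illustrative way to find the broker PID):
# limits of the current shell after re-login / reboot
ulimit -Sn
ulimit -Hn
# limits actually applied to the running Kafka broker process
KAFKA_PID=$(pgrep -f kafka.Kafka | head -n 1)
grep 'open files' /proc/"$KAFKA_PID"/limits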
After an Apache Spark Executor JVM crash in a C++ library, I'm unable to locate the hs_err_pid.log file that is specified in the Executor JVM output log. Here's an example of the Executor output log:
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007f6326dce8b0, pid=28580, tid=0x00007f630ea57700
#
# JRE version: OpenJDK Runtime Environment (8.0_212-b01) (build 1.8.0_212-8u212-b01-1~deb9u1-b01)
# Java VM: OpenJDK 64-Bit Server VM (25.212-b01 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C [libessence-jni.so+0x18b0] Java_com_evernote_service_nts_indexer_lib_Essence_EssProcess+0x0
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /hadoop/yarn/nm-local-dir/usercache/root/appcache/application_1559573462307_0002/container_1559573462307_0002_01_000005/hs_err_pid28580.log
10:50:00:562 [Executor task launch worker for task 41] INFO .....NtsLibInternalIndexerProcessor(NtsLibInternalIndexerProcessor.java:50) process Process for user: 18432
[thread 140063422109440 also had an error]
#
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#
But when I SSH to the target worker machine to locate hs_err_pid28580.log, I can't find any trace of this file. I've tried:
vglazkov@reindex-cluster-vg-w-0:~$ sudo find / -name hs_err_pid28580.log
vglazkov@reindex-cluster-vg-w-0:~$
vglazkov@reindex-cluster-vg-w-0:~$ sudo ls -la /hadoop/yarn/nm-local-dir/usercache/root/appcache/
total 12
drwx--x--- 3 yarn yarn 4096 Jun 4 10:46 .
drwxr-x--- 4 yarn yarn 4096 May 15 15:47 ..
drwx--x--- 3 yarn yarn 4096 Jun 4 10:48 application_1557935076075_0097
But in the last case, the directory named application_1557935076075_0097 does not match my application ID application_1559573462307_0002 and does not contain any hs_err_pid.log files.
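One thing worth noting (an assumption on my part, not something confirmed above) is that YARN deletes the application cache directory once the container exits, so the hs_err file can disappear with it. A sketch of pinning the report to a persistent path with the standard -XX:ErrorFile JVM flag, if that is the cause:
spark-submit \
  --conf "spark.executor.extraJavaOptions=-XX:ErrorFile=/var/log/hs_err_pid%p.log" \
  ...
# /var/log is only an illustrative location; it must be writable by the yarn user on every worker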
I have installed the TimesTen database (full version) on Linux (Linux is a guest OS installed through Oracle VirtualBox with the Cloudera VM).
I am trying to run the following Sqoop command on Linux and getting the errors below.
Command:
sqoop list-tables --connect jdbc:timesten:direct:dsn=sampledb_1122 --driver com.timesten.jdbc.TimesTenDriver
Error:
ERROR manager.SqlManager: Error reading database metadata: java.sql.SQLException: Problems with loading native library/missing methods: no ttJdbc in java.library.path
java.sql.SQLException: Problems with loading native library/missing methods: no ttJdbc in java.library.path
at com.timesten.jdbc.JdbcOdbcConnection.connect(JdbcOdbcConnection.java:1809)
at com.timesten.jdbc.TimesTenDriver.connect(TimesTenDriver.java:305)
at com.timesten.jdbc.TimesTenDriver.connect(TimesTenDriver.java:161)
at java.sql.DriverManager.getConnection(DriverManager.java:571)
at java.sql.DriverManager.getConnection(DriverManager.java:233)
at org.apache.sqoop.manager.SqlManager.makeConnection(SqlManager.java:878)
at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:52)
at org.apache.sqoop.manager.SqlManager.listTables(SqlManager.java:520)
at org.apache.sqoop.tool.ListTablesTool.run(ListTablesTool.java:49)
at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
Could not retrieve tables list from server
18/02/18 18:56:04 ERROR tool.ListTablesTool: manager.listTables() returned null
TimesTen bin and lib folder locations:
/home/cloudra/timesten/TimesTen/tt1122_64/bin
/home/cloudera/timesten/TimesTen/tt1122_64/lib
The following values are set up in my environment, among other parameters:
USERNAME=cloudera
DESKTOP_SESSION=gnome
MAIL=/var/spool/mail/cloudera
PATH=/var/lib/sqoop:/home/cloudera/timesten/TimesTen/tt1122_64/bin:/home/cloudera/timesten/TimesTen/tt1122_64/lib:/home/cloudera/anaconda3/bin:/var/lib/sqoop:/home/cloudra/timesten/TimesTen/tt1122_64/bin:/home/cloudera/timesten/TimesTen/tt1122_64/lib:/home/cloudera/anaconda3/bin:/home/cloudera/anaconda3/bin:/usr/local/firefox:/sbin:/usr/java/jdk1.7.0_67-cloudera/bin:/usr/local/apache-ant/apache-ant-1.9.2/bin:/usr/local/apache-maven/apache-maven-3.0.4/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/cloudera/bin
PWD=/home/cloudera
THREAD_FLAGS=native
HOME=/home/cloudera
SHLVL=2
M2_HOME=/usr/local/apache-maven/apache-maven-3.0.4
GNOME_DESKTOP_SESSION_ID=this-is-deprecated
LOGNAME=cloudera
CVS_RSH=ssh
CLASSPATH=/home/cloudera/timesten/TimesTen/tt1122_64/lib/ttjdbc6.jar
[cloudera@quickstart ~]$ echo $LD_LIBRARY_PATH
/home/cloudera/timesten/TimesTen/tt1122_64/lib:/home/cloudera/timesten/TimesTen/tt1122_64/lib:
[cloudera@quickstart ~]$ java -version
java version "1.7.0_67"
Java(TM) SE Runtime Environment (build 1.7.0_67-b01)
Java HotSpot(TM) 64-Bit Server VM (build 24.65-b04, mixed mode)
[cloudera@quickstart ~]$
[cloudera@quickstart bin]$ ./ttversion
TimesTen Release 11.2.2.8.0 (64 bit Linux/x86_64) (tt1122_64:53396) 2015-01-20T08:36:31Z
Instance admin: cloudera
Instance home directory: /home/cloudera/timesten/TimesTen/tt1122_64
World accessible
Daemon home directory: /home/cloudera/timesten/TimesTen/tt1122_64/info
PL/SQL enabled.
In addition to the above, the ttjdbc6.jar file is located at the following location:
[cloudera@quickstart sqoop]$ pwd
/var/lib/sqoop
[cloudera@quickstart sqoop]$ ls -ltr
total 0
lrwxrwxrwx 1 root root 40 Jun 9 2015 mysql-connector-java.jar -> /usr/share/java/mysql-connector-java.jar
lrwxrwxrwx 1 root root 58 Feb 16 21:37 ttjdbc6.jar -> /home/cloudera/timesten/TimesTen/tt1122_64/lib/ttjdbc6.jar
[cloudera@quickstart timesten]$ pwd
/usr/lib/timesten
[cloudera@quickstart timesten]$ ls -ltr
total 276
-rwxrwxrwx 1 root root 279580 Feb 18 11:33 ttjdbc6.jar
java.library.path output (from java -XshowSettings:properties):
[cloudera@quickstart timesten]$ java -XshowSettings:properties
Property settings:
awt.toolkit = sun.awt.X11.XToolkit
file.encoding = UTF-8
file.encoding.pkg = sun.io
file.separator = /
java.awt.graphicsenv = sun.awt.X11GraphicsEnvironment
java.awt.printerjob = sun.print.PSPrinterJob
java.class.path = /home/cloudera/timesten/TimesTen/tt1122_64/lib/ttjdbc6.jar
java.class.version = 51.0
java.endorsed.dirs = /usr/java/jdk1.7.0_67-cloudera/jre/lib/endorsed
java.ext.dirs = /usr/java/jdk1.7.0_67-cloudera/jre/lib/ext
/usr/java/packages/lib/ext
java.home = /usr/java/jdk1.7.0_67-cloudera/jre
java.io.tmpdir = /tmp
java.library.path = /home/cloudera/timesten/TimesTen/tt1122_64/lib
/home/cloudera/timesten/TimesTen/tt1122_64/lib
/usr/java/packages/lib/amd64
/usr/lib64
/lib64
/lib
/usr/lib
java.runtime.name = Java(TM) SE Runtime Environment
java.runtime.version = 1.7.0_67-b01
I executed the ttenv.sh script, but it did not set up any parameters when I checked the env parameters, so I had to do it manually.
Gurus and experts, please help me here; I am not sure what the issue is and why I am getting the above error.
Thanks for your help.
The key line here is this:
java.sql.SQLException: Problems with loading native library/missing methods:
no ttJdbc in java.library.path
The TimesTen JDBC driver is a type 1/2 driver and it relies on the underlying TimesTen native libraries. Specifically, it needs several shared libraries located in <TimesTen_install_dir>/lib, such as libttJdbc.so (the one that the error is complaining about), libtten.so, etc. Typically you need to make sure that the java.library.path includes this directory (which it appears is the case) and that the CLASSPATH includes the ttjdbc7.jar file in that directory.
Another possibility is that your TimesTen installation is a 'client only' installation, in which case you cannot use the 'direct' driver; if you try to do so then you would get this exact error. I suggest checking to see if you actually have the files libttJdbc.so and libtten.so in <TimesTen_install_dir>/lib; if not, then this means you have a client-only install and need to configure / use client/server connectivity instead.
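A quick way to check which case applies (path taken from the question; adjust for your install directory):
ls /home/cloudera/timesten/TimesTen/tt1122_64/lib | grep -E 'libttJdbc|libtten'
# if neither libttJdbc.so nor libtten.so shows up, this is a client-only install
# and the 'direct' driver cannot be used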
In standalone mode the master process uses /usr/bin/java, which resolves to JVM 1.8, and the worker process uses /usr/lib/jvm/java/bin/java, which resolves to 1.7. In my Spark application I'm using some APIs introduced in 1.8.
Looking at the stack trace, one line that comes up is: Caused by: java.lang.NoClassDefFoundError: Could not initialize class SomeClassDefinedByMe, which internally creates an instance from java.time, which I believe is only in JDK 1.8.
How do I force the worker to use JVM 1.8?
Update:
For now I renamed /usr/lib/jvm/java/bin/java and created a link that points to /usr/bin/java. This solved the problem, but I would still like to know why both processes use different binary locations and where this is set.
On each Worker node, edit ${SPARK_HOME}/conf/spark-env.sh and define the appropriate $JAVA_HOME (it must point to a JRE/JDK directory, not to the java binary itself), e.g.
export JAVA_HOME=/usr/java/jdk1.8.0_92/jre
That file is sourced by ${SPARK_HOME}/bin/load-spark-env.sh, which is invoked by each and every Spark command-line utility (the JVM resolution itself is sketched after this list):
${SPARK_HOME}/bin/spark-shell via ${SPARK_HOME}/bin/spark-class
${SPARK_HOME}/bin/spark-submit via ${SPARK_HOME}/bin/spark-class
...
${SPARK_HOME}/sbin/start-slave.sh
...
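For reference, spark-class picks the JVM roughly like this (paraphrased; the exact contents vary by Spark version), which is why the JAVA_HOME exported in spark-env.sh takes precedence over whatever java is first on the PATH:
# paraphrased excerpt from ${SPARK_HOME}/bin/spark-class
if [ -n "${JAVA_HOME}" ]; then
  RUNNER="${JAVA_HOME}/bin/java"   # JAVA_HOME from spark-env.sh wins
else
  RUNNER="java"                    # otherwise the first 'java' on the PATH is used
fi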
Side note: the Linux alternatives are the standard way to define which JVM is on top of your PATH...
Typical setup with a "fixed" setting, not relying on the priority set by the OpenJDK RPM install:
$ ls -AFl $(which java)
lrwxrwxrwx. 1 root root 22 Feb 15 16:06 /usr/bin/java -> /etc/alternatives/java*
$ alternatives --display java | grep -v slave
java - status is manual.
link currently points to /usr/java/jdk1.8.0_92/jre/bin/java
/usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java - priority 18091
/usr/lib/jvm/jre-1.6.0-openjdk.x86_64/bin/java - priority 16000
/usr/java/jdk1.8.0_92/jre/bin/java - priority 18092
Current `best' version is /usr/java/jdk1.8.0_92/jre/bin/java.
...provided that $PATH is defined properly for the Linux account that launches the Spark slaves!
I am trying to fix an issue with running out of memory, and I want to know whether I need to change these settings in the default configuration file (spark-defaults.conf) in the Spark home folder, or whether I can set them in the code.
I saw the question PySpark: java.lang.OutofMemoryError: Java heap space and it says that it depends on whether I'm running in client mode. I'm running Spark on a cluster and monitoring it using the standalone cluster manager.
But how do I figure out if I'm running Spark in client mode?
If you are running an interactive shell, e.g. pyspark (CLI or via an IPython notebook), by default you are running in client mode. You can easily verify that you cannot run pyspark or any other interactive shell in cluster mode:
$ pyspark --master yarn --deploy-mode cluster
Python 2.7.11 (default, Mar 22 2016, 01:42:54)
[GCC Intel(R) C++ gcc 4.8 mode] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Error: Cluster deploy mode is not applicable to Spark shells.
$ spark-shell --master yarn --deploy-mode cluster
Error: Cluster deploy mode is not applicable to Spark shells.
Examining the contents of the bin/pyspark file may be instructive, too - here is the final line (which is the actual executable):
$ pwd
/home/ctsats/spark-1.6.1-bin-hadoop2.6
$ cat bin/pyspark
[...]
exec "${SPARK_HOME}"/bin/spark-submit pyspark-shell-main --name "PySparkShell" "$#"
i.e. pyspark is actually a script run by spark-submit and given the name PySparkShell, by which you can find it in the Spark History Server UI; and since it is run like that, it goes by whatever arguments (or defaults) are included with its spark-submit command.
Since sc.deployMode is not available in PySpark, you could check the spark.submit.deployMode configuration property:
>>> sc.getConf().get("spark.submit.deployMode")
'client'
This is not available in PySpark
Use sc.deployMode
scala> sc.deployMode
res0: String = client
scala> sc.version
res1: String = 2.1.0-SNAPSHOT
As of Spark 2+, the below works:
for item in spark.sparkContext.getConf().getAll(): print(item)
(u'spark.submit.deployMode', u'client') # will be one of the items in the list.