Issue with Hive while Storing Streaming Data - apache-spark

I have a Kafka Streaming Consumer job which stores the data in Hive table.
The issue is that sometimes the job fails due to Hive and has to be restarted again.
This is happening with all the streaming consumer jobs which are trying to access Hive.
Not able to trace the reason.
Appreciate your suggestions.
Below is the high level error log
INFO ApplicationMaster: Starting the user application in a separate Thread
INFO ApplicationMaster: Waiting for spark context initialization...
WARN SparkConf: The configuration key 'spark.yarn.applicationMaster.waitTries' has been deprecated as of Spark 1.3 and may be removed in the future. Please use the new key 'spark.yarn.am.waitTime' instead.
INFO metastore: Trying to connect to metastore with URI thrift://<server1:port1>
WARN metastore: Failed to connect to the MetaStore Server...
INFO metastore: Trying to connect to metastore with URI thrift://<server1:port1>
WARN metastore: Failed to connect to the MetaStore Server...
INFO metastore: Waiting 1 seconds before next connection attempt.
INFO metastore: Trying to connect to metastore with URI thrift://<server2:port1>
WARN metastore: Failed to connect to the MetaStore Server...
INFO metastore: Trying to connect to metastore with URI thrift://<server1:port1>
WARN metastore: Failed to connect to the MetaStore Server...
INFO metastore: Waiting 1 seconds before next connection attempt.
INFO metastore: Trying to connect to metastore with URI thrift://<server2:port1>
WARN metastore: Failed to connect to the MetaStore Server...
INFO metastore: Trying to connect to metastore with URI thrift://<server1:port1>
WARN metastore: Failed to connect to the MetaStore Server...
INFO metastore: Waiting 1 seconds before next connection attempt.
WARN Hive: Failed to access metastore. This class should not accessed in runtime.
org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
Caused by: java.lang.reflect.InvocationTargetException
Caused by: MetaException(message:Could not connect to meta store using any of the URIs provided. Most recent failure: org.apache.thrift.transport.TTransportException: Peer indicated failure: DIGEST-MD5: IO error acquiring password
INFO metastore: Trying to connect to metastore with URI thrift://<server2:port1>
WARN metastore: Failed to connect to the MetaStore Server...
INFO metastore: Trying to connect to metastore with URI thrift://<server1:port1>
WARN metastore: Failed to connect to the MetaStore Server...
INFO metastore: Waiting 1 seconds before next connection attempt.
INFO metastore: Trying to connect to metastore with URI thrift://<server2:port1>
WARN metastore: Failed to connect to the MetaStore Server...
INFO metastore: Trying to connect to metastore with URI thrift://<server1:port1>
WARN metastore: Failed to connect to the MetaStore Server...
INFO metastore: Waiting 1 seconds before next connection attempt.
INFO metastore: Trying to connect to metastore with URI thrift://<server2:port1>
WARN metastore: Failed to connect to the MetaStore Server...
INFO metastore: Trying to connect to metastore with URI thrift://<server1:port1>
WARN metastore: Failed to connect to the MetaStore Server...
INFO metastore: Waiting 1 seconds before next connection attempt.
ERROR ApplicationMaster: User class threw exception: java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder':
java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder':
Caused by: org.apache.spark.sql.AnalysisException: java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient;
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
Caused by: java.lang.reflect.InvocationTargetException
Caused by: MetaException(message:Could not connect to meta store using any of the URIs provided. Most recent failure: org.apache.thrift.transport.TTransportException: Peer indicated failure: DIGEST-MD5: IO error acquiring password

Related

I want to run spark with yarn, but I get a java.net.ConnectException error

I want to run spark on yarn
root#server01:/export/server/spark# bin/spark-shell --master yarn
But it goes to error like this
root#server01:/export/server/spark# bin/spark-shell --master yarn
22/12/06 07:51:42 WARN Utils: Your hostname, server01 resolves to a loopback address: 127.0.1.1; using 192.168.40.133 instead (on interface ens33)
22/12/06 07:51:42 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
22/12/06 07:51:51 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
22/12/06 07:57:58 ERROR YarnClientSchedulerBackend: The YARN application has already ended! It might have been killed or the Application Master may have failed to start. Check the YARN application logs for more details.
22/12/06 07:57:58 ERROR SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: Application application_1670312235175_0001 failed 2 times due to Error launching appattempt_1670312235175_0001_000002. Got exception: java.net.ConnectException: Call From localhost/127.0.0.1 to localhost:41647 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
spark-defaults.conf
spark.eventLog.enabled true
spark.eventLog.dir hdfs://node1:8020/sparklog/
spark.eventLog.compress true
# spark-yarn jar package
spark.yarn.jars hdfs://node1:8020/spark/jars/*
# spark and yarn history server
spark.yarn.historyServer.address node1:18080
yarn-site.xml
<property>
<name>yarn.log.server.url</name>
<value>http://node1:19888/jobhistory/logs</value>
</property>
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>604800</value>
</property>
I verify the spark.yarn.historyServer to node1:19888 and restart spark,but it doesn't work.

Spark Thrift 3.2.2 impersonate user facing error with metastore authen. SASL negotiation failure, GSS initiate failed

On hadoop kerberized cluster. If im not impersonate user on spark thrift server. It work well. But when i do it. Im facing an error about authentication with metastore.
I flow this document
https://docs.cloudera.com/HDPDocuments/HDP2/HDP-2.6.4/bk_spark-component-guide/content/config-sts-user-imp.html
https://docs.cloudera.com/HDPDocuments/HDP2/HDP-2.6.4/bk_data-access/content/ref-5422cb60-d1d5-425a-b719-ec7bd03ee5d3.1.html
Step 1:
Set hive.server2.enable.doAs = true in Advanced spark-hive-site-override
Add spark.jars = /usr/hdp/current/spark-thriftserver/lib/datanucleus-api-jdo-3.2.6.jar,/usr/hdp/current/spark-thriftserver/lib/datanucleus-core-3.2.10.jar,/usr/hdp/current/spark-thriftserver/lib/datanucleus-rdbms-3.2.9.jar in Custom spark-thrift-sparkconf
Step 2: in Advanced hiveserver2-site
Set hive.security.authorization.enabled = true
Set hive.server2.enable.doAs = true
Set hive.metastore.pre.event.listeners = org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener
Set hive.security.metastore.authorization.manager = org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider
Step 3:
I created a user keytab and princinpal and kinit
Run cli: beeline -u 'jdbc:hive2://:/default;principal=spark3/#;auth=KERBEROS;transportMode=binary'
Result:
Connecting to jdbc:hive2://<host>:<port>/default;principal=spark3/<HOST>#<REAM>;auth=KERBEROS;transportMode=binary
Connected to: Spark SQL (version 3.2.2)
Driver: Hive JDBC (version 3.1.0.3.1.4.0-315)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 3.1.0.3.1.4.0-315 by Apache Hive
Run cli: show databases;
And I'm facing an error like this
Error: org.apache.hive.service.cli.HiveSQLException: Error running query: org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
....
Caused by: org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
....
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
....
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
....
Caused by: java.lang.reflect.InvocationTargetException
....
Caused by: MetaException(message:Could not connect to meta store using any of the URIs provided. Most recent failure: org.apache.thrift.transport.TTransportException: GSS initiate failed
I checked log of spark thrift see like that
22/10/07 15:07:31 INFO ThriftCLIService: Client protocol version: HIVE_CLI_SERVICE_PROTOCOL_V10
22/10/07 15:07:31 INFO HiveSessionImpl: Operation log session directory is created: /tmp/spark3/operation_logs/64eb19a6-1bdc-4ed8-81c9-8881c4251e75
22/10/07 15:07:31 INFO metastore: Trying to connect to metastore with URI thrift://<host>:<port>
22/10/07 15:07:32 INFO metastore: Opened a connection to metastore, current connections: 1
22/10/07 15:07:32 INFO metastore: Connected to metastore.
22/10/07 15:07:39 INFO SparkExecuteStatementOperation: Submitting query 'show databases' with fdcf90cb-74bb-4574-99b7-bfd981ce8010
22/10/07 15:07:39 INFO SparkExecuteStatementOperation: Running query with fdcf90cb-74bb-4574-99b7-bfd981ce8010
22/10/07 15:07:39 INFO metastore: Closed a connection to metastore, current connections: 0
22/10/07 15:07:39 INFO metastore: Trying to connect to metastore with URI thrift://<host>:<port>
22/10/07 15:07:39 ERROR TSaslTransport: SASL negotiation failure
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
22/10/07 15:07:39 WARN metastore: Failed to connect to the MetaStore Server...
22/10/07 15:07:39 INFO metastore: Waiting 5 seconds before next connection attempt.
22/10/07 15:07:44 INFO metastore: Trying to connect to metastore with URI thrift://<host>:<port>
22/10/07 15:07:44 ERROR TSaslTransport: SASL negotiation failure
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
I test connect to spark thrift server successed, but when i run query. Im facing error above. Where am i wrong?
Spark Thrift Server is built upon a single spark application, unfortunately, it does not support impersonation yet.
Maybe you can try Apache Kyuubi https://github.com/apache/incubator-kyuubi

Unable to connect to cassandra by java client through external ip

I'm able to connect to cassandra docker instance using external ip with CQLSH but it is failing to connect from java client.
Server logs:
Server.java:156 - Starting listening for CQL clients on /0.0.0.0:9042 (unencrypted)...
Error from application:
Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /14.24.34.44:9042 (com.datastax.driver.core.exceptions.TransportException: [/14.24.34.44:9042] Cannot connect))
Cassandra version used: 3.11.1

Ambari Hadoop Spark Cluster Firewall Issues

I have just rolled out a Hadoop/Spark cluster in efforts to kick start a data science program at my company. I used Ambari as the manager and installed the Hortonworks distribution (HDFS 2.7.3, Hive 1.2.1, Spark 2.1.1, as well as the other required services. By the way, I am running RHEL 7. I have 2 name nodes, 10 data nodes, 1 hive node and 1 management node (Ambari).
I built a list of firewall ports based on Apache and Ambari documentation and had my infrastructure guys push those rules. I ran into an issue with Spark wanting to pick random ports. When I attempted to run a Spark job (the traditional Pi example), it would fail, as I did not have the whole ephemeral port range open. Since we will probably be running multiple jobs, it makes sense to let Spark handle this and just choose from the ephemeral range of ports (1024 - 65535) rather than specifying a single port. I know I can pick a range, but to make it easy I just asked my guys to open the whole ephemeral range. At first my infrastructure guys balked at that, but when I told them the purpose, they went ahead and did so.
Based on that, I thought I had my issue fixed, but when I attempt to run a job, it still fails with:
Log Type: stderr
Log Upload Time: Thu Oct 12 11:31:01 -0700 2017
Log Length: 14617
Showing 4096 bytes of 14617 total. Click here for the full log.
Failed to connect to driver at 10.10.98.191:33937, retrying ...
17/10/12 11:28:52 ERROR ApplicationMaster: Failed to connect to driver at 10.10.98.191:33937, retrying ...
17/10/12 11:28:53 ERROR ApplicationMaster: Failed to connect to driver at 10.10.98.191:33937, retrying ...
17/10/12 11:28:54 ERROR ApplicationMaster: Failed to connect to driver at 10.10.98.191:33937, retrying ...
17/10/12 11:28:55 ERROR ApplicationMaster: Failed to connect to driver at 10.10.98.191:33937, retrying ...
17/10/12 11:28:56 ERROR ApplicationMaster: Failed to connect to driver at 10.10.98.191:33937, retrying ...
17/10/12 11:28:57 ERROR ApplicationMaster: Failed to connect to driver at 10.10.98.191:33937, retrying ...
17/10/12 11:28:57 ERROR ApplicationMaster: Failed to connect to driver at 10.10.98.191:33937, retrying ...
17/10/12 11:28:59 ERROR ApplicationMaster: Failed to connect to driver at 10.10.98.191:33937, retrying ...
17/10/12 11:29:00 ERROR ApplicationMaster: Failed to connect to driver at 10.10.98.191:33937, retrying ...
17/10/12 11:29:01 ERROR ApplicationMaster: Failed to connect to driver at 10.10.98.191:33937, retrying ...
17/10/12 11:29:02 ERROR ApplicationMaster: Failed to connect to driver at 10.10.98.191:33937, retrying ...
17/10/12 11:29:03 ERROR ApplicationMaster: Failed to connect to driver at 10.10.98.191:33937, retrying ...
17/10/12 11:29:04 ERROR ApplicationMaster: Failed to connect to driver at 10.10.98.191:33937, retrying ...
17/10/12 11:29:05 ERROR ApplicationMaster: Failed to connect to driver at 10.10.98.191:33937, retrying ...
17/10/12 11:29:06 ERROR ApplicationMaster: Failed to connect to driver at 10.10.98.191:33937, retrying ...
17/10/12 11:29:06 ERROR ApplicationMaster: Failed to connect to driver at 10.10.98.191:33937, retrying ...
17/10/12 11:29:07 ERROR ApplicationMaster: Failed to connect to driver at 10.10.98.191:33937, retrying ...
17/10/12 11:29:09 ERROR ApplicationMaster: Failed to connect to driver at 10.10.98.191:33937, retrying ...
17/10/12 11:29:10 ERROR ApplicationMaster: Failed to connect to driver at 10.10.98.191:33937, retrying ...
17/10/12 11:29:11 ERROR ApplicationMaster: Failed to connect to driver at 10.10.98.191:33937, retrying ...
17/10/12 11:29:12 ERROR ApplicationMaster: Failed to connect to driver at 10.10.98.191:33937, retrying ...
17/10/12 11:29:13 ERROR ApplicationMaster: Failed to connect to driver at 10.10.98.191:33937, retrying ...
17/10/12 11:29:14 ERROR ApplicationMaster: Failed to connect to driver at 10.10.98.191:33937, retrying ...
17/10/12 11:29:15 ERROR ApplicationMaster: Failed to connect to driver at 10.10.98.191:33937, retrying ...
17/10/12 11:29:15 ERROR ApplicationMaster: Uncaught exception:
org.apache.spark.SparkException: Failed to connect to driver!
at org.apache.spark.deploy.yarn.ApplicationMaster.waitForSparkDriver(ApplicationMaster.scala:607)
at org.apache.spark.deploy.yarn.ApplicationMaster.runExecutorLauncher(ApplicationMaster.scala:461)
at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:283)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$main$1.apply$mcV$sp(ApplicationMaster.scala:783)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:67)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:66)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:66)
at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:781)
at org.apache.spark.deploy.yarn.ExecutorLauncher$.main(ApplicationMaster.scala:804)
at org.apache.spark.deploy.yarn.ExecutorLauncher.main(ApplicationMaster.scala)
17/10/12 11:29:15 INFO ApplicationMaster: Final app status: FAILED, exitCode: 10, (reason: Uncaught exception: org.apache.spark.SparkException: Failed to connect to driver!)
17/10/12 11:29:15 INFO ShutdownHookManager: Shutdown hook called
At first I thought maybe I had some sort of misconfiguration with Spark and the namenodes/datanodes. However, to test it, I simply stopped firewalld on every node and attempted the job again and it worked just fine.
So, my question - I have the entire 1024 - 65535 port range open - I can see the Spark drivers are trying to connect on those high ports (as show above - 30k - 40k range). However, for some reason when the firewall is on, it fails and when its off it works. I checked the firewall rules and sure enough, the ports are open - and those rules are working as I can access the web services for Ambair, Yarn and HFDS which are specified in the same firewalld xml rules file....
I am new to Hadoop/Spark, so I am wondering is there something I am missing? Is there some lower port under 1024 I need to account for? Here is a list of the ports below 1024 I have open, in addition to the 1024 - 65535 port range:
88
111
443
1004
1006
1019
It's quite possible I missed a lower number port that I really need and just don't know it. Above that, everything else should be handled by the 1024 - 65535 port range.
Ok, so working with some folks on the Hortonworks community, I was was able to come up with a solution. Basically, you need to have at least one port defined, but you can extend that by specifying the spark.port.MaxRetries = xxxxx. By combining this setting along with the spark.driver.port = xxxxx, you can have a range starting at the spark.driver.port and ending with the spark.port.maxRetries.
If you use Ambari as your manager, the settings are under "Custom spark2-defaults" section (I assume under a fully open-source stack install, this would just be a setting under the normal Spark config):
I was advised to separate out these ports by 32 count blocks, for example, if you start at 40000 for your driver, you should start the spark.blockManager.port at 40033, etc.. See this post at:
https://community.hortonworks.com/questions/141484/spark-jobs-failing-firewall-issue.html

Error while creating spark session

I am trying to connect to DSE 5.1 cluster using the spark session. I am getting the below error when using the cluster url.However if i set the master to localhost it all works fine.
SparkSession.builder().appName(getClass.getSimpleName).master(""spark://10.10.10.10:1010"") >>> error.
SparkSession.builder().appName(getClass.getSimpleName).master("localhost") >>
Works fine.
Stack trace of the error:
17/05/05 12:44:22 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/05/05 12:45:23 ERROR StandaloneSchedulerBackend: Application has been killed. Reason: All masters are unresponsive! Giving up.
17/05/05 12:45:23 WARN StandaloneSchedulerBackend: Application ID is not initialized yet.
17/05/05 12:45:23 WARN StandaloneAppClient$ClientEndpoint: Drop UnregisterApplication(null) because has not yet connected to master
17/05/05 12:45:23 ERROR MapOutputTrackerMaster: Error communicating with MapOutputTracker
java.lang.InterruptedException

Resources