Spark Elasticsearch EsHadoopIllegalArgumentException: unable to find keystore with valid URI

I'm trying to connect Spark to my Elasticsearch cluster with SSL.
Setup
Spark 2.4.0 from CDH 6.3.2 (Cloudera)
ElasticSearch 7.6.1 (Open Distro)
elasticsearch-hadoop-7.6.1.jar
Considering:
1) I have already managed to authenticate Logstash over SSL with a manually created PKCS12 keystore.
2) Connecting Spark to ES works without security.
Here is the Spark conf provided:
spark.es.nodes=node1
spark.es.port=9200
spark.es.net.ssl=true
spark.es.net.ssl.keystore.location= ===> see below for what I tried
spark.es.net.ssl.keystore.type=PKCS12
spark.es.net.ssl.cert.allow.self.signed=true
spark.es.net.http.auth.user=admin
spark.es.net.http.auth.pass=admin
spark.es.nodes.wan.only=false // also tried true
Doing:
spark.read.format("org.elasticsearch.spark.sql")
.option("es.query", "?q=*:*")
.load("spark/docs")
.show
====================================================
Filesystem values tried with spark.es.net.ssl.keystore.location (after copying admin.pkcs12 to all nodes):
file:///PATH/certs/admin.pkcs12
Error:
org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'
... elided
Caused by: org.elasticsearch.hadoop.EsHadoopIllegalStateException: Cannot initialize SSL - Get Key failed: null
at org.elasticsearch.hadoop.rest.commonshttp.SSLSocketFactory.createSSLContext(SSLSocketFactory.java:175)
at org.elasticsearch.hadoop.rest.commonshttp.SSLSocketFactory.getSSLContext(SSLSocketFactory.java:160)
at org.elasticsearch.hadoop.rest.commonshttp.SSLSocketFactory.createSocket(SSLSocketFactory.java:129)
at org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
at org.elasticsearch.hadoop.rest.commonshttp.CommonsHttpTransport.doExecute(CommonsHttpTransport.java:685)
at org.elasticsearch.hadoop.rest.commonshttp.CommonsHttpTransport.execute(CommonsHttpTransport.java:664)
at org.elasticsearch.hadoop.rest.NetworkClient.execute(NetworkClient.java:116)
at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:432)
at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:428)
at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:388)
at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:392)
at org.elasticsearch.hadoop.rest.RestClient.get(RestClient.java:168)
at org.elasticsearch.hadoop.rest.RestClient.mainInfo(RestClient.java:745)
at org.elasticsearch.hadoop.rest.InitializationUtils.discoverClusterInfo(InitializationUtils.java:330)
... 61 more
Caused by: java.security.UnrecoverableKeyException: Get Key failed: null
at sun.security.pkcs12.PKCS12KeyStore.engineGetKey(PKCS12KeyStore.java:435)
at java.security.KeyStore.getKey(KeyStore.java:1023)
at sun.security.ssl.SunX509KeyManagerImpl.<init>(SunX509KeyManagerImpl.java:133)
at sun.security.ssl.KeyManagerFactoryImpl$SunX509.engineInit(KeyManagerFactoryImpl.java:70)
at javax.net.ssl.KeyManagerFactory.init(KeyManagerFactory.java:256)
at org.elasticsearch.hadoop.rest.commonshttp.SSLSocketFactory.loadKeyManagers(SSLSocketFactory.java:217)
at org.elasticsearch.hadoop.rest.commonshttp.SSLSocketFactory.createSSLContext(SSLSocketFactory.java:173)
... 78 more
Caused by: java.lang.NullPointerException
at sun.security.pkcs12.PKCS12KeyStore.engineGetKey(PKCS12KeyStore.java:374)
... 84 more
====================================================
I copied a valid admin.pkcs12 keystore to HDFS => /user/company/ with 777 rights (as I write this: is that too permissive, as with SSH keys?).
//returns true
FileSystem.get(spark.sparkContext.hadoopConfiguration).exists(new Path("hdfs://namenode:8020/user/company/admin.pkcs12"))
HDFS values tried with spark.es.net.ssl.keystore.location:
hdfs:///namenode:8020/user/company/admin.pkcs12
hdfs://namenode:8020/user/company/admin.pkcs12
/user/company/admin.pkcs12
Error:
org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'
... elided
Caused by: org.elasticsearch.hadoop.EsHadoopIllegalStateException: Cannot initialize SSL - Expected to find keystore file at [...] but was unable to. Make sure that it is available on the classpath, or if not, that you have specified a valid URI.
at org.elasticsearch.hadoop.rest.commonshttp.SSLSocketFactory.createSSLContext(SSLSocketFactory.java:175)
at org.elasticsearch.hadoop.rest.commonshttp.SSLSocketFactory.getSSLContext(SSLSocketFactory.java:160)
at org.elasticsearch.hadoop.rest.commonshttp.SSLSocketFactory.createSocket(SSLSocketFactory.java:129)
at org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
at org.elasticsearch.hadoop.rest.commonshttp.CommonsHttpTransport.doExecute(CommonsHttpTransport.java:685)
at org.elasticsearch.hadoop.rest.commonshttp.CommonsHttpTransport.execute(CommonsHttpTransport.java:664)
at org.elasticsearch.hadoop.rest.NetworkClient.execute(NetworkClient.java:116)
at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:432)
at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:428)
at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:388)
at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:392)
at org.elasticsearch.hadoop.rest.RestClient.get(RestClient.java:168)
at org.elasticsearch.hadoop.rest.RestClient.mainInfo(RestClient.java:745)
at org.elasticsearch.hadoop.rest.InitializationUtils.discoverClusterInfo(InitializationUtils.java:330)
... 61 more
Caused by: org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Expected to find keystore file at [...] but was unable to. Make sure that it is available on the classpath, or if not, that you have specified a valid URI.
at org.elasticsearch.hadoop.rest.commonshttp.SSLSocketFactory.loadKeyStore(SSLSocketFactory.java:195)
at org.elasticsearch.hadoop.rest.commonshttp.SSLSocketFactory.loadKeyManagers(SSLSocketFactory.java:215)
at org.elasticsearch.hadoop.rest.commonshttp.SSLSocketFactory.createSSLContext(SSLSocketFactory.java:173)
I tried JKS too.
What am I missing?

I was getting this error because the keystore password was missing:
spark.es.net.ssl.keystore.pass=PASSWORD
With the password set, the local filesystem location works:
file:///PATH/certs/admin.pkcs12
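Putting it together, a minimal sketch of the working read looks like this (host, paths, credentials, and the spark/docs index are the placeholders from this question, and note the spark. prefix applies only in spark-defaults.conf, not in .option calls):

// Sketch: Spark reading from Elasticsearch over SSL with a PKCS12 keystore.
// All values are placeholders taken from the question above.
val df = spark.read.format("org.elasticsearch.spark.sql")
  .option("es.nodes", "node1")
  .option("es.port", "9200")
  .option("es.net.ssl", "true")
  .option("es.net.ssl.keystore.location", "file:///PATH/certs/admin.pkcs12") // must exist locally on every node
  .option("es.net.ssl.keystore.type", "PKCS12")
  .option("es.net.ssl.keystore.pass", "PASSWORD") // the missing piece
  .option("es.net.ssl.cert.allow.self.signed", "true")
  .option("es.net.http.auth.user", "admin")
  .option("es.net.http.auth.pass", "admin")
  .option("es.query", "?q=*:*")
  .load("spark/docs")
df.show()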

Related

Spark 2.4 Got an error when resolving hostNames Falling back to /default-rack

Running an application in client mode, the driver logs are filled with the info message below. Any idea how to resolve this? Are there Spark configs to be updated, or is something missing?
[INFO ][dispatcher-event-loop-29][SparkRackResolver:54] Got an error when resolving hostNames. Falling back to /default-rack for all
The job runs fine, and this message does not appear in the executor logs.
Check this bug:
https://issues.apache.org/jira/browse/SPARK-28005
If you want to suppress this in the logs, you can try adding this to your log4j.properties:
log4j.logger.org.apache.spark.deploy.yarn.SparkRackResolver=ERROR
This can happen when using spark-submit with master yarn in client deploy mode (i.e. not using --deploy-mode cluster) and the path to the topology.py script in your core-site.xml is incorrect.
The path to core-site.xml can be set via the HADOOP_CONF_DIR (or YARN_CONF_DIR) environment variable.
Check the path in the net.topology.script.file.name parameter of core-site.xml.
If the path is incorrect, deploying the driver in local mode will fail with the following warning (a config sketch follows the log excerpt):
23/01/15 18:39:43 WARN ScriptBasedMapping: Exception running /home/alexander/xxx/.conf/topology.py 10.15.21.199
java.io.IOException: Cannot run program "/etc/hadoop/conf.cloudera.yarn/topology.py" (in directory "/home/john"): error=2, No such file or directory
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
...
23/01/15 18:39:43 INFO SparkRackResolver: Got an error when resolving hostNames. Falling back to /default-rack for all
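For reference, the relevant core-site.xml entry looks roughly like this (the value below reuses the path from the log above; point it at wherever topology.py actually exists on the driver host):

<property>
  <name>net.topology.script.file.name</name>
  <value>/etc/hadoop/conf.cloudera.yarn/topology.py</value>
</property>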

How to renew Kerberos ticket on spark yarn client mode?

I was using Spark 1.6.0 to access data on Kerberos-enabled HDFS via the API DataFrame.read.parquet($path).
My application is deployed as Spark on YARN in client mode.
By default, the Kerberos ticket expires every 24 hours. Everything works fine for the first 24 hours, but reading files fails after 24 hours (or more, e.g. 27 hours).
I have tried several ways to log in and renew the ticket; none of them work (a sketch of the timer approach follows the list):
Set spark.yarn.keytab and spark.yarn.principal in spark-defaults.conf
Set --keytab and --principal in the spark-submit command line
Start a timer in code to call UserGroupInformation.getLoginUser().checkTGTAndReloginFromKeytab() every 2 hours.
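For context, the timer from the third attempt looked roughly like this (a sketch using the standard Hadoop UserGroupInformation API; the scheduling details are illustrative):

import java.util.concurrent.{Executors, TimeUnit}
import org.apache.hadoop.security.UserGroupInformation

// Re-login from the keytab every 2 hours; checkTGTAndReloginFromKeytab
// is a no-op unless the TGT is close to expiring.
val scheduler = Executors.newSingleThreadScheduledExecutor()
scheduler.scheduleAtFixedRate(new Runnable {
  override def run(): Unit =
    UserGroupInformation.getLoginUser().checkTGTAndReloginFromKeytab()
}, 2, 2, TimeUnit.HOURS)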
Error details are:
WARN [org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:671)] - Couldn't setup connection for adam/cluster1@DEV.COM to cdh01/192.168.1.51:8032
DEBUG [org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1632)] - PrivilegedActionException as:adam/cluster1@DEV.COM (auth:KERBEROS) cause:java.io.IOException: Couldn't setup connection for adam/cluster1@DEV.COM to cdh01/192.168.1.51:8032
ERROR [org.apache.spark.Logging$class.logError(Logging.scala:95)] - Failed to contact YARN for application application_1490607689611_0002.
java.io.IOException: Failed on local exception: java.io.IOException: Couldn't setup connection for adam/cluster1@DEV.COM to cdh01/192.168.1.51:8032; Host Details : local host is: "cdh05/192.168.1.41"; destination host is: "cdh01":8032;
The problem was solved.
It was caused by the wrong version of the Hadoop libraries: the Spark 1.6 assembly jar refers to an old Hadoop version. I downloaded Spark again without the built-in Hadoop libraries and pointed it at a third-party Hadoop 2.8 library instead.
Then it just works.
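For anyone reproducing this: with a "Hadoop-free" Spark build you point Spark at the external Hadoop libraries through its classpath, typically in conf/spark-env.sh (a sketch; assumes the hadoop command is on the PATH):

# Use the external Hadoop installation's jars instead of a bundled version.
export SPARK_DIST_CLASSPATH=$(hadoop classpath)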

Provider is not supported, or was incorrectly entered

I have installed a Gerrit server on localhost, and after making a successful connection the web UI launched. There I registered with my Gmail ID via the "Sign in with a Launchpad ID" option.
It worked earlier, but now it shows the error "Provider is not supported, or was incorrectly entered." when I try to log in. I searched a lot and found some solutions pointing at security issues in the Java installed on the system. I have Oracle JDK 8, not OpenJDK, on my system, so should I switch to OpenJDK? Here are the error messages from my log file.
Caused by: java.lang.RuntimeException: Unexpected error: java.security.InvalidAlgorithmParameterException: the trustAnchors parameter must be non-empty
at sun.security.validator.PKIXValidator.<init>(PKIXValidator.java:90)
at sun.security.validator.Validator.getInstance(Validator.java:179)
at sun.security.ssl.X509TrustManagerImpl.getValidator(X509TrustManagerImpl.java:312)
at sun.security.ssl.X509TrustManagerImpl.checkTrustedInit(X509TrustManagerImpl.java:171)
at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:184)
at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:124)
at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1491)
at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:216)
at sun.security.ssl.Handshaker.processLoop(Handshaker.java:979)
at sun.security.ssl.Handshaker.process_record(Handshaker.java:914)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1062)
at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1375)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1403)
... 66 more
Caused by: java.security.InvalidAlgorithmParameterException: the trustAnchors parameter must be non-empty
at java.security.cert.PKIXParameters.setTrustAnchors(PKIXParameters.java:200)
at java.security.cert.PKIXParameters.<init>(PKIXParameters.java:120)
at java.security.cert.PKIXBuilderParameters.<init>(PKIXBuilderParameters.java:104)
at sun.security.validator.PKIXValidator.<init>(PKIXValidator.java:88)
... 78 more
Issue fixed!
As I had been using Oracle Java 8, I installed OpenJDK 7 along with the CA certificates package:
sudo apt-get install ca-certificates-java
But the issue was resolved only after I changed the Java home variable in the gerrit.config file:
javaHome = /usr/lib/jvm/java-7-openjdk-amd64/jre
Now the issue is fixed for me.
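For reference, javaHome belongs in the [container] section of gerrit.config (the path shown is the one from this fix; adjust it to your JDK):

[container]
    javaHome = /usr/lib/jvm/java-7-openjdk-amd64/jre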

FileNotFoundException with Titan (titan-all)

I'm trying to set up a basic Titan example. Following the docs, I tried running bin/gremlin-server.sh -i com.thinkaurelius.titan titan-all 1.0.0, which throws:
Could not install the dependency: java.io.FileNotFoundException: /usr/share/titan/ext/titan-all/plugin/titan-all-1.0.0.jar (No such file or directory)
java.lang.RuntimeException: java.io.FileNotFoundException: /usr/share/titan/ext/titan-all/plugin/titan-all-1.0.0.jar (No such file or directory)
at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:215)
at org.apache.tinkerpop.gremlin.groovy.util.DependencyGrabber.getAdditionalDependencies(DependencyGrabber.groovy:165)
at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:215)
at org.apache.tinkerpop.gremlin.groovy.util.DependencyGrabber.copyDependenciesToPath(DependencyGrabber.groovy:99)
at org.apache.tinkerpop.gremlin.server.util.GremlinServerInstall.main(GremlinServerInstall.java:38)
Caused by: java.io.FileNotFoundException: /usr/share/titan/ext/titan-all/plugin/titan-all-1.0.0.jar (No such file or directory)
at java.util.zip.ZipFile.open(Native Method)
at java.util.zip.ZipFile.<init>(ZipFile.java:219)
at java.util.zip.ZipFile.<init>(ZipFile.java:149)
at java.util.jar.JarFile.<init>(JarFile.java:166)
at java.util.jar.JarFile.<init>(JarFile.java:130)
at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:215)
at org.apache.tinkerpop.gremlin.groovy.util.DependencyGrabber.getAdditionalDependencies(DependencyGrabber.groovy:148)
... 3 more
I also tried it from gremlin.sh:
root#ubuntu:/usr/share/titan# bin/gremlin.sh
\,,,/
(o o)
-----oOOo-(3)-oOOo-----
plugin activated: aurelius.titan
plugin activated: tinkerpop.server
plugin activated: tinkerpop.utilities
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/share/titan/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/share/titan/lib/logback-classic-1.1.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
14:45:44 INFO org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph - HADOOP_GREMLIN_LIBS is set to: /usr/share/titan/lib
plugin activated: tinkerpop.hadoop
plugin activated: tinkerpop.tinkergraph
gremlin> :install com.thinkaurelius.titan titan-all 1.0.0
==>java.io.FileNotFoundException: /usr/share/titan/ext/titan-all/plugin/titan-all-1.0.0.jar (No such file or directory)
gremlin>
I've confirmed that Groovy has the file:
root#ubuntu:/usr/share/titan# ls ~/.groovy/grapes/com.thinkaurelius.titan/titan-all/jars
titan-all-1.0.0.jar
So now I'm stumped. Has anyone come across this before?
EDIT: Some notes on how I got here.
My first attempt at getting this working was to use the all-inclusive zip file as per the docs. I changed gremlin-server.yaml to:
graph: conf/titan-cassandra-es.properties
That threw:
407 [main] WARN org.apache.tinkerpop.gremlin.server.GremlinServer - Graph [graph] configured at [conf/titan-cassandra-es.properties] could not be instantiated and will not be available in Gremlin Server. GraphFactory message: Configuration must contain a valid 'gremlin.graph' setting
java.lang.RuntimeException: Configuration must contain a valid 'gremlin.graph' setting
OK, a simple Google search tells me I need to add this to conf/titan-cassandra-es.properties:
gremlin.graph=com.thinkaurelius.titan.core.TitanFactory
At which point, I get:
484 [main] WARN org.apache.tinkerpop.gremlin.server.GremlinServer - Graph [graph] configured at [conf/titan-cassandra-es.properties] could not be instantiated and will not be available in Gremlin Server. GraphFactory message: GraphFactory could not instantiate this Graph implementation [class com.thinkaurelius.titan.core.TitanFactory]
java.lang.RuntimeException: GraphFactory could not instantiate this Graph implementation [class com.thinkaurelius.titan.core.TitanFactory]
This leads me to believe that I'm missing com.thinkaurelius.titan.core.TitanFactory, which is curious, since $TITAN_HOME/lib does in fact contain titan-all-1.0.0.jar. So I assumed (perhaps wrongly) that I need to run the titan-all install to make it actually load the jars.
The basic install for Titan is to unzip titan-1.0.0-hadoop1.zip. That is it!
Download it from http://titandb.io
http://s3.thinkaurelius.com/docs/titan/1.0.0/getting-started.html
It is already packaged with the Titan plugins, so you don't need to install them into the Gremlin Console or Gremlin Server.
If you want to try the Titan Server, there is a pre-packaged titan.sh script which automatically starts Cassandra and Elasticsearch with the server.
http://s3.thinkaurelius.com/docs/titan/1.0.0/server.html#_getting_started
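For example, starting the pre-packaged server from the root of the unzipped distribution looks like this (a sketch of the documented usage):

# Starts Cassandra, Elasticsearch, and Gremlin Server together.
bin/titan.sh start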
For anyone who comes across this strangeness: read the whole stack trace. It turns out that way down at the bottom it shows the real issue; it couldn't connect to Cassandra because I had not enabled Thrift.
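For reference, enabling Thrift is a standard Cassandra setting, either one-off on a running node or persistently in its config (both options below are stock Cassandra, not Titan-specific):

# One-off, on a running node:
nodetool enablethrift
# Or persistently, in cassandra.yaml:
# start_rpc: true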

java.security.NoSuchAlgorithmException: Algorithm PBKDF2WithHmacSHA1 not available

My web server, Orion 1.5.4, runs on JRE 1.4.2. When I run
SecretKeyFactory factory = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA1");
the following exception is thrown:
java.security.NoSuchAlgorithmException: Algorithm PBKDF2WithHmacSHA1 not available
I googled and found that I need to add the Bouncy Castle provider, so I downloaded bcprov-jdk14-150.jar, placed it on the classpath, and installed the unlimited policy files in the JVM. Then when I run the program, an error is thrown at this line:
aesCipher.init(Cipher.DECRYPT_MODE,secretKey, new IvParameterSpec(ivByte));
The error message is:
Caused by: java.lang.SecurityException: Cannot set up certs for trusted CAs
at javax.crypto.SunJCE_b.(DashoA12275)
... 15 more
Caused by: java.lang.SecurityException: Jurisdiction policy files are not signed by trusted signers!
at javax.crypto.SunJCE_b.a(DashoA12275)
at javax.crypto.SunJCE_b.g(DashoA12275)
at javax.crypto.SunJCE_b.f(DashoA12275)
at javax.crypto.SunJCE_t.run(DashoA12275)
at java.security.AccessController.doPrivileged(Native Method)
... 16 more
How do I solve this?
I found the problem: I had mistakenly downloaded the unlimited policy files for Java 1.6, when I should have downloaded the ones for Java 1.4.
Thanks.
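For reference, the unlimited-strength policy files are installed by overwriting the two jars in the JRE's security directory (the standard layout; adjust JAVA_HOME and make sure the download matches your Java version):

# Copy the version-matching unlimited policy jars over the restricted ones.
cp local_policy.jar US_export_policy.jar $JAVA_HOME/jre/lib/security/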
