Data transfer between two Kerberos-secured clusters - security

I am trying to transfer data between two Kerberos-secured clusters. The issue I am facing is that I have no access to change the configuration on the source cluster; I need to make every change on the destination cluster. Is there any way to set up a cross-realm trust between the two clusters without editing any configuration on the source cluster?

If you are using distcp, you will have to make sure the KDCs of both clusters know about each other, by editing krb5.conf on each cluster to add [realms] and [domain_realm] entries for the other cluster, as follows:
[realms]
  <CLUSTER2_REALM> = {
    kdc = <cluster2_server_kdc_host>:88
    admin_server = <cluster2_server_kdc_host>:749
    default_domain = <cluster2_domain>
  }

[domain_realm]
  <cluster2_nn1_host> = <CLUSTER2_REALM>
  <cluster2_nn2_host> = <CLUSTER2_REALM>
Do the same on cluster2, with the CLUSTER1 details.
Then you need to create the cross-realm trust principals on both clusters:
addprinc -e "aes128-cts-hmac-sha1-96:normal aes256-cts-hmac-sha1-96:normal" krbtgt/<CLUSTER1_REALM>@<CLUSTER2_REALM>
modprinc -maxrenewlife <n>day krbtgt/<CLUSTER1_REALM>@<CLUSTER2_REALM>
The following rules need to be set in hadoop.security.auth_to_local.
In Cluster1:
RULE:[1:$1@$0](.*@\Q<CLUSTER2_REALM>\E$)s/@\Q<CLUSTER2_REALM>\E$//
RULE:[2:$1@$0](.*@\Q<CLUSTER2_REALM>\E$)s/@\Q<CLUSTER2_REALM>\E$//
In Cluster2:
RULE:[1:$1@$0](.*@\Q<CLUSTER1_REALM>\E$)s/@\Q<CLUSTER1_REALM>\E$//
RULE:[2:$1@$0](.*@\Q<CLUSTER1_REALM>\E$)s/@\Q<CLUSTER1_REALM>\E$//
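A quick way to sanity-check these mappings (a sketch only; the principal below is a placeholder) is the HadoopKerberosName helper, which prints the short name a rule produces:
# Sketch: run on a node that already has the updated auth_to_local rules; substitute a real principal.
hadoop org.apache.hadoop.security.HadoopKerberosName someuser@<CLUSTER2_REALM>
# With the rules above this should map to the short name "someuser".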
Restart the KDC and kadmin services:
/etc/init.d/krb5kdc stop
/etc/init.d/kadmin stop
/etc/init.d/krb5kdc start
/etc/init.d/kadmin start
Failover or restart the NameNodes.
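Once the trust is in place, a hedged end-to-end check (host names, principal and paths below are placeholders, not from the original answer) is to kinit and run distcp from the destination cluster:
# Placeholders only: substitute your own principal, NameNode hosts and paths.
kinit someuser@<CLUSTER2_REALM>
hadoop distcp \
  hdfs://<cluster1_active_nn>:8020/source/path \
  hdfs://<cluster2_active_nn>:8020/target/path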

Related

Kerberos: Spark UGI credentials are not getting passed down to Hive

I'm using Spark 2.4 on a Kerberos-enabled cluster, where I'm trying to run a query via the spark-sql shell.
The simplified setup basically looks like this: a spark-sql shell running on one host in a YARN cluster -> an external Hive metastore running on one host -> S3 to store table data.
When I launch the spark-sql shell with DEBUG logging enabled, this is what I see in the logs:
> bin/spark-sql --proxy-user proxy_user
...
DEBUG HiveDelegationTokenProvider: Getting Hive delegation token for proxy_user against hive/_HOST@REALM.COM at thrift://hive-metastore:9083
DEBUG UserGroupInformation: PrivilegedAction as:spark/spark_host@REALM.COM (auth:KERBEROS) from:org.apache.spark.deploy.security.HiveDelegationTokenProvider.doAsRealUser(HiveDelegationTokenProvider.scala:130)
This means that Spark made a call to fetch the delegation token from the Hive metastore and then added it to the list of credentials for the UGI. This is the piece of code in Spark which does that. I also verified in the metastore logs that the get_delegation_token() call was being made.
Now when I run a simple query like create table test_table (id int) location "s3://some/prefix"; I get hit with an AWS credentials error. I modified the Hive metastore code and added the following right before the Hadoop FileSystem is initialized (org/apache/hadoop/hive/metastore/Warehouse.java):
public static FileSystem getFs(Path f, Configuration conf) throws MetaException {
  ...
  try {
    // get the current user
    UserGroupInformation ugi = UserGroupInformation.getCurrentUser();
    LOG.info("UGI information: " + ugi);
    Collection<Token<? extends TokenIdentifier>> tokens = ugi.getCredentials().getAllTokens();
    // print all the tokens it has
    for (Token<? extends TokenIdentifier> token : tokens) {
      LOG.info(token);
    }
  } catch (IOException e) {
    e.printStackTrace();
  }
  ...
}
In the metastore logs, this does print the correct UGI information:
UGI information: proxy_user (auth:PROXY) via hive/hive-metastore@REALM.COM (auth:KERBEROS)
but there are no tokens present in the UGI. It looks like the Spark code adds the token with the alias hive.server2.delegation.token, but I don't see it in the UGI. This makes me suspect that the UGI scope is somehow isolated and not shared between spark-sql and the Hive metastore. How do I go about solving this?
Spark is not picking up your Kerberos identity: it asks each filesystem to issue a "delegation token" which lets the caller interact with that service, and that service alone. This is more restricted and so more secure.
The problem here is that Spark collects delegation tokens from every filesystem that can issue them, and since your S3 connector isn't issuing any, nothing comes down.
Now, Apache Hadoop 3.3.0's S3A connector can be set to issue your AWS credentials inside a delegation token or, for bonus security, to ask AWS for session credentials and send only those over. But (a) you need a Spark build with those dependencies, and (b) Hive needs to be using those credentials to talk to S3.
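As a sketch only (assuming a Spark build on Hadoop 3.3.0+ with the matching hadoop-aws module, and that the metastore side can also use the credentials), the S3A delegation-token support is switched on via a Hadoop property passed through Spark's spark.hadoop.* prefix:
# Sketch: verify the binding class and property name against your Hadoop version's S3A delegation-token docs.
bin/spark-sql --proxy-user proxy_user \
  --conf spark.hadoop.fs.s3a.delegation.token.binding=org.apache.hadoop.fs.s3a.auth.delegation.SessionTokenBinding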

Spark RDD.pipe: run a bash script as a specific user

I noticed that RDD.pipe(Seq("/tmp/test.sh")) runs the shell script as the yarn user. That is problematic because it allows the Spark user to access files that should only be accessible to the yarn user.
What is the best way to address this?
Calling sudo -u sparkuser is not a clean solution; I would hate to even consider that.
I am not sure whether it is Spark's fault that pipe() is treated differently, but I opened a similar issue in JIRA: https://issues.apache.org/jira/projects/SPARK/issues/SPARK-26101
Now on to the problem. In a YARN cluster, Spark's pipe() asks for a container; whether your Hadoop is non-secure or secured by Kerberos determines whether the container runs as the yarn/nobody user or as the user who launched it, i.e. your actual user.
Either use Kerberos to secure your Hadoop or, if you don't want to go through securing it, set two YARN configs that use the Linux users/groups to launch the container. Note that you must have the same users/groups across all the nodes in your cluster, otherwise this won't work (perhaps use LDAP/AD to sync your users/groups).
Set these:
yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users = false
yarn.nodemanager.container-executor.class = org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor
Source: https://hadoop.apache.org/docs/r2.7.4/hadoop-yarn/hadoop-yarn-site/NodeManagerCgroups.html
(this is the same even in Hadoop 3.0)
This fix worked on the latest Cloudera CDH 5.15.1 (yarn-site.xml):
http://community.cloudera.com/t5/Batch-Processing-and-Workflow/YARN-force-nobody-user-on-all-jobs-and-so-they-fail/m-p/82572/highlight/true#M3882
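A minimal yarn-site.xml sketch with just those two properties (values exactly as listed above) might look like this:
<!-- Sketch only: add to yarn-site.xml on every NodeManager, then restart YARN. -->
<property>
  <name>yarn.nodemanager.container-executor.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
</property>
<property>
  <name>yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users</name>
  <value>false</value>
</property>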
Example:
val test = sc.parallelize(Seq("test user")).repartition(1)
val piped = test.pipe(Seq("whoami"))
val c = piped.collect()
test: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[4] at repartition at <console>:25
piped: org.apache.spark.rdd.RDD[String] = PipedRDD[5] at pipe at <console>:25
c: Array[String] = Array(maziyar)
After setting those configs in yarn-site.xml and syncing all the users/groups among the nodes, this returns the username of the user who started the Spark session.

Lagom external Cassandra authentication

I have been trying to set up an external Cassandra for my Lagom setup.
In the root pom I have written:
<configuration>
  <unmanagedServices>
    <cas_native>http://ip:9042</cas_native>
  </unmanagedServices>
  <cassandraEnabled>false</cassandraEnabled>
</configuration>
In my impl's application.conf:
akka {
  persistent {
    journal {
      akka.persistence.journal.plugin = "this-cassandra-journal"

      this-cassandra-journal {
        contact-points = ["10.15.2.179"]
        port = 9042
        cluster-id = "cas_native"
        keyspace = "hello"
        authentication.username = "cassandra"
        authentication.password = "rodney"
        # Parameter indicating whether the journal keyspace should be auto created
        keyspace-autocreate = true
        # Parameter indicating whether the journal tables should be auto created
        tables-autocreate = true
      }
    }

    snapshot-store {
      akka.persistence.snapshot-store.plugin = "this-cassandra-snapshot-store"

      this-cassandra-snapshot-store {
        contact-points = ["10.15.2.179"]
        port = 9042
        cluster-id = "cas_native"
        keyspace = "hello_snap"
        authentication.username = "cassandra"
        authentication.password = "rodney"
        # Parameter indicating whether the journal keyspace should be auto created
        keyspace-autocreate = true
        # Parameter indicating whether the journal tables should be auto created
        tables-autocreate = true
      }
    }
  }
}
But I get the error
[warn] a.p.c.j.CassandraJournal - Failed to connect to Cassandra and initialize. It will be retried on demand. Caused by: Authentication error on host /10.15.2.179:9042: Host /10.15.2.179:9042 requires authentication, but no authenticator found in Cluster configuration
[warn] a.p.c.s.CassandraSnapshotStore - Failed to connect to Cassandra and initialize. It will be retried on demand. Caused by: Authentication error on host /10.15.2.179:9042: Host /10.15.2.179:9042 requires authentication, but no authenticator found in Cluster configuration
[warn] a.p.c.j.CassandraJournal - Failed to connect to Cassandra and initialize. It will be retried on demand. Caused by: Authentication error on host /10.15.2.179:9042: Host /10.15.2.179:9042 requires authentication, but no authenticator found in Cluster configuration
[error] a.c.s.PersistentShardCoordinator - Persistence failure when replaying events for persistenceId [/sharding/ProductCoordinator]. Last known sequence number [0]
com.datastax.driver.core.exceptions.AuthenticationException: Authentication error on host /10.15.2.179:9042: Host /10.15.2.179:9042 requires authentication, but no authenticator found in Cluster configuration
    at com.datastax.driver.core.AuthProvider$1.newAuthenticator(AuthProvider.java:40)
    at com.datastax.driver.core.Connection$5.apply(Connection.java:250)
    at com.datastax.driver.core.Connection$5.apply(Connection.java:234)
    at com.google.common.util.concurrent.Futures$AsyncChainingFuture.doTransform(Futures.java:1442)
    at com.google.common.util.concurrent.Futures$AsyncChainingFuture.doTransform(Futures.java:1433)
    at com.google.common.util.concurrent.Futures$AbstractChainingFuture.run(Futures.java:1408)
    at com.google.common.util.concurrent.Futures$2$1.run(Futures.java:1177)
    at com.google.common.util.concurrent.MoreExecutors$DirectExecutorService.execute(MoreExecutors.java:310)
    at com.google.common.util.concurrent.Futures$2.execute(Futures.java:1174)
I also tried providing this config
lagom.persistence.read-side {
  cassandra {
  }
}
How do I make it work by providing credentials for Cassandra?
In Lagom, you may already use akka-persistence-cassandra settings for your journal and snapshot-store (see reference.conf in the source code, and scroll down to cassandra-snapshot-store.authentication.*). There's no need to configure the plugin itself, because Lagom's support for Cassandra persistence already declares akka-persistence-cassandra as the Akka Persistence implementation:
akka.persistence.journal.plugin = cassandra-journal
akka.persistence.snapshot-store.plugin = cassandra-snapshot-store
See https://github.com/lagom/lagom/blob/c63383c343b02bd0c267ff176bfb4e48c7202d7d/persistence-cassandra/core/src/main/resources/play/reference-overrides.conf#L5-L6
The third bit to configure when connecting Lagom to Cassandra is Lagom's read-side. That is also doable via application.conf if you override the defaults.
Note how each store may use a different Cassandra ring/keyspace/credentials/... so you can tune them separately.
See extra info in the Lagom docs.
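As a sketch only (key names follow the akka-persistence-cassandra 0.x reference.conf that Lagom pulled in at the time; double-check them against your version), providing the credentials from the question could look like this in application.conf:
# Sketch: verify these keys against the reference.conf of your akka-persistence-cassandra / Lagom version.
cassandra-journal {
  contact-points = ["10.15.2.179"]
  port = 9042
  keyspace = "hello"
  authentication.username = "cassandra"
  authentication.password = "rodney"
  keyspace-autocreate = true
  tables-autocreate = true
}

cassandra-snapshot-store {
  contact-points = ["10.15.2.179"]
  port = 9042
  keyspace = "hello_snap"
  authentication.username = "cassandra"
  authentication.password = "rodney"
  keyspace-autocreate = true
  tables-autocreate = true
}

lagom.persistence.read-side.cassandra {
  contact-points = ["10.15.2.179"]
  port = 9042
  authentication.username = "cassandra"
  authentication.password = "rodney"
}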

How to configure multiple cassandra contact-points for lagom?

In Lagom it appears the contact points get loaded from the service locator, which accepts only a single URI. How can we specify multiple Cassandra contact points?
lagom.services {
  cas_native = "tcp://10.0.0.120:9042"
}
I have tried setting just the contact points in the akka persistence config but that doesn't seem to override the service locator config.
All that I was missing was the session provider to override service lookup:
session-provider = akka.persistence.cassandra.ConfigSessionProvider
contact-points = ["10.0.0.120", "10.0.3.114", "10.0.4.168"]
This was all that was needed in the Lagom Cassandra config.
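For context, a sketch of where those two lines sit (shown for the journal; the snapshot-store and read-side sections take the same keys, so verify against your version's reference.conf):
# Sketch: the same two keys also apply to cassandra-snapshot-store and lagom.persistence.read-side.cassandra.
cassandra-journal {
  session-provider = akka.persistence.cassandra.ConfigSessionProvider
  contact-points = ["10.0.0.120", "10.0.3.114", "10.0.4.168"]
}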

OpsCenter Community, keep data on different cluster

I am trying to set up OpsCenter Free to keep its data on a different cluster, but I am getting this error:
WARN: Unable to find a matching cluster for node with IP [u'x.x.x.1']; the message was {u'os-load': 0.35}. This usually indicates that an OpsCenter agent is still running on an old node that was decommissioned or is part of a cluster that OpsCenter is no longer monitoring.
I get the same error for the second node in the cluster :(
But if I set [dse].enterprise_override = true in the cluster config, everything works fine.
My config is:
user#casnode1:~/opscenter/conf/clusters# cat ClusterTest.conf
[jmx]
username =
password =
port = 7199
[kerberos_client_principals]
[kerberos]
[agents]
[kerberos_hostnames]
[kerberos_services]
[storage_cassandra]
seed_hosts = x.x.x.2
api_port = 9160
connect_timeout = 6.0
bind_interface =
connection_pool_size = 5
username =
password =
send_thrift_rpc = True
keyspace = OpsCenter2
[cassandra]
username =
seed_hosts = x.x.x.1, x.x.x.4
api_port = 9160
password =
So, the question is: is it possible in OpsCenter Community to use a different cluster to keep the OpsCenter data?
OpsCenter version is 4.0.3
Is it possible in OpsCenter Community to use a different cluster to keep the OpsCenter data?
It is not. Storing data on a separate cluster is only supported on DataStax Enterprise clusters.
Note: Using the override you mentioned without permission from DataStax is a violation of the OpsCenter license agreement, and will not be supported.
