Set up basic password authentication for JDBC connections in EMR using Presto

My use case is simple. I have an EMR cluster deployed through CDK running Presto, using the AWS Glue Data Catalog as the metastore. The cluster will have just the default user running queries. By default, the master user is hadoop, which I can use to connect to the cluster via JDBC and run queries. However, I can establish that connection without a password. I have read the Presto docs, and they mention LDAP, Kerberos and file-based authentication. I just want this to behave like, say, a MySQL database, where I have to pass both username AND password to connect. However, for the life of me, I can't find which configuration to set the password on. These are the settings I have so far:
{
  classification: 'spark-hive-site',
  configurationProperties: {
    'hive.metastore.client.factory.class': 'com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory',
  },
},
{
  classification: 'emrfs-site',
  configurationProperties: {
    'fs.s3.maxConnections': '5000',
    'fs.s3.maxRetries': '200',
  },
},
{
  classification: 'presto-connector-hive',
  configurationProperties: {
    'hive.metastore.glue.datacatalog.enabled': 'true',
    'hive.parquet.use-column-names': 'true',
    'hive.max-partitions-per-writers': '7000000',
    'hive.table-statistics-enabled': 'true',
    'hive.metastore.glue.max-connections': '20',
    'hive.metastore.glue.max-error-retries': '10',
    'hive.s3.use-instance-credentials': 'true',
    'hive.s3.max-error-retries': '200',
    'hive.s3.max-client-retries': '100',
    'hive.s3.max-connections': '5000',
  },
},
Which setting can I use to set the hadoop password? Kerberos, LDAP and file-based authentication seem overly complicated for this simple use case. Am I missing something obvious?
EDIT
After reading countless pages of documentation and talking to AWS Support, I decided to move to Trino, but I am running into more issues. These are the current configurations in my CDK deployment:
configurations: [
  {
    classification: 'spark-hive-site',
    configurationProperties: {
      'hive.metastore.client.factory.class': 'com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory',
    },
  },
  {
    classification: 'emrfs-site',
    configurationProperties: {
      'fs.s3.maxConnections': '5000',
      'fs.s3.maxRetries': '200',
    },
  },
  {
    classification: 'presto-connector-hive',
    configurationProperties: {
      'hive.metastore.glue.datacatalog.enabled': 'true',
      'hive.parquet.use-column-names': 'true',
      'hive.max-partitions-per-writers': '7000000',
      'hive.table-statistics-enabled': 'true',
      'hive.metastore.glue.max-connections': '20',
      'hive.metastore.glue.max-error-retries': '10',
      'hive.s3.use-instance-credentials': 'true',
      'hive.s3.max-error-retries': '200',
      'hive.s3.max-client-retries': '100',
      'hive.s3.max-connections': '5000',
    },
  },
  {
    classification: 'trino-config',
    configurationProperties: {
      'query.max-memory-per-node': `${instanceMemory * 0.15}GB`, // 25% of a node
      'query.max-total-memory-per-node': `${instanceMemory * 0.5}GB`, // 50% of a node
      'query.max-memory': `${instanceMemory * 0.5 * coreInstanceGroupNodeCount}GB`, // 50% of the cluster
      'query.max-total-memory': `${instanceMemory * 0.8 * coreInstanceGroupNodeCount}GB`, // 80% of the cluster
      'query.low-memory-killer.policy': 'none',
      'task.concurrency': vcpuCount.toString(),
      'task.max-worker-threads': (vcpuCount * 4).toString(),
      'http-server.authentication.type': 'PASSWORD',
      'http-server.http.enabled': 'false',
      'internal-communication.shared-secret': 'abcdefghijklnmopqrstuvwxyz',
      'http-server.https.enabled': 'true',
      'http-server.https.port': '8443',
      'http-server.https.keystore.path': '/home/hadoop/fullCert.pem',
    },
  },
  {
    classification: 'trino-password-authenticator',
    configurationProperties: {
      'password-authenticator.name': 'file',
      'file.password-file': '/home/hadoop/password.db',
      'file.refresh-period': '5s',
      'file.auth-token-cache.max-size': '1000',
    },
  },
],
I started here:
https://trino.io/docs/current/security/tls.html
I am using this approach:
"Secure the Trino server directly. This requires you to obtain a valid certificate, and add it to the Trino coordinator’s configuration."
I have obtained an internal wildcard certificate from my company. This gets me:
A certificate text
A certificate chain
A private key
From here: https://trino.io/docs/current/security/inspect-pem.html
It seems I need to combine those three files into one, which I do like this:
-----BEGIN RSA PRIVATE KEY-----
Content of private key
-----END RSA PRIVATE KEY-----
-----BEGIN CERTIFICATE-----
Content of certificate text
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
First content of chain
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
Second content of chain
-----END CERTIFICATE-----
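For what it's worth, building that combined file is just a concatenation of the three pieces in that order; a sketch, with placeholder file names:
# Sketch: concatenate key, leaf certificate, and chain into the single PEM
# that http-server.https.keystore.path points at (file names are placeholders).
cat private.key certificate.pem chain.pem > /home/hadoop/fullCert.pem
chmod 600 /home/hadoop/fullCert.pem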
Then, from a bootstrap action, I put the file on all nodes. That way I can fulfill this: https://trino.io/docs/current/security/tls.html#configure-the-coordinator with these configs:
'http-server.https.enabled': 'true',
'http-server.https.port': '8443',
'http-server.https.keystore.path': '/home/hadoop/fullCert.pem',
I know for sure the file is deployed to the nodes. Then I proceeded to do this: https://trino.io/docs/current/security/password-file.html
I also know that particular part works, because if I use the Trino CLI directly on the master node with the wrong password, I get a credentials error.
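For reference, the password file itself is just bcrypt entries, created roughly like this (assuming htpasswd is available on the master node, e.g. from httpd-tools):
# Sketch: create the file referenced by file.password-file and add a bcrypt
# entry for the hadoop user (htpasswd prompts for the password).
touch /home/hadoop/password.db
htpasswd -B -C 10 /home/hadoop/password.db hadoop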
Now, I'm currently stuck doing this:
[hadoop@ip-10-0-10-245 ~]$ trino-cli --server https://localhost:8446 --catalog awsdatacatalog --user hadoop --password --insecure
trino> select 1;
Query 20220701_201620_00001_9nksi failed: Insufficient active worker nodes. Waited 5.00m for at least 1 workers, but only 0 workers are active
From /var/log/trino/server.log I see:
2022-07-01T21:30:12.966Z WARN http-client-node-manager-51 io.trino.metadata.RemoteNodeState Error fetching node state from https://ip-10-0-10-245.ec2.internal:8446/v1/info/state: Failed communicating with server: https://ip-10-0-10-245.ec2.internal:8446/v1/info/state
2022-07-01T21:30:13.902Z ERROR Announcer-0 io.airlift.discovery.client.Announcer Service announcement failed after 8.11ms. Next request will happen within 1000.00ms
2022-07-01T21:30:14.913Z ERROR Announcer-1 io.airlift.discovery.client.Announcer Service announcement failed after 10.35ms. Next request will happen within 1000.00ms
2022-07-01T21:30:15.921Z ERROR Announcer-3 io.airlift.discovery.client.Announcer Service announcement failed after 8.40ms. Next request will happen within 1000.00ms
2022-07-01T21:30:16.930Z ERROR Announcer-0 io.airlift.discovery.client.Announcer Service announcement failed after 8.59ms. Next request will happen within 1000.00ms
2022-07-01T21:30:17.938Z ERROR Announcer-1 io.airlift.discovery.client.Announcer Service announcement failed after 8.36ms. Next request will happen within 1000.00ms
Also with this:
[hadoop@ip-10-0-10-245 ~]$ trino-cli --server https://localhost:8446 --catalog awsdatacatalog --user hadoop --password
trino> select 1;
Error running command: javax.net.ssl.SSLHandshakeException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
trino>
Even though I am following this to upload the .pem files as assets to S3:
https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-encryption-enable.html#emr-encryption-certificates
Am I wrong in saying that something this simple shouldn't be this complicated? I would really appreciate any help here.

Based on the message you are getting from Trino, Insufficient active worker nodes, the authentication system is working and you are now having problems with secure internal communication; specifically, the machines are having trouble talking to each other. I would start by disabling internal TLS, verify that everything is working, and only then work on enabling it (assuming you need it in your environment). To disable internal TLS, use:
internal-communication.shared-secret=<secret>
internal-communication.https.required=false
discovery.uri=http://<coordinator ip address>:<http port>
Then restart all your machines. You should no longer see Service announcement failed. There might be a couple of these while the machines are starting up, but once they establish communication the error messages should stop.
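In the CDK setup above, that translates roughly to the following in the trino-config classification (a sketch: the coordinator address and port are placeholders, and http-server.http.enabled has to go back to 'true' so that plain-HTTP internal traffic is possible at all):
'internal-communication.shared-secret': '<secret>',
'internal-communication.https.required': 'false',
'http-server.http.enabled': 'true',
'discovery.uri': 'http://<coordinator ip address>:<http port>',
Once internal communication is healthy you can reintroduce internal TLS step by step. Restarting on EMR usually means restarting the Trino service on every node (the exact service name depends on the EMR release).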

Related

"unable to verify the first certificate" when connecting to elasticsearch from nodejs using self-generated certificates

How do I connect to my elasticsearch cluster (TLS secured) when there are certificates generated by myself with the elasticsearch-certutil?
According to the ES documentation this code snippet should do it:
const { Client } = require("@elastic/elasticsearch");
const fs = require("fs");

const client = new Client({
  node: config.elastic.node,
  auth: {
    username: "elastic",
    password: config.elastic.password
  },
  tls: {
    ca: fs.readFileSync("./share/es/certs/ca.crt"),
    rejectUnauthorized: false
  }
})
Unfortunately, this gives me this famous error:
ConnectionError: unable to verify the first certificate
I've set up ES via docker-compose. To wrap up, I did the following:
Generating the certs with elasticsearch-certutil's cert command via bin/elasticsearch-certutil cert --silent --pem --in config/instances.yml -out /certs/bundle.zip. instances.yml contains all of my nodes as well as Kibana; bundle.zip contains all certs and keys as well as the certificate for the CA.
Configuring my nodes in docker-compose.yml so that they can read the generated certificates. For instance,
...
- xpack.security.http.ssl.key=${ES_CERTS_DIR}/es01/es01.key
- xpack.security.http.ssl.certificate_authorities=${ES_CERTS_DIR}/ca/ca.crt
- xpack.security.http.ssl.certificate=${ES_CERTS_DIR}/es01/es01.crt
- xpack.security.transport.ssl.certificate_authorities=${ES_CERTS_DIR}/ca/ca.crt
- xpack.security.transport.ssl.certificate=${ES_CERTS_DIR}/es01/es01.crt
- xpack.security.transport.ssl.key=${ES_CERTS_DIR}/es01/es01.key
...
Validating the connection with curl with this command
$ curl -X GET "https://elastic:$ES_PASSWORD@my-cluster-domain.com:9201" -H "Content-type: application/json" --cacert $CACERT --key $KEY --cert $CERT
where $CACERT, $KEY, $CERT are pointing to the CA cert, the key and certificate for the node that I am connecting to. This results in:
{
  "name" : "es01",
  "cluster_name" : "es-docker-cluster",
  ...
  "tagline" : "You Know, for Search"
}
which is fine I suppose.
But why can't I connect to my cluster from my Express.js application? I read something about creating the certificate chain and letting ES know about it. But is this necessary? I mean, I can connect via curl and also using elasticdump. What also gives me an error is accessing the cluster via browser at https://my-cluster-domain.com:9201: the browser warns me that, although the certificate is valid, the connection is not secure.
Any ideas? Thank you.
Well, after a lot of googling it turned out that adding the CA file to the ES client config is not enough, as indicated in my example configuration above.
...
tls: {
  ca: fs.readFileSync("./share/es/certs/ca.crt"),
  rejectUnauthorized: false // don't do this in production
}
Instead, one has to announce the CA certificate to the Node process itself, before configuring the connection to ES. You can do this, as described in this post and in this one (solution 2a), with the NODE_EXTRA_CA_CERTS environment variable. I now start my process like this and it worked out:
$ NODE_EXTRA_CA_CERTS="./share/es/certs/ca.crt" NODE_ENV=prod ...
One last remark: you don't have to set rejectUnauthorized: false, as some workarounds do, as long as you have a current version of the elasticsearch client.
My final configuration looks like this:
const client = new Client({
  node: config.elastic.node,
  auth: {
    username: "elastic",
    password: config.elastic.password
  }
})

Prefect Server: No Tenant Found

I am attempting to spin up a Prefect Agent, in order to complete the setup with a Prefect Server. Rather than using prefect server start for out-of-the-box setup, I used prefect server config to generate the Docker Compose file and then docker compose up to spin up the server's services. When I tried to start the Agent, I got the following error:
prefect.utilities.exceptions.ClientError:
[{'message': 'No tenant found.',
  'locations': [{'line': 2, 'column': 5}],
  'path': ['register_agent'],
  'extensions': {
    'code': 'INTERNAL_SERVER_ERROR',
    'exception': {'message': 'No tenant found.'}
  }
}]
How do I fix this?
The server needs a tenant before an agent can register, so create one in either of these ways:
Using the Prefect CLI: prefect backend server, then prefect server create-tenant -n default
Using the Prefect Server GraphQL API, as done in the Prefect source code:
tenant_info = self.graphql(
    {
        "mutation($input: create_tenant_input!)": {
            "create_tenant(input: $input)": {"id"}
        }
    },
    variables=dict(input=dict(name=name, slug=slug)),
)

How to copy paste Google's SSO certificate for connecting with dex?

I keep getting the following error in the dex server:
failed to initialize server: server: Failed to open connector saml: failed to open connector: failed to create connector saml: parse cert: trailing data:
I'm copying the Google SSO certificate, converting it to Base64 and pasting it. This is for configuring Argo CD with Google SSO login (https://argo-cd.readthedocs.io/en/release-1.8/operator-manual/user-management/google/). I tried copying the certificate with \n, with \r\n and without \n. Still the same error. I'm editing the argocd-cm ConfigMap and adding it there. Is there a correct format for copying it?
1: Go to https://www.base64encode.org/ and paste your original cert there for encoding. The original is in the full format:
-----BEGIN CERTIFICATE-----
MIIDdDDDD
XXXXXX
VVVVVVV
-----END CERTIFICATE-----
Copy the encoded result string end to end and be careful to have no extra characters.
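If you prefer the command line over the website, the same single-line Base64 string can be produced locally; a sketch, with google-sso-cert.pem as a placeholder for your downloaded certificate file:
# Sketch: emit the certificate as one unwrapped Base64 line (GNU coreutils).
# On macOS, pipe the default output through tr -d '\n' instead of using -w 0.
base64 -w 0 google-sso-cert.pem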
2: Edit your config map and ensure the YAML formatting is right:
#kubectl -n argocd edit cm argocd-cm
Here is a sample config that worked:
---
# in argocd-cm
data:
  url: https://argocd.int.domain.com
  dex.config: |
    logger:
      level: debug
      format: json
    connectors:
      - type: saml
        id: saml
        name: saml
        config:
          ssoURL: https://accounts.google.com/o/saml2/idp?idpid=XXXXXXXX
          entityIssuer: https://argocd.int.domain.com/api/dex/callback
          redirectURI: https://argocd.int.domain.com/api/dex/callback
          ssoIssuer: https://accounts.google.com/o/saml2/idp?idpid=XXXXXXXXX
          caData: |
            LS0tLS1CRUdJTiBXXXXXXXXXXThe long BASE64EncodedString
          usernameAttr: name
          emailAttr: email
          # etc.
---
I hope this fixes your problem.
Note: Formatting characters in the ConfigMap will likely break things by introducing YAML parse errors, so ensure you are not seeing \n and the like when you open up the ConfigMap after your edit is saved.
You should consider a restart of both the argocd-dex-server and argocd-server deployments and confirm that the logs in the new pods come up clean.
[taproot@ip-10-10-15-500 ~]# kubectl -n argocd rollout restart deployment argocd-dex-server
deployment.apps/argocd-dex-server restarted
[taproot@ip-10-10-15-500 ~]# kubectl -n argocd rollout restart deployment argocd-server
deployment.apps/argocd-server restarted
I had to do the above restart to get rid of prominent errors on the UI that read something like:
"unable to load data: grpc: the client connection is closing"
Ref: https://argoproj.github.io/argo-cd/operator-manual/user-management/google/

Ansible Lookup with azure_keyvault_secret Invalid Credentials

I'm attempting to retrieve a secret stored in Azure Key Vault with Ansible. I found and installed the azure.azure_preview_modules role using ansible-galaxy. I've also updated ansible.cfg to point to the lookup_plugins directory from the role. When running the following playbook I get an error:
- hosts: localhost
  connection: local
  roles:
    - { role: azure.azure_preview_modules }
  tasks:
    - name: Look up secret when ansible host is general VM
      vars:
        url: 'https://myVault.vault.azure.net/'
        secretname: 'SecretPassword'
        client_id: 'ServicePrincipalIDHere'
        secret: 'ServicePrinipcalPassHere'
        tenant: 'TenantIDHere'
      debug: msg="the value of this secret is {{lookup('azure_keyvault_secret',secretname,vault_url=url, cliend_id=client_id, secret=secret, tenant_id=tenant)}}"
fatal: [localhost]: FAILED! => {"msg": "An unhandled exception occurred while running the lookup plugin 'azure_keyvault_secret'. Error was a <class 'ansible.errors.AnsibleError'>, original message: Invalid credentials provided."}
Using the same information I can connect to Azure using Azure PowerShell and the Azure CLI and retrieve the Key Vault secrets at the command line. However, those same credentials do not work within this task for the playbook using the lookup plugin.
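For reference, the ansible.cfg change described above looks roughly like this (the path is an assumption; point it at wherever ansible-galaxy installed the role):
[defaults]
# assumed install location of the role's lookup plugins
lookup_plugins = ./roles/azure.azure_preview_modules/lookup_plugins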
I had a similar error when using the Python SDK (which the Ansible Azure modules are built on top of). Try changing the url to this:
url: 'https://myVault.vault.azure.net' # so remove the trailing slash
the error text is 101% misleading
After much toil I figured out the issue! The argument client_id is misspelled in the example (cliend_id=client_id) and I didn't catch it, which resulted in the error.
https://github.com/Azure/azure_preview_modules/blob/master/lookup_plugins/azure_keyvault_secret.py#L49
Corrected example below.
- name: Look up secret when ansible host is general VM
  vars:
    url: 'https://valueName.vault.azure.net'
    secretname: 'secretName/version'
    client_id: 'ServicePrincipalID'
    secret: 'P@ssw0rd'
    tenant: 'tenantID'
  debug: msg="the value of this secret is {{lookup('azure_keyvault_secret',secretname,vault_url=url, client_id=client_id, secret=secret, tenant_id=tenant)}}"

puppet Forbidden request /puppet-ca/v1/certificate/ca

I'm not able to make a Puppet node join a master. I'm using Puppet Enterprise on AWS.
Master
puppetserver --version
puppetserver version: 2017.3.0.38
Node
# puppet agent --test
Error: Could not request certificate: Error 403 on SERVER: Forbidden request: /puppet-ca/v1/certificate/ca (method :get). Please see the server logs for details.
Exiting; failed to retrieve certificate and waitforcert is disabled
Obviously the error message is related to permissions on the master side. When I check the log on the master I see:
ERROR [qtp2147089302-255] [p.t.a.rules] Forbidden request: 10.0.10.224 access to /puppet-ca/v1/certificate/ca (method :get) (authenticated: false) denied by rule 'puppetlabs certificate'.
But I checked that the new HOCON-format auth.conf is allowing unauthenticated nodes to send a CSR:
{
  "allow-unauthenticated": "*",
  "match-request": {
    "method": "get",
    "path": "/puppet-ca/v1/certificate/",
    "query-params": {},
    "type": "path"
  },
  "name": "puppetlabs certificate",
  "sort-order": 500
}
I also checked that pe-puppet-server.conf is not using the legacy auth.conf method:
# (optional) Authorize access to Puppet master endpoints via rules specified
# in the legacy Puppet auth.conf file (if true or not specified) or via rules
# specified in the Puppet Server HOCON-formatted auth.conf (if false).
use-legacy-auth-conf: false
max-active-instances: 2
max-requests-per-instance: 0
environment-class-cache-enabled: true
Please advise; the same error message occurs on both Windows and Linux.
I did reboot the entire server (EC2 instance) since reloading puppetserver didn't help. I also made the auth change from the console, as instructed here:
windows Puppet agent does not connect to the awsopsworks puppet Enterprise master
I had a similar issue when trying to set up my puppet nodes, but was using Vagrant instead of AWS.
The fix was to unset the following environment variables: http_proxy, https_proxy, HTTP_PROXY and HTTPS_PROXY.
My fix was to remove server_list from puppet.conf, clean up the CM cert, and regenerate the cert. In my case I have autosign=true, so the process was:
Stop PE on CM:
systemctl stop puppet pxp-agent pe-puppetserver pe-puppetdb
Remove the ssl dir:
rm -fr /etc/puppetlabs/puppet/ssl
Clean up the cert from the Primary:
puppetserver ca clean --certname='<CM>'
Run the puppet agent on the CM:
puppet agent -t
Done.
