Working with multiple AWS keys in Hadoop environment

What's the workaround for having multiple AWS keys in a Hadoop environment? My Hadoop jobs require access to two different S3 buckets (with two different keys). I tried the credential provider, but it looks pretty limited: it stores all key names in lower case, so I cannot use "s3a" for one job and "s3n" for the other. For example, s3a looks for:
fs.s3a.access.key
fs.s3a.secret.key
And for s3n:
fs.s3n.awsAccessKeyId
fs.s3n.awsSecretAccessKey
But if I create a provider entry for "fs.s3n.awsAccessKeyId", it is stored as "fs.s3n.awsaccesskeyid", so at runtime the expected key fails to load.
As a workaround, I tried to generate two different credential providers and pass them as:
--Dhadoop.security.credential.provider.path=key1,key2
But they didn't work together, because both providers hold a fs.s3a.access.key and fs.s3a.secret.key pair.
I don't want to pass the access and secret keys with the -D option, as they would be visible. Is there a better way to handle this scenario?

If you upgrade to Hadoop 2.8 you can use per-bucket configuration to address this problem. Everything in fs.s3a.bucket.$BUCKETNAME is patched into the config for the FS instance for that bucket, overriding any other configs:
fs.s3a.bucket.engineering.access.key=AAID..
fs.s3a.bucket.logs.access.key=AB14...
We use this a lot for talking to buckets in different regions, encryption, and other things. It works well so far, though I would say that.
One special case: when you encrypt credential secrets in JCEKS files. The docs cover this.
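The override rule described above can be modelled in a few lines. This is only a plain-Python sketch of the behaviour as the answer describes it, not Hadoop's own code: the function name resolve_bucket_config and all key values are made up for illustration, and the sketch ignores special cases such as secrets stored in credential files.

```python
def resolve_bucket_config(conf, bucket):
    """Return the settings a given bucket would see in this simplified model.

    Keys under fs.s3a.bucket.<bucket>. are copied over the matching
    fs.s3a. keys; per-bucket keys of other buckets are dropped.
    """
    prefix = "fs.s3a.bucket." + bucket + "."
    # Start from the base options, leaving out every per-bucket entry.
    resolved = {k: v for k, v in conf.items() if not k.startswith("fs.s3a.bucket.")}
    # Patch this bucket's entries over the base options.
    for key, value in conf.items():
        if key.startswith(prefix):
            resolved["fs.s3a." + key[len(prefix):]] = value
    return resolved


conf = {
    "fs.s3a.access.key": "BASE-ACCESS-KEY",  # placeholder values throughout
    "fs.s3a.endpoint": "s3.amazonaws.com",
    "fs.s3a.bucket.engineering.access.key": "ENGINEERING-ACCESS-KEY",
    "fs.s3a.bucket.logs.access.key": "LOGS-ACCESS-KEY",
}

engineering = resolve_bucket_config(conf, "engineering")
logs = resolve_bucket_config(conf, "logs")
other = resolve_bucket_config(conf, "other")
print(engineering["fs.s3a.access.key"])
print(logs["fs.s3a.access.key"])
print(other["fs.s3a.access.key"])
```

In this model a job reading from the engineering bucket sees the engineering access key, while a bucket without its own entries falls back to the base fs.s3a.* values.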

Related

How to get the data source for an AWS CloudFront Origin Access Identity in Terraform

We have Terraform code in another project, which must remain in that separate project, that creates three AWS CloudFront Origin Access Identities: one that we want to use for all of our qa environments, one for all of our pprd environments, and one for all of our prod environments.
In another project, how can I use Terraform to get the data source for these, so that I can use them when creating a CloudFront distribution with Terraform?
Does the data source have to filter on the OAI ID or name, and how? What happens if the OAI changes? I guess what I am getting at is I would prefer to avoid hard coding the ID or name if possible. Or is that the only way to do this?
We have three OAIs that we will need to use separately. In other words, we will be creating multiple qa distributions that will use the qa OAI, multiple pprd distributions that will use the pprd OAI, and multiple prod distributions that will use the prod OAI.
Let's assume that the IDs are AAAAAAA for the qa one, BBBBBBBB for the pprd one, and CCCCCCC for the prod one (I blurred out the real ones in case there is a security issue in posting them).

Yes, you can get the Origin Access Identity created by another stack. In fact there are multiple ways to get it.
The easiest way would be to use an aws_cloudfront_origin_access_identity data source. You can define a data source as follows:
data "aws_cloudfront_origin_access_identity" "example" {
  id = "EDFDVBD632BHDS5"
}
The id is the identifier of the origin access identity, not of a distribution. For the attribute references of the data block, you would want to check out the docs.
What happens if the OAI changes?
The data block assumes that the resource already exists in AWS and was created outside of the current state. This means that it will be refreshed every time you do a terraform plan. If something changes on the resource, it will be detected at the next plan.
I guess what I am getting at is I would prefer to avoid hard coding the ID or name if possible.
In the case of a data block, you have to provide the ID somehow, either through a variable or by hard-coding it. If you really want to avoid this, you can use another method for importing remote resources.
The other option would be to read from a Terraform remote state by using a terraform_remote_state data source. This option is a bit more complex, since the remote state has to expose attributes as outputs. Also, you have to provide the location of the remote state, which can also be considered a hardcoded value.

How to secure ConnectionString and/or AppSettings in asp.net core (on-prem)

First off, I know we don't have ConnectionStrings and AppSettings per se in .NET Core, but in my case I want to encrypt a connection string and maybe some other application configuration stored in my appsettings.json (or other settings file).
I know this has been discussed a lot all over the internet, but no one seems to have a legitimate answer.
The suggestions that have been thrown out there are:
Using EnvironmentConfigurationBuilder. However, that doesn't really solve the issue, since we have just moved our plain-text configuration from appsettings.json to environment variables.
Creating a custom ConfigurationProvider that encrypts and decrypts appsettings.json (or selected parts of it). However, that doesn't solve the issue either, since we need to store the decryption key(s) somewhere accessible to our application, and even if we stored the key as a hard-coded string in our application, a "hacker" could simply decompile it.
Someone also mentioned that even if you do encrypt appsettings.json, a "hacker" could always do a memory dump and find the decrypted version of the data. I'm no expert in that field, so I'm not sure how likely or how complicated such a thing would be.
Azure Key Vault has also been mentioned a few times. However, in my case, and in a lot of cases when working with authorities, this is not an option, since cloud services are not allowed.
I might be overthinking this, since if an attacker has actually managed to get into our server, then we might have bigger issues. But what would be the way to deal with this issue? Do you simply not care and leave it all as plain text? Or should you take some sort of action and encrypt or obscure the secrets?

You don't need to encrypt connection strings in your config file, because the best approach is still NOT to store this information in your config files but as environment variables on your server.
In your appsettings.json file, store only your local development connection string. For other environments, set an environment variable on the server where the application is deployed, using __ (double underscore) as the separator for each child node in your config file.
You can read how this works on this page
If you have a config file as follows:
{
  "ConnectionStrings": {
    "default": "Server=192.168.5.1; Database=DbContextFactorySample3; user id=database-user; password=Pa$$word;"
  }
}
On a Windows server you would set the value like this:
set "ConnectionStrings__default=Server=the-production-database-server; Database=DbContextFactorySample2; Trusted_Connection=True;"
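The double-underscore convention can be pictured as a path into the JSON tree. The following is a simplified Python model of that mapping, not the ASP.NET Core implementation; apply_env_overrides and the sample development connection string are hypothetical, and details such as case-insensitive key matching are left out.

```python
import copy


def apply_env_overrides(settings, environ):
    """Overlay variables named like Section__Key onto a nested settings dict."""
    result = copy.deepcopy(settings)  # leave the file-based settings untouched
    for name, value in environ.items():
        parts = name.split("__")
        if len(parts) < 2:
            continue  # this sketch only handles nested keys; plain names are skipped
        node = result
        for part in parts[:-1]:
            # Create (or replace a non-dict value with) the intermediate section.
            if not isinstance(node.get(part), dict):
                node[part] = {}
            node = node[part]
        node[parts[-1]] = value
    return result


# Hypothetical development default, as it might appear in appsettings.json.
appsettings = {"ConnectionStrings": {"default": "Server=localhost; Database=DevSample;"}}
environ = {
    "ConnectionStrings__default": (
        "Server=the-production-database-server; "
        "Database=DbContextFactorySample2; Trusted_Connection=True;"
    ),
    "PATH": "/usr/bin",
}

effective = apply_env_overrides(appsettings, environ)
print(effective["ConnectionStrings"]["default"])
```

The point of the model is that the file keeps a harmless development value, and the server's environment supplies the real one under the same logical key.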
I don't know what your deployment flow and tools look like, but it's worth digging into them to find out how you can make use of this feature.
For example, if you're deploying on Kubernetes you could use Helm to set your secret values.
At my company, on TFS, we create a Release pipeline and use the variables section to set the secret values. These values are then used when the code is deployed on Kubernetes.
Variables in Release pipelines in TFS can be hidden like passwords, so no developer can see the production values. Only administrators can.

Set up clustered Traefik Edge Router on Azure Kubernetes with Let's Encrypt

I'm trying to set up Traefik with Let's Encrypt on Kubernetes in Azure. So far so good, and everything is almost working. This is my first time, so I'm hoping I'm just missing something to get everything working.
I have used the DeploymentController with 1 replica (later there will be more than one, as I'm going for a clustered setup).
The issue is with the Lets Encrypt certificate.
I'm getting this error:
Failed to read new account, ACME data conversion is not available : permissions 755 for acme/acme.json are too open, please use 600
This seems like a fair requirement, but how do I set this, since I'm using the node's storage? I know this is not the best option, but I'm having a hard time finding a good guide to follow, so I need some guidance here.
The guides say to use a KV store such as etcd.
I have read:
https://docs.traefik.io/configuration/acme/
https://docs.traefik.io/user-guide/kubernetes/
It also says here: https://docs.traefik.io/configuration/acme/#as-a-key-value-store-entry
ACME certificates can be stored in a KV Store entry. This kind of storage is mandatory in cluster mode.
So I guess this is a requirement :-)
This all makes sense, so that every pod doesn't request the same certificate but can share it and be notified when a new certificate is requested.
This page shows the KV stores that are supported: https://docs.traefik.io/user-guide/kv-config/ - Kubernetes uses etcd, but I can't find any information on whether I can use that to store the certificate.
So what are my options here? Do I need to install my own KV store to support Let's Encrypt certificates? Can I use Azure Storage Disk?

Best practice for managing web service credentials for Node.JS?

We're planning a secure Node.JS server, which uses several third-party web services. Each requires credentials that will need to be configured by the operations team.
Clearly they could simply put them in plain text in a configuration file.
Microsoft .NET seems to offer a better option with DPAPI (Data Protection API) - see Credential storage best practices. Is there a way to make this available through IISNode? Or is there any other option to secure such credentials within Node-JS configuration?

There's an extensive discussion of several options here, including the two suggested by xShirase:
http://pmuellr.blogspot.co.uk/2014/09/keeping-secrets-secret.html
User-defined services solve the problem, but only for Cloud Foundry.
This blog http://encosia.com/using-nconf-and-azure-to-avoid-leaking-secrets-on-github/ points out that you can often set environment variables separately on servers, and suggests using nconf to read them and config files separately.
I still wonder whether there is anything specific to IIS.

There are two ways to do it securely:
The first is to use command-line parameters when you launch your app.
These parameters are then found in process.argv.
So, node myapp.js username password would give you:
process.argv[0]=node
process.argv[1]=/.../myapp.js (absolute path)
process.argv[2]=username
process.argv[3]=password
The second is to set the credentials as environment variables. This is generally considered the best practice, as only you have access to these variables.
You would set the variables using the export command, then access them through process.env.
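The environment-variable approach can be sketched as follows. Python is used here purely for illustration (in Node.js the lookup goes through process.env, as noted above), and the variable names SERVICE_USERNAME and SERVICE_PASSWORD, like the helper load_credentials, are hypothetical.

```python
import os


class MissingCredentialError(RuntimeError):
    """Raised when a required credential variable is unset or empty."""


def load_credentials(environ=os.environ, names=("SERVICE_USERNAME", "SERVICE_PASSWORD")):
    """Read the named credentials from the environment, failing fast if any is missing."""
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise MissingCredentialError(
            "missing environment variables: " + ", ".join(missing)
        )
    return {name: environ[name] for name in names}


# Usage with an explicit mapping, so the sketch runs without touching the real environment.
creds = load_credentials({"SERVICE_USERNAME": "demo-user", "SERVICE_PASSWORD": "demo-pass"})
print(sorted(creds))
```

Failing at startup when a variable is missing tends to be easier to diagnose than a failed call to the third-party service later on.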

I recently had to do exactly the same thing for my external API credentials. This is what I did:
Install the node-config module.
Create a folder and file called config/config.js.
In that file, call require('config') to load the module.
On a local box it reads the configuration from the local.json file.
I have dummy values in local.json for the API key and shared secret.
On my QA environment I export two variables: NODE_ENV="QA" and NODE_CONFIG_DIR="path to my configuration folder on the QA server".
The node-config module then reads the configuration from "path to your config folder/QA.json".
Now I have the real API key and credentials in QA.json.
Here you can encrypt these values and put them back in QA.json.
In your app, get these config values, decrypt them, and use them in your REST call.
Hope this helps.
This way your config can live in the same container as the Node code.
Refer to this for encryption and decryption:
http://lollyrock.com/articles/nodejs-encryption/

Credential distribution/storage across fleets

What are the options for secure password/credential storage on a host and propagation of changes across a fleet of hosts?
An example would be that you have a fleet of size N and you want to store credentials, such as AWS access keys, on those hosts. The simple approach is to store them in the source code or a config file, but this is bad because they are usually in plain text. Also, if you make a change to the credentials you then want it to propagate to all of the hosts in the fleet.
Are there any solutions that would allow for something along these lines?
Credentials credentials = new Credentials();
system.doAction( credentials.getCredential("CredentialKeyId") );
This is assuming a non-Windows fleet.

Are you asking for something similar to NIS? To Kerberos, maybe?
