In order to use Datadog, I use an init script as described in the Databricks Datadog integration.
Hence I set the Spark environment variables as shown in the attached picture.
It works when I enter the Datadog API key in plain text, but as soon as I replace it with {{secrets/datadog/api_key}}, it stops working.
It is difficult to determine the actual issue based on the way your question is currently worded. Based on the limited information provided, I would check the following:
Verify the secret exists in the location provided.
Does {{secrets/datadog/api_key}} point to an actual secret scope and key?
Does your Databricks instance have access to the secret scope?
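If you want to rule out a missing scope, key, or permission, a quick sanity check from a Python notebook in the same workspace might look like this (a sketch, assuming the scope is named datadog and the key api_key):
# Run with the same identity the cluster uses
print([s.name for s in dbutils.secrets.listScopes()])        # is "datadog" listed?
print([m.key for m in dbutils.secrets.list("datadog")])      # is "api_key" listed?
value = dbutils.secrets.get(scope="datadog", key="api_key")  # fails immediately if access is denied
If all three succeed, the secret exists and is readable, and the problem is more likely in how the {{secrets/...}} reference is configured on the cluster.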
Spark allows us to read directly from Google BigQuery, as shown below:
df = spark.read.format("bigquery") \
    .option("credentialsFile", "googleKey.json") \
    .option("parentProject", "projectId") \
    .option("table", "project.table") \
    .load()
However, keeping the key file saved on the virtual machine isn't a great idea. I have the Google key saved as JSON securely in a credential management tool. The key is read on demand and saved into a variable called googleKey.
Is it possible to pass the JSON into spark.read, or to pass the credentials in as a dictionary?
The other option is the credentials option. From the spark-bigquery-connector docs:
How do I authenticate outside GCE / Dataproc?
Credentials can also be provided explicitly, either as a parameter or from Spark runtime configuration. They should be passed in as a
base64-encoded string directly.
// Globally
spark.conf.set("credentials", "<SERVICE_ACCOUNT_JSON_IN_BASE64>")
// Per read/Write
spark.read.format("bigquery").option("credentials", "<SERVICE_ACCOUNT_JSON_IN_BASE64>")
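To tie that back to the question: since the service-account JSON is already held in the googleKey variable, one option (a sketch, assuming googleKey is the JSON as a string and spark is an active session) is to base64-encode it and pass it through the credentials option:
import base64

# googleKey holds the service-account JSON read on demand from the credential management tool
credentials_b64 = base64.b64encode(googleKey.encode("utf-8")).decode("utf-8")

df = spark.read.format("bigquery") \
    .option("credentials", credentials_b64) \
    .option("parentProject", "projectId") \
    .option("table", "project.table") \
    .load()
This way the key never has to be written to disk on the virtual machine.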
This is more of a chicken-and-egg situation: if you are storing the credentials file in a secret manager (I hope that's not what your credential management tool is), how would you access the secret manager? You would need a key for that, and where would you store that key?
For this, Azure has created managed identities, through which two different services can talk to each other without explicitly providing any keys (credentials).
If you are running from Dataproc, then the node has a built-in service account which you can control on cluster creation. In this case you do not need to pass any credentials/credentialsFile option.
If you are running on another cloud or on-prem, you can use the local secret manager, or implement the connector's AccessTokenProvider, which gives you full control over credential creation.
I'm creating a Google Workspace Add-On and need to make some requests using OAuth. They provide a guide here explaining how to do so. In the sample code, it's suggested that the OAuth client secret be inline:
function getOAuthService() {
  return OAuth2.createService('SERVICE_NAME')
      .setAuthorizationBaseUrl('SERVICE_AUTH_URL')
      .setTokenUrl('SERVICE_AUTH_TOKEN_URL')
      .setClientId('CLIENT_ID')
      .setClientSecret('CLIENT_SECRET')
      .setScope('SERVICE_SCOPE_REQUESTS')
      .setCallbackFunction('authCallback')
      .setCache(CacheService.getUserCache())
      .setPropertyStore(PropertiesService.getUserProperties());
}
Is this safe for me to do?
I don't know how Google Apps Script is architected, so I don't have details on where and how the code is being run.
Most likely it is safe, since the script is only accessible to the script owner and, if it is part of a Google Workspace, to Workspace admins (which may or may not be an issue for you).
Well, you can add some security/safety by using a container-bound script, which is attached to a Google Spreadsheet, Google Doc, or any other file that allows user interaction, or a standalone script that connects to a UI in some other way for interaction. Refer to this link for a more detailed explanation: What is the appropriate way to manage API secrets within a Google Apps script?
Otherwise, the only other way I see is to store the keys and secrets in User Properties. Here's how you can do it: Storing API Keys and secrets in Google AppScript user property
You can also refer to the link below for more general information on how to manage secrets or add some security: https://softwareengineering.stackexchange.com/questions/205606/strategy-for-keeping-secret-info-such-as-api-keys-out-of-source-control
I'm writing a script to automatically rotate AWS Access Keys on Developer laptops. The script runs in the context of the developer using whichever profile they specify from their ~/.aws/credentials file.
The problem is that if they have two API keys associated with their IAM user account, I cannot create a new key pair until I delete an existing one. However, if I delete whichever key the script is using (which probably comes from the ~/.aws/credentials file, but might come from environment variables or session tokens or something), the script won't be able to create a new key. Is there a way to determine which AWS Access Key ID is being used to sign boto3 API calls within Python?
My fallback is to parse the ~/.aws/credentials file, but I'd prefer a more robust solution.
Create a default boto3 session and retrieve the credentials:
import boto3

print(boto3.Session().get_credentials().access_key)
That said, I'm not necessarily a big fan of the approach that you are proposing. Both keys might legitimately be in use. I would prefer a strategy that notified users of multiple keys, asked them to validate their usage, and suggested they deactivate or delete keys that are no longer in use.
You can also use IAM's get_access_key_last_used() to retrieve information about when the specified access key was last used.
Maybe it would be reasonable to delete keys that are a) inactive and b) haven't been used in N days, but I think that's still a stretch and would require careful handling and awareness among your users.
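For illustration, here is a rough sketch of that audit combining both calls (assuming the active profile has iam:ListAccessKeys and iam:GetAccessKeyLastUsed permissions):
import boto3

session = boto3.Session()
iam = session.client("iam")
current_key = session.get_credentials().access_key

# List the caller's access keys and report status, last use, and which one is signing right now
for key in iam.list_access_keys()["AccessKeyMetadata"]:
    key_id = key["AccessKeyId"]
    last_used = iam.get_access_key_last_used(AccessKeyId=key_id)["AccessKeyLastUsed"]
    marker = "  <- signing this script" if key_id == current_key else ""
    print(key_id, key["Status"], last_used.get("LastUsedDate", "never used"), marker)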
The real solution here is to move your users to federated access and 100% use of IAM roles. Thus no long-term credentials anywhere. I think this should be the ultimate goal of all AWS users.
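As a sketch of what that looks like in practice, temporary credentials can be obtained by assuming a role, so there is nothing long-term to rotate (the role ARN below is hypothetical):
import boto3

# Exchange the caller's identity for short-lived credentials
sts = boto3.client("sts")
creds = sts.assume_role(
    RoleArn="arn:aws:iam::123456789012:role/DeveloperRole",  # hypothetical role
    RoleSessionName="dev-session",
)["Credentials"]

session = boto3.Session(
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)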
I want to use Cosmos DB from C# code. A really important point is that the data should stay encrypted at every point. As I understand it, once the data is on the server, it is automatically encrypted by Azure through encryption at rest. But during transport, do I have to use a certificate, or is it encrypted automatically? I used this link to manage the database: https://learn.microsoft.com/fr-fr/azure/cosmos-db/create-sql-api-dotnet. My question is, finally: is there any safety risk if I just follow this tutorial?
Thanks.
I think that's a great starting point.
Just one note: your data is only as secure as the access keys to the account, so on top of encryption at rest and in transit, the access key is probably the most sensitive piece of information you need to protect.
My advice is to use a Key Vault to store the database access key rather than defining it as an environment variable. Combined with Managed Identity, your key never leaves the confines of the Azure portal, which makes it the most secure option. I'm not sure how you plan on deploying your code, but more often than not I've seen those keys hard-coded in source code or in some configuration file that ends up exposed.
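For illustration, a minimal sketch of that pattern in Python (the .NET SDK exposes equivalent clients); the vault, secret, and account names here are hypothetical:
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient
from azure.cosmos import CosmosClient

# Managed Identity (or a developer login) is picked up automatically at runtime
credential = DefaultAzureCredential()
secrets = SecretClient(vault_url="https://my-vault.vault.azure.net", credential=credential)
cosmos_key = secrets.get_secret("cosmos-primary-key").value  # the key never appears in config files

client = CosmosClient("https://my-account.documents.azure.com", credential=cosmos_key)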
A while ago I wrote a step-by-step tutorial describing how to implement this. You can find my article here
I would suggest you follow the instructions mentioned here and not use access keys at all, because if they are accidentally exposed, it doesn't matter whether you stored them in a Key Vault or not: your database is out there. Besides, if you do want to use access keys, it is recommended to rotate them periodically, which means you need to make the rotation automatic and known to your Key Vault; here is a description of how you could automate that.
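A sketch of what key-less access can look like with the Python SDK, assuming the identity has been granted a Cosmos DB data-plane role (account, database, and container names are hypothetical):
from azure.identity import DefaultAzureCredential
from azure.cosmos import CosmosClient

# No access key involved: a token is requested from Azure AD at runtime
client = CosmosClient("https://my-account.documents.azure.com", credential=DefaultAzureCredential())
container = client.get_database_client("mydb").get_container_client("mycontainer")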
I'm looking to get at an Azure function app's list of operational endpoints for each function, in particular the secret code that needs to be passed in to invoke the function.
I've tried lots of existing answers on SO, but they all only seem to work with Function Apps that use Files as the secret storage type.
We have a requirement to use Blob storage which is also the default in V2 function apps.
What I'm really after is the code piece that comes after the function name when it's retrieved from the Azure portal; I can manufacture all the other pieces before that myself.
For example https://mytestfunapp-onazure-apidev03.azurewebsites.net/api/AcceptQuote?code=XYZABCYkVeEj8zkabgSUTRsCm7za4jj2OLIQWnbvFRZ6ZIiiB3RNFg==
I can see where the secrets are stored in Azure Blob Storage as we need to configure that anyway when we create all the resources in our scripts.
What I'm really looking for is how to decrypt the secret stored in the file. I don't care what programming language or script the solution may be written in; I'll work with it or convert it to another language that we can use.
Here's a snippet of what the stored secret looks like in Blob storage; it's just a JSON file.
I'm wondering if anyone out there has some experience with this issue and may be able to help me out.
For now it's not supported to get the true key value programmatically; you can only view your keys or create new keys in the portal. You can find the description here: Obtaining keys.
If your function is a WebHook, when using a key other than the default you must also specify the clientId as a query param (the client ID is the name of your new key):
https://<yourapp>.azurewebsites.net/api/<funcname>?clientid=<your key name>
For more information, refer to this wiki doc: WebHooks.