Read from Zip files in Helm Templates

I have the following template:
apiVersion: v1
kind: Secret
metadata:
  name: poc-secrets-secret-data-txt-{{ .Release.Namespace }}
type: Opaque
stringData:
  myZipdata: {{ .Files.Get "secrets/base64.zip" }}
I get the following error when I run helm install:
error converting YAML to JSON: yaml: control characters are not allowed
My use case is to ship the zip file as-is. Can anyone suggest a way to do this?

The YAML structure can only hold textual data. Similarly, within a Secret, stringData specifically holds textual data. Neither channel can communicate a (binary) zip file.
However: if you use data instead of stringData, that's allowed to contain binary data, and the Helm b64enc function can base64 encode arbitrary binary data into a valid YAML string. There's an example in the Helm documentation "Accessing Files Inside Templates" page. For your example, you can:
apiVersion: v1
kind: Secret
metadata:
  name: poc-secrets-secret-data-txt-{{ .Release.Namespace }}
type: Opaque
data: # not stringData
  myZipdata: {{ .Files.Get "secrets/base64.zip" | b64enc }} # add base64 encode

So you can't do it via Helm, from what I found.
My solution was:
kubectl create configmap my-scripts --from-file=my-scripts.jar
kubectl describe configmap my-scripts
...
Data
====
Events:  <none>
The output looks empty (the binary content lands under binaryData, which kubectl describe does not print), but when you mount this ConfigMap as a volume from your Helm chart, as sketched below, it works.
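For reference, a minimal sketch of such a mount in a pod template; the container name and mount path are illustrative, only the ConfigMap name comes from the commands above:
volumes:
  - name: scripts
    configMap:
      name: my-scripts
containers:
  - name: app
    volumeMounts:
      - name: scripts
        mountPath: /opt/scripts   # my-scripts.jar appears here as the original binary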


Azure ML CLI v2 create data asset with MLTable

I uploaded parquet files to blob storage and created a data asset via the Azure ML GUI. The steps are precise and clear and the outcome is as desired. For future usage I would like to use the CLI to create the data asset and new versions of it.
The base command would be az ml data create -f <file-name>.yml. The docs provide a minimal example of an MLTable file which should reside next to the parquet files.
# directory in blobstorage
├── data
│   ├── MLTable
│   ├── file_1.parquet
│   ├── ...
│   └── file_n.parquet
I am still not sure how to properly specify those files in order to create a tabular dataset with column conversion.
Do I need to specify the full path or the pattern in the yml file?
$schema: https://azuremlschemas.azureedge.net/latest/data.schema.json
type: mltable
name: Test data
description: Basic example for parquet files
path: azureml://datastores/workspaceblobstore/paths/*/*.parquet # pattern or path to dir?
I have more confusion about the MLTable file:
type: mltable
paths:
  - pattern: ./*.parquet
transformations:
  - read_parquet:
      # what comes here?
E.g. I have a column with dates in the format %Y-%m-%d %H:%M:%S which should be converted to a timestamp. (I can provide this information in the GUI, at least!)
Any help on this topic or hidden links to documentation would be great.
A working MLTable file to convert string columns from parquet files looks like this:
---
type: mltable
paths:
  - pattern: ./*.parquet
transformations:
  - read_parquet:
      include_path_column: false
  - convert_column_types:
      - columns: column_a
        column_type:
          datetime:
            formats:
              - "%Y-%m-%d %H:%M:%S"
  - convert_column_types:
      - columns: column_b
        column_type:
          datetime:
            formats:
              - "%Y-%m-%d %H:%M:%S"
(By the way, at the time of writing, specifying multiple columns as an array did not work, e.g. columns: [column_a, column_b].)
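To sanity-check the conversion, the table can be loaded with the mltable Python package; this is a sketch assuming the MLTable file sits in a ./data folder next to the parquet files:
import mltable

# Load the folder containing the MLTable file and materialize it
tbl = mltable.load("./data")
df = tbl.to_pandas_dataframe()
print(df.dtypes)  # column_a / column_b should now be datetime64[ns]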
To perform this operation, first check the installation requirements for the experiment: you need a valid subscription and workspace.
Install the required mltable library.
There are 4 different supported path types as parameters in Azure ML:
• Local computer path
• Path on a public server, like HTTP/HTTPS
• Path on Azure storage (blob, in this case)
• Path on a datastore
Create a YAML file in the folder which was created as an asset. The filename can be anything (filename.yml):
$schema: https://azuremlschemas.azureedge.net/latest/data.schema.json
type: uri_folder
name: <name_of_data>
description: <description goes here>
path: <path>
Then create the data asset using the CLI:
az ml data create -f filename.yml
To create a specific file as the data asset:
$schema: https://azuremlschemas.azureedge.net/latest/data.schema.json
# Supported paths include:
# local: ./<path>/<file>
# blob: https://<account_name>.blob.core.windows.net/<container_name>/<path>/<file>
# ADLS gen2: abfss://<file_system>@<account_name>.dfs.core.windows.net/<path>/<file>
# Datastore: azureml://datastores/<data_store_name>/paths/<path>/<file>
type: uri_file
name: <name>
description: <description>
path: <uri>
All the paths need to be specified according to your workspace credentials.
To create an MLTable file as the data asset:
Create a yml file with a pattern matching your data, like below:
type: mltable
paths:
  - pattern: ./*.filetypeextension
transformations:
  - read_delimited:
      delimiter: ","
      encoding: ascii
      header: all_files_same_headers
Use the Python code below to consume the MLTable:
import mltable
table1 = mltable.load(uri="./data")
df = table1.to_pandas_dataframe()
To create the MLTable data asset, use the block below:
$schema: https://azuremlschemas.azureedge.net/latest/data.schema.json
# path must point to the **folder** containing the MLTable artifact (MLTable file + data)
# Supported paths include:
# blob: https://<account_name>.blob.core.windows.net/<container_name>/<path>
type: mltable
name: <name_of_data>
description: <description goes here>
path: <path>
Blob is the storage mechanism in the current requirement. The same command as before creates the data asset from the MLTable:
az ml data create -f <file-name>.yml

Iterating through yml file and storing the values in variable

I am trying to iterate through yml files which have data in this format; the file name is artifact.yml:
artifacts:
  - name: feture1
    version: 2.0.5
    git_url: git#somethign:tmm/tmm.git
  - name: feture2
    version: 1.0
    git_url: git#123.git
My end goal is to fetch the version and name from this file and store them in variables for further use.
I have tried using the yaml module (a snippet of the code I am trying):
import yaml

op = open('artifact.yml')
myyml = yaml.safe_load(op)
for i in myyml['artifacts']:
    print(i)
op.close()
The output I am getting is this:
{'name': 'feture1', 'version': '2.0.5', 'git_url': 'git#somethign:tmm/tmm.git'}
{'name': 'feture2', 'version': 1.0, 'git_url': 'git#123.git'}
but I am not sure how to separate out these two dictionaries and store the name and version in separate variables.
I have also tried something like this:
myyml['artifacts'][0]
The output is:
{'name': 'feture1', 'version': '2.0.5', 'git_url': 'git#somethign:tmm/tmm.git'}
but the problem is that I will not know how many elements there will be when I parse the file, so I also need some way to make this work regardless of the number of entries in the yml file (FYI, the yml source format is the same as given above; only the number of entries may vary).
Thanks.
Using myyml['artifacts'][0] gives you a dict (a map); then all you have to do is give it a key to get the desired value:
myyml['artifacts'][0]["name"]
myyml['artifacts'][0]["version"]
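Since the number of entries is not known in advance, you can simply loop over the list and collect the values; a minimal sketch (the versions dict is an illustrative name):
import yaml

with open('artifact.yml') as op:
    myyml = yaml.safe_load(op)

# Works for however many artifacts the file holds
versions = {a['name']: a['version'] for a in myyml['artifacts']}
print(versions)  # {'feture1': '2.0.5', 'feture2': 1.0}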

Is there a way to pass truststore.jks value in place for file location

I connect to an Elasticsearch instance through Spark code, which requires passing the truststore and keystore file locations when instantiating the Spark session, as below:
.config("es.net.ssl.keystore.location", truststore_location)
.config("es.net.ssl.keystore.pass", truststore_password)
.config("es.net.ssl.truststore.location", truststore_location)
.config("es.net.ssl.truststore.pass", truststore_password)
I do have a file location, but the challenge is that the value in the truststore.jks file is the base64-encoded version of the original. This was done when the requirement was to copy the truststore.jks content and upload it as a secret in the Kubernetes pod.
I produced the encoded value with:
cat truststore.jks | base64
Now, when the file location is passed to the Spark session builder, it gives an invalid-format error, which is expected. So is there any way I can extract the value, decode it, and pass the value itself rather than a location?
Below is how I declared the volume and volume mount for it:
volumes:
  - name: elasticsearch-truststore
    secret:
      secretName: "env-elasticsearch-truststore"
      items:
        - key: truststore.jks
          path: truststore.jks
volumeMounts:
  - name: elasticsearch-truststore
    mountPath: /etc/encrypted
If anyone can suggest another way to approach the issue, that would be great.
Thanks
There is a problem with the secret object you've created. The base64-encoded value is only relevant while the object exists as a manifest in the etcd database of the Kubernetes API server; the encoding has no effect on the actual contents of the secret as seen by the pod.
What likely happened is that you base64-encoded the contents yourself and then created a secret from the already-encoded contents, which is what you're observing here.
A simple fix is to delete the existing secret and create a new one directly from your truststore.jks file, as follows:
kubectl create secret generic env-elasticsearch-truststore --from-file=truststore.jks=/path/to/truststore
This creates a secret named env-elasticsearch-truststore containing one key, truststore.jks, whose value is the contents of the /path/to/truststore file.
You can then use this secret as a file by mounting it in your pod; the specification will look like this:
...
volumes:
  - name: elasticsearch-truststore
    secret:
      secretName: env-elasticsearch-truststore
volumeMounts:
  - name: elasticsearch-truststore
    mountPath: "/etc/encrypted"
...
This will ensure that the file truststore.jks will be available at the path /etc/encrypted/truststore.jks and will contain the contents of the original truststore.jks file.
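With the secret recreated correctly, the Spark configs from the question can point straight at the mounted path; a sketch, assuming the password is injected separately (e.g. via an environment variable) and that SSL is enabled on the connector:
import os
from pyspark.sql import SparkSession

truststore_password = os.environ["TRUSTSTORE_PASSWORD"]  # assumption: supplied outside the secret

spark = (SparkSession.builder
         .config("es.net.ssl", "true")
         .config("es.net.ssl.truststore.location", "/etc/encrypted/truststore.jks")
         .config("es.net.ssl.truststore.pass", truststore_password)
         .getOrCreate())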

How to insert a jks file in Vault Hashicorp?

I have a jks file which I need to put in Vault, but before putting it there the jks file should be base64-encoded and saved as JSON.
This is the process in short:
encode the jks to base64 --> store the string in a file --> convert to json --> store in Vault
Here is what I am doing:
# encode and store it in a file
cat my-jks-file.jks | base64 > my-jks-file.txt
# manually convert this to a json file which looks like this:
{
  "my-secret": "<base64 string>"
}
# put this inside vault
vault kv put kv/foo @converted-jks-file.json
Is there a better way to do this? I want to avoid the manual step. Thanks
After doing some research and going through the docs at https://www.vaultproject.io/docs/commands/kv/put, I found a way to do this all in a single line:
cat my-jks-file.jks | base64 | vault kv put kv/my-new-secret vault-jks=-
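To get the jks back out later, you can read and decode it in one line as well; this assumes the same key name as above and a GNU-style base64 (macOS spells the flag -D):
vault kv get -field=vault-jks kv/my-new-secret | base64 --decode > my-jks-file.jks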

How do I consume files in Django 2.2 for yaml parsing?

I'm trying to upgrade my site from Django 1.11 to Django 2.2, and I'm having trouble with uploading and parsing yaml files.
The error message is:
ScannerError: mapping values are not allowed here
  in "", line 1, column 34:
    b'---\n recipeVersion: 9\n name: value\n'
                                     ^
I'm getting the file contents using a ModelForm with a widget defined as:
'source': AsTextFileInput()
... using ...
class AsTextFileInput(forms.widgets.FileInput):
    def value_from_datadict(self, data, files, name):
        return files.get(name).read()
... and then I get the source variable to parse with:
cleaned_data = super(RecipeForm, self).clean()
source = cleaned_data.get("source")
From that error message above, it looks like my newlines are being escaped, so yaml sees the text all on a single line. I tried logging the source of this file, and here's how it shows in my log file:
DEBUG b'---\n recipeVersion: 9\n name: value\n'
So, how can I get this file content without (what looks to me like) escaped newlines so I can parse it as yaml?
Edit: my code and yaml (simplified for this question) have not changed; upgrading Python projects has broken the parsing.
Decoding the bytestring fixed it:
class AsTextFileInput(forms.widgets.FileInput):
    def value_from_datadict(self, data, files, name):
        return files.get(name).read().decode('utf-8')
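For illustration, once the widget returns a str rather than bytes, the yaml module parses it as expected:
import yaml

# Same content as the logged upload, now as text
source = "---\n recipeVersion: 9\n name: value\n"
print(yaml.safe_load(source))  # {'recipeVersion': 9, 'name': 'value'}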
