Pimcore CSV upload from folder location and create data objects automatically

I'm trying to import a CSV from the Pimcore /web/var/temp folder and create data objects automatically.
I need to do this in code, not through the admin interface.
Is there any example or solution to follow?
Thanks.

I have found a solution to this: it can be done using https://github.com/w-vision/ImportDefinitions
It can import CSV files and will automatically create the data objects as well.
The import process can be run from the command line, and a cron job can be configured to run the command.
Thanks

Related

Azure Synapse: Upload directory of py files in Spark job reference files

I am trying to pass a whole directory of Python files that are referenced in the main Python file of an Azure Synapse Spark job definition, but the files do not appear in the location and I get a ModuleNotFoundError. I am trying to upload them like this:
abfss://[directory path in data lake]/*
You have to trick the Spark job definition by exporting it, editing it as JSON, and importing it back.
After the export, open the file in a text editor and add the following:
"conf": {
"spark.submit.pyFiles":
"path-to-abfss/module1.zip, path-to-abfss/module2.zip"
},
Now, import the JSON back.
The way to achieve this on Synapse is to package your Python files into a wheel package and upload it to a specific location in Azure Data Lake Storage, where your Spark pool will load it from every time it starts. This makes the custom Python packages available to all jobs and notebooks using that Spark pool.
You can find more details on the official documentation: https://learn.microsoft.com/en-us/azure/synapse-analytics/spark/apache-spark-manage-python-packages#install-wheel-files
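For reference, a minimal sketch of what such a wheel package could look like is below; the package name, version, and module layout are placeholders, and you would build the wheel (for example with pip wheel . or python -m build) before uploading the resulting .whl to the storage path your Spark pool is configured to load packages from.
# setup.py -- hypothetical minimal packaging script; adjust name/version/packages to your project
from setuptools import setup, find_packages

setup(
    name="my_synapse_helpers",   # placeholder package name
    version="0.1.0",             # placeholder version
    packages=find_packages(),    # picks up the Python packages next to this file
)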

MLFlow - How to migrate or copy a run from one experiment to other?

I am trying to move a run in MLflow from one experiment to another. Does anybody know if it's possible? If yes, how? (I use the Python API.)
https://github.com/amesar/mlflow-export-import
You can copy a run from one experiment to another - either in the same tracking server or between two tracking servers. Caveats apply if they are Databricks MLflow tracking servers.
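Independent of that tool, a rough sketch of what copying a run involves using only the plain MLflow Python client is shown below; the run ID, experiment ID, and temp directory are placeholders, and artifact handling is simplified.
from mlflow.tracking import MlflowClient
import tempfile

client = MlflowClient()  # assumes the tracking URI is already configured

src_run = client.get_run("source-run-id")               # placeholder run ID
dst_run = client.create_run("target-experiment-id")     # placeholder experiment ID
dst_id = dst_run.info.run_id

# Re-log params, metrics and tags onto the new run
for k, v in src_run.data.params.items():
    client.log_param(dst_id, k, v)
for k, v in src_run.data.metrics.items():
    client.log_metric(dst_id, k, v)
for k, v in src_run.data.tags.items():
    client.set_tag(dst_id, k, v)

# Copy artifacts via a local temp directory
local_dir = client.download_artifacts(src_run.info.run_id, "", tempfile.mkdtemp())
client.log_artifacts(dst_id, local_dir)

client.set_terminated(dst_id)
Note that this only copies the latest metric values, not full metric histories; the export-import tool above takes care of such details.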
For local users, I directly move run folders and modify the experiment id in meta.yaml.
To elaborate: every run has its own folder, which contains subfolders such as "artifacts", "metrics", "params", and "tags"; you will also find a "meta.yaml" file in that directory.
Open the meta.yaml file and you should find the experiment_id; simply changing that number should do the trick.
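For a local, file-based tracking store, a minimal sketch of that manual move might look like this; the mlruns paths, run ID, and experiment IDs are placeholders and assume the default file layout.
import shutil
import yaml  # PyYAML

# Placeholder paths: <mlruns>/<old experiment id>/<run id> -> <mlruns>/<new experiment id>/<run id>
src = "mlruns/1/abc123"
dst = "mlruns/5/abc123"

# Move the whole run folder under the target experiment
shutil.move(src, dst)

# Update experiment_id inside the run's meta.yaml
meta_path = dst + "/meta.yaml"
with open(meta_path) as f:
    meta = yaml.safe_load(f)
meta["experiment_id"] = "5"  # new experiment id
# Note: the artifact_uri field may also embed the old experiment id and need the same fix
with open(meta_path, "w") as f:
    yaml.safe_dump(meta, f)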

External Properties File in Azure Databricks

We have a full-fledged Spark application that takes a lot of parameters from a properties file. Now we want to move the application to the Azure Databricks notebook format. The entire code works fine and gives the expected result with hard-coded parameters. But is it also possible to use an external properties file in an Azure Databricks notebook? If so, where do we need to place the properties file?
You can use the Databricks DBFS FileStore; Azure Databricks notebooks can access files from there.
To upload your properties file, you have two options:
Using wget (download to the driver's local disk, then copy into DBFS):
import os
os.system("wget -P /tmp/ http://<your-repo>/<path>/app1.properties")
dbutils.fs.cp("file:/tmp/app1.properties", "dbfs:/FileStore/configs/app1/")
Using dbutils.fs.put (may be a one-time activity to create this file):
dbutils.fs.put("dbfs:/FileStore/configs/app1/app1.properties", "prop1=val1\nprop2=val2")
To read the properties file values into a dict:
properties = dict(line.strip().split('=', 1) for line in open('/dbfs/FileStore/configs/app1/app1.properties') if line.strip())
Hope this helps!!
Another possibility is to provide/return arguments using the Databricks Jobs REST API; more information can be found here, for example: https://docs.databricks.com/dev-tools/api/latest/examples.html#jobs-api-example
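As a rough sketch of that approach: parameters passed to a notebook job (for example via notebook_params in the Jobs API) can be read inside the notebook with Databricks widgets; the parameter names and defaults below are placeholders.
# Hypothetical parameters; names and default values are placeholders
dbutils.widgets.text("input_path", "dbfs:/FileStore/data/input")
dbutils.widgets.text("env", "dev")

input_path = dbutils.widgets.get("input_path")
env = dbutils.widgets.get("env")
print(input_path, env)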

Access blob file using time stamp in Azure

I want to access a blob file that is generated by an Azure ML web service along with the ilearner and CSV files. The problem is that the file is generated automatically with a GUID as its name, and no response mentions the existence of that file. I know the file is generated because I can access it through the Azure portal. I would like to access the file automatically, and the only possibility I can see is to use the timestamp of another file created at the same instant. Is there any API or method available to access blobs created at a particular instant using a timestamp instead of the file name?
According to your description, I guess you used the Export Data module.
Given your requirements, it is recommended that you replace Export Data with Execute Python Script in Azure Machine Learning, which allows you to customize the blob file name.
For the introduction to Execute Python Script, you could refer to the official documentation here.
Please refer to the following steps to implement:
Step 1: Use virtualenv to create an independent Python environment (for the specific steps, see https://virtualenv.pypa.io/en/stable/userguide/), then use pip install to download the Azure Storage packages.
Compress all of the files in the Lib/site-packages folder into a zip package (I'm calling it azure-storage-package here).
Step 2: Upload the zip package into the Azure Machine Learning workspace as a dataset.
For the specific steps, refer to the Technical Notes.
After it succeeds, you will see the uploaded package in the dataset list; drag it to the third input port of the Execute Python Script module.
Step 3: Customize the blob file name in the Python script to a timestamp; you could even append a GUID to the end of the file name to ensure uniqueness.
Here is a simple code snippet:
import pandas as pd
from azure.storage.blob import BlockBlobService
import time

def azureml_main(dataframe1 = None, dataframe2 = None):
    myaccount = '****'
    mykey = '****'
    block_blob_service = BlockBlobService(account_name=myaccount, account_key=mykey)
    # Name the blob after the current Unix timestamp so it can be located later
    block_blob_service.create_blob_from_text('test', str(int(time.time())) + '.txt', 'upload image test')
    return dataframe1,
Also, you could refer to the SO thread Access Azure blob storage from within an Azure ML experiment.
Hope it helps you.
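Once blobs are named by timestamp like this, locating the ones written around a given time is a matter of listing the container and comparing names or the blobs' last-modified times. A minimal sketch using the same legacy SDK is below; the account credentials, container name ('test', as above), and time window are placeholders.
from datetime import datetime, timezone
from azure.storage.blob import BlockBlobService

block_blob_service = BlockBlobService(account_name='****', account_key='****')

# Placeholder window: blobs modified within the last hour
cutoff = datetime.now(timezone.utc).timestamp() - 3600

for blob in block_blob_service.list_blobs('test'):
    if blob.properties.last_modified.timestamp() >= cutoff:
        print(blob.name, blob.properties.last_modified)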

Can I load a Zip file to my app (previously downloaded)

I downloaded the Zip file containing all the app data and now I want to load it back instead of the existing one.
Is it even possible?
Thanks.
Yes, it is possible through the HTTP API, but you'll have to write some scripts.
The easiest solution is to create a new Wit.ai app and import your zip file during the creation process.
